Hi,

That was my first thought, too: nothing prevents the Binary implementation from checking whether the InputStream is a FileInputStream and then accessing the FileChannel from it.
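A minimal sketch of that check, for illustration only. The class name StreamTransfer and the method transfer are hypothetical and not part of Oak, Jackrabbit or Sling; only the java.io and java.nio calls are real JDK API. It assumes the target can be exposed as a WritableByteChannel.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not an Oak or Sling class.
public final class StreamTransfer {

    private StreamTransfer() {
    }

    /**
     * Copies the remaining content of the stream to the target channel and
     * returns the number of bytes copied. If the stream is a FileInputStream,
     * its FileChannel is used so the JDK can avoid copying through a byte[].
     */
    public static long transfer(InputStream in, WritableByteChannel target) throws IOException {
        if (in instanceof FileInputStream) {
            FileChannel source = ((FileInputStream) in).getChannel();
            long start = source.position();
            long position = start;
            long size = source.size();
            while (position < size) {
                long n = source.transferTo(position, size - position, target);
                if (n <= 0) {
                    break; // no progress (e.g. file truncated); stop instead of spinning
                }
                position += n;
            }
            // transferTo does not move the channel position, so update it explicitly.
            source.position(position);
            return position - start;
        }

        // Fallback for any other stream, e.g. a ByteArrayInputStream.
        ReadableByteChannel source = Channels.newChannel(in);
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        long total = 0;
        while (source.read(buffer) != -1) {
            buffer.flip();
            while (buffer.hasRemaining()) {
                total += target.write(buffer);
            }
            buffer.clear();
        }
        return total;
    }
}
```

Whether the channel path actually avoids heap copies depends on the target: it helps most when the target is itself a FileChannel or a socket channel.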
In the concrete case of Sling, RequestParameter.getInputStream() happens to call the Commons Upload FileItem.getInputStream() method, which happens to return such a FileInputStream if the item is actually stored in the filesystem; otherwise a ByteArrayInputStream is returned.

Regards
Felix

On 18.02.2014 at 08:46, Ian Boston <i...@tfd.co.uk> wrote:

> Hi,
> Is there a reason you would not use the commons upload streaming API to
> connect the target output stream to the request stream? IIRC you can test
> if both have NIO channels and, if they do, just connect the two. I have used
> this in the past to eliminate all GC activity and spooling. The streaming
> API is sensitive to the order of the multiparts. You must use them as they
> appear and not expect to be able to treat request parameters as a map. In
> addition it is sensitive to other frameworks buffering or accessing the
> request input stream.
>
> Best regards
> Ian
>
> On Tuesday, February 18, 2014, Chetan Mehrotra <chetan.mehro...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Currently, in a Sling based application where a user uploads a file to
>> the JCR, the following sequence of steps is executed
>>
>> 1. User uploads file via HTTP request, mostly using Multi-Part form
>> data based upload
>>
>> 2. Sling uses Commons File Upload to parse the multi-part request,
>> which uses a DiskFileItemFactory and writes the binary content to a
>> temporary file (for file size > 256 KB) [1]
>>
>> 3. Later the servlet would access the JCR Session and create a Binary
>> value by extracting the InputStream
>>
>> 4. The file content would then be spooled into the BlobStore
>>
>> Effect of different blobstore
>> ----------------------------------------
>>
>> Now, depending on the type of BlobStore, one of the following code flows
>> would happen
>>
>> A - JR2 DataStores - The InputStream would be copied to a file
>> B - S3DataStore - The AWS SDK would create a temporary file and
>> then that file content would be streamed to S3
>> C - Segment - Content from the InputStream would be stored as part of
>> various segments
>> D - MongoBlobStore - Content from the InputStream would be pushed to
>> remote Mongo via multiple remote calls
>>
>> Things to note in the above sequence
>>
>> 1. Uploaded content is copied twice.
>> 2. The whole content is spooled via InputStream through the JVM heap
>>
>> Possible areas of Improvement
>> --------------------------------
>>
>> 1. If the BlobStore is finally using some File (on the same hard disk, not
>> NFS) then it might be better to *move* the file which was created in
>> upload. This would help the local FileDataStore and S3DataStore
>>
>> 2. Avoid spooling via InputStream if possible. Spooling via IS is slow
>> [3]. Though in most cases we use an efficient buffered copy, which is
>> marginally slower than NIO based variants. However, avoiding moving
>> byte[] might reduce pressure on GC (probably!)
>>
>> Changes required
>> ------------------------
>>
>> If we can have a way to create JCR Binary implementations which
>> enable the DataStore/BlobStore to efficiently transfer content then that
>> would help.
>>
>> For example, for a File based DS the Binary created can keep a reference
>> to the source File object and that Binary is used in the JCR API.
>> Eventually the FileDataStore can treat it in a different way and move
>> the file.
>>
>> Another example is S3DataStore - In some cases the file has already
>> been transferred to S3 using other options. And the user wants to
>> transfer the S3 file from its bucket to our bucket. So a Binary
>> implementation which can just wrap the S3 URL would enable the
>> S3DataStore to transfer the content without streaming all content
>> again [4]
>>
>> Any thoughts on the best way to enable users of Oak to create Binaries
>> via other means (compared to the current mode, which only enables it via
>> InputStream) and enable the DataStores to make use of such binaries?
>>
>> Chetan Mehrotra
>>
>> [1]
>> https://github.com/apache/sling/blob/trunk/bundles/engine/src/main/java/org/apache/sling/engine/impl/parameters/ParameterSupport.java#L190
>> [2]
>> http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/PutObjectRequest.html
>> [3] http://www.baptiste-wicht.com/2010/08/file-copy-in-java-benchmark/3/
>> [4]
>> http://stackoverflow.com/questions/9664904/best-way-to-move-files-between-s3-buckets
>>
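
For the "move instead of copy" idea in the quoted proposal, a rough sketch of what a file based store could do once it has a reference to the uploaded temporary file. UploadMover and moveOrCopy are hypothetical names, not existing Oak or Jackrabbit API; the sketch assumes the store knows both paths and that the target does not exist yet.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not an Oak or Jackrabbit class.
public final class UploadMover {

    private UploadMover() {
    }

    /**
     * Moves the uploaded temporary file to its final location.
     * Returns true if a plain rename was enough, false if the content
     * had to be copied (for example across file systems or onto NFS).
     */
    public static boolean moveOrCopy(Path uploaded, Path target) throws IOException {
        try {
            // Same file system: a rename, no content is read or written.
            Files.move(uploaded, target, StandardCopyOption.ATOMIC_MOVE);
            return true;
        } catch (AtomicMoveNotSupportedException e) {
            // Different file system: fall back to copy, then remove the source.
            Files.copy(uploaded, target, StandardCopyOption.REPLACE_EXISTING);
            Files.delete(uploaded);
            return false;
        }
    }
}
```

The fallback keeps the current behaviour (one full copy), so the move is purely an optimisation for the case where the upload directory and the store share a file system.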