[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap

Colin Patrick McCabe (JIRA) Thu, 05 Sep 2013 18:52:01 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759764#comment-13759764
 ]


Colin Patrick McCabe commented on HDFS-4953:
--------------------------------------------

The current API is generic and not HDFS-specific.  You get a zero-copy cursor 
from {{FSDataInputStream#createZeroCopyCursor}}, and you read from it with 
{{ZeroCopyCursor#read}}.  Then, when you're done, you close it with 
{{ZeroCopyCursor#close}}.

It also supports a fallback path.  To enable it, you must call 
{{ZeroCopyCursor#setFallbackBuffer}}.  That gives the cursor a buffer to copy 
into when an mmap is unavailable.
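
As a rough illustration of the create / read / close flow and the fallback 
buffer, here is a self-contained sketch in plain Java.  It does not use the 
Hadoop classes.  {{SketchCursor}}, its constructor, the {{mmapAvailable}} flag, 
{{getData}}, and all of the method signatures are assumptions made up for the 
example.  Only the method names {{read}}, {{close}}, and {{setFallbackBuffer}} 
come from the description above.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ZeroCopyCursorSketch {

    /** Hypothetical stand-in for ZeroCopyCursor; NOT the class from the patch. */
    static final class SketchCursor implements Closeable {
        private final FileChannel channel;
        private final boolean mmapAvailable; // simulated; real code would discover this
        private ByteBuffer fallback;         // null means "fallback mode" is off
        private ByteBuffer data;             // result of the most recent read
        private long pos;

        SketchCursor(FileChannel channel, boolean mmapAvailable) {
            this.channel = channel;
            this.mmapAvailable = mmapAvailable;
        }

        void setFallbackBuffer(ByteBuffer buf) {
            this.fallback = buf;
        }

        /** Reads up to toRead bytes at the cursor position (assumed signature). */
        void read(int toRead) throws IOException {
            int len = (int) Math.min(toRead, channel.size() - pos);
            if (mmapAvailable) {
                // Zero-copy path: hand back a read-only mapping of the file.
                data = channel.map(FileChannel.MapMode.READ_ONLY, pos, len);
                pos += len;
                return;
            }
            if (fallback == null) {
                throw new UnsupportedOperationException(
                        "mmap unavailable and no fallback buffer set");
            }
            // Fallback path: an ordinary copying read into the caller's buffer.
            fallback.clear();
            fallback.limit(Math.min(len, fallback.capacity()));
            while (fallback.hasRemaining()
                    && channel.read(fallback, pos + fallback.position()) >= 0) {
                // keep reading until the buffer is full or EOF is hit
            }
            fallback.flip();
            pos += fallback.remaining();
            data = fallback.asReadOnlyBuffer();
        }

        ByteBuffer getData() {
            return data;
        }

        @Override
        public void close() throws IOException {
            // MappedByteBuffer has no public unmap method, so this sketch can
            // only drop its reference and close the channel.  A real
            // implementation would release the mapping itself here.
            data = null;
            channel.close();
        }
    }

    /** Writes a small temp file and reads it back through a SketchCursor. */
    static String demo(boolean mmapAvailable, boolean withFallback) {
        try {
            Path file = Files.createTempFile("zero-copy-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, "hello hdfs".getBytes(StandardCharsets.UTF_8));
            try (SketchCursor cursor = new SketchCursor(
                    FileChannel.open(file, StandardOpenOption.READ), mmapAvailable)) {
                if (withFallback) {
                    cursor.setFallbackBuffer(ByteBuffer.allocate(64));
                }
                cursor.read(64);
                return StandardCharsets.UTF_8.decode(cursor.getData()).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, a caller who never sets a fallback buffer gets an exception 
instead of a silent copy when the mmap path is unavailable.  That is the 
behavior you want if you only care about the zero-copy case.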

The big problem with "well, just return a ByteBuffer, then!" is that ByteBuffer 
has no close method.  So it's unclear how the mmap would ever be released.  It 
is not adequate to rely on the GC, since we are talking about file descriptors 
here.  Furthermore, there are a lot of applications where "valid until next 
read() call or close of stream" is not good enough.  Sometimes people want to 
do multiple reads and look at the results for each, and we should accommodate 
them.
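
To make the "valid until next read()" problem concrete, here is a small sketch 
in plain Java.  Both {{ReusingReader}} and {{Cursor}} are hypothetical classes 
invented for the example, with an in-memory byte array standing in for the 
file.  Neither is part of the patch.  The first hands back one shared buffer 
from every read, and the second gives each read its own handle with an 
explicit {{close}}.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BufferLifetimeSketch {

    /** Hypothetical reader whose result is only valid until the next read(). */
    static final class ReusingReader {
        private final byte[] source;
        private final ByteBuffer shared = ByteBuffer.allocate(4);
        private int pos;

        ReusingReader(byte[] source) {
            this.source = source;
        }

        ByteBuffer read() {
            shared.clear();
            int n = Math.min(shared.capacity(), source.length - pos);
            shared.put(source, pos, n);
            shared.flip();
            pos += n;
            return shared; // the same object every time
        }
    }

    /** Hypothetical cursor: one handle per read, released explicitly. */
    static final class Cursor implements AutoCloseable {
        private ByteBuffer data;

        Cursor(byte[] source, int pos, int len) {
            // A read-only view onto the source, with no copy made.
            data = ByteBuffer.wrap(source, pos, len).slice().asReadOnlyBuffer();
        }

        ByteBuffer getData() {
            return data;
        }

        @Override
        public void close() {
            data = null; // a real cursor would release its mmap here
        }
    }

    static String text(ByteBuffer b) {
        return StandardCharsets.US_ASCII.decode(b.duplicate()).toString();
    }

    /** Two reads through the shared buffer: the first result is overwritten. */
    static String[] withReusedBuffer() {
        ReusingReader r = new ReusingReader("AAAABBBB".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer first = r.read();
        ByteBuffer second = r.read();
        return new String[] { text(first), text(second) };
    }

    /** Two cursors held at once: both results stay valid until closed. */
    static String[] withCursors() {
        byte[] src = "AAAABBBB".getBytes(StandardCharsets.US_ASCII);
        try (Cursor c1 = new Cursor(src, 0, 4); Cursor c2 = new Cursor(src, 4, 4)) {
            return new String[] { text(c1.getData()), text(c2.getData()) };
        }
    }
}
```

Here {{withReusedBuffer}} sees "BBBB" for both results, because the second 
read clobbers the first.  {{withCursors}} sees "AAAA" and "BBBB", and the 
caller decides when each one is released.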

Many prospective users of zero-copy are not interested in dealing with many 
small buffers.  They want to deal with either a single big contiguous mmap'ed 
memory area, or just do reads the standard way, performing many small reads 
that only access as much as they need.  The ability to turn off "fallback mode" 
(where we fall back to copying to service your read) was very specifically 
added in response to these users.

I think any reasonable design will end up looking a lot like what we already 
did in this JIRA.  I suppose instead of separating {{createZeroCopyCursor}} and 
{{read}}, we could have combined them, but that would have resulted in a 
function call with a lot more parameters.  The current design also prevents the 
scenario where more and more function variants get added over time with more 
and more parameters, the kind of function overload hell we landed in with 
{{FileSystem#create}}.
                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: HDFS-4949
>
>         Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums are disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that the checksum has already been 
> verified.

