[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759630#comment-13759630
 ] 

Andrew Wang commented on HDFS-4953:
-----------------------------------

Hey Owen, thanks for taking a look.

The new zero-copy read API actually does support both normal and zero-copy 
reads, based on setting the fallback buffer. This allows flexibility for users:

- An unsophisticated user just sets the fallback buffer once for the cursor,
and then calls away at the new API. This performs a zero-copy read (ZCR) when
possible and otherwise falls back to a normal copying read, so there is no need
to switch back and forth. By default, it also will not return a short read
until EOF.
- A sophisticated user might want to know whether a read involves copying, so
they can take different actions for fast vs. slow paths. This user would not
set a fallback buffer and would also enable short reads. Reads would then only
return zero-copy data, and the user has to deal with switching read paths and
so on.
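
To make the two usage modes concrete, here is a minimal, self-contained sketch.
It is not the API from the HDFS-4953 patches: the class SketchCursor, its
method names, and the "first N bytes are zero-copy readable" model are all
hypothetical, chosen only to illustrate the fallback-buffer and short-read
toggles described above.

```java
import java.nio.ByteBuffer;

/** Hypothetical model of the cursor discussed above; not the real patch API. */
final class SketchCursor {
    private final byte[] data;        // stands in for the file contents
    private final int zeroCopyLimit;  // assumption: bytes [0, limit) are mmap-able
    private int pos = 0;
    private ByteBuffer fallback;      // null means "zero-copy only"
    private boolean shortReads = false;
    private boolean lastZeroCopy = false;

    SketchCursor(byte[] data, int zeroCopyLimit) {
        this.data = data;
        this.zeroCopyLimit = zeroCopyLimit;
    }

    void setFallbackBuffer(ByteBuffer buf) { this.fallback = buf; }
    void setAllowShortReads(boolean allow) { this.shortReads = allow; }
    boolean lastReadWasZeroCopy() { return lastZeroCopy; }

    /**
     * Returns an empty buffer at EOF, a read-only view for a zero-copy read,
     * the fallback buffer for a copying read, or null when the data cannot be
     * served zero-copy and no fallback buffer is set.
     */
    ByteBuffer read(int length) {
        int want = Math.min(length, data.length - pos);
        if (want <= 0) {
            return ByteBuffer.allocate(0); // EOF
        }
        // How many of the wanted bytes lie in the zero-copy region.
        int zc = Math.max(0, Math.min(want, zeroCopyLimit - pos));
        if (zc == want || (shortReads && zc > 0)) {
            // Zero-copy path: a view over the backing array, no bytes copied.
            ByteBuffer view = ByteBuffer.wrap(data, pos, zc).slice().asReadOnlyBuffer();
            pos += zc;
            lastZeroCopy = true;
            return view;
        }
        if (fallback == null) {
            return null; // caller must switch to another read path
        }
        // Copying path: fill the caller's fallback buffer.
        // (In this sketch a too-small fallback buffer also shortens the read.)
        fallback.clear();
        int n = Math.min(want, fallback.capacity());
        fallback.put(data, pos, n);
        fallback.flip();
        pos += n;
        lastZeroCopy = false;
        return fallback;
    }
}
```

In this model, the unsophisticated user calls setFallbackBuffer() once and then
only calls read(). The sophisticated user leaves the fallback buffer unset,
calls setAllowShortReads(true), and treats a null return as the signal to
switch to the slow path.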

This togglable behavior was a request from our Impala team, and Arun made a 
similar request for "is cached" on HDFS-4949: 
https://issues.apache.org/jira/browse/HDFS-4949?focusedCommentId=13736019&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13736019

I'll also note that it isn't easy for apps to deal with multiple returned
buffers; they'll probably end up doing what the current fallback path does,
copying them all into one buffer. I think that, being a DFS, our situation is
also different from traditional scatter/gather APIs, since each buffer being
collected can have drastically different costs (local zero-copy vs. remote is
something like 100x).

Let me know if this makes sense; I might have missed something in your proposal.
                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.3.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: HDFS-4949
>
>         Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, 
> HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, 
> HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums are disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that the checksum has already been 
> verified.

