[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13707573#comment-13707573
 ] 

Colin Patrick McCabe commented on HDFS-4953:
--------------------------------------------

Well... zero copy for remote readers isn't possible, by definition :)

I cannae change the laws o' physics, cap'n.

In our microbenchmarks, the existing short-circuit read (SCR) implementation 
reached around 2500 MB/s, while zero-copy reads (ZCR) reached around 7500 MB/s.  
The advantage disappeared if we had to re-create the mmap on each read through 
the file, which is why this change includes a long-lived cache of mmaps in 
ClientMmapManager.
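
To illustrate why the cache matters, here is a minimal sketch of the idea in 
plain Java NIO.  It is not the ClientMmapManager code from the patch: the class 
name MmapCacheSketch, the LRU policy, and the size limit are all assumptions 
made for illustration.  The point is only that a repeated read of the same file 
reuses the existing mapping instead of paying for a new mmap each time.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not the ClientMmapManager from the patch. */
final class MmapCacheSketch {
    // Access-ordered map, so the least recently used mapping is evicted first.
    private final LinkedHashMap<Path, MappedByteBuffer> cache;

    MmapCacheSketch(final int maxEntries) {
        this.cache = new LinkedHashMap<Path, MappedByteBuffer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Path, MappedByteBuffer> eldest) {
                // Assumed policy: keep at most maxEntries mappings alive.
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached read-only mapping for the file, creating it on first use. */
    synchronized MappedByteBuffer get(Path file) throws IOException {
        MappedByteBuffer buf = cache.get(file);
        if (buf == null) {
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // The mapping remains valid after the channel is closed.
                buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            }
            cache.put(file, buf);
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-sketch", ".bin");
        Files.write(tmp, new byte[] {10, 20, 30, 40});

        MmapCacheSketch cache = new MmapCacheSketch(8);
        MappedByteBuffer first = cache.get(tmp);   // creates the mapping
        MappedByteBuffer second = cache.get(tmp);  // served from the cache

        System.out.println(first == second);  // true: same mapping reused
        System.out.println(first.get(2));     // 30: absolute read, no read() call
    }
}
```

Callers share one buffer here, so they should use absolute get(index) calls or 
work on a duplicate() to avoid stepping on each other's position.  Also note 
that in plain Java an evicted mapping is only unmapped once the buffer is 
garbage collected, so a real implementation needs more careful lifetime 
management than this sketch shows.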
                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4953.001.patch, HDFS-4953.002.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums are disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that the checksum has already been 
> verified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
