[ 
https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-4953:
------------------------------

    Attachment: benchmark.png

Attached a benchmark plot I did in support of this work. The benchmark times 
the processing of a 1GB file which fits in the buffer cache. The "processing" 
here is summing up the entire file as if it were a sequence of integers laid 
end-to-end (written using SSE, etc. to be as efficient as possible).

The various items plotted here are:
- h: the current libhdfs code, with short-circuit reads (SCR) but without 
zero-copy reads (ZCR). Averages around 3GB/sec.
- m: a C program which mallocs 1GB, reads the data into that buffer, and then 
runs the analysis on the malloced buffer. This is the upper bound on 
performance. Gets about 8GB/sec.
- mmap-each: a C program which opens a local file and, on each iteration of 
processing, calls "mmap" and then processes the mapping. Gets about 3GB/sec. 
"perf top" indicates that this is slow because of page table entry population 
overhead (minor page faults).
- mmap-populate-each: the same, but with the MAP_POPULATE flag. Gets around 
4.5GB/sec. This is faster because it pre-populates the page table entries.
- mmap-once: the same, but only mmaps once, and doesn't count the mmap time. 
Gets around the same speed as the "malloc" path.
- z: the ZCR implementation _without_ the mmap caching. Gets the same as 
mmap-each, more or less, because of the same PTE faulting overhead.
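
For anyone who wants to reproduce the mmap-each / mmap-populate-each 
comparison, here is a rough sketch of what one iteration could look like. This 
is not the benchmark source: the function names (sum_u32, sum_file_mmap, 
write_test_file) are made up for illustration, the sum is plain scalar C 
rather than SSE, and MAP_POPULATE is Linux-specific, so it is guarded with an 
#ifdef.

```c
#define _GNU_SOURCE          /* exposes MAP_POPULATE on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Scalar sum over 32-bit integers (the real benchmark used SSE). */
static uint64_t sum_u32(const uint32_t *p, size_t n) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* One "iteration": map the file, sum it, unmap it.
 * populate != 0 asks the kernel to fill in the page table entries at
 * mmap time instead of taking a minor fault per page during the sum.
 * Returns 0 on success, -1 on error. Hypothetical helper. */
static int sum_file_mmap(const char *path, int populate, uint64_t *out) {
    assert(out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    int flags = MAP_PRIVATE;
#ifdef MAP_POPULATE
    if (populate)
        flags |= MAP_POPULATE;
#else
    (void)populate;          /* flag unavailable: behaves like mmap-each */
#endif
    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, flags, fd, 0);
    close(fd);               /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return -1;
    *out = sum_u32(addr, (size_t)st.st_size / sizeof(uint32_t));
    munmap(addr, (size_t)st.st_size);
    return 0;
}

/* Hypothetical helper for trying the sketch: writes integers 0..n-1. */
static int write_test_file(const char *path, uint32_t n) {
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    for (uint32_t i = 0; i < n; i++) {
        if (fwrite(&i, sizeof i, 1, f) != 1) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

Timing repeated calls to sum_file_mmap(path, 0, &s) against 
sum_file_mmap(path, 1, &s) on a file that is already in the buffer cache 
should show the difference between the two variants, assuming the file is 
large enough for the page fault cost to dominate.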

These graphs show why we have to have the mmap cache -- and indicate that, with 
that cache, we should be in the same ballpark as the optimal (~9GB/sec/core).
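
To make the idea of an mmap cache concrete, here is a minimal sketch of the 
general technique, not the code in the attached patches. It keeps a small 
fixed table of mappings keyed by path, so a repeat read of the same file pays 
for mmap and page table population only once. The names (cached_mmap, 
map_calls, CACHE_SLOTS) are hypothetical, and the sketch leaves out eviction, 
locking and invalidation, all of which a real cache would need.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define CACHE_SLOTS 4        /* arbitrary size for the sketch */
#define PATH_CAP 256

struct mmap_entry {
    char path[PATH_CAP];
    void *addr;              /* NULL means the slot is unused */
    size_t len;
};

static struct mmap_entry cache[CACHE_SLOTS];
static int map_calls;        /* counts real mmap() calls, for illustration */

/* Return a read-only mapping of `path`, reusing an existing one if the
 * file was mapped before. Returns NULL on error or when the table is
 * full (this sketch never evicts or unmaps). Not thread-safe. */
static const void *cached_mmap(const char *path, size_t *len) {
    assert(len != NULL);
    if (strlen(path) >= PATH_CAP)
        return NULL;

    struct mmap_entry *free_slot = NULL;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].addr && strcmp(cache[i].path, path) == 0) {
            *len = cache[i].len;         /* hit: no syscall, no faults */
            return cache[i].addr;
        }
        if (!cache[i].addr && !free_slot)
            free_slot = &cache[i];
    }
    if (!free_slot)
        return NULL;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (addr == MAP_FAILED)
        return NULL;

    map_calls++;
    strcpy(free_slot->path, path);       /* length checked above */
    free_slot->addr = addr;
    free_slot->len = (size_t)st.st_size;
    *len = free_slot->len;
    return addr;
}
```

The second and later calls for the same path return the existing mapping, 
which corresponds to the mmap-once case in the plot.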
                
> enable HDFS local reads via mmap
> --------------------------------
>
>                 Key: HDFS-4953
>                 URL: https://issues.apache.org/jira/browse/HDFS-4953
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 2.2.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch
>
>
> Currently, the short-circuit local read pathway allows HDFS clients to access 
> files directly without going through the DataNode.  However, all of these 
> reads involve a copy at the operating system level, since they rely on the 
> read() / pread() / etc family of kernel interfaces.
> We would like to enable HDFS to read local files via mmap.  This would enable 
> truly zero-copy reads.
> In the initial implementation, zero-copy reads will only be performed when 
> checksums are disabled.  Later, we can use the DataNode's cache awareness to 
> only perform zero-copy reads when we know that the checksum has already been 
> verified.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
