[ https://issues.apache.org/jira/browse/HDFS-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222969#comment-13222969 ]
Henry Robinson commented on HDFS-2834:
--------------------------------------
Thanks for the review! Per your first two questions:
* My benchmarks show no significant difference from the old copying path
when running the same experiment:
|| ||Native Checksums||No Checksums||Non-native Checksums||Remote, Native Checksums||
|Copying (MB/s) - 32k buffer and request size|2010.21|2290.50|721.52|1412.20|
|Old copying path - 32k buffer and request size|2087.43|2232.67|708.67|1365.60|
* I've run the modified TestParallelRead tests for a couple of hours, but I
plan to do a soak test overnight with the full suite before this gets
committed.
> ByteBuffer-based read API for DFSInputStream
> --------------------------------------------
>
> Key: HDFS-2834
> URL: https://issues.apache.org/jira/browse/HDFS-2834
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Attachments: HDFS-2834-no-common.patch, HDFS-2834.3.patch,
> HDFS-2834.4.patch, HDFS-2834.5.patch, HDFS-2834.6.patch, HDFS-2834.patch,
> HDFS-2834.patch, hdfs-2834-libhdfs-benchmark.png
>
>
> The {{DFSInputStream}} read-path always copies bytes into a JVM-allocated
> {{byte[]}}. Although for many clients this is desired behaviour, in certain
> situations, such as native-reads through libhdfs, this imposes an extra copy
> penalty since the {{byte[]}} needs to be copied out again into a natively
> readable memory area.
> For these cases, it would be preferable to allow the client to supply its own
> buffer, wrapped in a {{ByteBuffer}}, to avoid that final copy overhead.
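
To make the idea in the description concrete, here is a minimal sketch of a read call that fills a caller-supplied ByteBuffer directly, so no intermediate byte[] is needed. It is not the code from the attached patches: the names BufferReadable, InMemorySource and readAll are hypothetical and exist only for this illustration, and the in-memory source stands in for a real input stream. The 32k direct buffer mirrors the buffer size used in the benchmark above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ByteBufferReadSketch {

    /** Hypothetical interface: a source that can read into a caller-supplied buffer. */
    interface BufferReadable {
        /** Reads up to dst.remaining() bytes into dst; returns -1 at end of data. */
        int read(ByteBuffer dst) throws IOException;
    }

    /** Hypothetical stand-in for a real stream: serves bytes from memory. */
    static final class InMemorySource implements BufferReadable {
        private final ByteBuffer data;

        InMemorySource(byte[] bytes) {
            this.data = ByteBuffer.wrap(bytes).asReadOnlyBuffer();
        }

        @Override
        public int read(ByteBuffer dst) {
            if (!data.hasRemaining()) {
                return -1; // end of data
            }
            int n = Math.min(data.remaining(), dst.remaining());
            // Bound a view of the source to n bytes so put() cannot overflow dst.
            ByteBuffer chunk = data.duplicate();
            chunk.limit(chunk.position() + n);
            dst.put(chunk); // bytes land directly in the caller's buffer
            data.position(data.position() + n);
            return n;
        }
    }

    /** Reads until dst is full or the source is exhausted; returns bytes read. */
    static int readAll(BufferReadable src, ByteBuffer dst) throws IOException {
        int total = 0;
        while (dst.hasRemaining()) {
            int n = src.read(dst);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello hdfs".getBytes(StandardCharsets.US_ASCII);
        // A direct buffer is natively addressable, which is the libhdfs use case.
        ByteBuffer dst = ByteBuffer.allocateDirect(32 * 1024);
        int n = readAll(new InMemorySource(payload), dst);
        dst.flip(); // switch the buffer from filling to draining
        System.out.println(n + " bytes read, direct=" + dst.isDirect());
    }
}
```

With a direct ByteBuffer as the destination, a native caller can consume the bytes in place, whereas a byte[]-based read would require one more copy out of the JVM array.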