[
https://issues.apache.org/jira/browse/HADOOP-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12562722#action_12562722
]
Chris Douglas commented on HADOOP-2705:
---------------------------------------
bq. To be clear, this patch improves reads by 5% for single nodes only but by 40% when > a single node?
No, the patch has a measured 5% single-node improvement on reads. The 40% was
an artifact of the benchmark, not a real result. Specifically, the 4k baseline
was collected in the same run as a series of write benchmarks. Running without
the write benchmarks significantly improves the first read benchmark, which in
that run happened to be the lzo, block-compressed SequenceFiles. That speedup
was wrongly attributed to the buffer size; runs comparing reads directly and
varying only the buffer size confirmed the more modest 5% improvement.
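For anyone who wants the larger buffer before 0.16.0 ships, the site config can override the shipped default; a minimal sketch, assuming the standard hadoop-site.xml property syntax:
{code:xml}
<!-- hadoop-site.xml: override the 4k default for io.file.buffer.size -->
<property>
  <name>io.file.buffer.size</name>
  <value>32768</value>
</property>
{code}
As a rule, the value is best kept a multiple of the hardware page size (4096 on x86).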
> io.file.buffer.size should default to a value larger than 4k
> ------------------------------------------------------------
>
> Key: HADOOP-2705
> URL: https://issues.apache.org/jira/browse/HADOOP-2705
> Project: Hadoop Core
> Issue Type: Improvement
> Components: conf
> Reporter: Chris Douglas
> Assignee: Chris Douglas
> Priority: Minor
> Fix For: 0.16.0
>
> Attachments: 2705-0.patch
>
>
> Tests using HADOOP-2406 suggest that increasing this to 32k from 4k improves
> read times for block, lzo compressed SequenceFiles by over 40%; 32k is a
> relatively conservative bump.