[ https://issues.apache.org/jira/browse/HADOOP-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546850 ]
Konstantin Shvachko commented on HADOOP-2154:
---------------------------------------------

Rajagopal, I do not see how the data:header ratio is decreasing here.
This issue is mainly about removing the interleaved buffer layout.
Currently we partition the original data into chunks, calculate a crc for each chunk, and create the following buffer, which is subsequently transferred to a data-node:

| data chunk 1 | crc for data chunk 1 | data chunk 2 | crc for data chunk 2 | ... | data chunk n | crc for data chunk n |

I propose to change it [back] to

| the original data (+not+ partitioned into chunks) | crc for the original data |

If you add a header before each data and crc chunk, then the current approach will have 2*n headers, while the proposed approach will have only 2. So the data:header ratio will increase:

(|data| + |crc|) / 2n < (|data| + |crc|) / 2

This should let us get rid of the extra buffer that is used to collect all the interleaved pieces together.
And thus the issue is not about "writing the chunks to the socket directly", but rather about removing chunks altogether.
Imo, this is related to both reads and writes. Maybe reads and writes should even share this code.
Removing other redundant buffers is part of a different issue.

Eric, why do you think transferring the crc before the data would require less RAM on the client?
If it does, then it definitely makes sense to send crcs before the data bytes.

> Non-interleaved checksums would optimize block transfers.
> ---------------------------------------------------------
>
>                 Key: HADOOP-2154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2154
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Konstantin Shvachko
>            Assignee: Rajagopal Natarajan
>             Fix For: 0.16.0
>
>
> Currently when a block is transferred to a data-node the client interleaves
> data chunks with the respective checksums.
> This requires creating an extra copy of the original data in a new buffer
> interleaved with the crcs.
> We can avoid extra copying if the data and the crc are fed to the socket one
> after another.
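
For illustration only, the two buffer layouts discussed in the comment can be sketched roughly as below. This is not the DFS client code: the class name ChecksumLayouts, the helper names, the 4-byte CRC32 width, and the chunk size are all assumptions made for the example. It follows the comment literally in using a single crc for the unpartitioned data.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class ChecksumLayouts {
    // Assumption: one 32-bit CRC per checksummed unit.
    static final int CRC_SIZE = 4;

    static int crc32(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return (int) crc.getValue();
    }

    // Interleaved layout: | chunk 1 | crc 1 | ... | chunk n | crc n |
    // Every data byte is copied into a newly allocated buffer.
    static ByteBuffer interleaved(byte[] data, int chunkSize) {
        int n = (data.length + chunkSize - 1) / chunkSize;
        ByteBuffer out = ByteBuffer.allocate(data.length + n * CRC_SIZE);
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            out.put(data, off, len);
            out.putInt(crc32(data, off, len));
        }
        out.flip();
        return out;
    }

    // Non-interleaved layout: | original data | crc for the original data |
    // The data array is only wrapped, not copied; the crc gets its own small buffer.
    static ByteBuffer[] nonInterleaved(byte[] data) {
        ByteBuffer crc = ByteBuffer.allocate(CRC_SIZE);
        crc.putInt(crc32(data, 0, data.length));
        crc.flip();
        return new ByteBuffer[] { ByteBuffer.wrap(data), crc };
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        ByteBuffer a = interleaved(data, 256);
        ByteBuffer[] b = nonInterleaved(data);
        System.out.println("interleaved bytes:     " + a.remaining());
        System.out.println("non-interleaved bytes: " + (b[0].remaining() + b[1].remaining()));
    }
}
```

In this sketch the second method returns two buffers rather than one, so they could be handed to a gathering write such as GatheringByteChannel.write(ByteBuffer[]) one after another without first collecting them into an extra buffer. Whether the actual client would use NIO channels is an assumption here.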