[ https://issues.apache.org/jira/browse/HADOOP-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12545517 ]
Raghu Angadi commented on HADOOP-2154:
--------------------------------------

In my initial implementation of HADOOP-1134, I did not keep buffers between the socket and the datanode (reader and writer). It looks like this jira proposes that. Note that I had to put the buffers back, since there was a regression on the DFSIO benchmarks and sort. Pretty much none of our benchmarks is CPU-intensive on Datanodes.

If we want to get rid of extra buffer copies, I would look into one of these two:
# Reorganize the while loop so that there is one extra copy (from disk to user buffer) and not two, i.e. the large user buffer is written directly to the socket (in the case of block read).
# Remove both copies by extending the protocol to allow one DATA_CHUNK to carry multiple CHECKSUM chunks, e.g. one DATA_CHUNK would contain 64k worth of block data read directly into the user buffer, with 64k*4/512 checksum bytes at the end. That way the Datanode reads directly into a large user buffer and that buffer is written to the socket (basically bringing buffer handling back to pre-HADOOP-1134).
# Using multiple sockets is another option, but I am not a fan of it.

> Non-interleaved checksums would optimize block transfers.
> ---------------------------------------------------------
>
>                 Key: HADOOP-2154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2154
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0
>            Reporter: Konstantin Shvachko
>            Assignee: Rajagopal Natarajan
>             Fix For: 0.16.0
>
>
> Currently when a block is transferred to a data-node the client interleaves
> data chunks with the respective checksums.
> This requires creating an extra copy of the original data in a new buffer
> interleaved with the crcs.
> We can avoid extra copying if the data and the crc are fed to the socket one
> after another.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
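
For illustration only, the non-interleaved layout discussed in the comment (one large data region followed by all of its checksums) could be sketched roughly as below. This is not Hadoop's actual wire format or code. The class NonInterleavedPacket and its methods are hypothetical names, and the sketch assumes one 4-byte CRC32 per 512 bytes of data, so 64k of data carries 64k*4/512 = 512 checksum bytes. The copy inside build() is only there to keep the example self-contained. The point of the proposal is that the Datanode would read from disk straight into such a buffer and hand it to the socket.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch: not Hadoop's actual classes or wire format.
final class NonInterleavedPacket {
    static final int CHECKSUM_SIZE = 4; // assumption: one CRC32 stored as 4 bytes

    // Number of trailing checksum bytes for dataLen bytes of block data.
    static int checksumBytes(int dataLen, int bytesPerChecksum) {
        int chunks = (dataLen + bytesPerChecksum - 1) / bytesPerChecksum; // round up
        return chunks * CHECKSUM_SIZE;
    }

    // Lays out [data][crc0][crc1]...: all data first, all checksums at the end.
    static byte[] build(byte[] data, int bytesPerChecksum) {
        ByteBuffer packet = ByteBuffer.allocate(
                data.length + checksumBytes(data.length, bytesPerChecksum));
        packet.put(data);
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += bytesPerChecksum) {
            int len = Math.min(bytesPerChecksum, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            packet.putInt((int) crc.getValue());
        }
        return packet.array();
    }

    // Recomputes each chunk's CRC32 and compares it with the trailing checksums.
    static boolean verify(byte[] packet, int dataLen, int bytesPerChecksum) {
        if (packet.length != dataLen + checksumBytes(dataLen, bytesPerChecksum)) {
            return false;
        }
        ByteBuffer sums = ByteBuffer.wrap(packet, dataLen, packet.length - dataLen);
        CRC32 crc = new CRC32();
        for (int off = 0; off < dataLen; off += bytesPerChecksum) {
            int len = Math.min(bytesPerChecksum, dataLen - off);
            crc.reset();
            crc.update(packet, off, len);
            if (sums.getInt() != (int) crc.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = new byte[64 * 1024];
        byte[] packet = build(data, 512);
        System.out.println("checksum bytes: " + (packet.length - data.length)); // 512
        System.out.println("verified: " + verify(packet, data.length, 512));    // true
    }
}
```

With this layout the receiver knows where the checksum region starts from the data length alone, so it can verify chunk by chunk without the data ever being re-copied into an interleaved buffer.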