[ https://issues.apache.org/jira/browse/HDFS-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053862#comment-14053862 ]

Todd Lipcon commented on HDFS-3528:
-----------------------------------

[~james.thomas] -- I think we should probably file a JIRA in the HADOOP project to add the relevant APIs to DataChecksum. They aren't HDFS-specific -- I just filed MAPREDUCE-5962 for one example of a place where we can use them from within MR.

Perhaps it makes sense to even file two:
1) support verifyChunkedSums on a byte array
2) native code to _calculate_ chunked sums (on both byte array and byte buffer)

which would allow us to break up the work a bit better for easy review.

Feel free to ping me for review when patches are ready.

> Use native CRC32 in DFS write path
> ----------------------------------
>
>                 Key: HDFS-3528
>                 URL: https://issues.apache.org/jira/browse/HDFS-3528
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs-client, performance
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: James Thomas
>
> HDFS-2080 improved the CPU efficiency of the read path by using native
> SSE-enabled code for CRC verification. Benchmarks of the write path show that
> it's often CPU bound by checksums as well, so we should make the same
> improvement there.

--
This message was sent by Atlassian JIRA
(v6.2#6252)