[ https://issues.apache.org/jira/browse/HDFS-297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated HDFS-297:
-----------------------------

    Attachment: hdfs-297.txt

Here is a patch with Scott's latest implementation with a few changes:

- Style cleanups (mostly whitespace, some more comments)
- Removed the code to generate T1 through T4, since it should never change and the JIRA is here for reference. If people -1 this I'll submit a new patch with it included.

Although this JIRA is for the benefit of HDFS, the code itself is in o.a.h.util, so this patch applies against the Common repository. We'll need a second patch against the HDFS repo to make the one-liner change from "new CRC32" to "new PureJavaCrc32".

Should we move this JIRA back to Common and add an HDFS JIRA for the one-liner patch? Not sure how pedantic we're supposed to be about which JIRAs and patches apply where :)

> Implement a pure Java CRC32 calculator
> --------------------------------------
>
>                 Key: HDFS-297
>                 URL: https://issues.apache.org/jira/browse/HDFS-297
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Todd Lipcon
>         Attachments: crc32-results.txt, hadoop-5598-evil.txt,
> hadoop-5598-hybrid.txt, hadoop-5598.txt, hadoop-5598.txt, hdfs-297.txt,
> PureJavaCrc32.java, PureJavaCrc32.java, PureJavaCrc32.java,
> TestCrc32Performance.java, TestCrc32Performance.java,
> TestCrc32Performance.java, TestPureJavaCrc32.java
>
>
> We've seen a reducer writing 200MB to HDFS with replication = 1 spending a
> long time in crc calculation. In particular, it was spending 5 seconds in crc
> calculation out of a total of 6 for the write. I suspect that it is the
> java-jni border that is causing us grief.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
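
For readers without the attachments at hand, here is a minimal sketch of what a table-driven, pure Java CRC32 can look like. It is not the PureJavaCrc32 attached to this issue: the class name TableCrc32Sketch is hypothetical, and it uses a single 256-entry lookup table rather than the T1 through T4 tables mentioned above, so it illustrates the idea only and makes no claim about performance. Because it implements java.util.zip.Checksum, it can stand in wherever a CRC32 is created, which is presumably what makes the HDFS-side change a one-liner.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

/** Hypothetical illustration only; not the PureJavaCrc32 from the patch. */
final class TableCrc32Sketch implements Checksum {
    // Lookup table for the reflected CRC-32 polynomial 0xEDB88320,
    // computed once at class load time.
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Running CRC register; starts as all ones and is inverted on output.
    private int crc = 0xFFFFFFFF;

    @Override
    public void update(int b) {
        crc = (crc >>> 8) ^ TABLE[(crc ^ b) & 0xFF];
    }

    @Override
    public void update(byte[] buf, int off, int len) {
        int c = crc;
        for (int i = off; i < off + len; i++) {
            c = (c >>> 8) ^ TABLE[(c ^ buf[i]) & 0xFF];
        }
        crc = c;
    }

    @Override
    public long getValue() {
        // Final inversion, masked to an unsigned 32-bit value.
        return (~crc) & 0xFFFFFFFFL;
    }

    @Override
    public void reset() {
        crc = 0xFFFFFFFF;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        Checksum jdk = new CRC32();              // JDK implementation
        Checksum pure = new TableCrc32Sketch();  // drop-in replacement

        jdk.update(data, 0, data.length);
        pure.update(data, 0, data.length);

        // Both lines should print the same checksum value.
        System.out.println(Long.toHexString(jdk.getValue()));
        System.out.println(Long.toHexString(pure.getValue()));
    }
}
```

Checking the result against java.util.zip.CRC32 on the same input, as main does here, is a cheap way to confirm that a replacement produces identical checksums before swapping it in.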