[ https://issues.apache.org/jira/browse/HADOOP-9601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667360#comment-13667360 ]
Todd Lipcon commented on HADOOP-9601:
-------------------------------------

If someone wants to work on this, I think it will help the CPU efficiency of our write path in particular, without much complication in the actual HDFS code. HBase would also be a consumer of this API.

The specific thing to keep in mind is that, to avoid a memcpy in the JNI code, you need to use GetPrimitiveArrayCritical [1]. But while you're in the "critical section", GCs are blocked, so if you hold it for a long time you'll end up stalling all of the threads waiting on the heap-wide lock you're holding. So, in the context of CRC, it's probably reasonable to compute only 256KB or so per critical region -- at 1GB+/sec CRC speed this is at most a ~250us delay for the GC, which is noise next to typical GC pause lengths (~50ms for a young-gen collection).

[1] http://docs.oracle.com/javase/1.4.2/docs/guide/jni/jni-12.html#GetPrimitiveArrayCritical

> Support native CRC on byte arrays
> ---------------------------------
>
>                 Key: HADOOP-9601
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9601
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance, util
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>
> When we first implemented the native CRC code, we only did so for direct byte buffers, because these correspond directly to native heap memory and thus make it easy to access via JNI. We'd generally assumed that accessing byte[] arrays from JNI was not efficient enough, but now that I know more about JNI I don't think that's true -- we just need to make sure that the critical sections where we lock the buffers are short.
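The chunking approach described above only works because CRC32 updates compose: feeding a buffer to the checksum in 256KB pieces yields the same value as one pass over the whole array, so each piece can be processed under its own short GetPrimitiveArrayCritical/ReleasePrimitiveArrayCritical pair. A minimal Java sketch of that property, using java.util.zip.CRC32 as a stand-in for the native implementation (class and method names here are hypothetical, not from the Hadoop code):

```java
import java.util.zip.CRC32;

public class ChunkedCrcDemo {
    // Per-critical-section chunk size suggested in the comment above.
    static final int CHUNK = 256 * 1024;

    // Update the checksum in CHUNK-sized pieces; in the native version each
    // loop iteration would sit between GetPrimitiveArrayCritical and
    // ReleasePrimitiveArrayCritical, bounding how long GC stays blocked.
    static long chunkedCrc(byte[] buf) {
        CRC32 crc = new CRC32();
        for (int off = 0; off < buf.length; off += CHUNK) {
            int len = Math.min(CHUNK, buf.length - off);
            crc.update(buf, off, len);
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) (i * 31);

        // Single-pass checksum over the whole array for comparison.
        CRC32 whole = new CRC32();
        whole.update(data, 0, data.length);

        // Chunked updates yield the identical checksum, so limiting each
        // critical region to 256KB does not change the result.
        if (chunkedCrc(data) != whole.getValue()) throw new AssertionError();
        System.out.println("chunked CRC matches: "
                + Long.toHexString(whole.getValue()));
    }
}
```

At 1GB+/sec native CRC throughput, each 256KB chunk keeps any single critical section to roughly 250 microseconds, which is the delay bound cited above.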