[ https://issues.apache.org/jira/browse/HADOOP-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13442761#comment-13442761 ]

Andy Isaacson commented on HADOOP-8648:
---------------------------------------

I've reviewed the patch closely and agree that it's right.  The only tiny 
improvement I'd make is to fix this misleading comment in the 64-bit version of 
{{bulk_crc32.c}}:
{code}
374   int remainder = block_size % sizeof(uint64_t);
...
401       /* Take care of the remainder. They are only up to three bytes,
402        * so performing byte-level crc32 won't take much time.
403        */
404       bdata = (uint8_t*)data;
405       while (likely(remainder)) {
{code}
The comment says "up to three bytes", but since this version processes a 
uint64_t at a time, it should say "up to seven bytes".  The comment was copied 
and pasted from the 32-bit version.
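
To make that concrete, here is a minimal self-contained sketch of the 
word-then-remainder structure with the comment corrected.  The function names 
and the byte-wise "word" loop are illustrative only, not the actual 
{{bulk_crc32.c}} code:
{code}
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC32C update for one byte; stands in for whatever
 * table-driven or hardware update the real code uses. */
static uint32_t crc32c_byte(uint32_t crc, uint8_t b) {
  crc ^= b;
  for (int i = 0; i < 8; i++)
    crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
  return crc;
}

uint32_t crc32c_words_then_bytes(const uint8_t *data, size_t block_size) {
  uint32_t crc = 0xFFFFFFFFu;
  size_t remainder = block_size % sizeof(uint64_t);
  const uint8_t *end_of_words = data + (block_size - remainder);
  const uint8_t *bdata = data;

  /* Word-at-a-time portion (shown byte-wise here for brevity). */
  while (bdata != end_of_words)
    crc = crc32c_byte(crc, *bdata++);

  /* Take care of the remainder.  It is at most seven bytes
   * (sizeof(uint64_t) - 1), so byte-level crc32 won't take much time. */
  while (remainder--)
    crc = crc32c_byte(crc, *bdata++);

  return ~crc;
}
{code}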

Ideally we could refactor the 32-bit and 64-bit versions into one using a 
{{#define WORD_T uint32_t}} or similar, but let's do that in a followup jira.
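
Something along these lines, where {{USE_CRC_64}} and {{WORD_T}} are 
hypothetical names for illustration, not from any actual patch:
{code}
#include <stdint.h>
#include <stddef.h>

/* Single source, word width chosen at build time. */
#ifdef USE_CRC_64
typedef uint64_t WORD_T;
#else
typedef uint32_t WORD_T;
#endif

static size_t crc_remainder_len(size_t block_size) {
  /* At most sizeof(WORD_T) - 1 bytes: 3 for the 32-bit build, 7 for
   * the 64-bit build, so a single comment stays correct for both. */
  return block_size % sizeof(WORD_T);
}
{code}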

I've also reviewed the original patch in HADOOP-7446 and confirmed that it 
didn't introduce any other similar bugs.

Note that the erroneous asm is only hit if the HDFS blocksize is not a multiple 
of the wordsize, which AFAICS can only happen for blocksize<7.  And the bug 
lurked because the remainder codepath had no test coverage, so thanks for 
adding those tests.
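
For illustration, a test along these lines would have caught the crash 
(reusing the {{crc32c_words_then_bytes}} sketch above rather than the actual 
libhadoop entry point):
{code}
#include <stdint.h>
#include <stddef.h>

extern uint32_t crc32c_words_then_bytes(const uint8_t *data, size_t n);

int main(void) {
  uint8_t buf[16];
  for (size_t i = 0; i < sizeof(buf); i++)
    buf[i] = (uint8_t)(i * 31 + 7);

  /* Every blocksize from 1 through 16 exercises a 1..7 byte tail at
   * least once on a 64-bit wordsize; blocksize=1 is the crashing case
   * from this jira (the word loop runs zero times). */
  for (size_t bs = 1; bs <= sizeof(buf); bs++) {
    /* A SIGSEGV, not a wrong value, is what this guards against. */
    (void)crc32c_words_then_bytes(buf, bs);
  }
  return 0;
}
{code}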

Overall, LGTM.  Fix the comment if you choose, else ship it.
                
> libhadoop:  native CRC32 validation crashes when io.bytes.per.checksum=1
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-8648
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8648
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HADOOP-8648.001.patch, HADOOP-8648.002.patch, HADOOP-8648.003.patch
>
>
> The native CRC32 code, found in {{pipelined_crc32c}}, crashes when blocksize 
> is set to 1.
> {code}
> 12:27:14,886  INFO NativeCodeLoader:50 - Loaded the native-hadoop library
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00007fa00ee5a340, pid=24100, tid=140326058854144
> #
> # JRE version: 6.0_29-b11
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.4-b02 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C  [libhadoop.so.1.0.0+0x8340]  pipelined_crc32c+0xa0
> #
> # An error report file with more information is saved as:
> # /h/hs_err_pid24100.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> Aborted
> {code}
> The Java CRC code works fine in this case.
> Choosing blocksize=1 is a __very__ odd choice.  It means that we're storing a 
> 4-byte checksum for every byte. 
> {code}
> -rw-r--r--  1 cmccabe users  49398 Aug  3 11:33 blk_4702510289566780538
> -rw-r--r--  1 cmccabe users 197599 Aug  3 11:33 blk_4702510289566780538_1199.meta
> {code}
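> (Arithmetic check: 49398 checksums × 4 bytes = 197592 bytes, and the 
> remaining 7 bytes are presumably the block metadata header, which accounts 
> for the 197599-byte .meta file above.)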
> However, crashing is obviously never the right thing to do.
