[ 
https://issues.apache.org/jira/browse/HADOOP-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745095#action_12745095
 ] 

Scott Carey commented on HADOOP-6166:
-------------------------------------

Great test results!

The conclusions I draw from these:

* All of these perform well under concurrency, regardless of whether it's 4K, 
8K, or 16K of lookup tables. 
* General performance trends in a more sophisticated test like yours line up 
with the simpler Perf test we have, so we can trust the simpler test's results 
as long as they are consistent from run to run.

The 32-bit JVM favors 8_8b and the 64-bit JVM favors 8_8d.  I agree that 8_8d 
is the winner here.

Because the '8' variants fall back to one byte at a time when the input is 
shorter than 8 bytes, they perform worse than the old PureJavaCrc32 on inputs 
of 4 to 7 bytes.  Is this important?  It would be useful to know how often the 
crc code is called on small byte chunks.  We can get this to near PureJavaCrc32 
speeds for 4 byte sizes if we add a four-byte-at-a-time block to 8_8d.

That is, we can go to something of the form: 

{code}
while (len > 7) {
    < 8 at a time code here >
}
while (len > 3) {
    < 4 at a time code here >
}
while (len > 0) {
    < 1 at a time code here >
}
{code}

Whether that is important depends on the use cases and on how often requests 
fall in the 4-7 and 12-15 byte ranges.  We could even add a block for a 
two-at-a-time optimization.
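
To make the tiered shape above concrete, here is a minimal sketch.  It is not 
the code from the attached patches: the class name TieredCrc32Sketch, the 
helper le32, and the table layout T are all made up for illustration, and it 
assumes the standard reflected CRC-32 polynomial (0xEDB88320) with 
slicing-style lookup tables.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Hypothetical illustration of the 8 / 4 / 1 byte tiers; not the patch code. */
final class TieredCrc32Sketch {
    // T[0] is the usual byte-at-a-time table.  T[k][i] is the effect of
    // byte i followed by k zero bytes, which is what slicing needs.
    private static final int[][] T = new int[8][256];

    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            T[0][i] = c;
        }
        for (int k = 1; k < 8; k++) {
            for (int i = 0; i < 256; i++) {
                int prev = T[k - 1][i];
                T[k][i] = (prev >>> 8) ^ T[0][prev & 0xff];
            }
        }
    }

    // Reads four bytes starting at off as a little-endian int.
    private static int le32(byte[] b, int off) {
        return (b[off] & 0xff)
            | (b[off + 1] & 0xff) << 8
            | (b[off + 2] & 0xff) << 16
            | (b[off + 3] & 0xff) << 24;
    }

    static long crc32(byte[] b, int off, int len) {
        int crc = 0xFFFFFFFF;
        while (len > 7) {                       // 8 at a time
            int one = le32(b, off) ^ crc;
            int two = le32(b, off + 4);
            crc = T[7][one & 0xff] ^ T[6][(one >>> 8) & 0xff]
                ^ T[5][(one >>> 16) & 0xff] ^ T[4][one >>> 24]
                ^ T[3][two & 0xff] ^ T[2][(two >>> 8) & 0xff]
                ^ T[1][(two >>> 16) & 0xff] ^ T[0][two >>> 24];
            off += 8;
            len -= 8;
        }
        while (len > 3) {                       // 4 at a time
            int one = le32(b, off) ^ crc;
            crc = T[3][one & 0xff] ^ T[2][(one >>> 8) & 0xff]
                ^ T[1][(one >>> 16) & 0xff] ^ T[0][one >>> 24];
            off += 4;
            len -= 4;
        }
        while (len > 0) {                       // 1 at a time
            crc = (crc >>> 8) ^ T[0][(crc ^ b[off]) & 0xff];
            off++;
            len--;
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // Compare against the JDK implementation on a short input.
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);
        System.out.println(crc32(data, 0, data.length) == reference.getValue());
    }
}
```

After the 8-byte loop at most 7 bytes remain, so the middle loop runs at most 
once per call; it is written as a while loop only to mirror the form above.  
Whether the extra tier pays for its branch is exactly the question the 
benchmarks would have to answer.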

> Improve PureJavaCrc32
> ---------------------
>
>                 Key: HADOOP-6166
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6166
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: c6166_20090722.patch, c6166_20090722_benchmark_32VM.txt, 
> c6166_20090722_benchmark_64VM.txt, c6166_20090727.patch, 
> c6166_20090728.patch, c6166_20090810.patch, c6166_20090811.patch, graph.r, 
> graph.r, Rplots-laptop.pdf, Rplots-nehalem32.pdf, Rplots-nehalem64.pdf, 
> Rplots.pdf, Rplots.pdf, Rplots.pdf
>
>
> Got some ideas to improve CRC32 calculation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
