[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2017-04-04 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955113#comment-15955113
 ] 

Ismael Juma commented on KAFKA-3174:


We switched message format V2 to use CRC32C (see KAFKA-1449). It seems that the 
safest thing is not to change what we do for V0 and V1.

> Re-evaluate the CRC32 class performance.
>
> Key: KAFKA-3174
> URL: https://issues.apache.org/jira/browse/KAFKA-3174
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 0.9.0.0
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
>
> We used org.apache.kafka.common.utils.Crc32 in clients because it has better 
> performance than java.util.zip.CRC32 in Java 1.6.
> In a recent test I ran, it looks like in Java 1.8 the CRC32 class is 2x as fast 
> as the Crc32 class we are using now. We may want to re-evaluate the performance 
> of the Crc32 class and see whether it makes sense to simply use the Java CRC32 
> instead.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-02 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128038#comment-15128038
 ] 

Ismael Juma commented on KAFKA-3174:


Becket, I think the best way to validate the reasoning is to benchmark the 
producer with varying message sizes (and potentially with compression 
enabled/disabled).

Also, I'll say it again just to make this clear for others reading this: the 
JDK does not use the CRC32 instruction in SSE 4.2 because the CPU instruction 
uses a different polynomial than the one in ZIP. The Intel instruction is for 
CRC32-C. Relevant quote:

"CRC (Cyclic Redundancy Check) is a remainder from dividing your message by a 
polynomial. Most popular file formats and protocols (Ethernet, MPEG-2, ZIP, 
RAR, 7-Zip, GZip, and PNG) use the polynomial 0x04C11DB7, while Intel's 
hardware implementation is based on another polynomial, 0x1EDC6F41 (used in 
iSCSI and Btrfs). Newly designed protocols and formats can choose the second 
polynomial to benefit from hardware acceleration, but CRC-32 with polynomial 
0x04C11DB7 has to be calculated in software. The CRC32 instruction is not 
supported by AMD processors."
http://www.strchr.com/crc32_popcnt

The JDK simply tries to optimise the software implementation via SIMD 
instructions.
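
To make the polynomial difference concrete, here is a minimal sketch that uses only JDK classes. It assumes Java 9 or later, where java.util.zip.CRC32C was added (it is not available on the Java 7 and Java 8 runtimes discussed in this ticket). The class name Crc32Polynomials and its helper methods are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

// Hypothetical demo class, not part of Kafka.
final class Crc32Polynomials {

    // Runs the whole array through the given checksum and returns its value.
    static long checksumOf(Checksum checksum, byte[] data) {
        checksum.update(data, 0, data.length);
        return checksum.getValue();
    }

    // CRC-32 with the ZIP polynomial (java.util.zip.CRC32).
    static long crc32(byte[] data) {
        return checksumOf(new CRC32(), data);
    }

    // CRC-32C, the variant that the SSE 4.2 instruction computes.
    static long crc32c(byte[] data) {
        return checksumOf(new CRC32C(), data);
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC-32  = 0x%08X%n", crc32(data));
        System.out.printf("CRC-32C = 0x%08X%n", crc32c(data));
    }
}
```

For the standard check input "123456789" the two variants give different values (0xCBF43926 for CRC-32 and 0xE3069283 for CRC-32C), so one cannot be swapped for the other without changing what is stored in the message.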



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-02 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129644#comment-15129644
 ] 

Jiangjie Qin commented on KAFKA-3174:

[~ijuma] Thanks for the clarification. Yes, I agree we should run a 
parameterized test to see what the performance difference is between the current 
Crc32 and the Java CRC32. But I might be busy for the next few days, so please 
feel free to take this ticket if you are interested. Thanks.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-01 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126959#comment-15126959
 ] 

Jiangjie Qin commented on KAFKA-3174:

[~ijuma] My test code is below. Initially I only tested the 1MB size. I added 
tests for different sizes after seeing your comments above.

{code}
import java.util.Random;
import java.util.zip.CRC32;

import org.apache.kafka.common.utils.Crc32;

// The wrapper class name is arbitrary; the original post showed only main().
public class Crc32Test {
    public static void main(String[] args) {
        int[] sizes = {8, 16, 32, 128, 1024, 65536, 1048576};
        for (int size : sizes) {
            byte[] bytes = new byte[size];
            Random random = new Random();
            random.nextBytes(bytes);
            // Keep the total number of bytes checksummed constant across sizes.
            int loop = 5000 * (1048576 / size);

            // Kafka's pure Java Crc32.
            long start = System.currentTimeMillis();
            for (int i = 0; i < loop; i++) {
                Crc32 crc32 = new Crc32();
                crc32.update(bytes, 0, bytes.length);
            }
            System.out.println(String.format("KCrc32: Size = %d\t, time = %d",
                    size, (System.currentTimeMillis() - start)));

            // java.util.zip.CRC32 from the JDK.
            start = System.currentTimeMillis();
            for (int i = 0; i < loop; i++) {
                CRC32 crc32 = new CRC32();
                crc32.update(bytes, 0, bytes.length);
            }
            System.out.println(String.format("JCrc32: Size = %d\t, time = %d\n",
                    size, (System.currentTimeMillis() - start)));
        }
    }
}
{code}

And here is the output:
{code}
KCrc32: Size = 8, time = 10400
JCrc32: Size = 8, time = 9907

KCrc32: Size = 16   , time = 6959
JCrc32: Size = 16   , time = 8419

KCrc32: Size = 32   , time = 5596
JCrc32: Size = 32   , time = 5587

KCrc32: Size = 128  , time = 4397
JCrc32: Size = 128  , time = 3305

KCrc32: Size = 1024 , time = 4115
JCrc32: Size = 1024 , time = 2392

KCrc32: Size = 65536, time = 4087
JCrc32: Size = 65536, time = 2296

KCrc32: Size = 1048576  , time = 4078
JCrc32: Size = 1048576  , time = 2298
{code}

From the output above, it looks like KCrc32 and JCrc32 are comparable for sizes 
up to 32 bytes (except at 16 bytes, where KCrc32 is faster). From 128 bytes 
upward, JCrc32 is faster. My ~2x result came from the 1MB size.

This Intel paper says that the CRC32 instruction is part of SSE4.2, which was 
introduced in Nov 2008. 
http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/crc-iscsi-polynomial-crc32-instruction-paper.pdf
Wikipedia says the same thing.
https://en.wikipedia.org/wiki/SSE4#SSE4.2
AMD started to support SSE4.2 in Oct 2011.

I ran the above test on both my desktop (Intel(R) Xeon(R) CPU E5-2620 v2 @ 
2.10GHz) and macbook (Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz), both of them 
have SSE4.2 support.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-01 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127127#comment-15127127
 ] 

Ismael Juma commented on KAFKA-3174:


Unfortunately, the micro-benchmark above is not reliable. That's why I used JMH 
(which is maintained by the Oracle JVM team for their own Java benchmarks) for 
my benchmarks. See 
https://groups.google.com/d/msg/mechanical-sympathy/m4opvy4xq3U/7lY8x8SvHgwJ 
for why you should use JMH for nano and micro-benchmarks.

The Intel CRC32 instruction is for CRC32C, which uses a different polynomial 
than the one we use for Kafka (again, see KAFKA-1449 for more details on this). 
I actually checked the HotSpot code 
(http://hg.openjdk.java.net/hsx/hsx25/hotspot/rev/b800986664f4) and the 
assembly generated by the HotSpot JIT to verify my assertion before I posted my 
previous comment with regards to SSE 2, SSE 4.1, AVX and CLMUL being used.
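
As a rough illustration of two of the problems with the micro-benchmark above (it has no warm-up phase, and it never reads the checksum, so the JIT is free to drop work), here is a hand-rolled timer that addresses only those two points. It is a sketch for discussion and not a replacement for JMH, which also handles forking, iteration statistics and many other pitfalls. The class NaiveCrcTimer and its method names are hypothetical, and the code assumes Java 8 or later.

```java
import java.util.Random;
import java.util.function.Supplier;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical hand-rolled harness; NOT a replacement for JMH.
final class NaiveCrcTimer {

    // Accumulates every checksum so the computed values are actually used.
    static long sink;

    // Computes one checksum over the whole array with a fresh instance.
    static long checksum(Supplier<Checksum> factory, byte[] data) {
        Checksum crc = factory.get();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Returns the average nanoseconds per checksum over `iterations` timed runs,
    // after `warmup` untimed runs that give the JIT a chance to compile the code.
    static double nanosPerOp(Supplier<Checksum> factory, byte[] data,
                             int warmup, int iterations) {
        for (int i = 0; i < warmup; i++) {
            sink += checksum(factory, data);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += checksum(factory, data);
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024];
        new Random(42).nextBytes(data);
        double ns = nanosPerOp(CRC32::new, data, 100_000, 1_000_000);
        System.out.printf("java.util.zip.CRC32, 1024 bytes: %.1f ns/op (sink=%d)%n",
                ns, sink);
    }
}
```

Even with these changes, numbers from a loop like this should be treated with suspicion; the JMH results are the ones to rely on.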



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-01 Thread Todd Palino (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126292#comment-15126292
 ] 

Todd Palino commented on KAFKA-3174:


Yeah, definitely no problems with Java 1.8. We've been running 1.8 u5 for quite 
some time, and we're in the process of updating to u40. It's worth noting that 
we have been running into a number of SEGVs with mirror maker (but not the 
broker) under u5, but the problem is supposedly fixed in u40.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-01 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126346#comment-15126346
 ] 

Ismael Juma commented on KAFKA-3174:


[~becket_qin] We started recommending Java 8 around the same time we 
released 0.9.0.0 (we also mention there that LinkedIn is using Java 8):

http://kafka.apache.org/documentation.html#java

I did some investigation so that we understand the specifics of the improvement 
to CRC32 in the JDK. It relies on SSE 2, SSE 4.1, AVX and CLMUL. SSE has been 
available for a long time, CLMUL since Intel Westmere (2010) and AVX since 
Intel Sandy Bridge (2011). It's probably OK to assume that these instructions 
will be available for those who are constrained by CPU performance.

Note that this does not use the CRC32 CPU instruction, as we would have to use 
CRC32C for that (see KAFKA-1449 for more details on what is possible if we are 
willing to support CRC32C).

I wrote a simple JMH benchmark:

https://gist.github.com/ijuma/f86ad935715cfd4e258e

I tested it on my Ivy Bridge MacBook on JDK 7 update 80 and JDK 8 update 76, 
configuring JMH to use 10 one second measurement iterations, 10 one second 
warmup iterations and 1 fork.

JDK 8 update 76 results:

{code}
[info] Benchmark              (bytesSize)  Mode  Cnt       Score       Error  Units
[info] Crc32Bench.jdkCrc32              8  avgt   10      24.902 ±     0.728  ns/op
[info] Crc32Bench.jdkCrc32             16  avgt   10      48.819 ±     2.550  ns/op
[info] Crc32Bench.jdkCrc32             32  avgt   10      83.434 ±     2.668  ns/op
[info] Crc32Bench.jdkCrc32            128  avgt   10     127.679 ±     5.185  ns/op
[info] Crc32Bench.jdkCrc32           1024  avgt   10     450.105 ±    18.943  ns/op
[info] Crc32Bench.jdkCrc32          65536  avgt   10   25579.406 ±   683.017  ns/op
[info] Crc32Bench.jdkCrc32        1048576  avgt   10  408708.242 ± 12183.543  ns/op
[info] Crc32Bench.kafkaCrc32            8  avgt   10      14.761 ±     0.647  ns/op
[info] Crc32Bench.kafkaCrc32           16  avgt   10      19.114 ±     0.423  ns/op
[info] Crc32Bench.kafkaCrc32           32  avgt   10      34.243 ±     1.066  ns/op
[info] Crc32Bench.kafkaCrc32          128  avgt   10     114.481 ±     2.812  ns/op
[info] Crc32Bench.kafkaCrc32         1024  avgt   10     835.630 ±    22.412  ns/op
[info] Crc32Bench.kafkaCrc32        65536  avgt   10   52234.713 ±  2229.624  ns/op
[info] Crc32Bench.kafkaCrc32      1048576  avgt   10  822903.613 ± 20950.560  ns/op
{code}

JDK 7 update 80 results:

{code}
[info] Benchmark              (bytesSize)  Mode  Cnt       Score       Error  Units
[info] Crc32Bench.jdkCrc32              8  avgt   10     114.802 ±     8.289  ns/op
[info] Crc32Bench.jdkCrc32             16  avgt   10     122.030 ±     3.153  ns/op
[info] Crc32Bench.jdkCrc32             32  avgt   10     131.082 ±     5.501  ns/op
[info] Crc32Bench.jdkCrc32            128  avgt   10     154.116 ±     6.164  ns/op
[info] Crc32Bench.jdkCrc32           1024  avgt   10     512.151 ±    15.934  ns/op
[info] Crc32Bench.jdkCrc32          65536  avgt   10   25460.014 ±  1532.627  ns/op
[info] Crc32Bench.jdkCrc32        1048576  avgt   10  401996.290 ± 18606.012  ns/op
[info] Crc32Bench.kafkaCrc32            8  avgt   10      14.493 ±     0.494  ns/op
[info] Crc32Bench.kafkaCrc32           16  avgt   10      20.329 ±     2.019  ns/op
[info] Crc32Bench.kafkaCrc32           32  avgt   10      37.706 ±     0.338  ns/op
[info] Crc32Bench.kafkaCrc32          128  avgt   10     124.197 ±     6.368  ns/op
[info] Crc32Bench.kafkaCrc32         1024  avgt   10     908.327 ±    32.487  ns/op
[info] Crc32Bench.kafkaCrc32        65536  avgt   10   57000.705 ±  2976.852  ns/op
[info] Crc32Bench.kafkaCrc32      1048576  avgt   10  940433.528 ± 26257.962  ns/op
{code}

Using a VM intrinsic avoids JNI set-up costs making JDK 8 much faster than JDK 
7 for small byte arrays. Having said that, Kafka's pure Java implementation is 
still faster for byte arrays of up to 128 bytes according to this benchmark. 
Surprisingly, the results are similar for JDK 7 and JDK 8 for larger byte 
arrays. I had a quick look at the assembly generated for JDK 8 and it seems to 
use AVX and CLMUL as per the OpenJDK commit I linked to. Unfortunately, it's a 
bit more work to look at the assembly generated by JDK 7 on a Mac and so I 
didn't. More investigation would be required to understand why this is so (and 
to be able to trust the numbers).
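
For anyone who finds throughput easier to compare than ns/op, the conversion is simple arithmetic. The snippet below is a small sketch (the class CrcThroughput is hypothetical and not part of the benchmark) that converts the 1048576-byte scores from the JDK 8 update 76 results above into GB/s.

```java
// Hypothetical helper for reading the JMH results; not part of the benchmark.
final class CrcThroughput {

    // Converts an average time (ns/op) for a payload of `bytes` into bytes per
    // nanosecond, which is numerically the same as GB/s (10^9 bytes per second).
    static double gigabytesPerSecond(long bytes, double nanosPerOp) {
        return bytes / nanosPerOp;
    }

    public static void main(String[] args) {
        long payload = 1_048_576;
        // Scores copied from the JDK 8 update 76 results for 1048576 bytes.
        double jdk = gigabytesPerSecond(payload, 408_708.242);
        double kafka = gigabytesPerSecond(payload, 822_903.613);
        System.out.printf("jdkCrc32:   %.2f GB/s%n", jdk);
        System.out.printf("kafkaCrc32: %.2f GB/s%n", kafka);
        System.out.printf("ratio:      %.2fx%n", jdk / kafka);
    }
}
```

On these numbers the JDK implementation is roughly twice as fast as Kafka's class for 1048576-byte arrays, while at 8 bytes the ordering flips (24.902 ns/op for the JDK versus 14.761 ns/op for Kafka).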

Looking at how we compute CRCs in `Record`, there are two different code paths 
depending on whether we call it from `Compressor` or not. The former invokes 
Crc32 update methods several times (both the byte array and int versions) while 
the latter invokes the byte array version once only.
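
The two call patterns can be sketched with JDK classes only. This is an illustration and not Kafka's actual `Record` or `Compressor` code: the class CrcCallPatterns, the `header` field and the `updateInt` helper are assumptions, with `updateInt` standing in for an int-based update method.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical illustration of the two call patterns; not Kafka's Record code.
final class CrcCallPatterns {

    // Feeds a 4-byte big-endian int into the checksum one byte at a time.
    static void updateInt(Checksum crc, int value) {
        crc.update((value >>> 24) & 0xFF);
        crc.update((value >>> 16) & 0xFF);
        crc.update((value >>> 8) & 0xFF);
        crc.update(value & 0xFF);
    }

    // Pattern 1: several small updates, field by field.
    static long incremental(int header, byte[] payload) {
        Checksum crc = new CRC32();
        updateInt(crc, header);
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // Pattern 2: a single update over the already serialized bytes.
    static long oneShot(int header, byte[] payload) {
        // ByteBuffer is big-endian by default, matching updateInt above.
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length);
        buf.putInt(header).put(payload);
        Checksum crc = new CRC32();
        crc.update(buf.array(), 0, buf.position());
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3, 4, 5};
        System.out.println(incremental(0xCAFEBABE, payload) == oneShot(0xCAFEBABE, payload));
    }
}
```

Both patterns produce the same checksum, but the first pays any per-call overhead once per update, which is where the small-array results become relevant.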

To really understand the impact of this change, I think we need to benchmark 
the producer with varying message sizes with both 

[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-02-01 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126349#comment-15126349
 ] 

Ismael Juma commented on KAFKA-3174:


[~toddpalino], let us know when you switch to u40 so that we update the docs to 
say that (particularly important if you are getting SEGVs with Mirror Maker!).



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-30 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124813#comment-15124813
 ] 

Ismael Juma commented on KAFKA-3174:


Great, left a comment in the PR.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-30 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125169#comment-15125169
 ] 

Ismael Juma commented on KAFKA-3174:


That's good to know, Gwen. Still, we were recommending Java 7 in our 
documentation until very recently and we still claim to support it. If it's 
straightforward to avoid the performance regression, why not do it?

If we believe users have moved on to 8, we should just drop support for 7. That 
would make things easier for us on many levels.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-30 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125158#comment-15125158
 ] 

Gwen Shapira commented on KAFKA-3174:

As anecdotal evidence, 100% of the customers I worked with in the last 6 months 
have been on Java 1.8 (including everything from tiny startups to huge 
financial institutions).
It looks like companies simply jumped from Java 6 to Java 8 directly. So I 
wouldn't be too concerned about a regression in Java 1.7.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-30 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125186#comment-15125186
 ] 

Jiangjie Qin commented on KAFKA-3174:

Thanks for the info [~gwenshap]. We have also been using Java 1.8 for a while 
and did not see any issue as far as I know. [~toddpalino] can comment. Should 
we start recommending that people use Java 1.8 while still supporting 1.7? We 
can put a note warning people about the potential CRC performance degradation 
if they are using 1.7.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-29 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124558#comment-15124558
 ] 

Ismael Juma commented on KAFKA-3174:


Becket, are you planning to do this one? I'll take it, if you aren't.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-29 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124537#comment-15124537
 ] 

Jay Kreps commented on KAFKA-3174:

This is a great find. It looks like it did become an intrinsic in 1.8 
https://bugs.openjdk.java.net/browse/JDK-8131048

We can either remove it now since we recommend 1.8, though this will appear as 
a perf regression on 1.7, or we can wait a little, or we can try to do 
something dynamic where we check the java version and jvm type and make a 
decision.



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124778#comment-15124778
 ] 

ASF GitHub Bot commented on KAFKA-3174:

GitHub user becketqin opened a pull request:

https://github.com/apache/kafka/pull/841

KAFKA-3174: Change Crc32 to use java.util.zip.CRC32



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/becketqin/kafka KAFKA-3174

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #841


commit 4419cb58d84466e36c8b1119704eac93566c9974
Author: Jiangjie Qin 
Date:   2016-01-30T06:44:20Z

Change Crc32 to use java.util.zip.CRC32






[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-29 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124630#comment-15124630
 ] 

Jiangjie Qin commented on KAFKA-3174:

[~ijuma] I just started writing the patch. Should be able to submit it soon. Do 
you mind helping with review?



[jira] [Commented] (KAFKA-3174) Re-evaluate the CRC32 class performance.

2016-01-29 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124555#comment-15124555
 ] 

Ismael Juma commented on KAFKA-3174:


Jay, I think you meant https://bugs.openjdk.java.net/browse/JDK-7088419 as your 
link is for PPC and Java 9.

We should probably select an implementation in a static block based on the JVM 
version until we drop support for Java 7.
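
A minimal sketch of what such a static selection could look like, written to compile on Java 7. All names here are hypothetical: `Checksums` is an illustrative holder class and `SimpleCrc32` is a plain table-driven CRC-32 used only as a stand-in for Kafka's own Crc32 class. The sketch looks only at the `java.specification.version` system property and does not check the JVM type.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical holder class; names are illustrative, not Kafka's.
final class Checksums {

    // Decided once, when the class is initialized.
    private static final boolean USE_JDK_CRC32;

    static {
        USE_JDK_CRC32 = isJava8OrLater(System.getProperty("java.specification.version"));
    }

    // Spec versions look like "1.6", "1.7", "1.8" before Java 9 and
    // "9", "10", "11", ... afterwards.
    static boolean isJava8OrLater(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int major = Integer.parseInt(parts[0]);
        if (major == 1 && parts.length > 1) {
            major = Integer.parseInt(parts[1]);
        }
        return major >= 8;
    }

    // Returns the JDK implementation on Java 8+, the pure Java one otherwise.
    static Checksum newCrc32() {
        if (USE_JDK_CRC32) {
            return new CRC32();
        }
        return new SimpleCrc32();
    }
}

// Simple table-driven CRC-32 (ZIP polynomial); a stand-in for Kafka's Crc32.
final class SimpleCrc32 implements Checksum {
    private static final int[] TABLE = new int[256];

    static {
        for (int n = 0; n < 256; n++) {
            int c = n;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[n] = c;
        }
    }

    private int crc = 0xFFFFFFFF;

    @Override
    public void update(int b) {
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
    }

    @Override
    public void update(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            update(buf[i]);
        }
    }

    @Override
    public long getValue() {
        return (~crc) & 0xFFFFFFFFL;
    }

    @Override
    public void reset() {
        crc = 0xFFFFFFFF;
    }
}
```

Callers would obtain instances through `Checksums.newCrc32()`, so the choice stays in one place and the fallback can be deleted once Java 7 support is dropped.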
