[ 
https://issues.apache.org/jira/browse/CASSANDRA-16360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771183#comment-17771183
 ] 

Maxim Muzafarov commented on CASSANDRA-16360:
---------------------------------------------

[~samt] Hello,

I have prepared a patch to replace the CRC32 algorithm with the more efficient 
CRC32C. This will only be used for a message payload in encoders and decoders. 
I suggest keeping the CRC24 algorithm for headers as it is used now (see the 
details below). New benchmarks have also been added to prove that the changes 
are worthwhile.

The implementation consists of the following parts:
 - CASSANDRA-18780 with a fix for ConnectionBurnTest which is used for the 
patch verification;
 - and a code that changes CRC32 to CRC32C;

Can you take a look at these changes?
[https://github.com/apache/cassandra/pull/2647]
h3. Implementation Plan

As you mentioned above and in the [Drift backwards compatibility 
#2|https://lists.apache.org/thread/2h0xtmf6ygqyf89s5hjy65185pplkh43] thread on 
the dev-list the difference between server-server and client-server protocols, 
so this patch focuses only on the internode communication protocol, which seems 
easier for us to change. Replacing the CRC algorithm in the client-server 
communication will probably require a bump to up the native protocol version, 
or as I suggested [Drift backwards compatibility 
#1|https://lists.apache.org/thread/t977l7b4x78zkcr24rn16j8v0jmhrmk5] we can try 
to implement the feature flags for it, but still, this thing needs a discussion 
and it is better to move it to another issue.

So, my plan here is:
 - We replace the CRC for the internode messaging protocol in this issue;
 - We create an issue to investigate possible solutions to fix the exact 
replacement in the SimpleClient and other drivers (this is related to 
client-server communication that we have now);

h3. Current solution: Replace CRC32 for the payload only

Benedict mentioned that we can also try to get rid of CRC24 and replace it with 
CRC32C. I think this is not possible and it is not worth it for the following 
reasons.
h4. Benchmarks haven't shown any valuable difference.

I have implemented the prototype 
[cassandra-16360-poc|https://github.com/apache/cassandra/compare/trunk...Mmuzaf:cassandra:cassandra-16360-poc#diff-3e4dbb556bbe3df9b3061a15c96c6be539727cb8788d5274e597425b71c0c58aR62]
 that replaces both CRC24 and CRC32 in an encoder and its corresponding decoder.

So, I have prepared and measured 
[here|https://gist.github.com/Mmuzaf/349c5f8dbdfada82e5199f36cbfc15b4] a few 
combinations of encoder-decoder pairs:
 - CRC24 + CRC32 (current)
 - CRC24 + CRC32C (change for payload)
 - CRC32C + CRC32C (a new implementation)

So, according to benchmarks, we can see a valuable difference between pairs 
[CRC24 + CRC32] and [CRC32C + CRC32C] and no difference between [CRC24 + 
CRC32C] and [CRC32C + CRC32C]. This leads me to think that calculating CRC24 
over an 8-byte header makes a really small contribution to the total time.
||CRC24 + CRC32||CRC32C + CRC32C||
|1109.830 ± 111.170 ns/op|863.869 ± 20.206 ns/op|

 
||CRC24 + CRC32C||CRC32C + CRC32C||
|1039.133 ± 61.430 ns/op|1090.281 ± 97.314 ns/op|
h4. Replacing CRC24 makes the header longer than 8 bytes

So, for the LZ4 encoder and decoder replacing the CRC24 algorithm (3 bytes) 
with the CRC32C algorithm (4 bytes) makes the header itself longer than 8 
bytes. Currently, we use one hop ({{{}readLong{}}}) to read the whole message 
header. If the message header is longer than 8 bytes, then the efficiency of 
the replacement itself is not worthwhile because, as it requires two reads 
instead of one for almost no good reason.
h4. Patch complexity and code duplication

It is worth noting that adding another implementation of frame decoders and 
frame encoders is much more difficult and time-consuming to verify on the 
reviewer's side, thus reducing the chance that the patch will be reviewed :)
h3. Benchmarks

I have included the FramingBench, which measures encoding and decoding 
operations, in the patch to verify that we are on the right track.
{code:java}
     [java] Benchmark            (payloadSize)      (type)  Mode  Cnt     Score 
    Error  Units
     [java] FramingBench.decode           2037         CRC  avgt    4   382.764 
±  75.327  ns/op
     [java] FramingBench.decode           2037      CRC32C  avgt    4   272.388 
±  63.618  ns/op
     [java] FramingBench.decode           2037         LZ4  avgt    4   377.231 
±  20.023  ns/op
     [java] FramingBench.decode           2037  LZ4_CRC32C  avgt    4   278.301 
±  28.004  ns/op
     [java] FramingBench.decode          16373         CRC  avgt    4  1324.664 
±  39.001  ns/op
     [java] FramingBench.decode          16373      CRC32C  avgt    4  1318.600 
± 102.559  ns/op
     [java] FramingBench.decode          16373         LZ4  avgt    4  1292.971 
±  66.381  ns/op
     [java] FramingBench.decode          16373  LZ4_CRC32C  avgt    4  1283.954 
±  87.084  ns/op
     [java] FramingBench.encode           2037         CRC  avgt    4   382.130 
±  12.318  ns/op
     [java] FramingBench.encode           2037      CRC32C  avgt    4   311.646 
±  17.114  ns/op
     [java] FramingBench.encode           2037         LZ4  avgt    4  2126.181 
± 287.391  ns/op
     [java] FramingBench.encode           2037  LZ4_CRC32C  avgt    4  2026.204 
± 219.598  ns/op
     [java] FramingBench.encode          16373         CRC  avgt    4  1311.842 
± 102.518  ns/op
     [java] FramingBench.encode          16373      CRC32C  avgt    4  1310.921 
± 139.089  ns/op
     [java] FramingBench.encode          16373         LZ4  avgt    4  6818.232 
± 200.489  ns/op
     [java] FramingBench.encode          16373  LZ4_CRC32C  avgt    4  6499.657 
± 618.240  ns/op
{code}
[https://gist.github.com/Mmuzaf/a7ac6f5759c16ee24683856704e7c941]

> CRC32 is inefficient on x86
> ---------------------------
>
>                 Key: CASSANDRA-16360
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16360
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Messaging/Client
>            Reporter: Avi Kivity
>            Assignee: Maxim Muzafarov
>            Priority: Normal
>              Labels: protocolv6
>             Fix For: 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The client/server protocol specifies CRC24 and CRC32 as the checksum 
> algorithm (cql_protocol_V5_framing.asc). Those however are expensive to 
> compute; this affects both the client and the server.
>  
> A better checksum algorithm is CRC32C, which has hardware support on x86 (as 
> well as other modern architectures).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to