[ https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582449#comment-16582449 ]
Tom van der Woerdt commented on CASSANDRA-13304:
------------------------------------------------

It looks like this can be configured entirely from the cassandra.yaml file, so this could be done as a change to the default configuration. Specifically, this is what I did to my cassandra.yaml to make this work:

{code:java}
client_encryption_options:
    enabled: true
    optional: true
    keystore: conf/.keystore
    keystore_password: cassandra
    cipher_suites: [TLS_ECDH_anon_WITH_NULL_SHA]
{code}

I had to then create an empty keystore, or Cassandra crashes, but we don't actually need it. This allowed me to connect:

{code:java}
$ openssl s_client -connect localhost:9042 -cipher NULL
CONNECTED(00000005)
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 290 bytes and written 390 bytes
---
New, TLSv1/SSLv3, Cipher is AECDH-NULL-SHA
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : AECDH-NULL-SHA
    Session-ID: 5B756CA79320B7AF5A4DB9FAE00CDAA3F64AFAACE14C57A51F732CB9BCAC0807
    Session-ID-ctx:
    Master-Key: 652BE33C8F4579236966B50554F52A4C8C53BECAC4420B26BB7D22F33DFA6CE810A29E9BEA1FB8E3C9C0D22782D82A33
    Start Time: 1534422183
    Timeout   : 300 (sec)
    Verify return code: 0 (ok)
---
{code}

I have to explicitly tell openssl that it's acceptable to not encrypt, and then we have a connection. The negotiated cipher is AECDH-NULL-SHA, effectively giving us a TLSv1.2 connection using no encryption or authentication, but with integrity protected by SHA1.

Cassandra already has support for running both TLS and non-TLS connections over the same port, so port 9042 is still usable for unprotected connections, making this change fully backwards compatible.

Hope that helps!

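To make "no encryption, but integrity protected by SHA1" concrete, here is a minimal Python sketch of the idea behind a NULL-cipher suite: the payload travels in the clear with an HMAC-SHA1 tag appended. This is a simplification, not the TLS record format: a real TLS 1.2 MAC also covers the sequence number and record header, and the MAC key comes from the handshake. The names `protect` and `verify` and the fixed key are hypothetical, for illustration only.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def protect(mac_key: bytes, payload: bytes) -> bytes:
    """Return the payload in the clear followed by its HMAC-SHA1 tag."""
    tag = hmac.new(mac_key, payload, hashlib.sha1).digest()
    return payload + tag


def verify(mac_key: bytes, record: bytes) -> bytes:
    """Check the trailing tag and return the payload, or raise on mismatch."""
    payload, tag = record[:-TAG_LEN], record[-TAG_LEN:]
    expected = hmac.new(mac_key, payload, hashlib.sha1).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return payload
```

A single flipped bit anywhere in the record makes `verify` raise, while the payload itself stays readable to anyone on the wire. Because the key exchange is anonymous, this guards against accidental corruption rather than against an active attacker.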
> Add checksumming to the native protocol
> ---------------------------------------
>
>                 Key: CASSANDRA-13304
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13304
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Sam Tunnicliffe
>            Priority: Blocker
>              Labels: client-impacting
>             Fix For: 4.x
>
>         Attachments: 13304_v1.diff, boxplot-read-throughput.png,
> boxplot-write-throughput.png
>
>
> The native binary transport implementation doesn't include checksums. This
> makes it highly susceptible to silently inserting corrupted data either due
> to hardware issues causing bit flips on the sender/client side, C*/receiver
> side, or network in between.
> Attaching an implementation that makes checksum'ing mandatory (assuming both
> client and server know about a protocol version that supports checksums) --
> and also adds checksumming to clients that request compression.
> The serialized format looks something like this:
> {noformat}
>  *                      1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
>  *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |  Number of Compressed Chunks  |     Compressed Length (e1)    /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * /  Compressed Length cont. (e1) |    Uncompressed Length (e1)   /
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * | Checksum of Lengths cont. (e1)|    Compressed Bytes (e1)    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                      CRC32 Checksum (e1)                     ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Length (e2)                     |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                   Uncompressed Length (e2)                    |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                CRC32 Checksum of Lengths (e2)                 |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Bytes (e2)                    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                      CRC32 Checksum (e2)                     ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Length (en)                     |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                   Uncompressed Length (en)                    |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                CRC32 Checksum of Lengths (en)                 |
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                    Compressed Bytes (en)                    +//
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>  * |                      CRC32 Checksum (en)                     ||
>  * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> {noformat}
> The first pass here adds checksums only to the actual contents of the frame
> body itself (and doesn't actually checksum lengths and headers). While it
> would be great to fully add checksumming across the entire protocol, the
> proposed implementation will ensure we at least catch corrupted data and
> likely protect ourselves pretty well anyway.
> I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor
> implementation as it's been deprecated for a while -- is really slow and
> crappy compared to LZ4 -- and we should do everything in our power to make
> sure no one in the community is still using it. I left it in (for obvious
> backwards-compatibility reasons) for old clients that don't know about the
> new protocol.
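
The layout in the diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the attached patch: fields are assumed big-endian, the chunk count is taken as 16 bits and each length and checksum as 32 bits (as the diagram suggests), the per-chunk CRC32 is assumed to cover the compressed bytes, the lengths CRC32 is assumed to cover the two length fields, and `zlib` stands in for LZ4 because the standard library has no LZ4 binding. The names `encode_body` and `decode_body` are hypothetical.

```python
import struct
import zlib

CHUNK_SIZE = 32 * 1024  # the 32kb "default" chunk size from the description


def encode_body(body: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Split the frame body into chunks; compress and checksum each one."""
    chunks = [body[i:i + chunk_size] for i in range(0, len(body), chunk_size)]
    out = [struct.pack(">H", len(chunks))]  # number of compressed chunks
    for chunk in chunks:
        compressed = zlib.compress(chunk)  # stand-in for LZ4 (assumption)
        lengths = struct.pack(">II", len(compressed), len(chunk))
        out.append(lengths)
        out.append(struct.pack(">I", zlib.crc32(lengths)))     # CRC32 of lengths
        out.append(compressed)
        out.append(struct.pack(">I", zlib.crc32(compressed)))  # CRC32 of bytes
    return b"".join(out)


def decode_body(data: bytes) -> bytes:
    """Verify every checksum, then decompress and reassemble the body."""
    (count,) = struct.unpack_from(">H", data, 0)
    pos = 2
    parts = []
    for _ in range(count):
        lengths = data[pos:pos + 8]
        compressed_len, uncompressed_len = struct.unpack(">II", lengths)
        (lengths_crc,) = struct.unpack_from(">I", data, pos + 8)
        if zlib.crc32(lengths) != lengths_crc:
            raise ValueError("corrupted chunk lengths")
        pos += 12
        compressed = data[pos:pos + compressed_len]
        pos += compressed_len
        (chunk_crc,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if zlib.crc32(compressed) != chunk_crc:
            raise ValueError("corrupted chunk bytes")
        chunk = zlib.decompress(compressed)
        if len(chunk) != uncompressed_len:
            raise ValueError("unexpected uncompressed length")
        parts.append(chunk)
    return b"".join(parts)
```

Checking the lengths checksum before slicing out the compressed bytes means a corrupted length field is rejected before it can misdirect the reader into the wrong part of the buffer.
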
> The current protocol has a 256MB (max) frame body -- where the serialized
> contents are simply written into the frame body.
> If the client sends a compression option in the startup, we will install a
> FrameCompressor inline. Unfortunately, we went with a decision to treat the
> frame body separately from the header bits, etc., in a given message. So,
> instead we put a compressor implementation in the options and then if it's
> not null, we push the serialized bytes for the frame body *only* through the
> given FrameCompressor implementation. The existing implementations simply
> provide all the bytes for the frame body in one go to the compressor
> implementation and then serialize it with the length of the compressed bytes
> up front.
> Unfortunately, this won't work for checksum'ing for obvious reasons as we
> can't naively just checksum the entire (potentially) 256MB frame body and
> slap it at the end... so,
> The best place to start with the changes is in {{ChecksumedCompressor}}. I
> implemented one single place to perform the checksumming and (to support
> checksumming) the actual required chunking logic. Implementations of
> ChecksumedCompressor only implement the actual calls to the given compression
> algorithm for the provided bytes.
> Although the interface takes a {{Checksum}}, the attached patch currently
> uses CRC32 everywhere. Given that JDK8+ has support for doing the
> calculation with the Intel instruction set, CRC32 is about as fast as we can
> get right now.
> I went with a 32kb "default" for the chunk size -- meaning we will chunk the
> entire frame body into 32kb chunks, compress each one of those chunks, and
> checksum the chunk. After discussing with a bunch of people and researching
> how checksums actually work and how much data they will protect etc -- if we
> use 32kb chunks with CRC32 we can catch up to 32 bits flipped in a row (but
> more importantly catch the more likely corruption where a single bit is
> flipped) with pretty high certainty. 64kb seems to introduce too much of a
> probability of missing corruption.
> The maximum block size LZ4 operates on is a 64kb chunk -- so this combined
> with the need to make sure the CRC32 checksums are actually going to catch
> corruption -- chunking at 32kb seemed like a reasonable value to use when
> weighing both checksums and compression (to ensure we don't kill our
> compression ratio etc).
> I'm not including client changes here -- I asked around and I'm not really
> sure what the policy there is -- do we update the python driver? java driver?
> how has the timing of this stuff been handled in the past?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org