> > setting performance requirements in this regard is nonsense. As long as it's reasonably usable in the real world, and Cassandra makes the estimated effects on performance available, it will be up to the operators to decide whether to turn on the feature
I think Joey's argument, and correct me if I'm wrong, is that implementing a complex feature in Cassandra that we then have to manage, and that is essentially worse in every way than full-disk encryption via LUKS+LVM etc., is a poor use of our time and energy. That is, we'd be better off investing our time in documenting how to do full-disk encryption in a variety of scenarios and explaining why that is our recommended approach, instead of taking the time and energy to design, implement, debug, and then maintain an inferior solution.

On Fri, Nov 19, 2021 at 7:49 AM Joshua McKenzie <jmcken...@apache.org> wrote:

> Are you for real here?
>
> Please keep things cordial. Statements like this don't help move the conversation along.
>
> On Fri, Nov 19, 2021 at 3:57 AM Stefan Miklosovic <stefan.mikloso...@instaclustr.com> wrote:
>
>> On Fri, 19 Nov 2021 at 02:51, Joseph Lynch <joe.e.ly...@gmail.com> wrote:
>> >
>> > On Thu, Nov 18, 2021 at 7:23 PM Kokoori, Shylaja <shylaja.koko...@intel.com> wrote:
>> >
>> > > To address Joey's concern, the OpenJDK JVM and its derivatives optimize Java crypto based on the underlying HW capabilities. For example, if the underlying HW supports AES-NI, JVM intrinsics will use those for crypto operations. Likewise, the new vector AES available on the latest Intel platform is utilized by the JVM while running on that platform to make crypto operations faster.
>> > >
>> >
>> > Which JDK version were you running? We have had a number of issues with the JVM being 2-10x slower than native crypto on Java 8 (especially MD5, SHA1, and AES-GCM) and to a lesser extent Java 11 (usually ~2x slower). Again, I think we could get the JVM crypto penalty down to ~2x native if we linked in e.g. ACCP by default [1, 2], but even the very best Java crypto I've seen (fully utilizing hardware instructions) is still ~2x slower than native code.
>> > The operating system has a number of advantages here in that it doesn't pay JVM allocation costs or the JNI barrier (in the case of ACCP), and the kernel also takes advantage of hardware instructions.
>> >
>> > > From our internal experiments, we see single digit % regression when transparent data encryption is enabled.
>> > >
>> >
>> > Which workloads are you testing and how are you measuring the regression? I suspect that compaction, repair (validation compaction), streaming, and quorum reads are probably much slower (probably ~10x slower for the throughput-bound operations and ~2x slower on the read path). As compaction/repair/streaming usually take up between 10-20% of available CPU cycles, making them 2x slower might show up as a <10% overall utilization increase when you've really regressed 100% or more on key metrics (compaction throughput, streaming throughput, memory allocation rate, etc.). For example, if compaction was able to achieve 2 MiBps of throughput before encryption and only 1 MiBps afterwards, that would be a huge real-world impact to operators, as compactions now take twice as long.
>> >
>> > I think a CEP or details on the ticket that indicate the performance tests and workloads that will be run might be wise? Perhaps something like "encryption creates no more than a 1% regression of: compaction throughput (MiBps), streaming throughput (MiBps), repair validation throughput (duration of full repair on the entire cluster), read throughput at 10ms p99 tail at quorum consistency (QPS handled while not exceeding P99 SLO of 10ms), etc. ... while a sustained load is applied to a multi-node cluster"?
>>
>> Are you for real here? Nobody will ever guarantee you these 1% numbers ... come on.
>> I think we are super paranoid about performance when we are not paranoid enough about security. This is a two-way street. People are willing to give up on performance if security is a must. You do not need to use it if you do not want to; it is not like we are going to turn it on and you have to stick with that. Are you just saying that we are going to protect people from using some security features because their db might be slow? What if they just don't care?
>>
>> > Even a microbenchmark that just sees how long it takes to encrypt and decrypt a 500MiB dataset using the proposed JVM implementation versus encrypting it with a native implementation might be enough to confirm/deny. For example, keypipe (C, [3]) achieves around 2.8 GiBps symmetric for AES-GCM and age (golang, ChaCha20-Poly1305, [4]) achieves about 1.6 GiBps encryption and 1.0 GiBps decryption; in my past experience with Java crypto, it would achieve maybe 200 MiBps of _non-authenticated_ AES.
>> >
>> > Cheers,
>> > -Joey
>> >
>> > [1] https://issues.apache.org/jira/browse/CASSANDRA-15294
>> > [2] https://github.com/corretto/amazon-corretto-crypto-provider
>> > [3] https://github.com/hashbrowncipher/keypipe#encryption
>> > [4] https://github.com/FiloSottile/age
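
The microbenchmark Joey describes in the quoted message could be sketched roughly as follows. This is only a minimal sketch and not the proposed Cassandra implementation: the class name `AesGcmMicrobench`, the 64 KiB chunk size, the 128-bit key, and the counter-based nonce are all assumptions made for illustration, and it uses only the stock `javax.crypto` provider. It encrypts and then decrypts a buffer chunk by chunk with AES-GCM and reports rough throughput in MiBps. A serious measurement would add JIT warm-up and a harness such as JMH, and would be compared against a native tool on the same hardware.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Random;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical microbenchmark; names and parameters are illustrative assumptions.
class AesGcmMicrobench {
    static final int IV_BYTES = 12;   // 96-bit nonce, the common choice for GCM
    static final int TAG_BITS = 128;  // authentication tag length

    // Unique nonce per chunk, derived from the chunk index (assumption:
    // one key per run, so a counter never repeats under that key).
    static GCMParameterSpec specFor(int chunkIndex) {
        byte[] iv = new byte[IV_BYTES];
        ByteBuffer.wrap(iv).putLong(4, chunkIndex);
        return new GCMParameterSpec(TAG_BITS, iv);
    }

    /** Encrypts then decrypts data in chunks; returns {encryptMiBps, decryptMiBps}. */
    static double[] run(byte[] data, int chunkSize) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");

        int chunks = (data.length + chunkSize - 1) / chunkSize;
        byte[][] sealed = new byte[chunks][];

        long t0 = System.nanoTime();
        for (int i = 0; i < chunks; i++) {
            int off = i * chunkSize;
            int len = Math.min(chunkSize, data.length - off);
            cipher.init(Cipher.ENCRYPT_MODE, key, specFor(i));
            sealed[i] = cipher.doFinal(data, off, len); // ciphertext plus tag
        }
        long t1 = System.nanoTime();

        byte[] plain = new byte[data.length];
        for (int i = 0; i < chunks; i++) {
            cipher.init(Cipher.DECRYPT_MODE, key, specFor(i));
            byte[] p = cipher.doFinal(sealed[i]);       // verifies the tag
            System.arraycopy(p, 0, plain, i * chunkSize, p.length);
        }
        long t2 = System.nanoTime();

        if (!Arrays.equals(data, plain)) {
            throw new IllegalStateException("round trip mismatch");
        }
        double mib = data.length / (1024.0 * 1024.0);
        return new double[] { mib / ((t1 - t0) / 1e9), mib / ((t2 - t1) / 1e9) };
    }

    public static void main(String[] args) throws Exception {
        // 64 MiB keeps heap use modest; raise toward 500 MiB with a larger -Xmx.
        byte[] data = new byte[64 * 1024 * 1024];
        new Random(1).nextBytes(data);
        double[] r = run(data, 64 * 1024);
        System.out.printf("encrypt: %.1f MiBps, decrypt: %.1f MiBps%n", r[0], r[1]);
    }
}
```

Numbers from a single cold run like this are only indicative. The per-chunk `doFinal` allocations are left in deliberately, since JVM allocation cost is part of what the thread is arguing about.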