Stefan Miklosovic created CASSANDRA-21194:
---------------------------------------------
Summary: Sampling data for dictionary training on more than
Integer.MAX_VALUE bytes is pointless
Key: CASSANDRA-21194
URL: https://issues.apache.org/jira/browse/CASSANDRA-21194
Project: Apache Cassandra
Issue Type: Improvement
Components: Feature/Compression
Reporter: Stefan Miklosovic
ZstdDictTrainer, from the zstd-jni library we use, allocates its buffer for
training samples with ByteBuffer.allocateDirect(size), where {{size}} is an int.
Integer.MAX_VALUE is 2^31 - 1, so the buffer can hold just under 2 GiB. If a
user wants to sample more than that, e.g. 3 GiB, sampling simply stops at about
2 GiB and, judging by the training output, the training looks stuck. We should
validate this value before training and reject anything larger than
Integer.MAX_VALUE bytes.
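
A minimal sketch of what such a check could look like, assuming the requested
sampling size reaches us as a long number of bytes. The class name
SampleSizeValidator and the method validateMaxSamplingBytes are hypothetical
and exist only for illustration; they are not part of Cassandra or zstd-jni.

```java
// Hypothetical helper, for illustration only; not existing Cassandra or zstd-jni code.
final class SampleSizeValidator
{
    private SampleSizeValidator() {}

    /**
     * Rejects sampling sizes that cannot be represented as an int,
     * since a direct ByteBuffer capacity is limited to Integer.MAX_VALUE bytes.
     *
     * @param requestedBytes sampling size requested by the user, in bytes
     * @return the same size narrowed to int, safe to use as a buffer capacity
     */
    static int validateMaxSamplingBytes(long requestedBytes)
    {
        if (requestedBytes <= 0)
            throw new IllegalArgumentException("Sampling size must be positive, got " + requestedBytes + " bytes");

        if (requestedBytes > Integer.MAX_VALUE)
            throw new IllegalArgumentException("Sampling size of " + requestedBytes + " bytes exceeds the maximum of "
                                               + Integer.MAX_VALUE + " bytes (just under 2 GiB)");

        return (int) requestedBytes;
    }
}
```

With a check like this, a 3 GiB request would fail fast with a clear message
instead of silently stopping at the buffer limit.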
--
This message was sent by Atlassian Jira
(v8.20.10#820010)