yifan-c commented on code in PR #4667:
URL: https://github.com/apache/cassandra/pull/4667#discussion_r2969093742
##########
src/java/org/apache/cassandra/db/compression/CompressionDictionaryScheduler.java:
##########
@@ -83,31 +88,59 @@ public void scheduleRefreshTask()
}
@Override
- public void scheduleSSTableBasedTraining(ICompressionDictionaryTrainer
trainer,
- ColumnFamilyStore.RefViewFragment
refViewFragment,
+ public void scheduleSSTableBasedTraining(ColumnFamilyStore.RefViewFragment
refViewFragment,
+ CompressionParams
compressionParams,
CompressionDictionaryTrainingConfig config,
+ Consumer<CompressionDictionary>
listener,
boolean force)
{
- if (!manualTrainingInProgress.compareAndSet(false, true))
+ if (!trainingInProgress.compareAndSet(false, true))
{
refViewFragment.close();
throw new IllegalStateException("Training already in progress for
table " + keyspaceName + '.' + tableName);
}
- logger.info("Starting SSTable-based dictionary training for {}.{} from
{} SSTables",
- keyspaceName, tableName, refViewFragment.sstables.size());
+ ICompressionDictionaryTrainer trainer;
- // Run the SSTableSamplingTask asynchronously
- SSTableSamplingTask task = new SSTableSamplingTask(refViewFragment,
trainer, config, force);
- ScheduledExecutors.nonPeriodicTasks.submit(task);
+ try
+ {
+ trainer = ICompressionDictionaryTrainer.create(keyspaceName,
tableName, compressionParams);
+ trainer.setDictionaryTrainedListener(listener);
+ }
+ catch (Throwable t)
+ {
+ trainingInProgress.set(false);
+ refViewFragment.close();
+ throw t;
+ }
+
+ if (trainer.start(config))
+ {
+ activeTrainer = trainer;
+ lastTrainingState.set(trainer.getTrainingState());
+ logger.info("Starting SSTable-based dictionary training for {}.{}
from {} SSTables",
+ keyspaceName, tableName,
refViewFragment.sstables.size());
+
+ SSTableSamplingTask task = new
SSTableSamplingTask(refViewFragment, trainer, config, force);
+ // trainer is eventually closed here, as well as indicating
+ // in manualTrainingInProgress that it was finished
+ ScheduledExecutors.nonPeriodicTasks.submit(task);
+ }
+ else
+ {
+ finishTraining(trainer.getTrainingState());
+ cleanup(refViewFragment, trainer);
+ }
}
/**
* Cancels the in-progress manual training task.
*/
- private void cancelManualTraining()
+ private void finishTraining(TrainingState trainingState)
{
- manualTrainingInProgress.compareAndSet(true, false);
+ activeTrainer = null;
+ lastTrainingState.set(trainingState);
Review Comment:
I think `lastTrainingState.set(trainingState)` should be called before
`activeTrainer = null`, so that `getLastTrainingState()` method always return
the latest state.
Probably add a comment to note the expected sequence.
##########
src/java/org/apache/cassandra/db/compression/CompressionDictionaryTrainingConfig.java:
##########
@@ -67,12 +85,121 @@ public Builder chunkSize(int chunkSize)
return this;
}
+ public Builder minTrainingFrequency(int minTrainingFrequency)
+ {
+ this.minTrainingFrequency = minTrainingFrequency;
+ return this;
+ }
+
public CompressionDictionaryTrainingConfig build()
{
Preconditions.checkArgument(maxDictionarySize > 0,
"maxDictionarySize must be positive");
Preconditions.checkArgument(maxTotalSampleSize > 0,
"maxTotalSampleSize must be positive");
Preconditions.checkArgument(chunkSize > 0, "chunkSize must be
positive");
+ Preconditions.checkArgument(minTrainingFrequency >= 0, "min
training frequency must be positive");
Review Comment:
The message says "must be positive" but the condition is `>= 0`. Since 0 is
a valid value, let's update the message to use "non-negative"
##########
src/java/org/apache/cassandra/db/compression/ZstdDictionaryTrainer.java:
##########
@@ -298,8 +295,14 @@ public boolean start(CompressionDictionaryTrainingConfig
trainingConfig)
}
catch (Exception e)
{
- logger.warn("Failed to create ZstdDictTrainer for {}.{}",
keyspaceName, tableName, e);
- failureMessage = "Failed to create ZstdDictTrainer: " +
e.getMessage();
+ String message = String.format("Failed to create %s for %s.%s,
reason: %s",
+ ZstdDictTrainer.class,
Review Comment:
how about just print the simple name `ZstdDictTrainer.class.getSimpleName()`?
I think `ZstdDictTrainer.class` prints `"class
com.github.luben.zstd.ZstdDictTrainer"`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]