rzo1 commented on code in PR #758:
URL: https://github.com/apache/opennlp/pull/758#discussion_r2009706730
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java:
##########
@@ -0,0 +1,51 @@
+package opennlp.tools.monitoring;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Objects;
+
+import static
opennlp.tools.monitoring.StopCriteria.TRAINING_FINISHED_DEFAULT_MSG;
+
+/**
+ * Publishes ML model's Training progress to console.
+ */
+public class ConsoleTrainingProgressMonitor implements TrainingProgressMonitor
{
+
+ private static final Logger logger =
LoggerFactory.getLogger(ConsoleTrainingProgressMonitor.class);
+
+ /**
+ * Keeps a track whether training was already finished because StopCriteria
was met.
+ */
+ volatile boolean isTrainingFinished;
Review Comment:
Can this be `private`?
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/PrevNIterationAccuracyLessThanTolerance.java:
##########
@@ -0,0 +1,30 @@
+package opennlp.tools.monitoring;
Review Comment:
License header
##########
opennlp-tools/src/main/java/opennlp/tools/ml/perceptron/PerceptronTrainer.java:
##########
@@ -257,15 +260,17 @@ public AbstractModel trainModel(int iterations,
DataIndexer di, int cutoff, bool
logger.info("Computing model parameters...");
- MutableContext[] finalParameters = findParameters(iterations, useAverage);
+
+ //@TODO : Ideally an instance of TrainingProgressMonitor will be
introduced via Trainer Factory.
+ MutableContext[] finalParameters = findParameters(iterations, useAverage,
new ConsoleTrainingProgressMonitor());
Review Comment:
I agree, that a TPM should be created via the factory. Could also be a part
of the training configuration (which TPM impl to use and which stop criteria to
apply)
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/StopCriteria.java:
##########
@@ -0,0 +1,10 @@
+package opennlp.tools.monitoring;
+
+/** Used to identify the stop criteria for {@link
opennlp.tools.ml.model.AbstractModel} training.*/
+public interface StopCriteria {
+
+ String TRAINING_FINISHED_DEFAULT_MSG = "Training Finished after completing
%s Iterations successfully.";
Review Comment:
`finished`
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java:
##########
@@ -0,0 +1,51 @@
+package opennlp.tools.monitoring;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Objects;
+
+import static
opennlp.tools.monitoring.StopCriteria.TRAINING_FINISHED_DEFAULT_MSG;
+
+/**
+ * Publishes ML model's Training progress to console.
+ */
+public class ConsoleTrainingProgressMonitor implements TrainingProgressMonitor
{
+
+ private static final Logger logger =
LoggerFactory.getLogger(ConsoleTrainingProgressMonitor.class);
+
+ /**
+ * Keeps a track whether training was already finished because StopCriteria
was met.
+ */
+ volatile boolean isTrainingFinished;
+
+ private final List<String> progress = new LinkedList<>();
Review Comment:
We might want to add an API possibility to clear this list? Otherwise, it
might grow and grow really large?
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java:
##########
@@ -0,0 +1,51 @@
+package opennlp.tools.monitoring;
Review Comment:
Missing license header.
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/PrevNIterationAccuracyLessThanTolerance.java:
##########
@@ -0,0 +1,30 @@
+package opennlp.tools.monitoring;
+
+/** Identifies whether the difference between the training accuracy for
current iteration
+ * and the training accuracy of previous
+ * n-1, n-2.. n-ith iterations is less than the defined Tolerance.
+ *
+ */
+public class PrevNIterationAccuracyLessThanTolerance implements StopCriteria {
+
+ public static String ACCURACY_DIFF_UNDER_TOLERANCE = "Stopping: change in
training set accuracy less than {%s}";
+
+ private final double tolerance;
+
+ /** TODO: 24-03-2025 : i) Should a generic test method be exposed at
interface level?
+ * ii) iterationDeltaAccuracy could be a list?*/
+ public boolean test (double ... iterationDeltaAccuracy) {
Review Comment:
This would also need some more javadoc to clarify intend and possible
arguments.
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/TrainingProgressMonitor.java:
##########
@@ -0,0 +1,14 @@
+package opennlp.tools.monitoring;
Review Comment:
License header
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/StopCriteria.java:
##########
@@ -0,0 +1,10 @@
+package opennlp.tools.monitoring;
Review Comment:
License header
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/ConsoleTrainingProgressMonitor.java:
##########
@@ -0,0 +1,51 @@
+package opennlp.tools.monitoring;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Objects;
+
+import static
opennlp.tools.monitoring.StopCriteria.TRAINING_FINISHED_DEFAULT_MSG;
+
+/**
+ * Publishes ML model's Training progress to console.
+ */
+public class ConsoleTrainingProgressMonitor implements TrainingProgressMonitor
{
Review Comment:
It depends on the logger configuration, so we might need to come up with a
better name for it.
##########
opennlp-tools/src/main/java/opennlp/tools/monitoring/PrevNIterationAccuracyLessThanTolerance.java:
##########
@@ -0,0 +1,30 @@
+package opennlp.tools.monitoring;
+
+/** Identifies whether the difference between the training accuracy for
current iteration
+ * and the training accuracy of previous
+ * n-1, n-2.. n-ith iterations is less than the defined Tolerance.
+ *
+ */
+public class PrevNIterationAccuracyLessThanTolerance implements StopCriteria {
+
+ public static String ACCURACY_DIFF_UNDER_TOLERANCE = "Stopping: change in
training set accuracy less than {%s}";
+
+ private final double tolerance;
+
+ /** TODO: 24-03-2025 : i) Should a generic test method be exposed at
interface level?
Review Comment:
I think, that we should have a generic method at interface level and impl.
can override.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]