Hi,
The model learns a correlation between the label and the features.
In the Random Forest Classification example, the labeled feature is the
class that a wine belongs to, and the model predicts that class from the
remaining features.
see:
The labeled feature is defined here:
    Vectorizer vectorizer =
        new DummyVectorizer()
            .labeled(Vectorizer.LabelCoordinate.FIRST);

    ModelsComposition randomForestMdl =
        classifier.fit(ignite, dataCache, vectorizer);
After the model has learned the associations between the features and the
class labels, it is tested here:
    double groundTruth = val.get(0);
    double prediction = randomForestMdl.predict(inputs);

    totalAmount++;
    if (!Precision.equals(groundTruth, prediction, Precision.EPSILON))
        amountOfErrors++;
If you put breakpoints on these lines, groundTruth will be one of the 3
available classes, and the model's prediction will try to match that class
based on the available inputs.
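
To make the test loop concrete without a running cluster, here is a minimal
plain-Java sketch of the same idea. It is not Ignite code: the class name
ErrorRateSketch, the made-up rows, and the threshold "model" are all
hypothetical stand-ins. It only assumes what the snippets above show, namely
that the label sits in the first coordinate and the rest of the row is the
input.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

final class ErrorRateSketch {
    // Tolerance for comparing class labels stored as doubles (illustrative value).
    static final double EPS = 1e-9;

    /** Counts rows whose predicted class differs from the label stored at index 0. */
    static int countErrors(double[][] rows, ToDoubleFunction<double[]> model) {
        int amountOfErrors = 0;
        for (double[] row : rows) {
            double groundTruth = row[0];                               // label comes first
            double[] inputs = Arrays.copyOfRange(row, 1, row.length);  // remaining features
            double prediction = model.applyAsDouble(inputs);
            if (Math.abs(groundTruth - prediction) > EPS)
                amountOfErrors++;
        }
        return amountOfErrors;
    }

    public static void main(String[] args) {
        // Made-up rows in the form {class, feature1, feature2}.
        double[][] rows = {
            {1.0, 14.2, 1.7},
            {2.0, 12.4, 2.5},
            {3.0, 13.1, 4.1},
            {2.0, 13.9, 1.9}
        };
        // Hypothetical stand-in for a trained model: thresholds on the second feature.
        ToDoubleFunction<double[]> model =
            in -> in[1] < 2.0 ? 1.0 : (in[1] < 3.0 ? 2.0 : 3.0);

        int errors = countErrors(rows, model);
        double accuracy = 1.0 - (double) errors / rows.length;
        System.out.println("errors = " + errors + ", accuracy = " + accuracy);
    }
}
```

With these rows the stand-in model misclassifies only the last one, so the
error count is 1 out of 4.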
see: https://apacheignite.readme.io/docs/random-forest
In that document you will find more references on working with random forest
models.
If you are new to ML, simple Linear Regression might be the most accessible
model to learn.
https://apacheignite.readme.io/docs/ols-multiple-linear-regression
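
If it helps to see what linear regression does before using the Ignite
trainer, here is a small plain-Java sketch of least squares with a single
feature. It is not the Ignite API and it is simpler than the multiple
regression in the linked document; the class name SimpleOlsSketch and the
sample data are made up for illustration.

```java
final class SimpleOlsSketch {
    /** Returns {intercept, slope} minimising squared error for y ~ intercept + slope * x. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;

        // Means of the feature and of the label.
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sum of cross-deviations and sum of squared deviations of x.
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }

        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    public static void main(String[] args) {
        // Made-up data lying exactly on y = 1 + 2x.
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};

        double[] model = fit(x, y);
        System.out.println("intercept = " + model[0] + ", slope = " + model[1]);
    }
}
```

Here the label is a number rather than a class, so the fitted model is
checked by how close its predictions are, not by counting mismatches.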
Is there a way to parallelize the training across available cores while
still limiting
the operation to a single JVM process?
Apache Ignite machine learning was designed from the ground up to train a
model quickly by spreading the load across all nodes of a cluster.
see: https://apacheignite.readme.io/docs/ml-partition-based-dataset
If you want to limit training to a single JVM process then create a cluster
of one node.
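
As a rough illustration of the partition-based idea inside one JVM, here is
a plain-Java sketch in which each partition computes a partial result
independently and the partials are then merged. This is a conceptual sketch
only, not Ignite's API; the class name PartitionedMeanSketch and the data
are hypothetical, and how many threads Ignite itself uses on a single node
depends on its configuration.

```java
import java.util.Arrays;
import java.util.List;

final class PartitionedMeanSketch {
    /** Partial result for one partition: sum and count of the values it holds. */
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }

        /** Combines two partial results; associative, so merge order does not matter. */
        Partial merge(Partial other) {
            return new Partial(sum + other.sum, count + other.count);
        }
    }

    /** Computes the overall mean by processing partitions in parallel, then reducing. */
    static double mean(List<double[]> partitions) {
        Partial total = partitions.parallelStream()
            .map(p -> new Partial(Arrays.stream(p).sum(), p.length)) // per-partition work
            .reduce(new Partial(0, 0), Partial::merge);              // combine partials
        return total.sum / total.count;
    }

    public static void main(String[] args) {
        // Made-up data split into three partitions of unequal size.
        List<double[]> partitions = Arrays.asList(
            new double[] {1, 2, 3},
            new double[] {4, 5},
            new double[] {6});

        System.out.println("mean = " + mean(partitions));
    }
}
```

The parallel stream runs on the JVM's common thread pool, so the
per-partition work is spread across cores while everything stays in one
process.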
Take a look at the examples here for pointers on feature selection:
https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml/selection
https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml/tutorial/hyperparametertuning
Thanks, Alex
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/