[ https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maxim Muzafarov updated IGNITE-12396:
-------------------------------------
    Fix Version/s:     (was: 2.10)

> [ML] Random Forest generates NaN for a part of models on small datasets
> -----------------------------------------------------------------------
>
>                 Key: IGNITE-12396
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12396
>             Project: Ignite
>          Issue Type: Bug
>          Components: ml
>            Reporter: Alexey Zinoviev
>            Assignee: Alexey Zinoviev
>            Priority: Major
>
> @Override public Double predict(Vector features) {
>     double[] predictions = new double[models.size()];
>     for (int i = 0; i < models.size(); i++)
>         predictions[i] = models.get(i).predict(features);
>     return predictionsAggregator.apply(predictions);
> }
>
> predictionsAggregator receives the predictions of many models, and some of
> those models return null, which then gets aggregated together with the
> valid predictions. First of all, handle this in the Aggregator (using a
> threshold for the number of broken models before aggregation). Also,
> RandomForest trees should not silently return Double.NaN: training should
> fail or report an error once it finishes.
>
> I've tested with 100 and 1000 rows, where it fails; it does not fail on
> 10 000 rows.
>
> RF generates a few models consisting of a single LEAF node with an empty
> val (Double.NaN by default).
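
The threshold check proposed in the description could look roughly like the sketch below. It is only an illustration under stated assumptions, not the fix applied in Ignite: the class name NanTolerantMeanAggregator and the parameter maxBrokenFraction are hypothetical, the aggregator is modelled as a plain java.util.function.Function<double[], Double> rather than an Ignite ML type, and a mean is assumed as the aggregation.

```java
import java.util.function.Function;

/**
 * Hypothetical aggregator (not part of Apache Ignite ML): averages the
 * predictions of an ensemble, skipping NaN outputs from broken models,
 * and fails if too many models are broken.
 */
class NanTolerantMeanAggregator implements Function<double[], Double> {
    /** Assumed parameter: largest tolerated share of NaN predictions, in [0, 1]. */
    private final double maxBrokenFraction;

    NanTolerantMeanAggregator(double maxBrokenFraction) {
        if (maxBrokenFraction < 0.0 || maxBrokenFraction > 1.0)
            throw new IllegalArgumentException("maxBrokenFraction must be in [0, 1]");

        this.maxBrokenFraction = maxBrokenFraction;
    }

    @Override public Double apply(double[] predictions) {
        double sum = 0.0;
        int valid = 0;

        // Count and sum only the predictions that are real numbers.
        for (double p : predictions) {
            if (!Double.isNaN(p)) {
                sum += p;
                valid++;
            }
        }

        int broken = predictions.length - valid;

        // Fail loudly instead of returning NaN when the ensemble is too damaged.
        if (valid == 0 || broken > maxBrokenFraction * predictions.length)
            throw new IllegalStateException(
                broken + " of " + predictions.length + " models returned NaN");

        return sum / valid;
    }
}
```

With maxBrokenFraction set to 0.0 the sketch fails on the first NaN, which matches the "fail after training" option from the description; a larger value tolerates the few single-leaf trees reported for small datasets.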