[ https://issues.apache.org/jira/browse/SPARK-17987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-17987.
-------------------------------
    Resolution: Not A Problem

> ML Evaluator fails to handle null values in the dataset
> -------------------------------------------------------
>
> Key: SPARK-17987
> URL: https://issues.apache.org/jira/browse/SPARK-17987
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 1.6.2, 2.0.1
> Reporter: bo song
>
> Take the RegressionEvaluator as an example: when the predictionCol is null in
> a row, a "scala.MatchError" exception is thrown. A null prediction is a
> common case. For example, when a predictor is missing or its value is out of
> bounds, almost all machine learning models cannot produce a correct
> prediction, so a null prediction is returned. Evaluators should handle null
> values instead of throwing an exception; the common way to handle missing
> values is to ignore them. Besides null values, NaN values need to be handled
> correctly too.
> The three evaluators RegressionEvaluator, BinaryClassificationEvaluator and
> MulticlassClassificationEvaluator all have the same problem.
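
The "ignore them" behavior the reporter asks for can be illustrated with a minimal sketch in plain Python. This is not Spark code and not how RegressionEvaluator is implemented; the function name `rmse_ignoring_missing` and the (prediction, label) tuple layout are hypothetical, chosen only to show rows with a null or NaN value being skipped before the metric is computed.

```python
import math


def rmse_ignoring_missing(rows):
    """Root-mean-square error over (prediction, label) pairs.

    Rows whose prediction or label is None or NaN are skipped
    (hypothetical helper, for illustration only).
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    # Keep only rows where both values are present and are real numbers.
    kept = [(p, y) for p, y in rows if not is_missing(p) and not is_missing(y)]
    if not kept:
        # Nothing left to evaluate: report NaN rather than raising.
        return float("nan")

    mse = sum((p - y) ** 2 for p, y in kept) / len(kept)
    return math.sqrt(mse)


# Two of the four rows carry a missing prediction and are ignored.
rows = [(3.0, 1.0), (None, 2.0), (float("nan"), 3.0), (5.0, 5.0)]
print(rmse_ignoring_missing(rows))
```

Silently dropping rows changes the number of rows the metric is computed over, so a caller would probably also want to know how many rows were skipped.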