[ 
https://issues.apache.org/jira/browse/SPARK-17987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-17987.
-------------------------------
    Resolution: Not A Problem

> ML Evaluator fails to handle null values in the dataset
> -------------------------------------------------------
>
>                 Key: SPARK-17987
>                 URL: https://issues.apache.org/jira/browse/SPARK-17987
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 1.6.2, 2.0.1
>            Reporter: bo song
>
> Take the RegressionEvaluator as an example, when the predictionCol is null in 
> a row, en exception "scala.MatchEror" will be thrown. The missing null 
> prediction is a common case, for example when an predictor is missing, or its 
> value is out of bound, almost machine learning models could not produce 
> correct predictions, then null predictions would be returned. Evaluators 
> should handle the null values instead of an exception thrown, the common way 
> to handle missing null values is to ignore them. Besides of the null value, 
> the NAN value need to be handled correctly too. 
> Those three evaluators RegressionEvaluator, BinaryClassificationEvaluator and 
> MulticlassClassificationEvaluator have the same problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to