[
https://issues.apache.org/jira/browse/SPARK-11343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dominik Dahlem updated SPARK-11343:
-----------------------------------
Affects Version/s: 1.5.1
Environment: all environments
Description:
Using pyspark.ml and DataFrames, the ALS recommender cannot be evaluated with
the RegressionEvaluator because of a type mismatch between the model
transformation and the evaluation APIs: ALS emits a FloatType prediction
column, while RegressionEvaluator requires DoubleType. One can work around
this by casting the prediction column to double before passing it to the
evaluator; however, this does not work with pipelines and cross-validation.
Code and traceback below:
{code}
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS

als = ALS(rank=10, maxIter=30, regParam=0.1, userCol='userID',
          itemCol='movieID', ratingCol='rating')
model = als.fit(training)
predictions = model.transform(validation)
evaluator = RegressionEvaluator(predictionCol='prediction', labelCol='rating')
validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
{code}
Traceback:
validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
  File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 63, in evaluate
  File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 94, in _evaluate
  File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/Users/dominikdahlem/projects/repositories/spark/python/pyspark/sql/utils.py", line 42, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1])
pyspark.sql.utils.IllegalArgumentException: requirement failed: Column prediction must be of type DoubleType but was actually FloatType.
Component/s: ML
Summary: Regression Imposes doubles on prediction/label columns
(was: Regression Imposes doubles on prediciton)
> Regression Imposes doubles on prediction/label columns
> ------------------------------------------------------
>
> Key: SPARK-11343
> URL: https://issues.apache.org/jira/browse/SPARK-11343
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 1.5.1
> Environment: all environments
> Reporter: Dominik Dahlem
>
> Using pyspark.ml and DataFrames, the ALS recommender cannot be evaluated
> with the RegressionEvaluator because of a type mismatch between the model
> transformation and the evaluation APIs: ALS emits a FloatType prediction
> column, while RegressionEvaluator requires DoubleType. One can work around
> this by casting the prediction column to double before passing it to the
> evaluator; however, this does not work with pipelines and cross-validation.
> Code and traceback below:
> {code}
> from pyspark.ml.evaluation import RegressionEvaluator
> from pyspark.ml.recommendation import ALS
>
> als = ALS(rank=10, maxIter=30, regParam=0.1, userCol='userID',
>           itemCol='movieID', ratingCol='rating')
> model = als.fit(training)
> predictions = model.transform(validation)
> evaluator = RegressionEvaluator(predictionCol='prediction', labelCol='rating')
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
> {code}
> Traceback:
> validationRmse = evaluator.evaluate(predictions, {evaluator.metricName: 'rmse'})
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 63, in evaluate
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/pyspark.zip/pyspark/ml/evaluation.py", line 94, in _evaluate
>   File "/Users/dominikdahlem/software/spark-1.6.0-SNAPSHOT-bin-custom-spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>   File "/Users/dominikdahlem/projects/repositories/spark/python/pyspark/sql/utils.py", line 42, in deco
>     raise IllegalArgumentException(s.split(': ', 1)[1])
> pyspark.sql.utils.IllegalArgumentException: requirement failed: Column
> prediction must be of type DoubleType but was actually FloatType.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]