Testing another Dataset after ML training

mckunkel Tue, 11 Jul 2017 04:43:15 -0700

Greetings,

Following the example on the Apache Spark page for Naive Bayes using Dataset<Row>:
https://spark.apache.org/docs/latest/ml-classification-regression.html#naive-bayes


I want to predict the outcome for a separate set of data. So instead of
splitting one dataset into training and testing subsets, I have one
training set and one testing set, i.e.:
    // Build both DataFrames from the same schema
    Dataset<Row> training = spark.createDataFrame(dataTraining, schemaForFrame);
    Dataset<Row> testing = spark.createDataFrame(dataTesting, schemaForFrame);

    // Fit the model on the training set only
    NaiveBayes nb = new NaiveBayes();
    NaiveBayesModel model = nb.fit(training);

    // Apply the fitted model to the separate testing set
    Dataset<Row> predictions = model.transform(testing);
    predictions.show();

But I get the following error:

17/07/11 13:40:38 INFO DAGScheduler: Job 2 finished: collect at NaiveBayes.scala:171, took 3.942413 s
Exception in thread "main" org.apache.spark.SparkException: Failed to execute user defined function($anonfun$1: (vector) => vector)
        at org.apache.spark.sql.catalyst.expressions.ScalaUDF.eval(ScalaUDF.scala:1075)
        at org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:144)
        at org.apache.spark.sql.catalyst.expressions.InterpretedProjection.apply(Projection.scala:48)
        at org.apache.spark.sql.catalyst.expressions.InterpretedProjection.apply(Projection.scala:30)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)

...
...
...


How do I perform predictions on datasets that were not created by a
split?
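
One thing that may be worth checking, offered only as an assumption because
the trace above is truncated: a failure inside the (vector) => vector
function at transform time can happen when the feature vectors in the
testing set do not have the same length as the ones used for training. The
plain-Java sketch below shows such a check. The names FeatureDimensionCheck,
commonDimension and sameDimension are hypothetical helpers, not part of the
Spark API, and the double[][] rows stand in for the feature values behind
dataTraining and dataTesting.

```java
final class FeatureDimensionCheck {

    // Returns the shared length of every row, or throws if the rows disagree.
    static int commonDimension(double[][] rows, String name) {
        if (rows.length == 0) {
            throw new IllegalArgumentException(name + " has no rows");
        }
        int dim = rows[0].length;
        for (int i = 1; i < rows.length; i++) {
            if (rows[i].length != dim) {
                throw new IllegalArgumentException(
                    name + " row " + i + " has " + rows[i].length
                    + " features, expected " + dim);
            }
        }
        return dim;
    }

    // True when both sets use feature vectors of the same length.
    static boolean sameDimension(double[][] training, double[][] testing) {
        return commonDimension(training, "training")
            == commonDimension(testing, "testing");
    }

    public static void main(String[] args) {
        // Hypothetical sample values: three features in training, two in testing.
        double[][] training = {{1.0, 0.0, 2.0}, {0.0, 3.0, 1.0}};
        double[][] testing = {{2.0, 1.0}};
        System.out.println("same dimension: " + sameDimension(training, testing));
    }
}
```

If the check reports a mismatch, the testing features would need to be built
with the same number of entries as the training features before calling
model.transform(testing).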



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Testing-another-Dataset-after-ML-training-tp28845.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
