[ https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422384#comment-15422384 ]
Yanbo Liang commented on SPARK-16993:
-------------------------------------

[~dulajrajitha] I cannot reproduce the issue you reported; the following code works well.

{code}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.VectorIndexer
import org.apache.spark.ml.regression.RandomForestRegressor

// `spark` is the SparkSession (predefined in spark-shell).
val data = spark.read.format("libsvm").load("/Users/yliang/data/trunk0/spark/data/mllib/sample_libsvm_data.txt")

// Treat features with at most 4 distinct values as categorical.
val featureIndexer = new VectorIndexer()
  .setInputCol("features")
  .setOutputCol("indexedFeatures")
  .setMaxCategories(4)
  .fit(data)

// Train on the labeled data; predict on the same data with the label column dropped.
val trainingData = data
val testData = data.drop("label")

val rf = new RandomForestRegressor()
  .setLabelCol("label")
  .setFeaturesCol("indexedFeatures")

val pipeline = new Pipeline()
  .setStages(Array(featureIndexer, rf))

val model = pipeline.fit(trainingData)

// transform succeeds even though testData has no "label" column.
val predictions = model.transform(testData)
predictions.select("prediction", "features").show(5)
{code}

Could you tell me whether this code snippet matches your use case? If it does, I think this is not a bug. Thanks!

> model.transform without label column in random forest regression
> ----------------------------------------------------------------
>
>                 Key: SPARK-16993
>                 URL: https://issues.apache.org/jira/browse/SPARK-16993
>             Project: Spark
>          Issue Type: Question
>          Components: Java API, ML
>            Reporter: Dulaj Rajitha
>
> I need to use a separate data set to prediction (Not as show in example's
> training data split).
> But those data do not have the label column. (Since these data are the data
> that needs to be predict the label).
> but model.transform is informing label column is missing.
> org.apache.spark.sql.AnalysisException: cannot resolve 'label' given input
> columns: [id,features,prediction]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org