[ https://issues.apache.org/jira/browse/SPARK-18476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713515#comment-15713515 ]
Miao Wang commented on SPARK-18476:
-----------------------------------

predict() on a spark.logit model should output the original label, not a numerical value, in the prediction column.

Example:

> training <- suppressWarnings(createDataFrame(iris))
> binomial_training <- training[training$Species %in% c("versicolor", "virginica"), ]
> binomial_model <- spark.logit(binomial_training, Species ~ Sepal_Length + Sepal_Width)
> prediction <- predict(binomial_model, binomial_training)
> showDF(prediction)

Output:

+------------+-----------+------------+-----------+----------+--------------------+--------------------+----------+
|Sepal_Length|Sepal_Width|Petal_Length|Petal_Width|   Species|       rawPrediction|         probability|prediction|
+------------+-----------+------------+-----------+----------+--------------------+--------------------+----------+
|         7.0|        3.2|         4.7|        1.4|versicolor|[-1.5655042626435...|[0.17285823940230...| virginica|
|         6.4|        3.2|         4.5|        1.5|versicolor|[-0.4240802660720...|[0.39554079174312...| virginica|
|         6.9|        3.1|         4.9|        1.5|versicolor|[-1.3348014339322...|[0.20836626079909...| virginica|
|         5.5|        2.3|         4.0|        1.3|versicolor|[1.65224519232947...|[0.83919426374389...|versicolor|
|         6.5|        2.8|         4.6|        1.5|versicolor|[-0.4524556150364...|[0.38877708044707...| virginica|
|         5.7|        2.8|         4.5|        1.3|versicolor|[1.06944304705877...|[0.74449098435029...|versicolor|
|         6.3|        3.3|         4.7|        1.6|versicolor|[-0.2743084292595...|[0.43184968922729...| virginica|
|         4.9|        2.4|         3.3|        1.0|versicolor|[2.75320369295153...|[0.94009402758065...|versicolor|
|         6.6|        2.9|         4.6|        1.3|versicolor|[-0.6831584437477...|[0.33555673563505...| virginica|
|         5.2|        2.7|         3.9|        1.4|versicolor|[2.06109520681768...|[0.88706393592062...|versicolor|
|         5.0|        2.0|         3.5|        1.0|versicolor|[2.72482834398713...|[0.93847590782569...|versicolor|
|         5.9|        3.0|         4.2|        1.5|versicolor|[0.60803738963620...|[0.64749297424084...|versicolor|
|         6.0|        2.2|         4.0|        1.0|versicolor|[0.74152402446931...|[0.67732902849243...|versicolor|
|         6.1|        2.9|         4.7|        1.4|versicolor|[0.26802822006176...|[0.56660877197498...|versicolor|
|         5.6|        2.9|         3.6|        1.3|versicolor|[1.21921488387130...|[0.77192535405997...|versicolor|
|         6.7|        3.1|         4.4|        1.4|versicolor|[-0.9543267684084...|[0.27801550694056...| virginica|
|         5.6|        3.0|         4.5|        1.5|versicolor|[1.17874938792192...|[0.76472286588073...|versicolor|
|         5.8|        2.7|         4.1|        1.0|versicolor|[0.91967121024624...|[0.71497510778599...|versicolor|
|         6.2|        2.2|         4.5|        1.5|versicolor|[0.36104935894550...|[0.58929443051304...|versicolor|
|         5.6|        2.5|         3.9|        1.1|versicolor|[1.38107686766881...|[0.79916389423163...|versicolor|
+------------+-----------+------------+-----------+----------+--------------------+--------------------+----------+

The `prediction` column should contain the original label, as shown above.

> SparkR Logistic Regression should support output original label.
> ----------------------------------------------------------------
>
>                 Key: SPARK-18476
>                 URL: https://issues.apache.org/jira/browse/SPARK-18476
>             Project: Spark
>          Issue Type: Bug
>          Components: SparkR
>            Reporter: Miao Wang
>            Assignee: Miao Wang
>             Fix For: 2.1.0
>
>
> Similar to [SPARK-18401], as a classification algorithm, logistic regression
> should support outputting the original label instead of the index label.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)