[ https://issues.apache.org/jira/browse/SPARK-15153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yanbo Liang updated SPARK-15153:
--------------------------------
    Description:
When the label column of the dataset is of numeric type, SparkR spark.naiveBayes throws an error during training. This bug is easy to reproduce:
{code}
t <- as.data.frame(Titanic)
t1 <- t[t$Freq > 0, -5]
t1$NumericSurvived <- ifelse(t1$Survived == "No", 0, 1)
t2 <- t1[-4]
df <- suppressWarnings(createDataFrame(sqlContext, t2))
m <- spark.naiveBayes(df, NumericSurvived ~ .)
s <- summary(m)

16/05/05 03:26:17 ERROR RBackendHandler: fit on org.apache.spark.ml.r.NaiveBayesWrapper failed
Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
  java.lang.ClassCastException: org.apache.spark.ml.attribute.UnresolvedAttribute$ cannot be cast to org.apache.spark.ml.attribute.NominalAttribute
	at org.apache.spark.ml.r.NaiveBayesWrapper$.fit(NaiveBayesWrapper.scala:66)
	at org.apache.spark.ml.r.NaiveBayesWrapper.fit(NaiveBayesWrapper.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invo
{code}

  was:When the type of label of dataset is numeric, SparkR spark.naiveBayes will throw error when extracting labels attribute.

> SparkR spark.naiveBayes error when label is numeric type
> --------------------------------------------------------
>
>                 Key: SPARK-15153
>                 URL: https://issues.apache.org/jira/browse/SPARK-15153
>             Project: Spark
>          Issue Type: Bug
>          Components: ML, SparkR
>            Reporter: Yanbo Liang
>
> When the label column of the dataset is of numeric type, SparkR spark.naiveBayes throws an error during training. This bug is easy to reproduce:
> {code}
> t <- as.data.frame(Titanic)
> t1 <- t[t$Freq > 0, -5]
> t1$NumericSurvived <- ifelse(t1$Survived == "No", 0, 1)
> t2 <- t1[-4]
> df <- suppressWarnings(createDataFrame(sqlContext, t2))
> m <- spark.naiveBayes(df, NumericSurvived ~ .)
> s <- summary(m)
>
> 16/05/05 03:26:17 ERROR RBackendHandler: fit on org.apache.spark.ml.r.NaiveBayesWrapper failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
>   java.lang.ClassCastException: org.apache.spark.ml.attribute.UnresolvedAttribute$ cannot be cast to org.apache.spark.ml.attribute.NominalAttribute
> 	at org.apache.spark.ml.r.NaiveBayesWrapper$.fit(NaiveBayesWrapper.scala:66)
> 	at org.apache.spark.ml.r.NaiveBayesWrapper.fit(NaiveBayesWrapper.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
> 	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
> 	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
> 	at io.netty.channel.AbstractChannelHandlerContext.invo
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)