[ https://issues.apache.org/jira/browse/SPARK-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Wendell resolved SPARK-5904.
------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

I think rxin just forgot to close this. It was merged several days ago.

> DataFrame methods with varargs do not work in Java
> --------------------------------------------------
>
>                 Key: SPARK-5904
>                 URL: https://issues.apache.org/jira/browse/SPARK-5904
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Java API, SQL
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Assignee: Reynold Xin
>            Priority: Blocker
>              Labels: DataFrame
>             Fix For: 1.3.0
>
>
> DataFrame methods with varargs fail when called from Java due to a bug in
> Scala.
> This can be produced by, e.g., modifying the end of the example
> ml.JavaSimpleParamsExample in the master branch:
> {code}
>     DataFrame results = model2.transform(test);
>     results.printSchema(); // works
>     results.collect(); // works
>     results.filter("label > 0.0").count(); // works
>     for (Row r: results.select("features", "label", "myProbability", "prediction").collect()) { // fails on select
>       System.out.println("(" + r.get(0) + ", " + r.get(1) + ") -> prob=" + r.get(2)
>           + ", prediction=" + r.get(3));
>     }
> {code}
> I have also tried groupBy and found that failed too.
> The error looks like this:
> {code}
> Exception in thread "main" java.lang.AbstractMethodError: org.apache.spark.sql.DataFrameImpl.groupBy(Ljava/lang/String;[Ljava/lang/String;)Lorg/apache/spark/sql/GroupedData;
> 	at org.apache.spark.examples.ml.JavaSimpleParamsExample.main(JavaSimpleParamsExample.java:108)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> The error appears to be from this Scala bug with using varargs in an abstract
> method:
> [https://issues.scala-lang.org/browse/SI-9013]
> My current plan is to move the implementations of the methods with varargs
> from DataFrameImpl to DataFrame.
> However, this may cause issues with IncomputableColumn---feedback??
> Thanks to [~joshrosen] for figuring the bug and fix out!
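
The plan quoted above (moving the varargs implementations out of DataFrameImpl and into DataFrame) can be illustrated with a rough Java analogy. This is only a sketch of the general pattern, not Spark's actual fix: the Scala compiler bug itself cannot be reproduced in plain Java, and the names SketchFrame, SketchFrameImpl, selectColumns and VarargsSketch are hypothetical stand-ins that do not exist in Spark. The idea shown is that the varargs entry point has a concrete body on the base type and delegates to a non-varargs hook, so a caller never depends on a subclass supplying the varargs signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for DataFrame (not a Spark class).
abstract class SketchFrame {
    // Concrete varargs method on the base type: callers always link
    // against a real method body here, never an abstract varargs method.
    public final SketchFrame select(String col, String... cols) {
        List<String> all = new ArrayList<>();
        all.add(col);
        all.addAll(Arrays.asList(cols));
        return selectColumns(all);
    }

    // Non-varargs hook that subclasses implement (assumed name).
    protected abstract SketchFrame selectColumns(List<String> cols);

    public abstract List<String> columns();
}

// Hypothetical stand-in for DataFrameImpl (not a Spark class).
final class SketchFrameImpl extends SketchFrame {
    private final List<String> cols;

    SketchFrameImpl(List<String> cols) {
        this.cols = new ArrayList<>(cols);
    }

    @Override
    protected SketchFrame selectColumns(List<String> requested) {
        // Reject columns that this frame does not have.
        for (String c : requested) {
            if (!cols.contains(c)) {
                throw new IllegalArgumentException("Unknown column: " + c);
            }
        }
        return new SketchFrameImpl(requested);
    }

    @Override
    public List<String> columns() {
        return new ArrayList<>(cols);
    }
}

class VarargsSketch {
    public static void main(String[] args) {
        SketchFrame results = new SketchFrameImpl(
            Arrays.asList("features", "label", "myProbability", "prediction"));
        // The varargs call resolves to the concrete method on SketchFrame.
        SketchFrame picked = results.select("features", "label");
        System.out.println(picked.columns()); // prints [features, label]
    }
}
```

Whether a layout like this interacts badly with a second implementation such as IncomputableColumn is the open question raised in the issue; the sketch does not address it.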