I don't think that's a Scala compiler bug. println is a valid expression that returns Unit.
Unit is not a single-argument function, and does not match any of the
overloads of foreachPartition. You may be used to a conversion taking
place when println is passed to a method expecting a function, but that's
not a safe thing to do silently when there are multiple overloads.

tl;dr: just use ds.foreachPartition(x => println(x)) -- you don't need
any type annotations.

On Tue, Jul 5, 2016 at 2:53 PM, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi Reynold,
>
> Is this already reported and tracked somewhere? I'm quite sure that
> people will be asking about the reasons Spark does this. Where are
> such issues usually reported?
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Tue, Jul 5, 2016 at 6:19 PM, Reynold Xin <r...@databricks.com> wrote:
>> This seems like a Scala compiler bug.
>>
>>
>> On Tuesday, July 5, 2016, Jacek Laskowski <ja...@japila.pl> wrote:
>>>
>>> Well, there is foreach for Java and another foreach for Scala. That's
>>> what I can understand. But while supporting two language-specific APIs
>>> -- Scala and Java -- the Dataset API lost support for such simple calls
>>> without type annotations, so you have to be explicit about the variant
>>> (since I'm using Scala I want to use the Scala API, right?). It appears
>>> that any single-argument-function operators in Datasets are affected :(
>>>
>>> My question was to know whether there is work to fix it (if possible
>>> -- I don't know if it is).
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>>
>>>
>>> On Tue, Jul 5, 2016 at 4:21 PM, Sean Owen <so...@cloudera.com> wrote:
>>> > Right, should have noticed that in your second mail. But foreach
>>> > already does what you want, right? It would be identical here.
>>> >
>>> > These two methods do conceptually different things on different
>>> > arguments. I don't think I'd expect them to accept the same functions.
>>> >
>>> > On Tue, Jul 5, 2016 at 3:18 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>> >> ds is a Dataset and the problem is that println (or any other
>>> >> one-argument function) would not work here (and perhaps other methods
>>> >> with two variants - Java's and Scala's).
>>> >>
>>> >> Pozdrawiam,
>>> >> Jacek Laskowski
>>> >>
>>> >>
>>> >> On Tue, Jul 5, 2016 at 3:53 PM, Sean Owen <so...@cloudera.com> wrote:
>>> >>> A DStream is a sequence of RDDs, not of elements. I don't think I'd
>>> >>> expect to express an operation on a DStream as if it were elements.
>>> >>>
>>> >>> On Tue, Jul 5, 2016 at 2:47 PM, Jacek Laskowski <ja...@japila.pl>
>>> >>> wrote:
>>> >>>> Sort of. Your example works, but could you do a mere
>>> >>>> ds.foreachPartition(println)? Why not? Why should I even see the
>>> >>>> Java version?
>>> >>>>
>>> >>>> scala> val ds = spark.range(10)
>>> >>>> ds: org.apache.spark.sql.Dataset[Long] = [id: bigint]
>>> >>>>
>>> >>>> scala> ds.foreachPartition(println)
>>> >>>> <console>:26: error: overloaded method value foreachPartition with
>>> >>>> alternatives:
>>> >>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Long])Unit
>>> >>>>   <and>
>>> >>>>   (f: Iterator[Long] => Unit)Unit
>>> >>>>  cannot be applied to (Unit)
>>> >>>>        ds.foreachPartition(println)
>>> >>>>                            ^
>>> >>>>
>>> >>>> Pozdrawiam,
>>> >>>> Jacek Laskowski
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Jul 5, 2016 at 3:32 PM, Sean Owen <so...@cloudera.com> wrote:
>>> >>>>> Do you not mean ds.foreachPartition(_.foreach(println)) or similar?
>>> >>>>>
>>> >>>>> On Tue, Jul 5, 2016 at 2:22 PM, Jacek Laskowski <ja...@japila.pl>
>>> >>>>> wrote:
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> It's with the master built today. Why can't I call
>>> >>>>>> ds.foreachPartition(println)? Is using a type annotation the only
>>> >>>>>> way to go forward? I'd be so sad if that's the case.
>>> >>>>>>
>>> >>>>>> scala> ds.foreachPartition(println)
>>> >>>>>> <console>:28: error: overloaded method value foreachPartition with
>>> >>>>>> alternatives:
>>> >>>>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Record])Unit
>>> >>>>>>   <and>
>>> >>>>>>   (f: Iterator[Record] => Unit)Unit
>>> >>>>>>  cannot be applied to (Unit)
>>> >>>>>>        ds.foreachPartition(println)
>>> >>>>>>                            ^
>>> >>>>>>
>>> >>>>>> scala> sc.version
>>> >>>>>> res9: String = 2.0.0-SNAPSHOT
>>> >>>>>>
>>> >>>>>> Pozdrawiam,
>>> >>>>>> Jacek Laskowski
>>> >>>>>>
>>> >>>>>> ---------------------------------------------------------------------
>>> >>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
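
[Editor's note] The failure discussed above comes from overload resolution alone, so it can be reproduced without Spark. The sketch below is a made-up stand-in (FakeDataset and the data are illustrative, not Spark's classes); only the two overload shapes mirror Dataset.foreachPartition:

```scala
// A Java-style functional interface, mirroring the shape of
// org.apache.spark.api.java.function.ForeachPartitionFunction.
trait ForeachPartitionFunction[T] {
  def call(it: java.util.Iterator[T]): Unit
}

// A made-up stand-in for a Dataset[Long] with two partitions.
object FakeDataset {
  private val parts = Seq(Seq(0L, 1L), Seq(2L, 3L))

  // Scala-facing overload.
  def foreachPartition(f: Iterator[Long] => Unit): Unit =
    parts.foreach(p => f(p.iterator))

  // Java-facing overload, adapting each Scala iterator on the fly.
  def foreachPartition(func: ForeachPartitionFunction[Long]): Unit =
    foreachPartition { (it: Iterator[Long]) =>
      func.call(new java.util.Iterator[Long] {
        def hasNext: Boolean = it.hasNext
        def next(): Long = it.next()
      })
    }
}

// FakeDataset.foreachPartition(println)
//   does not compile: println is applied to zero arguments, yielding a
//   Unit value, which matches neither overload -- the same error as in
//   the REPL transcripts above.

// An explicitly typed lambda selects the Scala overload unambiguously
// and prints each element (0 through 3, one per line):
FakeDataset.foreachPartition((it: Iterator[Long]) => it.foreach(println))
```

In real Spark code the same reasoning applies: ds.foreachPartition((it: Iterator[Long]) => it.foreach(println)) compiles because the lambda's parameter type picks the Iterator[Long] => Unit alternative.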