Thanks Cody, Reynold, and Ryan! Learnt a lot and feel "corrected".
Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Wed, Jul 6, 2016 at 2:46 AM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:
> I asked this question in the Scala user group two years ago:
> https://groups.google.com/forum/#!topic/scala-user/W4f0d8xK1nk
>
> Take a look if you're interested.
>
> On Tue, Jul 5, 2016 at 1:31 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>> You can file it here: https://issues.scala-lang.org/secure/Dashboard.jspa
>>
>> Perhaps "bug" is not the right word, but "limitation". println accepts a
>> single argument of type Any and returns Unit, and it appears that Scala
>> fails to infer the correct overloaded method in this case.
>>
>> def println() = Console.println()
>> def println(x: Any) = Console.println(x)
>>
>> On Tue, Jul 5, 2016 at 1:27 PM, Cody Koeninger <c...@koeninger.org> wrote:
>>>
>>> I don't think that's a Scala compiler bug.
>>>
>>> println is a valid expression that returns Unit.
>>>
>>> Unit is not a single-argument function, and does not match any of the
>>> overloads of foreachPartition.
>>>
>>> You may be used to a conversion taking place when println is passed to a
>>> method expecting a function, but that's not a safe thing to do
>>> silently for multiple overloads.
>>>
>>> tl;dr: just use
>>>
>>> ds.foreachPartition(x => println(x))
>>>
>>> You don't need any type annotations.
>>>
>>> On Tue, Jul 5, 2016 at 2:53 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>> > Hi Reynold,
>>> >
>>> > Is this already reported and tracked somewhere? I'm quite sure that
>>> > people will be asking about the reasons Spark does this. Where are
>>> > such issues usually reported?
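[Editor's note] Cody's and Reynold's explanation can be reproduced without Spark at all. The sketch below defines the same overload pair that Dataset.foreachPartition exposes; the trait name mirrors Spark's Java functional interface but is declared locally, so nothing here is the real Spark implementation:

```scala
object OverloadDemo {
  // Local stand-in for org.apache.spark.api.java.function.ForeachPartitionFunction,
  // so the sketch needs no Spark dependency.
  trait ForeachPartitionFunction[T] { def call(it: Iterator[T]): Unit }

  private val partition = Seq(1L, 2L, 3L)

  // The same overload pair Dataset.foreachPartition declares: a Scala function
  // variant and a Java functional-interface variant.
  def foreachPartition(f: Iterator[Long] => Unit): Unit = f(partition.iterator)
  def foreachPartition(f: ForeachPartitionFunction[Long]): Unit =
    f.call(partition.iterator)

  // foreachPartition(println) does not compile here either: `println` on its
  // own is a zero-argument call of type Unit, and no overload can be applied
  // to a Unit argument ("cannot be applied to (Unit)").

  // A function literal, as Cody suggests, resolves to the
  // Iterator[Long] => Unit overload without any type annotation:
  def collect(): Seq[Long] = {
    val buf = scala.collection.mutable.Buffer.empty[Long]
    foreachPartition(it => it.foreach(buf += _))
    buf.toSeq
  }

  def main(args: Array[String]): Unit =
    println(collect().mkString(", "))
}
```

Uncommenting a bare `foreachPartition(println)` call reproduces the same "cannot be applied to (Unit)" error quoted later in the thread.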
>>> >
>>> > On Tue, Jul 5, 2016 at 6:19 PM, Reynold Xin <r...@databricks.com> wrote:
>>> >> This seems like a Scala compiler bug.
>>> >>
>>> >> On Tuesday, July 5, 2016, Jacek Laskowski <ja...@japila.pl> wrote:
>>> >>>
>>> >>> Well, there is foreach for Java and another foreach for Scala. That's
>>> >>> what I can understand. But while supporting two language-specific APIs
>>> >>> -- Scala and Java -- the Dataset API lost support for such simple calls
>>> >>> without type annotations, so you have to be explicit about the variant
>>> >>> (since I'm using Scala, I want to use the Scala API, right?). It appears
>>> >>> that any single-argument-function operators on Datasets are affected :(
>>> >>>
>>> >>> My question was whether there is work under way to fix it (if that's
>>> >>> even possible -- I don't know if it is).
>>> >>>
>>> >>> On Tue, Jul 5, 2016 at 4:21 PM, Sean Owen <so...@cloudera.com> wrote:
>>> >>> > Right, should have noticed that in your second mail. But foreach
>>> >>> > already does what you want, right? It would be identical here.
>>> >>> >
>>> >>> > Though these two methods do conceptually different things on different
>>> >>> > arguments. I don't think I'd expect them to accept the same
>>> >>> > functions.
>>> >>> >
>>> >>> > On Tue, Jul 5, 2016 at 3:18 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>> >>> >> ds is a Dataset, and the problem is that println (or any other
>>> >>> >> one-argument function) would not work here (and perhaps neither
>>> >>> >> would other methods with two variants -- Java's and Scala's).
>>> >>> >>
>>> >>> >> On Tue, Jul 5, 2016 at 3:53 PM, Sean Owen <so...@cloudera.com> wrote:
>>> >>> >>> A DStream is a sequence of RDDs, not of elements. I don't think I'd
>>> >>> >>> expect to express an operation on a DStream as if it were on
>>> >>> >>> elements.
>>> >>> >>>
>>> >>> >>> On Tue, Jul 5, 2016 at 2:47 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>> >>> >>>> Sort of. Your example works, but could you do a mere
>>> >>> >>>> ds.foreachPartition(println)? Why not? Why should I even see the
>>> >>> >>>> Java version?
>>> >>> >>>>
>>> >>> >>>> scala> val ds = spark.range(10)
>>> >>> >>>> ds: org.apache.spark.sql.Dataset[Long] = [id: bigint]
>>> >>> >>>>
>>> >>> >>>> scala> ds.foreachPartition(println)
>>> >>> >>>> <console>:26: error: overloaded method value foreachPartition with alternatives:
>>> >>> >>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Long])Unit <and>
>>> >>> >>>>   (f: Iterator[Long] => Unit)Unit
>>> >>> >>>>  cannot be applied to (Unit)
>>> >>> >>>>        ds.foreachPartition(println)
>>> >>> >>>>           ^
>>> >>> >>>>
>>> >>> >>>> On Tue, Jul 5, 2016 at 3:32 PM, Sean Owen <so...@cloudera.com> wrote:
>>> >>> >>>>> Do you not mean ds.foreachPartition(_.foreach(println)) or
>>> >>> >>>>> similar?
>>> >>> >>>>>
>>> >>> >>>>> On Tue, Jul 5, 2016 at 2:22 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>> >>> >>>>>> Hi,
>>> >>> >>>>>>
>>> >>> >>>>>> It's with the master built today. Why can't I call
>>> >>> >>>>>> ds.foreachPartition(println)? Is using a type annotation the only
>>> >>> >>>>>> way to go forward? I'd be so sad if that's the case.
>>> >>> >>>>>>
>>> >>> >>>>>> scala> ds.foreachPartition(println)
>>> >>> >>>>>> <console>:28: error: overloaded method value foreachPartition with alternatives:
>>> >>> >>>>>>   (func: org.apache.spark.api.java.function.ForeachPartitionFunction[Record])Unit <and>
>>> >>> >>>>>>   (f: Iterator[Record] => Unit)Unit
>>> >>> >>>>>>  cannot be applied to (Unit)
>>> >>> >>>>>>        ds.foreachPartition(println)
>>> >>> >>>>>>           ^
>>> >>> >>>>>>
>>> >>> >>>>>> scala> sc.version
>>> >>> >>>>>> res9: String = 2.0.0-SNAPSHOT
>>> >>> >>>>>>
>>> >>> >>>>>> ---------------------------------------------------------------------
>>> >>> >>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
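[Editor's note] The distinction running through the thread -- eta-expansion of println succeeds against a single expected function type and only breaks down when the target method is itself overloaded, as foreachPartition is -- can be sketched in plain Scala (all names below are local and illustrative, not Spark API):

```scala
object EtaDemo {
  // With a single, unambiguous expected function type, the compiler selects
  // the one-argument println overload and eta-expands it to a function value:
  val p: Any => Unit = println

  // foreach is not overloaded, so passing println bare works there too --
  // this is the conversion Cody says people "may be used to":
  def lengths(words: Seq[String]): Seq[Int] = {
    words.foreach(println)
    words.map(_.length)
  }

  def main(args: Array[String]): Unit = {
    p("via the eta-expanded println")
    println(lengths(Seq("ab", "c")).mkString(","))
  }
}
```

Neither call site needs a type annotation, because in each case exactly one parameter type is on offer to guide the expansion.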