[VOTE] Release Apache Spark 2.0.0 (RC2)

2016-07-05 Thread Reynold Xin
Please vote on releasing the following candidate as Apache Spark version 2.0.0. The vote is open until Friday, July 8, 2016 at 23:00 PDT and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.0.0 [ ] -1 Do not release this package because ...

Re: Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Cody Koeninger
I don't think that's a Scala compiler bug. println is a valid expression that returns Unit. Unit is not a single-argument function, and does not match any of the overloads of foreachPartition. You may be used to a conversion taking place when println is passed to a method expecting a function, but
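A minimal sketch of the ambiguity Cody describes, reproducible outside Spark (JavaFunc is a hypothetical stand-in for Spark's Java-side ForeachPartitionFunction; the two overloads mirror Dataset.foreachPartition's Scala and Java variants):

    object OverloadDemo {
      // Stand-in for Spark's Java-friendly ForeachPartitionFunction
      trait JavaFunc[T] { def call(it: java.util.Iterator[T]): Unit }

      // Two overloads mirroring Dataset.foreachPartition
      def foreachPartition(f: Iterator[String] => Unit): Unit =
        f(Iterator("a", "b"))
      def foreachPartition(f: JavaFunc[String]): Unit = ()

      def main(args: Array[String]): Unit = {
        // foreachPartition(println)  // rejected: println is not a function
        //                            // value, and with two overloads the
        //                            // compiler has no single expected type
        //                            // to eta-expand it against
        foreachPartition((it: Iterator[String]) => it.foreach(println))  // OK
      }
    }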

Re: spark git commit: [SPARK-15204][SQL] improve nullability inference for Aggregator

2016-07-05 Thread Reynold Xin
Jacek, This is definitely not necessary, but I wouldn't waste cycles "fixing" things like this when they have virtually zero impact. Perhaps next time we update this code we can "fix" it. Also, can you comment on the pull request directly? On Tue, Jul 5, 2016 at 1:07 PM, Jacek Laskowski

Re: spark git commit: [SPARK-15204][SQL] improve nullability inference for Aggregator

2016-07-05 Thread Koert Kuipers
Oh, you mean instead of: assert(ds3.select(NameAgg.toColumn).schema.head.nullable === true) just do: assert(ds3.select(NameAgg.toColumn).schema.head.nullable)? I did mostly === true because I also had === false, and I liked the symmetry, but sure, this can be fixed if it's not the norm. On Tue, Jul
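For illustration, the two styles side by side in a self-contained ScalaTest sketch (suite and flag names are hypothetical stand-ins for the schema.head.nullable checks in the patch):

    import org.scalatest.FunSuite

    // Hypothetical suite; the Boolean flags stand in for the
    // schema.head.nullable results asserted in the patch.
    class AssertStyleSuite extends FunSuite {
      test("nullability assertion styles") {
        val inferredNullable = true
        val inferredNonNullable = false
        assert(inferredNullable === true)    // symmetric with the === false case
        assert(inferredNonNullable === false)
        assert(inferredNullable)             // the terser equivalent
        assert(!inferredNonNullable)
      }
    }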

Re: spark git commit: [SPARK-15204][SQL] improve nullability inference for Aggregator

2016-07-05 Thread Jacek Laskowski
On Mon, Jul 4, 2016 at 6:14 AM, wrote: > Repository: spark > Updated Branches: > refs/heads/master 88134e736 -> 8cdb81fa8 > > > [SPARK-15204][SQL] improve nullability inference for Aggregator > > ## What changes were proposed in this pull request? > >

Re: Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Jacek Laskowski
Hi Reynold, Is this already reported and tracked somewhere? I'm quite sure that people will be asking about the reasons Spark does this. Where are such issues usually reported? Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark

Re: [VOTE] Release Apache Spark 2.0.0 (RC1)

2016-07-05 Thread Reynold Xin
Please consider this vote canceled and I will work on another RC soon. On Tue, Jun 21, 2016 at 6:26 PM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.0.0. The vote is open until Friday, June 24, 2016 at 19:00 PDT and

Re: Call to new JObject sometimes returns an empty R environment

2016-07-05 Thread Shivaram Venkataraman
-sparkr-dev@googlegroups +dev@spark.apache.org [Please send SparkR development questions to the Spark user / dev mailing lists. Replies inline] > From: > Date: Tue, Jul 5, 2016 at 3:30 AM > Subject: Call to new JObject sometimes returns an empty R environment > To:

Re: Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Reynold Xin
This seems like a Scala compiler bug. On Tuesday, July 5, 2016, Jacek Laskowski wrote: > Well, there is foreach for Java and another foreach for Scala. That's > what I can understand. But while supporting two language-specific APIs > -- Scala and Java -- Dataset API lost

Re: SparkSession replace SQLContext

2016-07-05 Thread Michael Allman
These topics have been included in the documentation for recent builds of Spark 2.0. Michael > On Jul 5, 2016, at 3:49 AM, Romi Kuntsman wrote: > > You can also claim that there's a whole section of "Migrating from 1.6 to > 2.0" missing there: >

Re: Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Jacek Laskowski
Well, there is a foreach for Java and another foreach for Scala. That's what I can understand. But while supporting two language-specific APIs -- Scala and Java -- the Dataset API lost support for such simple calls without type annotations, so you have to be explicit about the variant (since I'm using

Re: Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Sean Owen
Right, I should have noticed that in your second mail. But foreach already does what you want, right? It would be identical here. These two methods do conceptually different things on different arguments. I don't think I'd expect them to accept the same functions. On Tue, Jul 5, 2016 at 3:18

Re: Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Jacek Laskowski
ds is a Dataset, and the problem is that println (or any other one-argument function) would not work here (and perhaps other methods with two variants -- Java's and Scala's). Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark

Why's ds.foreachPartition(println) not possible?

2016-07-05 Thread Jacek Laskowski
Hi, It's with the master built today. Why can't I call ds.foreachPartition(println)? Is using a type annotation the only way to go forward? I'd be so sad if that's the case. scala> ds.foreachPartition(println) :28: error: overloaded method value foreachPartition with alternatives: (func:
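For what it's worth, spelling out the parameter type resolves the overload, e.g. in spark-shell (a sketch assuming ds is a Dataset[String] and the shell's automatic import of spark.implicits._):

    scala> val ds = Seq("hello", "world").toDS()
    scala> ds.foreachPartition((it: Iterator[String]) => it.foreach(println))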

Re: SparkSession replace SQLContext

2016-07-05 Thread Romi Kuntsman
You can also claim that there's a whole section of "Migrating from 1.6 to 2.0" missing there: https://spark.apache.org/docs/2.0.0-preview/sql-programming-guide.html#migration-guide *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com On Tue, Jul 5, 2016 at 12:24 PM, nihed mbarek

SparkSession replace SQLContext

2016-07-05 Thread nihed mbarek
Hi, I just discovered that SparkSession will replace SQLContext in Spark 2.0. The Javadoc is clear https://spark.apache.org/docs/2.0.0-preview/api/java/org/apache/spark/sql/SparkSession.html but there is no mention in the SQL programming guide
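Until the guide catches up, the basic switch looks roughly like this (a sketch against the 2.0.0-preview API; the app name and master are placeholders):

    import org.apache.spark.sql.SparkSession

    object SparkSessionDemo {
      def main(args: Array[String]): Unit = {
        // SparkSession.builder subsumes what new SQLContext(sc) used to do
        val spark = SparkSession.builder()
          .appName("SparkSessionDemo")
          .master("local[*]")
          .getOrCreate()

        import spark.implicits._
        val df = Seq(1, 2, 3).toDF("id")
        df.show()

        spark.stop()
      }
    }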