Re: [DISCUSS][SPARK-25299] SPIP: Shuffle storage API

2019-05-08 Thread Mridul Muralidharan
Unfortunately I do not have bandwidth to do a detailed review, but a few things come to mind after a quick read: - While it might be tactically beneficial to align with existing implementation, a clean design which does not tie into existing shuffle implementation would be preferable (if it can

RE: Need guidance on Spark Session Termination.

2019-05-08 Thread Nasrulla Khan Haris
Hi all, any inputs here? Thanks, Nasrulla From: Nasrulla Khan Haris Sent: Tuesday, May 7, 2019 3:58 PM To: dev@spark.apache.org Subject: Need guidance on Spark Session Termination. Hi fellow Spark-devs, I am pretty new to Spark core and I am looking for some answers to my use case. I have

[DISCUSS][SPARK-25299] SPIP: Shuffle storage API

2019-05-08 Thread Yifei Huang (PD)
Hi everyone, For the past several months, we have been working on an API for pluggable storage of shuffle data. In this SPIP, we describe the proposed API, its implications, and how it fits into other work being done in the Spark shuffle space. If you're interested in Spark shuffle, and

Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-05-08 Thread Bryan Cutler
+1 (non-binding) On Tue, May 7, 2019 at 12:04 PM Bobby Evans wrote: > I am +1 > > On Tue, May 7, 2019 at 1:37 PM Thomas Graves wrote: > >> Hi everyone, >> >> I'd like to call for another vote on SPARK-27396 - SPIP: Public APIs >> for extended Columnar Processing Support. The proposal is to

Re: SparkR latest API docs missing?

2019-05-08 Thread Shivaram Venkataraman
Comparing https://github.com/apache/spark-website/tree/asf-site/site/docs/2.4.2/api/R and https://github.com/apache/spark-website/tree/asf-site/site/docs/2.4.3/api/R, it looks like the github commit of the docs is missing this. cc'ing recent release managers. Thanks Shivaram On Wed, May 8,

Re: SparkR latest API docs missing?

2019-05-08 Thread Shivaram Venkataraman
Actually I found this while I was uploading the latest release to CRAN -- these docs should be generated as a part of the release process though and shouldn't be related to CRAN. On Wed, May 8, 2019 at 11:24 AM Sean Owen wrote: > > I think the SparkR release always trails a little bit due to the

Re: SparkR latest API docs missing?

2019-05-08 Thread Sean Owen
I think the SparkR release always trails a little bit due to the additional CRAN processes. On Wed, May 8, 2019 at 11:23 AM Shivaram Venkataraman wrote: > > I just noticed that the SparkR API docs are missing at > https://spark.apache.org/docs/latest/api/R/index.html --- It looks > like they

SparkR latest API docs missing?

2019-05-08 Thread Shivaram Venkataraman
I just noticed that the SparkR API docs are missing at https://spark.apache.org/docs/latest/api/R/index.html --- It looks like they were missing from the 2.4.3 release? Thanks Shivaram

Re: Static partitioning in partitionBy()

2019-05-08 Thread Shubham Chaurasia
Thanks On Wed, May 8, 2019 at 10:36 AM Felix Cheung wrote: > You could > > df.filter(col("c") = "c1").write().partitionBy("c").save > > It could get some data skew problem but might work for you > > > > -- > *From:* Burak Yavuz > *Sent:* Tuesday, May 7, 2019 9:35:10