Re: [SPARK-25079] moving from python 3.4 to python 3.6.8, impacts all active branches

2019-04-15 Thread shane knapp
2.4 PR: https://github.com/apache/spark/pull/24379 2.3 PR: https://github.com/apache/spark/pull/24380 both of these branches failed pretty spectacularly during my non-PR testing in the pyspark sql tests, but let's see how they fare when things are run automagically by jenkins. shane On Mon, Ap

Re: Thoughts on dataframe cogroup?

2019-04-15 Thread Chris Martin
Ah, sorry. I've updated the link, which should now give you access. Can you try again? thanks, Chris On Mon, Apr 15, 2019 at 9:49 PM Li Jin wrote: > Hi Chris, > > Thanks! The permissions on the Google doc may not be set up properly. I > cannot view the doc by default. > > Li > > On Mon, Apr

Re: Thoughts on dataframe cogroup?

2019-04-15 Thread Li Jin
Hi Chris, Thanks! The permissions on the Google doc may not be set up properly. I cannot view the doc by default. Li On Mon, Apr 15, 2019 at 3:58 PM Chris Martin wrote: > I've updated the jira so that the main body is now inside a google doc. > Anyone should be able to comment- if you want/ne

Re: Thoughts on dataframe cogroup?

2019-04-15 Thread Chris Martin
I've updated the JIRA so that the main body is now inside a Google doc. Anyone should be able to comment; if you want or need write access, please drop me a mail and I can add you. Ryan, regarding your point about why I'm not proposing to add this to the Scala API, I think the main point

Re: [SPARK-25079] moving from python 3.4 to python 3.6.8, impacts all active branches

2019-04-15 Thread shane knapp
1) i absolutely do not want to test against more than two python versions. consider my foot to have been put down on that. :) 2) i'll start testing against 2.3 and 2.4 now (last week was a bit crazy, so i didn't get around to it). once i'm happy w/the 2.3 and 2.4 results, i'll follow up here a

Re: Thoughts on dataframe cogroup?

2019-04-15 Thread Ryan Blue
I agree, it would be great to have a document to comment on. The main thing that stands out right now is that this is only for PySpark and states that it will not be added to the Scala API. Why not make this available since most of the work would be done? On Mon, Apr 15, 2019 at 7:50 AM Li Jin w

Re: Preserving cache name and storage level upon table refresh

2019-04-15 Thread William Wong
Hi Sean and @gatorsmile, Thanks a lot for your previous review. I updated those tests (https://github.com/apache/spark/pull/24221) accordingly. Could you help review them again? Best regards, William On Wed, Apr 3, 2019 at 1:03 AM William Wong wr

Re: Antlr plugin for sql/catalyst project

2019-04-15 Thread William Wong
Hi Sean, I just submitted a PR updating develop-tools.html (https://github.com/apache/spark-website/pull/195). Could you help review it? Many thanks for your help. Best regards, William On Mon, Apr 15, 2019 at 7:04 AM William Wong wrote: > I built the spark with build/

Re: Thoughts on dataframe cogroup?

2019-04-15 Thread Li Jin
Thank you Chris, this looks great. Would you mind sharing a Google doc version of the proposal? I believe that's the preferred way of discussing proposals (other people, please correct me if I am wrong). Li On Mon, Apr 15, 2019 at 8:20 AM wrote: > Hi, > > As promised I’ve raised SPARK-27463 for

Re: Thoughts on dataframe cogroup?

2019-04-15 Thread chris
Hi, As promised I’ve raised SPARK-27463 for this. All feedback welcome! Chris > On 9 Apr 2019, at 13:22, Chris Martin wrote: > > Thanks Bryan and Li, that is much appreciated. Hopefully should have the > SPIP ready in the next couple of days. > > thanks, > > Chris > > > > >> On Mon,