Here are the latest DSv2 sync notes. Please reply with updates or
corrections.
*Attendees*:
- Ryan Blue
- Michael Armbrust
- Gengliang Wang
- Matt Cheah
- John Zhuge
*Topics*:
- Wenchen’s reorganization proposal
- Problems with TableProvider: the property map isn’t sufficient
- New PRs:
  - ReplaceTable: https…
Yeah, PyArrow is the only other PySpark dependency we check for a minimum
version. We updated that not too long ago to be 0.12.1, which I think we
are still good on for now.
On Fri, Jun 14, 2019 at 11:36 AM Felix Cheung wrote:
> How about pyArrow?
>
> --
> *From:* Hol…
+1 (binding)
I think this is a really important feature for Spark.
First, there is already a lot of interest in alternative shuffle storage in
the community, from dynamic allocation in Kubernetes to even just improving
stability…
How about pyArrow?
From: Holden Karau
Sent: Friday, June 14, 2019 11:06:15 AM
To: Felix Cheung
Cc: Bryan Cutler; Dongjoon Hyun; Hyukjin Kwon; dev; shane knapp
Subject: Re: [DISCUSS] Increasing minimum supported version of Pandas
Are there other Python dependencies…
Are there other Python dependencies we should consider upgrading at the
same time?
On Fri, Jun 14, 2019 at 7:45 PM Felix Cheung wrote:
> So to be clear, min version check is 0.23
> Jenkins test is 0.24
>
> I’m ok with this. I hope someone will test 0.23 on releases though before
> we sign off?
>
So to be clear, min version check is 0.23
Jenkins test is 0.24
I’m ok with this. I hope someone will test 0.23 on releases though before we
sign off?
From: shane knapp
Sent: Friday, June 14, 2019 10:23:56 AM
To: Bryan Cutler
Cc: Dongjoon Hyun; Holden Karau; Hyuk…
Now you can see the exposed component labels (ordered by the number of
PRs) here, and click a component to search:
https://github.com/apache/spark/labels?sort=count-desc
Dongjoon.
On Fri, Jun 14, 2019 at 1:15 AM Dongjoon Hyun wrote:
> Hi, All.
>
> The JIRA and PR are ready for reviews.
>
> h…
Just surfacing this change as it's probably pretty good to go, but (a)
I'm not a jQuery / JS expert and (b) we don't have comprehensive UI
tests.
https://github.com/apache/spark/pull/24843
I'd like to get us up to a modern jQuery for 3.0, to keep up with
security fixes (which was the minor motivation…
excellent. i shall not touch anything. :)
On Fri, Jun 14, 2019 at 10:22 AM Bryan Cutler wrote:
> Shane, I think 0.24.2 is probably more common right now, so if we were to
> pick one to test against, I still think it should be that one. Our Pandas
> usage in PySpark is pretty conservative, so i…
Shane, I think 0.24.2 is probably more common right now, so if we were to
pick one to test against, I still think it should be that one. Our Pandas
usage in PySpark is pretty conservative, so it's pretty unlikely that we
will add something that would break 0.23.X.
On Fri, Jun 14, 2019 at 10:10 AM …
+1 (non-binding). This API is versatile and flexible enough to handle
Bloomberg's internal use-cases. The ability for us to vary implementation
strategies is quite appealing. It is also worth noting the minimal changes
to Spark core needed to make it work. This is a very much needed addition
wit…
ah, ok... should we downgrade the testing env on jenkins then? any
specific version?
shane, who is loath (and i mean LOATH) to touch python envs ;)
On Fri, Jun 14, 2019 at 10:08 AM Bryan Cutler wrote:
> I should have stated this earlier, but when the user does something that
> requires Pandas…
I should have stated this earlier, but when the user does something that
requires Pandas, the minimum version is checked against what was imported
and will raise an exception if it is a lower version. So I'm concerned that
using 0.24.2 might be a little too new for users running older clusters. To…
+1 This is great work, allowing plugging in different sort shuffle
write/read implementations! Also great to see it retain the current Spark
configuration (spark.shuffle.manager=org.apache.spark.shuffle.YourShuffleManagerImpl).
On Thu, Jun 13, 2019 at 2:58 PM Matt Cheah wrote:
> Hi everyone,
>
> …
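The configuration retained above (spark.shuffle.manager set to a fully qualified class name) follows a common plugin pattern: the config value names a class, which is resolved by reflection and instantiated at startup. Below is a generic, hedged Python sketch of that pattern only; `collections.OrderedDict` merely stands in for a real shuffle-manager class, and none of this is Spark's actual (JVM) implementation.

```python
import importlib

# A config map whose value is a dotted class path, mirroring the
# spark.shuffle.manager key from the thread; the class path here is a
# stand-in chosen so the sketch is runnable.
conf = {"spark.shuffle.manager": "collections.OrderedDict"}

def load_plugin(conf, key):
    """Instantiate the class named by the dotted path stored under `key`."""
    module_name, _, class_name = conf[key].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls()

manager = load_plugin(conf, "spark.shuffle.manager")
print(type(manager).__name__)  # -> OrderedDict
```

Keeping the existing config key means alternative implementations can be swapped in without any new API surface: users change one string, and reflection does the rest.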
We opened a thread for voting yesterday, so please participate!
-Matt Cheah
From: Yue Li
Date: Thursday, June 13, 2019 at 7:22 PM
To: Saisai Shao, Imran Rashid
Cc: Matt Cheah, "Yifei Huang (PD)", Mridul Muralidharan, Bo Yang,
Ilan Filonenko, Imran Rashid, Justin Uang, Liang Tan …
just so everyone knows, our python 3.6 testing infra is currently on
0.24.2...
On Fri, Jun 14, 2019 at 9:16 AM Dongjoon Hyun wrote:
> +1
>
> Thank you for this effort, Bryan!
>
> Bests,
> Dongjoon.
>
> On Fri, Jun 14, 2019 at 4:24 AM Holden Karau wrote:
>
>> I’m +1 for upgrading, although since…
Thank you for the early notice, Shane! :)
Dongjoon
On Fri, Jun 14, 2019 at 9:13 AM shane knapp wrote:
> the campus colo will be performing some electrical maintenance, which
> means that they'll be powering off the entire building.
>
> since the jenkins cluster is located in that colo, we are m…
+1
Thank you for this effort, Bryan!
Bests,
Dongjoon.
On Fri, Jun 14, 2019 at 4:24 AM Holden Karau wrote:
> I’m +1 for upgrading, although since this is probably the last easy chance
> we’ll have to bump version numbers I’d suggest 0.24.2
>
>
> On Fri, Jun 14, 2019 at 4:38 AM Hyukjin Kw…
the campus colo will be performing some electrical maintenance, which means
that they'll be powering off the entire building.
since the jenkins cluster is located in that colo, we are most definitely
affected. :)
i'll be out of town that weekend, but will have one of my sysadmins bring
everything…
I’m +1 for upgrading, although since this is probably the last easy chance
we’ll have to bump version numbers I’d suggest 0.24.2
On Fri, Jun 14, 2019 at 4:38 AM Hyukjin Kwon wrote:
> I am +1 to go for 0.23.2 - it brings some overhead to test PyArrow and
> pandas combinations. Spark 3 sho…
Hi, All.
The JIRA and PR are ready for reviews.
https://issues.apache.org/jira/browse/SPARK-28051 (Exposing JIRA issue
component types at GitHub PRs)
https://github.com/apache/spark/pull/24871
Bests,
Dongjoon.
On Thu, Jun 13, 2019 at 10:48 AM Dongjoon Hyun wrote:
> Thank you for the feedback and …