Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-28 Thread Ryan Blue
-Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-27 Thread Wenchen Fan
From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-27 Thread Ryan Blue
-Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matei Zaharia
e that for Spark 4? -Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaha

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Reynold Xin
-Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" Date: Tuesday, February 26, 2019 at 4:53 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List Subject: Re: [DISCU

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matt Cheah
ubject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 That's a good question. While I'd love to have a solution for that, I don't think it is a good idea to delay DSv2 until we have one. That is going to require a lot of internal changes and I don't see how we could make the release date if we are

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Ryan Blue
ebruary 26, 2019 at 4:40 PM To: Matt Cheah Cc: Sean Owen, Wenchen Fan, Xiao Li, Matei Zaharia, Spark Dev List Subject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 Thanks for bumping this, Matt. I think we can have the discussion here to clari

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Ryan Blue
Thanks for bumping this, Matt. I think we can have the discussion here to clarify exactly what we’re committing to and then have a vote thread once we’re agreed. Getting back to the DSv2 discussion, I think we have a good handle on what would be added: - Plugin system for catalogs -

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matt Cheah
ia, Spark Dev List Subject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 Thanks for bumping this, Matt. I think we can have the discussion here to clarify exactly what we’re committing to and then have a vote thread once we’re agreed. Getting back to the DSv2 discussion, I think we have a g

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-26 Thread Matt Cheah
What would then be the next steps we'd take to collectively decide on plans and timelines moving forward? Might I suggest scheduling a conference call with appropriate PMCs to put our ideas together? Maybe such a discussion can take place at next week's meeting? Or do we need to have a separate

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-24 Thread Sean Owen
Sure, I don't read anyone making these statements though? Let's assume good intent, that "foo should happen" as "my opinion as a member of the community, which is not solely up to me, is that foo should happen". I understand it's possible for a person to make their opinion over-weighted; this

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-24 Thread Mark Hamstra
> I’m not quite sure what you mean here.
I'll try to explain once more, then I'll drop it since continuing the rest of the discussion in this thread is more important than getting side-tracked. There is nothing wrong with individuals advocating for what they think should or should not be in

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-24 Thread Ryan Blue
Thanks to Matt for his philosophical take. I agree. The intent is to set a common goal, so that we work toward getting v2 in a usable state as a community. Part of that is making choices to get it done on time, which we have already seen on this thread: setting out more clearly what we mean by

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-22 Thread Mark Hamstra
> To your other message: I already see a number of PMC members here. Who's the other entity?
I'll answer indirectly since pointing fingers isn't really my intent. In the absence of a PMC vote, I react negatively to individuals making new declarative policy statements or statements to the

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-22 Thread Sean Owen
To your other message: I already see a number of PMC members here. Who's the other entity? The PMC is the thing that says a thing is a release, sure, but this discussion is properly a community one. And here we are, this is lovely to see. (May I remind everyone to casually, sometime, browse the

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Ryan Blue
also the features that have remained open for the longest time and we really need to move forward on these. Putting a target release for 3.0 will help in that regard. -Matt Cheah From: Ryan Blue Reply-To: "rb...@netflix.com" D

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Matt Cheah
ebruary 21, 2019 at 2:22 PM To: Matei Zaharia Cc: Spark Dev List Subject: Re: [DISCUSS] Spark 3.0 and DataSourceV2 I'm all for making releases more often if we want. But this work could really use a target release to motivate getting it done. If we agree that it will block a release, the

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Ryan Blue
I'm all for making releases more often if we want. But this work could really use a target release to motivate getting it done. If we agree that it will block a release, then everyone is motivated to review and get the PRs in. If this work doesn't make it in the 3.0 release, I'm not confident

Re: [DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Matei Zaharia
How large would the delay be? My 2 cents are that there’s nothing stopping us from making feature releases more often if we want to, so we shouldn’t see this as an “either delay 3.0 or release in >6 months” decision. If the work is likely to get in with a small delay and simplifies our work

[DISCUSS] Spark 3.0 and DataSourceV2

2019-02-21 Thread Ryan Blue
Hi everyone, In the DSv2 sync last night, we had a discussion about roadmap and what the goal should be for getting the main features into Spark. We all agreed that 3.0 should be that goal, even if it means delaying the 3.0 release. The possibility of delaying the 3.0 release may be