Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-11-12 Thread Anton Okolnychyi
I agree with the idea to start getting parts as soon as possible to make sure the APIs are well-defined and the implementation is generic. I have everything ready from my side. - Anton On Fri, Nov 12, 2021 at 17:47, L. C. Hsieh wrote: > Hi all, > > I think mostly we are in favor for the SPIP as I've

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-11-12 Thread L. C. Hsieh
Hi all, From what I've seen, we are mostly in favor of the SPIP. If there are no more comments or discussion on the SPIP doc, I will raise a vote soon. Thanks. On Tue, Nov 2, 2021 at 9:58 AM L. C. Hsieh wrote: > > +1 for the idea to commit the work earlier. > > I think we will raise the voting soon.

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-11-02 Thread L. C. Hsieh
+1 for the idea to commit the work earlier. I think we will raise the vote soon. Once it has passed, we can submit the PRs. What do you think, Anton? On Mon, Nov 1, 2021 at 7:59 AM Wenchen Fan wrote: > > The general idea looks great. This is indeed a complicated API and we > probably need

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-11-01 Thread Wenchen Fan
The general idea looks great. This is indeed a complicated API and we probably need more time to evaluate the API design. It's better to commit this work earlier so that we have more time to verify it before the 3.3 release. Maybe we can commit the group-based API first, then the delta-based one,

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-10-27 Thread L . C . Hsieh
Thanks for the initial feedback. I think the community was previously busy with work related to the Spark 3.2 release. Now that the 3.2 release is done, I'd like to bring this up again and seek more discussion and feedback. Thanks. On 2021/06/25 15:49:49, huaxin gao wrote: > I

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-25 Thread huaxin gao
I took a quick look at the PR and it looks like a great feature to have. It provides unified APIs for data sources to perform the commonly used operations easily and efficiently, so users don't have to implement custom extensions on their own. Thanks, Anton, for the work! On Thu, Jun 24, 2021 at

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-24 Thread L . C . Hsieh
Thanks, Anton. I'm volunteering to be the shepherd of the SPIP. This is also my first time shepherding a SPIP, so please let me know if there is anything I can improve. These look like great features, and the rationale given in the proposal makes sense. These operations are getting more common and more

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-24 Thread Jungtaek Lim
Meta question: this doesn't target Spark 3.2, right? Many folks have been working on the branch cut for Spark 3.2, so they might be less available to jump into new feature proposals right now. On Fri, Jun 25, 2021 at 9:00 AM Holden Karau wrote: > I took an initial look at the PRs this morning and I’ll go

Re: [DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-24 Thread Holden Karau
I took an initial look at the PRs this morning and I’ll go through the design doc in more detail but I think these features look great. It’s especially important with the CA regulation changes to make this easier for folks to implement. On Thu, Jun 24, 2021 at 4:54 PM Anton Okolnychyi wrote: >

[DISCUSS] SPIP: Row-level operations in Data Source V2

2021-06-24 Thread Anton Okolnychyi
Hey everyone, I'd like to start a discussion on adding support for executing row-level operations such as DELETE, UPDATE, MERGE for v2 tables (SPARK-35801). The execution should be the same across data sources and the best way to do that is to implement it in Spark. Right now, Spark can only