Re: Proposal for RESTful Data Operations

2024-04-23 Thread Drew
I’d like to start a discussion on thoughts around introducing this
>>>>>>> new endpoint and its functionality to support the ongoing fine-grained
>>>>>>> metadata commit efforts. From the discussion in the ContentFile spec
>>>>>>> ch

Re: Proposal for RESTful Data Operations

2024-02-28 Thread Ryan Blue
"accept-delay-ms": 30, // acceptable delay for processing
>>>>>> "data-files": [...]
>>>>>> }
>>>>>>
>>>>>> *Response*:
>>>>>> 202 accepted
>>>>>> {
>>>>>>

Re: Proposal for RESTful Data Operations

2024-02-26 Thread Jack Ye
g your thoughts on the asynchronous
>>>>> operation model and the suggested endpoint structure.
>>>>>
>>>>> Building on this, we previously discussed having these update options:
>>>>> RemoveDataFiles and RemoveDeleteFiles. Given this n

Re: Proposal for RESTful Data Operations

2024-02-21 Thread Ryan Blue
int and specify operationType, or do establish distinct
>>>> endpoints for these operations? Given that appends can support batch
>>>> processing, we can accommodate this in the request model.
>>>>
>>>> Thank you,
>>>> Drew
>>>> >

Re: Proposal for RESTful Data Operations

2024-02-21 Thread Jack Ye
>> I wanted to provide a quick update on the progress of the commit API
>>>> proposal. Based on the feedback in the design doc and the Slack
>>>> conversation with Dan and Jack, we've reached an agreement that this is
>>>> more of a fine-grained metadata commit, r

Re: Proposal for RESTful Data Operations

2024-02-21 Thread Ryan Blue
ing the
>>> necessary tests to ensure its end-to-end functionality.
>>>
>>> Thanks for all the feedback, I still have an open PR for an appendFiles.
>>> If you have a chance to review, I would appreciate any additional feedback
>>> you may have.
>>>

Re: Proposal for RESTful Data Operations

2024-02-20 Thread Jack Ye
endFiles.
>> If you have a chance to review, I would appreciate any additional feedback
>> you may have.
>>
>> https://github.com/apache/iceberg/pull/9292
>>
>> Best,
>>
>> Drew
>>
>> On Fri, Jan 12, 2024 at 3:40 PM Drew wrote:
>> >

Re: Proposal for RESTful Data Operations

2024-02-20 Thread Drew
>
> Best,
>
> Drew
>
> On Fri, Jan 12, 2024 at 3:40 PM Drew wrote:
>
>> Hi everyone,
>>
>> I hope you all had great holidays! I wanted to resurface this proposal
>> for RESTful Data operations.
>>
>> Currently, I have an open PR here:
>> http

Re: Proposal for RESTful Data Operations

2024-01-26 Thread Drew
hance to review, I would appreciate any additional feedback you may have.
https://github.com/apache/iceberg/pull/9292
Best,
Drew
On Fri, Jan 12, 2024 at 3:40 PM Drew wrote:
> Hi everyone,
>
> I hope you all had great holidays! I wanted to resurface this proposal for
> RESTful Data o

Re: Proposal for RESTful Data Operations

2024-01-12 Thread Drew
Hi everyone,
I hope you all had great holidays! I wanted to resurface this proposal for RESTful Data operations.
Currently, I have an open PR here: https://github.com/apache/iceberg/pull/9292
Thanks,
Drew
On Wed, Dec 13, 2023 at 3:04 PM Jack Ye wrote:
> Thanks Drew for the quick turnaround

Re: Proposal for RESTful Data Operations

2023-12-13 Thread Jack Ye
Thanks Drew for the quick turnaround, I will take a deeper look into the PR. I think if we all agree that it is beneficial to have the AppendFiles(DataFile[]) API (maybe we should call it AppendRows instead), I would like to know if it also makes sense to have:
1. DeleteRows(DeleteFile[]), which c

Re: Proposal for RESTful Data Operations

2023-12-13 Thread Drew
Hi Ryan,
Thanks for the feedback, I'll start going through the comments left in the doc! You're right in pointing out that the logic here can be simplified to roll back a commit. For now I introduced a smaller PR that focuses on the append files operation.
GitHub PR: https://github.com/apache/ic

Re: Proposal for RESTful Data Operations

2023-12-11 Thread Ryan Blue
> Based on my understanding of the proposal, I think it's more about the possibility of enabling other ways that do not require a full rollback. It's just currently we implemented it as a rollback to prove the feasibility.
My main question is this: what can be done besides rolling back a commit? A

Re: Proposal for RESTful Data Operations

2023-12-11 Thread Jack Ye
> The proposal is to roll back rewrite commits, but that's already possible with the much simpler API that exists today.
Based on my understanding of the proposal, I think it's more about the possibility of enabling other ways that do not require a full rollback. It's just currently we implemented

Re: Proposal for RESTful Data Operations

2023-12-08 Thread Ryan Blue
Thanks, Drew. I think it's a good idea in general to be able to perform commits on the server-side, but I would much rather break this down into smaller parts. I would definitely want to start with just file append use cases, since I think that is the biggest win. It can reduce retries and is an e

RE: Proposal for RESTful Data Operations

2023-12-08 Thread Gallardo, Drew
In regards to the multiple emails sent earlier, please use this one for discussions. Thank you!
On 2023/12/07 00:47:42 Drew wrote:
> Hi everyone,
>
> My name is Drew Gallardo, and I’m a part of the Iceberg team at Amazon EMR
> and Athena. I’m reaching out to share a proposal that introduces d

Proposal for RESTful Data operations

2023-12-08 Thread Drew
Hi everyone, My name is Drew Gallardo, and I’m a part of the Iceberg team at Amazon EMR and Athena. I’m reaching out to share a proposal that introduces data commits as a part of the RESTCatalog. The current process for data commits lives on the client side, and by shifting this logic into the RES

Proposal for RESTful Data Operations

2023-12-06 Thread Drew
Hi everyone, My name is Drew Gallardo, and I’m a part of the Iceberg team at Amazon EMR and Athena. I’m reaching out to share a proposal that introduces data commits as a part of the RESTCatalog. The current process for data commits lives on the client side, and by shifting this logic into the RES