I updated the doc with a few changes related to partition evolution. Thanks.
Jun

On Tue, Dec 22, 2020 at 5:06 PM Ryan Blue <rb...@netflix.com.invalid> wrote:
>
> Thanks, Yan!
>
> To summarize that doc a bit, the main blockers are:
> * Finish updating the spec for NaN counters and behavior
> * Fix the issue with partition transforms and values before 1970 (#1680)
> * Partition evolution: Add lastPartitionFieldId to table metadata and update
>   docs
> * Add order id column to manifest files
> * Track the schema of each snapshot
>
> Only the last one is a somewhat large task, but even that should be fairly
> quick. I think we can take care of those in the first couple months of 2021
> after the 0.11.0 release is out.
>
> On Fri, Dec 18, 2020 at 12:59 AM OpenInx <open...@gmail.com> wrote:
>>
>> Thanks Yan for the document, I will take a look at it, and see what I can
>> do.
>>
>> On Fri, Dec 18, 2020 at 3:38 AM Yan Yan <yyany...@gmail.com> wrote:
>>>
>>> Hi OpenInx,
>>>
>>> Thanks for bringing this up. I am currently working on Format v2 blocking
>>> tasks, and am maintaining a full list of blocking tasks with their
>>> description and current status here after speaking with Ryan a while ago,
>>> which covers all open issues listed in the GitHub milestone plus some
>>> others brought up by people during community sync. It would be great if you
>>> are interested in collaborating/code reviewing!
>>>
>>> Everyone please feel free to let me know/update the doc if you see any item
>>> missing/described inaccurately.
>>>
>>> Thanks,
>>> Yan
>>>
>>> On Wed, Dec 16, 2020 at 11:03 PM OpenInx <open...@gmail.com> wrote:
>>>>
>>>> Hi
>>>>
>>>> I wrote this email to align with the community about the time to expose
>>>> format v2 to end users.
>>>>
>>>> In Iceberg format v2, we've accomplished the row-level delete. It's
>>>> designed for two use cases:
>>>>
>>>> 1. Execute a single query to update or delete lots of rows. It's a
>>>> typical batch update/delete job, which is suitable for GDPR or the case
>>>> that we want to correct the wrong data.
>>>> 2. Write the real-time CDC/UPSERT stream to the Iceberg table, so that
>>>> the upper layer compute engines could analyze the change log in minutes.
>>>> It's almost ready in the current master branch for Flink integration.
>>>>
>>>>
>>>> I'm not quite sure what the blockers for the Iceberg format v2 are now.
>>>> I'd love to resolve those blockers if there are any.
>>>>
>>>> Thanks.
>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix