Re: [QUESTION] Handle record partition change

2019-12-18 Thread Shiyan Xu
Sure. I can create a JIRA and note down the discussion points there.

On Wed, Dec 18, 2019 at 7:14 PM Vinoth Chandar wrote:
> Interesting discussion. We can file a JIRA for option 2? It seems to also
> make the semantics simpler.
>
> On Wed, Dec 18, 2019 at 11:21 AM Shiyan Xu wrote:
> > Tha

Re: [QUESTION] Handle record partition change

2019-12-18 Thread Vinoth Chandar
Interesting discussion. We can file a JIRA for option 2? It seems to also make the semantics simpler.

On Wed, Dec 18, 2019 at 11:21 AM Shiyan Xu wrote:
> Thanks Sivabalan. Exactly, that's what I meant.
> I can think of a usecase for option 2: a Hudi dataset manages people info
> and partitioned

Re: [QUESTION] Handle record partition change

2019-12-18 Thread Shiyan Xu
Thanks Sivabalan. Exactly, that's what I meant. I can think of a use case for option 2: a Hudi dataset manages people info and is partitioned by birthday. In most cases, when people info is updated, birthdays do not change (that's why we chose it as the partition field). But in some edge cases w

Re: [QUESTION] Handle record partition change

2019-12-18 Thread Sivabalan
Raymond, the patch which I have put up works differently. If the initial record is in Partition1, and updates are sent to Partition2, we silently update the record in Partition1. I guess you are asking for the opposite, i.e. insert in Partition2 and d
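
The behavior described for the patch can be sketched with a toy model. This is not Hudi code: `upsert_global_keep_original`, the `index` dict, and the nested `table` dict are hypothetical names used only to illustrate a global key lookup that routes an update back to the partition where the key was first written.

```python
def upsert_global_keep_original(table, index, record_key, incoming_partition, payload):
    """Toy upsert: look the key up across all partitions (hypothetical global index).

    If the key is already known, the write goes to its original partition and the
    incoming partition path is ignored. Otherwise the record is inserted where it
    arrived. Returns the partition that was actually written.
    """
    target = index.get(record_key, incoming_partition)
    table.setdefault(target, {})[record_key] = payload
    index[record_key] = target
    return target


table = {}   # partition path -> {record key -> payload}
index = {}   # record key -> partition path (stand-in for a global index)

first = upsert_global_keep_original(table, index, "id-1", "Partition1", {"v": 1})
second = upsert_global_keep_original(table, index, "id-1", "Partition2", {"v": 2})
```

In this sketch the second write lands in `Partition1`, and `Partition2` is never created, which mirrors the "silently update the record in Partition1" behavior described above.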

Re: [QUESTION] Handle record partition change

2019-12-18 Thread Shiyan Xu
Hi Sivabalan, sorry for the late reply. I now see that GLOBAL_BLOOM allows records to be looked up across different partitions. This is indeed helpful when a record with the same key has its partition path updated. Now I'm thinking when we "tagLocationBacktoRecords

Re: [QUESTION] Handle record partition change

2019-12-11 Thread Sivabalan
Depends on whether you are using regular BLOOM or GLOBAL_BLOOM. May I know which one you are talking about?

On Wed, Dec 11, 2019 at 9:12 AM Shiyan Xu wrote:
> Hi Hudi devs,
>
> Upon upsert operations, does Hudi detect record's partition path change? As
> for the same record, the partition path

[QUESTION] Handle record partition change

2019-12-11 Thread Shiyan Xu
Hi Hudi devs, Upon upsert operations, does Hudi detect a change in a record's partition path? For the same record, the partition path field may be updated while the record key (the primary ID) stays the same; the insert would then result in a duplicate record (based on record key) in the dataset. Is th
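
The duplicate scenario in the question can be illustrated with a toy model. This is a sketch, not Hudi's implementation: it assumes an upsert that only looks for the record key inside the incoming partition, and `upsert_partition_scoped` and `partitions_holding` are hypothetical helper names.

```python
from collections import defaultdict


def upsert_partition_scoped(table, record_key, partition_path, payload):
    """Toy upsert that matches existing records only within the incoming partition."""
    table[partition_path][record_key] = payload


def partitions_holding(table, record_key):
    """Return every partition that currently stores the given record key."""
    return sorted(p for p, records in table.items() if record_key in records)


table = defaultdict(dict)  # partition path -> {record key -> payload}

# Initial write, then an "update" whose partition field (birthday) has changed.
upsert_partition_scoped(table, "id-1", "1990-01-01", {"name": "A"})
upsert_partition_scoped(table, "id-1", "1990-02-01", {"name": "A"})

duplicates = partitions_holding(table, "id-1")
```

Under that assumption the same record key ends up stored in both partitions, which is the duplicate the question asks about.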