[ https://issues.apache.org/jira/browse/HUDI-499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008452#comment-17008452 ]
Raymond Xu commented on HUDI-499:
---------------------------------

[~shivnarayan] Thank you for the info! I guess I'll have to work more on the testing as well.

> Allow partition path to be updated with GLOBAL_BLOOM index
> ----------------------------------------------------------
>
>                 Key: HUDI-499
>                 URL: https://issues.apache.org/jira/browse/HUDI-499
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Raymond Xu
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Context
> When a record is updated with a new partition path and the index is set to
> GLOBAL_BLOOM, the current logic implemented in
> [https://github.com/apache/incubator-hudi/pull/1091/] ignores the new
> partition path and updates the record in its original partition path.
> h3. Proposed change
> Allow records to be inserted into their new partition paths and delete the
> records in the old partition paths. A configuration (e.g.
> {{hoodie.index.bloom.update.partitionpath=true}}) can be added to enable this
> feature.
> h4. An example use case
> A Hudi dataset manages people's info and is partitioned by birthday. In most
> cases where a person's info is updated, the birthday does not change (that's
> why we chose it as the partition field). But in some edge cases the birthday
> was entered incorrectly, and we want to fix it manually or allow the user to
> update it occasionally. In such cases, the proposed change would help keep
> records in the expected partition, so that a query like "show me people who
> were born after 2000" would work.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
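
For illustration only, the behaviour described under "Proposed change" in the issue above could be sketched roughly as follows. This is not Hudi code: `Record`, `tag_records`, and the `update_partition_path` flag are hypothetical names standing in for the proposed setting, and the global index is modelled as a plain dict from record key to the partition path where the record currently lives.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for an incoming record (not a Hudi class)."""
    key: str
    partition_path: str


def tag_records(
    incoming: List[Record],
    existing_locations: Dict[str, str],
    update_partition_path: bool,
) -> List[Tuple[str, str, str]]:
    """Return (operation, partition_path, key) tuples for each incoming record.

    existing_locations models a global index: record key -> current partition.
    update_partition_path is an assumed flag mirroring the proposed config.
    """
    ops: List[Tuple[str, str, str]] = []
    for rec in incoming:
        old_path = existing_locations.get(rec.key)
        if old_path is None:
            # Key not seen before: plain insert into the incoming partition.
            ops.append(("insert", rec.partition_path, rec.key))
        elif old_path == rec.partition_path or not update_partition_path:
            # Same partition, or feature disabled: update in the original
            # partition and ignore any new partition path.
            ops.append(("update", old_path, rec.key))
        else:
            # Partition changed and feature enabled: delete the old copy,
            # then insert the record under its new partition path.
            ops.append(("delete", old_path, rec.key))
            ops.append(("insert", rec.partition_path, rec.key))
    return ops
```

With the flag off, a record whose birthday partition changed is updated in place under its old partition; with the flag on, the same record produces a delete in the old partition followed by an insert in the new one.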