[jira] [Updated] (HUDI-2703) [RFC-37] Metadata based bloom index
     [ https://issues.apache.org/jira/browse/HUDI-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-2703:
-----------------------------
    Epic Status: Done

> [RFC-37] Metadata based bloom index
> -----------------------------------
>
>                 Key: HUDI-2703
>                 URL: https://issues.apache.org/jira/browse/HUDI-2703
>             Project: Apache Hudi
>          Issue Type: Epic
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: hudi-umbrellas
>             Fix For: 0.11.0
>
>
> Hudi has indices to assist in tagging incoming records. The most commonly
> used one is the Bloom index. This involves looking up (loading) bloom
> filters from data files, which can be time consuming and can trigger
> throttling in cloud stores like S3. This RFC therefore proposes adding
> bloom filters as a special partition in the metadata table and implementing
> an index based on that.
>
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Updated] (HUDI-2703) [RFC-37] Metadata based bloom index
     [ https://issues.apache.org/jira/browse/HUDI-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-2703:
-----------------------------
    Issue Type: Epic  (was: Improvement)

> [RFC-37] Metadata based bloom index
> -----------------------------------
>
>                 Key: HUDI-2703
>                 URL: https://issues.apache.org/jira/browse/HUDI-2703
>             Project: Apache Hudi
>          Issue Type: Epic
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: hudi-umbrellas
>             Fix For: 0.11.0
>
>
> Hudi has indices to assist in tagging incoming records. The most commonly
> used one is the Bloom index. This involves looking up (loading) bloom
> filters from data files, which can be time consuming and can trigger
> throttling in cloud stores like S3. This RFC therefore proposes adding
> bloom filters as a special partition in the metadata table and implementing
> an index based on that.
>
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
[jira] [Updated] (HUDI-2703) [RFC-37] Metadata based bloom index
     [ https://issues.apache.org/jira/browse/HUDI-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Danny Chen updated HUDI-2703:
-----------------------------
    Fix Version/s:     (was: 0.10.0)

> [RFC-37] Metadata based bloom index
> -----------------------------------
>
>                 Key: HUDI-2703
>                 URL: https://issues.apache.org/jira/browse/HUDI-2703
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>              Labels: hudi-umbrellas
>             Fix For: 0.11.0
>
>
> Hudi has indices to assist in tagging incoming records. The most commonly
> used one is the Bloom index. This involves looking up (loading) bloom
> filters from data files, which can be time consuming and can trigger
> throttling in cloud stores like S3. This RFC therefore proposes adding
> bloom filters as a special partition in the metadata table and implementing
> an index based on that.
>
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
[jira] [Updated] (HUDI-2703) [RFC-37] Metadata based bloom index
     [ https://issues.apache.org/jira/browse/HUDI-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sivabalan narayanan updated HUDI-2703:
--------------------------------------
    Labels: hudi-umbrellas  (was: )

> [RFC-37] Metadata based bloom index
> -----------------------------------
>
>                 Key: HUDI-2703
>                 URL: https://issues.apache.org/jira/browse/HUDI-2703
>             Project: Apache Hudi
>          Issue Type: Improvement
>            Reporter: sivabalan narayanan
>            Priority: Major
>              Labels: hudi-umbrellas
>             Fix For: 0.10.0
>
>
> Hudi has indices to assist in tagging incoming records. The most commonly
> used one is the Bloom index. This involves looking up (loading) bloom
> filters from data files, which can be time consuming and can trigger
> throttling in cloud stores like S3. This RFC therefore proposes adding
> bloom filters as a special partition in the metadata table and implementing
> an index based on that.
>
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)