[ https://issues.apache.org/jira/browse/HUDI-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470041#comment-17470041 ]
Vinoth Chandar commented on HUDI-1628:
--------------------------------------

[~guoyihua] assigning to you to drive this forward.

cc [~thirumalai.raj] please let us know if you are still interested in pursuing this.

> [Umbrella] Improve data locality during ingestion
> -------------------------------------------------
>
>                 Key: HUDI-1628
>                 URL: https://issues.apache.org/jira/browse/HUDI-1628
>             Project: Apache Hudi
>          Issue Type: Epic
>          Components: Writer Core
>            Reporter: satish
>            Assignee: Ethan Guo
>            Priority: Major
>              Labels: hudi-umbrellas
>             Fix For: 0.11.0
>
>
> Today the upsert partitioner does the file sizing/bin-packing etc for
> inserts and then sends some inserts over to existing file groups to
> maintain file size.
> We can abstract all of this into strategies and some kind of pipeline
> abstractions and have it also consider "affinity" to an existing file group
> based on say information stored in the metadata table?
> See http://mail-archives.apache.org/mod_mbox/hudi-dev/202102.mbox/browser
> for more details

--
This message was sent by Atlassian Jira
(v8.20.1#820001)