bhat-vinay opened a new pull request, #10876:
URL: https://github.com/apache/hudi/pull/10876

   ### Change Logs
   
   Allows input records to be sorted in the insert operation. This is still an 
in-progress PR; it is being uploaded to get some test signals.
   
   Pending: Custom sort columns, more unit tests
   
   ### Impact
   
   Allows the input records to be globally sorted for the insert operation. This 
means that the records will be bin-packed into new files in sorted order 
(allowing for efficient pruning of files in the query path). Existing small 
files are bin-packed first, so those existing small files may not be 
efficiently pruned.
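
A minimal sketch of why sorted bin-packing helps pruning, assuming the query engine skips files based on per-file min/max values of the sort key. This is illustrative only and is not the code in this PR: `FileSlice`, `pack_sorted`, `pack_unsorted` and `files_to_scan` are hypothetical names, and the record keys are made-up integers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileSlice:
    """Hypothetical stand-in for one newly written data file."""
    keys: List[int]

    @property
    def min_key(self) -> int:
        return min(self.keys)

    @property
    def max_key(self) -> int:
        return max(self.keys)


def pack_sorted(records: List[int], records_per_file: int) -> List[FileSlice]:
    """Globally sort the records, then cut them into consecutive files."""
    ordered = sorted(records)
    return [FileSlice(ordered[i:i + records_per_file])
            for i in range(0, len(ordered), records_per_file)]


def pack_unsorted(records: List[int], records_per_file: int) -> List[FileSlice]:
    """Cut the records into files in arrival order, without sorting."""
    return [FileSlice(records[i:i + records_per_file])
            for i in range(0, len(records), records_per_file)]


def files_to_scan(files: List[FileSlice], lo: int, hi: int) -> List[FileSlice]:
    """Keep only files whose [min_key, max_key] range overlaps [lo, hi]."""
    return [f for f in files if f.max_key >= lo and f.min_key <= hi]


# Made-up input keys, four records per file.
records = [42, 7, 93, 15, 68, 3, 77, 29, 51, 84, 36, 60]

sorted_files = pack_sorted(records, 4)
unsorted_files = pack_unsorted(records, 4)

# Range query for keys between 40 and 50.
sorted_hits = files_to_scan(sorted_files, 40, 50)      # 1 file overlaps
unsorted_hits = files_to_scan(unsorted_files, 40, 50)  # all 3 files overlap
```

With a global sort the key ranges of the new files do not overlap, so a range predicate touches few files. Records appended to existing small files would widen those files' ranges, which is why such files may still be scanned.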
   
   ### Risk level (write none, low, medium or high below)
   
   Low. Existing tests and new tests should suffice. The new functionality is 
guarded behind a config (which is disabled by default).
   
   ### Documentation Update
   
   New config options need to be documented.
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
