[ https://issues.apache.org/jira/browse/HUDI-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
sivabalan narayanan updated HUDI-3883:
--------------------------------------
    Sprint: Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25, 2022/05/02, 2022/05/16, 2022/05/31, 2022/08/22  (was: Hudi-Sprint-Apr-19, Hudi-Sprint-Apr-25, 2022/05/02, 2022/05/16, 2022/05/31, 2022/08/08)

> Bulk-insert w/ sort-mode "NONE" leads to file-sizing issues
> -----------------------------------------------------------
>
>                 Key: HUDI-3883
>                 URL: https://issues.apache.org/jira/browse/HUDI-3883
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>         Attachments: Screen Shot 2022-04-14 at 1.08.19 PM.png
>
>
> Even after HUDI-3709, I still see that file sizing is not properly
> respected when writing a partitioned table. In this case I was running an
> ingestion job with the following configs, which were supposed to yield
> ~100 MB files:
> {code:java}
> Map(
>   "hoodie.parquet.small.file.limit" -> String.valueOf(100 * 1024 * 1024), // 100 MB
>   "hoodie.parquet.max.file.size"    -> String.valueOf(120 * 1024 * 1024)  // 120 MB
> ) {code}
>
> Instead, my table contains a lot of very small (~1 MB) files:
> !Screen Shot 2022-04-14 at 1.08.19 PM.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)