[ https://issues.apache.org/jira/browse/HUDI-3883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Kudinkin updated HUDI-3883:
----------------------------------
    Description: 
Even after HUDI-3709, I still see that file sizing doesn't seem to be properly respected when writing a partitioned table. In this case I was running an ingestion job with the following configs, which were supposed to yield ~100 MB files:
{code:java}
Map(
  "hoodie.parquet.small.file.limit" -> String.valueOf(100 * 1024 * 1024), // 100 MB
  "hoodie.parquet.max.file.size"    -> String.valueOf(120 * 1024 * 1024)  // 120 MB
) {code}

Instead, my table contains a lot of very small (~1 MB) files:

!Screen Shot 2022-04-14 at 1.08.19 PM.png!

  was:
Even after XXX, I still see that file sizing doesn't seem to be properly respected when writing a partitioned table. In this case I was running an ingestion job with the following configs, which were supposed to yield ~100 MB files:
{code:java}
Map(
  "hoodie.parquet.small.file.limit" -> String.valueOf(100 * 1024 * 1024), // 100 MB
  "hoodie.parquet.max.file.size"    -> String.valueOf(120 * 1024 * 1024)  // 120 MB
) {code}

Instead, my table contains a lot of very small (~1 MB) files:

!Screen Shot 2022-04-14 at 1.08.19 PM.png!

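For reference, the byte values that the two options in the description resolve to can be sketched as below. This is only an illustration and is not taken from the ingestion job: the object name `FileSizingSketch` and the helpers `mib` and `smallFiles` are hypothetical, and only the two option keys come from the description.

```scala
object FileSizingSketch {
  // Hypothetical helper: convert mebibytes to bytes, using Long to avoid Int overflow.
  def mib(n: Long): Long = n * 1024L * 1024L

  // The two options from the description, rendered as string values.
  val fileSizingOpts: Map[String, String] = Map(
    "hoodie.parquet.small.file.limit" -> mib(100).toString, // 104857600 bytes
    "hoodie.parquet.max.file.size"    -> mib(120).toString  // 125829120 bytes
  )

  // Hypothetical helper: of the observed file sizes (in bytes), keep those
  // strictly below the given limit (defaults to the 100 MiB small-file limit).
  def smallFiles(sizes: Seq[Long], limit: Long = mib(100)): Seq[Long] =
    sizes.filter(_ < limit)
}
```

With these values, a ~1 MB file (1048576 bytes) is about 1% of the configured small-file limit, so `FileSizingSketch.smallFiles(Seq(FileSizingSketch.mib(1), FileSizingSketch.mib(110)))` keeps only the 1 MiB entry.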
> File-sizing issues when writing COW table to S3
> -----------------------------------------------
>
>                 Key: HUDI-3883
>                 URL: https://issues.apache.org/jira/browse/HUDI-3883
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Alexey Kudinkin
>            Assignee: Alexey Kudinkin
>            Priority: Blocker
>         Attachments: Screen Shot 2022-04-14 at 1.08.19 PM.png
>
>
> Even after HUDI-3709, I still see that file sizing doesn't seem to be
> properly respected when writing a partitioned table. In this case I was
> running an ingestion job with the following configs, which were supposed
> to yield ~100 MB files:
> {code:java}
> Map(
>   "hoodie.parquet.small.file.limit" -> String.valueOf(100 * 1024 * 1024), // 100 MB
>   "hoodie.parquet.max.file.size"    -> String.valueOf(120 * 1024 * 1024)  // 120 MB
> ) {code}
>
> Instead, my table contains a lot of very small (~1 MB) files:
> !Screen Shot 2022-04-14 at 1.08.19 PM.png!

--
This message was sent by Atlassian Jira
(v8.20.1#820001)