[ https://issues.apache.org/jira/browse/HUDI-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373869#comment-17373869 ]
ASF GitHub Bot commented on HUDI-2122: -------------------------------------- wangxianghu commented on a change in pull request #3213: URL: https://github.com/apache/hudi/pull/3213#discussion_r663303427 ########## File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java ########## @@ -185,21 +185,26 @@ private void assignInserts(WorkloadProfile profile, HoodieEngineContext context) // first try packing this into one of the smallFiles for (SmallFile smallFile : smallFiles) { - long recordsToAppend = Math.min((config.getParquetMaxFileSize() - smallFile.sizeBytes) / averageRecordSize, - totalUnassignedInserts); - if (recordsToAppend > 0 && totalUnassignedInserts > 0) { - // create a new bucket or re-use an existing bucket - int bucket; - if (updateLocationToBucket.containsKey(smallFile.location.getFileId())) { - bucket = updateLocationToBucket.get(smallFile.location.getFileId()); - LOG.info("Assigning " + recordsToAppend + " inserts to existing update bucket " + bucket); - } else { - bucket = addUpdateBucket(partitionPath, smallFile.location.getFileId()); - LOG.info("Assigning " + recordsToAppend + " inserts to new update bucket " + bucket); + if (totalUnassignedInserts > 0) { + long recordsToAppend = Math.min((config.getParquetMaxFileSize() - smallFile.sizeBytes) / averageRecordSize, + totalUnassignedInserts); + if (recordsToAppend > 0) { + // create a new bucket or re-use an existing bucket + int bucket; + if (updateLocationToBucket.containsKey(smallFile.location.getFileId())) { + bucket = updateLocationToBucket.get(smallFile.location.getFileId()); + LOG.info("Assigning " + recordsToAppend + " inserts to existing update bucket " + bucket); + } else { + bucket = addUpdateBucket(partitionPath, smallFile.location.getFileId()); + LOG.info("Assigning " + recordsToAppend + " inserts to new update bucket " + bucket); + } + bucketNumbers.add(bucket); + recordsPerBucket.add(recordsToAppend); + totalUnassignedInserts -= recordsToAppend; } - bucketNumbers.add(bucket); - recordsPerBucket.add(recordsToAppend); - totalUnassignedInserts -= recordsToAppend; + } else { Review comment: > if the main thing you want to fix is to break out early. Can we do it after L203, with a if check, without nesting this 1 level under again in if block? > > ```java > totalUnassignedInserts -= recordsToAppend; > if (totalUnassignedInserts <= 0) { > break; > } > ``` Yes, I want to break it early. done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Improvement in packaging insert into smallfiles > ----------------------------------------------- > > Key: HUDI-2122 > URL: https://issues.apache.org/jira/browse/HUDI-2122 > Project: Apache Hudi > Issue Type: Improvement > Reporter: Xianghu Wang > Assignee: Xianghu Wang > Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)