zaynt4606 commented on code in PR #3653:
URL: https://github.com/apache/celeborn/pull/3653#discussion_r3058631000
##########
worker/src/main/scala/org/apache/celeborn/service/deploy/worker/PushDataHandler.scala:
##########
@@ -1456,8 +1456,10 @@ class PushDataHandler(val workerSource: WorkerSource) extends BaseMessageHandler
|""".stripMargin)
val diskFileInfo = fileWriter.getDiskFileInfo
if (diskFileInfo != null) {
- if (workerPartitionSplitEnabled && ((diskFull && diskFileInfo.getFileLength > partitionSplitMinimumSize) ||
- (isPrimary && diskFileInfo.getFileLength > fileWriter.getSplitThreshold))) {
+ if (workerPartitionSplitEnabled && diskFull && (diskFileInfo.getFileLength >= partitionSplitMinimumSize)) {
Review Comment:
> In case of disk full, do we need to check `diskFileInfo.getFileLength >=
> partitionSplitMinimumSize`? I think the main priority should be to save the remaining
> buffer disk space and send HARD_SPLIT for every partition.

When `partitionSplitMinimumSize` is only 1MB, files below that threshold have a
negligible impact on the remaining buffer disk space. Skipping the split for such
tiny files is also a performance trade-off, so the original minimum-size check is
kept unchanged.
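
To make the trade-off concrete, here is a minimal standalone sketch of the disk-full condition from the diff. The object name `SplitDecision`, the method `shouldSplitOnDiskFull`, and the sample sizes are hypothetical and only model the boolean check; this is not the actual `PushDataHandler` code.

```scala
// Hypothetical model of the disk-full branch; names are illustrative only.
object SplitDecision {
  // True when the file is large enough to be worth splitting on a full disk.
  def shouldSplitOnDiskFull(
      splitEnabled: Boolean,
      diskFull: Boolean,
      fileLength: Long,
      minimumSize: Long): Boolean =
    splitEnabled && diskFull && fileLength >= minimumSize

  def main(args: Array[String]): Unit = {
    val oneMiB = 1024L * 1024L // assumed value of partitionSplitMinimumSize

    // A 512 KiB file stays below the minimum, so it is not split.
    println(shouldSplitOnDiskFull(true, true, 512L * 1024L, oneMiB)) // false

    // An 8 MiB file on a full disk meets the minimum, so it is split.
    println(shouldSplitOnDiskFull(true, true, 8L * oneMiB, oneMiB)) // true
  }
}
```

With a 1MB minimum, the files that skip the split each occupy less than 1MB, which is the basis for calling their effect on the remaining buffer space negligible.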