Copilot commented on code in PR #3637:
URL: https://github.com/apache/celeborn/pull/3637#discussion_r3014272578


##########
worker/src/main/java/org/apache/celeborn/service/deploy/worker/storage/MapPartitionDataReader.java:
##########
@@ -138,8 +138,34 @@ public void open(FileChannel dataFileChannel, FileChannel 
indexFileChannel, long
       this.dataFileChannel = dataFileChannel;
       this.dataFileChannelSize = dataFileChannel.size();
       this.indexFileChannel = indexFileChannel;
+
+      int numSubpartitions = mapFileMeta.getNumSubpartitions();
+      // If numSubpartitions is 0, it means pushDataHandShake was never 
successfully called.
+      // This can happen when the first handshake failed before any data was 
written.
+      // In this case, check if data file is empty, and if so, treat this as 
an empty partition.
+      if (numSubpartitions == 0) {
+        if (dataFileChannelSize == 0) {
+          logger.warn(
+              "Partition {} has numSubpartitions=0 and empty data file, this 
indicates a failed "
+                  + "handshake before any data was written. Treating as empty 
partition.",
+              fileInfo.getFilePath());
+          isOpen = true;
+          // Mark as finished and notify consumer with backlog=1 so they can 
complete normally
+          closeReader();
+          notifyBacklog(1);

Review Comment:
   In the numSubpartitions==0 && dataFileChannelSize==0 path, 
`notifyBacklog(1)` is advertising a backlog that doesn’t exist (no buffers are 
enqueued) and relies on the client sending at least one credit so `sendData()` 
runs and the reader can be recycled (and thus send BufferStreamEnd when split 
is enabled). This can lead to unnecessary buffer reservation and can stall 
cleanup if the client never sends credits (e.g., no buffers available / reader 
closing).
   
   Consider instead directly recycling/releasing the reader in this branch (so 
BufferStreamEnd is emitted when appropriate) without faking backlog, or only 
sending a backlog that matches actual queued buffers (0) and using an explicit 
end-of-stream notification for the split-enabled case.
   ```suggestion
             // Mark as finished and notify consumer with a backlog matching 
actual queued buffers (0)
             closeReader();
             notifyBacklog(0);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to