pltbkd commented on code in PR #3637:
URL: https://github.com/apache/celeborn/pull/3637#discussion_r3015520002
##########
worker/src/main/java/org/apache/celeborn/service/deploy/worker/storage/MapPartitionDataReader.java:
##########
@@ -138,8 +138,34 @@ public void open(FileChannel dataFileChannel, FileChannel indexFileChannel, long
this.dataFileChannel = dataFileChannel;
this.dataFileChannelSize = dataFileChannel.size();
this.indexFileChannel = indexFileChannel;
+
+ int numSubpartitions = mapFileMeta.getNumSubpartitions();
+ // If numSubpartitions is 0, it means pushDataHandShake was never successfully called.
+ // This can happen when the first handshake failed before any data was written.
+ // In this case, check if data file is empty, and if so, treat this as an empty partition.
+ if (numSubpartitions == 0) {
+ if (dataFileChannelSize == 0) {
+ logger.warn(
+ "Partition {} has numSubpartitions=0 and empty data file, this indicates a failed "
+ + "handshake before any data was written. Treating as empty partition.",
+ fileInfo.getFilePath());
+ isOpen = true;
+ // Mark as finished and notify consumer with backlog=1 so they can complete normally
+ closeReader();
+ notifyBacklog(1);
Review Comment:
notifyBacklog(1) is admittedly not a standard operation, but I think it is
acceptable as a fallback: this is a rare case, and without the fix the read
fails over endlessly. We need the client to send a credit, which triggers data
sending so that the reader can be recycled normally. notifyBacklog(0) won't
make the client send a credit. I also tried recycling the reader directly
here, but that left the client stuck, since it could not properly finish
reading. As far as I can tell, this is the only way to handle the issue
without client-side changes.
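
To make the reasoning above concrete, here is a minimal toy model of the
backlog/credit exchange. It is only a sketch: `ToyClient`, `ToyEmptyReader`,
and `completes` are hypothetical names, not Celeborn classes, and the rule
that a client grants a credit only for a positive backlog is an assumption
taken from the comment above, not from the client code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the credit exchange; all names here are hypothetical, not Celeborn's API. */
public class BacklogCreditSketch {

  /** Client side: grants a credit only when the announced backlog is positive (assumption). */
  static class ToyClient {
    final List<String> events = new ArrayList<>();
    boolean finished = false;

    /** Returns the number of credits sent back in response to a backlog announcement. */
    int onBacklog(int backlog) {
      events.add("backlog=" + backlog);
      return backlog > 0 ? 1 : 0; // a zero backlog leaves the client idle
    }

    void onEndOfData() {
      events.add("end-of-data");
      finished = true;
    }
  }

  /** Worker side: a reader over an empty partition that can only finish once a credit arrives. */
  static class ToyEmptyReader {
    boolean recycled = false;

    void openEmpty(ToyClient client, int announcedBacklog) {
      int credits = client.onBacklog(announcedBacklog);
      if (credits > 0) {
        // The credit drives the send path; there is nothing to send,
        // so the reader signals end-of-data and can then be recycled.
        client.onEndOfData();
        recycled = true;
      }
      // With no credit, neither side makes progress: the client keeps waiting.
    }
  }

  /** Returns true if both the client and the reader reach a finished state. */
  static boolean completes(int announcedBacklog) {
    ToyClient client = new ToyClient();
    ToyEmptyReader reader = new ToyEmptyReader();
    reader.openEmpty(client, announcedBacklog);
    return client.finished && reader.recycled;
  }

  public static void main(String[] args) {
    System.out.println("backlog=1 completes: " + completes(1));
    System.out.println("backlog=0 completes: " + completes(0));
  }
}
```

In this model, announcing a backlog of 1 for an empty partition is what lets
the normal finish-and-recycle path run; announcing 0 leaves both sides waiting.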