pltbkd opened a new pull request, #3637:
URL: https://github.com/apache/celeborn/pull/3637

   …ls before any data written
   What changes were proposed in this pull request?
   
       Handle the case where numSubpartitions is zero in 
MapPartitionDataReader.open(). When the partition is empty, treat it as a 
normal empty partition and notify consumers accordingly.
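
       A minimal sketch of the guard described above, not the actual patch. All 
names here (EmptyPartitionGuardSketch, OpenResult, entriesPerSubpartition, the 
notifyEmpty callback) are hypothetical stand-ins, and the division is only an 
assumed example of where numSubpartitions is used as a divisor; the real 
MapPartitionDataReader code may differ.

```java
public class EmptyPartitionGuardSketch {

    /** Outcome of opening a partition in this sketch (hypothetical type). */
    public enum OpenResult { EMPTY, READY }

    /**
     * Assumed stand-in for any computation that divides by numSubpartitions.
     * Throws ArithmeticException ("/ by zero") when numSubpartitions == 0.
     */
    public static int entriesPerSubpartition(int totalEntries, int numSubpartitions) {
        return totalEntries / numSubpartitions;
    }

    /**
     * Guarded open: a partition whose numSubpartitions was never initialized
     * is reported as a normal empty partition instead of being read.
     */
    public static OpenResult open(int totalEntries, int numSubpartitions, Runnable notifyEmpty) {
        if (numSubpartitions == 0) {
            // The handshake never succeeded, so no data was written: notify and stop.
            notifyEmpty.run();
            return OpenResult.EMPTY;
        }
        // Safe to divide here because numSubpartitions is known to be non-zero.
        entriesPerSubpartition(totalEntries, numSubpartitions);
        return OpenResult.READY;
    }

    public static void main(String[] args) {
        OpenResult result = open(0, 0, () -> System.out.println("notified: empty partition"));
        System.out.println(result);
    }
}
```

       The point of the early return is that the consumer sees an ordinary 
end-of-partition signal rather than a reader failure, which keeps older 
clients working without any client-side change.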
   
       Why are the changes needed?
   
       When the first PUSH_DATA_HAND_SHAKE request fails (e.g., on timeout), the 
client triggers a revive with reason HARD_SPLIT. The manager adds the failed 
partition to the partition locations, but numSubpartitions remains uninitialized 
(zero). Reading such a partition causes ArithmeticException: / by zero.
   
       Since this is caused by client-side behavior, we handle it on the worker 
side first for cross-version compatibility. The issue that the Flink shuffle 
client always revives with the fixed reason HARD_SPLIT can be addressed in 
later PRs.
   
       Does this PR resolve a correctness bug?
   
       No
   
       Does this PR introduce any user-facing change?
   
       No
   
       How was this patch tested?
   
       Manually tested with a hacked version that throws an exception on the first 
handshake invocation. The test code is too hacky to be included in this PR. 
Advice is welcome on how to add a proper unit test for this scenario without 
introducing too much complexity.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
