zhouyejoe commented on pull request #32007:
URL: https://github.com/apache/spark/pull/32007#issuecomment-846476449


   > 1. `RemoteBlockPushResolver` needs to ignore any `PushBlock` message from a
   > previous attempt; otherwise it will merge a block from a previous attempt
   > into the files of the latest attempt, which corrupts the merged files.
   > 2. We should keep only active partition info in the `partitions` map and
   > delete stale entries (partition info belonging to old attempts).
   > 3. We need to add UTs verifying that the server ignores any `PushBlock`
   > messages from previous attempts.
   > 4. I don't think we need to add `attemptId` to the `FinalizeMerge` message.
   
   1. Added logic to ignore `PushBlock` messages from previous attempts.
   2. Added logic to delete stale entries from the `partitions` hashmap.
   3. Added a UT.
   4. As discussed above, the `attemptId` is still needed in the
   `FinalizeMerge` message.
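
To make points 1 and 2 concrete, here is a minimal sketch of the idea, not the code in this PR. The class `AttemptAwarePushResolver`, its `PartitionId` key, and the method names are all hypothetical; the real `RemoteBlockPushResolver` has a different structure and writes to merged files rather than counting bytes. The sketch assumes attempt IDs increase monotonically per application.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical, simplified stand-in; NOT Spark's RemoteBlockPushResolver. */
public class AttemptAwarePushResolver {

  /** Identifies one merged partition of one application attempt (assumed key shape). */
  static final class PartitionId {
    final String appId;
    final int attemptId;
    final int shuffleId;
    final int reduceId;

    PartitionId(String appId, int attemptId, int shuffleId, int reduceId) {
      this.appId = appId;
      this.attemptId = attemptId;
      this.shuffleId = shuffleId;
      this.reduceId = reduceId;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof PartitionId)) {
        return false;
      }
      PartitionId p = (PartitionId) o;
      return appId.equals(p.appId) && attemptId == p.attemptId
          && shuffleId == p.shuffleId && reduceId == p.reduceId;
    }

    @Override
    public int hashCode() {
      return Objects.hash(appId, attemptId, shuffleId, reduceId);
    }
  }

  // Latest attempt seen per application.
  private final Map<String, Integer> latestAttempt = new HashMap<>();
  // Bytes merged so far per active partition (stands in for the merged-file state).
  private final Map<PartitionId, Long> partitions = new HashMap<>();

  /**
   * Handles a pushed block. Returns false when the block belongs to an older
   * attempt and is therefore ignored; returns true when it is merged.
   */
  public synchronized boolean onPushBlock(
      String appId, int attemptId, int shuffleId, int reduceId, int numBytes) {
    int latest = latestAttempt.getOrDefault(appId, -1);
    if (attemptId < latest) {
      // Point 1: never merge a block from a previous attempt.
      return false;
    }
    if (attemptId > latest) {
      latestAttempt.put(appId, attemptId);
      // Point 2: drop partition info that belongs to older attempts of this app.
      partitions.keySet().removeIf(
          p -> p.appId.equals(appId) && p.attemptId < attemptId);
    }
    partitions.merge(
        new PartitionId(appId, attemptId, shuffleId, reduceId),
        (long) numBytes, Long::sum);
    return true;
  }

  /** Number of partitions currently tracked, across all applications. */
  public synchronized int activePartitionCount() {
    return partitions.size();
  }
}
```

In this sketch the first push from a newer attempt both raises the recorded attempt and evicts the older attempt's entries, so a late `PushBlock` from the old attempt finds nothing to merge into and is rejected.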




