Victsm commented on a change in pull request #30433: URL: https://github.com/apache/spark/pull/30433#discussion_r530564551
##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
##########
@@ -827,13 +833,16 @@ void resetChunkTracker() {
     void updateChunkInfo(long chunkOffset, int mapIndex) throws IOException {
       long idxStartPos = -1;
       try {
-        // update the chunk tracker to meta file before index file
-        writeChunkTracker(mapIndex);
         idxStartPos = indexFile.getFilePointer();
         logger.trace("{} shuffleId {} reduceId {} updated index current {} updated {}",
           appShuffleId.appId, appShuffleId.shuffleId, reduceId, this.lastChunkOffset, chunkOffset);
-        indexFile.writeLong(chunkOffset);
+        indexFile.write(Longs.toByteArray(chunkOffset));
+        // Chunk bitmap should be written to the meta file after the index file because if there are
+        // any exceptions during writing the offset to the index file, meta file should not be
+        // updated. If the update to the index file is successful but the update to meta file isn't
+        // then the index file position is reset in the catch clause.
+        writeChunkTracker(mapIndex);

Review comment:
On dynamically changing merger locations for the current push: if a non-recoverable IOException happens after the majority of the blocks have already been merged, we won't gain much by redirecting the remaining blocks in this shuffle, which were targeted at the failed shuffle service, to a new location. We would probably need to check the merge ratio before deciding whether to do so. The current framework is designed around the best-effort nature of the block push operation, which simplifies the handling of these failure scenarios.
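The write ordering described in the diff comment above can be sketched in isolation as follows. This is a minimal illustration and not the code in RemoteBlockPushResolver: the class name OrderedIndexMetaWrite and the MetaWriter interface are hypothetical stand-ins (MetaWriter plays the role of writeChunkTracker), and ByteBuffer is used in place of Guava's Longs.toByteArray so the sketch has no dependencies. The catch clause only repositions the file pointer, as the diff comment says; whether the real code does anything more there is not shown in this hunk.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class OrderedIndexMetaWrite {
  /** Hypothetical hook standing in for the meta-file update (writeChunkTracker in the diff). */
  interface MetaWriter {
    void write(int mapIndex) throws IOException;
  }

  /**
   * Appends chunkOffset to the index file first, then updates the meta file.
   * If either step throws, the index file pointer is moved back to where it started,
   * so the next successful call overwrites the partial entry.
   */
  static void updateChunkInfo(RandomAccessFile indexFile, MetaWriter meta,
      long chunkOffset, int mapIndex) throws IOException {
    long idxStartPos = -1;
    try {
      idxStartPos = indexFile.getFilePointer();
      // One write call with the 8-byte big-endian encoding of the offset.
      indexFile.write(ByteBuffer.allocate(Long.BYTES).putLong(chunkOffset).array());
      // The meta file is touched only after the index write has succeeded.
      meta.write(mapIndex);
    } catch (IOException e) {
      if (idxStartPos != -1) {
        // Undo the index update by resetting the position.
        indexFile.seek(idxStartPos);
      }
      throw e;
    }
  }

  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("index", ".tmp");
    f.deleteOnExit();
    try (RandomAccessFile indexFile = new RandomAccessFile(f, "rw")) {
      // First update succeeds: pointer advances by 8 bytes.
      updateChunkInfo(indexFile, mapIndex -> { }, 128L, 0);
      try {
        // Second update fails in the meta step: pointer is reset to 8.
        updateChunkInfo(indexFile,
            mapIndex -> { throw new IOException("meta write failed"); }, 256L, 1);
      } catch (IOException expected) {
        // Best-effort semantics: the caller decides what to do with the failure.
      }
      System.out.println(indexFile.getFilePointer()); // prints 8
    }
  }
}
```

Writing the index entry before the meta update means a failed index write never leaves the meta file ahead of the index, and a failed meta write is compensated by resetting the index position.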