mridulm commented on a change in pull request #33078:
URL: https://github.com/apache/spark/pull/33078#discussion_r670907624



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java
##########
@@ -778,11 +773,6 @@ public void onComplete(String streamId) throws IOException {
               // IOException to the client. This may increase the chunk size however the increase is
               // still limited because of the limit on the number of IOExceptions for a
               // particular shuffle partition.
-            } catch (NullPointerException e) {
-              throw new RuntimeException(
-                String.format("The merged shuffle partition info for appId %s shuffleId %s "
-                  + "reduceId %s has been cleaned up", partitionInfo.appId,
-                  partitionInfo.shuffleId, partitionInfo.reduceId));

Review comment:
   As discussed offline, let us not do this; it is an inherently buggy approach.
   Let us go with the alternative we discussed:

   a) Remove direct field references and rely on accessor methods.
   b) In each getter, do the relevant checks to ensure non-null values (and throw an exception as relevant).
   c) Use `AtomicReference` for the file-related fields in the partition info, given they are mutated asynchronously.
   d) Ensure close/cleanup relies on an atomic `getAndSet` to avoid thread-safety issues. Note that we need to do this only in `AppShufflePartitionInfo`.
   e) We can also make `channel` and `dos` `final` in `MergeShuffleFile`, since `AppShufflePartitionInfo` will handle state transitions atomically.

   (e) is something we had not discussed. Thoughts, @zhouyejoe, @otterc?
   
   +CC @Ngone51 
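
A rough sketch of what (a) through (d) could look like, for illustration only. `PartitionInfoSketch`, `getDataFile`, and the single `Closeable` field are hypothetical stand-ins rather than the actual `AppShufflePartitionInfo` code, and `IllegalStateException` is an assumed choice of exception type.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for AppShufflePartitionInfo; not the actual Spark class.
final class PartitionInfoSketch {
  // Human-readable identity used in error messages (assumed for this sketch).
  private final String description;

  // (c) The resource is mutated asynchronously, so it lives in an AtomicReference.
  private final AtomicReference<Closeable> dataFile;

  PartitionInfoSketch(String description, Closeable dataFile) {
    this.description = description;
    this.dataFile = new AtomicReference<>(dataFile);
  }

  // (a) + (b) Callers use the getter, which checks for null and fails with a
  // descriptive exception instead of relying on a NullPointerException.
  Closeable getDataFile() {
    Closeable current = dataFile.get();
    if (current == null) {
      throw new IllegalStateException(String.format(
          "The merged shuffle partition info for %s has been cleaned up", description));
    }
    return current;
  }

  // (d) getAndSet(null) hands the resource to exactly one caller, so
  // concurrent cleanup attempts close it at most once.
  // Returns true only for the call that actually released the resource.
  boolean close() {
    Closeable previous = dataFile.getAndSet(null);
    if (previous == null) {
      return false;
    }
    try {
      previous.close();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return true;
  }
}
```

This only covers the null check and the single-close guarantee. A caller that has already obtained the resource from the getter can still race with a later `close()`, so any such window would need separate handling.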




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



