abstractdog commented on code in PR #202:
URL: https://github.com/apache/tez/pull/202#discussion_r1240764323


##########
tez-mapreduce/src/main/java/org/apache/tez/mapreduce/grouper/TezSplitGrouper.java:
##########
@@ -260,23 +260,20 @@ public List<GroupedSplitContainer> 
getGroupedSplits(Configuration conf,
         desiredNumSplits = newDesiredNumSplits;
       } else if (lengthPerGroup < minLengthPerGroup) {
         // splits too small to work. Need to override with size.
-        int newDesiredNumSplits = (int)(totalLength/minLengthPerGroup) + 1;
         /**
          * This is a workaround for systems like S3 that pass the same
          * fake hostname for all splits.
          */
         if (!allSplitsHaveLocalhost) {
+          int newDesiredNumSplits = (int)(totalLength/minLengthPerGroup) + 1;
+          LOG.info("Desired splits: " + desiredNumSplits + " too large. " +
+              " Desired splitLength: " + lengthPerGroup +
+              " Min splitLength: " + minLengthPerGroup +
+              " New desired splits: " + newDesiredNumSplits +
+              " Total length: " + totalLength +
+              " Original splits: " + originalSplits.size());
           desiredNumSplits = newDesiredNumSplits;
         }

Review Comment:
   what about adding an else branch here with some useful log message 
describing whatever happens (or doesn't happen) when allSplitsHaveLocalhost=true
   otherwise LGTM
   cc: @rbalamohan 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to