[ https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008327#comment-16008327 ]
Jason Lowe commented on TEZ-3334:
---------------------------------

Sorry for the delay. I finally got some time to look at this.

h6. ContainerLauncherWrapper
Does it make sense to create an abstract class in tez-common that derives from ContainerLauncher and has an abstract dagComplete method? Then the existing launchers can derive from that rather than from ContainerLauncher directly, and we can RTTI check against that one class rather than maintain a specific list of container launchers.

h6. LocalContainerLauncher
I'm confused why we go through the trouble of serializing the hardcoded value of zero into the aux service protocol buffer, stuffing it into the env, then immediately fetching it back out and extracting the integer from the byte buffer. Isn't this a complicated way to say shufflePort = 0?

tezDefaultComponentName only needs to be computed when cleanupDagDataOnComplete is true. Actually it may not be needed at all, see the related comment for DagDeleteRunnable below.

The reflection instantiation is invoking a constructor signature that isn't in the DeletionTracker abstract class? Isn't that too much knowledge about the actual class being created?

h6. DagDeleteRunnable
tezDefaultComponentName is unused? I think this transitively means pluginName is unused in DeletionTracker, which would simplify its constructor signature.

h6. DeletionTracker
Nit: the plural addNodeShufflePorts method name implies more than one port can be added, but it only adds a single (node, port) pair.

h6. AMContainerHelpers
As Sidd mentioned before, we should avoid the redundant conf key lookup when creating each container launch context.

h6. ShuffleInputEventHandler and ShuffleInputEventHandlerOrderedGrouped
We could do a better job of leveraging the emptyPartitionsBitSet. Currently we iterate it bit-by-bit. Instead we could mask it with the desired bits to examine and iterate the result with nextSetBit.
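A minimal sketch of the mask-then-nextSetBit idea, using only java.util.BitSet. The class name EmptyPartitionScan, the method emptyPartitionsInRange, and its start/count parameters are hypothetical and not taken from the patch; the actual handlers may need a different range or result type.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of the TEZ-3334 patch.
class EmptyPartitionScan {

  /** Returns the indices of set bits (empty partitions) within [start, start + count). */
  static List<Integer> emptyPartitionsInRange(BitSet emptyPartitions, int start, int count) {
    // Build a mask covering only the partitions we care about.
    BitSet mask = new BitSet(start + count);
    mask.set(start, start + count);

    // Work on a copy so the caller's bit set is left untouched.
    BitSet masked = (BitSet) emptyPartitions.clone();
    masked.and(mask);

    // nextSetBit jumps straight to the next set bit instead of testing every index.
    List<Integer> result = new ArrayList<>();
    for (int i = masked.nextSetBit(0); i >= 0; i = masked.nextSetBit(i + 1)) {
      result.add(i);
    }
    return result;
  }

  public static void main(String[] args) {
    BitSet empty = new BitSet();
    empty.set(1);
    empty.set(4);
    empty.set(5);
    empty.set(9);
    // Partitions 3..6 are examined; only 4 and 5 are marked empty.
    System.out.println(emptyPartitionsInRange(empty, 3, 4)); // prints [4, 5]
  }
}
```

The loop body runs once per set bit in the range rather than once per partition, which is where the saving would come from.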
This should be a lot faster if there are a lot of bits to iterate and we expect a significant number of the partitions to not be empty. This can be postponed to a followup JIRA if desired.

h6. DagDeleteRunnable
Do we need to do any cleanup on the httpConnection?

h6. DeletionTrackerImpl
What if the submission to the executor throws RejectedExecutionException because the executor was already shut down and a late dagComplete was invoked?

> Tez Custom Shuffle Handler
> --------------------------
>
> Key: TEZ-3334
> URL: https://issues.apache.org/jira/browse/TEZ-3334
> Project: Apache Tez
> Issue Type: New Feature
> Reporter: Jonathan Eagles
> Attachments: TEZ-3334.1.patch, TEZ-3334.2.patch
>
>
> For conditions where auto-parallelism is reduced (e.g. TEZ-3222), a custom
> shuffle handler could help reduce the number of fetches and could more
> efficiently fetch data. In particular if a reducer is fetching 100 pieces
> serially from the same mapper it could do this in one fetch call.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)