[ 
https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008327#comment-16008327
 ] 

Jason Lowe commented on TEZ-3334:
---------------------------------

Sorry for the delay.  I finally got some time to look at this.

h6. ContainerLauncherWrapper
Does it make sense to create an abstract class in tez-common that derives from 
ContainerLauncher and has an abstract dagComplete method?  Then we can have the 
existing launchers derive from that rather than ContainerLauncher directly and 
can RTTI check against that one class rather than maintain a specific list of 
container launchers.
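
A minimal sketch of what I have in mind.  All names here are hypothetical: 
ContainerLauncherStandIn stands in for the real ContainerLauncher (whose API is 
much larger), and DagAwareContainerLauncher is just a placeholder name for the 
proposed abstract class.

```java
import java.util.ArrayList;
import java.util.List;

class LauncherWrapperSketch {
    // Stand-in for Tez's ContainerLauncher; the real class is not reproduced here.
    static abstract class ContainerLauncherStandIn {
    }

    // Hypothetical intermediate base class that adds the DAG-completion hook.
    static abstract class DagAwareContainerLauncher extends ContainerLauncherStandIn {
        abstract void dagComplete(String dagId);
    }

    // Example launcher that derives from the new base class.
    static class RecordingLauncher extends DagAwareContainerLauncher {
        final List<String> completed = new ArrayList<>();

        @Override
        void dagComplete(String dagId) {
            completed.add(dagId);
        }
    }

    // Example launcher that does not know about DAG completion.
    static class LegacyLauncher extends ContainerLauncherStandIn {
    }

    // One instanceof check against the base class replaces a hand-maintained
    // list of concrete launcher types. Returns true if the hook was invoked.
    static boolean notifyDagComplete(ContainerLauncherStandIn launcher, String dagId) {
        if (launcher instanceof DagAwareContainerLauncher) {
            ((DagAwareContainerLauncher) launcher).dagComplete(dagId);
            return true;
        }
        return false;
    }
}
```

New launchers that want the callback only need to extend the base class; the 
wrapper itself does not change.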

h6. LocalContainerLauncher
I'm confused why we go through the trouble of serializing the hardcoded value 
of zero into the aux service protocol buffer, stuffing it into the env, then 
immediately fetching it back out and extracting the integer from the byte 
buffer.  Isn't this a complicated way to say shufflePort = 0?

tezDefaultComponentName only needs to be computed when cleanupDagDataOnComplete 
is true.  It may not be needed at all; see the related comment for 
DagDeleteRunnable below.

The reflection instantiation is invoking a constructor signature that isn't in 
the DeletionTracker abstract class?  Isn't that too much knowledge about the 
actual class being created?

h6. DagDeleteRunnable
tezDefaultComponentName is unused?  I think this transitively means pluginName 
is unused in DeletionTracker, which would simplify its constructor signature.

h6. DeletionTracker
Nit: the addNodeShufflePorts method name being plural implies more than one 
port can be added, but it only adds a single (node, port) pair.

h6. AMContainerHelpers
As Sidd mentioned before, we should avoid the redundant conf key lookup when 
creating each container launch context.

h6. ShuffleInputEventHandler and ShuffleInputEventHandlerOrderedGrouped
We could do a better job of leveraging the emptyPartitionsBitSet.  Currently we 
iterate it bit-by-bit.  Instead we could mask it with the desired bits to 
examine and iterate the result with nextSetBit.  This should be a lot faster if 
there are a lot of bits to iterate and we expect a significant number of the 
partitions to not be empty.  Can be postponed to a followup JIRA if desired.
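
Roughly something like the following.  This is only a sketch: the class and 
method names are made up, and it assumes a set bit means the partition is 
empty and that the caller wants the empty partitions in the range 
[start, start + count).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class EmptyPartitionScan {
    // Returns the indices of empty partitions within [start, start + count).
    static List<Integer> emptyPartitionsInRange(BitSet emptyPartitions, int start, int count) {
        // Build a mask covering only the partitions of interest.
        BitSet mask = new BitSet();
        mask.set(start, start + count);

        // Work on a copy so the caller's bit set is left untouched.
        BitSet masked = (BitSet) emptyPartitions.clone();
        masked.and(mask);

        // nextSetBit jumps straight to the next set bit, so clear bits
        // (non-empty partitions) are skipped a word at a time rather than
        // tested one by one.
        List<Integer> result = new ArrayList<>();
        for (int i = masked.nextSetBit(0); i >= 0; i = masked.nextSetBit(i + 1)) {
            result.add(i);
        }
        return result;
    }
}
```

The non-empty partitions are then whatever is left in the range, or the same 
loop can be written with nextClearBit if those are what the handler needs.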

h6. DagDeleteRunnable
Do we need to do any cleanup on the httpConnection?

h6. DeletionTrackerImpl
What if the submission to the executor throws RejectedExecutionException 
because the executor was already shut down and a late dagComplete was invoked?
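
One possible way to guard against it, as a sketch only.  The class and method 
names are hypothetical, and whether a late request should be dropped, logged, 
or handled some other way is a design decision for the patch.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

class LateDagCompleteSketch {
    // Tries to hand the cleanup task to the executor. Returns false instead of
    // propagating the exception when the executor no longer accepts work.
    static boolean submitCleanup(ExecutorService executor, Runnable cleanup) {
        try {
            executor.execute(cleanup);
            return true;
        } catch (RejectedExecutionException e) {
            // The executor was shut down before this late dagComplete arrived.
            // Real code would probably log this rather than ignore it silently.
            return false;
        }
    }
}
```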


> Tez Custom Shuffle Handler
> --------------------------
>
>                 Key: TEZ-3334
>                 URL: https://issues.apache.org/jira/browse/TEZ-3334
>             Project: Apache Tez
>          Issue Type: New Feature
>            Reporter: Jonathan Eagles
>         Attachments: TEZ-3334.1.patch, TEZ-3334.2.patch
>
>
> For conditions where auto-parallelism is reduced (e.g. TEZ-3222), a custom 
> shuffle handler could help reduce the number of fetches and could more 
> efficiently fetch data. In particular if a reducer is fetching 100 pieces 
> serially from the same mapper it could do this in one fetch call. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
