[ 
https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372143#comment-15372143
 ] 

Siddharth Seth commented on TEZ-3334:
-------------------------------------

[~jeagles], I don't think there's any other template at the moment.
LLAP has a custom shuffle handler, which is essentially a stripped-down 
version of the MR shuffle handler with one more component added to path 
processing (the dag id).
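
As a rough illustration only of what "one more path component" means: the 
directory layout, class name and method names below are hypothetical and are 
not taken from the MR or LLAP code. The sketch only shows that the dag id adds 
one extra level, so that outputs of different DAGs in the same application do 
not collide.

```java
// Hypothetical sketch: neither layout is the real MR or LLAP directory structure.
final class ShufflePathSketch {

    // Assumed MR-style layout: <appId>/output/<attemptId>
    static String mrStylePath(String appId, String attemptId) {
        return String.join("/", appId, "output", attemptId);
    }

    // Same layout with one extra component for the dag id:
    // <appId>/<dagId>/output/<attemptId>
    static String withDagId(String appId, int dagId, String attemptId) {
        if (dagId < 0) {
            throw new IllegalArgumentException("dagId must be non-negative: " + dagId);
        }
        return String.join("/", appId, Integer.toString(dagId), "output", attemptId);
    }
}
```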

A new shuffle handler would ideally handle MR, Tez and LLAP (which currently 
hosts its own shuffle service, but will likely move it out soon). Some of the 
issues that would be good to fix eventually:
- Multiple partition ranges in a single request (the main intent of this jira, 
  per the description?)
- Handling UnsortedOutput: multiple files / files merged in a different manner.
- Improvements to broadcast shuffle handling.
- Better error reporting.
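
To make the "multiple partition ranges in a single request" item a bit more 
concrete, here is a minimal sketch of how a handler might parse such a request. 
The range syntax ("3-7,10,12-14") and all class and method names are 
assumptions made up for illustration; nothing like this is defined by the 
existing shuffle handlers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the "3-7,10,12-14" syntax is an assumption, not an existing protocol.
final class PartitionRangeSketch {

    // An inclusive range of partition numbers.
    static final class Range {
        final int first;
        final int last;

        Range(int first, int last) {
            this.first = first;
            this.last = last;
        }
    }

    // Parses a comma-separated list of single partitions ("10") or
    // inclusive ranges ("3-7") into Range objects.
    static List<Range> parse(String spec) {
        List<Range> ranges = new ArrayList<>();
        for (String part : spec.split(",")) {
            String p = part.trim();
            int dash = p.indexOf('-');
            int first = Integer.parseInt(dash < 0 ? p : p.substring(0, dash));
            int last = dash < 0 ? first : Integer.parseInt(p.substring(dash + 1));
            if (first < 0 || last < first) {
                throw new IllegalArgumentException("Bad partition range: " + p);
            }
            ranges.add(new Range(first, last));
        }
        return ranges;
    }

    // Total number of partitions covered by one request.
    static int partitionCount(List<Range> ranges) {
        int total = 0;
        for (Range r : ranges) {
            total += r.last - r.first + 1;
        }
        return total;
    }
}
```

With something along these lines, a reducer that needs partitions 3 to 7, 10 
and 12 to 14 from one mapper could ask for all of them in one call instead of 
nine separate fetches.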

It would be really nice if we could build something that can potentially 
understand file formats beyond the current index files + merged IFiles.

> Tez Custom Shuffle Handler
> --------------------------
>
>                 Key: TEZ-3334
>                 URL: https://issues.apache.org/jira/browse/TEZ-3334
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>
> For conditions where auto-parallelism is reduced (e.g. TEZ-3222), a custom 
> shuffle handler could help reduce the number of fetches and could more 
> efficiently fetch data. In particular if a reducer is fetching 100 pieces 
> serially from the same mapper it could do this in one fetch call. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
