[ https://issues.apache.org/jira/browse/TEZ-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15372143#comment-15372143 ]
Siddharth Seth commented on TEZ-3334:
-------------------------------------

[~jeagles]. Don't think there's any other template at the moment. LLAP has a custom shuffle handler, which is essentially a stripped-down version of the MR shuffle handler but adds one more component to path processing (the dag id).

A new shuffle handler would ideally handle MR, Tez and LLAP (which currently hosts its own shuffle service, but will likely move it out soon).

Some of the issues which would be good to fix eventually:
* Multiple partition ranges in a single request (main intent of this jira per the description?)
* Handling UnsortedOutput - multiple files / files merged in a different manner.
* Improvements to broadcast shuffle handling.
* Better error reporting.

It would be really nice if we could build something out which potentially understands file formats beyond the current index files + merged IFiles.

> Tez Custom Shuffle Handler
> --------------------------
>
>                 Key: TEZ-3334
>                 URL: https://issues.apache.org/jira/browse/TEZ-3334
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>
> For conditions where auto-parallelism is reduced (e.g. TEZ-3222), a custom
> shuffle handler could help reduce the number of fetches and could more
> efficiently fetch data. In particular, if a reducer is fetching 100 pieces
> serially from the same mapper, it could do this in one fetch call.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
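
For illustration only, here is a minimal sketch of the "multiple partition ranges in a single request" idea from the comment above: the partition ids a reducer needs from one mapper are grouped into contiguous ranges, so they could in principle be asked for in one fetch call instead of one call per partition. The class name PartitionRangeSketch, its methods, and the "0-2,5-6,9" range string are all hypothetical; none of this is Tez, MR or LLAP code, and the issue does not specify a request format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tez: groups the partition ids a reducer
// needs from a single mapper into contiguous, inclusive ranges.
class PartitionRangeSketch {

    /** Returns inclusive [start, end] ranges covering the given partition ids. */
    static List<int[]> coalesce(int[] partitionIds) {
        int[] sorted = partitionIds.clone();
        Arrays.sort(sorted);
        List<int[]> ranges = new ArrayList<>();
        for (int id : sorted) {
            if (!ranges.isEmpty()) {
                int[] last = ranges.get(ranges.size() - 1);
                if (id <= last[1] + 1) {
                    // Duplicate or adjacent id: extend the current range.
                    last[1] = Math.max(last[1], id);
                    continue;
                }
            }
            // Gap found: start a new single-element range.
            ranges.add(new int[] {id, id});
        }
        return ranges;
    }

    /** Renders ranges as e.g. "0-2,5-6,9". The format is made up for this sketch. */
    static String format(List<int[]> ranges) {
        StringBuilder sb = new StringBuilder();
        for (int[] r : ranges) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(r[0]);
            if (r[1] != r[0]) {
                sb.append('-').append(r[1]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] wanted = {5, 0, 1, 2, 6, 9};
        System.out.println(format(coalesce(wanted))); // prints 0-2,5-6,9
    }
}
```

Under these assumptions, six per-partition fetches collapse into three ranges that could travel in one request. How a real handler would encode such ranges in the request path, next to the dag id component mentioned above, is left open by the issue.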