[ 
https://issues.apache.org/jira/browse/TEZ-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15529928#comment-15529928
 ] 

Kuhu Shukla commented on TEZ-3361:
----------------------------------

[~jeagles], Thank you for the latest patch!
{code}
UnorderedKVInput.java:
boolean compositeFetch = ShuffleUtils.isTezShuffleHandler(conf);
{code}
The config key for composite fetch should be part of the static confKeys Set.
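
To illustrate the point, here is a minimal sketch. It assumes the input class keeps a static set of the configuration keys it reads, and that only keys in that set are guaranteed to be carried into the input's configuration. The class name, the key strings and the accessor body are hypothetical stand-ins, not the actual Tez constants.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for an input class that whitelists its config keys.
class ConfKeysSketch {
  // Hypothetical key name, used only for this example.
  static final String COMPOSITE_FETCH_KEY = "example.shuffle.composite.fetch";

  private static final Set<String> confKeys = new HashSet<>();

  static {
    // Placeholder for the keys the input already registers.
    confKeys.add("example.shuffle.buffer.size");
    // The addition suggested in the review: register the composite fetch key too.
    confKeys.add(COMPOSITE_FETCH_KEY);
  }

  // Read-only view of the registered keys.
  static Set<String> getConfigurationKeySet() {
    return Collections.unmodifiableSet(confKeys);
  }
}
```

With the key registered, a lookup such as the one in UnorderedKVInput would see the configured value rather than silently falling back to a default.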
{code}
Fetcher.java:
      if (shouldRetry(srcAttemptId, ioe)) {
        //release mem/file handles
        cleanupFetchedInput(fetchedInput);
        throw new FetcherReadTimeoutException(ioe);
      }
{code}
It would be nice if the error handling in fetchInputs() retried all of the 
inputs when fetching even one of them fails. FetcherOrderedGrouped needs a 
similar addition.
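
A rough sketch of the retry-all behaviour I have in mind, under stated assumptions: RetryAllSketch, InputFetcher and fetchAll are hypothetical names for illustration, not part of the patch, and inputs are represented as plain strings.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class RetryAllSketch {
  /** Hypothetical callback that fetches one input and may fail. */
  interface InputFetcher {
    void fetch(String inputId) throws IOException;
  }

  /**
   * Fetches the inputs in order. On the first failure, returns every input
   * of the batch so the caller can retry them all. Returns an empty list
   * when every fetch succeeds.
   */
  static List<String> fetchAll(List<String> inputIds, InputFetcher fetcher) {
    for (String id : inputIds) {
      try {
        fetcher.fetch(id);
      } catch (IOException ioe) {
        // A single failed input marks the whole batch for retry.
        return new ArrayList<>(inputIds);
      }
    }
    return new ArrayList<>();
  }
}
```

In the real fetcher, any memory or file handles held by the inputs that were already fetched would also need to be released before retrying, as cleanupFetchedInput() does for a single input.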

As we discussed offline:
{code}
if (header.getCompressedLength() == 0) {
  // Empty partitions are already accounted for
  continue;
}
{code}
A similar change is needed in the OrderedGrouped case when partLength is zero.
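
For the OrderedGrouped side, the check could look roughly like the sketch below. The Header class and partitionsToRead() are hypothetical and only mirror the shape of the loop; the field name partLength is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.List;

class EmptyPartitionSketch {
  /** Hypothetical stand-in for a per-partition shuffle header. */
  static final class Header {
    final int partition;
    final long partLength;

    Header(int partition, long partLength) {
      this.partition = partition;
      this.partLength = partLength;
    }
  }

  /** Returns the partitions that actually carry data and need to be read. */
  static List<Integer> partitionsToRead(List<Header> headers) {
    List<Integer> toRead = new ArrayList<>();
    for (Header header : headers) {
      if (header.partLength == 0) {
        // Empty partitions are already accounted for, so there is nothing to read.
        continue;
      }
      toRead.add(header.partition);
    }
    return toRead;
  }
}
```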

One other follow-up would be to remove empty partitions from the shuffle 
header altogether.


> Fetch Multiple Partitions from the Shuffle Handler
> --------------------------------------------------
>
>                 Key: TEZ-3361
>                 URL: https://issues.apache.org/jira/browse/TEZ-3361
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3361.1.patch, TEZ-3361.2.patch, TEZ-3361.3.patch
>
>
> Provide an API that allows for fetching multiple partitions at once from a 
> single upstream task. This is to better support auto-reduce parallelism where 
> a single downstream task is impersonating several (possibly?) consecutive 
> downstream tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
