[ https://issues.apache.org/jira/browse/TEZ-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15529928#comment-15529928 ]
Kuhu Shukla commented on TEZ-3361:
----------------------------------

[~jeagles], Thank you for the latest patch!

{code}
UnorderedKVInput.java:
boolean compositeFetch = ShuffleUtils.isTezShuffleHandler(conf);
{code}
Config key for composite fetch should be part of the static confKeys Set.

{code}
Fetcher.java
/*
if (shouldRetry(srcAttemptId, ioe)) {
  //release mem/file handles
  cleanupFetchedInput(fetchedInput);
  throw new FetcherReadTimeoutException(ioe);
}
{code}
It would be nice if fetchInputs() error handling could retry all if we fail to fetch even one of the inputs. Similar addition for FetcherOrderedGrouped.

As we discussed offline:
{code}
if (header.getCompressedLength() == 0) {
  // Empty partitions are already accounted for
  continue;
}
{code}
Similar change is needed in OrderedGrouped case when partLength is zero.

One other follow up would be to remove empty partitions altogether from the shuffle header.

> Fetch Multiple Partitions from the Shuffle Handler
> --------------------------------------------------
>
>                 Key: TEZ-3361
>                 URL: https://issues.apache.org/jira/browse/TEZ-3361
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3361.1.patch, TEZ-3361.2.patch, TEZ-3361.3.patch
>
>
> Provide an API that allows for fetching multiple partitions at once from a
> single upstream task. This is to better support auto-reduce parallelism where
> a single downstream task is impersonating several (possibly?) consecutive
> downstream tasks.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
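
For reference, a minimal standalone sketch of the empty-partition skip discussed in the comment above. This is not the patch code: the PartitionHeader class, its fields, and the partitionsToRead() helper are hypothetical stand-ins for the per-partition shuffle header, and the sketch only assumes that a compressed length of zero marks an empty partition.

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyPartitionSkipSketch {

    /** Hypothetical stand-in for one per-partition entry of a shuffle header. */
    static final class PartitionHeader {
        final int partitionId;
        final long compressedLength;

        PartitionHeader(int partitionId, long compressedLength) {
            this.partitionId = partitionId;
            this.compressedLength = compressedLength;
        }
    }

    /** Returns the ids of the partitions that carry data and still need to be read. */
    static List<Integer> partitionsToRead(List<PartitionHeader> headers) {
        List<Integer> toRead = new ArrayList<>();
        for (PartitionHeader header : headers) {
            if (header.compressedLength == 0) {
                // Empty partitions are already accounted for
                continue;
            }
            toRead.add(header.partitionId);
        }
        return toRead;
    }

    public static void main(String[] args) {
        List<PartitionHeader> headers = new ArrayList<>();
        headers.add(new PartitionHeader(0, 128L));
        headers.add(new PartitionHeader(1, 0L)); // empty partition, skipped
        headers.add(new PartitionHeader(2, 64L));
        System.out.println(partitionsToRead(headers)); // prints [0, 2]
    }
}
```

The same zero-length check would presumably apply in the OrderedGrouped path, keyed on partLength rather than the compressed length.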