[
https://issues.apache.org/jira/browse/TEZ-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053215#comment-14053215
]
Bikas Saha commented on TEZ-1257:
---------------------------------
I am wondering whether OnFileUnorderedKVOutput and ShuffledMergedInput are a
correct/efficient combination. If the output is not sorted, what happens in
the merge phase of the input, where it tries to sort-merge input chunks that
are expected to be sorted? Should we be using ShuffledUnorderedKVInput when we
use OnFileUnorderedKVOutput?
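
To make the concern concrete, here is a minimal, self-contained sketch of a generic k-way merge. It is not the Tez merger; the class and method names (MergeUnsortedSketch, kWayMerge, isSorted) are hypothetical and only illustrate that a merge which assumes sorted runs does not sort its output when the runs arrive unsorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class MergeUnsortedSketch {

    // Generic k-way merge (illustrative only, not Tez code).
    // The result is sorted only if every input run is already sorted.
    static List<Integer> kWayMerge(List<List<Integer>> runs) {
        // Each heap entry is {value, runIndex, positionInRun}, ordered by value.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            merged.add(head[0]);
            // Advance within the run that supplied the smallest head element.
            List<Integer> run = runs.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                heap.add(new int[] {run.get(next), head[1], next});
            }
        }
        return merged;
    }

    // Returns true if the values are in non-decreasing order.
    static boolean isSorted(List<Integer> values) {
        for (int i = 1; i < values.size(); i++) {
            if (values.get(i - 1) > values.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Integer>> sortedRuns =
            Arrays.asList(Arrays.asList(1, 4, 7), Arrays.asList(2, 3, 9));
        List<List<Integer>> unsortedRuns =
            Arrays.asList(Arrays.asList(7, 1, 4), Arrays.asList(3, 9, 2));
        System.out.println("sorted runs   -> " + kWayMerge(sortedRuns));
        System.out.println("unsorted runs -> " + kWayMerge(unsortedRuns));
    }
}
```

With unsorted runs the merge still emits every record, but the overall order is not guaranteed, so any downstream logic that relies on sorted or grouped keys could misbehave.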
> Error on empty partition when using OnFileUnorderedKVOutput and
> ShuffledMergedInput
> -----------------------------------------------------------------------------------
>
> Key: TEZ-1257
> URL: https://issues.apache.org/jira/browse/TEZ-1257
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
>
> Encountering the following exception:
> {code}
> org.apache.tez.dag.api.TezUncheckedException: Path component must start with: attempt InputAttemptIdentifier [inputIdentifier=InputIdentifier [inputIndex=0], attemptNumber=0, pathComponent=]
>   at org.apache.tez.runtime.library.common.InputAttemptIdentifier.<init>(InputAttemptIdentifier.java:45)
>   at org.apache.tez.runtime.library.common.InputAttemptIdentifier.<init>(InputAttemptIdentifier.java:51)
>   at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.processDataMovementEvent(ShuffleInputEventHandler.java:81)
>   at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvent(ShuffleInputEventHandler.java:66)
>   at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvents(ShuffleInputEventHandler.java:59)
> {code}
> This is because UnorderedPartitionedKVWriter does not set the pathComponent
> when all partitions are empty:
> {code}
> if (emptyPartitions.cardinality() != numPartitions) {
>   // Populate payload only if at least 1 partition has data
>   payloadBuidler.setHost(host);
>   payloadBuidler.setPort(shufflePort);
>   payloadBuidler.setPathComponent(outputContext.getUniqueIdentifier());
> }
> {code}
> The combination of OnFileUnorderedKVOutput and ShuffledMergedInput works fine
> when there are no empty partitions.
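
The failure described above can be modeled with a small, self-contained sketch. Everything here is a hypothetical stand-in, not the Tez API: Payload, buildPayload, and requireAttemptPrefix are made-up names, and the sketch assumes an unset string field reads back as an empty string. It shows only how a producer that skips the path component for all-empty output trips a consumer that requires the "attempt" prefix.

```java
import java.util.BitSet;

public class EmptyPartitionSketch {

    // Hypothetical stand-in for the event payload.
    // Assumption: an unset string field reads back as "".
    static final class Payload {
        String pathComponent = "";
    }

    // Mirrors the quoted producer logic: the path component is set
    // only if at least one partition has data.
    static Payload buildPayload(BitSet emptyPartitions, int numPartitions, String uniqueId) {
        Payload payload = new Payload();
        if (emptyPartitions.cardinality() != numPartitions) {
            payload.pathComponent = uniqueId;
        }
        return payload;
    }

    // Hypothetical consumer-side check, modeled on the message in the stack trace.
    static String requireAttemptPrefix(String pathComponent) {
        if (!pathComponent.startsWith("attempt")) {
            throw new IllegalArgumentException(
                "Path component must start with: attempt, got '" + pathComponent + "'");
        }
        return pathComponent;
    }

    public static void main(String[] args) {
        BitSet allEmpty = new BitSet(2);
        allEmpty.set(0, 2); // both partitions are empty
        Payload payload = buildPayload(allEmpty, 2, "attempt_0001");
        try {
            requireAttemptPrefix(payload.pathComponent);
            System.out.println("accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model the consumer would have to either tolerate an empty path component when no partition has data, or the producer would have to set it unconditionally; which of these is appropriate for Tez is not established here.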