[ https://issues.apache.org/jira/browse/HIVE-20113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17904833#comment-17904833 ]
Sungwoo Park commented on HIVE-20113:
-------------------------------------
For the record, with tez.runtime.pipelined-shuffle.enabled=false and
tez.runtime.enable.final-merge.in.output=true, every task is guaranteed to
produce a single output file in the end, so we can revert this commit and
take advantage of one-to-one edges again.
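
A minimal sketch of the two settings above as Hive session-level overrides. It assumes they are issued from Beeline or the Hive CLI before the query runs and that the session passes them through to the Tez runtime; setting them in tez-site.xml would be an alternative.

```sql
-- Turn off pipelined shuffle for this session.
set tez.runtime.pipelined-shuffle.enabled=false;
-- Keep the final merge in the output, so each task ends with a single output file.
set tez.runtime.enable.final-merge.in.output=true;
```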
> Shuffle avoidance: Disable 1-1 edges for sorted shuffle
> --------------------------------------------------------
>
> Key: HIVE-20113
> URL: https://issues.apache.org/jira/browse/HIVE-20113
> Project: Hive
> Issue Type: Bug
> Components: Tez
> Reporter: Gopal Vijayaraghavan
> Assignee: Vineet Garg
> Priority: Major
> Labels: Branch3Candidate
> Fix For: 4.0.0-alpha-1
>
> Attachments: HIVE-20113.1.patch, HIVE-20113.10.patch,
> HIVE-20113.10.patch, HIVE-20113.2.patch, HIVE-20113.3.patch,
> HIVE-20113.4.patch, HIVE-20113.4.patch, HIVE-20113.5.patch,
> HIVE-20113.6.patch, HIVE-20113.7.patch, HIVE-20113.8.patch, HIVE-20113.9.patch
>
>
> The sorted shuffle avoidance can have some issues when the shuffle data gets
> broken up into multiple chunks on disk.
> The 1-1 edge cannot skip the Tez final merge. There is no reason for 1-1 to
> have a final merge at all; it should open a single compressed file and write
> a single index entry.
> Until the shuffle issue is resolved and much more testing has been done, it is
> prudent to disable the optimization for sorted shuffle edges and stop rewriting
> RS(sorted) = = = RS(sorted) into RS(sorted) = = = RS(FORWARD).
--
This message was sent by Atlassian Jira
(v8.20.10#820010)