[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-12-08 Thread Seonggon Namgung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794801#comment-17794801
 ] 

Seonggon Namgung commented on HIVE-26986:
-

[~dkuzmenko], I simply created an OperatorGraph before and after applying 
ParallelEdgeFixer (PEF), and generated Graphviz files with the 
OperatorGraph.toDot() method.

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> ---
>
> Key: HIVE-26986
> URL: https://issues.apache.org/jira/browse/HIVE-26986
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Seonggon Namgung
>Assignee: Seonggon Namgung
>Priority: Major
>  Labels: pull-request-available
> Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as 
> a parallel edge.
> We observe this problem by comparing OperatorGraph and Tez DAG when running 
> TPC-DS query 71 on 1TB ORC format managed table.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-12-08 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794791#comment-17794791
 ] 

Denys Kuzmenko commented on HIVE-26986:
---

[~seonggon], could you please share how one could create the OperatorGraph for 
the query? Is it done by calling OperatorGraph.toDot() under a debugger?



[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-11-23 Thread Seonggon Namgung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789109#comment-17789109
 ] 

Seonggon Namgung commented on HIVE-26986:
-

[~dkuzmenko], we don't have a correctness issue with query71 because 
HIVE-27006 solves HIVE-26660. I'll create a new link from HIVE-26660 to 
HIVE-27006 and close HIVE-26660.



[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-11-23 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789079#comment-17789079
 ] 

Denys Kuzmenko commented on HIVE-26986:
---

[~seonggon], could you please clarify whether we have a correctness issue with 
query71? If not, let's close HIVE-26660 or remove the link from the current 
ticket since it's only about performance.



[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-11-15 Thread Seonggon Namgung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786224#comment-17786224
 ] 

Seonggon Namgung commented on HIVE-26986:
-

@kkasa

1. This issue is not about data correctness. It addresses the insertion of 
unnecessary ReduceSink (RS) operators, which causes unnecessary shuffles at 
runtime.

The unnecessary insertion is performed by ParallelEdgeFixer (PEF), which makes 
the wrong decision because OperatorGraph builds an incorrect DAG from the given 
query plan. My previous comment explains how OperatorGraph incorrectly groups 
operators into a vertex (a cluster, in OperatorGraph terms).

Since this issue originates from OperatorGraph, not from PEF or 
SharedWorkOptimizer (SWO), the submitted PR introduces TestOperatorGraph, which 
tests the behaviour of OperatorGraph. You can reproduce the problem by running 
this test on the master branch. The explanation below describes the added 
test.

 

The test compares 2 DAGs generated by OperatorGraph and TezCompiler. The 
following graph represents the query plan used in the test.
TS1┐
TS2┴UNION─SEL─RS─GBY─RS

The correct DAG corresponding to the query plan should be:
Map1: {TS1, SEL, RS1}
Map2: {TS2, SEL, RS1}
Reduce: {GBY, RS2}

However, the current OperatorGraph groups the operators into 2 clusters as 
follows:
Cluster1: {TS1, TS2, UNION, SEL, RS1}
Cluster2: {GBY, RS2}
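
For illustration only, the expected grouping listed above can be reproduced with a small traversal. This is a sketch under stated assumptions, not the logic TezCompiler actually uses: the class name, the string-based operator names, and the rule "start a vertex at every plan root and at every child of an RS, skip UNION, stop at the next RS" are all hypothetical simplifications chosen to match the example plan.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ExpectedVertexSketch {
    // {parent, child} edges of the example plan:
    // TS1, TS2 -> UNION -> SEL -> RS1 -> GBY -> RS2
    static final String[][] PLAN = {
        {"TS1", "UNION"}, {"TS2", "UNION"}, {"UNION", "SEL"},
        {"SEL", "RS1"}, {"RS1", "GBY"}, {"GBY", "RS2"}
    };

    // Assumption of this sketch: ReduceSink operators are named "RS...".
    static boolean isReduceSink(String op) {
        return op.startsWith("RS");
    }

    // Returns one operator set per vertex, keyed by the operator that starts it.
    static Map<String, TreeSet<String>> vertices(String[][] edges) {
        Set<String> hasParent = new HashSet<>();
        for (String[] e : edges) {
            hasParent.add(e[1]);
        }
        // A vertex starts at every plan root and at every child of a ReduceSink.
        List<String> starts = new ArrayList<>();
        for (String[] e : edges) {
            if (!hasParent.contains(e[0]) && !starts.contains(e[0])) {
                starts.add(e[0]);
            }
            if (isReduceSink(e[0]) && !starts.contains(e[1])) {
                starts.add(e[1]);
            }
        }
        Map<String, TreeSet<String>> result = new LinkedHashMap<>();
        for (String start : starts) {
            TreeSet<String> members = new TreeSet<>();
            Deque<String> todo = new ArrayDeque<>();
            todo.push(start);
            while (!todo.isEmpty()) {
                String op = todo.pop();
                if (!op.startsWith("UNION")) {
                    members.add(op); // UNION is treated as transparent
                }
                if (isReduceSink(op)) {
                    continue; // a ReduceSink closes the vertex
                }
                for (String[] e : edges) {
                    if (e[0].equals(op)) {
                        todo.push(e[1]);
                    }
                }
            }
            result.put(start, members);
        }
        return result;
    }

    public static void main(String[] args) {
        vertices(PLAN).forEach((start, ops) -> System.out.println(start + " -> " + ops));
    }
}
```

Under these assumptions the operators below the UNION (SEL, RS1) appear in both map-side vertices, which is the duplication shown in Map1 and Map2.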

 

2. As I mentioned above, this issue is unrelated to data correctness. Moreover, 
PEF is applied to a query plan regardless of the value of 
`hive.optimize.shared.work.parallel.edge.support`. I think the test attached to 
the PR is sufficient to verify this issue.

FYI, `hive.optimize.shared.work.parallel.edge.support` controls which types of 
edges are allowed to form a parallel edge. If it is set to true, 
DynamicPartitionPruning (DPP), SemiJoinReduction, and Broadcast edges can form 
a parallel edge. If not, only DPP edges can. As a consequence, SWO can create 
parallel edges regardless of the value of 
`hive.optimize.shared.work.parallel.edge.support`, so Hive always runs PEF 
after SWO in order to resolve parallel edges by adding extra RS operators.
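
The rule described above can be written down as a tiny predicate. This is only a restatement of the comment in code, not Hive's implementation: the class, the enum, and the method name are hypothetical.

```java
final class ParallelEdgePolicySketch {
    // Hypothetical edge categories, named after the ones mentioned in the comment.
    enum EdgeType { DPP, SEMIJOIN_REDUCTION, BROADCAST }

    // parallelEdgeSupport stands for the value of
    // hive.optimize.shared.work.parallel.edge.support.
    static boolean mayFormParallelEdge(EdgeType type, boolean parallelEdgeSupport) {
        if (type == EdgeType.DPP) {
            return true; // DPP edges may form a parallel edge in both modes
        }
        return parallelEdgeSupport; // the other types only when the flag is true
    }

    public static void main(String[] args) {
        for (boolean flag : new boolean[] {true, false}) {
            for (EdgeType type : EdgeType.values()) {
                System.out.println(flag + " " + type + " " + mayFormParallelEdge(type, flag));
            }
        }
    }
}
```

Because the predicate holds for DPP edges in both modes, parallel edges can appear whatever the flag is set to, which is why PEF still has to run.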

 

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> ---
>
> Key: HIVE-26986
> URL: https://issues.apache.org/jira/browse/HIVE-26986
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: Seonggon Namgung
>Assignee: Seonggon Namgung
>Priority: Major
>  Labels: hive-4.0.0-must, pull-request-available
> Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as 
> a parallel edge.
> We observe this problem by comparing OperatorGraph and Tez DAG when running 
> TPC-DS query 71 on 1TB ORC format managed table.





[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-11-10 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784956#comment-17784956
 ] 

Krisztian Kasa commented on HIVE-26986:
---

[~seonggon] 
1. It is not clear why adding an extra concentrator RS leads to a data 
correctness issue.
Could you please share a simple repro on a small dataset that contains only the 
necessary records? It could also be added to the PR to extend the test 
coverage of SWO and ParallelEdgeFixer.
2. IIUC, parallel edge support can be controlled via a config setting. Could you 
please verify whether the correctness issue still occurs when
{code:java}
set hive.optimize.shared.work.parallel.edge.support=false;
{code}



[jira] [Commented] (HIVE-26986) A DAG created by OperatorGraph is not equal to the Tez DAG.

2023-01-26 Thread Seonggon Namgung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17680933#comment-17680933
 ] 

Seonggon Namgung commented on HIVE-26986:
-

The attached image files ("Query71 TezDAG.png" and "Query71 OperatorGraph.png") 
show the Tez DAG and the OperatorGraph of TPC-DS query71.
I set tez.generate.debug.artifacts to get a dot file of the Tez DAG.
The OperatorGraph was created after ParallelEdgeFixer was applied.

The number of clusters in the OperatorGraph is 10, but the number of vertices 
in the Tez DAG is 12.
The difference comes from cluster 3 of the OperatorGraph, which contains 3 TS 
operators and a UNION operator.

The current OperatorGraph creates a singleton cluster for each operator and 
merges a parent operator's cluster into its child operator's cluster unless the 
parent is a ReduceSink operator.
As a result, there can be a cluster with multiple root operators, which cannot 
form a single vertex in the Tez DAG.
This mismatch between the Tez DAG and the OperatorGraph causes false positives 
when detecting parallel edges and leads to the insertion of unnecessary 
concentrator RS operators.
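
The merging rule described above can be sketched with a small union-find. This is not the actual OperatorGraph code: the class name, the string-based operators, the "RS" naming convention, and the two-scan UNION plan are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class OperatorClusterSketch {
    // {parent, child} edges of a hypothetical plan:
    // TS1, TS2 -> UNION -> SEL -> RS1 -> GBY -> RS2
    static final String[][] PLAN = {
        {"TS1", "UNION"}, {"TS2", "UNION"}, {"UNION", "SEL"},
        {"SEL", "RS1"}, {"RS1", "GBY"}, {"GBY", "RS2"}
    };

    // Union-find lookup; an unseen operator becomes its own singleton cluster.
    private static String find(Map<String, String> rep, String op) {
        rep.putIfAbsent(op, op);
        while (!rep.get(op).equals(op)) {
            op = rep.get(op);
        }
        return op;
    }

    // Merges the parent's cluster into the child's cluster unless the parent
    // is a ReduceSink (assumed here to be any operator named "RS...").
    static List<TreeSet<String>> cluster(String[][] edges) {
        Map<String, String> rep = new LinkedHashMap<>();
        for (String[] e : edges) {
            String parentRep = find(rep, e[0]);
            String childRep = find(rep, e[1]);
            if (!e[0].startsWith("RS") && !parentRep.equals(childRep)) {
                rep.put(parentRep, childRep);
            }
        }
        Map<String, TreeSet<String>> byRep = new LinkedHashMap<>();
        for (String op : new ArrayList<>(rep.keySet())) {
            byRep.computeIfAbsent(find(rep, op), k -> new TreeSet<>()).add(op);
        }
        return new ArrayList<>(byRep.values());
    }

    // Counts operators in the cluster that have no parent inside the same cluster.
    static int countRoots(Set<String> cluster, String[][] edges) {
        Set<String> hasParentInside = new HashSet<>();
        for (String[] e : edges) {
            if (cluster.contains(e[0]) && cluster.contains(e[1])) {
                hasParentInside.add(e[1]);
            }
        }
        return cluster.size() - hasParentInside.size();
    }

    public static void main(String[] args) {
        for (TreeSet<String> c : cluster(PLAN)) {
            System.out.println(c + " roots=" + countRoots(c, PLAN));
        }
    }
}
```

In this sketch both table scans end up in the same cluster as the UNION, so that cluster has two root operators, which is the situation that cannot map to a single Tez vertex.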

> A DAG created by OperatorGraph is not equal to the Tez DAG.
> ---
>
> Key: HIVE-26986
> URL: https://issues.apache.org/jira/browse/HIVE-26986
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0-alpha-2
>Reporter: Seonggon Namgung
>Assignee: Seonggon Namgung
>Priority: Major
> Attachments: Query71 OperatorGraph.png, Query71 TezDAG.png
>
>
> A DAG created by OperatorGraph is not equal to the corresponding DAG that is 
> submitted to Tez.
> Because of this problem, ParallelEdgeFixer reports a pair of normal edges as 
> a parallel edge.
> We observe this problem by comparing OperatorGraph and Tez DAG when running 
> TPC-DS query 71 on 1TB ORC format managed table.


