[
https://issues.apache.org/jira/browse/HIVE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191140#comment-14191140
]
Ashutosh Chauhan commented on HIVE-8667:
----------------------------------------
[~hagleitn] This seems to be an issue only on Tez. In the last run for
HIVE-8395
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1556/testReport/
this test failed only for MiniTezCliDriver and passed for CliDriver. That is
surprising, since the optimization itself is not aware of the execution engine,
but it is possible because on Tez we have a different set of physical
optimizations than on MR. However, I am not able to repro this. After enabling
hive.cbo.enable, the only diff that I get is:
{code}
diff --git a/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out b/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
index c45f0db..4228cec 100644
--- a/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
+++ b/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
@@ -1815,9 +1815,9 @@ STAGE PLANS:
value expressions: key (type: string)
auto parallelism: true
Path -> Alias:
- -mr-10003default.src{} [s2]
+ -mr-10002default.src{} [s2]
Path -> Partition:
- -mr-10003default.src{}
+ -mr-10002default.src{}
Partition
base file name: src
input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
@@ -1862,7 +1862,7 @@ STAGE PLANS:
name: default.src
name: default.src
Truncated Path -> Alias:
- -mr-10003default.src{} [s2]
+ -mr-10002default.src{} [s2]
Map 3
Map Operator Tree:
TableScan
@@ -1882,9 +1882,9 @@ STAGE PLANS:
value expressions: key (type: string)
auto parallelism: true
Path -> Alias:
- -mr-10002default.src{} [s1]
+ -mr-10003default.src{} [s1]
Path -> Partition:
- -mr-10002default.src{}
+ -mr-10003default.src{}
{code}
[~sershe] This optimizer doesn't care whether a join is present in the query or
not. It looks for a TS->FIL pattern in which the FIL is "where false". So the
final plan that you printed after CBO is exactly to its liking. In Hive, PPD
makes sure we get that pattern.
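To make the pattern concrete, here is a rough sketch of a TS->FIL check on a toy operator tree. This is not Hive's actual optimizer code: the Op class, its fields, and find_null_scans are hypothetical names made up for illustration. It only shows that the match depends on a TableScan followed directly by a constant-false filter, regardless of what sits below the filter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Op:
    """Toy operator node (hypothetical model, not a Hive class)."""
    kind: str                            # e.g. "TS", "FIL", "JOIN"
    predicate: Optional[str] = None      # only meaningful for FIL nodes
    children: List["Op"] = field(default_factory=list)


def find_null_scans(roots):
    """Return every TS root whose direct child is a FIL with predicate 'false'."""
    matches = []
    for ts in roots:
        if ts.kind != "TS":
            continue
        # The check never looks past the filter, so a join below it is irrelevant.
        if any(c.kind == "FIL" and c.predicate == "false" for c in ts.children):
            matches.append(ts)
    return matches


# Two scans feeding one join; only the first sits under a "where false" filter.
join = Op("JOIN")
s1 = Op("TS", children=[Op("FIL", predicate="false", children=[join])])
s2 = Op("TS", children=[Op("FIL", predicate="key > 10", children=[join])])
null_scans = find_null_scans([s1, s2])
```

In this sketch only s1 matches, even though both scans feed the same join.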
What's different on Tez is really intriguing. I will dig more.
> CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO
> -------------------------------------------------------------------------
>
> Key: HIVE-8667
> URL: https://issues.apache.org/jira/browse/HIVE-8667
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Fix For: 0.15.0
>
>
> Looks like some rewriting by CBO prevents nullscans from being optimized.