[jira] [Commented] (HIVE-8667) CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO

Ashutosh Chauhan (JIRA) Thu, 30 Oct 2014 18:03:43 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191140#comment-14191140
 ]


Ashutosh Chauhan commented on HIVE-8667:
----------------------------------------

 [~hagleitn] This seems to be an issue only on Tez. On the last run of 
HIVE-8395 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1556/testReport/
 this test failed only for MiniTezCliDriver and passed for CliDriver. This is 
surprising, since the optimization itself is not aware of the execution engine, 
but it is possible because on Tez we have a different set of physical 
optimizations than on MR. However, I am not able to reproduce it. After 
enabling hive.cbo.enable, the only diffs I get are:
{code}
diff --git a/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out b/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
index c45f0db..4228cec 100644
--- a/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
+++ b/ql/src/test/results/clientpositive/tez/optimize_nullscan.q.out
@@ -1815,9 +1815,9 @@ STAGE PLANS:
                       value expressions: key (type: string)
                       auto parallelism: true
             Path -> Alias:
-              -mr-10003default.src{} [s2]
+              -mr-10002default.src{} [s2]
             Path -> Partition:
-              -mr-10003default.src{} 
+              -mr-10002default.src{} 
                 Partition
                   base file name: src
                   input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
@@ -1862,7 +1862,7 @@ STAGE PLANS:
                     name: default.src
                   name: default.src
             Truncated Path -> Alias:
-              -mr-10003default.src{} [s2]
+              -mr-10002default.src{} [s2]
         Map 3 
             Map Operator Tree:
                 TableScan
@@ -1882,9 +1882,9 @@ STAGE PLANS:
                       value expressions: key (type: string)
                       auto parallelism: true
             Path -> Alias:
-              -mr-10002default.src{} [s1]
+              -mr-10003default.src{} [s1]
             Path -> Partition:
-              -mr-10002default.src{} 
+              -mr-10003default.src{} 
{code}

[~sershe] This optimizer doesn't care whether a join is present in the query or 
not. It looks for a TS->FIL pattern in which the FIL is "where false". So the 
final plan that you printed after CBO is exactly to its liking. In Hive, PPD 
makes sure we get that pattern.  
What's different on Tez is really intriguing. I will dig more.
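
To make the TS->FIL matching concrete, here is a rough sketch in Python. It is not Hive's actual code: the Op class, its fields, and find_null_scans are hypothetical names invented for illustration, and the predicate is modelled as a plain string. It only shows that the match looks at a TableScan and its immediate Filter child, regardless of what sits further down the plan (a join, for instance).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical operator node; not a Hive class."""
    kind: str                          # e.g. "TS", "FIL", "RS", "JOIN"
    alias: Optional[str] = None        # table alias, only set on TS nodes
    predicate: Optional[str] = None    # filter condition, only set on FIL nodes
    children: List["Op"] = field(default_factory=list)


def find_null_scans(roots: List[Op]) -> List[str]:
    """Return aliases of TableScans directly followed by a constant-false Filter."""
    matches = []
    seen = set()
    stack = list(roots)
    while stack:
        op = stack.pop()
        if id(op) in seen:             # a join is reachable from several scans
            continue
        seen.add(id(op))
        if op.kind == "TS" and any(
            c.kind == "FIL" and c.predicate == "false" for c in op.children
        ):
            matches.append(op.alias)
        stack.extend(op.children)
    return matches


# Two scans feeding one join; only s1 sits under a "where false" filter.
join = Op("JOIN")
s1 = Op("TS", alias="s1", children=[Op("FIL", predicate="false", children=[join])])
s2 = Op("TS", alias="s2", children=[Op("FIL", predicate="key > 10", children=[join])])
print(sorted(find_null_scans([s1, s2])))
```

In this toy plan the presence of the join makes no difference to the match. Only the scan whose direct child is a constant-false filter is reported.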

> CBO: optimize_nullscan - some TableScans for nullscans appeared after CBO
> -------------------------------------------------------------------------
>
>                 Key: HIVE-8667
>                 URL: https://issues.apache.org/jira/browse/HIVE-8667
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 0.15.0
>
>
> Looks like some rewriting by CBO prevents nullscans from being optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
