scarlin-cloudera commented on code in PR #4783:
URL: https://github.com/apache/hive/pull/4783#discussion_r1355474928


##########
ql/src/test/results/clientpositive/llap/multi_insert_gby5.q.out:
##########
@@ -0,0 +1,250 @@
+PREHOOK: query: CREATE TABLE tbl1 (key int, f1 int)
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@tbl1
+POSTHOOK: query: CREATE TABLE tbl1 (key int, f1 int)
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@tbl1
+PREHOOK: query: CREATE TABLE tbl2 (f1 int) PARTITIONED BY (key int)
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@tbl2
+POSTHOOK: query: CREATE TABLE tbl2 (f1 int) PARTITIONED BY (key int)
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@tbl2
+PREHOOK: query: EXPLAIN FROM (SELECT key, f1 FROM tbl1 WHERE key=5) a
+INSERT OVERWRITE TABLE tbl2 PARTITION(key=5)
+SELECT f1 WHERE key > 0 GROUP BY f1
+INSERT OVERWRITE TABLE tbl2 partition(key=6)
+SELECT f1 WHERE key > 0 GROUP BY f1
+PREHOOK: type: QUERY
+PREHOOK: Input: default@tbl1
+PREHOOK: Output: default@tbl2@key=5
+PREHOOK: Output: default@tbl2@key=6
+POSTHOOK: query: EXPLAIN FROM (SELECT key, f1 FROM tbl1 WHERE key=5) a
+INSERT OVERWRITE TABLE tbl2 PARTITION(key=5)
+SELECT f1 WHERE key > 0 GROUP BY f1
+INSERT OVERWRITE TABLE tbl2 partition(key=6)
+SELECT f1 WHERE key > 0 GROUP BY f1
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@tbl1
+POSTHOOK: Output: default@tbl2@key=5
+POSTHOOK: Output: default@tbl2@key=6
+STAGE DEPENDENCIES:
+  Stage-2 is a root stage
+  Stage-3 depends on stages: Stage-2
+  Stage-0 depends on stages: Stage-3
+  Stage-4 depends on stages: Stage-0
+  Stage-1 depends on stages: Stage-3
+  Stage-5 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-2
+    Tez
+#### A masked pattern was here ####
+      Edges:
+        Reducer 2 <- Map 1 (SIMPLE_EDGE)
+        Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
+        Reducer 4 <- Reducer 2 (SIMPLE_EDGE)
+#### A masked pattern was here ####
+      Vertices:
+        Map 1 
+            Map Operator Tree:
+                TableScan
+                  alias: tbl1
+                  filterExpr: (key = 5) (type: boolean)
+                  Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+                  Filter Operator
+                    predicate: (key = 5) (type: boolean)
+                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+                    Select Operator
+                      expressions: f1 (type: int)
+                      outputColumnNames: _col1
+                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+                      Filter Operator
+                        predicate: (5 > 0) (type: boolean)
+                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+                        Reduce Output Operator
+                          key expressions: _col1 (type: int)
+                          null sort order: z
+                          sort order: +
+                          Map-reduce partition columns: _col1 (type: int)
+                          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+                          value expressions: 5 (type: int)
+            Execution mode: vectorized, llap
+            LLAP IO: all inputs
+        Reducer 2 
+            Execution mode: llap
+            Reduce Operator Tree:
+              Forward
+                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
+                Filter Operator
+                  predicate: (VALUE._col0 > 0) (type: boolean)

Review Comment:
   The same comment as above applies here: CBO is not applied to the INSERT
OVERWRITE portion.  You're right that this filter should be stripped out, but
that would entail a bigger fix.
   
   In fact, I have not been able to come up with an example that truly makes
sense from an end-user point of view.  The bug only occurs when constant
folding happens in the CBO portion and a filter is applied on the same column
in the INSERT OVERWRITE portion.  From a technical point of view, the query is
correct.  From a practical point of view, the query doesn't make sense.
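
To make the trigger condition concrete: in the plan above, the map-side filter reads `(5 > 0)` while the reducer-side filter still reads `(VALUE._col0 > 0)`. The snippet below is only a toy sketch of what "constant folding on a pinned column" means. It is not Hive or Calcite code, and `fold_comparison` and `bindings` are hypothetical names made up for illustration.

```python
# Toy illustration only: this is NOT Hive's or Calcite's optimizer code.
# The names fold_comparison and bindings are hypothetical.
import operator

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def fold_comparison(column, op, literal, bindings):
    """Fold `column op literal` when an upstream equality filter pins the column.

    `bindings` maps a column name to the constant it is known to equal,
    e.g. {"key": 5} after a `WHERE key = 5` filter.
    Returns a bool when the predicate can be decided at compile time,
    otherwise returns the predicate unchanged as a tuple.
    """
    if column in bindings:
        return OPS[op](bindings[column], literal)
    return (column, op, literal)


# The subquery's `WHERE key = 5` pins `key` to the constant 5.
bindings = {"key": 5}

# The insert branches filter on the same column: `key > 0` becomes `5 > 0`.
folded = fold_comparison("key", ">", 0, bindings)

# A column that is not pinned cannot be folded and is left as-is.
unfolded = fold_comparison("f1", ">", 0, bindings)
```

In this sketch the filter on `key` is decided at compile time, which is why it adds nothing from the user's point of view once `key=5` has been applied.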
   
   A customer did hit this issue, and it turned out their query was incorrect.
They did request that this be fixed.  I'm almost tempted to let the bug exist,
because if the query isn't something the end user would want, perhaps they
should get the failure at compile time rather than at runtime?  But the purist
in me says that a correct query should go through, which is why I think we
should still fix this.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

