[ https://issues.apache.org/jira/browse/HIVE-21170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113945#comment-17113945 ]
zhaolong commented on HIVE-21170: --------------------------------- hello, can you share a document about how to deploy llap on hive 3.1.0 > Wrong (no) results of cross-product query executed on LLAP > ----------------------------------------------------------- > > Key: HIVE-21170 > URL: https://issues.apache.org/jira/browse/HIVE-21170 > Project: Hive > Issue Type: Bug > Affects Versions: 3.1.0 > Environment: Hive distribution: HDP 3.1.0. > LLAP execution engine in version: > {code:java} > $ beeline --version > Hive 3.1.0.3.1.0.0-78 > Git > git://ctr-e138-1518143905142-586755-01-000015.hwx.site/grid/0/jenkins/workspace/HDP-parallel-centos7/SOURCES/hive > -r 56673b027117d8cb3400675b1680a4d992360808 > Compiled by jenkins on Thu Dec 6 12:27:21 UTC 2018 > From source with checksum 97cc61f6acbe68b1fa988aa9f76b34cc{code} > > Reporter: Krzysztof Zarzycki > Assignee: Ivan Suller > Priority: Critical > > *In the environment:* > {code:java} > $ beeline --version > Hive 3.1.0.3.1.0.0-78 > Git > git://ctr-e138-1518143905142-586755-01-000015.hwx.site/grid/0/jenkins/workspace/HDP-parallel-centos7/SOURCES/hive > -r 56673b027117d8cb3400675b1680a4d992360808 > Compiled by jenkins on Thu Dec 6 12:27:21 UTC 2018 > From source with checksum 97cc61f6acbe68b1fa988aa9f76b34cc > {code} > *On LLAP execution engine*, the following query gives *wrong results*: > {code:java} > -- prepare test data > set hive.query.results.cache.enabled=false; > create table test1 (id int); > insert into test1 values (1),(2),(3); > -- query > select * from test1 t1 cross join test1 t2; > -- or any other cross join query, that involve 1 or more tables.{code} > *Query result:* > {code:java} > 0: jdbc:hive2://hostname:> select t1.* from test1 t1 cross join test1 t2; > INFO : Compiling > command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6): > select t1.* from test1 t1 cross join test1 t2 > INFO : Warning: Shuffle Join MERGEJOIN[9][tables = [$hdt$_0, $hdt$_1]] in > Stage 'Reducer 2' is a cross product > INFO : Semantic Analysis Completed (retrial = false) > INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:t1.id, > type:int, comment:null)], properties:null) > INFO : Completed compiling > command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6); > Time taken: 0.229 seconds > INFO : Executing > command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6): > select t1.* from test1 t1 cross join test1 t2 > INFO : Query ID = hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6 > INFO : Total jobs = 1 > INFO : Launching Job 1 out of 1 > INFO : Starting task [Stage-1:MAPRED] in parallel > > ---------------------------------------------------------------------------------------------- > VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED > > ---------------------------------------------------------------------------------------------- > Map 1 .......... llap SUCCEEDED 1 1 0 0 0 0 > Map 3 .......... llap SUCCEEDED 1 1 0 0 0 0 > Reducer 2 llap SUCCEEDED 0 0 0 0 0 0 > > ---------------------------------------------------------------------------------------------- > VERTICES: 02/03 [==========================>>] 100% ELAPSED TIME: 1.92 s > > ---------------------------------------------------------------------------------------------- > INFO : Completed executing > command(queryId=hive_20190125215942_7df8062c-8511-4915-a0d9-5e7ac84030f6); > Time taken: 2.006 seconds > INFO : OK > +--------+ > |t1.id| > +--------+ > +--------+ > No rows selected (2.284 seconds) > {code} > *Expected result:* > {code:java} > +--------+ > |t1.id| > +--------+ > |3| > |3| > |3| > |2| > |2| > |2| > |1| > |1| > |1| > +--------+ > 9 rows selected > {code} > *What worked as a workaround:* > {code:java} > set hive.tez.cartesian-product.enabled=false; (default true) > {code} > Then query gave a correct result. > *Difference in execution plans :* > * With {{set hive.tez.cartesian-product.enabled=true;}}: > {code:java} > +----------------------------------------------------+ > |Explain| > +----------------------------------------------------+ > |Plan optimized by CBO.| > | | > |Vertex dependency in root stage| > |Reducer 2 <- Map 1 (XPROD_EDGE), Map 3 (XPROD_EDGE)| > | | > |Stage-0| > |Fetch Operator| > |limit:-1| > |Stage-1| > |Reducer 2 llap| > |File Output Operator [FS_8]| > |Merge Join Operator [MERGEJOIN_9] (rows=9 width=8)| > |Conds:(Inner),Output:["_col0","_col1"]| > |<-Map 1 [XPROD_EDGE] vectorized, llap| > |XPROD_EDGE [RS_11]| > |Select Operator [SEL_10] (rows=3 width=4)| > |Output:["_col0"]| > |TableScan [TS_0] (rows=3 width=4)| > |zarz@test1,t1, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"]| > |<-Map 3 [XPROD_EDGE] vectorized, llap| > |XPROD_EDGE [RS_13]| > |Select Operator [SEL_12] (rows=3 width=4)| > |Output:["_col0"]| > |TableScan [TS_2] (rows=3 width=4)| > |zarz@test2,t2, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"]| > | | > +----------------------------------------------------+ > {code} > * With {{set hive.tez.cartesian-product.enabled=false;}}: > {code:java} > +----------------------------------------------------+ > |Explain| > +----------------------------------------------------+ > |Plan optimized by CBO.| > | | > |Vertex dependency in root stage| > |Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 3 (CUSTOM_SIMPLE_EDGE)| > | | > |Stage-0| > |Fetch Operator| > |limit:-1| > |Stage-1| > |Reducer 2 llap| > |File Output Operator [FS_8]| > |Merge Join Operator [MERGEJOIN_9] (rows=9 width=8)| > |Conds:(Inner),Output:["_col0","_col1"]| > |<-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap| > |PARTITION_ONLY_SHUFFLE [RS_11]| > |Select Operator [SEL_10] (rows=3 width=4)| > |Output:["_col0"]| > |TableScan [TS_0] (rows=3 width=4)| > |zarz@test1,t1, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"]| > |<-Map 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap| > |PARTITION_ONLY_SHUFFLE [RS_13]| > |Select Operator [SEL_12] (rows=3 width=4)| > |Output:["_col0"]| > |TableScan [TS_2] (rows=3 width=4)| > |zarz@test2,t2, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"]| > | | > +----------------------------------------------------+ > {code} > So I suspect some bug in XPROD_EDGE component executed on LLAP. > *Note 1: Same query on Hive on Tez works as expected.* > *Note 2: What I also tried but didn't help*: > {code:java} > set hive.vectorized.execution.enabled=false; (default true) > {code} > That didn't change the result. -- This message was sent by Atlassian Jira (v8.3.4#803005)