Very interesting. So this did work correctly on your previous versions of these two products? May I ask what they were?
On Mon, Feb 29, 2016 at 8:24 PM, GAO Chi <[email protected]> wrote:
> Hi all,
>
> We encountered a strange behavior after upgrading to HIVE 2.0.0 + TEZ 0.8.2.
>
> I simplified our query to this:
>
> SELECT
>   a.key,
>   a.a_one,
>   b.b_one,
>   a.a_zero,
>   b.b_zero
> FROM
> (
>   SELECT
>     11 key,
>     0 confuse_you,
>     1 a_one,
>     0 a_zero
> ) a
> LEFT JOIN
> (
>   SELECT
>     11 key,
>     0 confuse_you,
>     1 b_one,
>     0 b_zero
> ) b
> ON a.key = b.key
> ;
>
> Above query generates this unexpected result:
>
> INFO : Status: Running (Executing on YARN cluster with App id application_1456723490535_3653)
>
> INFO : Map 1: 0/1 Map 2: 0/1
> INFO : Map 1: 0/1 Map 2: 0(+1)/1
> INFO : Map 1: 0(+1)/1 Map 2: 0(+1)/1
> INFO : Map 1: 0(+1)/1 Map 2: 1/1
> INFO : Map 1: 1/1 Map 2: 1/1
> INFO : Completed executing command(queryId=hive_20160301115630_0a0dbee5-ba4b-45e7-b027-085f655640fd); Time taken: 10.225 seconds
> INFO : OK
> +--------+----------+----------+-----------+-----------+--+
> | a.key  | a.a_one  | b.b_one  | a.a_zero  | b.b_zero  |
> +--------+----------+----------+-----------+-----------+--+
> | 11     | 1        | 0        | 0         | 1         |
> +--------+----------+----------+-----------+-----------+--+
>
> If you change the constant value of subquery-b’s confuse_you column from 0
> to 2, the problem disappears. The plan returned from EXPLAIN shows the
> incorrect one is picking _col1 and _col2, while the correct one is picking
> _col2 and _col3 from subquery b.
>
> Seems it cannot distinguish 2 columns with same constant value?
>
> Anyone encountered similar problem?
>
> Thanks!
>
> Chi
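
For anyone trying to reproduce this, here is a sketch of the variant Chi describes as returning the correct row. It is the same query with only subquery b's confuse_you constant changed from 0 to 2. I am assuming nothing else needs to differ, and this has not been run independently here.

```sql
-- Variant of the reported query: only b.confuse_you differs (0 -> 2).
-- Per the report, this version returns the expected values
-- on Hive 2.0.0 + Tez 0.8.2.
SELECT
  a.key,
  a.a_one,
  b.b_one,
  a.a_zero,
  b.b_zero
FROM
  (SELECT 11 key, 0 confuse_you, 1 a_one, 0 a_zero) a
LEFT JOIN
  (SELECT 11 key, 2 confuse_you, 1 b_one, 0 b_zero) b
ON a.key = b.key;
```

Prefixing both this and the original query with EXPLAIN and comparing the plans should show the difference mentioned in the report: _col1 and _col2 picked from subquery b in the failing case, versus _col2 and _col3 in the working one.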
