Good to know. Then, per Jeff Zhang's thinking, if you were to set the execution engine to 'mr', would it still fail? If so, then it's not Tez. :)
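
A minimal sketch of that check, assuming the engine is switched per session through the hive.execution.engine property on your setup (the query is the simplified one from the original report, quoted further down):

```sql
-- Assumption: hive.execution.engine can be set per session; 'mr' selects MapReduce.
SET hive.execution.engine=mr;

-- Same simplified query as in the original report.
SELECT a.key, a.a_one, b.b_one, a.a_zero, b.b_zero
FROM (SELECT 11 key, 0 confuse_you, 1 a_one, 0 a_zero) a
LEFT JOIN (SELECT 11 key, 0 confuse_you, 1 b_one, 0 b_zero) b
ON a.key = b.key;

-- Switch back to Tez afterwards.
SET hive.execution.engine=tez;
```

A correct run should return b_one = 1 and b_zero = 0. If the values still come back swapped under mr, the problem is more likely in Hive's planning than in Tez. Prefixing the SELECT with EXPLAIN under each engine would let you compare which columns get picked from subquery b.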

On Mon, Feb 29, 2016 at 9:31 PM, GAO Chi <[email protected]> wrote:

> Yes. We have not changed our script, and this only appeared after we
> upgraded to the new version on the 24th.
>
> Previously we were using HIVE 1.2.0 + TEZ 0.7.0.
>
> Thanks!
>
> Chi
>
> *From:* Stephen Sprague [mailto:[email protected]]
> *Sent:* Tuesday, March 1, 2016 12:31 PM
> *To:* [email protected]
> *Subject:* Re: Wrong column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join
>
> very interesting. so this did work correctly on your previous
> distribution of these two products? May i ask what they were?
>
> On Mon, Feb 29, 2016 at 8:24 PM, GAO Chi <[email protected]> wrote:
>
> Hi all,
>
> We encountered a strange behavior after upgrading to HIVE 2.0.0 + TEZ 0.8.2.
>
> I simplified our query to this:
>
> SELECT
>     a.key,
>     a.a_one,
>     b.b_one,
>     a.a_zero,
>     b.b_zero
> FROM
> (
>     SELECT
>         11 key,
>         0 confuse_you,
>         1 a_one,
>         0 a_zero
> ) a
> LEFT JOIN
> (
>     SELECT
>         11 key,
>         0 confuse_you,
>         1 b_one,
>         0 b_zero
> ) b
> ON a.key = b.key
> ;
>
> The above query generates this unexpected result:
>
> INFO : Status: Running (Executing on YARN cluster with App id application_1456723490535_3653)
>
> INFO : Map 1: 0/1 Map 2: 0/1
> INFO : Map 1: 0/1 Map 2: 0(+1)/1
> INFO : Map 1: 0(+1)/1 Map 2: 0(+1)/1
> INFO : Map 1: 0(+1)/1 Map 2: 1/1
> INFO : Map 1: 1/1 Map 2: 1/1
> INFO : Completed executing command(queryId=hive_20160301115630_0a0dbee5-ba4b-45e7-b027-085f655640fd); Time taken: 10.225 seconds
> INFO : OK
> +--------+----------+----------+-----------+-----------+--+
> | a.key  | a.a_one  | b.b_one  | a.a_zero  | b.b_zero  |
> +--------+----------+----------+-----------+-----------+--+
> | 11     | 1        | 0        | 0         | 1         |
> +--------+----------+----------+-----------+-----------+--+
>
> If you change the constant value of subquery b's confuse_you column from 0
> to 2, the problem disappears. The plan returned from EXPLAIN shows the
> incorrect one is picking _col1 and _col2, while the correct one is picking
> _col2 and _col3 from subquery b.
>
> It seems it cannot distinguish two columns with the same constant value?
>
> Has anyone encountered a similar problem?
>
> Thanks!
>
> Chi
