very interesting.  so this did work correctly on your previous distribution
of these two products?  May i ask what they were?

On Mon, Feb 29, 2016 at 8:24 PM, GAO Chi <[email protected]> wrote:

> Hi all,
>
>
>
> We encountered a strange behavior after upgrading to HIVE 2.0.0 + TEZ
> 0.8.2.
>
>
>
> I simplified our query to this:
>
>
>
> SELECT
>
>   a.key,
>
>   a.a_one,
>
>   b.b_one,
>
>   a.a_zero,
>
>   b.b_zero
>
> FROM
>
> (
>
>     SELECT
>
>       11 key,
>
>       0 confuse_you,
>
>       1 a_one,
>
>       0 a_zero
>
> ) a
>
> LEFT JOIN
>
> (
>
>     SELECT
>
>       11 key,
>
>       0 confuse_you,
>
>       1 b_one,
>
>       0 b_zero
>
> ) b
>
> ON a.key = b.key
>
> ;
>
>
>
>
>
> Above query generates this unexpected result:
>
>
>
> INFO  : Status: Running (Executing on YARN cluster with App id
> application_1456723490535_3653)
>
>
>
> INFO  : Map 1: 0/1      Map 2: 0/1
>
> INFO  : Map 1: 0/1      Map 2: 0(+1)/1
>
> INFO  : Map 1: 0(+1)/1  Map 2: 0(+1)/1
>
> INFO  : Map 1: 0(+1)/1  Map 2: 1/1
>
> INFO  : Map 1: 1/1      Map 2: 1/1
>
> INFO  : Completed executing
> command(queryId=hive_20160301115630_0a0dbee5-ba4b-45e7-b027-085f655640fd);
> Time taken: 10.225 seconds
>
> INFO  : OK
>
> +--------+----------+----------+-----------+-----------+--+
>
> | a.key  | a.a_one  | b.b_one  | a.a_zero  | b.b_zero  |
>
> +--------+----------+----------+-----------+-----------+--+
>
> | 11     | 1        | 0        | 0         | 1         |
>
> +--------+----------+----------+-----------+-----------+--+
>
>
>
> If you change the constant value of subquery-b’s confuse_you column from 0
> to 2, the problem disappears. The plan returned from EXPLAIN shows the
> incorrect one is picking _col1 and _col2, while the correct one is picking
> _col2 and _col3 form sub query b.
>
>
>
> Seems it cannot distinguish 2 columns with same constant value?
>
>
>
>
>
> Anyone encountered similar problem?
>
>
>
>
>
> Thanks!
>
>
>
> Chi
>
>
>

Reply via email to