[ 
https://issues.apache.org/jira/browse/HIVE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325850#comment-14325850
 ] 

Li Xin commented on HIVE-9613:
------------------------------

thank you for your reply Chao. 

I just set up a new hive of version 1.0 in my cluster,and didn't change any 
configuration ,and the results is still the same , strange.

i will just attached the sql i tested , could you take any time to test it and 
let me know it is ok or not . Best regards  and happy Chinese new year. 


> Left join query plan outputs  wrong column when using subquery
> --------------------------------------------------------------
>
>                 Key: HIVE-9613
>                 URL: https://issues.apache.org/jira/browse/HIVE-9613
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser, Query Planning
>    Affects Versions: 0.14.0, 1.0.0
>         Environment: apache hadoop 2.5.1 
>            Reporter: Li Xin
>         Attachments: test.sql
>
>
> I have a query that outputs a column with wrong contents when using 
> subquery,and the contents of that column is equal to another column,not its 
> own.
> I have three tables,as follows:
> table 1: _hivetemp.category_city_rank_:
> ||category||city||rank||
> |jinrongfuwu|shanghai|1|
> |ktvjiuba|shanghai|2|
> table 2:_hivetemp.category_match_:
> ||src_category_en||src_category_cn||dst_category_en||dst_category_cn||
> |danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu|
> |zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba|
> table 3:_hivetemp.city_match_:
> ||src_city_name_en||dst_city_name_en||city_name_cn||
> |sh|shanghai|上海|
> And the query is :
> {code}
> select
>     a.category,
>     a.city,
>     a.rank,
>     b.src_category_en,
>     c.src_city_name_en
> from
>     hivetemp.category_city_rank a
> left outer join
> (select
>     src_category_en,
>     dst_category_en
> from
>     hivetemp.category_match) b
> on  a.category = b.dst_category_en
> left outer join
> (select
>     src_city_name_en,
>     dst_city_name_en
> from
>     hivetemp.city_match) c
> on  a.city = c.dst_city_name_en
> {code}
> which shoud output the results as follows,and i test it in hive 0.13:
> ||category||city||rank||src_category_en||src_city_name_en||
> |jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
> |ktvjiuba|shanghai|2|zpwentiyingshi|sh|
> but int hive0.14,the results in the column *src_category_en* is wrong,and is 
> just the *city* contents:
> ||category||city||rank||src_category_en||src_city_name_en||
> |jinrongfuwu|shanghai|1|shanghai|sh|
> |ktvjiuba|shanghai|2|shanghai|sh|
> Using explain to examine the execution plan,i can see the first subquery just 
> outputs one column of *dst_category_en*,and *src_category_en* is just missing.
> {quote}
>            b:category_match
>           TableScan
>             alias: category_match
>             Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
> Column stats: NONE
>             Select Operator
>               expressions: dst_category_en (type: string)
>               outputColumnNames: _col1
>               Statistics: Num rows: 131 Data size: 13149 Basic stats: 
> COMPLETE Column stats: NONE
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to