[ https://issues.apache.org/jira/browse/HIVE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325850#comment-14325850 ]
Li Xin commented on HIVE-9613: ------------------------------ thank you for your reply Chao. I just set up a new hive of version 1.0 in my cluster,and didn't change any configuration ,and the results is still the same , strange. i will just attached the sql i tested , could you take any time to test it and let me know it is ok or not . Best regards and happy Chinese new year. > Left join query plan outputs wrong column when using subquery > -------------------------------------------------------------- > > Key: HIVE-9613 > URL: https://issues.apache.org/jira/browse/HIVE-9613 > Project: Hive > Issue Type: Bug > Components: Parser, Query Planning > Affects Versions: 0.14.0, 1.0.0 > Environment: apache hadoop 2.5.1 > Reporter: Li Xin > Attachments: test.sql > > > I have a query that outputs a column with wrong contents when using > subquery,and the contents of that column is equal to another column,not its > own. > I have three tables,as follows: > table 1: _hivetemp.category_city_rank_: > ||category||city||rank|| > |jinrongfuwu|shanghai|1| > |ktvjiuba|shanghai|2| > table 2:_hivetemp.category_match_: > ||src_category_en||src_category_cn||dst_category_en||dst_category_cn|| > |danbaobaoxiantouzi|投资担保|担保/贷款|jinrongfuwu| > |zpwentiyingshi|娱乐/休闲|KTV/酒吧|ktvjiuba| > table 3:_hivetemp.city_match_: > ||src_city_name_en||dst_city_name_en||city_name_cn|| > |sh|shanghai|上海| > And the query is : > {code} > select > a.category, > a.city, > a.rank, > b.src_category_en, > c.src_city_name_en > from > hivetemp.category_city_rank a > left outer join > (select > src_category_en, > dst_category_en > from > hivetemp.category_match) b > on a.category = b.dst_category_en > left outer join > (select > src_city_name_en, > dst_city_name_en > from > hivetemp.city_match) c > on a.city = c.dst_city_name_en > {code} > which shoud output the results as follows,and i test it in hive 0.13: > ||category||city||rank||src_category_en||src_city_name_en|| > |jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh| > |ktvjiuba|shanghai|2|zpwentiyingshi|sh| > but int hive0.14,the results in the column *src_category_en* is wrong,and is > just the *city* contents: > ||category||city||rank||src_category_en||src_city_name_en|| > |jinrongfuwu|shanghai|1|shanghai|sh| > |ktvjiuba|shanghai|2|shanghai|sh| > Using explain to examine the execution plan,i can see the first subquery just > outputs one column of *dst_category_en*,and *src_category_en* is just missing. > {quote} > b:category_match > TableScan > alias: category_match > Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: dst_category_en (type: string) > outputColumnNames: _col1 > Statistics: Num rows: 131 Data size: 13149 Basic stats: > COMPLETE Column stats: NONE > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)