Hi Navis,

My colleague chenchun found that the hash codes of 'deal' and 'dim_pay_date' are the same, and that the code in MapJoinProcessor.java ignores the order of the row schema. I looked at your patch, and it touches exactly the place we were working on. Thanks for your patch.
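For anyone following the thread, here is a minimal Java sketch of the general hazard. It is not the Hive code and not the HIVE-5056 patch. If alias or column order is recovered by iterating a hash-based map, that order depends on the keys' hash codes rather than on insertion order, so renaming an alias can change it. That this is the mechanism at work here is an assumption made for illustration; the class AliasOrderSketch and its helper are hypothetical, and only the alias names come from the failing query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of this is taken from Hive.
class AliasOrderSketch {

    // Table aliases in the order they appear in the failing query.
    static final List<String> ALIASES = Collections.unmodifiableList(
            Arrays.asList("orderpayment", "dim_pay_date", "deal", "order_city", "user"));

    // Puts each alias into the given map (value = its position in the query)
    // and returns the order in which the map iterates over its keys.
    static List<String> iterationOrder(Map<String, Integer> map) {
        for (int i = 0; i < ALIASES.size(); i++) {
            map.put(ALIASES.get(i), i);
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // HashMap iterates in bucket order, which is derived from the keys'
        // hash codes; the Java API makes no guarantee about this order.
        System.out.println("HashMap:       " + iterationOrder(new HashMap<>()));

        // LinkedHashMap iterates in insertion order, whatever the hash codes are.
        System.out.println("LinkedHashMap: " + iterationOrder(new LinkedHashMap<>()));
    }
}
```

Code that needs positions to line up between operators should carry the order explicitly (a list, or an insertion-ordered map) instead of relying on whatever order a HashMap happens to produce.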
On Sunday, August 11, 2013, at 9:38 PM, Navis류승우 wrote:
> Hi,
>
> I've filed this as https://issues.apache.org/jira/browse/HIVE-5056
> and attached a patch for it.
>
> It needs a full test run for confirmation, but you can try it.
>
> Thanks.
>
> 2013/8/11 <wzc1...@gmail.com>:
> > Hi all:
> > When I change the table alias dim_pay_date to A, the query passes in Hive 0.11
> > (https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass):
> >
> > use test;
> > create table if not exists src ( `key` int,`val` string);
> > load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
> > drop table if exists orderpayment_small;
> > create table orderpayment_small (`dealid` int,`date` string,`time` string, `cityid` int, `userid` int);
> > insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55 ,5372613 from src limit 1;
> > drop table if exists user_small;
> > create table user_small( userid int);
> > insert overwrite table user_small select key from src limit 100;
> > set hive.auto.convert.join.noconditionaltask.size = 200;
> > SELECT
> > `A`.`date`
> > , `deal`.`dealid`
> > FROM `orderpayment_small` `orderpayment`
> > JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
> > JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
> > JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
> > JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
> > limit 5;
> >
> > This is quite strange and interesting. I will keep searching for the answer
> > to this issue.
> >
> > On Friday, August 9, 2013, at 3:32 AM, wzc1...@gmail.com wrote:
> >
> > Hi all:
> > I'm currently testing Hive 0.11 and have encountered a bug with
> > hive.auto.convert.join. I constructed a test case so everyone can reproduce
> > it (you can also find the test case here:
> > https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):
> >
> > use test;
> > create table src ( `key` int,`val` string);
> > load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite into table src;
> > drop table if exists orderpayment_small;
> > create table orderpayment_small (`dealid` int,`date` string,`time` string, `cityid` int, `userid` int);
> > insert overwrite table orderpayment_small select 748, '2011-03-24', '2011-03-24', 55 ,5372613 from src limit 1;
> > drop table if exists user_small;
> > create table user_small( userid int);
> > insert overwrite table user_small select key from src limit 100;
> > set hive.auto.convert.join.noconditionaltask.size = 200;
> > SELECT
> > `dim_pay_date`.`date`
> > , `deal`.`dealid`
> > FROM `orderpayment_small` `orderpayment`
> > JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = `orderpayment`.`date`
> > JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
> > JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
> > JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
> > limit 5;
> >
> > You will need to replace the path to kv1.txt with your own. If you run the
> > above query in Hive 0.11, it fails with ArrayIndexOutOfBoundsException. You
> > can see the explain result and the console output of the query here:
> > https://gist.github.com/code6/6187569
> >
> > I compiled the trunk code, but the query does not work there either. I can
> > run this query in Hive 0.9 with hive.auto.convert.join turned on.
> >
> > I tried to dig into this problem, and I think it may be caused by the map join
> > optimization. Some adjacent operators don't match in their input/output
> > table info (the column positions differ).
> >
> > I'm not able to fix this bug myself, and I would appreciate it if someone
> > could look into it.
> >
> > Thanks.