Join operation fails for some queries
-------------------------------------

                 Key: HIVE-106
                 URL: https://issues.apache.org/jira/browse/HIVE-106
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.19.0
            Reporter: Josh Ferguson


The Tables Are

CREATE TABLE activities
  (actor_id STRING, actee_id STRING, properties MAP<STRING, STRING>)
PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT)
CLUSTERED BY (actor_id, actee_id) INTO 32 BUCKETS
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;

Detailed Table Information:
Table(tableName:activities,dbName:default,owner:Josh,createTime:1228208598,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:actor_id,type:string,comment:null), FieldSchema(name:actee_id,type:string,comment:null), FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/activities,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[actor_id, actee_id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null), FieldSchema(name:application,type:string,comment:null), FieldSchema(name:dataset,type:string,comment:null), FieldSchema(name:hour,type:int,comment:null)],parameters:{})

CREATE TABLE users
  (id STRING, properties MAP<STRING, STRING>)
PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT)
CLUSTERED BY (id) INTO 32 BUCKETS
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE;

Detailed Table Information:
Table(tableName:users,dbName:default,owner:Josh,createTime:1228208633,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:string,comment:null), FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/users,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null), FieldSchema(name:application,type:string,comment:null), FieldSchema(name:dataset,type:string,comment:null), FieldSchema(name:hour,type:int,comment:null)],parameters:{})

A working query is

SELECT activities.*
FROM activities
WHERE activities.dataset='poke'
  AND activities.properties['verb'] = 'Dance';

A non working query is

SELECT activities.*, users.*
FROM activities
LEFT OUTER JOIN users ON activities.actor_id = users.id
WHERE activities.dataset='poke'
  AND activities.properties['verb'] = 'Dance';

The Exception Is

java.lang.RuntimeException: Hive 2 Internal error: cannot evaluate index expression on string
	at org.apache.hadoop.hive.ql.exec.ExprNodeIndexEvaluator.evaluate(ExprNodeIndexEvaluator.java:64)
	at org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
	at org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:67)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:262)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:257)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:477)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:507)
	at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:489)
	at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:140)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
	at org.apache.hadoop.mapred.Child.main(Child.java:155)

This is thrown every time in the first phase of reduction.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.