[ 
https://issues.apache.org/jira/browse/HIVE-106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668711#action_12668711
 ] 

Namit Jain commented on HIVE-106:
---------------------------------

Josh, can you provide the data files for the tables activities and users which 
was failing

> Join operation fails for some queries
> -------------------------------------
>
>                 Key: HIVE-106
>                 URL: https://issues.apache.org/jira/browse/HIVE-106
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Josh Ferguson
>            Assignee: Namit Jain
>            Priority: Critical
>
> The Tables Are
> CREATE TABLE activities 
> (actor_id STRING, actee_id STRING, properties MAP<STRING, STRING>) 
> PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT) 
> CLUSTERED BY (actor_id, actee_id) INTO 32 BUCKETS 
> ROW FORMAT DELIMITED 
> COLLECTION ITEMS TERMINATED BY '44'
> MAP KEYS TERMINATED BY '58'
> STORED AS TEXTFILE;
> Detailed Table Information:
> Table(tableName:activities,dbName:default,owner:Josh,createTime:1228208598,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:actor_id,type:string,comment:null),
>  FieldSchema(name:actee_id,type:string,comment:null), 
> FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/activities,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[actor_id,
>  
> actee_id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null),
>  FieldSchema(name:application,type:string,comment:null), 
> FieldSchema(name:dataset,type:string,comment:null), 
> FieldSchema(name:hour,type:int,comment:null)],parameters:{})
> CREATE TABLE users 
> (id STRING, properties MAP<STRING, STRING>) 
> PARTITIONED BY (account STRING, application STRING, dataset STRING, hour INT) 
> CLUSTERED BY (id) INTO 32 BUCKETS 
> ROW FORMAT DELIMITED 
> COLLECTION ITEMS TERMINATED BY '44'
> MAP KEYS TERMINATED BY '58'
> STORED AS TEXTFILE;
> Detailed Table Information:
> Table(tableName:users,dbName:default,owner:Josh,createTime:1228208633,lastAccessTime:0,retention:0,sd:StorageDescriptor(cols:[FieldSchema(name:id,type:string,comment:null),
>  
> FieldSchema(name:properties,type:map<string,string>,comment:null)],location:/user/hive/warehouse/users,inputFormat:org.apache.hadoop.mapred.TextInputFormat,outputFormat:org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat,compressed:false,numBuckets:32,serdeInfo:SerDeInfo(name:null,serializationLib:org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe,parameters:{colelction.delim=44,mapkey.delim=58,serialization.format=org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol}),bucketCols:[id],sortCols:[],parameters:{}),partitionKeys:[FieldSchema(name:account,type:string,comment:null),
>  FieldSchema(name:application,type:string,comment:null), 
> FieldSchema(name:dataset,type:string,comment:null), 
> FieldSchema(name:hour,type:int,comment:null)],parameters:{})
> A working query is
> SELECT activities.* FROM activities WHERE activities.dataset='poke' AND 
> activities.properties['verb'] = 'Dance';
> A non working query is
> SELECT activities.*, users.* FROM activities LEFT OUTER JOIN users ON 
> activities.actor_id = users.id WHERE activities.dataset='poke' AND 
> activities.properties['verb'] = 'Dance';
> The Exception Is
> java.lang.RuntimeException: Hive 2 Internal error: cannot evaluate index 
> expression on string
>       at 
> org.apache.hadoop.hive.ql.exec.ExprNodeIndexEvaluator.evaluate(ExprNodeIndexEvaluator.java:64)
>       at 
> org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
>       at 
> org.apache.hadoop.hive.ql.exec.ExprNodeFuncEvaluator.evaluate(ExprNodeFuncEvaluator.java:72)
>       at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:67)
>       at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:262)
>       at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:257)
>       at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:477)
>       at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
>       at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:467)
>       at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:507)
>       at 
> org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:489)
>       at 
> org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:140)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
>       at org.apache.hadoop.mapred.Child.main(Child.java:155)
> This is thrown every time in the first phase of reduction.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to