Hi,

It appears that queries combining all three of (join, group by, non-string datatype) cause a crash in the serde code that runs at the reducer:
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.DoubleWritable cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:179)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.DoubleWritable cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:359)

The workaround mentioned in the FAQ(3) (and reported fixed by HIVE-405, although this seems to be a different issue) does not seem to fix the problem, which has existed since Hive revision 764548. I am using Hadoop v.0.19, though I get the same errors when I use the latest trunk.

This script (which includes the workaround) captures the issue:

-----------------------------------------
create table foo (
  bas string,
  bam double
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\174';

create table bar (
  bas string--,
  --bat double
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\174';

load data local inpath '/PATH/TO/demo_foo.txt' overwrite into table foo;
load data local inpath '/PATH/TO/demo_bar.txt' overwrite into table bar;

select f.bas, cast(bam as string)
from foo f join bar b on (f.bas = b.bas)
group by f.bas, cast(bam as string);
-------------------------------

Contents of demo_foo.txt:

11234325|0.123
221346|10.12
33463246|100.25
432462634|0.12
5346236|345.12

Contents of demo_bar.txt:

11234325|0.1222
221346|1.11
33463246|235.23
432462634|6.33
5346236|77.77

Thanks,
Peter Alvaro
UC Berkeley