Mithun Radhakrishnan created HIVE-6389:
------------------------------------------
             Summary: LazyBinaryColumnarSerDe-based RCFile tables break when looking up elements in null-maps.
                 Key: HIVE-6389
                 URL: https://issues.apache.org/jira/browse/HIVE-6389
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
    Affects Versions: 0.12.0, 0.11.0, 0.10.0, 0.13.0
            Reporter: Mithun Radhakrishnan
            Assignee: Mithun Radhakrishnan

RCFile tables that use the LazyBinaryColumnarSerDe don't handle look-ups into map columns when the value of the column is null. When an RCFile table is created with LazyBinaryColumnarSerDe (the default in 0.12) and queried as follows:

{code}
select mymap['1024'] from mytable;
{code}

and the mymap column contains nulls, one is treated to the following guttural utterance:

{code}
2014-02-05 21:50:25,050 FATAL mr.ExecMapper (ExecMapper.java:map(194)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":null,"mymap":null,"isnull":null}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveWritableObject(WritableStringObjectInspector.java:41)
    at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:226)
    at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:486)
    at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:439)
    at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:423)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:560)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:790)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
    ... 10 more
{code}

A patch is on the way, but the short of it is that the LazyBinaryMapOI needs to return null if either the map or the lookup key is null. This is already handled correctly for Text data, and for RCFile tables using the ColumnarSerDe.
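To make the intended behavior concrete, below is a minimal sketch (not the actual patch) of the null-guard described above, written against Hive's generic MapObjectInspector interface. The NullSafeMapLookup class and its lookup method are hypothetical names introduced only for illustration; at runtime the object inspector would be, e.g., a LazyBinaryMapObjectInspector.

{code}
import org.apache.hadoop.hive.serde2.objectinspector.MapObjectInspector;

// Hypothetical illustration class -- not part of Hive, and not the HIVE-6389 patch.
public class NullSafeMapLookup {

  // Returns the value for 'key' in the serialized map 'map', or null when either
  // the map value or the lookup key is null, instead of attempting to deserialize
  // a null map (which is what currently ends in the ClassCastException above).
  public static Object lookup(MapObjectInspector mapOI, Object map, Object key) {
    if (map == null || key == null) {
      return null;
    }
    return mapOI.getMapValueElement(map, key);
  }
}
{code}

With a guard like that in the map object inspector, select mymap['1024'] from mytable would return NULL for rows whose mymap is null, matching the behavior already seen with Text tables and with RCFile tables that use the ColumnarSerDe.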