I'm currently running a Hive build from trunk, revision number 911889. I've built a UDTF called map_explode, which emits the key and value of each entry in a map as a row in the result table. The table I'm running it against looks like:

hive> describe mytable;
product string from deserializer
...
interactions map<string,int> from deserializer

If I use map_explode in the select clause, I get the expected results:

hive> select map_explode(interactions) as (key, value) from mytable where day = '2010-02-18' and hour = 1 limit 10;
...
OK
invite_impression 1
invite_impression 1
invite_impression 1
invite_impression 1
rollout 12
invite_impression 1
invite_impression 1
invite_impression 1
rollout 4
invite_impression 1
Time taken: 22.11 seconds

However, if I try to use LATERAL VIEW to relate the exploded values back to the parent table, like so:

hive> select product, key, sum(value) from mytable LATERAL VIEW map_explode(interactions) interacts as key, value where day = '2010-02-18' and hour = 1 group by product, key;

I get the following error:

FAILED: Unknown exception: null

Looking in hive.log, I see the following stack trace:

2010-02-19 14:15:17,215 ERROR ql.Driver (SessionState.java:printError(255)) - FAILED: Unknown exception: null
java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory$ColumnExprProcessor.process(ExprWalkerProcFactory.java:87)
        at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
        at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:273)
        at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:317)
        at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.process(OpProcFactory.java:258)
        at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
        at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:103)
        at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:74)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5758)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:125)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I peeked at ExprWalkerProcFactory, but couldn't readily see what was causing the problem. Any ideas?

Jason
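
P.S. For reference, the result expected from the LATERAL VIEW query can be sketched in plain Java. This is only an illustration of the intended semantics, not the actual map_explode UDTF and not how Hive evaluates the query; the Row class, the sample data, and all method names are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LateralViewSketch {

    // Hypothetical stand-in for one row of mytable (only the two columns used here).
    static final class Row {
        final String product;
        final Map<String, Integer> interactions;

        Row(String product, Map<String, Integer> interactions) {
            this.product = product;
            this.interactions = interactions;
        }
    }

    // Conceptually what map_explode does: one (key, value) output row per map entry.
    static List<Map.Entry<String, Integer>> mapExplode(Map<String, Integer> map) {
        return new ArrayList<>(map.entrySet());
    }

    // Conceptual equivalent of: LATERAL VIEW map_explode(interactions) ...
    // GROUP BY product, key with SUM(value). Each exploded entry is paired
    // with the product of the row it came from, then summed per group.
    static Map<String, Integer> sumByProductAndKey(List<Row> rows) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Row row : rows) {
            for (Map.Entry<String, Integer> entry : mapExplode(row.interactions)) {
                String groupKey = row.product + "\t" + entry.getKey();
                totals.merge(groupKey, entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    // Made-up sample data; product names "a" and "b" are placeholders.
    static List<Row> sampleRows() {
        Map<String, Integer> first = new LinkedHashMap<>();
        first.put("invite_impression", 1);
        first.put("rollout", 12);

        Map<String, Integer> second = new LinkedHashMap<>();
        second.put("invite_impression", 1);

        Map<String, Integer> third = new LinkedHashMap<>();
        third.put("rollout", 4);

        List<Row> rows = new ArrayList<>();
        rows.add(new Row("a", first));
        rows.add(new Row("a", second));
        rows.add(new Row("b", third));
        return rows;
    }

    public static void main(String[] args) {
        // Print one line per (product, key) group with its summed value.
        sumByProductAndKey(sampleRows())
                .forEach((group, total) -> System.out.println(group + "\t" + total));
    }
}
```

With the sample rows, the two invite_impression entries for product "a" collapse into a single group, which is the kind of per-product rollup the failing query is meant to produce.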