I'm currently running a Hive build from trunk, revision 911889.  I've built a
UDTF called map_explode, which emits the key and value of each entry in a map
as a row in the result table.  The table I'm running it against looks like:

hive> describe mytable;
product    string    from deserializer
...
interactions    map<string,int>    from deserializer

If I use map_explode in the select clause, I get the expected results:

hive> select map_explode(interactions) as (key, value) from mytable
      where day = '2010-02-18' and hour = 1 limit 10;
...
OK
invite_impression    1
invite_impression    1
invite_impression    1
invite_impression    1
rollout    12
invite_impression    1
invite_impression    1
invite_impression    1
rollout    4
invite_impression    1
Time taken: 22.11 seconds

However, if I try to use LATERAL VIEW to relate the exploded values back to the
parent table, like so:

hive> select product, key, sum(value) from mytable
      LATERAL VIEW map_explode(interactions) interacts as key, value
      where day = '2010-02-18' and hour = 1 group by product, key;

I get the following error:

FAILED: Unknown exception: null

Looking in hive.log, I see the following stack trace:

2010-02-19 14:15:17,215 ERROR ql.Driver (SessionState.java:printError(255)) - FAILED: Unknown exception: null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory$ColumnExprProcessor.process(ExprWalkerProcFactory.java:87)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
    at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:273)
    at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:317)
    at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.process(OpProcFactory.java:258)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
    at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:103)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:74)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5758)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:125)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I peeked at ExprWalkerProcFactory, but couldn't readily see what was causing 
the problem.  Any ideas?
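
Since the trace enters PredicatePushDown.transform and fails inside the ppd
package, one possible way to narrow this down (a sketch, not verified against
this build) is to turn off predicate pushdown for the session and rerun the
failing query.  This assumes the hive.optimize.ppd setting is available in
this revision:

```sql
-- Assumption: hive.optimize.ppd is honored by this trunk build.
set hive.optimize.ppd=false;

select product, key, sum(value)
from mytable LATERAL VIEW map_explode(interactions) interacts as key, value
where day = '2010-02-18' and hour = 1
group by product, key;
```

If the query then compiles, that would point at how the predicate pushdown
optimizer handles the LATERAL VIEW columns rather than at map_explode itself.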

Jason
