Could you create a JIRA with a repro case?

Thanks,
Xuefu
On Thu, Dec 17, 2015 at 9:21 PM, Jone Zhang <joyoungzh...@gmail.com> wrote:

> *My query is*
> set hive.execution.engine=spark;
> select
>   t3.pcid,channel,version,ip,hour,app_id,app_name,app_apk,app_version,app_type,dwl_tool,dwl_status,err_type,dwl_store,dwl_maxspeed,dwl_minspeed,dwl_avgspeed,last_time,dwl_num,
>   (case when t4.cnt is null then 0 else 1 end) as is_evil
> from
>   (select /*+mapjoin(t2)*/
>      pcid,channel,version,ip,hour,
>      (case when t2.app_id is null then t1.app_id else t2.app_id end) as app_id,
>      t2.name as app_name,
>      app_apk,
>      app_version,app_type,dwl_tool,dwl_status,err_type,dwl_store,dwl_maxspeed,dwl_minspeed,dwl_avgspeed,last_time,dwl_num
>    from
>      t_ed_soft_downloadlog_molo t1 left outer join t_rd_soft_app_pkg_name t2 on
>      (lower(t1.app_apk) = lower(t2.package_id) and t1.ds = 20151217 and t2.ds = 20151217)
>    where
>      t1.ds = 20151217) t3
> left outer join
>   (
>    select pcid,count(1) cnt from t_ed_soft_evillog_molo where ds=20151217
>    group by pcid
>   ) t4
> on t3.pcid=t4.pcid;
>
> *And the error log is*
> 2015-12-18 08:10:18,685 INFO [main]: spark.SparkMapJoinOptimizer (SparkMapJoinOptimizer.java:process(79)) - Check if it can be converted to map join
> 2015-12-18 08:10:18,686 ERROR [main]: ql.Driver (SessionState.java:printError(966)) - FAILED: NullPointerException null
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.getConnectedParentMapJoinSize(SparkMapJoinOptimizer.java:312)
>     at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.getConnectedMapJoinSize(SparkMapJoinOptimizer.java:292)
>     at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.getMapJoinConversionInfo(SparkMapJoinOptimizer.java:271)
>     at org.apache.hadoop.hive.ql.optimizer.spark.SparkMapJoinOptimizer.process(SparkMapJoinOptimizer.java:80)
>     at org.apache.hadoop.hive.ql.optimizer.spark.SparkJoinOptimizer.process(SparkJoinOptimizer.java:58)
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:92)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:97)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:81)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:135)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:112)
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:128)
>     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10238)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:210)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:233)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:425)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1123)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1171)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1060)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1050)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:208)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:160)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:447)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:357)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:795)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:767)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:704)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>
> *Some properties in hive-site.xml are*
> <property>
>   <name>hive.ignore.mapjoin.hint</name>
>   <value>false</value>
> </property>
> <property>
>   <name>hive.auto.convert.join</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.auto.convert.join.noconditionaltask</name>
>   <value>true</value>
> </property>
>
> *The code relevant to the error is*
> long mjSize = ctx.getMjOpSizes().get(op);
>
> *I think the code should check whether* ctx.getMjOpSizes().get(op) *is null.*
>
> *Of course, whether stricter logic is needed is for you to decide.*
>
> *Thanks.*
> *Best Wishes.*
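
For reference, the null check suggested in the quoted message could look roughly like the sketch below. It assumes the NullPointerException comes from auto-unboxing a missing map entry into a primitive long, which is consistent with the quoted line but not confirmed here. The class name, the sizeOrZero helper, the String keys, and the choice to treat a missing entry as size 0 are all hypothetical and are not Hive code; the real map is keyed by operators.

```java
import java.util.HashMap;
import java.util.Map;

public class MapJoinSizeSketch {
    // Hypothetical stand-in for a lookup on ctx.getMjOpSizes().
    // Assumption: a missing entry is treated as size 0; the real fix may need stricter logic.
    static long sizeOrZero(Map<String, Long> mjOpSizes, String op) {
        Long size = mjOpSizes.get(op);   // boxed value, null when op has no entry
        return size == null ? 0L : size; // unbox only after the null check
    }

    public static void main(String[] args) {
        Map<String, Long> mjOpSizes = new HashMap<>();
        mjOpSizes.put("mapjoin-1", 1024L);

        // Guarded lookups: a present key and a missing key.
        System.out.println(sizeOrZero(mjOpSizes, "mapjoin-1"));
        System.out.println(sizeOrZero(mjOpSizes, "mapjoin-2"));

        // Unguarded lookup, same shape as the quoted line: unboxing null throws.
        try {
            long mjSize = mjOpSizes.get("mapjoin-2");
            System.out.println(mjSize);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from unboxing a missing entry");
        }
    }
}
```

Whether 0 is the right default, or whether a missing entry should instead make the optimizer skip the map join conversion, is a design question for the JIRA.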