[ 
https://issues.apache.org/jira/browse/HIVE-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Huang updated HIVE-20441:
-----------------------------
    Status: In Progress  (was: Patch Available)

> NPE in ExprNodeGenericFuncDesc when hive.allow.udf.load.on.demand is set to 
> true
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-20441
>                 URL: https://issues.apache.org/jira/browse/HIVE-20441
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI, HiveServer2
>    Affects Versions: 1.2.1, 2.3.3
>            Reporter: Hui Huang
>            Assignee: Hui Huang
>            Priority: Major
>             Fix For: 2.3.3
>
>         Attachments: HIVE-20441.1.patch, HIVE-20441.2.patch, HIVE-20441.patch
>
>
> When hive.allow.udf.load.on.demand is set to true and HiveServer2 is already 
> running, a function newly created by another client or another HiveServer2 
> instance is loaded from the metastore the first time it is used. 
> When the UDF is used in a WHERE clause, we get an NPE like:
> {code:java}
> Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
>         at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
>         at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:229) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:176) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11613) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11568) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11536) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3303) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3283) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9592) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10549) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10427) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11125) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10807) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1295) ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
>         at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204) ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> {code}
>  
> The code that loads the UDF from the metastore is:
> {code:java}
> private FunctionInfo getFunctionInfoFromMetastoreNoLock(String functionName, HiveConf conf) {
>     try {
>       String[] parts = FunctionUtils.getQualifiedFunctionNameParts(functionName);
>       Function func = Hive.get(conf).getFunction(parts[0].toLowerCase(), parts[1]);
>       if (func == null) {
>         return null;
>       }
>       // Found UDF in metastore - now add it to the function registry.
>       FunctionInfo fi = registerPermanentFunction(functionName, func.getClassName(), true,
>           FunctionTask.toFunctionResource(func.getResourceUris()));
>       if (fi == null) {
>         LOG.error(func.getClassName() + " is not a valid UDF class and was not registered");
>         return null;
>       }
>       return fi;
>     } catch (Throwable e) {
>       LOG.info("Unable to look up " + functionName + " in metastore", e);
>     }
>     return null;
>   }
> {code}
>  
> After the function is fetched, it is registered in the permanent function 
> list through the method 'registerPermanentFunction'.
> {code:java}
> public FunctionInfo registerPermanentFunction(String functionName,
>       String className, boolean registerToSession, FunctionResource... resources) {
>     FunctionInfo function = new FunctionInfo(functionName, className, resources);
>     // register to session first for backward compatibility
>     if (registerToSession) {
>       String qualifiedName = FunctionUtils.qualifyFunctionName(
>           functionName, SessionState.get().getCurrentDatabase().toLowerCase());
>       if (registerToSessionRegistry(qualifiedName, function) != null) {
>         addFunction(functionName, function);
>         return function;
>       }
>     } else {
>         addFunction(functionName, function);
>     }
>     return null;
>   }
> {code}
> Here registerToSession is true, so the object 'function' is returned. But the 
> genericUDF field of that object is null, which causes the error. 
> We should return the result of the method registerToSessionRegistry instead.
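
The proposed change can be illustrated with a minimal, self-contained sketch. This is not the Hive source and not the attached patch: FunctionInfoSketch and RegistrySketch are hypothetical stand-ins, and the sketch assumes that registerToSessionRegistry returns a separate, fully initialized object whose genericUDF field is set, as the description above implies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hive's FunctionInfo; only the fields relevant here.
final class FunctionInfoSketch {
    final String name;
    final String className;
    final Object genericUDF; // null until the UDF class has been instantiated

    FunctionInfoSketch(String name, String className, Object genericUDF) {
        this.name = name;
        this.className = className;
        this.genericUDF = genericUDF;
    }
}

// Hypothetical stand-in for the function registry.
final class RegistrySketch {
    private final Map<String, FunctionInfoSketch> permanent = new HashMap<>();

    // Assumption: session registration builds a NEW descriptor with the UDF loaded.
    FunctionInfoSketch registerToSessionRegistry(String qualifiedName, FunctionInfoSketch unresolved) {
        Object udf = new Object(); // placeholder for the instantiated UDF
        return new FunctionInfoSketch(qualifiedName, unresolved.className, udf);
    }

    // Mirrors the reported behavior: hands back the unresolved descriptor.
    FunctionInfoSketch registerPermanentBuggy(String name, String className) {
        FunctionInfoSketch function = new FunctionInfoSketch(name, className, null);
        if (registerToSessionRegistry(name, function) != null) {
            permanent.put(name, function);
            return function; // genericUDF is still null here
        }
        return null;
    }

    // Proposed behavior: hand back what registerToSessionRegistry returned.
    FunctionInfoSketch registerPermanentFixed(String name, String className) {
        FunctionInfoSketch function = new FunctionInfoSketch(name, className, null);
        FunctionInfoSketch registered = registerToSessionRegistry(name, function);
        if (registered != null) {
            permanent.put(name, function);
            return registered; // genericUDF is populated
        }
        return null;
    }
}
```

In this sketch, a caller that dereferences genericUDF on the result of registerPermanentBuggy hits a NullPointerException, while the result of registerPermanentFixed is safe to use.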


