[
https://issues.apache.org/jira/browse/HIVE-20441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhihua Deng updated HIVE-20441:
-------------------------------
Comment: was deleted
(was: Nice work! a few questions:
# the registerToSessionRegistry method has registered the function into the
session cache, why should called again by addFunction(functionName, function) ,
this may override the udf registered recently, what if the same query executed
again in the same session? check newFunction != null && isNative before add
function?
# the registerToSession set to true and if the newFunction returned is null,
this will goes on to next addFunction(functionName, function);. In this case,
does it make any sense? in which case should we do like this?
thanks)
> NPE in ExprNodeGenericFuncDesc when hive.allow.udf.load.on.demand is set to
> true
> ---------------------------------------------------------------------------------
>
> Key: HIVE-20441
> URL: https://issues.apache.org/jira/browse/HIVE-20441
> Project: Hive
> Issue Type: Bug
> Components: CLI, HiveServer2
> Affects Versions: 1.2.1, 2.3.3
> Reporter: Hui Huang
> Assignee: Hui Huang
> Priority: Major
> Fix For: 2.3.3
>
> Attachments: HIVE-20441.1.patch, HIVE-20441.patch
>
>
> When hive.allow.udf.load.on.demand is set to true and hiveserver2 has been
> started, the new created function from other clients or hiveserver2 will be
> loaded from the metastore at the first time.
> When the udf is used in where clause, we got a NPE like:
> {code:java}
> Error executing statement:
> org.apache.hive.service.cli.HiveSQLException: Error while compiling
> statement: FAILED: NullPointerException null
> at
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAP
> SHOT]
> at
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHO
> T]
> at
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNA
> PSHOT]
> at
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNA
> PSHOT]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [?:1.8.0_77]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [?:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104)
> ~[hive-exec-2.
> 3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.
> 3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:229)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:176)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11613)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11568)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11536)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3303)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3283)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9592)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10549)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10427)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11125)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10807)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1295)
> ~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> at
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204)
> ~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
> {code}
>
> The code to get udf from metastore is:
> {code:java}
> private FunctionInfo getFunctionInfoFromMetastoreNoLock(String functionName,
> HiveConf conf) {
> try {
> String[] parts =
> FunctionUtils.getQualifiedFunctionNameParts(functionName);
> Function func = Hive.get(conf).getFunction(parts[0].toLowerCase(),
> parts[1]);
> if (func == null) {
> return null;
> }
> // Found UDF in metastore - now add it to the function registry.
> FunctionInfo fi = registerPermanentFunction(functionName,
> func.getClassName(), true,
> FunctionTask.toFunctionResource(func.getResourceUris()));
> if (fi == null) {
> LOG.error(func.getClassName() + " is not a valid UDF class and was
> not registered");
> return null;
> }
> return fi;
> } catch (Throwable e) {
> LOG.info("Unable to look up " + functionName + " in metastore", e);
> }
> return null;
> }
> {code}
>
> After getting the function, the function is registered to permanent function
> list through method 'registerPermanentFunction'.
> {code:java}
> public FunctionInfo registerPermanentFunction(String functionName,
> String className, boolean registerToSession, FunctionResource...
> resources) {
> FunctionInfo function = new FunctionInfo(functionName, className,
> resources);
> // register to session first for backward compatibility
> if (registerToSession) {
> String qualifiedName = FunctionUtils.qualifyFunctionName(
> functionName,
> SessionState.get().getCurrentDatabase().toLowerCase());
> if (registerToSessionRegistry(qualifiedName, function) != null) {
> addFunction(functionName, function);
> return function;
> }
> } else {
> addFunction(functionName, function);
> }
> return null;
> }
> {code}
> And the variable registerToSession is true, so the object 'function' will be
> returned. But the genericUDF field of the returned function is null which
> cause the error.
> We should return the result of the method registerToSessionRegistry returned.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)