Hui Huang created HIVE-20441:
--------------------------------
Summary: NPE in ExprNodeGenericFuncDesc when
hive.allow.udf.load.on.demand is set to true
Key: HIVE-20441
URL: https://issues.apache.org/jira/browse/HIVE-20441
Project: Hive
Issue Type: Bug
Components: CLI, HiveServer2
Affects Versions: 1.2.1, 2.3.3
Reporter: Hui Huang
Assignee: Hui Huang
When hive.allow.udf.load.on.demand is set to true and HiveServer2 is already
running, a function newly created by another client or another HiveServer2
instance is loaded from the metastore the first time it is used.
When such a UDF is used in a WHERE clause, compilation fails with an NPE like:
{code:java}
Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement:
FAILED: NullPointerException null
at
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:290)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:542)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:57)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[?:1.8.0_77]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[?:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
Caused by: java.lang.NullPointerException
at
org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:236)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1104)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:229)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:176)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11613)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11568)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11536)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3303)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3283)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9592)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10549)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10427)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:11125)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11138)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10807)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:512)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1295)
~[hive-exec-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
at
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:204)
~[hive-service-2.3.4-SNAPSHOT.jar:2.3.4-SNAPSHOT]
{code}
The code that looks up the UDF in the metastore is:
{code:java}
private FunctionInfo getFunctionInfoFromMetastoreNoLock(String functionName,
    HiveConf conf) {
  try {
    String[] parts =
        FunctionUtils.getQualifiedFunctionNameParts(functionName);
    Function func = Hive.get(conf).getFunction(parts[0].toLowerCase(),
        parts[1]);
    if (func == null) {
      return null;
    }
    // Found UDF in metastore - now add it to the function registry.
    FunctionInfo fi = registerPermanentFunction(functionName,
        func.getClassName(), true,
        FunctionTask.toFunctionResource(func.getResourceUris()));
    if (fi == null) {
      LOG.error(func.getClassName() + " is not a valid UDF class and was not registered");
      return null;
    }
    return fi;
  } catch (Throwable e) {
    LOG.info("Unable to look up " + functionName + " in metastore", e);
  }
  return null;
}
{code}
After the function is fetched, it is added to the permanent function list by
the method 'registerPermanentFunction':
{code:java}
public FunctionInfo registerPermanentFunction(String functionName,
    String className, boolean registerToSession, FunctionResource... resources) {
  FunctionInfo function = new FunctionInfo(functionName, className,
      resources);
  // register to session first for backward compatibility
  if (registerToSession) {
    String qualifiedName = FunctionUtils.qualifyFunctionName(
        functionName, SessionState.get().getCurrentDatabase().toLowerCase());
    if (registerToSessionRegistry(qualifiedName, function) != null) {
      addFunction(functionName, function);
      return function;
    }
  } else {
    addFunction(functionName, function);
  }
  return null;
}
{code}
Here registerToSession is true, so the local variable 'function' is returned.
However, the genericUDF field of that object is null, which leads to the NPE in
ExprNodeGenericFuncDesc.newInstance. We should instead return the value
returned by registerToSessionRegistry.
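The proposed change can be illustrated with a small standalone model. This is only a sketch, not the actual Hive patch: RegistrySketch, its simplified FunctionInfo, the two maps, and the method names registerPermanentFunctionCurrent and registerPermanentFunctionProposed are hypothetical stand-ins. It also assumes, as the description above implies, that registerToSessionRegistry returns an entry whose UDF instance is populated.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the registry; none of this is Hive's real code.
public class RegistrySketch {

    /** Stand-in for FunctionInfo: genericUdf stays null until the class is loaded. */
    static final class FunctionInfo {
        final String name;
        final String className;
        final Object genericUdf; // null for an entry built from the class name only

        FunctionInfo(String name, String className, Object genericUdf) {
            this.name = name;
            this.className = className;
            this.genericUdf = genericUdf;
        }
    }

    private final Map<String, FunctionInfo> sessionRegistry = new HashMap<>();
    private final Map<String, FunctionInfo> systemRegistry = new HashMap<>();

    /** Assumed behaviour: returns a new entry with the UDF instance populated. */
    FunctionInfo registerToSessionRegistry(String qualifiedName, FunctionInfo lazy) {
        Object udf = new Object(); // placeholder for the instantiated UDF
        FunctionInfo loaded = new FunctionInfo(qualifiedName, lazy.className, udf);
        sessionRegistry.put(qualifiedName, loaded);
        return loaded;
    }

    /** Mirrors the reported behaviour: the entry without a UDF instance is returned. */
    FunctionInfo registerPermanentFunctionCurrent(String name, String className) {
        FunctionInfo function = new FunctionInfo(name, className, null);
        if (registerToSessionRegistry(name, function) != null) {
            systemRegistry.put(name, function);
            return function;
        }
        return null;
    }

    /** Mirrors the proposed fix: the session registry's result is returned. */
    FunctionInfo registerPermanentFunctionProposed(String name, String className) {
        FunctionInfo function = new FunctionInfo(name, className, null);
        FunctionInfo registered = registerToSessionRegistry(name, function);
        if (registered != null) {
            systemRegistry.put(name, function);
            return registered;
        }
        return null;
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        FunctionInfo current =
            registry.registerPermanentFunctionCurrent("default.my_udf", "com.example.MyUdf");
        FunctionInfo proposed =
            registry.registerPermanentFunctionProposed("default.my_udf", "com.example.MyUdf");
        System.out.println("current genericUdf is null: " + (current.genericUdf == null));
        System.out.println("proposed genericUdf is null: " + (proposed.genericUdf == null));
    }
}
```

In this model the current variant hands the caller an entry with a null UDF, while the proposed variant hands back the entry produced by the session registry. Whether the real registerToSessionRegistry always returns a fully initialized FunctionInfo should be checked against the Hive source.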