[ 
https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-21971:
------------------------------------
    Description: 
https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
hadoop's ReflectionUtils constructor cache issue 
(https://issues.apache.org/jira/browse/HADOOP-10513).

However, there are corner cases where hadoop's {{ReflectionUtils}} is in use 
and this causes gradual build up of memory in HS2.

I have observed this in Hive 2.3. But the codepath in master for this has not 
changed much.

Easiest way to repro would be to add a temp function which extends 
{{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF,}} this would 
end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}} which in 
turn lands up in COSNTRUCTOR_CACHE of ReflectionUtils. 


{noformat}

CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
'file:///home/test/udf/dummy.jar';

select dummy();

        at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
        at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
        at 
org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
        at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
        at 
org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
{noformat}

Note: Reflection based invocation of hadoop's {{ReflectionUtils::clear}} was 
removed in 2.x. 

  was:
https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
hadoop's ReflectionUtils constructor cache issue 
(https://issues.apache.org/jira/browse/HADOOP-10513).

However, there are corner cases where hadoop's {{ReflectionUtils}} is in use 
and this causes gradual build up of memory in HS2.

I have observed this in Hive 2.3. But the codepath in master for this has not 
changed much.

Easiest way to repro would be to add a temp function which extends 
{{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF,}} this would 
end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}} which in 
turn lands up in COSNTRUCTOR_CACHE of ReflectionUtils. 


{noformat}

CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
'file:///home/test/udf/dummy.jar';

select dummy();

        at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
        at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
        at 
org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
        at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
        at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
        at 
org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
        at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
{noformat}

Note: Reflection based invocation of hadoop's `ReflectionUtils::clear` was 
removed in 2.x. 


> HS2 leaks classload due to `ReflectionUtils::CONSTRUCTOR_CACHE` with 
> temporary functions + GenericUDF
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21971
>                 URL: https://issues.apache.org/jira/browse/HIVE-21971
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 2.3.4
>            Reporter: Rajesh Balamohan
>            Priority: Critical
>
> https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
> hadoop's ReflectionUtils constructor cache issue 
> (https://issues.apache.org/jira/browse/HADOOP-10513).
> However, there are corner cases where hadoop's {{ReflectionUtils}} is in use 
> and this causes gradual build up of memory in HS2.
> I have observed this in Hive 2.3. But the codepath in master for this has not 
> changed much.
> Easiest way to repro would be to add a temp function which extends 
> {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF,}} this would 
> end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}} which in 
> turn lands up in COSNTRUCTOR_CACHE of ReflectionUtils. 
> {noformat}
> CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
> 'file:///home/test/udf/dummy.jar';
> select dummy();
>       at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
>       at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
>       at 
> org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
>       at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
>       at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
>       at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>       at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>       at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>       at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
>       at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> {noformat}
> Note: Reflection based invocation of hadoop's {{ReflectionUtils::clear}} was 
> removed in 2.x. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to