[
https://issues.apache.org/jira/browse/PIG-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446630#comment-16446630
]
Koji Noguchi commented on PIG-5338:
-----------------------------------
Thanks Greg, Adam.
bq. although we'll also need to run (Scripting) e2e tests for verification.
Good idea. I ran the e2e tests with the patch applied, without digging any deeper yet, and got two failures:
Scripting.Scripting_5 and Scripting.Scripting_9.
The error message is pasted below.
{noformat}
2018-04-20 18:38:51,316 [main] ERROR org.apache.pig.tools.pigstats.PigStats -
ERROR 0: org.apache.pig.backend.executionengine.ExecException: ERROR 2997:
Unable to recreate exception from backed error: Error:
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while
executing (Name: c: New For Each(false,false,false)[bag] - scope-21 Operator
Key: scope-21): org.apache.pig.backend.executionengine.ExecException: ERROR
2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction
[Error executing function]
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:315)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:260)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:280)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1949)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2078:
Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error
executing function]
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:358)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:369)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:359)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:408)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
... 12 more
Caused by: java.io.IOException: Error executing function
at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:122)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
... 17 more
Caused by: com.google.inject.ConfigurationException: Guice configuration errors:
1) Unable to method intercept: org.apache.pig.scripting.jython.JythonBag
while locating org.apache.pig.scripting.jython.JythonBag
1 error
at com.google.inject.internal.InjectorImpl.getProvider(InjectorImpl.java:1004)
at com.google.inject.internal.InjectorImpl.getProvider(InjectorImpl.java:961)
at com.google.inject.internal.InjectorImpl.getInstance(InjectorImpl.java:1013)
at org.apache.pig.scripting.jython.JythonUtils.pigToPython(JythonUtils.java:133)
at org.apache.pig.scripting.jython.JythonUtils.pigTupleToPyTuple(JythonUtils.java:153)
at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:116)
... 18 more
Caused by: java.lang.IllegalArgumentException: Cannot subclass final class
class org.apache.pig.scripting.jython.JythonBag
at com.google.inject.internal.cglib.proxy.$Enhancer.generateClass(Enhancer.java:446)
at com.google.inject.internal.cglib.core.$DefaultGeneratorStrategy.generate(DefaultGeneratorStrategy.java:25)
at com.google.inject.internal.cglib.core.$AbstractClassGenerator.create(AbstractClassGenerator.java:216)
at com.google.inject.internal.cglib.proxy.$Enhancer.createHelper(Enhancer.java:377)
at com.google.inject.internal.cglib.proxy.$Enhancer.createClass(Enhancer.java:317)
at com.google.inject.internal.ProxyFactory$ProxyConstructor.<init>(ProxyFactory.java:246)
at com.google.inject.internal.ProxyFactory.create(ProxyFactory.java:172)
at com.google.inject.internal.ConstructorInjectorStore.createConstructor(ConstructorInjectorStore.java:89)
at com.google.inject.internal.ConstructorInjectorStore.access$000(ConstructorInjectorStore.java:28)
at com.google.inject.internal.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:36)
at com.google.inject.internal.ConstructorInjectorStore$1.create(ConstructorInjectorStore.java:32)
at com.google.inject.internal.FailableCache$1.apply(FailableCache.java:39)
at com.google.inject.internal.util.$MapMaker$StrategyImpl.compute(MapMaker.java:549)
at com.google.inject.internal.util.$MapMaker$StrategyImpl.compute(MapMaker.java:419)
at com.google.inject.internal.util.$CustomConcurrentHashMap$ComputingImpl.get(CustomConcurrentHashMap.java:2041)
at com.google.inject.internal.FailableCache.get(FailableCache.java:50)
at com.google.inject.internal.ConstructorInjectorStore.get(ConstructorInjectorStore.java:49)
at com.google.inject.internal.ConstructorBindingImpl.initialize(ConstructorBindingImpl.java:125)
at com.google.inject.internal.InjectorImpl.initializeJitBinding(InjectorImpl.java:521)
at com.google.inject.internal.InjectorImpl.createJustInTimeBinding(InjectorImpl.java:847)
at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive(InjectorImpl.java:772)
at com.google.inject.internal.InjectorImpl.getJustInTimeBinding(InjectorImpl.java:256)
at com.google.inject.internal.InjectorImpl.getBindingOrThrow(InjectorImpl.java:205)
at com.google.inject.internal.InjectorImpl.getInternalFactory(InjectorImpl.java:853)
at com.google.inject.internal.InjectorImpl.getProviderOrThrow(InjectorImpl.java:967)
at com.google.inject.internal.InjectorImpl.getProvider(InjectorImpl.java:1000)
... 23 more
{noformat}
> Prevent deep copy of DataBag into Jython List
> ---------------------------------------------
>
> Key: PIG-5338
> URL: https://issues.apache.org/jira/browse/PIG-5338
> Project: Pig
> Issue Type: Improvement
> Reporter: Greg Phillips
> Assignee: Greg Phillips
> Priority: Major
> Attachments: PIG-5338.patch
>
>
> Pig Python UDFs currently perform deep copies on Bags converting them into
> Jython PyLists. This can cause Jython UDFs to run out of memory and fail. A
> Jython DataBag which extends PyList could allow for iterative access to
> DataBag elements, while only performing a deep copy when necessary.
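
The idea in the issue description (iterate over the bag without copying, and deep-copy only when necessary) can be sketched in plain Python. This is an illustration under stated assumptions, not the code in PIG-5338.patch: `LazyBag`, `source_factory`, and `copied` are hypothetical names, and nothing here uses Pig's or Jython's actual API.

```python
class LazyBag(object):
    """List-like view over a re-iterable source that copies only on demand.

    Hypothetical illustration: `source_factory` stands in for a DataBag,
    i.e. something that can hand out a fresh iterator each time it is asked.
    """

    def __init__(self, source_factory):
        self._source_factory = source_factory
        self._materialized = None  # filled only when random access is needed

    def __iter__(self):
        # Sequential access streams straight from the source: no copy is made.
        if self._materialized is not None:
            return iter(self._materialized)
        return self._source_factory()

    def _materialize(self):
        # The "deep copy when necessary" step: done at most once.
        if self._materialized is None:
            self._materialized = list(self._source_factory())
        return self._materialized

    def __getitem__(self, index):
        # Indexing needs random access, so it forces the copy.
        return self._materialize()[index]

    def __len__(self):
        return len(self._materialize())

    @property
    def copied(self):
        """True once the source has been copied into an in-memory list."""
        return self._materialized is not None


bag = LazyBag(lambda: iter(range(5)))
total = sum(bag)   # iterates without copying
third = bag[2]     # random access triggers the one-time copy
```

A UDF that only loops over the bag never pays for the copy in this sketch; only operations that need random access, such as indexing or `len`, trigger it.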