[ 
https://issues.apache.org/jira/browse/PIG-5338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454432#comment-16454432
 ] 

Greg Phillips commented on PIG-5338:
------------------------------------

Thanks [~knoguchi]! I was able to run e2e successfully on a small cluster in a 
reasonable amount of time (220 minutes). In addition to resolving the e2e error 
noted before, I've added testing, documentation, and the ability to return a 
native Java DataBag from the Jython UDF. I'm not certain that returning a 
DataBag is the correct way to go; I may instead add more functionality to the 
JythonBag to make it writable if that seems like a better way to proceed. 

> Prevent deep copy of DataBag into Jython List
> ---------------------------------------------
>
>                 Key: PIG-5338
>                 URL: https://issues.apache.org/jira/browse/PIG-5338
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Greg Phillips
>            Assignee: Greg Phillips
>            Priority: Major
>         Attachments: PIG-5338.001.patch, PIG-5338.patch
>
>
> Pig Python UDFs currently perform deep copies on Bags converting them into 
> Jython PyLists. This can cause Jython UDFs to run out of memory and fail. A 
> Jython DataBag which extends PyList could allow for iterative access to 
> DataBag elements, while only performing a deep copy when necessary.
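
The approach in the description above can be illustrated with a minimal sketch in plain Python. This is not the code from the attached patches: `LazyBag`, `copies_made`, and the use of an ordinary list as a stand-in for a Pig DataBag are all hypothetical, chosen only to show iteration without copying and a one-time copy on random access or mutation.

```python
import copy


class LazyBag(object):
    """Hypothetical wrapper: stream from the source, copy only on demand."""

    def __init__(self, source):
        self._source = source    # any re-iterable object standing in for a DataBag
        self._copy = None        # deep-copied list, built only when required
        self.copies_made = 0     # illustration only: counts materializations

    def __iter__(self):
        # Iterative access reads straight from the source; nothing is copied.
        if self._copy is not None:
            return iter(self._copy)
        return iter(self._source)

    def _materialize(self):
        # Deep copy the elements once, the first time a list is really needed.
        if self._copy is None:
            self._copy = [copy.deepcopy(item) for item in self._source]
            self.copies_made += 1
        return self._copy

    def __getitem__(self, index):
        return self._materialize()[index]

    def __len__(self):
        return len(self._materialize())

    def append(self, item):
        self._materialize().append(item)


bag = LazyBag([(1, "a"), (2, "b"), (3, "c")])
total = sum(t[0] for t in bag)   # streams over the source, no copy yet
second = bag[1]                  # random access triggers the single copy
```

In this sketch a UDF that only loops over the bag never pays for the copy, which is the memory saving the issue is after; whether writes should go through such a wrapper or through a returned native DataBag is the open question raised in the comment.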



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
