[ 
https://issues.apache.org/jira/browse/PIG-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226299#comment-13226299
 ] 

Thomas Weise commented on PIG-2344:
-----------------------------------

Serialization alone would not help in situations where UDF exec(..) depends on 
state that needs to be initialized where exec(..) runs. One current workaround 
is to initialize that state lazily from exec(..), which guarantees it happens 
where the action is.
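
To illustrate the lazy workaround, here is a minimal sketch. It is not Pig 
code: LazyLookupUdf, loadLookup and initCount are hypothetical names, and the 
class stands in for a real UDF (which would extend Pig's EvalFunc) so that the 
example stays self-contained.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a UDF; a real one would extend Pig's EvalFunc.
class LazyLookupUdf implements Serializable {
    private static final long serialVersionUID = 1L;

    // Constructor argument: cheap, serializable configuration.
    private final String resourceName;

    // State that only makes sense where exec(..) runs. It is transient, so
    // serializing the UDF would not carry it along; it is rebuilt lazily.
    private transient Map<String, String> lookup;

    // Counts how often the lazy initialization actually ran (for illustration).
    int initCount = 0;

    LazyLookupUdf(String resourceName) {
        this.resourceName = resourceName;
    }

    String exec(String key) {
        if (lookup == null) {
            // First call on this instance: build the state here, in the
            // process that executes exec(..).
            lookup = loadLookup(resourceName);
            initCount++;
        }
        return lookup.getOrDefault(key, "unknown");
    }

    // Assumption: stands in for e.g. reading a local file on the backend.
    private static Map<String, String> loadLookup(String name) {
        Map<String, String> m = new HashMap<>();
        m.put("source", name);
        return m;
    }
}
```

The cost of this pattern is a null check on every exec(..) call and no 
natural place for cleanup.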

In discussing solutions with Ashutosh, we identified the following needs:

Pig should construct the UDF through its default (no-arg) ctor. No more 
repeated instantiation through a UDF ctor with arguments.

Pig should call initialize(...) in the frontend, with the arguments provided 
for the UDF.

Pig should call preExec() once in the backend; this would be the place where 
things like local file system access can take place.

Probably there should also be a postExec() hook for any cleanup to be done.
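
The lifecycle described above could be sketched roughly as follows. Only the 
hook names initialize, preExec and postExec come from this comment; the 
signatures, the LifecycleUdf interface, PrefixUdf and LifecycleDriver are 
assumptions made for illustration, and the driver merely simulates what Pig 
would do in the frontend and backend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical contract; the signatures are assumptions, not an existing Pig API.
interface LifecycleUdf<I, O> {
    void initialize(String... args); // frontend, with the UDF's arguments
    void preExec();                  // backend, once before the first exec
    O exec(I input);                 // backend, once per input
    void postExec();                 // backend, once for cleanup
}

class PrefixUdf implements LifecycleUdf<String, String> {
    // Records the order of lifecycle calls (for illustration only).
    final List<String> calls = new ArrayList<>();
    private String prefix = "";
    private StringBuilder buffer; // backend-only working state

    // Only a no-arg ctor; arguments arrive through initialize(...).
    public PrefixUdf() {}

    @Override public void initialize(String... args) {
        calls.add("initialize");
        if (args.length > 0) prefix = args[0];
    }

    @Override public void preExec() {
        calls.add("preExec");
        buffer = new StringBuilder(); // e.g. local file access would go here
    }

    @Override public String exec(String input) {
        calls.add("exec");
        buffer.setLength(0);
        return buffer.append(prefix).append(input).toString();
    }

    @Override public void postExec() {
        calls.add("postExec");
        buffer = null;
    }
}

// Simulates the call order in a single process; in Pig, initialize would run
// in the frontend and the remaining hooks in the backend.
class LifecycleDriver {
    static <I, O> List<O> run(LifecycleUdf<I, O> udf, String[] args, List<I> inputs) {
        udf.initialize(args);
        udf.preExec();
        List<O> out = new ArrayList<>();
        try {
            for (I in : inputs) out.add(udf.exec(in));
        } finally {
            udf.postExec(); // cleanup runs even if exec throws
        }
        return out;
    }
}
```

With hooks like these, exec(..) no longer needs a lazy-initialization check, 
and cleanup has a defined place.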

Finally, we need to address backward compatibility as well, so that existing 
UDFs don't suddenly stop working.

                
> UDF / LoadFunc / StoreFunc should be serializable
> -------------------------------------------------
>
>                 Key: PIG-2344
>                 URL: https://issues.apache.org/jira/browse/PIG-2344
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Ashutosh Chauhan
>
> If there is a redesign, this should be a requirement. We will do away with 
> all the saving of state that was created in the frontend and then recreating 
> the same state in the backend.


        
