[ 
https://issues.apache.org/jira/browse/PIG-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226083#comment-13226083
 ] 

Thomas Weise commented on PIG-2576:
-----------------------------------

For the user, this is a regression (or incompatibility) introduced in a 
minor/patch release. Whether the previous behavior of 
UDFContext.getUDFContext().getJobConf() was intended or documented, and what 
workarounds exist (see below), is not really important: it was in place for a 
long time and users have come to rely on it. It is impossible for us to find 
out how many UDFs have been written with this dependency, let alone change them. 

A better way for UDF developers to ensure that initialization runs only in the 
backend is a separate issue. It should be possible to determine that easily and 
in a standard way. The UDF constructor is executed both in the frontend (several 
times) and in mapred (the backend), hence there is a need to determine where the 
code is running (for access to files, etc.). 

UDF developers can work around this by being a bit more "safe" with the 
"where do I run" check:

1) The initialization can be done lazily from exec(...), which is never called 
in the frontend?

2) Instead of just checking for null, do this:

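{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.pig.impl.util.UDFContext;

// A non-null job conf alone is not treated as proof of backend execution;
// additionally require the task id, which is set for a running task.
Configuration jobConf = UDFContext.getUDFContext().getJobConf();
if (jobConf != null && jobConf.get("mapred.task.id") != null) {
   System.err.println("Executing in BACKEND: " + jobConf.get("mapred.task.id"));
} else {
   System.err.println("Executing in FRONTEND: " + jobConf);
}
{code}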

Both of the above work on 0.9.2 and previous releases.
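
For option 1, a minimal sketch of the lazy-initialization pattern might look like the following. It is plain Java so that it can run on its own: the class name, the Supplier-based loader and the load counter are all hypothetical stand-ins, and a real UDF would extend org.apache.pig.EvalFunc and take a Tuple in exec(...). It also rests on the assumption made above that exec(...) is not called in the frontend.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Plain-Java stand-in for a Pig UDF. All names here are hypothetical; a real
// UDF would extend org.apache.pig.EvalFunc and exec would take a Tuple.
class LazyInitUdfSketch {
    // Stands in for reading the file shipped via the distributed cache.
    private final Supplier<Set<String>> loader;
    private Set<String> lookup;   // stays null until the first exec call
    private int loadCount = 0;    // only here to make the behavior observable

    // The constructor may run several times in the frontend, so it does no I/O.
    LazyInitUdfSketch(Supplier<Set<String>> loader) {
        this.loader = loader;
    }

    // Called once per record; assumed to run only in the backend.
    Boolean exec(String input) {
        if (lookup == null) {
            lookup = loader.get();   // expensive initialization happens here, once
            loadCount++;
        }
        return input != null && lookup.contains(input);
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyInitUdfSketch udf = new LazyInitUdfSketch(() -> {
            Set<String> s = new HashSet<>();
            s.add("a");
            return s;
        });
        System.out.println(udf.loadCount()); // 0: constructing did not load anything
        System.out.println(udf.exec("a"));   // true
        System.out.println(udf.exec("b"));   // false
        System.out.println(udf.loadCount()); // 1: loaded exactly once
    }
}
```

The point of the design is that constructing the object never touches the file, so repeated construction in the frontend stays harmless regardless of what getJobConf() returns there.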


                
> Change in behavior for UDFContext.getUDFContext().getJobConf() in front-end
> ---------------------------------------------------------------------------
>
>                 Key: PIG-2576
>                 URL: https://issues.apache.org/jira/browse/PIG-2576
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.3
>            Reporter: Vivek Padmanabhan
>         Attachments: PIG-2576_Script_UDF.txt
>
>
> We read a file in the UDF constructor. (The file is transferred to the 
> compute nodes via the distributed cache.)
> To avoid doing this in the front-end while the script is in the compile stage,
> we differentiate between front-end and back-end execution using the 
> condition ( UDFContext.getUDFContext().getJobConf() == null ).
> This worked up to Pig 0.9.1; in the current Pig 0.9 version it breaks:
> if I have any 'fs' commands after the STORE statement, the GruntParser 
> invokes the UDF constructor again, and the above condition check returns 
> false, causing errors.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
