[ 
https://issues.apache.org/jira/browse/PIG-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412502#comment-13412502
 ] 

Joshua Hartman commented on PIG-2751:
-------------------------------------

I have this exact use case when trying to do things like filter out nulls in 
a FOREACH. It's a pain to write a ternary operator every time; it would be 
much nicer to have some kind of getOrElse macro that could be called there.
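
For illustration, the per-field ternary in question looks roughly like the 
sketch below. The relation, the field names, and the default value 0 are 
made up for the example and are not taken from a real script.

{code}
-- Hypothetical relation and field names, for illustration only.
a = LOAD 'data' AS (id:int, val1:int, val2:int);
-- Each nullable field needs its own ternary (bincond) to fall back to a default.
b = FOREACH a GENERATE
      id,
      (val1 IS NULL ? 0 : val1) AS val1,
      (val2 IS NULL ? 0 : val2) AS val2;
DUMP b;
{code}

Every additional nullable field repeats the same expression, which is the 
repetition a macro callable from GENERATE would remove.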
                
> Allow macros in FOREACH
> -----------------------
>
>                 Key: PIG-2751
>                 URL: https://issues.apache.org/jira/browse/PIG-2751
>             Project: Pig
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 0.10.0
>         Environment: Kubuntu 12.04 64Bit
>            Reporter: Johannes Schwenk
>
> I would like to be able to use macros within the GENERATE of a FOREACH.
> Example:
> {code}
> define test_macro(param1, param2) returns ret_val {
>   -- macro parameters are referenced with a leading $
>   $ret_val = ($param1 == 0 ? $param2 : $param1);
> };
> a = LOAD 'data' AS (id, val1, val2);
> b = FOREACH a GENERATE id, test_macro(val1, val2);
> DUMP b;
> {code}
> This would be most useful for having only a single point to edit (the macro) 
> if a definition for a special computation changes. Let's say you have raw log 
> data and several scripts loading it. All scripts need to filter out specific 
> unused columns. Most (but not all) of the scripts are dealing with a field 
> that needs to be handled in a special way. So I cannot just use two different 
> LOAD functions (one with the special computation and one without) because 
> that would make a second FOREACH ... GENERATE necessary to filter out the 
> unused columns.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
