[ https://issues.apache.org/jira/browse/PIG-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412502#comment-13412502 ]
Joshua Hartman commented on PIG-2751:
-------------------------------------
I have this exact use case: doing things like filtering out nulls inside
a FOREACH. Using a ternary operator every time is a pain; it would be much
nicer to have a getOrElse-style macro that can be called there.
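As a rough sketch of the workaround in question (the relation names, field names, and the default value 0 are hypothetical, not taken from the issue), the inline ternary currently has to be repeated in every GENERATE that needs it:

{code}
-- Hypothetical input: records with an optional 'val' field.
a = LOAD 'data' AS (id:int, val:int);
-- Replace a null 'val' with a default using the ternary (bincond) operator.
b = FOREACH a GENERATE id, (val IS NULL ? 0 : val) AS val;
DUMP b;
{code}

A macro usable inside GENERATE would let that expression live in one place.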
> Allow macros in FOREACH
> -----------------------
>
> Key: PIG-2751
> URL: https://issues.apache.org/jira/browse/PIG-2751
> Project: Pig
> Issue Type: Improvement
> Components: parser
> Affects Versions: 0.10.0
> Environment: Kubuntu 12.04 64Bit
> Reporter: Johannes Schwenk
>
> I would like to be able to use macros within the GENERATE of a FOREACH.
> Example:
> {code}
> define test_macro(param1, param2) returns ret_val {
> $ret_val = ($param1 == 0 ? $param2 : $param1);
> };
> a = LOAD 'data' AS (id, val1, val2);
> b = FOREACH a GENERATE id, test_macro(val1, val2);
> DUMP b;
> {code}
> This would be most useful for keeping a single point of change (the macro)
> when the definition of a special computation changes. Let's say you have raw
> log data and several scripts loading it. All scripts need to filter out
> specific unused columns. Most (but not all) of the scripts deal with a field
> that needs to be handled in a special way. So I cannot just use two different
> LOAD functions (one with the special computation and one without), because
> that would require a second FOREACH ... GENERATE to filter out the
> unused columns.