[ https://issues.apache.org/jira/browse/PIG-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192327#comment-13192327 ]
Russell Jurney commented on PIG-2490:
-------------------------------------
I think this is fantastic.
> Add UDF function chaining syntax
> --------------------------------
>
> Key: PIG-2490
> URL: https://issues.apache.org/jira/browse/PIG-2490
> Project: Pig
> Issue Type: Improvement
> Reporter: David Ciemiewicz
>
> Nested function/UDF calls can make for very convoluted data transformations.
> For example, given the following sample data:
> {code}
> business1 9:00 AM - 4:00 PM
> {code}
> Transforming it with Pig UDFs to normalize the hours to "9:00a-4:00p" might
> look like the following:
> {code}
> B = foreach A generate
> REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(hours,' AM','a'), ' PM', 'p'), ' *- *', '-')
> as hours_normalized;
> {code}
> Yes, you could recast this as follows, but it's still rather convoluted.
> {code}
> B = foreach A {
> hours1 = REGEXREPLACE(hours,' AM\\b','a');
> hours2 = REGEXREPLACE(hours1,' PM\\b','p');
> hours3 = REGEXREPLACE(hours2,' *- *','-');
> generate
> hours3 as hours_normalized;
> };
> {code}
> I suggest an "object-style" function chaining enhancement to the grammar a la
> Java, JavaScript, etc.
> {code}
> B = foreach A generate
> REGEXREPLACE(hours,' AM\\b','a').REGEXREPLACE(' PM\\b','p').REGEXREPLACE(' *- *','-')
> as hours_normalized;
> {code}
> This chaining notation makes the sequence of actions much clearer, without
> the convoluted nesting.
> In the case of the "object-method" style dot (.) notation, the result of the
> prior expression is just used as the first value in the tuple passed to the
> function call.
> In other words, the following two expressions would be equivalent:
> {code}
> f(a,b)
> a.f(b)
> {code}
> As such, I don't think there are any requirements to modify existing UDFs.
> I think this is just a syntactic "sugar" enhancement that should be fairly
> trivial to implement, yet would make coding complex data transformations with
> Pig UDFs "cleaner".
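The equivalence of f(a,b) and a.f(b) described above can be sketched outside of Pig. The following is a minimal Python emulation, not Pig code and not an implementation of the proposal: `regex_replace`, `Chain`, and `UDFS` are hypothetical names introduced here for illustration, with `regex_replace` standing in for the REGEXREPLACE UDF used in the examples. It only shows that passing the prior result as the first argument gives the same value as the nested calls.

```python
import re


def regex_replace(text, pattern, replacement):
    """Hypothetical stand-in for the REGEXREPLACE UDF in the proposal."""
    return re.sub(pattern, replacement, text)


class Chain:
    """Wraps a value so that value.f(b) is evaluated as f(value, b)."""

    def __init__(self, value, functions):
        self.value = value
        self._functions = functions

    def __getattr__(self, name):
        # Only reached for names that are not regular attributes.
        if name.startswith("_") or name not in self._functions:
            raise AttributeError(name)
        func = self._functions[name]

        def call(*args):
            # The prior result becomes the first argument of the next call.
            return Chain(func(self.value, *args), self._functions)

        return call


# Assumed registry of available functions, keyed by the name used in the chain.
UDFS = {"REGEXREPLACE": regex_replace}

hours = "9:00 AM - 4:00 PM"

# Nested style: f(f(f(a, ...), ...), ...)
nested = regex_replace(
    regex_replace(regex_replace(hours, r" AM\b", "a"), r" PM\b", "p"),
    r" *- *",
    "-",
)

# Chained style: a.f(...).f(...).f(...)
chained = (
    Chain(hours, UDFS)
    .REGEXREPLACE(r" AM\b", "a")
    .REGEXREPLACE(r" PM\b", "p")
    .REGEXREPLACE(r" *- *", "-")
    .value
)

print(nested)   # 9:00a-4:00p
print(chained)  # 9:00a-4:00p
```

Both styles call the same unmodified function, which is consistent with the point that existing UDFs would not need to change; in Pig itself the rewrite would presumably happen in the grammar rather than at runtime as it does in this sketch.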