[ 
https://issues.apache.org/jira/browse/PIG-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192327#comment-13192327
 ] 

Russell Jurney commented on PIG-2490:
-------------------------------------

I think this is fantastic.
                
> Add UDF function chaining syntax
> --------------------------------
>
>                 Key: PIG-2490
>                 URL: https://issues.apache.org/jira/browse/PIG-2490
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: David Ciemiewicz
>
> Nested function/UDF calls can make for very convoluted data transformations.
> For example, given the following sample data:
> {code}
> business1     9:00 AM - 4:00 PM
> {code}
> Normalizing the hours to "9:00a-4:00p" with Pig UDFs might look like the 
> following:
> {code}
> B = foreach A generate
>     REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(hours,' AM','a'), ' PM', 'p'), ' *- *', '-')
>         as hours_normalized;
> {code}
> Yes, you could recast this as the following, but it's still rather convoluted.
> {code}
> B = foreach A {
>     hours1 = REGEXREPLACE(hours,' AM\\b','a');
>     hours2 = REGEXREPLACE(hours1,' PM\\b','p');
>     hours3 = REGEXREPLACE(hours2,' *- *','-');
>     generate
>     hours3 as hours_normalized;
>     };
> {code}
> I suggest an "object-style" function chaining enhancement to the grammar a la 
> Java, JavaScript, etc.
> {code}
> B = foreach A generate
>     REGEXREPLACE(hours,' AM\\b','a').REGEXREPLACE(' PM\\b','p').REGEXREPLACE(' *- *','-')
>         as hours_normalized;
> {code}
> This chaining notation makes it much clearer as to the sequence of actions 
> without the convoluted nesting.
> In the case of the "object-method" style dot (.) notation, the result of the 
> prior expression is just used as the first value in the tuple passed to the 
> function call.
> In other words, the following two expressions would be equivalent:
> {code}
> f(a,b)
> a.f(b)
> {code}
> As such, I don't think there are any requirements to modify existing UDFs.
> I think this is just a syntactic "sugar" enhancement that should be fairly 
> trivial to implement, yet would make coding complex data transformations with 
> Pig UDFs "cleaner".
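
The f(a,b) / a.f(b) equivalence described above can be sketched outside of Pig.
The following is a minimal Python illustration, not Pig code and not a proposed
implementation: regex_replace is a hypothetical stand-in for the REGEXREPLACE
UDF (assumed here to replace every match), and chain is a hypothetical helper
that passes each intermediate result as the first argument of the next call.

```python
import re
from functools import reduce


def regex_replace(text, pattern, replacement):
    # Hypothetical stand-in for the REGEXREPLACE UDF from the issue;
    # assumed to replace every match of pattern in text.
    return re.sub(pattern, replacement, text)


def chain(value, *calls):
    # Each call is a tuple (function, arg1, arg2, ...).
    # The running result is passed as the FIRST argument of the next
    # function, which is the a.f(b) == f(a, b) rule from the issue.
    return reduce(lambda acc, call: call[0](acc, *call[1:]), calls, value)


hours = "9:00 AM - 4:00 PM"

# Nested form, mirroring the nested REGEXREPLACE calls.
nested = regex_replace(
    regex_replace(regex_replace(hours, r" AM\b", "a"), r" PM\b", "p"),
    r" *- *",
    "-",
)

# Chained form, mirroring the proposed dot notation.
chained = chain(
    hours,
    (regex_replace, r" AM\b", "a"),
    (regex_replace, r" PM\b", "p"),
    (regex_replace, r" *- *", "-"),
)
```

Both forms call the same unmodified function and produce the same normalized
string, which matches the claim that existing UDFs would not need to change.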
