Cheolsoo Park created PIG-3268:
----------------------------------

             Summary: Case statement support
                 Key: PIG-3268
                 URL: https://issues.apache.org/jira/browse/PIG-3268
             Project: Pig
          Issue Type: New Feature
          Components: internal-udfs, parser
    Affects Versions: 0.11
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
             Fix For: 0.12


Currently, Pig has no support for a case statement. To mimic one, users often 
use nested bincond operators. However, these quickly become unreadable when 
there are multiple levels of nesting.

For example,
{code}
a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FOREACH a GENERATE (
    i % 3 == 0 ? '3n' : (i % 3 == 1 ? '3n + 1' : '3n + 2')
);
{code}
This can be rewritten more readably using a case statement as follows:
{code}
a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
b = FOREACH a GENERATE (
    CASE i % 3
        WHEN 0 THEN '3n'
        WHEN 1 THEN '3n + 1'
        ELSE        '3n + 2'
    END
);
{code}
I propose that we implement the case statement in the following manner:
* Add built-in UDFs that take expressions as args. For example, the case 
statement above could be expressed as a UDF call such as {{builtInUdf(i % 3, 
0, '3n', 1, '3n + 1', '3n + 2')}}.
* Add syntactic sugar for these built-in UDFs.
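
As a rough illustration of how such a flattened argument list could be evaluated, here is a minimal plain-Java sketch. It does not use the Pig UDF API, and the class name {{CaseSketch}}, the method {{eval}}, and the assumed argument layout (expression, then WHEN/THEN pairs, then an optional ELSE value) are all hypothetical, not part of the proposal itself.

```java
import java.util.Objects;

// Hypothetical sketch only; this is not Pig's built-in UDF.
class CaseSketch {
    // Assumed layout: expr, when1, then1, ..., whenN, thenN[, elseValue]
    static Object eval(Object... args) {
        if (args.length < 3) {
            throw new IllegalArgumentException(
                "need an expression and at least one WHEN/THEN pair");
        }
        Object expr = args[0];
        int rest = args.length - 1;      // everything after the expression
        int pairs = rest / 2;            // number of WHEN/THEN pairs
        boolean hasElse = rest % 2 == 1; // an odd remainder means a trailing ELSE

        // Return the THEN value of the first WHEN that equals the expression.
        for (int k = 1; k < 1 + 2 * pairs; k += 2) {
            if (Objects.equals(expr, args[k])) {
                return args[k + 1];
            }
        }
        // No WHEN matched: fall back to ELSE if present, otherwise null.
        return hasElse ? args[args.length - 1] : null;
    }

    public static void main(String[] unused) {
        for (int i = 0; i < 6; i++) {
            System.out.println(i + " -> " + eval(i % 3, 0, "3n", 1, "3n + 1", "3n + 2"));
        }
    }
}
```

Returning null when nothing matches and no ELSE is given is a choice made for this sketch; the proposal does not specify that behavior.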

In fact, I borrowed this idea from HIVE-164. 

One downside of this approach is that all possible argument schemas of these 
UDFs must be pre-computed. Specifically, we need to populate the full list of 
possible argument schemas in {{EvalFunc.getArgToFuncMapping}}.

In particular, since we cannot enumerate schemas for arbitrarily long argument 
lists, it is necessary to impose a limit on the number of nesting levels. For 
now, I arbitrarily chose 50, but it can be easily changed.
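
To give a sense of why a limit makes the schema list finite, the following plain-Java sketch enumerates the argument counts that would need a schema entry. It assumes the limit counts WHEN/THEN pairs and that the ELSE value is optional; both are my reading, not something stated above. The class {{CaseArities}} and method {{arities}} are hypothetical names, and argument types are ignored entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; it counts argument-list lengths, not real Pig schemas.
class CaseArities {
    // Assumed layout: expr, then n WHEN/THEN pairs, then an optional ELSE value.
    static List<Integer> arities(int maxBranches) {
        List<Integer> out = new ArrayList<>();
        for (int n = 1; n <= maxBranches; n++) {
            out.add(1 + 2 * n);     // expr + n pairs, no ELSE
            out.add(1 + 2 * n + 1); // expr + n pairs + ELSE
        }
        return out;
    }

    public static void main(String[] unused) {
        List<Integer> a = arities(50);
        System.out.println(a.size() + " argument counts, from "
            + a.get(0) + " to " + a.get(a.size() - 1));
    }
}
```

Under these assumptions, each combination of argument types would need one entry per argument count, which is why the list grows with the limit.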

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
