[ https://issues.apache.org/jira/browse/PIG-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15277318#comment-15277318 ]
Caleb Holtzinger commented on PIG-3268:
---------------------------------------

Hi Cheolsoo,

It doesn't look like MATCHES is supported by your syntax. For instance, the following does not work:
{code}
CASE e1
  WHEN 'a' THEN 'alpha'
  WHEN MATCHES 'b.*' THEN 'alpha'
  WHEN 'c' THEN 'alpha'
  WHEN 'd' THEN 'alpha'
  ELSE 'numeric'
END
{code}
whereas this works:
{code}
CASE
  WHEN e1 == 'a' THEN 'alpha'
  WHEN e1 MATCHES 'b.*' THEN 'alpha'
  WHEN e1 == 'c' THEN 'alpha'
  WHEN e1 == 'd' THEN 'alpha'
  ELSE 'numeric'
END
{code}

> Case statement support
> ----------------------
>
>                 Key: PIG-3268
>                 URL: https://issues.apache.org/jira/browse/PIG-3268
>             Project: Pig
>          Issue Type: New Feature
>          Components: internal-udfs, parser
>   Affects Versions: 0.11
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12.0
>
>         Attachments: PIG-3268-2.patch, PIG-3268-3.patch, PIG-3268-4.patch,
> PIG-3268-5.patch, PIG-3268-6.patch, PIG-3268-7.patch, PIG-3268.patch
>
>
> Currently, Pig has no support for a case statement. To mimic it, users often
> use nested bincond operators. However, that easily becomes unreadable when
> there are multiple levels of nesting.
> For example,
> {code}
> a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
> b = FOREACH a GENERATE (
>     i % 3 == 0 ? '3n' : (i % 3 == 1 ? '3n + 1' : '3n + 2')
> );
> {code}
> This can be rewritten much more nicely using a case statement as follows:
> {code}
> a = LOAD '1.txt' USING PigStorage(',') AS (i:int);
> b = FOREACH a GENERATE (
>     CASE i % 3
>         WHEN 0 THEN '3n'
>         WHEN 1 THEN '3n + 1'
>         ELSE '3n + 2'
>     END
> );
> {code}
> I propose that we implement the case statement in the following manner:
> * Add built-in UDFs that take expressions as args. Taking the aforementioned
> case statement as an example, we can define a UDF such as
> {{builtInUdf(i % 3, 0, '3n', 1, '3n + 1', '3n + 2')}}.
> * Add syntactical sugar for these built-in UDFs.
> In fact, I borrowed this idea from HIVE-164.
> One downside of this approach is that all the possible argument schemas of these
> UDFs must be pre-computed. Specifically, we need to populate the full list of
> possible argument schemas in {{EvalFunc.getArgToFuncMapping}}.
> In particular, since we obviously cannot support infinitely long argument lists, it is
> necessary to impose a limit on the number of WHEN branches. For now, I
> arbitrarily chose 50, but it can be easily changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
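
To illustrate the UDF-based proposal in the quoted description, here is a minimal Java sketch of how a flat argument list such as {{builtInUdf(i % 3, 0, '3n', 1, '3n + 1', '3n + 2')}} could be evaluated. It is plain Java with no Pig dependency; the class name {{CaseUdfSketch}}, the method {{evalCase}}, and the constant {{MAX_WHEN_BRANCHES}} are hypothetical names chosen for illustration and are not taken from the attached patches.

```java
import java.util.Objects;

// Hypothetical sketch only: illustrates the flat-argument idea from the
// issue description. It is not Pig's EvalFunc API or the patch's code.
class CaseUdfSketch {
    // The description mentions an arbitrary limit of 50 WHEN branches.
    static final int MAX_WHEN_BRANCHES = 50;

    // Assumed argument layout:
    //   [caseExpr, when1, then1, ..., whenN, thenN, (optional) elseValue]
    static Object evalCase(Object... args) {
        if (args.length < 3) {
            throw new IllegalArgumentException(
                "need a case expression and at least one WHEN/THEN pair");
        }
        int branches = (args.length - 1) / 2;
        if (branches > MAX_WHEN_BRANCHES) {
            throw new IllegalArgumentException("too many WHEN branches");
        }
        Object subject = args[0];
        // Walk the WHEN/THEN pairs; the first WHEN equal to the subject wins.
        for (int i = 1; i + 1 < args.length; i += 2) {
            if (Objects.equals(subject, args[i])) {
                return args[i + 1];
            }
        }
        // An even total length means a trailing ELSE value is present.
        return (args.length % 2 == 0) ? args[args.length - 1] : null;
    }

    public static void main(String[] unused) {
        // Mirrors the CASE i % 3 example from the description for i = 0..5.
        for (int i = 0; i < 6; i++) {
            System.out.println(i + " -> "
                + evalCase(i % 3, 0, "3n", 1, "3n + 1", "3n + 2"));
        }
    }
}
```

In a desugaring like this, each WHEN value is only compared for equality with the case expression. That is consistent with the observation in the comment that a {{WHEN MATCHES 'b.*'}} branch needs the form in which every WHEN carries a full condition.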