[ https://issues.apache.org/jira/browse/PIG-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601801#comment-13601801 ]
Alan Gates commented on PIG-3247:
---------------------------------
Basic OVER functionality can be accomplished in Pig using GROUP BY and FOREACH
FLATTEN. For example:
{code}
select s, min(i) over (partition by s) from T
{code}
is done in Pig as:
{code}
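-- The schema below is assumed for illustration; s and i must be named to be referenced.
A = load 'T' as (s:chararray, i:int);
-- Collect all rows that share the same s into one bag per group.
B = group A by s;
-- Emit every original row alongside the minimum of i for its group.
C = foreach B generate flatten(A), MIN(A.i) as min;
D = foreach C generate A::s, min;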
{code}
But as soon as a windowing clause is added, this no longer works: the function
must be called once for each row in the bag, and each call should receive only
a subset of the bag. To address this I've added two new functions:
Stitch - Given multiple bags this stitches them together row by row. So if you
have two bags:
{code}
bag A:
{ (1, 2),
(3, 4) }
bag B:
{ (a, b),
(c, d) }
{code}
Then Stitch(A, B) will return
{code}
{ (1, 2, a, b),
(3, 4, c, d) }
{code}
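To spell out the row-by-row semantics, here is a minimal Python sketch of what Stitch does to the two bags above. It only models the behavior described in this comment, with bags as lists of tuples. The name `stitch` is illustrative and this is not the Piggybank implementation. How bags of unequal length are handled is not stated above, so the sketch simply stops at the shortest bag.

```python
def stitch(*bags):
    """Join bags row by row: the i-th tuples of every bag are concatenated."""
    # zip stops at the shortest bag; that choice is an assumption of this sketch.
    return [sum(rows, ()) for rows in zip(*bags)]


bag_a = [(1, 2), (3, 4)]
bag_b = [("a", "b"), ("c", "d")]

# Each output tuple holds the fields of A's row followed by the fields of B's row.
stitched = stitch(bag_a, bag_b)
```

Here `stitched` holds the same two four-field tuples shown in the expected result above.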
Over - Implements the standard SQL windowing and analytic functions, including
rank, dense_rank, cume_dist, percent_rank, ntile, first_value, last_value,
lead, and lag. Together, Stitch and Over can be used to do windowing and
analytic functions in Pig.
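The windowing requirement (one call per row, each on a subset of the bag) can be illustrated with a small Python sketch. It is a model of the idea only, not the Piggybank code: the function name `over`, its `preceding`/`following` parameters, and the use of `None` for an unbounded side are all assumptions made for this example, and like the UDFs it holds the whole partition in memory.

```python
def over(values, func, preceding, following):
    """Call func once per row on the window of rows around that row.

    preceding/following count rows before/after the current row;
    None means unbounded in that direction (a convention of this sketch).
    """
    out = []
    for i in range(len(values)):
        lo = 0 if preceding is None else max(0, i - preceding)
        hi = len(values) if following is None else min(len(values), i + following + 1)
        # Wrap each result in a one-field tuple so it can be joined to its row.
        out.append((func(values[lo:hi]),))
    return out


# One partition, already ordered; each row is (s, i).
partition = [("x", 5), ("x", 3), ("x", 8), ("x", 1)]
i_column = [row[1] for row in partition]

# In spirit: min(i) over (rows between 1 preceding and current row)
windowed = over(i_column, min, preceding=1, following=0)

# Attach each result to its source row, the way Stitch joins bags row by row.
result = [row + w for row, w in zip(partition, windowed)]
```

With an unbounded window on both sides, every row sees the whole partition, which is the case the GROUP BY and FLATTEN approach already covers.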
Pig already has rank and dense_rank, and these functions are in no way meant
to replace them. They are meant to mimic the SQL functionality exactly. Also,
these functions make no allowance for large sets that don't fit in memory on a
single reducer.
> Piggybank functions to mimic OVER clause in SQL
> ---
>
> Key: PIG-3247
> URL: https://issues.apache.org/jira/browse/PIG-3247
> Project: Pig
> Issue Type: New Feature
> Components: piggybank
> Reporter: Alan Gates
> Assignee: Alan Gates
>
> In order to test Hive I have written some UDFs to mimic the behavior of SQL's
> OVER clause. I thought they would be useful to share.