[ https://issues.apache.org/jira/browse/PIG-3941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip (flip) Kromer updated PIG-3941:
--------------------------------------
    Status: Patch Available  (was: Open)

> Piggybank's Over UDF returns an output schema with named fields
> ---------------------------------------------------------------
>
>                 Key: PIG-3941
>                 URL: https://issues.apache.org/jira/browse/PIG-3941
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>            Reporter: Philip (flip) Kromer
>            Priority: Minor
>              Labels: piggybank, schema, udf, window
>         Attachments: 0001-Over-accepts-a-field-spec-name-and-type-separated-by.patch
>
>
> With the attached patch, if Over is constructed with a colon-delimited string
> like 'bob:int', the first part sets the return field's name and the second
> part sets its type. Otherwise, if only a type is given, the return field is
> named 'result'.
> {code}
> cities = LOAD 'us_city_pops.tsv' AS (city:chararray, state:chararray, pop:int);
> DEFINE IOver org.apache.pig.piggybank.evaluation.Over('state_rk:int');
> -- Stitch is also a piggybank UDF and needs its full class name
> DEFINE Stitch org.apache.pig.piggybank.evaluation.Stitch;
> ranked = FOREACH (GROUP cities BY state) {
>   c_ord = ORDER cities BY pop DESC;
>   -- window runs from the beginning (-1) to the end (-1), on the third field (2)
>   GENERATE FLATTEN(Stitch(c_ord, IOver(c_ord, 'rank', -1, -1, 2)));
> };
> DESCRIBE ranked;
> -- ranked: {stitched::city: chararray,stitched::state: chararray,stitched::pop: int,stitched::state_rk: int}
> DUMP ranked;
> -- ...
> -- (Nashville,Tennessee,609644,2)
> -- (Houston,Texas,2145146,1)
> -- (San Antonio,Texas,1359758,2)
> -- (Dallas,Texas,1223229,3)
> -- (Austin,Texas,820611,4)
> -- ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
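
The naming rule in the issue description (a 'name:type' spec sets both, a bare type falls back to the name 'result') can be sketched in plain Java. This is only an illustration of the rule as described, not the code from the attached patch. The class and method names (FieldSpecSketch, FieldSpec, parse) are hypothetical, the type is kept as a plain string, and no Pig API is used.

```java
/** Hypothetical helper illustrating the rule in PIG-3941; not taken from the patch. */
final class FieldSpecSketch {
    /** Field name used when the spec carries only a type, per the issue text. */
    static final String DEFAULT_NAME = "result";

    /** The two pieces of a spec such as "state_rk:int". */
    static final class FieldSpec {
        final String name;
        final String type;

        FieldSpec(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Splits "name:type" on the first colon; a spec without a colon is a bare type. */
    static FieldSpec parse(String spec) {
        if (spec == null || spec.trim().isEmpty()) {
            throw new IllegalArgumentException("empty field spec");
        }
        String trimmed = spec.trim();
        int colon = trimmed.indexOf(':');
        if (colon < 0) {
            // Only a type was given, so fall back to the default field name.
            return new FieldSpec(DEFAULT_NAME, trimmed);
        }
        String name = trimmed.substring(0, colon).trim();
        String type = trimmed.substring(colon + 1).trim();
        if (name.isEmpty() || type.isEmpty()) {
            // Assumption: a half-empty spec like ":int" is rejected in this sketch.
            throw new IllegalArgumentException("malformed field spec: " + spec);
        }
        return new FieldSpec(name, type);
    }

    public static void main(String[] args) {
        FieldSpec named = parse("state_rk:int");
        System.out.println(named.name + " -> " + named.type); // prints "state_rk -> int"

        FieldSpec bare = parse("int");
        System.out.println(bare.name + " -> " + bare.type);   // prints "result -> int"
    }
}
```

Rejecting a half-empty spec such as ':int' is a choice made for this sketch; the issue text does not say how the patch treats that case.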