[ https://issues.apache.org/jira/browse/PIG-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair resolved PIG-1693.
--------------------------------

    Resolution: Fixed
  Release Note:
Project-range ( '..' ) can be used to project a range of columns from the input.
For example, the expressions -
..$x : projects columns $0 through $x, inclusive
$x.. : projects columns $x through end, inclusive
$x..$y : projects columns $x through $y, inclusive

If the input relation has a schema, you can also use column aliases instead of referring to columns by position. You can also combine aliases and column positions in a project-range expression (i.e., "col1 .. $5" is valid).

This expression can be used in all cases where the use of '*' (project-star) is allowed, except as a UDF argument. Support for that use case will be added in PIG-1938.

It can be used in the following statements -
- foreach
- join
- order (also when it is within a nested foreach block)
- group/cogroup

Examples -
{code}
grunt> F = foreach IN generate (int)col0, col1 .. col3;
grunt> describe F;
F: {col0: int,col1: bytearray,col2: bytearray,col3: bytearray}
{code}

{code}
grunt> SORT = order IN by col2 .. col3, col0, col4 ..;
{code}

{code}
J = join IN1 by $0 .. $3, IN2 by $0 .. $3;
{code}

{code}
g = group l1 by b .. c;
{code}

Limitations:
There are some restrictions on the use of the project-to-end form of project-range (e.g. "x .. ") when the input schema is null (unknown). These are also the cases where the use of project-star ('*') is restricted.
1. In cogroup/group statements, the project-to-end form of project-range is allowed only if the input has a schema.
2. In an order-by statement, if the input schema is null, the project-to-end form of project-range is supported only as the last sort column.
Note: because of a bug (PIG-1939), the use is currently restricted even when a schema is present. That should be fixed soon.

Example -
{code}
grunt> describe IN;
Schema for IN unknown.

-- Following statement is supported
SORT = order IN by $2 .. $3, $6 ..;

-- Following statement is NOT supported
SORT = order IN by $2 .. $3, $6 ..;
{code}

Patch committed to trunk.

> support project-range expression. (was: There needs to be a way in foreach to indicate "and all the rest of the fields" )
> ---------------------------------------------------------------------------
>
>                 Key: PIG-1693
>                 URL: https://issues.apache.org/jira/browse/PIG-1693
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Thejas M Nair
>             Fix For: 0.9.0
>
>         Attachments: PIG-1693.1.patch, PIG-1693.2.patch
>
>
> A common use case we see in Pig is that people have many columns in their data and
> want to operate on only a few of them. Consider, for example, a user who wants to
> cast one column before storing data with ten columns:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, secondcol, thridcol, forthcol,
> fifthcol, sixthcol, seventhcol, eigthcol, ninethcol, tenthcol;
> store Z into 'output';
> {code}
> Obviously this only gets worse as the user has more columns. Ideally the
> above could be transformed to something like:
> {code}
> ...
> Z = foreach Y generate (int)firstcol, "and all the rest";
> store Z into 'output'
> {code}
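
The inclusive range semantics described in the release note can be modelled in a few lines of Python. This is only an illustrative sketch, not Pig's implementation: the function name `expand_range`, its arguments, and the choice to reject a range whose start lies after its end are all assumptions made for the example. It needs a known schema, which mirrors the release note's point that aliases are only usable when the input relation has one.

```python
def expand_range(schema, start=None, end=None):
    """Model a project-range expression over a known schema (hypothetical helper).

    schema: list of column aliases in positional order.
    start, end: an alias (str), a position (int, like $n), or None for an
    open side. Both bounds are inclusive, as in the release note.
    """
    def position(ref, default):
        if ref is None:
            return default              # open side: '..$x' or '$x..'
        if isinstance(ref, int):
            if not 0 <= ref < len(schema):
                raise IndexError(f"no column at position {ref}")
            return ref                  # positional reference such as $3
        return schema.index(ref)        # alias; ValueError if unknown

    lo = position(start, 0)
    hi = position(end, len(schema) - 1)
    if lo > hi:
        # Assumption of this sketch: a backwards range is an error.
        raise ValueError("range start is after range end")
    return schema[lo:hi + 1]


cols = ["col0", "col1", "col2", "col3", "col4"]
print(expand_range(cols, "col1", "col3"))  # col1 .. col3 -> ['col1', 'col2', 'col3']
print(expand_range(cols, None, 2))         # .. $2        -> ['col0', 'col1', 'col2']
print(expand_range(cols, 3, None))         # $3 ..        -> ['col3', 'col4']
print(expand_range(cols, "col1", 3))       # col1 .. $3   -> ['col1', 'col2', 'col3']
```

Applied to the ten-column example in the quoted description, the project-to-end form starting at `secondcol` would select the nine columns after `firstcol`, which is the "and all the rest" the reporter asked for.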