[ 
https://issues.apache.org/jira/browse/PIG-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205084#comment-13205084
 ] 

Dmitriy V. Ryaboy commented on PIG-2511:
----------------------------------------

Couple of things:

First, you are using the tag "newbie" incorrectly :). I know you are trying to 
be self-effacing and say "I am a newbie" but actually that tag is intended to 
be "If I am a newbie and want to contribute to Pig, what JIRAs should I tackle 
first while I get the lay of the land?".  This is clearly not one of them.

Second, I think this feature would be oh-my-god confusing to users. The example 
Thejas used above illustrates the point nicely, actually -- we use a udf of a, 
b, c, and some constant to get a field called c, then project "the rest" -- 
with "the rest" being defined as "anything that doesn't have a name conflict". 
But "the rest" could just as easily mean "the rest of the columns I didn't use" 
(so, just d). It also changes the script if you rename one alias -- say, you 
realize you didn't want to call the result of the udf c, but instead want to 
call it processed_c, and all of a sudden the number of columns produced 
changes, and their respective ordinals shift. It'll be a nightmare.

Just use a new name when generating your derived column. It's derived, after 
all.

I'd be ok with some syntax that would indicate columns to *not* generate 
("generate *^a^b"?), but the proposed syntax is fraught with peril.
                
> Enable '*' to skip any fields that have already been generated and cast in 
> other parts of the GENERATE, as in: foo = FOREACH my_relation GENERATE 
> manipulate(foo1) as foo1, *;
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2511
>                 URL: https://issues.apache.org/jira/browse/PIG-2511
>             Project: Pig
>          Issue Type: New Feature
>          Components: grunt, parser
>    Affects Versions: 0.9.1
>            Reporter: Russell Jurney
>              Labels: grunt, latin, newbie, pig
>
> This should work:
> grunt> good_dates = foreach filtered generate CustomFormatToISO(date, 'EEE, 
> dd MMM yyyy HH:mm:ss Z') AS date, *;
> 2012-02-06 14:56:23,286 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 1108: 
> <line 8, column 30> Duplicate schema alias: date
> 2012-02-06 14:56:23,286 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
> org.apache.pig.impl.plan.PlanValidationException: ERROR 1108: 
> <line 8, column 30> Duplicate schema alias: date
>       at 
> org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.validate(SchemaAliasVisitor.java:74)
>       at 
> org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:104)
>       at 
> org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
>       at 
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>       at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>       at 
> org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:99)
>       at 
> org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:74)
>       at 
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>       at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>       at org.apache.pig.PigServer$Graph.compile(PigServer.java:1661)
>       at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>       at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>       at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>       at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>       at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>       at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>       at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>       at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>       at org.apache.pig.Main.run(Main.java:495)
>       at org.apache.pig.Main.main(Main.java:111)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to