Adam, thanks for the comments. The patch is below (it's short enough to
just paste inline):

Your comments are welcome, and I'd be curious what others think as well.
What worries me is that this blurs the line between bags and relations,
but at the same time, that distinction is one of the things people
confuse the most.


Index: test/org/apache/pig/test/TestEvalPipeline.java
===================================================================
--- test/org/apache/pig/test/TestEvalPipeline.java    (revision 1244760)
+++ test/org/apache/pig/test/TestEvalPipeline.java    (working copy)
@@ -383,7 +383,7 @@
         pigServer.registerQuery("A = LOAD '"
                 + Util.generateURI(tmpFile.toString(), pigContext) + "';");
         if (eliminateDuplicates){
-            pigServer.registerQuery("B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;");
+            pigServer.registerQuery("B = DISTINCT A.$0 PARALLEL 10;");
         }else{
             if(!useUDF) {
                 pigServer.registerQuery("B = ORDER A BY $0 PARALLEL 10;");
Index: test/org/apache/pig/test/TestEvalPipelineLocal.java
===================================================================
--- test/org/apache/pig/test/TestEvalPipelineLocal.java    (revision 1244760)
+++ test/org/apache/pig/test/TestEvalPipelineLocal.java    (working copy)
@@ -400,7 +400,7 @@
                 + Util.generateURI(tmpFile.toString(), pigServer
                         .getPigContext()) + "';");
         if (eliminateDuplicates){
-            pigServer.registerQuery("B = DISTINCT (FOREACH A GENERATE $0) PARALLEL 10;");
+            pigServer.registerQuery("B = DISTINCT A.$0 PARALLEL 10;");
         }else{
             if(!useUDF) {
                 pigServer.registerQuery("B = ORDER A BY $0 PARALLEL 10;");
Index: src/org/apache/pig/parser/AstPrinter.g
===================================================================
Index: src/org/apache/pig/parser/QueryParser.g
===================================================================
--- src/org/apache/pig/parser/QueryParser.g    (revision 1244760)
+++ src/org/apache/pig/parser/QueryParser.g    (working copy)
@@ -506,7 +506,10 @@
           | LEFT_PAREN! col_ref ( ASC | DESC )? RIGHT_PAREN!
 ;

-distinct_clause : DISTINCT^ rel partition_clause?
+distinct_clause : DISTINCT rel PERIOD ( col_alias_or_index | ( LEFT_PAREN col_alias_or_index ( COMMA col_alias_or_index )* RIGHT_PAREN ) ) partition_clause?
+               -> ^( DISTINCT ^( FOREACH rel ^( FOREACH_PLAN_SIMPLE ^( GENERATE col_alias_or_index+ ) ) ) partition_clause? )
+                | DISTINCT rel partition_clause?
+               -> ^( DISTINCT rel partition_clause? )
 ;

 partition_clause : PARTITION^ BY! func_name
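
For anyone who doesn't want to read the tree-rewrite syntax: the new
distinct_clause alternative just turns "DISTINCT rel.col" (or
"DISTINCT rel.(col, col, ...)") into the AST for
"DISTINCT (FOREACH rel GENERATE col, ...)". A rough string-level sketch of
that desugaring in Java follows. This is only an illustration, not how the
ANTLR parser does it; the class DistinctSugar and the method desugar are
made-up names, and the regex is a simplification that handles only simple
relation and column names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DistinctSugar {
    // Hypothetical, simplified pattern (not Pig's grammar):
    //   DISTINCT <rel>.<col>   or   DISTINCT <rel>.(<col>, <col>, ...)
    private static final Pattern SUGAR = Pattern.compile(
            "DISTINCT\\s+(\\w+)\\.(?:\\(([^)]*)\\)|([\\w$]+))");

    // Rewrites the shorthand into the explicit FOREACH ... GENERATE form.
    // Statements that do not use the shorthand are returned unchanged.
    static String desugar(String stmt) {
        Matcher m = SUGAR.matcher(stmt);
        if (!m.find()) {
            return stmt;
        }
        String rel = m.group(1);
        // group(2) is the parenthesized column list, group(3) a single column.
        String cols = m.group(2) != null ? m.group(2).trim() : m.group(3);
        String explicit = "DISTINCT (FOREACH " + rel + " GENERATE " + cols + ")";
        // Plain concatenation, so "$0" is never treated as a regex group reference.
        return stmt.substring(0, m.start()) + explicit + stmt.substring(m.end());
    }

    public static void main(String[] args) {
        System.out.println(desugar("B = DISTINCT A.$0 PARALLEL 10;"));
        System.out.println(desugar("B = DISTINCT A.($0, $1);"));
        System.out.println(desugar("B = DISTINCT A PARALLEL 10;"));
    }
}
```

The first call produces the statement the tests used before the patch, and
the last one (plain "DISTINCT A") passes through untouched, mirroring the
second alternative in the grammar rule.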
