[jira] [Commented] (PIG-2021) Parser error while referring a map nested foreach

2011-05-09 Thread Vivek Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030682#comment-13030682
 ] 

Vivek Padmanabhan commented on PIG-2021:


Attaching a script that avoids all the dependencies:

{code}
A = load 'temp' as ( s, m, l );
B = foreach A generate *, LOWER((chararray) s#'url') as parsedurl;
C = foreach B {
  urlpath = (chararray) parsedurl#'path';
  lc_urlpath = org.apache.pig.piggybank.evaluation.string.Reverse((chararray) urlpath);
  generate *;
};
{code}

 Parser error while referring a map nested foreach
 -

 Key: PIG-2021
 URL: https://issues.apache.org/jira/browse/PIG-2021
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Vivek Padmanabhan
Assignee: Xuefu Zhang
 Fix For: 0.9.0


 The script below throws parser errors:
 {code}
 register string.jar;
 A = load 'test1'  using MapLoader() as ( s, m, l );   
 B = foreach A generate *, string.URLPARSE((chararray) s#'url') as parsedurl;
 C = foreach B {
   urlpath = (chararray) parsedurl#'path';
   lc_urlpath = string.TOLOWERCASE((chararray) urlpath);
   generate *;
 };
 {code}
 Error message:
 | Failed to generate logical plan.
 |Nested exception: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection with nothing to reference!
 PIG-2002 reports a similar issue, but when I tried the patch from PIG-2002 
 I got the following exception:
  ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: file repro.pig, line 11, column 33  mismatched input '(' expecting SEMI_COLON

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-2046) Properties defined through 'SET' are not passed through to fs commands

2011-05-09 Thread Kevin J. Price (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030769#comment-13030769
 ] 

Kevin J. Price commented on PIG-2046:
-

Odd.  It definitely works correctly if you set up a 
pig-cluster-hadoop-site.xml file in a conf directory and include it on the 
class path using -cp.  That's the workaround I'm using right now.

 Properties defined through 'SET' are not passed through to fs commands
 --

 Key: PIG-2046
 URL: https://issues.apache.org/jira/browse/PIG-2046
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0, 0.9.0
Reporter: Vivek Padmanabhan

 Properties set through 'SET' commands are not passed through to 
 FS commands.
 Example:
 SET dfs.umaskmode '026'
 fs -touchz umasktest/file0
 It looks like the SET commands are processed by GruntParser after the FsShell 
 has already been created with the then-current set of properties. Hence any 
 property defined through SET is not reflected in fs commands executed in the script.



[jira] [Created] (PIG-2052) Ship guava.jar to backend

2011-05-09 Thread Daniel Dai (JIRA)
Ship guava.jar to backend
-

 Key: PIG-2052
 URL: https://issues.apache.org/jira/browse/PIG-2052
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.1, 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0


We need to ship guava.jar to backend. GenericInvoker is using it.



[jira] [Commented] (PIG-1827) When passing a parameter to Pig, if the value contains $ it has to be escaped for no apparent reason

2011-05-09 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030838#comment-13030838
 ] 

Julien Le Dem commented on PIG-1827:


Right.
Please add a unit test to verify that fixNonEscapedDollarSign returns what we 
expect.


 When passing a parameter to Pig, if the value contains $ it has to be escaped 
 for no apparent reason
 

 Key: PIG-1827
 URL: https://issues.apache.org/jira/browse/PIG-1827
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0
Reporter: Julien Le Dem
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-1827-1.patch, PIG-1827_2.patch






[jira] [Updated] (PIG-2051) new LogicalSchema column prune code does not preserve type information for map subfields

2011-05-09 Thread Woody Anderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Woody Anderson updated PIG-2051:


Attachment: 2051.patch

This patch propagates type information more correctly (though not 
recursively or fully) to the pushProjection call.

Mainly, this means putting type information for subfields into map types.

It doesn't fully descend and provide type information for subfields of 
subfields, etc. But the fields that are provided have the correct type 
information rather than DataType.BYTEARRAY.


 new LogicalSchema column prune code does not preserve type information for 
 map subfields
 

 Key: PIG-2051
 URL: https://issues.apache.org/jira/browse/PIG-2051
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.10
Reporter: Woody Anderson
Assignee: Woody Anderson
 Fix For: 0.10

 Attachments: 2051.patch


 The current implementation of ColumnPruneVisitor.visit ignores field type info and passes 
 type BYTEARRAY for all map fields.
 The correct type is pretty easy to fill in, especially since map field info 
 is only attempted one level deep.
 I came across this because I use the type information in the pushProjection 
 call. It previously carried the correct type information; the change 
 over to LogicalSchema caused a regression.



[jira] [Created] (PIG-2053) PigInputFormat uses class.isAssignableFrom() where instanceof is more appropriate

2011-05-09 Thread Woody Anderson (JIRA)
PigInputFormat uses class.isAssignableFrom() where instanceof is more 
appropriate
-

 Key: PIG-2053
 URL: https://issues.apache.org/jira/browse/PIG-2053
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.10
Reporter: Woody Anderson
Priority: Minor
 Fix For: 0.10
 Attachments: 2053.patch

This is a code style/quality improvement.

isAssignableFrom is appropriate when the class is not known at compile time, 
but assignment needs to be checked.
e.g. foo.getClass().isAssignableFrom(bar.getClass())

But if the class of foo is known (e.g. X.class), then instanceof is more 
appropriate and readable.
I also used De Morgan's laws to simplify the "is combinable" boolean 
statement, which is hard to grok as written.
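
The distinction described above can be sketched as follows. This is a minimal illustration only: `Split`, `FileLikeSplit`, and the method names are hypothetical stand-ins, not the classes touched by 2053.patch, and the boolean pair shows one De Morgan rewrite rather than the actual PigInputFormat condition.

```java
public class InstanceofSketch {
    // Hypothetical stand-ins for a split hierarchy; not Pig or Hadoop classes.
    interface Split {}
    static class FileLikeSplit implements Split {}

    // Reflective check: useful when the target class is only known at run time.
    static boolean reflectiveCheck(Class<?> target, Object candidate) {
        return target.isAssignableFrom(candidate.getClass());
    }

    // Same check when the target type is known at compile time.
    // Unlike the reflective form, this is also safe when candidate is null.
    static boolean staticCheck(Object candidate) {
        return candidate instanceof Split;
    }

    // De Morgan: !(!a || !b) is equivalent to (a && b).
    static boolean combinableVerbose(boolean a, boolean b) {
        return !(!a || !b);
    }

    static boolean combinableSimple(boolean a, boolean b) {
        return a && b;
    }

    public static void main(String[] args) {
        Object split = new FileLikeSplit();
        System.out.println(reflectiveCheck(Split.class, split));
        System.out.println(staticCheck(split));
        System.out.println(staticCheck(null));
    }
}
```

Both checks agree for non-null objects; the `instanceof` form additionally returns false for null instead of throwing.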



[jira] [Updated] (PIG-2053) PigInputFormat uses class.isAssignableFrom() where instanceof is more appropriate

2011-05-09 Thread Woody Anderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Woody Anderson updated PIG-2053:


Attachment: 2053.patch

patch

 PigInputFormat uses class.isAssignableFrom() where instanceof is more 
 appropriate
 -

 Key: PIG-2053
 URL: https://issues.apache.org/jira/browse/PIG-2053
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.10
Reporter: Woody Anderson
Priority: Minor
 Fix For: 0.10

 Attachments: 2053.patch


 This is a code style/quality improvement.
 isAssignableFrom is appropriate when the class is not known at compile time, 
 but assignment needs to be checked.
 e.g. foo.getClass().isAssignableFrom(bar.getClass())
 But if the class of foo is known (e.g. X.class), then instanceof is more 
 appropriate and readable.
 I also used De Morgan's laws to simplify the "is combinable" boolean 
 statement, which is hard to grok as written.



[jira] [Commented] (PIG-1890) Fix piggybank unit test TestAvroStorage

2011-05-09 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030845#comment-13030845
 ] 

Jakob Homan commented on PIG-1890:
--

@Ken - any update now that we're in a new week?

 Fix piggybank unit test TestAvroStorage
 ---

 Key: PIG-1890
 URL: https://issues.apache.org/jira/browse/PIG-1890
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Jakob Homan
 Fix For: 0.9.0

 Attachments: PIG-1890-1.patch


 TestAvroStorage fails on trunk. There are two reasons:
 1. After PIG-1680, we call LoadFunc.setLocation one more time.
 2. The schema for AvroStorage seems to be wrong. For example, in the first test 
 case, testArrayDefault, the schema for in is set to PIG_WRAPPER: (FIELD: 
 {PIG_WRAPPER: (ARRAY_ELEM: float)}). It seems PIG_WRAPPER is redundant. This 
 issue was hidden until PIG-1188 was checked in.



[jira] [Commented] (PIG-1827) When passing a parameter to Pig, if the value contains $ it has to be escaped for no apparent reason

2011-05-09 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030848#comment-13030848
 ] 

Julien Le Dem commented on PIG-1827:


After discussing with Richard and looking into the code of PreprocessorContext
http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/tools/parameters/PreprocessorContext.java?view=markup
there seems to be a bug here:
{code}
235 //String litVal = Matcher.quoteReplacement(val);
236 replaced_line = replaced_line.replaceFirst("\\$"+key, val); 
{code}
The replacement (2nd) parameter of replaceFirst is not a plain string: it can 
contain references to the matched pattern, like $0, so any $ in val must be escaped.
Does anyone know why line 235 is commented out?
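
The behaviour described above can be reproduced outside Pig. The sketch below is not the PreprocessorContext code; the class and method names are hypothetical, and it only contrasts passing the value straight to `String.replaceFirst` with passing it through `Matcher.quoteReplacement` first.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarReplacementSketch {
    // Replaces the first "$key" in line with val, treating val as a literal string.
    static String substituteLiteral(String line, String key, String val) {
        String literalVal = Matcher.quoteReplacement(val);
        return line.replaceFirst("\\$" + Pattern.quote(key), literalVal);
    }

    // Same substitution without quoting: "$" in val is interpreted as a group reference.
    static String substituteUnquoted(String line, String key, String val) {
        return line.replaceFirst("\\$" + Pattern.quote(key), val);
    }

    public static void main(String[] args) {
        // Literal form keeps the value as given.
        System.out.println(substituteLiteral("load '$in';", "in", "/data/$1"));
        // Unquoted form expands $0 to the matched text "$in".
        System.out.println(substituteUnquoted("load '$in';", "in", "x$0"));
    }
}
```

With the unquoted form, a value such as `/data/$1` refers to a capture group that the pattern does not have, so the call fails at run time instead of substituting the text.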


 When passing a parameter to Pig, if the value contains $ it has to be escaped 
 for no apparent reason
 

 Key: PIG-1827
 URL: https://issues.apache.org/jira/browse/PIG-1827
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0
Reporter: Julien Le Dem
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-1827-1.patch, PIG-1827_2.patch






[jira] [Updated] (PIG-2012) Comments at the begining of the file throws off line numbers in errors

2011-05-09 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-2012:
--

Attachment: PIG-2012_2.patch

Thanks Xuefu. The new patch addresses the review comments.

 Comments at the begining of the file throws off line numbers in errors
 --

 Key: PIG-2012
 URL: https://issues.apache.org/jira/browse/PIG-2012
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Alan Gates
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-2012_1.patch, PIG-2012_2.patch, macro.pig


 The preprocessor does not appear to be handling leading comments properly 
 when calculating line numbers for error messages.  In the attached script, 
 the error is reported to be on line 7.  It is actually on line 10.



[jira] [Updated] (PIG-2039) IndexOutOfBounException for a case

2011-05-09 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-2039:
-

Attachment: PIG-2039.patch

Unit test passed.

 IndexOutOfBounException for a case
 --

 Key: PIG-2039
 URL: https://issues.apache.org/jira/browse/PIG-2039
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.9.0

 Attachments: PIG-2039.patch


 The following query gives an exception:
 a = load '1.txt' as (a0:int, a1:int, a2:int);
 b = group a by a0;
 c = foreach b { c1 = limit a 10; c2 =  distinct c1.a1; c3 = distinct c1.a2; generate c2, c3;};
 store c into 'output';
 2011-05-04 12:36:01,720 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Index: 0, Size: 0
 Stack trace:
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
 at java.util.ArrayList.RangeCheck(ArrayList.java:547)
 at java.util.ArrayList.get(ArrayList.java:322)
 at org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:279)
 at org.apache.pig.newplan.logical.relational.LOGenerate.getSchema(LOGenerate.java:88)
 at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.validate(SchemaAliasVisitor.java:60)
 at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:104)
 at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
 at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
 at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
 at org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:99)
 at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
 at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
 at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
 at org.apache.pig.PigServer$Graph.compile(PigServer.java:1664)
 at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1615)
 at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1586)
 at org.apache.pig.PigServer.registerQuery(PigServer.java:580)
 at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:930)
 at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:176)
 at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
 at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
 at org.apache.pig.Main.run(Main.java:488)
 at org.apache.pig.Main.main(Main.java:109)



[jira] [Created] (PIG-2054) Need to clarify globbing on command line vs in load statement

2011-05-09 Thread Olga Natkovich (JIRA)
Need to clarify globbing on command line vs in load statement
-

 Key: PIG-2054
 URL: https://issues.apache.org/jira/browse/PIG-2054
 Project: Pig
  Issue Type: Improvement
  Components: documentation
Reporter: Olga Natkovich
Assignee: Corinne Chandel
 Fix For: 0.9.0


We have had several user reports saying that globbing in Pig and in Hadoop is not the 
same. They based this assertion on the fact that some patterns work from the 
hadoop command line but do not work in a Pig load statement.

Pig uses Hadoop globbing, so the functionality is identical; however, when you 
run on the command line, the shell can perform some of the substitution itself, giving 
the impression that things are different.

Example:

hadoop fs -ls /mydata/20110423{00,01,02,03,04,05,06,07,08,09,{10..23}}00/*/part* - this works
LOAD '/mydata/20110423{00,01,02,03,04,05,06,07,08,09,{10..23}}00/*/part*' - this does not

We should add a note to the description of globbing.



[jira] [Assigned] (PIG-2052) Ship guava.jar to backend

2011-05-09 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy reassigned PIG-2052:
--

Assignee: Dmitriy V. Ryaboy  (was: Daniel Dai)

 Ship guava.jar to backend
 -

 Key: PIG-2052
 URL: https://issues.apache.org/jira/browse/PIG-2052
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.1, 0.9.0
Reporter: Daniel Dai
Assignee: Dmitriy V. Ryaboy
 Fix For: 0.9.0


 We need to ship guava.jar to backend. GenericInvoker is using it.



[jira] [Commented] (PIG-2052) Ship guava.jar to backend

2011-05-09 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030860#comment-13030860
 ] 

Dmitriy V. Ryaboy commented on PIG-2052:


I'd like to pick this one up, planning on doing a bunch of Pig related work 
tonight anyway.

 Ship guava.jar to backend
 -

 Key: PIG-2052
 URL: https://issues.apache.org/jira/browse/PIG-2052
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.1, 0.9.0
Reporter: Daniel Dai
Assignee: Dmitriy V. Ryaboy
 Fix For: 0.9.0


 We need to ship guava.jar to backend. GenericInvoker is using it.



[jira] [Updated] (PIG-1949) e2e test harness should use bin/pig rather than calling java directly

2011-05-09 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-1949:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Patch checked in.

 e2e test harness should use bin/pig rather than calling java directly
 -

 Key: PIG-1949
 URL: https://issues.apache.org/jira/browse/PIG-1949
 Project: Pig
  Issue Type: Bug
  Components: tools
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Minor
 Fix For: 0.10

 Attachments: PIG-1949.patch


 Currently TestDriverPig.pm uses java directly to invoke Pig.  It should use 
 the bash shell script bin/pig instead.



[jira] [Commented] (PIG-2038) Pig fails to parse empty tuple/map/bag constant

2011-05-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030888#comment-13030888
 ] 

Xuefu Zhang commented on PIG-2038:
--

Unit test passed. Test-patch run:

 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 6 new or modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.


 Pig fails to parse empty tuple/map/bag constant
 ---

 Key: PIG-2038
 URL: https://issues.apache.org/jira/browse/PIG-2038
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.9.0

 Attachments: PIG-2038.patch


 Pig fails to parse the following query:
 a = foreach (load 'b') generate ();
 store a into 'output';
 Error msg: Failed to parse: null
 Similar problem occurs for empty bag/map constant.



[jira] [Commented] (PIG-2021) Parser error while referring a map nested foreach

2011-05-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030889#comment-13030889
 ] 

Xuefu Zhang commented on PIG-2021:
--

The above test case seems to have a few problems:

1. LOWER() returns a string, so parsedurl is a string. However, it is later used 
as a map. Such a conversion is invalid.

2. generate * outputs its input unchanged, so the nested commands in the second 
foreach are useless.

3. With the latest trunk, the above query parses without a problem.

grunt> A = load 'temp' as ( s, m, l );
grunt> B = foreach A generate *, LOWER((chararray) s#'url') as parsedurl;
2011-05-09 13:37:04,796 [main] WARN  org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_MAP 1 time(s).
grunt> C = foreach B {
>>   urlpath = (chararray) parsedurl#'path';
>>   lc_urlpath = org.apache.pig.piggybank.evaluation.string.Reverse((chararray) urlpath);
>>   generate *;
>> };
2011-05-09 13:37:06,315 [main] WARN  org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_MAP 1 time(s).
grunt> describe C;
2011-05-09 13:37:10,676 [main] WARN  org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_MAP 1 time(s).
C: {s: bytearray,m: bytearray,l: bytearray,parsedurl: chararray}

Please provide a valid case.

 Parser error while referring a map nested foreach
 -

 Key: PIG-2021
 URL: https://issues.apache.org/jira/browse/PIG-2021
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Vivek Padmanabhan
Assignee: Xuefu Zhang
 Fix For: 0.9.0


 The script below throws parser errors:
 {code}
 register string.jar;
 A = load 'test1'  using MapLoader() as ( s, m, l );   
 B = foreach A generate *, string.URLPARSE((chararray) s#'url') as parsedurl;
 C = foreach B {
   urlpath = (chararray) parsedurl#'path';
   lc_urlpath = string.TOLOWERCASE((chararray) urlpath);
   generate *;
 };
 {code}
 Error message:
 | Failed to generate logical plan.
 |Nested exception: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2225: Projection with nothing to reference!
 PIG-2002 reports a similar issue, but when I tried the patch from PIG-2002 
 I got the following exception:
  ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: file repro.pig, line 11, column 33  mismatched input '(' expecting SEMI_COLON



[jira] [Commented] (PIG-2012) Comments at the begining of the file throws off line numbers in errors

2011-05-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030930#comment-13030930
 ] 

Xuefu Zhang commented on PIG-2012:
--

+1 to patch PIG-2012_2.patch

 Comments at the begining of the file throws off line numbers in errors
 --

 Key: PIG-2012
 URL: https://issues.apache.org/jira/browse/PIG-2012
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Alan Gates
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-2012_1.patch, PIG-2012_2.patch, macro.pig


 The preprocessor does not appear to be handling leading comments properly 
 when calculating line numbers for error messages.  In the attached script, 
 the error is reported to be on line 7.  It is actually on line 10.



[jira] [Updated] (PIG-2052) Ship guava.jar to backend

2011-05-09 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2052:


Attachment: PIG-2052-1.patch

Attach an initial patch for reference.

 Ship guava.jar to backend
 -

 Key: PIG-2052
 URL: https://issues.apache.org/jira/browse/PIG-2052
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.1, 0.9.0
Reporter: Daniel Dai
Assignee: Dmitriy V. Ryaboy
 Fix For: 0.9.0

 Attachments: PIG-2052-1.patch


 We need to ship guava.jar to backend. GenericInvoker is using it.



[jira] [Commented] (PIG-2038) Pig fails to parse empty tuple/map/bag constant

2011-05-09 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030949#comment-13030949
 ] 

Thejas M Nair commented on PIG-2038:


+1

 Pig fails to parse empty tuple/map/bag constant
 ---

 Key: PIG-2038
 URL: https://issues.apache.org/jira/browse/PIG-2038
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.9.0

 Attachments: PIG-2038.patch


 Pig fails to parse the following query:
 a = foreach (load 'b') generate ();
 store a into 'output';
 Error msg: Failed to parse: null
 Similar problem occurs for empty bag/map constant.



[jira] [Commented] (PIG-2038) Pig fails to parse empty tuple/map/bag constant

2011-05-09 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030965#comment-13030965
 ] 

Xuefu Zhang commented on PIG-2038:
--

Patch is committed to both trunk and 0.9.0.

 Pig fails to parse empty tuple/map/bag constant
 ---

 Key: PIG-2038
 URL: https://issues.apache.org/jira/browse/PIG-2038
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.9.0

 Attachments: PIG-2038.patch


 Pig fails to parse the following query:
 a = foreach (load 'b') generate ();
 store a into 'output';
 Error msg: Failed to parse: null
 Similar problem occurs for empty bag/map constant.



[jira] [Resolved] (PIG-2038) Pig fails to parse empty tuple/map/bag constant

2011-05-09 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved PIG-2038.
--

Resolution: Fixed

 Pig fails to parse empty tuple/map/bag constant
 ---

 Key: PIG-2038
 URL: https://issues.apache.org/jira/browse/PIG-2038
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.9.0

 Attachments: PIG-2038.patch


 Pig fails to parse the following query:
 a = foreach (load 'b') generate ();
 store a into 'output';
 Error msg: Failed to parse: null
 Similar problem occurs for empty bag/map constant.



[jira] [Updated] (PIG-1827) When passing a parameter to Pig, if the value contains $ it has to be escaped for no apparent reason

2011-05-09 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-1827:
--

Attachment: PIG-1827_3.patch

We should limit this jira to fixing the issue in embedded Pig (i.e. working around the 
general parameter substitution) and visit the parameter substitution parser and 
related code in a separate jira.

 When passing a parameter to Pig, if the value contains $ it has to be escaped 
 for no apparent reason
 

 Key: PIG-1827
 URL: https://issues.apache.org/jira/browse/PIG-1827
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0
Reporter: Julien Le Dem
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-1827-1.patch, PIG-1827_2.patch, PIG-1827_3.patch






[jira] [Commented] (PIG-1827) When passing a parameter to Pig, if the value contains $ it has to be escaped for no apparent reason

2011-05-09 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030972#comment-13030972
 ] 

Richard Ding commented on PIG-1827:
---

New patch added a unit test case as suggested.

 When passing a parameter to Pig, if the value contains $ it has to be escaped 
 for no apparent reason
 

 Key: PIG-1827
 URL: https://issues.apache.org/jira/browse/PIG-1827
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0
Reporter: Julien Le Dem
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-1827-1.patch, PIG-1827_2.patch, PIG-1827_3.patch






[jira] [Resolved] (PIG-1986) ant eclipse-files target needs to be updated for new jars/new jar locations

2011-05-09 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair resolved PIG-1986.


Resolution: Duplicate

 ant eclipse-files target needs to be updated for new jars/new jar locations
 ---

 Key: PIG-1986
 URL: https://issues.apache.org/jira/browse/PIG-1986
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.9.0

 Attachments: PIG-1986.1.patch


 .eclipse.templates/.classpath needs to be updated to address the following:
 1. New jars, and jars that moved from lib/ to build/ivy/lib/Pig.
 2. test/e2e dir: test/e2e/pig/udfs/java needs to be added as a top-level dir so 
 that the dir structure matches the package name.
 I am also making a change to TOMAP.java in the e2e dir, to add a package name 
 that matches the dir structure.
  



[jira] [Updated] (PIG-2036) Set header delimiter in PigStorageSchema

2011-05-09 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-2036:
---

   Resolution: Fixed
Fix Version/s: 0.10
   Status: Resolved  (was: Patch Available)

Piggybank tests pass. Committed to trunk.

 Set header delimiter in PigStorageSchema
 

 Key: PIG-2036
 URL: https://issues.apache.org/jira/browse/PIG-2036
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.7.0, 0.8.0
Reporter: Mads Moeller
Assignee: Mads Moeller
Priority: Minor
 Fix For: 0.10

 Attachments: PIG-2036.1.patch, PIG-2036.patch


 Piggybank's PigStorageSchema currently defaults the delimiter to a tab in the 
 generated header file (.pig_header).
 The attached patch sets the header delimiter to whatever is passed in via the 
 constructor. Otherwise it defaults to tab ('\t').



[jira] [Commented] (PIG-2012) Comments at the begining of the file throws off line numbers in errors

2011-05-09 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031001#comment-13031001
 ] 

Woody Anderson commented on PIG-2012:
-

Thanks for this one! This has been a major pain for me.

 Comments at the begining of the file throws off line numbers in errors
 --

 Key: PIG-2012
 URL: https://issues.apache.org/jira/browse/PIG-2012
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Alan Gates
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-2012_1.patch, PIG-2012_2.patch, macro.pig


 The preprocessor does not appear to be handling leading comments properly 
 when calculating line numbers for error messages.  In the attached script, 
 the error is reported to be on line 7.  It is actually on line 10.



[jira] [Commented] (PIG-1946) HBaseStorage constructor syntax is error prone

2011-05-09 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031006#comment-13031006
 ] 

Dmitriy V. Ryaboy commented on PIG-1946:


Bill, sorry it took me a while to get to this. Looks good, but I just confirmed 
that commas are valid in column names, so we should add escaping.

 HBaseStorage constructor syntax is error prone
 --

 Key: PIG-1946
 URL: https://issues.apache.org/jira/browse/PIG-1946
 Project: Pig
  Issue Type: Improvement
Reporter: Bill Graham
Assignee: Bill Graham
 Fix For: 0.10

 Attachments: PIG-1946_1.patch


 Using {{HBaseStorage}} like so seems like a reasonable thing to do, but it 
 will yield unexpected results:
 {code}
 STORE result INTO 'hbase://foo' USING
  org.apache.pig.backend.hadoop.hbase.HBaseStorage(
  'info:first_name, info:last_name');
 {code}
 The problem is that a column named {{info:first_name,}} will be created, with 
 the trailing comma included. I've had numerous developers get tripped up on 
 this issue since everywhere else in Pig variables are separated by commas, so 
 I propose we fix it.
 I propose we trim leading/trailing commas from column names, but I'm open to 
 other ideas.
 Also, should we accept column names that are comma-delimited without spaces?
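
The trimming proposed above could look something like the sketch below. It is not HBaseStorage's actual parser: the class ColumnListSketch and the method parseColumns are hypothetical names, and the sketch assumes columns are separated by whitespace. It does not handle column names that legitimately contain commas, which would need some form of escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Pig or HBase.
public class ColumnListSketch {

    // Splits on whitespace, then strips leading/trailing commas from each token.
    public static List<String> parseColumns(String columnList) {
        List<String> columns = new ArrayList<>();
        for (String token : columnList.trim().split("\\s+")) {
            String name = token.replaceAll("^,+|,+$", "");
            if (!name.isEmpty()) {
                columns.add(name);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // The argument from the STORE example above.
        System.out.println(parseColumns("info:first_name, info:last_name"));
    }
}
```

Under these assumptions the example argument yields the two columns info:first_name and info:last_name, with no trailing comma on the first.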



[jira] [Commented] (PIG-1825) ability to turn off the write ahead log for pig's HBaseStorage

2011-05-09 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031012#comment-13031012
 ] 

Dmitriy V. Ryaboy commented on PIG-1825:


The patch is really straightforward and the test doesn't actually test the 
patch, except to make sure the argument doesn't break parsing.  WAL behavior is 
not actually verified.

Two things we can do here: 
1) make a createPut() method in HBaseStorage, call it from putNext(), and in a 
test create our own HBaseStorage, call createPut(), and check that 
put.getWriteToWAL() returns the right value
2) ignore the trivial test.

Option 1 is the right thing to do; I can probably be convinced of option 2. As 
is, we shouldn't commit, since the test just adds extra time to unit tests 
without doing much useful work.
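
A rough sketch of option 1, under stated assumptions: FakePut is a minimal stand-in for HBase's Put so the example compiles without HBase on the classpath, and StorageSketch, its constructor, and its option parsing are hypothetical, not the patch's actual code. The point is only that once Put construction lives in createPut(), a test can check the WAL flag directly.

```java
import java.util.Arrays;

public class CreatePutSketch {

    // Minimal stand-in for HBase's Put; only the WAL flag is modeled.
    public static class FakePut {
        private final byte[] row;
        private boolean writeToWAL = true; // WAL is on unless switched off

        public FakePut(byte[] row) { this.row = row; }
        public void setWriteToWAL(boolean write) { this.writeToWAL = write; }
        public boolean getWriteToWAL() { return writeToWAL; }
        public byte[] getRow() { return row; }
    }

    // Hypothetical storage class that remembers whether -noWAL was passed.
    public static class StorageSketch {
        private final boolean noWAL;

        public StorageSketch(String options) {
            this.noWAL = Arrays.asList(options.trim().split("\\s+")).contains("-noWAL");
        }

        // Factored out of putNext() so a test can inspect the result.
        public FakePut createPut(byte[] rowKey) {
            FakePut put = new FakePut(rowKey);
            put.setWriteToWAL(!noWAL);
            return put;
        }

        // Real code would add the column values and hand the Put to the writer.
        public void putNext(byte[] rowKey) {
            FakePut put = createPut(rowKey);
        }
    }

    public static void main(String[] args) {
        byte[] key = {1};
        System.out.println(new StorageSketch("-noWAL").createPut(key).getWriteToWAL());
        System.out.println(new StorageSketch("").createPut(key).getWriteToWAL());
    }
}
```

A unit test built this way verifies the WAL setting itself rather than only checking that the argument parses.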

 ability to turn off the write ahead log for pig's HBaseStorage
 --

 Key: PIG-1825
 URL: https://issues.apache.org/jira/browse/PIG-1825
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Corbin Hoenes
Priority: Minor
 Attachments: HBaseStorage_noWAL.patch, PIG-1825_1.patch


 Added an option to allow a caller of HBaseStorage to turn off the 
 WriteAheadLog feature while doing bulk loads into hbase.
 From the performance tuning wikipage: 
 http://wiki.apache.org/hadoop/PerformanceTuning
 To speed up the inserts in a non critical job (like an import job), you can 
 use Put.writeToWAL(false) to bypass writing to the write ahead log.
 We've tested this on HBase 0.20.6 and it helps dramatically.  
 The -noWAL option is passed in just like other options for hbase storage:
 STORE myalias INTO 'MyTable' USING 
 org.apache.pig.backend.hadoop.hbase.HBaseStorage(
   'mycolumnfamily:field1 mycolumnfamily:field2', '-noWAL');
 This would be my first patch so please educate me with any steps I need to 
 do.  



[jira] [Commented] (PIG-1778) Some dependencies not packaged with Pig 0.8 release

2011-05-09 Thread Soren Macbeth (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13031020#comment-13031020
 ] 

Soren Macbeth commented on PIG-1778:


Yes, looks like it's fixed from here. 

 Some dependencies not packaged with Pig 0.8 release
 ---

 Key: PIG-1778
 URL: https://issues.apache.org/jira/browse/PIG-1778
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Dmitriy V. Ryaboy

 Some of the libraries required for new Pig features are not included in the 
 built tarball of 0.8 release:
 guava, required for HBaseStorage
 jython, required for Jython UDFs
 We should discuss how to properly package these dependencies.
