Build failed in Jenkins: Pig-trunk-commit #822

2011-05-19 Thread Apache Jenkins Server
See 

Changes:

[tomwhite] Include 32-bit and 64-bit native libraries in Jenkins tarball builds

--
[...truncated 4755 lines...]
-
| buildJar |   57  |   0   |   0   |   8   ||   50  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 49 already retrieved (288kB/9ms)

buildJar:
 [echo] svnString svn: This client is too old to work with working copy 
'  You need
 [echo] to get a newer Subversion client, or to downgrade this working copy.
 [echo] See http://subversion.tigris.org/faq.html#working-copy-format-change
 [echo] for details.
  [jar] Manifest is invalid: Manifest line "to get a newer Subversion client, or to downgrade this working cop" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

  [jar] Manifest is invalid: Manifest line "to get a newer Subversion client, or to downgrade this working cop" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

findbugs:
[mkdir] Created dir: 

 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] Exception in thread "main" java.io.IOException: invalid header field
 [findbugs] at java.util.jar.Attributes.read(Attributes.java:389)
 [findbugs] at java.util.jar.Manifest.read(Manifest.java:234)
 [findbugs] at java.util.jar.Manifest.<init>(Manifest.java:52)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.scanJarManifestForClassPathEntries(ClassPathBuilder.java:706)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.processWorkList(ClassPathBuilder.java:580)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.build(ClassPathBuilder.java:195)
 [findbugs] at 
edu.umd.cs.findbugs.FindBugs2.buildClassPath(FindBugs2.java:584)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:181)
 [findbugs] at edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:348)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1057)
 [findbugs] Java Result: 1
 [findbugs] Output saved to 

 [xslt] Processing 

 to 

 [xslt] Loading stylesheet 
/homes/hudson/tools/findbugs/latest/src/xsl/default.xsl
 [xslt] : Error! Premature end of file.
 [xslt] : Error! 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
 [xslt] Failed to process 


BUILD FAILED
javax.xml.transform.TransformerException: 
javax.xml.transform.TransformerException: 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:719)
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at 
org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:187)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:709)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:333)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
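
The [jar] messages in this log suggest that the multi-line svn error text ended up in the jar manifest, where every non-continuation line must be a `Name: value` pair. A minimal sketch of that check follows, under simplified rules (a header is `Name: value`, a continuation line starts with a space, blank lines separate sections). The helper `invalid_manifest_lines` and the attribute name `Svn-Revision` are hypothetical and are not taken from Ant, the JDK, or the Pig build file.

```python
import re

# Simplified header rule: an attribute name, then ": ", then a value.
HEADER = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_-]*: .*$")


def invalid_manifest_lines(text):
    """Return the lines that are neither headers, continuations, nor blank."""
    bad = []
    for line in text.splitlines():
        if line == "" or line.startswith(" "):
            continue  # section separator or continuation of the previous value
        if not HEADER.match(line):
            bad.append(line)
    return bad


# Hypothetical manifest in which a multi-line svn error became an attribute value.
manifest = (
    "Manifest-Version: 1.0\n"
    "Svn-Revision: svn: This client is too old to work with working copy\n"
    "to get a newer Subversion client, or to downgrade this working copy.\n"
    "for details.\n"
)

print(invalid_manifest_lines(manifest))
```

Only the first line of the error text carries a name, so the following lines are reported as invalid. That is consistent with the "invalid header field" exception FindBugs raises when it later reads the jar.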
  

Build failed in Jenkins: Pig-trunk-commit #821

2011-05-19 Thread Apache Jenkins Server
See 

Changes:

[rding] Replace RuntimeException with ParserException

--
[...truncated 4845 lines...]
-
| buildJar |   57  |   0   |   0   |   8   ||   50  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 49 already retrieved (288kB/11ms)

buildJar:
 [echo] svnString svn: This client is too old to work with working copy 
'  You need
 [echo] to get a newer Subversion client, or to downgrade this working copy.
 [echo] See http://subversion.tigris.org/faq.html#working-copy-format-change
 [echo] for details.
  [jar] Manifest is invalid: Manifest line "to get a newer Subversion client, or to downgrade this working cop" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

  [jar] Manifest is invalid: Manifest line "to get a newer Subversion client, or to downgrade this working cop" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

findbugs:
[mkdir] Created dir: 

 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] Exception in thread "main" java.io.IOException: invalid header field
 [findbugs] at java.util.jar.Attributes.read(Attributes.java:389)
 [findbugs] at java.util.jar.Manifest.read(Manifest.java:234)
 [findbugs] at java.util.jar.Manifest.<init>(Manifest.java:52)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.scanJarManifestForClassPathEntries(ClassPathBuilder.java:706)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.processWorkList(ClassPathBuilder.java:580)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.build(ClassPathBuilder.java:195)
 [findbugs] at 
edu.umd.cs.findbugs.FindBugs2.buildClassPath(FindBugs2.java:584)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:181)
 [findbugs] at edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:348)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1057)
 [findbugs] Java Result: 1
 [findbugs] Output saved to 

 [xslt] Processing 

 to 

 [xslt] Loading stylesheet 
/homes/hudson/tools/findbugs/latest/src/xsl/default.xsl
 [xslt] : Error! Premature end of file.
 [xslt] : Error! 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
 [xslt] Failed to process 


BUILD FAILED
javax.xml.transform.TransformerException: 
javax.xml.transform.TransformerException: 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:719)
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at 
org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:187)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:709)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:333)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.tool

[jira] [Resolved] (PIG-1880) Get rid of javacc in the rest of the code

2011-05-19 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich resolved PIG-1880.
-

Resolution: Won't Fix

Created separate tickets for individual components

> Get rid of javacc in the rest of the code
> -
>
> Key: PIG-1880
> URL: https://issues.apache.org/jira/browse/PIG-1880
> Project: Pig
>  Issue Type: Task
>Reporter: Olga Natkovich
> Fix For: 0.10
>
>
> With Pig 0.9, the grunt and pig parsers will be moved from javacc to antlr. For 
> the following release it would be great to fully migrate to a single parser 
> technology.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-2077) Project UDF output inside a non-foreach statement fail on 0.8

2011-05-19 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036535#comment-13036535
 ] 

jirapos...@reviews.apache.org commented on PIG-2077:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/767/
---

Review request for pig and thejas.


Summary
---

See PIG-2077


This addresses bug PIG-2077.
https://issues.apache.org/jira/browse/PIG-2077


Diffs
-

  
branches/branch-0.8/src/org/apache/pig/newplan/logical/LogicalExpPlanMigrationVistor.java
 1104455 
  branches/branch-0.8/test/org/apache/pig/test/TestEvalPipeline2.java 1104455 

Diff: https://reviews.apache.org/r/767/diff


Testing
---

Test patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Unit test:
all pass

End to end test:
all pass


Thanks,

Daniel



> Project UDF output inside a non-foreach statement fail on 0.8
> -
>
> Key: PIG-2077
> URL: https://issues.apache.org/jira/browse/PIG-2077
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.8.1
>
> Attachments: PIG-2077-1.patch
>
>
> The following script fails on 0.8:
> {code}
> A = load '1.txt' as (tracking_id, day:chararray);
> B = load '2.txt' as (tracking_id, timestamp:chararray);
> C = JOIN A by (tracking_id, day) LEFT OUTER, B by (tracking_id,  
> STRSPLIT(timestamp, ' ').$0);
> explain C;
> {code}
> Error stack:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.get(ArrayList.java:324)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.findReferent(ProjectExpression.java:207)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:121)
> at 
> org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:193)
> at 
> org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
> at 
> org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:83)
> at 
> org.apache.pig.newplan.logical.relational.LOJoin.accept(LOJoin.java:149)
> at 
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:262)
> This is not a problem on 0.9 or trunk, since LogicalExpPlanMigrationVistor is 
> dropped in 0.9.
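
The join key on the B side of the script is the first field of STRSPLIT(timestamp, ' '). As a rough Python analogue (not Pig code), the key could be computed as below. The function name `first_field` and the sample value are hypothetical, since the report does not show the contents of 2.txt, and Python's regex splitting may differ from Pig's in edge cases.

```python
import re


def first_field(timestamp, regex=" "):
    """Rough analogue of STRSPLIT(timestamp, regex).$0: split and keep field 0."""
    return re.split(regex, timestamp)[0]


# Hypothetical sample value in the assumed form "day time".
print(first_field("2011-05-19 10:15:00"))
```

The failure in the report happens while the plan is compiled, before any such value is computed, so this sketch only illustrates what the expression is meant to produce.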



Review Request: Project UDF output inside a non-foreach statement fail on 0.8

2011-05-19 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/767/
---

Review request for pig and thejas.


Summary
---

See PIG-2077


This addresses bug PIG-2077.
https://issues.apache.org/jira/browse/PIG-2077


Diffs
-

  
branches/branch-0.8/src/org/apache/pig/newplan/logical/LogicalExpPlanMigrationVistor.java
 1104455 
  branches/branch-0.8/test/org/apache/pig/test/TestEvalPipeline2.java 1104455 

Diff: https://reviews.apache.org/r/767/diff


Testing
---

Test patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Unit test:
all pass

End to end test:
all pass


Thanks,

Daniel



Build failed in Jenkins: Pig-trunk-commit #820

2011-05-19 Thread Apache Jenkins Server
See 

Changes:

[daijy] PIG-2078: POProject.getNext(DataBag) does not handle null

--
[...truncated 4747 lines...]
-
| buildJar |   57  |   0   |   0   |   8   ||   50  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 49 already retrieved (288kB/9ms)

buildJar:
 [echo] svnString svn: This client is too old to work with working copy 
'  You need
 [echo] to get a newer Subversion client, or to downgrade this working copy.
 [echo] See http://subversion.tigris.org/faq.html#working-copy-format-change
 [echo] for details.
  [jar] Manifest is invalid: Manifest line "to get a newer Subversion client, or to downgrade this working cop" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

  [jar] Manifest is invalid: Manifest line "to get a newer Subversion client, or to downgrade this working cop" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

findbugs:
[mkdir] Created dir: 

 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] Exception in thread "main" java.io.IOException: invalid header field
 [findbugs] at java.util.jar.Attributes.read(Attributes.java:389)
 [findbugs] at java.util.jar.Manifest.read(Manifest.java:234)
 [findbugs] at java.util.jar.Manifest.<init>(Manifest.java:52)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.scanJarManifestForClassPathEntries(ClassPathBuilder.java:706)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.processWorkList(ClassPathBuilder.java:580)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.build(ClassPathBuilder.java:195)
 [findbugs] at 
edu.umd.cs.findbugs.FindBugs2.buildClassPath(FindBugs2.java:584)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:181)
 [findbugs] at edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:348)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1057)
 [findbugs] Java Result: 1
 [findbugs] Output saved to 

 [xslt] Processing 

 to 

 [xslt] Loading stylesheet 
/homes/hudson/tools/findbugs/latest/src/xsl/default.xsl
 [xslt] : Error! Premature end of file.
 [xslt] : Error! 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
 [xslt] Failed to process 


BUILD FAILED
javax.xml.transform.TransformerException: 
javax.xml.transform.TransformerException: 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:719)
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at 
org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:187)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:709)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:333)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.

Build failed in Jenkins: Pig-trunk #1018

2011-05-19 Thread Apache Jenkins Server
See 

Changes:

[daijy] PIG-2078: POProject.getNext(DataBag) does not handle null

[rding] PIG-2029: Inconsistency in Pig Stats reports

--
[...truncated 4240 lines...]
-
| buildJar |   57  |   0   |   0   |   8   ||   50  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 49 already retrieved (288kB/9ms)

buildJar:
 [echo] svnString svn: This client is too old to work with working copy 
'  You need
 [echo] to get a newer Subversion client, or to downgrade this working copy.
 [echo] See http://subversion.tigris.org/faq.html#working-copy-format-change
 [echo] for details.
  [jar] Manifest is invalid: Manifest line "to g" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

  [jar] Manifest is invalid: Manifest line "to g" is not valid as it does not contain a name and a value separated by ': '
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

findbugs:
[mkdir] Created dir: 

 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] Exception in thread "main" java.io.IOException: invalid header field
 [findbugs] at java.util.jar.Attributes.read(Attributes.java:389)
 [findbugs] at java.util.jar.Manifest.read(Manifest.java:234)
 [findbugs] at java.util.jar.Manifest.<init>(Manifest.java:52)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.scanJarManifestForClassPathEntries(ClassPathBuilder.java:706)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.processWorkList(ClassPathBuilder.java:580)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.build(ClassPathBuilder.java:195)
 [findbugs] at 
edu.umd.cs.findbugs.FindBugs2.buildClassPath(FindBugs2.java:584)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:181)
 [findbugs] at edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:348)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1057)
 [findbugs] Java Result: 1
 [findbugs] Output saved to 

 [xslt] Processing 

 to 

 [xslt] Loading stylesheet 
/homes/hudson/tools/findbugs/latest/src/xsl/default.xsl
 [xslt] : Error! Premature end of file.
 [xslt] : Error! 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
 [xslt] Failed to process 


BUILD FAILED
javax.xml.transform.TransformerException: 
javax.xml.transform.TransformerException: 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:719)
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at 
org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:187)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:709)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:333)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
at org.apache.tools.ant.Task.perform(Task.java:348)
   

[jira] [Updated] (PIG-1866) Dereference a bag within a tuple does not work

2011-05-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1866:


Attachment: PIG-1866-4.patch

> Dereference a bag within a tuple does not work
> --
>
> Key: PIG-1866
> URL: https://issues.apache.org/jira/browse/PIG-1866
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-1866-1.patch, PIG-1866-2.patch, PIG-1866-3.patch, 
> PIG-1866-4.patch
>
>
> The following script does not work (both in new and old logical plan):
> {code}
> a = load '1.txt' as (t : tuple(i: int, b1: bag { b_tuple : tuple ( b_str: 
> chararray) }));
> b = foreach a generate t.b1;
> dump b;
> {code}
> 1.txt:
> (1,{(one),(two)})
> Error from old logical plan:
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be 
> cast to org.apache.pig.data.DataBag
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:482)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Error from new logical plan:
> java.lang.NullPointerException
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:246)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:200)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> If we change "b = foreach a generate t.b1;" to "b = foreach a generate t.i;", 
> it works fine; only referring to a bag does not work.
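
To make the shape of the data in the report concrete, the record from 1.txt can be modelled with plain Python types. This is an illustration only: the dictionary layout, the `FIELDS` mapping and the `dereference` helper are assumptions and say nothing about how Pig represents tuples and bags internally.

```python
# One record of 1.txt: a tuple t = (i, b1), where b1 is a bag, modelled here
# as a list of one-element tuples.
record = {"t": (1, [("one",), ("two",)])}

FIELDS = {"i": 0, "b1": 1}  # positions of the named fields inside t


def dereference(rec, field):
    """Analogue of `foreach a generate t.<field>` for a single record."""
    return rec["t"][FIELDS[field]]


print(dereference(record, "i"))   # the int field, which the report says works
print(dereference(record, "b1"))  # the bag field, which the report says fails in Pig
```

In this model both dereferences are the same operation, which matches the expectation in the report that t.b1 should behave like t.i.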



[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-19 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036514#comment-13036514
 ] 

Olga Natkovich commented on PIG-1824:
-

I believe that Richard is running some additional tests. Once he is done, he is 
planning to commit the patch.

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, Jython UDF script doesn't support Jython import statement as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let user explicitly specify the 
> module to ship? 
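
Outside Pig, the UDF from the description can be exercised as ordinary Python. In the sketch below, `outputSchema` is a stand-in written for this example; it only records the schema string and is an assumption, not the decorator Pig supplies to Jython scripts.

```python
import re


def outputSchema(schema):
    """Stand-in decorator (assumption): attach the schema string to the function."""
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


@outputSchema("word:chararray")
def resplit(content, regex, index):
    # Split content on the regex and return the piece at position index.
    return re.compile(regex).split(content)[index]


# Hypothetical input values, chosen only to show the call shape.
print(resplit("a,b,c", ",", 1))
```

The function depends on `re` at call time, which is why the module has to be available wherever the UDF runs, the question raised in the description.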



[jira] [Updated] (PIG-1866) Dereference a bag within a tuple does not work

2011-05-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1866:


Attachment: (was: PIG-1866-4.patch)

> Dereference a bag within a tuple does not work
> --
>
> Key: PIG-1866
> URL: https://issues.apache.org/jira/browse/PIG-1866
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-1866-1.patch, PIG-1866-2.patch, PIG-1866-3.patch, 
> PIG-1866-4.patch
>
>
> The following script does not work (both in new and old logical plan):
> {code}
> a = load '1.txt' as (t : tuple(i: int, b1: bag { b_tuple : tuple ( b_str: 
> chararray) }));
> b = foreach a generate t.b1;
> dump b;
> {code}
> 1.txt:
> (1,{(one),(two)})
> Error from old logical plan:
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be 
> cast to org.apache.pig.data.DataBag
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:482)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Error from new logical plan:
> java.lang.NullPointerException
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:246)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:200)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> If we change "b = foreach a generate t.b1;" to "b = foreach a generate t.i;", 
> it works fine; only referring to a bag does not work.



[jira] [Updated] (PIG-1866) Dereference a bag within a tuple does not work

2011-05-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1866:


Attachment: PIG-1866-4.patch

PIG-1866-4.patch contains additional fixes for those who need this patch in 0.8.

> Dereference a bag within a tuple does not work
> --
>
> Key: PIG-1866
> URL: https://issues.apache.org/jira/browse/PIG-1866
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-1866-1.patch, PIG-1866-2.patch, PIG-1866-3.patch, 
> PIG-1866-4.patch
>
>
> The following script does not work (both in new and old logical plan):
> {code}
> a = load '1.txt' as (t : tuple(i: int, b1: bag { b_tuple : tuple ( b_str: 
> chararray) }));
> b = foreach a generate t.b1;
> dump b;
> {code}
> 1.txt:
> (1,{(one),(two)})
> Error from old logical plan:
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be 
> cast to org.apache.pig.data.DataBag
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:482)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Error from new logical plan:
> java.lang.NullPointerException
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:246)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:200)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> If we change "b = foreach a generate t.b1;" to "b = foreach a generate t.i;", 
> it works fine; only referring to a bag fails.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-19 Thread Woody Anderson (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036496#comment-13036496
 ] 

Woody Anderson commented on PIG-1824:
-

cool. can we get this into trunk so i don't have to keep fixing the patches?


> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, Jython UDF scripts don't support the Jython import statement, as in 
> the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let the user explicitly specify the 
> module to ship? 
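
One way to approach the "automatically locate" option is to parse the UDF script and list the modules it imports. The sketch below is only an illustration under stated assumptions: it uses CPython 3's standard `ast` module rather than anything in Pig or Jython, and the helper name `imported_modules` is hypothetical. Locating the module files and shipping them to the backend is not covered.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level module names imported by a script.

    Hypothetical helper for illustration; not part of Pig or Jython.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b, c" contributes "a" and "c"
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from os.path import join" contributes "os"; relative imports are skipped
            names.add(node.module.split(".")[0])
    return sorted(names)


# The UDF from the issue description; parsing does not resolve the decorator.
UDF_SOURCE = '''
import re

@outputSchema("word:chararray")
def resplit(content, regex, index):
    return re.compile(regex).split(content)[index]
'''

print(imported_modules(UDF_SOURCE))
```

A static scan like this only sees literal import statements; modules loaded dynamically would still need an explicit ship clause.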



[jira] [Resolved] (PIG-2078) POProject.getNext(DataBag) does not handle null

2011-05-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-2078.
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to both trunk and 0.9 branch.

> POProject.getNext(DataBag) does not handle null
> ---
>
> Key: PIG-2078
> URL: https://issues.apache.org/jira/browse/PIG-2078
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-2078-1.patch
>
>
> The following script fails with "-t MergeForEach":
> {code}
> a = load '1.txt' as (a0:bag{}, a1:int);
> b = foreach a generate a0;
> dump b;
> {code}
> 1.txt:
> {(1)}   2
> 3
> Error stack:
> java.lang.NullPointerException
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:310)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:251)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:1)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
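
The kind of null guard the issue title points at can be sketched in plain Python. This is an illustration only, not the committed PIG-2078 patch and not Pig's Java code: `project_bag` and the row layout are hypothetical, and it assumes the second input line yields a null bag field.

```python
def project_bag(row, index):
    """Return the bag at `index`, or None if the field is absent or null."""
    if index >= len(row) or row[index] is None:
        return None  # propagate the null instead of dereferencing it
    return list(row[index])


# Rows loosely modelled on 1.txt: the second row has no usable bag value.
rows = [([(1,)], 2), (None, None)]
for row in rows:
    print(project_bag(row, 0))
```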



[jira] [Updated] (PIG-2081) Dryrun gives wrong line numbers in error message for scripts containing macro.

2011-05-19 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-2081:
--

Attachment: PIG-2081.patch

> Dryrun gives wrong line numbers in error message for scripts containing macro.
> --
>
> Key: PIG-2081
> URL: https://issues.apache.org/jira/browse/PIG-2081
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.9.0
>Reporter: Richard Ding
>Assignee: Richard Ding
> Fix For: 0.9.0
>
> Attachments: PIG-2081.patch
>
>
> For the following script (test.pig):
> {code}
>   1 DEFINE my_macro (X,key) returns Y
>   2 {
>   3 tmp1 = foreach  $X generate TOKENIZE((chararray)$key) as tokens;
>   4 tmp2 = foreach tmp1 generate flatten(tokens);
>   5 tmp3 = order tmp2 by $0;
>   6 $Y = distinct tmp3;
>   7 }
>   8 
>   9 A = load 'sometext' using TextLoader() as (row) ;
>  10 E = my_macro(A,row);
>  11 
>  12 A1 = load 'sometext2' using TextLoader() as (row1);
>  13 E1 = my_macro(A1,row1);
>  14 
>  15 A3 = load 'sometext3' using TextLoader() as (row3);
>  16 E3 = my_macro(A3,$0);
>  17 
>  18 F = cogroup E by $0, E1 by $0,E3 by $0;
>  19 dump F;
> {code}
> "pig test.pig" gives the correct line number in the error message:
> {code}
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200:  column 17>  mismatched input '$0' expecting set null
> {code}
> while "pig -r test.pig" gives an incorrect line number in the error message:
> {code}
> ERROR org.apache.pig.Main - ERROR 1200:  column 17>  mismatched input '$0' expecting set null
> {code}



Build failed in Jenkins: Pig-trunk-commit #819

2011-05-19 Thread Apache Jenkins Server
See 

Changes:

[rding] PIG-2029: Inconsistency in Pig Stats reports

--
[...truncated 4747 lines...]
-
| buildJar |   57  |   0   |   0   |   8   ||   50  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 49 already retrieved (288kB/9ms)

buildJar:
 [echo] svnString svn: This client is too old to work with working copy 
'  You need
 [echo] to get a newer Subversion client, or to downgrade this working copy.
 [echo] See http://subversion.tigris.org/faq.html#working-copy-format-change
 [echo] for details.
  [jar] Manifest is invalid: Manifest line "to get a newer Subversion 
client, or to downgrade this working cop" is not valid as it does not contain a 
name and a value separated by ': ' 
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

  [jar] Manifest is invalid: Manifest line "to get a newer Subversion 
client, or to downgrade this working cop" is not valid as it does not contain a 
name and a value separated by ': ' 
  [jar] error while reading original manifest in file: 

 Manifest: null
  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

findbugs:
[mkdir] Created dir: 

 [findbugs] Executing findbugs from ant task
 [findbugs] Running FindBugs...
 [findbugs] Exception in thread "main" java.io.IOException: invalid header field
 [findbugs] at java.util.jar.Attributes.read(Attributes.java:389)
 [findbugs] at java.util.jar.Manifest.read(Manifest.java:234)
 [findbugs] at java.util.jar.Manifest.<init>(Manifest.java:52)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.scanJarManifestForClassPathEntries(ClassPathBuilder.java:706)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.processWorkList(ClassPathBuilder.java:580)
 [findbugs] at 
edu.umd.cs.findbugs.classfile.impl.ClassPathBuilder.build(ClassPathBuilder.java:195)
 [findbugs] at 
edu.umd.cs.findbugs.FindBugs2.buildClassPath(FindBugs2.java:584)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.execute(FindBugs2.java:181)
 [findbugs] at edu.umd.cs.findbugs.FindBugs.runMain(FindBugs.java:348)
 [findbugs] at edu.umd.cs.findbugs.FindBugs2.main(FindBugs2.java:1057)
 [findbugs] Java Result: 1
 [findbugs] Output saved to 

 [xslt] Processing 

 to 

 [xslt] Loading stylesheet 
/homes/hudson/tools/findbugs/latest/src/xsl/default.xsl
 [xslt] : Error! Premature end of file.
 [xslt] : Error! 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
 [xslt] Failed to process 


BUILD FAILED
javax.xml.transform.TransformerException: 
javax.xml.transform.TransformerException: 
com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Premature end of 
file.
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:719)
at 
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at 
org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:187)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:709)
at 
org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:333)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.tools.

[jira] [Commented] (PIG-2029) Inconsistency in Pig Stats reports

2011-05-19 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036403#comment-13036403
 ] 

Richard Ding commented on PIG-2029:
---

Patch committed to trunk and 0.9 branch.

> Inconsistency in Pig Stats reports 
> ---
>
> Key: PIG-2029
> URL: https://issues.apache.org/jira/browse/PIG-2029
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1, 0.9.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.9.0
>
> Attachments: PIG-2029.patch
>
>
> I have a Pig script which reports varying stats for the same M/R job (same 
> inputs). Sometimes PigStats reports all the stats (such as 
> Maps, Reduces, MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime 
> and AvgReduceTime) for the M/R job as 0. Sometimes it reports them correctly.
> Enclosed are the stderr logs for 2 runs. In Run 1, 
> job_201103091134_556600 has 0 in every column, whereas in 
> Run 2, Hadoop job job_201104272229_75693 has some valid values. 
> The actual Job Tracker link shows that they are non-empty. This points to a 
> bug in the interaction of the PigStats module with the JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201103091134_556458   160 100 552 191 368 1257
> 371 392 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201103091134_556600   0   0   0   0   0   0   
> 0   0   UNION5  MULTI_QUERY,MAP_ONLY/user/viraj/dir,,
> job_201103091134_556601   7   100 17  8   14  200 
> 15  27  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201103091134_556602   0   0   0   0   0   0   
> 0   0   CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201103091134_556603   0   0   0   0   0   0   
> 0   0   CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201103091134_556604   2   100 13  7   10  34  
> 13  31  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201103091134_556644   0   0   0   0   0   0   
> 0   0   ONJOIN15SAMPLER 
> job_201103091134_556645   0   0   0   0   0   0   
> 0   0   ONJOIN25SAMPLER 
> job_201103091134_556646   0   0   0   0   0   0   
> 0   0   ONJOIN3 SAMPLER 
> job_201103091134_556654   0   0   0   0   0   0   
> 0   0   ONJOIN19SAMPLER 
> job_201103091134_556662   0   0   0   0   0   0   
> 0   0   ONJOIN19ORDER_BY,COMBINER
> ..
> {quote}
> Run 2:
> {quote}
> Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201104272229_75503159 100 484 192 353 396 
> 308 321 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201104272229_7569318  0   31  14  24  0   
> 0   UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,
> job_201104272229_756947   100 34  13  22  46  
> 20  25  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201104272229_75695125 100 19  11  15  32  
> 18  26  CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201104272229_756981   100 12  12  12  13  
> 9   11  CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201104272229_757022   100 21  5   13  35  
> 22  26  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201104272229_757241   1   4   4   4   11  
> 11  11  ONJOIN15SAMPLER 
> job_201104272229_757250   0   0   0   0   0   
> 0   ONJOIN25SAMPLER 
> job_201104272229_757266   1   8   6   8   24  
> 24  24  ONJOIN3 SAMPLER 
> job_201104272229_757290   0   0   0   0 

[jira] [Created] (PIG-2081) Dryrun gives wrong line numbers in error message for scripts containing macro.

2011-05-19 Thread Richard Ding (JIRA)
Dryrun gives wrong line numbers in error message for scripts containing macro.
--

 Key: PIG-2081
 URL: https://issues.apache.org/jira/browse/PIG-2081
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.9.0


For the following script (test.pig):

{code}
  1 DEFINE my_macro (X,key) returns Y
  2 {
  3 tmp1 = foreach  $X generate TOKENIZE((chararray)$key) as tokens;
  4 tmp2 = foreach tmp1 generate flatten(tokens);
  5 tmp3 = order tmp2 by $0;
  6 $Y = distinct tmp3;
  7 }
  8 
  9 A = load 'sometext' using TextLoader() as (row) ;
 10 E = my_macro(A,row);
 11 
 12 A1 = load 'sometext2' using TextLoader() as (row1);
 13 E1 = my_macro(A1,row1);
 14 
 15 A3 = load 'sometext3' using TextLoader() as (row3);
 16 E3 = my_macro(A3,$0);
 17 
 18 F = cogroup E by $0, E1 by $0,E3 by $0;
 19 dump F;
{code}

"pig test.pig" gives the correct line number in the error message:

{code}
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200:   mismatched input '$0' expecting set null
{code}

while "pig -r test.pig" gives an incorrect line number in the error message:

{code}
ERROR org.apache.pig.Main - ERROR 1200:   mismatched input '$0' expecting set null
{code}



[jira] [Commented] (PIG-2078) POProject.getNext(DataBag) does not handle null

2011-05-19 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036344#comment-13036344
 ] 

jirapos...@reviews.apache.org commented on PIG-2078:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/763/#review687
---

Ship it!


+1

- thejas


On 2011-05-19 17:46:48, Daniel Dai wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/763/
bq.  ---
bq.  
bq.  (Updated 2011-05-19 17:46:48)
bq.  
bq.  
bq.  Review request for pig and thejas.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  See PIG-2078
bq.  
bq.  
bq.  This addresses bug PIG-2078.
bq.  https://issues.apache.org/jira/browse/PIG-2078
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
 1100118 
bq.trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1100118 
bq.  
bq.  Diff: https://reviews.apache.org/r/763/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Test-patch:
bq.   [exec] +1 overall.  
bq.   [exec] 
bq.   [exec] +1 @author.  The patch does not contain any @author tags.
bq.   [exec] 
bq.   [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
bq.   [exec] 
bq.   [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
bq.   [exec] 
bq.   [exec] +1 javac.  The applied patch does not increase the total 
number of javac compiler warnings.
bq.   [exec] 
bq.   [exec] +1 findbugs.  The patch does not introduce any new 
Findbugs warnings.
bq.   [exec] 
bq.   [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
bq.  
bq.  Unit test:
bq.  all pass
bq.  
bq.  End-to-end test:
bq.  all pass
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Daniel
bq.  
bq.



> POProject.getNext(DataBag) does not handle null
> ---
>
> Key: PIG-2078
> URL: https://issues.apache.org/jira/browse/PIG-2078
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-2078-1.patch
>
>
> The following script fails with "-t MergeForEach":
> {code}
> a = load '1.txt' as (a0:bag{}, a1:int);
> b = foreach a generate a0;
> dump b;
> {code}
> 1.txt:
> {(1)}   2
> 3
> Error stack:
> java.lang.NullPointerException
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:310)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:251)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:1)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)



Re: Review Request: POProject.getNext(DataBag) does not handle null

2011-05-19 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/763/#review687
---

Ship it!


+1

- thejas


On 2011-05-19 17:46:48, Daniel Dai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/763/
> ---
> 
> (Updated 2011-05-19 17:46:48)
> 
> 
> Review request for pig and thejas.
> 
> 
> Summary
> ---
> 
> See PIG-2078
> 
> 
> This addresses bug PIG-2078.
> https://issues.apache.org/jira/browse/PIG-2078
> 
> 
> Diffs
> -
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
>  1100118 
>   trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1100118 
> 
> Diff: https://reviews.apache.org/r/763/diff
> 
> 
> Testing
> ---
> 
> Test-patch:
>  [exec] +1 overall.  
>  [exec] 
>  [exec] +1 @author.  The patch does not contain any @author tags.
>  [exec] 
>  [exec] +1 tests included.  The patch appears to include 3 new or 
> modified tests.
>  [exec] 
>  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
>  [exec] 
>  [exec] +1 javac.  The applied patch does not increase the total 
> number of javac compiler warnings.
>  [exec] 
>  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
> warnings.
>  [exec] 
>  [exec] +1 release audit.  The applied patch does not increase the 
> total number of release audit warnings.
> 
> Unit test:
> all pass
> 
> End-to-end test:
> all pass
> 
> 
> Thanks,
> 
> Daniel
> 
>



[jira] [Commented] (PIG-2078) POProject.getNext(DataBag) does not handle null

2011-05-19 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036321#comment-13036321
 ] 

jirapos...@reviews.apache.org commented on PIG-2078:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/763/
---

Review request for pig and thejas.


Summary
---

See PIG-2078


This addresses bug PIG-2078.
https://issues.apache.org/jira/browse/PIG-2078


Diffs
-

  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
 1100118 
  trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1100118 

Diff: https://reviews.apache.org/r/763/diff


Testing
---

Test-patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Unit test:
all pass

End-to-end test:
all pass


Thanks,

Daniel



> POProject.getNext(DataBag) does not handle null
> ---
>
> Key: PIG-2078
> URL: https://issues.apache.org/jira/browse/PIG-2078
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-2078-1.patch
>
>
> The following script fails with "-t MergeForEach":
> {code}
> a = load '1.txt' as (a0:bag{}, a1:int);
> b = foreach a generate a0;
> dump b;
> {code}
> 1.txt:
> {(1)}   2
> 3
> Error stack:
> java.lang.NullPointerException
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:310)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:251)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:1)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)



Review Request: POProject.getNext(DataBag) does not handle null

2011-05-19 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/763/
---

Review request for pig and thejas.


Summary
---

See PIG-2078


This addresses bug PIG-2078.
https://issues.apache.org/jira/browse/PIG-2078


Diffs
-

  
trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java
 1100118 
  trunk/test/org/apache/pig/test/TestEvalPipeline2.java 1100118 

Diff: https://reviews.apache.org/r/763/diff


Testing
---

Test-patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Unit test:
all pass

End-to-end test:
all pass


Thanks,

Daniel



[jira] [Updated] (PIG-2078) POProject.getNext(DataBag) does not handle null

2011-05-19 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2078:


Attachment: PIG-2078-1.patch

> POProject.getNext(DataBag) does not handle null
> ---
>
> Key: PIG-2078
> URL: https://issues.apache.org/jira/browse/PIG-2078
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-2078-1.patch
>
>
> The following script fails with "-t MergeForEach":
> {code}
> a = load '1.txt' as (a0:bag{}, a1:int);
> b = foreach a generate a0;
> dump b;
> {code}
> 1.txt:
> {(1)}   2
> 3
> Error stack:
> java.lang.NullPointerException
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:310)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:251)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:1)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)



[jira] [Commented] (PIG-2029) Inconsistency in Pig Stats reports

2011-05-19 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036315#comment-13036315
 ] 

Thejas M Nair commented on PIG-2029:


+1

> Inconsistency in Pig Stats reports 
> ---
>
> Key: PIG-2029
> URL: https://issues.apache.org/jira/browse/PIG-2029
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1, 0.9.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.9.0
>
> Attachments: PIG-2029.patch
>
>
> I have a Pig script which reports varying stats for the same M/R job (same 
> inputs). Sometimes PigStats reports all the stats (such as 
> Maps, Reduces, MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime 
> and AvgReduceTime) for the M/R job as 0. Sometimes it reports them correctly.
> Enclosed are the stderr logs for 2 runs. In Run 1, 
> job_201103091134_556600 has 0 in every column, whereas in 
> Run 2, Hadoop job job_201104272229_75693 has some valid values. 
> The actual Job Tracker link shows that they are non-empty. This points to a 
> bug in the interaction of the PigStats module with the JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201103091134_556458   160 100 552 191 368 1257
> 371 392 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201103091134_556600   0   0   0   0   0   0   
> 0   0   UNION5  MULTI_QUERY,MAP_ONLY/user/viraj/dir,,
> job_201103091134_556601   7   100 17  8   14  200 
> 15  27  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201103091134_556602   0   0   0   0   0   0   
> 0   0   CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201103091134_556603   0   0   0   0   0   0   
> 0   0   CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201103091134_556604   2   100 13  7   10  34  
> 13  31  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201103091134_556644   0   0   0   0   0   0   
> 0   0   ONJOIN15SAMPLER 
> job_201103091134_556645   0   0   0   0   0   0   
> 0   0   ONJOIN25SAMPLER 
> job_201103091134_556646   0   0   0   0   0   0   
> 0   0   ONJOIN3 SAMPLER 
> job_201103091134_556654   0   0   0   0   0   0   
> 0   0   ONJOIN19SAMPLER 
> job_201103091134_556662   0   0   0   0   0   0   
> 0   0   ONJOIN19ORDER_BY,COMBINER
> ..
> {quote}
> Run 2:
> {quote}
> Job Stats (time in seconds):
> JobId Maps Reduces MaxMapTime  MinMapTIme  AvgMapTime  
> MaxReduceTime   MinReduceTime   AvgReduceTime   Alias   Feature Outputs
> job_201104272229_75503159 100 484 192 353 396 
> 308 321 
> IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P
>DISTINCT,MULTI_QUERY
> job_201104272229_7569318  0   31  14  24  0   
> 0   UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,
> job_201104272229_756947   100 34  13  22  46  
> 20  25  CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER   
> job_201104272229_75695125 100 19  11  15  32  
> 18  26  CNJOIN3,GNJOIN3,sampleNJOIN3GROUP_BY,COMBINER   
> job_201104272229_756981   100 12  12  12  13  
> 9   11  CNJOIN15,GNJOIN15,sampleNJOIN15 GROUP_BY,COMBINER   
> job_201104272229_757022   100 21  5   13  35  
> 22  26  CNJOIN19,GNJOIN19,sampleNJOIN19 GROUP_BY,COMBINER   
> job_201104272229_757241   1   4   4   4   11  
> 11  11  ONJOIN15SAMPLER 
> job_201104272229_757250   0   0   0   0   0   
> 0   ONJOIN25SAMPLER 
> job_201104272229_757266   1   8   6   8   24  
> 24  24  ONJOIN3 SAMPLER 
> job_201104272229_757290   0   0   0   0   0   
> 0  

Re: Welcome to Aniket Mokashi

2011-05-19 Thread Ashutosh Chauhan
Congratulations, Aniket!
Hoping to see many more contributions in Pig from you.

Ashutosh
On Thu, May 19, 2011 at 10:08, Alan Gates  wrote:
> Please join me in welcoming Aniket Mokashi as a new committer on Pig.
>  Aniket has been contributing to Pig since last summer.  He wrote or helped
> shepherd several major features in 0.8, including the Python UDF work, the
> new mapreduce functionality, and the custom partitioner.  We look forward to
> more great work from him in the future.
>
> Alan.
>


Welcome to Aniket Mokashi

2011-05-19 Thread Alan Gates
Please join me in welcoming Aniket Mokashi as a new committer on Pig.   
Aniket has been contributing to Pig since last summer.  He wrote or  
helped shepherd several major features in 0.8, including the Python  
UDF work, the new mapreduce functionality, and the custom  
partitioner.  We look forward to more great work from him in the future.


Alan.


Re: Review Request: PIG-1702. Fix for task output logs for streaming jobs containing null input-split information.

2011-05-19 Thread Adam Warrington


> On 2011-04-13 18:03:22, Dmitriy Ryaboy wrote:
> > trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java,
> >  line 205
> > 
> >
> > please clean up whitespace :)

Oops, sorry. I'll clean that up.


> On 2011-04-13 18:03:22, Dmitriy Ryaboy wrote:
> > trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java,
> >  line 202
> > 
> >
> > Do we care about the specifics of how this output is written?
> > 
> > Seems like it would be less code, and potentially better in the long 
> > run (if we are dealing with other kinds of splits) to just call toString() 
> > on the InputSplit. FileSplit already defines its own toString() which 
> > prints out the path, the start offset, and the length.
> 
> Ashutosh Chauhan wrote:
> I agree with Dmitriy. If possible, we should avoid special casing for a 
> particular type of InputSplit. Further, InputSplit provides getLocations() 
> and getLength() api which should be used instead of FileSplit specific api.

So it seems the options are to either:

1. Use the input split's toString() method.
2. Use just getLocations and getLength, which are part of the InputSplit API.

I'm leaning towards toString(), because for the common case of FileSplit it 
contains useful information that getLocations() won't have, namely the file 
name and the file offset.

If this is the consensus, I'll submit a patch with that update. Let me 
know.
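
To make the trade-off between the two options concrete, here is a rough sketch written in Python rather than Java. The classes are hypothetical stand-ins, not the Hadoop API: the method names, the sample path, the host names, and the "path:start+length" string format are all assumptions chosen for illustration. It only shows that a self-describing split can report the file name and offset, while the generic length/locations calls cannot.

```python
class InputSplit(object):
    """Hypothetical stand-in for a generic split: only length and locations."""

    def __init__(self, length, locations):
        self.length = length
        self.locations = locations

    def get_length(self):
        return self.length

    def get_locations(self):
        return self.locations


class FileSplit(InputSplit):
    """Hypothetical stand-in for a file-backed split with a path and offset."""

    def __init__(self, path, start, length, locations):
        InputSplit.__init__(self, length, locations)
        self.path = path
        self.start = start

    def __str__(self):
        # Assumed format: path, start offset, and length.
        return "%s:%d+%d" % (self.path, self.start, self.length)


def describe_with_to_string(split):
    # Option 1: let the split describe itself; no special casing by type.
    return str(split)


def describe_with_generic_api(split):
    # Option 2: use only what every split offers (length and locations).
    return "length=%d locations=%s" % (
        split.get_length(), ",".join(split.get_locations()))


# Illustrative values only; none of these come from a real job.
split = FileSplit("/data/example.txt", 0, 1024, ["host1", "host2"])
print(describe_with_to_string(split))
print(describe_with_generic_api(split))
```

In this sketch only the first description carries the file name and start offset, which is the information the task logs were missing.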


- Adam


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/547/#review452
---


On 2011-05-19 16:27:22, Adam Warrington wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/547/
> ---
> 
> (Updated 2011-05-19 16:27:22)
> 
> 
> Review request for pig.
> 
> 
> Summary
> ---
> 
> This is a patch for PIG-1702, which describes an issue where the task output 
> logs for Pig streaming jobs contain null input-split information. The 
> ability to query the input-split information through the JobConf went away 
> with the new MR API. We must now obtain a reference to the underlying 
> FileSplit and query that reference for the information.
> 
> 
> Diffs
> -
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java
>  1088692 
> 
> Diff: https://reviews.apache.org/r/547/diff
> 
> 
> Testing
> ---
> 
> To test this, I wrote a very simple Python script and streamed data through 
> it using Pig. After checking the task logs of the completed task, the stderr 
> logs now contain valid input split information. Below are the scripts and 
> test data used.
> 
> ### PIG commands run ###
> DEFINE testpy `test.py` SHIP ('test.py');
> raw_records = LOAD '/test.txt2'; 
> T1 = STREAM raw_records THROUGH testpy;
> dump T1;
> 
> ### test.py ###
> #!/usr/bin/python
> import sys
> 
> cnt = 0
> for line in sys.stdin:
>     print(line.strip() + " " + str(cnt))
>     cnt += 1
> 
> ### contents of /test.txt on hdfs ###
> one line
> two line
> three line
> four line
> 
> 
> Thanks,
> 
> Adam
> 
>



Re: Review Request: PIG-1702. Fix for task output logs for streaming jobs containing null input-split information.

2011-05-19 Thread Adam Warrington

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/547/
---

(Updated 2011-05-19 16:27:22.583249)


Review request for pig.


Changes
---

Sigh...I edited this a while back, but didn't publish what I wrote.


Summary
---

This is a patch for PIG-1702, which describes an issue where the task output 
logs for Pig streaming jobs contain null input-split information. The ability 
to query the input-split information through the JobConf went away with the new 
MR API. We must now obtain a reference to the underlying FileSplit and query 
that reference for the information.


Diffs
-

  
trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java 
1088692 

Diff: https://reviews.apache.org/r/547/diff


Testing (updated)
---

To test this, I wrote a very simple Python script and streamed data through it 
using Pig. After checking the task logs of the completed task, the stderr logs 
now contain valid input split information. Below are the scripts and test data used.

### PIG commands run ###
DEFINE testpy `test.py` SHIP ('test.py');
raw_records = LOAD '/test.txt2'; 
T1 = STREAM raw_records THROUGH testpy;
dump T1;

### test.py ###
#!/usr/bin/python
import sys

cnt = 0
for line in sys.stdin:
    print(line.strip() + " " + str(cnt))
    cnt += 1

### contents of /test.txt on hdfs ###
one line
two line
three line
four line
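
For reference, here is a small sketch of the transformation test.py applies, run against the four sample lines above. The function name number_lines is made up for this illustration, and the sketch assumes all four lines reach a single streaming process, since the counter would restart in each separate task.

```python
def number_lines(lines):
    """Append a zero-based counter to each stripped line, as test.py does."""
    out = []
    cnt = 0
    for line in lines:
        out.append(line.strip() + " " + str(cnt))
        cnt += 1
    return out


# The four lines of test data listed above, with their trailing newlines.
sample = ["one line\n", "two line\n", "three line\n", "four line\n"]
for row in number_lines(sample):
    print(row)
```

This only mirrors the script's per-line logic; it says nothing about how Pig formats the tuples when T1 is dumped.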


Thanks,

Adam