[jira] [Updated] (PIG-2029) Inconsistency in Pig Stats reports

2011-05-18 Thread Olga Natkovich (JIRA)

 [ https://issues.apache.org/jira/browse/PIG-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-2029:


Fix Version/s: 0.9.0 (was: 0.10)

> Inconsistency in Pig Stats reports 
> ---
>
> Key: PIG-2029
> URL: https://issues.apache.org/jira/browse/PIG-2029
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1, 0.9.0
>Reporter: Viraj Bhat
>Assignee: Richard Ding
> Fix For: 0.9.0
>
> Attachments: PIG-2029.patch
>
>
> I have a Pig script that reports varying stats for the same M/R job (same 
> inputs). Sometimes PigStats reports all the stats (Maps, Reduces, 
> MaxMapTime, MinMapTime, AvgMapTime, MaxReduceTime, MinReduceTime, and 
> AvgReduceTime) for the M/R job as 0; sometimes it reports them correctly.
> Enclosed are the stderr logs for two runs. In Run 1, job 
> job_201103091134_556600 has 0 in every column, whereas in Run 2, Hadoop job 
> job_201104272229_75693 has valid values. 
> The actual JobTracker link shows that these jobs are non-empty. This points 
> to a bug in the interaction between the PigStats module and the JobTracker.
> Run 1:
> {quote}
> Job Stats (time in seconds):
> JobId  Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  Alias  Feature  Outputs
> job_201103091134_556458  160  100  552  191  368  1257  371  392  IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P  DISTINCT,MULTI_QUERY
> job_201103091134_556600  0  0  0  0  0  0  0  0  UNION5  MULTI_QUERY,MAP_ONLY  /user/viraj/dir,,
> job_201103091134_556601  7  100  17  8  14  200  15  27  CNJOIN25,GNJOIN25,sampleNJOIN25  GROUP_BY,COMBINER
> job_201103091134_556602  0  0  0  0  0  0  0  0  CNJOIN3,GNJOIN3,sampleNJOIN3  GROUP_BY,COMBINER
> job_201103091134_556603  0  0  0  0  0  0  0  0  CNJOIN15,GNJOIN15,sampleNJOIN15  GROUP_BY,COMBINER
> job_201103091134_556604  2  100  13  7  10  34  13  31  CNJOIN19,GNJOIN19,sampleNJOIN19  GROUP_BY,COMBINER
> job_201103091134_556644  0  0  0  0  0  0  0  0  ONJOIN15  SAMPLER
> job_201103091134_556645  0  0  0  0  0  0  0  0  ONJOIN25  SAMPLER
> job_201103091134_556646  0  0  0  0  0  0  0  0  ONJOIN3  SAMPLER
> job_201103091134_556654  0  0  0  0  0  0  0  0  ONJOIN19  SAMPLER
> job_201103091134_556662  0  0  0  0  0  0  0  0  ONJOIN19  ORDER_BY,COMBINER
> ..
> {quote}
> Run 2:
> {quote}
> Job Stats (time in seconds):
> JobId  Maps  Reduces  MaxMapTime  MinMapTIme  AvgMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  Alias  Feature  Outputs
> job_201104272229_75503  159  100  484  192  353  396  308  321  IN,SP10P,SP11P,SP12P,SP13P,SP16P,SP17P,SP18P,SP20P,SP21P,SP22P,SP23P,SP24P,SP26P,SP27P,SP28P,SP29P,SP30P,SP31P,SP32P,SP33P,SP34P,SP4P,SP6P,SP7P,SP8P,SP9P  DISTINCT,MULTI_QUERY
> job_201104272229_75693  18  0  31  14  24  0  0  UNION5  MULTI_QUERY,MAP_ONLY  /user/viraj/dir,
> job_201104272229_75694  7  100  34  13  22  46  20  25  CNJOIN25,GNJOIN25,sampleNJOIN25  GROUP_BY,COMBINER
> job_201104272229_75695  125  100  19  11  15  32  18  26  CNJOIN3,GNJOIN3,sampleNJOIN3  GROUP_BY,COMBINER
> job_201104272229_75698  1  100  12  12  12  13  9  11  CNJOIN15,GNJOIN15,sampleNJOIN15  GROUP_BY,COMBINER
> job_201104272229_75702  2  100  21  5  13  35  22  26  CNJOIN19,GNJOIN19,sampleNJOIN19  GROUP_BY,COMBINER
> job_201104272229_75724  1  1  4  4  4  11  11  11  ONJOIN15  SAMPLER
> job_201104272229_75725  0  0  0  0  0  0  0  ONJOIN25  SAMPLER
> job_201104272229_75726  6  1  8  6  8  24  24  24  ONJOIN3  SAMPLER
> job_201104272229_75729  0  0  0  0  0  0  0
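
The symptom described above (a job whose eight numeric columns are all 0) can be checked for mechanically. The following is a minimal Python sketch, not part of Pig or of PIG-2029.patch. It assumes the Job Stats rows are whitespace-separated with the eight numeric columns directly after the job ID. The helper names (parse_job_row, all_zero_jobs) are hypothetical, and the two sample rows are copied from Run 1.

```python
# Hypothetical helper for spotting the symptom from PIG-2029:
# a job row whose eight numeric columns are all reported as 0.
NUMERIC_COLUMNS = (
    "Maps", "Reduces",
    "MaxMapTime", "MinMapTime", "AvgMapTime",
    "MaxReduceTime", "MinReduceTime", "AvgReduceTime",
)


def parse_job_row(line):
    """Split one Job Stats row into (job_id, {column: value}).

    Assumption: fields are whitespace-separated and the numeric
    columns come immediately after the job ID.
    """
    fields = line.split()
    job_id = fields[0]
    values = [int(v) for v in fields[1:1 + len(NUMERIC_COLUMNS)]]
    return job_id, dict(zip(NUMERIC_COLUMNS, values))


def all_zero_jobs(lines):
    """Return the IDs of jobs whose numeric columns are all 0."""
    suspicious = []
    for line in lines:
        if not line.startswith("job_"):
            continue  # skip headers and other non-row lines
        job_id, stats = parse_job_row(line)
        if all(value == 0 for value in stats.values()):
            suspicious.append(job_id)
    return suspicious


# Two rows taken from Run 1 in the report.
RUN_1_SAMPLE = [
    "job_201103091134_556600 0 0 0 0 0 0 0 0 UNION5 MULTI_QUERY,MAP_ONLY /user/viraj/dir,,",
    "job_201103091134_556601 7 100 17 8 14 200 15 27 CNJOIN25,GNJOIN25,sampleNJOIN25 GROUP_BY,COMBINER",
]

print(all_zero_jobs(RUN_1_SAMPLE))
```

Jobs flagged this way are only candidates. Whether the values are really wrong still has to be confirmed against the JobTracker, as the report does.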

[jira] [Created] (PIG-2079) Transition Grunt parser to antlr

2011-05-18 Thread Olga Natkovich (JIRA)
Transition Grunt parser to antlr


 Key: PIG-2079
 URL: https://issues.apache.org/jira/browse/PIG-2079
 Project: Pig
  Issue Type: Task
Reporter: Olga Natkovich
 Fix For: 0.10




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (PIG-2080) Transition parameter substitution to use antlr

2011-05-18 Thread Olga Natkovich (JIRA)
Transition parameter substitution to use antlr
--

 Key: PIG-2080
 URL: https://issues.apache.org/jira/browse/PIG-2080
 Project: Pig
  Issue Type: Task
Reporter: Olga Natkovich
 Fix For: 0.10






[jira] [Commented] (PIG-1824) Support import modules in Jython UDF

2011-05-18 Thread Richard Ding (JIRA)

[ https://issues.apache.org/jira/browse/PIG-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035542#comment-13035542 ]

Richard Ding commented on PIG-1824:
---

The new patch fixes the unit test errors reported earlier. I see one 
(different) failed test in TestGrunt; I'm not sure whether it is related to 
the patch. 

> Support import modules in Jython UDF
> 
>
> Key: PIG-1824
> URL: https://issues.apache.org/jira/browse/PIG-1824
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Richard Ding
>Assignee: Woody Anderson
> Fix For: 0.10
>
> Attachments: 1824.patch, 1824_final.patch, 1824a.patch, 1824b.patch, 
> 1824c.patch, 1824d.patch, 1824x.patch, 
> TEST-org.apache.pig.test.TestGrunt.txt, 
> TEST-org.apache.pig.test.TestScriptLanguage.txt, 
> TEST-org.apache.pig.test.TestScriptUDF.txt
>
>
> Currently, a Jython UDF script doesn't support the Jython import statement, 
> as in the following example:
> {code}
> #!/usr/bin/python
> import re
> @outputSchema("word:chararray")
> def resplit(content, regex, index):
>     return re.compile(regex).split(content)[index]
> {code}
> Can Pig automatically locate the Jython module file and ship it to the 
> backend? Or should we add a ship clause to let the user explicitly specify 
> the module to ship? 
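
To make the "automatically locate" option concrete, here is a minimal sketch of one way to find the modules a UDF script imports and the files they would load from. It is an illustration only and says nothing about how the attached patches work. It assumes CPython 3 (ast, importlib.util) as a stand-in for Jython's import machinery, and the function names imported_modules and locate are hypothetical.

```python
import ast
import importlib.util


def imported_modules(source):
    """Return the sorted top-level module names imported by a script.

    The source is only parsed, never executed, so names supplied by
    Pig at run time (such as outputSchema) need not be defined.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return sorted(names)


def locate(name):
    """Return the location a top-level module would load from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# The UDF from the issue description.
UDF_SOURCE = '''
import re

@outputSchema("word:chararray")
def resplit(content, regex, index):
    return re.compile(regex).split(content)[index]
'''

for name in imported_modules(UDF_SOURCE):
    print(name, locate(name))
```

A static scan like this misses modules imported dynamically, which is one argument for the explicit ship clause raised in the description.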



[jira] [Created] (PIG-2078) POProject.getNext(DataBag) does not handle null

2011-05-18 Thread Daniel Dai (JIRA)
POProject.getNext(DataBag) does not handle null
---

 Key: PIG-2078
 URL: https://issues.apache.org/jira/browse/PIG-2078
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0


The following script fails when run with "-t MergeForEach":
{code}
a = load '1.txt' as (a0:bag{}, a1:int);
b = foreach a generate a0;
dump b;
{code}

1.txt:
{(1)}   2
3

Error stack:
java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:310)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:251)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:261)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:256)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
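
The failure mode in the stack trace can be modelled outside Pig. The sketch below is a Python analogue, not Pig's Java code and not the eventual fix. The function name consume_input_bag only mirrors the frame in the trace, and the second sample row assumes that the second line of 1.txt yields a null bag.

```python
def consume_input_bag(field):
    """Return the tuples of a bag field, passing a null field through.

    Hypothetical model of a null-safe projection: a null (None) bag
    is returned as None instead of being iterated.
    """
    if field is None:
        # Without this guard, list(None) raises TypeError, the Python
        # counterpart of the NullPointerException in the trace.
        return None
    return list(field)


# Row 1 mirrors "{(1)}  2"; row 2 assumes a null bag with a1 = 3.
rows = [([(1,)], 2), (None, 3)]
projected = [consume_input_bag(a0) for a0, _ in rows]
print(projected)
```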
