[jira] [Commented] (PIG-2031) NPE in TOP

2011-05-04 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028810#comment-13028810
 ] 

Dmitriy V. Ryaboy commented on PIG-2031:


Patch seems fine, thanks for doing that.
A new test that tries the null bag would be good.

 NPE in TOP
 --

 Key: PIG-2031
 URL: https://issues.apache.org/jira/browse/PIG-2031
 Project: Pig
  Issue Type: Bug
Reporter: Jacob Perkins
 Attachments: toppatch.txt


 If a NULL DataBag is passed to org.apache.pig.builtin.TOP, an NPE is 
 thrown. Consider:
 {code}
 $: cat foo.tsv
 a  {(foo,1),(bar,2)}
 b
 c  {(fyha,4),(asdf,9)}
 {code}
 then:
 {code}
 data  = LOAD 'foo.tsv' AS (key:chararray, a_bag:bag {t:tuple (name:chararray, 
 value:int)});
 tpd   = FOREACH data {
   top_n = TOP(1, 1, a_bag);
   GENERATE
     key   AS key,
     top_n AS top_n
   ;
 };
 DUMP tpd;
 {code}
 will throw an NPE when it gets to the row with no bag.
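
 One way to avoid this kind of failure is to check for a null bag before iterating over it. The sketch below is not the attached patch (toppatch.txt is not reproduced in this message). It uses plain Java lists in place of Pig's DataBag and Tuple, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a TOP-like function; plain lists replace Pig's DataBag/Tuple.
class TopSketch {
    // Returns the n rows with the largest value in column `col`, or null if the bag is null.
    static List<Object[]> top(int n, int col, List<Object[]> bag) {
        if (bag == null) {
            return null; // null in, null out, instead of a NullPointerException
        }
        List<Object[]> sorted = new ArrayList<>(bag);
        // Sort descending by the chosen column (assumed here to hold Integer values).
        sorted.sort(Comparator.comparing((Object[] row) -> (Integer) row[col]).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }
}
```

 Returning null for a null input is only one possible convention; returning an empty bag would be another.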

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


run_hadoop_grid5000

2011-05-04 Thread hiba houimli

Hi,
I run Hadoop on Grid 5000. I would like to know how I can use Pig in Hadoop 
mode in this case.
Thank you.

[jira] [Commented] (PIG-2008) Cache outputFormat in HBaseStorage

2011-05-04 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028841#comment-13028841
 ] 

Alan Gates commented on PIG-2008:
-

Committed to trunk.  Will commit to 0.9 branch shortly.

 Cache outputFormat in HBaseStorage
 --

 Key: PIG-2008
 URL: https://issues.apache.org/jira/browse/PIG-2008
 Project: Pig
  Issue Type: Bug
  Components: build
Affects Versions: 0.8.0
Reporter: Jacob Perkins
Priority: Minor
 Fix For: 0.8.0

 Attachments: patch_file.txt

   Original Estimate: 10m
  Remaining Estimate: 10m

 getOutputFormat gets called more than one time in a StoreFunc. Modify 
 HBaseStorage to only create an instance of TableOutputFormat one time (since 
 it creates a new HTable connection each time) as opposed to multiple times 
 like it does now.
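
 The caching described above can be sketched as a lazily initialized field. This is a minimal illustration, not the attached patch: ExpensiveFormat is a hypothetical stand-in for TableOutputFormat, and CachingStoreSketch is a hypothetical stand-in for the store function.

```java
// Hypothetical stand-in for TableOutputFormat; counts how many instances get created.
class ExpensiveFormat {
    static int created = 0;

    ExpensiveFormat() {
        created++; // per the issue text, the real class opens a new HTable connection here
    }
}

// Hypothetical store function that creates its output format once and reuses it.
class CachingStoreSketch {
    private ExpensiveFormat outputFormat; // cached after the first call

    ExpensiveFormat getOutputFormat() {
        if (outputFormat == null) {
            outputFormat = new ExpensiveFormat();
        }
        return outputFormat;
    }
}
```

 Repeated calls to getOutputFormat() on the same instance then return the same object, so only one connection would be opened per store function.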



[jira] [Created] (PIG-2035) Macro expansion doesn't handle multiple expansions of same macro inside another macro

2011-05-04 Thread Richard Ding (JIRA)
Macro expansion doesn't handle multiple expansions of same macro inside another 
macro
-

 Key: PIG-2035
 URL: https://issues.apache.org/jira/browse/PIG-2035
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.9.0


Here is the use case:

{code}
define test ( in, out, x ) returns c { 
a = load '$in' as (name, age, gpa);
b = group a by gpa;
$c = foreach b generate group, COUNT(a.$x);
store $c into '$out';
};

define test2( in, out ) returns x { 
$x = test( '$in', '$out', 'name' );
$x = test( '$in', '$out.1', 'age' );
$x = test( '$in', '$out.2', 'gpa' );
};

x = test2('studenttab10k', 'myoutput');
{code}



[jira] [Updated] (PIG-2036) Set header delimiter in PigStorageSchema

2011-05-04 Thread Mads Moeller (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mads Moeller updated PIG-2036:
--

Attachment: PIG-2036.patch

Patch

 Set header delimiter in PigStorageSchema
 

 Key: PIG-2036
 URL: https://issues.apache.org/jira/browse/PIG-2036
 Project: Pig
  Issue Type: Improvement
  Components: impl
Affects Versions: 0.7.0, 0.8.0
Reporter: Mads Moeller
Priority: Minor
 Attachments: PIG-2036.patch


 Piggybank's PigStorageSchema currently defaults the delimiter to a tab in the 
 generated header file (.pig_header).
 The attached patch sets the header delimiter to whatever is passed in via the 
 constructor. Otherwise it defaults to tab ('\t').
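
 The intended behavior can be sketched as follows. This is an illustration under assumptions, not the contents of PIG-2036.patch: HeaderSketch and headerLine are hypothetical names, and the real PigStorageSchema writes its header through Pig's storage classes rather than a plain string join.

```java
// Hypothetical sketch of a header writer whose delimiter follows the constructor argument.
class HeaderSketch {
    private final String delimiter;

    HeaderSketch() {
        this("\t"); // no argument: fall back to tab
    }

    HeaderSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    // Joins the field names with the configured delimiter to form the header line.
    String headerLine(String... fieldNames) {
        return String.join(delimiter, fieldNames);
    }
}
```

 With this shape, new HeaderSketch(",") produces a comma-separated header, while the no-argument constructor keeps the tab default.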



[jira] [Created] (PIG-2039) IndexOutOfBounException for a case

2011-05-04 Thread Xuefu Zhang (JIRA)
IndexOutOfBounException for a case
--

 Key: PIG-2039
 URL: https://issues.apache.org/jira/browse/PIG-2039
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Fix For: 0.9.0


The following query gives an exception:

a = load '1.txt' as (a0:int, a1:int, a2:int);
b = group a by a0;
c = foreach b { c1 = limit a 10; c2 =  distinct c1.a1; c3 = distinct c1.a2; 
generate c2, c3;};
store c into 'output';

2011-05-04 12:36:01,720 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
2999: Unexpected internal error. Index: 0, Size: 0

Stack trace:

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at 
org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:279)
at 
org.apache.pig.newplan.logical.relational.LOGenerate.getSchema(LOGenerate.java:88)
at 
org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.validate(SchemaAliasVisitor.java:60)
at 
org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:104)
at 
org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
at 
org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at 
org.apache.pig.newplan.logical.visitor.SchemaAliasVisitor.visit(SchemaAliasVisitor.java:99)
at 
org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
at 
org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1664)
at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1615)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1586)
at org.apache.pig.PigServer.registerQuery(PigServer.java:580)
at 
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:930)
at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:176)
at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
at org.apache.pig.Main.run(Main.java:488)
at org.apache.pig.Main.main(Main.java:109)





[jira] [Updated] (PIG-2035) Macro expansion doesn't handle multiple expansions of same macro inside another macro

2011-05-04 Thread Richard Ding (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-2035:
--

Attachment: PIG-2035_1.patch

 Macro expansion doesn't handle multiple expansions of same macro inside 
 another macro
 -

 Key: PIG-2035
 URL: https://issues.apache.org/jira/browse/PIG-2035
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-2035_1.patch


 Here is the use case:
 {code}
 define test ( in, out, x ) returns c { 
 a = load '$in' as (name, age, gpa);
 b = group a by gpa;
 $c = foreach b generate group, COUNT(a.$x);
 store $c into '$out';
 };
 define test2( in, out ) returns x { 
 $x = test( '$in', '$out', 'name' );
 $x = test( '$in', '$out.1', 'age' );
 $x = test( '$in', '$out.2', 'gpa' );
 };
 x = test2('studenttab10k', 'myoutput');
 {code}



[jira] [Created] (PIG-2040) Move classloader from QueryParserDriver to PigContext

2011-05-04 Thread Daniel Dai (JIRA)
Move classloader from QueryParserDriver to PigContext
-

 Key: PIG-2040
 URL: https://issues.apache.org/jira/browse/PIG-2040
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0
 Attachments: PIG-2040-1.patch

After PIG-1775, mapreduce mode fails. The reason is that we moved the classloader 
from LogicalPlanBuilder to QueryParserDriver, which needs antlr.jar; however, we 
don't ship antlr.jar to the backend. It is better to move the classloader to 
PigContext.



[jira] [Updated] (PIG-2040) Move classloader from QueryParserDriver to PigContext

2011-05-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2040:


Attachment: PIG-2040-1.patch

 Move classloader from QueryParserDriver to PigContext
 -

 Key: PIG-2040
 URL: https://issues.apache.org/jira/browse/PIG-2040
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0

 Attachments: PIG-2040-1.patch


 After PIG-1775, mapreduce mode fails. The reason is that we moved the classloader 
 from LogicalPlanBuilder to QueryParserDriver, which needs antlr.jar; however, 
 we don't ship antlr.jar to the backend. It is better to move the classloader to 
 PigContext.



[jira] [Commented] (PIG-1866) Dereference a bag within a tuple does not work

2011-05-04 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029014#comment-13029014
 ] 

Daniel Dai commented on PIG-1866:
-

Yes, it only goes to 0.9. For 0.8, you need to apply PIG-1866-3.patch manually.

 Dereference a bag within a tuple does not work
 --

 Key: PIG-1866
 URL: https://issues.apache.org/jira/browse/PIG-1866
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0

 Attachments: PIG-1866-1.patch, PIG-1866-2.patch, PIG-1866-3.patch


 The following script does not work (both in new and old logical plan):
 {code}
 a = load '1.txt' as (t : tuple(i: int, b1: bag { b_tuple : tuple ( b_str: 
 chararray) }));
 b = foreach a generate t.b1;
 dump b;
 {code}
 1.txt:
 (1,{(one),(two)})
 Error from old logical plan:
 java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be 
 cast to org.apache.pig.data.DataBag
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:482)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 Error from new logical plan:
 java.lang.NullPointerException
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.consumeInputBag(POProject.java:246)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:200)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
 at 
 org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:237)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
 at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
 If we change b = foreach a generate t.b1; to b = foreach a generate t.i;, 
 it works fine. Only dereferencing a bag fails.



[jira] [Commented] (PIG-2035) Macro expansion doesn't handle multiple expansions of same macro inside another macro

2011-05-04 Thread Richard Ding (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029045#comment-13029045
 ] 

Richard Ding commented on PIG-2035:
---

test-patch result:

{code}
 [exec] -1 overall.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec] 
 [exec] -1 release audit.  The applied patch generated 585 release audit warnings (more than the trunk's current 584 warnings).
{code}



 Macro expansion doesn't handle multiple expansions of same macro inside 
 another macro
 -

 Key: PIG-2035
 URL: https://issues.apache.org/jira/browse/PIG-2035
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-2035_1.patch


 Here is the use case:
 {code}
 define test ( in, out, x ) returns c { 
 a = load '$in' as (name, age, gpa);
 b = group a by gpa;
 $c = foreach b generate group, COUNT(a.$x);
 store $c into '$out';
 };
 define test2( in, out ) returns x { 
 $x = test( '$in', '$out', 'name' );
 $x = test( '$in', '$out.1', 'age' );
 $x = test( '$in', '$out.2', 'gpa' );
 };
 x = test2('studenttab10k', 'myoutput');
 {code}



[jira] [Commented] (PIG-2040) Move classloader from QueryParserDriver to PigContext

2011-05-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029058#comment-13029058
 ] 

Xuefu Zhang commented on PIG-2040:
--

+1

 Move classloader from QueryParserDriver to PigContext
 -

 Key: PIG-2040
 URL: https://issues.apache.org/jira/browse/PIG-2040
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0

 Attachments: PIG-2040-1.patch


 After PIG-1775, mapreduce mode fails. The reason is that we moved the classloader 
 from LogicalPlanBuilder to QueryParserDriver, which needs antlr.jar; however, 
 we don't ship antlr.jar to the backend. It is better to move the classloader to 
 PigContext.



[jira] [Commented] (PIG-2034) Pig client uses fs.default.name as provided from the JobTracker instead of local value

2011-05-04 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029059#comment-13029059
 ] 

Bill Graham commented on PIG-2034:
--

I was wondering the same thing after I filed this. I can't tell if this is from 
the MR job submission doing this on the client or Pig. Can you tell from the 
stack trace I provided?

 Pig client uses fs.default.name as provided from the JobTracker instead of 
 local value
 --

 Key: PIG-2034
 URL: https://issues.apache.org/jira/browse/PIG-2034
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
 Attachments: pig_1304465896181.log


 When submitting a Pig job, the client uses the {{fs.default.name}} supplied 
 to it by the JobTracker (via core-site.xml on the master typically) during 
 the staging phase. After that, the client then uses the {{fs.default.name}} 
 from its local configs. This seems like a bug to me. Expected behavior would 
 be to always use the local value.
 I found this bug when the server configs were set to not use a FQDN for 
 {{fs.default.name}}. This caused the client to fail because it didn't have 
 the same default DNS domain. 
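
 A quick way to spot an unqualified host name in an fs.default.name value is to check whether the host part of the URI contains a dot. The sketch below is a rough heuristic and not part of Hadoop or Pig; the class name, method name, and the example URIs used with it are hypothetical.

```java
import java.net.URI;

// Rough heuristic (an assumption, not a Hadoop API): a host name is treated as fully
// qualified if it contains at least one dot.
class FsNameCheck {
    static boolean hasQualifiedHost(String fsDefaultName) {
        String host = URI.create(fsDefaultName).getHost();
        return host != null && host.contains(".");
    }
}
```

 For example, hasQualifiedHost("hdfs://namenode:8020") is false, while hasQualifiedHost("hdfs://namenode.example.com:8020") is true.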



[jira] [Commented] (PIG-2012) Comments at the begining of the file throws off line numbers in errors

2011-05-04 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029063#comment-13029063
 ] 

Xuefu Zhang commented on PIG-2012:
--

1. Macro invocation stack should be per macro, not per node in a macro. Also, 
stack seems to be a more appropriate data structure for this.

2. To get line numbers right, we should be able to do it more naturally in the 
PigParserNode constructor, rather than by prepending \n to the text.

3. It might be cleaner and more OO to have a PigParserMacroNode that inherits 
from PigParserNode. In that case, we only need to override the toString() method 
and can avoid if-else clauses in the code.

 Comments at the begining of the file throws off line numbers in errors
 --

 Key: PIG-2012
 URL: https://issues.apache.org/jira/browse/PIG-2012
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Alan Gates
Assignee: Richard Ding
 Fix For: 0.9.0

 Attachments: PIG-2012_1.patch, macro.pig


 The preprocessor does not appear to be handling leading comments properly 
 when calculating line numbers for error messages.  In the attached script, 
 the error is reported to be on line 7.  It is actually on line 10.
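
 One general technique for keeping line numbers stable when a preprocessor drops comments is to remove the comment text but keep its newlines. The sketch below illustrates that idea only; it is not taken from the attached patch, the class and method names are hypothetical, and it handles only /* ... */ comments without regard to quoted strings.

```java
// Hypothetical sketch: blank out /* ... */ comments but keep their newlines,
// so every remaining token stays on its original line.
class CommentStripSketch {
    static String stripBlockComments(String script) {
        StringBuilder out = new StringBuilder();
        boolean inComment = false;
        int i = 0;
        while (i < script.length()) {
            if (!inComment && script.startsWith("/*", i)) {
                inComment = true;
                i += 2;
            } else if (inComment && script.startsWith("*/", i)) {
                inComment = false;
                i += 2;
            } else {
                char c = script.charAt(i);
                if (!inComment || c == '\n') {
                    out.append(c); // newlines inside comments are preserved
                }
                i++;
            }
        }
        return out.toString();
    }

    // 1-based line number of the first occurrence of `token`, or -1 if absent.
    static int lineOf(String text, String token) {
        int pos = text.indexOf(token);
        if (pos < 0) {
            return -1;
        }
        int line = 1;
        for (int k = 0; k < pos; k++) {
            if (text.charAt(k) == '\n') {
                line++;
            }
        }
        return line;
    }
}
```

 Because the newlines survive, lineOf gives the same answer for a statement before and after stripping, so an error reported against the stripped text still points at the right line of the original script.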



[jira] [Updated] (PIG-1775) Removal of old logical plan

2011-05-04 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-1775:
-

Attachment: PIG-1775-3.patch

1. Migrated additional test cases
2. Provided new implementation for parsing schema from a string, parsing 
constant from a string, etc.

 Removal of old logical plan
 ---

 Key: PIG-1775
 URL: https://issues.apache.org/jira/browse/PIG-1775
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.9.0
Reporter: Yan Zhou
Assignee: Xuefu Zhang
 Fix For: 0.9.0

 Attachments: PIG-1775-2.patch, PIG-1775-3.patch, PIG-1775.patch


 Once the new logical plan is stable enough, only it will be used and the old 
 logical plan will be removed. This is scheduled for the 0.9 release.



[jira] [Commented] (PIG-2034) Pig client uses fs.default.name as provided from the JobTracker instead of local value

2011-05-04 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029073#comment-13029073
 ] 

Daniel Dai commented on PIG-2034:
-

The stack is all from mapreduce. I wonder if you will see the same thing in a 
pure mapreduce job.

 Pig client uses fs.default.name as provided from the JobTracker instead of 
 local value
 --

 Key: PIG-2034
 URL: https://issues.apache.org/jira/browse/PIG-2034
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
 Attachments: pig_1304465896181.log


 When submitting a Pig job, the client uses the {{fs.default.name}} supplied 
 to it by the JobTracker (via core-site.xml on the master typically) during 
 the staging phase. After that, the client then uses the {{fs.default.name}} 
 from its local configs. This seems like a bug to me. Expected behavior would 
 be to always use the local value.
 I found this bug when the server configs were set to not use a FQDN for 
 {{fs.default.name}}. This caused the client to fail because it didn't have 
 the same default DNS domain. 



[jira] [Commented] (PIG-2040) Move classloader from QueryParserDriver to PigContext

2011-05-04 Thread Santhosh Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029081#comment-13029081
 ] 

Santhosh Srinivasan commented on PIG-2040:
--

I remember a user trying to parse schemas using the Util.parseFromString() 
method in a UDF. This might require shipping the antlr binaries to the 
back-end. Will this patch address this issue too?

 Move classloader from QueryParserDriver to PigContext
 -

 Key: PIG-2040
 URL: https://issues.apache.org/jira/browse/PIG-2040
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0

 Attachments: PIG-2040-1.patch


 After PIG-1775, mapreduce mode fails. The reason is that we moved the classloader 
 from LogicalPlanBuilder to QueryParserDriver, which needs antlr.jar; however, 
 we don't ship antlr.jar to the backend. It is better to move the classloader to 
 PigContext.



[jira] [Commented] (PIG-2034) Pig client uses fs.default.name as provided from the JobTracker instead of local value

2011-05-04 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029082#comment-13029082
 ] 

Bill Graham commented on PIG-2034:
--

Yup. I just tried the same with WordCount and got the same failure, so this 
doesn't seem to be a Pig bug. Do you think this is a valid MapReduce bug?

{code}
Exception in thread "main" java.net.UnknownHostException: unknown host: 
colo1-hadoop-nn-r0-n0
at org.apache.hadoop.ipc.Client$Connection.init(Client.java:195)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:850)
at org.apache.hadoop.ipc.Client.call(Client.java:720)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at $Proxy0.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:207)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:170)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1419)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1444)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1432)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
at org.apache.hadoop.mapred.JobClient.getFs(JobClient.java:504)
at 
org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:608)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:802)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:771)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1290)
at cnwk.hadoop.mapreduce.WordCount.main(WordCount.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{code}

 Pig client uses fs.default.name as provided from the JobTracker instead of 
 local value
 --

 Key: PIG-2034
 URL: https://issues.apache.org/jira/browse/PIG-2034
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
 Attachments: pig_1304465896181.log


 When submitting a Pig job, the client uses the {{fs.default.name}} supplied 
 to it by the JobTracker (via core-site.xml on the master typically) during 
 the staging phase. After that, the client then uses the {{fs.default.name}} 
 from its local configs. This seems like a bug to me. Expected behavior would 
 be to always use the local value.
 I found this bug when the server configs were set to not use a FQDN for 
 {{fs.default.name}}. This caused the client to fail because it didn't have 
 the same default DNS domain. 



[jira] [Created] (PIG-2041) Minicluster should make each run independent

2011-05-04 Thread Daniel Dai (JIRA)
Minicluster should make each run independent


 Key: PIG-2041
 URL: https://issues.apache.org/jira/browse/PIG-2041
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0, 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0


Minicluster reuses ~/pigtest/conf/hadoop-site.xml. If something is wrong in 
hadoop-site.xml, the next test will also be affected. This leads to some 
mysterious test failures. 
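
One way to make runs independent is to give each run its own freshly created configuration directory instead of a fixed path under the home directory. The sketch below shows only that idea and is not the attached patch; RunConfigSketch, newConfDir, and the "pigtest-conf-" prefix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: give every test run its own configuration directory
// instead of reusing a fixed path under the home directory.
class RunConfigSketch {
    static Path newConfDir() {
        try {
            // createTempDirectory returns a fresh, uniquely named directory on each call.
            return Files.createTempDirectory("pigtest-conf-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A leftover or damaged hadoop-site.xml from an earlier run then cannot leak into the next one, because the next run never looks in the old directory.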



[jira] [Updated] (PIG-2041) Minicluster should make each run independent

2011-05-04 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2041:


Attachment: PIG-2041-1.patch

PIG-2041-1.patch also includes a change to fix the TestStoreInstances failure.

 Minicluster should make each run independent
 

 Key: PIG-2041
 URL: https://issues.apache.org/jira/browse/PIG-2041
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0, 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0

 Attachments: PIG-2041-1.patch


 Minicluster reuses ~/pigtest/conf/hadoop-site.xml. If something is wrong in 
 hadoop-site.xml, the next test will also be affected. This leads to some 
 mysterious test failures. 



[jira] [Commented] (PIG-2040) Move classloader from QueryParserDriver to PigContext

2011-05-04 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029093#comment-13029093
 ] 

Daniel Dai commented on PIG-2040:
-

bq. I remember a user trying to parse schemas using the Util.parseFromString() 
method in a UDF. This might require shipping the antlr binaries to the 
back-end. Will this patch address this issue too?

No, it will not. It seems we need to ship antlr.jar as well. Still, putting the 
classloader into PigContext is cleaner. Also, I wonder how 
Util.parseFromString() works now? We don't ship javacc either.

 Move classloader from QueryParserDriver to PigContext
 -

 Key: PIG-2040
 URL: https://issues.apache.org/jira/browse/PIG-2040
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.9.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.9.0

 Attachments: PIG-2040-1.patch


 After PIG-1775, mapreduce mode fails. The reason is that we moved the classloader 
 from LogicalPlanBuilder to QueryParserDriver, which needs antlr.jar; however, 
 we don't ship antlr.jar to the backend. It is better to move the classloader to 
 PigContext.



[jira] [Resolved] (PIG-2008) Cache outputFormat in HBaseStorage

2011-05-04 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates resolved PIG-2008.
-

   Resolution: Fixed
Fix Version/s: 0.10, 0.9.0 (was: 0.8.0)
     Assignee: Jacob Perkins

Patch checked into trunk and 0.9 branch.  Thanks Jacob.

 Cache outputFormat in HBaseStorage
 --

 Key: PIG-2008
 URL: https://issues.apache.org/jira/browse/PIG-2008
 Project: Pig
  Issue Type: Bug
  Components: build
Affects Versions: 0.8.0
Reporter: Jacob Perkins
Assignee: Jacob Perkins
Priority: Minor
 Fix For: 0.9.0, 0.10

 Attachments: patch_file.txt

   Original Estimate: 10m
  Remaining Estimate: 10m

 getOutputFormat gets called more than one time in a StoreFunc. Modify 
 HBaseStorage to only create an instance of TableOutputFormat one time (since 
 it creates a new HTable connection each time) as opposed to multiple times 
 like it does now.
