[jira] Resolved: (PIG-1699) HBaseStorage option -gte resolves to CompareOp.GREATER instead of CompareOp.GREATER_OR_EQUAL

2011-01-24 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy resolved PIG-1699.


Resolution: Not A Problem

Resolving as 'not a problem' since I don't believe it is one and have not been 
provided with a counterexample.

Please reopen if I am wrong.

> HBaseStorage option -gte resolves to CompareOp.GREATER instead of 
> CompareOp.GREATER_OR_EQUAL
> 
>
> Key: PIG-1699
> URL: https://issues.apache.org/jira/browse/PIG-1699
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Jeremy Hinegardner
>Priority: Minor
> Attachments: PIG-1699.patch
>
>
> When using HBaseStorage with the '-gte' option, the value is passed to 
> HTableInputFormat, which then uses CompareOp.GREATER instead of 
> CompareOp.GREATER_OR_EQUAL for split decisions.
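
For concreteness, a minimal sketch of the mapping the report expected, written 
against the HBase 0.20-era filter API (the wrapper class is illustrative, not 
HBaseStorage's actual code):

{code}
// Illustrative sketch: how an inclusive '-gte' bound should map onto an
// HBase scan filter. GREATER would silently drop the boundary row itself.
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class GteScanSketch {
    public static Scan scanFrom(String gte) {
        Scan scan = new Scan();
        // '-gte' means "greater than or equal", so the inclusive operator
        // is the correct one for both filtering and split decisions.
        scan.setFilter(new RowFilter(CompareOp.GREATER_OR_EQUAL,
                new BinaryComparator(Bytes.toBytes(gte))));
        return scan;
    }
}
{code}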

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1769) Consistency for HBaseStorage

2011-01-24 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1769:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Consistency for HBaseStorage
> 
>
> Key: PIG-1769
> URL: https://issues.apache.org/jira/browse/PIG-1769
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Corbin Hoenes
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.9.0
>
> Attachments: PIG_1769.patch
>
>
> In our load statement we are allowed to prefix the table name with "hbase://" 
> but when we call
> store it throws an exception unless we remove hbase:// from the table
> name:
> this works:
> store raw into 'piggytest2' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1
> anchor2:field1a anchor2:field2a');
> this won't:
> store raw into 'hbase://piggytest2'
> Exception:
> Caused by: java.lang.IllegalArgumentException:
> java.net.URISyntaxException: Relative path in absolute URI:
> hbase://piggytest2_logs
> It would be nice to be able to prefix the store with hbase:// so it's consistent 
> with the load syntax.
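
For context, a sketch of one way a StoreFunc can accept a scheme-prefixed 
location: skip filesystem path resolution and strip the scheme itself. This is 
only an illustration of the approach, not necessarily what PIG_1769.patch does:

{code}
// Illustrative sketch of an HBaseStorage-like StoreFunc accepting
// 'hbase://' store locations; not necessarily the committed patch.
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.pig.StoreFunc;

public abstract class HBaseStoreSketch extends StoreFunc {

    @Override
    public String relToAbsPathForStoreLocation(String location, Path curDir)
            throws IOException {
        // An hbase:// location names a table, not a filesystem path; treating
        // it as a path is what leads to errors like the
        // "Relative path in absolute URI" exception above.
        return location;
    }

    @Override
    public void setStoreLocation(String location, Job job) throws IOException {
        // Strip the scheme so the plain table name reaches the output format.
        String tableName = location.startsWith("hbase://")
                ? location.substring("hbase://".length())
                : location;
        job.getConfiguration().set(
                org.apache.hadoop.hbase.mapreduce.TableOutputFormat.OUTPUT_TABLE,
                tableName);
    }
}
{code}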

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1769) Consistency for HBaseStorage

2011-01-24 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986236#action_12986236
 ] 

Dmitriy V. Ryaboy commented on PIG-1769:


committed.

> Consistency for HBaseStorage
> 
>
> Key: PIG-1769
> URL: https://issues.apache.org/jira/browse/PIG-1769
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Corbin Hoenes
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.9.0
>
> Attachments: PIG_1769.patch
>
>
> In our load statement we are allowed to prefix the table name with "hbase://" 
> but when we call
> store it throws an exception unless we remove hbase:// from the table
> name:
> this works:
> store raw into 'piggytest2' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1
> anchor2:field1a anchor2:field2a');
> this won't:
> store raw into 'hbase://piggytest2'
> Exception:
> Caused by: java.lang.IllegalArgumentException:
> java.net.URISyntaxException: Relative path in absolute URI:
> hbase://piggytest2_logs
> It would be nice to be able to prefix the store with hbase:// so it's consistent 
> with the load syntax.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Closed: (PIG-1769) Consistency for HBaseStorage

2011-01-24 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy closed PIG-1769.
--


> Consistency for HBaseStorage
> 
>
> Key: PIG-1769
> URL: https://issues.apache.org/jira/browse/PIG-1769
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Corbin Hoenes
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.9.0
>
> Attachments: PIG_1769.patch
>
>
> In our load statement we are allowed to prefix the table name with "hbase://" 
> but when we call
> store it throws an exception unless we remove hbase:// from the table
> name:
> this works:
> store raw into 'piggytest2' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1
> anchor2:field1a anchor2:field2a');
> this won't:
> store raw into 'hbase://piggytest2'
> Exception:
> Caused by: java.lang.IllegalArgumentException:
> java.net.URISyntaxException: Relative path in absolute URI:
> hbase://piggytest2_logs
> It would be nice to be able to prefix the store with hbase:// so it's consistent 
> with the load syntax.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1748) Add load/store function AvroStorage for avro data

2011-01-24 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986204#action_12986204
 ] 

Scott Carey commented on PIG-1748:
--

@Jacob
{quote} I can't say I'm convinced, and am in fact more concerned by your 
example, given that this approach essentially builds dependencies on all of 
those projects into Avro.{quote}

Avro is completely modularized now, so there would not be any dependency mess 
like that.  It is now easy to add separate modules such as 'avro-pig.jar'  or 
'avro-hive.jar'.  It already has 'avro-mapred.jar'. 
https://cwiki.apache.org/confluence/display/AVRO/Build+Documentation#BuildDocumentation-Java

As this is getting off topic, we can move to the Avro developer mailing list.  Related issues 
are https://issues.apache.org/jira/browse/AVRO-647 and the issues linked to it, 
as well as https://issues.apache.org/jira/browse/AVRO-592.  There is no ticket 
yet for the broader-scope work.

> Add load/store function AvroStorage for avro data
> -
>
> Key: PIG-1748
> URL: https://issues.apache.org/jira/browse/PIG-1748
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: lin guo
>Assignee: Jakob Homan
> Attachments: avro_storage.patch, avro_test_files.tar.gz, 
> PIG-1748-2.patch
>
>
> We want to use Pig to process arbitrary Avro data and store results as Avro 
> files. AvroStorage() extends two PigFuncs: LoadFunc and StoreFunc. 
> Due to discrepancies between the Avro and Pig data models, AvroStorage has:
> 1. Limited support for "record": we do not support recursively defined records 
> because the number of fields in such records is data dependent.
> 2. Limited support for "union": we only accept nullable unions like ["null", 
> "some-type"].
> For simplicity, we also make the following assumptions:
> If the input directory is a leaf directory, then we assume the Avro data files 
> in it have the same schema;
> If the input directory contains sub-directories, then we assume the Avro data 
> files in all sub-directories have the same schema.
> AvroStorage takes no input parameters when used as a LoadFunc (except for 
> "debug [debug-level]"). 
> Users can provide parameters to AvroStorage when used as a StoreFunc. If they 
> don't, the Avro schema of the output data is derived from its Pig schema.
> Detailed documentation can be found at 
> http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data
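
To make the nullable-union restriction concrete, here is a small sketch of 
building the one union shape AvroStorage accepts, using the Avro Java API 
(the helper class itself is hypothetical):

{code}
// Hypothetical helper: AvroStorage only accepts nullable unions of the
// form ["null", "some-type"]; this builds such a schema with the Avro API.
import java.util.Arrays;
import org.apache.avro.Schema;

public class NullableUnionSketch {
    public static Schema nullable(Schema type) {
        return Schema.createUnion(
                Arrays.asList(Schema.create(Schema.Type.NULL), type));
    }

    public static void main(String[] args) {
        // Prints ["null","string"]; a union of two non-null types would
        // be rejected by the restriction described above.
        System.out.println(nullable(Schema.create(Schema.Type.STRING)));
    }
}
{code}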

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1813) Pig 0.8 throws ERROR 1075 while trying to refer a map in the result of eval udf.Works with 0.7

2011-01-24 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986189#action_12986189
 ] 

Daniel Dai commented on PIG-1813:
-

Patch is ready for review:
https://reviews.apache.org/r/353/

> Pig 0.8 throws ERROR 1075 while trying to refer a map in the result of  eval 
> udf.Works with 0.7
> ---
>
> Key: PIG-1813
> URL: https://issues.apache.org/jira/browse/PIG-1813
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Vivek Padmanabhan
> Attachments: PIG-1813-0.patch, PIG-1813-1.patch, PIG-1813-2.patch
>
>
> register myudf.jar;
> A = load 'input' using MyZippedStorage('\u0001') as ($inputSchema);
> B = foreach A generate id , value  ;
> C = foreach B generate id , org.myudf.ExplodeHashList( (chararray)value, 
> '\u0002', '\u0004', '\u0003') as value;
> D = FILTER C by value is not null;
> E = foreach D generate id , flatten(org.myudf.GETFIRST(value)) as hop;
> F = foreach E generate id , hop#'rmli' as rmli:bytearray ;
> store F into 'output.bz2' using PigStorage();
> The above script fails when run with Pig 0.8 but runs fine with Pig 0.7 or if 
> pig.usenewlogicalplan=false.
> Below is the exception thrown in 0.8:
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a 
> bytearray from the UDF. Cannot determine how to convert the bytearray to map.
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:952)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.processInput(POMapLookUp.java:87)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.getNext(POMapLookUp.java:98)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.getNext(POMapLookUp.java:117)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:346)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>   at org.apache.hadoop.mapred.Child.main(Child.java:211)
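
The root cause is that Pig cannot see the UDF's return type, so the later 
hop#'rmli' lookup receives an untyped bytearray. Independent of the patch, a 
UDF can avoid this ambiguity by declaring its return type via outputSchema; 
a hypothetical sketch (class name and body are illustrative only):

{code}
// Hypothetical UDF sketch: declaring a map return type via outputSchema so
// Pig knows how to treat the result of a '#' map lookup downstream.
import java.io.IOException;
import java.util.Map;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class GetFirstSketch extends EvalFunc<Map<String, Object>> {
    @Override
    @SuppressWarnings("unchecked")
    public Map<String, Object> exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0) {
            return null;
        }
        return (Map<String, Object>) input.get(0); // placeholder body
    }

    @Override
    public Schema outputSchema(Schema input) {
        // Declare the return type as map so Pig does not fall back to an
        // unconvertible bytearray (the ERROR 1075 above).
        return new Schema(new Schema.FieldSchema(null, DataType.MAP));
    }
}
{code}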

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Review Request: Pig 0.8 throws ERROR 1075 while trying to refer a map in the result of eval udf.Works with 0.7

2011-01-24 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/353/
---

Review request for pig and thejas.


Summary
---

The following script fails:

public static class BagGenerateNoSchema extends EvalFunc<DataBag> {
    @Override
    public DataBag exec(Tuple input) throws IOException {
        DataBag bg = DefaultBagFactory.getInstance().newDefaultBag();
        bg.add(input);
        return bg;
    }
}

a = load '1.txt' as (a0:map[]);
b = foreach a generate BagGenerateNoSchema(*) as b0;
c = foreach b generate flatten(IdentityColumn(b0));
d = foreach c generate $0#'key';
dump d;

Error message:
org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a 
bytearray from the UDF. Cannot determine how to convert the bytearray to map.
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:952)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.processInput(POMapLookUp.java:87)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.getNext(POMapLookUp.java:98)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.getNext(POMapLookUp.java:117)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:346)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
at org.apache.hadoop.mapred.Child.main(Child.java:211)


This addresses bug PIG-1813.
https://issues.apache.org/jira/browse/PIG-1813


Diffs
-

  
http://svn.apache.org/repos/asf/pig/branches/branch-0.8/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
 1062989 
  
http://svn.apache.org/repos/asf/pig/branches/branch-0.8/src/org/apache/pig/newplan/logical/relational/LOGenerate.java
 1062989 
  
http://svn.apache.org/repos/asf/pig/branches/branch-0.8/test/org/apache/pig/test/TestEvalPipeline2.java
 1062989 

Diff: https://reviews.apache.org/r/353/diff


Testing
---

Test-patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Unit test:
all pass

End-to-end test:
all pass


Thanks,

Daniel



[jira] Updated: (PIG-1812) Problem with DID_NOT_FIND_LOAD_ONLY_MAP_PLAN

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1812:


Attachment: PIG-1812-1.patch

> Problem with DID_NOT_FIND_LOAD_ONLY_MAP_PLAN
> 
>
> Key: PIG-1812
> URL: https://issues.apache.org/jira/browse/PIG-1812
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.7.0, 0.8.0
> Environment: RHEL, Pig 0.8.0
>Reporter: xianyu
>Assignee: Daniel Dai
> Attachments: PIG-1812-1.patch
>
>
> Hi, 
> I have the following input files:
> pkg.txt
> a   3   {(123,1.0),(236,2.0)}
> a   3   {(236,1.0)}
> model.txt
> a   123 2   0.33
> a   236 2   0.5
> My script is listed below:
> A = load 'pkg.txt' using PigStorage('\t') as (pkg:chararray, ts:int, 
> cat_bag:{t:(id:chararray, wht:float)});
> M = load 'model.txt' using PigStorage('\t') as (pkg:chararray, 
> cat_id:chararray, ts:int, score:double);
> B = foreach A generate ts, pkg, flatten(cat_bag.id) as (cat_id:chararray);
> B = distinct B;
> H1 = cogroup M by (pkg, cat_id) inner, B by (pkg, cat_id);
> H2 = foreach H1 {
> I = order M by ts;
> J = order B by ts;
> generate flatten(group) as (pkg:chararray, cat_id:chararray), J.ts as 
> tsorig, I.ts as tsmap;
> }
> dump H2;
> When running this script, I got a warning, "Encountered Warning 
> DID_NOT_FIND_LOAD_ONLY_MAP_PLAN 1 time(s)", and the Pig error log below:
> Pig Stack Trace
> ---
> ERROR 2043: Unexpected error during execution.
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to 
> open iterator for alias H2
> at org.apache.pig.PigServer.openIterator(PigServer.java:764)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
> at org.apache.pig.Main.run(Main.java:500)
> at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias H2
> at org.apache.pig.PigServer.storeEx(PigServer.java:888)
> at org.apache.pig.PigServer.store(PigServer.java:826)
> at org.apache.pig.PigServer.openIterator(PigServer.java:738)
> ... 7 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2043: 
> Unexpected error during execution.
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:403)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1208)
> at org.apache.pig.PigServer.storeEx(PigServer.java:884)
> ... 9 more
> Caused by: java.lang.ClassCastException: 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad
>  cannot be cast to 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SecondaryKeyOptimizer.visitMROp(SecondaryKeyOptimizer.java:352)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:246)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:41)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
> at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:498)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:117)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
> ... 11 more
> But when I removed the DISTINCT statement before COGROUP (i.e., "B = distinct 
> B;"), the script runs smoothly. I have also tried other reduce-side 
> operations like ORDER; it seems that they also trigger the above error. This 
> is really very confusing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1813) Pig 0.8 throws ERROR 1075 while trying to refer a map in the result of eval udf.Works with 0.7

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1813:


Attachment: PIG-1813-2.patch

Resynced with the 0.8 branch. For trunk, a different patch is needed.

> Pig 0.8 throws ERROR 1075 while trying to refer a map in the result of  eval 
> udf.Works with 0.7
> ---
>
> Key: PIG-1813
> URL: https://issues.apache.org/jira/browse/PIG-1813
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Vivek Padmanabhan
> Attachments: PIG-1813-0.patch, PIG-1813-1.patch, PIG-1813-2.patch
>
>
> register myudf.jar;
> A = load 'input' using MyZippedStorage('\u0001') as ($inputSchema);
> B = foreach A generate id , value  ;
> C = foreach B generate id , org.myudf.ExplodeHashList( (chararray)value, 
> '\u0002', '\u0004', '\u0003') as value;
> D = FILTER C by value is not null;
> E = foreach D generate id , flatten(org.myudf.GETFIRST(value)) as hop;
> F = foreach E generate id , hop#'rmli' as rmli:bytearray ;
> store F into 'output.bz2' using PigStorage();
> The above script fails when run with Pig 0.8 but runs fine with Pig 0.7 or if 
> pig.usenewlogicalplan=false.
> Below is the exception thrown in 0.8:
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a 
> bytearray from the UDF. Cannot determine how to convert the bytearray to map.
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:952)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.processInput(POMapLookUp.java:87)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.getNext(POMapLookUp.java:98)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POMapLookUp.getNext(POMapLookUp.java:117)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:346)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:236)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:638)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:314)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1062)
>   at org.apache.hadoop.mapred.Child.main(Child.java:211)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Hudson build is back to normal : Pig-trunk-commit #650

2011-01-24 Thread Apache Hudson Server
See 




[jira] Assigned: (PIG-1812) Problem with DID_NOT_FIND_LOAD_ONLY_MAP_PLAN

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai reassigned PIG-1812:
---

Assignee: Daniel Dai

> Problem with DID_NOT_FIND_LOAD_ONLY_MAP_PLAN
> 
>
> Key: PIG-1812
> URL: https://issues.apache.org/jira/browse/PIG-1812
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.7.0, 0.8.0
> Environment: RHEL, Pig 0.8.0
>Reporter: xianyu
>Assignee: Daniel Dai
>
> Hi, 
> I have the following input files:
> pkg.txt
> a   3   {(123,1.0),(236,2.0)}
> a   3   {(236,1.0)}
> model.txt
> a   123 2   0.33
> a   236 2   0.5
> My script is listed below:
> A = load 'pkg.txt' using PigStorage('\t') as (pkg:chararray, ts:int, 
> cat_bag:{t:(id:chararray, wht:float)});
> M = load 'model.txt' using PigStorage('\t') as (pkg:chararray, 
> cat_id:chararray, ts:int, score:double);
> B = foreach A generate ts, pkg, flatten(cat_bag.id) as (cat_id:chararray);
> B = distinct B;
> H1 = cogroup M by (pkg, cat_id) inner, B by (pkg, cat_id);
> H2 = foreach H1 {
> I = order M by ts;
> J = order B by ts;
> generate flatten(group) as (pkg:chararray, cat_id:chararray), J.ts as 
> tsorig, I.ts as tsmap;
> }
> dump H2;
> When running this script, I got a warning, "Encountered Warning 
> DID_NOT_FIND_LOAD_ONLY_MAP_PLAN 1 time(s)", and the Pig error log below:
> Pig Stack Trace
> ---
> ERROR 2043: Unexpected error during execution.
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to 
> open iterator for alias H2
> at org.apache.pig.PigServer.openIterator(PigServer.java:764)
> at 
> org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
> at org.apache.pig.Main.run(Main.java:500)
> at org.apache.pig.Main.main(Main.java:107)
> Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias H2
> at org.apache.pig.PigServer.storeEx(PigServer.java:888)
> at org.apache.pig.PigServer.store(PigServer.java:826)
> at org.apache.pig.PigServer.openIterator(PigServer.java:738)
> ... 7 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2043: 
> Unexpected error during execution.
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:403)
> at 
> org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1208)
> at org.apache.pig.PigServer.storeEx(PigServer.java:884)
> ... 9 more
> Caused by: java.lang.ClassCastException: 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad
>  cannot be cast to 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SecondaryKeyOptimizer.visitMROp(SecondaryKeyOptimizer.java:352)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:246)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceOper.visit(MapReduceOper.java:41)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:71)
> at 
> org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:52)
> at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.compile(MapReduceLauncher.java:498)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:117)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
> ... 11 more
> But when I removed the DISTINCT statement before COGROUP (i.e., "B = distinct 
> B;"), the script runs smoothly. I have also tried other reduce-side 
> operations like ORDER; it seems that they also trigger the above error. This 
> is really very confusing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1820) New logical plan: FilterLogicExpressionSimplifier fail to deal with UDF

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-1820:


Attachment: PIG-1820-1.patch

> New logical plan: FilterLogicExpressionSimplifier fail to deal with UDF
> ---
>
> Key: PIG-1820
> URL: https://issues.apache.org/jira/browse/PIG-1820
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.8.0
>
> Attachments: PIG-1820-1.patch
>
>
> The following script fails:
> {code}
> a = load '1.txt' as (a0, a1);
> b = filter a by (a0 is not null or a1 is not null) and IsEmpty(a0);
> explain b;
> {code}
> Error message:
> Caused by: java.lang.ClassCastException: 
> org.apache.pig.newplan.logical.expression.UserFuncExpression cannot be cast 
> to org.apache.pig.newplan.logical.expression.BinaryExpression
> at 
> org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleBinary(LogicalExpressionSimplifier.java:561)
> at 
> org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleAnd(LogicalExpressionSimplifier.java:429)
> at 
> org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.inferRelationship(LogicalExpressionSimplifier.java:397)
> at 
> org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleDNFOr(LogicalExpressionSimplifier.java:281)
> at 
> org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.checkDNFLeaves(LogicalExpressionSimplifier.java:192)
> at 
> org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.transform(LogicalExpressionSimplifier.java:108)
> at 
> org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:110)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (PIG-1820) New logical plan: FilterLogicExpressionSimplifier fail to deal with UDF

2011-01-24 Thread Daniel Dai (JIRA)
New logical plan: FilterLogicExpressionSimplifier fail to deal with UDF
---

 Key: PIG-1820
 URL: https://issues.apache.org/jira/browse/PIG-1820
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.8.0
 Attachments: PIG-1820-1.patch

The following script fails:
{code}
a = load '1.txt' as (a0, a1);
b = filter a by (a0 is not null or a1 is not null) and IsEmpty(a0);
explain b;
{code}

Error message:
Caused by: java.lang.ClassCastException: 
org.apache.pig.newplan.logical.expression.UserFuncExpression cannot be cast to 
org.apache.pig.newplan.logical.expression.BinaryExpression
at 
org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleBinary(LogicalExpressionSimplifier.java:561)
at 
org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleAnd(LogicalExpressionSimplifier.java:429)
at 
org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.inferRelationship(LogicalExpressionSimplifier.java:397)
at 
org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.handleDNFOr(LogicalExpressionSimplifier.java:281)
at 
org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.checkDNFLeaves(LogicalExpressionSimplifier.java:192)
at 
org.apache.pig.newplan.logical.rules.LogicalExpressionSimplifier$LogicalExpressionSimplifierTransformer.transform(LogicalExpressionSimplifier.java:108)
at 
org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:110)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (PIG-1819) For implicit binding, Jython embedded Pig should skip any variable/value that contains $.

2011-01-24 Thread Richard Ding (JIRA)
For implicit binding, Jython embedded Pig should skip any variable/value that 
contains $. 
--

 Key: PIG-1819
 URL: https://issues.apache.org/jira/browse/PIG-1819
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.9.0
Reporter: Richard Ding
Assignee: Richard Ding
 Fix For: 0.9.0


We use Pig parameter substitution for the bindings, so a variable/value that 
contains $ cannot be used.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1769) Consistency for HBaseStorage

2011-01-24 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986071#action_12986071
 ] 

Alan Gates commented on PIG-1769:
-

 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] 
 [exec] 

> Consistency for HBaseStorage
> 
>
> Key: PIG-1769
> URL: https://issues.apache.org/jira/browse/PIG-1769
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Corbin Hoenes
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.9.0
>
> Attachments: PIG_1769.patch
>
>
> In our load statement we are allowed to prefix the table name with "hbase://" 
> but when we call
> store it throws an exception unless we remove hbase:// from the table
> name:
> this works:
> store raw into 'piggytest2' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1
> anchor2:field1a anchor2:field2a');
> this won't:
> store raw into 'hbase://piggytest2'
> Exception:
> Caused by: java.lang.IllegalArgumentException:
> java.net.URISyntaxException: Relative path in absolute URI:
> hbase://piggytest2_logs
> It would be nice to be able to prefix the store with hbase:// so it's consistent 
> with the load syntax.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Pig-trunk-commit #649

2011-01-24 Thread Apache Hudson Server
See 

Changes:

[daijy] Fix piggybank unit test failure TestPigStorageSchema

[gates] Doubled timeout on findbugs since it was timing out sometimes during 
test-patch.

--
[...truncated 5047 lines...]
[ivy:resolve]   found commons-httpclient#commons-httpclient;3.0.1 in maven2
[ivy:resolve]   found commons-codec#commons-codec;1.3 in maven2
[ivy:resolve]   found commons-net#commons-net;1.4.1 in maven2
[ivy:resolve]   found oro#oro;2.0.8 in maven2
[ivy:resolve]   found org.mortbay.jetty#jetty;6.1.14 in maven2
[ivy:resolve]   found org.mortbay.jetty#jetty-util;6.1.14 in maven2
[ivy:resolve]   found org.mortbay.jetty#servlet-api-2.5;6.1.14 in maven2
[ivy:resolve]   found tomcat#jasper-runtime;5.5.12 in maven2
[ivy:resolve]   found tomcat#jasper-compiler;5.5.12 in maven2
[ivy:resolve]   found org.mortbay.jetty#jsp-api-2.1;6.1.14 in maven2
[ivy:resolve]   found org.mortbay.jetty#jsp-2.1;6.1.14 in maven2
[ivy:resolve]   found org.eclipse.jdt#core;3.1.1 in maven2
[ivy:resolve]   found ant#ant;1.6.5 in maven2
[ivy:resolve]   found net.java.dev.jets3t#jets3t;0.7.1 in maven2
[ivy:resolve]   found commons-logging#commons-logging;1.1.1 in maven2
[ivy:resolve]   found net.sf.kosmosfs#kfs;0.3 in maven2
[ivy:resolve]   found junit#junit;4.5 in maven2
[ivy:resolve]   found hsqldb#hsqldb;1.8.0.10 in maven2
[ivy:resolve]   found org.apache.hadoop#hadoop-test;0.20.2 in maven2
[ivy:resolve]   found org.apache.ftpserver#ftplet-api;1.0.0 in maven2
[ivy:resolve]   found org.apache.mina#mina-core;2.0.0-M5 in maven2
[ivy:resolve]   found org.slf4j#slf4j-api;1.5.2 in maven2
[ivy:resolve]   found org.apache.ftpserver#ftpserver-core;1.0.0 in maven2
[ivy:resolve]   found org.apache.ftpserver#ftpserver-deprecated;1.0.0-M2 in 
maven2
[ivy:resolve]   found org.slf4j#slf4j-log4j12;1.4.3 in maven2
[ivy:resolve]   found com.jcraft#jsch;0.1.38 in maven2
[ivy:resolve]   found jline#jline;0.9.94 in maven2
[ivy:resolve]   found net.java.dev.javacc#javacc;4.2 in maven2
[ivy:resolve]   found org.codehaus.jackson#jackson-mapper-asl;1.0.1 in maven2
[ivy:resolve]   found org.codehaus.jackson#jackson-core-asl;1.0.1 in maven2
[ivy:resolve]   found joda-time#joda-time;1.6 in maven2
[ivy:resolve]   found commons-lang#commons-lang;2.4 in maven2
[ivy:resolve]   found com.google.guava#guava;r06 in maven2
[ivy:resolve]   found org.python#jython;2.5.0 in maven2
[ivy:resolve] :: resolution report :: resolve 104ms :: artifacts dl 18ms
[ivy:resolve]   :: evicted modules:
[ivy:resolve]   junit#junit;3.8.1 by [junit#junit;4.5] in [buildJar]
[ivy:resolve]   commons-logging#commons-logging;1.0.3 by 
[commons-logging#commons-logging;1.1.1] in [buildJar]
[ivy:resolve]   commons-codec#commons-codec;1.2 by 
[commons-codec#commons-codec;1.3] in [buildJar]
[ivy:resolve]   commons-httpclient#commons-httpclient;3.1 by 
[commons-httpclient#commons-httpclient;3.0.1] in [buildJar]
[ivy:resolve]   org.apache.mina#mina-core;2.0.0-M4 by 
[org.apache.mina#mina-core;2.0.0-M5] in [buildJar]
[ivy:resolve]   org.apache.ftpserver#ftplet-api;1.0.0-M2 by 
[org.apache.ftpserver#ftplet-api;1.0.0] in [buildJar]
[ivy:resolve]   org.apache.ftpserver#ftpserver-core;1.0.0-M2 by 
[org.apache.ftpserver#ftpserver-core;1.0.0] in [buildJar]
[ivy:resolve]   org.apache.mina#mina-core;2.0.0-M2 by 
[org.apache.mina#mina-core;2.0.0-M5] in [buildJar]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      buildJar    |   47  |   0   |   0   |   8   ||   39  |   0   |
	---------------------------------------------------------------------
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 38 already retrieved (288kB/10ms)

buildJar:
 [echo] svnString 1063010
  [jar] Building jar: 

  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

source-jar:
  [jar] Building jar: 


ivy-javadoc:
[ivy:resolve] :: resolving dependencies :: org.apache.pig#Pig;0.9.0-SNAPSHOT
[ivy:resolve]   confs: [javadoc]
[ivy:resolve]   found commons-el#commons-el;1.0 in maven2
[ivy:resolve]   found log4j#log4j;1.2.14 in maven2
[ivy:resolve]   found org.apache.hadoop#hadoop-core;0.20.2 in maven2
[ivy:resolve]   found commons-cli#commons-cli;1.2 in maven2
[ivy:resolve]   found xmlenc#xmlenc;0.52 in maven2
[iv

Re: dataflow in logical plan

2011-01-24 Thread Baraa Mohamad
Wow, very interesting!

So the expression operators are passed to the relational operators as inputs
to those operators?
And on the other hand, when we want to draw the logical plan (when
Pig creates the logical plan) we don't need to consider the expression
operators as nodes;
we draw, as you said, just

Load --> Filter --> Store

We don't have to add other nodes for Proj(0), '>', Const(5).

regards

On Mon, Jan 24, 2011 at 10:32 PM, Alan Gates  wrote:

> Pig has two levels of operators in its logical (and physical) plans,
> relational and expression.  Projection is an expression operator in Pig, not
> a relational operator (as it is in most databases).  So (ignoring the
> effects of the optimizer for now) all of your data will be sent to the
> filter relational operator.  Your filter will see 1 3 4 etc., not 1 etc.
>  Inside that filter the tuples will be trimmed by the projection operator as
> part of the expression plan for '>'.
>
> Alan.
>
>
> On Jan 24, 2011, at 1:23 PM, Baraa Mohamad wrote:
>
>  Thank you very much for your explanation.
>> Just to verify that I understood correctly
>> For example if myfile contains the following data
>> 1 3 4
>> 3 4 6
>> 7 8 2
>> 4 5 9
>> 9 3 5
>> 6 6 2
>>
>> so all this data will be sent to the Proj(0) operator, which gives as a result
>> 1
>> 3
>> 7
>> 4
>> 9
>> 6
>>
>> After that, all this data in myfile will be sent to the filter operator, so
>> that the filter takes two inputs: the myfile data and the result of
>> proj(0) > 5, which is
>> 7
>> 9
>> 6
>>
>> regards
>>
>>
>> On Mon, Jan 24, 2011 at 10:08 PM, Alan Gates  wrote:
>> The logical plan for your script will look like:
>>
>> Load -> Filter -> Store
>>
>> Filter will have an expression plan that looks like Proj($0) > const(5)
>>
>> So yes, all your data will go through the filter operator.  But keep in
>> mind that there is a filter operator in each map task, so all your data will
>> not go through any one instance of the operator (unless myfile is small).
>>  Hope that helps.
>>
>> Unfortunately, there is not any great architecture document on Pig.
>>  Probably the best substitute is a paper we published in VLDB 2009, which
>> you can get here:
>> http://infolab.stanford.edu/~olston/publications/vldb09.pdf.  Since this
>> is almost 2 years old now some of the specific information is out of date
>> but the basic structure is still correct.
>>
>> Alan.
>>
>>
>> On Jan 24, 2011, at 12:48 PM, Baraa Mohamad wrote:
>>
>> Hello all:
>>
>> I'm a new user of Pig, and I'm very interested in the architecture of Pig.
>> I have a question about the logical plan.
>>
>> In the logical plan of this example (attached):
>> a = load 'myfile';
>> b = filter a by $0 > 5;
>> store b into 'myfilteredfile';
>>
>>
>> Will all the data in 'myfile' be sent in its entirety to the Proj(0)
>> operator and to the Filter operator?
>> More generally, what runs on the arrows in the logical plan?
>>
>> What is the best documentation for understanding the architecture of Pig, not
>> only how to use it? I'll try to use it in the medical domain, but
>> first I have to understand it deeply.
>>
>> thank you very much for your help
>>
>>
>> Baraa MOHAMAD
>> PhD student in computer science
>> ISIMA-LIMOS
>> Université Blaise Pascal
>> Clermont-Ferrand
>> France
>> Tel: +33 658900080
>>
>>
>>
>


Re: dataflow in logical plan

2011-01-24 Thread Alan Gates
Pig has two levels of operators in its logical (and physical) plans,  
relational and expression.  Projection is an expression operator in  
Pig, not a relational operator (as it is in most databases).  So  
(ignoring the effects of the optimizer for now) all of your data will  
be sent to the filter relational operator.  Your filter will see 1 3 4  
etc., not 1 etc.  Inside that filter the tuples will be trimmed by the  
projection operator as part of the expression plan for '>'.


Alan.

On Jan 24, 2011, at 1:23 PM, Baraa Mohamad wrote:


Thank you very much for your explanation.
Just to verify that I understood correctly
For example if myfile contains the following data
1 3 4
3 4 6
7 8 2
4 5 9
9 3 5
6 6 2

so all this data will be sent to the Proj(0) operator, which gives as a 
result

1
3
7
4
9
6

After that, all this data in myfile will be sent to the filter 
operator, so that the filter takes two inputs: the myfile data and the 
result of proj(0) > 5, which is

7
9
6

regards


On Mon, Jan 24, 2011 at 10:08 PM, Alan Gates   
wrote:

The logical plan for your script will look like:

Load -> Filter -> Store

Filter will have an expression plan that looks like Proj($0) >  
const(5)


So yes, all your data will go through the filter operator.  But keep  
in mind that there is a filter operator in each map task, so all  
your data will not go through any one instance of the operator  
(unless myfile is small).  Hope that helps.


Unfortunately, there is not any great architecture document on Pig.   
Probably the best substitute is a paper we published in VLDB 2009,  
which you can get here:  http://infolab.stanford.edu/~olston/publications/vldb09.pdf 
.  Since this is almost 2 years old now some of the specific  
information is out of date but the basic structure is still correct.


Alan.


On Jan 24, 2011, at 12:48 PM, Baraa Mohamad wrote:

Hello all:

I'm a new user of Pig, and I'm very interested in the architecture of  
Pig.

I have a question about the logical plan.

In the logical plan of this example (attached):
a = load 'myfile';
b = filter a by $0 > 5;
store b into 'myfilteredfile';


Will all the data in 'myfile' be sent in its entirety to the  
Proj(0) operator and to the Filter operator?

More generally, what runs on the arrows in the logical plan?

What is the best documentation for understanding the architecture of Pig,  
not only how to use it? I'll try to use it in the medical  
domain, but first I have to understand it deeply.

thank you very much for your help


Baraa MOHAMAD
PhD student in computer science
ISIMA-LIMOS
Université Blaise Pascal
Clermont-Ferrand
France
Tel: +33 658900080






Pig 0.8 HBaseStorage patch

2011-01-24 Thread Corbin Hoenes
We've got a patch to HBaseStorage which allows a caller to turn
off the write-ahead log (WAL) feature while doing bulk loads into HBase.

From the performance tuning wiki page:
http://wiki.apache.org/hadoop/PerformanceTuning
"To speed up the inserts in a non critical job (like an import job), you can
use Put.writeToWAL(false) to bypass writing to the write ahead log."

We've tested this on HBase 0.20.6 and it helps dramatically.  It sounds like
future versions of HBase support a feature like this by default, so maybe
this problem goes away when we start using 0.90?

Is this something valuable to contribute back?
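
For anyone curious, a rough sketch of the idea against the HBase 0.90-era
client API (table and column names borrowed from the earlier PIG-1769
example; this is a hypothetical illustration, not the actual patch):

// Rough sketch: skip the write-ahead log on each Put during a
// non-critical bulk import (HBase 0.90-era client API).
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class NoWalPutSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "piggytest2");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("content2"), Bytes.toBytes("field1"),
                Bytes.toBytes("value"));
        put.setWriteToWAL(false); // the speed-up: no WAL write for this Put
        table.put(put);
        table.flushCommits(); // push any buffered puts to the region server
    }
}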


Re: dataflow in logical plan

2011-01-24 Thread Baraa Mohamad
Thank you very much for your explanation.
Just to verify that I understood correctly
For example if myfile contains the following data
1 3 4
3 4 6
7 8 2
4 5 9
9 3 5
6 6 2

so all this data will be sent to the Proj(0) operator, which gives as a result
1
3
7
4
9
6

After that, all this data in myfile will be sent to the filter operator, so
that the filter takes two inputs: the myfile data and the result of
proj(0) > 5, which is
7
9
6

regards


On Mon, Jan 24, 2011 at 10:08 PM, Alan Gates  wrote:

> The logical plan for your script will look like:
>
> Load -> Filter -> Store
>
> Filter will have an expression plan that looks like Proj($0) > const(5)
>
> So yes, all your data will go through the filter operator.  But keep in
> mind that there is a filter operator in each map task, so all your data will
> not go through any one instance of the operator (unless myfile is small).
>  Hope that helps.
>
> Unfortunately, there is not any great architecture document on Pig.
>  Probably the best substitute is a paper we published in VLDB 2009, which
> you can get here:
> http://infolab.stanford.edu/~olston/publications/vldb09.pdf.  Since this
> is almost 2 years old now some of the specific information is out of date
> but the basic structure is still correct.
>
> Alan.
>
>
> On Jan 24, 2011, at 12:48 PM, Baraa Mohamad wrote:
>
>  Hello all:
>>
>> I'm a new user of Pig, and I'm very interested in the architecture of Pig.
>> I have a question about the logical plan.
>>
>> In the logical plan of this example (attached):
>> a = load 'myfile';
>> b = filter a by $0 > 5;
>> store b into 'myfilteredfile';
>>
>>
>> Will all the data in 'myfile' be sent in its entirety to the Proj(0)
>> operator and to the Filter operator?
>> More generally, what runs on the arrows in the logical plan?
>>
>> What is the best documentation for understanding the architecture of Pig, not
>> only how to use it? I'll try to use it in the medical domain, but
>> first I have to understand it deeply.
>>
>> thank you very much for your help
>>
>>
>> Baraa MOHAMAD
>> PhD student in computer science
>> ISIMA-LIMOS
>> Université Blaise Pascal
>> Clermont-Ferrand
>> France
>> Tel: +33 658900080
>>
>
>


Re: dataflow in logical plan

2011-01-24 Thread Alan Gates

The logical plan for your script will look like:

Load -> Filter -> Store

Filter will have an expression plan that looks like Proj($0) > const(5)

So yes, all your data will go through the filter operator.  But keep  
in mind that there is a filter operator in each map task, so all your  
data will not go through any one instance of the operator (unless  
myfile is small).  Hope that helps.


Unfortunately, there is not any great architecture document on Pig.   
Probably the best substitute is a paper we published in VLDB 2009,  
which you can get here:  http://infolab.stanford.edu/~olston/publications/vldb09.pdf 
.  Since this is almost 2 years old now some of the specific  
information is out of date but the basic structure is still correct.


Alan.

On Jan 24, 2011, at 12:48 PM, Baraa Mohamad wrote:


Hello all:

I'm a new user of Pig, and I'm very interested in the architecture of  
Pig.

I have a question about the logical plan.

In the logical plan of this example (attached):
a = load 'myfile';
b = filter a by $0 > 5;
store b into 'myfilteredfile';


Will all the data in 'myfile' be sent in its entirety to the  
Proj(0) operator and to the Filter operator?

More generally, what runs on the arrows in the logical plan?

What is the best documentation for understanding the architecture of Pig,  
not only how to use it? I'll try to use it in the medical  
domain, but first I have to understand it deeply.

thank you very much for your help


Baraa MOHAMAD
PhD student in computer science
ISIMA-LIMOS
Université Blaise Pascal
Clermont-Ferrand
France
Tel: +33 658900080




dataflow in logical plan

2011-01-24 Thread Baraa Mohamad
Hello all:

I'm a new user of Pig, and I'm very interested in the architecture of Pig.
I have a question about the logical plan.

In the logical plan of this example (attached):

a = load 'myfile';

b = filter a by $0 > 5;

store b into 'myfilteredfile';



Will all the data in 'myfile' be sent in its entirety to the Proj(0)
operator and to the Filter operator?
More generally, what runs on the arrows in the logical plan?

What is the best documentation for understanding the architecture of Pig, not
only how to use it? I'll try to use it in the medical domain, but
first I have to understand it deeply.

thank you very much for your help


Baraa MOHAMAD
PhD student in computer science
ISIMA-LIMOS
Université Blaise Pascal
Clermont-Ferrand
France
Tel: +33 658900080


[jira] Commented: (PIG-1748) Add load/store function AvroStorage for avro data

2011-01-24 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985972#action_12985972
 ] 

Jakob Homan commented on PIG-1748:
--

@Scott
I can't say I'm convinced, and am in fact more concerned by your example, 
given that this approach essentially builds dependencies on all of those 
projects into Avro.  However, this JIRA isn't the best place to discuss this.  
Is there a discussion about this type of integration going on in Avro that the 
community can contribute to?  Is there a JIRA?  Thanks.

> Add load/store function AvroStorage for avro data
> -
>
> Key: PIG-1748
> URL: https://issues.apache.org/jira/browse/PIG-1748
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: lin guo
>Assignee: Jakob Homan
> Attachments: avro_storage.patch, avro_test_files.tar.gz, 
> PIG-1748-2.patch
>
>
> We want to use Pig to process arbitrary Avro data and store results as Avro 
> files. AvroStorage() extends two PigFuncs: LoadFunc and StoreFunc. 
> Due to discrepancies between the Avro and Pig data models, AvroStorage has:
> 1. Limited support for "record": we do not support recursively defined records 
> because the number of fields in such records is data dependent.
> 2. Limited support for "union": we only accept nullable unions like ["null", 
> "some-type"].
> For simplicity, we also make the following assumptions:
> If the input directory is a leaf directory, then we assume the Avro data files 
> in it have the same schema;
> If the input directory contains sub-directories, then we assume the Avro data 
> files in all sub-directories have the same schema.
> AvroStorage takes no input parameters when used as a LoadFunc (except for 
> "debug [debug-level]"). 
> Users can provide parameters to AvroStorage when used as a StoreFunc. If they 
> don't, the Avro schema of the output data is derived from its Pig schema.
> Detailed documentation can be found at 
> http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1748) Add load/store function AvroStorage for avro data

2011-01-24 Thread Scott Carey (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985958#action_12985958
 ] 

Scott Carey commented on PIG-1748:
--

@Jacob
Of course projects can do what they wish.  I'm simply hoping many can 
collaborate on this general problem category.

{quote}This seems like an odd approach to me, essentially inverting the domain 
knowledge of each application to Avro, rather than the application itself where 
its developers frolic and work. Is there something I'm missing here?
{quote}
Writing a Pig storage adapter requires Avro domain knowledge and Pig domain 
knowledge.  I found that it required more knowledge of Avro than Pig to do 
well.  If all you ever want to achieve is:

Pig -> Avro file -> Pig, then maybe it doesn't matter who hosts it. 

But what if you want to do:
Pig -> Avro file -> Cascading -> Avro file -> Hive -> Avro file -> 
Pig?

Now, which project should host what defines how all those data models can 
interact through a common schema system?  Pig contrib?  Hive contrib?  Howl? 
Cascading (GPL...)?

In the longer term, the common elements needed by all of the above can 
crystallize out into an Avro module general to all, and individual modules 
hosted by each project can use that.  What that might look like won't be 
apparent until there are enough example use cases, however.

> Add load/store function AvroStorage for avro data
> -
>
> Key: PIG-1748
> URL: https://issues.apache.org/jira/browse/PIG-1748
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: lin guo
>Assignee: Jakob Homan
> Attachments: avro_storage.patch, avro_test_files.tar.gz, 
> PIG-1748-2.patch
>
>
> We want to use Pig to process arbitrary Avro data and store results as Avro 
> files. AvroStorage() extends two PigFuncs: LoadFunc and StoreFunc. 
> Due to discrepancies between the Avro and Pig data models, AvroStorage has:
> 1. Limited support for "record": we do not support recursively defined records 
> because the number of fields in such records is data dependent.
> 2. Limited support for "union": we only accept nullable unions like ["null", 
> "some-type"].
> For simplicity, we also make the following assumptions:
> If the input directory is a leaf directory, then we assume the Avro data files 
> in it have the same schema;
> If the input directory contains sub-directories, then we assume the Avro data 
> files in all sub-directories have the same schema.
> AvroStorage takes no input parameters when used as a LoadFunc (except for 
> "debug [debug-level]"). 
> Users can provide parameters to AvroStorage when used as a StoreFunc. If they 
> don't, the Avro schema of the output data is derived from its Pig schema.
> Detailed documentation can be found at 
> http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1748) Add load/store function AvroStorage for avro data

2011-01-24 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985936#action_12985936
 ] 

Jakob Homan commented on PIG-1748:
--

@Daniel- Let me take a look.

@Scott - It's worth noting that projects can include Avro support as they wish, 
just as Avro can incorporate that work as it wishes.  But I'm not sure I 
understand.  You're saying that you'd rather have any higher-level application 
supporting Avro host that support in Avro, rather than treating Avro as 
a library to be included?  This seems like an odd approach to me, essentially 
inverting the domain knowledge of each application to Avro, rather than the 
application itself where its developers frolic and work.  Is there something 
I'm missing here?

> Add load/store function AvroStorage for avro data
> -
>
> Key: PIG-1748
> URL: https://issues.apache.org/jira/browse/PIG-1748
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: lin guo
>Assignee: Jakob Homan
> Attachments: avro_storage.patch, avro_test_files.tar.gz, 
> PIG-1748-2.patch
>
>
> We want to use Pig to process arbitrary Avro data and store results as Avro 
> files. AvroStorage() extends two PigFuncs: LoadFunc and StoreFunc. 
> Due to discrepancies between the Avro and Pig data models, AvroStorage has:
> 1. Limited support for "record": we do not support recursively defined records 
> because the number of fields in such records is data dependent.
> 2. Limited support for "union": we only accept nullable unions like ["null", 
> "some-type"].
> For simplicity, we also make the following assumptions:
> If the input directory is a leaf directory, then we assume the Avro data files 
> in it have the same schema;
> If the input directory contains sub-directories, then we assume the Avro data 
> files in all sub-directories have the same schema.
> AvroStorage takes no input parameters when used as a LoadFunc (except for 
> "debug [debug-level]"). 
> Users can provide parameters to AvroStorage when used as a StoreFunc. If they 
> don't, the Avro schema of the output data is derived from its Pig schema.
> Detailed documentation can be found at 
> http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-1799) Provide deployable maven artifacts for pigunit and pig smoke tests

2011-01-24 Thread Konstantin Boudnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Boudnik updated PIG-1799:


Attachment: PIG-1799.deploy.patch

Missing dependencies have been added to the mvn-deploy target.

> Provide deployable maven artifacts for pigunit and pig smoke tests
> --
>
> Key: PIG-1799
> URL: https://issues.apache.org/jira/browse/PIG-1799
> Project: Pig
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 0.9.0
>Reporter: Konstantin Boudnik
>Assignee: Konstantin Boudnik
> Fix For: 0.9.0
>
> Attachments: PIG-1799.deploy.patch, PIG-1799.patch, PIG-1799.patch, 
> PIG-1799.patch, PIG-1799.patch, PIG-1799.patch, PIG-1799.patch, PIG-1799.patch
>
>
> Having maven artifacts for the pigunit framework and smoke tests will help 
> to separate execution of smoke tests against a physical cluster from the 
> source tree (ant build).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Pig-trunk-commit #648

2011-01-24 Thread Apache Hudson Server
See 

Changes:

[daijy] PIG-313: Error handling aggregate of a computation

[daijy] PIG-496: project of bags from complex data causes failures

[daijy] PIG-730: problem combining schema from a union of several LOAD 
expressions, with a nested bag inside the schema

[daijy] PIG-767: Schema reported from DESCRIBE and actual schema of inner bags 
are different

[daijy] PIG-1786: Move describe/nested describe to new logical plan

--
[...truncated 5039 lines...]
[ivy:resolve]   found commons-httpclient#commons-httpclient;3.0.1 in maven2
[ivy:resolve]   found commons-codec#commons-codec;1.3 in maven2
[ivy:resolve]   found commons-net#commons-net;1.4.1 in maven2
[ivy:resolve]   found oro#oro;2.0.8 in maven2
[ivy:resolve]   found org.mortbay.jetty#jetty;6.1.14 in maven2
[ivy:resolve]   found org.mortbay.jetty#jetty-util;6.1.14 in maven2
[ivy:resolve]   found org.mortbay.jetty#servlet-api-2.5;6.1.14 in maven2
[ivy:resolve]   found tomcat#jasper-runtime;5.5.12 in maven2
[ivy:resolve]   found tomcat#jasper-compiler;5.5.12 in maven2
[ivy:resolve]   found org.mortbay.jetty#jsp-api-2.1;6.1.14 in maven2
[ivy:resolve]   found org.mortbay.jetty#jsp-2.1;6.1.14 in maven2
[ivy:resolve]   found org.eclipse.jdt#core;3.1.1 in maven2
[ivy:resolve]   found ant#ant;1.6.5 in maven2
[ivy:resolve]   found net.java.dev.jets3t#jets3t;0.7.1 in maven2
[ivy:resolve]   found commons-logging#commons-logging;1.1.1 in maven2
[ivy:resolve]   found net.sf.kosmosfs#kfs;0.3 in maven2
[ivy:resolve]   found junit#junit;4.5 in maven2
[ivy:resolve]   found hsqldb#hsqldb;1.8.0.10 in maven2
[ivy:resolve]   found org.apache.hadoop#hadoop-test;0.20.2 in maven2
[ivy:resolve]   found org.apache.ftpserver#ftplet-api;1.0.0 in maven2
[ivy:resolve]   found org.apache.mina#mina-core;2.0.0-M5 in maven2
[ivy:resolve]   found org.slf4j#slf4j-api;1.5.2 in maven2
[ivy:resolve]   found org.apache.ftpserver#ftpserver-core;1.0.0 in maven2
[ivy:resolve]   found org.apache.ftpserver#ftpserver-deprecated;1.0.0-M2 in 
maven2
[ivy:resolve]   found org.slf4j#slf4j-log4j12;1.4.3 in maven2
[ivy:resolve]   found com.jcraft#jsch;0.1.38 in maven2
[ivy:resolve]   found jline#jline;0.9.94 in maven2
[ivy:resolve]   found net.java.dev.javacc#javacc;4.2 in maven2
[ivy:resolve]   found org.codehaus.jackson#jackson-mapper-asl;1.0.1 in maven2
[ivy:resolve]   found org.codehaus.jackson#jackson-core-asl;1.0.1 in maven2
[ivy:resolve]   found joda-time#joda-time;1.6 in maven2
[ivy:resolve]   found commons-lang#commons-lang;2.4 in maven2
[ivy:resolve]   found com.google.guava#guava;r06 in maven2
[ivy:resolve]   found org.python#jython;2.5.0 in maven2
[ivy:resolve] :: resolution report :: resolve 103ms :: artifacts dl 17ms
[ivy:resolve]   :: evicted modules:
[ivy:resolve]   junit#junit;3.8.1 by [junit#junit;4.5] in [buildJar]
[ivy:resolve]   commons-logging#commons-logging;1.0.3 by 
[commons-logging#commons-logging;1.1.1] in [buildJar]
[ivy:resolve]   commons-codec#commons-codec;1.2 by 
[commons-codec#commons-codec;1.3] in [buildJar]
[ivy:resolve]   commons-httpclient#commons-httpclient;3.1 by 
[commons-httpclient#commons-httpclient;3.0.1] in [buildJar]
[ivy:resolve]   org.apache.mina#mina-core;2.0.0-M4 by 
[org.apache.mina#mina-core;2.0.0-M5] in [buildJar]
[ivy:resolve]   org.apache.ftpserver#ftplet-api;1.0.0-M2 by 
[org.apache.ftpserver#ftplet-api;1.0.0] in [buildJar]
[ivy:resolve]   org.apache.ftpserver#ftpserver-core;1.0.0-M2 by 
[org.apache.ftpserver#ftpserver-core;1.0.0] in [buildJar]
[ivy:resolve]   org.apache.mina#mina-core;2.0.0-M2 by 
[org.apache.mina#mina-core;2.0.0-M5] in [buildJar]
-
|  |modules||   artifacts   |
|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
-
| buildJar |   47  |   0   |   0   |   8   ||   39  |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.pig#Pig
[ivy:retrieve]  confs: [buildJar]
[ivy:retrieve]  1 artifacts copied, 38 already retrieved (288kB/8ms)

buildJar:
 [echo] svnString 1062921
  [jar] Building jar: 

  [jar] Building jar: 

 [copy] Copying 1 file to 


jarWithOutSvn:

source-jar:
  [jar] Building jar: 


ivy-javadoc:
[ivy:resolve] :: resolving dependencies :: org.apache.pig#Pig;0.9.0-SNAPSHOT
[ivy:resolve]   confs: [javadoc]
[ivy:resolve]   found commons-el

[jira] Resolved: (PIG-1627) Flattening of bags with unknown schemas produces wrong schema

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-1627.
-

Resolution: Fixed

PIG-1786 is checked in. Retested, and now we get:
Schema for C unknown.

Closing the Jira.

> Flattening of bags with unknown schemas produces wrong schema
> -
>
> Key: PIG-1627
> URL: https://issues.apache.org/jira/browse/PIG-1627
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
>
> The following should produce an unknown schema:
> {code}
> A = load '/Users/gates/test/data/studenttab10';
> B = group A by $0;
> C = foreach B generate flatten(A);
> describe C;
> {code}
> Instead it gives
> {code}
> C: {bytearray}
> {code}
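
A contrasting sketch, assuming the same input: once a schema is declared at 
load time, FLATTEN has field information to propagate and DESCRIBE can report 
a concrete schema instead of "unknown" (field names here are illustrative):

{code}
A = load '/Users/gates/test/data/studenttab10' as (name:chararray, age:int);
B = group A by $0;
C = foreach B generate flatten(A);
describe C;
-- expected: C: {A::name: chararray,A::age: int}
{code}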

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-313) Error handling aggregate of a computation

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-313.


  Resolution: Fixed
Hadoop Flags: [Reviewed]

Review notes:
https://reviews.apache.org/r/276/

Patch committed to trunk.

> Error handling aggregate of a computation
> -
>
> Key: PIG-313
> URL: https://issues.apache.org/jira/browse/PIG-313
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.2.0
>Reporter: Pradeep Kamath
>Assignee: Alan Gates
>Priority: Minor
> Fix For: 0.9.0
>
> Attachments: PIG-313-1.patch
>
>
> Query which fails:
> {code}
> a = load ':INPATH:/singlefile/studenttab10k' as (name:chararray, age:int, 
> gpa:double);
> b = group a by name;
> c = foreach b generate group, SUM(a.age*a.gpa);
> store c into ':OUTPATH:';
> {code}
> Error output:
> {quote}
> 2008-07-14 16:34:08,684 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to hadoop file system at: testhost.com:8020
> 2008-07-14 16:34:08,741 [main] WARN  org.apache.hadoop.fs.FileSystem - 
> "testhost.com:8020" is a deprecated filesystem name. Use 
> "hdfs://testhost:8020/" instead.
> 2008-07-14 16:34:08,995 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting 
> to map-reduce job tracker at: testhost.com:50020
> 2008-07-14 16:34:09,251 [main] WARN  org.apache.hadoop.fs.FileSystem - 
> "testhost.com:8020" is a deprecated filesystem name. Use 
> "hdfs://testhost:8020/" instead.
> 2008-07-14 16:34:09,559 [main] ERROR org.apache.pig.PigServer - Cannot 
> evaluate output type of Mul/Div Operator
> 2008-07-14 16:34:09,559 [main] ERROR org.apache.pig.PigServer - Problem 
> resolving LOForEach schema
> 2008-07-14 16:34:09,559 [main] ERROR org.apache.pig.PigServer - Severe 
> problem found during validation 
> org.apache.pig.impl.plan.PlanValidationException: An unexpected exception 
> caused the validation to stop 
> 2008-07-14 16:34:09,560 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
> java.io.IOException: Unable to store for alias: c
> 2008-07-14 16:34:09,560 [main] ERROR org.apache.pig.Main - 
> java.io.IOException: Unable to store for alias: c
> {quote}
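
One workaround sketch for this class of failure: materialize the arithmetic 
expression in a preceding foreach, then aggregate the resulting plain column 
(same placeholder input and output paths as the failing query):

{code}
a = load ':INPATH:/singlefile/studenttab10k' as (name:chararray, age:int, gpa:double);
a2 = foreach a generate name, age * gpa as product; -- evaluate per tuple first
b = group a2 by name;
c = foreach b generate group, SUM(a2.product);      -- aggregate a single column
store c into ':OUTPATH:';
{code}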

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-496) project of bags from complex data causes failures

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-496.


  Resolution: Fixed
Hadoop Flags: [Reviewed]

Review notes:
https://reviews.apache.org/r/272/

Patch committed to trunk.

> project of bags from complex data causes failures
> -
>
> Key: PIG-496
> URL: https://issues.apache.org/jira/browse/PIG-496
> Project: Pig
>  Issue Type: Bug
>Reporter: Olga Natkovich
>Assignee: Daniel Dai
>Priority: Minor
> Fix For: 0.9.0
>
> Attachments: PIG-496-1.patch
>
>
> A = load 'complex data' as (x: bag{});
> B = foreach A generate x.($1, $2);
> produces stack trace:
> 2008-10-14 15:11:07,639 [main] ERROR 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher - Error 
> message from task (reduce) 
> task_200809241441_9923_r_00: java.lang.NullPointerException
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:183)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:215)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:166)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:252)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:222)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:134)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
> at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> Pradeep suspects that the problem is in 
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java;
>  line 374
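
A hedged sketch of one way to sidestep the failure, assuming the NPE stems 
from the bag's missing inner schema: declare the tuple schema at load time so 
the positional projection has fields to resolve (field names are 
illustrative):

{code}
A = load 'complex data' as (x: bag{t:(f0:chararray, f1:int, f2:int)});
B = foreach A generate x.($1, $2);  -- yields a bag of (f1, f2) tuples
{code}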

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-730) problem combining schema from a union of several LOAD expressions, with a nested bag inside the schema.

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-730.


  Resolution: Fixed
Hadoop Flags: [Reviewed]

Review notes:
https://reviews.apache.org/r/273/

Patch committed to trunk.

> problem combining schema from a union of several LOAD expressions, with a 
> nested bag inside the schema.
> ---
>
> Key: PIG-730
> URL: https://issues.apache.org/jira/browse/PIG-730
> Project: Pig
>  Issue Type: Bug
> Environment: pig local mode
>Reporter: Christopher Olston
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-730-1.patch, PIG-730-2.patch
>
>
> grunt> a = load 'foo' using BinStorage as 
> (url:chararray,outlinks:{t:(target:chararray,text:chararray)});
> grunt> b = union (load 'foo' using BinStorage as 
> (url:chararray,outlinks:{t:(target:chararray,text:chararray)})), (load 'bar' 
> using BinStorage as 
> (url:chararray,outlinks:{t:(target:chararray,text:chararray)}));
> grunt> c = foreach a generate flatten(outlinks.target);
> grunt> d = foreach b generate flatten(outlinks.target);
> ---> Would expect both C and D to work, but only C works. D gives the error 
> shown below.
> ---> Turns out using outlinks.t.target (instead of outlinks.target) works for 
> D but not for C.
> ---> I don't care which one, but the same syntax should work for both!
> 2009-03-24 13:15:05,376 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 1000: Error during parsing. Invalid alias: target in {t: (target: 
> chararray,text: chararray)}
> Details at logfile: /echo/olston/data/pig_1237925683748.log
> grunt> quit
> $ cat pig_1237925683748.log 
> ERROR 1000: Error during parsing. Invalid alias: target in {t: (target: 
> chararray,text: chararray)}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
> parsing. Invalid alias: target in {t: (target: chararray,text: chararray)}
> at org.apache.pig.PigServer.parseQuery(PigServer.java:317)
> at org.apache.pig.PigServer.registerQuery(PigServer.java:276)
> at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:529)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:280)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
> at org.apache.pig.Main.main(Main.java:321)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid 
> alias: target in {t: (target: chararray,text: chararray)}
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:6042)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5898)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BracketedSimpleProj(QueryParser.java:5423)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4100)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3967)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3920)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3829)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3755)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3721)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3617)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3557)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3514)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2985)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2395)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1028)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:804)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:595)
> at 
> org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
> at org.apache.pig.PigServer.parseQuery(PigServer.java:310)
> ... 6 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-730) problem combining schema from a union of several LOAD expressions, with a nested bag inside the schema.

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-730:
---

Attachment: PIG-730-2.patch

PIG-730-2.patch resyncs the patch with the current trunk.

> problem combining schema from a union of several LOAD expressions, with a 
> nested bag inside the schema.
> ---
>
> Key: PIG-730
> URL: https://issues.apache.org/jira/browse/PIG-730
> Project: Pig
>  Issue Type: Bug
> Environment: pig local mode
>Reporter: Christopher Olston
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-730-1.patch, PIG-730-2.patch
>
>
> grunt> a = load 'foo' using BinStorage as 
> (url:chararray,outlinks:{t:(target:chararray,text:chararray)});
> grunt> b = union (load 'foo' using BinStorage as 
> (url:chararray,outlinks:{t:(target:chararray,text:chararray)})), (load 'bar' 
> using BinStorage as 
> (url:chararray,outlinks:{t:(target:chararray,text:chararray)}));
> grunt> c = foreach a generate flatten(outlinks.target);
> grunt> d = foreach b generate flatten(outlinks.target);
> ---> Would expect both C and D to work, but only C works. D gives the error 
> shown below.
> ---> Turns out using outlinks.t.target (instead of outlinks.target) works for 
> D but not for C.
> ---> I don't care which one, but the same syntax should work for both!
> 2009-03-24 13:15:05,376 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 1000: Error during parsing. Invalid alias: target in {t: (target: 
> chararray,text: chararray)}
> Details at logfile: /echo/olston/data/pig_1237925683748.log
> grunt> quit
> $ cat pig_1237925683748.log 
> ERROR 1000: Error during parsing. Invalid alias: target in {t: (target: 
> chararray,text: chararray)}
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during 
> parsing. Invalid alias: target in {t: (target: chararray,text: chararray)}
> at org.apache.pig.PigServer.parseQuery(PigServer.java:317)
> at org.apache.pig.PigServer.registerQuery(PigServer.java:276)
> at 
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:529)
> at 
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:280)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
> at org.apache.pig.Main.main(Main.java:321)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid 
> alias: target in {t: (target: chararray,text: chararray)}
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:6042)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:5898)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BracketedSimpleProj(QueryParser.java:5423)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4100)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:3967)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:3920)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:3829)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:3755)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:3721)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:3617)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:3557)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:3514)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:2985)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2395)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1028)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:804)
> at 
> org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:595)
> at 
> org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:60)
> at org.apache.pig.PigServer.parseQuery(PigServer.java:310)
> ... 6 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-767) Schema reported from DESCRIBE and actual schema of inner bags are different.

2011-01-24 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985878#action_12985878
 ] 

Daniel Dai commented on PIG-767:


Patch committed to trunk.

> Schema reported from DESCRIBE and actual schema of inner bags are different.
> 
>
> Key: PIG-767
> URL: https://issues.apache.org/jira/browse/PIG-767
> Project: Pig
>  Issue Type: Bug
>Reporter: George Mavromatis
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-767-1.patch, PIG-767-2.patch, PIG-767-3.patch, 
> PIG-767-4.patch
>
>
> The following script:
> urlContents = LOAD 'inputdir' USING BinStorage() AS (url:bytearray, 
> pg:bytearray);
> -- describe and dump are in-sync
> DESCRIBE urlContents;
> DUMP urlContents;
> urlContentsG = GROUP urlContents BY url;
> DESCRIBE urlContentsG;
> urlContentsF = FOREACH urlContentsG GENERATE group,urlContents.pg;
> DESCRIBE urlContentsF;
> DUMP urlContentsF;
> Prints for the DESCRIBE commands:
> urlContents: {url: chararray,pg: chararray}
> urlContentsG: {group: chararray,urlContents: {url: chararray,pg: chararray}}
> urlContentsF: {group: chararray,pg: {pg: chararray}}
> The reported schemas for urlContentsG and urlContentsF are wrong. They are 
> also against the section "Schemas for Complex Data Types" in 
> http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm#_Schemas.
> As expected, actual data observed from DUMP urlContentsG and DUMP 
> urlContentsF do contain the tuple inside the inner bags.
> The correct schema for urlContentsG is:  {group: chararray,urlContents: 
> {t1:(url: chararray,pg: chararray)}}
> This may sound like a technicality, but it isn't. For instance, a UDF that 
> assumes an inner bag of {chararray} will not work with {(chararray)}. 
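
A short Pig Latin illustration of the distinction, under the corrected 
schemas: the inner bag always holds tuples, even single-field ones, so 
consuming code must unwrap a tuple per bag item rather than expect bare 
chararrays (the DESCRIBE output shown is approximate):

{code}
urlContentsF = FOREACH urlContentsG GENERATE group, urlContents.pg;
DESCRIBE urlContentsF;
-- roughly: urlContentsF: {group: chararray,pg: {(pg: chararray)}}
-- a bag of one-field tuples, not a bag of bare chararrays
{code}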

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (PIG-1769) Consistency for HBaseStorage

2011-01-24 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985879#action_12985879
 ] 

Alan Gates commented on PIG-1769:
-

+1, changes look good.  I'll run the test-patch and unit tests on it.

> Consistency for HBaseStorage
> 
>
> Key: PIG-1769
> URL: https://issues.apache.org/jira/browse/PIG-1769
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.8.0
>Reporter: Corbin Hoenes
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.9.0
>
> Attachments: PIG_1769.patch
>
>
> In our load statement we are allowed to prefix the table name with "hbase://" 
> but when we call
> store it throws an exception unless we remove hbase:// from the table
> name:
> this works:
> store raw into 'piggytest2' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1
> anchor2:field1a anchor2:field2a');
> this won't
> store raw into 'hbase://piggytest2'
> Exception:
> Caused by: java.lang.IllegalArgumentException:
> java.net.URISyntaxException: Relative path in absolute URI:
> hbase://piggytest2_logs
> Would be nice to be able to prefix the store with hbase:// so it's consistent 
> with the load syntax
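
A sketch of the symmetric syntax the committed patch allows (the column list 
is copied from the report; the output table name is illustrative):

{code}
raw = LOAD 'hbase://piggytest2' USING
  org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1 anchor2:field1a anchor2:field2a');
STORE raw INTO 'hbase://piggytest2_copy' USING
  org.apache.pig.backend.hadoop.hbase.HBaseStorage('content2:field1 anchor2:field1a anchor2:field2a');
{code}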

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (PIG-767) Schema reported from DESCRIBE and actual schema of inner bags are different.

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-767.


  Resolution: Fixed
Hadoop Flags: [Reviewed]

Review notes: https://reviews.apache.org/r/278/

> Schema reported from DESCRIBE and actual schema of inner bags are different.
> 
>
> Key: PIG-767
> URL: https://issues.apache.org/jira/browse/PIG-767
> Project: Pig
>  Issue Type: Bug
>Reporter: George Mavromatis
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-767-1.patch, PIG-767-2.patch, PIG-767-3.patch, 
> PIG-767-4.patch
>
>
> The following script:
> urlContents = LOAD 'inputdir' USING BinStorage() AS (url:bytearray, 
> pg:bytearray);
> -- describe and dump are in-sync
> DESCRIBE urlContents;
> DUMP urlContents;
> urlContentsG = GROUP urlContents BY url;
> DESCRIBE urlContentsG;
> urlContentsF = FOREACH urlContentsG GENERATE group,urlContents.pg;
> DESCRIBE urlContentsF;
> DUMP urlContentsF;
> Prints for the DESCRIBE commands:
> urlContents: {url: chararray,pg: chararray}
> urlContentsG: {group: chararray,urlContents: {url: chararray,pg: chararray}}
> urlContentsF: {group: chararray,pg: {pg: chararray}}
> The reported schemas for urlContentsG and urlContentsF are wrong. They are 
> also against the section "Schemas for Complex Data Types" in 
> http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm#_Schemas.
> As expected, actual data observed from DUMP urlContentsG and DUMP 
> urlContentsF do contain the tuple inside the inner bags.
> The correct schema for urlContentsG is:  {group: chararray,urlContents: 
> {t1:(url: chararray,pg: chararray)}}
> This may sound like a technicality, but it isn't. For instance, a UDF that 
> assumes an inner bag of {chararray} will not work with {(chararray)}. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Review Request: Schema reported from DESCRIBE and actual schema of inner bags are different.

2011-01-24 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/278/
---

(Updated 2011-01-24 10:28:18.178511)


Review request for pig and Richard Ding.


Summary
---

The following script:

urlContents = LOAD 'inputdir' USING BinStorage() AS (url:bytearray, 
pg:bytearray);
-- describe and dump are in-sync
DESCRIBE urlContents;
DUMP urlContents;

urlContentsG = GROUP urlContents BY url;
DESCRIBE urlContentsG;

urlContentsF = FOREACH urlContentsG GENERATE group,urlContents.pg;

DESCRIBE urlContentsF;
DUMP urlContentsF;

Prints for the DESCRIBE commands:

urlContents: {url: chararray,pg: chararray}
urlContentsG: {group: chararray,urlContents: {url: chararray,pg: chararray}}
urlContentsF: {group: chararray,pg: {pg: chararray}}

The reported schemas for urlContentsG and urlContentsF are wrong. They are also 
against the section "Schemas for Complex Data Types" in 
http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm#_Schemas.

As expected, actual data observed from DUMP urlContentsG and DUMP urlContentsF 
do contain the tuple inside the inner bags.

The correct schema for urlContentsG is: {group: chararray,urlContents: 
{t1:(url: chararray,pg: chararray)}}

This may sound like a technicality, but it isn't. For instance, a UDF that 
assumes an inner bag of {chararray} will not work with {(chararray)}. 


This addresses bug PIG-767.
https://issues.apache.org/jira/browse/PIG-767


Diffs
-

  
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOCogroup.java
 1057928 
  
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOGenerate.java
 1057928 
  
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOInnerLoad.java
 1057928 
  
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestLogicalPlanMigrationVisitor.java
 1057928 
  
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNewPlanLogToPhyTranslationVisitor.java
 1057928 
  
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestSchema.java
 1057928 

Diff: https://reviews.apache.org/r/278/diff


Testing (updated)
---

Test-patch:
 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 9 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.

Unit-test:
all pass.

End-to-end test:
all pass.


Thanks,

Daniel



[jira] Resolved: (PIG-1786) Move describe/nested describe to new logical plan

2011-01-24 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-1786.
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Patch committed to trunk.

> Move describe/nested describe to new logical plan
> -
>
> Key: PIG-1786
> URL: https://issues.apache.org/jira/browse/PIG-1786
> Project: Pig
>  Issue Type: Sub-task
>  Components: impl
>Affects Versions: 0.9.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.9.0
>
> Attachments: PIG-1786-1.patch, PIG-1786-2.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Review Request: Move describe/nested describe to new logical plan

2011-01-24 Thread Daniel Dai

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/314/
---

(Updated 2011-01-24 10:16:09.970421)


Review request for pig and thejas.


Summary
---

PIG-1786-1.patch is based on LogicalPlanMigrationVistor. After the new parser 
is done, we need to move it completely to the new logical plan. I want to fix 
it like this for now to unblock other issues in the semantic cleanup.


This addresses bug PIG-1786.
https://issues.apache.org/jira/browse/PIG-1786


Diffs
-

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 
1058308 
  
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOForEach.java
 1058308 
  
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOGenerate.java
 1058308 
  
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPigServer.java
 1058308 

Diff: https://reviews.apache.org/r/314/diff


Testing (updated)
---

Test-patch:
 [exec] -1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] -1 release audit.  The applied patch generated 479 release 
audit warnings (more than the trunk's current 478 warnings).

No new files were added, so the "release audit" result can be ignored.

Unit-test:
all pass

End-to-end test:
all pass


Thanks,

Daniel



Pig developer meeting in February

2011-01-24 Thread Olga Natkovich
Hi Guys,

I think it is time for us to have another meeting. Yahoo would be happy to host 
if this works for everybody. How about Wednesday, 2/9, 4-6 pm? Please let us 
know if you are planning to attend and whether the date/time works for you.

Things that come to mind to discuss (as always, feel free to suggest others):

-  Error handling proposal - this might be easier to finalize face-to-face
-  Pig 0.9 plan
-  Pig Roadmap beyond 0.9
   o  What do we want to do in Pig.next?
   o  Are we ready for Pig 1.0?

Olga


Re: Pig 0.8.0 in Maven

2011-01-24 Thread Gianmarco
Great!
I was waiting for this :)
--
Gianmarco De Francisci Morales



On Sat, Jan 22, 2011 at 01:00, Richard Ding  wrote:
> Good news. Pig 0.8.0 now is available through maven repository:
>
> http://repo1.maven.org/maven2/org/apache/pig/pig/0.8.0/
>
> Thanks
> -- Richard
>
>


Re: Pig 0.8.0 in Maven

2011-01-24 Thread Charles Gonçalves
Great news.

On Fri, Jan 21, 2011 at 10:00 PM, Richard Ding  wrote:

> Good news. Pig 0.8.0 now is available through maven repository:
>
> http://repo1.maven.org/maven2/org/apache/pig/pig/0.8.0/
>
> Thanks
> -- Richard
>
>


-- 
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.:  55 31 34741485
Lab.: 55 31 34095840


RE: Pig 0.8.0 in Maven

2011-01-24 Thread Gerrit van Vuuren
Thanks guys
I'm sure that this will make life easier for everybody trying to use pig in 
maven/ivy projects. At least for my projects it will.



-Original Message-
From: Charles Gonçalves [mailto:charles...@gmail.com] 
Sent: Sunday, January 23, 2011 3:31 AM
To: u...@pig.apache.org
Cc: dev@pig.apache.org
Subject: Re: Pig 0.8.0 in Maven

Great news.

On Fri, Jan 21, 2011 at 10:00 PM, Richard Ding  wrote:

> Good news. Pig 0.8.0 now is available through maven repository:
>
> http://repo1.maven.org/maven2/org/apache/pig/pig/0.8.0/
>
> Thanks
> -- Richard
>
>


-- 
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.:  55 31 34741485
Lab.: 55 31 34095840