[jira] [Resolved] (PIG-3307) Refactor physical operators to remove methods parameters that are always null

2013-05-17 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PIG-3307.


   Resolution: Fixed
Fix Version/s: 0.12
 Hadoop Flags: Reviewed

> Refactor physical operators to remove methods parameters that are always null
> -
>
> Key: PIG-3307
> URL: https://issues.apache.org/jira/browse/PIG-3307
> Project: Pig
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
> Fix For: 0.12
>
> Attachments: PIG-3307_0.patch, PIG-3307_1.patch, PIG-3307_2.patch, 
> PIG-3307_3.patch
>
>
> The physical operators are sometimes overly complex. I'm trying to clean up 
> some unnecessary code.
> In particular, there is an array of getNext(*T* v) methods where the value v 
> does not seem to have any importance and is just used to pick the correct method.
> I have started a refactoring for a more readable getNext*T*().
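
For readers unfamiliar with the pattern described in the issue, here is a minimal sketch of the before/after shape of the refactoring. The class names (SketchResult, OldStyleOperator, NewStyleOperator, GetNextSketch) and the returned values are hypothetical stand-ins for illustration only; they are not the actual Pig classes or signatures from the patch.

```java
// Hypothetical, simplified stand-in for an operator result; not a Pig class.
final class SketchResult {
    final Object value;

    SketchResult(Object value) {
        this.value = value;
    }
}

// Before: the argument carries no data. Its static type only selects the
// overload, so callers end up passing a (typed) null.
class OldStyleOperator {
    SketchResult getNext(Integer unused) {
        return new SketchResult(42);
    }

    SketchResult getNext(String unused) {
        return new SketchResult("forty-two");
    }
}

// After: the type moves into the method name and the dummy argument goes away.
class NewStyleOperator {
    SketchResult getNextInteger() {
        return new SketchResult(42);
    }

    SketchResult getNextString() {
        return new SketchResult("forty-two");
    }
}

class GetNextSketch {
    public static void main(String[] args) {
        // The cast on null exists purely to pick the overload.
        SketchResult before = new OldStyleOperator().getNext((Integer) null);
        // The intent is visible at the call site without a dummy value.
        SketchResult after = new NewStyleOperator().getNextInteger();
        System.out.println(before.value + " " + after.value);
    }
}
```

Both call sites produce the same result; the second form simply no longer needs a value whose only job is overload selection.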

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-3307) Refactor physical operators to remove methods parameters that are always null

2013-05-17 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661202#comment-13661202
 ] 

Julien Le Dem commented on PIG-3307:


committed to TRUNK
Committed revision 1484037



[jira] Subscription: PIG patch available

2013-05-17 Thread jira
Issue Subscription
Filter: PIG patch available (17 issues)

Subscriber: pigdaily

Key         Summary
PIG-3328    DataBags created with an initial list of tuples don't get registered as spillable
            https://issues.apache.org/jira/browse/PIG-3328
PIG-3318    AVRO: 'default value' not honored when merging schemas on load with AvroStorage
            https://issues.apache.org/jira/browse/PIG-3318
PIG-3295    Casting from bytearray failing after Union (even when each field is from a single Loader)
            https://issues.apache.org/jira/browse/PIG-3295
PIG-3285    Jobs using HBaseStorage fail to ship dependency jars
            https://issues.apache.org/jira/browse/PIG-3285
PIG-3258    Patch to allow MultiStorage to use more than one index to generate output tree
            https://issues.apache.org/jira/browse/PIG-3258
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PIG-3257
PIG-3247    Piggybank functions to mimic OVER clause in SQL
            https://issues.apache.org/jira/browse/PIG-3247
PIG-3210    Pig fails to start when it cannot write log to log files
            https://issues.apache.org/jira/browse/PIG-3210
PIG-3199    Expose LogicalPlan via PigServer API
            https://issues.apache.org/jira/browse/PIG-3199
PIG-3166    Update eclipse .classpath according to ivy library.properties
            https://issues.apache.org/jira/browse/PIG-3166
PIG-3123    Simplify Logical Plans By Removing Unneccessary Identity Projections
            https://issues.apache.org/jira/browse/PIG-3123
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3015    Rewrite of AvroStorage
            https://issues.apache.org/jira/browse/PIG-3015
PIG-2248    Pig parser does not detect when a macro name masks a UDF name
            https://issues.apache.org/jira/browse/PIG-2248
PIG-2244    Macros cannot be passed relation names
            https://issues.apache.org/jira/browse/PIG-2244
PIG-1914    Support load/store JSON data in Pig
            https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Egil Sorensen (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661145#comment-13661145
 ] 

Egil Sorensen commented on PIG-3323:


Thanks, Viraj, for digging into this. I am content that at least a legit 
problem was uncovered in the process. 

> AVRO: default value not stored in file when given as paramter to AvroStorage
> 
>
> Key: PIG-3323
> URL: https://issues.apache.org/jira/browse/PIG-3323
> Project: Pig
> Issue Type: Bug
> Components: piggybank
> Affects Versions: 0.11.2
> Reporter: Egil Sorensen
> Assignee: Viraj Bhat
> Labels: patch
> Fix For: 0.12, 0.11.2
>
>
> A pig script like the below succeeds, but inspecting the resulting file I 
> find that the schema is stripped of the default value specification.
> {code}
> a = load ':INPATH:/types/numbers.txt' using PigStorage(':') as (intnum1000: int,
>     id: int, intnum5: int, intnum100: int, intnum: int, longnum: long,
>     floatnum: float, doublenum: double);
> b2 = foreach a generate id, intnum5, intnum100;
> c2 = filter b2 by 110 <= id and id < 120;
> describe c2;
> dump c2;
> store c2 into ':OUTPATH:.intermediate_2' USING 
> org.apache.pig.piggybank.storage.avro.AvroStorage('
> {
>    "debug" : 5,
>    "schema" : {
>       "name" : "schema_2",
>       "type" : "record",
>       "fields" : [
>          {
>             "name" : "id",
>             "type" : [
>                "null",
>                "int"
>             ]
>          },
>          {
>             "name" : "intnum5",
>             "type" : [
>                "null",
>                "int"
>             ]
>          },
>          {
>             "name" : "intnum100",
>             "type" : [
>                "null",
>                "int"
>             ],
>             "default" : 0
>          }
>       ]
>    }
> }
> ');
> {code}
> BTW, the documentation on https://cwiki.apache.org/PIG/avrostorage.html is 
> silent on the subject of defaults, so my first question is: is my expectation 
> that the default should be written to the file incorrect?



[jira] [Updated] (PIG-3307) Refactor physical operators to remove methods parameters that are always null

2013-05-17 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-3307:
---

Attachment: PIG-3307_3.patch

PIG-3307_3.patch addresses [~cheolsoo]'s comments



[jira] [Resolved] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Bhat resolved PIG-3323.
-

Resolution: Invalid



[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661142#comment-13661142
 ] 

Viraj Bhat commented on PIG-3323:
-

One correction on my first comment:
Default values for union fields correspond to the first schema in the union 
according to the specification. So for the above use case posted by Egil, the 
final Output Schema should not contain the default value. 

In fact there is a bug in AvroStorage which does not write the default values 
of the individual fields. I will open another Jira and close this one.

Viraj
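
The rule cited in the correction above (a union field's default must correspond to the first schema in the union) can be illustrated with a small, self-contained check. This is only a sketch under stated assumptions: the class and method names are hypothetical, a union is modelled as a plain list of type names, and only a few primitive types are handled. It does not use the Avro library or reflect AvroStorage's code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; a union is modelled as an ordered list of type names.
class UnionDefaultSketch {

    // Returns true when defaultValue is acceptable for the FIRST branch of the
    // union, which is the branch a union field's default has to match.
    static boolean defaultMatchesFirstBranch(List<String> unionBranches, Object defaultValue) {
        if (unionBranches.isEmpty()) {
            return false;
        }
        switch (unionBranches.get(0)) {
            case "null":
                return defaultValue == null;
            case "int":
                return defaultValue instanceof Integer;
            case "string":
                return defaultValue instanceof String;
            default:
                return false; // other types are omitted from this sketch
        }
    }

    public static void main(String[] args) {
        // ["null", "int"] with default 0, as in the reported schema: 0 is not a null.
        System.out.println(defaultMatchesFirstBranch(Arrays.asList("null", "int"), 0));
        // Reordering the union to ["int", "null"] makes 0 match the first branch.
        System.out.println(defaultMatchesFirstBranch(Arrays.asList("int", "null"), 0));
    }
}
```

Under this simplified model, the `intnum100` field from the report fails the check because its union starts with "null", while the same default would pass if "int" were listed first.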



Re: Review Request: Refactor physical operators to remove methods parameters that are always null

2013-05-17 Thread Julien Le Dem


> On May 17, 2013, 3:12 p.m., Cheolsoo Park wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java,
> >  lines 394-395
> > 
> >
> > This isn't what you introduced, but I think this is incorrect.
> > 
> > Shouldn't "in.getNextBigDecimal()" be "in.getNextBigInteger()" since 
> > we're casting BI to BD here?

Yep, looks like a bug. Good catch!
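
To make the point of the review comment concrete, here is a minimal sketch of a BigInteger-to-BigDecimal cast that reads its input through the BigInteger accessor. The `Input` interface and the `BigCastSketch` class are hypothetical stand-ins that only borrow the accessor names from the comment; this is not the actual POCast code.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class BigCastSketch {

    // Hypothetical minimal input; only the accessor names mirror the review comment.
    interface Input {
        BigInteger getNextBigInteger();

        BigDecimal getNextBigDecimal();
    }

    // Casting BI to BD: the source value must be read as a BigInteger,
    // then widened with the BigDecimal(BigInteger) constructor.
    static BigDecimal castBigIntegerToBigDecimal(Input in) {
        BigInteger value = in.getNextBigInteger();
        return value == null ? null : new BigDecimal(value);
    }

    // Test helper: an input that only holds a BigInteger, so calling the
    // wrong accessor fails loudly instead of silently misbehaving.
    static Input bigIntegerSource(final BigInteger value) {
        return new Input() {
            @Override
            public BigInteger getNextBigInteger() {
                return value;
            }

            @Override
            public BigDecimal getNextBigDecimal() {
                throw new IllegalStateException("input holds a BigInteger, not a BigDecimal");
            }
        };
    }

    public static void main(String[] args) {
        Input in = bigIntegerSource(new BigInteger("12345678901234567890"));
        System.out.println(castBigIntegerToBigDecimal(in));
    }
}
```

With an input that really holds a BigInteger, asking it for a BigDecimal is the wrong read, which is the mismatch the comment points out.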


> On May 17, 2013, 3:12 p.m., Cheolsoo Park wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java,
> >  lines 301-302
> > 
> >
> > Can you fix indentation here?

That's because I generated the patch ignoring whitespace to make it more readable.
I will commit with the correct indentation.


- Julien


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11203/#review20674
---


On May 16, 2013, 9:35 p.m., Julien Le Dem wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11203/
> ---
> 
> (Updated May 16, 2013, 9:35 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Dmitriy Ryaboy, Cheolsoo Park, and Bill 
> Graham.
> 
> 
> Description
> ---
> 
> Refactor physical operators to remove methods parameters that are always null
> 
> 
> This addresses bug PIG-3307.
> https://issues.apache.org/jira/browse/PIG-3307
> 
> 
> Diffs
> -
> 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MergeJoinIndexer.java
>  d5aff3d 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java
>  6cfc8c0 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java
>  7c499f6 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java
>  6145214 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
>  fc0112a 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java
>  5bceca6 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/BinaryComparisonOperator.java
>  3e434f3 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ComparisonOperator.java
>  51d9f34 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ConstantExpression.java
>  7e4cffa 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java
>  bdcc72b 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java
>  a767c36 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ExpressionOperator.java
>  9cca2c3 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java
>  b5e3c83 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java
>  f3b5d44 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java
>  35786c0 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java
>  c9b3157 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java
>  1108846 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java
>  2795b78 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java
>  294f84a 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POAnd.java
>  f24c2ac 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POBinCond.java
>  312f3ac 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java
>  987cc21 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POIsNull.java
>  9ea89f7 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POMapLookUp.java
>  fd5573f 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PONegative.java
>  8d3fcb1 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PONot.java
>  973dfc5 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POOr.java
>  498eb12 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalL

Jenkins build is back to normal : Pig-trunk #1477

2013-05-17 Thread Apache Jenkins Server
See 



[jira] [Updated] (PIG-3015) Rewrite of AvroStorage

2013-05-17 Thread Joseph Adler (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Adler updated PIG-3015:
--

Attachment: PIG-3015-12.patch

Incremental patch that adds support for push down projections, fixes some bugs 
with options, and gets all the test cases working again.

> Rewrite of AvroStorage
> --
>
> Key: PIG-3015
> URL: https://issues.apache.org/jira/browse/PIG-3015
> Project: Pig
> Issue Type: Improvement
> Components: piggybank
> Reporter: Joseph Adler
> Assignee: Joseph Adler
> Attachments: bad.avro, good.avro, PIG-3015-10.patch, 
> PIG-3015-11.patch, PIG-3015-12.patch, PIG-3015-2.patch, PIG-3015-3.patch, 
> PIG-3015-4.patch, PIG-3015-5.patch, PIG-3015-6.patch, PIG-3015-7.patch, 
> PIG-3015-9.patch, PIG-3015-doc-2.patch, PIG-3015-doc.patch, TestInput.java, 
> Test.java, with_dates.pig
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni (as 
> TrevniStorage).
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.



[jira] [Commented] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants

2013-05-17 Thread Joseph Adler (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661059#comment-13661059
 ] 

Joseph Adler commented on PIG-3330:
---

Thanks for checking in the file; looks like that was the only problem.

> please fix the change that created a dependency on 
> org.apache.pig.impl.PigImplConstants
> ---
>
> Key: PIG-3330
> URL: https://issues.apache.org/jira/browse/PIG-3330
> Project: Pig
> Issue Type: Bug
> Reporter: Joseph Adler
> Assignee: Bill Graham
> Priority: Blocker
>
> I can't build Pig from trunk because several source files (including 
> org.apache.pig.Main.java) require org.apache.pig.impl.PigImplConstants, but 
> that class isn't available.
> I'm assuming someone left out a file on a recent commit.



[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661019#comment-13661019
 ] 

Viraj Bhat commented on PIG-3323:
-

Spoke to Egil offline.
His original questions were:
1) Should the default value be written to a file?
Ans) It should be, if it is specified for a valid complex type.

2) Should the default schema specification be written to the file's metadata?
Ans) It should be, if it is valid for that complex type. Since Union does not 
support default, it was not written out. But we need to see how the default 
schemas work for other data types.

Viraj



[jira] [Resolved] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants

2013-05-17 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham resolved PIG-3330.
--

Resolution: Fixed

My bad, I made the commit last night and forgot 'svn add'. Just made the fix by 
adding the missing file.



[jira] [Updated] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants

2013-05-17 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-3330:
-

Assignee: Bill Graham



[jira] [Created] (PIG-3330) please fix the change that created a dependency on org.apache.pig.impl.PigImplConstants

2013-05-17 Thread Joseph Adler (JIRA)
Joseph Adler created PIG-3330:
-

 Summary: please fix the change that created a dependency on 
org.apache.pig.impl.PigImplConstants
 Key: PIG-3330
 URL: https://issues.apache.org/jira/browse/PIG-3330
 Project: Pig
  Issue Type: Bug
Reporter: Joseph Adler
Priority: Blocker


I can't build Pig from trunk because several source files (including 
org.apache.pig.Main.java) require org.apache.pig.impl.PigImplConstants, but 
that class isn't available.

I'm assuming someone left out a file on a recent commit.



[jira] [Commented] (PIG-3323) AVRO: default value not stored in file when given as paramter to AvroStorage

2013-05-17 Thread Viraj Bhat (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660959#comment-13660959
 ] 

Viraj Bhat commented on PIG-3323:
-

Hi Egil,
I looked at the specification of the UNION and default types, and at the source 
code in "PigAvroDatumWriter".
The field "intnum100" is a UNION of "null" and "int", so its type can be "null" 
or "int".
That means that if Pig does not find a value for "intnum100" in the step 
before the store, it will generate null, which is perfectly acceptable here. So 
the default value makes no sense here if the item does not exist. 
Also, if you remove "null" from the specification of "intnum100" and hope the 
default value is written out, there is another problem: 

If you read the specification for Unions 
http://avro.apache.org/docs/current/spec.html#Unions plus the
section on default values 
http://avro.apache.org/docs/current/spec.html#schema_complex
Union does not have any default values in the specification. 

Closing as INVALID.
Regards
Viraj



[jira] [Updated] (PIG-2378) macros don't accept references to items within tuples as arguments

2013-05-17 Thread Johnny Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johnny Zhang updated PIG-2378:
--

Status: Open  (was: Patch Available)

Turns out it breaks several unit tests, so I am cancelling the current patch.

> macros don't accept references to items within tuples as arguments
> --
>
> Key: PIG-2378
> URL: https://issues.apache.org/jira/browse/PIG-2378
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.9.1
> Reporter: Joseph Adler
> Assignee: Johnny Zhang
> Attachments: PIG-2378.patch.txt, PIG-2378.patch.txt
>
>
> I'd like to be able to pass a reference to an item within a parameter to a 
> Pig macro.
> For example, suppose that I had a relation A with the schema A:{id:long, 
> header:(time:long, type:chararray)}. I'd like to call a macro by typing:
>    B = MY_MACRO(A, header.time);
> but this does not currently work. Obviously, I could define a new relation as 
> a workaround; for example, I could use some Pig code like 
>    AA = FOREACH A GENERATE *, header.time as time;
>    B = MY_MACRO(AA, time);
> But that's ugly and clunky.



[jira] [Commented] (PIG-3307) Refactor physical operators to remove methods parameters that are always null

2013-05-17 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660784#comment-13660784
 ] 

Cheolsoo Park commented on PIG-3307:


I am giving my +1 because this makes code much cleaner.

I haven't done any performance benchmarks because I agree with Julien's 
assessment. But please advise if anyone has concerns.

> Refactor physical operators to remove methods parameters that are always null
> -
>
> Key: PIG-3307
> URL: https://issues.apache.org/jira/browse/PIG-3307
> Project: Pig
> Issue Type: Improvement
> Reporter: Julien Le Dem
> Assignee: Julien Le Dem
> Attachments: PIG-3307_0.patch, PIG-3307_1.patch, PIG-3307_2.patch
>
>
> The physical operators are sometimes overly complex. I'm trying to clean up 
> some unnecessary code.
> In particular, there is an array of getNext(*T* v) methods where the value v 
> does not seem to have any importance and is just used to pick the correct method.
> I have started a refactoring for a more readable getNext*T*().

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: Refactor physical operators to remove methods parameters that are always null

2013-05-17 Thread Cheolsoo Park

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11203/#review20674
---

Ship it!


Looks good to me. I only have minor comments as below. Do you mind fixing them 
when you commit?

I also confirmed that all unit tests pass.


src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java


Can you fix indentation here?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ExpressionOperator.java


Can you fix indentation here?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java


You didn't introduce this, but I think it is incorrect.

Shouldn't "in.getNextBigDecimal()" be "in.getNextBigInteger()" since we're 
casting BI to BD here?
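
The point can be sketched in isolation as follows. TypedInput, BigIntegerInput and BigIntegerToBigDecimalCast are hypothetical names used only for illustration, and the getters return plain values instead of whatever POCast's input actually returns; only the getter names come from the comment above.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical input operator, standing in for the "in" mentioned above.
interface TypedInput {
    BigInteger getNextBigInteger();
    BigDecimal getNextBigDecimal();
}

// Hypothetical input that really holds a BigInteger value.
class BigIntegerInput implements TypedInput {
    private final Object value;

    BigIntegerInput(BigInteger value) {
        this.value = value;
    }

    public BigInteger getNextBigInteger() {
        return (BigInteger) value;
    }

    // In this sketch, asking for the wrong type fails with a ClassCastException.
    public BigDecimal getNextBigDecimal() {
        return (BigDecimal) value;
    }
}

class BigIntegerToBigDecimalCast {
    // Casting BI to BD: read the input as the type it holds, then convert.
    static BigDecimal cast(TypedInput in) {
        BigInteger value = in.getNextBigInteger();
        return value == null ? null : new BigDecimal(value);
    }
}
```

In this sketch the conversion step uses the standard BigDecimal(BigInteger) constructor, so the source value has to be fetched with the BigInteger getter first.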


- Cheolsoo Park


On May 16, 2013, 9:35 p.m., Julien Le Dem wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11203/
> ---
> 
> (Updated May 16, 2013, 9:35 p.m.)
> 
> 
> Review request for pig, Daniel Dai, Dmitriy Ryaboy, Cheolsoo Park, and Bill 
> Graham.
> 
> 
> Description
> ---
> 
> Refactor physical operators to remove methods parameters that are always null
> 
> 
> This addresses bug PIG-3307.
> https://issues.apache.org/jira/browse/PIG-3307
> 
> 
> Diffs
> -
> 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MergeJoinIndexer.java
>  d5aff3d 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java
>  6cfc8c0 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java
>  7c499f6 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java
>  6145214 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
>  fc0112a 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java
>  5bceca6 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/BinaryComparisonOperator.java
>  3e434f3 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ComparisonOperator.java
>  51d9f34 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ConstantExpression.java
>  7e4cffa 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java
>  bdcc72b 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java
>  a767c36 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/ExpressionOperator.java
>  9cca2c3 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java
>  b5e3c83 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java
>  f3b5d44 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java
>  35786c0 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java
>  c9b3157 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java
>  1108846 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java
>  2795b78 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java
>  294f84a 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POAnd.java
>  f24c2ac 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POBinCond.java
>  312f3ac 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java
>  987cc21 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POIsNull.java
>  9ea89f7 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POMapLookUp.java
>  fd5573f 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PONegative.java
>  8d3fcb1 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PONot.java
>  973dfc5 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POOr.java
>  498eb12 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProje

Build failed in Jenkins: Pig-trunk #1476

2013-05-17 Thread Apache Jenkins Server
See 

Changes:

[billgraham] PIG-3317: disable optimizations via pig properties (traviscrawford via billgraham)

--
[...truncated 2676 lines...]
[ivy:resolve]   found log4j#log4j;1.2.16 in fs
[ivy:resolve]   found org.slf4j#slf4j-log4j12;1.6.1 in fs
[ivy:resolve]   found org.apache.avro#avro;1.5.3 in fs
[ivy:resolve]   found com.thoughtworks.paranamer#paranamer;2.3 in fs
[ivy:resolve]   found org.xerial.snappy#snappy-java;1.0.3.2 in fs
[ivy:resolve]   found org.slf4j#slf4j-api;1.6.1 in fs
[ivy:resolve]   found com.googlecode.json-simple#json-simple;1.1 in fs
[ivy:resolve]   found com.jcraft#jsch;0.1.38 in fs
[ivy:resolve]   found jline#jline;0.9.94 in fs
[ivy:resolve]   found net.java.dev.javacc#javacc;4.2 in maven2
[ivy:resolve]   found org.codehaus.groovy#groovy-all;1.8.6 in maven2
[ivy:resolve]   found org.codehaus.jackson#jackson-mapper-asl;1.8.8 in fs
[ivy:resolve]   found org.codehaus.jackson#jackson-core-asl;1.8.8 in fs
[ivy:resolve]   found org.fusesource.jansi#jansi;1.9 in maven2
[ivy:resolve]   found joda-time#joda-time;2.1 in maven2
[ivy:resolve]   found com.google.guava#guava;11.0 in maven2
[ivy:resolve]   found org.python#jython-standalone;2.5.3 in maven2
[ivy:resolve]   found rhino#js;1.7R2 in maven2
[ivy:resolve]   found org.antlr#antlr;3.4 in fs
[ivy:resolve]   found org.antlr#antlr-runtime;3.4 in fs
[ivy:resolve]   found org.antlr#stringtemplate;3.2.1 in fs
[ivy:resolve]   found antlr#antlr;2.7.7 in fs
[ivy:resolve]   found org.antlr#ST4;4.0.4 in fs
[ivy:resolve]   found org.apache.zookeeper#zookeeper;3.4.4 in maven2
[ivy:resolve]   found dk.brics.automaton#automaton;1.11-8 in maven2
[ivy:resolve]   found org.jruby#jruby-complete;1.6.7 in maven2
[ivy:resolve]   found asm#asm;3.3.1 in fs
[ivy:resolve]   found org.apache.hbase#hbase;0.94.1 in maven2
[ivy:resolve]   found org.vafer#jdeb;0.8 in maven2
[ivy:resolve]   found org.mockito#mockito-all;1.8.4 in maven2
[ivy:resolve]   found xalan#xalan;2.7.1 in maven2
[ivy:resolve]   found xalan#serializer;2.7.1 in maven2
[ivy:resolve]   found xml-apis#xml-apis;1.3.04 in fs
[ivy:resolve]   found xerces#xercesImpl;2.10.0 in maven2
[ivy:resolve]   found xml-apis#xml-apis;1.4.01 in maven2
[ivy:resolve]   found junit#junit;4.11 in maven2
[ivy:resolve]   found org.hamcrest#hamcrest-core;1.3 in maven2
[ivy:resolve]   found org.jboss.netty#netty;3.2.2.Final in fs
[ivy:resolve]   found com.github.stephenc.high-scale-lib#high-scale-lib;1.1.1 in fs
[ivy:resolve]   found com.google.protobuf#protobuf-java;2.4.0a in fs
[ivy:resolve]   found com.yammer.metrics#metrics-core;2.1.2 in fs
[ivy:resolve]   found org.slf4j#slf4j-api;1.6.4 in fs
[ivy:resolve]   found org.apache.hive#hive-exec;0.8.0 in maven2
[ivy:resolve]   found junit#junit;3.8.1 in fs
[ivy:resolve]   found com.google.code.p.arat#rat-lib;0.5.1 in maven2
[ivy:resolve]   found commons-collections#commons-collections;3.2 in fs
[ivy:resolve]   found commons-lang#commons-lang;2.1 in fs
[ivy:resolve]   found jdiff#jdiff;1.0.9 in fs
[ivy:resolve]   found checkstyle#checkstyle;4.2 in maven2
[ivy:resolve]   found commons-beanutils#commons-beanutils-core;1.7.0 in fs
[ivy:resolve]   found commons-cli#commons-cli;1.0 in fs
[ivy:resolve]   found commons-logging#commons-logging;1.0.3 in fs
[ivy:resolve]   found org.codehaus.jackson#jackson-mapper-asl;1.0.1 in fs
[ivy:resolve]   found org.codehaus.jackson#jackson-core-asl;1.0.1 in fs
[ivy:resolve]   found com.sun.jersey#jersey-bundle;1.8 in maven2
[ivy:resolve]   found com.sun.jersey#jersey-server;1.8 in fs
[ivy:resolve]   found com.sun.jersey.contribs#jersey-guice;1.8 in fs
[ivy:resolve]   found commons-httpclient#commons-httpclient;3.1 in fs
[ivy:resolve]   found javax.servlet#servlet-api;2.5 in fs
[ivy:resolve]   found javax.ws.rs#jsr311-api;1.1.1 in maven2
[ivy:resolve]   found javax.inject#javax.inject;1 in fs
[ivy:resolve]   found javax.xml.bind#jaxb-api;2.2.2 in fs
[ivy:resolve]   found com.sun.xml.bind#jaxb-impl;2.2.3-1 in fs
[ivy:resolve]   found com.google.inject#guice;3.0 in fs
[ivy:resolve]   found com.google.inject.extensions#guice-servlet;3.0 in fs
[ivy:resolve]   found aopalliance#aopalliance;1.0 in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-annotations;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-auth;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-common;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-hdfs;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-mapreduce-client-core;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-mapreduce-client-jobclient;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-yarn-server-tests;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-mapreduce-client-app;2.0.3-alpha in fs
[ivy:resolve]   found org.apache.hadoop#hadoop-mapreduce-client-shuffle;2.0.3-alpha in fs
[ivy:resolve]   found 
org.apache.hadoop

[jira] [Commented] (PIG-3314) make better symbol resolving order in Pig

2013-05-17 Thread Johnny Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660469#comment-13660469
 ] 

Johnny Zhang commented on PIG-3314:
---

The symbols I can think of include (from high priority to low priority):
1. Pig keyword
2. default
3. Pig properties
4. macro name
5. macro argument
6. builtin UDF
7. custom UDF
8. script arguments


We may not be able to take care of all of them. Please add any new symbol 
categories you think matter; you can also suggest another order that you think 
is more reasonable.
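
One way to picture the proposed ordering is the sketch below. SymbolKind, Symbol and SymbolResolver are hypothetical names, and the sketch does not describe how Pig's parser resolves names today; it only encodes the list above, with earlier categories masking later ones on a name conflict.

```java
import java.util.List;

// Categories in the order proposed above; earlier constants mask later ones.
enum SymbolKind {
    PIG_KEYWORD, DEFAULT, PIG_PROPERTY, MACRO_NAME, MACRO_ARGUMENT,
    BUILTIN_UDF, CUSTOM_UDF, SCRIPT_ARGUMENT
}

// Hypothetical pairing of a name with the category it was declared in.
class Symbol {
    final String name;
    final SymbolKind kind;

    Symbol(String name, SymbolKind kind) {
        this.name = name;
        this.kind = kind;
    }
}

class SymbolResolver {
    // Returns the highest-priority symbol with the given name, or null if none matches.
    static Symbol resolve(String name, List<Symbol> candidates) {
        Symbol best = null;
        for (Symbol s : candidates) {
            if (!s.name.equals(name)) {
                continue;
            }
            if (best == null || s.kind.ordinal() < best.kind.ordinal()) {
                best = s;
            }
        }
        return best;
    }
}
```

Encoding the order as an enum keeps the priority in one place, so changing the agreed order would only mean reordering the constants.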

> make better symbol resolving order in Pig
> -
>
> Key: PIG-3314
> URL: https://issues.apache.org/jira/browse/PIG-3314
> Project: Pig
>  Issue Type: Improvement
>  Components: grunt, parser
>Reporter: Johnny Zhang
>
> This idea came up while we were trying to resolve PIG-2248; please take a look 
> at the comments starting from 
> https://issues.apache.org/jira/browse/PIG-2248?focusedCommentId=13648831&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13648831
> Basically, we want to make sure that when symbols at different levels in Pig 
> have a name conflict, the higher-level symbol masks the lower-level symbol.
> We want to first agree on the symbols, then we are going to
> (1) add unit tests to make sure it is working as expected; otherwise, open a 
> JIRA and fix it
> (2) document the usage
