[jira] [Updated] (PIG-2782) Specifying sorting field(s) at nightly.conf

2012-07-03 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2782:
---

Status: Patch Available  (was: Open)

> Specifying sorting field(s) at nightly.conf
> ---
>
> Key: PIG-2782
> URL: https://issues.apache.org/jira/browse/PIG-2782
> Project: Pig
>  Issue Type: Bug
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Allan Avendaño
>Assignee: Cheolsoo Park
> Attachments: PIG-2782.patch
>
>
> The Checkin tests fail because one of the parameters passed to sort is 
> incorrect: it uses +1 -2, while the POSIX form is -k2,2. 
> According to http://ss64.com/bash/sort.html, +1 -2 is the old notation.
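
A minimal sketch of the mapping described above, assuming the obsolete key carries only field numbers (no character offsets and no per-key option letters). The helper name obsolete_to_posix_key is hypothetical and is not part of the Pig test harness; it only illustrates that the old +POS1 -POS2 form counts skipped fields from zero, while -k counts fields from one.

```python
import re

def obsolete_to_posix_key(args):
    """Translate ['+1', '-2'] style sort keys into ['-k2,2'] style keys.

    Assumption: only plain field numbers are handled (no '.char' offsets,
    no trailing option letters such as 'n' or 'r').
    """
    out = []
    i = 0
    while i < len(args):
        start = re.fullmatch(r"\+(\d+)", args[i])
        if not start:
            # Not an obsolete key start; pass the argument through unchanged.
            out.append(args[i])
            i += 1
            continue
        # The old notation skips N fields, so the key starts at field N + 1.
        first = int(start.group(1)) + 1
        end = re.fullmatch(r"-(\d+)", args[i + 1]) if i + 1 < len(args) else None
        if end:
            # '-M' ends the key after field M in one-based terms.
            out.append(f"-k{first},{end.group(1)}")
            i += 2
        else:
            # No end position: the key runs to the end of the line.
            out.append(f"-k{first}")
            i += 1
    return out

print(obsolete_to_posix_key(["+1", "-2"]))
```

Under these assumptions, +1 -2 (skip one field, stop after the second) becomes -k2,2, which is the replacement named in the report.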

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2782) Specifying sorting field(s) at nightly.conf

2012-07-03 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2782:
---

Attachment: PIG-2782.patch

Attached is the patch that replaces obsolete sort options with POSIX options in 
the e2e tests.

Modified tests are as follows:

Checkin, Foreach, Order, Types, Limit, Split, MissingColumns, Jython_Checkin, 
and BigData

I verified that all tests pass in local mode.

> Specifying sorting field(s) at nightly.conf
> ---
>
> Key: PIG-2782
> URL: https://issues.apache.org/jira/browse/PIG-2782
> Project: Pig
>  Issue Type: Bug
> Environment: Mac OS X Lion 10.7.3
> Hadoop 1.0.1-SNAPSHOT
> Apache Pig version 0.11.0-SNAPSHOT (r1355798)
>Reporter: Allan Avendaño
>Assignee: Cheolsoo Park
> Attachments: PIG-2782.patch
>
>
> The Checkin tests fail because one of the parameters passed to sort is 
> incorrect: it uses +1 -2, while the POSIX form is -k2,2. 
> According to http://ss64.com/bash/sort.html, +1 -2 is the old notation.





[jira] [Commented] (PIG-1314) Add DateTime Support to Pig

2012-07-03 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406238#comment-13406238
 ] 

Zhijie Shen commented on PIG-1314:
--

{quote}
But the timezone part and time part of the datetime string should be optional. 
Does jodatime support that?
{quote}

Yes, these two parts are not mandatory. The default time value is 
"00:00:00.000" while the default timezone offset is "+00:00". When the datetime 
object is output as an ISO-format string, the default parts are filled in 
(e.g., 2012-07-03T00:00:00.000Z).
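
As a rough illustration of the defaulting behavior described above, here is a sketch using Python's standard datetime module rather than Joda-Time. The function name to_full_iso is made up for this example, and the choice to print a zero offset as "Z" simply follows the sample string in this comment.

```python
from datetime import datetime, timezone

def to_full_iso(text):
    """Parse an ISO-8601 string whose time and zone parts may be missing,
    then format it back with every part filled in."""
    dt = datetime.fromisoformat(text)  # a date-only string parses as midnight
    if dt.tzinfo is None:
        # Default offset taken from the comment above: +00:00.
        dt = dt.replace(tzinfo=timezone.utc)
    full = dt.isoformat(timespec="milliseconds")
    # Spell a zero offset as 'Z', matching the example in the comment.
    return full[:-6] + "Z" if full.endswith("+00:00") else full

print(to_full_iso("2012-07-03"))
print(to_full_iso("2012-07-03T10:15:30+02:00"))
```

A date-only input comes back with the default time and offset added, while an input that already carries both parts keeps them.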

> Add DateTime Support to Pig
> ---
>
> Key: PIG-1314
> URL: https://issues.apache.org/jira/browse/PIG-1314
> Project: Pig
>  Issue Type: Bug
>  Components: data
>Affects Versions: 0.7.0
>Reporter: Russell Jurney
>Assignee: Zhijie Shen
>  Labels: gsoc2012
> Attachments: PIG-1314-1.patch, PIG-1314-2.patch, joda_vs_builtin.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a 
> timestamp component.  Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to pig comment on how hard this is?  
> We're looking at doing this, rather than use UDFs.  Is this a patch that 
> would be accepted?
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012





Re: Are there any explanations of the implementation of illustrate?

2012-07-03 Thread Jonathan Coveney
Jie, that's perfect, thanks. This doc, specifically:
http://i.stanford.edu/~olston/publications/sigmod09.pdf is exactly the
detailed explanation I was looking for.

2012/7/3 Jie Li 

> Some document here: http://wiki.apache.org/pig/PigIllustrate
>
> I agree that more tests are needed for illustrate, otherwise it can be
> easily broken without notice.
>
> Jie
>
> On Tue, Jul 3, 2012 at 12:45 PM, Jon Coveney  wrote:
> > I was curious at a level slightly higher than "dig through the code" how
> > illustrate is so fast, and how it deals with joins effectively. Are there
> > any resources on this (or does anyone at Hortonworks want to write a tech
> > oriented blog post? :)
> >
>


[jira] [Resolved] (PIG-2787) change the module name in ivy to lowercase to match the maven repo

2012-07-03 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PIG-2787.


   Resolution: Fixed
Fix Version/s: 0.11
 Assignee: Julien Le Dem

> change the module name in ivy to lowercase to match the maven repo
> --
>
> Key: PIG-2787
> URL: https://issues.apache.org/jira/browse/PIG-2787
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Julien Le Dem
> Fix For: 0.11
>
> Attachments: PIG-2787.patch
>
>
> ivy.xml
> {noformat}
>  http://ant.apache.org/ivy/maven";
>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>  
> xsi:noNamespaceSchemaLocation="http://ant.apache.org/ivy/schemas/ivy.xsd";>
> -   revision="${version}">
> +  
>  
>  http://hadoop.apache.org/pig"/>
>  Pig
> {noformat}





[jira] [Updated] (PIG-2787) change the module name in ivy to lowercase to match the maven repo

2012-07-03 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-2787:
---

Attachment: PIG-2787.patch

> change the module name in ivy to lowercase to match the maven repo
> --
>
> Key: PIG-2787
> URL: https://issues.apache.org/jira/browse/PIG-2787
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
> Attachments: PIG-2787.patch
>
>
> ivy.xml
> {noformat}
>  http://ant.apache.org/ivy/maven";
>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>  
> xsi:noNamespaceSchemaLocation="http://ant.apache.org/ivy/schemas/ivy.xsd";>
> -   revision="${version}">
> +  
>  
>  http://hadoop.apache.org/pig"/>
>  Pig
> {noformat}





[jira] [Updated] (PIG-2788) improved string interpolation of variables

2012-07-03 Thread Jeff Hodges (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Hodges updated PIG-2788:
-

Affects Version/s: 0.9.2
   0.10.0

> improved string interpolation of variables
> --
>
> Key: PIG-2788
> URL: https://issues.apache.org/jira/browse/PIG-2788
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.2, 0.10.0
> Environment: The simplest example of the failure of the current 
> string interpolation is 
> {code}
> store my_rel into '$OUTPUT_';
> {code}
> This will raise an error saying that OUTPUT_ is not a variable passed in. 
> Similar errors happen with a variety of other trailing characters.
> It would be nice if '${OUTPUT}_', or something similar, worked.
>Reporter: Jeff Hodges
>






[jira] [Created] (PIG-2788) improved string interpolation of variables

2012-07-03 Thread Jeff Hodges (JIRA)
Jeff Hodges created PIG-2788:


 Summary: improved string interpolation of variables
 Key: PIG-2788
 URL: https://issues.apache.org/jira/browse/PIG-2788
 Project: Pig
  Issue Type: Bug
 Environment: The simplest example of the failure of the current string 
interpolation is 

{code}
store my_rel into '$OUTPUT_';
{code}

This will raise an error saying that OUTPUT_ is not a variable passed in. 
Similar errors happen with a variety of other trailing characters.

It would be nice if '${OUTPUT}_', or something similar, worked.
Reporter: Jeff Hodges
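
The failure described above has a close analogue in Python's standard string.Template, shown here only as an illustration. It is not Pig's parameter substitution code, and the render helper and the /tmp/results value are made up for the example.

```python
from string import Template

def render(script, params):
    """Substitute $NAME and ${NAME} references in script from params."""
    return Template(script).substitute(params)

params = {"OUTPUT": "/tmp/results"}

# '$OUTPUT_' is read as a reference to a variable named 'OUTPUT_'.
try:
    render("store my_rel into '$OUTPUT_';", params)
except KeyError as missing:
    print("undefined variable:", missing)

# Braces end the name explicitly, so the trailing '_' stays literal.
print(render("store my_rel into '${OUTPUT}_';", params))
```

The braced form is the behavior the report asks for: the variable name ends at the closing brace, so any trailing character can follow it.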








[jira] [Commented] (PIG-2787) change the module name in ivy to lowercase to match the maven repo

2012-07-03 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406217#comment-13406217
 ] 

Jonathan Coveney commented on PIG-2787:
---

+1

> change the module name in ivy to lowercase to match the maven repo
> --
>
> Key: PIG-2787
> URL: https://issues.apache.org/jira/browse/PIG-2787
> Project: Pig
>  Issue Type: Bug
>Reporter: Julien Le Dem
>
> ivy.xml
> {noformat}
>  http://ant.apache.org/ivy/maven";
>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>  
> xsi:noNamespaceSchemaLocation="http://ant.apache.org/ivy/schemas/ivy.xsd";>
> -   revision="${version}">
> +  
>  
>  http://hadoop.apache.org/pig"/>
>  Pig
> {noformat}





[jira] [Created] (PIG-2787) change the module name in ivy to lowercase to match the maven repo

2012-07-03 Thread Julien Le Dem (JIRA)
Julien Le Dem created PIG-2787:
--

 Summary: change the module name in ivy to lowercase to match the 
maven repo
 Key: PIG-2787
 URL: https://issues.apache.org/jira/browse/PIG-2787
 Project: Pig
  Issue Type: Bug
Reporter: Julien Le Dem


ivy.xml
{noformat}
 http://ant.apache.org/ivy/maven";
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
 xsi:noNamespaceSchemaLocation="http://ant.apache.org/ivy/schemas/ivy.xsd";>
-  
+  
 
 http://hadoop.apache.org/pig"/>
 Pig
{noformat}





[jira] [Commented] (PIG-2765) Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-03 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406075#comment-13406075
 ] 

Prasanth J commented on PIG-2765:
-

Since the second version of this patch is generated using git, I have created a 
new review board request https://reviews.apache.org/r/5733/

Please let me know in case of any issues. 

> Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator
> ---
>
> Key: PIG-2765
> URL: https://issues.apache.org/jira/browse/PIG-2765
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: PIG-2765.1.patch, PIG-2765.2.git.patch
>
>
> Implement RollupDimensions UDF which performs aggregation from most detailed 
> level of dimensions to the most general level (grand total) in hierarchical 
> order. Provide support for ROLLUP clause in CUBE operator. 





[jira] [Commented] (PIG-2726) Handling legitimate NULL values

2012-07-03 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406072#comment-13406072
 ] 

Prasanth J commented on PIG-2726:
-

Attaching a git patch. Patch for PIG-2765 is generated on top of this git 
patch. 

> Handling legitimate NULL values
> ---
>
> Key: PIG-2726
> URL: https://issues.apache.org/jira/browse/PIG-2726
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: PIG-2726.1.git.patch, PIG-2726.1.patch
>
>
> Look into SQL/Oracle server for how they are handling legitimate NULL values 
> in the input while performing operations like roll-up, filtering etc. Current 
> implementation outputs NULL string which should be replaced by actual null 
> value.





[jira] [Updated] (PIG-2726) Handling legitimate NULL values

2012-07-03 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated PIG-2726:


Attachment: PIG-2726.1.git.patch

> Handling legitimate NULL values
> ---
>
> Key: PIG-2726
> URL: https://issues.apache.org/jira/browse/PIG-2726
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: PIG-2726.1.git.patch, PIG-2726.1.patch
>
>
> Look into SQL/Oracle server for how they are handling legitimate NULL values 
> in the input while performing operations like roll-up, filtering etc. Current 
> implementation outputs NULL string which should be replaced by actual null 
> value.





[jira] [Updated] (PIG-2765) Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-03 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated PIG-2765:


Attachment: PIG-2765.2.git.patch

> Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator
> ---
>
> Key: PIG-2765
> URL: https://issues.apache.org/jira/browse/PIG-2765
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Prasanth J
>Assignee: Prasanth J
> Attachments: PIG-2765.1.patch, PIG-2765.2.git.patch
>
>
> Implement RollupDimensions UDF which performs aggregation from most detailed 
> level of dimensions to the most general level (grand total) in hierarchical 
> order. Provide support for ROLLUP clause in CUBE operator. 





Re: Review Request: PIG-2765: Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-03 Thread j . prasanth . j


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > overall, looks good.

Added the modified patch to a new review board 
https://reviews.apache.org/r/5733/ which uses git diff.


- Prasanth_J


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5521/#review8778
---


On June 22, 2012, 7:35 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5521/
> ---
> 
> (Updated June 22, 2012, 7:35 a.m.)
> 
> 
> Review request for pig and Dmitriy Ryaboy.
> 
> 
> Description
> ---
> 
> This is a review board request for 
> https://issues.apache.org/jira/browse/PIG-2765
> 
> 
> This addresses bug PIG-2765.
> https://issues.apache.org/jira/browse/PIG-2765
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/RollupDimensions.java
>  PRE-CREATION 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOCube.java
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AliasMasker.g
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstPrinter.g
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/AstValidator.g
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryLexer.g
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/QueryParser.g
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/parser/TestLexer.pig
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/parser/TestLogicalPlanGenerator.java
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/parser/TestParser.pig
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/parser/TestQueryLexer.java
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/parser/TestQueryParser.java
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestCubeOperator.java
>  1352776 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestRollupDimensions.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/5521/diff/
> 
> 
> Testing
> ---
> 
> Unit tests: All passed
> 
> Pre-commit tests: All passed
> ant clean test-commit
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



Review Request: PIG-2765: [NEW] Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-03 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5733/
---

Review request for pig and Dmitriy Ryaboy.


Description
---

This is a review board request for 
https://issues.apache.org/jira/browse/PIG-2765


This addresses bug PIG-2765.
https://issues.apache.org/jira/browse/PIG-2765


Diffs
-

  src/org/apache/pig/builtin/CubeDimensions.java 4f2680f 
  src/org/apache/pig/builtin/RollupDimensions.java PRE-CREATION 
  src/org/apache/pig/newplan/logical/relational/LOCube.java ee60e9b 
  src/org/apache/pig/parser/AliasMasker.g e2166ee 
  src/org/apache/pig/parser/AstPrinter.g b03fd6b 
  src/org/apache/pig/parser/AstValidator.g d5c8229 
  src/org/apache/pig/parser/LogicalPlanBuilder.java a7420a7 
  src/org/apache/pig/parser/LogicalPlanGenerator.g 6c1da32 
  src/org/apache/pig/parser/QueryLexer.g fe68daf 
  src/org/apache/pig/parser/QueryParser.g 719aa94 
  test/org/apache/pig/parser/TestLexer.pig 9505512 
  test/org/apache/pig/parser/TestLogicalPlanGenerator.java 0fb56e2 
  test/org/apache/pig/parser/TestParser.pig 3f62145 
  test/org/apache/pig/parser/TestQueryLexer.java 921045f 
  test/org/apache/pig/parser/TestQueryParser.java 6e4805d 
  test/org/apache/pig/test/TestCubeOperator.java db14976 
  test/org/apache/pig/test/TestRollupDimensions.java PRE-CREATION 

Diff: https://reviews.apache.org/r/5733/diff/


Testing
---

Unit tests: All passed

Pre-commit tests: All passed
ant clean test-commit


Thanks,

Prasanth_J



[jira] [Updated] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-03 Thread Jie Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Li updated PIG-2779:


Assignee: Jie Li
  Status: Patch Available  (was: Open)

> Refactoring the code for setting number of reducers
> ---
>
> Key: PIG-2779
> URL: https://issues.apache.org/jira/browse/PIG-2779
> Project: Pig
>  Issue Type: Bug
>Reporter: Jie Li
>Assignee: Jie Li
> Fix For: 0.11
>
> Attachments: PIG-2779.0.patch, TestNumberOfReducers.java, 
> TestNumberOfReducers.java
>
>
> As PIG-2652 observed, currently the code for setting number of reducers is a 
> little messy. MapReduceOper.requestedParallelism seems to be misused in some 
> places, and now we support runtime estimation of #reducer which further 
> complicates the problem.
> For example, if we specify parallel 1 for the order-by, the estimated 
> #reducer will be used. If we specify parallel 2 while it estimates 4, 
> order-by will fail due to "Illegal partition for Null". If we specify 
> parallel 4 while it estimates 2, then some reducers will have nothing to do. 
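
To make the mismatch above concrete, here is a simplified sketch of a range partitioner. It is not Pig's actual partitioner: the function names, the sampling scheme and the error text are assumptions for illustration. The point is that cut points computed for one reducer count produce partition indexes that may not exist, or may go unused, when the job runs with a different count.

```python
from bisect import bisect_right

def build_boundaries(sample, num_partitions):
    """Pick num_partitions - 1 cut points from a sample of keys."""
    ordered = sorted(sample)
    step = len(ordered) / num_partitions
    return [ordered[int(step * i)] for i in range(1, num_partitions)]

def get_partition(key, boundaries, num_reducers):
    """Return the reducer index for key, or raise if no such reducer exists."""
    index = bisect_right(boundaries, key)
    if index >= num_reducers:
        raise ValueError(f"Illegal partition for {key!r} ({index})")
    return index

sample = range(100)

# Boundaries built for an estimate of 4, job running with 2 reducers:
# keys that fall in the upper ranges map to reducers that do not exist.
estimated_four = build_boundaries(sample, 4)
try:
    get_partition(60, estimated_four, num_reducers=2)
except ValueError as err:
    print(err)

# Boundaries built for an estimate of 2, job running with 4 reducers:
# only indexes 0 and 1 are ever produced, so two reducers stay idle.
estimated_two = build_boundaries(sample, 2)
print(sorted({get_partition(k, estimated_two, num_reducers=4) for k in sample}))
```

In this simplified model both failure modes come from the same cause: the boundaries and the reducer count are decided in different places.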





[jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen

2012-07-03 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406044#comment-13406044
 ] 

Jonathan Coveney commented on PIG-2632:
---

Julien +1'd on reviewboard (didn't +1 here because JIRA has been down for 
people). Revision is: r1356921. I will add more documentation in a separate 
patch. This is TURNED OFF by default so should be invisible to existing jobs.

> Create a SchemaTuple which generates efficient Tuples via code gen
> --
>
> Key: PIG-2632
> URL: https://issues.apache.org/jira/browse/PIG-2632
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Fix For: 0.11
>
> Attachments: PIG-2632-0.patch, PIG-2632-1.patch, PIG-2632-10.patch, 
> PIG-2632-10.patch, PIG-2632-3.patch, PIG-2632-4.patch, PIG-2632-5.patch, 
> PIG-2632-6.patch, PIG-2632-7.patch, PIG-2632-8.patch, PIG-2632-9.patch, 
> PIG-2632-9.patch, schematuple benchmarking.pdf, schematuple benchmarking.pptx
>
>
> This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing 
> the Schema on the frontend, we can code generate Tuples which can be used for 
> fun and profit. In rudimentary tests, the memory efficiency is 2-4x better, 
> and it's ~15% smaller serialized (heavily heavily depends on the data, 
> though). Need to do get/set tests, but assuming that it's on par (or even 
> faster) than Tuple, the memory gain is huge.
> Need to clean up the code and add tests.
> Right now, it generates a SchemaTuple for every inputSchema and outputSchema 
> given to UDF's. The next step is to make a SchemaBag, where I think the 
> serialization savings will be really huge.
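
The general idea of generating a tuple class from a known schema can be sketched in Python as follows. This is only an analogy: the real SchemaTuple work generates Java classes inside Pig, and nothing below reflects its API. The generate_schema_tuple helper and the Row schema are hypothetical, and the sketch assumes a non-empty schema with valid identifier names.

```python
def generate_schema_tuple(class_name, schema):
    """Generate and compile a class with one typed slot per schema field.

    schema is a non-empty list of (field_name, python_type) pairs.
    """
    names = [name for name, _ in schema]
    lines = [
        f"class {class_name}:",
        # __slots__ fixes the field layout, so instances carry no __dict__.
        f"    __slots__ = {tuple(names)!r}",
        f"    def __init__(self, {', '.join(names)}):",
    ]
    for name in names:
        # Coerce each argument to the type declared in the schema.
        lines.append(f"        self.{name} = _types[{name!r}]({name})")
    namespace = {"_types": dict(schema)}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace[class_name]

Row = generate_schema_tuple("Row", [("id", int), ("name", str)])
row = Row("7", "pig")
print(row.id, row.name)
```

Because the fields are fixed when the class is generated, each instance stores only the declared fields, which is loosely the kind of saving the description is after.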





Re: Are there any explanations of the implementation of illustrate?

2012-07-03 Thread Jie Li
Some document here: http://wiki.apache.org/pig/PigIllustrate

I agree that more tests are needed for illustrate, otherwise it can be
easily broken without notice.

Jie

On Tue, Jul 3, 2012 at 12:45 PM, Jon Coveney  wrote:
> I was curious at a level slightly higher than "dig through the code" how 
> illustrate is so fast, and how it deals with joins effectively. Are there any 
> resources on this (or does anyone at Hortonworks want to write a tech 
> oriented blog post? :)
>


[jira] [Updated] (PIG-2779) Refactoring the code for setting number of reducers

2012-07-03 Thread Jie Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Li updated PIG-2779:


Attachment: PIG-2779.0.patch

Attached a patch that fixed the failed cases. The idea is that the sample job 
shouldn't determine the order-by's parallelism at compile-time, because we'll 
estimate and adjust it at run-time. 

A more elegant refactoring may be possible after PIG-2784. 

Note this patch is necessary for us to remove the sample job at runtime.

> Refactoring the code for setting number of reducers
> ---
>
> Key: PIG-2779
> URL: https://issues.apache.org/jira/browse/PIG-2779
> Project: Pig
>  Issue Type: Bug
>Reporter: Jie Li
> Fix For: 0.11
>
> Attachments: PIG-2779.0.patch, TestNumberOfReducers.java, 
> TestNumberOfReducers.java
>
>
> As PIG-2652 observed, currently the code for setting number of reducers is a 
> little messy. MapReduceOper.requestedParallelism seems to be misused in some 
> places, and now we support runtime estimation of #reducer which further 
> complicates the problem.
> For example, if we specify parallel 1 for the order-by, the estimated 
> #reducer will be used. If we specify parallel 2 while it estimates 4, 
> order-by will fail due to "Illegal partition for Null". If we specify 
> parallel 4 while it estimates 2, then some reducers will have nothing to do. 





Review Request: PIG-2765: Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-03 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5731/
---

Review request for pig and Dmitriy Ryaboy.


Description
---

This is a review board request for 
https://issues.apache.org/jira/browse/PIG-2765


This addresses bug PIG-2765.
https://issues.apache.org/jira/browse/PIG-2765


Diffs
-

  src/org/apache/pig/builtin/RollupDimensions.java PRE-CREATION 
  src/org/apache/pig/newplan/logical/relational/LOCube.java ee60e9b 
  src/org/apache/pig/parser/AliasMasker.g e2166ee 
  src/org/apache/pig/parser/AstPrinter.g b03fd6b 
  src/org/apache/pig/parser/AstValidator.g d5c8229 
  src/org/apache/pig/parser/LogicalPlanBuilder.java bf46e77 
  src/org/apache/pig/parser/LogicalPlanGenerator.g 6c1da32 
  src/org/apache/pig/parser/QueryLexer.g fe68daf 
  src/org/apache/pig/parser/QueryParser.g 719aa94 
  test/org/apache/pig/parser/TestLexer.pig 9505512 
  test/org/apache/pig/parser/TestLogicalPlanGenerator.java 0fb56e2 
  test/org/apache/pig/parser/TestParser.pig 3f62145 
  test/org/apache/pig/parser/TestQueryLexer.java 921045f 
  test/org/apache/pig/parser/TestQueryParser.java 6e4805d 
  test/org/apache/pig/test/TestCubeOperator.java db14976 
  test/org/apache/pig/test/TestRollupDimensions.java PRE-CREATION 

Diff: https://reviews.apache.org/r/5731/diff/


Testing
---

Unit tests: All passed

Pre-commit tests: All passed
ant clean test-commit


Thanks,

Prasanth_J



Are there any explanations of the implementation of illustrate?

2012-07-03 Thread Jon Coveney
I was curious at a level slightly higher than "dig through the code" how 
illustrate is so fast, and how it deals with joins effectively. Are there any 
resources on this (or does anyone at Hortonworks want to write a tech oriented 
blog post? :)

Re: Review Request: PIG-2765: Implementing RollupDimensions UDF and adding ROLLUP clause in CUBE operator

2012-07-03 Thread j . prasanth . j


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/RollupDimensions.java,
> >  line 49
> > 
> >
> > how are you planning to inherit null handling behavior from 
> > CubeDimensions?

By using a static function inside CubeDimensions that converts nulls to 
unknown. This seems to be an easy solution that lets CubeDimensions and 
RollupDimensions reuse the same code. Ideally this should be done outside the 
UDFs, in some Util class or somewhere else before the tuples are passed to the 
UDFs.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/RollupDimensions.java,
> >  line 62
> > 
> >
> > the capacity set here is too large -- for hierarchical rollup, we just 
> > need tuple.size() elements
> > 
> > does the sql standard rollup go all the way up to (null, null, null, 
> > null) or does it stop at (a, null, null, null) ?
> > 
> >

Good catch. Updated. Yeah SQL rolls all the way up to (null, null, null, null). 
So the capacity should be tuple.size() + 1.
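
For reference, the hierarchical rollup discussed here can be sketched iteratively in a few lines. This is a Python illustration under the assumptions stated in this thread (roll all the way up to the all-null grand total); it is not the RollupDimensions code from the patch, and the function name rollup is made up.

```python
def rollup(dimensions):
    """Return the hierarchical rollup of a tuple of dimension values.

    Each step replaces one more trailing field with None, ending at the
    grand total where every field is None.
    """
    size = len(dimensions)
    result = []  # ends up holding size + 1 tuples
    for keep in range(size, -1, -1):
        result.append(tuple(dimensions[:keep]) + (None,) * (size - keep))
    return result

for level in rollup(("a", "b", "c")):
    print(level)
```

A tuple with three dimensions yields four output tuples, which matches the tuple.size() + 1 capacity mentioned above.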


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/RollupDimensions.java,
> >  line 71
> > 
> >
> > I believe that's also in the null-handling patch. Since you need it 
> > both here and in CubeDimensions, it would be better to reuse that bit of 
> > code than to make copies of it.
> > 
> > The comment about not allocating new tuples and copying into them 
> > unnecessarily applies here as well.

Fixed, using the static function as explained above.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/RollupDimensions.java,
> >  line 83
> > 
> >
> > seems like recursion makes the code too complex when all we need to do 
> > is tuple.size() loops in which we make a copy of the tuple and set some 
> > fields to null.

Updated to use an iterative approach.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/RollupDimensions.java,
> >  line 95
> > 
> >
> > can you document here what happens to "dimensions" the string as it is 
> > adjusted by the cube/rollup operator?

Updated.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/relational/LOCube.java,
> >  line 45
> > 
> >
> > let's remove the word "now"

Updated.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java,
> >  line 416
> > 
> >
> > there is an implicit else here (because of the "continue") -- make it 
> > explicit

Updated.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java,
> >  line 419
> > 
> >
> > style: prefer
> > 
> > if {
> > 
> > } else if {
> > 
> > }
> >

Yeah, there was an issue with formatting. Updated throughout the patch.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java,
> >  line 434
> > 
> >
> > document what the corner case is?

Updated.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java,
> >  line 587
> > 
> >
> > this will need to change to match the null handling change in the other 
> > ticket

Updated.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g,
> >  line 509
> > 
> >
> > something odd is going on with the indentation here

Fixed.


> On July 2, 2012, 5:15 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestRollupDimensions.java,
> >  line 44
> > 

Re: Review Request: PIG-2763 - Groovy UDFs

2012-07-03 Thread Mathias Herberts

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5591/
---

(Updated July 3, 2012, 7:34 p.m.)


Review request for pig, Julien Le Dem and Jonathan Coveney.


Changes
---

Added support for arguments to method main when running an embedded script.


Description
---

Adds support for Groovy UDFs in Pig.


This addresses bug PIG-2763.
https://issues.apache.org/jira/browse/PIG-2763


Diffs (updated)
-

  /trunk/ivy.xml 1353307 
  /trunk/ivy/libraries.properties 1353307 
  /trunk/src/org/apache/pig/scripting/ScriptEngine.java 1356486 
  /trunk/src/org/apache/pig/scripting/groovy/AccumulatorAccumulate.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/AccumulatorCleanup.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/AccumulatorGetValue.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/AlgebraicFinal.java PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/AlgebraicInitial.java PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/AlgebraicIntermed.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/GroovyAccumulatorEvalFunc.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/GroovyAlgebraicEvalFunc.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/GroovyEvalFunc.java PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/GroovyEvalFuncObject.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/GroovyScriptEngine.java 
PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/GroovyUtils.java PRE-CREATION 
  /trunk/src/org/apache/pig/scripting/groovy/OutputSchemaFunction.java 
PRE-CREATION 
  /trunk/test/org/apache/pig/test/TestUDFGroovy.java PRE-CREATION 
  /trunk/test/unit-tests 1353307 

Diff: https://reviews.apache.org/r/5591/diff/


Testing
---


Thanks,

Mathias Herberts



Re: Review Request: PIG-2726: Handling legitimate NULL values in CUBE operator

2012-07-03 Thread j . prasanth . j


> On July 2, 2012, 1:49 a.m., Dmitriy Ryaboy wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/CubeDimensions.java,
> >  line 94
> > 
> >
> > I would rather we set the fields, rather than copy the tuple and then 
> > set them. Unnecessary copies => time + gc pressure.

Updated this in the second patch on the PIG-2765 JIRA (PIG-2765.2.patch).

The PIG-2765 patch is generated on top of the PIG-2726 patch. I assume that
PIG-2726 will be committed first, followed by PIG-2765. After the PIG-2726 patch
is committed, do I need to regenerate the PIG-2765 patch against trunk?


- Prasanth_J


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5470/#review8777
---


On June 21, 2012, 5:44 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5470/
> ---
> 
> (Updated June 21, 2012, 5:44 a.m.)
> 
> 
> Review request for pig and Dmitriy Ryaboy.
> 
> 
> Description
> ---
> 
> This is a review board for https://issues.apache.org/jira/browse/PIG-2726
> 
> 
> This addresses bug PIG-2726.
> https://issues.apache.org/jira/browse/PIG-2726
> 
> 
> Diffs
> -
> 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/CubeDimensions.java
>  1350081 
>   
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java
>  1350081 
>   
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestCubeOperator.java
>  1350081 
> 
> Diff: https://reviews.apache.org/r/5470/diff/
> 
> 
> Testing
> ---
> 
> Unit tests: All passed
> 
> Pre-commit tests: All passed
> ant clean test-commit
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>



Re: Review Request: PIG-2763 - Groovy UDFs

2012-07-03 Thread Mathias Herberts


> On July 3, 2012, 5:54 p.m., Julien Le Dem wrote:
> > /trunk/src/org/apache/pig/scripting/groovy/AccumulatorAccumulate.java, line 
> > 40
> > 
> >
> > you don't have to define value(). If you don't want it just delete this 
> > line.
> > (same below)

We need the annotation's parameter as it defines the UDF name.


> On July 3, 2012, 5:54 p.m., Julien Le Dem wrote:
> > /trunk/src/org/apache/pig/scripting/groovy/GroovyScriptEngine.java, lines 
> > 118-121
> > 
> >
> > check out this to pass parameters:
> > 
> > 
> > http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/scripting/jython/JythonScriptEngine.java?view=markup
> > line 178
> > 
> > argv = 
> > (String[])ObjectSerializer.deserialize(pigContext.getProperties().getProperty(PigContext.PIG_CMD_ARGS_REMAINDERS));
> >

I was wondering where to get this, thanks.


- Mathias


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5591/#review8844
---


On July 3, 2012, 1:07 a.m., Mathias Herberts wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5591/
> ---
> 
> (Updated July 3, 2012, 1:07 a.m.)
> 
> 
> Review request for pig, Julien Le Dem and Jonathan Coveney.
> 
> 
> Description
> ---
> 
> Adds support for Groovy UDFs in Pig.
> 
> 
> This addresses bug PIG-2763.
> https://issues.apache.org/jira/browse/PIG-2763
> 
> 
> Diffs
> -
> 
>   /trunk/ivy.xml 1353307 
>   /trunk/ivy/libraries.properties 1353307 
>   /trunk/src/org/apache/pig/scripting/ScriptEngine.java 1356486 
>   /trunk/src/org/apache/pig/scripting/groovy/AccumulatorAccumulate.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/AccumulatorCleanup.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/AccumulatorGetValue.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/AlgebraicFinal.java PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/AlgebraicInitial.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/AlgebraicIntermed.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/GroovyAccumulatorEvalFunc.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/GroovyAlgebraicEvalFunc.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/GroovyEvalFunc.java PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/GroovyEvalFuncObject.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/GroovyScriptEngine.java 
> PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/GroovyUtils.java PRE-CREATION 
>   /trunk/src/org/apache/pig/scripting/groovy/OutputSchemaFunction.java 
> PRE-CREATION 
>   /trunk/test/org/apache/pig/test/TestUDFGroovy.java PRE-CREATION 
>   /trunk/test/unit-tests 1353307 
> 
> Diff: https://reviews.apache.org/r/5591/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Mathias Herberts
> 
>



[jira] [Created] (PIG-2786) enhance Pig launcher script wrt. HBase integration

2012-07-03 Thread Roman Shaposhnik (JIRA)
Roman Shaposhnik created PIG-2786:
-

 Summary: enhance Pig launcher script wrt. HBase integration
 Key: PIG-2786
 URL: https://issues.apache.org/jira/browse/PIG-2786
 Project: Pig
  Issue Type: Improvement
  Components: grunt
Affects Versions: 0.10.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
Priority: Minor


The current bin/pig script suffers from a couple of issues as far as 
integration with HBase is concerned:
  # it only detects ZK/HBase jars under a PIG_HOME/share/.. layout
  # it doesn't detect HBase dependencies

The proposal here would be to ask HBase itself for its classpath





[jira] [Commented] (PIG-1314) Add DateTime Support to Pig

2012-07-03 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405971#comment-13405971
 ] 

Thejas M Nair commented on PIG-1314:


PigStorage is meant to be a human-readable format, so that is another reason to
store the timestamp as an ISO string, as you suggested.
Yes, if the time zone is specified in the string, Pig should use that value. But
the time zone part and the time part of the datetime string should be optional.
Does Joda-Time support that?
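
To make the "optional time and time zone" requirement concrete, here is a
minimal sketch. It deliberately uses the JDK's java.time API rather than
Joda-Time, so it does not answer the Joda-Time question itself (Joda-Time's
ISODateTimeFormat.dateOptionalTimeParser() looks like the relevant entry point,
but that is not verified here). OptionalPartsSketch, parse, and the
defaultOffset parameter are hypothetical names chosen for illustration.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;

// Hypothetical helper: the date is mandatory, the time and offset are optional.
class OptionalPartsSketch {
    static final DateTimeFormatter FMT = new DateTimeFormatterBuilder()
        .append(DateTimeFormatter.ISO_LOCAL_DATE)
        .optionalStart().appendLiteral('T')
        .append(DateTimeFormatter.ISO_LOCAL_TIME).optionalEnd()
        .optionalStart().appendOffsetId().optionalEnd()
        .toFormatter();

    static OffsetDateTime parse(String text, ZoneOffset defaultOffset) {
        TemporalAccessor parsed = FMT.parse(text);
        LocalDate date = LocalDate.from(parsed);
        // Missing time part: assume midnight (an assumption of this sketch).
        LocalTime time = parsed.isSupported(ChronoField.HOUR_OF_DAY)
            ? LocalTime.from(parsed) : LocalTime.MIDNIGHT;
        // Offset in the string wins; otherwise fall back to the default.
        ZoneOffset offset = parsed.isSupported(ChronoField.OFFSET_SECONDS)
            ? ZoneOffset.from(parsed) : defaultOffset;
        return OffsetDateTime.of(date, time, offset);
    }

    public static void main(String[] args) {
        System.out.println(parse("2012-07-03", ZoneOffset.UTC));
        System.out.println(parse("2012-07-03T08:14:19.962+01:00", ZoneOffset.UTC));
    }
}
```

The offset found in the string takes precedence, and the caller-supplied
default applies only when the string carries none.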


> Add DateTime Support to Pig
> ---
>
> Key: PIG-1314
> URL: https://issues.apache.org/jira/browse/PIG-1314
> Project: Pig
>  Issue Type: Bug
>  Components: data
>Affects Versions: 0.7.0
>Reporter: Russell Jurney
>Assignee: Zhijie Shen
>  Labels: gsoc2012
> Attachments: PIG-1314-1.patch, PIG-1314-2.patch, joda_vs_builtin.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a 
> timestamp component.  Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to pig comment on how hard this is?  
> We're looking at doing this, rather than use UDFs.  Is this a patch that 
> would be accepted?
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012





Re: Review Request: PIG-2763 - Groovy UDFs

2012-07-03 Thread Julien Le Dem

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5591/#review8844
---


Thanks for implementing the embedding as well.


/trunk/src/org/apache/pig/scripting/groovy/AccumulatorAccumulate.java


you don't have to define value(). If you don't want it just delete this 
line.
(same below)



/trunk/src/org/apache/pig/scripting/groovy/GroovyScriptEngine.java


check out this to pass parameters:


http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/scripting/jython/JythonScriptEngine.java?view=markup
line 178

argv = 
(String[])ObjectSerializer.deserialize(pigContext.getProperties().getProperty(PigContext.PIG_CMD_ARGS_REMAINDERS));



- Julien Le Dem





Re: Review Request: Add Schema aware Tuple to Pig

2012-07-03 Thread Julien Le Dem

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5716/#review8842
---

Ship it!


Excellent!

- Julien Le Dem


On July 3, 2012, 1:48 a.m., Jonathan Coveney wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/5716/
> ---
> 
> (Updated July 3, 2012, 1:48 a.m.)
> 
> 
> Review request for pig and Julien Le Dem.
> 
> 
> Description
> ---
> 
> I am putting this in the new pig-git, because it's much easier to update! I 
> responded to your comments, Julien!
> 
> 
> This addresses bug PIG-2632.
> https://issues.apache.org/jira/browse/PIG-2632
> 
> 
> Diffs
> -
> 
>   .gitignore fd2fe31 
>   conf/pig.properties a4b77cf 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
>  7ad61f0 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java
>  0828c5d 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java
>  298071d 
>   
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java
>  08ce0bb 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java
>  ee9f921 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
>  47c727a 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFRJoin.java
>  3d27e95 
>   
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java
>  ccc7b94 
>   src/org/apache/pig/builtin/mock/Storage.java 9628902 
>   src/org/apache/pig/data/AppendableSchemaTuple.java PRE-CREATION 
>   src/org/apache/pig/data/BinInterSedes.java f3d08c7 
>   src/org/apache/pig/data/BinSedesTupleFactory.java a2e80a2 
>   src/org/apache/pig/data/DataByteArray.java d98a09c 
>   src/org/apache/pig/data/FieldIsNullException.java PRE-CREATION 
>   src/org/apache/pig/data/PBooleanTuple.java 06bb4bd 
>   src/org/apache/pig/data/PDoubleTuple.java 2b44ab2 
>   src/org/apache/pig/data/PFloatTuple.java 6fbc3b0 
>   src/org/apache/pig/data/PIntTuple.java e4da64c 
>   src/org/apache/pig/data/PLongTuple.java 44081fd 
>   src/org/apache/pig/data/PStringTuple.java f91d0c7 
>   src/org/apache/pig/data/PrimitiveFieldTuple.java e9ca506 
>   src/org/apache/pig/data/PrimitiveTuple.java 6e35b91 
>   src/org/apache/pig/data/SchemaTuple.java PRE-CREATION 
>   src/org/apache/pig/data/SchemaTupleBackend.java PRE-CREATION 
>   src/org/apache/pig/data/SchemaTupleClassGenerator.java PRE-CREATION 
>   src/org/apache/pig/data/SchemaTupleFactory.java PRE-CREATION 
>   src/org/apache/pig/data/SchemaTupleFrontend.java PRE-CREATION 
>   src/org/apache/pig/data/TupleFactory.java 1b241b2 
>   src/org/apache/pig/data/TupleMaker.java PRE-CREATION 
>   src/org/apache/pig/data/TypeAwareTuple.java f7476f7 
>   src/org/apache/pig/data/utils/BytesHelper.java PRE-CREATION 
>   src/org/apache/pig/data/utils/MethodHelper.java PRE-CREATION 
>   src/org/apache/pig/data/utils/SedesHelper.java PRE-CREATION 
>   src/org/apache/pig/data/utils/StructuresHelper.java PRE-CREATION 
>   src/org/apache/pig/impl/PigContext.java 8cb3d12 
>   src/org/apache/pig/impl/io/InterRecordReader.java 0258b44 
>   src/org/apache/pig/impl/io/NullableTuple.java c17011e 
>   
> src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 
> e8530da 
>   src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 
> d6a45d4 
>   
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 
> 6511fee 
>   
> src/org/apache/pig/newplan/logical/relational/LogicalRelationalOperator.java 
> acb564e 
>   src/org/apache/pig/newplan/logical/rules/GroupByConstParallelSetter.java 
> 463c3d9 
>   src/org/apache/pig/newplan/logical/rules/MergeForEach.java c7244c1 
>   test/org/apache/pig/data/TestSchemaTuple.java PRE-CREATION 
>   test/org/apache/pig/data/utils/TestMethodHelper.java PRE-CREATION 
>   test/org/apache/pig/test/TestDataBag.java b2eaef7 
>   test/org/apache/pig/test/TestLogicalPlanBuilder.java 5068dfd 
>   test/org/apache/pig/test/TestPrimitiveFieldTuple.java 93b15e7 
>   test/org/apache/pig/test/TestPrimitiveTuple.java f8e88dc 
>   test/org/apache/pig/test/TestSchema.java c27a5a6 
> 
> Diff: https://reviews.apache.org/r/5716/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jonathan Coveney
> 
>



[jira] [Commented] (PIG-1314) Add DateTime Support to Pig

2012-07-03 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405724#comment-13405724
 ] 

Zhijie Shen commented on PIG-1314:
--

There are some issues with loading/storing Pig data. When storing a DateTime
object with "Utf8StorageConverter" without using UDFs to convert it to a string,
should we serialize it as a millis+timezone composite, or output an ISO-style
datetime string (e.g., 2012-07-03T08:14:19.962+01:00)? The latter behaves the
same as applying "String ToString(DateTime d)" before storing the string.
Personally, I like the latter choice, because the data is directly readable
from the stored files.

On the other hand, if a datetime object is stored in the file as a datetime 
string, when we load it again as a datetime object, should we use the default 
timezone or the one specified in the datetime string (e.g., +01:00 in the 
last example)? Again, I prefer the second choice. When we use Pig, we may 
chain several store/load steps to achieve some goal, so the timezone 
information needs to be preserved. For example, let's assume +08:00 is the 
default timezone. A datetime object whose individual timezone is -04:00 is 
stored as a string, which will have -04:00 as its suffix. When the string is 
loaded as a datetime object for further processing, we'd better keep the 
previously used timezone, -04:00, instead of the default one.
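
The store/load round trip described here can be sketched as follows. This is an
illustration only: it uses the JDK's java.time API instead of Joda-Time, and
OffsetRoundTripSketch, store, load, and loadInDefault are hypothetical names,
not part of Pig. It contrasts keeping the offset found in the stored string
with converting the value to a default offset.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical round trip: a datetime is stored as an ISO string and reloaded.
class OffsetRoundTripSketch {
    // "Store": write the value as an ISO string that carries its own offset.
    static String store(OffsetDateTime value) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(value);
    }

    // "Load", second choice in the text: keep the offset found in the string.
    static OffsetDateTime load(String stored) {
        return OffsetDateTime.parse(stored);
    }

    // "Load", first choice in the text: same instant, but the original
    // offset is replaced by the default one.
    static OffsetDateTime loadInDefault(String stored, ZoneOffset defaultOffset) {
        return OffsetDateTime.parse(stored).withOffsetSameInstant(defaultOffset);
    }

    public static void main(String[] args) {
        OffsetDateTime original =
            OffsetDateTime.of(2012, 7, 3, 8, 14, 19, 962_000_000, ZoneOffset.ofHours(-4));
        String stored = store(original);
        System.out.println(stored);
        System.out.println(load(stored));
        System.out.println(loadInDefault(stored, ZoneOffset.ofHours(8)));
    }
}
```

Both load variants denote the same instant; only the first preserves the
-04:00 offset across repeated store/load steps.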

What do you think about this? Thanks!


