Re: Prepare for Pig 0.10.1 release

2012-12-18 Thread Russell Jurney
High five! Think of all those new cores running Pig jobs!
On Dec 18, 2012 6:24 PM, "Daniel Dai" wrote:

> I would like to commit Windows patches to 0.11.0. Allow me several days.
>
> Thanks,
> Daniel
>
> On Tue, Dec 18, 2012 at 5:47 PM, Julien Le Dem wrote:
> > Sounds good to me.
> > Can we cut pig 0.11.0 at the same time ?
> > Julien
> >
> >
> > On Tue, Dec 18, 2012 at 7:54 AM, Daniel Dai wrote:
> >
> >> Hi, Pig developers,
> >>
> >> We have fixed a bunch of bugs since
> >> 0.10.0(
> >> http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
> >> I would like to propose a 0.10.1 release from top of 0.10 branch after
> >> clearing all pending issues
> >> (
> >> https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22
> >> ).
> >>
> >> Any objections?
> >>
> >> Thanks,
> >> Daniel
> >>
>


[jira] [Updated] (PIG-2602) packageImportList should be configurable

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2602:


Fix Version/s: (was: 0.10.1)

> packageImportList should be configurable
> 
>
> Key: PIG-2602
> URL: https://issues.apache.org/jira/browse/PIG-2602
> Project: Pig
>  Issue Type: New Feature
>Reporter: Ashutosh Chauhan
>
> Currently, it is hard-coded. These strings could be read from a config and 
> then used to resolve class names. That should succeed as long as those 
> classes are on the classpath.
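
The idea in this description can be sketched roughly as follows. This is only an illustration, not the Pig implementation: the class name `ImportListResolver` and the property key `example.udf.import.list` are made up for the example, and the sketch assumes the import list is a comma-separated string of package prefixes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper: neither the class name nor the property key comes from Pig.
class ImportListResolver {
    // Assumed property key, chosen only for this example.
    static final String KEY = "example.udf.import.list";

    // Split a comma-separated list of package prefixes. The empty prefix is
    // always tried first so that fully qualified names keep working.
    static List<String> prefixes(Properties props) {
        List<String> out = new ArrayList<>();
        out.add("");
        for (String p : props.getProperty(KEY, "").split(",")) {
            String t = p.trim();
            if (!t.isEmpty()) {
                out.add(t.endsWith(".") ? t : t + ".");
            }
        }
        return out;
    }

    // Try each prefix in order and return the first class found on the classpath.
    static Class<?> resolve(String name, Properties props) {
        for (String prefix : prefixes(props)) {
            try {
                return Class.forName(prefix + name);
            } catch (ClassNotFoundException | NoClassDefFoundError e) {
                // Not under this prefix; try the next one.
            }
        }
        throw new IllegalArgumentException("Could not resolve " + name);
    }
}
```

With the property set to `java.util, java.lang.`, a short name such as `ArrayList` would resolve to `java.util.ArrayList`, while a fully qualified name resolves through the empty prefix.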

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PIG-2955) Fix bunch of Pig e2e tests on Windows

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2955:


Fix Version/s: (was: 0.10.1)

>  Fix bunch of Pig e2e tests on Windows 
> ---
>
> Key: PIG-2955
> URL: https://issues.apache.org/jira/browse/PIG-2955
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11, 0.12
>
> Attachments: PIG-2955-1.patch, PIG-2955-2_0.10.patch, PIG-2955-2.patch
>
>
> Fix the following test aborts and failures:
> ComputeSpec_1
> ComputeSpec_2
> Unicode_cmdline_1
> Warning_1
> Warning_4
> Checkin_2
> UdfDistributedCache_1
> Jython_Checkin_2
> Jython_Diagnostics_4
> Jython_Diagnostics_5
> Jython_Diagnostics_6
> Jython_Error_3
> Jython_Error_4
> Jython_Error_5
> Jython_Error_6
> Jython_Error_7
> Grunt_6
> Grunt_8
> Grunt_13
> Grunt_14



[jira] [Updated] (PIG-2803) Include Wonderdog (ElasticSearch Integration) in contrib/

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2803:


Fix Version/s: (was: 0.10.1)

> Include Wonderdog (ElasticSearch Integration) in contrib/
> -
>
> Key: PIG-2803
> URL: https://issues.apache.org/jira/browse/PIG-2803
> Project: Pig
>  Issue Type: New Feature
>  Components: tools
>Affects Versions: 0.10.0, 0.11, 0.10.1
> Environment: contrib/ github
>Reporter: Russell Jurney
>Assignee: Russell Jurney
>Priority: Critical
>  Labels: contrib, elasticsearch, fun, happy, integration, pants, 
> pig, udf
>
> I propose to add Wonderdog to Pig contrib/
> Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig 
> integration for ElasticSearch. This lets you index any Pig relation with a 
> single UDF call, which is very powerful. Both writing searchable indexes and 
> loading based on search queries are supported.
> More information on Wonderdog is available at 
> https://github.com/infochimps-labs/wonderdog and a great introduction to 
> ElasticSearch is available at 
> http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html
> Wonderdog broke in Pig 0.10.0, and was patched to work here: 
> https://github.com/infochimps-labs/wonderdog/pull/9 Even still, there is the 
> issue of Pig creating schema files when storing and loading JSON that must be 
> manually removed to make Wonderdog go.
> Moving forward, I would like the Pig project to maintain Wonderdog in 
> contrib/ and verify that it works with each version increment. Wonderdog is 
> an incredibly useful library that is license compatible with Pig itself. 
> Along with ElasticSearch, it adds the ability for any user to index his Pig 
> relations and to load subsets of data by pushing search queries down to 
> ElasticSearch.
> I use Wonderdog in production and in my book, so I volunteer to do the 
> maintenance on contrib/wonderdog.



[jira] [Updated] (PIG-3026) Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3026:


Fix Version/s: (was: 0.10.1)
   0.11

> Pig checked-in baseline comparisons need a pre-filter to address OS-specific 
> newline differences
> 
>
> Key: PIG-3026
> URL: https://issues.apache.org/jira/browse/PIG-3026
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.10.0
>Reporter: John Gordon
> Fix For: 0.11
>
> Attachments: PIG-3026.branch-0.10.1.patch, 
> PIG-3026.branch-0.10.2.patch
>
>
> TestScriptLanguage, TestOptimizeLimit, TestMRCompiler, and 
> TestLogToPhyCompiler compare text files that were checked-in with text files 
> generated at run-time.  Because line endings differ between operating 
> systems, and even between repository/enlistment settings, it makes sense 
> to pre-filter newlines before comparing.
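
A newline pre-filter of the kind described here could look roughly like the sketch below. It is not taken from the attached patches; `NewlineFilter` and its method names are invented for the illustration, and it assumes both files have already been read into strings.

```java
// Hypothetical helper, not part of Pig's test code.
class NewlineFilter {
    // Convert Windows (CRLF) and bare CR line endings to LF.
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    // Compare generated output with a checked-in baseline,
    // ignoring which line-ending style each side uses.
    static boolean sameIgnoringNewlines(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }
}
```

Normalizing both sides, rather than only the generated output, also covers the case where the checked-in baseline itself was converted by the version control client.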



[jira] [Updated] (PIG-3024) TestEmptyInputDir unit test - hadoop version detection logic is brittle

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3024:


Fix Version/s: (was: 0.10.1)
   0.11

> TestEmptyInputDir unit test - hadoop version detection logic is brittle
> ---
>
> Key: PIG-3024
> URL: https://issues.apache.org/jira/browse/PIG-3024
> Project: Pig
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 0.10.0
>Reporter: John Gordon
>Assignee: John Gordon
>Priority: Minor
> Fix For: 0.11
>
> Attachments: PIG-3024.branch-0.10.1.patch
>
>
> When the hadoop core version dependency is changed, TestEmptyInputDir almost 
> always fails as coded.  The issue is that it assumes all hadoop versions  
> have MAPREDUCE-3606 fixed except two.  It is more resilient to check families 
> of releases for the fix than to check the release number directly.  Right 
> now, it is safer to assume the bug is present unless the release is one of 
> those with the fix than it is to assume all releases have the fix except for 
> two.
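
The "assume the bug is present unless the release is known to have the fix" approach might be sketched like this. The release families below are deliberately fake placeholders: this sketch does not know which Hadoop releases actually contain the MAPREDUCE-3606 fix, and `FixedReleaseCheck` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the code from the attached patch.
class FixedReleaseCheck {
    // Placeholder release families. The real list would have to come from
    // the fix versions recorded on MAPREDUCE-3606.
    static final List<String> FAMILIES_WITH_FIX = Arrays.asList("9.1.", "9.2.");

    // True only when the version belongs to a family known to carry the fix.
    static boolean hasFix(String version) {
        if (version == null) {
            return false;
        }
        for (String family : FAMILIES_WITH_FIX) {
            if (version.startsWith(family)) {
                return true;
            }
        }
        // Unknown releases are assumed to still have the bug.
        return false;
    }
}
```

The design choice is that an unrecognized version falls through to `false`, so changing the hadoop core dependency to a new release does not silently flip the test's expectations.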



[jira] [Updated] (PIG-3025) TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3025:


Fix Version/s: (was: 0.10.1)
   0.11

> TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script 
> needs simplification
> --
>
> Key: PIG-3025
> URL: https://issues.apache.org/jira/browse/PIG-3025
> Project: Pig
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 0.10.0
>Reporter: John Gordon
>Assignee: John Gordon
> Fix For: 0.11
>
> Attachments: PIG-3025.branch-0.10.1-2.patch, 
> PIG-3025.branch-0.10.1.patch
>
>
> The "SimpleEchoStreamingCommand" string, which is an inline perl script, is 
> unnecessarily complicated by escaping nested quote characters on the 
> command-line.  As a result, it ends up unstable across shell implementations 
> and operating systems.
> Considering that perl has qq and can print unquoted values, the nested 
> quote escaping does not seem to be needed.



[jira] [Updated] (PIG-3027) pigTest unit test needs a newline filter for comparisons of golden multi-line

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3027:


Fix Version/s: (was: 0.10.1)
   0.11

> pigTest unit test needs a newline filter for comparisons of golden multi-line
> -
>
> Key: PIG-3027
> URL: https://issues.apache.org/jira/browse/PIG-3027
> Project: Pig
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 0.10.0
>Reporter: John Gordon
>Assignee: John Gordon
> Fix For: 0.11
>
> Attachments: PIG-3027.trunk.1.patch
>
>
> pigTest leverages assertOutput throughout for text file comparisons to golden 
> checked-in baselines.  This method doesn't take into account line ending 
> differences across platforms.



[jira] [Updated] (PIG-3028) testGrunt dev test needs some command filters to run correctly without cygwin

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3028:


Fix Version/s: (was: 0.10.1)
   0.11

> testGrunt dev test needs some command filters to run correctly without cygwin
> -
>
> Key: PIG-3028
> URL: https://issues.apache.org/jira/browse/PIG-3028
> Project: Pig
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 0.10.0
>Reporter: John Gordon
>Assignee: John Gordon
> Fix For: 0.11
>
> Attachments: PIG-3028.trunk.1.patch
>
>
> TestGrunt still has some commands that depend on cygwin, namely rm -rf.  This 
> should be rd /S on Windows.  It needs a hook and variable abstraction for OS 
> commands like this.
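
One possible shape for such an abstraction is sketched below. `OsCommands` is a hypothetical name, not something from the patch. The sketch assumes the command is later run as an argument list, and it adds `/Q` to `rd /S` so the Windows command does not prompt for confirmation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not the hook from the attached patch.
class OsCommands {
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Command line that recursively deletes `path` on the given OS.
    static List<String> removeTree(String osName, String path) {
        if (isWindows(osName)) {
            // rd is a cmd.exe builtin, so it is run through cmd /c.
            return Arrays.asList("cmd", "/c", "rd", "/S", "/Q", path);
        }
        return Arrays.asList("rm", "-rf", path);
    }

    // Convenience overload that uses the OS of the running JVM.
    static List<String> removeTree(String path) {
        return removeTree(System.getProperty("os.name"), path);
    }
}
```

Passing the OS name as a parameter keeps the selection logic testable on any platform; the resulting list could be handed to `ProcessBuilder`.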



[jira] [Updated] (PIG-3029) TestTypeCheckingValidatorNewLP has some path reference issues for cross-platform execution

2012-12-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-3029:


Fix Version/s: (was: 0.10.1)
   0.11

> TestTypeCheckingValidatorNewLP has some path reference issues for 
> cross-platform execution
> --
>
> Key: PIG-3029
> URL: https://issues.apache.org/jira/browse/PIG-3029
> Project: Pig
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 0.10.0
>Reporter: John Gordon
>Assignee: John Gordon
> Fix For: 0.11
>
> Attachments: PIG-3029.trunk.1.patch
>
>
> TestTypeCheckingValidatorNewLP has a few references that are hand-coded URI 
> strings consisting of concatenating local file paths with file://.  This is 
> somewhat brittle across platforms and environment settings.
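
As a sketch of the alternative, the JDK can build the URI from a local path instead of the test concatenating `file://` by hand. `LocalFileUris` is a hypothetical name used only for this example; `File.toURI()` is standard Java.

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper, not the code from the attached patch.
class LocalFileUris {
    // Let the JDK build the URI: it adds the scheme, converts
    // platform-specific separators, and escapes characters such as spaces.
    static URI toUri(String localPath) {
        return new File(localPath).getAbsoluteFile().toURI();
    }
}
```

Because the result is a well-formed `file:` URI on every platform, it also round-trips through `new File(uri)`, which a hand-concatenated string containing backslashes or spaces may not.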



Re: Prepare for Pig 0.10.1 release

2012-12-18 Thread Daniel Dai
I would like to commit Windows patches to 0.11.0. Allow me several days.

Thanks,
Daniel

On Tue, Dec 18, 2012 at 5:47 PM, Julien Le Dem wrote:
> Sounds good to me.
> Can we cut pig 0.11.0 at the same time ?
> Julien
>
>
> On Tue, Dec 18, 2012 at 7:54 AM, Daniel Dai wrote:
>
>> Hi, Pig developers,
>>
>> We have fixed a bunch of bugs since
>> 0.10.0(
>> http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
>> I would like to propose a 0.10.1 release from top of 0.10 branch after
>> clearing all pending issues
>> (
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22
>> ).
>>
>> Any objections?
>>
>> Thanks,
>> Daniel
>>


[jira] [Updated] (PIG-3099) Pig unit test fixes for TestGrunt(1), TestStore(2), TestEmptyInputDir(3)

2012-12-18 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated PIG-3099:


Attachment: PIG-3099.patch

Fixes the tests mentioned in the description.

> Pig unit test fixes for TestGrunt(1), TestStore(2), TestEmptyInputDir(3)
> 
>
> Key: PIG-3099
> URL: https://issues.apache.org/jira/browse/PIG-3099
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Attachments: PIG-3099.patch
>
>




[jira] [Created] (PIG-3099) Pig unit test fixes for TestGrunt(1), TestStore(2), TestEmptyInputDir(3)

2012-12-18 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created PIG-3099:
---

 Summary: Pig unit test fixes for TestGrunt(1), TestStore(2), 
TestEmptyInputDir(3)
 Key: PIG-3099
 URL: https://issues.apache.org/jira/browse/PIG-3099
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K






[jira] [Commented] (PIG-3096) Make PigUnit thread safe

2012-12-18 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535536#comment-13535536
 ] 

Bill Graham commented on PIG-3096:
--

+1 lgtm.

> Make PigUnit thread safe
> 
>
> Key: PIG-3096
> URL: https://issues.apache.org/jira/browse/PIG-3096
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Fix For: 0.12
>
> Attachments: PIG-3096.patch
>
>
> Currently, {{PigUnit}} is not thread-safe because {{Cluster}} and 
> {{PigServer}} are declared as static. Converting them to ThreadLocal allows 
> PigUnit to run in a multi-threaded environment.
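
The static-to-ThreadLocal conversion described above follows a common Java pattern, sketched here with a stand-in class. `FakeServer` and `PerThreadHolder` are invented for the illustration and are not Pig classes; the actual change is in the attached patch.

```java
// Illustration of the pattern only; not the PigUnit code.
class PerThreadHolder {
    // Stand-in for an expensive object that is not safe to share between threads.
    static final class FakeServer {
    }

    // Before: `private static FakeServer server;` was shared by every thread.
    // After: each thread lazily creates and keeps its own instance.
    private static final ThreadLocal<FakeServer> SERVER =
            ThreadLocal.withInitial(FakeServer::new);

    static FakeServer server() {
        return SERVER.get();
    }
}
```

Repeated calls on one thread return the same object, while a second thread gets a different one, so tests running in parallel no longer overwrite each other's state.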



Re: Review Request: PIG-3096 Make PigUnit thread safe

2012-12-18 Thread Bill Graham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8631/#review14704
---

Ship it!



- Bill Graham


On Dec. 17, 2012, 1:34 a.m., Cheolsoo Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8631/
> ---
> 
> (Updated Dec. 17, 2012, 1:34 a.m.)
> 
> 
> Review request for pig and Santhosh Srinivasan.
> 
> 
> Description
> ---
> 
> Currently, PigUnit is not thread-safe because Cluster and PigServer are 
> declared as static. Converting them to ThreadLocal allows PigUnit to run in 
> a multi-threaded environment.
> 
> 
> This addresses bug PIG-3096.
> https://issues.apache.org/jira/browse/PIG-3096
> 
> 
> Diffs
> -
> 
>   test/org/apache/pig/pigunit/PigTest.java 50a5c79 
> 
> Diff: https://reviews.apache.org/r/8631/diff/
> 
> 
> Testing
> ---
> 
> ant test -Dtestcase=TestPigTest
> 
> I also tested it by running multiple PigUnit cases in parallel with 
> tempus-fugit (http://tempusfugitlibrary.org/documentation/junit/parallel/) on 
> a real cluster.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>



[jira] Subscription: PIG patch available

2012-12-18 Thread jira
Issue Subscription
Filter: PIG patch available (39 issues)

Subscriber: pigdaily

Key       Summary
PIG-3098  Add another test for the self join case
          https://issues.apache.org/jira/browse/PIG-3098
PIG-3096  Make PigUnit thread safe
          https://issues.apache.org/jira/browse/PIG-3096
PIG-3088  Add a builtin udf which removes prefixes
          https://issues.apache.org/jira/browse/PIG-3088
PIG-3086  Allow A Prefix To Be Added To URIs In PigUnit Tests
          https://issues.apache.org/jira/browse/PIG-3086
PIG-3078  Make a UDF that, given a string, returns just the columns prefixed by that string
          https://issues.apache.org/jira/browse/PIG-3078
PIG-3073  POUserFunc creating log spam for large scripts
          https://issues.apache.org/jira/browse/PIG-3073
PIG-3069  Native Windows Compatibility for Pig E2E Tests and Harness
          https://issues.apache.org/jira/browse/PIG-3069
PIG-3067  HBaseStorage should be split up to become more managable
          https://issues.apache.org/jira/browse/PIG-3067
PIG-3066  Fix TestPigRunner in trunk
          https://issues.apache.org/jira/browse/PIG-3066
PIG-3057  make readField protected to be able to override it if we extend PigStorage
          https://issues.apache.org/jira/browse/PIG-3057
PIG-3051  java.lang.IndexOutOfBoundsException failure with LimitOptimizer + ColumnPruning
          https://issues.apache.org/jira/browse/PIG-3051
PIG-3050  Fix FindBugs multithreading warnings
          https://issues.apache.org/jira/browse/PIG-3050
PIG-3029  TestTypeCheckingValidatorNewLP has some path reference issues for cross-platform execution
          https://issues.apache.org/jira/browse/PIG-3029
PIG-3028  testGrunt dev test needs some command filters to run correctly without cygwin
          https://issues.apache.org/jira/browse/PIG-3028
PIG-3027  pigTest unit test needs a newline filter for comparisons of golden multi-line
          https://issues.apache.org/jira/browse/PIG-3027
PIG-3026  Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
          https://issues.apache.org/jira/browse/PIG-3026
PIG-3025  TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
          https://issues.apache.org/jira/browse/PIG-3025
PIG-3024  TestEmptyInputDir unit test - hadoop version detection logic is brittle
          https://issues.apache.org/jira/browse/PIG-3024
PIG-3015  Rewrite of AvroStorage
          https://issues.apache.org/jira/browse/PIG-3015
PIG-3010  Allow UDF's to flatten themselves
          https://issues.apache.org/jira/browse/PIG-3010
PIG-2959  Add a pig.cmd for Pig to run under Windows
          https://issues.apache.org/jira/browse/PIG-2959
PIG-2957  TetsScriptUDF fail due to volume prefix in jar
          https://issues.apache.org/jira/browse/PIG-2957
PIG-2956  Invalid cache specification for some streaming statement
          https://issues.apache.org/jira/browse/PIG-2956
PIG-2955  Fix bunch of Pig e2e tests on Windows
          https://issues.apache.org/jira/browse/PIG-2955
PIG-2878  Pig current releases lack a UDF equalIgnoreCase.This function returns a Boolean value indicating whether string left is equal to string right. This check is case insensitive.
          https://issues.apache.org/jira/browse/PIG-2878
PIG-2873  Converting bin/pig shell script to python
          https://issues.apache.org/jira/browse/PIG-2873
PIG-2834  MultiStorage requires unused constructor argument
          https://issues.apache.org/jira/browse/PIG-2834
PIG-2824  Pushing checking number of fields into LoadFunc
          https://issues.apache.org/jira/browse/PIG-2824
PIG-2788  improved string interpolation of variables
          https://issues.apache.org/jira/browse/PIG-2788
PIG-2661  Pig uses an extra job for loading data in Pigmix L9
          https://issues.apache.org/jira/browse/PIG-2661
PIG-2645  PigSplit does not handle the case where SerializationFactory returns null
          https://issues.apache.org/jira/browse/PIG-2645
PIG-2614  AvroStorage crashes on LOADING a single bad error
          https://issues.apache.org/jira/browse/PIG-2614
PIG-2507  Semicolon in paramenters for UDF results in parsing error
          https://issues.apache.org/jira/browse/PIG-2507
PIG-2433  Jython import module not working if module path is in classpath
          https://issues.apache.org/jira/browse/PIG-2433
PIG-2417  Streaming UDFs - allow users to easily write UDFs in scripting languages with no JVM implementation.
          https://issues.apache.org/jira/browse/PIG-2417
PIG-2362  Rework Ant build.xml to use macrodef instead of antcall
          https://issues.apache.org/jira/browse/PIG-2362
PIG-2312  NPE when relation and column share the same name and used in Nested Foreach
          https://issues.apache.org

[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-12-18 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535506#comment-13535506
 ] 

Jonathan Coveney commented on PIG-2353:
---

You cannot, so this is not without precedent. We should document it, and 
ideally introduce better error messages around it (separate JIRA; error messages 
for other keywords are equally bad).

> RANK function like in SQL
> -
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
>  Issue Type: New Feature
>Reporter: Gianmarco De Francisci Morales
>Assignee: Allan Avendaño
>  Labels: gsoc2012, mentor
> Fix For: 0.11
>
> Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
> PIG-2353-5.txt, PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.
> This is a candidate project for Google Summer of Code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
> Functionality implemented so far is available at 
> https://reviews.apache.org/r/5523/diff/#index_header



[jira] [Commented] (PIG-3090) Introduce a syntax to be able to easily refer to the previously defined relation

2012-12-18 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535503#comment-13535503
 ] 

Jonathan Coveney commented on PIG-3090:
---

Poke one of your colleagues into reviewing it :)

> Introduce a syntax to be able to easily refer to the previously defined 
> relation
> 
>
> Key: PIG-3090
> URL: https://issues.apache.org/jira/browse/PIG-3090
> Project: Pig
>  Issue Type: New Feature
>Reporter: Jonathan Coveney
> Attachments: PIG-3090-0.patch
>
>
> Sometimes I feel like swimming with ANTLRs. This particular feature isn't too 
> hard to add... and supports syntax like this:
> {code}
> a = load 'thing' as (x:int);
> b = foreach @ generate x;
> c = foreach @ generate x;
> d = foreach @ generate x;
> {code}
> I have a patch, though I need to make sure it doesn't change anything (it 
> shouldn't) and I need to add tests.
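
To picture the intended semantics of `@`, here is a toy rewriter that replaces each bare `@` with the alias assigned by the previous statement. It is only a model of the behaviour shown in the example above, not how the patch works (the patch changes the ANTLR grammar). `LastAliasRewriter` is a hypothetical name, and the sketch ignores quoted strings and multi-line statements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of the '@' shorthand; not the parser change from the patch.
class LastAliasRewriter {
    // Matches "alias =" at the start of a statement.
    private static final Pattern ASSIGNMENT =
            Pattern.compile("^\\s*([A-Za-z_][A-Za-z0-9_]*)\\s*=");
    // A bare '@' surrounded by whitespace. Quoted strings are not handled.
    private static final Pattern AT = Pattern.compile("(?<=\\s)@(?=\\s)");

    static List<String> rewrite(List<String> statements) {
        List<String> out = new ArrayList<>();
        String last = null;
        for (String stmt : statements) {
            String rewritten = stmt;
            if (last != null) {
                // Substitute the alias defined by the previous statement.
                rewritten = AT.matcher(stmt).replaceAll(Matcher.quoteReplacement(last));
            }
            out.add(rewritten);
            // Remember the alias this statement defines, if any.
            Matcher m = ASSIGNMENT.matcher(stmt);
            if (m.find()) {
                last = m.group(1);
            }
        }
        return out;
    }
}
```

Applied to the four statements in the example, the second statement would read from `a`, the third from `b`, and the fourth from `c`.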



[jira] [Commented] (PIG-3090) Introduce a syntax to be able to easily refer to the previously defined relation

2012-12-18 Thread Russell Jurney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535474#comment-13535474
 ] 

Russell Jurney commented on PIG-3090:
-

I like this.

> Introduce a syntax to be able to easily refer to the previously defined 
> relation
> 
>
> Key: PIG-3090
> URL: https://issues.apache.org/jira/browse/PIG-3090
> Project: Pig
>  Issue Type: New Feature
>Reporter: Jonathan Coveney
> Attachments: PIG-3090-0.patch
>
>
> Sometimes I feel like swimming with ANTLRs. This particular feature isn't too 
> hard to add... and supports syntax like this:
> {code}
> a = load 'thing' as (x:int);
> b = foreach @ generate x;
> c = foreach @ generate x;
> d = foreach @ generate x;
> {code}
> I have a patch, though I need to make sure it doesn't change anything (it 
> shouldn't) and I need to add tests.



[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-12-18 Thread Gianmarco De Francisci Morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535455#comment-13535455
 ] 

Gianmarco De Francisci Morales commented on PIG-2353:
-

Hi Jonathan,
Yes, RANK is now an operator and thus a reserved keyword.
We can add it to the release notes.

The parser is definitely a bit rough and could use some reworking, especially 
in the error messages, so I am all in for it. Not sure if it is a known issue. 
Can you use LOAD or FOREACH as column names?

> RANK function like in SQL
> -
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
>  Issue Type: New Feature
>Reporter: Gianmarco De Francisci Morales
>Assignee: Allan Avendaño
>  Labels: gsoc2012, mentor
> Fix For: 0.11
>
> Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
> PIG-2353-5.txt, PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.
> This is a candidate project for Google Summer of Code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
> Functionality implemented so far is available at 
> https://reviews.apache.org/r/5523/diff/#index_header



Is it known that you can't use rank as a column name?

2012-12-18 Thread Jonathan Coveney
I am getting weird errors. I assume this is because of the rank operator,
and it is a non-backwards compatible change that could break scripts (it is
breaking some scripts on our end).

Wanted to see what people knew before trying to make this work in the
parser.

I updated the JIRA, but wanted to hit the broader list of devs.
https://issues.apache.org/jira/browse/PIG-2353


[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-12-18 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535441#comment-13535441
 ] 

Jonathan Coveney commented on PIG-2353:
---

Did this make rank a reserved keyword? If so, we may need to document this as a 
non-backwards compatible change, as many scripts use "rank" as a 
column name. Example:

{code}
A = load 'thing';
B = FOREACH (GROUP A all) GENERATE MIN(A.rank);
{code}

Of all the errors you'd expect, I wasn't expecting this one:


2012-12-18 23:18:36,142 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
1200:   mismatched input '(' expecting SEMI_COLON
Details at logfile: /var/log/pig/pig_1355872714665.log

I culled this example from a larger script, and it looks like removing rank as 
a column name fixed it. Is this a known issue? I think we can refine the parser 
to work with rank in that position, but I thought it would be worth asking.
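
Until the parser is refined, a pre-flight check on field names is one way affected scripts could be spotted early. The sketch below is hypothetical and not part of Pig: `ReservedNameCheck` is an invented name, the keyword set contains only the words named in this thread rather than Pig's full reserved-word list, and it assumes keywords are matched case-insensitively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical lint helper, not part of Pig.
class ReservedNameCheck {
    // Only the keywords named in this thread; not Pig's full reserved-word list.
    static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("rank", "load", "foreach"));

    // Return the field names that collide with a reserved keyword.
    static List<String> conflicts(List<String> fieldNames) {
        List<String> hits = new ArrayList<>();
        for (String name : fieldNames) {
            if (RESERVED.contains(name.toLowerCase(Locale.ROOT))) {
                hits.add(name);
            }
        }
        return hits;
    }
}
```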

> RANK function like in SQL
> -
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
>  Issue Type: New Feature
>Reporter: Gianmarco De Francisci Morales
>Assignee: Allan Avendaño
>  Labels: gsoc2012, mentor
> Fix For: 0.11
>
> Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
> PIG-2353-5.txt, PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.
> This is a candidate project for Google Summer of Code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
> Functionality implemented so far is available at 
> https://reviews.apache.org/r/5523/diff/#index_header



[jira] [Resolved] (PIG-2493) UNION causes casting issues

2012-12-18 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PIG-2493.


Resolution: Fixed

I'm closing this issue as it has been committed and we are stabilizing a 
release.
[~arov], please open a new JIRA if you still see problems.

> UNION causes casting issues
> ---
>
> Key: PIG-2493
> URL: https://issues.apache.org/jira/browse/PIG-2493
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.1, 0.10.0
>Reporter: Anitha Raju
>Assignee: Vivek Padmanabhan
> Fix For: 0.9.3, 0.11, 0.10.1
>
> Attachments: PIG-2493_2.patch, PIG-2493-3.patch, PIG-2493.patch
>
>
> Hi,
> For the below script,
> {code}
> A = load '/user/anithar/ip' as (a);
> B = load '/user/anithar/ip1' as (a);
> C = union  A , B ;
> D = foreach C generate (chararray)a;
> dump D;
> {code}
> it gives a casting error at runtime
> {code}
> org.apache.pig.backend.executionengine.ExecException: ERROR 1075: Received a 
> bytearray from the UDF. Cannot determine how to convert the bytearray to 
> string.
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POCast.getNext(POCast.java:660)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:322)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
> at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:267)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:262)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1082)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> It looks like in POCast.java "funcSpec" is never assigned a value (it stays 
> null when a UNION is involved), which leaves "caster" null and causes the 
> exception.
> The same works in 0.8 without any issue.
> Regards,
> Anitha

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (PIG-2583) Add Grunt command to list the statements in cache

2012-12-18 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PIG-2583.


Resolution: Fixed

> Add Grunt command to list the statements in cache
> -
>
> Key: PIG-2583
> URL: https://issues.apache.org/jira/browse/PIG-2583
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Daniel Dai
>Assignee: Allan Avendaño
>Priority: Minor
>  Labels: newbie
> Fix For: 0.11
>
> Attachments: gruntHistory1.patch, gruntHistory2.patch, 
> gruntHistory3.patch, gruntHistory4.patch, gruntHistory.patch
>
>
> It is convenient to list statements in cache:
> grunt> a = load '1.txt'; 
> grunt> b = foreach a generate $0, $1;
> grunt> list
> a = load '1.txt';
> b = foreach a generate $0, $1;



[jira] [Commented] (PIG-2583) Add Grunt command to list the statements in cache

2012-12-18 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535355#comment-13535355
 ] 

Julien Le Dem commented on PIG-2583:


[~xalan] I'm closing this ticket as it has been committed.
Please open a new ticket to further improve your contribution.
Thanks again


> Add Grunt command to list the statements in cache
> -
>
> Key: PIG-2583
> URL: https://issues.apache.org/jira/browse/PIG-2583
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Daniel Dai
>Assignee: Allan Avendaño
>Priority: Minor
>  Labels: newbie
> Fix For: 0.11
>
> Attachments: gruntHistory1.patch, gruntHistory2.patch, 
> gruntHistory3.patch, gruntHistory4.patch, gruntHistory.patch
>
>
> It is convenient to list statements in cache:
> grunt> a = load '1.txt'; 
> grunt> b = foreach a generate $0, $1;
> grunt> list
> a = load '1.txt';
> b = foreach a generate $0, $1;



[jira] [Updated] (PIG-2614) AvroStorage crashes on LOADING a single bad error

2012-12-18 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-2614:
---

Fix Version/s: (was: 0.10.1)
   (was: 0.11)
   0.12

moving this to next release so that we can converge on pig 0.11

> AvroStorage crashes on LOADING a single bad error
> -
>
> Key: PIG-2614
> URL: https://issues.apache.org/jira/browse/PIG-2614
> Project: Pig
>  Issue Type: Bug
>  Components: piggybank
>Affects Versions: 0.10.0, 0.11
>Reporter: Russell Jurney
>Assignee: Jonathan Coveney
>  Labels: avro, avrostorage, bad, book, cutting, doug, for, my, 
> pig, sadism
> Fix For: 0.12
>
> Attachments: PIG-2614_0.patch, PIG-2614_1.patch, PIG-2614_2.patch, 
> test_avro_files.tar.gz
>
>
> AvroStorage dies when a single bad record exists, such as one with missing 
> fields.  This is very bad on 'big data,' where bad records are inevitable.  
> See discussion at 
> http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
>  for more theory.
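
One common way to get the tolerance asked for above is to count bad records and give up only once a threshold is exceeded. The following is a minimal Java sketch of that idea and is not AvroStorage code: the class BadRecordSketch, the method loadTolerant, the two-integer comma-separated record format, and the maxBad threshold are all hypothetical, chosen only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from AvroStorage or piggybank.
final class BadRecordSketch {

    // Parses lines of the form "int,int". Malformed lines are counted and
    // skipped; the load aborts only when more than maxBad lines are bad.
    static List<int[]> loadTolerant(List<String> lines, int maxBad) {
        List<int[]> good = new ArrayList<>();
        int bad = 0;
        for (String line : lines) {
            String[] fields = line.split(",", -1);
            try {
                if (fields.length != 2) {
                    throw new IllegalArgumentException("expected 2 fields");
                }
                good.add(new int[] {
                    Integer.parseInt(fields[0].trim()),
                    Integer.parseInt(fields[1].trim())
                });
            } catch (IllegalArgumentException e) {
                // NumberFormatException is a subclass, so it lands here too.
                bad++;
                if (bad > maxBad) {
                    throw new IllegalStateException("too many bad records: " + bad, e);
                }
            }
        }
        return good;
    }

    public static void main(String[] args) {
        List<String> lines = java.util.Arrays.asList("1,2", "3", "x,4", "5,6");
        System.out.println(loadTolerant(lines, 2).size() + " good records loaded");
    }
}
```

A fixed count is the simplest policy; a ratio of bad to total records would be another reasonable choice for large inputs.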



[jira] [Updated] (PIG-2927) SHIP and use JRuby gems in JRuby UDFs

2012-12-18 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem updated PIG-2927:
---

Fix Version/s: (was: 0.11)
   0.12

This will go in the next release as we are stabilizing the 0.11 branch

> SHIP and use JRuby gems in JRuby UDFs
> -
>
> Key: PIG-2927
> URL: https://issues.apache.org/jira/browse/PIG-2927
> Project: Pig
>  Issue Type: New Feature
>  Components: parser
>Affects Versions: 0.11
> Environment: JRuby UDFs
>Reporter: Russell Jurney
>Assignee: Jonathan Coveney
>Priority: Minor
> Fix For: 0.12
>
> Attachments: PIG-2927-0.patch, PIG-2927-1.patch, PIG-2927-2.patch, 
> PIG-2927-3.patch, PIG-2927-4.patch
>
>
> It would be great to use JRuby gems in JRuby UDFs without installing them on 
> all machines on the cluster. Some way to SHIP them automatically with the job 
> would be great.



[jira] [Commented] (PIG-2955) Fix bunch of Pig e2e tests on Windows

2012-12-18 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535339#comment-13535339
 ] 

Julien Le Dem commented on PIG-2955:


Daniel, do you want to check that in?

>  Fix bunch of Pig e2e tests on Windows 
> ---
>
> Key: PIG-2955
> URL: https://issues.apache.org/jira/browse/PIG-2955
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11, 0.10.1, 0.12
>
> Attachments: PIG-2955-1.patch, PIG-2955-2_0.10.patch, PIG-2955-2.patch
>
>
> Fix the following test aborts and failures:
> ComputeSpec_1
> ComputeSpec_2
> Unicode_cmdline_1
> Warning_1
> Warning_4
> Checkin_2
> UdfDistributedCache_1
> Jython_Checkin_2
> Jython_Diagnostics_4
> Jython_Diagnostics_5
> Jython_Diagnostics_6
> Jython_Error_3
> Jython_Error_4
> Jython_Error_5
> Jython_Error_6
> Jython_Error_7
> Grunt_6
> Grunt_8
> Grunt_13
> Grunt_14



[jira] [Commented] (PIG-2954) TestParamSubPreproc still depends on "bash" to run

2012-12-18 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535340#comment-13535340
 ] 

Julien Le Dem commented on PIG-2954:


is this still on target for pig-0.11?

>  TestParamSubPreproc still depends on "bash" to run 
> 
>
> Key: PIG-2954
> URL: https://issues.apache.org/jira/browse/PIG-2954
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2954-1.patch, PIG-2954-2.patch
>
>
> If bash does not exist in the path, there are 3 test failures.



[jira] [Commented] (PIG-2956) Invalid cache specification for some streaming statement

2012-12-18 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535335#comment-13535335
 ] 

Julien Le Dem commented on PIG-2956:


Daniel? any update on this?

> Invalid cache specification for some streaming statement
> 
>
> Key: PIG-2956
> URL: https://issues.apache.org/jira/browse/PIG-2956
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2956-1_0.10.patch, PIG-2956-1.patch
>
>
> Another category of failure in e2e tests, such as ComputeSpec_1, 
> ComputeSpec_2, ComputeSpec_3, RaceConditions_1, RaceConditions_3, 
> RaceConditions_4, RaceConditions_7, RaceConditions_8.
> Here is the stack:
> ERROR 6003: Invalid cache specification. File doesn't exist: C:/Program Files 
> (x86)/GnuWin32/bin/head.exe
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException:
>  ERROR 2017: Internal error creating job configuration.
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:723)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:258)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:151)
> at org.apache.pig.PigServer.launchPlan(PigServer.java:1318)
> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1303)
> at org.apache.pig.PigServer.execute(PigServer.java:1293)
> at org.apache.pig.PigServer.executeBatch(PigServer.java:364)
> at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:133)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
> at 
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
> at org.apache.pig.Main.run(Main.java:561)
> at org.apache.pig.Main.main(Main.java:111)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6003: 
> Invalid cache specification. File doesn't exist: C:/Program Files 
> (x86)/GnuWin32/bin/head.exe
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.setupDistributedCache(JobControlCompiler.java:1151)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.setupDistributedCache(JobControlCompiler.java:1129)
> at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:447)



[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-18 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535274#comment-13535274
 ] 

Olga Natkovich commented on PIG-2764:
-

I agree with using standard type.

> Add a biginteger and bigdecimal type to pig
> ---
>
> Key: PIG-2764
> URL: https://issues.apache.org/jira/browse/PIG-2764
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch
>
>
> I think it would be useful for applications where precision is more important 
> than speed to have the option of using java's bigdecimal and biginteger types 
> natively.
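
For readers unfamiliar with the precision trade-off described above, here is a small Java sketch using the standard java.math classes. It only illustrates how BigDecimal and BigInteger behave in plain Java and says nothing about how the attached patches integrate them into Pig; the class and method names (PrecisionSketch, sumTenthsExact, and so on) are made up for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration only; not taken from the PIG-2764 patches.
final class PrecisionSketch {

    // Adds 0.1 ten times using binary floating point; rounding error accumulates.
    static double sumTenthsDouble() {
        double total = 0.0;
        for (int i = 0; i < 10; i++) {
            total += 0.1;
        }
        return total;
    }

    // Adds 0.1 ten times using decimal arithmetic; the result is exactly one.
    static BigDecimal sumTenthsExact() {
        BigDecimal total = BigDecimal.ZERO;
        BigDecimal tenth = new BigDecimal("0.1");
        for (int i = 0; i < 10; i++) {
            total = total.add(tenth);
        }
        return total;
    }

    // A long would overflow here; BigInteger simply grows.
    static BigInteger longMaxPlusOne() {
        return BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println("double sum equals 1.0?   " + (sumTenthsDouble() == 1.0));
        System.out.println("BigDecimal sum equals 1? "
                + (sumTenthsExact().compareTo(BigDecimal.ONE) == 0));
        System.out.println("Long.MAX_VALUE + 1 =     " + longMaxPlusOne());
    }
}
```

Note that the comparison uses compareTo rather than equals, because BigDecimal.equals also compares scale (1.0 and 1 are not equal under equals).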



[jira] [Commented] (PIG-2957) TetsScriptUDF fail due to volume prefix in jar

2012-12-18 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535249#comment-13535249
 ] 

Julien Le Dem commented on PIG-2957:


Could you call the method something more explicit than "cleanupPath"? Something 
like "getPathForJar", maybe?
Also, please add comments to explain exactly what this is doing:
{noformat}
if (path.charAt(1)==':') {
newPath = path.charAt(0) + path.substring(2);
}
{noformat}
It would be useful to describe what it is changing in the path and why.
In particular the drive letter becomes a root dir in the jar (C:/foo becomes 
C/foo). If that's what we want then it should be clearer.
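
As a rough illustration of the request above, the snippet could be wrapped in an explicitly named, commented helper along these lines. This is only a sketch and not the committed patch: the class name JarPathSketch is made up, getPathForJar is simply the name suggested in this comment, and the length and letter checks are an added assumption so the helper is safe on short or non-Windows paths.

```java
// Hypothetical illustration only; not the code from the PIG-2957 patches.
final class JarPathSketch {

    // Returns the entry name to use inside the job jar for a local path.
    // On Windows an absolute path starts with a volume prefix such as "C:".
    // The colon is not usable in the directory name created when the jar is
    // unpacked, so it is dropped and the drive letter becomes a top-level
    // directory in the jar: "C:/foo" becomes "C/foo".
    // Paths without a volume prefix are returned unchanged.
    static String getPathForJar(String path) {
        boolean hasVolumePrefix = path.length() >= 2
                && Character.isLetter(path.charAt(0))
                && path.charAt(1) == ':';
        if (hasVolumePrefix) {
            return path.charAt(0) + path.substring(2);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(getPathForJar("C:/foo"));
        System.out.println(getPathForJar("/usr/lib/foo.py"));
    }
}
```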

> TetsScriptUDF fail due to volume prefix in jar
> --
>
> Key: PIG-2957
> URL: https://issues.apache.org/jira/browse/PIG-2957
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: PIG-2957-1.patch, PIG-2957-2_0.10.patch, PIG-2957-2.patch
>
>
> testPythonAbsolutePath fails. The stack is:
> java.io.IOException: Mkdirs failed to create 
> C:\tmp\hadoop-Administrator\mapred\local\1_0\taskTracker\Administrator\jobcache\job_20120725074728013_0011\jars\C:\Users\Administrator\pig-monarch
> at org.apache.hadoop.util.RunJar.unJar(RunJar.java:47)
> at 
> org.apache.hadoop.mapred.JobLocalizer.localizeJobJarFile(JobLocalizer.java:277)
> at 
> org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:377)
> at 
> org.apache.hadoop.mapred.JobLocalizer.localizeJobFiles(JobLocalizer.java:367)
> at 
> org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:214)
> at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1237)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1107)
> at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1212)
> at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1127)
> at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2417)
> at java.lang.Thread.run(Thread.java:662)
> The reason is that we pack the volume prefix into the job.jar:
> jar tvf C:\Users\ADMINI~1\AppData\Local\Temp\Job6350669482684441868.jar|grep testPythonAbsolutePath
> 98 Wed Jul 25 11:12:58 PDT 2012 C:\Users\Administrator\pig-monarch\testPythonAbsolutePath.py



[jira] [Commented] (PIG-2959) Add a pig.cmd for Pig to run under Windows

2012-12-18 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535220#comment-13535220
 ] 

Julien Le Dem commented on PIG-2959:


hey Daniel, are you going to commit this?

> Add a pig.cmd for Pig to run under Windows
> --
>
> Key: PIG-2959
> URL: https://issues.apache.org/jira/browse/PIG-2959
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: pig.cmd
>
>




[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-18 Thread Mathias Herberts (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535178#comment-13535178
 ] 

Mathias Herberts commented on PIG-2764:
---

Support for BigDecimal would be nice, and I think there is no need to have a 
separate BigInteger type; it suffices to be smart in the way casts are done.

> Add a biginteger and bigdecimal type to pig
> ---
>
> Key: PIG-2764
> URL: https://issues.apache.org/jira/browse/PIG-2764
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch
>
>
> I think it would be useful for applications where precision is more important 
> than speed to have the option of using java's bigdecimal and biginteger types 
> natively.



[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-18 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535166#comment-13535166
 ] 

Jonathan Coveney commented on PIG-2764:
---

I would be happy to make a cut against trunk, since this patch is probably out 
of date. I would love to have people weigh in on criteria for inclusion. IMHO 
(re: Alan's previous comments) I would rather go with the well-known (even if a 
little slower) BigNumber types. Rolling our own will mean we'll continue 
adding features until we converge on a crappier version of them.

> Add a biginteger and bigdecimal type to pig
> ---
>
> Key: PIG-2764
> URL: https://issues.apache.org/jira/browse/PIG-2764
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch
>
>
> I think it would be useful for applications where precision is more important 
> than speed to have the option of using java's bigdecimal and biginteger types 
> natively.



Re: Prepare for Pig 0.10.1 release

2012-12-18 Thread Prashant Kommireddi
Let's do it!

Sent from my iPhone

On Dec 18, 2012, at 9:25 PM, Daniel Dai  wrote:

> Hi, Pig developers,
>
> We have fixed a bunch of bugs since
> 0.10.0(http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
> I would like to propose a 0.10.1 release from top of 0.10 branch after
> clearing all pending issues
> (https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22).
>
> Any objections?
>
> Thanks,
> Daniel


Re: Prepare for Pig 0.10.1 release

2012-12-18 Thread Julien Le Dem
PIG-2803 and PIG-2602 are new features.
We should include bug fixes only.
does that sound reasonable to you?
Julien



On Tue, Dec 18, 2012 at 9:47 AM, Julien Le Dem  wrote:

> Sounds good to me.
> Can we cut pig 0.11.0 at the same time ?
> Julien
>
>
>
> On Tue, Dec 18, 2012 at 7:54 AM, Daniel Dai  wrote:
>
>> Hi, Pig developers,
>>
>> We have fixed a bunch of bugs since
>> 0.10.0(
>> http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
>> I would like to propose a 0.10.1 release from top of 0.10 branch after
>> clearing all pending issues
>> (
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22
>> ).
>>
>> Any objections?
>>
>> Thanks,
>> Daniel
>>
>
>


Re: Prepare for Pig 0.10.1 release

2012-12-18 Thread Russell Jurney
Awesome.

Russell Jurney http://datasyndrome.com

On Dec 18, 2012, at 7:55 AM, Daniel Dai  wrote:

> Hi, Pig developers,
>
> We have fixed a bunch of bugs since
> 0.10.0(http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
> I would like to propose a 0.10.1 release from top of 0.10 branch after
> clearing all pending issues
> (https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22).
>
> Any objections?
>
> Thanks,
> Daniel


Re: Prepare for Pig 0.10.1 release

2012-12-18 Thread Julien Le Dem
Sounds good to me.
Can we cut pig 0.11.0 at the same time ?
Julien


On Tue, Dec 18, 2012 at 7:54 AM, Daniel Dai  wrote:

> Hi, Pig developers,
>
> We have fixed a bunch of bugs since
> 0.10.0(
> http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
> I would like to propose a 0.10.1 release from top of 0.10 branch after
> clearing all pending issues
> (
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22
> ).
>
> Any objections?
>
> Thanks,
> Daniel
>


[jira] [Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig

2012-12-18 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535085#comment-13535085
 ] 

Olga Natkovich commented on PIG-2764:
-

I think having support for BigInteger would be very helpful. We have asks 
within Yahoo for it. 

> Add a biginteger and bigdecimal type to pig
> ---
>
> Key: PIG-2764
> URL: https://issues.apache.org/jira/browse/PIG-2764
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
> Attachments: fixedpoint.patch, PIG-2764-0.patch, PIG-2764-1.patch
>
>
> I think it would be useful for applications where precision is more important 
> than speed to have the option of using java's bigdecimal and biginteger types 
> natively.



Prepare for Pig 0.10.1 release

2012-12-18 Thread Daniel Dai
Hi, Pig developers,

We have fixed a bunch of bugs since
0.10.0(http://svn.apache.org/repos/asf/pig/branches/branch-0.10/CHANGES.txt).
I would like to propose a 0.10.1 release from top of 0.10 branch after
clearing all pending issues
(https://issues.apache.org/jira/issues/?jql=project%20%3D%20PIG%20AND%20resolution%20%3D%20Unresolved%20AND%20fixVersion%20%3D%20%220.10.1%22).

Any objections?

Thanks,
Daniel


Build failed in Jenkins: Pig-trunk #1380

2012-12-18 Thread Apache Jenkins Server
See 

Changes:

[jcoveney] Update trunk CHANGES to reflect that PIG-3020 is in 0.11 as well

[jcoveney] PIG-3020: "Duplicate uid in schema" error when joining two relations 
derived from the same load statement (jcoveney)

--
[...truncated 6591 lines...]
 [findbugs]   jline.History
 [findbugs]   org.jruby.embed.internal.LocalContextProvider
 [findbugs]   org.apache.hadoop.io.BooleanWritable
 [findbugs]   org.apache.log4j.Logger
 [findbugs]   org.apache.hadoop.hbase.filter.FamilyFilter
 [findbugs]   org.codehaus.jackson.annotate.JsonPropertyOrder
 [findbugs]   groovy.lang.Tuple
 [findbugs]   org.antlr.runtime.IntStream
 [findbugs]   org.apache.hadoop.util.ReflectionUtils
 [findbugs]   org.apache.hadoop.fs.ContentSummary
 [findbugs]   org.jruby.runtime.builtin.IRubyObject
 [findbugs]   org.jruby.RubyInteger
 [findbugs]   org.python.core.PyTuple
 [findbugs]   org.mortbay.log.Log
 [findbugs]   org.apache.hadoop.conf.Configuration
 [findbugs]   com.google.common.base.Joiner
 [findbugs]   org.apache.hadoop.mapreduce.lib.input.FileSplit
 [findbugs]   org.apache.hadoop.mapred.Counters$Counter
 [findbugs]   com.jcraft.jsch.Channel
 [findbugs]   org.apache.hadoop.mapred.JobPriority
 [findbugs]   org.apache.commons.cli.Options
 [findbugs]   org.apache.hadoop.mapred.JobID
 [findbugs]   org.apache.hadoop.util.bloom.BloomFilter
 [findbugs]   org.python.core.PyFrame
 [findbugs]   org.apache.hadoop.hbase.filter.CompareFilter
 [findbugs]   org.apache.hadoop.util.VersionInfo
 [findbugs]   org.python.core.PyString
 [findbugs]   org.apache.hadoop.io.Text$Comparator
 [findbugs]   org.jruby.runtime.Block
 [findbugs]   org.antlr.runtime.MismatchedSetException
 [findbugs]   org.apache.hadoop.io.BytesWritable
 [findbugs]   org.apache.hadoop.fs.FsShell
 [findbugs]   org.joda.time.Months
 [findbugs]   org.mozilla.javascript.ImporterTopLevel
 [findbugs]   org.apache.hadoop.hbase.mapreduce.TableOutputFormat
 [findbugs]   org.apache.hadoop.mapred.TaskReport
 [findbugs]   org.apache.hadoop.security.UserGroupInformation
 [findbugs]   org.antlr.runtime.tree.RewriteRuleSubtreeStream
 [findbugs]   org.apache.commons.cli.HelpFormatter
 [findbugs]   com.google.common.collect.Maps
 [findbugs]   org.joda.time.ReadableInstant
 [findbugs]   org.mozilla.javascript.NativeObject
 [findbugs]   org.apache.hadoop.hbase.HConstants
 [findbugs]   org.apache.hadoop.io.serializer.Deserializer
 [findbugs]   org.antlr.runtime.FailedPredicateException
 [findbugs]   org.apache.hadoop.io.compress.CompressionCodec
 [findbugs]   org.jruby.RubyNil
 [findbugs]   org.apache.hadoop.fs.FileStatus
 [findbugs]   org.apache.hadoop.hbase.client.Result
 [findbugs]   org.apache.hadoop.mapreduce.JobContext
 [findbugs]   org.codehaus.jackson.JsonGenerator
 [findbugs]   org.apache.hadoop.mapreduce.TaskAttemptContext
 [findbugs]   org.apache.hadoop.io.LongWritable$Comparator
 [findbugs]   org.codehaus.jackson.map.util.LRUMap
 [findbugs]   org.apache.hadoop.hbase.util.Bytes
 [findbugs]   org.antlr.runtime.MismatchedTokenException
 [findbugs]   org.codehaus.jackson.JsonParser
 [findbugs]   com.jcraft.jsch.UserInfo
 [findbugs]   org.apache.hadoop.hbase.filter.WhileMatchFilter
 [findbugs]   org.python.core.PyException
 [findbugs]   org.apache.commons.cli.ParseException
 [findbugs]   org.apache.hadoop.io.compress.CompressionOutputStream
 [findbugs]   org.apache.hadoop.hbase.filter.WritableByteArrayComparable
 [findbugs]   org.antlr.runtime.tree.CommonTreeNodeStream
 [findbugs]   org.apache.log4j.Level
 [findbugs]   org.apache.hadoop.hbase.client.Scan
 [findbugs]   org.jruby.anno.JRubyMethod
 [findbugs]   org.apache.hadoop.mapreduce.Job
 [findbugs]   com.google.common.util.concurrent.Futures
 [findbugs]   org.apache.commons.logging.LogFactory
 [findbugs]   org.apache.commons.collections.IteratorUtils
 [findbugs]   org.apache.commons.codec.binary.Base64
 [findbugs]   org.codehaus.jackson.map.ObjectMapper
 [findbugs]   org.apache.hadoop.fs.FileSystem
 [findbugs]   org.jruby.embed.LocalContextScope
 [findbugs]   org.apache.hadoop.hbase.filter.FilterList$Operator
 [findbugs]   org.jruby.RubySymbol
 [findbugs]   org.codehaus.jackson.map.annotate.JacksonStdImpl
 [findbugs]   org.apache.hadoop.hbase.io.ImmutableBytesWritable
 [findbugs]   org.apache.hadoop.io.serializer.SerializationFactory
 [findbugs]   org.antlr.runtime.tree.TreeAdaptor
 [findbugs]   org.apache.hadoop.mapred.RunningJob
 [findbugs]   org.antlr.runtime.CommonTokenStream
 [findbugs]   org.apache.hadoop.io.DataInputBuffer
 [findbugs]   org.apache.hadoop.io.file.tfile.TFile
 [findbugs]   org.apache.commons.cli.GnuParser
 [findbugs]   org.mozilla.javascript.Context
 [findbugs]   org.apache.hadoop.io.FloatWritable
 [findbugs]   org.antlr.runtime.tree.RewriteEarlyExitException
 [findbugs]   org.apache.hadoop.hbase.HBaseConfiguration
 [findbugs]   org.codehaus.jackson.JsonGenerationException
 [findbugs]   org.apache.hadoop.mapreduce.TaskIn