Re: Review Request: TestJobSumission and TestHBaseStorage don't work with HBase 0.94 and ZK 3.4.3

2012-10-22 Thread Cheolsoo Park


> On Oct. 23, 2012, 5:07 a.m., Santhosh Srinivasan wrote:
> > A few comments. It's probably a good idea to have someone with more 
> > knowledge of HBaseStorage take a second look.

Thank you very much for your feedback! I added answers below. Please let me 
know if you disagree with me.


> On Oct. 23, 2012, 5:07 a.m., Santhosh Srinivasan wrote:
> > ivy/libraries.properties, line 74
> > 
> >
> > Zookeeper-3.4.4 has been out but has a known issue with SASL and Java 
> > 1.7.  Is 3.3.3 required for Hbase 0.94.1 ?

You're asking whether ZK 3.4.3 (not 3.3.3) is required by hbase 0.94.1, right?

The answer is yes. In particular, HBaseTestingUtility depends on the following 
ZK class, which doesn't seem to exist in ZK 3.3.3:

java.lang.NoClassDefFoundError: org/apache/zookeeper/server/NIOServerCnxnFactory

In fact, I don't think that we should worry about those known ZK issues because 
the versions of HBase and ZK that I am updating only matter for unit tests. As 
far as I can tell, HBaseStorage itself is fully compatible with all of HBase 
0.90, 0.92, and 0.94 and won't be affected by this change at all.
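
To see whether the ZK jar on a given classpath provides the class named in the error above, a probe along these lines may help. This is only a minimal sketch: `ZkClasspathCheck` and `isClassPresent` are hypothetical names for illustration and are not part of Pig, HBase, or ZooKeeper; the only name taken from this thread is the class from the NoClassDefFoundError.

```java
// Hypothetical helper for illustration; not part of Pig, HBase, or ZooKeeper.
final class ZkClasspathCheck {
    // Class named in the NoClassDefFoundError quoted in this thread.
    static final String NIO_FACTORY =
        "org.apache.zookeeper.server.NIOServerCnxnFactory";

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isClassPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ZkClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(NIO_FACTORY + " present: " + isClassPresent(NIO_FACTORY));
    }
}
```

Run with the candidate zookeeper jar on the classpath; a `false` result would be consistent with the error seen from HBaseTestingUtility.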


> On Oct. 23, 2012, 5:07 a.m., Santhosh Srinivasan wrote:
> > test/org/apache/pig/test/TestJobSubmission.java, line 431
> > 
> >
> > Can the commented out code be removed?

To be honest, I do not know why we keep that block of code. Nevertheless, I am 
hesitant to remove it since someone might have commented it out only 
temporarily.


- Cheolsoo


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7676/#review12676
---


On Oct. 22, 2012, 6:50 a.m., Cheolsoo Park wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7676/
> ---
> 
> (Updated Oct. 22, 2012, 6:50 a.m.)
> 
> 
> Review request for pig and Santhosh Srinivasan.
> 
> 
> Description
> ---
> 
> The changes include:
> 
> 1. Stop bundling hbase.jar and zookeeper.jar with pig.jar, so there should no 
> longer be incompatibility issues when using pig.jar with different versions 
> of hbase.jar. But to use HBaseStorage, HBASE_HOME and ZOOKEEPER_HOME must be 
> set by the user. Note that I am adding protobuf-java.jar to pig.jar because 
> otherwise it has to be explicitly added to PIG_CLASSPATH to use HBaseStorage, 
> which is not very intuitive.
> 
> 2. Bump hbase and zk to 0.94.1 and 3.4.3 respectively. Since we no longer 
> bundle them in pig.jar, which versions we use doesn't matter. These jar files 
> will be used for unit tests only.
> 
> 3. Make the unit test cases work with newer versions of hbase and zk.
> 
> 4. Add hbase runtime dependencies to ivy.xml.
> 
> 
> This addresses bug PIG-2885.
> https://issues.apache.org/jira/browse/PIG-2885
> 
> 
> Diffs
> -
> 
>   build.xml 6b04f8a 
>   ivy.xml 6e0a2e5 
>   ivy/libraries.properties 55da6c6 
>   test/org/apache/pig/test/TestHBaseStorage.java cc1efef 
>   test/org/apache/pig/test/TestJobSubmission.java 021662f 
> 
> Diff: https://reviews.apache.org/r/7676/diff/
> 
> 
> Testing
> ---
> 
> ant clean test-commit -Dhadoopversion=20
> ant clean test -Dtestcase=TestHBaseStorage -Dhadoopversion=20
> ant clean test -Dtestcase=TestJobSubmission -Dhadoopversion=20
> 
> I also manually tested pig.jar with hbase 0.90 and 0.94. Once HBASE_HOME and 
> ZOOKEEPER_HOME are set, HBaseStorage works fine with both versions.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>



[jira] [Commented] (PIG-2994) Grunt shortcuts

2012-10-22 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482135#comment-13482135
 ] 

Prasanth J commented on PIG-2994:
-

The reason I thought of using \ instead of another character is that it will be 
easy for postgres users (not sure if any other databases support shell 
shortcuts) to adapt to this convention (a colon shouldn't be hard though). This 
convention will be more useful for viewing/exploring HCat tables, and for 
frequently used diagnostic operators (dump, describe, explain, illustrate). 
Since these operators don't start with t, r, or n, IMHO they will not be 
confused with tab or newline.

> Grunt shortcuts
> ---
>
> Key: PIG-2994
> URL: https://issues.apache.org/jira/browse/PIG-2994
> Project: Pig
>  Issue Type: New Feature
>  Components: grunt
>Reporter: Prasanth J
>Priority: Minor
>
> This feature is aimed at providing shortcuts for frequently used commands 
> like illustrate, dump, explain, describe, quit, help etc. This feature is 
> inspired by postgres (psql) shortcuts. I tried implementing a simple 
> shortcut for quitting the grunt shell using \q with very minimal changes. I 
> think this feature will help save many keystrokes for users. If this feature 
> looks useful I can submit the current patch for review and go ahead with 
> implementing the following shortcuts:
> \i  - illustrate
> \e  - explain
> \de  - describe
> \du  - dump 
> \h - help
> This will also be useful to view information about tables/statistics stored 
> in HCatalog similar to the way psql does. 
> \dt  - display table
> \dm - display metadata
> etc.
> Except for the \t, \r and \n delimiters, we should be able to use all other 
> characters as shortcuts. 
> Please let me know your thoughts.
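
The shortcut list in the description above can be read as a simple prefix-to-command table. The following is a minimal sketch of that idea only; it is not the patch mentioned in the issue and not how Grunt's parser is actually structured. `GruntShortcuts` and `expand` are hypothetical names, and the table covers only the shortcuts whose meaning the description states explicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the proposed patch and not Grunt's real parser.
final class GruntShortcuts {
    private static final Map<String, String> SHORTCUTS = new LinkedHashMap<>();
    static {
        SHORTCUTS.put("\\q", "quit");
        SHORTCUTS.put("\\i", "illustrate");
        SHORTCUTS.put("\\e", "explain");
        SHORTCUTS.put("\\de", "describe");
        SHORTCUTS.put("\\du", "dump");
        SHORTCUTS.put("\\h", "help");
    }

    /** Expands a leading shortcut token; returns the line unchanged if none matches. */
    static String expand(String line) {
        String trimmed = line.trim();
        int space = trimmed.indexOf(' ');
        String head = space < 0 ? trimmed : trimmed.substring(0, space);
        String rest = space < 0 ? "" : trimmed.substring(space);
        String command = SHORTCUTS.get(head);
        return command == null ? line : command + rest;
    }

    public static void main(String[] args) {
        System.out.println(expand("\\du A"));
        System.out.println(expand("\\q"));
    }
}
```

Because lookup is by the whole leading token, \de and \du do not collide with each other, and an unrecognized token such as \t falls through unchanged.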

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-2994) Grunt shortcuts

2012-10-22 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482130#comment-13482130
 ] 

Prashant Kommireddi commented on PIG-2994:
--

Nice, I think this would be useful. Like you said, a backslash (with t, r, n) 
might be confused with the tab and newline delimiters. How about using a colon 
instead?



Re: Review Request: TestJobSumission and TestHBaseStorage don't work with HBase 0.94 and ZK 3.4.3

2012-10-22 Thread Santhosh Srinivasan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7676/#review12676
---


A few comments. It's probably a good idea to have someone with more knowledge of 
HBaseStorage take a second look.


ivy/libraries.properties


Zookeeper-3.4.4 has been out but has a known issue with SASL and Java 1.7.  
Is 3.3.3 required for Hbase 0.94.1 ?



test/org/apache/pig/test/TestJobSubmission.java


Can the commented out code be removed?


- Santhosh Srinivasan





[jira] [Commented] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482096#comment-13482096
 ] 

Rohini Palaniswamy commented on PIG-2999:
-

+1 (non-binding) for the patch. I am also seeing failures. For example, 
TestPigTupleRawComparator tests fail with OOM. I had mentioned to Koji earlier 
that bb.position has to be incremented, then retracted the statement when I 
looked at the code and thought the ByteBuffer was not accessed after that. But 
I was mistaken: it is accessed again in the case of secondary sort.

{code}
result = compareBinInterSedesDatum(bb1, bb2, mAsc);
if (result == 0)
    result = compareBinInterSedesDatum(bb1, bb2, mSecondaryAsc);
{code}

> Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort 
> failing
> -
>
> Key: PIG-2999
> URL: https://issues.apache.org/jira/browse/PIG-2999
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11, 0.12
>Reporter: Koji Noguchi
>Assignee: Jonathan Coveney
> Attachments: pig-2999-v1.txt
>
>
> I think I broke the build with PIG-2975.  I see a couple of tests failing at 
> BinInterSedesTupleRawComparator. 
> {noformat}
> 12/10/22 22:26:15 WARN mapred.LocalJobRunner: job_local_0022
> java.nio.BufferUnderflowException
>   at java.nio.Buffer.nextGetIndex(Buffer.java:478)
>   at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:387)
>   at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinInterSedesDatum(BinInterSedes.java:829)
>   at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinSedesTuple(BinInterSedes.java:732)
>   at org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:695)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSecondaryKeyComparator.compare(PigSecondaryKeyComparator.java:78)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
>   at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:139)
>   at org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>   at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:625)
>   at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
>   at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>   at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
> {noformat}



[jira] [Commented] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482095#comment-13482095
 ] 

Koji Noguchi commented on PIG-2999:
---

bq. and more. 

org.apache.pig.test.TestEvalPipeline.testNestedPlan
org.apache.pig.test.TestEvalPipeline.testNestedPlanWithExpressionAssignment
org.apache.pig.test.TestEvalPipeline.testNestedPlanForCloning
org.apache.pig.test.TestStreamingLocal.testSimpleOrderedReduceSideStreamingAfterFlatten
org.apache.pig.test.TestPigTupleRawComparator.testCompareDataBag
org.apache.pig.test.TestPigTupleRawComparator.compareInnerTuples
org.apache.pig.test.TestPigTupleRawComparator.testCompareCharArray
org.apache.pig.test.TestPigTupleRawComparator.testCompareEquals
org.apache.pig.test.TestForEachNestedPlanLocal.testNestedCrossTwoRelationsComplex
org.apache.pig.test.TestForEachNestedPlanLocal.testInnerOrderBy

Instead of fixing test failures in PIG-2975, I introduced many more failures.  
Sorry guys.



[jira] [Commented] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482088#comment-13482088
 ] 

Koji Noguchi commented on PIG-2999:
---

Thanks Jonathan, Gianmarco. 

bq. Can you post the failing tests that you are seeing?

org.apache.pig.test.TestEvalPipelineLocal.testNestedPlanForCloning 
org.apache.pig.test.TestPruneColumn.testCoGroup7 

and more.  



[jira] [Updated] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2999:
--

Assignee: Jonathan Coveney



[jira] [Updated] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2999:
--

Attachment: pig-2999-v1.txt

During the performance test in PIG-2975, I made a change inside 
compareBinInterSedesDatum for bytearray comparisons but didn't update the 
bytebuffer position after the read, leading to random failures when the 
bytebuffer is accessed afterwards.




[jira] [Commented] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Gianmarco De Francisci Morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482082#comment-13482082
 ] 

Gianmarco De Francisci Morales commented on PIG-2999:
-

Most likely you are correct Jonathan.



[jira] [Commented] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482081#comment-13482081
 ] 

Jonathan Coveney commented on PIG-2999:
---

I have a theory on what it is.

{code}
case BinInterSedes.TINYBYTEARRAY:
case BinInterSedes.SMALLBYTEARRAY:
case BinInterSedes.BYTEARRAY: {
    type1 = DataType.BYTEARRAY;
    type2 = getGeneralizedDataType(dt2);
    if (type1 == type2) {
        int basz1 = readSize(bb1, dt1);
        int basz2 = readSize(bb2, dt2);
        rc = org.apache.hadoop.io.WritableComparator.compareBytes(
            bb1.array(), bb1.position(), basz1,
            bb2.array(), bb2.position(), basz2);
    }
    break;
}
{code}

In the old code, the act of comparing would have advanced the respective 
pointers in the bytebuffers. Now it doesn't. So after the comparison, assuming 
that the two are equal, it will keep going as if the next data type were at the 
next byte (this would explain why the specific line it fails on is in the 
DateTime case, in code that I bet is actually comparing bytearrays).

The solution is to skip the bytebuffers ahead after doing the comparison.

That's my guess, though.
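
To make the idea concrete, here is a minimal, self-contained sketch of "compare, then skip ahead". It is not the code from the attached patch. It assumes a simplified layout (a 4-byte length prefix, the bytes, then a long as the secondary key) rather than the real BinInterSedes encoding, and `CompareAndAdvance`, `compareByteArrays`, `compareBytes` and `record` are hypothetical names; `compareBytes` is a plain stand-in for WritableComparator.compareBytes so the example has no Hadoop dependency.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the idea; not the code from the attached patch.
final class CompareAndAdvance {
    /** Compares two length-prefixed byte arrays and leaves both buffers positioned after them. */
    static int compareByteArrays(ByteBuffer bb1, ByteBuffer bb2) {
        int size1 = bb1.getInt(); // reading the length prefix advances the position
        int size2 = bb2.getInt();
        int rc = compareBytes(bb1.array(), bb1.arrayOffset() + bb1.position(), size1,
                              bb2.array(), bb2.arrayOffset() + bb2.position(), size2);
        // Comparing through array() does not move the position, so skip the bytes explicitly.
        bb1.position(bb1.position() + size1);
        bb2.position(bb2.position() + size2);
        return rc;
    }

    /** Unsigned lexicographic comparison, standing in for WritableComparator.compareBytes. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2;
    }

    /** Builds a buffer holding a length-prefixed byte array followed by a long (assumed layout). */
    static ByteBuffer record(byte[] key, long secondary) {
        ByteBuffer bb = ByteBuffer.allocate(4 + key.length + 8);
        bb.putInt(key.length).put(key).putLong(secondary);
        bb.flip();
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer a = record(new byte[] {1, 2, 3}, 10L);
        ByteBuffer b = record(new byte[] {1, 2, 3}, 20L);
        int primary = compareByteArrays(a, b);
        // Safe only because the positions were advanced past the byte arrays above.
        int secondary = Long.compare(a.getLong(), b.getLong());
        System.out.println("primary=" + primary + " secondary=" + secondary);
    }
}
```

Without the two `position(...)` calls, the `getLong()` in `main` would start reading in the middle of the byte array, which is the kind of misread described above.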



[jira] [Commented] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482079#comment-13482079
 ] 

Jonathan Coveney commented on PIG-2999:
---

Can you post the failing tests that you are seeing?

> Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort 
> failing
> -
>
> Key: PIG-2999
> URL: https://issues.apache.org/jira/browse/PIG-2999
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.11, 0.12
>Reporter: Koji Noguchi
>
> I think I broke the build from PIG-2975.  I see couple of tests failing at 
> BinInterSedesTupleRawComparator. 
> {noformat}
> 12/10/22 22:26:15 WARN mapred.LocalJobRunner: job_local_0022
> java.nio.BufferUnderflowException
>   at java.nio.Buffer.nextGetIndex(Buffer.java:478)
>   at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:387)
>   at 
> org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinInterSedesDatum(BinInterSedes.java:829)
>   at 
> org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinSedesTuple(BinInterSedes.java:732)
>   at 
> org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:695)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSecondaryKeyComparator.compare(PigSecondaryKeyComparator.java:78)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
>   at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:139)
>   at 
> org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
>   at 
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
>   at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
>   at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:625)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
>   at 
> org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>   at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
> {noformat}



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482077#comment-13482077
 ] 

Koji Noguchi commented on PIG-2975:
---

bq. Now I am running all the tests with your fix and going to close jiras as 
soon as I verify that they are fixed.

Shoot.  My test run for this patch finished and I see some new tests failing.  
Opened PIG-2999.


> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
>              Key: PIG-2975
>              URL: https://issues.apache.org/jira/browse/PIG-2975
>          Project: Pig
>       Issue Type: Sub-task
> Affects Versions: 0.11
>         Reporter: Koji Noguchi
>         Assignee: Koji Noguchi
>         Priority: Blocker
>          Fix For: 0.11
>
>      Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Created] (PIG-2999) Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing

2012-10-22 Thread Koji Noguchi (JIRA)
Koji Noguchi created PIG-2999:
-

         Summary: Regression after PIG-2975: BinInterSedesTupleRawComparator secondary sort failing
             Key: PIG-2999
             URL: https://issues.apache.org/jira/browse/PIG-2999
         Project: Pig
      Issue Type: Bug
Affects Versions: 0.11, 0.12
        Reporter: Koji Noguchi


I think I broke the build with PIG-2975.  I see a couple of tests failing in 
BinInterSedesTupleRawComparator. 

{noformat}
12/10/22 22:26:15 WARN mapred.LocalJobRunner: job_local_0022
java.nio.BufferUnderflowException
at java.nio.Buffer.nextGetIndex(Buffer.java:478)
at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:387)
at 
org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinInterSedesDatum(BinInterSedes.java:829)
at 
org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compareBinSedesTuple(BinInterSedes.java:732)
at 
org.apache.pig.data.BinInterSedes$BinInterSedesTupleRawComparator.compare(BinInterSedes.java:695)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSecondaryKeyComparator.compare(PigSecondaryKeyComparator.java:78)
at org.apache.hadoop.mapred.Merger$MergeQueue.lessThan(Merger.java:373)
at org.apache.hadoop.util.PriorityQueue.downHeap(PriorityQueue.java:139)
at 
org.apache.hadoop.util.PriorityQueue.adjustTop(PriorityQueue.java:103)
at 
org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:335)
at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:625)
at 
org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
at 
org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
at 
org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:417)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
{noformat}



[jira] [Updated] (PIG-2828) DataType.compare null

2012-10-22 Thread Haitao Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haitao Yao updated PIG-2828:


Attachment: test.patch

Unit test for DataType.


> DataType.compare null
> -
>
>              Key: PIG-2828
>              URL: https://issues.apache.org/jira/browse/PIG-2828
>          Project: Pig
>       Issue Type: Bug
>         Reporter: Haitao Yao
>      Attachments: DataType.patch, test.patch
>
>
> When using TOP, if the DataBag contains a null value to compare, the 
> following exception is thrown:
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.data.DataType.compare(DataType.java:427)
>   at org.apache.pig.builtin.TOP$TupleComparator.compare(TOP.java:97)
>   at org.apache.pig.builtin.TOP$TupleComparator.compare(TOP.java:1)
>   at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:649)
>   at java.util.PriorityQueue.siftUp(PriorityQueue.java:627)
>   at java.util.PriorityQueue.offer(PriorityQueue.java:329)
>   at java.util.PriorityQueue.add(PriorityQueue.java:306)
>   at org.apache.pig.builtin.TOP.updateTop(TOP.java:141)
>   at org.apache.pig.builtin.TOP.exec(TOP.java:116)
> Code (TOP.java, starting at line 91):
>     Object field1 = o1.get(fieldNum);
>     Object field2 = o2.get(fieldNum);
>     if (!typeFound) {
>         datatype = DataType.findType(field1);
>         typeFound = true;
>     }
>     return DataType.compare(field1, field2, datatype, datatype);
> The reason is that if typeFound is true, datatype is not null, and field1 is 
> null, the script fails.
> So we need to check whether field1 is null.
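
The null check asked for in the PIG-2828 description above could look roughly like the sketch below. It is a standalone illustration, not the attached DataType.patch: the class name NullSafeCompareSketch, the helper compareNullSafe, and the decision to sort nulls before non-null values are all assumptions, and plain Comparable values stand in for Pig's DataType.compare.

```java
import java.util.Arrays;
import java.util.List;

public class NullSafeCompareSketch {
    // Hypothetical helper: compares two fields and treats null as smaller
    // than any non-null value (an assumed ordering, not taken from the patch).
    @SuppressWarnings("unchecked")
    public static int compareNullSafe(Object field1, Object field2) {
        if (field1 == null && field2 == null) {
            return 0;
        }
        if (field1 == null) {
            return -1;
        }
        if (field2 == null) {
            return 1;
        }
        // Both fields are non-null here, so compareTo cannot hit a null receiver.
        return ((Comparable<Object>) field1).compareTo(field2);
    }

    public static void main(String[] args) {
        // A null field no longer causes a NullPointerException during sorting.
        List<Integer> fields = Arrays.asList(3, null, 1);
        fields.sort(NullSafeCompareSketch::compareNullSafe);
        System.out.println(fields);
    }
}
```

With this ordering the null field ends up first. Whether nulls should sort first or last in TOP is a design choice that the issue does not settle.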



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482072#comment-13482072
 ] 

Cheolsoo Park commented on PIG-2975:


Thank you Koji!

Now I am running all the tests with your fix and going to close jiras as soon 
as I verify that they are fixed.




[jira] [Resolved] (PIG-2987) TestCounters fails

2012-10-22 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park resolved PIG-2987.


Resolution: Duplicate

> TestCounters fails
> --
>
>              Key: PIG-2987
>              URL: https://issues.apache.org/jira/browse/PIG-2987
>          Project: Pig
>       Issue Type: Sub-task
>         Reporter: Cheolsoo Park
>          Fix For: 0.11
>
>
> To reproduce:
> {code}
> ant clean test -Dtestcase=TestCounters -Dhadoopversion=20
> {code}
> This fails with the following error:
> {code}
> Testcase: testMultipleMRJobs took 52.073 sec
> FAILED
> expected:<10> but was:<1>
> junit.framework.AssertionFailedError: expected:<10> but was:<1>
> at 
> org.apache.pig.test.TestCounters.testMultipleMRJobs(TestCounters.java:452)
> {code}
> I see the failures with both hadoop-1.0.x and 2.0.x.



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482064#comment-13482064
 ] 

Jonathan Coveney commented on PIG-2975:
---

'twas a delight. On to the next one ;)




[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482063#comment-13482063
 ] 

Koji Noguchi commented on PIG-2975:
---

Thanks Jonathan, Gianmarco and Cheolsoo!  
(and sorry for my long detours :) 




[jira] [Updated] (PIG-2998) Fix TestScriptLangunage

2012-10-22 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2998:
---

Attachment: PIG-2998.patch

Attached is a patch that fixes the failing test.

I changed replaceAll() as follows:
{code}
val = val.replaceAll("(?

> Fix TestScriptLangunage
> ---
>
>              Key: PIG-2998
>              URL: https://issues.apache.org/jira/browse/PIG-2998
>          Project: Pig
>       Issue Type: Sub-task
>         Reporter: Cheolsoo Park
>         Assignee: Cheolsoo Park
>          Fix For: 0.11
>
>      Attachments: PIG-2998.patch
>
>
> This is a regression from PIG-2931.
> I made changes so that $ signs in a replacement string get escaped by 
> PreprocessorContext. But they shouldn't be escaped if they're already escaped.
> In particular, TestScriptLanguage#bindLocalVariableTest2 is failing in trunk 
> because $ signs are escaped by Pig#bind() and then escaped again by 
> PreprocessorContext.
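
The double-escaping problem described above can be illustrated with a small sketch. The expression from the actual patch is truncated in this archive, so what follows is only an assumed approach: a negative lookbehind that escapes a $ sign only when it is not already preceded by a backslash. The class name DollarEscapeSketch and the helper escapeUnescapedDollars are hypothetical and are not part of Pig.

```java
import java.util.regex.Matcher;

public class DollarEscapeSketch {
    // Hypothetical helper: puts a backslash in front of every '$' that is not
    // already preceded by one, so an existing "\$" is left untouched.
    public static String escapeUnescapedDollars(String val) {
        // The regex (?<!\\)\$ matches a literal '$' not preceded by a backslash.
        // Matcher.quoteReplacement keeps the replacement "\$" literal, because
        // both '\' and '$' are special in replacement strings.
        return val.replaceAll("(?<!\\\\)\\$", Matcher.quoteReplacement("\\$"));
    }

    public static void main(String[] args) {
        // An unescaped '$' gains a backslash.
        System.out.println(escapeUnescapedDollars("cost is $5"));
        // An already escaped '$' is not escaped a second time.
        System.out.println(escapeUnescapedDollars("cost is \\$5"));
    }
}
```

One limitation of this sketch: a $ that follows an escaped backslash (two backslashes, then $) is also treated as already escaped, which may or may not be what PreprocessorContext needs.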



[jira] [Updated] (PIG-2998) Fix TestScriptLangunage

2012-10-22 Thread Cheolsoo Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-2998:
---

Status: Patch Available  (was: Open)




[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482057#comment-13482057
 ] 

Jonathan Coveney commented on PIG-2975:
---

Thanks for the great job, Koji. It's in!




[jira] [Resolved] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney resolved PIG-2975.
---

Resolution: Fixed




[jira] [Commented] (PIG-2982) add unit tests for DateTime type that test setting timezone

2012-10-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482047#comment-13482047
 ] 

Thejas M Nair commented on PIG-2982:


+1. Will commit after running tests.


> add unit tests for DateTime type that test setting timezone
> ---
>
>              Key: PIG-2982
>              URL: https://issues.apache.org/jira/browse/PIG-2982
>          Project: Pig
>       Issue Type: Test
>         Reporter: Thejas M Nair
>         Assignee: Zhijie Shen
>          Fix For: 0.11
>
>      Attachments: PIG-2982.patch
>
>
> The default timezone can be set for the new DateTime type. We need to add 
> unit tests that test this functionality. 



[jira] [Commented] (PIG-2941) Ivy resolvers in pig don't have consistent chaining and don't have a kitchen sink option for novices

2012-10-22 Thread John Gordon (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482026#comment-13482026
 ] 

John Gordon commented on PIG-2941:
--

Thank you!


> Ivy resolvers in pig don't have consistent chaining and don't have a kitchen 
> sink option for novices
> 
>
>              Key: PIG-2941
>              URL: https://issues.apache.org/jira/browse/PIG-2941
>          Project: Pig
>       Issue Type: Bug
>       Components: build
> Affects Versions: 0.10.0
>         Reporter: John Gordon
>         Assignee: John Gordon
>          Fix For: 0.12
>
>      Attachments: 0001-IvySettings.xml-refactor-to-simplify-resolution.patch, 
> PIG-2941.trunk.002.patch
>
>
> The Ivy resolvers in Pig are split into default, external, and internal -- 
> and they are all actually distinct.  There isn't a resolver that rolls over 
> all three, and fallbacks aren't in place.  Ideally, these resolvers should 
> chain right through, with the default following a best-practice fallback for 
> novices.



[jira] [Commented] (PIG-2328) Add builtin UDFs for building and using bloom filters

2012-10-22 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482010#comment-13482010
 ] 

Daniel Dai commented on PIG-2328:
-

They are piggybank UDFs, for which we have javadoc but no Forrest doc.

> Add builtin UDFs for building and using bloom filters
> -
>
>              Key: PIG-2328
>              URL: https://issues.apache.org/jira/browse/PIG-2328
>          Project: Pig
>       Issue Type: New Feature
>       Components: internal-udfs
>         Reporter: Alan Gates
>         Assignee: Alan Gates
>          Fix For: 0.10.0, 0.11
>
>      Attachments: PIG-bloom-2.patch, PIG-bloom-3.patch, PIG-bloom.patch
>
>
> Bloom filters are a common way to select a limited set of records before 
> moving data for a join or other heavyweight operation.  Pig should add UDFs 
> to support building and using bloom filters.



[jira] [Updated] (PIG-2941) Ivy resolvers in pig don't have consistent chaining and don't have a kitchen sink option for novices

2012-10-22 Thread Gianmarco De Francisci Morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gianmarco De Francisci Morales updated PIG-2941:


       Resolution: Fixed
    Fix Version/s:     (was: 0.10.0)
                   0.12
           Status: Resolved  (was: Patch Available)

+1
Committed to trunk.

Thanks John!




[jira] Subscription: PIG patch available

2012-10-22 Thread jira
Issue Subscription
Filter: PIG patch available (35 issues)

Subscriber: pigdaily

Key         Summary
PIG-2990    the -secretDebugCmd shouldn't be a secret and should just be...a command
            https://issues.apache.org/jira/browse/PIG-2990
PIG-2978    TestLoadStoreFuncLifeCycle fails with hadoop-2.0.x
            https://issues.apache.org/jira/browse/PIG-2978
PIG-2973    TestStreaming test times out
            https://issues.apache.org/jira/browse/PIG-2973
PIG-2968    ColumnMapKeyPrune fails to prune a subtree inside foreach
            https://issues.apache.org/jira/browse/PIG-2968
PIG-2960    Increase the timeout for unit test
            https://issues.apache.org/jira/browse/PIG-2960
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2957    TetsScriptUDF fail due to volume prefix in jar
            https://issues.apache.org/jira/browse/PIG-2957
PIG-2956    Invalid cache specification for some streaming statement
            https://issues.apache.org/jira/browse/PIG-2956
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2954    TestParamSubPreproc still depends on "bash" to run
            https://issues.apache.org/jira/browse/PIG-2954
PIG-2953    "which" utility does not exist on Windows
            https://issues.apache.org/jira/browse/PIG-2953
PIG-2942    DevTests, TestLoad has a false failure on Windows
            https://issues.apache.org/jira/browse/PIG-2942
PIG-2941    Ivy resolvers in pig don't have consistent chaining and don't have a kitchen sink option for novices
            https://issues.apache.org/jira/browse/PIG-2941
PIG-2904    Scripting UDFs should allow DEFINE statements to pass parameters to the UDF's constructor
            https://issues.apache.org/jira/browse/PIG-2904
PIG-2898    Parallel execution of e2e tests
            https://issues.apache.org/jira/browse/PIG-2898
PIG-2885    TestJobSumission and TestHBaseStorage don't work with HBase 0.94 and ZK 3.4.3
            https://issues.apache.org/jira/browse/PIG-2885
PIG-2881    Add SUBTRACT eval function
            https://issues.apache.org/jira/browse/PIG-2881
PIG-2873    Converting bin/pig shell script to python
            https://issues.apache.org/jira/browse/PIG-2873
PIG-2834    MultiStorage requires unused constructor argument
            https://issues.apache.org/jira/browse/PIG-2834
PIG-2824    Pushing checking number of fields into LoadFunc
            https://issues.apache.org/jira/browse/PIG-2824
PIG-2801    grunt "sh" command should invoke the shell implicitly instead of calling exec directly with the command tokens
            https://issues.apache.org/jira/browse/PIG-2801
PIG-2799    Update pig streaming interface to run correctly on Windows without Cygwin
            https://issues.apache.org/jira/browse/PIG-2799
PIG-2798    pig streaming tests assume interpreters are auto-resolved
            https://issues.apache.org/jira/browse/PIG-2798
PIG-2796    Local temporary paths are not always valid HDFS path names.
            https://issues.apache.org/jira/browse/PIG-2796
PIG-2795    Fix test cases that generate pig scripts with "load " + pathStr to encode "\" in the path
            https://issues.apache.org/jira/browse/PIG-2795
PIG-2661    Pig uses an extra job for loading data in Pigmix L9
            https://issues.apache.org/jira/browse/PIG-2661
PIG-2657    Print warning if using wrong jython version
            https://issues.apache.org/jira/browse/PIG-2657
PIG-2495    Using merge JOIN from a HBaseStorage produces an error
            https://issues.apache.org/jira/browse/PIG-2495
PIG-2417    Streaming UDFs -  allow users to easily write UDFs in scripting languages with no JVM implementation.
            https://issues.apache.org/jira/browse/PIG-2417
PIG-2405    svn tags/release-0.9.1: some unit test case failed with open JDK
            https://issues.apache.org/jira/browse/PIG-2405
PIG-2362    Rework Ant build.xml to use macrodef instead of antcall
            https://issues.apache.org/jira/browse/PIG-2362
PIG-2312    NPE when relation and column share the same name and used in Nested Foreach
            https://issues.apache.org/jira/browse/PIG-2312
PIG-1942    script UDF (jython) should utilize the intended output schema to more directly convert Py objects to Pig objects
            https://issues.apache.org/jira/browse/PIG-1942
PIG-1431    Current DateTime UDFs: ISONOW(), UNIXNOW()
            https://issues.apache.org/jira/browse/PIG-1431
PIG-1237    Piggybank MutliStorage - specify field to write in output
            https://issues.apache.org/jira/browse/PIG-1237

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384


[jira] [Resolved] (PIG-2993) Fix local mode on Hadoop-0.23

2012-10-22 Thread Gianmarco De Francisci Morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gianmarco De Francisci Morales resolved PIG-2993.
-

   Resolution: Duplicate
Fix Version/s: (was: 0.11)

Thanks for the walkthrough.
Indeed, Pig was picking up the Hadoop installed on my machine.
All the rest is as you described.

Closing as duplicate.

> Fix local mode on Hadoop-0.23
> -
>
>              Key: PIG-2993
>              URL: https://issues.apache.org/jira/browse/PIG-2993
>          Project: Pig
>       Issue Type: Sub-task
>         Reporter: Gianmarco De Francisci Morales
>
> When compiling with -Dhadoopversion=23 and launching Pig in local mode (-x 
> local) the shell just fills up with error notifications:
> {code}
> 2012-10-19 15:10:17,360 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 
> 2998: Unhandled internal error. Could not initialize class 
> org.apache.pig.tools.pigstats.PigStatsUtil
> {code}
> Here is the stack trace:
> {code}
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. 
> org/apache/hadoop/mapreduce/task/JobContextImpl
> java.lang.NoClassDefFoundError: 
> org/apache/hadoop/mapreduce/task/JobContextImpl
> at 
> org.apache.pig.tools.pigstats.PigStatsUtil.<clinit>(PigStatsUtil.java:54)
> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
> at org.apache.pig.Main.run(Main.java:538)
> at org.apache.pig.Main.main(Main.java:154)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.mapreduce.task.JobContextImpl
> at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> ... 9 more
> 
> Pig Stack Trace
> ---
> ERROR 2998: Unhandled internal error. Could not initialize class 
> org.apache.pig.tools.pigstats.PigStatsUtil
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.pig.tools.pigstats.PigStatsUtil
> at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:67)
> at org.apache.pig.Main.run(Main.java:538)
> at org.apache.pig.Main.main(Main.java:154)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> 
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: PROPOSAL: how to handle release documentation going forward

2012-10-22 Thread Jonathan Coveney
As someone who chronically under-documents, I think that this is a good
idea. +1

2012/10/22 Olga Natkovich 

> Hi,
>
> Since we lost the dedicated document writer for Pig, would it make sense
> to require that, going forward (0.12 and beyond), documentation updates
> are included in the patch together with code changes and tests? I think
> that should work for most features/updates except perhaps big items that
> might require more than one JIRA to be completed before documentation
> changes make sense.
>
> Comments?
>
> Olga
>


[jira] [Commented] (PIG-2328) Add builtin UDFs for building and using bloom filters

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481948#comment-13481948
 ] 

Olga Natkovich commented on PIG-2328:
-

Looks like the change made it into 0.10, but what about documentation? I could 
not find it in builtins, but I just want to make sure it was not put in some 
other place.

> Add builtin UDFs for building and using bloom filters
> -
>
> Key: PIG-2328
> URL: https://issues.apache.org/jira/browse/PIG-2328
> Project: Pig
>  Issue Type: New Feature
>  Components: internal-udfs
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 0.10.0, 0.11
>
> Attachments: PIG-bloom-2.patch, PIG-bloom-3.patch, PIG-bloom.patch
>
>
> Bloom filters are a common way to select a limited set of records before 
> moving data for a join or other heavyweight operation.  Pig should add UDFs 
> to support building and using bloom filters.
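
For readers unfamiliar with the data structure, here is a minimal sketch of a Bloom filter in Java. It is not the code from the attached patches and not Pig's builtin UDF API; the class name TinyBloomFilter, its parameters, and the double-hashing scheme are assumptions chosen for illustration only.

```java
import java.util.BitSet;

// Hypothetical illustration of a Bloom filter; not taken from the PIG-2328 patches.
class TinyBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int numHashes;  // number of bit positions set per key

    public TinyBloomFilter(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit position from two hash values (double hashing).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // force an odd step so positions differ
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // false means "definitely absent"; true means "possibly present".
    public boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

In a join, one would typically build the filter from the keys of the smaller input and drop records from the larger input whose keys the filter reports as definitely absent, so fewer records need to be moved.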



[jira] [Commented] (PIG-2980) documentation for DateTime datatype

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481938#comment-13481938
 ] 

Olga Natkovich commented on PIG-2980:
-

Sounds good. Zhijie, please, re-assign to me once you provide the information, 
thanks!

> documentation for DateTime datatype
> ---
>
> Key: PIG-2980
> URL: https://issues.apache.org/jira/browse/PIG-2980
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thejas M Nair
>Assignee: Zhijie Shen
> Fix For: 0.11
>
>
> Documentation for new DateTime type needs to be added.



[jira] [Assigned] (PIG-2980) documentation for DateTime datatype

2012-10-22 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2980:
---

Assignee: Zhijie Shen

> documentation for DateTime datatype
> ---
>
> Key: PIG-2980
> URL: https://issues.apache.org/jira/browse/PIG-2980
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thejas M Nair
>Assignee: Zhijie Shen
> Fix For: 0.11
>
>
> Documentation for new DateTime type needs to be added.



[jira] [Commented] (PIG-2980) documentation for DateTime datatype

2012-10-22 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481936#comment-13481936
 ] 

Thejas M Nair commented on PIG-2980:


Olga,
Zhijie is planning to work on this. If you can help with the formatting, that 
would be great!

> documentation for DateTime datatype
> ---
>
> Key: PIG-2980
> URL: https://issues.apache.org/jira/browse/PIG-2980
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thejas M Nair
> Fix For: 0.11
>
>
> Documentation for new DateTime type needs to be added.



PROPOSAL: how to handle release documentation going forward

2012-10-22 Thread Olga Natkovich
Hi,

Since we lost the dedicated document writer for Pig, would it make sense to 
require that, going forward (0.12 and beyond), documentation updates are 
included in the patch together with code changes and tests? I think that 
should work for most features/updates except perhaps big items that might 
require more than one JIRA to be completed before documentation changes make 
sense.

Comments?

Olga


[jira] [Commented] (PIG-2980) documentation for DateTime datatype

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481927#comment-13481927
 ] 

Olga Natkovich commented on PIG-2980:
-

Who would be a good person to provide content? I would be happy to create and 
commit the patch.

> documentation for DateTime datatype
> ---
>
> Key: PIG-2980
> URL: https://issues.apache.org/jira/browse/PIG-2980
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Reporter: Thejas M Nair
> Fix For: 0.11
>
>
> Documentation for new DateTime type needs to be added.



[jira] [Updated] (PIG-1314) Add DateTime Support to Pig

2012-10-22 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1314:


Fix Version/s: 0.11

> Add DateTime Support to Pig
> ---
>
> Key: PIG-1314
> URL: https://issues.apache.org/jira/browse/PIG-1314
> Project: Pig
>  Issue Type: Bug
>  Components: data
>Affects Versions: 0.7.0
>Reporter: Russell Jurney
>Assignee: Zhijie Shen
>  Labels: gsoc2012
> Fix For: 0.11
>
> Attachments: joda_vs_builtin.zip, PIG-1314-1.patch, PIG-1314-2.patch, 
> PIG-1314-3.patch, PIG-1314-4.patch, PIG-1314-5.patch, PIG-1314-6.patch, 
> PIG-1314-7.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Hadoop/Pig are primarily used to parse log data, and most logs have a 
> timestamp component.  Therefore Pig should support dates as a primitive.
> Can someone familiar with adding types to pig comment on how hard this is?  
> We're looking at doing this, rather than use UDFs.  Is this a patch that 
> would be accepted?
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012



[jira] [Assigned] (PIG-2756) Documentation for 0.11

2012-10-22 Thread Olga Natkovich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich reassigned PIG-2756:
---

Assignee: Olga Natkovich

> Documentation for 0.11
> --
>
> Key: PIG-2756
> URL: https://issues.apache.org/jira/browse/PIG-2756
> Project: Pig
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.11
>Reporter: Bill Graham
>Assignee: Olga Natkovich
> Fix For: 0.11
>
>
> Tracking areas where we need documentation on the pig.apache.org site 
> (Javadocs are typically pretty good). We can open child tasks as needed. 
> Please add to the list if you know of others.
> * Pluggable {{PigProgressNotificationListener}} isn't in the docs
> * Pluggable reducer estimators (see PIG-2574)
> * ILLUSTRATE seems to have dropped off the docs
> * {{HBaseStorage}} (see PIG-2341)



[jira] [Commented] (PIG-2995) Refactor unit test temporary file allocation patterns to use FileLocalizer.getTemporaryPath

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481900#comment-13481900
 ] 

Jonathan Coveney commented on PIG-2995:
---

I fully support this effort, and making the tests all follow best practices in 
general. Some of them are a mess!

> Refactor unit test temporary file allocation patterns to use 
> FileLocalizer.getTemporaryPath
> ---
>
> Key: PIG-2995
> URL: https://issues.apache.org/jira/browse/PIG-2995
> Project: Pig
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.10.0
>Reporter: John Gordon
>Priority: Minor
>
> Pig unit tests have a lot of diverse patterns for temporary file allocation.  
> Not all of them are best practices.  There is an abstraction that could house 
> best practices for test temporary file allocation -- 
> FileLocalizer.getTemporaryPath.  With this, we should be able to have 
> all/most of the temporary file usage fall under just a few methods that can 
> handle arbitrary pig contexts and provide more flexibility around testing pig 
> with different fs implementations.



[jira] [Updated] (PIG-1431) Current DateTime UDFs: ISONOW(), UNIXNOW()

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-1431:
--

Fix Version/s: 0.12
Affects Version/s: (was: 0.7.0)
   Status: Patch Available  (was: Open)

> Current DateTime UDFs: ISONOW(), UNIXNOW()
> --
>
> Key: PIG-1431
> URL: https://issues.apache.org/jira/browse/PIG-1431
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Reporter: Russell Jurney
>Assignee: Jonathan Coveney
>  Labels: datetime, now, simple, udf
> Fix For: 0.12
>
> Attachments: PIG-1431-0.patch
>
>
> Need a NOW() for getting datetime diffs between now and a prior or future 
> date.  Will use the system timezone.  Will make one for ISO datetime and one 
> for Unix time.



[jira] [Updated] (PIG-1431) Current DateTime UDFs: ISONOW(), UNIXNOW()

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-1431:
--

Attachment: PIG-1431-0.patch

I bet you never thought you'd get a patch for this, Russell :) Now that we have 
a DateTime datatype in Pig, it seems totally reasonable to have a NOW(). The 
time will be the default DateTime object (unix epoch, iso chronology), and it 
will be _as of the moment the object is created on the front-end_. All values 
from the same instantiation of NOW() will be equal, though we should add tests 
etc. to make sure this is the case.

I whipped this up quickly to check whether the strategy I had in mind would 
work. It did (as far as I can tell).
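
The "fixed at creation" behaviour described above can be sketched in plain Java. This is not the code in PIG-1431-0.patch and it does not use Pig's EvalFunc API; the class FrozenNow and its exec() method are hypothetical names, and the timestamp is a plain epoch-millisecond long rather than a DateTime.

```java
// Hypothetical sketch: the timestamp is captured once, when the object is
// constructed, and every later call returns that same value.
class FrozenNow {
    private final long createdAtMillis;

    // Capture the current time at construction (the "front-end" moment).
    public FrozenNow() {
        this(System.currentTimeMillis());
    }

    // Allows a fixed instant to be injected, which makes testing easier.
    public FrozenNow(long createdAtMillis) {
        this.createdAtMillis = createdAtMillis;
    }

    // Every call returns the captured instant, not the time of the call.
    public long exec() {
        return createdAtMillis;
    }
}
```

Because the value is stored in the object rather than computed per call, every record processed with the same instance sees the same instant.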

> Current DateTime UDFs: ISONOW(), UNIXNOW()
> --
>
> Key: PIG-1431
> URL: https://issues.apache.org/jira/browse/PIG-1431
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Reporter: Russell Jurney
>Assignee: Jonathan Coveney
>  Labels: datetime, now, simple, udf
> Fix For: 0.12
>
> Attachments: PIG-1431-0.patch
>
>
> Need a NOW() for getting datetime diffs between now and a prior or future 
> date.  Will use the system timezone.  Will make one for ISO datetime and one 
> for Unix time.



[jira] [Assigned] (PIG-1431) Current DateTime UDFs: ISONOW(), UNIXNOW()

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney reassigned PIG-1431:
-

Assignee: Jonathan Coveney

> Current DateTime UDFs: ISONOW(), UNIXNOW()
> --
>
> Key: PIG-1431
> URL: https://issues.apache.org/jira/browse/PIG-1431
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.7.0
>Reporter: Russell Jurney
>Assignee: Jonathan Coveney
>  Labels: datetime, now, simple, udf
>
> Need a NOW() for getting datetime diffs between now and a prior or future 
> date.  Will use the system timezone.  Will make one for ISO datetime and one 
> for Unix time.



[jira] [Commented] (PIG-2904) Scripting UDFs should allow DEFINE statements to pass parameters to the UDF's constructor

2012-10-22 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481885#comment-13481885
 ] 

Julien Le Dem commented on PIG-2904:


Hi Cheolsoo
I will look into this

> Scripting UDFs should allow DEFINE statements to pass parameters to the UDF's 
> constructor
> -
>
> Key: PIG-2904
> URL: https://issues.apache.org/jira/browse/PIG-2904
> Project: Pig
>  Issue Type: New Feature
>Reporter: Julien Le Dem
>Assignee: Cheolsoo Park
> Attachments: PIG-2904.patch
>
>




[jira] [Resolved] (PIG-2074) When passing a parameter to Pig, if the value contains $ it has to be escaped because of a bug in PreprocessorContext

2012-10-22 Thread Julien Le Dem (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Le Dem resolved PIG-2074.


Resolution: Duplicate

> When passing a parameter to Pig, if the value contains $ it has to be escaped 
> because of a bug in PreprocessorContext
> --
>
> Key: PIG-2074
> URL: https://issues.apache.org/jira/browse/PIG-2074
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Reporter: Julien Le Dem
>Priority: Minor
>
> This was raised while looking at PIG-1827
> There seems to be a bug in PreprocessorContext:
> http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/tools/parameters/PreprocessorContext.java?view=markup
> {code}
> 235   //String litVal = Matcher.quoteReplacement(val);
> 236   replaced_line = replaced_line.replaceFirst("\\$"+key, val);
> {code}
> The replacement (2nd) parameter of replaceFirst is not a plain string: it can 
> contain references to the matched pattern, like "$0", so any $ in val must 
> be escaped.
> Does anyone know why line 235 is commented out?
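
The behaviour in question can be reproduced outside Pig. The sketch below is not the PreprocessorContext code; the class DollarReplacementDemo and its substitute() helper are hypothetical, and only String.replaceFirst, Pattern.quote and Matcher.quoteReplacement are real JDK calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo of why the replacement string must be quoted.
class DollarReplacementDemo {
    // Replace the first occurrence of $key in line with val, taken literally.
    public static String substitute(String line, String key, String val) {
        String pattern = "\\$" + Pattern.quote(key);
        // quoteReplacement escapes '$' and '\' so they are not read as
        // group references or escape characters in the replacement.
        return line.replaceFirst(pattern, Matcher.quoteReplacement(val));
    }

    public static void main(String[] args) {
        String line = "b = filter a by $FILTER;";
        String val = "($0 == 'a')";

        // Unquoted: "$0" is read as "the whole match", i.e. "$FILTER".
        System.out.println(line.replaceFirst("\\$FILTER", val));
        // prints: b = filter a by ($FILTER == 'a');

        // Quoted: the value is inserted as-is.
        System.out.println(substitute(line, "FILTER", val));
        // prints: b = filter a by ($0 == 'a');
    }
}
```

Quoting at the point of substitution keeps the escaping in one place, so callers can pass parameter values without pre-escaping them.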



[jira] [Updated] (PIG-2990) the -secretDebugCmd shouldn't be a secret and should just be...a command

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2990:
--

Status: Patch Available  (was: Open)

> the -secretDebugCmd shouldn't be a secret and should just be...a command
> 
>
> Key: PIG-2990
> URL: https://issues.apache.org/jira/browse/PIG-2990
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
>Priority: Minor
>  Labels: newbie
> Attachments: PIG-2990-0.patch
>
>
> It's a useful command, and it's weird that it's not in -help



[jira] [Assigned] (PIG-2990) the -secretDebugCmd shouldn't be a secret and should just be...a command

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney reassigned PIG-2990:
-

Assignee: Jonathan Coveney

> the -secretDebugCmd shouldn't be a secret and should just be...a command
> 
>
> Key: PIG-2990
> URL: https://issues.apache.org/jira/browse/PIG-2990
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
>Priority: Minor
>  Labels: newbie
> Attachments: PIG-2990-0.patch
>
>
> It's a useful command, and it's weird that it's not in -help



[jira] [Updated] (PIG-2990) the -secretDebugCmd shouldn't be a secret and should just be...a command

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2990:
--

Attachment: PIG-2990-0.patch

> the -secretDebugCmd shouldn't be a secret and should just be...a command
> 
>
> Key: PIG-2990
> URL: https://issues.apache.org/jira/browse/PIG-2990
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Jonathan Coveney
>Priority: Minor
>  Labels: newbie
> Attachments: PIG-2990-0.patch
>
>
> It's a useful command, and it's weird that it's not in -help



Re: Documentation planning for Pig 0.11 release

2012-10-22 Thread Olga Natkovich
Thanks, Bill!

 
- Original Message -
From: Bill Graham 
To: dev@pig.apache.org; Olga Natkovich 
Cc: 
Sent: Monday, October 22, 2012 1:57 PM
Subject: Re: Documentation planning for Pig 0.11 release

Hi Olga,

This is a great list, thanks for putting it together. Here are 3 more
issues pulled from PIG-2756:

Better HBaseStorage documentation:
https://issues.apache.org/jira/browse/PIG-2341
DateTime data type: https://issues.apache.org/jira/browse/PIG-2980
ILLUSTRATE seems to have dropped off the docs (no JIRA afaik)

thanks,
Bill

On Mon, Oct 22, 2012 at 11:50 AM, Olga Natkovich wrote:

> Hi,
>
> I have gone through the resolved JIRAs for 0.11, and here is what I
> believe needs to go into the documentation. Please let me know if I missed
> anything. Also, I have not looked at anything that has not yet been
> committed:
>
> Bloom filter UDF: https://issues.apache.org/jira/browse/PIG-2328
> Clear command in Grunt: https://issues.apache.org/jira/browse/PIG-2706 -
> this is already in the docs
> RANK operator: https://issues.apache.org/jira/browse/PIG-2353 - this is
> already in docs
> UDF convenience classes: https://issues.apache.org/jira/browse/PIG-2547
> More efficient tuple support:
> https://issues.apache.org/jira/browse/PIG-2359
> Pluggable progress notification:
> https://issues.apache.org/jira/browse/PIG-2525
> Merge join after ORDER BY: https://issues.apache.org/jira/browse/PIG-2673
> Measure time spent in UDF: https://issues.apache.org/jira/browse/PIG-2855
> Storage func improvements: https://issues.apache.org/jira/browse/PIG-1891
> UDFs to flatten bags: https://issues.apache.org/jira/browse/PIG-2166
> Make Tuple iterable: https://issues.apache.org/jira/browse/PIG-2724
> New accumulate interface: https://issues.apache.org/jira/browse/PIG-2651
> RUBY UDF: https://issues.apache.org/jira/browse/PIG-2317 . Looks like
> this is also in 0.10. Was documentation for this committed to 0.10?
> Re-aliasing: https://issues.apache.org/jira/browse/PIG-438 . Looks like
> this is also in 0.10. Was documentation for this committed to 0.10?
> Groovy UDFs: https://issues.apache.org/jira/browse/PIG-2763 Docs already
> committed
> Native cube operator: https://issues.apache.org/jira/browse/PIG-2710 -
> docs at:  http://goo.gl/SpUad
> Better map support: https://issues.apache.org/jira/browse/PIG-2600 - This
> needs release notes to include in docs.
>
> Olga
>



-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*



[jira] [Commented] (PIG-2937) generated field in nested foreach does not inherit the variable name as the field name

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481841#comment-13481841
 ] 

Jonathan Coveney commented on PIG-2937:
---

This is definitely important and useful.

To my eye, the way that this should work is that in any case where you don't 
have a schema (in this case, generated_field inside of the GENERATE) we should 
do our best to fill it in. In the case of a binary conditional, etc., we know 
the return type, so that gives us the type, and the variable name (i.e. 
generated_field) would give us the field name.

I think that this is not a deep change, but it is a tricky one, because it 
requires Pig to thread through Schema information that isn't currently 
threaded through.

> generated field in nested foreach does not inherit the variable name as the 
> field name
> --
>
> Key: PIG-2937
> URL: https://issues.apache.org/jira/browse/PIG-2937
> Project: Pig
>  Issue Type: Bug
>Reporter: Feng Peng
>
> {code}
> raw_data = load 'xyz' using Loader() as (field_a, field_b, field_c);
> records = foreach raw_data {
>   generated_field = (field_a is null ? '-' : someUDF(field_b)); 
>   GENERATE
> field_c,
> generated_field
>   ;
> }
> describe records;
> {code}
> One would expect the generated_field to have a field name, similar to the 
> field_c that is from the original relation. However, Pig currently doesn't 
> assign the field name by default. It'd be nice if we can assign the variable 
> name as the default field name. 



[jira] [Created] (PIG-2998) Fix TestScriptLanguage

2012-10-22 Thread Cheolsoo Park (JIRA)
Cheolsoo Park created PIG-2998:
--

 Summary: Fix TestScriptLanguage
 Key: PIG-2998
 URL: https://issues.apache.org/jira/browse/PIG-2998
 Project: Pig
  Issue Type: Sub-task
Reporter: Cheolsoo Park
Assignee: Cheolsoo Park
 Fix For: 0.11


This is a regression from PIG-2931.

I made changes so that $ signs in a replacement string get escaped by 
PreprocessorContext. But they shouldn't be escaped if they're already escaped.

In particular, TestScriptLanguage#bindLocalVariableTest2 is failing in trunk 
because $ signs are escaped by Pig#bind() and then escaped again by 
PreprocessorContext.
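
One way to avoid the double escaping described above is to make the escaping step idempotent. The following is a sketch under that assumption, not the actual PIG-2998 patch; the class EscapeOnce and the method escapeDollarsOnce are hypothetical names.

```java
// Hypothetical sketch: escape '$' only where it is not already escaped.
class EscapeOnce {
    public static String escapeDollarsOnce(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                // Existing escape sequence: copy the backslash and the
                // character it escapes without changing either.
                out.append(c).append(s.charAt(++i));
            } else if (c == '$') {
                // Bare dollar sign: add the escaping backslash.
                out.append('\\').append('$');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this approach, a value that was already escaped by an earlier step passes through unchanged, while a raw value still gets its $ signs escaped.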



Re: Pig 0.11

2012-10-22 Thread Olga Natkovich
There are still 76 unresolved JIRAs, more than half of them unassigned. Let's 
clean this up by the end of this week. I propose we do the following:
 
(1) Unlink all JIRAs for new features. Since we have already branched, we 
should not be taking on new work. If people feel strongly that some new 
features still need to go in, please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed, please take ownership. If you are unable to solve them but 
still feel they are important, please bring them up.
(3) Owners of unresolved issues, please check whether you will have time to 
solve them in the next 2 weeks. If not, let's move them to 0.12. If you can't 
address them but feel they are important, please bring it up.
 
Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.
 
Thanks for your help!
 
Olga



From: Alan Gates 
To: dev@pig.apache.org 
Sent: Monday, October 15, 2012 11:55 AM
Subject: Re: Pig 0.11

At this point no one has taken on release documentation for 0.11.

Alan.

On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:

> Thanks!
>  
> Are you talking about items 15 and 16 on the How To Release.Publish  page? 
>  
> Also, who is doing release documentation these days? I can help with that as 
> well. I would also be happy to roll the release if you guys need help with 
> that.
>  
> Olga
> 
> 
> 
> From: Dmitriy Ryaboy 
> To: "dev@pig.apache.org"  
> Cc: "dev@pig.apache.org"  
> Sent: Friday, October 12, 2012 5:59 PM
> Subject: Re: Pig 0.11
> 
> Thanks Olga and welcome back! 
> I know there's some process for linking jiras to releases, but I'm not sure 
> what that is. If you could explain and maybe cover a portion of that work, 
> that'd be super helpful. And reviews, of course. 
> 
> On Oct 12, 2012, at 2:06 PM, Olga Natkovich  wrote:
> 
>> Dmitry, I would be happy to help with the release process. Want to get back 
>> into this now that I am back at work. Let me know what you would like me to 
>> do.
>>  
>> Olga
>> 
>> 
>> 
>> 
>> From: Dmitriy Ryaboy 
>> To: dev@pig.apache.org 
>> Cc: billgra...@gmail.com 
>> Sent: Thursday, October 11, 2012 2:44 PM
>> Subject: Re: Pig 0.11
>> 
>> Ok I will branch 0.11 tomorrow morning unless someone objects.
>> From then on, committers should be careful to commit bug fixes to both
>> 0.11 branch and trunk; minor polish can go into the branch, but whole
>> new features should not (we can discuss on the list if something is in
>> the gray area).
>> 
>> D
>> 
>> On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
>>  wrote:
>>> I added it as a dependency as it has already its own Jira.
>>> I hope it is OK.
>>> 
>>> Cheers,
>>> --
>>> Gianmarco
>>> 
>>> 
>>> 
>>> On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham  wrote:
>>> 
>>>> +1 for me.
>>>> 
>>>> There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
>>>> documentation issues that should block Pig 0.11, but they can also be done
>>>> on the trunk and merged to the branch. Gianmarco, you can add a rank
>>>> subtask there to serve as a reminder.
>>>> 
>>>> 
>>>> On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales <
>>>> g...@apache.org> wrote:
>>>> 
>>>>> We are missing some documentation on the RANK but I guess we could add
>>>>> that to the branch and trunk in parallel.
>>>>> All the patches I was keeping an eye on are in.
>>>>> 
>>>>> So +1 for me.
>>>>> --
>>>>> Gianmarco
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney wrote:
>>>>> 
>>>>>> I think all of the major patches are in, no? Now it's just bug testing?
>>>>>> Just wanted to touch base on where we are at with this.
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> *Note that I'm no longer using my Yahoo! email address. Please email me at
>>>> billgra...@gmail.com going forward.*

[jira] [Commented] (PIG-2931) $ signs in the replacement string make parameter substitution fail

2012-10-22 Thread Cheolsoo Park (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481819#comment-13481819
 ] 

Cheolsoo Park commented on PIG-2931:


Hi Julien,

Thank you very much for pointing that out. In fact, I've just realized that I 
broke TestScriptLanguage for the reason that you're describing. Pig#bind() 
escapes '$', so my change makes bindLocalVariableTest2() fail.

Please let me open a jira under PIG-2972 and post a patch.

Thanks!


> $ signs in the replacement string make parameter substitution fail
> --
>
> Key: PIG-2931
> URL: https://issues.apache.org/jira/browse/PIG-2931
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Fix For: 0.11
>
> Attachments: PIG-2931.patch
>
>
> To reproduce the issue, use the following pig script:
> {code:title=test.pig}
> a = load 'data';
> b = filter a by $FILTER;
> {code}
> and run the following command:
> {code}
> pig -x local -dryrun -f test.pig -p FILTER="(\$0 == 'a')"
> {code}
> This generates the following script:
> {code:title=test.pig.substituted}
> a = load 'data';
> b = filter a by ($FILTER == 'a');
> {code}
> However this should be:
> {code}
> a = load 'data';
> b = filter a by ($0 == 'a');
> {code}
> This is because Pig calls replaceFirst() with a replacement string that 
> includes a $ sign, as follows:
> {code}
> "$FILTER".replaceFirst("\\$FILTER", "($0 == 'a')");
> {code}
> To treat $ signs as literals in the replacement string, we must escape them. 
> Please see the [Java 
> doc|http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html#replaceFirst(java.lang.String)]
>  for Matcher class for explanation:
> {quote}
> Note that backslashes (\) and dollar signs ($) in the replacement string may 
> cause the results to be different than if it were being treated as a literal 
> replacement string. Dollar signs may be treated as references to captured 
> subsequences as described above, and backslashes are used to escape literal 
> characters in the replacement string.
> {quote}
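
The behaviour described in the report can be reproduced outside Pig with a few lines of plain Java. The sketch below is only an illustration: the class and method names (DollarReplacementDemo, substituteNaive, substituteLiteral) are made up for this example and are not Pig code, and Matcher.quoteReplacement() is just one standard way to treat the value literally, not necessarily what the attached patch does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration; it is not part of Pig.
class DollarReplacementDemo {

    // Naive substitution: the value is used directly as the replacement,
    // so "$0" inside it is read as "group 0" (the whole match).
    static String substituteNaive(String line, String name, String value) {
        return line.replaceFirst("\\$" + Pattern.quote(name), value);
    }

    // Literal substitution: Matcher.quoteReplacement() escapes '$' and '\'
    // so that the value is inserted exactly as given.
    static String substituteLiteral(String line, String name, String value) {
        return line.replaceFirst("\\$" + Pattern.quote(name),
                                 Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        String line = "b = filter by $FILTER;";
        String value = "($0 == 'a')";
        System.out.println(substituteNaive(line, "FILTER", value));
        System.out.println(substituteLiteral(line, "FILTER", value));
    }
}
```

The naive call yields the "($FILTER == 'a')" line from the report, while the literal call yields the expected "($0 == 'a')".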

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481817#comment-13481817
 ] 

Olga Natkovich commented on PIG-2353:
-

Yes, I think that's fine - I did not realize it was covered in a separate JIRA, 
thanks!

> RANK function like in SQL
> -
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
>  Issue Type: New Feature
>Reporter: Gianmarco De Francisci Morales
>Assignee: Allan Avendaño
>  Labels: gsoc2012, mentor
> Fix For: 0.11
>
> Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
> PIG-2353-5.txt, PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
> Functionality implemented so far, is available at 
> https://reviews.apache.org/r/5523/diff/#index_header



[jira] [Updated] (PIG-2989) Illustrate for Rank Operator

2012-10-22 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PIG-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Avendaño updated PIG-2989:


Priority: Minor  (was: Major)

> Illustrate for Rank Operator
> 
>
> Key: PIG-2989
> URL: https://issues.apache.org/jira/browse/PIG-2989
> Project: Pig
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.11
>Reporter: Allan Avendaño
>Assignee: Allan Avendaño
>Priority: Minor
> Attachments: patch_1
>
>
> Specifically useful when a quick view of the final results of the Rank 
> operator is required.



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481790#comment-13481790
 ] 

Jonathan Coveney commented on PIG-2975:
---

I'm going to give it one last look-over and make sure that test-commit passes; 
if all is well, I'll commit it shortly.

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Commented] (PIG-2931) $ signs in the replacement string make parameter substitution fail

2012-10-22 Thread Julien Le Dem (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481788#comment-13481788
 ] 

Julien Le Dem commented on PIG-2931:


Sorry for the late reply on this. This would break users who are already double 
escaping $ to insert values containing $.
We should escape $ only if it is not already escaped.
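
As a rough illustration of "escape $ only if it is not already escaped", here is a minimal Java sketch. The class and method names (DollarEscaper, escapeUnescapedDollars) are hypothetical and this is not the actual Pig patch; it assumes that a backslash directly before a $ means the user has already escaped it.

```java
// Hypothetical helper for illustration; it is not part of Pig.
class DollarEscaper {

    // Adds a backslash before every '$' that is not already escaped.
    // A backslash that is itself escaped ("\\") does not count as an escape.
    static String escapeUnescapedDollars(String value) {
        StringBuilder out = new StringBuilder();
        boolean escaped = false; // true if the previous char was an active backslash
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '$' && !escaped) {
                out.append('\\');
            }
            out.append(c);
            escaped = (c == '\\') && !escaped;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unescaped value: a backslash is added before '$'.
        System.out.println(escapeUnescapedDollars("($0 == 'a')"));
        // Already escaped value: left unchanged.
        System.out.println(escapeUnescapedDollars("(\\$0 == 'a')"));
    }
}
```

With this approach both an unescaped value and a value the user has already escaped end up as the same replacement string, so existing scripts that double escape would keep working.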

> $ signs in the replacement string make parameter substitution fail
> --
>
> Key: PIG-2931
> URL: https://issues.apache.org/jira/browse/PIG-2931
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Cheolsoo Park
>Assignee: Cheolsoo Park
> Fix For: 0.11
>
> Attachments: PIG-2931.patch
>
>
> To reproduce the issue, use the following pig script:
> {code:title=test.pig}
> a = load 'data';
> b = filter by $FILTER;
> {code}
> and run the following command:
> {code}
> pig -x local -dryrun -f test.pig -p FILTER="(\$0 == 'a')"
> {code}
> This generates the following script:
> {code:title=test.pig.substituted}
> a = load 'data';
> b = filter by ($FILTER == 'a');
> {code}
> However this should be:
> {code}
> a = load 'data';
> b = filter by ($0 == 'a');
> {code}
> This is because Pig calls replaceFirst() with a replacement string that 
> includes a $ sign, as follows:
> {code}
> "$FILTER".replaceFirst("\\$FILTER", "($0 == 'a')");
> {code}
> To treat $ signs as literals in the replacement string, we must escape them. 
> Please see the [Java 
> doc|http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html#replaceFirst(java.lang.String)]
>  for Matcher class for explanation:
> {quote}
> Note that backslashes (\) and dollar signs ($) in the replacement string may 
> cause the results to be different than if it were being treated as a literal 
> replacement string. Dollar signs may be treated as references to captured 
> subsequences as described above, and backslashes are used to escape literal 
> characters in the replacement string.
> {quote}



Re: Documentation planning for Pig 0.11 release

2012-10-22 Thread Bill Graham
Hi Olga,

This is a great list, thanks for putting it together. Here are 3 more
issues pulled from PIG-2756:

Better HBaseStorage documentation:
https://issues.apache.org/jira/browse/PIG-2341
DateTime data type: https://issues.apache.org/jira/browse/PIG-2980
ILLUSTRATE seems to have dropped off the docs (no JIRA afaik)

thanks,
Bill

On Mon, Oct 22, 2012 at 11:50 AM, Olga Natkovich wrote:

> Hi,
>
> I have gone through the resolved JIRAs for 0.11, and here is what I
> believe needs to go into the documentation. Please, let me know if I missed
> anything. Also, I have not looked at anything that has not yet been
> committed:
>
> Bloom filter UDF: https://issues.apache.org/jira/browse/PIG-2328
> Clear command in Grunt: https://issues.apache.org/jira/browse/PIG-2706 -
> this is already in the docs
> RANK operator: https://issues.apache.org/jira/browse/PIG-2353 - this is
> already in docs
> UDF convenience classes: https://issues.apache.org/jira/browse/PIG-2547
> More efficient tuple support:
> https://issues.apache.org/jira/browse/PIG-2359
> Pluggable progress notification:
> https://issues.apache.org/jira/browse/PIG-2525
> Merge join after ORDER BY: https://issues.apache.org/jira/browse/PIG-2673
> Measure time spent in UDF: https://issues.apache.org/jira/browse/PIG-2855
> Storage func improvements: https://issues.apache.org/jira/browse/PIG-1891
> UDFs to flatten bags: https://issues.apache.org/jira/browse/PIG-2166
> Make Tuple iterable: https://issues.apache.org/jira/browse/PIG-2724
> New accumulate interface: https://issues.apache.org/jira/browse/PIG-2651
> RUBY UDF: https://issues.apache.org/jira/browse/PIG-2317 . Looks like
> this is also in 0.10. Was documentation for this committed to 10?
> Re-aliasing: https://issues.apache.org/jira/browse/PIG-438 . Looks like
> this is also in 0.10. Was documentation for this committed to 10?
> Groovy UDFs: https://issues.apache.org/jira/browse/PIG-2763 Docs already
> committed
> Native cube operator: https://issues.apache.org/jira/browse/PIG-2710 -
> docs at:  http://goo.gl/SpUad
> Better map support: https://issues.apache.org/jira/browse/PIG-2600 - This
> needs release notes to include in docs.
>
> Olga
>



-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgra...@gmail.com going forward.*


[jira] [Commented] (PIG-2828) DataType.compare null

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481775#comment-13481775
 ] 

Jonathan Coveney commented on PIG-2828:
---

It would be nice to add a unit test that isolates this case.

> DataType.compare null
> -
>
> Key: PIG-2828
> URL: https://issues.apache.org/jira/browse/PIG-2828
> Project: Pig
>  Issue Type: Bug
>Reporter: Haitao Yao
> Attachments: DataType.patch
>
>
> When using TOP, if the DataBag contains a null value to compare, it will 
> generate the following exception:
> Caused by: java.lang.NullPointerException
>   at org.apache.pig.data.DataType.compare(DataType.java:427)
>   at org.apache.pig.builtin.TOP$TupleComparator.compare(TOP.java:97)
>   at org.apache.pig.builtin.TOP$TupleComparator.compare(TOP.java:1)
>   at java.util.PriorityQueue.siftUpUsingComparator(PriorityQueue.java:649)
>   at java.util.PriorityQueue.siftUp(PriorityQueue.java:627)
>   at java.util.PriorityQueue.offer(PriorityQueue.java:329)
>   at java.util.PriorityQueue.add(PriorityQueue.java:306)
>   at org.apache.pig.builtin.TOP.updateTop(TOP.java:141)
>   at org.apache.pig.builtin.TOP.exec(TOP.java:116)
> code: (TOP.java, starts with line 91)
> Object field1 = o1.get(fieldNum);
> Object field2 = o2.get(fieldNum);
> if (!typeFound) {
> datatype = DataType.findType(field1);
> typeFound = true;
> }
> return DataType.compare(field1, field2, datatype, datatype);
> The reason is that when typeFound is true, datatype is not null, and field1 
> is null, the script fails.
> So we need to check whether field1 is null.
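
A null check of the kind suggested above could look roughly like the following self-contained Java sketch. It does not use Pig's DataType or TOP classes; NullSafeFieldComparator is a hypothetical stand-in that compares one field of an Object[] "tuple", and the choice to sort nulls before non-null values is an assumption made for the example.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical stand-in for a tuple comparator; it is not Pig's TOP.TupleComparator.
class NullSafeFieldComparator implements Comparator<Object[]> {
    private final int fieldNum;

    NullSafeFieldComparator(int fieldNum) {
        this.fieldNum = fieldNum;
    }

    @Override
    @SuppressWarnings("unchecked")
    public int compare(Object[] o1, Object[] o2) {
        Object field1 = o1[fieldNum];
        Object field2 = o2[fieldNum];
        // Handle nulls before delegating to the typed comparison.
        // Assumption for this sketch: null sorts before any non-null value.
        if (field1 == null && field2 == null) return 0;
        if (field1 == null) return -1;
        if (field2 == null) return 1;
        return ((Comparable<Object>) field1).compareTo(field2);
    }

    public static void main(String[] args) {
        PriorityQueue<Object[]> queue =
            new PriorityQueue<>(4, new NullSafeFieldComparator(0));
        queue.add(new Object[] {3});
        queue.add(new Object[] {null}); // would trigger an NPE without the null checks
        queue.add(new Object[] {1});
        while (!queue.isEmpty()) {
            System.out.println(queue.poll()[0]);
        }
    }
}
```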



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Gianmarco De Francisci Morales (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481770#comment-13481770
 ] 

Gianmarco De Francisci Morales commented on PIG-2975:
-

Guys, great job in moving this forward!
I am sold on all the improvements in the patch.
+1

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Updated] (PIG-2788) improved string interpolation of variables

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2788:
--

Description: 
The simplest example of the failure of the current string interpolation is 

{code}
store my_rel into '$OUTPUT_';
{code}

This will raise an error saying that OUTPUT_ is not a variable passed in. 
Similar errors happen with a variety of other trailing characters.

It would be nice if '${OUTPUT}_', or something similar, worked.
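
To make the request concrete, here is a small Java sketch of a substitution routine that accepts both $NAME and ${NAME}. It is only an illustration under stated assumptions: ParamInterpolator and interpolate are hypothetical names, the identifier rule (a letter or underscore followed by word characters) is assumed, and this is not how Pig's preprocessor is implemented.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; it is not Pig's parameter substitution code.
class ParamInterpolator {
    // Group 1 captures the name in ${NAME}; group 2 captures the name in $NAME.
    private static final Pattern PARAM =
        Pattern.compile("\\$\\{([A-Za-z_]\\w*)\\}|\\$([A-Za-z_]\\w*)");

    static String interpolate(String text, Map<String, String> params) {
        Matcher m = PARAM.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            String value = params.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Undefined parameter: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("OUTPUT", "/tmp/out");
        // The braces end the name before the trailing underscore.
        System.out.println(interpolate("store my_rel into '${OUTPUT}_';", params));
    }
}
```

In this sketch '$OUTPUT_' still fails, because the name is read as OUTPUT_, while '${OUTPUT}_' resolves OUTPUT and keeps the trailing underscore.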

> improved string interpolation of variables
> --
>
> Key: PIG-2788
> URL: https://issues.apache.org/jira/browse/PIG-2788
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.2, 0.10.0
>Reporter: Jeff Hodges
>
> The simplest example of the failure of the current string interpolation is 
> {code}
> store my_rel into '$OUTPUT_';
> {code}
> This will raise an error saying that OUTPUT_ is not a variable passed in. 
> Similar errors happen with a variety of other trailing characters.
> It would be nice if '${OUTPUT}_', or something similar, worked.



[jira] [Updated] (PIG-2788) improved string interpolation of variables

2012-10-22 Thread Jonathan Coveney (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Coveney updated PIG-2788:
--

Environment: (was: The simplest example of the failure of the current 
string interpolation is 

{code}
store my_rel into '$OUTPUT_';
{code}

This will raise an error saying that OUTPUT_ is not a variable passed in. 
Similar errors happen with a variety of other trailing characters.

It would be nice if '${OUTPUT}_', or something similar, worked.)

> improved string interpolation of variables
> --
>
> Key: PIG-2788
> URL: https://issues.apache.org/jira/browse/PIG-2788
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.2, 0.10.0
>Reporter: Jeff Hodges
>




[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Koji Noguchi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481761#comment-13481761
 ] 

Koji Noguchi commented on PIG-2975:
---

bq. Would it have been the same before?
Yes. My test fails on trunk and passes with the patch.

bq.  A serialized long would have been 8 bytes and a serialized Integer would 
have been 4 bytes..
Actually, Tuple serializes Long by 

BinInterSedes.java
{noformat}
} else if (Integer.MIN_VALUE <= lng && lng <= Integer.MAX_VALUE) {
    out.writeByte(LONG_ININT);
    out.writeInt((int)lng);
{noformat}
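
For readers without the source at hand, the idea in that fragment (a long that fits in an int is written as a one-byte tag plus four bytes) can be sketched in isolation as follows. The class name, the tag values, and the LONG_INLONG fallback branch are assumptions for this example; only the LONG_ININT branch is taken from the snippet above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of a compact long encoding; it is not BinInterSedes itself.
class CompactLongWriter {
    // Tag values are made up for this example.
    static final byte LONG_ININT = 1;
    static final byte LONG_INLONG = 2;

    static byte[] writeLong(long lng) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (Integer.MIN_VALUE <= lng && lng <= Integer.MAX_VALUE) {
                // Fits in an int: 1 tag byte + 4 payload bytes.
                out.writeByte(LONG_ININT);
                out.writeInt((int) lng);
            } else {
                // Assumed fallback: 1 tag byte + 8 payload bytes.
                out.writeByte(LONG_INLONG);
                out.writeLong(lng);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(writeLong(22L).length);      // small value
        System.out.println(writeLong(1L << 40).length); // needs the full 8 bytes
    }
}
```

So a serialized long is not always 8 bytes of payload, which is why the serialized forms of a small Long and an Integer can differ only in their type tag.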


bq. I would just go with what BinInterSedesTupleRawComparator does, and we can 
note the minor backwards incompatibility
+1


> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481748#comment-13481748
 ] 

Jonathan Coveney commented on PIG-2975:
---

Would it have been the same before? A serialized long would have been 8 bytes 
and a serialized Integer would have been 4 bytes... I guess it depends what 
order it is serialized in.

I would just go with what BinInterSedesTupleRawComparator does, and we can note 
the minor backwards incompatibility (though it doesn't actually violate 
anything people should be relying on, it might be nice to explain what is going 
on in the release notes, just to clarify the semantics).

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481738#comment-13481738
 ] 

Olga Natkovich commented on PIG-2600:
-

Yes, please, put all the information into the release notes. This way it is 
much easier to create a documentation patch.

> Better Map support
> --
>
> Key: PIG-2600
> URL: https://issues.apache.org/jira/browse/PIG-2600
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Prashant Kommireddi
> Fix For: 0.11
>
> Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
> PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
> PIG-2600_9.patch, PIG-2600.patch
>
>
> It would be nice if Pig played better with Maps. To that end, I'd like to add 
> a lot of utility around Maps.
> - TOBAG should take a Map and output {(key, value)}
> - TOMAP should take a Bag in that same form and make a map.
> - KEYSET should return the set of keys.
> - VALUESET should return the set of values.
> - VALUELIST should return the List of values (no deduping).
> - INVERSEMAP would return a Map of values => the set of keys that refer to 
> that Key
> This would all be pretty easy. A more substantial piece of work would be to 
> make Pig support non-String keys (this is especially an issue since UDFs and 
> whatnot probably assume that they are all Integers). Not sure if it is worth 
> it.
> I'd love to hear other things that would be useful for people!
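
To illustrate the intended semantics of a few of the proposed functions, here is a plain-Java sketch using java.util collections. It is not the Pig UDF code from the patches; MapUtilSketch and its method names are hypothetical, and Pig bags and tuples are approximated by ordinary sets and lists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed semantics; these are not the Pig UDFs.
class MapUtilSketch {

    // INVERSEMAP: each value maps to the set of keys that referred to it.
    static <K, V> Map<V, Set<K>> inverseMap(Map<K, V> input) {
        Map<V, Set<K>> inverse = new HashMap<>();
        for (Map.Entry<K, V> e : input.entrySet()) {
            inverse.computeIfAbsent(e.getValue(), v -> new HashSet<>()).add(e.getKey());
        }
        return inverse;
    }

    // VALUELIST: all values, duplicates kept.
    static <K, V> List<V> valueList(Map<K, V> input) {
        return new ArrayList<>(input.values());
    }

    // VALUESET: distinct values only.
    static <K, V> Set<V> valueSet(Map<K, V> input) {
        return new HashSet<>(input.values());
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 1);
        m.put("c", 2);
        System.out.println(inverseMap(m));
        System.out.println(valueList(m).size() + " values, "
            + valueSet(m).size() + " distinct");
    }
}
```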



[jira] [Updated] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2975:
--

Attachment: 
pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt

bq. statically import assertEquals()
done.

bq. "compareTwoObjects" is a bit misleading,
Renamed to compareTwoObjectsAsNullableBytesWritables.

bq. a few more comments could be helpful.
Added.

Also, added one more test case testLongByteArrays to make sure I'm setting the 
offset/length right.

One test I don't have an answer on.

{noformat}
@Test
public void testDifferentType() throws Exception {
    assertTrue("Integer  and Long  considered equal",
        compareTwoObjectsAsNullableBytesWritables(new Integer(), new Long()) != 0);
}
{noformat}

When comparing Integer() and Long() as an unknown type, should they be 
considered the same or different? 

Inside BinInterSedesTupleRawComparator, it's the latter. Before, it was the 
former (since the bug was skipping the datatype header).
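
The difference between the two behaviours can be shown with a toy encoding. The sketch below is an assumption-laden illustration, not Pig's serialization: the tag values, the fixed-width payloads, and the names TaggedValueComparator, compareWithHeader and compareValuesOnly are all made up here.

```java
import java.nio.ByteBuffer;

// Hypothetical toy encoding; it is not BinInterSedes or its raw comparator.
class TaggedValueComparator {
    // Made-up type tags for this example.
    static final byte INTEGER_TAG = 10;
    static final byte LONG_TAG = 15;

    // Encodes a value as one type-tag byte followed by a fixed-width payload.
    static byte[] encode(Object value) {
        if (value instanceof Integer) {
            return ByteBuffer.allocate(5).put(INTEGER_TAG).putInt((Integer) value).array();
        } else if (value instanceof Long) {
            return ByteBuffer.allocate(9).put(LONG_TAG).putLong((Long) value).array();
        }
        throw new IllegalArgumentException("Unsupported type: " + value);
    }

    private static long payload(byte[] b) {
        ByteBuffer buf = ByteBuffer.wrap(b, 1, b.length - 1);
        return b[0] == INTEGER_TAG ? buf.getInt() : buf.getLong();
    }

    // Takes the type header into account: different types never compare equal.
    static int compareWithHeader(byte[] a, byte[] b) {
        if (a[0] != b[0]) {
            return Byte.compare(a[0], b[0]);
        }
        return Long.compare(payload(a), payload(b));
    }

    // Looks at the numeric value only: Integer 1 and Long 1 compare equal.
    static int compareValuesOnly(byte[] a, byte[] b) {
        return Long.compare(payload(a), payload(b));
    }

    public static void main(String[] args) {
        byte[] i = encode(1);
        byte[] l = encode(1L);
        System.out.println(compareWithHeader(i, l) != 0);
        System.out.println(compareValuesOnly(i, l) == 0);
    }
}
```

Under the header-aware comparison an Integer and a Long with the same numeric value are ordered by type, which corresponds to "different" in the question above.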

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest2.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-10-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481703#comment-13481703
 ] 

Allan Avendaño commented on PIG-2353:
-

Hi Olga!

Does PIG-2947 apply as release notes? 

> RANK function like in SQL
> -
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
>  Issue Type: New Feature
>Reporter: Gianmarco De Francisci Morales
>Assignee: Allan Avendaño
>  Labels: gsoc2012, mentor
> Fix For: 0.11
>
> Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
> PIG-2353-5.txt, PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
> Functionality implemented so far, is available at 
> https://reviews.apache.org/r/5523/diff/#index_header



[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481704#comment-13481704
 ] 

Prashant Kommireddi commented on PIG-2600:
--

Btw, syntax and usage examples are all present in java files for each of the 
UDFs. Lmk if it needs to go into release notes (and where) and I can put it 
there.

> Better Map support
> --
>
> Key: PIG-2600
> URL: https://issues.apache.org/jira/browse/PIG-2600
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Prashant Kommireddi
> Fix For: 0.11
>
> Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
> PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
> PIG-2600_9.patch, PIG-2600.patch
>
>
> It would be nice if Pig played better with Maps. To that end, I'd like to add 
> a lot of utility around Maps.
> - TOBAG should take a Map and output {(key, value)}
> - TOMAP should take a Bag in that same form and make a map.
> - KEYSET should return the set of keys.
> - VALUESET should return the set of values.
> - VALUELIST should return the List of values (no deduping).
> - INVERSEMAP would return a Map of values => the set of keys that refer to 
> that Key
> This would all be pretty easy. A more substantial piece of work would be to 
> make Pig support non-String keys (this is especially an issue since UDFs and 
> whatnot probably assume that they are all Integers). Not sure if it is worth 
> it.
> I'd love to hear other things that would be useful for people!



[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481676#comment-13481676
 ] 

Prashant Kommireddi commented on PIG-2600:
--

Sure, I can do that. Where exactly should I be adding this?

> Better Map support
> --
>
> Key: PIG-2600
> URL: https://issues.apache.org/jira/browse/PIG-2600
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Prashant Kommireddi
> Fix For: 0.11
>
> Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
> PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
> PIG-2600_9.patch, PIG-2600.patch
>
>
> It would be nice if Pig played better with Maps. To that end, I'd like to add 
> a lot of utility around Maps.
> - TOBAG should take a Map and output {(key, value)}
> - TOMAP should take a Bag in that same form and make a map.
> - KEYSET should return the set of keys.
> - VALUESET should return the set of values.
> - VALUELIST should return the List of values (no deduping).
> - INVERSEMAP would return a Map of values => the set of keys that refer to 
> that Key
> This would all be pretty easy. A more substantial piece of work would be to 
> make Pig support non-String keys (this is especially an issue since UDFs and 
> whatnot probably assume that they are all Integers). Not sure if it is worth 
> it.
> I'd love to hear other things that would be useful for people!



[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481630#comment-13481630
 ] 

Jonathan Coveney commented on PIG-2600:
---

Prashant, do you want to take the honor? :)

> Better Map support
> --
>
> Key: PIG-2600
> URL: https://issues.apache.org/jira/browse/PIG-2600
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Prashant Kommireddi
> Fix For: 0.11
>
> Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
> PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
> PIG-2600_9.patch, PIG-2600.patch
>
>
> It would be nice if Pig played better with Maps. To that end, I'd like to add 
> a lot of utility around Maps.
> - TOBAG should take a Map and output {(key, value)}
> - TOMAP should take a Bag in that same form and make a map.
> - KEYSET should return the set of keys.
> - VALUESET should return the set of values.
> - VALUELIST should return the List of values (no deduping).
> - INVERSEMAP would return a Map of values => the set of keys that refer to 
> that Key
> This would all be pretty easy. A more substantial piece of work would be to 
> make Pig support non-String keys (this is especially an issue since UDFs and 
> whatnot probably assume that they are all Integers). Not sure if it is worth 
> it.
> I'd love to hear other things that would be useful for people!



[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481628#comment-13481628
 ] 

Jonathan Coveney commented on PIG-2975:
---

Koji,

It's been a pleasure :)

Ok, I'm king nitpick, but my last nitpick is to statically import 
assertEquals() (and any other junit methods) instead of calling 
Assert.whatever. Why? I'm slowly but surely trying to promote a consistent 
style in all of the unit tests.

As far as the annoying l1 etc variable, I'll let it slide ;) You're right that 
it's a common pattern, but I think it's a bad one. That said: this isn't the 
place to fix that.

Also, your tests are great, though a few more comments could be helpful. 
"compareTwoObjects" is a bit misleading, for example. I know not all of the 
tests have great comments (I'm probably guilty of this myself), but a line or 
two per would go a long way. Be the commit you want to see in the world and all 
that jazz!

In the meantime, I'm going to make sure the tests run, and make sure that the 
unit tests fail on trunk (ie that it is isolating the issue).

Thanks for being patient with ME, Koji
Jon

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



Documentation planning for Pig 0.11 release

2012-10-22 Thread Olga Natkovich
Hi,
 
I have gone through the resolved JIRAs for 0.11, and here is what I believe 
needs to go into the documentation. Please, let me know if I missed anything. 
Also, I have not looked at anything that has not yet been committed:
 
Bloom filter UDF: https://issues.apache.org/jira/browse/PIG-2328
Clear command in Grunt: https://issues.apache.org/jira/browse/PIG-2706 - this 
is already in the docs
RANK operator: https://issues.apache.org/jira/browse/PIG-2353 - this is already 
in docs
UDF convenience classes: https://issues.apache.org/jira/browse/PIG-2547
More efficient tuple support: https://issues.apache.org/jira/browse/PIG-2359
Pluggable progress notification: https://issues.apache.org/jira/browse/PIG-2525
Merge join after ORDER BY: https://issues.apache.org/jira/browse/PIG-2673
Measure time spent in UDF: https://issues.apache.org/jira/browse/PIG-2855
Storage func improvements: https://issues.apache.org/jira/browse/PIG-1891
UDFs to flatten bags: https://issues.apache.org/jira/browse/PIG-2166
Make Tuple iterable: https://issues.apache.org/jira/browse/PIG-2724
New accumulate interface: https://issues.apache.org/jira/browse/PIG-2651
RUBY UDF: https://issues.apache.org/jira/browse/PIG-2317 . Looks like this is 
also in 0.10. Was documentation for this committed to 10?
Re-aliasing: https://issues.apache.org/jira/browse/PIG-438 . Looks like this is 
also in 0.10. Was documentation for this committed to 10?
Groovy UDFs: https://issues.apache.org/jira/browse/PIG-2763 Docs already 
committed
Naive cube operator: https://issues.apache.org/jira/browse/PIG-2710 - docs 
at:  http://goo.gl/SpUad
Better map support: https://issues.apache.org/jira/browse/PIG-2600 - This needs 
release notes to include in docs.
 
Olga


[jira] [Commented] (PIG-2710) Implement Naive CUBE operator

2012-10-22 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481614#comment-13481614
 ] 

Prasanth J commented on PIG-2710:
-

It covers the necessary documentation for naive CUBE/ROLLUP operator 
implementation. It doesn't cover anything related to scalable MR-Cube 
implementation (PIG-2831) which as per Dmitriy is going into 0.12. 

> Implement Naive CUBE operator
> -
>
> Key: PIG-2710
> URL: https://issues.apache.org/jira/browse/PIG-2710
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prasanth J
> Fix For: 0.11
>
> Attachments: PIG-2710.1.patch
>
>
> The Naive CUBE operator is just syntactic sugar for the CubeDimensions UDFS 
> followed by a flatten+group-by.
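
To make the "syntactic sugar" remark above concrete, here is a small sketch of the kind of expansion a cube-dimensions step performs: every combination of the input dimensions with any subset of positions rolled up. This is not the CubeDimensions UDF itself; the class name CubeSketch and the use of null as the roll-up marker are assumptions for illustration. Grouping on the emitted combinations is what then yields one aggregate per cube cell.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; this is not Pig's CubeDimensions UDF.
final class CubeSketch {

    // Returns every combination of the input dimensions in which any subset
    // of positions is replaced by null (assumed here to mean "all values").
    static List<List<String>> cubeCombinations(List<String> dims) {
        int n = dims.size();
        List<List<String>> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << n); mask++) {
            List<String> combo = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                // Bit i set means dimension i is rolled up to null.
                combo.add((mask & (1 << i)) != 0 ? null : dims.get(i));
            }
            out.add(combo);
        }
        return out;
    }

    public static void main(String[] args) {
        for (List<String> combo : cubeCombinations(Arrays.asList("us", "2012"))) {
            System.out.println(combo);
        }
    }
}
```

For n dimensions this emits 2^n combinations per input tuple, which is why the thread distinguishes the naive operator from the scalable MR-Cube work.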



[jira] [Commented] (PIG-2710) Implement Naive CUBE operator

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481603#comment-13481603
 ] 

Olga Natkovich commented on PIG-2710:
-

Hi Prasanth,

The release notes look great! Do they basically cover all work we have done in 
this release for CUBE related support?

> Implement Naive CUBE operator
> -
>
> Key: PIG-2710
> URL: https://issues.apache.org/jira/browse/PIG-2710
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prasanth J
> Fix For: 0.11
>
> Attachments: PIG-2710.1.patch
>
>
> The Naive CUBE operator is just syntactic sugar for the CubeDimensions UDFS 
> followed by a flatten+group-by.



[jira] [Commented] (PIG-2710) Implement Naive CUBE operator

2012-10-22 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481599#comment-13481599
 ] 

Prasanth J commented on PIG-2710:
-

Hi Olga

I have included the release notes in another subtask (PIG-2765). Here it is 
http://goo.gl/SpUad
Please let me know if this is sufficient for documentation. 

> Implement Naive CUBE operator
> -
>
> Key: PIG-2710
> URL: https://issues.apache.org/jira/browse/PIG-2710
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prasanth J
> Fix For: 0.11
>
> Attachments: PIG-2710.1.patch
>
>
> The Naive CUBE operator is just syntactic sugar for the CubeDimensions UDFS 
> followed by a flatten+group-by.



[jira] [Commented] (PIG-2600) Better Map support

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481596#comment-13481596
 ] 

Olga Natkovich commented on PIG-2600:
-

Can you please add to the release notes the UDFs that were added, as well as 
their syntax and usage examples? This is for inclusion in the documentation, thanks!

> Better Map support
> --
>
> Key: PIG-2600
> URL: https://issues.apache.org/jira/browse/PIG-2600
> Project: Pig
>  Issue Type: Improvement
>Reporter: Jonathan Coveney
>Assignee: Prashant Kommireddi
> Fix For: 0.11
>
> Attachments: PIG-2600_2.patch, PIG-2600_3.patch, PIG-2600_4.patch, 
> PIG-2600_5.patch, PIG-2600_6.patch, PIG-2600_7.patch, PIG-2600_8.patch, 
> PIG-2600_9.patch, PIG-2600.patch
>
>
> It would be nice if Pig played better with Maps. To that end, I'd like to add 
> a lot of utility around Maps.
> - TOBAG should take a Map and output {(key, value)}
> - TOMAP should take a Bag in that same form and make a map.
> - KEYSET should return the set of keys.
> - VALUESET should return the set of values.
> - VALUELIST should return the List of values (no deduping).
> - INVERSEMAP would return a Map of values => the set of keys that refer to 
> that value
> This would all be pretty easy. A more substantial piece of work would be to 
> make Pig support non-String keys (this is especially an issue since UDFs and 
> whatnot probably assume that they are all Strings). Not sure if it is worth 
> it.
> I'd love to hear other things that would be useful for people!



[jira] [Commented] (PIG-2710) Implement Naive CUBE operator

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481591#comment-13481591
 ] 

Olga Natkovich commented on PIG-2710:
-

Could you please add release notes, with syntax and examples, for 
inclusion in the documentation? Thanks!

> Implement Naive CUBE operator
> -
>
> Key: PIG-2710
> URL: https://issues.apache.org/jira/browse/PIG-2710
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Dmitriy V. Ryaboy
>Assignee: Prasanth J
> Fix For: 0.11
>
> Attachments: PIG-2710.1.patch
>
>
> The Naive CUBE operator is just syntactic sugar for the CubeDimensions UDFS 
> followed by a flatten+group-by.



[jira] [Commented] (PIG-2582) Store size in bytes (not mbytes) in ResourceStatistics

2012-10-22 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481590#comment-13481590
 ] 

Bill Graham commented on PIG-2582:
--

I didn't notice the Unstable annotation. I'm ok changing scope if others agree.

> Store size in bytes (not mbytes) in ResourceStatistics
> --
>
> Key: PIG-2582
> URL: https://issues.apache.org/jira/browse/PIG-2582
> Project: Pig
>  Issue Type: Bug
>Reporter: Travis Crawford
>Assignee: Prashant Kommireddi
>Priority: Minor
> Attachments: PIG-2582.patch
>
>
> In 
> [ResourceStatistics.java|http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/ResourceStatistics.java?view=markup]
>  we see mBytes is public, and has a public getter/setter.
> {code}
> public Long mBytes; // size in megabytes
> ...
> public Long getmBytes() {
>     return mBytes;
> }
> public ResourceStatistics setmBytes(Long mBytes) {
>     this.mBytes = mBytes;
>     return this;
> }
> {code}
> Typically sizes are stored as bytes, potentially having convenience functions 
> to return with different units.
> If mBytes can be marked private without causing woes it might be worth 
> storing size as bytes instead.
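
One way the suggestion above could look, as a sketch only: the size is stored in bytes and the megabyte accessors are kept as conversions for existing callers. SizeStatsSketch, getSizeInBytes and setSizeInBytes are hypothetical names, not the contents of PIG-2582.patch, and the 1024 * 1024 conversion factor is an assumption.

```java
// Hypothetical class, not Pig's ResourceStatistics: stores the size in bytes
// while keeping megabyte accessors for callers that still expect them.
final class SizeStatsSketch {
    private static final long BYTES_PER_MB = 1024L * 1024L; // assumed factor

    private Long sizeInBytes; // null means "unknown"

    Long getSizeInBytes() {
        return sizeInBytes;
    }

    SizeStatsSketch setSizeInBytes(Long sizeInBytes) {
        this.sizeInBytes = sizeInBytes;
        return this;
    }

    // Convenience accessor in megabytes; integer division rounds down,
    // so any size under 1 MB reports 0.
    Long getmBytes() {
        return sizeInBytes == null ? null : sizeInBytes / BYTES_PER_MB;
    }

    SizeStatsSketch setmBytes(Long mBytes) {
        this.sizeInBytes = mBytes == null ? null : mBytes * BYTES_PER_MB;
        return this;
    }

    public static void main(String[] args) {
        SizeStatsSketch stats = new SizeStatsSketch().setSizeInBytes(3L * 1024 * 1024);
        System.out.println(stats.getSizeInBytes() + " bytes = " + stats.getmBytes() + " MB");
    }
}
```

Note that this sketch makes the field private, which is exactly the compatibility question debated in the comments on this issue; code that reads a public mBytes field directly would not compile against it.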



[jira] [Commented] (PIG-2582) Store size in bytes (not mbytes) in ResourceStatistics

2012-10-22 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481589#comment-13481589
 ] 

Bill Graham commented on PIG-2582:
--

As much as we'd love to make these members private, we should resist the urge 
and keep them public for backward compatibility.

> Store size in bytes (not mbytes) in ResourceStatistics
> --
>
> Key: PIG-2582
> URL: https://issues.apache.org/jira/browse/PIG-2582
> Project: Pig
>  Issue Type: Bug
>Reporter: Travis Crawford
>Assignee: Prashant Kommireddi
>Priority: Minor
> Attachments: PIG-2582.patch
>
>
> In 
> [ResourceStatistics.java|http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/ResourceStatistics.java?view=markup]
>  we see mBytes is public, and has a public getter/setter.
> {code}
> public Long mBytes; // size in megabytes
> ...
> public Long getmBytes() {
>     return mBytes;
> }
> public ResourceStatistics setmBytes(Long mBytes) {
>     this.mBytes = mBytes;
>     return this;
> }
> {code}
> Typically sizes are stored as bytes, potentially having convenience functions 
> to return with different units.
> If mBytes can be marked private without causing woes it might be worth 
> storing size as bytes instead.



[jira] [Commented] (PIG-2582) Store size in bytes (not mbytes) in ResourceStatistics

2012-10-22 Thread Prashant Kommireddi (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481588#comment-13481588
 ] 

Prashant Kommireddi commented on PIG-2582:
--

Hey Prasanth, the reason is that we don't want to break backward compatibility 
in case these member variables are accessed directly and not through getters. 
There are no such references from within the Pig project, but I'm being wary of 
any users who use this from outside of it.

The interface stability is marked Unstable on this, so I am ok if all decide 
it's cool to change the scope of these variables :) 

> Store size in bytes (not mbytes) in ResourceStatistics
> --
>
> Key: PIG-2582
> URL: https://issues.apache.org/jira/browse/PIG-2582
> Project: Pig
>  Issue Type: Bug
>Reporter: Travis Crawford
>Assignee: Prashant Kommireddi
>Priority: Minor
> Attachments: PIG-2582.patch
>
>
> In 
> [ResourceStatistics.java|http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/ResourceStatistics.java?view=markup]
>  we see mBytes is public, and has a public getter/setter.
> {code}
> public Long mBytes; // size in megabytes
> ...
> public Long getmBytes() {
>     return mBytes;
> }
> public ResourceStatistics setmBytes(Long mBytes) {
>     this.mBytes = mBytes;
>     return this;
> }
> {code}
> Typically sizes are stored as bytes, potentially having convenience functions 
> to return with different units.
> If mBytes can be marked private without causing woes it might be worth 
> storing size as bytes instead.



[jira] [Commented] (PIG-2927) SHIP and use JRuby gems in JRuby UDFs

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481586#comment-13481586
 ] 

Jonathan Coveney commented on PIG-2927:
---

Thanks for taking a look, Cheolsoo!

A test would be awesome. IMHO we should try and use a non-GPL patch. For a 
test, there is no reason not to since we control all the variables. In the 
future if it is unavoidable, we can work around that.

> SHIP and use JRuby gems in JRuby UDFs
> -
>
> Key: PIG-2927
> URL: https://issues.apache.org/jira/browse/PIG-2927
> Project: Pig
>  Issue Type: New Feature
>  Components: parser
>Affects Versions: 0.11
> Environment: JRuby UDFs
>Reporter: Russell Jurney
>Assignee: Jonathan Coveney
>Priority: Minor
> Fix For: 0.11
>
> Attachments: PIG-2927-0.patch, PIG-2927-1.patch, PIG-2927-2.patch, 
> PIG-2927-3.patch
>
>
> It would be great to use JRuby gems in JRuby UDFs without installing them on 
> all machines on the cluster. Some way to SHIP them automatically with the job 
> would be great.



[jira] [Updated] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2975:
--

Attachment: 
pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt

Attaching patch with Jonathan's suggested changes (except the first one).

Thanks Jonathan for all your help and being patient with me!

bq. l1 is really hard to read. Please use fuller names (even len1 and len2)

Agree. But this is coming from the original code. l1,s1,l2,s2 seem to be used 
everywhere for compare() method unfortunately.  Leaving them for now.


bq.  IMHO, spaces make = and + etc more readable (ie databytearraycompare=false)

Added.

bq. on that front, use camelCase for multi-word lines

Changed.

Also added a couple of test cases for incorrect results and one for alphabetical 
sorting of bytearrays across Tiny/Small/Regular size boundaries. 
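
For readers following the l1/s1/l2/s2 naming discussion, here is a generic sketch of a raw byte-range comparison with spelled-out parameter names. It is not the code from the attached patch and not Pig's BinInterSedesRawComparator; RawCompareSketch and compareBytes are hypothetical names, and the unsigned lexicographic ordering is an assumption about what "alphabetical" means here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Pig's raw comparator: compares two byte ranges
// lexicographically, treating each byte as unsigned.
final class RawCompareSketch {

    static int compareBytes(byte[] bytes1, int start1, int length1,
                            byte[] bytes2, int start2, int length2) {
        int common = Math.min(length1, length2);
        for (int i = 0; i < common; i++) {
            int left = bytes1[start1 + i] & 0xff;   // mask to read the byte as unsigned
            int right = bytes2[start2 + i] & 0xff;
            if (left != right) {
                return left < right ? -1 : 1;
            }
        }
        // All shared bytes match, so the shorter range sorts first.
        return Integer.compare(length1, length2);
    }

    public static void main(String[] args) {
        byte[] a = "22".getBytes(StandardCharsets.UTF_8);
        byte[] b = "3".getBytes(StandardCharsets.UTF_8);
        System.out.println(compareBytes(a, 0, a.length, b, 0, b.length));
    }
}
```

Under this ordering the bytes of "22" sort before the bytes of "3", because comparison stops at the first differing byte rather than looking at numeric value.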


> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withtest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Commented] (PIG-2959) Add a pig.cmd for Pig to run under Windows

2012-10-22 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481577#comment-13481577
 ] 

Daniel Dai commented on PIG-2959:
-

It is PIG-2873. Testing is especially important to make sure all features work.

> Add a pig.cmd for Pig to run under Windows
> --
>
> Key: PIG-2959
> URL: https://issues.apache.org/jira/browse/PIG-2959
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: pig.cmd
>
>




[jira] [Commented] (PIG-2881) Add SUBTRACT eval function

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481575#comment-13481575
 ] 

Jonathan Coveney commented on PIG-2881:
---

Joel,

This looks fine, but for the final oomph, can you do the following:
- Put it in Pig w/the proper package (either org.apache.pig.builtin or the 
Piggybank)
- Generate a diff against trunk

This will make it so people can download and see exactly what is being applied. 
We like to avoid cases where many steps need to be taken to exactly replicate 
what is in trunk.

Thanks for contributing
Jon

> Add SUBTRACT eval function
> --
>
> Key: PIG-2881
> URL: https://issues.apache.org/jira/browse/PIG-2881
> Project: Pig
>  Issue Type: New Feature
>  Components: piggybank
>Affects Versions: 0.10.0
>Reporter: Joel Costigliola
>Priority: Minor
> Attachments: Subtract.java, SubtractTest.java
>
>
> Close to DIFF function but SUBTRACT(bag1, bag2) will subtract elements of 
> bag2 from bag1.
>   
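
A minimal sketch of what SUBTRACT(bag1, bag2) could mean, using java.util lists in place of Pig bags. This is not the attached Subtract.java. It assumes that every element of bag1 that occurs anywhere in bag2 is dropped, and that bag1's duplicates and order are otherwise kept; the issue text does not spell out either point. SubtractSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration, not the attached Subtract UDF.
final class SubtractSketch {

    // Returns the elements of bag1 that do not occur in bag2,
    // keeping bag1's order and any duplicates that survive.
    static <T> List<T> subtract(List<T> bag1, List<T> bag2) {
        Set<T> toRemove = new HashSet<>(bag2);
        List<T> result = new ArrayList<>();
        for (T item : bag1) {
            if (!toRemove.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(subtract(Arrays.asList(1, 2, 2, 3), Arrays.asList(2)));
    }
}
```

Copying bag2 into a HashSet keeps the lookup cost per element of bag1 roughly constant, at the price of holding bag2 in memory.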



[jira] [Commented] (PIG-2582) Store size in bytes (not mbytes) in ResourceStatistics

2012-10-22 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481572#comment-13481572
 ] 

Prasanth J commented on PIG-2582:
-

Hi Prashant

Looks like some of the class member variables are still public. Is there any 
reason to leave it public? 

> Store size in bytes (not mbytes) in ResourceStatistics
> --
>
> Key: PIG-2582
> URL: https://issues.apache.org/jira/browse/PIG-2582
> Project: Pig
>  Issue Type: Bug
>Reporter: Travis Crawford
>Assignee: Prashant Kommireddi
>Priority: Minor
> Attachments: PIG-2582.patch
>
>
> In 
> [ResourceStatistics.java|http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/ResourceStatistics.java?view=markup]
>  we see mBytes is public, and has a public getter/setter.
> {code}
> public Long mBytes; // size in megabytes
> ...
> public Long getmBytes() {
>     return mBytes;
> }
> public ResourceStatistics setmBytes(Long mBytes) {
>     this.mBytes = mBytes;
>     return this;
> }
> {code}
> Typically sizes are stored as bytes, potentially having convenience functions 
> to return with different units.
> If mBytes can be marked private without causing woes it might be worth 
> storing size as bytes instead.



[jira] [Commented] (PIG-2582) Store size in bytes (not mbytes) in ResourceStatistics

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481569#comment-13481569
 ] 

Jonathan Coveney commented on PIG-2582:
---

Thanks for handling this, y'all. I'll +1 and commit, pending this making Travis 
happy.

> Store size in bytes (not mbytes) in ResourceStatistics
> --
>
> Key: PIG-2582
> URL: https://issues.apache.org/jira/browse/PIG-2582
> Project: Pig
>  Issue Type: Bug
>Reporter: Travis Crawford
>Assignee: Prashant Kommireddi
>Priority: Minor
> Attachments: PIG-2582.patch
>
>
> In 
> [ResourceStatistics.java|http://svn.apache.org/viewvc/pig/trunk/src/org/apache/pig/ResourceStatistics.java?view=markup]
>  we see mBytes is public, and has a public getter/setter.
> {code}
> public Long mBytes; // size in megabytes
> ...
> public Long getmBytes() {
>     return mBytes;
> }
> public ResourceStatistics setmBytes(Long mBytes) {
>     this.mBytes = mBytes;
>     return this;
> }
> {code}
> Typically sizes are stored as bytes, potentially having convenience functions 
> to return with different units.
> If mBytes can be marked private without causing woes it might be worth 
> storing size as bytes instead.



[jira] [Commented] (PIG-2959) Add a pig.cmd for Pig to run under Windows

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481559#comment-13481559
 ] 

Jonathan Coveney commented on PIG-2959:
---

I agree with Dmitriy on all counts. Further, where is the conversation on 
moving the bash script to python? That's probably a good change, but a 
nontrivial one.

> Add a pig.cmd for Pig to run under Windows
> --
>
> Key: PIG-2959
> URL: https://issues.apache.org/jira/browse/PIG-2959
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: pig.cmd
>
>




Re: Pig 0.11

2012-10-22 Thread Russell Jurney
JRuby gems in JRuby UDFs is a new feature and I'd like it to go in 0.11.

https://issues.apache.org/jira/browse/PIG-2927

Russell Jurney http://datasyndrome.com

On Oct 22, 2012, at 1:17 PM, Olga Natkovich  wrote:

> There are still 76 unresolved JIRAs, more than half unassigned. Let's clean 
> this up by the end of this week. I propose we do the following:
>
> (1) Unlink all JIRAs for new features since we already branched so we should 
> not be taking on new work. If people feel strongly that some new features 
> still need to go in please bring it up.
> (2) For bug fixes, if people feel strongly that some of the unassigned issues 
> need to be addressed please take ownership. If you are unable to solve them 
> but still feel they are important, please, bring them up.
> (3) Owners of unresolved issues, please, take a look if you will have time to 
> solve them in the next 2 weeks. If not, let's move them to 12. If you can't 
> address them but feel they are important, please, bring it up.
>
> Let's make sure that all JIRAs that require changes to the documentation have 
> appropriate information in the release notes section so that we can quickly 
> compile release documentation.
>
> Thanks for your help!
>
> Olga
>
>
>
>
> 
> From: Alan Gates 
> To: dev@pig.apache.org
> Sent: Monday, October 15, 2012 11:55 AM
> Subject: Re: Pig 0.11
>
> At this point no one has taken on release documentation for 0.11.
>
> Alan.
>
> On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:
>
>> Thanks!
>>
>> Are you talking about items 15 and 16 on the How To Release.Publish  page?
>>
>> Also, who is doing release documentation these days? I can help with that as 
>> well. I would also be happy to roll the release if you guys need help with 
>> that.
>>
>> Olga
>>
>>
>> 
>> From: Dmitriy Ryaboy 
>> To: "dev@pig.apache.org" 
>> Cc: "dev@pig.apache.org" 
>> Sent: Friday, October 12, 2012 5:59 PM
>> Subject: Re: Pig 0.11
>>
>> Thanks Olga and welcome back!
>> I know there's some process for linking jiras to releases, but I'm not sure 
>> what that is. If you could explain and maybe cover a portion of that work, 
>> that'd be super helpful. And reviews, of course.
>>
>> On Oct 12, 2012, at 2:06 PM, Olga Natkovich  wrote:
>>
>>> Dmitriy, I would be happy to help with the release process. Want to get back 
>>> into this now that I am back at work. Let me know what you would like me to 
>>> do.
>>>
>>> Olga
>>>
>>>
>>>
>>> 
>>> From: Dmitriy Ryaboy 
>>> To: dev@pig.apache.org
>>> Cc: billgra...@gmail.com
>>> Sent: Thursday, October 11, 2012 2:44 PM
>>> Subject: Re: Pig 0.11
>>>
>>> Ok I will branch 0.11 tomorrow morning unless someone objects.
>>> From then on, committers should be careful to commit bug fixes to both
>>> 0.11 branch and trunk; minor polish can go into the branch, but whole
>>> new features should not (we can discuss on the list if something is in
>>> the gray area).
>>>
>>> D
>>>
>>> On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
>>>  wrote:
>>>> I added it as a dependency as it has already its own Jira.
>>>> I hope it is OK.
>>>>
>>>> Cheers,
>>>> --
>>>> Gianmarco
>>>>
>>>>
>>>>
>>>> On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham  wrote:
>>>>
>>>>> +1 for me.
>>>>>
>>>>> There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
>>>>> documentation issues that should block Pig 0.11, but they can also be done
>>>>> on the trunk and merged to the branch. Gianmarco, you can add a rank
>>>>> subtask there to serve as a reminder.
>>>>>
>>>>>
>>>>> On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales <
>>>>> g...@apache.org> wrote:
>>>>>
>>>>>> We are missing some documentation on the RANK but I guess we could add
>>>>>> that to the branch and trunk in parallel.
>>>>>> All the patches I was keeping an eye on are in.
>>>>>>
>>>>>> So +1 for me.
>>>>>> --
>>>>>> Gianmarco
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney wrote:
>>>>>>
>>>>>>> I think all of the major patches are in, no? Now it's just bug testing?
>>>>>>> Just wanted to touch base on where we are at with this.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Note that I'm no longer using my Yahoo! email address. Please email me at
>>>>> billgra...@gmail.com going forward.*


[jira] [Commented] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Jonathan Coveney (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481548#comment-13481548
 ] 

Jonathan Coveney commented on PIG-2975:
---

Koji,

I am digging this!

A couple stylistic points:
- l1 is really hard to read. Please use fuller names (even len1 and len2)
- IMHO, spaces make = and + etc more readable (ie databytearraycompare=false)
- on that front, use camelCase for multi-word lines

Last point: I would love to have a unit test that tests this specifically. If 
you could do that, this would be truly pro-style. Otherwise it looks great.

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



[jira] [Commented] (PIG-2959) Add a pig.cmd for Pig to run under Windows

2012-10-22 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481540#comment-13481540
 ] 

Daniel Dai commented on PIG-2959:
-

Thanks Dmitriy!

> Add a pig.cmd for Pig to run under Windows
> --
>
> Key: PIG-2959
> URL: https://issues.apache.org/jira/browse/PIG-2959
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: pig.cmd
>
>




[jira] [Commented] (PIG-2353) RANK function like in SQL

2012-10-22 Thread Olga Natkovich (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481537#comment-13481537
 ] 

Olga Natkovich commented on PIG-2353:
-

Can you please add a usage example to the release notes section? Thanks!

> RANK function like in SQL
> -
>
> Key: PIG-2353
> URL: https://issues.apache.org/jira/browse/PIG-2353
> Project: Pig
>  Issue Type: New Feature
>Reporter: Gianmarco De Francisci Morales
>Assignee: Allan Avendaño
>  Labels: gsoc2012, mentor
> Fix For: 0.11
>
> Attachments: PIG-2353-2, PIG-2353-3.txt, PIG-2353-4.txt, 
> PIG-2353-5.txt, PIG2353.patch
>
>
> Implement a function that given a (sorted) bag adds to each tuple a unique, 
> increasing identifier without gaps, like what RANK does for SQL.
> This is a candidate project for Google summer of code 2012. More information 
> about the program can be found at 
> https://cwiki.apache.org/confluence/display/PIG/GSoc2012
> Functionality implemented so far, is available at 
> https://reviews.apache.org/r/5523/diff/#index_header
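
As a rough illustration of the behaviour described above (not the patch's actual implementation), numbering an already sorted bag without gaps amounts to counting the tuples in order. The function name `add_rank` and the list-of-tuples representation are assumptions made for this sketch.

```python
def add_rank(sorted_bag):
    """Prepend a unique, increasing, gap-free identifier to each tuple.

    `sorted_bag` is assumed to be an already sorted list of tuples.
    """
    return [(rank,) + tuple(row)
            for rank, row in enumerate(sorted_bag, start=1)]


bag = [("alice", 90), ("bob", 85), ("carol", 85)]
ranked = add_rank(bag)
```

Because the description asks for unique identifiers, tied rows in this sketch still receive distinct, consecutive numbers.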



[jira] [Commented] (PIG-2769) a simple logic causes very long compiling time on pig 0.10.0

2012-10-22 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481538#comment-13481538
 ] 

Timothy Chen commented on PIG-2769:
---

I decided to leave this issue open for someone else to grab for now, since the 
problem is mostly in the ANTLR-generated code and I am still learning about ANTLR.
It does seem that it's calling foreach_complex_statement recursively and pretty 
excessively, but I don't know enough to rework the grammar so that it 
still works.
I did verify that the ASTs generated in 0.9.2 and 0.10-SNAPSHOT are 
exactly the same, if that helps.

> a simple logic causes very long compiling time on pig 0.10.0
> 
>
> Key: PIG-2769
> URL: https://issues.apache.org/jira/browse/PIG-2769
> Project: Pig
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.10.0
> Environment: Apache Pig version 0.10.0-SNAPSHOT (rexported)
>Reporter: Dan Li
> Fix For: 0.12
>
> Attachments: case1.tar
>
>
> We found the following simple logic will cause very long compiling time for 
> pig 0.10.0, while using pig 0.8.1, everything is fine.
> A = load 'A.txt' using PigStorage()  AS (m: int);
> B = FOREACH A {
> days_str = (chararray)
> (m == 1 ? 31: 
> (m == 2 ? 28: 
> (m == 3 ? 31: 
> (m == 4 ? 30: 
> (m == 5 ? 31: 
> (m == 6 ? 30: 
> (m == 7 ? 31: 
> (m == 8 ? 31: 
> (m == 9 ? 30: 
> (m == 10 ? 31: 
> (m == 11 ? 30:31)))))))))));
> GENERATE
>days_str as days_str;
> }   
> store B into 'B';
> and here's a simple input file example: A.txt
> 1
> 2
> 3
> The pig version we used in the test
> Apache Pig version 0.10.0-SNAPSHOT (rexported)
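
For readers following the script, the nested bincond above computes the value sketched below. This is a Python rendering of the same mapping as a lookup table; it only illustrates the intended result and says nothing about how the Pig parser handles the expression. The name `days_str` mirrors the alias in the script, while the `DAYS` table is an assumption of this sketch.

```python
# Month number -> days, mirroring the nested bincond in the Pig script.
DAYS = {1: 31, 2: 28, 3: 31, 4: 30, 5: 31, 6: 30,
        7: 31, 8: 31, 9: 30, 10: 31, 11: 30}


def days_str(m):
    """Return the day count as a string; other values fall through to 31."""
    return str(DAYS.get(m, 31))


result = [days_str(m) for m in (1, 2, 3)]  # the three rows of A.txt
```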



[jira] [Assigned] (PIG-2207) Support custom counters for aggregating warnings from different udfs

2012-10-22 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen reassigned PIG-2207:
-

Assignee: Timothy Chen

> Support custom counters for aggregating warnings from different udfs
> 
>
> Key: PIG-2207
> URL: https://issues.apache.org/jira/browse/PIG-2207
> Project: Pig
>  Issue Type: Improvement
>Reporter: Thejas M Nair
>Assignee: Timothy Chen
>  Labels: newbie
>
> Pig allows udfs to aggregate warning messages instead of writing out a 
> separate warning message each time. Udfs can do this by logging the warning 
> using EvalFunc.warn(String msg, Enum) call. But the udfs are forced to use 
> PigWarning class if the warning needs to be printed at the end of the pig 
> script. 
> For example, with the changes in PIG-2191, some of the builtin udfs are using 
> PigWarning.UDF_WARNING_1 as argument in calls to EvalFunc.warn. This will 
> result in the warning count being printed on STDERR -
> {code}
> 2011-08-05 22:10:29,285 [main] WARN  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Encountered Warning UDF_WARNING_1 2 time(s).
> 2011-08-05 22:10:29,285 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> {code}
> But it would be better if a udf such as the LOWER udf could use a custom 
> warning counter, so that the STDERR output looks like -
> {code}
> 2011-08-05 22:10:29,285 [main] WARN  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Encountered Warning LOWER_FUNC_INPUT_WARNING 2 time(s).
> 2011-08-05 22:10:29,285 [main] INFO  
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
>  - Success!
> {code}
> A new function could be added to support this - (something like) 
> EvalFunc.warn(String warnName, String warnMsg);  A specific counter group 
> could be used for udf warnings (see org.apache.hadoop.mapred.Counters), and 
> the counters for that group could be summed during the final warning 
> aggregation done in MapReduceLauncher.computeWarningAggregate(). 
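
The proposal above can be pictured with a small sketch: warnings are counted under a caller-chosen name, and one summary line per name is produced at the end. This is Python, not Pig code; the class `WarningAggregator` and its methods are hypothetical stand-ins for the proposed `EvalFunc.warn(String warnName, String warnMsg)` and a dedicated counter group, not an existing API.

```python
from collections import Counter


class WarningAggregator:
    """Hypothetical stand-in for a counter group dedicated to UDF warnings."""

    def __init__(self):
        self.counts = Counter()

    def warn(self, warn_name, warn_msg):
        # Count by name instead of logging warn_msg on every call.
        self.counts[warn_name] += 1

    def summary(self):
        # One line per warning name, in the style of the log output above.
        return ["Encountered Warning %s %d time(s)." % (name, n)
                for name, n in sorted(self.counts.items())]


agg = WarningAggregator()
agg.warn("LOWER_FUNC_INPUT_WARNING", "input was null")
agg.warn("LOWER_FUNC_INPUT_WARNING", "input was null")
lines = agg.summary()
```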



[jira] [Updated] (PIG-2769) a simple logic causes very long compiling time on pig 0.10.0

2012-10-22 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated PIG-2769:
--

Assignee: (was: Timothy Chen)

> a simple logic causes very long compiling time on pig 0.10.0
> 
>
> Key: PIG-2769
> URL: https://issues.apache.org/jira/browse/PIG-2769
> Project: Pig
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.10.0
> Environment: Apache Pig version 0.10.0-SNAPSHOT (rexported)
>Reporter: Dan Li
> Fix For: 0.12
>
> Attachments: case1.tar
>
>
> We found the following simple logic will cause very long compiling time for 
> pig 0.10.0, while using pig 0.8.1, everything is fine.
> A = load 'A.txt' using PigStorage()  AS (m: int);
> B = FOREACH A {
> days_str = (chararray)
> (m == 1 ? 31: 
> (m == 2 ? 28: 
> (m == 3 ? 31: 
> (m == 4 ? 30: 
> (m == 5 ? 31: 
> (m == 6 ? 30: 
> (m == 7 ? 31: 
> (m == 8 ? 31: 
> (m == 9 ? 30: 
> (m == 10 ? 31: 
> (m == 11 ? 30:31)))))))))));
> GENERATE
>days_str as days_str;
> }   
> store B into 'B';
> and here's a simple input file example: A.txt
> 1
> 2
> 3
> The pig version we used in the test
> Apache Pig version 0.10.0-SNAPSHOT (rexported)



[jira] [Commented] (PIG-2898) Parallel execution of e2e tests

2012-10-22 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481532#comment-13481532
 ] 

Rohini Palaniswamy commented on PIG-2898:
-

A log file is created per test group. Once execution of a test group 
completes, its log is appended to the main log file. 
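
A minimal sketch of that logging scheme, assuming a thread pool and in-memory logs (the harness itself is not written in Python, and the names `run_group` and `run_all` are hypothetical):

```python
import io
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_group(name, tests):
    """Run one test group, writing to its own in-memory log."""
    log = io.StringIO()
    for test in tests:
        log.write("%s: %s ok\n" % (name, test))
    return log.getvalue()


def run_all(groups, threads=2):
    """Run groups in parallel; append each group's log once it completes."""
    main_log = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(run_group, n, t) for n, t in groups.items()]
        for future in as_completed(futures):
            main_log.append(future.result())  # a whole group at a time
    return "".join(main_log)


main = run_all({"Order": ["t1", "t2"], "Join": ["t1"]})
```

In this sketch the lines of one group stay contiguous in the main log, while the order of the groups follows completion order rather than submission order.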

> Parallel execution of e2e tests
> ---
>
> Key: PIG-2898
> URL: https://issues.apache.org/jira/browse/PIG-2898
> Project: Pig
>  Issue Type: Improvement
>  Components: e2e harness
>Affects Versions: 0.10.0
>Reporter: Andrey Klochkov
>Assignee: Ivan A. Veselovsky
>  Labels: test
> Attachments: PIG-2898-branch-0.10-6-final.patch, 
> PIG-2898-trunk-3.patch, PIG-2898-trunk-6-final.patch
>
>
> Today it takes ~19 hours to run the full set of e2e tests in mapred mode. The 
> bottleneck here is the client side, and per our observations it can help a 
> lot if the e2e harness would be able to run tests in parallel threads.
> We prototyped changes in e2e harness allowing to run tests in a configurable 
> number of threads. Preliminary results show more than 6x reduction in 
> execution time when using a small 3-nodes M/R cluster with modest 
> configuration. Going to share a patch shortly.



[jira] [Commented] (PIG-2898) Parallel execution of e2e tests

2012-10-22 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481528#comment-13481528
 ] 

Daniel Dai commented on PIG-2898:
-

One other question: is the test log written in sequence in the case of parallel 
execution?

> Parallel execution of e2e tests
> ---
>
> Key: PIG-2898
> URL: https://issues.apache.org/jira/browse/PIG-2898
> Project: Pig
>  Issue Type: Improvement
>  Components: e2e harness
>Affects Versions: 0.10.0
>Reporter: Andrey Klochkov
>Assignee: Ivan A. Veselovsky
>  Labels: test
> Attachments: PIG-2898-branch-0.10-6-final.patch, 
> PIG-2898-trunk-3.patch, PIG-2898-trunk-6-final.patch
>
>
> Today it takes ~19 hours to run the full set of e2e tests in mapred mode. The 
> bottleneck here is the client side, and per our observations it can help a 
> lot if the e2e harness would be able to run tests in parallel threads.
> We prototyped changes in e2e harness allowing to run tests in a configurable 
> number of threads. Preliminary results show more than 6x reduction in 
> execution time when using a small 3-nodes M/R cluster with modest 
> configuration. Going to share a patch shortly.



[jira] [Updated] (PIG-2769) a simple logic causes very long compiling time on pig 0.10.0

2012-10-22 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated PIG-2769:
--

Fix Version/s: (was: 0.11)
   0.12

> a simple logic causes very long compiling time on pig 0.10.0
> 
>
> Key: PIG-2769
> URL: https://issues.apache.org/jira/browse/PIG-2769
> Project: Pig
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.10.0
> Environment: Apache Pig version 0.10.0-SNAPSHOT (rexported)
>Reporter: Dan Li
>Assignee: Timothy Chen
> Fix For: 0.12
>
> Attachments: case1.tar
>
>
> We found the following simple logic will cause very long compiling time for 
> pig 0.10.0, while using pig 0.8.1, everything is fine.
> A = load 'A.txt' using PigStorage()  AS (m: int);
> B = FOREACH A {
> days_str = (chararray)
> (m == 1 ? 31: 
> (m == 2 ? 28: 
> (m == 3 ? 31: 
> (m == 4 ? 30: 
> (m == 5 ? 31: 
> (m == 6 ? 30: 
> (m == 7 ? 31: 
> (m == 8 ? 31: 
> (m == 9 ? 30: 
> (m == 10 ? 31: 
> (m == 11 ? 30:31)))))))))));
> GENERATE
>days_str as days_str;
> }   
> store B into 'B';
> and here's a simple input file example: A.txt
> 1
> 2
> 3
> The pig version we used in the test
> Apache Pig version 0.10.0-SNAPSHOT (rexported)



Re: Pig 0.11

2012-10-22 Thread Olga Natkovich
There are still 76 unresolved JIRAs, more than half of them unassigned. Let's 
clean this up by the end of this week. I propose we do the following:
 
(1) Unlink all JIRAs for new features, since we have already branched and should 
not be taking on new work. If people feel strongly that some new features still 
need to go in, please bring it up.
(2) For bug fixes, if people feel strongly that some of the unassigned issues 
need to be addressed, please take ownership. If you are unable to solve them but 
still feel they are important, please bring them up.
(3) Owners of unresolved issues, please check whether you will have time to 
solve them in the next 2 weeks. If not, let's move them to 0.12. If you can't 
address them but feel they are important, please bring it up.
 
Let's make sure that all JIRAs that require changes to the documentation have 
appropriate information in the release notes section so that we can quickly 
compile release documentation.
 
Thanks for your help!
 
Olga





From: Alan Gates 
To: dev@pig.apache.org 
Sent: Monday, October 15, 2012 11:55 AM
Subject: Re: Pig 0.11

At this point no one has taken on release documentation for 0.11.

Alan.

On Oct 15, 2012, at 11:49 AM, Olga Natkovich wrote:

> Thanks!
>  
> Are you talking about items 15 and 16 on the How To Release.Publish  page? 
>  
> Also, who is doing release documentation these days? I can help with that as 
> well. I would also be happy to roll the release if you guys need help with 
> that.
>  
> Olga
> 
> 
> 
> From: Dmitriy Ryaboy 
> To: "dev@pig.apache.org"  
> Cc: "dev@pig.apache.org"  
> Sent: Friday, October 12, 2012 5:59 PM
> Subject: Re: Pig 0.11
> 
> Thanks Olga and welcome back! 
> I know there's some process for linking jiras to releases, but I'm not sure 
> what that is. If you could explain and maybe cover a portion of that work, 
> that'd be super helpful. And reviews, of course. 
> 
> On Oct 12, 2012, at 2:06 PM, Olga Natkovich  wrote:
> 
>> Dmitriy, I would be happy to help with the release process. Want to get back 
>> into this now that I am back at work. Let me know what you would like me to 
>> do.
>>  
>> Olga
>> 
>> 
>> 
>> 
>> From: Dmitriy Ryaboy 
>> To: dev@pig.apache.org 
>> Cc: billgra...@gmail.com 
>> Sent: Thursday, October 11, 2012 2:44 PM
>> Subject: Re: Pig 0.11
>> 
>> Ok I will branch 0.11 tomorrow morning unless someone objects.
>> From then on, committers should be careful to commit bug fixes to both
>> 0.11 branch and trunk; minor polish can go into the branch, but whole
>> new features should not (we can discuss on the list if something is in
>> the gray area).
>> 
>> D
>> 
>> On Thu, Oct 11, 2012 at 2:16 PM, Gianmarco De Francisci Morales
>>  wrote:
>>> I added it as a dependency as it has already its own Jira.
>>> I hope it is OK.
>>> 
>>> Cheers,
>>> --
>>> Gianmarco
>>> 
>>> 
>>> 
>>> On Wed, Oct 10, 2012 at 11:23 PM, Bill Graham  wrote:
>>> 
>>>> +1 for me.
>>>> 
>>>> There's https://issues.apache.org/jira/browse/PIG-2756 which tracks a few
>>>> documentation issues that should block Pig 0.11, but they can also be done
>>>> on the trunk and merged to the branch. Gianmarco, you can add a rank
>>>> subtask there to serve as a reminder.
>>>> 
>>>> 
>>>> On Wed, Oct 10, 2012 at 11:03 PM, Gianmarco De Francisci Morales <
>>>> g...@apache.org> wrote:
>>>> 
>>>>> We are missing some documentation on the RANK but I guess we could add that
>>>>> to the branch and trunk in parallel.
>>>>> All the patches I was keeping an eye on are in.
>>>>> 
>>>>> So +1 for me.
>>>>> --
>>>>> Gianmarco
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Oct 10, 2012 at 5:31 PM, Jonathan Coveney > wrote:
>>>>> 
>>>>>> I think all of the major patches are in, no? Now it's just bug testing?
>>>>>> Just wanted to touch base on where we are at with this.
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> *Note that I'm no longer using my Yahoo! email address. Please email me at
>>>> billgra...@gmail.com going forward.*


[jira] [Commented] (PIG-2959) Add a pig.cmd for Pig to run under Windows

2012-10-22 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481490#comment-13481490
 ] 

Dmitriy V. Ryaboy commented on PIG-2959:


Ok. I don't particularly care about the Windows compatibility one way or 
another, but it seems to me like if a .py script is the eventual goal, then 
rather than having 2 divergent shell scripts that will have to be unified in 
the .py script later on, it would make sense to make the windows runner a 
python script to begin with, then make it compatible with the current pig bash 
script, and deprecate the latter. 
But I'm not going to -1 this patch or anything, do as you see fit.

> Add a pig.cmd for Pig to run under Windows
> --
>
> Key: PIG-2959
> URL: https://issues.apache.org/jira/browse/PIG-2959
> Project: Pig
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.11
>
> Attachments: pig.cmd
>
>




[jira] [Updated] (PIG-2975) TestTypedMap.testOrderBy failing with incorrect result

2012-10-22 Thread Koji Noguchi (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Koji Noguchi updated PIG-2975:
--

Attachment: 
pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt

bq. or we can just have a special lightweight comparator that special cases 
DataByteArrays, and delegates to BinInterSedesRawComparator otherwise.

This one was faster than I expected:
414 seconds on average, versus 398 seconds for the simple raw compare 
(including the header).
(Much faster than my bulky union approach at 436 seconds.)

I also tried moving this special-case comparator inside 
BinInterSedesRawComparator.compare, but that pushed the runtime back up to over 
600 seconds.

It's just one extra hop (method call) plus one extra check (Tuple_1), but 
somehow the JVM couldn't handle it well.

Adding test cases now.
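
The quoted suggestion can be pictured roughly as follows. This Python sketch is not the attached patch: the type tag `BYTEARRAY`, the one-byte header layout, and the `fallback` argument (standing in for BinInterSedesRawComparator) are all assumptions made for illustration.

```python
BYTEARRAY = 0x01  # hypothetical type tag stored in the first byte


def make_lightweight_comparator(fallback):
    """Compare byte-array payloads directly; delegate everything else."""
    def compare(a, b):
        if a[0] == BYTEARRAY and b[0] == BYTEARRAY:
            x, y = a[1:], b[1:]       # skip the one-byte header
            return (x > y) - (x < y)  # lexicographic byte comparison
        return fallback(a, b)
    return compare


calls = []


def fallback(a, b):
    calls.append((a, b))  # record that the general path was taken
    return 0


compare_keys = make_lightweight_comparator(fallback)
fast = compare_keys(b"\x01abc", b"\x01abd")
slow = compare_keys(b"\x02abc", b"\x01abd")
```

Only the second call reaches `fallback`, which is the point of keeping the special case in a thin wrapper outside the general comparator.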

> TestTypedMap.testOrderBy failing with incorrect result 
> ---
>
> Key: PIG-2975
> URL: https://issues.apache.org/jira/browse/PIG-2975
> Project: Pig
>  Issue Type: Sub-task
>Affects Versions: 0.11
>Reporter: Koji Noguchi
>Assignee: Koji Noguchi
>Priority: Blocker
> Fix For: 0.11
>
> Attachments: PIG-2975-0_jco.patch, PIG-2975-0_jco-v2.patch, 
> pig-2975-trunk_v01.txt, pig-2975-trunk_v02-broken.txt, 
> pig-2975-trunk_v03-unionapproach.txt, pig-2975-trunk_v04-purerawcompare.txt, 
> pig-2975-trunk_v05-BinInterSedesRawComparatorAndlightweight-withouttest.txt
>
>
> Looked at 
> {noformat}
> junit.framework.AssertionFailedError
> at org.apache.pig.test.TestTypedMap.testOrderBy(TestTypedMap.java:352)
> {noformat}
> This looks like a valid test case failing with incorrect result.
> {noformat}
> % cat test/orderby.txt
> [key#1,key9#23]
> [key#3,key3#2]
> [key#22]
> % cat test/orderby.pig
> a = load 'test/orderby.txt' as (m:[]);
> b = foreach a generate m#'key' as b0;
> dump b;
> c = order b by b0;
> dump c;
> % java ... org.apache.pig.Main -x local test/orderby.pig 
> [dump b]
> (1)
> (3)
> (22)
> ...
> [dump c]
> (1)
> (1)
> (22)
> %
> where did the '(3)' go?
> {noformat}



Build failed in Jenkins: Pig-trunk #1343

2012-10-22 Thread Apache Jenkins Server
See 

Changes:

[daijy] PIG-2950: Fix tiny documentation error in BagToString builtin

--
[...truncated 37832 lines...]
[junit] at 
org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:1934)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:788)
[junit] at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdownDataNodes(MiniDFSCluster.java:566)
[junit] at 
org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:550)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutdownMiniDfsClusters(MiniGenericCluster.java:87)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutdownMiniDfsAndMrClusters(MiniGenericCluster.java:77)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutDown(MiniGenericCluster.java:68)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:141)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
[junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
[junit] 12/10/22 10:32:53 WARN datanode.FSDatasetAsyncDiskService: 
AsyncDiskService has already shut down.
[junit] Shutting down DataNode 2
[junit] 12/10/22 10:32:53 INFO mortbay.log: Stopped 
SelectChannelConnector@localhost:0
[junit] 12/10/22 10:32:53 INFO ipc.Server: Stopping server on 50122
[junit] 12/10/22 10:32:53 INFO ipc.Server: IPC Server handler 1 on 50122: 
exiting
[junit] 12/10/22 10:32:53 INFO ipc.Server: IPC Server handler 0 on 50122: 
exiting
[junit] 12/10/22 10:32:53 INFO ipc.Server: IPC Server handler 2 on 50122: 
exiting
[junit] 12/10/22 10:32:53 INFO ipc.Server: Stopping IPC Server listener on 
50122
[junit] 12/10/22 10:32:53 INFO metrics.RpcInstrumentation: shut down
[junit] 12/10/22 10:32:53 INFO ipc.Server: Stopping IPC Server Responder
[junit] 12/10/22 10:32:53 WARN datanode.DataNode: 
DatanodeRegistration(127.0.0.1:59307, 
storageID=DS-918473012-67.195.138.20-59307-1350901486519, infoPort=43797, 
ipcPort=50122):DataXceiveServer:java.nio.channels.AsynchronousCloseException
[junit] at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
[junit] at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:159)
[junit] at 
sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:131)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] 
[junit] 12/10/22 10:32:53 INFO datanode.DataNode: Exiting DataXceiveServer
[junit] 12/10/22 10:32:53 INFO datanode.DataNode: Waiting for threadgroup 
to exit, active threads is 0
[junit] 12/10/22 10:32:53 INFO datanode.DataBlockScanner: Exiting 
DataBlockScanner thread.
[junit] 12/10/22 10:32:53 INFO datanode.DataNode: 
DatanodeRegistration(127.0.0.1:59307, 
storageID=DS-918473012-67.195.138.20-59307-1350901486519, infoPort=43797, 
ipcPort=50122):Finishing DataNode in: 
FSDataset{dirpath='
[junit] 12/10/22 10:32:53 INFO ipc.Server: Stopping server on 50122
[junit] 12/10/22 10:32:53 INFO metrics.RpcInstrumentation: shut down
[junit] 12/10/22 10:32:53 INFO datanode.DataNode: Waiting for threadgroup 
to exit, active threads is 0
[junit] 12/10/22 10:32:53 INFO datanode.FSDatasetAsyncDiskService: Shutting 
down all async disk service threads...
[junit] 12/10/22 10:32:53 INFO datanode.
