[jira] [Updated] (PIG-2178) Filtering a source and then merging the filtered rows only generates data from one half of the filtering

2011-07-18 Thread Derek Wollenstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Derek Wollenstein updated PIG-2178:
---

Description: 
Pig is generating a plan that eliminates half of the input data when using FILTER BY

To better illustrate, I created a small test case.
1. Create a file in HDFS called "/testinput"
   The contents of the file should be:
"1\ta\taline\n1\tb\tbline"
2. Run the following pig script:
ORIG = LOAD '/testinput' USING PigStorage() AS (parent_id: chararray, 
child_id:chararray, value:chararray);
-- Split into two inputs based on the value of child_id
A = FILTER ORIG BY child_id =='a';
B = FILTER ORIG BY child_id =='b';
-- Project out the column which chooses the correct data set
APROJ = FOREACH A GENERATE parent_id, value;
BPROJ = FOREACH B GENERATE parent_id, value;
-- Merge both datasets by parent id
ABMERGE = JOIN APROJ by parent_id FULL OUTER, BPROJ by parent_id;
-- Project the result
ABPROJ = FOREACH ABMERGE GENERATE APROJ::parent_id AS parent_id, 
APROJ::value,BPROJ::value;
DUMP ABPROJ;
3. The resulting tuple will be
(1,aline,aline)
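As a hypothetical illustration (plain Python, not Pig itself), the FULL OUTER join described by the script above should match APROJ's (1, aline) with BPROJ's (1, bline) on parent_id, so the expected tuple is (1,aline,bline), not the (1,aline,aline) that Pig produces:

```python
def full_outer_join(left, right, key=0):
    """Join two lists of (key, value) tuples on the key column,
    keeping unmatched rows on either side (padded with None)."""
    keys = {row[key] for row in left} | {row[key] for row in right}
    out = []
    for k in sorted(keys):
        l_rows = [r for r in left if r[key] == k] or [(k, None)]
        r_rows = [r for r in right if r[key] == k] or [(k, None)]
        for l in l_rows:
            for r in r_rows:
                out.append((k, l[1], r[1]))
    return out

# APROJ: rows where child_id == 'a'; BPROJ: rows where child_id == 'b'
aproj = [("1", "aline")]
bproj = [("1", "bline")]
print(full_outer_join(aproj, bproj))  # [('1', 'aline', 'bline')]
```

This is what the reported result (1,aline,aline) deviates from: the BPROJ side has been replaced by a second copy of the APROJ side.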


  was:
Pig is generating a plan that eliminates half of the input data when using FILTER BY

To better illustrate, I created a small test case.
1. Create a file in HDFS called "/testinput"
   The contents of the file should be:
"1\ta\taline\n1\tb\tbline"
2. Run the following pig script:
ORIG = LOAD '/testinput' USING PigStorage() AS (parent_id: chararray, 
child_id:chararray, value:chararray);
-- Split into two inputs based on the value of child_id
A = FILTER ORIG BY child_id =='a';
B = FILTER ORIG BY child_id =='b';
-- Project out the column which chooses the correct data set
APROJ = FOREACH A GENERATE parent_id, value;
BPROJ = FOREACH B GENERATE parent_id, value;
-- Merge both datasets by parent id
ABMERGE = JOIN APROJ by parent_id FULL OUTER, BPROJ by parent_id;
-- Project the result
ABPROJ = FOREACH ABMERGE GENERATE APROJ::parent_id AS parent_id, 
APROJ::value,BPROJ::value;
DUMP ABPROJ;
3. The resulting tuple will be
(1,aline,aline)



> Filtering a source and then merging the filtered rows only generates data 
> from one half of the filtering
> 
>
> Key: PIG-2178
> URL: https://issues.apache.org/jira/browse/PIG-2178
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1
>Reporter: Derek Wollenstein
>
> Pig is generating a plan that eliminates half of the input data when using 
> FILTER BY
> To better illustrate, I created a small test case.
> 1. Create a file in HDFS called "/testinput"
>The contents of the file should be:
> "1\ta\taline\n1\tb\tbline"
> 2. Run the following pig script:
> ORIG = LOAD '/testinput' USING PigStorage() AS (parent_id: chararray, 
> child_id:chararray, value:chararray);
> -- Split into two inputs based on the value of child_id
> A = FILTER ORIG BY child_id =='a';
> B = FILTER ORIG BY child_id =='b';
> -- Project out the column which chooses the correct data set
> APROJ = FOREACH A GENERATE parent_id, value;
> BPROJ = FOREACH B GENERATE parent_id, value;
> -- Merge both datasets by parent id
> ABMERGE = JOIN APROJ by parent_id FULL OUTER, BPROJ by parent_id;
> -- Project the result
> ABPROJ = FOREACH ABMERGE GENERATE APROJ::parent_id AS parent_id, 
> APROJ::value,BPROJ::value;
> DUMP ABPROJ;
> 3. The resulting tuple will be
> (1,aline,aline)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (PIG-2178) Filtering a source and then merging the filtered rows only generates data from one half of the filtering

2011-07-18 Thread Derek Wollenstein (JIRA)
Filtering a source and then merging the filtered rows only generates data from 
one half of the filtering


 Key: PIG-2178
 URL: https://issues.apache.org/jira/browse/PIG-2178
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.8.1
Reporter: Derek Wollenstein


Pig is generating a plan that eliminates half of the input data when using FILTER BY

To better illustrate, I created a small test case.
1. Create a file in HDFS called "/testinput"
   The contents of the file should be:
"1\ta\taline\n1\tb\tbline"
2. Run the following pig script:
ORIG = LOAD '/testinput' USING PigStorage() AS (parent_id: chararray, 
child_id:chararray, value:chararray);
-- Split into two inputs based on the value of child_id
A = FILTER ORIG BY child_id =='a';
B = FILTER ORIG BY child_id =='b';
-- Project out the column which chooses the correct data set
APROJ = FOREACH A GENERATE parent_id, value;
BPROJ = FOREACH B GENERATE parent_id, value;
-- Merge both datasets by parent id
ABMERGE = JOIN APROJ by parent_id FULL OUTER, BPROJ by parent_id;
-- Project the result
ABPROJ = FOREACH ABMERGE GENERATE APROJ::parent_id AS parent_id, 
APROJ::value,BPROJ::value;
DUMP ABPROJ;
3. The resulting tuple will be
(1,aline,aline)






Build failed in Jenkins: Pig-trunk-commit #862

2011-07-18 Thread Apache Jenkins Server
See 

Changes:

[daijy] Make Pig work with hadoop .NEXT

--
[...truncated 39793 lines...]
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could 
not complete write to file 
/tmp/TestStore-output--9208950352273007539.txt_cleanupOnFailure_succeeded1 by 
DFSClient_-814205210
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] at org.apache.hadoop.ipc.Client.call(Client.java:740)
[junit] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1043)
[junit] at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:237)
[junit] at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:269)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutdownMiniDfsClusters(MiniGenericCluster.java:83)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutdownMiniDfsAndMrClusters(MiniGenericCluster.java:77)
[junit] at 
org.apache.pig.test.MiniGenericCluster.shutDown(MiniGenericCluster.java:68)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:127)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
[junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
[junit] 11/07/19 02:28:01 WARN hdfs.StateChange: DIR* 
NameSystem.completeFile: failed to complete 
/tmp/TestStore-output-201847589470793672.txt_cleanupOnFailure_succeeded because 
dir.getFileBlocks() is null  and pendingFile is null
[junit] 11/07/19 02:28:01 INFO ipc.Server: IPC Server handler 6 on 52543, 
call 
complete(/tmp/TestStore-output-201847589470793672.txt_cleanupOnFailure_succeeded,
 DFSClient_-814205210) from 127.0.0.1:51454: e

[jira] [Updated] (PIG-2177) e2e test harness should not assume hadoop dir structure

2011-07-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-2177:
--

Status: Patch Available  (was: Open)

> e2e test harness should not assume hadoop dir structure
> ---
>
> Key: PIG-2177
> URL: https://issues.apache.org/jira/browse/PIG-2177
> Project: Pig
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.10
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Trivial
> Fix For: 0.10
>
> Attachments: pig-2177.patch
>
>
> The testconfigpath variable assumes a conf/ dir exists in the $PH_CLUSTER 
> location. It may or may not exist. If it exists, it's better to provide the 
> full path including conf/. If it doesn't exist, the full path to the dir 
> containing the *.xml files can be provided.





[jira] [Updated] (PIG-2177) e2e test harness should not assume hadoop dir structure

2011-07-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-2177:
--

Attachment: pig-2177.patch

Patch in correct format.

> e2e test harness should not assume hadoop dir structure
> ---
>
> Key: PIG-2177
> URL: https://issues.apache.org/jira/browse/PIG-2177
> Project: Pig
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.10
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Trivial
> Fix For: 0.10
>
> Attachments: pig-2177.patch
>
>
> The testconfigpath variable assumes a conf/ dir exists in the $PH_CLUSTER 
> location. It may or may not exist. If it exists, it's better to provide the 
> full path including conf/. If it doesn't exist, the full path to the dir 
> containing the *.xml files can be provided.





[jira] [Updated] (PIG-2177) e2e test harness should not assume hadoop dir structure

2011-07-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-2177:
--

Attachment: (was: pig-2177.patch)

> e2e test harness should not assume hadoop dir structure
> ---
>
> Key: PIG-2177
> URL: https://issues.apache.org/jira/browse/PIG-2177
> Project: Pig
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.10
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Trivial
> Fix For: 0.10
>
>
> The testconfigpath variable assumes a conf/ dir exists in the $PH_CLUSTER 
> location. It may or may not exist. If it exists, it's better to provide the 
> full path including conf/. If it doesn't exist, the full path to the dir 
> containing the *.xml files can be provided.





[jira] [Updated] (PIG-2177) e2e test harness should not assume hadoop dir structure

2011-07-18 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-2177:
--

Attachment: pig-2177.patch

Patch attached.

> e2e test harness should not assume hadoop dir structure
> ---
>
> Key: PIG-2177
> URL: https://issues.apache.org/jira/browse/PIG-2177
> Project: Pig
>  Issue Type: Improvement
>  Components: tools
>Affects Versions: 0.10
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Trivial
> Fix For: 0.10
>
> Attachments: pig-2177.patch
>
>
> The testconfigpath variable assumes a conf/ dir exists in the $PH_CLUSTER 
> location. It may or may not exist. If it exists, it's better to provide the 
> full path including conf/. If it doesn't exist, the full path to the dir 
> containing the *.xml files can be provided.





[jira] [Created] (PIG-2177) e2e test harness should not assume hadoop dir structure

2011-07-18 Thread Ashutosh Chauhan (JIRA)
e2e test harness should not assume hadoop dir structure
---

 Key: PIG-2177
 URL: https://issues.apache.org/jira/browse/PIG-2177
 Project: Pig
  Issue Type: Improvement
  Components: tools
Affects Versions: 0.10
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Trivial
 Fix For: 0.10


The testconfigpath variable assumes a conf/ dir exists in the $PH_CLUSTER 
location. It may or may not exist. If it exists, it's better to provide the 
full path including conf/. If it doesn't exist, the full path to the dir 
containing the *.xml files can be provided.
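A minimal sketch of the path resolution being requested, in Python. The function name (resolve_conf_dir) and the use of $PH_CLUSTER as a plain directory path are illustrative assumptions, not the harness's actual API: prefer $PH_CLUSTER/conf when it exists, otherwise treat $PH_CLUSTER itself as the directory holding the *.xml config files.

```python
import os

def resolve_conf_dir(ph_cluster):
    """Return ph_cluster/conf if that directory exists,
    otherwise fall back to ph_cluster itself."""
    conf = os.path.join(ph_cluster, "conf")
    return conf if os.path.isdir(conf) else ph_cluster
```

With this rule the harness would work both for a Hadoop layout that ships a conf/ subdirectory and for one where the *.xml files sit directly in the configured location.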





[jira] [Commented] (PIG-1857) Create an package integration project

2011-07-18 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067429#comment-13067429
 ] 

Eric Yang commented on PIG-1857:


The meta packages do not match between Debian and Red Hat.  On Debian, the 
package depends on openjdk-6-jre-headless because Oracle does not offer a 
Debian Java package on their website.  Hence, the dependency is set to the 
default JDK provided by Debian (same for Ubuntu).  I will add "hadoop" to the 
Debian control file.  As for sh-utils and textutils, those packages are part of 
the base OS, hence no dependency needs to be defined.

"source" change to "source-distribution" works for me too.

> Create an package integration project
> -
>
> Key: PIG-1857
> URL: https://issues.apache.org/jira/browse/PIG-1857
> Project: Pig
>  Issue Type: New Feature
>  Components: build
>Affects Versions: 0.9.0
> Environment: RHEL 5.5/Ubuntu 10.10
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: PIG-1857-1.patch, PIG-1857-2.patch, 
> PIG-1857-draft.patch, PIG-1857.patch
>
>
> The goal of this ticket is to generate a set of RPM/Debian packages which 
> integrate well with the RPM sets created by HADOOP-6255.  





[jira] [Commented] (PIG-2125) Make Pig work with hadoop .NEXT

2011-07-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067428#comment-13067428
 ] 

Daniel Dai commented on PIG-2125:
-

PIG-2125-4.patch committed to both trunk and the 0.9 branch. This is the first 
phase of the work (mapreduce and local mode end-to-end); I will continue 
working on the next phase (unit tests).

> Make Pig work with hadoop .NEXT
> ---
>
> Key: PIG-2125
> URL: https://issues.apache.org/jira/browse/PIG-2125
> Project: Pig
>  Issue Type: New Feature
>  Components: impl
>Affects Versions: 0.10
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.10
>
> Attachments: PIG-2125-1.patch, PIG-2125-2.patch, PIG-2125-3.patch, 
> PIG-2125-4.patch
>
>
> We need to make Pig work with hadoop .NEXT, the svn branch currently is: 
> https://svn.apache.org/repos/asf/hadoop/common/branches/MR-279





[jira] [Commented] (PIG-1857) Create an package integration project

2011-07-18 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067419#comment-13067419
 ] 

Alan Gates commented on PIG-1857:
-

A few initial comments:

In rpm/spec/pig.spec the dependencies are set to "Requires: hadoop, sh-utils, 
textutils, jdk >= 1.6".  But in deb/pig.control/control they are set to 
"Depends: openjdk-6-jre-headless".  Shouldn't these match?

nit: the build target "source" should be called source-distribution or 
something.  "source" sounds like it checks out the source.

I still need to test this and think more about how it arranges files.


> Create an package integration project
> -
>
> Key: PIG-1857
> URL: https://issues.apache.org/jira/browse/PIG-1857
> Project: Pig
>  Issue Type: New Feature
>  Components: build
>Affects Versions: 0.9.0
> Environment: RHEL 5.5/Ubuntu 10.10
>Reporter: Eric Yang
>Assignee: Eric Yang
> Attachments: PIG-1857-1.patch, PIG-1857-2.patch, 
> PIG-1857-draft.patch, PIG-1857.patch
>
>
> The goal of this ticket is to generate a set of RPM/Debian packages which 
> integrate well with the RPM sets created by HADOOP-6255.  





Build failed in Jenkins: Pig-trunk-commit #861

2011-07-18 Thread Apache Jenkins Server
See 

Changes:

[daijy] PIG-2159: New logical plan uses incorrect class for SUM causing for 
ClassCastException

--
[...truncated 40200 lines...]
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 11/07/19 00:28:53 ERROR hdfs.DFSClient: Exception closing file 
/tmp/TestStore-output--819617621618344.txt_cleanupOnFailure_succeeded2 : 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not complete 
write to file 
/tmp/TestStore-output--819617621618344.txt_cleanupOnFailure_succeeded2 by 
DFSClient_1268025084
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could 
not complete write to file 
/tmp/TestStore-output--819617621618344.txt_cleanupOnFailure_succeeded2 by 
DFSClient_1268025084
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] at org.apache.hadoop.ipc.Client.call(Client.java:740)
[junit] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1043)
[junit] at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:237)
[junit] at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:269)
[junit] at 
org.apache.pig.test.MiniCluster.shutdownMiniDfsAndMrClusters(MiniCluster.java:111)
[junit] at 
org.apache.pig.test.MiniCluster.shutDown(MiniCluster.java:101)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:127)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

[jira] [Updated] (PIG-2171) TestScriptLanguage is broken on trunk

2011-07-18 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-2171:
---

Attachment: PIG-2171.2.patch

I have made a small change to the patch, so that the expected message, 
including the line number, is checked - PIG-2171.2.patch 

> TestScriptLanguage is broken on trunk
> -
>
> Key: PIG-2171
> URL: https://issues.apache.org/jira/browse/PIG-2171
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.10
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.10
>
> Attachments: PIG-2171-1.patch, PIG-2171.2.patch
>
>






[jira] [Updated] (PIG-2077) Project UDF output inside a non-foreach statement fail on 0.8

2011-07-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2077:


  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed to 0.8 branch.

> Project UDF output inside a non-foreach statement fail on 0.8
> -
>
> Key: PIG-2077
> URL: https://issues.apache.org/jira/browse/PIG-2077
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.8.1
>
> Attachments: PIG-2077-1.patch
>
>
> The following script fails on 0.8:
> {code}
> A = load '1.txt' as (tracking_id, day:chararray);
> B = load '2.txt' as (tracking_id, timestamp:chararray);
> C = JOIN A by (tracking_id, day) LEFT OUTER, B by (tracking_id,  
> STRSPLIT(timestamp, ' ').$0);
> explain C;
> {code}
> Error stack:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.get(ArrayList.java:324)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.findReferent(ProjectExpression.java:207)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:121)
> at 
> org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:193)
> at 
> org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
> at 
> org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:83)
> at 
> org.apache.pig.newplan.logical.relational.LOJoin.accept(LOJoin.java:149)
> at 
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:262)
> This is not a problem on 0.9 or trunk, since LogicalExpPlanMigrationVistor 
> was dropped in 0.9.
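As a hypothetical aside (Python, not Pig), the join-key expression that triggers the bug, STRSPLIT(timestamp, ' ').$0, simply extracts the date portion of a 'date time' string so it can be matched against A's day column:

```python
def day_key(timestamp):
    """Mimics STRSPLIT(timestamp, ' ').$0: take the part
    before the first space, i.e. the date portion."""
    return timestamp.split(" ")[0]

print(day_key("2011-07-18 23:28:49"))  # 2011-07-18
```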





[jira] [Updated] (PIG-2159) New logical plan uses incorrect class for SUM causing for ClassCastException

2011-07-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2159:


  Resolution: Fixed
Assignee: Daniel Dai
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Patch committed to both 0.9 branch and trunk

> New logical plan uses incorrect class for  SUM causing for ClassCastException
> -
>
> Key: PIG-2159
> URL: https://issues.apache.org/jira/browse/PIG-2159
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Vivek Padmanabhan
>Assignee: Daniel Dai
>Priority: Blocker
> Fix For: 0.9.0
>
> Attachments: PIG-2159-1.patch, PIG-2159-2.patch
>
>
> The below is my script;
> {code}
> A = load 'input1' using PigStorage(',')  as 
> (f1:int,f2:int,f3:int,f4:long,f5:double);
> B = load 'input2' using PigStorage(',')  as 
> (f1:int,f2:int,f3:int,f4:long,f5:double);
> C = load 'input_Main' using PigStorage(',')  as (f1:int,f2:int,f3:int);
> U = UNION ONSCHEMA A,B;
> J = join C by (f1,f2,f3) LEFT OUTER, U by (f1,f2,f3);
> Porj = foreach J generate C::f1 as f1 ,C::f2 as f2,C::f3 as f3,U::f4 as 
> f4,U::f5 as f5;
> G = GROUP Porj by (f1,f2,f3,f5);
> Final = foreach G generate SUM(Porj.f4) as total;
> dump Final;
> {code}
> The script fails while computing the sum with a ClassCastException.
> Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
> java.lang.Double
>  at org.apache.pig.builtin.DoubleSum$Initial.exec(DoubleSum.java:82)
>  ... 19 more
> This is clearly a bug in the logical plan created in 0.9. The sum operation 
> should have been processed using org.apache.pig.builtin.LongSum, but instead 
> the 0.9 logical plan has used org.apache.pig.builtin.DoubleSum, which is 
> meant for sums of doubles. Hence the ClassCastException.
> The same script works fine with Pig 0.8.
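A minimal sketch of the type rule the report expects, in Python (the function and type names here are illustrative, not Pig's internals): since f4 is declared long in both A and B, the UNION ONSCHEMA schema keeps it long, and SUM should dispatch to long (LongSum-style) arithmetic rather than a double (DoubleSum-style) accumulator.

```python
def sum_by_schema(values, field_type):
    """Dispatch the aggregate on the declared field type,
    mimicking LongSum vs DoubleSum selection by schema."""
    if field_type == "long":
        return sum(int(v) for v in values)    # exact integer arithmetic
    if field_type == "double":
        return sum(float(v) for v in values)  # floating-point arithmetic
    raise TypeError("unsupported field type: %s" % field_type)

print(sum_by_schema([3, 4], "long"))  # 7
```

Dispatching the long column through the double path is the analogue of the Long-to-Double cast that fails in DoubleSum$Initial.exec.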





Build failed in Jenkins: Pig-trunk-commit #860

2011-07-18 Thread Apache Jenkins Server
See 

Changes:

[daijy] PIG-2172: Fix test failure for ant 1.8.x

--
[...truncated 40211 lines...]
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 11/07/18 23:28:49 ERROR hdfs.DFSClient: Exception closing file 
/tmp/TestStore-output-741936957838234.txt_cleanupOnFailure_succeeded2 : 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not complete 
write to file 
/tmp/TestStore-output-741936957838234.txt_cleanupOnFailure_succeeded2 by 
DFSClient_1891683814
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could 
not complete write to file 
/tmp/TestStore-output-741936957838234.txt_cleanupOnFailure_succeeded2 by 
DFSClient_1891683814
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] at org.apache.hadoop.ipc.Client.call(Client.java:740)
[junit] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1043)
[junit] at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:237)
[junit] at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:269)
[junit] at 
org.apache.pig.test.MiniCluster.shutdownMiniDfsAndMrClusters(MiniCluster.java:111)
[junit] at 
org.apache.pig.test.MiniCluster.shutDown(MiniCluster.java:101)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:127)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
[junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java

[jira] [Commented] (PIG-2077) Project UDF output inside a non-foreach statement fail on 0.8

2011-07-18 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067390#comment-13067390
 ] 

jirapos...@reviews.apache.org commented on PIG-2077:



---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/767/#review1105
---

Ship it!


+1

- thejas


On 2011-05-19 22:26:01, Daniel Dai wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/767/
bq.  ---
bq.  
bq.  (Updated 2011-05-19 22:26:01)
bq.  
bq.  
bq.  Review request for pig and thejas.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  See PIG-2077
bq.  
bq.  
bq.  This addresses bug PIG-2077.
bq.  https://issues.apache.org/jira/browse/PIG-2077
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
branches/branch-0.8/src/org/apache/pig/newplan/logical/LogicalExpPlanMigrationVistor.java
 1104455 
bq.branches/branch-0.8/test/org/apache/pig/test/TestEvalPipeline2.java 
1104455 
bq.  
bq.  Diff: https://reviews.apache.org/r/767/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Test patch:
bq.   [exec] +1 overall.  
bq.   [exec] 
bq.   [exec] +1 @author.  The patch does not contain any @author tags.
bq.   [exec] 
bq.   [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
bq.   [exec] 
bq.   [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
bq.   [exec] 
bq.   [exec] +1 javac.  The applied patch does not increase the total 
number of javac compiler warnings.
bq.   [exec] 
bq.   [exec] +1 findbugs.  The patch does not introduce any new 
Findbugs warnings.
bq.   [exec] 
bq.   [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
bq.  
bq.  Unit test:
bq.  all pass
bq.  
bq.  End to end test:
bq.  all pass
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Daniel
bq.  
bq.



> Project UDF output inside a non-foreach statement fail on 0.8
> -
>
> Key: PIG-2077
> URL: https://issues.apache.org/jira/browse/PIG-2077
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.8.1
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.8.1
>
> Attachments: PIG-2077-1.patch
>
>
> The following script fails on 0.8:
> {code}
> A = load '1.txt' as (tracking_id, day:chararray);
> B = load '2.txt' as (tracking_id, timestamp:chararray);
> C = JOIN A by (tracking_id, day) LEFT OUTER, B by (tracking_id,  
> STRSPLIT(timestamp, ' ').$0);
> explain C;
> {code}
> Error stack:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
> at java.util.ArrayList.get(ArrayList.java:324)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.findReferent(ProjectExpression.java:207)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.getFieldSchema(ProjectExpression.java:121)
> at 
> org.apache.pig.newplan.logical.optimizer.FieldSchemaResetter.execute(SchemaResetter.java:193)
> at 
> org.apache.pig.newplan.logical.expression.AllSameExpressionVisitor.visit(AllSameExpressionVisitor.java:53)
> at 
> org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:75)
> at 
> org.apache.pig.newplan.ReverseDependencyOrderWalker.walk(ReverseDependencyOrderWalker.java:70)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.newplan.logical.optimizer.SchemaResetter.visit(SchemaResetter.java:83)
> at 
> org.apache.pig.newplan.logical.relational.LOJoin.accept(LOJoin.java:149)
> at 
> org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
> at 
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:262)
> This is not a problem on 0.9 or trunk, since LogicalExpPlanMigrationVistor was 
> dropped in 0.9.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: Project UDF output inside a non-foreach statement fail on 0.8

2011-07-18 Thread thejas . nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/767/#review1105
---

Ship it!


+1

- thejas


On 2011-05-19 22:26:01, Daniel Dai wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/767/
> ---
> 
> (Updated 2011-05-19 22:26:01)
> 
> 
> Review request for pig and thejas.
> 
> 
> Summary
> ---
> 
> See PIG-2077
> 
> 
> This addresses bug PIG-2077.
> https://issues.apache.org/jira/browse/PIG-2077
> 
> 
> Diffs
> -
> 
>   
> branches/branch-0.8/src/org/apache/pig/newplan/logical/LogicalExpPlanMigrationVistor.java
>  1104455 
>   branches/branch-0.8/test/org/apache/pig/test/TestEvalPipeline2.java 1104455 
> 
> Diff: https://reviews.apache.org/r/767/diff
> 
> 
> Testing
> ---
> 
> Test patch:
>  [exec] +1 overall.  
>  [exec] 
>  [exec] +1 @author.  The patch does not contain any @author tags.
>  [exec] 
>  [exec] +1 tests included.  The patch appears to include 3 new or 
> modified tests.
>  [exec] 
>  [exec] +1 javadoc.  The javadoc tool did not generate any warning 
> messages.
>  [exec] 
>  [exec] +1 javac.  The applied patch does not increase the total 
> number of javac compiler warnings.
>  [exec] 
>  [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
> warnings.
>  [exec] 
>  [exec] +1 release audit.  The applied patch does not increase the 
> total number of release audit warnings.
> 
> Unit test:
> all pass
> 
> End to end test:
> all pass
> 
> 
> Thanks,
> 
> Daniel
> 
>



[jira] [Created] (PIG-2176) add logical plan assumption checker

2011-07-18 Thread Thejas M Nair (JIRA)
add logical plan assumption checker 


 Key: PIG-2176
 URL: https://issues.apache.org/jira/browse/PIG-2176
 Project: Pig
  Issue Type: Improvement
Affects Versions: 0.9.0
Reporter: Thejas M Nair
 Fix For: 0.9.0, 0.10


Pig expects certain invariants to hold in the LogicalPlan, and the optimizer 
logic depends on them being true. Code that verifies these assumptions will help 
catch issues early on. 
Some of the assumptions that should be checked: 
1. All schemas have a valid uid (not -1).
2. All fields in a schema have distinct uids.
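The two checks above can be sketched in plain Python (a hedged analogy rather than Pig's Java visitor machinery; the plan and operator layout here are hypothetical toy structures):

```python
# Hypothetical sketch of a logical-plan assumption checker: walk each
# operator's schema and verify (1) every field has a valid uid (not -1)
# and (2) uids within a single schema are distinct.

def check_plan(plan):
    """Return violation messages for a plan, modeled as a list of
    operators, each with a `schema` list of (field_name, uid) pairs."""
    violations = []
    for op in plan:
        seen = set()
        for name, uid in op["schema"]:
            if uid == -1:
                violations.append(f"{op['name']}: field {name} has invalid uid -1")
            elif uid in seen:
                violations.append(f"{op['name']}: duplicate uid {uid} on field {name}")
            seen.add(uid)
    return violations

plan = [
    {"name": "LOLoad", "schema": [("f1", 10), ("f2", 11)]},   # ok
    {"name": "LOJoin", "schema": [("f1", 10), ("f2", 10), ("f3", -1)]},
]
print(check_plan(plan))  # flags the duplicate uid and the -1 uid on LOJoin
```

A real checker would run after each optimizer rule so a rule that corrupts uids is caught at the rule that introduced the problem, not downstream.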


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (PIG-2159) New logical plan uses incorrect class for SUM causing for ClassCastException

2011-07-18 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067380#comment-13067380
 ] 

Thejas M Nair commented on PIG-2159:


+1


> New logical plan uses incorrect class for  SUM causing for ClassCastException
> -
>
> Key: PIG-2159
> URL: https://issues.apache.org/jira/browse/PIG-2159
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Vivek Padmanabhan
>Priority: Blocker
> Fix For: 0.9.0
>
> Attachments: PIG-2159-1.patch, PIG-2159-2.patch
>
>
> The below is my script;
> {code}
> A = load 'input1' using PigStorage(',')  as 
> (f1:int,f2:int,f3:int,f4:long,f5:double);
> B = load 'input2' using PigStorage(',')  as 
> (f1:int,f2:int,f3:int,f4:long,f5:double);
> C = load 'input_Main' using PigStorage(',')  as (f1:int,f2:int,f3:int);
> U = UNION ONSCHEMA A,B;
> J = join C by (f1,f2,f3) LEFT OUTER, U by (f1,f2,f3);
> Porj = foreach J generate C::f1 as f1 ,C::f2 as f2,C::f3 as f3,U::f4 as 
> f4,U::f5 as f5;
> G = GROUP Porj by (f1,f2,f3,f5);
> Final = foreach G generate SUM(Porj.f4) as total;
> dump Final;
> {code}
> The script fails while computing the sum with a ClassCastException.
> Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to 
> java.lang.Double
>  at org.apache.pig.builtin.DoubleSum$Initial.exec(DoubleSum.java:82)
>  ... 19 more
> This is clearly a bug in the logical plan created in 0.9. The sum should have 
> been computed using org.apache.pig.builtin.LongSum, but the 0.9 logical plan 
> instead used org.apache.pig.builtin.DoubleSum, which is meant for sums of 
> doubles; hence the ClassCastException.
> The same script works fine with Pig 0.8.
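The mis-resolution described above can be illustrated with a small Python sketch (a hedged analogy, not Pig's actual resolver code; the real classes are org.apache.pig.builtin.LongSum and DoubleSum, stood in for here by plain functions):

```python
# Hypothetical illustration of choosing a SUM implementation from the
# declared type of the input field. Picking double_sum for a long field
# is the analogue of the 0.9 bug: in Java, casting a Long to Double in
# DoubleSum$Initial.exec throws ClassCastException.

def long_sum(values):
    return sum(int(v) for v in values)

def double_sum(values):
    return sum(float(v) for v in values)

SUM_BY_TYPE = {"int": long_sum, "long": long_sum, "double": double_sum}

def resolve_sum(field_type):
    # Correct resolution keys off the field's declared type, so f4:long
    # in the script above must map to the long accumulator.
    return SUM_BY_TYPE[field_type]

print(resolve_sum("long")([1, 2, 3]))  # 6
```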

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Pig-trunk #1047

2011-07-18 Thread Apache Jenkins Server
See 

Changes:

[dvryaboy] PIG-2143: Make PigStorage optionally store schema; improve docs.

[thejas] PIG-1973: UDFContext.getUDFContext usage of ThreadLocal pattern
 is not typical (woody via thejas)

[thejas] PIG-2053: PigInputFormat uses class.isAssignableFrom() where 
instanceof is more appropriate (woody via thejas)

--
[...truncated 39691 lines...]
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 11/07/18 22:32:59 ERROR hdfs.DFSClient: Exception closing file 
/tmp/TestStore-output-109919389985994422.txt_cleanupOnFailure_succeeded2 : 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could not complete 
write to file 
/tmp/TestStore-output-109919389985994422.txt_cleanupOnFailure_succeeded2 by 
DFSClient_-18740787
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] org.apache.hadoop.ipc.RemoteException: java.io.IOException: Could 
not complete write to file 
/tmp/TestStore-output-109919389985994422.txt_cleanupOnFailure_succeeded2 by 
DFSClient_-18740787
[junit] at 
org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:449)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
[junit] at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
[junit] at java.security.AccessController.doPrivileged(Native Method)
[junit] at javax.security.auth.Subject.doAs(Subject.java:396)
[junit] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
[junit] 
[junit] at org.apache.hadoop.ipc.Client.call(Client.java:740)
[junit] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1043)
[junit] at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:237)
[junit] at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:269)
[junit] at 
org.apache.pig.test.MiniCluster.shutdownMiniDfsAndMrClusters(MiniCluster.java:111)
[junit] at 
org.apache.pig.test.MiniCluster.shutDown(MiniCluster.java:101)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:127)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[j

[jira] [Resolved] (PIG-2172) Fix test failure for ant 1.8.x

2011-07-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-2172.
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]

> Fix test failure for ant 1.8.x
> --
>
> Key: PIG-2172
> URL: https://issues.apache.org/jira/browse/PIG-2172
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.10
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.10
>
> Attachments: PIG-2172-1.patch
>
>
> Some tests fail under ant 1.8.x but pass under ant 1.7.x.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (PIG-2174) HBaseStorage column filters miss some fields

2011-07-18 Thread Bill Graham (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067358#comment-13067358
 ] 

Bill Graham commented on PIG-2174:
--

FYI, the HBase issue I mentioned was in fact a bug which has been fixed 
(HBASE-3550). The Pig bug is still valid though.

> HBaseStorage column filters miss some fields
> 
>
> Key: PIG-2174
> URL: https://issues.apache.org/jira/browse/PIG-2174
> Project: Pig
>  Issue Type: Bug
>Reporter: Bill Graham
>Assignee: Bill Graham
> Attachments: PIG-2174_1.patch
>
>
> When mixing static and dynamic column mappings, {{HBaseStorage}} sometimes 
> doesn't pick up the static column values and nulls are returned. I believe 
> this bug has been masked by HBase being a bit over-eager when it comes to 
> respecting column filters (i.e. HBase is returning more columns than it 
> should).
> For example, this query returns nulls for the {{sc}} column, even when it 
> contains data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
>   org.apache.pig.backend.hadoop.hbase.HBaseStorage
>   ('pig:sc pig:prefixed_col_*','-loadKey') AS
>   (rowKey:chararray, sc:chararray, pig_cf_map:map[]);
> {noformat}
> What is very strange (about HBase), is that the same script will return 
> values just fine if {{sc}} is instead {{col_a}}, assuming of course that both 
> columns contain data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
>   org.apache.pig.backend.hadoop.hbase.HBaseStorage
>   ('pig:col_a pig:prefixed_col_*','-loadKey') AS
>   (rowKey:chararray, col_a:chararray, pig_cf_map:map[]);
> {noformat}
> Potential HBase issues aside, I think there is a bug in the logic on the Pig 
> side. Patch to follow. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2175) Switch Pig wiki to use confluence

2011-07-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2175:


Attachment: PIG-2175-1.patch

> Switch Pig wiki to use confluence
> -
>
> Key: PIG-2175
> URL: https://issues.apache.org/jira/browse/PIG-2175
> Project: Pig
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: PIG-2175-1.patch
>
>
> Confluence gives us more functionality and more permission control features. 
> We plan to migrate our wiki to Confluence. I migrated part of our wiki to 
> https://cwiki.apache.org/confluence/display/PIG. I also put a link to the old 
> wiki on that site. The attached patch changes links on the Pig main site.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (PIG-2175) Switch Pig wiki to use confluence

2011-07-18 Thread Daniel Dai (JIRA)
Switch Pig wiki to use confluence
-

 Key: PIG-2175
 URL: https://issues.apache.org/jira/browse/PIG-2175
 Project: Pig
  Issue Type: Improvement
  Components: documentation
Reporter: Daniel Dai
Assignee: Daniel Dai


Confluence gives us more functionality and more permission control features. We 
plan to migrate our wiki to Confluence. I migrated part of our wiki to 
https://cwiki.apache.org/confluence/display/PIG. I also put a link to the old 
wiki on that site. The attached patch changes links on the Pig main site.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (PIG-2172) Fix test failure for ant 1.8.x

2011-07-18 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067350#comment-13067350
 ] 

Thejas M Nair commented on PIG-2172:


+1

> Fix test failure for ant 1.8.x
> --
>
> Key: PIG-2172
> URL: https://issues.apache.org/jira/browse/PIG-2172
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.10
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.10
>
> Attachments: PIG-2172-1.patch
>
>
> Some tests fail under ant 1.8.x but pass under ant 1.7.x.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2174) HBaseStorage column filters miss some fields

2011-07-18 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-2174:
-

Attachment: PIG-2174_1.patch

Adding patch #1, please review.

> HBaseStorage column filters miss some fields
> 
>
> Key: PIG-2174
> URL: https://issues.apache.org/jira/browse/PIG-2174
> Project: Pig
>  Issue Type: Bug
>Reporter: Bill Graham
>Assignee: Bill Graham
> Attachments: PIG-2174_1.patch
>
>
> When mixing static and dynamic column mappings, {{HBaseStorage}} sometimes 
> doesn't pick up the static column values and nulls are returned. I believe 
> this bug has been masked by HBase being a bit over-eager when it comes to 
> respecting column filters (i.e. HBase is returning more columns than it 
> should).
> For example, this query returns nulls for the {{sc}} column, even when it 
> contains data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
>   org.apache.pig.backend.hadoop.hbase.HBaseStorage
>   ('pig:sc pig:prefixed_col_*','-loadKey') AS
>   (rowKey:chararray, sc:chararray, pig_cf_map:map[]);
> {noformat}
> What is very strange (about HBase), is that the same script will return 
> values just fine if {{sc}} is instead {{col_a}}, assuming of course that both 
> columns contain data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
>   org.apache.pig.backend.hadoop.hbase.HBaseStorage
>   ('pig:col_a pig:prefixed_col_*','-loadKey') AS
>   (rowKey:chararray, col_a:chararray, pig_cf_map:map[]);
> {noformat}
> Potential HBase issues aside, I think there is a bug in the logic on the Pig 
> side. Patch to follow. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2174) HBaseStorage column filters miss some fields

2011-07-18 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated PIG-2174:
-

Release Note: Fix HBaseStorage column filtering bug.
  Status: Patch Available  (was: Open)

> HBaseStorage column filters miss some fields
> 
>
> Key: PIG-2174
> URL: https://issues.apache.org/jira/browse/PIG-2174
> Project: Pig
>  Issue Type: Bug
>Reporter: Bill Graham
>Assignee: Bill Graham
> Attachments: PIG-2174_1.patch
>
>
> When mixing static and dynamic column mappings, {{HBaseStorage}} sometimes 
> doesn't pick up the static column values and nulls are returned. I believe 
> this bug has been masked by HBase being a bit over-eager when it comes to 
> respecting column filters (i.e. HBase is returning more columns than it 
> should).
> For example, this query returns nulls for the {{sc}} column, even when it 
> contains data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
>   org.apache.pig.backend.hadoop.hbase.HBaseStorage
>   ('pig:sc pig:prefixed_col_*','-loadKey') AS
>   (rowKey:chararray, sc:chararray, pig_cf_map:map[]);
> {noformat}
> What is very strange (about HBase), is that the same script will return 
> values just fine if {{sc}} is instead {{col_a}}, assuming of course that both 
> columns contain data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
>   org.apache.pig.backend.hadoop.hbase.HBaseStorage
>   ('pig:col_a pig:prefixed_col_*','-loadKey') AS
>   (rowKey:chararray, col_a:chararray, pig_cf_map:map[]);
> {noformat}
> Potential HBase issues aside, I think there is a bug in the logic on the Pig 
> side. Patch to follow. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (PIG-2174) HBaseStorage column filters miss some fields

2011-07-18 Thread Bill Graham (JIRA)
HBaseStorage column filters miss some fields


 Key: PIG-2174
 URL: https://issues.apache.org/jira/browse/PIG-2174
 Project: Pig
  Issue Type: Bug
Reporter: Bill Graham
Assignee: Bill Graham


When mixing static and dynamic column mappings, {{HBaseStorage}} sometimes 
doesn't pick up the static column values and nulls are returned. I believe this 
bug has been masked by HBase being a bit over-eager when it comes to respecting 
column filters (i.e. HBase is returning more columns than it should).

For example, this query returns nulls for the {{sc}} column, even when it 
contains data:
{noformat}
a = LOAD 'hbase://pigtable_1' USING
  org.apache.pig.backend.hadoop.hbase.HBaseStorage
  ('pig:sc pig:prefixed_col_*','-loadKey') AS
  (rowKey:chararray, sc:chararray, pig_cf_map:map[]);
{noformat}

What is very strange (about HBase), is that the same script will return values 
just fine if {{sc}} is instead {{col_a}}, assuming of course that both columns 
contain data:
{noformat}
a = LOAD 'hbase://pigtable_1' USING
  org.apache.pig.backend.hadoop.hbase.HBaseStorage
  ('pig:col_a pig:prefixed_col_*','-loadKey') AS
  (rowKey:chararray, col_a:chararray, pig_cf_map:map[]);
{noformat}

Potential HBase issues aside, I think there is a bug in the logic on the Pig 
side. Patch to follow. 
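The class of bug being described can be sketched in Python (a hypothetical simplification, not HBaseStorage's actual filter code; the column names are taken from the report): a filter that honors only the prefix pattern silently drops the static column, which is exactly the "nulls for {{sc}}" symptom.

```python
# Hypothetical simplification of mixed static + dynamic column filtering.
# A correct filter accepts a column if it matches EITHER a static column
# name or a prefix pattern; consulting only the prefix patterns loses
# the statically mapped columns.

def make_filter(specs):
    statics = {s for s in specs if not s.endswith("*")}
    prefixes = [s[:-1] for s in specs if s.endswith("*")]
    def accept(col):
        return col in statics or any(col.startswith(p) for p in prefixes)
    return accept

accept = make_filter(["pig:sc", "pig:prefixed_col_*"])
cols = ["pig:sc", "pig:prefixed_col_1", "pig:other"]
print([c for c in cols if accept(c)])  # ['pig:sc', 'pig:prefixed_col_1']
```

The report's observation that {{col_a}} happens to work while {{sc}} does not is consistent with the server over-returning columns in some cases (the HBASE-3550 behavior mentioned in the comments) rather than the Pig-side logic being correct.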

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2143) Make PigStorage optionally store schema; improve docs.

2011-07-18 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-2143:
---

Release Note: 
Documentation has been updated to reflect reality.

An optional second constructor argument allows customization of advanced 
behaviors. The available options are listed below:

-schema Stores the schema of the relation using a hidden JSON file.
-noschema Ignores a stored schema during loading.

Schemas
If -schema is specified, a hidden ".pig_schema" file is created in the output 
directory when storing data. It is used by PigStorage (with or without -schema) 
during loading to determine the field names and types of the data without the 
need for a user to explicitly provide the schema in an as clause, unless 
-noschema is specified. No attempt to merge conflicting schemas is made during 
loading. The first schema encountered during a file system scan is used.
In addition, using -schema drops a ".pig_headers" file in the output directory. 
This file simply lists the delimited aliases, and is intended to make it easier 
to export to tools that read files with header lines (just cat the header onto 
your data).

Note that regardless of whether or not you store the schema, you always need to 
specify the correct delimiter to read your data. If you store using the 
delimiter "#" and then load using the default delimiter, your data will not be 
parsed correctly.
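The store/load round trip the note describes can be sketched in Python, under the assumption that the hidden schema file is plain JSON (the ".pig_schema" and ".pig_headers" file names come from the release note; the JSON layout here is simplified and hypothetical):

```python
# Sketch of the -schema mechanism: store rows plus a hidden ".pig_schema"
# JSON file and a ".pig_headers" header line, then load field names back
# without the caller supplying a schema. The JSON layout is hypothetical.
import json
import os
import tempfile

def store(rows, fields, outdir, delim="\t"):
    with open(os.path.join(outdir, "part-00000"), "w") as f:
        for row in rows:
            f.write(delim.join(map(str, row)) + "\n")
    with open(os.path.join(outdir, ".pig_schema"), "w") as f:
        json.dump({"fields": fields}, f)          # hidden schema file
    with open(os.path.join(outdir, ".pig_headers"), "w") as f:
        f.write(delim.join(fields) + "\n")        # header line for export

def load(outdir, delim="\t"):
    with open(os.path.join(outdir, ".pig_schema")) as f:
        fields = json.load(f)["fields"]
    rows = []
    with open(os.path.join(outdir, "part-00000")) as f:
        for line in f:
            # As the note warns: the load delimiter must match the one
            # used at store time, schema file or not.
            rows.append(dict(zip(fields, line.rstrip("\n").split(delim))))
    return rows

d = tempfile.mkdtemp()
store([(1, "a"), (2, "b")], ["id", "value"], d)
print(load(d))  # [{'id': '1', 'value': 'a'}, {'id': '2', 'value': 'b'}]
```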



> Make PigStorage optionally store schema; improve docs.
> --
>
> Key: PIG-2143
> URL: https://issues.apache.org/jira/browse/PIG-2143
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.10
>
> Attachments: PIG-2143.2.diff, PIG-2143.3.patch, PIG-2143.4.patch, 
> PIG-2143.5.patch, PIG-2143.diff
>
>
> I'd like to propose that we allow for a greater degree of customization in 
> PigStorage.
> An incomplete list of features that we might want to add:
> - flag to tell it to overwrite existing output if it exists
> - flag to tell it to compress output using gzip|bzip|lzo (currently this can 
> be achieved by setting the directory name to end in .gz or .bz2, which is a 
> bit awkward)
> - flag to tell it to store the schema and header (perhaps by merging in 
> PigStorageSchema work?)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (PIG-2173) piggybank datetime conversion javadocs not properly formatted

2011-07-18 Thread Joe Crobak (JIRA)
piggybank datetime conversion javadocs not properly formatted
-

 Key: PIG-2173
 URL: https://issues.apache.org/jira/browse/PIG-2173
 Project: Pig
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Joe Crobak
Priority: Trivial


e.g. 
http://pig.apache.org/docs/r0.8.1/api/org/apache/pig/piggybank/evaluation/datetime/convert/CustomFormatToISO.html

The sample code in the class description should be wrapped in a <pre> tag or 
otherwise formatted correctly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (PIG-2143) Make PigStorage optionally store schema; improve docs.

2011-07-18 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067170#comment-13067170
 ] 

Daniel Dai commented on PIG-2143:
-

Can we put release notes on this? Thanks.

> Make PigStorage optionally store schema; improve docs.
> --
>
> Key: PIG-2143
> URL: https://issues.apache.org/jira/browse/PIG-2143
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.10
>
> Attachments: PIG-2143.2.diff, PIG-2143.3.patch, PIG-2143.4.patch, 
> PIG-2143.5.patch, PIG-2143.diff
>
>
> I'd like to propose that we allow for a greater degree of customization in 
> PigStorage.
> An incomplete list of features that we might want to add:
> - flag to tell it to overwrite existing output if it exists
> - flag to tell it to compress output using gzip|bzip|lzo (currently this can 
> be achieved by setting the directory name to end in .gz or .bz2, which is a 
> bit awkward)
> - flag to tell it to store the schema and header (perhaps by merging in 
> PigStorageSchema work?)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (PIG-2172) Fix test failure for ant 1.8.x

2011-07-18 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-2172:


Attachment: PIG-2172-1.patch

> Fix test failure for ant 1.8.x
> --
>
> Key: PIG-2172
> URL: https://issues.apache.org/jira/browse/PIG-2172
> Project: Pig
>  Issue Type: Bug
>  Components: impl
>Affects Versions: 0.10
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.10
>
> Attachments: PIG-2172-1.patch
>
>
> Some tests fail under ant 1.8.x but pass under ant 1.7.x.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (PIG-2172) Fix test failure for ant 1.8.x

2011-07-18 Thread Daniel Dai (JIRA)
Fix test failure for ant 1.8.x
--

 Key: PIG-2172
 URL: https://issues.apache.org/jira/browse/PIG-2172
 Project: Pig
  Issue Type: Bug
  Components: impl
Affects Versions: 0.10
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.10


Some tests fail under ant 1.8.x but pass under ant 1.7.x.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Pig-trunk-commit #859

2011-07-18 Thread Apache Jenkins Server
See 

Changes:

[dvryaboy] PIG-2143: Make PigStorage optionally store schema; improve docs.

[thejas] PIG-1973: UDFContext.getUDFContext usage of ThreadLocal pattern
 is not typical (woody via thejas)

[thejas] PIG-2053: PigInputFormat uses class.isAssignableFrom() where 
instanceof is more appropriate (woody via thejas)

--
[...truncated 40179 lines...]
[junit] 
[junit] at org.apache.hadoop.ipc.Client.call(Client.java:740)
[junit] at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
[junit] at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
[junit] at $Proxy0.complete(Unknown Source)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3264)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3188)
[junit] at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1043)
[junit] at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:237)
[junit] at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:269)
[junit] at 
org.apache.pig.test.MiniCluster.shutdownMiniDfsAndMrClusters(MiniCluster.java:111)
[junit] at 
org.apache.pig.test.MiniCluster.shutDown(MiniCluster.java:101)
[junit] at 
org.apache.pig.test.TestStore.oneTimeTearDown(TestStore.java:127)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
[junit] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
[junit] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
[junit] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
[junit] at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
[junit] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
[junit] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
[junit] Shutting down the Mini HDFS Cluster
[junit] Shutting down DataNode 3
[junit] 11/07/18 16:29:13 INFO ipc.Server: Stopping server on 42421
[junit] 11/07/18 16:29:13 INFO ipc.Server: IPC Server handler 2 on 42421: 
exiting
[junit] 11/07/18 16:29:13 INFO ipc.Server: IPC Server handler 0 on 42421: 
exiting
[junit] 11/07/18 16:29:13 INFO ipc.Server: Stopping IPC Server listener on 
42421
[junit] 11/07/18 16:29:13 INFO ipc.Server: IPC Server handler 1 on 42421: 
exiting
[junit] 11/07/18 16:29:13 INFO datanode.DataNode: Waiting for threadgroup 
to exit, active threads is 1
[junit] 11/07/18 16:29:13 WARN datanode.DataNode: 
DatanodeRegistration(127.0.0.1:50529, 
storageID=DS-855549840-127.0.1.1-50529-1311006185850, infoPort=56258, 
ipcPort=42421):DataXceiveServer: java.nio.channels.AsynchronousCloseException
[junit] at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
[junit] at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:159)
[junit] at 
sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:130)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] 
[junit] 11/07/18 16:29:13 INFO ipc.Server: Stopping IPC Server Responder
[junit] 11/07/18 16:29:13 INFO datanode.DataBlockScanner: Exiting 
DataBlockScanner thread.
[junit] 11/07/18 16:29:14 INFO datanode.DataNode: Deleting block 
blk_397522998911042587_1123 file 
build/test/data/dfs/data/data2/current/blk_397522998911042587
[junit] 11/07/18 16:29:14 INFO datanode.DataNode: Deleting block 
blk_4827013878883143290_1124 file 
build/test/data/dfs/data

[jira] [Closed] (PIG-2143) Make PigStorage optionally store schema; improve docs.

2011-07-18 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy closed PIG-2143.
--


> Make PigStorage optionally store schema; improve docs.
> --
>
> Key: PIG-2143
> URL: https://issues.apache.org/jira/browse/PIG-2143
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.10
>
> Attachments: PIG-2143.2.diff, PIG-2143.3.patch, PIG-2143.4.patch, 
> PIG-2143.5.patch, PIG-2143.diff
>
>
> I'd like to propose that we allow for a greater degree of customization in 
> PigStorage.
> An incomplete list of features that we might want to add:
> - flag to tell it to overwrite existing output if it exists
> - flag to tell it to compress output using gzip|bzip|lzo (currently this can 
> be achieved by setting the directory name to end in .gz or .bz2, which is a 
> bit awkward)
> - flag to tell it to store the schema and header (perhaps by merging in 
> PigStorageSchema work?)
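To make the compression workaround mentioned above concrete, here is a hedged Pig Latin sketch. The paths are illustrative, and the `'-schema'` option syntax in the second STORE reflects the style this patch proposes; the exact option names may differ from what was committed.

```pig
-- Current workaround: the output directory suffix selects the codec.
STORE data INTO '/output/results.bz2' USING PigStorage();

-- Proposed style: explicit options instead of magic suffixes
-- (hypothetical option string):
STORE data INTO '/output/results' USING PigStorage('\t', '-schema');
```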





[jira] [Updated] (PIG-2143) Make PigStorage optionally store schema; improve docs.

2011-07-18 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-2143:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Thanks Raghu and Thejas for reviews.

> Make PigStorage optionally store schema; improve docs.
> --
>
> Key: PIG-2143
> URL: https://issues.apache.org/jira/browse/PIG-2143
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.10
>
> Attachments: PIG-2143.2.diff, PIG-2143.3.patch, PIG-2143.4.patch, 
> PIG-2143.5.patch, PIG-2143.diff
>
>
> I'd like to propose that we allow for a greater degree of customization in 
> PigStorage.
> An incomplete list of features that we might want to add:
> - flag to tell it to overwrite existing output if it exists
> - flag to tell it to compress output using gzip|bzip|lzo (currently this can 
> be achieved by setting the directory name to end in .gz or .bz2, which is a 
> bit awkward)
> - flag to tell it to store the schema and header (perhaps by merging in 
> PigStorageSchema work?)





[jira] [Updated] (PIG-2143) Make PigStorage optionally store schema; improve docs.

2011-07-18 Thread Dmitriy V. Ryaboy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-2143:
---

Summary: Make PigStorage optionally store schema; improve docs.  (was: 
Improvements for PigStorage)

Changed the title to reflect what we actually did in this iteration.


> Make PigStorage optionally store schema; improve docs.
> --
>
> Key: PIG-2143
> URL: https://issues.apache.org/jira/browse/PIG-2143
> Project: Pig
>  Issue Type: Improvement
>Reporter: Dmitriy V. Ryaboy
>Assignee: Dmitriy V. Ryaboy
> Fix For: 0.10
>
> Attachments: PIG-2143.2.diff, PIG-2143.3.patch, PIG-2143.4.patch, 
> PIG-2143.5.patch, PIG-2143.diff
>
>
> I'd like to propose that we allow for a greater degree of customization in 
> PigStorage.
> An incomplete list of features that we might want to add:
> - flag to tell it to overwrite existing output if it exists
> - flag to tell it to compress output using gzip|bzip|lzo (currently this can 
> be achieved by setting the directory name to end in .gz or .bz2, which is a 
> bit awkward)
> - flag to tell it to store the schema and header (perhaps by merging in 
> PigStorageSchema work?)





[jira] [Updated] (PIG-1973) UDFContext.getUDFContext usage of ThreadLocal pattern is not typical

2011-07-18 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1973:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

+1 
Patch committed to trunk.
Thanks Woody!

> UDFContext.getUDFContext usage of ThreadLocal pattern is not typical
> 
>
> Key: PIG-1973
> URL: https://issues.apache.org/jira/browse/PIG-1973
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Woody Anderson
>Assignee: Woody Anderson
>Priority: Minor
> Attachments: 1973.patch, PIG-1973.1.patch
>
>
> This probably isn't manifesting anywhere, but it's an incorrect use of the 
> ThreadLocal pattern.
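For context, a minimal sketch of the conventional ThreadLocal pattern the issue title refers to: a static final ThreadLocal whose initialValue supplies the per-thread instance. The class and method names here are illustrative, not taken from Pig's actual UDFContext code.

```java
// Sketch of the typical ThreadLocal singleton pattern: each thread lazily
// gets its own instance via initialValue(), with no shared mutable state.
public class UdfContextDemo {
    private static final ThreadLocal<UdfContextDemo> CONTEXT =
        new ThreadLocal<UdfContextDemo>() {
            @Override
            protected UdfContextDemo initialValue() {
                return new UdfContextDemo();
            }
        };

    private UdfContextDemo() {}

    // Returns the calling thread's own instance, creating it on first use.
    public static UdfContextDemo getContext() {
        return CONTEXT.get();
    }

    public static void main(String[] args) {
        // Repeated calls on the same thread return the same instance.
        System.out.println(getContext() == getContext()); // prints "true"
    }
}
```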





[jira] [Updated] (PIG-1973) UDFContext.getUDFContext usage of ThreadLocal pattern is not typical

2011-07-18 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-1973:
---

Summary: UDFContext.getUDFContext usage of ThreadLocal pattern is not 
typical  (was: UDFContext.getUDFContext has a thread race condition around it's 
ThreadLocal)

Updating the summary to indicate that this is a code improvement rather than a bug.


> UDFContext.getUDFContext usage of ThreadLocal pattern is not typical
> 
>
> Key: PIG-1973
> URL: https://issues.apache.org/jira/browse/PIG-1973
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Affects Versions: 0.8.0, 0.9.0
>Reporter: Woody Anderson
>Assignee: Woody Anderson
>Priority: Minor
> Attachments: 1973.patch, PIG-1973.1.patch
>
>
> This probably isn't manifesting anywhere, but it's an incorrect use of the 
> ThreadLocal pattern.





[jira] [Updated] (PIG-2053) PigInputFormat uses class.isAssignableFrom() where instanceof is more appropriate

2011-07-18 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated PIG-2053:
---

Resolution: Fixed
  Assignee: Woody Anderson
Status: Resolved  (was: Patch Available)

> PigInputFormat uses class.isAssignableFrom() where instanceof is more 
> appropriate
> -
>
> Key: PIG-2053
> URL: https://issues.apache.org/jira/browse/PIG-2053
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10
>Reporter: Woody Anderson
>Assignee: Woody Anderson
>Priority: Minor
>  Labels: newbie
> Fix For: 0.10
>
> Attachments: 2053.patch
>
>
> This is a code style/quality improvement.
> isAssignableFrom is appropriate when the class is not known at compile time 
> but assignment still needs to be checked, e.g. 
> foo.getClass().isAssignableFrom(bar.getClass()).
> But if the class of foo is known (e.g. X.class), then instanceof is more 
> appropriate and readable.
> I also used De Morgan's laws to simplify the "is combinable" boolean 
> expression, which is hard to grok as written.
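A small self-contained sketch of the distinction the patch is about, with illustrative class names (not Pig's actual PigInputFormat code): when the target type is known at compile time, instanceof says the same thing as isAssignableFrom more readably.

```java
// Demonstrates the two ways to test whether an object is a LoadFunc.
public class InstanceofDemo {
    interface LoadFunc {}
    static class PigStorageLike implements LoadFunc {}

    // Style the patch moves away from: runtime class comparison,
    // even though the target class is known at compile time.
    static boolean viaIsAssignableFrom(Object o) {
        return LoadFunc.class.isAssignableFrom(o.getClass());
    }

    // Preferred when the class is known at compile time: clearer intent,
    // and also false (rather than an NPE) would require an explicit null check
    // in the isAssignableFrom version.
    static boolean viaInstanceof(Object o) {
        return o instanceof LoadFunc;
    }

    public static void main(String[] args) {
        Object o = new PigStorageLike();
        System.out.println(viaIsAssignableFrom(o)); // prints "true"
        System.out.println(viaInstanceof(o));       // prints "true"
        System.out.println(viaInstanceof("text"));  // prints "false"
    }
}
```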





[jira] [Commented] (PIG-2053) PigInputFormat uses class.isAssignableFrom() where instanceof is more appropriate

2011-07-18 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067076#comment-13067076
 ] 

Thejas M Nair commented on PIG-2053:


+1 
Patch committed to trunk.
Thanks Woody!

> PigInputFormat uses class.isAssignableFrom() where instanceof is more 
> appropriate
> -
>
> Key: PIG-2053
> URL: https://issues.apache.org/jira/browse/PIG-2053
> Project: Pig
>  Issue Type: Improvement
>Affects Versions: 0.10
>Reporter: Woody Anderson
>Priority: Minor
>  Labels: newbie
> Fix For: 0.10
>
> Attachments: 2053.patch
>
>
> This is a code style/quality improvement.
> isAssignableFrom is appropriate when the class is not known at compile time 
> but assignment still needs to be checked, e.g. 
> foo.getClass().isAssignableFrom(bar.getClass()).
> But if the class of foo is known (e.g. X.class), then instanceof is more 
> appropriate and readable.
> I also used De Morgan's laws to simplify the "is combinable" boolean 
> expression, which is hard to grok as written.





[jira] [Updated] (PIG-1904) Default split destination

2011-07-18 Thread Gianmarco De Francisci Morales (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-1904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gianmarco De Francisci Morales updated PIG-1904:


Attachment: PIG-1904.2.patch

Attaching PIG-1904.2.patch:
Added unit tests for Split-Otherwise.
Added a check for nondeterministic UDFs. There was no need to create my own 
visitor; I reused the one available in Utils.
Fixed an issue with Split having only one branch. The solution proposed by Thejas 
does not work directly because the '*' is greedy, but I worked around it.

I think it is ready for review.

> Default split destination
> -
>
> Key: PIG-1904
> URL: https://issues.apache.org/jira/browse/PIG-1904
> Project: Pig
>  Issue Type: New Feature
>Reporter: Daniel Dai
>  Labels: gsoc2011
> Fix For: 0.10
>
> Attachments: PIG-1904.1.patch, PIG-1904.2.patch
>
>
> It would be better if the "split" statement had a default destination, e.g.:
> {code}
> SPLIT A INTO X IF f1<7, Y IF f2==5, Z IF (f3<6 OR f3>6), OTHER otherwise; -- 
> OTHER has all tuples with f1>=7 && f2!=5 && f3==6
> {code}
> This is a candidate project for Google summer of code 2011. More information 
> about the program can be found at http://wiki.apache.org/pig/GSoc2011
