[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallavi Rao updated PIG-4711:
-
Priority: Blocker  (was: Major)

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>Priority: Blocker
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711-v1.patch, PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : Pig-trunk-commit #2253

2015-10-26 Thread Apache Jenkins Server



[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallavi Rao updated PIG-4711:
-
Status: Patch Available  (was: Reopened)

[~xuefuz], the older patch had the dependency misplaced. Here is the correct 
patch. Could you please check this one in? Sorry for the inconvenience.

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711-v1.patch, PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallavi Rao updated PIG-4711:
-
Attachment: PIG-4711-v1.patch

The new patch, which doesn't break the build.

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711-v1.patch, PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Reopened] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pallavi Rao reopened PIG-4711:
--

The patch broke the build. Misplaced line in ivy.xml.
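
For reference, the missing dependency is presumably declared with roughly this shape in ivy.xml, placed inside the correct configuration (the coordinates are inferred from the org.iq80.leveldb package name; the exact rev and conf are assumptions, not taken from the patch):

```xml
<!-- Assumed shape of the missing test-time dependency; rev/conf are guesses. -->
<dependency org="org.iq80.leveldb" name="leveldb" rev="0.7" conf="test->default"/>
```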

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Created] (PIG-4716) Add support for global PIG_OPTS configuration for Pig e2e

2015-10-26 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created PIG-4716:
---

 Summary: Add support for global PIG_OPTS configuration for Pig e2e
 Key: PIG-4716
 URL: https://issues.apache.org/jira/browse/PIG-4716
 Project: Pig
  Issue Type: Improvement
Reporter: Rohini Palaniswamy


 It helps if you want to run the whole e2e suite with different parameters, for 
example: a smaller heap size to run more tests in parallel, or turning on 
non-default settings like pig.exec.mapPartAgg. 
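
As a sketch of what a global setting might look like: PIG_OPTS is the environment variable the pig launcher script passes through to the JVM, and pig.exec.mapPartAgg is a real Pig property; how the e2e harness would pick this up is an assumption.

```shell
# Example only: run the whole e2e suite with a smaller heap and a
# non-default Pig setting. The harness wiring is hypothetical.
export PIG_OPTS="-Xmx512m -Dpig.exec.mapPartAgg=true"
echo "PIG_OPTS=$PIG_OPTS"
```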





[jira] [Updated] (PIG-4715) Pig 0.14 does not work with Tez 0.5.4

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4715:

Attachment: PIG-4715-1.patch

> Pig 0.14 does not work with Tez 0.5.4
> -
>
> Key: PIG-4715
> URL: https://issues.apache.org/jira/browse/PIG-4715
> Project: Pig
>  Issue Type: Bug
>  Components: tez
>Affects Versions: 0.14.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.14.1
>
> Attachments: PIG-4715-1.patch
>
>
> Pig 0.14 with Tez 0.5.4 is broken by TEZ-2305. The output has extra tuple 
> around actual data.
> The issue is fixed in Pig 0.15/trunk as part of PIG-4434.





[jira] [Created] (PIG-4715) Pig 0.14 does not work with Tez 0.5.4

2015-10-26 Thread Daniel Dai (JIRA)
Daniel Dai created PIG-4715:
---

 Summary: Pig 0.14 does not work with Tez 0.5.4
 Key: PIG-4715
 URL: https://issues.apache.org/jira/browse/PIG-4715
 Project: Pig
  Issue Type: Bug
  Components: tez
Affects Versions: 0.14.0
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.14.1


Pig 0.14 with Tez 0.5.4 is broken by TEZ-2305. The output has extra tuple 
around actual data.

The issue is fixed in Pig 0.15/trunk as part of PIG-4434.





[jira] [Updated] (PIG-4714) Improve logging across multiple components with callerId

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4714:

Attachment: PIG-4714-1.patch

> Improve logging across multiple components with callerId
> 
>
> Key: PIG-4714
> URL: https://issues.apache.org/jira/browse/PIG-4714
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.16.0
>
> Attachments: PIG-4714-1.patch
>
>
> The idea is to add a callerId to every component, so we can track the chain of 
> applications that caused the underlying operation. A typical chain is 
> Oozie->Pig->Tez->Hdfs. With proper callerId logging, we can trace an Hdfs 
> operation back to the Oozie workflow that triggered it.
> The protocol we decided on is that every component logs its immediate callerId.
> For Pig, this includes passing the Pig script ID to underlying components as 
> the callerId using component-specific APIs, logging Pig's callerId, and 
> storing it on ATS. More specifically, it includes:
> 1. Generate a CallerId for each Pig script and pass it to 
> Hdfs/Yarn/MapReduce/Tez, which Pig invokes
> 2. The Pig caller passes "pig.log.trace.id" to Pig; Pig will publish it to ATS





[jira] [Updated] (PIG-4714) Improve logging across multiple components with callerId

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4714:

Attachment: (was: PIG-4714-1.patch)

> Improve logging across multiple components with callerId
> 
>
> Key: PIG-4714
> URL: https://issues.apache.org/jira/browse/PIG-4714
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.16.0
>
> Attachments: PIG-4714-1.patch
>
>
> The idea is to add a callerId to every component, so we can track the chain of 
> applications that caused the underlying operation. A typical chain is 
> Oozie->Pig->Tez->Hdfs. With proper callerId logging, we can trace an Hdfs 
> operation back to the Oozie workflow that triggered it.
> The protocol we decided on is that every component logs its immediate callerId.
> For Pig, this includes passing the Pig script ID to underlying components as 
> the callerId using component-specific APIs, logging Pig's callerId, and 
> storing it on ATS. More specifically, it includes:
> 1. Generate a CallerId for each Pig script and pass it to 
> Hdfs/Yarn/MapReduce/Tez, which Pig invokes
> 2. The Pig caller passes "pig.log.trace.id" to Pig; Pig will publish it to ATS





[jira] [Updated] (PIG-4714) Improve logging across multiple components with callerId

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4714:

Attachment: PIG-4714-1.patch

Attaching the initial patch.

> Improve logging across multiple components with callerId
> 
>
> Key: PIG-4714
> URL: https://issues.apache.org/jira/browse/PIG-4714
> Project: Pig
>  Issue Type: Improvement
>  Components: impl
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.16.0
>
> Attachments: PIG-4714-1.patch
>
>
> The idea is to add a callerId to every component, so we can track the chain of 
> applications that caused the underlying operation. A typical chain is 
> Oozie->Pig->Tez->Hdfs. With proper callerId logging, we can trace an Hdfs 
> operation back to the Oozie workflow that triggered it.
> The protocol we decided on is that every component logs its immediate callerId.
> For Pig, this includes passing the Pig script ID to underlying components as 
> the callerId using component-specific APIs, logging Pig's callerId, and 
> storing it on ATS. More specifically, it includes:
> 1. Generate a CallerId for each Pig script and pass it to 
> Hdfs/Yarn/MapReduce/Tez, which Pig invokes
> 2. The Pig caller passes "pig.log.trace.id" to Pig; Pig will publish it to ATS





[jira] [Created] (PIG-4714) Improve logging across multiple components with callerId

2015-10-26 Thread Daniel Dai (JIRA)
Daniel Dai created PIG-4714:
---

 Summary: Improve logging across multiple components with callerId
 Key: PIG-4714
 URL: https://issues.apache.org/jira/browse/PIG-4714
 Project: Pig
  Issue Type: Improvement
  Components: impl
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 0.16.0


The idea is to add a callerId to every component, so we can track the chain of 
applications that caused the underlying operation. A typical chain is 
Oozie->Pig->Tez->Hdfs. With proper callerId logging, we can trace an Hdfs 
operation back to the Oozie workflow that triggered it.

The protocol we decided on is that every component logs its immediate callerId.

For Pig, this includes passing the Pig script ID to underlying components as the 
callerId using component-specific APIs, logging Pig's callerId, and storing it 
on ATS. More specifically, it includes:
1. Generate a CallerId for each Pig script and pass it to Hdfs/Yarn/MapReduce/Tez, 
which Pig invokes
2. The Pig caller passes "pig.log.trace.id" to Pig; Pig will publish it to ATS
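
The "each component logs only its immediate caller" protocol can be sketched generically; this is a hypothetical illustration, not Pig's or Hadoop's actual callerId API, and all class and method names below are invented for the example:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the callerId protocol: each component records only
// its immediate caller, and the full chain (e.g. Oozie -> Pig -> Tez -> Hdfs)
// is reconstructed afterwards by joining the per-component logs.
public class CallerIdChain {
    // component -> the callerId it was handed (its immediate caller).
    static final Map<String, String> log = new LinkedHashMap<>();

    // A component logs its immediate callerId on invocation.
    static void invoke(String component, String callerId) {
        log.put(component, callerId);
    }

    // Walk the immediate-caller links back to the top-level caller.
    static List<String> chainFor(String component) {
        List<String> chain = new ArrayList<>();
        String current = component;
        while (current != null) {
            chain.add(0, current);
            current = log.get(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        invoke("Oozie", null);    // top-level caller, no callerId of its own
        invoke("Pig", "Oozie");   // e.g. "pig.log.trace.id" handed to Pig
        invoke("Tez", "Pig");     // Pig passes its script ID downstream
        invoke("Hdfs", "Tez");
        System.out.println(String.join(" -> ", chainFor("Hdfs")));
    }
}
```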





[jira] [Commented] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975691#comment-14975691
 ] 

Pallavi Rao commented on PIG-4711:
--

Thanks [~xuefuz] for the quick review and commit.

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Updated] (PIG-4698) Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-4698:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Srikanth!

> Enable dynamic resource allocation/de-allocation on Yarn backends
> -
>
> Key: PIG-4698
> URL: https://issues.apache.org/jira/browse/PIG-4698
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Affects Versions: spark-branch
>Reporter: Srikanth Sundarrajan
>Assignee: Srikanth Sundarrajan
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4698.patch
>
>
> Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
> scale out better and provide better wall-clock execution times, while unused 
> resources should be released back to the RM.
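
The dynamic allocation described above maps onto standard Spark configuration properties (the property names below are Spark's, and are real; whether the patch sets exactly these values is an assumption):

```properties
# Standard Spark dynamic-allocation settings; requires the external
# shuffle service on each NodeManager.
spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.minExecutors=1
```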





[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-4711:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Pallavi!

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Commented] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975628#comment-14975628
 ] 

Xuefu Zhang commented on PIG-4711:
--

+1

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Commented] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975609#comment-14975609
 ] 

Pallavi Rao commented on PIG-4711:
--

[~xuefuz], [~mohitsabharwal], please review the patch.

> Tests in TestCombiner fail due to missing leveldb dependency
> 
>
> Key: PIG-4711
> URL: https://issues.apache.org/jira/browse/PIG-4711
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Pallavi Rao
>Assignee: Pallavi Rao
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4711.patch
>
>
> Tests in TestCombiner use MiniYARNCluster which in turn has leveldb 
> dependencies.
> Currently, tests fail with Caused by: java.lang.ClassNotFoundException: 
> org.iq80.leveldb.DBException
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 43 more
> The leveldb dependency is included in trunk but is missing in this branch.





[jira] [Commented] (PIG-4634) Fix records count issues in output statistics

2015-10-26 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975606#comment-14975606
 ] 

Mohit Sabharwal commented on PIG-4634:
--

Thanks, [~xianda]. I had a couple of code readability nits on RB. Otherwise LGTM.

> Fix records count issues in output statistics
> -
>
> Key: PIG-4634
> URL: https://issues.apache.org/jira/browse/PIG-4634
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Reporter: Xianda Ke
>Assignee: Xianda Ke
> Fix For: spark-branch
>
> Attachments: PIG-4634-3.patch, PIG-4634-4.patch, PIG-4634-5.patch, 
> PIG-4634.patch, PIG-4634_2.patch
>
>
> Test cases simpleTest() and simpleTest2() in TestPigRunner failed, caused by 
> the following issues:
> 1. The pig context in SparkPigStats isn't initialized.
> 2. The records count logic hasn't been implemented.
> 3. getOutputAlias(), getPigProperties(), getBytesWritten() and 
> getRecordWritten() have not been implemented.





[jira] [Commented] (PIG-4698) Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975596#comment-14975596
 ] 

Xuefu Zhang commented on PIG-4698:
--

+1

> Enable dynamic resource allocation/de-allocation on Yarn backends
> -
>
> Key: PIG-4698
> URL: https://issues.apache.org/jira/browse/PIG-4698
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Affects Versions: spark-branch
>Reporter: Srikanth Sundarrajan
>Assignee: Srikanth Sundarrajan
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4698.patch
>
>
> Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
> scale out better and provide better wall-clock execution times, while unused 
> resources should be released back to the RM.





Re: Review Request 39641: PIG-4698 Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39641/#review104134
---

Ship it!


Ship It!

- Xuefu Zhang


On Oct. 26, 2015, 7:28 a.m., Srikanth Sundarrajan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39641/
> ---
> 
> (Updated Oct. 26, 2015, 7:28 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-4698
> https://issues.apache.org/jira/browse/PIG-4698
> 
> 
> Repository: pig-git
> 
> 
> Description
> ---
> 
> Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
> scale out better and provide better wall-clock execution times, while unused 
> resources should be released back to the RM.
> 
> 
> Diffs
> -
> 
>   src/docs/src/documentation/content/xdocs/start.xml eedd5b7 
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java 
> b542013 
> 
> Diff: https://reviews.apache.org/r/39641/diff/
> 
> 
> Testing
> ---
> 
> Verified that the dynamic configuration is honoured by the yarn system. 
> Requires the auxiliary shuffle service to be enabled at the node manager 
> and application level for this to work correctly.
> 
> 
> Thanks,
> 
> Srikanth Sundarrajan
> 
>



Jenkins build is back to normal : Pig-trunk #1842

2015-10-26 Thread Apache Jenkins Server



[jira] [Updated] (PIG-4468) Pig's jackson version conflicts with that of hadoop 2.6.0 or newer

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4468:

   Resolution: Fixed
 Assignee: Jeff Zhang
 Hadoop Flags: Reviewed
Fix Version/s: 0.16.0
   Status: Resolved  (was: Patch Available)

Patch committed to trunk. Thanks Jeff, Rohini!

> Pig's jackson version conflicts with that of hadoop 2.6.0 or newer
> --
>
> Key: PIG-4468
> URL: https://issues.apache.org/jira/browse/PIG-4468
> Project: Pig
>  Issue Type: Bug
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Fix For: 0.16.0
>
> Attachments: PIG-4468-2.patch, PIG_4468_1.patch
>
>
> Pig uses jackson 1.8.8 while hadoop 2.6.0 uses 1.9.13. Hadoop 2.6.0 also uses 
> one of ObjectMapper's new methods, setSerializationInclusion, which does not 
> exist in jackson 1.8.8. This causes the following issue:
> {code}
> Caused by: java.lang.NoSuchMethodError: 
> org.codehaus.jackson.map.ObjectMapper.setSerializationInclusion(Lorg/codehaus/jackson/map/annotate/JsonSerialize$Inclusion;)Lorg/codehaus/jackson/map/ObjectMapper;
> at 
> org.apache.hadoop.yarn.webapp.YarnJacksonJaxbJsonProvider.configObjectMapper(YarnJacksonJaxbJsonProvider.java:59)
> at 
> org.apache.hadoop.yarn.util.timeline.TimelineUtils.(TimelineUtils.java:47)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.tez.client.TezYarnClient.init(TezYarnClient.java:45)
> at org.apache.tez.client.TezClient.start(TezClient.java:299)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.createSession(TezSessionManager.java:95)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.getClient(TezSessionManager.java:195)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:158)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:174)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
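
Resolving this kind of conflict amounts to bumping the bundled jackson to the version hadoop links against. A hedged sketch of the change (the file name and property key follow Pig's ivy conventions, but the exact location in the patch is an assumption):

```properties
# ivy/libraries.properties (assumed): align jackson with hadoop 2.6.0.
jackson.version=1.9.13
```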





[jira] [Commented] (PIG-4689) CSV Writes incorrect header if two CSV files are created in one script

2015-10-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975131#comment-14975131
 ] 

Daniel Dai commented on PIG-4689:
-

Can you add your test case to TestCSVExcelStorage?

> CSV Writes incorrect header if two CSV files are created in one script
> --
>
> Key: PIG-4689
> URL: https://issues.apache.org/jira/browse/PIG-4689
> Project: Pig
>  Issue Type: Bug
>Affects Versions: 0.14.0, 0.15.0
>Reporter: Niels Basjes
>Assignee: Niels Basjes
> Attachments: PIG-4689-2015-10-06.patch, PIG-4689-20151016.patch
>
>
> From a single Pig script I write two completely different and unrelated CSV 
> files; both with the flag 'WRITE_OUTPUT_HEADER'.
> The bug is that both files get the SAME header at the top of the output file 
> even though the data is different.
> *Reproduction:*
> {code:title=foo.txt}
> 1
> {code}
> {code:title=bar.txt (Tab separated)}
> 1 a
> {code}
> {code:title=WriteTwoCSV.pig}
> FOO =
> LOAD 'foo.txt'
> USING PigStorage('\t')
> AS (a:chararray);
> BAR =
> LOAD 'bar.txt'
> USING PigStorage('\t')
> AS (b:chararray, c:chararray);
> STORE FOO into 'Foo'
> USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t','NO_MULTILINE', 
> 'UNIX', 'WRITE_OUTPUT_HEADER');
> STORE BAR into 'Bar'
> USING org.apache.pig.piggybank.storage.CSVExcelStorage('\t','NO_MULTILINE', 
> 'UNIX', 'WRITE_OUTPUT_HEADER');
> {code}
> *Command:*
> {quote}pig -x local WriteTwoCSV.pig{quote}
> *Result:*
> {quote}cat Bar/part-*{quote}
> {code}
> b c
> 1 a
> {code}
> {quote}cat Foo/part-*{quote}
> {code}
> b c
> 1
> {code}
> *The error is that the {{Foo}} output has the two-column header from the 
> {{Bar}} output.*
> *One of the effects is that parsing the {{Foo}} data will probably fail due 
> to the varying number of columns.*





[jira] [Resolved] (PIG-4708) Upgrade joda-time to 2.8

2015-10-26 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy resolved PIG-4708.
-
  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to trunk. Thanks Daniel for the review.

> Upgrade joda-time to 2.8
> 
>
> Key: PIG-4708
> URL: https://issues.apache.org/jira/browse/PIG-4708
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4708-1.patch
>
>
> Folks writing UDFs are using more recent versions of joda-time and that 
> conflicts with the one bundled with Pig. It would be good to upgrade to 2.8.2, 
> which is the latest. I also see some minor bug fixes for JDK 8 
> (http://www.joda.org/joda-time/upgradeto281.html).





[jira] [Commented] (PIG-4468) Pig's jackson version conflicts with that of hadoop 2.6.0 or newer

2015-10-26 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975113#comment-14975113
 ] 

Rohini Palaniswamy commented on PIG-4468:
-

+1

> Pig's jackson version conflicts with that of hadoop 2.6.0 or newer
> --
>
> Key: PIG-4468
> URL: https://issues.apache.org/jira/browse/PIG-4468
> Project: Pig
>  Issue Type: Bug
>Reporter: Jeff Zhang
> Attachments: PIG-4468-2.patch, PIG_4468_1.patch
>
>
> Pig uses jackson 1.8.8 while hadoop 2.6.0 uses 1.9.13. Hadoop 2.6.0 also uses 
> one of ObjectMapper's new methods, setSerializationInclusion, which does not 
> exist in jackson 1.8.8. This causes the following issue:
> {code}
> Caused by: java.lang.NoSuchMethodError: 
> org.codehaus.jackson.map.ObjectMapper.setSerializationInclusion(Lorg/codehaus/jackson/map/annotate/JsonSerialize$Inclusion;)Lorg/codehaus/jackson/map/ObjectMapper;
> at 
> org.apache.hadoop.yarn.webapp.YarnJacksonJaxbJsonProvider.configObjectMapper(YarnJacksonJaxbJsonProvider.java:59)
> at 
> org.apache.hadoop.yarn.util.timeline.TimelineUtils.(TimelineUtils.java:47)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.tez.client.TezYarnClient.init(TezYarnClient.java:45)
> at org.apache.tez.client.TezClient.start(TezClient.java:299)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.createSession(TezSessionManager.java:95)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.getClient(TezSessionManager.java:195)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:158)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:174)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
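The stack trace above is a binary-compatibility failure: ObjectMapper loads, but the 1.9-only setSerializationInclusion method is absent when jackson 1.8.8 wins on the classpath. Below is a minimal, hypothetical diagnostic sketch for this kind of mismatch; it uses only reflection so it runs without any Jackson jars, and the probe of java.lang.String is just a stand-in to show the positive case:

```java
import java.lang.reflect.Method;

// Hypothetical sketch: check at runtime whether a class exposes a given
// method -- the condition that fails here when jackson 1.8.8 shadows 1.9.13.
public class MethodProbe {
    static boolean hasMethod(String className, String methodName) {
        try {
            for (Method m : Class.forName(className).getMethods()) {
                if (m.getName().equals(methodName)) {
                    return true;
                }
            }
        } catch (ClassNotFoundException e) {
            // The class itself is missing from the classpath.
        }
        return false;
    }

    public static void main(String[] args) {
        // A JDK class as a stand-in so the sketch is self-contained: prints true.
        System.out.println(hasMethod("java.lang.String", "isEmpty"));
        // Without any Jackson jar on the classpath this prints false; with
        // jackson 1.9.x it would print true, with 1.8.8 false.
        System.out.println(hasMethod("org.codehaus.jackson.map.ObjectMapper",
                                     "setSerializationInclusion"));
    }
}
```

Running the probe before submitting a job is one cheap way to tell which jackson version actually won on a mixed classpath.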



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4468) Pig's jackson version conflicts with that of hadoop 2.6.0 or newer

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4468:

Attachment: PIG-4468-2.patch

Fixes the TestRegisteredJarVisibility test failure. Also tested on Hadoop 1; it works.

> Pig's jackson version conflicts with that of hadoop 2.6.0 or newer
> --
>
> Key: PIG-4468
> URL: https://issues.apache.org/jira/browse/PIG-4468
> Project: Pig
>  Issue Type: Bug
>Reporter: Jeff Zhang
> Attachments: PIG-4468-2.patch, PIG_4468_1.patch
>
>
> Pig uses jackson 1.8.8 while hadoop 2.6.0 uses 1.9.13. Hadoop 2.6.0 also uses 
> one of ObjectMapper's new methods, setSerializationInclusion, which does not 
> exist in jackson 1.8.8. It causes the following issue:
> {code}
> Caused by: java.lang.NoSuchMethodError: 
> org.codehaus.jackson.map.ObjectMapper.setSerializationInclusion(Lorg/codehaus/jackson/map/annotate/JsonSerialize$Inclusion;)Lorg/codehaus/jackson/map/ObjectMapper;
> at 
> org.apache.hadoop.yarn.webapp.YarnJacksonJaxbJsonProvider.configObjectMapper(YarnJacksonJaxbJsonProvider.java:59)
> at 
> org.apache.hadoop.yarn.util.timeline.TimelineUtils.(TimelineUtils.java:47)
> at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:166)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> at org.apache.tez.client.TezYarnClient.init(TezYarnClient.java:45)
> at org.apache.tez.client.TezClient.start(TezClient.java:299)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.createSession(TezSessionManager.java:95)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager.getClient(TezSessionManager.java:195)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezJob.run(TezJob.java:158)
> at 
> org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher$1.run(TezLauncher.java:174)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4708) Upgrade joda-time to 2.8

2015-10-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974998#comment-14974998
 ] 

Daniel Dai commented on PIG-4708:
-

Upgrading to joda-time 2.8 should be totally doable. +1 on PIG-4708-1.patch.

> Upgrade joda-time to 2.8
> 
>
> Key: PIG-4708
> URL: https://issues.apache.org/jira/browse/PIG-4708
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4708-1.patch
>
>
>Folks writing UDFs are using more recent versions of joda-time, and that 
> conflicts with the one bundled with Pig. It would be good to upgrade to 2.8.2, 
> which is the latest one. I also see some minor bug fixes for JDK 8 
> (http://www.joda.org/joda-time/upgradeto281.html)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4712:

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks for the review Daniel.

> [Pig on Tez] NPE in Bloom UDF after Union 
> --
>
> Key: PIG-4712
> URL: https://issues.apache.org/jira/browse/PIG-4712
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4712-1.patch, PIG-4712-2.patch
>
>
>POUserFunc clone does not take care of cloning shipFiles and cacheFiles, 
> so the Bloom UDF hits an NPE because the bloom file (cacheFile) is not shipped 
> to the task and thus not available to the UDF. The issue will happen with any 
> UDF that has shipFiles or cacheFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4697) Serialize relevant part of the udfcontext per vertex to reduce payload size

2015-10-26 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974786#comment-14974786
 ] 

Rohini Palaniswamy commented on PIG-4697:
-

Committed PIG-4697-fixunittests.patch into trunk. Thanks Daniel for the review.

> Serialize relevant part of the udfcontext per vertex to reduce payload size
> ---
>
> Key: PIG-4697
> URL: https://issues.apache.org/jira/browse/PIG-4697
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4697-1.patch, PIG-4697-2.patch, 
> PIG-4697-fixunittests.patch
>
>
>   What HCatLoader/HCatStorer puts in the UDFContext is huge, and if there are 
> multiple of them in the pig script, the size of data sent to the Tez AM is huge 
> and the size of data that the Tez AM sends to tasks is huge, causing RPC 
> limit exceeded and OOM issues respectively.  If Pig serializes only the part of 
> the udfcontext that is required for each vertex, it will save a lot.  HCat 
> folks are also looking at cleaning up what goes into the conf (it ends up 
> serializing the whole job conf, not just hive-site.xml) and moving out the 
> common part to be shared by all hcat loaders and storers. 
> Also looking at other options for faster and compact serialization. Will 
> create separate jiras for that. Will use PIG-4653 to clean up all other pig 
> config other than udfcontext.
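The idea in the description above can be sketched as filtering a serialized properties map down to the keys one vertex actually needs before shipping it; the class and key names below are illustrative only, not Pig's actual UDFContext API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of per-vertex UDFContext slicing: instead of sending
// the whole context to every vertex, keep only the entries that vertex's
// UDFs registered. Names are made up for illustration.
public class UdfContextSlicer {
    // Keep only the entries whose keys belong to this vertex's UDFs.
    public static Map<String, String> sliceForVertex(Map<String, String> fullContext,
                                                     Set<String> vertexUdfKeys) {
        Map<String, String> slice = new HashMap<>();
        for (Map.Entry<String, String> e : fullContext.entrySet()) {
            if (vertexUdfKeys.contains(e.getKey())) {
                slice.put(e.getKey(), e.getValue());
            }
        }
        return slice;
    }

    public static void main(String[] args) {
        Map<String, String> full = new HashMap<>();
        full.put("HCatLoader#vertex1", "large-serialized-conf-1");
        full.put("HCatStorer#vertex2", "large-serialized-conf-2");
        // Vertex 1 only needs its own loader entry, so the payload shrinks.
        Map<String, String> v1 = sliceForVertex(full, Set.of("HCatLoader#vertex1"));
        System.out.println(v1.size());
    }
}
```

The payload each vertex receives then scales with that vertex's own UDFs rather than with every loader and storer in the script.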



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974784#comment-14974784
 ] 

Daniel Dai commented on PIG-4712:
-

+1

> [Pig on Tez] NPE in Bloom UDF after Union 
> --
>
> Key: PIG-4712
> URL: https://issues.apache.org/jira/browse/PIG-4712
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4712-1.patch, PIG-4712-2.patch
>
>
>POUserFunc clone does not take care of cloning shipFiles and cacheFiles, 
> so the Bloom UDF hits an NPE because the bloom file (cacheFile) is not shipped 
> to the task and thus not available to the UDF. The issue will happen with any 
> UDF that has shipFiles or cacheFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4697) Serialize relevant part of the udfcontext per vertex to reduce payload size

2015-10-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974783#comment-14974783
 ] 

Daniel Dai commented on PIG-4697:
-

Makes sense. +1.

> Serialize relevant part of the udfcontext per vertex to reduce payload size
> ---
>
> Key: PIG-4697
> URL: https://issues.apache.org/jira/browse/PIG-4697
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4697-1.patch, PIG-4697-2.patch, 
> PIG-4697-fixunittests.patch
>
>
>   What HCatLoader/HCatStorer puts in the UDFContext is huge, and if there are 
> multiple of them in the pig script, the size of data sent to the Tez AM is huge 
> and the size of data that the Tez AM sends to tasks is huge, causing RPC 
> limit exceeded and OOM issues respectively.  If Pig serializes only the part of 
> the udfcontext that is required for each vertex, it will save a lot.  HCat 
> folks are also looking at cleaning up what goes into the conf (it ends up 
> serializing the whole job conf, not just hive-site.xml) and moving out the 
> common part to be shared by all hcat loaders and storers. 
> Also looking at other options for faster and compact serialization. Will 
> create separate jiras for that. Will use PIG-4653 to clean up all other pig 
> config other than udfcontext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4712:

Attachment: PIG-4712-2.patch

Uploaded a new patch which adds a new test instead of updating the old one.

> [Pig on Tez] NPE in Bloom UDF after Union 
> --
>
> Key: PIG-4712
> URL: https://issues.apache.org/jira/browse/PIG-4712
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4712-1.patch, PIG-4712-2.patch
>
>
>POUserFunc clone does not take care of cloning shipFiles and cacheFiles, 
> so the Bloom UDF hits an NPE because the bloom file (cacheFile) is not shipped 
> to the task and thus not available to the UDF. The issue will happen with any 
> UDF that has shipFiles or cacheFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4697) Serialize relevant part of the udfcontext per vertex to reduce payload size

2015-10-26 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974769#comment-14974769
 ] 

Rohini Palaniswamy commented on PIG-4697:
-

bq. But how about other TezInput? Will the Loader instantiation increase?
  POSimpleTezLoad is the only one which replaces POLoad. None of the other 
TezInput deal with MRInput.

> Serialize relevant part of the udfcontext per vertex to reduce payload size
> ---
>
> Key: PIG-4697
> URL: https://issues.apache.org/jira/browse/PIG-4697
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4697-1.patch, PIG-4697-2.patch, 
> PIG-4697-fixunittests.patch
>
>
>   What HCatLoader/HCatStorer puts in the UDFContext is huge, and if there are 
> multiple of them in the pig script, the size of data sent to the Tez AM is huge 
> and the size of data that the Tez AM sends to tasks is huge, causing RPC 
> limit exceeded and OOM issues respectively.  If Pig serializes only the part of 
> the udfcontext that is required for each vertex, it will save a lot.  HCat 
> folks are also looking at cleaning up what goes into the conf (it ends up 
> serializing the whole job conf, not just hive-site.xml) and moving out the 
> common part to be shared by all hcat loaders and storers. 
> Also looking at other options for faster and compact serialization. Will 
> create separate jiras for that. Will use PIG-4653 to clean up all other pig 
> config other than udfcontext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (PIG-4697) Serialize relevant part of the udfcontext per vertex to reduce payload size

2015-10-26 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974769#comment-14974769
 ] 

Rohini Palaniswamy edited comment on PIG-4697 at 10/26/15 6:28 PM:
---

bq. But how about other TezInput? Will the Loader instantiation increase?
  POSimpleTezLoad is the only one which replaces POLoad. None of the other 
TezInput extend POLoad or deal with MRInput.


was (Author: rohini):
bq. But how about other TezInput? Will the Loader instantiation increase?
  POSimpleTezLoad is the only one which replaces POLoad. None of the other 
TezInput deal with MRInput.

> Serialize relevant part of the udfcontext per vertex to reduce payload size
> ---
>
> Key: PIG-4697
> URL: https://issues.apache.org/jira/browse/PIG-4697
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4697-1.patch, PIG-4697-2.patch, 
> PIG-4697-fixunittests.patch
>
>
>   What HCatLoader/HCatStorer puts in the UDFContext is huge, and if there are 
> multiple of them in the pig script, the size of data sent to the Tez AM is huge 
> and the size of data that the Tez AM sends to tasks is huge, causing RPC 
> limit exceeded and OOM issues respectively.  If Pig serializes only the part of 
> the udfcontext that is required for each vertex, it will save a lot.  HCat 
> folks are also looking at cleaning up what goes into the conf (it ends up 
> serializing the whole job conf, not just hive-site.xml) and moving out the 
> common part to be shared by all hcat loaders and storers. 
> Also looking at other options for faster and compact serialization. Will 
> create separate jiras for that. Will use PIG-4653 to clean up all other pig 
> config other than udfcontext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4697) Serialize relevant part of the udfcontext per vertex to reduce payload size

2015-10-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974709#comment-14974709
 ] 

Daniel Dai commented on PIG-4697:
-

Seems you fix it by decreasing the Loader instantiation in 
POSimpleTezLoad.getLoadFunc. But how about other TezInput? Will the Loader 
instantiation increase?

> Serialize relevant part of the udfcontext per vertex to reduce payload size
> ---
>
> Key: PIG-4697
> URL: https://issues.apache.org/jira/browse/PIG-4697
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4697-1.patch, PIG-4697-2.patch, 
> PIG-4697-fixunittests.patch
>
>
>   What HCatLoader/HCatStorer puts in the UDFContext is huge, and if there are 
> multiple of them in the pig script, the size of data sent to the Tez AM is huge 
> and the size of data that the Tez AM sends to tasks is huge, causing RPC 
> limit exceeded and OOM issues respectively.  If Pig serializes only the part of 
> the udfcontext that is required for each vertex, it will save a lot.  HCat 
> folks are also looking at cleaning up what goes into the conf (it ends up 
> serializing the whole job conf, not just hive-site.xml) and moving out the 
> common part to be shared by all hcat loaders and storers. 
> Also looking at other options for faster and compact serialization. Will 
> create separate jiras for that. Will use PIG-4653 to clean up all other pig 
> config other than udfcontext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974703#comment-14974703
 ] 

Daniel Dai commented on PIG-4712:
-

I am not sure if changing Bloom_1 is better than just adding a new test; the 
original test seems to cover the prevailing use case and should still be tested.

> [Pig on Tez] NPE in Bloom UDF after Union 
> --
>
> Key: PIG-4712
> URL: https://issues.apache.org/jira/browse/PIG-4712
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4712-1.patch
>
>
>POUserFunc clone does not take care of cloning shipFiles and cacheFiles, 
> so the Bloom UDF hits an NPE because the bloom file (cacheFile) is not shipped 
> to the task and thus not available to the UDF. The issue will happen with any 
> UDF that has shipFiles or cacheFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PIG-4698) Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Srikanth Sundarrajan (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974015#comment-14974015
 ] 

Srikanth Sundarrajan commented on PIG-4698:
---

FYKA [~xuefuz] / [~mohitsabharwal]

> Enable dynamic resource allocation/de-allocation on Yarn backends
> -
>
> Key: PIG-4698
> URL: https://issues.apache.org/jira/browse/PIG-4698
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Affects Versions: spark-branch
>Reporter: Srikanth Sundarrajan
>Assignee: Srikanth Sundarrajan
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4698.patch
>
>
> Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
> scale out better and provide better wall-clock execution times, while unused 
> resources should be released back to the RM for reuse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PIG-4713) Document Bloom UDF

2015-10-26 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created PIG-4713:
---

 Summary: Document Bloom UDF
 Key: PIG-4713
 URL: https://issues.apache.org/jira/browse/PIG-4713
 Project: Pig
  Issue Type: Task
Reporter: Rohini Palaniswamy


Release notes of https://issues.apache.org/jira/browse/PIG-2328 should go into 
the Builtin Functions section (https://pig.apache.org/docs/r0.15.0/func.html) of 
the Apache Pig documentation.  

Saw one user trying to use the Bloom filter to filter data on a column other than 
the join column, which should not be done: Bloom filters give false positives and 
can include records that actually don't match the filter criteria. That should be 
documented and highlighted as well, to discourage users from using Bloom filters 
for regular filtering. 
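As a minimal illustration of why Bloom filters must not be used as exact filters, here is a hypothetical toy Bloom filter (deliberately tiny, 16 bits, so collisions are easy to see) that reports membership for keys that were never added. This is a teaching sketch, not Pig's builtin Bloom UDF:

```java
import java.util.BitSet;

// Toy Bloom filter: add() sets k bit positions per key; mightContain() can
// return true for absent keys whenever their positions collide with set bits.
public class TinyBloom {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public TinyBloom(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive the i-th bit position from the key's hashCode; crude but enough
    // for a demo of the false-positive behavior.
    private int bit(String key, int i) {
        int h = key.hashCode() * 31 + i * 0x9E3779B9;
        return Math.floorMod(h, size);
    }

    public void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(bit(key, i));
    }

    public boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(bit(key, i))) return false; // definitely absent
        }
        return true; // "maybe present" -- this is where false positives live
    }

    public static void main(String[] args) {
        TinyBloom bloom = new TinyBloom(16, 2); // tiny on purpose
        for (String k : new String[]{"a", "b", "c", "d"}) bloom.add(k);
        int falsePositives = 0;
        for (int i = 0; i < 1000; i++) {
            if (bloom.mightContain("x" + i)) falsePositives++;
        }
        // None of the "x…" keys were added, yet some still test positive.
        System.out.println(falsePositives > 0);
    }
}
```

This is exactly why a Bloom filter is safe only as a pre-filter before a join on the same column: the join itself discards the false positives, whereas a standalone filter would keep them.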



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4697) Serialize relevant part of the udfcontext per vertex to reduce payload size

2015-10-26 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4697:

Attachment: PIG-4697-fixunittests.patch

PIG-4697-fixunittests.patch fixes
   - TestLoadStoreFuncLifeCycle.testLoadStoreFunc() - The Loader instantiation 
count had increased because ld.getLoadFunc() was called in the 
UDFContextSeparator on POSimpleTezLoad, and it was instantiated again as it was 
empty. So copied it over from POLoad.  
  - A couple of tests were failing with UnsupportedOperationException because 
adding to algebraicUDFKeys did not work. It should have been a new HashSet. I did 
test specifically for that case, running some testcases that did SUM/COUNT. 
Might have messed up in copy-paste before creating the final patch. 



> Serialize relevant part of the udfcontext per vertex to reduce payload size
> ---
>
> Key: PIG-4697
> URL: https://issues.apache.org/jira/browse/PIG-4697
> Project: Pig
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4697-1.patch, PIG-4697-2.patch, 
> PIG-4697-fixunittests.patch
>
>
>   What HCatLoader/HCatStorer puts in the UDFContext is huge, and if there are 
> multiple of them in the pig script, the size of data sent to the Tez AM is huge 
> and the size of data that the Tez AM sends to tasks is huge, causing RPC 
> limit exceeded and OOM issues respectively.  If Pig serializes only the part of 
> the udfcontext that is required for each vertex, it will save a lot.  HCat 
> folks are also looking at cleaning up what goes into the conf (it ends up 
> serializing the whole job conf, not just hive-site.xml) and moving out the 
> common part to be shared by all hcat loaders and storers. 
> Also looking at other options for faster and compact serialization. Will 
> create separate jiras for that. Will use PIG-4653 to clean up all other pig 
> config other than udfcontext.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4712:

Status: Patch Available  (was: Open)

> [Pig on Tez] NPE in Bloom UDF after Union 
> --
>
> Key: PIG-4712
> URL: https://issues.apache.org/jira/browse/PIG-4712
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4712-1.patch
>
>
>POUserFunc clone does not take care of cloning shipFiles and cacheFiles, 
> so the Bloom UDF hits an NPE because the bloom file (cacheFile) is not shipped 
> to the task and thus not available to the UDF. The issue will happen with any 
> UDF that has shipFiles or cacheFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Rohini Palaniswamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4712:

Attachment: PIG-4712-1.patch

> [Pig on Tez] NPE in Bloom UDF after Union 
> --
>
> Key: PIG-4712
> URL: https://issues.apache.org/jira/browse/PIG-4712
> Project: Pig
>  Issue Type: Bug
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4712-1.patch
>
>
>POUserFunc clone does not take care of cloning shipFiles and cacheFiles. 
> So Bloom UDF hits a NPE as the bloom file (cacheFile) is not shipped to the 
> task and available to the UDF.  Issue will happen with any udf that has 
> shipFiles or cacheFiles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (PIG-4712) [Pig on Tez] NPE in Bloom UDF after Union

2015-10-26 Thread Rohini Palaniswamy (JIRA)
Rohini Palaniswamy created PIG-4712:
---

 Summary: [Pig on Tez] NPE in Bloom UDF after Union 
 Key: PIG-4712
 URL: https://issues.apache.org/jira/browse/PIG-4712
 Project: Pig
  Issue Type: Bug
Reporter: Rohini Palaniswamy
Assignee: Rohini Palaniswamy
 Fix For: 0.16.0


   POUserFunc clone does not take care of cloning shipFiles and cacheFiles, so 
the Bloom UDF hits an NPE because the bloom file (cacheFile) is not shipped to the 
task and thus not available to the UDF. The issue will happen with any UDF that 
has shipFiles or cacheFiles.
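The bug class described above can be sketched as a clone() that must explicitly copy its list fields, or the copy ends up sharing (or, with a buggy hand-written clone, losing) them. The class and field names below are illustrative, not Pig's actual POUserFunc code:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a clone() that correctly deep-copies list fields
// such as cacheFiles; omitting the explicit copy is the kind of bug this
// issue describes.
public class UserFuncSketch implements Cloneable {
    String funcSpec;
    List<String> cacheFiles = new ArrayList<>();

    @Override
    public UserFuncSketch clone() throws CloneNotSupportedException {
        UserFuncSketch copy = (UserFuncSketch) super.clone();
        // Object.clone() is shallow: without this line the clone would share
        // the same list as the original.
        copy.cacheFiles = new ArrayList<>(this.cacheFiles);
        return copy;
    }

    public static void main(String[] args) throws Exception {
        UserFuncSketch original = new UserFuncSketch();
        original.funcSpec = "Bloom";
        original.cacheFiles.add("bloom.bin#bloom");

        UserFuncSketch cloned = original.clone();
        // Mutating the original must not affect the clone's cacheFiles.
        original.cacheFiles.clear();
        System.out.println(cloned.cacheFiles.size());
    }
}
```

In the reported failure the effect was the opposite direction of the same mistake: the cloned operator ended up with no cacheFiles at all, so the bloom file never reached the task.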



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (PIG-4705) Error Schema for data cannot be determined using HCatalog

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai resolved PIG-4705.
-
Resolution: Duplicate

This issue is the same as PIG-4703. As per the internal ticket in Hortonworks, 
PIG-4703 captures the root cause of the issue.

> Error Schema for data cannot be determined using HCatalog
> -
>
> Key: PIG-4705
> URL: https://issues.apache.org/jira/browse/PIG-4705
> Project: Pig
>  Issue Type: Bug
>  Components: tez
>Affects Versions: 0.15.0
> Environment: HDP 2.3.2
>Reporter: Krzysztof Indyk
> Attachments: hive_tables.hql, sample.csv, stack_trace.log
>
>
> When we use {{HCatalog}} as both the source and destination of data for {{Pig}} 
> on {{Tez}}, we get ??ERROR 1115: Schema for data cannot be determined??.
> Pig works fine when we use MapReduce, or when we use HCatalog as only one of 
> the endpoints, i.e. load data directly from a file and store using HCatalog.
> The error appears after upgrading from {{Pig 0.14}} on {{Tez 0.5.2}} to {{Pig 
> 0.15}} on {{Tez 0.7.0}} ({{HDP 2.2.6}} to {{HDP 2.3.2}}).
> To reproduce:
> - create hive tables from [^hive_tables.hql]
> - load data to table_input from [^sample.csv]
> - run following Pig script on Tez
> {code}
> data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
> items_unique = DISTINCT data;
> counted = FOREACH (GROUP items_unique BY col2)
>   GENERATE
> group AS name,
> COUNT(items_unique) AS value;
>   
> STORE counted INTO 'table_output' USING 
> org.apache.hive.hcatalog.pig.HCatStorer();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4705) Error Schema for data cannot be determined using HCatalog

2015-10-26 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated PIG-4705:

Fix Version/s: 0.16.0

> Error Schema for data cannot be determined using HCatalog
> -
>
> Key: PIG-4705
> URL: https://issues.apache.org/jira/browse/PIG-4705
> Project: Pig
>  Issue Type: Bug
>  Components: tez
>Affects Versions: 0.15.0
> Environment: HDP 2.3.2
>Reporter: Krzysztof Indyk
> Fix For: 0.16.0
>
> Attachments: hive_tables.hql, sample.csv, stack_trace.log
>
>
> When we use {{HCatalog}} as both the source and destination of data for {{Pig}} 
> on {{Tez}}, we get ??ERROR 1115: Schema for data cannot be determined??.
> Pig works fine when we use MapReduce, or when we use HCatalog as only one of 
> the endpoints, i.e. load data directly from a file and store using HCatalog.
> The error appears after upgrading from {{Pig 0.14}} on {{Tez 0.5.2}} to {{Pig 
> 0.15}} on {{Tez 0.7.0}} ({{HDP 2.2.6}} to {{HDP 2.3.2}}).
> To reproduce:
> - create hive tables from [^hive_tables.hql]
> - load data to table_input from [^sample.csv]
> - run following Pig script on Tez
> {code}
> data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
> items_unique = DISTINCT data;
> counted = FOREACH (GROUP items_unique BY col2)
>   GENERATE
> group AS name,
> COUNT(items_unique) AS value;
>   
> STORE counted INTO 'table_output' USING 
> org.apache.hive.hcatalog.pig.HCatStorer();
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4698) Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Srikanth Sundarrajan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srikanth Sundarrajan updated PIG-4698:
--
Attachment: PIG-4698.patch

> Enable dynamic resource allocation/de-allocation on Yarn backends
> -
>
> Key: PIG-4698
> URL: https://issues.apache.org/jira/browse/PIG-4698
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Affects Versions: spark-branch
>Reporter: Srikanth Sundarrajan
>Assignee: Srikanth Sundarrajan
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4698.patch
>
>
> Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
> scale out better and provide better wall-clock execution times, while unused 
> resources should be released back to the RM for reuse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4698) Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Srikanth Sundarrajan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srikanth Sundarrajan updated PIG-4698:
--
Status: Patch Available  (was: Open)

> Enable dynamic resource allocation/de-allocation on Yarn backends
> -
>
> Key: PIG-4698
> URL: https://issues.apache.org/jira/browse/PIG-4698
> Project: Pig
>  Issue Type: Sub-task
>  Components: spark
>Affects Versions: spark-branch
>Reporter: Srikanth Sundarrajan
>Assignee: Srikanth Sundarrajan
>  Labels: spork
> Fix For: spark-branch
>
> Attachments: PIG-4698.patch
>
>
> Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
> scale out better and provide better wall-clock execution times, while unused 
> resources should be released back to the RM for reuse.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 39641: PIG-4698 Enable dynamic resource allocation/de-allocation on Yarn backends

2015-10-26 Thread Srikanth Sundarrajan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39641/
---

Review request for pig.


Bugs: PIG-4698
https://issues.apache.org/jira/browse/PIG-4698


Repository: pig-git


Description
---

Resource elasticity needs to be enabled on the Yarn backend to allow jobs to 
scale out better and provide better wall-clock execution times, while unused 
resources should be released back to the RM for reuse.


Diffs
-

  src/docs/src/documentation/content/xdocs/start.xml eedd5b7 
  src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java 
b542013 

Diff: https://reviews.apache.org/r/39641/diff/


Testing
---

Verified that the dynamic configuration is honoured by the Yarn system. Requires 
the auxiliary shuffle service to be enabled at the node manager and application 
level for this to work correctly.


Thanks,

Srikanth Sundarrajan



[jira] Subscription: PIG patch available

2015-10-26 Thread jira
Issue Subscription
Filter: PIG patch available (29 issues)

Subscriber: pigdaily

Key Summary
PIG-4711  Tests in TestCombiner fail due to missing leveldb dependency
https://issues.apache.org/jira/browse/PIG-4711
PIG-4689  CSV Writes incorrect header if two CSV files are created in one 
script
https://issues.apache.org/jira/browse/PIG-4689
PIG-4684  Exception should be changed to warning when job diagnostics cannot 
be fetched
https://issues.apache.org/jira/browse/PIG-4684
PIG-4677  Display failure information on stop on failure
https://issues.apache.org/jira/browse/PIG-4677
PIG-4656  Improve String serialization and comparator performance in 
BinInterSedes
https://issues.apache.org/jira/browse/PIG-4656
PIG-4641  Print the instance of Object without using toString()
https://issues.apache.org/jira/browse/PIG-4641
PIG-4598  Allow user defined plan optimizer rules
https://issues.apache.org/jira/browse/PIG-4598
PIG-4581  thread safe issue in NodeIdGenerator
https://issues.apache.org/jira/browse/PIG-4581
PIG-4539  New PigUnit
https://issues.apache.org/jira/browse/PIG-4539
PIG-4515  org.apache.pig.builtin.Distinct throws ClassCastException
https://issues.apache.org/jira/browse/PIG-4515
PIG-4468  Pig's jackson version conflicts with that of hadoop 2.6.0 or newer
https://issues.apache.org/jira/browse/PIG-4468
PIG-4455  Should use DependencyOrderWalker instead of DepthFirstWalker in 
MRPrinter
https://issues.apache.org/jira/browse/PIG-4455
PIG-4417  Pig's register command should support automatic fetching of jars 
from repo.
https://issues.apache.org/jira/browse/PIG-4417
PIG-4373  Implement PIG-3861 in Tez
https://issues.apache.org/jira/browse/PIG-4373
PIG-4341  Add CMX support to pig.tmpfilecompression.codec
https://issues.apache.org/jira/browse/PIG-4341
PIG-4323  PackageConverter hanging in Spark
https://issues.apache.org/jira/browse/PIG-4323
PIG-4313  StackOverflowError in LIMIT operation on Spark
https://issues.apache.org/jira/browse/PIG-4313
PIG-4251  Pig on Storm
https://issues.apache.org/jira/browse/PIG-4251
PIG-4111  Make Pig compiles with avro-1.7.7
https://issues.apache.org/jira/browse/PIG-4111
PIG-4002  Disable combiner when map-side aggregation is used
https://issues.apache.org/jira/browse/PIG-4002
PIG-3952  PigStorage accepts '-tagSplit' to return full split information
https://issues.apache.org/jira/browse/PIG-3952
PIG-3911  Define unique fields with @OutputSchema
https://issues.apache.org/jira/browse/PIG-3911
PIG-3877  Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3873  Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3866  Create ThreadLocal classloader per PigContext
https://issues.apache.org/jira/browse/PIG-3866
PIG-3864  ToDate(userstring, format, timezone) computes DateTime with strange 
handling of Daylight Saving Time with location based timezones
https://issues.apache.org/jira/browse/PIG-3864
PIG-3851  Upgrade jline to 2.11
https://issues.apache.org/jira/browse/PIG-3851
PIG-3668  COR built-in function when atleast one of the coefficient values is 
NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3587  add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384