Re: Review Request 45667: Support Pig On Spark

2016-06-13 Thread Pallavi Rao
: https://reviews.apache.org/r/45667/diff/ Testing --- New UTs were added where required and ensure old UTs pass -> https://builds.apache.org/job/Pig-spark/ Thanks, Pallavi Rao

Re: Review Request 45667: Support Pig On Spark

2016-07-10 Thread Pallavi Rao
UTs were added where required and ensure old UTs pass -> https://builds.apache.org/job/Pig-spark/ Thanks, Pallavi Rao

Re: Review Request 45667: Support Pig On Spark

2016-10-26 Thread Pallavi Rao
required and ensure old UTs pass -> https://builds.apache.org/job/Pig-spark/ Thanks, Pallavi Rao

Re: [ANNOUNCE] Welcome new Pig Committer - Liyun Zhang

2016-12-19 Thread Pallavi Rao
Congratulations Liyun! On Fri, Dec 16, 2016 at 2:12 PM, Praveen R wrote: > Congratulations Liyun ! > > On Fri, Dec 16, 2016 at 11:52 AM, Zhang, Liyun > wrote: > >> Thanks for all your help on this project! >> >> >> Regards, >> Zhang,Liyun >> >> >> >> >> -Original Message- >> From: Ke, X

Re: Why pig on spark use RDD API rather than DataFrame API ?

2017-01-08 Thread Pallavi Rao
Yes. That was the first question I asked when I started work on Pig on Spark. After investigating a little more, I realized that the current design does not allow for easy use of DataFrame API. We do an operator by operator substitution and use Tuple as the datatype. We would end up converting RDDs

PIG on Spark - PIG to DataFrames?

2015-09-07 Thread Pallavi Rao
Hi, I was looking at the PIG on Spark effort and I noticed that there is scope for performance optimization. For example, we don't try to evaluate from the plan what the best fit for groupBy is; it could be mapped to Spark's groupBy/aggregateBy/reduceBy. With DataFrames in Spark, the Catalyst Op
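
To illustrate the groupBy versus reduceBy point in the message above, here is a minimal sketch in plain Python. It does not use Spark or Pig; `group_then_sum` and `reduce_by_key` are hypothetical names made up for this illustration, and the "partitions" are ordinary lists. The assumption being illustrated is that a reduceBy-style plan can combine values inside each partition before merging, while a groupBy-style plan first collects every value for a key.

```python
from collections import defaultdict


def group_then_sum(pairs):
    """groupBy-style: collect every value per key, then aggregate."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}


def reduce_by_key(partitions, combine):
    """reduceBy-style: combine inside each partition, then merge the partials."""
    partials = []
    for partition in partitions:
        local = {}
        for key, value in partition:
            local[key] = combine(local[key], value) if key in local else value
        partials.append(local)

    merged = {}
    for local in partials:
        for key, value in local.items():
            merged[key] = combine(merged[key], value) if key in merged else value
    return merged


# Two hypothetical partitions of (key, value) records.
partitions = [[("a", 1), ("b", 2), ("a", 3)], [("a", 4), ("b", 5)]]
flat = [pair for partition in partitions for pair in partition]

by_group = group_then_sum(flat)
by_reduce = reduce_by_key(partitions, lambda x, y: x + y)
```

Both functions give the same totals per key. The difference is that the second one only carries one partial value per key out of each partition, which is the kind of saving a combiner-aware plan is after.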

Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2015-11-27 Thread Pallavi Rao
Testing --- The patch unblocked one UT in TestCombiner. Added another UT in the same class. Also did some manual testing. Thanks, Pallavi Rao

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2015-12-07 Thread Pallavi Rao
c/org/apache/pig/parser/LogicalPlanGenerator.g 99545b0 test/org/apache/pig/test/TestCombiner.java df44293 Diff: https://reviews.apache.org/r/40743/diff/ Testing --- The patch unblocked one UT in TestCombiner. Added another UT in the same class. Also did some manual testing. Thanks, Pallavi Rao

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2015-12-08 Thread Pallavi Rao
ner.java df44293 Diff: https://reviews.apache.org/r/40743/diff/ Testing --- The patch unblocked one UT in TestCombiner. Added another UT in the same class. Also did some manual testing. Thanks, Pallavi Rao

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2015-12-17 Thread Pallavi Rao
cOp.Final) > > // -> reduceBy (uses algebraicOp.Intermediate) > > // -> localRearrange > > // -> foreach (using algebraicOp.Initial) Yep. Will change. It should read: // Output: // foreach (using algebrai

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2015-12-17 Thread Pallavi Rao
at we want and hence the jugglery there. I can add a comment in the code to make this clear. - Pallavi --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/40743/#review109469 --- On Dec. 9, 2015, 5:

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2015-12-17 Thread Pallavi Rao
another UT in the same class. Also did some manual testing. Thanks, Pallavi Rao

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2016-01-27 Thread Pallavi Rao
------------ On Dec. 18, 2015, 6:47 a.m., Pallavi Rao wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/40743/ > --

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2016-01-27 Thread Pallavi Rao
156749line253> > > > > Don't remember if this needs to be earlier. Probably safer to keep this > > right after the combiner optimization. Done. Moved it up. - Pallavi --- This is an automa

Re: Review Request 40743: PIG-4709 Improve performance of GROUPBY operator on Spark

2016-01-27 Thread Pallavi Rao
Testing --- The patch unblocked one UT in TestCombiner. Added another UT in the same class. Also did some manual testing. Thanks, Pallavi Rao

Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations

2016-02-01 Thread Pallavi Rao
. Thanks, Pallavi Rao

Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations

2016-02-02 Thread Pallavi Rao
pass. Thanks, Pallavi Rao

Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations

2016-02-04 Thread Pallavi Rao
generated e-mail. To reply, visit: https://reviews.apache.org/r/43044/#review117779 ----------- On Feb. 3, 2016, 6:23 a.m., Pallavi Rao wrote: > > --- > Thi

Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations

2016-02-04 Thread Pallavi Rao
/TestLocationInPhysicalPlan.java 0e45434 test/org/apache/pig/test/TestCombiner.java b2e81ac Diff: https://reviews.apache.org/r/43044/diff/ Testing --- With this patch, all tests in TestCombiner pass. Thanks, Pallavi Rao

Re: Review Request 43327: Fixed few unit test cases in "TestEvalPipelineLocal" Test group

2016-02-10 Thread Pallavi Rao
tiple times. test/org/apache/pig/test/TestEvalPipelineLocal.java (line 1012) <https://reviews.apache.org/r/43327/#comment180125> Can move this looping up. - Pallavi Rao On Feb. 8, 2016, 11:55 a.m., prateek vaishnav wrote: > > --- > Thi

Re: Review Request 43327: Fixed few unit test cases in "TestEvalPipelineLocal" Test group

2016-02-11 Thread Pallavi Rao
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43327/#review118980 --- Ship it! Ship It! - Pallavi Rao On Feb. 11, 2016, 12:51 p.m

Re: Review Request 43875: Fixed TestEvalPipelineLocal test suite.

2016-02-23 Thread Pallavi Rao
an/DotSparkPrinter.java (line 130) <https://reviews.apache.org/r/43875/#comment181913> Nit: naming of variable sparkp. Should follow camel case. Maybe sparkInnnerOp? - Pallavi Rao On Feb. 23, 2016, 12:29 p.m., prateek vaishnav wrote: > > --

Re: Review Request 43875: Fixed TestEvalPipelineLocal test suite.

2016-02-24 Thread Pallavi Rao
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/43875/#review120467 --- Ship it! Ship It! - Pallavi Rao On Feb. 24, 2016, 7:47 a.m

RE: Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Pallavi Rao
Congratulations Xuefu! On Feb 25, 2016 7:44 AM, "Zhang, Liyun" wrote: > Congratulations Xuefu! > > > Kelly Zhang/Zhang,Liyun > Best Regards > > > > -Original Message- > From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek > Jarcec Cecho > Sent: Thursday, February 25, 2016

Review Request 45667: Support Pig On Spark

2016-04-03 Thread Pallavi Rao
://reviews.apache.org/r/45667/diff/ Testing --- New UTs were added where required and ensure old UTs pass -> https://builds.apache.org/job/Pig-spark/ Thanks, Pallavi Rao

[jira] [Commented] (PIG-4893) Task deserialization time is too long for spark on yarn mode

2016-06-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310108#comment-15310108 ] Pallavi Rao commented on PIG-4893: -- +1 for addressing this. When I had noticed

[jira] [Commented] (PIG-4893) Task deserialization time is too long for spark on yarn mode

2016-06-16 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333428#comment-15333428 ] Pallavi Rao commented on PIG-4893: -- [~kellyzly], can you please upload the patch to re

[jira] [Commented] (PIG-4846) Use pigmix to test the performance of pig on spark

2016-06-23 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15347623#comment-15347623 ] Pallavi Rao commented on PIG-4846: -- This is really good news, [~kellyzly]. All

[jira] [Commented] (PIG-4969) Optimize combine case for spark mode

2016-09-13 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486763#comment-15486763 ] Pallavi Rao commented on PIG-4969: -- [~kellyzly], my comment is that we should try t

[jira] [Commented] (PIG-4969) Optimize combine case for spark mode

2016-09-13 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15489202#comment-15489202 ] Pallavi Rao commented on PIG-4969: -- +1. PIG-4969_3.patch looks good to me. > O

[jira] [Created] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-10-21 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4709: Summary: Improve performance of GROUPBY operator on Spark Key: PIG-4709 URL: https://issues.apache.org/jira/browse/PIG-4709 Project: Pig Issue Type: Sub-task

[jira] [Commented] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-10-21 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966413#comment-14966413 ] Pallavi Rao commented on PIG-4709: -- I hacked around the code a bit and optimized

[jira] [Created] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-23 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4711: Summary: Tests in TestCombiner fail due to missing leveldb dependency Key: PIG-4711 URL: https://issues.apache.org/jira/browse/PIG-4711 Project: Pig Issue Type

[jira] [Assigned] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-23 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao reassigned PIG-4711: Assignee: Pallavi Rao > Tests in TestCombiner fail due to missing leveldb depende

[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-23 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4711: - Status: Patch Available (was: Open) > Tests in TestCombiner fail due to missing leveldb depende

[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-23 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4711: - Attachment: PIG-4711.patch Small patch, hence no Review Board request. With this change, 4 out of 13

[jira] [Commented] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975609#comment-14975609 ] Pallavi Rao commented on PIG-4711: -- [~xuefuz], [~mohitsabharwal], please review the p

[jira] [Commented] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975691#comment-14975691 ] Pallavi Rao commented on PIG-4711: -- Thanks [~xuefuz] for the quick review and co

[jira] [Reopened] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao reopened PIG-4711: -- The patch broke the build. Misplaced line in ivy.xml. > Tests in TestCombiner fail due to missing leve

[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4711: - Attachment: PIG-4711-v1.patch The new patch that doesn't cause the build break. > Tests in Test

[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4711: - Status: Patch Available (was: Reopened) [~xuefuz], the older patch had the dependency misplaced. Here is

[jira] [Updated] (PIG-4711) Tests in TestCombiner fail due to missing leveldb dependency

2015-10-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4711: - Priority: Blocker (was: Major) > Tests in TestCombiner fail due to missing leveldb depende

[jira] [Created] (PIG-4746) Enable spork in Oozie

2015-11-26 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4746: Summary: Enable spork in Oozie Key: PIG-4746 URL: https://issues.apache.org/jira/browse/PIG-4746 Project: Pig Issue Type: Sub-task Components: spark

[jira] [Updated] (PIG-4746) Ensure spork can be run as PIG action in Oozie

2015-11-26 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4746: - Summary: Ensure spork can be run as PIG action in Oozie (was: Enable spork in Oozie) > Ensure spork

[jira] [Updated] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-11-27 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4709: - Attachment: PIG-4709.patch Initial patch. Handles algebraic operations on grouped data. There are certain

[jira] [Commented] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-11-27 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15029780#comment-15029780 ] Pallavi Rao commented on PIG-4709: -- Before patch: {code} 2015-11-27 14:04:16,811 [

[jira] [Updated] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-11-27 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4709: - Status: Patch Available (was: Open) > Improve performance of GROUPBY operator on Sp

[jira] [Updated] (PIG-4746) Ensure spork can be run as PIG action in Oozie

2015-11-27 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4746: - Description: I was able to get PIG on SPARK going with Oozie, but only in "local" mode. Here is

[jira] [Commented] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-11-29 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031385#comment-15031385 ] Pallavi Rao commented on PIG-4709: -- [~mohitsabharwal], [~xuefuz], review pl

[jira] [Updated] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-12-07 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4709: - Attachment: PIG-4709-v1.patch Outlining the approach here: Currently, the GROUPBY operator of PIG is mapped

[jira] [Commented] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-12-10 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050314#comment-15050314 ] Pallavi Rao commented on PIG-4709: -- That is right [~kellyzly], the patch does NOT add

[jira] [Updated] (PIG-4709) Improve performance of GROUPBY operator on Spark

2015-12-17 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4709: - Attachment: PIG-4709-v2.patch Addressed [~kellyzly]'s comments > Improve performance of GROUPBY

[jira] [Commented] (PIG-4765) Enable TestPoissonSampleLoader in spark mode

2015-12-20 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066122#comment-15066122 ] Pallavi Rao commented on PIG-4765: -- +1. Patch looks good to me. Applies cleanly and

[jira] [Created] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2015-12-24 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4766: Summary: Ensure GroupBy is optimized for all algebraic Operations Key: PIG-4766 URL: https://issues.apache.org/jira/browse/PIG-4766 Project: Pig Issue Type: Sub

[jira] [Commented] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2015-12-24 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070782#comment-15070782 ] Pallavi Rao commented on PIG-4766: -- PIG-4709 introduced Combiner optimization for Grou

[jira] [Updated] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2015-12-24 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4766: - Attachment: PIG-4766.patch The patch requires PIG-4709. Hence, cannot upload to Review Board. Will upload

[jira] [Updated] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2015-12-24 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4766: - Status: Patch Available (was: Open) > Ensure GroupBy is optimized for all algebraic Operati

[jira] [Commented] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2015-12-24 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070800#comment-15070800 ] Pallavi Rao commented on PIG-4766: -- With this patch, all tests in TestCombiner

[jira] [Commented] (PIG-4709) Improve performance of GROUPBY operator on Spark

2016-01-07 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088689#comment-15088689 ] Pallavi Rao commented on PIG-4709: -- [~kellyzly], I have addressed your review comments

[jira] [Commented] (PIG-4601) Implement Merge CoGroup for Spark engine

2016-01-13 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097571#comment-15097571 ] Pallavi Rao commented on PIG-4601: -- [~kellyzly], is this Patch ready for review? I

[jira] [Updated] (PIG-4709) Improve performance of GROUPBY operator on Spark

2016-01-27 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4709: - Attachment: PIG-4709-v3.patch Addressed comments from [~kellyzly] and [~mohitsabharwal] > Impr

[jira] [Commented] (PIG-4783) Refactor SparkLauncher for spark engine

2016-02-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126102#comment-15126102 ] Pallavi Rao commented on PIG-4783: -- [~kellyzly], the refactoring makes the code a

[jira] [Commented] (PIG-4784) Enable "pig.disable.counter" for spark engine

2016-02-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126119#comment-15126119 ] Pallavi Rao commented on PIG-4784: -- [~kellyzly], overall, the patch looks good. How

[jira] [Commented] (PIG-4616) Fix UT errors of TestPigRunner in Spark mode

2016-02-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126121#comment-15126121 ] Pallavi Rao commented on PIG-4616: -- Not a small patch. Will review once it is i

[jira] [Updated] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2016-02-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4766: - Attachment: (was: PIG-4766.patch) > Ensure GroupBy is optimized for all algebraic Operati

[jira] [Updated] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2016-02-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4766: - Attachment: PIG-4766.patch [~kellyzly], [~mohitsabharwal], [~kexianda], [~xuefuz], please review. > Ens

[jira] [Commented] (PIG-4783) Refactor SparkLauncher for spark engine

2016-02-02 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129744#comment-15129744 ] Pallavi Rao commented on PIG-4783: -- The new patch looks good to me. +1. > R

[jira] [Updated] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2016-02-02 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4766: - Attachment: PIG-4766-v1.patch Some corner cases weren't taken care of (key being null, co-group)

[jira] [Commented] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2016-02-04 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132024#comment-15132024 ] Pallavi Rao commented on PIG-4766: -- [~kellyzly], yes, for group by, all tuples will

[jira] [Created] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-04 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4797: Summary: Analyze JOIN performance and improve the same. Key: PIG-4797 URL: https://issues.apache.org/jira/browse/PIG-4797 Project: Pig Issue Type: Improvement

[jira] [Updated] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2016-02-04 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4766: - Attachment: PIG-4766-v2.patch Rebased patch. > Ensure GroupBy is optimized for all algebraic Operati

[jira] [Commented] (PIG-4766) Ensure GroupBy is optimized for all algebraic Operations

2016-02-04 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133655#comment-15133655 ] Pallavi Rao commented on PIG-4766: -- Thanks [~xuefuz]. Thanks [~kellyzly] for the re

[jira] [Commented] (PIG-4784) Enable "pig.disable.counter" for spark engine

2016-02-04 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133776#comment-15133776 ] Pallavi Rao commented on PIG-4784: -- +1 for the latest patch. >

[jira] [Commented] (PIG-4777) Enable "TestEvalPipelineLocal" for spark

2016-02-10 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142297#comment-15142297 ] Pallavi Rao commented on PIG-4777: -- Left some comments on RB. Please address those.

[jira] [Commented] (PIG-4616) Fix UT errors of TestPigRunner in Spark mode

2016-02-10 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142315#comment-15142315 ] Pallavi Rao commented on PIG-4616: -- Patch looks good. +1. One very minor nit on RB.

[jira] [Updated] (PIG-4616) Fix UT errors of TestPigRunner in Spark mode

2016-02-10 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4616: - Status: Patch Available (was: Open) > Fix UT errors of TestPigRunner in Spark m

[jira] [Commented] (PIG-4777) Enable "TestEvalPipelineLocal" for spark

2016-02-11 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144180#comment-15144180 ] Pallavi Rao commented on PIG-4777: -- +1 for the new patch. [~xuefuz], the patch ca

[jira] [Commented] (PIG-4616) Fix UT errors of TestPigRunner in Spark mode

2016-02-14 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146898#comment-15146898 ] Pallavi Rao commented on PIG-4616: -- Thanks [~kellyzly]. [~xuefuz], please commit.

[jira] [Commented] (PIG-4281) Fix TestFinish for Spark engine

2016-02-17 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151745#comment-15151745 ] Pallavi Rao commented on PIG-4281: -- [~kellyzly], I have left review comments on RB. Pl

[jira] [Updated] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-18 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4797: - Attachment: Join performance analysis - Google Docs.pdf The attached doc details the performance analysis

[jira] [Commented] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-18 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152085#comment-15152085 ] Pallavi Rao commented on PIG-4797: -- Errata: There was NO performance difference bet

[jira] [Updated] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-18 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4797: - Attachment: (was: Join performance analysis - Google Docs.pdf) > Analyze JOIN performance and impr

[jira] [Updated] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-18 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4797: - Attachment: Join performance analysis.pdf > Analyze JOIN performance and improve the s

[jira] [Commented] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-18 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152220#comment-15152220 ] Pallavi Rao commented on PIG-4797: -- Solution Proposal: Currently, the Spark plan tha

[jira] [Commented] (PIG-4281) Fix TestFinish for Spark engine

2016-02-19 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154123#comment-15154123 ] Pallavi Rao commented on PIG-4281: -- Thanks [~kellyzly]! +1 for the latest patch. [~xu

[jira] [Resolved] (PIG-4601) Implement Merge CoGroup for Spark engine

2016-02-19 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao resolved PIG-4601. -- Resolution: Fixed > Implement Merge CoGroup for Spark eng

[jira] [Commented] (PIG-4807) Fix test cases of "TestEvalPipelineLocal" test suite.

2016-02-23 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15160238#comment-15160238 ] Pallavi Rao commented on PIG-4807: -- [~Pratyy], overall the patch looks good. Have

[jira] [Commented] (PIG-4807) Fix test cases of "TestEvalPipelineLocal" test suite.

2016-02-24 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15160356#comment-15160356 ] Pallavi Rao commented on PIG-4807: -- Thanks [~Pratyy]. +1 for the new patch. [~xu

[jira] [Commented] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-24 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15162743#comment-15162743 ] Pallavi Rao commented on PIG-4797: -- Yes [~kellyzly], that is one optimization. The se

[jira] [Commented] (PIG-4788) the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine

2016-02-25 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168572#comment-15168572 ] Pallavi Rao commented on PIG-4788: -- [~kellyzly], as you mentioned, since the wrappedSp

[jira] [Commented] (PIG-4788) the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine

2016-02-28 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171393#comment-15171393 ] Pallavi Rao commented on PIG-4788: -- [~xuefuz], [~kellyzly], agreed. Let's disable t

[jira] [Commented] (PIG-4243) Fix "TestStore" for Spark engine

2016-02-29 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171562#comment-15171562 ] Pallavi Rao commented on PIG-4243: -- +1 for the new patch. [~xuefuz], please commit.

[jira] [Assigned] (PIG-4820) Merge trunk[3] into spark branch

2016-03-01 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao reassigned PIG-4820: Assignee: Pallavi Rao > Merge trunk[3] into spark bra

[jira] [Created] (PIG-4820) Merge trunk[3] into spark branch

2016-03-01 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4820: Summary: Merge trunk[3] into spark branch Key: PIG-4820 URL: https://issues.apache.org/jira/browse/PIG-4820 Project: Pig Issue Type: Sub-task Components

[jira] [Updated] (PIG-4820) Merge trunk[3] into spark branch

2016-03-03 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4820: - Attachment: PIG-4820.patch The following files had conflict that I resolved manually: build.xml ivy.xml ivy

[jira] [Commented] (PIG-4820) Merge trunk[3] into spark branch

2016-03-03 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177465#comment-15177465 ] Pallavi Rao commented on PIG-4820: -- [~mohitsabharwal], [~kellyzly], [~xuefuz], shou

[jira] [Updated] (PIG-4820) Merge trunk[3] into spark branch

2016-03-03 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Rao updated PIG-4820: - Status: Patch Available (was: Open) > Merge trunk[3] into spark bra

[jira] [Commented] (PIG-4820) Merge trunk[3] into spark branch

2016-03-03 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177474#comment-15177474 ] Pallavi Rao commented on PIG-4820: -- Also, there are a few test failures that I see wh

[jira] [Created] (PIG-4822) Spark UT failures after merge with trunk

2016-03-03 Thread Pallavi Rao (JIRA)
Pallavi Rao created PIG-4822: Summary: Spark UT failures after merge with trunk Key: PIG-4822 URL: https://issues.apache.org/jira/browse/PIG-4822 Project: Pig Issue Type: Bug

[jira] [Commented] (PIG-4822) Spark UT failures after merge with trunk

2016-03-03 Thread Pallavi Rao (JIRA)
[ https://issues.apache.org/jira/browse/PIG-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177565#comment-15177565 ] Pallavi Rao commented on PIG-4822: -- We might need to create sub-tasks for each of
