[jira] Subscription: PIG patch available

2016-02-24 Thread jira
Issue Subscription
Filter: PIG patch available (31 issues)

Subscriber: pigdaily

Key Summary
PIG-4796 Authenticate with Kerberos using a keytab file
https://issues.apache.org/jira/browse/PIG-4796
PIG-4788 the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine
https://issues.apache.org/jira/browse/PIG-4788
PIG-4781 Fix remaining unit failure about "TestCollectedGroup" for spark engine
https://issues.apache.org/jira/browse/PIG-4781
PIG-4745 DataBag should protect content of passed list of tuples
https://issues.apache.org/jira/browse/PIG-4745
PIG-4734 TOMAP schema inferring breaks some scripts in type checking for bincond
https://issues.apache.org/jira/browse/PIG-4734
PIG-4684 Exception should be changed to warning when job diagnostics cannot be fetched
https://issues.apache.org/jira/browse/PIG-4684
PIG-4656 Improve String serialization and comparator performance in BinInterSedes
https://issues.apache.org/jira/browse/PIG-4656
PIG-4641 Print the instance of Object without using toString()
https://issues.apache.org/jira/browse/PIG-4641
PIG-4598 Allow user defined plan optimizer rules
https://issues.apache.org/jira/browse/PIG-4598
PIG-4581 thread safe issue in NodeIdGenerator
https://issues.apache.org/jira/browse/PIG-4581
PIG-4551 Partition filter is not pushed down in case of SPLIT
https://issues.apache.org/jira/browse/PIG-4551
PIG-4539 New PigUnit
https://issues.apache.org/jira/browse/PIG-4539
PIG-4526 Make setting up the build environment easier
https://issues.apache.org/jira/browse/PIG-4526
PIG-4515 org.apache.pig.builtin.Distinct throws ClassCastException
https://issues.apache.org/jira/browse/PIG-4515
PIG-4455 Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
https://issues.apache.org/jira/browse/PIG-4455
PIG-4341 Add CMX support to pig.tmpfilecompression.codec
https://issues.apache.org/jira/browse/PIG-4341
PIG-4323 PackageConverter hanging in Spark
https://issues.apache.org/jira/browse/PIG-4323
PIG-4313 StackOverflowError in LIMIT operation on Spark
https://issues.apache.org/jira/browse/PIG-4313
PIG-4251 Pig on Storm
https://issues.apache.org/jira/browse/PIG-4251
PIG-4111 Make Pig compiles with avro-1.7.7
https://issues.apache.org/jira/browse/PIG-4111
PIG-4002 Disable combiner when map-side aggregation is used
https://issues.apache.org/jira/browse/PIG-4002
PIG-3952 PigStorage accepts '-tagSplit' to return full split information
https://issues.apache.org/jira/browse/PIG-3952
PIG-3911 Define unique fields with @OutputSchema
https://issues.apache.org/jira/browse/PIG-3911
PIG-3906 ant site errors out
https://issues.apache.org/jira/browse/PIG-3906
PIG-3877 Getting Geo Latitude/Longitude from Address Lines
https://issues.apache.org/jira/browse/PIG-3877
PIG-3873 Geo distance calculation using Haversine
https://issues.apache.org/jira/browse/PIG-3873
PIG-3866 Create ThreadLocal classloader per PigContext
https://issues.apache.org/jira/browse/PIG-3866
PIG-3864 ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
https://issues.apache.org/jira/browse/PIG-3864
PIG-3851 Upgrade jline to 2.11
https://issues.apache.org/jira/browse/PIG-3851
PIG-3668 COR built-in function when atleast one of the coefficient values is NaN
https://issues.apache.org/jira/browse/PIG-3668
PIG-3587 add functionality for rolling over dates
https://issues.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384


[jira] [Commented] (PIG-4781) Fix remaining unit failure about "TestCollectedGroup" for spark engine

2016-02-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166760#comment-15166760
 ] 

Xuefu Zhang commented on PIG-4781:
--

+1. 

> Fix remaining unit failure about "TestCollectedGroup" for spark engine
> --
>
> Key: PIG-4781
> URL: https://issues.apache.org/jira/browse/PIG-4781
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4781.patch
>
>
> The build at
> https://builds.apache.org/job/Pig-spark/lastUnsuccessfulBuild/#showFailuresLink
> shows that the following unit test fails:
> org.apache.pig.test.TestCollectedGroup.testMapsideGroupWithMergeJoin
> This fails because we currently use a regular join to implement merge join.
> The exception is:
> {code}
> Caused by: org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompilerException: ERROR 2171: Expected one but found more then one root physical operator in physical physicalPlan.
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.visitCollectedGroup(SparkCompiler.java:512)
>   at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POCollectedGroup.visit(POCollectedGroup.java:93)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:259)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:240)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:240)
>   at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:165)
>   at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.compile(SparkLauncher.java:425)
>   at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:150)
>   at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:301)
>   at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
>   at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
>   at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
>   ... 27 more
> {code}
> After we implement merge join, this unit test can be fixed.
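
As a rough illustration of the kind of check behind ERROR 2171 in the trace above, the sketch below counts the root operators of a small plan graph and rejects any plan with more than one. It is not Pig's SparkCompiler code: the class SingleRootCheckSketch, its methods, and the operator names are hypothetical. That the two inputs of a regular join are what produce the extra root is only one plausible reading of the description, not something the trace confirms.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; not Pig's SparkCompiler or PhysicalPlan API.
class SingleRootCheckSketch {

    // A root is an operator that no other operator feeds into.
    static Set<String> roots(Map<String, List<String>> successors) {
        Set<String> roots = new TreeSet<>(successors.keySet());
        for (List<String> targets : successors.values()) {
            roots.removeAll(targets);
        }
        return roots;
    }

    // Same shape as the failing check: exactly one root, otherwise fail.
    static String onlyRoot(Map<String, List<String>> successors) {
        Set<String> found = roots(successors);
        if (found.size() != 1) {
            throw new IllegalStateException(
                "Expected one root operator but found " + found.size() + ": " + found);
        }
        return found.iterator().next();
    }

    public static void main(String[] args) {
        // Assumed plan shape: two loads feeding a join, then a collected group.
        Map<String, List<String>> joinPlan = new LinkedHashMap<>();
        joinPlan.put("loadA", Arrays.asList("join"));
        joinPlan.put("loadB", Arrays.asList("join"));
        joinPlan.put("join", Arrays.asList("collectedGroup"));
        joinPlan.put("collectedGroup", Collections.<String>emptyList());
        System.out.println("roots: " + roots(joinPlan));
    }
}
```

Calling onlyRoot on the two-load plan in main would throw, while a single load feeding a group passes the check.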



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PIG-4781) Fix remaining unit failure about "TestCollectedGroup" for spark engine

2016-02-24 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-4781:
--
Assignee: liyunzhang_intel
  Status: Patch Available  (was: Open)



[jira] [Updated] (PIG-4781) Fix remaining unit failure about "TestCollectedGroup" for spark engine

2016-02-24 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel updated PIG-4781:
--
Attachment: PIG-4781.patch

We currently use a regular join in place of merge join, so 
TestCollectedGroup#testMapsideGroupWithMergeJoin fails (see the analysis in the 
JIRA description). PIG-4781's patch skips the unit test in spark mode. Once 
PIG-4810 (implement merge join in spark mode) has been fixed, we will enable 
this unit test again in spark mode.



Re: Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Xuefu Zhang
Thank you, Liyun! You did the hard work. I think you well deserve a
committership once we merge the branch to trunk.

--Xuefu

On Wed, Feb 24, 2016 at 5:18 PM, Zhang, Liyun wrote:

> Congratulations Xuefu!
>
>
> Kelly Zhang/Zhang,Liyun
> Best Regards
>
>
>
> -Original Message-
> From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek
> Jarcec Cecho
> Sent: Thursday, February 25, 2016 6:36 AM
> To: dev@pig.apache.org
> Cc: u...@pig.apache.org
> Subject: Re: Welcome to our new Pig PMC member Xuefu Zhang
>
> Congratulations Xuefu!
>
> Jarcec
>
> > On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy 
> wrote:
> >
> > It is my pleasure to announce that Xuefu Zhang is our newest addition
> > to the Pig PMC. Xuefu is a long time committer of Pig and has been
> > actively involved in driving the Pig on Spark effort for the past year.
> >
> > Please join me in congratulating Xuefu !!!
> >
> > Regards,
> > Rohini
>
>


Re: Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Praveen R
Congratulations Xuefu.

On Thu, Feb 25, 2016 at 2:59 AM, Rohini Palaniswamy wrote:

> It is my pleasure to announce that Xuefu Zhang is our newest addition to
> the Pig PMC. Xuefu is a long time committer of Pig and has been actively
> involved in driving the Pig on Spark effort for the past year.
>
> Please join me in congratulating Xuefu !!!
>
> Regards,
> Rohini
>


RE: Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Pallavi Rao
Congratulations Xuefu!
On Feb 25, 2016 7:44 AM, "Zhang, Liyun" wrote:

> Congratulations Xuefu!
>
>
> Kelly Zhang/Zhang,Liyun
> Best Regards
>
>
>
> -Original Message-
> From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek
> Jarcec Cecho
> Sent: Thursday, February 25, 2016 6:36 AM
> To: dev@pig.apache.org
> Cc: u...@pig.apache.org
> Subject: Re: Welcome to our new Pig PMC member Xuefu Zhang
>
> Congratulations Xuefu!
>
> Jarcec
>
> > On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy 
> wrote:
> >
> > It is my pleasure to announce that Xuefu Zhang is our newest addition
> > to the Pig PMC. Xuefu is a long time committer of Pig and has been
> > actively involved in driving the Pig on Spark effort for the past year.
> >
> > Please join me in congratulating Xuefu !!!
> >
> > Regards,
> > Rohini
>
>



RE: Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Zhang, Liyun
Congratulations Xuefu!


Kelly Zhang/Zhang,Liyun
Best Regards



-Original Message-
From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek Jarcec 
Cecho
Sent: Thursday, February 25, 2016 6:36 AM
To: dev@pig.apache.org
Cc: u...@pig.apache.org
Subject: Re: Welcome to our new Pig PMC member Xuefu Zhang

Congratulations Xuefu!

Jarcec

> On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy  
> wrote:
> 
> It is my pleasure to announce that Xuefu Zhang is our newest addition 
> to the Pig PMC. Xuefu is a long time committer of Pig and has been 
> actively involved in driving the Pig on Spark effort for the past year.
> 
> Please join me in congratulating Xuefu !!!
> 
> Regards,
> Rohini



Re: Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Jarek Jarcec Cecho
Congratulations Xuefu!

Jarcec

> On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy  
> wrote:
> 
> It is my pleasure to announce that Xuefu Zhang is our newest addition to
> the Pig PMC. Xuefu is a long time committer of Pig and has been actively
> involved in driving the Pig on Spark effort for the past year.
> 
> Please join me in congratulating Xuefu !!!
> 
> Regards,
> Rohini



Welcome to our new Pig PMC member Xuefu Zhang

2016-02-24 Thread Rohini Palaniswamy
It is my pleasure to announce that Xuefu Zhang is our newest addition to
the Pig PMC. Xuefu is a long time committer of Pig and has been actively
involved in driving the Pig on Spark effort for the past year.

Please join me in congratulating Xuefu !!!

Regards,
Rohini


[jira] [Updated] (PIG-4807) Fix test cases of "TestEvalPipelineLocal" test suite.

2016-02-24 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated PIG-4807:
-
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to Spark branch. Thanks, Prateek!

> Fix test cases of "TestEvalPipelineLocal" test suite.
> -
>
> Key: PIG-4807
> URL: https://issues.apache.org/jira/browse/PIG-4807
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Affects Versions: spark-branch
> Reporter: prateek vaishnav
> Assignee: prateek vaishnav
> Fix For: spark-branch
>
> Attachments: diff_1, diff_2
>
>
> This jira is created to address the failure of test cases 
> org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE
> org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph
> org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF





[jira] [Updated] (PIG-4526) Make setting up the build environment easier

2016-02-24 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated PIG-4526:
--
Status: Patch Available  (was: Open)

This will really help in making builds more reproducible.

> Make setting up the build environment easier
> 
>
> Key: PIG-4526
> URL: https://issues.apache.org/jira/browse/PIG-4526
> Project: Pig
> Issue Type: New Feature
> Reporter: Niels Basjes
> Assignee: Niels Basjes
> Fix For: 0.16.0
>
> Attachments: PIG-4526-2015-04-30-1632.patch, 
> PIG-4526-2015-05-01-1545.patch, PIG-4526-2015-05-03-0910.patch, 
> PIG-4526-2016-02-24-1310.patch
>
>
> In AVRO-1537 and HADOOP-11843 a Docker-based solution was created to set up 
> all the tools for doing a full build. This makes it much easier to reproduce 
> issues and to get new developers up and running.
> This issue is to 'copy/port' that setup into the Pig project.





[jira] [Updated] (PIG-4526) Make setting up the build environment easier

2016-02-24 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated PIG-4526:
--
Attachment: PIG-4526-2016-02-24-1310.patch

I ran these commands successfully (Hadoop 23):
{code}
ANT='ant -Dhadoopversion=23 -Djavac.args="-Xlint -Xmaxwarns 100"'
${ANT} clean piggybank jar compile-test
cd contrib/piggybank/java && ${ANT} test
${ANT} test-commit
{code}




[jira] [Updated] (PIG-3906) ant site errors out

2016-02-24 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated PIG-3906:
--
Affects Version/s: 0.15.0
   Status: Patch Available  (was: Open)

Is removing the PDFs an acceptable way of fixing this problem?


> ant site errors out
> ---
>
> Key: PIG-3906
> URL: https://issues.apache.org/jira/browse/PIG-3906
> Project: Pig
>  Issue Type: Bug
>  Components: build, site
>Affects Versions: 0.15.0, 0.12.1
>Reporter: Konstantin Boudnik
> Attachments: PIG-3906-Disable-PDF.patch
>
>
> While running 
> {noformat}
> ant -Djavac.version=1.6 -Dforrest.home=/usr/local/apache-forrest 
> -Dversion=0.12.1 -Dhadoopversion=23 -buildfile contrib/zebra/build.xml clean 
> jar
> {noformat}
> site target errors out (see the comment for detailed message)





[jira] [Updated] (PIG-3906) ant site errors out

2016-02-24 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated PIG-3906:
--
Attachment: PIG-3906-Disable-PDF.patch

I just checked and this problem is still reproducible using the current trunk.
Apparently this problem is related to Forrest and generating PDF files.
So I created a rather brutal fix (workaround) to stop this error from happening: 
simply disable the PDF link at the top right of all pages.

The main question is: is this an acceptable way of fixing this issue?


> ant site errors out
> ---
>
> Key: PIG-3906
> URL: https://issues.apache.org/jira/browse/PIG-3906
> Project: Pig
>  Issue Type: Bug
>  Components: build, site
>Affects Versions: 0.12.1
>Reporter: Konstantin Boudnik
> Attachments: PIG-3906-Disable-PDF.patch
>
>
> While running 
> {noformat}
> ant -Djavac.version=1.6 -Dforrest.home=/usr/local/apache-forrest 
> -Dversion=0.12.1 -Dhadoopversion=23 -buildfile contrib/zebra/build.xml clean 
> jar
> {noformat}
> site target errors out (see the comment for detailed message)





[jira] [Commented] (PIG-4797) Analyze JOIN performance and improve the same.

2016-02-24 Thread Pallavi Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15162743#comment-15162743
 ] 

Pallavi Rao commented on PIG-4797:
--

Yes [~kellyzly], that is one optimization. The second one can be done on the 
packager side, by absorbing packaging into 
GlobalRearrangeConverter.ToGroupKeyValueFunction rather than running it as a 
separate map in Packager. So, basically, collapse LocalRearrange, 
GlobalRearrange and Package into GlobalRearrange. This will eliminate 2 map 
operations.
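
To make the idea of collapsing several map steps into one concrete, here is a minimal sketch in plain Java. It does not use Spark or Pig classes: the three functions only stand in for the per-record work of LocalRearrange, GlobalRearrange and Package, and FusedMapSketch and its method names are hypothetical. It ignores the shuffle that a real GlobalRearrange involves and only shows that three passes and one fused pass give the same output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch; not the Spark converters in Pig's spark branch.
class FusedMapSketch {

    // Stand-ins for the per-record work of the three operators.
    static final Function<String, String[]> LOCAL_REARRANGE =
        line -> line.split(",", 2);
    static final Function<String[], String[]> GLOBAL_REARRANGE =
        kv -> new String[] { kv[0].trim(), kv[1].trim() };
    static final Function<String[], String> PACKAGE =
        kv -> kv[0] + "=" + kv[1];

    // Three separate map passes, each materialising an intermediate list.
    static List<String> threePasses(List<String> input) {
        List<String[]> a = input.stream().map(LOCAL_REARRANGE).collect(Collectors.toList());
        List<String[]> b = a.stream().map(GLOBAL_REARRANGE).collect(Collectors.toList());
        return b.stream().map(PACKAGE).collect(Collectors.toList());
    }

    // One map pass over a single composed function.
    static List<String> onePass(List<String> input) {
        Function<String, String> fused =
            LOCAL_REARRANGE.andThen(GLOBAL_REARRANGE).andThen(PACKAGE);
        return input.stream().map(fused).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a, 1", "b ,2");
        System.out.println(threePasses(rows).equals(onePass(rows)));
    }
}
```

Whether the fused form is actually faster in Spark depends on how the stages are scheduled, which this sketch does not model.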

> Analyze JOIN performance and improve the same.
> --
>
> Key: PIG-4797
> URL: https://issues.apache.org/jira/browse/PIG-4797
> Project: Pig
> Issue Type: Improvement
> Components: spark
> Reporter: Pallavi Rao
> Assignee: Pallavi Rao
> Labels: spork
> Attachments: Join performance analysis.pdf
>
>
> There is a big performance difference in join between spark and mr mode.
> {code}
> daily = load './NYSE_daily' as (exchange:chararray, symbol:chararray,
> date:chararray, open:float, high:float, low:float,
> close:float, volume:int, adj_close:float);
> divs  = load './NYSE_dividends' as (exchange:chararray, symbol:chararray,
> date:chararray, dividends:float);
> jnd   = join daily by (exchange, symbol), divs by (exchange, symbol);
> store jnd into './join.out';
> {code}
> join.sh
> {code}
> mode=$1
> start=$(date +%s)
> ./pig -x $mode  $PIG_HOME/bin/join.pig
> end=$(date +%s)
> execution_time=$(( $end - $start ))
> echo "execution_time:"$execution_time
> {code}
> The execution time:
> || ||mr||spark||
> |join|20 sec|79 sec|
> You can download the test data NYSE_daily and NYSE_dividends from 
> https://github.com/alanfgates/programmingpig/blob/master/data/





[jira] [Assigned] (PIG-4621) Enable Illustrate in spark

2016-02-24 Thread prateek vaishnav (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

prateek vaishnav reassigned PIG-4621:
-

Assignee: prateek vaishnav  (was: Syed Zulfiqar Ali)

> Enable Illustrate in spark
> --
>
> Key: PIG-4621
> URL: https://issues.apache.org/jira/browse/PIG-4621
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: prateek vaishnav
> Fix For: spark-branch
>
>
> Currently we don't support illustrate in spark mode.
> For how illustrate works, see:
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#ILLUSTRATE





[jira] [Commented] (PIG-4807) Fix test cases of "TestEvalPipelineLocal" test suite.

2016-02-24 Thread Pallavi Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/PIG-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15160356#comment-15160356
 ] 

Pallavi Rao commented on PIG-4807:
--

Thanks [~Pratyy]. +1 for the new patch. [~xuefuz], please commit the patch.

> Fix test cases of "TestEvalPipelineLocal" test suite.
> -
>
> Key: PIG-4807
> URL: https://issues.apache.org/jira/browse/PIG-4807
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Affects Versions: spark-branch
> Reporter: prateek vaishnav
> Assignee: prateek vaishnav
> Attachments: diff_1, diff_2
>
>
> This jira is created to address the failure of test cases 
> org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE
> org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph
> org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF





Re: Review Request 43875: Fixed TestEvalPipelineLocal test suite.

2016-02-24 Thread Pallavi Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43875/#review120467
---


Ship it!




Ship It!

- Pallavi Rao


On Feb. 24, 2016, 7:47 a.m., prateek vaishnav wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43875/
> ---
> 
> (Updated Feb. 24, 2016, 7:47 a.m.)
> 
> 
> Review request for pig and Pallavi Rao.
> 
> 
> Repository: pig-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/PIG-4807
> 
> The following test cases have been fixed:
> 1. org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE
> 2. org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph
> 3. org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF
> 
> 1 was failing because the UDF_CONTEXT configuration was not saved in jobConf. 
> This leads UDFContext.getUDFProperties() to return NULL.
> 
>     public Properties getUDFProperties(Class c) {
>         UDFContextKey k = generateKey(c, null);
>         Properties p = udfConfs.get(k);
>         if (p == null) {
>             p = new Properties();
>             udfConfs.put(k, p);
>         }
>         return p;
>     }
> 
> Here, udfConfs remains empty even though it was set while processing the Pig 
> query.
> The UDF configuration in jobConf is lost while the job runs.
> The code is meant to save the UDF configuration by serializing it into 
> jobConf.
> 
> Currently, serialization is done before the configuration is loaded into 
> jobConf, in 'newJobConf(PigContext pigContext)'.
> It needs to be done after the configuration is loaded:
> 
>     JobConf jobConf = SparkUtil.newJobConf(pigContext);
>     configureLoader(physicalPlan, op, jobConf);
>     UDFContext.getUDFContext().serialize(jobConf);
> 
> 2 was failing because pig-spark did not support 'explain' in dot format. I 
> have added DotSparkPrinter to fix this.
> 
> 3 was failing because the SortConverter class was using SortComparator 
> instead of UDFSortComparator:
> 
>     JavaPairRDD sorted = r.sortByKey(
>         sortOperator.new SortComparator(), true);
> 
> It should use the mComparator stored in the POSort class. I have changed it 
> to the following:
> 
>     JavaPairRDD sorted = r.sortByKey(
>         sortOperator.getMComparator(), true);
> 
> 
> Diffs
> -
> 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java a759857
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java b74977d
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LoadConverter.java 90cff23
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SortConverter.java f54f8fc
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/DotSparkPrinter.java e69de29
>   test/org/apache/pig/test/TestEvalPipelineLocal.java d73074c
> 
> Diff: https://reviews.apache.org/r/43875/diff/
> 
> 
> Testing
> ---
> 
> Successfully ran TestEvalPipelineLocal in spark/mr/local mode.
> 
> 
> Thanks,
> 
> prateek vaishnav
> 
>
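
The ordering problem described in point 1 of the review request can be sketched with plain Java collections. This is only an illustration under stated assumptions: Context stands in for UDFContext, a Map stands in for JobConf, and the names SerializeOrderSketch, propertiesFor and buildJobConf are hypothetical. It also assumes that loading the configuration is the step that populates the UDF properties, which the review text implies but does not show.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-ins; not Pig's UDFContext or Hadoop's JobConf.
class SerializeOrderSketch {

    static final class Context {
        private final Map<String, Properties> confs = new HashMap<>();

        // Lazily creates the Properties for a key, like the quoted getUDFProperties.
        Properties propertiesFor(String key) {
            return confs.computeIfAbsent(key, k -> new Properties());
        }

        // Copies a snapshot of the current settings into the job configuration.
        void serialize(Map<String, String> jobConf) {
            for (Map.Entry<String, Properties> e : confs.entrySet()) {
                for (String name : e.getValue().stringPropertyNames()) {
                    jobConf.put("udf." + e.getKey() + "." + name,
                                e.getValue().getProperty(name));
                }
            }
        }
    }

    static Map<String, String> buildJobConf(boolean serializeAfterLoad) {
        Context ctx = new Context();
        Map<String, String> jobConf = new HashMap<>();
        if (!serializeAfterLoad) {
            ctx.serialize(jobConf);   // too early: nothing has been set yet
        }
        // Stands in for the configuration-loading step (assumed to set UDF properties).
        ctx.propertiesFor("MyLoader").setProperty("location", "/data/in");
        if (serializeAfterLoad) {
            ctx.serialize(jobConf);   // snapshot now includes the loader settings
        }
        return jobConf;
    }

    public static void main(String[] args) {
        System.out.println("serialize first: " + buildJobConf(false));
        System.out.println("serialize last:  " + buildJobConf(true));
    }
}
```

Serializing first leaves the job configuration without the loader's settings, which matches the symptom reported for testSetLocationCalledInFE; serializing last carries them through.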