[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (31 issues)
Subscriber: pigdaily

Key         Summary
PIG-4796    Authenticate with Kerberos using a keytab file
            https://issues.apache.org/jira/browse/PIG-4796
PIG-4788    the value BytesRead metric info always returns 0 even the length of input file is not 0 in spark engine
            https://issues.apache.org/jira/browse/PIG-4788
PIG-4781    Fix remaining unit failure about "TestCollectedGroup" for spark engine
            https://issues.apache.org/jira/browse/PIG-4781
PIG-4745    DataBag should protect content of passed list of tuples
            https://issues.apache.org/jira/browse/PIG-4745
PIG-4734    TOMAP schema inferring breaks some scripts in type checking for bincond
            https://issues.apache.org/jira/browse/PIG-4734
PIG-4684    Exception should be changed to warning when job diagnostics cannot be fetched
            https://issues.apache.org/jira/browse/PIG-4684
PIG-4656    Improve String serialization and comparator performance in BinInterSedes
            https://issues.apache.org/jira/browse/PIG-4656
PIG-4641    Print the instance of Object without using toString()
            https://issues.apache.org/jira/browse/PIG-4641
PIG-4598    Allow user defined plan optimizer rules
            https://issues.apache.org/jira/browse/PIG-4598
PIG-4581    thread safe issue in NodeIdGenerator
            https://issues.apache.org/jira/browse/PIG-4581
PIG-4551    Partition filter is not pushed down in case of SPLIT
            https://issues.apache.org/jira/browse/PIG-4551
PIG-4539    New PigUnit
            https://issues.apache.org/jira/browse/PIG-4539
PIG-4526    Make setting up the build environment easier
            https://issues.apache.org/jira/browse/PIG-4526
PIG-4515    org.apache.pig.builtin.Distinct throws ClassCastException
            https://issues.apache.org/jira/browse/PIG-4515
PIG-4455    Should use DependencyOrderWalker instead of DepthFirstWalker in MRPrinter
            https://issues.apache.org/jira/browse/PIG-4455
PIG-4341    Add CMX support to pig.tmpfilecompression.codec
            https://issues.apache.org/jira/browse/PIG-4341
PIG-4323    PackageConverter hanging in Spark
            https://issues.apache.org/jira/browse/PIG-4323
PIG-4313    StackOverflowError in LIMIT operation on Spark
            https://issues.apache.org/jira/browse/PIG-4313
PIG-4251    Pig on Storm
            https://issues.apache.org/jira/browse/PIG-4251
PIG-4111    Make Pig compiles with avro-1.7.7
            https://issues.apache.org/jira/browse/PIG-4111
PIG-4002    Disable combiner when map-side aggregation is used
            https://issues.apache.org/jira/browse/PIG-4002
PIG-3952    PigStorage accepts '-tagSplit' to return full split information
            https://issues.apache.org/jira/browse/PIG-3952
PIG-3911    Define unique fields with @OutputSchema
            https://issues.apache.org/jira/browse/PIG-3911
PIG-3906    ant site errors out
            https://issues.apache.org/jira/browse/PIG-3906
PIG-3877    Getting Geo Latitude/Longitude from Address Lines
            https://issues.apache.org/jira/browse/PIG-3877
PIG-3873    Geo distance calculation using Haversine
            https://issues.apache.org/jira/browse/PIG-3873
PIG-3866    Create ThreadLocal classloader per PigContext
            https://issues.apache.org/jira/browse/PIG-3866
PIG-3864    ToDate(userstring, format, timezone) computes DateTime with strange handling of Daylight Saving Time with location based timezones
            https://issues.apache.org/jira/browse/PIG-3864
PIG-3851    Upgrade jline to 2.11
            https://issues.apache.org/jira/browse/PIG-3851
PIG-3668    COR built-in function when atleast one of the coefficient values is NaN
            https://issues.apache.org/jira/browse/PIG-3668
PIG-3587    add functionality for rolling over dates
            https://issues.apache.org/jira/browse/PIG-3587

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=16328&filterId=12322384
[jira] [Commented] (PIG-4781) Fix remaining unit failure about "TestCollectedGroup" for spark engine
[ https://issues.apache.org/jira/browse/PIG-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166760#comment-15166760 ]

Xuefu Zhang commented on PIG-4781:
----------------------------------

+1.

> Fix remaining unit failure about "TestCollectedGroup" for spark engine
> ----------------------------------------------------------------------
>
>                 Key: PIG-4781
>                 URL: https://issues.apache.org/jira/browse/PIG-4781
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4781.patch
>
>
> In https://builds.apache.org/job/Pig-spark/lastUnsuccessfulBuild/#showFailuresLink,
> the following unit test fails:
> org.apache.pig.test.TestCollectedGroup.testMapsideGroupWithMergeJoin
> This fails because we currently use a regular join to implement merge join.
> The exception is:
> {code}
> Caused by: org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompilerException: ERROR 2171: Expected one but found more then one root physical operator in physical physicalPlan.
>     at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.visitCollectedGroup(SparkCompiler.java:512)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POCollectedGroup.visit(POCollectedGroup.java:93)
>     at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:259)
>     at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:240)
>     at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:240)
>     at org.apache.pig.backend.hadoop.executionengine.spark.plan.SparkCompiler.compile(SparkCompiler.java:165)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.compile(SparkLauncher.java:425)
>     at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:150)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:301)
>     at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
>     ... 27 more
> {code}
> After we implement merge join, this unit test can be fixed.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-4781) Fix remaining unit failure about "TestCollectedGroup" for spark engine
[ https://issues.apache.org/jira/browse/PIG-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-4781:
----------------------------------
    Assignee: liyunzhang_intel
      Status: Patch Available  (was: Open)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-4781) Fix remaining unit failure about "TestCollectedGroup" for spark engine
[ https://issues.apache.org/jira/browse/PIG-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyunzhang_intel updated PIG-4781:
----------------------------------
    Attachment: PIG-4781.patch

We currently use a regular join in place of merge join, so TestCollectedGroup#testMapsideGroupWithMergeJoin fails (see the analysis in the JIRA description). The PIG-4781 patch skips this unit test in spark mode. Once PIG-4810 (implement merge join in spark mode) has been fixed, we will enable this unit test again in spark mode.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Welcome to our new Pig PMC member Xuefu Zhang
Thank you, Liyun! You did the hard work. I think you well deserve a committership once we merge the branch to trunk. --Xuefu On Wed, Feb 24, 2016 at 5:18 PM, Zhang, Liyun wrote: > Congratulations Xuefu! > > > Kelly Zhang/Zhang,Liyun > Best Regards > > > > -Original Message- > From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek > Jarcec Cecho > Sent: Thursday, February 25, 2016 6:36 AM > To: dev@pig.apache.org > Cc: u...@pig.apache.org > Subject: Re: Welcome to our new Pig PMC member Xuefu Zhang > > Congratulations Xuefu! > > Jarcec > > > On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy > wrote: > > > > It is my pleasure to announce that Xuefu Zhang is our newest addition > > to the Pig PMC. Xuefu is a long time committer of Pig and has been > > actively involved in driving the Pig on Spark effort for the past year. > > > > Please join me in congratulating Xuefu !!! > > > > Regards, > > Rohini > >
Re: Welcome to our new Pig PMC member Xuefu Zhang
Congratulations Xuefu. On Thu, Feb 25, 2016 at 2:59 AM, Rohini Palaniswamy wrote: > It is my pleasure to announce that Xuefu Zhang is our newest addition to > the Pig PMC. Xuefu is a long time committer of Pig and has been actively > involved in driving the Pig on Spark effort for the past year. > > Please join me in congratulating Xuefu !!! > > Regards, > Rohini >
RE: Welcome to our new Pig PMC member Xuefu Zhang
Congratulations Xuefu! On Feb 25, 2016 7:44 AM, "Zhang, Liyun" wrote: > Congratulations Xuefu! > > > Kelly Zhang/Zhang,Liyun > Best Regards > > > > -Original Message- > From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek > Jarcec Cecho > Sent: Thursday, February 25, 2016 6:36 AM > To: dev@pig.apache.org > Cc: u...@pig.apache.org > Subject: Re: Welcome to our new Pig PMC member Xuefu Zhang > > Congratulations Xuefu! > > Jarcec > > > On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy > wrote: > > > > It is my pleasure to announce that Xuefu Zhang is our newest addition > > to the Pig PMC. Xuefu is a long time committer of Pig and has been > > actively involved in driving the Pig on Spark effort for the past year. > > > > Please join me in congratulating Xuefu !!! > > > > Regards, > > Rohini > >
RE: Welcome to our new Pig PMC member Xuefu Zhang
Congratulations Xuefu! Kelly Zhang/Zhang,Liyun Best Regards -Original Message- From: Jarek Jarcec Cecho [mailto:jar...@gmail.com] On Behalf Of Jarek Jarcec Cecho Sent: Thursday, February 25, 2016 6:36 AM To: dev@pig.apache.org Cc: u...@pig.apache.org Subject: Re: Welcome to our new Pig PMC member Xuefu Zhang Congratulations Xuefu! Jarcec > On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy > wrote: > > It is my pleasure to announce that Xuefu Zhang is our newest addition > to the Pig PMC. Xuefu is a long time committer of Pig and has been > actively involved in driving the Pig on Spark effort for the past year. > > Please join me in congratulating Xuefu !!! > > Regards, > Rohini
Re: Welcome to our new Pig PMC member Xuefu Zhang
Congratulations Xuefu! Jarcec > On Feb 24, 2016, at 1:29 PM, Rohini Palaniswamy > wrote: > > It is my pleasure to announce that Xuefu Zhang is our newest addition to > the Pig PMC. Xuefu is a long time committer of Pig and has been actively > involved in driving the Pig on Spark effort for the past year. > > Please join me in congratulating Xuefu !!! > > Regards, > Rohini
Welcome to our new Pig PMC member Xuefu Zhang
It is my pleasure to announce that Xuefu Zhang is our newest addition to the Pig PMC. Xuefu is a long time committer of Pig and has been actively involved in driving the Pig on Spark effort for the past year. Please join me in congratulating Xuefu !!! Regards, Rohini
[jira] [Updated] (PIG-4807) Fix test cases of "TestEvalPipelineLocal" test suite.
[ https://issues.apache.org/jira/browse/PIG-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated PIG-4807: - Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Committed to Spark branch. Thanks, Prateek! > Fix test cases of "TestEvalPipelineLocal" test suite. > - > > Key: PIG-4807 > URL: https://issues.apache.org/jira/browse/PIG-4807 > Project: Pig > Issue Type: Sub-task > Components: spark >Affects Versions: spark-branch >Reporter: prateek vaishnav >Assignee: prateek vaishnav > Fix For: spark-branch > > Attachments: diff_1, diff_2 > > > This jira is created to address the failure of test cases > org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE > org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph > org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-4526) Make setting up the build environment easier
[ https://issues.apache.org/jira/browse/PIG-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated PIG-4526: -- Status: Patch Available (was: Open) This will really help in making builds more reproducible. > Make setting up the build environment easier > > > Key: PIG-4526 > URL: https://issues.apache.org/jira/browse/PIG-4526 > Project: Pig > Issue Type: New Feature >Reporter: Niels Basjes >Assignee: Niels Basjes > Fix For: 0.16.0 > > Attachments: PIG-4526-2015-04-30-1632.patch, > PIG-4526-2015-05-01-1545.patch, PIG-4526-2015-05-03-0910.patch, > PIG-4526-2016-02-24-1310.patch > > > In AVRO-1537 and HADOOP-11843 a docker based solution was created to setup > all the tools for doing a full build. This enables much easier reproduction > of any issues and getting up and running for new developers. > This issue is to 'copy/port' that setup into the pig project. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-4526) Make setting up the build environment easier
[ https://issues.apache.org/jira/browse/PIG-4526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Niels Basjes updated PIG-4526:
------------------------------
    Attachment: PIG-4526-2016-02-24-1310.patch

I ran these commands successfully (Hadoop 23):
{code}
ANT='ant -Dhadoopversion=23 -Djavac.args="-Xlint -Xmaxwarns 100"'
${ANT} clean piggybank jar compile-test
cd contrib/piggybank/java && ${ANT} test
${ANT} test-commit
{code}

> Make setting up the build environment easier
> --------------------------------------------
>
>                 Key: PIG-4526
>                 URL: https://issues.apache.org/jira/browse/PIG-4526
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Niels Basjes
>            Assignee: Niels Basjes
>             Fix For: 0.16.0
>
>         Attachments: PIG-4526-2015-04-30-1632.patch, PIG-4526-2015-05-01-1545.patch, PIG-4526-2015-05-03-0910.patch, PIG-4526-2016-02-24-1310.patch
>
>
> In AVRO-1537 and HADOOP-11843 a docker based solution was created to setup all the tools for doing a full build. This enables much easier reproduction of any issues and getting up and running for new developers.
> This issue is to 'copy/port' that setup into the pig project.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-3906) ant site errors out
[ https://issues.apache.org/jira/browse/PIG-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated PIG-3906: -- Affects Version/s: 0.15.0 Status: Patch Available (was: Open) Is removing the PDFs an acceptable way of fixing this problem ? > ant site errors out > --- > > Key: PIG-3906 > URL: https://issues.apache.org/jira/browse/PIG-3906 > Project: Pig > Issue Type: Bug > Components: build, site >Affects Versions: 0.15.0, 0.12.1 >Reporter: Konstantin Boudnik > Attachments: PIG-3906-Disable-PDF.patch > > > While running > {noformat} > ant -Djavac.version=1.6 -Dforrest.home=/usr/local/apache-forrest > -Dversion=0.12.1 -Dhadoopversion=23 -buildfile contrib/zebra/build.xml clean > jar > {noformat} > site target errors out (see the comment for detailed message) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PIG-3906) ant site errors out
[ https://issues.apache.org/jira/browse/PIG-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Niels Basjes updated PIG-3906: -- Attachment: PIG-3906-Disable-PDF.patch I just checked and this problem is still reproducible using the current trunk. Apparently this problem is related to Forrest and generating PDF files. So I created a rather brutal fix (workaround) to stop this error from happening: simply disable the PDF link at the top right of all pages. The main question is: is this an acceptable way of fixing this issue? > ant site errors out > --- > > Key: PIG-3906 > URL: https://issues.apache.org/jira/browse/PIG-3906 > Project: Pig > Issue Type: Bug > Components: build, site > Affects Versions: 0.12.1 > Reporter: Konstantin Boudnik > Attachments: PIG-3906-Disable-PDF.patch > > > While running > {noformat} > ant -Djavac.version=1.6 -Dforrest.home=/usr/local/apache-forrest > -Dversion=0.12.1 -Dhadoopversion=23 -buildfile contrib/zebra/build.xml clean > jar > {noformat} > site target errors out (see the comment for detailed message) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PIG-4797) Analyze JOIN performance and improve the same.
[ https://issues.apache.org/jira/browse/PIG-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15162743#comment-15162743 ]

Pallavi Rao commented on PIG-4797:
----------------------------------

Yes [~kellyzly], that is one optimization. The second one can be done on the packager side, by absorbing packaging into GlobalRearrangeConverter.ToGroupKeyValueFunction rather than running it as a separate map in Packager. So, basically, collapse LocalRearrange, GlobalRearrange and Package into GlobalRearrange. This will eliminate 2 map operations.

> Analyze JOIN performance and improve the same.
> ----------------------------------------------
>
>                 Key: PIG-4797
>                 URL: https://issues.apache.org/jira/browse/PIG-4797
>             Project: Pig
>          Issue Type: Improvement
>          Components: spark
>            Reporter: Pallavi Rao
>            Assignee: Pallavi Rao
>              Labels: spork
>         Attachments: Join performance analysis.pdf
>
>
> There is a big performance difference in join between spark and mr mode.
> {code}
> daily = load './NYSE_daily' as (exchange:chararray, symbol:chararray,
>             date:chararray, open:float, high:float, low:float,
>             close:float, volume:int, adj_close:float);
> divs  = load './NYSE_dividends' as (exchange:chararray, symbol:chararray,
>             date:chararray, dividends:float);
> jnd   = join daily by (exchange, symbol), divs by (exchange, symbol);
> store jnd into './join.out';
> {code}
> join.sh
> {code}
> mode=$1
> start=$(date +%s)
> ./pig -x $mode $PIG_HOME/bin/join.pig
> end=$(date +%s)
> execution_time=$(( $end - $start ))
> echo "execution_time:"$execution_time
> {code}
> The execution time:
> || ||mr||spark||
> |join|20 sec|79 sec|
> You can download the test data NYSE_daily and NYSE_dividends from
> https://github.com/alanfgates/programmingpig/blob/master/data/

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
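The optimization suggested in the comment above (collapsing LocalRearrange, GlobalRearrange and Package into a single step) can be pictured with a generic analogy. The sketch below is plain Java, not Pig or Spark code: the three functions are hypothetical placeholders for the per-record work of each operator, and the point is only that composing them gives the same result in one map pass instead of three.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Analogy only: these functions are hypothetical stand-ins, not Pig operators.
class FusedMapSketch {
    // Placeholder per-record transformations, one per "operator".
    static final Function<String, String> localRearrange = s -> s.trim();
    static final Function<String, String> globalRearrange = s -> s.toUpperCase();
    static final Function<String, String> pkg = s -> "(" + s + ")";

    // Three separate map passes, each materializing an intermediate list.
    static List<String> threePasses(List<String> in) {
        List<String> a = in.stream().map(localRearrange).collect(Collectors.toList());
        List<String> b = a.stream().map(globalRearrange).collect(Collectors.toList());
        return b.stream().map(pkg).collect(Collectors.toList());
    }

    // One map pass over a composed function: two fewer map operations.
    static List<String> onePass(List<String> in) {
        Function<String, String> fused =
                localRearrange.andThen(globalRearrange).andThen(pkg);
        return in.stream().map(fused).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = java.util.Arrays.asList(" a ", "b");
        System.out.println(threePasses(input));
        System.out.println(onePass(input));
    }
}
```

Whether the saving is significant in the real Spark plan depends on what each converter does per record; the sketch makes no claim about that.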
[jira] [Assigned] (PIG-4621) Enable Illustrate in spark
[ https://issues.apache.org/jira/browse/PIG-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] prateek vaishnav reassigned PIG-4621: - Assignee: prateek vaishnav (was: Syed Zulfiqar Ali) > Enable Illustrate in spark > -- > > Key: PIG-4621 > URL: https://issues.apache.org/jira/browse/PIG-4621 > Project: Pig > Issue Type: Sub-task > Components: spark > Reporter: liyunzhang_intel > Assignee: prateek vaishnav > Fix For: spark-branch > > > Currently we don't support illustrate in spark mode. > For how illustrate works, see: http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#ILLUSTRATE -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PIG-4807) Fix test cases of "TestEvalPipelineLocal" test suite.
[ https://issues.apache.org/jira/browse/PIG-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15160356#comment-15160356 ] Pallavi Rao commented on PIG-4807: -- Thanks [~Pratyy]. +1 for the new patch. [~xuefuz], please commit the patch. > Fix test cases of "TestEvalPipelineLocal" test suite. > - > > Key: PIG-4807 > URL: https://issues.apache.org/jira/browse/PIG-4807 > Project: Pig > Issue Type: Sub-task > Components: spark >Affects Versions: spark-branch >Reporter: prateek vaishnav >Assignee: prateek vaishnav > Attachments: diff_1, diff_2 > > > This jira is created to address the failure of test cases > org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE > org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph > org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 43875: Fixed TestEvalPipelineLocal test suite.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43875/#review120467
-----------------------------------------------------------

Ship it!

Ship It!

- Pallavi Rao

On Feb. 24, 2016, 7:47 a.m., prateek vaishnav wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43875/
> -----------------------------------------------------------
>
> (Updated Feb. 24, 2016, 7:47 a.m.)
>
> Review request for pig and Pallavi Rao.
>
> Repository: pig-git
>
> Description
> -------
>
> https://issues.apache.org/jira/browse/PIG-4807
>
> Following test cases have been fixed -
> 1. org.apache.pig.test.TestEvalPipelineLocal.testSetLocationCalledInFE
> 2. org.apache.pig.test.TestEvalPipelineLocal.testExplainInDotGraph
> 3. org.apache.pig.test.TestEvalPipelineLocal.testSortWithUDF
>
> 1 was failing because the UDF_CONTEXT configuration was not saved in jobConf.
> This leads UDFContext.getUDFProperties() to return NULL.
>
>     public Properties getUDFProperties(Class c) {
>         UDFContextKey k = generateKey(c, null);
>         Properties p = udfConfs.get(k);
>         if (p == null) {
>             p = new Properties();
>             udfConfs.put(k, p);
>         }
>         return p;
>     }
>
> Here, udfConfs remains empty even though it was set while processing the pig query.
> The udf configuration in jobConf is getting lost while running the job.
> In the code, the udf configuration is meant to be saved by serializing it into jobConf.
>
> Currently, serialization is done before loading the configuration into jobConf.
> It is done in 'newJobConf(PigContext pigContext)'.
> It needs to be done after loading the configuration.
>
>     JobConf jobConf = SparkUtil.newJobConf(pigContext);
>     configureLoader(physicalPlan, op, jobConf);
>     UDFContext.getUDFContext().serialize(jobConf);
>
> 2 was failing because pig-spark did not support 'explain' in dot format. I have added the DotSparkPrinter to fix this.
>
> 3 was failing because the SortConverter class was using SortComparator instead of UDFSortComparator.
>
>     JavaPairRDD sorted = r.sortByKey(
>             sortOperator.new SortComparator(), true);
>
> It should be using the mComparator stored in the POSort class. I have changed it to the following:
>
>     JavaPairRDD sorted = r.sortByKey(
>             sortOperator.getMComparator(), true);
>
>
> Diffs
> -----
>
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java a759857
>   src/org/apache/pig/backend/hadoop/executionengine/spark/SparkLauncher.java b74977d
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/LoadConverter.java 90cff23
>   src/org/apache/pig/backend/hadoop/executionengine/spark/converter/SortConverter.java f54f8fc
>   src/org/apache/pig/backend/hadoop/executionengine/spark/plan/DotSparkPrinter.java e69de29
>   test/org/apache/pig/test/TestEvalPipelineLocal.java d73074c
>
> Diff: https://reviews.apache.org/r/43875/diff/
>
>
> Testing
> -------
>
> Successfully ran TestEvalPipelineLocal in spark/mr/local mode.
>
>
> Thanks,
>
> prateek vaishnav
>
>
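The ordering problem described for test case 1 in the review above can be illustrated without Pig or Spark. The following is a minimal sketch under stated assumptions, not the actual Pig code: `udfConfs`, `configureLoader` and `serialize` here are hypothetical stand-ins that only mimic the roles named in the review, and the job configuration is a plain map. It shows that a snapshot taken before the loader populates its properties is empty, while a snapshot taken afterwards carries them.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-ins for illustration; none of these names are Pig APIs.
class SerializeOrderSketch {
    // Plays the role of the in-memory UDF settings kept per UDF.
    static final Map<String, Properties> udfConfs = new HashMap<>();

    // Plays the role of loader setup, which is where UDF properties get set.
    static void configureLoader() {
        Properties p = udfConfs.computeIfAbsent("MyLoader", k -> new Properties());
        p.setProperty("location", "/data/input");
    }

    // Copies a snapshot of the current UDF settings into the job configuration.
    static void serialize(Map<String, String> jobConf) {
        for (Map.Entry<String, Properties> e : udfConfs.entrySet()) {
            for (String name : e.getValue().stringPropertyNames()) {
                jobConf.put(e.getKey() + "." + name, e.getValue().getProperty(name));
            }
        }
    }

    // Problematic order: the snapshot is taken before anything was configured.
    static Map<String, String> serializeThenConfigure() {
        udfConfs.clear();
        Map<String, String> jobConf = new HashMap<>();
        serialize(jobConf);
        configureLoader();
        return jobConf;
    }

    // Corrected order: configure first, then take the snapshot.
    static Map<String, String> configureThenSerialize() {
        udfConfs.clear();
        Map<String, String> jobConf = new HashMap<>();
        configureLoader();
        serialize(jobConf);
        return jobConf;
    }

    public static void main(String[] args) {
        System.out.println("serialize first: " + serializeThenConfigure());
        System.out.println("configure first: " + configureThenSerialize());
    }
}
```

In this sketch the first call returns an empty map and the second contains the `MyLoader.location` entry, which mirrors why the review moves the serialize call to after `configureLoader`.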