Re: Unit test classpath trouble
As you know, we've been testing Pig 0.11 against Hadoop 2.0.4-alpha as part of Bigtop's validation for the latest Hadoop release, and it worked OK. Bigtop doesn't run unit tests, though, so it seems like a build issue to me.

Cos
Re: Unit test classpath trouble
Hi, Andrew:

I just set up a job to run the unit tests against 2.0.4-alpha. I will investigate the failures and reply to this thread.

Thanks,
Johnny Zhang
Re: Unit test classpath trouble
I've tried that, thanks. I did a bit more investigation and it seems the issue is recent Hadoop 2 releases. Has anyone tried running Pig unit tests using a more recent Hadoop release than 2.0.0-alpha? Maybe my trouble is a simple thing that someone with more experience with Pig internals would see right away? Cluster testing seems ok. It's just unit tests that fail. But that is concerning.

I'm trying HEAD of branch-0.11.

My Java is version "1.6.0_43" Java(TM) SE Runtime Environment (build 1.6.0_43-b01) Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode). OS is Ubuntu 13.04 (GNU/Linux 3.8.0-19-generic x86_64).

With defaults and only -Dhadoopversion=23 on the Ant command line, it seems ok.

With build.properties of:

    hadoopversion=23
    hadoop-common.version=2.0.4-alpha
    hadoop-hdfs.version=2.0.4-alpha
    hadoop-mapreduce.version=2.0.4-alpha

or defined on the Ant command line, I'll see unit test failures like:

Testcase: testAccumWithDistinct took 0.868 sec
    Caused an ERROR
org/apache/hadoop/mapred/ResourceMgrDelegate
java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/ResourceMgrDelegate
    at org.apache.hadoop.mapred.YARNRunner.<init>(YARNRunner.java:112)
    at org.apache.hadoop.mapred.YarnClientProtocolProvider.create(YarnClientProtocolProvider.java:34)
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:94)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:81)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:74)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:482)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:461)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:152)
    at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
    at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
    at org.apache.pig.PigServer.storeEx(PigServer.java:931)
    at org.apache.pig.PigServer.store(PigServer.java:898)
    at org.apache.pig.PigServer.openIterator(PigServer.java:811)
    at org.apache.pig.test.TestAccumulator.testAccumWithDistinct(TestAccumulator.java:424)

That suggests a cause but I've not started spelunking code with the hope this is something simple that someone has already encountered.

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
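One way to narrow down a NoClassDefFoundError like the one in this thread is to check which jars on disk, if any, actually contain the missing class. The following is only a rough sketch: the helper name jars_containing is made up for illustration, and the directories to search are an assumption that depends on the local checkout and dependency cache.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, roots):
    """Return sorted paths of jars under `roots` that contain `class_name`.

    `class_name` may use the slash form printed in the error message
    (org/apache/hadoop/mapred/ResourceMgrDelegate) or the dotted form.
    `jars_containing` is a hypothetical helper, not part of Pig or Hadoop.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # silently skip search roots that do not exist
        for jar in root_path.rglob("*.jar"):
            try:
                with zipfile.ZipFile(jar) as zf:
                    if entry in zf.namelist():
                        hits.append(str(jar))
            except zipfile.BadZipFile:
                continue  # skip truncated or non-zip files named *.jar
    return sorted(hits)
```

For example, jars_containing("org/apache/hadoop/mapred/ResourceMgrDelegate", ["build/ivy/lib"]) lists the matching jars under that directory; the path here is a guess and should be replaced with wherever the build actually places its dependencies. An empty result would suggest the class is not being pulled in at all, while a non-empty one points toward how the test classpath is assembled.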
[jira] Subscription: PIG patch available
Issue Subscription
Filter: PIG patch available (22 issues)

Subscriber: pigdaily

Key         Summary
PIG-3317    disable optimizations via pig properties
            https://issues.apache.org/jira/browse/PIG-3317
PIG-3295    Casting from bytearray failing after Union (even when each field is from a single Loader)
            https://issues.apache.org/jira/browse/PIG-3295
PIG-3291    TestExampleGenerator fails on Windows because of lack of file name escaping
            https://issues.apache.org/jira/browse/PIG-3291
PIG-3285    Jobs using HBaseStorage fail to ship dependency jars
            https://issues.apache.org/jira/browse/PIG-3285
PIG-3258    Patch to allow MultiStorage to use more than one index to generate output tree
            https://issues.apache.org/jira/browse/PIG-3258
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PIG-3257
PIG-3247    Piggybank functions to mimic OVER clause in SQL
            https://issues.apache.org/jira/browse/PIG-3247
PIG-3210    Pig fails to start when it cannot write log to log files
            https://issues.apache.org/jira/browse/PIG-3210
PIG-3199    Expose LogicalPlan via PigServer API
            https://issues.apache.org/jira/browse/PIG-3199
PIG-3166    Update eclipse .classpath according to ivy library.properties
            https://issues.apache.org/jira/browse/PIG-3166
PIG-3123    Simplify Logical Plans By Removing Unneccessary Identity Projections
            https://issues.apache.org/jira/browse/PIG-3123
PIG-3088    Add a builtin udf which removes prefixes
            https://issues.apache.org/jira/browse/PIG-3088
PIG-3069    Native Windows Compatibility for Pig E2E Tests and Harness
            https://issues.apache.org/jira/browse/PIG-3069
PIG-3026    Pig checked-in baseline comparisons need a pre-filter to address OS-specific newline differences
            https://issues.apache.org/jira/browse/PIG-3026
PIG-3025    TestPruneColumn unit test - SimpleEchoStreamingCommand perl inline script needs simplification
            https://issues.apache.org/jira/browse/PIG-3025
PIG-3024    TestEmptyInputDir unit test - hadoop version detection logic is brittle
            https://issues.apache.org/jira/browse/PIG-3024
PIG-3015    Rewrite of AvroStorage
            https://issues.apache.org/jira/browse/PIG-3015
PIG-2959    Add a pig.cmd for Pig to run under Windows
            https://issues.apache.org/jira/browse/PIG-2959
PIG-2955    Fix bunch of Pig e2e tests on Windows
            https://issues.apache.org/jira/browse/PIG-2955
PIG-2248    Pig parser does not detect when a macro name masks a UDF name
            https://issues.apache.org/jira/browse/PIG-2248
PIG-2244    Macros cannot be passed relation names
            https://issues.apache.org/jira/browse/PIG-2244
PIG-1914    Support load/store JSON data in Pig
            https://issues.apache.org/jira/browse/PIG-1914

You may edit this subscription at:
https://issues.apache.org/jira/secure/FilterSubscription!default.jspa?subId=13225&filterId=12322384
[jira] [Commented] (PIG-3316) Pig failed to interpret DateTime values in some special cases
[ https://issues.apache.org/jira/browse/PIG-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13655069#comment-13655069 ]

Johnny Zhang commented on PIG-3316:
-----------------------------------

it is a little bit late :) but the unit tests with patch pass for me

> Pig failed to interpret DateTime values in some special cases
> -------------------------------------------------------------
>
>                 Key: PIG-3316
>                 URL: https://issues.apache.org/jira/browse/PIG-3316
>             Project: Pig
>          Issue Type: Bug
>          Components: data, impl
>    Affects Versions: 0.11
>         Environment: 1970-01-01
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.11
>
>         Attachments: PIG-3316.patch
>
>
> For the query
> A = load 'date.txt' as ( f1:int, f2:datetime );
> dump A;
> with input data
> 1,1970-01-01
> 2,1970-01
> pig generates the following output
> (1,1970-01-01T00:00:00.000-01:00)
> (2,1970-01-01T00:00:00.000-01:00)
> which seemingly incorrectly interprets the day or month part as time zone.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
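The symptom described in PIG-3316 (a trailing "-01" from the date being read as a "-01:00" offset) is the kind of result a pattern gives when it looks for a time-zone suffix at the end of the string without requiring a time part first. The sketch below is purely illustrative: both patterns and both function names are hypothetical and are not taken from Pig's source or from PIG-3316.patch.

```python
import re

# Hypothetical naive pattern: a trailing "Z" or a "+hh[:mm]" / "-hh[:mm]" suffix.
NAIVE_TZ = re.compile(r"(Z|[+-]\d{2}(:?\d{2})?)$")

# Hypothetical stricter pattern: the suffix only counts as an offset when it
# follows a time part introduced by "T".
STRICT_TZ = re.compile(r"T[\d:.]*(Z|[+-]\d{2}(:?\d{2})?)$")


def naive_offset(text):
    """Return whatever the naive pattern takes to be the zone, or None."""
    match = NAIVE_TZ.search(text)
    return match.group(1) if match else None


def strict_offset(text):
    """Return the zone only if a time part precedes it, else None."""
    match = STRICT_TZ.search(text)
    return match.group(1) if match else None


# Inputs from the issue: the naive pattern mistakes the day or month for a zone.
print(naive_offset("1970-01-01"))   # -01
print(naive_offset("1970-01"))      # -01
print(strict_offset("1970-01-01"))  # None
print(strict_offset("1970-01-01T00:00:00.000-01:00"))  # -01:00
```

Whether the committed patch takes this approach is not stated in the thread; the sketch only shows why date-only inputs are ambiguous for a suffix-based check.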
[jira] [Assigned] (PIG-3279) Support nested RANK
[ https://issues.apache.org/jira/browse/PIG-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johnny Zhang reassigned PIG-3279:
---------------------------------

    Assignee: Johnny Zhang

> Support nested RANK
> -------------------
>
>                 Key: PIG-3279
>                 URL: https://issues.apache.org/jira/browse/PIG-3279
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Gianmarco De Francisci Morales
>            Assignee: Johnny Zhang
>         Attachments: PIG-3279-1.patch.txt
[jira] [Commented] (PIG-3307) Refactor physical operators to remove methods parameters that are always null
[ https://issues.apache.org/jira/browse/PIG-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654887#comment-13654887 ]

Julien Le Dem commented on PIG-3307:
------------------------------------

[~daijy] Also, most likely it won't make any difference performance-wise.

> Refactor physical operators to remove methods parameters that are always null
> -----------------------------------------------------------------------------
>
>                 Key: PIG-3307
>                 URL: https://issues.apache.org/jira/browse/PIG-3307
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Julien Le Dem
>            Assignee: Julien Le Dem
>         Attachments: PIG-3307_0.patch, PIG-3307_1.patch, PIG-3307_2.patch
>
>
> The physical operators are sometimes overly complex. I'm trying to clean up some unnecessary code.
> In particular, there is an array of getNext(*T* v) where the value v does not seem to have any importance and is just used to pick the correct method.
> I have started a refactoring for a more readable getNext*T*().
[jira] [Commented] (PIG-3279) Support nested RANK
[ https://issues.apache.org/jira/browse/PIG-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654834#comment-13654834 ]

Johnny Zhang commented on PIG-3279:
-----------------------------------

The exception seems to happen between 'global rearrange' and 'POPackage'; still looking at it.

> Support nested RANK
> -------------------
>
>                 Key: PIG-3279
>                 URL: https://issues.apache.org/jira/browse/PIG-3279
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Gianmarco De Francisci Morales
>         Attachments: PIG-3279-1.patch.txt
[jira] [Commented] (PIG-3316) Pig failed to interpret DateTime values in some special cases
[ https://issues.apache.org/jira/browse/PIG-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654814#comment-13654814 ]

Xuefu Zhang commented on PIG-3316:
----------------------------------

Thanks, Santhosh. Patch committed to trunk.
[jira] [Updated] (PIG-3316) Pig failed to interpret DateTime values in some special cases
[ https://issues.apache.org/jira/browse/PIG-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-3316:
-----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
[jira] [Commented] (PIG-3297) Avro files with stringType set to String cannot be read by the AvroStorage LoadFunc
[ https://issues.apache.org/jira/browse/PIG-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654700#comment-13654700 ]

Michael Moss commented on PIG-3297:
-----------------------------------

Niels,

I've run into this also (and a similar issue with Hive), and it seems that it might be brought on not by the code you patched, but perhaps by the avro-1.x.y.jar file itself. We are serializing strings as avro.java.string and everything was working fine on our HDP1.2 (Hortonworks) cluster, but when I upgraded the avro jar that pig uses to avro-1.7.4 from avro-1.5.3, I got this exception.

I also have this issue on the latest version of CDH4.2 (with Impala1.0) in both pig and hive, and the culprit there seems to be the avro-1.7.x.jar that they use. I'm just starting to dig into finding out why, but was hoping you or someone here might have some insight.

Thanks.

> Avro files with stringType set to String cannot be read by the AvroStorage LoadFunc
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-3297
>                 URL: https://issues.apache.org/jira/browse/PIG-3297
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.11.1
>            Reporter: Niels Basjes
>         Attachments: PIG-3297-1.patch, test_record.avro
>
>
> When an Avro file is created there exists the option to set the "String Type" to a different class than the default Utf8.
> A very common situation is that the "String Type" is set to the default String class.
> When trying to read such an Avro file in Pig using the AvroStorage LoadFunc from the included piggybank this gives the following Exception:
> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.util.Utf8
>     at org.apache.pig.piggybank.storage.avro.PigAvroDatumReader.readString(PigAvroDatumReader.java:154)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:150)
Re: Review Request: disable optimizations via pig properties
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11032/#review20427
---

Julien raised a good question asking if "set" in the script works because query parsing may not have happened yet. He's correct - I did not explicitly test that and it doesn't work.

Taking a look at how to proceed. It would be ideal if individual scripts can disable optimizations.

- Travis Crawford


On May 9, 2013, 9:03 p.m., Travis Crawford wrote:
>
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11032/
> ---
>
> (Updated May 9, 2013, 9:03 p.m.)
>
>
> Review request for pig, Julien Le Dem, Bill Graham, and Feng Peng.
>
>
> Description
> ---
>
> Update pig to allow disabling optimizations via pig properties. Currently optimizations must be disabled via command-line options. Pig properties can be set in pig.properties, "set" commands in scripts themselves, and command-line -D options.
>
> The use-case is, for scripts that require certain optimizations to be disabled, allowing the script itself to disable the optimization. Currently whatever runs the script needs to specially handle disabling the optimization for that specific query.
>
>
> This addresses bug PIG-3317.
>     https://issues.apache.org/jira/browse/PIG-3317
>
>
> Diffs
> ---
>
>   src/docs/src/documentation/content/xdocs/perf.xml 108ae7e
>   src/org/apache/pig/Main.java f97ed9f
>   src/org/apache/pig/PigConstants.java ea77e97
>   src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java d26f381
>
> Diff: https://reviews.apache.org/r/11032/diff/
>
>
> Testing
> ---
>
> Manually tested on a fully-distributed cluster.
>
> THIS FAILS:
> PIG_CONF_DIR=/etc/pig/conf ./bin/pig -c query.pig
>
> THIS WORKS:
> PIG_CONF_DIR=/etc/pig/conf ./bin/pig -Dpig.optimizer.rules.disabled=ColumnMapKeyPrune -c query.pig
>
> Notice how "-Dpig.optimizer.rules.disabled=ColumnMapKeyPrune" specifies a pig property, which could be in pig.properties, or the script itself.
>
>
> Failure message:
>
> Pig Stack Trace
> ---
> ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 97550 Input: 0 Column: 1)
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias null
>     at org.apache.pig.PigServer.explain(PigServer.java:1057)
>     at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:419)
>     at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:351)
>     at org.apache.pig.tools.grunt.Grunt.checkScript(Grunt.java:98)
>     at org.apache.pig.Main.run(Main.java:607)
>     at org.apache.pig.Main.main(Main.java:152)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: Error processing rule ColumnMapKeyPrune. Try -t ColumnMapKeyPrune
>     at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:122)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:281)
>     at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
>     at org.apache.pig.PigServer.explain(PigServer.java:1042)
>     ... 10 more
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 97550 Input: 0 Column: 1)
>     at org.apache.pig.newplan.logical.optimizer.ProjectionPatcher$ProjectionRewriter.visit(ProjectionPatcher.java:91)
>     at org.apache.pig.newplan.logical.expression.ProjectExpression.accept(ProjectExpression.java:207)
>     at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>     at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>     at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:142)
>     at org.apache.pig.newplan.logical.relational.LOInnerLoad.accept(LOInnerLoad.java:128)
>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>     at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:124)
>
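To make the idea in the review request concrete, here is a minimal sketch of turning the pig.optimizer.rules.disabled property into a set of rule names and filtering a rule list with it. It is not the patch's implementation: the function names are hypothetical, the comma-separated format is an assumption (the request only shows a single value, ColumnMapKeyPrune), and "OtherRule" is a placeholder.

```python
PROPERTY_KEY = "pig.optimizer.rules.disabled"  # property name from the review request


def disabled_rules(properties, key=PROPERTY_KEY):
    """Parse the property value into a set of rule names.

    Assumes a comma-separated list; whitespace and empty items are ignored.
    """
    raw = properties.get(key, "")
    return {name.strip() for name in raw.split(",") if name.strip()}


def enabled_rules(all_rules, properties):
    """Return the rules from `all_rules` that are not disabled, keeping order."""
    disabled = disabled_rules(properties)
    return [rule for rule in all_rules if rule not in disabled]


# "OtherRule" is a placeholder name, not a real optimizer rule.
props = {PROPERTY_KEY: "ColumnMapKeyPrune"}
print(enabled_rules(["ColumnMapKeyPrune", "OtherRule"], props))  # ['OtherRule']
```

The sketch says nothing about when the property is read, which is the open problem raised in the review: a "set" in the script only helps if the property is consulted after the script has been parsed.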
Re: JavaScript UDFs
Yes, that would be great. I contributed the JavaScript UDFs. Let me know if you find bugs.

On Thu, May 9, 2013 at 7:58 PM, Russell Jurney wrote:
> That would be so cool!
>
> Russell Jurney http://datasyndrome.com
>
> On May 9, 2013, at 7:30 PM, Ruan Pethiyagoda wrote:
>
>> At Hack Reactor, we are setting up a computationally heavy MapReduce job
>> across a 1,000 machine cluster. It is an algorithmic tree traversal
>> expected to run over several hours, comprising over 10 quintillion
>> computations, and occupying at least a few petabytes of storage across
>> HDFS.
>>
>> We plan to run the job using a Java implementation of our functions. The
>> original, however, is in JavaScript, and I noticed that JavaScript UDFs are
>> still considered experimental for want of additional testing. If our
>> project could be of any use in testing edge cases or proving out the
>> JavaScript UDF functionality in Pig, we would be more than happy to help.
>>
>> Cheers,
>>
>> RP
[jira] [Updated] (PIG-3316) Pig failed to interpret DateTime values in some special cases
[ https://issues.apache.org/jira/browse/PIG-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-3316:
-----------------------------

    Status: Patch Available  (was: In Progress)
[jira] [Updated] (PIG-3316) Pig failed to interpret DateTime values in some special cases
[ https://issues.apache.org/jira/browse/PIG-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated PIG-3316:
-----------------------------

    Status: In Progress  (was: Patch Available)
Re: Unit test classpath trouble
Hi, Andrew:

Does something like "-Dhadoopversion=23" help? E.g.:

    ant clean test -Dhadoopversion=23 -Dtest.junit.output.format=xml

Johnny
Unit test classpath trouble
Please pardon the basic question. I'm building Pig 0.11.2-SNAPSHOT against Hadoop 2.0.4. 'ant package' and full cluster tests work fine, but I'm not having much luck with running the unit tests, 'ant test-core' or 'ant test'. The problem looks to be an MR app classpath issue.

Sometimes: java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/client/YarnClientImpl

Sometimes: java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/ResourceMgrDelegate

A few Google searches have turned up no useful pointers. Maybe there is something simple I am missing? How do you set up for running unit tests on your dev boxes?

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)