[jira] [Created] (PIG-2045) Pig treating map values as String causing ClassCastException in CONCAT
Pig treating map values as String causing ClassCastException in CONCAT
----------------------------------------------------------------------

                 Key: PIG-2045
                 URL: https://issues.apache.org/jira/browse/PIG-2045
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0, 0.9.0
            Reporter: Vivek Padmanabhan

I have the script below:

{code}
register mymapudf.jar;
a = load '4523893_1' as (f1);
a1 = foreach a generate org.vivek.udfs.mToMapUDF(f1);
a2 = foreach a1 generate mapout#'k1' as str1, mapout#'k3' as str2;
b = load '4523893_2' as (f1,f2);
c = join a2 by CONCAT(str1,str2), b by CONCAT(f1,f2);
dump c;
{code}

The UDF looks like this:

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class mToMapUDF extends EvalFunc {
    // Always returns the same two-entry map; both values are Java Strings.
    @Override
    public Map exec(Tuple arg0) throws IOException {
        Map myMapTResult = new HashMap();
        myMapTResult.put("k1", "SomeString");
        myMapTResult.put("k3", "SomeOtherString");
        return myMapTResult;
    }

    // Declares the output as an untyped map named "mapout".
    @Override
    public Schema outputSchema(Schema input) {
        // TODO Auto-generated method stub
        return new Schema(new Schema.FieldSchema("mapout", DataType.MAP));
    }
}
{code}

The script fails with the exception:

java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.DataByteArray
	at org.apache.pig.builtin.CONCAT.exec(CONCAT.java:51)

The values of the map output, i.e. str1 and str2, are automatically treated as String by Pig, and this causes the ClassCastException when they are used in subsequent UDFs. Since there is no explicit cast and no types are defined, Pig should treat the values as the default bytearray.

This issue is also observed in 0.9.

The workaround in this case is to explicitly cast all keys involved in the join to chararray.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
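The failure mode in PIG-2045 can be reproduced in plain Java without Pig on the classpath. The sketch below is only an illustration: `Bytes` is a hypothetical stand-in for org.apache.pig.data.DataByteArray, and `concatAssumingBytes` / `concatChecked` are made-up names, not Pig's CONCAT implementation. It shows how a blind cast on a map value that is really a String fails, and how checking the runtime type first avoids the exception.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.pig.data.DataByteArray (assumption, not Pig code).
final class Bytes {
    final byte[] data;

    Bytes(byte[] data) {
        this.data = data;
    }
}

final class MapValueCastSketch {

    // Mirrors the failing pattern: the caller assumes every untyped value is a byte array.
    static String concatAssumingBytes(Object a, Object b) {
        Bytes x = (Bytes) a; // throws ClassCastException when a is actually a String
        Bytes y = (Bytes) b;
        return new String(x.data, StandardCharsets.UTF_8)
                + new String(y.data, StandardCharsets.UTF_8);
    }

    // Converts a value to text based on its runtime type instead of a declared type.
    static String asText(Object v) {
        if (v instanceof Bytes) {
            return new String(((Bytes) v).data, StandardCharsets.UTF_8);
        }
        return String.valueOf(v);
    }

    // Defensive variant: works for both String and Bytes values.
    static String concatChecked(Object a, Object b) {
        return asText(a) + asText(b);
    }

    public static void main(String[] args) {
        // Same keys and values as the UDF in the report.
        Map<String, Object> mapout = new HashMap<>();
        mapout.put("k1", "SomeString");
        mapout.put("k3", "SomeOtherString");

        try {
            concatAssumingBytes(mapout.get("k1"), mapout.get("k3"));
        } catch (ClassCastException e) {
            System.out.println("blind cast failed: " + e.getClass().getSimpleName());
        }
        System.out.println(concatChecked(mapout.get("k1"), mapout.get("k3")));
    }
}
```

In the Pig script itself, the equivalent of `concatChecked` is the workaround named in the report: cast the map values to chararray before they reach CONCAT.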
[jira] [Commented] (PIG-2036) Set header delimiter in PigStorageSchema
    [ https://issues.apache.org/jira/browse/PIG-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029713#comment-13029713 ]

Mads Moeller commented on PIG-2036:
-----------------------------------

Dmitriy,

Thanks for going over the patch. Your changes make it much cleaner.

> Set header delimiter in PigStorageSchema
> ----------------------------------------
>
>                 Key: PIG-2036
>                 URL: https://issues.apache.org/jira/browse/PIG-2036
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.7.0, 0.8.0
>            Reporter: Mads Moeller
>            Assignee: Mads Moeller
>            Priority: Minor
>         Attachments: PIG-2036.1.patch, PIG-2036.patch
>
>
> Piggybank's PigStorageSchema currently defaults the delimiter to a tab in the
> generated header file (.pig_header).
> The attached patch sets the header delimiter to what is passed in via the
> constructor. Otherwise it'll default to tab '\t'.
[jira] [Commented] (PIG-1883) Pig's progress estimation should account for parallel job executions
    [ https://issues.apache.org/jira/browse/PIG-1883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029677#comment-13029677 ]

Laukik Chitnis commented on PIG-1883:
-------------------------------------

> This doesn't lend itself well to automated testing. Any thoughts on how to
> test how the new progress indicator does versus the existing one? Have you
> run any initial tests to measure this?

That's correct; it is difficult to test in an automated fashion. One metric for the performance of the progress estimator would be similar to what is used in the ParaTimer paper, perhaps the RMS of the difference from linear time (assuming the "ideal" estimates run from 0.0 to 1.0 between start time and finish time).

I manually tested it with various Pig scripts that generated different kinds of physical plans. In most cases, I observed that the progress report was the same for both the old and new methods. One simple case where the new method does better is when a very small and a very large job are executed in parallel. In this case, the old estimate shoots up to 50% very early and then moves slowly to 100%, whereas the new estimate grows more gradually from 0 to 100 as the bigger job's execution progresses. I haven't yet automated capturing these runs and analyzing the metric.

> I don't understand the logic here. Why is it 0% done if ANY job is waiting,
> etc.? Some of the jobs may be done and some partially done and some not even
> started.

The 0% applies only to the set of jobs that are executing in parallel. For the sets of jobs that finished execution in previous rounds of parallel execution, the contribution to the total estimate is 1/#rounds per round of execution, i.e. per JobControl object (so #rounds is the length of the critical path along the operator plan tree).

> This code shouldn't be in OperatorPlan. We want to keep that as clean as
> possible. Instead you should build a new Walker type that can do this
> calculation.

Ah, OK; will do that.

> You have tabs here and some other spots. Please make sure you use 4 spaces
> rather than tabs.

I need to change my editor's auto-indentation settings :)

> Why is a separate method needed here? When users turn on the new progress
> indicator I assume they don't get the old one too. Given that the interfaces
> are the same it seems one method should suffice here.

Initially, I put in a separate method on the assumption that users could have listeners for either of them. For example, we could use these separate listeners for the performance comparison between the old and new methods. Later on, however, when I added the command-line option to choose, I made the old and new methods an either-or choice. Perhaps I should make it possible to have both indicators turned on at the same time?

> It looks like this comment got attached to the run method. Also, the method
> has only one parameter, but two are listed in the comment.

I will fix this.

> Pig's progress estimation should account for parallel job executions
> --------------------------------------------------------------------
>
>                 Key: PIG-1883
>                 URL: https://issues.apache.org/jira/browse/PIG-1883
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Laukik Chitnis
>            Assignee: Laukik Chitnis
>         Attachments: PIG-1883-2.patch
>
>
> Currently, Pig's progress estimation is based on the percentage of jobs
> completed out of the total number of MR jobs. However, since the MR operators
> are arranged in a DAG (and hence more than 1 job might be submitted for
> execution in parallel), the progress estimation can be improved by
> considering the number of jobs in the critical path, instead of just the
> total number of jobs.
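The two ideas discussed in this PIG-1883 comment, the round-based estimate and the RMS-from-linear metric, can be sketched in a few lines of Java. This is not the code in PIG-1883-2.patch: the class and method names are hypothetical, the "old" estimate is assumed to be the mean progress over all jobs, and the running round is assumed to be only as far along as its slowest job (the thread does not spell out either detail).

```java
import java.util.Arrays;
import java.util.List;

final class ProgressSketch {

    // Assumed form of the job-count estimate: mean progress over all jobs.
    static double jobCountEstimate(List<Double> jobProgress) {
        if (jobProgress.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (double p : jobProgress) {
            sum += p;
        }
        return sum / jobProgress.size();
    }

    // Each finished round contributes 1/totalRounds. The running round contributes
    // the progress of its slowest job (assumption), scaled by 1/totalRounds.
    static double roundBasedEstimate(int finishedRounds, int totalRounds, List<Double> runningRound) {
        if (totalRounds <= 0) {
            throw new IllegalArgumentException("totalRounds must be positive");
        }
        if (finishedRounds >= totalRounds) {
            return 1.0;
        }
        double slowest = runningRound.isEmpty() ? 0.0 : 1.0;
        for (double p : runningRound) {
            slowest = Math.min(slowest, p);
        }
        return (finishedRounds + slowest) / totalRounds;
    }

    // RMS deviation of reported progress from the ideal line that runs from 0.0 at
    // times[0] to 1.0 at times[times.length - 1].
    static double rmsFromLinear(double[] times, double[] reported) {
        if (times.length != reported.length || times.length < 2) {
            throw new IllegalArgumentException("need at least two matching samples");
        }
        double start = times[0];
        double span = times[times.length - 1] - start;
        if (span <= 0) {
            throw new IllegalArgumentException("finish time must be after start time");
        }
        double sumSq = 0.0;
        for (int i = 0; i < times.length; i++) {
            double ideal = (times[i] - start) / span;
            double diff = reported[i] - ideal;
            sumSq += diff * diff;
        }
        return Math.sqrt(sumSq / times.length);
    }

    public static void main(String[] args) {
        // One round with a tiny job that is already done and a large job at 10%.
        List<Double> parallel = Arrays.asList(1.0, 0.1);
        System.out.println("job-count estimate:   " + jobCountEstimate(parallel));
        System.out.println("round-based estimate: " + roundBasedEstimate(0, 1, parallel));
    }
}
```

Under these assumptions, the small-plus-large parallel case from the comment gives a job-count estimate above 50% while the round-based estimate stays near the large job's progress. Feeding both series to `rmsFromLinear` would give a single number per method for the comparison Laukik describes.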
[jira] [Resolved] (PIG-1985) Utils.getSchemaFromString does not use the new parser, and thus fails to parse valid schema
    [ https://issues.apache.org/jira/browse/PIG-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1985.
-----------------------------

    Resolution: Fixed

This was done as part of PIG-1775, so there is no need for this patch. Closing the ticket.

> Utils.getSchemaFromString does not use the new parser, and thus fails to
> parse valid schema
> ------------------------------------------------------------------------
>
>                 Key: PIG-1985
>                 URL: https://issues.apache.org/jira/browse/PIG-1985
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Woody Anderson
>            Assignee: Daniel Dai
>             Fix For: 0.9.0, 0.10
>
>         Attachments: PIG-1985-1.patch
>
>
> I've been told this is because Utils.getSchemaFromString does not use the new
> parser to parse the schema, so we should update the impl to use the new
> parser:
> {code}
> Utils.getSchemaFromString("f: map[]")
> {code}
> results in: (org.apache.pig.impl.logicalLayer.schema.Schema) {f: map[]}
> {code}
> Utils.getSchemaFromString("f: map[int]")
> {code}
> results in: An exception occurred:
> org.apache.pig.impl.logicalLayer.parser.ParseException
> ..
> org.apache.pig.impl.logicalLayer.parser.ParseException: Encountered " "map"
> "map "" at line 1, column 4.
> Was expecting one of:
>     "int" ...
>     "long" ...
>     "float" ...
>     "double" ...
>     "chararray" ...
>     "bytearray" ...
>     "int" ...
>     "long" ...
>     "float" ...
>     "double" ...
>     "chararray" ...
>     "bytearray" ...
[jira] [Updated] (PIG-2043) Ship antlr-runtime.jar to backend
    [ https://issues.apache.org/jira/browse/PIG-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2043:
----------------------------

    Attachment: PIG-2043-1.patch

Tested manually using PigStorageSchema, which makes use of getSchemaFromString.

> Ship antlr-runtime.jar to backend
> ---------------------------------
>
>                 Key: PIG-2043
>                 URL: https://issues.apache.org/jira/browse/PIG-2043
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 0.9.0
>
>         Attachments: PIG-2043-1.patch
>
>
> Following the discussion in PIG-2040, we want to make getSchemaFromString
> work in the backend, so we need to ship antlr-runtime.jar.
[jira] [Updated] (PIG-1775) Removal of old logical plan
    [ https://issues.apache.org/jira/browse/PIG-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1775:
----------------------------

    Attachment: piggybank.patch

There was also a piggybank compilation error. Committed piggybank.patch.

> Removal of old logical plan
> ---------------------------
>
>                 Key: PIG-1775
>                 URL: https://issues.apache.org/jira/browse/PIG-1775
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.9.0
>            Reporter: Yan Zhou
>            Assignee: Xuefu Zhang
>             Fix For: 0.9.0
>
>         Attachments: PIG-1775-2.patch, PIG-1775-3.patch, PIG-1775-4.patch,
> PIG-1775.patch, piggybank.patch
>
>
> The new logical plan will only be used and the old logical plan will be
> removed once the new one is stable enough. It is scheduled for the 0.9
> release.
[jira] [Commented] (PIG-2030) Merged join/cogroup does not automatically ship loader
[ https://issues.apache.org/jira/browse/PIG-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029647#comment-13029647 ] Daniel Dai commented on PIG-2030: - Test-patch: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Unit-test: all pass > Merged join/cogroup does not automatically ship loader > -- > > Key: PIG-2030 > URL: https://issues.apache.org/jira/browse/PIG-2030 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2030-1.patch, PIG-2030-2.patch > > > The following script fail due to TableLoader class not found (If the jar is > in classpath): > {code} > a = load '/user/pig/tests/data/zebra/singlefile/studentsortedtab10k' using > org.apache.hadoop.zebra.pig.TableLoader('', 'sorted'); > b = load '/user/pig/tests/data/zebra/singlefile/votersortedtab10k' using > org.apache.hadoop.zebra.pig.TableLoader('', 'sorted'); > g = cogroup a by $0, b by $0 using 'merge'; > store g into '/user/pig/out/jianyong.1304374720/ZebraMapCogrp_1.out'; > {code} > If we use register, the error goes away. However, Pig always ship jars > containing LoadFunc automatically. It should be the same for merged > cogroup/join. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1994) e2e test harness deployment implementation for existing cluster
[ https://issues.apache.org/jira/browse/PIG-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029637#comment-13029637 ] Daniel Dai commented on PIG-1994: - +1 > e2e test harness deployment implementation for existing cluster > --- > > Key: PIG-1994 > URL: https://issues.apache.org/jira/browse/PIG-1994 > Project: Pig > Issue Type: Sub-task > Components: tools >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 0.10 > > Attachments: PIG-1994.patch > > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1949) e2e test harness should use bin/pig rather than calling java directly
[ https://issues.apache.org/jira/browse/PIG-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029632#comment-13029632 ] Daniel Dai commented on PIG-1949: - +1 > e2e test harness should use bin/pig rather than calling java directly > - > > Key: PIG-1949 > URL: https://issues.apache.org/jira/browse/PIG-1949 > Project: Pig > Issue Type: Bug > Components: tools >Reporter: Alan Gates >Assignee: Alan Gates >Priority: Minor > Fix For: 0.10 > > Attachments: PIG-1949.patch > > > Currently TestDriverPig.pm uses java directly to invoke Pig. It should use > the bash shell script bin/pig instead. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2030) Merged join/cogroup does not automatically ship loader
[ https://issues.apache.org/jira/browse/PIG-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2030: Attachment: PIG-2030-2.patch PIG-2030-2.patch rebase with current trunk. > Merged join/cogroup does not automatically ship loader > -- > > Key: PIG-2030 > URL: https://issues.apache.org/jira/browse/PIG-2030 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2030-1.patch, PIG-2030-2.patch > > > The following script fail due to TableLoader class not found (If the jar is > in classpath): > {code} > a = load '/user/pig/tests/data/zebra/singlefile/studentsortedtab10k' using > org.apache.hadoop.zebra.pig.TableLoader('', 'sorted'); > b = load '/user/pig/tests/data/zebra/singlefile/votersortedtab10k' using > org.apache.hadoop.zebra.pig.TableLoader('', 'sorted'); > g = cogroup a by $0, b by $0 using 'merge'; > store g into '/user/pig/out/jianyong.1304374720/ZebraMapCogrp_1.out'; > {code} > If we use register, the error goes away. However, Pig always ship jars > containing LoadFunc automatically. It should be the same for merged > cogroup/join. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2033) Pig returns sucess for the failed Pig script
    [ https://issues.apache.org/jira/browse/PIG-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Richard Ding updated PIG-2033:
------------------------------

    Attachment: PIG-2033.patch

This patch makes sure that Pig returns success if and only if the number of successful jobs equals the number of compiled jobs.

The patch doesn't include a unit test since it's difficult to simulate the failure case.

> Pig returns sucess for the failed Pig script
> --------------------------------------------
>
>                 Key: PIG-2033
>                 URL: https://issues.apache.org/jira/browse/PIG-2033
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.1, 0.9.0
>
>         Attachments: PIG-2033.patch
>
>
> Pig returns success when a Pig script fails but the count of failed MR jobs
> is zero.
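The difference between the two success checks described in PIG-2033 can be shown with a small, self-contained Java sketch. The class and method names are hypothetical and the counters are plain integers; this is not the code in PIG-2033.patch, only an illustration of why "no failed jobs" is a weaker condition than "every compiled job succeeded".

```java
final class ReturnCodeSketch {

    // Weaker check described in the issue: success whenever no job is counted as failed.
    static boolean successByFailedCount(int failedJobs) {
        return failedJobs == 0;
    }

    // Stricter check described in the comment: every compiled job must have succeeded.
    static boolean successByCompiledCount(int compiledJobs, int succeededJobs) {
        return succeededJobs == compiledJobs;
    }

    public static void main(String[] args) {
        // Assumed scenario: three jobs compiled, two succeeded, and the third
        // never ran, so it is not counted as failed either.
        int compiled = 3;
        int succeeded = 2;
        int failed = 0;

        System.out.println("failed-count check reports success:   " + successByFailedCount(failed));
        System.out.println("compiled-count check reports success: " + successByCompiledCount(compiled, succeeded));
    }
}
```

In the assumed scenario the first check reports success and the second reports failure, which matches the symptom in the issue summary.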
[jira] [Created] (PIG-2044) Patten match bug in org.apache.pig.newplan.optimizer.Rule
Patten match bug in org.apache.pig.newplan.optimizer.Rule
---------------------------------------------------------

                 Key: PIG-2044
                 URL: https://issues.apache.org/jira/browse/PIG-2044
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.9.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.9.0

Koji found a bug in org.apache.pig.newplan.optimizer.Rule: the "break" at line 179 seems to be wrong. This multiple-branch matching is not currently used in Pig, but it could become a problem in the future.
[jira] [Commented] (PIG-2027) NPE if Pig don't have permission for log file
    [ https://issues.apache.org/jira/browse/PIG-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029577#comment-13029577 ]

Daniel Dai commented on PIG-2027:
---------------------------------

     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

Unit-test: all pass

Manual test: tested with --l pointing to a file that has no write permission

> NPE if Pig don't have permission for log file
> ---------------------------------------------
>
>                 Key: PIG-2027
>                 URL: https://issues.apache.org/jira/browse/PIG-2027
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>            Priority: Trivial
>             Fix For: 0.10
>
>         Attachments: PIG-2027-1.patch
>
>
> If a log file is specified but Pig does not have write permission for it,
> any failure in the Pig script produces an NPE in addition to the script
> failure:
> 2011-05-02 13:18:36,493 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> java.lang.NullPointerException
>     at org.apache.pig.impl.util.LogUtils.writeLog(LogUtils.java:172)
>     at org.apache.pig.impl.util.LogUtils.writeLog(LogUtils.java:79)
>     at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:131)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:180)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
>     at org.apache.pig.Main.run(Main.java:554)
>     at org.apache.pig.Main.main(Main.java:109)
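A general way to avoid this class of failure is to treat an unusable log file as a recoverable condition and fall back to another stream. The sketch below is a generic illustration under that assumption; `LogWriteSketch` and `writeLog` are hypothetical names, and this is neither the code in Pig's LogUtils nor the change in PIG-2027-1.patch.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

final class LogWriteSketch {

    // Appends the message to logFile when possible. If logFile is null or cannot be
    // opened for writing, the message goes to the fallback stream instead.
    // Returns true only when the message was written to the file.
    static boolean writeLog(String message, File logFile, PrintStream fallback) {
        if (logFile != null) {
            // FileOutputStream throws FileNotFoundException (an IOException) when the
            // file cannot be created or opened, e.g. because of missing permissions.
            try (PrintStream out = new PrintStream(new FileOutputStream(logFile, true))) {
                out.println(message);
                return true;
            } catch (IOException e) {
                fallback.println("Cannot write to log file " + logFile + ": " + e.getMessage());
            }
        }
        fallback.println(message);
        return false;
    }

    public static void main(String[] args) {
        // A path inside a directory that is assumed not to exist, to trigger the fallback.
        File unwritable = new File("/nonexistent-dir-for-sketch/pig.log");
        boolean wroteToFile = writeLog("script failed", unwritable, System.err);
        System.out.println("written to file: " + wroteToFile);
    }
}
```

The point of the design is that the original error message still reaches the user even when the log file is unusable, instead of being replaced by a secondary exception.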
[jira] [Updated] (PIG-2026) e2e tests in eclipse classpath
[ https://issues.apache.org/jira/browse/PIG-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-2026: -- Resolution: Fixed Fix Version/s: 0.10 Status: Resolved (was: Patch Available) Patch committed. Thanks, Gianmarco! > e2e tests in eclipse classpath > -- > > Key: PIG-2026 > URL: https://issues.apache.org/jira/browse/PIG-2026 > Project: Pig > Issue Type: Bug >Reporter: Gianmarco De Francisci Morales >Assignee: Gianmarco De Francisci Morales >Priority: Trivial > Fix For: 0.10 > > Attachments: PIG-2026.patch, PIG-2026.patch > > > e2e tests under test/e2e/pig/udfs/java should have their own entry as a > source dir in eclipse .classpath -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-2041) Minicluster should make each run independent
[ https://issues.apache.org/jira/browse/PIG-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved PIG-2041. - Resolution: Fixed Hadoop Flags: [Reviewed] Patch committed to both trunk and 0.9 branch. > Minicluster should make each run independent > > > Key: PIG-2041 > URL: https://issues.apache.org/jira/browse/PIG-2041 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.8.0, 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2041-1.patch > > > Minicluster will reuse ~/pigtest/conf/hadoop-site.xml. If something wrong in > hadoop-site.xml, next test will also be affected. This leads to some > mysterious test failures. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-2040) Move classloader from QueryParserDriver to PigContext
[ https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai resolved PIG-2040. - Resolution: Fixed Hadoop Flags: [Reviewed] Patch committed to both trunk and 0.9 branch. > Move classloader from QueryParserDriver to PigContext > - > > Key: PIG-2040 > URL: https://issues.apache.org/jira/browse/PIG-2040 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2040-1.patch, PIG-2040-2.patch > > > After PIG-1775, mapreduce mode fail. The reason is we move classloader from > LogicalPlanBuilder to QueryParserDriver, which will need antlr.jar, however, > we don't ship antlr.jar to backend. It is better to move classloader to > PigContext. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2040) Move classloader from QueryParserDriver to PigContext
[ https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2040: Attachment: PIG-2040-2.patch > Move classloader from QueryParserDriver to PigContext > - > > Key: PIG-2040 > URL: https://issues.apache.org/jira/browse/PIG-2040 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2040-1.patch, PIG-2040-2.patch > > > After PIG-1775, mapreduce mode fail. The reason is we move classloader from > LogicalPlanBuilder to QueryParserDriver, which will need antlr.jar, however, > we don't ship antlr.jar to backend. It is better to move classloader to > PigContext. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2040) Move classloader from QueryParserDriver to PigContext
[ https://issues.apache.org/jira/browse/PIG-2040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029539#comment-13029539 ] Daniel Dai commented on PIG-2040: - Test-patch: [exec] -1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] -1 tests included. The patch doesn't appear to include any new or modified tests. [exec] Please justify why no tests are needed for this patch. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. No new unit test added. This only happens when test in real cluster. Unit-test: all pass > Move classloader from QueryParserDriver to PigContext > - > > Key: PIG-2040 > URL: https://issues.apache.org/jira/browse/PIG-2040 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2040-1.patch > > > After PIG-1775, mapreduce mode fail. The reason is we move classloader from > LogicalPlanBuilder to QueryParserDriver, which will need antlr.jar, however, > we don't ship antlr.jar to backend. It is better to move classloader to > PigContext. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-1775) Removal of old logical plan
[ https://issues.apache.org/jira/browse/PIG-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated PIG-1775: - Attachment: PIG-1775-4.patch Remove obsolete code related to old logical plan. Some test cases are changed as a result. > Removal of old logical plan > --- > > Key: PIG-1775 > URL: https://issues.apache.org/jira/browse/PIG-1775 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.9.0 >Reporter: Yan Zhou >Assignee: Xuefu Zhang > Fix For: 0.9.0 > > Attachments: PIG-1775-2.patch, PIG-1775-3.patch, PIG-1775-4.patch, > PIG-1775.patch > > > The new logical plan will only be used and the old logical plan will be > removed once the new one is stable enough. It is scheduled for the 0.9 > release. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1775) Removal of old logical plan
[ https://issues.apache.org/jira/browse/PIG-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029526#comment-13029526 ] Xuefu Zhang commented on PIG-1775: -- Test-patch run for patch PIG-1775-3.patch: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 45 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Unit test also passed. Patch is committed to both trunk and 0.9.0. > Removal of old logical plan > --- > > Key: PIG-1775 > URL: https://issues.apache.org/jira/browse/PIG-1775 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.9.0 >Reporter: Yan Zhou >Assignee: Xuefu Zhang > Fix For: 0.9.0 > > Attachments: PIG-1775-2.patch, PIG-1775-3.patch, PIG-1775.patch > > > The new logical plan will only be used and the old logical plan will be > removed once the new one is stable enough. It is scheduled for the 0.9 > release. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1775) Removal of old logical plan
[ https://issues.apache.org/jira/browse/PIG-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029506#comment-13029506 ] Thejas M Nair commented on PIG-1775: +1 > Removal of old logical plan > --- > > Key: PIG-1775 > URL: https://issues.apache.org/jira/browse/PIG-1775 > Project: Pig > Issue Type: Improvement >Affects Versions: 0.9.0 >Reporter: Yan Zhou >Assignee: Xuefu Zhang > Fix For: 0.9.0 > > Attachments: PIG-1775-2.patch, PIG-1775-3.patch, PIG-1775.patch > > > The new logical plan will only be used and the old logical plan will be > removed once the new one is stable enough. It is scheduled for the 0.9 > release. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2026) e2e tests in eclipse classpath
[ https://issues.apache.org/jira/browse/PIG-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gianmarco De Francisci Morales updated PIG-2026: Attachment: PIG-2026.patch Yes, I made a mistake generating the patch, sorry. Here is the correct one. > e2e tests in eclipse classpath > -- > > Key: PIG-2026 > URL: https://issues.apache.org/jira/browse/PIG-2026 > Project: Pig > Issue Type: Bug >Reporter: Gianmarco De Francisci Morales >Assignee: Gianmarco De Francisci Morales >Priority: Trivial > Attachments: PIG-2026.patch, PIG-2026.patch > > > e2e tests under test/e2e/pig/udfs/java should have their own entry as a > source dir in eclipse .classpath -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2026) e2e tests in eclipse classpath
[ https://issues.apache.org/jira/browse/PIG-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029494#comment-13029494 ] Ashutosh Chauhan commented on PIG-2026: --- Looks like this patch contains PIG-2025 and thus conflicting. > e2e tests in eclipse classpath > -- > > Key: PIG-2026 > URL: https://issues.apache.org/jira/browse/PIG-2026 > Project: Pig > Issue Type: Bug >Reporter: Gianmarco De Francisci Morales >Assignee: Gianmarco De Francisci Morales >Priority: Trivial > Attachments: PIG-2026.patch > > > e2e tests under test/e2e/pig/udfs/java should have their own entry as a > source dir in eclipse .classpath -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2024) Incorrect jar paths in .classpath template for eclipse
[ https://issues.apache.org/jira/browse/PIG-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated PIG-2024: -- Resolution: Fixed Fix Version/s: 0.10 Status: Resolved (was: Patch Available) Great. I will take a look at that one. Patch committed. Thanks, Gianmarco! > Incorrect jar paths in .classpath template for eclipse > -- > > Key: PIG-2024 > URL: https://issues.apache.org/jira/browse/PIG-2024 > Project: Pig > Issue Type: Bug >Reporter: Gianmarco De Francisci Morales >Assignee: Gianmarco De Francisci Morales >Priority: Minor > Fix For: 0.10 > > Attachments: PIG-2024.patch > > > The jars listed in .eclipse.templates/.classpath are outdated. > Importing the project in eclipse after using ant eclipse-files generates > build path errors. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2041) Minicluster should make each run independent
[ https://issues.apache.org/jira/browse/PIG-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029488#comment-13029488 ] Richard Ding commented on PIG-2041: --- +1 > Minicluster should make each run independent > > > Key: PIG-2041 > URL: https://issues.apache.org/jira/browse/PIG-2041 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.8.0, 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2041-1.patch > > > Minicluster will reuse ~/pigtest/conf/hadoop-site.xml. If something wrong in > hadoop-site.xml, next test will also be affected. This leads to some > mysterious test failures. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2024) Incorrect jar paths in .classpath template for eclipse
[ https://issues.apache.org/jira/browse/PIG-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029484#comment-13029484 ] Gianmarco De Francisci Morales commented on PIG-2024: - Hi Ashutosh, I noticed the issue with e2e, and I had already opened a different jira for that, PIG-2026. The patch there does exactly what you suggest. > Incorrect jar paths in .classpath template for eclipse > -- > > Key: PIG-2024 > URL: https://issues.apache.org/jira/browse/PIG-2024 > Project: Pig > Issue Type: Bug >Reporter: Gianmarco De Francisci Morales >Assignee: Gianmarco De Francisci Morales >Priority: Minor > Attachments: PIG-2024.patch > > > The jars listed in .eclipse.templates/.classpath are outdated. > Importing the project in eclipse after using ant eclipse-files generates > build path errors. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2024) Incorrect jar paths in .classpath template for eclipse
[ https://issues.apache.org/jira/browse/PIG-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029481#comment-13029481 ] Ashutosh Chauhan commented on PIG-2024: --- Patch looks good. To make Pig more eclipse friendly, we need to tweak .classpath further to accommodate the recently introduced e2e test folder. Something to the effect of the following: {code} - + + {code} Gianmarco, can you roll this into your patch? > Incorrect jar paths in .classpath template for eclipse > -- > > Key: PIG-2024 > URL: https://issues.apache.org/jira/browse/PIG-2024 > Project: Pig > Issue Type: Bug >Reporter: Gianmarco De Francisci Morales >Assignee: Gianmarco De Francisci Morales >Priority: Minor > Attachments: PIG-2024.patch > > > The jars listed in .eclipse.templates/.classpath are outdated. > Importing the project in eclipse after using ant eclipse-files generates > build path errors.
[jira] [Created] (PIG-2043) Ship antlr-runtime.jar to backend
Ship antlr-runtime.jar to backend - Key: PIG-2043 URL: https://issues.apache.org/jira/browse/PIG-2043 Project: Pig Issue Type: Bug Components: impl Affects Versions: 0.9.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.9.0 Following the discussion in PIG-2040, we want to make getSchemaFromString work in the backend, so we need to ship antlr-runtime.jar.
[jira] [Commented] (PIG-2041) Minicluster should make each run independent
[ https://issues.apache.org/jira/browse/PIG-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13029469#comment-13029469 ] Daniel Dai commented on PIG-2041: - Test-patch: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. Unit test: all pass > Minicluster should make each run independent > > > Key: PIG-2041 > URL: https://issues.apache.org/jira/browse/PIG-2041 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.8.0, 0.9.0 >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 0.9.0 > > Attachments: PIG-2041-1.patch > > > Minicluster will reuse ~/pigtest/conf/hadoop-site.xml. If something is wrong in > hadoop-site.xml, the next test will also be affected. This leads to some > mysterious test failures.
[jira] [Updated] (PIG-2016) -dot option does not work with explain and new logical plan
[ https://issues.apache.org/jira/browse/PIG-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated PIG-2016: Attachment: dot-test.patch dot-test.patch changes the test case because we observe nondeterminism in the old test case, due to node ordering, which is out of our control. > -dot option does not work with explain and new logical plan > --- > > Key: PIG-2016 > URL: https://issues.apache.org/jira/browse/PIG-2016 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.8.0, 0.8.1, 0.9.0 >Reporter: Alan Gates >Assignee: Daniel Dai >Priority: Minor > Fix For: 0.9.0 > > Attachments: PIG-2016-1.patch, PIG-2016-2.patch, dot-test.patch > > > If you specify -dot in explain, it is supposed to produce a file with the > graphs in .dot format. While the physical plan and map reduce plan are > correctly output in .dot format, the new logical plan is still output in text > format.
[jira] [Resolved] (PIG-1999) Macro alias masker should consider schema context
[ https://issues.apache.org/jira/browse/PIG-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Ding resolved PIG-1999. --- Resolution: Fixed Hadoop Flags: [Reviewed] Unit tests pass. Patch committed to trunk and 0.9 branch. > Macro alias masker should consider schema context > -- > > Key: PIG-1999 > URL: https://issues.apache.org/jira/browse/PIG-1999 > Project: Pig > Issue Type: Bug > Components: impl >Affects Versions: 0.9.0 >Reporter: Richard Ding >Assignee: Richard Ding > Fix For: 0.9.0 > > Attachments: PIG-1999_1.patch, PIG-1999_2.patch > > > Macro alias masker doesn't consider the current schema context. This results > in errors when deciding which alias to mask. Here is an example: > {code} > define toBytearray(in, intermediate) returns e { >a = load '$in' as (name:chararray, age:long, gpa: float); >b = group a by name; >c = foreach b generate a, (1,2,3); >store c into '$intermediate' using BinStorage(); >d = load '$intermediate' using BinStorage() as (b:bag{t:tuple(x,y,z)}, > t2:tuple(a,b,c)); >$e = foreach d generate COUNT(b), t2.a, t2.b, t2.c; > }; > > f = toBytearray ('data', 'output1'); > {code} > Now the alias masker mistakes b in COUNT(b) for an alias rather than the field b in the > current schema. > The workaround is to not use aliases as names in the schema definition.
[jira] [Updated] (PIG-1821) UDFContext.getUDFProperties does not handle collisions in hashcode of udf classname (+ arg hashcodes)
[ https://issues.apache.org/jira/browse/PIG-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated PIG-1821: --- Resolution: Fixed Status: Resolved (was: Patch Available) Patch committed to 0.9 branch and trunk. > UDFContext.getUDFProperties does not handle collisions in hashcode of udf > classname (+ arg hashcodes) > - > > Key: PIG-1821 > URL: https://issues.apache.org/jira/browse/PIG-1821 > Project: Pig > Issue Type: Bug >Affects Versions: 0.8.0 >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Fix For: 0.9.0 > > Attachments: PIG-1821.1.patch, PIG-1821.2.patch > > > In the code below, if generateKey() returns the same value for two udfs, the udfs > would end up sharing the properties object. > {code} > private HashMap<Integer, Properties> udfConfs = new HashMap<Integer, Properties>(); > public Properties getUDFProperties(Class c) { > Integer k = generateKey(c); > Properties p = udfConfs.get(k); > if (p == null) { > p = new Properties(); > udfConfs.put(k, p); > } > return p; > } > private int generateKey(Class c) { > return c.getName().hashCode(); > } > public Properties getUDFProperties(Class c, String[] args) { > Integer k = generateKey(c, args); > Properties p = udfConfs.get(k); > if (p == null) { > p = new Properties(); > udfConfs.put(k, p); > } > return p; > } > private int generateKey(Class c, String[] args) { > int hc = c.getName().hashCode(); > for (int i = 0; i < args.length; i++) { > hc <<= 1; > hc ^= args[i].hashCode(); > } > return hc; > } > {code} > To prevent this, a new class (say X) that can hold the classname and args > should be created, and instead of HashMap<Integer, Properties>, HashMap<X, > Properties> should be used. Then HashMap will deal with the collisions.
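The fix proposed in PIG-1821 (key the map on an object holding the class name and args, rather than on a bare hash code) could look roughly like the sketch below. This is not the committed patch: `UdfKey` and `UdfPropertiesRegistry` are hypothetical names standing in for the "class X" from the description and for the relevant part of UDFContext, and treating a missing args array as an empty one is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical "class X": holds the UDF class name and its constructor args. */
final class UdfKey {
    private final String className;
    private final String[] args;

    UdfKey(Class<?> c, String[] args) {
        this.className = c.getName();
        // Assumption: no args and an empty args array identify the same UDF.
        this.args = (args == null) ? new String[0] : args.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof UdfKey)) return false;
        UdfKey other = (UdfKey) o;
        // Compare the actual contents, not just their hash codes.
        return className.equals(other.className) && Arrays.equals(args, other.args);
    }

    @Override
    public int hashCode() {
        // Collisions are still possible here; HashMap resolves them via equals().
        return 31 * className.hashCode() + Arrays.hashCode(args);
    }
}

/** Hypothetical stand-in for the properties lookup described in the issue. */
final class UdfPropertiesRegistry {
    private final Map<UdfKey, Properties> udfConfs = new HashMap<UdfKey, Properties>();

    Properties getUDFProperties(Class<?> c, String[] args) {
        UdfKey k = new UdfKey(c, args);
        Properties p = udfConfs.get(k);
        if (p == null) {
            p = new Properties();
            udfConfs.put(k, p);
        }
        return p;
    }

    Properties getUDFProperties(Class<?> c) {
        return getUDFProperties(c, null);
    }
}
```

With this shape, two UDFs whose args merely hash alike (for example the strings "Aa" and "BB", which share a `String.hashCode()`) land in the same bucket but still receive separate `Properties` objects, because `HashMap` falls back to `equals()` to tell the keys apart.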
[jira] [Created] (PIG-2042) Our NOTICE.txt file needs to add Antlr
Our NOTICE.txt file needs to add Antlr -- Key: PIG-2042 URL: https://issues.apache.org/jira/browse/PIG-2042 Project: Pig Issue Type: Bug Components: build Affects Versions: 0.9.0 Reporter: Alan Gates Priority: Blocker Fix For: 0.9.0 We should also check if there are other libraries we are picking up that we have failed to put in the NOTICE file.
[jira] [Assigned] (PIG-2036) Set header delimiter in PigStorageSchema
[ https://issues.apache.org/jira/browse/PIG-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy reassigned PIG-2036: -- Assignee: Mads Moeller > Set header delimiter in PigStorageSchema > > > Key: PIG-2036 > URL: https://issues.apache.org/jira/browse/PIG-2036 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: 0.7.0, 0.8.0 >Reporter: Mads Moeller >Assignee: Mads Moeller >Priority: Minor > Attachments: PIG-2036.1.patch, PIG-2036.patch > > > Piggybank's PigStorageSchema currently defaults the delimiter to a tab in the > generated header file (.pig_header). > The attached patch sets the header delimiter to what is passed in via the > constructor. Otherwise it'll default to tab '\t'.
[jira] [Updated] (PIG-2036) Set header delimiter in PigStorageSchema
[ https://issues.apache.org/jira/browse/PIG-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2036: --- Status: Patch Available (was: Open) > Set header delimiter in PigStorageSchema > > > Key: PIG-2036 > URL: https://issues.apache.org/jira/browse/PIG-2036 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: 0.8.0, 0.7.0 >Reporter: Mads Moeller >Priority: Minor > Attachments: PIG-2036.1.patch, PIG-2036.patch > > > Piggybank's PigStorageSchema currently defaults the delimiter to a tab in the > generated header file (.pig_header). > The attached patch sets the header delimiter to what is passed in via the > constructor. Otherwise it'll default to tab '\t'.
[jira] [Updated] (PIG-2036) Set header delimiter in PigStorageSchema
[ https://issues.apache.org/jira/browse/PIG-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitriy V. Ryaboy updated PIG-2036: --- Attachment: PIG-2036.1.patch Thanks Mads! I took the opportunity to speed up the tests since they were getting pretty slow, and move some of your changes around (just a bit). The new test time is 1 min 47 sec, down from about 5 minutes. > Set header delimiter in PigStorageSchema > > > Key: PIG-2036 > URL: https://issues.apache.org/jira/browse/PIG-2036 > Project: Pig > Issue Type: Improvement > Components: impl >Affects Versions: 0.7.0, 0.8.0 >Reporter: Mads Moeller >Priority: Minor > Attachments: PIG-2036.1.patch, PIG-2036.patch > > > Piggybank's PigStorageSchema currently defaults the delimiter to a tab in the > generated header file (.pig_header). > The attached patch sets the header delimiter to what is passed in via the > constructor. Otherwise it'll default to tab '\t'.