[jira] Commented: (PIG-1356) [zebra] TableLoader makes unnecessary calls to build a Job instance that create a new JobClient in the hadoop 0.20.9
    [ https://issues.apache.org/jira/browse/PIG-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855316#action_12855316 ]

Hadoop QA commented on PIG-1356:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12441228/PIG-1356.patch
  against trunk revision 932161.

    +1 @author. The patch does not contain any @author tags.
    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no tests are needed for this patch.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/291/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/291/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/291/console

This message is automatically generated.

[zebra] TableLoader makes unnecessary calls to build a Job instance that create a new JobClient in the hadoop 0.20.9

    Key: PIG-1356
    URL: https://issues.apache.org/jira/browse/PIG-1356
    Project: Pig
    Issue Type: Bug
    Affects Versions: 0.7.0
    Reporter: Yan Zhou
    Fix For: 0.7.0
    Attachments: PIG-1356.patch, PIG-1356.patch

This extra JobClient is actually a bug in Hadoop 0.20.9, but Zebra could have avoided the problem by not creating the unnecessary instance of Job.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (PIG-506) Does pig need a NATIVE keyword?
    [ https://issues.apache.org/jira/browse/PIG-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855354#action_12855354 ]

ashitosh commented on PIG-506:
--

I have published my proposal on the GSoC application site:
http://socghop.appspot.com/gsoc/student_proposal/private/google/gsoc2010/ashitosh/t127081039065

Any feedback is more than welcome.

Does pig need a NATIVE keyword?
---

    Key: PIG-506
    URL: https://issues.apache.org/jira/browse/PIG-506
    Project: Pig
    Issue Type: New Feature
    Components: impl
    Reporter: Alan Gates
    Assignee: Alan Gates
    Priority: Minor

Assume a user had a job that broke easily into three pieces. Further assume that pieces one and three were easily expressible in pig, but that piece two needed to be written in map reduce for whatever reason (performance, something that pig could not easily express, legacy job that was too important to change, etc.). Today the user would either have to use map reduce for the entire job or manually handle the stitching together of pig and map reduce jobs.

What if instead pig provided a NATIVE keyword that would allow the script to pass off the data stream to the underlying system (in this case map reduce)? The semantics of NATIVE would vary by underlying system. In the map reduce case, we would assume that this indicated a collection of one or more fully contained map reduce jobs, so that pig would store the data, invoke the map reduce jobs, and then read the resulting data to continue. It might look something like this:

{code}
A = load 'myfile';
X = load 'myotherfile';
B = group A by $0;
C = foreach B generate group, myudf(B);
D = native (jar=mymr.jar, infile=frompig outfile=topig);
E = join D by $0, X by $0;
...
{code}

This differs from streaming in that it allows the user to insert an arbitrary amount of native processing, whereas streaming allows the insertion of one binary. It also differs in that, for streaming, data is piped directly into and out of the binary as part of the pig pipeline. Here the pipeline would be broken, data written to disk, and the native block invoked, then data read back from disk.

Another alternative is to say this is unnecessary because the user can do the coordination from java, using the PigServer interface to run pig and calling the map reduce job explicitly.

The advantages of the native keyword are that the user need not worry about coordination between the jobs; pig will take care of it. Also, the user can make use of existing java applications without being a java programmer.
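The store, invoke, read-back sequence that the PIG-506 description attributes to NATIVE can be sketched in plain Java. This is only an illustration of the coordination the keyword would hide from the user: it uses no Pig or Hadoop API, the class name `NativeHandoffSketch` and method `runNative` are hypothetical, and the "native job" is an ordinary function standing in for one or more map reduce jobs. The file names `frompig` and `topig` are taken from the example script in the description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Hypothetical stand-in for the hand-off NATIVE would perform; not Pig code.
public class NativeHandoffSketch {

    // Breaks the pipeline at the native step: store, invoke, then read back.
    public static List<String> runNative(List<String> upstream,
                                         UnaryOperator<List<String>> nativeJob) {
        try {
            Path workDir = Files.createTempDirectory("native-handoff");
            Path fromPig = workDir.resolve("frompig");
            Path toPig = workDir.resolve("topig");

            // 1. Output of the first pipeline half is written to disk.
            Files.write(fromPig, upstream);

            // 2. The native block runs on its own: it reads its input file
            //    and writes its output file.
            Files.write(toPig, nativeJob.apply(Files.readAllLines(fromPig)));

            // 3. The result is read back so the rest of the script can continue.
            return Files.readAllLines(toPig);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> result = runNative(
                Arrays.asList("a", "b"),
                rows -> rows.stream().map(String::toUpperCase).collect(Collectors.toList()));
        System.out.println(result); // prints [A, B]
    }
}
```

With streaming, by contrast, steps 1 and 3 would not touch disk: rows would be piped straight into and out of a single binary.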
[jira] Commented: (PIG-1370) Marking Pig interfaces for org.apache.pig package
    [ https://issues.apache.org/jira/browse/PIG-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855384#action_12855384 ]

Hadoop QA commented on PIG-1370:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12441251/PIG-1370.patch
  against trunk revision 932161.

    +1 @author. The patch does not contain any @author tags.
    +0 tests included. The patch appears to be a documentation patch that doesn't require tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/292/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/292/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/292/console

This message is automatically generated.

Marking Pig interfaces for org.apache.pig package
-

    Key: PIG-1370
    URL: https://issues.apache.org/jira/browse/PIG-1370
    Project: Pig
    Issue Type: Sub-task
    Components: documentation
    Reporter: Alan Gates
    Assignee: Alan Gates
    Fix For: 0.8.0
    Attachments: PIG-1370.patch

Done as a separate JIRA from PIG-1311 since this alone contains quite a lot of changes.
[jira] Commented: (PIG-1364) Public javadoc on apache site still on 0.2, needs to be updated for each version release
    [ https://issues.apache.org/jira/browse/PIG-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855402#action_12855402 ]

Hadoop QA commented on PIG-1364:

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12441227/PIG-1364-trunk.patch
  against trunk revision 932161.

    +1 @author. The patch does not contain any @author tags.
    +0 tests included. The patch appears to be a documentation patch that doesn't require tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 findbugs. The patch does not introduce any new Findbugs warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/280/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/280/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/280/console

This message is automatically generated.

Public javadoc on apache site still on 0.2, needs to be updated for each version release

    Key: PIG-1364
    URL: https://issues.apache.org/jira/browse/PIG-1364
    Project: Pig
    Issue Type: Bug
    Components: documentation
    Affects Versions: 0.4.0, 0.5.0, 0.6.0
    Reporter: Alan Gates
    Assignee: Alan Gates
    Priority: Critical
    Fix For: 0.4.0, 0.5.0, 0.6.0, 0.7.0
    Attachments: PIG-1364-0.4.patch, PIG-1364-0.5.patch, PIG-1364-0.6.patch, PIG-1364-0.7.patch, PIG-1364-trunk.patch

See http://hadoop.apache.org/pig/javadoc/docs/api/. This currently contains javadocs for 0.2. It is also versionless. It needs to be changed so that javadocs for recent versions are posted. It also needs to change so that the version is in the api so that multiple versions of the API can be posted.

It's probably too late to do this for 0.6 and before, but it needs to happen for 0.7.
[jira] Commented: (PIG-506) Does pig need a NATIVE keyword?
    [ https://issues.apache.org/jira/browse/PIG-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855434#action_12855434 ]

Ashutosh Chauhan commented on PIG-506:
--

Ashitosh,

When I click on that link, I get:
{noformat}
You do not have the required role.
{noformat}

Do you need to set permissions for it to be world-readable? (if that is what you are intending to do)
[jira] Updated: (PIG-1356) [zebra] TableLoader makes unnecessary calls to build a Job instance that create a new JobClient in the hadoop 0.20.9
    [ https://issues.apache.org/jira/browse/PIG-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-1356:
--

    Resolution: Fixed
    Fix Version/s: 0.8.0
    Status: Resolved (was: Patch Available)

Patch committed to the trunk and the 0.7 branch.
[jira] Updated: (PIG-1366) PigStorage's pushProjection implementation results in NPE under certain data conditions
    [ https://issues.apache.org/jira/browse/PIG-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-1366:

    Resolution: Fixed
    Hadoop Flags: [Reviewed]
    Status: Resolved (was: Patch Available)

Patch committed to trunk and branch-0.7

PigStorage's pushProjection implementation results in NPE under certain data conditions
---

    Key: PIG-1366
    URL: https://issues.apache.org/jira/browse/PIG-1366
    Project: Pig
    Issue Type: Bug
    Affects Versions: 0.6.0, 0.7.0
    Reporter: Pradeep Kamath
    Assignee: Pradeep Kamath
    Fix For: 0.7.0
    Attachments: PIG-1366.patch

Under the following conditions, a NullPointerException is caused when PigStorage is used:
If in the script, only the 2nd and 3rd column of the data (say) are used, the PruneColumns optimization passes this information to PigStorage through the pushProjection() method. If the data contains a row with only one column (malformed data due to missing cols in certain rows), PigStorage returns a Tuple backed by a null ArrayList. Subsequent projection operations on this tuple result in the NPE.
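The failure mode in the PIG-1366 description (projecting the 2nd and 3rd columns from a row that has only one) can be illustrated with a small self-contained Java sketch. It is not PigStorage's code and not necessarily what PIG-1366.patch does; `ProjectionSketch` and `project` are hypothetical names, and plain lists of strings stand in for Pig tuples. The sketch shows one way to guard the projection: always return a list of the projected width, with null in place of any column the row lacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a loader's projection step; not PigStorage itself.
public class ProjectionSketch {

    // Returns the requested columns of a delimited line. Columns missing from
    // the row become null, so the result is never backed by a null list.
    public static List<String> project(String line, String delimiter, int[] requiredColumns) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        List<String> projected = new ArrayList<>(requiredColumns.length);
        for (int column : requiredColumns) {
            projected.add(column < fields.length ? fields[column] : null);
        }
        return projected;
    }

    public static void main(String[] args) {
        int[] secondAndThird = {1, 2}; // zero-based indexes of the 2nd and 3rd columns
        System.out.println(project("a\tb\tc", "\t", secondAndThird)); // prints [b, c]
        System.out.println(project("a", "\t", secondAndThird));       // prints [null, null]
    }
}
```

Later projection steps can then index into the result safely and treat the null entries as missing values.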
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Attachment: pig_1178_3.4.patch

pig_1178_3.4.patch includes all logical relational operators.

LogicalPlan and Optimizer are too complex and hard to work with
---

    Key: PIG-1178
    URL: https://issues.apache.org/jira/browse/PIG-1178
    Project: Pig
    Issue Type: Improvement
    Reporter: Alan Gates
    Assignee: Daniel Dai
    Attachments: expressions-2.patch, expressions.patch, lp.patch, lp.patch, pig_1178.patch, pig_1178.patch, PIG_1178.patch, pig_1178_2.patch, pig_1178_3.2.patch, pig_1178_3.3.patch, pig_1178_3.4.patch, pig_1178_3.patch

The current implementation of the logical plan and the logical optimizer in Pig has proven to not be easily extensible. Developer feedback has indicated that adding new rules to the optimizer is quite burdensome. In addition, the logical plan has been an area of numerous bugs, many of which have been difficult to fix. Developers also feel that the logical plan is difficult to understand and maintain. The root cause for these issues is that a number of design decisions that were made as part of the 0.2 rewrite of the front end have now proven to be sub-optimal. The heart of this proposal is to revisit a number of those decisions and rebuild the logical plan with a simpler design that will make it much easier to maintain the logical plan as well as extend the logical optimizer.

See http://wiki.apache.org/pig/PigLogicalPlanOptimizerRewrite for full details.
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Status: Open (was: Patch Available)
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Status: Patch Available (was: Open)
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Attachment: pig_1178_3.4.patch
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Attachment: (was: pig_1178_3.4.patch)
[jira] Commented: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855486#action_12855486 ]

Hadoop QA commented on PIG-1178:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12441315/pig_1178_3.4.patch
  against trunk revision 932472.

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 6 new or modified tests.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to cause Findbugs to fail.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed core unit tests.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/281/testReport/
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/281/console

This message is automatically generated.
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Status: Patch Available (was: Open)
[jira] Updated: (PIG-1178) LogicalPlan and Optimizer are too complex and hard to work with
    [ https://issues.apache.org/jira/browse/PIG-1178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1178:

    Status: Open (was: Patch Available)