[jira] Updated: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-943:

Assignee: Daniel Dai
Status: Patch Available (was: Open)

Pig crash when it cannot get counter from hadoop

Key: PIG-943
URL: https://issues.apache.org/jira/browse/PIG-943
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.3.0
Reporter: Daniel Dai
Assignee: Daniel Dai
Fix For: 0.4.0
Attachments: PIG-943-1.patch

We see the following call stacks in Pig:

Case 1:
Caused by: java.lang.NullPointerException
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.computeWarningAggregate(MapReduceLauncher.java:390)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:238)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)

Case 2:
Caused by: java.lang.NullPointerException
    at org.apache.pig.tools.pigstats.PigStats.accumulateMRStats(PigStats.java:150)
    at org.apache.pig.tools.pigstats.PigStats.accumulateStats(PigStats.java:91)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:192)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:265)

In both cases, the hadoop jobs finish without error. The cause of both problems is that RunningJob.getCounters() returns null, and Pig does not currently check for that.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
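The failure mode in the report can be illustrated with a small guard. This is only a sketch under stated assumptions, not the contents of PIG-943-1.patch: `JobCounters`, `NullCounterGuard`, and the counter name `RECORDS_WRITTEN` are hypothetical stand-ins so the example compiles without Hadoop on the classpath. In Pig, the possibly-null value would be whatever RunningJob.getCounters() returns. The -1 result for "unknown" follows the sentinel that the later comments in this thread discuss.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a job's counters; this is NOT Hadoop's Counters class.
final class JobCounters {
    private final Map<String, Long> values = new HashMap<>();

    void set(String name, long value) {
        values.put(name, value);
    }

    long get(String name) {
        return values.getOrDefault(name, 0L);
    }
}

final class NullCounterGuard {
    // Sentinel meaning "counter information was not available" (assumption).
    static final long UNKNOWN = -1L;

    // Returns the counter value, or UNKNOWN when the counters object is null,
    // instead of dereferencing null and throwing a NullPointerException.
    static long recordsWritten(JobCounters counters) {
        if (counters == null) {
            return UNKNOWN;
        }
        return counters.get("RECORDS_WRITTEN");
    }

    public static void main(String[] args) {
        JobCounters counters = new JobCounters();
        counters.set("RECORDS_WRITTEN", 42L);
        System.out.println(recordsWritten(counters)); // counters available
        System.out.println(recordsWritten(null));     // counters missing
    }
}
```

The point of the sketch is only that the null check happens once, at the place where the counters are read, so that callers never see a null reference.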
[jira] Commented: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751560#action_12751560 ]

Pradeep Kamath commented on PIG-943:

I am wondering if we should report that we were unable to determine the number of records/bytes written:
{code}
if (stats.getRecordsWritten() != -1)
    log.info("Records written : " + stats.getRecordsWritten());
if (stats.getBytesWritten() != -1)
    log.info("Bytes written : " + stats.getBytesWritten());
{code}
could change to
{code}
if (stats.getRecordsWritten() == -1) {
    log.info("Records written : Unable to determine number of records written");
} else {
    log.info("Records written : " + stats.getRecordsWritten());
}
if (stats.getBytesWritten() == -1) {
    log.info("Bytes written : Unable to determine number of bytes written");
} else {
    log.info("Bytes written : " + stats.getBytesWritten());
}
{code}
If we are using -1 to indicate that we did not get counter information but the warning has occurred some number of times (maybe in one of the jobs where we did get counter information), the following function's output message should change:
{code}
public static void logAggregate(Map<Enum, Long> aggMap, MessageType messageType, Log log) {
    for (Enum e : aggMap.keySet()) {
        Long count = aggMap.get(e);
        if (count != null && count > 0) {
            String message = "Encountered " + messageType + " " + e.toString() + " " + count + " time(s).";
            logMessage(message, messageType, log);
        }
    }
}
{code}
to
{code}
public static void logAggregate(Map<Enum, Long> aggMap, MessageType messageType, Log log) {
    for (Enum e : aggMap.keySet()) {
        Long count = aggMap.get(e);
        if (count != null && (count == -1 || count > 0)) {
            String message = "Encountered " + messageType + " " + e.toString() + " "
                    + (count == -1 ? "unknown number of" : count.toString()) + " time(s).";
            logMessage(message, messageType, log);
        }
    }
}
{code}
This way we still warn the user about the warning message which got aggregated, but just do not report the number of occurrences.

To enable the above change, the code in MapReduceLauncher.java should also change to ensure we report 3 types of counts:
0 - the warning did not occur
-1 - the warning occurred in some of the jobs, but in one or more jobs we got a null counter and hence we cannot report an accurate count
> 0 - we can accurately report the count of occurrences of the warning.
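The three kinds of count proposed in the comment above can be sketched as a small aggregation step. This is a hedged illustration, not code from MapReduceLauncher.java: the class and method names are hypothetical, a `null` list entry stands for a job whose counters could not be retrieved, and the literal "Warning" stands in for the message type. The comment does not say what to report when counters are missing and no occurrence was seen in any other job; this sketch assumes 0 in that case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class WarningCountSketch {
    // Sentinel: the warning occurred, but the exact count is not known.
    static final long UNKNOWN = -1L;

    // Combines the per-job counts for one warning type.
    // A null entry means that job returned no counters.
    static long aggregate(List<Long> perJobCounts) {
        long total = 0L;
        boolean missing = false;
        for (Long count : perJobCounts) {
            if (count == null) {
                missing = true;
            } else {
                total += count;
            }
        }
        // Seen at least once, but some job's counters were missing: inaccurate.
        return (missing && total > 0L) ? UNKNOWN : total;
    }

    // Builds the log line, or returns empty when the warning did not occur.
    static Optional<String> message(String warning, long count) {
        if (count != UNKNOWN && count <= 0L) {
            return Optional.empty();
        }
        String times = (count == UNKNOWN) ? "unknown number of" : Long.toString(count);
        return Optional.of("Encountered Warning " + warning + " " + times + " time(s).");
    }

    public static void main(String[] args) {
        long count = aggregate(Arrays.asList(3L, null, 2L));
        message("DIVIDE_BY_ZERO", count).ifPresent(System.out::println);
    }
}
```

Keeping the sentinel decision inside `aggregate` means the logging side only has to distinguish the three cases, which is the split the comment asks for.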
[jira] Commented: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751575#action_12751575 ]

Hadoop QA commented on PIG-943:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12418545/PIG-943-1.patch
against trunk revision 811203.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/13/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/13/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/13/console

This message is automatically generated.
[jira] Commented: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751631#action_12751631 ]

Dmitriy V. Ryaboy commented on PIG-943:

Hi Daniel,

My apologies, I worded my comment poorly. I wasn't minus-oneing the patch, I was saying that the use of -1 as a magic value is a bit hacky. I think inserting Long.NaN or null and checking for it on the other end, instead of checking for -1, is cleaner.
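The alternative suggested above can be sketched with a nullable `Long`. Note that java.lang.Long defines no NaN constant (only the floating-point wrappers do), so of the two options mentioned, `null` is the one that can be shown directly. The class and method names below are hypothetical and are not taken from Pig.

```java
final class NullableCountSketch {
    // Formats a statistic; null means the counter could not be read.
    static String describe(String label, Long count) {
        return (count == null)
                ? label + " : Unable to determine"
                : label + " : " + count;
    }

    // Adds two counts; the sum is unknown (null) if either part is unknown.
    static Long sum(Long a, Long b) {
        if (a == null || b == null) {
            return null;
        }
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(describe("Records written", sum(10L, 5L)));
        System.out.println(describe("Bytes written", sum(10L, null)));
    }
}
```

The trade-off is that every consumer has to check for null before doing arithmetic, whereas a -1 sentinel can be silently added into a total by mistake.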
[jira] Updated: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-943:

Attachment: PIG-943-2.patch

This patch incorporates Pradeep's comment. Now, if we encounter any null counters from hadoop, we ignore them when doing the calculation. Then, in the report, we add a note: "We cannot retrieve hadoop counter for x jobs, the number following warnings may not be correct."
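The approach described for the second patch (skip null counters, then tell the user how many jobs were skipped) might look roughly like the following. This is a sketch written from the description above, not from PIG-943-2.patch itself; the class name, the use of a `null` list entry for a job without counters, and the exact wording of the note are assumptions.

```java
import java.util.Arrays;
import java.util.List;

final class SkipMissingCountersSketch {
    final long total;               // sum over jobs that did report counters
    final int jobsWithoutCounters;  // jobs whose counters were null

    private SkipMissingCountersSketch(long total, int jobsWithoutCounters) {
        this.total = total;
        this.jobsWithoutCounters = jobsWithoutCounters;
    }

    // Sums the available counts and remembers how many jobs were skipped.
    static SkipMissingCountersSketch from(List<Long> perJobCounts) {
        long total = 0L;
        int missing = 0;
        for (Long count : perJobCounts) {
            if (count == null) {
                missing++;
            } else {
                total += count;
            }
        }
        return new SkipMissingCountersSketch(total, missing);
    }

    // Returns the caveat to print before the warnings, or "" if none is needed.
    String note() {
        if (jobsWithoutCounters == 0) {
            return "";
        }
        return "Cannot retrieve hadoop counter for " + jobsWithoutCounters
                + " jobs, the number of following warnings may not be correct";
    }

    public static void main(String[] args) {
        SkipMissingCountersSketch stats = from(Arrays.asList(4L, null, 1L, null));
        System.out.println(stats.note());
        System.out.println("Warning count: " + stats.total);
    }
}
```

Compared with the -1 sentinel, this keeps a usable lower bound for each count and reports the uncertainty once rather than per warning.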
[jira] Updated: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-943:

Status: Open (was: Patch Available)
[jira] Commented: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751645#action_12751645 ]

Daniel Dai commented on PIG-943:

Hi Dmitriy,

Thank you for your clarification. I looked at the code: if we change -1 to Long.NaN or null, then we need to change more code. Besides, I don't see much difference if we use null instead of -1. In both cases, we use a special value to indicate something special.
[jira] Created: (PIG-946) Combiner optimizer does not optimize when limit follow group, foreach
Combiner optimizer does not optimize when limit follow group, foreach

Key: PIG-946
URL: https://issues.apache.org/jira/browse/PIG-946
Project: Pig
Issue Type: Bug
Affects Versions: 0.3.0
Reporter: Pradeep Kamath

The following script is combinable but is not optimized:

a = load '/user/pig/tests/data/singlefile/studenttab10k';
b = group a by $1;
c = foreach b generate group, AVG(a.$2);
d = limit c 10;
dump d;
[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-944:

Attachment: zebra_pig_interface.patch

patch file to fix the problem

Zebra schema is taken from Pig through TableStorer's construct

Key: PIG-944
URL: https://issues.apache.org/jira/browse/PIG-944
Project: Pig
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Yan Zhou
Attachments: zebra_pig_interface.patch

The schema should be taken from StoreConfig in the TableOutputFormat.checkOutputSpecs method, because the information is dynamic in Pig's execution engine and should not be passed as a static argument to the constructor.
[jira] Updated: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Zhou updated PIG-944:

Status: Patch Available (was: Open)

pig to zebra schema conversion support is added to fix the problem
[jira] Commented: (PIG-943) Pig crash when it cannot get counter from hadoop
[ https://issues.apache.org/jira/browse/PIG-943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751676#action_12751676 ]

Hadoop QA commented on PIG-943:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12418677/PIG-943-2.patch
against trunk revision 811203.

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/14/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/14/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/14/console

This message is automatically generated.
[jira] Commented: (PIG-944) Zebra schema is taken from Pig through TableStorer's construct
[ https://issues.apache.org/jira/browse/PIG-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751686#action_12751686 ]

Hadoop QA commented on PIG-944:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12418687/zebra_pig_interface.patch
against trunk revision 811203.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 15 new or modified tests.
-1 patch. The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Pig-Patch-h8.grid.sp2.yahoo.net/1/console

This message is automatically generated.