[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13899401#comment-13899401 ]

Ted Yu commented on HBASE-10452:

[~apurtell]: What do you think of patch v3?

Potential bugs in exception handlers
Key: HBASE-10452
URL: https://issues.apache.org/jira/browse/HBASE-10452
Project: HBase
Issue Type: Bug
Components: Client, master, regionserver, util
Affects Versions: 0.96.1
Reporter: Ding Yuan
Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk.patch

Hi HBase developers,

We are a group of researchers on software reliability. Recently we did a study and found that the majority of the most severe failures in HBase are caused by bugs in exception-handling logic -- it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated!

Ding

=
Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
{noformat}
protected void releaseTableLock() {
  if (this.tableLock != null) {
    try {
      this.tableLock.release();
    } catch (IOException ex) {
      LOG.warn("Could not release the table lock", ex);
      //TODO: if we get here, and not abort RS, this lock will never be released
    }
  }
}
{noformat}
The lock is not released if the exception occurs, causing potential deadlock or starvation.
Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java
==

=
Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
{noformat}
try {
  Field fEnd = SequenceFile.Reader.class.getDeclaredField("end");
  fEnd.setAccessible(true);
  end = fEnd.getLong(this.reader);
} catch (Exception e) { /* reflection fail. keep going */ }
{noformat}
The caught Exception seems to be too general.
While reflection-related errors might be harmless, the try block can throw other exceptions, including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. The safer way is probably to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions.
==

=
Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (Exception e) {
}
{noformat}
Similar to the previous case, the exception handling is too general. While ClassNotFoundException might be the normal case and can be ignored, Class.forName can also throw other throwables (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to:
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (ExceptionInInitializerError e) {
  LOG.warn(..); // handle initializer error
} catch (LinkageError e) {
  LOG.warn(..); // handle linkage error
} catch (ClassNotFoundException e) {
  LOG.debug(..); // ignore
}
{noformat}
(Note that ExceptionInInitializerError must be caught before LinkageError, since it is a subclass of LinkageError.)
==

=
Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java
{noformat}
public Get setTimeStamp(long timestamp) {
  try {
    tr = new TimeRange(timestamp, timestamp + 1);
  } catch (IOException e) {
    // Will never happen
  }
  return this;
}
{noformat}
Even if the IOException never happens right now, is it possible for it to happen in the future due to a code change? At least there should be a log message. The current behavior is dangerous: if the exception ever happens in any unexpected scenario, it will be silently swallowed.
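A minimal sketch of the logging approach for Case 4 follows. The class below is a hypothetical stand-in for the Get/TimeRange pair, and the static lastWarning field stands in for LOG.warn; it is not the actual HBase code:

```java
import java.io.IOException;

// Stand-in for the Get.setTimeStamp pattern: instead of an empty catch, the
// "can never happen" exception is at least recorded so it is never lost.
public class TimeStampSetter {
  static String lastWarning; // stand-in for LOG.warn

  long minStamp = 0;
  long maxStamp = Long.MAX_VALUE;

  public TimeStampSetter setTimeStamp(long timestamp) {
    try {
      setTimeRange(timestamp, timestamp + 1); // may throw on overflow
    } catch (IOException e) {
      // Should never happen today, but log it in case a future change makes it possible.
      lastWarning = "Could not set timestamp " + timestamp + ": " + e.getMessage();
    }
    return this;
  }

  private void setTimeRange(long min, long max) throws IOException {
    if (max < min) {
      throw new IOException("maxStamp is smaller than minStamp");
    }
    this.minStamp = min;
    this.maxStamp = max;
  }
}
```

With this shape, a call like setTimeStamp(Long.MAX_VALUE) overflows the max bound and leaves a warning behind instead of silently doing nothing.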
Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java
==

=
Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java
{noformat}
if (input != null) {
  try {
    input.close();
  } catch (IOException ignored) {
  }
}
{noformat}
Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods.
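For Case 5, one option is a small helper that still swallows the IOException (a close failure usually should not change control flow) but records it first. This is a generic sketch, not the actual JVM.java code; lastWarning stands in for LOG.warn:

```java
import java.io.Closeable;
import java.io.IOException;

// "Close quietly but log loudly": the caller's control flow is unchanged,
// but a failed close() is no longer invisible.
public class CloseUtil {
  static String lastWarning; // stand-in for LOG.warn

  public static void closeQuietly(Closeable c) {
    if (c == null) {
      return;
    }
    try {
      c.close();
    } catch (IOException e) {
      lastWarning = "Failed to close " + c.getClass().getSimpleName() + ": " + e.getMessage();
    }
  }
}
```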
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896424#comment-13896424 ]

Ding Yuan commented on HBASE-10452:
---
Thanks for the comments! Attached a new patch to address them. As for the possible integer overflow error from TimeRange: an IOException instead of RuntimeException is now thrown, so the upper levels will deal with it. Let me know if this is fine.
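The TimeRange change described in the comment above -- throwing a checked IOException when the range is inverted (e.g. by timestamp + 1 overflowing) -- might look roughly like this simplified stand-in; it is a sketch, not the actual HBase TimeRange:

```java
import java.io.IOException;

// Simplified stand-in for the TimeRange constructor change: an inverted range
// is reported as a checked IOException that callers must handle, instead of
// an unchecked RuntimeException.
public class TimeRange {
  private final long minStamp;
  private final long maxStamp;

  public TimeRange(long minStamp, long maxStamp) throws IOException {
    if (maxStamp < minStamp) {
      throw new IOException("maxStamp is smaller than minStamp, likely integer overflow");
    }
    this.minStamp = minStamp;
    this.maxStamp = maxStamp;
  }

  // Half-open interval check: [minStamp, maxStamp)
  public boolean withinTimeRange(long timestamp) {
    return minStamp <= timestamp && timestamp < maxStamp;
  }
}
```

Passing Long.MAX_VALUE as the min stamp makes maxStamp wrap around to Long.MIN_VALUE, which this constructor now rejects instead of silently producing an inverted range.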
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896489#comment-13896489 ]

Hadoop QA commented on HBASE-10452:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12627964/HBase-10452-trunk-v3.patch against trunk revision .
ATTACHMENT ID: 12627964

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:red}-1 site{color}. The patch appears to cause the mvn site goal to fail.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8650//console

This message is automatically generated.
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896228#comment-13896228 ]

Andrew Purtell commented on HBASE-10452:

Adding error logging is fine. Throwing RuntimeExceptions is borderline. I would prefer you just add more error logging in those places. The changes to RegionMergeRequest and SplitRequest are not good. Don't retry with a hardcoded and arbitrary strategy. If it is important enough to abort, just abort.
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896236#comment-13896236 ]

stack commented on HBASE-10452:
---
Thanks for looking into this [~d.yuan]. All looks good to me, caveat what Andy says -- just abort if you can't release the lock (something is really wrong if you can't release a lock).
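The abort-on-failure handling both reviewers suggest for Case 1 could be sketched as below. Abortable mirrors HBase's callback interface of that name, but TableLock and the wiring are illustrative stand-ins, not the actual RegionMergeRequest code:

```java
import java.io.IOException;

// Sketch of the reviewers' suggestion: if releasing the table lock fails,
// abort rather than retrying or limping on with a lock that will never be
// released.
public class LockRelease {
  interface Abortable {
    void abort(String why, Throwable e);
  }

  interface TableLock {
    void release() throws IOException;
  }

  static void releaseTableLock(TableLock tableLock, Abortable server) {
    if (tableLock == null) {
      return;
    }
    try {
      tableLock.release();
    } catch (IOException ex) {
      // Something is really wrong if a lock cannot be released: abort.
      server.abort("Could not release the table lock", ex);
    }
  }
}
```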
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890844#comment-13890844 ]

Ding Yuan commented on HBASE-10452:
---
Thanks Ram! Attaching a patch against trunk. I did not fix case 7 since it requires using HBCK to clear the node. I have little expertise in the HBase code base, and any attempt of mine in this case is likely to do more harm than good. Again, any comment is much appreciated!
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890916#comment-13890916 ]

Ted Yu commented on HBASE-10452:

nit:
{code}
+ throw new RuntimeException("TimeRange failed, likely integer overflow. Preventing bad things to propagate.", e);
{code}
Please wrap long line - 100 characters limit.
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890939#comment-13890939 ] Hadoop QA commented on HBASE-10452:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12626898/HBase-10452-trunk.patch
against trunk revision .
ATTACHMENT ID: 12626898

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:red}-1 hadoop1.0{color}. The patch failed to compile against the hadoop 1.0 profile. Here is a snippet of the errors:

{code}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase-server: Compilation failure: Compilation failure:
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java:[258,12] cannot find symbol
[ERROR] symbol  : class ReflectiveOperationException
[ERROR] location: class org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader
[ERROR] /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java:[277,12] cannot find symbol
[ERROR] symbol  : class ReflectiveOperationException
--
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:compile (default-compile) on project hbase-server: Compilation failure
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
--
Caused by: org.apache.maven.plugin.CompilationFailureException: Compilation failure
	at org.apache.maven.plugin.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:729)
	at org.apache.maven.plugin.CompilerMojo.execute(CompilerMojo.java:128)
	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:101)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
	... 19 more
{code}

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8590//console

This message is automatically generated.
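The hadoop-1.0 -1 above is because ReflectiveOperationException only exists from Java 7 on. A Java 6-compatible narrowing of the Case 2 handler lists the checked reflection exceptions individually instead. This is only a sketch: ReaderHolder is a stand-in for SequenceFile.Reader so the example is self-contained, and the fallback value is an assumed convention.

```java
import java.lang.reflect.Field;

public class ReflectEnd {
    // Stand-in for SequenceFile.Reader, used only for illustration.
    static class ReaderHolder {
        private long end = 42L;  // field we read reflectively
    }

    static long readEnd(ReaderHolder reader, long fallback) {
        try {
            Field fEnd = ReaderHolder.class.getDeclaredField("end");
            fEnd.setAccessible(true);
            return fEnd.getLong(reader);
        } catch (NoSuchFieldException e) {
            // expected when the field is absent in this version; safe to skip
            return fallback;
        } catch (IllegalAccessException e) {
            return fallback;
        } catch (SecurityException e) {
            // a security manager refused setAccessible; in real code,
            // log this before falling back rather than swallowing it
            return fallback;
        }
        // Unexpected RuntimeExceptions now propagate instead of vanishing.
    }

    public static void main(String[] args) {
        System.out.println(readEnd(new ReaderHolder(), -1L));  // prints 42
    }
}
```

Compiles with -source 1.6, unlike the multi-catch or ReflectiveOperationException variants.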
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891109#comment-13891109 ] Ding Yuan commented on HBASE-10452:
---
Thanks Ted! Fixed both the line wrapping and the compilation error on Java 6.

Potential bugs in exception handlers
Key: HBASE-10452
URL: https://issues.apache.org/jira/browse/HBASE-10452
Project: HBase
Issue Type: Bug
Components: Client, master, regionserver, util
Affects Versions: 0.96.1
Reporter: Ding Yuan
Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk.patch

Hi HBase developers,
We are a group of researchers on software reliability. Recently we did a study and found that the majority of the most severe failures in HBase are caused by bugs in exception-handling logic -- it is hard to anticipate all the possible real-world error scenarios. We therefore built a simple checking tool that automatically detects bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated!
Ding

= Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java
{noformat}
protected void releaseTableLock() {
  if (this.tableLock != null) {
    try {
      this.tableLock.release();
    } catch (IOException ex) {
      LOG.warn("Could not release the table lock", ex);
      // TODO: if we get here, and do not abort the RS, this lock will never be released
    }
  }
}
{noformat}
The lock is not released if the exception occurs, causing potential deadlock or starvation.
Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java
==

= Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
{noformat}
try {
  Field fEnd = SequenceFile.Reader.class.getDeclaredField("end");
  fEnd.setAccessible(true);
  end = fEnd.getLong(this.reader);
} catch (Exception e) { /* reflection fail. keep going */ }
{noformat}
The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can also throw other exceptions, including SecurityException, IllegalAccessException, etc. Currently all of those are ignored. The safer approach is to ignore only the specific reflection-related exceptions, while logging and handling the other, unexpected ones.
==

= Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (Exception e) {
}
{noformat}
Similar to the previous case, the exception handling is too general. While ClassNotFoundException might be the normal case and safely ignored, Class.forName can also throw errors (e.g., LinkageError) in some unexpected and rare scenarios. If that happens, the error will be lost. So maybe change it to the following (note that ExceptionInInitializerError must be caught before its superclass LinkageError, or the code will not compile):
{noformat}
try {
  if (Class.forName("org.apache.hadoop.conf.ConfServlet") != null) {
    isShowConf = true;
  }
} catch (ExceptionInInitializerError e) {
  LOG.warn(..);  // handle initializer error
} catch (LinkageError e) {
  LOG.warn(..);  // handle linkage error
} catch (ClassNotFoundException e) {
  LOG.debug(..); // ignore
}
{noformat}
==

= Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java
{noformat}
public Get setTimeStamp(long timestamp) {
  try {
    tr = new TimeRange(timestamp, timestamp+1);
  } catch (IOException e) {
    // Will never happen
  }
  return this;
}
{noformat}
Even if the IOException never happens right now, could it happen in the future due to a code change? At the very least there should be a log message. The current behavior is dangerous: if the exception ever occurs in some unexpected scenario, it will be silently swallowed.
Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java
==

= Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java
{noformat}
if (input != null) {
  try {
    input.close();
  } catch (IOException ignored) {
  }
}
{noformat}
Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other
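For Case 5, one option on Java 7+ is try-with-resources: the stream is closed automatically, and a close() failure that follows a failure in the body is recorded as a suppressed exception instead of vanishing. A minimal, self-contained sketch (StringReader stands in for the process stream read in JVM.java, and the method name is illustrative):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class CloseDemo {
    // Read the first line; the reader is closed automatically. A failure
    // in close() is not silently discarded: it either propagates, or is
    // attached as a suppressed exception to the primary failure.
    static String firstLine(String text) throws IOException {
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            return in.readLine();
        }
        // no explicit `catch (IOException ignored) {}` needed for close()
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLine("hello\nworld"));  // prints "hello"
    }
}
```

Where try-with-resources is unavailable (Java 6), at least logging the close failure at debug level preserves the evidence without changing behavior.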
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891326#comment-13891326 ] Hadoop QA commented on HBASE-10452:
---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12626957/HBase-10452-trunk-v2.patch
against trunk revision .
ATTACHMENT ID: 12626957

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.

{color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.

{color:green}+1 site{color}. The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8595//console

This message is automatically generated.
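For Case 4 in the issue description, a common alternative to an empty catch is to make the "will never happen" assumption fail loudly. This is a sketch only; TimeRange and Get below are toy stand-ins defined so the example compiles, not the HBase classes, and the error message is invented for illustration:

```java
import java.io.IOException;

public class GetDemo {
    // Toy stand-in: throws IOException only when the range is inverted.
    static class TimeRange {
        final long min, max;
        TimeRange(long min, long max) throws IOException {
            if (max < min) throw new IOException("max < min");
            this.min = min;
            this.max = max;
        }
    }

    static class Get {
        TimeRange tr;
        Get setTimeStamp(long timestamp) {
            try {
                tr = new TimeRange(timestamp, timestamp + 1);
            } catch (IOException e) {
                // Reachable only if TimeRange's contract changes; surface
                // that loudly instead of silently leaving tr unset.
                throw new AssertionError("TimeRange(ts, ts+1) failed", e);
            }
            return this;
        }
    }
}
```

AssertionError(String, Throwable) requires Java 7; on Java 6 a RuntimeException wrapper, or at minimum a LOG.warn, achieves the same goal of not losing the exception.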
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13889210#comment-13889210 ] ramkrishna.s.vasudevan commented on HBASE-10452:
---
Thanks for checking this. It would be better if you create a patch with the above-mentioned changes. See http://hbase.apache.org/book.html#submitting.patches if you need to know how to start contributing.
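For Case 1, one possible direction is to escalate a failed release rather than only warning, handing the failure to an abort callback so the leaked lock's owner can be torn down by the coordination service. This is an illustration of the idea, not necessarily what the attached patch does; TableLock and Abortable are simplified stand-ins, not the real HBase interfaces:

```java
import java.io.IOException;

public class LockRelease {
    // Simplified stand-ins for illustration only.
    interface TableLock { void release() throws IOException; }
    interface Abortable { void abort(String why, Throwable cause); }

    // Returns true if the lock was released cleanly. On failure, invoke
    // the server's abort hook instead of only logging a warning: once
    // the process exits, the coordinator can expire the lock owner's
    // session, so the lock is not held forever.
    static boolean releaseTableLock(TableLock tableLock, Abortable server) {
        if (tableLock == null) {
            return true;
        }
        try {
            tableLock.release();
            return true;
        } catch (IOException ex) {
            server.abort("Could not release the table lock", ex);
            return false;
        }
    }
}
```

The trade-off is deliberate: aborting a region server is drastic, but a silently leaked table lock blocks every later merge or split on that table, which is the deadlock/starvation risk the report describes.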