[jira] [Commented] (HBASE-12187) Review in source the paper "Simple Testing Can Prevent Most Critical Failures"
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17129775#comment-17129775 ] Ding Yuan commented on HBASE-12187: --- [~mattf] the empty catch block rule is now part of [error-prone's newest release (v2.4.0)|https://github.com/google/error-prone/releases/tag/v2.4.0]. This is by far the most important rule among Aspirator's rules (vast majority of the bugs fall under this category). Try it :) > Review in source the paper "Simple Testing Can Prevent Most Critical Failures" > -- > > Key: HBASE-12187 > URL: https://issues.apache.org/jira/browse/HBASE-12187 > Project: HBase > Issue Type: Bug >Reporter: Michael Stack >Priority: Critical > Attachments: HBASE-12187.patch, abortInOvercatch.warnings.txt, > emptyCatch.warnings.txt, todoInCatch.warnings.txt > > > Review the helpful paper > https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf > It describes 'catastrophic failures', especially issues where exceptions are > thrown but not properly handled. Their static analysis tool Aspirator turns > up a bunch of the obvious offenders (Lets add to test-patch.sh alongside > findbugs?). This issue is about going through code base making sub-issues to > root out these and others (Don't we have the test described in figure #6 > already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14721858#comment-14721858 ] Ding Yuan commented on HBASE-12187: --- Hi [~busbey], i don't think so. I remember I sent error-prone's developers the patches last year via google group, but I can no longer find those discussion threads (seems they migrated the whole thing to github). Should I send the patch to the error-prone devs again or you guys are hosting error-prone repository locally? Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Attachments: HBASE-12187.patch, abortInOvercatch.warnings.txt, emptyCatch.warnings.txt, todoInCatch.warnings.txt Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Lets add to test-patch.sh alongside findbugs?). This issue is about going through code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12349) Add Maven build support module for a custom version of error-prone
[ https://issues.apache.org/jira/browse/HBASE-12349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500285#comment-14500285 ] Ding Yuan commented on HBASE-12349: --- Ping. Just checking if the custom error-prone has been integrated? Or if there is anything that I can further help? Add Maven build support module for a custom version of error-prone -- Key: HBASE-12349 URL: https://issues.apache.org/jira/browse/HBASE-12349 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Fix For: 2.0.0 Add a new Maven build support module that builds and publishes a custom error-prone artifact for use by the rest of the build. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184734#comment-14184734 ] Ding Yuan commented on HBASE-12187: --- I have implemented the three checks from aspirator into error-prone version 1.1.2. These three checks are: (1). Catch block that ignores exception (including containing only a log printing statement); (2). Aborting the system on exception over-catch; (3). Catch block containing TODO or FIXME in comments Among them, (1) is a bit complicated since I included quite a few false positive suppression heuristics as described in the paper. I have tested all three checks on HBase-0.99.0. The first check found 111 cases, while the other two found less than 10 each. I have attached the reported cases as attachments. Currently I assigned all of the three checks as ERROR severity. So if one thinks that a case is fine, an annotation like @SupressWarnings(EmptyCatch) is needed to get the compilation to succeed. I am attaching the patch to error-prone v1.1.2, which contains the three added checks. I have also uploaded my error-prone repository to: https://github.com/diy1/error-prone-aspirator Please let me know how i can further help. cheers, ding Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Lets add to test-patch.sh alongside findbugs?). This issue is about going through code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-12187: -- Attachment: todoInCatch.warnings.txt emptyCatch.warnings.txt abortInOvercatch.warnings.txt HBASE-12187.patch Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Attachments: HBASE-12187.patch, abortInOvercatch.warnings.txt, emptyCatch.warnings.txt, todoInCatch.warnings.txt Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Lets add to test-patch.sh alongside findbugs?). This issue is about going through code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14170714#comment-14170714 ] Ding Yuan commented on HBASE-12187: --- Great! I will work on it and get back to you in a few days. Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Lets add to test-patch.sh alongside findbugs?). This issue is about going through code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14170285#comment-14170285 ] Ding Yuan commented on HBASE-12187: --- Hi Stack and Andrew, It seems from the discussion of HBASE-11912 you guys think error-prone might be a better approach, then I will try to add a few checks into error-prone to catch the trivial bug patterns implemented by aspirator. I am not familiar with error-prone's AST representation, so I am not sure whether some of the checks can be easily implemented, but it shouldn't be complicated anyway. Afterward I can also try to provide a FindBugs implementation. Do you guys have your own error-prone repository or I should simply work on the one from https://code.google.com/p/error-prone/? cheers, Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Lets add to test-patch.sh alongside findbugs?). This issue is about going through code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12187) Review in source the paper Simple Testing Can Prevent Most Critical Failures
[ https://issues.apache.org/jira/browse/HBASE-12187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14168691#comment-14168691 ] Ding Yuan commented on HBASE-12187: --- Hi Stack, I'm the author of this paper you mentioned. Thanks for your attentions. I understand that it may be more convenient to build the checks implemented by aspirator into the tools you guys already use. I am very happy to help here. Do you think it would be helpful to build these checks into FindBugs or other tools you guys use? If so, I'm happy to give it a shot. Review in source the paper Simple Testing Can Prevent Most Critical Failures -- Key: HBASE-12187 URL: https://issues.apache.org/jira/browse/HBASE-12187 Project: HBase Issue Type: Bug Reporter: stack Priority: Critical Review the helpful paper https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-yuan.pdf It describes 'catastrophic failures', especially issues where exceptions are thrown but not properly handled. Their static analysis tool Aspirator turns up a bunch of the obvious offenders (Lets add to test-patch.sh alongside findbugs?). This issue is about going through code base making sub-issues to root out these and others (Don't we have the test described in figure #6 already? I thought we did? If we don't, need to add). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-10813) Possible over-catch of exceptions
[ https://issues.apache.org/jira/browse/HBASE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13965397#comment-13965397 ] Ding Yuan commented on HBASE-10813: --- Thanks all! Possible over-catch of exceptions - Key: HBASE-10813 URL: https://issues.apache.org/jira/browse/HBASE-10813 Project: HBase Issue Type: Improvement Components: regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Assignee: Ding Yuan Fix For: 0.99.0 Attachments: HBASE-10813_trunk_fixed_tests.patch, HBase-10813-trunk.patch There are a few cases found by a tool that are possibly over-catch of exceptions, especially those that will abort the server. Over-catching these exceptions may unexpectedly abort the server, and may cause problems in the future when code in the try-block evolves. I am attaching a patch against trunk that constrains the catch blocks to the exact exceptions that were thrown. My tool actually found one more case in 0.96.1, but I found it has already been fixed in trunk: {noformat} Line: 1175, File: org/apache/hadoop/hbase/master/SplitLogManager.java 1173: try { 1174: Thread.sleep(20); 1175:-} catch (Exception ignoreE) { 1175:+ } catch (InterruptedException e) { 1176: // ignore 1177: } {noformat} Any feedbacks are much appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10813) Possible over-catch of exceptions
[ https://issues.apache.org/jira/browse/HBASE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13964204#comment-13964204 ] Ding Yuan commented on HBASE-10813: --- Hi folks, any chance to commit this patch soon? Possible over-catch of exceptions - Key: HBASE-10813 URL: https://issues.apache.org/jira/browse/HBASE-10813 Project: HBase Issue Type: Improvement Components: regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Assignee: Ding Yuan Attachments: HBase-10813-trunk.patch There are a few cases found by a tool that are possibly over-catch of exceptions, especially those that will abort the server. Over-catching these exceptions may unexpectedly abort the server, and may cause problems in the future when code in the try-block evolves. I am attaching a patch against trunk that constrains the catch blocks to the exact exceptions that were thrown. My tool actually found one more case in 0.96.1, but I found it has already been fixed in trunk: {noformat} Line: 1175, File: org/apache/hadoop/hbase/master/SplitLogManager.java 1173: try { 1174: Thread.sleep(20); 1175:-} catch (Exception ignoreE) { 1175:+ } catch (InterruptedException e) { 1176: // ignore 1177: } {noformat} Any feedbacks are much appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10813) Possible over-catch of exceptions
[ https://issues.apache.org/jira/browse/HBASE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13946396#comment-13946396 ] Ding Yuan commented on HBASE-10813: --- One comment for this code block: {noformat} try { if (zk !ZKAssign.checkClosingState(server.getZooKeeper(), regionInfo, expectedVersion)){ // bad znode state return; // We're node deleting the znode, but it's not ours... } // TODO: If we need to keep updating CLOSING stamp to prevent against // a timeout if this is long-running, need to spin up a thread? if (region.close(abort) == null) { // This region got closed. Most likely due to a split. So instead // of doing the setClosedState() below, let's just ignore cont // The split message will clean up the master state. LOG.warn(Can't close region: was already closed during close(): + regionInfo.getRegionNameAsString()); return; } } catch (Throwable t) { // A throwable here indicates that we couldn't successfully flush the // memstore before closing. So, we need to abort the server and allow // the master to split our logs in order to recover the data. server.abort(Unrecoverable exception while closing region + regionInfo.getRegionNameAsString() + , still finishing close, t); throw new RuntimeException(t); } {noformat} checkClosingState could throw KeeperException, which will also abort the server. Currently my patch followed the same logic, but is it the desired logic (without system specific understanding I couldn't tell)? I'm asking because the comment of the catch block seems to suggest that the abort is to only defend against an exception in region.close(). Possible over-catch of exceptions - Key: HBASE-10813 URL: https://issues.apache.org/jira/browse/HBASE-10813 Project: HBase Issue Type: Improvement Components: regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Assignee: Ding Yuan Attachments: HBase-10813-trunk.patch There are a few cases found by a tool that are possibly over-catch of exceptions, especially those that will abort the server. Over-catching these exceptions may unexpectedly abort the server, and may cause problems in the future when code in the try-block evolves. I am attaching a patch against trunk that constrains the catch blocks to the exact exceptions that were thrown. My tool actually found one more case in 0.96.1, but I found it has already been fixed in trunk: {noformat} Line: 1175, File: org/apache/hadoop/hbase/master/SplitLogManager.java 1173: try { 1174: Thread.sleep(20); 1175:-} catch (Exception ignoreE) { 1175:+ } catch (InterruptedException e) { 1176: // ignore 1177: } {noformat} Any feedbacks are much appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-10813) Possible over-catch of exceptions
Ding Yuan created HBASE-10813: - Summary: Possible over-catch of exceptions Key: HBASE-10813 URL: https://issues.apache.org/jira/browse/HBASE-10813 Project: HBase Issue Type: Improvement Components: regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan There are a few cases found by a tool that are possibly over-catch of exceptions, especially those that will abort the server. Over-catching these exceptions may unexpectedly abort the server, and may cause problems in the future when code in the try-block evolves. I am attaching a patch against trunk that constrains the catch blocks to the exact exceptions that were thrown. My tool actually found one more case in 0.96.1, but I found it has already been fixed in trunk: {noformat} Line: 1175, File: org/apache/hadoop/hbase/master/SplitLogManager.java 1173: try { 1174: Thread.sleep(20); 1175:-} catch (Exception ignoreE) { 1175:+ } catch (InterruptedException e) { 1176: // ignore 1177: } {noformat} Any feedbacks are much appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-10813) Possible over-catch of exceptions
[ https://issues.apache.org/jira/browse/HBASE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10813: -- Attachment: HBase-10813-trunk.patch Possible over-catch of exceptions - Key: HBASE-10813 URL: https://issues.apache.org/jira/browse/HBASE-10813 Project: HBase Issue Type: Improvement Components: regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10813-trunk.patch There are a few cases found by a tool that are possibly over-catch of exceptions, especially those that will abort the server. Over-catching these exceptions may unexpectedly abort the server, and may cause problems in the future when code in the try-block evolves. I am attaching a patch against trunk that constrains the catch blocks to the exact exceptions that were thrown. My tool actually found one more case in 0.96.1, but I found it has already been fixed in trunk: {noformat} Line: 1175, File: org/apache/hadoop/hbase/master/SplitLogManager.java 1173: try { 1174: Thread.sleep(20); 1175:-} catch (Exception ignoreE) { 1175:+ } catch (InterruptedException e) { 1176: // ignore 1177: } {noformat} Any feedbacks are much appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10452) Fix potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13900558#comment-13900558 ] Ding Yuan commented on HBASE-10452: --- Great! Thank you all! Fix potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Assignee: Ding Yuan Fix For: 0.98.1, 0.99.0 Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk-v4.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even
[jira] [Updated] (HBASE-10452) Fix potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10452: -- Attachment: HBase-10452-trunk-v4.patch Attaching v4 to address Andrew's comments. Fix potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk-v4.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled
[jira] [Updated] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10452: -- Attachment: HBase-10452-trunk-v3.patch Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods in the same file: Line: 154, same file {noformat}
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13896424#comment-13896424 ] Ding Yuan commented on HBASE-10452: --- Thanks for the comments! Attached a new patch to address them. As for the possible integer overflow error from TimeRange: an IOException instead of RuntimeException is now thrown so the upper levels will deal with it. Let me know if this is fine. Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk-v3.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13890844#comment-13890844 ] Ding Yuan commented on HBASE-10452: --- Thanks Ram! Attaching a patch against trunk. I did not fix case 7 since it requires using HBCK to clear the node. I have little expertise in HBase code base and any of my attempt in this case is likely to do more harm than good. Again, any comment is much appreciated! Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } }
[jira] [Updated] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10452: -- Attachment: HBase-10452-trunk.patch Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods in the same file: Line: 154, same file {noformat} if (in != null){ try {
[jira] [Updated] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10452: -- Attachment: HBase-10452-trunk-v2.patch Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods in the same file: Line: 154, same file {noformat} if (in != null){
[jira] [Commented] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891109#comment-13891109 ] Ding Yuan commented on HBASE-10452: --- Thanks Ted! Fixed both the line wrapping and the compilation error on java 6. Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Attachments: HBase-10452-trunk-v2.patch, HBase-10452-trunk.patch Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other
[jira] [Updated] (HBASE-10452) Potential bugs in exception handlers
[ https://issues.apache.org/jira/browse/HBASE-10452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated HBASE-10452: -- Description: Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! Ding = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods in the same file: Line: 154, same file {noformat} if (in != null){ try { in.close(); } catch (IOException e) { LOG.warn(Not able to close the InputStream, e); } } {noformat} Line: 248, same file {noformat} if (in != null){ try { in.close(); } catch (IOException e) { LOG.warn(Not able to close the InputStream, e); } } {noformat} == = Case 6: empty handler for exception: java.io.IOException Line: 312, File:
[jira] [Created] (HBASE-10452) Potential bugs in exception handlers
Ding Yuan created HBASE-10452: - Summary: Potential bugs in exception handlers Key: HBASE-10452 URL: https://issues.apache.org/jira/browse/HBASE-10452 Project: HBase Issue Type: Bug Components: Client, master, regionserver, util Affects Versions: 0.96.1 Reporter: Ding Yuan Hi HBase developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in HBase are caused by bugs in exception handling logic -- that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting some of the results here. Any feedback is much appreciated! = Case 1: Line: 134, File: org/apache/hadoop/hbase/regionserver/RegionMergeRequest.java {noformat} protected void releaseTableLock() { if (this.tableLock != null) { try { this.tableLock.release(); } catch (IOException ex) { LOG.warn(Could not release the table lock, ex); //TODO: if we get here, and not abort RS, this lock will never be released } } {noformat} The lock is not released if the exception occurs, causing potential deadlock or starvation. Similar code pattern can be found at: Line: 135, File: org/apache/hadoop/hbase/regionserver/SplitRequest.java == = Case 2: Line: 252, File: org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java {noformat} try { Field fEnd = SequenceFile.Reader.class.getDeclaredField(end); fEnd.setAccessible(true); end = fEnd.getLong(this.reader); } catch(Exception e) { /* reflection fail. keep going */ } {noformat} The caught Exception seems to be too general. While reflection-related errors might be harmless, the try block can throw other exceptions including SecurityException, IllegalAccessException, etc. Currently all those exceptions are ignored. Maybe the safe way is to ignore the specific reflection-related errors while logging and handling other types of unexpected exceptions. == = Case 3: Line: 148, File: org/apache/hadoop/hbase/HBaseConfiguration.java {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (Exception e) { } {noformat} Similar to the previous case, the exception handling is too general. While ClassNotFound error might be the normal case and ignored, Class.forName can also throw other exceptions (e.g., LinkageError) under some unexpected and rare error cases. If that happens, the error will be lost. So maybe change it to below: {noformat} try { if (Class.forName(org.apache.hadoop.conf.ConfServlet) != null) { isShowConf = true; } } catch (LinkageError e) { LOG.warn(..); // handle linkage error } catch (ExceptionInInitializerError e) { LOG.warn(..); // handle Initializer error } catch (ClassNotFoundException e) { LOG.debug(..); // ignore } {noformat} == = Case 4: Line: 163, File: org/apache/hadoop/hbase/client/Get.java {noformat} public Get setTimeStamp(long timestamp) { try { tr = new TimeRange(timestamp, timestamp+1); } catch(IOException e) { // Will never happen } return this; } {noformat} Even if the IOException never happens right now, is it possible to happen in the future due to code change? At least there should be a log message. The current behavior is dangerous since if the exception ever happens in any unexpected scenario, it will be silently swallowed. Similar code pattern can be found at: Line: 300, File: org/apache/hadoop/hbase/client/Scan.java == = Case 5: Line: 207, File: org/apache/hadoop/hbase/util/JVM.java {noformat} if (input != null){ try { input.close(); } catch (IOException ignored) { } } {noformat} Any exception encountered in close is completely ignored, not even logged. In particular, the same exception scenario was handled differently in other methods in the same file: Line: 154, same file {noformat} if (in != null){ try { in.close(); } catch (IOException e) { LOG.warn(Not able to close the InputStream, e); } } {noformat} Line: 248, same file {noformat} if (in != null){ try { in.close(); } catch (IOException e) { LOG.warn(Not able to close the InputStream, e); } } {noformat}