FineAndDandy commented on a change in pull request #1818:
URL: https://github.com/apache/accumulo/pull/1818#discussion_r556632737



##########
File path: 
core/src/main/java/org/apache/accumulo/core/util/threads/AccumuloUncaughtExceptionHandler.java
##########
@@ -16,20 +16,36 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-package org.apache.accumulo.core.util;
+package org.apache.accumulo.core.util.threads;
 
 import java.lang.Thread.UncaughtExceptionHandler;
 
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-public class AccumuloUncaughtExceptionHandler implements 
UncaughtExceptionHandler {
+/**
+ * UncaughtExceptionHandler that logs all Exceptions and Errors thrown from a 
Thread. If an Error is
+ * thrown, halt the JVM.
+ *
+ */
+class AccumuloUncaughtExceptionHandler implements UncaughtExceptionHandler {
 
-  private static final Logger log = 
LoggerFactory.getLogger(AccumuloUncaughtExceptionHandler.class);
+  private static final Logger LOG = 
LoggerFactory.getLogger(AccumuloUncaughtExceptionHandler.class);
 
   @Override
   public void uncaughtException(Thread t, Throwable e) {
-    log.error(String.format("Caught an exception in %s.  Shutting down.", t), 
e);
+    if (e instanceof Exception) {
+      LOG.error("Caught an Exception in {}. Thread is dead.", t, e);
+    } else if (e instanceof Error) {
+      try {
+        e.printStackTrace();
+        System.err.println("Error thrown in thread: " + t + ", halting VM.");
+      } catch (Throwable e1) {
+        // If e == OutOfMemoryError, then it's probably that another Error 
might be
+        // thrown when trying to print to System.err.
+      } finally {
+        Runtime.getRuntime().halt(-1);

Review comment:
       I would argue that there are potential subclasses of Error, that when 
appearing in a Scan thread should simply log and fail the scan, not halt the 
entire tserver. It's easy to agree that an OutOfMemoryError warrants halting, 
but what about a miss-configured context that leads to a 
UnsupportedClassVersionError or an IOError on a particular file? Particularly 
the scan threads need to have more flexibility since they are not necessarily 
running core code. One bad iterator/edge case should not be able to take a 
cluster down. If this is asking too much than having a configurable list of 
classes/subclasses that result in a halt would feel more comfortable. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to