FineAndDandy commented on a change in pull request #1818:
URL: https://github.com/apache/accumulo/pull/1818#discussion_r556632737
##########
File path:
core/src/main/java/org/apache/accumulo/core/util/threads/AccumuloUncaughtExceptionHandler.java
##########
@@ -16,20 +16,36 @@
* specific language governing permissions and limitations
* under the License.
*/
-package org.apache.accumulo.core.util;
+package org.apache.accumulo.core.util.threads;
import java.lang.Thread.UncaughtExceptionHandler;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
-public class AccumuloUncaughtExceptionHandler implements
UncaughtExceptionHandler {
+/**
+ * UncaughtExceptionHandler that logs all Exceptions and Errors thrown from a
Thread. If an Error is
+ * thrown, halt the JVM.
+ *
+ */
+class AccumuloUncaughtExceptionHandler implements UncaughtExceptionHandler {
- private static final Logger log =
LoggerFactory.getLogger(AccumuloUncaughtExceptionHandler.class);
+ private static final Logger LOG =
LoggerFactory.getLogger(AccumuloUncaughtExceptionHandler.class);
@Override
public void uncaughtException(Thread t, Throwable e) {
- log.error(String.format("Caught an exception in %s. Shutting down.", t),
e);
+ if (e instanceof Exception) {
+ LOG.error("Caught an Exception in {}. Thread is dead.", t, e);
+ } else if (e instanceof Error) {
+ try {
+ e.printStackTrace();
+ System.err.println("Error thrown in thread: " + t + ", halting VM.");
+ } catch (Throwable e1) {
+ // If e == OutOfMemoryError, then it's probably that another Error
might be
+ // thrown when trying to print to System.err.
+ } finally {
+ Runtime.getRuntime().halt(-1);
Review comment:
I would argue that there are potential subclasses of Error, that when
appearing in a Scan thread should simply log and fail the scan, not halt the
entire tserver. It's easy to agree that an OutOfMemoryError warrants halting,
but what about a miss-configured context that leads to a
UnsupportedClassVersionError or an IOError on a particular file? Particularly
the scan threads need to have more flexibility since they are not necessarily
running core code. One bad iterator/edge case should not be able to take a
cluster down. If this is asking too much than having a configurable list of
classes/subclasses that result in a halt would feel more comfortable.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]