[
https://issues.apache.org/jira/browse/KAFKA-19724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023224#comment-18023224
]
Matthias J. Sax commented on KAFKA-19724:
-----------------------------------------
[~mikkolaj], yes, this aligns to what I said earlier:
https://issues.apache.org/jira/browse/KAFKA-19724?focusedCommentId=18021744&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-18021744
– catching Throwable instead of Exception should(*) address the issue.
(*) The concern about catching Throwable is totally valid; no disagreement. We
do it in Kafka Streams basically as a "best effort" approach, and are well
aware that it might not always work.
{quote}Java's {{UncaughtExceptionHandler}} seems to me like the last line of
defense, where user could try to detect these problems and take action like
shutting down the app
{quote}
I am not deep enough into JVM details, but I am wondering if this is actually
guaranteed? After a Throwable, the JVM is in "very bad state" from my
understanding, and I am not sure what guarantees there are that the handler is
invoked, and what code would be guaranteed to get executed inside the handler?
That we set the handler as a no-op, is clearly a bug (as said above already):
We actually set the handler first
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L467-L471]
and incorrectly set it as no-op right afterwards.
{quote}Catching {{Throwable}} should probably handle most of the cases just
fine.
{quote}
Yes, that's why we do is this way.
[~fatihcelik] – yes. But inside `initialize()` as also pointed out by
[~mikkolaj], we would catch `Throwable` here:
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStreamThread.java#L434]
to address it.
So easy fix: Change
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/GlobalStreamThread.java#L434]
from `Exception` to `Throwable` and remove
[https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L475-L478]
Additionally, we could consider to also mess with
`Thread#setUncaughtExceptionHandler()` (note, that both `StreamsThread` and
`GlobalThread`, and neither the new `StateUpdatedThread` use this method – the
methods we discussed above, are overload from KS which work with
`StreamsUncaughtExceptionHandler`, not the Java
`Thread.UncaughtExceptionHandler`), but I am not sure if this would buy us
much? The tricky part would be, that `StreamsThreads`, `StateUpdaterThread` and
`GlobalStreamThread` are background threads and are not exposed to the user,
and we also would not want to add new public API to avoid confusion (having
`KafkaStreams#setUncaughtExceptionHandler(StreamsUncaughtExceptionHandler)` and
`KafkaStreams#setUncaughtExceptionHandler(Thread.UncaughExceptionHandler)`
would be very weird). – But we could either register our own KS specific
handler, or use `Thread.currentThread().getUncaughExceptionHandler` to get
whatever handler is set on the main thread, and add it to background threads.
But as said already, not sure if this would help much – but I also don't think
it would cause any issues? – Also not sure if this is necessary as using
`Thread.setDefaultUncaughtExceptionHandler()` should just do the trick?
> Global stream thread ignores all exceptions
> -------------------------------------------
>
> Key: KAFKA-19724
> URL: https://issues.apache.org/jira/browse/KAFKA-19724
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 3.4.0
> Reporter: Mikołaj Bul
> Assignee: Fatih Celik
> Priority: Major
> Labels: beginner, newbie
>
> {{globalStreamThread}} in {{KafkaStreams}} class ignores all exceptions and
> fails to apply the user-provided {{StreamsUncaughtExceptionHandler}} in:
> {code:java}
> public void setUncaughtExceptionHandler(final StreamsUncaughtExceptionHandler
> userStreamsUncaughtExceptionHandler)
> {code}
> This can lead to streams being stuck in faulty state after an exception is
> thrown during their initialization phase (e.g. failure to load the RocksDB
> native library). The exception isn't logged, so debugging such problem is
> difficult.
> From my understanding of the {{b62d8b97}} commit message, the following code
> is unnecessary as the {{globalStreamThread}} can't be replaced and the issue
> description from KAFKA-12699 doesn't apply to it. Removing it should help.
> {code:java}
> if (globalStreamThread != null) {
> globalStreamThread.setUncaughtExceptionHandler((t, e) -> { }
> );
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)