Hi Ivan,

I've opened a relevant patch to increase the timeout to fetch data from
remote storage. This will reduce the error occurrence rate:

https://github.com/apache/kafka/pull/14778

I think it's better to distinguish the cases:

1) Any error that happens while reading the remote data -- this should be
logged at ERROR level and the failedRemoteReadRequestRate metric should
reflect it.
2) When the task is cancelled due to timeout, then we can log it in the
debug level and skip updating the metrics.

On Fri, Nov 17, 2023 at 6:11 PM Ivan Yurchenko <i...@ivanyu.me> wrote:

> Hello!
>
> `RemoteStorageManager.fetchLogSegment` is called in a background thread by
> the broker [1]. When a fetch request times out, the associated Future is
> cancelled [2] and the thread is interrupted. If the InterruptedException is
> propagated from the `RemoteStorageManager`, it pollutes the broker logs
> with not very useful exceptions and also metrics with false errors [3]. It
> would be good to deal with this somehow. I'm see two options:
>
> 1. Add a `catch` to handle `InterruptedException` here [4] in a less noisy
> way. Maybe log on the debug level + not increase the error metrics.
> 2. Catch `InterruptedException` in the `RemoteStorageManager`
> implementation and return an empty `InputStream`. It seems safe because it
> goes this way [5]. In this case, this needs to be documented in the Javadoc.
>
> Personally, the first seems a better way. What does the community think
> about this? Regardless of the decision, I volunteer to make a PR (and a KIP
> if needed).
> Thank you!
>
> Best,
> Ivan
>
> [1]
> https://github.com/apache/kafka/blob/61fb83e20280ed3ac83de8290dd9815bdb7efcea/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1457-L1459
> [2]
> https://github.com/apache/kafka/blob/6f197301646135e0bb39a461ca0a07c09c3185fb/core/src/main/scala/kafka/server/DelayedRemoteFetch.scala#L79-L83
> [3]
> https://github.com/apache/kafka/blob/8aaf7daff393f2b26438fb7fe28016e06e23558c/core/src/main/java/kafka/log/remote/RemoteLogReader.java#L68-L72
> [4]
> https://github.com/apache/kafka/blob/8aaf7daff393f2b26438fb7fe28016e06e23558c/core/src/main/java/kafka/log/remote/RemoteLogReader.java#L68
> [5]
> https://github.com/apache/kafka/blob/61fb83e20280ed3ac83de8290dd9815bdb7efcea/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1276-L1278
>
>

Reply via email to