Hi Satish,

> So that means a consumer which gets behind by half an hour will find its
> reads being served from remote storage. And, if I understand the proposed
> algorithm, each such consumer fetch request could result in a separate
> fetch request from the remote storage. I.e. there's no mechanism to
> amortize the cost of the fetching between multiple consumers fetching
> similar ranges?
>
> local log segments are deleted according to the local
> log.retention.time/.size settings though they may have been already
> copied to remote storage. Consumers would still be able to fetch the
> messages from local storage if they are not yet deleted based on the
> retention. They will be served from remote storage only when they are
> not locally available.
>

Thanks, I missed that point. However, there's still a point at which the
consumer fetches start getting served from remote storage (even if that
point isn't as soon as the local log retention time/size). This represents
a kind of performance cliff edge, and what I'm really interested in is how
easy it is for a consumer which falls off that cliff to catch up, so that its
fetches again come from local storage. Obviously this can depend on all
sorts of factors (like production rate, consumption rate), so it's not
guaranteed (just like it's not guaranteed for Kafka today), but this would
represent a new failure mode.
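
To make the catch-up question a bit more concrete, here is a rough
back-of-envelope sketch. It is my own simplification, not anything taken
from the proposal: the function name, its parameters, and the model itself
are hypothetical. It assumes the amount of locally retained data stays
roughly constant, that the producer writes at a steady rate, and that the
lagging consumer reads from remote storage at a steady rate.

```python
import math


def catch_up_seconds(lag_bytes, local_retained_bytes,
                     produce_rate, remote_fetch_rate):
    """Estimate seconds until a lagging consumer is served locally again.

    Hypothetical model (an assumption, not the proposed algorithm):
    the consumer's lag shrinks at (remote_fetch_rate - produce_rate)
    bytes per second, and its fetches come from local storage once the
    lag is no larger than the data retained locally.
    """
    if lag_bytes <= local_retained_bytes:
        # Already within the locally retained range.
        return 0.0
    net_rate = remote_fetch_rate - produce_rate
    if net_rate <= 0:
        # The consumer never closes the gap: it stays on remote storage.
        return math.inf
    return (lag_bytes - local_retained_bytes) / net_rate
```

Under this model the consumer only climbs back up the cliff if it can read
from remote storage faster than the producer writes, and the time taken
grows with how far past the local retention boundary it has fallen.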

Another aspect I'd like to understand better is the effect that serving fetch
requests from remote storage has on the broker's network utilization. If
we're just trimming the amount of data held locally (without increasing the
overall local+remote retention), then we're effectively trading disk
bandwidth for network bandwidth when serving fetch requests from remote
storage (which I understand to be a good thing, since brokers are
often/usually disk bound). But if we're increasing the overall local+remote
retention then it's more likely that the network itself becomes the bottleneck.
I appreciate this is all rather hand-wavy; I'm just trying to understand
how this would affect broker performance, so I'd be grateful for any
insights you can offer.

Cheers,

Tom
