Hi Satish,

> So that means a consumer which gets behind by half an hour will find its
> reads being served from remote storage. And, if I understand the proposed
> algorithm, each such consumer fetch request could result in a separate
> fetch request from the remote storage. I.e. there's no mechanism to
> amortize the cost of the fetching between multiple consumers fetching
> similar ranges?
>
> local log segments are deleted according to the local
> log.retention.time/.size settings though they may have been already
> copied to remote storage. Consumers would still be able to fetch the
> messages from local storage if they are not yet deleted based on the
> retention. They will be served from remote storage only when they are
> not locally available.

Thanks, I missed that point. However, there's still a point at which the consumer's fetches start being served from remote storage (even if that point isn't as soon as the local log retention time/size). This represents a kind of performance cliff edge, and what I'm really interested in is how easy it is for a consumer which falls off that cliff to catch up, so that its fetches again come from local storage. Obviously this depends on all sorts of factors (like production rate and consumption rate), so it's not guaranteed (just as it's not guaranteed for Kafka today), but this would represent a new failure mode.

Another aspect I'd like to understand better is the effect that serving fetch requests from remote storage has on the broker's network utilization. If we're just trimming the amount of data held locally (without increasing the overall local+remote retention), then we're effectively trading disk bandwidth for network bandwidth when serving fetch requests from remote storage (which I understand to be a good thing, since brokers are often, if not usually, disk-bound). But if we're increasing the overall local+remote retention, then it's more likely that the network itself becomes the bottleneck.

I appreciate this is all rather hand-wavy. I'm just trying to understand how this would affect broker performance, so I'd be grateful for any insights you can offer.

Cheers,

Tom
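
To make the catch-up question above a little more concrete, here is a back-of-the-envelope sketch in Python. It is not part of the proposal. It assumes constant production and consumption rates and steady-state, size-based local retention, so the boundary between remote and local data advances at the production rate. The function name and all parameters are hypothetical.

```python
from typing import Optional


def seconds_until_local(
    lag_bytes: float,
    local_bytes: float,
    produce_rate: float,
    consume_rate: float,
) -> Optional[float]:
    """Estimate how long a lagging consumer keeps reading from remote storage.

    lag_bytes:    how far the consumer is behind the log end (assumed known)
    local_bytes:  amount of recent data retained locally (assumed constant)
    produce_rate: bytes/second appended to the partition (assumed constant)
    consume_rate: bytes/second the consumer achieves while reading from
                  remote storage (assumed constant)

    Returns 0.0 if the consumer is already within local data, None if it
    never catches up under these assumptions, otherwise the time in seconds.
    """
    # Data the consumer must get through before it reaches local segments.
    remote_backlog = lag_bytes - local_bytes
    if remote_backlog <= 0:
        return 0.0

    # The consumer advances at consume_rate while the local/remote boundary
    # advances at produce_rate, so the gap closes at the difference.
    net_rate = consume_rate - produce_rate
    if net_rate <= 0:
        return None

    return remote_backlog / net_rate
```

For example, a consumer 100 GB behind, with 40 GB held locally, producers writing 50 MB/s, and remote reads sustaining 80 MB/s would need 60 GB / 30 MB/s = 2000 seconds to get back onto local storage. If remote reads are slower than production, the consumer never gets back, which is the failure mode described above.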