We have seen clusters with a few thousand topics showing CPU usage of 20 percent or more while idle. This may be due to these fetch requests. It seems your KIP would greatly improve scalability with respect to (dormant) partitions, so I'm excited for this change.

On 22 Nov. 2017 10:01 pm, "Mickael Maison" <mickael.mai...@gmail.com> wrote:

That's an interesting idea. In our clusters, we definitively feel the cost of unused partitions and I think it's one of these areas where Kafka could improve.

On Wed, Nov 22, 2017 at 6:11 AM, Jun Rao <j...@confluent.io> wrote:
> Hi, Jay,
>
> I guess in your proposal the leader has to cache the last offset given back
> for each partition so that it knows from which offset to serve the next
> fetch request. This is doable but it means that the leader needs to do an
> additional index lookup per partition to serve a fetch request. Not sure if
> the benefit from the lighter fetch request obviously offsets the additional
> index lookup though.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 21, 2017 at 7:03 PM, Jay Kreps <j...@confluent.io> wrote:
>
>> I think the general thrust of this makes a ton of sense.
>>
>> I don't love that we're introducing a second type of fetch request. I think
>> the motivation is for compatibility, right? But isn't that what versioning
>> is for? Basically to me although the modification we're making makes sense,
>> the resulting protocol doesn't really seem like something you would design
>> this way from scratch.
>>
>> I think I may be misunderstanding the semantics of the partitions in
>> IncrementalFetchRequest. I think the intention is that the server remembers
>> the partitions you last requested, and the partitions you specify in the
>> request are added to this set. This is a bit odd though because you can add
>> partitions but I don't see how you remove them, so it doesn't really let
>> you fully make changes incrementally. I suspect I'm misunderstanding that
>> somehow, though. You'd also need to be a little bit careful that there was
>> no way for the server's idea of what the client is interested in and the
>> client's idea to ever diverge as you made these modifications over time
>> (due to bugs or whatever).
>>
>> It seems like an alternative would be to not add a second request, but
>> instead change the fetch api and implementation
>>
>> 1. We save the partitions you last fetched on that connection in the
>> session for the connection (as I think you are proposing)
>> 2. It only gives you back info on partitions that have data or have
>> changed (no reason you need the others, right?)
>> 3. Not specifying any partitions means "give me the usual", as defined
>> by whatever you requested before attached to the session.
>>
>> This would be a new version of the fetch API, so compatibility would be
>> retained by retaining the older version as is.
>>
>> This seems conceptually simpler to me. It's true that you have to resend
>> the full set whenever you want to change it, but that actually seems less
>> error prone and that should be rare.
>>
>> I suspect you guys thought about this and it doesn't quite work, but maybe
>> you could explain why?
>>
>> -Jay
>>
>> On Tue, Nov 21, 2017 at 1:02 PM, Colin McCabe <cmcc...@apache.org> wrote:
>>
>> > Hi all,
>> >
>> > I created a KIP to improve the scalability and latency of FetchRequest:
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
>> >
>> > Please take a look.
>> >
>> > cheers,
>> > Colin
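
For anyone following along, here is how I read Jay's three-point alternative quoted above, as a rough Python sketch. This is only a toy model under my own assumptions, not Kafka's broker code or the KIP's design: `FetchSession`, `Leader`, and the in-memory `logs` dict are hypothetical names, and a partition is just a string key. The per-session offset map also stands in for the "last offset given back" cache that Jun mentions; the index lookup cost he raises is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FetchSession:
    """Hypothetical per-connection state: partition -> next offset to serve."""
    next_offset: Dict[str, int] = field(default_factory=dict)


class Leader:
    """Toy leader holding an in-memory log per partition (an assumption)."""

    def __init__(self, logs: Dict[str, List[str]]):
        self.logs = logs

    def fetch(self, session: FetchSession,
              requested: Optional[Dict[str, int]] = None) -> Dict[str, List[str]]:
        # Point 1: a non-empty request replaces the full set saved in the session.
        # Point 3: an empty request means "give me the usual" from the session.
        if requested:
            session.next_offset = dict(requested)

        response: Dict[str, List[str]] = {}
        for partition, offset in list(session.next_offset.items()):
            records = self.logs.get(partition, [])[offset:]
            # Point 2: only partitions that have new data appear in the response.
            if records:
                response[partition] = records
                # Remember where the next fetch for this partition should start.
                session.next_offset[partition] = offset + len(records)
        return response


leader = Leader({"topic-0": ["a", "b"], "topic-1": []})
session = FetchSession()

first = leader.fetch(session, {"topic-0": 0, "topic-1": 0})  # full request
second = leader.fetch(session)                               # nothing new
leader.logs["topic-1"].append("c")
third = leader.fetch(session)                                # only topic-1
```

In this model the dormant `topic-1` costs nothing on the wire until it receives data, which is the effect we are hoping for. Changing the partition set means resending the whole set, as Jay describes.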