Hi David,

Thanks for the KIP.

I thought about this for a while and I actually think this approach is not 
quite right.  The problem I see is that an explicitly set quota requires 
careful tuning by the cluster operator.  Even worse, that tuning might be 
invalidated by changes in overall conditions, or even by more efficient 
controller software.

For example, if we empirically find that the controller can do 1000 topics in a 
minute (or whatever), this tuning might actually be wrong if the next version 
of the software can do 2000 topics in a minute because of efficiency upgrades.  
Or, the broker that the controller is located on might be experiencing heavy 
load from its non-controller operations, and so it can only do 500 topics in a 
minute during this period.

So the system administrator gets a very obscure tunable (it's not clear to a 
non-Kafka-developer what "controller mutations" are or why they should care).  
They will also have to significantly "sandbag" the value they set, so that 
even under the heaviest load and the oldest deployed version of the software, 
the controller can still function.  And on top of all that, this new quota 
adds a lot of complexity to the controller.

What we really want is backpressure when the controller is overloaded.  I 
believe this is the alternative you discuss in "Rejected Alternatives" under 
"Throttle the Execution instead of the Admission".  Your reason for rejecting 
it is that client error handling does not work well in this case.  But 
actually, that is an artifact of our current implementation, rather than a 
fundamental issue with backpressure.

Consider the example of a CreateTopicsRequest.  The controller could return a 
special error code if the load is too high, and take the create topics event 
off the controller queue.  Let's call that error code BUSY.  Additionally, the 
controller could immediately refuse new events once the queue has reached its 
maximum length, and simply return BUSY in that case as well.
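To make the idea concrete, here is a minimal sketch of admission-time
backpressure.  The names (ControllerEvent, Errors.BUSY, EventQueueSketch) and
the BUSY error code itself are hypothetical illustrations, not existing Kafka
classes or protocol codes:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: a bounded controller event queue that rejects new events
// with a hypothetical BUSY error once it reaches its maximum length.
public class EventQueueSketch {
    enum Errors { NONE, BUSY }

    static class ControllerEvent {
        final String name;
        ControllerEvent(String name) { this.name = name; }
    }

    private final BlockingQueue<ControllerEvent> queue;

    EventQueueSketch(int maxLength) {
        this.queue = new ArrayBlockingQueue<>(maxLength);
    }

    // Admit the event only if the queue has room; otherwise return BUSY
    // immediately, so the client can back off and retry later.
    Errors tryEnqueue(ControllerEvent event) {
        return queue.offer(event) ? Errors.NONE : Errors.BUSY;
    }

    public static void main(String[] args) {
        EventQueueSketch q = new EventQueueSketch(1);
        System.out.println(q.tryEnqueue(new ControllerEvent("createTopics A"))); // NONE
        System.out.println(q.tryEnqueue(new ControllerEvent("createTopics B"))); // BUSY
    }
}
```

The point is that the client gets an immediate, retriable signal instead of a
timeout, and the controller sheds load instead of accumulating it.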

Basically, the way we handle RPC timeouts in the controller right now is not 
very good.  As you know, we time out the RPC, so the client gets a 
TimeoutException, but we then keep the event on the queue, so that it 
eventually gets executed anyway!  There's no reason we have to do that.  We 
could take the event off the queue when there is a timeout.  This would reduce 
load and mostly avoid the paradoxical situations you describe (getting 
TopicExistsException on a CreateTopicsRequest retry, etc.).
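A sketch of what dropping timed-out events could look like.  The Event class
and its deadline field are hypothetical, not the actual controller code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: skip events whose RPC deadline has already passed, instead
// of executing them after the client has given up.
public class DeadlineQueueSketch {
    static class Event {
        final String name;
        final long deadlineMs;  // absolute deadline for the originating RPC
        Event(String name, long deadlineMs) {
            this.name = name;
            this.deadlineMs = deadlineMs;
        }
    }

    private final Deque<Event> queue = new ArrayDeque<>();

    void enqueue(Event e) { queue.add(e); }

    // Return the next event that is still within its deadline.  Expired
    // events are discarded: their client already got TimeoutException, so
    // executing them anyway is what leads to surprises like
    // TopicExistsException on a retry.
    Event pollLive(long nowMs) {
        Event e;
        while ((e = queue.poll()) != null) {
            if (nowMs < e.deadlineMs) {
                return e;  // still live: execute it
            }
            // expired: drop it without executing
        }
        return null;
    }
}
```

With something like this, a timed-out CreateTopicsRequest would simply never
run, so the retry would not collide with its own earlier attempt.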

I say "mostly" because there are still cases where retries could go astray (for 
example if we execute the topic creation but a network problem prevents the 
response from being sent to the client).  But this would still be a very big 
improvement over what we have now.

Sorry for commenting so late on this, but I got distracted by some other 
things.  I hope we can figure this one out -- I feel like there is a chance to 
significantly simplify this.

best,
Colin


On Fri, May 29, 2020, at 07:57, David Jacot wrote:
> Hi folks,
> 
> I'd like to start the vote for KIP-599 which proposes a new quota to
> throttle create topic, create partition, and delete topics operations to
> protect the Kafka controller:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-599%3A+Throttle+Create+Topic%2C+Create+Partition+and+Delete+Topic+Operations
> 
> Please, let me know what you think.
> 
> Cheers,
> David
>