Hi Becket,

Thank you for the KIP. A few comments:

1. The KIP says: "*No public interface changes are needed. We only propose
behavior change on the broker side.*"

But from the proposed changes, it sounds like clients will be updated to
wait for the throttle time before sending the next request, and also to skip
idle-disconnection handling during that time. Doesn't that mean that clients
need to know that the broker has sent the response before throttling, which
would require a protocol/version change?
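
A minimal sketch of the client-side behaviour the proposed changes seem to
imply, in Python for brevity. The class and method names are hypothetical and
are not Kafka client APIs; the sketch only assumes that the response carries a
throttle time in milliseconds and that the client must hold back its next
request on its own.

```python
class ThrottledConnection:
    """Hypothetical per-broker connection state kept by a client."""

    def __init__(self):
        # Earliest time (ms) at which the next request may be sent.
        self.next_send_ms = 0

    def on_response(self, now_ms, throttle_time_ms):
        # Assumption: the broker has already sent this response, so the
        # client enforces the throttle locally by delaying its next send.
        self.next_send_ms = max(self.next_send_ms, now_ms + throttle_time_ms)

    def can_send(self, now_ms):
        return now_ms >= self.next_send_ms
```

A client that does not know the response was sent before throttling has no
reason to apply this delay, which is why the version question matters.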


2. At the moment, broker failures are detected by clients (and vice versa)
within connections.max.idle.ms. If this check is skipped for an unbounded
throttle time, failure detection could be delayed for as long as the
throttle lasts.


3. The KIP says: "*Since this subsequent request is not actually handled until
the broker unmutes the channel, the client can hit request.timeout.ms and
reconnect. However, this is no worse than the current state.*"

I think this could be worse than the current state because the broker doesn't
detect and close the channel for an unbounded throttle time, while new
connections will still get accepted. As a result, a lot of connections could
be in CLOSE_WAIT state when the throttle time is high.


Perhaps it is better to combine this KIP with a bound on throttle time?
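
As a rough illustration of such a bound (a Python sketch; the function name
and the cap parameter are assumptions, not something the KIP defines), the
broker would clamp the quota-derived throttle time before applying it:

```python
def bounded_throttle_ms(computed_throttle_ms, max_throttle_ms):
    """Clamp a quota-derived throttle time to a configured upper bound.

    Both arguments are in milliseconds. max_throttle_ms is a hypothetical
    setting; choosing it below connections.max.idle.ms would keep idle
    detection effective while a channel is throttled.
    """
    return max(0, min(computed_throttle_ms, max_throttle_ms))
```

With a cap of 300000 ms, for example, a computed throttle of 900000 ms would
be applied as 300000 ms, while shorter throttle times pass through unchanged.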


Regards,


Rajini


On Fri, Nov 3, 2017 at 8:09 PM, Becket Qin <becket....@gmail.com> wrote:

> Thanks for the comment, Jun,
>
> 1. Yes, you are right. This could also happen with the current quota
> mechanism because we are essentially muting the socket during throttle
> time. There might be two ways to solve this.
> A) Use another socket to send heartbeats.
> B) Let the GroupCoordinator know that the client will not heartbeat for
> some time.
> It seems the first solution is cleaner.
>
> 2. For the consumer, returning an empty response seems to be the better
> option. In the producer case, if there is a spike in traffic, the brokers
> will see queued-up requests, but that is hard to avoid unless we have a
> connection-level quota, which is a bigger change and may be easier to
> discuss in a separate KIP.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Fri, Nov 3, 2017 at 10:28 AM, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Jiangjie,
> >
> > Thanks for bringing this up. A couple of quick thoughts.
> >
> > 1. If the throttle time is large, what can happen is that a consumer
> > won't be able to heartbeat to the group coordinator frequently enough.
> > In that case, even with this KIP, it seems there could be frequent
> > consumer group rebalances.
> >
> > 2. If we return a response immediately, for the consumer, do we return
> > the requested data or an empty response? If we do the former, it may not
> > protect against the case when there are multiple consumer instances
> > associated with the same user/clientid.
> >
> > Jun
> >
> > On Wed, Nov 1, 2017 at 9:53 AM, Becket Qin <becket....@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > We would like to start the discussion on KIP-219.
> > >
> > > The KIP tries to improve the communication of quota throttling time
> > > between brokers and clients, to avoid client timeouts in case of a
> > > long throttling time.
> > >
> > > The KIP link is as follows:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-219+-+Improve+quota+communication
> > >
> > > Comments are welcome.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>
