In my opinion a lot of kafka configuration options were added using the
"minimal diff" approach, which results in very nuanced and complicated
configs required to indirectly achieve some goal. case in point - timeouts.

The goal here is to control the memory requirement. the 1st config was max
size of a single request, now the proposal is to control the number of
those in flight - which is inaccurate (you dont know the actual size and
must over-estimate), would have an impact on throughput in case of
over-estimation, and also fails to completely achieve the goal (what about
decompression?)

I think a memory pool in combination with Jay's proposal to only pick up
from socket conditionally when memory is available is the correct approach
- it deals with the problem directly and would result in a simler and more
understandable configuration (a single property for max memory consumption).

in the future the accuracy of the limit can be improved by, for example,
declaring both the compressed _AND UNCOMPRESSED_ sizes up front, so that we
can pick up from socket when we have enough memory to decompress as well -
this would obviously be a wire format change and outside the scope here,
but my point is that it could be done without adding any new configs)

On Mon, Oct 31, 2016 at 10:25 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

> Agreed with this approach.
> One detail to be wary of is that since we multiplex various other requests
> (e.g., heartbeats, offset commits, metadata, etc.) over the client that
> connects to the coordinator this could delay some of these critical
> requests. Realistically I don't think it will be an issue except in extreme
> scenarios where someone sets the memory limit to be unreasonably low.
>
> Thanks,
>
> Joel
>
> On Sun, Oct 30, 2016 at 12:32 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Hi, Mickael,
> >
> > I agree with others that it's better to be able to control the bytes the
> > consumer can read from sockets, instead of limiting the fetch requests.
> > KIP-72 has a proposal to bound the memory size at the socket selector
> > level. Perhaps that can be leveraged in this KIP too.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 72%3A+Allow+putting+a+bound+on+memory+consumed+by+Incoming+requests
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Oct 27, 2016 at 3:23 PM, Jay Kreps <j...@confluent.io> wrote:
> >
> > > This is a good observation on limiting total memory usage. If I
> > understand
> > > the proposal I think it is that the consumer client would stop sending
> > > fetch requests once a certain number of in-flight fetch requests is
> met.
> > I
> > > think a better approach would be to always issue one fetch request to
> > each
> > > broker immediately, allow the server to process that request, and send
> > data
> > > back to the local machine where it would be stored in the socket buffer
> > (up
> > > to that buffer size). Instead of throttling the requests sent, the
> > consumer
> > > should ideally throttle the responses read from the socket buffer at
> any
> > > given time. That is, in a single poll call, rather than reading from
> > every
> > > single socket it should just read until it has a given amount of memory
> > > used then bail out early. It can come back and read more from the other
> > > sockets after those messages are processed.
> > >
> > > The advantage of this approach is that you don't incur the additional
> > > latency.
> > >
> > > -Jay
> > >
> > > On Mon, Oct 10, 2016 at 6:41 AM, Mickael Maison <
> > mickael.mai...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to discuss the following KIP proposal:
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 81%3A+Max+in-flight+fetches
> > > >
> > > >
> > > > Feedback and comments are welcome.
> > > > Thanks !
> > > >
> > > > Mickael
> > > >
> > >
> >
>

Reply via email to