Hi all,

I have discussed with Magnus about the various options to get his view from
a librdkafka perspective
and he has suggested a good alternative.

It seems we could continue with the idea to use the
ApiVersionsRequest/Response but we a different
failing back strategy. When a broker get an unknown ApiVersionsRequest, it
could continue to fail back
to version 0 as today but instead of sending back the UNSUPPORTED_VERSION
error alone in the
response, it could also provide at minimum the supported version of the
ApiVersionsRequest. This way,
a recent client could leverage that information when the error is received
and send the correct version
to the broker instead of failing all the way back to version 0.

This way, we can evolve the ApiVersionsRequest while keeping forward
compatibility of the Request
Header. It doesn't add any extra round trip.

Knowing this, I think that putting the information in the
ApiVersionsRequest remains the best option.

What do you think?

Best,
David

On Tue, Sep 10, 2019 at 1:00 AM Colin McCabe <cmcc...@apache.org> wrote:

> Hi all,
>
> I agree that freezing the request header is not very appealing.  We might
> want to add something there later.
>
> Having a separate request type is very clean, but as David mentioned, it
> does add an extra request/response cycle to the time required to get a
> connection into a usable state.
>
> One solution to consider is adding the clientVersion and clientType to the
> request header as optional (tagged) fields.  This would let us skip the
> extra round trip.  I don't think it's that much more messy than having a
> separate request type to set the client version and type.  In both cases,
> you have to handle connections that set the version later than others, or
> don't set the version at all (for compatibility).  So the version/type has
> to be mutable and added after the TCP connection itself is established.
>
> best,
> Colin
>
>
> On Mon, Sep 9, 2019, at 06:11, David Jacot wrote:
> > Hi Gwen,
> >
> > The reasoning behind having the information before the Sasl
> authentication
> > was to have the information for troubleshooting purposes. For instance,
> when
> > there are errors in the handshake, it would be great to know if it comes
> > from
> > a specific buggy clients. I think we could live without though but was
> nice
> > to
> > have.
> >
> > Yeah. I agree with you. It seems that evolution of the RequestHeader (RH)
> > and the
> > ApiVersionsRequest (AVR) is conjunction is more complicated than I
> > anticipated.
> > The evolution if the AVR alone works well but it would make the evolution
> > of the
> > RH hard in the future. Please, check my other email to see my thoughts.
> >
> > Best,
> > David
> >
> > On Mon, Sep 9, 2019 at 3:18 AM Gwen Shapira <g...@confluent.io> wrote:
> >
> > > Hey,
> > >
> > > Since modifying ApiVersionsRequest seems to be quite involved, do we
> > > want to re-examine the rejected option of adding another
> > > request/response pair? It will add another roundtrip, but Kafka
> > > already expects client connections to be long-lived, so the overhead
> > > is probably negligible.
> > >
> > > In addition, in the rejected alternatives it says: "It also would
> > > require to be done before the authentication (TLS AuthN aside) and
> > > thus requiring specific treatment, similarly to the
> > > ApiVersionsRequest." - which I don't quite understand. Why do we think
> > > this has to happen before authenticating?
> > >
> > > It sounds like addition another protocol will allow us to keep the
> > > special assumptions around ApiVersionsRequest and will remove the
> > > dependency on KIP-482. Which will make KIP-511 much simpler than the
> > > alternative we are currently discussing.
> > >
> > > Gwen
> > >
> > > On Fri, Sep 6, 2019 at 3:16 PM Colin McCabe <cmcc...@apache.org>
> wrote:
> > > >
> > > > Hi David,
> > > >
> > > > Yeah, this is kind of difficult.
> > > >
> > > > From the high level, we either need forward compatibility (old
> brokers
> > > able to read new ApiVersionsRequests) or some kind of
> challenge/response
> > > system.  I think you're on the right track to be thinking about forward
> > > compatibility...  a challenge/response system would have a lot of
> overhead
> > > in cases where the client opens a lot of new connections.
> > > >
> > > > I agree that we can have the new versions "add stuff at the end" to
> > > maintain forward compatibility.  Tagged fields will also help here, by
> > > avoiding the need for version bumps.  We also have to think about the
> > > impact this will have on the request header.  It seems like we might
> need
> > > to freeze the version of the request header forever in order to support
> > > full forwards compatibility.  Otherwise, the old brokers will not know
> how
> > > long the new request headers are.  This is maybe OK if we can evolve
> the
> > > request header by adding tagged fields...
> > > >
> > > > Another option is to fall all the way back to version 0 when the
> broker
> > > doesn't understand the client's supplied version.  Version 0 had no
> content
> > > in the request at all, so there is no need for the broker to know
> exactly
> > > how long the supplied request header is.  (This assumes that the first
> 4
> > > fields of the request header will never change, which seems like a
> > > reasonable assumption...)
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Sep 3, 2019, at 23:54, David Jacot wrote:
> > > > > Hi Colin,
> > > > >
> > > > > Thanks for your input. Please, find my comments below:
> > > > >
> > > > > >> Currently, we don't parse the contents of ApiVersionsRequest at
> all,
> > > > > since it's an empty message.  KIP-511 proposes adding some fields
> here,
> > > > > which will clearly change that situation.  In the future, any
> changes
> > > to
> > > > > ApiVersionsRequest will have to only add stuff at the end, rather
> than
> > > > > changing fields that already exist.  This will significantly limit
> our
> > > > > ability to add new things over time.
> > > > > >> Therefore, I think we should make sure that KIP-511 is
> implemented
> > > after
> > > > > KIP-482, so that the version we freeze into place can be one that
> > > includes
> > > > > the ability to add tagged fields, and includes the more efficient
> > > string
> > > > > and array serialization specified in KIP-482.  It's probably worth
> > > spelling
> > > > > that out here.
> > > > >
> > > > > I agree with you. It makes sense to bump the version once for the
> two
> > > > > cases.
> > > > >
> > > > > >> On another topic, when the client sends an unsupported version
> N of
> > > > > ApiVersionsRequest, and the broker only supports versions up to M,
> the
> > > > > broker currently falls back to sending a version 0 response.  Why
> > > can't the
> > > > > broker fall back to version M instead?  Previously, we only had
> > > versions 1
> > > > > and 0 of ApiVersionsRequest, so the two behaviors were identical.
> But
> > > > > going forward, once we have version 2 and later of
> ApiVersoinsRequest,
> > > it
> > > > > doesn't make sense to fall back all the way to 0 if the broker
> > > supports 1
> > > > > (for example).
> > > > > >> If you agree, it would be good to spell this out in the KIP, so
> > > that if
> > > > > we want to add more things to the response, we can, without losing
> them
> > > > > each time the client's version of ApiVersionsRequest exceeds the
> > > broker's.
> > > > >
> > > > > I fully agree with you and I have already outlined this in the
> > > proposal,
> > > > > see "ApiVersions Request/Response Handling". The idea is to use the
> > > version
> > > > > M in the broker so it can leverage the provided information it
> knows
> > > about.
> > > > > The broker can then send back response with version M as well. On
> the
> > > > > client side, it is a bit more tricky. As there is no version in the
> > > > > response, the client will use the version used for the request to
> > > parse the
> > > > > response. If fields have been added to the schema of the response
> > > version
> > > > > N, it won't work to parse M. As the client doesn't know the version
> > > used,
> > > > > we have two options: 1) use version 0 which is a prefix of all
> others
> > > but
> > > > > it means losing information; or 2) try versions in descending order
> > > from N
> > > > > to 0. I guess that N-1 would the one in most of the cases.
> > > > >
> > > > > What is your opinion regarding the client side?
> > > > >
> > > > > Best,
> > > > > David
> > > > >
> > > > > On Wed, Sep 4, 2019 at 12:13 AM Colin McCabe <cmcc...@apache.org>
> > > wrote:
> > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > Thanks again for the KIP.
> > > > > >
> > > > > > Currently, we don't parse the contents of ApiVersionsRequest at
> all,
> > > since
> > > > > > it's an empty message.  KIP-511 proposes adding some fields here,
> > > which
> > > > > > will clearly change that situation.  In the future, any changes
> to
> > > > > > ApiVersionsRequest will have to only add stuff at the end, rather
> > > than
> > > > > > changing fields that already exist.  This will significantly
> limit
> > > our
> > > > > > ability to add new things over time.
> > > > > >
> > > > > > Therefore, I think we should make sure that KIP-511 is
> implemented
> > > after
> > > > > > KIP-482, so that the version we freeze into place can be one that
> > > includes
> > > > > > the ability to add tagged fields, and includes the more efficient
> > > string
> > > > > > and array serialization specified in KIP-482.  It's probably
> worth
> > > spelling
> > > > > > that out here.
> > > > > >
> > > > > > On another topic, when the client sends an unsupported version N
> of
> > > > > > ApiVersionsRequest, and the broker only supports versions up to
> M,
> > > the
> > > > > > broker currently falls back to sending a version 0 response.  Why
> > > can't the
> > > > > > broker fall back to version M instead?  Previously, we only had
> > > versions 1
> > > > > > and 0 of ApiVersionsRequest, so the two behaviors were identical.
> > > But
> > > > > > going forward, once we have version 2 and later of
> > > ApiVersoinsRequest, it
> > > > > > doesn't make sense to fall back all the way to 0 if the broker
> > > supports 1
> > > > > > (for example).
> > > > > >
> > > > > > If you agree, it would be good to spell this out in the KIP, so
> that
> > > if we
> > > > > > want to add more things to the response, we can, without losing
> them
> > > each
> > > > > > time the client's version of ApiVersionsRequest exceeds the
> broker's.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Tue, Sep 3, 2019, at 01:26, David Jacot wrote:
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I have updated the KIP to address the various comments. I have
> also
> > > > > > added a
> > > > > > > section about the handling of the ApiVersionsRequest/Response
> in
> > > the
> > > > > > broker.
> > > > > > >
> > > > > > > Please, let me know what you think. I would like to make it for
> > > the next
> > > > > > > release if possible.
> > > > > > >
> > > > > > > Best,
> > > > > > > David
> > > > > > >
> > > > > > > On Fri, Aug 30, 2019 at 10:31 AM David Jacot <
> dja...@confluent.io>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Magnus,
> > > > > > > >
> > > > > > > > Thank you for your feedback. Please, find my comments below.
> > > > > > > >
> > > > > > > > 1. I thought that the clientId was meant for this purpose
> > > (providing
> > > > > > > > information about the application). Is there a gap I don't
> see?
> > > > > > > >
> > > > > > > > 2. I have put two fields to avoid requiring deep validation
> and
> > > > > > parsing on
> > > > > > > > the broker. I believe that it will be easier to use the
> metadata
> > > > > > downstream
> > > > > > > > like this.
> > > > > > > >
> > > > > > > > 3. Good point. I think that the broker should have some sort
> of
> > > > > > validation
> > > > > > > > and fail the ApiVersionRequest and/or the Connection if the
> > > validation
> > > > > > > > fails. To avoid backward compatibility issues, I think we
> should
> > > come
> > > > > > up
> > > > > > > > with a minimal validation and ensure it won't become more
> > > restrictive
> > > > > > in
> > > > > > > > the future.
> > > > > > > >
> > > > > > > > 4. I don't have strong opinion regarding this one but as the
> > > focus of
> > > > > > the
> > > > > > > > KIP is to gather the client's information, I suggest to
> > > discuss/address
> > > > > > > > this later on.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > David
> > > > > > > >
> > > > > > > > On Thu, Aug 29, 2019 at 6:56 PM Colin McCabe <
> cmcc...@apache.org
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > >> On Fri, Aug 23, 2019, at 00:07, Magnus Edenhill wrote:
> > > > > > > >> > Great proposal, this feature is well overdue!
> > > > > > > >> >
> > > > > > > >> > 1)
> > > > > > > >> > From an operator's perspective I don't think the kafka
> client
> > > > > > > >> > implementation name and version are sufficient,
> > > > > > > >> > I also believe the application name and version are of
> > > interest.
> > > > > > > >> > You could have all applications in your cluster run the
> same
> > > kafka
> > > > > > > >> client
> > > > > > > >> > and version, but only one type or
> > > > > > > >> > version of an application misbehave and needing to be
> tracked
> > > down.
> > > > > > > >>
> > > > > > > >> Hi Magnus,
> > > > > > > >>
> > > > > > > >> I think it might be better to leave this out of scope for
> now,
> > > and
> > > > > > think
> > > > > > > >> about it in the context of more generalized request tracing.
> > > This is
> > > > > > a
> > > > > > > >> very deep rabbit hole, and I could easily see it delaying
> this
> > > KIP
> > > > > > for a
> > > > > > > >> long time.  For example, if you have multiple Spark jobs
> > > producing to
> > > > > > > >> Kafka, just knowing that a client is being used by Spark may
> > > not be
> > > > > > that
> > > > > > > >> helpful.  So maybe you want a third set of fields to
> describe
> > > the
> > > > > > spark
> > > > > > > >> application ID and version, etc?  And then maybe that,
> itself,
> > > was
> > > > > > created
> > > > > > > >> by some framework... etc. Probably better to defer this
> > > discussion
> > > > > > for now
> > > > > > > >> and see how version tracking works out.
> > > > > > > >>
> > > > > > > >> >
> > > > > > > >> > While the application and client name and version could be
> > > combined
> > > > > > in
> > > > > > > >> the
> > > > > > > >> > ClientName/ClientVersion fields by
> > > > > > > >> > the user (e.g. like User-Agent), it would not be in a
> > > generalized
> > > > > > format
> > > > > > > >> > and hard for generic monitoring tools to parse correctly.
> > > > > > > >> >
> > > > > > > >> > So I'd suggest keeping ClientName and ClientVersion as the
> > > client
> > > > > > > >> > implementation name ("java" or "org.apache.kafka...") and
> > > version,
> > > > > > > >> > which can't be changed by the user/app developer, and
> > > providing two
> > > > > > > >> > optional fields for the application counterpart:
> > > > > > > >> > ApplicationName and ApplicationVersion, which are backed
> by
> > > > > > > >> corresponding
> > > > > > > >> > configuration properties (application.name,
> > > application.version).
> > > > > > > >> >
> > > > > > > >> > 2)
> > > > > > > >> > Do ..Name and ..Version need to be two separate fields,
> > > seeing how
> > > > > > the
> > > > > > > >> two
> > > > > > > >> > fields are ambigious when separated?
> > > > > > > >> > If we're looking to identify unique versions, combining
> the
> > > two
> > > > > > fields
> > > > > > > >> > would be sufficient (e.g., "java-2.3.1",
> "librdkafka/1.2.0",
> > > > > > > >> "sarama@1.2.3")
> > > > > > > >> > and perhaps easier to work with.
> > > > > > > >> > The actual format or content of the name-version string is
> > > > > > irrelevant as
> > > > > > > >> > long as it identifies a unique name+version.
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >> Hmm.  Wouldn't the same arguments you made above about a
> > > combined
> > > > > > > >> named+version field being "hard for generic monitoring
> tools to
> > > parse
> > > > > > > >> correctly" apply here?  In any case, there seems to be no
> > > reason not
> > > > > > to
> > > > > > > >> just have two fields.  It avoids string parsing.
> > > > > > > >>
> > > > > > > >> >
> > > > > > > >> > 3)
> > > > > > > >> > As for allowed characters, will the broker fail the
> > > > > > ApiVersionResponse
> > > > > > > >> if
> > > > > > > >> > any of these fields contain invalid characters,
> > > > > > > >> > or will the broker sanitize the strings?
> > > > > > > >> > For future backwards compatibility (when the broker
> > > constraints
> > > > > > change
> > > > > > > >> but
> > > > > > > >> > clients are not updated) I suggest the latter.
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >> I would argue that we should be strict about the characters
> > > that we
> > > > > > > >> accept, and just close the connection if the string is bad.
> > > There's
> > > > > > no
> > > > > > > >> reason to let clients troll us with "librdkafka  " (two
> spaces
> > > at the
> > > > > > end)
> > > > > > > >> or a version string with slashes or control characters in
> it.
> > > In
> > > > > > fact, I
> > > > > > > >> would argue we should just allow something like
> > > ([\.][a-z][A-Z][0-9])+
> > > > > > > >> This ensures that JMX will work well.  We can always loosen
> the
> > > > > > > >> restrictions later if there is a real need.
> > > > > > > >>
> > > > > > > >> > 4)
> > > > > > > >> > And while we're at it, can we add the broker name and
> version
> > > to the
> > > > > > > >> > ApiVersionResponse?
> > > > > > > >> > While an application must not use this information to
> detect
> > > > > > features
> > > > > > > >> (Hi
> > > > > > > >> > Jay!), it is good for troubleshooting
> > > > > > > >> > and providing more meaningful logs to the client user in
> case
> > > a
> > > > > > feature
> > > > > > > >> > (based on the actual api versions) is not available.
> > > > > > > >>
> > > > > > > >> I can think of several cases where people tried to set up
> > > client-side
> > > > > > > >> hacks based on the broker version, and were only stopped by
> the
> > > fact
> > > > > > that
> > > > > > > >> we don't expose this information.  I agree with Jay that we
> > > should
> > > > > > think
> > > > > > > >> very carefully before exposing it.  In any case, this seems
> out
> > > of
> > > > > > scope...
> > > > > > > >>
> > > > > > > >> best,
> > > > > > > >> Colin
> > > > > > > >>
> > > > > > > >> >
> > > > > > > >> > /Magnus
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > Den tors 22 aug. 2019 kl 10:09 skrev David Jacot <
> > > > > > dja...@confluent.io>:
> > > > > > > >> >
> > > > > > > >> > > Hi Satish,
> > > > > > > >> > >
> > > > > > > >> > > Thank you for your feedback!
> > > > > > > >> > >
> > > > > > > >> > > Please find my answers below.
> > > > > > > >> > >
> > > > > > > >> > > >> Did you consider taking version property by loading
> > > > > > > >> > > “kafka/kafka-version.properties” as a resource while
> java
> > > client
> > > > > > is
> > > > > > > >> > > initialized?  “kafka/kafka-version.properties” is
> shipped
> > > with
> > > > > > > >> > > kafka-clients jar.
> > > > > > > >> > >
> > > > > > > >> > > I wasn't aware of the property file. It is exactly what
> I
> > > need.
> > > > > > > >> Thanks for
> > > > > > > >> > > pointing that out!
> > > > > > > >> > >
> > > > > > > >> > > >> I assume this metric value will be the total no of
> > > clients
> > > > > > > >> connected
> > > > > > > >> > > to a broker irrespective of whether name and version
> follow
> > > the
> > > > > > > >> > > expected pattern ([-.\w]+) or not.
> > > > > > > >> > >
> > > > > > > >> > > That is correct.
> > > > > > > >> > >
> > > > > > > >> > > >> It seems client name and client version are treated
> as
> > > tags for
> > > > > > > >> > > `ConnectedClients` metric. If so, you may implement this
> > > metric
> > > > > > > >> > > similar to `BrokerTopicMetrics` with topic tag as
> mentioned
> > > > > > here[1].
> > > > > > > >> > > When is the metric removed for a specific client-name
> and
> > > > > > > >> > > client-version?
> > > > > > > >> > >
> > > > > > > >> > > That is correct. Client name and version are treated as
> > > tags like
> > > > > > in
> > > > > > > >> > > BrokerTopicMetrics. My plan is to remove the metric
> when it
> > > goes
> > > > > > > >> > > back to zero - when all clients with a given name &
> version
> > > are
> > > > > > > >> > > disconnected.
> > > > > > > >> > >
> > > > > > > >> > > Best,
> > > > > > > >> > > David
> > > > > > > >> > >
> > > > > > > >> > > On Wed, Aug 21, 2019 at 6:52 PM Satish Duggana <
> > > > > > > >> satish.dugg...@gmail.com>
> > > > > > > >> > > wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hi David,
> > > > > > > >> > > > Thanks for the KIP. I have a couple of questions.
> > > > > > > >> > > >
> > > > > > > >> > > > >> For the Java client, the idea is to define two
> > > constants in
> > > > > > the
> > > > > > > >> code
> > > > > > > >> > > to
> > > > > > > >> > > > store its name and its version. If possible, the
> version
> > > will
> > > > > > be set
> > > > > > > >> > > > automatically based on metadata coming from gradle (or
> > > the repo
> > > > > > > >> itself)
> > > > > > > >> > > to
> > > > > > > >> > > > avoid having to do manual changes.
> > > > > > > >> > > >
> > > > > > > >> > > > Did you consider taking version property by loading
> > > > > > > >> > > > “kafka/kafka-version.properties” as a resource while
> java
> > > > > > client is
> > > > > > > >> > > > initialized?  “kafka/kafka-version.properties” is
> shipped
> > > with
> > > > > > > >> > > > kafka-clients jar.
> > > > > > > >> > > >
> > > > > > > >> > > > >>
> kafka.server:type=ClientMetrics,name=ConnectedClients
> > > > > > > >> > > > I assume this metric value will be the total no of
> clients
> > > > > > connected
> > > > > > > >> > > > to a broker irrespective of whether name and version
> > > follow the
> > > > > > > >> > > > expected pattern ([-.\w]+) or not.
> > > > > > > >> > > >
> > > > > > > >> > > > >>
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >>
> > > > > >
> > >
> kafka.server:type=ClientMetrics,name=ConnectedClients,clientname=([-.\w]+),clientversion=([-.\w]+)
> > > > > > > >> > > > It seems client name and client version are treated as
> > > tags for
> > > > > > > >> > > > `ConnectedClients` metric. If so, you may implement
> this
> > > metric
> > > > > > > >> > > > similar to `BrokerTopicMetrics` with topic tag as
> > > mentioned
> > > > > > here[1].
> > > > > > > >> > > > When is the metric removed for a specific client-name
> and
> > > > > > > >> > > > client-version?
> > > > > > > >> > > >
> > > > > > > >> > > > 1.
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >>
> > > > > >
> > >
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaRequestHandler.scala#L231
> > > > > > > >> > > >
> > > > > > > >> > > > Thanks,
> > > > > > > >> > > > Satish.
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > > On Wed, Aug 21, 2019 at 5:33 PM David Jacot <
> > > > > > dja...@confluent.io>
> > > > > > > >> wrote:
> > > > > > > >> > > > >
> > > > > > > >> > > > > Hi all,
> > > > > > > >> > > > >
> > > > > > > >> > > > > I would like to start a discussion for KIP-511:
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >>
> > > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-511%3A+Collect+and+Expose+Client%27s+Name+and+Version+in+the+Brokers
> > > > > > > >> > > > >
> > > > > > > >> > > > > Let me know what you think.
> > > > > > > >> > > > >
> > > > > > > >> > > > > Best,
> > > > > > > >> > > > > David
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> > >
> > >
> > > --
> > > Gwen Shapira
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog
> > >
> >
>

Reply via email to