On Wed, Jul 21, 2021 at 6:17 PM Colin McCabe <cmcc...@apache.org> wrote:
> On Tue, Jun 29, 2021, at 07:22, Magnus Edenhill wrote: > > Den tors 17 juni 2021 kl 00:52 skrev Colin McCabe <cmcc...@apache.org>: > > > A few critiques: > > > > > > - As I wrote above, I think this could benefit a lot by being split > into > > > several RPCs. A registration RPC, a report RPC, and an unregister RPC > seem > > > like logical choices. > > > > > > > Responded to this in your previous mail, but in short I think a single > > request is sufficient and keeps the implementation complexity / state > down. > > > > Hi Magnus, > > I still suspect that trying to do everything with a single RPC is more > complex than using multiple RPCs. > > Can you go into more detail about how the client learns what metrics it > should send? This was the purpose of the "registration" step in my scheme > above. > > It seems quite awkward to combine an RPC for reporting metrics with and > RPC for finding out what metrics are configured to be reported. For > example, how would you build a tool to check what metrics are configured to > be reported? Does the tool have to report fake metrics, just because > there's no other way to get back that information? Seems wrong. (It would > be a bit like combining createTopics and listTopics for "simplicity") > +1 on separate RPC on metric discovery and metric report. I actually think it makes complexity/state down compared with single RPC. > > > > - I don't think the client should be able to choose its own UUID. This > > > adds complexity and introduces a chance that clients will choose an ID > that > > > is not unique. We already have an ID that the client itself supplies > > > (clientID) so there is no need to introduce another such ID. > > > > > > > The CLIENT_INSTANCE_ID (which is a combination of the client.id and a > UUID) > > is actually generated by the receiving broker on first contact. > > The need for a new unique semi-random id is outlined in the KIP, but in > > short; the client.id is not unique, and we need something unique that > still > > is prefix-matchable to the client.id so that we can add subscriptions > > either using prefix-matching of just the client.id (which may match one > or > > more client instances), and exact matching which will match a one > specific > > client instance. > > Hmm... the client id is already sent in every RPC as part of the header. > It's not necessary to send it again as part of one of the other RPC fields, > right? > > More generally, why does the client instance ID need to be > prefix-matchable? That seems like an implementation detail of the metrics > collection system used on the broker side. Maybe someone wants to group by > things other than client IDs -- perhaps client versions, for instance. By > the same argument, we should put the client version string in the client > instance ID, since someone might want to group by that. Or maybe we should > include the hostname, and the IP, and, and, and.... You see the issue here. > I think we shouldn't get involved in this kind of decision -- if we just > pass a UUID, the broker-side software can group it or prefix it however it > wants internally. > > > > - In general the schema seems to have a bad case of string-itis. UUID, > > > content type, and requested metrics are all strings. Since these > messages > > > will be sent very frequently, it's quite costly to use strings for all > > > these things. We have a type for UUID, which uses 16 bytes -- let's use > > > that type for client instance ID, rather than a string which will be > much > > > larger. Also, since we already send clientID in the message header, > there > > > is no need to include it again in the instance ID. > > > > > > > As explained above we need the client.id in the CLIENT_INSTANCE_ID. And > I > > don't think the overhead of this one string per request is going to be > much > > of an issue, > > typical metric push intervals are probably in the >60s range. > > If this becomes a problem we could use a per-connection identifier that > the > > broker translates to the client instance id before pushing metrics > upwards > > in the system. > > > > This is actually an interesting design question -- why not use a > per-TCP-connection identifier, rather than a per-client-instance > identifier? If we are grouping by other things anyway (clientID, principal, > etc.) on the server side, do we need to maintain a per-process identifier > rather than a per-connection one? > > > > > > - I think it would also be nice to have an enum or something for > > > AcceptedContentTypes, RequestedMetrics, etc. We know that new > additions to > > > these categories will require KIPs, so it should be straightforward > for the > > > project to just have an enum that allows us to communicate these as > ints. > > > > > > > I'm thinking this might be overly constraining. The broker doesn't parse > or > > handle the received metrics data itself but just pushes it to the metrics > > plugin, using an enum would require a KIP and broker upgrade if the > metrics plugin > > supports a newer version of OTLP. > > It is probably better if we don't strictly control the metric format > itself. > > > > Unfortunately, we have to strictly control the metrics format, because > otherwise clients can't implement it. I agree that we don't need to specify > how the broker-side code works, since that is pluggable. It's also > reasonable for the clients to have pluggable extensions as well, but this > KIP won't be of much use if we don't at least define a basic set of metrics > that most clients can understand how to send. The open source clients will > not implement anything more than what is specified in the KIP (or at least > the AK one won't...) > > > > > > > > - Can you talk about whether you are adding any new library > dependencies > > > to the Kafka client? It seems like you'd want to add opencensus / > > > opentelemetry, if we are using that format here. > > > > > > > Yeah, as we get closer to concensus more implementation specific details > > will be added to the KIP. > > > > I'm not sure if OpenCensus adds any value to this KIP, to be honest. Their > primary focus was never on the format of the data being sent (in fact, the > last time they checked, they left the format up to each OpenCensus > implementation). That may have changed, but I think it still has limited > usefulness to us, since we have our own format which we have to use anyway. > > > > > > > > > - Standard client resource labels: can we send these only in the > > > registration RPC? > > > > > > > These labels are part of the serialized OTLP data, which means it would > > need to be unpacked and repacked (including compression) by the broker > (or > > metrics plugin), which I believe is more costly than sending them for > each request. > > > > Hmm, that data is about 10 fields, most of which are strings. It certainly > adds a lot of overhead to resend it each time. > > I don't follow the comment about unpacking and repacking -- since the > client registered with the broker it already knows all this information, so > there's nothing to unpack or repack, except from memory. If it's more > convenient to serialize it once rather than multiple times, that is an > implementation detail of the broker side plugin, which we are not > specifying here anyway. > > best, > Colin > > > Thanks, > > Magnus > > > > > > > > > > > -- Best, Feng