I am ok with the config proposal.

For consumer/producer/admin client configs, I still don't understand why one would want to instantiate the object (for StreamsConfig with its complex "overwrite logic" of internal client configs, it makes more sense). But there is also no real harm done to making the constructors public, so if people want it, I won't object either.


Btw: We did intent to remove `new KafkaStreams(Topology, StreamsConfig)` in the past (the KIP was approved and we deprecated it: https://cwiki.apache.org/confluence/display/KAFKA/KIP-245%3A+Use+Properties+instead+of+StreamsConfig+in+KafkaStreams+constructor), but there was push back from users relying on dependency injection with follow up KIPs https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=93323580 and https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211882578 -- neither KIP was approved, but we just backed up and removed the deprecation annotation for the time being to not break dependency injection for these users.

(Btw: please let's not start a discussion if dependency injection is a good thing or not... another controversial topic that I really don't want to touch -- we were just pragmatic and user-friendly to keep the constructor).


-Matthias

On 11/6/23 8:40 PM, Sophie Blee-Goldman wrote:
Thanks for the input Colin -- I think I'm with you in spirit, although the
2nd suggestion isn't viable as setting the log level to TRACE would mean
silencing every instance of the config logging, which was not the intention
here: we still want the configs to be logged once, when the client is
created, we just wanted to be able to instantiate the ProducerConfig at
other times without logging the whole thing every time. So suppressing
log4j doesn't let us silence this at the required level of granularity. We
wanted the ability to disable the logging for specific instances, which is
exactly what the `boolean doLog` constructor parameter provides.

But it doesn't necessarily have to be via the `doLog` flag -- I'd be happy
with the approach of introducing a new config key in AbstractConfig that
has the same effect as the `doLog` field. In that case would we want to
then demote all the other Config constructors that take this argument from
`protected` to package-private, so that they all conform to the same api?

As an aside, I do think it's completely valid to want to instantiate a
Config object in user code. For one thing, there are actually public
KafkaStreams constructors that take in a StreamsConfig parameter (though
imo those should be deprecated as they just add an extra step for users
compared to the overloads with Properties instead). But it can be a useful
way to do validation, for example making sure that all required properties
have been configured, or verifying that certain configs are set to an
accepted value in case some constraints are to be enforced, such as
preventing topology optimizations from being enabled/disabled for an
existing application: a breaking change that can corrupt the application
state

So there are really two questions here:
1. Do we intend for the actual Config classes to be public APIs for users
to construct and operate on/with, or are they "supposed" to be owned and
accessed primarily by the Kafka clients, with only the config definitions
being targeted for user consumption? It sounds like there is an implicit
assumption that these are inherently internal classes that just happen to
be in the public API but would ideally not be exposed to users at all. I
tend to disagree with this, for the reasons stated below, but we should
come to a consensus and update the API to reflect whatever is decided
regarding the purpose of these Config classes
2. Should plugin authors and/or end users be able to control the logging,
and should this be easy or is it acceptable to make people go out of their
way to do so?

It seems difficult to come up with an answer to the 2nd question that
everyone will be happy with if there is implicit disagreement on the 1st
question. That said, I'm happy to proceed with the "config key that
controls logging in AbstractConfig" approach if there aren't any objections
to it..

On Mon, Nov 6, 2023 at 12:35 PM Colin McCabe <cmcc...@apache.org> wrote:

-1

If the goal is not to have so much config logging, let's add a
configuration key that suppresses log4j in AbstractConfig. Then plugins can
set that configuration if they want before invoking the superclass
constructors.

(Or in the interest of allowing people to be verbose if they want, maybe
let's say that we have a configuration key that sets the log4j
configuration logging to TRACE rather than what it is now).

This will avoid exposing our guts to the world (our configuration guts are
very, very ugly in my opinion). It will be helpful to end-users as well as
plugin authors -- maybe people want this. And it doesn't require the
subclassing hack to get access to it.

best,
Colin


On Fri, Nov 3, 2023, at 17:06, Matthias J. Sax wrote:
Sophie reads my mind well, but I also won't object if majority if people
thinks it's desirable to have it public (it's does not really hurt to
have them public).

I just personally think, we should optimize for "end users" and they
should not need it -- and thus, keeping the API surface area as small as
possible seems desirable (and don't generate JavaDocs for protected
methods...). Maybe it's less of an issue for clients, but given my
experience with Kafka Streams, and it large API, I prefer to guide users
by avoiding "leaky" abstractions.

-Matthias



On 11/3/23 4:34 PM, Chris Egerton wrote:
No objections, I'm +1 ether way.

On Fri, Nov 3, 2023, 18:50 Sophie Blee-Goldman <sop...@responsive.dev>
wrote:

I am fine with making them public. Of course in that case we should
also
change the corresponding constructors in ConsumerConfig, AdminConfig,
and
StreamsConfig from protected to public as well, to be consistent. But
Matthias seems to feel that these should never be instantiated by a
user
and that the correct course of action would be to move in the opposite
direction.

I don't personally feel strongly either way -- honestly I had thought
it
was an abuse of internal APIs to extend the other Config classes in
order
to access the protected constructor and disable logging. So I would be
happy to officially pull it into the public API with all-public
constructors, because I do feel it is valid/useful to be able to
instantiate these objects. We do so in order to access config values
in a
way that accounts for any overrides on top of the default, for example
when
multiple overrides are in play (eg user overrides on top of framework
overrides on top of Kafka Streams overrides on top of
Consumer/Consumer/Admin client defaults). Using them is also (slightly)
more type-safe than going through a Properties or config Map<>

Any objections to expanding the KIP to the ConsumerConfig,
AdminConfig, and
StreamsConfig constructors and making them public as well? From
Matthias or
otherwise?

On Fri, Nov 3, 2023 at 11:09 AM Ismael Juma <m...@ismaeljuma.com> wrote:

It seems wrong to require inheritance for this and we already have a
public
constructor. I would make both of them public.

Ismael

On Fri, Nov 3, 2023 at 10:47 AM Matthias J. Sax <mj...@apache.org>
wrote:

+1 (binding)


About "why not public" question:

I think we need to distinguish between "end users" who create a
producer
instance, and "external parties" that might implement their own
`Producer` (or wrap/extend `KafkaProducer`).

In the end, I would not expect an "end user" to actually call `new
ProducerConfig` to begin with. If one creates a `KafkaProducer` they
pass the config via a `Map` or `Properties`, and the producer creates
`ProducerConfig` internally only. -- Thus, there is no need to make
it
`public`. (To this end, I don't actually understand why there is
public
`ProducerConfig` constructors to begin with -- sounds like a leaky
abstraction to me.)

On the other hand, if a "third party" implements `Producer` interface
to
ship their own producer implementation, they might want to create
`ProducerConfig` internally, so for them it's different, but still,
they
don't need public access because they can extend `ProducerConfig`,
too
for this case). -- To me, this falls into the category "simple thing
should be easy, and hard things should be possible).


-Matthias


On 11/3/23 6:06 AM, Ismael Juma wrote:
Hi Sophie,

I was trying to understand the goal of the change and it's not
totally
clear to me. If the goal is to allow third party applications to
customize
the logging behavior, why is the method protected instead of public?

Ismael

On Thu, Nov 2, 2023 at 9:55 PM Sophie Blee-Goldman <
sop...@responsive.dev>
wrote:

Hey all,

This is a trivial one-liner change that it was determined should go
through
a KIP during the PR review process (see this thread
<https://github.com/apache/kafka/pull/14681#discussion_r1378591228

for
context). Since the change itself was already reviewed and approved
I'm
skipping the discussion thread and bringing it to a vote right
away,
but of
course I'm open to feedback and can create a discussion thread if
there
is
need for it.

The change itself is simply adding the `protected` modifier to the
ProducerConfig constructor that allows for silencing the config
logging.
This just brings the ProducerConfig in alignment with the other
client
configs, all of which already had this constructor as protected.

KIP:





https://cwiki.apache.org/confluence/display/KAFKA/KIP-998%3A+Give+ProducerConfig%28props%2C+doLog%29+constructor+protected+access
PR: https://github.com/apache/kafka/pull/14681

Thanks!
Sophie








Reply via email to