Hi Boyang,

Thanks for the KIP, I have a few questions:

1) Will it be possible to enable/disable automatic file reloading? If not,
we should mention in the compatibility section.

2) We are introducing new common SSL configs and updating common code to
perform automated updates. What does this mean for clients? Are we going to
automatically reload client key stores and trust stores?

3) We should mention in the compatibility section that we are changing the
audit log/authorization model for dynamic updates of SSL stores. At the
moment, only a user with powerful Cluster:Alter permissions can dynamically
update SSL stores on brokers. The KIP removes this restriction and relies
purely on file system permissions for file-based stores, unlike for example
PEM store updates which would still rely on Kafka permissions.

4) Is it really necessary to add a file watcher that is not guaranteed 100%
of the time, if we are adding refresh interval configs? In particular, if
we extend this to clients now or at some point in the future, wouldn't it
be better just to use deterministic refresh intervals without the overhead
of watching?

5) The KIP doesn't talk about validation of key and trust stores. I am
guessing we will continue to perform the same validation that we perform
today. But what happens if validation fails? We should mention in the KIP
that we no longer provide feedback for this case (unlike admin client
requests that returned an error).
5a) If a key store doesn't conform (e.g. the DN was changed), we would fail
the update if we apply the current validation. Would we do the validation
every 5 minutes after that forever even though the file wasn't updated
since? Or will we remember the last reloaded time to avoid reloading if
file hasn't changed?
5b) We perform additional validation for inter-broker key and trust stores
to ensure we never break the broker with dynamic updates. Since this
validation matches key store with trust store, it relies on the order in
which stores are updated. With reloading in the admin client, user had
control over the update. We should document any restrictions on the order
in which files need to be updated on the file system to perform updates of
inter-broker SSL stores. And as with 5a), it will be good to document the
behaviour if validation fails.

Regards,

Rajini



On Wed, Jan 6, 2021 at 5:55 PM Jason Gustafson <ja...@confluent.io> wrote:

> Thanks Boyang. Someone mentioned my email never showed up, but basically I
> suggested tying the refresh configuration more directly to the
> configurations it would affect. I'm happy with the updates.
>
> -Jason
>
> On Tue, Jan 5, 2021 at 8:34 PM Boyang Chen <reluctanthero...@gmail.com>
> wrote:
>
> > Thanks Jason for the feedback. I separated the time configs for key store
> > and trust store, and rename the configs as you proposed.
> >
> > Best,
> > Boyang
> >
> > On Mon, Dec 14, 2020 at 3:47 PM Boyang Chen <reluctanthero...@gmail.com>
> > wrote:
> >
> > > Hey there,
> > >
> > > bumping up this thread to see if there are further questions regarding
> > the
> > > updated proposal.
> > >
> > > Best,
> > > Boyang
> > >
> > > On Thu, Dec 10, 2020 at 11:52 AM Boyang Chen <
> reluctanthero...@gmail.com
> > >
> > > wrote:
> > >
> > >> After some offline discussions, we believe that it's the right
> direction
> > >> to go by doing a hybrid approach which includes both file-watch
> trigger
> > and
> > >> interval based reloading. The former guarantees a swift change in 99%
> > time,
> > >> while the latter provides a time-based guarantee in the worst case
> when
> > the
> > >> file-watch does not take effect. The current default reloading
> interval
> > is
> > >> set to 5 min. I have updated the KIP and ticket, feel free to check
> out
> > and
> > >> see if it makes sense.
> > >>
> > >> Best,
> > >> Boyang
> > >>
> > >> On Tue, Dec 8, 2020 at 8:58 PM Boyang Chen <
> reluctanthero...@gmail.com>
> > >> wrote:
> > >>
> > >>> Hey Gwen, thanks for the feedback.
> > >>>
> > >>> On Sun, Dec 6, 2020 at 10:06 PM Gwen Shapira <g...@confluent.io>
> > wrote:
> > >>>
> > >>>> Agree with Igor. IIRC, we also encountered cases where filewatch was
> > >>>> not triggered as expected. An interval will give us a better
> > >>>> worse-case scenario that is easily controlled by the Kafka admin.
> > >>>>
> > >>>> Are the cases you were referring to happening in the cloud
> > environment?
> > >>> Should we investigate instead of simply assuming the standard API
> won't
> > >>> work? I checked around and found a similar complaint here
> > >>> <https://blog.arkey.fr/2019/09/13/watchservice-and-bind-mount/>.
> > >>>
> > >>> I would be partially agreeing that we want to have a reliable
> approach
> > >>> for all different operating systems in general, but would be great if
> > we
> > >>> could reach a quantitative measure of file-watch success rate if
> > possible
> > >>> for us to make the call. Eventually, the benefit of file-watch is
> more
> > >>> prompt reaction time and less configuration to the broker.
> > >>>
> > >>>> Gwen
> > >>>>
> > >>>> On Sun, Dec 6, 2020 at 8:17 AM Igor Soarez <i...@soarez.me> wrote:
> > >>>> >
> > >>>> >
> > >>>> > > > The proposed change relies on a file watch, why not also have
> a
> > >>>> polling
> > >>>> > > > interval to check the file for changes?
> > >>>> > > >
> > >>>> > > > The periodical check could work, the slight downside is that
> we
> > >>>> need
> > >>>> > > additional configurations to schedule the interval. Do you think
> > the
> > >>>> > > file-watch approach has any extra overhead than the interval
> based
> > >>>> solution?
> > >>>> >
> > >>>> > I don't think so. The reason I'm asking this is the KIP currently
> > >>>> includes:
> > >>>> >
> > >>>> >   "When the file watch does not work for unknown reason, user
> could
> > >>>> still try to change the store path in an explicit AlterConfig call
> in
> > the
> > >>>> worst case."
> > >>>> >
> > >>>> > Having the interval in addition to the file watch could result in
> a
> > >>>> better worst case scenario.
> > >>>> > I understand it would require introducing at least one new
> > >>>> configuration for the interval, so maybe this doesn't have to solved
> > in
> > >>>> this KIP.
> > >>>> >
> > >>>> > --
> > >>>> > Igor
> > >>>> >
> > >>>> > On Fri, Dec 4, 2020, at 5:14 PM, Boyang Chen wrote:
> > >>>> > > Hey Igor, thanks for the feedback.
> > >>>> > >
> > >>>> > > On Fri, Dec 4, 2020 at 5:24 AM Igor Soarez <i...@soarez.me> wrote:
> > >>>> > >
> > >>>> > > > Hi Boyang,
> > >>>> > > >
> > >>>> > >
> > >>>> > >
> > >>>> > > > What happens if the file is changed into an invalid store?
> Does
> > >>>> the
> > >>>> > > > previous store stay in use?
> > >>>> > > >
> > >>>> > > > If the reload fails, the previous store should be effective. I
> > >>>> will state
> > >>>> > > that in the KIP.
> > >>>> > >
> > >>>> > >
> > >>>> > > > Thanks,
> > >>>> > > >
> > >>>> > > > --
> > >>>> > > > Igor
> > >>>> > > >
> > >>>> > > > On Fri, Dec 4, 2020, at 1:28 AM, Boyang Chen wrote:
> > >>>> > > > > Hey there,
> > >>>> > > > >
> > >>>> > > > > I would like to start the discussion thread for KIP-687:
> > >>>> > > > >
> > >>>> > > >
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-687%3A+Automatic+Reloading+of+Security+Store
> > >>>> > > > >
> > >>>> > > > > This KIP is trying to deprecate the AlterConfigs API support
> > of
> > >>>> updating
> > >>>> > > > > the security store by reloading path in-place, and replace
> > with
> > >>>> a
> > >>>> > > > > file-watch mechanism inside the broker. Let me know what you
> > >>>> think.
> > >>>> > > > >
> > >>>> > > > > Best,
> > >>>> > > > > Boyang
> > >>>> > > > >
> > >>>> > > >
> > >>>> > >
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Gwen Shapira
> > >>>> Engineering Manager | Confluent
> > >>>> 650.450.2760 | @gwenshap
> > >>>> Follow us: Twitter | blog
> > >>>>
> > >>>
> >
>

Reply via email to