On Fri, Apr 17, 2020, at 13:11, Ismael Juma wrote:
> Hi Colin,
> 
> The read/modify/write is protected by the zk version, right?
> 
> Ismael

No, we don't use the ZK version when doing the write to the config znodes.  We 
do for ACLs, I think.

This is something that we could fix just by using the ZK version, but there are 
other race conditions, such as a topic being deleted while this config is being 
set.  A single writer is a lot easier to reason about.
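
To make the lost-update interleaving and the version-check fix concrete, here 
is a minimal Python sketch.  It is not Kafka or ZooKeeper code: FakeZnode and 
BadVersionError are hypothetical in-memory stand-ins, and the config keys are 
only illustrative.  It assumes a znode whose version increases by one on every 
write and a conditional write that fails when the expected version is stale.

```python
class BadVersionError(Exception):
    """Hypothetical error raised when the expected version is stale."""


class FakeZnode:
    """In-memory stand-in for one config znode; not a real ZooKeeper client."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = 0

    def read(self):
        # Return a copy of the data plus the version it was read at.
        return dict(self.data), self.version

    def write(self, data, expected_version=None):
        # expected_version=None models an unconditional write.
        if expected_version is not None and expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, found {self.version}")
        self.data = dict(data)
        self.version += 1


def unconditional_updates():
    """Two writers interleave without a version check: one update is lost."""
    znode = FakeZnode({"retention.ms": "1000"})
    cfg1, _ = znode.read()              # node 1 reads
    cfg2, _ = znode.read()              # node 2 reads the same state
    cfg1["segment.bytes"] = "512"
    znode.write(cfg1)                   # node 1 writes back
    cfg2["cleanup.policy"] = "compact"
    znode.write(cfg2)                   # node 2 overwrites node 1's change
    return znode.data


def versioned_updates():
    """Same interleaving, but the stale write is rejected and retried."""
    znode = FakeZnode({"retention.ms": "1000"})
    cfg1, v1 = znode.read()
    cfg2, v2 = znode.read()
    cfg1["segment.bytes"] = "512"
    znode.write(cfg1, expected_version=v1)
    cfg2["cleanup.policy"] = "compact"
    try:
        znode.write(cfg2, expected_version=v2)
    except BadVersionError:
        # Re-read the latest state, reapply the change, and write again.
        cfg2, v2 = znode.read()
        cfg2["cleanup.policy"] = "compact"
        znode.write(cfg2, expected_version=v2)
    return znode.data
```

The version check only covers concurrent writes to a single znode.  It does not 
address races that span several znodes, such as the topic deletion case, which 
is part of why a single writer is simpler to reason about.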

best,
Colin


> 
> On Fri, Apr 17, 2020 at 12:53 PM Colin McCabe <cmcc...@apache.org> wrote:
> 
> > On Thu, Apr 16, 2020, at 08:51, Ismael Juma wrote:
> > > I don't think these requests are necessarily infrequent in multi-tenant
> > > environments though. I've seen Controller availability being an issue for
> > > describe topics for example (before it was changed to go to any broker).
> >
> > Hi Ismael,
> >
> > I don't think DescribeTopics is a good comparison.  That RPC is available
> > to regular users and is used many orders of magnitude more frequently than
> > administrative operations like changing ACLs or setting quotas.
> >
> > The operations we're talking about redirecting here all require the
> > highest possible permissions and will not be frequent in any real-world
> > cluster... unless someone is running a stress-test or a benchmark.  We
> > didn't even notice some of the serious bugs in setting dynamic configs
> > until recently because the alterConfigs / incrementalAlterConfigs RPCs are
> > so infrequently called.
> >
> > Additionally, this KIP fixes some existing bugs.  The current approach of
> > having random writers do a read-modify-write cycle on a configuration znode
> > is buggy since it could be interleaved with another node's
> > read-modify-write cycle.  It has a "lost updates" problem.
> >
> > For example, node 1 reads a config znode.  Node 2 reads the same config
> > znode.  Node 1 writes back a modified version of the znode.  Node 2 writes
> > back its (differently) modified version, overwriting the changes from node
> > 1.
> >
> > I don't think anyone ever noticed this problem since, again, these
> > operations are very infrequent, making the chance of such a collision low.
> > But it is a serious bug that is fixed by having a single writer.  (We
> > should add this to the KIP...)
> >
> > >
> > > Would it be better to redirect once the controller quorum is there?
> >
> > This KIP is needed for the bridge release.  The bridge release upgrade
> > process relies on the old nodes sending their administrative operations to
> > the controller quorum, not directly to zookeeper.
> >
> > best,
> > Colin
> >
> >
> > >
> > > Note that this is different from things like AlterIsr since these calls
> > > are coming from clients versus other brokers.
> > >
> > > Ismael
> > >
> > > On Wed, Apr 15, 2020, 5:10 PM Colin McCabe <cmcc...@apache.org> wrote:
> > >
> > > > Hi Ismael,
> > > >
> > > > I agree that sending these requests through the controller will not work
> > > > during the periods when there is no controller.  However, those periods
> > > > should be short-- otherwise we have bigger problems in the cluster.
> > > >
> > > > These requests are very infrequent because they are administrative
> > > > operations.  Basically the affected operations are changing ACLs,
> > > > changing dynamic configurations, and changing quotas.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Wed, Apr 15, 2020, at 15:25, Ismael Juma wrote:
> > > > > Hi Boyang,
> > > > >
> > > > > Thanks for the KIP. Have we considered that this reduces availability
> > > > > for these operations since we have a single Controller instead of the
> > > > > ZK quorum?
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Fri, Apr 3, 2020 at 4:45 PM Boyang Chen <reluctanthero...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hey all,
> > > > > >
> > > > > > I would like to start off the discussion for KIP-590, a follow-up
> > > > > > initiative after KIP-500:
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-590%3A+Redirect+Zookeeper+Mutation+Protocols+to+The+Controller
> > > > > >
> > > > > > This KIP proposes to migrate existing Zookeeper mutation paths,
> > > > > > including configuration, security and quota changes, to
> > > > > > controller-only by always routing these alterations to the controller.
> > > > > >
> > > > > > Let me know your thoughts!
> > > > > >
> > > > > > Best,
> > > > > > Boyang
> > > > > >
> > > > >
> > > >
> > >
> >
>
