Re: Leader election

Michael Borokhovich Mon, 10 Dec 2018 22:40:15 -0800

Thanks, Maciej. That sounds good. We will try playing with the parameters
and have at least a known upper limit on the inconsistency interval.
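
For the parameter tuning mentioned above, here is a rough sketch of the arithmetic. It assumes, as this thread does, that the inconsistency interval is bounded by roughly tickTime multiplied by syncLimit (syncLimit is expressed in ticks). The function name and the example values are hypothetical, and a long GC or VM pause can still exceed this bound.

```python
def staleness_bound_ms(tick_time_ms: int, sync_limit_ticks: int) -> int:
    """Approximate upper bound, in ms, on how long a follower may lag.

    Assumption: the bound is tickTime * syncLimit, because syncLimit is
    counted in ticks. Process pauses (GC, VM) are not covered.
    """
    if tick_time_ms <= 0 or sync_limit_ticks <= 0:
        raise ValueError("tickTime and syncLimit must be positive")
    return tick_time_ms * sync_limit_ticks


# Hypothetical settings: tickTime=1000 ms, syncLimit=5 ticks
print(staleness_bound_ms(1000, 5))  # 5000
```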


On Fri, Dec 7, 2018 at 2:11 AM Maciej Smoleński <jezd...@gmail.com> wrote:

> On Fri, Dec 7, 2018 at 3:03 AM Michael Borokhovich <michael...@gmail.com>
> wrote:
>
> > We are planning to run ZooKeeper nodes embedded with the client nodes,
> > i.e., each client also runs a ZK node. So a network partition will
> > disconnect a ZK node and not only the client.
> > My concern is about the following statement from the ZK documentation:
> >
> > "Timeliness: The clients view of the system is guaranteed to be
> > up-to-date within a certain time bound. (*On the order of tens of
> > seconds.*) Either system changes will be seen by a client within this
> > bound, or the client will detect a service outage."
> >
>
> This is related to the fact that a ZooKeeper server handles reads from its
> local state, without communicating with other ZooKeeper servers.
> This design ensures scalability for read-dominated workloads.
> With this approach a client might receive data that is not up to date (it
> might not contain updates from other ZooKeeper servers (quorum)).
> The 'syncLimit' parameter (counted in ticks of 'tickTime') limits how far
> a ZooKeeper server may fall behind the leader before it is dropped.
> A client read will therefore retrieve data from a state no older than
> the bound described by 'syncLimit'.
>
> However, a ZooKeeper client can always force retrieval of data that is
> up to date.
> It needs to issue the 'sync' command to the ZooKeeper server before
> issuing the 'read'.
> On 'sync' the ZooKeeper server will synchronize its local state with the
> global state.
> A later 'read' will be handled from the updated state.
> The client should be careful here to communicate with the same
> ZooKeeper server for both 'sync' and 'read'.
>
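
The 'sync' then 'read' sequence described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: it does not use the ZooKeeper client API, and the Leader and Follower classes and their methods are hypothetical stand-ins for the quorum state and for one server's local state.

```python
class Leader:
    """Holds the committed ("global") state in this toy model."""

    def __init__(self):
        self.state = {}

    def write(self, path, value):
        self.state[path] = value


class Follower:
    """Serves reads from a local copy that may lag behind the leader."""

    def __init__(self, leader):
        self.leader = leader
        self.local = dict(leader.state)

    def read(self, path):
        # Reads never contact the leader, so they can be stale.
        return self.local.get(path)

    def sync(self):
        # Catch up with everything the leader has committed so far.
        self.local = dict(self.leader.state)


leader = Leader()
leader.write("/leader", "node-a")
follower = Follower(leader)

leader.write("/leader", "node-b")  # committed, not yet seen by the follower
stale = follower.read("/leader")   # still "node-a"

follower.sync()                    # the same server handles sync and read
fresh = follower.read("/leader")   # now "node-b"
print(stale, fresh)
```

The model also shows why the same server matters: calling sync on one follower and then reading from a different one would give no freshness guarantee for that read.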
>
> > What are these "*tens of seconds*"? Can we reduce this time by
> > configuring "syncLimit" and "tickTime" to, let's say, 5 seconds? Can we
> > have a strong guarantee on this time bound?
> >
>
> As described above, you might use 'sync' + 'read' to avoid this problem.
>
>
> >
> >
> > On Thu, Dec 6, 2018 at 1:05 PM Jordan Zimmerman
> > <jor...@jordanzimmerman.com> wrote:
> >
> > > > The old service leader will detect the network partition at most
> > > > 15 seconds after it happened.
> > >
> > > If the old service leader is in a very long GC it will not detect the
> > > partition. In the face of VM pauses, etc. it's not possible to avoid 2
> > > leaders for a short period of time.
> > >
> > > -JZ
> >
>
