Thanks, Fangmin. That's an interesting feature, allowing followers to host observers. But I assume the entire collection of servers is still considered part of the ensemble. If so, isn't the upper limit still capped at 256 by the lowest 8 bits of the server id?
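
To make the concern concrete, here is a rough Python sketch of why only the low 8 bits of the server id matter. It is not ZooKeeper's actual code: it assumes the session id is a 64-bit value whose top byte carries the low 8 bits of the server id, and the function names and the 56-bit counter are made up purely for illustration.

```python
# Illustrative sketch only; names and bit layout are assumptions,
# not taken from the ZooKeeper source.

COUNTER_BITS = 56
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def session_id_prefix(server_id: int) -> int:
    """Keep only the low 8 bits of the server id and place them in the top byte."""
    return (server_id & 0xFF) << COUNTER_BITS


def make_session_id(server_id: int, counter: int) -> int:
    """Combine the server prefix with a per-server counter (hypothetical)."""
    return session_id_prefix(server_id) | (counter & COUNTER_MASK)


# Server ids 1 and 257 share the same low 8 bits, so with the same counter
# value they would produce the same session id.
print(make_session_id(1, 5) == make_session_id(257, 5))
# Server ids 1 and 2 differ in their low 8 bits, so their ids stay distinct.
print(make_session_id(1, 5) == make_session_id(2, 5))
```

Under that assumption, at most 256 distinct prefixes exist, so ids above 255 are only safe if their low 8 bits stay unique across all servers that create global sessions.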

On Fri, Apr 10, 2020 at 5:32 PM Fangmin Lv <[email protected]> wrote:

> There is ObserverMaster feature contributed back in ZOOKEEPER-3140
> <https://issues.apache.org/jira/browse/ZOOKEEPER-3140> could be used to
> scale the number of observers and traffics a single ensemble can support.
>
> It allows followers to serve observers as well, which relieves the fanout
> load on leader.
>
> But as Michael mentioned, there is server id limit given lowest 8 bits are
> used guarantee the session id uniqueness, so max servers are limited to 255.
>
> Internally, we use local sessions only on observers, so we use dynamic
> observer id (-1) for all observers, which is not part of the dynamic config.
> It helps us scale more observers, but this may not be a good solution for
> community since there is limitation here.
>
> Thanks,
> Fangmin
>
> On Fri, Apr 10, 2020 at 1:43 PM Michael Han <[email protected]> wrote:
>
> > If you have 100s of 1000s of ZK clients then having observer in each pod
> > will presumably reduce traffic as most of the fan out traffic, from server
> > to clients is localized to each pod.
> >
> > Observer is not part of quorum, and a quorum can't scale pass a few servers
> > (typical just 5 or 7). Observers can scale from 100s to 1000s (depends on
> > whether only leader hosts them, or follower can host them) but actual
> > number depends on workload and hardware capacity. Although it's recommended
> > myid being [0,255] but I vaguely remember we can pass this limit, just need
> > to make sure the lower 8 bits of the myid always to be unique as that's
> > used to construct session id.
> >
> > On Fri, Apr 10, 2020 at 12:09 PM James Arbo <[email protected]> wrote:
> >
> > > That was my instinct as well. I *think* any ZK writes would require a
> > > quorum before the transaction is committed. Getting a quorum over a several
> > > hundred/thousand node ensemble seems like a lot of traffic.
> > > Plus, from what I've read - though not 100% certain, it seems the number ZK
> > > nodes is capped at 255.
> > >
> > > On Fri, Apr 10, 2020 at 2:52 PM Bram Van Dam <[email protected]> wrote:
> > >
> > > > On 10/04/2020 20:13, James Arbo wrote:
> > > > > When we proposed this, there was great concern from the software architects
> > > > > that network traffic between the kubernetes pods and the ZK ensemble must
> > > > > be minimized.
> > > >
> > > > > This means that, at a minimum, we would be running at least 1 ZK ensemble
> > > > > member on every node of our K8S cluster.
> > > >
> > > > Sounds to me like this would *increase* network traffic, not decrease
> > > > it. Instead of having communication between the pod and ZK whenever
> > > > needed (which likely isn't very frequently?), you'll now be having
> > > > constant communication between the ensemble and your hundreds of
> > > > observers in order to keep the observers in sync.
> > > >
> > > > Maybe I'm missing something?
> > > >
> > > > - Bram
