This is great. I should confirm what zk version you are using for your tests.

Thanks,
Eric

On Thu, Dec 30, 2021 at 1:10 PM Alexander Shraer <[email protected]> wrote:

> "The reconfig is in process" means something failed during reconfiguration
> and it couldn't complete. Perhaps the new server disconnected in the middle
> and never came back up. Notice that the second server's config file gets
> overwritten after it connects to the leader, and if it reboots at this
> stage it won't be able to connect again without you manually overwriting
> its config file again (since in the server's config server 2 is not part of
> the ensemble).
>
> I checked it locally (running both servers on my laptop), and it worked.
> Perhaps start from that?
>
> Like you said, I disabled acl by adding
>
> "-Dzookeeper.skipACL=yes"
>
> Here's the first server's config file: conf/zoo_replicated1.cfg
>
> dataDir=/Users/shralex/my-zookeeper/zookeeper1
> syncLimit=2
> initLimit=5
> tickTime=2000
> clientPort=2791
> reconfigEnabled=true
> standaloneEnabled=false
> server.1=localhost:2721:2731:participant;localhost:2791
>
> The second server's: conf/zoo_replicated2.cfg
>
> dataDir=/Users/shralex/my-zookeeper/zookeeper2
> syncLimit=2
> initLimit=5
> tickTime=2000
> clientPort=2792
> reconfigEnabled=true
> standaloneEnabled=false
> server.1=localhost:2721:2731:participant;localhost:2791
> server.2=localhost:2741:2751:participant;localhost:2792
>
> Create 2 directories for the servers, zookeeper1 and zookeeper2, and create
> myid files in each:
>
> echo 1 > zookeeper1/myid
> echo 2 > zookeeper2/myid
>
> I find it easier for debugging to allow zkServer.sh to log to stdout. You
> can do this by changing zkServer.sh:
> - change nohup "$JAVA" to just "$JAVA"
> - remove " > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null"
>
> In two shells start both servers by
>
> export ZOOCFG=zoo_replicated1.cfg (change for server 2)
> ./bin/zkServer.sh start
>
> In a third shell I start the client by connecting it to server 2 as you did
>
> ./bin/zkCli.sh -server 127.0.0.1:2792
>
> I run the following in the shell:
>
> [zk: 127.0.0.1:2792(CONNECTED) 2] config
> server.1=localhost:2721:2731:participant;localhost:2791
> version=100000000
>
> [zk: 127.0.0.1:2792(CONNECTED) 2] reconfig -add "server.2=localhost:2741:2751:participant;localhost:2792"
> Committed new configuration:
> server.1=localhost:2721:2731:participant;localhost:2791
> server.2=localhost:2741:2751:participant;localhost:2792
> version=200000003
>
> On Thu, Dec 30, 2021 at 10:47 AM Eric Edgar <[email protected]> wrote:
>
> > I am a little closer I think. I disabled auth for testing using the server
> > flags .. but now I am getting a different error that the reconfig is in
> > process and I see a zookeeper.dynamic.next file on both servers but nothing
> > happens after that.
> > What would cause that file to not be merged into a new cfg?
> > Eric
> >
> > On Thu, Dec 30, 2021 at 11:47 AM Eric Edgar <[email protected]> wrote:
> >
> > > Alex,
> > > so I have 2 nodes .. the first has itself in the dynamic list with an id
> > > of 1.
> > > server.1=10.1.1.104:2888:3888:participant;0.0.0.0:2181
> > >
> > > I have brought the second node up with an id of 2
> > > server.1=10.1.1.104:2888:3888:participant;0.0.0.0:2181
> > > server.2=10.1.1.40:2888:3888:participant;2181
> > >
> > > then I am trying to run from the second node:
> > > zkCli.sh -server 10.1.1.104 reconfig -add "server.2=10.1.1.40:2888:3888:participant;2181"
> > >
> > > I get this error on the first server
> > > 2021-12-30 17:37:02,880 [myid:1] - INFO [ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@461] - Incremental reconfig
> > > 2021-12-30 17:37:02,880 [myid:1] - WARN [ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@532] - Reconfig failed - there must be a connected and synced quorum in new configuration
> > > 2021-12-30 17:37:02,880 [myid:1] - INFO [ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@935] - Got user-level KeeperException when processing sessionid:0x1002dfe65610014 type:reconfig cxid:0x1 zxid:0x1600000033 txntype:-1 reqpath:n
> > >
> > > on the second server issuing the reconfig command I get this error
> > > No quorum of new config is connected and up-to-date with the leader of
> > > last commmitted config - try invoking reconfiguration after new servers are
> > > connected and synced
> > >
> > > I have not set any security at this point.
> > >
> > > I am not sure what I am missing at this point, assuming I don't need 2
> > > nodes fully clustered in advance as mentioned by Chris.
> > >
> > > Thanks,
> > > Eric
> > >
> > > On Thu, Dec 30, 2021 at 11:03 AM Alexander Shraer <[email protected]> wrote:
> > >
> > >> This is already possible, since the 3.5.0 release:
> > >>
> > >> https://zookeeper.apache.org/doc/r3.5.3-beta/zookeeperReconfig.html#sc_reconfig_standaloneEnabled
> > >>
> > >> After your single node is up and running, you can connect other nodes to
> > >> it as described in the reconfig manual. See "Adding servers" in the link
> > >> above.
> > >> Essentially, you need to specify the new server's initial config files so
> > >> that they can find some existing server and start syncing data. Once a
> > >> quorum of the new config is up to date, you can invoke the reconfig
> > >> command to officially make them part of the configuration.
> > >>
> > >> Thanks,
> > >> Alex
> > >>
> > >> On Thu, Dec 30, 2021 at 8:57 AM Eric Edgar <[email protected]> wrote:
> > >>
> > >> > Also would it be possible to update the code for this edge case, e.g. if
> > >> > the current quorum is 1, and you want to add a node then add a flag
> > >> > saying I trust the single master and reconfigure itself into a 2 node
> > >> > cluster?
> > >> > Thanks,
> > >> > Eric
> > >> >
> > >> > On Thu, Dec 30, 2021 at 10:49 AM Eric Edgar <[email protected]> wrote:
> > >> >
> > >> > > Are there any examples with a k8 orchestrator or some sort of docker
> > >> > > init scripts handling the initial cluster configuration?
> > >> > > Thanks,
> > >> > > Eric
> > >> > >
> > >> > > On Thu, Dec 30, 2021 at 9:44 AM Chris T. <[email protected]> wrote:
> > >> > >
> > >> > >> If you want to run a zookeeper cluster you have to start with at
> > >> > >> least 2 members. From there you can scale up with the dynamic
> > >> > >> reconfig commands.
> > >> > >> Regards
> > >> > >> Chris
> > >> > >>
> > >> > >> On 30 December 2021 16:40:40 Eric Edgar <[email protected]> wrote:
> > >> > >>
> > >> > >> > I am experimenting with zk and the reconfig feature and trying to
> > >> > >> > understand if I can start a single zk node and then
> > >> > >> > reconfig/bootstrap the other 2 nodes into the ensemble. The
> > >> > >> > reconfig command is throwing an error that there isn't a quorum
> > >> > >> > yet. Is this line of thinking possible? Or do I need to set up the
> > >> > >> > first 3 nodes manually the first time?
> > >> > >> > I am basing this experiment off of this web page.
> > >> > >> >
> > >> > >> > https://blog.container-solutions.com/dynamic-zookeeper-cluster-with-docker
> > >> > >> >
> > >> > >> > /opt/zookeeper/zookeeper/bin/zkCli.sh -server 10.1.1.104:2181 reconfig -add "server.2=10.1.1.40:2888:3888:participant;2181"
> > >> > >> > No quorum of new config is connected and up-to-date with the
> > >> > >> > leader of last commmitted config - try invoking reconfiguration
> > >> > >> > after new servers are connected and synced
> > >> > >> >
> > >> > >> > /opt/zookeeper/zookeeper/bin/zkCli.sh -server 10.1.1.104:2181 config
> > >> > >> > server.1=10.1.1.104:2888:3888:participant;0.0.0.0:2181
> > >> > >> >
> > >> > >> > cat ./zoo.cfg
> > >> > >> > autopurge.purgeInterval=1
> > >> > >> > initLimit=10
> > >> > >> > syncLimit=5
> > >> > >> > autopurge.snapRetainCount=6
> > >> > >> > tickTime=2000
> > >> > >> > dataDir=/mnt/zookeeper/data
> > >> > >> > reconfigEnabled=true
> > >> > >> > standaloneEnabled=false
> > >> > >> > dynamicConfigFile=/opt/zookeeper/zookeeper/conf/zoo.cfg.dynamic.1600000000
> > >> > >> >
> > >> > >> > What is the best solution for an unattended bootstrap setup of a
> > >> > >> > new cluster from scratch?
> > >> > >> >
> > >> > >> > This was something that we were able to accomplish with exhibitor
> > >> > >> > on older versions of zookeeper in the past.
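
For reference, the file setup from Alex's reply can be scripted. What follows is a minimal sketch and was not posted in the thread: BASE_DIR and write_cfg are made-up names, BASE_DIR stands in for /Users/shralex/my-zookeeper, and the script only writes the myid and config files. Starting the servers and running reconfig are still done by hand as described in the reply.

```shell
#!/usr/bin/env bash
# Sketch only: writes the two-server test layout from Alex's reply.
# BASE_DIR and write_cfg are hypothetical names, not part of ZooKeeper.
set -eu

BASE_DIR="${BASE_DIR:-$(mktemp -d)}"
mkdir -p "$BASE_DIR/conf" "$BASE_DIR/zookeeper1" "$BASE_DIR/zookeeper2"

# Each server needs a myid file in its dataDir.
echo 1 > "$BASE_DIR/zookeeper1/myid"
echo 2 > "$BASE_DIR/zookeeper2/myid"

SERVER1="server.1=localhost:2721:2731:participant;localhost:2791"
SERVER2="server.2=localhost:2741:2751:participant;localhost:2792"

# write_cfg ID CLIENT_PORT SERVER_LINE...
# Writes conf/zoo_replicated<ID>.cfg with the settings used in the reply.
write_cfg() {
  local id="$1" port="$2"
  shift 2
  {
    echo "dataDir=$BASE_DIR/zookeeper$id"
    echo "syncLimit=2"
    echo "initLimit=5"
    echo "tickTime=2000"
    echo "clientPort=$port"
    echo "reconfigEnabled=true"
    echo "standaloneEnabled=false"
    printf '%s\n' "$@"
  } > "$BASE_DIR/conf/zoo_replicated$id.cfg"
}

# Server 1 lists only itself; server 2 lists both, so it can find the leader.
write_cfg 1 2791 "$SERVER1"
write_cfg 2 2792 "$SERVER1" "$SERVER2"

echo "wrote configs under $BASE_DIR"
```

Server 2's file lists both servers while server 1's lists only itself, which matches the reply: the new server must know an existing member to sync from before `reconfig -add` can commit.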
