I did a bunch of manual testing last week. I ran multiple cluster sizes, ran new servers against old servers, and also went through the rolling upgrade testing part of the document, which worked fine. As far as I can tell we're ready to commit at this point. If there are any concerns please speak up now, as I intend to commit this soon.
Patrick

On Fri, Nov 18, 2016 at 12:15 PM, Patrick Hunt <ph...@apache.org> wrote:

> As Flavio said originally on this thread this is a big change. Based on
> the current status of the patch and the testing feedback it seems like
> we've done significant work to ensure the quality of the change. Do folks
> feel that there has been sufficient review/testing that we can commit this
> and cut a 3.4.10 release? If not what specifically is left?
>
> Patrick
>
> On Wed, Nov 2, 2016 at 8:51 AM, Andrew Purtell <apurt...@apache.org>
> wrote:
>
>> > My view on this-> Since ZK server
>> > already has "jaas.config" in place for client-server auth, IMHO we
>> > should continue
>>
>> Right, also build the client-server authentication context
>> programmatically.
>>
>> > How about pushing the current
>> > approach as a first step and, based on community interest, later we
>> > could enhance this feature by programmatically building the JAAS context.
>>
>> Yes, this is what I was suggesting.
>>
>> Thanks for considering the idea.
>>
>> > 1) A bad host or bad Kerb principal keeps trying to establish a
>> > connection with the Quorum.
>>
>> > 2) Trigger FLE several times.
>>
>> Let me get back to you.
>>
>> I think it would also not be difficult to create a new unit test that
>> extends QuorumHammerTest (like ObserverQuorumHammerTest), sets up a
>> secure configuration using the minikdc, and then uses QuorumBase methods
>> to start and stop quorum peers at random while the generated load is
>> hitting the quorum.
>>
>>
>> On Wed, Nov 2, 2016 at 7:15 AM, Rakesh Radhakrishnan <rake...@apache.org>
>> wrote:
>>
>> > Thanks a lot Andrew Purtell for your time and comments.
>> >
>> > >>>>>I like how Hadoop programmatically builds the JAAS context from its
>> > >>>>>configuration such that the JAAS configuration file and definition of
>> > >>>>>system property pointing to same are not needed.
>> > >>>>>Would be nice if ZK could
>> > >>>>>optionally do the same, that would be one fewer config file to get
>> > >>>>>precisely right.
>> >
>> > It's really an interesting thought. My view on this-> Since ZK server
>> > already has "jaas.config" in place for client-server auth, IMHO we
>> > should continue with this JAAS config and allow configuring the
>> > 'QuorumServer' and 'QuorumLearner' quorum auth sections in it. How about
>> > pushing the current approach as a first step and, based on community
>> > interest, later we could enhance this feature by programmatically
>> > building the JAAS context.
>> > Does this make sense to you?
>> >
>> > >>>>Any suggestions on some kind of hammer test to apply now ?
>> > 1) A bad host or bad Kerb principal keeps trying to establish a
>> > connection with the Quorum. Mostly this will increase the load on the
>> > Leader ZK server, and we can observe how the system behaves.
>> > 2) Trigger FLE several times. This can be done by finding the newly
>> > elected Leader and killing it. This can probably be repeated many times.
>> >
>> >
>> > Dear Committers,
>> >
>> > Could you please help me push ZOOKEEPER-2479 in? This would help to
>> > evaluate the time taken for FLE (with auth enabled or disabled) and
>> > could be used to compare the time taken across several runs
>> > programmatically, in scripts or so. Thanks!
>> >
>> > Thanks,
>> > Rakesh
>> >
>> > On Wed, Nov 2, 2016 at 6:59 AM, Andrew Purtell <apurt...@apache.org>
>> > wrote:
>> >
>> > > I would like to relay a working example of a patched 3.4.9 ZK cluster
>> > > running on one host (without containers). I can confirm mutual SASL auth
>> > > among the quorum is required and enforced because when instances were
>> > > slightly misconfigured with incorrect principal strings they couldn't
>> > > participate in the quorum.
>> > >
>> > > 1. Assign extra loopback addresses.
>> > > Example below is what you need to do
>> > > for FreeBSD:
>> > >
>> > > $ sudo ifconfig lo0 127.0.1.1 netmask 255.255.255.0 alias
>> > > $ sudo ifconfig lo0 127.0.2.1 netmask 255.255.255.0 alias
>> > > $ sudo ifconfig lo0 127.0.3.1 netmask 255.255.255.0 alias
>> > >
>> > > 2. Create Kerberos principals. Example below is for Heimdal, MIT will be
>> > > similar:
>> > >
>> > > $ sudo kadmin -l
>> > > > add --random-key zookeeper
>> > > > add --random-key zookeeper/127.0.1.1
>> > > > add --random-key zookeeper/127.0.2.1
>> > > > add --random-key zookeeper/127.0.3.1
>> > > > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper
>> > > > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper/127.0.1.1
>> > > > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper/127.0.2.1
>> > > > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper/127.0.3.1
>> > > > exit
>> > >
>> > > 3. Patch ZK with 1045, build a tarball, then unpack it into three install
>> > > directories. Mine are /var/tmp/test/{1,2,3}.
>> > >
>> > > 4. Initialize 'myid' files:
>> > >
>> > > $ mkdir /var/tmp/test/1/data
>> > > $ echo 1 > /var/tmp/test/1/data/myid
>> > > $ mkdir /var/tmp/test/2/data
>> > > $ echo 2 > /var/tmp/test/2/data/myid
>> > > $ mkdir /var/tmp/test/3/data
>> > > $ echo 3 > /var/tmp/test/3/data/myid
>> > >
>> > > 5. Create configuration files for each instance. Below are examples for
>> > > instance 1. You will need to make substitutions for your Kerberos realm
>> > > (mine is LOCAL), and for instances 2 and 3 change the path to the JAAS
>> > > configuration file, path to data directory, and bind address for
>> > > clientPortAddress as appropriate.
>> > >
>> > > conf/java.env:
>> > >
>> > > export JVMFLAGS="-Djava.security.auth.login.config=/var/tmp/test/1/conf/jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false"
>> > >
>> > > conf/jaas.conf:
>> > >
>> > > QuorumServer {
>> > >   com.sun.security.auth.module.Krb5LoginModule required
>> > >   useKeyTab=true
>> > >   keyTab="/var/tmp/test/zookeeper.keytab"
>> > >   storeKey=true
>> > >   useTicketCache=false
>> > >   debug=false
>> > >   principal="zookeeper/127.0.1.1@LOCAL";
>> > > };
>> > >
>> > > QuorumLearner {
>> > >   com.sun.security.auth.module.Krb5LoginModule required
>> > >   useKeyTab=true
>> > >   keyTab="/var/tmp/test/zookeeper.keytab"
>> > >   storeKey=true
>> > >   useTicketCache=false
>> > >   debug=false
>> > >   principal="zookeeper/127.0.1.1@LOCAL";
>> > > };
>> > >
>> > > conf/zoo.cfg:
>> > >
>> > > tickTime=2000
>> > > initLimit=10
>> > > syncLimit=5
>> > > dataDir=/var/tmp/test/1/data
>> > > clientPort=2181
>> > > clientPortAddress=127.0.1.1
>> > >
>> > > quorum.auth.enableSasl=true
>> > > quorum.auth.learnerRequireSasl=true
>> > > quorum.auth.serverRequireSasl=true
>> > > quorum.auth.learner.loginContext=QuorumLearner
>> > > quorum.auth.server.loginContext=QuorumServer
>> > > quorum.auth.kerberos.servicePrincipal=zookeeper/_HOST
>> > >
>> > > server.1=127.0.1.1:2888:3888
>> > > server.2=127.0.2.1:2888:3888
>> > > server.3=127.0.3.1:2888:3888
>> > >
>> > > 6. Launch the three instances in the foreground in separate terminals.
>> > > Example for instance 1:
>> > >
>> > > $ cd /var/tmp/test/1
>> > > $ bash ./bin/zkServer.sh start-foreground
>> > >
>> > > Having done all this, I can see successful authentication and bootstrap of
>> > > a quorum.
>> > >
>> > > This example shares a keytab. No need to do that if you'd like to be
>> > > pedantic. Clone the keytab file into the separate conf directories and
>> > > update the jaas.conf files.
>> > >
>> > > I like how Hadoop programmatically builds the JAAS context from its
>> > > configuration such that the JAAS configuration file and definition of
>> > > system property pointing to same are not needed. Would be nice if ZK could
>> > > optionally do the same, that would be one fewer config file to get
>> > > precisely right.
>> > >
>> > > Any suggestions on some kind of hammer test to apply now ?
>> > >
>> > >
>> > > On Fri, Oct 21, 2016 at 7:34 AM, Flavio Junqueira <f...@apache.org> wrote:
>> > >
>> > > > Michael Han posted an update to the test plan to the jira and I want to
>> > > > call the attention of the community to it. It is a big change that we need
>> > > > to be extra careful about because it is supposed to go to the 3.4 branch.
>> > > > It'd be great to have more folks in the community involved with the
>> > > > testing. If you have cycles and interest, please help with testing.
>> > > >
>> > > > -Flavio
>> > >
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > > - Andy
>> > >
>> > > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> > > (via Tom White)
>> > >
>> >
>>
>>
>> --
>> Best regards,
>>
>> - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>
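
For anyone repeating Andrew's three-instance loopback setup, the per-instance files differ only in the instance number, so they can be generated instead of hand-edited. The following is a rough sketch, not part of the patch: the function names and the BASE and REALM variables are made up for illustration, the property names are copied as-is from the example in this thread (they may differ in whatever finally gets committed), and dataDir is assumed to be the directory under each install directory where the myid file was created.

```shell
#!/bin/sh
# Sketch only: prints per-instance config for the three-node loopback quorum
# described in this thread. BASE, REALM and all function names are
# illustrative assumptions, not anything ZooKeeper itself provides.

BASE="${BASE:-/var/tmp/test}"
REALM="${REALM:-LOCAL}"

# Loopback alias used by instance $1 (127.0.<id>.1, as in the thread).
instance_addr() {
    echo "127.0.$1.1"
}

# Print conf/jaas.conf for instance $1: the same Krb5LoginModule stanza
# under both the QuorumServer and QuorumLearner section names.
render_jaas_conf() {
    for section in QuorumServer QuorumLearner; do
        cat <<EOF
$section {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="$BASE/zookeeper.keytab"
  storeKey=true
  useTicketCache=false
  debug=false
  principal="zookeeper/$(instance_addr "$1")@$REALM";
};

EOF
    done
}

# Print conf/zoo.cfg for instance $1. Property names are taken verbatim
# from the example earlier in the thread.
render_zoo_cfg() {
    cat <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=$BASE/$1/data
clientPort=2181
clientPortAddress=$(instance_addr "$1")

quorum.auth.enableSasl=true
quorum.auth.learnerRequireSasl=true
quorum.auth.serverRequireSasl=true
quorum.auth.learner.loginContext=QuorumLearner
quorum.auth.server.loginContext=QuorumServer
quorum.auth.kerberos.servicePrincipal=zookeeper/_HOST

server.1=$(instance_addr 1):2888:3888
server.2=$(instance_addr 2):2888:3888
server.3=$(instance_addr 3):2888:3888
EOF
}
```

The functions only print to stdout, so nothing is written until the output is redirected, for example `render_zoo_cfg 2 > /var/tmp/test/2/conf/zoo.cfg` and `render_jaas_conf 2 > /var/tmp/test/2/conf/jaas.conf`. The java.env file still needs its JAAS path adjusted per instance by hand.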