My suggestion is that we get this in and let it sit for a couple of weeks before cutting an RC. Alternatively, we could cut an RC and just give more time than the typical 2 weeks for people to test.
-Flavio

> On 29 Nov 2016, at 17:23, Patrick Hunt wrote:
>
> I did a bunch of manual testing last week. I tried running multiple cluster
> sizes, tried running new server against old server, also went through the
> rolling upgrade testing part of the document and that worked fine. afaict
> at this point we're ready to commit. If there are any concerns please speak
> up now as I intend to commit this soon.
>
> Patrick
>
> On Fri, Nov 18, 2016 at 12:15 PM, Patrick Hunt wrote:
>
>> As Flavio said originally on this thread this is a big change. Based on
>> the current status of the patch and the testing feedback it seems like
>> we've done significant work to ensure the quality of the change. Do folks
>> feel that there has been sufficient review/testing that we can commit this
>> and cut a 3.4.10 release? If not what specifically is left?
>>
>> Patrick
>>
>> On Wed, Nov 2, 2016 at 8:51 AM, Andrew Purtell wrote:
>>
>>>> My view on this-> Since ZK server
>>>> already has "jaas.config" in place for client-server auth, IMHO to
>>>> continue
>>>
>>> Right, also build the client-server authentication context
>>> programmatically.
>>>
>>>> How about pushing the current
>>>> approach as a first step and based on the community interests later we
>>>> could enhance this feature by programmatically building the JAAS context.
>>>
>>> Yes, this is what I was suggesting.
>>>
>>> Thanks for considering the idea.
>>>
>>>> 1) A bad host or bad Kerb principal keeps trying to establish a
>>>> connection with the Quorum.
>>>
>>>> 2) Trigger FLE several times.
>>>
>>> Let me get back to you.
>>>
>>> I think it would also not be difficult to create a new unit test that
>>> extends QuorumHammerTest (like ObserverQuorumHammerTest), sets up a
>>> secure configuration using the minikdc, and then uses QuorumBase methods
>>> to start and stop quorum peers at random while the generated load is
>>> hitting the quorum.
>>>
>>>
>>> On Wed, Nov 2, 2016 at 7:15 AM, Rakesh Radhakrishnan wrote:
>>>
>>>> Thanks a lot Andrew Purtell for your time and comments.
>>>>
>>>>>>>>> I like how Hadoop programmatically builds the JAAS context from its
>>>>>>>>> configuration such that the JAAS configuration file and definition of
>>>>>>>>> system property pointing to same are not needed. Would be nice if ZK
>>>>>>>>> could optionally do the same, that would be one fewer config file to
>>>>>>>>> get precisely right.
>>>>
>>>> It's really an interesting thought. My view on this-> Since ZK server
>>>> already has "jaas.config" in place for client-server auth, IMHO we should
>>>> continue with this jaas config and allow configuring the 'QuorumServer' &
>>>> 'QuorumLearner' quorum auth sections. How about pushing the current
>>>> approach as a first step and based on the community interests later we
>>>> could enhance this feature by programmatically building the JAAS context.
>>>> Does this make sense to you?
>>>>
>>>>
>>>>>>>> Any suggestions on some kind of hammer test to apply now ?
>>>> 1) A bad host or bad Kerb principal keeps trying to establish a
>>>> connection with the Quorum. Mostly this will increase the load on the
>>>> Leader ZK server, so we can observe how the system behaves.
>>>> 2) Trigger FLE several times. This can be done by finding the newly
>>>> elected Leader and killing it. Probably can do this many times.
>>>>
>>>>
>>>> Dear Committers,
>>>>
>>>> Could you please help me push ZOOKEEPER-2479 in? This would help to
>>>> evaluate the time taken for FLE (with auth enabled or disabled) and could
>>>> be used to compare the time taken across several runs programmatically,
>>>> in scripts or so. Thanks!
>>>>
>>>> Thanks,
>>>> Rakesh
>>>>
>>>> On Wed, Nov 2, 2016 at 6:59 AM, Andrew Purtell wrote:
>>>>
>>>>> I would like to relay a working example of a patched 3.4.9 ZK cluster
>>>>> running on one host (without containers).
>>>>> I can confirm mutual SASL auth
>>>>> among the quorum is required and enforced because when instances were
>>>>> slightly misconfigured with incorrect principal strings they couldn't
>>>>> participate in the quorum.
>>>>>
>>>>> 1. Assign extra loopback addresses. Example below is what you need to do
>>>>> for FreeBSD:
>>>>>
>>>>> $ sudo ifconfig lo0 127.0.1.1 netmask 255.255.255.0 alias
>>>>> $ sudo ifconfig lo0 127.0.2.1 netmask 255.255.255.0 alias
>>>>> $ sudo ifconfig lo0 127.0.3.1 netmask 255.255.255.0 alias
>>>>>
>>>>> 2. Create Kerberos principals. Example below is for Heimdal, MIT will be
>>>>> similar:
>>>>>
>>>>> $ sudo kadmin -l
>>>>> > add --random-key zookeeper
>>>>> > add --random-key zookeeper/127.0.1.1
>>>>> > add --random-key zookeeper/127.0.2.1
>>>>> > add --random-key zookeeper/127.0.3.1
>>>>> > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper
>>>>> > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper/127.0.1.1
>>>>> > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper/127.0.2.1
>>>>> > ext_keytab --keytab /var/tmp/test/zookeeper.keytab zookeeper/127.0.3.1
>>>>> > exit
>>>>>
>>>>> 3. Patch ZK with 1045, build a tarball, then unpack it into three install
>>>>> directories. Mine are /var/tmp/test/{1,2,3}.
>>>>>
>>>>> 4. Initialize 'myid' files:
>>>>>
>>>>> $ mkdir /var/tmp/test/1/data
>>>>> $ echo 1 > /var/tmp/test/1/data/myid
>>>>> $ mkdir /var/tmp/test/2/data
>>>>> $ echo 2 > /var/tmp/test/2/data/myid
>>>>> $ mkdir /var/tmp/test/3/data
>>>>> $ echo 3 > /var/tmp/test/3/data/myid
>>>>>
>>>>> 5. Create configuration files for each instance. Below are examples for
>>>>> instance 1. You will need to make substitutions for your Kerberos realm
>>>>> (mine is LOCAL), and for instances 2 and 3 change the path to the JAAS
>>>>> configuration file, path to data directory, and bind address for
>>>>> clientPortAddress as appropriate.
>>>>>
>>>>> conf/java.env:
>>>>>
>>>>> export JVMFLAGS="-Djava.security.auth.login.config=/var/tmp/test/1/conf/jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false"
>>>>>
>>>>> conf/jaas.conf:
>>>>>
>>>>> QuorumServer {
>>>>>   com.sun.security.auth.module.Krb5LoginModule required
>>>>>   useKeyTab=true
>>>>>   keyTab="/var/tmp/test/zookeeper.keytab"
>>>>>   storeKey=true
>>>>>   useTicketCache=false
>>>>>   debug=false
>>>>>   principal="zookeeper/127.0.1.1@LOCAL";
>>>>> };
>>>>>
>>>>> QuorumLearner {
>>>>>   com.sun.security.auth.module.Krb5LoginModule required
>>>>>   useKeyTab=true
>>>>>   keyTab="/var/tmp/test/zookeeper.keytab"
>>>>>   storeKey=true
>>>>>   useTicketCache=false
>>>>>   debug=false
>>>>>   principal="zookeeper/127.0.1.1@LOCAL";
>>>>> };
>>>>>
>>>>> conf/zoo.cfg:
>>>>>
>>>>> tickTime=2000
>>>>> initLimit=10
>>>>> syncLimit=5
>>>>> dataDir=/var/tmp/test/1/data
>>>>> clientPort=2181
>>>>> clientPortAddress=127.0.1.1
>>>>>
>>>>> quorum.auth.enableSasl=true
>>>>> quorum.auth.learnerRequireSasl=true
>>>>> quorum.auth.serverRequireSasl=true
>>>>> quorum.auth.learner.loginContext=QuorumLearner
>>>>> quorum.auth.server.loginContext=QuorumServer
>>>>> quorum.auth.kerberos.servicePrincipal=zookeeper/_HOST
>>>>>
>>>>> server.1=127.0.1.1:2888:3888
>>>>> server.2=127.0.2.1:2888:3888
>>>>> server.3=127.0.3.1:2888:3888
>>>>>
>>>>> 6. Launch the three instances in the foreground in separate terminals.
>>>>> Example for instance 1:
>>>>>
>>>>> $ cd /var/tmp/test/1
>>>>> $ bash ./bin/zkServer.sh start-foreground
>>>>>
>>>>> Having done all this, I can see successful authentication and bootstrap
>>>>> of a quorum.
>>>>>
>>>>> This example shares a keytab. No need to do that if you'd like to be
>>>>> pedantic. Clone the keytab file into the separate conf directories and
>>>>> update the jaas.conf files.
>>>>>
>>>>> I like how Hadoop programmatically builds the JAAS context from its
>>>>> configuration such that the JAAS configuration file and definition of
>>>>> system property pointing to same are not needed. Would be nice if ZK
>>>>> could optionally do the same, that would be one fewer config file to get
>>>>> precisely right.
>>>>>
>>>>> Any suggestions on some kind of hammer test to apply now ?
>>>>>
>>>>>
>>>>> On Fri, Oct 21, 2016 at 7:34 AM, Flavio Junqueira wrote:
>>>>>
>>>>>> Michael Han posted an update to the test plan to the jira and I want to
>>>>>> call the attention of the community to it. It is a big change that we
>>>>>> need to be extra careful about because it is supposed to go to the 3.4
>>>>>> branch. It'd be great to have more folks in the community involved with
>>>>>> the testing. If you have cycles and interest, please help with testing.
>>>>>>
>>>>>> -Flavio
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>>
>>>>> - Andy
>>>>>
>>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>>> (via Tom White)
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>>
>>> - Andy
>>>
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>>>
>>
>>
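
For anyone repeating Andrew's single-host walkthrough quoted above, the per-instance files from steps 4 and 5 could be generated rather than edited by hand. What follows is only a rough sketch, not part of the patch: the helper names (addr, render_java_env, render_jaas, render_zoo_cfg, write_instance) are made up for illustration. It assumes instance n binds to 127.0.n.1, that dataDir is the directory holding myid (<base>/<n>/data), and that each instance logs in as zookeeper/127.0.n.1, which the walkthrough only spells out for instance 1.

```python
from pathlib import Path

REALM = "LOCAL"        # realm used in the walkthrough; substitute your own
INSTANCES = (1, 2, 3)  # assumption: instance n binds to 127.0.n.1


def addr(n):
    """Loopback alias assigned to instance n in step 1."""
    return f"127.0.{n}.1"


def render_java_env(base, n):
    """conf/java.env pointing the JVM at this instance's jaas.conf."""
    return (
        f'export JVMFLAGS="-Djava.security.auth.login.config={base}/{n}/conf/jaas.conf '
        '-Djavax.security.auth.useSubjectCredsOnly=false"\n'
    )


def render_jaas(base, n, realm=REALM):
    """conf/jaas.conf with identical QuorumServer and QuorumLearner sections."""
    body = (
        "  com.sun.security.auth.module.Krb5LoginModule required\n"
        "  useKeyTab=true\n"
        f'  keyTab="{base}/zookeeper.keytab"\n'  # shared keytab, as in the walkthrough
        "  storeKey=true\n"
        "  useTicketCache=false\n"
        "  debug=false\n"
        # assumption: the principal follows the instance address
        f'  principal="zookeeper/{addr(n)}@{realm}";\n'
    )
    return "".join(
        f"{name} {{\n{body}}};\n\n" for name in ("QuorumServer", "QuorumLearner")
    )


def render_zoo_cfg(base, n):
    """conf/zoo.cfg with the quorum auth settings from the walkthrough."""
    lines = [
        "tickTime=2000",
        "initLimit=10",
        "syncLimit=5",
        f"dataDir={base}/{n}/data",  # assumption: same directory that holds myid
        "clientPort=2181",
        f"clientPortAddress={addr(n)}",
        "",
        "quorum.auth.enableSasl=true",
        "quorum.auth.learnerRequireSasl=true",
        "quorum.auth.serverRequireSasl=true",
        "quorum.auth.learner.loginContext=QuorumLearner",
        "quorum.auth.server.loginContext=QuorumServer",
        "quorum.auth.kerberos.servicePrincipal=zookeeper/_HOST",
        "",
    ]
    lines += [f"server.{i}={addr(i)}:2888:3888" for i in INSTANCES]
    return "\n".join(lines) + "\n"


def write_instance(base, n):
    """Create <base>/<n>/data/myid and the three conf files for instance n."""
    root = Path(base) / str(n)
    (root / "conf").mkdir(parents=True, exist_ok=True)
    (root / "data").mkdir(parents=True, exist_ok=True)
    (root / "data" / "myid").write_text(f"{n}\n")
    (root / "conf" / "java.env").write_text(render_java_env(base, n))
    (root / "conf" / "jaas.conf").write_text(render_jaas(base, n))
    (root / "conf" / "zoo.cfg").write_text(render_zoo_cfg(base, n))
```

Calling write_instance("/var/tmp/test", n) for each n in INSTANCES, after unpacking the tarballs, should leave the three install directories laid out as described. Nothing is written until write_instance is called, so the render functions can be checked on their own first.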
