Hi everyone,

since this smells so much like a bug, I have now filed an issue in ZK's bug tracker:
https://issues.apache.org/jira/browse/ZOOKEEPER-4403

Best regards,
Marc

> On 25. Oct 2021, at 13:10, Marc Richter <[email protected]> wrote:
>
> Hi Andor,
>
> # How LE-certs were requested:
> 1. Using https://github.com/acmesh-official/acme.sh
> 2. Requesting the cert from Let's Encrypt using:
>    `./acme.sh --issue --dns dns_nsupdate -d zookeeper1.ourdomain.cloud`
>    for each system.
> 3. Merging the fullchain and certificate files into a single PKCS12 file using:
>    `openssl pkcs12 -export -in <certfile> -inkey <keyfile> -out <pkcs12_file> -name zookeeper1.ourdomain.cloud \
>    -CAfile <fullchainfile> -password <JKS_password>`
> 4. Adding the resulting PKCS12 file to the Quorum Keystore:
>    `keytool -importkeystore -deststorepass <JKS_password> -destkeypass <JKS_password> -deststoretype pkcs12 \
>    -srckeystore <pkcs12_file> -srcstoretype PKCS12 -srcstorepass <JKS_password> -destkeystore <quorum_jks> \
>    -alias zookeeper1.ourdomain.cloud`
>
> # Is this an IPv6-only environment? Do those hostnames resolve only to IPv6 addresses?
> The machines also have an IPv4 address assigned, but that exists only for
> infrastructure-standard reasons and is not used in our scenario: the service
> DNS record is IPv6-only, and the quorum config uses only that DNS record, as
> shown in the `server.n` lines before.
>
> I think the log lines cited before clearly show:
> 1. Zookeeper is picking up the correct certificate from the quorum keystore,
>    since it states that the request does not match any SAN and lists only
>    "zookeeper3.ourdomain.cloud", which it can only know from the certificate
>    itself.
> 2. Zookeeper is validating the wrong thing here: even though the config
>    clearly states to use a DNS name, the certificate's SANs are validated
>    against the IPv6 address that record belongs to instead of the configured
>    DNS name
>    (ERROR Failed to verify host address: 2a01:--CUT--:750)
>
> In my understanding, this is clearly a bug in Zookeeper's quorum certificate
> handling, and how the certificates were obtained should not matter much in
> this context. If it stated that it can't find any cert or similar, without
> naming the SAN as written in the certificate, then maybe. But this is a
> different case.
>
> Best regards,
> Marc
>
>> On 22. Oct 2021, at 01:20, Andor Molnar <[email protected]> wrote:
>>
>> Hi Marc,
>>
>> I need to take a closer look.
>> Would you please share how you requested the certificates from
>> Let's Encrypt?
>> Is this an IPv6-only environment? Do those hostnames resolve only to
>> IPv6 addresses?
>>
>> Regards,
>> Andor
>>
>> On Wed, 2021-10-13 at 15:47 +0200, Marc Richter wrote:
>>> Hi everyone,
>>>
>>> for some days now, I have been trying to wrap my head around TLS
>>> encryption for the quorum traffic. The hosts running Zookeeper have
>>> publicly available DNS names, and I am using those to issue SSL
>>> certificates from Let's Encrypt.
>>> This seems to work, but Zookeeper appears to validate the SSL
>>> certificates against the IP(v6) addresses of the connecting nodes
>>> instead of their hostnames.
>>>
>>> In the `zookeeper.properties` of all 3 of my nodes, I have set the
>>> servers by their DNS names like this:
>>>
>>> ```
>>> server.1=zookeeper1.ourdomain.cloud:2888:3888
>>> server.2=zookeeper2.ourdomain.cloud:2888:3888
>>> server.3=zookeeper3.ourdomain.cloud:2888:3888
>>> ```
>>>
>>> I requested SSL certificates from Let's Encrypt for these DNS names
>>> and added the certificate/key pairs to the keystores of the nodes.
>>>
>>> In the logs of the `zookeeper2` node, I now see something like this
>>> when the `zookeeper3` node tries to connect:
>>>
>>> ```
>>> [2021-10-13 15:13:49,960] INFO Received connection request from /2a01:--CUT--:750:47566 (org.apache.zookeeper.server.quorum.QuorumCnxManager)
>>> [2021-10-13 15:13:50,094] ERROR Failed to verify host address: 2a01:--CUT--:750 (org.apache.zookeeper.common.ZKTrustManager)
>>> javax.net.ssl.SSLPeerUnverifiedException: Certificate for <2a01:--CUT--:750> doesn't match any of the subject alternative names: [zookeeper3.ourdomain.cloud]
>>> ```
>>>
>>> Zookeeper seems to ignore the hostnames and complains that the IPv6
>>> address is not listed in the SANs of the presented certificate. Since
>>> most open CAs do not sign IP addresses (Let's Encrypt does not do
>>> that at all, ZeroSSL only for http auth, etc.), this behaviour
>>> forces me to run an internal CA and work with self-signed
>>> certificates, including all the negative things that come with it and
>>> a lot of extra effort.
>>>
>>> How can I make Zookeeper resolve this correctly?
>>>
>>> Best regards,
>>> Marc
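
To illustrate the mismatch reported in the log above, here is a minimal Python sketch of the usual peer-identity rule for certificates: an IP literal is compared only against IP-address SAN entries, while a hostname is compared against DNS SAN entries. This is not Zookeeper's code. The function names are made up for illustration, wildcard matching is left out, and `2001:db8::750` is a documentation-range placeholder for the address that is cut in the log.

```python
import ipaddress


def is_ip_literal(peer: str) -> bool:
    """Return True if `peer` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(peer)
        return True
    except ValueError:
        return False


def peer_matches_sans(peer, dns_sans, ip_sans=()):
    """Hypothetical helper: check a peer identity against a certificate's SANs.

    An IP literal is matched only against IP-address SANs; a hostname is
    matched (case-insensitively, no wildcards in this sketch) against DNS SANs.
    """
    if is_ip_literal(peer):
        peer_ip = ipaddress.ip_address(peer)
        return any(ipaddress.ip_address(san) == peer_ip for san in ip_sans)
    return any(peer.lower() == san.lower() for san in dns_sans)


# The certificate from the log carries a single DNS SAN and no IP SANs.
dns_sans = ["zookeeper3.ourdomain.cloud"]

# Verifying by the connecting address fails (placeholder IPv6 address).
print(peer_matches_sans("2001:db8::750", dns_sans))               # False

# Verifying by the configured DNS name would succeed.
print(peer_matches_sans("zookeeper3.ourdomain.cloud", dns_sans))  # True
```

Under these assumptions, a certificate issued only for a DNS name can never satisfy a check made against the peer's raw address, which matches the error shown in the log.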
