[ 
https://issues.apache.org/jira/browse/SOLR-11362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Calvin Hartwell updated SOLR-11362:
-----------------------------------
    Description: 
Hey all,

I ran into a strange scenario recently and thought I'd share it. It was very 
frustrating, and I only discovered the fix on a whim. 

Let's imagine I have three nodes which form a SolrCloud cluster: 

- node0.someaddress.com
- node1.someaddress.com
- node2.someaddress.com 

Each of these machines has its own SSL key and a certificate, generated from a 
CSR and signed by a CA. The truststore contains the public certificate of the 
CA (set via SOLR_SSL_TRUST_STORE, as per the manual). 

The keystore (SOLR_SSL_KEY_STORE) contains three entries: one for the CA public 
cert, and two entries for the server itself with different aliases (one has the 
alias set to the FQDN, the other is set to localhost). 

All parameters for SSL/TLS are configured correctly as per the Solr manuals. 
Obviously the keystore (SOLR_SSL_KEY_STORE) only needs the single cert/private 
key for the server with no other entries, but this setup works 100% with Kafka 
using the three entries. 

Here is an example:

keytool -list -keystore node0.jks 
localhost ..(omitted)
node0.someaddress.com ...(omitted)
cacert ..(omitted)

keytool -list -keystore node1.jks 
localhost ..(omitted)
cacert ..(omitted)
node1.someaddress.com ...(omitted)

keytool -list -keystore node2.jks 
node2.someaddress.com ...(omitted)
localhost ..(omitted)
cacert ..(omitted)
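
To check which of those entries hold a private key and which are only trusted 
certificates, something like the following could be used. It is a minimal 
sketch, not anything Solr ships: the class name KeystoreEntries, the label() 
helper, and the two command-line arguments (keystore path and store password) 
are made up for illustration. Only the java.security.KeyStore calls are 
standard JDK API, and a JKS-format keystore is assumed.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.util.Collections;

public class KeystoreEntries {

    // Hypothetical helper: turns the two KeyStore checks into a readable label.
    static String label(boolean isKeyEntry, boolean isCertEntry) {
        if (isKeyEntry) {
            return "key entry";
        }
        if (isCertEntry) {
            return "trusted certificate";
        }
        return "other";
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: KeystoreEntries <keystore.jks> <storepass>");
            return;
        }
        // Assumption: the keystore is in JKS format, as the .jks names suggest.
        KeyStore ks = KeyStore.getInstance("JKS");
        try (InputStream in = new FileInputStream(args[0])) {
            ks.load(in, args[1].toCharArray());
        }
        // Print every alias together with the kind of entry stored under it.
        for (String alias : Collections.list(ks.aliases())) {
            System.out.println(alias + ": "
                    + label(ks.isKeyEntry(alias), ks.isCertificateEntry(alias)));
        }
    }
}
```

With the keystores described above, the expectation would be two key entries 
(localhost and the FQDN) and one trusted certificate (cacert) per node.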

Here is the interesting part: with this setup, when the nodes are started, only 
one of the three starts correctly (in my case, node1.someaddress.com). The 
other two nodes (node0.someaddress.com, node2.someaddress.com) fail with a 
handshake_failure error. Running solr status against the two broken nodes 
doesn't work, but the same command works fine for the working node. 

I enabled the most detailed level of logging and monitored the handshake but 
couldn't see anything really amiss; all the configuration properties were set 
correctly. 

What I noticed was this: when running keytool to list the keys for each 
keystore, the certificates in the keystore were displayed in different orders, 
like they had been sorted alphabetically by the keytool CLI tool. This gave me 
the idea to delete the other entries from each node's keystore so that each 
contained only a single entry, for the FQDN. 

So the keystores now looked like this: 
keytool -list -keystore solrkeystore.jks ...
node0.someaddress.com ...(omitted)

keytool -list -keystore solrkeystore.jks ...
node1.someaddress.com ...(omitted)

keytool -list -keystore solrkeystore.jks ...
node2.someaddress.com ...(omitted)
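
The same cleanup could also be scripted instead of done by hand. The following 
is a rough sketch under a few assumptions: the keystore is in JKS format, the 
trimmed copy is written to a new file rather than over the original, and the 
class name KeystoreTrim, the aliasesToRemove() helper, and the four 
command-line arguments are all hypothetical names chosen for this example.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class KeystoreTrim {

    // Hypothetical helper: every alias except the one to keep, in input order.
    static List<String> aliasesToRemove(List<String> aliases, String keep) {
        List<String> remove = new ArrayList<>();
        for (String alias : aliases) {
            if (!alias.equalsIgnoreCase(keep)) {
                remove.add(alias);
            }
        }
        return remove;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println(
                "usage: KeystoreTrim <in.jks> <out.jks> <storepass> <alias-to-keep>");
            return;
        }
        char[] storePass = args[2].toCharArray();

        // Assumption: JKS format, matching the .jks files in this report.
        KeyStore ks = KeyStore.getInstance("JKS");
        try (InputStream in = new FileInputStream(args[0])) {
            ks.load(in, storePass);
        }

        // Drop everything except the entry for the node's FQDN alias.
        List<String> all = Collections.list(ks.aliases());
        for (String alias : aliasesToRemove(all, args[3])) {
            ks.deleteEntry(alias);
        }

        // Write the trimmed keystore to a separate file; the input is untouched.
        try (OutputStream out = new FileOutputStream(args[1])) {
            ks.store(out, storePass);
        }
    }
}
```

For node0 this would be run with node0.someaddress.com as the alias to keep, 
leaving the localhost and cacert entries out of the new file.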

After I did this and restarted, the Solr nodes started working fine again, so 
here are my questions: 

1) Why does this setup work with Kafka and not Solr, if the Java classes used 
should be very similar? 

2) Why can't I use multiple keys/certs in the keystore? Is this expected 
behavior? 

  was:
Hey all,

I ran into a strange scenario recently and thought I'd share it. It was very 
frustrating, and I only discovered the fix on a whim. Let's imagine I have 
three nodes which form a SolrCloud cluster: 

- node0.someaddress.com
- node1.someaddress.com
- node2.someaddress.com 

Each of these machines has its own SSL key and a certificate, generated from a 
CSR and signed by a CA. The truststore contains the public certificate of the 
CA (set via SOLR_SSL_TRUST_STORE, as per the manual). 

The keystore (SOLR_SSL_KEY_STORE) contains three entries: one for the CA public 
cert, and two entries for the server itself with different aliases (one has the 
alias set to the FQDN, the other is set to localhost). 

All parameters for SSL/TLS are configured correctly as per the Solr manuals. 
Obviously the keystore (SOLR_SSL_KEY_STORE) only needs the single cert/private 
key for the server with no other entries, but this setup works 100% with Kafka 
using the three entries. 

Here is an example:

keytool -list -keystore solrkeystore.jks 
localhost ..(omitted)
node0.someaddress.com ...(omitted)
cacert ..(omitted)

Here is the interesting part: with this setup, when the nodes are started, only 
one of the three starts correctly (in my case, node1.someaddress.com). The 
other two nodes (node0.someaddress.com, node2.someaddress.com) fail with a 
handshake_failure error. Running solr status against the two broken nodes 
doesn't work, but the same command works fine for the working node. 

I enabled the most detailed level of logging and monitored the handshake but 
couldn't see anything really amiss; all the configuration properties were set 
correctly. 

What I noticed was this: when running keytool to list the keys for each 
keystore, the certificates in the keystore were displayed in different orders, 
like they had been sorted alphabetically by the keytool CLI tool. This gave me 
the idea to delete the other entries from each node's keystore so that each 
contained only a single entry, for the FQDN. 

So the keystores now looked like this: 
keytool -list -keystore solrkeystore.jks ...
node0.someaddress.com ...(omitted)

After I did this and restarted, the Solr nodes started working fine again, so 
here are my questions: 

1) Why does this setup work with Kafka and not Solr, if the Java classes used 
should be very similar? 

2) Why can't I use multiple keys/certs in the keystore? Is this expected 
behavior? 


> Solr Cloud SSL handshake_failure keystore (SOLR_SSL_KEY_STORE) multiple certs 
> issue
> -----------------------------------------------------------------------------------
>
>                 Key: SOLR-11362
>                 URL: https://issues.apache.org/jira/browse/SOLR-11362
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 6.6.0
>         Environment: CentOS 7.3, Virtual Machines, AWS. 
>            Reporter: Calvin Hartwell
>            Priority: Minor


