[ https://issues.apache.org/jira/browse/CASSANDRA-16999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812561#comment-17812561 ]
Bret McGuire commented on CASSANDRA-16999:
------------------------------------------

Hey [~smiklosovic] and [~brandon.williams] .... finally got some time to test this this evening. Looks like we're not quite there yet.

My testing process went as follows:

* Create a three-node ccm cluster backed by Brandon's branch
* Add native_transport_port_ssl to cassandra.yaml + configure client_encryption_options
* Run a simple client app which attempts to connect to one of the nodes on native_transport_port_ssl
* Look for exceptions like the following:

{noformat}
750 [s0-admin-0] DEBUG com.datastax.oss.driver.internal.core.pool.ChannelPool - [s0|localhost/127.0.0.1:9142] Trying to create 1 missing channels
759 [s0-io-4] DEBUG com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler - [s0|connecting...] Starting channel initialization
817 [s0-io-5] DEBUG com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler - [s0|connecting...] Starting channel initialization
818 [s0-admin-0] WARN com.datastax.oss.driver.internal.core.pool.ChannelPool - [s0|/127.0.0.2:9042] Error while opening new channel
com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|id: 0xab6e686b, L:/127.0.0.1:39342 - R:127.0.0.2/127.0.0.2:9042] Protocol initialization request, step 1 (STARTUP {CQL_VERSION=3.0.0, DRIVER_NAME=DataStax Java driver for Apache Cassandra(R), DRIVER_VERSION=4.10.0, CLIENT_ID=cab7f27e-7c6f-45f3-8d6f-a8f0bde73347}): failed to send request (io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 8500000000000000680000000a0062496e76616c6964206f7220756e737570706f727465642070726f746f636f6c2076657273696f6e20283232293b20737570706f727465642076657273696f6e73206172652028332f76332c20342f76342c20352f76352c20362f76362d6265746129)
	at com.datastax.oss.driver.internal.core.channel.ProtocolInitHandler$InitRequest.fail(ProtocolInitHandler.java:354)
	at com.datastax.oss.driver.internal.core.channel.ChannelHandlerRequest.writeListener(ChannelHandlerRequest.java:87)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
...
{noformat}

These exceptions correspond to the driver attempting to connect to hosts other than the control connection as discovered via the contents of system.peers_v2. In the case above port 9042 (native_transport_port, note no "ssl" in there) is used... which leads to the "not an SSL/TLS record" issues.

Unfortunately I'm seeing the same thing when running a test client against Brandon's branch. I haven't dug into this in any detail but the cause sure looks to be the following:

{noformat}
$ ccm status
Cluster: 'cass-402-16999bw-3'
-----------------------------
node1: UP
node2: UP
node3: UP
$ ccm node1 cqlsh
Connected to cass-409-3 at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.9-SNAPSHOT | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh> select * from system.peers_v2;

 peer      | peer_port | data_center | host_id                              | native_address | native_port | native_port_ssl | preferred_ip | preferred_port | rack  | release_version | schema_version                       | tokens
-----------+-----------+-------------+--------------------------------------+----------------+-------------+-----------------+--------------+----------------+-------+-----------------+--------------------------------------+--------------------------
 127.0.0.3 |      7000 | datacenter1 | 701aa9e6-6a7d-4ec1-9c1e-a103bc56a338 |      127.0.0.3 |        9042 |            9042 |         null |           null | rack1 |  4.0.9-SNAPSHOT | 2207c2a9-f598-3971-986b-2926e09e239d |  {'3074457345618258602'}
 127.0.0.2 |      7000 | datacenter1 | 3664025a-b102-44f1-a7eb-1a5cd07040d2 |      127.0.0.2 |        9042 |            9042 |         null |           null | rack1 |  4.0.9-SNAPSHOT | 2207c2a9-f598-3971-986b-2926e09e239d | {'-3074457345618258603'}

(2 rows)
{noformat}

"native_port" and "native_port_ssl" are both showing up as 9042.
This is correct for "native_port" but incorrect for "native_port_ssl".

> system.peers and system.peers_v2 do not contain the native_transport and/or
> native_transport_port_ssl
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16999
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16999
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Other
>            Reporter: Steve Lacerda
>            Assignee: Brandon Williams
>            Priority: Normal
>             Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> system.peers_v2 includes a “native_port” but has no notion of
> native_transport_port vs. native_transport_port_ssl. Given this limited
> information, there’s no clear way for the driver to know that different ports
> are being used for SSL vs. non-SSL or which of those two ports is identified
> by “native_port”.
>
> The issue we ran into is that the java driver, since it has no notion of the
> transport port SSL, the driver was only using the contact points and was not
> load balancing.
>
> The customer had both set:
> native_transport_port: 9042
> native_transport_port_ssl: 9142
>
> They were attempting to connect to 9142, but that was failing. They could
> only use 9042, and so their applications load balancing was failing. We found
> that any node that was a contact point was connecting, but the other nodes
> were never acting as coordinators.
>
> There are still issues in the driver, for which I have created JAVA-2967,
> which also refers to JAVA-2638, but the system.peers and system.peers_v2
> tables should both contain native_transport_port and
> native_transport_port_ssl.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)