[ https://issues.apache.org/jira/browse/CASSANDRA-17945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Semb Wever updated CASSANDRA-17945:
-------------------------------------------
    Fix Version/s:     (was: 4.1-rc)

> Fix StorageService.getNativeaddress handling of IPv6 addresses
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-17945
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17945
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Gossip
>            Reporter: Andy Tolbert
>            Assignee: Andy Tolbert
>            Priority: Normal
>             Fix For: 4.0.x, 4.1-rc1, 4.1
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> StorageService.getNativeaddress does not account for IPv6 addresses in the
> case where NATIVE_ADDRESS_AND_PORT is not present in gossip state for an
> endpoint.
> While upgrading a cluster using IPv6 addresses from 3.0 to 4.0, I noticed the
> following in the logs of upgraded nodes when processing down events for 3.0
> nodes that are going down as part of the upgrade:
>
> {noformat}
> 2022-09-28 20:18:48,244 ERROR [GossipStage:1] org.apache.cassandra.transport.Server - Problem retrieving RPC address for /[0:0:0:0:0:0:0:d9]:7000
> java.net.UnknownHostException: 0:0:0:0:0:0:0:d9:9042: invalid IPv6 address
>     at java.net.InetAddress.getAllByName(InetAddress.java:1355) ~[?:?]
>     at java.net.InetAddress.getAllByName(InetAddress.java:1306) ~[?:?]
>     at java.net.InetAddress.getByName(InetAddress.java:1256) ~[?:?]
>     at org.apache.cassandra.locator.InetAddressAndPort.getByNameOverrideDefaults(InetAddressAndPort.java:227)
>     at org.apache.cassandra.locator.InetAddressAndPort.getByName(InetAddressAndPort.java:212)
>     at org.apache.cassandra.transport.Server$EventNotifier.getNativeAddress(Server.java:377)
>     at org.apache.cassandra.transport.Server$EventNotifier.onDown(Server.java:438)
>     at org.apache.cassandra.service.StorageService.notifyDown(StorageService.java:2651)
>     at org.apache.cassandra.service.StorageService.onDead(StorageService.java:3516)
>     at org.apache.cassandra.gms.Gossiper.markDead(Gossiper.java:1347)
>     at org.apache.cassandra.gms.Gossiper.markAsShutdown(Gossiper.java:590)
>     at org.apache.cassandra.gms.GossipShutdownVerbHandler.doVerb(GossipShutdownVerbHandler.java:39)
>     at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>     at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>     at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>     at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.58.Final.jar:4.1.58.Final]
>     at java.lang.Thread.run(Thread.java:829) [?:?]
> {noformat}
> It appears that StorageService.getNativeaddress does not account for the fact
> that an endpoint may be an IPv6 address, which requires brackets when
> specified with a port:
>
> [https://github.com/apache/cassandra/blob/cassandra-4.0.6/src/java/org/apache/cassandra/service/StorageService.java#L1978-L1981]
>
> {code:java}
>     /**
>      * Return the native address associated with an endpoint as a string.
>      * @param endpoint The endpoint to get rpc address for
>      * @return the native address
>      */
>     public String getNativeaddress(InetAddressAndPort endpoint, boolean withPort)
>     {
>         if (endpoint.equals(FBUtilities.getBroadcastAddressAndPort()))
>             return FBUtilities.getBroadcastNativeAddressAndPort().getHostAddress(withPort);
>         else if (Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.NATIVE_ADDRESS_AND_PORT) != null)
>         {
>             try
>             {
>                 InetAddressAndPort address = InetAddressAndPort.getByName(Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.NATIVE_ADDRESS_AND_PORT).value);
>                 return address.getHostAddress(withPort);
>             }
>             catch (UnknownHostException e)
>             {
>                 throw new RuntimeException(e);
>             }
>         }
>         else if (Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.RPC_ADDRESS) == null)
>             return endpoint.address.getHostAddress() + ":" + DatabaseDescriptor.getNativeTransportPort();
>         else
>             return Gossiper.instance.getEndpointStateForEndpoint(endpoint).getApplicationState(ApplicationState.RPC_ADDRESS).value + ":" + DatabaseDescriptor.getNativeTransportPort();
>     }
> {code}
> In the two final else cases, the endpoint address and port are delimited with
> a colon.
> For IPv6 addresses, this creates an invalid address
> (0:0:0:0:0:0:0:d9:9042); IPv6 addresses must be enclosed in brackets (e.g.
> [0:0:0:0:0:0:0:d9]:9042) per
> [https://datatracker.ietf.org/doc/html/rfc2732#section-2].
> Once a cluster is fully upgraded to 4.0, this error no longer occurs, as all
> endpoints will have NATIVE_ADDRESS_AND_PORT in their gossip state. This
> appears to be an issue only in mixed-version clusters, and the impact seems
> low (4.0 nodes miss down events for 3.0 nodes).
> I'll have a proposed PR for this up shortly.
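
The bracket rule from the report can be illustrated with a small standalone sketch. This is not the patch proposed on the ticket and does not use any Cassandra classes; the class name HostPortFormatter and the method joinHostAndPort are hypothetical, chosen only for illustration. It assumes the host is passed as a plain string and treats any host containing a colon as an IPv6 literal, since IPv4 literals and hostnames cannot contain one.

```java
// Hypothetical helper, not part of Cassandra: joins a host and a port,
// wrapping IPv6 literals in brackets as described in RFC 2732.
final class HostPortFormatter
{
    private HostPortFormatter() {}

    public static String joinHostAndPort(String host, int port)
    {
        // Assumption: a colon in the host means it is an IPv6 literal.
        boolean looksLikeIPv6 = host.indexOf(':') >= 0;
        // Leave literals that already carry brackets untouched.
        boolean alreadyBracketed = host.startsWith("[") && host.endsWith("]");

        if (looksLikeIPv6 && !alreadyBracketed)
            return "[" + host + "]:" + port;
        return host + ":" + port;
    }

    public static void main(String[] args)
    {
        // IPv6 literal from the log above: prints [0:0:0:0:0:0:0:d9]:9042
        System.out.println(joinHostAndPort("0:0:0:0:0:0:0:d9", 9042));
        // IPv4 literal is joined with a plain colon: prints 127.0.0.1:9042
        System.out.println(joinHostAndPort("127.0.0.1", 9042));
    }
}
```

With a helper along these lines, the two final else branches would produce a string that can be parsed back into an address and port, instead of the ambiguous 0:0:0:0:0:0:0:d9:9042 seen in the exception.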