[ https://issues.apache.org/jira/browse/CASSANDRA-19483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886514#comment-17886514 ]
David Capwell commented on CASSANDRA-19483: ------------------------------------------- been speaking with [~smiklosovic] in slack, trying to dump that conversation here. The main patch is https://github.com/instaclustr/cassandra/commit/fe0de74de0cfa289a09cfb2d16dc622f212a5267#diff-1807ebb9d586d616b71765b94be9b6f981a018f43368416a9e294dfae1ecf402R84, which is a simplification of the PR linked. The change would now log the following each time the protocol magic is found {code} noSpam1m.warn("Failed to properly handshake with peer " + channel.remoteAddress() + ". Closing the channel. Invalid legacy protocol magic. {}", cause.getMessage()); {code} this would be a regression to the internode_error_reporting_exclusions config as it would log regardless of what that config says. I am honestly fine with a new config that lets you opt into this behavior, something like the following (100% open to names) {code} private void exceptionCaught(Channel channel, Throwable cause) { final SocketAddress remoteAddress = channel.remoteAddress(); boolean reportingExclusion = DatabaseDescriptor.getInternodeErrorReportingExclusions().contains(remoteAddress); if (reportingExclusion) logger.debug("Excluding internode exception for {}; address contained in internode_error_reporting_exclusions", remoteAddress, cause); else if (cause instanceof Message.InvalidLegacyProtocolMagic && DatabaseDescriptor.getInvalidLegacyProtocolMagicNoSpamEnabled()) noSpam1m.warn("Failed to properly handshake with peer " + channel.remoteAddress() + ". Closing the channel. Invalid legacy protocol magic. {}", cause.getMessage()); else logger.error("Failed to properly handshake with peer {}. Closing the channel.", remoteAddress, cause); {code} with the above we don't change the behavior unless you opt in, and it plays nicely with the existing internode_error_reporting_exclusions logic. The patch doesn't offer any tests so it was hard to validate that this change is enough > Catch InvalidLegacyProtocolMagic exceptions > ------------------------------------------- > > Key: CASSANDRA-19483 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19483 > Project: Cassandra > Issue Type: Improvement > Components: Observability/Logging > Reporter: Brad Schoening > Assignee: Ilya Zakharov > Priority: Low > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Time Spent: 40m > Remaining Estimate: 0h > > Similar to CASSANDRA-18839, we frequently see the exception > InvalidLegacyProtocolMagic with full stack traces flooding our logs. These > are due to Qualys vulnerability scans. > It seem to be a simple check in Message.java and would be better handled by: > a) returning a boolean from validateLegacyProtocolMagic() -> > hasValidLegacyProtocolMagic() instead of creating a custom exception class or > b) adding a catch block in HandshakeProtocol.java and return null as is done > for messagingVersion > {code:java} > static Initiate maybeDecode(ByteBuf buf) throws IOException > { > ... > try (DataInputBuffer in = new DataInputBuffer(nio, false)) > { > validateLegacyProtocolMagic(in.readInt()); // throws > exception > int flags = in.readInt(); > // legacy pre40 messagingVersion flag > if (getBits(flags, 8, 8) < VERSION_40) > return null; > int minMessagingVersion = getBits(flags, 16, 8); > int maxMessagingVersion = getBits(flags, 24, 8); > // 5.0+ does not support pre40 > if (maxMessagingVersion < MessagingService.VERSION_40) > return null; > .... > } > catch (EOFException e) > { > return null; > } > } > {code} > {code:java} > static void validateLegacyProtocolMagic(int magic) throws > InvalidLegacyProtocolMagic > { > if (magic != PROTOCOL_MAGIC) > throw new InvalidLegacyProtocolMagic(magic); > } > {code} > {code:java} > {{2024-03-20 03:47:27,380 [ERROR] [Messaging-EventLoop-3-2] cluster_id=9 > ip_address=10.0.0.1 InboundConnectionInitiator.java:360 - Failed to properly > handshake with peer /10.0.2:33356. Closing the channel.}} > {{io.netty.handler.codec.DecoderException: > org.apache.cassandra.net.Message$InvalidLegacyProtocolMagic: Read 1431520594, > Expected -900387334}} > {{ at > io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:478)}} > {{ at > io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:404)}} > {{ at > io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:371)}} > {{ at > io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354)}} > {{ at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)}} > {{ at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)}} > {{ at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)}} > {{ at > io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)}} > {{ at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)}} > {{ at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)}} > {{ at > io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)}} > {{ at > io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:819)}} > {{ at > io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)}} > {{ at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)}} > {{ at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)}} > {{ at > io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)}} > {{ at > io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)}} > {{ at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)}} > {{ at java.base/java.lang.Thread.run(Thread.java:834)}} > {{Caused by: org.apache.cassandra.net.Message$InvalidLegacyProtocolMagic: > Read 1431520594, Expected -900387334}} > {{ at > org.apache.cassandra.net.Message.validateLegacyProtocolMagic(Message.java:343)}} > {{ at > org.apache.cassandra.net.HandshakeProtocol$Initiate.maybeDecode(HandshakeProtocol.java:167)}} > {{ at > org.apache.cassandra.net.InboundConnectionInitiator$Handler.initiate(InboundConnectionInitiator.java:242)}} > {{ at > org.apache.cassandra.net.InboundConnectionInitiator$Handler.decode(InboundConnectionInitiator.java:235)}} > {{ at > io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508)}} > {{ at > io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447)}} > {{ ... 18 common frames omitted}} > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org