[ https://issues.apache.org/jira/browse/CASSANDRA-12791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Hanna updated CASSANDRA-12791: ------------------------------------- Component/s: Materialized Views > MessageIn logic to determine if the message is cross-node is wrong > ------------------------------------------------------------------ > > Key: CASSANDRA-12791 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12791 > Project: Cassandra > Issue Type: Bug > Components: Materialized Views, Observability, Streaming and > Messaging > Reporter: Sylvain Lebresne > Assignee: Sylvain Lebresne > Priority: Minor > Fix For: 3.10 > > > {{MessageIn}} has the following code to read the 'creation time' of the > message on the receiving side: > {noformat} > public static ConstructionTime readTimestamp(InetAddress from, DataInputPlus > input, long timestamp) throws IOException > { > // make sure to readInt, even if cross_node_to is not enabled > int partial = input.readInt(); > long crossNodeTimestamp = (timestamp & 0xFFFFFFFF00000000L) | (((partial > & 0xFFFFFFFFL) << 2) >> 2); > if (timestamp > crossNodeTimestamp) > { > MessagingService.instance().metrics.addTimeTaken(from, timestamp - > crossNodeTimestamp); > } > if(DatabaseDescriptor.hasCrossNodeTimeout()) > { > return new ConstructionTime(crossNodeTimestamp, timestamp != > crossNodeTimestamp); > } > else > { > return new ConstructionTime(); > } > } > {noformat} > where {{timestamp}} is really the local time on the receiving node when > calling that method. > The incorrect part, I believe, is the {{timestamp != crossNodeTimestamp}} > used to set the {{isCrossNode}} field of {{ConstructionTime}}. A first > problem is that this will basically always be {{true}}: for it to be > {{false}}, we'd need the low-bytes of the timestamp taken on the sending node > to coincide exactly with the ones taken on the receiving side, which is > _very_ unlikely. It is also a relatively meaningless test: having that test > be {{false}} basically means the lack of clock sync between the 2 nodes is > exactly the time the 2 calls to {{System.currentTimeMillis()}} (on sender and > receiver), which is definitively not what we care about. > What the result of this test is used for is to determine if the message was > crossNode or local. It's used to increment different metrics (we separate > metric local versus crossNode dropped messages) in {{MessagingService}} for > instance. And that's where this is kind of a bug: not only the {{timestamp != > crossNodeTimestamp}}, but if {{DatabaseDescriptor.hasCrossNodeTimeout()}}, we > *always* have this {{isCrossNode}} false, which means we'll never increment > the "cross-node dropped messages" metric, which is imo unexpected. > That is, it is true that if {{DatabaseDescriptor.hasCrossNodeTimeout() == > false}}, then we end using the receiver side timestamp to timeout messages, > and so you end up only dropping messages that timeout locally. And _in that > sense_, always incrementing the "locally" dropped messages metric is not > completely illogical. But I doubt most users are aware of those pretty > specific nuance when looking at the related metrics, and I'm relatively sure > users expect a metrics named {{droppedCrossNodeTimeout}} to actually count > cross-node messages by default (keep in mind that > {{DatabaseDescriptor.hasCrossNodeTimeout()}} is actually false by default). > Anyway, to sum it up I suggest that the following change should be done: > # the {{timestamp != crossNodeTimestamp}} test is definitively not what we > want. We should at a minimum just replace it to {{true}} as that's basically > what it ends up being except for very rare and arguably random cases. > # given how the {{ConstructionTime.isCrossNode}} is used, I suggest that we > really want it to mean if the message has shipped cross-node, not just be a > synonymous of {{DatabaseDescriptor.hasCrossNodeTimeout()}}. It should be > whether the message shipped cross-node, i.e. whether {{from == > BroadcastAdress()}} or not. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org