Jon Meredith created CASSANDRA-17601:
----------------------------------------
Summary: IllegalStateException with prepared queries selecting static columns in mixed 3.0.x/4.x clusters
Key: CASSANDRA-17601
URL: https://issues.apache.org/jira/browse/CASSANDRA-17601
Project: Cassandra
Issue Type: Bug
Reporter: Jon Meredith
Assignee: Jon Meredith

Clusters that, before the upgrade, contain prepared statements selecting only some of a table's static columns will fail to execute those statements when they are coordinated from 4.x nodes, until the upgrade completes.

h2. Reproduction

Setup (before upgrade)
{code:java}
CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor':3};
CREATE TABLE ks1.tbl1 (pk1 int, ck2 int, s3 int static, s4 int static, c5 int, PRIMARY KEY (pk1, ck2));
INSERT INTO ks1.tbl1 (pk1, ck2, s3, s4, c5) VALUES (1, 2, 3, 4, 5);
{code}

Prepared Statement (prepare before upgrade)
{code:java}
SELECT c5, s3 FROM ks1.tbl1 WHERE pk1 = ? AND ck2 = ?;
{code}

Exception on 4.0.x nodes (when executing prepared statement after upgrade)
{code:java}
java.lang.IllegalStateException: [s3, s4] is not a subset of [s3]
    at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:566)
    at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:498)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:235)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:209)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:141)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:129)
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:95)
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:80)
    at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
    at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:191)
    at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:181)
    at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:177)
    at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
    at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:335)
    at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
    at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
    at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
    at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
    at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:834)
{code}

Exception on 3.0.x nodes (when executing prepared statement after upgrade)
{code:java}
java.lang.IllegalStateException: [ColumnDefinition{name=s3, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}, ColumnDefinition{name=s4, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}] is not a subset of [s3]
    at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:555)
    at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:487)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:216)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:190)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:121)
    at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:109)
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:94)
    at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
    at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:326)
    at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:186)
    at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:179)
    at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:175)
    at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:75)
    at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:499)
    at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:194)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:137)
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:167)
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:122)
    at java.lang.Thread.run(Thread.java:748)
{code}

The root cause is that CASSANDRA-16686 changed {{ColumnFilter}}s to be built and deserialized based on which versions the coordinating node thinks are running in the cluster. That knowledge is always incorrect when statements are re-prepared on startup, and may be incorrect while nodes are still reaching their final version.

h2. Sequence of events:

Prepared statements are persisted in {{system.prepared_statements}} to be re-prepared on future startup. When the 4.x node starts up after upgrade, {{org.apache.cassandra.service.CassandraDaemon#setup}} calls {{QueryProcessor.instance.preloadPreparedStatements}} *before* the {{Gossiper}} is started by a call to {{StorageService.instance.initServer()}} later in {{setup}}.

As part of preparing statements, when possible a {{ColumnFilterFactory}} is created that returns a {{ColumnFilter}} built at the time the query is prepared.

After the changes from CASSANDRA-16686, the {{ColumnFilter}} builder constructs different column filter variants depending on the lowest version reported in gossip, which it checks via {{org.apache.cassandra.gms.Gossiper#upgradeFromVersionMemoized}}. If this runs before the Gossiper is enabled, the lowest version resolves to {{SystemKeyspace.CURRENT_VERSION}}, causing the {{ColumnFilter}} builder to create a column filter as if the cluster were fully upgraded.

For the query above, the ColumnFilter creates an {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} filter.

The participating 3.0.x nodes do not understand the new flag and create a {{ColumnFilter}} equivalent to a {{WildCardColumnFilter}}.

The participating 4.x nodes do understand the new flag; however, because other 3.0 nodes are known to them, the deserializer takes the lower-than-3.4 path and also creates a {{WildCardColumnFilter}}.
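The ordering problem described above can be sketched in isolation. This is a minimal illustration, not Cassandra code: {{PreloadOrderingSketch}}, {{upgradeFromVersion}} and {{chooseStrategy}} are hypothetical names, a {{null}} argument stands in for "gossip has not started yet", and choosing {{ALL_COLUMNS}} in mixed mode is an assumption about what the builder does for a pre-3.4 cluster.

```java
public class PreloadOrderingSketch {
    // Hypothetical stand-in for the fetching strategies named in this ticket.
    enum Strategy { ALL_COLUMNS, ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS }

    // Hypothetical stand-in for SystemKeyspace.CURRENT_VERSION on a 4.x node.
    static final String CURRENT_VERSION = "4.0";

    // With no gossip state (null), fall back to the local version,
    // i.e. behave as if every node were already upgraded.
    static String upgradeFromVersion(String lowestGossipedVersion) {
        return lowestGossipedVersion == null ? CURRENT_VERSION : lowestGossipedVersion;
    }

    // The decision is taken once, when the statement is prepared, and the
    // resulting filter is then reused by every execution of that statement.
    static Strategy chooseStrategy(String lowestGossipedVersion) {
        boolean mixedCluster = upgradeFromVersion(lowestGossipedVersion).startsWith("3.0");
        return mixedCluster
               ? Strategy.ALL_COLUMNS                               // assumed mixed-mode choice
               : Strategy.ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS; // fully upgraded choice
    }

    public static void main(String[] args) {
        // Preload at startup: gossip not started, so no version is known.
        Strategy preloaded = chooseStrategy(null);
        // The same statement prepared once gossip reports a 3.0 node.
        Strategy prepared = chooseStrategy("3.0");
        System.out.println("preloaded before gossip: " + preloaded);
        System.out.println("prepared after gossip:   " + prepared);
    }
}
```

The two calls return different strategies for the same statement, which is the mismatch between what the preloaded filter assumes and what the rest of the cluster can handle.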
The fetchedColumns sent by the {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} filter contain only the queried static columns, but the pre-3.4 sstable iterator returns all regular and static columns, causing an IllegalStateException when the response is serialized to be sent back.

The ISE clears once all nodes in the cluster think they are upgraded to the current version and behave as the originally prepared query intended.

h2. Related Problems

_Non-deterministic behavior of 4.0.x/4.1.x nodes_

If the prepared statements are cleared and/or freshly prepared while the cluster is in mixed 3.0/4.0 mode, the pre-built ColumnFilter will remain in its mixed-mode variant until it is re-prepared on a restart or after a cache clear/eviction.

As upgradeFromVersionMemoized times out and is recalculated after the upgrade reaches a single version, individual nodes make a local decision on column filter building and deserializing. Nodes that update upgradeFromVersionMemoized early and then coordinate requests may cause the same ISE on nodes that respond to the read command while still holding the previous version.

_Digest Mismatches_

If {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} {{ColumnFilter}}s are incorrectly sent to 3.0.x nodes, those nodes will ignore the included list of columns and compute a different digest than the one computed locally on a 4.0.x coordinator.

h1. Proposed fix

In discussion with [~ifesdjeen], he suggested that one way to resolve this is to deprecate (or just remove) the {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} filter so that it is no longer built, always selecting all static columns.

This would leave just {{WildCardColumnFilter}} and {{SelectionColumnFilter}} with {{ALL_COLUMNS}} or {{ONLY_QUERIED_COLUMNS}}.

This is a potential performance regression for unusual schemas with very large numbers of static columns, but that seems unlikely in practice.
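For reference, the subset check that raises the ISE in both stack traces can be reproduced in miniature. The sketch below is only loosely modeled on the idea behind {{Columns$Serializer.encodeBitmap}}: {{SubsetBitmapSketch}} is a hypothetical class, columns are plain strings without duplicates, and the 64-column limit of a {{long}} is an assumption of the sketch, not a statement about the real serializer.

```java
import java.util.List;

public class SubsetBitmapSketch {
    // Encodes which superset columns are absent from the row: bit i is set
    // when superset.get(i) is not among the row's columns. Every row column
    // must appear in the superset, otherwise the row cannot be encoded.
    static long encodeBitmap(List<String> columns, List<String> superset) {
        long bitmap = 0L;
        int matched = 0;
        for (int i = 0; i < superset.size(); i++) {
            if (columns.contains(superset.get(i)))
                matched++;
            else
                bitmap |= 1L << i;
        }
        if (matched != columns.size())
            throw new IllegalStateException(columns + " is not a subset of " + superset);
        return bitmap;
    }

    public static void main(String[] args) {
        // Row holds s3 only, the header advertises s3 and s4: encodable.
        System.out.println(encodeBitmap(List.of("s3"), List.of("s3", "s4")));

        // Row holds s3 and s4, the header advertises only the queried s3:
        // the situation described in this ticket.
        try {
            encodeBitmap(List.of("s3", "s4"), List.of("s3"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under the proposed fix the advertised static columns would always be all static columns, so the second case could not arise for statics in this model.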
/cc: [~blerer]