Jon Meredith created CASSANDRA-17601:
----------------------------------------

             Summary: IllegalStateException with prepared queries selecting 
static columns in mixed 3.0.x/4.x clusters
                 Key: CASSANDRA-17601
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17601
             Project: Cassandra
          Issue Type: Bug
            Reporter: Jon Meredith
            Assignee: Jon Meredith


Prepared statements that select only some of a table's static columns, and 
that were prepared before the upgrade, fail to execute when coordinated from 
the 4.x nodes until the upgrade completes.
h2. Reproduction

Setup (before upgrade)
{code:java}
CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE ks1.tbl1 (pk1 int,
ck2 int,
s3 int static,
s4 int static,
c5 int,
PRIMARY KEY (pk1, ck2));
INSERT INTO ks1.tbl1 (pk1, ck2, s3, s4, c5) VALUES (1, 2, 3, 4, 5);
{code}
Prepared Statement (prepare before upgrade)
{code:java}
SELECT c5, s3 FROM ks1.tbl1 WHERE pk1 = ? AND ck2 = ?;
{code}
Exception on 3.0.x nodes (when executing prepared statement after upgrade)
{code:java}
java.lang.IllegalStateException: [s3, s4] is not a subset of [s3]
at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:566)
at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:498)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:235)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:209)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:141)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:129)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:95)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:80)
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:191)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:181)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:177)
at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:48)
at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:335)
at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:433)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:119)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
{code}
Exception on 4.0.x nodes (when executing prepared statement after upgrade)
{code:java}
java.lang.IllegalStateException: [ColumnDefinition{name=s3, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}, ColumnDefinition{name=s4, type=org.apache.cassandra.db.marshal.IntType, kind=STATIC, position=-1}] is not a subset of [s3]
at org.apache.cassandra.db.Columns$Serializer.encodeBitmap(Columns.java:555)
at org.apache.cassandra.db.Columns$Serializer.serializeSubset(Columns.java:487)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serializeRowBody(UnfilteredSerializer.java:216)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:190)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:121)
at org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:109)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:140)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:94)
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:326)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:186)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:179)
at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:175)
at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:75)
at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:499)
at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:91)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:194)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.runUnsafe(AbstractLocalAwareExecutorService.java:137)
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:167)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:122)
at java.lang.Thread.run(Thread.java:748)
{code}
The root cause is that CASSANDRA-16686 changed {{ColumnFilter}}s to build and 
deserialize based on which versions the coordinating node thinks are running in 
the cluster. That knowledge is always incorrect when statements are re-prepared 
on startup, and may be incorrect while nodes are still converging on their 
final version.
h2. Sequence of events:

Prepared statements are persisted in {{system.prepared_statements}} to be 
re-prepared on future startup.

When the 4.x node starts up after upgrade, in 
{{org.apache.cassandra.service.CassandraDaemon#setup}} it calls 
{{QueryProcessor.instance.preloadPreparedStatements}} *before* the {{Gossiper}} 
is started by a call to {{StorageService.instance.initServer()}} later in 
{{setup}}.

As part of preparing statements, when possible a {{ColumnFilterFactory}} is 
created that returns a {{ColumnFilter}} built at the time the query is prepared.

After the changes from CASSANDRA-16686, the {{ColumnFilter}} builder constructs 
different column filter variants depending on the lowest version reported in 
gossip, which it checks via 
{{org.apache.cassandra.gms.Gossiper#upgradeFromVersionMemoized}}. If this 
runs before the Gossiper is enabled, the lowest version is reported as 
{{SystemKeyspace.CURRENT_VERSION}}, causing the {{ColumnFilter}} builder to 
create a column filter as if the cluster were fully upgraded.
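
The ordering problem described above can be illustrated with a small 
stand-alone model. This is only a sketch, not Cassandra's code: the class 
{{MinVersionSketch}}, its methods, and the plain-string version handling are 
hypothetical, and the 3.4 threshold is borrowed from the deserializer 
description in this ticket rather than from the builder itself.

```java
import java.util.Collection;
import java.util.Collections;

/** Hypothetical model of the startup ordering issue; not Cassandra's Gossiper. */
final class MinVersionSketch {
    /** Stand-in for SystemKeyspace.CURRENT_VERSION on an upgraded node (assumed value). */
    static final String CURRENT_VERSION = "4.0";

    enum FetchingStrategy { ALL_COLUMNS, ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS }

    /** Compares dotted numeric versions such as "3.0" and "4.0". */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.min(x.length, y.length); i++) {
            int c = Integer.compare(Integer.parseInt(x[i]), Integer.parseInt(y[i]));
            if (c != 0)
                return c;
        }
        return Integer.compare(x.length, y.length);
    }

    /**
     * Lowest version seen in gossip. With no gossip state yet (gossip not
     * started), this falls back to the node's own version.
     */
    static String upgradeFromVersion(Collection<String> gossipedVersions) {
        if (gossipedVersions.isEmpty())
            return CURRENT_VERSION;
        return Collections.min(gossipedVersions, MinVersionSketch::compareVersions);
    }

    /** Picks the filter variant a prepared statement would be built with. */
    static FetchingStrategy strategyFor(String minVersion) {
        return compareVersions(minVersion, "3.4") < 0
             ? FetchingStrategy.ALL_COLUMNS
             : FetchingStrategy.ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS;
    }
}
```

In this model, a statement prepared with an empty gossip view gets the 
fully-upgraded variant, while the same statement prepared once a 3.0 peer is 
visible gets the conservative one.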

For the query above, the ColumnFilter creates an 
ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS filter.

The participating 3.0.x nodes do not understand the new flag and create a 
{{ColumnFilter}} equivalent to a {{WildCardColumnFilter}}. The participating 
4.x nodes do understand the new flag, but the deserializer takes the 
lower-than-3.4 path because other 3.0 nodes are known about, and also creates a 
{{WildCardColumnFilter}}.

The {{fetchedColumns}} sent by the {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} 
filter contains only the queried static columns, but the pre-3.4 sstable 
iterator returns all regular and static columns, causing an 
IllegalStateException when the response is serialized to be sent back.
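
The "is not a subset of" message in the traces can be reproduced with a 
simplified model of subset encoding. The sketch below is an assumption-laden 
illustration, not the real {{Columns$Serializer}}: the class 
{{SubsetBitmapSketch}} is hypothetical, columns are plain strings, and the bit 
convention (bit set when the column is present) is chosen for readability.

```java
import java.util.List;

/** Hypothetical, simplified model of encoding a row's columns against a header superset. */
final class SubsetBitmapSketch {
    /**
     * Encodes {@code columns} as a bitmap over {@code superset}: bit i is set
     * when superset.get(i) also appears in columns. Assumes distinct column
     * names and at most 64 columns in the superset.
     *
     * @throws IllegalStateException if some column is missing from the superset
     */
    static long encodeBitmap(List<String> columns, List<String> superset) {
        long bitmap = 0L;
        int matched = 0;
        for (int i = 0; i < superset.size(); i++) {
            if (columns.contains(superset.get(i))) {
                bitmap |= 1L << i;
                matched++;
            }
        }
        // Any column that was not matched has no slot in the superset.
        if (matched != columns.size())
            throw new IllegalStateException(columns + " is not a subset of " + superset);
        return bitmap;
    }
}
```

With the table from the reproduction, the header built from the filter lists 
only {{s3}} while the row read from disk carries {{s3}} and {{s4}}, so 
encoding the row's static columns against the header fails in the same way.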

The ISE clears once all nodes in the cluster think they are upgraded to the 
current version and behave as the originally prepared query intended.
h2. Related Problems

_Non-deterministic behavior of 4.0.x/4.1.x nodes_

If the prepared statements are cleared and/or freshly prepared when the cluster 
is in mixed 3.0/4.0 mode, the pre-built ColumnFilter will remain in the mixed 
mode version until re-prepared on a restart or cache clear/eviction.

As upgradeFromVersionMemoized times out and is recalculated after the upgrade 
reaches a single version, individual nodes will make a local decision on column 
filter building and deserializing.

Nodes that update {{upgradeFromVersionMemoized}} early and then coordinate 
requests may cause the same ISE on nodes responding to the read command that 
still hold the previous version.

_Digest Mismatches_

If {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} {{ColumnFilter}}s are 
incorrectly sent to 3.0.x nodes, those nodes ignore the list of included 
columns and compute a different digest than one computed locally on a 4.0.x 
coordinator.
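
As a rough illustration of why the digests diverge, the sketch below hashes 
the same row twice, once over the queried columns only and once over every 
column. It is a toy model: {{DigestMismatchSketch}} and its text encoding of 
cells are hypothetical and do not reflect how Cassandra actually digests read 
responses.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;

/** Hypothetical toy digest over the columns a replica decides to include. */
final class DigestMismatchSketch {
    /** Hashes "name=value;" for each included column that has a value in the row. */
    static byte[] digest(Map<String, Integer> row, List<String> includedColumns) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String column : includedColumns) {
                Integer value = row.get(column);
                if (value == null)
                    continue; // column absent from this row
                md.update((column + "=" + value + ";").getBytes(StandardCharsets.UTF_8));
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

A replica that honours the column list hashes {{c5}} and {{s3}}; one that 
ignores it also hashes {{s4}}, so the two digests differ even though the 
underlying data is identical.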
h1. Proposed fix

In discussion with [~ifesdjeen], he suggested that one way to resolve this is 
to deprecate (or just remove) the {{ALL_REGULARS_AND_QUERIED_STATICS_COLUMNS}} 
filter and no longer build it, always selecting all static columns.
This would leave just {{WildCardColumnFilter}} and {{SelectionColumnFilter}} 
with {{ALL_COLUMNS}} or {{ONLY_QUERIED_COLUMNS}}.

This is a potential performance regression for unusual schemas with very large 
numbers of static columns, but seems unlikely in practice.
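
To make the trade-off concrete, here is a minimal sketch of column selection 
with only the two remaining strategies. The class {{FetchStrategySketch}} and 
its method are hypothetical and only model the idea; they are not the proposed 
patch.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of fetching with only the two remaining strategies. */
final class FetchStrategySketch {
    enum FetchingStrategy { ALL_COLUMNS, ONLY_QUERIED_COLUMNS }

    /**
     * Returns the columns a replica would be asked to fetch.
     * ALL_COLUMNS always includes every static column, queried or not.
     */
    static Set<String> fetchedColumns(FetchingStrategy strategy,
                                      List<String> queried,
                                      List<String> regulars,
                                      List<String> statics) {
        Set<String> fetched = new LinkedHashSet<>();
        if (strategy == FetchingStrategy.ALL_COLUMNS) {
            fetched.addAll(statics);
            fetched.addAll(regulars);
        } else {
            fetched.addAll(queried);
        }
        return fetched;
    }
}
```

For {{SELECT c5, s3}} against {{ks1.tbl1}}, {{ALL_COLUMNS}} fetches {{s3}}, 
{{s4}} and {{c5}}. The unqueried {{s4}} is the extra cost, which grows with 
the number of static columns in the table.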

/cc: [~blerer] 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
