[ https://issues.apache.org/jira/browse/CASSANDRA-15035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Stupp updated CASSANDRA-15035:
-------------------------------------
    Status: Ready to Commit  (was: Review In Progress)

> C* 3.0 sstables w/ UDTs are corrupted in 3.11 + 4.0
> ---------------------------------------------------
>
>                 Key: CASSANDRA-15035
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15035
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/UDT, Local/SSTable
>            Reporter: Robert Stupp
>            Assignee: Robert Stupp
>            Priority: Urgent
>             Fix For: 3.11.6, 4.0
>
>
> OSS C* 3.0 writes incorrect type information for UDTs into the
> serialization-header of each sstable.
> In C* 3.0, both UDTs and tuples are always frozen. A frozen type must be
> enclosed in a {{frozen<...>}} “bracket” via the {{CQL3Type}} hierarchy
> (resp. an {{org.apache.cassandra.db.marshal.FrozenType(...)}} “bracket” via
> the {{AbstractType}} hierarchy) in the schema and serialization-header.
> Since CASSANDRA-7423 (committed to C* 3.6) UDTs can also be non-frozen (=
> multi-cell).
> Unfortunately, C* 3.0 does not write the
> {{org.apache.cassandra.db.marshal.FrozenType(...)}} “bracket” for UDTs into
> the {{SerializationHeader.Component}} in the {{-Statistics.db}} sstable
> component.
> The order in which columns of a row are serialized depends on the concrete
> {{AbstractType}}. Columns with variable length types (frozen types belong to
> this category) are serialized before columns with multi-cell types
> (non-frozen types belong to that category).
> If C* 3.6 (or any newer version) reads an sstable written by C* 3.0 (up to
> 3.5), it will read the type information “non-frozen UDT” from the
> serialization header, which is technically correct given what the header
> says, but does not match the frozen layout in which the data was actually
> written.
> This means that upgrades from C* 3.0 to C* 3.11 and 4.0, using a schema that
> uses UDTs, result in inaccessible data in those sstables.
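
The ordering mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Column` class, the `serialization_order` helper, and the rule "single-cell columns first, multi-cell columns last, each group sorted by name" are simplifications invented for this example, not Cassandra's actual code. It shows how a writer that treats a UDT column as frozen and a reader that treats the same column as multi-cell can disagree on column order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column definition (not a Cassandra class)."""
    name: str
    type_name: str    # e.g. "int", "text", "udt"
    multi_cell: bool  # True for non-frozen (multi-cell) types


def serialization_order(columns):
    """Toy rule: single-cell columns first, multi-cell columns last, by name."""
    return sorted(columns, key=lambda c: (c.multi_cell, c.name))


def columns_for(udt_is_multi_cell):
    """The same three-column table, with the UDT column seen as frozen or not."""
    return [
        Column("a", "int", False),
        Column("u", "udt", udt_is_multi_cell),
        Column("z", "text", False),
    ]


# Writer side (3.0): the UDT is always frozen, i.e. single-cell.
written = [c.name for c in serialization_order(columns_for(False))]

# Reader side (3.6+): the header lacks the FrozenType bracket,
# so the UDT column is assumed to be multi-cell.
expected = [c.name for c in serialization_order(columns_for(True))]

print(written)   # ['a', 'u', 'z']
print(expected)  # ['a', 'z', 'u']
```

In this model the reader would try to decode the bytes of column `u` as column `z`, which is the kind of mismatch that could surface as the exceptions listed in the ticket.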
> Reads against 3.0 sstables as well as attempts to scrub these sstables
> result in a wide variety of errors/exceptions ({{CorruptSSTableException}},
> {{EOFException}}, {{OutOfMemoryError}}, etc.), as usual in such cases.
> Mitigation strategy in the proposed patch:
> * Fix the broken serialization-headers automatically when an upgrade from
> C* 3.0 is detected.
> * Enhance {{sstablescrub}} to verify the serialization-header against the
> schema and allow {{sstablescrub}} to fix the UDT types according to the
> information in the schema. This does not apply to "online scrub" (e.g.
> nodetool scrub).
> The behavior of {{sstablescrub}} has been changed to first inspect the
> serialization-header and verify the type information against the schema.
> Differences between the schema and the sstable serialization-headers cause
> {{sstablescrub}} to error out and stop - i.e. safety first (there’s a way to
> opt out though).
> A new class {{SSTableHeaderFix}} can inspect the serialization-header
> ({{SerializationHeader.Component}}) in the {{-Statistics.db}} component
> and fix the type information in those sstables for UDTs according to the
> schema information.
> This new class could be used during verify and before sstables are imported.
> But changes to “verify” and “import” are out of the scope of this ticket, as
> the patch is already bigger than I originally expected.
> Another issue not tackled by this ticket is that the wrong ‘kind’ is written
> to the type information in {{system_schema.dropped_columns}} when a
> non-frozen UDT column is dropped. When a UDT column is dropped, the type of
> the dropped column is converted from the UDT definition to its
> “corresponding” tuple type definition. All versions currently write
> {{frozen<tuple<...>>}}, but for non-frozen UDTs it should actually just be
> {{tuple<...>}}. Unfortunately, there is nothing that could be done in this
> ticket to fix (or even consider) the type information of a dropped column.
> But for correctness, the tuple type should be a multi-cell one (only
> accessible for dropped UDTs though - not as something that a user can create
> as a type).
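
The "verify the header against the schema, fix only the missing frozen bracket, otherwise stop" idea from the mitigation strategy can be sketched as follows. This is a hedged illustration, not the logic of {{SSTableHeaderFix}} or {{sstablescrub}}: the function names, the `HeaderMismatch` exception, and the simplified `UserType(ks,address)` argument list are all assumptions made for the example, and nested types (UDTs inside collections or tuples) are ignored.

```python
FROZEN = "org.apache.cassandra.db.marshal.FrozenType"
USER = "org.apache.cassandra.db.marshal.UserType"


class HeaderMismatch(Exception):
    """Hypothetical error: header and schema disagree in a way we will not fix."""


def is_frozen(type_str):
    """True if the type string is wrapped in a FrozenType(...) bracket."""
    return type_str.startswith(FROZEN + "(") and type_str.endswith(")")


def unwrap_frozen(type_str):
    """Strip one FrozenType(...) bracket, if present."""
    return type_str[len(FROZEN) + 1:-1] if is_frozen(type_str) else type_str


def fix_header_type(header_type, schema_type):
    """Return the type to store in the header, or raise on an unexpected difference."""
    if header_type == schema_type:
        return header_type  # nothing to fix
    missing_bracket = (
        header_type.startswith(USER + "(")            # header has a bare UDT
        and is_frozen(schema_type)                    # schema says frozen
        and unwrap_frozen(schema_type) == header_type # otherwise identical
    )
    if missing_bracket:
        return schema_type  # add the FrozenType(...) bracket from the schema
    # Safety first: any other difference is reported, not silently rewritten.
    raise HeaderMismatch(f"header {header_type!r} vs schema {schema_type!r}")


# Simplified placeholder argument list, not the real UserType encoding.
bare_udt = USER + "(ks,address)"
frozen_udt = FROZEN + "(" + bare_udt + ")"

print(fix_header_type(bare_udt, frozen_udt) == frozen_udt)  # True
```

A header that differs from the schema in any other way, for example a UDT in the header where the schema has an unrelated type, raises `HeaderMismatch` in this sketch, mirroring the "error out and stop" behavior described in the ticket.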