I have not heard any feedback yet, so I plan to remove this tomorrow… the feature was local 
to 3.6+, so users migrating from 3.0 to 4.0 never had this issue

> On Jun 13, 2023, at 10:22 AM, David Capwell <dcapw...@apple.com> wrote:
> 
> org.apache.cassandra.io.sstable.SSTableHeaderFix was added due to bugs in 3.6 
> causing invalid types or incompatible types (due to toString changes) in 
> the SSTableHeader… this logic runs on start and rewrites all Stats files that 
> had a mismatch from the local schema; with 5.0 requiring upgrades from 4.x 
> only, this logic should have already run as it's a 3.x to 4.0 migration step 
> (though users are able to opt out [1]) which should have already fixed the 
> SSTables to have correct schema…
> 
> Why is this a problem now?  CASSANDRA-18504 is adding a lot of property/fuzz 
> tests to the type system and the read/write path, which has found several 
> bugs; fixing some of the bugs actually impacts SSTableHeader because it 
> requires generating and working with types that are not valid, so it can fix 
> them…   By removing this logic, we can push this type validation into the 
> type system to avoid generating incorrect types.  
> 
> If we wish to keep this class, we need to maintain allowing invalid types to 
> be created, which may cause bugs down the road.
> 
> 
> [1] if a user opts out there are 2 real cases that are impacted: UDTs, and 
> collections of collections…
> * For UDTs, the frozen and non-frozen types are not the same, so mixing these 
> causes us to fail to read the data, failing the read…. I believe 
> writes/compactions will not corrupt the data, but anything that touches these 
> SSTables will fail due to the schema mismatch… the only way to resolve this 
> is to fix the SSTables… If you disabled this in 4.x, you were living with broken / 
> unreadable SSTables, so by removing it, 5.0 would lose the ability to repair 
> them (but 4.x would still be able to)
> * for collections of collections, this is less of an issue.  The logic would 
> detect that the collection has a non-frozen collection as the element, so 
> would migrate them to frozen.  This behavior has been moved to the type 
> system, so a read from SSTable of "list<list<int>>" automagically becomes a 
> "ListType(FrozenType(ListType(Int32Type)))".  The SSTables are not "fixed", 
> but compaction is able to read the data correctly, and the new SSTables will 
> have the correct header.  
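
The collections-of-collections case above can be illustrated with a small sketch. This is a toy model only: the Type, Int32, ListOf and Frozen classes and the freezeNestedCollections helper are hypothetical names made up for illustration and are not Cassandra's AbstractType hierarchy or its actual code path. It assumes that only the directly nested collection needs the frozen wrapper, since everything inside a frozen type is already frozen.

```java
public class NestedCollectionFreeze {
    /** Hypothetical stand-in for a type node; NOT Cassandra's AbstractType. */
    interface Type {}

    /** Toy primitive type, printed the way the email writes it. */
    static final class Int32 implements Type {
        @Override public String toString() { return "Int32Type"; }
    }

    /** Toy list type holding a single element type. */
    static final class ListOf implements Type {
        final Type element;
        ListOf(Type element) { this.element = element; }
        @Override public String toString() { return "ListType(" + element + ")"; }
    }

    /** Toy marker wrapping a type that must be treated as frozen. */
    static final class Frozen implements Type {
        final Type inner;
        Frozen(Type inner) { this.inner = inner; }
        @Override public String toString() { return "FrozenType(" + inner + ")"; }
    }

    /**
     * If a list's element is itself a non-frozen list, wrap that element in
     * Frozen. Anything else is returned unchanged.
     */
    static Type freezeNestedCollections(Type type) {
        if (!(type instanceof ListOf)) {
            return type;
        }
        Type element = ((ListOf) type).element;
        if (element instanceof ListOf) {
            return new ListOf(new Frozen(element));
        }
        return type;
    }

    public static void main(String[] args) {
        // Models a header that recorded list<list<int>> without the frozen wrapper.
        Type fromHeader = new ListOf(new ListOf(new Int32()));
        // prints ListType(FrozenType(ListType(Int32Type)))
        System.out.println(freezeNestedCollections(fromHeader));
    }
}
```

The point of the sketch is that the correction happens when the type is built, not by rewriting files on startup, which is why the old SSTables stay as they are while newly written ones get the correct header.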
