[ https://issues.apache.org/jira/browse/CASSANDRA-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310446#comment-15310446 ]
Paulo Motta commented on CASSANDRA-11877:
-----------------------------------------

Thanks for the feedback. I definitely agree it doesn't make sense to make the two paradigms interoperable, and it's better to keep the legacy code isolated since it will probably be removed in the next major release.

Since there are quite a few special cases to consider (range tombstones, index sampling), let's focus on the simple case first (simple cells, no index sampling). That way we can leverage existing code while making visible progress, and create a basic test structure to build on when dealing with more complex cases (range tombstones, collections, index sampling, large partitions, etc.), leaving the necessary improvements/optimizations for later. I will update the ticket description to reflect that.

I think we can start by:
* Adding support for simple {{RowIndexEntry}} serialization (only position) on {{LegacyShallowIndexedEntry.serialize}}
* Creating {{LegacyLayout.LegacyBigTableWriter}}, which basically copies the 2.2 {{BigTableWriter}} while working with the new {{SSTableWriter}} interface ({{append(UnfilteredRowIterator iterator)}}):
** Port the other necessary classes: {{LegacyLayout.LegacyColumnIndex}}, {{LegacyLayout.MetadataCollector}}, trying to use legacy classes from {{LegacyLayout}} where applicable ({{LegacyAtom, LegacyDeletionInfo, LegacyUnfilteredPartition}}), or classes that haven't changed between the two versions ({{EstimatedHistogram, DeletionTime}}, for example).
** Since we're not dealing with range tombstones in this initial version, we can create an empty stub for {{LegacyRangeTombstoneTracker}} and port it later when dealing with range tombstones.
** Similarly, since we're not dealing with complex index columns, we can comment out the parts constructing {{IndexInfo}} and always return {{ColumnIndex.EMPTY}} on {{ColumnIndex.Builder.build()}}
* After the bulk structure of {{LegacyBigTableWriter}} is ported, we can probably reuse {{LegacyLayout.fromUnfilteredRowIterator}} to convert from {{UnfilteredRowIterator}} to {{LegacyUnfilteredPartition}} and work from there on {{LegacyBigTableWriter}}
** At this initial stage, since we're not dealing with range tombstones, we can probably extract the cell serialization code of {{LegacyLayout.serializeAsLegacyPartition}} to perform on-disk cell serialization on {{LegacyColumnIndex}}
* After we have an initial draft of {{LegacyBigTableWriter}} ready, we can probably instantiate it when {{!version.storeRows}} on {{BigFormat.WriterFactory}}
* Adding a few simple tests to guide the development would probably be handy; maybe we can start by making {{SimpleQuery.testTableWithoutClustering}} work with a converted sstable.

[~thobbs] Does this sound like a better starting point, and like it's going to work (even if maybe not efficiently)? Are there any other particular caveats we are missing or should be aware of?

Thanks in advance for the help!

> Add support to legacy row serialization on BigTableWriter
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-11877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11877
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Tools
>            Reporter: Paulo Motta
>            Assignee: Kaide Mu
>            Priority: Minor
>
> In order to support writing pre-3.0 sstables, we must add support to legacy
> cell serialization to {{BigTableWriter}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
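
As an illustration of the writer-selection step proposed in the comment above (instantiating the legacy writer when {{!version.storeRows}}), here is a minimal, self-contained sketch. It does not use the real Cassandra classes: {{Version}}, {{Writer}}, {{RowBasedWriter}}, {{LegacyCellWriter}} and {{open}} are hypothetical stand-ins, and the real {{SSTableWriter.append}} takes an {{UnfilteredRowIterator}} rather than a string key. Only the dispatch on the {{storeRows}} flag is taken from the comment.

```java
import java.util.ArrayList;
import java.util.List;

public class LegacyWriterSelectionSketch {

    /** Hypothetical stand-in for an sstable format version; only storeRows matters here. */
    static final class Version {
        final String label;
        final boolean storeRows;

        Version(String label, boolean storeRows) {
            this.label = label;
            this.storeRows = storeRows;
        }
    }

    /** Hypothetical, heavily reduced writer interface (assumption, not the real API). */
    interface Writer {
        void append(String partitionKey);
        String describe();
        int appended();
    }

    /** Shared bookkeeping so both stand-in writers stay trivial. */
    static abstract class BaseWriter implements Writer {
        private final List<String> keys = new ArrayList<>();

        @Override
        public void append(String partitionKey) {
            keys.add(partitionKey);
        }

        @Override
        public int appended() {
            return keys.size();
        }
    }

    /** Stand-in for the existing row-based (3.0 format) writer. */
    static final class RowBasedWriter extends BaseWriter {
        @Override
        public String describe() {
            return "row-based-writer";
        }
    }

    /** Stand-in for the proposed legacy writer producing pre-3.0 cell-based sstables. */
    static final class LegacyCellWriter extends BaseWriter {
        @Override
        public String describe() {
            return "legacy-cell-writer";
        }
    }

    /** The dispatch itself: versions that do not store rows get the legacy writer. */
    static Writer open(Version version) {
        return version.storeRows ? new RowBasedWriter() : new LegacyCellWriter();
    }

    public static void main(String[] args) {
        Version[] versions = { new Version("3.0", true), new Version("pre-3.0", false) };
        for (Version v : versions) {
            Writer w = open(v);
            w.append("key1");
            System.out.println(v.label + " -> " + w.describe());
        }
    }
}
```

Keeping the choice in a single factory method means callers only see the common writer interface, which fits the goal of keeping the legacy code isolated so it can be removed later.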