[jira] [Commented] (CASSANDRA-11877) Add support to legacy row serialization on BigTableWriter

Paulo Motta (JIRA) Wed, 01 Jun 2016 08:13:36 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-11877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15310446#comment-15310446
 ]


Paulo Motta commented on CASSANDRA-11877:
-----------------------------------------

Thanks for the feedback. I definitely agree it doesn't make sense to make the 2 
paradigms interoperable and it's better to keep legacy code isolated since it 
will probably be removed in the next major release.

Since there are quite a few special cases to consider (range tombstones, index 
sampling), let's focus on the simple case first (simple cells, no index 
sampling). That lets us leverage existing code, show visible progress, and 
create a basic test structure to build on when dealing with the more complex 
cases (range tombstones, collections, index sampling, large partitions, etc.). 
The necessary improvements/optimizations can come later. I will update the 
ticket description to reflect that.

I think we can start by:

* Add support for simple {{RowIndexEntry}} serialization (position only) in 
{{LegacyShallowIndexedEntry.serialize}}
* Create {{LegacyLayout.LegacyBigTableWriter}}, which basically copies 2.2 
BigTableWriter while working with the new {{SSTableWriter}} interface 
({{append(UnfilteredRowIterator iterator)}}):
** Port the other necessary classes ({{LegacyLayout.LegacyColumnIndex}}, 
{{LegacyLayout.MetadataCollector}}), trying to use legacy classes from 
{{LegacyLayout}} where applicable ({{LegacyAtom, LegacyDeletionInfo, 
LegacyUnfilteredPartition}}), or classes that haven't changed between the two 
versions ({{EstimatedHistogram, DeletionTime}} for example).
** Since we're not dealing with range tombstones in this initial version, we can 
create an empty stub for {{LegacyRangeTombstoneTracker}} and port it later 
when dealing with range tombstones.
** Similarly, since we're not dealing with complex index columns, we can 
comment out parts constructing {{IndexInfo}} and always return 
{{ColumnIndex.EMPTY}} on {{ColumnIndex.Builder.build()}}
* After the bulk structure of {{LegacyBigTableWriter}} is ported, we can 
probably reuse {{LegacyLayout.fromUnfilteredRowIterator}} to convert from 
{{UnfilteredRowIterator}} to {{LegacyUnfilteredPartition}} and work from there 
on {{LegacyBigTableWriter}}
** At this initial stage, since we're not dealing with range tombstones, we can 
probably extract the cell serialization code of 
{{LegacyLayout.serializeAsLegacyPartition}} to perform disk cell serialization 
on {{LegacyColumnIndex}}
* After we have an initial draft of {{LegacyBigTableWriter}} ready, we can 
probably instantiate that when {{!version.storeRows}} on 
{{BigFormat.WriterFactory}}
* Add a few simple tests to guide the development; maybe we can start by 
making {{SimpleQuery.testTableWithoutClustering}} work with a converted 
sstable.
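
As a rough illustration of the writer-selection step in the plan above, here is 
a minimal, self-contained Java sketch. Every type in it ({{Version}}, 
{{PartitionWriter}}, {{ModernWriter}}, {{LegacyWriter}}, {{createWriter}}) is a 
hypothetical stand-in and not the real Cassandra class. Only two ideas are 
taken from the plan: branch on {{storeRows}} to pick the legacy writer, and 
return an empty column index while only simple cells are supported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins only; none of these are the real Cassandra classes.
final class LegacyWriterSketch {

    /** Stand-in for an sstable format version. storeRows is false for pre-3.0 formats. */
    static final class Version {
        final String name;
        final boolean storeRows;

        Version(String name, boolean storeRows) {
            this.name = name;
            this.storeRows = storeRows;
        }
    }

    /** Stand-in for the writer interface: one append call per partition. */
    interface PartitionWriter {
        /** Writes the partition and returns the column index entries built for it. */
        List<String> append(String partitionKey, List<String> simpleCells);
    }

    /** Stand-in for the current (row-based) writer. */
    static final class ModernWriter implements PartitionWriter {
        @Override
        public List<String> append(String partitionKey, List<String> simpleCells) {
            return Collections.singletonList("row-index:" + partitionKey);
        }
    }

    /** Stand-in for the legacy (cell-based) writer, simple cells only. */
    static final class LegacyWriter implements PartitionWriter {
        // Assumption: with no index sampling in the first iteration,
        // the column index for every partition is simply empty.
        static final List<String> EMPTY_INDEX = Collections.emptyList();

        final List<String> serializedCells = new ArrayList<>();

        @Override
        public List<String> append(String partitionKey, List<String> simpleCells) {
            for (String cell : simpleCells) {
                // Placeholder for legacy on-disk cell serialization.
                serializedCells.add(partitionKey + ":" + cell);
            }
            return EMPTY_INDEX;
        }
    }

    /** Picks the legacy writer when the target version does not store rows. */
    static PartitionWriter createWriter(Version version) {
        return version.storeRows ? new ModernWriter() : new LegacyWriter();
    }

    public static void main(String[] args) {
        PartitionWriter writer = createWriter(new Version("legacy", false));
        List<String> index = writer.append("k1", Arrays.asList("c1", "c2"));
        System.out.println(writer.getClass().getSimpleName() + " index=" + index);
    }
}
```

Keeping the version check in a single factory method means the rest of the 
write path does not need to know which format it is producing, which fits the 
goal of keeping the legacy code isolated.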

[~thobbs] Does this sound like a better starting point, and does it look like 
it will work (even if maybe not efficiently)? Are there any other caveats we 
are missing or should be aware of? Thanks in advance for the help!

> Add support to legacy row serialization on BigTableWriter
> ---------------------------------------------------------
>
>                 Key: CASSANDRA-11877
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11877
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Tools
>            Reporter: Paulo Motta
>            Assignee: Kaide Mu
>            Priority: Minor
>
> In order to support writing pre-3.0 sstables, we must add support to legacy 
> cell serialization to {{BigTableWriter}}. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
