Zhaojun Zhang created CASSANDRA-12416:
-----------------------------------------

             Summary: sstableloader to stream sstables in a sorted order
                 Key: CASSANDRA-12416
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12416
             Project: Cassandra
          Issue Type: Wish
            Reporter: Zhaojun Zhang


Within each sstable, the data is sorted. However, this is not true across 
multiple sstables. We have a workflow that creates a read-only cluster by 
bulk loading data from sstables (written by CQLSSTableWriter) into a Cassandra 
cluster. We don't want to trigger compaction, and the best way to avoid it is 
to write data in sorted order, which requires us to do a global sort across all 
data sources using an external sort algorithm. If we were able to use 
sstableloader to load data into clusters in order, we would not need to do such 
a global sort, which would dramatically simplify our implementation and reduce 
code redundancy. 
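
To illustrate the ordering problem described above, here is a minimal sketch of 
the merge phase of an external sort: several runs that are each sorted on their 
own are combined into one globally sorted stream. It does not use any Cassandra 
or sstableloader API. The Row type, the integer tokens, and the 
merge_sorted_runs function are hypothetical stand-ins for partitions read from 
individual sstables.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for a partition: (token, payload).
Row = Tuple[int, str]


def merge_sorted_runs(runs: Iterable[Iterable[Row]]) -> Iterator[Row]:
    """Lazily merge runs that are each sorted by token into one sorted stream."""
    # heapq.merge only holds one pending row per run in memory at a time.
    return heapq.merge(*runs, key=lambda row: row[0])


# Each inner list is sorted by token, but the lists overlap with each other,
# just as separate sstables are sorted internally but not relative to each other.
runs: List[List[Row]] = [
    [(1, "a"), (7, "d")],
    [(2, "b"), (9, "e")],
    [(5, "c")],
]

merged = list(merge_sorted_runs(runs))
```

The merge is lazy, so the inputs can be far larger than memory as long as each 
run can be read sequentially. This is the step we would no longer need to 
implement ourselves if sstableloader streamed sstables in sorted order.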



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
