Too many small size sstables after loading data using sstableloader or 
BulkOutputFormat increases compaction time.
------------------------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-3943
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3943
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop, Tools
    Affects Versions: 0.8.2, 1.1.0
            Reporter: Samarth Gahire
            Assignee: Brandon Williams
            Priority: Minor
             Fix For: 1.1.0


When we create sstables using SimpleUnsortedWriter or BulkOutputFormat, the size 
of each sstable created is around the buffer size provided.
But after loading, the sstables created on the cluster nodes are of size around
{code}( (sstable_size_before_loading) * replication_factor ) / 
No_Of_Nodes_In_Cluster{code}
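
To make the estimate above concrete, here is a minimal sketch of the arithmetic. The function name and the sample values (256 MB buffer, replication factor 3) are hypothetical and chosen only for illustration. It also assumes data is spread evenly across nodes, which the formula implies but a real cluster may not satisfy.

```python
def loaded_sstable_size_mb(size_before_loading_mb, replication_factor, node_count):
    """Approximate per-node sstable size after loading, using the estimate
    (sstable_size_before_loading * replication_factor) / No_Of_Nodes_In_Cluster.
    Assumes an even data distribution across nodes (an assumption)."""
    return size_before_loading_mb * replication_factor / node_count


# Hypothetical example: 256 MB sstables before loading, replication factor 3.
for nodes in (3, 6, 12, 48):
    size = loaded_sstable_size_mb(256, 3, nodes)
    print(f"{nodes:>3} nodes -> ~{size:g} MB per loaded sstable")
```

With these sample numbers the estimated size halves each time the node count doubles, which is the effect described in this report.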

As the number of nodes in the cluster increases, the size of each sstable loaded 
onto a Cassandra node decreases. Such small sstables take much longer to compact 
(minor compaction) than relatively large sstables.
One solution that we have tried is to increase the buffer size while generating 
sstables. But as we increase the buffer size, the time taken to generate the 
sstables increases. Is there any solution to this in existing versions, or are 
you fixing this in a future version?
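
For reference, the workaround of raising the buffer size can be sized by inverting the same estimate. This is only a rough sketch: the function name and the sample numbers are hypothetical, and it inherits the even-distribution assumption of the estimate given earlier.

```python
def buffer_size_for_target_mb(target_loaded_mb, replication_factor, node_count):
    """Buffer size (MB) needed so that loaded sstables come out near
    target_loaded_mb per node, by inverting
    loaded = (before * replication_factor) / node_count.
    Hypothetical helper; assumes an even data distribution."""
    return target_loaded_mb * node_count / replication_factor


# Hypothetical example: aiming for ~64 MB loaded sstables on 12 nodes, RF 3.
print(buffer_size_for_target_mb(64, 3, 12))
```

The required buffer grows linearly with the node count, which is why this workaround becomes costly on larger clusters.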


        
