[ https://issues.apache.org/jira/browse/CASSANDRA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-3943.
---------------------------------------

       Resolution: Won't Fix
    Fix Version/s:     (was: 2.0)

Happy to see this picked up again but I think CASSANDRA-5371 addresses the low-hanging fruit here.

> Too many small size sstables after loading data using sstableloader or BulkOutputFormat increases compaction time.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3943
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3943
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Hadoop, Tools
>    Affects Versions: 0.8.2, 1.1.0
>            Reporter: Samarth Gahire
>            Priority: Minor
>              Labels: bulkloader, hadoop, sstableloader, streaming, tools
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When we create sstables using SimpleUnsortedWriter or BulkOutputFormat, the
> size of the sstables created is around the buffer size provided.
> But after loading, the sstables created on the cluster nodes are of size around
> {code}( (sstable_size_before_loading) * replication_factor ) / No_Of_Nodes_In_Cluster{code}
> As the number of nodes in the cluster increases, the size of each sstable loaded
> onto a Cassandra node decreases. Such small sstables take too much time to
> compact (minor compaction) compared to relatively large sstables.
> One solution that we have tried is to increase the buffer size while
> generating sstables. But as we increase the buffer size, the time taken to
> generate sstables increases. Is there any solution to this in existing
> versions, or is a fix planned for a future version?
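
To make the size estimate in the report concrete, here is a minimal sketch that evaluates the reporter's formula for a few cluster sizes. The function name, the 256 MB buffer size, and the replication factor of 3 are illustrative assumptions, not values from the ticket, and the formula itself is the reporter's approximation rather than a documented Cassandra guarantee.

```python
def loaded_sstable_size(sstable_size_before_loading, replication_factor, no_of_nodes_in_cluster):
    """Approximate per-node sstable size after loading, using the formula from the report.

    The result is in the same unit as sstable_size_before_loading.
    """
    if no_of_nodes_in_cluster <= 0:
        raise ValueError("cluster must contain at least one node")
    return (sstable_size_before_loading * replication_factor) / no_of_nodes_in_cluster


if __name__ == "__main__":
    # Hypothetical inputs: 256 MB sstables before loading, replication factor 3.
    for nodes in (3, 12, 48):
        size_mb = loaded_sstable_size(256, 3, nodes)
        print(f"{nodes} nodes: about {size_mb:.0f} MB per loaded sstable")
```

Under these assumed inputs, each loaded sstable shrinks in proportion to the node count, which is the effect the report describes: more nodes means more, smaller sstables for minor compaction to work through.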