[ https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yang Yang updated CASSANDRA-3073:
---------------------------------

    Attachment: 0001-liveSize-is-different-from-throughput-particularly-w.patch

Simple fix.

> liveSize() calculation is wrong in case of overwrite
> ----------------------------------------------------
>
>                 Key: CASSANDRA-3073
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Yang Yang
>            Priority: Minor
>         Attachments: 0001-liveSize-is-different-from-throughput-particularly-w.patch
>
>
> Currently liveSize() is the sum of currentThroughput.
> This definition is wrong if most of the operations are overwrites, or counter updates (which are essentially overwrites).
> For example, the following should always keep a single entry in the database (one row, one column family, one column), so the live size should be only about 100 bytes:
>
> connect localhost/9160;
> create keyspace blah;
> use blah;
> create column family cf2 with memtable_throughput=1024 and memtable_operations=10000;
>
> In cassandra.yaml, set
>
> memtable_total_space_in_mb: 20
>
> to make the error appear faster (the same issue appears with the default value, just more slowly).
> Then run a simple pycassa script:
>
> >>> pool = pycassa.connect('blah')
> >>> mycf = pycassa.ColumnFamily(pool, "cf2")
> >>> for x in range(1, 10000000):
> ...     xx = mycf.insert('key1', {'col1': "{}".format(x)})
> ...
>
> You will see SSTables being generated that are only a few KB in size, even though the CF options were set to produce large SSTables.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
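The overcounting the report describes can be sketched with a small Python simulation. This is illustrative only, not Cassandra's actual Java implementation: `memtable` and `current_throughput` are stand-ins for the real data structures, but the arithmetic mirrors the bug (liveSize() sums every write's size, while overwrites leave only the last value live).

```python
def simulate(num_overwrites):
    """Overwrite one column repeatedly and compare the two size measures."""
    memtable = {}            # (key, column) -> value; stand-in for the memtable
    current_throughput = 0   # naive counter that grows on every insert

    for i in range(num_overwrites):
        value = "{}".format(i)
        memtable[("key1", "col1")] = value   # overwrite the same cell
        current_throughput += len(value)     # what liveSize() effectively sums

    # The true live size: only the latest value for the single cell survives.
    live_size = sum(len(v) for v in memtable.values())
    return current_throughput, live_size

naive, live = simulate(10000)
# naive grows with the number of writes (tens of KB here) while live stays
# a few bytes, so a flush triggered by the naive number against
# memtable_total_space_in_mb produces SSTables far smaller than expected.
```

Because the memtable space accounting uses the naive sum, the 20 MB budget fills almost immediately under this workload, and each flush writes out only the handful of live bytes, matching the few-KB SSTables observed in the report.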