[jira] [Commented] (CASSANDRA-3073) liveSize() calculation is wrong in case of overwrite

Yang Yang (JIRA) Tue, 23 Aug 2011 20:04:16 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089970#comment-13089970
 ]


Yang Yang commented on CASSANDRA-3073:
--------------------------------------

I think OOM is an orthogonal issue. 

Right now the limit is "memtable_total_space_in_mb". The natural semantics of 
this limit are what this fix implements (space actually occupied, not bytes 
written). The old implementation does not reflect these semantics, and that is 
the problem.

If we want to limit throughput, we already have the memtable_throughput 
param (though it applies only at the CF level). If the old behavior is kept, 
it is at least necessary to rename memtable_total_space_in_mb to something 
else, such as "memtable_total_throughput_in_mb".


If the OOM appears with SlabAllocator but not with the JVM native allocator, 
isn't that a problem with SlabAllocator itself?

> liveSize() calculation is wrong in case of overwrite
> ----------------------------------------------------
>
>                 Key: CASSANDRA-3073
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3073
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Yang Yang
>            Priority: Minor
>         Attachments: 
> 0001-liveSize-is-different-from-throughput-particularly-w.patch
>
>
> Currently liveSize() is the sum of currentThroughput.
> This definition is wrong if most of the operations are overwrites or counter 
> updates (which are essentially overwrites).
> For example, the following code should always keep a single entry in the db, 
> with one row, one cf, one column, and supposedly should have a size of only 
> about 100 bytes.
> connect localhost/9160;  
> create keyspace blah;
> use blah;
> create column family cf2 with memtable_throughput=1024 and 
> memtable_operations=10000  ;
> In cassandra.yaml, set
> memtable_total_space_in_mb: 20
> to make the error appear faster (the same issue still appears with the 
> default value).
> Then we use a simple pycassa script:
> >>> import pycassa
> >>> pool = pycassa.connect('blah')
> >>> mycf = pycassa.ColumnFamily(pool, "cf2")
> >>> for x in range(1, 10000000):
> ...     xx = mycf.insert('key1', {'col1': "{}".format(x)})
> ... 
> You will see SSTables being generated with sizes of only a few KB, even 
> though we set the CF options to get large SSTables.
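
The gap between the two accounting schemes in the description can be illustrated with a toy model. This is only a sketch in plain Python, not Cassandra code: the ToyMemtable class and its attribute names are hypothetical, and sizes are counted as column name plus value bytes only, ignoring all per-entry overhead. It repeats the overwrite loop from the pycassa script (with fewer iterations) and tracks both the bytes ever written and the bytes actually held.

```python
# Toy model only: names and the size formula are assumptions for illustration,
# not Cassandra's actual Memtable implementation.

class ToyMemtable:
    def __init__(self):
        self.columns = {}            # (row_key, column_name) -> value bytes
        self.current_throughput = 0  # bytes ever written, overwrites included
        self.live_bytes = 0          # bytes currently held in the table

    def insert(self, row_key, column_name, value):
        key = (row_key, column_name)
        size = len(column_name) + len(value)

        # Throughput-style accounting: every write adds to the total.
        self.current_throughput += size

        # Live-size accounting: an overwrite first gives back the old bytes.
        old_value = self.columns.get(key)
        if old_value is not None:
            self.live_bytes -= len(column_name) + len(old_value)
        self.live_bytes += size

        self.columns[key] = value


memtable = ToyMemtable()
for x in range(1, 100001):
    # Same pattern as the pycassa loop: one row, one column, overwritten.
    memtable.insert(b"key1", b"col1", str(x).encode("ascii"))

print("entries held:      ", len(memtable.columns))
print("bytes ever written:", memtable.current_throughput)
print("bytes held now:    ", memtable.live_bytes)
```

In this model the written-bytes counter grows with every insert while the held-bytes counter stays at the size of one column, so a flush threshold compared against the first number fires long before the table holds that much data.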

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
