In our case we have a continuous flow of data to be cached. Every second
we receive tens of PUT requests. Each request carries about 500 KB of
payload on average, with a TTL of about 20 minutes.
On the other side we have a similar flow of GET requests. Every GET
request is translated into a get-by-key query.
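For concreteness, each PUT/GET pair boils down to something like the
following (a minimal sketch only: the keyspace, table, and column names
are illustrative, and the DataStax Java driver is assumed):

    import java.nio.ByteBuffer;
    import com.datastax.driver.core.*;

    public class CacheClient {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("cache_ks");

            // PUT: one INSERT with a 20-minute TTL; the value is a ~500 KB blob
            PreparedStatement put = session.prepare(
                "INSERT INTO cache (key, value) VALUES (?, ?) USING TTL 1200");
            byte[] payload = new byte[500 * 1024];
            session.execute(put.bind("some-key", ByteBuffer.wrap(payload)));

            // GET: the matching get-by-key read
            PreparedStatement get = session.prepare(
                "SELECT value FROM cache WHERE key = ?");
            Row row = session.execute(get.bind("some-key")).one();

            cluster.close();
        }
    }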
If this is a tombstone problem as suggested by some, and it is OK to turn off
replication as suggested by others, it may be an idea to add an optimization to
Cassandra along the lines of:
    if replication_factor == 1:
        do not create tombstones
Terje
On Jul 2, 2013, at 11:11 PM, Dmitry Olshansky wrote:
Hello,
Thanks to all for your answers and comments.
What we've done:
- increased Java heap memory to 6 GB
- changed replication factor to 1
- set durable_writes to false
- set memtable_total_space_in_mb to 5000
- set commitlog_total_space_in_mb to 6000
If I understand correctly the last
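For reference, the keyspace-level part of this (replication factor 1 and
durable_writes off) can be applied with a single ALTER KEYSPACE, while
memtable_total_space_in_mb and commitlog_total_space_in_mb are cassandra.yaml
settings that need a node restart. A sketch, assuming a keyspace named
cache_ks and the DataStax Java driver:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class ApplyCacheSettings {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect();
            // Keyspace-level changes only; the heap size is set in
            // cassandra-env.sh and the memtable/commitlog caps in cassandra.yaml.
            session.execute("ALTER KEYSPACE cache_ks WITH replication = "
                + "{'class': 'SimpleStrategy', 'replication_factor': 1} "
                + "AND durable_writes = false");
            cluster.close();
        }
    }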
The most effective way to deal with obsolete tombstones in the short-lived
cache case seems to be to drop them on the floor en masse (see the sketch
after the list)... :D
a) have two column families that the application alternates between, modulo
time_period
b) truncate and populate the cold one
c) read from the hot one
d) see https://issues.apache.org/jira/browse/CASSANDRA-2958
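A minimal sketch of the rotation logic, assuming two CFs named cache_a and
cache_b and a 30-minute rotation period (both names and the period are
illustrative):

    public class CacheRotation {
        static final long PERIOD_MS = 30L * 60 * 1000;  // rotation period

        // Pick the hot and cold column families for the current period.
        static String[] hotAndCold(long nowMillis) {
            long period = nowMillis / PERIOD_MS;
            return (period % 2 == 0)
                ? new String[] { "cache_a", "cache_b" }   // { hot, cold }
                : new String[] { "cache_b", "cache_a" };
        }
        // At each period boundary: TRUNCATE the cold CF (its SSTables are
        // dropped wholesale, so no per-row tombstones accumulate), start
        // writing new entries to it, and keep reading from the hot CF.
    }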
Thanks
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 28/06/2013, at 6:30 AM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Jun 26, 2013 at 9:51 PM, aaron morton aa...@thelastpickle.com wrote:
WARNING: disabling durable_writes means that writes are only in memory and
will not be committed to disk until the CFs are flushed. You should *always*
run nodetool drain before shutting down a node in this case.
I'll also add that you are probably running into some memory issues; 2.5 GB
is a low heap size:
    -Xms2500M -Xmx2500M -Xmn400M
If you really do have a cache and want to reduce the disk activity, disable
durable_writes on the KS. That will stop the writes from going to the commit
log, which is
Hello,
we are using Cassandra as the data storage for our caching system. Our
application generates about 20 put and get requests per second. The
average size of one cache item is about 500 KB.
Cache items are placed into one column family with a TTL set to 20-60
minutes. Keys and values are
If you have rapidly expiring data, then tombstones are probably filling your
disk and your heap (depending on how you order the data on disk). To check
whether your queries are affected by tombstones, you might try the query
tracing built into 1.2.
See:
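For example, with the DataStax Java driver you can enable tracing on a single
statement and scan the trace events for tombstone reads (a sketch; keyspace,
table, and key are made up):

    import com.datastax.driver.core.*;

    public class TraceRead {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("cache_ks");

            Statement stmt = new SimpleStatement(
                "SELECT value FROM cache WHERE key = 'some-key'").enableTracing();
            ResultSet rs = session.execute(stmt);

            // Look for events like "Read N live and M tombstoned cells".
            QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
            for (QueryTrace.Event e : trace.getEvents()) {
                System.out.println(e.getDescription());
            }
            cluster.close();
        }
    }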
Apart from what Jeremy said, you can try these:
1) Use replication_factor = 1. It is cache data and you don't need persistence.
2) Try playing with the memtable size.
3) Use the Netflix client library (Astyanax), as it will save one hop by
choosing a node that owns the data as the coordinator (see the sketch below).
4) Work on your schema. You might
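On point 3, a sketch of the token-aware setup with Astyanax (cluster name,
keyspace name, and seed address are placeholders, and the exact builder calls
may differ slightly between Astyanax versions):

    import com.netflix.astyanax.AstyanaxContext;
    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
    import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
    import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
    import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
    import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
    import com.netflix.astyanax.thrift.ThriftFamilyFactory;

    public class TokenAwareClient {
        public static void main(String[] args) {
            AstyanaxContext<Keyspace> ctx = new AstyanaxContext.Builder()
                .forCluster("cache_cluster")
                .forKeyspace("cache_ks")
                .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                    .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                    // TOKEN_AWARE routes each request to a replica that owns
                    // the key, so the coordinator is a data node (one hop saved)
                    .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE))
                .withConnectionPoolConfiguration(
                    new ConnectionPoolConfigurationImpl("pool").setSeeds("127.0.0.1:9160"))
                .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
                .buildKeyspace(ThriftFamilyFactory.getInstance());
            ctx.start();
            Keyspace keyspace = ctx.getClient();
        }
    }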