Thanks, Edward. In our usage scenario there is never any downtime; it's a global 24/7 operation.
Which is impacted worse, reads or writes? How does a node handle compaction when a spike of writes hits it?

Edward Capriolo wrote:
>
> On Sat, Feb 5, 2011 at 11:59 AM, buddhasystem <potek...@bnl.gov> wrote:
>>
>> Just wanted to see if someone with experience in running an actual
>> service can advise me:
>>
>> How often do you run nodetool compact on your nodes? Do you stagger it in
>> time, for each node? How badly is performance affected?
>>
>> I know this all seems too generic, but then again no two clusters are
>> created equal anyhow. Just wanted to get a feel.
>>
>> Thanks,
>> Maxim
>>
>> --
>> View this message in context:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/How-bad-is-teh-impact-of-compaction-on-performance-tp5995868p5995868.html
>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
>> Nabble.com.
>>
>
> This is an interesting topic. Cassandra can now remove tombstones on
> non-major compaction. For some use cases you may not have to trigger
> nodetool compact yourself to remove tombstones. Use cases that do not
> do many updates or deletes may have the least need to run compaction
> manually.
>
> However, if you have smaller SSTables, or fewer SSTables, your read
> operations will be more efficient.
>
> If you have downtime, such as from 1AM-6AM, going through a major
> compaction might shrink your dataset significantly, and that will make
> reads better.
>
> Compaction can be more or less intensive. The largest factor is row
> size. Users with large rows probably see faster compaction, while
> smaller rows see it take a long time. You can lower the priority of
> the compaction thread for experimentation.
>
> As to performance, you want to get your cluster to the state where
> it is not compacting often. This may mean you need more nodes to
> handle writes.
>
> I graph the compaction information from JMX
> http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
> to get a feel for how often a node is compacting on average. I also
> cross-reference the compaction with the read latency and IO graphs I have
> to see what impact compaction has on reads.
>
> Forcing a major compaction also lowers the chances that a compaction will
> happen during the day at peak time. I major compact a few cluster
> nodes each night through cron (gc time 3 days). This has been good for
> keeping our data on disk as small as possible. Forcing the major
> compaction at night uses IO, but I find it saves IO over the course of
> the day because each read seeks less on disk.
>
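
For anyone wanting to try the "few nodes each night through cron" approach described in the quoted message, here is a minimal sketch of one way it might be staggered. It is not Edward's actual script. The host names, the two-nodes-per-night figure, and the helper names (nodes_for_night, compact_command, run_tonight) are all assumptions made for illustration; the only real command used is `nodetool -h <host> compact`. It defaults to a dry run that just prints the commands.

```python
import subprocess
from datetime import date

# Hypothetical host names; replace with the members of your own ring.
NODES = ["cass01", "cass02", "cass03", "cass04", "cass05", "cass06"]
# Assumption: "a few" nodes per night is taken to mean two here.
NODES_PER_NIGHT = 2


def nodes_for_night(nodes, day_number, per_night):
    """Return the group of nodes whose turn it is on the given day.

    Nodes are split into fixed groups of `per_night`, and the groups are
    visited round-robin, so every node is compacted once per cycle.
    """
    if per_night <= 0 or not nodes:
        return []
    groups = [nodes[i:i + per_night] for i in range(0, len(nodes), per_night)]
    return groups[day_number % len(groups)]


def compact_command(host):
    """Build the nodetool invocation for a major compaction on one host."""
    return ["nodetool", "-h", host, "compact"]


def run_tonight(today=None, dry_run=True):
    """Compact tonight's group, one host at a time; return the hosts chosen."""
    today = today or date.today()
    hosts = nodes_for_night(NODES, today.toordinal(), NODES_PER_NIGHT)
    for host in hosts:
        cmd = compact_command(host)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises if nodetool exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return hosts


if __name__ == "__main__":
    run_tonight(dry_run=True)
```

Run from a single admin host out of cron at whatever hour is quietest for you, this would work through the whole ring every three nights with the values above. For a 24/7 cluster with no quiet window, a smaller group per night may be the safer starting point while you watch the read latency graphs.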