> How this parameter works? I have 3 nodes and 2 core each CPU and I have
> higher writes.

It slows down the rate at which compaction reads from disk. It reads a bit, then has to take a break and wait until it can read again.
With only 2 cores you will run into issues when compaction or repair do their work.
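
The "read a bit, then wait" behaviour described above can be sketched as a simple byte-rate limiter. This is only a toy illustration under stated assumptions, not Cassandra's actual throttling code: the names `Throttle`, `acquire` and `throttled_read` are made up for the example, and the target rate stands in for whatever the compaction_throughput_* setting is configured to.

```python
import time


class Throttle:
    """Toy byte-rate limiter (hypothetical, not Cassandra code).

    Sleeps whenever the reader gets ahead of the target rate.
    """

    def __init__(self, bytes_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.bytes_per_sec = bytes_per_sec
        self.clock = clock      # injectable so the sketch can be tested
        self.sleep = sleep
        self.start = clock()
        self.total = 0          # bytes read so far
        self.slept = 0.0        # total seconds spent waiting

    def acquire(self, nbytes):
        """Account for nbytes just read; pause if we are ahead of schedule."""
        self.total += nbytes
        # Earliest moment at which `total` bytes are allowed at the target rate.
        allowed_at = self.start + self.total / self.bytes_per_sec
        delay = allowed_at - self.clock()
        if delay > 0:
            self.sleep(delay)
            self.slept += delay
        return max(delay, 0.0)


def throttled_read(chunks, throttle):
    """Yield chunks, taking a break after each one if the rate is exceeded."""
    for chunk in chunks:
        throttle.acquire(len(chunk))
        yield chunk
```

With a lower target rate the reader spends more time sleeping, so compaction competes less with foreground writes for disk and CPU, at the cost of pending compactions draining more slowly. A higher rate does the opposite, which matches the trade-off discussed in this thread.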
> So usually for high update and high read situation what parameter we should
> consider for tuning?

In this case I think the issue is only having 2 cores. There is background processing, like compaction and repair, that has to run while your system is running.

Slowing down compaction will reduce its impact.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 30/03/2013, at 12:58 AM, Jay Svc <jaytechg...@gmail.com> wrote:

> Hi Aaron,
>
> Thank you for your input. I have been monitoring my GC activities and looking
> at my Heap, it shows pretty linear activities, without any spikes.
>
> When I look at CPU it shows higher utilization during writes alone. I
> also expect heavy read traffic.
>
> When I tried compaction_throughput_* parameter, I observed that higher number
> here in my case gets better CPU utilization and keeps pending compactions
> pretty low. How this parameter works? I have 3 nodes and 2 core each CPU and
> I have higher writes.
>
> So usually for high update and high read situation what parameter we should
> consider for tuning?
>
> Thanks,
> Jay
>
> On Wed, Mar 27, 2013 at 9:55 PM, aaron morton <aa...@thelastpickle.com> wrote:
> * Check for GC activity in the logs
> * check the volume the commit log is on to see if it's over utilised.
> * check if the dropped messages correlate to compaction, look at the
>   compaction_* settings in yaml and consider reducing the throughput.
>
> Like Dean says if you have existing data it will result in more compaction.
> You may be able to get a lot of writes through in a clean new cluster, but it
> also has to work when compaction and repair are running.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 27/03/2013, at 1:43 PM, Jay Svc <jaytechg...@gmail.com> wrote:
>
>> Thanks Dean again!
>>
>> My use case is a high number of reads and writes; out of that I am just
>> focusing on writes for now. I thought LCS was suitable for my situation. I
>> tried similar on STCS and the results are the same.
>>
>> I ran nodetool tpstats and MutationStage pending is very high. At the
>> same time the SSTable count and Pending Compaction are high too during my
>> updates.
>>
>> Please find the snapshot of my syslog.
>>
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:48,560 StatusLogger.java (line 116) OpsCenter.rollups86400 0,0
>> INFO [FlushWriter:55] 2013-03-26 15:05:48,608 Memtable.java (line 264) Writing Memtable-InventoryPrice@1051586614(11438914/129587272 serialized/live bytes, 404320 ops)
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,561 MessagingService.java (line 658) 2701 MUTATION messages dropped in last 5000ms
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,562 StatusLogger.java (line 57) Pool Name               Active  Pending  Blocked
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,563 StatusLogger.java (line 72) ReadStage                    0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,568 StatusLogger.java (line 72) RequestResponseStage         0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,627 StatusLogger.java (line 72) ReadRepairStage              0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,627 StatusLogger.java (line 72) MutationStage               32    19967        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,628 StatusLogger.java (line 72) ReplicateOnWriteStage        0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,628 StatusLogger.java (line 72) GossipStage                  0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,628 StatusLogger.java (line 72) AntiEntropyStage             0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,629 StatusLogger.java (line 72) MigrationStage               0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,629 StatusLogger.java (line 72) StreamStage                  0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,629 StatusLogger.java (line 72) MemtablePostFlusher          1        1        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,673 StatusLogger.java (line 72) FlushWriter                  1        1        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,673 StatusLogger.java (line 72) MiscStage                    0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,673 StatusLogger.java (line 72) commitlog_archiver           0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,674 StatusLogger.java (line 72) InternalResponseStage        0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,674 StatusLogger.java (line 72) HintedHandoff                0        0        0
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,674 StatusLogger.java (line 77) CompactionManager            1       27
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,675 StatusLogger.java (line 89) MessagingService           n/a     0,22
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,724 StatusLogger.java (line 99) Cache Type  Size  Capacity  KeysToSave  Provider
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,725 StatusLogger.java (line 100) KeyCache  142315  2118997  all
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,725 StatusLogger.java (line 106) RowCache  0  0  all  org.apache.cassandra.cache.SerializingCacheProvider
>> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,725 StatusLogger.java (line 113) ColumnFamily  Memtable ops,data
>> INFO [ScheduledTasks:1] 2013-03-26 15:0
>>
>> Thanks,
>> Jay
>>
>> On Tue, Mar 26, 2013 at 7:15 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
>> LCS is generally used for a high read vs. write ratio, though it sounds like
>> you may be doing a heavy write load instead. LCS will involve more
>> compactions as you write to the system compared to STCS because LCS is
>> always trying to keep a 1 to 10 ratio between levels. While LCS will
>> involve more compaction in general (more I/O, more CPU), I am not sure about
>> update vs. insert though. From what I understand STCS will happily duplicate
>> rows across SSTables while LCS does not like to do this, so as you update
>> you will constantly compact….well, that is my understanding.
>> Have you tried STCS out at all? (P.S. This is just from what I understand, so
>> take it with a grain of salt.)
>>
>> Also, there are some great tools in nodetool as well, so you can run
>> nodetool compactionstats, etc. and see how backlogged you are in
>> pending tasks….how many pending?
>>
>> Later,
>> Dean
>>
>> From: Jay Svc <jaytechg...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Tuesday, March 26, 2013 6:08 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Re: Insert v/s Update performance
>>
>> Thanks Dean,
>>
>> I have used LCS with sstable_size_in_mb of 15. I have also tried a bigger
>> sstable_size_in_mb and observed similar behavior.
>>
>> Does compaction work differently for update v/s insert? I believe all keys
>> go to a single SST. What other options should I look into?
>>
>> Thanks,
>> Jay
>>
>> On Tue, Mar 26, 2013 at 6:18 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
>> Most likely compaction kicks in, as updates cause duplicated rows in STCS and
>> compaction causes load that may not have been there before (check your logs).
>> Also, you can increase the number of nodes in your cluster to
>> better handle the load.
>>
>> Later,
>> Dean
>>
>> From: Jay Svc <jaytechg...@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Tuesday, March 26, 2013 5:05 PM
>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Subject: Insert v/s Update performance
>>
>> Hi Team,
>>
>> I have this 3 node cluster. I am writing data to these nodes at the rate of
>> 2,000 records/second. What I observed is that if I do inserts (meaning records
>> for those keys do not exist; my column family has 0 records to start with)
>> then I have better write performance, low SSTable count, low pending
>> compactions, acceptable write latency, and CPU utilization on each node
>> between 35% and 85%.
>>
>> When I ran the same test but for updates this time (meaning records already
>> exist in the column family with the same key), I observed that my SSTable
>> count went up 3 times, pending compactions went up more than 2 times, write
>> latency went up too, and CPU utilization was almost 92% to 100%.
>>
>> What is the reason for the deteriorating update performance v/s insert
>> performance? Since this is critical your help is highly appreciated.
>>
>> P.S. I also observed a high number of pending MutationStage tasks in my
>> nodetool tpstats.
>>
>> Thanks,
>> Jay
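
Dean's point that updates leave duplicated rows across SSTables, which compaction then has to merge, can be illustrated with a toy model. This is a rough sketch under simplifying assumptions, not how Cassandra stores data: an "SSTable" here is just an immutable dict flushed from an in-memory table, and `flush`, `compact` and `redundant_cells` are hypothetical helper names invented for the example.

```python
def flush(writes, memtable_size):
    """Turn a stream of (key, value, timestamp) writes into toy 'SSTables'.

    Each SSTable is a dict flushed once the in-memory table holds
    memtable_size distinct keys; flushed tables are never modified again.
    """
    sstables = []
    memtable = {}
    for key, value, ts in writes:
        memtable[key] = (ts, value)
        if len(memtable) >= memtable_size:
            sstables.append(memtable)
            memtable = {}
    if memtable:
        sstables.append(memtable)
    return sstables


def compact(sstables):
    """Merge SSTables, keeping only the newest version of each key."""
    merged = {}
    for table in sstables:
        for key, (ts, value) in table.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged


def redundant_cells(sstables):
    """Count stored entries that compaction would throw away."""
    return sum(len(t) for t in sstables) - len(compact(sstables))


# Insert-only load: every key is written once.
inserts = [(f"k{i}", i, i) for i in range(100)]
# Update load: the same 100 keys are written a second time, later.
updates = inserts + [(f"k{i}", -i, 100 + i) for i in range(100)]

print(redundant_cells(flush(inserts, 10)))   # nothing to merge away
print(redundant_cells(flush(updates, 10)))   # one stale version per key
```

In this model the insert-only run leaves compaction nothing to discard, while the update run stores a stale version of every key that must be read and merged away, which is consistent with the higher SSTable count and pending compactions reported for the update test.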