There are other threads linked to this issue. Most notably, I think we're hitting
https://issues.apache.org/jira/browse/CASSANDRA-1014 here.

2010/4/27 Schubert Zhang

> Seems:
>
> ROW-MUTATION-STAGE 32 3349 63897493
> is the clue, too many mutation requests are pending.
>
> Yes, I also think cassandra should add a mechanism to avoid too many
> requests pending (in queue).
> When the queue is full, just reject the request from client.
>
> Seems https://issues.apache.org/jira/browse/CASSANDRA-685 is what we want.
>
> On Tue, Apr 27, 2010 at 8:16 PM, Eric Yu wrote:
>
>> I wrote a script to record the tpstats output every 5 seconds.
>> Here is the output just before the jvm OOM:
>>
>> Pool Name                   Active   Pending   Completed
>> FILEUTILS-DELETE-POOL            0         0         280
>> STREAM-STAGE                     0         0           0
>> RESPONSE-STAGE                   0         0      245573
>> ROW-READ-STAGE                   0         0           0
>> LB-OPERATIONS                    0         0           0
>> MESSAGE-DESERIALIZER-POOL        1  14290091    65943291
>> GMFD                             0         0       26670
>> LB-TARGET                        0         0           0
>> CONSISTENCY-MANAGER              0         0           0
>> ROW-MUTATION-STAGE              32      3349    63897493
>> MESSAGE-STREAMING-POOL           0         0           3
>> LOAD-BALANCER-STAGE              0         0           0
>> FLUSH-SORTER-POOL                0         0           0
>> MEMTABLE-POST-FLUSHER            0         0         420
>> FLUSH-WRITER-POOL                0         0         420
>> AE-SERVICE-STAGE                 1         1           4
>> HINTED-HANDOFF-POOL              0         0          52
>>
>> On Tue, Apr 27, 2010 at 10:53 AM, Chris Goffinet wrote:
>>
>>> I'll work on doing more tests around this. In 0.5 we used a different
>>> data structure that required polling. But this does seem problematic.
>>>
>>> -Chris
>>>
>>> On Apr 26, 2010, at 7:04 PM, Eric Yu wrote:
>>>
>>> I have the same problem here, and I analyzed the hprof file with MAT;
>>> as you said, LinkedBlockingQueue used 2.6GB.
>>> I think the ThreadPool of cassandra should limit the queue size.
>>>
>>> cassandra 0.6.1
>>>
>>> java version
>>> $ java -version
>>> java version "1.6.0_20"
>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
>>>
>>> iostat
>>> $ iostat -x -l 1
>>> Device:  rrqm/s   wrqm/s     r/s    w/s     rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
>>> sda       81.00  8175.00  224.00  17.00  23984.00  2728.00   221.68     1.01   1.86   0.76  18.20
>>>
>>> tpstats (of course, this node is still alive)
>>> $ ./nodetool -host localhost tpstats
>>> Pool Name                   Active   Pending   Completed
>>> FILEUTILS-DELETE-POOL            0         0        1281
>>> STREAM-STAGE                     0         0           0
>>> RESPONSE-STAGE                   0         0   473617241
>>> ROW-READ-STAGE                   0         0           0
>>> LB-OPERATIONS                    0         0           0
>>> MESSAGE-DESERIALIZER-POOL        0         0   718355184
>>> GMFD                             0         0      132509
>>> LB-TARGET                        0         0           0
>>> CONSISTENCY-MANAGER              0         0           0
>>> ROW-MUTATION-STAGE               0         0   293735704
>>> MESSAGE-STREAMING-POOL           0         0           6
>>> LOAD-BALANCER-STAGE              0         0           0
>>> FLUSH-SORTER-POOL                0         0           0
>>> MEMTABLE-POST-FLUSHER            0         0        1870
>>> FLUSH-WRITER-POOL                0         0        1870
>>> AE-SERVICE-STAGE                 0         0           5
>>> HINTED-HANDOFF-POOL              0         0          21
>>>
>>> On Tue, Apr 27, 2010 at 3:32 AM, Chris Goffinet wrote:
>>>
>>>> Upgrade to b20 of Sun's version of JVM. This OOM might be related to
>>>> LinkedBlockingQueue issues that were fixed.
>>>>
>>>> -Chris
>>>>
>>>> 2010/4/26 Roland Hänel
>>>>
>>>>> Cassandra Version 0.6.1
>>>>> OpenJDK Server VM (build 14.0-b16, mixed mode)
>>>>> Import speed is about 10MB/s for the full cluster; if a compaction is
>>>>> going on the individual node is I/O limited
>>>>> tpstats: caught me, didn't know this. I will set up a test and try to
>>>>> catch a node during the critical time.
>>>>>
>>>>> Thanks,
>>>>> Roland
>>>>>
>>>>> 2010/4/26 Chris Goffinet
>>>>>
>>>>>> Which version of Cassandra?
>>>>>> Which version of Java JVM are you using?
>>>>>> What do your I/O stats look like when bulk importing?
>>>>>> When you run `nodeprobe -host XXXX tpstats` is any thread pool backing
>>>>>> up during the import?
>>>>>>
>>>>>> -Chris
>>>>>>
>>>>>> 2010/4/26 Roland Hänel
>>>>>>
>>>>>>> I have a cluster of 5 machines building a Cassandra datastore, and I
>>>>>>> load bulk data into this using the Java Thrift API. The first ~250GB
>>>>>>> runs fine, then one of the nodes starts to throw OutOfMemory
>>>>>>> exceptions. I'm not using any row or index caches, and since I only
>>>>>>> have 5 CF's and some 2.5 GB of RAM allocated to the JVM (-Xmx2500M),
>>>>>>> in theory, that should not happen. All inserts are done with
>>>>>>> consistency level ALL.
>>>>>>>
>>>>>>> I hope with this I have avoided all the 'usual dummy errors' that
>>>>>>> lead to OOM's. I have begun to troubleshoot the issue with JMX;
>>>>>>> however, it's difficult to catch the JVM at the right moment because
>>>>>>> it runs well for several hours before this thing happens.
>>>>>>>
>>>>>>> One thing comes to mind, maybe one of the experts could confirm or
>>>>>>> reject this idea for me: is it possible that when one machine slows
>>>>>>> down a little bit (for example because a big compaction is going on),
>>>>>>> the memtables don't get flushed to disk as fast as they are building
>>>>>>> up under the continuing bulk import? That would result in a downward
>>>>>>> spiral: the system gets slower and slower on disk I/O, but since more
>>>>>>> and more data arrives over Thrift, finally OOM.
>>>>>>>
>>>>>>> I'm using the "periodic" commit log sync, maybe also this could
>>>>>>> create a situation where the commit log writer is too slow to catch
>>>>>>> up with the data intake, resulting in ever growing memory usage?
>>>>>>>
>>>>>>> Maybe these thoughts are just bullshit. Let me know if so... ;-)
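
For what it's worth, the "limit the queue size and reject the request
when the queue is full" idea from the quoted thread can be sketched with
plain java.util.concurrent. This is only an illustration under my own
assumptions: it is not Cassandra's stage code and not necessarily what
CASSANDRA-685 ends up doing. The class name BoundedStage and both helper
methods are made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Cassandra code: a fixed-size pool whose work
// queue is bounded, so excess work is rejected instead of piling up on
// the heap the way an unbounded queue allows.
public class BoundedStage {

    // Fixed number of worker threads plus at most queueCapacity pending tasks.
    // AbortPolicy makes execute() throw RejectedExecutionException when both
    // the workers and the queue are full.
    public static ThreadPoolExecutor newBoundedStage(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits n tasks that all block until released (simulating a slow stage)
    // and returns how many submissions were rejected.
    public static int countRejected(int threads, int queueCapacity, int n) {
        ThreadPoolExecutor pool = newBoundedStage(threads, queueCapacity);
        final CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < n; i++) {
            try {
                pool.execute(new Runnable() {
                    public void run() {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // a real server would return an error to the client here
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 1 worker + 2 queue slots, burst of 10 blocked tasks
        System.out.println("rejected: " + countRejected(1, 2, 10));
    }
}
```

With one worker and room for two pending tasks, a burst of ten blocked
tasks leaves seven rejected. The client sees an error it can back off on,
and the pending count can never grow past the configured capacity.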
