Upgrade to b20 of Sun's JVM. This OOM might be related to the LinkedBlockingQueue issues that were fixed.
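
To confirm which JVM build a node is actually running before and after an upgrade, something like the following can help. It is only a minimal sketch: the class name `JvmInfo` and the method `describe` are made up for illustration, and it reads nothing but the standard `java.vm.*` and `java.version` system properties.

```java
public class JvmInfo {
    /**
     * Builds a one-line description of the running JVM from standard
     * system properties (vendor, VM name, VM build, Java version).
     */
    public static String describe() {
        return System.getProperty("java.vm.vendor") + " "
                + System.getProperty("java.vm.name")
                + " (build " + System.getProperty("java.vm.version") + ")"
                + ", Java " + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        // Print the description so it can be compared across nodes.
        System.out.println(describe());
    }
}
```

Running it with the same `java` binary that starts Cassandra should show the same vendor and build string as that JVM's startup banner.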
-Chris

2010/4/26 Roland Hänel <[email protected]>

> Cassandra Version 0.6.1
> OpenJDK Server VM (build 14.0-b16, mixed mode)
> Import speed is about 10MB/s for the full cluster; if a compaction is going
> on, the individual node is I/O limited
> tpstats: caught me, didn't know this. I will set up a test and try to catch
> a node during the critical time.
>
> Thanks,
> Roland
>
>
> 2010/4/26 Chris Goffinet <[email protected]>
>
>> Which version of Cassandra?
>> Which version of Java JVM are you using?
>> What do your I/O stats look like when bulk importing?
>> When you run `nodeprobe -host XXXX tpstats` is any thread pool backing up
>> during the import?
>>
>> -Chris
>>
>>
>> 2010/4/26 Roland Hänel <[email protected]>
>>
>>> I have a cluster of 5 machines building a Cassandra datastore, and I load
>>> bulk data into this using the Java Thrift API. The first ~250GB runs fine,
>>> then one of the nodes starts to throw OutOfMemory exceptions. I'm not using
>>> any row or index caches, and since I only have 5 CF's and some 2.5 GB of RAM
>>> allocated to the JVM (-Xmx2500M), in theory, that should not happen. All
>>> inserts are done with consistency level ALL.
>>>
>>> I hope with this I have avoided all the 'usual dummy errors' that lead to
>>> OOM's. I have begun to troubleshoot the issue with JMX; however, it's
>>> difficult to catch the JVM at the right moment because it runs well for
>>> several hours before this thing happens.
>>>
>>> One thing comes to mind, maybe one of the experts could confirm or
>>> reject this idea for me: is it possible that when one machine slows down a
>>> little bit (for example because a big compaction is going on), the memtables
>>> don't get flushed to disk as fast as they are building up under the
>>> continuing bulk import? That would result in a downward spiral: the system
>>> gets slower and slower on disk I/O, but since more and more data arrives
>>> over Thrift, finally OOM.
>>>
>>> I'm using the "periodic" commit log sync; maybe this could also create a
>>> situation where the commit log writer is too slow to catch up with the data
>>> intake, resulting in ever-growing memory usage?
>>>
>>> Maybe these thoughts are just bullshit. Let me know if so... ;-)
>>>
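
The "downward spiral" idea in the quoted message (data arriving faster than it can be flushed) can be pictured with a toy model. This is only a sketch of the general producer/consumer effect, not of how Cassandra manages memtables or the commit log: the class `BackpressureSketch`, the method `simulate`, and all the numbers are hypothetical. It uses `java.util.concurrent.LinkedBlockingQueue` only as a convenient queue with an optional capacity bound.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class BackpressureSketch {
    /**
     * Runs `ticks` rounds. In each round a writer offers `inPerTick` items and
     * a flusher removes up to `outPerTick` items.
     * Returns {items still queued, items rejected because the queue was full}.
     */
    public static int[] simulate(int capacity, int ticks, int inPerTick, int outPerTick) {
        LinkedBlockingQueue<Integer> pending = new LinkedBlockingQueue<>(capacity);
        int rejected = 0;
        for (int t = 0; t < ticks; t++) {
            for (int i = 0; i < inPerTick; i++) {
                // offer() returns false instead of blocking when the queue is full
                if (!pending.offer(i)) {
                    rejected++;
                }
            }
            for (int i = 0; i < outPerTick; i++) {
                if (pending.poll() == null) {
                    break; // nothing left to flush this round
                }
            }
        }
        return new int[] { pending.size(), rejected };
    }

    public static void main(String[] args) {
        // Effectively unbounded queue: the backlog keeps growing with every tick.
        int[] unbounded = simulate(Integer.MAX_VALUE, 100, 10, 6);
        // Bounded queue: the backlog is capped and the writer is pushed back.
        int[] bounded = simulate(50, 100, 10, 6);
        System.out.println("unbounded backlog=" + unbounded[0] + " rejected=" + unbounded[1]);
        System.out.println("bounded backlog=" + bounded[0] + " rejected=" + bounded[1]);
    }
}
```

With an unbounded queue nothing is ever rejected, so memory use grows for as long as intake exceeds the flush rate. With a bound, the backlog stays at or below the capacity and the excess shows up as rejected (or, with `put()`, blocked) writes instead.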
