If your commit logs are not getting cleared, doesn't that indicate your load
is more than your servers can handle?
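
One rough way to check this against the quoted log is to measure how fast the
heap fills between ParNew collections. The following is a minimal Python
sketch, not something from the original report. The names parse_gc and
heap_trend are hypothetical, the regex assumes the "GC for ..." line format
seen in the log below, and the sample embeds only three of the quoted lines,
unwrapped.

```python
import re
from datetime import datetime

# Three GC lines copied from the quoted log, rejoined onto single lines.
LOG = """\
 INFO 14:20:02,199 GC for ParNew: 327 ms, 57545496 reclaimed leaving 7955651792 used; max is 9773776896
 INFO 14:20:08,904 GC for ParNew: 462 ms, 59185992 reclaimed leaving 8816581312 used; max is 9773776896
 INFO 14:20:13,150 GC for ParNew: 441 ms, 61879728 reclaimed leaving 9318140296 used; max is 9773776896
"""

# Assumed line format, based only on the lines shown in this thread.
GC_LINE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2},\d{3}) GC for (?P<kind>\w+): (?P<ms>\d+) ms, "
    r"(?P<reclaimed>\d+) reclaimed leaving (?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc(text):
    """Return a list of (timestamp, used_bytes, max_bytes) per GC line."""
    samples = []
    for m in GC_LINE.finditer(text):
        # The log has no date, so timestamps are only comparable within a day.
        t = datetime.strptime(m.group("time"), "%H:%M:%S,%f")
        samples.append((t, int(m.group("used")), int(m.group("max"))))
    return samples


def heap_trend(samples):
    """Return (first occupancy, last occupancy, growth in bytes per second)."""
    (t0, used0, max0), (t1, used1, max1) = samples[0], samples[-1]
    seconds = (t1 - t0).total_seconds()
    return used0 / max0, used1 / max1, (used1 - used0) / seconds


if __name__ == "__main__":
    first, last, rate = heap_trend(parse_gc(LOG))
    print(f"heap {first:.1%} -> {last:.1%} full, growing {rate / 2**20:.0f} MiB/s")
```

On those three samples the heap goes from roughly 81% to roughly 95% of its
maximum in about 11 seconds, while each ParNew pass reclaims under 62 MB. That
would fit the theory that writes are arriving faster than the node can flush
them, though it does not prove it.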


On Mon, Aug 9, 2010 at 4:50 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> I have a 16-node 6.3 cluster, and two of its nodes are giving me
> major headaches.
>
> 10.71.71.56   Up         58.19 GB   108271662202116783829255556910108067277    |   ^
> 10.71.71.61   Down       67.77 GB   123739042516704895804863493611552076888    v   |
> 10.71.71.66   Up         43.51 GB   127605887595351923798765477786913079296    |   ^
> 10.71.71.59   Down       90.22 GB   139206422831293007780471430312996086499    v   |
> 10.71.71.65   Up         22.97 GB   148873535527910577765226390751398592512    |   ^
>
> The symptoms I am seeing are that nodes 61 and 59 have huge (6 GB+)
> commit log directories. They keep growing, along with memory usage.
> Eventually the logs start showing GCInspection errors, and then the
> nodes go OOM:
>
>  INFO 14:20:01,296 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1281378001296.log
>  INFO 14:20:02,199 GC for ParNew: 327 ms, 57545496 reclaimed leaving 7955651792 used; max is 9773776896
>  INFO 14:20:03,201 GC for ParNew: 443 ms, 45124504 reclaimed leaving 8137412920 used; max is 9773776896
>  INFO 14:20:04,314 GC for ParNew: 438 ms, 54158832 reclaimed leaving 8310139720 used; max is 9773776896
>  INFO 14:20:05,547 GC for ParNew: 409 ms, 56888760 reclaimed leaving 8480136592 used; max is 9773776896
>  INFO 14:20:06,900 GC for ParNew: 441 ms, 58149704 reclaimed leaving 8648872520 used; max is 9773776896
>  INFO 14:20:08,904 GC for ParNew: 462 ms, 59185992 reclaimed leaving 8816581312 used; max is 9773776896
>  INFO 14:20:09,973 GC for ParNew: 460 ms, 57403840 reclaimed leaving 8986063136 used; max is 9773776896
>  INFO 14:20:11,976 GC for ParNew: 447 ms, 59814376 reclaimed leaving 9153134392 used; max is 9773776896
>  INFO 14:20:13,150 GC for ParNew: 441 ms, 61879728 reclaimed leaving 9318140296 used; max is 9773776896
> java.lang.OutOfMemoryError: Java heap space
> Dumping heap to java_pid10913.hprof ...
>  INFO 14:22:30,620 InetAddress /10.71.71.66 is now dead.
>  INFO 14:22:30,621 InetAddress /10.71.71.65 is now dead.
>  INFO 14:22:30,621 GC for ConcurrentMarkSweep: 44862 ms, 261200 reclaimed leaving 9334753480 used; max is 9773776896
>  INFO 14:22:30,621 InetAddress /10.71.71.64 is now dead.
>
> Heap dump file created [12730501093 bytes in 253.445 secs]
> ERROR 14:28:08,945 Uncaught exception in thread Thread[Thread-2288,5,main]
> java.lang.OutOfMemoryError: Java heap space
>        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:71)
> ERROR 14:28:08,948 Uncaught exception in thread Thread[Thread-2281,5,main]
> java.lang.OutOfMemoryError: Java heap space
>        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:71)
>  INFO 14:28:09,017 GC for ConcurrentMarkSweep: 33737 ms, 85880 reclaimed leaving 9335215296 used; max is 9773776896
>
> Does anyone have any ideas what is going on?
>
