Hello,
We are trying to add new nodes to our *6-node* Cassandra cluster
(RF=3, Cassandra version 1.0.11). We are *adding 18 new nodes* one by one.
The first strange thing I've noticed is that the number of completed
MigrationStage tasks in nodetool tpstats grows with every new node, even
though the schema has not changed. Now, with a 21-node ring, the latest
join shows 184683 migrations, while with 7 nodes it was about 50k.
In fact, it seems that this number is not the number of applied migrations.
When I grep the log file with
grep -c "Applying migration" /var/log/cassandra/system.log
the result for each new node is pretty much the same: around 7500
"Applying migration" lines in the log.
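To compare this count across nodes or over time, here is a minimal
Python sketch. It assumes each log line carries a timestamp in the form
seen in our logs (e.g. 2012-09-19 18:51:22,497). The function name
count_migrations_per_day is only an illustration and is not part of any
Cassandra tooling.

```python
import re
from collections import Counter

# Timestamp layout as it appears in the log excerpts in this message,
# e.g. "2012-09-19 18:51:22,497". Only the date part is captured.
DATE_RE = re.compile(r"(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2},\d{3}")


def count_migrations_per_day(lines, needle="Applying migration"):
    """Count lines containing `needle`, grouped by the date in the line.

    Lines that contain the needle but no recognizable timestamp are
    counted under the key "unknown".
    """
    counts = Counter()
    for line in lines:
        if needle not in line:
            continue
        match = DATE_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

It can be fed an open file directly, for example
count_migrations_per_day(open("/var/log/cassandra/system.log")), and the
sum of all values should equal what grep -c reports.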
The real problem is that new nodes now fail with an OutOfMemoryError while
building the schema from migrations. In the logs we find the following:
WARN [ScheduledTasks:1] 2012-09-19 18:51:22,497 GCInspector.java (line
145) Heap is 0.7712290960125684 full. You may need to reduce memtable
and/or cache sizes. Cassandra will now flush up to the two largest
memtables to free up memory. Adjust flush_largest_memtables_at
threshold in cassandra.yaml if you don't want Cassandra to do this
automatically
INFO [ScheduledTasks:1] 2012-09-19 18:51:22,498 StorageService.java
(line 2658) Unable to reduce heap usage since there are no dirty column
families
....
WARN [ScheduledTasks:1] 2012-09-19 18:51:29,500 GCInspector.java (line
139) Heap is 0.853078131310858 full. You may need to reduce memtable
and/or cache sizes. Cassandra is now reducing cache sizes to free up
memory. Adjust reduce_cache_sizes_at threshold in cassandra.yaml if you
don't want Cassandra to do this automatically
WARN [ScheduledTasks:1] 2012-09-19 18:51:29,500 AutoSavingCache.java
(line 187) Reducing AppUser RowCache capacity from 100000 to 0 to reduce
memory pressure
WARN [ScheduledTasks:1] 2012-09-19 18:51:29,500 AutoSavingCache.java
(line 187) Reducing AppUser KeyCache capacity from 100000 to 0 to reduce
memory pressure
WARN [ScheduledTasks:1] 2012-09-19 18:51:29,500 AutoSavingCache.java
(line 187) Reducing PaymentClaim KeyCache capacity from 50000 to 0 to
reduce memory pressure
WARN [ScheduledTasks:1] 2012-09-19 18:51:29,500 AutoSavingCache.java
(line 187) Reducing Organization RowCache capacity from 1000 to 0 to
reduce memory pressure
.....
INFO [main] 2012-09-19 18:57:14,181 StorageService.java (line 668)
JOINING: waiting for schema information to complete
ERROR [Thread-28] 2012-09-19 18:57:14,198 AbstractCassandraDaemon.java
(line 139) Fatal exception in thread Thread[Thread-28,5,main]
java.lang.OutOfMemoryError: Java heap space
at
org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:140)
at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:115)
...
ERROR [ReadStage:353] 2012-09-19 18:57:20,453
AbstractCassandraDaemon.java (line 139) Fatal exception in thread
Thread[ReadStage:353,5,main]
java.lang.OutOfMemoryError: Java heap space
at
org.apache.cassandra.service.MigrationManager.makeColumns(MigrationManager.java:256)
at
org.apache.cassandra.db.DefinitionsUpdateVerbHandler.doVerb(DefinitionsUpdateVerbHandler.java:51)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
Originally the max heap size was set to 6G. We then increased the heap size
limit to 8G and it works, but the warnings are still present:
WARN [ScheduledTasks:1] 2012-09-20 11:39:11,373 GCInspector.java (line
145) Heap is 0.7760745735786222 full. You may need to reduce memtable
and/or cache sizes. Cassandra will now flush up to the two largest
memtables to free up memory. Adjust flush_largest_memtables_at
threshold in cassandra.yaml if you don't want Cassandra to do this
automatically
INFO [ScheduledTasks:1] 2012-09-20 11:39:11,374 StorageService.java
(line 2658) Unable to reduce heap usage since there are no dirty column
families
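To see how heap usage develops during a join, the GCInspector warnings
can be pulled out of the log. This is a rough Python sketch that assumes
each warning sits on a single line in the actual log file (the excerpts
above are wrapped by the mail client) and follows the wording shown
above. The name heap_fullness is hypothetical.

```python
import re

# Matches e.g.
# "... 2012-09-19 18:51:22,497 GCInspector.java (line 145) Heap is 0.77 full. ..."
# Group 1 is the timestamp without milliseconds, group 2 the heap fraction.
HEAP_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} GCInspector\.java"
    r".*?Heap is ([0-9]+(?:\.[0-9]+)?) full"
)


def heap_fullness(lines):
    """Yield (timestamp, fraction_full) for each GCInspector heap warning."""
    for line in lines:
        match = HEAP_RE.search(line)
        if match:
            yield match.group(1), float(match.group(2))
```

Iterating over heap_fullness(open("/var/log/cassandra/system.log"))
gives one pair per warning, which makes it easy to see whether the heap
fraction keeps climbing while the node waits for schema information.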
It is probably a bug in applying migrations.
Could anyone explain why Cassandra behaves this way? Could you please
recommend something to help us cope with this situation?
Thank you in advance.
--
W/ best regards,
Sergey B.