Re: OOM recovering failed node with many CFs

Jonathan Ellis Thu, 26 May 2011 13:53:11 -0700

We've applied a fix to the 0.7 branch in
https://issues.apache.org/jira/browse/CASSANDRA-2714.  The patch
probably applies to 0.7.6 as well.


On Thu, May 26, 2011 at 11:36 AM, Flavio Baronti
<f.baro...@list-group.com> wrote:
> I tried the manual copy you suggested, but the SystemTable.checkHealth()
> function complains that it can't load the system files. The log follows; I
> will gather some more info and create a ticket as soon as possible.
>
>  INFO [main] 2011-05-26 18:25:36,147 AbstractCassandraDaemon.java Logging initialized
>  INFO [main] 2011-05-26 18:25:36,172 AbstractCassandraDaemon.java Heap size: 4277534720/4277534720
>  INFO [main] 2011-05-26 18:25:36,174 CLibrary.java JNA not found. Native methods will be disabled.
>  INFO [main] 2011-05-26 18:25:36,190 DatabaseDescriptor.java Loading settings from file:/C:/Cassandra/conf/hscassandra9170.yaml
>  INFO [main] 2011-05-26 18:25:36,344 DatabaseDescriptor.java DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
>  INFO [main] 2011-05-26 18:25:36,532 SSTableReader.java Opening G:\Cassandra\data\system\Schema-f-2746
>  INFO [main] 2011-05-26 18:25:36,577 SSTableReader.java Opening G:\Cassandra\data\system\Schema-f-2729
>  INFO [main] 2011-05-26 18:25:36,590 SSTableReader.java Opening G:\Cassandra\data\system\Schema-f-2745
>  INFO [main] 2011-05-26 18:25:36,599 SSTableReader.java Opening G:\Cassandra\data\system\Migrations-f-2167
>  INFO [main] 2011-05-26 18:25:36,600 SSTableReader.java Opening G:\Cassandra\data\system\Migrations-f-2131
>  INFO [main] 2011-05-26 18:25:36,602 SSTableReader.java Opening G:\Cassandra\data\system\Migrations-f-1041
>  INFO [main] 2011-05-26 18:25:36,603 SSTableReader.java Opening G:\Cassandra\data\system\Migrations-f-1695
> ERROR [main] 2011-05-26 18:25:36,634 AbstractCassandraDaemon.java Fatal exception during initialization
> org.apache.cassandra.config.ConfigurationException: Found system table files, but they couldn't be loaded. Did you change the partitioner?
>        at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:236)
>        at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:127)
>        at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:314)
>        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:79)
>
>
> On 5/26/2011 6:04 PM, Jonathan Ellis wrote:
>>
>> Sounds like a legitimate bug, although looking through the code I'm
>> not sure what would cause a tight retry loop on migration
>> announce/rectify. Can you create a ticket at
>> https://issues.apache.org/jira/browse/CASSANDRA ?
>>
>> As a workaround, I would try manually copying the Migrations and
>> Schema sstable files from the system keyspace of the live node, then
>> restart the recovering one.
>>
>> On Thu, May 26, 2011 at 9:27 AM, Flavio Baronti
>> <f.baro...@list-group.com>  wrote:
>>>
>>> I can't seem to recover a failed node on a database where I made many
>>> updates to the schema.
>>>
>>> I have a small cluster with 2 nodes, around 1000 CFs (I know it's a lot,
>>> but it can't be changed right now), and ReplicationFactor=2.
>>> I shut down a node and cleaned its data entirely, then tried to bring it
>>> back up. The node starts fetching schema updates from the live node, but
>>> the operation fails halfway with an OOME.
>>> After some investigation, what I found is that:
>>>
>>> - I have a lot of schema updates (there are 2067 rows in the
>>> system.Schema CF).
>>> - The live node loads migrations 1-1000 and sends them to the recovering
>>> node (Migration.getLocalMigrations()).
>>> - Soon afterwards, the live node checks the schema version on the
>>> recovering node and finds it has advanced only slightly; say it has
>>> applied the first 3 migrations. It then loads migrations 3-1003 and sends
>>> them to the node.
>>> - This process repeats very quickly (it sends migrations 6-1006, 9-1009,
>>> etc.).
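To get a rough sense of how much redundant work the resend pattern described above produces, here is a minimal sketch in Java. It is a toy model, not Cassandra code: the class name, the method name, and the assumption that the recovering node applies a fixed number of migrations between version checks are all hypothetical, chosen only to match the numbers quoted in the report (2067 rows, blocks of 1000, about 3 applied per check).

```java
public class MigrationResendSketch {
    /**
     * Toy model (hypothetical, not Cassandra's implementation): on every
     * version check the sender serializes up to `window` migrations starting
     * at the receiver's current position, and the receiver applies
     * `appliedPerCheck` migrations before the next check.
     * Returns the total number of migrations serialized over all checks.
     */
    static long totalMigrationsSent(int total, int window, int appliedPerCheck) {
        long sent = 0;
        for (int applied = 0; applied < total; applied += appliedPerCheck) {
            // Each check re-sends a whole block, most of which overlaps the previous one.
            sent += Math.min(window, total - applied);
        }
        return sent;
    }

    public static void main(String[] args) {
        long sent = totalMigrationsSent(2067, 1000, 3);
        System.out.println("migrations serialized: " + sent + " (only 2067 are distinct)");
    }
}
```

Under these assumptions the sender serializes far more migrations than actually exist, because consecutive blocks overlap almost entirely. The real ratio depends on timing that this model does not capture.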
>>>
>>> Analyzing the memory dump and the logs, it looks like each of these
>>> 1000-migration blocks is composed into a single message and sent to the
>>> OutboundTcpConnection queue. However, since the schema is big, the
>>> messages occupy a lot of space and are built faster than the connection
>>> can send them. Therefore, they accumulate in OutboundTcpConnection.queue
>>> until memory is completely filled.
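The accumulation described above is the usual behaviour of an unbounded queue whose producer outpaces its consumer. The following is a minimal sketch in Java, assuming a fixed message size and a consumer that sends one message for every `drainEvery` messages enqueued. The class, method, and parameter names are hypothetical and this is not the OutboundTcpConnection code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OutboundQueueSketch {
    /**
     * Toy model (hypothetical): the producer enqueues `messages` messages of
     * `messageBytes` bytes each; the consumer removes one message after every
     * `drainEvery` enqueues. Returns the peak number of bytes held in the queue.
     */
    static long peakQueuedBytes(int messages, int messageBytes, int drainEvery) {
        Deque<Integer> queue = new ArrayDeque<>(); // holds message sizes only
        long queued = 0;
        long peak = 0;
        for (int i = 1; i <= messages; i++) {
            queue.addLast(messageBytes);
            queued += messageBytes;
            peak = Math.max(peak, queued);
            if (i % drainEvery == 0) {
                queued -= queue.removeFirst(); // the connection finishes sending one message
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        // Consumer keeps up: the queue never holds more than one message.
        System.out.println(peakQueuedBytes(1000, 1_000_000, 1));
        // Producer is 10x faster: queued bytes grow with the number of messages.
        System.out.println(peakQueuedBytes(1000, 1_000_000, 10));
    }
}
```

In this model the peak grows roughly linearly with the number of messages once the producer is faster than the consumer, which is consistent with the heap filling up. A bounded queue or a check that skips a resend while one is already pending would cap it, though whether either matches the actual fix is not something this sketch claims.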
>>>
>>> Any suggestions? Can I change something to make this work, apart from
>>> reducing the number of CFs?
>>>
>>> Flavio
>>>
>>
>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
