What a coincidence! The same thing happened today in my 7-node cluster as well.

Regards,
Pavel

On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle <marc...@s1mbi0se.com.br> wrote:
> I have a 10-node cluster running Cassandra 2.0.8.
>
> I am getting these messages in the log when I run my code. All my code
> does is read data from a CF and, in some cases, write new data.
>
> WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391
> BatchStatement.java (line 228) Batch of prepared statements for
> [identification1.entity, identification1.entity_lookup] is of size 6165,
> exceeding specified threshold of 5120 by 1045.
> WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152
> BatchStatement.java (line 228) Batch of prepared statements for
> [identification1.entity, identification1.entity_lookup] is of size 21266,
> exceeding specified threshold of 5120 by 16146.
> WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229
> BatchStatement.java (line 228) Batch of prepared statements for
> [identification1.entity, identification1.entity_lookup] is of size 22978,
> exceeding specified threshold of 5120 by 17858.
> INFO [MemoryMeter:1] 2014-06-18 11:05:32,682 Memtable.java (line 481)
> CFS(Keyspace='OpsCenter', ColumnFamily='rollups300') liveRatio is
> 14.249755859375 (just-counted was 9.85302734375). calculation took 3ms for
> 1024 cells
>
> After some time, one node of the cluster goes down. It comes back after a
> few seconds, and then another node goes down. This keeps happening, so
> there is always one node down in the cluster: when one comes back, another
> one falls.
>
> The only exception I see in the log is "connected reset by the peer",
> which seems to be related to the gossip protocol, when a node goes down.
>
> Any hints on what I could do to investigate this problem further?
>
> Best regards,
> Marcelo Valle.
>
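
To see at a glance how far the batches in the quoted log overshoot the warning threshold, the WARN lines can be summarized with a short script. This is only a sketch in Python: the function name parse_batch_warnings and the regular expression are assumptions modelled on the three warnings quoted above, and they may need adjusting for other Cassandra versions or log layouts. It does not show that the large batches are what makes the nodes go down; it only makes the sizes easier to compare.

```python
import re

# Assumed pattern, modelled on the BatchStatement warnings quoted above.
BATCH_WARN = re.compile(
    r"Batch of prepared statements for \[(?P<tables>[^\]]+)\] "
    r"is of size (?P<size>\d+), "
    r"exceeding specified threshold of (?P<threshold>\d+) by (?P<excess>\d+)"
)


def parse_batch_warnings(log_text):
    """Return one dict per batch-size warning found in log_text."""
    # Collapse all whitespace so warnings that were wrapped across
    # several lines (as in an email) still match the pattern.
    flat = " ".join(log_text.split())
    rows = []
    for match in BATCH_WARN.finditer(flat):
        size = int(match.group("size"))
        threshold = int(match.group("threshold"))
        rows.append({
            "tables": [t.strip() for t in match.group("tables").split(",")],
            "size": size,
            "threshold": threshold,
            # Recomputed rather than trusted, as a consistency check.
            "excess": size - threshold,
            "ratio": size / threshold,
        })
    return rows


# The three warnings from the quoted log, without the "> " quote markers.
SAMPLE = """
WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391
BatchStatement.java (line 228) Batch of prepared statements for
[identification1.entity, identification1.entity_lookup] is of size 6165,
exceeding specified threshold of 5120 by 1045.
WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152
BatchStatement.java (line 228) Batch of prepared statements for
[identification1.entity, identification1.entity_lookup] is of size 21266,
exceeding specified threshold of 5120 by 16146.
WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229
BatchStatement.java (line 228) Batch of prepared statements for
[identification1.entity, identification1.entity_lookup] is of size 22978,
exceeding specified threshold of 5120 by 17858.
"""

for row in parse_batch_warnings(SAMPLE):
    print(row["size"], row["excess"], round(row["ratio"], 2))
```

On the quoted log, the first batch comes out at roughly 1.2 times the threshold and the other two at more than 4 times, which at least suggests the batch sizes are growing rather than hovering just above the limit.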