What client are you using? The official Java and Python clients should not have a load balancer between them and the C* nodes, AFAIK — the driver does its own node discovery and request routing.
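In case it helps, here is a minimal sketch of pointing the DataStax Java driver (2.1.x) directly at the cluster and letting the driver's own load-balancing policy distribute requests, instead of going through HAProxy. The contact-point IPs and class name are placeholders, not from your setup:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class DirectConnect {
    public static void main(String[] args) {
        // Contact points are only used for the initial handshake; the driver
        // then discovers the rest of the ring itself, so no LB is needed.
        Cluster cluster = Cluster.builder()
                .addContactPoints("172.31.22.4", "172.31.22.5") // placeholder IPs
                .withLoadBalancingPolicy(
                        // Token-aware routing sends each write straight to a
                        // replica that owns the partition, on top of DC-aware
                        // round-robin across the local datacenter.
                        new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
                .build();
        Session session = cluster.connect();
        // ... run your queries at LOCAL_QUORUM as before ...
        cluster.close();
    }
}
```

This is connection configuration only (it needs a live cluster to actually run), but it shows the usual alternative to fronting C* with HAProxy.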
Why aren't you using 2.1.9? Have you checked for schema agreement amongst all nodes?

ml

On Wed, Sep 30, 2015 at 11:22 AM, Walsh, Stephen <stephen.wa...@aspect.com> wrote:
> More information,
>
> I’ve just set up an NTP server to rule out any timing issues.
> I also see this in the Cassandra node log files:
>
> [MessagingService-Incoming-/172.31.22.4] 2015-09-30 15:19:14,769
> IncomingTcpConnection.java:97 - UnknownColumnFamilyException reading from
> socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find
> cfId=cf411b50-6785-11e5-a435-e7be20c92086
>
> Any idea what this is related to?
> All these tests are run with a clean setup of Cassandra nodes followed by
> a nodetool repair, before any data hits them.
>
> *From:* Walsh, Stephen [mailto:stephen.wa...@aspect.com]
> *Sent:* 30 September 2015 15:17
> *To:* user@cassandra.apache.org
> *Subject:* Consistency Issues
>
> Hi there,
>
> We are having some issues with consistency. I’ll try my best to explain.
>
> We have an application that was able to:
> Write: ~1000 p/s
> Read: ~300 p/s
> Total CFs created: 400
> Total keyspaces created: 80
>
> On a 4-node Cassandra cluster with:
> Version: 2.1.6
> Replication factor: 3
> Consistency (read & write): LOCAL_QUORUM
> Cores: 4
> RAM: 15 GB
> Heap size: 8 GB
>
> This was fine and worked, but was pushing our application to the max.
>
> ---------------------
>
> Next we added a load balancer (HAProxy) to our application.
> So now we have 3 of our nodes talking to 4 Cassandra nodes with a load of:
> Write: ~1250 p/s
> Read: 0 p/s
> Total CFs created: 450
> Total keyspaces created: 100
>
> On our application we now see:
> Cassandra timeout during write query at consistency LOCAL_QUORUM (2
> replica were required but only 1 acknowledged the write)
> (we are using Java Cassandra driver 2.1.6)
>
> So we increased the number of Cassandra nodes to 5, then 6, and each time
> got the same replication error.
>
> Then we doubled the spec of every node to:
> Cores: 8
> RAM: 30 GB
> Heap size: 15 GB
>
> And we still get this replication error (2 replica were required but only
> 1 acknowledged the write).
>
> We know that when we introduce the HAProxy load balancer with 3 of our
> nodes, it hits Cassandra 3 times quicker. But we’ve now increased the
> Cassandra spec nearly 3-fold, and only for an extra 250 writes p/s, and it
> still doesn’t work.
>
> We’re having a hard time finding out why replication is an issue with a
> cluster of this size.
>
> We tried to get OpsCenter working to monitor the nodes, but due to the
> number of CFs in Cassandra the datastax-agent takes 90% of the CPU on
> every node.
>
> Any suggestion / recommendation would be very welcome.
>
> Regards,
> Stephen Walsh
>
> This email (including any attachments) is proprietary to Aspect Software,
> Inc. and may contain information that is confidential. If you have received
> this message in error, please do not read, copy or forward this message.
> Please notify the sender immediately, delete it from your system and
> destroy any copies. You may not further disclose or distribute this email
> or its attachments.
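On the schema-agreement question above: a quick way to check, assuming shell access to any node, is nodetool (this obviously needs a running node, so it's a CLI sketch rather than something runnable here):

```shell
# All nodes should be listed under a single schema version;
# more than one version shown here means schema disagreement,
# which would fit the UnknownColumnFamilyException / cfId errors.
nodetool describecluster
```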