If you combine inserts for multiple partition keys in the same batch you negate most of the effect of token-aware routing. It's best to insert only rows with the same partition key in a single batch. You also need to set the partition key for routing for the batch.
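To make the "one partition key per batch" advice concrete, here is a minimal sketch in plain Java with no driver dependency. It only shows the grouping step. The `Row` class and the `groupByPartitionKey` helper are hypothetical names invented for illustration, and it assumes `id` is the partition key of the `teacher` table, which the thread does not confirm.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionBatches {

    /** Hypothetical row shape mirroring the teacher table; id is ASSUMED to be the partition key. */
    public static final class Row {
        public final int id;
        public final String lastname;
        public final String firstname;
        public final String city;

        public Row(int id, String lastname, String firstname, String city) {
            this.id = id;
            this.lastname = lastname;
            this.firstname = firstname;
            this.city = city;
        }
    }

    /** Groups rows so that every list contains rows of exactly one partition key. */
    public static Map<Integer, List<Row>> groupByPartitionKey(List<Row> rows) {
        // LinkedHashMap keeps the groups in first-seen order.
        Map<Integer, List<Row>> groups = new LinkedHashMap<>();
        for (Row row : rows) {
            groups.computeIfAbsent(row.id, key -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(1, "Entre Nous", "adsfasdfa1", "x"));
        rows.add(new Row(2, "Entre Nous", "adsfasdfa2", "x"));
        rows.add(new Row(1, "Entre Nous", "adsfasdfa3", "x"));

        for (Map.Entry<Integer, List<Row>> group : groupByPartitionKey(rows).entrySet()) {
            // With the real driver, this is where a fresh batch would be built
            // from the bound inserts of this one group and then executed.
            System.out.println("partition " + group.getKey() + ": "
                    + group.getValue().size() + " row(s)");
        }
    }
}
```

If `id` is the whole primary key, every row is its own partition, so each group holds a single row and a plain insert per row does the same job as a batch.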
Also, RF=2 is not recommended since it does not permit quorum operations if a replica node is down. RF=3 is generally more appropriate.

-- Jack Krupansky

On Sun, Dec 6, 2015 at 10:27 PM, xutom <xutom2...@126.com> wrote:

> Dear all,
> Thanks for your reply!
> Now I'm using Apache Cassandra 2.1.1 and my JDK is 1.7.0_79. My
> keyspace replication factor is 2, and I do enable "token aware". The GC
> configuration is the default, such as:
> # GC tuning options
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> I checked the GC log, gc.log.0.current, and found only one
> Full GC. The stop-the-world times are low.
> CMS-initial-mark: 0.2747280 secs
> CMS-remark: 0.3623090 secs
>
> The insert code in my test client is as follows:
> String content = RandomStringUtils.randomAlphabetic(120);
> cluster = Cluster
>         .builder()
>         .addContactPoint(this.seedIP)
>         .withCredentials("test", "test")
>         .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
>         .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
>         .build();
> session = cluster.connect("demo");
> ......
> PreparedStatement insertPreparedStatement = session.prepare(
>         " INSERT INTO teacher (id, lastname, firstname, city) " +
>         "VALUES (?, ?, ?, ?); ");
>
> BatchStatement batch = new BatchStatement();
> for (; i < max; i+=5) {
>     try {
>         batch.add(insertPreparedStatement.bind(i, "Entre Nous", "adsfasdfa1", content));
>         batch.add(insertPreparedStatement.bind(i+1, "Entre Nous", "adsfasdfa2", content));
>         batch.add(insertPreparedStatement.bind(i+2, "Entre Nous", "adsfasdfa3", content));
>         batch.add(insertPreparedStatement.bind(i+3, "Entre Nous", "adsfasdfa4", content));
>         batch.add(insertPreparedStatement.bind(i+4, "Entre Nous", "adsfasdfa5", content));
>
>         // System.out.println("the is is " + i);
>         session.execute(batch);
>         thisTimeCount += 5;
>     }
> }
>
> At 2015-12-07 00:40:06, "Graham Sanderson" <gra...@vast.com> wrote:
>
> What version of C* are you using; what JVM version - you showed a partial
> GC config but if that is still CMS (not G1) then you are going to have
> insane GC pauses...
>
> Depending on the C* version, are you using on-heap or off-heap memtables,
> and what type?
>
> Those are the sorts of issues related to fat nodes that I'd be worried
> about. We run very nicely at 20G total heap and 8G new; the rest of our
> 128G memory is disk cache/mmap and all of the off-heap stuff, so it
> doesn't go to waste.
>
> That said, I think Jack is probably on the right path with overloaded
> coordinators, though you'd still expect to see CPU usage unless your
> timeouts are too low for the load, in which case the coordinator would be
> getting no responses in time and quite possibly the other nodes are just
> dropping the mutations (since they don't get to them before they know the
> coordinator would have timed out). I forget the command to check dropped
> mutations off the top of my head, but you can see it in OpsCenter.
>
> If you have GC problems you would certainly expect to see GC CPU usage,
> but depending on how long you run your tests it might take you a little
> while to run thru 40G.
>
> I'm personally not a fan of >32G (ish) heaps as you can't do compressed
> oops and also it is unrealistic for CMS ... The word is that G1 is now
> working ok with C*, especially on newer C* and JDK versions, but that said
> it takes quite a lot of thru-put to require insane quantities of young
> gen... We are guessing that when we remove all our legacy thrift batch
> inserts we will need less - and as for 20G total we actually don't need
> that much (we dropped from 24 when we moved memtables off heap, and believe
> we can drop further)
>
> Sent from my iPhone
>
> On Dec 6, 2015, at 9:07 AM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
> What replication factor are you using? Even if your writes use CL.ONE,
> Cassandra will be attempting writes to the replica nodes in the background.
>
> Are your writes "token aware"? If not, the receiving node has the overhead
> of forwarding the request to the node that owns the token for the primary
> key.
>
> For the record, Cassandra is not designed and optimized for so-called "fat
> nodes". The design focus is "commodity hardware" and "distributed cluster"
> (typically a dozen or more nodes.)
>
> That said, it would be good if we had a rule of thumb for how many
> simultaneous requests a node can handle, both external requests and
> inter-node traffic. I think there is an open Jira to enforce a limit on
> inflight requests so that nodes don't get overloaded and start failing in
> the middle of writes as you seem to be seeing.
>
> -- Jack Krupansky
>
> On Sun, Dec 6, 2015 at 9:29 AM, jerry <xutom2...@126.com> wrote:
>
>> Dear All,
>>
>> I have a 4-node Cassandra cluster, and I want to know the highest
>> performance it can reach. I wrote a Java client to batch insert data
>> into all 4 Cassandra nodes. When I start fewer than 30 subthreads in my
>> client application to insert data into Cassandra, everything is ok, but
>> when I start more than 80 or 100 subthreads, there are too many timeout
>> exceptions (such as: Cassandra timeout during write query at consistency
>> ONE (1 replica were required but only 0 acknowledged the write)). And no
>> matter how many subthreads I use, or even if I start multiple clients
>> with multiple subthreads on different computers, the highest performance
>> I can get is about 60000 - 80000 TPS. By the way, each row I insert into
>> Cassandra is about 130 bytes.
>> My 4 Cassandra nodes are:
>> CPU: 4*15
>> Memory: 512G
>> Disk: flash card (only one disk but better than SSD)
>> My Cassandra configuration is:
>> MAX_HEAP_SIZE: 60G
>> NEW_HEAP_SIZE: 40G
>>
>> When I insert data into my Cassandra cluster, no node has reached a
>> bottleneck in CPU, memory, or disk. Each of these three hardware
>> resources is idle. So I think maybe there is something wrong with the
>> configuration of my Cassandra cluster. Can somebody please help me with
>> my Cassandra tuning? Thanks in advance!