[ https://issues.apache.org/jira/browse/CASSANDRA-9754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15538947#comment-15538947 ]
Michael Kjellman commented on CASSANDRA-9754:
---------------------------------------------

Wanted to post a quick update on the ticket. I've been working pretty much around the clock for the last two weeks on stabilizing, performance testing, validating, and bug fixing the code. I had an unfortunate and unexpected death in my family last week, so I lost the better part of this past week tying up the last pieces I was finishing before I got the bad news.

After attempting to work with a few people in the community to get cassandra-stress working in a way that actually stresses large partitions and validates the data written into them, I ended up needing to write my own stress tool. I loaded up a few hundred 30GB+ partitions with column sizes of 300-2048 bytes while constantly reading back data that was sampled during the inserts, to make sure I'm not returning bad data or incorrect results. I ran the most recent load for ~2 days in a small performance cluster and there were no validation errors. Additionally, I'm running the exact same stress/perf load in another identical cluster with a 2.1 build that does *not* contain Birch. This allows me to make objective A/B comparisons between the two builds.

The build is stable: there are no exceptions or errors in the logs even under pretty high load (the instances are doing 3x the load we generally run at in production), and most importantly GC is *very* stable. Without Birch, GC starts off great, but around the time the large partitions generated by the stress tool reached ~250MB, GC pauses shot up and then kept increasing as the rows grew (as expected). In contrast, the cluster with the Birch build showed no change in GC as the size of the partitions increased.
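The sample-and-validate idea described above can be sketched roughly like this (a minimal, hypothetical illustration, not the actual stress tool: a `HashMap` stands in for the cluster, and reservoir sampling keeps a bounded set of written key/value pairs to re-read and check):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch: while writing, keep a uniform random sample of
// (key, expectedValue) pairs, then re-read the sampled keys and count
// mismatches. In the real tool the "store" would be a Cassandra cluster.
public class SampleValidateStress {
    static final int SAMPLE_SIZE = 100;

    // Write numWrites rows and reservoir-sample them (Algorithm R),
    // so the validation set covers the whole insert stream uniformly.
    static List<long[]> writeAndSample(Map<Long, Long> store, int numWrites) {
        List<long[]> reservoir = new ArrayList<>();
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < numWrites; i++) {
            long key = i;
            long value = rnd.nextLong();
            store.put(key, value);
            if (reservoir.size() < SAMPLE_SIZE) {
                reservoir.add(new long[] { key, value });
            } else {
                int j = rnd.nextInt(i + 1);
                if (j < SAMPLE_SIZE)
                    reservoir.set(j, new long[] { key, value });
            }
        }
        return reservoir;
    }

    // Re-read every sampled key; a healthy run returns zero mismatches.
    static int validate(Map<Long, Long> store, List<long[]> sample) {
        int errors = 0;
        for (long[] kv : sample) {
            Long read = store.get(kv[0]);
            if (read == null || read != kv[1])
                errors++;
        }
        return errors;
    }
}
```

Sampling during the inserts (rather than re-reading everything) keeps the validation read load constant even as the partitions grow into the tens of gigabytes.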
I was a bit disappointed with some of the read latencies I saw in the upper percentiles, so I identified what I'm almost positive was the cause and just finished refactoring the logic for serializing/deserializing the aligned segments and subsegments in PageAlignedWriter/PageAlignedReader. I'm cleaning up the commit now and will then get it into the perf cluster to start another load. If that looks good, I'm hoping to push all the stability and performance changes I've made up to my public GitHub branch, most likely Tuesday, as I'd like to let the performance load run for 2 days to build up partitions large enough to accurately stress and test things. :)

> Make index info heap friendly for large CQL partitions
> ------------------------------------------------------
>
>                 Key: CASSANDRA-9754
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9754
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: sankalp kohli
>            Assignee: Michael Kjellman
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: 9754_part1-v1.diff, 9754_part2-v1.diff
>
>
> Looking at a heap dump of a 2.0 cluster, I found that the majority of the objects are IndexInfo and its ByteBuffers. This is especially bad in endpoints with large CQL partitions. If a CQL partition is, say, 6.4GB, it will have 100K IndexInfo objects and 200K ByteBuffers. This creates a lot of churn for GC. Can this be improved by not creating so many objects?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
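As a back-of-the-envelope check of the object counts quoted in the ticket description: assuming the default column_index_size_in_kb of 64, one IndexInfo entry is created per 64 KB index chunk, and (in the 2.0-era layout) each IndexInfo holds two ByteBuffers for its clustering bounds. A quick sketch of that arithmetic:

```java
// Rough estimate of on-heap index objects for a large partition, assuming
// one IndexInfo per index chunk (default chunk size: 64 KB) and two
// ByteBuffers per IndexInfo. A 6.4GB partition thus yields ~100K
// IndexInfo objects and ~200K ByteBuffers, matching the ticket's numbers.
public class IndexInfoEstimate {
    static long indexInfoCount(long partitionBytes, long indexChunkKb) {
        return partitionBytes / (indexChunkKb * 1024L);
    }

    static long byteBufferCount(long partitionBytes, long indexChunkKb) {
        return 2L * indexInfoCount(partitionBytes, indexChunkKb);
    }
}
```

Since all of these objects must be materialized on heap to read any slice of the partition, the count (and hence GC pressure) grows linearly with partition size, which is exactly the behavior Birch is meant to eliminate.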