Re: TWCS generates large numbers of sstables on only some nodes
On Mon, Jul 15, 2019 at 6:20 PM Carl Mueller wrote:

> Related to our overstreaming, we have a cluster of about 25 nodes, with
> most at about 1000 sstable files (Data + others).
>
> And about four that are at 20,000 - 30,000 sstable files (Data+Index+etc).
>
> We have vertically scaled the outlier machines and turned off compaction
> throttling, thinking it was compaction that couldn't keep up. That
> stabilized the growth, but the sstable count is not going down.
>
> The TWCS code seems to be heavily biased towards "recent" tables for
> compaction. We figured we'd boost the throughput/compactors and that would
> solve the more recent ones, and the older ones would fall off. But the
> number of sstables has remained high on a daily basis on the couple of
> "bad nodes".
>
> Is this simply a lack of sufficient compaction throughput? Is there
> something in TWCS that would force more frequent flushing than normal?

What does nodetool compactionstats say about pending compaction tasks on the
affected nodes with the high number of files?

Regards,

--
Alex
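To compare the backlog across many nodes, a quick approach is to scrape the "pending tasks" line from each node's `nodetool compactionstats` output. A minimal Python sketch; the sample output below is illustrative, not captured from the cluster in question:

```python
import re

def pending_tasks(compactionstats_output: str) -> int:
    """Extract the pending-task count from `nodetool compactionstats` output."""
    m = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    if m is None:
        raise ValueError("no 'pending tasks' line found")
    return int(m.group(1))

# Hypothetical output from one of the "bad" nodes:
sample = """\
pending tasks: 142
   compaction type   keyspace   table   completed      total   unit   progress
        Compaction        ks1      t1    12345678   98765432  bytes     12.50%
"""
print(pending_tasks(sample))
```

A persistently large number here, compared to the healthy nodes, would point at compaction genuinely falling behind rather than TWCS flushing more than normal.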
Re: Breaking up major compacted Sstable with TWCS
No

Sent from my iPhone

> On Jul 15, 2019, at 9:14 AM, Carl Mueller wrote:
>
> Does sstablesplit properly restore the time buckets of the data? That
> appears to be size-based only.
>
>> On Fri, Jul 12, 2019 at 5:55 AM Rhys Campbell wrote:
>>
>> https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableSplit.html
>>
>> Leon Zaruvinsky wrote on Fri., Jul 12, 2019, 00:06:
>>
>>> Hi,
>>>
>>> We are switching a table to run using TWCS. However, after running the
>>> alter statement, we ran a major compaction without understanding the
>>> implications.
>>>
>>> Now, while new sstables are properly being created according to the time
>>> window, there is a giant sstable sitting around waiting for expiration.
>>>
>>> Is there a way we can break it up again? Running the alter statement again
>>> doesn't seem to be touching it.
>>>
>>> Thanks,
>>> Leon
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
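As context for why a size-based split cannot restore the buckets: TWCS places an sstable in the time window containing its maximum data timestamp, so any split piece that still holds recent cells lands in the newest window regardless of its size. A simplified sketch of that bucketing rule (the one-day window size and the cell timestamps are made up for illustration):

```python
# Simplified sketch of TWCS time-window bucketing. Window size is a
# hypothetical 1 day; real clusters configure compaction_window_size/unit.
WINDOW_MS = 24 * 60 * 60 * 1000

def window_lower_bound(max_ts_ms: int) -> int:
    # TWCS buckets an sstable by the window containing its *maximum* timestamp.
    return (max_ts_ms // WINDOW_MS) * WINDOW_MS

day = WINDOW_MS
old, recent = 3 * day + 42, 10 * day + 42   # hypothetical cell timestamps
# A major compaction merges old and recent data into one sstable; its max
# timestamp is the recent one, so the whole file buckets into one new window.
merged_max = max(old, recent)
# A size-based split leaves timestamps untouched, so a piece still containing
# recent cells buckets into that same window rather than its "old" window.
print(window_lower_bound(merged_max) == window_lower_bound(recent))  # True
```

This is consistent with the "No" above: sstablesplit divides by size only, and the pieces do not fall back into their original time windows.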
TWCS generates large numbers of sstables on only some nodes
Related to our overstreaming, we have a cluster of about 25 nodes, with most at about 1000 sstable files (Data + others). And about four that are at 20,000 - 30,000 sstable files (Data+Index+etc).

We have vertically scaled the outlier machines and turned off compaction throttling, thinking it was compaction that couldn't keep up. That stabilized the growth, but the sstable count is not going down.

The TWCS code seems to be heavily biased towards "recent" tables for compaction. We figured we'd boost the throughput/compactors and that would solve the more recent ones, and the older ones would fall off. But the number of sstables has remained high on a daily basis on the couple of "bad nodes".

Is this simply a lack of sufficient compaction throughput? Is there something in TWCS that would force more frequent flushing than normal?
Re: Breaking up major compacted Sstable with TWCS
Does sstablesplit properly restore the time buckets of the data? That appears to be size-based only.

On Fri, Jul 12, 2019 at 5:55 AM Rhys Campbell wrote:

> https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableSplit.html
>
> Leon Zaruvinsky wrote on Fri., Jul 12, 2019, 00:06:
>
>> Hi,
>>
>> We are switching a table to run using TWCS. However, after running the
>> alter statement, we ran a major compaction without understanding the
>> implications.
>>
>> Now, while new sstables are properly being created according to the time
>> window, there is a giant sstable sitting around waiting for expiration.
>>
>> Is there a way we can break it up again? Running the alter statement
>> again doesn't seem to be touching it.
>>
>> Thanks,
>> Leon
Nodetool status is not showing all the nodes on some of the nodes in the cluster
Hi All,

I have two questions.

First one: my cluster is a 200-node cluster, and we recently joined 4 datacenters to the existing 48-node cluster. On some of the nodes, nodetool status is reporting 197 or 198 nodes (the count varies from 198 to 202). Has anyone faced the same issue?

Second one: what do the Tokens represent in the nodetool gossipinfo output?

/XX.87.XX.XX
  generation:1561848796
  heartbeat:1374022
  STATUS:21:NORMAL,-1118741779833099361
  LOAD:1374005:3.5356469372E10
  SCHEMA:1364914:62ecdb21-4490-3aab-91ff-d1cf4f716fed
  DC:8:XX-XX-DC
  RACK:10:XX-XX-RAC
  RELEASE_VERSION:4:3.11.0
  INTERNAL_IP:6:XX.XX.XX.169
  RPC_ADDRESS:3:XX.XX.XX.169
  NET_VERSION:1:11
  HOST_ID:2:1bd319cf-8493-43a6-a98a-d6e7e0d4968d
  RPC_READY:32:true
  TOKENS:20:

Thanks

--
Thanks & Regards,
Nanda Kishore
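On the gossipinfo format: each application-state line is NAME:version:value, where the middle number (e.g. the 20 in TOKENS:20:) is a per-state gossip version counter, not part of the value. TOKENS itself is the set of ring positions the node owns (with vnodes, num_tokens of them); nodetool usually does not print the token values. A small parsing sketch under that assumption; the sample addresses below are made up:

```python
def parse_gossipinfo(block: str) -> dict:
    """Parse one node's section of `nodetool gossipinfo` into a dict.

    Assumed line shapes: a leading /address header, NAME:value pairs
    (e.g. generation), and NAME:version:value application states, where
    the version is a gossip counter we discard."""
    states = {}
    for line in block.strip().splitlines():
        line = line.strip()
        if line.startswith("/"):           # node address header
            states["address"] = line
            continue
        name, _, rest = line.partition(":")
        if ":" in rest:                    # NAME:version:value
            _version, _, value = rest.partition(":")
            states[name] = value
        else:                              # NAME:value
            states[name] = rest
    return states

# Hypothetical sample section:
sample = """\
/10.0.0.169
  generation:1561848796
  STATUS:21:NORMAL,-1118741779833099361
  RELEASE_VERSION:4:3.11.0
"""
info = parse_gossipinfo(sample)
print(info["STATUS"])   # NORMAL plus one of the node's tokens
```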