Re: TWCS generates large numbers of sstables on only some nodes

2019-07-15 Thread Oleksandr Shulgin
On Mon, Jul 15, 2019 at 6:20 PM Carl Mueller
 wrote:

> Related to our overstreaming, we have a cluster of about 25 nodes, with
> most at about 1000 sstable files (Data + others).
>
> And about four that are at 20,000 - 30,000 sstable files (Data+Index+etc).
>
> We have vertically scaled the outlier machines and turned off compaction
> throttling, thinking compaction couldn't keep up. That stabilized the
> growth, but the sstable count is not going down.
>
> The TWCS code seems to be heavily biased toward "recent" sstables for
> compaction. We figured we'd boost the throughput/compactors, which would
> take care of the more recent ones, and the older ones would fall off. But
> the number of sstables has remained high on a daily basis on the couple of
> "bad nodes".
>
> Is this simply a lack of sufficient compaction throughput? Is there
> something in TWCS that would force more frequent flushing than normal?
>

What does nodetool compactionstats say about pending compaction tasks on
the affected nodes with the high number of files?
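
A rough way to check this (standard nodetool commands; the -H flag for
human-readable sizes may not exist on older versions):

  # pending compaction tasks and compactions currently running
  nodetool compactionstats -H
  # current throttle, in MB/s (0 means unthrottled)
  nodetool getcompactionthroughput

If the pending task count keeps climbing on the affected nodes while staying
near zero elsewhere, compaction really is falling behind on those nodes.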

Regards,
-- 
Alex


Re: Breaking up major compacted Sstable with TWCS

2019-07-15 Thread Jeff Jirsa
No 

Sent from my iPhone

> On Jul 15, 2019, at 9:14 AM, Carl Mueller 
>  wrote:
> 
> Does sstablesplit properly restore the time-bucketing of the data? It
> appears to be size-based only.
> 
>> On Fri, Jul 12, 2019 at 5:55 AM Rhys Campbell 
>>  wrote:
>> https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableSplit.html
>> 
>> Leon Zaruvinsky  schrieb am Fr., 12. Juli 2019, 
>> 00:06:
>>> Hi,
>>> 
>>> We are switching a table to run using TWCS. However, after running the 
>>> alter statement, we ran a major compaction without understanding the 
>>> implications.
>>> 
>>> Now, while new sstables are properly being created according to the time 
>>> window, there is a giant sstable sitting around waiting for expiration.
>>> 
>>> Is there a way we can break it up again?  Running the alter statement again 
>>> doesn’t seem to be touching it.
>>> 
>>> Thanks,
>>> Leon
>>> 


TWCS generates large numbers of sstables on only some nodes

2019-07-15 Thread Carl Mueller
Related to our overstreaming, we have a cluster of about 25 nodes, with
most at about 1000 sstable files (Data + others).

And about four that are at 20,000 - 30,000 sstable files (Data+Index+etc).

We have vertically scaled the outlier machines and turned off compaction
throttling, thinking compaction couldn't keep up. That stabilized the
growth, but the sstable count is not going down.

The TWCS code seems to be heavily biased toward "recent" sstables for
compaction. We figured we'd boost the throughput/compactors, which would
take care of the more recent ones, and the older ones would fall off. But
the number of sstables has remained high on a daily basis on the couple of
"bad nodes".

Is this simply a lack of sufficient compaction throughput? Is there
something in TWCS that would force more frequent flushing than normal?
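
For reference, the knobs described above are roughly the following (values
are illustrative, not recommendations):

  # lift the compaction throughput cap on a running node (0 = unthrottled)
  nodetool setcompactionthroughput 0

  # number of parallel compaction threads, set in cassandra.yaml
  # (a restart is needed for the yaml change to take effect)
  concurrent_compactors: 4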


Re: Breaking up major compacted Sstable with TWCS

2019-07-15 Thread Carl Mueller
Does sstablesplit properly restore the time-bucketing of the data? It
appears to be size-based only.
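
(For the record, sstablesplit is an offline, size-based tool; a typical
invocation, with the node stopped and an illustrative chunk size and data
path, looks something like this:)

  # run only while Cassandra is stopped; splits into ~100 MB chunks
  sstablesplit --no-snapshot -s 100 /var/lib/cassandra/data/ks/table-*/*-Data.db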

On Fri, Jul 12, 2019 at 5:55 AM Rhys Campbell
 wrote:

>
> https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTableSplit.html
>
> Leon Zaruvinsky  schrieb am Fr., 12. Juli 2019,
> 00:06:
>
>> Hi,
>>
>> We are switching a table to run using TWCS. However, after running the
>> alter statement, we ran a major compaction without understanding the
>> implications.
>>
>> Now, while new sstables are properly being created according to the time
>> window, there is a giant sstable sitting around waiting for expiration.
>>
>> Is there a way we can break it up again?  Running the alter statement
>> again doesn’t seem to be touching it.
>>
>> Thanks,
>> Leon
>>
>>
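
For context, the "alter statement" referred to in the quoted question,
switching a table to TWCS, is typically something along these lines (the
keyspace, table and window settings here are illustrative):

  ALTER TABLE ks.events
    WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                       'compaction_window_unit': 'DAYS',
                       'compaction_window_size': 1};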


Nodetool status is not showing all the nodes on some of the nodes in the cluster

2019-07-15 Thread Nandakishore Tokala
Hi All,

I have two questions.

First one:
My cluster is a 200-node cluster, and we recently joined 4 datacenters to an
existing 48-node cluster. On some of the nodes, nodetool status is reporting
197 or 198 nodes (which varies from 198 to 202). Has anyone faced the same
issue?
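
A quick way to cross-check membership from a few different nodes (standard
nodetool commands; output formats vary a little between versions):

  # cluster membership as this node sees it
  nodetool status
  # cluster name, snitch and schema versions across the cluster
  nodetool describecluster
  # raw gossip state, including nodes that are joining or leaving
  nodetool gossipinfo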

Second one:

What does the TOKENS field represent in the nodetool gossipinfo output?

/XX.87.XX.XX
  generation:1561848796
  heartbeat:1374022
  STATUS:21:NORMAL,-1118741779833099361
  LOAD:1374005:3.5356469372E10
  SCHEMA:1364914:62ecdb21-4490-3aab-91ff-d1cf4f716fed
  DC:8:XX-XX-DC
  RACK:10:XX-XX-RAC
  RELEASE_VERSION:4:3.11.0
  INTERNAL_IP:6:XX.XX.XX.169
  RPC_ADDRESS:3:XX.XX.XX.169
  NET_VERSION:1:11
  HOST_ID:2:1bd319cf-8493-43a6-a98a-d6e7e0d4968d
  RPC_READY:32:true
  TOKENS:20:
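
If the goal is to see the actual token values a node owns (the TOKENS value
is not printed in the gossipinfo output above), something like the following
should list them, assuming a 3.x nodetool:

  # tokens owned by the local node
  nodetool info -T
  # every token in the ring with the node that owns it
  nodetool ring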


Thanks

-- 
Thanks & Regards,
Nanda Kishore