Re: Cassandra JVM configuration

2019-09-06 Thread John Sumsion
Pat,

You might find this post useful in tuning G1 for Cassandra:

  * http://deliberate-thinking.blogspot.com/2019/05/tuning-g1-gc-for-cassandra.html

This assumes a machine with 60-120G of RAM -- and your use case may be 
different from the clusters I've tuned, so take each step with care.

Also, I have no experience tuning G1 above a 31G heap -- the JVM loses 
compressed object pointers and falls back to 8-byte references once the heap 
exceeds ~32G.  You can get a lot of read throughput with a 24G heap and the 
new generation fixed at 8-10G, as long as your records aren't all huge blobs.
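
A jvm.options-style fragment along those lines might look like the sketch 
below.  This is not a drop-in config: the pause target and fixed new-gen size 
are illustrative values, and pinning the young generation with -Xmn disables 
G1's adaptive young-gen sizing, so test against your own workload.

```
-Xms24G
-Xmx24G
-Xmn8G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=500
```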

John...

From: p...@xvalheru.org 
Sent: Friday, September 6, 2019 3:00 AM
To: user@cassandra.apache.org 
Cc: Jeff Jirsa 
Subject: Re: Cassandra JVM configuration

- reads => as many as possible - a huge stream of requests
- data => 186GB on each node
- the reads are unpredictable
- the cluster holds about 6 billion records

I'll try changing the garbage collector.

Thanks

Pat


On 2019-09-05 16:38, Jeff Jirsa wrote:
> Lot of variables
>
> - how many reads per second per machine?
> - how much data per machine?
> - are the reads random or is there a hot working set?
>
> Some of the suggestions online are old.
> CASSANDRA-8150 has some old’ish suggestions if you’re running the CMS
> collector. Anyone running a > 16G heap should consider G1GC, and that’s
> tuned quite differently. Amy Tobey has a decent (2.1-era) tuning
> guide; I imagine The Last Pickle has one as well (and I wouldn’t be
> surprised if Pythian and Instaclustr do too).
>
>
>
>> On Sep 5, 2019, at 7:04 AM, p...@xvalheru.org wrote:
>>
>> Hi,
>>
>> sorry to bring up such a question, but I want to ask: what are the best JVM
>> options for a Cassandra node? In the solution I'm implementing, Cassandra
>> serves as read-only storage (populated at the beginning, of course) - the
>> records don't change over time. Currently each Cassandra node's VM
>> has 16 CPUs and 64GB of RAM. I've set these JVM
>> options: -Xms4G and -Xmx40G; JDK 1.8.0_221. My question is: what's the
>> best memory size for a Cassandra node, and is there any relation
>> between the number of CPUs and the memory size? When I searched for an
>> answer I found a suggested node size of 8GB of RAM, but I
>> have doubts.
>>
>> Thanks
>>
>> Pat
>>
>> 
>>
>>
>>
>
>
>
> 




-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: gossipinfo contains two nodes dead for more than two years

2019-08-28 Thread John Sumsion
I've seen something similar when a node's cassandra.yaml still lists that IP 
as a seed.  You might want to check that.
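
A quick way to check, sketched in Python -- the yaml text here is a made-up 
stand-in; in practice you'd read each node's real cassandra.yaml (the path 
varies by install, commonly /etc/cassandra/cassandra.yaml):

```python
# Sketch: check whether a dead node's IP (10.15.53.27 from this thread)
# is still listed in the seed_provider section of cassandra.yaml.
# The fragment below is illustrative, not a real config file.
yaml_text = """\
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.15.53.27,10.20.30.40"
"""

dead_ip = "10.15.53.27"
if dead_ip in yaml_text:
    print(f"{dead_ip} is still listed as a seed -- remove it and restart")
```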

From: Vincent Rischmann 
Sent: Wednesday, August 28, 2019 10:10 AM
To: user@cassandra.apache.org 
Subject: Re: gossipinfo contains two nodes dead for more than two years

Yep, they're not visible in either ring or status.

On Wed, Aug 28, 2019, at 17:08, Jeff Jirsa wrote:
Based on what you've posted, I assume the instances are not visible in 
`nodetool ring` or `nodetool status`, and the only reason you know they're 
still in gossipinfo is you see them in the logs? If that's the case, then yes, 
I would do `nodetool assassinate`.



On Wed, Aug 28, 2019 at 7:33 AM Vincent Rischmann <vinc...@rischmann.fr> wrote:

Hi,

while replacing a node in a cluster I saw this log:

2019-08-27 16:35:31,439 Gossiper.java:995 - InetAddress /10.15.53.27 is now DOWN

it caught my attention because that ip address doesn't exist anymore in the 
cluster and it hasn't for a long time.

After some reading I ran `nodetool gossipinfo` and saw these entries for 
nodes that don't exist anymore:


/10.15.53.27
  generation:1503480618
  heartbeat:26970
  STATUS:2:hibernate,true
  LOAD:26810:6.17363354147E11
  SCHEMA:101:d21b1e47-f226-3417-8de7-5802518ae824
  DC:10:DC1
  RACK:12:RAC1
  RELEASE_VERSION:6:2.1.18
  INTERNAL_IP:8:10.15.53.27
  RPC_ADDRESS:5:10.15.53.27
  SEVERITY:26972:0.0
  NET_VERSION:3:8
  HOST_ID:4:2488fccc-108a-4a9d-ad43-5e8b8b6ee17b
  TOKENS:1:

/10.5.1.16
  generation:1503636779
  heartbeat:324
  STATUS:2:hibernate,true
  LOAD:204:2.601990697532E12
  SCHEMA:14:d21b1e47-f226-3417-8de7-5802518ae824
  DC:10:DC1
  RACK:12:RAC1
  RELEASE_VERSION:6:2.1.18
  INTERNAL_IP:8:10.5.1.16
  RPC_ADDRESS:5:10.5.1.16
  SEVERITY:326:0.0
  NET_VERSION:3:8
  HOST_ID:4:2488fccc-108a-4a9d-ad43-5e8b8b6ee17b
  TOKENS:1:

the generations are:

- Wed, 23 Aug 2017 09:30:18 GMT
- Fri, 25 Aug 2017 04:52:59 GMT
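
(Those dates fall straight out of the generation values, which are Unix epoch 
seconds recorded when the node last started -- a quick Python check:)

```python
from datetime import datetime, timezone

# The gossip "generation" field is the Unix epoch second at which the
# endpoint last started, so the two entries above decode directly to dates.
for gen in (1503480618, 1503636779):
    print(datetime.fromtimestamp(gen, tz=timezone.utc)
          .strftime("%a, %d %b %Y %H:%M:%S GMT"))
```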

I don't remember what we did at the time, but it looks like we botched 
something while joining a node.

After reading https://thelastpickle.com/blog/2018/09/18/assassinate.html 
I'm thinking of doing the following:

* nodetool removenode 2488fccc-108a-4a9d-ad43-5e8b8b6ee17b (removenode takes the host ID from gossipinfo, not the IP)
* if it doesn't work for some reason: nodetool assassinate 10.15.53.27

Since those nodes have been long dead and don't appear in system.peers, I 
don't anticipate any problems, but I'd like some confirmation that this can't 
break my cluster.

Thanks !


Tool to decide which node to decommission (using vnodes)

2018-12-06 Thread John Sumsion
Here is a tool I worked on to figure out which node to decommission so that 
you're left with the most even token balance afterwards.

https://github.com/jdsumsion/vnode-decommission-calculator

Feel free to use or enhance as you desire.


John...


JMX for row cache churn

2018-08-20 Thread John Sumsion
Is there a JMX property somewhere that I could monitor to see how old the 
oldest row cache item is?


I want to see how much churn there is.


Thanks in advance,

John...


Possible to adjust tokens on a vnode cluster?

2016-01-19 Thread John Sumsion
I have a 24 node cluster, with vnodes set to 256.


'nodetool status ' looks like this for our keyspace:


UN  588.23 GB  256  11.0%  0c8708a7-b962-4fc9-996c-617da642d9ee  1a
UN  601.33 GB  256  11.3%  5ef60730-0b01-4a8b-a578-d828cdf78a1f  1b
UN  613.02 GB  256  11.5%  dddc78b1-7dc2-4e9f-8e8a-1b52595aa0e3  1a
UN  620.76 GB  256  11.7%  87ac93ff-dc8e-4cd5-842c-0389ce016d70  1b
UN  631.81 GB  256  11.9%  8e1416aa-3e75-4ab5-a2a6-49d26f514115  1d
UN  634.65 GB  256  11.9%  3c97f722-16f5-455c-8f58-71c07ad93d25  1b
UN  634.79 GB  256  11.9%  3e3d41bd-d6e8-4a7e-aee2-7ea16b1dadb9  1d
UN  637.05 GB  256  12.0%  2f26f19a-c88f-4cbe-b865-155c0b66bff0  1b
UN  637.83 GB  256  12.0%  6385e073-5b48-49b3-a85b-e7511fa8b3a0  1a
UN  638.05 GB  256  12.1%  382681e5-c060-4594-ae2a-062a324c12d4  1d
UN  660.22 GB  256  12.4%  ea6aad23-7d93-4989-8898-7505df51298f  1d
UN  674.98 GB  256  12.6%  7d372371-c23f-4235-9e3c-cf030fb52ab3  1a
UN  676.22 GB  256  12.7%  41c4cb98-91ae-43a6-9bc4-11aa6106faad  1d
UN  680.15 GB  256  12.7%  65ac3aef-8a9b-423d-83fb-ed8e41f88ccc  1a
UN  681.35 GB  256  12.8%  e38efc6a-e7eb-4d8e-9069-a0b099bea96e  1d
UN  693.19 GB  256  13.0%  2b9a5d3e-8529-47fe-8d2c-13553a8df91f  1b
UN  696.92 GB  256  13.0%  46382cd1-402c-4200-858c-100dade03fc5  1d
UN  698.17 GB  256  13.1%  a68107e7-8e1a-469e-8dd1-e2d87445fd47  1b
UN  698.92 GB  256  13.1%  662338a7-1f5c-4eaa-926e-9e9fda926504  1a
UN  699.26 GB  256  13.1%  e7c15c56-80e6-4961-9cd9-c1302fbf2026  1a
UN  702.98 GB  256  13.2%  461baba0-60f3-423a-a5cf-e0c482da2dbf  1b
UN  710.27 GB  256  13.3%  ffa9700d-50ef-4b23-92d9-18f8029c8cd6  1d
UN  740.63 GB  256  13.8%  d9c6e2a1-2193-4f32-8426-3bd7ad8bf679  1a
UN  744.12 GB  256  13.9%  ff841094-7624-4dc5-b480-f39138b7f17c  1b


First, the difference in disk usage between 588G (lowest) and 744G (highest) 
is significant - 156G.  It's probably a weird pattern in our partition keys, 
but we can't verify that until we get the data loaded.


Maybe someone will advise against using vnodes altogether, but we need to be 
able to add 3 nodes for extra capacity, and we'd rather not rewrite the vnode 
token assignment code in order to figure out a rack-safe token reassignment.


Given the above, is there any way to manually adjust tokens (while still using 
vnodes) so that we can balance the disk usage out?  If so, is there an easy way 
to do that in a rack-safe manner?
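
For what it's worth, the ownership math any manual rebalancing would have to 
preserve is simple; here is a rough Python sketch (the tokens and node names 
are made up, not taken from the cluster above) of how per-node ownership 
falls out of a token-to-node assignment, the way `nodetool status` computes 
the "Owns" column:

```python
from collections import defaultdict

RING = 2 ** 64  # size of the Murmur3Partitioner token space

def ownership(token_to_node):
    """Fraction of the ring each node owns: every token owns the range
    from the previous token (wrapping around) up to itself."""
    tokens = sorted(token_to_node)
    owned = defaultdict(int)
    prev = tokens[-1] - RING  # wrap the last token around the ring
    for tok in tokens:
        owned[token_to_node[tok]] += tok - prev
        prev = tok
    return {node: size / RING for node, size in owned.items()}

# Hypothetical 2-node, 2-vnodes-each assignment for illustration:
shares = ownership({-6 * 10**18: "a", -1 * 10**18: "b",
                    3 * 10**18: "a", 8 * 10**18: "b"})
```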


Thanks,

John...


Question about incremental repair

2014-10-01 Thread John Sumsion
If you only run incremental repairs, does that mean that bitrot will go 
undetected for already repaired sstables?

If so, is there any other process that will detect bitrot for all the repaired 
sstables other than full repair (or an unfortunate user)?

John...


 NOTICE: This email message is for the sole use of the intended recipient(s) 
and may contain confidential and privileged information. Any unauthorized 
review, use, disclosure or distribution is prohibited. If you are not the 
intended recipient, please contact the sender by reply email and destroy all 
copies of the original message.



Detecting bitrot with incremental repair

2014-09-11 Thread John Sumsion
jbellis talked about incremental repair, which is great, but as I understood 
it, repair was also somewhat responsible for detecting and repairing bitrot 
on long-lived sstables.

If repair doesn't do it, what will?

Thanks,
John...

