So I tried to run a repair with the following on one of the servers:
nodetool repair system_auth -pr -local

After two hours it hadn’t finished.  I had to kill the repair because of 
another issue and haven’t tried again.

Why would such a small table take so long to repair?

Also, what would happen if I set the RF back to a lower number like 5?


Thanks
From: <li...@beobal.com> on behalf of Sam Tunnicliffe <s...@beobal.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 30, 2017 at 10:10 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: system_auth replication factor in Cassandra 2.1

It's a better rule of thumb to use an RF of 3 to 5 per DC and this is what the 
docs now suggest: 
http://cassandra.apache.org/doc/latest/operating/security.html#authentication
Out of the box, the system_auth keyspace is set up with SimpleStrategy and RF=1 
so that it works on any new system including dev & test clusters, but obviously 
that's no use for a production system.
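As a rough illustration of the change described above, the following Python sketch builds the CQL statement that would move system_auth to NetworkTopologyStrategy with a small per-DC RF. The helper name and the datacenter names "DC1" and "AWS" are hypothetical; the real names must match what the cluster itself reports (for example in `nodetool status`), and a repair of system_auth is still needed afterwards.

```python
def alter_system_auth_cql(rf_per_dc):
    """Build an ALTER KEYSPACE statement for system_auth.

    rf_per_dc maps datacenter name -> replication factor.
    The datacenter names passed in below are hypothetical placeholders.
    """
    if not rf_per_dc:
        raise ValueError("at least one datacenter is required")
    for dc, rf in rf_per_dc.items():
        if not isinstance(rf, int) or rf < 1:
            raise ValueError(f"invalid replication factor for {dc!r}: {rf!r}")
    # Render each datacenter as a 'name': rf pair inside the replication map.
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in rf_per_dc.items())
    return (
        "ALTER KEYSPACE system_auth WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Hypothetical datacenter names, RF of 3 in each.
print(alter_system_auth_cql({"DC1": 3, "AWS": 3}))
```

The printed statement is meant to be pasted into cqlsh by hand; the sketch does not connect to a cluster.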

Regarding the increased rate of authentication errors: did you run repair after 
changing the RF? Auth queries are done at CL.LOCAL_ONE, so if you haven't 
repaired, the data for the user logging in will probably not be where it should 
be. The exception to this is the default "cassandra" user: queries for that 
user are done at CL.QUORUM, which will indeed lead to timeouts and 
authentication errors with a very high RF. It's recommended to use that 
default user only to bootstrap the setup of your own users & superusers; the 
link above also has info on this.
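To put a number on why CL.QUORUM hurts with a very high RF: QUORUM needs a majority of all replicas, that is, the sum of the replication factors across datacenters, divided by two and rounded down, plus one. The sketch below works that out for a few settings. It assumes the RF was set to the node count in both datacenters (135 and 227), which the thread implies but does not state outright.

```python
def quorum(total_rf):
    """Replicas that must respond at CL.QUORUM for a given total RF."""
    return total_rf // 2 + 1


# Per-DC replication factors; the first row is an assumption about the
# cluster in question (RF equal to the node count in each datacenter).
scenarios = [
    ("RF = node count", [135, 227]),
    ("RF = 5 per DC", [5, 5]),
    ("RF = 3 per DC", [3, 3]),
]

for label, rfs in scenarios:
    total = sum(rfs)
    print(f"{label}: total RF {total}, QUORUM needs {quorum(total)} replicas")
```

Under that assumption a login as the default "cassandra" user would have to hear back from 182 replicas, compared with 6 or 4 at the lower settings.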

Thanks,
Sam


On 30 August 2017 at 16:50, Chuck Reynolds <creyno...@ancestry.com> wrote:
So I’ve read that if you’re using authentication in Cassandra 2.1, your 
replication factor should match the number of nodes in your datacenter.

Is that true?

I have a two-datacenter cluster: 135 nodes in datacenter 1 & 227 nodes in an 
AWS datacenter.

Why do I want to replicate the system_auth table that many times?

What are the benefits and disadvantages of matching the number of nodes as 
opposed to the standard replication factor of 3?


The reason I’m asking the question is because it seems like I’m getting a lot 
of authentication errors now and they seem to happen more under load.

Also, querying the system_auth table from cqlsh to get the users now seems to 
time out.


Any help would be greatly appreciated.

Thanks
