Justin Geiser created CASSANDRA-5593:
----------------------------------------

             Summary: Auth.isExistingUser is periodically throwing org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
                 Key: CASSANDRA-5593
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5593
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2.4
         Environment: Three node cluster
            Reporter: Justin Geiser


When setting up authentication on a clustered setup, we periodically get an UnavailableException (Cannot achieve consistency level ONE) when one or two of the cluster nodes are down.

{code}
java.lang.RuntimeException: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
	at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:75)
	at com.resolve.cassandra.auth.SimpleAuthenticator.setup(SimpleAuthenticator.java:273)
	at org.apache.cassandra.auth.Auth.setup(Auth.java:139)
	at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:781)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:542)
	at org.apache.cassandra.service.StorageService.initServer(StorageService.java:439)
	at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:323)
	at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
	at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
Caused by: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
	at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:250)
	at org.apache.cassandra.service.ReadCallback.assureSufficientLiveNodes(ReadCallback.java:152)
	at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:891)
	at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:829)
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:126)
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:1)
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:132)
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:143)
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:151)
	at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:71)
	... 8 more
{code}

Digging into the issue, the problem appears to be that SimpleStrategy.calculateNaturalEndpoints returns only one entry, because the replication factor for the system_auth keyspace is 1. If that single endpoint happens to be one of the down nodes, it gets removed by getLiveNaturalEndpoints in StorageService. So by the time the read reaches StorageProxy.fetchRows(StorageProxy.java:891), the endpoints list is empty, even though other nodes are still up.

For a quick fix I removed the "endpoints.size() < replicas" check from the while loop in SimpleStrategy.calculateNaturalEndpoints:

{code}
public List<InetAddress> calculateNaturalEndpoints(Token token, TokenMetadata metadata)
{
    int replicas = getReplicationFactor();
    ArrayList<Token> tokens = metadata.sortedTokens();
    List<InetAddress> endpoints = new ArrayList<InetAddress>(replicas);

    if (tokens.isEmpty())
        return endpoints;

    // Add the token at the index by default
    Iterator<Token> iter = TokenMetadata.ringIterator(tokens, token, false);
    while (iter.hasNext())
    {
        InetAddress ep = metadata.getEndpoint(iter.next());
        if (!endpoints.contains(ep))
        {
            endpoints.add(ep);
        }
    }
    return endpoints;
}
{code}
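
To make the failure mode described above easier to follow, here is a minimal, self-contained sketch of the endpoint selection. It is a simplified model, not Cassandra code: the class ReplicaSketch, the string node names, and both helper methods are hypothetical stand-ins for calculateNaturalEndpoints and getLiveNaturalEndpoints.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the behaviour described in this report.
// Nodes are plain strings; none of this is Cassandra's actual API.
final class ReplicaSketch {

    // Walk the ring starting at index `start`, collecting distinct endpoints.
    // bounded == true  : stop once `replicas` endpoints are collected
    //                    (models the original "endpoints.size() < replicas" check).
    // bounded == false : visit the whole ring (models the quick fix).
    static List<String> naturalEndpoints(List<String> ring, int start,
                                         int replicas, boolean bounded) {
        List<String> endpoints = new ArrayList<>();
        for (int i = 0; i < ring.size(); i++) {
            if (bounded && endpoints.size() >= replicas) {
                break;
            }
            String ep = ring.get((start + i) % ring.size());
            if (!endpoints.contains(ep)) {
                endpoints.add(ep);
            }
        }
        return endpoints;
    }

    // Remove endpoints that are currently down
    // (in the spirit of getLiveNaturalEndpoints).
    static List<String> liveEndpoints(List<String> natural, Set<String> down) {
        List<String> live = new ArrayList<>();
        for (String ep : natural) {
            if (!down.contains(ep)) {
                live.add(ep);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("node1", "node2", "node3");
        Set<String> down = new HashSet<>(Arrays.asList("node1"));

        // Replication factor 1 and the only replica is down: nothing left to read from.
        List<String> bounded = naturalEndpoints(ring, 0, 1, true);
        System.out.println(liveEndpoints(bounded, down));   // prints []

        // Without the size check every node in the ring is a candidate.
        List<String> unbounded = naturalEndpoints(ring, 0, 1, false);
        System.out.println(liveEndpoints(unbounded, down)); // prints [node2, node3]
    }
}
```

In this model, dropping the bound avoids the empty list only by treating every node in the ring as a natural endpoint, so the configured replication factor no longer limits the result. That may be acceptable as a stopgap, but it is a trade-off worth keeping in mind when evaluating the quick fix.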