[ 
https://issues.apache.org/jira/browse/CASSANDRA-5593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665971#comment-13665971
 ] 

Justin Geiser commented on CASSANDRA-5593:
------------------------------------------

I didn't think it was a good fix, just a quick one, though I didn't expect it 
to be quite that bad :).  The problem is that consistencyForUser in Auth is 
hard coded to ONE (except for the super user), and because the endpoint list 
is cached, isExistingUser will fail every time until the node it picked the 
first time comes back up.  Because we don't allow anonymous authentication and 
aren't using the super user, the node we're trying to access is effectively 
hosed until the down node comes back up.
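
To make the failure mode above concrete, here is a minimal, self-contained 
sketch. It is a toy model, not Cassandra code: the class AuthReadSketch and 
all of its methods are hypothetical names made up for illustration. It assumes 
the replica list for the auth row is computed once and then reused, and that a 
read at consistency level ONE needs at least one live replica from that list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: names are hypothetical and do not mirror Cassandra's API.
class AuthReadSketch
{
    // Replica list computed once and reused, standing in for the cached endpoint list.
    private final List<String> cachedNaturalEndpoints;
    // Nodes currently considered alive.
    private final Set<String> liveNodes = new HashSet<String>();

    AuthReadSketch(List<String> naturalEndpoints, List<String> initiallyLive)
    {
        this.cachedNaturalEndpoints = new ArrayList<String>(naturalEndpoints);
        this.liveNodes.addAll(initiallyLive);
    }

    void markDown(String node) { liveNodes.remove(node); }

    void markUp(String node) { liveNodes.add(node); }

    // Keep only the cached replicas that are currently alive.
    List<String> liveNaturalEndpoints()
    {
        List<String> live = new ArrayList<String>();
        for (String ep : cachedNaturalEndpoints)
            if (liveNodes.contains(ep))
                live.add(ep);
        return live;
    }

    // A read at consistency level ONE needs at least one live replica.
    boolean canReadAtOne()
    {
        return liveNaturalEndpoints().size() >= 1;
    }

    public static void main(String[] args)
    {
        // Three node cluster, but only node2 is a replica for the auth row.
        AuthReadSketch cluster = new AuthReadSketch(
            Arrays.asList("node2"),
            Arrays.asList("node1", "node2", "node3"));

        cluster.markDown("node2");
        System.out.println(cluster.canReadAtOne()); // false: two nodes are up, but neither is a replica

        cluster.markUp("node2");
        System.out.println(cluster.canReadAtOne()); // true: the only replica is back
    }
}
```

In this model the check keeps failing for as long as the single cached replica 
is down, no matter how many other nodes are alive, which matches the behaviour 
described above.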
                
> Auth.isExistingUser is periodically throwing 
> org.apache.cassandra.exceptions.UnavailableException: Cannot achieve 
> consistency level ONE
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5593
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5593
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.2.4
>         Environment: Three node cluster
>            Reporter: Justin Geiser
>
> When setting up authentication on a clustered setup we're periodically 
> getting an UnavailableException: Cannot achieve consistency level ONE 
> whenever one or two of the cluster nodes are down.
> {code}
> java.lang.RuntimeException: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
>         at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:75)
>         at com.resolve.cassandra.auth.SimpleAuthenticator.setup(SimpleAuthenticator.java:273)
>         at org.apache.cassandra.auth.Auth.setup(Auth.java:139)
>         at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:781)
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:542)
>         at org.apache.cassandra.service.StorageService.initServer(StorageService.java:439)
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:323)
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
> Caused by: org.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level ONE
>         at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:250)
>         at org.apache.cassandra.service.ReadCallback.assureSufficientLiveNodes(ReadCallback.java:152)
>         at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:891)
>         at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:829)
>         at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:126)
>         at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:1)
>         at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:132)
>         at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:143)
>         at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:151)
>         at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:71)
>         ... 8 more
> {code}
> Digging into the issue, it looks like the problem is that the 
> SimpleStrategy.calculateNaturalEndpoints method returns only one entry, 
> because the replication factor for the system_auth keyspace is 1. If that 
> endpoint happens to be one of the down nodes, it gets removed by 
> getLiveNaturalEndpoints in StorageService.  So by the time the read reaches 
> StorageProxy.fetchRows(StorageProxy.java:891) the endpoints list is empty, 
> even though we have valid nodes running.
> For a quick fix I removed the "endpoints.size() < replicas" check from the 
> while loop in SimpleStrategy.calculateNaturalEndpoints:
> {code}
>     public List<InetAddress> calculateNaturalEndpoints(Token token, TokenMetadata metadata)
>     {
>         int replicas = getReplicationFactor();
>         ArrayList<Token> tokens = metadata.sortedTokens();
>         List<InetAddress> endpoints = new ArrayList<InetAddress>(replicas);
>         if (tokens.isEmpty())
>             return endpoints;
>         // Add the token at the index by default
>         Iterator<Token> iter = TokenMetadata.ringIterator(tokens, token, false);
>         while (iter.hasNext())
>         {
>             InetAddress ep = metadata.getEndpoint(iter.next());
>             if (!endpoints.contains(ep))
>             {
>                 endpoints.add(ep);
>             }
>         }
>         return endpoints;
>     }
> {code}
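
For illustration only, the difference between a ring walk that stops at the 
replication factor and the quick fix quoted above can be sketched with a toy 
model. The names RingWalkSketch, walk and limit are hypothetical; the real 
method works on Token and InetAddress through TokenMetadata, which this sketch 
replaces with a plain list of node names in token order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: a ring is a list of node names in token order.
class RingWalkSketch
{
    // Walk the ring once, starting at startIndex, collecting distinct endpoints.
    // A limit <= 0 means "no limit", mimicking the loop with the size check removed.
    static List<String> walk(List<String> ring, int startIndex, int limit)
    {
        List<String> endpoints = new ArrayList<String>();
        for (int i = 0; i < ring.size(); i++)
        {
            if (limit > 0 && endpoints.size() >= limit)
                break;
            String ep = ring.get((startIndex + i) % ring.size());
            if (!endpoints.contains(ep))
                endpoints.add(ep);
        }
        return endpoints;
    }

    public static void main(String[] args)
    {
        List<String> ring = Arrays.asList("node1", "node2", "node3");

        // Bounded by a replication factor of 1: a single endpoint comes back.
        System.out.println(walk(ring, 1, 1)); // [node2]

        // Size check removed: every node on the ring comes back.
        System.out.println(walk(ring, 1, 0)); // [node2, node3, node1]
    }
}
```

In this model the unbounded walk returns every node regardless of the 
replication factor. If the real code behaves the same way, nodes that never 
received the data would be treated as replicas, which is probably why the 
change is only usable as a stopgap.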
