[ 
https://issues.apache.org/jira/browse/CASSANDRA-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643890#comment-17643890
 ] 

Stefan Miklosovic commented on CASSANDRA-18096:
-----------------------------------------------

OK. Do you think this still makes sense to do? I am trying to come up with a 
way to stop the spamming here. Maybe we should kick off the schema pull only 
after these nodes are detected as alive? But then we would skip the cases where 
nodes are genuinely not alive. Another idea is to postpone the schema pull a 
little, to give the nodes time to be marked alive?

> Do not spam the logs with MigrationCoordinator not able to pull schemas on 
> bootstrap
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18096
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18096
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Low
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 4.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a node is joining a cluster, there is this output upon startup:
> {code}
> cassandra_node_6  | INFO  [GossipStage:1] 2022-12-06 12:48:07,187 
> Gossiper.java:1413 - Node /172.19.0.5:7000 is now part of the cluster
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6  | WARN MigrationCoordinator.java:650 - Can't send schema 
> pull request: node /172.19.0.5:7000 is down.
> {code}
> This is repeated for many of the already existing nodes. This log is 
> misleading: the schema pull request indeed can not be sent because the 
> "node is down", but the node is not actually down. The joining node just 
> thinks it is, because Gossiper has not had a chance to receive any gossip 
> about these nodes _yet_.
> I added more logging and it prints this:
> {code}
>  MigrationCoordinator.java:655 - Can't send schema pull request: node 
> /172.19.0.5:7000 is down: NORMAL, isAlive: false
> {code}
> When I do this:
> {code}
>         if (!gossiper.hasEndpointState(endpoint))
>             return;
>         if (!gossiper.isAlive(endpoint))
>         {
>             EndpointState endpointStateForEndpoint = gossiper.getEndpointStateForEndpoint(endpoint);
>             String status = Gossiper.getGossipStatus(endpointStateForEndpoint);
>             logger.warn("Can't send schema pull request: node {} is down: {}, isAlive: {}",
>                         endpoint, status, endpointStateForEndpoint.isAlive());
>             callback.onFailure(endpoint, RequestFailureReason.UNKNOWN);
>             return;
>         }
> {code}
> So the node is in NORMAL status but it is not marked alive yet, which is 
> quite strange.
> The fix is to still return early, but to log on WARN only when isAlive is 
> false and the status is _not_ NORMAL. In the remaining case we would still 
> log on TRACE at least.
> (1) 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/MigrationCoordinator.java#L648-L653
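
The log-level rule proposed in the quoted description could be sketched roughly as below. This is only one reading of the proposal, not the committed patch: the class SchemaPullLogSketch, its Level enum and the method levelForSkippedPull are hypothetical stand-ins that do not exist in Cassandra, and the gossip status is passed in as a plain string so the sketch runs without any Cassandra classes.

```java
// Hypothetical stand-in for illustration only; none of these names exist in Cassandra.
final class SchemaPullLogSketch
{
    enum Level { WARN, TRACE }

    // Called only after the endpoint was found to be not alive, i.e. the
    // schema pull is being skipped either way. A NORMAL status suggests the
    // gossiper has simply not marked the node alive yet, so the message is
    // demoted to TRACE. Any other status (including null) stays on WARN.
    static Level levelForSkippedPull(String gossipStatus)
    {
        return "NORMAL".equals(gossipStatus) ? Level.TRACE : Level.WARN;
    }

    public static void main(String[] args)
    {
        // Status strings here are example inputs, not an exhaustive list.
        for (String status : new String[] { "NORMAL", "shutdown" })
            System.out.println(status + " -> " + levelForSkippedPull(status));
    }
}
```

In MigrationCoordinator the early return and the callback.onFailure call would stay as they are; only the level of the log statement would depend on this decision.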


