[ https://issues.apache.org/jira/browse/CASSANDRA-18096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643903#comment-17643903 ]
Stefan Miklosovic edited comment on CASSANDRA-18096 at 12/6/22 2:53 PM:
------------------------------------------------------------------------

I have updated the PRs with the NoSpamLogger approach. Let's just wait until 4.1 is released to lower the buzz.


was (Author: smiklosovic):
I have updated the PRs with NoSpamLogger approach.

> Do not spam the logs with MigrationCoordinator not able to pull schemas on bootstrap
> ------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18096
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18096
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Low
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.1.x, 4.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a node is joining a cluster, there is this output upon startup:
> {code}
> cassandra_node_6 | INFO [GossipStage:1] 2022-12-06 12:48:07,187 Gossiper.java:1413 - Node /172.19.0.5:7000 is now part of the cluster
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> cassandra_node_6 | WARN MigrationCoordinator.java:650 - Can't send schema pull request: node /172.19.0.5:7000 is down.
> {code}
> The same warning is repeated for many of the already existing nodes; you get the idea.
> This log is misleading. The joining node indeed cannot send the schema pull request because it considers the peer down (1), but the peer is not actually down; the joining node only thinks so because the Gossiper has not had a chance to receive any gossip about these nodes _yet_.
> I added more logging and it writes this:
> {code}
> MigrationCoordinator.java:655 - Can't send schema pull request: node /172.19.0.5:7000 is down: NORMAL, isAlive: false
> {code}
> when I do this:
> {code}
> if (!gossiper.hasEndpointState(endpoint))
>     return;
> if (!gossiper.isAlive(endpoint))
> {
>     // Extra diagnostics: report the gossip status next to the liveness flag.
>     EndpointState endpointStateForEndpoint = gossiper.getEndpointStateForEndpoint(endpoint);
>     String status = Gossiper.getGossipStatus(endpointStateForEndpoint);
>     logger.warn("Can't send schema pull request: node {} is down: {}, isAlive: {}", endpoint, status, endpointStateForEndpoint.isAlive());
>     callback.onFailure(endpoint, RequestFailureReason.UNKNOWN);
>     return;
> }
> {code}
> So the node is in NORMAL status but is not alive yet, which is quite strange.
> The fix is to still return early, but to log at WARN only when isAlive is false and the status is _not_ NORMAL. In all other cases we would still log at least at TRACE.
> (1) https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/schema/MigrationCoordinator.java#L648-L653
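
The comment at the top mentions moving to a NoSpamLogger approach, that is, rate-limiting the repeated warning instead of emitting it on every attempt. As a rough illustration of that idea only, the sketch below uses plain JDK classes. It is not Cassandra's NoSpamLogger and not the code from the PRs: the names `WarnThrottle` and `shouldWarn`, the one-minute interval, and the simulated clock are all assumptions made up for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; NOT Cassandra's NoSpamLogger.
final class WarnThrottle
{
    private final long minIntervalNanos;
    // Timestamp (in nanoseconds) of the last warning allowed for each key.
    private final Map<String, Long> lastAllowed = new ConcurrentHashMap<>();

    WarnThrottle(long minInterval, TimeUnit unit)
    {
        this.minIntervalNanos = unit.toNanos(minInterval);
    }

    // Returns true if a warning for this key may be logged at time nowNanos,
    // i.e. none was allowed yet or the minimum interval has elapsed since the last one.
    boolean shouldWarn(String key, long nowNanos)
    {
        boolean[] allowed = { false };
        // compute() runs atomically per key, so concurrent callers cannot both win.
        lastAllowed.compute(key, (k, last) -> {
            if (last == null || nowNanos - last >= minIntervalNanos)
            {
                allowed[0] = true;
                return nowNanos;
            }
            return last;
        });
        return allowed[0];
    }
}

final class WarnThrottleDemo
{
    public static void main(String[] args)
    {
        // Assumed interval: at most one warning per endpoint per minute.
        WarnThrottle throttle = new WarnThrottle(1, TimeUnit.MINUTES);
        String endpoint = "/172.19.0.5:7000";
        long start = System.nanoTime();
        for (int i = 0; i < 7; i++)
        {
            // Simulated clock: the seven attempts are one second apart.
            long now = start + TimeUnit.SECONDS.toNanos(i);
            if (throttle.shouldWarn(endpoint, now))
                System.out.println("WARN Can't send schema pull request: node " + endpoint + " is down.");
        }
    }
}
```

With the seven attempts spaced one second apart, only the first one passes the throttle, so the warning is printed once instead of seven times. Keying the throttle by endpoint is a design choice of this sketch: a warning about one node does not suppress the first warning about another.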