[
https://issues.apache.org/jira/browse/IGNITE-25948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-25948:
-------------------------------------
Description:
h3. Motivation
Ignite logs are both excessive and, in some places, uninformative. Among other
components, Raft produces a redundant amount of logs. A single-node cluster with
1_000 partitions will log the following (see the attachment):
* Node {} init ballot box's lastCommittedIndex={}.
* Node {} init, term={}, lastLogId={}, conf={}, oldConf={}.
* Starts FSMCaller successfully [nodeId={}].
* Shutting down FSMCaller...
2054 times each
* Save raft meta, path={}, term={}, votedFor={}, cost time={} ms
4108 times.
etc.
Likewise, 1_000 partitions will log "Unsuccessful election round number" more
than 327272 times if the majority is lost for one hour.
h3. Implementation notes
After some discussion, the following is suggested:
# Change the log level from "info" to "debug" for the following messages:
## {{LOG.info("Node {} init ballot box's lastCommittedIndex={}.", getNodeId(),
lastCommittedIndex);}}
## {{LOG.info("Starts FSMCaller successfully [nodeId={}].", nodeId);}}
## {{LOG.info("Shutting down FSMCaller...");}}
## {{LOG.info("Save raft meta, path={}, term={}, votedFor={}, cost time={} ms",
this.path, this.term, this.votedFor, cost);}} if cost is small enough.
## {{LOG.info("Node {} is a learner, election timer is not started.",
this.nodeId);}}
## {{LOG.info("Node {} received PreVoteRequest from {}, term={}, currTerm={},
granted={}, requestLastLogId={}, lastLogId={}.", getNodeId(),
request.serverId(), request.term(), this.currTerm, granted, requestLastLogId,
lastLogId);}}
## {{LOG.info("Node {} term {} start preVote.", getNodeId(), this.currTerm);}}
# Add lastCommittedIndex to the following messages:
## {{LOG.info("Node {} init, term={}, lastLogId={}, conf={}, oldConf={}."}}
## {{LOG.info("Init node {} with empty conf.", this.serverId);}}
# For {{LOG.info("Unsuccessful election round number {}, group '{}'",
electionRound, groupId);}}:
## Print every message for the first 10 rounds.
## Print every tenth message for rounds 11-100.
## Print every hundredth message for the rest.
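The throttling rule in item 3 could look roughly like the sketch below. It assumes electionRound is a 1-based counter, which the description does not state. The names ElectionLogThrottle, shouldLog and countLogged are hypothetical and are not taken from the Ignite code base.

```java
/** Hypothetical helper illustrating the proposed throttling of election-round messages. */
final class ElectionLogThrottle {
    private ElectionLogThrottle() {
        // Utility class, no instances.
    }

    /**
     * Decides whether the "Unsuccessful election round number" message
     * should be printed for the given round (assumed to start at 1).
     */
    static boolean shouldLog(long electionRound) {
        if (electionRound <= 10) {
            // Rounds 1-10: print every message.
            return true;
        }
        if (electionRound <= 100) {
            // Rounds 11-100: print every tenth message (20, 30, ..., 100).
            return electionRound % 10 == 0;
        }
        // Rounds above 100: print every hundredth message (200, 300, ...).
        return electionRound % 100 == 0;
    }

    /** Counts how many messages would be printed for rounds 1..rounds. */
    static long countLogged(long rounds) {
        long logged = 0;
        for (long round = 1; round <= rounds; round++) {
            if (shouldLog(round)) {
                logged++;
            }
        }
        return logged;
    }
}
```

Under these assumptions, 1000 consecutive unsuccessful rounds in one group would produce 28 messages instead of 1000. The call site would wrap the existing LOG.info call in an if (ElectionLogThrottle.shouldLog(electionRound)) check.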
> Reduce amount of raft logs
> --------------------------
>
> Key: IGNITE-25948
> URL: https://issues.apache.org/jira/browse/IGNITE-25948
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Priority: Major
> Attachments: SingleNode_1000partions_nodeRestart
>
>