[
https://issues.apache.org/jira/browse/IGNITE-25948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vyacheslav Koptilin updated IGNITE-25948:
-----------------------------------------
Fix Version/s: 3.2
> Reduce amount of raft logs
> --------------------------
>
> Key: IGNITE-25948
> URL: https://issues.apache.org/jira/browse/IGNITE-25948
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Assignee: Alexander Lapin
> Priority: Major
> Labels: ignite-3
> Fix For: 3.2
>
> Attachments: SingleNode_1000partions_nodeRestart
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> h3. Motivation
> Ignite logs are both excessive and, in some places, uninformative. Among other
> components, raft produces a redundant amount of logs. A single-node cluster with
> 1_000 partitions will log (see the attachment)
> * Node {} init ballot box's lastCommittedIndex={}.
> * Node {} init, term={}, lastLogId={}, conf={}, oldConf={}.
> * Starts FSMCaller successfully [nodeId={}].
> * Shutting down FSMCaller...
> 2054 times each
> * Save raft meta, path={}, term={}, votedFor={}, cost time={} ms
> 4108 times.
> etc.
> Or 1_000 partitions will log "Unsuccessful election round number" more than
> 327272 times if the majority is lost for one hour.
> h3. Implementation notes
> After some discussion, the following is suggested:
> # Change the log level from "info" to "debug" for the following messages:
> ## {{LOG.info("Node {} init ballot box's lastCommittedIndex={}.",
> getNodeId(), lastCommittedIndex);}}
> ## {{LOG.info("Starts FSMCaller successfully [nodeId={}].", nodeId);}}
> ## {{LOG.info("Shutting down FSMCaller...");}}
> ## {{LOG.info("Save raft meta, path={}, term={}, votedFor={}, cost time={}
> ms", this.path, this.term, this.votedFor, cost);}} if cost is small enough.
> ## {{LOG.info("Node {} is a learner, election timer is not started.",
> this.nodeId);}}
> ## {{LOG.info("Node {} received PreVoteRequest from {}, term={},
> currTerm={}, granted={}, requestLastLogId={}, lastLogId={}.", getNodeId(),
> request.serverId(), request.term(), this.currTerm, granted,
> requestLastLogId, lastLogId);}}
> ## {{LOG.info("Node {} term {} start preVote.", getNodeId(),
> this.currTerm);}}
> # Add lastCommittedIndex to
> ## {{LOG.info("Node {} init, term={}, lastLogId={}, conf={}, oldConf={}."}}
> ## {{LOG.info("Init node {} with empty conf.", this.serverId);}}
> # For {{LOG.info("Unsuccessful election round number {}, group '{}'",
> electionRound, groupId);}}
> ## Print each of the first 10.
> ## Print every tenth for 11-100.
> ## Print every hundredth for the rest.
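
The throttling rule for the "Unsuccessful election round number" message could be sketched roughly as below. This is only an illustration under stated assumptions, not the actual change: the class `ElectionLogThrottle` and its methods are hypothetical names that do not exist in Ignite, rounds are assumed to be counted from 1, and "every tenth" / "every hundredth" is read as "rounds divisible by 10 / by 100".

```java
/** Hypothetical helper illustrating the proposed throttling; not part of Ignite. */
public final class ElectionLogThrottle {
    private ElectionLogThrottle() {
    }

    /**
     * Decides whether the message for the given election round should be printed.
     * Assumes rounds are numbered starting from 1.
     */
    public static boolean shouldLog(long electionRound) {
        if (electionRound <= 10) {
            return true; // Rounds 1-10: print each one.
        }
        if (electionRound <= 100) {
            return electionRound % 10 == 0; // Rounds 11-100: print 20, 30, ..., 100.
        }
        return electionRound % 100 == 0; // Later rounds: print 200, 300, ...
    }

    /** Counts how many of the rounds 1..rounds would be printed under this rule. */
    public static long countLogged(long rounds) {
        long logged = 0;
        for (long round = 1; round <= rounds; round++) {
            if (shouldLog(round)) {
                logged++;
            }
        }
        return logged;
    }

    public static void main(String[] args) {
        // 327 is an assumed per-partition figure: 327272 messages spread
        // evenly over 1_000 partitions, which the issue does not state.
        System.out.println(countLogged(327)); // prints 21
    }
}
```

Under that even-spread assumption, each partition would print 21 messages instead of roughly 327 during the one-hour outage described in the motivation.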
--
This message was sent by Atlassian Jira
(v8.20.10#820010)