Bharat Viswanadham created HDDS-1476:
----------------------------------------

             Summary: Fix logIfNeeded logic in EndPointStateMachine
                 Key: HDDS-1476
                 URL: https://issues.apache.org/jira/browse/HDDS-1476
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Bharat Viswanadham


{code:java}
public void logIfNeeded(Exception ex) {
  LOG.trace("Incrementing the Missed count. Ex : {}", ex);
  this.incMissed();
  if (this.getMissedCount() % getLogWarnInterval(conf) == 0) {
    LOG.error(
        "Unable to communicate to SCM server at {} for past {} seconds.",
        this.getAddress().getHostString() + ":" + this.getAddress().getPort(),
        TimeUnit.MILLISECONDS.toSeconds(
            this.getMissedCount() * getScmHeartbeatInterval(this.conf)), ex);
  }
}
{code}
This method is called whenever an exception occurs in the state machine, in 
order to log it. To avoid logging too aggressively, the 
ozone.scm.heartbeat.log.warn.interval.count property controls how often the 
exception is actually logged.

 

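There is a small issue here: the exception is not logged the first time it 
occurs. The missed count is incremented before the interval check, so for any 
interval greater than 1 the first failure is skipped. We need to log on the 
first occurrence and only then increment the missed count.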

 

The fix is to move this.incMissed() to the end of the method, so that the 
exception is logged the first time it occurs and then once every 
log.warn.interval.count exceptions after that.
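
The reordering can be sketched in isolation as follows. This is only an illustrative stand-in, not the actual Ozone class: MissedCountLogger, its constructor argument, and the loggedAt list are hypothetical names, and a list entry takes the place of the LOG.error call so the behavior is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the state machine; not the real Ozone class. */
class MissedCountLogger {
  // Plays the role of ozone.scm.heartbeat.log.warn.interval.count.
  private final int logWarnInterval;
  private long missedCount = 0;
  // Records the missed count seen at each "log" call, instead of LOG.error.
  private final List<Long> loggedAt = new ArrayList<>();

  MissedCountLogger(int logWarnInterval) {
    this.logWarnInterval = logWarnInterval;
  }

  void logIfNeeded(Exception ex) {
    // Check first: missedCount is 0 on the first failure, so it is logged.
    if (missedCount % logWarnInterval == 0) {
      loggedAt.add(missedCount);
    }
    // Increment last, as proposed in this issue.
    missedCount++;
  }

  List<Long> getLoggedAt() {
    return loggedAt;
  }

  public static void main(String[] args) {
    MissedCountLogger logger = new MissedCountLogger(3);
    for (int i = 0; i < 7; i++) {
      logger.logIfNeeded(new RuntimeException("connection failed"));
    }
    System.out.println(logger.getLoggedAt());
  }
}
```

With an interval of 3 and seven consecutive failures, this sketch logs on the 1st, 4th and 7th failure (missed counts 0, 3 and 6). With the increment placed before the check, the same run would log only on the 3rd and 6th failure.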

 



