Additional heartbeat logger
---------------------------

         Key: SEQUOIA-920
         URL: https://forge.continuent.org/jira/browse/SEQUOIA-920
     Project: Sequoia
        Type: Improvement
  Components: Core  
    Versions: Sequoia 2.10.6, Sequoia 2.10.5, Sequoia 3.0 beta2,
              Sequoia 2.10.4, Sequoia 2.10.3, Sequoia 3.0 beta1,
              Sequoia 2.10.2, Sequoia 2.10.1, Sequoia 2.10, Sequoia 2.9
    Reporter: Marc Herbert


There is a strong desire for an additional "heartbeat" logger, exactly
like syslogd (see "-- MARK --" in syslogd man page). This would solve
(at least) two problems:

- when reading the logs, you find a 5-hour hole before a controller
restart. Approximately when did the host crash?

- under heavy load, mysterious time behaviours have been noticed,
including clock drifts (!) under Linux.


Design sketch:

- has its own, new dedicated logger

- period is configurable through the debug level (e.g.:
  DEBUG: every 15 seconds, INFO: every 2 minutes, etc.)

- lifecycle = controller lifecycle

- logs lines like this:

    2007-01-12 19:21:36,247  --- BEAT --- controller has been up for HH:MM:SS

  HH:MM:SS is _NOT_ computed using the wall clock, but by multiplying
  the number of beats (MARKs) by the sleep period instead => trivial
  detection of clock drift issues
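
To make the uptime arithmetic concrete, here is a minimal sketch in Java.
The class and method names (BeatUptime, format, driftMillis) are
hypothetical and only illustrate the idea above: uptime is the beat count
times the period, and comparing it with wall-clock elapsed time gives the
apparent drift.

```java
import java.util.Locale;

/** Hypothetical helper: derives uptime from the beat count, never from the wall clock. */
final class BeatUptime {
    private BeatUptime() {}

    /** Uptime as HH:MM:SS, computed as beats * periodMillis. */
    static String format(long beats, long periodMillis) {
        long totalSeconds = (beats * periodMillis) / 1000L;
        long hours = totalSeconds / 3600L;
        long minutes = (totalSeconds % 3600L) / 60L;
        long seconds = totalSeconds % 60L;
        return String.format(Locale.ROOT, "%02d:%02d:%02d", hours, minutes, seconds);
    }

    /**
     * Wall-clock elapsed time minus beat-derived uptime, in milliseconds.
     * A large value suggests late beats (stalls) or a clock jump.
     */
    static long driftMillis(long beats, long periodMillis, long wallClockElapsedMillis) {
        return wallClockElapsedMillis - beats * periodMillis;
    }
}
```

For example, 8 beats at a 15-second period give an uptime of 00:02:00,
whatever the wall clock says in the meantime.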

java.util.Timer/TimerTask seem well-suited for the job (we have enough
bare threads to manage already).
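
A rough sketch of what this could look like with java.util.Timer follows.
Everything here is an assumption for discussion, not Sequoia code: the
Heartbeat class, the Consumer<String> sink standing in for the dedicated
logger (which would add the timestamp), and the periodMillisFor mapping,
which only encodes the two example values from the design sketch and
treats any other level as "disabled".

```java
import java.util.Locale;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Hypothetical heartbeat: one line per period, uptime derived from the beat count. */
final class Heartbeat {
    private final long periodMillis;
    private final Consumer<String> sink; // stand-in for the dedicated logger
    private final AtomicLong beats = new AtomicLong();
    private Timer timer;

    Heartbeat(long periodMillis, Consumer<String> sink) {
        if (periodMillis <= 0) {
            throw new IllegalArgumentException("period must be positive");
        }
        this.periodMillis = periodMillis;
        this.sink = sink;
    }

    /** Example mapping only: DEBUG = 15 s, INFO = 2 min, anything else = -1 (disabled). */
    static long periodMillisFor(String level) {
        switch (level) {
            case "DEBUG": return 15_000L;
            case "INFO":  return 120_000L;
            default:      return -1L;
        }
    }

    /** Starts beating; meant to be called when the controller starts. */
    synchronized void start() {
        if (timer != null) {
            return; // already running
        }
        timer = new Timer("heartbeat", true); // daemon thread
        // Fixed-delay scheduling: a late beat is not "caught up" afterwards.
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                beat();
            }
        }, periodMillis, periodMillis);
    }

    /** Stops beating; meant to be called when the controller shuts down. */
    synchronized void stop() {
        if (timer != null) {
            timer.cancel();
            timer = null;
        }
    }

    /** Emits one beat line; uptime is beats * period, not wall-clock time. */
    void beat() {
        long up = beats.incrementAndGet() * periodMillis / 1000L;
        sink.accept(String.format(Locale.ROOT,
                "--- BEAT --- controller has been up for %02d:%02d:%02d",
                up / 3600L, (up % 3600L) / 60L, up % 60L));
    }
}
```

Something like new Heartbeat(Heartbeat.periodMillisFor("DEBUG"), logger::info)
would wire it to a real logger. The sketch uses Timer.schedule (fixed delay)
rather than scheduleAtFixedRate on purpose: fixed-rate execution fires
delayed beats in quick succession to catch up, which would hide exactly the
stalls this logger is supposed to reveal.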

Please provide comments/make suggestions. Especially:
- alternative names that are more intuitive for native English speakers
- good default period values, preferably inspired by field experience

Marc +Jeff
