Hi all,

New user here.
We have been testing an older version of the heartbeat / pacemaker combination,
compiled for illumos (an OpenSolaris successor).
Versions:
Heartbeat-3-0-STABLE-3.0.5
Pacemaker-1-0-Pacemaker-1.0.11

It has all worked fine during testing (several months now), but every so often
(and sometimes quite frequently) the following message appears on the console:

crmd: [ID 996084 daemon.crit] [12637]: CRIT: time_longclock: old value was 
298671305, new value is 298671304, diff is 1, callcount 141814
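
I read this as the clock value going backwards by one tick between two calls.
To check the numbers I used a small sketch like the one below. It assumes the
message always has the layout shown above; parse_longclock is my own helper
name, not something that ships with heartbeat.

```python
import re

# Hypothetical helper, not part of heartbeat: extracts the four numbers
# from a time_longclock message, assuming the layout seen on our console.
PATTERN = re.compile(
    r"time_longclock: old value was\s+(\d+), new value is\s+(\d+), "
    r"diff is\s+(\d+), callcount\s+(\d+)"
)


def parse_longclock(message):
    # Console lines are wrapped, so collapse all whitespace first.
    flat = " ".join(message.split())
    m = PATTERN.search(flat)
    if m is None:
        return None
    old, new, diff, calls = (int(g) for g in m.groups())
    return {
        "old": old,
        "new": new,
        "diff": diff,
        "callcount": calls,
        # True when the reported clock value moved backwards.
        "went_backwards": new < old,
    }


msg = """crmd: [ID 996084 daemon.crit] [12637]: CRIT: time_longclock: old value was
298671305, new value is 298671304, diff is 1, callcount 141814"""
info = parse_longclock(msg)
print(info)
```

For the message above, old minus new equals the reported diff of 1, so the
value did step backwards by a single unit.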

From what I have been able to find, this type of occurrence should have been
fixed in heartbeat versions after 2.1.4. At that time it could make a cluster
start behaving erratically.
We have two test implementations of a cluster, one in VMware and one on
standard hardware. Both are just for testing.
We have made sure that time is synced via NTP against internet servers. The
hardware implementation shows this message less often than the VMware
implementation, but it still appears (sometimes about three times per 24 hours).

We haven't seen any strange behaviour in the cluster yet, but my questions
about this are as follows:

Should we worry about this 'time_longclock' CRIT error even though it should
have been fixed in versions post HA 3?

Is there something (simple) that can be done to prevent this type of error, or
should we expect normal cluster behaviour since NTP is used?

The above message should make clear that I'm not a programmer, nor am I a 
heartbeat specialist .... ;-)

Regards,

S.


                                          
