Re: [Fwd: Re: [Linux-HA] Heartbeat compatibility question]

2007-04-11 Thread Guochun Shi
The log from Patrick Begou shows that communication between the two 
nodes died after their initial exchange.


I tried the combination (1.2.3 and 2.0.8) with a simple v1-style IP 
address resource. The communication is fine: each node can see the 
other without problem.
However, the resource cannot start or fail over correctly. Two 2.0.8 
nodes with a v1-style resource work correctly, though. I opened a bug 
for this (#1544).


-guochun




Alan Robertson wrote:

Looks like something is broken between 1.2.x versions and 2.0.x versions
:-(.  This is the 2nd complaint about this in the last week.

  




Subject: Re: [Linux-HA] Heartbeat compatibility question
From: Patrick Begou [EMAIL PROTECTED]
Date: Fri, 06 Apr 2007 09:22:54 +0200
To: General Linux-HA mailing list linux-ha@lists.linux-ha.org

I have kept a log file, even though I cannot find any useful information 
in it at my (low) level of knowledge. It is attached, in case it helps.

Dean is the 2.0.8 heartbeat node (FC6 X86_64) which produced this log.
Ekman is the 1.2.3 heartbeat node (Sarge Amd64).

The only things I notice are the messages:
info: flow control disabled due to different version heartbeat
info: Status update for node ekman: status active
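
As an illustration only, here is a minimal sketch of how lines like these could be picked out of a saved log. The marker strings are copied from the two messages above; the function name and the sample lines are hypothetical, and a real run would read the lines from the actual log file.

```python
# Marker substrings copied from the two log messages quoted above.
MARKERS = (
    "flow control disabled due to different version",
    "Status update for node",
)


def find_marker_lines(lines, markers=MARKERS):
    """Return (line_number, text) pairs for lines containing any marker."""
    hits = []
    for number, text in enumerate(lines, start=1):
        if any(marker in text for marker in markers):
            hits.append((number, text.rstrip("\n")))
    return hits


# Hypothetical sample; the third line is invented for the example.
sample_log = [
    "info: flow control disabled due to different version heartbeat\n",
    "info: Status update for node ekman: status active\n",
    "info: some unrelated message\n",
]

for number, text in find_marker_lines(sample_log):
    print(number, text)
```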

I launched several hb_takeover commands (which returned without doing 
anything) and finally /etc/init.d/heartbeat stop (heartbeat -k 
freezes), but nothing appears in the logs (or I do not see it, as I do 
not know what should be logged).

Heartbeat was ended by a killall heartbeat at the end of the log file.

About the extra problems: yes, they have nothing to do with heartbeat. 
I only wanted to give some feedback to heartbeat users who could run 
into this situation.


Now heartbeat 1.2 runs fine and my HA cluster is available with one 
node on Sarge and one node on FC6. I will set up the second node and 
update heartbeat to 2.0.8 in the coming weeks.


Patrick


___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems





Re: [Linux-HA] Heartbeat compatibility question

2007-04-06 Thread Alan Robertson
Patrick Begou wrote:
 I have kept a log file, even though I cannot find any useful information
 in it at my (low) level of knowledge. It is attached, in case it helps.
 Dean is the 2.0.8 heartbeat node (FC6 X86_64) which produced this log.
 Ekman is the 1.2.3 heartbeat node (Sarge Amd64).
 
 The only things I notice are the messages:
 info: flow control disabled due to different version heartbeat
 info: Status update for node ekman: status active
 
 I launched several hb_takeover commands (which returned without doing
 anything) and finally /etc/init.d/heartbeat stop (heartbeat -k freezes),
 but nothing appears in the logs (or I do not see it, as I do not know what
 should be logged).
 Heartbeat was ended by a killall heartbeat at the end of the log file.
 
 About the extra problems: yes, they have nothing to do with heartbeat. I
 only wanted to give some feedback to heartbeat users who could run into
 this situation.
 
 Now heartbeat 1.2 runs fine and my HA cluster is available with one node
 on Sarge and one node on FC6. I will set up the second node and update
 heartbeat to 2.0.8 in the coming weeks.

Thanks for this feedback -- and the logs.  We'll look them over.


-- 
Alan Robertson [EMAIL PROTECTED]

Openness is the foundation and preservative of friendship...  Let me
claim from you at all times your undisguised opinions. - William
Wilberforce


Re: [Linux-HA] Heartbeat compatibility question

2007-04-05 Thread Patrick Begou

Just to give some final info on this thread:
- Heartbeat version 2.0.8 does not work with heartbeat version 1.2.x. I 
had to install identical versions on the two nodes (downgrading the 
official FC6 version).
- version 1.2.x works without major problems between a node Fedora Core 
6 X86_64 and a node Debian Sarge AMD64.
The main difficulty is the haresources file. The documentation 
should be changed a little, in the sense that
 the haresources files should not be identical but should provide the 
same behavior, since resource names can differ between OSs:

 - bind9 in Debian is called named in FC6
 - nfs startup is different on the two OSs
 - etc.
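
To make the point concrete, here is a minimal sketch of generating equivalent (but not identical) haresources lines from one logical description. Only the bind9/named pair comes from this message; the NFS script names, the preferred node, the IP address and the function name are illustrative assumptions, not a tested configuration.

```python
# Logical service -> init script name per platform.
# "bind9" (Debian) and "named" (FC6) come from the message above; the NFS
# script names are assumptions made for the sake of the example.
SCRIPT_NAMES = {
    "debian": {"dns": "bind9", "nfs": "nfs-kernel-server"},
    "fedora": {"dns": "named", "nfs": "nfs"},
}


def haresources_line(platform, preferred_node, ip_address, services):
    """Build one haresources line using the platform's script names."""
    names = SCRIPT_NAMES[platform]
    resources = ["IPaddr::" + ip_address] + [names[s] for s in services]
    return " ".join([preferred_node] + resources)


# Hypothetical cluster: same preferred node, address and service order on
# both nodes; only the script names differ.
for platform in ("debian", "fedora"):
    line = haresources_line(platform, "ekman", "192.0.2.10", ["dns", "nfs"])
    print(platform, "->", line)
```

The idea is that both files describe the same resource group in the same order, so the behavior stays the same even though the text differs.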

One of the main problems in this situation is also the different UIDs 
for the owners of the sendmail and named services, and different 
versions of these services that are not fully compatible with the 
shared data files. It is not fully solved for me!


It is quite difficult to move one HA node to a new version (with the 
other one in production) and then update the second one while the first 
is in production!


Thanks to all for your help and valuable advice.

Patrick

--
===
|  Equipe M.O.S.T. | http://most.hmg.inpg.fr  |
|  Patrick BEGOU   |      |
|  LEGI| mailto:[EMAIL PROTECTED] |
|  BP 53 X | Tel 04 76 82 51 35   |
|  38041 GRENOBLE CEDEX| Fax 04 76 82 52 71   |
===


Re: [Linux-HA] Heartbeat compatibility question

2007-04-05 Thread Dejan Muhamedagic
On Thu, Apr 05, 2007 at 10:25:28AM +0200, Patrick Begou wrote:
 Just to give some final info on this thread:
 - Heartbeat version 2.0.8 does not work with heartbeat version 1.2.x. I 
 had to install identical versions on the two nodes (downgrading the 
 official FC6 version).

Interesting. I wonder if you ran into bugs here. BTW, I don't
think you posted any logs.

Also, why is it that you wanted to run 1.2 with 2.0?

 - version 1.2.x works without major problems between a node Fedora Core 
 6 X86_64 and a node Debian Sarge AMD64.

Good. heartbeat should run in a mix of platforms. I think I can
recall somebody having a solaris/linux cluster.

 The main difficulty is the haresources file. The documentation 
 should be changed a little, in the sense that
  the haresources files should not be identical but should provide the 
 same behavior, since resource names can differ between OSs:
  - bind9 in Debian is called named in FC6
  - nfs startup is different on the two OSs
  - etc.

Hmpf. In cases like this it would be best to use the resource agents
supplied with heartbeat. And, if those don't exist, to write one
or, as a last resort, adapt one of the two for the other
platform.

 One of the main problems in this situation is also the different UIDs 
 for the owners of the sendmail and named services, and different 
 versions of these services that are not fully compatible with the 
 shared data files. It is not fully solved for me!

:) You'd have to manually fix the IDs to match on all nodes.
Heartbeat can't help here.
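
As a starting point for that manual fix, here is a minimal sketch that lists accounts whose UIDs differ between two nodes, given the text of each node's passwd file. The function names and the sample entries are hypothetical; it only compares accounts that exist under the same name on both nodes, so accounts named differently on each OS still need checking by hand.

```python
def parse_passwd(text):
    """Map user name -> numeric UID from passwd-format text."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd fields: name:password:UID:GID:comment:home:shell
        if len(fields) >= 3 and fields[2].isdigit():
            uids[fields[0]] = int(fields[2])
    return uids


def uid_mismatches(passwd_a, passwd_b):
    """Return {user: (uid_a, uid_b)} for users on both nodes whose UIDs differ."""
    a, b = parse_passwd(passwd_a), parse_passwd(passwd_b)
    return {u: (a[u], b[u]) for u in sorted(a.keys() & b.keys()) if a[u] != b[u]}


# Hypothetical excerpts; real input would be each node's /etc/passwd.
node_a = """\
root:x:0:0:root:/root:/bin/bash
named:x:25:25:Named:/var/named:/sbin/nologin
"""
node_b = """\
root:x:0:0:root:/root:/bin/bash
named:x:101:101:Named:/var/cache/bind:/bin/false
"""

for user, (uid_a, uid_b) in uid_mismatches(node_a, node_b).items():
    print(f"{user}: UID {uid_a} on node A, UID {uid_b} on node B")
```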

 It is quite difficult to move one HA node to a new version (with the 
 other one in production) and then update the second one while the first 
 is in production!

Yes, for v1, but with the v2 (CRM) kind of cluster it's a piece of
cake.

 Thanks to all for your help and valuable advice.

Welcome. BTW, did you reach any conclusion regarding the crashes?

 

-- 
Dejan


Re: [Linux-HA] Heartbeat compatibility question

2007-04-05 Thread David Lee
On Thu, 5 Apr 2007, Dejan Muhamedagic wrote:

 On Thu, Apr 05, 2007 at 10:25:28AM +0200, Patrick Begou wrote:

  - version 1.2.x works without major problems between a node Fedora Core
  6 X86_64 and a node Debian Sarge AMD64.

 Good. heartbeat should run in a mix of platforms. I think I can
 recall somebody having a solaris/linux cluster.

(Drifting off-topic a little...)

To confirm: I have run a 1.2.x Solaris/Linux pair in the past.  Never
really pushed it hard, and only used it for simple IPaddr failover, but it
worked that far...


-- 

:  David LeeI.T. Service  :
:  Senior Systems ProgrammerComputer Centre   :
:  UNIX Team Leader Durham University :
:   South Road:
:  http://www.dur.ac.uk/t.d.lee/Durham DH1 3LE:
:  Phone: +44 191 334 2752  U.K.  :


Re: [Linux-HA] Heartbeat compatibility question

2007-04-05 Thread Patrick Begou

Dejan Muhamedagic wrote:

On Thu, Apr 05, 2007 at 10:25:28AM +0200, Patrick Begou wrote:


Just to give some final info on this thread:
- Heartbeat version 2.0.8 does not work with heartbeat version 1.2.x. I 
had to install identical versions on the two nodes (downgrading the 
official FC6 version).



Interesting. I wonder if you ran into bugs here. BTW, I don't
think you posted any logs.


Running hb_takeover/hb_standby hangs (the command freezes), as if the 
nodes couldn't communicate.


Running heartbeat -k hangs, as if it were waiting forever for a response 
from the other node.


I made these tests with neither iptables nor SELinux running.

I do not see anything in the log files (debug set to 1, then 2, then 3) 
except the pings to the nodes, which were successful.




Also, why is it that you wanted to run 1.2 with 2.0?




Smooth transition: moving one node to a different OS, testing, moving 
the services on it, updating the other node, re-activating the HA cluster.


Patrick



Re: [Linux-HA] Heartbeat compatibility question

2007-03-27 Thread Patrick Begou

Dejan Muhamedagic wrote:
 

Hmm, are you sure that your hardware is good and that it is well
supported under Linux? Haven't you been able to find the reason
for your computers crashing so often? BTW, you can always try the
vanilla kernel and then bother people on the kernel list ;-)



So many tests have been run on the hardware over these 2 years...
- hard drive checked (with the manufacturer's software): OK. New 
low-level initialisation and re-install. Same problem.

- memory test for 72 hours with the latest memtest version. No error.
- motherboard changed. Same problem.
- power supply changed. Same problem.
- added an additional fan to the unit (even though the room is 
maintained at 22 degrees). Same problem.


As the problem occurs on both hosts but (usually) not simultaneously, I 
think it is a software problem. But I cannot get any info when the host 
crashes (nothing in the logs and only partial info on the console).


Maybe I can try a new kernel once more before removing Debian...

Patrick