Hi. I agree with you that the admin should be given the choice of whether or not to adopt the paranoid approach of not failing over the services.
In the past I supported Tru64 clusters, and nowadays HP Serviceguard (on HP-UX and Linux). HP decided not to develop Serviceguard on Linux any further, so we are now starting to use Red Hat Cluster.

It seems that for very critical customers you need at least 2 fencing methods!

There is another thing that should be fixed ASAP: when using HALVM, there is a need to compare which file is newer, lvm.conf or initrd.img.

Regards,
Shalom

On Fri, Mar 5, 2010 at 10:01 PM, brem belguebli <brem.belgue...@gmail.com> wrote:
> Hi Corey,
>
> I was talking about a watchdog, not a kernel panic (sysreq...). On
> common (x86) hardware, most server vendors implement embedded hardware
> chips that could be used.
>
> Indeed, SCSI-3 reservation/registration could be combined with this whole
> stuff to be sure about the nodes' sanity.
>
> I think the choice should be given to the admin to adopt or not the
> paranoid approach of not failing over the services.
>
> 2010/3/4 Corey Kovacs <corey.kov...@gmail.com>:
> > Brem,
> >
> > It's been my understanding that the kernel panic technique you are
> > describing is essentially undesirable because the kernel is in an
> > unknown state. Basically anything can happen. The OS doesn't have to do a
> > sync, or an HBA a flush, etc. Since Red Hat isn't in the business of building
> > their own hardware like HP (DEC), Sun, IBM, they take the next best route to
> > ensure that nothing from that problematic machine can affect the storage, and
> > the only way to guarantee that is to remove power from the whole machine.
> >
> > VMS and Tru64 use the panic method, but the other nodes will issue a
> > reservation on the SCSI bus against that node to protect the storage. They
> > can do that because they know exactly how their hardware and implementation
> > of reservations work.
> >
> > Corey
> >
> > On Thu, Mar 4, 2010 at 5:32 AM, שלום קלמר <skle...@gmail.com> wrote:
> >>
> >> Thanks to all !!!!
> >>
> >> shalom.kle...@hp.com
> >>
> >> On Thu, Mar 4, 2010 at 12:00 AM, Lon Hohberger <l...@redhat.com> wrote:
> >>>
> >>> On Wed, 2010-03-03 at 13:10 +0200, שלום קלמר wrote:
> >>> > Hi.
> >>> >
> >>> > I have 2 power supplies. But if someone pulls the power
> >>> > cables by mistake, does that mean
> >>> > that the services will not fail over?
> >>>
> >>> The problem is:
> >>>
> >>> no power = no ping + no DRAC access
> >>> no network = no ping, no DRAC access
> >>>
> >>> If there's no power, then it is safe to fail over.
> >>>
> >>> If there is no network (and power is OK), then it is not safe to fail
> >>> over. Failover in this case is very likely to produce data corruption!
> >>>
> >>> Because we cannot tell which case happened, we do not fail over.
> >>>
> >>> -- Lon
> >>>
> >>> --
> >>> Linux-cluster mailing list
> >>> Linux-cluster@redhat.com
> >>> https://www.redhat.com/mailman/listinfo/linux-cluster
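P.S. Lon's reasoning in the quoted thread can be sketched in a few lines of Python. This is only an illustration of the argument, not how the cluster software is implemented. The names `Failure`, `observe`, `actually_safe` and `may_fail_over` are made up for the example, and the only symptoms modelled are the two Lon lists (ping and DRAC access).

```python
from enum import Enum


class Failure(Enum):
    """Hypothetical failure modes of the peer node (illustrative names)."""
    POWER_LOST = "no power"
    NETWORK_LOST = "no network"


def observe(failure):
    """Return what a surviving node can see of the failed peer.

    Both failure modes produce the same symptoms: the peer does not
    answer ping and its DRAC cannot be reached.
    """
    if failure in (Failure.POWER_LOST, Failure.NETWORK_LOST):
        return {"ping": False, "drac": False}
    raise ValueError(f"unknown failure mode: {failure!r}")


def actually_safe(failure):
    """Ground truth that the surviving node cannot see directly."""
    return failure is Failure.POWER_LOST


def may_fail_over(observation, fencing_confirmed):
    """Conservative policy: symptoms alone never justify a failover.

    The observation is deliberately ignored, because it is identical
    for the safe and the unsafe case. Only a positive confirmation
    that the peer was fenced allows the services to move.
    """
    return bool(fencing_confirmed)
```

Since `observe(Failure.POWER_LOST)` and `observe(Failure.NETWORK_LOST)` return the same dictionary while `actually_safe` differs between the two, no policy based on those symptoms alone can be right in both cases. That is why a second, independent fencing method matters for critical customers.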
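P.S. On the HALVM point in my message, the check I have in mind could look roughly like the sketch below. It only compares modification times. The default paths are assumptions (the initrd naming in particular differs between distributions and releases), and `initrd_is_stale` is a name I made up for the example. It is not part of any Red Hat tool.

```python
import os


def initrd_is_stale(lvm_conf="/etc/lvm/lvm.conf", initrd=None):
    """Return True if lvm.conf was modified after the initrd image was built.

    A True result suggests that the initrd may still carry an older copy
    of lvm.conf and should probably be rebuilt before relying on HALVM.
    """
    if initrd is None:
        # Assumed naming scheme for the running kernel's image;
        # adjust it to whatever your distribution uses.
        initrd = f"/boot/initrd-{os.uname().release}.img"
    # Compare modification times in seconds since the epoch.
    return os.path.getmtime(lvm_conf) > os.path.getmtime(initrd)
```

Something like this could run before the cluster services start and log a warning when it returns True. Whether the service should then refuse to start is again a choice I would leave to the admin.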