On 07/02/2013 04:32 AM, Hearns, John wrote:
> > -----Original Message-----
> > Our NoSQL database uses pub-sub for cluster membership, and we found
> > that the hosts with screwed up RAID controllers could easily stay in
> > the cluster even if they were really screwed up. We had to add some extra
> > watchdogs and tests that the system disk is working.
>
> I've seen that before with a RAID controller when one disk fails.
> System pings, OS and TCP-IP stack are up, but the system disk has been
> marked write-only.
> I'm still pretty amazed that the Linux OS soldiers on in this state.
> One to watch out for!
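
For illustration only, a minimal sketch of the kind of "is the system disk
working" test mentioned in the quoted text. It is not the code used in that
cluster; the function name disk_is_writable and the default directory
/var/tmp are assumptions made for this example. The idea is that a node
which still answers pings may fail a real create/write/fsync cycle on its
disk.

```python
import os
import tempfile


def disk_is_writable(directory="/var/tmp"):
    """Return True if a small file can be created, written, and fsynced.

    Hypothetical helper: the name and the default directory are
    assumptions for this sketch, not taken from any real watchdog.
    """
    try:
        # Fails if the directory is missing or the filesystem is read-only.
        fd, path = tempfile.mkstemp(prefix="diskcheck-", dir=directory)
    except OSError:
        return False
    try:
        os.write(fd, b"ok")
        # Force the data to the device so that I/O errors surface here.
        os.fsync(fd)
        return True
    except OSError:
        return False
    finally:
        # Best-effort cleanup; errors here do not change the result.
        try:
            os.close(fd)
        except OSError:
            pass
        try:
            os.unlink(path)
        except OSError:
            pass


if __name__ == "__main__":
    print("disk ok" if disk_is_writable() else "disk FAILED")
```

A check like this only catches errors that the kernel reports. A controller
that hangs I/O indefinitely would block the call, so a real watchdog would
presumably run it under a timeout and treat a timeout as a failure.
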

I have a feeling it's going to be big some day ...

This is one of the reasons why we like diskless/stateless boot.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: [email protected]
web  : http://scalableinformatics.com
       http://scalableinformatics.com/siflash
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
