On 2/19/07, matt jones <[EMAIL PROTECTED]> wrote:
> If one fails there are still 3; if another fails, there are still 2. I've also read elsewhere that if one fails, it can automatically recreate the image from the remaining ones on a spare node.
> [...]
> This approach is rather over the top, but it works, and works well.
Not sure about the Google folks, but we use a reliability model to calculate the number of nodes and their physical locations (continuous scheduling) so as to meet the expected reliability coefficient specified by the system operator/deployer/configurator (for the EE, SW and HW parts). The HDD is an unreliable system component with a roughly known (expected, at least) reliability; moreover, as we know, most HDDs expose SMART metrics, which are a good way to correct the live coefficients within the mathematical model in use. The outcome here is to use adaptive techniques. Google is probably doing it the same way - a good company anyhow... ta-da! :)

[EMAIL PROTECTED] – http://sgrid.sourceforge.net/ // (the perfect doc - the amazing work)
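For what it's worth, here is a minimal sketch (Python, purely illustrative numbers, not sgrid code) of the sort of calculation such a model boils down to: pick the smallest replica count whose joint-failure probability stays under the operator's reliability coefficient, and let SMART data nudge the per-drive failure probability.

def replicas_needed(p_fail, target_reliability):
    """Smallest replica count k such that the probability of losing all
    k copies (independent failures assumed) stays below 1 - target."""
    allowed_loss = 1.0 - target_reliability
    k = 1
    while p_fail ** k > allowed_loss:
        k += 1
    return k

# Hypothetical figures: a 5% chance that a given drive fails within the
# window of interest, against an operator-specified 99.999% reliability
# coefficient.
print(replicas_needed(0.05, 0.99999))   # -> 4

# A SMART pre-fail attribute could bump the live coefficient for a
# suspect drive, say tripling its assumed failure probability, which in
# turn raises the replica count the model asks for.
print(replicas_needed(0.15, 0.99999))   # -> 7

A real model would also fold in repair time and correlated failures; this sketch deliberately ignores both.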
