On Fri, 2009-01-23 at 22:36 -0500, erik quanstrom wrote:
> > You never know when end-to-end data consistency will start to really
> > matter. Just the other day I attended the cloud conference where 
> > some Amazon EC2 customers were swapping stories of Amazon's networking
> > "stack" malfunctioning and silently corrupting data that was written
> > onto EBS. All of a sudden, something like ZFS started to sound like 
> > a really good idea to them.
> 
> i know we need to bow down before zfs's greatness, but i still have
> some questions. ☺

Oh, come on! I said "something like ZFS" ;-) These guys are on
Linux, for crying out loud! They need to be saved one way
or the other (and Solaris at least has *some* AMIs available
on EC2).

> does ec2 corrupt all one's data en masse?  

From what I understood -- it was NOT en masse. But the scary
thing is that they only noticed because of dumb luck
(the app coredumped because the input it was getting was not
properly formatted or something).

> how do you do meaningful redundancy in a cloud where one controls 
> none of the failure-prone pieces.

Well, that's the very point I'm trying to make: you have
to be at least notified that your data got corrupted.
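
As a minimal sketch of what "being notified" could look like at the
application level (this is not how ZFS does it; the function names and
the ".sha256" sidecar file are hypothetical, chosen only for illustration):
compute a digest before the data leaves your hands, and check it again
whenever you read the data back.

```python
import hashlib


def write_with_digest(path, data):
    """Write data plus a sidecar file holding its SHA-256 hex digest.

    The ".sha256" sidecar naming is an assumption made for this sketch.
    """
    with open(path, "wb") as f:
        f.write(data)
    with open(path + ".sha256", "w") as f:
        f.write(hashlib.sha256(data).hexdigest())


def read_verified(path):
    """Return the data, or raise IOError if it no longer matches its digest."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path + ".sha256") as f:
        expected = f.read().strip()
    if hashlib.sha256(data).hexdigest() != expected:
        raise IOError("checksum mismatch: " + path)
    return data
```

A sidecar stored next to the data can of course be damaged too. ZFS
keeps each block's checksum in the parent block pointer rather than
beside the block itself, which is part of what makes it end-to-end.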

Once you do get notified -- you can recover in a variety
of ways: from simply re-uploading/re-generating
your data all the way to RAID-like things.

> finally, if p is the probability of a lost block, when does p become too
> large for zfs' redundancy to overcome failures? 

It depends on the vdev configuration. You can do simple mirroring
or you can do RAID-Z (which is more or less RAID-5 done properly).
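
One rough way to put numbers on this, assuming each block fails
independently with probability p (a big simplification: real failures
are often correlated, and RAID-Z's actual on-disk layout is more
involved than a fixed-width stripe). The function names below are
made up for this sketch.

```python
from math import comb


def mirror_loss(p, copies=2):
    """Probability that every copy of a mirrored block is bad."""
    return p ** copies


def single_parity_loss(p, width):
    """Probability that two or more of `width` blocks in a stripe are bad.

    A single-parity stripe can rebuild one bad block, so data is lost
    only when at least two blocks in the same stripe fail.
    """
    return sum(
        comb(width, k) * p ** k * (1 - p) ** (width - k)
        for k in range(2, width + 1)
    )


# Compare a 2-way mirror with a 5-wide single-parity stripe.
for p in (1e-6, 1e-3, 1e-1):
    print(p, mirror_loss(p), single_parity_loss(p, 5))
```

Under this model the loss probability for small p falls off roughly
as p squared in both cases, with the parity stripe paying an extra
factor for the number of block pairs that can fail together.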

> does this depend on the amount of i/o one does on the data or does 
> zfs scrub at a minimum rate anyway.  if it does, that would be expensive.  

You can do resilvering (fixing the data that is known to be
bad) or scrubbing (verifying and fixing *all* the data). You
can also configure things so that bad blocks either do or do
not trigger automatic resilvering. Does this answer your question?
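
To make the cost question concrete, here is a toy sketch of a scrub
over a mirror. It is only an illustration of the idea, not ZFS code;
the in-memory lists standing in for disks and the `scrub` function
itself are assumptions. Note that it touches every block on every
copy, so the work grows with the amount of data stored rather than
with the amount of I/O the application does.

```python
import hashlib


def digest(block):
    """SHA-256 digest of one block of bytes."""
    return hashlib.sha256(block).digest()


def scrub(copies, checksums):
    """Verify every block on every copy; repair bad ones from a good copy.

    copies: list of lists of bytes, one inner list per mirror side.
    checksums: expected digests, one per block index.
    Returns (repaired, unrecoverable) lists of block indices.
    """
    repaired, unrecoverable = [], []
    for i, want in enumerate(checksums):
        # Find any copy of block i that still matches its checksum.
        good = next((c[i] for c in copies if digest(c[i]) == want), None)
        if good is None:
            unrecoverable.append(i)
            continue
        # Overwrite every copy of block i that fails verification.
        for c in copies:
            if digest(c[i]) != want:
                c[i] = good
                if i not in repaired:
                    repaired.append(i)
    return repaired, unrecoverable
```

A block is reported as unrecoverable only when no side of the mirror
holds a copy that matches the stored checksum.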

> maybe ec2 is heads amazon wins, tails you lose?

The scariest takeaway from the conference was: with the economy
the way it is, physical on-site datacenters are becoming a 
luxury for all but the wealthiest companies. Thus, whether
we like it or not, virtual data centers are here to stay.

Thanks,
Roman.

