On Wed, Dec 02, 2015 at 01:31:46PM +0800, Wen Congyang wrote:
> +== Failure Handling ==
> +There are 6 internal errors when block replication is running:
> +1. I/O error on primary disk
> +2. Forwarding primary write requests failed
> +3. Backup failed
> +4. I/O error on secondary disk
> +5. I/O error on active disk
> +6. Making active disk or hidden disk empty failed
> +In case 1 and 5, we just report the error to the disk layer. In case 2, 3,
> +4 and 6, we just report block replication's error to FT/HA manager (which
> +decides when to do a new checkpoint, when to do failover).
> +There is no internal error when doing failover.

Not sure this is true.

Below it says the following for failover: "We will flush the Disk buffer
into Secondary Disk and stop block replication".  Flushing the disk
buffer can result in I/O errors.  This means that failover operations
are not guaranteed to succeed.
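
To make the concern concrete, here is a minimal Python sketch of a
failover that can fail during the flush.  It is only a model, not the
QEMU code: `failover`, `ReplicationError`, `FlakyDisk` and the
dict-based disk buffer are all hypothetical names chosen for
illustration.

```python
class ReplicationError(Exception):
    """Hypothetical error that would be reported to the FT/HA manager."""


def failover(disk_buffer, secondary_disk):
    """Flush buffered writes into the secondary disk, then drop the buffer.

    disk_buffer: dict mapping sector number -> bytes (hypothetical model).
    secondary_disk: object whose write(sector, data) may raise OSError.
    """
    for sector in sorted(disk_buffer):
        try:
            secondary_disk.write(sector, disk_buffer[sector])
        except OSError as exc:
            # The flush itself is I/O, so failover has an error path too.
            raise ReplicationError(
                f"failover flush failed at sector {sector}"
            ) from exc
    # Only reached when every buffered write landed on the secondary disk.
    disk_buffer.clear()


class FlakyDisk:
    """Hypothetical disk that fails writes to a chosen set of sectors."""

    def __init__(self, bad_sectors=()):
        self.data = {}
        self.bad_sectors = set(bad_sectors)

    def write(self, sector, data):
        if sector in self.bad_sectors:
            raise OSError(5, "Input/output error")
        self.data[sector] = data
```

With `FlakyDisk(bad_sectors=[2])`, the call to `failover` raises
`ReplicationError` partway through and leaves the buffer intact, which
is the internal error the document currently says cannot happen.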

In practice I think this is similar to a successful failover followed
immediately by I/O errors on the new Primary Disk.  It means that right
after failover there is another failure and the system may not be able
to continue.

So this really only matters in the case where there is a new Secondary
ready after failover.  In that case the user might expect failover to
continue to the new Secondary (Host 3):

   [X]        [X]
  Host 1 <-> Host 2 <-> Host 3
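
A rough Python sketch of the behaviour a user might expect in that
setup: try each remaining host in order and stop at the first one whose
failover succeeds.  `fail_over_chain`, `demo_promote` and the host names
are hypothetical and only illustrate the idea; nothing here is taken
from the patch.

```python
def fail_over_chain(candidates, promote):
    """Return the first host that promotes successfully, or None.

    candidates: host names in failover order (hypothetical).
    promote: callable that performs failover on one host and raises
             OSError if flushing its disk buffer fails.
    """
    for host in candidates:
        try:
            promote(host)
        except OSError as exc:
            # This host failed during failover; move on to the next one.
            print(f"failover to {host} failed: {exc}")
            continue
        return host
    return None


def demo_promote(host):
    """Hypothetical promote step in which Host 2 fails while flushing."""
    if host == "Host 2":
        raise OSError(5, "Input/output error while flushing disk buffer")
```

Calling `fail_over_chain(["Host 2", "Host 3"], demo_promote)` skips
Host 2 and ends up on Host 3, matching the diagram above.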
