On 09/29/2016 03:46 AM, zhanghailiang wrote:
> Introduce the design of COLO, and how to test it.
>
> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com>
> ---
>  docs/COLO-FT.txt | 190 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 190 insertions(+)
>  create mode 100644 docs/COLO-FT.txt
>
> +
> +== Background ==
> +Virtual machine (VM) replication is a well known technique for providing
> +application-agnostic software-implemented hardware fault tolerance
> +"non-stop service".

Do you want s/tolerance/tolerance, also known as/ ?

> +== Architecture ==
> +
> +The architecture of COLO is shown in the bellow diagram.

s/bellow diagram/diagram below/

> +It consists of a pair of networked physical nodes:
> +The primary node running the PVM, and the secondary node running the SVM
> +to maintain a valid replica of the PVM.
> +PVM and SVM execute in parallel and generate output of response packets for
> +client requests according to the application semantics.
> +
> +The incoming packets from the client or external network are received by the
> +primary node, and then forwarded to the secondary node, so that Both the PVM

s/Both/both/

> +and the SVM are stimulated with the same requests.
> +
> +COLO receives the outbound packets from both the PVM and SVM and compares them
> +before allowing the output to be sent to clients.
> +
> +The SVM is qualified as a valid replica of the PVM, as long as it generates
> +identical responses to all client requests. Once the differences in the outputs
> +are detected between the PVM and SVM, COLO withholds transmission of the
> +outbound packets until it has successfully synchronized the PVM state to the SVM.
> +
> +== Components introduction ==
> +
> +You can see there are several components in COLO's diagram of architecture.
> +Their functions are described as bellow.

s/as bellow/below/

> +
> +HeartBeat:
> +Runs on both the primary and secondary nodes, to periodically check platform
> +availability. When the primary node suffers a hardware fail-stop failure,
> +the heartbeat stops responding, the secondary node will trigger a failover
> +as soon as it determines the absence.
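
As an aside, the compare-then-release rule from the Architecture section quoted above can be sketched roughly as follows. This is only a toy illustration in Python, not anything taken from the patch or from QEMU: the class `ColoCompare`, the `checkpoint` callback, and the byte-string "packets" are all hypothetical names, and it assumes that the primary's packet is the one released once the SVM has been resynchronized.

```python
from typing import Callable, List


class ColoCompare:
    """Toy model of the output comparison described in the quoted text.

    Hypothetical illustration only: none of these names exist in QEMU.
    """

    def __init__(self, checkpoint: Callable[[], None]) -> None:
        self.checkpoint = checkpoint      # callback that syncs PVM state to the SVM
        self.released: List[bytes] = []   # packets actually sent to the client
        self.checkpoints = 0              # number of forced synchronizations

    def handle(self, pvm_packet: bytes, svm_packet: bytes) -> None:
        if pvm_packet != svm_packet:
            # Outputs diverged: withhold the packet until the SVM is resynchronized.
            self.checkpoint()
            self.checkpoints += 1
        # Assumption: the primary's packet is the one released to the client.
        self.released.append(pvm_packet)


def demo() -> ColoCompare:
    proxy = ColoCompare(checkpoint=lambda: None)
    proxy.handle(b"HTTP 200", b"HTTP 200")  # identical outputs, released directly
    proxy.handle(b"seq=7", b"seq=9")        # divergent outputs, checkpoint first
    return proxy
```

In this sketch a divergent pair costs exactly one checkpoint before the packet goes out, which is the property the quoted paragraph relies on to keep the SVM a valid replica.
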
> +
> +COLO disk Manager:
> +When primary VM writes data into image, the colo disk manger captures this data
> +and send it to secondary VM’s which makes sure the context of secondary VM's

s/send/sends/

> +image is consentient with the context of primary VM 's image.

s/consentient/consistent/
s/VM 's/VM's/

> +For more details, please refer to docs/block-replication.txt.
> +
> +Checkpoint/Failover Controller:
> +Modifications of save/restore flow to realize continuous migration,
> +to make sure the state of VM in Secondary side always be consistent with VM in

s/always be/is always/

> +Primary side.
> +
> +COLO Proxy:
> +Delivers packets to Primary and Seconday, and then compare the responses from
> +both side. Then decide whether to start a checkpoint according to some rules.
> +
> +Note:
> + a. HeartBeat is not been realized, so you need to trigger failover process

s/is/has/
s/realized/implemented yet/

Is this note going to be stale once heartbeat is implemented?

> + by using 'x-colo-lost-heartbeat' command.
> + b. COLO proxy compents is work-in-process, it only support periodic checkpoint

s/compents is/components are a/

> + mode now, just as Micro-checkpointing.
> +
> +3. On Primary VM's QEMU monitor, issue command:
> +{'execute':'qmp_capabilities'}
> +{ 'execute': 'human-monitor-command',
> + 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=colo-disk0,node-name=node0'}}

It would be really nice if we could get this done through QMP blockdev-add
instead of HMP drive_add.

> +
> +Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to
> +issue block related command to stop block replication.
> +Primary:
> + Remove the nbd child from the quorum:
> + { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}}
> + { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}}
> + Note: there is no qmp command to remove the blockdev now

Don't we have x-blockdev-del?

> +
> +Secondary:
> + The primary host is down, so we should do the following thing:
> + { 'execute': 'nbd-server-stop' }
> +
> +== TODO ==
> +1. Support continuously VM replication.

s/continuously/continuous/

> +2. Support shared storage.
> +3. Develop the heartbeat part.
> +4. Reduce checkpoint VM’s downtime while do checkpoint.

s/do/doing/

>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org