Hailiang Zhang <zhang.zhanghaili...@huawei.com> writes: > Hi Markus, > > On 2015/12/19 16:54, Markus Armbruster wrote: >> Jumping in at v12 for a bit of QAPI review (and whatever else catched my >> eye nearby), please pardon my ignorance of COLO in general, and previous >> review of this series in particular. >> > > Thanks all the same :) [...] >>> diff --git a/migration/colo.c b/migration/colo.c >>> index 0ab9618..0ce2a6e 100644 >>> --- a/migration/colo.c >>> +++ b/migration/colo.c >>> @@ -10,10 +10,12 @@ >>> * later. See the COPYING file in the top-level directory. >>> */ >>> >>> +#include <unistd.h> >>> #include "sysemu/sysemu.h" >>> #include "migration/colo.h" >>> #include "trace.h" >>> #include "qemu/error-report.h" >>> +#include "qemu/sockets.h" >>> >>> bool colo_supported(void) >>> { >>> @@ -34,6 +36,100 @@ bool migration_incoming_in_colo_state(void) >>> return mis && (mis->state == MIGRATION_STATUS_COLO); >>> } >>> >>> +static int colo_put_cmd(QEMUFile *f, uint32_t cmd) >>> +{ >>> + int ret; >>> + >>> + if (cmd >= COLO_COMMAND_MAX) { >> >> Needs a trivial rebase due to commit 7fb1cf1. >> > >>> + error_report("%s: Invalid cmd", __func__); >>> + return -EINVAL; >> >> Can this run in a context with different error handling needs? >> >> Or asked differently: who may ultimately handle this error? Whoever >> that may be, how does it need to report errors? >> >> Peeking ahead: the immediate callers don't handle this error, they just >> pass it on their callers. >> >> I'm asking because I'm trying to understand whether error_report() is >> appropriate here, or whether you need to use error_setg(), and leave the >> actual reporting to the spot that ultimately handles this error. >> > > Hmm, i know what you mean, we handled them all together after exit > from the colo process loop, > Use error_setg() seems to be a good idea, with this modification, we > can also drop the return > value. I will fix it in next version. > > >>> + } >>> + qemu_put_be32(f, cmd); >>> + qemu_fflush(f); >>> + >>> + ret = qemu_file_get_error(f); >>> + trace_colo_put_cmd(COLOCommand_lookup[cmd]); >>> + >>> + return ret; >>> +} >> >> Looks like @cmd is a COLOCommand. Why is the parameter type uint32_t? >> > > OK, i will change it to use enum COLOCommand. > >>> + >>> +static int colo_get_cmd(QEMUFile *f, uint32_t *cmd) >>> +{ >>> + int ret; >>> + >>> + *cmd = qemu_get_be32(f); >>> + ret = qemu_file_get_error(f); >>> + if (ret < 0) { >>> + return ret; >>> + } >>> + if (*cmd >= COLO_COMMAND_MAX) { >>> + error_report("%s: Invalid cmd", __func__); >>> + return -EINVAL; >>> + } >>> + trace_colo_get_cmd(COLOCommand_lookup[*cmd]); >>> + return 0; >>> +} >> >> Same question. >> >> The "get" in the name suggests the function returns the value gotten, >> like similarly named function elsewhere in migration/ do. >> > Do you mean it should return the cmd value directly, not though parameter way > ? > After we convert it to use error_setg() to indicate success or not, we > can do like that. > I will fix it.
Sounds good to me. [...] >>> diff --git a/qapi-schema.json b/qapi-schema.json >>> index c9ff34e..85f7800 100644 >>> --- a/qapi-schema.json >>> +++ b/qapi-schema.json >>> @@ -720,6 +720,31 @@ >>> { 'command': 'migrate-start-postcopy' } >>> >>> ## >>> +# @COLOCommand >>> +# >>> +# The commands for COLO fault tolerance >>> +# >>> +# @checkpoint-ready: SVM is ready for checkpointing >>> +# >>> +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing >>> +# >>> +# @checkpoint-reply: SVM gets PVM's checkpoint request >>> +# >>> +# @vmstate-send: VM's state will be sent by PVM. >>> +# >>> +# @vmstate-size: The total size of VMstate. >>> +# >>> +# @vmstate-received: VM's state has been received by SVM. >>> +# >>> +# @vmstate-loaded: VM's state has been loaded by SVM. >>> +# >>> +# Since: 2.6 >>> +## >>> +{ 'enum': 'COLOCommand', >>> + 'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply', >>> + 'vmstate-send', 'vmstate-size','vmstate-received', >>> + 'vmstate-loaded' ] } >>> + >> >> Space after 'vmstate-size', please. >> > >> 'vmstate-size' is not used in this patch. You may want to add it with >> its first use instead. >> > > OK, i will move it to the corresponding patch. > >> Should this enum really be named "COLOCommand"? 'checkpoint-ready', >> 'checkpoint-request', 'vmstate-send' look like commands to me, but the >> others look like replies. >> > > Yes, COLOCommand is not so exact. what about name it COLOProtocol? A protocol specifies valid sequences of messages, and what they mean. This isn't a protocol, it's a message within a protocol. COLOMessage? >> >>> # @MouseInfo: >>> # >>> # Information about a mouse device. >>> diff --git a/trace-events b/trace-events >>> index 5565e79..39fdd8d 100644 >>> --- a/trace-events >>> +++ b/trace-events >>> @@ -1579,6 +1579,8 @@ postcopy_ram_incoming_cleanup_join(void) "" >>> >>> # migration/colo.c >>> colo_vm_state_change(const char *old, const char *new) "Change >>> '%s' => '%s'" >>> +colo_put_cmd(const char *msg) "Send '%s' cmd" >>> +colo_get_cmd(const char *msg) "Receive '%s' cmd" >>> >>> # kvm-all.c >>> kvm_ioctl(int type, void *arg) "type 0x%x, arg %p" >> >> I like how this commit creates just the two state machines, and leaves >> filling in their actions to later commits. Helps ignorant rewiewers >> like me :) >> >> > > Do you mean i should split this patch ? Leave this patch with the > simplest colo process, > maybe just 'ready, request, reply', and add the other states in later patch? No, I *like* how you split up the work.