On 06/24/2015 10:07 PM, Dr. David Alan Gilbert wrote: > * Wen Congyang (ghost...@gmail.com) wrote: >> At 2015/6/19 18:49, Stefan Hajnoczi Wrote: >>> On Fri, Jun 19, 2015 at 08:54:56AM +0800, Wen Congyang wrote: >>>> On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote: >>>>> On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote: >>>>>> At 2015/6/18 20:55, Stefan Hajnoczi Wrote: >>>>>>> On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote: >>>>>>>> +void bdrv_connect(BlockDriverState *bs, Error **errp) >>>>>>>> +{ >>>>>>>> + BlockDriver *drv = bs->drv; >>>>>>>> + >>>>>>>> + if (drv && drv->bdrv_connect) { >>>>>>>> + drv->bdrv_connect(bs, errp); >>>>>>>> + } else if (bs->file) { >>>>>>>> + bdrv_connect(bs->file, errp); >>>>>>>> + } else { >>>>>>>> + error_setg(errp, "this feature or command is not currently >>>>>>>> supported"); >>>>>>>> + } >>>>>>>> +} >>>>>>>> + >>>>>>>> +void bdrv_disconnect(BlockDriverState *bs) >>>>>>>> +{ >>>>>>>> + BlockDriver *drv = bs->drv; >>>>>>>> + >>>>>>>> + if (drv && drv->bdrv_disconnect) { >>>>>>>> + drv->bdrv_disconnect(bs); >>>>>>>> + } else if (bs->file) { >>>>>>>> + bdrv_disconnect(bs->file); >>>>>>>> + } >>>>>>>> +} >>>>>>> >>>>>>> Please add doc comments describing the semantics of these commands. >>>>>> >>>>>> Where should it be documented? In the header file? >>>>> >>>>> block.h doesn't document prototypes in the header file, please document >>>>> the function definition in block.c. (QEMU is not consistent here, some >>>>> places do it the other way around.) >>>>> >>>>>>> Why are these operations needed when there is already a bs->drv == NULL >>>>>>> case which means the BDS is not ready for read/write? >>>>>>> >>>>>> >>>>>> The purpos is that: don't connect to nbd server when opening a nbd >>>>>> client. >>>>>> connect/disconnect >>>>>> to nbd server when we need to do it. >>>>>> >>>>>> IIUC, if bs->drv is NULL, it means that the driver is ejected? Here, >>>>>> connect/disconnect >>>>>> means that connect/disconnect to remote target(The target may be in >>>>>> another >>>>>> host). >>>>> >>>>> Connect/disconnect puts something on the QEMU command-line that isn't >>>>> ready at startup time. >>>>> >>>>> How about using monitor commands to add objects when needed instead? >>>>> >>>>> That is cleaner because it doesn't introduce a new state (which is only >>>>> implemented for nbd). >>>>> >>>> >>>> The problem is that, nbd client is one child of quorum, and quorum must >>>> have more >>>> than one child. The nbd server is not ready until colo is running. >>> >>> A monitor command to hot add/remove quorum children solves this problem >>> and could also be used in other scenarios (e.g. user decides to take a >>> quorum child offline). >>> >> >> For replication case, we always do checkpoint again and again after >> migration. If the disk is not synced before migration, we will use disk >> mirgation or mirror >> job to sync it. > > Can you document the way that you use disk migration or mirror, together > with your COLO setup? I think it would make it easier to understand this > restriction. > At the moment I don't understand how you can switch from doing a disk > migration > into COLO mode - there seems to be a gap between the end of disk migration > and the start > of COLO.
In my test, I use disk migration. After migration and before starting COLO, we will disable disk migration: + /* Disable block migration */ + s->params.blk = 0; + s->params.shared = 0; + qemu_savevm_state_begin(trans, &s->params); If the user uses mirror job, we don't cancel the mirror job now. > >> So we cannot start block replication when migration is running. We need >> that nbd >> client is not ready when migration is running, and it is ready between >> migration ends >> and checkpoint begins. Using a monotir command add the nbd client will cause >> larger >> downtime. So if the nbd client has been added(only not connect to the nbd >> server), >> we can connect to nbd server automatically. > > Without the disk migration or mirror, I can't see the need for the delayed > bdrv_connect. > I can see that you want to do a disconnect at failover, however you need > to take care because if the network is broken at the point of failover > you need to make sure nothing blocks. disk migration is needed if the disk is not synced before migration. Thanks Wen Congyang > > Dave > >> >> Thanks >> Wen Congyang > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > . >