On 09/23/2015 05:21 PM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (we...@cn.fujitsu.com) wrote:
>> On 09/22/2015 07:15 PM, Dr. David Alan Gilbert wrote:
>>> * Wen Congyang (we...@cn.fujitsu.com) wrote:
>>>> If quorum's child is broken, we can use a mirror job to replace it.
>>>> But sometimes the user only needs to remove the broken child, and
>>>> add it back later when the problem is fixed.
>>>>
>>>
>>> Hi,
>>>   Two questions:
>>>     1) Do you have an example of a pair of add/remove commands that work
>>>       together? (I'm not quite sure I understand where the ID for the remove
>>>       comes from).
>>
>> The command line:
>> -drive if=virtio,id=disk1,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/data/images/kvm/suse/suse11_3.img,children.0.driver=raw
>>
>> And the QMP monitor command:
>> {'execute':'blockdev-add', 'arguments':{'options':{'driver': 'raw', 'node-name': 'test1', 'file': {'driver': 'file', 'filename': '/dev/null'}, 'id': 'test11' }  } }
>> {'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add buddy driver=nbd,host=192.168.3.1,port=8889,export=colo-disk1,node-name=test2,if=none'}}
>> {'execute':'x-blockdev-child-add', 'arguments':{'parent': 'disk1', 'child': 'test1' } }
>> {'execute':'x-blockdev-child-add', 'arguments':{'parent': 'disk1', 'child': 'test2' } }
>> {'execute': 'x-blockdev-child-del', 'arguments': {'parent': 'disk1', 'child': 'test1' } }
>> {'execute': 'x-blockdev-child-del', 'arguments': {'parent': 'disk1', 'child': 'test2' } }
>>
>> Note: the QMP monitor command doesn't support nbd yet, so I use the HMP
>> command to add a BDS.
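
[Editor's sketch] The child add/remove commands above can also be generated as strict, double-quoted JSON. This is a minimal sketch only: the helper names `qmp_command` and `child_op` are hypothetical, and the `x-blockdev-child-add` / `x-blockdev-child-del` commands are assumed to exist only with this patch series applied.

```python
import json


def qmp_command(name, **arguments):
    """Serialize one QMP command as a single line of JSON."""
    return json.dumps({"execute": name, "arguments": arguments})


def child_op(action, parent, child):
    """Build an x-blockdev-child-add or x-blockdev-child-del command.

    `action` is "add" or "del"; `parent` and `child` are the IDs or
    node names used in the thread (e.g. "disk1", "test1").
    """
    if action not in ("add", "del"):
        raise ValueError("action must be 'add' or 'del'")
    return qmp_command(f"x-blockdev-child-{action}", parent=parent, child=child)


if __name__ == "__main__":
    # Same order as in the message: add both children, then remove both.
    for action in ("add", "del"):
        for child in ("test1", "test2"):
            print(child_op(action, "disk1", child))
```

Each printed line is one command that could be pasted into a QMP session after capabilities negotiation.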
> 
> Thank you; OK I see the format has changed quite a bit from the older
> version; this version is a lot nicer.
> 
>>>     2) If the child has failed and is not responding to block operations
>>>        at all (e.g. a networking failure to an nbd device which may take
>>>        minutes to time out), how do you recover? Flush or drain on the
>>>        devices hangs at that point.
>>
>> If the network fails, the kernel doesn't notify the application...
>>
>>>
>>> (I was trying to test recovery from a failed secondary using the July COLO
>>> release; but the primary gets stuck in bdrv_drain or bdrv_flush if I kill
>>> the secondary in the right way).
>>
>> IIRC, if qemu is killed, the connection is closed at the same time.
>> bdrv_drain() or bdrv_flush() should not get stuck.
> 
> I use kill -SIGSTOP on the secondary qemu, so I think that behaves as if the
> network had failed, or as if the secondary host had failed completely.  You
> do need some way to recover from the NBD server dying like that.

You use SIGSTOP, so there is no error on the connection, and the nbd client
will keep waiting for the reply. bdrv_drain() will never return in this case.
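
[Editor's sketch] The behaviour described here can be reproduced outside QEMU. In the sketch below a thread accepts a connection and then stays silent, standing in for a SIGSTOPped peer. The client sees neither a reply nor a closed connection, so only its own timeout ends the wait. All names are hypothetical and nothing here uses the NBD protocol or QEMU code.

```python
import socket
import threading


def silent_server(listener, release):
    """Accept one connection, then never reply (stands in for a stopped peer)."""
    conn, _ = listener.accept()
    release.wait()  # keep the connection open without sending anything
    conn.close()


def request_with_timeout(address, payload, timeout):
    """Send `payload` and wait up to `timeout` seconds for a reply."""
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            data = sock.recv(4096)
        except socket.timeout:
            return "timeout"  # no error on the connection, just no answer
        return "closed" if data == b"" else "reply"


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen(1)
    release = threading.Event()
    worker = threading.Thread(
        target=silent_server, args=(listener, release), daemon=True
    )
    worker.start()
    try:
        return request_with_timeout(listener.getsockname(), b"write", 0.5)
    finally:
        release.set()
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

Without the timeout argument the `recv()` call would block indefinitely, which is the analogue of bdrv_drain() never returning.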

> 
> It sounds like we need some way to be able to remove a blockdev that's failed
> like that; Paolo suggested the 'disk deadline' series could be used to time
> something like that out eventually, but maybe you need something that allows
> you to remove a child more forcibly.

Yes, but quorum waits for bdrv_co_write() to return, so it is very hard to
implement that now...


I guess 'disk deadline' can fix these two problems.
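
[Editor's sketch] To illustrate the general idea of a per-request deadline, here is a toy model of a write that goes to every child but stops waiting for any child that exceeds a deadline. It is not how quorum or the 'disk deadline' series is implemented; the function names and the delays are made up for illustration.

```python
import asyncio


async def write_child(name, delay):
    """Pretend to write to one child; `delay` models how long it takes to answer."""
    await asyncio.sleep(delay)
    return name


async def quorum_write(children, deadline):
    """Write to every child, giving up on any child that exceeds `deadline`."""

    async def guarded(name, delay):
        try:
            await asyncio.wait_for(write_child(name, delay), timeout=deadline)
            return name, "ok"
        except asyncio.TimeoutError:
            # The stuck request is cancelled instead of blocking the caller.
            return name, "timed out"

    results = await asyncio.gather(*(guarded(n, d) for n, d in children.items()))
    return dict(results)


def demo():
    # "test2" models a child whose server has been stopped and never answers.
    children = {"test1": 0.01, "test2": 3600.0}
    return asyncio.run(quorum_write(children, deadline=0.2))


if __name__ == "__main__":
    print(demo())
```

With a result like this, the caller could mark the timed-out child as failed and remove it, rather than waiting for the write to return.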

Thanks
Wen Congyang

> 
> Dave
> 
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
>>>
>>>
>>>> It is based on the following patch:
>>>> http://lists.nongnu.org/archive/html/qemu-devel/2015-09/msg04579.html
>>>>
>>>> ChangeLog:
>>>> v5:
>>>> 1. Address Eric Blake's comments
>>>> v4:
>>>> 1. drop nbd driver's implementation. We can use human-monitor-command
>>>>    to do it.
>>>> 2. Rename the command name.
>>>> v3:
>>>> 1. Don't open BDS in bdrv_add_child(). Use the existing BDS which is
>>>>    created by the QMP command blockdev-add.
>>>> 2. The driver NBD can support filename, path, host:port now.
>>>> v2:
>>>> 1. Use bdrv_get_device_or_node_name() instead of new function
>>>>    bdrv_get_id_or_node_name()
>>>> 2. Update the error message
>>>> 3. Update the documents in block-core.json
>>>>
>>>> Wen Congyang (4):
>>>>   Add new block driver interface to add/delete a BDS's child
>>>>   quorum: implement bdrv_add_child() and bdrv_del_child()
>>>>   qmp: add monitor command to add/remove a child
>>>>   hmp: add monitor command to add/remove a child
>>>>
>>>>  block.c                   | 56 ++++++++++++++++++++++++++++++++++--
>>>>  block/quorum.c            | 72 +++++++++++++++++++++++++++++++++++++++++++++--
>>>>  blockdev.c                | 48 +++++++++++++++++++++++++++++++
>>>>  hmp-commands.hx           | 28 ++++++++++++++++++
>>>>  hmp.c                     | 20 +++++++++++++
>>>>  hmp.h                     |  2 ++
>>>>  include/block/block.h     |  8 ++++++
>>>>  include/block/block_int.h |  5 ++++
>>>>  qapi/block-core.json      | 34 ++++++++++++++++++++++
>>>>  qmp-commands.hx           | 61 +++++++++++++++++++++++++++++++++++++++
>>>>  10 files changed, 329 insertions(+), 5 deletions(-)
>>>>
>>>> -- 
>>>> 2.4.3
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>>> .
>>>
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> .
> 

