attach_aio_context()

ronnie sahlberg Thu, 08 May 2014 07:53:55 -0700

On Thu, May 8, 2014 at 4:33 AM, Stefan Hajnoczi <stefa...@redhat.com> wrote:
> On Wed, May 07, 2014 at 04:09:27PM +0200, Peter Lieven wrote:
>> On 07.05.2014 12:29, Paolo Bonzini wrote:
>> >Il 07/05/2014 12:07, Stefan Hajnoczi ha scritto:
>> >>On Fri, May 02, 2014 at 12:39:06AM +0200, Peter Lieven wrote:
>> >>>>+static void iscsi_attach_aio_context(BlockDriverState *bs,
>> >>>>+                                     AioContext *new_context)
>> >>>>+{
>> >>>>+    IscsiLun *iscsilun = bs->opaque;
>> >>>>+
>> >>>>+    iscsilun->aio_context = new_context;
>> >>>>+    iscsi_set_events(iscsilun);
>> >>>>+
>> >>>>+#if defined(LIBISCSI_FEATURE_NOP_COUNTER)
>> >>>>+    /* Set up a timer for sending out iSCSI NOPs */
>> >>>>+    iscsilun->nop_timer = aio_timer_new(iscsilun->aio_context,
>> >>>>+                                        QEMU_CLOCK_REALTIME, SCALE_MS,
>> >>>>+                                        iscsi_nop_timed_event, iscsilun);
>> >>>>+    timer_mod(iscsilun->nop_timer,
>> >>>>+              qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + NOP_INTERVAL);
>> >>>>+#endif
>> >>>>+}
>> >>>
>> >>>Is it still guaranteed that iscsi_nop_timed_event for a target is not
>> >>>invoked while we are in another function/callback of the iscsi driver
>> >>>for the same target?
>> >
>> >Yes, since the timer is in the same AioContext as the iscsi driver 
>> >callbacks.
>>
>>
>> Ok. Stefan: What MUST NOT happen is that the timer gets fired while we
>> are in iscsi_service.
>> As Paolo outlined, this cannot happen, right?
>
> Okay, I think we're safe then.  The timer can only be invoked during
> aio_poll() event loop iterations.  It cannot be invoked while we're
> inside iscsi_service().
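
To illustrate the point above, here is a minimal sketch of a
single-threaded event loop in which handlers are dispatched one at a time,
so a timer callback cannot run nested inside a service callback. This is
not QEMU code: MiniContext, DriverState, mini_poll(), service(),
nop_timer() and run_demo() are all hypothetical names, and the real
aio_poll() is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature event loop; none of these names are QEMU's. */
typedef void (*callback_fn)(void *opaque);

typedef struct {
    callback_fn fd_handler;    /* stands in for the iscsi_service() path */
    callback_fn timer_handler; /* stands in for iscsi_nop_timed_event() */
    void *opaque;
    bool fd_ready;
    bool timer_expired;
} MiniContext;

typedef struct {
    MiniContext *ctx;
    bool in_service;        /* true only while service() is running */
    int service_calls;
    int timer_calls;
    int nested_timer_calls; /* timer calls seen while in_service was true */
} DriverState;

/* Socket-event handler. The timer "expires" while it runs, but that only
 * sets a flag; nothing is dispatched from inside this function. */
void service(void *opaque)
{
    DriverState *s = opaque;
    s->in_service = true;
    s->ctx->timer_expired = true;
    s->service_calls++;
    s->in_service = false;
}

/* Timer handler. Records whether it was ever entered during service(). */
void nop_timer(void *opaque)
{
    DriverState *s = opaque;
    if (s->in_service) {
        s->nested_timer_calls++;
    }
    s->timer_calls++;
}

/* One loop iteration: each pending handler runs to completion before the
 * next one starts, all on the same thread. */
void mini_poll(MiniContext *ctx)
{
    if (ctx->fd_ready) {
        ctx->fd_ready = false;
        ctx->fd_handler(ctx->opaque);
    }
    if (ctx->timer_expired) {
        ctx->timer_expired = false;
        ctx->timer_handler(ctx->opaque);
    }
}

/* Runs a single iteration with a ready socket and returns the counters. */
DriverState run_demo(void)
{
    MiniContext ctx = { service, nop_timer, NULL, true, false };
    DriverState s = { &ctx, false, 0, 0, 0 };
    ctx.opaque = &s;
    mini_poll(&ctx);
    s.ctx = NULL; /* ctx goes out of scope; do not hand out a stale pointer */
    return s;
}
```

In this sketch the timer handler runs only after service() has returned,
which mirrors the guarantee discussed here as long as the timer and the fd
handlers belong to the same context.
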
>
>> >>BTW, is iscsi_reconnect() the right libiscsi interface to use since it
>> >>is synchronous?  It seems like this would block QEMU until the socket
>> >>has connected!  The guest would be frozen.
>> >
>> >There is no asynchronous interface yet for reconnection, unfortunately.
>>
>> We initiate the reconnect after we miss a few NOP replies. So the target
>> has already been down for approx. 30 seconds.
>> Every process inside the guest is already hanging or has timed out.
>>
>> If I understand correctly, with the new patches only the communication
>> with this target hangs, doesn't it?
>> So what benefit would an asynchronous reconnect have?
>
> Asynchronous reconnect is desirable:
>
> 1. The QEMU monitor is blocked while we're waiting for the iSCSI target
>    to accept our reconnect.  This means the management stack (libvirt)
>    cannot control QEMU until we time out or succeed.
>
> 2. The guest is totally frozen - cannot execute instructions - because
>    it will soon reach a point in the code that locks the QEMU global
>    mutex (which is being held while we reconnect to the iSCSI target).
>
>    This may be okayish for guests where the iSCSI LUN contains the
>    "main" data that is being processed.  But what if an iSCSI LUN was
>    just attached to a guest that is also doing other things that are
>    independent (e.g. serving a website, processing data from a local
>    disk, etc) - now the reconnect causes downtime for the entire guest.
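
For reference, the socket-level technique behind a non-blocking reconnect
can be sketched with plain POSIX calls. This is only an illustration under
assumptions: start_connect() and finish_connect() are hypothetical helper
names, not libiscsi or QEMU functions, and how libiscsi would expose an
asynchronous reconnect is not something this thread settles.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: begin a TCP connect without blocking the caller.
 * Returns a socket fd (connect may still be in progress) or -1 on error. */
int start_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int flags;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) {
        return -1;
    }

    /* Put the socket into non-blocking mode before calling connect(). */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);
        return -1;
    }

    /* EINPROGRESS is the expected result: the handshake continues in the
     * background and the fd becomes writable once it has finished. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 &&
        errno != EINPROGRESS) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: call once the event loop reports fd as writable.
 * Returns 0 if connected, a positive errno value if the connect failed,
 * or -1 if the status could not be read. */
int finish_connect(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0) {
        return -1;
    }
    return err;
}
```

The idea would be to register the returned fd for write readiness in the
event loop and call finish_connect() from that handler, so the monitor and
the guest keep running while the target is unreachable.
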


I will look into making the reconnect async over the next few days.

Re: [Qemu-devel] [PATCH 08/22] iscsi: implement .bdrv_detach/attach_aio_context()
