On 21.09.2020 13:40, Julien Grall wrote:
> (+ Xen-devel)
> 
> Sorry I forgot to CC xen-devel.
> 
> On 21/09/2020 12:38, Julien Grall wrote:
>> Hi all,
>>
>> I have started to look at the deferral code (see 
>> vcpu_start_shutdown_deferral()) because we need it for LiveUpdate and 
>> Arm will soon use it.
>>
>> The current implementation uses an smp_mb() to ensure ordering
>> between a write and a subsequent read. The code looks roughly like
>> this (I have slightly adapted it to make my question more obvious):
>>
>> domain_shutdown()
>>      d->is_shutting_down = 1;
>>      smp_mb();
>>      if ( !vcpu0->defer_shutdown )
>>      {
>>        vcpu_pause_nosync(vcpu0);
>>        vcpu0->paused_for_shutdown = 1;
>>      }
>>
>> vcpu_start_shutdown_deferral()
>>      vcpu0->defer_shutdown = 1;
>>      smp_mb();
>>      if ( unlikely(d->is_shutting_down) )
>>        vcpu_check_shutdown(vcpu0);
>>
>>      return vcpu0->defer_shutdown;
>>
>> smp_mb() should only guarantee ordering (this may be stronger on some 
>> arch), so I think there is a race between the two functions.
>>
>> It would be possible to pause the vCPU in domain_shutdown() because 
>> vcpu0->defer_shutdown wasn't yet seen.
>>
>> Equally, vcpu_start_shutdown_deferral() may not see d->is_shutting_down 
>> and therefore Xen may continue to send the I/O. Yet the vCPU will be 
>> paused so the I/O will never complete.

Individually for each of these I agree. But isn't the goal merely
to prevent both from entering their if() bodies at the same time?
And doesn't the combined effect of the two barriers prevent just
this?
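
For illustration only: the two functions form the classic
"store buffering" pattern (each side writes its own flag, then reads
the other side's flag), and the effect of the barriers can be checked
by brute force. The following is a minimal sketch in plain C, not Xen
code. It assumes smp_mb() can be modelled as forbidding a CPU's store
from being reordered after its own subsequent load, and that without
the barrier such reordering is possible. All names (enum op,
order_is_bad(), count_bad_orders()) are made up for this example. The
"bad" outcome is domain_shutdown() not seeing defer_shutdown (so it
pauses the vCPU) while vcpu_start_shutdown_deferral() does not see
is_shutting_down either.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not Xen code: the four memory accesses of interest.
 * CPU A stands for domain_shutdown(), CPU B for
 * vcpu_start_shutdown_deferral(). */
enum op {
    A_STORE,  /* d->is_shutting_down = 1  */
    A_LOAD,   /* read vcpu0->defer_shutdown */
    B_STORE,  /* vcpu0->defer_shutdown = 1 */
    B_LOAD,   /* read d->is_shutting_down  */
    NR_OPS
};

/* Replay one global order of the four accesses. Returns true if CPU A
 * would pause the vCPU while CPU B missed the shutdown. */
static bool order_is_bad(const enum op order[NR_OPS])
{
    bool is_shutting_down = false, defer_shutdown = false;
    bool a_saw_defer = false, b_saw_shutdown = false;

    for (int i = 0; i < NR_OPS; i++) {
        switch (order[i]) {
        case A_STORE: is_shutting_down = true;           break;
        case A_LOAD:  a_saw_defer = defer_shutdown;      break;
        case B_STORE: defer_shutdown = true;             break;
        case B_LOAD:  b_saw_shutdown = is_shutting_down; break;
        default:      assert(0 && "unexpected op");      break;
        }
    }
    return !a_saw_defer && !b_saw_shutdown;
}

/* Index at which a given access appears in the global order. */
static int position(const enum op order[NR_OPS], enum op which)
{
    for (int i = 0; i < NR_OPS; i++)
        if (order[i] == which)
            return i;
    return -1;
}

/* Enumerate all orders of the four accesses and count the bad ones.
 * With barriers (modelling assumption): each CPU's store must precede
 * its own load. Without: any order is allowed. */
int count_bad_orders(bool with_barriers)
{
    int bad = 0;

    for (int a = 0; a < NR_OPS; a++)
    for (int b = 0; b < NR_OPS; b++)
    for (int c = 0; c < NR_OPS; c++)
    for (int d = 0; d < NR_OPS; d++) {
        if (a == b || a == c || a == d || b == c || b == d || c == d)
            continue; /* not a permutation */

        enum op order[NR_OPS] = { (enum op)a, (enum op)b,
                                  (enum op)c, (enum op)d };

        if (with_barriers &&
            (position(order, A_STORE) > position(order, A_LOAD) ||
             position(order, B_STORE) > position(order, B_LOAD)))
            continue; /* order forbidden by the modelled barriers */

        if (order_is_bad(order))
            bad++;
    }
    return bad;
}
```

Under these assumptions count_bad_orders(true) is 0 while
count_bad_orders(false) is 6 (out of 24 orders), i.e. with both
barriers in place at least one side observes the other's write. This
says nothing about what happens after the checks, only about the two
reads themselves.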

>> I am not fully familiar with the IOREQ code, but it sounds to me this is 
>> not the behavior that was intended. Can someone more familiar with the 
>> code confirm it?

As to the original intentions, I'm afraid that among the people still
listed as maintainers for any part of Xen, Tim may be the only one
who was possibly involved in the original introduction of this
model, and hence who may know of the precise intentions and
considerations back at the time.

As far as I'm concerned, to be honest I don't think I've ever
managed to fully convince myself of the correctness of the
model in the general case. But since it did look good enough
for x86 ...

Jan
