> -----Original Message-----
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Saturday, March 18, 2017 9:55 PM
> To: gre...@linuxfoundation.org; KY Srinivasan <k...@microsoft.com>;
> Haiyang Zhang <haiya...@microsoft.com>
> Cc: sta...@vger.kernel.org; de...@linuxdriverproject.org; Stephen
> Hemminger <sthem...@microsoft.com>
> Subject: [PATCH v2] vmbus: fix missed ring events on boot
> 
> During initialization, the channel initialization code schedules the
> tasklet to scan the VMBUS receive event page (i.e. simulates an
> interrupt). The problem was that it invokes the tasklet on a different
> CPU from where it normally runs and therefore if an event is present,
> it will clear the bit but not find the associated channel.
> 
> This can lead to missed events, typically stuck tasks, during bootup
> when sub channels are being initialized. Typically seen as stuck
> boot with 8 or more CPU's.
> 
> This patch is not necessary for upstream (4.11 and later) since
> commit 631e63a9f346 ("vmbus: change to per channel tasklet").
> This changed vmbus code to get rid of common tasklet which
> caused the problem.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 638fea33aee8 ("Drivers: hv: vmbus: fix the race when querying &
> updating the percpu list")
> Signed-off-by: Stephen Hemminger <sthem...@microsoft.com>
> ---
> v2 - simplified version (only need to change one function).
>      also don't need to wait for tasklet to be scheduled on other CPU
>      add Fixes tag.
> 
>  drivers/hv/channel_mgmt.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 0af7e39006c8..63c903b00a58 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -361,8 +361,19 @@ void hv_event_tasklet_enable(struct
> vmbus_channel *channel)
>       tasklet = hv_context.event_dpc[channel->target_cpu];
>       tasklet_enable(tasklet);
> 
> -     /* In case there is any pending event */
> -     tasklet_schedule(tasklet);
> +     /*
> +      * In case there is any pending event schedule a rescan
> +      * but must be on the correct CPU for the channel.
> +      */
> +     if (channel->target_cpu == get_cpu()) {
> +             put_cpu();

We could be preempted here and end up scheduling the taklet on the wrong CPU.

> +             tasklet_schedule(tasklet);
> +     } else {
> +             smp_call_function_single(channel->target_cpu,
> +                                      (smp_call_func_t)tasklet_schedule,
> +                                      tasklet, false);
> +             put_cpu();
> +     }

Why not call put_cpu() at the end of the block to close the preemption window 
we have
earlier.

K. Y
>  }
> 
>  void hv_process_channel_removal(struct vmbus_channel *channel, u32
> relid)
> --
> 2.11.0

_______________________________________________
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

Reply via email to