On Thu, May 14, 2020 at 08:11:24AM -0700, Divya Indi wrote:
>  static void ib_nl_set_path_rec_attrs(struct sk_buff *skb,
>                                    struct ib_sa_query *query)
>  {
> @@ -889,6 +904,15 @@ static int ib_nl_make_request(struct ib_sa_query *query, gfp_t gfp_mask)
>               spin_lock_irqsave(&ib_nl_request_lock, flags);
>               list_del(&query->list);
>               spin_unlock_irqrestore(&ib_nl_request_lock, flags);
> +     } else {
> +             set_bit(IB_SA_NL_QUERY_SENT, (unsigned long *)&query->flags);
> +
> +             /*
> +              * If response is received before this flag was set
> +              * someone is waiting to process the response and release the
> +              * query.
> +              */
> +             wake_up(&wait_queue);
>       }

As far as I can see, the issue here is that the request is put into
the ib_nl_request_list before it is really ready to be in that list,
i.e. before ib_nl_send_msg() has actually completed and ownership of
the memory has been transferred.
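
For reference, the current ordering in ib_nl_make_request() is roughly
this (paraphrased from sa_query.c, not the exact code):

	/* The request is published before it has been sent */
	spin_lock_irqsave(&ib_nl_request_lock, flags);
	list_add_tail(&query->list, &ib_nl_request_list);
	spin_unlock_irqrestore(&ib_nl_request_lock, flags);

	/*
	 * Window: once the request is on the list the response handler
	 * can find it, process the response and release the query while
	 * ib_nl_send_msg() is still running.
	 */
	ret = ib_nl_send_msg(query, gfp_mask);
	if (ret) {
		/* This error unwind races with the response handler */
		spin_lock_irqsave(&ib_nl_request_lock, flags);
		list_del(&query->list);
		spin_unlock_irqrestore(&ib_nl_request_lock, flags);
	}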

It appears to me the reason for this is simply that a spinlock is
used for ib_nl_request_lock, and so it cannot be held across
ib_nl_send_msg(), which may sleep.

Convert that lock to a mutex and move the list_add to after
ib_nl_send_msg() has succeeded, and this bug should be fixed without
adding jaunty atomics or a wait queue.
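
Something along these lines, perhaps (untested, just to show the
shape; names like ib_nl_request_list, ib_nl_wq, ib_nl_timed_work and
sa_local_svc_timeout_ms are as I remember them from sa_query.c and may
not be exact):

static DEFINE_MUTEX(ib_nl_request_lock);	/* was a spinlock */

static int ib_nl_make_request(struct ib_sa_query *query, gfp_t gfp_mask)
{
	unsigned long delay;
	int ret;

	INIT_LIST_HEAD(&query->list);
	query->seq = (u32)atomic_inc_return(&ib_nl_sa_request_seq);

	/*
	 * Hold the lock across the send so the response handler cannot
	 * look at the list until the request is fully ready.
	 */
	mutex_lock(&ib_nl_request_lock);
	ret = ib_nl_send_msg(query, gfp_mask);
	if (ret) {
		/* Never published, so there is nothing to unwind */
		mutex_unlock(&ib_nl_request_lock);
		return -EIO;
	}

	/* Only a request that was actually sent goes on the list */
	delay = msecs_to_jiffies(sa_local_svc_timeout_ms);
	query->timeout = delay + jiffies;
	list_add_tail(&query->list, &ib_nl_request_list);
	/* Start the timeout if this is the only request */
	if (ib_nl_request_list.next == &query->list)
		queue_delayed_work(ib_nl_wq, &ib_nl_timed_work, delay);
	mutex_unlock(&ib_nl_request_lock);

	return 0;
}

The response and timeout paths take the same lock, so they can never
observe a half-initialized request, and the racy error unwind simply
goes away because a failed request is never published at all.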

This is a 'racy error unwind' bug class...

Jason
