On 06/29/2018 11:49 AM, Willem de Bruijn wrote:
>>>> diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
>>>> +static void report_sock_error(struct sk_buff *skb, u32 err, u8 code)
>>>> +{
>>>> +       struct sock_exterr_skb *serr;
>>>> +       ktime_t txtime = skb->tstamp;
>>>> +
>>>> +       if (!skb->sk || !(skb->sk->sk_txtime_flags & 
>>>> SK_TXTIME_RECV_ERR_MASK))
>>>> +               return;
>>>> +
>>>> +       skb = skb_clone_sk(skb);
>>>> +       if (!skb)
>>>> +               return;
>>>> +
>>>> +       sock_hold(skb->sk);
>>>
>>> Why take an extra reference? The skb holds a ref on the sk.
>>
>>
>> Yes, the cloned skb holds a ref on the socket, but the documentation of
>> skb_clone_sk() makes this explicit suggestion:
>>
>> (...)
>>  * When passing buffers allocated with this function to sock_queue_err_skb
>>  * it is necessary to wrap the call with sock_hold/sock_put in order to
>>  * prevent the socket from being released prior to being enqueued on
>>  * the sk_error_queue.
>>  */
>>
>> which I believe is here just so we are protected against a possible race 
>> after
>> skb_orphan() is called from sock_queue_err_skb(). Please let me know if I'm
>> misreading anything.
> 
> Yes, indeed. Code only has to worry about that if there are no
> concurrent references
> on the socket.
> 
> I may be mistaken, but I believe that this complicated logic exists
> only for cases where
> the timestamp may be queued after the original skb has been released.
> Specifically,
> when a tx timestamp is returned from a hardware device after transmission of 
> the
> original skb. Then the cloned timestamp skb needs its own reference on
> the sk while
> it is waiting for the timestamp data (i.e., until the device
> completion arrives) and then
> we need a temporary extra ref to work around the skb_orphan in
> sock_queue_err_skb.
> 
> Compare skb_complete_tx_timestamp with skb_tstamp_tx. The second is used in
> the regular datapath to clone an skb and queue it on the error queue
> immediately,
> while holding the original skb. This does not call skb_clone_sk and
> does not need the
> extra sock_hold. This should be good enough for this code path, too.
> As kb holds a
> ref on skb->sk, the socket cannot go away in the middle of report_sock_error.


Oh, that makes sense. Great, I will give this a try and add it to the v2.

Thanks,
Jesus

Reply via email to