From: Hamish Martin <[email protected]>
Date: Thu,  9 Jul 2020 09:06:44 +1200

> A scenario has been observed where a 'bc_init' message for a link is not
> retransmitted if it fails to be received by the peer. This leads to the
> peer never establishing the link fully and it discarding all other data
> received on the link. In this scenario the message is lost in transit to
> the peer.
> 
> The issue is traced to the 'nxt_retr' field of the skb not being
> initialised for links that aren't a bc_sndlink. As a result, the
> comparison in tipc_link_advance_transmq() that gates whether to attempt
> retransmission of a message gives unpredictable results.
> Depending on the value of 'jiffies', this comparison:
>     time_before(jiffies, TIPC_SKB_CB(skb)->nxt_retr)
> may return true or false given that 'nxt_retr' remains at the
> uninitialised value of 0 for non-bc_sndlinks.
> 
> This is most noticeable shortly after boot, when jiffies is initialised
> to a high value (to flush out rollover bugs) and we compare a jiffies of,
> say, 4294940189 to zero. In that case time_before() returns 'true', so
> the skb is not retransmitted.
> 
> The fix is to ensure that all skbs have a valid 'nxt_retr' time set for
> them and this is achieved by refactoring the setting of this value into
> a central function.
> With this fix, transmission losses of 'bc_init' messages do not stall
> the link establishment forever because the 'bc_init' message is
> retransmitted and the link eventually establishes correctly.
> 
> Fixes: 382f598fb66b ("tipc: reduce duplicate packets for unicast traffic")
> Acked-by: Jon Maloy <[email protected]>
> Signed-off-by: Hamish Martin <[email protected]>

Applied and queued up for -stable, thank you.
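
For readers following the thread, the time_before() behaviour described in the
quoted commit message can be reproduced in user space. The following is only a
sketch under stated assumptions: it models a 32-bit jiffies counter with
uint32_t, time_before32() imitates the kernel's wraparound-safe comparison
rather than using the real macro, and retransmit_deadline() is a hypothetical
helper standing in for "give every skb a valid nxt_retr". None of these names
come from the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model of time_before(a, b) for a 32-bit counter:
 * true when 'a' is earlier than 'b', allowing for wraparound.
 * The unsigned difference is reinterpreted as signed, as the kernel
 * macros do with 'long' (this relies on two's-complement conversion,
 * which common compilers provide).
 */
bool time_before32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper: compute a valid retransmit deadline as
 * "now plus an interval". Unsigned addition wraps, which is fine
 * because time_before32() is wraparound-safe.
 */
uint32_t retransmit_deadline(uint32_t now, uint32_t interval)
{
    return now + interval;
}

/*
 * Models the gate described in the commit message: retransmission is
 * skipped while 'now' is still before the skb's nxt_retr value.
 */
bool retransmit_is_skipped(uint32_t now, uint32_t nxt_retr)
{
    return time_before32(now, nxt_retr);
}
```

With nxt_retr left at 0 and a counter value of 4294940189, the difference
reinterpreted as signed is negative, so retransmit_is_skipped() reports true
and stays that way until the counter wraps past zero. With a deadline from
retransmit_deadline(), the gate opens once the interval has elapsed, even when
the deadline itself wraps.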
