On 10/24/25 10:17 AM, Eelco Chaudron wrote:
> On 22 Oct 2025, at 19:58, Ilya Maximets wrote:
>
>> On 10/22/25 6:28 PM, Mike Pattrick wrote:
>>> On Fri, Oct 17, 2025 at 1:15 PM Ilya Maximets <[email protected]> wrote:
>>>
>>>     In a few places the list sizes are checked without holding the lock.
>>>     While this probably speeds up the processing a bit, it's technically
>>>     incorrect as the access is not atomic.  Caught by the thread-safety
>>>     analyzer of clang 21, as it now properly supports GUARDED annotations
>>>     on structure fields:
>>>
>>>     lib/ipf.c:1117:33:
>>>       error: reading the value pointed to by 'frag_exp_list' requires
>>>       holding any mutex [-Werror,-Wthread-safety-analysis]
>>>        1117 |     if (ovs_list_is_empty(&ipf->frag_exp_list)) {
>>>             |                            ^
>>>     lib/ipf.c:1154:33:
>>>       error: reading the value pointed to by 'reassembled_pkt_list'
>>>       requires holding any mutex [-Werror,-Wthread-safety-analysis]
>>>        1154 |     if (ovs_list_is_empty(&ipf->reassembled_pkt_list)) {
>>>             |                            ^
>>>
>>>     There is also a potential deadlock between the thread destroying the
>>>     ipf and the clean thread, as they are using the latch and the ipf
>>>     lock in the opposite order.  The latch itself is thread-safe, so we
>>>     should not be holding the lock while setting it and waiting for the
>>>     clean thread, which may be waiting for the lock.
>>>
>>>     Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
>>>     Signed-off-by: Ilya Maximets <[email protected]>
>>>     ---
>>>      lib/ipf.c | 33 +++++++++++----------------------
>>>      1 file changed, 11 insertions(+), 22 deletions(-)
>>>
>>>     diff --git a/lib/ipf.c b/lib/ipf.c
>>>     index b16797312..6ce144bfb 100644
>>>     --- a/lib/ipf.c
>>>     +++ b/lib/ipf.c
>>>     @@ -1071,12 +1071,9 @@ ipf_send_completed_frags(struct ipf *ipf, struct dp_packet_batch *pb,
>>>                               long long now, bool v6, uint16_t zone,
>>>                               odp_port_t in_port)
>>>      {
>>>     -    if (ovs_list_is_empty(&ipf->frag_complete_list)) {
>>>     -        return;
>>>     -    }
>>>
>>> I worry about the performance impact of these changes, though I haven't
>>> tested it.  I could run a test if you haven't.
>>
>> If you can run some performance tests, that would be great.  I'm not sure
>> how much impact this has, certainly some.  OTOH, fragment reassembly is
>> not a particularly performant code path to begin with.
>>
>>> Receiving a single fragmented packet on any port would result in all CT
>>> threads fighting over a global lock.  Maybe there should be a new atomic
>>> counter for nfrags_completed?
>>
>> May be an option, if the performance hit is significant.  Though it's a
>> bit of a brute-force solution, i.e. not particularly elegant.
>>
>>> I was also thinking about a trylock; however, the risk with that is that
>>> not all threads can handle all completed frags.
>>
>> It's a bit strange that we're completing packets on threads that
>> potentially didn't receive them.  It would be better if only the threads
>> that completed some packets on the current run handled them.  Those
>> threads are taking the lock anyway.
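Just to make Mike's counter idea concrete, here is a rough sketch of how it
could look, using the existing atomic_count helpers from lib/ovs-atomic.h.
The 'nfrags_completed' field name and the exact placement of the updates
are invented for illustration:

    /* Hypothetical new field in struct ipf; counts fragment lists that
     * are complete but not yet sent out. */
    struct atomic_count nfrags_completed;

    /* Wherever a fragment list becomes complete (under the ipf lock): */
    atomic_count_inc(&ipf->nfrags_completed);

    /* In ipf_send_completed_frags(), replacing the unlocked list check: */
    if (!atomic_count_get(&ipf->nfrags_completed)) {
        return;  /* Fast path: nothing completed, no need for the lock. */
    }
    ovs_mutex_lock(&ipf->ipf_lock);
    /* ... process frag_complete_list, calling atomic_count_dec() for
     * every list actually handed off ... */
    ovs_mutex_unlock(&ipf->ipf_lock);

A stale read of the counter would have the same effect as the unlocked list
check has today: the processing is merely postponed until a later call.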
>>>     +    struct ipf_list *ipf_list;
>>>
>>>          ovs_mutex_lock(&ipf->ipf_lock);
>>>     -    struct ipf_list *ipf_list;
>>>
>>>          LIST_FOR_EACH_SAFE (ipf_list, list_node, &ipf->frag_complete_list) {
>>>
>>>     @@ -1114,14 +1111,11 @@ ipf_delete_expired_frags(struct ipf *ipf, long long now)
>>>          };
>>>
>>>     -    if (ovs_list_is_empty(&ipf->frag_exp_list)) {
>>>     -        return;
>>>     -    }
>>>     -
>>>     -    ovs_mutex_lock(&ipf->ipf_lock);
>>>          struct ipf_list *ipf_list;
>>>          size_t lists_removed = 0;
>>>
>>>     +    ovs_mutex_lock(&ipf->ipf_lock);
>>>     +
>>>          LIST_FOR_EACH_SAFE (ipf_list, list_node, &ipf->frag_exp_list) {
>>>              if (now <= ipf_list->expiration ||
>>>                  lists_removed >= IPF_FRAG_LIST_MAX_EXPIRED) {
>>>     @@ -1151,12 +1145,9 @@ static void
>>>      ipf_execute_reass_pkts(struct ipf *ipf, struct dp_packet_batch *pb,
>>>                             ovs_be16 dl_type)
>>>      {
>>>     -    if (ovs_list_is_empty(&ipf->reassembled_pkt_list)) {
>>>     -        return;
>>>     -    }
>>>     +    struct reassembled_pkt *rp;
>>>
>>>          ovs_mutex_lock(&ipf->ipf_lock);
>>>     -    struct reassembled_pkt *rp;
>>>
>>>          LIST_FOR_EACH_SAFE (rp, rp_list_node, &ipf->reassembled_pkt_list) {
>>>              if (!rp->list->reass_execute_ctx &&
>>>     @@ -1175,12 +1166,9 @@ static void
>>>      ipf_post_execute_reass_pkts(struct ipf *ipf,
>>>                                  struct dp_packet_batch *pb, bool v6)
>>>      {
>>>     -    if (ovs_list_is_empty(&ipf->reassembled_pkt_list)) {
>>>     -        return;
>>>     -    }
>>>     +    struct reassembled_pkt *rp;
>>>
>>>          ovs_mutex_lock(&ipf->ipf_lock);
>>>     -    struct reassembled_pkt *rp;
>>>
>>>          LIST_FOR_EACH_SAFE (rp, rp_list_node, &ipf->reassembled_pkt_list) {
>>>              const size_t pb_cnt = dp_packet_batch_size(pb);
>>>     @@ -1308,11 +1296,11 @@ ipf_clean_thread_main(void *f)
>>>
>>>              long long now = time_msec();
>>>
>>>     +        ovs_mutex_lock(&ipf->ipf_lock);
>>>     +
>>>              if (!ovs_list_is_empty(&ipf->frag_exp_list) ||
>>>                  !ovs_list_is_empty(&ipf->frag_complete_list)) {
>>>
>>> Couldn't this conditional just be removed?
>>
>> Either way is fine, I guess.  We can remove the if statements here, it
>> will just be a slightly larger change.  I can do that in v2 or on commit,
>> depending on the results of the performance discussion above.
>
> Did a quick review, and functionally the patch seems correct to me.  I
> guess if performance is an issue, we could create an
> ipf_peek_frag_list_empty() function with analysis disabled.  However, the
> question remains: is it safe to assume the list is empty without a lock
> here?

It's not safe, but the negative consequences of missing the processing of a
completed packet are likely very minimal.  If the current thread completed
the packet on the current iteration, it will not miss it due to updates
from the other thread.  And if it does miss it, that other thread will pick
up the completed packet.

OTOH, the whole idea that one thread completes the packet while another
thread processes and sends it out seems wrong to me, as that can cause
packet reordering.  But we have way more issues in this module than the
potential reordering, I guess.

I'll change the patch to add a couple of functions for checking whether the
list is empty and turn off thread safety analysis for them.  This will just
fix the build without affecting performance.
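Roughly along these lines (a sketch; the final helper names may differ):

    /* Unlocked peek at the list state.  The read is intentionally racy;
     * a stale result only postpones the processing to a later call. */
    static bool
    ipf_is_frag_complete_list_empty(const struct ipf *ipf)
        OVS_NO_THREAD_SAFETY_ANALYSIS
    {
        return ovs_list_is_empty(&ipf->frag_complete_list);
    }

    static bool
    ipf_is_reassembled_pkt_list_empty(const struct ipf *ipf)
        OVS_NO_THREAD_SAFETY_ANALYSIS
    {
        return ovs_list_is_empty(&ipf->reassembled_pkt_list);
    }

Note that OVS_NO_THREAD_SAFETY_ANALYSIS only silences clang's analyzer for
these helpers; it doesn't make the access safe, it just documents that the
race is understood and accepted.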
And I'll split the deadlock fix into a separate patch.  We can think about
how to fix the thread safety without affecting performance in the meantime.

Best regards, Ilya Maximets.
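P.S.  For reference, the deadlock part boils down to the ordering in
ipf_destroy(): since the latch itself is thread-safe, signal and join the
clean thread before taking the ipf lock, not while holding it.  A rough
sketch, assuming the existing latch and thread field names:

    void
    ipf_destroy(struct ipf *ipf)
    {
        /* Wake the clean thread and wait for it without holding the lock,
         * as it may be blocked trying to acquire that same lock. */
        latch_set(&ipf->ipf_clean_thread_exit);
        pthread_join(ipf->ipf_clean_thread, NULL);
        latch_destroy(&ipf->ipf_clean_thread_exit);

        ovs_mutex_lock(&ipf->ipf_lock);
        /* ... free the remaining fragment lists and packets ... */
        ovs_mutex_unlock(&ipf->ipf_lock);
        ovs_mutex_destroy(&ipf->ipf_lock);
        free(ipf);
    }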
