On Wed, Oct 22, 2025 at 11:11:01AM +0800, Jason Xing wrote: > On Wed, Oct 22, 2025 at 1:33 AM Alessandro Decina > <[email protected]> wrote: > > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > index 9f47388eaba5..dbc19083bbb7 100644 > > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c > > @@ -441,13 +441,18 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, > > int budget) > > dma_rmb(); > > > > if (i40e_rx_is_programming_status(qword)) { > > + u16 ntp; > > + > > i40e_clean_programming_status(rx_ring, > > rx_desc->raw.qword[0], > > qword); > > bi = *i40e_rx_bi(rx_ring, next_to_process); > > xsk_buff_free(bi); > > - if (++next_to_process == count) > > + ntp = next_to_process++; > > + if (next_to_process == count) > > next_to_process = 0; > > + if (next_to_clean == ntp) > > + next_to_clean = next_to_process; > > continue; > > } > > > > -- > > 2.43.0 > > > > > > I'm copying your reply from v1 as shown below so that we can continue > with the discussion :) > > > It really depends on whether a status descriptor can be received in the > > middle of multi-buffer packet. Based on the existing code, I assumed it > > can. Therefore, consider this case: > > > > [valid_1st_packet][status_descriptor][valid_2nd_packet] > > > > In this case you want to skip status_descriptor but keep the existing > > logic that leads to: > > > > first = next_to_clean = valid_1st_packet > > > > so then you can go and add valid_2nd_packet as a fragment to the first. > > Sorry, honestly, I still don't follow you. > > Looking at the case you provided, I think @first always pointing to > valid_1st_packet is valid which does not bring any trouble. You mean > the case is what you're trying to handle?
Yes, I mean this case needs to keep working, so we can't move next_to_clean unconditionally, we can only move it when next_to_clean == ntp, which is equivalent to checking that first == NULL. See below. > You patch updates next_to_clean that is only used at the very > beginning, so it will not affect @first. Imaging the following case: > > [status_descriptor][valid_1st_packet][valid_2nd_packet] > > Even if the next_to_clean is updated, the @first still points to > [status_descriptor] that is invalid and that will later cause the > panic when constructing the skb. Exactly - the key is to make sure we never get into this state :) At the beginning of the function - outside the loop - first is only assigned if (next_to_process != next_to_clean). This condition means: if we previously exited the function in the middle of a multi-buffer packet, we must resume from the start of that packet (next_to_clean) and process the next fragment in it (next_to_process). Consider the case you just gave: > [status_descriptor][valid_1st_mb_packet][valid_2nd_mb_packet] Assume we enter the function and next_to_process == next_to_clean, we don't assign first, so first = NULL. We find the status descriptor: without this patch, we increment next_to_process, don't increment next_to_clean, say we run out of budget and break the loop, the next time the function is entered we set first = status_descriptor because next_to_process != next_to_clean. This is exactly what we want to avoid. With this patch upon processing the status descriptor, we see next_to_clean == ntp, we increment next_to_clean, the next time the function is entered next_to_process == next_to_clean, first is correctly set to NULL and the next packet starts from valid_1st_mb_packet. So I've covered both the scenarios: a) [status][mb_packet1][mb_packet2] b) [mb_packet1][status][mb_packet2] The last case c) [packet1][packet2][status] is actually just a), because at packet2 we'd find the EOP marker and close off the previous multi-buffer packet. I hope I was more clear and please check my logic :) Ciao, Alessandro
