On 1/23/24 14:31, Mike Pattrick wrote: > On Tue, Jan 23, 2024 at 7:41 AM Ilya Maximets <i.maxim...@ovn.org> wrote: >> >> On 1/22/24 23:24, Mike Pattrick wrote: >>> On Mon, Jan 22, 2024 at 4:10 PM Ilya Maximets <i.maxim...@ovn.org> wrote: >>>> >>>> On 1/22/24 21:33, Mike Pattrick wrote: >>>>> On Mon, Jan 22, 2024 at 1:47 PM Ilya Maximets <i.maxim...@ovn.org> wrote: >>>>>> >>>>>> On 1/22/24 18:51, Mike Pattrick wrote: >>>>>>> The OVN test suite identified a bug in dp_packet_ol_send_prepare() where >>>>>>> a double encapsulated BFD packet would trigger a seg fault. This >>>>>>> happened because we had assumed that if IP checksumming was marked as >>>>>>> offloaded, that checksum would occur in the innermost packet. >>>>>>> >>>>>>> This change will check that the inner packet has an L3 and L4 before >>>>>>> checksumming those parts. And if missing, will resume checksumming the >>>>>>> outer components. >>>>>> >>>>>> Hrm. This looks like a workaround rather than a fix. If the inner >>>>>> packet >>>>>> doesn't have L3 header, dp-packet must not have a flag for L3 >>>>>> checksumming >>>>>> set in the first place. And if it does have inner L3, the offset must be >>>>>> initialized. I guess, some of the offsets can be not initialized, >>>>>> because >>>>>> the packet is never parsed by either miniflow_extract() or >>>>>> parse_tcp_flags(). >>>>>> And the bfd_put_packet() doesn't seem to set them. >>>>> >>>>> I think you're right, I stepped through the problem in GDB and it was >>>>> clear that the flags weren't getting reset properly between BFD >>>>> packets. I'll send an updated patch. >>>> >>>> Yeah, flags preservation is one thing, the other is that l3/l4_ofs are just >>>> not populated in these packets, which doesn't sound right. >>>> >>>> It's also not clear why the packet is marked for inner checksum offload if >>>> the inner checksum is actually fully calculated and correct. I understand >>>> that the flags are carried over from a previous packet, but why did that >>>> previous packet have this flag set? >>> >>> Here's an example: >>> >>> monitor_run() >>> \_ dp_packet_use_stub(&packet, stub, sizeof stub); <--- initializes packet >>> once >>> \_ while(!heap_is_empty(&monitor_heap) <-- loop where the same packet >>> can be reused >>> \_ monitor_mport_run() >>> \_ dp_packet_clear() <- Data cleared, but flags are not cleared >>> \_ cfm_compose_ccm() / bfd_put_packet() / lldp_put_packet() <- >>> one or more can run >>> \_ Note that cfm_compose_ccm and lldp_put_packet call >>> eth_compose, but bfd_put_packet doesn't. bfd_put_packet doesn't reset >>> the offsets like eth_compose does. >> >> Sounds like a bug to me. bfd_put_packet() should set correct offsets >> in the packet, otherwise we'll get random failures with different >> actions that may depend on these offsets to be populated. > > Agreed. Do you want me to homologate this as part of this series?
Yeah, a separate patch for this would be great. > >> >>> \_ ofproto_dpif_send_packet() >>> .... non-relevant stack trace ... >>> \_ netdev_push_header() >>> \_ First run, push geneve header and toggle geneve flag >>> \_ Second run, detect geneve header flag, call >>> dp_packet_ol_send_prepare() >>> >>> Given the above, I think it makes sense to clear the offload flags in >>> dp_packet_clear(). >> >> I agree with that. But it still doesn't explain why the >> DP_PACKET_OL_TX_IP_CKSUM >> is set after the first run. The inner checksum is fully calculated and >> correct. >> There should be no Tx offloading for it set. Only the >> DP_PACKET_OL_TX_OUTER_IP_CKSUM >> should be set in this packet. Or am I missing something? > > bfd_put_packet() calculates the checksum, so it will always be > correct, miniflow_extract() sets DP_PACKET_OL_TX_IP_CKSUM if the > previous packet had DP_PACKET_OL_RX_IP_CKSUM_GOOD, which it would have > gotten from dp_packet_ol_send_prepare(). OK. That makes sense. I understand now. Thanks! > > >> >>> >>> -M >>> >>>> >>>>> >>>>>> BTW, is there actually a double encapsulation in the original OVN test? >>>>>> Sounds strange. >>>>> >>>>> The problem was exposed in netdev_push_header() in >>>>> dp_packet_ol_send_prepare(packet, 0); >>>> >>>> That's true, but the packet dump in gdb showed a plain BFD packet. >>>> So, it was a first encapsulation, not double. >>>> >>>>> >>>>> -M >>>>> >>>>>> >>>>>>> >>>>>>> Fixes: 8b5fe2dc6080 ("userspace: Add Generic Segmentation Offloading.") >>>>>>> Reported-by: Dumitru Ceara <dce...@redhat.com> >>>>>>> Reported-at: https://issues.redhat.com/browse/FDP-300 >>>>>>> Signed-off-by: Mike Pattrick <m...@redhat.com> >>>>>>> --- >>>>>>> lib/dp-packet.h | 10 ++++++++-- >>>>>>> lib/packets.c | 6 +++--- >>>>>>> 2 files changed, 11 insertions(+), 5 deletions(-) >>>>>> >>>>>> Best regards, Ilya Maximets. >>>>>> >>>>> >>>> >>> >> > _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev