Hello Anju, Ben, Looks like I fixed a leak reported here in my recent patch: https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346892.html
Could you, please, take a look at it? I actually managed to reproduce the packet leak on tunnel-push-pop unit tests with valgrind: ==6445== 2,912 (2,208 direct, 704 indirect) bytes in 4 blocks are definitely lost in loss record 400 of 410 ==6445== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==6445== by 0x516314: xmalloc (util.c:120) ==6445== by 0x466154: dp_packet_new (dp-packet.c:138) ==6445== by 0x466154: dp_packet_new_with_headroom (dp-packet.c:148) ==6445== by 0x4F6CFE: eth_from_hex (packets.c:498) ==6445== by 0x48EB43: eth_from_packet (netdev-dummy.c:1450) ==6445== by 0x48EB43: netdev_dummy_receive (netdev-dummy.c:1562) ==6445== by 0x515980: process_command (unixctl.c:313) ==6445== by 0x515980: run_connection (unixctl.c:347) ==6445== by 0x515980: unixctl_server_run (unixctl.c:398) ==6445== by 0x406F1E: main (ovs-vswitchd.c:120) Above patch fixes it. Best regards, Ilya Maximets. > Hi Ben, > > I will work on these changes as well. > > Regards > Anju > > -----Original Message----- > From: Ben Pfaff [mailto:blp at ovn.org] > Sent: Friday, May 11, 2018 2:00 AM > To: Anju Thomas <anju.thomas at ericsson.com> > Cc: dev at openvswitch.org > Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action > > On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote: >> It looks like your commit message describes at least two other bugs in >> OVS, though. First, if OVS crashes when it pushes tunnel headers, >> even if there's not enough headroom, that's really bad. At worst, it >> should drop the packet. Do you know where the crash occurs? We >> should fix the problem, since it might recur in some other context. >> >> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual >> assert in our case. >> The rootcause is that dp receives the actions after the upcall (say with >=3 >> tunnel pushes ) . Now as part of action processing , since it is a tunnel >> push action , we try to find space in the dpdk mbuf packet headroom (which >> Is 128 bytes). By the time we try to process the third tunnel push , there >> is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 >> bytes headroom). Hence it manually asserts . This assert is to catch any >> unexpected code flow . Do you think that in this case we should still go >> ahead and prevent the crash ? > > I don't understand why it's OK to crash in this case. Why do you think so? > >> Second, it's a little shocking to hear that an encapsulation action without >> a following output action causes a memory leak. We also need to fix that. >> Do you have any more details? >> [Anju] Now as explained above, the crash happens because we run out of >> headroom. But in case we have say 2 or less than 2 tunnel pushes we will >> have a mem leak as packet is never freed because the tnl push is the dp last >> action and there is no other output action or any other action like recirc >> that may translate to output action in the end leading to packet buffer not >> being freed. >> Are you proposing that we have some sort of preventive fix in the dp to >> handle an incorrect action list from the upcall handling? > > Yes. It's unacceptable to leak memory because there's a packet modification > without an output action. The kernel datapath never does this, for example. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
