We have a lot of tracing tools, at multiple levels. I wonder whether they are documented adequately. I tried to document them by example in the OVN OpenStack tutorial (e.g. http://docs.openvswitch.org/en/latest/tutorials/ovn-openstack/), but it is not exactly a how-to guide. Maybe someone could find the time to write a high-level guide to how to trace a packet through the system. With plenty of examples, it could be a great resource.
On Fri, Mar 15, 2019 at 05:25:09PM +0200, Daniel Alvarez Sanchez wrote:
> Sounds like a great plan, Ben! Thanks for that. It'd be great if
> people could chime in this thread to help identify those gaps.
>
> As about the anecdotes, we had just been involved in a case where OVN
> was used and packets were dropped at conntrack:
>
> Two VMs on different Logical Switches (externally routed), running on
> the same hypervisor were communicating between each other and packet
> loss was observed. The packet loss was observed only on small (<64B)
> packets. These packets were padded by the NIC before being put on the
> wire and when they came back, due to the ACLs, they were put into
> conntrack and dropped there. We determined this by inspecting DP flows
> via 'ovs-dpctl dump-flows' and then we enabled logging on netfilter
> which showed that there was an error with the checksum calculation. It
> happened to be a bug on the OVS kernel side which was already fixed in
> newer kernels but it took quite a while to figure out and a good
> understanding on what was going on. In this scenario, if OVN ACLs were
> removed, traffic worked so OVN was the first to be blamed. And
> sometimes, the OVN user/engineer is not an OVS expert to be able to
> tell effectively what happened to a packet.
>
> Maybe the example is not the best as it was resolved using just the
> 'ovs-dpctl' tool and some logging but support engineers may loop in
> OVN engineers which may loop in OVS engineers which may loop in kernel
> engineers. It'd be great to improve the experience somehow so that the
> initial assessment doesn't have to go always all the way down.
>
> I'm curious about other folks' experiences here as well with more pure
> OVS experience.
>
> Thanks a lot!
> Daniel
>
> On Thu, Mar 14, 2019 at 5:55 PM Ben Pfaff <b...@ovn.org> wrote:
> >
> > On Thu, Mar 14, 2019 at 04:55:56PM +0100, Daniel Alvarez Sanchez wrote:
> > > Hi folks,
> > >
> > > Lately I'm getting the question in the subject line more and more
> > > frequently and facing it myself, especially in the context of
> > > OpenStack.
> > >
> > > The shift to OVN in OpenStack involves a totally different approach
> > > when it comes to tracing packet drops. Before OVN, there were a bunch
> > > of network namespaces and devices where you could hook a tcpdump on
> > > and inspect the traffic. People are used to those troubleshooting
> > > techniques and OVS was merely used for normal action switches.
> > >
> > > It's clear that there's tools and techniques to analyze this (trace
> > > tool, port mirroring, etc.), but often times requires quite high
> > > knowledge and understanding of the pipeline and OVS itself to
> > > effectively trace where a packet got dropped. Furthermore, there could
> > > be some scenarios where the packet can be silently dropped.
> > >
> > > I came across this patch [0] and presentation about it [1] which aims
> > > to tackle partly the problem described here (focusing in the DPDK
> > > datapath).
> > >
> > > The intent of this email is to gather some feedback as how to provide
> > > efficient tools and techniques to troubleshoot OVS/OVN issues and what
> > > do you think is immediately missing in this context.
> >
> > I guess that there are multiple things to do here:
> >
> > - Better document the tools that are available.
> >
> > - Implement improvements, especially UX-wise, to the existing tools.
> >
> > - Identify gaps in the available tools (and then fill them).
> >
> > Do you have any good anecdotes about user/admin frustration? They might
> > be helpful for figuring out how to help. A lot of us here designed and
> > built this stuff and so the gaps are not always obvious to us.
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss