Hello Casey,

We have something like that internally. It works something like this:

    ./sdndbg trace --from <ns_podname> --to <ns_podname|external_ip> {protocol:-tcp,udp,icmp4,ip4,arp,dhcp4} {protocol specific options}

would generate output that one could use directly in an `ovn-trace` command. It is currently Python based, and I am planning to rewrite it in Go and submit it upstream to the ovn-org/ovn-kubernetes repo.

Regards,
~Girish

From: <ovn-kuberne...@googlegroups.com> on behalf of Casey Callendrello <c...@redhat.com>
Date: Wednesday, June 10, 2020 at 7:37 AM
To: Tim Rozet <tro...@redhat.com>
Cc: Dumitru Ceara <dce...@redhat.com>, "ovs-discuss@openvswitch.org" <ovs-discuss@openvswitch.org>, "ovn-kuberne...@googlegroups.com" <ovn-kuberne...@googlegroups.com>, "Pan, Feng" <f...@redhat.com>
Subject: Re: RFC - OVN end to end packet tracing - ovn-global-trace

Skydive would be awesome, but that's a lot of work to integrate. I'd love to see it more widely deployed, but that hasn't happened.

For starters, ovn-kubernetes should probably come with some kind of ovn-trace wrapper that has a bit more logic around it. I could imagine it looking something like

    ovnk-trace <ns>/<podname> <dstip>

and it would automatically execute the equivalent ovn-trace command, simulating traffic from podname to dstip.

On Wed, Jun 10, 2020 at 3:23 PM Tim Rozet <tro...@redhat.com> wrote:

On Wed, Jun 10, 2020 at 3:36 AM Dumitru Ceara <dce...@redhat.com> wrote:

On 6/9/20 3:47 PM, Tim Rozet wrote:
> Hi Dumitru,

Hi Tim,

> Thanks for the detailed explanation. It makes sense, and I would like to
> comment on a few things you touched on:
> 1. I do think we need to somehow functionally trigger conntrack when we
> do ofproto/trace. It's the only way to know what the real session state
> ends up being, and we need to be able to follow that for some of the
> complex bugs where packets are getting dropped after they enter a CT
> based flow.
> 2. For your ovn-global-trace, it would be great if that could return
> JSON or another parsable format, so that we could build on top of it with
> a tool + GUI to graphically show where the problem is in the network.

Ack.

> 3. We really need better user guides on this stuff. Your email is the
> best tutorial I've seen yet :) I didn't even know about the
> ovs-tcpundump command, or ovn-detrace (until you told me previously). It
> would be great to add an OVN troubleshooting guide or something to the docs.
>

I was planning on sending a patch to update the OVN docs but didn't get the chance to do it yet.

> As an administrator I would like to have a GUI showing all of the logical
> switch ports (Skydive, as an example, already does this) and then click
> on a specific port that someone has reported an issue on. At that point
> I can click on the port and ask it to tcpdump me the traffic coming out
> of it. From there, I can select which packet I care about and attempt to
> do an ovn-global-trace on it, which will then show me where the packet
> is getting dropped and why. I think this would be the ideal behavior.
>

That would be cool. Using your example (Skydive) though, I guess one could also come up with a solution that directly uses the tools already existing in OVS/OVN, essentially performing the steps that something like ovn-global-trace would do.

They could, but I think it would be better off living in OVN and then consumed by something above it.

Thanks,
Dumitru

> Tim Rozet
> Red Hat CTO Networking Team
>
>
> On Mon, Jun 8, 2020 at 7:53 AM Dumitru Ceara <dce...@redhat.com> wrote:
>
> Hi everyone,
>
> CC-ing ovn-kubernetes mailing list as I know there's interest about this
> there too.
>
> OVN currently has a couple of tools that help with
> tracing/tracking/simulating what would happen to packets within OVN.
> Some examples:
>
> 1. ovn-trace
> 2. ovs-appctl ofproto/trace ... | ovn-detrace
>
> They're both really useful and provide lots of information, but with both
> of them it's quite hard to get an overview of the end-to-end packet
> processing in OVN for a given packet. Therefore both solutions have
> disadvantages when trying to troubleshoot production deployments. Some
> examples:
>
> a. ovn-trace will not take into account any potential issues with
> translating logical flows to OpenFlow, so if there's a bug in the
> translation we'll not be able to detect it by looking at ovn-trace
> output. There is the --ovs switch, but the user would have to somehow
> determine on which hypervisor to query for the OpenFlow flows
> corresponding to logical flows/SB entities.
>
> b. "ovs-appctl ofproto/trace ... | ovn-detrace" works quite well when
> used on a single node, but as soon as traffic gets tunneled to a
> different hypervisor the user has to figure out the changes that were
> performed on the packet on the source hypervisor and adapt the
> packet/flow to include the tunnel information to be used when running
> ofproto/trace on the destination hypervisor.
>
> c. both ovn-trace and ofproto/trace support minimal hints to specify the
> new conntrack state after conntrack recirculation, but that turns out to
> be not enough even in simple scenarios when NAT is involved [0].
>
> In a production deployment one of the scenarios one would have to
> troubleshoot is:
>
> "Given this OVN deployment on X nodes, why does this specific
> packet/traffic that is received on logical port P1 reach (or not reach)
> port P2?"
>
> Assuming that point "c" above is addressed somehow (there are a few
> suggestions on how to do that [1]) it's still quite a lot of work for
> the engineer doing the troubleshooting to gather all the interesting
> information. One would probably do something like:
>
> 1. connect to the node running the southbound database and get the
> chassis where the logical port is bound:
>
>     chassis=$(ovn-sbctl --bare --columns chassis list port_binding P1)
>     hostname=$(ovn-sbctl --bare --columns hostname list chassis $chassis)
>
> 2. connect to $hostname and determine the OVS ofport id of the interface
> corresponding to P1:
>
>     in_port=$(ovs-vsctl --bare --columns ofport find interface external_ids:iface-id=P1)
>     iface=$(ovs-vsctl --bare --columns name find interface external_ids:iface-id=P1)
>
> 3. get a hexdump of the packet to be traced (or the flow), for example,
> on $hostname:
>
>     flow=$(tcpdump -xx -c 1 -i $iface $pkt_filter | ovs-tcpundump)
>
> 3. run ofproto/trace on $hostname (potentially piping output to
> ovn-detrace):
>
>     ovs-appctl ofproto/trace br-int in_port=$in_port $flow | ovn-detrace --ovnnb=$NB_CONN --ovnsb=$SB_CONN
>
> 4. In the best case the packet is fully processed on the current node
> (e.g., is dropped or forwarded out a local VIF).
>
> 5. In the worst case the packet needs to be tunneled to a remote
> hypervisor for egress on a remote VIF. The engineer needs to identify in
> the ofproto/trace output the metadata that would be passed through the
> tunnel along with the packet and also the changes that would happen to
> the packet payload (e.g. NAT) on the local hypervisor.
>
> 6. Determine the hostname of the chassis hosting the remote tunnel
> destination based on "tun_dst" from the ofproto/trace output at point 3
> above:
>
>     chassis_name=$(ovn-sbctl --bare --columns chassis_name find encap ip=$tun_dst)
>     hostname=$(ovn-sbctl --bare --columns hostname find chassis name=$chassis_name)
>
> 7. Rerun the ofproto/trace on the remote chassis (basically go back to
> step #3 above).
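
As a rough illustration of how the manual steps above might be scripted, here is a minimal Python sketch. It only builds the command lines quoted in steps 1, 2, 3 and 6 as argument lists; it does not run anything and it does not parse ofproto/trace output. All function names are hypothetical and are not part of sdndbg, ovn-global-trace, or any OVS/OVN tool; the command arguments are copied from the steps above.

```python
import shlex


def chassis_of_port_cmd(port):
    # Step 1: chassis where the logical port is bound (Southbound DB).
    return ["ovn-sbctl", "--bare", "--columns", "chassis",
            "list", "port_binding", port]


def hostname_of_chassis_cmd(chassis):
    # Step 1: hostname of that chassis record.
    return ["ovn-sbctl", "--bare", "--columns", "hostname",
            "list", "chassis", chassis]


def interface_column_cmd(port, column):
    # Step 2: column is "ofport" or "name" of the OVS interface for the port.
    return ["ovs-vsctl", "--bare", "--columns", column,
            "find", "interface", f"external_ids:iface-id={port}"]


def ofproto_trace_cmd(in_port, flow, bridge="br-int"):
    # Step 3: flow is the string produced by ovs-tcpundump.
    return ["ovs-appctl", "ofproto/trace", bridge, f"in_port={in_port}", flow]


def detrace_cmd(nb_conn, sb_conn):
    # Step 3: optional ovn-detrace stage that the trace output is piped into.
    return ["ovn-detrace", f"--ovnnb={nb_conn}", f"--ovnsb={sb_conn}"]


def chassis_name_of_tunnel_cmd(tun_dst):
    # Step 6: chassis name owning the encap with the remote tunnel IP.
    return ["ovn-sbctl", "--bare", "--columns", "chassis_name",
            "find", "encap", f"ip={tun_dst}"]


def hostname_of_chassis_name_cmd(chassis_name):
    # Step 6: hostname of the remote chassis, looked up by name.
    return ["ovn-sbctl", "--bare", "--columns", "hostname",
            "find", "chassis", f"name={chassis_name}"]


if __name__ == "__main__":
    # Print the step 1 and step 2 lookups for logical port P1.
    print(shlex.join(chassis_of_port_cmd("P1")))
    print(shlex.join(interface_column_cmd("P1", "ofport")))
```

Returning argument lists rather than shell strings lets a caller pass them straight to `subprocess.run` on the right node without extra quoting; how each command is transported to that node (ssh, kubectl exec, or something else) is left open here.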
>
> My initial thought was that all the work above can be automated, as all
> the information we need is either in the Southbound DB or in OVS DB on
> the hypervisors, and the output of ofproto/trace contains all the packet
> modifications and tunnel information we need. I had started working on a
> tool, "ovn-global-trace", that would do all the work above but I hit a
> few blocking issues:
>
> - point "c" above, i.e., conntrack related packet modifications: this
> will require some work in OVS ofproto/trace to either support additional
> conntrack hints or to actually run the trace against conntrack on
> the node.
>
> - if we choose to query conntrack during ofproto/trace we'd probably
> need a way to also update the conntrack records the trace is run
> against. This would be useful for cases when we troubleshoot
> session establishment, e.g., with TCP: first run a trace for the SYN
> packet, then run a trace for the SYN-ACK packet in the other direction,
> but for this second trace we'd need the conntrack entry to have been
> created by the initial trace.
>
> - ofproto/trace output is plain text: while a tool could parse the
> information from the text output, it would probably be easier if
> ofproto/trace would dump the trace information in a structured way
> (e.g., JSON).
>
> It would be great to get some feedback from the community about other
> aspects that I might have missed regarding end-to-end packet tracing and
> how we could aggregate current utilities into a single, easier-to-use
> tool, as I was hoping "ovn-global-trace" would end up being.
>
> Thanks,
> Dumitru
>
> [0] https://patchwork.ozlabs.org/project/openvswitch/patch/1578648883-1145-1-git-send-email-dce...@redhat.com/
> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2020-January/366571.html
>
> --
> You received this message because you are subscribed to the Google
> Groups "ovn-kubernetes" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to ovn-kubernetes+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/ovn-kubernetes/543bf38d-0578-7f6f-2eef-206d84026a3e%40redhat.com.
_______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss