[ovs-discuss] good build test platform? (was: Re: Bad OpenFlow buffer IDs, etc.)
Thanks, I pushed this to master, branch-1.2, and branch-1.3.

Justin mentioned to me that we could probably pick up an ARM box here
for at least occasional build and "make check" testing.  Do you have a
brand/model/whatever to recommend?

Thanks,

Ben.

On Tue, Oct 18, 2011 at 03:00:36PM -0700, Murphy McCauley wrote:
> Yeah, my temporary patch does the same thing (though I explicitly called
> push_uninit and memcpy because I didn't notice ofpbuf_push).  I wasn't
> sure if I was missing some better solution.
>
> That's a good thought on locating other places.  I probably don't have
> time to do it immediately, but something to keep in mind if this
> continues to bite me. :)
___
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
Re: [ovs-discuss] Bad OpenFlow buffer IDs, etc.
Yeah, my temporary patch does the same thing (though I explicitly called
push_uninit and memcpy because I didn't notice ofpbuf_push).  I wasn't
sure if I was missing some better solution.

That's a good thought on locating other places.  I probably don't have
time to do it immediately, but something to keep in mind if this
continues to bite me. :)

Thanks.

-- Murphy

On Tue, 2011-10-18 at 14:01 -0700, Ben Pfaff wrote:
> On Mon, Oct 17, 2011 at 12:38:17PM -0700, Murphy McCauley wrote:
> > The problem with buffer IDs is in ofputil_encode_packet_in() which
> > writes to an unaligned ofp_packet_in pointer.
> >
> > Any thoughts on a good way to fix this, or how to locate other places
> > where the same thing may be happening (there's no warning or anything)?
>
> The fix itself is easy.  I've appended my first thought at how to do
> it.  Please test it and let me know the results.
>
> As for how to locate other places, I don't have good ideas.  Probably,
> testing on platforms that signal misaligned accesses (such as SPARC)
> instead of on platforms that rotate bytes on misaligned accesses (only
> ARM, as far as I know), is a good idea.  Static analysis is better,
> but I don't have a good way to do it.
>
> --8<--cut here-->8--
>
> From 9d734d8341d2f2636d7402145f6934544c8f7e1c Mon Sep 17 00:00:00 2001
> From: Ben Pfaff
> Date: Tue, 18 Oct 2011 13:58:21 -0700
> Subject: [PATCH] ofp-util: Avoid misaligned memory access in ofputil_encode_packet_in().
>
> Reported-by: Murphy McCauley
> ---
>  AUTHORS        |    1 +
>  lib/ofp-util.c |   17 +++++++++--------
>  2 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/AUTHORS b/AUTHORS
> index 3229f34..e00feea 100644
> --- a/AUTHORS
> +++ b/AUTHORS
> @@ -84,6 +84,7 @@ Krishna Miriyala        kris...@nicira.com
>  Luiz Henrique Ozaki     luiz.oz...@gmail.com
>  Michael Hu              m...@nicira.com
>  Michael Mao             m...@nicira.com
> +Murphy McCauley         murphy.mccau...@gmail.com
>  Mikael Doverhag         mdover...@nicira.com
>  Niklas Andersson        nanders...@nicira.com
>  Pankaj Thakkar          thak...@nicira.com
> diff --git a/lib/ofp-util.c b/lib/ofp-util.c
> index b46219a..0930196 100644
> --- a/lib/ofp-util.c
> +++ b/lib/ofp-util.c
> @@ -1449,7 +1449,7 @@ ofputil_encode_packet_in(const struct ofputil_packet_in *pin,
>                           struct ofpbuf *rw_packet)
>  {
>      int total_len = pin->packet->size;
> -    struct ofp_packet_in *opi;
> +    struct ofp_packet_in opi;
>
>      if (rw_packet) {
>          if (pin->send_len < rw_packet->size) {
> @@ -1462,13 +1462,14 @@ ofputil_encode_packet_in(const struct ofputil_packet_in *pin,
>      }
>
>      /* Add OFPT_PACKET_IN. */
> -    opi = ofpbuf_push_zeros(rw_packet, offsetof(struct ofp_packet_in, data));
> -    opi->header.version = OFP_VERSION;
> -    opi->header.type = OFPT_PACKET_IN;
> -    opi->total_len = htons(total_len);
> -    opi->in_port = htons(pin->in_port);
> -    opi->reason = pin->reason;
> -    opi->buffer_id = htonl(pin->buffer_id);
> +    memset(&opi, 0, sizeof opi);
> +    opi.header.version = OFP_VERSION;
> +    opi.header.type = OFPT_PACKET_IN;
> +    opi.total_len = htons(total_len);
> +    opi.in_port = htons(pin->in_port);
> +    opi.reason = pin->reason;
> +    opi.buffer_id = htonl(pin->buffer_id);
> +    ofpbuf_push(rw_packet, &opi, offsetof(struct ofp_packet_in, data));
>      update_openflow_length(rw_packet);
>
>      return rw_packet;
Re: [ovs-discuss] Bad OpenFlow buffer IDs, etc.
On Mon, Oct 17, 2011 at 12:38:17PM -0700, Murphy McCauley wrote:
> The problem with buffer IDs is in ofputil_encode_packet_in() which
> writes to an unaligned ofp_packet_in pointer.
>
> Any thoughts on a good way to fix this, or how to locate other places
> where the same thing may be happening (there's no warning or anything)?

The fix itself is easy.  I've appended my first thought at how to do
it.  Please test it and let me know the results.

As for how to locate other places, I don't have good ideas.  Probably,
testing on platforms that signal misaligned accesses (such as SPARC)
instead of on platforms that rotate bytes on misaligned accesses (only
ARM, as far as I know), is a good idea.  Static analysis is better,
but I don't have a good way to do it.

--8<--cut here-->8--

From 9d734d8341d2f2636d7402145f6934544c8f7e1c Mon Sep 17 00:00:00 2001
From: Ben Pfaff
Date: Tue, 18 Oct 2011 13:58:21 -0700
Subject: [PATCH] ofp-util: Avoid misaligned memory access in ofputil_encode_packet_in().

Reported-by: Murphy McCauley
---
 AUTHORS        |    1 +
 lib/ofp-util.c |   17 +++++++++--------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/AUTHORS b/AUTHORS
index 3229f34..e00feea 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -84,6 +84,7 @@ Krishna Miriyala        kris...@nicira.com
 Luiz Henrique Ozaki     luiz.oz...@gmail.com
 Michael Hu              m...@nicira.com
 Michael Mao             m...@nicira.com
+Murphy McCauley         murphy.mccau...@gmail.com
 Mikael Doverhag         mdover...@nicira.com
 Niklas Andersson        nanders...@nicira.com
 Pankaj Thakkar          thak...@nicira.com
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index b46219a..0930196 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -1449,7 +1449,7 @@ ofputil_encode_packet_in(const struct ofputil_packet_in *pin,
                          struct ofpbuf *rw_packet)
 {
     int total_len = pin->packet->size;
-    struct ofp_packet_in *opi;
+    struct ofp_packet_in opi;

     if (rw_packet) {
         if (pin->send_len < rw_packet->size) {
@@ -1462,13 +1462,14 @@ ofputil_encode_packet_in(const struct ofputil_packet_in *pin,
     }

     /* Add OFPT_PACKET_IN. */
-    opi = ofpbuf_push_zeros(rw_packet, offsetof(struct ofp_packet_in, data));
-    opi->header.version = OFP_VERSION;
-    opi->header.type = OFPT_PACKET_IN;
-    opi->total_len = htons(total_len);
-    opi->in_port = htons(pin->in_port);
-    opi->reason = pin->reason;
-    opi->buffer_id = htonl(pin->buffer_id);
+    memset(&opi, 0, sizeof opi);
+    opi.header.version = OFP_VERSION;
+    opi.header.type = OFPT_PACKET_IN;
+    opi.total_len = htons(total_len);
+    opi.in_port = htons(pin->in_port);
+    opi.reason = pin->reason;
+    opi.buffer_id = htonl(pin->buffer_id);
+    ofpbuf_push(rw_packet, &opi, offsetof(struct ofp_packet_in, data));
     update_openflow_length(rw_packet);

     return rw_packet;
-- 
1.7.4.4
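The pattern the patch above uses (fill in a properly aligned local
struct, then copy its bytes to the destination) can be tried outside
the OVS tree.  What follows is only a minimal sketch under stated
assumptions: struct wire_hdr, to_be16(), to_be32(), and write_hdr() are
hypothetical names invented for illustration, not OVS or OpenFlow
definitions, and the 8-byte layout is a stand-in for the real
ofp_packet_in header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte wire header (illustration only, not ofp_packet_in). */
struct wire_hdr {
    uint8_t  version;
    uint8_t  type;
    uint16_t length;     /* Held in network byte order. */
    uint32_t buffer_id;  /* Held in network byte order. */
};

/* Returns 'v' laid out in memory as big-endian bytes, on any host. */
uint16_t to_be16(uint16_t v)
{
    uint8_t b[2] = { (uint8_t) (v >> 8), (uint8_t) v };
    uint16_t out;
    memcpy(&out, b, sizeof out);
    return out;
}

/* Same as to_be16(), for 32-bit values. */
uint32_t to_be32(uint32_t v)
{
    uint8_t b[4] = { (uint8_t) (v >> 24), (uint8_t) (v >> 16),
                     (uint8_t) (v >> 8), (uint8_t) v };
    uint32_t out;
    memcpy(&out, b, sizeof out);
    return out;
}

/* Writes a header at 'dst', which may have any alignment.  All multi-byte
 * stores go to the aligned local 'h'; only memcpy() touches 'dst'. */
void write_hdr(uint8_t *dst, uint8_t type, uint16_t length,
               uint32_t buffer_id)
{
    struct wire_hdr h;

    assert(dst != NULL);
    memset(&h, 0, sizeof h);
    h.version = 0x01;
    h.type = type;
    h.length = to_be16(length);
    h.buffer_id = to_be32(buffer_id);
    memcpy(dst, &h, sizeof h);
}
```

Calling write_hdr() with a pointer one byte past the start of a byte
array exercises the case that went wrong on ARM: the destination is
deliberately misaligned, yet no multi-byte store is ever issued through
a misaligned pointer.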
Re: [ovs-discuss] Using unix sockets for controller communication
On Tue, Oct 18, 2011 at 02:25:46PM +0900, Jari Sundell wrote:
> On Tue, Oct 18, 2011 at 1:46 AM, Ben Pfaff wrote:
> > On Mon, Oct 17, 2011 at 02:59:43PM +0900, Jari Sundell wrote:
> >> - I still have to set disable-in-band despite the controller being on
> >> a unix socket. Not sure if that was due to me having set tcp:127.0.0.1
> >> prior to testing.
> >
> > In-band control should only get triggered if you have a manager
> > configured (e.g. with "ovs-vsctl set-manager") with a TCP or SSL type,
> > or if you have an in-band controller configured.  (What do "ovs-vsctl
> > list manager" and "ovs-vsctl list controller" print?)
> [...]
> Tried restarting the daemon so that no tcp controllers had been set
> during the run, and still the issue arises.

Can you describe the issue, actually?  So far you've just referenced a
message in the archives.  I don't understand what problem you are
seeing.

When I configure a unix: controller myself, I don't see any in-band
control flows set up, so I don't know how disable-in-band would make a
difference.

> >> - The whitelist string comparison seems to encounter some issues when
> >> the prefix used to compile ovs contains double slashes, requiring us
> >> to insert an additional slash when setting the controller. E.g.
> >> 'unix:/usr/share/foo//ovs/var/run/openvswitch/br0.controller', which
> >> can crop up when using scripts to compile, etc.
> >
> > It would be a good idea to fix the scripts, but here's a revised patch
> > that works around that.
>
> -if (equal_pathnames(c->target, whitelist)) {
> +if (!equal_pathnames(c->target, whitelist)) {
>
> Except for that small fix, it works nicely.

Thanks.  I applied that fix and pushed this to master.
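The body of equal_pathnames() is not shown in this thread, so the
following is not the OVS implementation.  It is only a sketch of one
way a comparison can tolerate the doubled slashes described above,
assuming that any run of '/' should count as a single separator.  The
name paths_equal_collapsing_slashes() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns a pointer just past a run of zero or more '/' characters. */
static const char *skip_slashes(const char *s)
{
    while (*s == '/') {
        s++;
    }
    return s;
}

/* Returns true if 'a' and 'b' name the same path once every run of
 * slashes is treated as one separator.  A trailing slash still counts
 * as a difference, and no "." or ".." resolution is attempted. */
bool paths_equal_collapsing_slashes(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;) {
        if (*a == '/' || *b == '/') {
            if (*a != *b) {
                return false;       /* Separator on one side only. */
            }
            a = skip_slashes(a);
            b = skip_slashes(b);
        } else if (*a != *b) {
            return false;           /* Ordinary characters differ. */
        } else if (*a == '\0') {
            return true;            /* Both strings ended together. */
        } else {
            a++;
            b++;
        }
    }
}
```

With this sketch, "/usr/share/foo//ovs/var/run/openvswitch/br0.controller"
compares equal to the same path written with single slashes, which is
the case that required the extra slash in the controller target.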
Re: [ovs-discuss] intermitting ARP problems on DP interface
Hi,

working at the same company, perhaps I can clarify this a bit.

Firstly, we are using openvswitch v1.2.2, the linux kernel datapath
implementation and the test-openflowd openflow switch (we have good
reasons not to use ovs-vswitchd).

Restarting the controller means we've restarted (after SIGTERMinating)
the test-openflowd user space daemon previously attached to a datapath
in order to check whether our system recovers from test-openflowd
crashes.  After the restart, the switch successfully reconnects to our
openflow controller but approx. every 10 seconds, nothing received at
the datapath's local port is properly forwarded (to a physical switch
port or our controller).  This situation persists for a few seconds
until it seems to recover and work properly again for about 10 seconds,
etc.

btw. I guess the "received packet on unknown port 65534" message alludes
to an openvswitch bug since 65534 is the openflow LOCAL port while the
source code states that an "odp" port should be logged.  LOCAL however is
odp-port 0.  So it appears that somehow the two port types got mixed up.

cheers,
Robin Haberkorn

----- Original Message -----
> Hi,
>
> We have encountered a strange behavior. After restarting the
> controller process and even after removing and reinserting the
> datapath module, ARP packets are forwarded from dp0 to the real port
> intermittently.
> tcpdump confirms that the ARP request is seen on dp0, but not on the
> physical port (eth1).
> At the same time the following sequence of log messages appears:
>
> Oct 18 12:22:54|00344|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:24, type=30(ovs_packet), flags=0, seq=0, pid=0(0:0)),genl(cmd=1,version=1)
> Oct 18 12:22:54|00345|dpif|DBG|system@dp0: miss upcall: in_port(0),eth(src=00:23:20:fc:70:43,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.28.0.1,tip=172.28.0.19,op=1,sha=00:23:20:fc:70:43,tha=00:00:00:00:00:00)
> 00:23:20:fc:70:43 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 172.28.0.19 tell 172.28.0.1, length 28
> Oct 18 12:22:54|00346|ofproto_dpif|WARN|bridge dp0: received packet on unknown port 65534
> Oct 18 12:22:54|00347|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:140, type=30(ovs_packet), flags=1[REQUEST], seq=4e9d5835, pid=75500397(2925:18)),genl(cmd=3,version=1)
> Oct 18 12:22:54|00348|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:92, type=29(ovs_flow), flags=5[REQUEST][ACK], seq=4e9d5836, pid=75500397(2925:18)),genl(cmd=1,version=1)
> Oct 18 12:22:54|00349|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:36, type=2(error), flags=0, seq=4e9d5836, pid=75500397(2925:18)) error(0, in-reply-to(nl(len:92, type=29(ovs_flow), flags=5[REQUEST][ACK], seq=4e9d5836, pid=75500397(2925:18
>
> There clearly is something strange going on. How can the dp receive
> something on an unknown port?
>
> Any hints?
>
> Regards
> Andreas
> --
> Dipl. Inform.
> Andreas Schultz
>
> email: a...@travelping.com
> phone: +49-391-819099-224
> mobil: +49-179-7654368
>
> -- managed broadband access --
>
> Travelping GmbH               phone: +49-391-8190990
> Roentgenstr. 13               fax:   +49-391-819099299
> D-39108 Magdeburg             email: i...@travelping.com
> GERMANY                       web:   http://www.travelping.com
>
> Company Registration: HRB21276 Handelsregistergericht Chemnitz
> Geschaeftsfuehrer: Holger Winkelmann | VAT ID No.: DE236673780
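Robin's observation about the two port numbering spaces can be made
concrete with a small sketch.  It assumes only what the message above
states: the OpenFlow LOCAL port is 65534 (0xfffe) and the datapath
("odp") local port is 0.  It further assumes, for illustration, that
all other port numbers map to themselves.  The names below are
hypothetical and are not copied from the OVS source.

```c
#include <stdint.h>

/* Port numbers as described in this thread. */
enum {
    SKETCH_OFPP_LOCAL = 0xfffe,  /* OpenFlow LOCAL port, 65534. */
    SKETCH_ODPP_LOCAL = 0        /* Datapath local port. */
};

/* Converts an OpenFlow port number to a datapath port number. */
uint16_t sketch_ofp_to_odp(uint16_t ofp_port)
{
    return ofp_port == SKETCH_OFPP_LOCAL ? SKETCH_ODPP_LOCAL : ofp_port;
}

/* Converts a datapath port number to an OpenFlow port number. */
uint16_t sketch_odp_to_ofp(uint16_t odp_port)
{
    return odp_port == SKETCH_ODPP_LOCAL ? SKETCH_OFPP_LOCAL : odp_port;
}
```

Under these assumptions the upcall's in_port(0) and the warning's
"port 65534" would refer to the same local port, expressed once as a
datapath number and once as an OpenFlow number, which is consistent
with the mix-up Robin suspects.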
Re: [ovs-discuss] intermitting ARP problems on DP interface
I believe that 65534 is the LOCAL port, so it is possible that Linux is
ARPing via that interface, if it has an IP address.

I think I've seen this behavior as well, but don't entirely understand
it.  It would be great if someone could explain how OVS handles ARP
specially.

Cheers,
Dan

On Tue, Oct 18, 2011 at 7:42 PM, Andreas Schultz wrote:
> Hi,
>
> We have encountered a strange behavior. After restarting the controller
> process and even after removing and reinserting the datapath module, ARP
> packets are forwarded from dp0 to the real port intermittently.
> tcpdump confirms that the ARP request is seen on dp0, but not on the physical
> port (eth1).
> At the same time the following sequence of log messages appears:
>
> Oct 18 12:22:54|00344|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:24, type=30(ovs_packet), flags=0, seq=0, pid=0(0:0)),genl(cmd=1,version=1)
> Oct 18 12:22:54|00345|dpif|DBG|system@dp0: miss upcall: in_port(0),eth(src=00:23:20:fc:70:43,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.28.0.1,tip=172.28.0.19,op=1,sha=00:23:20:fc:70:43,tha=00:00:00:00:00:00)
> 00:23:20:fc:70:43 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 172.28.0.19 tell 172.28.0.1, length 28
> Oct 18 12:22:54|00346|ofproto_dpif|WARN|bridge dp0: received packet on unknown port 65534
> Oct 18 12:22:54|00347|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:140, type=30(ovs_packet), flags=1[REQUEST], seq=4e9d5835, pid=75500397(2925:18)),genl(cmd=3,version=1)
> Oct 18 12:22:54|00348|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:92, type=29(ovs_flow), flags=5[REQUEST][ACK], seq=4e9d5836, pid=75500397(2925:18)),genl(cmd=1,version=1)
> Oct 18 12:22:54|00349|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:36, type=2(error), flags=0, seq=4e9d5836, pid=75500397(2925:18)) error(0, in-reply-to(nl(len:92, type=29(ovs_flow), flags=5[REQUEST][ACK], seq=4e9d5836, pid=75500397(2925:18
>
> There clearly is something strange going on. How can the dp receive something
> on an unknown port?
>
> Any hints?
>
> Regards
> Andreas
> --
> Dipl. Inform.
> Andreas Schultz
>
> email: a...@travelping.com
> phone: +49-391-819099-224
> mobil: +49-179-7654368
>
> -- managed broadband access --
>
> Travelping GmbH               phone: +49-391-8190990
> Roentgenstr. 13               fax:   +49-391-819099299
> D-39108 Magdeburg             email: i...@travelping.com
> GERMANY                       web:   http://www.travelping.com
>
> Company Registration: HRB21276 Handelsregistergericht Chemnitz
> Geschaeftsfuehrer: Holger Winkelmann | VAT ID No.: DE236673780
Re: [ovs-discuss] n_actions in add_flow() and check_ofp_message_array()
On Tue, Oct 18, 2011 at 04:23:03PM +0300, Ivan Zazulyak wrote:
> So, I am using source code from the head of the WDP branch and have a
> question regarding the number of actions stored in the
> wdp_flow_put::n_actions structure member.

It is a mistake to use the WDP branch.  It is hopelessly out-of-date.
The ideas in the WDP branch are implemented much better in "master", so
please upgrade to "master".

> This member is set in the add_flow() function and described in
> comments as the number of actions in the flow.
> But in fact it is calculated by the check_ofp_message_array()
> function as a number of 8-byte blocks (slots).
>
> An action could occupy several slots, e.g. currently the
> ofp_action_dl_addr action structure requires 2 of them.
>
> Hereby are my conclusions:
> - n_actions should always be treated as the number of slots, not the
> actual number of actions
> - the application is responsible for parsing actions, based on the type
> of action and its length
>
> Is it correct?

Yes, that's common practice in the OVS source tree.
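To illustrate the distinction confirmed above between slots and
actions, here is a minimal sketch of walking an action array by each
action's own length.  It assumes the OpenFlow 1.0 action header layout
(a 16-bit type followed by a 16-bit length in network byte order, with
the length a multiple of 8).  count_actions() and get_be16() are
hypothetical helpers written for this note; they are not functions from
the OVS tree.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE 8  /* Size in bytes of one action slot. */

/* Reads a big-endian 16-bit value from possibly unaligned memory. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t) ((p[0] << 8) | p[1]);
}

/* Counts the actions stored in 'n_slots' 8-byte slots at 'actions'.
 * Returns -1 if an action has a zero length, a length that is not a
 * multiple of 8, or a length that runs past the end of the array. */
int count_actions(const uint8_t *actions, size_t n_slots)
{
    size_t slot = 0;
    int n = 0;

    while (slot < n_slots) {
        /* The length field sits 2 bytes into the action header. */
        uint16_t len = get_be16(actions + slot * SLOT_SIZE + 2);
        size_t slots_used = len / SLOT_SIZE;

        if (len == 0 || len % SLOT_SIZE != 0
            || slots_used > n_slots - slot) {
            return -1;
        }
        slot += slots_used;
        n++;
    }
    return n;
}
```

For example, a 24-byte array holding one 8-byte action followed by one
16-byte action occupies three slots but contains two actions, so a
caller that treated the slot count as an action count would be off by
one.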
[ovs-discuss] n_actions in add_flow() and check_ofp_message_array()
Hi,

I am a little bit new to the Open vSwitch project.  I tried to find the
answer in the discuss archives, but didn't succeed.

So, I am using source code from the head of the WDP branch and have a
question regarding the number of actions stored in the
wdp_flow_put::n_actions structure member.

This member is set in the add_flow() function and described in
comments as the number of actions in the flow.
But in fact it is calculated by the check_ofp_message_array()
function as a number of 8-byte blocks (slots).

An action could occupy several slots, e.g. currently the
ofp_action_dl_addr action structure requires 2 of them.

Hereby are my conclusions:
- n_actions should always be treated as the number of slots, not the
actual number of actions
- the application is responsible for parsing actions, based on the type
of action and its length

Is it correct?

Thanks in advance,
Ivan
[ovs-discuss] intermitting ARP problems on DP interface
Hi,

We have encountered a strange behavior. After restarting the controller
process and even after removing and reinserting the datapath module, ARP
packets are forwarded from dp0 to the real port intermittently.
tcpdump confirms that the ARP request is seen on dp0, but not on the
physical port (eth1).
At the same time the following sequence of log messages appears:

Oct 18 12:22:54|00344|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:24, type=30(ovs_packet), flags=0, seq=0, pid=0(0:0)),genl(cmd=1,version=1)
Oct 18 12:22:54|00345|dpif|DBG|system@dp0: miss upcall: in_port(0),eth(src=00:23:20:fc:70:43,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.28.0.1,tip=172.28.0.19,op=1,sha=00:23:20:fc:70:43,tha=00:00:00:00:00:00)
00:23:20:fc:70:43 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 172.28.0.19 tell 172.28.0.1, length 28
Oct 18 12:22:54|00346|ofproto_dpif|WARN|bridge dp0: received packet on unknown port 65534
Oct 18 12:22:54|00347|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:140, type=30(ovs_packet), flags=1[REQUEST], seq=4e9d5835, pid=75500397(2925:18)),genl(cmd=3,version=1)
Oct 18 12:22:54|00348|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:92, type=29(ovs_flow), flags=5[REQUEST][ACK], seq=4e9d5836, pid=75500397(2925:18)),genl(cmd=1,version=1)
Oct 18 12:22:54|00349|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:36, type=2(error), flags=0, seq=4e9d5836, pid=75500397(2925:18)) error(0, in-reply-to(nl(len:92, type=29(ovs_flow), flags=5[REQUEST][ACK], seq=4e9d5836, pid=75500397(2925:18

There clearly is something strange going on. How can the dp receive
something on an unknown port?

Any hints?

Regards
Andreas
--
Dipl. Inform.
Andreas Schultz

email: a...@travelping.com
phone: +49-391-819099-224
mobil: +49-179-7654368

-- managed broadband access --

Travelping GmbH               phone: +49-391-8190990
Roentgenstr. 13               fax:   +49-391-819099299
D-39108 Magdeburg             email: i...@travelping.com
GERMANY                       web:   http://www.travelping.com

Company Registration: HRB21276 Handelsregistergericht Chemnitz
Geschaeftsfuehrer: Holger Winkelmann | VAT ID No.: DE236673780