Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow
Ben Pfaff wrote:
> Glen Gibb <[EMAIL PROTECTED]> writes:
>
>> Ben Pfaff wrote:
>>
>>> Glen Gibb <[EMAIL PROTECTED]> writes:
>>>
>>>> I've got another issue here, this time to do with flow
>>>> establishment. I'm trying to ping from one host to another through
>>>> a bunch of OF switches -- this succeeds if the OF switches are
>>>> brought up before NOX but fails if the OF switches are brought up
>>>> after NOX.
>>>
>>> Are the switches using in-band or out-of-band control? I am
>>> aware of a bug in the in-band control code that might cause these
>>> symptoms. I now have a fix that is under review. I've pushed it
>>> to the "arp" branch on nicira.dyndns.org and on
>>> yuba.stanford.edu:/usr/local/git/openflow. The commit messages
>>> should make it clear what it fixes. Let me know if it makes a
>>> difference for you.
>>
>> Nope, this is out-of-band control. Is the fix likely to do anything in
>> this scenario?
>
> That particular fix will not have any effect with out-of-band
> control.
>
> However, the following commit could have fixed the problem.
> Please make sure that it is included in the copy of NOX that you
> are running.
>
> commit d8226dd2e8305cd59417fa004c70ac20e2169720
> Author: Ben Pfaff <[EMAIL PROTECTED]>
> Date: Tue Jul 15 15:48:31 2008 -0700
>
>     Keep time from standing still.
>
>     do_gettimeofday(true) needs to be called on a reasonably frequent
>     basis, or the time as returned by do_gettimeofday(false) will never
>     be updated. In this case, the symptoms were that if NOX was running
>     for over 5 seconds without anything to do, then switches that attempted
>     to connect would instantly be timed out, because Handshake_fsm has a
>     5-second timeout and time wasn't getting updated.

Hi Ben,

I'm afraid the above commit didn't fix the problem. We still need to kill
and restart NOX if a switch comes up after NOX.

Glen
___
nox-dev mailing list
nox-dev@noxrepo.org
http://noxrepo.org/mailman/listinfo/nox-dev_noxrepo.org
Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow
Glen Gibb <[EMAIL PROTECTED]> writes:

> Ben Pfaff wrote:
>> Glen Gibb <[EMAIL PROTECTED]> writes:
>>
>>> I've got another issue here, this time to do with flow
>>> establishment. I'm trying to ping from one host to another through
>>> a bunch of OF switches -- this succeeds if the OF switches are
>>> brought up before NOX but fails if the OF switches are brought up
>>> after NOX.
>>
>> Are the switches using in-band or out-of-band control? I am
>> aware of a bug in the in-band control code that might cause these
>> symptoms. I now have a fix that is under review. I've pushed it
>> to the "arp" branch on nicira.dyndns.org and on
>> yuba.stanford.edu:/usr/local/git/openflow. The commit messages
>> should make it clear what it fixes. Let me know if it makes a
>> difference for you.
>
> Nope, this is out-of-band control. Is the fix likely to do anything in
> this scenario?

That particular fix will not have any effect with out-of-band
control.

However, the following commit could have fixed the problem.
Please make sure that it is included in the copy of NOX that you
are running.

commit d8226dd2e8305cd59417fa004c70ac20e2169720
Author: Ben Pfaff <[EMAIL PROTECTED]>
Date: Tue Jul 15 15:48:31 2008 -0700

    Keep time from standing still.

    do_gettimeofday(true) needs to be called on a reasonably frequent
    basis, or the time as returned by do_gettimeofday(false) will never
    be updated. In this case, the symptoms were that if NOX was running
    for over 5 seconds without anything to do, then switches that attempted
    to connect would instantly be timed out, because Handshake_fsm has a
    5-second timeout and time wasn't getting updated.

diff --git a/src/lib/threads/impl.cc b/src/lib/threads/impl.cc
index c8d7c25..7caa017 100644
--- a/src/lib/threads/impl.cc
+++ b/src/lib/threads/impl.cc
@@ -502,6 +502,7 @@ co_poll(void)
             process_poll_results(n_events);
         }
     }
+    do_gettimeofday(true);
 }
 
 /* Migrates the running thread to the specified 'new' thread group. If 'new'
@@ -1320,6 +1321,7 @@ thread_main(void *thread_)
         }
     }
 
+    do_gettimeofday(true);
     thread->run();
     if (thread->completion) {
         thread->completion->release();
@@ -1410,6 +1412,7 @@ join_coop_group(struct co_group *group)
         pthread_mutex_unlock(&group->mutex);
         sem_wait(&thread->sched_sem);
         reschedule_while_needed();
+        do_gettimeofday(true);
     }
 }
 
@@ -1574,7 +1577,7 @@ do_schedule()
         next->flags &= ~COTF_READY;
         if (next == thread) {
             pthread_mutex_unlock(&group->mutex);
-            return;
+            break;
         }
 
         if (!(next->flags & COTF_FSM)) {
@@ -1584,12 +1587,13 @@ do_schedule()
 
             /* Wait until we're scheduled again. */
             sem_wait(&thread->sched_sem);
-            return;
+            break;
         } else {
             pthread_mutex_unlock(&group->mutex);
             co_fsm_run(next);
         }
     }
+    do_gettimeofday(true);
 }
 
 static void
@@ -1608,6 +1612,8 @@ run_fsm(struct co_thread *fsm)
 
     cancel_events(fsm);
 
+    do_gettimeofday(true);
+
     set_self(fsm);
     co_enter_critical_section();
     int save_errno = errno;

-- 
Ben Pfaff
Nicira Networks, Inc.
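The commit message above describes a cached clock that only advances when it is explicitly refreshed. The following is a minimal sketch of how such a clock can make a 5-second timer expire at once; it is not NOX code. CachedClock, its fields, and handshake_times_out_immediately are hypothetical names, and the mechanism shown (arming the deadline from a stale cached value, then refreshing) is an assumption about how the symptom could arise, not something the commit message spells out.

```cpp
#include <cassert>

// Hypothetical model of a cached clock; not NOX's implementation.
struct CachedClock {
    long real_now = 0;    // stand-in for the OS clock, in seconds
    long cached_now = 0;  // last value copied from real_now

    // Same shape as do_gettimeofday(bool) in the commit message:
    // refresh == true re-reads the clock, false returns the cached value.
    long get(bool refresh) {
        if (refresh) {
            cached_now = real_now;
        }
        return cached_now;
    }
};

// Returns true if a timer armed after 'idle_seconds' of inactivity is
// already expired the next time the clock is refreshed.
// 'refresh_before_arming' models the extra refresh calls the commit adds.
bool handshake_times_out_immediately(long idle_seconds, long timeout,
                                     bool refresh_before_arming) {
    assert(idle_seconds >= 0 && timeout > 0);

    CachedClock clock;
    clock.real_now += idle_seconds;  // controller idles; nothing refreshes

    if (refresh_before_arming) {
        clock.get(true);
    }

    // Arm the timer from the cached (possibly stale) time.
    long deadline = clock.get(false) + timeout;

    // The next poll refreshes the clock and checks the deadline.
    long now = clock.get(true);
    return now > deadline;
}
```

With these assumptions, handshake_times_out_immediately(6, 5, false) reports an immediate timeout, while the same call with refresh_before_arming set to true does not, which matches the "over 5 seconds without anything to do" condition in the commit message.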
Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow
Ben Pfaff wrote:
> Glen Gibb <[EMAIL PROTECTED]> writes:
>
>> I've got another issue here, this time to do with flow establishment.
>> I'm trying to ping from one host to another through a bunch of OF
>> switches -- this succeeds if the OF switches are brought up before NOX
>> but fails if the OF switches are brought up after NOX.
>
> Are the switches using in-band or out-of-band control? I am
> aware of a bug in the in-band control code that might cause these
> symptoms. I now have a fix that is under review. I've pushed it
> to the "arp" branch on nicira.dyndns.org and on
> yuba.stanford.edu:/usr/local/git/openflow. The commit messages
> should make it clear what it fixes. Let me know if it makes a
> difference for you.

Nope, this is out-of-band control. Is the fix likely to do anything in
this scenario?
Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow
Glen Gibb <[EMAIL PROTECTED]> writes:

> I've got another issue here, this time to do with flow establishment.
> I'm trying to ping from one host to another through a bunch of OF
> switches -- this succeeds if the OF switches are brought up before NOX
> but fails if the OF switches are brought up after NOX.

Are the switches using in-band or out-of-band control? I am
aware of a bug in the in-band control code that might cause these
symptoms. I now have a fix that is under review. I've pushed it
to the "arp" branch on nicira.dyndns.org and on
yuba.stanford.edu:/usr/local/git/openflow. The commit messages
should make it clear what it fixes. Let me know if it makes a
difference for you.

-- 
Ben Pfaff
Nicira Networks, Inc.
[nox-dev] Flow establishment problem? NOX and/or OpenFlow
Hi all,

I've got another issue here, this time to do with flow establishment.
I'm trying to ping from one host to another through a bunch of OF
switches -- this succeeds if the OF switches are brought up before NOX
but fails if the OF switches are brought up after NOX.

The details are as follows:

Network setup:
Three OF switches (mvm-ofroot, mvm-of1, mvm-of2), two "hosts" (mvm-17,
mvm-18), connected linearly as follows:

    mvm-17 <-> mvm-ofroot <-> mvm-of1 <-> mvm-ap1 <-> mvm-18

Note: mvm-18 is actually the of0 interface on a noxbox install on
mvm-ap1.

I'm trying to ping from mvm-18 to mvm-17. If I start the NOX controller
before bringing up the OF switches, the ping fails. The ICMP echo
requests fail to reach the destination (mvm-17), although ARP packets
seem to get through. If I start the OF switches and then bring up the
NOX controller, the ping succeeds.

To aid in debugging I turned on NOX logging, ran a packet capture on the
NOX controller, and ran dpctl on each of the OF switches. The file
http://yuba.stanford.edu/~grg/ping_issue.tgz contains the results of
these processes. The ping_nox_first.* files were captured with NOX
started before the OF switches; the ping_nox_last.* files are the
opposite.

In ping_nox_first.dump, you can see the ARP requests/replies in packets
278 to 299 (including packet in/packet out messages). Packet 301 is the
echo request packet in from mvm-ap1, 302 is a flow mod, and 304 is the
corresponding packet out. Then in 306 we have an echo request packet in
from mvm-of1, followed by a flow mod to of1, but there is no
corresponding packet out.

Contrasting with ping_nox_last.dump, we see the ARP requests/replies in
packets 215 to 235. Packet 237 is the echo request packet in from ap1;
238, 240 and 242 are flow mods to the three OFs; and 244 is a packet
out. We don't see a packet in from of1. We then see a packet in for the
echo request from ofroot at 246, with flow mods at 247, 248 and 250,
followed by the packet out at 253.

Hopefully this has given you plenty of information to help track down
the problem. Let me know if there are other things I can do to help.

Glen
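The comparison of the two captures above amounts to checking that every packet in is answered by a packet out. A rough sketch of that check follows, using only the packet numbers quoted in the message. The type and function names are hypothetical, the ARP exchanges are left out, and the flow mod that follows packet 306 is omitted because the message gives no packet number for it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified view of the controller-side capture.
enum class MsgType { PacketIn, FlowMod, PacketOut };

struct Msg {
    int number;    // packet number in the capture
    MsgType type;
};

// Returns the capture numbers of packet-ins that are not followed by a
// packet-out before the next packet-in (or before the trace ends).
std::vector<int> unanswered_packet_ins(const std::vector<Msg>& trace) {
    std::vector<int> unanswered;
    int pending = -1;  // packet-in still waiting for its packet-out
    for (const Msg& m : trace) {
        assert(m.number >= 0);
        switch (m.type) {
        case MsgType::PacketIn:
            if (pending >= 0) {
                unanswered.push_back(pending);
            }
            pending = m.number;
            break;
        case MsgType::PacketOut:
            pending = -1;
            break;
        case MsgType::FlowMod:
            break;  // flow mods do not answer a packet-in
        }
    }
    if (pending >= 0) {
        unanswered.push_back(pending);
    }
    return unanswered;
}

// ICMP portion of ping_nox_first.dump as described in the message.
std::vector<Msg> nox_first_trace() {
    return {{301, MsgType::PacketIn},
            {302, MsgType::FlowMod},
            {304, MsgType::PacketOut},
            {306, MsgType::PacketIn}};
}

// ICMP portion of ping_nox_last.dump as described in the message.
std::vector<Msg> nox_last_trace() {
    return {{237, MsgType::PacketIn}, {238, MsgType::FlowMod},
            {240, MsgType::FlowMod},  {242, MsgType::FlowMod},
            {244, MsgType::PacketOut}, {246, MsgType::PacketIn},
            {247, MsgType::FlowMod},  {248, MsgType::FlowMod},
            {250, MsgType::FlowMod},  {253, MsgType::PacketOut}};
}
```

Run against these two traces, the check flags packet 306 in the NOX-first capture and nothing in the NOX-last capture, which is the asymmetry described above.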