Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow

2008-07-22 Thread Glen Gibb
Ben Pfaff wrote:
> Glen Gibb <[EMAIL PROTECTED]> writes:
>
>   
>> Ben Pfaff wrote:
>> 
>>> Glen Gibb <[EMAIL PROTECTED]> writes:
>>>
>>>   
>>>   
>>>> I've got another issue here, this time to do with flow
>>>> establishment. I'm trying to ping from one host to another through
>>>> a bunch of OF switches -- this succeeds if the OF switches are
>>>> brought up before NOX but fails if the OF switches are brought up
>>>> after NOX.
>>>>
>>> Are the switches using in-band or out-of-band control?  I am
>>> aware of a bug in the in-band control code that might cause these
>>> symptoms.  I now have a fix that is under review.  I've pushed it
>>> to the "arp" branch on nicira.dyndns.org and on
>>> yuba.stanford.edu:/usr/local/git/openflow.  The commit messages
>>> should make it clear what it fixes.  Let me know if it makes a
>>> difference for you.
>>>   
>>>   
>> Nope, this is out-of-band control. Is the fix likely to do anything in
>> this scenario?
>> 
>
> That particular fix will not have any effect with out-of-band
> control.
>
> However, the following commit could have fixed the problem.
> Please make sure that it is included in the copy of NOX that you
> are running.
>
> commit d8226dd2e8305cd59417fa004c70ac20e2169720
> Author: Ben Pfaff <[EMAIL PROTECTED]>
> Date:   Tue Jul 15 15:48:31 2008 -0700
>
> Keep time from standing still.
> 
> do_gettimeofday(true) needs to be called on a reasonably frequent
> basis, or the time as returned by do_gettimeofday(false) will never
> be updated.  In this case, the symptoms were that if NOX was running
> for over 5 seconds without anything to do, then switches that attempted
> to connect would instantly be timed out, because Handshake_fsm has a
> 5-second timeout and time wasn't getting updated.
>   
Hi Ben,

I'm afraid the above commit didn't fix the problem. We still need to 
kill and restart NOX if a switch comes up after NOX.

Glen

___
nox-dev mailing list
nox-dev@noxrepo.org
http://noxrepo.org/mailman/listinfo/nox-dev_noxrepo.org


Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow

2008-07-16 Thread Ben Pfaff
Glen Gibb <[EMAIL PROTECTED]> writes:

> Ben Pfaff wrote:
>> Glen Gibb <[EMAIL PROTECTED]> writes:
>>
>>   
>>> I've got another issue here, this time to do with flow
>>> establishment. I'm trying to ping from one host to another through
>>> a bunch of OF switches -- this succeeds if the OF switches are
>>> brought up before NOX but fails if the OF switches are brought up
>>> after NOX.
>>> 
>>
>> Are the switches using in-band or out-of-band control?  I am
>> aware of a bug in the in-band control code that might cause these
>> symptoms.  I now have a fix that is under review.  I've pushed it
>> to the "arp" branch on nicira.dyndns.org and on
>> yuba.stanford.edu:/usr/local/git/openflow.  The commit messages
>> should make it clear what it fixes.  Let me know if it makes a
>> difference for you.
>>   
> Nope, this is out-of-band control. Is the fix likely to do anything in
> this scenario?

That particular fix will not have any effect with out-of-band
control.

However, the following commit could have fixed the problem.
Please make sure that it is included in the copy of NOX that you
are running.

commit d8226dd2e8305cd59417fa004c70ac20e2169720
Author: Ben Pfaff <[EMAIL PROTECTED]>
Date:   Tue Jul 15 15:48:31 2008 -0700

Keep time from standing still.

do_gettimeofday(true) needs to be called on a reasonably frequent
basis, or the time as returned by do_gettimeofday(false) will never
be updated.  In this case, the symptoms were that if NOX was running
for over 5 seconds without anything to do, then switches that attempted
to connect would instantly be timed out, because Handshake_fsm has a
5-second timeout and time wasn't getting updated.

diff --git a/src/lib/threads/impl.cc b/src/lib/threads/impl.cc
index c8d7c25..7caa017 100644
--- a/src/lib/threads/impl.cc
+++ b/src/lib/threads/impl.cc
@@ -502,6 +502,7 @@ co_poll(void)
 process_poll_results(n_events);
 }
 }
+do_gettimeofday(true);
 }
 
 /* Migrates the running thread to the specified 'new' thread group.  If 'new'
@@ -1320,6 +1321,7 @@ thread_main(void *thread_)
 }
 }
 
+do_gettimeofday(true);
 thread->run();
 if (thread->completion) {
 thread->completion->release();
@@ -1410,6 +1412,7 @@ join_coop_group(struct co_group *group)
 pthread_mutex_unlock(&group->mutex);
 sem_wait(&thread->sched_sem);
 reschedule_while_needed();
+do_gettimeofday(true);
 }
 }
 
@@ -1574,7 +1577,7 @@ do_schedule()
 next->flags &= ~COTF_READY;
 if (next == thread) {
 pthread_mutex_unlock(&group->mutex);
-return;
+break;
 }
 
 if (!(next->flags & COTF_FSM)) {
@@ -1584,12 +1587,13 @@ do_schedule()
 
 /* Wait until we're scheduled again. */
 sem_wait(&thread->sched_sem);
-return;
+break;
 } else {
 pthread_mutex_unlock(&group->mutex);
 co_fsm_run(next);
 }
 }
+do_gettimeofday(true);
 }
 
 static void
@@ -1608,6 +1612,8 @@ run_fsm(struct co_thread *fsm)
 
 cancel_events(fsm);
 
+do_gettimeofday(true);
+
 set_self(fsm);
 co_enter_critical_section();
 int save_errno = errno;
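
To illustrate the stale-clock failure mode that the commit message
describes, here is a minimal sketch. It is not NOX code: CachedClock,
handshake_times_out and the refresh_on_wakeup flag are hypothetical names,
and the way the deadline is derived from the cached time is an assumption
about how a 5-second handshake timeout could expire instantly. Real time is
passed in as plain seconds so the behaviour is deterministic.

```cpp
#include <cstdint>

// Hypothetical model of a cached clock (not the NOX implementation):
// read(true, t) stores t and returns it; read(false, t) ignores t and
// returns whatever was stored last.
struct CachedClock {
    std::int64_t cached_sec = 0;

    std::int64_t read(bool refresh, std::int64_t real_sec) {
        if (refresh) {
            cached_sec = real_sec;
        }
        return cached_sec;
    }
};

// The 5-second figure comes from the commit message.
constexpr std::int64_t kHandshakeTimeoutSec = 5;

// Returns true if a handshake that starts at real time connect_sec is
// already past its deadline when the clock is next refreshed at check_sec.
// refresh_on_wakeup models whether the scheduler refreshes the cached time
// before the handshake computes its deadline (assumed mechanism).
bool handshake_times_out(CachedClock& clock,
                         std::int64_t connect_sec,
                         std::int64_t check_sec,
                         bool refresh_on_wakeup) {
    std::int64_t start = clock.read(refresh_on_wakeup, connect_sec);
    std::int64_t deadline = start + kHandshakeTimeoutSec;
    std::int64_t now = clock.read(true, check_sec);
    return now > deadline;
}
```

With a clock last refreshed at t=100 and a switch connecting at t=110,
handshake_times_out(clock, 110, 111, false) reports a timeout because the
deadline is computed from the stale value. The same call on a fresh clock
with refresh_on_wakeup set to true does not.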

-- 
Ben Pfaff
Nicira Networks, Inc.



Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow

2008-07-16 Thread Glen Gibb
Ben Pfaff wrote:
> Glen Gibb <[EMAIL PROTECTED]> writes:
>
>   
>> I've got another issue here, this time to do with flow establishment. 
>> I'm trying to ping from one host to another through a bunch of OF 
>> switches -- this succeeds if the OF switches are brought up before NOX 
>> but fails if the OF switches are brought up after NOX.
>> 
>
> Are the switches using in-band or out-of-band control?  I am
> aware of a bug in the in-band control code that might cause these
> symptoms.  I now have a fix that is under review.  I've pushed it
> to the "arp" branch on nicira.dyndns.org and on
> yuba.stanford.edu:/usr/local/git/openflow.  The commit messages
> should make it clear what it fixes.  Let me know if it makes a
> difference for you.
>   
Nope, this is out-of-band control. Is the fix likely to do anything in 
this scenario?





Re: [nox-dev] Flow establishment problem? NOX and/or OpenFlow

2008-07-16 Thread Ben Pfaff
Glen Gibb <[EMAIL PROTECTED]> writes:

> I've got another issue here, this time to do with flow establishment. 
> I'm trying to ping from one host to another through a bunch of OF 
> switches -- this succeeds if the OF switches are brought up before NOX 
> but fails if the OF switches are brought up after NOX.

Are the switches using in-band or out-of-band control?  I am
aware of a bug in the in-band control code that might cause these
symptoms.  I now have a fix that is under review.  I've pushed it
to the "arp" branch on nicira.dyndns.org and on
yuba.stanford.edu:/usr/local/git/openflow.  The commit messages
should make it clear what it fixes.  Let me know if it makes a
difference for you.
-- 
Ben Pfaff
Nicira Networks, Inc.



[nox-dev] Flow establishment problem? NOX and/or OpenFlow

2008-07-16 Thread Glen Gibb
Hi all,

I've got another issue here, this time to do with flow establishment. 
I'm trying to ping from one host to another through a bunch of OF 
switches -- this succeeds if the OF switches are brought up before NOX 
but fails if the OF switches are brought up after NOX.


The details are as follows:

Network setup:
Three OF switches (mvm-ofroot, mvm-of1, mvm-of2), two "hosts" (mvm-17, 
mvm-18), connected linearly as follows:

mvm-17 <--> mvm-ofroot <--> mvm-of1 <--> mvm-ap1 <--> mvm-18

Note: mvm-18 is actually the of0 interface on a noxbox install on mvm-ap1


I'm trying to ping from mvm-18 to mvm-17.


If I start the NOX controller before bringing up the OF switches the 
ping fails. The ICMP echo requests fail to reach the destination 
(mvm-17) although ARP packets seem to get through.
If I start the OF switches and then bring up the NOX controller the ping 
succeeds.


To aid in debugging I turned on NOX logging, ran a packet capture on the 
NOX controller, and ran dpctl on each of the OF switches. The file 
http://yuba.stanford.edu/~grg/ping_issue.tgz contains the results of 
these processes. The ping_nox_first.* files are from running NOX before 
starting the OpenFlow switches; the ping_nox_last.* files are from the 
opposite order.

In ping_nox_first.dump, you can see the ARP requests/replies in packets 
278 to 299 (including packet in/packet out messages). Packet 301 is the 
echo request from mvm-ap1, 302 is a flow mod, and 304 is the 
corresponding packet out. Then in 306 we have an echo request packet in 
from mvm-of1, followed by a flow mod to of1, but there is no 
corresponding packet out.

Contrasting with ping_nox_last.dump, we see the ARP request/replies in 
packets 215 to 235. Packet 237 is the echo request packet in from ap1, 
238, 240 and 242 are flow mods to the three OFs, and 244 is a packet 
out. We don't see a packet in from of1. We then see a packet in for the 
echo request from ofroot at 246, with flow mods at 247, 248 and 250, 
followed by the packet out at 253.
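
The comparison above amounts to checking that every packet in is answered
by a packet out. A minimal sketch of that check follows, assuming the two
traces are transcribed by hand from the packet numbers quoted in this
message. Event, Kind, unanswered_packet_ins and the two trace functions are
hypothetical names, not part of NOX or any OpenFlow tool, and the flow mod
after packet 306 is given the number -1 because the message does not state
its packet number.

```cpp
#include <vector>

enum class Kind { PacketIn, FlowMod, PacketOut };

struct Event {
    int number;  // packet number in the capture, or -1 if none was given
    Kind kind;
};

// Returns the numbers of packet-ins that see no packet-out before the
// next packet-in (or before the end of the trace).
std::vector<int> unanswered_packet_ins(const std::vector<Event>& events) {
    std::vector<int> unanswered;
    bool has_pending = false;
    int pending = -1;
    for (const Event& e : events) {
        if (e.kind == Kind::PacketIn) {
            if (has_pending) {
                unanswered.push_back(pending);
            }
            pending = e.number;
            has_pending = true;
        } else if (e.kind == Kind::PacketOut) {
            has_pending = false;
        }
    }
    if (has_pending) {
        unanswered.push_back(pending);
    }
    return unanswered;
}

// Echo-request portion of ping_nox_first.dump as described in the message.
std::vector<Event> nox_first_trace() {
    return {{301, Kind::PacketIn},
            {302, Kind::FlowMod},
            {304, Kind::PacketOut},
            {306, Kind::PacketIn},
            {-1, Kind::FlowMod}};  // packet number not given in the message
}

// Echo-request portion of ping_nox_last.dump as described in the message.
std::vector<Event> nox_last_trace() {
    return {{237, Kind::PacketIn},
            {238, Kind::FlowMod},
            {240, Kind::FlowMod},
            {242, Kind::FlowMod},
            {244, Kind::PacketOut},
            {246, Kind::PacketIn},
            {247, Kind::FlowMod},
            {248, Kind::FlowMod},
            {250, Kind::FlowMod},
            {253, Kind::PacketOut}};
}
```

Calling unanswered_packet_ins(nox_first_trace()) yields the single entry
306, and the ping_nox_last trace yields an empty list, which matches the
difference described above.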


Hopefully this has given you plenty of information to help track down 
the problem. Let me know if there are other things I can do to help.

Glen

