On Mar 31, 2013, at 6:26 AM, Andrew Ferguson wrote:

> I'm curious about the background of the openflow.discovery component.  my 
> main question boils down to: is a timeout-based approach actually necessary 
> in a pure OpenFlow network?  was LLDP implemented so non-OpenFlow switches 
> could discover POX-controlled OF switches, or was this based on something you 
> all experienced with a pure OF network?
> 
> as I currently understand things, one could simply use the PortStatus 
> messages as proactive notification that a link has disappeared.  (this 
> assumes that OF switches properly deliver such messages -- if some don't, 
> then indeed, a timeout approach is required, so I'm wondering if anyone has 
> seen such behavior.)  in other words, while I believe PortStatus messages are 
> sufficient, are they also necessary? (if they are, then we could ditch the 
> periodic re-sending of LLDP packets, right?)

This design was inspired by NOX.  I don't think discovery by non-OF switches 
was a major factor.  I think the major reasons are:

1) Port status messages alone don't actually let you discover the topology, 
which is a major point of this module.  The discovery module really does both 
topology discovery and link failure detection; in the mental model behind the 
module, they're kind of the same thing, but other approaches also make sense.  
For example, I think that the existing "I will continually figure out what 
people have plugged together" approach may make sense for enterprise scenarios, 
but for something like a datacenter it's probably more reasonable to either 
load the topology from a file or do topology discovery *once* at startup and 
then assume it's the same forever and all you do is check for link 
failure/recovery.  I just haven't gotten around to making any clean versions of 
alternate approaches to include in POX.

2) This approach works in non-pure-OpenFlow networks.  Indeed, the choice of a 
non-standard Ethernet address for the LLDP messages is specifically to allow 
the controller to see through non-OpenFlow switches.

3) The relevant port state is "link down", which we can assume is probably tied 
to LIT/link pulse/whatever, but there are ways that a link can fail which 
aren't caught by these -- though admittedly, many more in mixed networks.  A 
more recent but related concern is ... for a virtual port that represents a 
tunnel (as is common in the important use case of an OpenFlow overlay network 
-- a different sort of mixed network), do you get link downs?  I don't actually 
know, but I expect the answer is "not always" at best.

> relatedly, I noticed that while POX's openflow.discovery proactively deletes 
> links in response to a switch's ConnectionDown, the PortStatus events are 
> only used to remove ports from the list of ports out of which to send LLDP 
> messages on each cycle.  what was the reasoning behind this design?

No real rationale there.  The POX design followed the broad strokes of the NOX 
design (diverging a bit), and I expect NOX didn't do it so neither did POX.  I 
actually noticed this myself when I did some work on discovery relatively 
recently but didn't do anything about it (the refactoring moved the port status 
handlers to another class which made it obvious that this wasn't being done).  
I think it'd be a reasonable addition, but (in my opinion) it's a fairly minor 
optimization -- under some conditions you wouldn't have to wait some fraction 
of the discovery cycle time to notice a link had been removed/downed.  Of 
course, if the switch-provided link state was assumed to be sufficient, this 
would be a different story, but that motivates an entirely different design of 
the discovery component anyway.
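
For illustration, a minimal sketch of what that addition might look like, 
assuming links are kept as (dpid1, port1, dpid2, port2) tuples in a set.  The 
names here are hypothetical and are not taken from the actual discovery code:

```python
from typing import Set, Tuple

# Hypothetical link representation: (dpid1, port1, dpid2, port2)
Link = Tuple[int, int, int, int]


def links_on_port(links: Set[Link], dpid: int, port: int) -> Set[Link]:
    """Return every known link that has (dpid, port) as one of its ends."""
    return {
        link for link in links
        if (link[0], link[1]) == (dpid, port)
        or (link[2], link[3]) == (dpid, port)
    }


def handle_port_down(links: Set[Link], dpid: int, port: int) -> Set[Link]:
    """Remove links touching a downed port right away, rather than
    waiting for them to time out.  Returns the links that were removed."""
    dead = links_on_port(links, dpid, port)
    links -= dead  # mutates the caller's set in place
    return dead
```

Links removed this way would simply be rediscovered by the normal LLDP cycle 
once the port comes back up.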

> anyway, this is the discovery approach I currently use in my own controller, 
> and I'm wondering if I'll need the timeout-based approach in some scenario I 
> haven't encountered yet.

In general, I think you need to probe and timeout for reliable detection.  But 
as in so many things, I think there are probably multiple designs that make 
sense for different use cases and environments.
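
To make "probe and timeout" concrete, here is a rough sketch of the 
bookkeeping involved, independent of any controller.  The class name, the link 
tuple layout, and the 10-second default are all assumptions made for the 
example, not what POX actually uses:

```python
import time
from typing import Dict, List, Optional, Tuple

# Hypothetical link representation: (src_dpid, src_port, dst_dpid, dst_port)
Link = Tuple[int, int, int, int]


class LinkTable:
    """Tracks links by the last time a probe was seen crossing them."""

    def __init__(self, timeout: float = 10.0) -> None:
        self.timeout = timeout  # seconds without a probe before a link expires
        self._last_seen: Dict[Link, float] = {}

    def probe_received(self, link: Link, now: Optional[float] = None) -> bool:
        """Record a probe arriving over `link`.  Returns True if the link is new."""
        now = time.monotonic() if now is None else now
        is_new = link not in self._last_seen
        self._last_seen[link] = now
        return is_new

    def expire(self, now: Optional[float] = None) -> List[Link]:
        """Drop links not refreshed within the timeout and return them."""
        now = time.monotonic() if now is None else now
        dead = [link for link, seen in self._last_seen.items()
                if now - seen > self.timeout]
        for link in dead:
            del self._last_seen[link]
        return dead

    def links(self) -> List[Link]:
        """Return the links currently believed to be up."""
        return sorted(self._last_seen)
```

A controller would call probe_received() whenever one of its probes shows up 
at another switch, and call expire() on a timer.  A failure that never produces 
a port status message still gets noticed, just no faster than the timeout.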

-- Murphy
