Hi,

On 26/06/2009 09:24, line.ho...@sun.com wrote:
>> Moreover, I gave it some thoughts and I think there is a flaw in this 
>> approach.
>> For example
>>
>>
>>  L1-1    L1-2
>>   |       | \
>>   |       |  ----
>>   |       |      \
>>  _|___|___|_      L2
>>  | QNEM    |     |\\
>>  -----------     |\\\
>>   ////  \\\\     |\\\\
>> N[1-12] N[13-24] M[1-18]
>>
> 
> Nicolas, you are right. In this example you would not have
> full connectivity, N1-12 would not reach M1-18 because that
> would represent a down-up-down path.
> Your example represents a highly degraded system with no 
> common core switch for L2 and the leftmost switch in the QNEM.
> In this case, full connectivity could be reached with alternative
> routing algorithms but that should be an administrative decision.
> 

It's not necessarily degraded. I used switches and many nodes as an example, but 
you could just as easily put a service node (admin?) directly under the L1-2 switch 
and have exactly the same problem.
Connecting a whole cluster like this would be stupid, but adding I/O or service 
nodes this way makes sense.

> Handling the horizontal links as both up and down would give
> full connectivity in your example. But there are other examples
> that would introduce deadlock situations with your approach.
> 
> 
> Level 1    x-1          y-1          
>             | \        /  |
>             |  \      /   |
> Level 2    x-2 x-3  y-2  y-3
>             |    \  /     |
>             |     \/      |
>             |     /\      |
>             |    /  \     |
> leafs       A -- B   C -- D
> 
> Clarifications:
>  x1-3 is a subset of switches in a core ftree
>  y1-3 is a subset of switches in another core ftree
>  A+B are two interconnected leafs (or a QNEM if you like)
>  C+D are also two interconnected leafs
>  B is connected to y-2, and C to x-3
>  A, B, C and D all have CAs connected
> 
> If you are allowed to use the horizontal link both on your way up
> and on your way down you could get the following path scenario
> all with valid shortest path:
> 
>   A to D : A --> x-2 --> x-1 --> x-3 --> C   --> D   (horizontal = DOWN)
>   C to B : C --> D   --> y-3 --> y-1 --> y-2 --> B   (horizontal = UP)
>   D to A : D --> y-3 --> y-1 --> y-2 --> B   --> A   (horizontal = DOWN)
>   B to C : B --> A   --> x-2 --> x-1 --> x-3 --> C   (horizontal = UP)
> 
> Voila - you have potential deadlock for CN to CN communication.
> The deadlock is removed if you only allow to use the horizontal
> link in one direction, either up or down.
> 

It won't route this way if you optimize things.
You have to consider that the horizontal links can be taken both on the way up 
and on the way down, but that they belong to neither the up port groups nor the 
down port groups.
By exploring in the right order (up before horizontal, and down before 
horizontal) you can avoid that.
Basically, if possible, you'll always use the horizontal link last (in the 
algorithm, which means first on the route).
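
To make the exploration order concrete, here is a minimal Python sketch on the
four-leaf example quoted above. It is not the OpenSM ftree code: the names UP,
DOWN, HORIZONTAL, next_hops and route are hypothetical, and the topology
encoding is an assumption read off the diagram. Routes are computed backwards
from the destination, vertical links first and horizontal links last.

```python
from collections import deque

# Assumed encoding of the quoted example: child -> parents (up links).
UP = {
    "A": ["x-2"], "B": ["y-2"], "C": ["x-3"], "D": ["y-3"],
    "x-2": ["x-1"], "x-3": ["x-1"], "y-2": ["y-1"], "y-3": ["y-1"],
    "x-1": [], "y-1": [],
}
# Horizontal links are kept apart from the up and down port groups.
HORIZONTAL = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}
# Down links are the up links reversed.
DOWN = {node: [] for node in UP}
for child, parents in UP.items():
    for parent in parents:
        DOWN[parent].append(child)


def next_hops(dest):
    """Return {switch: next hop toward dest}, exploring backwards from dest."""
    hop = {}
    # Phase 1: climb from dest; every ancestor reaches dest by going down.
    reached = [dest]
    frontier = deque([dest])
    while frontier:
        node = frontier.popleft()
        for parent in UP[node]:
            if parent != dest and parent not in hop:
                hop[parent] = node
                reached.append(parent)
                frontier.append(parent)
    # Phase 2: descend from everything reached; those switches go up.
    frontier = deque(reached)
    while frontier:
        node = frontier.popleft()
        for child in DOWN[node]:
            if child != dest and child not in hop:
                hop[child] = node
                frontier.append(child)
    # Phase 3: horizontal links last, only for leaves still without a route,
    # and only toward a peer that already has a purely vertical route.
    vertical = set(hop) | {dest}
    for node, peers in HORIZONTAL.items():
        if node in vertical:
            continue
        for peer in peers:
            if peer in vertical:
                hop[node] = peer
                break
    return hop


def route(src, dest):
    """Follow the next-hop table from src to dest."""
    hop = next_hops(dest)
    path = [src]
    while path[-1] != dest:
        path.append(hop[path[-1]])
    return path


for src, dest in [("A", "D"), ("B", "C"), ("C", "B"), ("D", "A")]:
    print(src, "to", dest, ":", " --> ".join(route(src, dest)))
```

Because the horizontal hop is assigned in the last phase of the backward
exploration, it can only ever appear as the first hop of a route.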

So here we would have:
A to D: A --> B --> y-2 --> y-1 --> y-3 --> D
B to C: B --> A --> x-2 --> x-1 --> x-3 --> C
C to B: C --> D --> y-3 --> y-1 --> y-2 --> B
D to A: D --> C --> x-3 --> x-1 --> x-2 --> A

Which is basically what you obtain with your patch.
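
One way to compare the two sets of routes is to build their channel dependency
graph and look for a cycle, since a cycle there is what makes a deadlock
possible. The Python sketch below does this for the four routes from your mail
and for the four routes above. The names BOTH_WAYS, HORIZONTAL_FIRST,
channel_dependencies and has_cycle are hypothetical, and the model is
simplified (one channel per directed link, no virtual lanes).

```python
# Routes where the horizontal link is used both going up and going down.
BOTH_WAYS = [
    ["A", "x-2", "x-1", "x-3", "C", "D"],
    ["C", "D", "y-3", "y-1", "y-2", "B"],
    ["D", "y-3", "y-1", "y-2", "B", "A"],
    ["B", "A", "x-2", "x-1", "x-3", "C"],
]
# Routes where the horizontal link, if used, is always the first hop.
HORIZONTAL_FIRST = [
    ["A", "B", "y-2", "y-1", "y-3", "D"],
    ["B", "A", "x-2", "x-1", "x-3", "C"],
    ["C", "D", "y-3", "y-1", "y-2", "B"],
    ["D", "C", "x-3", "x-1", "x-2", "A"],
]


def channel_dependencies(routes):
    """Map each directed link to the links a packet may request while holding it."""
    deps = {}
    for path in routes:
        channels = list(zip(path, path[1:]))
        for held, wanted in zip(channels, channels[1:]):
            deps.setdefault(held, set()).add(wanted)
            deps.setdefault(wanted, set())
    return deps


def has_cycle(deps):
    """Depth-first search for a cycle in the dependency graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {channel: WHITE for channel in deps}

    def visit(channel):
        color[channel] = GREY
        for nxt in deps[channel]:
            if color[nxt] == GREY:
                return True  # back edge: the dependencies close a loop
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[channel] = BLACK
        return False

    return any(color[c] == WHITE and visit(c) for c in deps)


print("both ways:", has_cycle(channel_dependencies(BOTH_WAYS)))
print("horizontal first:", has_cycle(channel_dependencies(HORIZONTAL_FIRST)))
```

In the first set the dependencies close a loop through C --> D and B --> A. In
the second set every chain ends at a link into the destination leaf, so no loop
forms.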


Nicolas