Eli,

On Wed, 26 Aug 2009 17:37:30 +0300
"Eli Dorfman (Voltaire)" <[email protected]> wrote:

> Subject: [PATCH] Fix IB network discovery from switch node.

Sorry for the late inquiry on this, but what exactly was the bug here?

I just found that this change introduced a bug.  The query is needed even when
the first node found is a switch: without it, the port you came into the
switch on is not reported properly.  Here is what I mean.

Running with the current master:

17:19:42 > ./iblinkinfo -S 0x000b8cffff00490c
Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies:
           8    1[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
...
           8    9[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
           8   10[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>      15   24[  ] "ISR9024D Voltaire" ( )
           8   11[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
           8   12[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>             [  ] "" ( )
           8   13[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
...

Port 12, where the DR path "came in", is reported as Active/LinkUp but with no
information about the other end.  Here is the output with your change removed,
which is what it should look like.

17:22:36 > ./iblinkinfo -S 0x000b8cffff00490c
Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies:
           8    1[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
...
           8    9[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
           8   10[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>      15   24[  ] "ISR9024D Voltaire" ( )
           8   11[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
           8   12[  ] ==( 4X 5.0 Gbps Active/  LinkUp)==>       7    8[  ] "Cisco Switch SFS7000D" ( )
           8   13[  ] ==( 4X 2.5 Gbps   Down/ Polling)==>             [  ] "" ( )
...

This properly reports the other end of this link as another switch.
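
To make the failure mode concrete, here is a toy sketch in C.  It is not the
libibnetdisc code: the toy_* names and structures are made up for
illustration, and it assumes the per-port loop skips the port the query
arrived on, which is my reading of the behavior rather than something shown
above.  Under that assumption, skipping the initial query for a switch leaves
the ingress port with no peer recorded.

```c
#include <stddef.h>

#define TOY_MAX_PORTS 24

struct toy_node;

/* Hypothetical port record, not the ibnd_port_t structure. */
struct toy_port {
	struct toy_node *wired_to;	/* ground truth: what the cable connects to */
	struct toy_node *remote;	/* what discovery recorded; NULL if never queried */
};

/* Hypothetical node record, not the ibnd_node_t structure. */
struct toy_node {
	int is_switch;
	int local_port;			/* port the query arrived on */
	struct toy_port ports[TOY_MAX_PORTS + 1];	/* index 0 unused */
};

/* Stand-in for the remote-node query: record the peer of one port. */
static void toy_query_remote(struct toy_node *node, int portnum)
{
	node->ports[portnum].remote = node->ports[portnum].wired_to;
}

/*
 * Toy discovery of the first node only.
 * skip_switch_query != 0 mimics the patched behavior: no initial query
 * on the local port when the first node is a switch.
 * ASSUMPTION: the per-port loop never queries local_port itself.
 */
static void toy_discover_first(struct toy_node *node, int skip_switch_query)
{
	int i;

	if (!(skip_switch_query && node->is_switch))
		toy_query_remote(node, node->local_port);

	for (i = 1; i <= TOY_MAX_PORTS; i++) {
		if (i == node->local_port)
			continue;	/* assumed: do not go back out the ingress port */
		toy_query_remote(node, i);
	}
}
```

With a switch whose local_port is 12, calling toy_discover_first(&sw, 1)
leaves sw.ports[12].remote NULL while port 10 gets its peer, which matches the
first output above; calling it with 0 fills in both.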

Could you explain the problem a bit more so we can come up with a better
solution?

Thanks,
Ira

> 
> Signed-off-by: Eli Dorfman <[email protected]>
> ---
>  infiniband-diags/libibnetdisc/src/ibnetdisc.c |   16 +++++++++-------
>  1 files changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> index c69467e..779e659 100644
> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> @@ -590,13 +590,15 @@ ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port * ibmad_port,
>       if (!port)
>               goto error;
>  
> -     rc = get_remote_node(ibmad_port, fabric, node, port, from,
> -                          mad_get_field(node->info, 0,
> -                                        IB_NODE_LOCAL_PORT_F), 0);
> -     if (rc < 0)
> -             goto error;
> -     if (rc > 0)             /* non-fatal error, nothing more to be done */
> -             return ((ibnd_fabric_t *) fabric);
> +     if (node->node.type != IB_NODE_SWITCH) { 
> +             rc = get_remote_node(ibmad_port, fabric, node, port, from,
> +                                  mad_get_field(node->info, 0,
> +                                                IB_NODE_LOCAL_PORT_F), 0);
> +             if (rc < 0)
> +                     goto error;
> +             if (rc > 0)             /* non-fatal error, nothing more to be done */
> +                     return ((ibnd_fabric_t *) fabric);
> +     }
>  
>       for (dist = 0; dist <= max_hops; dist++) {
>  
> -- 
> 1.5.5
> 
> _______________________________________________
> general mailing list
> [email protected]
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 


-- 
Ira Weiny
Math Programmer/Computer Scientist
Lawrence Livermore National Lab
925-423-8008
[email protected]
