Ira Weiny wrote:
> On Tue, 29 Sep 2009 18:16:21 +0200
> "Eli Dorfman (Voltaire)" <[email protected]> wrote:
>
>> Ira Weiny wrote:
>>> Eli,
>>>
>>> On Wed, 26 Aug 2009 17:37:30 +0300
>>> "Eli Dorfman (Voltaire)" <[email protected]> wrote:
>>>
>>>> Subject: [PATCH] Fix IB network discovery from switch node.
>>> Sorry for the late inquiry on this but what exactly was the bug here?
>> Sorry for the late response.
>> The problem is related to wrong discovery when running from the switch.
>> Without the patch ibnetdiscover finds only local switch
>
> Ok I see.
>
> [snip]
>
>> I think that the problem is related to NodeInfo:LocalPort which is 0 in case
>> of a switch.
>> I see that get_remote_node() sends direct route MAD to switch with path 0,0
>> and that fails (at least for Mellanox IS4 switch chips).
>> Another way to bypass this may be as follows:
>>
>> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
>> b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
>> index 1e93ff8..3dd0dc6 100644
>> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
>> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
>> @@ -461,7 +461,7 @@ get_remote_node(struct ibnd_fabric *fabric, struct
>> ibnd_node *node, struct ibnd_
>>  	    != IB_PORT_PHYS_STATE_LINKUP)
>>  		return -1;
>>
>> -	if (extend_dpath(fabric, path, portnum) < 0)
>> +	if (portnum > 0 && extend_dpath(fabric, path, portnum) < 0)
>>  		return -1;
>>
>>  	if (query_node(fabric, &node_buf, &port_buf, path)) {
>>
>>
>> Please check whether this is OK and I can send a new patch.
>>
>
> This seems to fix my issue. Here is a patch against master which works for
> me. If you want to verify that would be great.

Verified this again and it works.
Sasha, please apply this patch.

Thanks,
Eli

>
> Thanks for helping me out,
> Ira
>
> From: Ira Weiny <[email protected]>
> Date: Tue, 22 Sep 2009 11:08:28 -0700
> Subject: [PATCH] infiniband-diags/libibnetdisc/src/ibnetdisc.c: fix bug in
> single node processing.
>
> Eli fixed an issue with running ibnetdiscover from a switch but it
> introduced a bug in processing a single switch:
>
> 17:19:42 > ./iblinkinfo -S 0x000b8cffff00490c
> Switch 0x000b8cffff00490c MT47396 Infiniscale-III Mellanox Technologies:
> ...
> 8 11[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( )
> 8 12[ ] ==( 4X 5.0 Gbps Active/ LinkUp)==> [ ] "" ( )
> 8 13[ ] ==( 4X 2.5 Gbps Down/ Polling)==> [ ] "" ( )
> ...
>
> The port we "come in on" when discovering the switch is not reported
> properly.
>
> This patch, suggested by Eli, reverses Eli's patch and fixes his original
> bug in a way which does not introduce the above issue.
>
> Signed-off-by: Ira Weiny <[email protected]>
> ---
>  infiniband-diags/libibnetdisc/src/ibnetdisc.c | 18 ++++++++----------
>  1 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> index 97e369c..96f72c5 100644
> --- a/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> +++ b/infiniband-diags/libibnetdisc/src/ibnetdisc.c
> @@ -506,7 +506,7 @@ static int get_remote_node(struct ibmad_port *ibmad_port,
>  	    != IB_PORT_PHYS_STATE_LINKUP)
>  		return 1;	/* positive == non-fatal error */
>
> -	if (extend_dpath(ibmad_port, fabric, path, portnum) < 0)
> +	if (portnum > 0 && extend_dpath(ibmad_port, fabric, path, portnum) < 0)
>  		return -1;
>
>  	if (query_node(ibmad_port, fabric, &node_buf, &port_buf, path)) {
> @@ -600,15 +600,13 @@ ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port *
> ibmad_port,
>  	if (!port)
>  		goto error;
>
> -	if (node->type != IB_NODE_SWITCH) {
> -		rc = get_remote_node(ibmad_port, fabric, node, port, from,
> -				     mad_get_field(node->info, 0,
> -						   IB_NODE_LOCAL_PORT_F), 0);
> -		if (rc < 0)
> -			goto error;
> -		if (rc > 0)	/* non-fatal error, nothing more to be done */
> -			return ((ibnd_fabric_t *) fabric);
> -	}
> +	rc = get_remote_node(ibmad_port, fabric, node, port, from,
> +			     mad_get_field(node->info, 0,
> +					   IB_NODE_LOCAL_PORT_F), 0);
> +	if (rc < 0)
> +		goto error;
> +	if (rc > 0)	/* non-fatal error, nothing more to be done */
> +		return ((ibnd_fabric_t *) fabric);
>
>  	for (dist = 0; dist <= max_hops; dist++) {
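
For anyone reading the thread later, the effect of the `portnum > 0` guard can be shown with a small standalone model. This is only a sketch under stated assumptions: `struct dr_path`, `extend_path()`, `step()` and `MODEL_MAX_HOPS` are hypothetical stand-ins invented for illustration, not the libibnetdisc or libibmad code. It shows only the short-circuit behaviour: when the local port number is 0 (as NodeInfo:LocalPort is on a switch), the path is left unextended instead of gaining a hop through port 0.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_HOPS 8	/* hypothetical limit, chosen only for this sketch */

/* Hypothetical model of a direct-route path, not the libibmad type. */
struct dr_path {
	int cnt;				/* number of hops recorded so far */
	unsigned char p[MODEL_MAX_HOPS + 1];	/* p[1..cnt] = egress port per hop */
};

/* Append one hop; returns the new hop count, or -1 if the path is full. */
static int extend_path(struct dr_path *path, int portnum)
{
	assert(path != NULL);
	if (path->cnt >= MODEL_MAX_HOPS)
		return -1;
	path->p[++path->cnt] = (unsigned char)portnum;
	return path->cnt;
}

/*
 * Same shape as the patched condition: because && short-circuits,
 * extend_path() is never called when portnum is 0, so the path stays
 * as it was and the caller queries the node at the unextended path.
 */
static int step(struct dr_path *path, int portnum)
{
	if (portnum > 0 && extend_path(path, portnum) < 0)
		return -1;
	return 0;
}
```

Starting from a zero-initialised path, `step(&path, 0)` returns 0 and leaves `path.cnt` at 0, whereas `step(&path, 12)` returns 0 and records a single hop with `path.p[1] == 12`.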
