On 7/24/07, Eitan Zahavi <[EMAIL PROTECTED]> wrote:

 *From:* Hal Rosenstock [mailto:[EMAIL PROTECTED]
*Sent:* Tuesday, July 24, 2007 5:53 PM
*To:* Eitan Zahavi
*Cc:* OpenFabrics General; Sasha Khapyorsky; Yevgeny Kliteynik
*Subject:* Re: OpenSM detection of duplicated GUIDs on loopback



Hi Eitan,

On 7/24/07, Eitan Zahavi <[EMAIL PROTECTED]> wrote:
>
>  *Hi Hal,*
>
> *What is this "loopback" connector used for?*
> *Does not seem to me like a very useful thing to do.*
>
Perhaps not but no reason OpenSM can't handle this more gracefully.

> *Anyway, if it is not a production environment we could add a "debug
> mode" (-d flag option) to ignore this check.*
>
Why would a separate flag be needed ?
*[EZ] Since I do not see any other way for the SM to know it is
really a loopback plug rather than two devices with the same GUID connected
back to back ...*


"Technically", this should only occur when looped back and not two devices
with same GUID as GUID == globally unique and a duplication indicates a
"manufacturing" issue.

Anyhow, can't these be treated the same (and handled more gracefully)
without an additional option/flag ?

-- Hal



>
> *Eitan Zahavi*
> Senior Engineering Director, Software Architect
> Mellanox Technologies LTD
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> P.O. Box 586 Yokneam 20692 ISRAEL
>
>
>  ------------------------------
> *From:* Hal Rosenstock [mailto:[EMAIL PROTECTED]
> *Sent: *Tuesday, July 24, 2007 5:31 PM
> *To:* OpenFabrics General
> *Cc:* Sasha Khapyorsky; Eitan Zahavi; Yevgeny Kliteynik
> *Subject:* OpenSM detection of duplicated GUIDs on loopback
>
>
>  Hi,
>
> This is what starts off as a "minor" issue and I know it has been
> discussed somewhat in the past:
>
> Putting a loopback connector on a (switch) link causes OpenSM to
> indicate duplicated GUID error 0D18 as follows:
>
> __osm_ni_rcv_set_links
> {
> ...
>           /*
>              When there are only two nodes with exact same guids (connected back
>              to back) - the previous check for duplicated guid will not catch
>              them. But the link will be from the port to itself...
>              Enhanced Port 0 is an exception to this
>           */
>           if ((osm_node_get_node_guid( p_node ) == p_ni_context->node_guid) &&
>               (port_num == p_ni_context->port_num) &&
>               (port_num != 0))
>           {
>             osm_log( p_rcv->p_log, OSM_LOG_ERROR,
>                      "__osm_ni_rcv_set_links: ERR 0D18: "
>                      "Duplicate GUID found by link from a port to itself:"
>                      "node 0x%" PRIx64 ", port number 0x%X\n",
>                      cl_ntoh64( osm_node_get_node_guid( p_node ) ),
>                      port_num );
> ...
>
> So this occurs over and over and over and fills the log with the same
> spew. This should be improved IMO.
>
> Is this really a fatal condition ? Doesn't seem like it should be to me.
>
>
> Also, OpenSM can "ride" this out with -y (stay on fatal) but is that
> safe for this condition ?
>
> Seems like something like an extra loopback bit should be added to some
> port structure, which would cause these links to be ignored. This bit would
> then be reset when the peer is no longer itself.
>
> Also, is there a relationship between this and the 12x/duplicated GUID
> code ?
>
> Thanks.
>
> -- Hal
>
>
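
The loopback-bit idea from the original message could be sketched roughly as
follows. This is only an illustration and not OpenSM code: sketch_port_t,
is_loopback and sketch_update_loopback are hypothetical names, and the real
port structures and NodeInfo receive path differ. The sketch assumes a port
that sees itself as its peer is handled the same way whether the cause is a
loopback plug or a duplicated GUID, and that the condition is reported once
rather than on every sweep.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified port record; NOT the real OpenSM port structure. */
typedef struct sketch_port {
    uint64_t node_guid;   /* GUID of the owning node, host byte order */
    uint8_t  port_num;    /* port number on that node */
    bool     is_loopback; /* the proposed "extra loopback bit" */
} sketch_port_t;

/*
 * Update the loopback bit from the peer reported on the far end of the link.
 * Returns true if a link to the peer should be set up, false if the link
 * should be ignored because the port is connected to itself.
 */
static bool sketch_update_loopback(sketch_port_t *p,
                                   uint64_t peer_guid,
                                   uint8_t peer_port_num)
{
    assert(p != NULL);

    /* Same test as the quoted code: same node GUID, same port number,
       and not port 0 (Enhanced Port 0 is the stated exception). */
    bool self_link = (p->node_guid == peer_guid) &&
                     (p->port_num == peer_port_num) &&
                     (p->port_num != 0);

    if (self_link) {
        if (!p->is_loopback) {
            /* Report only on the transition, not on every sweep. */
            fprintf(stderr,
                    "port linked to itself: node 0x%016" PRIx64
                    ", port number 0x%X\n",
                    p->node_guid, (unsigned)p->port_num);
            p->is_loopback = true;
        }
        return false; /* ignore this link */
    }

    /* The peer is no longer the port itself: reset the bit. */
    p->is_loopback = false;
    return true;
}
```

Calling sketch_update_loopback() on each sweep would print the message the
first time a port is seen looped back and stay quiet afterwards, until the
peer changes and the bit is cleared again.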
