The HCA is Mellanox MHES18-XSC.
Thanks!
Yicheng Jia
Hal Rosenstock <[email protected]>
04/16/2009 06:12 PM
To
Yicheng Jia <[email protected]>
cc
Nicolas Morey-Chaisemartin <[email protected]>,
[email protected]
Subject
Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
On Thu, Apr 16, 2009 at 6:06 PM, Yicheng Jia <[email protected]> wrote:
>
>> There's a race condition here that I was asking about. If the link
>> initialization takes too long and doesn't complete (gets to init)
>> prior to the enable trying to be sent to the switch, then you could
>> see these results but since it's DOWN until reboot it's something
>> different.
>
> I did the "reset" when the ports on both sides of the link were in INIT
> state and LinkUp phys state.
>
>> If the disable/wait/enable worked that would've been another story.
>
> It fails too. Both ports go to DOWN after disable is issued and never
> come back. How long am I supposed to wait?
Ideally you would see init before doing the enable, but it sounds like
that's not occurring. Either you need low-level debug to see why the
link does not initialize at that point, or you need to get support from
your CA/switch vendor(s). What's your CA device?
-- Hal
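
As a rough illustration of "see init before doing the enable", the port state can be polled with a timeout instead of waiting a fixed time. This is only a sketch under assumptions: it assumes the port query prints "Name:....value" lines that include a LinkState field (the layout commonly seen from "ibportstate <lid> <port> query"), and parse_port_fields, wait_for_state and the injected query callable are hypothetical names, not part of any tool. How the query text is obtained is left to the caller.

```python
import re
import time


def parse_port_fields(text):
    """Parse 'Name:......value' lines into a dict.

    The dotted layout is an assumption about the query output format.
    Lines that do not start with 'Name:' (e.g. '# Port info: ...') are skipped.
    """
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([A-Za-z0-9_]+):\.*\s*(.*?)\s*$", line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields


def wait_for_state(query, wanted, timeout_s=30.0, interval_s=1.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll query() until LinkState equals `wanted` or the timeout expires.

    `query` is a caller-supplied callable returning the port query text.
    Returns the parsed fields on success, or None on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        fields = parse_port_fields(query())
        if fields.get("LinkState") == wanted:
            return fields
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

A caller would pass a function that runs the port query and returns its text, then issue the next step only if wait_for_state(...) returns something other than None. If it returns None, the link never initialized within the timeout, which matches the "DOWN forever" symptom described here.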
> Thanks!
> Yicheng Jia
>
>
>
>
> Hal Rosenstock <[email protected]>
>
> 04/16/2009 03:26 PM
>
> To
> Yicheng Jia <[email protected]>
> cc
> Nicolas Morey-Chaisemartin <[email protected]>,
> [email protected]
> Subject
> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
>
>
>
>
> On Thu, Apr 16, 2009 at 3:47 PM, Yicheng Jia <[email protected]> wrote:
>>
>>> Are you resetting the switch from the peer HCA port or some other port
>>> ? That's what Nicolas asked but I might have missed the answer.
>>
>> Yes, I am trying to reset from the peer HCA port. Is anything wrong
>> with this?
>
> There's a race condition here that I was asking about. If the link
> initialization takes too long and doesn't complete (gets to init)
> prior to the enable trying to be sent to the switch, then you could
> see these results but since it's DOWN until reboot it's something
> different.
>
>>> Also, try disable (wait) and then enable and see if that works.
>>
>> It remains the same; the switch port is DOWN forever. No SMP message
>> could get to the switch port.
>
> Right; in down, the SMP can't be sent.
>
>>> If I recall correctly, you had those links which are taking a long
>>> time to initialize. If the link stays down forever after disable, this
>>> won't work but I want to be sure.
>>
>> This is a separate issue.
>
> Since the link stays down yes. If the disable/wait/enable worked that
> would've been another story.
>
> -- Hal
>
>> The "reset" command is tested on a single-port HCA
>> directly connected to the Qlogic switch. The HCA is plugged into a Linux
>> machine. It is the simplest test environment.
>>
>> Thanks!
>> Yicheng Jia
>>
>>
>>
>>
>> Hal Rosenstock <[email protected]>
>>
>> 04/16/2009 02:29 PM
>>
>> To
>> Yicheng Jia <[email protected]>
>> cc
>> Nicolas Morey-Chaisemartin <[email protected]>,
>> [email protected]
>> Subject
>> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
>>
>>
>>
>>
>> On Thu, Apr 16, 2009 at 3:20 PM, Hal Rosenstock
>> <[email protected]> wrote:
>>> On Thu, Apr 16, 2009 at 3:18 PM, Yicheng Jia <[email protected]> wrote:
>>>>
>>>> They both are POLLING before "reset".
>>>
>>> Then they _should_ come back to INIT.
>>>
>>> What does the local LDDS value say after reset ? Any way to get the
>>> switch port LDDS value ?
>>
>> Are you resetting the switch from the peer HCA port or some other port
>> ? That's what Nicolas asked but I might have missed the answer.
>>
>> Also, try disable (wait) and then enable and see if that works. If I
>> recall correctly, you had those links which are taking a long time to
>> initialize. If the link stays down forever after disable, this won't
>> work but I want to be sure.
>>
>> -- Hal
>>
>>> -- Hal
>>>
>>>> Thanks!
>>>> Yicheng Jia
>>>>
>>>>
>>>>
>>>>
>>>> Hal Rosenstock <[email protected]>
>>>>
>>>> 04/16/2009 01:53 PM
>>>>
>>>> To
>>>> Yicheng Jia <[email protected]>
>>>> cc
>>>> Nicolas Morey-Chaisemartin <[email protected]>,
>>>> [email protected]
>>>> Subject
>>>> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Apr 16, 2009 at 11:12 AM, Yicheng Jia <[email protected]> wrote:
>>>>>
>>>>> Hi Nicolas,
>>>>>
>>>>> After this "reset" command, both ports are DOWN forever; I can only
>>>>> get portinfo from the local port.
>>>>>
>>>>> I am sure that the port that has been reset is not the local port,
>>>>> otherwise it would prompt a "node type not switch" error.
>>>>>
>>>>> I tried to enable this switch port from another port and brought it
>>>>> to POLLING state, but as long as I use "reset", both ports are DOWN.
>>>>
>>>> What are the peer port's LinkDownDefaultStates ? Sounds like one or
>>>> more must be Sleeping rather than Polling for some reason.
>>>>
>>>> -- Hal
>>>>
>>>>> Thanks!
>>>>>
>>>>> Yicheng Jia
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Nicolas Morey-Chaisemartin <[email protected]>
>>>>>
>>>>> 04/16/2009 12:43 AM
>>>>>
>>>>> To
>>>>> Yicheng Jia <[email protected]>
>>>>> cc
>>>>> [email protected]
>>>>> Subject
>>>>> Re: [ofa-general] link width problem of Qlogic 9024 unmanaged switch
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> By any chance, did you reset the port you're on?
>>>>> Have you tried using another node to enable the port again?
>>>>>
>>>>> Nicolas
>>>>>
>>>>> On 16/04/2009 00:45, Yicheng Jia wrote:
>>>>>>
>>>>>> Hello Randy,
>>>>>>
>>>>>> I am trying to run "ibportstate reset" to reset the switch port on
>>>>>> the other side in order to get a 4x link. However, I get the
>>>>>> following error:
>>>>>> ibwarn: [19660] mad_rpc: _do_madrpc failed; dport (Lid 7)
>>>>>> ibportstate: iberror: failed: smp set portinfo failed
>>>>>>
>>>>>> And the port status changes to DOWN after this. Have you ever tried
>>>>>> to run "ibportstate" to reset the switch port?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Yicheng Jia
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>>
>>>>>> Message: 2
>>>>>> Date: Wed, 4 Mar 2009 18:39:54 -0600
>>>>>> From: Randy Halverson <[email protected]>
>>>>>> Subject: [ofa-general] link width problem of Qlogic 9024 unmanaged
>>>>>> switch
>>>>>> To: "'[email protected]'" <[email protected]>
>>>>>> Message-ID:
>>>>>> <[email protected]>
>>>>>> Content-Type: text/plain; charset="us-ascii"
>>>>>>
>>>>>> Hello Yicheng,
>>>>>>
>>>>>> After checking internally, this appears to be a known problem with
>>>>>> older firmware for the 9024FC switches.
>>>>>>
>>>>>> It appears that you or another person at 'tmriusa.com' has recently
>>>>>> opened a case with QLogic Tech Support for this issue. Please
>>>>>> continue to work with QLogic Tech Support on firmware upgrade
>>>>>> resolution since you probably don't have our FastFabric Tools to
>>>>>> manage the 9024FC switches.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Randy
>>>>>> Technical Support
>>>>>> QLogic Corporation
_____________________________________________________________________________
Scanned by IBM Email Security Management Services powered by MessageLabs.
For more information please visit http://www.ers.ibm.com
_____________________________________________________________________________
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general