On 12/16/2012 8:39 AM, Jens Domke wrote:
> Hi,
> 
> On Dec 16, 2012, at 9:32 PM, Hal Rosenstock wrote:
> 
>> Hi,
>>
>> On 12/16/2012 7:03 AM, Jens Domke wrote:
>>> Hello Hal,
>>>
>>> On Dec 15, 2012, at 5:44 AM, Hal Rosenstock wrote:
>>>
>>>> Hi,
>>>>
>>>> On 12/14/2012 3:32 PM, Jens Domke wrote:
>>>>> Hello Hal,
>>>>>
>>>>> On Dec 15, 2012, at 3:58 AM, Hal Rosenstock wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 12/14/2012 1:24 PM, Jens Domke wrote:
>>>>>>> Hello Hal,
>>>>>>>
>>>>>>> On Dec 15, 2012, at 1:42 AM, Hal Rosenstock wrote:
>>>>>>>
>>>>>>>> Hi again,
>>>>>>>>
>>>>>>>> On 12/14/2012 10:17 AM, Jens Domke wrote:
>>>>>>>>> Hello Hal,
>>>>>>>>>
>>>>>>>>> thank you for the fast response. I will try to clarify some points.
>>>>>>>>>
>>>>>>>>>>> d) OpenMPI runs are executed with "--mca btl_openib_ib_path_record_service_level 1"
>>>>>>>>>>
>>>>>>>>>> I'm not familiar with what DFSSSP does to figure out SLs exactly but
>>>>>>>>>> there should be no need to set this. The proper SL for querying the SA
>>>>>>>>>> for PathRecords, etc. is always in PortInfo.SMSL. In the case of DFSSSP
>>>>>>>>>> (and other QoS based routing algorithms), it calculates that and the SM
>>>>>>>>>> pushes this into each port. That should be used. It's possible that SL1
>>>>>>>>>> is not a valid SL for port <-> SA querying using DFSSSP.
>>>>>>>>> The OpenMPI parameter btl_openib_ib_path_record_service_level does 
>>>>>>>>> not specify the SL for querying the PathRecords.
>>>>>>>>> It just enables the functionality. And the ompi processes use the 
>>>>>>>>> PortInfo.SMSL to send the request.
>>>>>>>>> For the request "port -> SA" every 0<=SL<=7 was used in the test, and 
>>>>>>>>> the SA received the requests.  
>>>>>>>>>>
>>>>>>>>>>> e) kernel 2.6.32-220.13.1.el6.x86_64
>>>>>>>>>>>
>>>>>>>>>>> As far as I understand the whole system:
>>>>>>>>>>> 1. the OMPI processes are sending MAD requests 
>>>>>>>>>>> (SubnAdmGet:PathRecord) to the OpenSM
>>>>>>>>>>> 2. the SA receives the request on QP1
>>>>>>>>>>
>>>>>>>>>> There is the SL in the query itself. This should be the SMSL that the SM
>>>>>>>>>> set for that port.
>>>>>>>>> Hmm, there you might have a point. I think I saw that the query 
>>>>>>>>> itself had SL=0 specified.
>>>>>>>>> In fact OpenMPI sets everything to 0 except for slid and dlid.
>>>>>>>>>>
>>>>>>>>>>> 3. SA asks the routing algorithm (like LASH, DFSSSP or Torus_2QoS) 
>>>>>>>>>>> about a special service level for the slid/dlid path
>>>>>>>>>>
>>>>>>>>>> This is a (potentially) different SL (for MPI<->MPI port communication)
>>>>>>>>>> than the one the query used and is the one returned inside the
>>>>>>>>>> PathRecord attribute/data.
>>>>>>>>> Yes, it can be different, but DFSSSP sets the same SL, because the SM 
>>>>>>>>> is running on a port which is also used for MPI comm.
>>>>>>>>
>>>>>>>> With DFSSSP, are all SLs the same from a source port to any 
>>>>>>>> destination ?
>>>>>>> No, not necessarily. In general DFSSSP does not enforce SL(LID1->LID2) 
>>>>>>> == SL(LID2->LID1) or SL(LID1->LID2) == SL(LID1->LID3).
>>>>>>
>>>>>> If SL(LID1->LID2) != SL(LID2->LID1), that's not a reversible path.
>>>>> True. But I don't think that the SA asks the DFSSSP routing about the SL 
>>>>> for the reversible path.
>>>>> So, the SA could use any SL which is a valid SL, even if DFSSSP would 
>>>>> recommend another SL.
>>>>>
>>>>> I just read the IB spec and it says that "SL specified in the received 
>>>>> packet is used as the SL in the response packet" for MAD packets.
>>>>> So, it's most likely that there is a mismatch between how OMPI sets up 
>>>>> the PathRequest and how the SA builds the response packet.
>>>>> OMPI always specifies SL=0 (let's say SL_a) inside of the PathRequest 
>>>>> packet, 
>>>>
>>>> So CompMask in the query has the SL bit on and SL is set to 0 inside the
>>>> SubnAdmGet of PathRecord ?
>>>
>>> No, the CompMask didn't have the SL bit set, and the SL was set to 0.
>>
>> That means the SL in the request is wildcarded so the SA/SM fills in a
>> valid one in the response.
> Ok.
>>
>>> I tried to follow the path of the SL bit (IB_PR_COMPMASK_SL) and the only 
>>> reference I found was in osm_sa_path_record.c
>>> The SA just treats the SL in the PathRequest as a "I would like to use this 
>>> SL" in case the SL bit is set.
>>> But the routing engine can overwrite the requested SL before the reply is 
>>> sent.
>>>
>>> Nevertheless, I have changed the code of OMPI so that it sets the SL bit in 
>>> the CompMask and sets the SL to SMSL for the PathRequest, so that SL_a == 
>>> SL_b.
>>> Sadly, the reply sent by the SA does not leave the node (for SL_b>0). Only 
>>> if I change the SL to 0 in the MAD right before umad_send is called by the 
>>> SA is the packet able to leave the node and reach the OMPI process.
>>
>> Are you sure the response doesn't leave the SA node or it's not received
>> at the requester (OMPI node) ?
> No, I'm not sure. Is there any possibility to check that? As far as I know, 
> ibdump does not show MAD packets which leave a port, it only shows the packets 
> when they are received on the other end.
>>
>>>
>>>>
>>>>> and sends the packet on SL_b (PortInfo.SMSL).
>>>>
>>>> Good.
>>>>
>>>>> The SA uses p_mad_addr->addr_type.gsi.service_level, which is SL_b, for 
>>>>> the response.
>>>>> If SL_b is not 0, then the packet can't reach the OMPI process. Right?
>>>>
>>>> Depends. It may be that both SLs work but maybe not.
>>>>
>>>>> If I analyse this correctly, then there are two bugs. One is in OMPI, 
>>>>> that it does not specify the SL within the PathRequest in an appropriate 
>>>>> way (which would be an SL suggested by DFSSSP for the reversible path). 
>>>>> And the second bug is that the SA uses the SL on which the PathRequest 
>>>>> packet was sent, and not the SL specified within the packet.
>>>>> What do you think?
>>>>
>>>> Yes, it might be better to wildcard the SL in the query. The only
>>>> scenario that would fail with the query you are making is if there's no SL
>>>> 0 path between the src/dest LIDs or GIDs in the OMPI PathRecord query.
>>>> If that's the case, SA should return MAD status 0xc (status code 3 -
>>>> ERR_NO_RECORDS). But the response doesn't make it back to the requester
>>>> OMPI node so it's not even getting that far.
>>>
>>> Yes, exactly. So, do you have an idea why the response hangs in the SA node?
>>> I have no insight into the underlying layers (kernel driver and firmware). 
>>> Maybe there are some implementations which prevent the SA from sending 
>>> MADs back on SL>0?
>>
>> If you're sure this response doesn't get out of the SA node, please
>> contact Mellanox support with the details.
> Ok, I can do this, if it turns out to be true.
>>
>>>>
>>>>> I can try to change the PathRequest of OMPI tomorrow, so that it matches 
>>>>> addr_type.gsi.service_level.
>>>>> Maybe, with this change the packets of the SA will reach the OMPI process 
>>>>> on a SL>0.
>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 4. SA sends the PathRecord back to the OMPI process via umad_send 
>>>>>>>>>>> in libvendor/osm_vendor_ibumad.c
>>>>>>>>>>
>>>>>>>>>> By the response reversibility rule, I think this is returned on the SL
>>>>>>>>>> of the original query but haven't verified this in the code base yet.
>>>>>>>>> Ok, I was not aware of that rule. But if this is true, then the SA 
>>>>>>>>> should also be able to send via SL>0.
>>>>>>>>
>>>>>>>> I double checked and indeed the SA response does use the SL that the
>>>>>>>> incoming request was received on.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> The osm_vendor_send() function builds the MAD packet with the 
>>>>>>>>>>> following attributes:
>>>>>>>>>>>    /* GS classes */
>>>>>>>>>>>    umad_set_addr_net(p_vw->umad, p_mad_addr->dest_lid,
>>>>>>>>>>>                      p_mad_addr->addr_type.gsi.remote_qp,
>>>>>>>>>>>                      p_mad_addr->addr_type.gsi.service_level,
>>>>>>>>>>>                      IB_QP1_WELL_KNOWN_Q_KEY);
>>>>>>>>>>> So, the SL is the same as the one which was used by the OMPI 
>>>>>>>>>>> process. The Q_Key matches the Q_Key on the OMPI process, and 
>>>>>>>>>>> remote_qp and dest_lid are correct, too.
>>>>>>>>>>> Afterwards umad_send(…) is used to send the reply with the 
>>>>>>>>>>> PathRecord, and this send does not work (except for SL=0).
>>>>>>>>>>
>>>>>>>>>> By not working, what do you mean ? Do you mean it's not received at the
>>>>>>>>>> requester with no message in the OpenSM log or not received at the
>>>>>>>>>> OpenSM or something else ? It could be due to the wrong SL being used in
>>>>>>>>>> the original request (forcing it to SL 1). That could cause it not to be
>>>>>>>>>> received at the SM or the response not to make it back to the requester
>>>>>>>>>> from the SA if the SL used is not "reversible".
>>>>>>>>> By "not working" I mean, that the MPI process does not receive any 
>>>>>>>>> response from the SA.
>>>>>>>>> I get messages from the MPI process like the following:
>>>>>>>>> [rc011][[14851,1],1][connect/btl_openib_connect_sl.c:301:get_pathrecord_info] No response from SA after 20 retries
>>>>>>>>> The log of OpenSM shows that the SA received the PathRequest query, 
>>>>>>>>> dumps the query into the log, and sends the reply back.
>>>>>>>>> And I think I saw some messages in the log about "…1 outstanding 
>>>>>>>>> MAD…".
>>>>>>>>>>
>>>>>>>>>>> If I look into the MAD before it is sent, then it looks like this:
>>>>>>>>>>> Breakpoint 2, umad_send (fd=9, agentid=2, umad=0x7fffe8012530, 
>>>>>>>>>>> length=120, timeout_ms=0, retries=3)
>>>>>>>>>>> at src/umad.c:791
>>>>>>>>>>> 791             if (umaddebug > 1)
>>>>>>>>>>> (gdb) p *mad
>>>>>>>>>>> $1 = {agent_id = 2, status = 0, timeout_ms = 0, retries = 3, length 
>>>>>>>>>>> = 0, addr = {qpn = 1325427712, qkey = 384, 
>>>>>>>>>>> lid = 4096, sl = 6 '\006', path_bits = 0 '\000', grh_present = 0 
>>>>>>>>>>> '\000', gid_index = 0 '\000', 
>>>>>>>>>>> hop_limit = 0 '\000', traffic_class = 0 '\000', gid = '\000' 
>>>>>>>>>>> <repeats 15 times>, flow_label = 0, 
>>>>>>>>>>> pkey_index = 0, reserved = "\000\000\000\000\000"}, data = 
>>>>>>>>>>> 0x7fffe8012530 "\002"}
>>>>>>>>>>
>>>>>>>>>> Is this the PathRecord query on the OpenMPI side or the response on the
>>>>>>>>>> OpenSM side ? SL is 6 rather than 1 here.
>>>>>>>>> This is the response on the OpenSM side (inside the umad_send 
>>>>>>>>> function, right before it is written to the device with write(fd, …)).
>>>>>>>>> SL=6 indicates that the MPI process was sending the request on SL 6.
>>>>>>>>
>>>>>>>> What is SMSL for the requester ? Was it SL 6 ?
>>>>>>> Yes, it was SL 6.
>>>>>>> Here is the content of a similar packet which was received by the SA. I 
>>>>>>> have used ibdump on the port where the OpenSM was running:
>>>>>>> ======================================================================================
>>>>>>> No.     Time        Source                Destination           Protocol Length Info
>>>>>>>  785 14.352168   LID: 384              LID: 4140             InfiniBand 290    UD Send Only SubnAdmGet(PathRecord)
>>>>>>>
>>>>>>> Frame 785: 290 bytes on wire (2320 bits), 290 bytes captured (2320 bits)
>>>>>>>  Arrival Time: Dec 13, 2012 18:09:44.437633332 JST
>>>>>>>  Epoch Time: 1355389784.437633332 seconds
>>>>>>>  [Time delta from previous captured frame: 4.332020528 seconds]
>>>>>>>  [Time delta from previous displayed frame: 4.332020528 seconds]
>>>>>>>  [Time since reference or first frame: 14.352168681 seconds]
>>>>>>>  Frame Number: 785
>>>>>>>  Frame Length: 290 bytes (2320 bits)
>>>>>>>  Capture Length: 290 bytes (2320 bits)
>>>>>>>  [Frame is marked: False]
>>>>>>>  [Frame is ignored: False]
>>>>>>>  [Protocols in frame: erf:infiniband]
>>>>>>> Extensible Record Format
>>>>>>>  [ERF Header]
>>>>>>>      Timestamp: 0x50c99b587008bcf2
>>>>>>>      [Header type]
>>>>>>>          .001 0101 = type: INFINIBAND (21)
>>>>>>>          0... .... = Extension header present: 0
>>>>>>>      0000 0100 = flags: 4
>>>>>>>          .... ..00 = capture interface: 0
>>>>>>>          .... .1.. = varying record length: 1
>>>>>>>          .... 0... = truncated: 0
>>>>>>>          ...0 .... = rx error: 0
>>>>>>>          ..0. .... = ds error: 0
>>>>>>>          00.. .... = reserved: 0
>>>>>>>      record length: 306
>>>>>>>      loss counter: 0
>>>>>>>      wire length: 290
>>>>>>> InfiniBand
>>>>>>>  Local Route Header
>>>>>>>      0110 .... = Virtual Lane: 0x06
>>>>>>>      .... 0000 = Link Version: 0
>>>>>>>      0110 .... = Service Level: 6
>>>>>>>      .... 00.. = Reserved (2 bits): 0
>>>>>>>      .... ..10 = Link Next Header: 0x02
>>>>>>>      Destination Local ID: 19
>>>>>>>      0000 0... .... .... = Reserved (5 bits): 0
>>>>>>>      .... .000 0100 1000 = Packet Length: 72
>>>>>>>      Source Local ID: 16
>>>>>>>  Base Transport Header
>>>>>>>      Opcode: 100
>>>>>>>      1... .... = Solicited Event: True
>>>>>>>      .1.. .... = MigReq: True
>>>>>>>      ..00 .... = Pad Count: 0
>>>>>>>      .... 0000 = Header Version: 0
>>>>>>>      Partition Key: 65535
>>>>>>>      Reserved (8 bits): 0
>>>>>>>      Destination Queue Pair: 0x000001
>>>>>>>      0... .... = Acknowledge Request: False
>>>>>>>      .000 0000 = Reserved (7 bits): 0
>>>>>>>      Packet Sequence Number: 0
>>>>>>>  DETH - Datagram Extended Transport Header
>>>>>>>      Queue Key: 2147549184
>>>>>>>      Reserved (8 bits): 0
>>>>>>>      Source Queue Pair: 0x00380050
>>>>>>>  MAD Header - Common Management Datagram
>>>>>>>      Base Version: 0x01
>>>>>>>      Management Class: 0x03
>>>>>>>      Class Version: 0x02
>>>>>>>      Method: Get() (0x01)
>>>>>>>      Status: 0x0000
>>>>>>>      Class Specific: 0x0000
>>>>>>>      Transaction ID: 0x0010000f38005000
>>>>>>>      Attribute ID: 0x0035
>>>>>>>      Reserved: 0x0000
>>>>>>>      Attribute Modifier: 0x00000000
>>>>>>>      MAD Data Payload: 
>>>>>>> 000000000000000000000000000000000000000000000000...
>>>>>>>   Illegal RMPP Type (0)! 
>>>>>>>      RMPP Type: 0x00
>>>>>>>      RMPP Type: 0x00
>>>>>>>      0000 .... = R Resp Time: 0x00
>>>>>>>      .... 0000 = RMPP Flags: Unknown (0x00)
>>>>>>>      RMPP Status:  (Normal) (0x00)
>>>>>>>      RMPP Data 1: 0x00000000
>>>>>>>      RMPP Data 2: 0x00000000
>>>>>>>  SMASubnAdmGet(PathRecord)
>>>>>>>      SM_Key (Verification Key): 0x0000000000000000
>>>>>>>      Attribute Offset: 0x0000
>>>>>>>      Reserved: 0x0000
>>>>>>>      Component Mask: 0x0000003000000000
>>>>>>>      Attribute (PathRecord)
>>>>>>>          PathRecord
>>>>>>>              DGID: :: (::)
>>>>>>>              SGID: ::0.15.0.16 (::0.15.0.16)
>>>>>>>              DLID: 0x0000
>>>>>>>              SLID: 0x0000
>>>>>>>              0... .... = RawTraffic: 0x00
>>>>>>>              .... 0000 0000 0000 0000 0000 = FlowLabel: 0x000000
>>>>>>>              HopLimit: 0x00
>>>>>>>              TClass: 0x00
>>>>>>>              0... .... = Reversible: 0x00
>>>>>>>              .000 0000 = NumbPath: 0x00
>>>>>>>              P_Key: 0x0000
>>>>>>>              .... .... .... 0000 = SL: 0x0000
>>>>>>>              00.. .... = MTUSelector: 0x00
>>>>>>>              ..00 0000 = MTU: 0x00
>>>>>>>              00.. .... = RateSelector: 0x00
>>>>>>>              ..00 0000 = Rate: 0x00
>>>>>>>              00.. .... = PacketLifeTimeSelector: 0x00
>>>>>>>              ..00 0000 = PacketLifeTime: 0x00
>>>>>>>              Preference: 0x00
>>>>>>>  Variant CRC: 0xad4e
>>>>>>> ======================================================================================
>>>>>>
>>>>>> And the SubnAdmGetResp(PathRecord) is not seen ? If not, it doesn't get
>>>>>> out that machine and the issue is internal to that machine. It could be
>>>>>> because of the underlying issue which hangs OpenSM when some IB program
>>>>>> tried to unregister from the MAD layer but there were outstanding work
>>>>>> completions. That's based on your original email earlier this AM.
>>>>> No, the SubnAdmGetResp does not show up, if I use ibdump on the OMPI side 
>>>>> and the SA uses a SL>0.
>>>>
>>>> Can ibdump be used to capture output on the SM port ?
>>>
>>> Yes, that works quite well, despite the warning in the ibdump manual.
>>> But I have started ibdump before opensm, maybe that makes a difference, not 
>>> sure.
>>>
>>> Regards,
>>> Jens
>>>
>>> PS: I have seen a small bug. Not sure if it's a bug in wireshark or ibdump, 
>>> but the response received by the OMPI node isn't shown correctly. The 
>>> PathRecord contains an offset which is either missing in the dump or is not 
>>> treated correctly by wireshark. But it causes wireshark to show the 
>>> PathRecord data with wrong values.
>>> Maybe you could redirect this to the developer of ibdump, so that he can 
>>> check/fix it.
>>
>> Are you referring to the fields after the SA AttributeOffset or
>> something else ?
> Yes, after the SMASubnAdmGet Attribute Offset. Here is an example:
> I get on the OMPI side:
>     SMASubnAdmGetResp(PathRecord)
>         SM_Key (Verification Key): 0x0000000000000000
>         Attribute Offset: 0x0008
>         Reserved: 0x0000
>         Component Mask: 0x0000803000000000
>         Attribute (PathRecord)
>             PathRecord
>                 DGID: ::8:f104:399:ebb5:fe80:0 (::8:f104:399:ebb5:fe80:0)
>                 SGID: ::8:f104:399:ecd5:4:8 (::8:f104:399:ecd5:4:8)
>                 DLID: 0x0000
>                 SLID: 0x0000
>                 0... .... = RawTraffic: 0x00
>                 .... 0000 1000 0000 1111 1111 = FlowLabel: 0x0080ff
>                 HopLimit: 0xff
>                 TClass: 0x00
>                 0... .... = Reversible: 0x00
>                 .000 0011 = NumbPath: 0x03
>                 P_Key: 0x8486
>                 .... .... .... 0000 = SL: 0x0000
>                 00.. .... = MTUSelector: 0x00
>                 ..00 0000 = MTU: 0x00
>                 00.. .... = RateSelector: 0x00
>                 ..00 0000 = Rate: 0x00
>                 00.. .... = PacketLifeTimeSelector: 0x00
>                 ..00 0000 = PacketLifeTime: 0x00
>                 Preference: 0x00
> 
> But it should show (see the difference in SLID, DLID, SL which are now 
> correct):
>     SMASubnAdmGetResp(PathRecord)
>         SM_Key (Verification Key): 0x0000000000000000
>         Attribute Offset: 0x0008
>         Reserved: 0x0000
>         Component Mask: 0x0000803000000000
>         Attribute (PathRecord)
>             PathRecord
>                 DGID: ::8:f104:399:ebb5 (::8:f104:399:ebb5)
>                 SGID: fe80::8:f104:399:ecd5 (fe80::8:f104:399:ecd5)
>                 DLID: 0x0004
>                 SLID: 0x0008
>                 0... .... = RawTraffic: 0x00
>                 .... 0000 0000 0000 0000 0000 = FlowLabel: 0x000000
>                 HopLimit: 0x00
>                 TClass: 0x00
>                 1... .... = Reversible: 0x01
>                 .000 0000 = NumbPath: 0x00
>                 P_Key: 0xffff
>                 .... .... .... 0011 = SL: 0x0003
>                 10.. .... = MTUSelector: 0x02
>                 ..00 0100 = MTU: 0x04
>                 10.. .... = RateSelector: 0x02
>                 ..00 0110 = Rate: 0x06
>                 10.. .... = PacketLifeTimeSelector: 0x02
>                 ..01 0010 = PacketLifeTime: 0x12
>                 Preference: 0x00


I think everything after AttributeOffset is off by 2 bytes. DGID doesn't
look right to me (no subnet prefix fe80:: in front of GUID).

-- Hal

> 
> Regards,
> Jens
> 
>>
>> -- Hal
>>
>>>>
>>>> -- Hal
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> One would need to walk the SLToVLMappingTables from requester (OMPI
>>>>>>>> port) to SA and back to see whether SL6 would even have a chance of
>>>>>>>> working (not dropping) aside from whether it's really the correct SL to use.
>>>>>>> All SL2VL tables look the same. I checked the output of OpenSM.
>>>>>>>         SL: |  0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 |
>>>>>>>         VL: | 0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |0x0 |0x1 |0x2 |0x3 |0x4 |0x5 |0x6 |0x7 |
>>>>>>> But this is also as expected, because I have set the QoS in the opensm 
>>>>>>> config as follows:
>>>>>>>         qos_sl2vl 0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7
>>>>>>> This was set for "default", "CA" and "Switch external ports". I have 
>>>>>>> not touched the config for "Switch Port 0" and "Router ports", they 
>>>>>>> remained: qos_[sw0 | rtr]_sl2vl (null)
>>>>>>
>>>>>> That works as long as all links have (at least) 8 data VLs (VLCap 4).
>>>>> Yes, all VL_CAP show 4 in the OpenSM log file.
>>>>>
>>>>> Regards
>>>>> Jens
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> -- Hal
>>>>>>
>>>>>>> Regards
>>>>>>> Jens
>>>>>>>
>>>>>>>>
>>>>>>>> -- Hal
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> The output of OpenMPI or OpenSM's log file don't show any useful 
>>>>>>>>>>> information for this problem, even with higher debug levels.
>>>>>>>>>>
>>>>>>>>>> So nothing interesting logged relative to the PathRecord queries ?
>>>>>>>>> In the OpenSM log, only that it was received, what the request looks 
>>>>>>>>> like, and that it was sent back.
>>>>>>>>> And a few "outstanding MADs" a few lines later in the log.
>>>>>>>>>>
>>>>>>>>>>> So, right now I'm stuck, and have no idea if there is an error in 
>>>>>>>>>>> the kernel driver, the HCA firmware or something completely 
>>>>>>>>>>> different. Or if umad_send basically does not support SL>0.
>>>>>>>>>>> A workaround for the moment is to set the SL in the 
>>>>>>>>>>> umad_set_addr_net(...) call to 0.
>>>>>>>>>>
>>>>>>>>>> So SL 0 works between all nodes and SA for querying/responses. Wonder if
>>>>>>>>>> that's how SMSL is set by DFSSSP.
>>>>>>>>> No, the SMSL set by DFSSSP is different from 0, I have checked this. 
>>>>>>>>> In our case (OpenSM running on a compute node), it sets the same SL 
>>>>>>>>> which is used for MPI<->MPI traffic, to ensure deadlock freedom.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Jens
>>>>>>>>>
>>>>>>>>> --------------------------------
>>>>>>>>> Dipl.-Math. Jens Domke
>>>>>>>>> Researcher - Tokyo Institute of Technology
>>>>>>>>> Satoshi MATSUOKA Laboratory
>>>>>>>>> Global Scientific Information and Computing Center
>>>>>>>>> 2-12-1-E2-7 Ookayama, Meguro-ku, 
>>>>>>>>> Tokyo, 152-8550, JAPAN
>>>>>>>>> Tel/Fax: +81-3-5734-3876
>>>>>>>>> E-Mail: domke.j...@m.titech.ac.jp
>>>>>>>>> --------------------------------
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>>>>>> the body of a message to majord...@vger.kernel.org
>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>
>>>>>
>>>
> 
> 
> 
> 

