Hi Sean,

The problem arises when an active Solaris client sends a connection 
request to a passive OFED server instance. Solaris sets the hop_limit field 
to 0xFF but does not expect or enable GRH routing. The subsequent RC 
messages are therefore silently dropped, since one side expects GRH traffic 
and the other doesn't. 
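
To illustrate the mismatch, here is a minimal standalone sketch. It assumes 
the address handle setup enables the GRH whenever the path's hop limit is 
above 1, as Sean's question about ib_init_ah_from_path suggests. The struct 
and function names are hypothetical stand-ins, not the kernel's own types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the path record. */
struct sketch_path_rec {
	uint8_t hop_limit;
};

/* Hypothetical, simplified stand-in for the address handle attributes. */
struct sketch_ah_attr {
	bool    use_grh;
	uint8_t grh_hop_limit;
};

/*
 * Assumed behaviour: a hop limit above 1 marks the path as routed,
 * so the address handle is set up to send a GRH.
 */
void sketch_init_ah_from_path(const struct sketch_path_rec *rec,
			      struct sketch_ah_attr *ah)
{
	ah->use_grh = rec->hop_limit > 1;
	ah->grh_hop_limit = ah->use_grh ? rec->hop_limit : 0;
}
```

Under that assumption, a REQ carrying hop_limit 0xFF makes the passive side 
send GRH traffic, while a hop limit of 1 keeps the connection local.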

The active side seems to work fine for local-only subnets, so nothing needs 
to be changed there. 

Here is an updated patch:

Signed-off-by: Jim L Hall <[EMAIL PROTECTED]>
---
 drivers/infiniband/core/cm.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index d446998..25a77ec 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -1095,7 +1095,10 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
        primary_path->dlid = req_msg->primary_local_lid;
        primary_path->slid = req_msg->primary_remote_lid;
        primary_path->flow_label = cm_req_get_primary_flow_label(req_msg);
-       primary_path->hop_limit = req_msg->primary_hop_limit;
+       if (cm_req_get_primary_subnet_local(req_msg) == 1)
+               primary_path->hop_limit = 1;
+       else
+               primary_path->hop_limit = req_msg->primary_hop_limit;
        primary_path->traffic_class = req_msg->primary_traffic_class;
        primary_path->reversible = 1;
        primary_path->pkey = req_msg->pkey;
@@ -1116,7 +1119,10 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
                alt_path->dlid = req_msg->alt_local_lid;
                alt_path->slid = req_msg->alt_remote_lid;
                alt_path->flow_label = cm_req_get_alt_flow_label(req_msg);
-               alt_path->hop_limit = req_msg->alt_hop_limit;
+               if (cm_req_get_alt_subnet_local(req_msg) == 1)
+                       alt_path->hop_limit = 1;
+               else
+                       alt_path->hop_limit = req_msg->alt_hop_limit;
                alt_path->traffic_class = req_msg->alt_traffic_class;
                alt_path->reversible = 1;
                alt_path->pkey = req_msg->pkey;



Thanks,

- Jim H.
  ----- Original Message ----- 
  From: Sean Hefty 
  To: 'Jim Hall' ; [email protected] 
  Sent: Monday, September 17, 2007 12:10 PM
  Subject: RE: [PATCH] core/cm: improve request message interpretation of subnet local fields


  (I don't think this made it to the mailing list, so re-posting.)

   

  I don't disagree with the concept here, but can you explain the problem that 
you're seeing?  Is it that the path is assumed to be routed based on the 
hop_limit (set in ib_init_ah_from_path)?  Are any changes needed for active 
side processing?

   

  Btw, I'd prefer something more like:

   

  if (cm_req_get_primary_subnet_local...)
              primary_path->hop_limit = 1;
  else
              primary_path->hop_limit = req_msg->primary_hop_limit;

   

  (or '? :' equivalent), versus setting hop_limit, then overriding it in the 
common case.  And I'm fine if we don't keep the comment.
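
  For reference, the '? :' form could look like the following standalone 
  sketch. The function name and its plain integer parameters are 
  hypothetical; in cm.c the values would come from the REQ message.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick the hop limit for a path built from a REQ.
 * A subnet-local path gets a hop limit of 1; otherwise the value from
 * the request is used unchanged.
 */
uint8_t sketch_passive_hop_limit(int subnet_local, uint8_t req_hop_limit)
{
	return subnet_local ? 1 : req_hop_limit;
}
```

  This sets hop_limit once instead of assigning it and then overriding it.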

   

  - Sean

   


------------------------------------------------------------------------------

  From: Jim Hall [mailto:[EMAIL PROTECTED] 
  Sent: Monday, September 17, 2007 7:38 AM
  To: [EMAIL PROTECTED]
  Cc: Hefty, Sean
  Subject: [PATCH] core/cm: improve request message interpretation of subnet local fields

   

  When parsing a CMA connect request message, if the subnet local field is 1
  (both nodes on the same subnet), then explicitly set the hop limit in the
  corresponding path record to 1. This avoids a global/local misconfiguration
  problem with Solaris InfiniBand CMA sessions.

   

  Signed-off-by: Jim L Hall <[EMAIL PROTECTED]>
  ---
   drivers/infiniband/core/cm.c |   16 ++++++++++++++++
   1 files changed, 16 insertions(+), 0 deletions(-)

   

  diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
  index d446998..3d8740c 100644
  --- a/drivers/infiniband/core/cm.c
  +++ b/drivers/infiniband/core/cm.c
  @@ -1109,6 +1109,14 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
                  cm_req_get_primary_local_ack_timeout(req_msg);
          primary_path->packet_life_time -= (primary_path->packet_life_time > 0);

   

  +       if (cm_req_get_primary_subnet_local(req_msg) == 1) {
  +
  +               /* At this point we know that both sides are on the same
  +                * subnet, any hop limits above 1 don't make much sense
  +                */
  +               primary_path->hop_limit = 1;
  +       }
  +
          if (req_msg->alt_local_lid) {
                  memset(alt_path, 0, sizeof *alt_path);
                  alt_path->dgid = req_msg->alt_local_gid;
  @@ -1129,6 +1137,14 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
                  alt_path->packet_life_time =
                          cm_req_get_alt_local_ack_timeout(req_msg);
                  alt_path->packet_life_time -= (alt_path->packet_life_time > 0);
  +
  +               if (cm_req_get_alt_subnet_local(req_msg) == 1) {
  +
  +                       /* At this point we know that both sides are on the same
  +                        * subnet, any hop limits above 1 don't make much sense
  +                        */
  +                       alt_path->hop_limit = 1;
  +               }
          }
   }


   
