Re: [PATCH] bonding: move ipoib_header_ops to vmlinux

2014-12-30 Thread Wengang

On 2014-12-30 12:25, David Miller wrote:

From: Wengang wen.gang.w...@oracle.com
Date: Tue, 30 Dec 2014 11:01:42 +0800


There is more than one way to do things. In this case, considering the
needs, complexity and stability, I think moving ipoib_header_ops is the
right way to go.

I completely disagree, it's a gross hack at best.

It's papering over the real problem.

When we have references to released objects in other areas of the
networking stack, we don't move those objects into the static kernel
image as a fix.


OK. Let me see if I can make a patch to match what you want.

Thanks
wengang
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/6] IB/Core: Changes to the IB Core infrastructure for RoCEv2 support

2014-12-30 Thread Moni Shoua
 Although you follow the spec here, 3 types of RDMA_NETWORK are not
 really required. Maybe we can get rid of this duplication

 [SOM]: Not sure I understood the duplication here, or why it's not required.
 We now have a new 'network/L3' layer on top of L2 - that was the reason. Does
 it not make sense?

Once you know the pair (RoCE type, GID value), you know the network
type, and once you know the network type, you know the pair.
So it is not really necessary to keep them all stored.
My idea is to get rid of the type that describes the network type.
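
To illustrate the first half of that argument, here is a minimal sketch of
deriving the network type from the pair instead of storing it. All names
(roce_ver, net_type, net_type_from_pair) are hypothetical and are not taken
from the patch, and the sketch assumes that a RoCE v2 GID holding an
IPv4-mapped address means an IPv4 network.

```c
#include <netinet/in.h>   /* struct in6_addr, IN6_IS_ADDR_V4MAPPED */

/* Hypothetical names for illustration only; not the patch's identifiers. */
enum roce_ver { ROCE_VER_1, ROCE_VER_2 };
enum net_type { NET_IB, NET_IPV4, NET_IPV6 };

/* Derive the network type from the (RoCE type, GID value) pair. */
static enum net_type net_type_from_pair(enum roce_ver ver,
					const struct in6_addr *gid)
{
	if (ver == ROCE_VER_1)
		return NET_IB;		/* v1: GRH directly on top of L2 */
	if (IN6_IS_ADDR_V4MAPPED(gid))
		return NET_IPV4;	/* assumed: ::ffff:a.b.c.d means IPv4 */
	return NET_IPV6;
}
```

With a helper like this, the stored network type becomes a value that can be
recomputed on demand from data that is already kept.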

 [SOM]: Well, the use of get_port_type() was partly motivated by the
 spec (Query HCA - 17.5.x.x IIRC), which talks about the need for a port_type
 attribute as part of RoCEV2 that would indicate whether a HW device/port
 supports RoCEV2. It was also serving another purpose in cma_acquire_dev(),
 as you can see above in the patch, where it helped the use case of devices
 that support only V2 and not V1.
 Do you still feel it doesn't make sense?
 Not sure how/where you wanted point 2 to come from - sysfs/proc/debugfs?
 I'd prefer to have that in the next stage of the patchset.

The spec is confusing. Under the QUERY HCA section it describes changes
to the port attr.
Anyway, I see that the spec points to the ability to query capabilities
and compare them against decisions, but not as a method for making
decisions.
What I would do is add 2 flags to ib_port_cap_flags,
IB_PORT_ROCE_SUP and IB_PORT_ROCE_V2_SUP.
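
A rough sketch of how such flags could be used to validate a decision that
was made elsewhere, rather than to make it. The two flag names come from the
proposal above; the bit positions, the roce_ver enum and the helper
port_supports_roce() are assumptions made only for this illustration.

```c
/* Flag names as proposed above; bit positions are hypothetical. */
enum port_cap_flags_sketch {
	IB_PORT_ROCE_SUP	= 1 << 27,
	IB_PORT_ROCE_V2_SUP	= 1 << 28,
};

/* Hypothetical type for the RoCE version already chosen for a session. */
enum roce_ver { ROCE_VER_1, ROCE_VER_2 };

/*
 * Return 1 if the port advertises support for the version that was
 * already chosen, 0 otherwise. The capability bits only confirm or
 * reject the choice; they do not select the version.
 */
static int port_supports_roce(unsigned int port_cap_flags, enum roce_ver wanted)
{
	switch (wanted) {
	case ROCE_VER_1:
		return (port_cap_flags & IB_PORT_ROCE_SUP) != 0;
	case ROCE_VER_2:
		return (port_cap_flags & IB_PORT_ROCE_V2_SUP) != 0;
	}
	return 0;
}
```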

I think that the main difference between approaches is how to decide
about the RoCE type to use for a session.
This is also why I think that we should not postpone the change in
cma_acquire_dev to later.

thanks
Moni


mlx4: having trouble getting mlx4_NOP to succeed in the VF driver

2014-12-30 Thread Bob Biloxi
Hi,

I was going through the mlx4 source code and had a few questions
regarding the generation of interrupts upon execution of the NOP
command from the VF driver.

If I am running as a dedicated driver, then NOP seems to work fine (I
get an interrupt).

But if I enable SR-IOV and then run the NOP command from the VF driver,
I don't receive any interrupt (on the VF side).

err = mlx4_NOP(dev); /* when executed from the VF driver, this doesn't raise any interrupt */

I get the following from VF logs:

[  117.879100] mlx4_core :01:00.0: communication channel command 0x5 timed out
[  117.879120] mlx4_core :01:00.0: failed execution of VHCR_POST command opcode 0x31
[  117.879127] mlx4_core :01:00.0: NOP command failed to generate MSI-X interrupt IRQ 24).


I have checked the logs, and it seems that the NOP from the VHCR is
received properly on the PF side and the HCR command is successful.

The GEN_EQE HCR command executed in response to the NOP is also
successful (I can see the return status of the command execution).



But on the VF side, the mlx4_eq_int function doesn't get called.

I have checked the return value of request_irq and it seems to be 0 (no error).

mlx4_enable_msi_x is also successful.


Can anyone please tell me if I am missing something?
Is there anything that needs to be done to get interrupts in the mlx4 VF driver?

Are there any other logs I can check? dmesg output is the only place I was checking.



Also, can the ConnectX hardware generate an interrupt to the VF driver?
Or does it only generate interrupts to the PF driver, with the PF driver
using GEN_EQE? I understand that GEN_EQE is used to generate an event
towards a VF, but how are the interrupts routed to the VF driver?


I would be very grateful for any kind of help.


Thanks so much !!


Best Regards,
Bob


RE: [PATCH 1/6] IB/Core: Changes to the IB Core infrastructure for RoCEv2 support

2014-12-30 Thread Devesh Sharma
Hi Moni,

Please find my response inline:

 -----Original Message-----
 From: monisonli...@gmail.com [mailto:monisonli...@gmail.com] On Behalf
 Of Moni Shoua
 Sent: Tuesday, December 30, 2014 9:10 PM
 To: Somnath Kotur
 Cc: rol...@kernel.org; linux-rdma; Devesh Sharma
 Subject: Re: [PATCH 1/6] IB/Core: Changes to the IB Core infrastructure for
 RoCEv2 support
 
  Although you follow the spec here, 3 types of RDMA_NETWORK are not
  really required. Maybe we can get rid of this duplication
 
  [SOM]: Not sure I understood the duplication here, or why it's not required.
  We now have a new 'network/L3' layer on top of L2 - that was the reason.
  Does it not make sense?
 
 Once you know the pair (RoCE type, GID value), you know the network
 type, and once you know the network type, you know the pair.
 So it is not really necessary to keep them all stored.
 My idea is to get rid of the type that describes the network type.
 
  [SOM]: Well, the use of get_port_type() was partly motivated by the
  spec (Query HCA - 17.5.x.x IIRC), which talks about the need for a
  port_type attribute as part of RoCEV2 that would indicate whether a HW
  device/port supports RoCEV2. It was also serving another purpose in
  cma_acquire_dev(), as you can see above in the patch, where it helped
  the use case of devices that support only V2 and not V1.
  Do you still feel it doesn't make sense?
  Not sure how/where you wanted point 2 to come from -
  sysfs/proc/debugfs?
  I'd prefer to have that in the next stage of the patchset.
 
 The spec is confusing. Under the QUERY HCA section it describes changes to
 the port attr.
 Anyway, I see that the spec points to the ability to query capabilities and
 compare them against decisions, but not as a method for making decisions.
 What I would do is add 2 flags to ib_port_cap_flags, IB_PORT_ROCE_SUP
 and IB_PORT_ROCE_V2_SUP.

[DS]: Such a flag has already been added to ib_port_cap_flags; however, it
needs a name change as per your suggestion.

@@ -265,7 +329,8 @@ enum ib_port_cap_flags {
 	IB_PORT_BOOT_MGMT_SUP       = 1 << 23,
 	IB_PORT_LINK_LATENCY_SUP    = 1 << 24,
 	IB_PORT_CLIENT_REG_SUP      = 1 << 25,
-	IB_PORT_IP_BASED_GIDS       = 1 << 26
+	IB_PORT_IP_BASED_GIDS       = 1 << 26,
+	IB_PORT_RoCEV2_BASED_GIDS   = 1 << 27
 };

On the other hand,
the motive for having rdma_get_port_type() is to query the port type in
cma_acquire_dev() even if port_num = 0. ib_query_port() would fail to report
the ib_port_cap_flags if the application has not specified the device port
number explicitly. The failure is due to the port number range check.
However, I think it's also okay to call ib_query_port() and skip the status
check for this call. Does that make sense?
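
To make the range-check failure concrete, here is a small stand-in for the
query path. It is not the kernel's ib_query_port(); dev_sketch and
query_port_sketch() are hypothetical names, and the sketch only assumes that
valid port numbers lie between a first and a last port and that anything
outside that range is rejected with -EINVAL.

```c
#include <errno.h>

/* Hypothetical stand-in for a device with a contiguous port range. */
struct dev_sketch {
	int first_port;			/* lowest valid port number */
	int last_port;			/* highest valid port number */
	unsigned int cap_flags;		/* capability bits to report */
};

/*
 * Report the capability flags for port_num, or fail with -EINVAL when
 * port_num is outside the valid range (e.g. 0 when ports start at 1).
 * On failure *cap_flags is left untouched.
 */
static int query_port_sketch(const struct dev_sketch *dev, int port_num,
			     unsigned int *cap_flags)
{
	if (port_num < dev->first_port || port_num > dev->last_port)
		return -EINVAL;
	*cap_flags = dev->cap_flags;
	return 0;
}
```

In this sketch a caller that skips the status check would read whatever was
in its own cap_flags variable before the call, so it would need to
initialize that variable first.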

 
 I think that the main difference between approaches is how to decide about
 the RoCE type to use for a session.
 This is also why I think that we should not postpone the change in
 cma_acquire_dev to later.
 
 thanks
 Moni