RE: rsockets and standard socket based TCP benchmarks

2012-06-11 Thread Hefty, Sean
> Though one can consider the fall-back in reverse order i.e. if the > rdma connection fails continue with the already established connection (over > the normal inet socket). When I consider fallback, one of the issues is handling the case where one of the two sides is not using rsockets. This

RE: rsockets and standard socket based TCP benchmarks

2012-06-11 Thread Vivek Kashyap
On Sat, 9 Jun 2012, Hefty, Sean wrote: But to map standard networking applications to rsockets we will run into the above problem i.e. fork() will not work. It would be very useful to allow for the standard networking paradigm of: bind()->listen()->accept()->fork(), and then the server goes b

Re: [PATCH] RDMA/ocrdma: Corrected Queue max values.

2012-06-11 Thread Roland Dreier
This patch is almost formatted perfectly, except in the future when you send someone else's patch, please include your own Signed-off-by line as well at the end of the chain. In other words, the email should go >From: Mahesh Vardhamanaiah > >Fixed code to read the max wqe and max rqe values fro

Re: [PATCH] RDMA/ocrdma: fixed gid table for vlan and events.

2012-06-11 Thread Roland Dreier
thanks, applied

Re: size of queue pair number

2012-06-11 Thread Roland Dreier
On Mon, Jun 11, 2012 at 8:10 AM, Ursula Braun wrote: > If zero based virtual addressing has never been implemented within > Linux, does this mean an ib_post_send() with opcode IB_WR_RDMA_WRITE > requires specification of both - rdma.rkey AND rdma.remote_addr? > And thus means there is a need to ex

Re: size of queue pair number

2012-06-11 Thread Ursula Braun
On Fri, 2012-05-25 at 09:01 -0700, Roland Dreier wrote: > > Thanks for your good and quick help. May I ask another question about > > zero based virtual addressing: The Infiniband Architecture Specification > > talks about specifying the "type of VA" when registering a memory > > region. Checking t

Re: [patch] RDMA/uverbs: potential integer overflow

2012-06-11 Thread Dan Carpenter
I was looking through old integer overflows, and I was still wondering if this is buggy. regards, dan carpenter On 12/1/11, Dan Carpenter wrote: > This is a static checker fix, and I'm not super familiar with this code. > > My checker complains that user_wr->num_sge + sg_ind can have an integer
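
The concern raised here, that adding user_wr->num_sge to sg_ind may wrap around, is an instance of unsigned-addition overflow. A minimal sketch of a guard follows, assuming both counters are 32-bit unsigned values; the helper name `checked_add_u32` is hypothetical and is not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: store a + b in *sum and return true,
 * or return false (leaving *sum untouched) if the 32-bit
 * addition would wrap around. */
static bool checked_add_u32(uint32_t a, uint32_t b, uint32_t *sum)
{
    if (a > UINT32_MAX - b)
        return false;   /* a + b would exceed UINT32_MAX */
    *sum = a + b;
    return true;
}
```

The comparison is done before the addition, so the check itself cannot overflow.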

[PATCH for-next 19/29] IB/mlx4: Added Multicast Groups (MCG) para-virtualization for SRIOV

2012-06-11 Thread Jack Morgenstein
From: Oren Duer MCG para-virtualization support includes: - Creating multicast groups by VFs, and keeping accounting of them - Leaving multicast groups by VFs - SM will only be updated with real changes in the overall picture of MCGs status - Creation of MGID=0 groups (let SM choose MGID) Note

[PATCH for-next 16/29] {NET,IB}/mlx4: Implement QP paravirtualization

2012-06-11 Thread Jack Morgenstein
This requires: 1. Replacing the paravirtualized pkey index (inserted by the guest) with the real pkey index 2. For UD qp's, placing the guest's true source gid index in the address path structure mgid field, and setting the ud_force_mgid bit so that the mgid is taken from the qp context

[PATCH for-next 15/29] IB/mlx4: Initialize SRIOV IB support for slaves in master context

2012-06-11 Thread Jack Morgenstein
Allocate sriov paravirtualization resources and mad demuxing contexts on the master. This has two parts. The first part is to initialize the structures to contain the contexts. This is done at master startup time (mlx4_ib_init_sriov). The second part is to actually create the tunneling resource

[PATCH for-next 17/29] IB/mlx4: SRIOV multiplex and demultiplex MADs

2012-06-11 Thread Jack Morgenstein
Special QPs are para-virtualized. vHCAs are not given direct access to QP0/1. Rather, these QPs are operated by a special context hosted by the PF, which mediates access to/from vHCAs. This is done by opening a “tunnel” per vHCA port per QP0/1. A tunnel comprises a pair of UD QPs: a “Tunnel QP” i

[PATCH for-next 21/29] net/mlx4_core: Add IB port-state machine, and port mgmt event propagation infrastructure

2012-06-11 Thread Jack Morgenstein
For an IB port, a slave should not show port active until that slave has a valid alias-guid (provided by the subnet manager). Therefore the port-up event should be passed to a slave only after both the port is up, and the slave's alias-guid has been set. Also, provide the infrastructure for propag
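
The gating rule described in this entry (deliver port-up to a slave only once the port is up and the slave's alias-guid is set) can be sketched as a small state check. This is an illustration only: the struct, its field names, and the `event_sent` latch are assumptions and do not come from the patch.

```c
#include <stdbool.h>

/* Hypothetical per-slave, per-port state; names are illustrative. */
struct slave_port_state {
    bool port_up;          /* physical port is up */
    bool alias_guid_valid; /* subnet manager has provided the alias GUID */
    bool event_sent;       /* port-up already delivered (assumed latch) */
};

/* Returns true exactly once: the first time both conditions hold. */
static bool should_send_port_up(struct slave_port_state *s)
{
    if (s->port_up && s->alias_guid_valid && !s->event_sent) {
        s->event_sent = true;
        return true;
    }
    return false;
}
```

Calling the check from both the port-up handler and the alias-guid handler means the order in which the two conditions become true does not matter.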

[PATCH for-next 18/29] {NET,IB}/mlx4: MAD_IFC paravirtualization

2012-06-11 Thread Jack Morgenstein
The MAD_IFC firmware command fulfills two functions. In the first case, it is used in the QP0/QP1 MAD-handling flow to obtain information from the FW (for answering queries), and for setting variables in the HCA (MAD SET packets). In this function, MAD_IFC should provide the FW (physical) view of

[PATCH for-next 12/29] net/mlx4_core: place phys gid and pkey tbl sizes in mlx4_phys_caps struct and paravirtualize them

2012-06-11 Thread Jack Morgenstein
To allow easy paravirtualization of pkey and gid table sizes, keep paravirtualized sizes in mlx4_dev->caps, but save the actual physical sizes in FW in struct: mlx4_dev->phys_cap. In addition, in SRIOV mode, do the following: 1. Reduce reported pkey table size by 1. This is done to reserve the

[PATCH for-next 22/29] {NET,IB}/mlx4: Add alias_guid mechanism

2012-06-11 Thread Jack Morgenstein
For IB ports, we paravirtualize the GUID at index 0 on slaves. The GUID at index 0 seen by a slave is the actual GUID occupying the GUID table at the slave-id index. The driver, by default, requests at startup time that the subnet manager populate its entire guid table with GUIDs. These guids are then

[PATCH for-next 26/29] IB/mlx4: Initialize guid-cache index 0 (default guid)

2012-06-11 Thread Jack Morgenstein
The guid cache introduced by the alias guid feature contains all the GUIDs for each port. GUIDs 1..127 are obtained from the subnet manager. GUID 0, however, is obtained from the FW and cannot be modified by the subnet manager. It must be introduced separately into the guid cache. Initialize

[PATCH for-next 24/29] IB/mlx4: Add iov directory in sysfs under the ib device

2012-06-11 Thread Jack Morgenstein
This directory is added only for the master -- slaves do not have it. The sysfs iov directory is used to manage and examine the port pkey and guid paravirtualization. Under iov/ports, the administrator may examine the gid and pkey tables as they are present in the device (and as are seen in the "

[PATCH for-next 09/29] net/mlx4_core: For SRIOV, initialize ib port-capabilities for all slaves

2012-06-11 Thread Jack Morgenstein
Under SRIOV-IB, each slave has its own separate copy of the port-capabilities flags. Thus, for example, the master can run a subnet manager (which causes the IS_SM bit to be set in the master's port capabilities) without affecting the port capabilities seen by the slaves (i.e., the IS_SM bit will

[PATCH for-next 29/29] {NET,IB}/mlx4: Activate SRIOV mode for IB

2012-06-11 Thread Jack Morgenstein
Note: RoCE is still not supported on slaves. This patch will be changed for V1 to reflect this. Signed-off-by: Jack Morgenstein --- drivers/infiniband/hw/mlx4/main.c |5 - drivers/net/ethernet/mellanox/mlx4/fw.c |6 -- 2 files changed, 0 insertions(+), 11 deletions(-) diff

[PATCH for-next 27/29] net/mlx4_core: INIT/CLOSE port logic for IB ports in SRIOV mode

2012-06-11 Thread Jack Morgenstein
Normally, INIT_PORT and CLOSE_PORT are invoked when special QP0 transitions to RTR, or transitions to ERR/RESET respectively. In SRIOV mode, however, the master is also paravirtualized. This in turn requires that we not do INIT_PORT until the entire QP0 path (real QP0 and proxy QP0) is ready to re

[PATCH for-next 25/29] net/mlx4_core: Adjustments to SET_PORT for SRIOV-IB

2012-06-11 Thread Jack Morgenstein
1. Slave may not set the IS_SM capability for the port. 2. No DEV_MGR in multifunc mode. Signed-off-by: Jack Morgenstein --- drivers/net/ethernet/mellanox/mlx4/port.c | 10 ++ include/linux/mlx4/device.h |5 + 2 files changed, 15 insertions(+), 0 deletions(-) dif

[PATCH for-next 20/29] IB/mlx4: Add CM paravirtualization

2012-06-11 Thread Jack Morgenstein
From: Amir Vadai In CM para-virtualization: 1. Incoming requests are steered to the correct vHCA according to the embedded GID 2. Communication IDs on outgoing requests are replaced by a globally unique ID, generated by the PPF, since there is no synchronization of ID generation between gue

[PATCH for-next 28/29] IB/mlx4: Miscellaneous adjustments to SRIOV IB support

2012-06-11 Thread Jack Morgenstein
1. allow only master to change node description 2. prevent ah leakage in send mads 3. take device part number from PCI structure, so that guests see the VF part number (and not the PF part number) 4. place the device revision ID into caps structure at startup 5. SET_PORT in update_gids_task need

[PATCH for-next 05/29] IB/core: Add ib_find_exact_cached_pkey() to search for 16-bit pkey match

2012-06-11 Thread Jack Morgenstein
When port pkey table potentially contains both full and partial membership copies for the same pkey, we need a function to find the exact (16-bit) pkey index. This is particularly necessary when the master forwards QP1 MADS sent by guests. If the guest has sent the MAD with a limited membership p
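
For context, the top bit of a 16-bit P_Key is the membership bit (set for full membership, clear for limited), so an exact match must compare all 16 bits. A minimal sketch of such a search over a plain array follows; the function name and the array-based table are assumptions for illustration, not the kernel's ib_find_exact_cached_pkey() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the index of the entry that equals pkey
 * in all 16 bits (membership bit included), or -1 if none matches. */
static int find_exact_pkey(const uint16_t *table, size_t len, uint16_t pkey)
{
    for (size_t i = 0; i < len; i++) {
        if (table[i] == pkey)
            return (int)i;
    }
    return -1;
}
```

With a table holding both 0xffff and 0x7fff, a search for 0x7fff returns the limited-membership slot rather than the full-membership one.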

[PATCH for-next 23/29] IB/mlx4: Propagate pkey and guid change port management events to slaves

2012-06-11 Thread Jack Morgenstein
pkey change and guid change events are not of interest to all slaves, but only to those slaves which "see" the table slots whose contents have changed. For example, if the guid at port 1, index 5 has changed in the PPF, we wish to propagate the gid-change event only to the function which has that g

[PATCH for-next 03/29] IB/mlx4: Add run-time switchable error path debug output capability

2012-06-11 Thread Jack Morgenstein
Implement debug printouts to assist in supporting the driver. The intent is to provide more detail in the log for errors which would otherwise be returned as -EINVAL. The facility is on-off switchable at run-time (i.e., module-parameter controlled, off by default). Error output is added here spe

[PATCH for-next 11/29] net/mlx4_core: Allow guests to support IB ports

2012-06-11 Thread Jack Morgenstein
Modify mlx4_dev_cap to allow IB support when SRIOV is active. Modify mlx4_slave_cap to set the "rdma-supported" bit in its flags area, and pass that to the guests (this is done in QUERY_FUNC_CAP and its wrapper). However, do not yet activate IB support (i.e., do not yet remove the error return at

[PATCH for-next 14/29] net/mlx4_core: Add proxy and tunnel QPs to the reserved QP area

2012-06-11 Thread Jack Morgenstein
In addition, pass the proxy and tunnel QP numbers to slaves so the driver can perform sqp paravirtualization. Signed-off-by: Jack Morgenstein --- drivers/net/ethernet/mellanox/mlx4/fw.c| 21 +++ drivers/net/ethernet/mellanox/mlx4/fw.h|3 ++ drivers/net/e

[PATCH for-next 08/29] {NET,IB}/mlx4: Use port management change event instead of smp_snoop

2012-06-11 Thread Jack Morgenstein
The port management change event can replace smp_snoop. If the capability bit for this event is set in dev-caps, the event is used (by the driver setting the PORT_MNG_CHG_EVENT bit in the async event mask in the MAP_EQ fw command). In this case, when the driver passes incoming SMP PORT_INFO SET ma

[PATCH for-next 10/29] net/mlx4_core: Implement mechanism for reserved qkeys

2012-06-11 Thread Jack Morgenstein
The sriov special-qp tunneling mechanism uses proxy special-qp's (instead of the real special qps) for MADs on guests. These proxy qp's send their packets to a "tunnel" qp owned by the master. The master then forwards the MAD (after any required paravirtualization) to the real special QP, which s

[PATCH for-next 06/29] IB/sa: Add GuidInfoRecord query support.

2012-06-11 Thread Jack Morgenstein
From: Erez Shitrit This query is needed for SRIOV alias GUID support. The query is implemented per the IB Spec definition in section 15.2.5.18 (GuidInfoRecord). Signed-off-by: Erez Shitrit Signed-off-by: Jack Morgenstein Signed-off-by: Or Gerlitz --- drivers/infiniband/core/sa_query.c | 133

[PATCH for-next 00/29] Add SRIOV support for IB interfaces

2012-06-11 Thread Jack Morgenstein
This patch set adds SRIOV support for IB interfaces. Patches 1-13 are "precondition" patches. Patches 14-29 actually implement the feature. This patch set introduces Infiniband SRIOV support for ConnectX2 and ConnectX3 devices. Each function presents itself as an independent vHCA (virtual HCA) t

[PATCH for-next 13/29] IB/mlx4: SRIOV IB context objects and proxy/tunnel sqp support

2012-06-11 Thread Jack Morgenstein
1. Introduce the basic sriov paravirtualization context objects for multiplexing and demultiplexing MADs. 2. Introduce support for the new proxy and tunnel QP types. This patch introduces the objects required by the master for managing QP paravirtualization for guests. struct mlx4_ib_sriov{} is

[PATCH for-next 02/29] IB/mlx4: Mask out high order bit of port_num in mlx4_ib_create_ah

2012-06-11 Thread Jack Morgenstein
The high-order bit of the port_num field is used for the forced-loopback flag. Note: we just noticed that we made illegal use here of the MSB of the port_num field in the ib core ah structure. We will correct this error in the V1 patch set. We did not want to delay the submission due to this. S
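
Splitting a flag out of the high-order bit of a port number can be sketched as below. The sketch assumes port_num is an 8-bit field, so the flag is 0x80 and the port number proper is the low 7 bits; the macro and function names are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PORT_LOOPBACK_FLAG 0x80u /* assumed: MSB of an 8-bit port_num */
#define PORT_NUMBER_MASK   0x7fu /* remaining bits carry the port number */

/* Hypothetical accessor: the port number with the flag bit masked out. */
static uint8_t port_number(uint8_t port_num)
{
    return (uint8_t)(port_num & PORT_NUMBER_MASK);
}

/* Hypothetical accessor: true when the forced-loopback flag is set. */
static bool forced_loopback(uint8_t port_num)
{
    return (port_num & PORT_LOOPBACK_FLAG) != 0;
}
```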

[PATCH for-next 04/29] IB/core: change pkey table lookups to support full and partial membership for the same pkey

2012-06-11 Thread Jack Morgenstein
Enhance the cached and non-cached pkey table lookups to enable limited and full members of the same pkey to co-exist in the pkey table. This is necessary for SRIOV to allow for a scheme where some guests would have the full membership pkey in their virtual pkey table, where other guests on the sa
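
When limited and full members of the same pkey share a table, a lookup has to compare the low 15 bits and then decide which entry to return. The sketch below prefers a full-membership entry and falls back to a limited one; that preference is an assumption about one reasonable policy, and the function and macro names are hypothetical rather than taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u /* membership bit of a 16-bit P_Key */
#define PKEY_BASE_MASK   0x7fffu /* low 15 bits identify the partition */

/* Hypothetical lookup: match on the 15-bit base value. Return the index
 * of a full-member entry if one exists, otherwise the first
 * limited-member entry, otherwise -1. */
static int find_pkey_prefer_full(const uint16_t *table, size_t len,
                                 uint16_t pkey)
{
    int partial = -1;

    for (size_t i = 0; i < len; i++) {
        if ((table[i] & PKEY_BASE_MASK) != (pkey & PKEY_BASE_MASK))
            continue;
        if (table[i] & PKEY_FULL_MEMBER)
            return (int)i;       /* full membership wins immediately */
        if (partial < 0)
            partial = (int)i;    /* remember the first limited match */
    }
    return partial;
}
```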

[PATCH for-next 01/29] net/mlx4_core: Pass an invalid PCI id number to VFs

2012-06-11 Thread Jack Morgenstein
Currently, VFs have 0 in their dev->caps.function field. This is a valid pci id (usually of the PF). Pass an invalid pci id to the VF via QUERY_FW, to make sure that if the value gets accessed in the VF driver, it will not be valid. Signed-off-by: Jack Morgenstein --- drivers/net/ethernet/mella

[PATCH for-next 07/29] IB/core: move macros from cm_msgs.h to ib_cm.h

2012-06-11 Thread Jack Morgenstein
These macros will be reused by the mlx4 SRIOV-IB CM paravirtualization code, and there is no reason to have them declared both in the IB core and in the mlx4 IB driver. Signed-off-by: Jack Morgenstein Signed-off-by: Or Gerlitz --- drivers/infiniband/core/cm_msgs.h | 12 include/rdma/

[PATCH] RDMA/ocrdma: Fixed polling RQ error CQE polling.

2012-06-11 Thread Parav Pandit
Fixed polling RQ/SRQ error CQE polling. Returning error CQE to consumer for error case which was not returned previously. Signed-off-by: Parav Pandit --- drivers/infiniband/hw/ocrdma/ocrdma_verbs.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw