RSS (Receive Side Scaling) technology spreads incoming traffic across
different receive descriptor queues.
Assigning each queue to a different CPU core gives better load balancing of
incoming traffic and improves performance.

This patch-set introduces new objects and verbs that allow verbs-based
solutions to utilize the RSS offload capability, which is widely supported
by modern NICs. It extends the IB and uverbs layers to support this
functionality and supplies a specific implementation for the mlx5_ib driver.

The implementation is based on an RFC that was sent to the list some months ago
and describes the expected verbs and objects.
RFC: http://www.spinics.net/lists/linux-rdma/msg25012.html

In addition, the URL below can be used as a reference for the motivation and
justification for adding the new objects that are described below.
http://lxr.free-electrons.com/source/Documentation/networking/scaling.txt

Overview of the changes:
- Add new objects: Work Queue and Receive Work Queues Indirection Table.
- Add new verbs that are required to handle the new objects:
  ib_create_wq(), ib_modify_wq(), ib_destroy_wq(),
  ib_create_rwq_ind_table(), ib_destroy_rwq_ind_table().

Work Queue: (ib_wq)
- A Work Queue is associated (many to one) with a Completion Queue.
- It owns Work Queue properties (PD, WQ size etc.).
- Currently the Work Queue type can be IB_WQT_RQ (receive queue); other types
  may be added in the future (e.g. IB_WQT_SQ, send queue).
- A Work Queue of type IB_WQT_RQ contains receive work requests.
- The Work Queue context is subject to well-defined state transitions done by
  the modify_wq verb.
- A Work Queue is a necessary component for RSS technology, since the RSS
  mechanism is supposed to distribute traffic between multiple Receive Work
  Queues.
  
Receive Work Queue Indirection Table: (ib_rwq_ind_tbl)
- Serves to spread traffic between Work Queues of type RQ.
- Can be modified dynamically to give different queues different relative
  weights.
- The receive queue for a packet is determined by a hash computed over the
  incoming packet.
- A Receive Work Queue Indirection Table is associated (one to many) with QPs.
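
To illustrate how an indirection table can map a packet hash to a receive
queue, here is a minimal standalone sketch. It is not the code from this
patch-set: the toy_* names, the 8-entry table size, and indexing by the low
bits of the hash are all assumptions made for illustration (the masked-hash
lookup follows the general description in scaling.txt). Relative weights are
modeled by letting a queue occupy more table slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen for the example only. */
#define TOY_IND_TBL_LOG_SIZE 3
#define TOY_IND_TBL_SIZE (1u << TOY_IND_TBL_LOG_SIZE)	/* 8 entries */

/* Hypothetical stand-in for an indirection table: each slot names an RQ. */
struct toy_rwq_ind_table {
	unsigned int rq_index[TOY_IND_TBL_SIZE];
};

/* Select the receive queue: the low bits of the hash index the table. */
static unsigned int toy_select_rq(const struct toy_rwq_ind_table *tbl,
				  uint32_t hash)
{
	assert(tbl != NULL);
	return tbl->rq_index[hash & (TOY_IND_TBL_SIZE - 1)];
}

/*
 * Count how many of the hash values 0..nhashes-1 land on each queue.
 * The caller provides a zeroed hits[] array with one slot per queue.
 */
static void toy_count_hits(const struct toy_rwq_ind_table *tbl,
			   uint32_t nhashes, unsigned int *hits)
{
	uint32_t hash;

	for (hash = 0; hash < nhashes; hash++)
		hits[toy_select_rq(tbl, hash)]++;
}
```

With the table { 0, 1, 0, 2, 0, 1, 0, 2 }, queue 0 owns four of the eight
slots and queues 1 and 2 own two each, so uniformly distributed hashes reach
the queues in a 2:1:1 ratio. Rewriting the slots changes the weights without
touching the queues themselves.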

Future extensions to this patch-set:
- Add ib_modify_rwq_ind_table() verb to enable a dynamic RQ mapping change.
- Introduce RSS hashing configuration that should be used to compute the
  required RQ entry for the incoming packet.
- Extend the ib_create_qp() verb to work with external WQs via the indirection
  table object and with RSS hash configuration.
  - Will enable a ULP/user application to benefit from RSS scaling.
  - QPs that support flow steering rules can benefit from RSS scaling in
    addition to the steering capabilities.
- Reflect RSS capabilities in the query device verb.
- User space support (i.e. libibverbs/vendor drivers) to expose the new verbs
  and objects.

Patches:
#1 - Exposes the required APIs from mlx5_core to be used in coming patches by
     mlx5_ib driver.
#2 - Introduces the Work Queue object and its verbs in the IB layer.
#3 - Adds uverbs support for the Work Queue verbs.
#4 - Implements the Work Queue verbs in mlx5_ib driver.
#5 - Introduces Receive Work Queue indirection table and its verbs in the IB
     layer.
#6 - Adds uverbs support for the Receive Work Queue indirection table verbs.
#7 - Implements the Receive Work Queue indirection table verbs in mlx5_ib
     driver.
 
Yishai Hadas (7):
  net/mlx5_core: Expose transobj APIs from mlx5 core
  IB: Introduce Work Queue object and its verbs
  IB/uverbs: Add WQ support
  IB/mlx5: Add receive Work Queue verbs
  IB: Introduce Receive Work Queue indirection table
  IB/uverbs: Introduce RWQ Indirection table
  IB/mlx5: Add Receive Work Queue Indirection table operations

 drivers/infiniband/core/uverbs.h                   |   12 +
 drivers/infiniband/core/uverbs_cmd.c               |  405 ++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c              |   38 ++
 drivers/infiniband/core/verbs.c                    |  111 ++++++
 drivers/infiniband/hw/mlx5/main.c                  |   13 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h               |   49 +++
 drivers/infiniband/hw/mlx5/qp.c                    |  319 +++++++++++++++
 drivers/infiniband/hw/mlx5/user.h                  |   15 +
 drivers/net/ethernet/mellanox/mlx5/core/en.h       |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/srq.c      |    2 +-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.c |    8 +-
 drivers/net/ethernet/mellanox/mlx5/core/transobj.h |   72 ----
 include/linux/mlx5/transobj.h                      |   73 ++++
 include/rdma/ib_verbs.h                            |  136 +++++++
 include/uapi/rdma/ib_user_verbs.h                  |   65 ++++
 15 files changed, 1245 insertions(+), 75 deletions(-)
 delete mode 100644 drivers/net/ethernet/mellanox/mlx5/core/transobj.h
 create mode 100644 include/linux/mlx5/transobj.h
