> In order to manage multiple types, vlans and MACs per GID, we
> need to store them along the GID itself. We store the net device
> as well, as sometimes GIDs should be handled according to the
> net device they came from. Since populating the GID table should
> be identical for every RoCE provider, the GIDs table should be
> handled in ib_core.
>
> Adding a GID cache table that supports a lockless find, add and
> delete gids. The lockless nature comes from using a unique
> sequence number per table entry and detecting that while reading/
> writing this sequence wasn't changed.
>
> By using this RoCE GID cache table, providers must implement a
> modify_gid callback. The table is managed exclusively by
> this roce_gid_cache and the provider just need to write
> the data to the hardware.
>
> Signed-off-by: Matan Barak <[email protected]>
> Signed-off-by: Somnath Kotur <[email protected]>
> ---
> drivers/infiniband/core/Makefile | 3 +-
> drivers/infiniband/core/core_priv.h | 24 ++
> drivers/infiniband/core/roce_gid_cache.c | 518
Why does RoCE need such a complex gid cache? If a gid cache is needed at all,
why should it be restricted to RoCE only? And why is such a complex
synchronization scheme needed? Seriously, how many times will GIDs change and
how many readers at once do you expect to have?
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 65994a1..1866595 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -64,6 +64,36 @@ union ib_gid {
> } global;
> };
>
> +extern union ib_gid zgid;
> +
> +enum ib_gid_type {
> + /* If link layer is Ethernet, this is RoCE V1 */
I don't understand this comment. Does RoCE v2 not run on Ethernet?
> + IB_GID_TYPE_IB = 0,
> + IB_GID_TYPE_ROCE_V2 = 1,
> + IB_GID_TYPE_SIZE
> +};
Can you explain the purpose of defining a 'GID type'. A GID is just a global
address. Why does it matter to anyone using it how it was constructed?
> +
> +struct ib_gid_attr {
> + enum ib_gid_type gid_type;
> + struct net_device *ndev;
> +};
> +
> +struct ib_roce_gid_cache_entry {
> + /* seq number of 0 indicates entry being changed. */
> + unsigned int seq;
> + union ib_gid gid;
> + struct ib_gid_attr attr;
> + void *context;
> +};
> +
> +struct ib_roce_gid_cache {
> + int active;
> + int sz;
> + /* locking against multiple writes in data_vec */
> + struct mutex lock;
> + struct ib_roce_gid_cache_entry *data_vec;
> +};
> +
> enum rdma_node_type {
> /* IB values map to NodeInfo:NodeType. */
> RDMA_NODE_IB_CA = 1,
> @@ -265,7 +295,9 @@ enum ib_port_cap_flags {
> IB_PORT_BOOT_MGMT_SUP = 1 << 23,
> IB_PORT_LINK_LATENCY_SUP = 1 << 24,
> IB_PORT_CLIENT_REG_SUP = 1 << 25,
> - IB_PORT_IP_BASED_GIDS = 1 << 26
> + IB_PORT_IP_BASED_GIDS = 1 << 26,
> + IB_PORT_ROCE = 1 << 27,
> + IB_PORT_ROCE_V2 = 1 << 28,
Why does RoCE suddenly require a port capability bit? RoCE runs today without
setting any bit.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html