Re: [PATCH 10/11] IB: only keep a single key in struct ib_mr

2015-12-25 Thread Liran Liss
> From: Jason Gunthorpe > > >fill mr->key by the lkey or rkey based on that and everything will > > >work fine. > > > > But the ULP *can* register a memory buffer with local and remote > > access permissions. > Not in the new API. > > If a ULP ever comes along

RE: [PATCH for-next V2 00/11] Add RoCE v2 support

2015-12-17 Thread Liran Liss
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- > ow...@vger.kernel.org] On Behalf Of Doug Ledford > These patches add the concept of duplicate GIDs that are differentiated by > their RoCE version (also called network type). So, now, an incoming packet > could match a couple

RE: [PATCH for-next V2 00/11] Add RoCE v2 support

2015-12-16 Thread Liran Liss
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- > ow...@vger.kernel.org] On Behalf Of Doug Ledford > In particular, Liran piped up with this comment: > > "Also, I don't want to do any route resolution on the Rx path. A UD QP > completion just reports the details of the packet it

RE: [PATCH for-next V2 00/11] Add RoCE v2 support

2015-12-16 Thread Liran Liss
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- > > Since you and Jason did not reach a consensus, I have to dig in and > > see if these patches make it possible to break namespace confinement, > > either accidentally or with intentionally tricky behavior. That's > > going to take

RE: [PATCH for-next V2 05/11] IB/core: Add rdma_network_type to wc

2015-12-13 Thread Liran Liss
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- > > > You are pushing abstraction into provider code instead of handling it in a > generic way. > > No, I am defining an API that *makes sense* and doesn't leak useless details. > Of course that doesn't force code duplication or

RE: [PATCH for-next V2 05/11] IB/core: Add rdma_network_type to wc

2015-12-09 Thread Liran Liss
> From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com] > Look, here is a concrete direction: > > Replace all the crap in > ib_init_ah_from_wc/get_sgid_index_from_eth/rdma_addr_find_dmac_by_ > grh > > with a straightforward > >rdma_dgid_index_from_wc( >

RE: [PATCH for-next V1 5/9] IB/core: Add rdma_network_type to wc

2015-11-30 Thread Liran Liss
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- > > The abstraction at the gid cache is making it too easy to make this mistake. > It > is enabling callers to do direct gid lookups without a route lookup, which is > unconditionally wrong. Every call site into the gid cache I looked

RE: [PATCH for-next V7 6/6] IB/ucma: HW Device hot-removal support

2015-08-11 Thread Liran Liss
From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com] It does if you want a planned 'gentle' removal to be possible.. There could be a lot of design options for a 'gentle' removal, such as first sending a 'prepare' event, and only then doing the flow proposed here. I do not want

RE: [PATCH for-next V7 6/6] IB/ucma: HW Device hot-removal support

2015-08-06 Thread Liran Liss
From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Jason Gunthorpe I have no real problem with that, it would be nice to have an answer to the uverbs vs ucma removal ordering question and the basic issue of if we even want to do this so async

RE: [PATCH v1 08/12] IB/cma: Add net_dev and private data checks to RDMA CM

2015-07-16 Thread Liran Liss
From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com] After all, it is the payload that designates the entity that you want to establish a connection to, rather than the packet headers, which are just meant to relay the packet to the proper CM No, that isn't right. The IBA uses

RE: [PATCH v1 08/12] IB/cma: Add net_dev and private data checks to RDMA CM

2015-07-15 Thread Liran Liss
From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com] What is really missing here I guess is a mechanism that would enforce containers to only use certain pkeys - perhaps with something like an RDMA cgroup. It could force containers to only use approved pkeys not only with

RE: [PATCH 14/14] IB/mad: Add final OPA MAD processing

2015-06-18 Thread Liran Liss
From: Weiny, Ira [mailto:ira.we...@intel.com] ib_verbs define an *extensive* direct HW access API, which is constantly evolving. This is the problem with verbs... Huh? It is its strength, if you don't break backward compatibility... You cannot describe the intricate object relations

RE: [PATCH 14/14] IB/mad: Add final OPA MAD processing

2015-06-14 Thread Liran Liss
From: Doug Ledford [mailto:dledf...@redhat.com] But the node_type stands for more than just an abstract RDMA device: In IB, it designates an instance of an industry-standard, well-defined, device type: its possible link types, transport, semantics, management, everything. It *should* be

RE: [PATCH 14/14] IB/mad: Add final OPA MAD processing

2015-06-11 Thread Liran Liss
From: Doug Ledford [mailto:dledf...@redhat.com] OPA cannot impersonate IB; OPA node and link types have to be designated as such. In terms of MAD processing flows, both explicit (as in the handle_opa_smi() call below) and implicit code paths (which share IB flows - there are

RE: [PATCH 14/14] IB/mad: Add final OPA MAD processing

2015-06-10 Thread Liran Liss
From: Ira Weiny ira.we...@intel.com Hi Ira, OPA cannot impersonate IB; OPA node and link types have to be designated as such. In terms of MAD processing flows, both explicit (as in the handle_opa_smi() call below) and implicit code paths (which share IB flows - there are several cases) must

RE: [PATCH 14/14] IB/mad: Add final OPA MAD processing

2015-05-28 Thread Liran Liss
Why do you have RDMA_NODE_IB_SWITCH related stuff inside the handle_opa_smi() function? Is there a node type of switch in OPA similar to IB? Yes. OPA uses the same node types as IB. Ira No, OPA cannot impersonate IB. It has to have distinct node and link types. --Liran -- To

RE: [RESEND PATCH V3 for-next 0/3] HW Device hot-removal support

2015-05-28 Thread Liran Liss
From: Doug Ledford [mailto:dledf...@redhat.com] I suppose that the main issue would be handling existing user memory mappings, which cannot be just invalidated -- the user-space driver may not be aware of the device removal and may access this memory concurrently, and we don't want it

RE: [RESEND PATCH V3 for-next 0/3] HW Device hot-removal support

2015-05-28 Thread Liran Liss
From: Doug Ledford [mailto:dledf...@redhat.com] I suppose that the main issue would be handling existing user memory mappings, which cannot be just invalidated -- the user-space driver may not be aware of the device removal and may access this memory concurrently, and we don't want it

RE: [PATCH v6 02/26] IB/Verbs: Implement raw management helpers

2015-04-24 Thread Liran Liss
From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- Add raw helpers: rdma_tech_ib rdma_tech_iboe rdma_tech_iwarp rdma_ib_or_iboe (transition, clean up later) To help us detect which technology the port supported. Replace rdma_tech_* with rdma_protocol_*.

RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

2015-04-24 Thread Liran Liss
From: Hefty, Sean [mailto:sean.he...@intel.com] [snip] So, I think that our old-transport below is just fine. No need to change it (and you aren't, since it is currently implemented as a function). I think there is a need to change this. Encoding the transport into the node

RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

2015-04-24 Thread Liran Liss
From: Michael Wang [mailto:yun.w...@profitbricks.com] [snip] Depends on who is we. For ULPs, you are probably right. However, core services (e.g., mad management, CM, SA) do care about various details. In some cases, where it doesn't matter, this code will use management helpers.

RE: [PATCH v6 01/26] IB/Verbs: Implement new callback query_transport()

2015-04-24 Thread Liran Liss
From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- [snip] a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 65994a1..d54f91e 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -75,10 +75,13 @@ enum rdma_node_type { }; enum rdma_transport_type { +

RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

2015-04-22 Thread Liran Liss
From: Michael Wang [mailto:yun.w...@profitbricks.com] Hi, Liran Thanks for the comment :-) On 04/22/2015 01:36 AM, Liran Liss wrote: [snip] (**) This has been extended to also encode the transport in the current code. At least for user-space visible APIs, we might choose to leave

RE: [PATCH v5 00/27] IB/Verbs: IB Management Helpers

2015-04-21 Thread Liran Liss
Hi Michael, The spirit of this patch-set is great, but I think that we need to clarify some concepts. Since this will affect the whole patch-set, I am laying out my concerns here instead. A suggestion for the resulting management helpers is given below. I believe the result would be much more

RE: RE: [PATCH v3 for-next 01/33] IB/core: Add RoCE GID cache

2015-04-16 Thread Liran Liss
RoCE v2 is really Infiniband over UDP over IP. Why don't we just call it IBoUDP like it is? RoCEv2 is the name in the IBTA spec (Annex 17) We call RoCE IBoE in the kernel, because that's what it is. RoCE is an IBTA marketing name. Looking through the Annex, I don't see where

RE: RE: [PATCH v3 for-next 01/33] IB/core: Add RoCE GID cache

2015-04-16 Thread Liran Liss
The RoCE Verbs interface references the HCA GID table in QPs and AHs, for all RoCE versions. The IBTA specifically does not define software interfaces. The concern here is the architecture and definition of the linux rdma software stack, not verbs, despite the fact that the layer is

RE: Status of ummunot branch?

2013-06-10 Thread Liran Liss
Here are a few more clarifications: 1) ODP MRs can cover address ranges that do not have a mapping at registration time. This means that MPI can register in advance, say, the lower GBs of the address space, covering malloc's primary arena. Thus, there is no need to adjust to each increase in

RE: Status of ummunot branch?

2013-06-10 Thread Liran Liss
-Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma- ow...@vger.kernel.org] On Behalf Of Jeff Squyres (jsquyres) Sent: Monday, June 10, 2013 5:50 PM To: Jason Gunthorpe Cc: Haggai Eran; Or Gerlitz; linux-rdma@vger.kernel.org; Shachar Raindel Subject: Re:

RE: [PATCH net-next V0 18/21] mlx4_core: adjust catas operation for SRIOV mode

2011-12-04 Thread Liran Liss
On Fri, Dec 2, 2011 at 2:19 AM, Yevgeny Petrilin yevge...@mellanox.co.il wrote: When running in SRIOV mode, driver should not automatically start/stop the mlx4_core upon sensing an HCA internal error -- doing this disables/enables sriov, which will cause the hypervisor to hang if there

RE: When IBoE will be merged to upstream?

2010-07-20 Thread Liran Liss
Small correction needed regarding the multicast forwarding. Since we are talking about IPv6 multicast groups, which translate to 33:33:xx:xx:xx:xx MAC address, the router listener notification protocol is going to be MLD and not IGMP. Still there are switches which support MLD
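
The 33:33:xx:xx:xx:xx address mentioned above follows the standard IPv6-over-Ethernet multicast mapping: the fixed prefix 33:33 followed by the low-order 32 bits of the IPv6 multicast group. A minimal Python sketch of that mapping is given here for illustration only; the function name `ipv6_multicast_to_mac` is hypothetical and is not taken from the thread or from any kernel code.

```python
import ipaddress


def ipv6_multicast_to_mac(group: str) -> str:
    """Map an IPv6 multicast group to its Ethernet multicast MAC.

    Illustrative helper (hypothetical name): prefix 33:33 plus the
    last four octets of the 128-bit group address.
    """
    addr = ipaddress.IPv6Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv6 multicast address")
    # Keep only the low-order 32 bits of the group address.
    low32 = addr.packed[-4:]
    return ":".join(f"{octet:02x}" for octet in b"\x33\x33" + low32)


if __name__ == "__main__":
    # All-nodes link-local group.
    print(ipv6_multicast_to_mac("ff02::1"))
    # A solicited-node style group.
    print(ipv6_multicast_to_mac("ff02::1:ff12:3456"))
```

Because only 32 bits survive the mapping, distinct groups can share one MAC address, which is one reason switch-side snooping (MLD in this case) has to look beyond the destination MAC.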

RE: When IBoE will be merged to upstream?

2010-07-15 Thread Liran Liss
The text is saying that the specification does not use any of the LID fields in the verbs interface, that is it. It isn't talking about MAC addresses. Exactly how and where the MAC address comes about was never decided, and at least some participants thought it should be a

RE: When IBoE will be merged to upstream?

2010-07-15 Thread Liran Liss
But, we can't mandate an overload of the GID in a way that it prevents its use as a true L3 address (eventually routable). Actually I'm beginning to think that the only possible way we can use the GID in IBoE is as a link-local IPv6 address containing an Ethernet address. Trying
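
One common way to build a link-local IPv6 address that contains an Ethernet address is the modified EUI-64 scheme: insert ff:fe in the middle of the MAC, flip the universal/local bit, and place the result under fe80::/64. The Python sketch below shows that construction as an assumption about what such a GID could look like; the thread excerpt does not confirm that IBoE uses exactly this encoding, and the name `mac_to_link_local_gid` is hypothetical.

```python
import ipaddress


def mac_to_link_local_gid(mac: str) -> str:
    """Derive an fe80::/64 address from a MAC via modified EUI-64.

    Illustrative only (hypothetical name); the actual GID format used
    by a given IBoE implementation may differ.
    """
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"{mac} is not a 48-bit MAC address")
    # Flip the universal/local bit and insert ff:fe between the halves.
    interface_id = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    # Link-local prefix fe80::/64 followed by the 64-bit interface ID.
    prefix = b"\xfe\x80" + b"\x00" * 6
    return str(ipaddress.IPv6Address(prefix + interface_id))


if __name__ == "__main__":
    print(mac_to_link_local_gid("00:02:c9:12:34:56"))
```

The encoding is reversible, so the destination MAC can be recovered from such a GID without a separate resolution step, but it also ties the GID to L2, which is the routability concern raised in the message.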

RE: When IBoE will be merged to upstream?

2010-07-15 Thread Liran Liss
A quibble about multicast - AFAIK this is unsolved. I think some spec needs to be agreed that documents what sort of multicast snooping operations switches need to do, ie if IGMP joins imply that IBoE traffic for the same DMAC is included in the join, or if IBoE requires a

RE: When IBoE will be merged to upstream?

2010-07-13 Thread Liran Liss
...A verbs consumer using a RoCE network relies strictly on so-called Layer 3 addressing (GIDs); layer 2 addresses (e.g. subnet local identifiers) are not passed across the verbs interface... Ah, hmm, well, I was on that list during this time and I don't think this

RE: When IBoE will be merged to upstream?

2010-07-10 Thread Liran Liss
as possible, while preserving transparency to the applications. Comments are welcome. Liran -Original Message- From: Or Gerlitz [mailto:ogerl...@voltaire.com] Sent: Wednesday, July 07, 2010 9:00 AM To: Liran Liss Cc: Roland Dreier; Jason Gunthorpe; Hefty, Sean; Aleksey Senin; linux-rdma

RE: When IBoE will be merged to upstream?

2010-06-25 Thread Liran Liss
: Thursday, June 24, 2010 11:37 PM To: Liran Liss Cc: Hefty, Sean; Roland Dreier; Aleksey Senin; linux-rdma; mo...@voltaire.com; aleks...@voltaire.com; yift...@voltaire.com; Tziporet Koren; al...@voltaire.com Subject: Re: When IBoE will be merged to upstream? The current behavior of ibv_create_ah

RE: When IBoE will be merged to upstream?

2010-06-24 Thread Liran Liss
Regarding GID to Eth mappings, we discussed using the standard create_ah() Verb for this. In the kernel, create_ah() will call a generic address resolution function in the cma. The returned information will be copied back to user-space in a device-specific structure (since address handles are

RE: When IBoE will be merged to upstream?

2010-06-24 Thread Liran Liss
S.B. --Liran -Original Message- From: Hefty, Sean [mailto:sean.he...@intel.com] Sent: Thursday, June 24, 2010 9:06 PM To: Liran Liss; Roland Dreier; Aleksey Senin Cc: linux-rdma; mo...@voltaire.com; aleks...@voltaire.com; yift...@voltaire.com; Tziporet Koren; al...@voltaire.com Subject

RE: [PATCHv8 07/11] ib_core: Add API to support IBoE from userspace

2010-05-17 Thread Liran Liss
If we have a dedicated ABI call for this mapping, then it seems reasonable to have it device independent. However, this mapping is really only used when creating address handles. So, we can base the mapping on the (device specific) create_ah() flow, but provide generic mapping functions for all

RE: RDMAoE verbs questions

2009-12-03 Thread Liran Liss
Existing apps rely on transport_type == IBV_TRANSPORT_IB to indicate IB management is present. There are many examples of this. The art of API compatibility is to not break existing old apps, so you don't get to change the meaning of transport_type == IBV_TRANSPORT_IB to mean 'it is only IB

RE: RDMAoE verbs questions

2009-12-02 Thread Liran Liss
So? There are substantial semantic differences for *all* non-rdmacm applications. Even common ones like OpenMPI. You propose to ignore them? On the contrary! Any application that *does* care what the link layer is can look up a new field in port_attr (rather than a new node transport type).

RE: RDMAoE verbs questions

2009-12-02 Thread Liran Liss
Hi Paul, you are not missing anything - loopback communication will work in RDMAoE just as in IB. --Liran -Original Message- From: Paul Grun [mailto:pg...@systemfabricworks.com] Sent: Wednesday, December 02, 2009 10:55 AM To: 'Or Gerlitz'; Liran Liss Cc: 'Sean Hefty'; 'Jason Gunthorpe