Re: [PATCH 0/2] Adding support for RoCE V2 specification

2014-12-01 Thread Moni Shoua
On Tue, Dec 2, 2014 at 7:35 AM, Somnath Kotur  wrote:
> HI Moni,
> Thank you for your comments , please find my response inline.
>
>
>> > The overarching goal behind this patch is to keep RDMA-CM as the
>> > central entity that decides the protocol (ROCEV2 /ROCEV1) and hides all the
>> address resolution details from applications.
>> I agree but with one comment. RDMA-CM, although preferred,  is not
>> mandatory to establish a connection. The solution needs to take care also for
>> apps that don't use RDMA-CM
>
> Agreed and this patch takes care of that as well thanks to the solution that 
> was put in place
> for IP-based GIDs anyway in `modify_qp` effectively by means of the 
> ib_resolve_l2_attrs().
>
>>
>> > This  is just a continuation of the existing RDMA-CM design philosophy
>> > for IP Based GIDs that lets applications operate over any form of RDMA
>> Service in a completely transparent way without any changes to the
>> applications themselves.
>> >
>> > Protocol Resolution(RoCEV2 or V1) must be done even before we send out
>> > the 1st connection request packet, the corresponding MAD pkt must be V2
>> for a destination that is behind  a router, correct?
>> > I am not sure we want to try sending out RoCEv2 connection requests first
>> and if that fails ,then fallback/retry with RoCEv1 request.
>> I agree but when you send the first packet (request or response) you know
>> the SGID index to be used. Knowing that and assuming you chose the SGID
>> index correctly there is no need to guess.
>> A suggestion for choosing the SGID index is:
>> 1. First, populate GID table with GIDs matching the IP address list of the
>> corresponding net devices. Each GID will appear twice, one for
>> RoCEv1 and one for RoCEv2.
>> 2. When connection is being established RDMA-CM will choose one of the
>> 2 matching indices based on a preferred protocol attribute for the rmda_id
>> (new attr.) 3. Proffered protocol is
>>  a) global for the node/fabric based on the administrator settings
>>  b) can be changed by applications that want to override the default
>> (requires extra API)
>
> Agree to point 1 which is what we were thinking of doing as well as part of 
> the 'override' option.
> Still feel that the 'default' protocol resolution should be based on help 
> from the IP/Networking stack
> as that would straight away enable existing applications and majority of 
> use-case scenarios that use RDMA-CM to seamlessly
> work without any change for RoCE V2.
> What this means is now RDMA-CM will choose one of the 2 matching GID indices 
> based on help from the IP stack.
>
> For this to happen , the 'gid_cache' structure will have to undergo a change 
> to now incorporate a new 'gid_type' field.
> Accordingly the helper functions that find a cached gid entry - 
> ib_find_cached_gid() will also undergo a change to take into account this new 
> field.
>
> I plan to resubmit the patch series making the following changes
> a) Populating GID table with 2 entries per IP address
> b) Changes to the GID cache structure along with it's helper functions
> c) Make sure that RDMA-CM selects/finds the corresponding GID once a network 
> type is chosen based on help from the IP stack.
>
>
> The only thing missing is the ability for end-nodes connected locally inside 
> same subnet to communicate using V2 and that
> could be provided as an 'override' option using the 'preferred protocol' 
> attribute setting on a system-wide basis or on an application basis
> as you have proposed.
> This I propose as an additional patch after the above patch goes in first.
>
> Does this sound like a plan ?
>
> Thanks
> Som
>
Hi Som
Generally, I agree with the changes you suggested. To summarize (and
make sure I got it right) your plan for V2 is to:

1. Associate gid_index with RoCE type in GID table and GID cache.
Requires helper functions to search for gid_index according to GID
value and type
2. Move the code that monitors IP address changes to populate GID
table from device driver to IB/core. Requires each RoCE device driver
to implement add_gid_entry/rem_gid_entry functions
3. Remove the RoCE type from address structure (e.g. ib_ah_attr).
Leave it only in rdma_addr to keep the hint from the IP stack
4. Choose sgid_index (from all matching entries) in RDMA-CM based on
hint from IP stack, user preference and admin policy

If so, I guess that we have a plan.

Thanks
Moni
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 0/2] Adding support for RoCE V2 specification

2014-12-01 Thread Somnath Kotur
HI Moni,
Thank you for your comments , please find my response inline.


> > The overarching goal behind this patch is to keep RDMA-CM as the
> > central entity that decides the protocol (ROCEV2 /ROCEV1) and hides all the
> address resolution details from applications.
> I agree but with one comment. RDMA-CM, although preferred,  is not
> mandatory to establish a connection. The solution needs to take care also for
> apps that don't use RDMA-CM

Agreed and this patch takes care of that as well thanks to the solution that 
was put in place 
for IP-based GIDs anyway in `modify_qp` effectively by means of the 
ib_resolve_l2_attrs().

> 
> > This  is just a continuation of the existing RDMA-CM design philosophy
> > for IP Based GIDs that lets applications operate over any form of RDMA
> Service in a completely transparent way without any changes to the
> applications themselves.
> >
> > Protocol Resolution(RoCEV2 or V1) must be done even before we send out
> > the 1st connection request packet, the corresponding MAD pkt must be V2
> for a destination that is behind  a router, correct?
> > I am not sure we want to try sending out RoCEv2 connection requests first
> and if that fails ,then fallback/retry with RoCEv1 request.
> I agree but when you send the first packet (request or response) you know
> the SGID index to be used. Knowing that and assuming you chose the SGID
> index correctly there is no need to guess.
> A suggestion for choosing the SGID index is:
> 1. First, populate GID table with GIDs matching the IP address list of the
> corresponding net devices. Each GID will appear twice, one for
> RoCEv1 and one for RoCEv2.
> 2. When connection is being established RDMA-CM will choose one of the
> 2 matching indices based on a preferred protocol attribute for the rmda_id
> (new attr.) 3. Proffered protocol is
>  a) global for the node/fabric based on the administrator settings
>  b) can be changed by applications that want to override the default
> (requires extra API)

Agree to point 1 which is what we were thinking of doing as well as part of the 
'override' option.
Still feel that the 'default' protocol resolution should be based on help from 
the IP/Networking stack
as that would straight away enable existing applications and majority of 
use-case scenarios that use RDMA-CM to seamlessly
work without any change for RoCE V2.
What this means is now RDMA-CM will choose one of the 2 matching GID indices 
based on help from the IP stack.

For this to happen , the 'gid_cache' structure will have to undergo a change to 
now incorporate a new 'gid_type' field. 
Accordingly the helper functions that find a cached gid entry - 
ib_find_cached_gid() will also undergo a change to take into account this new 
field.

I plan to resubmit the patch series making the following changes
a) Populating GID table with 2 entries per IP address
b) Changes to the GID cache structure along with it's helper functions
c) Make sure that RDMA-CM selects/finds the corresponding GID once a network 
type is chosen based on help from the IP stack. 


The only thing missing is the ability for end-nodes connected locally inside 
same subnet to communicate using V2 and that 
could be provided as an 'override' option using the 'preferred protocol' 
attribute setting on a system-wide basis or on an application basis 
as you have proposed.
This I propose as an additional patch after the above patch goes in first.

Does this sound like a plan ?

Thanks
Som

N�r��yb�X��ǧv�^�)޺{.n�+{��ٚ�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: [RFC PATCH 00/16] ib_mad: Add support for Intel Omni-Path Architecture (OPA) MAD processing.

2014-12-01 Thread Weiny, Ira
> >
> >>>[RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
> >> why not use a single kmem-cache instance with a non hard coded
> >> element size, 256B (or whatever we use today) or 2KB?
> > I wanted to be able to adjust the element count of the caches separately to
> better tune overall memory usage.  However, I stopped short of adding
> additional module parameters to adjust the 2K cache at this time.
> 
> 
> I tend to think that the resulted code is too much of a special purpose one
> under a  (jumbo == 2K) assumption. See some more comments in the individual
> patches and we'll take it from there.
> 

Ok, I'll address those comments in the other email threads.

> 
> 
> >
> >> Also (nit), please change the prefix for all patches to be IB/mad:
> >> and not
> >> ib/mad: to comply with the existing habit of patch titles for the IB
> >> subsystem
> > I will thanks.
> 
> Good. See below another easy-to-fix nitpicking comment, but before that, for
> the sake of easier review and post-robustness of the code to future 
> bisections,
> please do a re-ordering of the series such that all general refactoring and 
> pre-
> patches come before the OPApatches.
> 
> This goes to re-order the current series such tat patches 8/9/10 are located
> after patch 14, as listed here:
> 
>[RFC PATCH 01/16] ib/mad: rename is_data_mad to is_rmpp_data_mad
>[RFC PATCH 02/16] ib/core: add IB_DEVICE_JUMBO_MAD_SUPPORT device
> cap
>[RFC PATCH 03/16] ib/mad: Add check for jumbo MADs support on a device
>[RFC PATCH 04/16] ib/mad: add base version parameter to
>[RFC PATCH 05/16] ib/mad: Add MAD size parameters to process_mad
>[RFC PATCH 06/16] ib/mad: Create jumbo_mad data structures
>[RFC PATCH 07/16] ib/mad: create a jumbo MAD kmem_cache
>[RFC PATCH 11/16] ib/mad: create helper function for
>[RFC PATCH 12/16] ib/mad: create helper function for
>[RFC PATCH 13/16] ib/mad: create helper function for
>[RFC PATCH 14/16] ib/mad: Create helper function for SMI processing
>[RFC PATCH 08/16] ib/mad: Add Intel Omni-Path Architecture defines
>[RFC PATCH 09/16] ib/mad: Implement support for Intel Omni-Path
>[RFC PATCH 10/16] ib/mad: Add registration check for Intel Omni-Path
>[RFC PATCH 15/16] ib/mad: Implement Intel Omni-Path Architecture SMP
>[RFC PATCH 16/16] ib/mad: Implement Intel Omni-Path Architecture General
> 

Done.

> Another easy-to-fix nitpicking comment would be to have all the patches be
> consistent w.r.t to the capitalization of the 1st letter in the 1st word 
> after the
> IB/core: or IB/mad:  prefix, e.g
> 
> ib/mad: create helper function for smi_handle_dr_smp_send
> 
> becomes
> 
> IB/mad: Create helper function for smi_handle_dr_smp_send

Done.

> 
> BTW, here my personal preference is "Add helper" and not "Create helper"
> 
> IB/mad: Add helper function for smi_handle_dr_smp_send

Done.

Ira

N�r��yb�X��ǧv�^�)޺{.n�+{��ٚ�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

RE: [PATCH] iw_cxgb4: Wake up waiters after flushing the qp

2014-12-01 Thread Steve Wise
Hey Roland, please reply and let me know if you will push these to 3.18 (would 
have to happen now) or if they are destined for 3.19.

Thanks,

Steve.


> -Original Message-
> From: Steve Wise [mailto:sw...@opengridcomputing.com]
> Sent: Friday, November 21, 2014 9:42 AM
> To: 'rol...@purestorage.com'
> Cc: 'linux-rdma@vger.kernel.org'
> Subject: RE: [PATCH] iw_cxgb4: Wake up waiters after flushing the qp
> 
> I've pushed these to:
> 
> git://git.openfabrics.org/~swise/linux.git for_roland
> 
> Steve
> 
> Hariprasad Shenai (2):
>   iw_cxgb4: Fixes locking issue in process_mpa_request
>   iw_cxgb4: limit MRs to < 8GB for T4/T5 devices
> 
> Pramod Kumar (2):
>   cxgb4/rdma:Increase epd buff size for debug interface
>   cxgb4/rdma: configure 0B MRs to match HW implementation
> 
> Steve Wise (1):
>   iw_cxgb4: Wake up waiters after flushing the qp
> 
> 
> > -Original Message-
> > From: Steve Wise [mailto:sw...@opengridcomputing.com]
> > Sent: Friday, November 21, 2014 9:26 AM
> > To: rol...@purestorage.com
> > Cc: linux-rdma@vger.kernel.org
> > Subject: Re: [PATCH] iw_cxgb4: Wake up waiters after flushing the qp
> >
> > Hey Roland,
> >
> > Hope you can get this bug fix into 3.18.  Plus these others that have
> > been submitted a while ago.  Here is the entire list:
> >
> > https://patchwork.kernel.org/patch/5106541/
> > https://patchwork.kernel.org/patch/5249511/
> > https://patchwork.kernel.org/patch/5249501/
> > https://patchwork.kernel.org/patch/5089511/
> > https://patchwork.kernel.org/patch/5089501/
> >
> >
> > Thanks,
> >
> > Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html