Re: [PATCH v2 0/5] Support for generalized use of make C={1,2} via a wrapper program

2017-12-18 Thread Leon Romanovsky
On Mon, Dec 18, 2017 at 07:39:50PM +0100, Knut Omang wrote:
> On Mon, 2017-12-18 at 17:56 +, Bart Van Assche wrote:
> > On Mon, 2017-12-18 at 10:46 -0700, Jason Gunthorpe wrote:
> > > On Sun, Dec 17, 2017 at 10:00:17PM -0800, Joe Perches wrote:
> > >
> > > > > Today when we run checkers we get so many warnings it is too hard to
> > > > > make any sense of it.
> > > >
> > > > Here is a list of the checkpatch messages for drivers/infiniband
> > > > sorted by type.
> > > >
> > > > Many of these might be corrected by using
> > > >
> > > > $ ./scripts/checkpatch.pl -f --fix-inplace --types= \
> > > >   $(git ls-files drivers/infiniband/)
> > >
> > > How many of these do you think it is worth to fix?
> > >
> > > We do get a steady trickle of changes in this topic every cycle.
> > >
> > > Is it better to just do a big number of them all at once? Do you have
> > > an idea how disruptive this kind of work is to the whole patch flow
> > > eg new patches no longer applying to for-next, backports no longer
> > > applying, merge conflicts?
> >
> > In my opinion patches that only change the coding style and do not change 
> > any
> > functionality are annoying. Before posting a patch that fixes a bug the 
> > change
> > history (git log -p) has to be cheched to figure out which patch introduced
> > the bug. Patches that only change coding style pollute the change history.
>
> I agree with you - the problem is that style issues should not have existed.
> But when they do it becomes a problem to remove them and a problem to
> keep them - for instance us who try to be compliant by having style helpers
> in our editor, we end up having to manually revert old style mistakes back in
> to avoid making unrelated whitespace changes or similar.

If the checkpatch.pl complains about coding style for the new patch in
newly added code, I'm asking from the author to prepare cleanup patch so
it will be applied before actual patch.

In case, complains are for code which patch are not touching, I'm
submitting it as is.

Thanks


signature.asc
Description: PGP signature


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Leon Romanovsky
On Sun, Feb 21, 2016 at 07:41:08PM +0530, Parav Pandit wrote:
> CONFIG_CGROUP_RDMA
> 
> On Sun, Feb 21, 2016 at 7:15 PM, Leon Romanovsky  wrote:
> > On Sun, Feb 21, 2016 at 05:03:05PM +0530, Parav Pandit wrote:
> >> On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
> >> > On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> >> > Can you place this ifdef before declaring struct rdma_cgroup?
> >> Yes. I missed out this cleanup. Done locally now.
> >
> > Great, additional thing which spotted my attention was related to
> > declaring and using the new cgroups functions. There are number of
> > places where you protected the calls by specific ifdefs in the
> > IB/core c-files and not in h-files as it is usually done.
> >
> ib_device_register_rdmacg, ib_device_unregister_rdmacg are the only
> two functions called from IB/core as its tied to functionality.
> They can also be implemented as NULL call when CONFIG_CGROUP_RDMA is 
> undefined.
> (Similar to ib_rdmacg_try_charge and others).
> I didn't do because occurrence of call of register and unregister is
> limited to single file and only twice compare to charge/uncharge
> functions.
> Either way is fine with me, I can make the changes which you
> described. Let me know.

Please do,
IMHO, it is better to have one place which handles all relevant ifdefs
and functions. IB/core doesn't need to know about cgroups implementation.

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Leon Romanovsky
On Sun, Feb 21, 2016 at 05:03:05PM +0530, Parav Pandit wrote:
> On Sun, Feb 21, 2016 at 1:13 PM, Leon Romanovsky  wrote:
> > On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> > Can you place this ifdef before declaring struct rdma_cgroup?
> Yes. I missed out this cleanup. Done locally now.

Great, additional thing which spotted my attention was related to
declaring and using the new cgroups functions. There are number of
places where you protected the calls by specific ifdefs in the
IB/core c-files and not in h-files as it is usually done.

> 
> > Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] rdmacg: Added rdma cgroup controller

2016-02-21 Thread Leon Romanovsky
On Sat, Feb 20, 2016 at 04:30:04PM +0530, Parav Pandit wrote:
> Added rdma cgroup controller that does accounting, limit enforcement
> on rdma/IB verbs and hw resources.
> 
> Added rdma cgroup header file which defines its APIs to perform
> charing/uncharing functionality and device registration which will
> participate in controller functions of accounting and limit
> enforcements. It also define rdmacg_device structure to bind IB stack
> and RDMA cgroup controller.
> 
> RDMA resources are tracked using resource pool. Resource pool is per
> device, per cgroup entity which allows setting up accounting limits
> on per device basis.
> 
> Resources are not defined by the RDMA cgroup, instead they are defined
> by the external module IB stack. This allows extending IB stack
> without changing kernel, as IB stack is going through changes
> and enhancements.
> 
> Resource pool is created/destroyed dynamically whenever
> charging/uncharging occurs respectively and whenever user
> configuration is done. Its a tradeoff of memory vs little more code
> space that creates resource pool whenever necessary,
> instead of creating them during cgroup creation and device registration
> time.
> 
> Signed-off-by: Parav Pandit 
> ---
>  include/linux/cgroup_rdma.h   |  53 +++
>  include/linux/cgroup_subsys.h |   4 +
>  init/Kconfig  |  10 +
>  kernel/Makefile   |   1 +
>  kernel/cgroup_rdma.c  | 753 
> ++
>  5 files changed, 821 insertions(+)
>  create mode 100644 include/linux/cgroup_rdma.h
>  create mode 100644 kernel/cgroup_rdma.c
> 
> diff --git a/include/linux/cgroup_rdma.h b/include/linux/cgroup_rdma.h
> new file mode 100644
> index 000..b370733
> --- /dev/null
> +++ b/include/linux/cgroup_rdma.h
> @@ -0,0 +1,53 @@
> +#ifndef _CGROUP_RDMA_H
> +#define _CGROUP_RDMA_H
> +
> +#include 
> +
> +struct rdma_cgroup {
> +#ifdef CONFIG_CGROUP_RDMA
> + struct cgroup_subsys_state  css;
> +
> + spinlock_t rpool_list_lock; /* protects resource pool list */
> + struct list_head rpool_head;/* head to keep track of all resource
> +  * pools that belongs to this cgroup.
> +  */
> +#endif
> +};
> +
> +#ifdef CONFIG_CGROUP_RDMA

I'm sure that you already asked about that, but why do you need ifdef
embedded in struct rdma_cgroup and right after that the same one?
Can you place this ifdef before declaring struct rdma_cgroup?

> +
> +struct rdmacg_device;
> +

Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv10 1/3] rdmacg: Added rdma cgroup controller

2016-04-05 Thread Leon Romanovsky
On Tue, Apr 05, 2016 at 05:55:26AM -0700, Parav Pandit wrote:
> Hi Christoph,
> 
> On Tue, Apr 5, 2016 at 5:42 AM, Christoph Hellwig  wrote:
> > On Tue, Apr 05, 2016 at 05:39:21AM -0700, Parav Pandit wrote:
> >> I am not really trying to address OFED issues here. I am sure you
> >> understand that if ib_core.ko kernel module is in-kernel module than,
> >> for all the fixes/enhancements that goes to it would require people to
> >> upgrade to newer kernel, instead of just modules upgrade. Such heavy
> >> weight upgrade slows down the adoption which i am trying to avoid here
> >> by placing knobs in right kernel module's hand.
> >
> > What a load of rubbish.  The Linux kernel is one program and should be
> > upgraded together.
> 
> Just because we add one more rdma resource, we need to ask someone to
> upgrade kernel?

It doesn't make sense. Kernel and modules are always coming together,
the attempts to mix kernel and modules from different versions can lead
to many interesting debug scenarios.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-09-15 Thread Leon Romanovsky
On Wed, Sep 14, 2016 at 12:36:19PM +0530, Parav Pandit wrote:
> Hi Dennis,
>
> Do you know how would HFI1 driver would work along with rdma cgroup?
>
> Hi Matan, Leon, Jason,
> Apart from HFI1, is there any other concern?
> Or Patch is good to go?

I didn't review it yet :(.
Sorry

>
> 4.8 dates are close by (2 weeks) and there are two git trees involved
> (that might cause merge error to Linus) so if there are no issues, I
> would like to make request to Doug to consider it for 4.8 early on.
>
> Parav
>
> On Mon, Sep 12, 2016 at 10:37 AM, Leon Romanovsky  wrote:
> > On Sun, Sep 11, 2016 at 11:52:35AM -0600, Jason Gunthorpe wrote:
> >> On Sun, Sep 11, 2016 at 07:24:45PM +0200, Christoph Hellwig wrote:
> >> > > > > I've posted some initial work toward a) a while ago, and once we
> >> > >
> >> > > Did it get merged? Do you have a pointer?
> >> >
> >> > http://www.spinics.net/lists/linux-rdma/msg31958.html
> >>
> >> Right, I remember that. Certainly the right direction
> >>
> >> > > However, everything under verbs is not straightforward. The files in
> >> > > userspace are not copies...
> >> > >
> >> > > user:
> >> > >
> >> > > struct ibv_query_device {
> >> > >__u32 command;
> >> > >__u16 in_words;
> >> > >__u16 out_words;
> >> > >__u64 response;
> >> > >__u64 driver_data[0];
> >> > > };
> >> > >
> >> > > kernel:
> >> > >
> >> > > struct ib_uverbs_query_device {
> >> > > __u64 response;
> >> > > __u64 driver_data[0];
> >> > > };
> >> >
> >> > We'll obviously need different strutures for the libibvers API
> >> > and the kernel interface in this case, and we'll need to figure out
> >> > how to properly translate them.  I think a cast, plus compile time
> >> > type checking ala BUILD_BUG_ON is the way to go.
> >>
> >> I'm not sure I follow, which would I cast?
> >>
> >> BUILD_BUG_ON(sizeof(ibv_query_device) == sizeof(ib_uverbs_cmd_hdr) +
> >>  sizeof(ib_uverbs_query_device))
> >>
> >> ?
> >>
> >> > > I'm thinking the best way forward might be to use a script and
> >> > > transform userspace into:
> >> > >
> >> > > struct ibv_query_device {
> >> > >   struct ib_uverbs_cmd_hdr hdr;
> >> > >   struct ib_uverbs_query_device cmd;
> >> > > };
> >> >
> >> > That would break the users of the interface.
> >>
> >> Sorry, I mean doing this inside rdma-plumbing. Since the change is ABI
> >> identical the modified libibverbs would still be binary compatible
> >> with all providers but not source compatible. Since all kernel
> >> supported providers are in rdma-plumbing we can add the '.cmd.' at the
> >> same time.
> >>
> >> The kernel uapi header would stay the same.
> >>
> >> > However automatically generating the user ABI from the kernel one
> >> > might still be a good idea in the long run.
> >>
> >> My preference would be to try and use the kernel headers directly.
> >
> > I thought the same, especially after realizing that they are almost
> > copy/paste from the vendor *-abi.h files.
> >
> >>
> >> Jason


signature.asc
Description: PGP signature


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-10-05 Thread Leon Romanovsky
On Wed, Oct 05, 2016 at 08:37:35AM +0200, Christoph Hellwig wrote:
> FYI, the patches look fine to me:
>
> Acked-by: Christoph Hellwig 
>
> but we're past the merge window for 4.9 now unfortunately.

IMHO, it still can make it.

Thanks


signature.asc
Description: PGP signature


Re: [PATCHv12 0/3] rdmacg: IB/core: rdma controller support

2016-10-05 Thread Leon Romanovsky
On Wed, Aug 31, 2016 at 02:07:24PM +0530, Parav Pandit wrote:
> rdmacg: IB/core: rdma controller support
>
> Patch is generated and tested against below Doug's linux-rdma
> git tree.
>
> URL: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git
> Branch: master
>
> Patchset is also compiled and tested against below Tejun's cgroup tree
> using cgroup v2 mode.
> URL: git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
> Branch: master
>
> Overview:
> Currently user space applications can easily take away all the rdma
> device specific resources such as AH, CQ, QP, MR etc. Due to which other
> applications in other cgroup or kernel space ULPs may not even get chance
> to allocate any rdma resources. This results into service unavailibility.
>
> RDMA cgroup addresses this issue by allowing resource accounting,
> limit enforcement on per cgroup, per rdma device basis.
>
> RDMA uverbs layer will enforce limits on well defined RDMA verb
> resources without any HCA vendor device driver involvement.
>
> RDMA uverbs layer will not do limit enforcement of HCA hw vendor
> specific resources. Instead rdma cgroup provides set of APIs
> through which vendor specific drivers can do resource accounting
> by making use of rdma cgroup.

Hi Parav,
I want to propose an extension to the RDMA cgroup which can be done as
follow-up patches.

Let's add new global type, which will control whole HCA (for example in 
percentages). It will
allow natively define new objects without need to introduce them to the user.

This HCA share will be overwritten by specific UVERBS types which you
already defined.

What do you think?

Except this proposal,
Reviewed-by: Leon Romanovsky 

Thanks.


signature.asc
Description: PGP signature


Re: [PATCHv14 0/3] rdmacg: IB/core: rdma controller support

2017-01-10 Thread Leon Romanovsky
On Tue, Jan 10, 2017 at 11:15:05AM -0500, Tejun Heo wrote:
> On Tue, Jan 10, 2017 at 12:02:12AM +, Parav Pandit wrote:
> > Patchset is compiled and tested against below Tejun's cgroup tree
> > using cgroup v1 and v2 mode.
> > URL: git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
> > Branch: for-4.11
>
> Applied 1-3 to cgroup/for-4.11-rdmacg.

Thanks Tejun and Parav.

>
> Thanks.
>
> --
> tejun


signature.asc
Description: PGP signature


Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-12 Thread Leon Romanovsky
On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> From: Kenneth Lee 
>
> WarpDrive is a general accelerator framework for the user application to
> access the hardware without going through the kernel in data path.
>
> The kernel component to provide kernel facility to driver for expose the
> user interface is called uacce. It a short name for
> "Unified/User-space-access-intended Accelerator Framework".
>
> This patch add document to explain how it works.

+ RDMA and netdev folks

Sorry, to be late in the game, I don't see other patches, but from
the description below it seems like you are reinventing RDMA verbs
model. I have hard time to see the differences in the proposed
framework to already implemented in drivers/infiniband/* for the kernel
space and for the https://github.com/linux-rdma/rdma-core/ for the user
space parts.

Hard NAK from RDMA side.

Thanks

>
> Signed-off-by: Kenneth Lee 
> ---
>  Documentation/warpdrive/warpdrive.rst   | 260 +++
>  Documentation/warpdrive/wd-arch.svg | 764 
>  Documentation/warpdrive/wd.svg  | 526 ++
>  Documentation/warpdrive/wd_q_addr_space.svg | 359 +
>  4 files changed, 1909 insertions(+)
>  create mode 100644 Documentation/warpdrive/warpdrive.rst
>  create mode 100644 Documentation/warpdrive/wd-arch.svg
>  create mode 100644 Documentation/warpdrive/wd.svg
>  create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg
>
> diff --git a/Documentation/warpdrive/warpdrive.rst 
> b/Documentation/warpdrive/warpdrive.rst
> new file mode 100644
> index ..ef84d3a2d462
> --- /dev/null
> +++ b/Documentation/warpdrive/warpdrive.rst
> @@ -0,0 +1,260 @@
> +Introduction of WarpDrive
> +=
> +
> +*WarpDrive* is a general accelerator framework for the user application to
> +access the hardware without going through the kernel in data path.
> +
> +It can be used as the quick channel for accelerators, network adaptors or
> +other hardware for application in user space.
> +
> +This may make some implementation simpler.  E.g.  you can reuse most of the
> +*netdev* driver in kernel and just share some ring buffer to the user space
> +driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator 
> with
> +the *netdev* in the user space as a https reversed proxy, etc.
> +
> +*WarpDrive* takes the hardware accelerator as a heterogeneous processor which
> +can share particular load from the CPU:
> +
> +.. image:: wd.svg
> +:alt: WarpDrive Concept
> +
> +The virtual concept, queue, is used to manage the requests sent to the
> +accelerator. The application send requests to the queue by writing to some
> +particular address, while the hardware takes the requests directly from the
> +address and send feedback accordingly.
> +
> +The format of the queue may differ from hardware to hardware. But the
> +application need not to make any system call for the communication.
> +
> +*WarpDrive* tries to create a shared virtual address space for all involved
> +accelerators. Within this space, the requests sent to queue can refer to any
> +virtual address, which will be valid to the application and all involved
> +accelerators.
> +
> +The name *WarpDrive* is simply a cool and general name meaning the framework
> +makes the application faster. It includes general user library, kernel
> +management module and drivers for the hardware. In kernel, the management
> +module is called *uacce*, meaning "Unified/User-space-access-intended
> +Accelerator Framework".
> +
> +
> +How does it work
> +
> +
> +*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
> +
> +*Uacce* creates a chrdev for the device registered to it. A "queue" will be
> +created when the chrdev is opened. The application access the queue by mmap
> +different address region of the queue file.
> +
> +The following figure demonstrated the queue file address space:
> +
> +.. image:: wd_q_addr_space.svg
> +:alt: WarpDrive Queue Address Space
> +
> +The first region of the space, device region, is used for the application to
> +write request or read answer to or from the hardware.
> +
> +Normally, there can be three types of device regions mmio and memory regions.
> +It is recommended to use common memory for request/answer descriptors and use
> +the mmio space for device notification, such as doorbell. But of course, this
> +is all up to the interface designer.
> +
> +There can be two types of device memory regions, kernel-only and user-shared.
> +This will be explained in the "kernel APIs" section.
> +
> +The Static Share Virtual Memory region is necessary only when the device 
> IOMMU
> +does not support "Share Virtual Memory". This will be explained after the
> +*IOMMU* idea.
> +
> +
> +Architecture
> +
> +
> +The full *WarpDrive* architecture is represented in the following class
> +diagram:
> +
> +.. image:: wd-arch.svg
> +:alt: WarpDrive Architecture
> +
> 

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-14 Thread Leon Romanovsky
On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
>
> 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > From: Kenneth Lee 
> > >
> > > WarpDrive is a general accelerator framework for the user application to
> > > access the hardware without going through the kernel in data path.
> > >
> > > The kernel component to provide kernel facility to driver for expose the
> > > user interface is called uacce. It a short name for
> > > "Unified/User-space-access-intended Accelerator Framework".
> > >
> > > This patch add document to explain how it works.
> > + RDMA and netdev folks
> >
> > Sorry, to be late in the game, I don't see other patches, but from
> > the description below it seems like you are reinventing RDMA verbs
> > model. I have hard time to see the differences in the proposed
> > framework to already implemented in drivers/infiniband/* for the kernel
> > space and for the https://github.com/linux-rdma/rdma-core/ for the user
> > space parts.
>
> Thanks Leon,
>
> Yes, we tried to solve similar problem in RDMA. We also learned a lot from
> the exist code of RDMA. But we we have to make a new one because we cannot
> register accelerators such as AI operation, encryption or compression to the
> RDMA framework:)

Assuming that you did everything right and still failed to use RDMA
framework, you was supposed to fix it and not to reinvent new exactly
same one. It is how we develop kernel, by reusing existing code.

>
> Another problem we tried to address is the way to pin the memory for dma
> operation. The RDMA way to pin the memory cannot avoid the page lost due to
> copy-on-write operation during the memory is used by the device. This may
> not be important to RDMA library. But it is important to accelerator.

Such support exists in drivers/infiniband/ from late 2014 and
it is called ODP (on demand paging).

>
> Hope this can help the understanding.

Yes, it helped me a lot.
Now, I'm more than before convinced that this whole patchset shouldn't
exist in the first place.

To be clear, NAK.

Thanks

>
> Cheers
>
> >
> > Hard NAK from RDMA side.
> >
> > Thanks
> >
> > > Signed-off-by: Kenneth Lee 
> > > ---
> > >   Documentation/warpdrive/warpdrive.rst   | 260 +++
> > >   Documentation/warpdrive/wd-arch.svg | 764 
> > >   Documentation/warpdrive/wd.svg  | 526 ++
> > >   Documentation/warpdrive/wd_q_addr_space.svg | 359 +
> > >   4 files changed, 1909 insertions(+)
> > >   create mode 100644 Documentation/warpdrive/warpdrive.rst
> > >   create mode 100644 Documentation/warpdrive/wd-arch.svg
> > >   create mode 100644 Documentation/warpdrive/wd.svg
> > >   create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg
> > >
> > > diff --git a/Documentation/warpdrive/warpdrive.rst 
> > > b/Documentation/warpdrive/warpdrive.rst
> > > new file mode 100644
> > > index ..ef84d3a2d462
> > > --- /dev/null
> > > +++ b/Documentation/warpdrive/warpdrive.rst
> > > @@ -0,0 +1,260 @@
> > > +Introduction of WarpDrive
> > > +=
> > > +
> > > +*WarpDrive* is a general accelerator framework for the user application 
> > > to
> > > +access the hardware without going through the kernel in data path.
> > > +
> > > +It can be used as the quick channel for accelerators, network adaptors or
> > > +other hardware for application in user space.
> > > +
> > > +This may make some implementation simpler.  E.g.  you can reuse most of 
> > > the
> > > +*netdev* driver in kernel and just share some ring buffer to the user 
> > > space
> > > +driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA 
> > > accelerator with
> > > +the *netdev* in the user space as a https reversed proxy, etc.
> > > +
> > > +*WarpDrive* takes the hardware accelerator as a heterogeneous processor 
> > > which
> > > +can share particular load from the CPU:
> > > +
> > > +.. image:: wd.svg
> > > +:alt: WarpDrive Concept
> > > +
> > > +The virtual concept, queue, is used to manage the requests sent to the
> > > +accelerator. The application send requests to the queue by writing to 
> > > some
> > > +particular address, while the hardware takes the requests directly from 
> > > the
> > &g

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-15 Thread Leon Romanovsky
On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > From: Leon Romanovsky 
> > To: Kenneth Lee 
> > CC: Tim Sell , linux-doc@vger.kernel.org,
> >  Alexander Shishkin , Zaibo Xu
> >  , zhangfei@foxmail.com, linux...@huawei.com,
> >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> >  , Gavin Schenk , RDMA mailing
> >  list , Zhou Wang ,
> >  Jason Gunthorpe , Doug Ledford , Uwe
> >  Kleine-König , David Kershner
> >  , Johan Hovold , Cyrille
> >  Pitchen , Sagar Dharia
> >  , Jens Axboe ,
> >  guodong...@linaro.org, linux-netdev , Randy Dunlap
> >  , linux-ker...@vger.kernel.org, Vinod Koul
> >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> >  , Sanyog Kale , Kenneth Lee
> >  , "David S. Miller" ,
> >  linux-accelerat...@lists.ozlabs.org
> > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > User-Agent: Mutt/1.10.1 (2018-07-13)
> > Message-ID: <20181114160017.gi3...@mtr-leonro.mtl.com>
> >
> > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > >
> > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > From: Kenneth Lee 
> > > > >
> > > > > WarpDrive is a general accelerator framework for the user application 
> > > > > to
> > > > > access the hardware without going through the kernel in data path.
> > > > >
> > > > > The kernel component to provide kernel facility to driver for expose 
> > > > > the
> > > > > user interface is called uacce. It a short name for
> > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > >
> > > > > This patch add document to explain how it works.
> > > > + RDMA and netdev folks
> > > >
> > > > Sorry, to be late in the game, I don't see other patches, but from
> > > > the description below it seems like you are reinventing RDMA verbs
> > > > model. I have hard time to see the differences in the proposed
> > > > framework to already implemented in drivers/infiniband/* for the kernel
> > > > space and for the https://github.com/linux-rdma/rdma-core/ for the user
> > > > space parts.
> > >
> > > Thanks Leon,
> > >
> > > Yes, we tried to solve similar problem in RDMA. We also learned a lot from
> > > the exist code of RDMA. But we we have to make a new one because we cannot
> > > register accelerators such as AI operation, encryption or compression to 
> > > the
> > > RDMA framework:)
> >
> > Assuming that you did everything right and still failed to use RDMA
> > framework, you was supposed to fix it and not to reinvent new exactly
> > same one. It is how we develop kernel, by reusing existing code.
>
> Yes, but we don't force other system such as NIC or GPU into RDMA, do we?

You don't introduce new NIC or GPU, but proposing another interface to
directly access HW memory and bypass kernel for the data path. This is
whole idea of RDMA and this is why it is already present in the kernel.

Various hardware devices are supported in our stack allow a ton of crazy
stuff, including GPUs interconnections and NIC functionalities.

>
> I assume you would not agree to register a zip accelerator to infiniband? :)

"infiniband" name in the "drivers/infiniband/" is legacy one and the
current code supports IB, RoCE, iWARP and OmniPath as a transport layers.
For a lone time, we wanted to rename that folder to be "drivers/rdma",
but didn't find enough brave men/women to do it, due to backport mess
for such move.

The addition of zip accelerator to RDMA is possible and depends on how
you will model such new functionality - new driver, or maybe new ULP.

>
> Further, I don't think it is wise to break an exist system (RDMA) to fulfill a
> totally new scenario. The better choice is to let them run in parallel for 
> some
> time and try to merge them accordingly.

Awesome, so please run your code out-of-tree for now and once you are ready
for submission let's try to merge it.

>
> >
> > >
> > > Another problem we tried to address is the way to pin the memory for dma
> > > operation. The RDMA way to pin the memory cannot avoid the page lost due 
> > > to
> > > copy-on-write operation during the memory is used by the device. T

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Leon Romanovsky
On Mon, Nov 19, 2018 at 05:19:10PM +0800, Kenneth Lee wrote:
> On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > Date: Mon, 19 Nov 2018 17:14:05 +0800
> > From: Kenneth Lee 
> > To: Leon Romanovsky 
> > CC: Tim Sell , linux-doc@vger.kernel.org,
> >  Alexander Shishkin , Zaibo Xu
> >  , zhangfei@foxmail.com, linux...@huawei.com,
> >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> >  , Gavin Schenk , RDMA mailing
> >  list , Vinod Koul , Jason
> >  Gunthorpe , Doug Ledford , Uwe
> >  Kleine-König , David Kershner
> >  , Kenneth Lee , Johan
> >  Hovold , Cyrille Pitchen
> >  , Sagar Dharia
> >  , Jens Axboe ,
> >  guodong...@linaro.org, linux-netdev , Randy Dunlap
> >  , linux-ker...@vger.kernel.org, Zhou Wang
> >  , linux-cry...@vger.kernel.org, Philippe
> >  Ombredanne , Sanyog Kale ,
> >  "David S. Miller" ,
> >  linux-accelerat...@lists.ozlabs.org
> > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> > Message-ID: <20181119091405.GE157308@Turing-Arch-b>
> >
> > On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > > Date: Thu, 15 Nov 2018 16:54:55 +0200
> > > From: Leon Romanovsky 
> > > To: Kenneth Lee 
> > > CC: Kenneth Lee , Tim Sell ,
> > >  linux-doc@vger.kernel.org, Alexander Shishkin
> > >  , Zaibo Xu ,
> > >  zhangfei@foxmail.com, linux...@huawei.com, haojian.zhu...@linaro.org,
> > >  Christoph Lameter , Hao Fang , 
> > > Gavin
> > >  Schenk , RDMA mailing list
> > >  , Zhou Wang , Jason
> > >  Gunthorpe , Doug Ledford , Uwe
> > >  Kleine-König , David Kershner
> > >  , Johan Hovold , Cyrille
> > >  Pitchen , Sagar Dharia
> > >  , Jens Axboe ,
> > >  guodong...@linaro.org, linux-netdev , Randy 
> > > Dunlap
> > >  , linux-ker...@vger.kernel.org, Vinod Koul
> > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > >  , Sanyog Kale , "David S.
> > >  Miller" , linux-accelerat...@lists.ozlabs.org
> > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > > User-Agent: Mutt/1.10.1 (2018-07-13)
> > > Message-ID: <20181115145455.gn3...@mtr-leonro.mtl.com>
> > >
> > > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > > > > From: Leon Romanovsky 
> > > > > To: Kenneth Lee 
> > > > > CC: Tim Sell , linux-doc@vger.kernel.org,
> > > > >  Alexander Shishkin , Zaibo Xu
> > > > >  , zhangfei@foxmail.com, linux...@huawei.com,
> > > > >  haojian.zhu...@linaro.org, Christoph Lameter , Hao 
> > > > > Fang
> > > > >  , Gavin Schenk , RDMA 
> > > > > mailing
> > > > >  list , Zhou Wang 
> > > > > ,
> > > > >  Jason Gunthorpe , Doug Ledford , 
> > > > > Uwe
> > > > >  Kleine-König , David Kershner
> > > > >  , Johan Hovold , Cyrille
> > > > >  Pitchen , Sagar Dharia
> > > > >  , Jens Axboe ,
> > > > >  guodong...@linaro.org, linux-netdev , Randy 
> > > > > Dunlap
> > > > >  , linux-ker...@vger.kernel.org, Vinod Koul
> > > > >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> > > > >  , Sanyog Kale , 
> > > > > Kenneth Lee
> > > > >  , "David S. Miller" ,
> > > > >  linux-accelerat...@lists.ozlabs.org
> > > > > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for 
> > > > > WarpDrive/uacce
> > > > > User-Agent: Mutt/1.10.1 (2018-07-13)
> > > > > Message-ID: <20181114160017.gi3...@mtr-leonro.mtl.com>
> > > > >
> > > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > > >
> > > > > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道:
> > > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > > From: Kenneth Lee 
> > > > > > > >
> > > > > > > > WarpDrive is a general accelerator framework for the user 
> > > > > > > > application to
> > > > > > > > access the hardware without going through the ker

Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Leon Romanovsky
On Mon, Nov 19, 2018 at 01:42:16PM -0500, Jerome Glisse wrote:
> On Mon, Nov 19, 2018 at 11:27:52AM -0700, Jason Gunthorpe wrote:
> > On Mon, Nov 19, 2018 at 11:48:54AM -0500, Jerome Glisse wrote:
> >
> > > Just to comment on this, any infiniband driver which use umem and do
> > > not have ODP (here ODP for me means listening to mmu notifier so all
> > > infiniband driver except mlx5) will be affected by same issue AFAICT.
> > >
> > > AFAICT there is no special thing happening after fork() inside any of
> > > those driver. So if parent create a umem mr before fork() and program
> > > hardware with it then after fork() the parent might start using new
> > > page for the umem range while the old memory is use by the child. The
> > > reverse is also true (parent using old memory and child new memory)
> > > bottom line you can not predict which memory the child or the parent
> > > will use for the range after fork().
> > >
> > > So no matter what you consider the child or the parent, what the hw
> > > will use for the mr is unlikely to match what the CPU use for the
> > > same virtual address. In other word:
> > >
> > > Before fork:
> > > CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
> > > HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
> > >
> > > Case 1:
> > > CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
> > > CPU child:  virtual addr ptr1 -> physical address = 0xDEAD
> > > HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
> > >
> > > Case 2:
> > > CPU parent: virtual addr ptr1 -> physical address = 0xBEEF
> > > CPU child:  virtual addr ptr1 -> physical address = 0xCAFE
> > > HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
> >
> > IIRC this is solved in IB by automatically calling
> > madvise(MADV_DONTFORK) before creating the MR.
> >
> > MADV_DONTFORK
> >   .. This is useful to prevent copy-on-write semantics from changing the
> >   physical location of a page if the parent writes to it after a
> >   fork(2) ..
>
> This would work around the issue but this is not transparent ie
> range marked with DONTFORK no longer behave as expected from the
> application point of view.
>
> Also it relies on userspace doing the right thing (which is not
> something i usualy trust :)).

The good thing that we didn't see anyone who succeeded to run
IB stack without our user space, which does right thing under
the hood :).

>
> Cheers,
> Jérôme


signature.asc
Description: PGP signature


Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

2018-11-19 Thread Leon Romanovsky
On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > Date: Mon, 19 Nov 2018 11:49:54 -0700
> > From: Jason Gunthorpe 
> > To: Kenneth Lee 
> > CC: Leon Romanovsky , Kenneth Lee ,
> >  Tim Sell , linux-doc@vger.kernel.org, Alexander
> >  Shishkin , Zaibo Xu
> >  , zhangfei@foxmail.com, linux...@huawei.com,
> >  haojian.zhu...@linaro.org, Christoph Lameter , Hao Fang
> >  , Gavin Schenk , RDMA mailing
> >  list , Zhou Wang ,
> >  Doug Ledford , Uwe Kleine-König
> >  , David Kershner
> >  , Johan Hovold , Cyrille
> >  Pitchen , Sagar Dharia
> >  , Jens Axboe ,
> >  guodong...@linaro.org, linux-netdev , Randy Dunlap
> >  , linux-ker...@vger.kernel.org, Vinod Koul
> >  , linux-cry...@vger.kernel.org, Philippe Ombredanne
> >  , Sanyog Kale , "David S.
> >  Miller" , linux-accelerat...@lists.ozlabs.org
> > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> > User-Agent: Mutt/1.9.4 (2018-02-28)
> > Message-ID: <20181119184954.gb4...@ziepe.ca>
> >
> > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> >
> > > If the hardware cannot share page table with the CPU, we then need to have
> > > some way to change the device page table. This is what happen in ODP. It
> > > invalidates the page table in device upon mmu_notifier call back. But 
> > > this cannot
> > > solve the COW problem: if the user process A share a page P with device, 
> > > and A
> > > forks a new process B, and it continue to write to the page. By COW, the
> > > process B will keep the page P, while A will get a new page P'. But you 
> > > have
> > > no way to let the device know it should use P' rather than P.
> >
> > Is this true? I thought mmu_notifiers covered all these cases.
> >
> > The mm_notifier for A should fire if B causes the physical address of
> > A's pages to change via COW.
> >
> > And this causes the device page tables to re-synchronize.
>
> I don't see such code. The current do_cow_fault() implemenation has nothing to
> do with mm_notifer.
>
> >
> > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it 
> > > support
> > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you 
> > > don't need
> > > to write any code for that. Because it has been done by IOMMU framework. 
> > > If it
> >
> > Looks like the IOMMU code uses mmu_notifier, so it is identical to
> > IB's ODP. The only difference is that IB tends to have the IOMMU page
> > table in the device, not in the CPU.
> >
> > The only case I know if that is different is the new-fangled CAPI
> > stuff where the IOMMU can directly use the CPU's page table and the
> > IOMMU page table (in device or CPU) is eliminated.
> >
>
> Yes. We are not focusing on the current implementation. As mentioned in the
> cover letter. We are expecting Jean Philips' SVA patch:
> git://linux-arm.org/linux-jpb.
>
> > Anyhow, I don't think a single instance of hardware should justify an
> > entire new subsystem. Subsystems are hard to make and without multiple
> > hardware examples there is no way to expect that it would cover any
> > future use cases.
>
> Yes. That's our first expectation. We can keep it with our driver. But because
> there is no user driver support for any accelerator in mainline kernel. Even 
> the
> well known QuickAssit has to be maintained out of tree. So we try to see if
> people is interested in working together to solve the problem.
>
> >
> > If all your driver needs is to mmap some PCI bar space, route
> > interrupts and do DMA mapping then mediated VFIO is probably a good
> > choice.
>
> Yes. That is what is done in our RFCv1/v2. But we accepted Jerome's opinion 
> and
> try not to add complexity to the mm subsystem.
>
> >
> > If it needs to do a bunch of other stuff, not related to PCI bar
> > space, interrupts and DMA mapping (ie special code for compression,
> > crypto, AI, whatever) then you should probably do what Jerome said and
> > make a drivers/char/hisillicon_foo_bar.c that exposes just what your
> > hardware does.
>
> Yes. If no other accelerator driver writer is interested. That is the
> expectation:)
>
> But we really like to have a public solution here. Consider this scenario:
>
> You create some connections (queues) to NIC, RSA, and AI engine. Then you got
> data direct from t

Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-08-31 Thread Leon Romanovsky
On Wed, Aug 31, 2016 at 02:07:25PM +0530, Parav Pandit wrote:
> Added rdma cgroup controller that does accounting, limit enforcement
> on rdma/IB verbs and hw resources.
>
> Added rdma cgroup header file which defines its APIs to perform
> charing/uncharing functionality. It also defined APIs for RDMA/IB
> stack for device registration. Devices which are registered will
> participate in controller functions of accounting and limit
> enforcements. It define rdmacg_device structure to bind IB stack
> and RDMA cgroup controller.
>
> RDMA resources are tracked using resource pool. Resource pool is per
> device, per cgroup entity which allows setting up accounting limits
> on per device basis.
>
> Currently resources are defined by the RDMA cgroup.
>
> Resource pool is created/destroyed dynamically whenever
> charging/uncharging occurs respectively and whenever user
> configuration is done. Its a tradeoff of memory vs little more code
> space that creates resource pool object whenever necessary, instead of
> creating them during cgroup creation and device registration time.
>
> Signed-off-by: Parav Pandit 
> ---

<...>

> +
> +static struct rdmacg_resource_pool *
> +get_cg_rpool_locked(struct rdma_cgroup *cg, struct rdmacg_device *device)
> +{
> + struct rdmacg_resource_pool *rpool;
> +
> + rpool = find_cg_rpool_locked(cg, device);
> + if (rpool)
> + return rpool;
> +
> + rpool = kzalloc(sizeof(*rpool), GFP_KERNEL);
> + if (!rpool)
> + return ERR_PTR(-ENOMEM);
> +
> + rpool->device = device;
> + set_all_resource_max_limit(rpool);
> +
> + INIT_LIST_HEAD(&rpool->cg_node);
> + INIT_LIST_HEAD(&rpool->dev_node);
> + list_add_tail(&rpool->cg_node, &cg->rpools);
> + list_add_tail(&rpool->dev_node, &device->rpools);
> + return rpool;
> +}

<...>

> + for (p = cg; p; p = parent_rdmacg(p)) {
> + rpool = get_cg_rpool_locked(p, device);
> + if (IS_ERR_OR_NULL(rpool)) {

get_cg_rpool_locked always returns !NULL (error, or pointer)

> + ret = PTR_ERR(rpool);
> + goto err;

I didn't review the whole series yet.


signature.asc
Description: PGP signature


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-09-07 Thread Leon Romanovsky
On Wed, Sep 07, 2016 at 08:37:23PM +0530, Parav Pandit wrote:
> Hi Leon,
>
> >> Signed-off-by: Parav Pandit 
> >> +static struct rdmacg_resource_pool *
> >> +get_cg_rpool_locked(struct rdma_cgroup *cg, struct rdmacg_device *device)
> >> +{
> >> + struct rdmacg_resource_pool *rpool;
> >> +
> >> + rpool = find_cg_rpool_locked(cg, device);
> >> + if (rpool)
> >> + return rpool;
> >> +
> >> + rpool = kzalloc(sizeof(*rpool), GFP_KERNEL);
> >> + if (!rpool)
> >> + return ERR_PTR(-ENOMEM);
> >> +
> >> + rpool->device = device;
> >> + set_all_resource_max_limit(rpool);
> >> +
> >> + INIT_LIST_HEAD(&rpool->cg_node);
> >> + INIT_LIST_HEAD(&rpool->dev_node);
> >> + list_add_tail(&rpool->cg_node, &cg->rpools);
> >> + list_add_tail(&rpool->dev_node, &device->rpools);
> >> + return rpool;
> >> +}
> >
> > <...>
> >
> >> + for (p = cg; p; p = parent_rdmacg(p)) {
> >> + rpool = get_cg_rpool_locked(p, device);
> >> + if (IS_ERR_OR_NULL(rpool)) {
> >
> > get_cg_rpool_locked always returns !NULL (error, or pointer)
>
> Can this change go as incremental change after this patch, since this
> patch is close to completion?
> Or I need to revise v13?

Sure, it is cleanup. It is not worth of respinning.

>
> >
> >> + ret = PTR_ERR(rpool);
> >> + goto err;
> >
> > I didn't review the whole series yet.
>
> Did you get a chance to review the series?

We need to decide on fundamental question before reviewing it, which is
"how to fit rdmacg to new ABI model".

Thanks


signature.asc
Description: PGP signature


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-09-11 Thread Leon Romanovsky
On Sun, Sep 11, 2016 at 03:34:21PM +0200, Christoph Hellwig wrote:
> On Sat, Sep 10, 2016 at 11:01:51AM -0600, Jason Gunthorpe wrote:
> > Sadly, it isn't a step backwards, it is status quo - at least as far
> > as the uapi is concerned.
>
> Sort of, see below:
>
> > struct mlx5_create_cq {
> > struct ibv_create_cqibv_cmd;
> > __u64   buf_addr;
> > __u64   db_addr;
> > __u32   cqe_size;
> > };
> >
> > struct iwch_create_cq {
> > struct ibv_create_cq ibv_cmd;
> > uint64_t user_rptr_addr;
> > };
> >
> > Love to hear ideas on a way forward that doesn't involve rewriting
> > everything :(
>
> We stil always have the common structure first.  And at least for
> cgroups supports that's what matters.
>
> Re the actual structures - we'll really need to make sure we
>
>  a) expose proper userspace abi headers in the kernel for all code
> in the RDMA subsystem
>  b) actually use that in the userspace components
>
> I've posted some initial work toward a) a while ago, and once we
> agree on adopting your common repo I'd really like to start through
> with that work.  I think it's a pre-requisite for any major new
> userspace ABI work.

I started to work on it over weekend and it is worth do not do same work twice.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


signature.asc
Description: PGP signature


Re: [PATCHv12 1/3] rdmacg: Added rdma cgroup controller

2016-09-11 Thread Leon Romanovsky
On Sun, Sep 11, 2016 at 11:52:35AM -0600, Jason Gunthorpe wrote:
> On Sun, Sep 11, 2016 at 07:24:45PM +0200, Christoph Hellwig wrote:
> > > > > I've posted some initial work toward a) a while ago, and once we
> > >
> > > Did it get merged? Do you have a pointer?
> >
> > http://www.spinics.net/lists/linux-rdma/msg31958.html
>
> Right, I remember that. Certainly the right direction
>
> > > However, everything under verbs is not straightforward. The files in
> > > userspace are not copies...
> > >
> > > user:
> > >
> > > struct ibv_query_device {
> > >__u32 command;
> > >__u16 in_words;
> > >__u16 out_words;
> > >__u64 response;
> > >__u64 driver_data[0];
> > > };
> > >
> > > kernel:
> > >
> > > struct ib_uverbs_query_device {
> > > __u64 response;
> > > __u64 driver_data[0];
> > > };
> >
> > We'll obviously need different strutures for the libibvers API
> > and the kernel interface in this case, and we'll need to figure out
> > how to properly translate them.  I think a cast, plus compile time
> > type checking ala BUILD_BUG_ON is the way to go.
>
> I'm not sure I follow, which would I cast?
>
> BUILD_BUG_ON(sizeof(ibv_query_device) == sizeof(ib_uverbs_cmd_hdr) +
>  sizeof(ib_uverbs_query_device))
>
> ?
>
> > > I'm thinking the best way forward might be to use a script and
> > > transform userspace into:
> > >
> > > struct ibv_query_device {
> > >   struct ib_uverbs_cmd_hdr hdr;
> > >   struct ib_uverbs_query_device cmd;
> > > };
> >
> > That would break the users of the interface.
>
> Sorry, I mean doing this inside rdma-plumbing. Since the change is ABI
> identical the modified libibverbs would still be binary compatible
> with all providers but not source compatible. Since all kernel
> supported providers are in rdma-plumbing we can add the '.cmd.' at the
> same time.
>
> The kernel uapi header would stay the same.
>
> > However automatically generating the user ABI from the kernel one
> > might still be a good idea in the long run.
>
> My preference would be to try and use the kernel headers directly.

I thought the same, especially after realizing that they are almost
copy/paste from the vendor *-abi.h files.

>
> Jason


signature.asc
Description: PGP signature


Re: [PATCH 2/2] Documentation: best practices for using Link trailers

2024-06-19 Thread Leon Romanovsky
On Tue, Jun 18, 2024 at 12:42:11PM -0400, Konstantin Ryabitsev wrote:
> Based on multiple conversations, most recently on the ksummit mailing
> list [1], add some best practices for using the Link trailer, such as:
> 
> - how to use markdown-like bracketed numbers in the commit message to
> indicate the corresponding link
> - when to use lore.kernel.org vs patch.msgid.link domains
> 
> Cc: ksum...@lists.linux.dev
> Link: 
> https://lore.kernel.org/20240617-arboreal-industrious-hedgehog-5b84ae@meerkat 
> # [1]
> Signed-off-by: Konstantin Ryabitsev 
> ---
>  Documentation/process/maintainer-tip.rst | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/process/maintainer-tip.rst 
> b/Documentation/process/maintainer-tip.rst
> index 64739968afa6..57ffa553c21e 100644
> --- a/Documentation/process/maintainer-tip.rst
> +++ b/Documentation/process/maintainer-tip.rst
> @@ -375,14 +375,26 @@ following tag ordering scheme:
> For referring to an email on LKML or other kernel mailing lists,
> please use the lore.kernel.org redirector URL::
>  
> - https://lore.kernel.org/r/email-message@id
> + Link: https://lore.kernel.org/email-message@id
>  
> -   The kernel.org redirector is considered a stable URL, unlike other email
> -   archives.
> +   This URL should be used when referring to relevant mailing list
> +   resources, related patch sets, or other notable discussion threads.
> +   A convenient way to associate Link trailers with the accompanying
> +   message is to use markdown-like bracketed notation, for example::
>  
> -   Maintainers will add a Link tag referencing the email of the patch
> -   submission when they apply a patch to the tip tree. This tag is useful
> -   for later reference and is also used for commit notifications.
> + A similar approach was attempted before as part of a different
> + effort [1], but the initial implementation caused too many
> + regressions [2], so it was backed out and reimplemented.
> +
> + Link: https://lore.kernel.org/some-msgid@here # [1]
> + Link: https://bugzilla.example.org/bug/12345  # [2]
> +
> +   When using the ``Link:`` trailer to indicate the provenance of the
> +   patch, you should use the dedicated ``patch.msgid.link`` domain. This
> +   makes it possible for automated tooling to establish which link leads
> +   to the original patch submission. For example::
> +
> + Link: https://patch.msgid.link/patch-source-msgid@here

Default b4.linkmask points to https://msgid.link/ and not to 
https://patch.msgid.link/

https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/.b4-config#n3
https://git.kernel.org/pub/scm/utils/b4/b4.git/tree/docs/config.rst#n46

It will be good to update the default value in b4 to point to the correct 
domain.

Thanks