Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Jason Gunthorpe
On Mon, Jun 21, 2021 at 10:24:16PM +0300, Oded Gabbay wrote:

> Another thing I want to emphasize is that we are doing p2p only
> through the export/import of the FD. We do *not* allow the user to
> mmap the dma-buf as we do not support direct I/O. So there is no access
> to these pages from userspace.

Arguably mmapping the memory is a better choice, and is the direction
that Logan's series goes in. Here the use of DMABUF was specifically
designed to allow hitless revocation of the memory, which this isn't
even using.
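
For context, the revocation mechanism referred to here is the dynamic-importer
side of the dma-buf API: an importer registers a dma_buf_attach_ops with a
move_notify() callback, and the exporter can then pull the mapping back at any
time. A minimal sketch of that hook, assuming a dynamic importer (the
example_ names are made up for illustration):

#include <linux/dma-buf.h>

/*
 * Exporter wants the memory back: called with the reservation lock held.
 * A real importer would stop DMA here and re-map lazily on next use.
 */
static void example_move_notify(struct dma_buf_attachment *attach)
{
}

static const struct dma_buf_attach_ops example_attach_ops = {
        .allow_peer2peer = true,  /* needed for device-memory (P2P) exporters */
        .move_notify     = example_move_notify,
};

/* attach = dma_buf_dynamic_attach(dmabuf, dev, &example_attach_ops, NULL); */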

So you are taking the hit of very limited hardware support and reduced
performance just to squeeze into DMABUF..

Jason


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Oded Gabbay
On Mon, Jun 21, 2021 at 9:27 PM Daniel Vetter  wrote:
>
> On Mon, Jun 21, 2021 at 7:55 PM Jason Gunthorpe  wrote:
> > On Mon, Jun 21, 2021 at 07:26:14PM +0300, Oded Gabbay wrote:
> > > On Mon, Jun 21, 2021 at 5:12 PM Jason Gunthorpe  wrote:
> > > >
> > > > On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
> > > > > On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
> > > >
> > > > > > Also I'm wondering which is the other driver that we share buffers
> > > > > > with. The gaudi stuff doesn't have real struct pages as backing
> > > > > > storage, it only fills out the dma_addr_t. That tends to blow up with
> > > > > > other drivers, and the only place where this is guaranteed to work is
> > > > > > if you have a dynamic importer which sets the allow_peer2peer flag.
> > > > > > Adding maintainers from other subsystems who might want to chime in
> > > > > > here. So even aside of the big question as-is this is broken.
> > > > >
> > > > > From what I can tell this driver is sending the buffers to other
> > > > > instances of the same hardware,
> > > >
> > > > A dmabuf is consumed by something else in the kernel calling
> > > > dma_buf_map_attachment() on the FD.
> > > >
> > > > What is the other side of this? I don't see any
> > > > dma_buf_map_attachment() calls in drivers/misc, or added in this patch
> > > > set.
> > >
> > > This patch-set is only to enable the support for the exporter side.
> > > The "other side" is any generic RDMA networking device that will want
> > > to perform p2p communication over PCIe with our GAUDI accelerator.
> > > An example is indeed the mlx5 card which has already integrated
> > > support for being an "importer".
> >
> > It raises the question of how you are testing this if you aren't using
> > it with the only in-tree driver: mlx5.
>
> For p2p dma-buf there's also amdgpu as a possible in-tree candidate
> driver, that's why I added amdgpu folks. Otoh I'm not aware of AI+GPU
> combos being much in use, at least with upstream gpu drivers (nvidia
> blob is a different story ofc, but I don't care what they do in their
> own world).
> -Daniel
> --
We have done / are doing three things:
1. I wrote a simple "importer" driver that emulates an RDMA driver. It
calls the same IB_UMEM_DMABUF functions that the mlx5 driver does, but
instead of using h/w it accesses the BAR directly. We wrote several
tests that emulate the real application, i.e. asking the habanalabs
driver to create a dma-buf object and export its FD back to userspace.
Userspace then sends the FD to the "importer" driver, which attaches to
it, gets the SG list and accesses the memory on the GAUDI device (a
sketch of that flow follows below). This gave me confidence that the
way we integrated the exporter is basically correct/working.

2. We are trying to do a POC with a MLNX card we have, WIP.

3. We are working with a 3rd-party RDMA device whose driver is now
adding support for being an "importer". Also WIP.

In points 2 and 3 we haven't yet reached the stage of actually testing
this feature.
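
A hedged sketch of roughly what the FD-to-SG-list path of such a test importer
could look like; the dummy_ names are invented and this is not the actual test
code, just the standard dynamic-importer calls from <linux/dma-buf.h>:

#include <linux/device.h>
#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/dma-resv.h>
#include <linux/scatterlist.h>

static void dummy_move_notify(struct dma_buf_attachment *attach) { }

static const struct dma_buf_attach_ops dummy_attach_ops = {
        .allow_peer2peer = true,
        .move_notify     = dummy_move_notify,
};

static int dummy_import(struct device *dev, int dmabuf_fd)
{
        struct dma_buf_attachment *attach;
        struct scatterlist *sg;
        struct dma_buf *dmabuf;
        struct sg_table *sgt;
        int i, ret = 0;

        dmabuf = dma_buf_get(dmabuf_fd);        /* FD exported by habanalabs */
        if (IS_ERR(dmabuf))
                return PTR_ERR(dmabuf);

        attach = dma_buf_dynamic_attach(dmabuf, dev, &dummy_attach_ops, NULL);
        if (IS_ERR(attach)) {
                ret = PTR_ERR(attach);
                goto put;
        }

        dma_resv_lock(dmabuf->resv, NULL);
        sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
        if (IS_ERR(sgt)) {
                ret = PTR_ERR(sgt);
                goto unlock;
        }

        /* Each entry is a DMA address of GAUDI device memory behind the BAR. */
        for_each_sgtable_dma_sg(sgt, sg, i)
                dev_info(dev, "chunk %d: addr %pad len %u\n",
                         i, &sg_dma_address(sg), sg_dma_len(sg));

        dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
unlock:
        dma_resv_unlock(dmabuf->resv);
        dma_buf_detach(dmabuf, attach);
put:
        dma_buf_put(dmabuf);
        return ret;
}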

Another thing I want to emphasize is that we are doing p2p only
through the export/import of the FD. We do *not* allow the user to
mmap the dma-buf as we do not support direct I/O. So there is no access
to these pages from userspace.

Thanks,
Oded


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Daniel Vetter
On Mon, Jun 21, 2021 at 7:55 PM Jason Gunthorpe  wrote:
> On Mon, Jun 21, 2021 at 07:26:14PM +0300, Oded Gabbay wrote:
> > On Mon, Jun 21, 2021 at 5:12 PM Jason Gunthorpe  wrote:
> > >
> > > On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
> > > > On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
> > >
> > > > > Also I'm wondering which is the other driver that we share buffers
> > > > > with. The gaudi stuff doesn't have real struct pages as backing
> > > > > storage, it only fills out the dma_addr_t. That tends to blow up with
> > > > > other drivers, and the only place where this is guaranteed to work is
> > > > > if you have a dynamic importer which sets the allow_peer2peer flag.
> > > > > Adding maintainers from other subsystems who might want to chime in
> > > > > here. So even aside of the big question as-is this is broken.
> > > >
> > > > From what I can tell this driver is sending the buffers to other
> > > > instances of the same hardware,
> > >
> > > A dmabuf is consumed by something else in the kernel calling
> > > dma_buf_map_attachment() on the FD.
> > >
> > > What is the other side of this? I don't see any
> > > dma_buf_map_attachment() calls in drivers/misc, or added in this patch
> > > set.
> >
> > This patch-set is only to enable the support for the exporter side.
> > The "other side" is any generic RDMA networking device that will want
> > to perform p2p communication over PCIe with our GAUDI accelerator.
> > An example is indeed the mlx5 card which has already integrated
> > support for being an "importer".
>
> It raises the question of how you are testing this if you aren't using
> it with the only in-tree driver: mlx5.

For p2p dma-buf there's also amdgpu as a possible in-tree candidate
driver, that's why I added amdgpu folks. Otoh I'm not aware of AI+GPU
combos being much in use, at least with upstream gpu drivers (nvidia
blob is a different story ofc, but I don't care what they do in their
own world).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Jason Gunthorpe
On Mon, Jun 21, 2021 at 07:26:14PM +0300, Oded Gabbay wrote:
> On Mon, Jun 21, 2021 at 5:12 PM Jason Gunthorpe  wrote:
> >
> > On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
> > > On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
> >
> > > > Also I'm wondering which is the other driver that we share buffers
> > > > with. The gaudi stuff doesn't have real struct pages as backing
> > > > storage, it only fills out the dma_addr_t. That tends to blow up with
> > > > other drivers, and the only place where this is guaranteed to work is
> > > > if you have a dynamic importer which sets the allow_peer2peer flag.
> > > > Adding maintainers from other subsystems who might want to chime in
> > > > here. So even aside of the big question as-is this is broken.
> > >
> > > From what I can tell this driver is sending the buffers to other
> > > instances of the same hardware,
> >
> > A dmabuf is consumed by something else in the kernel calling
> > dma_buf_map_attachment() on the FD.
> >
> > What is the other side of this? I don't see any
> > dma_buf_map_attachment() calls in drivers/misc, or added in this patch
> > set.
> 
> This patch-set is only to enable the support for the exporter side.
> The "other side" is any generic RDMA networking device that will want
> to perform p2p communication over PCIe with our GAUDI accelerator.
> An example is indeed the mlx5 card which has already integrated
> support for being an "importer".

It raises the question of how you are testing this if you aren't using
it with the only in-tree driver: mlx5.

Jason


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Oded Gabbay
On Mon, Jun 21, 2021 at 5:12 PM Jason Gunthorpe  wrote:
>
> On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
> > On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
>
> > > Also I'm wondering which is the other driver that we share buffers
> > > with. The gaudi stuff doesn't have real struct pages as backing
> > > storage, it only fills out the dma_addr_t. That tends to blow up with
> > > other drivers, and the only place where this is guaranteed to work is
> > > if you have a dynamic importer which sets the allow_peer2peer flag.
> > > Adding maintainers from other subsystems who might want to chime in
> > > here. So even aside of the big question as-is this is broken.
> >
> > From what I can tell this driver is sending the buffers to other
> > instances of the same hardware,
>
> A dmabuf is consumed by something else in the kernel calling
> dma_buf_map_attachment() on the FD.
>
> What is the other side of this? I don't see any
> dma_buf_map_attachment() calls in drivers/misc, or added in this patch
> set.

This patch-set is only to enable the support for the exporter side.
The "other side" is any generic RDMA networking device that will want
to perform p2p communication over PCIe with our GAUDI accelerator.
An example is indeed the mlx5 card which has already integrated
support for being an "importer".
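
For illustration, the userspace side of that flow could look roughly like the
sketch below. The HL_MEM_OP_EXPORT_DMABUF_FD opcode and the handle/mem_size
fields come from this patch; how exactly the FD is returned by the memory
ioctl is an assumption on my part (this patch only shows the input side), and
ibv_reg_dmabuf_mr() is the rdma-core call that a dma-buf-capable provider such
as mlx5 implements:

#include <stdint.h>
#include <infiniband/verbs.h>

/*
 * Sketch only: dmabuf_fd is assumed to have been obtained from the habanalabs
 * memory ioctl with op = HL_MEM_OP_EXPORT_DMABUF_FD, passing the allocation
 * handle (or, on GAUDI, the physical address) and mem_size as defined in this
 * patch.
 */
static struct ibv_mr *register_gaudi_dmabuf(struct ibv_pd *pd, int dmabuf_fd,
                                            size_t size)
{
        /* offset and iova kept at 0 for simplicity */
        return ibv_reg_dmabuf_mr(pd, 0, size, 0, dmabuf_fd,
                                 IBV_ACCESS_LOCAL_WRITE |
                                 IBV_ACCESS_REMOTE_READ |
                                 IBV_ACCESS_REMOTE_WRITE);
}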

This is *not* used for communication with another GAUDI device. If I
want to communicate with another GAUDI device, our userspace
communications library will use our internal network links, without
any need for dma-buf.

Oded

>
> AFAIK the only viable in-tree other side is in mlx5 (look in
> umem_dmabuf.c)
>
> Though as we already discussed, habana has their own networking (out of
> tree, presumably), so I suspect this is really to support some
> out-of-tree stuff??
>
> Jason


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Jason Gunthorpe
On Mon, Jun 21, 2021 at 04:20:35PM +0200, Daniel Vetter wrote:

> Also unless we're actually doing this properly there's zero incentive for
> me to review the kernel code and check whether it follows the rules
> correctly, so you have excellent chances that you just break the rules.
> And dma_buf/fence are tricky enough that you're pretty much guaranteed to
> break the rules if you're not involved in the discussions. Just now we
> have a big one where everyone involved (all of whom have been doing this
> for 10+ years at least) realizes we've fucked up big time.

This is where I come from on dmabuf: it is fiendishly
complicated. Don't use it unless you absolutely have to, are in DRM,
and have people like Daniel helping to make sure you use it right.

Its whole premise and design are compromised by specialty historical
implementation choices on the GPU side.

Jason


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Jason Gunthorpe
On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:

> Mission accomplished, we've gone full circle, and the totally-not-a-gpu
> driver is now trying to use gpu infrastructure. And seems to have
> gained vram meanwhile too. Next up is going to be synchronization
> using dma_fence so you can pass buffers back without stalls
> among drivers.

Well, we can't even see the other side of this, so who knows.

This is a new uAPI, so where is the userspace? In RDMA, at least, I
require seeing the new userspace and test suite before changes to
include/uapi/rdma can go ahead.

> Doug/Jason from infiniband: Should we add linux-rdma to the dma-buf
> wildcard match so that you can catch these next time around too? At
> least when people use scripts/get_maintainers.pl correctly. All the
> other subsystems using dma-buf are on there already (dri-devel,
> linux-media and linaro-mm-sig for android/arm embedded stuff).

My bigger concern is that this doesn't seem to implement PCI P2P DMA
correctly. It follows the same hacky NULL-page approach that
Christoph Hellwig already NAK'd for AMD.

This should not be allowed to proliferate.

I would be much happier to see this done using the approach of
Logan's series here:

https://lore.kernel.org/linux-block/20210513223203.5542-1-log...@deltatee.com/
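
For reference, the direction Logan's series takes is to back the BAR with real
ZONE_DEVICE pages via the PCI P2PDMA infrastructure, roughly along these lines
(a sketch only; bar and size are placeholders, and whether this model fits
GAUDI's device memory is exactly what is being debated here):

#include <linux/pci.h>
#include <linux/pci-p2pdma.h>

static int example_publish_p2p_bar(struct pci_dev *pdev, int bar, size_t size)
{
        int rc;

        /* Create struct pages (ZONE_DEVICE) covering the BAR region. */
        rc = pci_p2pdma_add_resource(pdev, bar, size, 0);
        if (rc)
                return rc;

        /* Let other drivers allocate from it via pci_alloc_p2pmem(). */
        pci_p2pmem_publish(pdev, true);
        return 0;
}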

Jason


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Daniel Vetter
On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
> On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
> > On Fri, Jun 18, 2021 at 2:36 PM Oded Gabbay  wrote:
> > > A user process might want to share device memory with another
> > > driver/device and allow it to access that memory over PCIe (P2P).
> > >
> > > To enable this, we utilize the dma-buf mechanism and add dma-buf
> > > exporter support, so the other driver can import the device memory and
> > > access it.
> > >
> > > The device memory is allocated using our existing allocation uAPI,
> > > where the user will get a handle that represents the allocation.
> > >
> > > The user will then need to call the new
> > > uAPI (HL_MEM_OP_EXPORT_DMABUF_FD) and give the handle as a parameter.
> > >
> > > The driver will return an FD that represents the DMA-BUF object that
> > > was created to match that allocation.
> > >
> > > Signed-off-by: Oded Gabbay 
> > > Reviewed-by: Tomer Tayar 
> > 
> > Mission accomplished, we've gone full circle, and the totally-not-a-gpu
> > driver is now trying to use gpu infrastructure. And seems to have
> > gained vram meanwhile too. Next up is going to be synchronization
> > using dma_fence so you can pass buffers back without stalls
> > among drivers.
> 
> What's wrong with other drivers using dmabufs and even dma_fence?  It's
> a common problem when shuffling memory around systems, why is that
> somehow only allowed for gpu drivers?
> 
> There are many users of these structures in the kernel today that are
> not gpu drivers (tee, fastrpc, virtio, xen, IB, etc) as this is a common
> thing that drivers want to do (throw chunks of memory around from
> userspace to hardware).
> 
> I'm not trying to be a pain here, but I really do not understand why
> this is a problem.  A kernel API is present; why shouldn't other
> in-kernel drivers use it?  We had the problem in the past where subsystems were
> trying to create their own interfaces for the same thing, which is why
> you all created the dmabuf api to help unify this.

It's the same thing as ever. 90% of an accel driver is in userspace,
that's where all the fun is, that's where the big picture review needs to
happen, and we've very conveniently bypassed all that a few years back
because it was too annoying.

Once we have the full driver stack and can start reviewing it I have no
objections to totally-not-gpus using all this stuff too. But until we can
do that this is all just causing headaches.

Ofc if you assume that userspace doesn't matter then you don't care, which
is where this gigantic disconnect comes from.

Also unless we're actually doing this properly there's zero incentive for
me to review the kernel code and check whether it follows the rules
correctly, so you have excellent chances that you just break the rules.
And dma_buf/fence are tricky enough that you're pretty much guaranteed to
break the rules if you're not involved in the discussions. Just now we
have a big one where everyone involved (all of whom have been doing this
for 10+ years at least) realizes we've fucked up big time.

Anyway we've had this discussion, we're not going to move anyone here at
all, so *shrug*. I'll keep seeing accelerators in drivers/misc as blatant
bypassing of review by actual accelerator pieces, you keep seeing dri-devel
as ... well I dunno, people who don't know what they're talking about
maybe. Or not relevant to your totally-not-a-gpu thing.

> > Also I'm wondering which is the other driver that we share buffers
> > with. The gaudi stuff doesn't have real struct pages as backing
> > storage, it only fills out the dma_addr_t. That tends to blow up with
> > other drivers, and the only place where this is guaranteed to work is
> > if you have a dynamic importer which sets the allow_peer2peer flag.
> > Adding maintainers from other subsystems who might want to chime in
> > here. So even aside of the big question as-is this is broken.
> 
> From what I can tell this driver is sending the buffers to other
> instances of the same hardware, as that's what is on the other "end" of
> the network connection.  No different from IB's use of RDMA, right?

There's no import afaict, but maybe I missed it. Assuming I haven't missed
it, the importing necessarily has to happen in some other driver.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Jason Gunthorpe
On Mon, Jun 21, 2021 at 03:02:10PM +0200, Greg KH wrote:
> On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:

> > Also I'm wondering which is the other driver that we share buffers
> > with. The gaudi stuff doesn't have real struct pages as backing
> > storage, it only fills out the dma_addr_t. That tends to blow up with
> > other drivers, and the only place where this is guaranteed to work is
> > if you have a dynamic importer which sets the allow_peer2peer flag.
> > Adding maintainers from other subsystems who might want to chime in
> > here. So even aside of the big question as-is this is broken.
> 
> From what I can tell this driver is sending the buffers to other
> instances of the same hardware,

A dmabuf is consumed by something else in the kernel calling
dma_buf_map_attachment() on the FD.

What is the other side of this? I don't see any
dma_buf_map_attachment() calls in drivers/misc, or added in this patch
set.

AFAIK the only viable in-tree other side is in mlx5 (look in
umem_dmabuf.c)
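
For orientation, the entry point in that file is ib_umem_dmabuf_get(), which
hides the dma-buf attach/map dance behind an ib_umem. A hedged sketch of how
an RDMA provider calls it (signature quoted from memory, and the example_
names are placeholders, not the actual mlx5 symbols, which wire move_notify up
to MR invalidation):

#include <linux/dma-buf.h>
#include <rdma/ib_umem.h>

static void example_ib_move_notify(struct dma_buf_attachment *attach) { }

static const struct dma_buf_attach_ops example_ib_attach_ops = {
        .allow_peer2peer = true,
        .move_notify     = example_ib_move_notify,
};

static struct ib_umem_dmabuf *example_get_dmabuf_umem(struct ib_device *ibdev,
                                                      unsigned long offset,
                                                      size_t length,
                                                      int dmabuf_fd, int access)
{
        return ib_umem_dmabuf_get(ibdev, offset, length, dmabuf_fd, access,
                                  &example_ib_attach_ops);
}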

Though as we already discussed, habana has their own networking (out of
tree, presumably), so I suspect this is really to support some
out-of-tree stuff??

Jason


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Greg KH
On Mon, Jun 21, 2021 at 02:28:48PM +0200, Daniel Vetter wrote:
> On Fri, Jun 18, 2021 at 2:36 PM Oded Gabbay  wrote:
> > A user process might want to share device memory with another
> > driver/device and allow it to access that memory over PCIe (P2P).
> >
> > To enable this, we utilize the dma-buf mechanism and add dma-buf
> > exporter support, so the other driver can import the device memory and
> > access it.
> >
> > The device memory is allocated using our existing allocation uAPI,
> > where the user will get a handle that represents the allocation.
> >
> > The user will then need to call the new
> > uAPI (HL_MEM_OP_EXPORT_DMABUF_FD) and give the handle as a parameter.
> >
> > The driver will return an FD that represents the DMA-BUF object that
> > was created to match that allocation.
> >
> > Signed-off-by: Oded Gabbay 
> > Reviewed-by: Tomer Tayar 
> 
> Mission accomplished, we've gone full circle, and the totally-not-a-gpu
> driver is now trying to use gpu infrastructure. And seems to have
> gained vram meanwhile too. Next up is going to be synchronization
> using dma_fence so you can pass buffers back without stalls
> among drivers.

What's wrong with other drivers using dmabufs and even dma_fence?  It's
a common problem when shuffling memory around systems, why is that
somehow only allowed for gpu drivers?

There are many users of these structures in the kernel today that are
not gpu drivers (tee, fastrpc, virtio, xen, IB, etc) as this is a common
thing that drivers want to do (throw chunks of memory around from
userspace to hardware).

I'm not trying to be a pain here, but I really do not understand why
this is a problem.  A kernel API is present; why shouldn't other
in-kernel drivers use it?  We had the problem in the past where subsystems were
trying to create their own interfaces for the same thing, which is why
you all created the dmabuf api to help unify this.

> Also I'm wondering which is the other driver that we share buffers
> with. The gaudi stuff doesn't have real struct pages as backing
> storage, it only fills out the dma_addr_t. That tends to blow up with
> other drivers, and the only place where this is guaranteed to work is
> if you have a dynamic importer which sets the allow_peer2peer flag.
> Adding maintainers from other subsystems who might want to chime in
> here. So even aside of the big question as-is this is broken.

From what I can tell this driver is sending the buffers to other
instances of the same hardware, as that's what is on the other "end" of
the network connection.  No different from IB's use of RDMA, right?

thanks,

greg k-h


Re: [PATCH v3 1/2] habanalabs: define uAPI to export FD for DMA-BUF

2021-06-21 Thread Daniel Vetter
On Fri, Jun 18, 2021 at 2:36 PM Oded Gabbay  wrote:
> A user process might want to share device memory with another
> driver/device and allow it to access that memory over PCIe (P2P).
>
> To enable this, we utilize the dma-buf mechanism and add dma-buf
> exporter support, so the other driver can import the device memory and
> access it.
>
> The device memory is allocated using our existing allocation uAPI,
> where the user will get a handle that represents the allocation.
>
> The user will then need to call the new
> uAPI (HL_MEM_OP_EXPORT_DMABUF_FD) and give the handle as a parameter.
>
> The driver will return an FD that represents the DMA-BUF object that
> was created to match that allocation.
>
> Signed-off-by: Oded Gabbay 
> Reviewed-by: Tomer Tayar 

Mission accomplished, we've gone full circle, and the totally-not-a-gpu
driver is now trying to use gpu infrastructure. And seems to have
gained vram meanwhile too. Next up is going to be synchronization
using dma_fence so you can pass buffers back without stalls
among drivers.

Bonus points for this being at v3 before it shows up on dri-devel and
cc's dma-buf folks properly (not quite all, I added the missing
people).

I think we roughly have two options here

a) Greg continues to piss off dri-devel folks while trying to look
cute and steadfastly claiming that this accelerator doesn't work
like any of the other accelerator drivers we have in drivers/gpu/drm.
All while the driver ever more looks like one of these other accel
drivers.

b) We finally do what we should have done years back and treat this as
a proper driver submission and review it on dri-devel instead of
sneaking it in through other channels because the merge criteria
dri-devel has are too onerous and people who don't have experience
with accel stacks for the past 20 years or so don't like them.

"But this probably means a new driver and big disruption!"

Not my problem, I'm not the dude who has to come up with an excuse for
this because I didn't merge the driver in the first place. I do get to
throw a "we all told you so" in though, but that's not helping.

Also I'm wondering which is the other driver that we share buffers
with. The gaudi stuff doesn't have real struct pages as backing
storage, it only fills out the dma_addr_t. That tends to blow up with
other drivers, and the only place where this is guaranteed to work is
if you have a dynamic importer which sets the allow_peer2peer flag.
Adding maintainers from other subsystems who might want to chime in
here. So even aside of the big question as-is this is broken.

Currently only 2 drivers set allow_peer2peer, so those are the only
ones who can consume these buffers from device memory. Pinging those
folks specifically.
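
In concrete terms: the dma-buf core only sets attachment->peer2peer for
dynamic importers whose dma_buf_attach_ops have allow_peer2peer = true, so an
exporter with no struct pages behind its memory can at least fail cleanly for
everyone else. A sketch of that check (example_ names invented, not the
habanalabs code):

#include <linux/dma-buf.h>
#include <linux/errno.h>

static int example_attach(struct dma_buf *dmabuf,
                          struct dma_buf_attachment *attachment)
{
        /* No struct pages to hand out, so only P2P-capable importers work. */
        if (!attachment->peer2peer)
                return -EOPNOTSUPP;
        return 0;
}

static const struct dma_buf_ops example_dmabuf_ops = {
        .attach = example_attach,
        /* .map_dma_buf / .unmap_dma_buf / .release elided in this sketch */
};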

Doug/Jason from infiniband: Should we add linux-rdma to the dma-buf
wildcard match so that you can catch these next time around too? At
least when people use scripts/get_maintainers.pl correctly. All the
other subsystems using dma-buf are on there already (dri-devel,
linux-media and linaro-mm-sig for android/arm embedded stuff).

Cheers, Daniel



> ---
>  include/uapi/misc/habanalabs.h | 28 +++-
>  1 file changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/misc/habanalabs.h b/include/uapi/misc/habanalabs.h
> index a47a731e4527..aa3d8e0ba060 100644
> --- a/include/uapi/misc/habanalabs.h
> +++ b/include/uapi/misc/habanalabs.h
> @@ -808,6 +808,10 @@ union hl_wait_cs_args {
>  #define HL_MEM_OP_UNMAP3
>  /* Opcode to map a hw block */
> #define HL_MEM_OP_MAP_BLOCK 4
> +/* Opcode to create DMA-BUF object for an existing device memory allocation
> + * and to export an FD of that DMA-BUF back to the caller
> + */
> +#define HL_MEM_OP_EXPORT_DMABUF_FD 5
>
>  /* Memory flags */
>  #define HL_MEM_CONTIGUOUS  0x1
> @@ -878,11 +882,26 @@ struct hl_mem_in {
> /* Virtual address returned from HL_MEM_OP_MAP */
> __u64 device_virt_addr;
> } unmap;
> +
> +   /* HL_MEM_OP_EXPORT_DMABUF_FD */
> +   struct {
> +   /* Handle returned from HL_MEM_OP_ALLOC. In Gaudi,
> +* where we don't have MMU for the device memory, the
> +* driver expects a physical address (instead of
> +* a handle) in the device memory space.
> +*/
> +   __u64 handle;
> +   /* Size of memory allocation. Relevant only for GAUDI */
> +   __u64 mem_size;
> +   } export_dmabuf_fd;
> };
>
> /* HL_MEM_OP_* */
> __u32 op;
> -   /* HL_MEM_* flags */
> +   /* HL_MEM_* flags.
> +* For the HL_MEM_OP_EXPORT_DMABUF_FD opcode, this field holds the
> +* DMA-BUF file/FD flags.
> +*/
> __u32 flags;
> /* Context ID - Currently not in use */
> __u32