Re: Design session notes: GPU acceleration in Xen

2024-06-19 Thread Christian König

On 18.06.24 16:12, Demi Marie Obenour wrote:

On Tue, Jun 18, 2024 at 08:33:38AM +0200, Christian König wrote:
> On 18.06.24 02:57, Demi Marie Obenour wrote:
>> On Mon, Jun 17, 2024 at 10:46:13PM +0200, Marek Marczykowski-Górecki
>> wrote:
>>> On Mon, Jun 17, 2024 at 09:46:29AM +0200, Roger Pau Monné wrote:
>>>> On Sun, Jun 16, 2024 at 08:38:19PM -0400, Demi Marie Obenour wrote:
>>>>> In both cases, the device physical
>>>>> addresses are identical to dom0’s physical addresses.
>>>>
>>>> Yes, but a PV dom0 physical address space can be very scattered.
>>>>
>>>> IIRC there's an hypercall to request physically contiguous memory for
>>>> PV, but you don't want to be using that every time you allocate a
>>>> buffer (not sure it would support the sizes needed by the GPU
>>>> anyway).
>>
>>> Indeed that isn't going to fly. In older Qubes versions we had PV
>>> sys-net with PCI passthrough for a network card. After some uptime it
>> was basically impossible to restart and still have enough contiguous
>>> memory for a network driver, and there it was about _much_ smaller
>>> buffers, like 2M or 4M. At least not without shutting down a lot more
>>> things to free some more memory.
>>
>> Ouch!  That makes me wonder if all GPU drivers actually need physically
>> contiguous buffers, or if it is (as I suspect) driver-specific. CCing
>> Christian König who has mentioned issues in this area.

> Well, GPUs don't need physically contiguous memory to function, but if they
> only get 4k pages to work with it means a quite large (up to 30%)
> performance penalty.

The status quo is "no GPU acceleration at all", so 70% of bare metal
performance would be amazing right now.


Well, AMD uses the native context approach in Xen, which delivers over
90% of bare metal performance.


Pierre-Eric can tell you more, but we certainly have GPU solutions in
production with Xen which would suffer greatly if we see the underlying
memory fragmented like this.



  However, the implementation
should not preclude eliminating this performance penalty in the future.

What size pages do GPUs need for good performance?  Is it the same as
CPU huge pages?


2MiB are usually sufficient.

Regards,
Christian.


  PV dom0 doesn't get huge pages at all, but PVH and HVM
guests do, and the goal is to move away from PV guests as they have lots
of unrelated problems.

> So scattering memory like you described is probably a very bad idea if you
> want any halfway decent performance.

For an initial prototype a 30% performance penalty is acceptable, but
it's good to know that memory fragmentation needs to be avoided.

> Regards,
> Christian

Thanks for the prompt response!





Re: Design session notes: GPU acceleration in Xen

2024-06-17 Thread Christian König

On 18.06.24 02:57, Demi Marie Obenour wrote:
On Mon, Jun 17, 2024 at 10:46:13PM +0200, Marek Marczykowski-Górecki 
wrote:

> On Mon, Jun 17, 2024 at 09:46:29AM +0200, Roger Pau Monné wrote:
>> On Sun, Jun 16, 2024 at 08:38:19PM -0400, Demi Marie Obenour wrote:
>>> In both cases, the device physical
>>> addresses are identical to dom0’s physical addresses.
>>
>> Yes, but a PV dom0 physical address space can be very scattered.
>>
>> IIRC there's an hypercall to request physically contiguous memory for
>> PV, but you don't want to be using that every time you allocate a
>> buffer (not sure it would support the sizes needed by the GPU
>> anyway).

> Indeed that isn't going to fly. In older Qubes versions we had PV
> sys-net with PCI passthrough for a network card. After some uptime it
> was basically impossible to restart and still have enough contiguous
> memory for a network driver, and there it was about _much_ smaller
> buffers, like 2M or 4M. At least not without shutting down a lot more
> things to free some more memory.

Ouch!  That makes me wonder if all GPU drivers actually need physically
contiguous buffers, or if it is (as I suspect) driver-specific. CCing
Christian König who has mentioned issues in this area.


Well, GPUs don't need physically contiguous memory to function, but if they
only get 4k pages to work with it means a quite large (up to 30%)
performance penalty.


So scattering memory like you described is probably a very bad idea if
you want any halfway decent performance.


Regards,
Christian.



Given the recent progress on PVH dom0, is it reasonable to assume that
PVH dom0 will be ready in time for R4.3, and that therefore Qubes OS
doesn't need to worry about this problem on x86?





Re: [PATCH] xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import

2024-01-05 Thread Christian König

On 04.01.24 19:53, Oleksandr Tyshchenko wrote:

From: Oleksandr Tyshchenko 

DO NOT access the underlying struct page of an sg table exported
by DMA-buf in dmabuf_imp_to_refs(), this is not allowed.
Please see drivers/dma-buf/dma-buf.c:mangle_sg_table() for details.

Fortunately, here (for special Xen device) we can avoid using
pages and calculate gfns directly from dma addresses provided by
the sg table.

Suggested-by: Daniel Vetter 
Signed-off-by: Oleksandr Tyshchenko 


I can't say that I can judge the full technical background, but that 
looks reasonable to me.


Acked-by: Christian König 


---
Please note, I didn't manage to test the patch against the latest master branch
on real HW (patch was only build tested there). Patch was tested on Arm64
guests using Linux v5.10.41 from vendor's BSP, this is the environment where
running this use-case is possible and to which I have access (Xen PV display
with zero-copy and backend domain as a buffer provider - be-alloc=1, so dma-buf
import part was involved). A little bit old, but the dma-buf import code
in gntdev-dmabuf.c hasn't been changed much since that time, all context
remains almost the same according to my code inspection.
---
---
  drivers/xen/gntdev-dmabuf.c | 42 +++--
  1 file changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c
index 4440e626b797..0dde49fca9a5 100644
--- a/drivers/xen/gntdev-dmabuf.c
+++ b/drivers/xen/gntdev-dmabuf.c
@@ -11,6 +11,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -50,7 +51,7 @@ struct gntdev_dmabuf {
  
  	/* Number of pages this buffer has. */

int nr_pages;
-   /* Pages of this buffer. */
+   /* Pages of this buffer (only for dma-buf export). */
struct page **pages;
  };
  
@@ -484,7 +485,7 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, int flags,

  /* DMA buffer import support. */
  
  static int

-dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
+dmabuf_imp_grant_foreign_access(unsigned long *gfns, u32 *refs,
int count, int domid)
  {
grant_ref_t priv_gref_head;
@@ -507,7 +508,7 @@ dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs,
}
  
  		gnttab_grant_foreign_access_ref(cur_ref, domid,

-   xen_page_to_gfn(pages[i]), 0);
+   gfns[i], 0);
refs[i] = cur_ref;
}
  
@@ -529,7 +530,6 @@ static void dmabuf_imp_end_foreign_access(u32 *refs, int count)
  
  static void dmabuf_imp_free_storage(struct gntdev_dmabuf *gntdev_dmabuf)

  {
-   kfree(gntdev_dmabuf->pages);
kfree(gntdev_dmabuf->u.imp.refs);
kfree(gntdev_dmabuf);
  }
@@ -549,12 +549,6 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int count)
if (!gntdev_dmabuf->u.imp.refs)
goto fail;
  
-	gntdev_dmabuf->pages = kcalloc(count,

-  sizeof(gntdev_dmabuf->pages[0]),
-  GFP_KERNEL);
-   if (!gntdev_dmabuf->pages)
-   goto fail;
-
gntdev_dmabuf->nr_pages = count;
  
  	for (i = 0; i < count; i++)

@@ -576,7 +570,8 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev,
struct dma_buf *dma_buf;
struct dma_buf_attachment *attach;
struct sg_table *sgt;
-   struct sg_page_iter sg_iter;
+   struct sg_dma_page_iter sg_iter;
+   unsigned long *gfns;
int i;
  
  	dma_buf = dma_buf_get(fd);

@@ -624,26 +619,23 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev,
  
  	gntdev_dmabuf->u.imp.sgt = sgt;
  
-	/* Now convert sgt to array of pages and check for page validity. */

+   gfns = kcalloc(count, sizeof(*gfns), GFP_KERNEL);
+   if (!gfns)
+   goto fail_unmap;
+
+   /* Now convert sgt to array of gfns without accessing underlying pages. */
i = 0;
-   for_each_sgtable_page(sgt, &sg_iter, 0) {
-   struct page *page = sg_page_iter_page(&sg_iter);
-   /*
-* Check if page is valid: this can happen if we are given
-* a page from VRAM or other resources which are not backed
-* by a struct page.
-*/
-   if (!pfn_valid(page_to_pfn(page))) {
-   ret = ERR_PTR(-EINVAL);
-   goto fail_unmap;
-   }
+   for_each_sgtable_dma_page(sgt, &sg_iter, 0) {
+   dma_addr_t addr = sg_page_iter_dma_address(&sg_iter);
+   unsigned long pfn = bfn_to_pfn(XEN_PFN_DOWN(dma_to_phys(dev, addr)));
  
-		gntdev_dmabuf->pages[i++] = page;

+   gfns[i++] = pfn_to_gfn(pfn);
}
  
-	ret = ERR_PTR(dmabuf_imp
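
For reference, the conversion loop the patch introduces boils down to
roughly the following (a self-contained sketch; the helper name
sketch_sgt_to_gfns() is invented for illustration, while the iterator and
Xen helpers are the ones used in the hunk above):

#include <linux/scatterlist.h>
#include <linux/dma-direct.h>
#include <linux/slab.h>
#include <xen/page.h>
#include <asm/xen/page.h>

/* Walk the sg table's DMA addresses (never its struct pages) and derive
 * a guest frame number for each Xen page. Error handling and the
 * surrounding gntdev plumbing are omitted. */
static unsigned long *sketch_sgt_to_gfns(struct device *dev,
					 struct sg_table *sgt, int count)
{
	struct sg_dma_page_iter sg_iter;
	unsigned long *gfns;
	int i = 0;

	gfns = kcalloc(count, sizeof(*gfns), GFP_KERNEL);
	if (!gfns)
		return NULL;

	for_each_sgtable_dma_page(sgt, &sg_iter, 0) {
		dma_addr_t addr = sg_page_iter_dma_address(&sg_iter);
		/* bus address -> pseudo-physical frame -> guest frame */
		unsigned long pfn = bfn_to_pfn(XEN_PFN_DOWN(dma_to_phys(dev, addr)));

		gfns[i++] = pfn_to_gfn(pfn);
	}

	return gfns;
}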

Re: [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev

2023-07-12 Thread Christian König

On 12.07.23 15:38, Uwe Kleine-König wrote:

Hello Maxime,

On Wed, Jul 12, 2023 at 02:52:38PM +0200, Maxime Ripard wrote:

On Wed, Jul 12, 2023 at 01:02:53PM +0200, Uwe Kleine-König wrote:

Background is that this makes merge conflicts easier to handle and detect.

Really?

FWIW, I agree with Christian here.


Each file (apart from include/drm/drm_crtc.h) is only touched once. So
unless I'm missing something you don't get fewer or easier conflicts by
doing it all in a single patch. But you gain the freedom to drop a
patch for one driver without having to drop the rest with it.

Not really, because the last patch removed the union anyway. So you have
to revert both the last patch, plus that driver one. And then you need
to add a TODO to remove that union eventually.

Yes, with a single patch you have only one revert (but 194 files changed,
1264 insertions(+), 1296 deletions(-)) instead of two (one of them: 1
file changed, 9 insertions(+), 1 deletion(-); the other maybe a bit
bigger). (And maybe you get away with just reverting the last patch.)

With a single patch the TODO after a revert is "redo it all again (and
prepare for a different set of conflicts)" while with the split series
it's only "fix that one driver that was forgotten/borked" + reapply that
10 line patch.


Yeah, but for a maintainer the size of the patches doesn't matter.
That's only interesting if you need to manually review the patch, which
you hopefully don't do in the case of something auto-generated.


In other words, if the patch is auto-generated, re-applying it completely
is less work than fixing things up individually.



  As the one who gets that TODO, I prefer the latter.


Yeah, but your personal preferences are not a technically relevant
argument to a maintainer.


At the end of the day Dave or Daniel need to decide, because they need 
to live with it.


Regards,
Christian.



So in sum: If your metric is "small count of reverted commits", you're
right. If however your metric is: Better get 95% of this series' change
in than maybe 0%, the split series is the way to do it.

With me having spent ~3h on this series' changes, it's maybe
understandable that I did it the way I did.

FTR: This series was created on top of v6.5-rc1. If you apply it to
drm-misc-next you get a (trivial) conflict in patch #2. If I imagine being
the responsible maintainer who applies this series, I like being able
to just do git am --skip then.

FTR#2: In drm-misc-next is a new driver
(drivers/gpu/drm/loongson/lsdc_crtc.c) so skipping the last patch for
now might indeed be a good idea.


So I still like the split version better, but I'm open to a more
verbose reasoning from your side.

You're doing only one thing here, really: you change the name of a
structure field. If it was shared between multiple maintainers, then
sure, splitting that up is easier for everyone, but this will go through
drm-misc, so I can't see the benefit it brings.

I see your argument, but I think mine weighs more.

Best regards
Uwe






Re: [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev

2023-07-12 Thread Christian König

On 12.07.23 11:46, Uwe Kleine-König wrote:

Hello,

while I debugged an issue in the imx-lcdc driver I was constantly
irritated by struct drm_device pointer variables being named "dev"
because with that name I usually expect a struct device pointer.

I think there is a big benefit when these are all renamed to "drm_dev".
I have no strong preference here though, so "drmdev" or "drm" are fine
for me, too. Let the bikeshedding begin!

Some statistics:

$ git grep -ohE 'struct drm_device *\* *[^ (),;]*' v6.5-rc1 | sort | uniq -c | sort -n
   1 struct drm_device *adev_to_drm
   1 struct drm_device *drm_
   1 struct drm_device  *drm_dev
   1 struct drm_device*drm_dev
   1 struct drm_device *pdev
   1 struct drm_device *rdev
   1 struct drm_device *vdev
   2 struct drm_device *dcss_drv_dev_to_drm
   2 struct drm_device **ddev
   2 struct drm_device *drm_dev_alloc
   2 struct drm_device *mock
   2 struct drm_device *p_ddev
   5 struct drm_device *device
   9 struct drm_device * dev
  25 struct drm_device *d
  95 struct drm_device *
 216 struct drm_device *ddev
 234 struct drm_device *drm_dev
 611 struct drm_device *drm
4190 struct drm_device *dev

This series starts with renaming struct drm_crtc::dev to drm_dev. If
it's not only me and others like the result of this effort it should be
followed up by adapting the other structs and the individual usages in
the different drivers.

To make this series a bit easier to handle, I first added an alias for
drm_crtc::dev, then converted the drivers one after another and the last
patch drops the "dev" name. This has the advantage of being easier to
review, and if I should have missed an instance only the last patch must
be dropped/reverted. Also this series might conflict with other patches,
in this case the remaining patches can still go in (apart from the last
one of course). Maybe it also makes sense to delay applying the last
patch by one development cycle?
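
For illustration, such a transitional alias could look roughly like this
(a hypothetical sketch, not the actual patch content):

/* Anonymous union: both names alias the same pointer, so drivers can
 * be converted one by one before the old name is dropped in the final
 * patch of the series. */
struct drm_crtc {
	union {
		struct drm_device *dev;     /* old name, to be removed */
		struct drm_device *drm_dev; /* new name */
	};
	/* ... remaining members unchanged ... */
};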


When you automatically generate the patch (with cocci for example) I 
usually prefer a single patch instead.


Background is that this makes merge conflicts easier to handle and detect.

When you have multiple patches and a merge conflict because of some
added lines using the old field, the build breaks only on the last patch,
which removes the old field.


In such cases reviewing the patch just means automatically re-generating 
it and double checking that you don't see anything funky.


Apart from that I honestly absolutely don't care what the name is.

Cheers,
Christian.



The series was compile tested for arm, arm64, powerpc and amd64 using an
allmodconfig (though I only built drivers/gpu/).

Best regards
Uwe

Uwe Kleine-König (52):
   drm/crtc: Start renaming struct drm_crtc::dev to drm_dev
   drm/core: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/amd: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/armada: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/arm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/aspeed: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/ast: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/atmel-hlcdc: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/exynos: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/fsl-dcu: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/gma500: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/gud: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/hisilicon: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/hyperv: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/i915: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/imx: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/ingenic: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/kmb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/logicvc: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/mcde: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/mediatek: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/meson: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/mgag200: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/msm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev
   drm/mxsfb: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/nouveau: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/omapdrm: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/panel-ili9341: Use struct drm_crtc::drm_dev instead of struct
 drm_crtc::dev
   drm/pl111: Use struct d

Re: [RFC PATCH 1/5] x86/xen: disable swiotlb for xen pvh

2023-03-21 Thread Christian König

On 17.03.23 15:45, Alex Deucher wrote:

On Thu, Mar 16, 2023 at 7:09 PM Stefano Stabellini
 wrote:

On Thu, 16 Mar 2023, Juergen Gross wrote:

On 16.03.23 14:53, Alex Deucher wrote:

On Thu, Mar 16, 2023 at 9:48 AM Juergen Gross  wrote:

On 16.03.23 14:45, Alex Deucher wrote:

On Thu, Mar 16, 2023 at 3:50 AM Jan Beulich  wrote:

On 16.03.2023 00:25, Stefano Stabellini wrote:

On Wed, 15 Mar 2023, Jan Beulich wrote:

On 15.03.2023 01:52, Stefano Stabellini wrote:

On Mon, 13 Mar 2023, Jan Beulich wrote:

On 12.03.2023 13:01, Huang Rui wrote:

Xen PVH is the paravirtualized mode and takes advantage of hardware
virtualization support when possible. It will use the hardware IOMMU
support instead of xen-swiotlb, so disable swiotlb if the current domain
is Xen PVH.

But the kernel has no way (yet) to drive the IOMMU, so how can it get
away without resorting to swiotlb in certain cases (like I/O to an
address-restricted device)?

I think Ray meant that, thanks to the IOMMU setup by Xen, there is no
need for swiotlb-xen in Dom0. Address translations are done by the IOMMU
so we can use guest physical addresses instead of machine addresses for
DMA. This is a similar case to Dom0 on ARM when the IOMMU is available
(see include/xen/arm/swiotlb-xen.h:xen_swiotlb_detect, the corresponding
case is XENFEAT_not_direct_mapped).

But how does Xen using an IOMMU help with, as said, address-restricted
devices? They may still need e.g. a 32-bit address to be programmed in,
and if the kernel has memory beyond the 4G boundary not all I/O buffers
may fulfill this requirement.

In short, it is going to work as long as Linux has guest physical
addresses (not machine addresses, those could be anything) lower than
4GB.

If the address-restricted device does DMA via an IOMMU, then the device
gets programmed by Linux using its guest physical addresses (not machine
addresses).

The 32-bit restriction would be applied by Linux to its choice of guest
physical address to use to program the device, the same way it does on
native. The device would be fine as it always uses Linux-provided <4GB
addresses. After the IOMMU translation (pagetable setup by Xen), we could
get any address, including >4GB addresses, and that is expected to work.

I understand that's the "normal" way of working. But whatever the swiotlb
is used for in baremetal Linux, that would similarly require its use in
PVH (or HVM) aiui. So unconditionally disabling it in PVH would look to me
like an incomplete attempt to disable its use altogether on x86. What
difference of PVH vs baremetal am I missing here?

swiotlb is not usable for GPUs even on bare metal.  They often have
hundreds of megs or even gigs of memory mapped on the device at any
given time.  Also, AMD GPUs support 44-48 bit DMA masks (depending on
the chip family).

But the swiotlb isn't per device, but system global.

Sure, but if the swiotlb is in use, then you can't really use the GPU.
So you get to pick one.

The swiotlb is used only for buffers which are not within the DMA mask of a
device (see dma_direct_map_page()). So an AMD GPU supporting a 44 bit DMA mask
won't use the swiotlb unless you have a buffer above guest physical address of
16TB (so basically never).

Disabling swiotlb in such a guest would OTOH mean, that a device with only
32 bit DMA mask passed through to this guest couldn't work with buffers
above 4GB.

I don't think this is acceptable.
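
For reference, the decision Juergen describes boils down to roughly the
following (a simplified sketch loosely modelled on dma_direct_map_page();
sketch_direct_map() is an invented name and error paths are omitted):

#include <linux/dma-direct.h>
#include <linux/dma-mapping.h>
#include <linux/swiotlb.h>

/* Only buffers the device cannot address directly are bounced through
 * the swiotlb. A GPU with a 44 bit DMA mask therefore almost never
 * bounces, while a 32 bit device bounces for any buffer above 4GB. */
static dma_addr_t sketch_direct_map(struct device *dev, phys_addr_t phys,
				    size_t size)
{
	dma_addr_t dma_addr = phys_to_dma(dev, phys);

	if (!dma_capable(dev, dma_addr, size, true))
		return swiotlb_map(dev, phys, size, DMA_BIDIRECTIONAL, 0);

	return dma_addr;
}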

From the Xen subsystem in Linux point of view, the only thing we need to
do is to make sure *not* to enable swiotlb_xen (yes "swiotlb_xen", not
the global swiotlb) on PVH because it is not needed anyway.

I think we should leave the global "swiotlb" setting alone. The global
swiotlb is not relevant to Xen anyway, and surely baremetal Linux has to
have a way to deal with swiotlb/GPU incompatibilities.

We just have to avoid making things worse on Xen, and for that we just
need to avoid unconditionally enabling swiotlb-xen. If the Xen subsystem
doesn't enable swiotlb_xen/swiotlb, and no other subsystem enables
swiotlb, then we have a good Linux configuration capable of handling the
GPU properly.

Alex, please correct me if I am wrong. How is x86_swiotlb_enable set to
false on native (non-Xen) x86?

In most cases we have an IOMMU enabled and IIRC, TTM has slightly
different behavior for memory allocation depending on whether swiotlb
would be needed or not.


Well "slightly different" is an understatement. We need to disable quite 
a bunch of features to make swiotlb work with GPUs.


Especially userptr and inter-device sharing won't work any more.

Regards,
Christian.



Alex





Re: [RFC XEN PATCH 2/6] vpci: accept BAR writes if dom0 is PVH

2023-03-13 Thread Christian König

On 13.03.23 08:23, Christian König wrote:



On 12.03.23 08:54, Huang Rui wrote:

From: Chen Jiqian 

When dom0 is PVH and we want to pass through a GPU to a guest,
we should allow BAR writes even though the BAR is mapped. If
not, the values of the BARs are not initialized when the guest
first starts.

Signed-off-by: Chen Jiqian 
Signed-off-by: Huang Rui 
---
  xen/drivers/vpci/header.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ec2e978a4e..918d11fbce 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -392,7 +392,7 @@ static void cf_check bar_write(
   * Xen only cares whether the BAR is mapped into the p2m, so allow BAR
   * writes as long as the BAR is not mapped into the p2m.
   */
-    if ( bar->enabled )
+    if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )


checkpatch.pl reports here:

ERROR: space prohibited after that open parenthesis '('
#115: FILE: xen/drivers/vpci/header.c:395:
+    if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )


But I should probably mention that I'm not 100% sure if this code base 
uses kernel coding style!


Christian.



Christian.



  {
  /* If the value written is the current one avoid printing a warning. */

  if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )







Re: [RFC XEN PATCH 0/6] Introduce VirtIO GPU and Passthrough GPU support on Xen PVH dom0

2023-03-13 Thread Christian König

Hi Ray,

one nit comment on the style, apart from that it looks technically correct.

But I'm *really* not an expert on all that stuff.

Regards,
Christian.

On 12.03.23 08:54, Huang Rui wrote:

Hi all,

In the graphics world, 3D applications/games run on top of open graphics
libraries such as OpenGL and Vulkan. Mesa is the Linux implementation of
OpenGL and Vulkan for multiple hardware platforms, and these graphics
libraries want GPU hardware acceleration. In the virtualization world,
virtio-gpu and passthrough-gpu are two of the GPU virtualization
technologies.

Currently Xen only supports OpenGL (virgl:
https://docs.mesa3d.org/drivers/virgl.html) for virtio-gpu and passthrough
GPU based on PV dom0 for the x86 platform. Today, we would like to introduce
Vulkan (venus: https://docs.mesa3d.org/drivers/venus.html) and OpenGL on
Vulkan (zink: https://docs.mesa3d.org/drivers/zink.html) support for
VirtIO GPU on Xen. These functions are supported on KVM at this moment,
but so far they are not supported on Xen. We also introduce the PCIe
passthrough (GPU) function based on PVH dom0 for the AMD x86 platform.

This support requires changes across multiple repositories: kernel, Xen,
QEMU, Mesa, and virglrenderer. Please check the branches below:

Kernel: 
https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=upstream-fox-xen
Xen: https://gitlab.com/huangrui123/xen/-/commits/upstream-for-xen
QEMU: https://gitlab.com/huangrui123/qemu/-/commits/upstream-for-xen
Mesa: https://gitlab.freedesktop.org/rui/mesa/-/commits/upstream-for-xen
Virglrenderer: 
https://gitlab.freedesktop.org/rui/virglrenderer/-/commits/upstream-for-xen

On the Xen side, we mainly add PCIe passthrough support on PVH dom0. It
uses QEMU to pass the GPU device through into a guest HVM domU, and the
main work is to route the interrupt by using gsi, vector, and pirq.

Below are screenshots of these functions, please take a look.

Venus:
https://drive.google.com/file/d/1_lPq6DMwHu1JQv7LUUVRx31dBj0HJYcL/view?usp=share_link

Zink:
https://drive.google.com/file/d/1FxLmKu6X7uJOxx1ZzwOm1yA6IL5WMGzd/view?usp=share_link

Passthrough GPU:
https://drive.google.com/file/d/17onr5gvDK8KM_LniHTSQEI2hGJZlI09L/view?usp=share_link

We are working on documentation in the Xen wiki that describes how to
verify these functions, and we will update it in a future version.

Thanks,
Ray

Chen Jiqian (5):
   vpci: accept BAR writes if dom0 is PVH
   x86/pvh: shouldn't check pirq flag when map pirq in PVH
   x86/pvh: PVH dom0 also need PHYSDEVOP_setup_gsi call
   tools/libs/call: add linux os call to get gsi from irq
   tools/libs/light: pci: translate irq to gsi

Roger Pau Monne (1):
   x86/pvh: report ACPI VFCT table to dom0 if present

  tools/include/xen-sys/Linux/privcmd.h |  7 +++
  tools/include/xencall.h   |  2 ++
  tools/include/xenctrl.h   |  2 ++
  tools/libs/call/core.c|  5 +
  tools/libs/call/libxencall.map|  2 ++
  tools/libs/call/linux.c   | 14 ++
  tools/libs/call/private.h |  9 +
  tools/libs/ctrl/xc_physdev.c  |  4 
  tools/libs/light/libxl_pci.c  |  1 +
  xen/arch/x86/hvm/dom0_build.c |  1 +
  xen/arch/x86/hvm/hypercall.c  |  3 +--
  xen/drivers/vpci/header.c |  2 +-
  xen/include/acpi/actbl3.h |  1 +
  13 files changed, 50 insertions(+), 3 deletions(-)






Re: [RFC XEN PATCH 2/6] vpci: accept BAR writes if dom0 is PVH

2023-03-13 Thread Christian König




On 12.03.23 08:54, Huang Rui wrote:

From: Chen Jiqian 

When dom0 is PVH and we want to pass through a GPU to a guest,
we should allow BAR writes even though the BAR is mapped. If
not, the values of the BARs are not initialized when the guest
first starts.

Signed-off-by: Chen Jiqian 
Signed-off-by: Huang Rui 
---
  xen/drivers/vpci/header.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index ec2e978a4e..918d11fbce 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -392,7 +392,7 @@ static void cf_check bar_write(
   * Xen only cares whether the BAR is mapped into the p2m, so allow BAR
   * writes as long as the BAR is not mapped into the p2m.
   */
-if ( bar->enabled )
+if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )


checkpatch.pl reports here:

ERROR: space prohibited after that open parenthesis '('
#115: FILE: xen/drivers/vpci/header.c:395:
+    if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )

Christian.



  {
  /* If the value written is the current one avoid printing a warning. */
  if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )





Re: [PATCH 01/21] drm/amdgpu: Don't set struct drm_driver.lastclose

2022-10-20 Thread Christian König

On 20.10.22 12:37, Thomas Zimmermann wrote:

Don't set struct drm_driver.lastclose. It's used to restore the
fbdev console. But as amdgpu uses generic fbdev emulation, the
console is being restored by the DRM client helpers already. See
the call to drm_client_dev_restore() in drm_lastclose().


???

The commit message doesn't match what the patch is doing. You are 
removing output_poll_changed instead of lastclose here.


Did something get mixed up?

Cheers,
Christian.



Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_display.c   | 1 -
  drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 --
  2 files changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 23998f727c7f9..fb7186c5ade2a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -1224,7 +1224,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device 
*dev,
  
  const struct drm_mode_config_funcs amdgpu_mode_funcs = {

.fb_create = amdgpu_display_user_framebuffer_create,
-   .output_poll_changed = drm_fb_helper_output_poll_changed,
  };
  
  static const struct drm_prop_enum_list amdgpu_underscan_enum_list[] =

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f6a9e8fdd87d6..e9a28a5363b9a 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -82,7 +82,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  #include 
  #include 
@@ -2810,7 +2809,6 @@ const struct amdgpu_ip_block_version dm_ip_block =
  static const struct drm_mode_config_funcs amdgpu_dm_mode_funcs = {
.fb_create = amdgpu_display_user_framebuffer_create,
.get_format_info = amd_get_format_info,
-   .output_poll_changed = drm_fb_helper_output_poll_changed,
.atomic_check = amdgpu_dm_atomic_check,
.atomic_commit = drm_atomic_helper_commit,
  };





Re: [PATCH v5 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers

2020-10-20 Thread Christian König

On 20.10.20 14:20, Thomas Zimmermann wrote:

The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in kernel
address space. The mapping's address is returned as struct dma_buf_map.
Each function is a simplified version of TTM's existing kmap code. Both
functions respect the memory's location and/or writecombine flags.

On top of TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(),
two helpers that convert a GEM object into the TTM BO and forward the call
to TTM's vmap/vunmap. These helpers can be dropped into the respective GEM
object callbacks.

v5:
* use size_t for storing mapping size (Christian)
* ignore premapped memory areas correctly in ttm_bo_vunmap()
* rebase onto latest TTM interfaces (Christian)
* remove BUG() from ttm_bo_vmap() (Christian)
v4:
* drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers (Daniel,
  Christian)

Signed-off-by: Thomas Zimmermann 
Acked-by: Daniel Vetter 
Tested-by: Sam Ravnborg 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/drm_gem_ttm_helper.c | 38 +++
  drivers/gpu/drm/ttm/ttm_bo_util.c| 72 
  include/drm/drm_gem_ttm_helper.h |  6 +++
  include/drm/ttm/ttm_bo_api.h | 28 +++
  include/linux/dma-buf-map.h  | 20 
  5 files changed, 164 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c 
b/drivers/gpu/drm/drm_gem_ttm_helper.c
index 0e4fb9ba43ad..db4c14d78a30 100644
--- a/drivers/gpu/drm/drm_gem_ttm_helper.c
+++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
@@ -49,6 +49,44 @@ void drm_gem_ttm_print_info(struct drm_printer *p, unsigned 
int indent,
  }
  EXPORT_SYMBOL(drm_gem_ttm_print_info);
  
+/**

+ * drm_gem_ttm_vmap() - vmap &ttm_buffer_object
+ * @gem: GEM object.
+ * @map: [out] returns the dma-buf mapping.
+ *
+ * Maps a GEM object with ttm_bo_vmap(). This function can be used as
+ * &drm_gem_object_funcs.vmap callback.
+ *
+ * Returns:
+ * 0 on success, or a negative errno code otherwise.
+ */
+int drm_gem_ttm_vmap(struct drm_gem_object *gem,
+struct dma_buf_map *map)
+{
+   struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
+   return ttm_bo_vmap(bo, map);
+
+}
+EXPORT_SYMBOL(drm_gem_ttm_vmap);
+
+/**
+ * drm_gem_ttm_vunmap() - vunmap &ttm_buffer_object
+ * @gem: GEM object.
+ * @map: dma-buf mapping.
+ *
+ * Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as
+ * &drm_gem_object_funcs.vmap callback.
+ */
+void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
+   struct dma_buf_map *map)
+{
+   struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
+   ttm_bo_vunmap(bo, map);
+}
+EXPORT_SYMBOL(drm_gem_ttm_vunmap);
+
  /**
   * drm_gem_ttm_mmap() - mmap &ttm_buffer_object
   * @gem: GEM object.
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index ba7ab5ed85d0..5c79418405ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -32,6 +32,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -527,6 +528,77 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map)
  }
  EXPORT_SYMBOL(ttm_bo_kunmap);
  
+int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)

+{
+   struct ttm_resource *mem = &bo->mem;
+   int ret;
+
+   ret = ttm_mem_io_reserve(bo->bdev, mem);
+   if (ret)
+   return ret;
+
+   if (mem->bus.is_iomem) {
+   void __iomem *vaddr_iomem;
+   size_t size = bo->num_pages << PAGE_SHIFT;
+
+   if (mem->bus.addr)
+   vaddr_iomem = (void __iomem *)mem->bus.addr;
+   else if (mem->bus.caching == ttm_write_combined)
+   vaddr_iomem = ioremap_wc(mem->bus.offset, size);
+   else
+   vaddr_iomem = ioremap(mem->bus.offset, size);
+
+   if (!vaddr_iomem)
+   return -ENOMEM;
+
+   dma_buf_map_set_vaddr_iomem(map, vaddr_iomem);
+
+   } else {
+   struct ttm_operation_ctx ctx = {
+   .interruptible = false,
+   .no_wait_gpu = false
+   };
+   struct ttm_tt *ttm = bo->ttm;
+   pgprot_t prot;
+   void *vaddr;
+
+   ret = ttm_tt_populate(bo->bdev, ttm, &ctx);
+   if (ret)
+   return ret;
+
+   /*
+* We need to use vmap to get the desired page protection
+* or to make the buffer object look contiguous.
+*/
+   prot = ttm_io_prot(bo, mem, PAGE_KERNEL);
+   vaddr = vmap(ttm->pages, bo->num_pages, 0, prot);
+   if (!vaddr)
+   return -ENOMEM;
+
+   dma_bu

Re: [PATCH v4 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers

2020-10-19 Thread Christian König

Hi Thomas,

[SNIP]

   +int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)
+{
+    struct ttm_resource *mem = &bo->mem;
+    int ret;
+
+    ret = ttm_mem_io_reserve(bo->bdev, mem);
+    if (ret)
+    return ret;
+
+    if (mem->bus.is_iomem) {
+    void __iomem *vaddr_iomem;
+    unsigned long size = bo->num_pages << PAGE_SHIFT;

Please use uint64_t here and make sure to cast bo->num_pages before
shifting.

I thought the rule of thumb is to use u64 in source code. Yet TTM only
uses uint*_t types. Is there anything special about TTM?


My last status is that you can use both and my personal preference is to 
use the uint*_t types because they are part of a higher level standard.



We have a unit test of allocating an 8GB BO and that should work on a
32bit machine as well :)


+
+    if (mem->bus.addr)
+    vaddr_iomem = (void *)(((u8 *)mem->bus.addr));

After reading the patch again, I realized that this is the
'ttm_bo_map_premapped' case and it's missing from _vunmap(). I see two
options here: ignore this case in _vunmap(), or do an ioremap()
unconditionally. Which one is preferable?


ioremap would be very very bad, so we should just do nothing.

Thanks,
Christian.
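
For v5 this became the "ignore premapped memory areas correctly in
ttm_bo_vunmap()" change noted in the changelog quoted earlier in this
digest; roughly (a sketch following that description, not necessarily the
exact final code):

void ttm_bo_vunmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)
{
	struct ttm_resource *mem = &bo->mem;

	if (dma_buf_map_is_null(map))
		return;

	if (!map->is_iomem)
		vunmap(map->vaddr);
	else if (!mem->bus.addr) /* premapped memory was never ioremap()ed */
		iounmap(map->vaddr_iomem);
	dma_buf_map_clear(map);

	ttm_mem_io_free(bo->bdev, &bo->mem);
}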



Best regards
Thomas


+    else if (mem->placement & TTM_PL_FLAG_WC)

I've just nuked the TTM_PL_FLAG_WC flag in drm-misc-next. There is a new
mem->bus.caching enum as replacement.


+    vaddr_iomem = ioremap_wc(mem->bus.offset, size);
+    else
+    vaddr_iomem = ioremap(mem->bus.offset, size);
+
+    if (!vaddr_iomem)
+    return -ENOMEM;
+
+    dma_buf_map_set_vaddr_iomem(map, vaddr_iomem);
+
+    } else {
+    struct ttm_operation_ctx ctx = {
+    .interruptible = false,
+    .no_wait_gpu = false
+    };
+    struct ttm_tt *ttm = bo->ttm;
+    pgprot_t prot;
+    void *vaddr;
+
+    BUG_ON(!ttm);

I think we can drop this, populate will just crash badly anyway.


+
+    ret = ttm_tt_populate(bo->bdev, ttm, &ctx);
+    if (ret)
+    return ret;
+
+    /*
+ * We need to use vmap to get the desired page protection
+ * or to make the buffer object look contiguous.
+ */
+    prot = ttm_io_prot(mem->placement, PAGE_KERNEL);

The calling convention has changed on drm-misc-next as well, but should
be trivial to adapt.

Regards,
Christian.


+    vaddr = vmap(ttm->pages, bo->num_pages, 0, prot);
+    if (!vaddr)
+    return -ENOMEM;
+
+    dma_buf_map_set_vaddr(map, vaddr);
+    }
+
+    return 0;
+}
+EXPORT_SYMBOL(ttm_bo_vmap);
+
+void ttm_bo_vunmap(struct ttm_buffer_object *bo, struct dma_buf_map
*map)
+{
+    if (dma_buf_map_is_null(map))
+    return;
+
+    if (map->is_iomem)
+    iounmap(map->vaddr_iomem);
+    else
+    vunmap(map->vaddr);
+    dma_buf_map_clear(map);
+
+    ttm_mem_io_free(bo->bdev, &bo->mem);
+}
+EXPORT_SYMBOL(ttm_bo_vunmap);
+
   static int ttm_bo_wait_free_node(struct ttm_buffer_object *bo,
    bool dst_use_tt)
   {
diff --git a/include/drm/drm_gem_ttm_helper.h
b/include/drm/drm_gem_ttm_helper.h
index 118cef76f84f..7c6d874910b8 100644
--- a/include/drm/drm_gem_ttm_helper.h
+++ b/include/drm/drm_gem_ttm_helper.h
@@ -10,11 +10,17 @@
   #include 
   #include 
   +struct dma_buf_map;
+
   #define drm_gem_ttm_of_gem(gem_obj) \
   container_of(gem_obj, struct ttm_buffer_object, base)
     void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int
indent,
   const struct drm_gem_object *gem);
+int drm_gem_ttm_vmap(struct drm_gem_object *gem,
+ struct dma_buf_map *map);
+void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
+    struct dma_buf_map *map);
   int drm_gem_ttm_mmap(struct drm_gem_object *gem,
    struct vm_area_struct *vma);
   diff --git a/include/drm/ttm/ttm_bo_api.h
b/include/drm/ttm/ttm_bo_api.h
index 37102e45e496..2c59a785374c 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -48,6 +48,8 @@ struct ttm_bo_global;
     struct ttm_bo_device;
   +struct dma_buf_map;
+
   struct drm_mm_node;
     struct ttm_placement;
@@ -494,6 +496,32 @@ int ttm_bo_kmap(struct ttm_buffer_object *bo,
unsigned long start_page,
    */
   void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map);
   +/**
+ * ttm_bo_vmap
+ *
+ * @bo: The buffer object.
+ * @map: pointer to a struct dma_buf_map representing the map.
+ *
+ * Sets up a kernel virtual mapping, using ioremap or vmap to the
+ * data in the buffer object. The parameter @map returns the virtual
+ * address as struct dma_buf_map. Unmap the buffer with ttm_bo_vunmap().
+ *
+ * Returns
+ * -ENOMEM: Out of memory.
+ * -EINVAL: Invalid range.
+ */
+int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map);
+
+/**
+ * ttm_bo_vunmap
+ *
+ * @bo: The buffer object.
+ * @map: Object describing the map to unmap.
+ *
+ * Unmaps a kernel map set up by ttm_bo_vmap().
+ */
+void tt

Re: [PATCH v4 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers

2020-10-16 Thread Christian König

On 15.10.20 19:52, Thomas Zimmermann wrote:

Hi

On Thu, 15 Oct 2020 18:49:09 +0200 Daniel Vetter  wrote:


On Thu, Oct 15, 2020 at 04:08:13PM +0200, Christian König wrote:

On 15.10.20 14:38, Thomas Zimmermann wrote:

The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in
kernel address space. The mapping's address is returned as struct
dma_buf_map. Each function is a simplified version of TTM's existing
kmap code. Both functions respect the memory's location and/or
writecombine flags.

On top of TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(),
two helpers that convert a GEM object into the TTM BO and forward the
call to TTM's vmap/vunmap. These helpers can be dropped into the
respective GEM object callbacks.

v4:
* drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers
(Daniel, Christian)

Bunch of minor comments below, but overall this looks very solid to me.

Yeah I think just duplicating the ttm bo map stuff for vmap is indeed the
cleanest. And then we can maybe push the combinatorial monster into
vmwgfx, which I think is the only user after this series. Or perhaps a
dedicated set of helpers to map an invidual page (again using the
dma_buf_map stuff).

From a quick look, I'd say it should be possible to have the same interface
for kmap/kunmap as for vmap/vunmap (i.e., parameters are bo and dma-buf-map).
All mapping state can be deduced from this. And struct ttm_bo_kmap_obj can be
killed off entirely.


Yes, that would be rather nice to have.

Thanks,
Christian.
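
Sketched as a possible signature (purely hypothetical at this point,
mirroring ttm_bo_vmap(); no such interface exists yet):

/* The bo plus a struct dma_buf_map would carry all mapping state, so
 * struct ttm_bo_kmap_obj could be killed off entirely. */
int ttm_bo_kmap(struct ttm_buffer_object *bo, unsigned long start_page,
		unsigned long num_pages, struct dma_buf_map *map);
void ttm_bo_kunmap(struct ttm_buffer_object *bo, struct dma_buf_map *map);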



Best regards
Thomas


I'll let Christian with the details, but at a high level this is
definitely

Acked-by: Daniel Vetter 

Thanks a lot for doing all this.
-Daniel


Signed-off-by: Thomas Zimmermann 
---
   drivers/gpu/drm/drm_gem_ttm_helper.c | 38 +++
   drivers/gpu/drm/ttm/ttm_bo_util.c| 72 
   include/drm/drm_gem_ttm_helper.h |  6 +++
   include/drm/ttm/ttm_bo_api.h | 28 +++
   include/linux/dma-buf-map.h  | 20 
   5 files changed, 164 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c
b/drivers/gpu/drm/drm_gem_ttm_helper.c index 0e4fb9ba43ad..db4c14d78a30
100644 --- a/drivers/gpu/drm/drm_gem_ttm_helper.c
+++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
@@ -49,6 +49,44 @@ void drm_gem_ttm_print_info(struct drm_printer *p,
unsigned int indent, }
   EXPORT_SYMBOL(drm_gem_ttm_print_info);
+/**
+ * drm_gem_ttm_vmap() - vmap &ttm_buffer_object
+ * @gem: GEM object.
+ * @map: [out] returns the dma-buf mapping.
+ *
+ * Maps a GEM object with ttm_bo_vmap(). This function can be used as
+ * &drm_gem_object_funcs.vmap callback.
+ *
+ * Returns:
+ * 0 on success, or a negative errno code otherwise.
+ */
+int drm_gem_ttm_vmap(struct drm_gem_object *gem,
+struct dma_buf_map *map)
+{
+   struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
+   return ttm_bo_vmap(bo, map);
+
+}
+EXPORT_SYMBOL(drm_gem_ttm_vmap);
+
+/**
+ * drm_gem_ttm_vunmap() - vunmap &ttm_buffer_object
+ * @gem: GEM object.
+ * @map: dma-buf mapping.
+ *
+ * Unmaps a GEM object with ttm_bo_vunmap(). This function can be used
as
+ * &drm_gem_object_funcs.vmap callback.
+ */
+void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
+   struct dma_buf_map *map)
+{
+   struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
+   ttm_bo_vunmap(bo, map);
+}
+EXPORT_SYMBOL(drm_gem_ttm_vunmap);
+
   /**
* drm_gem_ttm_mmap() - mmap &ttm_buffer_object
* @gem: GEM object.
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
b/drivers/gpu/drm/ttm/ttm_bo_util.c index bdee4df1f3f2..80c42c774c7d
100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -32,6 +32,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -526,6 +527,77 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map)
   }
   EXPORT_SYMBOL(ttm_bo_kunmap);
+int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)
+{
+   struct ttm_resource *mem = &bo->mem;
+   int ret;
+
+   ret = ttm_mem_io_reserve(bo->bdev, mem);
+   if (ret)
+   return ret;
+
+   if (mem->bus.is_iomem) {
+   void __iomem *vaddr_iomem;
+   unsigned long size = bo->num_pages << PAGE_SHIFT;

Please use uint64_t here and make sure to cast bo->num_pages before
shifting.

We have a unit test of allocating an 8GB BO and that should work on a
32bit machine as well :)


+
+   if (mem->bus.addr)
+   vaddr_iomem = (void *)(((u8 *)mem->bus.addr));
+   else if (mem->placement & TTM_PL_FLAG_WC)

I've just nuked the TTM_PL_FLAG_WC flag in drm-misc-next. There is a new
mem->bus.caching enum as replacement.


+   vaddr_iomem = ioremap_wc(mem->bus.offset, size);
+   else

Re: [PATCH v4 06/10] drm/gem: Use struct dma_buf_map in GEM vmap ops and convert GEM backends

2020-10-15 Thread Christian König

On 15.10.20 14:38, Thomas Zimmermann wrote:

This patch replaces the vmap/vunmap's use of raw pointers in GEM object
functions with instances of struct dma_buf_map. GEM backends are
converted as well. For most of them, this simply changes the returned type.

TTM-based drivers now return information about the location of the memory,
either system or I/O memory. GEM VRAM helpers and qxl now use ttm_bo_vmap()
et al. Amdgpu, nouveau and radeon use drm_gem_ttm_vmap() et al instead of
implementing their own vmap callbacks.

v4:
* use ttm_bo_vmap(), drm_gem_ttm_vmap(), et al. (Daniel, Christian)
* fix a trailing { in drm_gem_vmap()
* remove several empty functions instead of converting them (Daniel)
* comment uses of raw pointers with a TODO (Daniel)
* TODO list: convert more helpers to use struct dma_buf_map

Signed-off-by: Thomas Zimmermann 


The amdgpu changes look good to me, but I can't fully judge the other stuff.

Acked-by: Christian König 


---
  Documentation/gpu/todo.rst  |  18 
  drivers/gpu/drm/Kconfig |   2 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  36 ---
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h |   2 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c |   5 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  |   1 -
  drivers/gpu/drm/ast/ast_cursor.c|  27 +++--
  drivers/gpu/drm/ast/ast_drv.h   |   7 +-
  drivers/gpu/drm/drm_gem.c   |  23 +++--
  drivers/gpu/drm/drm_gem_cma_helper.c|  10 +-
  drivers/gpu/drm/drm_gem_shmem_helper.c  |  48 +
  drivers/gpu/drm/drm_gem_vram_helper.c   | 107 ++--
  drivers/gpu/drm/etnaviv/etnaviv_drv.h   |   2 +-
  drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c |   9 +-
  drivers/gpu/drm/lima/lima_gem.c |   6 +-
  drivers/gpu/drm/lima/lima_sched.c   |  11 +-
  drivers/gpu/drm/mgag200/mgag200_mode.c  |  10 +-
  drivers/gpu/drm/nouveau/Kconfig |   1 +
  drivers/gpu/drm/nouveau/nouveau_bo.h|   2 -
  drivers/gpu/drm/nouveau/nouveau_gem.c   |   6 +-
  drivers/gpu/drm/nouveau/nouveau_gem.h   |   2 -
  drivers/gpu/drm/nouveau/nouveau_prime.c |  20 
  drivers/gpu/drm/panfrost/panfrost_perfcnt.c |  14 +--
  drivers/gpu/drm/qxl/qxl_display.c   |  11 +-
  drivers/gpu/drm/qxl/qxl_draw.c  |  14 ++-
  drivers/gpu/drm/qxl/qxl_drv.h   |  11 +-
  drivers/gpu/drm/qxl/qxl_object.c|  31 +++---
  drivers/gpu/drm/qxl/qxl_object.h|   2 +-
  drivers/gpu/drm/qxl/qxl_prime.c |  12 +--
  drivers/gpu/drm/radeon/radeon.h |   1 -
  drivers/gpu/drm/radeon/radeon_gem.c |   7 +-
  drivers/gpu/drm/radeon/radeon_prime.c   |  20 
  drivers/gpu/drm/rockchip/rockchip_drm_gem.c |  22 ++--
  drivers/gpu/drm/rockchip/rockchip_drm_gem.h |   4 +-
  drivers/gpu/drm/tiny/cirrus.c   |  10 +-
  drivers/gpu/drm/tiny/gm12u320.c |  10 +-
  drivers/gpu/drm/udl/udl_modeset.c   |   8 +-
  drivers/gpu/drm/vboxvideo/vbox_mode.c   |  11 +-
  drivers/gpu/drm/vc4/vc4_bo.c|   6 +-
  drivers/gpu/drm/vc4/vc4_drv.h   |   2 +-
  drivers/gpu/drm/vgem/vgem_drv.c |  16 ++-
  drivers/gpu/drm/xen/xen_drm_front_gem.c |  18 ++--
  drivers/gpu/drm/xen/xen_drm_front_gem.h |   6 +-
  include/drm/drm_gem.h   |   5 +-
  include/drm/drm_gem_cma_helper.h|   2 +-
  include/drm/drm_gem_shmem_helper.h  |   4 +-
  include/drm/drm_gem_vram_helper.h   |  14 +--
  47 files changed, 321 insertions(+), 295 deletions(-)

diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 700637e25ecd..7e6fc3c04add 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -446,6 +446,24 @@ Contact: Ville Syrjälä, Daniel Vetter
  
  Level: Intermediate
  
+Use struct dma_buf_map throughout codebase

+--
+
+Pointers to shared device memory are stored in struct dma_buf_map. Each
+instance knows whether it refers to system or I/O memory. Most of the DRM-wide
+interface have been converted to use struct dma_buf_map, but implementations
+often still use raw pointers.
+
+The task is to use struct dma_buf_map where it makes sense.
+
+* Memory managers should use struct dma_buf_map for dma-buf-imported buffers.
+* TTM might benefit from using struct dma_buf_map internally.
+* Framebuffer copying and blitting helpers should operate on struct dma_buf_map.
+
+Contact: Thomas Zimmermann , Christian König, Daniel Vetter
+
+Level: Intermediate
+
  
  Core refactorings

  =
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 147d61b9674e..319839b87d37 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -239,6 +239,7 @@ config DRM_RADEON
select FW_LOADER
  select DRM_

Re: [PATCH v4 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers

2020-10-15 Thread Christian König

On 15.10.20 14:38, Thomas Zimmermann wrote:

The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in kernel
address space. The mapping's address is returned as struct dma_buf_map.
Each function is a simplified version of TTM's existing kmap code. Both
functions respect the memory's location and/or writecombine flags.

On top of TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(),
two helpers that convert a GEM object into the TTM BO and forward the call
to TTM's vmap/vunmap. These helpers can be dropped into the respective GEM
object callbacks.

v4:
* drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers (Daniel,
  Christian)


Bunch of minor comments below, but overall this looks very solid to me.



Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/drm_gem_ttm_helper.c | 38 +++
  drivers/gpu/drm/ttm/ttm_bo_util.c| 72 
  include/drm/drm_gem_ttm_helper.h |  6 +++
  include/drm/ttm/ttm_bo_api.h | 28 +++
  include/linux/dma-buf-map.h  | 20 
  5 files changed, 164 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c 
b/drivers/gpu/drm/drm_gem_ttm_helper.c
index 0e4fb9ba43ad..db4c14d78a30 100644
--- a/drivers/gpu/drm/drm_gem_ttm_helper.c
+++ b/drivers/gpu/drm/drm_gem_ttm_helper.c
@@ -49,6 +49,44 @@ void drm_gem_ttm_print_info(struct drm_printer *p, unsigned 
int indent,
  }
  EXPORT_SYMBOL(drm_gem_ttm_print_info);
  
+/**

+ * drm_gem_ttm_vmap() - vmap &ttm_buffer_object
+ * @gem: GEM object.
+ * @map: [out] returns the dma-buf mapping.
+ *
+ * Maps a GEM object with ttm_bo_vmap(). This function can be used as
+ * &drm_gem_object_funcs.vmap callback.
+ *
+ * Returns:
+ * 0 on success, or a negative errno code otherwise.
+ */
+int drm_gem_ttm_vmap(struct drm_gem_object *gem,
+struct dma_buf_map *map)
+{
+   struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
+   return ttm_bo_vmap(bo, map);
+
+}
+EXPORT_SYMBOL(drm_gem_ttm_vmap);
+
+/**
+ * drm_gem_ttm_vunmap() - vunmap &ttm_buffer_object
+ * @gem: GEM object.
+ * @map: dma-buf mapping.
+ *
+ * Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as
+ * &drm_gem_object_funcs.vmap callback.
+ */
+void drm_gem_ttm_vunmap(struct drm_gem_object *gem,
+   struct dma_buf_map *map)
+{
+   struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem);
+
+   ttm_bo_vunmap(bo, map);
+}
+EXPORT_SYMBOL(drm_gem_ttm_vunmap);
+
  /**
   * drm_gem_ttm_mmap() - mmap &ttm_buffer_object
   * @gem: GEM object.
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c 
b/drivers/gpu/drm/ttm/ttm_bo_util.c
index bdee4df1f3f2..80c42c774c7d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -32,6 +32,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -526,6 +527,77 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map)
  }
  EXPORT_SYMBOL(ttm_bo_kunmap);
  
+int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map)

+{
+   struct ttm_resource *mem = &bo->mem;
+   int ret;
+
+   ret = ttm_mem_io_reserve(bo->bdev, mem);
+   if (ret)
+   return ret;
+
+   if (mem->bus.is_iomem) {
+   void __iomem *vaddr_iomem;
+   unsigned long size = bo->num_pages << PAGE_SHIFT;


Please use uint64_t here and make sure to cast bo->num_pages before 
shifting.


We have a unit test of allocating an 8GB BO and that should work on a
32bit machine as well :)
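
To illustrate the overflow (a minimal example, not part of the patch):

/* bo->num_pages << PAGE_SHIFT is evaluated as unsigned long, which is
 * 32 bits on a 32bit machine, so the size wraps for BOs >= 4GB. The
 * 8GB unit-test BO has 0x200000 pages:
 *   0x200000UL << 12 == 0x200000000 -> truncated to 0 in 32 bits.
 * Widening before the shift keeps the full value: */
uint64_t size = (uint64_t)bo->num_pages << PAGE_SHIFT;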



+
+   if (mem->bus.addr)
+   vaddr_iomem = (void *)(((u8 *)mem->bus.addr));
+   else if (mem->placement & TTM_PL_FLAG_WC)


I've just nuked the TTM_PL_FLAG_WC flag in drm-misc-next. There is a new 
mem->bus.caching enum as replacement.
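
With the enum, the test above becomes something like this (the v5 patch
earlier in this digest uses exactly this form):

/* Replacement for the removed flag test, using the new caching enum: */
if (mem->bus.addr)
	vaddr_iomem = (void __iomem *)mem->bus.addr;
else if (mem->bus.caching == ttm_write_combined)
	vaddr_iomem = ioremap_wc(mem->bus.offset, size);
else
	vaddr_iomem = ioremap(mem->bus.offset, size);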



+   vaddr_iomem = ioremap_wc(mem->bus.offset, size);
+   else
+   vaddr_iomem = ioremap(mem->bus.offset, size);
+
+   if (!vaddr_iomem)
+   return -ENOMEM;
+
+   dma_buf_map_set_vaddr_iomem(map, vaddr_iomem);
+
+   } else {
+   struct ttm_operation_ctx ctx = {
+   .interruptible = false,
+   .no_wait_gpu = false
+   };
+   struct ttm_tt *ttm = bo->ttm;
+   pgprot_t prot;
+   void *vaddr;
+
+   BUG_ON(!ttm);


I think we can drop this, populate will just crash badly anyway.


+
+   ret = ttm_tt_populate(bo->bdev, ttm, &ctx);
+   if (ret)
+   return ret;
+
+   /*
+* We need to use vmap to get the desired page protection
+* or to make the buffer object look contiguous.
+*/
+   prot = ttm_io_prot(mem->placement, PAGE_KERNEL);


The calling convention has changed on drm-misc-next as well, but should 
be trivial to adapt.


Regards

Re: [PATCH v4 04/10] drm/exynos: Remove empty exynos_drm_gem_prime_{vmap,vunmap}()

2020-10-15 Thread Christian König

On 15.10.20 14:38, Thomas Zimmermann wrote:

The functions exynos_drm_gem_prime_{vmap,vunmap}() are empty. Remove
them before changing the interface to use struct drm_buf_map. As a side
effect of removing drm_gem_prime_vmap(), the error code changes from
ENOMEM to EOPNOTSUPP.

Signed-off-by: Thomas Zimmermann 


Acked-by: Christian König 


---
  drivers/gpu/drm/exynos/exynos_drm_gem.c | 12 
  drivers/gpu/drm/exynos/exynos_drm_gem.h |  2 --
  2 files changed, 14 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c 
b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index e7a6eb96f692..13a35623ac04 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -137,8 +137,6 @@ static const struct vm_operations_struct 
exynos_drm_gem_vm_ops = {
  static const struct drm_gem_object_funcs exynos_drm_gem_object_funcs = {
.free = exynos_drm_gem_free_object,
.get_sg_table = exynos_drm_gem_prime_get_sg_table,
-   .vmap = exynos_drm_gem_prime_vmap,
-   .vunmap = exynos_drm_gem_prime_vunmap,
.vm_ops = &exynos_drm_gem_vm_ops,
  };
  
@@ -471,16 +469,6 @@ exynos_drm_gem_prime_import_sg_table(struct drm_device *dev,

return &exynos_gem->base;
  }
  
-void *exynos_drm_gem_prime_vmap(struct drm_gem_object *obj)

-{
-   return NULL;
-}
-
-void exynos_drm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
-{
-   /* Nothing to do */
-}
-
  int exynos_drm_gem_prime_mmap(struct drm_gem_object *obj,
  struct vm_area_struct *vma)
  {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h 
b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 74e926abeff0..a23272fb96fb 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -107,8 +107,6 @@ struct drm_gem_object *
  exynos_drm_gem_prime_import_sg_table(struct drm_device *dev,
 struct dma_buf_attachment *attach,
 struct sg_table *sgt);
-void *exynos_drm_gem_prime_vmap(struct drm_gem_object *obj);
-void exynos_drm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  int exynos_drm_gem_prime_mmap(struct drm_gem_object *obj,
  struct vm_area_struct *vma);
  





Re: [PATCH v4 03/10] drm/etnaviv: Remove empty etnaviv_gem_prime_vunmap()

2020-10-15 Thread Christian König

On 15.10.20 14:37, Thomas Zimmermann wrote:

The function etnaviv_gem_prime_vunmap() is empty. Remove it before
changing the interface to use struct dma_buf_map.

Signed-off-by: Thomas Zimmermann 


Acked-by: Christian König 


---
  drivers/gpu/drm/etnaviv/etnaviv_drv.h   | 1 -
  drivers/gpu/drm/etnaviv/etnaviv_gem.c   | 1 -
  drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 5 -
  3 files changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
index 914f0867ff71..9682c26d89bb 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
@@ -52,7 +52,6 @@ int etnaviv_gem_mmap(struct file *filp, struct vm_area_struct 
*vma);
  int etnaviv_gem_mmap_offset(struct drm_gem_object *obj, u64 *offset);
  struct sg_table *etnaviv_gem_prime_get_sg_table(struct drm_gem_object *obj);
  void *etnaviv_gem_prime_vmap(struct drm_gem_object *obj);
-void etnaviv_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  int etnaviv_gem_prime_mmap(struct drm_gem_object *obj,
   struct vm_area_struct *vma);
  struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device 
*dev,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index 67d9a2b9ea6a..bbd235473645 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -571,7 +571,6 @@ static const struct drm_gem_object_funcs 
etnaviv_gem_object_funcs = {
.unpin = etnaviv_gem_prime_unpin,
.get_sg_table = etnaviv_gem_prime_get_sg_table,
.vmap = etnaviv_gem_prime_vmap,
-   .vunmap = etnaviv_gem_prime_vunmap,
.vm_ops = &vm_ops,
  };
  
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c

index 135fbff6fecf..a6d9932a32ae 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -27,11 +27,6 @@ void *etnaviv_gem_prime_vmap(struct drm_gem_object *obj)
return etnaviv_gem_vmap(obj);
  }
  
-void etnaviv_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)

-{
-   /* TODO msm_gem_vunmap() */
-}
-
  int etnaviv_gem_prime_mmap(struct drm_gem_object *obj,
   struct vm_area_struct *vma)
  {
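
The unmap side is even simpler. Roughly (a sketch, not the exact core code),
the dispatch treats a missing .vunmap as a no-op, which is why an empty
driver hook can be deleted without replacement:

#include <drm/drm_gem.h>

static void gem_vunmap_sketch(struct drm_gem_object *obj, void *vaddr)
{
	if (obj->funcs->vunmap)
		obj->funcs->vunmap(obj, vaddr);
	/* else: nothing to do -- exactly what the removed hook did */
}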





Re: [PATCH v4 02/10] drm/cma-helper: Remove empty drm_gem_cma_prime_vunmap()

2020-10-15 Thread Christian König

On 15.10.20 at 14:37, Thomas Zimmermann wrote:

The function drm_gem_cma_prime_vunmap() is empty. Remove it before
changing the interface to use struct dma_buf_map.

Signed-off-by: Thomas Zimmermann 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/drm_gem_cma_helper.c | 17 -
  drivers/gpu/drm/vc4/vc4_bo.c |  1 -
  include/drm/drm_gem_cma_helper.h |  1 -
  3 files changed, 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c
index 2165633c9b9e..d527485ea0b7 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -537,23 +537,6 @@ void *drm_gem_cma_prime_vmap(struct drm_gem_object *obj)
  }
  EXPORT_SYMBOL_GPL(drm_gem_cma_prime_vmap);
  
-/**

- * drm_gem_cma_prime_vunmap - unmap a CMA GEM object from the kernel's virtual
- * address space
- * @obj: GEM object
- * @vaddr: kernel virtual address where the CMA GEM object was mapped
- *
- * This function removes a buffer exported via DRM PRIME from the kernel's
- * virtual address space. This is a no-op because CMA buffers cannot be
- * unmapped from kernel space. Drivers using the CMA helpers should set this
- * as their &drm_gem_object_funcs.vunmap callback.
- */
-void drm_gem_cma_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
-{
-   /* Nothing to do */
-}
-EXPORT_SYMBOL_GPL(drm_gem_cma_prime_vunmap);
-
  static const struct drm_gem_object_funcs drm_gem_cma_default_funcs = {
.free = drm_gem_cma_free_object,
.print_info = drm_gem_cma_print_info,
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index f432278173cd..557f0d1e6437 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -387,7 +387,6 @@ static const struct drm_gem_object_funcs 
vc4_gem_object_funcs = {
.export = vc4_prime_export,
.get_sg_table = drm_gem_cma_prime_get_sg_table,
.vmap = vc4_prime_vmap,
-   .vunmap = drm_gem_cma_prime_vunmap,
.vm_ops = &vc4_vm_ops,
  };
  
diff --git a/include/drm/drm_gem_cma_helper.h b/include/drm/drm_gem_cma_helper.h

index 2bfa2502607a..a064b0d1c480 100644
--- a/include/drm/drm_gem_cma_helper.h
+++ b/include/drm/drm_gem_cma_helper.h
@@ -104,7 +104,6 @@ drm_gem_cma_prime_import_sg_table(struct drm_device *dev,
  int drm_gem_cma_prime_mmap(struct drm_gem_object *obj,
   struct vm_area_struct *vma);
  void *drm_gem_cma_prime_vmap(struct drm_gem_object *obj);
-void drm_gem_cma_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  
  struct drm_gem_object *

  drm_gem_cma_create_object_default_funcs(struct drm_device *dev, size_t size);
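
For background on why CMA can drop vunmap entirely: the kernel mapping is
created once at allocation time and lives as long as the buffer. A rough
sketch of that lifecycle, with struct and function names simplified here
rather than taken from the real helpers:

#include <linux/dma-mapping.h>

struct cma_obj_sketch {		/* stand-in for struct drm_gem_cma_object */
	void *vaddr;		/* permanent kernel mapping from dma_alloc_wc() */
	dma_addr_t paddr;
	size_t size;
};

/* "vmap" merely hands out the existing mapping... */
static void *cma_vmap_sketch(struct cma_obj_sketch *obj)
{
	return obj->vaddr;
}

/* ...and the mapping only disappears when the buffer itself is freed. */
static void cma_free_sketch(struct device *dev, struct cma_obj_sketch *obj)
{
	dma_free_wc(dev, obj->size, obj->vaddr, obj->paddr);
}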





Re: [PATCH v4 01/10] drm/vram-helper: Remove invariant parameters from internal kmap function

2020-10-15 Thread Christian König

On 15.10.20 at 14:37, Thomas Zimmermann wrote:

The parameters map and is_iomem always have the same values. Remove them
to prepare the function for conversion to struct dma_buf_map.

v4:
* don't check for !kmap->virtual; will always be false

Signed-off-by: Thomas Zimmermann 
Reviewed-by: Daniel Vetter 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/drm_gem_vram_helper.c | 18 --
  1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c
index 3213429f8444..2d5ed30518f1 100644
--- a/drivers/gpu/drm/drm_gem_vram_helper.c
+++ b/drivers/gpu/drm/drm_gem_vram_helper.c
@@ -382,32 +382,22 @@ int drm_gem_vram_unpin(struct drm_gem_vram_object *gbo)
  }
  EXPORT_SYMBOL(drm_gem_vram_unpin);
  
-static void *drm_gem_vram_kmap_locked(struct drm_gem_vram_object *gbo,

- bool map, bool *is_iomem)
+static void *drm_gem_vram_kmap_locked(struct drm_gem_vram_object *gbo)
  {
int ret;
struct ttm_bo_kmap_obj *kmap = &gbo->kmap;
+   bool is_iomem;
  
  	if (gbo->kmap_use_count > 0)

goto out;
  
-	if (kmap->virtual || !map)

-   goto out;
-
ret = ttm_bo_kmap(&gbo->bo, 0, gbo->bo.num_pages, kmap);
if (ret)
return ERR_PTR(ret);
  
  out:

-   if (!kmap->virtual) {
-   if (is_iomem)
-   *is_iomem = false;
-   return NULL; /* not mapped; don't increment ref */
-   }
++gbo->kmap_use_count;
-   if (is_iomem)
-   return ttm_kmap_obj_virtual(kmap, is_iomem);
-   return kmap->virtual;
+   return ttm_kmap_obj_virtual(kmap, &is_iomem);
  }
  
  static void drm_gem_vram_kunmap_locked(struct drm_gem_vram_object *gbo)

@@ -452,7 +442,7 @@ void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo)
ret = drm_gem_vram_pin_locked(gbo, 0);
if (ret)
goto err_ttm_bo_unreserve;
-   base = drm_gem_vram_kmap_locked(gbo, true, NULL);
+   base = drm_gem_vram_kmap_locked(gbo);
if (IS_ERR(base)) {
ret = PTR_ERR(base);
goto err_drm_gem_vram_unpin_locked;
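
For readers following the hunk above, the surrounding use count works roughly
like this (a condensed sketch with a simplified stand-in struct, not the full
helper): the first mapper creates the TTM mapping, later mappers only bump
the count, and the mapping stays cached until the last user unmaps.

#include <linux/err.h>
#include <drm/ttm/ttm_bo_api.h>

struct vram_bo_sketch {		/* stand-in for struct drm_gem_vram_object */
	struct ttm_buffer_object bo;
	struct ttm_bo_kmap_obj kmap;
	unsigned int kmap_use_count;
};

/* The caller holds the BO reservation, as in the real code. */
static void *vram_kmap_locked_sketch(struct vram_bo_sketch *gbo)
{
	bool is_iomem;

	if (gbo->kmap_use_count == 0) {
		int ret = ttm_bo_kmap(&gbo->bo, 0, gbo->bo.num_pages, &gbo->kmap);

		if (ret)
			return ERR_PTR(ret);
	}
	++gbo->kmap_use_count;
	return ttm_kmap_obj_virtual(&gbo->kmap, &is_iomem);
}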





Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion

2020-10-07 Thread Christian König

On 07.10.20 at 15:20, Thomas Zimmermann wrote:

Hi

On 07.10.20 at 15:10, Daniel Vetter wrote:

On Wed, Oct 7, 2020 at 2:57 PM Thomas Zimmermann  wrote:

Hi

On 02.10.20 at 11:58, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 02:51:46PM +0200, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 2:34 PM Christian König
 wrote:

On 30.09.20 at 11:47, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 10:34:31AM +0200, Christian König wrote:

On 30.09.20 at 10:19, Thomas Zimmermann wrote:

Hi

On 30.09.20 at 10:05, Christian König wrote:

On 29.09.20 at 19:49, Thomas Zimmermann wrote:

Hi Christian

On 29.09.20 at 17:35, Christian König wrote:

On 29.09.20 at 17:14, Thomas Zimmermann wrote:

The new helper ttm_kmap_obj_to_dma_buf_map() extracts address and location
from an instance of TTM's kmap_obj and initializes struct dma_buf_map
with these values. Helpful for TTM-based drivers.

We could completely drop that if we use the same structure inside TTM as
well.

In addition to that, which driver is going to use this?

As Daniel mentioned, it's in patch 3. The TTM-based drivers will
retrieve the pointer via this function.

I do want to see all that being more tightly integrated into TTM, but
not in this series. This one is about fixing the bochs-on-sparc64
problem for good. Patch 7 adds an update to TTM to the DRM TODO list.

I should have asked which driver you try to fix here :)

In this case just keep the function inside bochs and only fix it there.

All other drivers can be fixed when we generally pump this through TTM.

Did you take a look at patch 3? This function will be used by VRAM
helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we
have to duplicate the functionality in each of these drivers. Bochs
itself uses VRAM helpers and doesn't touch the function directly.

Ah, ok can we have that then only in the VRAM helpers?

Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj
directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK.

What I want to avoid is to have another conversion function in TTM because
what happens here is that we already convert from ttm_bus_placement to
ttm_bo_kmap_obj and then to dma_buf_map.

Hm I'm not really seeing how that helps with a gradual conversion of
everything over to dma_buf_map and assorted helpers for access? There's
too many places in ttm drivers where is_iomem and related stuff is used to
be able to convert it all in one go. An intermediate state with a bunch of
conversions seems fairly unavoidable to me.

Fair enough. I would just have started bottom up and not top down.

Anyway feel free to go ahead with this approach as long as we can remove
the new function again when we clean that stuff up for good.

Yeah I guess bottom up would make more sense as a refactoring. But the
main motivation to land this here is to fix the __mmio vs normal
memory confusion in the fbdev emulation helpers for sparc (and
anything else that needs this). Hence the top down approach for
rolling this out.

Ok I started reviewing this a bit more in-depth, and I think this is a bit
too much of a detour.

Looking through all the callers of ttm_bo_kmap almost everyone maps the
entire object. Only vmwgfx maps less than that. Also, everyone just
immediately follows up with converting that full object map into a
pointer.

So I think what we really want here is:
- new function

int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map);

   _vmap name since that's consistent with both dma_buf functions and
   what's usually used to implement this. Outside of the ttm world kmap
   usually just means single-page mappings using kmap() or its iomem
   sibling io_mapping_map*, so it's a rather confusing name for a function which
   usually is just used to set up a vmap of the entire buffer.

- a helper which can be used for the drm_gem_object_funcs vmap/vunmap
   functions for all ttm drivers. We should be able to make this fully
   generic because a) we now have dma_buf_map and b) drm_gem_object is
   embedded in the ttm_bo, so we can upcast for everyone who's both a ttm
   and gem driver.

   This is maybe a good follow-up, since it should allow us to ditch quite
   a bit of the vram helper code for this more generic stuff. I also might
   have missed some special-cases here, but from a quick look everything
   just pins the buffer to the current location and that's it.

   Also this obviously requires Christian's generic ttm_bo_pin rework
   first.

- roll the above out to drivers.

Christian/Thomas, thoughts on this?

I agree on the goals, but what is the immediate objective here?

Adding ttm_bo_vmap() does not work out easily, as struct ttm_bo_kmap_obj
is a central part of the internals of TTM. struct ttm_bo_kmap_obj has
more internal state than struct dma_buf_map, so they are not easily
convertible either. What you propose seems to require a reimplementation
of the existing ttm_bo_kmap() c
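
To make the second bullet point above concrete: a shared
drm_gem_object_funcs.vmap for TTM-based drivers could look roughly like this.
Hypothetical code, assuming the proposed ttm_bo_vmap() exists and that the
driver embeds its GEM object in the TTM BO as "base", as amdgpu, radeon and
nouveau do:

#include <drm/drm_gem.h>
#include <drm/ttm/ttm_bo_api.h>
#include <linux/dma-buf-map.h>

static int drm_gem_ttm_vmap_sketch(struct drm_gem_object *gem,
				   struct dma_buf_map *map)
{
	/* The upcast works for every driver that is both TTM and GEM. */
	struct ttm_buffer_object *bo =
		container_of(gem, struct ttm_buffer_object, base);

	return ttm_bo_vmap(bo, map);	/* the new function proposed above */
}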

Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion

2020-10-02 Thread Christian König

On 02.10.20 at 11:58, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 02:51:46PM +0200, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 2:34 PM Christian König
 wrote:

On 30.09.20 at 11:47, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 10:34:31AM +0200, Christian König wrote:

On 30.09.20 at 10:19, Thomas Zimmermann wrote:

Hi

On 30.09.20 at 10:05, Christian König wrote:

On 29.09.20 at 19:49, Thomas Zimmermann wrote:

Hi Christian

On 29.09.20 at 17:35, Christian König wrote:

On 29.09.20 at 17:14, Thomas Zimmermann wrote:

The new helper ttm_kmap_obj_to_dma_buf_map() extracts address and location
from an instance of TTM's kmap_obj and initializes struct dma_buf_map
with these values. Helpful for TTM-based drivers.

We could completely drop that if we use the same structure inside TTM as
well.

In addition to that, which driver is going to use this?

As Daniel mentioned, it's in patch 3. The TTM-based drivers will
retrieve the pointer via this function.

I do want to see all that being more tightly integrated into TTM, but
not in this series. This one is about fixing the bochs-on-sparc64
problem for good. Patch 7 adds an update to TTM to the DRM TODO list.

I should have asked which driver you try to fix here :)

In this case just keep the function inside bochs and only fix it there.

All other drivers can be fixed when we generally pump this through TTM.

Did you take a look at patch 3? This function will be used by VRAM
helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we
have to duplicate the functionality in each of these drivers. Bochs
itself uses VRAM helpers and doesn't touch the function directly.

Ah, ok can we have that then only in the VRAM helpers?

Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj
directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK.

What I want to avoid is to have another conversion function in TTM because
what happens here is that we already convert from ttm_bus_placement to
ttm_bo_kmap_obj and then to dma_buf_map.

Hm I'm not really seeing how that helps with a gradual conversion of
everything over to dma_buf_map and assorted helpers for access? There's
too many places in ttm drivers where is_iomem and related stuff is used to
be able to convert it all in one go. An intermediate state with a bunch of
conversions seems fairly unavoidable to me.

Fair enough. I would just have started bottom up and not top down.

Anyway feel free to go ahead with this approach as long as we can remove
the new function again when we clean that stuff up for good.

Yeah I guess bottom up would make more sense as a refactoring. But the
main motivation to land this here is to fix the __mmio vs normal
memory confusion in the fbdev emulation helpers for sparc (and
anything else that needs this). Hence the top down approach for
rolling this out.

Ok I started reviewing this a bit more in-depth, and I think this is a bit
too much of a detour.

Looking through all the callers of ttm_bo_kmap almost everyone maps the
entire object. Only vmwgfx maps less than that. Also, everyone just
immediately follows up with converting that full object map into a
pointer.

So I think what we really want here is:
- new function

int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map);

   _vmap name since that's consistent with both dma_buf functions and
   what's usually used to implement this. Outside of the ttm world kmap
   usually just means single-page mappings using kmap() or its iomem
   sibling io_mapping_map*, so it's a rather confusing name for a function which
   usually is just used to set up a vmap of the entire buffer.

- a helper which can be used for the drm_gem_object_funcs vmap/vunmap
   functions for all ttm drivers. We should be able to make this fully
   generic because a) we now have dma_buf_map and b) drm_gem_object is
   embedded in the ttm_bo, so we can upcast for everyone who's both a ttm
   and gem driver.

   This is maybe a good follow-up, since it should allow us to ditch quite
   a bit of the vram helper code for this more generic stuff. I also might
   have missed some special-cases here, but from a quick look everything
   just pins the buffer to the current location and that's it.

   Also this obviously requires Christian's generic ttm_bo_pin rework
   first.

- roll the above out to drivers.

Christian/Thomas, thoughts on this?


Calling this vmap instead of kmap certainly makes sense.

Not 100% sure about the generic helpers, but it sounds like this should 
indeed look rather clean in the end.


Christian.



I think for the immediate need of rolling this out for vram helpers and
fbdev code we should be able to do this, but just postpone the driver wide
roll-out for now.

Cheers, Daniel


-Daniel


Christian.


-Daniel


Thanks,
Christian.


Best regards
Thomas


Regards,
Christian.


Best regards
Thomas


Regards,
Christian.


Signed-off-by: T
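
The sparc confusion Daniel mentions comes down to a plain void * not
recording whether memcpy() or memcpy_toio() is required. With the structure
shown in the quoted patch, the fbdev code can branch explicitly. A minimal
sketch of that access pattern; the helper name is invented here:

#include <linux/dma-buf-map.h>
#include <linux/io.h>
#include <linux/string.h>

static void fbdev_blit_sketch(struct dma_buf_map *dst,
			      const void *src, size_t len)
{
	if (dst->is_iomem)
		memcpy_toio(dst->vaddr_iomem, src, len);	/* I/O memory */
	else
		memcpy(dst->vaddr, src, len);			/* system RAM */
}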

Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion

2020-09-30 Thread Christian König

On 30.09.20 at 11:47, Daniel Vetter wrote:

On Wed, Sep 30, 2020 at 10:34:31AM +0200, Christian König wrote:

On 30.09.20 at 10:19, Thomas Zimmermann wrote:

Hi

On 30.09.20 at 10:05, Christian König wrote:

On 29.09.20 at 19:49, Thomas Zimmermann wrote:

Hi Christian

On 29.09.20 at 17:35, Christian König wrote:

On 29.09.20 at 17:14, Thomas Zimmermann wrote:

The new helper ttm_kmap_obj_to_dma_buf_map() extracts address and location
from an instance of TTM's kmap_obj and initializes struct dma_buf_map
with these values. Helpful for TTM-based drivers.

We could completely drop that if we use the same structure inside TTM as
well.

In addition to that, which driver is going to use this?

As Daniel mentioned, it's in patch 3. The TTM-based drivers will
retrieve the pointer via this function.

I do want to see all that being more tightly integrated into TTM, but
not in this series. This one is about fixing the bochs-on-sparc64
problem for good. Patch 7 adds an update to TTM to the DRM TODO list.

I should have asked which driver you try to fix here :)

In this case just keep the function inside bochs and only fix it there.

All other drivers can be fixed when we generally pump this through TTM.

Did you take a look at patch 3? This function will be used by VRAM
helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we
have to duplicate the functionality in each of these drivers. Bochs
itself uses VRAM helpers and doesn't touch the function directly.

Ah, ok can we have that then only in the VRAM helpers?

Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj
directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK.

What I want to avoid is to have another conversion function in TTM because
what happens here is that we already convert from ttm_bus_placement to
ttm_bo_kmap_obj and then to dma_buf_map.

Hm I'm not really seeing how that helps with a gradual conversion of
everything over to dma_buf_map and assorted helpers for access? There's
too many places in ttm drivers where is_iomem and related stuff is used to
be able to convert it all in one go. An intermediate state with a bunch of
conversions seems fairly unavoidable to me.


Fair enough. I would just have started bottom up and not top down.

Anyway feel free to go ahead with this approach as long as we can remove 
the new function again when we clean that stuff up for good.


Christian.


-Daniel


Thanks,
Christian.


Best regards
Thomas


Regards,
Christian.


Best regards
Thomas


Regards,
Christian.


Signed-off-by: Thomas Zimmermann 
---
    include/drm/ttm/ttm_bo_api.h | 24 
    include/linux/dma-buf-map.h  | 20 
    2 files changed, 44 insertions(+)

diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index c96a25d571c8..62d89f05a801 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -34,6 +34,7 @@
    #include 
    #include 
    #include 
+#include 
    #include 
    #include 
    #include 
@@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct
ttm_bo_kmap_obj *map,
    return map->virtual;
    }
    +/**
+ * ttm_kmap_obj_to_dma_buf_map
+ *
+ * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap.
+ * @map: Returns the mapping as struct dma_buf_map
+ *
+ * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory
+ * is not mapped, the returned mapping is initialized to NULL.
+ */
+static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj
*kmap,
+   struct dma_buf_map *map)
+{
+    bool is_iomem;
+    void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem);
+
+    if (!vaddr)
+    dma_buf_map_clear(map);
+    else if (is_iomem)
+    dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr);
+    else
+    dma_buf_map_set_vaddr(map, vaddr);
+}
+
    /**
     * ttm_bo_kmap
     *
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index fd1aba545fdf..2e8bbecb5091 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -45,6 +45,12 @@
     *
     *    dma_buf_map_set_vaddr(&map, 0xdeadbeaf);
     *
+ * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem().
+ *
+ * .. code-block:: c
+ *
     *    dma_buf_map_set_vaddr_iomem(&map, 0xdeadbeaf);
+ *
     * Test if a mapping is valid with either dma_buf_map_is_set() or
     * dma_buf_map_is_null().
     *
@@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct
dma_buf_map *map, void *vaddr)
    map->is_iomem = false;
    }
    +/**
+ * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to
an address in I/O memory
+ * @map:    The dma-buf mapping structure
+ * @vaddr_iomem:    An I/O-memory address
+ *
+ * Sets the address and the I/O-memory flag.
+ */
+static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map,
+   voi

Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion

2020-09-30 Thread Christian König

On 30.09.20 at 10:19, Thomas Zimmermann wrote:

Hi

On 30.09.20 at 10:05, Christian König wrote:

On 29.09.20 at 19:49, Thomas Zimmermann wrote:

Hi Christian

On 29.09.20 at 17:35, Christian König wrote:

On 29.09.20 at 17:14, Thomas Zimmermann wrote:

The new helper ttm_kmap_obj_to_dma_buf_map() extracts address and location
from an instance of TTM's kmap_obj and initializes struct dma_buf_map
with these values. Helpful for TTM-based drivers.

We could completely drop that if we use the same structure inside TTM as
well.

In addition to that, which driver is going to use this?

As Daniel mentioned, it's in patch 3. The TTM-based drivers will
retrieve the pointer via this function.

I do want to see all that being more tightly integrated into TTM, but
not in this series. This one is about fixing the bochs-on-sparc64
problem for good. Patch 7 adds an update to TTM to the DRM TODO list.

I should have asked which driver you try to fix here :)

In this case just keep the function inside bochs and only fix it there.

All other drivers can be fixed when we generally pump this through TTM.

Did you take a look at patch 3? This function will be used by VRAM
helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we
have to duplicate the functionality in each of these drivers. Bochs
itself uses VRAM helpers and doesn't touch the function directly.


Ah, ok can we have that then only in the VRAM helpers?

Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj 
directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK.


What I want to avoid is to have another conversion function in TTM 
because what happens here is that we already convert from 
ttm_bus_placement to ttm_bo_kmap_obj and then to dma_buf_map.


Thanks,
Christian.



Best regards
Thomas


Regards,
Christian.


Best regards
Thomas


Regards,
Christian.


Signed-off-by: Thomas Zimmermann 
---
   include/drm/ttm/ttm_bo_api.h | 24 
   include/linux/dma-buf-map.h  | 20 
   2 files changed, 44 insertions(+)

diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index c96a25d571c8..62d89f05a801 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -34,6 +34,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct
ttm_bo_kmap_obj *map,
   return map->virtual;
   }
   +/**
+ * ttm_kmap_obj_to_dma_buf_map
+ *
+ * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap.
+ * @map: Returns the mapping as struct dma_buf_map
+ *
+ * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory
+ * is not mapped, the returned mapping is initialized to NULL.
+ */
+static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj
*kmap,
+   struct dma_buf_map *map)
+{
+    bool is_iomem;
+    void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem);
+
+    if (!vaddr)
+    dma_buf_map_clear(map);
+    else if (is_iomem)
+    dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr);
+    else
+    dma_buf_map_set_vaddr(map, vaddr);
+}
+
   /**
    * ttm_bo_kmap
    *
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index fd1aba545fdf..2e8bbecb5091 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -45,6 +45,12 @@
    *
    *    dma_buf_map_set_vaddr(&map, 0xdeadbeaf);
    *
+ * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem().
+ *
+ * .. code-block:: c
+ *
    *    dma_buf_map_set_vaddr_iomem(&map, 0xdeadbeaf);
+ *
    * Test if a mapping is valid with either dma_buf_map_is_set() or
    * dma_buf_map_is_null().
    *
@@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct
dma_buf_map *map, void *vaddr)
   map->is_iomem = false;
   }
   +/**
+ * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to
an address in I/O memory
+ * @map:    The dma-buf mapping structure
+ * @vaddr_iomem:    An I/O-memory address
+ *
+ * Sets the address and the I/O-memory flag.
+ */
+static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map,
+   void __iomem *vaddr_iomem)
+{
+    map->vaddr_iomem = vaddr_iomem;
+    map->is_iomem = true;
+}
+
   /**
    * dma_buf_map_is_equal - Compares two dma-buf mapping structures
for equality
    * @lhs:    The dma-buf mapping structure


Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion

2020-09-30 Thread Christian König

On 29.09.20 at 19:49, Thomas Zimmermann wrote:

Hi Christian

On 29.09.20 at 17:35, Christian König wrote:

On 29.09.20 at 17:14, Thomas Zimmermann wrote:

The new helper ttm_kmap_obj_to_dma_buf_map() extracts address and location
from an instance of TTM's kmap_obj and initializes struct dma_buf_map
with these values. Helpful for TTM-based drivers.

We could completely drop that if we use the same structure inside TTM as
well.

In addition to that, which driver is going to use this?

As Daniel mentioned, it's in patch 3. The TTM-based drivers will
retrieve the pointer via this function.

I do want to see all that being more tightly integrated into TTM, but
not in this series. This one is about fixing the bochs-on-sparc64
problem for good. Patch 7 adds an update to TTM to the DRM TODO list.


I should have asked which driver you try to fix here :)

In this case just keep the function inside bochs and only fix it there.

All other drivers can be fixed when we generally pump this through TTM.

Regards,
Christian.



Best regards
Thomas


Regards,
Christian.


Signed-off-by: Thomas Zimmermann 
---
   include/drm/ttm/ttm_bo_api.h | 24 
   include/linux/dma-buf-map.h  | 20 
   2 files changed, 44 insertions(+)

diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index c96a25d571c8..62d89f05a801 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -34,6 +34,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct
ttm_bo_kmap_obj *map,
   return map->virtual;
   }
   +/**
+ * ttm_kmap_obj_to_dma_buf_map
+ *
+ * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap.
+ * @map: Returns the mapping as struct dma_buf_map
+ *
+ * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory
+ * is not mapped, the returned mapping is initialized to NULL.
+ */
+static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj
*kmap,
+   struct dma_buf_map *map)
+{
+    bool is_iomem;
+    void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem);
+
+    if (!vaddr)
+    dma_buf_map_clear(map);
+    else if (is_iomem)
+    dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr);
+    else
+    dma_buf_map_set_vaddr(map, vaddr);
+}
+
   /**
    * ttm_bo_kmap
    *
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index fd1aba545fdf..2e8bbecb5091 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -45,6 +45,12 @@
    *
    *    dma_buf_map_set_vaddr(&map, 0xdeadbeaf);
    *
+ * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem().
+ *
+ * .. code-block:: c
+ *
    *    dma_buf_map_set_vaddr_iomem(&map, 0xdeadbeaf);
+ *
    * Test if a mapping is valid with either dma_buf_map_is_set() or
    * dma_buf_map_is_null().
    *
@@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct
dma_buf_map *map, void *vaddr)
   map->is_iomem = false;
   }
   +/**
+ * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to
an address in I/O memory
+ * @map:    The dma-buf mapping structure
+ * @vaddr_iomem:    An I/O-memory address
+ *
+ * Sets the address and the I/O-memory flag.
+ */
+static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map,
+   void __iomem *vaddr_iomem)
+{
+    map->vaddr_iomem = vaddr_iomem;
+    map->is_iomem = true;
+}
+
   /**
    * dma_buf_map_is_equal - Compares two dma-buf mapping structures
for equality
    * @lhs:    The dma-buf mapping structure





Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion

2020-09-29 Thread Christian König

On 29.09.20 at 17:14, Thomas Zimmermann wrote:

The new helper ttm_kmap_obj_to_dma_buf_map() extracts address and location
from an instance of TTM's kmap_obj and initializes struct dma_buf_map
with these values. Helpful for TTM-based drivers.


We could completely drop that if we use the same structure inside TTM as 
well.


In addition to that, which driver is going to use this?

Regards,
Christian.



Signed-off-by: Thomas Zimmermann 
---
  include/drm/ttm/ttm_bo_api.h | 24 
  include/linux/dma-buf-map.h  | 20 
  2 files changed, 44 insertions(+)

diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index c96a25d571c8..62d89f05a801 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -34,6 +34,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct 
ttm_bo_kmap_obj *map,
return map->virtual;
  }
  
+/**

+ * ttm_kmap_obj_to_dma_buf_map
+ *
+ * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap.
+ * @map: Returns the mapping as struct dma_buf_map
+ *
+ * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory
+ * is not mapped, the returned mapping is initialized to NULL.
+ */
+static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj *kmap,
+  struct dma_buf_map *map)
+{
+   bool is_iomem;
+   void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem);
+
+   if (!vaddr)
+   dma_buf_map_clear(map);
+   else if (is_iomem)
+   dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr);
+   else
+   dma_buf_map_set_vaddr(map, vaddr);
+}
+
  /**
   * ttm_bo_kmap
   *
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index fd1aba545fdf..2e8bbecb5091 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -45,6 +45,12 @@
   *
   *dma_buf_map_set_vaddr(&map, 0xdeadbeaf);
   *
+ * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem().
+ *
+ * .. code-block:: c
+ *
+ * dma_buf_map_set_vaddr_iomem(&map, 0xdeadbeaf);
+ *
   * Test if a mapping is valid with either dma_buf_map_is_set() or
   * dma_buf_map_is_null().
   *
@@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct 
dma_buf_map *map, void *vaddr)
map->is_iomem = false;
  }
  
+/**

+ * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to an 
address in I/O memory
+ * @map:   The dma-buf mapping structure
+ * @vaddr_iomem:   An I/O-memory address
+ *
+ * Sets the address and the I/O-memory flag.
+ */
+static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map,
+  void __iomem *vaddr_iomem)
+{
+   map->vaddr_iomem = vaddr_iomem;
+   map->is_iomem = true;
+}
+
  /**
   * dma_buf_map_is_equal - Compares two dma-buf mapping structures for equality
   * @lhs:  The dma-buf mapping structure
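
Putting the two hunks together, the intended usage of the conversion helper
looks roughly like this. A fragment sketch using the names from the patch,
with bo being a pinned ttm_buffer_object and error handling trimmed:

	struct ttm_bo_kmap_obj kmap;
	struct dma_buf_map map;
	int ret;

	ret = ttm_bo_kmap(bo, 0, bo->num_pages, &kmap);
	if (ret)
		return ret;

	ttm_kmap_obj_to_dma_buf_map(&kmap, &map);	/* fills vaddr/vaddr_iomem and is_iomem */
	if (dma_buf_map_is_null(&map))
		return -ENOMEM;				/* not mapped */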





Re: [PATCH v3 00/22] Convert all remaining drivers to GEM object functions

2020-09-23 Thread Christian König
Feel free to add an Acked-by: Christian König  
to all patches which I haven't explicitly reviewed.


I would say we should just push this to drm-misc-next now.

Thanks for the nice cleanup,
Christian.

On 23.09.20 at 12:21, Thomas Zimmermann wrote:

The GEM and PRIME related callbacks in struct drm_driver are deprecated in
favor of GEM object functions in struct drm_gem_object_funcs. This patchset
converts the remaining drivers to object functions and removes most of the
obsolete interfaces.

Version 3 of this patchset mostly fixes drm_gem_prime_handle_to_fd and
updates i.MX's dcss driver. The driver was missing from earlier versions
and still needs review.

Patches #1 to #6, #8 to #17 and #19 to #20 convert DRM drivers to GEM object
functions, one by one. Each patch moves existing callbacks from struct
drm_driver to an instance of struct drm_gem_object_funcs, and sets these
funcs when the GEM object is initialized. The exception is .gem_prime_mmap.
There are different ways in which drivers implement the callback, and moving
it to GEM object functions requires a closer review for each.

Patch #18 fixes virtgpu to use GEM object functions where possible. The
driver recently introduced a function for one of the deprecated callbacks.

Patches #7 and #20 convert i.MX's dcss and xlnx to CMA helper macros. There's
no apparent reason why the drivers do the GEM setup on their own. Using CMA
helper macros adds GEM object functions implicitly.

With most of the GEM and PRIME moved to GEM object functions, related code
in struct drm_driver and in the DRM core/helpers is being removed by patch
#22.

Further testing is welcome. I tested the drivers for which I have HW
available. These are gma500, i915, nouveau, radeon and vc4. The console,
Weston and Xorg apparently work with the patches applied.

v3:
* restore default call to drm_gem_prime_export() in
  drm_gem_prime_handle_to_fd()
* return -ENOSYS if get_sg_table is not set
* drop all checks for obj->funcs
* clean up TODO list and documentation
v2:
* moved code in amdgpu and radeon
* made several functions static in various drivers
* updated TODO-list item
* fix virtgpu

Thomas Zimmermann (22):
   drm/amdgpu: Introduce GEM object functions
   drm/armada: Introduce GEM object functions
   drm/etnaviv: Introduce GEM object functions
   drm/exynos: Introduce GEM object functions
   drm/gma500: Introduce GEM object functions
   drm/i915: Introduce GEM object functions
   drm/imx/dcss: Initialize DRM driver instance with CMA helper macro
   drm/mediatek: Introduce GEM object functions
   drm/msm: Introduce GEM object funcs
   drm/nouveau: Introduce GEM object functions
   drm/omapdrm: Introduce GEM object functions
   drm/pl111: Introduce GEM object functions
   drm/radeon: Introduce GEM object functions
   drm/rockchip: Convert to drm_gem_object_funcs
   drm/tegra: Introduce GEM object functions
   drm/vc4: Introduce GEM object functions
   drm/vgem: Introduce GEM object functions
   drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs
   drm/vkms: Introduce GEM object functions
   drm/xen: Introduce GEM object functions
   drm/xlnx: Initialize DRM driver instance with CMA helper macro
   drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver

  Documentation/gpu/drm-mm.rst  |  4 +-
  Documentation/gpu/todo.rst|  9 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  6 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 23 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  5 --
  drivers/gpu/drm/armada/armada_drv.c   |  3 -
  drivers/gpu/drm/armada/armada_gem.c   | 12 ++-
  drivers/gpu/drm/armada/armada_gem.h   |  2 -
  drivers/gpu/drm/drm_gem.c | 53 
  drivers/gpu/drm/drm_gem_cma_helper.c  |  8 +-
  drivers/gpu/drm/drm_prime.c   | 14 +--
  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 ---
  drivers/gpu/drm/etnaviv/etnaviv_drv.h |  1 -
  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 -
  drivers/gpu/drm/exynos/exynos_drm_drv.c   | 10 ---
  drivers/gpu/drm/exynos/exynos_drm_gem.c   | 15 
  drivers/gpu/drm/gma500/framebuffer.c  |  2 +
  drivers/gpu/drm/gma500/gem.c  | 18 +++-
  drivers/gpu/drm/gma500/gem.h  |  3 +
  drivers/gpu/drm/gma500/psb_drv.c  |  9 --
  drivers/gpu/drm/gma500/psb_drv.h  |  2 -
  drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 -
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  3 -
  drivers/gpu/drm/i915/i915_drv.c   |  4 -
  .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 -
  drivers/gpu/drm/imx/dcss/dcss-kms.c   | 14 +--
  drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 --
  drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++
  drivers/gpu/drm/msm/msm_drv.c   
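
The per-driver pattern that all of these conversion patches follow, condensed
into one sketch; the driver name "foo" and its callbacks are hypothetical:

#include <drm/drm_gem.h>

static void foo_gem_free(struct drm_gem_object *obj);		/* driver-specific */
static void *foo_gem_prime_vmap(struct drm_gem_object *obj);
static void foo_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);

/* Before: these lived as .gem_free_object_unlocked etc. in struct drm_driver.
 * After: one funcs table per object type... */
static const struct drm_gem_object_funcs foo_gem_object_funcs = {
	.free	= foo_gem_free,
	.vmap	= foo_gem_prime_vmap,
	.vunmap	= foo_gem_prime_vunmap,
};

/* ...attached at creation time, which makes the callbacks per-instance. */
static int foo_gem_object_create(struct drm_device *dev,
				 struct drm_gem_object *obj, size_t size)
{
	obj->funcs = &foo_gem_object_funcs;
	return drm_gem_object_init(dev, obj, size);
}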

Re: [PATCH v2 00/21] Convert all remaining drivers to GEM object functions

2020-09-15 Thread Christian König

Added my rb to the amdgpu and radeon patches.

Should we pick those up through the amd branches or do you want to push 
everything to drm-misc-next?


I think the latter, since anything else would result in a lot of merge conflicts.

Christian.

On 15.09.20 at 16:59, Thomas Zimmermann wrote:

The GEM and PRIME related callbacks in struct drm_driver are deprecated in
favor of GEM object functions in struct drm_gem_object_funcs. This patchset
converts the remaining drivers to object functions and removes most of the
obsolete interfaces.

Patches #1 to #16 and #18 to #19 convert DRM drivers to GEM object functions,
one by one. Each patch moves existing callbacks from struct drm_driver to an
instance of struct drm_gem_object_funcs, and sets these funcs when the GEM
object is initialized. The exception is .gem_prime_mmap. There are different
ways in which drivers implement the callback, and moving it to GEM object
functions requires a closer review for each.

Patch #17 fixes virtgpu to use GEM object functions where possible. The
driver recently introduced a function for one of the deprecated callbacks.

Patch #20 converts xlnx to CMA helper macros. There's no apparent reason
why the driver does the GEM setup on its own. Using CMA helper macros
adds GEM object functions implicitly.

With most of the GEM and PRIME moved to GEM object functions, related code
in struct drm_driver and in the DRM core/helpers is being removed by patch
#21.

Further testing is welcome. I tested the drivers for which I have HW
available. These are gma500, i915, nouveau, radeon and vc4. The console,
Weston and Xorg apparently work with the patches applied.

v2:
* moved code in amdgpu and radeon
* made several functions static in various drivers
* updated TODO-list item
* fix virtgpu

Thomas Zimmermann (21):
   drm/amdgpu: Introduce GEM object functions
   drm/armada: Introduce GEM object functions
   drm/etnaviv: Introduce GEM object functions
   drm/exynos: Introduce GEM object functions
   drm/gma500: Introduce GEM object functions
   drm/i915: Introduce GEM object functions
   drm/mediatek: Introduce GEM object functions
   drm/msm: Introduce GEM object funcs
   drm/nouveau: Introduce GEM object functions
   drm/omapdrm: Introduce GEM object functions
   drm/pl111: Introduce GEM object functions
   drm/radeon: Introduce GEM object functions
   drm/rockchip: Convert to drm_gem_object_funcs
   drm/tegra: Introduce GEM object functions
   drm/vc4: Introduce GEM object functions
   drm/vgem: Introduce GEM object functions
   drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs
   drm/vkms: Introduce GEM object functions
   drm/xen: Introduce GEM object functions
   drm/xlnx: Initialize DRM driver instance with CMA helper macro
   drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver

  Documentation/gpu/todo.rst|  7 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  6 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   | 23 +++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h   |  5 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c|  1 +
  drivers/gpu/drm/armada/armada_drv.c   |  3 -
  drivers/gpu/drm/armada/armada_gem.c   | 12 ++-
  drivers/gpu/drm/armada/armada_gem.h   |  2 -
  drivers/gpu/drm/drm_gem.c | 35 ++--
  drivers/gpu/drm/drm_gem_cma_helper.c  |  6 +-
  drivers/gpu/drm/drm_prime.c   | 17 ++--
  drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 ---
  drivers/gpu/drm/etnaviv/etnaviv_drv.h |  1 -
  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 -
  drivers/gpu/drm/exynos/exynos_drm_drv.c   | 10 ---
  drivers/gpu/drm/exynos/exynos_drm_gem.c   | 15 
  drivers/gpu/drm/gma500/framebuffer.c  |  2 +
  drivers/gpu/drm/gma500/gem.c  | 18 +++-
  drivers/gpu/drm/gma500/gem.h  |  3 +
  drivers/gpu/drm/gma500/psb_drv.c  |  9 --
  drivers/gpu/drm/gma500/psb_drv.h  |  2 -
  drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 -
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  3 -
  drivers/gpu/drm/i915/i915_drv.c   |  4 -
  .../gpu/drm/i915/selftests/mock_gem_device.c  |  3 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 --
  drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++
  drivers/gpu/drm/msm/msm_drv.c | 13 ---
  drivers/gpu/drm/msm/msm_drv.h |  1 -
  drivers/gpu/drm/msm/msm_gem.c | 19 -
  drivers/gpu/drm/nouveau/nouveau_drm.c |  9 --
  drivers/gpu/drm/nouveau/nouveau_gem.c | 13 +++
  drivers/gpu/drm/nouveau/nouveau_gem.h |  2 +
  drivers/gpu/drm/nouveau/nouveau_prime.c   |  2 +
  drivers/gpu/drm/omapdrm/omap_drv.c|  9 --
  drivers/gpu/drm/omapdrm/omap_gem.c| 18 +++-
  drivers/gpu/drm/omapdrm/omap_gem.h|  2 -
  drivers/gpu/drm/pl111/pl111_drv.c |  5 +-
  d

Re: [PATCH v2 12/21] drm/radeon: Introduce GEM object functions

2020-09-15 Thread Christian König

On 15.09.20 at 16:59, Thomas Zimmermann wrote:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in radeon.

v2:
* move object-function instance to radeon_gem.c (Christian)
* set callbacks in radeon_gem_object_create() (Christian)

Signed-off-by: Thomas Zimmermann 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/radeon/radeon_drv.c | 23 +
  drivers/gpu/drm/radeon/radeon_gem.c | 31 +
  2 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 4cd30613fa1d..65061c949aee 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device 
*dev);
  int radeon_driver_irq_postinstall_kms(struct drm_device *dev);
  void radeon_driver_irq_uninstall_kms(struct drm_device *dev);
  irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg);
-void radeon_gem_object_free(struct drm_gem_object *obj);
-int radeon_gem_object_open(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-void radeon_gem_object_close(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
-   int flags);
  extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int 
crtc,
  unsigned int flags, int *vpos, int *hpos,
  ktime_t *stime, ktime_t *etime,
@@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
  int radeon_mode_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
-struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
  struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device 
*dev,
struct 
dma_buf_attachment *,
struct sg_table *sg);
-int radeon_gem_prime_pin(struct drm_gem_object *obj);
-void radeon_gem_prime_unpin(struct drm_gem_object *obj);
-void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
-void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  
  /* atpx handler */

  #if defined(CONFIG_VGA_SWITCHEROO)
@@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp,
}
  
  	ret = drm_ioctl(filp, cmd, arg);

-   
+
pm_runtime_mark_last_busy(dev->dev);
pm_runtime_put_autosuspend(dev->dev);
return ret;
@@ -609,22 +597,13 @@ static struct drm_driver kms_driver = {
.irq_uninstall = radeon_driver_irq_uninstall_kms,
.irq_handler = radeon_driver_irq_handler_kms,
.ioctls = radeon_ioctls_kms,
-   .gem_free_object_unlocked = radeon_gem_object_free,
-   .gem_open_object = radeon_gem_object_open,
-   .gem_close_object = radeon_gem_object_close,
.dumb_create = radeon_mode_dumb_create,
.dumb_map_offset = radeon_mode_dumb_mmap,
.fops = &radeon_driver_kms_fops,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-   .gem_prime_export = radeon_gem_prime_export,
-   .gem_prime_pin = radeon_gem_prime_pin,
-   .gem_prime_unpin = radeon_gem_prime_unpin,
-   .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table,
.gem_prime_import_sg_table = radeon_gem_prime_import_sg_table,
-   .gem_prime_vmap = radeon_gem_prime_vmap,
-   .gem_prime_vunmap = radeon_gem_prime_vunmap,
  
  	.name = DRIVER_NAME,

.desc = DRIVER_DESC,
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index e5c4271e64ed..0ccd7213e41f 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -35,7 +35,17 @@
  
  #include "radeon.h"
  
-void radeon_gem_object_free(struct drm_gem_object *gobj)

+struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
+   int flags);
+struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
+int radeon_gem_prime_pin(struct drm_gem_object *obj);
+void radeon_gem_prime_unpin(struct drm_gem_object *obj);
+void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
+void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+
+static const struct drm_gem_object_funcs radeon_gem_object_funcs;
+
+static void radeon_gem_object_free(struct drm_gem_object *gobj)
  {
struct radeon_bo *robj = gem_to_radeon_bo(gobj);
  
@@ -85,6 +95,7 @@ int radeon_gem_object_create(struct radeon_device 

Re: [PATCH v2 01/21] drm/amdgpu: Introduce GEM object functions

2020-09-15 Thread Christian König
g amdgpu_gem_timeout(uint64_t timeout_ns);
  
  /*

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index ac043baac05d..c4e82a8fa53f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -561,6 +561,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
+


The newline change is unrelated to the patch.

Apart from that the patch is Reviewed-by: Christian König 
.


But I think we need some smoke testing of it.

Christian.


drm_gem_private_object_init(adev_to_drm(adev), &bo->tbo.base, size);
INIT_LIST_HEAD(&bo->shadow_list);
bo->vm_bo = NULL;





Re: [PATCH 01/20] drm/amdgpu: Introduce GEM object functions

2020-09-14 Thread Christian König

On 14.09.20 at 17:05, Thomas Zimmermann wrote:

Hi

On 13.08.20 at 12:22, Christian König wrote:

On 13.08.20 at 10:36, Thomas Zimmermann wrote:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in amdgpu. The only exception is gem_prime_mmap,
which is non-trivial to convert.

Signed-off-by: Thomas Zimmermann 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  6 --
   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 
   2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 81a79760ca61..51525b8774c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = {
   .lastclose = amdgpu_driver_lastclose_kms,
   .irq_handler = amdgpu_irq_handler,
   .ioctls = amdgpu_ioctls_kms,
-    .gem_free_object_unlocked = amdgpu_gem_object_free,
-    .gem_open_object = amdgpu_gem_object_open,
-    .gem_close_object = amdgpu_gem_object_close,
   .dumb_create = amdgpu_mode_dumb_create,
   .dumb_map_offset = amdgpu_mode_dumb_mmap,
   .fops = &amdgpu_driver_kms_fops,
     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-    .gem_prime_export = amdgpu_gem_prime_export,
   .gem_prime_import = amdgpu_gem_prime_import,
-    .gem_prime_vmap = amdgpu_gem_prime_vmap,
-    .gem_prime_vunmap = amdgpu_gem_prime_vunmap,
   .gem_prime_mmap = amdgpu_gem_prime_mmap,
     .name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 43f4966331dd..ca2b79f94e99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -36,6 +36,7 @@
   #include 
   #include 
   #include "amdgpu.h"
+#include "amdgpu_dma_buf.h"
   #include "amdgpu_trace.h"
   #include "amdgpu_amdkfd.h"
   @@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags)
   #endif
   }
   +static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = {
+    .free = amdgpu_gem_object_free,
+    .open = amdgpu_gem_object_open,
+    .close = amdgpu_gem_object_close,
+    .export = amdgpu_gem_prime_export,
+    .vmap = amdgpu_gem_prime_vmap,
+    .vunmap = amdgpu_gem_prime_vunmap,
+};
+

Wrong file, this belongs into amdgpu_gem.c


   static int amdgpu_bo_do_create(struct amdgpu_device *adev,
  struct amdgpu_bo_param *bp,
  struct amdgpu_bo **bo_ptr)
@@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct
amdgpu_device *adev,
   bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL);
   if (bo == NULL)
   return -ENOMEM;
+
+    bo->tbo.base.funcs = &amdgpu_gem_object_funcs;

And this should probably go into amdgpu_gem_object_create().

I'm trying to understand what amdgpu does.  What about all the places
where amdgpu calls amdgpu_bo_create() internally? Wouldn't these miss
the free callback for the GEM object?


Those shouldn't have a GEM object in the first place.

Or otherwise we would have a reference counting issue.

Regards,
Christian.



Best regards
Thomas


Apart from that looks like a good idea to me.

Christian.


   drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size);
   INIT_LIST_HEAD(&bo->shadow_list);
   bo->vm_bo = NULL;
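
The reference-counting point, condensed into a sketch (hypothetical names;
this shows the split being discussed, not amdgpu's actual code): only buffers
created through the GEM path receive the funcs pointer, while purely internal
BOs never acquire a GEM identity, so no free callback can be missed.

#include <drm/drm_gem.h>

struct bo_sketch {			/* stand-in for struct amdgpu_bo */
	struct drm_gem_object gem;	/* corresponds to bo->tbo.base */
	/* ...TTM state... */
};

static const struct drm_gem_object_funcs gem_funcs_sketch = {
	/* .free, .open, .close, ... as in the patch */
};

/* GEM-visible creation path: attach the object functions here... */
static void gem_object_create_sketch(struct bo_sketch *bo)
{
	bo->gem.funcs = &gem_funcs_sketch;
}

/* ...while internal allocations use the bare BO path, never set gem.funcs,
 * and are freed through TTM reference counting alone. */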






Re: [PATCH 1/2] drm: allow limiting the scatter list size.

2020-08-18 Thread Christian König

On 18.08.20 at 10:27, Gerd Hoffmann wrote:

On Tue, Aug 18, 2020 at 09:57:59AM +0200, Christian König wrote:

On 18.08.20 at 09:48, Gerd Hoffmann wrote:

Add max_segment argument to drm_prime_pages_to_sg().  When set, pass it
through to the __sg_alloc_table_from_pages() call, otherwise use
SCATTERLIST_MAX_SEGMENT.

Also add max_segment field to gem objects and pass it to
drm_prime_pages_to_sg() calls in drivers and helpers.

Signed-off-by: Gerd Hoffmann 

I'm missing an explanation why this should be useful (it certainly is).

virtio-gpu needs this to work properly with SEV (see patch 2/2 of this
series).


Yeah, that's the problem patch 2/2 never showed up here :)


And the maximum segment size seems misplaced in the GEM object. This is
usually a property of the device or even completely constant.

Placing it in drm_device instead would indeed work for virtio-gpu, so I
guess you are suggesting that instead?


That is probably the best approach, yes.

For Intel and AMD it could even be global/constant, but it certainly 
doesn't need to be kept around for each buffer.


Christian.



take care,
   Gerd
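
The direction agreed on here, as a sketch; the device-level field is the
suggestion above, not merged code, and the names are hypothetical:

#include <drm/drm_prime.h>
#include <linux/scatterlist.h>

struct dev_sketch {		/* the field would live in struct drm_device */
	size_t max_segment;	/* 0 = no device-specific limit */
};

static struct sg_table *build_sgt_sketch(struct dev_sketch *dev,
					 struct page **pages,
					 unsigned int nr_pages)
{
	size_t max_seg = dev->max_segment ?: SCATTERLIST_MAX_SEGMENT;

	/* Signature as extended by the patch below: cap every segment. */
	return drm_prime_pages_to_sg(pages, nr_pages, max_seg);
}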






Re: [PATCH 1/2] drm: allow limiting the scatter list size.

2020-08-18 Thread Christian König

On 18.08.20 at 09:48, Gerd Hoffmann wrote:

Add max_segment argument to drm_prime_pages_to_sg().  When set, pass it
through to the __sg_alloc_table_from_pages() call, otherwise use
SCATTERLIST_MAX_SEGMENT.

Also add max_segment field to gem objects and pass it to
drm_prime_pages_to_sg() calls in drivers and helpers.

Signed-off-by: Gerd Hoffmann 


I'm missing an explanation why this should be useful (it certainly is).

And the maximum segment size seems misplaced in the GEM object. This is 
usually a property of the device or even completely constant.


Christian.


---
  include/drm/drm_gem.h   |  8 
  include/drm/drm_prime.h |  3 ++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 ++-
  drivers/gpu/drm/drm_gem_shmem_helper.c  |  3 ++-
  drivers/gpu/drm/drm_prime.c | 10 +++---
  drivers/gpu/drm/etnaviv/etnaviv_gem.c   |  3 ++-
  drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c |  3 ++-
  drivers/gpu/drm/msm/msm_gem.c   |  3 ++-
  drivers/gpu/drm/msm/msm_gem_prime.c |  3 ++-
  drivers/gpu/drm/nouveau/nouveau_prime.c |  3 ++-
  drivers/gpu/drm/radeon/radeon_prime.c   |  3 ++-
  drivers/gpu/drm/rockchip/rockchip_drm_gem.c |  6 --
  drivers/gpu/drm/tegra/gem.c |  3 ++-
  drivers/gpu/drm/vgem/vgem_drv.c |  3 ++-
  drivers/gpu/drm/xen/xen_drm_front_gem.c |  3 ++-
  15 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 337a48321705..dea5e92e745b 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -241,6 +241,14 @@ struct drm_gem_object {
 */
size_t size;
  
+	/**

+* @max_segment:
+*
+* Max size for scatter list segments.  When unset the default
+* (SCATTERLIST_MAX_SEGMENT) is used.
+*/
+   size_t max_segment;
+
/**
 * @name:
 *
diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h
index 9af7422b44cf..2c3689435cb4 100644
--- a/include/drm/drm_prime.h
+++ b/include/drm/drm_prime.h
@@ -88,7 +88,8 @@ void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void 
*vaddr);
  int drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct 
*vma);
  int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma);
  
-struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages);

+struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int 
nr_pages,
+  size_t max_segment);
  struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj,
 int flags);
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c

index 519ce4427fce..5e8a9760b33f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -303,7 +303,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
dma_buf_attachment *attach,
switch (bo->tbo.mem.mem_type) {
case TTM_PL_TT:
sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages,
-   bo->tbo.num_pages);
+   bo->tbo.num_pages,
+   obj->max_segment);
if (IS_ERR(sgt))
return sgt;
  
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c

index 4b7cfbac4daa..cfb979d808fd 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -656,7 +656,8 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct 
drm_gem_object *obj)
  
  	WARN_ON(shmem->base.import_attach);
  
-	return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT);

+   return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT,
+obj->max_segment);
  }
  EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table);
  
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c

index 1693aa7c14b5..27c783fd6633 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -802,7 +802,8 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops =  
{
   *
   * This is useful for implementing &drm_gem_object_funcs.get_sg_table.
   */
-struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int 
nr_pages)
+struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int 
nr_pages,
+  size_t max_segment)
  {
struct sg_table *sg = NULL;
int ret;
@@ -813,8 +814,11 @@ struct sg_table *drm_prime_pages_to_sg(struct page 
**pages, unsigned int nr_page
goto out;
}
  
-	ret = sg_alloc_table_from_pages(sg, pages, nr_pages, 0,

-   nr_pages << PAGE_SHIFT, GFP_KERNEL);
+   if (max_segment 
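
The hunk is cut off above. For illustration only, the clamping implied by the
commit message would look something like this; a guess at the shape, not the
posted patch, using the __sg_alloc_table_from_pages() signature of that era:

	if (max_segment == 0)
		max_segment = SCATTERLIST_MAX_SEGMENT;
	ret = __sg_alloc_table_from_pages(sg, pages, nr_pages, 0,
					  (unsigned long)nr_pages << PAGE_SHIFT,
					  max_segment, GFP_KERNEL);
	if (ret)
		goto out;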

Re: [PATCH 12/20] drm/radeon: Introduce GEM object functions

2020-08-13 Thread Christian König

On 13.08.20 at 12:41, Thomas Zimmermann wrote:

Hi

On 13.08.20 at 12:24, Christian König wrote:

On 13.08.20 at 10:36, Thomas Zimmermann wrote:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in radeon.

Signed-off-by: Thomas Zimmermann 
---
   drivers/gpu/drm/radeon/radeon_drv.c    | 23 +--
   drivers/gpu/drm/radeon/radeon_object.c | 26 ++
   2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c
b/drivers/gpu/drm/radeon/radeon_drv.c
index 4cd30613fa1d..65061c949aee 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct
drm_device *dev);
   int radeon_driver_irq_postinstall_kms(struct drm_device *dev);
   void radeon_driver_irq_uninstall_kms(struct drm_device *dev);
   irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg);
-void radeon_gem_object_free(struct drm_gem_object *obj);
-int radeon_gem_object_open(struct drm_gem_object *obj,
-    struct drm_file *file_priv);
-void radeon_gem_object_close(struct drm_gem_object *obj,
-    struct drm_file *file_priv);
-struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
-    int flags);
   extern int radeon_get_crtc_scanoutpos(struct drm_device *dev,
unsigned int crtc,
     unsigned int flags, int *vpos, int *hpos,
     ktime_t *stime, ktime_t *etime,
@@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
   int radeon_mode_dumb_create(struct drm_file *file_priv,
   struct drm_device *dev,
   struct drm_mode_create_dumb *args);
-struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object
*obj);
   struct drm_gem_object *radeon_gem_prime_import_sg_table(struct
drm_device *dev,
   struct dma_buf_attachment *,
   struct sg_table *sg);
-int radeon_gem_prime_pin(struct drm_gem_object *obj);
-void radeon_gem_prime_unpin(struct drm_gem_object *obj);
-void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
-void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
     /* atpx handler */
   #if defined(CONFIG_VGA_SWITCHEROO)
@@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp,
   }
     ret = drm_ioctl(filp, cmd, arg);
-
+
   pm_runtime_mark_last_busy(dev->dev);
   pm_runtime_put_autosuspend(dev->dev);
   return ret;
@@ -609,22 +597,13 @@ static struct drm_driver kms_driver = {
   .irq_uninstall = radeon_driver_irq_uninstall_kms,
   .irq_handler = radeon_driver_irq_handler_kms,
   .ioctls = radeon_ioctls_kms,
-    .gem_free_object_unlocked = radeon_gem_object_free,
-    .gem_open_object = radeon_gem_object_open,
-    .gem_close_object = radeon_gem_object_close,
   .dumb_create = radeon_mode_dumb_create,
   .dumb_map_offset = radeon_mode_dumb_mmap,
   .fops = &radeon_driver_kms_fops,
     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
   .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-    .gem_prime_export = radeon_gem_prime_export,
-    .gem_prime_pin = radeon_gem_prime_pin,
-    .gem_prime_unpin = radeon_gem_prime_unpin,
-    .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table,
   .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table,
-    .gem_prime_vmap = radeon_gem_prime_vmap,
-    .gem_prime_vunmap = radeon_gem_prime_vunmap,
     .name = DRIVER_NAME,
   .desc = DRIVER_DESC,
diff --git a/drivers/gpu/drm/radeon/radeon_object.c
b/drivers/gpu/drm/radeon/radeon_object.c
index bb7582afd803..882390e15dfe 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev);
   void radeon_ttm_fini(struct radeon_device *rdev);
   static void radeon_bo_clear_surface_reg(struct radeon_bo *bo);
   +void radeon_gem_object_free(struct drm_gem_object *obj);
+int radeon_gem_object_open(struct drm_gem_object *obj,
+    struct drm_file *file_priv);
+void radeon_gem_object_close(struct drm_gem_object *obj,
+    struct drm_file *file_priv);
+struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
+    int flags);
+struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object
*obj);
+int radeon_gem_prime_pin(struct drm_gem_object *obj);
+void radeon_gem_prime_unpin(struct drm_gem_object *obj);
+void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
+void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+
   /*
    * To exclude mutual BO access we rely on bo_reserve exclusion, as all
    * function are calling it.
@@ -180,6 +193,18 @@ void radeon_ttm_placement_from_domain(struct
radeon_bo *rbo, u32 domain)

Re: [PATCH 12/20] drm/radeon: Introduce GEM object functions

2020-08-13 Thread Christian König

Am 13.08.20 um 10:36 schrieb Thomas Zimmermann:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in radeon.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/radeon/radeon_drv.c| 23 +--
  drivers/gpu/drm/radeon/radeon_object.c | 26 ++
  2 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
b/drivers/gpu/drm/radeon/radeon_drv.c
index 4cd30613fa1d..65061c949aee 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device 
*dev);
  int radeon_driver_irq_postinstall_kms(struct drm_device *dev);
  void radeon_driver_irq_uninstall_kms(struct drm_device *dev);
  irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg);
-void radeon_gem_object_free(struct drm_gem_object *obj);
-int radeon_gem_object_open(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-void radeon_gem_object_close(struct drm_gem_object *obj,
-   struct drm_file *file_priv);
-struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
-   int flags);
  extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int 
crtc,
  unsigned int flags, int *vpos, int *hpos,
  ktime_t *stime, ktime_t *etime,
@@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
  int radeon_mode_dumb_create(struct drm_file *file_priv,
struct drm_device *dev,
struct drm_mode_create_dumb *args);
-struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
  struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device 
*dev,
struct 
dma_buf_attachment *,
struct sg_table *sg);
-int radeon_gem_prime_pin(struct drm_gem_object *obj);
-void radeon_gem_prime_unpin(struct drm_gem_object *obj);
-void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
-void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
  
  /* atpx handler */

  #if defined(CONFIG_VGA_SWITCHEROO)
@@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp,
}
  
  	ret = drm_ioctl(filp, cmd, arg);

-   
+
pm_runtime_mark_last_busy(dev->dev);
pm_runtime_put_autosuspend(dev->dev);
return ret;
@@ -609,22 +597,13 @@ static struct drm_driver kms_driver = {
.irq_uninstall = radeon_driver_irq_uninstall_kms,
.irq_handler = radeon_driver_irq_handler_kms,
.ioctls = radeon_ioctls_kms,
-   .gem_free_object_unlocked = radeon_gem_object_free,
-   .gem_open_object = radeon_gem_object_open,
-   .gem_close_object = radeon_gem_object_close,
.dumb_create = radeon_mode_dumb_create,
.dumb_map_offset = radeon_mode_dumb_mmap,
.fops = &radeon_driver_kms_fops,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-   .gem_prime_export = radeon_gem_prime_export,
-   .gem_prime_pin = radeon_gem_prime_pin,
-   .gem_prime_unpin = radeon_gem_prime_unpin,
-   .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table,
.gem_prime_import_sg_table = radeon_gem_prime_import_sg_table,
-   .gem_prime_vmap = radeon_gem_prime_vmap,
-   .gem_prime_vunmap = radeon_gem_prime_vunmap,
  
  	.name = DRIVER_NAME,

.desc = DRIVER_DESC,
diff --git a/drivers/gpu/drm/radeon/radeon_object.c 
b/drivers/gpu/drm/radeon/radeon_object.c
index bb7582afd803..882390e15dfe 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev);
  void radeon_ttm_fini(struct radeon_device *rdev);
  static void radeon_bo_clear_surface_reg(struct radeon_bo *bo);
  
+void radeon_gem_object_free(struct drm_gem_object *obj);

+int radeon_gem_object_open(struct drm_gem_object *obj,
+   struct drm_file *file_priv);
+void radeon_gem_object_close(struct drm_gem_object *obj,
+   struct drm_file *file_priv);
+struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj,
+   int flags);
+struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj);
+int radeon_gem_prime_pin(struct drm_gem_object *obj);
+void radeon_gem_prime_unpin(struct drm_gem_object *obj);
+void *radeon_gem_prime_vmap(struct drm_gem_object *obj);
+void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
+
  /*
   * To exclude mutual BO access we rely on bo_reserve exclusion, as all
   * function are calling it.
   */

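The message is cut off before the second radeon_object.c hunk (the
"+193,18" one visible in the earlier copy of this mail), but judging by the
forward declarations above and the matching amdgpu change quoted further
down, that hunk presumably adds the per-instance callback table. A sketch
of what it likely contains (not the verbatim patch):

	static const struct drm_gem_object_funcs radeon_gem_object_funcs = {
		.free = radeon_gem_object_free,
		.open = radeon_gem_object_open,
		.close = radeon_gem_object_close,
		.export = radeon_gem_prime_export,
		.pin = radeon_gem_prime_pin,
		.unpin = radeon_gem_prime_unpin,
		.get_sg_table = radeon_gem_prime_get_sg_table,
		.vmap = radeon_gem_prime_vmap,
		.vunmap = radeon_gem_prime_vunmap,
	};

with bo->tbo.base.funcs pointed at this table when the BO is created.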
Re: [PATCH 01/20] drm/amdgpu: Introduce GEM object functions

2020-08-13 Thread Christian König

Am 13.08.20 um 10:36 schrieb Thomas Zimmermann:

GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in amdgpu. The only exception is gem_prime_mmap,
which is non-trivial to convert.

Signed-off-by: Thomas Zimmermann 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c|  6 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 
  2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 81a79760ca61..51525b8774c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = {
.lastclose = amdgpu_driver_lastclose_kms,
.irq_handler = amdgpu_irq_handler,
.ioctls = amdgpu_ioctls_kms,
-   .gem_free_object_unlocked = amdgpu_gem_object_free,
-   .gem_open_object = amdgpu_gem_object_open,
-   .gem_close_object = amdgpu_gem_object_close,
.dumb_create = amdgpu_mode_dumb_create,
.dumb_map_offset = amdgpu_mode_dumb_mmap,
.fops = &amdgpu_driver_kms_fops,
  
  	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,

.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-   .gem_prime_export = amdgpu_gem_prime_export,
.gem_prime_import = amdgpu_gem_prime_import,
-   .gem_prime_vmap = amdgpu_gem_prime_vmap,
-   .gem_prime_vunmap = amdgpu_gem_prime_vunmap,
.gem_prime_mmap = amdgpu_gem_prime_mmap,
  
  	.name = DRIVER_NAME,

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 43f4966331dd..ca2b79f94e99 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -36,6 +36,7 @@
  #include 
  #include 
  #include "amdgpu.h"
+#include "amdgpu_dma_buf.h"
  #include "amdgpu_trace.h"
  #include "amdgpu_amdkfd.h"
  
@@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags)

  #endif
  }
  
+static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = {

+   .free = amdgpu_gem_object_free,
+   .open = amdgpu_gem_object_open,
+   .close = amdgpu_gem_object_close,
+   .export = amdgpu_gem_prime_export,
+   .vmap = amdgpu_gem_prime_vmap,
+   .vunmap = amdgpu_gem_prime_vunmap,
+};
+


Wrong file, this belongs in amdgpu_gem.c


  static int amdgpu_bo_do_create(struct amdgpu_device *adev,
   struct amdgpu_bo_param *bp,
   struct amdgpu_bo **bo_ptr)
@@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL);
if (bo == NULL)
return -ENOMEM;
+
+   bo->tbo.base.funcs = &amdgpu_gem_object_funcs;


And this should probably go into amdgpu_gem_object_create().

Apart from that looks like a good idea to me.

Christian.


drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size);
INIT_LIST_HEAD(&bo->shadow_list);
bo->vm_bo = NULL;





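Taken together, the two review comments amount to moving the funcs table
into amdgpu_gem.c and setting it where GEM objects are created; roughly
(a sketch of the suggested follow-up, not Thomas's actual v2 patch):

	--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
	+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
	@@ ... @@ int amdgpu_gem_object_create(...)
	 	r = amdgpu_bo_create(adev, &bp, &bo);
	 	if (r)
	 		return r;
	+	bo->tbo.base.funcs = &amdgpu_gem_object_funcs;
	 	*obj = &bo->tbo.base;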
Re: [Xen-devel] [PATCH v2 12/15] drm/amdgpu: Call find_vma under mmap_sem

2019-10-29 Thread Christian König

Am 29.10.19 um 17:28 schrieb Kuehling, Felix:

On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote:

From: Jason Gunthorpe 

find_vma() must be called under the mmap_sem, reorganize this code to
do the vma check after entering the lock.

Further, fix the unlocked use of struct task_struct's mm, instead use
the mm from hmm_mirror which has an active mm_grab. Also the mm_grab
must be converted to a mm_get before acquiring mmap_sem or calling
find_vma().

Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers")
Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker 
threads")
Cc: Alex Deucher 
Cc: Christian König 
Cc: David (ChunMing) Zhou 
Cc: amd-...@lists.freedesktop.org
Signed-off-by: Jason Gunthorpe 

One question inline to confirm my understanding. Otherwise this patch is

Reviewed-by: Felix Kuehling 



---
   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 37 ++---
   1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index dff41d0a85fe96..c0e41f1f0c2365 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -35,6 +35,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -788,7 +789,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, 
struct page **pages)
struct hmm_mirror *mirror = bo->mn ? &bo->mn->mirror : NULL;
struct ttm_tt *ttm = bo->tbo.ttm;
struct amdgpu_ttm_tt *gtt = (void *)ttm;
-   struct mm_struct *mm = gtt->usertask->mm;
+   struct mm_struct *mm;
unsigned long start = gtt->userptr;
struct vm_area_struct *vma;
struct hmm_range *range;
@@ -796,25 +797,14 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, 
struct page **pages)
uint64_t *pfns;
int r = 0;
   
-	if (!mm) /* Happens during process shutdown */

-   return -ESRCH;
-
if (unlikely(!mirror)) {
DRM_DEBUG_DRIVER("Failed to get hmm_mirror\n");
-   r = -EFAULT;
-   goto out;
+   return -EFAULT;
}
   
-	vma = find_vma(mm, start);

-   if (unlikely(!vma || start < vma->vm_start)) {
-   r = -EFAULT;
-   goto out;
-   }
-   if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
-   vma->vm_file)) {
-   r = -EPERM;
-   goto out;
-   }
+   mm = mirror->hmm->mmu_notifier.mm;
+   if (!mmget_not_zero(mm)) /* Happens during process shutdown */

This works because mirror->hmm->mmu_notifier holds an mmgrab reference
to the mm? So the MM will not just go away, but if the mmget refcount is
0, it means the mm is marked for destruction and shouldn't be used any more.


Yes, exactly. That is a rather common pattern: one reference count for
the functionality and one for the structure.


When the functionality is gone the structure might still be alive for 
some reason. TTM and a couple of other structures use the same approach.


Christian.
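
Condensed, the two-refcount pattern as used in the patch above looks like
this (a generic sketch, not a verbatim quote):

	/* registration time: mmgrab() takes the structure reference,
	 * keeping the struct mm_struct itself from being freed */
	mmgrab(mm);

	/* use time: mmget_not_zero() takes the functionality reference;
	 * it fails once the process is already exiting */
	if (!mmget_not_zero(mm))
		return -ESRCH;

	down_read(&mm->mmap_sem);
	/* ... find_vma(), hmm_range_fault() under the lock ... */
	up_read(&mm->mmap_sem);
	mmput(mm);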





+   return -ESRCH;
   
   	range = kzalloc(sizeof(*range), GFP_KERNEL);

if (unlikely(!range)) {
@@ -847,6 +837,17 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, 
struct page **pages)
hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT);
   
   	down_read(&mm->mmap_sem);

+   vma = find_vma(mm, start);
+   if (unlikely(!vma || start < vma->vm_start)) {
+   r = -EFAULT;
+   goto out_unlock;
+   }
+   if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) &&
+   vma->vm_file)) {
+   r = -EPERM;
+   goto out_unlock;
+   }
+
r = hmm_range_fault(range, 0);
up_read(&mm->mmap_sem);
   
@@ -865,15 +866,19 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)

}
   
   	gtt->range = range;

+   mmput(mm);
   
   	return 0;
   
+out_unlock:

+   up_read(&mm->mmap_sem);
   out_free_pfns:
hmm_range_unregister(range);
kvfree(pfns);
   out_free_ranges:
kfree(range);
   out:
+   mmput(mm);
return r;
   }
   

___
amd-gfx mailing list
amd-...@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] dma-buf: add struct dma_buf_attach_info v2

2019-05-03 Thread Christian König

Am 30.04.19 um 19:31 schrieb Russell King - ARM Linux admin:

On Tue, Apr 30, 2019 at 01:10:02PM +0200, Christian König wrote:

Add a structure for the parameters of dma_buf_attach, this makes it much easier
to add new parameters later on.

I don't understand this reasoning.  What are the "new parameters" that
are being proposed, and why do we need to put them into memory to pass
them across this interface?

If the intention is to make it easier to change the interface, passing
parameters in this manner mean that it's easy for the interface to
change and drivers not to notice the changes, since the compiler will
not warn (unless some member of the structure that the driver is using
gets removed, in which case it will error.)

Additions to the structure will go unnoticed by drivers - what if the
caller is expecting some different kind of behaviour, and the driver
ignores that new addition?


Well, exactly that's the intention here: That the drivers using this 
interface should be able to ignore the new additions for now as long as 
they are not going to use them.


The background is that we have multiple interface changes in the 
pipeline, and each step requires new optional parameters.



This doesn't seem to me like a good idea.


Well, the obvious alternatives are:

a) Change all drivers to explicitly provide NULL/0 for the new parameters.

b) Use a wrapper, so that the function signature of dma_buf_attach stays
the same (see the sketch below).
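
For b), such a wrapper could look like this (a sketch; dma_buf_attach_ext
is a hypothetical name for the extended entry point):

	static inline struct dma_buf_attachment *
	dma_buf_attach(struct dma_buf *dmabuf, struct device *dev)
	{
		struct dma_buf_attach_info info = {
			.dmabuf = dmabuf,
			.dev = dev,
		};

		return dma_buf_attach_ext(&info);
	}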


Key point here is that I have an invalidation callback change, a P2P 
patch set and some locking changes which all require adding new 
parameters or flags. And at each step I would then start to change all 
drivers, adding some more NULL pointers or flags with 0 default value.


I'm actually perfectly fine going down any route, but this just seemed 
to me simplest and with the least risk of breaking anything. Opinions?


Thanks,
Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH] dma-buf: add struct dma_buf_attach_info v2

2019-04-30 Thread Christian König
Add a structure for the parameters of dma_buf_attach, this makes it much easier
to add new parameters later on.

v2: rebase cleanup and fix all new implementations as well

Signed-off-by: Christian König 
---
 drivers/dma-buf/dma-buf.c   | 13 +++--
 drivers/gpu/drm/armada/armada_gem.c |  6 +-
 drivers/gpu/drm/drm_prime.c |  6 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c  |  6 +-
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c   |  6 +-
 drivers/gpu/drm/tegra/gem.c |  6 +-
 drivers/gpu/drm/udl/udl_dmabuf.c|  6 +-
 .../common/videobuf2/videobuf2-dma-contig.c |  6 +-
 .../media/common/videobuf2/videobuf2-dma-sg.c   |  6 +-
 drivers/misc/fastrpc.c  |  6 +-
 drivers/staging/media/tegra-vde/tegra-vde.c |  6 +-
 drivers/xen/gntdev-dmabuf.c |  4 
 include/linux/dma-buf.h | 17 +++--
 13 files changed, 76 insertions(+), 18 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 3ae6c0c2cc02..e295e76a8c57 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -535,8 +535,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
 /**
  * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
  * calls attach() of dma_buf_ops to allow device-specific attach functionality
- * @dmabuf:[in]buffer to attach device to.
- * @dev:   [in]device to be attached.
+ * @info:  [in]holds all the attach related information provided
+ * by the importer. see &struct dma_buf_attach_info
+ * for further details.
  *
  * Returns struct dma_buf_attachment pointer for this attachment. Attachments
  * must be cleaned up by calling dma_buf_detach().
@@ -550,20 +551,20 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
  * accessible to @dev, and cannot be moved to a more suitable place. This is
  * indicated with the error code -EBUSY.
  */
-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
- struct device *dev)
+struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info 
*info)
 {
+   struct dma_buf *dmabuf = info->dmabuf;
struct dma_buf_attachment *attach;
int ret;
 
-   if (WARN_ON(!dmabuf || !dev))
+   if (WARN_ON(!dmabuf || !info->dev))
return ERR_PTR(-EINVAL);
 
attach = kzalloc(sizeof(*attach), GFP_KERNEL);
if (!attach)
return ERR_PTR(-ENOMEM);
 
-   attach->dev = dev;
+   attach->dev = info->dev;
attach->dmabuf = dmabuf;
 
mutex_lock(&dmabuf->lock);
diff --git a/drivers/gpu/drm/armada/armada_gem.c 
b/drivers/gpu/drm/armada/armada_gem.c
index 642d0e70d0f8..19c47821032f 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -501,6 +501,10 @@ armada_gem_prime_export(struct drm_device *dev, struct 
drm_gem_object *obj,
 struct drm_gem_object *
 armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf)
 {
+   struct dma_buf_attach_info attach_info = {
+   .dev = dev->dev,
+   .dmabuf = buf
+   };
struct dma_buf_attachment *attach;
struct armada_gem_object *dobj;
 
@@ -516,7 +520,7 @@ armada_gem_prime_import(struct drm_device *dev, struct 
dma_buf *buf)
}
}
 
-   attach = dma_buf_attach(buf, dev->dev);
+   attach = dma_buf_attach(&attach_info);
if (IS_ERR(attach))
return ERR_CAST(attach);
 
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index dc079efb3b0f..1dd70fc095ee 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -710,6 +710,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct 
drm_device *dev,
struct dma_buf *dma_buf,
struct device *attach_dev)
 {
+   struct dma_buf_attach_info attach_info = {
+   .dev = attach_dev,
+   .dmabuf = dma_buf
+   };
struct dma_buf_attachment *attach;
struct sg_table *sgt;
struct drm_gem_object *obj;
@@ -730,7 +734,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct 
drm_device *dev,
if (!dev->driver->gem_prime_import_sg_table)
return ERR_PTR(-EINVAL);
 
-   attach = dma_buf_attach(dma_buf, attach_dev);
+   attach = dma_buf_attach(&attach_info);
if (IS_ERR(attach))
return ERR_CAST(attach);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 5a101a9462d8..978054157c64 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -277,6 

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-27 Thread Christian König

Am 26.08.2018 um 10:40 schrieb Tetsuo Handa:

On 2018/08/24 22:52, Michal Hocko wrote:

@@ -180,11 +180,15 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
   */
  static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
  {
-   if (blockable)
-   mutex_lock(&amn->read_lock);
-   else if (!mutex_trylock(&amn->read_lock))
-   return -EAGAIN;
-
+   /*
+* We can take sleepable lock even on !blockable mode because
+* read_lock is only ever take from this path and the notifier
+* lock never really sleeps. In fact the only reason why the
+* later is sleepable is because the notifier itself might sleep
+* in amdgpu_mn_invalidate_node but blockable mode is handled
+* before calling into that path.
+*/
+   mutex_lock(&amn->read_lock);
if (atomic_inc_return(&amn->recursion) == 1)
down_read_non_owner(&amn->lock);
mutex_unlock(&amn->read_lock);


I'm not following. Why don't we need to do it like below (given that
nobody except amdgpu_mn_read_lock() holds ->read_lock), since e.g.
drm_sched_fence_create() from drm_sched_job_init() from amdgpu_cs_submit()
does GFP_KERNEL memory allocation with ->lock held for write?


That's a bug which needs to be fixed separately.

Allocating memory with GFP_KERNEL while holding a lock which is also 
taken in the reclaim code path is illegal no matter what you do.


Patches to fix this are already on the appropriate mailing list and will 
be pushed upstream today.


Regards,
Christian.



diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index e55508b..e1cb344 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -64,8 +64,6 @@
   * @node: hash table node to find structure by adev and mn
   * @lock: rw semaphore protecting the notifier nodes
   * @objects: interval tree containing amdgpu_mn_nodes
- * @read_lock: mutex for recursive locking of @lock
- * @recursion: depth of recursion
   *
   * Data for each amdgpu device and process address space.
   */
@@ -85,8 +83,6 @@ struct amdgpu_mn {
/* objects protected by lock */
struct rw_semaphore lock;
struct rb_root_cached   objects;
-   struct mutexread_lock;
-   atomic_trecursion;
  };
  
  /**

@@ -181,14 +177,9 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
  static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
  {
if (blockable)
-   mutex_lock(&amn->read_lock);
-   else if (!mutex_trylock(&amn->read_lock))
+   down_read(&amn->lock);
+   else if (!down_read_trylock(&amn->lock))
return -EAGAIN;
-
-   if (atomic_inc_return(&amn->recursion) == 1)
-   down_read_non_owner(&amn->lock);
-   mutex_unlock(&amn->read_lock);
-
return 0;
  }
  
@@ -199,8 +190,7 @@ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)

   */
  static void amdgpu_mn_read_unlock(struct amdgpu_mn *amn)
  {
-   if (atomic_dec_return(&amn->recursion) == 0)
-   up_read_non_owner(&amn->lock);
+   up_read(&amn->lock);
  }
  
  /**

@@ -410,8 +400,6 @@ struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev,
amn->type = type;
amn->mn.ops = &amdgpu_mn_ops[type];
amn->objects = RB_ROOT_CACHED;
-   mutex_init(&amn->read_lock);
-   atomic_set(&amn->recursion, 0);
  
  	r = __mmu_notifier_register(&amn->mn, mm);

if (r)



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 15:40 schrieb Michal Hocko:

On Fri 24-08-18 15:28:33, Christian König wrote:

Am 24.08.2018 um 15:24 schrieb Michal Hocko:

On Fri 24-08-18 15:10:08, Christian König wrote:

Am 24.08.2018 um 15:01 schrieb Michal Hocko:

On Fri 24-08-18 14:52:26, Christian König wrote:

Am 24.08.2018 um 14:33 schrieb Michal Hocko:

[...]

Thinking about it some more, I can imagine that a notifier callback which
performs an allocation might trigger a memory reclaim and that in turn
might trigger a notifier to be invoked and recurse. But notifiers
shouldn't really allocate memory. They are called from deep MM code
paths and this would be extremely deadlock prone. Maybe Jerome can come
up with some more realistic scenario. If not then I would propose to
simplify the locking here. We have lockdep to catch self deadlocks and
it is always better to handle a specific issue rather than having code
without a clear indication of how it can recurse.

Well I agree that we should probably fix that, but I have some concerns
about removing the existing workaround.

See we added that to get rid of a real problem in a customer environment
and I don't want that to show up again.

It would really help to know more about that case and fix it properly
rather than work around it like this. Anyway, let me think how to handle
the non-blocking notifier invocation then. I was not able to come up
with anything remotely sane yet.

With avoiding allocating memory in the write lock path I don't see an issue
any more with that.

All the write lock path does now is add items to linked lists,
arrays, etc.

Can we change it to a non-sleepable lock then?

No, the write side doesn't sleep any more, but the read side does.

See amdgpu_mn_invalidate_node() and that is where you actually need to
handle the non-blocking flag correctly.

Ohh, right you are. We already handle that by bailing out before calling
amdgpu_mn_invalidate_node in !blockable mode.


Yeah, that is sufficient.

It could be improved because we have something like a 90% chance that
amdgpu_mn_invalidate_node() actually doesn't need to do anything.


But I can take care of that when the patch set has landed.


So does this look good to
you?


Yeah, that looks perfect to me. Reviewed-by: Christian König 



Thanks,
Christian.



diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index e55508b39496..48fa152231be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -180,11 +180,15 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
   */
  static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
  {
-   if (blockable)
-   mutex_lock(&amn->read_lock);
-   else if (!mutex_trylock(&amn->read_lock))
-   return -EAGAIN;
-
+   /*
+* We can take sleepable lock even on !blockable mode because
+* read_lock is only ever take from this path and the notifier
+* lock never really sleeps. In fact the only reason why the
+* later is sleepable is because the notifier itself might sleep
+* in amdgpu_mn_invalidate_node but blockable mode is handled
+* before calling into that path.
+*/
+   mutex_lock(&amn->read_lock);
if (atomic_inc_return(&amn->recursion) == 1)
down_read_non_owner(&amn->lock);
mutex_unlock(&amn->read_lock);



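For reference, the bail-out Christian mentions above (returning before
amdgpu_mn_invalidate_node() can sleep waiting for the hardware) takes
roughly this shape in the gfx notifier — a sketch along the lines of what
was eventually merged, not a verbatim quote:

	static int amdgpu_mn_invalidate_range_start_gfx(struct mmu_notifier *mn,
							struct mm_struct *mm,
							unsigned long start,
							unsigned long end,
							bool blockable)
	{
		struct amdgpu_mn *amn = container_of(mn, struct amdgpu_mn, mn);
		struct interval_tree_node *it;

		/* notifier ranges are end-exclusive, interval trees inclusive */
		end -= 1;

		if (amdgpu_mn_read_lock(amn, blockable))
			return -EAGAIN;

		it = interval_tree_iter_first(&amn->objects, start, end);
		while (it) {
			struct amdgpu_mn_node *node;

			if (!blockable) {
				/* back off before the node invalidation can
				 * sleep waiting for the hardware */
				amdgpu_mn_read_unlock(amn);
				return -EAGAIN;
			}

			node = container_of(it, struct amdgpu_mn_node, it);
			it = interval_tree_iter_next(it, start, end);

			amdgpu_mn_invalidate_node(node, start, end);
		}

		/* the matching read unlock happens in invalidate_range_end() */
		return 0;
	}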
___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 15:24 schrieb Michal Hocko:

On Fri 24-08-18 15:10:08, Christian König wrote:

Am 24.08.2018 um 15:01 schrieb Michal Hocko:

On Fri 24-08-18 14:52:26, Christian König wrote:

Am 24.08.2018 um 14:33 schrieb Michal Hocko:

[...]

Thinking about it some more, I can imagine that a notifier callback which
performs an allocation might trigger a memory reclaim and that in turn
might trigger a notifier to be invoked and recurse. But notifiers
shouldn't really allocate memory. They are called from deep MM code
paths and this would be extremely deadlock prone. Maybe Jerome can come
up with some more realistic scenario. If not then I would propose to
simplify the locking here. We have lockdep to catch self deadlocks and
it is always better to handle a specific issue rather than having code
without a clear indication of how it can recurse.

Well I agree that we should probably fix that, but I have some concerns
about removing the existing workaround.

See we added that to get rid of a real problem in a customer environment
and I don't want that to show up again.

It would really help to know more about that case and fix it properly
rather than work around it like this. Anyway, let me think how to handle
the non-blocking notifier invocation then. I was not able to come up
with anything remotely sane yet.

With avoiding allocating memory in the write lock path I don't see an issue
any more with that.

All the write lock path does now is add items to linked lists,
arrays, etc.

Can we change it to a non-sleepable lock then?


No, the write side doesn't sleep any more, but the read side does.

See amdgpu_mn_invalidate_node() and that is where you actually need to 
handle the non-blocking flag correctly.


Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 15:01 schrieb Michal Hocko:

On Fri 24-08-18 14:52:26, Christian König wrote:

Am 24.08.2018 um 14:33 schrieb Michal Hocko:

[...]

Thinking about it some more, I can imagine that a notifier callback which
performs an allocation might trigger a memory reclaim and that in turn
might trigger a notifier to be invoked and recurse. But notifiers
shouldn't really allocate memory. They are called from deep MM code
paths and this would be extremely deadlock prone. Maybe Jerome can come
up with some more realistic scenario. If not then I would propose to
simplify the locking here. We have lockdep to catch self deadlocks and
it is always better to handle a specific issue rather than having code
without a clear indication of how it can recurse.

Well I agree that we should probably fix that, but I have some concerns
about removing the existing workaround.

See we added that to get rid of a real problem in a customer environment
and I don't want that to show up again.

It would really help to know more about that case and fix it properly
rather than work around it like this. Anyway, let me think how to handle
the non-blocking notifier invocation then. I was not able to come up
with anything remotely sane yet.


With avoiding allocating memory in the write lock path I don't see an 
issue any more with that.


All the write lock path does now is add items to linked lists,
arrays, etc.


So there is no more blocking involved here and the read lock side should 
be able to grab the lock immediately.


Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 14:33 schrieb Michal Hocko:

On Fri 24-08-18 14:18:44, Christian König wrote:

Am 24.08.2018 um 14:03 schrieb Michal Hocko:

On Fri 24-08-18 13:57:52, Christian König wrote:

Am 24.08.2018 um 13:52 schrieb Michal Hocko:

On Fri 24-08-18 13:43:16, Christian König wrote:

[...]

That won't work like this: there might be multiple
invalidate_range_start()/invalidate_range_end() pairs open at the same time.
E.g. the lock might be taken recursively and that is illegal for a
rw_semaphore.

I am not sure I follow. Are you saying that one invalidate_range might
trigger another one from the same path?

No, but what can happen is:

invalidate_range_start(A,B);
invalidate_range_start(C,D);
...
invalidate_range_end(C,D);
invalidate_range_end(A,B);

Grabbing the read lock twice would be illegal in this case.

I am sorry but I still do not follow. What is the context the two are
called from?

I don't have the slightest idea.


Can you give me an example? I simply do not see it in the
code, mostly because I am not familiar with it.

I'm neither.

We stumbled over that by pure observation and after discussing the problem
with Jerome came up with this solution.

No idea where exactly that case comes from, but I can confirm that it indeed
happens.

Thinking about it some more, I can imagine that a notifier callback which
performs an allocation might trigger a memory reclaim and that in turn
might trigger a notifier to be invoked and recurse. But notifiers
shouldn't really allocate memory. They are called from deep MM code
paths and this would be extremely deadlock prone. Maybe Jerome can come
up with some more realistic scenario. If not then I would propose to
simplify the locking here. We have lockdep to catch self deadlocks and
it is always better to handle a specific issue rather than having code
without a clear indication of how it can recurse.


Well I agree that we should probably fix that, but I have some concerns
about removing the existing workaround.


See we added that to get rid of a real problem in a customer environment
and I don't want that to show up again.


In the meantime I've send out a fix to avoid allocating memory while 
holding the mn_lock.


Thanks for pointing that out,
Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 14:03 schrieb Michal Hocko:

On Fri 24-08-18 13:57:52, Christian König wrote:

Am 24.08.2018 um 13:52 schrieb Michal Hocko:

On Fri 24-08-18 13:43:16, Christian König wrote:

[...]

That won't work like this: there might be multiple
invalidate_range_start()/invalidate_range_end() pairs open at the same time.
E.g. the lock might be taken recursively and that is illegal for a
rw_semaphore.

I am not sure I follow. Are you saying that one invalidate_range might
trigger another one from the same path?

No, but what can happen is:

invalidate_range_start(A,B);
invalidate_range_start(C,D);
...
invalidate_range_end(C,D);
invalidate_range_end(A,B);

Grabbing the read lock twice would be illegal in this case.

I am sorry but I still do not follow. What is the context the two are
called from?


I don't have the slightest idea.


Can you give me an example? I simply do not see it in the
code, mostly because I am not familiar with it.


I'm neither.

We stumbled over that by pure observation and after discussing the 
problem with Jerome came up with this solution.


No idea where exactly that case comes from, but I can confirm that it 
indeed happens.


Regards,
Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 13:52 schrieb Michal Hocko:

On Fri 24-08-18 13:43:16, Christian König wrote:

Am 24.08.2018 um 13:32 schrieb Michal Hocko:

On Fri 24-08-18 19:54:19, Tetsuo Handa wrote:

Two more worries for this patch.




--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -178,12 +178,18 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
*
* @amn: our notifier
*/
-static void amdgpu_mn_read_lock(struct amdgpu_mn *amn)
+static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
   {
-   mutex_lock(&amn->read_lock);
+   if (blockable)
+   mutex_lock(&amn->read_lock);
+   else if (!mutex_trylock(&amn->read_lock))
+   return -EAGAIN;
+
  if (atomic_inc_return(&amn->recursion) == 1)
  down_read_non_owner(&amn->lock);

Why don't we need to use trylock here if blockable == false?
We want a comment explaining why it is safe to use a blocking lock here.

Hmm, I am pretty sure I have checked the code but it was quite confusing
so I might have missed something. Double checking now, it seems that
this read_lock is not used anywhere else and it is not _the_ lock we are
interested about. It is the amn->lock (amdgpu_mn_lock) which matters as
it is taken in exclusive mode for expensive operations.

The write side of the lock is only taken in the command submission IOCTL.

So you actually don't need to change anything here (even the proposed
changes are overkill) since we can't tear down the struct_mm while an IOCTL
is still using it.

I am not so sure. We are not in the mm destruction phase yet. This is
mostly about the oom context which might fire right during the IOCTL. If
any path which is holding the write lock blocks for an unbounded amount
of time, or even worse allocates memory, then we are screwed. So we need
to back off when blockable = false.


Oh, yeah good point. Haven't thought about that possibility.




Is that correct Christian? If this is correct then we need to update the
locking here. I am struggling to grasp the ref counting part. Why cannot
all readers simply take the lock rather than rely on somebody else to
take it? 1ed3d2567c800 didn't really help me to understand the locking
scheme here so any help would be appreciated.

That won't work like this: there might be multiple
invalidate_range_start()/invalidate_range_end() pairs open at the same time.
E.g. the lock might be taken recursively and that is illegal for a
rw_semaphore.

I am not sure I follow. Are you saying that one invalidate_range might
trigger another one from the same path?


No, but what can happen is:

invalidate_range_start(A,B);
invalidate_range_start(C,D);
...
invalidate_range_end(C,D);
invalidate_range_end(A,B);

Grabbing the read lock twice would be illegal in this case.
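
Spelled out, with a fair rw_semaphore the nesting above deadlocks as soon
as a writer queues in between (a generic sketch, not driver code):

	down_read(&amn->lock);	/* invalidate_range_start(A,B) */
	/* meanwhile another thread: down_write(&amn->lock) blocks,
	 * and also blocks all later readers queued behind it */
	down_read(&amn->lock);	/* invalidate_range_start(C,D): waits on
				 * the writer, which waits on us -> deadlock */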

Regards,
Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-08-24 Thread Christian König

Am 24.08.2018 um 13:32 schrieb Michal Hocko:

On Fri 24-08-18 19:54:19, Tetsuo Handa wrote:

Two more worries for this patch.




--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -178,12 +178,18 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
   *
   * @amn: our notifier
   */
-static void amdgpu_mn_read_lock(struct amdgpu_mn *amn)
+static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
  {
-   mutex_lock(&amn->read_lock);
+   if (blockable)
+   mutex_lock(&amn->read_lock);
+   else if (!mutex_trylock(&amn->read_lock))
+   return -EAGAIN;
+
 if (atomic_inc_return(&amn->recursion) == 1)
 down_read_non_owner(&amn->lock);

Why don't we need to use trylock here if blockable == false?
We want a comment explaining why it is safe to use a blocking lock here.

Hmm, I am pretty sure I have checked the code but it was quite confusing
so I might have missed something. Double checking now, it seems that
this read_lock is not used anywhere else and it is not _the_ lock we are
interested about. It is the amn->lock (amdgpu_mn_lock) which matters as
it is taken in exclusive mode for expensive operations.


The write side of the lock is only taken in the command submission IOCTL.

So you actually don't need to change anything here (even the proposed 
changes are overkill) since we can't tear down the struct_mm while an 
IOCTL is still using it.



Is that correct Christian? If this is correct then we need to update the
locking here. I am struggling to grasp the ref counting part. Why cannot
all readers simply take the lock rather than rely on somebody else to
take it? 1ed3d2567c800 didn't really help me to understand the locking
scheme here so any help would be appreciated.


That won't work like this: there might be multiple
invalidate_range_start()/invalidate_range_end() pairs open at the same 
time. E.g. the lock might be taken recursively and that is illegal for a 
rw_semaphore.


Regards,
Christian.



I am wondering why we cannot do
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index e55508b39496..93034178673d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -180,14 +180,11 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn)
   */
  static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable)
  {
-   if (blockable)
-   mutex_lock(&amn->read_lock);
-   else if (!mutex_trylock(&amn->read_lock))
-   return -EAGAIN;
-
-   if (atomic_inc_return(&amn->recursion) == 1)
-   down_read_non_owner(&amn->lock);
-   mutex_unlock(&amn->read_lock);
+   if (!down_read_trylock(&amn->lock)) {
+   if (!blockable)
+   return -EAGAIN;
+   down_read(&amn->lock);
+   }
  
  	return 0;

  }
@@ -199,8 +196,7 @@ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool 
blockable)
   */
  static void amdgpu_mn_read_unlock(struct amdgpu_mn *amn)
  {
-   if (atomic_dec_return(&amn->recursion) == 0)
-   up_read_non_owner(&amn->lock);
+   up_read(&amn->lock);
  }
  
  /**





___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-02 Thread Christian König

Am 02.07.2018 um 14:35 schrieb Michal Hocko:

On Mon 02-07-18 14:24:29, Christian König wrote:

Am 02.07.2018 um 14:20 schrieb Michal Hocko:

On Mon 02-07-18 14:13:42, Christian König wrote:

Am 02.07.2018 um 13:54 schrieb Michal Hocko:

On Mon 02-07-18 11:14:58, Christian König wrote:

Am 27.06.2018 um 09:44 schrieb Michal Hocko:

This is the v2 of RFC based on the feedback I've received so far. The
code even compiles as a bonus ;) I haven't runtime tested it yet, mostly
because I have no idea how.

Any further feedback is highly appreciated of course.

That sounds like it should work and at least the amdgpu changes now look
good to me on first glance.

Can you split that up further in the usual way? E.g. adding the blockable
flag in one patch and fixing all implementations of the MMU notifier in
follow up patches.

But such code would be broken, no? Ignoring the blockable state will
simply lead to lockups until the fixup parts get applied.

Well, to still be bisectable you only need to get the interface change in
first, fixing the function signatures of the implementations.

That would only work if those functions return -EAGAIN unconditionally.
Otherwise they would pretend to not block while that would be obviously
incorrect. This doesn't sound correct to me.


Then add all the new code to the implementations and last start to actually
use the new interface.

That is a pattern we use regularly and I think it's good practice to do
this.

But we do rely on the proper blockable handling.

Yeah, but you could add the handling only after you have all the
implementations in place. Don't you?

Yeah, but then I would be adding code with no user. And I really
prefer not to do so because then the code is harder to argue about.


Is the split up really worth it? I was thinking about that but had a
hard time ending up with something that would be bisectable. Well, except
for returning -EBUSY until all notifiers are implemented, which I found
confusing.

It at least makes reviewing changes much easier, because as a driver
maintainer I can concentrate on just the stuff related to me.

In addition, when you cause some unrelated side effect in a driver, we
can pinpoint the actual change much more easily later on when the patch
is smaller.


This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd
changes.

If you are worried to give r-b only for those then this can be done even
for larger patches. Just make your Reviewd-by more specific
R-b: name # For BLA BLA

Yeah, possible alternative but more work for me when I review it :)

I definitely do not want to add more work to reviewers and I completely
see how massive "flag days" like these are not popular but I really
didn't find a reasonable way around that would be both correct and
wouldn't add much more churn on the way. So if you really insist then I
would really appreciate a hint on the way to achive the same without any
above downsides.

Well, I don't insist on this. It's just from my point of view that this
patch doesn't need to be one patch, but could be split up.

Well, if there are more people with the same concern I can try to do
that. But if your only concern is to focus on your particular part then
I guess it would be easier both for you and me to simply apply the patch
and use git show $files_for_your_subystem on your end. I have put the
patch to attempts/oom-vs-mmu-notifiers branch to my tree at
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git


Not wanting to block something as important as this, so feel free to add 
an Acked-by: Christian König  to the patch.


Let's rather face the next topic: Any idea how to runtime test this?

I mean I can rather easily provide a test which crashes an AMD GPU, 
which in turn then would mean that the MMU notifier would block forever 
without this patch.


But do you know a way to let the OOM killer kill a specific process?

Regards,
Christian.
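
As for letting the OOM killer kill a specific process: one common way
(not settled in this thread) is to raise the target's oom_score_adj to
the maximum before provoking an OOM, e.g.:

	#include <stdio.h>

	int main(void)
	{
		/* an oom_score_adj of 1000 makes this process the
		 * OOM killer's first pick */
		FILE *f = fopen("/proc/self/oom_score_adj", "w");

		if (f) {
			fputs("1000", f);
			fclose(f);
		}
		/* ... then allocate and touch memory until the OOM
		 * killer fires ... */
		return 0;
	}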

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-02 Thread Christian König

Am 02.07.2018 um 14:20 schrieb Michal Hocko:

On Mon 02-07-18 14:13:42, Christian König wrote:

Am 02.07.2018 um 13:54 schrieb Michal Hocko:

On Mon 02-07-18 11:14:58, Christian König wrote:

Am 27.06.2018 um 09:44 schrieb Michal Hocko:

This is the v2 of RFC based on the feedback I've received so far. The
code even compiles as a bonus ;) I haven't runtime tested it yet, mostly
because I have no idea how.

Any further feedback is highly appreciated of course.

That sounds like it should work and at least the amdgpu changes now look
good to me on first glance.

Can you split that up further in the usual way? E.g. adding the blockable
flag in one patch and fixing all implementations of the MMU notifier in
follow up patches.

But such code would be broken, no? Ignoring the blockable state will
simply lead to lockups until the fixup parts get applied.

Well, to still be bisectable you only need to get the interface change in
first, fixing the function signatures of the implementations.

That would only work if those functions return -EAGAIN unconditionally.
Otherwise they would pretend to not block while that would be obviously
incorrect. This doesn't sound correct to me.


Then add all the new code to the implementations and last start to actually
use the new interface.

That is a pattern we use regularly and I think it's good practice to do
this.

But we do rely on the proper blockable handling.


Yeah, but you could add the handling only after you have all the 
implementations in place. Don't you?



Is the split up really worth it? I was thinking about that but had a
hard time ending up with something that would be bisectable. Well, except
for returning -EBUSY until all notifiers are implemented, which I found
confusing.

It at least makes reviewing changes much easier, because as a driver
maintainer I can concentrate on just the stuff related to me.

In addition, when you cause some unrelated side effect in a driver, we
can pinpoint the actual change much more easily later on when the patch
is smaller.


This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd
changes.

If you are worried to give r-b only for those then this can be done even
for larger patches. Just make your Reviewd-by more specific
R-b: name # For BLA BLA

Yeah, possible alternative but more work for me when I review it :)

I definitely do not want to add more work to reviewers and I completely
see how massive "flag days" like these are not popular but I really
didn't find a reasonable way around that would be both correct and
wouldn't add much more churn on the way. So if you really insist then I
would really appreciate a hint on the way to achieve the same without any
above downsides.


Well, I don't insist on this. It's just from my point of view that this 
patch doesn't need to be one patch, but could be split up.


Could be that I just don't know the code or the consequences of adding 
that well enough to really judge.


Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-02 Thread Christian König

Am 02.07.2018 um 13:54 schrieb Michal Hocko:

On Mon 02-07-18 11:14:58, Christian König wrote:

Am 27.06.2018 um 09:44 schrieb Michal Hocko:

This is the v2 of RFC based on the feedback I've received so far. The
code even compiles as a bonus ;) I haven't runtime tested it yet, mostly
because I have no idea how.

Any further feedback is highly appreciated of course.

That sounds like it should work and at least the amdgpu changes now look
good to me on first glance.

Can you split that up further in the usual way? E.g. adding the blockable
flag in one patch and fixing all implementations of the MMU notifier in
follow up patches.

But such code would be broken, no? Ignoring the blockable state will
simply lead to lockups until the fixup parts get applied.


Well, to still be bisectable you only need to get the interface change
in first, fixing the function signatures of the implementations.


Then add all the new code to the implementations and last start to 
actually use the new interface.


That is a pattern we use regularly and I think it's good practice to do 
this.



Is the split up really worth it? I was thinking about that but had a
hard time ending up with something that would be bisectable. Well, except
for returning -EBUSY until all notifiers are implemented, which I found
confusing.


It at least makes reviewing changes much easier, because as a driver
maintainer I can concentrate on just the stuff related to me.


In addition, when you cause some unrelated side effect in a driver, we
can pinpoint the actual change much more easily later on when the patch
is smaller.





This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd
changes.

If you are worried to give r-b only for those then this can be done even
for larger patches. Just make your Reviewd-by more specific
R-b: name # For BLA BLA


Yeah, possible alternative but more work for me when I review it :)

Regards,
Christian.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-02 Thread Christian König

Am 27.06.2018 um 09:44 schrieb Michal Hocko:

This is the v2 of RFC based on the feedback I've received so far. The
code even compiles as a bonus ;) I haven't runtime tested it yet, mostly
because I have no idea how.

Any further feedback is highly appreciated of course.


That sounds like it should work and at least the amdgpu changes now look 
good to me on first glance.


Can you split that up further in the usual way? E.g. adding the 
blockable flag in one patch and fixing all implementations of the MMU 
notifier in follow up patches.


This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd 
changes.


Thanks,
Christian.


---
 From ec9a7241bf422b908532c4c33953b0da2655ad05 Mon Sep 17 00:00:00 2001
From: Michal Hocko 
Date: Wed, 20 Jun 2018 15:03:20 +0200
Subject: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep. That can result in selecting a
new oom victim prematurely because the previous one still hasn't torn
its memory down yet.

We can do much better though. Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.
Moreover majority of notifiers only care about a portion of the address
space and there is absolutely zero reason to fail when we are unmapping an
unrelated range. Many notifiers do really block and wait for HW which is
harder to handle and we have to bail out though.

This patch handles the low hanging fruit. __mmu_notifier_invalidate_range_start
gets a blockable flag and callbacks are not allowed to sleep if the
flag is set to false. This is achieved by using trylock instead of the
sleepable lock for most callbacks and continue as long as we do not
block down the call chain.

I think we can improve that even further because there is a common
pattern to do a range lookup first and then do something about that.
The first part can be done without a sleeping lock in most cases AFAICS.

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode. A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.

Changes since rfc v1
- gpu notifiers can sleep while waiting for HW (evict_process_queues_cpsch
   on a lock and amdgpu_mn_invalidate_node on an unbound timeout); make sure
   we bail out when we have an intersecting range for starters
- note that a notifier failed to the log for easier debugging
- back off early in ib_umem_notifier_invalidate_range_start if the
   callback is called
- mn_invl_range_start waits for completion down the unmap_grant_pages
   path so we have to back off early on overlapping ranges

Cc: "David (ChunMing) Zhou" 
Cc: Paolo Bonzini 
Cc: "Radim Krčmář" 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: David Airlie 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Doug Ledford 
Cc: Jason Gunthorpe 
Cc: Mike Marciniszyn 
Cc: Dennis Dalessandro 
Cc: Sudeep Dutt 
Cc: Ashutosh Dixit 
Cc: Dimitri Sivanich 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: "Jérôme Glisse" 
Cc: Andrea Arcangeli 
Cc: Felix Kuehling 
Cc: k...@vger.kernel.org (open list:KERNEL VIRTUAL MACHINE FOR X86 (KVM/x86))
Cc: linux-ker...@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT AND 
64-BIT))
Cc: amd-...@lists.freedesktop.org (open list:RADEON and AMDGPU DRM DRIVERS)
Cc: dri-de...@lists.freedesktop.org (open list:DRM DRIVERS)
Cc: intel-...@lists.freedesktop.org (open list:INTEL DRM DRIVERS (excluding 
Poulsbo, Moorestow...)
Cc: linux-r...@vger.kernel.org (open list:INFINIBAND SUBSYSTEM)
Cc: xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR INTERFACE)
Cc: linux...@kvack.org (open list:HMM - Heterogeneous Memory Management)
Reported-by: David Rientjes 
Signed-off-by: Michal Hocko 
---
  arch/x86/kvm/x86.c  |  7 ++--
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c  | 43 +++-
  drivers/gpu/drm/i915/i915_gem_userptr.c | 13 ++--
  drivers/gpu/drm/radeon/radeon_mn.c  | 22 +++--
  drivers/infiniband/core/umem_odp.c  | 33 +++
  drivers/infiniband/hw/hfi1/mmu_rb.c | 11 ---
  drivers/infiniband/hw/mlx5/odp.c|  2 +-
  drivers/misc/mic/scif/scif_dma.c|  7 ++--
  drivers/misc/sgi-gru/grutlbpurge.c  |  7 ++--
  drivers/xen/gntdev.c| 44 -
  include/linux/kvm_host.h|  4 +--
  include/linux/mmu_notifier.h| 35 +++-
  include/linux/oom.h  

Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-06-22 Thread Christian König

Hi Michal,

[Adding Felix as well]

Well first of all you have a misconception why at least the AMD graphics 
driver needs to be able to sleep in an MMU notifier: we need to sleep 
because we need to wait for hardware operations to finish and *NOT* 
because we need to wait for locks.


I'm not sure if your flag now means that you generally can't sleep in 
MMU notifiers any more, but if that's the case at least AMD hardware 
will break badly. In our case the approach of waiting for a short time 
for the process to be reaped and then select another victim actually 
sounds like the right thing to do.


What we also already try to do is to abort hardware operations with the 
address space when we detect that the process is dying, but that can 
certainly be improved.
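
For a driver in that situation the workable compromise is presumably to
fail fast in nonblocking context and only do the full (sleeping) wait
for the hardware otherwise; a rough sketch, with the foo_* helpers as
hypothetical driver internals:

static int foo_invalidate_range_start(struct mmu_notifier *mn,
				      struct mm_struct *mm,
				      unsigned long start,
				      unsigned long end,
				      bool blockable)
{
	/*
	 * Finishing the invalidation means waiting for in-flight DMA,
	 * i.e. for fences, and that wait cannot be made non-sleeping.
	 */
	if (!blockable && foo_range_has_userptr_bos(mn, start, end))
		return -EAGAIN;

	foo_wait_for_hw_and_unmap(mn, start, end);	/* may sleep */
	return 0;
}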


Regards,
Christian.

Am 22.06.2018 um 17:02 schrieb Michal Hocko:

From: Michal Hocko 

There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks. Currently we simply back off and mark an
oom victim with blockable mmu notifiers as done after a short sleep.
That can result in selecting a new oom victim prematurely because the
previous one still hasn't torn its memory down yet.

We can do much better though. Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.
Moreover most notifiers only care about a portion of the address
space. This patch handles the first part of the problem.
__mmu_notifier_invalidate_range_start gets a blockable flag and
callbacks are not allowed to sleep if the flag is set to false. This is
achieved by using trylock instead of the sleepable lock for most
callbacks. I think we can improve that even further because there is
a common pattern to do a range lookup first and then do something about
that. The first part can be done without a sleeping lock I presume.

Anyway, what does the oom_reaper do with all that? We do not have to
fail right away. We simply retry if there is at least one notifier which
couldn't make any progress. A retry loop is already implemented to wait
for the mmap_sem and this is basically the same thing.
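
On the reaper side the idea amounts to treating a notifier that refuses
to run without sleeping the same way as a contended mmap_sem; very
roughly (oom_reap_vma() is a hypothetical helper condensing what happens
inside the reaper's per-vma loop, not the actual oom_kill.c hunk):

#include <linux/mmu_notifier.h>

static bool oom_reap_vma(struct mm_struct *mm, unsigned long start,
			 unsigned long end)
{
	/* Nonblocking invalidation: a notifier that would have to
	 * sleep returns an error instead of blocking the reaper. */
	if (__mmu_notifier_invalidate_range_start(mm, start, end, false))
		return false;	/* caller retries later, as for mmap_sem */

	/* ... unmap_page_range() and friends ... */

	mmu_notifier_invalidate_range_end(mm, start, end);
	return true;
}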

Cc: "David (ChunMing) Zhou" 
Cc: Paolo Bonzini 
Cc: "Radim Krčmář" 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: David Airlie 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: Doug Ledford 
Cc: Jason Gunthorpe 
Cc: Mike Marciniszyn 
Cc: Dennis Dalessandro 
Cc: Sudeep Dutt 
Cc: Ashutosh Dixit 
Cc: Dimitri Sivanich 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: "Jérôme Glisse" 
Cc: Andrea Arcangeli 
Cc: k...@vger.kernel.org (open list:KERNEL VIRTUAL MACHINE FOR X86 (KVM/x86))
Cc: linux-ker...@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT AND 
64-BIT))
Cc: amd-...@lists.freedesktop.org (open list:RADEON and AMDGPU DRM DRIVERS)
Cc: dri-de...@lists.freedesktop.org (open list:DRM DRIVERS)
Cc: intel-...@lists.freedesktop.org (open list:INTEL DRM DRIVERS (excluding 
Poulsbo, Moorestow...)
Cc: linux-r...@vger.kernel.org (open list:INFINIBAND SUBSYSTEM)
Cc: xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR INTERFACE)
Cc: linux...@kvack.org (open list:HMM - Heterogeneous Memory Management)
Reported-by: David Rientjes 
Signed-off-by: Michal Hocko 
---

Hi,
this is an RFC and not tested at all. I am not very familiar with the
mmu notifier semantics, so this is a crude attempt to achieve
what I need basically. It might be completely wrong but I would like
to discuss what would be a better way if that is the case.

get_maintainers gave me quite a large list of people to CC, so I had to trim
it down. If you think I have forgotten somebody, please let me know.

Any feedback is highly appreciated.

  arch/x86/kvm/x86.c  |  7 --
  drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c  | 33 +++--
  drivers/gpu/drm/i915/i915_gem_userptr.c | 10 +---
  drivers/gpu/drm/radeon/radeon_mn.c  | 15 ---
  drivers/infiniband/core/umem_odp.c  | 15 ---
  drivers/infiniband/hw/hfi1/mmu_rb.c |  7 --
  drivers/misc/mic/scif/scif_dma.c|  7 --
  drivers/misc/sgi-gru/grutlbpurge.c  |  7 --
  drivers/xen/gntdev.c| 14 ---
  include/linux/kvm_host.h|  2 +-
  include/linux/mmu_notifier.h| 15 +--
  mm/hmm.c|  7 --
  mm/mmu_notifier.c   | 15 ---
  mm/oom_kill.c   | 29 +++---
  virt/kvm/kvm_main.c | 12 ++---
  15 files changed, 137 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6bcecc325e7e..ac08f5d711be 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(

Re: [Xen-devel] [PATCH v3] xen/balloon: Mark unallocated host memory as UNUSABLE

2017-12-24 Thread Christian König

Am 20.12.2017 um 15:05 schrieb Boris Ostrovsky:

Commit f5775e0b6116 ("x86/xen: discard RAM regions above the maximum
reservation") left host memory not assigned to dom0 as available for
memory hotplug.

Unfortunately this also meant that those regions could be used by
others. Specifically, commit fa564ad96366 ("x86/PCI: Enable a 64bit BAR
on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those
addresses as MMIO.

To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus
effectively reverting f5775e0b6116) and keep track of that region as
a hostmem resource that can be used for the hotplug.

Signed-off-by: Boris Ostrovsky 


Acked-by: Christian König 


---
Changes in v3:
* Use PFN_PHYS
* Replace kzalloc with kmalloc
* Declare arch_xen_balloon_init prototype in balloon.h
* Rename resources (s/memory/RAM/)
* Clarify (I think) comment when populating hostmem_resource
* Print open-ended interval on insert_resource() error
* Constify declaration of struct e820_entry *entry

  arch/x86/xen/enlighten.c | 81 
  arch/x86/xen/setup.c |  6 ++--
  drivers/xen/balloon.c| 65 --
  include/xen/balloon.h|  5 +++
  4 files changed, 144 insertions(+), 13 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index d669e9d..c9081c6 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1,8 +1,12 @@
+#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
+#include 
+#endif
  #include 
  #include 
  
  #include 

  #include 
+#include 
  
  #include 

  #include 
@@ -331,3 +335,80 @@ void xen_arch_unregister_cpu(int num)
  }
  EXPORT_SYMBOL(xen_arch_unregister_cpu);
  #endif
+
+#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG
+void __init arch_xen_balloon_init(struct resource *hostmem_resource)
+{
+   struct xen_memory_map memmap;
+   int rc;
+   unsigned int i, last_guest_ram;
+   phys_addr_t max_addr = PFN_PHYS(max_pfn);
+   struct e820_table *xen_e820_table;
+   const struct e820_entry *entry;
+   struct resource *res;
+
+   if (!xen_initial_domain())
+   return;
+
+   xen_e820_table = kmalloc(sizeof(*xen_e820_table), GFP_KERNEL);
+   if (!xen_e820_table)
+   return;
+
+   memmap.nr_entries = ARRAY_SIZE(xen_e820_table->entries);
+   set_xen_guest_handle(memmap.buffer, xen_e820_table->entries);
+   rc = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap);
+   if (rc) {
+   pr_warn("%s: Can't read host e820 (%d)\n", __func__, rc);
+   goto out;
+   }
+
+   last_guest_ram = 0;
+   for (i = 0; i < memmap.nr_entries; i++) {
+   if (xen_e820_table->entries[i].addr >= max_addr)
+   break;
+   if (xen_e820_table->entries[i].type == E820_TYPE_RAM)
+   last_guest_ram = i;
+   }
+
+   entry = &xen_e820_table->entries[last_guest_ram];
+   if (max_addr >= entry->addr + entry->size)
+   goto out; /* No unallocated host RAM. */
+
+   hostmem_resource->start = max_addr;
+   hostmem_resource->end = entry->addr + entry->size;
+
+   /*
+* Mark non-RAM regions between the end of dom0 RAM and end of host RAM
+* as unavailable. The rest of that region can be used for hotplug-based
+* ballooning.
+*/
+   for (; i < memmap.nr_entries; i++) {
+   entry = &xen_e820_table->entries[i];
+
+   if (entry->type == E820_TYPE_RAM)
+   continue;
+
+   if (entry->addr >= hostmem_resource->end)
+   break;
+
+   res = kzalloc(sizeof(*res), GFP_KERNEL);
+   if (!res)
+   goto out;
+
+   res->name = "Unavailable host RAM";
+   res->start = entry->addr;
+   res->end = (entry->addr + entry->size < hostmem_resource->end) ?
+   entry->addr + entry->size : hostmem_resource->end;
+   rc = insert_resource(hostmem_resource, res);
+   if (rc) {
+   pr_warn("%s: Can't insert [%llx - %llx) (%d)\n",
+   __func__, res->start, res->end, rc);
+   kfree(res);
+   goto out;
+   }
+   }
+
+ out:
+   kfree(xen_e820_table);
+}
+#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index c114ca7..6e0d208 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -808,7 +808,6 @@ char * __init xen_memory_setup(void)
addr = xen_e820_table.entries[0].addr;
size = xen_e820_table.entries[0].size;
while (i < xen_e820_table.nr_entries) 

Re: [Xen-devel] [PATCH] x86/PCI: limit the size of the 64bit BAR to 256GB

2017-12-12 Thread Christian König

Am 12.12.2017 um 19:12 schrieb Bjorn Helgaas:

[+cc Boris, Juergen, xen-devel]

On Mon, Dec 11, 2017 at 04:04:52PM +0100, Christian König wrote:

Xen hides a bit of system memory from the OS for its own purpose by
intercepting e820. This memory is unfortunately not reported as
reserved, but rather completely invisible.

Avoid this address space collision and possible similar problems by
limiting the size of the newly allocated root hub window to 256GB which
should be sufficient for the short term.

It sounds like Boris is working on a more complete fix, so I'm going
to drop this for now.  This changelog includes a few more details, but
I think it makes implicit assumptions about where memory and holes
are and how big they are, and it's still not clear why 256GB is the
right number.  Is it connected to the expected size of the BAR, or
related somehow to the size of the hole?


256GB was just an arbitrary number I thought should work for at least 
my use case.


And yes Boris is working on a more complete and cleaner fix. I agree 
that we can completely drop my patch for now.



If we need this as a short-term workaround, we can do that, but I
would like to include a reference to f5775e0b6116 ("x86/xen: discard
RAM regions above the maximum reservation") and somehow make what's
going on here a little more explicit.


That patch looks more like it only applies to Xen guests, but not dom0. 
But take that with a grain of salt, I really don't know anything about 
that code.


Christian.




Signed-off-by: Christian König 
---
  arch/x86/pci/fixup.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 8f86060f5cf6..ed8bc6ab0573 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -702,7 +702,7 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
res->name = "PCI Bus :00";
res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
-	res->start = 0x100000000ull;
+	res->start = 0xbd00000000ull;
	res->end = 0xfd00000000ull - 1;
  
  	/* Just grab the free area behind system memory for this */

--
2.11.0
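
For reference, the arithmetic behind the 256GB figure (using the window
constants as shown in the diff above): 0xfd00000000 - 0xbd00000000 =
0x4000000000 bytes = 2^38 bytes = 256 GiB, i.e. the fixup now claims
only the 256GB directly below the 0xfd00000000 boundary instead of
everything from 4GB (0x100000000) upwards.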





Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-11-28 Thread Christian König

Am 28.11.2017 um 11:53 schrieb Jan Beulich:

On 28.11.17 at 11:17,  wrote:

Am 28.11.2017 um 10:46 schrieb Jan Beulich:

On 28.11.17 at 10:12,  wrote:

In theory the BIOS would search for address space and wouldn't find
anything, so the hotplug operation should fail even before it reaches
the kernel in the first place.

How would the BIOS know what the OS does or plans to do?

As far as I know the ACPI BIOS should work directly with the register
content.

So when we update the register content to enable the MMIO decoding the
BIOS should know that as well.

I'm afraid I don't follow: During memory hotplug, surely you don't
expect the BIOS to do a PCI bus scan? Plus even if it did, it would
be racy - some device could, at this very moment, have memory
decoding disabled, just for the OS to re-enable it a millisecond
later. Yet looking at BAR values is meaningless when memory
decode of a device is disabled.


No, sorry you misunderstood me. The PCI bus is not even involved here.

In AMD Family CPUs you have four main types of address space routed by 
the NB:

1.  Memory space targeting system DRAM.
2.  Memory space targeting IO (MMIO).
3.  IO space.
4.  Configuration space.

See section "2.8.2 NB Routing" in the BIOS and Kernel Developer’s Guide 
(https://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf).


Long story short, you have fixed addresses for configuration and legacy IO 
(VGA for example) and then configurable memory space for DRAM and MMIO.


What the ACPI BIOS does (or at least should do) is take a look at the 
registers to find space during memory hotplug.


Now in theory the MMIO space should be configurable by similar ACPI BIOS 
functions, but unfortunately most BIOSes don't enable that function 
because it can break some older Windows versions.


So what we do here is just what the BIOS should have provided in the 
first place.
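
Stripped down, the OS-side fixup works on those same registers: it scans
the D18F1 MMIO base/limit pairs for an unused routing entry and then
programs it. A simplified illustration (offsets per section 2.8.2 of the
BKDG; this is a sketch of the idea, not the exact fixup code):

#include <linux/errno.h>
#include <linux/pci.h>

static int find_free_mmio_pair(struct pci_dev *nb_misc)
{
	unsigned int i;
	u32 base;

	for (i = 0; i < 8; i++) {
		/* D18F1x80+8*i: MMIO base register; bits 1:0 are the
		 * RE/WE (read/write enable) bits of routing entry i. */
		pci_read_config_dword(nb_misc, 0x80 + i * 8, &base);
		if (!(base & 0x3))
			return i;	/* decoding disabled: entry free */
	}
	return -ENOSPC;		/* all routing entries in use */
}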



I think
it's the other way around - the OS needs to avoid using any regions
for MMIO which are marked as hotpluggable in SRAT.

I was under the impression that this is exactly what
acpi_numa_memory_affinity_init() does.

Perhaps, except that (when I last looked) insufficient state is
(was) being recorded to have that information readily available
at the time MMIO space above 4Gb needs to be allocated for
some device.


That was also my concern, but in the most recent version I'm 
intentionally doing this fixup very late after all the PCI setup is 
already done.


This way the extra address space is only available for devices which are 
added by PCI hotplug or for resizing BARs on the fly (which is the use 
case I'm interested in).



Since there is
no vNUMA yet for Xen Dom0, that would need special handling.

I think that the problem is rather that SRAT is NUMA specific and if I'm
not totally mistaken the content is ignored when NUMA support isn't
compiled into the kernel.

When Xen steals some memory from Dom0 by hooking itself into the e820
call then I would say the cleanest way is to report this memory in e820
as reserved as well. But take that with a grain of salt, I'm seriously
not a Xen expert.

The E820 handling in PV Linux is all fake anyway - there's a single
chunk of memory given to a PV guest (including Dom0), contiguous
in what PV guests know as "physical address space" (not to be
mixed up with "machine address space", which is where MMIO
needs to be allocated from). Xen code in the kernel then mimics
an E820 matching the host one, moving around pieces of memory
in physical address space if necessary.


Good to know.


Since Dom0 knows the machine E820, MMIO allocation shouldn't
need to be much different there from the non-Xen case.


Yes, completely agree.

I think even if we don't do MMIO allocation with this fixup, letting the 
kernel in Dom0 know which memory/address space regions are in use is 
still a good idea.


Otherwise we will run into exactly the same problem when we do the MMIO 
allocation with an ACPI method and that is certainly going to come in 
the next BIOS generations because Microsoft is pushing for it.


Regards,
Christian.



Jan





Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-11-28 Thread Christian König

Am 27.11.2017 um 19:30 schrieb Boris Ostrovsky:

On 11/23/2017 09:12 AM, Boris Ostrovsky wrote:


On 11/23/2017 03:11 AM, Christian König wrote:

Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky:

On 11/22/2017 11:54 AM, Christian König wrote:

Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:

On 11/22/2017 05:09 AM, Christian König wrote:

Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:

On 11/21/2017 08:34 AM, Christian König wrote:

Hi Boris,

attached are two patches.

The first one is a trivial fix for the infinite loop issue, it now
correctly aborts the fixup when it can't find address space for
the
root window.

The second is a workaround for your board. It simply checks if
there
is exactly one Processor Function to apply this fix on.

Both are based on linus current master branch. Please test if they
fix
your issue.

Yes, they do fix it but that's because the feature is disabled.

Do you know what the actual problem was (on Xen)?

I still haven't understood what you actually did with Xen.

When you used PCI pass through with those devices then you have
made a
major configuration error.

When the problem happened on dom0 then the explanation is most
likely
that some PCI device ended up in the configured space, but the
routing
was only setup correctly on one CPU socket.

The problem is that dom0 can be (and was in my case) booted with
less
than full physical memory and so the "rest" of the host memory is not
necessarily reflected in iomem. Your patch then tried to configure
that
memory for MMIO and the system hang.

And so my guess is that this patch will break dom0 on a single-socket
system as well.

Oh, thanks!

I've thought about that possibility before, but wasn't able to find a
system which actually does that.

May I ask why the rest of the memory isn't reported to the OS?

That memory doesn't belong to the OS (dom0), it is owned by the
hypervisor.


Sounds like I can't trust Linux resource management and probably need
to read the DRAM config to figure things out after all.

My question is whether what you are trying to do should ever be done
for
a guest at all (any guest, not necessarily Xen).

The issue is probably that I don't know enough about Xen: What
exactly is dom0? My understanding was that dom0 is the hypervisor,
but that seems to be incorrect.

The issue is that under no circumstances *EVER* should a virtualized
guest have access to the PCI devices marked as "Processor Function"
on AMD platforms. Otherwise it is trivial to break out of the
virtualization.

When dom0 is something like the system domain with all hardware
access then the approach seems legitimate, but then the hypervisor
should report the stolen memory to the OS using the e820 table.

When the hypervisor doesn't do that and the Linux kernel isn't aware
that there is memory at a given location, mapping PCI space there will
obviously crash the hypervisor.

Possible solutions as far as I can see are either disabling this
feature when we detect that we are a Xen dom0, scanning the DRAM
settings to update Linux resource handling or fixing Xen to report
stolen memory to the dom0 OS as reserved.

Opinions?

You are right, these functions are not exposed to a regular guest.

I think for dom0 (which is a special Xen guest, with additional
privileges) we may be able to add a reserved e820 region for host
memory that is not assigned to dom0. Let me try it on Monday (I am out
until then).


One thing I realized while looking at a solution for Xen dom0 is that this
patch may cause problems for memory hotplug.


Good point. My assumption was that when you got a BIOS which can handle 
memory hotplug then you also got a BIOS which doesn't need this fixup. 
But I've never validated that assumption.



What happens if new memory
is added to the system and we have everything above current memory set
to MMIO?


In theory the BIOS would search for address space and wouldn't find 
anything, so the hotplug operation should fail even before it reaches 
the kernel in the first place.


In practice I think that nobody ever tested if that works correctly. So 
I'm pretty sure the system would just crash.


How about the attached patch? It limits the newly added MMIO space to 
the upper 256GB of the address space. That should still be enough for 
most devices, but we avoid both the issue with Xen dom0 and, most 
likely, problems with memory hotplug as well.


Christian.



-boris



From 586bd9d67ebb6ca48bd0a6b1bd9203e94093cc8e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= 
Date: Tue, 28 Nov 2017 10:02:35 +0100
Subject: [PATCH] x86/PCI: limit the size of the 64bit BAR to 256GB
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This avoids problems with Xen which hides some memory resources from the
OS and potentially also allows memory hotplug while this fixup is
enabled.

Signed-off-by: Christian Kö

Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-11-28 Thread Christian König

Am 28.11.2017 um 10:46 schrieb Jan Beulich:

On 28.11.17 at 10:12,  wrote:

In theory the BIOS would search for address space and wouldn't find
anything, so the hotplug operation should fail even before it reaches
the kernel in the first place.

How would the BIOS know what the OS does or plans to do?


As far as I know the ACPI BIOS should work directly with the register 
content.


So when we update the register content to enable the MMIO decoding the 
BIOS should know that as well.



I think
it's the other way around - the OS needs to avoid using any regions
for MMIO which are marked as hotpluggable in SRAT.


I was under the impression that this is exactly what 
acpi_numa_memory_affinity_init() does.



Since there is
no vNUMA yet for Xen Dom0, that would need special handling.


I think that the problem is rather that SRAT is NUMA specific and if I'm 
not totally mistaken the content is ignored when NUMA support isn't 
compiled into the kernel.


When Xen steals some memory from Dom0 by hooking itself into the e820 
call then I would say the cleanest way is to report this memory in e820 
as reserved as well. But take that with a grain of salt, I'm seriously 
not a Xen expert.


Regards,
Christian.



Jan





Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5

2017-11-23 Thread Christian König

Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky:

On 11/22/2017 11:54 AM, Christian König wrote:

Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky:

On 11/22/2017 05:09 AM, Christian König wrote:

Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky:

On 11/21/2017 08:34 AM, Christian König wrote:

Hi Boris,

attached are two patches.

The first one is a trivial fix for the infinite loop issue, it now
correctly aborts the fixup when it can't find address space for the
root window.

The second is a workaround for your board. It simply checks if there
is exactly one Processor Function to apply this fix on.

Both are based on linus current master branch. Please test if they
fix
your issue.

Yes, they do fix it but that's because the feature is disabled.

Do you know what the actual problem was (on Xen)?

I still haven't understood what you actually did with Xen.

When you used PCI pass through with those devices then you have made a
major configuration error.

When the problem happened on dom0 then the explanation is most likely
that some PCI device ended up in the configured space, but the routing
was only set up correctly on one CPU socket.

The problem is that dom0 can be (and was in my case) booted with less
than full physical memory and so the "rest" of the host memory is not
necessarily reflected in iomem. Your patch then tried to configure that
memory for MMIO and the system hang.

And so my guess is that this patch will break dom0 on a single-socket
system as well.

Oh, thanks!

I've thought about that possibility before, but wasn't able to find a
system which actually does that.

May I ask why the rest of the memory isn't reported to the OS?

That memory doesn't belong to the OS (dom0), it is owned by the hypervisor.


Sounds like I can't trust Linux resource management and probably need
to read the DRAM config to figure things out after all.


My question is whether what you are trying to do should ever be done for
a guest at all (any guest, not necessarily Xen).


The issue is probably that I don't know enough about Xen: What exactly 
is dom0? My understanding was that dom0 is the hypervisor, but that 
seems to be incorrect.


The issue is that under no circumstances *EVER* should a virtualized 
guest have access to the PCI devices marked as "Processor Function" on 
AMD platforms. Otherwise it is trivial to break out of the virtualization.


When dom0 is something like the system domain with all hardware access 
then the approach seems legitimate, but then the hypervisor should 
report the stolen memory to the OS using the e820 table.


When the hypervisor doesn't do that and the Linux kernel isn't aware 
that there is memory at a given location, mapping PCI space there will 
obviously crash the hypervisor.


Possible solutions as far as I can see are either disabling this feature 
when we detect that we are a Xen dom0, scanning the DRAM settings to 
update Linux resource handling or fixing Xen to report stolen memory to 
the dom0 OS as reserved.


Opinions?

Thanks,
Christian.



-boris
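
The first of those options would be a simple early-out in the fixup
(sketch only; xen_initial_domain() is the kernel's existing predicate
for running as Xen dom0):

#include <xen/xen.h>	/* xen_initial_domain() */

static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
{
	/*
	 * The fixup trusts the OS view of the memory map to be
	 * complete, which does not hold for a dom0 that was booted
	 * with less than the full host memory, so bail out there.
	 */
	if (xen_initial_domain())
		return;

	/* ... existing fixup ... */
}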



