Re: Design session notes: GPU acceleration in Xen
Am 18.06.24 um 16:12 schrieb Demi Marie Obenour: On Tue, Jun 18, 2024 at 08:33:38AM +0200, Christian König wrote: > Am 18.06.24 um 02:57 schrieb Demi Marie Obenour: >> On Mon, Jun 17, 2024 at 10:46:13PM +0200, Marek Marczykowski-Górecki >> wrote: >>> On Mon, Jun 17, 2024 at 09:46:29AM +0200, Roger Pau Monné wrote: >>>> On Sun, Jun 16, 2024 at 08:38:19PM -0400, Demi Marie Obenour wrote: >>>>> In both cases, the device physical >>>>> addresses are identical to dom0’s physical addresses. >>>> >>>> Yes, but a PV dom0 physical address space can be very scattered. >>>> >>>> IIRC there's an hypercall to request physically contiguous memory for >>>> PV, but you don't want to be using that every time you allocate a >>>> buffer (not sure it would support the sizes needed by the GPU >>>> anyway). >> >>> Indeed that isn't going to fly. In older Qubes versions we had PV >>> sys-net with PCI passthrough for a network card. After some uptime it >>> was basically impossible to restart and still have enough contagious >>> memory for a network driver, and there it was about _much_ smaller >>> buffers, like 2M or 4M. At least not without shutting down a lot more >>> things to free some more memory. >> >> Ouch! That makes me wonder if all GPU drivers actually need physically >> contiguous buffers, or if it is (as I suspect) driver-specific. CCing >> Christian König who has mentioned issues in this area. > Well GPUs don't need physical contiguous memory to function, but if they > only get 4k pages to work with it means a quite large (up to 30%) > performance penalty. The status quo is "no GPU acceleration at all", so 70% of bare metal performance would be amazing right now. Well AMD uses native context approach in XEN which which delivers over 90% of bare metal performance. Pierre-Eric can tell you more, but we certainly have GPU solutions in productions with XEN which would suffer greatly if we see the underlying memory fragmented like this. However, the implementation should not preclude eliminating this performance penalty in the future. What size pages do GPUs need for good performance? Is it the same as CPU huge pages? 2MiB are usually sufficient. Regards, Christian. PV dom0 doesn't get huge pages at all, but PVH and HVM guests do, and the goal is to move away from PV guests as they have lots of unrelated problems. > So scattering memory like you described is probably a very bad idea if you > want any halve way decent performance. For an initial prototype a 30% performance penalty is acceptable, but it's good to know that memory fragmentation needs to be avoided. > Regards, > Christian Thanks for the prompt response!
Re: Design session notes: GPU acceleration in Xen
Am 18.06.24 um 02:57 schrieb Demi Marie Obenour: On Mon, Jun 17, 2024 at 10:46:13PM +0200, Marek Marczykowski-Górecki wrote: > On Mon, Jun 17, 2024 at 09:46:29AM +0200, Roger Pau Monné wrote: >> On Sun, Jun 16, 2024 at 08:38:19PM -0400, Demi Marie Obenour wrote: >>> In both cases, the device physical >>> addresses are identical to dom0’s physical addresses. >> >> Yes, but a PV dom0 physical address space can be very scattered. >> >> IIRC there's an hypercall to request physically contiguous memory for >> PV, but you don't want to be using that every time you allocate a >> buffer (not sure it would support the sizes needed by the GPU >> anyway). > Indeed that isn't going to fly. In older Qubes versions we had PV > sys-net with PCI passthrough for a network card. After some uptime it > was basically impossible to restart and still have enough contagious > memory for a network driver, and there it was about _much_ smaller > buffers, like 2M or 4M. At least not without shutting down a lot more > things to free some more memory. Ouch! That makes me wonder if all GPU drivers actually need physically contiguous buffers, or if it is (as I suspect) driver-specific. CCing Christian König who has mentioned issues in this area. Well GPUs don't need physical contiguous memory to function, but if they only get 4k pages to work with it means a quite large (up to 30%) performance penalty. So scattering memory like you described is probably a very bad idea if you want any halve way decent performance. Regards, Christian. Given the recent progress on PVH dom0, is it reasonable to assume that PVH dom0 will be ready in time for R4.3, and that therefore Qubes OS doesn't need to worry about this problem on x86?
Re: [PATCH] xen/gntdev: Fix the abuse of underlying struct page in DMA-buf import
Am 04.01.24 um 19:53 schrieb Oleksandr Tyshchenko: From: Oleksandr Tyshchenko DO NOT access the underlying struct page of an sg table exported by DMA-buf in dmabuf_imp_to_refs(), this is not allowed. Please see drivers/dma-buf/dma-buf.c:mangle_sg_table() for details. Fortunately, here (for special Xen device) we can avoid using pages and calculate gfns directly from dma addresses provided by the sg table. Suggested-by: Daniel Vetter Signed-off-by: Oleksandr Tyshchenko I can't say that I can judge the full technical background, but that looks reasonable to me. Acked-by: Christian König --- Please note, I didn't manage to test the patch against the latest master branch on real HW (patch was only build tested there). Patch was tested on Arm64 guests using Linux v5.10.41 from vendor's BSP, this is the environment where running this use-case is possible and to which I have an access (Xen PV display with zero-copy and backend domain as a buffer provider - be-alloc=1, so dma-buf import part was involved). A little bit old, but the dma-buf import code in gntdev-dmabuf.c hasn't been changed much since that time, all context remains allmost the same according to my code inspection. --- --- drivers/xen/gntdev-dmabuf.c | 42 +++-- 1 file changed, 17 insertions(+), 25 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index 4440e626b797..0dde49fca9a5 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -50,7 +51,7 @@ struct gntdev_dmabuf { /* Number of pages this buffer has. */ int nr_pages; - /* Pages of this buffer. */ + /* Pages of this buffer (only for dma-buf export). */ struct page **pages; }; @@ -484,7 +485,7 @@ static int dmabuf_exp_from_refs(struct gntdev_priv *priv, int flags, /* DMA buffer import support. */ static int -dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs, +dmabuf_imp_grant_foreign_access(unsigned long *gfns, u32 *refs, int count, int domid) { grant_ref_t priv_gref_head; @@ -507,7 +508,7 @@ dmabuf_imp_grant_foreign_access(struct page **pages, u32 *refs, } gnttab_grant_foreign_access_ref(cur_ref, domid, - xen_page_to_gfn(pages[i]), 0); + gfns[i], 0); refs[i] = cur_ref; } @@ -529,7 +530,6 @@ static void dmabuf_imp_end_foreign_access(u32 *refs, int count) static void dmabuf_imp_free_storage(struct gntdev_dmabuf *gntdev_dmabuf) { - kfree(gntdev_dmabuf->pages); kfree(gntdev_dmabuf->u.imp.refs); kfree(gntdev_dmabuf); } @@ -549,12 +549,6 @@ static struct gntdev_dmabuf *dmabuf_imp_alloc_storage(int count) if (!gntdev_dmabuf->u.imp.refs) goto fail; - gntdev_dmabuf->pages = kcalloc(count, - sizeof(gntdev_dmabuf->pages[0]), - GFP_KERNEL); - if (!gntdev_dmabuf->pages) - goto fail; - gntdev_dmabuf->nr_pages = count; for (i = 0; i < count; i++) @@ -576,7 +570,8 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, struct dma_buf *dma_buf; struct dma_buf_attachment *attach; struct sg_table *sgt; - struct sg_page_iter sg_iter; + struct sg_dma_page_iter sg_iter; + unsigned long *gfns; int i; dma_buf = dma_buf_get(fd); @@ -624,26 +619,23 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, gntdev_dmabuf->u.imp.sgt = sgt; - /* Now convert sgt to array of pages and check for page validity. */ + gfns = kcalloc(count, sizeof(*gfns), GFP_KERNEL); + if (!gfns) + goto fail_unmap; + + /* Now convert sgt to array of gfns without accessing underlying pages. */ i = 0; - for_each_sgtable_page(sgt, &sg_iter, 0) { - struct page *page = sg_page_iter_page(&sg_iter); - /* -* Check if page is valid: this can happen if we are given -* a page from VRAM or other resources which are not backed -* by a struct page. -*/ - if (!pfn_valid(page_to_pfn(page))) { - ret = ERR_PTR(-EINVAL); - goto fail_unmap; - } + for_each_sgtable_dma_page(sgt, &sg_iter, 0) { + dma_addr_t addr = sg_page_iter_dma_address(&sg_iter); + unsigned long pfn = bfn_to_pfn(XEN_PFN_DOWN(dma_to_phys(dev, addr))); - gntdev_dmabuf->pages[i++] = page; + gfns[i++] = pfn_to_gfn(pfn); } - ret = ERR_PTR(dmabuf_imp
Re: [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev
Am 12.07.23 um 15:38 schrieb Uwe Kleine-König: Hello Maxime, On Wed, Jul 12, 2023 at 02:52:38PM +0200, Maxime Ripard wrote: On Wed, Jul 12, 2023 at 01:02:53PM +0200, Uwe Kleine-König wrote: Background is that this makes merge conflicts easier to handle and detect. Really? FWIW, I agree with Christian here. Each file (apart from include/drm/drm_crtc.h) is only touched once. So unless I'm missing something you don't get less or easier conflicts by doing it all in a single patch. But you gain the freedom to drop a patch for one driver without having to drop the rest with it. Not really, because the last patch removed the union anyway. So you have to revert both the last patch, plus that driver one. And then you need to add a TODO to remove that union eventually. Yes, with a single patch you have only one revert (but 194 files changed, 1264 insertions(+), 1296 deletions(-)) instead of two (one of them: 1 file changed, 9 insertions(+), 1 deletion(-); the other maybe a bit bigger). (And maybe you get away with just reverting the last patch.) With a single patch the TODO after a revert is "redo it all again (and prepare for a different set of conflicts)" while with the split series it's only "fix that one driver that was forgotten/borked" + reapply that 10 line patch. Yeah, but for a maintainer the size of the patches doesn't matter. That's only interesting if you need to manually review the patch, which you hopefully doesn't do in case of something auto-generated. In other words if the patch is auto-generated re-applying it completely is less work than fixing things up individually. As the one who gets that TODO, I prefer the latter. Yeah, but your personal preferences are not a technical relevant argument to a maintainer. At the end of the day Dave or Daniel need to decide, because they need to live with it. Regards, Christian. So in sum: If your metric is "small count of reverted commits", you're right. If however your metric is: Better get 95% of this series' change in than maybe 0%, the split series is the way to do it. With me having spend ~3h on this series' changes, it's maybe understandable that I did it the way I did. FTR: This series was created on top of v6.5-rc1. If you apply it to drm-misc-next you get a (trivial) conflict in patch #2. If I consider to be the responsible maintainer who applies this series, I like being able to just do git am --skip then. FTR#2: In drm-misc-next is a new driver (drivers/gpu/drm/loongson/lsdc_crtc.c) so skipping the last patch for now might indeed be a good idea. So I still like the split version better, but I'm open to a more verbose reasoning from your side. You're doing only one thing here, really: you change the name of a structure field. If it was shared between multiple maintainers, then sure, splitting that up is easier for everyone, but this will go through drm-misc, so I can't see the benefit it brings. I see your argument, but I think mine weights more. Best regards Uwe
Re: [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev
Am 12.07.23 um 11:46 schrieb Uwe Kleine-König: Hello, while I debugged an issue in the imx-lcdc driver I was constantly irritated about struct drm_device pointer variables being named "dev" because with that name I usually expect a struct device pointer. I think there is a big benefit when these are all renamed to "drm_dev". I have no strong preference here though, so "drmdev" or "drm" are fine for me, too. Let the bikesheding begin! Some statistics: $ git grep -ohE 'struct drm_device *\* *[^ (),;]*' v6.5-rc1 | sort | uniq -c | sort -n 1 struct drm_device *adev_to_drm 1 struct drm_device *drm_ 1 struct drm_device *drm_dev 1 struct drm_device*drm_dev 1 struct drm_device *pdev 1 struct drm_device *rdev 1 struct drm_device *vdev 2 struct drm_device *dcss_drv_dev_to_drm 2 struct drm_device **ddev 2 struct drm_device *drm_dev_alloc 2 struct drm_device *mock 2 struct drm_device *p_ddev 5 struct drm_device *device 9 struct drm_device * dev 25 struct drm_device *d 95 struct drm_device * 216 struct drm_device *ddev 234 struct drm_device *drm_dev 611 struct drm_device *drm 4190 struct drm_device *dev This series starts with renaming struct drm_crtc::dev to drm_dev. If it's not only me and others like the result of this effort it should be followed up by adapting the other structs and the individual usages in the different drivers. To make this series a bit easier handleable, I first added an alias for drm_crtc::dev, then converted the drivers one after another and the last patch drops the "dev" name. This has the advantage of being easier to review, and if I should have missed an instance only the last patch must be dropped/reverted. Also this series might conflict with other patches, in this case the remaining patches can still go in (apart from the last one of course). Maybe it also makes sense to delay applying the last patch by one development cycle? When you automatically generate the patch (with cocci for example) I usually prefer a single patch instead. Background is that this makes merge conflicts easier to handle and detect. When you have multiple patches and a merge conflict because of some added lines using the old field the build breaks only on the last patch which removes the old field. In such cases reviewing the patch just means automatically re-generating it and double checking that you don't see anything funky. Apart from that I honestly absolutely don't care what the name is. Cheers, Christian. The series was compile tested for arm, arm64, powerpc and amd64 using an allmodconfig (though I only build drivers/gpu/). Best regards Uwe Uwe Kleine-König (52): drm/crtc: Start renaming struct drm_crtc::dev to drm_dev drm/core: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/amd: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/armada: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/arm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/aspeed: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/ast: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/atmel-hlcdc: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/exynos: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/fsl-dcu: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/gma500: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/gud: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/hisilicon: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/hyperv: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/i915: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/imx: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/ingenic: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/kmb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/logicvc: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mcde: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mediatek: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/meson: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mgag200: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/msm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mxsfb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/nouveau: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/omapdrm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/panel-ili9341: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/pl111: Use struct d
Re: [RFC PATCH 1/5] x86/xen: disable swiotlb for xen pvh
Am 17.03.23 um 15:45 schrieb Alex Deucher: On Thu, Mar 16, 2023 at 7:09 PM Stefano Stabellini wrote: On Thu, 16 Mar 2023, Juergen Gross wrote: On 16.03.23 14:53, Alex Deucher wrote: On Thu, Mar 16, 2023 at 9:48 AM Juergen Gross wrote: On 16.03.23 14:45, Alex Deucher wrote: On Thu, Mar 16, 2023 at 3:50 AM Jan Beulich wrote: On 16.03.2023 00:25, Stefano Stabellini wrote: On Wed, 15 Mar 2023, Jan Beulich wrote: On 15.03.2023 01:52, Stefano Stabellini wrote: On Mon, 13 Mar 2023, Jan Beulich wrote: On 12.03.2023 13:01, Huang Rui wrote: Xen PVH is the paravirtualized mode and takes advantage of hardware virtualization support when possible. It will using the hardware IOMMU support instead of xen-swiotlb, so disable swiotlb if current domain is Xen PVH. But the kernel has no way (yet) to drive the IOMMU, so how can it get away without resorting to swiotlb in certain cases (like I/O to an address-restricted device)? I think Ray meant that, thanks to the IOMMU setup by Xen, there is no need for swiotlb-xen in Dom0. Address translations are done by the IOMMU so we can use guest physical addresses instead of machine addresses for DMA. This is a similar case to Dom0 on ARM when the IOMMU is available (see include/xen/arm/swiotlb-xen.h:xen_swiotlb_detect, the corresponding case is XENFEAT_not_direct_mapped). But how does Xen using an IOMMU help with, as said, address-restricted devices? They may still need e.g. a 32-bit address to be programmed in, and if the kernel has memory beyond the 4G boundary not all I/O buffers may fulfill this requirement. In short, it is going to work as long as Linux has guest physical addresses (not machine addresses, those could be anything) lower than 4GB. If the address-restricted device does DMA via an IOMMU, then the device gets programmed by Linux using its guest physical addresses (not machine addresses). The 32-bit restriction would be applied by Linux to its choice of guest physical address to use to program the device, the same way it does on native. The device would be fine as it always uses Linux-provided <4GB addresses. After the IOMMU translation (pagetable setup by Xen), we could get any address, including >4GB addresses, and that is expected to work. I understand that's the "normal" way of working. But whatever the swiotlb is used for in baremetal Linux, that would similarly require its use in PVH (or HVM) aiui. So unconditionally disabling it in PVH would look to me like an incomplete attempt to disable its use altogether on x86. What difference of PVH vs baremetal am I missing here? swiotlb is not usable for GPUs even on bare metal. They often have hundreds or megs or even gigs of memory mapped on the device at any given time. Also, AMD GPUs support 44-48 bit DMA masks (depending on the chip family). But the swiotlb isn't per device, but system global. Sure, but if the swiotlb is in use, then you can't really use the GPU. So you get to pick one. The swiotlb is used only for buffers which are not within the DMA mask of a device (see dma_direct_map_page()). So an AMD GPU supporting a 44 bit DMA mask won't use the swiotlb unless you have a buffer above guest physical address of 16TB (so basically never). Disabling swiotlb in such a guest would OTOH mean, that a device with only 32 bit DMA mask passed through to this guest couldn't work with buffers above 4GB. I don't think this is acceptable. From the Xen subsystem in Linux point of view, the only thing we need to do is to make sure *not* to enable swiotlb_xen (yes "swiotlb_xen", not the global swiotlb) on PVH because it is not needed anyway. I think we should leave the global "swiotlb" setting alone. The global swiotlb is not relevant to Xen anyway, and surely baremetal Linux has to have a way to deal with swiotlb/GPU incompatibilities. We just have to avoid making things worse on Xen, and for that we just need to avoid unconditionally enabling swiotlb-xen. If the Xen subsystem doesn't enable swiotlb_xen/swiotlb, and no other subsystem enables swiotlb, then we have a good Linux configuration capable of handling the GPU properly. Alex, please correct me if I am wrong. How is x86_swiotlb_enable set to false on native (non-Xen) x86? In most cases we have an IOMMU enabled and IIRC, TTM has slightly different behavior for memory allocation depending on whether swiotlb would be needed or not. Well "slightly different" is an understatement. We need to disable quite a bunch of features to make swiotlb work with GPUs. Especially userptr and inter device sharing won't work any more. Regards, Christian. Alex
Re: [RFC XEN PATCH 2/6] vpci: accept BAR writes if dom0 is PVH
Am 13.03.23 um 08:23 schrieb Christian König: Am 12.03.23 um 08:54 schrieb Huang Rui: From: Chen Jiqian When dom0 is PVH and we want to passthrough gpu to guest, we should allow BAR writes even through BAR is mapped. If not, the value of BARs are not initialized when guest firstly start. Signed-off-by: Chen Jiqian Signed-off-by: Huang Rui --- xen/drivers/vpci/header.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index ec2e978a4e..918d11fbce 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -392,7 +392,7 @@ static void cf_check bar_write( * Xen only cares whether the BAR is mapped into the p2m, so allow BAR * writes as long as the BAR is not mapped into the p2m. */ - if ( bar->enabled ) + if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) Checkpath.pl gives here: ERROR: space prohibited after that open parenthesis '(' #115: FILE: xen/drivers/vpci/header.c:395: + if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) But I should probably mention that I'm not 100% sure if this code base uses kernel coding style! Christian. Christian. { /* If the value written is the current one avoid printing a warning. */ if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )
Re: [RFC XEN PATCH 0/6] Introduce VirtIO GPU and Passthrough GPU support on Xen PVH dom0
Hi Ray, one nit comment on the style, apart from that looks technical correct. But I'm *really* not and expert on all that stuff. Regards, Christian. Am 12.03.23 um 08:54 schrieb Huang Rui: Hi all, In graphic world, the 3D applications/games are runing based on open graphic libraries such as OpenGL and Vulkan. Mesa is the Linux implemenatation of OpenGL and Vulkan for multiple hardware platforms. Because the graphic libraries would like to have the GPU hardware acceleration. In virtualization world, virtio-gpu and passthrough-gpu are two of gpu virtualization technologies. Current Xen only supports OpenGL (virgl: https://docs.mesa3d.org/drivers/virgl.html) for virtio-gpu and passthrough gpu based on PV dom0 for x86 platform. Today, we would like to introduce Vulkan (venus: https://docs.mesa3d.org/drivers/venus.html) and another OpenGL on Vulkan (zink: https://docs.mesa3d.org/drivers/zink.html) support for VirtIO GPU on Xen. These functions are supported on KVM at this moment, but so far, they are not supported on Xen. And we also introduce the PCIe passthrough (GPU) function based on PVH dom0 for AMD x86 platform. These supports required multiple repositories changes on kernel, xen, qemu, mesa, and virglrenderer. Please check below branches: Kernel: https://git.kernel.org/pub/scm/linux/kernel/git/rui/linux.git/log/?h=upstream-fox-xen Xen: https://gitlab.com/huangrui123/xen/-/commits/upstream-for-xen QEMU: https://gitlab.com/huangrui123/qemu/-/commits/upstream-for-xen Mesa: https://gitlab.freedesktop.org/rui/mesa/-/commits/upstream-for-xen Virglrenderer: https://gitlab.freedesktop.org/rui/virglrenderer/-/commits/upstream-for-xen In xen part, we mainly add the PCIe passthrough support on PVH dom0. It's using the QEMU to passthrough the GPU device into guest HVM domU. And mainly work is to transfer the interrupt by using gsi, vector, and pirq. Below are the screenshot of these functions, please take a look. Venus: https://drive.google.com/file/d/1_lPq6DMwHu1JQv7LUUVRx31dBj0HJYcL/view?usp=share_link Zink: https://drive.google.com/file/d/1FxLmKu6X7uJOxx1ZzwOm1yA6IL5WMGzd/view?usp=share_link Passthrough GPU: https://drive.google.com/file/d/17onr5gvDK8KM_LniHTSQEI2hGJZlI09L/view?usp=share_link We are working to write the documentation that describe how to verify these functions in the xen wiki page. And will update it in the future version. Thanks, Ray Chen Jiqian (5): vpci: accept BAR writes if dom0 is PVH x86/pvh: shouldn't check pirq flag when map pirq in PVH x86/pvh: PVH dom0 also need PHYSDEVOP_setup_gsi call tools/libs/call: add linux os call to get gsi from irq tools/libs/light: pci: translate irq to gsi Roger Pau Monne (1): x86/pvh: report ACPI VFCT table to dom0 if present tools/include/xen-sys/Linux/privcmd.h | 7 +++ tools/include/xencall.h | 2 ++ tools/include/xenctrl.h | 2 ++ tools/libs/call/core.c| 5 + tools/libs/call/libxencall.map| 2 ++ tools/libs/call/linux.c | 14 ++ tools/libs/call/private.h | 9 + tools/libs/ctrl/xc_physdev.c | 4 tools/libs/light/libxl_pci.c | 1 + xen/arch/x86/hvm/dom0_build.c | 1 + xen/arch/x86/hvm/hypercall.c | 3 +-- xen/drivers/vpci/header.c | 2 +- xen/include/acpi/actbl3.h | 1 + 13 files changed, 50 insertions(+), 3 deletions(-)
Re: [RFC XEN PATCH 2/6] vpci: accept BAR writes if dom0 is PVH
Am 12.03.23 um 08:54 schrieb Huang Rui: From: Chen Jiqian When dom0 is PVH and we want to passthrough gpu to guest, we should allow BAR writes even through BAR is mapped. If not, the value of BARs are not initialized when guest firstly start. Signed-off-by: Chen Jiqian Signed-off-by: Huang Rui --- xen/drivers/vpci/header.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index ec2e978a4e..918d11fbce 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -392,7 +392,7 @@ static void cf_check bar_write( * Xen only cares whether the BAR is mapped into the p2m, so allow BAR * writes as long as the BAR is not mapped into the p2m. */ -if ( bar->enabled ) +if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) Checkpath.pl gives here: ERROR: space prohibited after that open parenthesis '(' #115: FILE: xen/drivers/vpci/header.c:395: + if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) Christian. { /* If the value written is the current one avoid printing a warning. */ if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )
Re: [PATCH 01/21] drm/amdgpu: Don't set struct drm_driver.lastclose
Am 20.10.22 um 12:37 schrieb Thomas Zimmermann: Don't set struct drm_driver.lastclose. It's used to restore the fbdev console. But as amdgpu uses generic fbdev emulation, the console is being restored by the DRM client helpers already. See the call to drm_client_dev_restore() in drm_lastclose(). ??? The commit message doesn't match what the patch is doing. You are removing output_poll_changed instead of lastclose here. Did something got mixed up? Cheers, Christian. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 1 - drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 -- 2 files changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 23998f727c7f9..fb7186c5ade2a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -1224,7 +1224,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev, const struct drm_mode_config_funcs amdgpu_mode_funcs = { .fb_create = amdgpu_display_user_framebuffer_create, - .output_poll_changed = drm_fb_helper_output_poll_changed, }; static const struct drm_prop_enum_list amdgpu_underscan_enum_list[] = diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f6a9e8fdd87d6..e9a28a5363b9a 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -82,7 +82,6 @@ #include #include #include -#include #include #include #include @@ -2810,7 +2809,6 @@ const struct amdgpu_ip_block_version dm_ip_block = static const struct drm_mode_config_funcs amdgpu_dm_mode_funcs = { .fb_create = amdgpu_display_user_framebuffer_create, .get_format_info = amd_get_format_info, - .output_poll_changed = drm_fb_helper_output_poll_changed, .atomic_check = amdgpu_dm_atomic_check, .atomic_commit = drm_atomic_helper_commit, };
Re: [PATCH v5 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
Am 20.10.20 um 14:20 schrieb Thomas Zimmermann: The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in kernel address space. The mapping's address is returned as struct dma_buf_map. Each function is a simplified version of TTM's existing kmap code. Both functions respect the memory's location ani/or writecombine flags. On top TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(), two helpers that convert a GEM object into the TTM BO and forward the call to TTM's vmap/vunmap. These helpers can be dropped into the rsp GEM object callbacks. v5: * use size_t for storing mapping size (Christian) * ignore premapped memory areas correctly in ttm_bo_vunmap() * rebase onto latest TTM interfaces (Christian) * remove BUG() from ttm_bo_vmap() (Christian) v4: * drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers (Daniel, Christian) Signed-off-by: Thomas Zimmermann Acked-by: Daniel Vetter Tested-by: Sam Ravnborg Reviewed-by: Christian König --- drivers/gpu/drm/drm_gem_ttm_helper.c | 38 +++ drivers/gpu/drm/ttm/ttm_bo_util.c| 72 include/drm/drm_gem_ttm_helper.h | 6 +++ include/drm/ttm/ttm_bo_api.h | 28 +++ include/linux/dma-buf-map.h | 20 5 files changed, 164 insertions(+) diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c b/drivers/gpu/drm/drm_gem_ttm_helper.c index 0e4fb9ba43ad..db4c14d78a30 100644 --- a/drivers/gpu/drm/drm_gem_ttm_helper.c +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c @@ -49,6 +49,44 @@ void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int indent, } EXPORT_SYMBOL(drm_gem_ttm_print_info); +/** + * drm_gem_ttm_vmap() - vmap &ttm_buffer_object + * @gem: GEM object. + * @map: [out] returns the dma-buf mapping. + * + * Maps a GEM object with ttm_bo_vmap(). This function can be used as + * &drm_gem_object_funcs.vmap callback. + * + * Returns: + * 0 on success, or a negative errno code otherwise. + */ +int drm_gem_ttm_vmap(struct drm_gem_object *gem, +struct dma_buf_map *map) +{ + struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem); + + return ttm_bo_vmap(bo, map); + +} +EXPORT_SYMBOL(drm_gem_ttm_vmap); + +/** + * drm_gem_ttm_vunmap() - vunmap &ttm_buffer_object + * @gem: GEM object. + * @map: dma-buf mapping. + * + * Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as + * &drm_gem_object_funcs.vmap callback. + */ +void drm_gem_ttm_vunmap(struct drm_gem_object *gem, + struct dma_buf_map *map) +{ + struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem); + + ttm_bo_vunmap(bo, map); +} +EXPORT_SYMBOL(drm_gem_ttm_vunmap); + /** * drm_gem_ttm_mmap() - mmap &ttm_buffer_object * @gem: GEM object. diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index ba7ab5ed85d0..5c79418405ea 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #include @@ -527,6 +528,77 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map) } EXPORT_SYMBOL(ttm_bo_kunmap); +int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map) +{ + struct ttm_resource *mem = &bo->mem; + int ret; + + ret = ttm_mem_io_reserve(bo->bdev, mem); + if (ret) + return ret; + + if (mem->bus.is_iomem) { + void __iomem *vaddr_iomem; + size_t size = bo->num_pages << PAGE_SHIFT; + + if (mem->bus.addr) + vaddr_iomem = (void __iomem *)mem->bus.addr; + else if (mem->bus.caching == ttm_write_combined) + vaddr_iomem = ioremap_wc(mem->bus.offset, size); + else + vaddr_iomem = ioremap(mem->bus.offset, size); + + if (!vaddr_iomem) + return -ENOMEM; + + dma_buf_map_set_vaddr_iomem(map, vaddr_iomem); + + } else { + struct ttm_operation_ctx ctx = { + .interruptible = false, + .no_wait_gpu = false + }; + struct ttm_tt *ttm = bo->ttm; + pgprot_t prot; + void *vaddr; + + ret = ttm_tt_populate(bo->bdev, ttm, &ctx); + if (ret) + return ret; + + /* +* We need to use vmap to get the desired page protection +* or to make the buffer object look contiguous. +*/ + prot = ttm_io_prot(bo, mem, PAGE_KERNEL); + vaddr = vmap(ttm->pages, bo->num_pages, 0, prot); + if (!vaddr) + return -ENOMEM; + + dma_bu
Re: [PATCH v4 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
Hi Thomas, [SNIP] +int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map) +{ + struct ttm_resource *mem = &bo->mem; + int ret; + + ret = ttm_mem_io_reserve(bo->bdev, mem); + if (ret) + return ret; + + if (mem->bus.is_iomem) { + void __iomem *vaddr_iomem; + unsigned long size = bo->num_pages << PAGE_SHIFT; Please use uint64_t here and make sure to cast bo->num_pages before shifting. I thought the rule of thumb is to use u64 in source code. Yet TTM only uses uint*_t types. Is there anything special about TTM? My last status is that you can use both and my personal preference is to use the uint*_t types because they are part of a higher level standard. We have an unit tests of allocating a 8GB BO and that should work on a 32bit machine as well :) + + if (mem->bus.addr) + vaddr_iomem = (void *)(((u8 *)mem->bus.addr)); I after reading the patch again, I realized that this is the 'ttm_bo_map_premapped' case and it's missing from _vunmap(). I see two options here: ignore this case in _vunmap(), or do an ioremap() unconditionally. Which one is preferable? ioremap would be very very bad, so we should just do nothing. Thanks, Christian. Best regards Thomas + else if (mem->placement & TTM_PL_FLAG_WC) I've just nuked the TTM_PL_FLAG_WC flag in drm-misc-next. There is a new mem->bus.caching enum as replacement. + vaddr_iomem = ioremap_wc(mem->bus.offset, size); + else + vaddr_iomem = ioremap(mem->bus.offset, size); + + if (!vaddr_iomem) + return -ENOMEM; + + dma_buf_map_set_vaddr_iomem(map, vaddr_iomem); + + } else { + struct ttm_operation_ctx ctx = { + .interruptible = false, + .no_wait_gpu = false + }; + struct ttm_tt *ttm = bo->ttm; + pgprot_t prot; + void *vaddr; + + BUG_ON(!ttm); I think we can drop this, populate will just crash badly anyway. + + ret = ttm_tt_populate(bo->bdev, ttm, &ctx); + if (ret) + return ret; + + /* + * We need to use vmap to get the desired page protection + * or to make the buffer object look contiguous. + */ + prot = ttm_io_prot(mem->placement, PAGE_KERNEL); The calling convention has changed on drm-misc-next as well, but should be trivial to adapt. Regards, Christian. + vaddr = vmap(ttm->pages, bo->num_pages, 0, prot); + if (!vaddr) + return -ENOMEM; + + dma_buf_map_set_vaddr(map, vaddr); + } + + return 0; +} +EXPORT_SYMBOL(ttm_bo_vmap); + +void ttm_bo_vunmap(struct ttm_buffer_object *bo, struct dma_buf_map *map) +{ + if (dma_buf_map_is_null(map)) + return; + + if (map->is_iomem) + iounmap(map->vaddr_iomem); + else + vunmap(map->vaddr); + dma_buf_map_clear(map); + + ttm_mem_io_free(bo->bdev, &bo->mem); +} +EXPORT_SYMBOL(ttm_bo_vunmap); + static int ttm_bo_wait_free_node(struct ttm_buffer_object *bo, bool dst_use_tt) { diff --git a/include/drm/drm_gem_ttm_helper.h b/include/drm/drm_gem_ttm_helper.h index 118cef76f84f..7c6d874910b8 100644 --- a/include/drm/drm_gem_ttm_helper.h +++ b/include/drm/drm_gem_ttm_helper.h @@ -10,11 +10,17 @@ #include #include +struct dma_buf_map; + #define drm_gem_ttm_of_gem(gem_obj) \ container_of(gem_obj, struct ttm_buffer_object, base) void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int indent, const struct drm_gem_object *gem); +int drm_gem_ttm_vmap(struct drm_gem_object *gem, + struct dma_buf_map *map); +void drm_gem_ttm_vunmap(struct drm_gem_object *gem, + struct dma_buf_map *map); int drm_gem_ttm_mmap(struct drm_gem_object *gem, struct vm_area_struct *vma); diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h index 37102e45e496..2c59a785374c 100644 --- a/include/drm/ttm/ttm_bo_api.h +++ b/include/drm/ttm/ttm_bo_api.h @@ -48,6 +48,8 @@ struct ttm_bo_global; struct ttm_bo_device; +struct dma_buf_map; + struct drm_mm_node; struct ttm_placement; @@ -494,6 +496,32 @@ int ttm_bo_kmap(struct ttm_buffer_object *bo, unsigned long start_page, */ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map); +/** + * ttm_bo_vmap + * + * @bo: The buffer object. + * @map: pointer to a struct dma_buf_map representing the map. + * + * Sets up a kernel virtual mapping, using ioremap or vmap to the + * data in the buffer object. The parameter @map returns the virtual + * address as struct dma_buf_map. Unmap the buffer with ttm_bo_vunmap(). + * + * Returns + * -ENOMEM: Out of memory. + * -EINVAL: Invalid range. + */ +int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map); + +/** + * ttm_bo_vunmap + * + * @bo: The buffer object. + * @map: Object describing the map to unmap. + * + * Unmaps a kernel map set up by ttm_bo_vmap(). + */ +void tt
Re: [PATCH v4 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
Am 15.10.20 um 19:52 schrieb Thomas Zimmermann: Hi On Thu, 15 Oct 2020 18:49:09 +0200 Daniel Vetter wrote: On Thu, Oct 15, 2020 at 04:08:13PM +0200, Christian König wrote: Am 15.10.20 um 14:38 schrieb Thomas Zimmermann: The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in kernel address space. The mapping's address is returned as struct dma_buf_map. Each function is a simplified version of TTM's existing kmap code. Both functions respect the memory's location ani/or writecombine flags. On top TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(), two helpers that convert a GEM object into the TTM BO and forward the call to TTM's vmap/vunmap. These helpers can be dropped into the rsp GEM object callbacks. v4: * drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers (Daniel, Christian) Bunch of minor comments below, but over all look very solid to me. Yeah I think just duplicating the ttm bo map stuff for vmap is indeed the cleanest. And then we can maybe push the combinatorial monster into vmwgfx, which I think is the only user after this series. Or perhaps a dedicated set of helpers to map an invidual page (again using the dma_buf_map stuff). From a quick look, I'd say it should be possible to have the same interface for kmap/kunmap as for vmap/vunmap (i.e., parameters are bo and dma-buf-map). All mapping state can be deduced from this. And struct ttm_bo_kmap_obj can be killed off entirely. Yes, that would be rather nice to have. Thanks, Christian. Best regards Thomas I'll let Christian with the details, but at a high level this is definitely Acked-by: Daniel Vetter Thanks a lot for doing all this. -Daniel Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/drm_gem_ttm_helper.c | 38 +++ drivers/gpu/drm/ttm/ttm_bo_util.c| 72 include/drm/drm_gem_ttm_helper.h | 6 +++ include/drm/ttm/ttm_bo_api.h | 28 +++ include/linux/dma-buf-map.h | 20 5 files changed, 164 insertions(+) diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c b/drivers/gpu/drm/drm_gem_ttm_helper.c index 0e4fb9ba43ad..db4c14d78a30 100644 --- a/drivers/gpu/drm/drm_gem_ttm_helper.c +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c @@ -49,6 +49,44 @@ void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int indent, } EXPORT_SYMBOL(drm_gem_ttm_print_info); +/** + * drm_gem_ttm_vmap() - vmap &ttm_buffer_object + * @gem: GEM object. + * @map: [out] returns the dma-buf mapping. + * + * Maps a GEM object with ttm_bo_vmap(). This function can be used as + * &drm_gem_object_funcs.vmap callback. + * + * Returns: + * 0 on success, or a negative errno code otherwise. + */ +int drm_gem_ttm_vmap(struct drm_gem_object *gem, +struct dma_buf_map *map) +{ + struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem); + + return ttm_bo_vmap(bo, map); + +} +EXPORT_SYMBOL(drm_gem_ttm_vmap); + +/** + * drm_gem_ttm_vunmap() - vunmap &ttm_buffer_object + * @gem: GEM object. + * @map: dma-buf mapping. + * + * Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as + * &drm_gem_object_funcs.vmap callback. + */ +void drm_gem_ttm_vunmap(struct drm_gem_object *gem, + struct dma_buf_map *map) +{ + struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem); + + ttm_bo_vunmap(bo, map); +} +EXPORT_SYMBOL(drm_gem_ttm_vunmap); + /** * drm_gem_ttm_mmap() - mmap &ttm_buffer_object * @gem: GEM object. diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index bdee4df1f3f2..80c42c774c7d 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #include @@ -526,6 +527,77 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map) } EXPORT_SYMBOL(ttm_bo_kunmap); +int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map) +{ + struct ttm_resource *mem = &bo->mem; + int ret; + + ret = ttm_mem_io_reserve(bo->bdev, mem); + if (ret) + return ret; + + if (mem->bus.is_iomem) { + void __iomem *vaddr_iomem; + unsigned long size = bo->num_pages << PAGE_SHIFT; Please use uint64_t here and make sure to cast bo->num_pages before shifting. We have an unit tests of allocating a 8GB BO and that should work on a 32bit machine as well :) + + if (mem->bus.addr) + vaddr_iomem = (void *)(((u8 *)mem->bus.addr)); + else if (mem->placement & TTM_PL_FLAG_WC) I've just nuked the TTM_PL_FLAG_WC flag in drm-misc-next. There is a new mem->bus.caching enum as replacement. + vaddr_iomem = ioremap_wc(mem->bus.offset, size); + else
Re: [PATCH v4 06/10] drm/gem: Use struct dma_buf_map in GEM vmap ops and convert GEM backends
Am 15.10.20 um 14:38 schrieb Thomas Zimmermann: This patch replaces the vmap/vunmap's use of raw pointers in GEM object functions with instances of struct dma_buf_map. GEM backends are converted as well. For most of them, this simply changes the returned type. TTM-based drivers now return information about the location of the memory, either system or I/O memory. GEM VRAM helpers and qxl now use ttm_bo_vmap() et al. Amdgpu, nouveau and radeon use drm_gem_ttm_vmap() et al instead of implementing their own vmap callbacks. v4: * use ttm_bo_vmap(), drm_gem_ttm_vmap(), et al. (Daniel, Christian) * fix a trailing { in drm_gem_vmap() * remove several empty functions instead of converting them (Daniel) * comment uses of raw pointers with a TODO (Daniel) * TODO list: convert more helpers to use struct dma_buf_map Signed-off-by: Thomas Zimmermann The amdgpu changes look good to me, but I can't fully judge the other stuff. Acked-by: Christian König --- Documentation/gpu/todo.rst | 18 drivers/gpu/drm/Kconfig | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 36 --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.h | 2 - drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 1 - drivers/gpu/drm/ast/ast_cursor.c| 27 +++-- drivers/gpu/drm/ast/ast_drv.h | 7 +- drivers/gpu/drm/drm_gem.c | 23 +++-- drivers/gpu/drm/drm_gem_cma_helper.c| 10 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 48 + drivers/gpu/drm/drm_gem_vram_helper.c | 107 ++-- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 2 +- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 9 +- drivers/gpu/drm/lima/lima_gem.c | 6 +- drivers/gpu/drm/lima/lima_sched.c | 11 +- drivers/gpu/drm/mgag200/mgag200_mode.c | 10 +- drivers/gpu/drm/nouveau/Kconfig | 1 + drivers/gpu/drm/nouveau/nouveau_bo.h| 2 - drivers/gpu/drm/nouveau/nouveau_gem.c | 6 +- drivers/gpu/drm/nouveau/nouveau_gem.h | 2 - drivers/gpu/drm/nouveau/nouveau_prime.c | 20 drivers/gpu/drm/panfrost/panfrost_perfcnt.c | 14 +-- drivers/gpu/drm/qxl/qxl_display.c | 11 +- drivers/gpu/drm/qxl/qxl_draw.c | 14 ++- drivers/gpu/drm/qxl/qxl_drv.h | 11 +- drivers/gpu/drm/qxl/qxl_object.c| 31 +++--- drivers/gpu/drm/qxl/qxl_object.h| 2 +- drivers/gpu/drm/qxl/qxl_prime.c | 12 +-- drivers/gpu/drm/radeon/radeon.h | 1 - drivers/gpu/drm/radeon/radeon_gem.c | 7 +- drivers/gpu/drm/radeon/radeon_prime.c | 20 drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 22 ++-- drivers/gpu/drm/rockchip/rockchip_drm_gem.h | 4 +- drivers/gpu/drm/tiny/cirrus.c | 10 +- drivers/gpu/drm/tiny/gm12u320.c | 10 +- drivers/gpu/drm/udl/udl_modeset.c | 8 +- drivers/gpu/drm/vboxvideo/vbox_mode.c | 11 +- drivers/gpu/drm/vc4/vc4_bo.c| 6 +- drivers/gpu/drm/vc4/vc4_drv.h | 2 +- drivers/gpu/drm/vgem/vgem_drv.c | 16 ++- drivers/gpu/drm/xen/xen_drm_front_gem.c | 18 ++-- drivers/gpu/drm/xen/xen_drm_front_gem.h | 6 +- include/drm/drm_gem.h | 5 +- include/drm/drm_gem_cma_helper.h| 2 +- include/drm/drm_gem_shmem_helper.h | 4 +- include/drm/drm_gem_vram_helper.h | 14 +-- 47 files changed, 321 insertions(+), 295 deletions(-) diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst index 700637e25ecd..7e6fc3c04add 100644 --- a/Documentation/gpu/todo.rst +++ b/Documentation/gpu/todo.rst @@ -446,6 +446,24 @@ Contact: Ville Syrjälä, Daniel Vetter Level: Intermediate +Use struct dma_buf_map throughout codebase +-- + +Pointers to shared device memory are stored in struct dma_buf_map. Each +instance knows whether it refers to system or I/O memory. Most of the DRM-wide +interface have been converted to use struct dma_buf_map, but implementations +often still use raw pointers. + +The task is to use struct dma_buf_map where it makes sense. + +* Memory managers should use struct dma_buf_map for dma-buf-imported buffers. +* TTM might benefit from using struct dma_buf_map internally. +* Framebuffer copying and blitting helpers should operate on struct dma_buf_map. + +Contact: Thomas Zimmermann , Christian König, Daniel Vetter + +Level: Intermediate + Core refactorings = diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 147d61b9674e..319839b87d37 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -239,6 +239,7 @@ config DRM_RADEON select FW_LOADER select DRM_
Re: [PATCH v4 05/10] drm/ttm: Add vmap/vunmap to TTM and TTM GEM helpers
Am 15.10.20 um 14:38 schrieb Thomas Zimmermann: The new functions ttm_bo_{vmap,vunmap}() map and unmap a TTM BO in kernel address space. The mapping's address is returned as struct dma_buf_map. Each function is a simplified version of TTM's existing kmap code. Both functions respect the memory's location ani/or writecombine flags. On top TTM's functions, GEM TTM helpers got drm_gem_ttm_{vmap,vunmap}(), two helpers that convert a GEM object into the TTM BO and forward the call to TTM's vmap/vunmap. These helpers can be dropped into the rsp GEM object callbacks. v4: * drop ttm_kmap_obj_to_dma_buf() in favor of vmap helpers (Daniel, Christian) Bunch of minor comments below, but over all look very solid to me. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/drm_gem_ttm_helper.c | 38 +++ drivers/gpu/drm/ttm/ttm_bo_util.c| 72 include/drm/drm_gem_ttm_helper.h | 6 +++ include/drm/ttm/ttm_bo_api.h | 28 +++ include/linux/dma-buf-map.h | 20 5 files changed, 164 insertions(+) diff --git a/drivers/gpu/drm/drm_gem_ttm_helper.c b/drivers/gpu/drm/drm_gem_ttm_helper.c index 0e4fb9ba43ad..db4c14d78a30 100644 --- a/drivers/gpu/drm/drm_gem_ttm_helper.c +++ b/drivers/gpu/drm/drm_gem_ttm_helper.c @@ -49,6 +49,44 @@ void drm_gem_ttm_print_info(struct drm_printer *p, unsigned int indent, } EXPORT_SYMBOL(drm_gem_ttm_print_info); +/** + * drm_gem_ttm_vmap() - vmap &ttm_buffer_object + * @gem: GEM object. + * @map: [out] returns the dma-buf mapping. + * + * Maps a GEM object with ttm_bo_vmap(). This function can be used as + * &drm_gem_object_funcs.vmap callback. + * + * Returns: + * 0 on success, or a negative errno code otherwise. + */ +int drm_gem_ttm_vmap(struct drm_gem_object *gem, +struct dma_buf_map *map) +{ + struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem); + + return ttm_bo_vmap(bo, map); + +} +EXPORT_SYMBOL(drm_gem_ttm_vmap); + +/** + * drm_gem_ttm_vunmap() - vunmap &ttm_buffer_object + * @gem: GEM object. + * @map: dma-buf mapping. + * + * Unmaps a GEM object with ttm_bo_vunmap(). This function can be used as + * &drm_gem_object_funcs.vmap callback. + */ +void drm_gem_ttm_vunmap(struct drm_gem_object *gem, + struct dma_buf_map *map) +{ + struct ttm_buffer_object *bo = drm_gem_ttm_of_gem(gem); + + ttm_bo_vunmap(bo, map); +} +EXPORT_SYMBOL(drm_gem_ttm_vunmap); + /** * drm_gem_ttm_mmap() - mmap &ttm_buffer_object * @gem: GEM object. diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index bdee4df1f3f2..80c42c774c7d 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #include @@ -526,6 +527,77 @@ void ttm_bo_kunmap(struct ttm_bo_kmap_obj *map) } EXPORT_SYMBOL(ttm_bo_kunmap); +int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map) +{ + struct ttm_resource *mem = &bo->mem; + int ret; + + ret = ttm_mem_io_reserve(bo->bdev, mem); + if (ret) + return ret; + + if (mem->bus.is_iomem) { + void __iomem *vaddr_iomem; + unsigned long size = bo->num_pages << PAGE_SHIFT; Please use uint64_t here and make sure to cast bo->num_pages before shifting. We have an unit tests of allocating a 8GB BO and that should work on a 32bit machine as well :) + + if (mem->bus.addr) + vaddr_iomem = (void *)(((u8 *)mem->bus.addr)); + else if (mem->placement & TTM_PL_FLAG_WC) I've just nuked the TTM_PL_FLAG_WC flag in drm-misc-next. There is a new mem->bus.caching enum as replacement. + vaddr_iomem = ioremap_wc(mem->bus.offset, size); + else + vaddr_iomem = ioremap(mem->bus.offset, size); + + if (!vaddr_iomem) + return -ENOMEM; + + dma_buf_map_set_vaddr_iomem(map, vaddr_iomem); + + } else { + struct ttm_operation_ctx ctx = { + .interruptible = false, + .no_wait_gpu = false + }; + struct ttm_tt *ttm = bo->ttm; + pgprot_t prot; + void *vaddr; + + BUG_ON(!ttm); I think we can drop this, populate will just crash badly anyway. + + ret = ttm_tt_populate(bo->bdev, ttm, &ctx); + if (ret) + return ret; + + /* +* We need to use vmap to get the desired page protection +* or to make the buffer object look contiguous. +*/ + prot = ttm_io_prot(mem->placement, PAGE_KERNEL); The calling convention has changed on drm-misc-next as well, but should be trivial to adapt. Regards
Re: [PATCH v4 04/10] drm/exynos: Remove empty exynos_drm_gem_prime_{vmap,vunmap}()
Am 15.10.20 um 14:38 schrieb Thomas Zimmermann: The functions exynos_drm_gem_prime_{vmap,vunmap}() are empty. Remove them before changing the interface to use struct drm_buf_map. As a side effect of removing drm_gem_prime_vmap(), the error code changes from ENOMEM to EOPNOTSUPP. Signed-off-by: Thomas Zimmermann Acked-by: Christian König --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 12 drivers/gpu/drm/exynos/exynos_drm_gem.h | 2 -- 2 files changed, 14 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c index e7a6eb96f692..13a35623ac04 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c @@ -137,8 +137,6 @@ static const struct vm_operations_struct exynos_drm_gem_vm_ops = { static const struct drm_gem_object_funcs exynos_drm_gem_object_funcs = { .free = exynos_drm_gem_free_object, .get_sg_table = exynos_drm_gem_prime_get_sg_table, - .vmap = exynos_drm_gem_prime_vmap, - .vunmap = exynos_drm_gem_prime_vunmap, .vm_ops = &exynos_drm_gem_vm_ops, }; @@ -471,16 +469,6 @@ exynos_drm_gem_prime_import_sg_table(struct drm_device *dev, return &exynos_gem->base; } -void *exynos_drm_gem_prime_vmap(struct drm_gem_object *obj) -{ - return NULL; -} - -void exynos_drm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr) -{ - /* Nothing to do */ -} - int exynos_drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) { diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h index 74e926abeff0..a23272fb96fb 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h @@ -107,8 +107,6 @@ struct drm_gem_object * exynos_drm_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *attach, struct sg_table *sgt); -void *exynos_drm_gem_prime_vmap(struct drm_gem_object *obj); -void exynos_drm_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); int exynos_drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
Re: [PATCH v4 03/10] drm/etnaviv: Remove empty etnaviv_gem_prime_vunmap()
Am 15.10.20 um 14:37 schrieb Thomas Zimmermann: The function etnaviv_gem_prime_vunmap() is empty. Remove it before changing the interface to use struct drm_buf_map. Signed-off-by: Thomas Zimmermann Acked-by: Christian König --- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 - drivers/gpu/drm/etnaviv/etnaviv_gem.c | 1 - drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 5 - 3 files changed, 7 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h b/drivers/gpu/drm/etnaviv/etnaviv_drv.h index 914f0867ff71..9682c26d89bb 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h @@ -52,7 +52,6 @@ int etnaviv_gem_mmap(struct file *filp, struct vm_area_struct *vma); int etnaviv_gem_mmap_offset(struct drm_gem_object *obj, u64 *offset); struct sg_table *etnaviv_gem_prime_get_sg_table(struct drm_gem_object *obj); void *etnaviv_gem_prime_vmap(struct drm_gem_object *obj); -void etnaviv_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); int etnaviv_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma); struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev, diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c index 67d9a2b9ea6a..bbd235473645 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c @@ -571,7 +571,6 @@ static const struct drm_gem_object_funcs etnaviv_gem_object_funcs = { .unpin = etnaviv_gem_prime_unpin, .get_sg_table = etnaviv_gem_prime_get_sg_table, .vmap = etnaviv_gem_prime_vmap, - .vunmap = etnaviv_gem_prime_vunmap, .vm_ops = &vm_ops, }; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c index 135fbff6fecf..a6d9932a32ae 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c @@ -27,11 +27,6 @@ void *etnaviv_gem_prime_vmap(struct drm_gem_object *obj) return etnaviv_gem_vmap(obj); } -void etnaviv_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr) -{ - /* TODO msm_gem_vunmap() */ -} - int etnaviv_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) {
Re: [PATCH v4 02/10] drm/cma-helper: Remove empty drm_gem_cma_prime_vunmap()
Am 15.10.20 um 14:37 schrieb Thomas Zimmermann: The function drm_gem_cma_prime_vunmap() is empty. Remove it before changing the interface to use struct drm_buf_map. Signed-off-by: Thomas Zimmermann Reviewed-by: Christian König --- drivers/gpu/drm/drm_gem_cma_helper.c | 17 - drivers/gpu/drm/vc4/vc4_bo.c | 1 - include/drm/drm_gem_cma_helper.h | 1 - 3 files changed, 19 deletions(-) diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c index 2165633c9b9e..d527485ea0b7 100644 --- a/drivers/gpu/drm/drm_gem_cma_helper.c +++ b/drivers/gpu/drm/drm_gem_cma_helper.c @@ -537,23 +537,6 @@ void *drm_gem_cma_prime_vmap(struct drm_gem_object *obj) } EXPORT_SYMBOL_GPL(drm_gem_cma_prime_vmap); -/** - * drm_gem_cma_prime_vunmap - unmap a CMA GEM object from the kernel's virtual - * address space - * @obj: GEM object - * @vaddr: kernel virtual address where the CMA GEM object was mapped - * - * This function removes a buffer exported via DRM PRIME from the kernel's - * virtual address space. This is a no-op because CMA buffers cannot be - * unmapped from kernel space. Drivers using the CMA helpers should set this - * as their &drm_gem_object_funcs.vunmap callback. - */ -void drm_gem_cma_prime_vunmap(struct drm_gem_object *obj, void *vaddr) -{ - /* Nothing to do */ -} -EXPORT_SYMBOL_GPL(drm_gem_cma_prime_vunmap); - static const struct drm_gem_object_funcs drm_gem_cma_default_funcs = { .free = drm_gem_cma_free_object, .print_info = drm_gem_cma_print_info, diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c index f432278173cd..557f0d1e6437 100644 --- a/drivers/gpu/drm/vc4/vc4_bo.c +++ b/drivers/gpu/drm/vc4/vc4_bo.c @@ -387,7 +387,6 @@ static const struct drm_gem_object_funcs vc4_gem_object_funcs = { .export = vc4_prime_export, .get_sg_table = drm_gem_cma_prime_get_sg_table, .vmap = vc4_prime_vmap, - .vunmap = drm_gem_cma_prime_vunmap, .vm_ops = &vc4_vm_ops, }; diff --git a/include/drm/drm_gem_cma_helper.h b/include/drm/drm_gem_cma_helper.h index 2bfa2502607a..a064b0d1c480 100644 --- a/include/drm/drm_gem_cma_helper.h +++ b/include/drm/drm_gem_cma_helper.h @@ -104,7 +104,6 @@ drm_gem_cma_prime_import_sg_table(struct drm_device *dev, int drm_gem_cma_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma); void *drm_gem_cma_prime_vmap(struct drm_gem_object *obj); -void drm_gem_cma_prime_vunmap(struct drm_gem_object *obj, void *vaddr); struct drm_gem_object * drm_gem_cma_create_object_default_funcs(struct drm_device *dev, size_t size);
Re: [PATCH v4 01/10] drm/vram-helper: Remove invariant parameters from internal kmap function
Am 15.10.20 um 14:37 schrieb Thomas Zimmermann: The parameters map and is_iomem are always of the same value. Removed them to prepares the function for conversion to struct dma_buf_map. v4: * don't check for !kmap->virtual; will always be false Signed-off-by: Thomas Zimmermann Reviewed-by: Daniel Vetter Reviewed-by: Christian König --- drivers/gpu/drm/drm_gem_vram_helper.c | 18 -- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/drm_gem_vram_helper.c b/drivers/gpu/drm/drm_gem_vram_helper.c index 3213429f8444..2d5ed30518f1 100644 --- a/drivers/gpu/drm/drm_gem_vram_helper.c +++ b/drivers/gpu/drm/drm_gem_vram_helper.c @@ -382,32 +382,22 @@ int drm_gem_vram_unpin(struct drm_gem_vram_object *gbo) } EXPORT_SYMBOL(drm_gem_vram_unpin); -static void *drm_gem_vram_kmap_locked(struct drm_gem_vram_object *gbo, - bool map, bool *is_iomem) +static void *drm_gem_vram_kmap_locked(struct drm_gem_vram_object *gbo) { int ret; struct ttm_bo_kmap_obj *kmap = &gbo->kmap; + bool is_iomem; if (gbo->kmap_use_count > 0) goto out; - if (kmap->virtual || !map) - goto out; - ret = ttm_bo_kmap(&gbo->bo, 0, gbo->bo.num_pages, kmap); if (ret) return ERR_PTR(ret); out: - if (!kmap->virtual) { - if (is_iomem) - *is_iomem = false; - return NULL; /* not mapped; don't increment ref */ - } ++gbo->kmap_use_count; - if (is_iomem) - return ttm_kmap_obj_virtual(kmap, is_iomem); - return kmap->virtual; + return ttm_kmap_obj_virtual(kmap, &is_iomem); } static void drm_gem_vram_kunmap_locked(struct drm_gem_vram_object *gbo) @@ -452,7 +442,7 @@ void *drm_gem_vram_vmap(struct drm_gem_vram_object *gbo) ret = drm_gem_vram_pin_locked(gbo, 0); if (ret) goto err_ttm_bo_unreserve; - base = drm_gem_vram_kmap_locked(gbo, true, NULL); + base = drm_gem_vram_kmap_locked(gbo); if (IS_ERR(base)) { ret = PTR_ERR(base); goto err_drm_gem_vram_unpin_locked;
Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion
Am 07.10.20 um 15:20 schrieb Thomas Zimmermann: Hi Am 07.10.20 um 15:10 schrieb Daniel Vetter: On Wed, Oct 7, 2020 at 2:57 PM Thomas Zimmermann wrote: Hi Am 02.10.20 um 11:58 schrieb Daniel Vetter: On Wed, Sep 30, 2020 at 02:51:46PM +0200, Daniel Vetter wrote: On Wed, Sep 30, 2020 at 2:34 PM Christian König wrote: Am 30.09.20 um 11:47 schrieb Daniel Vetter: On Wed, Sep 30, 2020 at 10:34:31AM +0200, Christian König wrote: Am 30.09.20 um 10:19 schrieb Thomas Zimmermann: Hi Am 30.09.20 um 10:05 schrieb Christian König: Am 29.09.20 um 19:49 schrieb Thomas Zimmermann: Hi Christian Am 29.09.20 um 17:35 schrieb Christian König: Am 29.09.20 um 17:14 schrieb Thomas Zimmermann: The new helper ttm_kmap_obj_to_dma_buf() extracts address and location from and instance of TTM's kmap_obj and initializes struct dma_buf_map with these values. Helpful for TTM-based drivers. We could completely drop that if we use the same structure inside TTM as well. Additional to that which driver is going to use this? As Daniel mentioned, it's in patch 3. The TTM-based drivers will retrieve the pointer via this function. I do want to see all that being more tightly integrated into TTM, but not in this series. This one is about fixing the bochs-on-sparc64 problem for good. Patch 7 adds an update to TTM to the DRM TODO list. I should have asked which driver you try to fix here :) In this case just keep the function inside bochs and only fix it there. All other drivers can be fixed when we generally pump this through TTM. Did you take a look at patch 3? This function will be used by VRAM helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we have to duplicate the functionality in each if these drivers. Bochs itself uses VRAM helpers and doesn't touch the function directly. Ah, ok can we have that then only in the VRAM helpers? Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK. What I want to avoid is to have another conversion function in TTM because what happens here is that we already convert from ttm_bus_placement to ttm_bo_kmap_obj and then to dma_buf_map. Hm I'm not really seeing how that helps with a gradual conversion of everything over to dma_buf_map and assorted helpers for access? There's too many places in ttm drivers where is_iomem and related stuff is used to be able to convert it all in one go. An intermediate state with a bunch of conversions seems fairly unavoidable to me. Fair enough. I would just have started bottom up and not top down. Anyway feel free to go ahead with this approach as long as we can remove the new function again when we clean that stuff up for good. Yeah I guess bottom up would make more sense as a refactoring. But the main motivation to land this here is to fix the __mmio vs normal memory confusion in the fbdev emulation helpers for sparc (and anything else that needs this). Hence the top down approach for rolling this out. Ok I started reviewing this a bit more in-depth, and I think this is a bit too much of a de-tour. Looking through all the callers of ttm_bo_kmap almost everyone maps the entire object. Only vmwgfx uses to map less than that. Also, everyone just immediately follows up with converting that full object map into a pointer. So I think what we really want here is: - new function int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map); _vmap name since that's consistent with both dma_buf functions and what's usually used to implement this. Outside of the ttm world kmap usually just means single-page mappings using kmap() or it's iomem sibling io_mapping_map* so rather confusing name for a function which usually is just used to set up a vmap of the entire buffer. - a helper which can be used for the drm_gem_object_funcs vmap/vunmap functions for all ttm drivers. We should be able to make this fully generic because a) we now have dma_buf_map and b) drm_gem_object is embedded in the ttm_bo, so we can upcast for everyone who's both a ttm and gem driver. This is maybe a good follow-up, since it should allow us to ditch quite a bit of the vram helper code for this more generic stuff. I also might have missed some special-cases here, but from a quick look everything just pins the buffer to the current location and that's it. Also this obviously requires Christian's generic ttm_bo_pin rework first. - roll the above out to drivers. Christian/Thomas, thoughts on this? I agree on the goals, but what is the immediate objective here? Adding ttm_bo_vmap() does not work out easily, as struct ttm_bo_kmap_obj is a central part of the internals of TTM. struct ttm_bo_kmap_obj has more internal state that struct dma_buf_map, so they are not easily convertible either. What you propose seems to require a reimplementation of the existing ttm_bo_kmap() c
Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion
Am 02.10.20 um 11:58 schrieb Daniel Vetter: On Wed, Sep 30, 2020 at 02:51:46PM +0200, Daniel Vetter wrote: On Wed, Sep 30, 2020 at 2:34 PM Christian König wrote: Am 30.09.20 um 11:47 schrieb Daniel Vetter: On Wed, Sep 30, 2020 at 10:34:31AM +0200, Christian König wrote: Am 30.09.20 um 10:19 schrieb Thomas Zimmermann: Hi Am 30.09.20 um 10:05 schrieb Christian König: Am 29.09.20 um 19:49 schrieb Thomas Zimmermann: Hi Christian Am 29.09.20 um 17:35 schrieb Christian König: Am 29.09.20 um 17:14 schrieb Thomas Zimmermann: The new helper ttm_kmap_obj_to_dma_buf() extracts address and location from and instance of TTM's kmap_obj and initializes struct dma_buf_map with these values. Helpful for TTM-based drivers. We could completely drop that if we use the same structure inside TTM as well. Additional to that which driver is going to use this? As Daniel mentioned, it's in patch 3. The TTM-based drivers will retrieve the pointer via this function. I do want to see all that being more tightly integrated into TTM, but not in this series. This one is about fixing the bochs-on-sparc64 problem for good. Patch 7 adds an update to TTM to the DRM TODO list. I should have asked which driver you try to fix here :) In this case just keep the function inside bochs and only fix it there. All other drivers can be fixed when we generally pump this through TTM. Did you take a look at patch 3? This function will be used by VRAM helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we have to duplicate the functionality in each if these drivers. Bochs itself uses VRAM helpers and doesn't touch the function directly. Ah, ok can we have that then only in the VRAM helpers? Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK. What I want to avoid is to have another conversion function in TTM because what happens here is that we already convert from ttm_bus_placement to ttm_bo_kmap_obj and then to dma_buf_map. Hm I'm not really seeing how that helps with a gradual conversion of everything over to dma_buf_map and assorted helpers for access? There's too many places in ttm drivers where is_iomem and related stuff is used to be able to convert it all in one go. An intermediate state with a bunch of conversions seems fairly unavoidable to me. Fair enough. I would just have started bottom up and not top down. Anyway feel free to go ahead with this approach as long as we can remove the new function again when we clean that stuff up for good. Yeah I guess bottom up would make more sense as a refactoring. But the main motivation to land this here is to fix the __mmio vs normal memory confusion in the fbdev emulation helpers for sparc (and anything else that needs this). Hence the top down approach for rolling this out. Ok I started reviewing this a bit more in-depth, and I think this is a bit too much of a de-tour. Looking through all the callers of ttm_bo_kmap almost everyone maps the entire object. Only vmwgfx uses to map less than that. Also, everyone just immediately follows up with converting that full object map into a pointer. So I think what we really want here is: - new function int ttm_bo_vmap(struct ttm_buffer_object *bo, struct dma_buf_map *map); _vmap name since that's consistent with both dma_buf functions and what's usually used to implement this. Outside of the ttm world kmap usually just means single-page mappings using kmap() or it's iomem sibling io_mapping_map* so rather confusing name for a function which usually is just used to set up a vmap of the entire buffer. - a helper which can be used for the drm_gem_object_funcs vmap/vunmap functions for all ttm drivers. We should be able to make this fully generic because a) we now have dma_buf_map and b) drm_gem_object is embedded in the ttm_bo, so we can upcast for everyone who's both a ttm and gem driver. This is maybe a good follow-up, since it should allow us to ditch quite a bit of the vram helper code for this more generic stuff. I also might have missed some special-cases here, but from a quick look everything just pins the buffer to the current location and that's it. Also this obviously requires Christian's generic ttm_bo_pin rework first. - roll the above out to drivers. Christian/Thomas, thoughts on this? Calling this vmap instead of kmap certainly makes sense. Not 100% sure about the generic helpers, but it sounds like this should indeed look rather clean in the end. Christian. I think for the immediate need of rolling this out for vram helpers and fbdev code we should be able to do this, but just postpone the driver wide roll-out for now. Cheers, Daniel -Daniel Christian. -Daniel Thanks, Christian. Best regards Thomas Regards, Christian. Best regards Thomas Regards, Christian. Signed-off-by: T
Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion
Am 30.09.20 um 11:47 schrieb Daniel Vetter: On Wed, Sep 30, 2020 at 10:34:31AM +0200, Christian König wrote: Am 30.09.20 um 10:19 schrieb Thomas Zimmermann: Hi Am 30.09.20 um 10:05 schrieb Christian König: Am 29.09.20 um 19:49 schrieb Thomas Zimmermann: Hi Christian Am 29.09.20 um 17:35 schrieb Christian König: Am 29.09.20 um 17:14 schrieb Thomas Zimmermann: The new helper ttm_kmap_obj_to_dma_buf() extracts address and location from and instance of TTM's kmap_obj and initializes struct dma_buf_map with these values. Helpful for TTM-based drivers. We could completely drop that if we use the same structure inside TTM as well. Additional to that which driver is going to use this? As Daniel mentioned, it's in patch 3. The TTM-based drivers will retrieve the pointer via this function. I do want to see all that being more tightly integrated into TTM, but not in this series. This one is about fixing the bochs-on-sparc64 problem for good. Patch 7 adds an update to TTM to the DRM TODO list. I should have asked which driver you try to fix here :) In this case just keep the function inside bochs and only fix it there. All other drivers can be fixed when we generally pump this through TTM. Did you take a look at patch 3? This function will be used by VRAM helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we have to duplicate the functionality in each if these drivers. Bochs itself uses VRAM helpers and doesn't touch the function directly. Ah, ok can we have that then only in the VRAM helpers? Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK. What I want to avoid is to have another conversion function in TTM because what happens here is that we already convert from ttm_bus_placement to ttm_bo_kmap_obj and then to dma_buf_map. Hm I'm not really seeing how that helps with a gradual conversion of everything over to dma_buf_map and assorted helpers for access? There's too many places in ttm drivers where is_iomem and related stuff is used to be able to convert it all in one go. An intermediate state with a bunch of conversions seems fairly unavoidable to me. Fair enough. I would just have started bottom up and not top down. Anyway feel free to go ahead with this approach as long as we can remove the new function again when we clean that stuff up for good. Christian. -Daniel Thanks, Christian. Best regards Thomas Regards, Christian. Best regards Thomas Regards, Christian. Signed-off-by: Thomas Zimmermann --- include/drm/ttm/ttm_bo_api.h | 24 include/linux/dma-buf-map.h | 20 2 files changed, 44 insertions(+) diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h index c96a25d571c8..62d89f05a801 100644 --- a/include/drm/ttm/ttm_bo_api.h +++ b/include/drm/ttm/ttm_bo_api.h @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct ttm_bo_kmap_obj *map, return map->virtual; } +/** + * ttm_kmap_obj_to_dma_buf_map + * + * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap. + * @map: Returns the mapping as struct dma_buf_map + * + * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory + * is not mapped, the returned mapping is initialized to NULL. + */ +static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj *kmap, + struct dma_buf_map *map) +{ + bool is_iomem; + void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem); + + if (!vaddr) + dma_buf_map_clear(map); + else if (is_iomem) + dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr); + else + dma_buf_map_set_vaddr(map, vaddr); +} + /** * ttm_bo_kmap * diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index fd1aba545fdf..2e8bbecb5091 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -45,6 +45,12 @@ * * dma_buf_map_set_vaddr(&map. 0xdeadbeaf); * + * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem(). + * + * .. code-block:: c + * + * dma_buf_map_set_vaddr_iomem(&map. 0xdeadbeaf); + * * Test if a mapping is valid with either dma_buf_map_is_set() or * dma_buf_map_is_null(). * @@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct dma_buf_map *map, void *vaddr) map->is_iomem = false; } +/** + * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to an address in I/O memory + * @map: The dma-buf mapping structure + * @vaddr_iomem: An I/O-memory address + * + * Sets the address and the I/O-memory flag. + */ +static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map, + voi
Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion
Am 30.09.20 um 10:19 schrieb Thomas Zimmermann: Hi Am 30.09.20 um 10:05 schrieb Christian König: Am 29.09.20 um 19:49 schrieb Thomas Zimmermann: Hi Christian Am 29.09.20 um 17:35 schrieb Christian König: Am 29.09.20 um 17:14 schrieb Thomas Zimmermann: The new helper ttm_kmap_obj_to_dma_buf() extracts address and location from and instance of TTM's kmap_obj and initializes struct dma_buf_map with these values. Helpful for TTM-based drivers. We could completely drop that if we use the same structure inside TTM as well. Additional to that which driver is going to use this? As Daniel mentioned, it's in patch 3. The TTM-based drivers will retrieve the pointer via this function. I do want to see all that being more tightly integrated into TTM, but not in this series. This one is about fixing the bochs-on-sparc64 problem for good. Patch 7 adds an update to TTM to the DRM TODO list. I should have asked which driver you try to fix here :) In this case just keep the function inside bochs and only fix it there. All other drivers can be fixed when we generally pump this through TTM. Did you take a look at patch 3? This function will be used by VRAM helpers, nouveau, radeon, amdgpu and qxl. If we don't put it here, we have to duplicate the functionality in each if these drivers. Bochs itself uses VRAM helpers and doesn't touch the function directly. Ah, ok can we have that then only in the VRAM helpers? Alternative you could go ahead and use dma_buf_map in ttm_bo_kmap_obj directly and drop the hack with the TTM_BO_MAP_IOMEM_MASK. What I want to avoid is to have another conversion function in TTM because what happens here is that we already convert from ttm_bus_placement to ttm_bo_kmap_obj and then to dma_buf_map. Thanks, Christian. Best regards Thomas Regards, Christian. Best regards Thomas Regards, Christian. Signed-off-by: Thomas Zimmermann --- include/drm/ttm/ttm_bo_api.h | 24 include/linux/dma-buf-map.h | 20 2 files changed, 44 insertions(+) diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h index c96a25d571c8..62d89f05a801 100644 --- a/include/drm/ttm/ttm_bo_api.h +++ b/include/drm/ttm/ttm_bo_api.h @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct ttm_bo_kmap_obj *map, return map->virtual; } +/** + * ttm_kmap_obj_to_dma_buf_map + * + * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap. + * @map: Returns the mapping as struct dma_buf_map + * + * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory + * is not mapped, the returned mapping is initialized to NULL. + */ +static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj *kmap, + struct dma_buf_map *map) +{ + bool is_iomem; + void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem); + + if (!vaddr) + dma_buf_map_clear(map); + else if (is_iomem) + dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr); + else + dma_buf_map_set_vaddr(map, vaddr); +} + /** * ttm_bo_kmap * diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index fd1aba545fdf..2e8bbecb5091 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -45,6 +45,12 @@ * * dma_buf_map_set_vaddr(&map. 0xdeadbeaf); * + * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem(). + * + * .. code-block:: c + * + * dma_buf_map_set_vaddr_iomem(&map. 0xdeadbeaf); + * * Test if a mapping is valid with either dma_buf_map_is_set() or * dma_buf_map_is_null(). * @@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct dma_buf_map *map, void *vaddr) map->is_iomem = false; } +/** + * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to an address in I/O memory + * @map: The dma-buf mapping structure + * @vaddr_iomem: An I/O-memory address + * + * Sets the address and the I/O-memory flag. + */ +static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map, + void __iomem *vaddr_iomem) +{ + map->vaddr_iomem = vaddr_iomem; + map->is_iomem = true; +} + /** * dma_buf_map_is_equal - Compares two dma-buf mapping structures for equality * @lhs: The dma-buf mapping structure ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://list
Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion
Am 29.09.20 um 19:49 schrieb Thomas Zimmermann: Hi Christian Am 29.09.20 um 17:35 schrieb Christian König: Am 29.09.20 um 17:14 schrieb Thomas Zimmermann: The new helper ttm_kmap_obj_to_dma_buf() extracts address and location from and instance of TTM's kmap_obj and initializes struct dma_buf_map with these values. Helpful for TTM-based drivers. We could completely drop that if we use the same structure inside TTM as well. Additional to that which driver is going to use this? As Daniel mentioned, it's in patch 3. The TTM-based drivers will retrieve the pointer via this function. I do want to see all that being more tightly integrated into TTM, but not in this series. This one is about fixing the bochs-on-sparc64 problem for good. Patch 7 adds an update to TTM to the DRM TODO list. I should have asked which driver you try to fix here :) In this case just keep the function inside bochs and only fix it there. All other drivers can be fixed when we generally pump this through TTM. Regards, Christian. Best regards Thomas Regards, Christian. Signed-off-by: Thomas Zimmermann --- include/drm/ttm/ttm_bo_api.h | 24 include/linux/dma-buf-map.h | 20 2 files changed, 44 insertions(+) diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h index c96a25d571c8..62d89f05a801 100644 --- a/include/drm/ttm/ttm_bo_api.h +++ b/include/drm/ttm/ttm_bo_api.h @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct ttm_bo_kmap_obj *map, return map->virtual; } +/** + * ttm_kmap_obj_to_dma_buf_map + * + * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap. + * @map: Returns the mapping as struct dma_buf_map + * + * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory + * is not mapped, the returned mapping is initialized to NULL. + */ +static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj *kmap, + struct dma_buf_map *map) +{ + bool is_iomem; + void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem); + + if (!vaddr) + dma_buf_map_clear(map); + else if (is_iomem) + dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr); + else + dma_buf_map_set_vaddr(map, vaddr); +} + /** * ttm_bo_kmap * diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index fd1aba545fdf..2e8bbecb5091 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -45,6 +45,12 @@ * * dma_buf_map_set_vaddr(&map. 0xdeadbeaf); * + * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem(). + * + * .. code-block:: c + * + * dma_buf_map_set_vaddr_iomem(&map. 0xdeadbeaf); + * * Test if a mapping is valid with either dma_buf_map_is_set() or * dma_buf_map_is_null(). * @@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct dma_buf_map *map, void *vaddr) map->is_iomem = false; } +/** + * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to an address in I/O memory + * @map: The dma-buf mapping structure + * @vaddr_iomem: An I/O-memory address + * + * Sets the address and the I/O-memory flag. + */ +static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map, + void __iomem *vaddr_iomem) +{ + map->vaddr_iomem = vaddr_iomem; + map->is_iomem = true; +} + /** * dma_buf_map_is_equal - Compares two dma-buf mapping structures for equality * @lhs: The dma-buf mapping structure ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH v3 2/7] drm/ttm: Add ttm_kmap_obj_to_dma_buf_map() for type conversion
Am 29.09.20 um 17:14 schrieb Thomas Zimmermann: The new helper ttm_kmap_obj_to_dma_buf() extracts address and location from and instance of TTM's kmap_obj and initializes struct dma_buf_map with these values. Helpful for TTM-based drivers. We could completely drop that if we use the same structure inside TTM as well. Additional to that which driver is going to use this? Regards, Christian. Signed-off-by: Thomas Zimmermann --- include/drm/ttm/ttm_bo_api.h | 24 include/linux/dma-buf-map.h | 20 2 files changed, 44 insertions(+) diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h index c96a25d571c8..62d89f05a801 100644 --- a/include/drm/ttm/ttm_bo_api.h +++ b/include/drm/ttm/ttm_bo_api.h @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include @@ -486,6 +487,29 @@ static inline void *ttm_kmap_obj_virtual(struct ttm_bo_kmap_obj *map, return map->virtual; } +/** + * ttm_kmap_obj_to_dma_buf_map + * + * @kmap: A struct ttm_bo_kmap_obj returned from ttm_bo_kmap. + * @map: Returns the mapping as struct dma_buf_map + * + * Converts struct ttm_bo_kmap_obj to struct dma_buf_map. If the memory + * is not mapped, the returned mapping is initialized to NULL. + */ +static inline void ttm_kmap_obj_to_dma_buf_map(struct ttm_bo_kmap_obj *kmap, + struct dma_buf_map *map) +{ + bool is_iomem; + void *vaddr = ttm_kmap_obj_virtual(kmap, &is_iomem); + + if (!vaddr) + dma_buf_map_clear(map); + else if (is_iomem) + dma_buf_map_set_vaddr_iomem(map, (void __force __iomem *)vaddr); + else + dma_buf_map_set_vaddr(map, vaddr); +} + /** * ttm_bo_kmap * diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index fd1aba545fdf..2e8bbecb5091 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -45,6 +45,12 @@ * *dma_buf_map_set_vaddr(&map. 0xdeadbeaf); * + * To set an address in I/O memory, use dma_buf_map_set_vaddr_iomem(). + * + * .. code-block:: c + * + * dma_buf_map_set_vaddr_iomem(&map. 0xdeadbeaf); + * * Test if a mapping is valid with either dma_buf_map_is_set() or * dma_buf_map_is_null(). * @@ -118,6 +124,20 @@ static inline void dma_buf_map_set_vaddr(struct dma_buf_map *map, void *vaddr) map->is_iomem = false; } +/** + * dma_buf_map_set_vaddr_iomem - Sets a dma-buf mapping structure to an address in I/O memory + * @map: The dma-buf mapping structure + * @vaddr_iomem: An I/O-memory address + * + * Sets the address and the I/O-memory flag. + */ +static inline void dma_buf_map_set_vaddr_iomem(struct dma_buf_map *map, + void __iomem *vaddr_iomem) +{ + map->vaddr_iomem = vaddr_iomem; + map->is_iomem = true; +} + /** * dma_buf_map_is_equal - Compares two dma-buf mapping structures for equality * @lhs: The dma-buf mapping structure
Re: [PATCH v3 00/22] Convert all remaining drivers to GEM object functions
Feel free to add an Acked-by: Christian König to all patches which I haven't explicitly reviewed. I would say we should just push this to drm-misc-next now. Thanks for the nice cleanup, Christian. Am 23.09.20 um 12:21 schrieb Thomas Zimmermann: The GEM and PRIME related callbacks in struct drm_driver are deprecated in favor of GEM object functions in struct drm_gem_object_funcs. This patchset converts the remaining drivers to object functions and removes most of the obsolete interfaces. Version 3 of this patchset mostly fixes drm_gem_prime_handle_to_fd and updates i.MX's dcss driver. The driver was missing from earlier versions and still needs review. Patches #1 to #6, #8 to #17 and #19 to #20 convert DRM drivers to GEM object functions, one by one. Each patch moves existing callbacks from struct drm_driver to an instance of struct drm_gem_object_funcs, and sets these funcs when the GEM object is initialized. The expection is .gem_prime_mmap. There are different ways of how drivers implement the callback, and moving it to GEM object functions requires a closer review for each. Patch #18 fixes virtgpu to use GEM object functions where possible. The driver recently introduced a function for one of the deprecated callbacks. Patches #7 and #20 convert i.MX's dcss and xlnx to CMA helper macros. There's no apparent reason why the drivers do the GEM setup on their's own. Using CMA helper macros adds GEM object functions implicitly. With most of the GEM and PRIME moved to GEM object functions, related code in struct drm_driver and in the DRM core/helpers is being removed by patch #22. Further testing is welcome. I tested the drivers for which I have HW available. These are gma500, i915, nouveau, radeon and vc4. The console, Weston and Xorg apparently work with the patches applied. v3: * restore default call to drm_gem_prime_export() in drm_gem_prime_handle_to_fd() * return -ENOSYS if get_sg_table is not set * drop all checks for obj->funcs * clean up TODO list and documentation v2: * moved code in amdgpu and radeon * made several functions static in various drivers * updated TODO-list item * fix virtgpu Thomas Zimmermann (22): drm/amdgpu: Introduce GEM object functions drm/armada: Introduce GEM object functions drm/etnaviv: Introduce GEM object functions drm/exynos: Introduce GEM object functions drm/gma500: Introduce GEM object functions drm/i915: Introduce GEM object functions drm/imx/dcss: Initialize DRM driver instance with CMA helper macro drm/mediatek: Introduce GEM object functions drm/msm: Introduce GEM object funcs drm/nouveau: Introduce GEM object functions drm/omapdrm: Introduce GEM object functions drm/pl111: Introduce GEM object functions drm/radeon: Introduce GEM object functions drm/rockchip: Convert to drm_gem_object_funcs drm/tegra: Introduce GEM object functions drm/vc4: Introduce GEM object functions drm/vgem: Introduce GEM object functions drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs drm/vkms: Introduce GEM object functions drm/xen: Introduce GEM object functions drm/xlnx: Initialize DRM driver instance with CMA helper macro drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver Documentation/gpu/drm-mm.rst | 4 +- Documentation/gpu/todo.rst| 9 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 5 -- drivers/gpu/drm/armada/armada_drv.c | 3 - drivers/gpu/drm/armada/armada_gem.c | 12 ++- drivers/gpu/drm/armada/armada_gem.h | 2 - drivers/gpu/drm/drm_gem.c | 53 drivers/gpu/drm/drm_gem_cma_helper.c | 8 +- drivers/gpu/drm/drm_prime.c | 14 +-- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 --- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 - drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 - drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 15 drivers/gpu/drm/gma500/framebuffer.c | 2 + drivers/gpu/drm/gma500/gem.c | 18 +++- drivers/gpu/drm/gma500/gem.h | 3 + drivers/gpu/drm/gma500/psb_drv.c | 9 -- drivers/gpu/drm/gma500/psb_drv.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 - drivers/gpu/drm/i915/gem/i915_gem_object.h| 3 - drivers/gpu/drm/i915/i915_drv.c | 4 - .../gpu/drm/i915/selftests/mock_gem_device.c | 3 - drivers/gpu/drm/imx/dcss/dcss-kms.c | 14 +-- drivers/gpu/drm/mediatek/mtk_drm_drv.c| 5 -- drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++ drivers/gpu/drm/msm/msm_drv.c
Re: [PATCH v2 00/21] Convert all remaining drivers to GEM object functions
Added my rb to the amdgpu and radeon patches. Should we pick those up through the amd branches or do you want to push everything to drm-misc-next? I think the later since this should result in much merge clash. Christian. Am 15.09.20 um 16:59 schrieb Thomas Zimmermann: The GEM and PRIME related callbacks in struct drm_driver are deprecated in favor of GEM object functions in struct drm_gem_object_funcs. This patchset converts the remaining drivers to object functions and removes most of the obsolete interfaces. Patches #1 to #16 and #18 to #19 convert DRM drivers to GEM object functions, one by one. Each patch moves existing callbacks from struct drm_driver to an instance of struct drm_gem_object_funcs, and sets these funcs when the GEM object is initialized. The expection is .gem_prime_mmap. There are different ways of how drivers implement the callback, and moving it to GEM object functions requires a closer review for each. Patch #17 fixes virtgpu to use GEM object functions where possible. The driver recently introduced a function for one of the deprecated callbacks. Patch #20 converts xlnx to CMA helper macros. There's no apparent reason why the driver does the GEM setup on it's own. Using CMA helper macros adds GEM object functions implicitly. With most of the GEM and PRIME moved to GEM object functions, related code in struct drm_driver and in the DRM core/helpers is being removed by patch #21. Further testing is welcome. I tested the drivers for which I have HW available. These are gma500, i915, nouveau, radeon and vc4. The console, Weston and Xorg apparently work with the patches applied. v2: * moved code in amdgpu and radeon * made several functions static in various drivers * updated TODO-list item * fix virtgpu Thomas Zimmermann (21): drm/amdgpu: Introduce GEM object functions drm/armada: Introduce GEM object functions drm/etnaviv: Introduce GEM object functions drm/exynos: Introduce GEM object functions drm/gma500: Introduce GEM object functions drm/i915: Introduce GEM object functions drm/mediatek: Introduce GEM object functions drm/msm: Introduce GEM object funcs drm/nouveau: Introduce GEM object functions drm/omapdrm: Introduce GEM object functions drm/pl111: Introduce GEM object functions drm/radeon: Introduce GEM object functions drm/rockchip: Convert to drm_gem_object_funcs drm/tegra: Introduce GEM object functions drm/vc4: Introduce GEM object functions drm/vgem: Introduce GEM object functions drm/virtgpu: Set PRIME export function in struct drm_gem_object_funcs drm/vkms: Introduce GEM object functions drm/xen: Introduce GEM object functions drm/xlnx: Initialize DRM driver instance with CMA helper macro drm: Remove obsolete GEM and PRIME callbacks from struct drm_driver Documentation/gpu/todo.rst| 7 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.h | 5 -- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 1 + drivers/gpu/drm/armada/armada_drv.c | 3 - drivers/gpu/drm/armada/armada_gem.c | 12 ++- drivers/gpu/drm/armada/armada_gem.h | 2 - drivers/gpu/drm/drm_gem.c | 35 ++-- drivers/gpu/drm/drm_gem_cma_helper.c | 6 +- drivers/gpu/drm/drm_prime.c | 17 ++-- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 13 --- drivers/gpu/drm/etnaviv/etnaviv_drv.h | 1 - drivers/gpu/drm/etnaviv/etnaviv_gem.c | 19 - drivers/gpu/drm/exynos/exynos_drm_drv.c | 10 --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 15 drivers/gpu/drm/gma500/framebuffer.c | 2 + drivers/gpu/drm/gma500/gem.c | 18 +++- drivers/gpu/drm/gma500/gem.h | 3 + drivers/gpu/drm/gma500/psb_drv.c | 9 -- drivers/gpu/drm/gma500/psb_drv.h | 2 - drivers/gpu/drm/i915/gem/i915_gem_object.c| 21 - drivers/gpu/drm/i915/gem/i915_gem_object.h| 3 - drivers/gpu/drm/i915/i915_drv.c | 4 - .../gpu/drm/i915/selftests/mock_gem_device.c | 3 - drivers/gpu/drm/mediatek/mtk_drm_drv.c| 5 -- drivers/gpu/drm/mediatek/mtk_drm_gem.c| 11 +++ drivers/gpu/drm/msm/msm_drv.c | 13 --- drivers/gpu/drm/msm/msm_drv.h | 1 - drivers/gpu/drm/msm/msm_gem.c | 19 - drivers/gpu/drm/nouveau/nouveau_drm.c | 9 -- drivers/gpu/drm/nouveau/nouveau_gem.c | 13 +++ drivers/gpu/drm/nouveau/nouveau_gem.h | 2 + drivers/gpu/drm/nouveau/nouveau_prime.c | 2 + drivers/gpu/drm/omapdrm/omap_drv.c| 9 -- drivers/gpu/drm/omapdrm/omap_gem.c| 18 +++- drivers/gpu/drm/omapdrm/omap_gem.h| 2 - drivers/gpu/drm/pl111/pl111_drv.c | 5 +- d
Re: [PATCH v2 12/21] drm/radeon: Introduce GEM object functions
Am 15.09.20 um 16:59 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in radeon. v2: * move object-function instance to radeon_gem.c (Christian) * set callbacks in radeon_gem_object_create() (Christian) Signed-off-by: Thomas Zimmermann Reviewed-by: Christian König --- drivers/gpu/drm/radeon/radeon_drv.c | 23 + drivers/gpu/drm/radeon/radeon_gem.c | 31 + 2 files changed, 28 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4cd30613fa1d..65061c949aee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev); int radeon_driver_irq_postinstall_kms(struct drm_device *dev); void radeon_driver_irq_uninstall_kms(struct drm_device *dev); irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg); -void radeon_gem_object_free(struct drm_gem_object *obj); -int radeon_gem_object_open(struct drm_gem_object *obj, - struct drm_file *file_priv); -void radeon_gem_object_close(struct drm_gem_object *obj, - struct drm_file *file_priv); -struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, - int flags); extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc, unsigned int flags, int *vpos, int *hpos, ktime_t *stime, ktime_t *etime, @@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, int radeon_mode_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); -struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *, struct sg_table *sg); -int radeon_gem_prime_pin(struct drm_gem_object *obj); -void radeon_gem_prime_unpin(struct drm_gem_object *obj); -void *radeon_gem_prime_vmap(struct drm_gem_object *obj); -void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); /* atpx handler */ #if defined(CONFIG_VGA_SWITCHEROO) @@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp, } ret = drm_ioctl(filp, cmd, arg); - + pm_runtime_mark_last_busy(dev->dev); pm_runtime_put_autosuspend(dev->dev); return ret; @@ -609,22 +597,13 @@ static struct drm_driver kms_driver = { .irq_uninstall = radeon_driver_irq_uninstall_kms, .irq_handler = radeon_driver_irq_handler_kms, .ioctls = radeon_ioctls_kms, - .gem_free_object_unlocked = radeon_gem_object_free, - .gem_open_object = radeon_gem_object_open, - .gem_close_object = radeon_gem_object_close, .dumb_create = radeon_mode_dumb_create, .dumb_map_offset = radeon_mode_dumb_mmap, .fops = &radeon_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_pin = radeon_gem_prime_pin, - .gem_prime_unpin = radeon_gem_prime_unpin, - .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table, .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table, - .gem_prime_vmap = radeon_gem_prime_vmap, - .gem_prime_vunmap = radeon_gem_prime_vunmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index e5c4271e64ed..0ccd7213e41f 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -35,7 +35,17 @@ #include "radeon.h" -void radeon_gem_object_free(struct drm_gem_object *gobj) +struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, + int flags); +struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); +int radeon_gem_prime_pin(struct drm_gem_object *obj); +void radeon_gem_prime_unpin(struct drm_gem_object *obj); +void *radeon_gem_prime_vmap(struct drm_gem_object *obj); +void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); + +static const struct drm_gem_object_funcs radeon_gem_object_funcs; + +static void radeon_gem_object_free(struct drm_gem_object *gobj) { struct radeon_bo *robj = gem_to_radeon_bo(gobj); @@ -85,6 +95,7 @@ int radeon_gem_object_create(struct radeon_device
Re: [PATCH v2 01/21] drm/amdgpu: Introduce GEM object functions
g amdgpu_gem_timeout(uint64_t timeout_ns); /* diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index ac043baac05d..c4e82a8fa53f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -561,6 +561,7 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL); if (bo == NULL) return -ENOMEM; + The newline is not unrelated. Apart from that the patch is Reviewed-by: Christian König . But I think we need some smoke testing of it. Christian. drm_gem_private_object_init(adev_to_drm(adev), &bo->tbo.base, size); INIT_LIST_HEAD(&bo->shadow_list); bo->vm_bo = NULL;
Re: [PATCH 01/20] drm/amdgpu: Introduce GEM object functions
Am 14.09.20 um 17:05 schrieb Thomas Zimmermann: Hi Am 13.08.20 um 12:22 schrieb Christian König: Am 13.08.20 um 10:36 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in amdgpu. The only exception is gem_prime_mmap, which is non-trivial to convert. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 81a79760ca61..51525b8774c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = { .lastclose = amdgpu_driver_lastclose_kms, .irq_handler = amdgpu_irq_handler, .ioctls = amdgpu_ioctls_kms, - .gem_free_object_unlocked = amdgpu_gem_object_free, - .gem_open_object = amdgpu_gem_object_open, - .gem_close_object = amdgpu_gem_object_close, .dumb_create = amdgpu_mode_dumb_create, .dumb_map_offset = amdgpu_mode_dumb_mmap, .fops = &amdgpu_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = amdgpu_gem_prime_export, .gem_prime_import = amdgpu_gem_prime_import, - .gem_prime_vmap = amdgpu_gem_prime_vmap, - .gem_prime_vunmap = amdgpu_gem_prime_vunmap, .gem_prime_mmap = amdgpu_gem_prime_mmap, .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 43f4966331dd..ca2b79f94e99 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -36,6 +36,7 @@ #include #include #include "amdgpu.h" +#include "amdgpu_dma_buf.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" @@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags) #endif } +static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = { + .free = amdgpu_gem_object_free, + .open = amdgpu_gem_object_open, + .close = amdgpu_gem_object_close, + .export = amdgpu_gem_prime_export, + .vmap = amdgpu_gem_prime_vmap, + .vunmap = amdgpu_gem_prime_vunmap, +}; + Wrong file, this belongs into amdgpu_gem.c static int amdgpu_bo_do_create(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo **bo_ptr) @@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL); if (bo == NULL) return -ENOMEM; + + bo->tbo.base.funcs = &amdgpu_gem_object_funcs; And this should probably go into amdgpu_gem_object_create(). I'm trying to understand what amdgpu does. What about all the places where amdgpu calls amdgpu_bo_create() internally? Wouldn't these miss the free callback for the GEM object? Those shouldn't have a GEM object in the first place. Or otherwise we would have a reference counting issue. Regards, Christian. Best regards Thomas Apart from that looks like a good idea to me. Christian. drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size); INIT_LIST_HEAD(&bo->shadow_list); bo->vm_bo = NULL; ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 1/2] drm: allow limiting the scatter list size.
Am 18.08.20 um 10:27 schrieb Gerd Hoffmann: On Tue, Aug 18, 2020 at 09:57:59AM +0200, Christian König wrote: Am 18.08.20 um 09:48 schrieb Gerd Hoffmann: Add max_segment argument to drm_prime_pages_to_sg(). When set pass it through to the __sg_alloc_table_from_pages() call, otherwise use SCATTERLIST_MAX_SEGMENT. Also add max_segment field to gem objects and pass it to drm_prime_pages_to_sg() calls in drivers and helpers. Signed-off-by: Gerd Hoffmann I'm missing an explanation why this should be useful (it certainly is). virtio-gpu needs this to work properly with SEV (see patch 2/2 of this series). Yeah, that's the problem patch 2/2 never showed up here :) And the maximum segment size seems misplaced in the GEM object. This is usually a property of the device or even completely constant. Placing it in drm_device instead would indeed work for virtio-gpu, so I guess you are suggesting that instead? That is probably the best approach, yes. For Intel and AMD it could even be global/constant, but it certainly doesn't needs to be kept around for each buffer. Christian. take care, Gerd
Re: [PATCH 1/2] drm: allow limiting the scatter list size.
Am 18.08.20 um 09:48 schrieb Gerd Hoffmann: Add max_segment argument to drm_prime_pages_to_sg(). When set pass it through to the __sg_alloc_table_from_pages() call, otherwise use SCATTERLIST_MAX_SEGMENT. Also add max_segment field to gem objects and pass it to drm_prime_pages_to_sg() calls in drivers and helpers. Signed-off-by: Gerd Hoffmann I'm missing an explanation why this should be useful (it certainly is). And the maximum segment size seems misplaced in the GEM object. This is usually a property of the device or even completely constant. Christian. --- include/drm/drm_gem.h | 8 include/drm/drm_prime.h | 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 3 ++- drivers/gpu/drm/drm_gem_shmem_helper.c | 3 ++- drivers/gpu/drm/drm_prime.c | 10 +++--- drivers/gpu/drm/etnaviv/etnaviv_gem.c | 3 ++- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 3 ++- drivers/gpu/drm/msm/msm_gem.c | 3 ++- drivers/gpu/drm/msm/msm_gem_prime.c | 3 ++- drivers/gpu/drm/nouveau/nouveau_prime.c | 3 ++- drivers/gpu/drm/radeon/radeon_prime.c | 3 ++- drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 6 -- drivers/gpu/drm/tegra/gem.c | 3 ++- drivers/gpu/drm/vgem/vgem_drv.c | 3 ++- drivers/gpu/drm/xen/xen_drm_front_gem.c | 3 ++- 15 files changed, 43 insertions(+), 17 deletions(-) diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 337a48321705..dea5e92e745b 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -241,6 +241,14 @@ struct drm_gem_object { */ size_t size; + /** +* @max_segment: +* +* Max size for scatter list segments. When unset the default +* (SCATTERLIST_MAX_SEGMENT) is used. +*/ + size_t max_segment; + /** * @name: * diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h index 9af7422b44cf..2c3689435cb4 100644 --- a/include/drm/drm_prime.h +++ b/include/drm/drm_prime.h @@ -88,7 +88,8 @@ void drm_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr); int drm_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma); int drm_gem_dmabuf_mmap(struct dma_buf *dma_buf, struct vm_area_struct *vma); -struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages); +struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages, + size_t max_segment); struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, int flags); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 519ce4427fce..5e8a9760b33f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -303,7 +303,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach, switch (bo->tbo.mem.mem_type) { case TTM_PL_TT: sgt = drm_prime_pages_to_sg(bo->tbo.ttm->pages, - bo->tbo.num_pages); + bo->tbo.num_pages, + obj->max_segment); if (IS_ERR(sgt)) return sgt; diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index 4b7cfbac4daa..cfb979d808fd 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -656,7 +656,8 @@ struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_object *obj) WARN_ON(shmem->base.import_attach); - return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT); + return drm_prime_pages_to_sg(shmem->pages, obj->size >> PAGE_SHIFT, +obj->max_segment); } EXPORT_SYMBOL_GPL(drm_gem_shmem_get_sg_table); diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 1693aa7c14b5..27c783fd6633 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -802,7 +802,8 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops = { * * This is useful for implementing &drm_gem_object_funcs.get_sg_table. */ -struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages) +struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_pages, + size_t max_segment) { struct sg_table *sg = NULL; int ret; @@ -813,8 +814,11 @@ struct sg_table *drm_prime_pages_to_sg(struct page **pages, unsigned int nr_page goto out; } - ret = sg_alloc_table_from_pages(sg, pages, nr_pages, 0, - nr_pages << PAGE_SHIFT, GFP_KERNEL); + if (max_segment
Re: [PATCH 12/20] drm/radeon: Introduce GEM object functions
Am 13.08.20 um 12:41 schrieb Thomas Zimmermann: Hi Am 13.08.20 um 12:24 schrieb Christian König: Am 13.08.20 um 10:36 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in radeon. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/radeon/radeon_drv.c | 23 +-- drivers/gpu/drm/radeon/radeon_object.c | 26 ++ 2 files changed, 27 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4cd30613fa1d..65061c949aee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev); int radeon_driver_irq_postinstall_kms(struct drm_device *dev); void radeon_driver_irq_uninstall_kms(struct drm_device *dev); irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg); -void radeon_gem_object_free(struct drm_gem_object *obj); -int radeon_gem_object_open(struct drm_gem_object *obj, - struct drm_file *file_priv); -void radeon_gem_object_close(struct drm_gem_object *obj, - struct drm_file *file_priv); -struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, - int flags); extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc, unsigned int flags, int *vpos, int *hpos, ktime_t *stime, ktime_t *etime, @@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, int radeon_mode_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); -struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *, struct sg_table *sg); -int radeon_gem_prime_pin(struct drm_gem_object *obj); -void radeon_gem_prime_unpin(struct drm_gem_object *obj); -void *radeon_gem_prime_vmap(struct drm_gem_object *obj); -void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); /* atpx handler */ #if defined(CONFIG_VGA_SWITCHEROO) @@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp, } ret = drm_ioctl(filp, cmd, arg); - + pm_runtime_mark_last_busy(dev->dev); pm_runtime_put_autosuspend(dev->dev); return ret; @@ -609,22 +597,13 @@ static struct drm_driver kms_driver = { .irq_uninstall = radeon_driver_irq_uninstall_kms, .irq_handler = radeon_driver_irq_handler_kms, .ioctls = radeon_ioctls_kms, - .gem_free_object_unlocked = radeon_gem_object_free, - .gem_open_object = radeon_gem_object_open, - .gem_close_object = radeon_gem_object_close, .dumb_create = radeon_mode_dumb_create, .dumb_map_offset = radeon_mode_dumb_mmap, .fops = &radeon_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_pin = radeon_gem_prime_pin, - .gem_prime_unpin = radeon_gem_prime_unpin, - .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table, .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table, - .gem_prime_vmap = radeon_gem_prime_vmap, - .gem_prime_vunmap = radeon_gem_prime_vunmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index bb7582afd803..882390e15dfe 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev); void radeon_ttm_fini(struct radeon_device *rdev); static void radeon_bo_clear_surface_reg(struct radeon_bo *bo); +void radeon_gem_object_free(struct drm_gem_object *obj); +int radeon_gem_object_open(struct drm_gem_object *obj, + struct drm_file *file_priv); +void radeon_gem_object_close(struct drm_gem_object *obj, + struct drm_file *file_priv); +struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, + int flags); +struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); +int radeon_gem_prime_pin(struct drm_gem_object *obj); +void radeon_gem_prime_unpin(struct drm_gem_object *obj); +void *radeon_gem_prime_vmap(struct drm_gem_object *obj); +void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); + /* * To exclude mutual BO access we rely on bo_reserve exclusion, as all * function are calling it. @@ -180,6 +193,18 @@ void radeon_ttm_placement_from_domain(struct rade
Re: [PATCH 12/20] drm/radeon: Introduce GEM object functions
Am 13.08.20 um 10:36 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in radeon. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/radeon/radeon_drv.c| 23 +-- drivers/gpu/drm/radeon/radeon_object.c | 26 ++ 2 files changed, 27 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c index 4cd30613fa1d..65061c949aee 100644 --- a/drivers/gpu/drm/radeon/radeon_drv.c +++ b/drivers/gpu/drm/radeon/radeon_drv.c @@ -124,13 +124,6 @@ void radeon_driver_irq_preinstall_kms(struct drm_device *dev); int radeon_driver_irq_postinstall_kms(struct drm_device *dev); void radeon_driver_irq_uninstall_kms(struct drm_device *dev); irqreturn_t radeon_driver_irq_handler_kms(int irq, void *arg); -void radeon_gem_object_free(struct drm_gem_object *obj); -int radeon_gem_object_open(struct drm_gem_object *obj, - struct drm_file *file_priv); -void radeon_gem_object_close(struct drm_gem_object *obj, - struct drm_file *file_priv); -struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, - int flags); extern int radeon_get_crtc_scanoutpos(struct drm_device *dev, unsigned int crtc, unsigned int flags, int *vpos, int *hpos, ktime_t *stime, ktime_t *etime, @@ -145,14 +138,9 @@ int radeon_mode_dumb_mmap(struct drm_file *filp, int radeon_mode_dumb_create(struct drm_file *file_priv, struct drm_device *dev, struct drm_mode_create_dumb *args); -struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev, struct dma_buf_attachment *, struct sg_table *sg); -int radeon_gem_prime_pin(struct drm_gem_object *obj); -void radeon_gem_prime_unpin(struct drm_gem_object *obj); -void *radeon_gem_prime_vmap(struct drm_gem_object *obj); -void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); /* atpx handler */ #if defined(CONFIG_VGA_SWITCHEROO) @@ -550,7 +538,7 @@ long radeon_drm_ioctl(struct file *filp, } ret = drm_ioctl(filp, cmd, arg); - + pm_runtime_mark_last_busy(dev->dev); pm_runtime_put_autosuspend(dev->dev); return ret; @@ -609,22 +597,13 @@ static struct drm_driver kms_driver = { .irq_uninstall = radeon_driver_irq_uninstall_kms, .irq_handler = radeon_driver_irq_handler_kms, .ioctls = radeon_ioctls_kms, - .gem_free_object_unlocked = radeon_gem_object_free, - .gem_open_object = radeon_gem_object_open, - .gem_close_object = radeon_gem_object_close, .dumb_create = radeon_mode_dumb_create, .dumb_map_offset = radeon_mode_dumb_mmap, .fops = &radeon_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = radeon_gem_prime_export, - .gem_prime_pin = radeon_gem_prime_pin, - .gem_prime_unpin = radeon_gem_prime_unpin, - .gem_prime_get_sg_table = radeon_gem_prime_get_sg_table, .gem_prime_import_sg_table = radeon_gem_prime_import_sg_table, - .gem_prime_vmap = radeon_gem_prime_vmap, - .gem_prime_vunmap = radeon_gem_prime_vunmap, .name = DRIVER_NAME, .desc = DRIVER_DESC, diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c index bb7582afd803..882390e15dfe 100644 --- a/drivers/gpu/drm/radeon/radeon_object.c +++ b/drivers/gpu/drm/radeon/radeon_object.c @@ -45,6 +45,19 @@ int radeon_ttm_init(struct radeon_device *rdev); void radeon_ttm_fini(struct radeon_device *rdev); static void radeon_bo_clear_surface_reg(struct radeon_bo *bo); +void radeon_gem_object_free(struct drm_gem_object *obj); +int radeon_gem_object_open(struct drm_gem_object *obj, + struct drm_file *file_priv); +void radeon_gem_object_close(struct drm_gem_object *obj, + struct drm_file *file_priv); +struct dma_buf *radeon_gem_prime_export(struct drm_gem_object *gobj, + int flags); +struct sg_table *radeon_gem_prime_get_sg_table(struct drm_gem_object *obj); +int radeon_gem_prime_pin(struct drm_gem_object *obj); +void radeon_gem_prime_unpin(struct drm_gem_object *obj); +void *radeon_gem_prime_vmap(struct drm_gem_object *obj); +void radeon_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr); + /* * To exclude mutual BO access we rely on bo_reserve exclusi
Re: [PATCH 01/20] drm/amdgpu: Introduce GEM object functions
Am 13.08.20 um 10:36 schrieb Thomas Zimmermann: GEM object functions deprecate several similar callback interfaces in struct drm_driver. This patch replaces the per-driver callbacks with per-instance callbacks in amdgpu. The only exception is gem_prime_mmap, which is non-trivial to convert. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 81a79760ca61..51525b8774c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -1468,19 +1468,13 @@ static struct drm_driver kms_driver = { .lastclose = amdgpu_driver_lastclose_kms, .irq_handler = amdgpu_irq_handler, .ioctls = amdgpu_ioctls_kms, - .gem_free_object_unlocked = amdgpu_gem_object_free, - .gem_open_object = amdgpu_gem_object_open, - .gem_close_object = amdgpu_gem_object_close, .dumb_create = amdgpu_mode_dumb_create, .dumb_map_offset = amdgpu_mode_dumb_mmap, .fops = &amdgpu_driver_kms_fops, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_export = amdgpu_gem_prime_export, .gem_prime_import = amdgpu_gem_prime_import, - .gem_prime_vmap = amdgpu_gem_prime_vmap, - .gem_prime_vunmap = amdgpu_gem_prime_vunmap, .gem_prime_mmap = amdgpu_gem_prime_mmap, .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 43f4966331dd..ca2b79f94e99 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -36,6 +36,7 @@ #include #include #include "amdgpu.h" +#include "amdgpu_dma_buf.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" @@ -510,6 +511,15 @@ bool amdgpu_bo_support_uswc(u64 bo_flags) #endif } +static const struct drm_gem_object_funcs amdgpu_gem_object_funcs = { + .free = amdgpu_gem_object_free, + .open = amdgpu_gem_object_open, + .close = amdgpu_gem_object_close, + .export = amdgpu_gem_prime_export, + .vmap = amdgpu_gem_prime_vmap, + .vunmap = amdgpu_gem_prime_vunmap, +}; + Wrong file, this belongs into amdgpu_gem.c static int amdgpu_bo_do_create(struct amdgpu_device *adev, struct amdgpu_bo_param *bp, struct amdgpu_bo **bo_ptr) @@ -552,6 +562,8 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev, bo = kzalloc(sizeof(struct amdgpu_bo), GFP_KERNEL); if (bo == NULL) return -ENOMEM; + + bo->tbo.base.funcs = &amdgpu_gem_object_funcs; And this should probably go into amdgpu_gem_object_create(). Apart from that looks like a good idea to me. Christian. drm_gem_private_object_init(adev->ddev, &bo->tbo.base, size); INIT_LIST_HEAD(&bo->shadow_list); bo->vm_bo = NULL;
Re: [Xen-devel] [PATCH v2 12/15] drm/amdgpu: Call find_vma under mmap_sem
Am 29.10.19 um 17:28 schrieb Kuehling, Felix: On 2019-10-28 4:10 p.m., Jason Gunthorpe wrote: From: Jason Gunthorpe find_vma() must be called under the mmap_sem, reorganize this code to do the vma check after entering the lock. Further, fix the unlocked use of struct task_struct's mm, instead use the mm from hmm_mirror which has an active mm_grab. Also the mm_grab must be converted to a mm_get before acquiring mmap_sem or calling find_vma(). Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers") Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads") Cc: Alex Deucher Cc: Christian König Cc: David (ChunMing) Zhou Cc: amd-...@lists.freedesktop.org Signed-off-by: Jason Gunthorpe One question inline to confirm my understanding. Otherwise this patch is Reviewed-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 37 ++--- 1 file changed, 21 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index dff41d0a85fe96..c0e41f1f0c2365 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -788,7 +789,7 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) struct hmm_mirror *mirror = bo->mn ? &bo->mn->mirror : NULL; struct ttm_tt *ttm = bo->tbo.ttm; struct amdgpu_ttm_tt *gtt = (void *)ttm; - struct mm_struct *mm = gtt->usertask->mm; + struct mm_struct *mm; unsigned long start = gtt->userptr; struct vm_area_struct *vma; struct hmm_range *range; @@ -796,25 +797,14 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) uint64_t *pfns; int r = 0; - if (!mm) /* Happens during process shutdown */ - return -ESRCH; - if (unlikely(!mirror)) { DRM_DEBUG_DRIVER("Failed to get hmm_mirror\n"); - r = -EFAULT; - goto out; + return -EFAULT; } - vma = find_vma(mm, start); - if (unlikely(!vma || start < vma->vm_start)) { - r = -EFAULT; - goto out; - } - if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) && - vma->vm_file)) { - r = -EPERM; - goto out; - } + mm = mirror->hmm->mmu_notifier.mm; + if (!mmget_not_zero(mm)) /* Happens during process shutdown */ This works because mirror->hmm->mmu_notifier holds an mmgrab reference to the mm? So the MM will not just go away, but if the mmget refcount is 0, it means the mm is marked for destruction and shouldn't be used any more. Yes, exactly. That is a rather common pattern, one reference count for the functionality and one for the structure. When the functionality is gone the structure might still be alive for some reason. TTM and a couple of other structures use the same approach. Christian. + return -ESRCH; range = kzalloc(sizeof(*range), GFP_KERNEL); if (unlikely(!range)) { @@ -847,6 +837,17 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT); down_read(&mm->mmap_sem); + vma = find_vma(mm, start); + if (unlikely(!vma || start < vma->vm_start)) { + r = -EFAULT; + goto out_unlock; + } + if (unlikely((gtt->userflags & AMDGPU_GEM_USERPTR_ANONONLY) && + vma->vm_file)) { + r = -EPERM; + goto out_unlock; + } + r = hmm_range_fault(range, 0); up_read(&mm->mmap_sem); @@ -865,15 +866,19 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages) } gtt->range = range; + mmput(mm); return 0; +out_unlock: + up_read(&mm->mmap_sem); out_free_pfns: hmm_range_unregister(range); kvfree(pfns); out_free_ranges: kfree(range); out: + mmput(mm); return r; } ___ amd-gfx mailing list amd-...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/amd-gfx ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] dma-buf: add struct dma_buf_attach_info v2
Am 30.04.19 um 19:31 schrieb Russell King - ARM Linux admin: On Tue, Apr 30, 2019 at 01:10:02PM +0200, Christian König wrote: Add a structure for the parameters of dma_buf_attach, this makes it much easier to add new parameters later on. I don't understand this reasoning. What are the "new parameters" that are being proposed, and why do we need to put them into memory to pass them across this interface? If the intention is to make it easier to change the interface, passing parameters in this manner mean that it's easy for the interface to change and drivers not to notice the changes, since the compiler will not warn (unless some member of the structure that the driver is using gets removed, in which case it will error.) Additions to the structure will go unnoticed by drivers - what if the caller is expecting some different kind of behaviour, and the driver ignores that new addition? Well, exactly that's the intention here: That the drivers using this interface should be able to ignore the new additions for now as long as they are not going to use them. The background is that we have multiple interface changes in the pipeline, and each step requires new optional parameters. This doesn't seem to me like a good idea. Well, the obvious alternatives are: a) Change all drivers to explicitly provide NULL/0 for the new parameters. b) Use a wrapper, so that the function signature of dma_buf_attach stays the same. Key point here is that I have an invalidation callback change, a P2P patch set and some locking changes which all require adding new parameters or flags. And at each step I would then start to change all drivers, adding some more NULL pointers or flags with 0 default value. I'm actually perfectly fine going down any route, but this just seemed to me simplest and with the least risk of breaking anything. Opinions? Thanks, Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH] dma-buf: add struct dma_buf_attach_info v2
Add a structure for the parameters of dma_buf_attach, this makes it much easier to add new parameters later on. v2: rebase cleanup and fix all new implementations as well Signed-off-by: Christian König --- drivers/dma-buf/dma-buf.c | 13 +++-- drivers/gpu/drm/armada/armada_gem.c | 6 +- drivers/gpu/drm/drm_prime.c | 6 +- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 6 +- drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 6 +- drivers/gpu/drm/tegra/gem.c | 6 +- drivers/gpu/drm/udl/udl_dmabuf.c| 6 +- .../common/videobuf2/videobuf2-dma-contig.c | 6 +- .../media/common/videobuf2/videobuf2-dma-sg.c | 6 +- drivers/misc/fastrpc.c | 6 +- drivers/staging/media/tegra-vde/tegra-vde.c | 6 +- drivers/xen/gntdev-dmabuf.c | 4 include/linux/dma-buf.h | 17 +++-- 13 files changed, 76 insertions(+), 18 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 3ae6c0c2cc02..e295e76a8c57 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -535,8 +535,9 @@ EXPORT_SYMBOL_GPL(dma_buf_put); /** * dma_buf_attach - Add the device to dma_buf's attachments list; optionally, * calls attach() of dma_buf_ops to allow device-specific attach functionality - * @dmabuf:[in]buffer to attach device to. - * @dev: [in]device to be attached. + * @info: [in]holds all the attach related information provided + * by the importer. see &struct dma_buf_attach_info + * for further details. * * Returns struct dma_buf_attachment pointer for this attachment. Attachments * must be cleaned up by calling dma_buf_detach(). @@ -550,20 +551,20 @@ EXPORT_SYMBOL_GPL(dma_buf_put); * accessible to @dev, and cannot be moved to a more suitable place. This is * indicated with the error code -EBUSY. */ -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, - struct device *dev) +struct dma_buf_attachment *dma_buf_attach(const struct dma_buf_attach_info *info) { + struct dma_buf *dmabuf = info->dmabuf; struct dma_buf_attachment *attach; int ret; - if (WARN_ON(!dmabuf || !dev)) + if (WARN_ON(!dmabuf || !info->dev)) return ERR_PTR(-EINVAL); attach = kzalloc(sizeof(*attach), GFP_KERNEL); if (!attach) return ERR_PTR(-ENOMEM); - attach->dev = dev; + attach->dev = info->dev; attach->dmabuf = dmabuf; mutex_lock(&dmabuf->lock); diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index 642d0e70d0f8..19c47821032f 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -501,6 +501,10 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj, struct drm_gem_object * armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) { + struct dma_buf_attach_info attach_info = { + .dev = dev->dev, + .dmabuf = buf + }; struct dma_buf_attachment *attach; struct armada_gem_object *dobj; @@ -516,7 +520,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) } } - attach = dma_buf_attach(buf, dev->dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index dc079efb3b0f..1dd70fc095ee 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -710,6 +710,10 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, struct dma_buf *dma_buf, struct device *attach_dev) { + struct dma_buf_attach_info attach_info = { + .dev = attach_dev, + .dmabuf = dma_buf + }; struct dma_buf_attachment *attach; struct sg_table *sgt; struct drm_gem_object *obj; @@ -730,7 +734,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, if (!dev->driver->gem_prime_import_sg_table) return ERR_PTR(-EINVAL); - attach = dma_buf_attach(dma_buf, attach_dev); + attach = dma_buf_attach(&attach_info); if (IS_ERR(attach)) return ERR_CAST(attach); diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c index 5a101a9462d8..978054157c64 100644 --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c @@ -277,6
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 26.08.2018 um 10:40 schrieb Tetsuo Handa: On 2018/08/24 22:52, Michal Hocko wrote: @@ -180,11 +180,15 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn) */ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) { - if (blockable) - mutex_lock(&amn->read_lock); - else if (!mutex_trylock(&amn->read_lock)) - return -EAGAIN; - + /* +* We can take sleepable lock even on !blockable mode because +* read_lock is only ever take from this path and the notifier +* lock never really sleeps. In fact the only reason why the +* later is sleepable is because the notifier itself might sleep +* in amdgpu_mn_invalidate_node but blockable mode is handled +* before calling into that path. +*/ + mutex_lock(&amn->read_lock); if (atomic_inc_return(&amn->recursion) == 1) down_read_non_owner(&amn->lock); mutex_unlock(&amn->read_lock); I'm not following. Why don't we need to do like below (given that nobody except amdgpu_mn_read_lock() holds ->read_lock) because e.g. drm_sched_fence_create() from drm_sched_job_init() from amdgpu_cs_submit() is doing GFP_KERNEL memory allocation with ->lock held for write? That's a bug which needs to be fixed separately. Allocating memory with GFP_KERNEL while holding a lock which is also taken in the reclaim code path is illegal not matter what you do. Patches to fix this are already on the appropriate mailing list and will be pushed upstream today. Regards, Christian. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c index e55508b..e1cb344 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c @@ -64,8 +64,6 @@ * @node: hash table node to find structure by adev and mn * @lock: rw semaphore protecting the notifier nodes * @objects: interval tree containing amdgpu_mn_nodes - * @read_lock: mutex for recursive locking of @lock - * @recursion: depth of recursion * * Data for each amdgpu device and process address space. */ @@ -85,8 +83,6 @@ struct amdgpu_mn { /* objects protected by lock */ struct rw_semaphore lock; struct rb_root_cached objects; - struct mutexread_lock; - atomic_trecursion; }; /** @@ -181,14 +177,9 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn) static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) { if (blockable) - mutex_lock(&amn->read_lock); - else if (!mutex_trylock(&amn->read_lock)) + down_read(&amn->lock); + else if (!down_read_trylock(&amn->lock)) return -EAGAIN; - - if (atomic_inc_return(&amn->recursion) == 1) - down_read_non_owner(&amn->lock); - mutex_unlock(&amn->read_lock); - return 0; } @@ -199,8 +190,7 @@ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) */ static void amdgpu_mn_read_unlock(struct amdgpu_mn *amn) { - if (atomic_dec_return(&amn->recursion) == 0) - up_read_non_owner(&amn->lock); + up_read(&amn->lock); } /** @@ -410,8 +400,6 @@ struct amdgpu_mn *amdgpu_mn_get(struct amdgpu_device *adev, amn->type = type; amn->mn.ops = &amdgpu_mn_ops[type]; amn->objects = RB_ROOT_CACHED; - mutex_init(&amn->read_lock); - atomic_set(&amn->recursion, 0); r = __mmu_notifier_register(&amn->mn, mm); if (r) ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 15:40 schrieb Michal Hocko: On Fri 24-08-18 15:28:33, Christian König wrote: Am 24.08.2018 um 15:24 schrieb Michal Hocko: On Fri 24-08-18 15:10:08, Christian König wrote: Am 24.08.2018 um 15:01 schrieb Michal Hocko: On Fri 24-08-18 14:52:26, Christian König wrote: Am 24.08.2018 um 14:33 schrieb Michal Hocko: [...] Thiking about it some more, I can imagine that a notifier callback which performs an allocation might trigger a memory reclaim and that in turn might trigger a notifier to be invoked and recurse. But notifier shouldn't really allocate memory. They are called from deep MM code paths and this would be extremely deadlock prone. Maybe Jerome can come up some more realistic scenario. If not then I would propose to simplify the locking here. We have lockdep to catch self deadlocks and it is always better to handle a specific issue rather than having a code without a clear indication how it can recurse. Well I agree that we should probably fix that, but I have some concerns to remove the existing workaround. See we added that to get rid of a real problem in a customer environment and I don't want to that to show up again. It would really help to know more about that case and fix it properly rather than workaround it like this. Anyway, let me think how to handle the non-blocking notifier invocation then. I was not able to come up with anything remotely sane yet. With avoiding allocating memory in the write lock path I don't see an issue any more with that. All what the write lock path does now is adding items to a linked lists, arrays etc Can we change it to non-sleepable lock then? No, the write side doesn't sleep any more, but the read side does. See amdgpu_mn_invalidate_node() and that is where you actually need to handle the non-blocking flag correctly. Ohh, right you are. We already handle that by bailing out before calling amdgpu_mn_invalidate_node in !blockable mode. Yeah, that is sufficient. It could be improved because we have something like 90% chance that amdgpu_mn_invalidate_node() actually doesn't need to do anything. But I can take care of that when the patch set has landed. So does this looks good to you? Yeah, that looks perfect to me. Reviewed-by: Christian König Thanks, Christian. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c index e55508b39496..48fa152231be 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c @@ -180,11 +180,15 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn) */ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) { - if (blockable) - mutex_lock(&amn->read_lock); - else if (!mutex_trylock(&amn->read_lock)) - return -EAGAIN; - + /* +* We can take sleepable lock even on !blockable mode because +* read_lock is only ever take from this path and the notifier +* lock never really sleeps. In fact the only reason why the +* later is sleepable is because the notifier itself might sleep +* in amdgpu_mn_invalidate_node but blockable mode is handled +* before calling into that path. +*/ + mutex_lock(&amn->read_lock); if (atomic_inc_return(&amn->recursion) == 1) down_read_non_owner(&amn->lock); mutex_unlock(&amn->read_lock); ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 15:24 schrieb Michal Hocko: On Fri 24-08-18 15:10:08, Christian König wrote: Am 24.08.2018 um 15:01 schrieb Michal Hocko: On Fri 24-08-18 14:52:26, Christian König wrote: Am 24.08.2018 um 14:33 schrieb Michal Hocko: [...] Thiking about it some more, I can imagine that a notifier callback which performs an allocation might trigger a memory reclaim and that in turn might trigger a notifier to be invoked and recurse. But notifier shouldn't really allocate memory. They are called from deep MM code paths and this would be extremely deadlock prone. Maybe Jerome can come up some more realistic scenario. If not then I would propose to simplify the locking here. We have lockdep to catch self deadlocks and it is always better to handle a specific issue rather than having a code without a clear indication how it can recurse. Well I agree that we should probably fix that, but I have some concerns to remove the existing workaround. See we added that to get rid of a real problem in a customer environment and I don't want to that to show up again. It would really help to know more about that case and fix it properly rather than workaround it like this. Anyway, let me think how to handle the non-blocking notifier invocation then. I was not able to come up with anything remotely sane yet. With avoiding allocating memory in the write lock path I don't see an issue any more with that. All what the write lock path does now is adding items to a linked lists, arrays etc Can we change it to non-sleepable lock then? No, the write side doesn't sleep any more, but the read side does. See amdgpu_mn_invalidate_node() and that is where you actually need to handle the non-blocking flag correctly. Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 15:01 schrieb Michal Hocko: On Fri 24-08-18 14:52:26, Christian König wrote: Am 24.08.2018 um 14:33 schrieb Michal Hocko: [...] Thiking about it some more, I can imagine that a notifier callback which performs an allocation might trigger a memory reclaim and that in turn might trigger a notifier to be invoked and recurse. But notifier shouldn't really allocate memory. They are called from deep MM code paths and this would be extremely deadlock prone. Maybe Jerome can come up some more realistic scenario. If not then I would propose to simplify the locking here. We have lockdep to catch self deadlocks and it is always better to handle a specific issue rather than having a code without a clear indication how it can recurse. Well I agree that we should probably fix that, but I have some concerns to remove the existing workaround. See we added that to get rid of a real problem in a customer environment and I don't want to that to show up again. It would really help to know more about that case and fix it properly rather than workaround it like this. Anyway, let me think how to handle the non-blocking notifier invocation then. I was not able to come up with anything remotely sane yet. With avoiding allocating memory in the write lock path I don't see an issue any more with that. All what the write lock path does now is adding items to a linked lists, arrays etc So there is no more blocking involved here and the read lock side should be able to grab the lock immediately. Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 14:33 schrieb Michal Hocko: On Fri 24-08-18 14:18:44, Christian König wrote: Am 24.08.2018 um 14:03 schrieb Michal Hocko: On Fri 24-08-18 13:57:52, Christian König wrote: Am 24.08.2018 um 13:52 schrieb Michal Hocko: On Fri 24-08-18 13:43:16, Christian König wrote: [...] That won't work like this there might be multiple invalidate_range_start()/invalidate_range_end() pairs open at the same time. E.g. the lock might be taken recursively and that is illegal for a rw_semaphore. I am not sure I follow. Are you saying that one invalidate_range might trigger another one from the same path? No, but what can happen is: invalidate_range_start(A,B); invalidate_range_start(C,D); ... invalidate_range_end(C,D); invalidate_range_end(A,B); Grabbing the read lock twice would be illegal in this case. I am sorry but I still do not follow. What is the context the two are called from? I don't have the slightest idea. Can you give me an example. I simply do not see it in the code, mostly because I am not familiar with it. I'm neither. We stumbled over that by pure observation and after discussing the problem with Jerome came up with this solution. No idea where exactly that case comes from, but I can confirm that it indeed happens. Thiking about it some more, I can imagine that a notifier callback which performs an allocation might trigger a memory reclaim and that in turn might trigger a notifier to be invoked and recurse. But notifier shouldn't really allocate memory. They are called from deep MM code paths and this would be extremely deadlock prone. Maybe Jerome can come up some more realistic scenario. If not then I would propose to simplify the locking here. We have lockdep to catch self deadlocks and it is always better to handle a specific issue rather than having a code without a clear indication how it can recurse. Well I agree that we should probably fix that, but I have some concerns to remove the existing workaround. See we added that to get rid of a real problem in a customer environment and I don't want to that to show up again. In the meantime I've send out a fix to avoid allocating memory while holding the mn_lock. Thanks for pointing that out, Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 14:03 schrieb Michal Hocko: On Fri 24-08-18 13:57:52, Christian König wrote: Am 24.08.2018 um 13:52 schrieb Michal Hocko: On Fri 24-08-18 13:43:16, Christian König wrote: [...] That won't work like this there might be multiple invalidate_range_start()/invalidate_range_end() pairs open at the same time. E.g. the lock might be taken recursively and that is illegal for a rw_semaphore. I am not sure I follow. Are you saying that one invalidate_range might trigger another one from the same path? No, but what can happen is: invalidate_range_start(A,B); invalidate_range_start(C,D); ... invalidate_range_end(C,D); invalidate_range_end(A,B); Grabbing the read lock twice would be illegal in this case. I am sorry but I still do not follow. What is the context the two are called from? I don't have the slightest idea. Can you give me an example. I simply do not see it in the code, mostly because I am not familiar with it. I'm neither. We stumbled over that by pure observation and after discussing the problem with Jerome came up with this solution. No idea where exactly that case comes from, but I can confirm that it indeed happens. Regards, Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 13:52 schrieb Michal Hocko: On Fri 24-08-18 13:43:16, Christian König wrote: Am 24.08.2018 um 13:32 schrieb Michal Hocko: On Fri 24-08-18 19:54:19, Tetsuo Handa wrote: Two more worries for this patch. --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c @@ -178,12 +178,18 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn) * * @amn: our notifier */ -static void amdgpu_mn_read_lock(struct amdgpu_mn *amn) +static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) { - mutex_lock(&amn->read_lock); + if (blockable) + mutex_lock(&amn->read_lock); + else if (!mutex_trylock(&amn->read_lock)) + return -EAGAIN; + if (atomic_inc_return(&amn->recursion) == 1) down_read_non_owner(&amn->lock); Why don't we need to use trylock here if blockable == false ? Want comment why it is safe to use blocking lock here. Hmm, I am pretty sure I have checked the code but it was quite confusing so I might have missed something. Double checking now, it seems that this read_lock is not used anywhere else and it is not _the_ lock we are interested about. It is the amn->lock (amdgpu_mn_lock) which matters as it is taken in exclusive mode for expensive operations. The write side of the lock is only taken in the command submission IOCTL. So you actually don't need to change anything here (even the proposed changes are overkill) since we can't tear down the struct_mm while an IOCTL is still using. I am not so sure. We are not in the mm destruction phase yet. This is mostly about the oom context which might fire right during the IOCTL. If any of the path which is holding the write lock blocks for unbound amount of time or even worse allocates a memory then we are screwed. So we need to back of when blockable = false. Oh, yeah good point. Haven't thought about that possibility. Is that correct Christian? If this is correct then we need to update the locking here. I am struggling to grasp the ref counting part. Why cannot all readers simply take the lock rather than rely on somebody else to take it? 1ed3d2567c800 didn't really help me to understand the locking scheme here so any help would be appreciated. That won't work like this there might be multiple invalidate_range_start()/invalidate_range_end() pairs open at the same time. E.g. the lock might be taken recursively and that is illegal for a rw_semaphore. I am not sure I follow. Are you saying that one invalidate_range might trigger another one from the same path? No, but what can happen is: invalidate_range_start(A,B); invalidate_range_start(C,D); ... invalidate_range_end(C,D); invalidate_range_end(A,B); Grabbing the read lock twice would be illegal in this case. Regards, Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 24.08.2018 um 13:32 schrieb Michal Hocko: On Fri 24-08-18 19:54:19, Tetsuo Handa wrote: Two more worries for this patch. --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c @@ -178,12 +178,18 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn) * * @amn: our notifier */ -static void amdgpu_mn_read_lock(struct amdgpu_mn *amn) +static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) { - mutex_lock(&amn->read_lock); + if (blockable) + mutex_lock(&amn->read_lock); + else if (!mutex_trylock(&amn->read_lock)) + return -EAGAIN; + if (atomic_inc_return(&amn->recursion) == 1) down_read_non_owner(&amn->lock); Why don't we need to use trylock here if blockable == false ? Want comment why it is safe to use blocking lock here. Hmm, I am pretty sure I have checked the code but it was quite confusing so I might have missed something. Double checking now, it seems that this read_lock is not used anywhere else and it is not _the_ lock we are interested about. It is the amn->lock (amdgpu_mn_lock) which matters as it is taken in exclusive mode for expensive operations. The write side of the lock is only taken in the command submission IOCTL. So you actually don't need to change anything here (even the proposed changes are overkill) since we can't tear down the struct_mm while an IOCTL is still using. Is that correct Christian? If this is correct then we need to update the locking here. I am struggling to grasp the ref counting part. Why cannot all readers simply take the lock rather than rely on somebody else to take it? 1ed3d2567c800 didn't really help me to understand the locking scheme here so any help would be appreciated. That won't work like this there might be multiple invalidate_range_start()/invalidate_range_end() pairs open at the same time. E.g. the lock might be taken recursively and that is illegal for a rw_semaphore. Regards, Christian. I am wondering why we cannot do diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c index e55508b39496..93034178673d 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c @@ -180,14 +180,11 @@ void amdgpu_mn_unlock(struct amdgpu_mn *mn) */ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) { - if (blockable) - mutex_lock(&amn->read_lock); - else if (!mutex_trylock(&amn->read_lock)) - return -EAGAIN; - - if (atomic_inc_return(&amn->recursion) == 1) - down_read_non_owner(&amn->lock); - mutex_unlock(&amn->read_lock); + if (!down_read_trylock(&amn->lock)) { + if (!blockable) + return -EAGAIN; + down_read(amn->lock); + } return 0; } @@ -199,8 +196,7 @@ static int amdgpu_mn_read_lock(struct amdgpu_mn *amn, bool blockable) */ static void amdgpu_mn_read_unlock(struct amdgpu_mn *amn) { - if (atomic_dec_return(&amn->recursion) == 0) - up_read_non_owner(&amn->lock); + up_read(&amn->lock); } /** ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 02.07.2018 um 14:35 schrieb Michal Hocko: On Mon 02-07-18 14:24:29, Christian König wrote: Am 02.07.2018 um 14:20 schrieb Michal Hocko: On Mon 02-07-18 14:13:42, Christian König wrote: Am 02.07.2018 um 13:54 schrieb Michal Hocko: On Mon 02-07-18 11:14:58, Christian König wrote: Am 27.06.2018 um 09:44 schrieb Michal Hocko: This is the v2 of RFC based on the feedback I've received so far. The code even compiles as a bonus ;) I haven't runtime tested it yet, mostly because I have no idea how. Any further feedback is highly appreciated of course. That sounds like it should work and at least the amdgpu changes now look good to me on first glance. Can you split that up further in the usual way? E.g. adding the blockable flag in one patch and fixing all implementations of the MMU notifier in follow up patches. But such a code would be broken, no? Ignoring the blockable state will simply lead to lockups until the fixup parts get applied. Well to still be bisect-able you only need to get the interface change in first with fixing the function signature of the implementations. That would only work if those functions return -AGAIN unconditionally. Otherwise they would pretend to not block while that would be obviously incorrect. This doesn't sound correct to me. Then add all the new code to the implementations and last start to actually use the new interface. That is a pattern we use regularly and I think it's good practice to do this. But we do rely on the proper blockable handling. Yeah, but you could add the handling only after you have all the implementations in place. Don't you? Yeah, but then I would be adding a code with no user. And I really prefer to no do so because then the code is harder to argue about. Is the split up really worth it? I was thinking about that but had hard times to end up with something that would be bisectable. Well, except for returning -EBUSY until all notifiers are implemented. Which I found confusing. It at least makes reviewing changes much easier, cause as driver maintainer I can concentrate on the stuff only related to me. Additional to that when you cause some unrelated side effect in a driver we can much easier pinpoint the actual change later on when the patch is smaller. This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd changes. If you are worried to give r-b only for those then this can be done even for larger patches. Just make your Reviewd-by more specific R-b: name # For BLA BLA Yeah, possible alternative but more work for me when I review it :) I definitely do not want to add more work to reviewers and I completely see how massive "flag days" like these are not popular but I really didn't find a reasonable way around that would be both correct and wouldn't add much more churn on the way. So if you really insist then I would really appreciate a hint on the way to achive the same without any above downsides. Well, I don't insist on this. It's just from my point of view that this patch doesn't needs to be one patch, but could be split up. Well, if there are more people with the same concern I can try to do that. But if your only concern is to focus on your particular part then I guess it would be easier both for you and me to simply apply the patch and use git show $files_for_your_subystem on your end. I have put the patch to attempts/oom-vs-mmu-notifiers branch to my tree at git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git Not wanting to block something as important as this, so feel free to add an Acked-by: Christian König to the patch. Let's rather face the next topic: Any idea how to runtime test this? I mean I can rather easily provide a test which crashes an AMD GPU, which in turn then would mean that the MMU notifier would block forever without this patch. But do you know a way to let the OOM killer kill a specific process? Regards, Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 02.07.2018 um 14:20 schrieb Michal Hocko: On Mon 02-07-18 14:13:42, Christian König wrote: Am 02.07.2018 um 13:54 schrieb Michal Hocko: On Mon 02-07-18 11:14:58, Christian König wrote: Am 27.06.2018 um 09:44 schrieb Michal Hocko: This is the v2 of RFC based on the feedback I've received so far. The code even compiles as a bonus ;) I haven't runtime tested it yet, mostly because I have no idea how. Any further feedback is highly appreciated of course. That sounds like it should work and at least the amdgpu changes now look good to me on first glance. Can you split that up further in the usual way? E.g. adding the blockable flag in one patch and fixing all implementations of the MMU notifier in follow up patches. But such a code would be broken, no? Ignoring the blockable state will simply lead to lockups until the fixup parts get applied. Well to still be bisect-able you only need to get the interface change in first with fixing the function signature of the implementations. That would only work if those functions return -AGAIN unconditionally. Otherwise they would pretend to not block while that would be obviously incorrect. This doesn't sound correct to me. Then add all the new code to the implementations and last start to actually use the new interface. That is a pattern we use regularly and I think it's good practice to do this. But we do rely on the proper blockable handling. Yeah, but you could add the handling only after you have all the implementations in place. Don't you? Is the split up really worth it? I was thinking about that but had hard times to end up with something that would be bisectable. Well, except for returning -EBUSY until all notifiers are implemented. Which I found confusing. It at least makes reviewing changes much easier, cause as driver maintainer I can concentrate on the stuff only related to me. Additional to that when you cause some unrelated side effect in a driver we can much easier pinpoint the actual change later on when the patch is smaller. This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd changes. If you are worried to give r-b only for those then this can be done even for larger patches. Just make your Reviewd-by more specific R-b: name # For BLA BLA Yeah, possible alternative but more work for me when I review it :) I definitely do not want to add more work to reviewers and I completely see how massive "flag days" like these are not popular but I really didn't find a reasonable way around that would be both correct and wouldn't add much more churn on the way. So if you really insist then I would really appreciate a hint on the way to achive the same without any above downsides. Well, I don't insist on this. It's just from my point of view that this patch doesn't needs to be one patch, but could be split up. Could be that I just don't know the code or the consequences of adding that well enough to really judge. Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 02.07.2018 um 13:54 schrieb Michal Hocko: On Mon 02-07-18 11:14:58, Christian König wrote: Am 27.06.2018 um 09:44 schrieb Michal Hocko: This is the v2 of RFC based on the feedback I've received so far. The code even compiles as a bonus ;) I haven't runtime tested it yet, mostly because I have no idea how. Any further feedback is highly appreciated of course. That sounds like it should work and at least the amdgpu changes now look good to me on first glance. Can you split that up further in the usual way? E.g. adding the blockable flag in one patch and fixing all implementations of the MMU notifier in follow up patches. But such a code would be broken, no? Ignoring the blockable state will simply lead to lockups until the fixup parts get applied. Well to still be bisect-able you only need to get the interface change in first with fixing the function signature of the implementations. Then add all the new code to the implementations and last start to actually use the new interface. That is a pattern we use regularly and I think it's good practice to do this. Is the split up really worth it? I was thinking about that but had hard times to end up with something that would be bisectable. Well, except for returning -EBUSY until all notifiers are implemented. Which I found confusing. It at least makes reviewing changes much easier, cause as driver maintainer I can concentrate on the stuff only related to me. Additional to that when you cause some unrelated side effect in a driver we can much easier pinpoint the actual change later on when the patch is smaller. This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd changes. If you are worried to give r-b only for those then this can be done even for larger patches. Just make your Reviewd-by more specific R-b: name # For BLA BLA Yeah, possible alternative but more work for me when I review it :) Regards, Christian. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Am 27.06.2018 um 09:44 schrieb Michal Hocko: This is the v2 of RFC based on the feedback I've received so far. The code even compiles as a bonus ;) I haven't runtime tested it yet, mostly because I have no idea how. Any further feedback is highly appreciated of course. That sounds like it should work and at least the amdgpu changes now look good to me on first glance. Can you split that up further in the usual way? E.g. adding the blockable flag in one patch and fixing all implementations of the MMU notifier in follow up patches. This way I'm pretty sure Felix and I can give an rb on the amdgpu/amdkfd changes. Thanks, Christian. --- From ec9a7241bf422b908532c4c33953b0da2655ad05 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Wed, 20 Jun 2018 15:03:20 +0200 Subject: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit There are several blockable mmu notifiers which might sleep in mmu_notifier_invalidate_range_start and that is a problem for the oom_reaper because it needs to guarantee a forward progress so it cannot depend on any sleepable locks. Currently we simply back off and mark an oom victim with blockable mmu notifiers as done after a short sleep. That can result in selecting a new oom victim prematurely because the previous one still hasn't torn its memory down yet. We can do much better though. Even if mmu notifiers use sleepable locks there is no reason to automatically assume those locks are held. Moreover majority of notifiers only care about a portion of the address space and there is absolutely zero reason to fail when we are unmapping an unrelated range. Many notifiers do really block and wait for HW which is harder to handle and we have to bail out though. This patch handles the low hanging fruid. __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks are not allowed to sleep if the flag is set to false. This is achieved by using trylock instead of the sleepable lock for most callbacks and continue as long as we do not block down the call chain. I think we can improve that even further because there is a common pattern to do a range lookup first and then do something about that. The first part can be done without a sleeping lock in most cases AFAICS. The oom_reaper end then simply retries if there is at least one notifier which couldn't make any progress in !blockable mode. A retry loop is already implemented to wait for the mmap_sem and this is basically the same thing. Changes since rfc v1 - gpu notifiers can sleep while waiting for HW (evict_process_queues_cpsch on a lock and amdgpu_mn_invalidate_node on unbound timeout) make sure we bail out when we have an intersecting range for starter - note that a notifier failed to the log for easier debugging - back off early in ib_umem_notifier_invalidate_range_start if the callback is called - mn_invl_range_start waits for completion down the unmap_grant_pages path so we have to back off early on overlapping ranges Cc: "David (ChunMing) Zhou" Cc: Paolo Bonzini Cc: "Radim Krčmář" Cc: Alex Deucher Cc: "Christian König" Cc: David Airlie Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Doug Ledford Cc: Jason Gunthorpe Cc: Mike Marciniszyn Cc: Dennis Dalessandro Cc: Sudeep Dutt Cc: Ashutosh Dixit Cc: Dimitri Sivanich Cc: Boris Ostrovsky Cc: Juergen Gross Cc: "Jérôme Glisse" Cc: Andrea Arcangeli Cc: Felix Kuehling Cc: k...@vger.kernel.org (open list:KERNEL VIRTUAL MACHINE FOR X86 (KVM/x86)) Cc: linux-ker...@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)) Cc: amd-...@lists.freedesktop.org (open list:RADEON and AMDGPU DRM DRIVERS) Cc: dri-de...@lists.freedesktop.org (open list:DRM DRIVERS) Cc: intel-...@lists.freedesktop.org (open list:INTEL DRM DRIVERS (excluding Poulsbo, Moorestow...) Cc: linux-r...@vger.kernel.org (open list:INFINIBAND SUBSYSTEM) Cc: xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR INTERFACE) Cc: linux...@kvack.org (open list:HMM - Heterogeneous Memory Management) Reported-by: David Rientjes Signed-off-by: Michal Hocko --- arch/x86/kvm/x86.c | 7 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 43 +++- drivers/gpu/drm/i915/i915_gem_userptr.c | 13 ++-- drivers/gpu/drm/radeon/radeon_mn.c | 22 +++-- drivers/infiniband/core/umem_odp.c | 33 +++ drivers/infiniband/hw/hfi1/mmu_rb.c | 11 --- drivers/infiniband/hw/mlx5/odp.c| 2 +- drivers/misc/mic/scif/scif_dma.c| 7 ++-- drivers/misc/sgi-gru/grutlbpurge.c | 7 ++-- drivers/xen/gntdev.c| 44 - include/linux/kvm_host.h| 4 +-- include/linux/mmu_notifier.h| 35 +++- include/linux/oom.h
Re: [Xen-devel] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Hi Michal, [Adding Felix as well] Well first of all you have a misconception why at least the AMD graphics driver need to be able to sleep in an MMU notifier: We need to sleep because we need to wait for hardware operations to finish and *NOT* because we need to wait for locks. I'm not sure if your flag now means that you generally can't sleep in MMU notifiers any more, but if that's the case at least AMD hardware will break badly. In our case the approach of waiting for a short time for the process to be reaped and then select another victim actually sounds like the right thing to do. What we also already try to do is to abort hardware operations with the address space when we detect that the process is dying, but that can certainly be improved. Regards, Christian. Am 22.06.2018 um 17:02 schrieb Michal Hocko: From: Michal Hocko There are several blockable mmu notifiers which might sleep in mmu_notifier_invalidate_range_start and that is a problem for the oom_reaper because it needs to guarantee a forward progress so it cannot depend on any sleepable locks. Currently we simply back off and mark an oom victim with blockable mmu notifiers as done after a short sleep. That can result in selecting a new oom victim prematurely because the previous one still hasn't torn its memory down yet. We can do much better though. Even if mmu notifiers use sleepable locks there is no reason to automatically assume those locks are held. Moreover most notifiers only care about a portion of the address space. This patch handles the first part of the problem. __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks are not allowed to sleep if the flag is set to false. This is achieved by using trylock instead of the sleepable lock for most callbacks. I think we can improve that even further because there is a common pattern to do a range lookup first and then do something about that. The first part can be done without a sleeping lock I presume. Anyway, what does the oom_reaper do with all that? We do not have to fail right away. We simply retry if there is at least one notifier which couldn't make any progress. A retry loop is already implemented to wait for the mmap_sem and this is basically the same thing. Cc: "David (ChunMing) Zhou" Cc: Paolo Bonzini Cc: "Radim Krčmář" Cc: Alex Deucher Cc: "Christian König" Cc: David Airlie Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Doug Ledford Cc: Jason Gunthorpe Cc: Mike Marciniszyn Cc: Dennis Dalessandro Cc: Sudeep Dutt Cc: Ashutosh Dixit Cc: Dimitri Sivanich Cc: Boris Ostrovsky Cc: Juergen Gross Cc: "Jérôme Glisse" Cc: Andrea Arcangeli Cc: k...@vger.kernel.org (open list:KERNEL VIRTUAL MACHINE FOR X86 (KVM/x86)) Cc: linux-ker...@vger.kernel.org (open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)) Cc: amd-...@lists.freedesktop.org (open list:RADEON and AMDGPU DRM DRIVERS) Cc: dri-de...@lists.freedesktop.org (open list:DRM DRIVERS) Cc: intel-...@lists.freedesktop.org (open list:INTEL DRM DRIVERS (excluding Poulsbo, Moorestow...) Cc: linux-r...@vger.kernel.org (open list:INFINIBAND SUBSYSTEM) Cc: xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR INTERFACE) Cc: linux...@kvack.org (open list:HMM - Heterogeneous Memory Management) Reported-by: David Rientjes Signed-off-by: Michal Hocko --- Hi, this is an RFC and not tested at all. I am not very familiar with the mmu notifiers semantics very much so this is a crude attempt to achieve what I need basically. It might be completely wrong but I would like to discuss what would be a better way if that is the case. get_maintainers gave me quite large list of people to CC so I had to trim it down. If you think I have forgot somebody, please let me know Any feedback is highly appreciated. arch/x86/kvm/x86.c | 7 -- drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c | 33 +++-- drivers/gpu/drm/i915/i915_gem_userptr.c | 10 +--- drivers/gpu/drm/radeon/radeon_mn.c | 15 --- drivers/infiniband/core/umem_odp.c | 15 --- drivers/infiniband/hw/hfi1/mmu_rb.c | 7 -- drivers/misc/mic/scif/scif_dma.c| 7 -- drivers/misc/sgi-gru/grutlbpurge.c | 7 -- drivers/xen/gntdev.c| 14 --- include/linux/kvm_host.h| 2 +- include/linux/mmu_notifier.h| 15 +-- mm/hmm.c| 7 -- mm/mmu_notifier.c | 15 --- mm/oom_kill.c | 29 +++--- virt/kvm/kvm_main.c | 12 ++--- 15 files changed, 137 insertions(+), 58 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6bcecc325e7e..ac08f5d711be 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7203,8 +7203,9 @@ static void vcpu_load_eoi_exitmap(
Re: [Xen-devel] [PATCH v3] xen/balloon: Mark unallocated host memory as UNUSABLE
Am 20.12.2017 um 15:05 schrieb Boris Ostrovsky: Commit f5775e0b6116 ("x86/xen: discard RAM regions above the maximum reservation") left host memory not assigned to dom0 as available for memory hotplug. Unfortunately this also meant that those regions could be used by others. Specifically, commit fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") may try to map those addresses as MMIO. To prevent this mark unallocated host memory as E820_TYPE_UNUSABLE (thus effectively reverting f5775e0b6116) and keep track of that region as a hostmem resource that can be used for the hotplug. Signed-off-by: Boris Ostrovsky Acked-by: Christian König --- Changes in v3: * Use PFN_PHYS * Replace kzalloc with kmalloc * Declare arch_xen_balloon_init prototype in balloon.h * Rename resources (s/memory/RAM/) * Clarify (I think) comment when populating hostmem_resource * Print open-ended interval on insert_resource() error * Constify declaration of struct e820_entry *entry arch/x86/xen/enlighten.c | 81 arch/x86/xen/setup.c | 6 ++-- drivers/xen/balloon.c| 65 -- include/xen/balloon.h| 5 +++ 4 files changed, 144 insertions(+), 13 deletions(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index d669e9d..c9081c6 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1,8 +1,12 @@ +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG +#include +#endif #include #include #include #include +#include #include #include @@ -331,3 +335,80 @@ void xen_arch_unregister_cpu(int num) } EXPORT_SYMBOL(xen_arch_unregister_cpu); #endif + +#ifdef CONFIG_XEN_BALLOON_MEMORY_HOTPLUG +void __init arch_xen_balloon_init(struct resource *hostmem_resource) +{ + struct xen_memory_map memmap; + int rc; + unsigned int i, last_guest_ram; + phys_addr_t max_addr = PFN_PHYS(max_pfn); + struct e820_table *xen_e820_table; + const struct e820_entry *entry; + struct resource *res; + + if (!xen_initial_domain()) + return; + + xen_e820_table = kmalloc(sizeof(*xen_e820_table), GFP_KERNEL); + if (!xen_e820_table) + return; + + memmap.nr_entries = ARRAY_SIZE(xen_e820_table->entries); + set_xen_guest_handle(memmap.buffer, xen_e820_table->entries); + rc = HYPERVISOR_memory_op(XENMEM_machine_memory_map, &memmap); + if (rc) { + pr_warn("%s: Can't read host e820 (%d)\n", __func__, rc); + goto out; + } + + last_guest_ram = 0; + for (i = 0; i < memmap.nr_entries; i++) { + if (xen_e820_table->entries[i].addr >= max_addr) + break; + if (xen_e820_table->entries[i].type == E820_TYPE_RAM) + last_guest_ram = i; + } + + entry = &xen_e820_table->entries[last_guest_ram]; + if (max_addr >= entry->addr + entry->size) + goto out; /* No unallocated host RAM. */ + + hostmem_resource->start = max_addr; + hostmem_resource->end = entry->addr + entry->size; + + /* +* Mark non-RAM regions between the end of dom0 RAM and end of host RAM +* as unavailable. The rest of that region can be used for hotplug-based +* ballooning. +*/ + for (; i < memmap.nr_entries; i++) { + entry = &xen_e820_table->entries[i]; + + if (entry->type == E820_TYPE_RAM) + continue; + + if (entry->addr >= hostmem_resource->end) + break; + + res = kzalloc(sizeof(*res), GFP_KERNEL); + if (!res) + goto out; + + res->name = "Unavailable host RAM"; + res->start = entry->addr; + res->end = (entry->addr + entry->size < hostmem_resource->end) ? + entry->addr + entry->size : hostmem_resource->end; + rc = insert_resource(hostmem_resource, res); + if (rc) { + pr_warn("%s: Can't insert [%llx - %llx) (%d)\n", + __func__, res->start, res->end, rc); + kfree(res); + goto out; + } + } + + out: + kfree(xen_e820_table); +} +#endif /* CONFIG_XEN_BALLOON_MEMORY_HOTPLUG */ diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index c114ca7..6e0d208 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -808,7 +808,6 @@ char * __init xen_memory_setup(void) addr = xen_e820_table.entries[0].addr; size = xen_e820_table.entries[0].size; while (i < xen_e820_table.nr_entries)
Re: [Xen-devel] [PATCH] x86/PCI: limit the size of the 64bit BAR to 256GB
Am 12.12.2017 um 19:12 schrieb Bjorn Helgaas: [+cc Boris, Juergen, xen-devel] On Mon, Dec 11, 2017 at 04:04:52PM +0100, Christian König wrote: Xen hides a bit of system memory from the OS for its own purpose by intercepting e820. This memory is unfortunately not reported as reserved, but rather completely invisible. Avoid this address space collision and possible similar problems by limiting the size of the newly allocated root hub window to 256GB which should be sufficient for the short term. It sounds like Boris is working on a more complete fix, so I'm going to drop this for now. This changelog includes a few more details, but I think it makes implicit assumptions about where memory and holes are and how big they are, and it's still not clear why 256GB is the right number. Is it connected to the expected size of the BAR, or related somehow to the size of the hole? 256GB was just an arbitrary number I've thought should work for at least my use case. And yes Boris is working on a more complete and cleaner fix. I agree that we can completely drop my patch for now. If we need this as a short-term workaround, we can do that, but I would like to include a reference to f5775e0b6116 ("x86/xen: discard RAM regions above the maximum reservation") and somehow make what's going on here a little more explicit. That patch looks more like it only applies to Xen guests, but not dom0. But take that with a grain of salt I really don't know anything about that code. Christian. Signed-off-by: Christian König --- arch/x86/pci/fixup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c index 8f86060f5cf6..ed8bc6ab0573 100644 --- a/arch/x86/pci/fixup.c +++ b/arch/x86/pci/fixup.c @@ -702,7 +702,7 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev) res->name = "PCI Bus :00"; res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM | IORESOURCE_MEM_64 | IORESOURCE_WINDOW; - res->start = 0x1ull; + res->start = 0xbdull; res->end = 0xfdull - 1; /* Just grab the free area behind system memory for this */ -- 2.11.0 ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
Am 28.11.2017 um 11:53 schrieb Jan Beulich: On 28.11.17 at 11:17, wrote: Am 28.11.2017 um 10:46 schrieb Jan Beulich: On 28.11.17 at 10:12, wrote: In theory the BIOS would search for address space and won't find anything, so the hotplug operation should fail even before it reaches the kernel in the first place. How would the BIOS know what the OS does or plans to do? As far as I know the ACPI BIOS should work directly with the register content. So when we update the register content to enable the MMIO decoding the BIOS should know that as well. I'm afraid I don't follow: During memory hotplug, surely you don't expect the BIOS to do a PCI bus scan? Plus even if it did, it would be racy - some device could, at this very moment, have memory decoding disabled, just for the OS to re-enable it a millisecond later. Yet looking at BAR values is meaningless when memory decode of a device is disabled. No, sorry you misunderstood me. The PCI bus is not even involved here. In AMD Family CPUs you have four main types of address space routed by the NB: 1. Memory space targeting system DRAM. 2. Memory space targeting IO (MMIO). 3. IO space. 4. Configuration space. See section "2.8.2 NB Routing" in the BIOS and Kernel Developer’s Guide (https://support.amd.com/TechDocs/49125_15h_Models_30h-3Fh_BKDG.pdf). Long story short you have fix addresses for configuration and legacy IO (VGA for example) and then configurable memory space for DRAM and MMIO. What the ACPI BIOS does (or at least should do) is taking a look at the registers to find space during memory hotplug. Now in theory the MMIO space should be configurable by similar ACPI BIOS functions, but unfortunately most BIOS doesn't enable that function because it can break some older Windows versions. So what we do here is just what the BIOS should have provided in the first place. I think it's the other way around - the OS needs to avoid using any regions for MMIO which are marked as hotpluggable in SRAT. I was under the impression that this is exactly what acpi_numa_memory_affinity_init() does. Perhaps, except that (when I last looked) insufficient state is (was) being recorded to have that information readily available at the time MMIO space above 4Gb needs to be allocated for some device. That was also my concern, but in the most recent version I'm intentionally doing this fixup very late after all the PCI setup is already done. This way the extra address space is only available for devices which are added by PCI hotplug or for resizing BARs on the fly (which is the use case I'm interested in). Since there is no vNUMA yet for Xen Dom0, that would need special handling. I think that the problem is rather that SRAT is NUMA specific and if I'm not totally mistaken the content is ignored when NUMA support isn't compiled into the kernel. When Xen steals some memory from Dom0 by hocking up itself into the e820 call then I would say the cleanest way is to report this memory in e820 as reserved as well. But take that with a grain of salt, I'm seriously not a Xen expert. The E820 handling in PV Linux is all fake anyway - there's a single chunk of memory given to a PV guest (including Dom0), contiguous in what PV guests know as "physical address space" (not to be mixed up with "machine address space", which is where MMIO needs to be allocated from). Xen code in the kernel then mimics an E820 matching the host one, moving around pieces of memory in physical address space if necessary. Good to know. Since Dom0 knows the machine E820, MMIO allocation shouldn't need to be much different there from the non-Xen case. Yes, completely agree. I think even if we don't do MMIO allocation with this fixup letting the kernel in Dom0 know which memory/address space regions are in use is still a good idea. Otherwise we will run into exactly the same problem when we do the MMIO allocation with an ACPI method and that is certainly going to come in the next BIOS generations because Microsoft is pushing for it. Regards, Christian. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
Am 27.11.2017 um 19:30 schrieb Boris Ostrovsky: On 11/23/2017 09:12 AM, Boris Ostrovsky wrote: On 11/23/2017 03:11 AM, Christian König wrote: Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky: On 11/22/2017 11:54 AM, Christian König wrote: Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky: On 11/22/2017 05:09 AM, Christian König wrote: Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky: On 11/21/2017 08:34 AM, Christian König wrote: Hi Boris, attached are two patches. The first one is a trivial fix for the infinite loop issue, it now correctly aborts the fixup when it can't find address space for the root window. The second is a workaround for your board. It simply checks if there is exactly one Processor Function to apply this fix on. Both are based on linus current master branch. Please test if they fix your issue. Yes, they do fix it but that's because the feature is disabled. Do you know what the actual problem was (on Xen)? I still haven't understood what you actually did with Xen. When you used PCI pass through with those devices then you have made a major configuration error. When the problem happened on dom0 then the explanation is most likely that some PCI device ended up in the configured space, but the routing was only setup correctly on one CPU socket. The problem is that dom0 can be (and was in my case() booted with less than full physical memory and so the "rest" of the host memory is not necessarily reflected in iomem. Your patch then tried to configure that memory for MMIO and the system hang. And so my guess is that this patch will break dom0 on a single-socket system as well. Oh, thanks! I've thought about that possibility before, but wasn't able to find a system which actually does that. May I ask why the rest of the memory isn't reported to the OS? That memory doesn't belong to the OS (dom0), it is owned by the hypervisor. Sounds like I can't trust Linux resource management and probably need to read the DRAM config to figure things out after all. My question is whether what you are trying to do should ever be done for a guest at all (any guest, not necessarily Xen). The issue is probably that I don't know enough about Xen: What exactly is dom0? My understanding was that dom0 is the hypervisor, but that seems to be incorrect. The issue is that under no circumstances *EVER* a virtualized guest should have access to the PCI devices marked as "Processor Function" on AMD platforms. Otherwise it is trivial to break out of the virtualization. When dom0 is something like the system domain with all hardware access then the approach seems legitimate, but then the hypervisor should report the stolen memory to the OS using the e820 table. When the hypervisor doesn't do that and the Linux kernel isn't aware that there is memory at a given location mapping PCI space there will obviously crash the hypervisor. Possible solutions as far as I can see are either disabling this feature when we detect that we are a Xen dom0, scanning the DRAM settings to update Linux resource handling or fixing Xen to report stolen memory to the dom0 OS as reserved. Opinions? You are right, these functions are not exposed to a regular guest. I think for dom0 (which is a special Xen guest, with additional privileges) we may be able to add a reserved e820 region for host memory that is not assigned to dom0. Let me try it on Monday (I am out until then). One thing I realized while looking at solution for Xen dom0 is that this patch may cause problems for memory hotplug. Good point. My assumption was that when you got an BIOS which can handle memory hotplug then you also got an BIOS which doesn't need this fixup. But I've never validated that assumption. What happens if new memory is added to the system and we have everything above current memory set to MMIO? In theory the BIOS would search for address space and won't find anything, so the hotplug operation should fail even before it reaches the kernel in the first place. In practice I think that nobody ever tested if that works correctly. So I'm pretty sure the system would just crash. How about the attached patch? It limits the newly added MMIO space to the upper 256GB of the address space. That should still be enough for most devices, but we avoid both issues with Xen dom0 as most likely problems with memory hotplug as well. Christian. -boris >From 586bd9d67ebb6ca48bd0a6b1bd9203e94093cc8e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Christian=20K=C3=B6nig?= Date: Tue, 28 Nov 2017 10:02:35 +0100 Subject: [PATCH] x86/PCI: limit the size of the 64bit BAR to 256GB MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This avoids problems with Xen which hides some memory resources from the OS and potentially also allows memory hotplug while this fixup is enabled. Signed-off-by: Christian Kö
Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
Am 28.11.2017 um 10:46 schrieb Jan Beulich: On 28.11.17 at 10:12, wrote: In theory the BIOS would search for address space and won't find anything, so the hotplug operation should fail even before it reaches the kernel in the first place. How would the BIOS know what the OS does or plans to do? As far as I know the ACPI BIOS should work directly with the register content. So when we update the register content to enable the MMIO decoding the BIOS should know that as well. I think it's the other way around - the OS needs to avoid using any regions for MMIO which are marked as hotpluggable in SRAT. I was under the impression that this is exactly what acpi_numa_memory_affinity_init() does. Since there is no vNUMA yet for Xen Dom0, that would need special handling. I think that the problem is rather that SRAT is NUMA specific and if I'm not totally mistaken the content is ignored when NUMA support isn't compiled into the kernel. When Xen steals some memory from Dom0 by hocking up itself into the e820 call then I would say the cleanest way is to report this memory in e820 as reserved as well. But take that with a grain of salt, I'm seriously not a Xen expert. Regards, Christian. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v9 4/5] x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 30h-3fh) Processors v5
Am 22.11.2017 um 18:27 schrieb Boris Ostrovsky: On 11/22/2017 11:54 AM, Christian König wrote: Am 22.11.2017 um 17:24 schrieb Boris Ostrovsky: On 11/22/2017 05:09 AM, Christian König wrote: Am 21.11.2017 um 23:26 schrieb Boris Ostrovsky: On 11/21/2017 08:34 AM, Christian König wrote: Hi Boris, attached are two patches. The first one is a trivial fix for the infinite loop issue, it now correctly aborts the fixup when it can't find address space for the root window. The second is a workaround for your board. It simply checks if there is exactly one Processor Function to apply this fix on. Both are based on linus current master branch. Please test if they fix your issue. Yes, they do fix it but that's because the feature is disabled. Do you know what the actual problem was (on Xen)? I still haven't understood what you actually did with Xen. When you used PCI pass through with those devices then you have made a major configuration error. When the problem happened on dom0 then the explanation is most likely that some PCI device ended up in the configured space, but the routing was only setup correctly on one CPU socket. The problem is that dom0 can be (and was in my case() booted with less than full physical memory and so the "rest" of the host memory is not necessarily reflected in iomem. Your patch then tried to configure that memory for MMIO and the system hang. And so my guess is that this patch will break dom0 on a single-socket system as well. Oh, thanks! I've thought about that possibility before, but wasn't able to find a system which actually does that. May I ask why the rest of the memory isn't reported to the OS? That memory doesn't belong to the OS (dom0), it is owned by the hypervisor. Sounds like I can't trust Linux resource management and probably need to read the DRAM config to figure things out after all. My question is whether what you are trying to do should ever be done for a guest at all (any guest, not necessarily Xen). The issue is probably that I don't know enough about Xen: What exactly is dom0? My understanding was that dom0 is the hypervisor, but that seems to be incorrect. The issue is that under no circumstances *EVER* a virtualized guest should have access to the PCI devices marked as "Processor Function" on AMD platforms. Otherwise it is trivial to break out of the virtualization. When dom0 is something like the system domain with all hardware access then the approach seems legitimate, but then the hypervisor should report the stolen memory to the OS using the e820 table. When the hypervisor doesn't do that and the Linux kernel isn't aware that there is memory at a given location mapping PCI space there will obviously crash the hypervisor. Possible solutions as far as I can see are either disabling this feature when we detect that we are a Xen dom0, scanning the DRAM settings to update Linux resource handling or fixing Xen to report stolen memory to the dom0 OS as reserved. Opinions? Thanks, Christian. -boris ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xen.org/xen-devel