Re: [PATCH 0/9] drm: Annotate structs with __counted_by
Am 02.10.23 um 20:08 schrieb Kees Cook: On Mon, Oct 02, 2023 at 08:01:57PM +0200, Christian König wrote: Am 02.10.23 um 18:53 schrieb Kees Cook: On Mon, Oct 02, 2023 at 11:06:19AM -0400, Alex Deucher wrote: On Mon, Oct 2, 2023 at 5:20 AM Christian König wrote: Am 29.09.23 um 21:33 schrieb Kees Cook: On Fri, 22 Sep 2023 10:32:05 -0700, Kees Cook wrote: This is a batch of patches touching drm for preparing for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by to structs that would benefit from the annotation. [...] Since this got Acks, I figure I should carry it in my tree. Let me know if this should go via drm instead. Applied to for-next/hardening, thanks! [1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by https://git.kernel.org/kees/c/a6046ac659d6 STOP! In a follow up discussion Alex and I figured out that this won't work. I'm so confused; from the discussion I saw that Alex said both instances were false positives? The value in the structure is byte swapped based on some firmware endianness which not necessary matches the CPU endianness. SMU10 is APU only so the endianess of the SMU firmware and the CPU will always match. Which I think is what is being said here? Please revert that one from going upstream if it's already on it's way. And because of those reasons I strongly think that patches like this should go through the DRM tree :) Sure, that's fine -- please let me know. It was others Acked/etc. Who should carry these patches? Probably best if the relevant maintainer pick them up individually. Some of those structures are filled in by firmware/hardware and only the maintainers can judge if that value actually matches what the compiler needs. We have cases where individual bits are used as flags or when the size is byte swapped etc... Even Alex and I didn't immediately say how and where that field is actually used and had to dig that up. That's where the confusion came from. Okay, I've dropped them all from my tree. Several had Acks/Reviews, so hopefully those can get picked up for the DRM tree? I will pick those up to go through drm-misc-next. Going to ping maintainers once more when I'm not sure if stuff is correct or not. Christian. Thanks! -Kees ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
Am 22.09.23 um 19:41 schrieb Alex Deucher: On Fri, Sep 22, 2023 at 1:32 PM Kees Cook wrote: Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct smu10_voltage_dependency_table. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Evan Quan Cc: Alex Deucher Cc: "Christian König" Cc: "Pan, Xinhui" Cc: David Airlie Cc: Daniel Vetter Cc: Xiaojian Du Cc: Huang Rui Cc: Kevin Wang Cc: amd-...@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook Acked-by: Alex Deucher Mhm, I'm not sure if this is a good idea. That is a structure filled in by the firmware, isn't it? That would imply that we might need to byte swap count before it is checkable. Regards, Christian. --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h index 808e0ecbe1f0..42adc2a3dcbc 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h @@ -192,7 +192,7 @@ struct smu10_clock_voltage_dependency_record { struct smu10_voltage_dependency_table { uint32_t count; - struct smu10_clock_voltage_dependency_record entries[]; + struct smu10_clock_voltage_dependency_record entries[] __counted_by(count); }; struct smu10_clock_voltage_information { -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v15 17/23] drm/shmem-helper: Add and use drm_gem_shmem_resv_assert_held() helper
Am 29.08.23 um 09:29 schrieb Boris Brezillon: On Tue, 29 Aug 2023 05:34:23 +0300 Dmitry Osipenko wrote: On 8/28/23 13:12, Boris Brezillon wrote: On Sun, 27 Aug 2023 20:54:43 +0300 Dmitry Osipenko wrote: In a preparation of adding drm-shmem memory shrinker, move all reservation locking lockdep checks to use new drm_gem_shmem_resv_assert_held() that will resolve spurious lockdep warning about wrong locking order vs fs_reclam code paths during freeing of shmem GEM, where lockdep isn't aware that it's impossible to have locking contention with the fs_reclam at this special time. Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/drm_gem_shmem_helper.c | 37 +- 1 file changed, 25 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index d96fee3d6166..ca5da976aafa 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -128,6 +128,23 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct drm_device *dev, size_t } EXPORT_SYMBOL_GPL(drm_gem_shmem_create); +static void drm_gem_shmem_resv_assert_held(struct drm_gem_shmem_object *shmem) +{ + /* +* Destroying the object is a special case.. drm_gem_shmem_free() +* calls many things that WARN_ON if the obj lock is not held. But +* acquiring the obj lock in drm_gem_shmem_free() can cause a locking +* order inversion between reservation_ww_class_mutex and fs_reclaim. +* +* This deadlock is not actually possible, because no one should +* be already holding the lock when drm_gem_shmem_free() is called. +* Unfortunately lockdep is not aware of this detail. So when the +* refcount drops to zero, we pretend it is already locked. +*/ + if (kref_read(>base.refcount)) + drm_gem_shmem_resv_assert_held(shmem); +} + /** * drm_gem_shmem_free - Free resources associated with a shmem GEM object * @shmem: shmem GEM object to free @@ -142,8 +159,6 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem) if (obj->import_attach) { drm_prime_gem_destroy(obj, shmem->sgt); } else if (!shmem->imported_sgt) { - dma_resv_lock(shmem->base.resv, NULL); - drm_WARN_ON(obj->dev, kref_read(>vmap_use_count)); if (shmem->sgt) { @@ -156,8 +171,6 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem) drm_gem_shmem_put_pages_locked(shmem); AFAICT, drm_gem_shmem_put_pages_locked() is the only function that's called in the free path and would complain about resv-lock not being held. I think I'd feel more comfortable if we were adding a drm_gem_shmem_free_pages() function that did everything drm_gem_shmem_put_pages_locked() does except for the lock_held() check and the refcount dec, and have it called here (and in drm_gem_shmem_put_pages_locked()). This way we can keep using dma_resv_assert_held() instead of having our own variant. It's not only drm_gem_shmem_free_pages(), but any drm-shmem function that drivers may use in the GEM's freeing callback. For example, panfrost_gem_free_object() may unpin shmem BO and then do drm_gem_shmem_free(). Is this really a valid use case? I haven't followed the whole discussion, but I think it isn't a valid use case. That page_use_count is none zero while the GEM object is about to be destroyed can only happen is someone managed to grab a reference to the page without referencing the GEM object. This is turn usually happens when somebody incorrectly walks the CPU page tables and grabs page references where it shouldn't. KMS used to do this and we had already had a discussion that they shouldn't do this. Regards, Christian. If the GEM refcount dropped to zero, we should certainly not have pages_pin_count > 0 (thinking of vmap-ed buffers that might disappear while kernel still has a pointer to the CPU-mapped area). The only reason we have this drm_gem_shmem_put_pages_locked() in drm_gem_shmem_free() is because of this implicit ref hold by the sgt, and IMHO, we should be stricter and check that pages_use_count == 1 when sgt != NULL and pages_use_count == 0 otherwise. I actually think it's a good thing to try and catch any attempt to call functions trying lock the resv in a path they're not supposed to. At least we can decide whether these actions are valid or not in this context, and provide dedicated helpers for the free path if they are. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v15 11/23] dma-resv: Add kref_put_dma_resv()
Am 27.08.23 um 19:54 schrieb Dmitry Osipenko: Add simple kref_put_dma_resv() helper that wraps around kref_put_ww_mutex() for drivers that needs to lock dma-resv on kref_put(). It's not possible to easily add this helper to kref.h because of the headers inclusion dependency, hence add it to dma-resv.h. I was never really a big fan of kref_put_mutex() in the first place. The main advantage comes from the included memory barrier, but this actually doesn't work like most people think it works and is usually pretty dangerous. And IIRC this was done because of the some special behavior mutexes have with memory barriers and that isn't necessary with ww-mutex. Christian. Signed-off-by: Dmitry Osipenko --- include/linux/dma-resv.h | 9 + 1 file changed, 9 insertions(+) diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h index 8d0e34dad446..c5cf302e4194 100644 --- a/include/linux/dma-resv.h +++ b/include/linux/dma-resv.h @@ -41,6 +41,7 @@ #include #include +#include #include #include #include @@ -464,6 +465,14 @@ static inline void dma_resv_unlock(struct dma_resv *obj) ww_mutex_unlock(>lock); } +static inline int kref_put_dma_resv(struct kref *kref, + void (*release)(struct kref *kref), + struct dma_resv *resv, + struct ww_acquire_ctx *ctx) +{ + return kref_put_ww_mutex(kref, release, >lock, ctx); +} + void dma_resv_init(struct dma_resv *obj); void dma_resv_fini(struct dma_resv *obj); int dma_resv_reserve_fences(struct dma_resv *obj, unsigned int num_fences); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH -next 1/7] drm/amdkfd: Remove unnecessary NULL values
Am 09.08.23 um 05:44 schrieb Ruan Jinjie: The NULL initialization of the pointers assigned by kzalloc() first is not necessary, because if the kzalloc() failed, the pointers will be assigned NULL, otherwise it works as usual. so remove it. Signed-off-by: Ruan Jinjie Reviewed-by: Christian König for this one, the amd display code and the radeon stuff. Thanks, Christian. --- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c index 863cf060af48..d01bb57733b3 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.c @@ -48,7 +48,7 @@ int pipe_priority_map[] = { struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, struct queue_properties *q) { - struct kfd_mem_obj *mqd_mem_obj = NULL; + struct kfd_mem_obj *mqd_mem_obj; mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL); if (!mqd_mem_obj) @@ -64,7 +64,7 @@ struct kfd_mem_obj *allocate_hiq_mqd(struct kfd_node *dev, struct queue_properti struct kfd_mem_obj *allocate_sdma_mqd(struct kfd_node *dev, struct queue_properties *q) { - struct kfd_mem_obj *mqd_mem_obj = NULL; + struct kfd_mem_obj *mqd_mem_obj; uint64_t offset; mqd_mem_obj = kzalloc(sizeof(struct kfd_mem_obj), GFP_KERNEL); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev
Am 12.07.23 um 15:38 schrieb Uwe Kleine-König: Hello Maxime, On Wed, Jul 12, 2023 at 02:52:38PM +0200, Maxime Ripard wrote: On Wed, Jul 12, 2023 at 01:02:53PM +0200, Uwe Kleine-König wrote: Background is that this makes merge conflicts easier to handle and detect. Really? FWIW, I agree with Christian here. Each file (apart from include/drm/drm_crtc.h) is only touched once. So unless I'm missing something you don't get less or easier conflicts by doing it all in a single patch. But you gain the freedom to drop a patch for one driver without having to drop the rest with it. Not really, because the last patch removed the union anyway. So you have to revert both the last patch, plus that driver one. And then you need to add a TODO to remove that union eventually. Yes, with a single patch you have only one revert (but 194 files changed, 1264 insertions(+), 1296 deletions(-)) instead of two (one of them: 1 file changed, 9 insertions(+), 1 deletion(-); the other maybe a bit bigger). (And maybe you get away with just reverting the last patch.) With a single patch the TODO after a revert is "redo it all again (and prepare for a different set of conflicts)" while with the split series it's only "fix that one driver that was forgotten/borked" + reapply that 10 line patch. Yeah, but for a maintainer the size of the patches doesn't matter. That's only interesting if you need to manually review the patch, which you hopefully doesn't do in case of something auto-generated. In other words if the patch is auto-generated re-applying it completely is less work than fixing things up individually. As the one who gets that TODO, I prefer the latter. Yeah, but your personal preferences are not a technical relevant argument to a maintainer. At the end of the day Dave or Daniel need to decide, because they need to live with it. Regards, Christian. So in sum: If your metric is "small count of reverted commits", you're right. If however your metric is: Better get 95% of this series' change in than maybe 0%, the split series is the way to do it. With me having spend ~3h on this series' changes, it's maybe understandable that I did it the way I did. FTR: This series was created on top of v6.5-rc1. If you apply it to drm-misc-next you get a (trivial) conflict in patch #2. If I consider to be the responsible maintainer who applies this series, I like being able to just do git am --skip then. FTR#2: In drm-misc-next is a new driver (drivers/gpu/drm/loongson/lsdc_crtc.c) so skipping the last patch for now might indeed be a good idea. So I still like the split version better, but I'm open to a more verbose reasoning from your side. You're doing only one thing here, really: you change the name of a structure field. If it was shared between multiple maintainers, then sure, splitting that up is easier for everyone, but this will go through drm-misc, so I can't see the benefit it brings. I see your argument, but I think mine weights more. Best regards Uwe ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH RFC v1 00/52] drm/crtc: Rename struct drm_crtc::dev to drm_dev
Am 12.07.23 um 11:46 schrieb Uwe Kleine-König: Hello, while I debugged an issue in the imx-lcdc driver I was constantly irritated about struct drm_device pointer variables being named "dev" because with that name I usually expect a struct device pointer. I think there is a big benefit when these are all renamed to "drm_dev". I have no strong preference here though, so "drmdev" or "drm" are fine for me, too. Let the bikesheding begin! Some statistics: $ git grep -ohE 'struct drm_device *\* *[^ (),;]*' v6.5-rc1 | sort | uniq -c | sort -n 1 struct drm_device *adev_to_drm 1 struct drm_device *drm_ 1 struct drm_device *drm_dev 1 struct drm_device*drm_dev 1 struct drm_device *pdev 1 struct drm_device *rdev 1 struct drm_device *vdev 2 struct drm_device *dcss_drv_dev_to_drm 2 struct drm_device **ddev 2 struct drm_device *drm_dev_alloc 2 struct drm_device *mock 2 struct drm_device *p_ddev 5 struct drm_device *device 9 struct drm_device * dev 25 struct drm_device *d 95 struct drm_device * 216 struct drm_device *ddev 234 struct drm_device *drm_dev 611 struct drm_device *drm 4190 struct drm_device *dev This series starts with renaming struct drm_crtc::dev to drm_dev. If it's not only me and others like the result of this effort it should be followed up by adapting the other structs and the individual usages in the different drivers. To make this series a bit easier handleable, I first added an alias for drm_crtc::dev, then converted the drivers one after another and the last patch drops the "dev" name. This has the advantage of being easier to review, and if I should have missed an instance only the last patch must be dropped/reverted. Also this series might conflict with other patches, in this case the remaining patches can still go in (apart from the last one of course). Maybe it also makes sense to delay applying the last patch by one development cycle? When you automatically generate the patch (with cocci for example) I usually prefer a single patch instead. Background is that this makes merge conflicts easier to handle and detect. When you have multiple patches and a merge conflict because of some added lines using the old field the build breaks only on the last patch which removes the old field. In such cases reviewing the patch just means automatically re-generating it and double checking that you don't see anything funky. Apart from that I honestly absolutely don't care what the name is. Cheers, Christian. The series was compile tested for arm, arm64, powerpc and amd64 using an allmodconfig (though I only build drivers/gpu/). Best regards Uwe Uwe Kleine-König (52): drm/crtc: Start renaming struct drm_crtc::dev to drm_dev drm/core: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/amd: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/armada: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/arm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/aspeed: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/ast: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/atmel-hlcdc: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/exynos: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/fsl-dcu: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/gma500: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/gud: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/hisilicon: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/hyperv: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/i915: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/imx: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/ingenic: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/kmb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/logicvc: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mcde: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mediatek: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/meson: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mgag200: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/msm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/mxsfb: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/nouveau: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/omapdrm: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/panel-ili9341: Use struct drm_crtc::drm_dev instead of struct drm_crtc::dev drm/pl111: Use struct
Re: [PATCH] drm: Remove unnecessary (void*) conversions
Am 26.05.23 um 05:32 schrieb Su Hui: Pointer variables of (void*) type do not require type cast. Please split that up by subsystem/driver. Taking it through the misc tree might just cause merge conflicts. Christian. Signed-off-by: Su Hui --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c | 2 +- drivers/gpu/drm/amd/pm/amdgpu_pm.c| 2 +- drivers/gpu/drm/etnaviv/etnaviv_drv.c | 4 ++-- drivers/gpu/drm/nouveau/nouveau_debugfs.c | 2 +- drivers/gpu/drm/omapdrm/omap_debugfs.c| 6 +++--- drivers/gpu/drm/pl111/pl111_debugfs.c | 2 +- drivers/gpu/drm/qxl/qxl_debugfs.c | 4 ++-- drivers/gpu/drm/tiny/arcpgu.c | 2 +- drivers/gpu/drm/ttm/ttm_resource.c| 3 +-- drivers/gpu/drm/virtio/virtgpu_debugfs.c | 6 +++--- drivers/gpu/drm/vmwgfx/ttm_object.c | 5 ++--- drivers/gpu/drm/vmwgfx/vmwgfx_gem.c | 2 +- 12 files changed, 19 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c index 827fcb4fb3b3..8a2c39927167 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_debugfs.c @@ -3312,7 +3312,7 @@ static ssize_t dtn_log_write( static int mst_topo_show(struct seq_file *m, void *unused) { - struct amdgpu_device *adev = (struct amdgpu_device *)m->private; + struct amdgpu_device *adev = m->private; struct drm_device *dev = adev_to_drm(adev); struct drm_connector *connector; struct drm_connector_list_iter conn_iter; diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 58c2246918fd..e6c870bd307b 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -3671,7 +3671,7 @@ static void amdgpu_parse_cg_state(struct seq_file *m, u64 flags) static int amdgpu_debugfs_pm_info_show(struct seq_file *m, void *unused) { - struct amdgpu_device *adev = (struct amdgpu_device *)m->private; + struct amdgpu_device *adev = m->private; struct drm_device *dev = adev_to_drm(adev); u64 flags = 0; int r; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c index 31a7f59ccb49..dd57f7164e9a 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c @@ -198,7 +198,7 @@ static int etnaviv_ring_show(struct etnaviv_gpu *gpu, struct seq_file *m) static int show_unlocked(struct seq_file *m, void *arg) { - struct drm_info_node *node = (struct drm_info_node *) m->private; + struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; int (*show)(struct drm_device *dev, struct seq_file *m) = node->info_ent->data; @@ -208,7 +208,7 @@ static int show_unlocked(struct seq_file *m, void *arg) static int show_each_gpu(struct seq_file *m, void *arg) { - struct drm_info_node *node = (struct drm_info_node *) m->private; + struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct etnaviv_drm_private *priv = dev->dev_private; struct etnaviv_gpu *gpu; diff --git a/drivers/gpu/drm/nouveau/nouveau_debugfs.c b/drivers/gpu/drm/nouveau/nouveau_debugfs.c index 2a36d1ca8fda..96b59d5d68ed 100644 --- a/drivers/gpu/drm/nouveau/nouveau_debugfs.c +++ b/drivers/gpu/drm/nouveau/nouveau_debugfs.c @@ -37,7 +37,7 @@ static int nouveau_debugfs_vbios_image(struct seq_file *m, void *data) { - struct drm_info_node *node = (struct drm_info_node *) m->private; + struct drm_info_node *node = m->private; struct nouveau_drm *drm = nouveau_drm(node->minor->dev); int i; diff --git a/drivers/gpu/drm/omapdrm/omap_debugfs.c b/drivers/gpu/drm/omapdrm/omap_debugfs.c index a3d470468e5b..a94ce502e152 100644 --- a/drivers/gpu/drm/omapdrm/omap_debugfs.c +++ b/drivers/gpu/drm/omapdrm/omap_debugfs.c @@ -19,7 +19,7 @@ static int gem_show(struct seq_file *m, void *arg) { - struct drm_info_node *node = (struct drm_info_node *) m->private; + struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct omap_drm_private *priv = dev->dev_private; @@ -33,7 +33,7 @@ static int gem_show(struct seq_file *m, void *arg) static int mm_show(struct seq_file *m, void *arg) { - struct drm_info_node *node = (struct drm_info_node *) m->private; + struct drm_info_node *node = m->private; struct drm_device *dev = node->minor->dev; struct drm_printer p = drm_seq_file_printer(m); @@ -45,7 +45,7 @@ static int mm_show(struct seq_file *m, void *arg) #ifdef
Re: [PATCH] drm/gem: Expose the buffer object handle to userspace last
Am 20.02.23 um 11:23 schrieb Tvrtko Ursulin: On 20/02/2023 10:01, Christian König wrote: Am 20.02.23 um 10:55 schrieb Tvrtko Ursulin: Hi, On 14/02/2023 13:59, Christian König wrote: Am 14.02.23 um 13:50 schrieb Tvrtko Ursulin: From: Tvrtko Ursulin Currently drm_gem_handle_create_tail exposes the handle to userspace before the buffer object constructions is complete. This allowing of working against a partially constructed object, which may also be in the process of having its creation fail, can have a range of negative outcomes. A lot of those will depend on what the individual drivers are doing in their obj->funcs->open() callbacks, and also with a common failure mode being -ENOMEM from drm_vma_node_allow. We can make sure none of this can happen by allocating a handle last, although with a downside that more of the function now runs under the dev->object_name_lock. Looking into the individual drivers open() hooks, we have amdgpu_gem_object_open which seems like it could have a potential security issue without this change. A couple drivers like qxl_gem_object_open and vmw_gem_object_open implement no-op hooks so no impact for them. A bunch of other require a deeper look by individual owners to asses for impact. Those are lima_gem_object_open, nouveau_gem_object_open, panfrost_gem_open, radeon_gem_object_open and virtio_gpu_gem_object_open. Putting aside the risk assesment of the above, some common scenarios to think about are along these lines: 1) Userspace closes a handle by speculatively "guessing" it from a second thread. This results in an unreachable buffer object so, a memory leak. 2) Same as 1), but object is in the process of getting closed (failed creation). The second thread is then able to re-cycle the handle and idr_remove would in the first thread would then remove the handle it does not own from the idr. 3) Going back to the earlier per driver problem space - individual impact assesment of allowing a second thread to access and operate on a partially constructed handle / object. (Can something crash? Leak information?) In terms of identifying when the problem started I will tag some patches as references, but not all, if even any, of them actually point to a broken state. I am just identifying points at which more opportunity for issues to arise was added. Yes I've looked into this once as well, but couldn't completely solve it for some reason. Give me a day or two to get this tested and all the logic swapped back into my head again. Managed to recollect what the problem with earlier attempts was? Nope, that's way to long ago. I can only assume that I ran into problems with the object_name_lock. Probably best to double check if that doesn't result in a lock inversion when somebody grabs the reservation lock in their ->load() callback. Hmm I don't immediately follow the connection. But I have only found radeon_driver_load_kms as using the load callback. Is there any lockdep enabled CI for that driver which could tell us if there is a problem there? We don't have CI for radeon and most likely never will, that hw is just to old. But we also have amdgpu_gem_object_open() and that looks suspiciously like trouble. The function makes sure that every BO is registered in the VM house keeping functions of the drm_file and while doing so grabs a few locks. I'm not sure what the locking order of those are. Could be that this will work, could be that it breaks. I will ping internally today if somebody from my team can take a look at this patch. Regards, Christian. Regards, Tvrtko Regards, Christian. Regards, Tvrtko Christian. References: 304eda32920b ("drm/gem: add hooks to notify driver when object handle is created/destroyed") References: ca481c9b2a3a ("drm/gem: implement vma access management") References: b39b5394fabc ("drm/gem: Add drm_gem_object_funcs") Cc: dri-de...@lists.freedesktop.org Cc: Rob Clark Cc: Ben Skeggs Cc: David Herrmann Cc: Noralf Trønnes Cc: David Airlie Cc: Daniel Vetter Cc: amd-...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: nouv...@lists.freedesktop.org Cc: Steven Price Cc: virtualization@lists.linux-foundation.org Cc: spice-de...@lists.freedesktop.org Cc: Zack Rusin --- drivers/gpu/drm/drm_gem.c | 48 +++ 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index aa15c52ae182..e3d897bca0f2 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -356,52 +356,52 @@ drm_gem_handle_create_tail(struct drm_file *file_priv, u32 *handlep) { struct drm_device *dev = obj->dev; - u32 handle; int ret; WARN_ON(!mutex_is_locked(>object_name_lock)); if (obj->handle_count++ == 0) drm_gem_object_get(obj); + ret = drm_vma_node_allow(>vma_node, file_priv); + if (ret) + goto err_put; + + if
Re: [PATCH] drm/gem: Expose the buffer object handle to userspace last
Am 14.02.23 um 13:50 schrieb Tvrtko Ursulin: From: Tvrtko Ursulin Currently drm_gem_handle_create_tail exposes the handle to userspace before the buffer object constructions is complete. This allowing of working against a partially constructed object, which may also be in the process of having its creation fail, can have a range of negative outcomes. A lot of those will depend on what the individual drivers are doing in their obj->funcs->open() callbacks, and also with a common failure mode being -ENOMEM from drm_vma_node_allow. We can make sure none of this can happen by allocating a handle last, although with a downside that more of the function now runs under the dev->object_name_lock. Looking into the individual drivers open() hooks, we have amdgpu_gem_object_open which seems like it could have a potential security issue without this change. A couple drivers like qxl_gem_object_open and vmw_gem_object_open implement no-op hooks so no impact for them. A bunch of other require a deeper look by individual owners to asses for impact. Those are lima_gem_object_open, nouveau_gem_object_open, panfrost_gem_open, radeon_gem_object_open and virtio_gpu_gem_object_open. Putting aside the risk assesment of the above, some common scenarios to think about are along these lines: 1) Userspace closes a handle by speculatively "guessing" it from a second thread. This results in an unreachable buffer object so, a memory leak. 2) Same as 1), but object is in the process of getting closed (failed creation). The second thread is then able to re-cycle the handle and idr_remove would in the first thread would then remove the handle it does not own from the idr. 3) Going back to the earlier per driver problem space - individual impact assesment of allowing a second thread to access and operate on a partially constructed handle / object. (Can something crash? Leak information?) In terms of identifying when the problem started I will tag some patches as references, but not all, if even any, of them actually point to a broken state. I am just identifying points at which more opportunity for issues to arise was added. Yes I've looked into this once as well, but couldn't completely solve it for some reason. Give me a day or two to get this tested and all the logic swapped back into my head again. Christian. References: 304eda32920b ("drm/gem: add hooks to notify driver when object handle is created/destroyed") References: ca481c9b2a3a ("drm/gem: implement vma access management") References: b39b5394fabc ("drm/gem: Add drm_gem_object_funcs") Cc: dri-de...@lists.freedesktop.org Cc: Rob Clark Cc: Ben Skeggs Cc: David Herrmann Cc: Noralf Trønnes Cc: David Airlie Cc: Daniel Vetter Cc: amd-...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: nouv...@lists.freedesktop.org Cc: Steven Price Cc: virtualization@lists.linux-foundation.org Cc: spice-de...@lists.freedesktop.org Cc: Zack Rusin --- drivers/gpu/drm/drm_gem.c | 48 +++ 1 file changed, 24 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index aa15c52ae182..e3d897bca0f2 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -356,52 +356,52 @@ drm_gem_handle_create_tail(struct drm_file *file_priv, u32 *handlep) { struct drm_device *dev = obj->dev; - u32 handle; int ret; WARN_ON(!mutex_is_locked(>object_name_lock)); if (obj->handle_count++ == 0) drm_gem_object_get(obj); + ret = drm_vma_node_allow(>vma_node, file_priv); + if (ret) + goto err_put; + + if (obj->funcs->open) { + ret = obj->funcs->open(obj, file_priv); + if (ret) + goto err_revoke; + } + /* -* Get the user-visible handle using idr. Preload and perform -* allocation under our spinlock. +* Get the user-visible handle using idr as the _last_ step. +* Preload and perform allocation under our spinlock. */ idr_preload(GFP_KERNEL); spin_lock(_priv->table_lock); - ret = idr_alloc(_priv->object_idr, obj, 1, 0, GFP_NOWAIT); - spin_unlock(_priv->table_lock); idr_preload_end(); - mutex_unlock(>object_name_lock); if (ret < 0) - goto err_unref; - - handle = ret; + goto err_close; - ret = drm_vma_node_allow(>vma_node, file_priv); - if (ret) - goto err_remove; + mutex_unlock(>object_name_lock); - if (obj->funcs->open) { - ret = obj->funcs->open(obj, file_priv); - if (ret) - goto err_revoke; - } + *handlep = ret; - *handlep = handle; return 0; +err_close: + if (obj->funcs->close) + obj->funcs->close(obj, file_priv); err_revoke:
Re: [drm-misc:drm-misc-next] [drm/ttm] 1802537820: WARNING:at_drivers/gpu/drm/drm_gem_vram_helper.c:#drm_gem_vram_offset
Am 08.02.23 um 10:38 schrieb Matthew Auld: On Wed, 8 Feb 2023 at 08:32, Christian König wrote: Hey guys, I'm pretty sure this is a bug in bochs which happens to surface because of a recent TTM change, we have seen similar problems in the past with this driver. What happens is that userspace tries to bind a BO to a CRTC before the BO has even a backing store. Any idea how to fix this? I can just remove the warning, but that's not really a good solution. IIUC this driver is just using ttm_bo_move_memcpy() underneath for its bo_move callback, which looks to be doing: if (!bo->resource) return 0; Which doesn't make any sense to me.There should at least be a move_null(), and maybe also a multi-hop to handle clearing. Otherwise bo->resource is likely always NULL (and we hit the above warning), even after the dummy move. What do you think? Oh, good point. That should indeed be move_null(). Do you want to write a patch or should I take care of this? Thanks for pointing that out, Christian. Regards, Christian. Am 08.02.23 um 05:32 schrieb kernel test robot: Greeting, FYI, we noticed WARNING:at_drivers/gpu/drm/drm_gem_vram_helper.c:#drm_gem_vram_offset due to commit (built with gcc-11): commit: 1802537820389183dfcd814e0f6a60d1496a75ef ("drm/ttm: stop allocating dummy resources during BO creation") git://anongit.freedesktop.org/drm/drm-misc drm-misc-next in testcase: boot on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): If you fix the issue, kindly add following tag | Reported-by: kernel test robot | Link: https://lore.kernel.org/oe-lkp/202302081038.984b8c1-oliver.s...@intel.com [ 25.994992][T1] [ cut here ] [ 25.995050][ T1] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_gem_vram_helper.c:255 drm_gem_vram_offset (??:?) [ 25.995080][T1] Modules linked in: [ 25.995100][T1] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G T 6.2.0-rc6-01191-g180253782038 #1 a8db67375c3ac749313dafaec43f39836e38fae9 [ 25.995117][T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 [ 25.995128][ T1] RIP: 0010:drm_gem_vram_offset (??:?) [ 25.995144][ T1] Code: 02 00 00 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a 48 c1 ea 03 80 3c 02 00 74 05 e8 7f 1f eb fe 48 8b 9b 20 02 00 00 48 85 db 75 06 <0f> 0b 31 c0 eb 4b 48 8d 7b 10 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a All code 0:02 00 add(%rax),%al 2:00 b8 ff ff 37 00 add%bh,0x37(%rax) 8:48 89 famov%rdi,%rdx b:48 c1 e0 2a shl$0x2a,%rax f:48 c1 ea 03 shr$0x3,%rdx 13:80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 17:74 05 je 0x1e 19:e8 7f 1f eb fe callq 0xfeeb1f9d 1e:48 8b 9b 20 02 00 00mov0x220(%rbx),%rbx 25:48 85 dbtest %rbx,%rbx 28:75 06 jne0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c:31 c0 xor%eax,%eax 2e:eb 4b jmp0x7b 30:48 8d 7b 10 lea0x10(%rbx),%rdi 34:b8 ff ff 37 00 mov$0x37,%eax 39:48 89 famov%rdi,%rdx 3c:48 c1 e0 2a shl$0x2a,%rax Code starting with the faulting instruction === 0:0f 0b ud2 2:31 c0 xor%eax,%eax 4:eb 4b jmp0x51 6:48 8d 7b 10 lea0x10(%rbx),%rdi a:b8 ff ff 37 00 mov$0x37,%eax f:48 89 famov%rdi,%rdx 12:48 c1 e0 2a shl$0x2a,%rax [ 25.995156][T1] RSP: :c901f028 EFLAGS: 00210246 [ 25.995174][T1] RAX: dc00 RBX: RCX: [ 25.995186][T1] RDX: 111026dee544 RSI: 8881372d4b10 RDI: 888136f72a20 [ 25.995196][T1] RBP: c901f030 R08: R09: [ 25.995206][T1] R10: R11: R12: 8881372d4b00 [ 25.995215][T1] R13: 888136e9ee00 R14: 888136f4a060 R15: 0500 [ 25.995225][T1] FS: () GS:8883aee0() knlGS: [ 25.995236][T1] CS: 0010 DS: ES: CR0: 80050033 [ 25.995247][T1] CR2: f7fa1cd4 CR3: 06015000 CR4: 000406b0 [ 25.995262][T1] Call Trace: [ 25.995271][T1] [ 25.995287][ T1] bochs_plane_update (bochs.c:?) [ 25.995308][ T1] ?
Re: [drm-misc:drm-misc-next] [drm/ttm] 1802537820: WARNING:at_drivers/gpu/drm/drm_gem_vram_helper.c:#drm_gem_vram_offset
Hey guys, I'm pretty sure this is a bug in bochs which happens to surface because of a recent TTM change, we have seen similar problems in the past with this driver. What happens is that userspace tries to bind a BO to a CRTC before the BO has even a backing store. Any idea how to fix this? I can just remove the warning, but that's not really a good solution. Regards, Christian. Am 08.02.23 um 05:32 schrieb kernel test robot: Greeting, FYI, we noticed WARNING:at_drivers/gpu/drm/drm_gem_vram_helper.c:#drm_gem_vram_offset due to commit (built with gcc-11): commit: 1802537820389183dfcd814e0f6a60d1496a75ef ("drm/ttm: stop allocating dummy resources during BO creation") git://anongit.freedesktop.org/drm/drm-misc drm-misc-next in testcase: boot on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): If you fix the issue, kindly add following tag | Reported-by: kernel test robot | Link: https://lore.kernel.org/oe-lkp/202302081038.984b8c1-oliver.s...@intel.com [ 25.994992][T1] [ cut here ] [ 25.995050][ T1] WARNING: CPU: 0 PID: 1 at drivers/gpu/drm/drm_gem_vram_helper.c:255 drm_gem_vram_offset (??:?) [ 25.995080][T1] Modules linked in: [ 25.995100][T1] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G T 6.2.0-rc6-01191-g180253782038 #1 a8db67375c3ac749313dafaec43f39836e38fae9 [ 25.995117][T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-5 04/01/2014 [ 25.995128][ T1] RIP: 0010:drm_gem_vram_offset (??:?) [ 25.995144][ T1] Code: 02 00 00 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a 48 c1 ea 03 80 3c 02 00 74 05 e8 7f 1f eb fe 48 8b 9b 20 02 00 00 48 85 db 75 06 <0f> 0b 31 c0 eb 4b 48 8d 7b 10 b8 ff ff 37 00 48 89 fa 48 c1 e0 2a All code 0: 02 00 add(%rax),%al 2: 00 b8 ff ff 37 00 add%bh,0x37(%rax) 8: 48 89 famov%rdi,%rdx b: 48 c1 e0 2a shl$0x2a,%rax f: 48 c1 ea 03 shr$0x3,%rdx 13: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) 17: 74 05 je 0x1e 19: e8 7f 1f eb fe callq 0xfeeb1f9d 1e: 48 8b 9b 20 02 00 00mov0x220(%rbx),%rbx 25: 48 85 dbtest %rbx,%rbx 28: 75 06 jne0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c: 31 c0 xor%eax,%eax 2e: eb 4b jmp0x7b 30: 48 8d 7b 10 lea0x10(%rbx),%rdi 34: b8 ff ff 37 00 mov$0x37,%eax 39: 48 89 famov%rdi,%rdx 3c: 48 c1 e0 2a shl$0x2a,%rax Code starting with the faulting instruction === 0: 0f 0b ud2 2: 31 c0 xor%eax,%eax 4: eb 4b jmp0x51 6: 48 8d 7b 10 lea0x10(%rbx),%rdi a: b8 ff ff 37 00 mov$0x37,%eax f: 48 89 famov%rdi,%rdx 12: 48 c1 e0 2a shl$0x2a,%rax [ 25.995156][T1] RSP: :c901f028 EFLAGS: 00210246 [ 25.995174][T1] RAX: dc00 RBX: RCX: [ 25.995186][T1] RDX: 111026dee544 RSI: 8881372d4b10 RDI: 888136f72a20 [ 25.995196][T1] RBP: c901f030 R08: R09: [ 25.995206][T1] R10: R11: R12: 8881372d4b00 [ 25.995215][T1] R13: 888136e9ee00 R14: 888136f4a060 R15: 0500 [ 25.995225][T1] FS: () GS:8883aee0() knlGS: [ 25.995236][T1] CS: 0010 DS: ES: CR0: 80050033 [ 25.995247][T1] CR2: f7fa1cd4 CR3: 06015000 CR4: 000406b0 [ 25.995262][T1] Call Trace: [ 25.995271][T1] [ 25.995287][ T1] bochs_plane_update (bochs.c:?) [ 25.995308][ T1] ? rcu_read_lock_bh_held (??:?) [ 25.995337][ T1] ? bochs_pci_probe (bochs.c:?) [ 25.995352][ T1] ? srcu_read_unlock (blk-mq.c:?) [ 25.995396][ T1] bochs_pipe_enable (bochs.c:?) [ 25.995410][ T1] ? drm_dev_printk (??:?) [ 25.995435][ T1] ? bochs_pipe_update (bochs.c:?) [ 25.995454][ T1] ? bochs_plane_update (bochs.c:?) [ 25.995473][ T1] ? bochs_pipe_update (bochs.c:?) [ 25.995487][ T1] ? bochs_pipe_update (bochs.c:?) [ 25.995507][ T1] drm_simple_kms_crtc_enable (drm_simple_kms_helper.c:?) [ 25.995533][ T1] drm_atomic_helper_commit_modeset_enables (??:?) [ 25.995570][ T1] drm_atomic_helper_commit_tail (??:?) [ 25.995591][ T1] commit_tail (drm_atomic_helper.c:?) [ 25.995631][ T1] drm_atomic_helper_commit (??:?) [ 25.995652][ T1] ? commit_work (??:?) [ 25.995670][ T1] drm_atomic_commit (??:?) [ 25.995689][ T1] ?
Re: [PATCH 01/21] drm/amdgpu: Don't set struct drm_driver.lastclose
Am 20.10.22 um 12:37 schrieb Thomas Zimmermann: Don't set struct drm_driver.lastclose. It's used to restore the fbdev console. But as amdgpu uses generic fbdev emulation, the console is being restored by the DRM client helpers already. See the call to drm_client_dev_restore() in drm_lastclose(). ??? The commit message doesn't match what the patch is doing. You are removing output_poll_changed instead of lastclose here. Did something got mixed up? Cheers, Christian. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 1 - drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 -- 2 files changed, 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index 23998f727c7f9..fb7186c5ade2a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -1224,7 +1224,6 @@ amdgpu_display_user_framebuffer_create(struct drm_device *dev, const struct drm_mode_config_funcs amdgpu_mode_funcs = { .fb_create = amdgpu_display_user_framebuffer_create, - .output_poll_changed = drm_fb_helper_output_poll_changed, }; static const struct drm_prop_enum_list amdgpu_underscan_enum_list[] = diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index f6a9e8fdd87d6..e9a28a5363b9a 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -82,7 +82,6 @@ #include #include #include -#include #include #include #include @@ -2810,7 +2809,6 @@ const struct amdgpu_ip_block_version dm_ip_block = static const struct drm_mode_config_funcs amdgpu_dm_mode_funcs = { .fb_create = amdgpu_display_user_framebuffer_create, .get_format_info = amd_get_format_info, - .output_poll_changed = drm_fb_helper_output_poll_changed, .atomic_check = amdgpu_dm_atomic_check, .atomic_commit = drm_atomic_helper_commit, }; ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 00/21] Move all drivers to a common dma-buf locking convention
Am 18.10.22 um 01:07 schrieb Dmitry Osipenko: On 10/17/22 20:22, Dmitry Osipenko wrote: Hello, This series moves all drivers to a dynamic dma-buf locking specification. From now on all dma-buf importers are made responsible for holding dma-buf's reservation lock around all operations performed over dma-bufs in accordance to the locking specification. This allows us to utilize reservation lock more broadly around kernel without fearing of a potential deadlocks. This patchset passes all i915 selftests. It was also tested using VirtIO, Panfrost, Lima, Tegra, udmabuf, AMDGPU and Nouveau drivers. I tested cases of display+GPU, display+V4L and GPU+V4L dma-buf sharing (where appropriate), which covers majority of kernel drivers since rest of the drivers share same or similar code paths. Changelog: v7: - Rebased on top of recent drm-misc-next. - Added ack from Jason Gunthorpe to the RDMA patch. - Added iosys_map_clear() to dma_buf_vmap_unlocked(), making it fully consistent with dma_buf_vmap(). v6: - Added r-b from Michael Ruhl to the i915 patch. - Added acks from Sumit Semwal and updated commit message of the "Move dma_buf_vmap() to dynamic locking specification" patch like was suggested by Sumit. - Added "!dmabuf" check to dma_buf_vmap_unlocked() to match the locked variant of the function, for consistency. v5: - Added acks and r-bs that were given to v4. - Changed i915 preparation patch like was suggested by Michael Ruhl. The scope of reservation locking is smaller now. v4: - Added dma_buf_mmap() to the "locking convention" documentation, which was missed by accident in v3. - Added acks from Christian König, Tomasz Figa and Hans Verkuil that they gave to couple v3 patches. - Dropped the "_unlocked" postfix from function names that don't have the locked variant, as was requested by Christian König. - Factored out the per-driver preparations into separate patches to ease reviewing of the changes, which is now doable without the global dma-buf functions renaming. - Factored out the dynamic locking convention enforcements into separate patches which add the final dma_resv_assert_held(dmabuf->resv) to the dma-buf API functions. v3: - Factored out dma_buf_mmap_unlocked() and attachment functions into aseparate patches, like was suggested by Christian König. - Corrected and factored out dma-buf locking documentation into a separate patch, like was suggested by Christian König. - Intel driver dropped the reservation locking fews days ago from its BO-release code path, but we need that locking for the imported GEMs because in the end that code path unmaps the imported GEM. So I added back the locking needed by the imported GEMs, updating the "dma-buf attachment locking specification" patch appropriately. - Tested Nouveau+Intel dma-buf import/export combo. - Tested udmabuf import to i915/Nouveau/AMDGPU. - Fixed few places in Etnaviv, Panfrost and Lima drivers that I missed to switch to locked dma-buf vmapping in the drm/gem: Take reservation lock for vmap/vunmap operations" patch. In a result invalidated the Christian's r-b that he gave to v2. - Added locked dma-buf vmap/vunmap functions that are needed for fixing vmappping of Etnaviv, Panfrost and Lima drivers mentioned above. I actually had this change stashed for the drm-shmem shrinker patchset, but then realized that it's already needed by the dma-buf patches. Also improved my tests to better cover these code paths. v2: - Changed locking specification to avoid problems with a cross-driver ww locking, like was suggested by Christian König. Now the attach/detach callbacks are invoked without the held lock and exporter should take the lock. - Added "locking convention" documentation that explains which dma-buf functions and callbacks are locked/unlocked for importers and exporters, which was requested by Christian König. - Added ack from Tomasz Figa to the V4L patches that he gave to v1. Dmitry Osipenko (21): dma-buf: Add unlocked variant of vmapping functions dma-buf: Add unlocked variant of attachment-mapping functions drm/gem: Take reservation lock for vmap/vunmap operations drm/prime: Prepare to dynamic dma-buf locking specification drm/armada: Prepare to dynamic dma-buf locking specification drm/i915: Prepare to dynamic dma-buf locking specification drm/omapdrm: Prepare to dynamic dma-buf locking specification drm/tegra: Prepare to dynamic dma-buf locking specification drm/etnaviv: Prepare to dynamic dma-buf locking specification RDMA/umem: Prepare to dynamic dma-buf locking specification misc: fastrpc: Prepare to dynamic dma-buf locking specification xen/gntdev: Prepare to dynamic dma-buf locking
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 06.09.22 um 22:05 schrieb Daniel Vetter: On Tue, Sep 06, 2022 at 10:01:47PM +0200, Daniel Vetter wrote: On Mon, Aug 15, 2022 at 12:05:19PM +0200, Christian König wrote: Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: Higher order pages allocated using alloc_pages() aren't refcounted and they need to be refcounted, otherwise it's impossible to map them by KVM. This patch sets the refcount of the tail pages and fixes the KVM memory mapping faults. Without this change guest virgl driver can't map host buffers into guest and can't provide OpenGL 4.5 profile support to the guest. The host mappings are also needed for enabling the Venus driver using host GPU drivers that are utilizing TTM. Based on a patch proposed by Trigger Huang. Well I can't count how often I have repeated this: This is an absolutely clear NAK! TTM pages are not reference counted in the first place and because of this giving them to virgl is illegal. Please immediately stop this completely broken approach. We have discussed this multiple times now. Yeah we need to get this stuff closed for real by tagging them all with VM_IO or VM_PFNMAP asap. For a bit more context: Anything mapping a bo should be VM_SPECIAL. And I think we should add the checks to the gem and dma-buf mmap functions to validate for that, and fix all the fallout. Otherwise this dragon keeps resurrecting ... VM_SPECIAL _will_ block get_user_pages, which will block everyone from even trying to refcount this stuff. Minimally we need to fix this for all ttm drivers, and it sounds like that's still not yet the case :-( Iirc last time around some funky amdkfd userspace was the hold-up because regressions? My recollection is that Felix and I fixed this with a KFD specific workaround. But I can double check with Felix on Monday. Christian. -Daniel It seems ot be a recurring amount of fun that people try to mmap dma-buf and then call get_user_pages on them. Which just doesn't work. I guess this is also why Rob Clark send out that dma-buf patch to expos mapping information (i.e. wc vs wb vs uncached). There seems to be some serious bonghits going on :-/ -Daniel Regards, Christian. Cc: sta...@vger.kernel.org Cc: Trigger Huang Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.collabora.com%2Fnews-and-blog%2Fblog%2F2021%2F11%2F26%2Fvenus-on-qemu-enabling-new-virtual-vulkan-driver%2F%23qcom1343data=05%7C01%7Cchristian.koenig%40amd.com%7C37a7d9b0f91249da415b08da90432d3a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637980915471280078%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=XN6wFiWc6Jljekmst0aOCPSTsFLlmkUjD9F%2Fl9nluAs%3Dreserved=0 Tested-by: Dmitry Osipenko # AMDGPU (Qemu and crosvm) Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/ttm/ttm_pool.c | 25 - 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index 21b61631f73a..11e92bb149c9 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -81,6 +81,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS; struct ttm_pool_dma *dma; struct page *p; + unsigned int i; void *vaddr; /* Don't set the __GFP_COMP flag for higher order allocations. @@ -93,8 +94,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, if (!pool->use_dma_alloc) { p = alloc_pages(gfp_flags, order); - if (p) + if (p) { p->private = order; + goto ref_tail_pages; + } return p; } @@ -120,6 +123,23 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, dma->vaddr = (unsigned long)vaddr | order; p->private = (unsigned long)dma; + +ref_tail_pages: + /* +* KVM requires mapped tail pages to be refcounted because put_page() +* is invoked on them in the end of the page fault handling, and thus, +* tail pages need to be protected from the premature releasing. +* In fact, KVM page fault handler refuses to map tail pages to guest +* if they aren't refcounted because hva_to_pfn_remapped() checks the +* refcount specifically for this case. +* +* In particular, unreferenced tail pages result in a KVM "Bad address" +* failure for VMMs that use VirtIO-GPU when guest's Mesa VirGL driver +* accesses mapped host TTM buffer that contains tail pages. +*/ + for (i = 1; i < 1 << order; i++) + page_ref_inc(p + i); + return p; error_free: @@ -133,6 +153,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, { unsigned long attr =
Re: [PATCH v4 00/21] Move all drivers to a common dma-buf locking convention
Hi Dmitry, I've gone over this multiple times now and while it is still possible that we missed something I think that this should land now. The whole topic is just to complicated that we can 100% sure guarantee that there isn't anything wrong with the locking, but lockdep and driver testing should allow us to quickly find remaining issues. Do you have commit rights to drm-misc-next or should I push it? Thanks, Christian. Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Hello, This series moves all drivers to a dynamic dma-buf locking specification. From now on all dma-buf importers are made responsible for holding dma-buf's reservation lock around all operations performed over dma-bufs in accordance to the locking specification. This allows us to utilize reservation lock more broadly around kernel without fearing of a potential deadlocks. This patchset passes all i915 selftests. It was also tested using VirtIO, Panfrost, Lima, Tegra, udmabuf, AMDGPU and Nouveau drivers. I tested cases of display+GPU, display+V4L and GPU+V4L dma-buf sharing (where appropriate), which covers majority of kernel drivers since rest of the drivers share same or similar code paths. Changelog: v4: - Added dma_buf_mmap() to the "locking convention" documentation, which was missed by accident in v3. - Added acks from Christian König, Tomasz Figa and Hans Verkuil that they gave to couple v3 patches. - Dropped the "_unlocked" postfix from function names that don't have the locked variant, as was requested by Christian König. - Factored out the per-driver preparations into separate patches to ease reviewing of the changes, which is now doable without the global dma-buf functions renaming. - Factored out the dynamic locking convention enforcements into separate patches which add the final dma_resv_assert_held(dmabuf->resv) to the dma-buf API functions. v3: - Factored out dma_buf_mmap_unlocked() and attachment functions into aseparate patches, like was suggested by Christian König. - Corrected and factored out dma-buf locking documentation into a separate patch, like was suggested by Christian König. - Intel driver dropped the reservation locking fews days ago from its BO-release code path, but we need that locking for the imported GEMs because in the end that code path unmaps the imported GEM. So I added back the locking needed by the imported GEMs, updating the "dma-buf attachment locking specification" patch appropriately. - Tested Nouveau+Intel dma-buf import/export combo. - Tested udmabuf import to i915/Nouveau/AMDGPU. - Fixed few places in Etnaviv, Panfrost and Lima drivers that I missed to switch to locked dma-buf vmapping in the drm/gem: Take reservation lock for vmap/vunmap operations" patch. In a result invalidated the Christian's r-b that he gave to v2. - Added locked dma-buf vmap/vunmap functions that are needed for fixing vmappping of Etnaviv, Panfrost and Lima drivers mentioned above. I actually had this change stashed for the drm-shmem shrinker patchset, but then realized that it's already needed by the dma-buf patches. Also improved my tests to better cover these code paths. v2: - Changed locking specification to avoid problems with a cross-driver ww locking, like was suggested by Christian König. Now the attach/detach callbacks are invoked without the held lock and exporter should take the lock. - Added "locking convention" documentation that explains which dma-buf functions and callbacks are locked/unlocked for importers and exporters, which was requested by Christian König. - Added ack from Tomasz Figa to the V4L patches that he gave to v1. Dmitry Osipenko (21): dma-buf: Add unlocked variant of vmapping functions dma-buf: Add unlocked variant of attachment-mapping functions drm/gem: Take reservation lock for vmap/vunmap operations drm/prime: Prepare to dynamic dma-buf locking specification drm/armada: Prepare to dynamic dma-buf locking specification drm/i915: Prepare to dynamic dma-buf locking specification drm/omapdrm: Prepare to dynamic dma-buf locking specification drm/tegra: Prepare to dynamic dma-buf locking specification drm/etnaviv: Prepare to dynamic dma-buf locking specification RDMA/umem: Prepare to dynamic dma-buf locking specification misc: fastrpc: Prepare to dynamic dma-buf locking specification xen/gntdev: Prepare to dynamic dma-buf locking specification media: videobuf2: Prepare to dynamic dma-buf locking specification media: tegra-vde: Prepare to dynamic dma-buf locking specification dma-buf: Move dma_buf_vmap() to dynamic locking specification dma-buf: Move dma_buf_attach() to dynamic locking specification dma-buf: Move dma_buf_map_attachment() to dynamic locking specification
Re: [PATCH v4 21/21] dma-buf: Remove obsoleted internal lock
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: The internal dma-buf lock isn't needed anymore because the updated locking specification claims that dma-buf reservation must be locked by importers, and thus, the internal data is already protected by the reservation lock. Remove the obsoleted internal lock. Acked-by: Christian König Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/dma-buf/dma-buf.c | 14 -- include/linux/dma-buf.h | 9 - 2 files changed, 4 insertions(+), 19 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 97ce884fad76..772fdd9eeed8 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -656,7 +656,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) dmabuf->file = file; - mutex_init(>lock); INIT_LIST_HEAD(>attachments); mutex_lock(_list.lock); @@ -1502,7 +1501,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_mmap, DMA_BUF); int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) { struct iosys_map ptr; - int ret = 0; + int ret; iosys_map_clear(map); @@ -1514,28 +1513,25 @@ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) if (!dmabuf->ops->vmap) return -EINVAL; - mutex_lock(>lock); if (dmabuf->vmapping_counter) { dmabuf->vmapping_counter++; BUG_ON(iosys_map_is_null(>vmap_ptr)); *map = dmabuf->vmap_ptr; - goto out_unlock; + return 0; } BUG_ON(iosys_map_is_set(>vmap_ptr)); ret = dmabuf->ops->vmap(dmabuf, ); if (WARN_ON_ONCE(ret)) - goto out_unlock; + return ret; dmabuf->vmap_ptr = ptr; dmabuf->vmapping_counter = 1; *map = dmabuf->vmap_ptr; -out_unlock: - mutex_unlock(>lock); - return ret; + return 0; } EXPORT_SYMBOL_NS_GPL(dma_buf_vmap, DMA_BUF); @@ -1577,13 +1573,11 @@ void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) BUG_ON(dmabuf->vmapping_counter == 0); BUG_ON(!iosys_map_is_equal(>vmap_ptr, map)); - mutex_lock(>lock); if (--dmabuf->vmapping_counter == 0) { if (dmabuf->ops->vunmap) dmabuf->ops->vunmap(dmabuf, map); iosys_map_clear(>vmap_ptr); } - mutex_unlock(>lock); } EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, DMA_BUF); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index f11b5bbc2f37..6fa8d4e29719 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -326,15 +326,6 @@ struct dma_buf { /** @ops: dma_buf_ops associated with this buffer object. */ const struct dma_buf_ops *ops; - /** -* @lock: -* -* Used internally to serialize list manipulation, attach/detach and -* vmap/unmap. Note that in many cases this is superseeded by -* dma_resv_lock() on @resv. -*/ - struct mutex lock; - /** * @vmapping_counter: * ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 20/21] media: videobuf2: Stop using internal dma-buf lock
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: All drivers that use dma-bufs have been moved to the updated locking specification and now dma-buf reservation is guaranteed to be locked by importers during the mapping operations. There is no need to take the internal dma-buf lock anymore. Remove locking from the videobuf2 memory allocators. Acked-by: Tomasz Figa Acked-by: Hans Verkuil Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/media/common/videobuf2/videobuf2-dma-contig.c | 11 +-- drivers/media/common/videobuf2/videobuf2-dma-sg.c | 11 +-- drivers/media/common/videobuf2/videobuf2-vmalloc.c| 11 +-- 3 files changed, 3 insertions(+), 30 deletions(-) diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c index 79f4d8301fbb..555bd40fa472 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -382,18 +382,12 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( struct dma_buf_attachment *db_attach, enum dma_data_direction dma_dir) { struct vb2_dc_attachment *attach = db_attach->priv; - /* stealing dmabuf mutex to serialize map/unmap operations */ - struct mutex *lock = _attach->dmabuf->lock; struct sg_table *sgt; - mutex_lock(lock); - sgt = >sgt; /* return previously mapped sg table */ - if (attach->dma_dir == dma_dir) { - mutex_unlock(lock); + if (attach->dma_dir == dma_dir) return sgt; - } /* release any previous cache */ if (attach->dma_dir != DMA_NONE) { @@ -409,14 +403,11 @@ static struct sg_table *vb2_dc_dmabuf_ops_map( if (dma_map_sgtable(db_attach->dev, sgt, dma_dir, DMA_ATTR_SKIP_CPU_SYNC)) { pr_err("failed to map scatterlist\n"); - mutex_unlock(lock); return ERR_PTR(-EIO); } attach->dma_dir = dma_dir; - mutex_unlock(lock); - return sgt; } diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c index 36ecdea8d707..36981a5b5c53 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c @@ -424,18 +424,12 @@ static struct sg_table *vb2_dma_sg_dmabuf_ops_map( struct dma_buf_attachment *db_attach, enum dma_data_direction dma_dir) { struct vb2_dma_sg_attachment *attach = db_attach->priv; - /* stealing dmabuf mutex to serialize map/unmap operations */ - struct mutex *lock = _attach->dmabuf->lock; struct sg_table *sgt; - mutex_lock(lock); - sgt = >sgt; /* return previously mapped sg table */ - if (attach->dma_dir == dma_dir) { - mutex_unlock(lock); + if (attach->dma_dir == dma_dir) return sgt; - } /* release any previous cache */ if (attach->dma_dir != DMA_NONE) { @@ -446,14 +440,11 @@ static struct sg_table *vb2_dma_sg_dmabuf_ops_map( /* mapping to the client with new direction */ if (dma_map_sgtable(db_attach->dev, sgt, dma_dir, 0)) { pr_err("failed to map scatterlist\n"); - mutex_unlock(lock); return ERR_PTR(-EIO); } attach->dma_dir = dma_dir; - mutex_unlock(lock); - return sgt; } diff --git a/drivers/media/common/videobuf2/videobuf2-vmalloc.c b/drivers/media/common/videobuf2/videobuf2-vmalloc.c index 7831bf545874..41db707e43a4 100644 --- a/drivers/media/common/videobuf2/videobuf2-vmalloc.c +++ b/drivers/media/common/videobuf2/videobuf2-vmalloc.c @@ -267,18 +267,12 @@ static struct sg_table *vb2_vmalloc_dmabuf_ops_map( struct dma_buf_attachment *db_attach, enum dma_data_direction dma_dir) { struct vb2_vmalloc_attachment *attach = db_attach->priv; - /* stealing dmabuf mutex to serialize map/unmap operations */ - struct mutex *lock = _attach->dmabuf->lock; struct sg_table *sgt; - mutex_lock(lock); - sgt = >sgt; /* return previously mapped sg table */ - if (attach->dma_dir == dma_dir) { - mutex_unlock(lock); + if (attach->dma_dir == dma_dir) return sgt; - } /* release any previous cache */ if (attach->dma_dir != DMA_NONE) { @@ -289,14 +283,11 @@ static struct sg_table *vb2_vmalloc_dmabuf_ops_map( /* mapping to the client with new direction */ if (dma_map_sgtable(db_attach->dev, sgt, dma_dir, 0)) { pr_err("failed to map scatterlist\n"); - mutex_unlock(lock); return ERR_PTR(-EIO); } attach->dma_dir = dma_dir; - mutex_unlock(lock); - return sgt; } ___ Virtualization mailing list
Re: [PATCH v4 19/21] dma-buf: Document dynamic locking convention
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Add documentation for the dynamic locking convention. The documentation tells dma-buf API users when they should take the reservation lock and when not. Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- Documentation/driver-api/dma-buf.rst | 6 +++ drivers/dma-buf/dma-buf.c| 64 2 files changed, 70 insertions(+) diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 36a76cbe9095..622b8156d212 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -119,6 +119,12 @@ DMA Buffer ioctls .. kernel-doc:: include/uapi/linux/dma-buf.h +DMA-BUF locking convention +~ + +.. kernel-doc:: drivers/dma-buf/dma-buf.c + :doc: locking convention + Kernel Functions and Structures Reference ~ diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index d9130486cb8f..97ce884fad76 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -794,6 +794,70 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach, return sg_table; } +/** + * DOC: locking convention + * + * In order to avoid deadlock situations between dma-buf exports and importers, + * all dma-buf API users must follow the common dma-buf locking convention. + * + * Convention for importers + * + * 1. Importers must hold the dma-buf reservation lock when calling these + *functions: + * + * - dma_buf_pin() + * - dma_buf_unpin() + * - dma_buf_map_attachment() + * - dma_buf_unmap_attachment() + * - dma_buf_vmap() + * - dma_buf_vunmap() + * + * 2. Importers must not hold the dma-buf reservation lock when calling these + *functions: + * + * - dma_buf_attach() + * - dma_buf_dynamic_attach() + * - dma_buf_detach() + * - dma_buf_export( + * - dma_buf_fd() + * - dma_buf_get() + * - dma_buf_put() + * - dma_buf_mmap() + * - dma_buf_begin_cpu_access() + * - dma_buf_end_cpu_access() + * - dma_buf_map_attachment_unlocked() + * - dma_buf_unmap_attachment_unlocked() + * - dma_buf_vmap_unlocked() + * - dma_buf_vunmap_unlocked() + * + * Convention for exporters + * + * 1. These _buf_ops callbacks are invoked with unlocked dma-buf + *reservation and exporter can take the lock: + * + * - _buf_ops.attach() + * - _buf_ops.detach() + * - _buf_ops.release() + * - _buf_ops.begin_cpu_access() + * - _buf_ops.end_cpu_access() + * + * 2. These _buf_ops callbacks are invoked with locked dma-buf + *reservation and exporter can't take the lock: + * + * - _buf_ops.pin() + * - _buf_ops.unpin() + * - _buf_ops.map_dma_buf() + * - _buf_ops.unmap_dma_buf() + * - _buf_ops.mmap() + * - _buf_ops.vmap() + * - _buf_ops.vunmap() + * + * 3. Exporters must hold the dma-buf reservation lock when calling these + *functions: + * + * - dma_buf_move_notify() + */ + /** * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list * @dmabuf: [in]buffer to attach device to. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 17/21] dma-buf: Move dma_buf_map_attachment() to dynamic locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Move dma-buf attachment mapping functions to the dynamic locking specification by asserting that the reservation lock is held. Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/dma-buf/dma-buf.c | 10 ++ 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 073942bf5ae9..8e928fe6e8df 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1037,8 +1037,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf)) return ERR_PTR(-EINVAL); - if (dma_buf_attachment_is_dynamic(attach)) - dma_resv_assert_held(attach->dmabuf->resv); + dma_resv_assert_held(attach->dmabuf->resv); if (attach->sgt) { /* @@ -1053,7 +1052,6 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, } if (dma_buf_is_dynamic(attach->dmabuf)) { - dma_resv_assert_held(attach->dmabuf->resv); if (!IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) { r = attach->dmabuf->ops->pin(attach); if (r) @@ -1142,15 +1140,11 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) return; - if (dma_buf_attachment_is_dynamic(attach)) - dma_resv_assert_held(attach->dmabuf->resv); + dma_resv_assert_held(attach->dmabuf->resv); if (attach->sgt == sg_table) return; - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_assert_held(attach->dmabuf->resv); - __unmap_dma_buf(attach, sg_table, direction); if (dma_buf_is_dynamic(attach->dmabuf) && ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 16/21] dma-buf: Move dma_buf_attach() to dynamic locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Move dma-buf attachment API functions to the dynamic locking specification by taking the reservation lock around the mapping operations. The strict locking convention prevents deadlock situations for dma-buf importers and exporters. Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/dma-buf/dma-buf.c | 20 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index ceea4839c641..073942bf5ae9 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -858,8 +858,8 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, dma_buf_is_dynamic(dmabuf)) { struct sg_table *sgt; + dma_resv_lock(attach->dmabuf->resv, NULL); if (dma_buf_is_dynamic(attach->dmabuf)) { - dma_resv_lock(attach->dmabuf->resv, NULL); ret = dmabuf->ops->pin(attach); if (ret) goto err_unlock; @@ -872,8 +872,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, ret = PTR_ERR(sgt); goto err_unpin; } - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_unlock(attach->dmabuf->resv); + dma_resv_unlock(attach->dmabuf->resv); attach->sgt = sgt; attach->dir = DMA_BIDIRECTIONAL; } @@ -889,8 +888,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, dmabuf->ops->unpin(attach); err_unlock: - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_unlock(attach->dmabuf->resv); + dma_resv_unlock(attach->dmabuf->resv); dma_buf_detach(dmabuf, attach); return ERR_PTR(ret); @@ -936,21 +934,19 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) if (WARN_ON(!dmabuf || !attach)) return; + dma_resv_lock(attach->dmabuf->resv, NULL); + if (attach->sgt) { - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_lock(attach->dmabuf->resv, NULL); __unmap_dma_buf(attach, attach->sgt, attach->dir); - if (dma_buf_is_dynamic(attach->dmabuf)) { + if (dma_buf_is_dynamic(attach->dmabuf)) dmabuf->ops->unpin(attach); - dma_resv_unlock(attach->dmabuf->resv); - } } - - dma_resv_lock(dmabuf->resv, NULL); list_del(>node); + dma_resv_unlock(dmabuf->resv); + if (dmabuf->ops->detach) dmabuf->ops->detach(dmabuf, attach); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 14/21] media: tegra-vde: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare Tegra video decoder driver to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c b/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c index 69c346148070..1c5b94989aec 100644 --- a/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c +++ b/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c @@ -38,7 +38,7 @@ static void tegra_vde_release_entry(struct tegra_vde_cache_entry *entry) if (entry->vde->domain) tegra_vde_iommu_unmap(entry->vde, entry->iova); - dma_buf_unmap_attachment(entry->a, entry->sgt, entry->dma_dir); + dma_buf_unmap_attachment_unlocked(entry->a, entry->sgt, entry->dma_dir); dma_buf_detach(dmabuf, entry->a); dma_buf_put(dmabuf); @@ -102,7 +102,7 @@ int tegra_vde_dmabuf_cache_map(struct tegra_vde *vde, goto err_unlock; } - sgt = dma_buf_map_attachment(attachment, dma_dir); + sgt = dma_buf_map_attachment_unlocked(attachment, dma_dir); if (IS_ERR(sgt)) { dev_err(dev, "Failed to get dmabufs sg_table\n"); err = PTR_ERR(sgt); @@ -152,7 +152,7 @@ int tegra_vde_dmabuf_cache_map(struct tegra_vde *vde, err_free: kfree(entry); err_unmap: - dma_buf_unmap_attachment(attachment, sgt, dma_dir); + dma_buf_unmap_attachment_unlocked(attachment, sgt, dma_dir); err_detach: dma_buf_detach(dmabuf, attachment); err_unlock: ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 13/21] media: videobuf2: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare V4L2 memory allocators to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Acked-by: Tomasz Figa Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/media/common/videobuf2/videobuf2-dma-contig.c | 11 ++- drivers/media/common/videobuf2/videobuf2-dma-sg.c | 8 drivers/media/common/videobuf2/videobuf2-vmalloc.c| 6 +++--- 3 files changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c index 678b359717c4..79f4d8301fbb 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -101,7 +101,7 @@ static void *vb2_dc_vaddr(struct vb2_buffer *vb, void *buf_priv) if (buf->db_attach) { struct iosys_map map; - if (!dma_buf_vmap(buf->db_attach->dmabuf, )) + if (!dma_buf_vmap_unlocked(buf->db_attach->dmabuf, )) buf->vaddr = map.vaddr; return buf->vaddr; @@ -711,7 +711,7 @@ static int vb2_dc_map_dmabuf(void *mem_priv) } /* get the associated scatterlist for this buffer */ - sgt = dma_buf_map_attachment(buf->db_attach, buf->dma_dir); + sgt = dma_buf_map_attachment_unlocked(buf->db_attach, buf->dma_dir); if (IS_ERR(sgt)) { pr_err("Error getting dmabuf scatterlist\n"); return -EINVAL; @@ -722,7 +722,8 @@ static int vb2_dc_map_dmabuf(void *mem_priv) if (contig_size < buf->size) { pr_err("contiguous chunk is too small %lu/%lu\n", contig_size, buf->size); - dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + dma_buf_unmap_attachment_unlocked(buf->db_attach, sgt, + buf->dma_dir); return -EFAULT; } @@ -750,10 +751,10 @@ static void vb2_dc_unmap_dmabuf(void *mem_priv) } if (buf->vaddr) { - dma_buf_vunmap(buf->db_attach->dmabuf, ); + dma_buf_vunmap_unlocked(buf->db_attach->dmabuf, ); buf->vaddr = NULL; } - dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + dma_buf_unmap_attachment_unlocked(buf->db_attach, sgt, buf->dma_dir); buf->dma_addr = 0; buf->dma_sgt = NULL; diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c index fa69158a65b1..36ecdea8d707 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-sg.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c @@ -309,7 +309,7 @@ static void *vb2_dma_sg_vaddr(struct vb2_buffer *vb, void *buf_priv) if (!buf->vaddr) { if (buf->db_attach) { - ret = dma_buf_vmap(buf->db_attach->dmabuf, ); + ret = dma_buf_vmap_unlocked(buf->db_attach->dmabuf, ); buf->vaddr = ret ? NULL : map.vaddr; } else { buf->vaddr = vm_map_ram(buf->pages, buf->num_pages, -1); @@ -565,7 +565,7 @@ static int vb2_dma_sg_map_dmabuf(void *mem_priv) } /* get the associated scatterlist for this buffer */ - sgt = dma_buf_map_attachment(buf->db_attach, buf->dma_dir); + sgt = dma_buf_map_attachment_unlocked(buf->db_attach, buf->dma_dir); if (IS_ERR(sgt)) { pr_err("Error getting dmabuf scatterlist\n"); return -EINVAL; @@ -594,10 +594,10 @@ static void vb2_dma_sg_unmap_dmabuf(void *mem_priv) } if (buf->vaddr) { - dma_buf_vunmap(buf->db_attach->dmabuf, ); + dma_buf_vunmap_unlocked(buf->db_attach->dmabuf, ); buf->vaddr = NULL; } - dma_buf_unmap_attachment(buf->db_attach, sgt, buf->dma_dir); + dma_buf_unmap_attachment_unlocked(buf->db_attach, sgt, buf->dma_dir); buf->dma_sgt = NULL; } diff --git a/drivers/media/common/videobuf2/videobuf2-vmalloc.c b/drivers/media/common/videobuf2/videobuf2-vmalloc.c index 948152f1596b..7831bf545874 100644 --- a/drivers/media/common/videobuf2/videobuf2-vmalloc.c +++ b/drivers/media/common/videobuf2/videobuf2-vmalloc.c @@ -376,7 +376,7 @@ static int vb2_vmalloc_map_dmabuf(void *mem_priv) struct iosys_map map; int ret; - ret = dma_buf_vmap(buf->dbuf, ); + ret = dma_buf_vmap_unlocked(buf->dbuf, ); if (ret) return -EFAULT; buf->vaddr = map.vaddr; @@ -389,7 +389,7 @@ static void vb2_vmalloc_unmap_dmabuf(void *mem_priv) struct vb2_vmalloc_buf *buf = mem_priv; struct iosys_map map = IOSYS_MAP_INIT_VADDR(buf->vaddr); - dma_buf_vunmap(buf->dbuf, ); + dma_buf_vunmap_unlocked(buf->dbuf, ); buf->vaddr = NULL;
Re: [PATCH v4 12/21] xen/gntdev: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare gntdev driver to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/xen/gntdev-dmabuf.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c index 940e5e9e8a54..4440e626b797 100644 --- a/drivers/xen/gntdev-dmabuf.c +++ b/drivers/xen/gntdev-dmabuf.c @@ -600,7 +600,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, gntdev_dmabuf->u.imp.attach = attach; - sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL); + sgt = dma_buf_map_attachment_unlocked(attach, DMA_BIDIRECTIONAL); if (IS_ERR(sgt)) { ret = ERR_CAST(sgt); goto fail_detach; @@ -658,7 +658,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, fail_end_access: dmabuf_imp_end_foreign_access(gntdev_dmabuf->u.imp.refs, count); fail_unmap: - dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(attach, sgt, DMA_BIDIRECTIONAL); fail_detach: dma_buf_detach(dma_buf, attach); fail_free_obj: @@ -708,8 +708,8 @@ static int dmabuf_imp_release(struct gntdev_dmabuf_priv *priv, u32 fd) attach = gntdev_dmabuf->u.imp.attach; if (gntdev_dmabuf->u.imp.sgt) - dma_buf_unmap_attachment(attach, gntdev_dmabuf->u.imp.sgt, -DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(attach, gntdev_dmabuf->u.imp.sgt, + DMA_BIDIRECTIONAL); dma_buf = attach->dmabuf; dma_buf_detach(attach->dmabuf, attach); dma_buf_put(dma_buf); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 11/21] misc: fastrpc: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare fastrpc to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/misc/fastrpc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c index 93ebd174d848..6fcfb2e9f7a7 100644 --- a/drivers/misc/fastrpc.c +++ b/drivers/misc/fastrpc.c @@ -310,8 +310,8 @@ static void fastrpc_free_map(struct kref *ref) return; } } - dma_buf_unmap_attachment(map->attach, map->table, -DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(map->attach, map->table, + DMA_BIDIRECTIONAL); dma_buf_detach(map->buf, map->attach); dma_buf_put(map->buf); } @@ -726,7 +726,7 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int fd, goto attach_err; } - map->table = dma_buf_map_attachment(map->attach, DMA_BIDIRECTIONAL); + map->table = dma_buf_map_attachment_unlocked(map->attach, DMA_BIDIRECTIONAL); if (IS_ERR(map->table)) { err = PTR_ERR(map->table); goto map_err; ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 10/21] RDMA/umem: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare InfiniBand drivers to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/infiniband/core/umem_dmabuf.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c index 04c04e6d24c3..43b26bc12288 100644 --- a/drivers/infiniband/core/umem_dmabuf.c +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -26,7 +26,8 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) if (umem_dmabuf->sgt) goto wait_fence; - sgt = dma_buf_map_attachment(umem_dmabuf->attach, DMA_BIDIRECTIONAL); + sgt = dma_buf_map_attachment_unlocked(umem_dmabuf->attach, + DMA_BIDIRECTIONAL); if (IS_ERR(sgt)) return PTR_ERR(sgt); @@ -102,8 +103,8 @@ void ib_umem_dmabuf_unmap_pages(struct ib_umem_dmabuf *umem_dmabuf) umem_dmabuf->last_sg_trim = 0; } - dma_buf_unmap_attachment(umem_dmabuf->attach, umem_dmabuf->sgt, -DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(umem_dmabuf->attach, umem_dmabuf->sgt, + DMA_BIDIRECTIONAL); umem_dmabuf->sgt = NULL; } ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 09/21] drm/etnaviv: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare Etnaviv driver to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Interesting, where is the matching vmap()? Anyway, this patch is Acked-by: Christian König --- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c index 3fa2da149639..7031db145a77 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c @@ -65,7 +65,7 @@ static void etnaviv_gem_prime_release(struct etnaviv_gem_object *etnaviv_obj) struct iosys_map map = IOSYS_MAP_INIT_VADDR(etnaviv_obj->vaddr); if (etnaviv_obj->vaddr) - dma_buf_vunmap(etnaviv_obj->base.import_attach->dmabuf, ); + dma_buf_vunmap_unlocked(etnaviv_obj->base.import_attach->dmabuf, ); /* Don't drop the pages for imported dmabuf, as they are not * ours, just free the array we allocated: ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 08/21] drm/tegra: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare Tegra DRM driver to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/gpu/drm/tegra/gem.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 81991090adcc..b09b8ab40ae4 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -84,7 +84,7 @@ static struct host1x_bo_mapping *tegra_bo_pin(struct device *dev, struct host1x_ goto free; } - map->sgt = dma_buf_map_attachment(map->attach, direction); + map->sgt = dma_buf_map_attachment_unlocked(map->attach, direction); if (IS_ERR(map->sgt)) { dma_buf_detach(buf, map->attach); err = PTR_ERR(map->sgt); @@ -160,7 +160,8 @@ static struct host1x_bo_mapping *tegra_bo_pin(struct device *dev, struct host1x_ static void tegra_bo_unpin(struct host1x_bo_mapping *map) { if (map->attach) { - dma_buf_unmap_attachment(map->attach, map->sgt, map->direction); + dma_buf_unmap_attachment_unlocked(map->attach, map->sgt, + map->direction); dma_buf_detach(map->attach->dmabuf, map->attach); } else { dma_unmap_sgtable(map->dev, map->sgt, map->direction, 0); @@ -181,7 +182,7 @@ static void *tegra_bo_mmap(struct host1x_bo *bo) if (obj->vaddr) { return obj->vaddr; } else if (obj->gem.import_attach) { - ret = dma_buf_vmap(obj->gem.import_attach->dmabuf, ); + ret = dma_buf_vmap_unlocked(obj->gem.import_attach->dmabuf, ); return ret ? NULL : map.vaddr; } else { return vmap(obj->pages, obj->num_pages, VM_MAP, @@ -197,7 +198,7 @@ static void tegra_bo_munmap(struct host1x_bo *bo, void *addr) if (obj->vaddr) return; else if (obj->gem.import_attach) - dma_buf_vunmap(obj->gem.import_attach->dmabuf, ); + dma_buf_vunmap_unlocked(obj->gem.import_attach->dmabuf, ); else vunmap(addr); } @@ -461,7 +462,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm, get_dma_buf(buf); - bo->sgt = dma_buf_map_attachment(attach, DMA_TO_DEVICE); + bo->sgt = dma_buf_map_attachment_unlocked(attach, DMA_TO_DEVICE); if (IS_ERR(bo->sgt)) { err = PTR_ERR(bo->sgt); goto detach; @@ -479,7 +480,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm, detach: if (!IS_ERR_OR_NULL(bo->sgt)) - dma_buf_unmap_attachment(attach, bo->sgt, DMA_TO_DEVICE); + dma_buf_unmap_attachment_unlocked(attach, bo->sgt, DMA_TO_DEVICE); dma_buf_detach(buf, attach); dma_buf_put(buf); @@ -508,8 +509,8 @@ void tegra_bo_free_object(struct drm_gem_object *gem) tegra_bo_iommu_unmap(tegra, bo); if (gem->import_attach) { - dma_buf_unmap_attachment(gem->import_attach, bo->sgt, -DMA_TO_DEVICE); + dma_buf_unmap_attachment_unlocked(gem->import_attach, bo->sgt, + DMA_TO_DEVICE); drm_prime_gem_destroy(gem, NULL); } else { tegra_bo_free(gem->dev, bo); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 06/21] drm/i915: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare i915 driver to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions and handling cases where importer now holds the reservation lock. Signed-off-by: Dmitry Osipenko Acked-by: Christian König , but it's probably best if somebody from the Intel guys take a look as well. --- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 12 .../gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 16 3 files changed, 21 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index f5062d0c6333..07eee1c09aaf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -72,7 +72,7 @@ static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf); void *vaddr; - vaddr = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB); + vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB); if (IS_ERR(vaddr)) return PTR_ERR(vaddr); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 389e9f157ca5..7e2a9b02526c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -331,7 +331,19 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, continue; } + /* +* dma_buf_unmap_attachment() requires reservation to be +* locked. The imported GEM shouldn't share reservation lock, +* so it's safe to take the lock. +*/ + if (obj->base.import_attach) + i915_gem_object_lock(obj, NULL); + __i915_gem_object_pages_fini(obj); + + if (obj->base.import_attach) + i915_gem_object_unlock(obj); + __i915_gem_free_object(obj); /* But keep the pointer alive for RCU-protected lookups */ diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c index 62c61af77a42..9e3ed634aa0e 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c @@ -213,7 +213,7 @@ static int igt_dmabuf_import_same_driver(struct drm_i915_private *i915, goto out_import; } - st = dma_buf_map_attachment(import_attach, DMA_BIDIRECTIONAL); + st = dma_buf_map_attachment_unlocked(import_attach, DMA_BIDIRECTIONAL); if (IS_ERR(st)) { err = PTR_ERR(st); goto out_detach; @@ -226,7 +226,7 @@ static int igt_dmabuf_import_same_driver(struct drm_i915_private *i915, timeout = -ETIME; } err = timeout > 0 ? 0 : timeout; - dma_buf_unmap_attachment(import_attach, st, DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(import_attach, st, DMA_BIDIRECTIONAL); out_detach: dma_buf_detach(dmabuf, import_attach); out_import: @@ -296,7 +296,7 @@ static int igt_dmabuf_import(void *arg) goto out_obj; } - err = dma_buf_vmap(dmabuf, ); + err = dma_buf_vmap_unlocked(dmabuf, ); dma_map = err ? NULL : map.vaddr; if (!dma_map) { pr_err("dma_buf_vmap failed\n"); @@ -337,7 +337,7 @@ static int igt_dmabuf_import(void *arg) err = 0; out_dma_map: - dma_buf_vunmap(dmabuf, ); + dma_buf_vunmap_unlocked(dmabuf, ); out_obj: i915_gem_object_put(obj); out_dmabuf: @@ -358,7 +358,7 @@ static int igt_dmabuf_import_ownership(void *arg) if (IS_ERR(dmabuf)) return PTR_ERR(dmabuf); - err = dma_buf_vmap(dmabuf, ); + err = dma_buf_vmap_unlocked(dmabuf, ); ptr = err ? NULL : map.vaddr; if (!ptr) { pr_err("dma_buf_vmap failed\n"); @@ -367,7 +367,7 @@ static int igt_dmabuf_import_ownership(void *arg) } memset(ptr, 0xc5, PAGE_SIZE); - dma_buf_vunmap(dmabuf, ); + dma_buf_vunmap_unlocked(dmabuf, ); obj = to_intel_bo(i915_gem_prime_import(>drm, dmabuf)); if (IS_ERR(obj)) { @@ -418,7 +418,7 @@ static int igt_dmabuf_export_vmap(void *arg) } i915_gem_object_put(obj); - err = dma_buf_vmap(dmabuf, ); + err = dma_buf_vmap_unlocked(dmabuf, ); ptr = err ? NULL : map.vaddr; if (!ptr) { pr_err("dma_buf_vmap failed\n"); @@ -435,7 +435,7 @@ static int igt_dmabuf_export_vmap(void *arg) memset(ptr, 0xc5, dmabuf->size); err = 0; - dma_buf_vunmap(dmabuf, ); + dma_buf_vunmap_unlocked(dmabuf, ); out: dma_buf_put(dmabuf); return err;
Re: [PATCH v4 05/21] drm/armada: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare Armada driver to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Acked-by: Christian König --- drivers/gpu/drm/armada/armada_gem.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index 5430265ad458..26d10065d534 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -66,8 +66,8 @@ void armada_gem_free_object(struct drm_gem_object *obj) if (dobj->obj.import_attach) { /* We only ever display imported data */ if (dobj->sgt) - dma_buf_unmap_attachment(dobj->obj.import_attach, -dobj->sgt, DMA_TO_DEVICE); + dma_buf_unmap_attachment_unlocked(dobj->obj.import_attach, + dobj->sgt, DMA_TO_DEVICE); drm_prime_gem_destroy(>obj, NULL); } @@ -539,8 +539,8 @@ int armada_gem_map_import(struct armada_gem_object *dobj) { int ret; - dobj->sgt = dma_buf_map_attachment(dobj->obj.import_attach, - DMA_TO_DEVICE); + dobj->sgt = dma_buf_map_attachment_unlocked(dobj->obj.import_attach, + DMA_TO_DEVICE); if (IS_ERR(dobj->sgt)) { ret = PTR_ERR(dobj->sgt); dobj->sgt = NULL; ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v4 04/21] drm/prime: Prepare to dynamic dma-buf locking specification
Am 31.08.22 um 17:37 schrieb Dmitry Osipenko: Prepare DRM prime core to the common dynamic dma-buf locking convention by starting to use the unlocked versions of dma-buf API functions. Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/gpu/drm/drm_prime.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index eb09e86044c6..20e109a802ae 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -940,7 +940,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, get_dma_buf(dma_buf); - sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL); + sgt = dma_buf_map_attachment_unlocked(attach, DMA_BIDIRECTIONAL); if (IS_ERR(sgt)) { ret = PTR_ERR(sgt); goto fail_detach; @@ -958,7 +958,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, return obj; fail_unmap: - dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(attach, sgt, DMA_BIDIRECTIONAL); fail_detach: dma_buf_detach(dma_buf, attach); dma_buf_put(dma_buf); @@ -1056,7 +1056,7 @@ void drm_prime_gem_destroy(struct drm_gem_object *obj, struct sg_table *sg) attach = obj->import_attach; if (sg) - dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL); + dma_buf_unmap_attachment_unlocked(attach, sg, DMA_BIDIRECTIONAL); dma_buf = attach->dmabuf; dma_buf_detach(attach->dmabuf, attach); /* remove the reference */ ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [Linaro-mm-sig] [PATCH v3 6/9] dma-buf: Move dma-buf attachment to dynamic locking specification
Am 24.08.22 um 17:49 schrieb Dmitry Osipenko: On 8/24/22 18:24, Christian König wrote: Am 24.08.22 um 12:22 schrieb Dmitry Osipenko: Move dma-buf attachment API functions to the dynamic locking specification. The strict locking convention prevents deadlock situations for dma-buf importers and exporters. Previously, the "unlocked" versions of the attachment API functions weren't taking the reservation lock and this patch makes them to take the lock. Intel and AMD GPU drivers already were mapping the attached dma-bufs under the held lock during attachment, hence these drivers are updated to use the locked functions. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 115 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 8 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 12 +++ include/linux/dma-buf.h | 20 ++-- 5 files changed, 110 insertions(+), 49 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 4556a12bd741..f2a5a122da4a 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -559,7 +559,7 @@ static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) * 2. Userspace passes this file-descriptors to all drivers it wants this buffer * to share with: First the file descriptor is converted to a _buf using * dma_buf_get(). Then the buffer is attached to the device using - * dma_buf_attach(). + * dma_buf_attach_unlocked(). Now I get why this is confusing me so much. The _unlocked postfix implies that there is another function which should be called with the locks already held, but this is not the case for attach/detach (because they always need to grab the lock themselves). That's correct. The attach/detach ops of exporter can take the lock (like i915 exporter does it), hence importer must not grab the lock around dma_buf_attach() invocation. So I suggest to drop the _unlocked postfix for the attach/detach functions. Another step would then be to unify attach/detach with dynamic_attach/dynamic_detach when both have the same locking convention anyway. It's not a problem to change the name, but it's unclear to me why we should do it. The _unlocked postfix tells importer that reservation must be unlocked and it must be unlocked in case of dma_buf_attach(). Dropping the postfix will make dma_buf_attach() inconsistent with the rest of the _unlocked functions(?). Are you sure we need to rename it? The idea of the postfix was to distinguish between two different versions of the same function, e.g. dma_buf_vmap_unlocked() vs normal dma_buf_vmap(). When we don't have those two types of the same function I don't think it makes to much sense to keep that. We should just properly document which functions expect what and that's what your documentation patch does. Regards, Christian. Sorry that this is going so much back and forth, it's really complicated to keep all the stuff in my head at the moment :) Not a problem at all, I expected that it will take some time for this patchset to settle down. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 6/9] dma-buf: Move dma-buf attachment to dynamic locking specification
Am 24.08.22 um 17:03 schrieb Dmitry Osipenko: On 8/24/22 17:08, Christian König wrote: Am 24.08.22 um 12:22 schrieb Dmitry Osipenko: Move dma-buf attachment API functions to the dynamic locking specification. The strict locking convention prevents deadlock situations for dma-buf importers and exporters. Previously, the "unlocked" versions of the attachment API functions weren't taking the reservation lock and this patch makes them to take the lock. Didn't we concluded that we need to keep the attach and detach callbacks without the lock and only move the map/unmap callbacks over? Otherwise it won't be possible for drivers to lock multiple buffers if they have to shuffle things around for a specific attachment. We did conclude that. The attach/detach dma-buf ops are unlocked, but the map_dma_buf/unmap_dma_buf must be invoked under lock and dma_buf_dynamic_attach_unlocked() maps dma-buf if either importer or exporter can't handle the dynamic mapping [1]. Ah! You are confusing me over and over again with that :) Ok in this case that here is fine, I just need to re-read the patch. Thanks, Christian. [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv6.0-rc2%2Fsource%2Fdrivers%2Fdma-buf%2Fdma-buf.c%23L869data=05%7C01%7Cchristian.koenig%40amd.com%7Cdf23d89db8b84bf6d4c008da85e1dc6c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637969502441026991%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=d8kWKjDCFn%2B3KmK135Gcv6%2FMLffEYcipouqWxfc%2BKXM%3Dreserved=0 Hence I re-arranged the dma_resv_lock() in dma_buf_dynamic_attach_unlocked() to move both pinning and mapping under the held lock. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 9/9] dma-buf: Remove internal lock
Am 24.08.22 um 12:22 schrieb Dmitry Osipenko: The internal dma-buf lock isn't needed anymore because the updated locking specification claims that dma-buf reservation must be locked by importers, and thus, the internal data is already protected by the reservation lock. Remove the obsoleted internal lock. Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/dma-buf/dma-buf.c | 14 -- include/linux/dma-buf.h | 9 - 2 files changed, 4 insertions(+), 19 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 696d132b02f4..a0406254f0ae 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -656,7 +656,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) dmabuf->file = file; - mutex_init(>lock); INIT_LIST_HEAD(>attachments); mutex_lock(_list.lock); @@ -1503,7 +1502,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_mmap_unlocked, DMA_BUF); int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) { struct iosys_map ptr; - int ret = 0; + int ret; iosys_map_clear(map); @@ -1515,28 +1514,25 @@ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) if (!dmabuf->ops->vmap) return -EINVAL; - mutex_lock(>lock); if (dmabuf->vmapping_counter) { dmabuf->vmapping_counter++; BUG_ON(iosys_map_is_null(>vmap_ptr)); *map = dmabuf->vmap_ptr; - goto out_unlock; + return 0; } BUG_ON(iosys_map_is_set(>vmap_ptr)); ret = dmabuf->ops->vmap(dmabuf, ); if (WARN_ON_ONCE(ret)) - goto out_unlock; + return ret; dmabuf->vmap_ptr = ptr; dmabuf->vmapping_counter = 1; *map = dmabuf->vmap_ptr; -out_unlock: - mutex_unlock(>lock); - return ret; + return 0; } EXPORT_SYMBOL_NS_GPL(dma_buf_vmap, DMA_BUF); @@ -1578,13 +1574,11 @@ void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) BUG_ON(dmabuf->vmapping_counter == 0); BUG_ON(!iosys_map_is_equal(>vmap_ptr, map)); - mutex_lock(>lock); if (--dmabuf->vmapping_counter == 0) { if (dmabuf->ops->vunmap) dmabuf->ops->vunmap(dmabuf, map); iosys_map_clear(>vmap_ptr); } - mutex_unlock(>lock); } EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, DMA_BUF); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index d48d534dc55c..aed6695bbb50 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -326,15 +326,6 @@ struct dma_buf { /** @ops: dma_buf_ops associated with this buffer object. */ const struct dma_buf_ops *ops; - /** -* @lock: -* -* Used internally to serialize list manipulation, attach/detach and -* vmap/unmap. Note that in many cases this is superseeded by -* dma_resv_lock() on @resv. -*/ - struct mutex lock; - /** * @vmapping_counter: * ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 9/9] dma-buf: Remove internal lock
Am 24.08.22 um 12:22 schrieb Dmitry Osipenko: The internal dma-buf lock isn't needed anymore because the updated locking specification claims that dma-buf reservation must be locked by importers, and thus, the internal data is already protected by the reservation lock. Remove the obsoleted internal lock. Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/dma-buf/dma-buf.c | 14 -- include/linux/dma-buf.h | 9 - 2 files changed, 4 insertions(+), 19 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 696d132b02f4..a0406254f0ae 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -656,7 +656,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) dmabuf->file = file; - mutex_init(>lock); INIT_LIST_HEAD(>attachments); mutex_lock(_list.lock); @@ -1503,7 +1502,7 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_mmap_unlocked, DMA_BUF); int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) { struct iosys_map ptr; - int ret = 0; + int ret; iosys_map_clear(map); @@ -1515,28 +1514,25 @@ int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map) if (!dmabuf->ops->vmap) return -EINVAL; - mutex_lock(>lock); if (dmabuf->vmapping_counter) { dmabuf->vmapping_counter++; BUG_ON(iosys_map_is_null(>vmap_ptr)); *map = dmabuf->vmap_ptr; - goto out_unlock; + return 0; } BUG_ON(iosys_map_is_set(>vmap_ptr)); ret = dmabuf->ops->vmap(dmabuf, ); if (WARN_ON_ONCE(ret)) - goto out_unlock; + return ret; dmabuf->vmap_ptr = ptr; dmabuf->vmapping_counter = 1; *map = dmabuf->vmap_ptr; -out_unlock: - mutex_unlock(>lock); - return ret; + return 0; } EXPORT_SYMBOL_NS_GPL(dma_buf_vmap, DMA_BUF); @@ -1578,13 +1574,11 @@ void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map) BUG_ON(dmabuf->vmapping_counter == 0); BUG_ON(!iosys_map_is_equal(>vmap_ptr, map)); - mutex_lock(>lock); if (--dmabuf->vmapping_counter == 0) { if (dmabuf->ops->vunmap) dmabuf->ops->vunmap(dmabuf, map); iosys_map_clear(>vmap_ptr); } - mutex_unlock(>lock); } EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, DMA_BUF); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index d48d534dc55c..aed6695bbb50 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -326,15 +326,6 @@ struct dma_buf { /** @ops: dma_buf_ops associated with this buffer object. */ const struct dma_buf_ops *ops; - /** -* @lock: -* -* Used internally to serialize list manipulation, attach/detach and -* vmap/unmap. Note that in many cases this is superseeded by -* dma_resv_lock() on @resv. -*/ - struct mutex lock; - /** * @vmapping_counter: * ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 6/9] dma-buf: Move dma-buf attachment to dynamic locking specification
Am 24.08.22 um 12:22 schrieb Dmitry Osipenko: Move dma-buf attachment API functions to the dynamic locking specification. The strict locking convention prevents deadlock situations for dma-buf importers and exporters. Previously, the "unlocked" versions of the attachment API functions weren't taking the reservation lock and this patch makes them to take the lock. Didn't we concluded that we need to keep the attach and detach callbacks without the lock and only move the map/unmap callbacks over? Otherwise it won't be possible for drivers to lock multiple buffers if they have to shuffle things around for a specific attachment. Regards, Christian. Intel and AMD GPU drivers already were mapping the attached dma-bufs under the held lock during attachment, hence these drivers are updated to use the locked functions. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 115 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 4 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 8 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 12 +++ include/linux/dma-buf.h| 20 ++-- 5 files changed, 110 insertions(+), 49 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 4556a12bd741..f2a5a122da4a 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -559,7 +559,7 @@ static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) * 2. Userspace passes this file-descriptors to all drivers it wants this buffer *to share with: First the file descriptor is converted to a _buf using *dma_buf_get(). Then the buffer is attached to the device using - *dma_buf_attach(). + *dma_buf_attach_unlocked(). * *Up to this stage the exporter is still free to migrate or reallocate the *backing storage. @@ -569,8 +569,8 @@ static struct file *dma_buf_getfile(struct dma_buf *dmabuf, int flags) *dma_buf_map_attachment() and dma_buf_unmap_attachment(). * * 4. Once a driver is done with a shared buffer it needs to call - *dma_buf_detach() (after cleaning up any mappings) and then release the - *reference acquired with dma_buf_get() by calling dma_buf_put(). + *dma_buf_detach_unlocked() (after cleaning up any mappings) and then + *release the reference acquired with dma_buf_get() by calling dma_buf_put(). * * For the detailed semantics exporters are expected to implement see * _buf_ops. @@ -802,7 +802,7 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach, * @importer_priv:[in]importer private pointer for the attachment * * Returns struct dma_buf_attachment pointer for this attachment. Attachments - * must be cleaned up by calling dma_buf_detach(). + * must be cleaned up by calling dma_buf_detach_unlocked(). * * Optionally this calls _buf_ops.attach to allow device-specific attach * functionality. @@ -858,8 +858,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, dma_buf_is_dynamic(dmabuf)) { struct sg_table *sgt; + dma_resv_lock(attach->dmabuf->resv, NULL); if (dma_buf_is_dynamic(attach->dmabuf)) { - dma_resv_lock(attach->dmabuf->resv, NULL); ret = dmabuf->ops->pin(attach); if (ret) goto err_unlock; @@ -872,8 +872,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, ret = PTR_ERR(sgt); goto err_unpin; } - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_unlock(attach->dmabuf->resv); + dma_resv_unlock(attach->dmabuf->resv); attach->sgt = sgt; attach->dir = DMA_BIDIRECTIONAL; } @@ -889,8 +888,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, dmabuf->ops->unpin(attach); err_unlock: - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_unlock(attach->dmabuf->resv); + dma_resv_unlock(attach->dmabuf->resv); dma_buf_detach_unlocked(dmabuf, attach); return ERR_PTR(ret); @@ -927,7 +925,7 @@ static void __unmap_dma_buf(struct dma_buf_attachment *attach, * @dmabuf: [in]buffer to detach from. * @attach: [in]attachment to be detached; is free'd after this call. * - * Clean up a device attachment obtained by calling dma_buf_attach(). + * Clean up a device attachment obtained by calling dma_buf_attach_unlocked(). * * Optionally this calls _buf_ops.detach for device-specific detach. */ @@ -937,21 +935,19 @@ void dma_buf_detach_unlocked(struct dma_buf *dmabuf, if (WARN_ON(!dmabuf || !attach)) return; + dma_resv_lock(attach->dmabuf->resv, NULL); + if (attach->sgt) { -
Re: [PATCH v3 5/9] dma-buf: Move dma_buf_mmap_unlocked() to dynamic locking specification
This should work, but I'm really wondering if this makes a difference for somebody. Anyway the approach is fine with me: Acked-by: Christian König Regards, Christian. Am 24.08.22 um 12:22 schrieb Dmitry Osipenko: Move dma_buf_mmap_unlocked() function to the dynamic locking specification by taking the reservation lock. Neither of the today's drivers take the reservation lock within the mmap() callback, hence it's safe to enforce the locking. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index f358af401360..4556a12bd741 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -1348,6 +1348,8 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_end_cpu_access, DMA_BUF); int dma_buf_mmap_unlocked(struct dma_buf *dmabuf, struct vm_area_struct *vma, unsigned long pgoff) { + int ret; + if (WARN_ON(!dmabuf || !vma)) return -EINVAL; @@ -1368,7 +1370,11 @@ int dma_buf_mmap_unlocked(struct dma_buf *dmabuf, struct vm_area_struct *vma, vma_set_file(vma, dmabuf->file); vma->vm_pgoff = pgoff; - return dmabuf->ops->mmap(dmabuf, vma); + dma_resv_lock(dmabuf->resv, NULL); + ret = dmabuf->ops->mmap(dmabuf, vma); + dma_resv_unlock(dmabuf->resv); + + return ret; } EXPORT_SYMBOL_NS_GPL(dma_buf_mmap_unlocked, DMA_BUF); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 18.08.22 um 01:13 schrieb Dmitry Osipenko: On 8/18/22 01:57, Dmitry Osipenko wrote: On 8/15/22 18:54, Dmitry Osipenko wrote: On 8/15/22 17:57, Dmitry Osipenko wrote: On 8/15/22 16:53, Christian König wrote: Am 15.08.22 um 15:45 schrieb Dmitry Osipenko: [SNIP] Well that comment sounds like KVM is doing the right thing, so I'm wondering what exactly is going on here. KVM actually doesn't hold the page reference, it takes the temporal reference during page fault and then drops the reference once page is mapped, IIUC. Is it still illegal for TTM? Or there is a possibility for a race condition here? Well the question is why does KVM grab the page reference in the first place? If that is to prevent the mapping from changing then yes that's illegal and won't work. It can always happen that you grab the address, solve the fault and then immediately fault again because the address you just grabbed is invalidated. If it's for some other reason than we should probably investigate if we shouldn't stop doing this. CC: +Paolo Bonzini who introduced this code commit add6a0cd1c5ba51b201e1361b05a5df817083618 Author: Paolo Bonzini Date: Tue Jun 7 17:51:18 2016 +0200 KVM: MMU: try to fix up page faults before giving up The vGPU folks would like to trap the first access to a BAR by setting vm_ops on the VMAs produced by mmap-ing a VFIO device. The fault handler then can use remap_pfn_range to place some non-reserved pages in the VMA. This kind of VM_PFNMAP mapping is not handled by KVM, but follow_pfn and fixup_user_fault together help supporting it. The patch also supports VM_MIXEDMAP vmas where the pfns are not reserved and thus subject to reference counting. @Paolo, https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F73e5ed8d-0d25-7d44-8fa2-e1d61b1f5a04%40amd.com%2FT%2F%23m7647ce5f8c4749599d2c6bc15a2b45f8d8cf8154data=05%7C01%7Cchristian.koenig%40amd.com%7Cecb0f0eb6c2d43afa15b08da80a625ff%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637963748360952327%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=52Dvisa7p%2BckmBxMvDJFScGSNy9JRPDdnPK05C%2F6n7A%3Dreserved=0 If we need to bump the refcount only for VM_MIXEDMAP and not for VM_PFNMAP, then perhaps we could add a flag for that to the kvm_main code that will denote to kvm_release_page_clean whether it needs to put the page? The other variant that kind of works is to mark TTM pages reserved using SetPageReserved/ClearPageReserved, telling KVM not to mess with the page struct. But the potential consequences of doing this are unclear to me. Christian, do you think we can do it? Although, no. It also doesn't work with KVM without additional changes to KVM. Well my fundamental problem is that I can't fit together why KVM is grabing a page reference in the first place. See the idea of the page reference is that you have one reference is that you count the reference so that the memory is not reused while you access it, e.g. for I/O or mapping it into different address spaces etc... But none of those use cases seem to apply to KVM. If I'm not totally mistaken in KVM you want to make sure that the address space mapping, e.g. the translation between virtual and physical address, don't change while you handle it, but grabbing a page reference is the completely wrong approach for that. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 15:45 schrieb Dmitry Osipenko: [SNIP] Well that comment sounds like KVM is doing the right thing, so I'm wondering what exactly is going on here. KVM actually doesn't hold the page reference, it takes the temporal reference during page fault and then drops the reference once page is mapped, IIUC. Is it still illegal for TTM? Or there is a possibility for a race condition here? Well the question is why does KVM grab the page reference in the first place? If that is to prevent the mapping from changing then yes that's illegal and won't work. It can always happen that you grab the address, solve the fault and then immediately fault again because the address you just grabbed is invalidated. If it's for some other reason than we should probably investigate if we shouldn't stop doing this. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 13:50 schrieb Dmitry Osipenko: On 8/15/22 14:28, Christian König wrote: Maybe it was discussed privately? In this case I will be happy to get more info from you about the root of the problem so I could start to look at how to fix it properly. It's not apparent where the problem is to a TTM newbie like me. Well this is completely unfixable. See the whole purpose of TTM is to allow tracing where what is mapped of a buffer object. If you circumvent that and increase the page reference yourself than that whole functionality can't work correctly any more. Are you suggesting that the problem is that TTM doesn't see the KVM page faults/mappings? Yes, and no. It's one of the issues, but there is more behind that (e.g. what happens when TTM switches from pages to local memory for backing a BO). If KVM page fault could reach TTM, then it should be able to relocate BO. I see now where is the problem, thanks. Although, I'm wondering whether it already works somehow.. I'll try to play with the the AMDGPU shrinker and see what will happen on guest mapping of a relocated BO. Well the page fault already somehow reaches TTM, otherwise the pfn couldn't be filled in in the first place. The issues is more that KVM should never ever grab a page reference to pages mapped with VM_IO or VM_PFNMAP. Essentially we need to apply the same restriction as with get_user_pages() here. Another question is why is KVM accessing the page structure in the first place? The VMA is mapped with VM_PFNMAP and VM_IO, KVM should never ever touch any of those pages. https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.19%2Fsource%2Fvirt%2Fkvm%2Fkvm_main.c%23L2549data=05%7C01%7Cchristian.koenig%40amd.com%7C2f38c27f20f842fc582a08da7eb4580d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637961610314049167%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=Pu5F1EF9UvDPdOQ7sjJ1WDRt5XpFZmAMXdkexnDpEmU%3Dreserved=0 Well that comment sounds like KVM is doing the right thing, so I'm wondering what exactly is going on here. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 13:19 schrieb Dmitry Osipenko: [SNIP] I'll try to dig out the older discussions, thank you for the quick reply! Are you sure it was really discussed in public previously? All I can find is yours two answers to a similar patches where you're saying that this it's a wrong solution without in-depth explanation and further discussions. Yeah, that's my problem as well I can't find that of hand. But yes it certainly was discussed in public. If it was only CC'd to dri-devel, then could be that emails didn't pass the spam moderation :/ That might be possible. Maybe it was discussed privately? In this case I will be happy to get more info from you about the root of the problem so I could start to look at how to fix it properly. It's not apparent where the problem is to a TTM newbie like me. Well this is completely unfixable. See the whole purpose of TTM is to allow tracing where what is mapped of a buffer object. If you circumvent that and increase the page reference yourself than that whole functionality can't work correctly any more. Are you suggesting that the problem is that TTM doesn't see the KVM page faults/mappings? Yes, and no. It's one of the issues, but there is more behind that (e.g. what happens when TTM switches from pages to local memory for backing a BO). Another question is why is KVM accessing the page structure in the first place? The VMA is mapped with VM_PFNMAP and VM_IO, KVM should never ever touch any of those pages. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 12:47 schrieb Dmitry Osipenko: On 8/15/22 13:18, Dmitry Osipenko wrote: On 8/15/22 13:14, Christian König wrote: Am 15.08.22 um 12:11 schrieb Christian König: Am 15.08.22 um 12:09 schrieb Dmitry Osipenko: On 8/15/22 13:05, Christian König wrote: Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: Higher order pages allocated using alloc_pages() aren't refcounted and they need to be refcounted, otherwise it's impossible to map them by KVM. This patch sets the refcount of the tail pages and fixes the KVM memory mapping faults. Without this change guest virgl driver can't map host buffers into guest and can't provide OpenGL 4.5 profile support to the guest. The host mappings are also needed for enabling the Venus driver using host GPU drivers that are utilizing TTM. Based on a patch proposed by Trigger Huang. Well I can't count how often I have repeated this: This is an absolutely clear NAK! TTM pages are not reference counted in the first place and because of this giving them to virgl is illegal. A? The first page is refcounted when allocated, the tail pages are not. No they aren't. The first page is just by coincident initialized with a refcount of 1. This refcount is completely ignored and not used at all. Incrementing the reference count and by this mapping the page into some other address space is illegal and corrupts the internal state tracking of TTM. See this comment in the source code as well: /* Don't set the __GFP_COMP flag for higher order allocations. * Mapping pages directly into an userspace process and calling * put_page() on a TTM allocated page is illegal. */ I have absolutely no idea how somebody had the idea he could do this. I saw this comment, but it doesn't make sense because it doesn't explain why it's illegal. Hence it looks like a bogus comment since the refcouting certainly works, at least to a some degree because I haven't noticed any problems in practice, maybe by luck :) I'll try to dig out the older discussions, thank you for the quick reply! Are you sure it was really discussed in public previously? All I can find is yours two answers to a similar patches where you're saying that this it's a wrong solution without in-depth explanation and further discussions. Yeah, that's my problem as well I can't find that of hand. But yes it certainly was discussed in public. Maybe it was discussed privately? In this case I will be happy to get more info from you about the root of the problem so I could start to look at how to fix it properly. It's not apparent where the problem is to a TTM newbie like me. Well this is completely unfixable. See the whole purpose of TTM is to allow tracing where what is mapped of a buffer object. If you circumvent that and increase the page reference yourself than that whole functionality can't work correctly any more. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 12:18 schrieb Dmitry Osipenko: On 8/15/22 13:14, Christian König wrote: Am 15.08.22 um 12:11 schrieb Christian König: Am 15.08.22 um 12:09 schrieb Dmitry Osipenko: On 8/15/22 13:05, Christian König wrote: Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: Higher order pages allocated using alloc_pages() aren't refcounted and they need to be refcounted, otherwise it's impossible to map them by KVM. This patch sets the refcount of the tail pages and fixes the KVM memory mapping faults. Without this change guest virgl driver can't map host buffers into guest and can't provide OpenGL 4.5 profile support to the guest. The host mappings are also needed for enabling the Venus driver using host GPU drivers that are utilizing TTM. Based on a patch proposed by Trigger Huang. Well I can't count how often I have repeated this: This is an absolutely clear NAK! TTM pages are not reference counted in the first place and because of this giving them to virgl is illegal. A? The first page is refcounted when allocated, the tail pages are not. No they aren't. The first page is just by coincident initialized with a refcount of 1. This refcount is completely ignored and not used at all. Incrementing the reference count and by this mapping the page into some other address space is illegal and corrupts the internal state tracking of TTM. See this comment in the source code as well: /* Don't set the __GFP_COMP flag for higher order allocations. * Mapping pages directly into an userspace process and calling * put_page() on a TTM allocated page is illegal. */ I have absolutely no idea how somebody had the idea he could do this. I saw this comment, but it doesn't make sense because it doesn't explain why it's illegal. Hence it looks like a bogus comment since the refcouting certainly works, at least to a some degree because I haven't noticed any problems in practice, maybe by luck :) Well exactly that's the problem. It does not work, you are just lucky :) I will provide a patch to set the reference count to zero even for non-compound pages. Maybe that will yield more backtrace to abusers of this interface. Regards, Christian. I'll try to dig out the older discussions, thank you for the quick reply! ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 12:11 schrieb Christian König: Am 15.08.22 um 12:09 schrieb Dmitry Osipenko: On 8/15/22 13:05, Christian König wrote: Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: Higher order pages allocated using alloc_pages() aren't refcounted and they need to be refcounted, otherwise it's impossible to map them by KVM. This patch sets the refcount of the tail pages and fixes the KVM memory mapping faults. Without this change guest virgl driver can't map host buffers into guest and can't provide OpenGL 4.5 profile support to the guest. The host mappings are also needed for enabling the Venus driver using host GPU drivers that are utilizing TTM. Based on a patch proposed by Trigger Huang. Well I can't count how often I have repeated this: This is an absolutely clear NAK! TTM pages are not reference counted in the first place and because of this giving them to virgl is illegal. A? The first page is refcounted when allocated, the tail pages are not. No they aren't. The first page is just by coincident initialized with a refcount of 1. This refcount is completely ignored and not used at all. Incrementing the reference count and by this mapping the page into some other address space is illegal and corrupts the internal state tracking of TTM. See this comment in the source code as well: /* Don't set the __GFP_COMP flag for higher order allocations. * Mapping pages directly into an userspace process and calling * put_page() on a TTM allocated page is illegal. */ I have absolutely no idea how somebody had the idea he could do this. Regards, Christian. Please immediately stop this completely broken approach. We have discussed this multiple times now. Could you please give me a link to these discussions? Not of hand, please search the dri-devel list for similar patches. This was brought up multiple times now. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 12:09 schrieb Dmitry Osipenko: On 8/15/22 13:05, Christian König wrote: Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: Higher order pages allocated using alloc_pages() aren't refcounted and they need to be refcounted, otherwise it's impossible to map them by KVM. This patch sets the refcount of the tail pages and fixes the KVM memory mapping faults. Without this change guest virgl driver can't map host buffers into guest and can't provide OpenGL 4.5 profile support to the guest. The host mappings are also needed for enabling the Venus driver using host GPU drivers that are utilizing TTM. Based on a patch proposed by Trigger Huang. Well I can't count how often I have repeated this: This is an absolutely clear NAK! TTM pages are not reference counted in the first place and because of this giving them to virgl is illegal. A? The first page is refcounted when allocated, the tail pages are not. No they aren't. The first page is just by coincident initialized with a refcount of 1. This refcount is completely ignored and not used at all. Incrementing the reference count and by this mapping the page into some other address space is illegal and corrupts the internal state tracking of TTM. Please immediately stop this completely broken approach. We have discussed this multiple times now. Could you please give me a link to these discussions? Not of hand, please search the dri-devel list for similar patches. This was brought up multiple times now. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1] drm/ttm: Refcount allocated tail pages
Am 15.08.22 um 11:54 schrieb Dmitry Osipenko: Higher order pages allocated using alloc_pages() aren't refcounted and they need to be refcounted, otherwise it's impossible to map them by KVM. This patch sets the refcount of the tail pages and fixes the KVM memory mapping faults. Without this change guest virgl driver can't map host buffers into guest and can't provide OpenGL 4.5 profile support to the guest. The host mappings are also needed for enabling the Venus driver using host GPU drivers that are utilizing TTM. Based on a patch proposed by Trigger Huang. Well I can't count how often I have repeated this: This is an absolutely clear NAK! TTM pages are not reference counted in the first place and because of this giving them to virgl is illegal. Please immediately stop this completely broken approach. We have discussed this multiple times now. Regards, Christian. Cc: sta...@vger.kernel.org Cc: Trigger Huang Link: https://www.collabora.com/news-and-blog/blog/2021/11/26/venus-on-qemu-enabling-new-virtual-vulkan-driver/#qcom1343 Tested-by: Dmitry Osipenko # AMDGPU (Qemu and crosvm) Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/ttm/ttm_pool.c | 25 - 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index 21b61631f73a..11e92bb149c9 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -81,6 +81,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS; struct ttm_pool_dma *dma; struct page *p; + unsigned int i; void *vaddr; /* Don't set the __GFP_COMP flag for higher order allocations. @@ -93,8 +94,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, if (!pool->use_dma_alloc) { p = alloc_pages(gfp_flags, order); - if (p) + if (p) { p->private = order; + goto ref_tail_pages; + } return p; } @@ -120,6 +123,23 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags, dma->vaddr = (unsigned long)vaddr | order; p->private = (unsigned long)dma; + +ref_tail_pages: + /* +* KVM requires mapped tail pages to be refcounted because put_page() +* is invoked on them in the end of the page fault handling, and thus, +* tail pages need to be protected from the premature releasing. +* In fact, KVM page fault handler refuses to map tail pages to guest +* if they aren't refcounted because hva_to_pfn_remapped() checks the +* refcount specifically for this case. +* +* In particular, unreferenced tail pages result in a KVM "Bad address" +* failure for VMMs that use VirtIO-GPU when guest's Mesa VirGL driver +* accesses mapped host TTM buffer that contains tail pages. +*/ + for (i = 1; i < 1 << order; i++) + page_ref_inc(p + i); + return p; error_free: @@ -133,6 +153,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, { unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS; struct ttm_pool_dma *dma; + unsigned int i; void *vaddr; #ifdef CONFIG_X86 @@ -142,6 +163,8 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching, if (caching != ttm_cached && !PageHighMem(p)) set_pages_wb(p, 1 << order); #endif + for (i = 1; i < 1 << order; i++) + page_ref_dec(p + i); if (!pool || !pool->use_dma_alloc) { __free_pages(p, order); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [Linaro-mm-sig] [PATCH v2 3/5] dma-buf: Move all dma-bufs to dynamic locking specification
Am 10.08.22 um 19:49 schrieb Dmitry Osipenko: On 8/10/22 14:30, Christian König wrote: Am 25.07.22 um 17:18 schrieb Dmitry Osipenko: This patch moves the non-dynamic dma-buf users over to the dynamic locking specification. The strict locking convention prevents deadlock situation for dma-buf importers and exporters. Previously the "unlocked" versions of the dma-buf API functions weren't taking the reservation lock and this patch makes them to take the lock. Intel and AMD GPU drivers already were mapping imported dma-bufs under the held lock, hence the "locked" variant of the functions are added for them and the drivers are updated to use the "locked" versions. In general "Yes, please", but that won't be that easy. You not only need to change amdgpu and i915, but all drivers implementing the map_dma_buf(), unmap_dma_buf() callbacks. Auditing all that code is a huge bunch of work. Hm, neither of drivers take the resv lock in map_dma_buf/unmap_dma_buf. It's easy to audit them all and I did it. So either I'm missing something or it doesn't take much time to check them all. Am I really missing something? Ok, so this is only changing map/unmap now? In this case please separate this from the documentation change. I would also drop the _locked postfix from the function name, just having _unlocked on all functions which are supposed to be called with the lock held should be sufficient. Thanks for looking into this, Christian. https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2FA%2Fident%2Fmap_dma_bufdata=05%7C01%7Cchristian.koenig%40amd.com%7C70fd52d0a82a477bfbfe08da7af8bec7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637957506041914442%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K47uCULsoiURjze0H0ksUa4vzJ%2BxqgoShH9106FvyyA%3Dreserved=0 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v8 1/2] drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
Am 10.08.22 um 10:33 schrieb Daniel Vetter: On Wed, 10 Aug 2022 at 08:52, Christian König wrote: Am 09.08.22 um 18:44 schrieb Daniel Vetter: On Tue, Jul 05, 2022 at 01:33:51PM +0200, Christian König wrote: Am 01.07.22 um 11:02 schrieb Dmitry Osipenko: Use ww_acquire_fini() in the error code paths. Otherwise lockdep thinks that lock is held when lock's memory is freed after the drm_gem_lock_reservations() error. The ww_acquire_context needs to be annotated as "released", which fixes the noisy "WARNING: held lock freed!" splat of VirtIO-GPU driver with CONFIG_DEBUG_MUTEXES=y and enabled lockdep. Cc: sta...@vger.kernel.org Fixes: 7edc3e3b975b5 ("drm: Add helpers for locking an array of BO reservations.") Reviewed-by: Thomas Hellström Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König Also added this r-b tag when merging to drm-misc-next-fixes. IIRC I've already pushed this to drm-misc-fixes with a CC stable tag about 2 weeks ago. Please double check, it probably just hasn't come down the stream again yet. Hm quickly check and I didn't spot it? There's a few patches from Dmitry in the last few pulls, and some more stuff pending, but not these two afaics? Mhm, there is some potential that I wanted to push it but got distracted by the re-occurring drm-tip build breakages. Anyway what I wanted to say is that this stuff should probably go to drm-misc-fixes with a CC: stable tag :) Christian. -Daniel Christian. -Daniel --- drivers/gpu/drm/drm_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index eb0c2d041f13..86d670c71286 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1226,7 +1226,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count, ret = dma_resv_lock_slow_interruptible(obj->resv, acquire_ctx); if (ret) { - ww_acquire_done(acquire_ctx); + ww_acquire_fini(acquire_ctx); return ret; } } @@ -1251,7 +1251,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count, goto retry; } - ww_acquire_done(acquire_ctx); + ww_acquire_fini(acquire_ctx); return ret; } } ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v8 1/2] drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
Am 09.08.22 um 18:44 schrieb Daniel Vetter: On Tue, Jul 05, 2022 at 01:33:51PM +0200, Christian König wrote: Am 01.07.22 um 11:02 schrieb Dmitry Osipenko: Use ww_acquire_fini() in the error code paths. Otherwise lockdep thinks that lock is held when lock's memory is freed after the drm_gem_lock_reservations() error. The ww_acquire_context needs to be annotated as "released", which fixes the noisy "WARNING: held lock freed!" splat of VirtIO-GPU driver with CONFIG_DEBUG_MUTEXES=y and enabled lockdep. Cc: sta...@vger.kernel.org Fixes: 7edc3e3b975b5 ("drm: Add helpers for locking an array of BO reservations.") Reviewed-by: Thomas Hellström Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König Also added this r-b tag when merging to drm-misc-next-fixes. IIRC I've already pushed this to drm-misc-fixes with a CC stable tag about 2 weeks ago. Please double check, it probably just hasn't come down the stream again yet. Christian. -Daniel --- drivers/gpu/drm/drm_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index eb0c2d041f13..86d670c71286 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1226,7 +1226,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count, ret = dma_resv_lock_slow_interruptible(obj->resv, acquire_ctx); if (ret) { - ww_acquire_done(acquire_ctx); + ww_acquire_fini(acquire_ctx); return ret; } } @@ -1251,7 +1251,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count, goto retry; } - ww_acquire_done(acquire_ctx); + ww_acquire_fini(acquire_ctx); return ret; } } ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1 4/6] dma-buf: Acquire wait-wound context on attachment
Am 19.07.22 um 22:05 schrieb Dmitry Osipenko: On 7/15/22 09:59, Dmitry Osipenko wrote: On 7/15/22 09:50, Christian König wrote: Am 15.07.22 um 02:52 schrieb Dmitry Osipenko: Intel i915 GPU driver uses wait-wound mutex to lock multiple GEMs on the attachment to the i915 dma-buf. In order to let all drivers utilize shared wait-wound context during attachment in a general way, make dma-buf core to acquire the ww context internally for the attachment operation and update i915 driver to use the importer's ww context instead of the internal one. From now on all dma-buf exporters shall use the importer's ww context for the attachment operation. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 8 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 6 ++--- drivers/gpu/drm/i915/i915_gem_evict.c | 2 +- drivers/gpu/drm/i915/i915_gem_ww.c | 26 +++ drivers/gpu/drm/i915/i915_gem_ww.h | 15 +-- 7 files changed, 47 insertions(+), 14 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 0ee588276534..37545ecb845a 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -807,6 +807,8 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach, * Optionally this calls _buf_ops.attach to allow device-specific attach * functionality. * + * Exporters shall use ww_ctx acquired by this function. + * * Returns: * * A pointer to newly created _buf_attachment on success, or a negative @@ -822,6 +824,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, void *importer_priv) { struct dma_buf_attachment *attach; + struct ww_acquire_ctx ww_ctx; int ret; if (WARN_ON(!dmabuf || !dev)) @@ -841,7 +844,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, attach->importer_ops = importer_ops; attach->importer_priv = importer_priv; - dma_resv_lock(dmabuf->resv, NULL); + ww_acquire_init(_ctx, _ww_class); + dma_resv_lock(dmabuf->resv, _ctx); That won't work like this. The core property of a WW context is that you need to unwind all the locks and re-quire them with the contended one first. When you statically lock the imported one here you can't do that any more. You're right. I felt that something is missing here, but couldn't notice. I'll think more about this and enable CONFIG_DEBUG_WW_MUTEX_SLOWPATH. Thank you! Christian, do you think we could make an excuse for the attach() callback and make the exporter responsible for taking the resv lock? It will be inconsistent with the rest of the callbacks, where importer takes the lock, but it will be the simplest and least invasive solution. It's very messy to do a cross-driver ww locking, I don't think it's the right approach. So to summarize the following calls will require that the caller hold the resv lock: 1. dma_buf_pin()/dma_buf_unpin() 2. dma_buf_map_attachment()/dma_buf_unmap_attachment() 3. dma_buf_vmap()/dma_buf_vunmap() 4. dma_buf_move_notify() The following calls require that caller does not held the resv lock: 1. dma_buf_attach()/dma_buf_dynamic_attach()/dma_buf_detach() 2. dma_buf_export()/dma_buf_fd() 3. dma_buf_get()/dma_buf_put() 4. dma_buf_begin_cpu_access()/dma_buf_end_cpu_access() If that's correct than that would work for me as well, but we should probably document this. Or let me ask the other way around: What calls exactly do you need to change to solve your original issue? That was vmap/vunmap, wasn't it? If yes then let's concentrate on those for the moment. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1 1/6] dma-buf: Add _unlocked postfix to function names
Am 15.07.22 um 11:31 schrieb Dmitry Osipenko: On 7/15/22 10:19, Christian König wrote: -struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, - enum dma_data_direction direction) +struct sg_table * +dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach, + enum dma_data_direction direction) The locking state of mapping and unmapping operations depend on if the attachment is dynamic or not. So this here is not a good idea at all since it suggests that the function is always called without holding the lock. I had the same thought while was working on this patch and initially was thinking about adding an "unlocked" alias to dma_buf_map_attachment(). In the end I decided that it will create even more confusion and it's simpler just to rename this func here since there are only two drivers using the dynamic mapping. Do you have suggestions how to improve it? Just keep it as it is. The ultimative goal is to switch all drivers over to the dynamic mapping or at least over to assume that the map/unmap callbacks are called with the lock held. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v1 2/6] drm/gem: Take reservation lock for vmap/vunmap operations
Am 15.07.22 um 02:52 schrieb Dmitry Osipenko: The new common dma-buf locking convention will require buffer importers to hold the reservation lock around mapping operations. Make DRM GEM core to take the lock around the vmapping operations and update QXL and i915 drivers to use the locked functions for the case where DRM core now holds the lock. This patch prepares DRM core and drivers to transition to the common dma-buf locking convention where vmapping of exported GEMs will be done under the held reservation lock. Oh ^^ That looks like a bug fix to me! At least drm_gem_ttm_vmap() and drm_gem_ttm_vunmap() already expected that they are called with the reservation lock held. Otherwise you could mess up internal structures in the TTM buffer object while vmapping it. I will take a deeper look at this. Regards, Christian. Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/drm_client.c | 4 +-- drivers/gpu/drm/drm_gem.c| 28 drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 ++--- drivers/gpu/drm/drm_prime.c | 4 +-- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/qxl/qxl_object.c | 17 ++-- drivers/gpu/drm/qxl/qxl_prime.c | 4 +-- include/drm/drm_gem.h| 3 +++ 8 files changed, 50 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c index 2b230b4d6942..fbcb1e995384 100644 --- a/drivers/gpu/drm/drm_client.c +++ b/drivers/gpu/drm/drm_client.c @@ -323,7 +323,7 @@ drm_client_buffer_vmap(struct drm_client_buffer *buffer, * fd_install step out of the driver backend hooks, to make that * final step optional for internal users. */ - ret = drm_gem_vmap(buffer->gem, map); + ret = drm_gem_vmap_unlocked(buffer->gem, map); if (ret) return ret; @@ -345,7 +345,7 @@ void drm_client_buffer_vunmap(struct drm_client_buffer *buffer) { struct iosys_map *map = >map; - drm_gem_vunmap(buffer->gem, map); + drm_gem_vunmap_unlocked(buffer->gem, map); } EXPORT_SYMBOL(drm_client_buffer_vunmap); diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index eb0c2d041f13..9769c33cad99 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1155,6 +1155,8 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent, int drm_gem_pin(struct drm_gem_object *obj) { + dma_resv_assert_held(obj->resv); + if (obj->funcs->pin) return obj->funcs->pin(obj); else @@ -1163,6 +1165,8 @@ int drm_gem_pin(struct drm_gem_object *obj) void drm_gem_unpin(struct drm_gem_object *obj) { + dma_resv_assert_held(obj->resv); + if (obj->funcs->unpin) obj->funcs->unpin(obj); } @@ -1171,6 +1175,8 @@ int drm_gem_vmap(struct drm_gem_object *obj, struct iosys_map *map) { int ret; + dma_resv_assert_held(obj->resv); + if (!obj->funcs->vmap) return -EOPNOTSUPP; @@ -1186,6 +1192,8 @@ EXPORT_SYMBOL(drm_gem_vmap); void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map) { + dma_resv_assert_held(obj->resv); + if (iosys_map_is_null(map)) return; @@ -1197,6 +1205,26 @@ void drm_gem_vunmap(struct drm_gem_object *obj, struct iosys_map *map) } EXPORT_SYMBOL(drm_gem_vunmap); +int drm_gem_vmap_unlocked(struct drm_gem_object *obj, struct iosys_map *map) +{ + int ret; + + dma_resv_lock(obj->resv, NULL); + ret = drm_gem_vmap(obj, map); + dma_resv_unlock(obj->resv); + + return ret; +} +EXPORT_SYMBOL(drm_gem_vmap_unlocked); + +void drm_gem_vunmap_unlocked(struct drm_gem_object *obj, struct iosys_map *map) +{ + dma_resv_lock(obj->resv, NULL); + drm_gem_vunmap(obj, map); + dma_resv_unlock(obj->resv); +} +EXPORT_SYMBOL(drm_gem_vunmap_unlocked); + /** * drm_gem_lock_reservations - Sets up the ww context and acquires * the lock on an array of GEM objects. diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c index 880a4975507f..e35e224e6303 100644 --- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c +++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c @@ -354,7 +354,7 @@ int drm_gem_fb_vmap(struct drm_framebuffer *fb, struct iosys_map *map, ret = -EINVAL; goto err_drm_gem_vunmap; } - ret = drm_gem_vmap(obj, [i]); + ret = drm_gem_vmap_unlocked(obj, [i]); if (ret) goto err_drm_gem_vunmap; } @@ -376,7 +376,7 @@ int drm_gem_fb_vmap(struct drm_framebuffer *fb, struct iosys_map *map, obj = drm_gem_fb_get_obj(fb, i); if (!obj) continue; -
Re: [PATCH v1 1/6] dma-buf: Add _unlocked postfix to function names
Am 15.07.22 um 02:52 schrieb Dmitry Osipenko: Add _unlocked postfix to the dma-buf API function names in a preparation to move all non-dynamic dma-buf users over to the dynamic locking specification. This patch only renames API functions, preparing drivers to the common locking convention. Later on we will make the "unlocked" functions to take the reservation lock. Suggested-by: Christian König Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 76 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 +- drivers/gpu/drm/armada/armada_gem.c | 14 ++-- drivers/gpu/drm/drm_gem_cma_helper.c | 6 +- drivers/gpu/drm/drm_gem_shmem_helper.c| 6 +- drivers/gpu/drm/drm_prime.c | 12 +-- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 6 +- drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 12 +-- .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 20 ++--- drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 8 +- drivers/gpu/drm/tegra/gem.c | 27 +++ drivers/infiniband/core/umem_dmabuf.c | 11 +-- .../common/videobuf2/videobuf2-dma-contig.c | 15 ++-- .../media/common/videobuf2/videobuf2-dma-sg.c | 12 +-- .../common/videobuf2/videobuf2-vmalloc.c | 6 +- .../platform/nvidia/tegra-vde/dmabuf-cache.c | 12 +-- drivers/misc/fastrpc.c| 12 +-- drivers/xen/gntdev-dmabuf.c | 14 ++-- include/linux/dma-buf.h | 34 + 21 files changed, 161 insertions(+), 152 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 44574fbe7482..d16237a6ffaa 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -795,7 +795,7 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach, } /** - * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list + * dma_buf_dynamic_attach_unlocked - Add the device to dma_buf's attachments list * @dmabuf: [in]buffer to attach device to. * @dev: [in]device to be attached. * @importer_ops: [in]importer operations for the attachment @@ -817,9 +817,9 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach, * indicated with the error code -EBUSY. */ struct dma_buf_attachment * -dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, - const struct dma_buf_attach_ops *importer_ops, - void *importer_priv) +dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, + const struct dma_buf_attach_ops *importer_ops, + void *importer_priv) { struct dma_buf_attachment *attach; int ret; @@ -892,25 +892,25 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, if (dma_buf_is_dynamic(attach->dmabuf)) dma_resv_unlock(attach->dmabuf->resv); - dma_buf_detach(dmabuf, attach); + dma_buf_detach_unlocked(dmabuf, attach); return ERR_PTR(ret); } -EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach_unlocked, DMA_BUF); /** - * dma_buf_attach - Wrapper for dma_buf_dynamic_attach + * dma_buf_attach_unlocked - Wrapper for dma_buf_dynamic_attach * @dmabuf: [in]buffer to attach device to. * @dev: [in]device to be attached. * - * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static - * mapping. + * Wrapper to call dma_buf_dynamic_attach_unlocked() for drivers which still + * use a static mapping. */ -struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, - struct device *dev) +struct dma_buf_attachment *dma_buf_attach_unlocked(struct dma_buf *dmabuf, + struct device *dev) { - return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); + return dma_buf_dynamic_attach_unlocked(dmabuf, dev, NULL, NULL); } -EXPORT_SYMBOL_NS_GPL(dma_buf_attach, DMA_BUF); +EXPORT_SYMBOL_NS_GPL(dma_buf_attach_unlocked, DMA_BUF); static void __unmap_dma_buf(struct dma_buf_attachment *attach, struct sg_table *sg_table, @@ -923,7 +923,7 @@ static void __unmap_dma_buf(struct dma_buf_attachment *attach, } /** - * dma_buf_detach - Remove the given attachment from dmabuf's attachments list + * dma_buf_detach_unlocked - Remove the given attachment from dmabuf's attachments list * @dmabuf: [in]buffer to detach from. * @attach: [in]attachment to be detached; is free'd after this call. * @@ -931,7 +931,8 @@ static void __unmap_dma_buf(struct dma_buf_attachment *attach, * * Optionally
Re: [PATCH v1 4/6] dma-buf: Acquire wait-wound context on attachment
Am 15.07.22 um 02:52 schrieb Dmitry Osipenko: Intel i915 GPU driver uses wait-wound mutex to lock multiple GEMs on the attachment to the i915 dma-buf. In order to let all drivers utilize shared wait-wound context during attachment in a general way, make dma-buf core to acquire the ww context internally for the attachment operation and update i915 driver to use the importer's ww context instead of the internal one. From now on all dma-buf exporters shall use the importer's ww context for the attachment operation. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 8 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.h| 6 ++--- drivers/gpu/drm/i915/i915_gem_evict.c | 2 +- drivers/gpu/drm/i915/i915_gem_ww.c| 26 +++ drivers/gpu/drm/i915/i915_gem_ww.h| 15 +-- 7 files changed, 47 insertions(+), 14 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 0ee588276534..37545ecb845a 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -807,6 +807,8 @@ static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach, * Optionally this calls _buf_ops.attach to allow device-specific attach * functionality. * + * Exporters shall use ww_ctx acquired by this function. + * * Returns: * * A pointer to newly created _buf_attachment on success, or a negative @@ -822,6 +824,7 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, void *importer_priv) { struct dma_buf_attachment *attach; + struct ww_acquire_ctx ww_ctx; int ret; if (WARN_ON(!dmabuf || !dev)) @@ -841,7 +844,8 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, attach->importer_ops = importer_ops; attach->importer_priv = importer_priv; - dma_resv_lock(dmabuf->resv, NULL); + ww_acquire_init(_ctx, _ww_class); + dma_resv_lock(dmabuf->resv, _ctx); That won't work like this. The core property of a WW context is that you need to unwind all the locks and re-quire them with the contended one first. When you statically lock the imported one here you can't do that any more. Regards, Christian. if (dmabuf->ops->attach) { ret = dmabuf->ops->attach(dmabuf, attach); @@ -876,11 +880,13 @@ dma_buf_dynamic_attach_unlocked(struct dma_buf *dmabuf, struct device *dev, } dma_resv_unlock(dmabuf->resv); + ww_acquire_fini(_ctx); return attach; err_attach: dma_resv_unlock(attach->dmabuf->resv); + ww_acquire_fini(_ctx); kfree(attach); return ERR_PTR(ret); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index c199bf71c373..9173f0232b16 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -173,7 +173,7 @@ static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf, if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM)) return -EOPNOTSUPP; - for_i915_gem_ww(, err, true) { + for_i915_dmabuf_ww(, dmabuf, err, true) { err = i915_gem_object_migrate(obj, , INTEL_REGION_SMEM); if (err) continue; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 30fe847c6664..ad7d602fc43a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -3409,7 +3409,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, goto err_vma; } - ww_acquire_done(); + ww_acquire_done(eb.ww.ctx); eb_capture_stage(); out_fence = eb_requests_create(, in_fence, out_fence_fd); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index e11d82a9f7c3..5ae38f94a5c7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -178,9 +178,9 @@ static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj, int ret; if (intr) - ret = dma_resv_lock_interruptible(obj->base.resv, ww ? >ctx : NULL); + ret = dma_resv_lock_interruptible(obj->base.resv, ww ? ww->ctx : NULL); else - ret = dma_resv_lock(obj->base.resv, ww ? >ctx : NULL); + ret = dma_resv_lock(obj->base.resv, ww ? ww->ctx : NULL); if (!ret && ww) { i915_gem_object_get(obj); @@ -216,7 +216,7 @@ static inline bool i915_gem_object_trylock(struct drm_i915_gem_object *obj, if (!ww) return dma_resv_trylock(obj->base.resv); else - return
Re: [PATCH v8 2/2] drm/gem: Don't map imported GEMs
Am 01.07.22 um 11:02 schrieb Dmitry Osipenko: Drivers that use drm_gem_mmap() and drm_gem_mmap_obj() helpers don't handle imported dma-bufs properly, which results in mapping of something else than the imported dma-buf. On NVIDIA Tegra we get a hard lockup when userspace writes to the memory mapping of a dma-buf that was imported into Tegra's DRM GEM. Majority of DRM drivers prohibit mapping of the imported GEM objects. Mapping of imported GEMs require special care from userspace since it should sync dma-buf because mapping coherency of the exporter device may not match the DRM device. Let's prohibit the mapping for all DRM drivers for consistency. Suggested-by: Thomas Hellström Signed-off-by: Dmitry Osipenko I'm pretty sure that this is the right approach, but it's certainly more than possible that somebody abused this already. Anyway patch is Reviewed-by: Christian König since you are really fixing a major stability problem here. Regards, Christian. --- drivers/gpu/drm/drm_gem.c | 4 drivers/gpu/drm/drm_gem_shmem_helper.c | 9 - 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 86d670c71286..fc9ec42fa0ab 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1034,6 +1034,10 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size, { int ret; + /* Don't allow imported objects to be mapped */ + if (obj->import_attach) + return -EINVAL; + /* Check for valid size. */ if (obj_size < vma->vm_end - vma->vm_start) return -EINVAL; diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c index 8ad0e02991ca..6190f5018986 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -609,17 +609,8 @@ EXPORT_SYMBOL_GPL(drm_gem_shmem_vm_ops); */ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct *vma) { - struct drm_gem_object *obj = >base; int ret; - if (obj->import_attach) { - /* Drop the reference drm_gem_mmap_obj() acquired.*/ - drm_gem_object_put(obj); - vma->vm_private_data = NULL; - - return dma_buf_mmap(obj->dma_buf, vma, 0); - } - ret = drm_gem_shmem_get_pages(shmem); if (ret) { drm_gem_vm_close(vma); ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v8 1/2] drm/gem: Properly annotate WW context on drm_gem_lock_reservations() error
Am 01.07.22 um 11:02 schrieb Dmitry Osipenko: Use ww_acquire_fini() in the error code paths. Otherwise lockdep thinks that lock is held when lock's memory is freed after the drm_gem_lock_reservations() error. The ww_acquire_context needs to be annotated as "released", which fixes the noisy "WARNING: held lock freed!" splat of VirtIO-GPU driver with CONFIG_DEBUG_MUTEXES=y and enabled lockdep. Cc: sta...@vger.kernel.org Fixes: 7edc3e3b975b5 ("drm: Add helpers for locking an array of BO reservations.") Reviewed-by: Thomas Hellström Signed-off-by: Dmitry Osipenko Reviewed-by: Christian König --- drivers/gpu/drm/drm_gem.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index eb0c2d041f13..86d670c71286 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1226,7 +1226,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count, ret = dma_resv_lock_slow_interruptible(obj->resv, acquire_ctx); if (ret) { - ww_acquire_done(acquire_ctx); + ww_acquire_fini(acquire_ctx); return ret; } } @@ -1251,7 +1251,7 @@ drm_gem_lock_reservations(struct drm_gem_object **objs, int count, goto retry; } - ww_acquire_done(acquire_ctx); + ww_acquire_fini(acquire_ctx); return ret; } } ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [Linaro-mm-sig] Re: [PATCH v6 02/22] drm/gem: Move mapping of imported dma-bufs to drm_gem_mmap_obj()
Am 30.06.22 um 01:06 schrieb Dmitry Osipenko: On 6/29/22 11:43, Thomas Hellström (Intel) wrote: On 6/29/22 10:22, Dmitry Osipenko wrote: On 6/29/22 09:40, Thomas Hellström (Intel) wrote: On 5/27/22 01:50, Dmitry Osipenko wrote: Drivers that use drm_gem_mmap() and drm_gem_mmap_obj() helpers don't handle imported dma-bufs properly, which results in mapping of something else than the imported dma-buf. For example, on NVIDIA Tegra we get a hard lockup when userspace writes to the memory mapping of a dma-buf that was imported into Tegra's DRM GEM. To fix this bug, move mapping of imported dma-bufs to drm_gem_mmap_obj(). Now mmaping of imported dma-bufs works properly for all DRM drivers. Same comment about Fixes: as in patch 1, Cc: sta...@vger.kernel.org Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/drm_gem.c | 3 +++ drivers/gpu/drm/drm_gem_shmem_helper.c | 9 - drivers/gpu/drm/tegra/gem.c | 4 3 files changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 86d670c71286..7c0b025508e4 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1038,6 +1038,9 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size, if (obj_size < vma->vm_end - vma->vm_start) return -EINVAL; + if (obj->import_attach) + return dma_buf_mmap(obj->dma_buf, vma, 0); If we start enabling mmaping of imported dma-bufs on a majority of drivers in this way, how do we ensure that user-space is not blindly using the object mmap without calling the needed DMA_BUF_IOCTL_SYNC which is needed before and after cpu access of mmap'ed dma-bufs? I was under the impression (admittedly without looking) that the few drivers that actually called into dma_buf_mmap() had some private user-mode driver code in place that ensured this happened. Since it's a userspace who does the mapping, then it should be a responsibility of userspace to do all the necessary syncing. Sure, but nothing prohibits user-space to ignore the syncing thinking "It works anyway", testing those drivers where the syncing is a NOP. And when a driver that finally needs syncing is tested it's too late to fix all broken user-space. I'm not sure whether anyone in userspace really needs to map imported dma-bufs in practice. Nevertheless, this use-case is broken and should be fixed by either allowing to do the mapping or prohibiting it. Then I'd vote for prohibiting it, at least for now. And for the future moving forward we could perhaps revisit the dma-buf need for syncing, requiring those drivers that actually need it to implement emulated coherent memory which can be done not too inefficiently (vmwgfx being one example). Alright, I'll change it to prohibit the mapping. This indeed should be a better option. Oh, yes please. But I would expect that some people start screaming. Over time I've got tons of TTM patches because people illegally tried to mmap() imported DMA-bufs in their driver. Anyway this is probably the right thing to do and we can work on fixing the fallout later on. Regards, Christian. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v6 14/22] dma-buf: Introduce new locking convention
Hi Dmitry, Am 30.05.22 um 15:26 schrieb Dmitry Osipenko: Hello Christian, On 5/30/22 09:50, Christian König wrote: Hi Dmitry, First of all please separate out this patch from the rest of the series, since this is a complex separate structural change. I assume all the patches will go via the DRM tree in the end since the rest of the DRM patches in this series depend on this dma-buf change. But I see that separation may ease reviewing of the dma-buf changes, so let's try it. That sounds like you are underestimating a bit how much trouble this will be. I have tried this before and failed because catching all the locks in the right code paths are very tricky. So expect some fallout from this and make sure the kernel test robot and CI systems are clean. Sure, I'll fix up all the reported things in the next iteration. BTW, have you ever posted yours version of the patch? Will be great if we could compare the changed code paths. No, I never even finished creating it after realizing how much work it would be. This patch introduces new locking convention for dma-buf users. From now on all dma-buf importers are responsible for holding dma-buf reservation lock around operations performed over dma-bufs. This patch implements the new dma-buf locking convention by: 1. Making dma-buf API functions to take the reservation lock. 2. Adding new locked variants of the dma-buf API functions for drivers that need to manage imported dma-bufs under the held lock. Instead of adding new locked variants please mark all variants which expect to be called without a lock with an _unlocked postfix. This should make it easier to remove those in a follow up patch set and then fully move the locking into the importer. Do we really want to move all the locks to the importers? Seems the majority of drivers should be happy with the dma-buf helpers handling the locking for them. Yes, I clearly think so. 3. Converting all drivers to the new locking scheme. I have strong doubts that you got all of them. At least radeon and nouveau should grab the reservation lock in their ->attach callbacks somehow. Radeon and Nouveau use gem_prime_import_sg_table() and they take resv lock already, seems they should be okay (?) You are looking at the wrong side. You need to fix the export code path, not the import ones. See for example attach on radeon works like this drm_gem_map_attach->drm_gem_pin->radeon_gem_prime_pin->radeon_bo_reserve->ttm_bo_reserve->dma_resv_lock. Same for nouveau and probably a few other exporters as well. That will certainly cause a deadlock if you don't fix it. I strongly suggest to do this step by step, first attach/detach and then the rest. Regards, Christian. I assume all the basics should covered in this v6. At minimum Intel, Tegra, Panfrost, Lima and Rockchip drivers should be good. If I missed something, then please let me know and I'll correct it. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 270 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +- drivers/gpu/drm/drm_client.c | 4 +- drivers/gpu/drm/drm_gem.c | 33 +++ drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 10 +- drivers/gpu/drm/qxl/qxl_object.c | 17 +- drivers/gpu/drm/qxl/qxl_prime.c | 4 +- .../common/videobuf2/videobuf2-dma-contig.c | 11 +- .../media/common/videobuf2/videobuf2-dma-sg.c | 11 +- .../common/videobuf2/videobuf2-vmalloc.c | 11 +- include/drm/drm_gem.h | 3 + include/linux/dma-buf.h | 14 +- 13 files changed, 241 insertions(+), 159 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 32f55640890c..64a9909ccfa2 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -552,7 +552,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) file->f_mode |= FMODE_LSEEK; dmabuf->file = file; - mutex_init(>lock); Please make removing dmabuf->lock a separate change. Alright ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v6 14/22] dma-buf: Introduce new locking convention
Hi Dmitry, First of all please separate out this patch from the rest of the series, since this is a complex separate structural change. Am 27.05.22 um 01:50 schrieb Dmitry Osipenko: All dma-bufs have dma-reservation lock that allows drivers to perform exclusive operations over shared dma-bufs. Today's dma-buf API has incomplete locking specification, which creates dead lock situation for dma-buf importers and exporters that don't coordinate theirs locks. Well please drop that sentence. The locking specifications are actually very well defined, it's just that some drivers are a bit broken regarding them. What you do here is rather moving all the non-dynamic drivers over to the dynamic locking specification (which is really nice to have). I have tried this before and failed because catching all the locks in the right code paths are very tricky. So expect some fallout from this and make sure the kernel test robot and CI systems are clean. This patch introduces new locking convention for dma-buf users. From now on all dma-buf importers are responsible for holding dma-buf reservation lock around operations performed over dma-bufs. This patch implements the new dma-buf locking convention by: 1. Making dma-buf API functions to take the reservation lock. 2. Adding new locked variants of the dma-buf API functions for drivers that need to manage imported dma-bufs under the held lock. Instead of adding new locked variants please mark all variants which expect to be called without a lock with an _unlocked postfix. This should make it easier to remove those in a follow up patch set and then fully move the locking into the importer. 3. Converting all drivers to the new locking scheme. I have strong doubts that you got all of them. At least radeon and nouveau should grab the reservation lock in their ->attach callbacks somehow. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 270 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +- drivers/gpu/drm/drm_client.c | 4 +- drivers/gpu/drm/drm_gem.c | 33 +++ drivers/gpu/drm/drm_gem_framebuffer_helper.c | 6 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 10 +- drivers/gpu/drm/qxl/qxl_object.c | 17 +- drivers/gpu/drm/qxl/qxl_prime.c | 4 +- .../common/videobuf2/videobuf2-dma-contig.c | 11 +- .../media/common/videobuf2/videobuf2-dma-sg.c | 11 +- .../common/videobuf2/videobuf2-vmalloc.c | 11 +- include/drm/drm_gem.h | 3 + include/linux/dma-buf.h | 14 +- 13 files changed, 241 insertions(+), 159 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 32f55640890c..64a9909ccfa2 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -552,7 +552,6 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) file->f_mode |= FMODE_LSEEK; dmabuf->file = file; - mutex_init(>lock); Please make removing dmabuf->lock a separate change. Regards, Christian. INIT_LIST_HEAD(>attachments); mutex_lock(_list.lock); @@ -737,14 +736,14 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, attach->importer_ops = importer_ops; attach->importer_priv = importer_priv; 3. Converting all drivers to the new locking scheme. Signed-off-by: Dmitry Osipenko --- drivers/dma-buf/dma-buf.c | 270 +++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 6 +- drivers/gpu/drm/drm_client.c | 4 +- + dma_resv_lock(dmabuf->resv, NULL); + if (dmabuf->ops->attach) { ret = dmabuf->ops->attach(dmabuf, attach); if (ret) goto err_attach; } - dma_resv_lock(dmabuf->resv, NULL); list_add(>node, >attachments); - dma_resv_unlock(dmabuf->resv); /* When either the importer or the exporter can't handle dynamic * mappings we cache the mapping here to avoid issues with the @@ -755,7 +754,6 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, struct sg_table *sgt; if (dma_buf_is_dynamic(attach->dmabuf)) { - dma_resv_lock(attach->dmabuf->resv, NULL); ret = dmabuf->ops->pin(attach); if (ret) goto err_unlock; @@ -768,15 +766,16 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, ret = PTR_ERR(sgt); goto err_unpin; } - if (dma_buf_is_dynamic(attach->dmabuf)) - dma_resv_unlock(attach->dmabuf->resv); attach->sgt = sgt; attach->dir = DMA_BIDIRECTIONAL; } +
Re: [PATCH v2] drm/qxl: fix qxl can't use in arm64
Am 24.03.22 um 11:49 schrieb Cong Liu: qxl use ioremap to map ram_header and rom, in the arm64 implementation, the device is mapped as DEVICE_nGnRE, it can not support unaligned access. and qxl is a virtual device, it can be treated more like RAM than actual MMIO registers. use ioremap_wc() replace it. Signed-off-by: Cong Liu Looks sane to me, but I'm really not involved enough to fully judge. Acked-by: Christian König --- drivers/gpu/drm/qxl/qxl_kms.c | 4 ++-- drivers/gpu/drm/qxl/qxl_ttm.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c index 4dc5ad13f12c..a054e4a00fe8 100644 --- a/drivers/gpu/drm/qxl/qxl_kms.c +++ b/drivers/gpu/drm/qxl/qxl_kms.c @@ -165,7 +165,7 @@ int qxl_device_init(struct qxl_device *qdev, (int)qdev->surfaceram_size / 1024, (sb == 4) ? "64bit" : "32bit"); - qdev->rom = ioremap(qdev->rom_base, qdev->rom_size); + qdev->rom = ioremap_wc(qdev->rom_base, qdev->rom_size); if (!qdev->rom) { pr_err("Unable to ioremap ROM\n"); r = -ENOMEM; @@ -183,7 +183,7 @@ int qxl_device_init(struct qxl_device *qdev, goto rom_unmap; } - qdev->ram_header = ioremap(qdev->vram_base + + qdev->ram_header = ioremap_wc(qdev->vram_base + qdev->rom->ram_header_offset, sizeof(*qdev->ram_header)); if (!qdev->ram_header) { diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c index b2e33d5ba5d0..95df5750f47f 100644 --- a/drivers/gpu/drm/qxl/qxl_ttm.c +++ b/drivers/gpu/drm/qxl/qxl_ttm.c @@ -82,13 +82,13 @@ int qxl_ttm_io_mem_reserve(struct ttm_device *bdev, case TTM_PL_VRAM: mem->bus.is_iomem = true; mem->bus.offset = (mem->start << PAGE_SHIFT) + qdev->vram_base; - mem->bus.caching = ttm_cached; + mem->bus.caching = ttm_write_combined; break; case TTM_PL_PRIV: mem->bus.is_iomem = true; mem->bus.offset = (mem->start << PAGE_SHIFT) + qdev->surfaceram_base; - mem->bus.caching = ttm_cached; + mem->bus.caching = ttm_write_combined; break; default: return -EINVAL; ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: 回复: Re: 回复: Re: 回复: Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64
Hi Cong, when I understand Robin correctly all mapping (host, guest, kernel, userspace etc..) must have the same caching attributes unless you use the S2FWB feature introduced with Armv8.4. If you don't follow those rules you usually run into coherency issues or even worse system hangs. So you not only need to adjust the kernel mappings, but also the function for userspace mappings to follow those rules. Additional to that I think I read Robin correctly that mapping the resource write combined should be sufficient to solve your problem. So I suggest to use that instead. On the other hand if you manage to fix this completely for arm64 by updating all the places to use cached mappings I'm fine with it as well. It's then up to Dave to decide if that's something we want because QXL is intentionally emulating a hardware device as far as I know. Regards, Christian. Am 24.03.22 um 08:04 schrieb liuco...@kylinos.cn: Hi Christian, Can you help explain more detail under what circumstances userspace mapping will be problematic, you mean cached mappings also need adjustment in function ttm_prot_from_caching? Regards, Cong. *主 题:*Re: 回复: Re: 回复: Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64 *日 期:*2022-03-23 18:17 *发件人:*Christian König *收件人:*liucong2@kylinos.cnairlied@redhat.comkraxel@redhat.comairlied@linux.iedaniel@ffwll.chray.huang@amd.comvirtualization@lists.linux-foundation.orgspice-devel@lists.freedesktop.orgdri-de...@lists.freedesktop.org Hi Cong, yes I've seen that, but that is still not sufficient. You need to update the check in ttm_module.c as well or otherwise your userspace mapping might not work correctly either. Regards, Christian. Am 23.03.22 um 11:00 schrieb liuco...@kylinos.cn: Hi Christian, another commit fix the case in ttm. I send two patches at the same time, but seems I miss '--cover-letter' when run format-patch or some other bad operation. Regards, Cong. *主 题:*Re: 回复: Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64 *日 期:*2022-03-23 17:34 *发件人:*Christian König *收件人:*liucong2@kylinos.cnairlied@redhat.comkraxel@redhat.comairlied@linux.iedaniel@ffwll.chray.huang@amd.comvirtualization@lists.linux-foundation.orgspice-devel@lists.freedesktop.orgdri-de...@lists.freedesktop.org Hi Cong, well than Dave must decide what to do here. When QXL emulates a device it should also not use memory accesses which won't work on a physical device. BTW: Your patch is really buggy, it misses the cases in ttm_module.c Regards, Christian. Am 23.03.22 um 09:48 schrieb liuco...@kylinos.cn: Hi Christian, according to 'Arm Architecture Reference Manual Armv8,for Armv8-A architecture profile' pdf E2.6.5 E2.6.5 Generation of Alignment faults by load/store multiple accesses to Device memory When FEAT_LSMAOC is implemented and the value of the applicable nTLSMD field is 0, any memory access by an AArch32 Load Multiple or Store Multiple instruction to an address that the stage 1 translation assigns as Device-nGRE, Device-nGnRE, or Device-nGnRnE generates an Alignment fault. so it seems not just some ARM boards doesn't allow unaligned access to MMIO space, all pci memory mapped as Device-nGnRE in arm64 cannot support unaligned access. and qxl is a device simulated by qemu, it should be able to access to MMIO space in a more flexible way(PROT_NORMAL) than the real physical graphics card. Cong. *主 题:*Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64 *日 期:*2022-03-23 15:15 *发件人:*Christian König *收件人:*CongLiuairlied@redhat.comkraxel@redhat.comairlied@linux.iedaniel@ffwll.chray.huang@amd.comvirtualization@lists.linux-foundation.orgspice-devel@lists.freedesktop.orgdri-de...@lists.freedesktop.org Am 22.03.22 um 10:34 schrieb Cong Liu: > qxl use ioremap to map ram_header and rom, in the arm64 implementation, > the device is mapped as DEVICE_nGnRE, it can not support unaligned > access. Well that some ARM boards doesn't allow unaligned access to MMIO space is a well known bug of those ARM boards. So as far as I know this is a hardware bug you are trying to workaround here and I'm not 100% sure that this is correct. Christian. > > 6.620515] pc :
Re: 回复: Re: 回复: Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64
Hi Cong, yes I've seen that, but that is still not sufficient. You need to update the check in ttm_module.c as well or otherwise your userspace mapping might not work correctly either. Regards, Christian. Am 23.03.22 um 11:00 schrieb liuco...@kylinos.cn: Hi Christian, another commit fix the case in ttm. I send two patches at the same time, but seems I miss '--cover-letter' when run format-patch or some other bad operation. Regards, Cong. *主 题:*Re: 回复: Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64 *日 期:*2022-03-23 17:34 *发件人:*Christian König *收件人:*liucong2@kylinos.cnairlied@redhat.comkraxel@redhat.comairlied@linux.iedaniel@ffwll.chray.huang@amd.comvirtualization@lists.linux-foundation.orgspice-devel@lists.freedesktop.orgdri-de...@lists.freedesktop.org Hi Cong, well than Dave must decide what to do here. When QXL emulates a device it should also not use memory accesses which won't work on a physical device. BTW: Your patch is really buggy, it misses the cases in ttm_module.c Regards, Christian. Am 23.03.22 um 09:48 schrieb liuco...@kylinos.cn: Hi Christian, according to 'Arm Architecture Reference Manual Armv8,for Armv8-A architecture profile' pdf E2.6.5 E2.6.5 Generation of Alignment faults by load/store multiple accesses to Device memory When FEAT_LSMAOC is implemented and the value of the applicable nTLSMD field is 0, any memory access by an AArch32 Load Multiple or Store Multiple instruction to an address that the stage 1 translation assigns as Device-nGRE, Device-nGnRE, or Device-nGnRnE generates an Alignment fault. so it seems not just some ARM boards doesn't allow unaligned access to MMIO space, all pci memory mapped as Device-nGnRE in arm64 cannot support unaligned access. and qxl is a device simulated by qemu, it should be able to access to MMIO space in a more flexible way(PROT_NORMAL) than the real physical graphics card. Cong. *主 题:*Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64 *日 期:*2022-03-23 15:15 *发件人:*Christian König *收件人:*CongLiuairlied@redhat.comkraxel@redhat.comairlied@linux.iedaniel@ffwll.chray.huang@amd.comvirtualization@lists.linux-foundation.orgspice-devel@lists.freedesktop.orgdri-de...@lists.freedesktop.org Am 22.03.22 um 10:34 schrieb Cong Liu: > qxl use ioremap to map ram_header and rom, in the arm64 implementation, > the device is mapped as DEVICE_nGnRE, it can not support unaligned > access. Well that some ARM boards doesn't allow unaligned access to MMIO space is a well known bug of those ARM boards. So as far as I know this is a hardware bug you are trying to workaround here and I'm not 100% sure that this is correct. Christian. > > 6.620515] pc : setup_hw_slot+0x24/0x60 [qxl] > [ 6.620961] lr : setup_slot+0x34/0xf0 [qxl] > [ 6.621376] sp : 800012b73760 > [ 6.621701] x29: 800012b73760 x28: 0001 x27: 1000 > [ 6.622400] x26: 0001 x25: 0400 x24: cf376848c000 > [ 6.623099] x23: c4087400 x22: cf3718e17140 x21: > [ 6.623823] x20: c4086000 x19: c40870b0 x18: 0014 > [ 6.624519] x17: 4d3605ab x16: bb3b6129 x15: 6e771809 > [ 6.625214] x14: 0001 x13: 007473696c5f7974 x12: 696e69615f65 > [ 6.625909] x11: d543656a x10: x9 : cf3718e085a4 > [ 6.626616] x8 : 006c7871 x7 : 000a x6 : 0017 > [ 6.627343] x5 : 1400 x4 : 800011f63400 x3 : 1400 > [ 6.628047] x2 : x1 : c40870b0 x0 : c4086000 > [ 6.628751] Call trace: > [ 6.628994] setup_hw_slot+0x24/0x60 [qxl] > [ 6.629404] setup_slot+0x34/0xf0 [qxl] > [ 6.629790] qxl_device_init+0x6f0/0x7f0 [qxl] > [ 6.630235] qxl_pci_probe+0xdc/0x1d0 [qxl] > [ 6.630654] local_pci_probe+0x48/0xb8 > [ 6.631027] pci_device_probe+0x194/0x208 > [ 6.631464] really_probe+0xd0/0x458 > [ 6.631818] __driver_probe_device+0x124/0x1c0 > [ 6.632256] driver_probe_device+0x48/0x130 > [ 6.632669] __driver_attach+0xc4/0x1a8 > [ 6.633049] bus_for_each_dev+0x78/0xd0 > [ 6.633437] driver_attach+0x2c/0x38 > [ 6.633789] bus_add_driver+0x154/0x248 > [ 6.634168] driver_register+0x6c/0x128 > [ 6.635205] __pci_register_driver+0x4c/0x58 > [ 6.635628] qxl_init+0x48/0x1000 [qxl] > [
Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64
Am 23.03.22 um 10:45 schrieb Robin Murphy: On 2022-03-23 07:15, Christian König wrote: Am 22.03.22 um 10:34 schrieb Cong Liu: qxl use ioremap to map ram_header and rom, in the arm64 implementation, the device is mapped as DEVICE_nGnRE, it can not support unaligned access. Well that some ARM boards doesn't allow unaligned access to MMIO space is a well known bug of those ARM boards. So as far as I know this is a hardware bug you are trying to workaround here and I'm not 100% sure that this is correct. No, this one's not a bug. The Device memory type used for iomem mappings is *architecturally* defined to enforce properties like aligned accesses, no speculation, no reordering, etc. If something wants to be treated more like RAM than actual MMIO registers, then ioremap_wc() or ioremap_cache() is the appropriate thing to do in general (with the former being a bit more portable according to Documentation/driver-api/device-io.rst). Of course *then* you might find that on some systems the interconnect/PCIe implementation/endpoint doesn't actually like unaligned accesses once the CPU MMU starts allowing them to be sent out, but hey, one step at a time ;) Ah, good point! I was already wondering why that sometimes work and sometimes doesn't. Thanks, Christian. Robin. Christian. 6.620515] pc : setup_hw_slot+0x24/0x60 [qxl] [ 6.620961] lr : setup_slot+0x34/0xf0 [qxl] [ 6.621376] sp : 800012b73760 [ 6.621701] x29: 800012b73760 x28: 0001 x27: 1000 [ 6.622400] x26: 0001 x25: 0400 x24: cf376848c000 [ 6.623099] x23: c4087400 x22: cf3718e17140 x21: [ 6.623823] x20: c4086000 x19: c40870b0 x18: 0014 [ 6.624519] x17: 4d3605ab x16: bb3b6129 x15: 6e771809 [ 6.625214] x14: 0001 x13: 007473696c5f7974 x12: 696e69615f65 [ 6.625909] x11: d543656a x10: x9 : cf3718e085a4 [ 6.626616] x8 : 006c7871 x7 : 000a x6 : 0017 [ 6.627343] x5 : 1400 x4 : 800011f63400 x3 : 1400 [ 6.628047] x2 : x1 : c40870b0 x0 : c4086000 [ 6.628751] Call trace: [ 6.628994] setup_hw_slot+0x24/0x60 [qxl] [ 6.629404] setup_slot+0x34/0xf0 [qxl] [ 6.629790] qxl_device_init+0x6f0/0x7f0 [qxl] [ 6.630235] qxl_pci_probe+0xdc/0x1d0 [qxl] [ 6.630654] local_pci_probe+0x48/0xb8 [ 6.631027] pci_device_probe+0x194/0x208 [ 6.631464] really_probe+0xd0/0x458 [ 6.631818] __driver_probe_device+0x124/0x1c0 [ 6.632256] driver_probe_device+0x48/0x130 [ 6.632669] __driver_attach+0xc4/0x1a8 [ 6.633049] bus_for_each_dev+0x78/0xd0 [ 6.633437] driver_attach+0x2c/0x38 [ 6.633789] bus_add_driver+0x154/0x248 [ 6.634168] driver_register+0x6c/0x128 [ 6.635205] __pci_register_driver+0x4c/0x58 [ 6.635628] qxl_init+0x48/0x1000 [qxl] [ 6.636013] do_one_initcall+0x50/0x240 [ 6.636390] do_init_module+0x60/0x238 [ 6.636768] load_module+0x2458/0x2900 [ 6.637136] __do_sys_finit_module+0xbc/0x128 [ 6.637561] __arm64_sys_finit_module+0x28/0x38 [ 6.638004] invoke_syscall+0x74/0xf0 [ 6.638366] el0_svc_common.constprop.0+0x58/0x1a8 [ 6.638836] do_el0_svc+0x2c/0x90 [ 6.639216] el0_svc+0x40/0x190 [ 6.639526] el0t_64_sync_handler+0xb0/0xb8 [ 6.639934] el0t_64_sync+0x1a4/0x1a8 [ 6.640294] Code: 910003fd f9484804 f9400c23 8b050084 (f809c083) [ 6.640889] ---[ end trace 95615d89b7c87f95 ]--- Signed-off-by: Cong Liu --- drivers/gpu/drm/qxl/qxl_kms.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c index 4dc5ad13f12c..0e61ac04d8ad 100644 --- a/drivers/gpu/drm/qxl/qxl_kms.c +++ b/drivers/gpu/drm/qxl/qxl_kms.c @@ -165,7 +165,11 @@ int qxl_device_init(struct qxl_device *qdev, (int)qdev->surfaceram_size / 1024, (sb == 4) ? "64bit" : "32bit"); +#ifdef CONFIG_ARM64 + qdev->rom = ioremap_cache(qdev->rom_base, qdev->rom_size); +#else qdev->rom = ioremap(qdev->rom_base, qdev->rom_size); +#endif if (!qdev->rom) { pr_err("Unable to ioremap ROM\n"); r = -ENOMEM; @@ -183,9 +187,15 @@ int qxl_device_init(struct qxl_device *qdev, goto rom_unmap; } +#ifdef CONFIG_ARM64 + qdev->ram_header = ioremap_cache(qdev->vram_base + + qdev->rom->ram_header_offset, + sizeof(*qdev->ram_header)); +#else qdev->ram_header = ioremap(qdev->vram_base + qdev->rom->ram_header_offset, sizeof(*qdev->ram_header)); +#endif if (!qdev->ram_header) { DRM_ERROR("Unable to ioremap RAM header\n"); r = -ENOMEM; ___ Virtualization mailing list
Re: 回复: Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64
Hi Cong, well than Dave must decide what to do here. When QXL emulates a device it should also not use memory accesses which won't work on a physical device. BTW: Your patch is really buggy, it misses the cases in ttm_module.c Regards, Christian. Am 23.03.22 um 09:48 schrieb liuco...@kylinos.cn: Hi Christian, according to 'Arm Architecture Reference Manual Armv8,for Armv8-A architecture profile' pdf E2.6.5 E2.6.5 Generation of Alignment faults by load/store multiple accesses to Device memory When FEAT_LSMAOC is implemented and the value of the applicable nTLSMD field is 0, any memory access by an AArch32 Load Multiple or Store Multiple instruction to an address that the stage 1 translation assigns as Device-nGRE, Device-nGnRE, or Device-nGnRnE generates an Alignment fault. so it seems not just some ARM boards doesn't allow unaligned access to MMIO space, all pci memory mapped as Device-nGnRE in arm64 cannot support unaligned access. and qxl is a device simulated by qemu, it should be able to access to MMIO space in a more flexible way(PROT_NORMAL) than the real physical graphics card. Cong. *主 题:*Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64 *日 期:*2022-03-23 15:15 *发件人:*Christian König *收件人:*Cong Liuairlied@redhat.comkraxel@redhat.comairlied@linux.iedaniel@ffwll.chray.huang@amd.comvirtualization@lists.linux-foundation.orgspice-devel@lists.freedesktop.orgdri-de...@lists.freedesktop.org Am 22.03.22 um 10:34 schrieb Cong Liu: > qxl use ioremap to map ram_header and rom, in the arm64 implementation, > the device is mapped as DEVICE_nGnRE, it can not support unaligned > access. Well that some ARM boards doesn't allow unaligned access to MMIO space is a well known bug of those ARM boards. So as far as I know this is a hardware bug you are trying to workaround here and I'm not 100% sure that this is correct. Christian. > > 6.620515] pc : setup_hw_slot+0x24/0x60 [qxl] > [ 6.620961] lr : setup_slot+0x34/0xf0 [qxl] > [ 6.621376] sp : 800012b73760 > [ 6.621701] x29: 800012b73760 x28: 0001 x27: 1000 > [ 6.622400] x26: 0001 x25: 0400 x24: cf376848c000 > [ 6.623099] x23: c4087400 x22: cf3718e17140 x21: > [ 6.623823] x20: c4086000 x19: c40870b0 x18: 0014 > [ 6.624519] x17: 4d3605ab x16: bb3b6129 x15: 6e771809 > [ 6.625214] x14: 0001 x13: 007473696c5f7974 x12: 696e69615f65 > [ 6.625909] x11: d543656a x10: x9 : cf3718e085a4 > [ 6.626616] x8 : 006c7871 x7 : 000a x6 : 0017 > [ 6.627343] x5 : 1400 x4 : 800011f63400 x3 : 1400 > [ 6.628047] x2 : x1 : c40870b0 x0 : c4086000 > [ 6.628751] Call trace: > [ 6.628994] setup_hw_slot+0x24/0x60 [qxl] > [ 6.629404] setup_slot+0x34/0xf0 [qxl] > [ 6.629790] qxl_device_init+0x6f0/0x7f0 [qxl] > [ 6.630235] qxl_pci_probe+0xdc/0x1d0 [qxl] > [ 6.630654] local_pci_probe+0x48/0xb8 > [ 6.631027] pci_device_probe+0x194/0x208 > [ 6.631464] really_probe+0xd0/0x458 > [ 6.631818] __driver_probe_device+0x124/0x1c0 > [ 6.632256] driver_probe_device+0x48/0x130 > [ 6.632669] __driver_attach+0xc4/0x1a8 > [ 6.633049] bus_for_each_dev+0x78/0xd0 > [ 6.633437] driver_attach+0x2c/0x38 > [ 6.633789] bus_add_driver+0x154/0x248 > [ 6.634168] driver_register+0x6c/0x128 > [ 6.635205] __pci_register_driver+0x4c/0x58 > [ 6.635628] qxl_init+0x48/0x1000 [qxl] > [ 6.636013] do_one_initcall+0x50/0x240 > [ 6.636390] do_init_module+0x60/0x238 > [ 6.636768] load_module+0x2458/0x2900 > [ 6.637136] __do_sys_finit_module+0xbc/0x128 > [ 6.637561] __arm64_sys_finit_module+0x28/0x38 > [ 6.638004] invoke_syscall+0x74/0xf0 > [ 6.638366] el0_svc_common.constprop.0+0x58/0x1a8 > [ 6.638836] do_el0_svc+0x2c/0x90 > [ 6.639216] el0_svc+0x40/0x190 > [ 6.639526] el0t_64_sync_handler+0xb0/0xb8 > [ 6.639934] el0t_64_sync+0x1a4/0x1a8 > [ 6.640294] Code: 910003fd f9484804 f9400c23 8b050084 (f809c083) > [ 6.640889] ---[ end trace 95615d89b7c87f95 ]--- > > Signed-off-by: Cong Liu > --- > drivers/gpu/drm/qxl/qxl_kms.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c > index 4dc5ad13f12c..0e61ac04d8ad 100644 > --- a/drivers/gpu/drm/qxl/qxl_kms.c > +++ b/drivers/gpu/drm/qxl/qxl_kms.c > @@ -165,7 +165,11 @@ int qxl_device_init(struct qxl_device *qdev, > (int)qdev->surfaceram_size / 1024, > (sb == 4) ? "64bit" : "32bit"); > > +#ifdef CONFIG_ARM64 > + qdev->rom = ioremap_cache(qdev->rom_base, qdev->rom_size); > +#else > qdev->rom = ioremap(qdev->rom_base, qdev->rom_size); > +#endif > if (!qdev->rom) { > pr_err("Unable to ioremap ROM\n");
Re: [PATCH v1 1/2] drm/qxl: replace ioremap by ioremap_cache on arm64
Am 22.03.22 um 10:34 schrieb Cong Liu: qxl use ioremap to map ram_header and rom, in the arm64 implementation, the device is mapped as DEVICE_nGnRE, it can not support unaligned access. Well that some ARM boards doesn't allow unaligned access to MMIO space is a well known bug of those ARM boards. So as far as I know this is a hardware bug you are trying to workaround here and I'm not 100% sure that this is correct. Christian. 6.620515] pc : setup_hw_slot+0x24/0x60 [qxl] [6.620961] lr : setup_slot+0x34/0xf0 [qxl] [6.621376] sp : 800012b73760 [6.621701] x29: 800012b73760 x28: 0001 x27: 1000 [6.622400] x26: 0001 x25: 0400 x24: cf376848c000 [6.623099] x23: c4087400 x22: cf3718e17140 x21: [6.623823] x20: c4086000 x19: c40870b0 x18: 0014 [6.624519] x17: 4d3605ab x16: bb3b6129 x15: 6e771809 [6.625214] x14: 0001 x13: 007473696c5f7974 x12: 696e69615f65 [6.625909] x11: d543656a x10: x9 : cf3718e085a4 [6.626616] x8 : 006c7871 x7 : 000a x6 : 0017 [6.627343] x5 : 1400 x4 : 800011f63400 x3 : 1400 [6.628047] x2 : x1 : c40870b0 x0 : c4086000 [6.628751] Call trace: [6.628994] setup_hw_slot+0x24/0x60 [qxl] [6.629404] setup_slot+0x34/0xf0 [qxl] [6.629790] qxl_device_init+0x6f0/0x7f0 [qxl] [6.630235] qxl_pci_probe+0xdc/0x1d0 [qxl] [6.630654] local_pci_probe+0x48/0xb8 [6.631027] pci_device_probe+0x194/0x208 [6.631464] really_probe+0xd0/0x458 [6.631818] __driver_probe_device+0x124/0x1c0 [6.632256] driver_probe_device+0x48/0x130 [6.632669] __driver_attach+0xc4/0x1a8 [6.633049] bus_for_each_dev+0x78/0xd0 [6.633437] driver_attach+0x2c/0x38 [6.633789] bus_add_driver+0x154/0x248 [6.634168] driver_register+0x6c/0x128 [6.635205] __pci_register_driver+0x4c/0x58 [6.635628] qxl_init+0x48/0x1000 [qxl] [6.636013] do_one_initcall+0x50/0x240 [6.636390] do_init_module+0x60/0x238 [6.636768] load_module+0x2458/0x2900 [6.637136] __do_sys_finit_module+0xbc/0x128 [6.637561] __arm64_sys_finit_module+0x28/0x38 [6.638004] invoke_syscall+0x74/0xf0 [6.638366] el0_svc_common.constprop.0+0x58/0x1a8 [6.638836] do_el0_svc+0x2c/0x90 [6.639216] el0_svc+0x40/0x190 [6.639526] el0t_64_sync_handler+0xb0/0xb8 [6.639934] el0t_64_sync+0x1a4/0x1a8 [6.640294] Code: 910003fd f9484804 f9400c23 8b050084 (f809c083) [6.640889] ---[ end trace 95615d89b7c87f95 ]--- Signed-off-by: Cong Liu --- drivers/gpu/drm/qxl/qxl_kms.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c index 4dc5ad13f12c..0e61ac04d8ad 100644 --- a/drivers/gpu/drm/qxl/qxl_kms.c +++ b/drivers/gpu/drm/qxl/qxl_kms.c @@ -165,7 +165,11 @@ int qxl_device_init(struct qxl_device *qdev, (int)qdev->surfaceram_size / 1024, (sb == 4) ? "64bit" : "32bit"); +#ifdef CONFIG_ARM64 + qdev->rom = ioremap_cache(qdev->rom_base, qdev->rom_size); +#else qdev->rom = ioremap(qdev->rom_base, qdev->rom_size); +#endif if (!qdev->rom) { pr_err("Unable to ioremap ROM\n"); r = -ENOMEM; @@ -183,9 +187,15 @@ int qxl_device_init(struct qxl_device *qdev, goto rom_unmap; } +#ifdef CONFIG_ARM64 + qdev->ram_header = ioremap_cache(qdev->vram_base + + qdev->rom->ram_header_offset, + sizeof(*qdev->ram_header)); +#else qdev->ram_header = ioremap(qdev->vram_base + qdev->rom->ram_header_offset, sizeof(*qdev->ram_header)); +#endif if (!qdev->ram_header) { DRM_ERROR("Unable to ioremap RAM header\n"); r = -ENOMEM; ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization