Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process
On 22.12.21 at 21:53, Daniel Vetter wrote:
> On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:
> [SNIP]
> Still sounds funky. I think minimally we should have an ack from CRIU
> developers that this is officially the right way to solve this problem. I
> really don't want to have random one-off hacks that don't work across the
> board, for a problem where we (drm subsystem) really shouldn't be the only
> one with this problem. Where "this problem" means that the mmap space is
> per file description, and not per underlying inode or real device or
> whatever. That part sounds like a CRIU problem, and I expect CRIU folks
> want a consistent solution across the board for this. Hence please grab an
> ack from them.

Unfortunately it's a KFD design problem. AMD used a single device node, then mmaped different objects from the same offset to different processes and expected it to work with the rest of the fs subsystem without churn.

So yes, this is indeed because the mmap space is per file descriptor for the use case here.

And thanks for pointing this out, this indeed makes the whole change extremely questionable.

Regards,
Christian.

> Cheers, Daniel
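For readers outside the thread: "mmap space is per file description" means the interpretation of an mmap offset is tied to the open file, not to the device node or inode. A minimal sketch of the KFD-style scheme being described (all helper and struct names here are hypothetical, illustrative only, not the actual KFD code):

static int kfdlike_mmap(struct file *filp, struct vm_area_struct *vma)
{
        /* per-open state: each process that opens the node gets its own table */
        struct per_open_state *state = filp->private_data;      /* hypothetical */
        struct object *obj;

        /*
         * The same vm_pgoff resolves against a different 'state' in each
         * process, so an identical offset can name a different object.
         */
        obj = lookup_object(state, vma->vm_pgoff);              /* hypothetical */
        if (!obj)
                return -EINVAL;
        return remap_object(obj, vma);                          /* hypothetical */
}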
Re: mmotm 2021-12-22-19-02 uploaded (drivers/gpu/drm/i915/display/intel_backlight.o)
On 12/22/21 19:02, a...@linux-foundation.org wrote:
> The mm-of-the-moment snapshot 2021-12-22-19-02 has been uploaded to
>
> https://www.ozlabs.org/~akpm/mmotm/
>
> mmotm-readme.txt says
>
> README for mm-of-the-moment:
>
> https://www.ozlabs.org/~akpm/mmotm/
>
> This is a snapshot of my -mm patch queue. Uploaded at random hopefully
> more than once a week.

on x86_64:

ld: drivers/gpu/drm/i915/display/intel_backlight.o: in function `intel_backlight_device_register':
intel_backlight.c:(.text+0x27ba): undefined reference to `backlight_device_register'
ld: intel_backlight.c:(.text+0x2871): undefined reference to `backlight_device_register'
ld: drivers/gpu/drm/i915/display/intel_backlight.o: in function `intel_backlight_device_unregister':
intel_backlight.c:(.text+0x28c4): undefined reference to `backlight_device_unregister'

Full randconfig file is attached.

--
~Randy
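These link failures are the classic randconfig case where CONFIG_DRM_I915=y while CONFIG_BACKLIGHT_CLASS_DEVICE is =m or unset, so the backlight symbols are not available to the built-in object. A sketch of the usual Kconfig-level remedy (illustrative only, not necessarily the fix that was eventually applied):

config DRM_I915
        tristate "Intel Graphics"
        depends on DRM
        # either force the backlight class in ...
        select BACKLIGHT_CLASS_DEVICE
        # ... or forbid the builtin-driver/modular-backlight combination:
        # depends on BACKLIGHT_CLASS_DEVICE || BACKLIGHT_CLASS_DEVICE=n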
[PATCH V2] drm: nouveau: lsfw: cleanup coccinelle warning
From: Wang Qing

odd_ptr_err.cocci has complained about this warning for a long time:

lsfw.c:194:5-11: inconsistent IS_ERR and PTR_ERR on line 195.

Although there is no functional impact, fixing it improves coccinelle scanning efficiency.

Signed-off-by: Wang Qing
---
 drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
index 9b1cf67..0f70d14
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
@@ -191,7 +191,8 @@ nvkm_acr_lsfw_load_bl_inst_data_sig(struct nvkm_subdev *subdev,
 	u32 *bldata;
 	int ret;
 
-	if (IS_ERR((lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id))))
+	lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id);
+	if (IS_ERR(lsfw))
 		return PTR_ERR(lsfw);
 
 	ret = nvkm_firmware_load_name(subdev, path, "bl", ver, &bl);
-- 
2.7.4
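For reference, the consistent idiom the patch converts to looks like this in isolation (a generic sketch; foo_create() is a hypothetical constructor that returns ERR_PTR() on failure):

	struct foo *foo;

	foo = foo_create();		/* hypothetical; returns ERR_PTR(-E...) on failure */
	if (IS_ERR(foo))
		return PTR_ERR(foo);	/* IS_ERR() and PTR_ERR() applied to the same expression */

The cocci warning fires when IS_ERR() is tested on one expression (here, the parenthesized assignment on line 194) while PTR_ERR() is taken of a textually different expression on the next line.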
Re: [PATCH] drm/ast: Support 1600x900 with 108MHz PCLK
On Wed, 22 Dec 2021 at 11:19, Kuo-Hsiang Chou wrote:
> Hi
>
> From: Dave Airlie [mailto:airl...@gmail.com]
> Sent: Wednesday, December 22, 2021 5:56 AM
> To: Thomas Zimmermann
> Subject: Re: [PATCH] drm/ast: Support 1600x900 with 108MHz PCLK
>
> On Mon, 2 Nov 2020 at 17:57, Thomas Zimmermann wrote:
> > Hi
> >
> > On 30.10.20 at 08:42, KuoHsiang Chou wrote:
> > > [New] Create the setting for 1600x900 @60Hz refresh rate
> > > by 108MHz pixel-clock.
> > >
> > > Signed-off-by: KuoHsiang Chou
> >
> > Acked-by: Thomas Zimmermann
> >
> > I'll add your patch to drm-misc-next.
> >
> > As Sam mentioned, you should use scripts/get_maintainers.pl to
> > retrieve the relevant people. These include those in MAINTAINERS, but
> > also developers that have previously worked on the code.
>
> We are seeing a possible report of a regression on an ast2600 server with
> this patch.
>
> I haven't ascertained that reverting it fixes it for the customer yet, but
> this is a heads up in case anyone else has seen issues.
>
> Hi Dave,
>
> Yes, you're right. The patch needs to be removed. The patch causes incorrect
> timing on CRT and ASTDP when 1600x900 is selected.
> So, do I need to commit a new patch to remove/revert it from drm/ast?

Yes, do a git revert, fix up the resulting message to say why, add a
Fixes: <12 chars of sha1> ("commitmsg") tag, and send it to the list.

Dave.
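For completeness, the requested workflow is roughly the following (angle-bracket placeholders throughout; not tied to a specific commit):

	git revert <sha-of-bad-commit>
	# edit the revert's commit message to explain the regression, then append:
	#   Fixes: <12 chars of sha1> ("commit subject")
	# the tag line can be generated with (assuming core.abbrev=12):
	git log -1 --format='Fixes: %h ("%s")' <sha-of-bad-commit>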
Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process
Sorry for the typo in my previous email. Please read Adrian Reber*

On 12/22/2021 8:49 PM, Bhardwaj, Rajneesh wrote:
> Adding Adrian Rebel who is the CRIU maintainer and CRIU list
> [SNIP]
Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process
Adding Adrian Rebel who is the CRIU maintainer and CRIU list

On 12/22/2021 3:53 PM, Daniel Vetter wrote:
> On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:
>> On 12/20/2021 4:29 AM, Daniel Vetter wrote:
>>> On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote:
>>>> On 09.12.21 at 19:28, Felix Kuehling wrote:
>>>>> On 2021-12-09 at 10:30 a.m., Christian König wrote:
>>>>>> That still won't work.
>>>>>>
>>>>>> But I think we could do this change for the amdgpu mmap callback only.
>>>>> If graphics user mode has problems with it, we could even make this
>>>>> specific to KFD BOs in the amdgpu_gem_object_mmap callback.
>>>> I think it's fine for the whole amdgpu stack, my concern is more about
>>>> radeon, nouveau and the ARM stacks which are using this as well.
>>>>
>>>> That blew up so nicely the last time we tried to change it and I know
>>>> of at least one case where radeon was/is used with BOs in a child
>>>> process.
>>> I'm way late and buried again, but I think it'd be good to be consistent
>> I had committed this change into our amd-staging-drm-next branch last
>> week after I got the ACK and RB from Felix and Christian.
>>> here across drivers. Or at least across drm drivers. And we've had the
>>> vma open/close refcounting to make fork work since forever.
>>>
>>> I think if we do this we should really only do this for mmap() where
>>> this applies, but reading through the thread here I'm honestly confused
>>> why this is a problem. If CRIU can't handle forked mmaps it needs to be
>>> taught that, not hacked around. Or at least I'm not understanding why
>>> this shouldn't work ...
>>> -Daniel
>> Hi Daniel
>>
>> In the v2
>> https://lore.kernel.org/all/a1a865f5-ad2c-29c8-cbe4-2635d53eceb6@amd.com/T/
>> I pretty much limited the scope of the change to KFD BOs on mmap.
>> Regarding CRIU, I think it's not a CRIU problem, as CRIU on restore only
>> tries to recreate all the child processes and then mmaps all the VMAs it
>> sees (as per the checkpoint snapshot) in the new process address space
>> after the VMA placements are finalized in the position independent code
>> phase. Since the inherited VMAs don't have access rights, the CRIU mmap
>> fails.
> Still sounds funky. I think minimally we should have an ack from CRIU
> developers that this is officially the right way to solve this problem. I
> really don't want to have random one-off hacks that don't work across the
> board, for a problem where we (drm subsystem) really shouldn't be the only
> one with this problem. Where "this problem" means that the mmap space is
> per file description, and not per underlying inode or real device or
> whatever. That part sounds like a CRIU problem, and I expect CRIU folks
> want a consistent solution across the board for this. Hence please grab an
> ack from them.
>
> Cheers, Daniel

Maybe Adrian can share his views on this.

Hi Adrian - For the context, on CRIU restore we see mmap failures (in the PIE restore phase) due to permission issues on the (render node) VMAs that were inherited since the application that checkpointed had forked. The VMAs ideally should not be in the child process, but the smaps file shows these VMAs in the child address space. We didn't want to use madvise to avoid this copy and rather change the kernel mode to limit the impact to our user space library thunk. Based on my understanding, during the PIE restore phase, after the VMA placements are finalized, CRIU does a sys_mmap on every VMA it sees in the VmaEntry list, and I think it's not an issue as per CRIU design, but do you think we could handle this corner case better inside CRIU?

Regards,
Rajneesh

> Regards,
> Christian.
>
>> Regards,
>>   Felix
>>
>>> Regards,
>>> Christian.
>>>
>>> On 09.12.21 at 16:29, Bhardwaj, Rajneesh wrote:
>>>> Sounds good. I will send a v2 with only the ttm_bo_mmap_obj change.
>>>> Thank you!
>>>>
>>>> On 12/9/2021 10:27 AM, Christian König wrote:
>>>>> Hi Rajneesh,
>>>>>
>>>>> yes, separating this from the drm_gem_mmap_obj() change is certainly
>>>>> a good idea.
>>>>>
>>>>>> The child cannot access the BOs mapped by the parent anyway with
>>>>>> access restrictions applied
>>>>> exactly that is not correct. That behavior is actively used by some
>>>>> userspace stacks as far as I know.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>> On 09.12.21 at 16:23, Bhardwaj, Rajneesh wrote:
>>>>>> Thanks Christian. Would it make it less intrusive if I just use the
>>>>>> flag for ttm bo mmap and remove the drm_gem_mmap_obj change from
>>>>>> this patch? For our use case, just the ttm_bo_mmap_obj change should
>>>>>> suffice and we don't want to put any more workarounds in the user
>>>>>> space (thunk, in our case). The child cannot access the BOs mapped
>>>>>> by the parent anyway with access restrictions applied so I wonder
>>>>>> why even inherit the vma?
>>>>>>
>>>>>> On 12/9/2021 2:54 AM, Christian König wrote:
>>>>>>> On 08.12.21 at
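To make the failure mode concrete: conceptually, the PIE restorer re-creates each VMA from its VmaEntry with a fixed-address mmap of the reopened backing file, and that call is what fails on the inherited render-node VMAs. A minimal userspace sketch of the idea (hypothetical struct, not actual CRIU code):

#include <stdint.h>
#include <sys/mman.h>

struct vma_entry {                      /* hypothetical, VmaEntry-like */
        uint64_t start, end;            /* original placement */
        int prot, flags;
        int fd;                         /* reopened backing file, e.g. a render node */
        uint64_t off;                   /* byte offset into the file */
};

static int restore_vma(const struct vma_entry *e)
{
        /*
         * Re-create the mapping at its original address; this is where a
         * driver that refuses this fd/offset combination makes restore fail.
         */
        void *p = mmap((void *)(uintptr_t)e->start, e->end - e->start,
                       e->prot, e->flags | MAP_FIXED, e->fd, (off_t)e->off);
        return p == MAP_FAILED ? -1 : 0;
}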
[PATCH] drm/i915/guc: Report error on invalid reset notification
From: John Harrison

Don't silently drop reset notifications from the GuC. It might not be safe to do an error capture but we still want some kind of report that the reset happened.

Signed-off-by: John Harrison
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index e7517206af82..0fbf24b8d5e1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -3979,6 +3979,11 @@ static void guc_handle_context_reset(struct intel_guc *guc,
 		   !context_blocked(ce))) {
 		capture_error_state(guc, ce);
 		guc_context_replay(ce);
+	} else {
+		drm_err(&guc_to_gt(guc)->i915->drm,
+			"Invalid GuC engine reset notification for 0x%04X on %s: banned = %d, blocked = %d",
+			ce->guc_id.id, ce->engine->name, intel_context_is_banned(ce),
+			context_blocked(ce));
 	}
 }
-- 
2.25.1
Re: [PATCH] drm/i915/guc: Use lockless list for destroyed contexts
On Wed, Dec 22, 2021 at 04:48:36PM -0800, John Harrison wrote:
> On 12/22/2021 15:29, Matthew Brost wrote:
> > Use a lockless list structure for destroyed contexts to avoid hammering
> > on global submission spin lock.
>
> I thought the guidance was that lockless anything without an explanation
> longer than War And Peace comes with an automatic termination penalty?
>

I was thinking that was for custom lockless algorithms, not for code using the core APIs. If this is really a concern I could protect the llist_del_all() by a lock, but the doc explicitly says that the way I'm using this API is safe without a lock.

> Also, I thought the simple suggestion was to just move the entire list
> sideways under the existing lock and then loop through the local list
> safely without requiring locks because it is now local only.
>

That's basically what this API does in a few simple calls, rather than our own algorithm to move to a new list.

Matt

> John.
>
> > Suggested-by: Tvrtko Ursulin
> > Signed-off-by: Matthew Brost
> > ---
> > [SNIP]
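For reference, the core-API pattern under discussion (producers push with llist_add(); a single consumer atomically detaches the whole list with llist_del_all() and then walks it lock-free, because the detached list is private) looks like this in isolation. A minimal sketch, not the i915 code:

#include <linux/llist.h>
#include <linux/slab.h>

struct item {
	struct llist_node link;
	int payload;
};

static LLIST_HEAD(pending);

/* producer side: safe from any context, no lock needed */
static void push_item(struct item *it)
{
	llist_add(&it->link, &pending);
}

/* consumer side: atomically take the whole list, then walk it privately */
static void drain_items(void)
{
	struct item *it, *tmp;

	llist_for_each_entry_safe(it, tmp, llist_del_all(&pending), link)
		kfree(it);
}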
Re: [PATCH] drm/i915/execlists: Weak parallel submission support for execlists
On 12/22/2021 14:35, Matthew Brost wrote:

A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this interface for execlists - basically just passing submit fences between each request generated, and virtual engines are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface. We perma-pin these execlists contexts to align with the GuC implementation.

v2: (John Harrison)
 - Drop siblings array as num_siblings must be 1
v3: (John Harrison)
 - Drop single submission
v4: (John Harrison)
 - Actually drop single submission
 - Use IS_ERR check on return value from intel_context_create
 - Set last request to NULL on unpin

Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 11 --
 drivers/gpu/drm/i915/gt/intel_context.c       |  4 +-
 .../drm/i915/gt/intel_execlists_submission.c  | 38 +++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  4 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  2 -
 5 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index cad3f0b2be9e..b0d2d81fc3b3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base,
 	struct intel_engine_cs **siblings = NULL;
 	intel_engine_mask_t prev_mask;
 
-	/* FIXME: This is NIY for execlists */
-	if (!(intel_uc_uses_guc_submission(&to_gt(i915)->uc)))
-		return -ENODEV;
-
 	if (get_user(slot, &ext->engine_index))
 		return -EFAULT;
 
@@ -583,6 +579,13 @@ set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base,
 	if (get_user(num_siblings, &ext->num_siblings))
 		return -EFAULT;
 
+	if (!intel_uc_uses_guc_submission(&to_gt(i915)->uc) &&
+	    num_siblings != 1) {
+		drm_dbg(&i915->drm, "Only 1 sibling (%d) supported in non-GuC mode\n",
+			num_siblings);
+		return -EINVAL;
+	}
+
 	if (slot >= set->num_engines) {
 		drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
 			slot, set->num_engines);
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index ba083d800a08..5d0ec7c49b6a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct intel_context *ce)
 
 	__i915_active_acquire(&ce->active);
 
-	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine))
+	if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) ||
+	    intel_context_is_parallel(ce))
 		return 0;
 
 	/* Preallocate tracking nodes */
@@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct intel_context *parent,
 	 * Callers responsibility to validate that this function is used
	 * correctly but we use GEM_BUG_ON here ensure that they do.
	 */
-	GEM_BUG_ON(!intel_engine_uses_guc(parent->engine));
 	GEM_BUG_ON(intel_context_is_pinned(parent));
 	GEM_BUG_ON(intel_context_is_child(parent));
 	GEM_BUG_ON(intel_context_is_pinned(child));
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index a69df5e9e77a..be56d0b41892 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -2599,6 +2599,43 @@ static void execlists_context_cancel_request(struct intel_context *ce,
 				      current->comm);
 }
 
+static struct intel_context *
+execlists_create_parallel(struct intel_engine_cs **engines,
+			  unsigned int num_siblings,
+			  unsigned int width)
+{
+	struct intel_context *parent = NULL, *ce, *err;
+	int i;
+
+	GEM_BUG_ON(num_siblings != 1);
+
+	for (i = 0; i < width; ++i) {
+		ce = intel_context_create(engines[i]);
+		if (IS_ERR(ce)) {
+			err = ce;

Could get rid of 'err' and just say 'return ce;' at the end of 'unwind:'. Either way:

Reviewed-by: John Harrison

+			goto unwind;
+		}
+
+		if (i == 0)
+			parent = ce;
+		else
+			intel_context_bind_parent_child(parent, ce);
+	}
+
+	parent->parallel.fence_context = dma_fence_context_alloc(1);
+
+	intel_context_set_nopreempt(parent);
+	for_each_child(parent, ce)
+		intel_context_set_nopreempt(ce);
+
+	return parent;
+
+unwind:
+
[PATCH v9 2/2] drm/msm/dp: do not initialize phy until plugin interrupt received
In the current DP driver, regulators, clocks, irq and phy are grouped together within a function and are not enabled/disabled in a symmetric manner. This increases the difficulty of code maintenance and limits code scalability.

This patch divides the driver's life cycle into four states: resume (including booting up), dongle plug-in, dongle unplug and suspend. Regulators, core clocks and irq are grouped together and enabled at resume (or boot up) so that the DP controller is armed and ready to receive HPD plug-in interrupts. An HPD plug-in interrupt is generated when a dongle plugs into the DUT (device under test). Once the HPD plug-in interrupt is received, the DP controller initializes the phy so that dpcd reads/writes will function and the following link training can proceed successfully. The DP phy is disabled after the main link is torn down at the end of the unplug HPD interrupt handling, triggered when the dongle is unplugged from the DUT. Finally, regulators, core clocks and irq are disabled at the corresponding suspend.

Changes in V2:
-- removed unnecessary dp_ctrl NULL check
-- removed unnecessary phy init_count and power_count DRM_DEBUG_DP logs
-- remove flip parameter out of dp_ctrl_irq_enable()
-- add fixes tag

Changes in V3:
-- call dp_display_host_phy_init() instead of dp_ctrl_phy_init() at dp_display_host_init() for eDP

Changes in V4:
-- rewording commit text to match this commit changes

Changes in V5:
-- rebase on top of msm-next branch

Changes in V6:
-- delete flip variable

Changes in V7:
-- dp_ctrl_irq_enable/disable() merged into dp_ctrl_reset_irq_ctrl()

Changes in V8:
-- add more detailed comment regarding dp phy at dp_display_host_init()

Changes in V9:
-- remove set phy_initialized to false when -ECONNRESET detected

Fixes: 8ede2ecc3e5e ("drm/msm/dp: Add DP compliance tests on Snapdragon Chipsets")
Signed-off-by: Kuogee Hsieh
---
 drivers/gpu/drm/msm/dp/dp_ctrl.c    | 80 +
 drivers/gpu/drm/msm/dp/dp_ctrl.h    |  8 ++--
 drivers/gpu/drm/msm/dp/dp_display.c | 89 -
 3 files changed, 94 insertions(+), 83 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_ctrl.c b/drivers/gpu/drm/msm/dp/dp_ctrl.c
index c724cb0..9c80b49 100644
--- a/drivers/gpu/drm/msm/dp/dp_ctrl.c
+++ b/drivers/gpu/drm/msm/dp/dp_ctrl.c
@@ -1365,60 +1365,44 @@ static int dp_ctrl_enable_stream_clocks(struct dp_ctrl_private *ctrl)
 	return ret;
 }
 
-int dp_ctrl_host_init(struct dp_ctrl *dp_ctrl, bool flip, bool reset)
+void dp_ctrl_reset_irq_ctrl(struct dp_ctrl *dp_ctrl, bool enable)
+{
+	struct dp_ctrl_private *ctrl;
+
+	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
+
+	dp_catalog_ctrl_reset(ctrl->catalog);
+
+	if (enable)
+		dp_catalog_ctrl_enable_irq(ctrl->catalog, enable);
+}
+
+void dp_ctrl_phy_init(struct dp_ctrl *dp_ctrl)
 {
 	struct dp_ctrl_private *ctrl;
 	struct dp_io *dp_io;
 	struct phy *phy;
 
-	if (!dp_ctrl) {
-		DRM_ERROR("Invalid input data\n");
-		return -EINVAL;
-	}
-
 	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
 	dp_io = &ctrl->parser->io;
 	phy = dp_io->phy;
 
-	ctrl->dp_ctrl.orientation = flip;
-
-	if (reset)
-		dp_catalog_ctrl_reset(ctrl->catalog);
-
-	DRM_DEBUG_DP("flip=%d\n", flip);
 	dp_catalog_ctrl_phy_reset(ctrl->catalog);
 	phy_init(phy);
-	dp_catalog_ctrl_enable_irq(ctrl->catalog, true);
-
-	return 0;
 }
 
-/**
- * dp_ctrl_host_deinit() - Uninitialize DP controller
- * @dp_ctrl: Display Port Driver data
- *
- * Perform required steps to uninitialize DP controller
- * and its resources.
- */
-void dp_ctrl_host_deinit(struct dp_ctrl *dp_ctrl)
+void dp_ctrl_phy_exit(struct dp_ctrl *dp_ctrl)
 {
 	struct dp_ctrl_private *ctrl;
 	struct dp_io *dp_io;
 	struct phy *phy;
 
-	if (!dp_ctrl) {
-		DRM_ERROR("Invalid input data\n");
-		return;
-	}
-
 	ctrl = container_of(dp_ctrl, struct dp_ctrl_private, dp_ctrl);
 	dp_io = &ctrl->parser->io;
 	phy = dp_io->phy;
 
-	dp_catalog_ctrl_enable_irq(ctrl->catalog, false);
+	dp_catalog_ctrl_phy_reset(ctrl->catalog);
 	phy_exit(phy);
-
-	DRM_DEBUG_DP("Host deinitialized successfully\n");
 }
 
 static bool dp_ctrl_use_fixed_nvid(struct dp_ctrl_private *ctrl)
@@ -1488,7 +1472,10 @@ static int dp_ctrl_deinitialize_mainlink(struct dp_ctrl_private *ctrl)
 	}
 
 	phy_power_off(phy);
+
+	/* aux channel down, reinit phy */
 	phy_exit(phy);
+	phy_init(phy);
 
 	return 0;
 }
@@ -1893,8 +1880,14 @@ int dp_ctrl_off_link_stream(struct dp_ctrl *dp_ctrl)
 		return ret;
 	}
 
+	DRM_DEBUG_DP("Before, phy=%x init_count=%d power_on=%d\n",
+		(u32)(uintptr_t)phy, phy->init_count, phy->power_count);
+
 	phy_power_off(phy);
+
+	DRM_DEBUG_DP("After, phy=%x init_c
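Condensed, the four-state split described in the commit message maps onto the new helpers like this (an ordering sketch only; dp->ctrl stands in for the driver's private data and error handling is omitted):

	/* resume (or boot): regulators, core clocks and irq on; HPD armed */
	dp_ctrl_reset_irq_ctrl(dp->ctrl, true);

	/* HPD plug-in: bring up the phy so dpcd access and link training work */
	dp_ctrl_phy_init(dp->ctrl);

	/* HPD unplug: phy off after the main link has been torn down */
	dp_ctrl_phy_exit(dp->ctrl);

	/* suspend: disarm the irq and reset the controller */
	dp_ctrl_reset_irq_ctrl(dp->ctrl, false);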
[PATCH v9 1/2] drm/msm/dp: dp_link_parse_sink_count() return immediately if aux read failed
Check the aux read/write status in both dp_link_parse_sink_count() and dp_link_parse_sink_status_field() to avoid long timeout delays if a dp aux read/write times out because the cable is unplugged. Also make sure the dp controller has been initialized before starting dpcd reads and writes.

Changes in V4:
-- split this patch as a stand-alone patch

Changes in V5:
-- rebase on msm-next branch

Changes in V6:
-- add more detailed commit text

Signed-off-by: Kuogee Hsieh
Reviewed-by: Stephen Boyd
Tested-by: Stephen Boyd
---
 drivers/gpu/drm/msm/dp/dp_display.c | 12 +---
 drivers/gpu/drm/msm/dp/dp_link.c    | 19 ++-
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c
index 3d61459..0766752 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -692,9 +692,15 @@ static int dp_irq_hpd_handle(struct dp_display_private *dp, u32 data)
 		return 0;
 	}
 
-	ret = dp_display_usbpd_attention_cb(&dp->pdev->dev);
-	if (ret == -ECONNRESET) { /* cable unplugged */
-		dp->core_initialized = false;
+	/*
+	 * dp core (ahb/aux clks) must be initialized before
+	 * irq_hpd be handled
+	 */
+	if (dp->core_initialized) {
+		ret = dp_display_usbpd_attention_cb(&dp->pdev->dev);
+		if (ret == -ECONNRESET) { /* cable unplugged */
+			dp->core_initialized = false;
+		}
 	}
 
 	DRM_DEBUG_DP("hpd_state=%d\n", state);
diff --git a/drivers/gpu/drm/msm/dp/dp_link.c b/drivers/gpu/drm/msm/dp/dp_link.c
index a5bdfc5..d4d31e5 100644
--- a/drivers/gpu/drm/msm/dp/dp_link.c
+++ b/drivers/gpu/drm/msm/dp/dp_link.c
@@ -737,18 +737,25 @@ static int dp_link_parse_sink_count(struct dp_link *dp_link)
 	return 0;
 }
 
-static void dp_link_parse_sink_status_field(struct dp_link_private *link)
+static int dp_link_parse_sink_status_field(struct dp_link_private *link)
 {
 	int len = 0;
 
 	link->prev_sink_count = link->dp_link.sink_count;
-	dp_link_parse_sink_count(&link->dp_link);
+	len = dp_link_parse_sink_count(&link->dp_link);
+	if (len < 0) {
+		DRM_ERROR("DP parse sink count failed\n");
+		return len;
+	}
 
 	len = drm_dp_dpcd_read_link_status(link->aux, link->link_status);
-	if (len < DP_LINK_STATUS_SIZE)
+	if (len < DP_LINK_STATUS_SIZE) {
 		DRM_ERROR("DP link status read failed\n");
-	dp_link_parse_request(link);
+		return len;
+	}
+
+	return dp_link_parse_request(link);
 }
 
 /**
@@ -1023,7 +1030,9 @@ int dp_link_process_request(struct dp_link *dp_link)
 
 	dp_link_reset_data(link);
 
-	dp_link_parse_sink_status_field(link);
+	ret = dp_link_parse_sink_status_field(link);
+	if (ret)
+		return ret;
 
 	if (link->request.test_requested == DP_TEST_LINK_EDID_READ) {
 		dp_link->sink_request |= DP_TEST_LINK_EDID_READ;
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH] drm/i915/guc: Use lockless list for destroyed contexts
On 12/22/2021 15:29, Matthew Brost wrote:

Use a lockless list structure for destroyed contexts to avoid hammering on the global submission spin lock.

I thought the guidance was that lockless anything without an explanation longer than War And Peace comes with an automatic termination penalty?

Also, I thought the simple suggestion was to just move the entire list sideways under the existing lock and then loop through the local list safely without requiring locks because it is now local only.

John.

Suggested-by: Tvrtko Ursulin
Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/i915/gt/intel_context.c       |  2 -
 drivers/gpu/drm/i915/gt/intel_context_types.h |  3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |  3 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 44 +--
 4 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 5d0ec7c49b6a..4aacb4b0418d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -403,8 +403,6 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
 	ce->guc_id.id = GUC_INVALID_LRC_ID;
 	INIT_LIST_HEAD(&ce->guc_id.link);
 
-	INIT_LIST_HEAD(&ce->destroyed_link);
-
 	INIT_LIST_HEAD(&ce->parallel.child_list);
 
 	/*
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 30cd81ad8911..4532d43ec9c0 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include <linux/llist.h>
 #include
 #include
 
@@ -224,7 +225,7 @@ struct intel_context {
 	 * list when context is pending to be destroyed (deregistered with the
	 * GuC), protected by guc->submission_state.lock
	 */
-	struct list_head destroyed_link;
+	struct llist_node destroyed_link;
 
 	/** @parallel: sub-structure for parallel submission members */
 	struct {
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index f9240d4baa69..705085058411 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -8,6 +8,7 @@
 #include
 #include
+#include <linux/llist.h>
 
 #include "intel_uncore.h"
 #include "intel_guc_fw.h"
@@ -112,7 +113,7 @@ struct intel_guc {
 	 * @destroyed_contexts: list of contexts waiting to be destroyed
	 * (deregistered with the GuC)
	 */
-	struct list_head destroyed_contexts;
+	struct llist_head destroyed_contexts;
 	/**
	 * @destroyed_worker: worker to deregister contexts, need as we
	 * need to take a GT PM reference and can't from destroy
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 0a03a30e4c6d..6f7643edc139 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1771,7 +1771,7 @@ int intel_guc_submission_init(struct intel_guc *guc)
 	spin_lock_init(&guc->submission_state.lock);
 	INIT_LIST_HEAD(&guc->submission_state.guc_id_list);
 	ida_init(&guc->submission_state.guc_ids);
-	INIT_LIST_HEAD(&guc->submission_state.destroyed_contexts);
+	init_llist_head(&guc->submission_state.destroyed_contexts);
 	INIT_WORK(&guc->submission_state.destroyed_worker,
 		  destroyed_worker_func);
@@ -2696,26 +2696,18 @@ static void __guc_context_destroy(struct intel_context *ce)
 	}
 }
 
+#define take_destroyed_contexts(guc) \
+	llist_del_all(&guc->submission_state.destroyed_contexts)
+
 static void guc_flush_destroyed_contexts(struct intel_guc *guc)
 {
-	struct intel_context *ce;
-	unsigned long flags;
+	struct intel_context *ce, *cn;
 
 	GEM_BUG_ON(!submission_disabled(guc) &&
 		   guc_submission_initialized(guc));
 
-	while (!list_empty(&guc->submission_state.destroyed_contexts)) {
-		spin_lock_irqsave(&guc->submission_state.lock, flags);
-		ce = list_first_entry_or_null(&guc->submission_state.destroyed_contexts,
-					      struct intel_context,
-					      destroyed_link);
-		if (ce)
-			list_del_init(&ce->destroyed_link);
-		spin_unlock_irqrestore(&guc->submission_state.lock, flags);
-
-		if (!ce)
-			break;
-
+	llist_for_each_entry_safe(ce, cn, take_destroyed_contexts(guc),
+				  destroyed_link) {
 		release_guc_id(guc, ce);
 		__guc_context_destroy(ce);
 	}
@@ -2723,23 +2715,11 @@ static void guc_flush_destroyed_contexts(struct in
[Patch v4 12/24] drm/amdkfd: CRIU restore queue doorbell id
From: David Yat Sin

When re-creating queues during CRIU restore, restore the queue with the same doorbell id value used during CRIU dump.

Signed-off-by: David Yat Sin
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 60 +--
 1 file changed, 41 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 7e49f70b81b9..a0f5b8533a03 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -153,7 +153,13 @@ static void decrement_queue_count(struct device_queue_manager *dqm,
 		dqm->active_cp_queue_count--;
 }
 
-static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
+/*
+ * Allocate a doorbell ID to this queue.
+ * If doorbell_id is passed in, make sure requested ID is valid then allocate it.
+ */
+static int allocate_doorbell(struct qcm_process_device *qpd,
+			     struct queue *q,
+			     uint32_t const *restore_id)
 {
 	struct kfd_dev *dev = qpd->dqm->dev;
 
@@ -161,6 +167,10 @@ static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
 		/* On pre-SOC15 chips we need to use the queue ID to
		 * preserve the user mode ABI.
		 */
+
+		if (restore_id && *restore_id != q->properties.queue_id)
+			return -EINVAL;
+
 		q->doorbell_id = q->properties.queue_id;
 	} else if (q->properties.type == KFD_QUEUE_TYPE_SDMA ||
 			q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) {
@@ -169,25 +179,37 @@ static int allocate_doorbell(struct qcm_process_device *qpd, struct queue *q)
 		 * The doorbell index distance between RLC (2*i) and (2*i+1)
		 * for a SDMA engine is 512.
		 */
-		uint32_t *idx_offset =
-			dev->shared_resources.sdma_doorbell_idx;
 
-		q->doorbell_id = idx_offset[q->properties.sdma_engine_id]
-			+ (q->properties.sdma_queue_id & 1)
-			* KFD_QUEUE_DOORBELL_MIRROR_OFFSET
-			+ (q->properties.sdma_queue_id >> 1);
+		uint32_t *idx_offset = dev->shared_resources.sdma_doorbell_idx;
+		uint32_t valid_id = idx_offset[q->properties.sdma_engine_id]
+				    + (q->properties.sdma_queue_id & 1)
+				    * KFD_QUEUE_DOORBELL_MIRROR_OFFSET
+				    + (q->properties.sdma_queue_id >> 1);
+
+		if (restore_id && *restore_id != valid_id)
+			return -EINVAL;
+		q->doorbell_id = valid_id;
 	} else {
-		/* For CP queues on SOC15 reserve a free doorbell ID */
-		unsigned int found;
-
-		found = find_first_zero_bit(qpd->doorbell_bitmap,
-					    KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
-		if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
-			pr_debug("No doorbells available");
-			return -EBUSY;
+		/* For CP queues on SOC15 */
+		if (restore_id) {
+			/* make sure that ID is free  */
+			if (__test_and_set_bit(*restore_id, qpd->doorbell_bitmap))
+				return -EINVAL;
+
+			q->doorbell_id = *restore_id;
+		} else {
+			/* or reserve a free doorbell ID */
+			unsigned int found;
+
+			found = find_first_zero_bit(qpd->doorbell_bitmap,
+						    KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
+			if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
+				pr_debug("No doorbells available");
+				return -EBUSY;
+			}
+			set_bit(found, qpd->doorbell_bitmap);
+			q->doorbell_id = found;
 		}
-		set_bit(found, qpd->doorbell_bitmap);
-		q->doorbell_id = found;
 	}
 
 	q->properties.doorbell_off =
@@ -355,7 +377,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 		dqm->asic_ops.init_sdma_vm(dqm, q, qpd);
 	}
 
-	retval = allocate_doorbell(qpd, q);
+	retval = allocate_doorbell(qpd, q, qd ? &qd->doorbell_id : NULL);
 	if (retval)
 		goto out_deallocate_hqd;
 
@@ -1338,7 +1360,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 		goto out;
 	}
 
-	retval = allocate_doorbell(qpd, q);
+	retval = allocate_doorbell(qpd, q, qd ? &qd->doorbel
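The allocate-or-restore pattern the patch applies to CP doorbells can be shown in isolation (a minimal sketch using the kernel bitmap API; function and parameter names are illustrative):

/* Reserve 'restore_id' if given and free, otherwise grab the first free bit. */
static int alloc_id(unsigned long *bitmap, unsigned int size,
		    const u32 *restore_id, u32 *out_id)
{
	if (restore_id) {
		if (__test_and_set_bit(*restore_id, bitmap))
			return -EINVAL;		/* requested ID already in use */
		*out_id = *restore_id;
		return 0;
	}

	*out_id = find_first_zero_bit(bitmap, size);
	if (*out_id >= size)
		return -EBUSY;			/* no IDs available */
	set_bit(*out_id, bitmap);
	return 0;
}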
[Patch v4 19/24] drm/amdkfd: CRIU allow external mm for svm ranges
Both the svm_range_get_attr and svm_range_set_attr helpers use the mm struct from current, but for a checkpoint or restore operation, current->mm fetches the mm of the CRIU master process. So modify these helpers to accept the task mm for a target kfd process to support checkpoint/restore.

Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 88360f23eb61..7c92116153fe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3134,11 +3134,11 @@ static void svm_range_evict_svm_bo_worker(struct work_struct *work)
 }
 
 static int
-svm_range_set_attr(struct kfd_process *p, uint64_t start, uint64_t size,
-		   uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+svm_range_set_attr(struct kfd_process *p, struct mm_struct *mm,
+		   uint64_t start, uint64_t size, uint32_t nattr,
+		   struct kfd_ioctl_svm_attribute *attrs)
 {
 	struct amdkfd_process_info *process_info = p->kgd_process_info;
-	struct mm_struct *mm = current->mm;
 	struct list_head update_list;
 	struct list_head insert_list;
 	struct list_head remove_list;
@@ -3242,8 +3242,9 @@ svm_range_set_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 }
 
 static int
-svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
-		   uint32_t nattr, struct kfd_ioctl_svm_attribute *attrs)
+svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm,
+		   uint64_t start, uint64_t size, uint32_t nattr,
+		   struct kfd_ioctl_svm_attribute *attrs)
 {
 	DECLARE_BITMAP(bitmap_access, MAX_GPU_INSTANCE);
 	DECLARE_BITMAP(bitmap_aip, MAX_GPU_INSTANCE);
@@ -3253,7 +3254,6 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
 	bool get_accessible = false;
 	bool get_flags = false;
 	uint64_t last = start + size - 1UL;
-	struct mm_struct *mm = current->mm;
 	uint8_t granularity = 0xff;
 	struct interval_tree_node *node;
 	struct svm_range_list *svms;
@@ -3422,6 +3422,7 @@ int
 svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op op, uint64_t start,
 	  uint64_t size, uint32_t nattrs, struct kfd_ioctl_svm_attribute *attrs)
 {
+	struct mm_struct *mm = current->mm;
 	int r;
 
 	start >>= PAGE_SHIFT;
@@ -3429,10 +3430,10 @@ svm_ioctl(struct kfd_process *p, enum kfd_ioctl_svm_op op, uint64_t start,
 
 	switch (op) {
 	case KFD_IOCTL_SVM_OP_SET_ATTR:
-		r = svm_range_set_attr(p, start, size, nattrs, attrs);
+		r = svm_range_set_attr(p, mm, start, size, nattrs, attrs);
 		break;
 	case KFD_IOCTL_SVM_OP_GET_ATTR:
-		r = svm_range_get_attr(p, start, size, nattrs, attrs);
+		r = svm_range_get_attr(p, mm, start, size, nattrs, attrs);
 		break;
 	default:
 		r = EINVAL;
-- 
2.17.1
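The underlying rule the patch follows: when acting on behalf of another process, take a reference on that process's mm rather than dereferencing current->mm. In isolation (kernel context assumed; p is a kfd_process as in the series):

	struct mm_struct *mm = get_task_mm(p->lead_thread);	/* target, not current */
	if (!mm)
		return -ESRCH;		/* target has no address space, e.g. exiting */
	/* ... operate on the target's VMAs through 'mm' ... */
	mmput(mm);			/* drop the reference taken by get_task_mm() */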
[Patch v4 16/24] drm/amdkfd: CRIU implement gpu_id remapping
From: David Yat Sin

When doing a restore on a different node, the gpu_ids on the restore node may be different. But the user space application will still use the original gpu_ids in its ioctl calls. Add code to create a gpu_id mapping so that kfd can determine the actual gpu_id during the user ioctls.

Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      | 465 --
 drivers/gpu/drm/amd/amdkfd/kfd_events.c       |  45 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  11 +
 drivers/gpu/drm/amd/amdkfd/kfd_process.c      |  32 ++
 .../amd/amdkfd/kfd_process_queue_manager.c    |  18 +-
 5 files changed, 412 insertions(+), 159 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 08467fa2f514..20652d488cde 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -294,18 +294,20 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 		return err;
 
 	pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
-	dev = kfd_device_by_id(args->gpu_id);
-	if (!dev) {
-		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
-		return -EINVAL;
-	}
 
 	mutex_lock(&p->mutex);
+	pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+	if (!pdd) {
+		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+		err = -EINVAL;
+		goto err_unlock;
+	}
+	dev = pdd->dev;
 
 	pdd = kfd_bind_process_to_device(dev, p);
 	if (IS_ERR(pdd)) {
 		err = -ESRCH;
-		goto err_bind_process;
+		goto err_unlock;
 	}
 
 	pr_debug("Creating queue for PASID 0x%x on gpu 0x%x\n",
@@ -315,7 +317,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL, NULL,
 			&doorbell_offset_in_process);
 	if (err != 0)
-		goto err_create_queue;
+		goto err_unlock;
 
 	args->queue_id = queue_id;
 
@@ -344,8 +346,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 
 	return 0;
 
-err_create_queue:
-err_bind_process:
+err_unlock:
 	mutex_unlock(&p->mutex);
 	return err;
 }
@@ -492,7 +493,6 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
 					struct kfd_process *p, void *data)
 {
 	struct kfd_ioctl_set_memory_policy_args *args = data;
-	struct kfd_dev *dev;
 	int err = 0;
 	struct kfd_process_device *pdd;
 	enum cache_policy default_policy, alternate_policy;
@@ -507,13 +507,15 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
 		return -EINVAL;
 	}
 
-	dev = kfd_device_by_id(args->gpu_id);
-	if (!dev)
-		return -EINVAL;
-
 	mutex_lock(&p->mutex);
+	pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+	if (!pdd) {
+		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
+		err = -EINVAL;
+		goto out;
+	}
 
-	pdd = kfd_bind_process_to_device(dev, p);
+	pdd = kfd_bind_process_to_device(pdd->dev, p);
 	if (IS_ERR(pdd)) {
 		err = -ESRCH;
 		goto out;
@@ -526,7 +528,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
 		   (args->alternate_policy == KFD_IOC_CACHE_POLICY_COHERENT)
 		   ? cache_policy_coherent : cache_policy_noncoherent;
 
-	if (!dev->dqm->ops.set_cache_memory_policy(dev->dqm,
+	if (!pdd->dev->dqm->ops.set_cache_memory_policy(pdd->dev->dqm,
 				&pdd->qpd,
 				default_policy,
 				alternate_policy,
@@ -544,17 +546,18 @@ static int kfd_ioctl_set_trap_handler(struct file *filep,
 					struct kfd_process *p, void *data)
 {
 	struct kfd_ioctl_set_trap_handler_args *args = data;
-	struct kfd_dev *dev;
 	int err = 0;
 	struct kfd_process_device *pdd;
 
-	dev = kfd_device_by_id(args->gpu_id);
-	if (!dev)
-		return -EINVAL;
-
 	mutex_lock(&p->mutex);
 
-	pdd = kfd_bind_process_to_device(dev, p);
+	pdd = kfd_process_device_data_by_id(p, args->gpu_id);
+	if (!pdd) {
+		err = -EINVAL;
+		goto out;
+	}
+
+	pdd = kfd_bind_process_to_device(pdd->dev, p);
 	if (IS_ERR(pdd)) {
 		err = -ESRCH;
 		goto out;
@@ -578,16 +581,20 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 	bool create_ok;
 	long status = 0;
 
-	dev = kfd_device_by_id(args->gpu_id);
-	if (!dev)
-
[Patch v4 22/24] drm/amdkfd: CRIU Save Shared Virtual Memory ranges
During the checkpoint stage, save the shared virtual memory ranges and attributes for the target process. A process may contain a number of svm ranges and each range might contain a number of attributes. Not all attributes may be applicable for a given prange, but during checkpoint we store all possible values for the maximum possible attribute types.

Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 95 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     | 10 +++
 3 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 1c25d5e9067c..916b8d000317 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2186,7 +2186,9 @@ static int criu_checkpoint(struct file *filep,
 		if (ret)
 			goto close_bo_fds;
 
-		/* TODO: Dump SVM-Ranges */
+		ret = kfd_criu_checkpoint_svm(p, (uint8_t __user *)args->priv_data, &priv_offset);
+		if (ret)
+			goto close_bo_fds;
 	}
 
 close_bo_fds:
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 49e05fb5c898..6d59f1bedcf2 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -3478,6 +3478,101 @@ int svm_range_get_info(struct kfd_process *p, uint32_t *num_svm_ranges,
 	return 0;
 }
 
+int kfd_criu_checkpoint_svm(struct kfd_process *p,
+			    uint8_t __user *user_priv_data,
+			    uint64_t *priv_data_offset)
+{
+	struct kfd_criu_svm_range_priv_data *svm_priv = NULL;
+	struct kfd_ioctl_svm_attribute *query_attr = NULL;
+	uint64_t svm_priv_data_size, query_attr_size = 0;
+	int index, nattr_common = 4, ret = 0;
+	struct svm_range_list *svms;
+	int num_devices = p->n_pdds;
+	struct svm_range *prange;
+	struct mm_struct *mm;
+
+	svms = &p->svms;
+	if (!svms)
+		return -EINVAL;
+
+	mm = get_task_mm(p->lead_thread);
+	if (!mm) {
+		pr_err("failed to get mm for the target process\n");
+		return -ESRCH;
+	}
+
+	query_attr_size = sizeof(struct kfd_ioctl_svm_attribute) *
+				(nattr_common + num_devices);
+
+	query_attr = kzalloc(query_attr_size, GFP_KERNEL);
+	if (!query_attr) {
+		ret = -ENOMEM;
+		goto exit;
+	}
+
+	query_attr[0].type = KFD_IOCTL_SVM_ATTR_PREFERRED_LOC;
+	query_attr[1].type = KFD_IOCTL_SVM_ATTR_PREFETCH_LOC;
+	query_attr[2].type = KFD_IOCTL_SVM_ATTR_SET_FLAGS;
+	query_attr[3].type = KFD_IOCTL_SVM_ATTR_GRANULARITY;
+
+	for (index = 0; index < num_devices; index++) {
+		struct kfd_process_device *pdd = p->pdds[index];
+
+		query_attr[index + nattr_common].type =
+			KFD_IOCTL_SVM_ATTR_ACCESS;
+		query_attr[index + nattr_common].value = pdd->user_gpu_id;
+	}
+
+	svm_priv_data_size = sizeof(*svm_priv) + query_attr_size;
+
+	svm_priv = kzalloc(svm_priv_data_size, GFP_KERNEL);
+	if (!svm_priv) {
+		ret = -ENOMEM;
+		goto exit_query;
+	}
+
+	index = 0;
+	list_for_each_entry(prange, &svms->list, list) {
+
+		svm_priv->object_type = KFD_CRIU_OBJECT_TYPE_SVM_RANGE;
+		svm_priv->start_addr = prange->start;
+		svm_priv->size = prange->npages;
+		memcpy(&svm_priv->attrs, query_attr, query_attr_size);
+		pr_debug("CRIU: prange: 0x%p start: 0x%lx\t npages: 0x%llx end: 0x%llx\t size: 0x%llx\n",
+			 prange, prange->start, prange->npages,
+			 prange->start + prange->npages - 1,
+			 prange->npages * PAGE_SIZE);
+
+		ret = svm_range_get_attr(p, mm, svm_priv->start_addr,
+					 svm_priv->size,
+					 (nattr_common + num_devices),
+					 svm_priv->attrs);
+		if (ret) {
+			pr_err("CRIU: failed to obtain range attributes\n");
+			goto exit_priv;
+		}
+
+		ret = copy_to_user(user_priv_data + *priv_data_offset,
+				   svm_priv, svm_priv_data_size);
+		if (ret) {
+			pr_err("Failed to copy svm priv to user\n");
+			goto exit_priv;
+		}
+
+		*priv_data_offset += svm_priv_data_size;
+
+	}
+
+
+exit_priv:
+	kfree(svm_priv);
+exit_query:
+	kfree(query_attr);
+exit:
+	mmput(mm);
+	return ret;
+}
+
 int
 svm_ioctl(struct kfd_process *p, enum
[Patch v4 23/24] drm/amdkfd: CRIU prepare for svm resume
During the CRIU restore phase, the VMAs for the virtual address ranges are not at their final location yet, so at this stage only cache the data required to successfully resume the svm ranges during the imminent CRIU resume phase.

Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c |  4 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  5 ++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 99 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     | 12 +++
 4 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 916b8d000317..f7aa15b18f95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -2638,8 +2638,8 @@ static int criu_restore_objects(struct file *filep,
 				goto exit;
 			break;
 		case KFD_CRIU_OBJECT_TYPE_SVM_RANGE:
-			/* TODO: Implement SVM range */
-			*priv_offset += sizeof(struct kfd_criu_svm_range_priv_data);
+			ret = kfd_criu_restore_svm(p, (uint8_t __user *)args->priv_data,
+						   priv_offset, max_priv_data_size);
 			if (ret)
 				goto exit;
 			break;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 87eb6739a78e..92191c541c29 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -790,6 +790,7 @@ struct svm_range_list {
 	struct list_head		list;
 	struct work_struct		deferred_list_work;
 	struct list_head		deferred_range_list;
+	struct list_head		criu_svm_metadata_list;
 	spinlock_t			deferred_list_lock;
 	atomic_t			evicted_ranges;
 	bool				drain_pagefaults;
@@ -1148,6 +1149,10 @@ int kfd_criu_restore_event(struct file *devkfd,
 			   uint8_t __user *user_priv_data,
 			   uint64_t *priv_data_offset,
 			   uint64_t max_priv_data_size);
+int kfd_criu_restore_svm(struct kfd_process *p,
+			 uint8_t __user *user_priv_data,
+			 uint64_t *priv_data_offset,
+			 uint64_t max_priv_data_size);
 /* CRIU - End */
 
 /* Queue Context Management */
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 6d59f1bedcf2..e9f6c63c2a26 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -45,6 +45,14 @@
  */
 #define AMDGPU_SVM_RANGE_RETRY_FAULT_PENDING	2000
 
+struct criu_svm_metadata {
+	struct list_head list;
+	__u64 start_addr;
+	__u64 size;
+	/* Variable length array of attributes */
+	struct kfd_ioctl_svm_attribute attrs[0];
+};
+
 static void svm_range_evict_svm_bo_worker(struct work_struct *work);
 static bool
 svm_range_cpu_invalidate_pagetables(struct mmu_interval_notifier *mni,
@@ -2753,6 +2761,7 @@ int svm_range_list_init(struct kfd_process *p)
 	INIT_DELAYED_WORK(&svms->restore_work, svm_range_restore_work);
 	INIT_WORK(&svms->deferred_list_work, svm_range_deferred_list_work);
 	INIT_LIST_HEAD(&svms->deferred_range_list);
+	INIT_LIST_HEAD(&svms->criu_svm_metadata_list);
 	spin_lock_init(&svms->deferred_list_lock);
 
 	for (i = 0; i < p->n_pdds; i++)
@@ -3418,6 +3427,96 @@ svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm,
 	return 0;
 }
 
+int svm_criu_prepare_for_resume(struct kfd_process *p,
+				struct kfd_criu_svm_range_priv_data *svm_priv)
+{
+	int nattr_common = 4, nattr_accessibility = 1;
+	struct criu_svm_metadata *criu_svm_md = NULL;
+	uint64_t svm_attrs_size, svm_object_md_size;
+	struct svm_range_list *svms = &p->svms;
+	int num_devices = p->n_pdds;
+	int i, ret = 0;
+
+	svm_attrs_size = sizeof(struct kfd_ioctl_svm_attribute) *
+		(nattr_common + nattr_accessibility * num_devices);
+	svm_object_md_size = sizeof(struct criu_svm_metadata) + svm_attrs_size;
+
+	criu_svm_md = kzalloc(svm_object_md_size, GFP_KERNEL);
+	if (!criu_svm_md) {
+		pr_err("failed to allocate memory to store svm metadata\n");
+		ret = -ENOMEM;
+		goto exit;
+	}
+
+	criu_svm_md->start_addr = svm_priv->start_addr;
+	criu_svm_md->size = svm_priv->size;
+	for (i = 0; i < nattr_common + nattr_accessibility * num_devices; i++) {
+		criu_svm_md->attrs[i].type = svm_priv->attrs[i].type;
+		criu_svm_md->attrs[i].value = svm_priv->attrs[i].value;
+	}
+
+	list_add_tail(&criu_svm_md->list, &svms->criu_svm_m
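The metadata node uses the classic header-plus-variable-length-tail allocation. In isolation the pattern is (a generic sketch; struct_size() from <linux/overflow.h> computes header plus n trailing elements with overflow checking, and num_attrs is assumed to be computed as above):

	struct criu_svm_metadata *md;

	/* one allocation covers the header and num_attrs attribute slots */
	md = kzalloc(struct_size(md, attrs, num_attrs), GFP_KERNEL);
	if (!md)
		return -ENOMEM;
	/* md->attrs[0 .. num_attrs-1] are now valid */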
[Patch v4 14/24] drm/amdkfd: CRIU checkpoint and restore queue control stack
From: David Yat Sin

Checkpoint contents of queue control stacks on CRIU dump and restore them during CRIU restore.

Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c      |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c       |  2 +-
 .../drm/amd/amdkfd/kfd_device_queue_manager.c | 23 ---
 .../drm/amd/amdkfd/kfd_device_queue_manager.h |  9 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h  | 11 +++-
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c  | 13 ++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c  | 14 +++--
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c   | 29 +++++----
 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c   | 22 +--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h         |  5 +-
 .../amd/amdkfd/kfd_process_queue_manager.c    | 62 +--
 11 files changed, 139 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 146879cd3f2b..582b4a393f95 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 			p->pasid, dev->id);
 
-	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL,
+	err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL, NULL,
 			&doorbell_offset_in_process);
 	if (err != 0)
 		goto err_create_queue;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index 3a5303ebcabf..8eca9ed3ab36 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 	properties.type = KFD_QUEUE_TYPE_DIQ;
 
 	status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
-				&properties, &qid, NULL, NULL, NULL);
+				&properties, &qid, NULL, NULL, NULL, NULL);
 
 	if (status) {
 		pr_err("Failed to create DIQ\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index a92274f9f1f7..248e69c7960b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -332,7 +332,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 				struct queue *q,
 				struct qcm_process_device *qpd,
 				const struct kfd_criu_queue_priv_data *qd,
-				const void *restore_mqd)
+				const void *restore_mqd, const void *restore_ctl_stack)
 {
 	struct mqd_manager *mqd_mgr;
 	int retval;
@@ -394,7 +394,8 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 	if (qd)
 		mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr,
-				     &q->properties, restore_mqd);
+				     &q->properties, restore_mqd, restore_ctl_stack,
+				     qd->ctl_stack_size);
 	else
 		mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
 					&q->gart_mqd_addr, &q->properties);
@@ -1347,7 +1348,7 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 			struct qcm_process_device *qpd,
 			const struct kfd_criu_queue_priv_data *qd,
-			const void *restore_mqd)
+			const void *restore_mqd, const void *restore_ctl_stack)
 {
 	int retval;
 	struct mqd_manager *mqd_mgr;
@@ -1393,9 +1394,11 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	 * updates the is_evicted flag but is a no-op otherwise.
	 */
 	q->properties.is_evicted = !!qpd->evicted;
+
 	if (qd)
 		mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr,
-				     &q->properties, restore_mqd);
+				     &q->properties, restore_mqd, restore_ctl_stack,
+				     qd->ctl_stack_size);
 	else
 		mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj,
 					&q->gart_mqd_addr, &q->properties);
@@ -1788,7 +1791,8 @@ static int get_wave_state(struct device_queue_manager *dqm,
 
 static void get_queue_checkpoint_info(struct device_queue_manager *dqm,
 				      const struct queue *q,
-				      u32 *mqd_
[Patch v4 18/24] drm/amdkfd: CRIU checkpoint and restore xnack mode
Recoverable page faults are controlled by the XNACK mode setting inside a KFD process, which determines how the device handles page faults. For CR, we don't consider negative values, which are typically used for querying the current XNACK mode without modifying it. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 15 +++ drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 + 2 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 178b0ccfb286..446eb9310915 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1845,6 +1845,11 @@ static int criu_checkpoint_process(struct kfd_process *p, memset(&process_priv, 0, sizeof(process_priv)); process_priv.version = KFD_CRIU_PRIV_VERSION; + /* For CR, we don't consider negative xnack mode which is used for +* querying without changing it, here 0 simply means disabled and 1 +* means enabled so retry for finding a valid PTE. +*/ + process_priv.xnack_mode = p->xnack_enabled ? 1 : 0; ret = copy_to_user(user_priv_data + *priv_offset, &process_priv, sizeof(process_priv)); @@ -2231,6 +2236,16 @@ static int criu_restore_process(struct kfd_process *p, return -EINVAL; } + pr_debug("Setting XNACK mode\n"); + if (process_priv.xnack_mode && !kfd_process_xnack_mode(p, true)) { + pr_err("xnack mode cannot be set\n"); + ret = -EPERM; + goto exit; + } else { + pr_debug("set xnack mode: %d\n", process_priv.xnack_mode); + p->xnack_enabled = process_priv.xnack_mode; + } + exit: return ret; } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 855c162b85ea..d72dda84c18c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -1057,6 +1057,7 @@ void kfd_process_set_trap_handler(struct qcm_process_device *qpd, struct kfd_criu_process_priv_data { uint32_t version; + uint32_t xnack_mode; }; struct kfd_criu_device_priv_data { -- 2.17.1
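For reference, the query convention the message above relies on can be exercised from userspace through the existing AMDKFD_IOC_SET_XNACK_MODE ioctl, where a negative value queries without modifying. A minimal sketch, assuming the uapi header install path; error handling is elided:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kfd_ioctl.h>	/* assumed install path of the KFD uapi header */

/* Query the current XNACK mode without changing it: a negative
 * xnack_enabled asks the ioctl to report rather than set. */
static int query_xnack_mode(int kfd_fd, uint32_t *mode)
{
	struct kfd_ioctl_set_xnack_mode_args args = { .xnack_enabled = -1 };

	if (ioctl(kfd_fd, AMDKFD_IOC_SET_XNACK_MODE, &args))
		return -1;
	*mode = args.xnack_enabled ? 1 : 0;	/* normalized 0/1, as stored in the image */
	return 0;
}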
[Patch v4 11/24] drm/amdkfd: CRIU restore sdma id for queues
From: David Yat Sin When re-creating queues during CRIU restore, restore the queue with the same sdma id value used during CRIU dump. Signed-off-by: David Yat Sin --- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 +- .../amd/amdkfd/kfd_process_queue_manager.c| 4 +- 3 files changed, 40 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index 62fe28244a80..7e49f70b81b9 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -58,7 +58,7 @@ static inline void deallocate_hqd(struct device_queue_manager *dqm, struct queue *q); static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q); static int allocate_sdma_queue(struct device_queue_manager *dqm, - struct queue *q); + struct queue *q, const uint32_t *restore_sdma_id); static void kfd_process_hw_exception(struct work_struct *work); static inline @@ -308,7 +308,8 @@ static void deallocate_vmid(struct device_queue_manager *dqm, static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *q, - struct qcm_process_device *qpd) + struct qcm_process_device *qpd, + const struct kfd_criu_queue_priv_data *qd) { struct mqd_manager *mqd_mgr; int retval; @@ -348,7 +349,7 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, q->pipe, q->queue); } else if (q->properties.type == KFD_QUEUE_TYPE_SDMA || q->properties.type == KFD_QUEUE_TYPE_SDMA_XGMI) { - retval = allocate_sdma_queue(dqm, q); + retval = allocate_sdma_queue(dqm, q, qd ? &qd->sdma_id : NULL); if (retval) goto deallocate_vmid; dqm->asic_ops.init_sdma_vm(dqm, q, qpd); @@ -1040,7 +1041,7 @@ static void pre_reset(struct device_queue_manager *dqm) } static int allocate_sdma_queue(struct device_queue_manager *dqm, - struct queue *q) + struct queue *q, const uint32_t *restore_sdma_id) { int bit; @@ -1050,9 +1051,21 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm, return -ENOMEM; } - bit = __ffs64(dqm->sdma_bitmap); - dqm->sdma_bitmap &= ~(1ULL << bit); - q->sdma_id = bit; + if (restore_sdma_id) { + /* Re-use existing sdma_id */ + if (!(dqm->sdma_bitmap & (1ULL << *restore_sdma_id))) { + pr_err("SDMA queue already in use\n"); + return -EBUSY; + } + dqm->sdma_bitmap &= ~(1ULL << *restore_sdma_id); + q->sdma_id = *restore_sdma_id; + } else { + /* Find first available sdma_id */ + bit = __ffs64(dqm->sdma_bitmap); + dqm->sdma_bitmap &= ~(1ULL << bit); + q->sdma_id = bit; + } + q->properties.sdma_engine_id = q->sdma_id % get_num_sdma_engines(dqm); q->properties.sdma_queue_id = q->sdma_id / @@ -1062,9 +1075,19 @@ static int allocate_sdma_queue(struct device_queue_manager *dqm, pr_err("No more XGMI SDMA queue to allocate\n"); return -ENOMEM; } - bit = __ffs64(dqm->xgmi_sdma_bitmap); - dqm->xgmi_sdma_bitmap &= ~(1ULL << bit); - q->sdma_id = bit; + if (restore_sdma_id) { + /* Re-use existing sdma_id */ + if (!(dqm->xgmi_sdma_bitmap & (1ULL << *restore_sdma_id))) { + pr_err("SDMA queue already in use\n"); + return -EBUSY; + } + dqm->xgmi_sdma_bitmap &= ~(1ULL << *restore_sdma_id); + q->sdma_id = *restore_sdma_id; + } else { + bit = __ffs64(dqm->xgmi_sdma_bitmap); + dqm->xgmi_sdma_bitmap &= ~(1ULL << bit); + q->sdma_id = bit; + } /* sdma_engine_id is sdma id including * both PCIe-optimized SDMAs and XGMI- * optimized SDMAs. 
The calculation below @@ -1293,7 +1316,8 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm, } static int create_queue_cpsch(struct device_qu
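Aside: the allocate-versus-restore logic above repeats verbatim for the plain-SDMA and XGMI-SDMA bitmaps. Distilled into a standalone C sketch (the helper name and error values are illustrative, not from the patch):

#include <errno.h>
#include <stdint.h>

/* Take a caller-specified bit (restore path) or the first free bit
 * (fresh allocation) from a 64-bit free-id bitmap, mirroring
 * allocate_sdma_queue() above. */
static int take_id(uint64_t *bitmap, const uint32_t *restore_id, uint32_t *id)
{
	if (!*bitmap)
		return -ENOMEM;			/* no free ids left */
	if (restore_id) {
		if (!(*bitmap & (1ULL << *restore_id)))
			return -EBUSY;		/* requested id already in use */
		*id = *restore_id;
	} else {
		*id = __builtin_ctzll(*bitmap);	/* first set bit, like __ffs64() */
	}
	*bitmap &= ~(1ULL << *id);
	return 0;
}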
[Patch v4 21/24] drm/amdkfd: CRIU Discover svm ranges
A KFD process may contain a number of virtual address ranges for shared virtual memory management and each such range can have many SVM attributes spanning across various nodes within the process boundary. This change reports the total number of such SVM ranges and their total private data size by extending the PROCESS_INFO op of the CRIU IOCTL to discover the SVM ranges in the target process; future patches bring in the required support for checkpoint and restore of SVM ranges. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12 +++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 5 +- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 60 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 11 + 4 files changed, 82 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 446eb9310915..1c25d5e9067c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2089,10 +2089,9 @@ static int criu_get_process_object_info(struct kfd_process *p, uint32_t *num_objects, uint64_t *objs_priv_size) { - int ret; - uint64_t priv_size; + uint64_t queues_priv_data_size, svm_priv_data_size, priv_size; uint32_t num_queues, num_events, num_svm_ranges; - uint64_t queues_priv_data_size; + int ret; *num_devices = p->n_pdds; *num_bos = get_process_num_bos(p); @@ -2102,7 +2101,10 @@ static int criu_get_process_object_info(struct kfd_process *p, return ret; num_events = kfd_get_num_events(p); - num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */ + + ret = svm_range_get_info(p, &num_svm_ranges, &svm_priv_data_size); + if (ret) + return ret; *num_objects = num_queues + num_events + num_svm_ranges; @@ -2112,7 +2114,7 @@ static int criu_get_process_object_info(struct kfd_process *p, priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data); priv_size += queues_priv_data_size; priv_size += num_events * sizeof(struct kfd_criu_event_priv_data); - /* TODO: Add SVM ranges priv size */ + priv_size += svm_priv_data_size; *objs_priv_size = priv_size; } return 0; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index d72dda84c18c..87eb6739a78e 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -1082,7 +1082,10 @@ enum kfd_criu_object_type { struct kfd_criu_svm_range_priv_data { uint32_t object_type; - uint64_t reserved; + uint64_t start_addr; + uint64_t size; + /* Variable length array of attributes */ + struct kfd_ioctl_svm_attribute attrs[0]; }; struct kfd_criu_queue_priv_data { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 7c92116153fe..49e05fb5c898 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -3418,6 +3418,66 @@ svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm, return 0; } +int svm_range_get_info(struct kfd_process *p, uint32_t *num_svm_ranges, + uint64_t *svm_priv_data_size) +{ + uint64_t total_size, accessibility_size, common_attr_size; + int nattr_common = 4, nattr_accessibility = 1; + int num_devices = p->n_pdds; + struct svm_range_list *svms; + struct svm_range *prange; + uint32_t count = 0; + + *svm_priv_data_size = 0; + + svms = &p->svms; + if (!svms) + return -EINVAL; + + mutex_lock(&svms->lock); + list_for_each_entry(prange, &svms->list, list) { + pr_debug("prange: 0x%p start: 0x%lx\t npages: 0x%llx\t end: 0x%llx\n", +prange, prange->start, prange->npages, +prange->start + 
prange->npages - 1); + count++; + } + mutex_unlock(&svms->lock); + + *num_svm_ranges = count; + /* Only the accessibility attributes need to be queried for all the gpus +* individually, remaining ones are spanned across the entire process +* regardless of the various gpu nodes. Of the remaining attributes, +* KFD_IOCTL_SVM_ATTR_CLR_FLAGS need not be saved. +* +* KFD_IOCTL_SVM_ATTR_PREFERRED_LOC +* KFD_IOCTL_SVM_ATTR_PREFETCH_LOC +* KFD_IOCTL_SVM_ATTR_SET_FLAGS +* KFD_IOCTL_SVM_ATTR_GRANULARITY +* +* ** ACCESSIBILITY ATTRIBUTES ** +* (Considered as one, type is altered during query, value is gpuid) +* KFD_IOCTL_SVM_ATTR_ACCESS +* KFD_IOCTL_SVM_ATTR_ACCESS_IN_PLACE +* KFD_IOCTL_SVM_ATTR_NO_ACCESS +*/
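The fragment above cuts off before the actual size arithmetic. Based on the counters svm_range_get_info() declares, the remainder plausibly reduces to the computation below; this is a hedged sketch of the elided code, not the patch text:

	common_attr_size = sizeof(struct kfd_ioctl_svm_attribute) *
		nattr_common;
	accessibility_size = sizeof(struct kfd_ioctl_svm_attribute) *
		nattr_accessibility * num_devices;
	total_size = sizeof(struct kfd_criu_svm_range_priv_data) +
		common_attr_size + accessibility_size;

	/* one variable-length record per discovered range */
	*svm_priv_data_size = total_size * count;
	return 0;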
[Patch v4 15/24] drm/amdkfd: CRIU checkpoint and restore events
From: David Yat Sin Add support to existing CRIU ioctl's to save and restore events during criu checkpoint and restore. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 70 +- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 272 --- drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 27 ++- 3 files changed, 280 insertions(+), 89 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 582b4a393f95..08467fa2f514 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1009,57 +1009,11 @@ static int kfd_ioctl_create_event(struct file *filp, struct kfd_process *p, * through the event_page_offset field. */ if (args->event_page_offset) { - struct kfd_dev *kfd; - struct kfd_process_device *pdd; - void *mem, *kern_addr; - uint64_t size; - - kfd = kfd_device_by_id(GET_GPU_ID(args->event_page_offset)); - if (!kfd) { - pr_err("Getting device by id failed in %s\n", __func__); - return -EINVAL; - } - mutex_lock(&p->mutex); - - if (p->signal_page) { - pr_err("Event page is already set\n"); - err = -EINVAL; - goto out_unlock; - } - - pdd = kfd_bind_process_to_device(kfd, p); - if (IS_ERR(pdd)) { - err = PTR_ERR(pdd); - goto out_unlock; - } - - mem = kfd_process_device_translate_handle(pdd, - GET_IDR_HANDLE(args->event_page_offset)); - if (!mem) { - pr_err("Can't find BO, offset is 0x%llx\n", - args->event_page_offset); - err = -EINVAL; - goto out_unlock; - } - - err = amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel(kfd->adev, - mem, &kern_addr, &size); - if (err) { - pr_err("Failed to map event page to kernel\n"); - goto out_unlock; - } - - err = kfd_event_page_set(p, kern_addr, size); - if (err) { - pr_err("Failed to set event page\n"); - amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel(kfd->adev, mem); - goto out_unlock; - } - - p->signal_handle = args->event_page_offset; - + err = kfd_kmap_event_page(p, args->event_page_offset); mutex_unlock(&p->mutex); + if (err) + return err; } err = kfd_event_create(filp, p, args->event_type, @@ -1068,10 +1022,7 @@ static int kfd_ioctl_create_event(struct file *filp, struct kfd_process *p, &args->event_page_offset, &args->event_slot_index); - return err; - -out_unlock: - mutex_unlock(&p->mutex); + pr_debug("Created event (id:0x%08x) (%s)\n", args->event_id, __func__); return err; } @@ -2022,7 +1973,7 @@ static int criu_get_process_object_info(struct kfd_process *p, if (ret) return ret; - num_events = 0; /* TODO: Implement Events */ + num_events = kfd_get_num_events(p); num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */ *num_objects = num_queues + num_events + num_svm_ranges; @@ -2031,7 +1982,7 @@ static int criu_get_process_object_info(struct kfd_process *p, priv_size = sizeof(struct kfd_criu_process_priv_data); priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data); priv_size += queues_priv_data_size; - /* TODO: Add Events priv size */ + priv_size += num_events * sizeof(struct kfd_criu_event_priv_data); /* TODO: Add SVM ranges priv size */ *objs_priv_size = priv_size; } @@ -2093,7 +2044,10 @@ static int criu_checkpoint(struct file *filep, if (ret) goto exit_unlock; - /* TODO: Dump Events */ + ret = kfd_criu_checkpoint_events(p, (uint8_t __user *)args->priv_data, +&priv_offset); + if (ret) + goto exit_unlock; /* TODO: Dump SVM-Ranges */ } @@ -2406,8 +2360,8 @@ static int criu_restore_objects(struct file *filep, goto exit; break; case KFD_CRIU_OBJECT_TYPE_EVENT: - /* TODO: Implement Events */ - *priv_offset += sizeof(s
[Patch v4 17/24] drm/amdkfd: CRIU export BOs as prime dmabuf objects
KFD buffer objects do not have a GEM handle associated with them, so they cannot directly be used with libdrm to initiate a system DMA (sDMA) operation to speed up checkpoint and restore. Export them as dmabuf objects instead and use them with the libdrm helper (amdgpu_bo_import) to further process the sDMA command submissions. With sDMA, we see a huge improvement in checkpoint and restore operations compared to generic PCI-based access via the host data path. Suggested-by: Felix Kuehling Signed-off-by: Rajneesh Bhardwaj Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 71 +++- 1 file changed, 69 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 20652d488cde..178b0ccfb286 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include "kfd_priv.h" #include "kfd_device_queue_manager.h" @@ -43,6 +44,7 @@ #include "amdgpu_amdkfd.h" #include "kfd_smi_events.h" #include "amdgpu_object.h" +#include "amdgpu_dma_buf.h" static long kfd_ioctl(struct file *, unsigned int, unsigned long); static int kfd_open(struct inode *, struct file *); @@ -1932,6 +1934,33 @@ uint64_t get_process_num_bos(struct kfd_process *p) return num_of_bos; } +static int criu_get_prime_handle(struct drm_gem_object *gobj, int flags, + u32 *shared_fd) +{ + struct dma_buf *dmabuf; + int ret; + + dmabuf = amdgpu_gem_prime_export(gobj, flags); + if (IS_ERR(dmabuf)) { + ret = PTR_ERR(dmabuf); + pr_err("dmabuf export failed for the BO\n"); + return ret; + } + + ret = dma_buf_fd(dmabuf, flags); + if (ret < 0) { + pr_err("dmabuf create fd failed, ret:%d\n", ret); + goto out_free_dmabuf; + } + + *shared_fd = ret; + return 0; + +out_free_dmabuf: + dma_buf_put(dmabuf); + return ret; +} + static int criu_checkpoint_bos(struct kfd_process *p, uint32_t num_bos, uint8_t __user *user_bos, @@ -1992,6 +2021,14 @@ static int criu_checkpoint_bos(struct kfd_process *p, goto exit; } } + if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) { + ret = criu_get_prime_handle(&dumper_bo->tbo.base, + bo_bucket->alloc_flags & + KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE ? 
DRM_RDWR : 0, + &bo_bucket->dmabuf_fd); + if (ret) + goto exit; + } if (bo_bucket->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL) bo_bucket->offset = KFD_MMAP_TYPE_DOORBELL | KFD_MMAP_GPU_ID(pdd->dev->id); @@ -2031,6 +2068,10 @@ static int criu_checkpoint_bos(struct kfd_process *p, *priv_offset += num_bos * sizeof(*bo_privs); exit: + while (ret && bo_index--) { + if (bo_buckets[bo_index].alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) + close_fd(bo_buckets[bo_index].dmabuf_fd); + } kvfree(bo_buckets); kvfree(bo_privs); @@ -2131,16 +2172,28 @@ static int criu_checkpoint(struct file *filep, ret = kfd_criu_checkpoint_queues(p, (uint8_t __user *)args->priv_data, &priv_offset); if (ret) - goto exit_unlock; + goto close_bo_fds; ret = kfd_criu_checkpoint_events(p, (uint8_t __user *)args->priv_data, &priv_offset); if (ret) - goto exit_unlock; + goto close_bo_fds; /* TODO: Dump SVM-Ranges */ } +close_bo_fds: + if (ret) { + /* If IOCTL returns err, user assumes all FDs opened in criu_dump_bos are closed */ + uint32_t i; + struct kfd_criu_bo_bucket *bo_buckets = (struct kfd_criu_bo_bucket *) args->bos; + + for (i = 0; i < num_bos; i++) { + if (bo_buckets[i].alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) + close_fd(bo_buckets[i].dmabuf_fd); + } + } + exit_unlock: mutex_unlock(&p->mutex); if (ret) @@ -2335,6 +2388,7 @@ static int criu_restore_bos(struct kfd_process *p, struct kfd_criu_bo_priv_data *bo_priv; struct kfd_dev *dev;
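For the drain path the commit message describes, the fd returned in each bo_bucket is meant to be handed to libdrm. A hedged userspace sketch using libdrm's amdgpu_bo_import() (names from libdrm's amdgpu.h; the wrapper itself is illustrative):

#include <stdint.h>
#include <amdgpu.h>	/* libdrm amdgpu helper API */

/* Wrap a checkpoint-time dmabuf fd in a libdrm BO handle that sDMA
 * copy submissions can then use as a source or target. */
static int import_checkpoint_bo(amdgpu_device_handle dev, int dmabuf_fd,
				amdgpu_bo_handle *bo)
{
	struct amdgpu_bo_import_result res = {0};
	int ret;

	ret = amdgpu_bo_import(dev, amdgpu_bo_handle_type_dma_buf_fd,
			       (uint32_t)dmabuf_fd, &res);
	if (ret)
		return ret;
	*bo = res.buf_handle;
	return 0;
}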
[Patch v4 09/24] drm/amdkfd: CRIU add queues support
From: David Yat Sin Add support to existing CRIU ioctl's to save number of queues and queue properties for each queue during checkpoint and re-create queues on restore. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 110 - drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 43 +++- .../amd/amdkfd/kfd_process_queue_manager.c| 212 ++ 3 files changed, 357 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index db2bb302a8d4..9665c8657929 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2006,19 +2006,36 @@ static int criu_checkpoint_bos(struct kfd_process *p, return ret; } -static void criu_get_process_object_info(struct kfd_process *p, -uint32_t *num_bos, -uint64_t *objs_priv_size) +static int criu_get_process_object_info(struct kfd_process *p, + uint32_t *num_bos, + uint32_t *num_objects, + uint64_t *objs_priv_size) { + int ret; uint64_t priv_size; + uint32_t num_queues, num_events, num_svm_ranges; + uint64_t queues_priv_data_size; *num_bos = get_process_num_bos(p); + ret = kfd_process_get_queue_info(p, &num_queues, &queues_priv_data_size); + if (ret) + return ret; + + num_events = 0; /* TODO: Implement Events */ + num_svm_ranges = 0; /* TODO: Implement SVM-Ranges */ + + *num_objects = num_queues + num_events + num_svm_ranges; + if (objs_priv_size) { priv_size = sizeof(struct kfd_criu_process_priv_data); priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data); + priv_size += queues_priv_data_size; + /* TODO: Add Events priv size */ + /* TODO: Add SVM ranges priv size */ *objs_priv_size = priv_size; } + return 0; } static int criu_checkpoint(struct file *filep, @@ -2026,7 +2043,7 @@ static int criu_checkpoint(struct file *filep, struct kfd_ioctl_criu_args *args) { int ret; - uint32_t num_bos; + uint32_t num_bos, num_objects; uint64_t priv_size, priv_offset = 0; if (!args->bos || !args->priv_data) @@ -2048,9 +2065,12 @@ static int criu_checkpoint(struct file *filep, goto exit_unlock; } - criu_get_process_object_info(p, &num_bos, &priv_size); + ret = criu_get_process_object_info(p, &num_bos, &num_objects, &priv_size); + if (ret) + goto exit_unlock; if (num_bos != args->num_bos || + num_objects != args->num_objects || priv_size != args->priv_data_size) { ret = -EINVAL; @@ -2067,6 +2087,17 @@ static int criu_checkpoint(struct file *filep, if (ret) goto exit_unlock; + if (num_objects) { + ret = kfd_criu_checkpoint_queues(p, (uint8_t __user *)args->priv_data, +&priv_offset); + if (ret) + goto exit_unlock; + + /* TODO: Dump Events */ + + /* TODO: Dump SVM-Ranges */ + } + exit_unlock: mutex_unlock(&p->mutex); if (ret) @@ -2340,6 +2371,62 @@ static int criu_restore_bos(struct kfd_process *p, return ret; } +static int criu_restore_objects(struct file *filep, + struct kfd_process *p, + struct kfd_ioctl_criu_args *args, + uint64_t *priv_offset, + uint64_t max_priv_data_size) +{ + int ret = 0; + uint32_t i; + + BUILD_BUG_ON(offsetof(struct kfd_criu_queue_priv_data, object_type)); + BUILD_BUG_ON(offsetof(struct kfd_criu_event_priv_data, object_type)); + BUILD_BUG_ON(offsetof(struct kfd_criu_svm_range_priv_data, object_type)); + + for (i = 0; i < args->num_objects; i++) { + uint32_t object_type; + + if (*priv_offset + sizeof(object_type) > max_priv_data_size) { + pr_err("Invalid private data size\n"); + return -EINVAL; + } + + ret = get_user(object_type, (uint32_t __user *)(args->priv_data + *priv_offset)); + if (ret) { + pr_err("Failed to copy private 
information from user\n"); + goto exit; + } + + switch (object_type) { + case KFD_CRIU_OBJECT_TYPE_QUEUE: + ret = kfd_criu_restore_queue(p, (uint8_t __user *)args->priv_data, +priv_offse
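The BUILD_BUG_ON()s in criu_restore_objects() above pin object_type as the first member of every per-object private record; that is what lets the restorer peek the type of a variable-size record before it knows how big it is. A standalone sketch of that framing (the size_of callback is illustrative):

#include <stdint.h>
#include <string.h>

/* Walk a stream of variable-size records, each prefixed by a u32 type,
 * the way criu_restore_objects() iterates args->priv_data. */
static int walk_objects(const uint8_t *priv, size_t len,
			size_t (*size_of)(uint32_t type))
{
	size_t off = 0;

	while (off + sizeof(uint32_t) <= len) {
		uint32_t type;
		size_t sz;

		memcpy(&type, priv + off, sizeof(type));
		sz = size_of(type);	/* queue/event/svm-range record size */
		if (!sz || off + sz > len)
			return -1;	/* unknown type or truncated stream */
		/* dispatch on type here, as the switch above does */
		off += sz;
	}
	return 0;
}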
[Patch v4 02/24] x86/configs: Add rock-rel_defconfig for amd-feature-criu branch
- Add rock-rel_defconfig for release builds. Signed-off-by: Rajneesh Bhardwaj --- arch/x86/configs/rock-rel_defconfig | 4927 +++ 1 file changed, 4927 insertions(+) create mode 100644 arch/x86/configs/rock-rel_defconfig diff --git a/arch/x86/configs/rock-rel_defconfig b/arch/x86/configs/rock-rel_defconfig new file mode 100644 index ..f038ce7a0d06 --- /dev/null +++ b/arch/x86/configs/rock-rel_defconfig @@ -0,0 +1,4927 @@ +# +# Automatically generated file; DO NOT EDIT. +# Linux/x86 5.13.0 Kernel Configuration +# +CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0" +CONFIG_CC_IS_GCC=y +CONFIG_GCC_VERSION=70500 +CONFIG_CLANG_VERSION=0 +CONFIG_AS_IS_GNU=y +CONFIG_AS_VERSION=23000 +CONFIG_LD_IS_BFD=y +CONFIG_LD_VERSION=23000 +CONFIG_LLD_VERSION=0 +CONFIG_CC_CAN_LINK=y +CONFIG_CC_CAN_LINK_STATIC=y +CONFIG_CC_HAS_ASM_GOTO=y +CONFIG_CC_HAS_ASM_INLINE=y +CONFIG_IRQ_WORK=y +CONFIG_BUILDTIME_TABLE_SORT=y +CONFIG_THREAD_INFO_IN_TASK=y + +# +# General setup +# +CONFIG_INIT_ENV_ARG_LIMIT=32 +# CONFIG_COMPILE_TEST is not set +CONFIG_LOCALVERSION="-kfd" +# CONFIG_LOCALVERSION_AUTO is not set +CONFIG_BUILD_SALT="" +CONFIG_HAVE_KERNEL_GZIP=y +CONFIG_HAVE_KERNEL_BZIP2=y +CONFIG_HAVE_KERNEL_LZMA=y +CONFIG_HAVE_KERNEL_XZ=y +CONFIG_HAVE_KERNEL_LZO=y +CONFIG_HAVE_KERNEL_LZ4=y +CONFIG_HAVE_KERNEL_ZSTD=y +CONFIG_KERNEL_GZIP=y +# CONFIG_KERNEL_BZIP2 is not set +# CONFIG_KERNEL_LZMA is not set +# CONFIG_KERNEL_XZ is not set +# CONFIG_KERNEL_LZO is not set +# CONFIG_KERNEL_LZ4 is not set +# CONFIG_KERNEL_ZSTD is not set +CONFIG_DEFAULT_INIT="" +CONFIG_DEFAULT_HOSTNAME="(none)" +CONFIG_SWAP=y +CONFIG_SYSVIPC=y +CONFIG_SYSVIPC_SYSCTL=y +CONFIG_POSIX_MQUEUE=y +CONFIG_POSIX_MQUEUE_SYSCTL=y +# CONFIG_WATCH_QUEUE is not set +CONFIG_CROSS_MEMORY_ATTACH=y +CONFIG_USELIB=y +CONFIG_AUDIT=y +CONFIG_HAVE_ARCH_AUDITSYSCALL=y +CONFIG_AUDITSYSCALL=y + +# +# IRQ subsystem +# +CONFIG_GENERIC_IRQ_PROBE=y +CONFIG_GENERIC_IRQ_SHOW=y +CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y +CONFIG_GENERIC_PENDING_IRQ=y +CONFIG_GENERIC_IRQ_MIGRATION=y +CONFIG_HARDIRQS_SW_RESEND=y +CONFIG_IRQ_DOMAIN=y +CONFIG_IRQ_DOMAIN_HIERARCHY=y +CONFIG_GENERIC_MSI_IRQ=y +CONFIG_GENERIC_MSI_IRQ_DOMAIN=y +CONFIG_IRQ_MSI_IOMMU=y +CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y +CONFIG_GENERIC_IRQ_RESERVATION_MODE=y +CONFIG_IRQ_FORCED_THREADING=y +CONFIG_SPARSE_IRQ=y +# CONFIG_GENERIC_IRQ_DEBUGFS is not set +# end of IRQ subsystem + +CONFIG_CLOCKSOURCE_WATCHDOG=y +CONFIG_ARCH_CLOCKSOURCE_INIT=y +CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y +CONFIG_GENERIC_TIME_VSYSCALL=y +CONFIG_GENERIC_CLOCKEVENTS=y +CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y +CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y +CONFIG_GENERIC_CMOS_UPDATE=y +CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y +CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y + +# +# Timers subsystem +# +CONFIG_TICK_ONESHOT=y +CONFIG_NO_HZ_COMMON=y +# CONFIG_HZ_PERIODIC is not set +CONFIG_NO_HZ_IDLE=y +# CONFIG_NO_HZ_FULL is not set +CONFIG_NO_HZ=y +CONFIG_HIGH_RES_TIMERS=y +# end of Timers subsystem + +CONFIG_BPF=y +CONFIG_HAVE_EBPF_JIT=y +CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y + +# +# BPF subsystem +# +CONFIG_BPF_SYSCALL=y +# CONFIG_BPF_JIT is not set +# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set +# CONFIG_BPF_PRELOAD is not set +# end of BPF subsystem + +# CONFIG_PREEMPT_NONE is not set +CONFIG_PREEMPT_VOLUNTARY=y +# CONFIG_PREEMPT is not set +CONFIG_PREEMPT_COUNT=y + +# +# CPU/Task time and stats accounting +# +CONFIG_TICK_CPU_ACCOUNTING=y +# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set +# CONFIG_IRQ_TIME_ACCOUNTING is not set +CONFIG_BSD_PROCESS_ACCT=y 
+CONFIG_BSD_PROCESS_ACCT_V3=y +CONFIG_TASKSTATS=y +CONFIG_TASK_DELAY_ACCT=y +CONFIG_TASK_XACCT=y +CONFIG_TASK_IO_ACCOUNTING=y +# CONFIG_PSI is not set +# end of CPU/Task time and stats accounting + +# CONFIG_CPU_ISOLATION is not set + +# +# RCU Subsystem +# +CONFIG_TREE_RCU=y +# CONFIG_RCU_EXPERT is not set +CONFIG_SRCU=y +CONFIG_TREE_SRCU=y +CONFIG_TASKS_RCU_GENERIC=y +CONFIG_TASKS_RUDE_RCU=y +CONFIG_TASKS_TRACE_RCU=y +CONFIG_RCU_STALL_COMMON=y +CONFIG_RCU_NEED_SEGCBLIST=y +# end of RCU Subsystem + +CONFIG_BUILD_BIN2C=y +# CONFIG_IKCONFIG is not set +# CONFIG_IKHEADERS is not set +CONFIG_LOG_BUF_SHIFT=18 +CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 +CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13 +CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y + +# +# Scheduler features +# +# CONFIG_UCLAMP_TASK is not set +# end of Scheduler features + +CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y +CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y +CONFIG_CC_HAS_INT128=y +CONFIG_ARCH_SUPPORTS_INT128=y +CONFIG_NUMA_BALANCING=y +CONFIG_NUMA_BALANCING_DEFAULT_ENABLED=y +CONFIG_CGROUPS=y +CONFIG_PAGE_COUNTER=y +CONFIG_MEMCG=y +CONFIG_MEMCG_SWAP=y +CONFIG_MEMCG_KMEM=y +CONFIG_BLK_CGROUP=y +CONFIG_CGROUP_WRITEBACK=y +CONFIG_CGROUP_SCHED=y +CONFIG_FAIR_GROUP_SCHED=y +CONFIG_CFS_BANDWIDTH=y +# CONFIG_RT_GROUP_SCHED is not set +CONFIG_CGROUP_PIDS=y +# CONFIG_CGROUP_RDMA is not set +CONFIG_CGROUP_FREEZER=y +CONFIG_CGROUP_HUGETLB=y +CONFIG_CPUSETS=y +CONFIG_PROC_PID_C
[Patch v4 13/24] drm/amdkfd: CRIU checkpoint and restore queue mqds
From: David Yat Sin Checkpoint contents of queue MQD's on CRIU dump and restore them during CRIU restore. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 72 +++- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 14 +- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h | 7 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 67 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 68 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 68 .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 69 drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 5 + .../amd/amdkfd/kfd_process_queue_manager.c| 158 -- 11 files changed, 506 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 3fb155f756fd..146879cd3f2b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, p->pasid, dev->id); - err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, + err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, NULL, &doorbell_offset_in_process); if (err != 0) goto err_create_queue; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c index 0c50e67e2b51..3a5303ebcabf 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c @@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev) properties.type = KFD_QUEUE_TYPE_DIQ; status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL, - &properties, &qid, NULL, NULL); + &properties, &qid, NULL, NULL, NULL); if (status) { pr_err("Failed to create DIQ\n"); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index a0f5b8533a03..a92274f9f1f7 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -331,7 +331,8 @@ static void deallocate_vmid(struct device_queue_manager *dqm, static int create_queue_nocpsch(struct device_queue_manager *dqm, struct queue *q, struct qcm_process_device *qpd, - const struct kfd_criu_queue_priv_data *qd) + const struct kfd_criu_queue_priv_data *qd, + const void *restore_mqd) { struct mqd_manager *mqd_mgr; int retval; @@ -390,8 +391,14 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm, retval = -ENOMEM; goto out_deallocate_doorbell; } - mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, - &q->gart_mqd_addr, &q->properties); + + if (qd) + mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr, +&q->properties, restore_mqd); + else + mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, + &q->gart_mqd_addr, &q->properties); + if (q->properties.is_active) { if (!dqm->sched_running) { WARN_ONCE(1, "Load non-HWS mqd while stopped\n"); @@ -1339,7 +1346,8 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm, static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q, struct qcm_process_device *qpd, - const struct kfd_criu_queue_priv_data *qd) + const struct kfd_criu_queue_priv_data *qd, + const void *restore_mqd) { int retval; struct mqd_manager *mqd_mgr; @@ -1385,8 +1393,12 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q, * updates the is_evicted flag but is a no-op 
otherwise. */ q->properties.is_evicted = !!qpd->evicted; - mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, - &q->gart_mqd_addr, &q->properties); + if (qd) + mqd_mgr->restore_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, &q->gart_mqd_addr, +&q->properties, restore_mqd); + else + mqd_mgr->init_mqd(mqd_mgr, &q->mqd, q->mqd_mem_obj, + &q->gart_mqd_addr, &q->properties); list_add(&q->list, &qpd->que
[Patch v4 06/24] drm/amdkfd: CRIU Implement KFD restore ioctl
This implements the KFD CRIU Restore ioctl that lays the basic foundation for the CRIU restore operation. It provides support to create the buffer objects corresponding to Non-Paged system memory mapped for GPU and/or CPU access and lays the basic foundation for the userptr buffer objects, which will be added in a separate patch. This ioctl creates various types of buffer objects such as VRAM, MMIO, Doorbell, GTT based on the data sent from the userspace plugin. The data mostly contains the previously checkpointed KFD images from a KFD process. While restoring a CRIU process, attach the old IDR values to newly created BOs. This also adds the minimal GPU mapping support for a single-GPU checkpoint-restore use case. Signed-off-by: David Yat Sin Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 298 ++- 1 file changed, 297 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index cdbb92972338..c93f74ad073f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2069,11 +2069,307 @@ static int criu_checkpoint(struct file *filep, return ret; } +static int criu_restore_process(struct kfd_process *p, + struct kfd_ioctl_criu_args *args, + uint64_t *priv_offset, + uint64_t max_priv_data_size) +{ + int ret = 0; + struct kfd_criu_process_priv_data process_priv; + + if (*priv_offset + sizeof(process_priv) > max_priv_data_size) + return -EINVAL; + + ret = copy_from_user(&process_priv, + (void __user *)(args->priv_data + *priv_offset), + sizeof(process_priv)); + if (ret) { + pr_err("Failed to copy process private information from user\n"); + ret = -EFAULT; + goto exit; + } + *priv_offset += sizeof(process_priv); + + if (process_priv.version != KFD_CRIU_PRIV_VERSION) { + pr_err("Invalid CRIU API version (checkpointed:%d current:%d)\n", + process_priv.version, KFD_CRIU_PRIV_VERSION); + return -EINVAL; + } + +exit: + return ret; +} + +static int criu_restore_bos(struct kfd_process *p, + struct kfd_ioctl_criu_args *args, + uint64_t *priv_offset, + uint64_t max_priv_data_size) +{ + struct kfd_criu_bo_bucket *bo_buckets; + struct kfd_criu_bo_priv_data *bo_privs; + bool flush_tlbs = false; + int ret = 0, j = 0; + uint32_t i; + + if (*priv_offset + (args->num_bos * sizeof(*bo_privs)) > max_priv_data_size) + return -EINVAL; + + bo_buckets = kvmalloc_array(args->num_bos, sizeof(*bo_buckets), GFP_KERNEL); + if (!bo_buckets) + return -ENOMEM; + + ret = copy_from_user(bo_buckets, (void __user *)args->bos, +args->num_bos * sizeof(*bo_buckets)); + if (ret) { + pr_err("Failed to copy BOs information from user\n"); + ret = -EFAULT; + goto exit; + } + + bo_privs = kvmalloc_array(args->num_bos, sizeof(*bo_privs), GFP_KERNEL); + if (!bo_privs) { + ret = -ENOMEM; + goto exit; + } + + ret = copy_from_user(bo_privs, (void __user *)args->priv_data + *priv_offset, +args->num_bos * sizeof(*bo_privs)); + if (ret) { + pr_err("Failed to copy BOs information from user\n"); + ret = -EFAULT; + goto exit; + } + *priv_offset += args->num_bos * sizeof(*bo_privs); + + /* Create and map new BOs */ + for (i = 0; i < args->num_bos; i++) { + struct kfd_criu_bo_bucket *bo_bucket; + struct kfd_criu_bo_priv_data *bo_priv; + struct kfd_dev *dev; + struct kfd_process_device *pdd; + void *mem; + u64 offset; + int idr_handle; + + bo_bucket = &bo_buckets[i]; + bo_priv = &bo_privs[i]; + + dev = kfd_device_by_id(bo_bucket->gpu_id); + if (!dev) { + ret = -EINVAL; + pr_err("Failed to get pdd\n"); + goto exit; 
+ } + pdd = kfd_get_process_device_data(dev, p); + if (!pdd) { + ret = -EINVAL; + pr_err("Failed to get pdd\n"); + goto exit; + } + + pr_debug("kfd restore ioctl - bo_bucket[%d]:\n", i); + pr_debug("size = 0x%llx, bo_addr = 0x%llx bo_offset = 0x%llx\n" + "gpu_id = 0x%x alloc_flags = 0x%x\n" +
[Patch v4 24/24] drm/amdkfd: CRIU resume shared virtual memory ranges
In CRIU resume stage, resume all the shared virtual memory ranges from the data stored inside the resuming kfd process during CRIU restore phase. Also setup xnack mode and free up the resources. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 10 + drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 55 drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 6 +++ 3 files changed, 71 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index f7aa15b18f95..6191e37656dd 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2759,7 +2759,17 @@ static int criu_resume(struct file *filep, } mutex_lock(&target->mutex); + ret = kfd_criu_resume_svm(target); + if (ret) { + pr_err("kfd_criu_resume_svm failed for %i\n", args->pid); + goto exit; + } + ret = amdgpu_amdkfd_criu_resume(target->kgd_process_info); + if (ret) + pr_err("amdgpu_amdkfd_criu_resume failed for %i\n", args->pid); + +exit: mutex_unlock(&target->mutex); kfd_unref_process(target); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index e9f6c63c2a26..bd2dce37f345 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -3427,6 +3427,61 @@ svm_range_get_attr(struct kfd_process *p, struct mm_struct *mm, return 0; } +int kfd_criu_resume_svm(struct kfd_process *p) +{ + int nattr_common = 4, nattr_accessibility = 1; + struct criu_svm_metadata *criu_svm_md = NULL; + struct criu_svm_metadata *next = NULL; + struct svm_range_list *svms = &p->svms; + int i, j, num_attrs, ret = 0; + struct mm_struct *mm; + + if (list_empty(&svms->criu_svm_metadata_list)) { + pr_debug("No SVM data from CRIU restore stage 2\n"); + return ret; + } + + mm = get_task_mm(p->lead_thread); + if (!mm) { + pr_err("failed to get mm for the target process\n"); + return -ESRCH; + } + + num_attrs = nattr_common + (nattr_accessibility * p->n_pdds); + + i = j = 0; + list_for_each_entry(criu_svm_md, &svms->criu_svm_metadata_list, list) { + pr_debug("criu_svm_md[%d]\n\tstart: 0x%llx size: 0x%llx (npages)\n", +i, criu_svm_md->start_addr, criu_svm_md->size); + for (j = 0; j < num_attrs; j++) { + pr_debug("\ncriu_svm_md[%d]->attrs[%d].type : 0x%x \ncriu_svm_md[%d]->attrs[%d].value : 0x%x\n", +i,j, criu_svm_md->attrs[j].type, +i,j, criu_svm_md->attrs[j].value); + } + + ret = svm_range_set_attr(p, mm, criu_svm_md->start_addr, +criu_svm_md->size, num_attrs, +criu_svm_md->attrs); + if (ret) { + pr_err("CRIU: failed to set range attributes\n"); + goto exit; + } + + i++; + } + +exit: + list_for_each_entry_safe(criu_svm_md, next, &svms->criu_svm_metadata_list, list) { + pr_debug("freeing criu_svm_md[]\n\tstart: 0x%llx\n", + criu_svm_md->start_addr); + kfree(criu_svm_md); + } + + mmput(mm); + return ret; + +} + int svm_criu_prepare_for_resume(struct kfd_process *p, struct kfd_criu_svm_range_priv_data *svm_priv) { diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h index e0c0853f085c..3b5bcb52723c 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h @@ -195,6 +195,7 @@ int kfd_criu_restore_svm(struct kfd_process *p, uint8_t __user *user_priv_ptr, uint64_t *priv_data_offset, uint64_t max_priv_data_size); +int kfd_criu_resume_svm(struct kfd_process *p); struct kfd_process_device * svm_range_get_pdd_by_adev(struct svm_range *prange, struct amdgpu_device *adev); void svm_range_list_lock_and_flush_work(struct svm_range_list *svms, struct 
mm_struct *mm); @@ -256,6 +257,11 @@ static inline int kfd_criu_restore_svm(struct kfd_process *p, return -EINVAL; } +static inline int kfd_criu_resume_svm(struct kfd_process *p) +{ + return 0; +} + #define KFD_IS_SVM_API_SUPPORTED(dev) false #endif /* IS_ENABLED(CONFIG_HSA_AMD_SVM) */ -- 2.17.1
[Patch v4 20/24] drm/amdkfd: use user_gpu_id for svm ranges
Currently the SVM ranges use actual_gpu_id, but with Checkpoint Restore support it's possible that the SVM ranges can be resumed on another node where the actual_gpu_id may not be the same as the original (user_gpu_id) gpu id. So modify the SVM code to use user_gpu_id. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_process.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 67e2432098d1..0769dc655e15 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1813,7 +1813,7 @@ int kfd_process_gpuidx_from_gpuid(struct kfd_process *p, uint32_t gpu_id) int i; for (i = 0; i < p->n_pdds; i++) - if (p->pdds[i] && gpu_id == p->pdds[i]->dev->id) + if (p->pdds[i] && gpu_id == p->pdds[i]->user_gpu_id) return i; return -EINVAL; } @@ -1826,7 +1826,7 @@ kfd_process_gpuid_from_adev(struct kfd_process *p, struct amdgpu_device *adev, for (i = 0; i < p->n_pdds; i++) if (p->pdds[i] && p->pdds[i]->dev->adev == adev) { - *gpuid = p->pdds[i]->dev->id; + *gpuid = p->pdds[i]->user_gpu_id; *gpuidx = i; return 0; } -- 2.17.1
[Patch v4 04/24] drm/amdkfd: CRIU Implement KFD process_info ioctl
This IOCTL is expected to be called as a precursor to the actual Checkpoint operation. This does the basic discovery into the target process seized by CRIU and relays the information to the userspace that utilizes it to start the Checkpoint operation via another dedicated IOCTL. The process_info IOCTL determines the number of GPUs and buffer objects that are associated with the target process, and its process id in the caller's namespace, since the /proc/pid/mem interface may be used to drain the contents of the discovered buffer objects in userspace and getpid returns the pid of the CRIU dumper process. Also, the pid of a process inside a container might be different from its global pid, so return the ns pid. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 55 +++- drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14 ++ 3 files changed, 70 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 1b863bd84c96..53d7a20e3c06 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1857,6 +1857,41 @@ static int kfd_ioctl_svm(struct file *filep, struct kfd_process *p, void *data) } #endif +uint64_t get_process_num_bos(struct kfd_process *p) +{ + uint64_t num_of_bos = 0, i; + + /* Run over all PDDs of the process */ + for (i = 0; i < p->n_pdds; i++) { + struct kfd_process_device *pdd = p->pdds[i]; + void *mem; + int id; + + idr_for_each_entry(&pdd->alloc_idr, mem, id) { + struct kgd_mem *kgd_mem = (struct kgd_mem *)mem; + + if ((uint64_t)kgd_mem->va > pdd->gpuvm_base) + num_of_bos++; + } + } + return num_of_bos; +} + +static void criu_get_process_object_info(struct kfd_process *p, +uint32_t *num_bos, +uint64_t *objs_priv_size) +{ + uint64_t priv_size; + + *num_bos = get_process_num_bos(p); + + if (objs_priv_size) { + priv_size = sizeof(struct kfd_criu_process_priv_data); + priv_size += *num_bos * sizeof(struct kfd_criu_bo_priv_data); + *objs_priv_size = priv_size; + } +} + static int criu_checkpoint(struct file *filep, struct kfd_process *p, struct kfd_ioctl_criu_args *args) @@ -1889,7 +1924,25 @@ static int criu_process_info(struct file *filep, struct kfd_process *p, struct kfd_ioctl_criu_args *args) { - return 0; + int ret = 0; + + mutex_lock(&p->mutex); + + if (!kfd_has_process_device_data(p)) { + pr_err("No pdd for given process\n"); + ret = -ENODEV; + goto err_unlock; + } + + args->pid = task_pid_nr_ns(p->lead_thread, + task_active_pid_ns(p->lead_thread)); + + criu_get_process_object_info(p, &args->num_bos, &args->priv_data_size); + + dev_dbg(kfd_device, "Num of bos:%u\n", args->num_bos); +err_unlock: + mutex_unlock(&p->mutex); + return ret; } static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void *data) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index e68f692362bb..4d9bc7af03af 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -950,6 +950,8 @@ void *kfd_process_device_translate_handle(struct kfd_process_device *p, void kfd_process_device_remove_obj_handle(struct kfd_process_device *pdd, int handle); +bool kfd_has_process_device_data(struct kfd_process *p); + /* PASIDs */ int kfd_pasid_init(void); void kfd_pasid_exit(void); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index d4c8a6948a9f..f77d556ca0fc 100644 --- 
a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1456,6 +1456,20 @@ static int init_doorbell_bitmap(struct qcm_process_device *qpd, return 0; } +bool kfd_has_process_device_data(struct kfd_process *p) +{ + int i; + + for (i = 0; i < p->n_pdds; i++) { + struct kfd_process_device *pdd = p->pdds[i]; + + if (pdd) + return true; + } + + return false; +} + struct kfd_process_device *kfd_get_process_device_data(struct kfd_dev *dev, struct kfd_process *p) { -- 2.17.1
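The namespace-aware pid returned by process_info exists precisely so the plugin can drain BO contents through /proc/pid/mem, as the commit message notes. A minimal sketch of that drain path, assuming the bucket exposes the BO's CPU virtual address and size (illustrative helper, not plugin code):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Read BO contents out of the target through /proc/<pid>/mem; pid is
 * the value KFD reported for our namespace, and the caller needs
 * ptrace-level privileges on the target. */
static ssize_t drain_bo(pid_t pid, uint64_t bo_addr, void *buf, size_t size)
{
	char path[64];
	int fd;
	ssize_t n;

	snprintf(path, sizeof(path), "/proc/%d/mem", pid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = pread(fd, buf, size, (off_t)bo_addr);	/* offset = target VA */
	close(fd);
	return n;
}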
[Patch v4 10/24] drm/amdkfd: CRIU restore queue ids
From: David Yat Sin When re-creating queues during CRIU restore, restore the queue with the same queue id value used during CRIU dump. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 + .../amd/amdkfd/kfd_process_queue_manager.c| 37 +++ 4 files changed, 34 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 9665c8657929..3fb155f756fd 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -312,7 +312,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p, p->pasid, dev->id); - err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, + err = pqm_create_queue(&p->pqm, dev, filep, &q_properties, &queue_id, NULL, &doorbell_offset_in_process); if (err != 0) goto err_create_queue; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c index 1e30717b5253..0c50e67e2b51 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c @@ -185,7 +185,7 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev) properties.type = KFD_QUEUE_TYPE_DIQ; status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL, - &properties, &qid, NULL); + &properties, &qid, NULL, NULL); if (status) { pr_err("Failed to create DIQ\n"); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 7c2679a23aa3..8272bd5c4600 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -461,6 +461,7 @@ enum KFD_QUEUE_PRIORITY { * it's user mode or kernel mode queue. 
* */ + struct queue_properties { enum kfd_queue_type type; enum kfd_queue_format format; @@ -1156,6 +1157,7 @@ int pqm_create_queue(struct process_queue_manager *pqm, struct file *f, struct queue_properties *properties, unsigned int *qid, + const struct kfd_criu_queue_priv_data *q_data, uint32_t *p_doorbell_offset_in_process); int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid); int pqm_update_queue_properties(struct process_queue_manager *pqm, unsigned int qid, diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c index 480ad794df4e..275aeebc58fa 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c @@ -42,6 +42,20 @@ static inline struct process_queue_node *get_queue_by_qid( return NULL; } +static int assign_queue_slot_by_qid(struct process_queue_manager *pqm, + unsigned int qid) +{ + if (qid >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) + return -EINVAL; + + if (__test_and_set_bit(qid, pqm->queue_slot_bitmap)) { + pr_err("Cannot create new queue because requested qid(%u) is in use\n", qid); + return -ENOSPC; + } + + return 0; +} + static int find_available_queue_slot(struct process_queue_manager *pqm, unsigned int *qid) { @@ -194,6 +208,7 @@ int pqm_create_queue(struct process_queue_manager *pqm, struct file *f, struct queue_properties *properties, unsigned int *qid, + const struct kfd_criu_queue_priv_data *q_data, uint32_t *p_doorbell_offset_in_process) { int retval; @@ -225,7 +240,12 @@ int pqm_create_queue(struct process_queue_manager *pqm, if (pdd->qpd.queue_count >= max_queues) return -ENOSPC; - retval = find_available_queue_slot(pqm, qid); + if (q_data) { + retval = assign_queue_slot_by_qid(pqm, q_data->q_id); + *qid = q_data->q_id; + } else + retval = find_available_queue_slot(pqm, qid); + if (retval != 0) return retval; @@ -528,7 +548,7 @@ int kfd_process_get_queue_info(struct kfd_process *p, return 0; } -static void criu_dump_queue(struct kfd_process_device *pdd, +static void criu_checkpoint_queue(struct kfd_process_device *pdd, struct queue *q, struct kfd_criu_queue_priv_data *q_data) { @@ -560,7 +580,7 @@ static void criu_dump_queue(struct kfd_proces
[Patch v4 07/24] drm/amdkfd: CRIU Implement KFD resume ioctl
This adds support to create userptr BOs on restore and introduces a new ioctl to restart memory notifiers for the restored userptr BOs. When doing a CRIU restore, MMU notifications can happen anytime after we call amdgpu_mn_register. Prevent MMU notifications until we reach stage-4 of the restore process, i.e. the criu_resume ioctl is received and the process is ready to be resumed. This ioctl is different from other KFD CRIU ioctls since it's called by the CRIU master restore process for all the target processes being resumed by CRIU. Signed-off-by: David Yat Sin Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 6 ++- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 51 +-- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 ++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 35 +++-- 5 files changed, 123 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h index fcbc8a9c9e06..5c5fc839f701 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h @@ -131,6 +131,7 @@ struct amdkfd_process_info { atomic_t evicted_bos; struct delayed_work restore_userptr_work; struct pid *pid; + bool block_mmu_notifications; }; int amdgpu_amdkfd_init(void); @@ -269,7 +270,7 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void *drm_priv); int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( struct amdgpu_device *adev, uint64_t va, uint64_t size, void *drm_priv, struct kgd_mem **mem, - uint64_t *offset, uint32_t flags); + uint64_t *offset, uint32_t flags, bool criu_resume); int amdgpu_amdkfd_gpuvm_free_memory_of_gpu( struct amdgpu_device *adev, struct kgd_mem *mem, void *drm_priv, uint64_t *size); @@ -297,6 +298,9 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev, int amdgpu_amdkfd_get_tile_config(struct amdgpu_device *adev, struct tile_config *config); void amdgpu_amdkfd_ras_poison_consumption_handler(struct amdgpu_device *adev); +void amdgpu_amdkfd_block_mmu_notifications(void *p); +int amdgpu_amdkfd_criu_resume(void *p); + #if IS_ENABLED(CONFIG_HSA_AMD) void amdgpu_amdkfd_gpuvm_init_mem_limits(void); void amdgpu_amdkfd_gpuvm_destroy_cb(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index 90b985436878..5679fb75ec88 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -846,7 +846,8 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem, * * Returns 0 for success, negative errno for errors. */ -static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr) +static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr, + bool criu_resume) { struct amdkfd_process_info *process_info = mem->process_info; struct amdgpu_bo *bo = mem->bo; @@ -868,6 +869,17 @@ static int init_user_pages(struct kgd_mem *mem, uint64_t user_addr) goto out; } + if (criu_resume) { + /* +* During a CRIU restore operation, the userptr buffer objects +* will be validated in the restore_userptr_work worker at a +* later stage when it is scheduled by another ioctl called by +* CRIU master process for the target pid for restore. 
+*/ + atomic_inc(&mem->invalid); + mutex_unlock(&process_info->lock); + return 0; + } ret = amdgpu_ttm_tt_get_user_pages(bo, bo->tbo.ttm->pages); if (ret) { pr_err("%s: Failed to get user pages: %d\n", __func__, ret); @@ -1240,6 +1252,7 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void **process_info, INIT_DELAYED_WORK(&info->restore_userptr_work, amdgpu_amdkfd_restore_userptr_worker); + info->block_mmu_notifications = false; *process_info = info; *ef = dma_fence_get(&info->eviction_fence->base); } @@ -1456,10 +1469,37 @@ uint64_t amdgpu_amdkfd_gpuvm_get_process_page_dir(void *drm_priv) return avm->pd_phys_addr; } +void amdgpu_amdkfd_block_mmu_notifications(void *p) +{ + struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p; + + pinfo->block_mmu_notifications = true; +} + +int amdgpu_amdkfd_criu_resume(void *p) +{ + int ret = 0; + struct amdkfd_process_info *pinfo = (struct amdkfd_process_info *)p; + + mutex_lock(&pinfo->lock); + pr_debug("scheduling work\n"); + atomic_inc(&pinfo->evicted_bos); + if (!p
[Patch v4 05/24] drm/amdkfd: CRIU Implement KFD checkpoint ioctl
This adds support to discover the buffer objects that belong to a process being checkpointed. The data corresponding to these buffer objects is returned to user space plugin running under criu master context which then stores this info to recreate these buffer objects during a restore operation. Signed-off-by: David Yat Sin Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 2 + drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 172 ++- drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 3 +- 4 files changed, 195 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 56c5c4464829..4fd36bd9dcfd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1173,6 +1173,26 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev, return ttm_pool_free(&adev->mman.bdev.pool, ttm); } +/** + * amdgpu_ttm_tt_get_userptr - Return the userptr GTT ttm_tt for the current + * task + * + * @tbo: The ttm_buffer_object that contains the userptr + * @user_addr: The returned value + */ +int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo, + uint64_t *user_addr) +{ + struct amdgpu_ttm_tt *gtt; + + if (!tbo->ttm) + return -EINVAL; + + gtt = (void *)tbo->ttm; + *user_addr = gtt->userptr; + return 0; +} + /** * amdgpu_ttm_tt_set_userptr - Initialize userptr GTT ttm_tt for the current * task diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index 7346ecff4438..6e6d67ec43f8 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -177,6 +177,8 @@ static inline bool amdgpu_ttm_tt_get_user_pages_done(struct ttm_tt *ttm) #endif void amdgpu_ttm_tt_set_user_pages(struct ttm_tt *ttm, struct page **pages); +int amdgpu_ttm_tt_get_userptr(const struct ttm_buffer_object *tbo, + uint64_t *user_addr); int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo, uint64_t addr, uint32_t flags); bool amdgpu_ttm_tt_has_userptr(struct ttm_tt *ttm); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 53d7a20e3c06..cdbb92972338 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -42,6 +42,7 @@ #include "kfd_svm.h" #include "amdgpu_amdkfd.h" #include "kfd_smi_events.h" +#include "amdgpu_object.h" static long kfd_ioctl(struct file *, unsigned int, unsigned long); static int kfd_open(struct inode *, struct file *); @@ -1857,6 +1858,29 @@ static int kfd_ioctl_svm(struct file *filep, struct kfd_process *p, void *data) } #endif +static int criu_checkpoint_process(struct kfd_process *p, +uint8_t __user *user_priv_data, +uint64_t *priv_offset) +{ + struct kfd_criu_process_priv_data process_priv; + int ret; + + memset(&process_priv, 0, sizeof(process_priv)); + + process_priv.version = KFD_CRIU_PRIV_VERSION; + + ret = copy_to_user(user_priv_data + *priv_offset, + &process_priv, sizeof(process_priv)); + + if (ret) { + pr_err("Failed to copy process information to user\n"); + ret = -EFAULT; + } + + *priv_offset += sizeof(process_priv); + return ret; +} + uint64_t get_process_num_bos(struct kfd_process *p) { uint64_t num_of_bos = 0, i; @@ -1877,6 +1901,111 @@ uint64_t get_process_num_bos(struct kfd_process *p) return num_of_bos; } +static int criu_checkpoint_bos(struct kfd_process *p, + uint32_t num_bos, + uint8_t __user *user_bos, + uint8_t __user *user_priv_data, + 
uint64_t *priv_offset) +{ + struct kfd_criu_bo_bucket *bo_buckets; + struct kfd_criu_bo_priv_data *bo_privs; + int ret = 0, pdd_index, bo_index = 0, id; + void *mem; + + bo_buckets = kvzalloc(num_bos * sizeof(*bo_buckets), GFP_KERNEL); + if (!bo_buckets) { + ret = -ENOMEM; + goto exit; + } + + bo_privs = kvzalloc(num_bos * sizeof(*bo_privs), GFP_KERNEL); + if (!bo_privs) { + ret = -ENOMEM; + goto exit; + } + + for (pdd_index = 0; pdd_index < p->n_pdds; pdd_index++) { + struct kfd_process_device *pdd = p->pdds[pdd_index]; + struct amdgpu_bo *dumper_bo; + struct kgd_mem *kgd_mem; + + idr_for_each_entry(&pdd->alloc_idr, mem, id) { + struct kfd_criu_bo_bucket *bo_bucket; + struct kfd_criu_bo_priv_data
[Patch v4 08/24] drm/amdkfd: CRIU Implement KFD unpause operation
From: David Yat Sin Introduce the UNPAUSE op. After the CRIU amdgpu plugin performs a PROCESS_INFO op, the queues will stay in an evicted state. Once the plugin is done draining BO contents, it is safe to perform an UNPAUSE op for the queues to resume. Signed-off-by: David Yat Sin --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 37 +++- drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 3 ++ drivers/gpu/drm/amd/amdkfd/kfd_process.c | 1 + 3 files changed, 40 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 87b9f019e96e..db2bb302a8d4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2040,6 +2040,14 @@ static int criu_checkpoint(struct file *filep, goto exit_unlock; } + /* Confirm all process queues are evicted */ + if (!p->queues_paused) { + pr_err("Cannot dump process when queues are not in evicted state\n"); + /* CRIU plugin did not call op PROCESS_INFO before checkpointing */ + ret = -EINVAL; + goto exit_unlock; + } + criu_get_process_object_info(p, &num_bos, &priv_size); if (num_bos != args->num_bos || @@ -2382,7 +2390,24 @@ static int criu_unpause(struct file *filep, struct kfd_process *p, struct kfd_ioctl_criu_args *args) { - return 0; + int ret; + + mutex_lock(&p->mutex); + + if (!p->queues_paused) { + mutex_unlock(&p->mutex); + return -EINVAL; + } + + ret = kfd_process_restore_queues(p); + if (ret) + pr_err("Failed to unpause queues ret:%d\n", ret); + else + p->queues_paused = false; + + mutex_unlock(&p->mutex); + + return ret; } static int criu_resume(struct file *filep, @@ -2434,6 +2459,12 @@ static int criu_process_info(struct file *filep, goto err_unlock; } + ret = kfd_process_evict_queues(p); + if (ret) + goto err_unlock; + + p->queues_paused = true; + args->pid = task_pid_nr_ns(p->lead_thread, task_active_pid_ns(p->lead_thread)); @@ -2441,6 +2472,10 @@ static int criu_process_info(struct file *filep, dev_dbg(kfd_device, "Num of bos:%u\n", args->num_bos); err_unlock: + if (ret) { + kfd_process_restore_queues(p); + p->queues_paused = false; + } mutex_unlock(&p->mutex); return ret; } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index cd72541a8f4f..f3a9f3de34e4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -875,6 +875,9 @@ struct kfd_process { struct svm_range_list svms; bool xnack_enabled; + + /* Queues are in paused state because we are in the process of doing a CRIU checkpoint */ + bool queues_paused; }; #define KFD_PROCESS_TABLE_SIZE 5 /* bits: 32 entries */ diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index d2fcdc5e581f..e20fbb7ba9bb 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1364,6 +1364,7 @@ static struct kfd_process *create_process(const struct task_struct *thread) process->mm = thread->mm; process->lead_thread = thread->group_leader; process->n_pdds = 0; + process->queues_paused = false; INIT_DELAYED_WORK(&process->eviction_work, evict_process_worker); INIT_DELAYED_WORK(&process->restore_work, restore_process_worker); process->last_restore_timestamp = get_jiffies_64(); -- 2.17.1
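Taken together with PROCESS_INFO and CHECKPOINT, the op ordering the message describes looks roughly like the sketch below from the plugin's side. The ioctl request macro name and the buffer setup are assumptions based on the uapi introduced in patch 03:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/kfd_ioctl.h>	/* assumed uapi header location */

static int checkpoint_sequence(int kfd_fd)
{
	struct kfd_ioctl_criu_args args;

	memset(&args, 0, sizeof(args));
	args.op = KFD_CRIU_OP_PROCESS_INFO;	/* evicts queues, reports sizes */
	if (ioctl(kfd_fd, AMDKFD_IOC_CRIU, &args))
		return -1;

	/* ... allocate bos/priv_data buffers from the reported sizes and
	 * drain BO contents here; queues stay evicted throughout ... */

	args.op = KFD_CRIU_OP_CHECKPOINT;	/* dump while still paused */
	if (ioctl(kfd_fd, AMDKFD_IOC_CRIU, &args))
		return -1;

	args.op = KFD_CRIU_OP_UNPAUSE;		/* let the target run again */
	return ioctl(kfd_fd, AMDKFD_IOC_CRIU, &args);
}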
[Patch v4 01/24] x86/configs: CRIU update debug rock defconfig
- Update debug config for Checkpoint-Restore (CR) support - Also include necessary options for CR with docker containers. Signed-off-by: Rajneesh Bhardwaj --- arch/x86/configs/rock-dbg_defconfig | 53 ++--- 1 file changed, 34 insertions(+), 19 deletions(-) diff --git a/arch/x86/configs/rock-dbg_defconfig b/arch/x86/configs/rock-dbg_defconfig index 4877da183599..bc2a34666c1d 100644 --- a/arch/x86/configs/rock-dbg_defconfig +++ b/arch/x86/configs/rock-dbg_defconfig @@ -249,6 +249,7 @@ CONFIG_KALLSYMS_ALL=y CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y CONFIG_KALLSYMS_BASE_RELATIVE=y # CONFIG_USERFAULTFD is not set +CONFIG_USERFAULTFD=y CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y CONFIG_KCMP=y CONFIG_RSEQ=y @@ -1015,6 +1016,11 @@ CONFIG_PACKET_DIAG=y CONFIG_UNIX=y CONFIG_UNIX_SCM=y CONFIG_UNIX_DIAG=y +CONFIG_SMC_DIAG=y +CONFIG_XDP_SOCKETS_DIAG=y +CONFIG_INET_MPTCP_DIAG=y +CONFIG_TIPC_DIAG=y +CONFIG_VSOCKETS_DIAG=y # CONFIG_TLS is not set CONFIG_XFRM=y CONFIG_XFRM_ALGO=y @@ -1052,15 +1058,17 @@ CONFIG_SYN_COOKIES=y # CONFIG_NET_IPVTI is not set # CONFIG_NET_FOU is not set # CONFIG_NET_FOU_IP_TUNNELS is not set -# CONFIG_INET_AH is not set -# CONFIG_INET_ESP is not set -# CONFIG_INET_IPCOMP is not set -CONFIG_INET_TUNNEL=y -CONFIG_INET_DIAG=y -CONFIG_INET_TCP_DIAG=y -# CONFIG_INET_UDP_DIAG is not set -# CONFIG_INET_RAW_DIAG is not set -# CONFIG_INET_DIAG_DESTROY is not set +CONFIG_INET_AH=m +CONFIG_INET_ESP=m +CONFIG_INET_IPCOMP=m +CONFIG_INET_ESP_OFFLOAD=m +CONFIG_INET_TUNNEL=m +CONFIG_INET_XFRM_TUNNEL=m +CONFIG_INET_DIAG=m +CONFIG_INET_TCP_DIAG=m +CONFIG_INET_UDP_DIAG=m +CONFIG_INET_RAW_DIAG=m +CONFIG_INET_DIAG_DESTROY=y CONFIG_TCP_CONG_ADVANCED=y # CONFIG_TCP_CONG_BIC is not set CONFIG_TCP_CONG_CUBIC=y @@ -1085,12 +1093,14 @@ CONFIG_TCP_MD5SIG=y CONFIG_IPV6=y # CONFIG_IPV6_ROUTER_PREF is not set # CONFIG_IPV6_OPTIMISTIC_DAD is not set -CONFIG_INET6_AH=y -CONFIG_INET6_ESP=y -# CONFIG_INET6_ESP_OFFLOAD is not set -# CONFIG_INET6_ESPINTCP is not set -# CONFIG_INET6_IPCOMP is not set -# CONFIG_IPV6_MIP6 is not set +CONFIG_INET6_AH=m +CONFIG_INET6_ESP=m +CONFIG_INET6_ESP_OFFLOAD=m +CONFIG_INET6_IPCOMP=m +CONFIG_IPV6_MIP6=m +CONFIG_INET6_XFRM_TUNNEL=m +CONFIG_INET_DCCP_DIAG=m +CONFIG_INET_SCTP_DIAG=m # CONFIG_IPV6_ILA is not set # CONFIG_IPV6_VTI is not set CONFIG_IPV6_SIT=y @@ -1146,8 +1156,13 @@ CONFIG_NF_CT_PROTO_UDPLITE=y # CONFIG_NF_CONNTRACK_SANE is not set # CONFIG_NF_CONNTRACK_SIP is not set # CONFIG_NF_CONNTRACK_TFTP is not set -# CONFIG_NF_CT_NETLINK is not set -# CONFIG_NF_CT_NETLINK_TIMEOUT is not set +CONFIG_COMPAT_NETLINK_MESSAGES=y +CONFIG_NF_CT_NETLINK=m +CONFIG_NF_CT_NETLINK_TIMEOUT=m +CONFIG_NF_CT_NETLINK_HELPER=m +CONFIG_NETFILTER_NETLINK_GLUE_CT=y +CONFIG_SCSI_NETLINK=y +CONFIG_QUOTA_NETLINK_INTERFACE=y CONFIG_NF_NAT=m CONFIG_NF_NAT_REDIRECT=y CONFIG_NF_NAT_MASQUERADE=y @@ -1992,7 +2007,7 @@ CONFIG_NETCONSOLE_DYNAMIC=y CONFIG_NETPOLL=y CONFIG_NET_POLL_CONTROLLER=y # CONFIG_RIONET is not set -# CONFIG_TUN is not set +CONFIG_TUN=y # CONFIG_TUN_VNET_CROSS_LE is not set CONFIG_VETH=y # CONFIG_NLMON is not set @@ -3990,7 +4005,7 @@ CONFIG_MANDATORY_FILE_LOCKING=y CONFIG_FSNOTIFY=y CONFIG_DNOTIFY=y CONFIG_INOTIFY_USER=y -# CONFIG_FANOTIFY is not set +CONFIG_FANOTIFY=y CONFIG_QUOTA=y CONFIG_QUOTA_NETLINK_INTERFACE=y # CONFIG_PRINT_QUOTA_WARNING is not set -- 2.17.1
[Patch v4 03/24] drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs
Checkpoint-Restore in userspace (CRIU) is a powerful tool that can snapshot a running process and later restore it on the same or a remote machine, but it expects processes that have a device file (e.g. GPU) associated with them to provide the necessary driver support to assist CRIU and its extensible plugin interface. Thus, in order to support Checkpoint-Restore of any ROCm process, the AMD Radeon Open Compute kernel driver needs to provide a set of new APIs that provide the necessary VRAM metadata and its contents to a userspace component (CRIU plugin) that can store it in the form of image files. This introduces some new ioctls which will be used to checkpoint-restore any KFD-bound user process. KFD doesn't allow any arbitrary ioctl call unless it is called by the group leader process. Since these ioctls are expected to be called from a KFD CRIU plugin which has elevated ptrace privileges and the CAP_CHECKPOINT_RESTORE capability attached to its file descriptors, modify KFD to allow such calls. (API redesigned by David Yat Sin) Suggested-by: Felix Kuehling Signed-off-by: David Yat Sin Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 94 +++- drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 65 +++- include/uapi/linux/kfd_ioctl.h | 79 +++- 3 files changed, 235 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index 4bfc0c8ab764..1b863bd84c96 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -33,6 +33,7 @@ #include #include #include +#include <linux/ptrace.h> #include #include #include "kfd_priv.h" @@ -1856,6 +1857,75 @@ static int kfd_ioctl_svm(struct file *filep, struct kfd_process *p, void *data) } #endif +static int criu_checkpoint(struct file *filep, + struct kfd_process *p, + struct kfd_ioctl_criu_args *args) +{ + return 0; +} + +static int criu_restore(struct file *filep, + struct kfd_process *p, + struct kfd_ioctl_criu_args *args) +{ + return 0; +} + +static int criu_unpause(struct file *filep, + struct kfd_process *p, + struct kfd_ioctl_criu_args *args) +{ + return 0; +} + +static int criu_resume(struct file *filep, + struct kfd_process *p, + struct kfd_ioctl_criu_args *args) +{ + return 0; +} + +static int criu_process_info(struct file *filep, + struct kfd_process *p, + struct kfd_ioctl_criu_args *args) +{ + return 0; +} + +static int kfd_ioctl_criu(struct file *filep, struct kfd_process *p, void *data) +{ + struct kfd_ioctl_criu_args *args = data; + int ret; + + dev_dbg(kfd_device, "CRIU operation: %d\n", args->op); + switch (args->op) { + case KFD_CRIU_OP_PROCESS_INFO: + ret = criu_process_info(filep, p, args); + break; + case KFD_CRIU_OP_CHECKPOINT: + ret = criu_checkpoint(filep, p, args); + break; + case KFD_CRIU_OP_UNPAUSE: + ret = criu_unpause(filep, p, args); + break; + case KFD_CRIU_OP_RESTORE: + ret = criu_restore(filep, p, args); + break; + case KFD_CRIU_OP_RESUME: + ret = criu_resume(filep, p, args); + break; + default: + dev_dbg(kfd_device, "Unsupported CRIU operation:%d\n", args->op); + ret = -EINVAL; + break; + } + + if (ret) + dev_dbg(kfd_device, "CRIU operation:%d err:%d\n", args->op, ret); + + return ret; +} + #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \ [_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \ .cmd_drv = 0, .name = #ioctl} @@ -1959,6 +2029,9 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = { AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_XNACK_MODE, kfd_ioctl_set_xnack_mode, 0), + + 
AMDKFD_IOCTL_DEF(AMDKFD_IOC_CRIU_OP, + kfd_ioctl_criu, KFD_IOC_FLAG_CHECKPOINT_RESTORE), }; #define AMDKFD_CORE_IOCTL_COUNT ARRAY_SIZE(amdkfd_ioctls) @@ -1973,6 +2046,7 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) char *kdata = NULL; unsigned int usize, asize; int retcode = -EINVAL; + bool ptrace_attached = false; if (nr >= AMDKFD_CORE_IOCTL_COUNT) goto err_i1; @@ -1998,7 +2072,15 @@ static long kfd_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) * processes need to create their own KFD device context. */ process = filep->private_data; - if (process->lead_thread != curren
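The hunk above is cut off right at the lead-thread check. For orientation, a rough sketch of the kind of relaxation the commit message describes — the exact predicate is an assumption, not the patch text (the real check may additionally verify an actual ptrace attachment):

    /* In kfd_ioctl(): normally only the group leader may call KFD ioctls;
     * let a privileged checkpoint/restore helper through as well.
     */
    process = filep->private_data;
    if (process->lead_thread != current->group_leader) {
            if (!capable(CAP_CHECKPOINT_RESTORE)) {
                    retcode = -EPERM;
                    goto err_i1;
            }
            ptrace_attached = true; /* declared earlier in this hunk */
    }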
[Patch v4 00/24] CHECKPOINT RESTORE WITH ROCm
CRIU is a user space tool which is very popular for container live migration in datacentres. It can checkpoint a running application, save its complete state, memory contents and all system resources to images on disk which can be migrated to another machine and restored later. More information on CRIU can be found at https://criu.org/Main_Page CRIU currently does not support Checkpoint / Restore with applications that have device files open so it cannot perform checkpoint and restore on GPU devices which are very complex and have their own VRAM managed privately. CRIU, however, can support external devices by using a plugin architecture. We feel that we are getting close to finalizing our IOCTL APIs which were again changed since V3 for an improved modular design. Our changes to CRIU user space can be obtained from here: https://github.com/RadeonOpenCompute/criu/tree/amdgpu_rfc-211222 We have tested the following scenarios: - Checkpoint / Restore of a Pytorch (BERT) workload - kfdtests with queues and events - Gfx9 and Gfx10 based multi GPU test systems - On baremetal and inside a docker container - Restoring on a different system V1: Initial V2: Addressed review comments V3: Rebased on latest amd-staging-drm-next (5.15 based) v4: New API design and basic support for SVM, however there is an outstanding issue with SVM restore which is currently under debug and hopefully that won't impact the ioctl APIs as SVMs are treated as private data hidden from user space like queues and events with the new approach. David Yat Sin (9): drm/amdkfd: CRIU Implement KFD unpause operation drm/amdkfd: CRIU add queues support drm/amdkfd: CRIU restore queue ids drm/amdkfd: CRIU restore sdma id for queues drm/amdkfd: CRIU restore queue doorbell id drm/amdkfd: CRIU checkpoint and restore queue mqds drm/amdkfd: CRIU checkpoint and restore queue control stack drm/amdkfd: CRIU checkpoint and restore events drm/amdkfd: CRIU implement gpu_id remapping Rajneesh Bhardwaj (15): x86/configs: CRIU update debug rock defconfig x86/configs: Add rock-rel_defconfig for amd-feature-criu branch drm/amdkfd: CRIU Introduce Checkpoint-Restore APIs drm/amdkfd: CRIU Implement KFD process_info ioctl drm/amdkfd: CRIU Implement KFD checkpoint ioctl drm/amdkfd: CRIU Implement KFD restore ioctl drm/amdkfd: CRIU Implement KFD resume ioctl drm/amdkfd: CRIU export BOs as prime dmabuf objects drm/amdkfd: CRIU checkpoint and restore xnack mode drm/amdkfd: CRIU allow external mm for svm ranges drm/amdkfd: use user_gpu_id for svm ranges drm/amdkfd: CRIU Discover svm ranges drm/amdkfd: CRIU Save Shared Virtual Memory ranges drm/amdkfd: CRIU prepare for svm resume drm/amdkfd: CRIU resume shared virtual memory ranges arch/x86/configs/rock-dbg_defconfig | 53 +- arch/x86/configs/rock-rel_defconfig | 4927 + drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h|6 +- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 51 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 + drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h |2 + drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 1453 - drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c |2 +- .../drm/amd/amdkfd/kfd_device_queue_manager.c | 185 +- .../drm/amd/amdkfd/kfd_device_queue_manager.h | 18 +- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 313 +- drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h | 14 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 72 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 74 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 89 + .../gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c | 81 + drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 166 +-
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 86 +- .../amd/amdkfd/kfd_process_queue_manager.c| 377 +- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 326 +- drivers/gpu/drm/amd/amdkfd/kfd_svm.h | 39 + include/uapi/linux/kfd_ioctl.h| 79 +- 22 files changed, 8099 insertions(+), 334 deletions(-) create mode 100644 arch/x86/configs/rock-rel_defconfig -- 2.17.1
[PATCH] drm/i915/guc: Use lockless list for destroyed contexts
Use a lockless list structure for destroyed contexts to avoid hammering on the global submission spin lock. Suggested-by: Tvrtko Ursulin Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/intel_context.c | 2 - drivers/gpu/drm/i915/gt/intel_context_types.h | 3 +- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 3 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 44 +-- 4 files changed, 16 insertions(+), 36 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 5d0ec7c49b6a..4aacb4b0418d 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -403,8 +403,6 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) ce->guc_id.id = GUC_INVALID_LRC_ID; INIT_LIST_HEAD(&ce->guc_id.link); - INIT_LIST_HEAD(&ce->destroyed_link); - INIT_LIST_HEAD(&ce->parallel.child_list); /* diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 30cd81ad8911..4532d43ec9c0 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -9,6 +9,7 @@ #include #include #include +#include <linux/llist.h> #include #include @@ -224,7 +225,7 @@ struct intel_context { * list when context is pending to be destroyed (deregistered with the * GuC), protected by guc->submission_state.lock */ - struct list_head destroyed_link; + struct llist_node destroyed_link; /** @parallel: sub-structure for parallel submission members */ struct { diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index f9240d4baa69..705085058411 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -8,6 +8,7 @@ #include #include +#include <linux/llist.h> #include "intel_uncore.h" #include "intel_guc_fw.h" @@ -112,7 +113,7 @@ struct intel_guc { * @destroyed_contexts: list of contexts waiting to be destroyed * (deregistered with the GuC) */ - struct list_head destroyed_contexts; + struct llist_head destroyed_contexts; /** * @destroyed_worker: worker to deregister contexts, need as we * need to take a GT PM reference and can't from destroy diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 0a03a30e4c6d..6f7643edc139 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1771,7 +1771,7 @@ int intel_guc_submission_init(struct intel_guc *guc) spin_lock_init(&guc->submission_state.lock); INIT_LIST_HEAD(&guc->submission_state.guc_id_list); ida_init(&guc->submission_state.guc_ids); - INIT_LIST_HEAD(&guc->submission_state.destroyed_contexts); + init_llist_head(&guc->submission_state.destroyed_contexts); INIT_WORK(&guc->submission_state.destroyed_worker, destroyed_worker_func); @@ -2696,26 +2696,18 @@ static void __guc_context_destroy(struct intel_context *ce) } } +#define take_destroyed_contexts(guc) \ + llist_del_all(&guc->submission_state.destroyed_contexts) + static void guc_flush_destroyed_contexts(struct intel_guc *guc) { - struct intel_context *ce; - unsigned long flags; + struct intel_context *ce, *cn; GEM_BUG_ON(!submission_disabled(guc) && guc_submission_initialized(guc)); - while (!list_empty(&guc->submission_state.destroyed_contexts)) { - spin_lock_irqsave(&guc->submission_state.lock, flags); - ce = list_first_entry_or_null(&guc->submission_state.destroyed_contexts, - struct intel_context, - destroyed_link); - if (ce) - 
list_del_init(&ce->destroyed_link); - spin_unlock_irqrestore(&guc->submission_state.lock, flags); - - if (!ce) - break; - + llist_for_each_entry_safe(ce, cn, take_destroyed_contexts(guc), +destroyed_link) { release_guc_id(guc, ce); __guc_context_destroy(ce); } @@ -2723,23 +2715,11 @@ static void guc_flush_destroyed_contexts(struct intel_guc *guc) static void deregister_destroyed_contexts(struct intel_guc *guc) { - struct intel_context *ce; - unsigned long flags; - - while (!list_empty(&guc->submission_state.destroyed_contexts)) { - spin_lock_irqsave(&guc->submission_state.lock, flags); - ce = list_first_entry_or_null(&guc->submission_state.destroyed_contexts, -
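Distilled, this is the standard llist handoff pattern — nothing assumed here beyond <linux/llist.h> and <linux/slab.h>; the struct and function names are illustrative only:

    #include <linux/llist.h>
    #include <linux/slab.h>

    struct ctx {
            struct llist_node link;
    };

    static LLIST_HEAD(destroyed);

    /* Producers push lock-free, from any context, with no spinlock taken. */
    static void ctx_defer_destroy(struct ctx *c)
    {
            llist_add(&c->link, &destroyed);
    }

    /* The consumer atomically takes the whole list in one xchg and then owns
     * it outright; _safe iteration because entries are freed while walking.
     */
    static void ctx_destroy_worker(void)
    {
            struct ctx *c, *cn;

            llist_for_each_entry_safe(c, cn, llist_del_all(&destroyed), link)
                    kfree(c);
    }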
[PATCH] drm/i915/execlists: Weak parallel submission support for execlists
A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this interface for execlists - basically just passing submit fences between each request generated and virtual engines are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface. We perma-pin these execlists contexts to align with GuC implementation. v2: (John Harrison) - Drop siblings array as num_siblings must be 1 v3: (John Harrison) - Drop single submission v4: (John Harrison) - Actually drop single submission - Use IS_ERR check on return value from intel_context_create - Set last request to NULL on unpin Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 11 -- drivers/gpu/drm/i915/gt/intel_context.c | 4 +- .../drm/i915/gt/intel_execlists_submission.c | 38 +++ drivers/gpu/drm/i915/gt/intel_lrc.c | 4 ++ .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 - 5 files changed, 51 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index cad3f0b2be9e..b0d2d81fc3b3 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base, struct intel_engine_cs **siblings = NULL; intel_engine_mask_t prev_mask; - /* FIXME: This is NIY for execlists */ - if (!(intel_uc_uses_guc_submission(&to_gt(i915)->uc))) - return -ENODEV; - if (get_user(slot, &ext->engine_index)) return -EFAULT; @@ -583,6 +579,13 @@ set_proto_ctx_engines_parallel_submit(struct i915_user_extension __user *base, if (get_user(num_siblings, &ext->num_siblings)) return -EFAULT; + if (!intel_uc_uses_guc_submission(&to_gt(i915)->uc) && + num_siblings != 1) { + drm_dbg(&i915->drm, "Only 1 sibling (%d) supported in non-GuC mode\n", + num_siblings); + return -EINVAL; + } + if (slot >= set->num_engines) { drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n", slot, set->num_engines); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index ba083d800a08..5d0ec7c49b6a 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct intel_context *ce) __i915_active_acquire(&ce->active); - if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine)) + if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) || + intel_context_is_parallel(ce)) return 0; /* Preallocate tracking nodes */ @@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct intel_context *parent, * Callers responsibility to validate that this function is used * correctly but we use GEM_BUG_ON here ensure that they do. 
*/ - GEM_BUG_ON(!intel_engine_uses_guc(parent->engine)); GEM_BUG_ON(intel_context_is_pinned(parent)); GEM_BUG_ON(intel_context_is_child(parent)); GEM_BUG_ON(intel_context_is_pinned(child)); diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index a69df5e9e77a..be56d0b41892 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2599,6 +2599,43 @@ static void execlists_context_cancel_request(struct intel_context *ce, current->comm); } +static struct intel_context * +execlists_create_parallel(struct intel_engine_cs **engines, + unsigned int num_siblings, + unsigned int width) +{ + struct intel_context *parent = NULL, *ce, *err; + int i; + + GEM_BUG_ON(num_siblings != 1); + + for (i = 0; i < width; ++i) { + ce = intel_context_create(engines[i]); + if (IS_ERR(ce)) { + err = ce; + goto unwind; + } + + if (i == 0) + parent = ce; + else + intel_context_bind_parent_child(parent, ce); + } + + parent->parallel.fence_context = dma_fence_context_alloc(1); + + intel_context_set_nopreempt(parent); + for_each_child(parent, ce) + intel_context_set_nopreempt(ce); + + return parent; + +unwind: + if (parent) + intel_context_put(parent); + return err; +} + static const struct intel_context_ops execlists_context_ops = { .flags =
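For context, a rough sketch of what this unblocks on the uapi side, using the existing I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT extension (the field values are illustrative and the full context-param chaining is omitted); the only shape execlists accepts here is num_siblings == 1, i.e. width batch buffers on width fixed engines and no virtual engines:

    #include <drm/i915_drm.h>

    /* Two batch buffers per execbuf, one physical engine per slot. */
    struct i915_context_engines_parallel_submit parallel = {
            .base.name = I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT,
            .engine_index = 0,      /* slot in the context's engine map */
            .width = 2,             /* batch buffers per submission */
            .num_siblings = 1,      /* anything else is rejected without GuC */
    };
    /* parallel.engines[] then lists width * num_siblings physical engines and
     * the extension is chained into I915_CONTEXT_PARAM_ENGINES. */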
Re: [PATCH] drm/i915/execlists: Weak parallel submission support for execlists
On Mon, Dec 06, 2021 at 12:01:04PM -0800, John Harrison wrote: > On 11/11/2021 13:20, Matthew Brost wrote: > > A weak implementation of parallel submission (multi-bb execbuf IOCTL) for > > execlists. Doing as little as possible to support this interface for > > execlists - basically just passing submit fences between each request > > generated and virtual engines are not allowed. This is on par with what > > is there for the existing (hopefully soon deprecated) bonding interface. > > > > We perma-pin these execlists contexts to align with GuC implementation. > > > > v2: > > (John Harrison) > >- Drop siblings array as num_siblings must be 1 > > v3: > > (John Harrison) > >- Drop single submission > > > > Signed-off-by: Matthew Brost > > --- > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 10 +++-- > > drivers/gpu/drm/i915/gt/intel_context.c | 4 +- > > .../drm/i915/gt/intel_execlists_submission.c | 40 +++ > > drivers/gpu/drm/i915/gt/intel_lrc.c | 2 + > > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 2 - > > 5 files changed, 50 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > index ebd775cb1661c..d7bf6c8f70b7b 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > @@ -570,10 +570,6 @@ set_proto_ctx_engines_parallel_submit(struct > > i915_user_extension __user *base, > > struct intel_engine_cs **siblings = NULL; > > intel_engine_mask_t prev_mask; > > - /* FIXME: This is NIY for execlists */ > > - if (!(intel_uc_uses_guc_submission(&i915->gt.uc))) > > - return -ENODEV; > > - > > if (get_user(slot, &ext->engine_index)) > > return -EFAULT; > > @@ -583,6 +579,12 @@ set_proto_ctx_engines_parallel_submit(struct > > i915_user_extension __user *base, > > if (get_user(num_siblings, &ext->num_siblings)) > > return -EFAULT; > > + if (!intel_uc_uses_guc_submission(&i915->gt.uc) && num_siblings != 1) { > > + drm_dbg(&i915->drm, "Only 1 sibling (%d) supported in non-GuC > > mode\n", > > + num_siblings); > > + return -EINVAL; > > + } > > + > > if (slot >= set->num_engines) { > > drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n", > > slot, set->num_engines); > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c > > b/drivers/gpu/drm/i915/gt/intel_context.c > > index 5634d14052bc9..1bec92e1d8e63 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_context.c > > +++ b/drivers/gpu/drm/i915/gt/intel_context.c > > @@ -79,7 +79,8 @@ static int intel_context_active_acquire(struct > > intel_context *ce) > > __i915_active_acquire(&ce->active); > > - if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine)) > > + if (intel_context_is_barrier(ce) || intel_engine_uses_guc(ce->engine) || > > + intel_context_is_parallel(ce)) > > return 0; > > /* Preallocate tracking nodes */ > > @@ -563,7 +564,6 @@ void intel_context_bind_parent_child(struct > > intel_context *parent, > > * Callers responsibility to validate that this function is used > > * correctly but we use GEM_BUG_ON here ensure that they do. 
> > */ > > - GEM_BUG_ON(!intel_engine_uses_guc(parent->engine)); > > GEM_BUG_ON(intel_context_is_pinned(parent)); > > GEM_BUG_ON(intel_context_is_child(parent)); > > GEM_BUG_ON(intel_context_is_pinned(child)); > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > index ca03880fa7e49..5fd49ee47096d 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > > @@ -2598,6 +2598,45 @@ static void execlists_context_cancel_request(struct > > intel_context *ce, > > current->comm); > > } > > +static struct intel_context * > > +execlists_create_parallel(struct intel_engine_cs **engines, > > + unsigned int num_siblings, > > + unsigned int width) > > +{ > > + struct intel_context *parent = NULL, *ce, *err; > > + int i; > > + > > + GEM_BUG_ON(num_siblings != 1); > > + > > + for (i = 0; i < width; ++i) { > > + ce = intel_context_create(engines[i]); > > + if (!ce) { > > + err = ERR_PTR(-ENOMEM); > intel_context_create already checks for null and returns -ENOMEM. This needs > to check for IS_ERR(ce). > Yep. > > + goto unwind; > > + } > > + > > + if (i == 0) > > + parent = ce; > > + else > > + intel_context_bind_parent_child(parent, ce); > > + } > > + > > + parent->parallel.fence_context = dma_fence_context_alloc(1); > > + > > + intel_context_set_nopreempt(p
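The review point, distilled: intel_context_create() — like most kernel constructors that encode an errno in the pointer — never returns NULL on failure, so the test must go through IS_ERR(), as v4 of the hunk above now does:

    ce = intel_context_create(engines[i]);
    if (IS_ERR(ce)) {
            err = ce;       /* propagate the ERR_PTR-encoded errno */
            goto unwind;
    }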
Re: completely rework the dma_resv semantic
On Fri, Dec 17, 2021 at 03:39:52PM +0100, Christian König wrote: > Hi Daniel, > > looks like this is going nowhere and you don't seem to have time to review. > > What can we do? cc more people, you didn't cc any of the driver folks :-) Also I did find some review before I disappeared, back on 10th Jan. Cheers, Daniel > > Thanks, > Christian. > > Am 07.12.21 um 13:33 schrieb Christian König: > > Hi Daniel, > > > > just a gentle ping that you wanted to take a look at this. > > > > Not much changed compared to the last version, only a minor bugfix in > > the dma_resv_get_singleton error handling. > > > > Regards, > > Christian. > > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 22/24] dma-buf: wait for map to complete for static attachments
On Tue, Dec 07, 2021 at 01:34:09PM +0100, Christian König wrote: > We have previously done that in the individual drivers but it is > more defensive to move that into the common code. > > Dynamic attachments should wait for map operations to complete by themselves. > > Signed-off-by: Christian König i915 should probably stop reinventing so much stuff here and align more ... I do wonder whether we want the same for dma_buf_pin(), or at least document that for dynamic attachments, you still need to sync even if it's pinned. Especially since your kerneldoc for the usage flags suggests that waiting isn't needed, but after this patch waiting _is_ needed even for dynamic importers. So there is a gap here I think, and I deleted my r-b tag that I already typed again. Or do I miss something? Minimally needs accurate docs, but I'm leaning towards an unconditional dma_resv_wait() in dma_buf_pin() for safety's sake. > --- > drivers/dma-buf/dma-buf.c | 18 +++--- > drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 14 +- > drivers/gpu/drm/nouveau/nouveau_prime.c | 17 + > drivers/gpu/drm/radeon/radeon_prime.c | 16 +++- > 4 files changed, 20 insertions(+), 45 deletions(-) > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 528983d3ba64..d3dd602c4753 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -660,12 +660,24 @@ static struct sg_table * __map_dma_buf(struct > dma_buf_attachment *attach, > enum dma_data_direction direction) > { > struct sg_table *sg_table; > + signed long ret; > > sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction); > + if (IS_ERR_OR_NULL(sg_table)) > + return sg_table; > + > + if (!dma_buf_attachment_is_dynamic(attach)) { > + ret = dma_resv_wait_timeout(attach->dmabuf->resv, Another place where this dma_resv_wait() wrapper would be good. 
I think we should have it :-) Cheers, Daniel > + DMA_RESV_USAGE_KERNEL, true, > + MAX_SCHEDULE_TIMEOUT); > + if (ret < 0) { > + attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, > +direction); > + return ERR_PTR(ret); > + } > + } > > - if (!IS_ERR_OR_NULL(sg_table)) > - mangle_sg_table(sg_table); > - > + mangle_sg_table(sg_table); > return sg_table; > } > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > index 4896c876ffec..33127bd56c64 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > @@ -102,21 +102,9 @@ static int amdgpu_dma_buf_pin(struct dma_buf_attachment > *attach) > { > struct drm_gem_object *obj = attach->dmabuf->priv; > struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); > - int r; > > /* pin buffer into GTT */ > - r = amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); > - if (r) > - return r; > - > - if (bo->tbo.moving) { > - r = dma_fence_wait(bo->tbo.moving, true); > - if (r) { > - amdgpu_bo_unpin(bo); > - return r; > - } > - } > - return 0; > + return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); > } > > /** > diff --git a/drivers/gpu/drm/nouveau/nouveau_prime.c > b/drivers/gpu/drm/nouveau/nouveau_prime.c > index 60019d0532fc..347488685f74 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_prime.c > +++ b/drivers/gpu/drm/nouveau/nouveau_prime.c > @@ -93,22 +93,7 @@ int nouveau_gem_prime_pin(struct drm_gem_object *obj) > if (ret) > return -EINVAL; > > - ret = ttm_bo_reserve(&nvbo->bo, false, false, NULL); > - if (ret) > - goto error; > - > - if (nvbo->bo.moving) > - ret = dma_fence_wait(nvbo->bo.moving, true); > - > - ttm_bo_unreserve(&nvbo->bo); > - if (ret) > - goto error; > - > - return ret; > - > -error: > - nouveau_bo_unpin(nvbo); > - return ret; > + return 0; > } > > void nouveau_gem_prime_unpin(struct drm_gem_object *obj) > diff --git a/drivers/gpu/drm/radeon/radeon_prime.c > b/drivers/gpu/drm/radeon/radeon_prime.c > index 4a90807351e7..42a87948e28c 100644 > --- a/drivers/gpu/drm/radeon/radeon_prime.c > +++ b/drivers/gpu/drm/radeon/radeon_prime.c > @@ -77,19 +77,9 @@ int radeon_gem_prime_pin(struct drm_gem_object *obj) > > /* pin buffer into GTT */ > ret = radeon_bo_pin(bo, RADEON_GEM_DOMAIN_GTT, NULL); > - if (unlikely(ret)) > - goto error; > - > - if (bo->tbo.moving) { > - ret = dma_fence_wait(bo->tbo.moving, false); > - if (unlikely(ret)) { > - radeon_b
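A sketch of the unconditional wait suggested above, assuming the DMA_RESV_USAGE_KERNEL level introduced earlier in this series — a proposal sketch, not the dma-buf core as merged:

    int dma_buf_pin(struct dma_buf_attachment *attach)
    {
            struct dma_buf *dmabuf = attach->dmabuf;
            long lret;
            int ret = 0;

            dma_resv_assert_held(dmabuf->resv);

            if (dmabuf->ops->pin)
                    ret = dmabuf->ops->pin(attach);
            if (ret)
                    return ret;

            /* Even a pinned buffer may still have kernel moves/clears in
             * flight; don't hand the mapping out before they complete.
             */
            lret = dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_KERNEL,
                                         false, MAX_SCHEDULE_TIMEOUT);
            return lret < 0 ? lret : 0;
    }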
[RFC v2 7/8] drm/amdgpu: Drop concurrent GPU reset protection for device
Since now all GPU resets are serialized there is no need for this. This patch also reverts 'drm/amdgpu: race issue when jobs on 2 ring timeout' Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 89 ++ 1 file changed, 7 insertions(+), 82 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 107a393ebbfd..fef952ca8db5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4763,11 +4763,10 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle, return r; } -static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, +static void amdgpu_device_lock_adev(struct amdgpu_device *adev, struct amdgpu_hive_info *hive) { - if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1) != 0) - return false; + atomic_set(&adev->in_gpu_reset, 1); if (hive) { down_write_nest_lock(&adev->reset_sem, &hive->hive_lock); @@ -4786,8 +4785,6 @@ static bool amdgpu_device_lock_adev(struct amdgpu_device *adev, adev->mp1_state = PP_MP1_STATE_NONE; break; } - - return true; } static void amdgpu_device_unlock_adev(struct amdgpu_device *adev) @@ -4798,46 +4795,6 @@ static void amdgpu_device_unlock_adev(struct amdgpu_device *adev) up_write(&adev->reset_sem); } -/* - * to lockup a list of amdgpu devices in a hive safely, if not a hive - * with multiple nodes, it will be similar as amdgpu_device_lock_adev. - * - * unlock won't require roll back. - */ -static int amdgpu_device_lock_hive_adev(struct amdgpu_device *adev, struct amdgpu_hive_info *hive) -{ - struct amdgpu_device *tmp_adev = NULL; - - if (adev->gmc.xgmi.num_physical_nodes > 1) { - if (!hive) { - dev_err(adev->dev, "Hive is NULL while device has multiple xgmi nodes"); - return -ENODEV; - } - list_for_each_entry(tmp_adev, &hive->device_list, gmc.xgmi.head) { - if (!amdgpu_device_lock_adev(tmp_adev, hive)) - goto roll_back; - } - } else if (!amdgpu_device_lock_adev(adev, hive)) - return -EAGAIN; - - return 0; -roll_back: - if (!list_is_first(&tmp_adev->gmc.xgmi.head, &hive->device_list)) { - /* -* if the lockup iteration break in the middle of a hive, -* it may means there may has a race issue, -* or a hive device locked up independently. -* we may be in trouble and may not, so will try to roll back -* the lock and give out a warnning. -*/ - dev_warn(tmp_adev->dev, "Hive lock iteration broke in the middle. Rolling back to unlock"); - list_for_each_entry_continue_reverse(tmp_adev, &hive->device_list, gmc.xgmi.head) { - amdgpu_device_unlock_adev(tmp_adev); - } - } - return -EAGAIN; -} - static void amdgpu_device_resume_display_audio(struct amdgpu_device *adev) { struct pci_dev *p = NULL; @@ -5023,22 +4980,6 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, reset_context.hive = hive; clear_bit(AMDGPU_NEED_FULL_RESET, &reset_context.flags); - /* -* lock the device before we try to operate the linked list -* if didn't get the device lock, don't touch the linked list since -* others may iterating it. -*/ - r = amdgpu_device_lock_hive_adev(adev, hive); - if (r) { - dev_info(adev->dev, "Bailing on TDR for s_job:%llx, as another already in progress", - job ? job->base.id : -1); - - /* even we skipped this reset, still need to set the job to guilty */ - if (job && job->vm) - drm_sched_increase_karma(&job->base); - goto skip_recovery; - } - /* * Build list of devices to reset. 
* In case we are in XGMI hive mode, resort the device list @@ -5058,6 +4999,9 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, /* block all schedulers and reset given job's ring */ list_for_each_entry(tmp_adev, device_list_handle, reset_list) { + + amdgpu_device_lock_adev(tmp_adev, hive); + /* * Try to put the audio codec into suspend state * before gpu reset started. @@ -5209,13 +5153,12 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, amdgpu_device_unlock_adev(tmp_adev); } -skip_recovery: if (hive) { mutex_unlock(&hive->hive_lock); amdgpu_put_xgmi_hive(hive);
[RFC v2 6/8] drm/amdgpu: Drop hive->in_reset
Since we serialize all resets no need to protect from concurrent resets. Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 19 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 1 - 3 files changed, 1 insertion(+), 20 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 258ec3c0b2af..107a393ebbfd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5013,25 +5013,9 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, dev_info(adev->dev, "GPU %s begin!\n", need_emergency_restart ? "jobs stop":"reset"); - /* -* Here we trylock to avoid chain of resets executing from -* either trigger by jobs on different adevs in XGMI hive or jobs on -* different schedulers for same device while this TO handler is running. -* We always reset all schedulers for device and all devices for XGMI -* hive so that should take care of them too. -*/ hive = amdgpu_get_xgmi_hive(adev); - if (hive) { - if (atomic_cmpxchg(&hive->in_reset, 0, 1) != 0) { - DRM_INFO("Bailing on TDR for s_job:%llx, hive: %llx as another already in progress", - job ? job->base.id : -1, hive->hive_id); - amdgpu_put_xgmi_hive(hive); - if (job && job->vm) - drm_sched_increase_karma(&job->base); - return 0; - } + if (hive) mutex_lock(&hive->hive_lock); - } reset_context.method = AMD_RESET_METHOD_NONE; reset_context.reset_req_dev = adev; @@ -5227,7 +5211,6 @@ int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, skip_recovery: if (hive) { - atomic_set(&hive->in_reset, 0); mutex_unlock(&hive->hive_lock); amdgpu_put_xgmi_hive(hive); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index a858e3457c5c..9ad742039ac9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -404,7 +404,6 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev) INIT_LIST_HEAD(&hive->device_list); INIT_LIST_HEAD(&hive->node); mutex_init(&hive->hive_lock); - atomic_set(&hive->in_reset, 0); atomic_set(&hive->number_devices, 0); task_barrier_init(&hive->tb); hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h index 6121aaa292cb..2f2ce53645a5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h @@ -33,7 +33,6 @@ struct amdgpu_hive_info { struct list_head node; atomic_t number_devices; struct mutex hive_lock; - atomic_t in_reset; int hi_req_count; struct amdgpu_device *hi_req_gpu; struct task_barrier tb; -- 2.25.1
[RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV
Since now flr work is serialized against GPU resets there is no need for this. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 11 --- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 11 --- 2 files changed, 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c index 487cd654b69e..7d59a66e3988 100644 --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c @@ -248,15 +248,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work) struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, virt); int timeout = AI_MAILBOX_POLL_FLR_TIMEDOUT; - /* block amdgpu_gpu_recover till msg FLR COMPLETE received, -* otherwise the mailbox msg will be ruined/reseted by -* the VF FLR. -*/ - if (!down_write_trylock(&adev->reset_sem)) - return; - amdgpu_virt_fini_data_exchange(adev); - atomic_set(&adev->in_gpu_reset, 1); xgpu_ai_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0); @@ -269,9 +261,6 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work) } while (timeout > 1); flr_done: - atomic_set(&adev->in_gpu_reset, 0); - up_write(&adev->reset_sem); - /* Trigger recovery for world switch failure if no TDR */ if (amdgpu_device_should_recover_gpu(adev) && (!amdgpu_device_has_job_running(adev) || diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c index e3869067a31d..f82c066c8e8d 100644 --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c @@ -277,15 +277,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work) struct amdgpu_device *adev = container_of(virt, struct amdgpu_device, virt); int timeout = NV_MAILBOX_POLL_FLR_TIMEDOUT; - /* block amdgpu_gpu_recover till msg FLR COMPLETE received, -* otherwise the mailbox msg will be ruined/reseted by -* the VF FLR. -*/ - if (!down_write_trylock(&adev->reset_sem)) - return; - amdgpu_virt_fini_data_exchange(adev); - atomic_set(&adev->in_gpu_reset, 1); xgpu_nv_mailbox_trans_msg(adev, IDH_READY_TO_RESET, 0, 0, 0); @@ -298,9 +290,6 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work) } while (timeout > 1); flr_done: - atomic_set(&adev->in_gpu_reset, 0); - up_write(&adev->reset_sem); - /* Trigger recovery for world switch failure if no TDR */ if (amdgpu_device_should_recover_gpu(adev) && (!amdgpu_device_has_job_running(adev) || -- 2.25.1
[RFC v2 5/8] drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.
No need to trigger another work queue inside the work queue. Suggested-by: Liu Shaoyun Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 7 +-- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 7 +-- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 7 +-- 3 files changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c index 23b066bcffb2..487cd654b69e 100644 --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c @@ -276,7 +276,7 @@ static void xgpu_ai_mailbox_flr_work(struct work_struct *work) if (amdgpu_device_should_recover_gpu(adev) && (!amdgpu_device_has_job_running(adev) || adev->sdma_timeout == MAX_SCHEDULE_TIMEOUT)) - amdgpu_device_gpu_recover(adev, NULL); + amdgpu_device_gpu_recover_imp(adev, NULL); } static int xgpu_ai_set_mailbox_rcv_irq(struct amdgpu_device *adev, @@ -302,7 +302,10 @@ static int xgpu_ai_mailbox_rcv_irq(struct amdgpu_device *adev, switch (event) { case IDH_FLR_NOTIFICATION: if (amdgpu_sriov_runtime(adev)) - schedule_work(&adev->virt.flr_work); + WARN_ONCE(!queue_work(adev->reset_domain.wq, + &adev->virt.flr_work), + "Failed to queue work! at %s", + __func__); break; case IDH_QUERY_ALIVE: xgpu_ai_mailbox_send_ack(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c index a35e6d87e537..e3869067a31d 100644 --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c @@ -308,7 +308,7 @@ static void xgpu_nv_mailbox_flr_work(struct work_struct *work) adev->gfx_timeout == MAX_SCHEDULE_TIMEOUT || adev->compute_timeout == MAX_SCHEDULE_TIMEOUT || adev->video_timeout == MAX_SCHEDULE_TIMEOUT)) - amdgpu_device_gpu_recover(adev, NULL); + amdgpu_device_gpu_recover_imp(adev, NULL); } static int xgpu_nv_set_mailbox_rcv_irq(struct amdgpu_device *adev, @@ -337,7 +337,10 @@ static int xgpu_nv_mailbox_rcv_irq(struct amdgpu_device *adev, switch (event) { case IDH_FLR_NOTIFICATION: if (amdgpu_sriov_runtime(adev)) - schedule_work(&adev->virt.flr_work); + WARN_ONCE(!queue_work(adev->reset_domain.wq, + &adev->virt.flr_work), + "Failed to queue work! at %s", + __func__); break; /* READY_TO_ACCESS_GPU is fetched by kernel polling, IRQ can ignore * it byfar since that polling thread will handle it, diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c index aef9d059ae52..23e802cae2bb 100644 --- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c +++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c @@ -521,7 +521,7 @@ static void xgpu_vi_mailbox_flr_work(struct work_struct *work) /* Trigger recovery due to world switch failure */ if (amdgpu_device_should_recover_gpu(adev)) - amdgpu_device_gpu_recover(adev, NULL); + amdgpu_device_gpu_recover_imp(adev, NULL); } static int xgpu_vi_set_mailbox_rcv_irq(struct amdgpu_device *adev, @@ -551,7 +551,10 @@ static int xgpu_vi_mailbox_rcv_irq(struct amdgpu_device *adev, /* only handle FLR_NOTIFY now */ if (!r) - schedule_work(&adev->virt.flr_work); + WARN_ONCE(!queue_work(adev->reset_domain.wq, + &adev->virt.flr_work), + "Failed to queue work! at %s", + __func__); } return 0; -- 2.25.1
Re: [PATCH 21/24] dma-buf: add DMA_RESV_USAGE_BOOKKEEP
On Tue, Dec 07, 2021 at 01:34:08PM +0100, Christian König wrote: > Add an usage for submissions independent of implicit sync but still > interesting for memory management. > > Signed-off-by: Christian König Focusing on the kerneldoc first to get semantics agreed. > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index 29d71496..07ae5b00c1fa 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -55,7 +55,7 @@ struct dma_resv_list; > * This enum describes the different use cases for a dma_resv object and > * controls which fences are returned when queried. > * > - * An important fact is that there is the order KERNEL + * An important fact is that there is the order KERNEL and > * when the dma_resv object is asked for fences for one use case the fences > * for the lower use case are returned as well. > * > @@ -93,6 +93,22 @@ enum dma_resv_usage { >* an implicit read dependency. >*/ > DMA_RESV_USAGE_READ, > + > + /** > + * @DMA_RESV_USAGE_BOOKKEEP: No implicit sync. > + * > + * This should be used by submissions which don't want to participate in > + * implicit synchronization. Uh we might still have a disagreement, because that isn't really what drivers which added opt-in implicit sync have done thus far. Minimally we need a note that some drivers also use _READ for this. > + * > + * The most common case are submissions with explicit synchronization, > + * but also things like preemption fences as well as page table updates > + * might use this. > + * > + * The kernel memory management *always* need to wait for those fences > + * before moving or freeing the resource protected by the dma_resv > + * object. Yeah this is the comment I wanted to see for READ, and which now is in bookkeeping (where it's correct in the end). I think we still should have something in the READ comment (and here) explaining that there could very well be writes hiding behind this, and that the kernel cannot assume anything about what's going on in general (maybe some drivers enforce read/write through command parsers). Also all the text in dma_buf.resv needs to be updated to use the right constants instead of words. -Daniel > + */ > + DMA_RESV_USAGE_BOOKKEEP > }; > > /** > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
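How a memory-management path would consume the new level, sketched with this series' iterator (the bo variable and locking context are illustrative): asking for BOOKKEEP fences returns the KERNEL, WRITE and READ fences as well, so eviction cannot miss a submission that opted out of implicit sync:

    struct dma_resv_iter cursor;
    struct dma_fence *fence;
    long ret = 0;

    /* bo->base.resv must be held for this iterator variant */
    dma_resv_for_each_fence(&cursor, bo->base.resv,
                            DMA_RESV_USAGE_BOOKKEEP, fence) {
            /* every fence the memory manager must honour before moving bo,
             * implicit sync or not */
            ret = dma_fence_wait(fence, false);
            if (ret)
                    break;
    }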
[RFC v2 2/8] drm/amdgpu: Move scheduler init to after XGMI is ready
Before we initialize the schedulers we must know which reset domain we are in - for a single device there is a single domain per device and so a single wq per device. For XGMI the reset domain spans the entire XGMI hive and so the reset wq is per hive. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 45 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 34 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 + 3 files changed, 51 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 0f3e6c078f88..7c063fd37389 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2284,6 +2284,47 @@ static int amdgpu_device_fw_loading(struct amdgpu_device *adev) return r; } +static int amdgpu_device_init_schedulers(struct amdgpu_device *adev) +{ + long timeout; + int r, i; + + for (i = 0; i < AMDGPU_MAX_RINGS; ++i) { + struct amdgpu_ring *ring = adev->rings[i]; + + /* No need to setup the GPU scheduler for rings that don't need it */ + if (!ring || ring->no_scheduler) + continue; + + switch (ring->funcs->type) { + case AMDGPU_RING_TYPE_GFX: + timeout = adev->gfx_timeout; + break; + case AMDGPU_RING_TYPE_COMPUTE: + timeout = adev->compute_timeout; + break; + case AMDGPU_RING_TYPE_SDMA: + timeout = adev->sdma_timeout; + break; + default: + timeout = adev->video_timeout; + break; + } + + r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, + ring->num_hw_submission, amdgpu_job_hang_limit, + timeout, adev->reset_domain.wq, ring->sched_score, ring->name); + if (r) { + DRM_ERROR("Failed to create scheduler on ring %s.\n", + ring->name); + return r; + } + } + + return 0; +} + + /** * amdgpu_device_ip_init - run init for hardware IPs * @@ -2412,6 +2453,10 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) } } + r = amdgpu_device_init_schedulers(adev); + if (r) + goto init_failed; + /* Don't init kfd if whole hive need to be reset during init */ if (!adev->gmc.xgmi.pending_reset) amdgpu_amdkfd_device_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 3b7e86ea7167..5527c68c51de 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -456,8 +456,6 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, atomic_t *sched_score) { struct amdgpu_device *adev = ring->adev; - long timeout; - int r; if (!adev) return -EINVAL; @@ -477,36 +475,12 @@ int amdgpu_fence_driver_init_ring(struct amdgpu_ring *ring, spin_lock_init(&ring->fence_drv.lock); ring->fence_drv.fences = kcalloc(num_hw_submission * 2, sizeof(void *), GFP_KERNEL); - if (!ring->fence_drv.fences) - return -ENOMEM; - /* No need to setup the GPU scheduler for rings that don't need it */ - if (ring->no_scheduler) - return 0; + ring->num_hw_submission = num_hw_submission; + ring->sched_score = sched_score; - switch (ring->funcs->type) { - case AMDGPU_RING_TYPE_GFX: - timeout = adev->gfx_timeout; - break; - case AMDGPU_RING_TYPE_COMPUTE: - timeout = adev->compute_timeout; - break; - case AMDGPU_RING_TYPE_SDMA: - timeout = adev->sdma_timeout; - break; - default: - timeout = adev->video_timeout; - break; - } - - r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, - num_hw_submission, amdgpu_job_hang_limit, - timeout, NULL, sched_score, ring->name); - if (r) { - DRM_ERROR("Failed to create scheduler on ring %s.\n", - ring->name); - return r; - } + if (!ring->fence_drv.fences) + return 
-ENOMEM; return 0; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h index 4d380e79752c..a4b8279e3011 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h @@ -
[RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs
Use the reset domain wq also for non-TDR gpu recovery triggers such as sysfs and RAS. We must serialize all possible GPU recoveries to guarantee no concurrency there. For TDR call the original recovery function directly since it's already executed from within the wq. For others just use a wrapper to queue work and wait on it to finish. v2: Rename to amdgpu_recover_work_struct Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 33 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 2 +- 3 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b5ff76aae7e0..8e96b9a14452 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1296,6 +1296,8 @@ bool amdgpu_device_has_job_running(struct amdgpu_device *adev); bool amdgpu_device_should_recover_gpu(struct amdgpu_device *adev); int amdgpu_device_gpu_recover(struct amdgpu_device *adev, struct amdgpu_job* job); +int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, + struct amdgpu_job *job); void amdgpu_device_pci_config_reset(struct amdgpu_device *adev); int amdgpu_device_pci_reset(struct amdgpu_device *adev); bool amdgpu_device_need_post(struct amdgpu_device *adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 7c063fd37389..258ec3c0b2af 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4979,7 +4979,7 @@ static void amdgpu_device_recheck_guilty_jobs( * Returns 0 for success or an error on failure. */ -int amdgpu_device_gpu_recover(struct amdgpu_device *adev, +int amdgpu_device_gpu_recover_imp(struct amdgpu_device *adev, struct amdgpu_job *job) { struct list_head device_list, *device_list_handle = NULL; @@ -5237,6 +5237,37 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, return r; } +struct amdgpu_recover_work_struct { + struct work_struct base; + struct amdgpu_device *adev; + struct amdgpu_job *job; + int ret; +}; + +static void amdgpu_device_queue_gpu_recover_work(struct work_struct *work) +{ + struct amdgpu_recover_work_struct *recover_work = container_of(work, struct amdgpu_recover_work_struct, base); + + recover_work->ret = amdgpu_device_gpu_recover_imp(recover_work->adev, recover_work->job); +} +/* + * Serialize gpu recover into reset domain single threaded wq + */ +int amdgpu_device_gpu_recover(struct amdgpu_device *adev, + struct amdgpu_job *job) +{ + struct amdgpu_recover_work_struct work = {.adev = adev, .job = job}; + + INIT_WORK(&work.base, amdgpu_device_queue_gpu_recover_work); + + if (!queue_work(adev->reset_domain.wq, &work.base)) + return -EAGAIN; + + flush_work(&work.base); + + return work.ret; +} + /** * amdgpu_device_get_pcie_info - fence pcie info about the PCIE slot * diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index bfc47bea23db..38c9fd7b7ad4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -63,7 +63,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) ti.process_name, ti.tgid, ti.task_name, ti.pid); if (amdgpu_device_should_recover_gpu(ring->adev)) { - amdgpu_device_gpu_recover(ring->adev, job); + amdgpu_device_gpu_recover_imp(ring->adev, job); } else { drm_sched_suspend_timeout(&ring->sched); if (amdgpu_sriov_vf(adev)) -- 2.25.1
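The serialization idiom above, reduced to a self-contained sketch (only <linux/workqueue.h> semantics assumed): an ordered workqueue runs at most one item at a time, so queue_work() plus flush_work() turns arbitrary concurrent callers into a strict sequence that still hands each caller its own result. The _ONSTACK variants are a small hardening over the patch for a stack-allocated work item:

    #include <linux/workqueue.h>

    struct sync_reset_work {
            struct work_struct base;
            int ret;
    };

    static void sync_reset_func(struct work_struct *work)
    {
            struct sync_reset_work *w =
                    container_of(work, struct sync_reset_work, base);

            w->ret = 0;     /* the actual recovery would run here */
    }

    static int reset_serialized(struct workqueue_struct *ordered_wq)
    {
            struct sync_reset_work w = { .ret = 0 };

            INIT_WORK_ONSTACK(&w.base, sync_reset_func);
            if (!queue_work(ordered_wq, &w.base))
                    return -EAGAIN; /* work item already pending */
            flush_work(&w.base);
            destroy_work_on_stack(&w.base);
            return w.ret;
    }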
[RFC v2 3/8] drm/amdgpu: Fix crash on modprobe
Restrict job resubmission to the suspend case only, since the schedulers are not initialised yet on probe. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c index 5527c68c51de..8ebd954e06c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c @@ -582,7 +582,7 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device *adev) if (!ring || !ring->fence_drv.initialized) continue; - if (!ring->no_scheduler) { + if (adev->in_suspend && !ring->no_scheduler) { drm_sched_resubmit_jobs(&ring->sched); drm_sched_start(&ring->sched, true); } -- 2.25.1
[RFC v2 1/8] drm/amdgpu: Introduce reset domain
Defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. Do it for both single device and XGMI hive cases. Signed-off-by: Andrey Grodzovsky Suggested-by: Daniel Vetter Suggested-by: Christian König Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 9 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 2 ++ 4 files changed, 37 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 9f017663ac50..b5ff76aae7e0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -812,6 +812,11 @@ struct amd_powerplay { #define AMDGPU_RESET_MAGIC_NUM 64 #define AMDGPU_MAX_DF_PERFMONS 4 + +struct amdgpu_reset_domain { + struct workqueue_struct *wq; +}; + struct amdgpu_device { struct device *dev; struct pci_dev *pdev; @@ -1096,6 +1101,8 @@ struct amdgpu_device { struct amdgpu_reset_control *reset_cntl; uint32_t ip_versions[HW_ID_MAX][HWIP_MAX_INSTANCE]; + + struct amdgpu_reset_domain reset_domain; }; static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 90d22a376632..0f3e6c078f88 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2391,9 +2391,27 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev) if (r) goto init_failed; - if (adev->gmc.xgmi.num_physical_nodes > 1) + if (adev->gmc.xgmi.num_physical_nodes > 1) { + struct amdgpu_hive_info *hive; + amdgpu_xgmi_add_device(adev); + hive = amdgpu_get_xgmi_hive(adev); + if (!hive || !hive->reset_domain.wq) { + DRM_ERROR("Failed to obtain reset domain info for XGMI hive:%llx", hive->hive_id); + r = -EINVAL; + goto init_failed; + } + + adev->reset_domain.wq = hive->reset_domain.wq; + } else { + adev->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0); + if (!adev->reset_domain.wq) { + r = -ENOMEM; + goto init_failed; + } + } + /* Don't init kfd if whole hive need to be reset during init */ if (!adev->gmc.xgmi.pending_reset) amdgpu_amdkfd_device_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index 567df2db23ac..a858e3457c5c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -392,6 +392,14 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev) goto pro_end; } + hive->reset_domain.wq = alloc_ordered_workqueue("amdgpu-reset-hive", 0); + if (!hive->reset_domain.wq) { + dev_err(adev->dev, "XGMI: failed allocating wq for reset domain!\n"); + kfree(hive); + hive = NULL; + goto pro_end; + } + hive->hive_id = adev->gmc.xgmi.hive_id; INIT_LIST_HEAD(&hive->device_list); INIT_LIST_HEAD(&hive->node); @@ -401,6 +409,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev) task_barrier_init(&hive->tb); hive->pstate = AMDGPU_XGMI_PSTATE_UNKNOWN; hive->hi_req_gpu = NULL; + /* * hive pstate on boot is high in vega20 so we have to go to low * pstate on after boot. 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h index d2189bf7d428..6121aaa292cb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h @@ -42,6 +42,8 @@ struct amdgpu_hive_info { AMDGPU_XGMI_PSTATE_MAX_VEGA20, AMDGPU_XGMI_PSTATE_UNKNOWN } pstate; + + struct amdgpu_reset_domain reset_domain; }; struct amdgpu_pcs_ras_field { -- 2.25.1
[RFC v2 0/8] Define and use reset domain for GPU recovery in amdgpu
This patchset is based on earlier work by Boris[1] that allowed having an ordered workqueue at the driver level that is used by the different schedulers to queue their timeout work. On top of that I also serialized any GPU reset we trigger from within amdgpu code to go through the same ordered wq, which somewhat simplifies our GPU reset code since we no longer need to protect against concurrency between multiple GPU reset triggers such as TDR on one hand and the sysfs trigger or RAS trigger on the other hand. As advised by Christian and Daniel I defined a reset_domain struct such that all the entities that go through reset together will be serialized one against another. TDRs triggered by multiple entities within the same domain for the same reason will not all run, as the first such reset will cancel all the pending resets. This is relevant only to TDR timers and not to triggered resets coming from RAS or sysfs; those will still happen after the in-flight resets finish. v2: Add handling for the SRIOV configuration; the reset notification coming from the host and driver already triggers a work queue to handle the reset, so drop this intermediate wq and send directly to the timeout wq. (Shaoyun) [1] https://patchwork.kernel.org/project/dri-devel/patch/20210629073510.2764391-3-boris.brezil...@collabora.com/ P.S. Going through drm-misc-next and not amd-staging-drm-next as Boris' work hasn't landed there yet. Andrey Grodzovsky (8): drm/amdgpu: Introduce reset domain drm/amdgpu: Move scheduler init to after XGMI is ready drm/amdgpu: Fix crash on modprobe drm/amdgpu: Serialize non TDR gpu recovery with TDRs drm/amd/virt: For SRIOV send GPU reset directly to TDR queue. drm/amdgpu: Drop hive->in_reset drm/amdgpu: Drop concurrent GPU reset protection for device drm/amd/virt: Drop concurrent GPU reset protection for SRIOV drivers/gpu/drm/amd/amdgpu/amdgpu.h| 9 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 206 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 36 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 10 +- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 3 +- drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c | 18 +- drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c | 18 +- drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c | 7 +- 10 files changed, 147 insertions(+), 164 deletions(-) -- 2.25.1
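For illustration, a minimal sketch of the serialization idea described above (a simplified reading, not the series' actual code; the init and queue helpers are hypothetical): every reset source of a domain queues onto one ordered workqueue, which runs at most one work item at a time in queueing order, so resets are serialized without extra locking.

#include <linux/workqueue.h>

struct amdgpu_reset_domain {
        struct workqueue_struct *wq;    /* ordered: one work item at a time */
};

static int reset_domain_init(struct amdgpu_reset_domain *domain)
{
        domain->wq = alloc_ordered_workqueue("amdgpu-reset-dev", 0);
        return domain->wq ? 0 : -ENOMEM;
}

/* TDR, RAS and sysfs triggers all funnel through here; a reset queued
 * second simply waits its turn and can then observe that the first one
 * already recovered the GPU. */
static void queue_gpu_reset(struct amdgpu_reset_domain *domain,
                            struct work_struct *reset_work)
{
        queue_work(domain->wq, reset_work);
}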
Re: [PATCH 20/24] dma-buf: add DMA_RESV_USAGE_KERNEL
On Tue, Dec 07, 2021 at 01:34:07PM +0100, Christian König wrote: > Add a usage for kernel submissions. Waiting for those > is mandatory for dynamic DMA-bufs. > > Signed-off-by: Christian König Again just skipping to the doc bikeshedding, maybe with more cc others can help with some code review too. > EXPORT_SYMBOL(ib_umem_dmabuf_map_pages); > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index 4f3a6abf43c4..29d71496 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -54,8 +54,30 @@ struct dma_resv_list; > * > * This enum describes the different use cases for a dma_resv object and > * controls which fences are returned when queried. > + * > + * An important fact is that there is the order KERNEL < WRITE < READ and > + * when the dma_resv object is asked for fences for one use case the fences > + * for the lower use case are returned as well. > + * > + * For example when asking for WRITE fences then the KERNEL fences are > returned > + * as well. Similarly when asked for READ fences then both WRITE and KERNEL > + * fences are returned as well. > */ > enum dma_resv_usage { > + /** > + * @DMA_RESV_USAGE_KERNEL: For in kernel memory management only. > + * > + * This should only be used for things like copying or clearing memory > + * with a DMA hardware engine for the purpose of kernel memory > + * management. > + * > + * Drivers *always* need to wait for those fences before accessing > the s/need to/must/ to stay with usual RFC wording. It's a hard requirement or there's a security bug somewhere. > + * resource protected by the dma_resv object. The only exception for > + * that is when the resource is known to be locked down in place by > + * pinning it previously. Is this true? This sounds more confusing than helpful, because afaik in general our pin interfaces do not block for any kernel fences. dma_buf_pin doesn't do that for sure. And I don't think ttm does that either. I think the only safe thing here is to state that it's safe if a) the resource is pinned down and b) the caller has previously waited for the kernel fences. I also think we should put that wait for kernel fences into dma_buf_pin(), but that's maybe a later patch. -Daniel > + */ > + DMA_RESV_USAGE_KERNEL, > + > /** >* @DMA_RESV_USAGE_WRITE: Implicit write synchronization. >* > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
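The cascading rule being documented above can be stated compactly; a hedged sketch, with a helper name made up purely for illustration:

/* With the ordering KERNEL < WRITE < READ, a query for one use case
 * also returns the fences of every lower use case: a fence is visible
 * to a query exactly when its usage is at or below the queried usage. */
static bool fence_visible_to_query(enum dma_resv_usage fence_usage,
                                   enum dma_resv_usage query_usage)
{
        return fence_usage <= query_usage;
}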
Re: [PATCH 18/24] dma-buf: add enum dma_resv_usage v3
On Tue, Dec 07, 2021 at 01:34:05PM +0100, Christian König wrote: > This change adds the dma_resv_usage enum and allows us to specify why a > dma_resv object is queried for its containing fences. > > Additional to that a dma_resv_usage_rw() helper function is added to aid > retrieving the fences for a read or write userspace submission. > > This is then deployed to the different query functions of the dma_resv > object and all of their users. When the write parameter was previously > true we now use DMA_RESV_USAGE_WRITE and DMA_RESV_USAGE_READ otherwise. > > v2: add KERNEL/OTHER in separate patch > v3: some kerneldoc suggestions by Daniel > > Signed-off-by: Christian König Just commenting on the kerneldoc here. > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index 40ac9d486f8f..d96d8ca9af56 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -49,6 +49,49 @@ extern struct ww_class reservation_ww_class; > > struct dma_resv_list; > > +/** > + * enum dma_resv_usage - how the fences from a dma_resv obj are used > + * > + * This enum describes the different use cases for a dma_resv object and > + * controls which fences are returned when queried. We need to link here to both dma_buf.resv and from there to here. Also we had a fair amount of text in the old dma_resv fields which should probably be included here. > + */ > +enum dma_resv_usage { > + /** > + * @DMA_RESV_USAGE_WRITE: Implicit write synchronization. > + * > + * This should only be used for userspace command submissions which add > + * an implicit write dependency. > + */ > + DMA_RESV_USAGE_WRITE, > + > + /** > + * @DMA_RESV_USAGE_READ: Implicit read synchronization. > + * > + * This should only be used for userspace command submissions which add > + * an implicit read dependency. I think the above would benefit from at least a link each to &dma_buf.resv for further discussion. Plus the READ flag needs a huge warning that in general it does _not_ guarantee that there are no writes possible, nor that the writes can be assumed to be mistakes and dropped (on buffer moves e.g.). Drivers can only make further assumptions for driver-internal dma_resv objects (e.g. on vm/pagetables) or when the fences are all fences of the same driver (e.g. the special sync rules amd has that takes the fence owner into account). We have this documented in the dma_buf.resv rules, but since it came up again in a discussion with Thomas H. somewhere, it's better to hammer this in a few more times. Specifically, generally ignoring READ fences for buffer moves (well the copy job, memory freeing still has to wait for all of them) is a correctness bug. Maybe include a big warning that really the difference between READ and WRITE should only matter for implicit sync, and _not_ for anything else the kernel does. I'm assuming the actual replacement is all mechanical, so I skipped that one for now, that's for next year :-) -Daniel > + */ > + DMA_RESV_USAGE_READ, > +}; > + > +/** > + * dma_resv_usage_rw - helper for implicit sync > + * @write: true if we create a new implicit sync write > + * > + * This returns the implicit synchronization usage for write or read > accesses, > + * see enum dma_resv_usage. > + */ > +static inline enum dma_resv_usage dma_resv_usage_rw(bool write) > +{ > + /* This looks confusing at first sight, but is indeed correct. > + * > + * The rational is that new write operations needs to wait for the > + * existing read and write operations to finish. 
> + * But a new read operation only needs to wait for the existing write > + * operations to finish. > + */ > + return write ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE; > +} > + > /** > * struct dma_resv - a reservation object manages fences for a buffer > * > @@ -147,8 +190,8 @@ struct dma_resv_iter { > /** @obj: The dma_resv object we iterate over */ > struct dma_resv *obj; > > - /** @all_fences: If all fences should be returned */ > - bool all_fences; > + /** @usage: Controls which fences are returned */ > + enum dma_resv_usage usage; > > /** @fence: the currently handled fence */ > struct dma_fence *fence; > @@ -178,14 +221,14 @@ struct dma_fence *dma_resv_iter_next(struct > dma_resv_iter *cursor); > * dma_resv_iter_begin - initialize a dma_resv_iter object > * @cursor: The dma_resv_iter object to initialize > * @obj: The dma_resv object which we want to iterate over > - * @all_fences: If all fences should be returned or just the exclusive one > + * @usage: controls which fences to include, see enum dma_resv_usage. > */ > static inline void dma_resv_iter_begin(struct dma_resv_iter *cursor, > struct dma_resv *obj, > -bool all_fences) > +enum dma_resv_usage usage) > { > curso
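As a hedged usage sketch of the new helper (obj and job are placeholders, and add_job_dependency() is a hypothetical consumer rather than a real API): a submission that writes a buffer must wait for both readers and writers, which dma_resv_usage_rw(true) maps to DMA_RESV_USAGE_READ.

struct dma_resv_iter cursor;
struct dma_fence *fence;

dma_resv_for_each_fence(&cursor, obj->resv,
                        dma_resv_usage_rw(true /* write */), fence) {
        /* take a reference and record the implicit-sync dependency */
        add_job_dependency(job, dma_fence_get(fence));
}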
Re: [Intel-gfx] [PATCH] drm/i915/guc: Log engine resets
On 12/22/2021 08:21, Tvrtko Ursulin wrote: On 21/12/2021 22:14, John Harrison wrote: On 12/21/2021 05:37, Tvrtko Ursulin wrote: On 20/12/2021 18:34, John Harrison wrote: On 12/20/2021 07:00, Tvrtko Ursulin wrote: On 17/12/2021 16:22, Matthew Brost wrote: On Fri, Dec 17, 2021 at 12:15:53PM +, Tvrtko Ursulin wrote: On 14/12/2021 15:07, Tvrtko Ursulin wrote: From: Tvrtko Ursulin Log engine resets done by the GuC firmware in the similar way it is done by the execlists backend. This way we have notion of where the hangs are before the GuC gains support for proper error capture. Ping - any interest to log this info? All there currently is a non-descriptive "[drm] GPU HANG: ecode 12:0:". Yea, this could be helpful. One suggestion below. Also, will GuC be reporting the reason for the engine reset at any point? We are working on the error state capture, presumably the registers will give a clue what caused the hang. As for the GuC providing a reason, that isn't defined in the interface but that is decent idea to provide a hint in G2H what the issue was. Let me run that by the i915 GuC developers / GuC firmware team and see what they think. The GuC does not do any hang analysis. So as far as GuC is concerned, the reason is pretty much always going to be pre-emption timeout. There are a few ways the pre-emption itself could be triggered but basically, if GuC resets an active context then it is because it did not pre-empt quickly enough when requested. Regards, Tvrtko Signed-off-by: Tvrtko Ursulin Cc: Matthew Brost Cc: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 9739da6f..51512123dc1a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -11,6 +11,7 @@ #include "gt/intel_context.h" #include "gt/intel_engine_pm.h" #include "gt/intel_engine_heartbeat.h" +#include "gt/intel_engine_user.h" #include "gt/intel_gpu_commands.h" #include "gt/intel_gt.h" #include "gt/intel_gt_clock_utils.h" @@ -3934,9 +3935,18 @@ static void capture_error_state(struct intel_guc *guc, { struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; - struct intel_engine_cs *engine = __context_to_physical_engine(ce); + struct intel_engine_cs *engine = ce->engine; intel_wakeref_t wakeref; + if (intel_engine_is_virtual(engine)) { + drm_notice(&i915->drm, "%s class, engines 0x%x; GuC engine reset\n", + intel_engine_class_repr(engine->class), + engine->mask); + engine = guc_virtual_get_sibling(engine, 0); + } else { + drm_notice(&i915->drm, "%s GuC engine reset\n", engine->name); Probably include the guc_id of the context too then? Is the guc id stable and useful on its own - who would be the user? The GuC id is the only thing that matters when trying to correlate KMD activity with a GuC log. So while it might not be of any use or interest to an end user, it is extremely important and useful to a kernel developer attempting to debug an issue. And that includes bug reports from end users that are hard to repro given that the standard error capture will include the GuC log. On the topic of GuC log - is there a tool in IGT (or will be) which will parse the bit saved in the error capture or how is that supposed to be used? Nope. However, Alan is currently working on supporting the GuC error capture mechanism. 
Prior to sending the reset notification to the KMD, the GuC will save a whole bunch of register state to a memory buffer and send a notification to the KMD that this is available. When we then get the actual reset notification, we need to match the two together and include a parsed, human-readable version of the GuC's capture state buffer in the sysfs error log output. The GuC log should not be involved in this process. And note that any register dumps in the GuC log are limited in scope and only enabled at higher verbosity levels. Whereas, the official state capture is based on a register list provided by the KMD and is available irrespective of debug CONFIG settings, verbosity levels, etc. Hm why should GuC log not be involved now? I thought earlier you said: """ And that includes bug reports from end users that are hard to repro given that the standard error capture will include the GuC log. """ Hence I thought there would be a tool in IGT which would parse the part saved inside the error capture. Different things. The GuC log is not involved in capturing hardware register state and reporting that as part of the sysfs error capture that users can read out. The GuC needs to do the state capture for us if it is doing the reset, but it is provided v
Re: [PATCH 17/24] drm/amdgpu: use dma_resv_get_singleton in amdgpu_pasid_free_cb
On Tue, Dec 07, 2021 at 01:34:04PM +0100, Christian König wrote: > Makes the code a bit simpler. > > Signed-off-by: Christian König Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 23 +++ > 1 file changed, 3 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > index be48487e2ca7..888d97143177 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c > @@ -107,36 +107,19 @@ static void amdgpu_pasid_free_cb(struct dma_fence > *fence, > void amdgpu_pasid_free_delayed(struct dma_resv *resv, > u32 pasid) > { > - struct dma_fence *fence, **fences; > struct amdgpu_pasid_cb *cb; > - unsigned count; > + struct dma_fence *fence; > int r; > > - r = dma_resv_get_fences(resv, true, &count, &fences); > + r = dma_resv_get_singleton(resv, true, &fence); > if (r) > goto fallback; > > - if (count == 0) { > + if (!fence) { > amdgpu_pasid_free(pasid); > return; > } > > - if (count == 1) { > - fence = fences[0]; > - kfree(fences); > - } else { > - uint64_t context = dma_fence_context_alloc(1); > - struct dma_fence_array *array; > - > - array = dma_fence_array_create(count, fences, context, > -1, false); > - if (!array) { > - kfree(fences); > - goto fallback; > - } > - fence = &array->base; > - } > - > cb = kmalloc(sizeof(*cb), GFP_KERNEL); > if (!cb) { > /* Last resort when we are OOM */ > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 16/24] drm/nouveau: support more than one write fence in nv50_wndw_prepare_fb
On Tue, Dec 07, 2021 at 01:34:03PM +0100, Christian König wrote: > Use dma_resv_get_singleton() here to eventually get more than one write > fence as a single fence. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/nouveau/dispnv50/wndw.c | 14 +- > 1 file changed, 5 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c > b/drivers/gpu/drm/nouveau/dispnv50/wndw.c > index 133c8736426a..b55a8a723581 100644 > --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c > +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c > @@ -536,8 +536,6 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct > drm_plane_state *state) > struct nouveau_bo *nvbo; > struct nv50_head_atom *asyh; > struct nv50_wndw_ctxdma *ctxdma; > - struct dma_resv_iter cursor; > - struct dma_fence *fence; > int ret; > > NV_ATOMIC(drm, "%s prepare: %p\n", plane->name, fb); > @@ -560,13 +558,11 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct > drm_plane_state *state) > asyw->image.handle[0] = ctxdma->object.handle; > } > > - dma_resv_iter_begin(&cursor, nvbo->bo.base.resv, false); > - dma_resv_for_each_fence_unlocked(&cursor, fence) { > - /* TODO: We only use the first writer here */ > - asyw->state.fence = dma_fence_get(fence); > - break; > - } > - dma_resv_iter_end(&cursor); > + ret = dma_resv_get_singleton(nvbo->bo.base.resv, false, > + &asyw->state.fence); Needs nouveau-ack, but otherwise lgtm. Reviewed-by: Daniel Vetter > + if (ret) > + return ret; > + > asyw->image.offset[0] = nvbo->offset; > > if (wndw->func->prepare) { > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 15/24] drm: support more than one write fence in drm_gem_plane_helper_prepare_fb
On Tue, Dec 07, 2021 at 01:34:02PM +0100, Christian König wrote: > Use dma_resv_get_singleton() here to eventually get more than one write > fence as a single fence. > > Signed-off-by: Christian König Patch title should be drm/atomic-helper: prefix, not just drm: With that nit: Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/drm_gem_atomic_helper.c | 18 +++--- > 1 file changed, 7 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c > b/drivers/gpu/drm/drm_gem_atomic_helper.c > index c3189afe10cb..9338ddb7edff 100644 > --- a/drivers/gpu/drm/drm_gem_atomic_helper.c > +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c > @@ -143,25 +143,21 @@ > */ > int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct > drm_plane_state *state) > { > - struct dma_resv_iter cursor; > struct drm_gem_object *obj; > struct dma_fence *fence; > + int ret; > > if (!state->fb) > return 0; > > obj = drm_gem_fb_get_obj(state->fb, 0); > - dma_resv_iter_begin(&cursor, obj->resv, false); > - dma_resv_for_each_fence_unlocked(&cursor, fence) { > - /* TODO: Currently there should be only one write fence, so this > - * here works fine. But drm_atomic_set_fence_for_plane() should > - * be changed to be able to handle more fences in general for > - * multiple BOs per fb anyway. */ > - dma_fence_get(fence); > - break; > - } > - dma_resv_iter_end(&cursor); > + ret = dma_resv_get_singleton(obj->resv, false, &fence); > + if (ret) > + return ret; > > + /* TODO: drm_atomic_set_fence_for_plane() should be changed to be able > + * to handle more fences in general for multiple BOs per fb. > + */ > drm_atomic_set_fence_for_plane(state, fence); > return 0; > } > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 14/24] dma-buf/drivers: make reserving a shared slot mandatory v2
On Tue, Dec 07, 2021 at 01:34:01PM +0100, Christian König wrote: > Audit all the users of dma_resv_add_excl_fence() and make sure they > reserve a shared slot also when only trying to add an exclusive fence. > > This is the next step towards handling the exclusive fence like a > shared one. > > v2: fix missed case in amdgpu > > Signed-off-by: Christian König Needs all the driver cc and also at least some acks/testing. > --- > drivers/dma-buf/st-dma-resv.c | 64 +-- > drivers/gpu/drm/amd/amdgpu/amdgpu_object.c| 8 +++ > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +-- > drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 3 +- > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 8 +-- > .../drm/i915/gem/selftests/i915_gem_migrate.c | 5 +- > drivers/gpu/drm/i915/i915_vma.c | 6 ++ > .../drm/i915/selftests/intel_memory_region.c | 7 ++ > drivers/gpu/drm/lima/lima_gem.c | 10 ++- > drivers/gpu/drm/msm/msm_gem_submit.c | 18 +++--- > drivers/gpu/drm/nouveau/nouveau_fence.c | 9 +-- > drivers/gpu/drm/panfrost/panfrost_job.c | 4 ++ > drivers/gpu/drm/ttm/ttm_bo_util.c | 12 +++- > drivers/gpu/drm/ttm/ttm_execbuf_util.c| 11 ++-- vc4 seems missing? Also I think I found one bug below in the conversions. -Daniel > drivers/gpu/drm/v3d/v3d_gem.c | 15 +++-- > drivers/gpu/drm/vgem/vgem_fence.c | 12 ++-- > drivers/gpu/drm/virtio/virtgpu_gem.c | 9 +++ > drivers/gpu/drm/vmwgfx/vmwgfx_bo.c| 16 +++-- > 18 files changed, 133 insertions(+), 92 deletions(-) > > diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c > index cbe999c6e7a6..f33bafc78693 100644 > --- a/drivers/dma-buf/st-dma-resv.c > +++ b/drivers/dma-buf/st-dma-resv.c > @@ -75,17 +75,16 @@ static int test_signaling(void *arg, bool shared) > goto err_free; > } > > - if (shared) { > - r = dma_resv_reserve_shared(&resv, 1); > - if (r) { > - pr_err("Resv shared slot allocation failed\n"); > - goto err_unlock; > - } > + r = dma_resv_reserve_shared(&resv, 1); > + if (r) { > + pr_err("Resv shared slot allocation failed\n"); > + goto err_unlock; > + } > > + if (shared) > dma_resv_add_shared_fence(&resv, f); > - } else { > + else > dma_resv_add_excl_fence(&resv, f); > - } > > if (dma_resv_test_signaled(&resv, shared)) { > pr_err("Resv unexpectedly signaled\n"); > @@ -134,17 +133,16 @@ static int test_for_each(void *arg, bool shared) > goto err_free; > } > > - if (shared) { > - r = dma_resv_reserve_shared(&resv, 1); > - if (r) { > - pr_err("Resv shared slot allocation failed\n"); > - goto err_unlock; > - } > + r = dma_resv_reserve_shared(&resv, 1); > + if (r) { > + pr_err("Resv shared slot allocation failed\n"); > + goto err_unlock; > + } > > + if (shared) > dma_resv_add_shared_fence(&resv, f); > - } else { > + else > dma_resv_add_excl_fence(&resv, f); > - } > > r = -ENOENT; > dma_resv_for_each_fence(&cursor, &resv, shared, fence) { > @@ -206,18 +204,17 @@ static int test_for_each_unlocked(void *arg, bool > shared) > goto err_free; > } > > - if (shared) { > - r = dma_resv_reserve_shared(&resv, 1); > - if (r) { > - pr_err("Resv shared slot allocation failed\n"); > - dma_resv_unlock(&resv); > - goto err_free; > - } > + r = dma_resv_reserve_shared(&resv, 1); > + if (r) { > + pr_err("Resv shared slot allocation failed\n"); > + dma_resv_unlock(&resv); > + goto err_free; > + } > > + if (shared) > dma_resv_add_shared_fence(&resv, f); > - } else { > + else > dma_resv_add_excl_fence(&resv, f); > - } > dma_resv_unlock(&resv); > > r = -ENOENT; > @@ -290,18 +287,17 @@ static int test_get_fences(void *arg, bool shared) > goto err_resv; > } > > - if (shared) { > - r = 
dma_resv_reserve_shared(&resv, 1); > - if (r) { > - pr_err("Resv shared slot allocation failed\n"); > - dma_resv_unlock(&resv); > - goto err_resv; > - } > + r = dma_resv_reserve_shared(&resv, 1); > + if (r) { > + pr_err("Resv shared slot allocation failed\n"); > + dma_resv_unlock(&resv); > + goto err_resv; > + } > > + if (shared) > dma_resv_add_shared_fence(&resv, f); > - } else { > + else > dma_
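The pattern the patch enforces, as a minimal sketch with generic names and error handling trimmed: reserve a shared slot up front, even when the fence will be added as the exclusive one.

int r;

dma_resv_lock(resv, NULL);
r = dma_resv_reserve_shared(resv, 1);
if (r == 0) {
        if (write)
                dma_resv_add_excl_fence(resv, fence);
        else
                dma_resv_add_shared_fence(resv, fence);
}
dma_resv_unlock(resv);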
Re: [PATCH 13/24] dma-buf: drop the DAG approach for the dma_resv object
On Tue, Dec 07, 2021 at 01:34:00PM +0100, Christian König wrote: > So far we had the approach of using a directed acyclic > graph with the dma_resv obj. > > This turned out to have many downsides, especially it means > that every single driver and user of this interface needs > to be aware of this restriction when adding fences. If the > rules for the DAG are not followed then we end up with > potential hard to debug memory corruption, information > leaks or even elephant big security holes because we allow > userspace to access freed up memory. > > Since we already took a step back from that by always > looking at all fences we now go a step further and stop > dropping the shared fences when a new exclusive one is > added. > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c | 13 - > 1 file changed, 13 deletions(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index 9acceabc9399..ecb2ff606bac 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c No doc update at all! I checked, we're not that shitty with docs. Minimally the DOC: section header and also the struct dma_resv kerneldoc need updating. Also there's maybe more references and stuff I've missed on a quick look, please check for them (e.g. dma_buf.resv kerneldoc is rather important to keep correct too). Code itself does what it says in the commit message, but we really need the most accurate docs we can get for this stuff, or the confusion will persist :-/ Cheers, Daniel > @@ -383,29 +383,16 @@ EXPORT_SYMBOL(dma_resv_replace_fences); > void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence) > { > struct dma_fence *old_fence = dma_resv_excl_fence(obj); > - struct dma_resv_list *old; > - u32 i = 0; > > dma_resv_assert_held(obj); > > - old = dma_resv_shared_list(obj); > - if (old) > - i = old->shared_count; > - > dma_fence_get(fence); > > write_seqcount_begin(&obj->seq); > /* write_seqcount_begin provides the necessary memory barrier */ > RCU_INIT_POINTER(obj->fence_excl, fence); > - if (old) > - old->shared_count = 0; > write_seqcount_end(&obj->seq); > > - /* inplace update, no shared fences */ > - while (i--) > - dma_fence_put(rcu_dereference_protected(old->shared[i], > - dma_resv_held(obj))); > - > dma_fence_put(old_fence); > } > EXPORT_SYMBOL(dma_resv_add_excl_fence); > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 12/24] dma-buf: finally make dma_resv_excl_fence private
On Tue, Dec 07, 2021 at 01:33:59PM +0100, Christian König wrote: > Drivers should never touch this directly. > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c | 17 + > include/linux/dma-resv.h | 17 - > 2 files changed, 17 insertions(+), 17 deletions(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index 694716a3d66d..9acceabc9399 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -147,6 +147,23 @@ void dma_resv_fini(struct dma_resv *obj) > } > EXPORT_SYMBOL(dma_resv_fini); > > +/** > + * dma_resv_excl_fence - return the object's exclusive fence > + * @obj: the reservation object > + * > + * Returns the exclusive fence (if any). Caller must either hold the objects > + * through dma_resv_lock() or the RCU read side lock through rcu_read_lock(), > + * or one of the variants of each > + * > + * RETURNS > + * The exclusive fence or NULL > + */ Same thing with us not documenting internals, pls drop the comment outright it doesn't really explain anything. With that: Reviewed-by: Daniel Vetter > +static inline struct dma_fence * > +dma_resv_excl_fence(struct dma_resv *obj) > +{ > + return rcu_dereference_check(obj->fence_excl, dma_resv_held(obj)); > +} > + > /** > * dma_resv_shared_list - get the reservation object's shared fence list > * @obj: the reservation object > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index cdfbbda6f600..40ac9d486f8f 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -412,23 +412,6 @@ static inline void dma_resv_unlock(struct dma_resv *obj) > ww_mutex_unlock(&obj->lock); > } > > -/** > - * dma_resv_excl_fence - return the object's exclusive fence > - * @obj: the reservation object > - * > - * Returns the exclusive fence (if any). Caller must either hold the objects > - * through dma_resv_lock() or the RCU read side lock through rcu_read_lock(), > - * or one of the variants of each > - * > - * RETURNS > - * The exclusive fence or NULL > - */ > -static inline struct dma_fence * > -dma_resv_excl_fence(struct dma_resv *obj) > -{ > - return rcu_dereference_check(obj->fence_excl, dma_resv_held(obj)); > -} > - > void dma_resv_init(struct dma_resv *obj); > void dma_resv_fini(struct dma_resv *obj); > int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences); > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 11/24] drm/amdgpu: use dma_resv_for_each_fence for CS workaround
On Tue, Dec 07, 2021 at 01:33:58PM +0100, Christian König wrote: > Get the write fence using dma_resv_for_each_fence instead of accessing > it manually. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 9 ++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 53e407ea4c89..7facd614e50a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -1268,6 +1268,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, > amdgpu_bo_list_for_each_entry(e, p->bo_list) { > struct dma_resv *resv = e->tv.bo->base.resv; > struct dma_fence_chain *chain = e->chain; > + struct dma_resv_iter cursor; > + struct dma_fence *fence; > > if (!chain) > continue; > @@ -1277,9 +1279,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, >* submission in a dma_fence_chain and add it as exclusive >* fence. >*/ > - dma_fence_chain_init(chain, dma_resv_excl_fence(resv), > - dma_fence_get(p->fence), 1); > - > + dma_resv_for_each_fence(&cursor, resv, false, fence) { > + break; > + } > + dma_fence_chain_init(chain, fence, dma_fence_get(p->fence), 1); Uh this needs a TODO. I'm assuming you'll fix this up later on when there's more than one write fence, but in case of bisect or whatever this is a bit too clever. Like you just replace one "dig around in dma-resv implementation details" with one that's not even a documented interface :-) With an adequately loud comment added interim: Reviewed-by: Daniel Vetter > rcu_assign_pointer(resv->fence_excl, &chain->base); > e->chain = NULL; > } > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 10/24] drm/amdgpu: remove excl as shared workarounds
On Tue, Dec 07, 2021 at 01:33:57PM +0100, Christian König wrote: > This was added because of the now dropped shared on excl dependency. > > Signed-off-by: Christian König I didn't do a full re-audit of whether you got them all, I think latest with the semantic change to allow more kinds of fence types with dma-resv we should catch them all. Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 5 + > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 -- > 2 files changed, 1 insertion(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index 0311d799a010..53e407ea4c89 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -1275,14 +1275,11 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser > *p, > /* >* Work around dma_resv shortcommings by wrapping up the >* submission in a dma_fence_chain and add it as exclusive > - * fence, but first add the submission as shared fence to make > - * sure that shared fences never signal before the exclusive > - * one. > + * fence. >*/ > dma_fence_chain_init(chain, dma_resv_excl_fence(resv), >dma_fence_get(p->fence), 1); > > - dma_resv_add_shared_fence(resv, p->fence); > rcu_assign_pointer(resv->fence_excl, &chain->base); > e->chain = NULL; > } > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index a1e63ba4c54a..85d31d85c384 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -226,12 +226,6 @@ static void amdgpu_gem_object_close(struct > drm_gem_object *obj, > if (!amdgpu_vm_ready(vm)) > goto out_unlock; > > - fence = dma_resv_excl_fence(bo->tbo.base.resv); > - if (fence) { > - amdgpu_bo_fence(bo, fence, true); > - fence = NULL; > - } > - > r = amdgpu_vm_clear_freed(adev, vm, &fence); > if (r || !fence) > goto out_unlock; > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 08/24] drm/vmwgfx: stop using dma_resv_excl_fence
On Tue, Dec 07, 2021 at 01:33:55PM +0100, Christian König wrote: > Instead use the new dma_resv_get_singleton function. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > index 8d1e869cc196..23c3fc2cbf10 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c > @@ -1168,8 +1168,10 @@ int vmw_resources_clean(struct vmw_buffer_object *vbo, > pgoff_t start, > vmw_bo_fence_single(bo, NULL); > if (bo->moving) > dma_fence_put(bo->moving); > - bo->moving = dma_fence_get > - (dma_resv_excl_fence(bo->base.resv)); > + > + /* TODO: This is actually a memory management dependency */ > + return dma_resv_get_singleton(bo->base.resv, false, > + &bo->moving); Reviewed-by: Daniel Vetter > } > > return 0; > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 09/24] drm/radeon: stop using dma_resv_excl_fence
On Tue, Dec 07, 2021 at 01:33:56PM +0100, Christian König wrote: > Instead use the new dma_resv_get_singleton function. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/radeon/radeon_display.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/radeon/radeon_display.c > b/drivers/gpu/drm/radeon/radeon_display.c > index 573154268d43..a6f875118f01 100644 > --- a/drivers/gpu/drm/radeon/radeon_display.c > +++ b/drivers/gpu/drm/radeon/radeon_display.c > @@ -533,7 +533,12 @@ static int radeon_crtc_page_flip_target(struct drm_crtc > *crtc, > DRM_ERROR("failed to pin new rbo buffer before flip\n"); > goto cleanup; > } > - work->fence = > dma_fence_get(dma_resv_excl_fence(new_rbo->tbo.base.resv)); > + r = dma_resv_get_singleton(new_rbo->tbo.base.resv, false, &work->fence); > + if (r) { > + radeon_bo_unreserve(new_rbo); > + DRM_ERROR("failed to get new rbo buffer fences\n"); > + goto cleanup; > + } Reviewed-by: Daniel Vetter > radeon_bo_get_tiling_flags(new_rbo, &tiling_flags, NULL); > radeon_bo_unreserve(new_rbo); > > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 07/24] drm/nouveau: stop using dma_resv_excl_fence
On Tue, Dec 07, 2021 at 01:33:54PM +0100, Christian König wrote: > Instead use the new dma_resv_get_singleton function. > > Signed-off-by: Christian König > --- > drivers/gpu/drm/nouveau/nouveau_bo.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > b/drivers/gpu/drm/nouveau/nouveau_bo.c > index fa73fe57f97b..74f8652d2bd3 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -959,7 +959,14 @@ nouveau_bo_vm_cleanup(struct ttm_buffer_object *bo, > { > struct nouveau_drm *drm = nouveau_bdev(bo->bdev); > struct drm_device *dev = drm->dev; > - struct dma_fence *fence = dma_resv_excl_fence(bo->base.resv); > + struct dma_fence *fence; > + int ret; > + > + /* TODO: This is actually a memory management dependency */ > + ret = dma_resv_get_singleton(bo->base.resv, false, &fence); > + if (ret) > + dma_resv_wait_timeout(bo->base.resv, false, false, > + MAX_SCHEDULE_TIMEOUT); Needs ack from nouveau folks. Reviewed-by: Daniel Vetter > > nv10_bo_put_tile_region(dev, *old_tile, fence); > *old_tile = new_tile; > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 06/24] drm/etnaviv: stop using dma_resv_excl_fence
On Tue, Dec 07, 2021 at 01:33:53PM +0100, Christian König wrote: > We can get the excl fence together with the shared ones as well. > > Signed-off-by: Christian König Pls cc driver maintainers. dim add-missing-cc is your friend if you're lazy; you can even combine that with git rebase -x. Same for all the other driver patches, some acks/testing would be good to avoid fallout (we had a bit much of that with all these I think). > --- > drivers/gpu/drm/etnaviv/etnaviv_gem.h| 1 - > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 14 +- > drivers/gpu/drm/etnaviv/etnaviv_sched.c | 10 -- > 3 files changed, 5 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h > b/drivers/gpu/drm/etnaviv/etnaviv_gem.h > index 98e60df882b6..f596d743baa3 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h > @@ -80,7 +80,6 @@ struct etnaviv_gem_submit_bo { > u64 va; > struct etnaviv_gem_object *obj; > struct etnaviv_vram_mapping *mapping; > - struct dma_fence *excl; > unsigned int nr_shared; > struct dma_fence **shared; > }; > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c > b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c > index 64c90ff348f2..4286dc93fdaa 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c > @@ -188,15 +188,11 @@ static int submit_fence_sync(struct etnaviv_gem_submit > *submit) > if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT) > continue; > > - if (bo->flags & ETNA_SUBMIT_BO_WRITE) { > - ret = dma_resv_get_fences(robj, true, &bo->nr_shared, > - &bo->shared); > - if (ret) > - return ret; > - } else { > - bo->excl = dma_fence_get(dma_resv_excl_fence(robj)); > - } > - > + ret = dma_resv_get_fences(robj, > + !!(bo->flags & ETNA_SUBMIT_BO_WRITE), Afaik the cast to bool !! here is overkill, the compiler will do that for you or something like that. With that dropped: Reviewed-by: Daniel Vetter > + &bo->nr_shared, &bo->shared); > + if (ret) > + return ret; > } > > return ret; > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c > b/drivers/gpu/drm/etnaviv/etnaviv_sched.c > index 180bb633d5c5..8c038a363d15 100644 > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c > @@ -39,16 +39,6 @@ etnaviv_sched_dependency(struct drm_sched_job *sched_job, > struct etnaviv_gem_submit_bo *bo = &submit->bos[i]; > int j; > > - if (bo->excl) { > - fence = bo->excl; > - bo->excl = NULL; > - > - if (!dma_fence_is_signaled(fence)) > - return fence; > - > - dma_fence_put(fence); > - } > - > for (j = 0; j < bo->nr_shared; j++) { > if (!bo->shared[j]) > continue; > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 05/24] RDMA: use dma_resv_wait() instead of extracting the fence
On Tue, Dec 07, 2021 at 01:33:52PM +0100, Christian König wrote: > Use dma_resv_wait() instead of extracting the exclusive fence and > waiting on it manually. > > Signed-off-by: Christian König No rdma lists nor maintainers on cc, so no chance to get the ack you need to merge this through drm-misc-next. > --- > drivers/infiniband/core/umem_dmabuf.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/drivers/infiniband/core/umem_dmabuf.c > b/drivers/infiniband/core/umem_dmabuf.c > index f0760741f281..d32cd7538835 100644 > --- a/drivers/infiniband/core/umem_dmabuf.c > +++ b/drivers/infiniband/core/umem_dmabuf.c > @@ -16,7 +16,6 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf > *umem_dmabuf) > { > struct sg_table *sgt; > struct scatterlist *sg; > - struct dma_fence *fence; > unsigned long start, end, cur = 0; > unsigned int nmap = 0; > int i; > @@ -68,11 +67,8 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf > *umem_dmabuf) >* may be not up-to-date. Wait for the exporter to finish >* the migration. >*/ > - fence = dma_resv_excl_fence(umem_dmabuf->attach->dmabuf->resv); > - if (fence) > - return dma_fence_wait(fence, false); > - > - return 0; > + return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, false, > + false, MAX_SCHEDULE_TIMEOUT); I think a wrapper for dma_resv_wait() without timeout would be neat, which we lack. Either way: Reviewed-by: Daniel Vetter > } > EXPORT_SYMBOL(ib_umem_dmabuf_map_pages); > > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
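A hedged sketch of the convenience wrapper suggested above; no such helper exists at this point in the series, so the name and placement are assumptions:

/* Wait on a dma_resv object's fences with no timeout; a thin wrapper
 * around dma_resv_wait_timeout(). Returns a negative error (e.g. when
 * interrupted), otherwise success. */
static inline long dma_resv_wait(struct dma_resv *obj, bool wait_all,
                                 bool intr)
{
        return dma_resv_wait_timeout(obj, wait_all, intr,
                                     MAX_SCHEDULE_TIMEOUT);
}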
Re: [PATCH 04/24] dma-buf: add dma_resv_get_singleton v2
On Tue, Dec 07, 2021 at 01:33:51PM +0100, Christian König wrote: > Add a function to simplify getting a single fence for all the fences in > the dma_resv object. > > v2: fix ref leak in error handling > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c | 52 ++ > include/linux/dma-resv.h | 2 ++ > 2 files changed, 54 insertions(+) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index 480c305554a1..694716a3d66d 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -34,6 +34,7 @@ > */ > > #include > +#include > #include > #include > #include > @@ -657,6 +658,57 @@ int dma_resv_get_fences(struct dma_resv *obj, bool write, > } > EXPORT_SYMBOL_GPL(dma_resv_get_fences); > > +/** > + * dma_resv_get_singleton - Get a single fence for all the fences > + * @obj: the reservation object > + * @write: true if we should return all fences > + * @fence: the resulting fence > + * > + * Get a single fence representing all the fences inside the resv object. > + * Returns either 0 for success or -ENOMEM. > + * > + * Warning: This can't be used like this when adding the fence back to the > resv > + * object since that can lead to stack corruption when finalizing the > + * dma_fence_array. Uh I don't get this one? I thought the only problem with nested fences is the signalling recursion, which we work around with the irq_work? Also if there's really an issue with dma_fence_array fences, then that warning should be on the dma_resv kerneldoc, not somewhere hidden like this. And finally I really don't see what can go wrong, sure we'll end up with the same fence once in the dma_resv_list and then once more in the fence array. But they're all refcounted, so really shouldn't matter. The code itself looks correct, but me not understanding what even goes wrong here freaks me out a bit. I guess something to figure out next year, I kinda hoped I could squeeze a review in before I disappear :-/ -Daniel > + */ > +int dma_resv_get_singleton(struct dma_resv *obj, bool write, > +struct dma_fence **fence) > +{ > + struct dma_fence_array *array; > + struct dma_fence **fences; > + unsigned count; > + int r; > + > + r = dma_resv_get_fences(obj, write, &count, &fences); > +if (r) > + return r; > + > + if (count == 0) { > + *fence = NULL; > + return 0; > + } > + > + if (count == 1) { > + *fence = fences[0]; > + kfree(fences); > + return 0; > + } > + > + array = dma_fence_array_create(count, fences, > +dma_fence_context_alloc(1), > +1, false); > + if (!array) { > + while (count--) > + dma_fence_put(fences[count]); > + kfree(fences); > + return -ENOMEM; > + } > + > + *fence = &array->base; > + return 0; > +} > +EXPORT_SYMBOL_GPL(dma_resv_get_singleton); > + > /** > * dma_resv_wait_timeout - Wait on reservation's objects > * shared and/or exclusive fences. 
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index fa2002939b19..cdfbbda6f600 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -438,6 +438,8 @@ void dma_resv_replace_fences(struct dma_resv *obj, > uint64_t context, > void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence); > int dma_resv_get_fences(struct dma_resv *obj, bool write, > unsigned int *num_fences, struct dma_fence ***fences); > +int dma_resv_get_singleton(struct dma_resv *obj, bool write, > +struct dma_fence **fence); > int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src); > long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr, > unsigned long timeout); > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
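A hedged usage sketch of the new helper, with bo standing in for a TTM-style buffer object: collapse everything on the reservation object into one fence and wait on it.

struct dma_fence *fence;
int r;

r = dma_resv_get_singleton(bo->base.resv, true /* write */, &fence);
if (r)
        return r;

if (fence) {
        /* the single fence may internally be a dma_fence_array */
        dma_fence_wait(fence, false);
        dma_fence_put(fence);
}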
Re: [PATCH 03/24] dma-buf: drop excl_fence parameter from dma_resv_get_fences
On Tue, Dec 07, 2021 at 01:33:50PM +0100, Christian König wrote: > Returning the exclusive fence separately is no longer used. > > Instead add a write parameter to indicate the use case. > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c | 48 > drivers/dma-buf/st-dma-resv.c| 26 ++- > drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 6 ++- > drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c | 2 +- > drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 3 +- > include/linux/dma-resv.h | 4 +- > 6 files changed, 31 insertions(+), 58 deletions(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index a12a3a39f280..480c305554a1 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -611,57 +611,45 @@ EXPORT_SYMBOL(dma_resv_copy_fences); > * dma_resv_get_fences - Get an object's shared and exclusive > * fences without update side lock held > * @obj: the reservation object > - * @fence_excl: the returned exclusive fence (or NULL) > - * @shared_count: the number of shared fences returned > - * @shared: the array of shared fence ptrs returned (array is krealloc'd to > - * the required size, and must be freed by caller) > - * > - * Retrieve all fences from the reservation object. If the pointer for the > - * exclusive fence is not specified the fence is put into the array of the > - * shared fences as well. Returns either zero or -ENOMEM. > + * @write: true if we should return all fences I'm assuming that this will be properly documented later on in the series ... > + * @num_fences: the number of fences returned > + * @fences: the array of fence ptrs returned (array is krealloc'd to the > + * required size, and must be freed by caller) > + * > + * Retrieve all fences from the reservation object. > + * Returns either zero or -ENOMEM. > */ > -int dma_resv_get_fences(struct dma_resv *obj, struct dma_fence **fence_excl, > - unsigned int *shared_count, struct dma_fence ***shared) > +int dma_resv_get_fences(struct dma_resv *obj, bool write, > + unsigned int *num_fences, struct dma_fence ***fences) > { > struct dma_resv_iter cursor; > struct dma_fence *fence; > > - *shared_count = 0; > - *shared = NULL; > - > - if (fence_excl) > - *fence_excl = NULL; > + *num_fences = 0; > + *fences = NULL; > > - dma_resv_iter_begin(&cursor, obj, true); > + dma_resv_iter_begin(&cursor, obj, write); > dma_resv_for_each_fence_unlocked(&cursor, fence) { > > if (dma_resv_iter_is_restarted(&cursor)) { > unsigned int count; > > - while (*shared_count) > - dma_fence_put((*shared)[--(*shared_count)]); > + while (*num_fences) > + dma_fence_put((*fences)[--(*num_fences)]); > > - if (fence_excl) > - dma_fence_put(*fence_excl); > - > - count = cursor.shared_count; > - count += fence_excl ? 
0 : 1; > + count = cursor.shared_count + 1; > > /* Eventually re-allocate the array */ > - *shared = krealloc_array(*shared, count, > + *fences = krealloc_array(*fences, count, >sizeof(void *), >GFP_KERNEL); > - if (count && !*shared) { > + if (count && !*fences) { > dma_resv_iter_end(&cursor); > return -ENOMEM; > } > } > > - dma_fence_get(fence); > - if (dma_resv_iter_is_exclusive(&cursor) && fence_excl) > - *fence_excl = fence; > - else > - (*shared)[(*shared_count)++] = fence; > + (*fences)[(*num_fences)++] = dma_fence_get(fence); > } > dma_resv_iter_end(&cursor); > > diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c > index bc32b3eedcb6..cbe999c6e7a6 100644 > --- a/drivers/dma-buf/st-dma-resv.c > +++ b/drivers/dma-buf/st-dma-resv.c > @@ -275,7 +275,7 @@ static int test_shared_for_each_unlocked(void *arg) > > static int test_get_fences(void *arg, bool shared) > { > - struct dma_fence *f, *excl = NULL, **fences = NULL; > + struct dma_fence *f, **fences = NULL; > struct dma_resv resv; > int r, i; > > @@ -304,35 +304,19 @@ static int test_get_fences(void *arg, bool shared) > } > dma_resv_unlock(&resv); > > - r = dma_resv_get_fences(&resv, &excl, &i, &fences); > + r = dma_resv_get_fences(&resv, shared, &i, &fences); > if (r) { > pr_err("get_fences failed\n"); > goto
Re: [PATCH 02/24] dma-buf: finally make the dma_resv_list private
On Tue, Dec 07, 2021 at 01:33:49PM +0100, Christian König wrote: > Drivers should never touch this directly. > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c | 26 ++ > include/linux/dma-resv.h | 26 +- > 2 files changed, 27 insertions(+), 25 deletions(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index a688dbded3d3..a12a3a39f280 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -56,6 +56,19 @@ > DEFINE_WD_CLASS(reservation_ww_class); > EXPORT_SYMBOL(reservation_ww_class); > > +/** > + * struct dma_resv_list - a list of shared fences > + * @rcu: for internal use > + * @shared_count: table of shared fences > + * @shared_max: for growing shared fence table > + * @shared: shared fence table > + */ Imo drop the kerneldoc here and just make these comments before the right member if you feel like keeping them. Imo it's obvious enough what's going on that the comments aren't necessary, and we don't kerneldoc document internals generally at all - only interfaces relevant by drivers and things outside of a subsystem. > +struct dma_resv_list { > + struct rcu_head rcu; > + u32 shared_count, shared_max; > + struct dma_fence __rcu *shared[]; > +}; > + > /** > * dma_resv_list_alloc - allocate fence list > * @shared_max: number of fences we need space for > @@ -133,6 +146,19 @@ void dma_resv_fini(struct dma_resv *obj) > } > EXPORT_SYMBOL(dma_resv_fini); > > +/** > + * dma_resv_shared_list - get the reservation object's shared fence list > + * @obj: the reservation object > + * > + * Returns the shared fence list. Caller must either hold the objects > + * through dma_resv_lock() or the RCU read side lock through rcu_read_lock(), > + * or one of the variants of each > + */ Same here. With that: Reviewed-by: Daniel Vetter > +static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv > *obj) > +{ > + return rcu_dereference_check(obj->fence, dma_resv_held(obj)); > +} > + > /** > * dma_resv_reserve_shared - Reserve space to add shared fences to > * a dma_resv. > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index e0be34265eae..3baf2a4a9a0d 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -47,18 +47,7 @@ > > extern struct ww_class reservation_ww_class; > > -/** > - * struct dma_resv_list - a list of shared fences > - * @rcu: for internal use > - * @shared_count: table of shared fences > - * @shared_max: for growing shared fence table > - * @shared: shared fence table > - */ > -struct dma_resv_list { > - struct rcu_head rcu; > - u32 shared_count, shared_max; > - struct dma_fence __rcu *shared[]; > -}; > +struct dma_resv_list; > > /** > * struct dma_resv - a reservation object manages fences for a buffer > @@ -440,19 +429,6 @@ dma_resv_excl_fence(struct dma_resv *obj) > return rcu_dereference_check(obj->fence_excl, dma_resv_held(obj)); > } > > -/** > - * dma_resv_shared_list - get the reservation object's shared fence list > - * @obj: the reservation object > - * > - * Returns the shared fence list. 
Caller must either hold the objects > - * through dma_resv_lock() or the RCU read side lock through rcu_read_lock(), > - * or one of the variants of each > - */ > -static inline struct dma_resv_list *dma_resv_shared_list(struct dma_resv > *obj) > -{ > - return rcu_dereference_check(obj->fence, dma_resv_held(obj)); > -} > - > void dma_resv_init(struct dma_resv *obj); > void dma_resv_fini(struct dma_resv *obj); > int dma_resv_reserve_shared(struct dma_resv *obj, unsigned int num_fences); > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 01/24] dma-buf: add dma_resv_replace_fences
On Tue, Dec 07, 2021 at 01:33:48PM +0100, Christian König wrote: > This function allows replacing fences in the shared fence list when > we can guarantee that the operation represented by the original fence has > finished, or that there are no more accesses to the resources protected by > the dma_resv object once the new fence finishes. > > Then use this function in the amdkfd code when BOs are unmapped from the > process. > > Signed-off-by: Christian König > --- > drivers/dma-buf/dma-resv.c| 43 > .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 49 +++ > include/linux/dma-resv.h | 2 + > 3 files changed, 52 insertions(+), 42 deletions(-) > > diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c > index 4deea75c0b9c..a688dbded3d3 100644 > --- a/drivers/dma-buf/dma-resv.c > +++ b/drivers/dma-buf/dma-resv.c > @@ -284,6 +284,49 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, > struct dma_fence *fence) > } > EXPORT_SYMBOL(dma_resv_add_shared_fence); > > +/** > + * dma_resv_replace_fences - replace fences in the dma_resv obj > + * @obj: the reservation object > + * @context: the context of the fences to replace > + * @replacement: the new fence to use instead > + * > + * Replace fences with a specified context with a new fence. Only valid if > the > + * operation represented by the original fences is completed or has no longer > + * access to the resources protected by the dma_resv object when the new > fence > + * completes. > + */ > +void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context, > + struct dma_fence *replacement) > +{ > + struct dma_resv_list *list; > + struct dma_fence *old; > + unsigned int i; > + > + dma_resv_assert_held(obj); > + > + write_seqcount_begin(&obj->seq); > + > + old = dma_resv_excl_fence(obj); > + if (old->context == context) { > + RCU_INIT_POINTER(obj->fence_excl, dma_fence_get(replacement)); > + dma_fence_put(old); > + } > + > + list = dma_resv_shared_list(obj); > + for (i = 0; list && i < list->shared_count; ++i) { > + old = rcu_dereference_protected(list->shared[i], > + dma_resv_held(obj)); > + if (old->context != context) > + continue; > + > + rcu_assign_pointer(list->shared[i], dma_fence_get(replacement)); > + dma_fence_put(old); Since the fences are all guaranteed to be from the same context, maybe we should have a WARN_ON(__dma_fence_is_later()); here just to be safe? With that added: Reviewed-by: Daniel Vetter > + } > + > + write_seqcount_end(&obj->seq); > +} > +EXPORT_SYMBOL(dma_resv_replace_fences); > + > /** > * dma_resv_add_excl_fence - Add an exclusive fence. > * @obj: the reservation object > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > index 71acd577803e..b558ef0f8c4a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c > @@ -236,53 +236,18 @@ void amdgpu_amdkfd_release_notify(struct amdgpu_bo *bo) > static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo, > struct amdgpu_amdkfd_fence *ef) > { > - struct dma_resv *resv = bo->tbo.base.resv; > - struct dma_resv_list *old, *new; > - unsigned int i, j, k; > + struct dma_fence *replacement; > > if (!ef) > return -EINVAL; > > - old = dma_resv_shared_list(resv); > - if (!old) > - return 0; > - > - new = kmalloc(struct_size(new, shared, old->shared_max), GFP_KERNEL); > - if (!new) > - return -ENOMEM; > - > - /* Go through all the shared fences in the resevation object and sort > - * the interesting ones to the end of the list. 
> + /* TODO: Instead of block before we should use the fence of the page > + * table update and TLB flush here directly. >*/ > - for (i = 0, j = old->shared_count, k = 0; i < old->shared_count; ++i) { > - struct dma_fence *f; > - > - f = rcu_dereference_protected(old->shared[i], > - dma_resv_held(resv)); > - > - if (f->context == ef->base.context) > - RCU_INIT_POINTER(new->shared[--j], f); > - else > - RCU_INIT_POINTER(new->shared[k++], f); > - } > - new->shared_max = old->shared_max; > - new->shared_count = k; > - > - /* Install the new fence list, seqcount provides the barriers */ > - write_seqcount_begin(&resv->seq); > - RCU_INIT_POINTER(resv->fence, new); > - write_seqcount_end(&resv->seq); > - > - /* Drop the references to the removed fences or move them to ef_list */ > - for (i = j; i < old->shared_count; ++i) {
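One possible reading of the WARN_ON suggested above, as a hedged sketch inside the replacement loop of dma_resv_replace_fences(); whether the replacement ever shares the context is left open in the thread:

if (old->context == replacement->context)
        /* a fence being replaced should never be newer than its
         * replacement within the same context */
        WARN_ON(__dma_fence_is_later(old->seqno, replacement->seqno,
                                     old->ops));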
Re: [PATCH] drm/ttm: fix compilation on ARCH=um
On Mon, Dec 20, 2021 at 11:15:22AM +0100, Johannes Berg wrote: > From: Johannes Berg > > Even if it's probably not really useful, it can get selected > by e.g. randconfig builds, and then failing to compile is an > annoyance. Unfortunately, it's hard to fix in Kconfig, since > DRM_TTM is selected by many things that don't really depend > on any specific architecture, and just depend on PCI (which > is indeed now available in ARCH=um via simulation/emulation). > > Fix this in the code instead by just ifdef'ing the relevant > two lines that depend on "real X86". > > Reported-by: Geert Uytterhoeven > Signed-off-by: Johannes Berg Probably the last thing before I disappear until 2022 :-) Merged into drm-misc-fixes, thanks for your patch. -Daniel > --- > drivers/gpu/drm/ttm/ttm_module.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/ttm/ttm_module.c > b/drivers/gpu/drm/ttm/ttm_module.c > index 0037eefe3239..a3ad7c9736ec 100644 > --- a/drivers/gpu/drm/ttm/ttm_module.c > +++ b/drivers/gpu/drm/ttm/ttm_module.c > @@ -68,9 +68,11 @@ pgprot_t ttm_prot_from_caching(enum ttm_caching caching, > pgprot_t tmp) > #if defined(__i386__) || defined(__x86_64__) > if (caching == ttm_write_combined) > tmp = pgprot_writecombine(tmp); > +#ifndef CONFIG_UML > else if (boot_cpu_data.x86 > 3) > tmp = pgprot_noncached(tmp); > -#endif > +#endif /* CONFIG_UML */ > +#endif /* __i386__ || __x86_64__ */ > #if defined(__ia64__) || defined(__arm__) || defined(__aarch64__) || \ > defined(__powerpc__) || defined(__mips__) > if (caching == ttm_write_combined) > -- > 2.33.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process
On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote: > > On 12/20/2021 4:29 AM, Daniel Vetter wrote: > > On Fri, Dec 10, 2021 at 07:58:50AM +0100, Christian König wrote: > > > Am 09.12.21 um 19:28 schrieb Felix Kuehling: > > > > Am 2021-12-09 um 10:30 a.m. schrieb Christian König: > > > > > That still won't work. > > > > > > > > > > But I think we could do this change for the amdgpu mmap callback only. > > > > If graphics user mode has problems with it, we could even make this > > > > specific to KFD BOs in the amdgpu_gem_object_mmap callback. > > > I think it's fine for the whole amdgpu stack, my concern is more about > > > radeon, nouveau and the ARM stacks which are using this as well. > > > > > > That blew up so nicely the last time we tried to change it and I know of > > > at > > > least one case where radeon was/is used with BOs in a child process. > > I'm way late and buried again, but I think it'd be good to be consistent > > here across drivers. Or at least across drm drivers. And we've had the vma > > open/close refcounting to make fork work since forever. > > > > I think if we do this we should really only do this for mmap() where this > > applies, but reading through the thread here I'm honestly confused why > > this is a problem. If CRIU can't handle forked mmaps it needs to be > > taught that, not hacked around. Or at least I'm not understanding why > > this shouldn't work ... > > -Daniel > > > > Hi Daniel > > In the v2 > https://lore.kernel.org/all/a1a865f5-ad2c-29c8-cbe4-2635d53ec...@amd.com/T/ > I pretty much limited the scope of the change to KFD BOs on mmap. Regarding > CRIU, I think it's not a CRIU problem as CRIU, on restore, only tries to > recreate all the child processes and then mmaps all the VMAs it sees (as per > checkpoint snapshot) in the new process address space after the VMA > placements are finalized in the position independent code phase. Since the > inherited VMAs don't have access rights the criu mmap fails. Still sounds funky. I think minimally we should have an ack from CRIU developers that this is officially the right way to solve this problem. I really don't want to have random one-off hacks that don't work across the board, for a problem where we (drm subsystem) really shouldn't be the only one with this problem. Where "this problem" means that the mmap space is per file description, and not per underlying inode or real device or whatever. That part sounds like a CRIU problem, and I expect CRIU folks want a consistent solution across the board for this. Hence please grab an ack from them. Cheers, Daniel > > Regards, > > Rajneesh > > > > Regards, > > > Christian. > > > > > > > Regards, > > > > Felix > > > > > > > > > > > > > Regards, > > > > > Christian. > > > > > > > > > > Am 09.12.21 um 16:29 schrieb Bhardwaj, Rajneesh: > > > > > > Sounds good. I will send a v2 with only ttm_bo_mmap_obj change. > > > > > > Thank > > > > > > you! > > > > > > > > > > > > On 12/9/2021 10:27 AM, Christian König wrote: > > > > > > > Hi Rajneesh, > > > > > > > > > > > > > > yes, separating this from the drm_gem_mmap_obj() change is > > > > > > > certainly > > > > > > > a good idea. > > > > > > > > > > > > > > > The child cannot access the BOs mapped by the parent anyway with > > > > > > > > access restrictions applied > > > > > > > exactly that is not correct. That behavior is actively used by > > > > > > > some > > > > > > > userspace stacks as far as I know. > > > > > > > > > > > > > > Regards, > > > > > > > Christian.
> > > > > > > > > > > > > > Am 09.12.21 um 16:23 schrieb Bhardwaj, Rajneesh: > > > > > > > > Thanks Christian. Would it make it less intrusive if I just use > > > > > > > > the > > > > > > > > flag for ttm bo mmap and remove the drm_gem_mmap_obj change from > > > > > > > > this patch? For our use case, just the ttm_bo_mmap_obj change > > > > > > > > should suffice and we don't want to put any more work arounds in > > > > > > > > the user space (thunk, in our case). > > > > > > > > > > > > > > > > The child cannot access the BOs mapped by the parent anyway with > > > > > > > > access restrictions applied so I wonder why even inherit the > > > > > > > > vma? > > > > > > > > > > > > > > > > On 12/9/2021 2:54 AM, Christian König wrote: > > > > > > > > > Am 08.12.21 um 21:53 schrieb Rajneesh Bhardwaj: > > > > > > > > > > When an application having open file access to a node > > > > > > > > > > forks, its > > > > > > > > > > shared > > > > > > > > > > mappings also get reflected in the address space of child > > > > > > > > > > process > > > > > > > > > > even > > > > > > > > > > though it cannot access them with the object permissions > > > > > > > > > > applied. > > > > > > > > > > With the > > > > > > > > > > existing permission checks on the gem objects, it might be > > > > > > > > > > reasonable to > > > > > > > > > > also create the VMAs with VM_DONTCOPY flag so a user space > > > > > > > > > > application > > > > > > >
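The idea under discussion is small enough to sketch: mark the mapping VM_DONTCOPY at mmap time so fork() skips the VMA entirely in the child. A hypothetical helper, just to illustrate (this is not the actual v2 patch):

static int kfd_bo_mmap_sketch(struct ttm_buffer_object *bo,
			      struct vm_area_struct *vma)
{
	int ret;

	ret = ttm_bo_mmap_obj(vma, bo);
	if (ret)
		return ret;

	/* fork() will not duplicate this VMA into the child process. */
	vma->vm_flags |= VM_DONTCOPY;
	return 0;
}

This is also why the scope matters so much in this thread: for KFD BOs the child cannot use the mapping anyway, but as Christian points out, other stacks actively rely on inherited BO mappings.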
Re: [Intel-gfx] [PATCH 4/7] drm/i915/guc: Don't hog IRQs when destroying contexts
On Wed, Dec 22, 2021 at 04:25:13PM +, Tvrtko Ursulin wrote: > > Ping? > Missed this. This was merged before your comments landed on the list. > Main two points being: > > 1) Commit message seems in contradiction with the change in > guc_flush_destroyed_contexts. And the lock drop to immediately re-acquire it > looks questionable to start with. > > 2) And in deregister_destroyed_contexts and in 1) I was therefore asking if > you can unlink all at once and process with reduced hammering on the lock. > Probably can address both concerns by using a llist, right? Be on the look out for this rework patch over the next week or so. Matt > Regards, > > Tvrtko > > On 17/12/2021 11:14, Tvrtko Ursulin wrote: > > > > On 17/12/2021 11:06, Tvrtko Ursulin wrote: > > > On 14/12/2021 17:04, Matthew Brost wrote: > > > > From: John Harrison > > > > > > > > While attempting to debug a CT deadlock issue in various CI failures > > > > (most easily reproduced with gem_ctx_create/basic-files), I was seeing > > > > CPU deadlock errors being reported. This were because the context > > > > destroy loop was blocking waiting on H2G space from inside an IRQ > > > > spinlock. There no was deadlock as such, it's just that the H2G queue > > > > was full of context destroy commands and GuC was taking a long time to > > > > process them. However, the kernel was seeing the large amount of time > > > > spent inside the IRQ lock as a dead CPU. Various Bad Things(tm) would > > > > then happen (heartbeat failures, CT deadlock errors, outstanding H2G > > > > WARNs, etc.). > > > > > > > > Re-working the loop to only acquire the spinlock around the list > > > > management (which is all it is meant to protect) rather than the > > > > entire destroy operation seems to fix all the above issues. 
> > > > > > > > v2: > > > > (John Harrison) > > > > - Fix typo in comment message > > > > > > > > Signed-off-by: John Harrison > > > > Signed-off-by: Matthew Brost > > > > Reviewed-by: Matthew Brost > > > > --- > > > > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 45 --- > > > > 1 file changed, 28 insertions(+), 17 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > index 36c2965db49b..96fcf869e3ff 100644 > > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > @@ -2644,7 +2644,6 @@ static inline void > > > > guc_lrc_desc_unpin(struct intel_context *ce) > > > > unsigned long flags; > > > > bool disabled; > > > > - lockdep_assert_held(&guc->submission_state.lock); > > > > GEM_BUG_ON(!intel_gt_pm_is_awake(gt)); > > > > GEM_BUG_ON(!lrc_desc_registered(guc, ce->guc_id.id)); > > > > GEM_BUG_ON(ce != __get_context(guc, ce->guc_id.id)); > > > > @@ -2660,7 +2659,7 @@ static inline void > > > > guc_lrc_desc_unpin(struct intel_context *ce) > > > > } > > > > spin_unlock_irqrestore(&ce->guc_state.lock, flags); > > > > if (unlikely(disabled)) { > > > > - __release_guc_id(guc, ce); > > > > + release_guc_id(guc, ce); > > > > __guc_context_destroy(ce); > > > > return; > > > > } > > > > @@ -2694,36 +2693,48 @@ static void __guc_context_destroy(struct > > > > intel_context *ce) > > > > static void guc_flush_destroyed_contexts(struct intel_guc *guc) > > > > { > > > > - struct intel_context *ce, *cn; > > > > + struct intel_context *ce; > > > > unsigned long flags; > > > > GEM_BUG_ON(!submission_disabled(guc) && > > > > guc_submission_initialized(guc)); > > > > - spin_lock_irqsave(&guc->submission_state.lock, flags); > > > > - list_for_each_entry_safe(ce, cn, > > > > - &guc->submission_state.destroyed_contexts, > > > > - destroyed_link) { > > > > - list_del_init(&ce->destroyed_link); > > > > - __release_guc_id(guc, ce); > > > > + while (!list_empty(&guc->submission_state.destroyed_contexts)) { > > > > > > Are lockless false negatives a concern here - I mean this thread not > > > seeing something just got added to the list? > > > > > > > + spin_lock_irqsave(&guc->submission_state.lock, flags); > > > > + ce = > > > > list_first_entry_or_null(&guc->submission_state.destroyed_contexts, > > > > + struct intel_context, > > > > + destroyed_link); > > > > + if (ce) > > > > + list_del_init(&ce->destroyed_link); > > > > + spin_unlock_irqrestore(&guc->submission_state.lock, flags); > > > > + > > > > + if (!ce) > > > > + break; > > > > + > > > > + release_guc_id(guc, ce); > > > > > > This looks suboptimal and in conflict with this part of the commit > > > message: > > > > > > """ > > > Re-working the loop to only acquire the spinlock around the list > > > mana
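The llist rework Matt mentions would presumably detach the whole list in one atomic operation and then process it without holding any lock. A rough sketch, assuming destroyed_contexts is converted to a struct llist_head and the contexts gain an llist_node member named destroyed_llink (both assumptions, not current code):

static void guc_flush_destroyed_contexts_sketch(struct intel_guc *guc)
{
	struct intel_context *ce, *cn;
	struct llist_node *freed;

	/* Single atomic unlink; no spinlock held, no per-entry reacquire. */
	freed = llist_del_all(&guc->submission_state.destroyed_contexts);

	llist_for_each_entry_safe(ce, cn, freed, destroyed_llink) {
		release_guc_id(guc, ce);
		__guc_context_destroy(ce);
	}
}

That shape would address both of Tvrtko's points at once: nothing is held across the destroy operation, and the lock is not hammered once per entry either.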
[Bug 201957] amdgpu: ring gfx timeout
https://bugzilla.kernel.org/show_bug.cgi?id=201957 roman (cool...@gmx.at) changed: What|Removed |Added CC||cool...@gmx.at --- Comment #52 from roman (cool...@gmx.at) --- I can confirm that amdgpu.dpm=0 removes the issue on an AMD Radeon PRO FIJI (Dual Fury) kernel: 5.15.10|FW: 20211027.1d00989-1|mesa: 21.3.2-1 Works perfectly fine in Gnome as long as there is no application accessing the 2nd GPU. When opening Radeon-profile as long as card0 is selected, there is no issue but as soon as I select card1 I get instantly Dec 22 21:15:46 Workstation kernel: amdgpu: failed to send message 171 ret is 0 Dec 22 21:15:49 Workstation kernel: amdgpu: last message was failed ret is 0 The application Radeon-profile freezes but desktop is still responsive. When opening CS:GO with mangohud and configuring either pci_dev = :3d:00.0 # primary card works fine or pci_dev = :3e:00.0 # secondary card, errors from above occur and CS:GO loads super slow and after menu is visible it is stuck When CSM is disabled in BIOS I have 2 GPUs Dec 22 20:45:50 Workstation kernel: [drm] amdgpu kernel modesetting enabled. Dec 22 20:45:50 Workstation kernel: amdgpu: CRAT table not found Dec 22 20:45:50 Workstation kernel: amdgpu: Virtual CRAT table created for CPU Dec 22 20:45:50 Workstation kernel: amdgpu: Topology: Add CPU node Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: vgaarb: deactivate vga console Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: enabling device (0106 -> 0107) Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: amdgpu: Fetched VBIOS from ROM BAR Dec 22 20:45:50 Workstation kernel: amdgpu: ATOM BIOS: 113-C88801MS-102 Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: amdgpu: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used) Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: amdgpu: GART: 1024M 0x00FF - 0x00FF3FFF Dec 22 20:45:50 Workstation kernel: [drm] amdgpu: 4096M of VRAM memory ready Dec 22 20:45:50 Workstation kernel: [drm] amdgpu: 4096M of GTT memory ready. Dec 22 20:45:50 Workstation kernel: amdgpu: hwmgr_sw_init smu backed is fiji_smu Dec 22 20:45:50 Workstation kernel: snd_hda_intel :3d:00.1: bound :3d:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu]) Dec 22 20:45:50 Workstation kernel: [drm:retrieve_link_cap [amdgpu]] *ERROR* retrieve_link_cap: Read receiver caps dpcd data failed. 
Dec 22 20:45:50 Workstation kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart Dec 22 20:45:50 Workstation kernel: amdgpu: Virtual CRAT table created for GPU Dec 22 20:45:50 Workstation kernel: amdgpu: Topology: Add dGPU node [0x7300:0x1002] Dec 22 20:45:50 Workstation kernel: kfd kfd: amdgpu: added device 1002:7300 Dec 22 20:45:50 Workstation kernel: amdgpu :3d:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 16, active_cu_number 64 Dec 22 20:45:50 Workstation kernel: fbcon: amdgpu (fb0) is primary device Dec 22 20:45:51 Workstation kernel: amdgpu :3d:00.0: [drm] fb0: amdgpu frame buffer device Dec 22 20:45:51 Workstation kernel: amdgpu :3d:00.0: amdgpu: Using BACO for runtime pm Dec 22 20:45:51 Workstation kernel: [drm] Initialized amdgpu 3.42.0 20150101 for :3d:00.0 on minor 0 Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: enabling device (0106 -> 0107) Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: amdgpu: Fetched VBIOS from ROM BAR Dec 22 20:45:51 Workstation kernel: amdgpu: ATOM BIOS: 113-C88801SL-102 Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: amdgpu: VRAM: 4096M 0x00F4 - 0x00F4 (4096M used) Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: amdgpu: GART: 1024M 0x00FF - 0x00FF3FFF Dec 22 20:45:51 Workstation kernel: [drm] amdgpu: 4096M of VRAM memory ready Dec 22 20:45:51 Workstation kernel: [drm] amdgpu: 4096M of GTT memory ready. Dec 22 20:45:51 Workstation kernel: amdgpu: hwmgr_sw_init smu backed is fiji_smu Dec 22 20:45:51 Workstation kernel: kfd kfd: amdgpu: Allocated 3969056 bytes on gart Dec 22 20:45:51 Workstation kernel: amdgpu: Virtual CRAT table created for GPU Dec 22 20:45:51 Workstation kernel: amdgpu: Topology: Add dGPU node [0x7300:0x1002] Dec 22 20:45:51 Workstation kernel: kfd kfd: amdgpu: added device 1002:7300 Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: amdgpu: SE 4, SH per SE 1, CU per SH 16, active_cu_number 64 Dec 22 20:45:51 Workstation kernel: amdgpu :3e:00.0: amdgpu: Using BACO for runtime pm Dec 22 20:45:51 Workstation kernel:
Re: [PATCH 08/22] dt-bindings: display: rockchip: dw-hdmi: use "ref" as clock name
On Wed, Dec 22, 2021 at 3:40 PM Heiko Stübner wrote: > > Am Mittwoch, 22. Dezember 2021, 14:52:51 CET schrieb Rob Herring: > > On Wed, Dec 22, 2021 at 6:47 AM Sascha Hauer wrote: > > > > > > On Tue, Dec 21, 2021 at 10:31:23AM -0400, Rob Herring wrote: > > > > On Mon, Dec 20, 2021 at 12:06:16PM +0100, Sascha Hauer wrote: > > > > > "vpll" is a misnomer. A clock input to a device should be named after > > > > > the usage in the device, not after the clock that drives it. On the > > > > > rk3568 the same clock is driven by the HPLL. > > > > > To fix that, this patch renames the vpll clock to ref clock. > > > > > > > > The problem with this series is it breaks an old kernel with new dt. You > > > > can partially mitigate that with stable kernel backport, but IMO keeping > > > > the old name is not a burden to maintain. > > > > > > As suggested I only removed vpll from the binding document, but not from > > > the code. The code still handles the old binding as well. > > > > The problem is updating rk3399.dtsi. That change won't work with old > > kernels because they won't look for 'ref'. Since you shouldn't change > > it, the binding needs to cover both the old and new cases. > > is "newer dt with old kernel" really a case these days? I've had complaints about it. In particular from SUSE folks that were shipping new dtbs with old (stable) kernels. > I do understand the new kernel old dt case - for example with the > dtb being provided by firmware. Yes, so update your firmware that contains a newer dtb and then you stop booting or a device stops working. > But which user would get the idea of updating only the devicetree > while staying with an older kernel? Any synchronization between firmware and OS updates is a problem. Rob
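As an aside, the driver-side compatibility Sascha refers to is typically just a named lookup with a fallback; something along these lines (a sketch, not the actual dw-hdmi-rockchip code, and ref_clk is an assumed field name):

	/* Prefer the new "ref" name, fall back to the deprecated "vpll". */
	hdmi->ref_clk = devm_clk_get_optional(dev, "ref");
	if (!hdmi->ref_clk)
		hdmi->ref_clk = devm_clk_get_optional(dev, "vpll");
	if (IS_ERR(hdmi->ref_clk))
		return PTR_ERR(hdmi->ref_clk);

devm_clk_get_optional() returns NULL when the name is simply absent, which is what makes the fallback clean: only real errors propagate.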
Re: [PATCH 1/2] drm/tegra: dpaux: Populate AUX bus
20.12.2021 13:48, Thierry Reding пишет: > From: Thierry Reding > > The DPAUX hardware block exposes an DP AUX interface that provides > access to an AUX bus and the devices on that bus. Use the DP AUX bus > infrastructure that was recently introduced to probe devices on this > bus from DT. > > Signed-off-by: Thierry Reding > --- > drivers/gpu/drm/tegra/Kconfig | 1 + > drivers/gpu/drm/tegra/dpaux.c | 7 +++ > 2 files changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig > index 8cf5aeb9db6c..201f5175ecfe 100644 > --- a/drivers/gpu/drm/tegra/Kconfig > +++ b/drivers/gpu/drm/tegra/Kconfig > @@ -5,6 +5,7 @@ config DRM_TEGRA > depends on COMMON_CLK > depends on DRM > depends on OF > + select DRM_DP_AUX_BUS > select DRM_KMS_HELPER > select DRM_MIPI_DSI > select DRM_PANEL > diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c > index 1f96e416fa08..9da1edcdc835 100644 > --- a/drivers/gpu/drm/tegra/dpaux.c > +++ b/drivers/gpu/drm/tegra/dpaux.c > @@ -18,6 +18,7 @@ > #include > #include > > +#include > #include > #include > > @@ -570,6 +571,12 @@ static int tegra_dpaux_probe(struct platform_device > *pdev) > list_add_tail(&dpaux->list, &dpaux_list); > mutex_unlock(&dpaux_lock); > > + err = devm_of_dp_aux_populate_ep_devices(&dpaux->aux); > + if (err < 0) { > + dev_err(dpaux->dev, "failed to populate AUX bus: %d\n", err); > + return err; > + } > + > return 0; > } Needs stable tag for 5.15+.
Re: [PATCH 08/22] dt-bindings: display: rockchip: dw-hdmi: use "ref" as clock name
On Mittwoch, 22. Dezember 2021 20:39:58 CET Heiko Stübner wrote: > Am Mittwoch, 22. Dezember 2021, 14:52:51 CET schrieb Rob Herring: > > On Wed, Dec 22, 2021 at 6:47 AM Sascha Hauer wrote: > > > > > > On Tue, Dec 21, 2021 at 10:31:23AM -0400, Rob Herring wrote: > > > > On Mon, Dec 20, 2021 at 12:06:16PM +0100, Sascha Hauer wrote: > > > > > "vpll" is a misnomer. A clock input to a device should be named after > > > > > the usage in the device, not after the clock that drives it. On the > > > > > rk3568 the same clock is driven by the HPLL. > > > > > To fix that, this patch renames the vpll clock to ref clock. > > > > > > > > The problem with this series is it breaks an old kernel with new dt. You > > > > can partially mitigate that with stable kernel backport, but IMO keeping > > > > the old name is not a burden to maintain. > > > > > > As suggested I only removed vpll from the binding document, but not from > > > the code. The code still handles the old binding as well. > > > > The problem is updating rk3399.dtsi. That change won't work with old > > kernels because they won't look for 'ref'. Since you shouldn't change > > it, the binding needs to cover both the old and new cases. > > is "newer dt with old kernel" really a case these days? > > I do understand the new kernel old dt case - for example with the > dtb being provided by firmware. > > But which user would get the idea of updating only the devicetree > while staying with an older kernel? > Side-by-side installations of LTS kernels with new kernels. The LTS kernel uses the same DT as the new kernel because the distribution set it up that way. Other scenario: a user wants to modify their device tree. They download the latest kernel sources from kernel.org because they can't use overlays and they don't want to fiddle with decompiled device trees.
Re: [Intel-gfx] [PATCH] drm/i915: Use trylock instead of blocking lock for __i915_gem_free_objects.
On 12/22/21 16:56, Maarten Lankhorst wrote: Convert free_work into delayed_work, similar to ttm to allow converting the blocking lock in __i915_gem_free_objects to a trylock. Unlike ttm, the object should already be idle, as it's kept alive by a reference through struct i915_vma->active, which is dropped after all vma's are idle. Because of this, we can use a no wait by default, or when the lock is contested, we use ttm's 10 ms. The trylock should only fail when the object is sharing it's resv with other objects, and typically objects are not kept locked for a long time, so we can safely retry on failure. Fixes: be7612fd6665 ("drm/i915: Require object lock when freeing pages during destruction") Testcase: igt/gem_exec_alignment/pi* Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 14 ++ drivers/gpu/drm/i915/i915_drv.h| 4 ++-- 2 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 39cd563544a5..d87b508b59b1 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -331,7 +331,13 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915, continue; } - i915_gem_object_lock(obj, NULL); + if (!i915_gem_object_trylock(obj, NULL)) { + /* busy, toss it back to the pile */ + if (llist_add(&obj->freed, &i915->mm.free_list)) + queue_delayed_work(i915->wq, &i915->mm.free_work, msecs_to_jiffies(10)); i915->wq is ordered. From what I can tell, with queue_delayed_work(), the work doesn't get inserted into the queue order until the delay expires, right? So we don't unnecessarily hold up other objects getting freed? + continue; + } + __i915_gem_object_pages_fini(obj); i915_gem_object_unlock(obj); __i915_gem_free_object(obj); @@ -353,7 +359,7 @@ void i915_gem_flush_free_objects(struct drm_i915_private *i915) static void __i915_gem_free_work(struct work_struct *work) { struct drm_i915_private *i915 = - container_of(work, struct drm_i915_private, mm.free_work); + container_of(work, struct drm_i915_private, mm.free_work.work); i915_gem_flush_free_objects(i915); } @@ -385,7 +391,7 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj) */ if (llist_add(&obj->freed, &i915->mm.free_list)) - queue_work(i915->wq, &i915->mm.free_work); + queue_delayed_work(i915->wq, &i915->mm.free_work, 0); } void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj, @@ -710,7 +716,7 @@ bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj, void i915_gem_init__objects(struct drm_i915_private *i915) { - INIT_WORK(&i915->mm.free_work, __i915_gem_free_work); + INIT_DELAYED_WORK(&i915->mm.free_work, __i915_gem_free_work); } void i915_objects_module_exit(void) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c8fddb7e61c9..beeb42a14aae 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -465,7 +465,7 @@ struct i915_gem_mm { * List of objects which are pending destruction. */ struct llist_head free_list; - struct work_struct free_work; + struct delayed_work free_work; /** * Count of objects pending destructions. Used to skip needlessly * waiting on an RCU barrier if no objects are waiting to be freed. @@ -1625,7 +1625,7 @@ static inline void i915_gem_drain_freed_objects(struct drm_i915_private *i915) * armed the work again. 
*/ while (atomic_read(&i915->mm.free_count)) { - flush_work(&i915->mm.free_work); + flush_delayed_work(&i915->mm.free_work); flush_delayed_work(&i915->bdev.wq); rcu_barrier(); } Otherwise LGTM. Reviewed-by: Thomas Hellström
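On the ordering question: with a zero delay, queue_delayed_work() queues the item immediately, exactly like queue_work(); with a non-zero delay the item only joins the (ordered) queue once its timer fires, so a contended object parked for 10 ms should not hold up objects freed in the meantime. The two paths from the patch, side by side:

	/* immediate: behaves like queue_work() */
	queue_delayed_work(i915->wq, &i915->mm.free_work, 0);

	/* contended: the timer fires first, then the work joins the queue */
	queue_delayed_work(i915->wq, &i915->mm.free_work,
			   msecs_to_jiffies(10));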
Re: [PATCH 08/22] dt-bindings: display: rockchip: dw-hdmi: use "ref" as clock name
Am Mittwoch, 22. Dezember 2021, 14:52:51 CET schrieb Rob Herring: > On Wed, Dec 22, 2021 at 6:47 AM Sascha Hauer wrote: > > > > On Tue, Dec 21, 2021 at 10:31:23AM -0400, Rob Herring wrote: > > > On Mon, Dec 20, 2021 at 12:06:16PM +0100, Sascha Hauer wrote: > > > > "vpll" is a misnomer. A clock input to a device should be named after > > > > the usage in the device, not after the clock that drives it. On the > > > > rk3568 the same clock is driven by the HPLL. > > > > To fix that, this patch renames the vpll clock to ref clock. > > > > > > The problem with this series is it breaks an old kernel with new dt. You > > > can partially mitigate that with stable kernel backport, but IMO keeping > > > the old name is not a burden to maintain. > > > > As suggested I only removed vpll from the binding document, but not from > > the code. The code still handles the old binding as well. > > The problem is updating rk3399.dtsi. That change won't work with old > kernels because they won't look for 'ref'. Since you shouldn't change > it, the binding needs to cover both the old and new cases. is "newer dt with old kernel" really a case these days? I do understand the new kernel old dt case - for example with the dtb being provided by firmware. But which user would get the idea of updating only the devicetree while staying with an older kernel?
[PATCH] drm/msm/dp: Simplify dp_debug_init() and dp_debug_get()
dp_debug_init() always returns 0. So, make it a void function and simplify the only caller accordingly. While at it remove a useless 'rc' initialization in dp_debug_get() Signed-off-by: Christophe JAILLET --- drivers/gpu/drm/msm/dp/dp_debug.c | 13 +++-- 1 file changed, 3 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/msm/dp/dp_debug.c b/drivers/gpu/drm/msm/dp/dp_debug.c index da4323556ef3..338f1f9c4d14 100644 --- a/drivers/gpu/drm/msm/dp/dp_debug.c +++ b/drivers/gpu/drm/msm/dp/dp_debug.c @@ -207,9 +207,8 @@ static const struct file_operations test_active_fops = { .write = dp_test_active_write }; -static int dp_debug_init(struct dp_debug *dp_debug, struct drm_minor *minor) +static void dp_debug_init(struct dp_debug *dp_debug, struct drm_minor *minor) { - int rc = 0; struct dp_debug_private *debug = container_of(dp_debug, struct dp_debug_private, dp_debug); @@ -229,17 +228,15 @@ static int dp_debug_init(struct dp_debug *dp_debug, struct drm_minor *minor) debug, &dp_test_type_fops); debug->root = minor->debugfs_root; - - return rc; } struct dp_debug *dp_debug_get(struct device *dev, struct dp_panel *panel, struct dp_usbpd *usbpd, struct dp_link *link, struct drm_connector *connector, struct drm_minor *minor) { - int rc = 0; struct dp_debug_private *debug; struct dp_debug *dp_debug; + int rc; if (!dev || !panel || !usbpd || !link) { DRM_ERROR("invalid input\n"); @@ -266,11 +263,7 @@ struct dp_debug *dp_debug_get(struct device *dev, struct dp_panel *panel, dp_debug->hdisplay = 0; dp_debug->vrefresh = 0; - rc = dp_debug_init(dp_debug, minor); - if (rc) { - devm_kfree(dev, debug); - goto error; - } + dp_debug_init(dp_debug, minor); return dp_debug; error: -- 2.32.0
Re: [PATCH v16 08/40] gpu: host1x: Add initial runtime PM and OPP support
22.12.2021 22:30, Jon Hunter пишет: > > On 22/12/2021 19:01, Dmitry Osipenko wrote: > > ... >> diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c >> index e08e331e46ae..8194826c9ce3 100644 >> --- a/drivers/gpu/host1x/syncpt.c >> +++ b/drivers/gpu/host1x/syncpt.c >> @@ -137,6 +137,15 @@ void host1x_syncpt_restore(struct host1x *host) >> struct host1x_syncpt *sp_base = host->syncpt; >> unsigned int i; >> >> + for (i = 0; i < host->info->nb_pts; i++) { >> + /* >> + * Unassign syncpt from channels for purposes of Tegra186 >> + * syncpoint protection. This prevents any channel from >> + * accessing it until it is reassigned. >> + */ >> + host1x_hw_syncpt_assign_to_channel(host, sp_base + i, NULL); >> + } >> + >> for (i = 0; i < host1x_syncpt_nb_pts(host); i++) >> host1x_hw_syncpt_restore(host, sp_base + i); >> >> @@ -352,13 +361,6 @@ int host1x_syncpt_init(struct host1x *host) >> for (i = 0; i < host->info->nb_pts; i++) { >> syncpt[i].id = i; >> syncpt[i].host = host; >> - >> - /* >> - * Unassign syncpt from channels for purposes of Tegra186 >> - * syncpoint protection. This prevents any channel from >> - * accessing it until it is reassigned. >> - */ >> - host1x_hw_syncpt_assign_to_channel(host, &syncpt[i], NULL); >> } >> >> for (i = 0; i < host->info->nb_bases; i++) >> > > > Thanks! This fixed it! I'll prepare a proper patch with your t-b, thank you.
Re: [PATCH v16 08/40] gpu: host1x: Add initial runtime PM and OPP support
On 22/12/2021 19:01, Dmitry Osipenko wrote: ... diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index e08e331e46ae..8194826c9ce3 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -137,6 +137,15 @@ void host1x_syncpt_restore(struct host1x *host) struct host1x_syncpt *sp_base = host->syncpt; unsigned int i; + for (i = 0; i < host->info->nb_pts; i++) { + /* +* Unassign syncpt from channels for purposes of Tegra186 +* syncpoint protection. This prevents any channel from +* accessing it until it is reassigned. +*/ + host1x_hw_syncpt_assign_to_channel(host, sp_base + i, NULL); + } + for (i = 0; i < host1x_syncpt_nb_pts(host); i++) host1x_hw_syncpt_restore(host, sp_base + i); @@ -352,13 +361,6 @@ int host1x_syncpt_init(struct host1x *host) for (i = 0; i < host->info->nb_pts; i++) { syncpt[i].id = i; syncpt[i].host = host; - - /* -* Unassign syncpt from channels for purposes of Tegra186 -* syncpoint protection. This prevents any channel from -* accessing it until it is reassigned. -*/ - host1x_hw_syncpt_assign_to_channel(host, &syncpt[i], NULL); } for (i = 0; i < host->info->nb_bases; i++) Thanks! This fixed it! Jon -- nvpublic
Re: [PATCH 2/2] ARM: tegra: Move panels to AUX bus
20.12.2021 13:48, Thierry Reding пишет: > From: Thierry Reding > > Move the eDP panel on Venice 2 and Nyan boards into the corresponding > AUX bus device tree node. This allows us to avoid a nasty circular > dependency that would otherwise be created between the DPAUX and panel > nodes via the DDC/I2C phandle. > > Signed-off-by: Thierry Reding > --- > arch/arm/boot/dts/tegra124-nyan-big.dts | 15 +-- > arch/arm/boot/dts/tegra124-nyan-blaze.dts | 15 +-- > arch/arm/boot/dts/tegra124-venice2.dts| 14 +++--- > 3 files changed, 25 insertions(+), 19 deletions(-) > > diff --git a/arch/arm/boot/dts/tegra124-nyan-big.dts > b/arch/arm/boot/dts/tegra124-nyan-big.dts > index 1d2aac2cb6d0..fdc1d64dfff9 100644 > --- a/arch/arm/boot/dts/tegra124-nyan-big.dts > +++ b/arch/arm/boot/dts/tegra124-nyan-big.dts > @@ -13,12 +13,15 @@ / { >"google,nyan-big-rev1", "google,nyan-big-rev0", >"google,nyan-big", "google,nyan", "nvidia,tegra124"; > > - panel: panel { > - compatible = "auo,b133xtn01"; > - > - power-supply = <&vdd_3v3_panel>; > - backlight = <&backlight>; > - ddc-i2c-bus = <&dpaux>; > + host1x@5000 { > + dpaux@545c { > + aux-bus { > + panel: panel { > + compatible = "auo,b133xtn01"; > + backlight = <&backlight>; > + }; > + }; > + }; > }; > > mmc@700b0400 { /* SD Card on this bus */ > diff --git a/arch/arm/boot/dts/tegra124-nyan-blaze.dts > b/arch/arm/boot/dts/tegra124-nyan-blaze.dts > index 677babde6460..abdf4456826f 100644 > --- a/arch/arm/boot/dts/tegra124-nyan-blaze.dts > +++ b/arch/arm/boot/dts/tegra124-nyan-blaze.dts > @@ -15,12 +15,15 @@ / { >"google,nyan-blaze-rev0", "google,nyan-blaze", >"google,nyan", "nvidia,tegra124"; > > - panel: panel { > - compatible = "samsung,ltn140at29-301"; > - > - power-supply = <&vdd_3v3_panel>; > - backlight = <&backlight>; > - ddc-i2c-bus = <&dpaux>; > + host1x@5000 { > + dpaux@545c { > + aux-bus { > + panel: panel { > + compatible = "samsung,ltn140at29-301"; > + backlight = <&backlight>; > + }; > + }; > + }; > }; > > sound { > diff --git a/arch/arm/boot/dts/tegra124-venice2.dts > b/arch/arm/boot/dts/tegra124-venice2.dts > index 232c90604df9..6a9592ceb5f2 100644 > --- a/arch/arm/boot/dts/tegra124-venice2.dts > +++ b/arch/arm/boot/dts/tegra124-venice2.dts > @@ -48,6 +48,13 @@ sor@5454 { > dpaux@545c { > vdd-supply = <&vdd_3v3_panel>; > status = "okay"; > + > + aux-bus { > + panel: panel { > + compatible = "lg,lp129qe"; > + backlight = <&backlight>; > + }; > + }; > }; > }; > > @@ -1080,13 +1087,6 @@ power { > }; > }; > > - panel: panel { > - compatible = "lg,lp129qe"; > - power-supply = <&vdd_3v3_panel>; > - backlight = <&backlight>; > - ddc-i2c-bus = <&dpaux>; > - }; > - > vdd_mux: regulator-mux { > compatible = "regulator-fixed"; > regulator-name = "+VDD_MUX"; > You should add stable tag for 5.15 and also add separate patch to update the new arch/arm/boot/dts/tegra124-nyan-big-fhd.dts which we have in -next now.
Re: [PATCH 0/2] drm/tegra: Fix panel support on Venice 2 and Nyan
22.12.2021 14:53, Thierry Reding пишет: > On Wed, Dec 22, 2021 at 06:01:26AM +0300, Dmitry Osipenko wrote: >> 21.12.2021 21:01, Thierry Reding пишет: >>> On Tue, Dec 21, 2021 at 07:45:31PM +0300, Dmitry Osipenko wrote: 21.12.2021 19:17, Thierry Reding пишет: > On Tue, Dec 21, 2021 at 06:47:31PM +0300, Dmitry Osipenko wrote: >> 21.12.2021 13:58, Thierry Reding пишет: >> .. >> The panel->ddc isn't used by the new panel-edp driver unless panel is >> compatible with "edp-panel". Hence the generic_edp_panel_probe() >> should >> either fail or crash for a such "edp-panel" since panel->ddc isn't >> fully >> instantiated, AFAICS. > > I've tested this and it works fine on Venice 2. Since that was the > reference design for Nyan, I suspect that Nyan's will also work. > > It'd be great if Thomas or anyone else with access to a Nyan could > test this to verify that. There is no panel-edp driver in the v5.15. The EOL of v5.15 is Oct, 2023, hence we need to either use: >>> >>> All the (at least relevant) functionality that is in panel-edp was in >>> panel-simple before it was moved to panel-edp. I've backported this set >>> of patches to v5.15 and it works just fine there. >> >> Will we be able to add patch to bypass the panel's DT ddc-i2c-bus on >> Nyan to keep the older DTBs working? > > I don't see why we would want to do that. It's quite clear that the DTB > is buggy in this case and we have a more accurate way to describe what's > really there in hardware. In addition that more accurate representation > also gets rid of a bug. Obviously because the bug is caused by the > previous representation that was not accurate. > > Given that we can easily replace the DTBs on these devices there's no > reason to make this any more complicated than it has to be. Don't you care about normal people at all? Do you assume that everyone must to be a kernel developer to be able to use Tegra devices? :/ >>> >>> If you know how to install a custom kernel you also know how to replace >>> the DTB on these devices. >>> >>> For everyone else, once these patches are merged upstream and >>> distributions start shipping the new version, they will get this >>> automatically by updating their kernel package since most distributions >>> actually ship the DTB files as part of that. >>> It's not a problem for you to figure out why display is broken, for other people it's a problem. Usually nobody will update DTB without a well known reason, instead device will be dusted on a shelf. In the end you won't have any users at all. >>> >>> Most "normal" people aren't even going to notice that their DTB is going >>> to be updated. They would actually have to do extra work *not* to update >>> it. >> >> My past experience tells that your assumption is incorrect. There are >> quite a lot of people who will update kernel, but not DTB. > > People that do this will have to do it manually because most > distributions I know of will actually ship the DTBs. If they know how to > update the kernel separately, I'm sure they will manage to update the > DTB as well. It's really not more complicated that updating the kernel > image. > >> ARM devices have endless variations of bootloaders and individual quirks >> required for a successful installation of a kernel. Kernel update by >> distro usually isn't a thing on ARM. > > I'm not sure what distribution you have been using, but the ones that > I'm familiar with all install the DTBs along with the kernel. 
Most Tegra > devices (newer ones at least) do also support booting with U-Boot which > supports standard ways to boot a system (which were co-developed with > distributions precisely so that it would become easier for users to keep > their systems up-to-date), so there's really nothing magical anyone > should need to do in order to get an updated DTB along with the updated > kernel. > > It's a simple fact that sometimes a DTB contains a bug and we have to > fix it. > > In general we try to fix things up in the driver code when reasonable so > that people don't have to update the DTB. This is for the (mostly hypothetical) case where updating the DTB is not possible or very > complicated. > > However, that's not the case on the Venice 2 or Nyan boards. And looking > at the alternative in this case, I don't think it's reasonable compared > to just fixing the problem at the root, which is in the DTB. My understanding is that U-Boot isn't the only available bootloader option for Nyan. I don't feel happy about the ABI breakage, but at the same time don't feel very strongly about the need to care about it in the case of Nyan since its DT already had a preexisting problem with the wrong panel model used for the FHD model. The decision will be on your conscience :)
Re: [PATCH] dt-bindings: display: bridge: lvds-codec: Fix duplicate key
On 12/22/21 19:03, Rob Herring wrote: On Mon, 20 Dec 2021 13:51:47 +0100, Thierry Reding wrote: From: Thierry Reding In order to validate multiple "if" conditionals, they must be part of an "allOf:" list, otherwise they will cause a failure in parsing the schema because of the duplicated "if" property. Fixes: d7df3948eb49 ("dt-bindings: display: bridge: lvds-codec: Document pixel data sampling edge select") Signed-off-by: Thierry Reding --- .../bindings/display/bridge/lvds-codec.yaml | 43 ++- 1 file changed, 22 insertions(+), 21 deletions(-) I went ahead and applied to drm-misc, so linux-next is fixed. Thank you
Re: [PATCH v16 08/40] gpu: host1x: Add initial runtime PM and OPP support
22.12.2021 21:41, Jon Hunter пишет: > > On 22/12/2021 09:47, Jon Hunter wrote: >> >> On 21/12/2021 20:58, Dmitry Osipenko wrote: >>> Hi, >>> >>> Thank you for testing it all. >>> >>> 21.12.2021 21:55, Jon Hunter пишет: Hi Dmitry, Thierry, On 30/11/2021 23:23, Dmitry Osipenko wrote: > Add runtime PM and OPP support to the Host1x driver. For the > starter we > will keep host1x always-on because dynamic power management require a > major > refactoring of the driver code since lot's of code paths are > missing the > RPM handling and we're going to remove some of these paths in the > future. Unfortunately, this change is breaking boot on Tegra186. Bisect points to this and reverting on top of -next gets the board booting again. Sadly, there is no panic or error reported, it is just a hard hang. I will not have time to look at this this week and so we may need to revert for the moment. >>> >>> Only T186 broken? What about T194? >> >> Yes interestingly only Tegra186 and no other board. >> >>> Which board model fails to boot? Is it running in hypervisor mode? >> >> This is Jetson TX2. No hypervisor. >> >>> Do you use any additional patches? >> >> No just plain -next. The tests run every day on top of tree. >> >>> Could you please test the below diff? I suspect that >>> host1x_syncpt_save/restore may be entirely broken for T186 since we >>> never used these funcs before. >>> >>> --- >8 --- >>> >>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c >>> index f5b4dcded088..fd5dfb875422 100644 >>> --- a/drivers/gpu/host1x/dev.c >>> +++ b/drivers/gpu/host1x/dev.c >>> @@ -580,7 +580,6 @@ static int __maybe_unused >>> host1x_runtime_suspend(struct device *dev) >>> int err; >>> >>> host1x_intr_stop(host); >>> - host1x_syncpt_save(host); >>> >>> err = reset_control_bulk_assert(host->nresets, host->resets); >>> if (err) { >>> @@ -596,9 +595,8 @@ static int __maybe_unused >>> host1x_runtime_suspend(struct device *dev) >>> return 0; >>> >>> resume_host1x: >>> - host1x_setup_sid_table(host); >>> - host1x_syncpt_restore(host); >>> host1x_intr_start(host); >>> + host1x_setup_sid_table(host); >>> >>> return err; >>> } >>> @@ -626,9 +624,8 @@ static int __maybe_unused >>> host1x_runtime_resume(struct device *dev) >>> goto disable_clk; >>> } >>> >>> - host1x_setup_sid_table(host); >>> - host1x_syncpt_restore(host); >>> host1x_intr_start(host); >>> + host1x_setup_sid_table(host); >> >> >> Thanks! Will try this later, once the next bisect is finished :-) > > I tested the above, but this did not fix it. It still hangs on boot. Thank you, now I see where the problem should be. Apparently host1x is disabled at a boot time on T186 and we touch h/w before RPM is resumed. Could you please revert the above change and try this instead: diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index e08e331e46ae..8194826c9ce3 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -137,6 +137,15 @@ void host1x_syncpt_restore(struct host1x *host) struct host1x_syncpt *sp_base = host->syncpt; unsigned int i; + for (i = 0; i < host->info->nb_pts; i++) { + /* +* Unassign syncpt from channels for purposes of Tegra186 +* syncpoint protection. This prevents any channel from +* accessing it until it is reassigned. 
+*/ + host1x_hw_syncpt_assign_to_channel(host, sp_base + i, NULL); + } + for (i = 0; i < host1x_syncpt_nb_pts(host); i++) host1x_hw_syncpt_restore(host, sp_base + i); @@ -352,13 +361,6 @@ int host1x_syncpt_init(struct host1x *host) for (i = 0; i < host->info->nb_pts; i++) { syncpt[i].id = i; syncpt[i].host = host; - - /* -* Unassign syncpt from channels for purposes of Tegra186 -* syncpoint protection. This prevents any channel from -* accessing it until it is reassigned. -*/ - host1x_hw_syncpt_assign_to_channel(host, &syncpt[i], NULL); } for (i = 0; i < host->info->nb_bases; i++)
Re: [PATCH v16 08/40] gpu: host1x: Add initial runtime PM and OPP support
On 22/12/2021 09:47, Jon Hunter wrote: On 21/12/2021 20:58, Dmitry Osipenko wrote: Hi, Thank you for testing it all. 21.12.2021 21:55, Jon Hunter пишет: Hi Dmitry, Thierry, On 30/11/2021 23:23, Dmitry Osipenko wrote: Add runtime PM and OPP support to the Host1x driver. For the starter we will keep host1x always-on because dynamic power management require a major refactoring of the driver code since lot's of code paths are missing the RPM handling and we're going to remove some of these paths in the future. Unfortunately, this change is breaking boot on Tegra186. Bisect points to this and reverting on top of -next gets the board booting again. Sadly, there is no panic or error reported, it is just a hard hang. I will not have time to look at this this week and so we may need to revert for the moment. Only T186 broken? What about T194? Yes interestingly only Tegra186 and no other board. Which board model fails to boot? Is it running in hypervisor mode? This is Jetson TX2. No hypervisor. Do you use any additional patches? No just plain -next. The tests run every day on top of tree. Could you please test the below diff? I suspect that host1x_syncpt_save/restore may be entirely broken for T186 since we never used these funcs before. --- >8 --- diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index f5b4dcded088..fd5dfb875422 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -580,7 +580,6 @@ static int __maybe_unused host1x_runtime_suspend(struct device *dev) int err; host1x_intr_stop(host); - host1x_syncpt_save(host); err = reset_control_bulk_assert(host->nresets, host->resets); if (err) { @@ -596,9 +595,8 @@ static int __maybe_unused host1x_runtime_suspend(struct device *dev) return 0; resume_host1x: - host1x_setup_sid_table(host); - host1x_syncpt_restore(host); host1x_intr_start(host); + host1x_setup_sid_table(host); return err; } @@ -626,9 +624,8 @@ static int __maybe_unused host1x_runtime_resume(struct device *dev) goto disable_clk; } - host1x_setup_sid_table(host); - host1x_syncpt_restore(host); host1x_intr_start(host); + host1x_setup_sid_table(host); Thanks! Will try this later, once the next bisect is finished :-) I tested the above, but this did not fix it. It still hangs on boot. Jon -- nvpublic
Re: [PATCH] dt-bindings: display: novatek,nt36672a: Fix unevaluated properties warning
On Tue, 21 Dec 2021 08:51:26 -0400, Rob Herring wrote: > With 'unevaluatedProperties' support enabled, the novatek,nt36672a > binding has a new warning: > > Documentation/devicetree/bindings/display/panel/novatek,nt36672a.example.dt.yaml: > panel@0: Unevaluated properties are not allowed ('vddi0-supply', > '#address-cells', '#size-cells' were unexpected) > > Based on dts files, 'vddi0-supply' does appear to be the correct name. > Drop '#address-cells' and '#size-cells' which aren't needed. > > Signed-off-by: Rob Herring > --- > .../devicetree/bindings/display/panel/novatek,nt36672a.yaml | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > Applied, thanks!
Re: [PATCH] dt-bindings: msm: disp: remove bus from dpu bindings
On Mon, 20 Dec 2021 19:42:20 +0100, David Heidelberg wrote: > Driver and dts have already been adjusted and the bus moved out of dpu, so let's > also update the dt-bindings. > > Fixes warnings such as: > arch/arm64/boot/dts/qcom/sdm845-oneplus-fajita.dt.yaml: mdss > @ae0: clock-names: ['iface', 'core'] is too short > From schema: > Documentation/devicetree/bindings/display/msm/dpu-sdm845.yaml > > Ref: > https://lore.kernel.org/all/20210803101657.1072358-1-dmitry.barysh...@linaro.org/ > > Signed-off-by: David Heidelberg > --- > .../devicetree/bindings/display/msm/dpu-sdm845.yaml | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) Applied, thanks!
Re: [PATCH] dt-bindings: display: bridge: lvds-codec: Fix duplicate key
On Mon, 20 Dec 2021 13:51:47 +0100, Thierry Reding wrote: > From: Thierry Reding > > In order to validate multiple "if" conditionals, they must be part of an > "allOf:" list, otherwise they will cause a failure in parsing the schema > because of the duplicated "if" property. > > Fixes: d7df3948eb49 ("dt-bindings: display: bridge: lvds-codec: Document > pixel data sampling edge select") > Signed-off-by: Thierry Reding > --- > .../bindings/display/bridge/lvds-codec.yaml | 43 ++- > 1 file changed, 22 insertions(+), 21 deletions(-) > I went ahead and applied to drm-misc, so linux-next is fixed. Rob
Re: make dt_binding_check broken by drm & lvds-codec
On 12/22/21 18:43, Rafał Miłecki wrote: Hi, Hi, ba3e86789eaf ("dt-bindings: display: bridge: lvds-codec: Document LVDS data mapping select") d7df3948eb49 ("dt-bindings: display: bridge: lvds-codec: Document pixel data sampling edge select") Both commits add "if" and "then" at YAML "root" level. Can you take a look at that, please? This should already be fixed by: [PATCH] dt-bindings: display: bridge: lvds-codec: Fix duplicate key +CC Thomas/Thierry, can you please pick the aforementioned patch?
make dt_binding_check broken by drm & lvds-codec
Hi, I just noticed that "make dt_binding_check" doesn't work in linux-next: SCHEMA Documentation/devicetree/bindings/processed-schema-examples.json Traceback (most recent call last): File "/home/rmilecki/.local/bin/dt-mk-schema", line 38, in schemas = dtschema.process_schemas(args.schemas, core_schema=(not args.useronly)) File "/home/rmilecki/.local/lib/python3.6/site-packages/dtschema/lib.py", line 587, in process_schemas sch = process_schema(os.path.abspath(filename)) File "/home/rmilecki/.local/lib/python3.6/site-packages/dtschema/lib.py", line 561, in process_schema schema = load_schema(filename) File "/home/rmilecki/.local/lib/python3.6/site-packages/dtschema/lib.py", line 126, in load_schema return do_load(os.path.join(schema_basedir, schema)) File "/home/rmilecki/.local/lib/python3.6/site-packages/dtschema/lib.py", line 112, in do_load return yaml.load(tmp) File "/usr/lib/python3.6/site-packages/ruamel/yaml/main.py", line 343, in load return constructor.get_single_data() File "/usr/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 113, in get_single_data return self.construct_document(node) File "/usr/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 123, in construct_document for _dummy in generator: File "/usr/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 723, in construct_yaml_map value = self.construct_mapping(node) File "/usr/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 440, in construct_mapping return BaseConstructor.construct_mapping(self, node, deep=deep) File "/usr/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 257, in construct_mapping if self.check_mapping_key(node, key_node, mapping, key, value): File "/usr/lib/python3.6/site-packages/ruamel/yaml/constructor.py", line 295, in check_mapping_key raise DuplicateKeyError(*args) ruamel.yaml.constructor.DuplicateKeyError: while constructing a mapping in "", line 4, column 1 found duplicate key "if" with value "{}" (original value: "{}") in "", line 113, column 1 It's caused by two commits: ba3e86789eaf ("dt-bindings: display: bridge: lvds-codec: Document LVDS data mapping select") d7df3948eb49 ("dt-bindings: display: bridge: lvds-codec: Document pixel data sampling edge select") Both commits add "if" and "then" at YAML "root" level. Can you take a look at that, please?
Re: [PATCH 22/22] drm: rockchip: Add VOP2 driver
On Dienstag, 21. Dezember 2021 14:44:39 CET Nicolas Frattaroli wrote: > On Montag, 20. Dezember 2021 12:06:30 CET Sascha Hauer wrote: > > From: Andy Yan > > > > The VOP2 unit is found on Rockchip SoCs beginning with rk3566/rk3568. > > It replaces the VOP unit found in the older Rockchip SoCs. > > > > This driver has been derived from the downstream Rockchip Kernel and > > heavily modified: > > > > - All nonstandard DRM properties have been removed > > - dropped struct vop2_plane_state and pass around less data between > > functions > > - Dropped all DRM_FORMAT_* not known on upstream > > - rework register access to get rid of excessively used macros > > - Drop all waiting for framesyncs > > > > The driver is tested with HDMI and MIPI-DSI display on a RK3568-EVB > > board. Overlay support is tested with the modetest utility. AFBC support > > on the cluster windows is tested with weston-simple-dmabuf-egl on > > weston using the (yet to be upstreamed) panfrost driver support. > > > > Signed-off-by: Sascha Hauer > > --- > > Hi Sascha, > > quick partial review of the code in-line. > > For reference, I debugged locking issues with the kernel lock > debug config options and assert_spin_locked in the reg write > functions, as well as some manual deduction. > As a small follow-up, I've completely mapped out the calls to vop2_writel, vop2_readl, vop2_vp_write and vop2_win_write and coloured in whether they were called with the lock held or not. The conclusion is startling: Most of the code absolutely does not care about the reg_lock. Here's the graph as an SVG: https://overviewer.org/~pillow/up/6800427ef3/vop2_callgraph_modified.svg vop2_isr needs special attention here, as it also acquires a different spinlock, and we want to avoid deadlocks. Perhaps we should precisely define which lock must be held for what registers, such that the vop2_isr can write its interrupt-related registers without acquiring the "big" reg_lock. I'm also not entirely sure whether I should assume vop2_readl needs to be called with the lock held. This needs some investigating both in terms of whether the hardware presents a writel as an atomic write of a long, and whether the code assumes the state between readl calls is ever a consistent view. Regards, Nicolas Frattaroli
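For anyone reproducing this analysis, the instrumentation is a one-liner in each register accessor; a sketch (the vop2 field names are assumed, the real struct layout may differ):

static void vop2_writel(struct vop2 *vop2, u32 offset, u32 v)
{
	/* Catches callers that write without holding reg_lock when lock
	 * debugging is enabled; this is how the unlocked paths in the
	 * call graph above were found. */
	assert_spin_locked(&vop2->reg_lock);
	writel(v, vop2->regs + offset);
}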
Re: [Intel-gfx] [PATCH 4/7] drm/i915/guc: Don't hog IRQs when destroying contexts
Ping? Main two points being: 1) Commit message seems in contradiction with the change in guc_flush_destroyed_contexts. And the lock drop to immediately re-acquire it looks questionable to start with. 2) And in deregister_destroyed_contexts and in 1) I was therefore asking if you can unlink all at once and process with reduced hammering on the lock. Regards, Tvrtko On 17/12/2021 11:14, Tvrtko Ursulin wrote: On 17/12/2021 11:06, Tvrtko Ursulin wrote: On 14/12/2021 17:04, Matthew Brost wrote: From: John Harrison While attempting to debug a CT deadlock issue in various CI failures (most easily reproduced with gem_ctx_create/basic-files), I was seeing CPU deadlock errors being reported. This were because the context destroy loop was blocking waiting on H2G space from inside an IRQ spinlock. There no was deadlock as such, it's just that the H2G queue was full of context destroy commands and GuC was taking a long time to process them. However, the kernel was seeing the large amount of time spent inside the IRQ lock as a dead CPU. Various Bad Things(tm) would then happen (heartbeat failures, CT deadlock errors, outstanding H2G WARNs, etc.). Re-working the loop to only acquire the spinlock around the list management (which is all it is meant to protect) rather than the entire destroy operation seems to fix all the above issues. v2: (John Harrison) - Fix typo in comment message Signed-off-by: John Harrison Signed-off-by: Matthew Brost Reviewed-by: Matthew Brost --- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 45 --- 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 36c2965db49b..96fcf869e3ff 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -2644,7 +2644,6 @@ static inline void guc_lrc_desc_unpin(struct intel_context *ce) unsigned long flags; bool disabled; - lockdep_assert_held(&guc->submission_state.lock); GEM_BUG_ON(!intel_gt_pm_is_awake(gt)); GEM_BUG_ON(!lrc_desc_registered(guc, ce->guc_id.id)); GEM_BUG_ON(ce != __get_context(guc, ce->guc_id.id)); @@ -2660,7 +2659,7 @@ static inline void guc_lrc_desc_unpin(struct intel_context *ce) } spin_unlock_irqrestore(&ce->guc_state.lock, flags); if (unlikely(disabled)) { - __release_guc_id(guc, ce); + release_guc_id(guc, ce); __guc_context_destroy(ce); return; } @@ -2694,36 +2693,48 @@ static void __guc_context_destroy(struct intel_context *ce) static void guc_flush_destroyed_contexts(struct intel_guc *guc) { - struct intel_context *ce, *cn; + struct intel_context *ce; unsigned long flags; GEM_BUG_ON(!submission_disabled(guc) && guc_submission_initialized(guc)); - spin_lock_irqsave(&guc->submission_state.lock, flags); - list_for_each_entry_safe(ce, cn, - &guc->submission_state.destroyed_contexts, - destroyed_link) { - list_del_init(&ce->destroyed_link); - __release_guc_id(guc, ce); + while (!list_empty(&guc->submission_state.destroyed_contexts)) { Are lockless false negatives a concern here - I mean this thread not seeing something just got added to the list? 
+ spin_lock_irqsave(&guc->submission_state.lock, flags); + ce = list_first_entry_or_null(&guc->submission_state.destroyed_contexts, + struct intel_context, + destroyed_link); + if (ce) + list_del_init(&ce->destroyed_link); + spin_unlock_irqrestore(&guc->submission_state.lock, flags); + + if (!ce) + break; + + release_guc_id(guc, ce); This looks suboptimal and in conflict with this part of the commit message: """ Re-working the loop to only acquire the spinlock around the list management (which is all it is meant to protect) rather than the entire destroy operation seems to fix all the above issues. """ Because you end up doing: ... loop ... spin_lock_irqsave(&guc->submission_state.lock, flags); list_del_init(&ce->destroyed_link); spin_unlock_irqrestore(&guc->submission_state.lock, flags); release_guc_id, which calls: spin_lock_irqsave(&guc->submission_state.lock, flags); __release_guc_id(guc, ce); spin_unlock_irqrestore(&guc->submission_state.lock, flags); So a) the lock seems to be protecting more than just list management, or release_guc_if is wrong, and b) the loop ends up with highly questionable hammering on the lock. Is there any point to this part of the patch? Or the only business end of the patch is below: __guc_context_destroy(ce); } - spin_unlock_irqrestore(&guc->submission_state.lock, flags); } static void deregister_destroyed_contexts(struct intel_guc *guc) { - struct intel_conte
Re: [Intel-gfx] [PATCH] drm/i915/guc: Log engine resets
On 21/12/2021 22:14, John Harrison wrote: On 12/21/2021 05:37, Tvrtko Ursulin wrote: On 20/12/2021 18:34, John Harrison wrote: On 12/20/2021 07:00, Tvrtko Ursulin wrote: On 17/12/2021 16:22, Matthew Brost wrote: On Fri, Dec 17, 2021 at 12:15:53PM +, Tvrtko Ursulin wrote: On 14/12/2021 15:07, Tvrtko Ursulin wrote: From: Tvrtko Ursulin Log engine resets done by the GuC firmware in the similar way it is done by the execlists backend. This way we have notion of where the hangs are before the GuC gains support for proper error capture. Ping - any interest to log this info? All there currently is a non-descriptive "[drm] GPU HANG: ecode 12:0:". Yea, this could be helpful. One suggestion below. Also, will GuC be reporting the reason for the engine reset at any point? We are working on the error state capture, presumably the registers will give a clue what caused the hang. As for the GuC providing a reason, that isn't defined in the interface but that is decent idea to provide a hint in G2H what the issue was. Let me run that by the i915 GuC developers / GuC firmware team and see what they think. The GuC does not do any hang analysis. So as far as GuC is concerned, the reason is pretty much always going to be pre-emption timeout. There are a few ways the pre-emption itself could be triggered but basically, if GuC resets an active context then it is because it did not pre-empt quickly enough when requested. Regards, Tvrtko Signed-off-by: Tvrtko Ursulin Cc: Matthew Brost Cc: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 9739da6f..51512123dc1a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -11,6 +11,7 @@ #include "gt/intel_context.h" #include "gt/intel_engine_pm.h" #include "gt/intel_engine_heartbeat.h" +#include "gt/intel_engine_user.h" #include "gt/intel_gpu_commands.h" #include "gt/intel_gt.h" #include "gt/intel_gt_clock_utils.h" @@ -3934,9 +3935,18 @@ static void capture_error_state(struct intel_guc *guc, { struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; - struct intel_engine_cs *engine = __context_to_physical_engine(ce); + struct intel_engine_cs *engine = ce->engine; intel_wakeref_t wakeref; + if (intel_engine_is_virtual(engine)) { + drm_notice(&i915->drm, "%s class, engines 0x%x; GuC engine reset\n", + intel_engine_class_repr(engine->class), + engine->mask); + engine = guc_virtual_get_sibling(engine, 0); + } else { + drm_notice(&i915->drm, "%s GuC engine reset\n", engine->name); Probably include the guc_id of the context too then? Is the guc id stable and useful on its own - who would be the user? The GuC id is the only thing that matters when trying to correlate KMD activity with a GuC log. So while it might not be of any use or interest to an end user, it is extremely important and useful to a kernel developer attempting to debug an issue. And that includes bug reports from end users that are hard to repro given that the standard error capture will include the GuC log. On the topic of GuC log - is there a tool in IGT (or will be) which will parse the bit saved in the error capture or how is that supposed to be used? Nope. However, Alan is currently working on supporting the GuC error capture mechanism. 
Prior to sending the reset notification to the KMD, the GuC will save a whole bunch of register state to a memory buffer and send a notification to the KMD that this is available. When we then get the actual reset notification, we need to match the two together and include a parsed, human-readable version of the GuC's capture state buffer in the sysfs error log output.

The GuC log should not be involved in this process. And note that any register dumps in the GuC log are limited in scope and only enabled at higher verbosity levels. Whereas the official state capture is based on a register list provided by the KMD and is available irrespective of debug CONFIG settings, verbosity levels, etc.

Hm, why should the GuC log not be involved now? I thought earlier you said:

"""
And that includes bug reports from end users that are hard to repro given that the standard error capture will include the GuC log.
"""

Hence I thought there would be a tool in IGT which would parse the part saved inside the error capture.

Also, note that the GuC really resets contexts rather than engines. What it reports back to i915 on a reset is simply the GuC id of the context. It is up to i915 to work back from that to determine engine instances/classes if required. And in the case of a virtual context, it is impossible to extract
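Coming back to Matt's earlier suggestion, printing the context's GuC id alongside the engine name could look roughly like this. A sketch only: the field spelling ce->guc_id.id is an assumption about the current struct intel_context layout, not something taken from the patch.

	} else {
		/* Sketch: also print the GuC id so the KMD message can be
		 * correlated with entries in the GuC log; ce->guc_id.id is
		 * an assumed field name. */
		drm_notice(&i915->drm, "%s GuC engine reset, guc_id %u\n",
			   engine->name, ce->guc_id.id);
	}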
[PATCH] drm/i915: Use trylock instead of blocking lock for __i915_gem_free_objects.
Convert free_work into delayed_work, similar to ttm, to allow converting the blocking lock in __i915_gem_free_objects to a trylock. Unlike ttm, the object should already be idle, as it's kept alive by a reference through struct i915_vma->active, which is dropped after all vmas are idle. Because of this, we can use no wait by default and, when the lock is contested, fall back to ttm's 10 ms delay. The trylock should only fail when the object is sharing its resv with other objects, and typically objects are not kept locked for a long time, so we can safely retry on failure.

Fixes: be7612fd6665 ("drm/i915: Require object lock when freeing pages during destruction")
Testcase: igt/gem_exec_alignment/pi*
Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 14 ++++++++++----
 drivers/gpu/drm/i915/i915_drv.h            |  4 ++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 39cd563544a5..d87b508b59b1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -331,7 +331,13 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 			continue;
 		}
 
-		i915_gem_object_lock(obj, NULL);
+		if (!i915_gem_object_trylock(obj, NULL)) {
+			/* busy, toss it back to the pile */
+			if (llist_add(&obj->freed, &i915->mm.free_list))
+				queue_delayed_work(i915->wq, &i915->mm.free_work, msecs_to_jiffies(10));
+			continue;
+		}
+
 		__i915_gem_object_pages_fini(obj);
 		i915_gem_object_unlock(obj);
 		__i915_gem_free_object(obj);
@@ -353,7 +359,7 @@ void i915_gem_flush_free_objects(struct drm_i915_private *i915)
 static void __i915_gem_free_work(struct work_struct *work)
 {
 	struct drm_i915_private *i915 =
-		container_of(work, struct drm_i915_private, mm.free_work);
+		container_of(work, struct drm_i915_private, mm.free_work.work);
 
 	i915_gem_flush_free_objects(i915);
 }
@@ -385,7 +391,7 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	 */
 
 	if (llist_add(&obj->freed, &i915->mm.free_list))
-		queue_work(i915->wq, &i915->mm.free_work);
+		queue_delayed_work(i915->wq, &i915->mm.free_work, 0);
 }
 
 void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
@@ -710,7 +716,7 @@ bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj,
 
 void i915_gem_init__objects(struct drm_i915_private *i915)
 {
-	INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
+	INIT_DELAYED_WORK(&i915->mm.free_work, __i915_gem_free_work);
 }
 
 void i915_objects_module_exit(void)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c8fddb7e61c9..beeb42a14aae 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -465,7 +465,7 @@ struct i915_gem_mm {
 	 * List of objects which are pending destruction.
 	 */
 	struct llist_head free_list;
-	struct work_struct free_work;
+	struct delayed_work free_work;
 	/**
 	 * Count of objects pending destructions. Used to skip needlessly
 	 * waiting on an RCU barrier if no objects are waiting to be freed.
@@ -1625,7 +1625,7 @@ static inline void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
 	 * armed the work again.
 	 */
 	while (atomic_read(&i915->mm.free_count)) {
-		flush_work(&i915->mm.free_work);
+		flush_delayed_work(&i915->mm.free_work);
 		flush_delayed_work(&i915->bdev.wq);
 		rcu_barrier();
 	}
-- 
2.34.1
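One detail worth calling out in the patch above: llist_add() returns true only when the node it adds turns the list non-empty, so exactly one caller per burst of frees ends up arming the worker. A minimal sketch of that idiom, reusing the same names as the patch:

	/* Lock-free deferred free (sketch): the thread whose llist_add()
	 * made the list non-empty is the one that queues the work;
	 * concurrent frees ride along on the already-armed worker. */
	if (llist_add(&obj->freed, &i915->mm.free_list))
		queue_delayed_work(i915->wq, &i915->mm.free_work, 0);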
[PATCH v2 1/1] drm/i915/dsi: Drop double check ACPI companion device for NULL
acpi_dev_get_resources() already performs the NULL pointer check on the ACPI companion device that is given as a function parameter. Thus, there is no need to duplicate this check in the caller.

Signed-off-by: Andy Shevchenko
---
v2: used LIST_HEAD() (Ville), initialized lookup directly on stack (Ville)
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c | 28 ++++++++++------------------
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
index 0da91849efde..da0bd056f3d3 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+++ b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
@@ -426,24 +426,16 @@ static void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
 				  const u16 slave_addr)
 {
 	struct drm_device *drm_dev = intel_dsi->base.base.dev;
-	struct device *dev = drm_dev->dev;
-	struct acpi_device *acpi_dev;
-	struct list_head resource_list;
-	struct i2c_adapter_lookup lookup;
-
-	acpi_dev = ACPI_COMPANION(dev);
-	if (acpi_dev) {
-		memset(&lookup, 0, sizeof(lookup));
-		lookup.slave_addr = slave_addr;
-		lookup.intel_dsi = intel_dsi;
-		lookup.dev_handle = acpi_device_handle(acpi_dev);
-
-		INIT_LIST_HEAD(&resource_list);
-		acpi_dev_get_resources(acpi_dev, &resource_list,
-				       i2c_adapter_lookup,
-				       &lookup);
-		acpi_dev_free_resource_list(&resource_list);
-	}
+	struct acpi_device *adev = ACPI_COMPANION(drm_dev->dev);
+	struct i2c_adapter_lookup lookup = {
+		.slave_addr = slave_addr,
+		.intel_dsi = intel_dsi,
+		.dev_handle = acpi_device_handle(adev),
+	};
+	LIST_HEAD(resource_list);
+
+	acpi_dev_get_resources(adev, &resource_list, i2c_adapter_lookup, &lookup);
+	acpi_dev_free_resource_list(&resource_list);
 }
 
 #else
 static inline void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
-- 
2.34.1
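For context, the reason the caller-side check is redundant is that acpi_dev_get_resources() bails out on a NULL companion itself. Paraphrased from memory of drivers/acpi/resource.c — the exact form may vary between kernel versions:

	/* inside acpi_dev_get_resources() (paraphrased, not verbatim) */
	if (!adev || !adev->handle)
		return -EINVAL;

And if memory serves, acpi_device_handle() is NULL-safe as well, so building the lookup structure on the stack before the call does not dereference a missing companion.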
Re: [PATCH] drm/amd/display: Fix the uninitialized variable in enable_stream_features()
Applied. Thanks!

Alex

On Fri, Dec 17, 2021 at 11:22 PM Yizhuo Zhai wrote:
>
> In function enable_stream_features(), the variable "old_downspread.raw"
> could be uninitialized if core_link_read_dpcd() fails; however, it is
> used in the later if statement, and further, core_link_write_dpcd()
> may write a random value, which is potentially unsafe.
>
> Fixes: 6016cd9dba0f ("drm/amd/display: add helper for enabling mst stream features")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Yizhuo Zhai
> ---
>  drivers/gpu/drm/amd/display/dc/core/dc_link.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> index c8457babfdea..fd5a0e7eb029 100644
> --- a/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> +++ b/drivers/gpu/drm/amd/display/dc/core/dc_link.c
> @@ -1844,6 +1844,8 @@ static void enable_stream_features(struct pipe_ctx *pipe_ctx)
>  	union down_spread_ctrl old_downspread;
>  	union down_spread_ctrl new_downspread;
>
> +	memset(&old_downspread, 0, sizeof(old_downspread));
> +
>  	core_link_read_dpcd(link, DP_DOWNSPREAD_CTRL,
>  			&old_downspread.raw, sizeof(old_downspread));
>
> --
> 2.25.1
>
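For what it's worth, an equivalent fix could use a designated initializer instead of memset(); a sketch, assuming .raw covers the union's full storage as the sizeof usage above suggests:

	/* Sketch: zero-initialize at declaration so the contents are
	 * defined even when core_link_read_dpcd() fails. */
	union down_spread_ctrl old_downspread = { .raw = 0 };
	union down_spread_ctrl new_downspread;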