Re: [PATCH RFC 4/4] bpf,cgroup,perf: extend bpf-cgroup to support tracepoint attachment
On Thu, Nov 18, 2021 at 11:33 PM Alexei Starovoitov wrote:
>
> On Thu, Nov 18, 2021 at 03:28:40PM -0500, Kenny Ho wrote:
> > +	for_each_possible_cpu(cpu) {
> > +		/* allocate first, connect the cgroup later */
> > +		events[i] = perf_event_create_kernel_counter(attr, cpu, NULL, NULL, NULL);
>
> This is a very heavy hammer for this task.
> There is really no need for perf_event to be created.
> Did you consider using raw_tp approach instead?

I came across raw_tp but I don't have a good understanding of it yet. Initially I was hoping perf event/tracepoint would be a stepping stone to raw tp, but that doesn't seem to be the case (and unfortunately I picked the perf event/tracepoint route to dive into first because I saw cgroup usage.) Can you confirm whether the following statements are true?

- is raw_tp related to writable tracepoints?
- are perf_event/tracepoint/kprobe/uprobe and fentry/fexit/raw_tp considered two separate 'things' (even though the purpose of both is tracing)?

> It doesn't need this heavy stuff.
> Also I suspect in follow up you'd be adding tracepoints to GPU code?
> Did you consider just leaving few __weak global functions in GPU code
> and let bpf progs attach to them as fentry?

There are already tracepoints in the GPU code. And I do like the fentry way of doing things more, but my head was very much focused on cgroup, and the tracepoint/kprobe path seems to have something for it. I suspected this would be a bit too heavy after seeing the scalability discussion, but I wasn't sure, so I whipped this up quickly to get some feedback (while learning more about perf/bpf/cgroup.)

> I suspect the true hierarchical nature of bpf-cgroup framework isn't
> necessary.
> The bpf program itself can filter for given cgroup.
> We have bpf_current_task_under_cgroup() and friends.

Is there a way to access cgroup local storage from a prog that is not attached to a bpf-cgroup? Although, I guess I can just store/read things using a map with the cgroup id as key. And with the bpf_get_current_ancestor_cgroup_id below I can just simulate the values being propagated if the hierarchy ends up being relevant.

Then again, is there a way to atomically update multiple elements of a map? I am trying to figure out how to support a multi-user, multi-app sharing use case (like user A given quota X and user B given quota Y, with apps 1 and 2 each having a quota assigned by A, and apps 8 and 9 each having a quota assigned by B.) Is there some kind of 'lock' mechanism for me to keep quotas 1, 2, X in sync? (Same for 8, 9, Y.)

> I suggest to sprinkle __weak empty funcs in GPU and see what
> you can do with it with fentry and bpf_current_task_under_cgroup.
> There is also bpf_get_current_ancestor_cgroup_id().
RE: [PATCH] drm/amd/amdgpu: move kfd post_reset out of reset_sriov function
[AMD Official Use Only]

Ping

-----Original Message-----
From: Liu, Shaoyun
Sent: Thursday, November 18, 2021 11:58 AM
To: amd-gfx@lists.freedesktop.org
Cc: Liu, Shaoyun
Subject: [PATCH] drm/amd/amdgpu: move kfd post_reset out of reset_sriov function

For SR-IOV XGMI configurations, the host driver will handle the hive reset, so on the guest side reset_sriov is only called once per device. This leaves kfd post_reset unbalanced with kfd pre_reset, since kfd pre_reset has already been moved out of the reset_sriov function. Move kfd post_reset out of the reset_sriov function to balance them.

Signed-off-by: shaoyunl
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 10c8008d1da0..9a9d5493c676 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4308,7 +4308,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev,
 	amdgpu_irq_gpu_reset_resume_helper(adev);
 	r = amdgpu_ib_ring_tests(adev);
-	amdgpu_amdkfd_post_reset(adev);

 error:
 	if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) {
@@ -5081,7 +5080,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 	tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter));
 	/* Actual ASIC resets if needed.*/
-	/* TODO Implement XGMI hive reset logic for SRIOV */
+	/* Host driver will handle XGMI hive reset for SRIOV */
 	if (amdgpu_sriov_vf(adev)) {
 		r = amdgpu_device_reset_sriov(adev, job ? false : true);
 		if (r)
@@ -5141,8 +5140,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
 skip_sched_resume:
 	list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
-		/* unlock kfd: SRIOV would do it separately */
-		if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev))
+		/* unlock kfd */
+		if (!need_emergency_restart)
 			amdgpu_amdkfd_post_reset(tmp_adev);
 		/* kfd_post_reset will do nothing if kfd device is not initialized,
--
2.17.1
Re: [PATCH 1/2] drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well
Seems reasonable. Series is

Acked-by: Luben Tuikov

Regards,
Luben

On 2021-11-18 14:54, Alex Deucher wrote:
> Apply the same check we do for dGPUs for APUs as well.
>
> Signed-off-by: Alex Deucher
> ---
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> index e7dfeb466a0e..dbe7442fb25c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
> @@ -7707,8 +7707,19 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
> 	switch (adev->ip_versions[GC_HWIP][0]) {
> 	case IP_VERSION(10, 3, 1):
> 	case IP_VERSION(10, 3, 3):
> -		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Vangogh) |
> -			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
> +		preempt_disable();
> +		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh);
> +		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Vangogh);
> +		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh);
> +		/* The SMUIO TSC clock frequency is 100MHz, which sets 32-bit carry over
> +		 * roughly every 42 seconds.
> +		 */
> +		if (hi_check != clock_hi) {
> +			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Vangogh);
> +			clock_hi = hi_check;
> +		}
> +		preempt_enable();
> +		clock = clock_lo | (clock_hi << 32ULL);
> 		break;
> 	default:
> 		preempt_disable();
Re: [PATCH] drm/amdgpu/pm: clean up some inconsistent indenting
Applied. Thanks!

Alex

On Thu, Nov 18, 2021 at 5:57 AM Jiapeng Chong wrote:
>
> Eliminate the following smatch warning:
>
> drivers/gpu/drm/amd/amdgpu/../pm/powerplay/amd_powerplay.c:1554 pp_asic_reset_mode_2() warn: inconsistent indenting.
>
> Reported-by: Abaci Robot
> Signed-off-by: Jiapeng Chong
> ---
>  drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> index 8d796ed..20cb234 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
> @@ -1551,7 +1551,7 @@ static int pp_set_ppfeature_status(void *handle, uint64_t ppfeature_masks)
>  static int pp_asic_reset_mode_2(void *handle)
>  {
>  	struct pp_hwmgr *hwmgr = handle;
> -	 int ret = 0;
> +	int ret = 0;
>
>  	if (!hwmgr || !hwmgr->pm_en)
>  		return -EINVAL;
> --
> 1.8.3.1
Re: [radeon] connector_info_from_object_table
On Thu, Nov 18, 2021 at 11:37 AM Amol wrote:
>
> Hello,
>
> The function radeon_get_atom_connector_info_from_object_table,
> at location [1], ends up parsing ATOM_COMMON_TABLE_HEADER
> as ATOM_COMMON_RECORD_HEADER if
> enc_obj->asObjects[k].usRecordOffset is zero. It is found to be zero
> in the BIOS found at [2].
>
> Thankfully, the loop that follows exits immediately since ucRecordSize
> is 0 because (ATOM_COMMON_TABLE_HEADER.usStructureSize & 0xff00) is zero.
> But, with suitable values in usStructureSize, the loop can be made to
> run and parse garbage.
>
> A similar loop exists when parsing the conn objects.

Can you send a patch to make it more robust?

Thanks,

Alex

> -Amol
>
> [1] https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/radeon/radeon_atombios.c#L652
> [2] https://www.techpowerup.com/vgabios/211981/211981
Re: [PATCH] drm/amdgpu: Declare Unpin BO api as static
[Public]

Reviewed-by: Alex Deucher

From: amd-gfx on behalf of Ramesh Errabolu
Sent: Thursday, November 18, 2021 5:11 PM
To: amd-gfx@lists.freedesktop.org
Cc: Errabolu, Ramesh ; kernel test robot
Subject: [PATCH] drm/amdgpu: Declare Unpin BO api as static

Fixes a warning reported by the kernel test robot.

Reported-by: kernel test robot
Signed-off-by: Ramesh Errabolu
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 90b985436878..3463e0d4e5ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1332,7 +1332,7 @@ static int amdgpu_amdkfd_gpuvm_pin_bo(struct amdgpu_bo *bo, u32 domain)
  * - All other BO types (GTT, VRAM, MMIO and DOORBELL) will have their
  *   PIN count decremented. Calls to UNPIN must balance calls to PIN
  */
-void amdgpu_amdkfd_gpuvm_unpin_bo(struct amdgpu_bo *bo)
+static void amdgpu_amdkfd_gpuvm_unpin_bo(struct amdgpu_bo *bo)
 {
 	int ret = 0;
--
2.31.1
[PATCH] drm/amdgpu: Declare Unpin BO api as static
Fixes a warning reported by the kernel test robot.

Reported-by: kernel test robot
Signed-off-by: Ramesh Errabolu
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 90b985436878..3463e0d4e5ca 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1332,7 +1332,7 @@ static int amdgpu_amdkfd_gpuvm_pin_bo(struct amdgpu_bo *bo, u32 domain)
  * - All other BO types (GTT, VRAM, MMIO and DOORBELL) will have their
  *   PIN count decremented. Calls to UNPIN must balance calls to PIN
  */
-void amdgpu_amdkfd_gpuvm_unpin_bo(struct amdgpu_bo *bo)
+static void amdgpu_amdkfd_gpuvm_unpin_bo(struct amdgpu_bo *bo)
 {
 	int ret = 0;
--
2.31.1
RE: [PATCH] drm/amdkfd: Remove unused entries in table
[AMD Official Use Only]

Reviewed-by: Graham Sider

> Remove unused entries in the kfd_device_info table: num_sdma_engines
> and num_xgmi_sdma_engines. They are calculated in
> kfd_get_num_sdma_engines and kfd_get_num_xgmi_sdma_engines instead.
>
> Signed-off-by: Amber Lin
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 58 -
>  drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  2 -
>  2 files changed, 60 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> index 3fea47e37c17..e1294fba0c26 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
> @@ -68,8 +68,6 @@ static const struct kfd_device_info kaveri_device_info = {
> 	.supports_cwsr = false,
> 	.needs_iommu_device = true,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -87,8 +85,6 @@ static const struct kfd_device_info carrizo_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = true,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -105,8 +101,6 @@ static const struct kfd_device_info raven_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = true,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 1,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
> #endif
>
> @@ -126,8 +120,6 @@ static const struct kfd_device_info hawaii_device_info = {
> 	.supports_cwsr = false,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
> #endif
>
> @@ -145,8 +137,6 @@ static const struct kfd_device_info tonga_device_info = {
> 	.supports_cwsr = false,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -163,8 +153,6 @@ static const struct kfd_device_info fiji_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -181,8 +169,6 @@ static const struct kfd_device_info fiji_vf_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -200,8 +186,6 @@ static const struct kfd_device_info polaris10_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -218,8 +202,6 @@ static const struct kfd_device_info polaris10_vf_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -236,8 +218,6 @@ static const struct kfd_device_info polaris11_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -254,8 +234,6 @@ static const struct kfd_device_info polaris12_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -272,8 +250,6 @@ static const struct kfd_device_info vegam_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = true,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -290,8 +266,6 @@ static const struct kfd_device_info vega10_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engines = 2,
> -	.num_xgmi_sdma_engines = 0,
> 	.num_sdma_queues_per_engine = 2,
> };
>
> @@ -308,8 +282,6 @@ static const struct kfd_device_info vega10_vf_device_info = {
> 	.supports_cwsr = true,
> 	.needs_iommu_device = false,
> 	.needs_pci_atomics = false,
> -	.num_sdma_engin
[PATCH RFC 0/4] Add ability to attach bpf programs to a tracepoint inside a cgroup
Per an earlier discussion last year [1], I have been looking for a mechanism to a) collect resource usage for devices (GPUs for now, but there could be other device types in the future) and b) possibly enforce some of those resource usages. An obvious mechanism is cgroup, but there is too much diversity in GPU hardware architecture to have a common cgroup interface at this point. An alternative is to leverage tracepoints with a bpf program inside a cgroup hierarchy, for usage collection and for enforcement (via writable tracepoints.) This is a prototype of that idea. It is incomplete, but I would like to solicit some feedback before continuing, to make sure I am going down the right path.

This prototype is built on my understanding of the following:

- tracepoints (and kprobes, uprobes) are associated with perf events
- perf events/tracepoints can be hooks for bpf progs, but those bpf progs are not part of the cgroup hierarchy
- bpf progs can be attached to the cgroup hierarchy, with cgroup local storage and other benefits
- separately, the perf subsystem has a cgroup controller (perf cgroup) that allows perf events to be triggered with a cgroup filter

So the key idea of this RFC is to leverage the hierarchical organization of bpf-cgroup for the purpose of perf events/tracepoints.

==Known unresolved topics (feedback very much welcome)==

Storage: I came across the idea of "preallocated" memory for bpf hash maps/storage to avoid deadlock [2], but I don't have a good understanding of it currently. If the existing bpf_cgroup_storage_type is not considered pre-allocated, then I am thinking we can introduce a new type, but I am not sure if this is needed yet.

Scalability: A scalability concern has been raised about perf cgroup [3], and there seems to be a recent solution to it with bperf [4]. This RFC does not change the status quo on the scalability question, but if I understand the bperf idea correctly, this RFC may have some similarity to it.
[1] https://lore.kernel.org/netdev/yjxrhxiykyebd...@slm.duckdns.org/T/#m52bc26bbbf16131c48e6b34d875c87660943c452
[2] https://lwn.net/Articles/679074/
[3] https://www.linuxplumbersconf.org/event/4/contributions/291/attachments/313/528/Linux_Plumbers_Conference_2019.pdf
[4] https://linuxplumbersconf.org/event/11/contributions/899/

Kenny Ho (4):
  cgroup, perf: Add ability to connect to perf cgroup from other cgroup controller
  bpf, perf: add ability to attach complete array of bpf prog to perf event
  bpf,cgroup,tracing: add new BPF_PROG_TYPE_CGROUP_TRACEPOINT
  bpf,cgroup,perf: extend bpf-cgroup to support tracepoint attachment

 include/linux/bpf-cgroup.h   | 17 +--
 include/linux/bpf_types.h    |  4 ++
 include/linux/cgroup.h       |  2 +
 include/linux/perf_event.h   |  6 +++
 include/linux/trace_events.h |  9 ++++
 include/uapi/linux/bpf.h     |  2 +
 kernel/bpf/cgroup.c          | 96 +---
 kernel/bpf/syscall.c         |  4 ++
 kernel/cgroup/cgroup.c       | 13 ++---
 kernel/events/core.c         | 62 +++
 kernel/trace/bpf_trace.c     | 36 ++
 11 files changed, 222 insertions(+), 29 deletions(-)

--
2.25.1
[PATCH RFC 2/4] bpf, perf: add ability to attach complete array of bpf prog to perf event
Change-Id: Ie2580c3a71e2a5116551879358cb5304b04d3838
Signed-off-by: Kenny Ho
---
 include/linux/trace_events.h |  9 +++++++++
 kernel/trace/bpf_trace.c     | 28 ++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 3e475eeb5a99..5cfe3d08966c 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -725,6 +725,8 @@ trace_trigger_soft_disabled(struct trace_event_file *file)
 #ifdef CONFIG_BPF_EVENTS
 unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx);
+int perf_event_attach_bpf_prog_array(struct perf_event *event,
+				     struct bpf_prog_array *new_array);
 int perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog, u64 bpf_cookie);
 void perf_event_detach_bpf_prog(struct perf_event *event);
 int perf_event_query_prog_array(struct perf_event *event, void __user *info);
@@ -741,6 +743,13 @@ static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *c
 	return 1;
 }

+static inline int
+perf_event_attach_bpf_prog_array(struct perf_event *event,
+				 struct bpf_prog_array *new_array)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline int
 perf_event_attach_bpf_prog(struct perf_event *event, struct bpf_prog *prog, u64 bpf_cookie)
 {
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 6b3153841a33..8addd10202c2 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1802,6 +1802,34 @@ static DEFINE_MUTEX(bpf_event_mutex);

 #define BPF_TRACE_MAX_PROGS 64

+int perf_event_attach_bpf_prog_array(struct perf_event *event,
+				     struct bpf_prog_array *new_array)
+{
+	struct bpf_prog_array_item *item;
+	struct bpf_prog_array *old_array;
+
+	if (!new_array)
+		return -EINVAL;
+
+	if (bpf_prog_array_length(new_array) >= BPF_TRACE_MAX_PROGS)
+		return -E2BIG;
+
+	if (!trace_kprobe_on_func_entry(event->tp_event) ||
+	    !trace_kprobe_error_injectable(event->tp_event))
+		for (item = new_array->items; item->prog; item++)
+			if (item->prog->kprobe_override)
+				return -EINVAL;
+
+	mutex_lock(&bpf_event_mutex);
+
+	old_array = bpf_event_rcu_dereference(event->tp_event->prog_array);
+	rcu_assign_pointer(event->tp_event->prog_array, new_array);
+	bpf_prog_array_free(old_array);
+
+	mutex_unlock(&bpf_event_mutex);
+	return 0;
+}
+
 int perf_event_attach_bpf_prog(struct perf_event *event,
 			       struct bpf_prog *prog, u64 bpf_cookie)
--
2.25.1
[PATCH RFC 1/4] cgroup, perf: Add ability to connect to perf cgroup from other cgroup controller
This provides the ability to allocate a cgroup-specific perf_event from bpf-cgroup in a later patch.

Change-Id: I13aa7f3dfc2883ba3663c0b94744a6169504bbd8
Signed-off-by: Kenny Ho
---
 include/linux/cgroup.h     |  2 ++
 include/linux/perf_event.h |  2 ++
 kernel/cgroup/cgroup.c     |  4 ++--
 kernel/events/core.c       | 17 +++++++++++++++++
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 75c151413fda..1754e33cfe5e 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -97,6 +97,8 @@ extern struct css_set init_css_set;

 bool css_has_online_children(struct cgroup_subsys_state *css);
 struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss);
+struct cgroup_subsys_state *cgroup_tryget_css(struct cgroup *cgroup,
+					      struct cgroup_subsys *ss);
 struct cgroup_subsys_state *cgroup_e_css(struct cgroup *cgroup,
 					 struct cgroup_subsys *ss);
 struct cgroup_subsys_state *cgroup_get_e_css(struct cgroup *cgroup,
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0cbc5dfe1110..9c440db65c18 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -926,6 +926,8 @@ struct perf_cgroup {
 	struct perf_cgroup_info	__percpu *info;
 };

+extern struct perf_cgroup *cgroup_tryget_perf_cgroup(struct cgroup *cgrp);
+
 /*
  * Must ensure cgroup is pinned (css_get) before calling
  * this function. In other words, we cannot call this function
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 570b0c97392a..a645b212b69b 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -495,8 +495,8 @@ static struct cgroup_subsys_state *cgroup_css(struct cgroup *cgrp,
  * Find and get @cgrp's css associated with @ss. If the css doesn't exist
  * or is offline, %NULL is returned.
  */
-static struct cgroup_subsys_state *cgroup_tryget_css(struct cgroup *cgrp,
-						     struct cgroup_subsys *ss)
+struct cgroup_subsys_state *cgroup_tryget_css(struct cgroup *cgrp,
+					      struct cgroup_subsys *ss)
 {
 	struct cgroup_subsys_state *css;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 20367196fa9a..d34e00749c9b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -941,6 +941,18 @@ static int perf_cgroup_ensure_storage(struct perf_event *event,
 	return ret;
 }

+struct perf_cgroup *cgroup_tryget_perf_cgroup(struct cgroup *cgrp)
+{
+	struct cgroup_subsys_state *css;
+
+	css = cgroup_tryget_css(cgrp, &perf_event_cgrp_subsys);
+
+	if (!css)
+		return NULL;
+
+	return container_of(css, struct perf_cgroup, css);
+}
+
 static inline int perf_cgroup_connect(int fd, struct perf_event *event,
 				      struct perf_event_attr *attr,
 				      struct perf_event *group_leader)
@@ -1080,6 +1092,11 @@ static inline void perf_cgroup_sched_in(struct task_struct *prev,
 {
 }

+struct perf_cgroup *cgroup_tryget_perf_cgroup(struct cgroup *cgrp)
+{
+	return NULL;
+}
+
 static inline int perf_cgroup_connect(pid_t pid, struct perf_event *event,
 				      struct perf_event_attr *attr,
 				      struct perf_event *group_leader)
--
2.25.1
[PATCH] drm/amdkfd: Remove unused entries in table
Remove unused entries in the kfd_device_info table: num_sdma_engines
and num_xgmi_sdma_engines. They are calculated in kfd_get_num_sdma_engines
and kfd_get_num_xgmi_sdma_engines instead.

Signed-off-by: Amber Lin
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 58 -
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   |  2 -
 2 files changed, 60 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 3fea47e37c17..e1294fba0c26 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -68,8 +68,6 @@ static const struct kfd_device_info kaveri_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = true,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -87,8 +85,6 @@ static const struct kfd_device_info carrizo_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = true,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -105,8 +101,6 @@ static const struct kfd_device_info raven_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = true,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 1,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };
 #endif

@@ -126,8 +120,6 @@ static const struct kfd_device_info hawaii_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };
 #endif

@@ -145,8 +137,6 @@ static const struct kfd_device_info tonga_device_info = {
 	.supports_cwsr = false,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -163,8 +153,6 @@ static const struct kfd_device_info fiji_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -181,8 +169,6 @@ static const struct kfd_device_info fiji_vf_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -200,8 +186,6 @@ static const struct kfd_device_info polaris10_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -218,8 +202,6 @@ static const struct kfd_device_info polaris10_vf_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -236,8 +218,6 @@ static const struct kfd_device_info polaris11_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -254,8 +234,6 @@ static const struct kfd_device_info polaris12_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -272,8 +250,6 @@ static const struct kfd_device_info vegam_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = true,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -290,8 +266,6 @@ static const struct kfd_device_info vega10_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -308,8 +282,6 @@ static const struct kfd_device_info vega10_vf_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
-	.num_xgmi_sdma_engines = 0,
 	.num_sdma_queues_per_engine = 2,
 };

@@ -326,8 +298,6 @@ static const struct kfd_device_info vega12_device_info = {
 	.supports_cwsr = true,
 	.needs_iommu_device = false,
 	.needs_pci_atomics = false,
-	.num_sdma_engines = 2,
[PATCH 1/2] drm/amdgpu/gfx10: add wraparound gpu counter check for APUs as well
Apply the same check we do for dGPUs for APUs as well.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index e7dfeb466a0e..dbe7442fb25c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -7707,8 +7707,19 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 	switch (adev->ip_versions[GC_HWIP][0]) {
 	case IP_VERSION(10, 3, 1):
 	case IP_VERSION(10, 3, 3):
-		clock = (uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Vangogh) |
-			((uint64_t)RREG32_SOC15(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh) << 32ULL);
+		preempt_disable();
+		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh);
+		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Vangogh);
+		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Vangogh);
+		/* The SMUIO TSC clock frequency is 100MHz, which sets 32-bit carry over
+		 * roughly every 42 seconds.
+		 */
+		if (hi_check != clock_hi) {
+			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Vangogh);
+			clock_hi = hi_check;
+		}
+		preempt_enable();
+		clock = clock_lo | (clock_hi << 32ULL);
 		break;
 	default:
 		preempt_disable();
--
2.31.1
[PATCH 2/2] drm/amdgpu/gfx9: switch to golden tsc registers for renoir+
Renoir and newer gfx9 APUs have a new TSC register that is not part of the gfxoff tile, so it can be read without needing to disable gfxoff.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 46 ---
 1 file changed, 35 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 2d0bc1c91426..154fa6facf19 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -140,6 +140,11 @@ MODULE_FIRMWARE("amdgpu/aldebaran_rlc.bin");
 #define mmTCP_CHAN_STEER_5_ARCT				0x0b0c
 #define mmTCP_CHAN_STEER_5_ARCT_BASE_IDX		0

+#define mmGOLDEN_TSC_COUNT_UPPER_Renoir			0x0025
+#define mmGOLDEN_TSC_COUNT_UPPER_Renoir_BASE_IDX	1
+#define mmGOLDEN_TSC_COUNT_LOWER_Renoir			0x0026
+#define mmGOLDEN_TSC_COUNT_LOWER_Renoir_BASE_IDX	1
+
 enum ta_ras_gfx_subblock {
 	/*CPC*/
 	TA_RAS_BLOCK__GFX_CPC_INDEX_START = 0,
@@ -4240,19 +4245,38 @@ static uint64_t gfx_v9_0_kiq_read_clock(struct amdgpu_device *adev)
 static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device *adev)
 {
-	uint64_t clock;
+	uint64_t clock, clock_lo, clock_hi, hi_check;

-	amdgpu_gfx_off_ctrl(adev, false);
-	mutex_lock(&adev->gfx.gpu_clock_mutex);
-	if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 0, 1) && amdgpu_sriov_runtime(adev)) {
-		clock = gfx_v9_0_kiq_read_clock(adev);
-	} else {
-		WREG32_SOC15(GC, 0, mmRLC_CAPTURE_GPU_CLOCK_COUNT, 1);
-		clock = (uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_LSB) |
-			((uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_MSB) << 32ULL);
+	switch (adev->ip_versions[GC_HWIP][0]) {
+	case IP_VERSION(9, 3, 0):
+		preempt_disable();
+		clock_hi = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Renoir);
+		clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Renoir);
+		hi_check = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_UPPER_Renoir);
+		/* The SMUIO TSC clock frequency is 100MHz, which sets 32-bit carry over
+		 * roughly every 42 seconds.
+		 */
+		if (hi_check != clock_hi) {
+			clock_lo = RREG32_SOC15_NO_KIQ(SMUIO, 0, mmGOLDEN_TSC_COUNT_LOWER_Renoir);
+			clock_hi = hi_check;
+		}
+		preempt_enable();
+		clock = clock_lo | (clock_hi << 32ULL);
+		break;
+	default:
+		amdgpu_gfx_off_ctrl(adev, false);
+		mutex_lock(&adev->gfx.gpu_clock_mutex);
+		if (adev->ip_versions[GC_HWIP][0] == IP_VERSION(9, 0, 1) && amdgpu_sriov_runtime(adev)) {
+			clock = gfx_v9_0_kiq_read_clock(adev);
+		} else {
+			WREG32_SOC15(GC, 0, mmRLC_CAPTURE_GPU_CLOCK_COUNT, 1);
+			clock = (uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_LSB) |
+				((uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_MSB) << 32ULL);
+		}
+		mutex_unlock(&adev->gfx.gpu_clock_mutex);
+		amdgpu_gfx_off_ctrl(adev, true);
+		break;
 	}
-	mutex_unlock(&adev->gfx.gpu_clock_mutex);
-	amdgpu_gfx_off_ctrl(adev, true);
 	return clock;
 }
--
2.31.1
Re: [PATCH v1 1/9] mm: add zone device coherent type memory support
Am 2021-11-18 um 1:53 a.m. schrieb Alistair Popple: > On Tuesday, 16 November 2021 6:30:18 AM AEDT Alex Sierra wrote: >> Device memory that is cache coherent from device and CPU point of view. >> This is used on platforms that have an advanced system bus (like CAPI >> or CXL). Any page of a process can be migrated to such memory. However, >> no one should be allowed to pin such memory so that it can always be >> evicted. >> >> Signed-off-by: Alex Sierra >> --- >> include/linux/memremap.h | 8 >> include/linux/mm.h | 9 + >> mm/memcontrol.c | 6 +++--- >> mm/memory-failure.c | 6 +- >> mm/memremap.c| 5 - >> mm/migrate.c | 21 + >> 6 files changed, 42 insertions(+), 13 deletions(-) >> >> diff --git a/include/linux/memremap.h b/include/linux/memremap.h >> index c0e9d35889e8..ff4d398edf35 100644 >> --- a/include/linux/memremap.h >> +++ b/include/linux/memremap.h >> @@ -39,6 +39,13 @@ struct vmem_altmap { >> * A more complete discussion of unaddressable memory may be found in >> * include/linux/hmm.h and Documentation/vm/hmm.rst. >> * >> + * MEMORY_DEVICE_COHERENT: >> + * Device memory that is cache coherent from device and CPU point of view. >> This >> + * is used on platforms that have an advanced system bus (like CAPI or >> CXL). A >> + * driver can hotplug the device memory using ZONE_DEVICE and with that >> memory >> + * type. Any page of a process can be migrated to such memory. However no >> one >> + * should be allowed to pin such memory so that it can always be evicted. >> + * >> * MEMORY_DEVICE_FS_DAX: >> * Host memory that has similar access semantics as System RAM i.e. DMA >> * coherent and supports page pinning. 
In support of coordinating page >> @@ -59,6 +66,7 @@ struct vmem_altmap { >> enum memory_type { >> /* 0 is reserved to catch uninitialized type fields */ >> MEMORY_DEVICE_PRIVATE = 1, >> +MEMORY_DEVICE_COHERENT, >> MEMORY_DEVICE_FS_DAX, >> MEMORY_DEVICE_GENERIC, >> MEMORY_DEVICE_PCI_P2PDMA, >> diff --git a/include/linux/mm.h b/include/linux/mm.h >> index 73a52aba448f..d23b6ebd2177 100644 >> --- a/include/linux/mm.h >> +++ b/include/linux/mm.h >> @@ -1162,6 +1162,7 @@ static inline bool page_is_devmap_managed(struct page >> *page) >> return false; >> switch (page->pgmap->type) { >> case MEMORY_DEVICE_PRIVATE: >> +case MEMORY_DEVICE_COHERENT: >> case MEMORY_DEVICE_FS_DAX: >> return true; >> default: >> @@ -1191,6 +1192,14 @@ static inline bool is_device_private_page(const >> struct page *page) >> page->pgmap->type == MEMORY_DEVICE_PRIVATE; >> } >> >> +static inline bool is_device_page(const struct page *page) >> +{ >> +return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && >> +is_zone_device_page(page) && >> +(page->pgmap->type == MEMORY_DEVICE_PRIVATE || >> +page->pgmap->type == MEMORY_DEVICE_COHERENT); >> +} >> + >> static inline bool is_pci_p2pdma_page(const struct page *page) >> { >> return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && >> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >> index 6da5020a8656..e0275ebe1198 100644 >> --- a/mm/memcontrol.c >> +++ b/mm/memcontrol.c >> @@ -5695,8 +5695,8 @@ static int mem_cgroup_move_account(struct page *page, >> * 2(MC_TARGET_SWAP): if the swap entry corresponding to this pte is a >> * target for charge migration. if @target is not NULL, the entry is >> stored >> * in target->ent. >> - * 3(MC_TARGET_DEVICE): like MC_TARGET_PAGE but page is >> MEMORY_DEVICE_PRIVATE >> - * (so ZONE_DEVICE page and thus not on the lru). >> + * 3(MC_TARGET_DEVICE): like MC_TARGET_PAGE but page is >> MEMORY_DEVICE_COHERENT >> + * or MEMORY_DEVICE_PRIVATE (so ZONE_DEVICE page and thus not on the >> lru). 
>> * For now we such page is charge like a regular page would be as for >> all >> * intent and purposes it is just special memory taking the place of a >> * regular page. >> @@ -5730,7 +5730,7 @@ static enum mc_target_type get_mctgt_type(struct >> vm_area_struct *vma, >> */ >> if (page_memcg(page) == mc.from) { >> ret = MC_TARGET_PAGE; >> -if (is_device_private_page(page)) >> +if (is_device_page(page)) >> ret = MC_TARGET_DEVICE; >> if (target) >> target->page = page; >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >> index 3e6449f2102a..51b55fc5172c 100644 >> --- a/mm/memory-failure.c >> +++ b/mm/memory-failure.c >> @@ -1554,12 +1554,16 @@ static int memory_failure_dev_pagemap(unsigned long >> pfn, int flags, >> goto unlock; >> } >> >> -if (pgmap->type == MEMORY_DEVICE_PRIVATE) { >> +switch (pgmap->type) { >> +case MEMORY_DEVICE_PRIVATE: >> +case MEMORY_DEVICE_COHERE
[RFC 3/3] drm/amd/pm: Add debugfs info for STB
Add debugfs hook. Signed-off-by: Andrey Grodzovsky Reviewed-by: Lijo Lazar Reviewed-by: Luben Tuikov --- drivers/gpu/drm/amd/pm/amdgpu_pm.c| 2 + drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 1 + drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 86 +++ 3 files changed, 89 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c index 41472ed99253..49df4c20f09e 100644 --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -3759,5 +3759,7 @@ void amdgpu_debugfs_pm_init(struct amdgpu_device *adev) adev, &amdgpu_debugfs_pm_prv_buffer_fops, adev->pm.smu_prv_buffer_size); + + amdgpu_smu_stb_debug_fs_init(adev); #endif } diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 4301403af761..f738f7dc20c9 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -1420,6 +1420,7 @@ int smu_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event, uint64_t event_arg); int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); int smu_stb_collect_info(struct smu_context *smu, void *buff, uint32_t size); +void amdgpu_smu_stb_debug_fs_init(struct amdgpu_device *adev); #endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index 97bafba4c5c9..f5e739d40b04 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -3193,3 +3193,89 @@ int smu_stb_collect_info(struct smu_context *smu, void *buf, uint32_t size) */ return smu->ppt_funcs->stb_collect_info(smu, buf, size); } + +#if defined(CONFIG_DEBUG_FS) + +static int smu_stb_debugfs_open(struct inode *inode, struct file *filp) +{ + struct amdgpu_device *adev = filp->f_inode->i_private; + struct smu_context *smu = &adev->smu; + unsigned char *buf; + int r; + + buf = kvmalloc_array(smu->stb_context.stb_buf_size, sizeof(*buf), GFP_KERNEL); + if (!buf) + return -ENOMEM; + + 
r = smu_stb_collect_info(smu, buf, smu->stb_context.stb_buf_size); + if (r) + goto out; + + filp->private_data = buf; + + return 0; + +out: + kvfree(buf); + return r; +} + +static ssize_t smu_stb_debugfs_read(struct file *filp, char __user *buf, size_t size, + loff_t *pos) +{ + struct amdgpu_device *adev = filp->f_inode->i_private; + struct smu_context *smu = &adev->smu; + + + if (!filp->private_data) + return -EINVAL; + + return simple_read_from_buffer(buf, + size, + pos, filp->private_data, + smu->stb_context.stb_buf_size); +} + +static int smu_stb_debugfs_release(struct inode *inode, struct file *filp) +{ + kvfree(filp->private_data); + filp->private_data = NULL; + + return 0; +} + +/* + * We have to define not only read method but also + * open and release because .read takes up to PAGE_SIZE + * data each time so and so is invoked multiple times. + * We allocate the STB buffer in .open and release it + * in .release + */ +static const struct file_operations smu_stb_debugfs_fops = { + .owner = THIS_MODULE, + .open = smu_stb_debugfs_open, + .read = smu_stb_debugfs_read, + .release = smu_stb_debugfs_release, + .llseek = default_llseek, +}; + +#endif + +void amdgpu_smu_stb_debug_fs_init(struct amdgpu_device *adev) +{ +#if defined(CONFIG_DEBUG_FS) + + struct smu_context *smu = &adev->smu; + + if (!smu->stb_context.stb_buf_size) + return; + + debugfs_create_file_size("amdgpu_smu_stb_dump", + S_IRUSR, + adev_to_drm(adev)->primary->debugfs_root, + adev, + &smu_stb_debugfs_fops, + smu->stb_context.stb_buf_size); +#endif + +} -- 2.25.1
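The read handler above leans on `simple_read_from_buffer()`, which copies at most `size` bytes starting at `*pos` from the preallocated buffer and advances `*pos`; that is why `.open` can allocate and fill the buffer once while `.read` is invoked repeatedly in PAGE_SIZE chunks. A userspace model of that contract (a sketch of the semantics, not the kernel implementation, which also handles user-space copying):

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to 'count' bytes from 'from' (of size 'available') starting at
 * *pos into 'to'. Returns the number of bytes copied (0 at end of buffer)
 * and advances *pos, mimicking simple_read_from_buffer().
 */
size_t read_from_buffer(void *to, size_t count, long *pos,
			const void *from, size_t available)
{
	long p = *pos;

	if (p < 0 || (size_t)p >= available)
		return 0;

	size_t n = available - (size_t)p;
	if (n > count)
		n = count;

	memcpy(to, (const char *)from + p, n);
	*pos = p + (long)n;
	return n;
}
```

Successive calls with the same `pos` pointer walk through the buffer until it is exhausted, which is exactly the loop a userspace `cat` of the debugfs file performs.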
[RFC 2/3] drm/amd/pm: Add STB support in sienna_cichlid
Add STB implementation for sienna_cichlid Signed-off-by: Andrey Grodzovsky Reviewed-by: Lijo Lazar Reviewed-by: Luben Tuikov --- .../amd/include/asic_reg/mp/mp_11_0_offset.h | 7 +++ .../amd/include/asic_reg/mp/mp_11_0_sh_mask.h | 12 .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 55 +++ 3 files changed, 74 insertions(+) diff --git a/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_offset.h b/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_offset.h index 6d0052ce6bed..da6d380c948b 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_offset.h +++ b/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_offset.h @@ -354,5 +354,12 @@ #define mmMP1_SMN_EXT_SCRATCH7 0x03c7 #define mmMP1_SMN_EXT_SCRATCH7_BASE_IDX 0 +/* + * addressBlock: mp_SmuMp1Pub_MmuDec + * base address: 0x0 + */ +#define smnMP1_PMI_3_START 0x3030204 +#define smnMP1_PMI_3_FIFO 0x3030208 +#define smnMP1_PMI_3 0x3030600 #endif diff --git a/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_sh_mask.h index 136fb5de6a4c..a5ae2a801254 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_sh_mask.h +++ b/drivers/gpu/drm/amd/include/asic_reg/mp/mp_11_0_sh_mask.h @@ -959,5 +959,17 @@ #define MP1_SMN_EXT_SCRATCH7__DATA__SHIFT 0x0 #define MP1_SMN_EXT_SCRATCH7__DATA_MASK 0xL +// MP1_PMI_3_START +#define MP1_PMI_3_START__ENABLE_MASK 0x8000L +// MP1_PMI_3_FIFO +#define MP1_PMI_3_FIFO__DEPTH_MASK 0x0fffL + +// MP1_PMI_3_START +#define MP1_PMI_3_START__ENABLE__SHIFT 0x001f +// MP1_PMI_3_FIFO +#define MP1_PMI_3_FIFO__DEPTH__SHIFT 0x + + + #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c index b0bb389185d5..9d7b4fade301 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c @@ -80,6 +80,9 @@ (*member) = (smu->smu_table.driver_pptable + offsetof(PPTable_t, field));\ } while(0) +/* STB FIFO depth is in 
64bit units */ +#define SIENNA_CICHLID_STB_DEPTH_UNIT_BYTES 8 + static int get_table_size(struct smu_context *smu) { if (smu->adev->ip_versions[MP1_HWIP][0] == IP_VERSION(11, 0, 13)) @@ -650,6 +653,8 @@ static int sienna_cichlid_allocate_dpm_context(struct smu_context *smu) return 0; } +static void sienna_cichlid_stb_init(struct smu_context *smu); + static int sienna_cichlid_init_smc_tables(struct smu_context *smu) { int ret = 0; @@ -662,6 +667,8 @@ static int sienna_cichlid_init_smc_tables(struct smu_context *smu) if (ret) return ret; + sienna_cichlid_stb_init(smu); + return smu_v11_0_init_smc_tables(smu); } @@ -3793,6 +3800,53 @@ static int sienna_cichlid_set_mp1_state(struct smu_context *smu, return ret; } +static void sienna_cichlid_stb_init(struct smu_context *smu) +{ + struct amdgpu_device *adev = smu->adev; + uint32_t reg; + + reg = RREG32_PCIE(MP1_Public | smnMP1_PMI_3_START); + smu->stb_context.enabled = REG_GET_FIELD(reg, MP1_PMI_3_START, ENABLE); + + /* STB is disabled */ + if (!smu->stb_context.enabled) + return; + + spin_lock_init(&smu->stb_context.lock); + + /* STB buffer size in bytes as function of FIFO depth */ + reg = RREG32_PCIE(MP1_Public | smnMP1_PMI_3_FIFO); + smu->stb_context.stb_buf_size = 1 << REG_GET_FIELD(reg, MP1_PMI_3_FIFO, DEPTH); + smu->stb_context.stb_buf_size *= SIENNA_CICHLID_STB_DEPTH_UNIT_BYTES; + + dev_info(smu->adev->dev, "STB initialized to %d entries", +smu->stb_context.stb_buf_size / SIENNA_CICHLID_STB_DEPTH_UNIT_BYTES); + +} + +int sienna_cichlid_stb_get_data_direct(struct smu_context *smu, + void *buf, + uint32_t size) +{ + uint32_t *p = buf; + struct amdgpu_device *adev = smu->adev; + + /* No need to disable interrupts for now as we don't lock it yet from ISR */ + spin_lock(&smu->stb_context.lock); + + /* +* Read the STB FIFO in units of 32bit since this is the accessor window +* (register width) we have. +*/ + buf = ((char *) buf) + size; + while ((void *)p < buf) + *p
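The buffer sizing in `sienna_cichlid_stb_init()` extracts a DEPTH field from the FIFO register, treats it as a power-of-two exponent, and converts FIFO entries to bytes at 8 bytes per entry. The mask-then-shift extraction and the size computation can be sketched as follows (the mask and shift values here are illustrative placeholders, since the real field layout comes from the MP1 register headers):

```c
#include <stdint.h>

#define DEPTH_MASK  0x0fffu  /* illustrative field mask */
#define DEPTH_SHIFT 0u       /* illustrative field shift */

/* REG_GET_FIELD-style extraction: mask first, then shift down. */
static uint32_t get_field(uint32_t reg, uint32_t mask, uint32_t shift)
{
	return (reg & mask) >> shift;
}

/* STB FIFO depth is a power-of-two exponent; entries are 64 bits each. */
#define STB_DEPTH_UNIT_BYTES 8u

uint32_t stb_buf_size_bytes(uint32_t fifo_reg)
{
	uint32_t depth = get_field(fifo_reg, DEPTH_MASK, DEPTH_SHIFT);

	return (1u << depth) * STB_DEPTH_UNIT_BYTES;
}
```

Note that because the field is an exponent, a DEPTH value of 8 already means a 2 KiB buffer, so the driver's log of "entries" divides the byte size back down by the unit size.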
[RFC 1/3] drm/amd/pm: Add STB accessors interface
Add interface to collect STB logs. Signed-off-by: Andrey Grodzovsky Reviewed-by: Lijo Lazar Reviewed-by: Luben Tuikov --- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 15 +++ drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 18 ++ 2 files changed, 33 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 7a06021a58f0..4301403af761 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -474,6 +474,12 @@ struct cmn2asic_mapping { int map_to; }; +struct stb_context { + uint32_t stb_buf_size; + bool enabled; + spinlock_t lock; +}; + #define WORKLOAD_POLICY_MAX 7 struct smu_context { @@ -561,6 +567,8 @@ struct smu_context uint16_t cpu_core_num; struct smu_user_dpm_profile user_dpm_profile; + + struct stb_context stb_context; }; struct i2c_adapter; @@ -1268,6 +1276,12 @@ struct pptable_funcs { * @get_ecc_table: message SMU to get ECC INFO table. */ ssize_t (*get_ecc_info)(struct smu_context *smu, void *table); + + + /** +* @stb_collect_info: Collects Smart Trace Buffers data. 
+*/ + int (*stb_collect_info)(struct smu_context *smu, void *buf, uint32_t size); }; typedef enum { @@ -1405,6 +1419,7 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable); int smu_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event, uint64_t event_arg); int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); +int smu_stb_collect_info(struct smu_context *smu, void *buff, uint32_t size); #endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index fd3b6b460b12..97bafba4c5c9 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -3175,3 +3175,21 @@ int smu_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event, return ret; } + +int smu_stb_collect_info(struct smu_context *smu, void *buf, uint32_t size) +{ + + if (!smu->ppt_funcs->stb_collect_info || !smu->stb_context.enabled) + return -EOPNOTSUPP; + + /* Confirm the buffer allocated is of correct size */ + if (size != smu->stb_context.stb_buf_size) + return -EINVAL; + + /* +* No need to lock smu mutex as we access STB directly through MMIO +* and not going through SMU messaging route (for now at least). +* For registers access rely on implementation internal locking. +*/ + return smu->ppt_funcs->stb_collect_info(smu, buf, size); +} -- 2.25.1
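The guard ordering in `smu_stb_collect_info()` is a small but deliberate contract: report "not supported" before anything else if the ASIC lacks the hook or STB is disabled, then reject callers whose buffer does not match the advertised size exactly. A compact model of that contract (struct and function names are illustrative):

```c
#include <stdint.h>
#include <errno.h>

struct stb_ctx {
	uint32_t buf_size;
	int enabled;
	int (*collect)(void *buf, uint32_t size);
};

/* Mirrors the guard ordering in smu_stb_collect_info(). */
int stb_collect(struct stb_ctx *ctx, void *buf, uint32_t size)
{
	if (!ctx->collect || !ctx->enabled)
		return -EOPNOTSUPP;

	/* Caller must allocate exactly buf_size bytes. */
	if (size != ctx->buf_size)
		return -EINVAL;

	return ctx->collect(buf, size);
}
```

Checking support first means userspace can distinguish "this hardware has no STB" from "you sized the buffer wrong".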
[RFC 0/3] Add Smart Trace Buffers support
The Smart Trace Buffer (STB) is a cyclic data buffer used to log information about system execution for characterization and debug purposes. Should the system encounter a functional failure at any point, the trace can be collected without needing to reproduce the failure while running additional instrumentation. Andrey Grodzovsky (3): drm/amd/pm: Add STB accessors interface drm/amd/pm: Add STB support in sienna_cichlid drm/amd/pm: Add debugfs info for STB .../amd/include/asic_reg/mp/mp_11_0_offset.h | 7 ++ .../amd/include/asic_reg/mp/mp_11_0_sh_mask.h | 12 ++ drivers/gpu/drm/amd/pm/amdgpu_pm.c | 2 + drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 16 +++ drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 104 ++ .../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 55 + 6 files changed, 196 insertions(+) -- 2.25.1
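A cyclic trace buffer of this kind can be modeled in a few lines: once the writer wraps, the buffer always retains the most recent N entries, which is what makes post-mortem collection useful. A minimal sketch (the entry type and sizes are illustrative, not the STB format):

```c
#include <stdint.h>
#include <stddef.h>

#define RING_ENTRIES 8  /* power of two keeps the index math cheap */

struct ring {
	uint64_t entry[RING_ENTRIES];
	size_t head;   /* total entries ever written */
};

void ring_log(struct ring *r, uint64_t v)
{
	r->entry[r->head % RING_ENTRIES] = v;
	r->head++;
}

/* Copy the retained entries, oldest first; returns how many were copied. */
size_t ring_collect(const struct ring *r, uint64_t *out, size_t max)
{
	size_t n = r->head < RING_ENTRIES ? r->head : RING_ENTRIES;
	size_t first;

	if (n > max)
		n = max;
	first = r->head - n;  /* index of the oldest retained entry */
	for (size_t i = 0; i < n; i++)
		out[i] = r->entry[(first + i) % RING_ENTRIES];
	return n;
}
```

In the STB case the firmware is the writer and the driver only ever plays the collector role, reading the retained window out through the register FIFO.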
RE: [PATCH] drm/amd/display: Drop config guard for DC_LOG_DP2
[AMD Official Use Only] Agree. Patch applied. Thanks for your review -Leo -Original Message- From: Alex Deucher Sent: Thursday, November 18, 2021 12:02 PM To: Ma, Leo Cc: Kazlauskas, Nicholas ; amd-gfx list ; Deucher, Alexander ; Choi, Nicholas ; Wentland, Harry Subject: Re: [PATCH] drm/amd/display: Drop config guard for DC_LOG_DP2 On Thu, Nov 18, 2021 at 11:53 AM Leo (Hanghong) Ma wrote: > > [Why & How] > It doesn't make sense to guard DC_LOG_DP2 by CONFIG_DRM_AMD_DCN, and > this also caused build failure for allmodconfig; So drop the guard to > fix the compile failure; > > Signed-off-by: Leo (Hanghong) Ma Reviewed-by: Alex Deucher At some point we may want to rework what the DC_LOG stuff maps too (e.g., use dev_debug or the newer drm interfaces). Alex > --- > drivers/gpu/drm/amd/display/include/logger_types.h | 4 > 1 file changed, 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/include/logger_types.h > b/drivers/gpu/drm/amd/display/include/logger_types.h > index 370fad883e33..f093b49c5e6e 100644 > --- a/drivers/gpu/drm/amd/display/include/logger_types.h > +++ b/drivers/gpu/drm/amd/display/include/logger_types.h > @@ -72,9 +72,7 @@ > #define DC_LOG_DSC(...) DRM_DEBUG_KMS(__VA_ARGS__) #define > DC_LOG_SMU(...) pr_debug("[SMU_MSG]:"__VA_ARGS__) #define > DC_LOG_DWB(...) DRM_DEBUG_KMS(__VA_ARGS__) -#if > defined(CONFIG_DRM_AMD_DC_DCN) #define DC_LOG_DP2(...) > DRM_DEBUG_KMS(__VA_ARGS__) -#endif > > struct dal_logger; > > @@ -126,9 +124,7 @@ enum dc_log_type { > LOG_MAX_HW_POINTS, > LOG_ALL_TF_CHANNELS, > LOG_SAMPLE_1DLUT, > -#if defined(CONFIG_DRM_AMD_DC_DCN) > LOG_DP2, > -#endif > LOG_SECTION_TOTAL_COUNT > }; > > -- > 2.17.1 >
Re: [PATCH] drm/amd/display: Drop config guard for DC_LOG_DP2
On Thu, Nov 18, 2021 at 11:53 AM Leo (Hanghong) Ma wrote: > > [Why & How] > It doesn't make sense to guard DC_LOG_DP2 by CONFIG_DRM_AMD_DC_DCN, and > this also caused build failure for allmodconfig; So drop the guard > to fix the compile failure; > > Signed-off-by: Leo (Hanghong) Ma Reviewed-by: Alex Deucher At some point we may want to rework what the DC_LOG stuff maps to (e.g., use dev_debug or the newer drm interfaces). Alex > --- > drivers/gpu/drm/amd/display/include/logger_types.h | 4 > 1 file changed, 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/display/include/logger_types.h > b/drivers/gpu/drm/amd/display/include/logger_types.h > index 370fad883e33..f093b49c5e6e 100644 > --- a/drivers/gpu/drm/amd/display/include/logger_types.h > +++ b/drivers/gpu/drm/amd/display/include/logger_types.h > @@ -72,9 +72,7 @@ > #define DC_LOG_DSC(...) DRM_DEBUG_KMS(__VA_ARGS__) > #define DC_LOG_SMU(...) pr_debug("[SMU_MSG]:"__VA_ARGS__) > #define DC_LOG_DWB(...) DRM_DEBUG_KMS(__VA_ARGS__) > -#if defined(CONFIG_DRM_AMD_DC_DCN) > #define DC_LOG_DP2(...) DRM_DEBUG_KMS(__VA_ARGS__) > -#endif > > struct dal_logger; > > @@ -126,9 +124,7 @@ enum dc_log_type { > LOG_MAX_HW_POINTS, > LOG_ALL_TF_CHANNELS, > LOG_SAMPLE_1DLUT, > -#if defined(CONFIG_DRM_AMD_DC_DCN) > LOG_DP2, > -#endif > LOG_SECTION_TOTAL_COUNT > }; > > -- > 2.17.1 >
[PATCH] drm/amd/amdgpu: move kfd post_reset out of reset_sriov function
For the SRIOV XGMI configuration, the host driver will handle the hive reset, so on the guest side reset_sriov is only called once on one device. This makes kfd post_reset unbalanced with kfd pre_reset, since kfd pre_reset has already been moved out of the reset_sriov function. Move kfd post_reset out of the reset_sriov function to make them balanced. Signed-off-by: shaoyunl --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 10c8008d1da0..9a9d5493c676 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4308,7 +4308,6 @@ static int amdgpu_device_reset_sriov(struct amdgpu_device *adev, amdgpu_irq_gpu_reset_resume_helper(adev); r = amdgpu_ib_ring_tests(adev); - amdgpu_amdkfd_post_reset(adev); error: if (!r && adev->virt.gim_feature & AMDGIM_FEATURE_GIM_FLR_VRAMLOST) { @@ -5081,7 +5080,7 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, tmp_vram_lost_counter = atomic_read(&((adev)->vram_lost_counter)); /* Actual ASIC resets if needed.*/ - /* TODO Implement XGMI hive reset logic for SRIOV */ + /* Host driver will handle XGMI hive reset for SRIOV */ if (amdgpu_sriov_vf(adev)) { r = amdgpu_device_reset_sriov(adev, job ? false : true); if (r) @@ -5141,8 +5140,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, skip_sched_resume: list_for_each_entry(tmp_adev, device_list_handle, reset_list) { - /* unlock kfd: SRIOV would do it separately */ - if (!need_emergency_restart && !amdgpu_sriov_vf(tmp_adev)) + /* unlock kfd */ + if (!need_emergency_restart) amdgpu_amdkfd_post_reset(tmp_adev); /* kfd_post_reset will do nothing if kfd device is not initialized, -- 2.17.1
[PATCH] drm/amd/display: Drop config guard for DC_LOG_DP2
[Why & How] It doesn't make sense to guard DC_LOG_DP2 by CONFIG_DRM_AMD_DC_DCN, and this also caused a build failure for allmodconfig; so drop the guard to fix the compile failure. Signed-off-by: Leo (Hanghong) Ma --- drivers/gpu/drm/amd/display/include/logger_types.h | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/gpu/drm/amd/display/include/logger_types.h b/drivers/gpu/drm/amd/display/include/logger_types.h index 370fad883e33..f093b49c5e6e 100644 --- a/drivers/gpu/drm/amd/display/include/logger_types.h +++ b/drivers/gpu/drm/amd/display/include/logger_types.h @@ -72,9 +72,7 @@ #define DC_LOG_DSC(...) DRM_DEBUG_KMS(__VA_ARGS__) #define DC_LOG_SMU(...) pr_debug("[SMU_MSG]:"__VA_ARGS__) #define DC_LOG_DWB(...) DRM_DEBUG_KMS(__VA_ARGS__) -#if defined(CONFIG_DRM_AMD_DC_DCN) #define DC_LOG_DP2(...) DRM_DEBUG_KMS(__VA_ARGS__) -#endif struct dal_logger; @@ -126,9 +124,7 @@ enum dc_log_type { LOG_MAX_HW_POINTS, LOG_ALL_TF_CHANNELS, LOG_SAMPLE_1DLUT, -#if defined(CONFIG_DRM_AMD_DC_DCN) LOG_DP2, -#endif LOG_SECTION_TOTAL_COUNT }; -- 2.17.1
Re: [PATCH 3/3] drm/amdkfd: simplify drain retry fault
On 2021-11-18 11:39 a.m., Felix Kuehling wrote: Am 2021-11-18 um 11:19 a.m. schrieb philip yang: On 2021-11-17 7:14 p.m., Felix Kuehling wrote: On 2021-11-16 10:43 p.m., Philip Yang wrote: unmap range always set svms->drain_pagefaults flag to simplify both parent range and child range unmap. Deferred list work takes mmap write lock to read and clear svms->drain_pagefaults, to serialize with unmap callback. Add atomic flag svms->draining_faults, if svms->draining_faults is set, page fault handle ignore the retry fault, to speed up interrupt handling. Having both svms->drain_pagefaults and svms->draining_faults is confusing. Do we really need both? Using one flag, I can not find a way to handle the case, unmap new range. if the flag is set, restore_pages uses the flag to drop fault, then drain_retry_fault reset the flag after draining is done, we will not able to drain retry fault for the new range. I think the correct solution would be to use atomic_inc to set the flag and atomic_cmp_xchg in svm_range_drain_retry_fault to clear it. If the flag was changed while svm_range_drain_retry_fault executed, it means another drain was requested by someone else and the flag should remain set for another round of draining. Thanks for the idea to use atomic_cmp_xchg, it will solve the issue. 
Philip Regards, Felix Regards, Philip Regards, Felix Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 24 ++-- 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 1d3f012bcd2a..4e4640b731e1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -767,6 +767,7 @@ struct svm_range_list { spinlock_t deferred_list_lock; atomic_t evicted_ranges; bool drain_pagefaults; + atomic_t draining_faults; struct delayed_work restore_work; DECLARE_BITMAP(bitmap_supported, MAX_GPU_INSTANCE); struct task_struct *faulting_task; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 3eb0a9491755..d332462bf9d3 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1962,6 +1962,7 @@ void svm_range_drain_retry_fault(struct svm_range_list *svms) p = container_of(svms, struct kfd_process, svms); + atomic_set(&svms->draining_faults, 1); for_each_set_bit(i, svms->bitmap_supported, p->n_pdds) { pdd = p->pdds[i]; if (!pdd) @@ -1975,6 +1976,7 @@ void svm_range_drain_retry_fault(struct svm_range_list *svms) flush_work(&adev->irq.ih1_work); pr_debug("drain retry fault gpu %d svms 0x%p done\n", i, svms); } + atomic_set(&svms->draining_faults, 0); } static void svm_range_deferred_list_work(struct work_struct *work) @@ -2002,6 +2004,7 @@ static void svm_range_deferred_list_work(struct work_struct *work) * mmap write lock to serialize with munmap notifiers. 
*/ if (unlikely(READ_ONCE(svms->drain_pagefaults))) { + atomic_set(&svms->draining_faults, 1); WRITE_ONCE(svms->drain_pagefaults, false); mmap_write_unlock(mm); svm_range_drain_retry_fault(svms); @@ -2049,12 +2052,6 @@ svm_range_add_list_work(struct svm_range_list *svms, struct svm_range *prange, struct mm_struct *mm, enum svm_work_list_ops op) { spin_lock(&svms->deferred_list_lock); - /* Make sure pending page faults are drained in the deferred worker - * before the range is freed to avoid straggler interrupts on - * unmapped memory causing "phantom faults". - */ - if (op == SVM_OP_UNMAP_RANGE) - svms->drain_pagefaults = true; /* if prange is on the deferred list */ if (!list_empty(&prange->deferred_list)) { pr_debug("update exist prange 0x%p work op %d\n", prange, op); @@ -2133,6 +2130,13 @@ svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_range *prange, pr_debug("svms 0x%p prange 0x%p [0x%lx 0x%lx] [0x%lx 0x%lx]\n", svms, prange, prange->start, prange->last, start, last); + /* Make sure pending page faults are drained in the deferred worker + * before the range is freed to avoid straggler interrupts on + * unmapped memory causing "phantom faults". + */ + pr_debug("set range drain_pagefaults true\n"); + WRITE_ONCE(svms->drain_pagefaults, true); + unmap_parent = start <= prange->start && last >= prange->last; list_for_each_entry(pchild, &prange->child_list, child_list) { @@ -2595,6 +2599
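The scheme Felix suggests, bump a counter to request a drain and clear it with a compare-exchange so that a request arriving mid-drain leaves the flag set for another pass, can be sketched with C11 atomics (the names are illustrative; the kernel would use atomic_inc() and atomic_cmpxchg() on an atomic_t):

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int drain_requests;

/* Unmap path: request a drain. */
void request_drain(void)
{
	atomic_fetch_add(&drain_requests, 1);
}

/*
 * Drain worker: snapshot the request count, drain, then try to clear.
 * If the count moved while we drained, another request arrived and the
 * compare-exchange fails, leaving the flag nonzero so the worker runs
 * another round. Returns true if the flag was fully cleared.
 */
bool drain_once(void (*do_drain)(void))
{
	int seen = atomic_load(&drain_requests);
	int expected;

	if (!seen)
		return true;  /* nothing to do */

	do_drain();

	expected = seen;
	return atomic_compare_exchange_strong(&drain_requests, &expected, 0);
}
```

This removes the window where a plain boolean, cleared after the drain, could swallow a request made for a range unmapped during the drain itself.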
Re: [PATCH 3/3] drm/amdkfd: simplify drain retry fault
Am 2021-11-18 um 11:19 a.m. schrieb philip yang: > > > On 2021-11-17 7:14 p.m., Felix Kuehling wrote: >> >> On 2021-11-16 10:43 p.m., Philip Yang wrote: >>> unmap range always set svms->drain_pagefaults flag to simplify both >>> parent range and child range unmap. Deferred list work takes mmap write >>> lock to read and clear svms->drain_pagefaults, to serialize with unmap >>> callback. >>> >>> Add atomic flag svms->draining_faults, if svms->draining_faults is set, >>> page fault handle ignore the retry fault, to speed up interrupt >>> handling. >> Having both svms->drain_pagefaults and svms->draining_faults is >> confusing. Do we really need both? > > Using one flag, I can not find a way to handle the case, unmap new > range. if the flag is set, restore_pages uses the flag to drop fault, > then drain_retry_fault reset the flag after draining is done, we will > not able to drain retry fault for the new range. > I think the correct solution would be to use atomic_inc to set the flag and atomic_cmp_xchg in svm_range_drain_retry_fault to clear it. If the flag was changed while svm_range_drain_retry_fault executed, it means another drain was requested by someone else and the flag should remain set for another round of draining. 
Regards, Felix > Regards, > > Philip > >> Regards, >> Felix >> >>> >>> Signed-off-by: Philip Yang >>> --- >>> drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + >>> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 24 ++-- >>> 2 files changed, 19 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h >>> b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h >>> index 1d3f012bcd2a..4e4640b731e1 100644 >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h >>> @@ -767,6 +767,7 @@ struct svm_range_list { >>> spinlock_t deferred_list_lock; >>> atomic_t evicted_ranges; >>> bool drain_pagefaults; >>> + atomic_t draining_faults; >>> struct delayed_work restore_work; >>> DECLARE_BITMAP(bitmap_supported, MAX_GPU_INSTANCE); >>> struct task_struct *faulting_task; >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> index 3eb0a9491755..d332462bf9d3 100644 >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> @@ -1962,6 +1962,7 @@ void svm_range_drain_retry_fault(struct >>> svm_range_list *svms) >>> p = container_of(svms, struct kfd_process, svms); >>> + atomic_set(&svms->draining_faults, 1); >>> for_each_set_bit(i, svms->bitmap_supported, p->n_pdds) { >>> pdd = p->pdds[i]; >>> if (!pdd) >>> @@ -1975,6 +1976,7 @@ void svm_range_drain_retry_fault(struct >>> svm_range_list *svms) >>> flush_work(&adev->irq.ih1_work); >>> pr_debug("drain retry fault gpu %d svms 0x%p done\n", i, >>> svms); >>> } >>> + atomic_set(&svms->draining_faults, 0); >>> } >>> static void svm_range_deferred_list_work(struct work_struct *work) >>> @@ -2002,6 +2004,7 @@ static void >>> svm_range_deferred_list_work(struct work_struct *work) >>> * mmap write lock to serialize with munmap notifiers. 
>>> */ >>> if (unlikely(READ_ONCE(svms->drain_pagefaults))) { >>> + atomic_set(&svms->draining_faults, 1); >>> WRITE_ONCE(svms->drain_pagefaults, false); >>> mmap_write_unlock(mm); >>> svm_range_drain_retry_fault(svms); >>> @@ -2049,12 +2052,6 @@ svm_range_add_list_work(struct svm_range_list >>> *svms, struct svm_range *prange, >>> struct mm_struct *mm, enum svm_work_list_ops op) >>> { >>> spin_lock(&svms->deferred_list_lock); >>> - /* Make sure pending page faults are drained in the deferred >>> worker >>> - * before the range is freed to avoid straggler interrupts on >>> - * unmapped memory causing "phantom faults". >>> - */ >>> - if (op == SVM_OP_UNMAP_RANGE) >>> - svms->drain_pagefaults = true; >>> /* if prange is on the deferred list */ >>> if (!list_empty(&prange->deferred_list)) { >>> pr_debug("update exist prange 0x%p work op %d\n", prange, >>> op); >>> @@ -2133,6 +2130,13 @@ svm_range_unmap_from_cpu(struct mm_struct >>> *mm, struct svm_range *prange, >>> pr_debug("svms 0x%p prange 0x%p [0x%lx 0x%lx] [0x%lx >>> 0x%lx]\n", svms, >>> prange, prange->start, prange->last, start, last); >>> + /* Make sure pending page faults are drained in the deferred >>> worker >>> + * before the range is freed to avoid straggler interrupts on >>> + * unmapped memory causing "phantom faults". >>> + */ >>> + pr_debug("set range drain_pagefaults true\n"); >>> + WRITE_ONCE(svms->drain_pagefaults, true); >>> + >>> unmap_parent = start <= prange->start && last >= prange->last; >>> list_for_each_entry(pchild, &prange->child_list, child_list)
[radeon] connector_info_from_object_table
Hello, The function radeon_get_atom_connector_info_from_object_table, at location [1], ends up parsing ATOM_COMMON_TABLE_HEADER as ATOM_COMMON_RECORD_HEADER if enc_obj->asObjects[k].usRecordOffset is zero. It is found to be zero in the BIOS found at [2]. Thankfully, the loop that follows exits immediately since ucRecordSize is 0 because (ATOM_COMMON_TABLE_HEADER.usStructureSize & 0xff00) is zero. But, with suitable values in the usStructureSize, the loop can be made to run and parse garbage. A similar loop exists when parsing the conn objects. -Amol [1] https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/radeon/radeon_atombios.c#L652 [2] https://www.techpowerup.com/vgabios/211981/211981
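A defensive version of that record walk bounds the cursor by both the record size and the enclosing table, so a zero or oversized ucRecordSize cannot spin the loop or read past the table. A sketch of the idea (the struct layout is simplified from the ATOM headers, and the function is illustrative, not the radeon code):

```c
#include <stdint.h>
#include <stddef.h>

struct record_header {
	uint8_t type;
	uint8_t size;   /* size of the whole record, including this header */
};

/*
 * Walk records inside [buf, buf + table_size). Returns the number of
 * well-formed records seen; stops on a zero-sized or truncated record
 * instead of looping forever or reading past the table.
 */
size_t walk_records(const uint8_t *buf, size_t table_size)
{
	size_t off = 0, count = 0;

	while (off + sizeof(struct record_header) <= table_size) {
		const struct record_header *h =
			(const struct record_header *)(buf + off);

		if (h->size < sizeof(struct record_header) ||
		    off + h->size > table_size)
			break;  /* malformed record: bail out */
		count++;
		off += h->size;
	}
	return count;
}
```

The key point of the report is that the current code's loop bound is derived from BIOS-supplied fields; clamping each step against the table size makes crafted usStructureSize values harmless.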
Re: [PATCH 3/3] drm/amdkfd: simplify drain retry fault
On 2021-11-17 7:14 p.m., Felix Kuehling wrote: On 2021-11-16 10:43 p.m., Philip Yang wrote: unmap range always set svms->drain_pagefaults flag to simplify both parent range and child range unmap. Deferred list work takes mmap write lock to read and clear svms->drain_pagefaults, to serialize with unmap callback. Add atomic flag svms->draining_faults, if svms->draining_faults is set, page fault handle ignore the retry fault, to speed up interrupt handling. Having both svms->drain_pagefaults and svms->draining_faults is confusing. Do we really need both? Using one flag, I can not find a way to handle the case, unmap new range. if the flag is set, restore_pages uses the flag to drop fault, then drain_retry_fault reset the flag after draining is done, we will not able to drain retry fault for the new range. Regards, Philip Regards, Felix Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 24 ++-- 2 files changed, 19 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 1d3f012bcd2a..4e4640b731e1 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -767,6 +767,7 @@ struct svm_range_list { spinlock_t deferred_list_lock; atomic_t evicted_ranges; bool drain_pagefaults; + atomic_t draining_faults; struct delayed_work restore_work; DECLARE_BITMAP(bitmap_supported, MAX_GPU_INSTANCE); struct task_struct *faulting_task; diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 3eb0a9491755..d332462bf9d3 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -1962,6 +1962,7 @@ void svm_range_drain_retry_fault(struct svm_range_list *svms) p = container_of(svms, struct kfd_process, svms); + atomic_set(&svms->draining_faults, 1); for_each_set_bit(i, svms->bitmap_supported, p->n_pdds) { pdd = p->pdds[i]; if (!pdd) @@ -1975,6 
+1976,7 @@ void svm_range_drain_retry_fault(struct svm_range_list *svms) flush_work(&adev->irq.ih1_work); pr_debug("drain retry fault gpu %d svms 0x%p done\n", i, svms); } + atomic_set(&svms->draining_faults, 0); } static void svm_range_deferred_list_work(struct work_struct *work) @@ -2002,6 +2004,7 @@ static void svm_range_deferred_list_work(struct work_struct *work) * mmap write lock to serialize with munmap notifiers. */ if (unlikely(READ_ONCE(svms->drain_pagefaults))) { + atomic_set(&svms->draining_faults, 1); WRITE_ONCE(svms->drain_pagefaults, false); mmap_write_unlock(mm); svm_range_drain_retry_fault(svms); @@ -2049,12 +2052,6 @@ svm_range_add_list_work(struct svm_range_list *svms, struct svm_range *prange, struct mm_struct *mm, enum svm_work_list_ops op) { spin_lock(&svms->deferred_list_lock); - /* Make sure pending page faults are drained in the deferred worker - * before the range is freed to avoid straggler interrupts on - * unmapped memory causing "phantom faults". - */ - if (op == SVM_OP_UNMAP_RANGE) - svms->drain_pagefaults = true; /* if prange is on the deferred list */ if (!list_empty(&prange->deferred_list)) { pr_debug("update exist prange 0x%p work op %d\n", prange, op); @@ -2133,6 +2130,13 @@ svm_range_unmap_from_cpu(struct mm_struct *mm, struct svm_range
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On Thu, Nov 18, 2021 at 10:01 AM Lazar, Lijo wrote:
>
> [Public]
>
> BTW, I'm not sure if 'reset always' on resume is a good idea for GPUs in a
> hive (assuming those systems also get suspended and get hiccups). At this
> point the hive isn't reinitialized.

Yeah, we should probably not reset if we are part of a hive.

Alex

> Thanks,
> Lijo
Re: [PATCH 2/3] drm/amdkfd: handle VMA remove race
Am 2021-11-18 um 10:55 a.m. schrieb philip yang: > > > On 2021-11-18 10:07 a.m., Felix Kuehling wrote: >> Am 2021-11-18 um 10:00 a.m. schrieb philip yang: >>> On 2021-11-17 7:10 p.m., Felix Kuehling wrote: On 2021-11-16 10:43 p.m., Philip Yang wrote: > VMA may be removed before unmap notifier callback, restore pages take > mmap write lock to lookup VMA to avoid race, The old code looked up the VMA after taking the mmap lock (either read or write) and kept holding the lock afterwards. I think even with your new code it's possible that the VMA disappears before you take the lock the first time, so always taking the write lock only reduces the time window in which things can go wrong, but it doesn't remove the race. >>> Take mmap write lock will serialize with __do_munmap, >> __do_munmap runs with the mmap write lock. Taking the read lock should >> be sufficient to serialize with it. > > __do_munmap takes mmap write lock to remove vma, then downgrade to > read lock to call unmap_region. > Yes. But it does that after detaching the VMA. So holding the read lock is sufficient to ensure that a VMA you have looked up remains valid and that __do_munmap will not remove it. Regards, Felix > static int __vm_munmap(unsigned long start, size_t len, bool downgrade) > { > int ret; > struct mm_struct *mm = current->mm; > LIST_HEAD(uf); > > if (mmap_write_lock_killable(mm)) > return -EINTR; > > ret = __do_munmap(mm, start, len, &uf, downgrade); > /* > * Returning 1 indicates mmap_lock is downgraded. > * But 1 is not legal return value of vm_munmap() and munmap(), reset > * it to 0 before return. > */ > if (ret == 1) { > mmap_read_unlock(mm); > ret = 0; > } else > mmap_write_unlock(mm); > > } > > int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, > { > > ... 
> > /* Detach vmas from rbtree */ > if (!detach_vmas_to_be_unmapped(mm, vma, prev, end)) > downgrade = false; > > if (downgrade) > mmap_write_downgrade(mm); > > unmap_region(mm, vma, prev, start, end); > > } > > Regards, > > Philip > >> Regards, >> Felix >> >> >>> to ensure vma remove and unmap callback are done, because unmap >>> callback set drain_retryfaults flag, so we can safely drain the >>> faults, and it is app bug if vma not found after taking mmap write lock. I still struggle to understand the race you're trying to fix. The only time the svm_restore_pages can see that the VMA is gone AND the prange is gone is after the deferred worker has removed the prange. But the fault draining in the deferred worker should prevent us from ever seeing stale faults in that situation. That means, if no prange is found and no VMA is found, it's definitely an application bug. The only possible race is in the case where the prange still exists but the VMA is gone (needed by svm_fault_allowed). We can treat that as a special case where we just return success because we know that we're handling a stale fault for a VMA that's in the middle of being unmapped. The fact that the prange still existed means that there once was a valid mapping at the address but the deferred worker just hasn't had a chance to clean it up yet. >>> Yes, that is only possible race. One more comment inline. > and then create unregister > new range and check VMA access permission, then downgrade to take mmap > read lock to recover fault. Refactor code to avoid duplicate VMA > lookup. 
> > Signed-off-by: Philip Yang > --- > drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 65 > ++-- > 1 file changed, 24 insertions(+), 41 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > index c1f367934428..3eb0a9491755 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > @@ -2329,20 +2329,13 @@ svm_range_best_restore_location(struct > svm_range *prange, > } > static int > -svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr, > - unsigned long *start, unsigned long *last, > - bool *is_heap_stack) > +svm_range_get_range_boundaries(struct kfd_process *p, struct > vm_area_struct *vma, > + int64_t addr, unsigned long *start, > + unsigned long *last, bool *is_heap_stack) > { > - struct vm_area_struct *vma; > struct interval_tree_node *node; > unsigned long start_limit, end_limit; > - vma = find_vma(p->mm, addr << PAGE_SHIFT); > - if (!vma || (addr << PAGE_SHIFT) < vma->vm_start) { > - pr_debug("VMA does not exist in address [0x%llx]\n", addr); > - return -EFAULT; > -
Re: [PATCH 2/3] drm/amdkfd: handle VMA remove race
On 2021-11-18 10:07 a.m., Felix Kuehling wrote: Am 2021-11-18 um 10:00 a.m. schrieb philip yang: On 2021-11-17 7:10 p.m., Felix Kuehling wrote: On 2021-11-16 10:43 p.m., Philip Yang wrote: VMA may be removed before unmap notifier callback, restore pages take mmap write lock to lookup VMA to avoid race, The old code looked up the VMA after taking the mmap lock (either read or write) and kept holding the lock afterwards. I think even with your new code it's possible that the VMA disappears before you take the lock the first time, so always taking the write lock only reduces the time window in which things can go wrong, but it doesn't remove the race. Take mmap write lock will serialize with __do_munmap, __do_munmap runs with the mmap write lock. Taking the read lock should be sufficient to serialize with it. __do_munmap takes mmap write lock to remove vma, then downgrade to read lock to call unmap_region. static int __vm_munmap(unsigned long start, size_t len, bool downgrade) { int ret; struct mm_struct *mm = current->mm; LIST_HEAD(uf); if (mmap_write_lock_killable(mm)) return -EINTR; ret = __do_munmap(mm, start, len, &uf, downgrade); /* * Returning 1 indicates mmap_lock is downgraded. * But 1 is not legal return value of vm_munmap() and munmap(), reset * it to 0 before return. */ if (ret == 1) { mmap_read_unlock(mm); ret = 0; } else mmap_write_unlock(mm); } int __do_munmap(struct mm_struct *mm, unsigned long start, size_t len, { ... /* Detach vmas from rbtree */ if (!detach_vmas_to_be_unmapped(mm, vma, prev, end)) downgrade = false; if (downgrade) mmap_write_downgrade(mm); unmap_region(mm, vma, prev, start, end); } Regards, Philip Regards, Felix to ensure vma remove and unmap callback are done, because unmap callback set drain_retryfaults flag, so we can safely drain the faults, and it is app bug if vma not found after taking mmap write lock. I still struggle to understand the race you're trying to fix. 
The only time the svm_restore_pages can see that the VMA is gone AND the prange is gone is after the deferred worker has removed the prange. But the fault draining in the deferred worker should prevent us from ever seeing stale faults in that situation. That means, if no prange is found and no VMA is found, it's definitely an application bug. The only possible race is in the case where the prange still exists but the VMA is gone (needed by svm_fault_allowed). We can treat that as a special case where we just return success because we know that we're handling a stale fault for a VMA that's in the middle of being unmapped. The fact that the prange still existed means that there once was a valid mapping at the address but the deferred worker just hasn't had a chance to clean it up yet. Yes, that is only possible race. One more comment inline. and then create unregister new range and check VMA access permission, then downgrade to take mmap read lock to recover fault. Refactor code to avoid duplicate VMA lookup. 
Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 65 ++-- 1 file changed, 24 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index c1f367934428..3eb0a9491755 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2329,20 +2329,13 @@ svm_range_best_restore_location(struct svm_range *prange, } static int -svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr, - unsigned long *start, unsigned long *last, - bool *is_heap_stack) +svm_range_get_range_boundaries(struct kfd_process *p, struct vm_area_struct *vma, + int64_t addr, unsigned long *start, + unsigned long *last, bool *is_heap_stack) { - struct vm_area_struct *vma; struct interval_tree_node *node; unsigned long start_limit, end_limit; - vma = find_vma(p->mm, addr << PAGE_SHIFT); - if (!vma || (addr << PAGE_SHIFT) < vma->vm_start) { - pr_debug("VMA does not exist in address [0x%llx]\n", addr); - return -EFAULT; - } - *is_heap_stack = (vma->vm_start <= vma->vm_mm->brk && vma->vm_end >= vma->vm_mm->start_brk) || (vma->vm_start <= vma->vm_mm->start_stack && @@ -2437,9 +2430,10 @@ svm_range_check_vm_
Re: [PATCH v10 08/10] dyndbg: add print-to-tracefs, selftest with it - RFC
On Thu, 18 Nov 2021 09:29:27 -0500 Jason Baron wrote: > On 11/16/21 3:46 AM, Pekka Paalanen wrote: > > On Fri, 12 Nov 2021 10:08:41 -0500 > > Jason Baron wrote: > > > >> On 11/12/21 6:49 AM, Vincent Whitchurch wrote: > >>> On Thu, Nov 11, 2021 at 03:02:04PM -0700, Jim Cromie wrote: > Sean Paul proposed, in: > https://urldefense.com/v3/__https://patchwork.freedesktop.org/series/78133/__;!!GjvTz_vk!HcKnMRByYkIdyF1apqQjlN5aBIomzJR1an3YWXM6KXs0EftVMQdrewRA8Dki4A$ > > drm/trace: Mirror DRM debug logs to tracefs > > His patchset's objective is to be able to independently steer some of > the drm.debug stream to an alternate tracing destination, by splitting > drm_debug_enabled() into syslog & trace flavors, and enabling them > separately. 2 advantages were identified: > > 1- syslog is heavyweight, tracefs is much lighter > 2- separate selection of enabled categories means less traffic > > Dynamic-Debug can do 2nd exceedingly well: > > A- all work is behind jump-label's NOOP, zero off cost. > B- exact site selectivity, precisely the useful traffic. > can tailor enabled set interactively, at shell. > > Since the tracefs interface is effective for drm (the threads suggest > so), adding that interface to dynamic-debug has real potential for > everyone including drm. > > if CONFIG_TRACING: > > Grab Sean's trace_init/cleanup code, use it to provide tracefs > available by default to all pr_debugs. This will likely need some > further per-module treatment; perhaps something reflecting hierarchy > of module,file,function,line, maybe with a tuned flattening. > > endif CONFIG_TRACING > > Add a new +T flag to enable tracing, independent of +p, and add and > use 3 macros: dyndbg_site_is_enabled/logging/tracing(), to encapsulate > the flag checks. Existing code treats T like other flags. 
> >>> > >>> I posted a patchset a while ago to do something very similar, but that > >>> got stalled for some reason and I unfortunately didn't follow it up: > >>> > >>> > >>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/20200825153338.17061-1-vincent.whitchu...@axis.com/__;!!GjvTz_vk!HcKnMRByYkIdyF1apqQjlN5aBIomzJR1an3YWXM6KXs0EftVMQdrewRGytKHPg$ > >>> > >>> > >>> A key difference between that patchset and this patch (besides that > >>> small fact that I used +x instead of +T) was that my patchset allowed > >>> the dyndbg trace to be emitted to the main buffer and did not force them > >>> to be in an instance-specific buffer. > >> > >> Yes, I agree I'd prefer that we print here to the 'main' buffer - it > >> seems to keep things simpler and easier to combine the output from > >> different sources as you mentioned. > > > > Hi, > > > > I'm not quite sure I understand this discussion, but I would like to > > remind you all of what Sean's original work is about: > > > > Userspace configures DRM tracing into a flight recorder buffer (I guess > > this is what you refer to "instance-specific buffer"). > > > > Userspace runs happily for months, and then hits a problem: a failure > > in the DRM sub-system most likely, e.g. an ioctl that should never > > fail, failed. Userspace handles that failure by dumping the flight > > recorder buffer into a file and saving or sending a bug report. The > > flight recorder contents give a log of all relevant DRM in-kernel > > actions leading to the unexpected failure to help developers debug it. > > > > I don't mind if one can additionally send the flight recorder stream to > > the main buffer, but I do want the separate flight recorder buffer to > > be an option so that a) unrelated things cannot flood the interesting > > bits out of it, and b) the scope of collected information is relevant. 
> > > > The very reason for this work is problems that are very difficult to > > reproduce in practice, either because the problem itself is triggered > > very rarely and randomly, or because the end users of the system have > > either no knowledge or no access to reconfigure debug logging and then > > reproduce the problem with good debug logs. > > > > Thank you very much for pushing this work forward! > > > > > > So I think Vincent (earlier in the thread) was saying that he finds it > very helpful have dynamic debug output go to the 'main' trace buffer, > while you seem to be saying you'd prefer it just go to dynamic debug > specific trace buffer. Seems like we have different use cases: traditional debugging, and in-production flight recorder for problem reporting. I'm not surprised if they need different treatment. > So we certainly can have dynamic output potentially go to both places - > although I think this would mean two tracepoints? But I really wonder > if we really need a separate tracing buffer for dynamic debug when > what goes to the 'main' buffer can be controlled and filtered to a
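For reference, the existing dynamic-debug control interface that this RFC extends is documented in `Documentation/admin-guide/dynamic-debug-howto.rst`. A sketch of how the proposed `+T` flag would be used alongside the existing `+p` (syslog) flag — note that `+T` is only proposed in this series and not mainline, so the second and third commands are hypothetical:

```
# Existing interface: enable syslog output for all pr_debug sites in a module
echo 'module drm +p' > /sys/kernel/debug/dynamic_debug/control

# Proposed in this RFC (not mainline): steer the same sites to tracefs,
# independently of +p
echo 'module drm +T' > /sys/kernel/debug/dynamic_debug/control

# Site selectivity works per file/function/line as with existing flags
echo 'file drm_atomic.c +T' > /sys/kernel/debug/dynamic_debug/control
```

This is what the thread means by "exact site selectivity": the enabled set can be tailored interactively at the shell, per module, file, function, or line.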
Re: [PATCH 2/3] drm/amdkfd: handle VMA remove race
Am 2021-11-18 um 10:00 a.m. schrieb philip yang: > > > On 2021-11-17 7:10 p.m., Felix Kuehling wrote: >> On 2021-11-16 10:43 p.m., Philip Yang wrote: >>> VMA may be removed before unmap notifier callback, restore pages take >>> mmap write lock to lookup VMA to avoid race, >> >> The old code looked up the VMA after taking the mmap lock (either >> read or write) and kept holding the lock afterwards. I think even >> with your new code it's possible that the VMA disappears before you >> take the lock the first time, so always taking the write lock only >> reduces the time window in which things can go wrong, but it doesn't >> remove the race. > Take mmap write lock will serialize with __do_munmap, __do_munmap runs with the mmap write lock. Taking the read lock should be sufficient to serialize with it. Regards, Felix > to ensure vma remove and unmap callback are done, because unmap > callback set drain_retryfaults flag, so we can safely drain the > faults, and it is app bug if vma not found after taking mmap write lock. >> >> I still struggle to understand the race you're trying to fix. The >> only time the svm_restore_pages can see that the VMA is gone AND the >> prange is gone is after the deferred worker has removed the prange. >> But the fault draining in the deferred worker should prevent us from >> ever seeing stale faults in that situation. That means, if no prange >> is found and no VMA is found, it's definitely an application bug. >> >> The only possible race is in the case where the prange still exists >> but the VMA is gone (needed by svm_fault_allowed). We can treat that >> as a special case where we just return success because we know that >> we're handling a stale fault for a VMA that's in the middle of being >> unmapped. The fact that the prange still existed means that there >> once was a valid mapping at the address but the deferred worker just >> hasn't had a chance to clean it up yet. >> > Yes, that is only possible race. >> One more comment inline. 
>> >> >>> and then create unregister >>> new range and check VMA access permission, then downgrade to take mmap >>> read lock to recover fault. Refactor code to avoid duplicate VMA >>> lookup. >>> >>> Signed-off-by: Philip Yang >>> --- >>> drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 65 >>> ++-- >>> 1 file changed, 24 insertions(+), 41 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> index c1f367934428..3eb0a9491755 100644 >>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c >>> @@ -2329,20 +2329,13 @@ svm_range_best_restore_location(struct >>> svm_range *prange, >>> } >>> static int >>> -svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr, >>> - unsigned long *start, unsigned long *last, >>> - bool *is_heap_stack) >>> +svm_range_get_range_boundaries(struct kfd_process *p, struct >>> vm_area_struct *vma, >>> + int64_t addr, unsigned long *start, >>> + unsigned long *last, bool *is_heap_stack) >>> { >>> - struct vm_area_struct *vma; >>> struct interval_tree_node *node; >>> unsigned long start_limit, end_limit; >>> - vma = find_vma(p->mm, addr << PAGE_SHIFT); >>> - if (!vma || (addr << PAGE_SHIFT) < vma->vm_start) { >>> - pr_debug("VMA does not exist in address [0x%llx]\n", addr); >>> - return -EFAULT; >>> - } >>> - >>> *is_heap_stack = (vma->vm_start <= vma->vm_mm->brk && >>> vma->vm_end >= vma->vm_mm->start_brk) || >>> (vma->vm_start <= vma->vm_mm->start_stack && >>> @@ -2437,9 +2430,10 @@ svm_range_check_vm_userptr(struct kfd_process >>> *p, uint64_t start, uint64_t last, >>> static struct >>> svm_range *svm_range_create_unregistered_range(struct >>> amdgpu_device *adev, >>> - struct kfd_process *p, >>> - struct mm_struct *mm, >>> - int64_t addr) >>> + struct kfd_process *p, >>> + struct mm_struct *mm, >>> + struct vm_area_struct *vma, >>> + int64_t addr) >>> { >>> struct svm_range *prange = NULL; >>> unsigned long start, last; >>> @@ -2449,7 
+2443,7 @@ svm_range >>> *svm_range_create_unregistered_range(struct amdgpu_device *adev, >>> uint64_t bo_l = 0; >>> int r; >>> - if (svm_range_get_range_boundaries(p, addr, &start, &last, >>> + if (svm_range_get_range_boundaries(p, vma, addr, &start, &last, >>> &is_heap_stack)) >>> return NULL; >>> @@ -2552,20 +2546,13 @@ svm_range_count_fault(struct amdgpu_device >>> *adev, struct kfd_process *p, >>> } >>> static bool >>> -svm_fault_allowed(struct mm_struct *mm, uint64_t addr, bool >>> write_fault) >>> +svm_fault_allowed(struct vm_area_struct *vma
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
[Public]

BTW, I'm not sure if 'reset always' on resume is a good idea for GPUs in a hive (assuming those systems also get suspended and get hiccups). At this point the hive isn't reinitialized.

Thanks,
Lijo
Re: [PATCH 2/3] drm/amdkfd: handle VMA remove race
On 2021-11-17 7:10 p.m., Felix Kuehling wrote: On 2021-11-16 10:43 p.m., Philip Yang wrote: VMA may be removed before unmap notifier callback, restore pages take mmap write lock to lookup VMA to avoid race, The old code looked up the VMA after taking the mmap lock (either read or write) and kept holding the lock afterwards. I think even with your new code it's possible that the VMA disappears before you take the lock the first time, so always taking the write lock only reduces the time window in which things can go wrong, but it doesn't remove the race. Take mmap write lock will serialize with __do_munmap, to ensure vma remove and unmap callback are done, because unmap callback set drain_retryfaults flag, so we can safely drain the faults, and it is app bug if vma not found after taking mmap write lock. I still struggle to understand the race you're trying to fix. The only time the svm_restore_pages can see that the VMA is gone AND the prange is gone is after the deferred worker has removed the prange. But the fault draining in the deferred worker should prevent us from ever seeing stale faults in that situation. That means, if no prange is found and no VMA is found, it's definitely an application bug. The only possible race is in the case where the prange still exists but the VMA is gone (needed by svm_fault_allowed). We can treat that as a special case where we just return success because we know that we're handling a stale fault for a VMA that's in the middle of being unmapped. The fact that the prange still existed means that there once was a valid mapping at the address but the deferred worker just hasn't had a chance to clean it up yet. Yes, that is only possible race. One more comment inline. and then create unregister new range and check VMA access permission, then downgrade to take mmap read lock to recover fault. Refactor code to avoid duplicate VMA lookup. 
Signed-off-by: Philip Yang --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 65 ++-- 1 file changed, 24 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index c1f367934428..3eb0a9491755 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2329,20 +2329,13 @@ svm_range_best_restore_location(struct svm_range *prange, } static int -svm_range_get_range_boundaries(struct kfd_process *p, int64_t addr, - unsigned long *start, unsigned long *last, - bool *is_heap_stack) +svm_range_get_range_boundaries(struct kfd_process *p, struct vm_area_struct *vma, + int64_t addr, unsigned long *start, + unsigned long *last, bool *is_heap_stack) { - struct vm_area_struct *vma; struct interval_tree_node *node; unsigned long start_limit, end_limit; - vma = find_vma(p->mm, addr << PAGE_SHIFT); - if (!vma || (addr << PAGE_SHIFT) < vma->vm_start) { - pr_debug("VMA does not exist in address [0x%llx]\n", addr); - return -EFAULT; - } - *is_heap_stack = (vma->vm_start <= vma->vm_mm->brk && vma->vm_end >= vma->vm_mm->start_brk) || (vma->vm_start <= vma->vm_mm->start_stack && @@ -2437,9 +2430,10 @@ svm_range_check_vm_userptr(struct kfd_process *p, uint64_t start, uint64_t last, static struct svm_range *svm_range_create_unregistered_range(struct amdgpu_device *adev, - struct kfd_process *p, - struct mm_struct *mm, - int64_t addr) + struct kfd_process *p, + struct mm_struct *mm, + struct vm_area_struct *vma, + int64_t addr) {
RE: Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2
[AMD Official Use Only] Series looks good to me. Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Lazar, Lijo Sent: Thursday, November 18, 2021 22:41 To: Yang, Stanley ; amd-gfx@lists.freedesktop.org; Zhang, Hawking ; Clements, John ; Quan, Evan ; Wang, Yang(Kevin) Subject: Re: 回复: [PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2 On 11/18/2021 6:05 PM, Yang, Stanley wrote: > [AMD Official Use Only] > > > >> -邮件原件- >> 发件人: Lazar, Lijo >> 发送时间: Thursday, November 18, 2021 7:33 PM >> 收件人: Yang, Stanley ; amd- >> g...@lists.freedesktop.org; Zhang, Hawking ; >> Clements, John ; Quan, Evan >> ; Wang, Yang(Kevin) >> 主题: Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get >> ecc_table v2 >> >> >> >> On 11/18/2021 3:03 PM, Stanley.Yang wrote: >>> support ECC TABLE message, this table include umc ras error count >>> and error address >>> >>> v2: >>> add smu version check to query whether support ecctable >>> call smu_cmn_update_table to get ecctable directly >>> >>> Signed-off-by: Stanley.Yang >>> --- >>>drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 8 +++ >>>drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 >>>.../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 70 >> +++ >>>.../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + >>>4 files changed, 94 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h >>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h >>> index 3557f4e7fc30..7a06021a58f0 100644 >>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h >>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h >>> @@ -324,6 +324,7 @@ enum smu_table_id >>> SMU_TABLE_OVERDRIVE, >>> SMU_TABLE_I2C_COMMANDS, >>> SMU_TABLE_PACE, >>> + SMU_TABLE_ECCINFO, >>> SMU_TABLE_COUNT, >>>}; >>> >>> @@ -340,6 +341,7 @@ struct smu_table_context >>> void*max_sustainable_clocks; >>> struct smu_bios_boot_up_values boot_values; >>> void*driver_pptable; >>> + void*ecc_table; >>> struct smu_tabletables[SMU_TABLE_COUNT]; >>> /* >>> * The driver table is just a staging buffer for 
@@ -1261,6 >>> +1263,11 @@ struct pptable_funcs { >>> * >> of SMUBUS table. >>> */ >>> int (*send_hbm_bad_pages_num)(struct smu_context *smu, >> uint32_t >>> size); >>> + >>> + /** >>> +* @get_ecc_table: message SMU to get ECC INFO table. >>> +*/ >>> + ssize_t (*get_ecc_info)(struct smu_context *smu, void *table); >>>}; >>> >>>typedef enum { >>> @@ -1397,6 +1404,7 @@ int smu_set_light_sbr(struct smu_context *smu, >>> bool enable); >>> >>>int smu_wait_for_event(struct amdgpu_device *adev, enum >> smu_event_type event, >>>uint64_t event_arg); >>> +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); >>> >>>#endif >>>#endif >>> diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c >>> b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c >>> index 01168b8955bf..fd3b6b460b12 100644 >>> --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c >>> +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c >>> @@ -3072,6 +3072,20 @@ int smu_set_light_sbr(struct smu_context >>> *smu, >> bool enable) >>> return ret; >>>} >>> >>> +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc) { >>> + int ret = -EOPNOTSUPP; >>> + >>> + mutex_lock(&smu->mutex); >>> + if (smu->ppt_funcs && >>> + smu->ppt_funcs->get_ecc_info) >>> + ret = smu->ppt_funcs->get_ecc_info(smu, umc_ecc); >>> + mutex_unlock(&smu->mutex); >>> + >>> + return ret; >>> + >>> +} >>> + >>>static int smu_get_prv_buffer_details(void *handle, void **addr, >>> size_t >> *size) >>>{ >>> struct smu_context *smu = handle; diff --git >>> a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c >>> b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c >>> index f835d86cc2f5..4c21609ccea5 100644 >>> --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c >>> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c >>> @@ -78,6 +78,12 @@ >>> >>>#define smnPCIE_ESM_CTRL 0x111003D0 >>> >>> +/* >>> + * SMU support ECCTABLE since version 68.42.0, >>> + * use this to check ECCTALE feature whether support */ #define >>> +SUPPORT_ECCTABLE_SMU_VERSION 
0x00442a00 >>> + >>>static const struct smu_temperature_range smu13_thermal_policy[] = >>>{ >>> {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, >>> 99000}, @@ -190,6 +196,7 @@ static const struct cmn2asic_mapping >> aldebaran_table_map[SMU_TABLE_COUNT] = { >>> TAB_MAP(SMU_METRICS), >>> TAB_MAP(DRIVER_SMU_CONFIG), >>> TAB_MAP(I2C_COMMANDS), >>> + TAB_MAP(ECCINFO), >>>}; >>> >>>static const uint8_t aldebaran_throttler_map[] = { @@ -223,6 >>> +230,9 @@ static int aldebaran_tables_init(struct smu_context *smu) >>> SMU_TABLE
Re: Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2
On 11/18/2021 6:05 PM, Yang, Stanley wrote: [AMD Official Use Only] -邮件原件- 发件人: Lazar, Lijo 发送时间: Thursday, November 18, 2021 7:33 PM 收件人: Yang, Stanley ; amd- g...@lists.freedesktop.org; Zhang, Hawking ; Clements, John ; Quan, Evan ; Wang, Yang(Kevin) 主题: Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2 On 11/18/2021 3:03 PM, Stanley.Yang wrote: support ECC TABLE message, this table include umc ras error count and error address v2: add smu version check to query whether support ecctable call smu_cmn_update_table to get ecctable directly Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 8 +++ drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 70 +++ .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + 4 files changed, 94 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 3557f4e7fc30..7a06021a58f0 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -324,6 +324,7 @@ enum smu_table_id SMU_TABLE_OVERDRIVE, SMU_TABLE_I2C_COMMANDS, SMU_TABLE_PACE, + SMU_TABLE_ECCINFO, SMU_TABLE_COUNT, }; @@ -340,6 +341,7 @@ struct smu_table_context void*max_sustainable_clocks; struct smu_bios_boot_up_values boot_values; void*driver_pptable; + void*ecc_table; struct smu_tabletables[SMU_TABLE_COUNT]; /* * The driver table is just a staging buffer for @@ -1261,6 +1263,11 @@ struct pptable_funcs { * of SMUBUS table. */ int (*send_hbm_bad_pages_num)(struct smu_context *smu, uint32_t size); + + /** +* @get_ecc_table: message SMU to get ECC INFO table. 
+*/ + ssize_t (*get_ecc_info)(struct smu_context *smu, void *table); }; typedef enum { @@ -1397,6 +1404,7 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable); int smu_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event, uint64_t event_arg); +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); #endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index 01168b8955bf..fd3b6b460b12 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -3072,6 +3072,20 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable) return ret; } +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc) { + int ret = -EOPNOTSUPP; + + mutex_lock(&smu->mutex); + if (smu->ppt_funcs && + smu->ppt_funcs->get_ecc_info) + ret = smu->ppt_funcs->get_ecc_info(smu, umc_ecc); + mutex_unlock(&smu->mutex); + + return ret; + +} + static int smu_get_prv_buffer_details(void *handle, void **addr, size_t *size) { struct smu_context *smu = handle; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c index f835d86cc2f5..4c21609ccea5 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c @@ -78,6 +78,12 @@ #define smnPCIE_ESM_CTRL 0x111003D0 +/* + * SMU support ECCTABLE since version 68.42.0, + * use this to check ECCTALE feature whether support */ #define +SUPPORT_ECCTABLE_SMU_VERSION 0x00442a00 + static const struct smu_temperature_range smu13_thermal_policy[] = { {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000}, @@ -190,6 +196,7 @@ static const struct cmn2asic_mapping aldebaran_table_map[SMU_TABLE_COUNT] = { TAB_MAP(SMU_METRICS), TAB_MAP(DRIVER_SMU_CONFIG), TAB_MAP(I2C_COMMANDS), + TAB_MAP(ECCINFO), }; static const uint8_t aldebaran_throttler_map[] = { @@ -223,6 +230,9 @@ static int 
aldebaran_tables_init(struct smu_context *smu) SMU_TABLE_INIT(tables, SMU_TABLE_I2C_COMMANDS, sizeof(SwI2cRequest_t), PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); + SMU_TABLE_INIT(tables, SMU_TABLE_ECCINFO, sizeof(EccInfoTable_t), + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); + smu_table->metrics_table = kzalloc(sizeof(SmuMetrics_t), GFP_KERNEL); if (!smu_table->metrics_table) return -ENOMEM; @@ -235,6 +245,10 @@ static int aldebaran_tables_init(struct smu_context *smu) return -ENOMEM; } + smu_table->ecc_table = kzalloc(tables[SMU_TABLE_ECCINFO].size, GFP_KERNEL); + if (!smu_table->ecc_table) + return -ENOMEM; + return 0; } @@ -1765,6 +1779,61 @@
[PATCH] drm/amdgpu/pm: clean up some inconsistent indenting
Eliminate the following smatch warning:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/amd_powerplay.c:1554 pp_asic_reset_mode_2() warn: inconsistent indenting.

Reported-by: Abaci Robot
Signed-off-by: Jiapeng Chong
---
 drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
index 8d796ed..20cb234 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c
@@ -1551,7 +1551,7 @@ static int pp_set_ppfeature_status(void *handle, uint64_t ppfeature_masks)
 static int pp_asic_reset_mode_2(void *handle)
 {
 	struct pp_hwmgr *hwmgr = handle;
-        int ret = 0;
+	int ret = 0;
 
 	if (!hwmgr || !hwmgr->pm_en)
 		return -EINVAL;
-- 
1.8.3.1
Re: [PATCH v10 08/10] dyndbg: add print-to-tracefs, selftest with it - RFC
On 11/16/21 3:46 AM, Pekka Paalanen wrote: > On Fri, 12 Nov 2021 10:08:41 -0500 > Jason Baron wrote: > >> On 11/12/21 6:49 AM, Vincent Whitchurch wrote: >>> On Thu, Nov 11, 2021 at 03:02:04PM -0700, Jim Cromie wrote: Sean Paul proposed, in: https://urldefense.com/v3/__https://patchwork.freedesktop.org/series/78133/__;!!GjvTz_vk!HcKnMRByYkIdyF1apqQjlN5aBIomzJR1an3YWXM6KXs0EftVMQdrewRA8Dki4A$ drm/trace: Mirror DRM debug logs to tracefs His patchset's objective is to be able to independently steer some of the drm.debug stream to an alternate tracing destination, by splitting drm_debug_enabled() into syslog & trace flavors, and enabling them separately. 2 advantages were identified: 1- syslog is heavyweight, tracefs is much lighter 2- separate selection of enabled categories means less traffic Dynamic-Debug can do 2nd exceedingly well: A- all work is behind jump-label's NOOP, zero off cost. B- exact site selectivity, precisely the useful traffic. can tailor enabled set interactively, at shell. Since the tracefs interface is effective for drm (the threads suggest so), adding that interface to dynamic-debug has real potential for everyone including drm. if CONFIG_TRACING: Grab Sean's trace_init/cleanup code, use it to provide tracefs available by default to all pr_debugs. This will likely need some further per-module treatment; perhaps something reflecting hierarchy of module,file,function,line, maybe with a tuned flattening. endif CONFIG_TRACING Add a new +T flag to enable tracing, independent of +p, and add and use 3 macros: dyndbg_site_is_enabled/logging/tracing(), to encapsulate the flag checks. Existing code treats T like other flags. 
>>> >>> I posted a patchset a while ago to do something very similar, but that >>> got stalled for some reason and I unfortunately didn't follow it up: >>> >>> >>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/20200825153338.17061-1-vincent.whitchu...@axis.com/__;!!GjvTz_vk!HcKnMRByYkIdyF1apqQjlN5aBIomzJR1an3YWXM6KXs0EftVMQdrewRGytKHPg$ >>> >>> >>> A key difference between that patchset and this patch (besides that >>> small fact that I used +x instead of +T) was that my patchset allowed >>> the dyndbg trace to be emitted to the main buffer and did not force them >>> to be in an instance-specific buffer. >> >> Yes, I agree I'd prefer that we print here to the 'main' buffer - it >> seems to keep things simpler and easier to combine the output from >> different sources as you mentioned. > > Hi, > > I'm not quite sure I understand this discussion, but I would like to > remind you all of what Sean's original work is about: > > Userspace configures DRM tracing into a flight recorder buffer (I guess > this is what you refer to "instance-specific buffer"). > > Userspace runs happily for months, and then hits a problem: a failure > in the DRM sub-system most likely, e.g. an ioctl that should never > fail, failed. Userspace handles that failure by dumping the flight > recorder buffer into a file and saving or sending a bug report. The > flight recorder contents give a log of all relevant DRM in-kernel > actions leading to the unexpected failure to help developers debug it. > > I don't mind if one can additionally send the flight recorder stream to > the main buffer, but I do want the separate flight recorder buffer to > be an option so that a) unrelated things cannot flood the interesting > bits out of it, and b) the scope of collected information is relevant. 
> The very reason for this work is problems that are very difficult to
> reproduce in practice, either because the problem itself is triggered
> very rarely and randomly, or because the end users of the system have
> either no knowledge or no access to reconfigure debug logging and then
> reproduce the problem with good debug logs.
>
> Thank you very much for pushing this work forward!

So I think Vincent (earlier in the thread) was saying that he finds it very helpful to have dynamic debug output go to the 'main' trace buffer, while you seem to be saying you'd prefer it to go to a dynamic-debug-specific trace buffer. We could certainly have dynamic debug output potentially go to both places - although I think this would mean two tracepoints? But I really wonder whether we need a separate tracing buffer for dynamic debug when what goes to the 'main' buffer can be controlled and filtered to avoid your concern about a 'flood'?

Thanks,

-Jason
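For readers following along, the control-file usage under discussion (the `+T` flag is from Jim's RFC and not yet upstream, so treat this as a sketch of the proposed interface, not a settled ABI) would look roughly like:

```shell
# Enable trace output (tracefs) for all pr_debug sites in drm.ko,
# without also enabling syslog output (+p):
echo 'module drm +T' > /proc/dynamic_debug/control

# Enable both syslog and trace for one file, then later drop syslog:
echo 'file drm_atomic.c +pT' > /proc/dynamic_debug/control
echo 'file drm_atomic.c -p'  > /proc/dynamic_debug/control

# Read back what was captured (the 'main' buffer, per the
# preference Vincent and Jason express above):
cat /sys/kernel/tracing/trace
```

The `+p`/`-p` lines use the existing dynamic-debug syntax; only `T` is new in the RFC.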
Re: [PATCH 1/3] drm/amdkfd: process exit and retry fault race
On 2021-11-17 6:18 p.m., Felix Kuehling wrote:
On 2021-11-16 10:43 p.m., Philip Yang wrote:

The kfd process mmu release notifier callback drains retry faults to ensure no retry fault comes in after the kfd process has been removed from the hash table; otherwise the svm page fault handler will fail to recover the fault and will dump a GPU vm fault log.

Draining retry faults needs to flush the restore page fault work to wait until the last fault is handled, because IH dispatch increases rptr first and then calls restore_pages, so restore_pages may still be handling the last fault while amdgpu_ih_has_checkpoint_processed already returns true.

This fixes the problem, but it will result in waiting longer than necessary because the worker only finishes when the IH ring is empty.

I am working on a new IH ring1 overflow patch to handle the drain_retry_fault race; the flush will not be needed here.

restore_pages cannot call mmget because mmput may call the mmu notifier release and cause a deadlock. See my comment inline.

Refactor the deferred list work to call mmget and take the mmap write lock to handle all ranges, to avoid the mm being gone while inserting the mmu notifier.

Signed-off-by: Philip Yang
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  6 +++
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 69 
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |  1 +
 3 files changed, 41 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d4c8a6948a9f..8b4b045d5c92 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1143,6 +1143,12 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 	if (WARN_ON(p->mm != mm))
 		return;
 
+	/*
+	 * Ensure no retry fault comes in afterwards, as page fault handler will
+	 * not find kfd process and take mm lock to recover fault.
+	 */
+	svm_range_drain_retry_fault(&p->svms);
+
 	mutex_lock(&kfd_processes_mutex);
 	hash_del_rcu(&p->kfd_processes);
 	mutex_unlock(&kfd_processes_mutex);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 88360f23eb61..c1f367934428 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1953,9 +1953,10 @@ svm_range_handle_list_op(struct svm_range_list *svms, struct svm_range *prange)
 	}
 }
 
-static void svm_range_drain_retry_fault(struct svm_range_list *svms)
+void svm_range_drain_retry_fault(struct svm_range_list *svms)
 {
 	struct kfd_process_device *pdd;
+	struct amdgpu_device *adev;
 	struct kfd_process *p;
 	uint32_t i;
@@ -1967,9 +1968,11 @@ static void svm_range_drain_retry_fault(struct svm_range_list *svms)
 			continue;
 
 		pr_debug("drain retry fault gpu %d svms %p\n", i, svms);
+		adev = pdd->dev->adev;
+		amdgpu_ih_wait_on_checkpoint_process(adev, &adev->irq.ih1);
 
-		amdgpu_ih_wait_on_checkpoint_process(pdd->dev->adev,
-						     &pdd->dev->adev->irq.ih1);
+		/* Wait for the last page fault is handled */
+		flush_work(&adev->irq.ih1_work);
 		pr_debug("drain retry fault gpu %d svms 0x%p done\n", i, svms);
 	}
 }
@@ -1979,43 +1982,43 @@ static void svm_range_deferred_list_work(struct work_struct *work)
 	struct svm_range_list *svms;
 	struct svm_range *prange;
 	struct mm_struct *mm;
+	struct kfd_process *p;
 
 	svms = container_of(work, struct svm_range_list,
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On 11/18/2021 7:55 PM, Alex Deucher wrote: On Thu, Nov 18, 2021 at 9:15 AM Lazar, Lijo wrote: On 11/18/2021 7:41 PM, Christian König wrote: Am 18.11.21 um 15:09 schrieb Lazar, Lijo: On 11/18/2021 7:36 PM, Alex Deucher wrote: On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike wrote: [Public] -Original Message- From: Lazar, Lijo Sent: Thursday, November 18, 2021 4:01 PM To: Liang, Prike ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted On 11/18/2021 12:32 PM, Prike Liang wrote: Do ASIC reset at the moment Sx suspend aborted behind of amdgpu suspend to keep AMDGPU in a clean reset state and that can avoid re-initialize device improperly error. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 19 +++ 3 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b85b67a..8bd9833 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1053,6 +1053,7 @@ struct amdgpu_device { boolin_s3; boolin_s4; boolin_s0ix; + boolpm_completed; PM framework maintains separate flags, why not use the same? dev->power.is_suspended = false; dev->power.is_noirq_suspended = false; dev->power.is_late_suspended = false; Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend. BTW, Alex posted a patch which does similar thing, though it tries reset if suspend fails. 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&data=04%7C01%7Clijo.lazar%40amd.com%7C2ce211aeeeb448f6cb2c08d9aa9f4741%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728423343483218%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=nyzhGwTJV83YZkit34Bb%2B5tBxGEMvFzCyZPjz8eSDgc%3D&reserved=0 DM6PR12MB2619.namprd12.prod.outlook.com/ If that reset also failed, then no point in another reset, or keep it as part of resume. Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here. Yeah, I was thinking we'd take this patch rather than mine to minimize the number of resets. Wondering if suspend fails whether there is a need to call resume. It may not get to resume() to do a reset, that needs to be checked. I would rather do it so that we reset the ASIC during resume when we detect that the hardware is in a bad state. This way it should also work with things like kexec and virtualization. I was thinking from the power framework perspective. If the device's suspend() callback returns failure, why would the framework needs to call a resume() on that device. The device's suspend callback succeeds, but some other device or event in the system causes the overall suspend to abort. As such the GPU is never powered off by the platform so it's left in a partially initialized state. The system then calls the resume() callbacks for all of the devices that were previously suspended to bring them back to a working system. GPU reset is just a convenient way to get the device back into a known good state. 
Unfortunately, I'm not sure if there is a good way to detect whether the GPU is in a known good state or not until we try and re-init the IPs, and at that point the IPs are not fully initialized, so it's complex to try and unwind that, reset, and try again. It's probably easiest to just reset the GPU on resume all the time. If the GPU was powered down, the reset should work fine since we are resetting a GPU that just came out of reset. If the GPU was not powered down, the reset will put the GPU into a known good state.

https://elixir.bootlin.com/linux/latest/source/drivers/base/power/main.c#L925

I don't have a machine to trace the path. The flag is set only if suspend is successful.

Thanks,
Lijo

Alex

Thanks,
Lijo

Regards,
Christian.

Thanks,
Lijo

Alex

Thanks,
Lijo

 	atomic_t		in_gpu_reset;
 	enum pp_mp1_state	mp1_state;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index ec42a6f..a12ed54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3983,6 +3983,10 @@ int amdgpu_device
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On Thu, Nov 18, 2021 at 9:15 AM Lazar, Lijo wrote: > > > > On 11/18/2021 7:41 PM, Christian König wrote: > > Am 18.11.21 um 15:09 schrieb Lazar, Lijo: > >> On 11/18/2021 7:36 PM, Alex Deucher wrote: > >>> On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike > >>> wrote: > > [Public] > > > -Original Message- > > From: Lazar, Lijo > > Sent: Thursday, November 18, 2021 4:01 PM > > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > > Cc: Deucher, Alexander ; Huang, Ray > > > > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend > > aborted > > > > > > > > On 11/18/2021 12:32 PM, Prike Liang wrote: > >> Do ASIC reset at the moment Sx suspend aborted behind of amdgpu > >> suspend to keep AMDGPU in a clean reset state and that can avoid > >> re-initialize device improperly error. > >> > >> Signed-off-by: Prike Liang > >> --- > >>drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + > >>drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 > >>drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 19 > > +++ > >>3 files changed, 24 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > >> b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > >> index b85b67a..8bd9833 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > >> @@ -1053,6 +1053,7 @@ struct amdgpu_device { > >> boolin_s3; > >> boolin_s4; > >> boolin_s0ix; > >> + boolpm_completed; > > > > PM framework maintains separate flags, why not use the same? > > > > dev->power.is_suspended = false; > > dev->power.is_noirq_suspended = false; > > dev->power.is_late_suspended = false; > > > > Thanks for pointing it out and I miss that flag. In this case we can > use the PM flag is_noirq_suspended for checking AMDGPU device > whether is finished suspend. > > > BTW, Alex posted a patch which does similar thing, though it tries > > reset if > > suspend fails. 
> > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&data=04%7C01%7Clijo.lazar%40amd.com%7C79c861fcfccc4f2cc45d08d9aa9d61f0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728415203834017%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=gGnctl8rlNTj6o02hlE06og3RCA%2BQP37B3ejsZSxPdM%3D&reserved=0 > > > > DM6PR12MB2619.namprd12.prod.outlook.com/ > > > > If that reset also failed, then no point in another reset, or keep > > it as part of > > resume. > > > > Alex's patch seems always do the ASIC reset at the end of AMDGPU > device, but that may needn't on the normal AMDGPU device suspend. > However, this patch shows that can handle the system suspend aborted > after AMDGPU suspend case well, so now seems we only need take care > suspend abort case here. > > >>> > >>> Yeah, I was thinking we'd take this patch rather than mine to minimize > >>> the number of resets. > >>> > >> > >> Wondering if suspend fails whether there is a need to call resume. It > >> may not get to resume() to do a reset, that needs to be checked. > > > > I would rather do it so that we reset the ASIC during resume when we > > detect that the hardware is in a bad state. > > > > This way it should also work with things like kexec and virtualization. > > I was thinking from the power framework perspective. If the device's > suspend() callback returns failure, why would the framework needs to > call a resume() on that device. The device's suspend callback succeeds, but some other device or event in the system causes the overall suspend to abort. As such the GPU is never powered off by the platform so it's left in a partially initialized state. The system then calls the resume() callbacks for all of the devices that were previously suspended to bring them back to a working system. GPU reset is just a convenient way to get the device back into a known good state. 
Unfortunately, I'm not sure if there is a good way to detect whether the GPU is in a known good state or not until we try and re-init the IPs and at that point the IPs are not fully initialized so it's complex to try and unwind that, reset, and try again. It's probably easiest to just reset the GPU on resume all the time. If the GPU was powered down, the reset should work fine since we are resetting a GPU that just came out of reset. If the GPU was not powered down, the reset will put the GPU into a known good state. Alex > > Thanks, > Lijo > > > > > Regards, > > Christian. > > > >> > >> Thanks, > >> Lijo > >> > >>
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On 11/18/2021 7:41 PM, Christian König wrote: Am 18.11.21 um 15:09 schrieb Lazar, Lijo: On 11/18/2021 7:36 PM, Alex Deucher wrote: On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike wrote: [Public] -Original Message- From: Lazar, Lijo Sent: Thursday, November 18, 2021 4:01 PM To: Liang, Prike ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted On 11/18/2021 12:32 PM, Prike Liang wrote: Do ASIC reset at the moment Sx suspend aborted behind of amdgpu suspend to keep AMDGPU in a clean reset state and that can avoid re-initialize device improperly error. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 19 +++ 3 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b85b67a..8bd9833 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1053,6 +1053,7 @@ struct amdgpu_device { bool in_s3; bool in_s4; bool in_s0ix; + bool pm_completed; PM framework maintains separate flags, why not use the same? dev->power.is_suspended = false; dev->power.is_noirq_suspended = false; dev->power.is_late_suspended = false; Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend. BTW, Alex posted a patch which does similar thing, though it tries reset if suspend fails. 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&data=04%7C01%7Clijo.lazar%40amd.com%7C79c861fcfccc4f2cc45d08d9aa9d61f0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728415203834017%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=gGnctl8rlNTj6o02hlE06og3RCA%2BQP37B3ejsZSxPdM%3D&reserved=0 DM6PR12MB2619.namprd12.prod.outlook.com/ If that reset also failed, then no point in another reset, or keep it as part of resume. Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here. Yeah, I was thinking we'd take this patch rather than mine to minimize the number of resets. Wondering if suspend fails whether there is a need to call resume. It may not get to resume() to do a reset, that needs to be checked. I would rather do it so that we reset the ASIC during resume when we detect that the hardware is in a bad state. This way it should also work with things like kexec and virtualization. I was thinking from the power framework perspective. If the device's suspend() callback returns failure, why would the framework needs to call a resume() on that device. Thanks, Lijo Regards, Christian. 
Thanks, Lijo Alex Thanks, Lijo atomic_t in_gpu_reset; enum pp_mp1_state mp1_state; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ec42a6f..a12ed54 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon) if (adev->in_s0ix) amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry); + if (!adev->pm_completed) { + dev_warn(adev->dev, "suspend aborted will do asic reset\n"); + amdgpu_asic_reset(adev); + } /* post card */ if (amdgpu_device_need_post(adev)) { r = amdgpu_device_asic_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index eee3cf8..9f90017 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct device *dev) return r; } +/* + * Actually the PM suspend whether is completed should be confirmed + * by checking the sysfs +sys/power/suspend_stats/failed_suspend.However, + * in this function only check the AMDGPU device whether is suspended + * completely in the system-wide suspend process. + */ +static int amdgpu_pmops_noirq_suspend(struct device *dev) { + struct drm_device *drm_dev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(drm_dev); + + dev_dbg(dev, "amdgpu suspend completely.\n"); + adev->pm_completed = true; + + return
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
Am 18.11.21 um 15:09 schrieb Lazar, Lijo: On 11/18/2021 7:36 PM, Alex Deucher wrote: On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike wrote: [Public] -Original Message- From: Lazar, Lijo Sent: Thursday, November 18, 2021 4:01 PM To: Liang, Prike ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted On 11/18/2021 12:32 PM, Prike Liang wrote: Do ASIC reset at the moment Sx suspend aborted behind of amdgpu suspend to keep AMDGPU in a clean reset state and that can avoid re-initialize device improperly error. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 19 +++ 3 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b85b67a..8bd9833 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1053,6 +1053,7 @@ struct amdgpu_device { bool in_s3; bool in_s4; bool in_s0ix; + bool pm_completed; PM framework maintains separate flags, why not use the same? dev->power.is_suspended = false; dev->power.is_noirq_suspended = false; dev->power.is_late_suspended = false; Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend. BTW, Alex posted a patch which does similar thing, though it tries reset if suspend fails. 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&data=04%7C01%7CLijo.Lazar%40amd.com%7C6401dce9411b4c134b0208d9aa9ca644%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728412055353107%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Z4dgNvHISHVlR4grHm1RU3%2FJMJVdRe7Ukq1DnjCgG0o%3D&reserved=0 DM6PR12MB2619.namprd12.prod.outlook.com/ If that reset also failed, then no point in another reset, or keep it as part of resume. Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here. Yeah, I was thinking we'd take this patch rather than mine to minimize the number of resets. Wondering if suspend fails whether there is a need to call resume. It may not get to resume() to do a reset, that needs to be checked. I would rather do it so that we reset the ASIC during resume when we detect that the hardware is in a bad state. This way it should also work with things like kexec and virtualization. Regards, Christian. 
Thanks, Lijo Alex Thanks, Lijo atomic_t in_gpu_reset; enum pp_mp1_state mp1_state; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ec42a6f..a12ed54 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon) if (adev->in_s0ix) amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry); + if (!adev->pm_completed) { + dev_warn(adev->dev, "suspend aborted will do asic reset\n"); + amdgpu_asic_reset(adev); + } /* post card */ if (amdgpu_device_need_post(adev)) { r = amdgpu_device_asic_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index eee3cf8..9f90017 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct device *dev) return r; } +/* + * Actually the PM suspend whether is completed should be confirmed + * by checking the sysfs +sys/power/suspend_stats/failed_suspend.However, + * in this function only check the AMDGPU device whether is suspended + * completely in the system-wide suspend process. + */ +static int amdgpu_pmops_noirq_suspend(struct device *dev) { + struct drm_device *drm_dev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(drm_dev); + + dev_dbg(dev, "amdgpu suspend completely.\n"); + adev->pm_completed = true; + + return 0; +} + static int amdgpu_pmops_resume(struct device *dev) { struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6 +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev) r = amdgpu_device_resume(drm_dev, true)
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On 11/18/2021 7:36 PM, Alex Deucher wrote: On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike wrote: [Public] -Original Message- From: Lazar, Lijo Sent: Thursday, November 18, 2021 4:01 PM To: Liang, Prike ; amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Huang, Ray Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted On 11/18/2021 12:32 PM, Prike Liang wrote: Do ASIC reset at the moment Sx suspend aborted behind of amdgpu suspend to keep AMDGPU in a clean reset state and that can avoid re-initialize device improperly error. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 19 +++ 3 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b85b67a..8bd9833 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1053,6 +1053,7 @@ struct amdgpu_device { boolin_s3; boolin_s4; boolin_s0ix; + boolpm_completed; PM framework maintains separate flags, why not use the same? dev->power.is_suspended = false; dev->power.is_noirq_suspended = false; dev->power.is_late_suspended = false; Thanks for pointing it out and I miss that flag. In this case we can use the PM flag is_noirq_suspended for checking AMDGPU device whether is finished suspend. BTW, Alex posted a patch which does similar thing, though it tries reset if suspend fails. 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FDM6PR12MB26195F8E099407B4B6966FEBE4999%40&data=04%7C01%7CLijo.Lazar%40amd.com%7C6401dce9411b4c134b0208d9aa9ca644%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637728412055353107%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=Z4dgNvHISHVlR4grHm1RU3%2FJMJVdRe7Ukq1DnjCgG0o%3D&reserved=0 DM6PR12MB2619.namprd12.prod.outlook.com/ If that reset also failed, then no point in another reset, or keep it as part of resume. Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but that may needn't on the normal AMDGPU device suspend. However, this patch shows that can handle the system suspend aborted after AMDGPU suspend case well, so now seems we only need take care suspend abort case here. Yeah, I was thinking we'd take this patch rather than mine to minimize the number of resets. Wondering if suspend fails whether there is a need to call resume. It may not get to resume() to do a reset, that needs to be checked. 
Thanks, Lijo Alex Thanks, Lijo atomic_tin_gpu_reset; enum pp_mp1_state mp1_state; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ec42a6f..a12ed54 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon) if (adev->in_s0ix) amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry); + if (!adev->pm_completed) { + dev_warn(adev->dev, "suspend aborted will do asic reset\n"); + amdgpu_asic_reset(adev); + } /* post card */ if (amdgpu_device_need_post(adev)) { r = amdgpu_device_asic_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index eee3cf8..9f90017 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct device *dev) return r; } +/* + * Actually the PM suspend whether is completed should be confirmed + * by checking the sysfs +sys/power/suspend_stats/failed_suspend.However, + * in this function only check the AMDGPU device whether is suspended + * completely in the system-wide suspend process. + */ +static int amdgpu_pmops_noirq_suspend(struct device *dev) { + struct drm_device *drm_dev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(drm_dev); + + dev_dbg(dev, "amdgpu suspend completely.\n"); + adev->pm_completed = true; + + return 0; +} + static int amdgpu_pmops_resume(struct device *dev) { struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6 +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev) r = amdgpu_device_resume(drm_dev, true); if (amdgpu_acpi_is_s0ix_active(adev)) adev->in_s0ix = false; + adev->pm_completed = false; return r; } @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops = { .runtime_suspend = amdgpu_pmops_runtime_suspend,
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On Thu, Nov 18, 2021 at 8:11 AM Liang, Prike wrote: > > [Public] > > > -Original Message- > > From: Lazar, Lijo > > Sent: Thursday, November 18, 2021 4:01 PM > > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > > Cc: Deucher, Alexander ; Huang, Ray > > > > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend > > aborted > > > > > > > > On 11/18/2021 12:32 PM, Prike Liang wrote: > > > Do ASIC reset at the moment Sx suspend aborted behind of amdgpu > > > suspend to keep AMDGPU in a clean reset state and that can avoid > > > re-initialize device improperly error. > > > > > > Signed-off-by: Prike Liang > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + > > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 > > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 19 > > +++ > > > 3 files changed, 24 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > index b85b67a..8bd9833 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > @@ -1053,6 +1053,7 @@ struct amdgpu_device { > > > boolin_s3; > > > boolin_s4; > > > boolin_s0ix; > > > + boolpm_completed; > > > > PM framework maintains separate flags, why not use the same? > > > > dev->power.is_suspended = false; > > dev->power.is_noirq_suspended = false; > > dev->power.is_late_suspended = false; > > > > Thanks for pointing it out and I miss that flag. In this case we can use the > PM flag is_noirq_suspended for checking AMDGPU device whether is finished > suspend. > > > BTW, Alex posted a patch which does similar thing, though it tries reset if > > suspend fails. > > > > https://lore.kernel.org/all/DM6PR12MB26195F8E099407B4B6966FEBE4999@ > > DM6PR12MB2619.namprd12.prod.outlook.com/ > > > > If that reset also failed, then no point in another reset, or keep it as > > part of > > resume. 
> > > > Alex's patch seems always do the ASIC reset at the end of AMDGPU device, but > that may needn't on the normal AMDGPU device suspend. However, this patch > shows that can handle the system suspend aborted after AMDGPU suspend case > well, so now seems we only need take care suspend abort case here. > Yeah, I was thinking we'd take this patch rather than mine to minimize the number of resets. Alex > > Thanks, > > Lijo > > > > > > > > atomic_tin_gpu_reset; > > > enum pp_mp1_state mp1_state; > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > index ec42a6f..a12ed54 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device > > *dev, bool fbcon) > > > if (adev->in_s0ix) > > > amdgpu_gfx_state_change_set(adev, > > sGpuChangeState_D0Entry); > > > > > > + if (!adev->pm_completed) { > > > + dev_warn(adev->dev, "suspend aborted will do asic reset\n"); > > > + amdgpu_asic_reset(adev); > > > + } > > > /* post card */ > > > if (amdgpu_device_need_post(adev)) { > > > r = amdgpu_device_asic_init(adev); diff --git > > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > index eee3cf8..9f90017 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > > @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct > > device *dev) > > > return r; > > > } > > > > > > +/* > > > + * Actually the PM suspend whether is completed should be confirmed > > > + * by checking the sysfs > > > +sys/power/suspend_stats/failed_suspend.However, > > > + * in this function only check the AMDGPU device whether is suspended > > > + * completely in the system-wide suspend process. 
> > > + */ > > > +static int amdgpu_pmops_noirq_suspend(struct device *dev) { > > > + struct drm_device *drm_dev = dev_get_drvdata(dev); > > > + struct amdgpu_device *adev = drm_to_adev(drm_dev); > > > + > > > + dev_dbg(dev, "amdgpu suspend completely.\n"); > > > + adev->pm_completed = true; > > > + > > > + return 0; > > > +} > > > + > > > static int amdgpu_pmops_resume(struct device *dev) > > > { > > > struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6 > > > +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev) > > > r = amdgpu_device_resume(drm_dev, true); > > > if (amdgpu_acpi_is_s0ix_active(adev)) > > > adev->in_s0ix = false; > > > + adev->pm_completed = false; > > > return r; > > > } > > > > > > @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops > > = { > > > .runtime_suspend = amdgpu_pmops_runtime
RE: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
[Public] > -Original Message- > From: Lazar, Lijo > Sent: Thursday, November 18, 2021 4:01 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander ; Huang, Ray > > Subject: Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend > aborted > > > > On 11/18/2021 12:32 PM, Prike Liang wrote: > > Do ASIC reset at the moment Sx suspend aborted behind of amdgpu > > suspend to keep AMDGPU in a clean reset state and that can avoid > > re-initialize device improperly error. > > > > Signed-off-by: Prike Liang > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 > > drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 19 > +++ > > 3 files changed, 24 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > index b85b67a..8bd9833 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > @@ -1053,6 +1053,7 @@ struct amdgpu_device { > > boolin_s3; > > boolin_s4; > > boolin_s0ix; > > + boolpm_completed; > > PM framework maintains separate flags, why not use the same? > > dev->power.is_suspended = false; > dev->power.is_noirq_suspended = false; > dev->power.is_late_suspended = false; > Thanks for pointing that out; I had missed those flags. In that case we can use the PM flag is_noirq_suspended to check whether the AMDGPU device has finished suspending. > BTW, Alex posted a patch which does similar thing, though it tries reset if > suspend fails. > > https://lore.kernel.org/all/DM6PR12MB26195F8E099407B4B6966FEBE4999@ > DM6PR12MB2619.namprd12.prod.outlook.com/ > > If that reset also failed, then no point in another reset, or keep it as part > of > resume. > Alex's patch seems to always do the ASIC reset at the end of AMDGPU device suspend, but that may not be needed on a normal suspend. 
However, this patch handles the case where the system-wide suspend is aborted after the AMDGPU suspend, so it seems we only need to take care of the suspend-abort case here. > Thanks, > Lijo > > > > > atomic_tin_gpu_reset; > > enum pp_mp1_state mp1_state; > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > index ec42a6f..a12ed54 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device > *dev, bool fbcon) > > if (adev->in_s0ix) > > amdgpu_gfx_state_change_set(adev, > sGpuChangeState_D0Entry); > > > > + if (!adev->pm_completed) { > > + dev_warn(adev->dev, "suspend aborted will do asic reset\n"); > > + amdgpu_asic_reset(adev); > > + } > > /* post card */ > > if (amdgpu_device_need_post(adev)) { > > r = amdgpu_device_asic_init(adev); diff --git > > a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > index eee3cf8..9f90017 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c > > @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct > device *dev) > > return r; > > } > > > > +/* > > + * Actually the PM suspend whether is completed should be confirmed > > + * by checking the sysfs > > +sys/power/suspend_stats/failed_suspend.However, > > + * in this function only check the AMDGPU device whether is suspended > > + * completely in the system-wide suspend process. 
> > + */ > > +static int amdgpu_pmops_noirq_suspend(struct device *dev) { > > + struct drm_device *drm_dev = dev_get_drvdata(dev); > > + struct amdgpu_device *adev = drm_to_adev(drm_dev); > > + > > + dev_dbg(dev, "amdgpu suspend completely.\n"); > > + adev->pm_completed = true; > > + > > + return 0; > > +} > > + > > static int amdgpu_pmops_resume(struct device *dev) > > { > > struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6 > > +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev) > > r = amdgpu_device_resume(drm_dev, true); > > if (amdgpu_acpi_is_s0ix_active(adev)) > > adev->in_s0ix = false; > > + adev->pm_completed = false; > > return r; > > } > > > > @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops > = { > > .runtime_suspend = amdgpu_pmops_runtime_suspend, > > .runtime_resume = amdgpu_pmops_runtime_resume, > > .runtime_idle = amdgpu_pmops_runtime_idle, > > + .suspend_noirq = amdgpu_pmops_noirq_suspend, > > }; > > > > static int amdgpu_flush(struct file *f, fl_owner_t id) > >
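The flag-based abort detection in the patch above can be sketched as a self-contained mock in plain C (no kernel headers; `mock_adev` and the reset counter are stand-ins for `struct amdgpu_device` and `amdgpu_asic_reset()`, invented here for illustration). The idea is that the flag is only set once the noirq suspend phase has run, so a resume that finds it clear knows the suspend sequence was aborted partway:

```c
#include <assert.h>
#include <stdbool.h>

/* Mock of the abort-detection pattern from the patch: the real code
 * sets adev->pm_completed in the driver's suspend_noirq callback (the
 * last driver-visible suspend phase) and checks it on resume; if it
 * was never set, suspend must have been aborted after the driver's
 * ->suspend ran, so the ASIC is reset. All names here are stand-ins. */
struct mock_adev {
	bool pm_completed;  /* set only once the noirq phase has run */
	int  asic_resets;   /* counts mock "amdgpu_asic_reset" calls */
};

static void mock_suspend(struct mock_adev *adev)
{
	adev->pm_completed = false;     /* ->suspend ran, noirq not yet */
}

static void mock_suspend_noirq(struct mock_adev *adev)
{
	adev->pm_completed = true;      /* deepest phase reached */
}

static void mock_resume(struct mock_adev *adev)
{
	if (!adev->pm_completed)
		adev->asic_resets++;    /* suspend aborted: do asic reset */
	adev->pm_completed = false;     /* rearm for the next cycle */
}
```

As the thread notes, the PM core already tracks a similar per-device flag (`dev->power.is_noirq_suspended`), which could replace the driver-private one; the detection logic would stay the same.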
Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2
[AMD Official Use Only] > -Original Message- > From: Lazar, Lijo > Sent: Thursday, November 18, 2021 7:33 PM > To: Yang, Stanley ; amd- > g...@lists.freedesktop.org; Zhang, Hawking ; > Clements, John ; Quan, Evan > ; Wang, Yang(Kevin) > Subject: Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get > ecc_table v2 > > > > On 11/18/2021 3:03 PM, Stanley.Yang wrote: > > support ECC TABLE message, this table include umc ras error count and > > error address > > > > v2: > > add smu version check to query whether support ecctable > > call smu_cmn_update_table to get ecctable directly > > > > Signed-off-by: Stanley.Yang > > --- > > drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 8 +++ > > drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 > > .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 70 > +++ > > .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + > > 4 files changed, 94 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h > > b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h > > index 3557f4e7fc30..7a06021a58f0 100644 > > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h > > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h > > @@ -324,6 +324,7 @@ enum smu_table_id > > SMU_TABLE_OVERDRIVE, > > SMU_TABLE_I2C_COMMANDS, > > SMU_TABLE_PACE, > > + SMU_TABLE_ECCINFO, > > SMU_TABLE_COUNT, > > }; > > > > @@ -340,6 +341,7 @@ struct smu_table_context > > void*max_sustainable_clocks; > > struct smu_bios_boot_up_values boot_values; > > void*driver_pptable; > > + void*ecc_table; > > struct smu_tabletables[SMU_TABLE_COUNT]; > > /* > > * The driver table is just a staging buffer for @@ -1261,6 > > +1263,11 @@ struct pptable_funcs { > > * > of SMUBUS table. > > */ > > int (*send_hbm_bad_pages_num)(struct smu_context *smu, > uint32_t > > size); > > + > > + /** > > +* @get_ecc_table: message SMU to get ECC INFO table. 
> > +*/ > > + ssize_t (*get_ecc_info)(struct smu_context *smu, void *table); > > }; > > > > typedef enum { > > @@ -1397,6 +1404,7 @@ int smu_set_light_sbr(struct smu_context *smu, > > bool enable); > > > > int smu_wait_for_event(struct amdgpu_device *adev, enum > smu_event_type event, > >uint64_t event_arg); > > +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); > > > > #endif > > #endif > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > index 01168b8955bf..fd3b6b460b12 100644 > > --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c > > @@ -3072,6 +3072,20 @@ int smu_set_light_sbr(struct smu_context *smu, > bool enable) > > return ret; > > } > > > > +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc) { > > + int ret = -EOPNOTSUPP; > > + > > + mutex_lock(&smu->mutex); > > + if (smu->ppt_funcs && > > + smu->ppt_funcs->get_ecc_info) > > + ret = smu->ppt_funcs->get_ecc_info(smu, umc_ecc); > > + mutex_unlock(&smu->mutex); > > + > > + return ret; > > + > > +} > > + > > static int smu_get_prv_buffer_details(void *handle, void **addr, size_t > *size) > > { > > struct smu_context *smu = handle; > > diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c > > b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c > > index f835d86cc2f5..4c21609ccea5 100644 > > --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c > > +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c > > @@ -78,6 +78,12 @@ > > > > #define smnPCIE_ESM_CTRL 0x111003D0 > > > > +/* > > + * SMU support ECCTABLE since version 68.42.0, > > + * use this to check ECCTALE feature whether support */ #define > > +SUPPORT_ECCTABLE_SMU_VERSION 0x00442a00 > > + > > static const struct smu_temperature_range smu13_thermal_policy[] = > > { > > {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, > > 99000}, @@ -190,6 +196,7 @@ static const struct cmn2asic_mapping > 
aldebaran_table_map[SMU_TABLE_COUNT] = { > > TAB_MAP(SMU_METRICS), > > TAB_MAP(DRIVER_SMU_CONFIG), > > TAB_MAP(I2C_COMMANDS), > > + TAB_MAP(ECCINFO), > > }; > > > > static const uint8_t aldebaran_throttler_map[] = { @@ -223,6 +230,9 > > @@ static int aldebaran_tables_init(struct smu_context *smu) > > SMU_TABLE_INIT(tables, SMU_TABLE_I2C_COMMANDS, > sizeof(SwI2cRequest_t), > >PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); > > > > + SMU_TABLE_INIT(tables, SMU_TABLE_ECCINFO, > sizeof(EccInfoTable_t), > > + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); > > + > > smu_table->metrics_table = kzalloc(sizeof(SmuMetrics_t), > GFP_KERNEL); > > if (!smu_table->metrics_table) > > return -ENOMEM; > > @@ -235,6 +245,10 @@ static int aldebaran_tables_init(struct smu_context > *s
Re: [PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2
On 11/18/2021 3:03 PM, Stanley.Yang wrote: support ECC TABLE message, this table include umc ras error count and error address v2: add smu version check to query whether support ecctable call smu_cmn_update_table to get ecctable directly Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 8 +++ drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 70 +++ .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + 4 files changed, 94 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 3557f4e7fc30..7a06021a58f0 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -324,6 +324,7 @@ enum smu_table_id SMU_TABLE_OVERDRIVE, SMU_TABLE_I2C_COMMANDS, SMU_TABLE_PACE, + SMU_TABLE_ECCINFO, SMU_TABLE_COUNT, }; @@ -340,6 +341,7 @@ struct smu_table_context void*max_sustainable_clocks; struct smu_bios_boot_up_values boot_values; void*driver_pptable; + void*ecc_table; struct smu_tabletables[SMU_TABLE_COUNT]; /* * The driver table is just a staging buffer for @@ -1261,6 +1263,11 @@ struct pptable_funcs { * of SMUBUS table. */ int (*send_hbm_bad_pages_num)(struct smu_context *smu, uint32_t size); + + /** +* @get_ecc_table: message SMU to get ECC INFO table. 
+*/ + ssize_t (*get_ecc_info)(struct smu_context *smu, void *table); }; typedef enum { @@ -1397,6 +1404,7 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable); int smu_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event, uint64_t event_arg); +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); #endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index 01168b8955bf..fd3b6b460b12 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -3072,6 +3072,20 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable) return ret; } +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc) +{ + int ret = -EOPNOTSUPP; + + mutex_lock(&smu->mutex); + if (smu->ppt_funcs && + smu->ppt_funcs->get_ecc_info) + ret = smu->ppt_funcs->get_ecc_info(smu, umc_ecc); + mutex_unlock(&smu->mutex); + + return ret; + +} + static int smu_get_prv_buffer_details(void *handle, void **addr, size_t *size) { struct smu_context *smu = handle; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c index f835d86cc2f5..4c21609ccea5 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c @@ -78,6 +78,12 @@ #define smnPCIE_ESM_CTRL 0x111003D0 +/* + * SMU support ECCTABLE since version 68.42.0, + * use this to check ECCTALE feature whether support + */ +#define SUPPORT_ECCTABLE_SMU_VERSION 0x00442a00 + static const struct smu_temperature_range smu13_thermal_policy[] = { {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000}, @@ -190,6 +196,7 @@ static const struct cmn2asic_mapping aldebaran_table_map[SMU_TABLE_COUNT] = { TAB_MAP(SMU_METRICS), TAB_MAP(DRIVER_SMU_CONFIG), TAB_MAP(I2C_COMMANDS), + TAB_MAP(ECCINFO), }; static const uint8_t aldebaran_throttler_map[] = { @@ -223,6 +230,9 @@ static int 
aldebaran_tables_init(struct smu_context *smu) SMU_TABLE_INIT(tables, SMU_TABLE_I2C_COMMANDS, sizeof(SwI2cRequest_t), PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); + SMU_TABLE_INIT(tables, SMU_TABLE_ECCINFO, sizeof(EccInfoTable_t), + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); + smu_table->metrics_table = kzalloc(sizeof(SmuMetrics_t), GFP_KERNEL); if (!smu_table->metrics_table) return -ENOMEM; @@ -235,6 +245,10 @@ static int aldebaran_tables_init(struct smu_context *smu) return -ENOMEM; } + smu_table->ecc_table = kzalloc(tables[SMU_TABLE_ECCINFO].size, GFP_KERNEL); + if (!smu_table->ecc_table) + return -ENOMEM; + return 0; } @@ -1765,6 +1779,61 @@ static ssize_t aldebaran_get_gpu_metrics(struct smu_context *smu, return sizeof(struct gpu_metrics_v1_3); } +static int aldebaran_check_ecc_table_support(struct smu_context *smu) +{ + uint32_t if_version = 0xff, smu_version = 0xff; + int ret = 0; + + ret = smu_cmn_get_smc_version(smu, &i
RE: [PATCH] drm/amdgpu: update the domain flags for dumb buffer creation
[Public] Reviewed-by: Guchun Chen Regards, Guchun -Original Message- From: amd-gfx On Behalf Of Evan Quan Sent: Thursday, November 18, 2021 4:27 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander ; Quan, Evan ; Koenig, Christian Subject: [PATCH] drm/amdgpu: update the domain flags for dumb buffer creation After switching to generic framebuffer framework, we rely on the ->dumb_create routine for frame buffer creation. However, the different domain flags used are not optimal. Add the contiguous flag to directly allocate the scanout BO as one linear buffer. Fixes: 844612e1149d ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Signed-off-by: Evan Quan Reviewed-by: Christian König Change-Id: I403bf7a0b265c564b5f3a3343999670e5eb87ca6 --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index d07b6aebc449..189e32ee7a6e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -883,7 +883,8 @@ int amdgpu_mode_dumb_create(struct drm_file *file_priv, struct drm_gem_object *gobj; uint32_t handle; u64 flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED | - AMDGPU_GEM_CREATE_CPU_GTT_USWC; + AMDGPU_GEM_CREATE_CPU_GTT_USWC | + AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS; u32 domain; int r; -- 2.29.0
Re: [PATCH Review 1/4] drm/amdgpu: Update smu driver interface for aldebaran
[AMD Official Use Only] Thanks Evan, will update the patch 1 and 3 titles before submitting. Regards, Stanley > -Original Message- > From: Quan, Evan > Sent: Thursday, November 18, 2021 5:58 PM > To: Yang, Stanley ; amd- > g...@lists.freedesktop.org; Zhang, Hawking ; > Clements, John ; Lazar, Lijo > ; Wang, Yang(Kevin) > Cc: Yang, Stanley > Subject: RE: [PATCH Review 1/4] drm/amdgpu: Update smu driver interface for > aldebaran > > [AMD Official Use Only] > > Better to update the patch title as "drm/amd/pm: Update smu driver > interface for aldebaran" as all other power related patches. > And please update patch3 also. > Other than above, patch 1, 3 are reviewed-by: Evan Quan > > > -Original Message- > > From: Stanley.Yang > > Sent: Thursday, November 18, 2021 5:34 PM > > To: amd-gfx@lists.freedesktop.org; Zhang, Hawking > > ; Clements, John > ; Quan, > > Evan ; Lazar, Lijo ; Wang, > > Yang(Kevin) > > Cc: Yang, Stanley > > Subject: [PATCH Review 1/4] drm/amdgpu: Update smu driver interface > > for aldebaran > > > > update smu driver if version to 0x08 to avoid mismatch log A version > > mismatch can still happen with an older FW > > > > Change-Id: I97f2bc4ed9a9cba313b744e2ff6812c90b244935 > > Signed-off-by: Stanley.Yang > > --- > > .../drm/amd/pm/inc/smu13_driver_if_aldebaran.h | 18 > > +- > > drivers/gpu/drm/amd/pm/inc/smu_v13_0.h | 2 +- > > 2 files changed, 18 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > > b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > > index a017983ff1fa..0f67c56c2863 100644 > > --- a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > > +++ b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > > @@ -140,6 +140,8 @@ > > > > #define MAX_SW_I2C_COMMANDS24 > > > > +#define ALDEBARAN_UMC_CHANNEL_NUM32 > > + > > typedef enum { > >I2C_CONTROLLER_PORT_0, //CKSVII2C0 > >I2C_CONTROLLER_PORT_1, //CKSVII2C1 > > @@ -507,6 +509,19 @@ typedef struct { > >uint32_t MmHubPadding[8]; // SMU internal use } 
AvfsDebugTable_t; > > > > +typedef struct { > > + uint64_t mca_umc_status; > > + uint64_t mca_umc_addr; > > + uint16_t ce_count_lo_chip; > > + uint16_t ce_count_hi_chip; > > + > > + uint32_t eccPadding; > > +} EccInfo_t; > > + > > +typedef struct { > > + EccInfo_t EccInfo[ALDEBARAN_UMC_CHANNEL_NUM]; > > +} EccInfoTable_t; > > + > > // These defines are used with the following messages: > > // SMC_MSG_TransferTableDram2Smu > > // SMC_MSG_TransferTableSmu2Dram > > @@ -517,6 +532,7 @@ typedef struct { > > #define TABLE_SMU_METRICS 4 > > #define TABLE_DRIVER_SMU_CONFIG 5 > > #define TABLE_I2C_COMMANDS6 > > -#define TABLE_COUNT 7 > > +#define TABLE_ECCINFO 7 > > +#define TABLE_COUNT 8 > > > > #endif > > diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > > b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > > index bbc608c990b0..44af23ae059e 100644 > > --- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > > +++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > > @@ -27,7 +27,7 @@ > > > > #define SMU13_DRIVER_IF_VERSION_INV 0x #define > > SMU13_DRIVER_IF_VERSION_YELLOW_CARP 0x04 -#define > > SMU13_DRIVER_IF_VERSION_ALDE 0x07 > > +#define SMU13_DRIVER_IF_VERSION_ALDE 0x08 > > > > #define SMU13_MODE1_RESET_WAIT_TIME_IN_MS 500 //500ms > > > > -- > > 2.17.1
RE: [PATCH Review 1/4] drm/amdgpu: Update smu driver interface for aldebaran
[AMD Official Use Only] Better to update the patch title as "drm/amd/pm: Update smu driver interface for aldebaran" as all other power related patches. And please update patch3 also. Other than above, patch 1, 3 are reviewed-by: Evan Quan > -Original Message- > From: Stanley.Yang > Sent: Thursday, November 18, 2021 5:34 PM > To: amd-gfx@lists.freedesktop.org; Zhang, Hawking > ; Clements, John ; > Quan, Evan ; Lazar, Lijo ; > Wang, Yang(Kevin) > Cc: Yang, Stanley > Subject: [PATCH Review 1/4] drm/amdgpu: Update smu driver interface for > aldebaran > > update smu driver if version to 0x08 to avoid mismatch log > A version mismatch can still happen with an older FW > > Change-Id: I97f2bc4ed9a9cba313b744e2ff6812c90b244935 > Signed-off-by: Stanley.Yang > --- > .../drm/amd/pm/inc/smu13_driver_if_aldebaran.h | 18 > +- > drivers/gpu/drm/amd/pm/inc/smu_v13_0.h | 2 +- > 2 files changed, 18 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > index a017983ff1fa..0f67c56c2863 100644 > --- a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > +++ b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h > @@ -140,6 +140,8 @@ > > #define MAX_SW_I2C_COMMANDS24 > > +#define ALDEBARAN_UMC_CHANNEL_NUM32 > + > typedef enum { >I2C_CONTROLLER_PORT_0, //CKSVII2C0 >I2C_CONTROLLER_PORT_1, //CKSVII2C1 > @@ -507,6 +509,19 @@ typedef struct { >uint32_t MmHubPadding[8]; // SMU internal use > } AvfsDebugTable_t; > > +typedef struct { > + uint64_t mca_umc_status; > + uint64_t mca_umc_addr; > + uint16_t ce_count_lo_chip; > + uint16_t ce_count_hi_chip; > + > + uint32_t eccPadding; > +} EccInfo_t; > + > +typedef struct { > + EccInfo_t EccInfo[ALDEBARAN_UMC_CHANNEL_NUM]; > +} EccInfoTable_t; > + > // These defines are used with the following messages: > // SMC_MSG_TransferTableDram2Smu > // SMC_MSG_TransferTableSmu2Dram > @@ -517,6 +532,7 @@ typedef struct { > #define TABLE_SMU_METRICS 
4 > #define TABLE_DRIVER_SMU_CONFIG 5 > #define TABLE_I2C_COMMANDS6 > -#define TABLE_COUNT 7 > +#define TABLE_ECCINFO 7 > +#define TABLE_COUNT 8 > > #endif > diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > index bbc608c990b0..44af23ae059e 100644 > --- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > +++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h > @@ -27,7 +27,7 @@ > > #define SMU13_DRIVER_IF_VERSION_INV 0x > #define SMU13_DRIVER_IF_VERSION_YELLOW_CARP 0x04 > -#define SMU13_DRIVER_IF_VERSION_ALDE 0x07 > +#define SMU13_DRIVER_IF_VERSION_ALDE 0x08 > > #define SMU13_MODE1_RESET_WAIT_TIME_IN_MS 500 //500ms > > -- > 2.17.1
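A note on the `EccInfo_t` layout added by this patch: the explicit `eccPadding` member makes each per-channel record exactly 24 bytes, a multiple of the 8-byte alignment of its `uint64_t` fields, so the 32-entry `EccInfoTable_t` shared with firmware carries no hidden inter-element padding. The structs below are reproduced from the patch; the size checks assume a typical ABI with naturally aligned `uint64_t`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Structs copied from smu13_driver_if_aldebaran.h in the patch.
 * Layout: two uint64_t (offsets 0, 8), two uint16_t (16, 18), and
 * eccPadding at offset 20, for a 24-byte record with no compiler-
 * inserted padding between array elements. */
#define ALDEBARAN_UMC_CHANNEL_NUM 32

typedef struct {
	uint64_t mca_umc_status;
	uint64_t mca_umc_addr;
	uint16_t ce_count_lo_chip;
	uint16_t ce_count_hi_chip;

	uint32_t eccPadding;
} EccInfo_t;

typedef struct {
	EccInfo_t EccInfo[ALDEBARAN_UMC_CHANNEL_NUM];
} EccInfoTable_t;
```

Keeping the record an explicit multiple of 8 bytes makes the driver/firmware table layout unambiguous across compilers, which matters because `SMU_TABLE_INIT` sizes the shared VRAM buffer with `sizeof(EccInfoTable_t)`.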
[PATCH Review 4/4] query umc error info from ecc_table v2
if smu support ECCTABLE, driver can message smu to get ecc_table then query umc error info from ECCTABLE v2: optimize source code makes logical more reasonable Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 42 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 72 + 2 files changed, 83 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 90f0db3b4f65..3a4e483cd5e2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -888,6 +888,38 @@ void amdgpu_ras_mca_query_error_status(struct amdgpu_device *adev, } } +static void amdgpu_ras_get_ecc_info(struct amdgpu_device *adev, struct ras_err_data *err_data) +{ + struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + int ret = 0; + + /* +* choosing right query method according to +* whether smu support query error information +*/ + ret = smu_get_ecc_info(&adev->smu, (void *)&(ras->umc_ecc)); + if (ret == -EOPNOTSUPP) { + if (adev->umc.ras_funcs && + adev->umc.ras_funcs->query_ras_error_count) + adev->umc.ras_funcs->query_ras_error_count(adev, err_data); + + /* umc query_ras_error_address is also responsible for clearing +* error status +*/ + if (adev->umc.ras_funcs && + adev->umc.ras_funcs->query_ras_error_address) + adev->umc.ras_funcs->query_ras_error_address(adev, err_data); + } else if (!ret) { + if (adev->umc.ras_funcs && + adev->umc.ras_funcs->ecc_info_query_ras_error_count) + adev->umc.ras_funcs->ecc_info_query_ras_error_count(adev, err_data); + + if (adev->umc.ras_funcs && + adev->umc.ras_funcs->ecc_info_query_ras_error_address) + adev->umc.ras_funcs->ecc_info_query_ras_error_address(adev, err_data); + } +} + /* query/inject/cure begin */ int amdgpu_ras_query_error_status(struct amdgpu_device *adev, struct ras_query_if *info) @@ -901,15 +933,7 @@ int amdgpu_ras_query_error_status(struct amdgpu_device *adev, switch (info->head.block) { case AMDGPU_RAS_BLOCK__UMC: - if 
(adev->umc.ras_funcs && - adev->umc.ras_funcs->query_ras_error_count) - adev->umc.ras_funcs->query_ras_error_count(adev, &err_data); - /* umc query_ras_error_address is also responsible for clearing -* error status -*/ - if (adev->umc.ras_funcs && - adev->umc.ras_funcs->query_ras_error_address) - adev->umc.ras_funcs->query_ras_error_address(adev, &err_data); + amdgpu_ras_get_ecc_info(adev, &err_data); break; case AMDGPU_RAS_BLOCK__SDMA: if (adev->sdma.funcs->query_ras_error_count) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c index 0c7c56a91b25..2b37b1c293b2 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c @@ -95,30 +95,58 @@ int amdgpu_umc_process_ras_data_cb(struct amdgpu_device *adev, { struct ras_err_data *err_data = (struct ras_err_data *)ras_error_status; struct amdgpu_ras *con = amdgpu_ras_get_context(adev); + int ret = 0; kgd2kfd_set_sram_ecc_flag(adev->kfd.dev); - if (adev->umc.ras_funcs && - adev->umc.ras_funcs->query_ras_error_count) - adev->umc.ras_funcs->query_ras_error_count(adev, ras_error_status); - - if (adev->umc.ras_funcs && - adev->umc.ras_funcs->query_ras_error_address && - adev->umc.max_ras_err_cnt_per_query) { - err_data->err_addr = - kcalloc(adev->umc.max_ras_err_cnt_per_query, - sizeof(struct eeprom_table_record), GFP_KERNEL); - - /* still call query_ras_error_address to clear error status -* even NOMEM error is encountered -*/ - if(!err_data->err_addr) - dev_warn(adev->dev, "Failed to alloc memory for " - "umc error address record!\n"); - - /* umc query_ras_error_address is also responsible for clearing -* error status -*/ - adev->umc.ras_funcs->query_ras_error_address(adev, ras_error_status); + ret = smu_get_ecc_info(&adev->smu, (void *)&(con->umc_ecc)); + if (ret == -EOPNOTSUPP) { + if (adev->umc.ras_funcs && + adev->umc.ras_funcs->query_ras_error_count) + adev->umc.ras_funcs-
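The dispatch this patch introduces can be reduced to a small sketch: try the SMU ECC-table query first, and fall back to the legacy register-based UMC query only when the SMU reports `-EOPNOTSUPP`. Everything below is a mock (the capability flag and path counters are invented for illustration); only the error-code contract mirrors the patch:

```c
#include <assert.h>
#include <errno.h>

/* Mock of the amdgpu_ras_get_ecc_info() dispatch from patch 4/4. */
static int smu_supported;          /* stand-in SMU capability flag */
static int used_smu, used_legacy;  /* record which query path ran */

/* Stand-in for smu_get_ecc_info(): -EOPNOTSUPP when the SMU firmware
 * has no ECCTABLE support, 0 on success. */
static int mock_smu_get_ecc_info(void)
{
	return smu_supported ? 0 : -EOPNOTSUPP;
}

static void mock_ras_get_ecc_info(void)
{
	int ret = mock_smu_get_ecc_info();

	if (ret == -EOPNOTSUPP)
		used_legacy++;  /* query_ras_error_count/_address path */
	else if (!ret)
		used_smu++;     /* ecc_info_query_ras_error_* path */
	/* any other error: report nothing, as in the patch */
}
```

Using `-EOPNOTSUPP` as the fallback trigger keeps the caller agnostic: the same resume-time RAS code works on firmware with or without ECCTABLE support, and real query failures (any other negative value) are not silently converted into legacy queries.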
[PATCH Review 3/4] drm/amdgpu: add message smu to get ecc_table v2
support ECC TABLE message, this table include umc ras error count and error address v2: add smu version check to query whether support ecctable call smu_cmn_update_table to get ecctable directly Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h | 8 +++ drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 14 .../drm/amd/pm/swsmu/smu13/aldebaran_ppt.c| 70 +++ .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c| 2 + 4 files changed, 94 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h index 3557f4e7fc30..7a06021a58f0 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_smu.h @@ -324,6 +324,7 @@ enum smu_table_id SMU_TABLE_OVERDRIVE, SMU_TABLE_I2C_COMMANDS, SMU_TABLE_PACE, + SMU_TABLE_ECCINFO, SMU_TABLE_COUNT, }; @@ -340,6 +341,7 @@ struct smu_table_context void*max_sustainable_clocks; struct smu_bios_boot_up_values boot_values; void*driver_pptable; + void*ecc_table; struct smu_tabletables[SMU_TABLE_COUNT]; /* * The driver table is just a staging buffer for @@ -1261,6 +1263,11 @@ struct pptable_funcs { * of SMUBUS table. */ int (*send_hbm_bad_pages_num)(struct smu_context *smu, uint32_t size); + + /** +* @get_ecc_table: message SMU to get ECC INFO table. 
+*/ + ssize_t (*get_ecc_info)(struct smu_context *smu, void *table); }; typedef enum { @@ -1397,6 +1404,7 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable); int smu_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event, uint64_t event_arg); +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc); #endif #endif diff --git a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c index 01168b8955bf..fd3b6b460b12 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -3072,6 +3072,20 @@ int smu_set_light_sbr(struct smu_context *smu, bool enable) return ret; } +int smu_get_ecc_info(struct smu_context *smu, void *umc_ecc) +{ + int ret = -EOPNOTSUPP; + + mutex_lock(&smu->mutex); + if (smu->ppt_funcs && + smu->ppt_funcs->get_ecc_info) + ret = smu->ppt_funcs->get_ecc_info(smu, umc_ecc); + mutex_unlock(&smu->mutex); + + return ret; + +} + static int smu_get_prv_buffer_details(void *handle, void **addr, size_t *size) { struct smu_context *smu = handle; diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c index f835d86cc2f5..4c21609ccea5 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c @@ -78,6 +78,12 @@ #define smnPCIE_ESM_CTRL 0x111003D0 +/* + * SMU support ECCTABLE since version 68.42.0, + * use this to check ECCTALE feature whether support + */ +#define SUPPORT_ECCTABLE_SMU_VERSION 0x00442a00 + static const struct smu_temperature_range smu13_thermal_policy[] = { {-273150, 99000, 99000, -273150, 99000, 99000, -273150, 99000, 99000}, @@ -190,6 +196,7 @@ static const struct cmn2asic_mapping aldebaran_table_map[SMU_TABLE_COUNT] = { TAB_MAP(SMU_METRICS), TAB_MAP(DRIVER_SMU_CONFIG), TAB_MAP(I2C_COMMANDS), + TAB_MAP(ECCINFO), }; static const uint8_t aldebaran_throttler_map[] = { @@ -223,6 +230,9 @@ static int 
aldebaran_tables_init(struct smu_context *smu) SMU_TABLE_INIT(tables, SMU_TABLE_I2C_COMMANDS, sizeof(SwI2cRequest_t), PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); + SMU_TABLE_INIT(tables, SMU_TABLE_ECCINFO, sizeof(EccInfoTable_t), + PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM); + smu_table->metrics_table = kzalloc(sizeof(SmuMetrics_t), GFP_KERNEL); if (!smu_table->metrics_table) return -ENOMEM; @@ -235,6 +245,10 @@ static int aldebaran_tables_init(struct smu_context *smu) return -ENOMEM; } + smu_table->ecc_table = kzalloc(tables[SMU_TABLE_ECCINFO].size, GFP_KERNEL); + if (!smu_table->ecc_table) + return -ENOMEM; + return 0; } @@ -1765,6 +1779,61 @@ static ssize_t aldebaran_get_gpu_metrics(struct smu_context *smu, return sizeof(struct gpu_metrics_v1_3); } +static int aldebaran_check_ecc_table_support(struct smu_context *smu) +{ + uint32_t if_version = 0xff, smu_version = 0xff; + int ret = 0; + + ret = smu_cmn_get_smc_version(smu, &if_version, &smu_version); + if (ret) + r
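The `SUPPORT_ECCTABLE_SMU_VERSION` gate (0x00442a00) appears to pack the "version 68.42.0" from the patch comment as `(major << 16) | (minor << 8) | patch`: 0x44 = 68 and 0x2a = 42. A tiny sketch with a hypothetical packing helper (`pack_smu_version` is not a real driver function) shows how the `smu_version >= SUPPORT_ECCTABLE_SMU_VERSION` comparison in `aldebaran_check_ecc_table_support()` would behave:

```c
#include <assert.h>
#include <stdint.h>

/* Threshold copied from the patch: SMU firmware 68.42.0. */
#define SUPPORT_ECCTABLE_SMU_VERSION 0x00442a00u

/* Hypothetical helper mirroring the usual swsmu firmware-version
 * packing: one byte each for major, minor, and patch level. */
static uint32_t pack_smu_version(uint32_t major, uint32_t minor,
				 uint32_t patch)
{
	return (major << 16) | (minor << 8) | patch;
}
```

Because the fields are packed most-significant first, a single unsigned compare orders versions correctly: any 68.41.x firmware falls below the gate, while 68.42.0 and later pass it.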
[PATCH Review 2/4] drm/amdgpu: add new query interface for umc block v2
Add a message to the SMU to query error information. v2: rename message_smu to ecc_info Signed-off-by: Stanley.Yang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 16 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 4 + drivers/gpu/drm/amd/amdgpu/umc_v6_7.c | 161 3 files changed, 181 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h index cdd0010a5389..bcbf3264d92f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h @@ -320,6 +320,19 @@ struct ras_common_if { char name[32]; }; +#define MAX_UMC_CHANNEL_NUM 32 + +struct ecc_info_per_ch { + uint16_t ce_count_lo_chip; + uint16_t ce_count_hi_chip; + uint64_t mca_umc_status; + uint64_t mca_umc_addr; +}; + +struct umc_ecc_info { + struct ecc_info_per_ch ecc[MAX_UMC_CHANNEL_NUM]; +}; + struct amdgpu_ras { /* ras infrastructure */ /* for ras itself. */ @@ -359,6 +372,9 @@ struct amdgpu_ras { struct delayed_work ras_counte_delay_work; atomic_t ras_ue_count; atomic_t ras_ce_count; + + /* record umc error info queried from smu */ + struct umc_ecc_info umc_ecc; }; struct ras_fs_data { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h index 1f5fe2315236..9e40bade0a68 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h @@ -49,6 +49,10 @@ struct amdgpu_umc_ras_funcs { void (*query_ras_error_address)(struct amdgpu_device *adev, void *ras_error_status); bool (*query_ras_poison_mode)(struct amdgpu_device *adev); + void (*ecc_info_query_ras_error_count)(struct amdgpu_device *adev, + void *ras_error_status); + void (*ecc_info_query_ras_error_address)(struct amdgpu_device *adev, + void *ras_error_status); }; struct amdgpu_umc_funcs { diff --git a/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c b/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c index f7ec3fe134e5..6dd1e19e8d43 100644 --- a/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c +++ b/drivers/gpu/drm/amd/amdgpu/umc_v6_7.c @@ -50,6
+50,165 @@ static inline uint32_t get_umc_v6_7_reg_offset(struct amdgpu_device *adev, return adev->umc.channel_offs * ch_inst + UMC_V6_7_INST_DIST * umc_inst; } +static inline uint32_t get_umc_v6_7_channel_index(struct amdgpu_device *adev, + uint32_t umc_inst, + uint32_t ch_inst) +{ + return adev->umc.channel_idx_tbl[umc_inst * adev->umc.channel_inst_num + ch_inst]; +} + +static void umc_v6_7_ecc_info_query_correctable_error_count(struct amdgpu_device *adev, + uint32_t channel_index, + unsigned long *error_count) +{ + uint32_t ecc_err_cnt; + uint64_t mc_umc_status; + struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + + /* +* select the lower chip and check the error count +* skip add error count, calc error counter only from mca_umc_status +*/ + ecc_err_cnt = ras->umc_ecc.ecc[channel_index].ce_count_lo_chip; + + /* +* select the higher chip and check the err counter +* skip add error count, calc error counter only from mca_umc_status +*/ + ecc_err_cnt = ras->umc_ecc.ecc[channel_index].ce_count_hi_chip; + + /* check for SRAM correctable error + MCUMC_STATUS is a 64 bit register */ + mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status; + if (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1 && + REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, CECC) == 1) + *error_count += 1; +} + +static void umc_v6_7_ecc_info_querry_uncorrectable_error_count(struct amdgpu_device *adev, + uint32_t channel_index, + unsigned long *error_count) +{ + uint64_t mc_umc_status; + struct amdgpu_ras *ras = amdgpu_ras_get_context(adev); + + /* check the MCUMC_STATUS */ + mc_umc_status = ras->umc_ecc.ecc[channel_index].mca_umc_status; + if ((REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Val) == 1) && + (REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, Deferred) == 1 || + REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UECC) == 1 || + REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, PCC) == 1 || + 
REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, UC) == 1 || + REG_GET_FIELD(mc_umc_status, MCA_UMC_UMC0_MCUMC_STATUST0, TCC) == 1)) + *error_count += 1; +} + +sta
[PATCH Review 1/4] drm/amdgpu: Update smu driver interface for aldebaran
Update the SMU driver if_version to 0x08 to avoid a mismatch log. A version mismatch can still happen with an older FW Change-Id: I97f2bc4ed9a9cba313b744e2ff6812c90b244935 Signed-off-by: Stanley.Yang --- .../drm/amd/pm/inc/smu13_driver_if_aldebaran.h | 18 +- drivers/gpu/drm/amd/pm/inc/smu_v13_0.h | 2 +- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h index a017983ff1fa..0f67c56c2863 100644 --- a/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h +++ b/drivers/gpu/drm/amd/pm/inc/smu13_driver_if_aldebaran.h @@ -140,6 +140,8 @@ #define MAX_SW_I2C_COMMANDS 24 +#define ALDEBARAN_UMC_CHANNEL_NUM 32 + typedef enum { I2C_CONTROLLER_PORT_0, //CKSVII2C0 I2C_CONTROLLER_PORT_1, //CKSVII2C1 @@ -507,6 +509,19 @@ typedef struct { uint32_t MmHubPadding[8]; // SMU internal use } AvfsDebugTable_t; +typedef struct { + uint64_t mca_umc_status; + uint64_t mca_umc_addr; + uint16_t ce_count_lo_chip; + uint16_t ce_count_hi_chip; + + uint32_t eccPadding; +} EccInfo_t; + +typedef struct { + EccInfo_t EccInfo[ALDEBARAN_UMC_CHANNEL_NUM]; +} EccInfoTable_t; + // These defines are used with the following messages: // SMC_MSG_TransferTableDram2Smu // SMC_MSG_TransferTableSmu2Dram @@ -517,6 +532,7 @@ typedef struct { #define TABLE_SMU_METRICS 4 #define TABLE_DRIVER_SMU_CONFIG 5 #define TABLE_I2C_COMMANDS 6 -#define TABLE_COUNT 7 +#define TABLE_ECCINFO 7 +#define TABLE_COUNT 8 #endif diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h index bbc608c990b0..44af23ae059e 100644 --- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h +++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0.h @@ -27,7 +27,7 @@ #define SMU13_DRIVER_IF_VERSION_INV 0x #define SMU13_DRIVER_IF_VERSION_YELLOW_CARP 0x04 -#define SMU13_DRIVER_IF_VERSION_ALDE 0x07 +#define SMU13_DRIVER_IF_VERSION_ALDE 0x08 #define SMU13_MODE1_RESET_WAIT_TIME_IN_MS 500 //500ms -- 2.17.1
Re: [PATCH v1 1/9] mm: add zone device coherent type memory support
On Tuesday, 16 November 2021 6:30:18 AM AEDT Alex Sierra wrote: > Device memory that is cache coherent from device and CPU point of view. > This is used on platforms that have an advanced system bus (like CAPI > or CXL). Any page of a process can be migrated to such memory. However, > no one should be allowed to pin such memory so that it can always be > evicted. > > Signed-off-by: Alex Sierra > --- > include/linux/memremap.h | 8 > include/linux/mm.h | 9 + > mm/memcontrol.c | 6 +++--- > mm/memory-failure.c | 6 +- > mm/memremap.c| 5 - > mm/migrate.c | 21 + > 6 files changed, 42 insertions(+), 13 deletions(-) > > diff --git a/include/linux/memremap.h b/include/linux/memremap.h > index c0e9d35889e8..ff4d398edf35 100644 > --- a/include/linux/memremap.h > +++ b/include/linux/memremap.h > @@ -39,6 +39,13 @@ struct vmem_altmap { > * A more complete discussion of unaddressable memory may be found in > * include/linux/hmm.h and Documentation/vm/hmm.rst. > * > + * MEMORY_DEVICE_COHERENT: > + * Device memory that is cache coherent from device and CPU point of view. > This > + * is used on platforms that have an advanced system bus (like CAPI or CXL). > A > + * driver can hotplug the device memory using ZONE_DEVICE and with that > memory > + * type. Any page of a process can be migrated to such memory. However no one > + * should be allowed to pin such memory so that it can always be evicted. > + * > * MEMORY_DEVICE_FS_DAX: > * Host memory that has similar access semantics as System RAM i.e. DMA > * coherent and supports page pinning. 
In support of coordinating page > @@ -59,6 +66,7 @@ struct vmem_altmap { > enum memory_type { > /* 0 is reserved to catch uninitialized type fields */ > MEMORY_DEVICE_PRIVATE = 1, > + MEMORY_DEVICE_COHERENT, > MEMORY_DEVICE_FS_DAX, > MEMORY_DEVICE_GENERIC, > MEMORY_DEVICE_PCI_P2PDMA, > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 73a52aba448f..d23b6ebd2177 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -1162,6 +1162,7 @@ static inline bool page_is_devmap_managed(struct page > *page) > return false; > switch (page->pgmap->type) { > case MEMORY_DEVICE_PRIVATE: > + case MEMORY_DEVICE_COHERENT: > case MEMORY_DEVICE_FS_DAX: > return true; > default: > @@ -1191,6 +1192,14 @@ static inline bool is_device_private_page(const struct > page *page) > page->pgmap->type == MEMORY_DEVICE_PRIVATE; > } > > +static inline bool is_device_page(const struct page *page) > +{ > + return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && > + is_zone_device_page(page) && > + (page->pgmap->type == MEMORY_DEVICE_PRIVATE || > + page->pgmap->type == MEMORY_DEVICE_COHERENT); > +} > + > static inline bool is_pci_p2pdma_page(const struct page *page) > { > return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 6da5020a8656..e0275ebe1198 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -5695,8 +5695,8 @@ static int mem_cgroup_move_account(struct page *page, > * 2(MC_TARGET_SWAP): if the swap entry corresponding to this pte is a > * target for charge migration. if @target is not NULL, the entry is > stored > * in target->ent. > - * 3(MC_TARGET_DEVICE): like MC_TARGET_PAGE but page is > MEMORY_DEVICE_PRIVATE > - * (so ZONE_DEVICE page and thus not on the lru). > + * 3(MC_TARGET_DEVICE): like MC_TARGET_PAGE but page is > MEMORY_DEVICE_COHERENT > + * or MEMORY_DEVICE_PRIVATE (so ZONE_DEVICE page and thus not on the > lru). 
> * For now we such page is charge like a regular page would be as for all > * intent and purposes it is just special memory taking the place of a > * regular page. > @@ -5730,7 +5730,7 @@ static enum mc_target_type get_mctgt_type(struct > vm_area_struct *vma, >*/ > if (page_memcg(page) == mc.from) { > ret = MC_TARGET_PAGE; > - if (is_device_private_page(page)) > + if (is_device_page(page)) > ret = MC_TARGET_DEVICE; > if (target) > target->page = page; > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 3e6449f2102a..51b55fc5172c 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1554,12 +1554,16 @@ static int memory_failure_dev_pagemap(unsigned long > pfn, int flags, > goto unlock; > } > > - if (pgmap->type == MEMORY_DEVICE_PRIVATE) { > + switch (pgmap->type) { > + case MEMORY_DEVICE_PRIVATE: > + case MEMORY_DEVICE_COHERENT: > /* >* TODO: Handle HMM pages which may need coordination >* with device-side memory. >
Re: [PATCH v1 6/9] lib: test_hmm add module param for zone device type
On Tuesday, 16 November 2021 6:30:23 AM AEDT Alex Sierra wrote: > In order to configure device coherent in test_hmm, two module parameters > should be passed, which correspond to the SP start address of each > device (2) spm_addr_dev0 & spm_addr_dev1. If no parameters are passed, > private device type is configured. Thanks for taking the time to add proper tests for this, as previously mentioned I don't like the need for module parameters but understand why these are more difficult to avoid. However as also mentioned previously the restriction of being able to test only private *or* coherent device pages is unnecessary and makes testing both types harder, especially if we need to test migration between device private and coherent pages. > - res = request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE, > - "hmm_dmirror"); > - if (IS_ERR(res)) > - goto err_devmem; > + if (!spm_addr_dev0 && !spm_addr_dev1) { > + res = request_free_mem_region(&iomem_resource, > DEVMEM_CHUNK_SIZE, > + "hmm_dmirror"); > + if (IS_ERR_OR_NULL(res)) > + goto err_devmem; > + devmem->pagemap.range.start = res->start; > + devmem->pagemap.range.end = res->end; > + devmem->pagemap.type = MEMORY_DEVICE_PRIVATE; > + mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE; > + } else if (spm_addr_dev0 && spm_addr_dev1) { > + devmem->pagemap.range.start = MINOR(mdevice->cdevice.dev) ? > + spm_addr_dev0 : > + spm_addr_dev1; It seems like it would be fairly straight forward to address this concern by adding extra minor character devices for the coherent devices. Would it be possible for you to try that? 
> + devmem->pagemap.range.end = devmem->pagemap.range.start + > + DEVMEM_CHUNK_SIZE - 1; > + devmem->pagemap.type = MEMORY_DEVICE_COHERENT; > + mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_COHERENT; > + } else { > + pr_err("Both spm_addr_dev parameters should be set\n"); > + } > > - mdevice->zone_device_type = HMM_DMIRROR_MEMORY_DEVICE_PRIVATE; > - devmem->pagemap.type = MEMORY_DEVICE_PRIVATE; > - devmem->pagemap.range.start = res->start; > - devmem->pagemap.range.end = res->end; > devmem->pagemap.nr_range = 1; > devmem->pagemap.ops = &dmirror_devmem_ops; > devmem->pagemap.owner = mdevice; > @@ -495,10 +517,14 @@ static bool dmirror_allocate_chunk(struct > dmirror_device *mdevice, > mdevice->devmem_capacity = new_capacity; > mdevice->devmem_chunks = new_chunks; > } > - > ptr = memremap_pages(&devmem->pagemap, numa_node_id()); > - if (IS_ERR(ptr)) > + if (IS_ERR_OR_NULL(ptr)) { > + if (ptr) > + ret = PTR_ERR(ptr); > + else > + ret = -EFAULT; > goto err_release; > + } > > devmem->mdevice = mdevice; > pfn_first = devmem->pagemap.range.start >> PAGE_SHIFT; > @@ -531,7 +557,8 @@ static bool dmirror_allocate_chunk(struct dmirror_device > *mdevice, > > err_release: > mutex_unlock(&mdevice->devmem_lock); > - release_mem_region(devmem->pagemap.range.start, > range_len(&devmem->pagemap.range)); > + if (res) > + release_mem_region(devmem->pagemap.range.start, > range_len(&devmem->pagemap.range)); > err_devmem: > kfree(devmem); > > @@ -1219,10 +1246,8 @@ static int dmirror_device_init(struct dmirror_device > *mdevice, int id) > if (ret) > return ret; > > - /* Build a list of free ZONE_DEVICE private struct pages */ > - dmirror_allocate_chunk(mdevice, NULL); > - > - return 0; > + /* Build a list of free ZONE_DEVICE struct pages */ > + return dmirror_allocate_chunk(mdevice, NULL); > } > > static void dmirror_device_remove(struct dmirror_device *mdevice) > @@ -1235,8 +1260,9 @@ static void dmirror_device_remove(struct dmirror_device > *mdevice) > 
mdevice->devmem_chunks[i]; > > memunmap_pages(&devmem->pagemap); > - release_mem_region(devmem->pagemap.range.start, > -range_len(&devmem->pagemap.range)); > + if (devmem->pagemap.type == MEMORY_DEVICE_PRIVATE) > + release_mem_region(devmem->pagemap.range.start, > + > range_len(&devmem->pagemap.range)); > kfree(devmem); > } > kfree(mdevice->devmem_chunks); > diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h > index c42e57a6a71e..77f81e6314eb 100644 > --- a/lib/t
Re: [PATCH v1 2/9] mm: add device coherent vma selection for memory migration
On Tuesday, 16 November 2021 6:30:19 AM AEDT Alex Sierra wrote: > This case is used to migrate pages from device memory, back to system > memory. Device coherent type memory is cache coherent from device and CPU > point of view. > > Signed-off-by: Alex Sierra > --- > v2: > condition added when migrations from device coherent pages. > --- > include/linux/migrate.h | 1 + > mm/migrate.c| 9 +++-- > 2 files changed, 8 insertions(+), 2 deletions(-) > > diff --git a/include/linux/migrate.h b/include/linux/migrate.h > index c8077e936691..e74bb0978f6f 100644 > --- a/include/linux/migrate.h > +++ b/include/linux/migrate.h > @@ -138,6 +138,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn) > enum migrate_vma_direction { > MIGRATE_VMA_SELECT_SYSTEM = 1 << 0, > MIGRATE_VMA_SELECT_DEVICE_PRIVATE = 1 << 1, > + MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2, > }; > > struct migrate_vma { > diff --git a/mm/migrate.c b/mm/migrate.c > index f74422a42192..166bfec7d85e 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -2340,8 +2340,6 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > if (is_writable_device_private_entry(entry)) > mpfn |= MIGRATE_PFN_WRITE; > } else { > - if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) > - goto next; > pfn = pte_pfn(pte); > if (is_zero_pfn(pfn)) { > mpfn = MIGRATE_PFN_MIGRATE; > @@ -2349,6 +2347,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, > goto next; > } > page = vm_normal_page(migrate->vma, addr, pte); > + if (!is_zone_device_page(page) && > + !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) > + goto next; > + if (is_zone_device_page(page) && > + (!(migrate->flags & > MIGRATE_VMA_SELECT_DEVICE_COHERENT) || > + page->pgmap->owner != migrate->pgmap_owner)) > + goto next; Thanks Alex, couple of comments on this: 1. vm_normal_page() can return NULL so you need to add a check for page == NULL otherwise the call to is_zone_device_page(NULL) will crash. 2. The check for a coherent device page is too indirect. 
Being explicit and using is_zone_device_coherent_page() instead would make it more direct and obvious, particularly for developers who may not immediately realise that device-private pages should never have pte_present() entries. > mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; > mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0; > } >
[PATCH] drm/amdgpu: update the domain flags for dumb buffer creation
After switching to generic framebuffer framework, we rely on the ->dumb_create routine for frame buffer creation. However, the different domain flags used are not optimal. Add the contiguous flag to directly allocate the scanout BO as one linear buffer. Fixes: 844612e1149d ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Signed-off-by: Evan Quan Reviewed-by: Christian König Change-Id: I403bf7a0b265c564b5f3a3343999670e5eb87ca6 --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index d07b6aebc449..189e32ee7a6e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -883,7 +883,8 @@ int amdgpu_mode_dumb_create(struct drm_file *file_priv, struct drm_gem_object *gobj; uint32_t handle; u64 flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED | - AMDGPU_GEM_CREATE_CPU_GTT_USWC; + AMDGPU_GEM_CREATE_CPU_GTT_USWC | + AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS; u32 domain; int r; -- 2.29.0
Re: [PATCH] drm/amdgpu: reset asic after system-wide suspend aborted
On 11/18/2021 12:32 PM, Prike Liang wrote: Perform an ASIC reset when an Sx suspend is aborted after amdgpu has already suspended, to keep the GPU in a clean reset state and avoid errors from improperly re-initializing the device. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 19 +++ 3 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index b85b67a..8bd9833 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1053,6 +1053,7 @@ struct amdgpu_device { bool in_s3; bool in_s4; bool in_s0ix; + bool pm_completed; The PM framework maintains separate flags, why not use the same? dev->power.is_suspended = false; dev->power.is_noirq_suspended = false; dev->power.is_late_suspended = false; BTW, Alex posted a patch which does a similar thing, though it tries the reset if suspend fails. https://lore.kernel.org/all/dm6pr12mb26195f8e099407b4b6966febe4...@dm6pr12mb2619.namprd12.prod.outlook.com/ If that reset also failed, then there is no point in another reset, or keep it as part of resume.
Thanks, Lijo atomic_t in_gpu_reset; enum pp_mp1_state mp1_state; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index ec42a6f..a12ed54 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3983,6 +3983,10 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon) if (adev->in_s0ix) amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry); + if (!adev->pm_completed) { + dev_warn(adev->dev, "suspend aborted, will do asic reset\n"); + amdgpu_asic_reset(adev); + } /* post card */ if (amdgpu_device_need_post(adev)) { r = amdgpu_device_asic_init(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index eee3cf8..9f90017 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2168,6 +2168,23 @@ static int amdgpu_pmops_suspend(struct device *dev) return r; } +/* + * Whether the PM suspend completed should really be confirmed by + * checking sysfs (sys/power/suspend_stats/failed_suspend). However, + * this function only checks whether the AMDGPU device itself was + * suspended completely during the system-wide suspend process.
+ */ +static int amdgpu_pmops_noirq_suspend(struct device *dev) +{ + struct drm_device *drm_dev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(drm_dev); + + dev_dbg(dev, "amdgpu suspend completely.\n"); + adev->pm_completed = true; + + return 0; +} + static int amdgpu_pmops_resume(struct device *dev) { struct drm_device *drm_dev = dev_get_drvdata(dev); @@ -2181,6 +2198,7 @@ static int amdgpu_pmops_resume(struct device *dev) r = amdgpu_device_resume(drm_dev, true); if (amdgpu_acpi_is_s0ix_active(adev)) adev->in_s0ix = false; + adev->pm_completed = false; return r; } @@ -2397,6 +2415,7 @@ static const struct dev_pm_ops amdgpu_pm_ops = { .runtime_suspend = amdgpu_pmops_runtime_suspend, .runtime_resume = amdgpu_pmops_runtime_resume, .runtime_idle = amdgpu_pmops_runtime_idle, + .suspend_noirq = amdgpu_pmops_noirq_suspend, }; static int amdgpu_flush(struct file *f, fl_owner_t id)