Re: 6.10/regression/bisected commit c4cb23111103 causes sleeping function called from invalid context at kernel/locking/mutex.c:585

2024-05-21 Thread Vasant Hegde
-{2:2}, at:
> amd_iommu_attach_device+0x1ad/0x1e80
> [4.307243]  #2: 888113518c18 (_data->lock){}-{2:2},
> at: amd_iommu_attach_device+0x213/0x1e80
> [4.307243]  #3: 888108393030 (>lock){}-{2:2}, at:
> amd_iommu_iopf_add_device+0x69/0x140
> [4.307243] stack backtrace:
> [4.307243] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW
>   ---  ---
> 6.10.0-0.rc0.20240520giteb6a9339efeb.9.fc41.x86_64+debug #1
> [4.307243] Hardware name: ASUS System Product Name/ROG STRIX
> B650E-I GAMING WIFI, BIOS 2611 04/07/2024
> [4.307243] Call Trace:
> [4.307243]  
> [4.307243]  dump_stack_lvl+0x84/0xd0
> [4.307243]  __lock_acquire.cold+0x1fe/0x2a0
> [4.307243]  ? __pfx___lock_acquire+0x10/0x10
> [4.307243]  ? ret_from_fork_asm+0x1a/0x30
> [4.307243]  lock_acquire+0x1ae/0x540
> [4.307243]  ? iopf_queue_add_device+0xd2/0x5d0
> [4.307243]  ? __pfx_lock_acquire+0x10/0x10
> [4.307243]  ? __printk_cpu_sync_put+0x35/0x60
> [4.307243]  ? add_taint+0x2a/0x70
> [4.307243]  ? __might_resched.cold+0x203/0x23d
> [4.307243]  ? __pfx___might_resched+0x10/0x10
> [4.307243]  __mutex_lock+0x189/0x13f0
> [4.307243]  ? iopf_queue_add_device+0xd2/0x5d0
> [4.307243]  ? iopf_queue_add_device+0xd2/0x5d0
> [4.307243]  ? __pfx___mutex_lock+0x10/0x10
> [4.307243]  ? find_held_lock+0x34/0x120
> [4.307243]  ? __pfx_lock_acquired+0x10/0x10
> [4.307243]  ? iopf_queue_add_device+0xd2/0x5d0
> [4.307243]  iopf_queue_add_device+0xd2/0x5d0
> [4.307243]  amd_iommu_iopf_add_device+0xcd/0x140
> [4.307243]  amd_iommu_attach_device+0xdc8/0x1e80
> [4.307243]  ? iommu_create_device_direct_mappings+0x571/0x7d0
> [4.307243]  __iommu_attach_device+0x64/0x250
> [4.307243]  __iommu_device_set_domain+0x122/0x1c0
> [4.307243]  __iommu_group_set_domain_internal+0xfa/0x2d0
> [4.307243]  iommu_setup_default_domain+0x918/0xcd0
> [4.307243]  bus_iommu_probe+0x1ad/0x4c0
> [4.307243]  ? __pfx_bus_iommu_probe+0x10/0x10
> [4.307243]  iommu_device_register+0x184/0x230
> [4.307243]  ? amd_iommu_iopf_init+0xfd/0x170
> [4.307243]  iommu_go_to_state+0xf87/0x3890
> [4.307243]  ? __pfx_iommu_go_to_state+0x10/0x10
> [4.307243]  ? lockdep_hardirqs_on+0x7c/0x100
> [4.307243]  ? _raw_spin_unlock_irqrestore+0x4f/0x80
> [4.307243]  ? add_device_randomness+0xb8/0xf0
> [4.307243]  ? __pfx_add_device_randomness+0x10/0x10
> [4.307243]  ? __pfx_pci_iommu_init+0x10/0x10
> [4.307243]  amd_iommu_init+0x21/0x60
> [4.307243]  ? __pfx_pci_iommu_init+0x10/0x10
> [4.307243]  pci_iommu_init+0x38/0x60
> [4.307243]  do_one_initcall+0xd6/0x460
> [4.307243]  ? __pfx_do_one_initcall+0x10/0x10
> [4.307243]  ? kernel_init_freeable+0x4cb/0x750
> [4.307243]  ? kasan_unpoison+0x44/0x70
> [4.307243]  kernel_init_freeable+0x6b4/0x750
> [4.307243]  ? __pfx_kernel_init_freeable+0x10/0x10
> [4.307243]  ? __pfx_kernel_init+0x10/0x10
> [4.307243]  ? __pfx_kernel_init+0x10/0x10
> [4.307243]  kernel_init+0x1c/0x150
> [    4.307243]  ? __pfx_kernel_init+0x10/0x10
> [4.307243]  ret_from_fork+0x31/0x70
> [4.307243]  ? __pfx_kernel_init+0x10/0x10
> [4.307243]  ret_from_fork_asm+0x1a/0x30
> [4.307243]  
> [4.311628] AMD-Vi: Extended features (0x246577efa2254afa, 0x0):
> PPR NX GT [5] IA GA PC GA_vAPIC
> [4.311639] AMD-Vi: Interrupt remapping enabled
> [4.366191] AMD-Vi: Virtual APIC enabled
> 
> Bisect pointed to commit:
> commit c4cb2303a841c2df30058597398443bcad5f (HEAD)
> Author: Vasant Hegde 
> Date:   Thu Apr 18 10:33:57 2024 +
> 
> iommu/amd: Add support for enable/disable IOPF
> 
> Return success from enable_feature(IOPF) path as this interface is going
> away. Instead we will enable/disable IOPF support in attach/detach device
> path.
> 
> In attach device path, if device is capable of PRI, then we will add it to
> per IOMMU IOPF queue and enable PPR support in IOMMU. Also it will
> attach device to domain even if it fails to enable PRI or add device to
> IOPF queue as device can continue to work without PRI support.
> 
> In detach device patch it follows following sequence:
>   - Flush the queue for the given device
>   - Disable PPR support in DTE[devid]
>   - Remove device from IOPF queue
>   - Disable device PRI
> 
> Also add IOMMU_IOPF as dependency to AMD_IOMMU driver.
> 
> Co-developed-by: Suravee Suthikulpanit 
> Signed-off-by: Suravee Suthikulpanit 
> Signed-off-by: Vasant Hegde 
> Reviewed-by: Jason Gunthorpe 
> Link: 
> https://lore.kernel.org/r/2024041810340

Re: [PATCH] iommu/amd: fix the address translation issue when do detach

2023-07-28 Thread Vasant Hegde



On 7/27/2023 3:25 PM, Jesse Zhang wrote:
> From: Jesse Zhang 
> 
> iGpu driver fail to read/write register by iommu when start X.
> kernel: [  433.296634] audit: type=1400 audit(1690403823.130:64): 
> apparmor="DENIED" operation="capable" class="cap"
> profile="/snap/snapd/19457/usr/lib/snapd/snap-confine" pid=12344 
> comm="snap-confine" capability=38  capname="perfmon"
> kernel: [  433.515795] amdgpu :03:00.0: amdgpu: failed to write reg 28b4 
> wait reg 28c6
> kernel: [  440.195492] amdgpu :03:00.0: amdgpu: failed to write reg 28b4 
> wait reg 28c6
> kernel: [  453.679611] amdgpu :03:00.0: amdgpu: failed to write reg 1a6f4 
> wait reg 1a706
> kernel: [  460.383490] amdgpu :03:00.0: amdgpu: failed to write reg 1a6f4 
> wait reg 1a706
> 
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2659
> 
> Disable address translation service, before detach device.
> Do detach will clear the page table point or pasid table entries,
> so all DMA requests from the device should be blocked before that.
> 
> Signed-off-by: Jesse Zhang 

Thanks Jesse for tracking down the issue. Patch looks good to me.

Reviewed-by: Vasant Hegde 

-Vasant



Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

2023-02-16 Thread Vasant Hegde
Hi Felix,


On 2/17/2023 1:29 AM, Felix Kuehling wrote:
>> Feb 16 13:22:32 kernel: kfd kfd: amdgpu: Failed to resume IOMMU for device
>> 1002:9874
>> Feb 16 13:22:32 kernel: kfd kfd: amdgpu: device 1002:9874 NOT added due to 
>> errors 
> This look like IOMMU device initialization still fails (but more gracefully
> now). Vasant, is that expected?

My fix is to gracefully handle failure paths in IOMMU. So above logs are
expected. Basically it means IOMMU couldn't attach devices to new domain
(because it couldn't enable PASID on AMD GPU as ACS RR/UF flags are missing, see
commit 201007ef707 ) and we did fall back to old domain properly.

It also means that GPU will not be able to use PASID/PRI. If you need these
feauteres then you have to look into commit 201007ef707 and see how we can
enable PASID for GPU (without ACS UF/RR flag?).


> 
> This would lead to KFD not being available on Carrizo with this kernel, which 
> is
> probably not a big limitation in practice. It would only affect compute
> applications using the ROCm user mode stack. I don't think anyone does that
> these days on these old APUs.
> 
> The SMU errors seem unrelated to this unless there is some subtle interaction
> I'm missing.

I have no idea about GPU warning. All I can say is IOMMU side looks good but
PASID/PRI is not enabled for GPU.

-Vasant




Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

2023-02-16 Thread Vasant Hegde
Matt,

Thanks a lot for testing and the dmesg log.

On 2/17/2023 12:29 AM, Matt Fagnani wrote:
> Vasant,
> 
> I applied your four patches to 6.2-rc8 and built that. The black screen, null
> pointer dereference, and warnings didn't happen when booting 6.2-rc8 with your
> patches. There were errors that the IOMMU wasn't restarted when amdgpu and
> amdkfd was starting though at kernel: kfd kfd: amdgpu: Failed to resume IOMMU
> for device 1002:9874. I don't know if those IOMMU errors were expected or not,

This patch is not for fixing PASID enablement issue. Its more of gracefully
handling the error path.

This means patch worked in expected way. i. e. It failed to enable PASID because
of original patch (commit 201007ef70), it didn't attach devices to new domain
and attach devices back to default domain.
It returned error to GPU saying we couldn't enable PASID/PRI. Hence we saw above
error message.

-Vasant


Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

2023-02-15 Thread Vasant Hegde
Hi Jason,


On 2/16/2023 6:14 AM, Jason Gunthorpe wrote:
> On Wed, Feb 15, 2023 at 07:35:45PM -0500, Felix Kuehling wrote:
>>
>> If I understand this correctly, the HW or the BIOS is doing something wrong
>> about reporting ACS. I don't know what the GPU driver can do other than add
>> some quirk to stop using AMD IOMMUv2 on this HW/BIOS.
> 
> How about this:
> 
> diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
> index 864e4ffb6aa94e..cc027ce9a6e86f 100644
> --- a/drivers/iommu/amd/iommu_v2.c
> +++ b/drivers/iommu/amd/iommu_v2.c
> @@ -732,6 +732,7 @@ EXPORT_SYMBOL(amd_iommu_unbind_pasid);
>  
>  int amd_iommu_init_device(struct pci_dev *pdev, int pasids)
>  {
> + struct iommu_dev_data *dev_data = dev_iommu_priv_get(>dev);
>   struct device_state *dev_state;
>   struct iommu_group *group;
>   unsigned long flags;
> @@ -740,6 +741,9 @@ int amd_iommu_init_device(struct pci_dev *pdev, int 
> pasids)
>  
>   might_sleep();
>  
> + if (!dev_data->ats.enabled)
> + return -EINVAL;
> +

Thanks for the proposed fix. But aactually this will not solve the issue because
current flow is :
  - in this function it tries to allocate new domain
  - Calls iommu_attach_group() which will call attach_device. In that path
it will try to enable ATS/PASID and hitting error.

As I mentioned in other reply I think even current code returns error from
amd_iommu_init_device() to GPU. But the issue is, in __iommu_attach_group() path
it detached device from current domain, failed to attach to new domain and
returned error. We didn't put the device back to old domain thats causing the
issue. Below series should fix this issue.

https://lore.kernel.org/linux-iommu/20230215052642.6016-1-vasant.he...@amd.com/

-Vasant



Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

2023-02-15 Thread Vasant Hegde
Felix, Jason, Matt,


On 2/16/2023 6:05 AM, Felix Kuehling wrote:
> [+Shimmer, Aaron]
> 
> Am 2023-02-15 um 10:39 schrieb Bjorn Helgaas:
>> [+cc Christian, Xinhui, amd-gfx]
>>
>> On Fri, Jan 06, 2023 at 01:48:11PM +0800, Baolu Lu wrote:
>>> On 1/5/23 11:27 PM, Felix Kuehling wrote:
>>>> Am 2023-01-05 um 09:46 schrieb Deucher, Alexander:
>>>>>> -Original Message-
>>>>>> From: Hegde, Vasant 
>>>>>> On 1/5/2023 4:07 PM, Baolu Lu wrote:
>>>>>>> On 2023/1/5 18:27, Vasant Hegde wrote:
>>>>>>>> On 1/5/2023 6:39 AM, Matt Fagnani wrote:
>>>>>>>>> I built 6.2-rc2 with the patch applied. The same black
>>>>>>>>> screen problem happened with 6.2-rc2 with the patch. I
>>>>>>>>> tried to use early kdump with 6.2-rc2 with the patch
>>>>>>>>> twice by panicking the kernel with sysrq+alt+c after the
>>>>>>>>> black screen happened. The system rebooted after about
>>>>>>>>> 10-20 seconds both times, but no kdump and dmesg files
>>>>>>>>> were saved in /var/crash. I'm attaching the lspci -vvv
>>>>>>>>> output as requested. ...
>>>>>>>> Looking into lspci output, it doesn't list ACS feature
>>>>>>>> for Graphics card. So with your fix it didn't enable PASID
>>>>>>>> and hence it failed to boot. ...
>>>>>>> So do you mind telling why does the PASID need to be enabled
>>>>>>> for the graphic device? Or in another word, what does the
>>>>>>> graphic driver use the PASID for? ...
>>>>> The GPU driver uses the pasid for shared virtual memory between
>>>>> the CPU and GPU.  I.e., so that the user apps can use the same
>>>>> virtual address space on the GPU and the CPU.  It also uses
>>>>> pasid to take advantage of recoverable device page faults using
>>>>> PRS. ...
>>>> Agreed. This applies to GPU computing on some older AMD APUs that
>>>> take advantage of memory coherence and IOMMUv2 address translation
>>>> to create a shared virtual address space between the CPU and GPU.
>>>> In this case it seems to be a Carrizo APU. It is also true for
>>>> Raven APUs. ...
>>> Thanks for the explanation.
>>>
>>> This is actually the problem that commit 201007ef707a was trying to
>>> fix.  The PCIe fabric routes Memory Requests based on the TLP
>>> address, ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with
>>> PASID that should go upstream to the IOMMU may instead be routed as
>>> a P2P Request if its address falls in a bridge window.
>>>
>>> In SVA case, the IOMMU shares the address space of a user
>>> application.  The user application side has no knowledge about the
>>> PCI bridge window.  It is entirely possible that the device is
>>> programed with a P2P address and results in a disaster.
>> Is this stalled?  We explored the idea of changing the PCI core so
>> that for devices that use ATS/PRI, we could enable PASID without
>> checking for ACS [1], but IIUC we ultimately concluded that it was
>> based on a misunderstanding of how ATS Translation Requests are routed
>> and that an AMD driver change would be required [2].
>>
>> So it seems like we still have this regression, and we're running out
>> of time before v6.2.
>>
>> [1] 
>> https://lore.kernel.org/all/20230114073420.759989-1-baolu...@linux.intel.com/
>> [2] https://lore.kernel.org/all/y91x9mecosa67...@nvidia.com/
> 
> If I understand this correctly, the HW or the BIOS is doing something wrong
> about reporting ACS. I don't know what the GPU driver can do other than add 
> some
> quirk to stop using AMD IOMMUv2 on this HW/BIOS.
> 
> It looks like the problem is triggered when the driver calls
> amd_iommu_init_device. That's when the first WARNs happen, soon followed by a
> kernel oops in report_iommu_fault. The driver doesn't know anything is wrong
> because amd_iommu_init_device seems to return "success". And the oops is not 
> in
> the GPU driver either.

WARN is fixed and its in Joerg's tree.
https://lore.kernel.org/all/2023021503.5931-1-vasant.he...@amd.com/

report_iommu_fault() happened because in amd_iommu_init_device() path it failed
to attach devices to new domain and returned error. But it didn't put devices
back to old domain properly. It left in incosistent state and resulted in IO
page fault. I have proposed series to handle device to domain attachment failure
and better handling of subsequent report_iommu_fault().
https://lore.kernel.org/linux-iommu/20230215052642.6016-1-vasant.he...@amd.com/


@Matt,
  Can you please help to verify above patches on your system where you hit the
issue originally?
  (Grab above two series, apply it on top of latest kernel and test it)

-Vasant



Re: amd iommu configuration

2022-09-05 Thread Vasant Hegde
Steven,

[+Felix, amd-fgx list]


On 9/3/2022 4:29 AM, Steven J Abner wrote:
> Hi
> I was referred to you from linux-ker...@vger.kernel.org about the following 
> issue.
> Here is as was written:
> On 9/1/22 11:36, Steven J Abner wrote:
> Hi
> Building a kernel tailored for AMD 2400g on ASRock B450 using 5.18.12 as base.
> I stumbled across an odd situation and which lacked Kconfig info and lead to
> oddity.
> /drivers/iommu/amd/Kconfig states 'config AMD_IOMMU_V2' is 'tristate' but 
> unlike
> many
> other tristate configures doesn't mention that module name is 'iommu_v2.ko' 
> and
> loading should be done by adding to modules-load.d.
> 
> The oddity is that by loading as module is as follows (differences):
> 
> builtin iommu_v2 version dmesg:
> amdgpu: HMM registered 2048MB device memory
> amdgpu: Topology: Add APU node [0x0:0x0]
> amdgpu: Topology: Add APU node [0x15dd:0x1002]
> AMD-Vi: AMD IOMMUv2 loaded and initialized
> kfd kfd: amdgpu: added device 1002:15dd
> kfd kfd: amdgpu: Allocated 3969056 bytes on gart
> memmap_init_zone_device initialised 524288 pages in 0ms

IOMMU V2 modules provides IOMMU feature like attaching device to
process. I think amdgpu uses those features if available.
So in this case amdgpu is using those IOMMU features.

> 
> module not loaded due to missing iommu.conf dmesg:
> amdgpu: CRAT table disabled by module option
> amdgpu: Topology: Add CPU node
> amdgpu: Virtual CRAT table created for CPU
> kfd kfd: amdgpu: GC IP 090100 not supported in kfd
> 
> module load through iommu.conf dmesg:
> amdgpu: CRAT table disabled by module option
> amdgpu: Topology: Add CPU node
> amdgpu: Virtual CRAT table created for CPU
> AMD-Vi: AMD IOMMUv2 loaded and initialized
> kfd kfd: amdgpu: GC IP 090100 not supported in kfd
> 
> Note, only difference on witk/without iommu.conf is:
> AMD-Vi: AMD IOMMUv2 loaded and initialized

I think in this case iommu_v2.ko module got loaded after GPU
initialized. Hence amdgpu is not using iommu v2 features.


> 
> So does this mean missing features by not having builtin?
> If not, should Kconfig have hint about module and loading?

@Felix,
  I see that drivers/gpu/drm/amd/amdkfd/Kconfig contains below line
imply AMD_IOMMU_V2 if X86_64


  Should we change `s/imply/select` ?

-Vasant


> 
> Steve
> 
> I wish to be personally CC'ed the answers/comments posted to the list
> in response to my posting, please:) Not a list member.
> 
> I hope you can assist linux people and myself. I assumed from dmesg that
> it must be builtin. But also wonder if it should be in amdgpu or tied to it.
> Steve
> 
>