Re: [Nouveau] 5.12.1 0010:nvkm_falcon_v1_wait_for_halt+0x8f/0xb9 [nouveau]

2021-05-06 Thread Bjorn Helgaas
[+cc Ben]

Hi Marc,

Thanks for paying attention to these things.  I added Ben (who
probably would see this via nouveau@lists.freedesktop.org anyway).
I don't see a PCI issue here, but the nouveau timeout, which I know
nothing about, does look like it could be interesting.

On Wed, May 05, 2021 at 02:42:27PM -0700, Marc MERLIN wrote:
> Howdy,
> I upgraded my thinkpad P73 from 5.9 to 5.12, and I now get this new
> bug at boot (although the system does continue booting and display works
> since I use i915 for display and only use nouveau for PM)
> 
> Short:
> [   18.561181] WARNING: CPU: 15 PID: 220 at drivers/gpu/drm/nouveau/nvkm/falcon/v1.c:247 nvkm_falcon_v1_wait_for_halt+0x8f/0xb9 [nouveau]
> [   18.561300] Modules linked in: dm_crypt trusted tpm rng_core dm_mod raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx multipath sata_sil24 r8169 realtek mdio_devres libphy mii hid_generic usbhid hid crct10dif_pclmul crc32_pclmul crc32c_intel xhci_pci rtsx_pci_sdmmc nouveau ghash_clmulni_intel xhci_hcd mmc_core e1000e i2c_designware_platform mxm_wmi i2c_designware_core hwmon ptp aesni_intel intel_lpss_pci drm_ttm_helper i2c_i801 crypto_simd intel_lpss i2c_smbus psmouse i915 cryptd pps_core thunderbolt rtsx_pci idma64 usbcore ttm i2c_nvidia_gpu thermal wmi battery
> [   18.561636] CPU: 15 PID: 220 Comm: kworker/15:2 Tainted: G U 5.12.1-amd64-preempt-sysrq-20190817 #1
> [   18.561707] Hardware name: LENOVO 20QRS00200/20QRS00200, BIOS N2NET40W (1.25 ) 08/26/2020
> [   18.561765] Workqueue: pm pm_runtime_work
> [   18.561799] RIP: 0010:nvkm_falcon_v1_wait_for_halt+0x8f/0xb9 [nouveau]
> 
> Despite the warning, the chip seems to go to sleep on battery, and powertop
> shows an encouragingly low battery use (my lowest yet on any kernel):
> The battery reports a discharge rate of 10.7 W
> The power consumed was 230 J
> 
> So it seems that what I need from nouveau is working (power management)
> 
> Full warning below with logs
> 
> 
> Long:
> [0.00] Linux version 5.12.1-amd64-preempt-sysrq-20190817 (r...@sauron.svh.merlins.org) (gcc (Debian 10.2.1-3) 10.2.1 20201224, GNU ld (GNU Binutils for Debian) 2.35.1) #1 SMP PREEMPT Wed May 5 13:05:02 PDT 2021
> [0.00] Command line: BOOT_IMAGE=/vmlinuz-5.12.1-amd64-preempt-sysrq-20190817 root=/dev/mapper/cryptroot ro rootflags=subvol=root cryptopts=source=/dev/nvme0n1p7,keyscript=/sbin/cryptgetpw usbcore.autosuspend=1 pcie_aspm=force resume=/dev/dm-1 acpi_backlight=vendor nouveau.debug=disp=trace
> [8.672663] nouveau :01:00.0: runtime IRQ mapping not provided by arch
> [8.677434] nouveau :01:00.0: enabling device ( -> 0003)
> [8.691872] nouveau :01:00.0: NVIDIA TU104 (164000a1)
> [8.789240] nouveau :01:00.0: bios: version 90.04.4d.00.2c
> [8.789605] nouveau :01:00.0: pmu: firmware unavailable
> [8.789897] nouveau :01:00.0: enabling bus mastering
> [8.789978] nouveau :01:00.0: disp: preinit running...
> [8.789981] nouveau :01:00.0: disp: preinit completed in 0us
> [8.789997] nouveau :01:00.0: disp: fini running...
> [8.78] nouveau :01:00.0: disp: fini completed in 0us
> [8.790189] nouveau :01:00.0: fb: 8192 MiB GDDR6
> [8.800113] nouveau :01:00.0: disp: init running...
> [8.800116] nouveau :01:00.0: disp: init skipped, engine has no users
> [8.800118] nouveau :01:00.0: disp: init completed in 2us
> [8.801512] nouveau :01:00.0: DRM: VRAM: 8192 MiB
> [8.801515] nouveau :01:00.0: DRM: GART: 536870912 MiB
> [8.801517] nouveau :01:00.0: DRM: BIT table 'A' not found
> [8.801520] nouveau :01:00.0: DRM: BIT table 'L' not found
> [8.801521] nouveau :01:00.0: DRM: TMDS table version 2.0
> [8.801525] nouveau :01:00.0: DRM: DCB version 4.1
> [8.801527] nouveau :01:00.0: DRM: DCB outp 00: 02800f66 04600020
> [8.801529] nouveau :01:00.0: DRM: DCB outp 01: 02011f52 00020010
> [8.801531] nouveau :01:00.0: DRM: DCB outp 02: 01022f36 04600010
> [8.801533] nouveau :01:00.0: DRM: DCB outp 03: 04033f76 04600010
> [8.801535] nouveau :01:00.0: DRM: DCB outp 04: 04044f86 04600020
> [8.801537] nouveau :01:00.0: DRM: DCB conn 00: 00020047
> [8.801539] nouveau :01:00.0: DRM: DCB conn 01: 00010161
> [8.801541] nouveau :01:00.0: DRM: DCB conn 02: 1248
> [8.801543] nouveau :01:00.0: DRM: DCB conn 03: 01000348
> [8.801543] nouveau :01:00.0: DRM: DCB conn 04: 02000471
> [8.802234] nouveau :01:00.0: DRM: MM: using COPY for buffer copies
> [8.802255] nouveau :01:00.0: disp: init running...
> [8.802257] nouveau :01:00.0: disp: one-time init running...
> [8.802259] nouveau :01:00.0: disp: outp 00:0006:0f82: type 06 loc 0 or 2 link 2 con 0 edid 6 bus 0 head f
> [8.802265] nouveau :01:00.0: disp: outp 00:0006:0f82: bios dp 42 13 00

Re: [Nouveau] [PATCH v8 0/8] Add support for SVM atomics in Nouveau

2021-05-06 Thread Alistair Popple
Hi Andrew,

There is currently no outstanding feedback for this series, so I am hoping it
may be considered for inclusion (or at least the mm portions; I still need
Reviews/Acks for the Nouveau bits). The main change for v8 was removing
entries on fork rather than copying them, in response to feedback from Jason,
so any follow-up comments on patch 5 would also be welcome. The series
contains a number of general clean-ups suggested by Christoph, along with a
feature to temporarily make selected user page mappings write-protected.

This is needed to support OpenCL atomic operations in Nouveau to shared 
virtual memory (SVM) regions allocated with the CL_MEM_SVM_ATOMICS clSVMAlloc 
flag. A more complete description of the OpenCL SVM feature is available at 
https://www.khronos.org/registry/OpenCL/specs/3.0-unified/html/OpenCL_API.html#_shared_virtual_memory .

I have been testing this with Mesa 21.1.0 and a simple OpenCL program which
checks that GPU atomic accesses to system memory are atomic. Without this
series the test fails, as there is no way of write-protecting the userspace
page mapping, which results in the device clobbering CPU writes. For reference
the test is available at https://ozlabs.org/~apopple/opencl_svm_atomics/ .

 - Alistair

On Wednesday, 7 April 2021 6:42:30 PM AEST Alistair Popple wrote:
> This is the eighth version of a series to add support to Nouveau for atomic
> memory operations on OpenCL shared virtual memory (SVM) regions.
> 
> The main change for this version is a simplification of device exclusive
> entry handling. Instead of copying entries for copy-on-write mappings
> during fork they are removed instead. This is safer because there could be
> unique corner cases when copying, particularly for pinned pages which
> should follow the same logic as copy_present_page(). Removing entries
> avoids this possibility by treating them as normal ptes.
> 
> Exclusive device access is implemented by adding a new swap entry type
> (SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main
> difference is that on fault the original entry is immediately restored by
> the fault handler instead of waiting.
> 
> Restoring the entry triggers calls to MMU notifiers which allows a device
> driver to revoke the atomic access permission from the GPU prior to the CPU
> finalising the entry.
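
As a rough illustration of the flow described above, here is a toy
userspace model in C. It is only a sketch and not kernel code: every
name in it (toy_page, toy_make_device_exclusive, toy_cpu_access,
toy_notifier_invalidate) is hypothetical and invented for this example.
It assumes a single page, models the notifier as a direct function
call, and ignores locking, TLB flushes and races.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; none of these names are kernel APIs. */
enum toy_pte_state {
	TOY_PTE_PRESENT,          /* normal CPU mapping */
	TOY_PTE_DEVICE_EXCLUSIVE  /* CPU mapping replaced by a special entry */
};

struct toy_page {
	enum toy_pte_state state;
	bool device_atomic_ok;    /* does the device currently hold atomic access? */
	int notifier_calls;       /* how many times the "driver" was notified */
};

/* Driver side: revoke the device's atomic access when notified. */
void toy_notifier_invalidate(struct toy_page *page)
{
	page->device_atomic_ok = false;
	page->notifier_calls++;
}

/* Replace the normal mapping with the exclusive entry and let the device in. */
bool toy_make_device_exclusive(struct toy_page *page)
{
	if (page->state != TOY_PTE_PRESENT)
		return false;
	page->state = TOY_PTE_DEVICE_EXCLUSIVE;
	page->device_atomic_ok = true;
	return true;
}

/*
 * CPU touches the page. If the exclusive entry is installed, the fault
 * path notifies the driver first and then restores the original entry
 * immediately, rather than waiting as a migration entry would.
 */
void toy_cpu_access(struct toy_page *page)
{
	if (page->state == TOY_PTE_DEVICE_EXCLUSIVE) {
		toy_notifier_invalidate(page);
		page->state = TOY_PTE_PRESENT;
	}
}
```

Calling toy_make_device_exclusive() and then toy_cpu_access() on the
same toy_page leaves it in TOY_PTE_PRESENT with device access revoked
and exactly one notifier call. A second CPU access is then an ordinary
access and does not notify again.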
> 
> Patches 1 & 2 refactor existing migration and device private entry
> functions.
> 
> Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated
> functionality into separate functions - try_to_migrate_one() and
> try_to_munlock_one(). These should not change any functionality, but any
> help testing would be much appreciated as I have not been able to test
> every usage of try_to_unmap_one().
> 
> Patch 5 contains the bulk of the implementation for device exclusive
> memory.
> 
> Patch 6 contains some additions to the HMM selftests to ensure everything
> works as expected.
> 
> Patch 7 is a cleanup for the Nouveau SVM implementation.
> 
> Patch 8 contains the implementation of atomic access for the Nouveau
> driver.
> 
> This has been tested using the latest upstream Mesa userspace with a simple
> OpenCL test program which checks the results of atomic GPU operations on a
> SVM buffer whilst also writing to the same buffer from the CPU.
> 
> Alistair Popple (8):
>   mm: Remove special swap entry functions
>   mm/swapops: Rework swap entry manipulation code
>   mm/rmap: Split try_to_munlock from try_to_unmap
>   mm/rmap: Split migration into its own function
>   mm: Device exclusive memory access
>   mm: Selftests for exclusive device memory
>   nouveau/svm: Refactor nouveau_range_fault
>   nouveau/svm: Implement atomic SVM access
> 
>  Documentation/vm/hmm.rst                      |  19 +-
>  Documentation/vm/unevictable-lru.rst          |  33 +-
>  arch/s390/mm/pgtable.c                        |   2 +-
>  drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
>  drivers/gpu/drm/nouveau/nouveau_svm.c         | 156 -
>  drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
>  .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c    |   6 +
>  fs/proc/task_mmu.c                            |  23 +-
>  include/linux/mmu_notifier.h                  |  26 +-
>  include/linux/rmap.h                          |  11 +-
>  include/linux/swap.h                          |   8 +-
>  include/linux/swapops.h                       | 123 ++--
>  lib/test_hmm.c                                | 126 +++-
>  lib/test_hmm_uapi.h                           |   2 +
>  mm/debug_vm_pgtable.c                         |  12 +-
>  mm/hmm.c                                      |  12 +-
>  mm/huge_memory.c                              |  45 +-
>  mm/hugetlb.c                                  |  10 +-
>  mm/memcontrol.c                               |   2 +-
>  mm/memory.c                                   | 196 +-
>  mm/migrate.c                                  |  51 +-
>  mm/mlock.c                                    |  10 +-