On 2026-01-08 at 08:02 +1100, Balbir Singh <[email protected]> wrote...
> On 1/8/26 06:54, Jason Gunthorpe wrote:
> > On Wed, Jan 07, 2026 at 12:06:08PM -0800, Andrew Morton wrote:
> >
> >>> 2) Attempting to add the device private pages to the linear map at
> >>>    addresses beyond the actual physical memory causes issues on
> >>>    architectures like aarch64 - meaning the feature does not work
> >>>    there [0].
> >>
> >> Can you better help us understand the seriousness of these problems?
> >> How much are our users really hurting from this?
> >
> > We think it is pretty serious, in the future HW support sense, as it
> > means real systems being built do not work :)

There's actually existing HW that could benefit from this support - after
all there is nothing stopping someone plugging an Intel/AMD/NVIDIA GPU into
an ARM machine today :-) So it would be nice if we could support this
feature there, as its absence results in really sub-optimal performance
compared with x86 when using the SVM (shared virtual memory) feature,
because data has to be remote mapped (i.e. accessed via the PCIe link)
rather than migrated to local GPU video memory.

Having the kernel steal physical address space has also caused problems on
x86 - we have encountered virtualised environments which, depending on the
specific firmware/BIOS, don't have enough free physical address space to
support device private pages and hence migration of memory to the GPU
device, again leading to sub-optimal performance.

> > Also Willy and others were cheering this work on at LPC. I think the
> > possible followup to move DEVICE_PRIVATE from struct page and reduce
> > the memory allocation would be well celebrated.

For reference, the recording of my LPC presentation covering both this
series and the above is here - https://www.youtube.com/watch?v=CFe_c8-tEuM

The hope is that, in addition to enabling support for this more broadly
across other platforms/architectures, it will also enable further clean-ups
to reduce memory allocation overhead (I almost convinced myself we wouldn't
need a struct at all ... almost)

> > The Intel Xe and AMD GPU teams are the two drivers most important to
> > be testing this as they consume the feature.
> >
>
> And the ultravisor usage in powerpc as well (book3s_hv_uvmem).

As does Nouveau (which I've tested). But I agree AMD GPU and Intel Xe are
the most important drivers here. I would be surprised if anyone was
actually using the powerpc ultravisor, and I don't have access to a setup
for this, so unless some PPC folk can offer to help I wouldn't like to see
testing there hold up the series.
Especially as I believe most of the driver-side changes are relatively
straightforward.

 - Alistair

> Balbir
