Hello,

Regarding the proposed interface for gathering GPU driver statistics, I would 
like to provide feedback based on the following points:

1. Providing per-process GPU information through an interface other than /proc
   would significantly improve the developer experience for Flatpak-based
   applications and tools. Since Flatpak containers have a restricted view of
   the host's /proc, it is currently very difficult for sandboxed monitoring
   tools to gather cross-process GPU metrics.

2. The current reliance on /proc/<pid>/{fd,fdinfo} requires root privileges to
   read information about other users' processes. This is a major hurdle for
   non-root users trying to detect system-critical issues, such as VRAM leaks
   in a compositor.

3. While ROCm/amdkfd currently provides per-process VRAM usage, it lacks an
   interface to report the utilization of hardware engines such as Compute or
   SDMA. It would be highly beneficial if the new interface could address this
   gap, ensuring that hardware IP utilization is consistently trackable across
   both KFD and DRM nodes.
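
To make point 2 concrete, the per-fd scan that fdinfo-based monitoring
requires can be sketched roughly as follows. This is only an illustrative
sketch, not code taken from any particular tool: the helper names
parse_drm_fdinfo and scan_drm_clients are hypothetical, and it assumes the
drm-* keys (for example drm-driver and drm-memory-vram) described in the
kernel's DRM client usage stats documentation. Entries that cannot be read,
such as those belonging to other users' processes, are silently skipped, which
is the limitation described in point 2.

```python
import os


def parse_drm_fdinfo(text):
    """Return the drm-* key/value pairs found in one fdinfo file's contents."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("drm-"):
            fields[key.strip()] = value.strip()
    return fields


def scan_drm_clients(proc_root="/proc"):
    """Yield (pid, fd, fields) for every readable fdinfo entry of a DRM fd."""
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        fdinfo_dir = os.path.join(proc_root, pid, "fdinfo")
        try:
            fds = os.listdir(fdinfo_dir)
        except OSError:
            # Usually a permission error for another user's process,
            # or the process exited while we were scanning.
            continue
        for fd in fds:
            try:
                with open(os.path.join(fdinfo_dir, fd)) as f:
                    fields = parse_drm_fdinfo(f.read())
            except OSError:
                continue  # the fd was closed in the meantime
            # Assumption: DRM file descriptors expose a "drm-driver" key.
            if "drm-driver" in fields:
                yield int(pid), int(fd), fields


if __name__ == "__main__":
    for pid, fd, fields in scan_drm_clients():
        print(pid, fd, fields["drm-driver"], fields.get("drm-memory-vram"))
```

Run as an unprivileged user, a scan like this would typically only report
that user's own processes, and it has to open every fdinfo entry of every
process just to find out which ones belong to a GPU.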

Note: I've used Gemini to help structure my thoughts and refine the English in
this mail.

Best regards,
Umio Yasuno

>
>
> On 2/5/26 20:25, Natalie Vock wrote:
>
> > On 2/5/26 19:58, Alex Deucher wrote:
> >
> > > Has anyone given any thought on how to support something like top for
> > > accelerators or GPUs?
> >
> > top for accelerators/GPUs kind of exists already, see [1] or [2].
> > Clearly, this problem has some kind of solution (looking through the code, 
> > it seems like they check every fd if it has a DRM fdinfo file associated 
> > (which is indeed not particularly efficient)).
> >
> > Maybe it's worth asking the authors of the respective tools for their 
> > opinions here?
>
>
> That is a really good point. Adding Maxime Schmitt and Umio Yasuno on CC.
>
> Let's hope I've picked the correct mail addresses.
>
> Christian.
>
> > Natalie
> >
> > [1] https://github.com/Umio-Yasuno/amdgpu_top
> > [2] https://github.com/Syllo/nvtop
> >
> > > We have fdinfo, but using fdinfo requires extra
> > > privileges (CAP_SYS_PTRACE) and there is not a particularly efficient
> > > way to even discover what processes are using the GPU. There is the
> > > clients list in debugfs, but that is also admin only. Tools like ps
> > > and top use /proc/<pid>/stat and statm. Do you think there would be
> > > an appetite for something like /proc/<pid>/drm/stat, statm, etc.?
> > > This would duplicate much of what is in fdinfo, but would be available
> > > to regular users.
> > >
> > > Thanks,
> > >
> > > Alex
