On Wed, Jun 26, 2019 at 5:05 PM Kenny Ho wrote:
> This is a follow up to the RFC I made previously to introduce a cgroup
> controller for the GPU/DRM subsystem [v1,v2]. The goal is to be able to
> provide resource management to GPU resources using things like container.
> The cover letter from v1 is copied below for reference.
>
> [v1]:
> https://lists.freedesktop.org/archives/dri-devel/2018-November/197106.html
> [v2]: https://www.spinics.net/lists/cgroups/msg22074.html
>
> v3:
> Base on feedbacks on v2:
> * removed .help type file from v2
> * conform to cgroup convention for default and max handling
> * conform to cgroup convention for addressing device specific limits (with
> major:minor)
> New function:
> * adopted memparse for memory size related attributes
> * added macro to marshall drmcgrp cftype private (DRMCG_CTF_PRIV, etc.)
> * added ttm buffer usage stats (per cgroup, for system, tt, vram.)
> * added ttm buffer usage limit (per cgroup, for vram.)
> * added per cgroup bandwidth stats and limiting (burst and average bandwidth)
>
> v2:
> * Removed the vendoring concepts
> * Add limit to total buffer allocation
> * Add limit to the maximum size of a buffer allocation
>
> v1: cover letter
>
> The purpose of this patch series is to start a discussion for a generic cgroup
> controller for the drm subsystem. The design proposed here is a very early
> one.
> We are hoping to engage the community as we develop the idea.
>
>
> Backgrounds
> ==
> Control Groups/cgroup provide a mechanism for aggregating/partitioning sets of
> tasks, and all their future children, into hierarchical groups with
> specialized
> behaviour, such as accounting/limiting the resources which processes in a
> cgroup
> can access[1]. Weights, limits, protections, allocations are the main
> resource
> distribution models. Existing cgroup controllers includes cpu, memory, io,
> rdma, and more. cgroup is one of the foundational technologies that enables
> the
> popular container application deployment and management method.
>
> Direct Rendering Manager/drm contains code intended to support the needs of
> complex graphics devices. Graphics drivers in the kernel may make use of DRM
> functions to make tasks like memory management, interrupt handling and DMA
> easier, and provide a uniform interface to applications. The DRM has also
> developed beyond traditional graphics applications to support compute/GPGPU
> applications.
>
>
> Motivations
> =
> As GPU grow beyond the realm of desktop/workstation graphics into areas like
> data center clusters and IoT, there are increasing needs to monitor and
> regulate
> GPU as a resource like cpu, memory and io.
>
> Matt Roper from Intel began working on similar idea in early 2018 [2] for the
> purpose of managing GPU priority using the cgroup hierarchy. While that
> particular use case may not warrant a standalone drm cgroup controller, there
> are other use cases where having one can be useful [3]. Monitoring GPU
> resources such as VRAM and buffers, CU (compute unit [AMD's nomenclature])/EU
> (execution unit [Intel's nomenclature]), GPU job scheduling [4] can help
> sysadmins get a better understanding of the applications usage profile.
> Further
> usage regulations of the aforementioned resources can also help sysadmins
> optimize workload deployment on limited GPU resources.
>
> With the increased importance of machine learning, data science and other
> cloud-based applications, GPUs are already in production use in data centers
> today [5,6,7]. Existing GPU resource management is very course grain,
> however,
> as sysadmins are only able to distribute workload on a per-GPU basis [8]. An
> alternative is to use GPU virtualization (with or without SRIOV) but it
> generally acts on the entire GPU instead of the specific resources in a GPU.
> With a drm cgroup controller, we can enable alternate, fine-grain, sub-GPU
> resource management (in addition to what may be available via GPU
> virtualization.)
>
> In addition to production use, the DRM cgroup can also help with testing
> graphics application robustness by providing a mean to artificially limit DRM
> resources availble to the applications.
>
>
> Challenges
>
> While there are common infrastructure in DRM that is shared across many
> vendors
> (the scheduler [4] for example), there are also aspects of DRM that are vendor
> specific. To accommodate this, we borrowed the mechanism used by the cgroup
> to
> handle different kinds of cgroup controller.
>
> Resources for DRM are also often device (GPU) specific instead of system
> specific and a system may contain more than one GPU. For this, we borrowed
> some
> of the ideas from RDMA cgroup controller.
Another question I have: What about HMM? With the device memory zone
the core mm will be a lot more involved in managing that, but I also
expect that we'll have classic buffer-based management for a long time
still. So these need to work togeth