Re: [RFC] Exclusive gpu access for SteamVR usecases

Andres Rodriguez Tue, 30 May 2017 14:38:53 -0700


On 2017-05-30 11:19 AM, Christian König wrote:

> Looks like a good start, but a few notes in general:
>
> 1. Split the patches into two sets.
> One for implementing changing the priorities and one for limiting the priorities.


No problem.

> 2. How are the priorities from processes supposed to interact with the per-context priority?


Do you mean process niceness?

There isn't any relationship between niceness and gpu priority.

Let me know if you meant something different here.

> 3. Thinking more about it we can't limit the minimum priority in the scheduler.
> For example a low priority job might block resources the high priority job needs to run. E.g. VRAM memory.

We avoid deadlocks by making sure that all dependencies of an exclusive task are also elevated to the same priority as said task. Usermode (the DRM_MASTER) is responsible for maintaining this guarantee. The kernel does provide an ioctl that makes this task simple, amdgpu_sched_process_priority_set().


Let's take a look at this issue through three different scenarios.

(i) Job dependencies are all process internal, i.e. multiple contexts in one process.

This is the trivial case. A call to amdgpu_sched_process_priority_set() will change the priority of all contexts belonging to a process in lockstep.

Once amdgpu_sched_process_priority_set() returns, it is safe to raise the minimum priority using amdgpu_sched_min_priority_get(). At this point we have a guarantee that all contexts belonging to the process will be in a runnable state, or all the contexts will be in a not-runnable state. There won't be a mix of runnable and non-runnable contexts.

Getting into that mixed state is what could cause a deadlock: a runnable context depends on a non-runnable one.

Note: the current patchset needs a fix to provide this guarantee in multi-gpu systems.
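The all-or-nothing guarantee in (i) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `model_*` names, the two priority levels, and the fixed-size context array are hypothetical and are not part of the patchset. It assumes a context is runnable when its priority is at or above the scheduler minimum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative levels only; real code would use AMDGPU_CTX_PRIORITY_*. */
enum { MODEL_PRIO_NORMAL = 0, MODEL_PRIO_HIGH = 1 };
#define MODEL_MAX_CTX 8

/* Toy model of one process: one priority per context (hypothetical). */
struct model_process {
    int ctx_prio[MODEL_MAX_CTX];
    size_t nr_ctx;
};

/* Models amdgpu_sched_process_priority_set(): all contexts move in lockstep. */
static void model_process_priority_set(struct model_process *p, int prio)
{
    assert(p->nr_ctx <= MODEL_MAX_CTX);
    for (size_t i = 0; i < p->nr_ctx; i++)
        p->ctx_prio[i] = prio;
}

/* Assumption: a context is runnable when its priority is >= the minimum. */
static bool model_ctx_runnable(int ctx_prio, int min_prio)
{
    return ctx_prio >= min_prio;
}

/* True if some contexts are runnable and others are not (the deadlock risk). */
static bool model_mixed_state(const struct model_process *p, int min_prio)
{
    bool any_runnable = false, any_blocked = false;

    for (size_t i = 0; i < p->nr_ctx; i++) {
        if (model_ctx_runnable(p->ctx_prio[i], min_prio))
            any_runnable = true;
        else
            any_blocked = true;
    }
    return any_runnable && any_blocked;
}
```

In this model, a process whose contexts sit at different priorities is in the mixed state once the minimum is raised, while a process that went through `model_process_priority_set()` first never is, whichever level was chosen.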


(ii) Job dependencies between two processes.

This case is mentioned separately as it is probably the most common use case we will encounter for this feature. Most graphics applications enter a producer/consumer relationship with the compositor process (window swapchain).

In this case the compositor should already have all the information required to avoid a deadlock. It knows:

  - Itself (as a process)
  - The application process
  - The dependencies between both processes

At this stage it is simple for the compositor to understand that if it wishes to perform an exclusive mode transition, all dependencies (which are known) should also be part of the exclusive group.

We should be able to implement this feature without modifying a game/application.
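The ordering for (ii) can be sketched as follows: elevate every member of the exclusive group first, and only then raise the minimum. This is an illustration, not the patchset's code. The `sketch_*` functions are stubs standing in for the proposed amdgpu_sched_process_priority_set() and amdgpu_sched_min_priority_get() so that the example compiles without libdrm, and the rollback to a "normal" level assumes that was the previous priority of every member.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in device state; not the real amdgpu_device_handle. */
struct sketch_device { int32_t min_prio; };
enum { SKETCH_PRIO_NORMAL = 0, SKETCH_PRIO_HIGH = 1 };

/* Stub for the proposed process_priority_set(); rejects invalid fds. */
static int sketch_process_priority_set(struct sketch_device *dev, int fd,
                                       int32_t prio)
{
    (void)dev;
    (void)prio;
    return fd < 0 ? -EBADF : 0;
}

/* Stub for the proposed min_priority_get(); only ever raises the minimum. */
static int sketch_min_priority_get(struct sketch_device *dev, int32_t prio)
{
    if (prio > dev->min_prio)
        dev->min_prio = prio;
    return 0;
}

/*
 * Elevate all group members, then raise the scheduler minimum.
 * On failure, undo the elevations done so far and leave the minimum
 * untouched, so no dependency is left below the minimum.
 */
static int enter_exclusive_mode(struct sketch_device *dev,
                                const int *group_fds, size_t nr_fds)
{
    size_t i;
    int r = 0;

    assert(dev != NULL);
    for (i = 0; i < nr_fds; i++) {
        r = sketch_process_priority_set(dev, group_fds[i], SKETCH_PRIO_HIGH);
        if (r < 0)
            goto rollback;
    }
    return sketch_min_priority_get(dev, SKETCH_PRIO_HIGH);

rollback:
    /* Assumption: every member was at SKETCH_PRIO_NORMAL before. */
    while (i-- > 0)
        sketch_process_priority_set(dev, group_fds[i], SKETCH_PRIO_NORMAL);
    return r;
}
```

Here `group_fds` would hold one fd from the compositor itself and one from each application process it depends on.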


(iii) Job dependencies between multiple (3+) processes.

This scenario is very uncommon for games. An example would be a game or application that is split into multiple processes: Process A interacts with the compositor, and Process B does some physics/compute calculations and sends the results to Process A.

To support this use case, we would require an interface for the application to communicate to the compositor its dependencies. I.e. Process A would say, "Also keep Process B's priority in sync with mine". This should be a simple bit of plumbing to allow Process A to share an fd from Process B with the compositor.

B --[pipe_send(fdB)]--> A --[compositor_ext_priority_group_add(fdB)]--> Compositor

Once the compositor is aware of all of A's dependencies, this can be handled in the same fashion as (ii).

A special extension would be required for compositor protocols to communicate the dependencies fd. Applications would also need to be updated to use this extension.

I think this case would be very uncommon. But it is something that we would be able to handle if the need arises.
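The fd-sharing step in (iii) is not specified further in this mail; pipe_send() and compositor_ext_priority_group_add() are placeholders. One standard way to hand an fd from one process to another on Linux is an SCM_RIGHTS control message over a Unix domain socket (an ordinary pipe cannot carry fds). The sketch below shows that mechanism only; `send_fd()` and `recv_fd()` are hypothetical helper names, and nothing here is a compositor protocol.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one fd. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send one fd over a connected Unix domain socket. Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
    char dummy = 'F'; /* at least one byte of payload must accompany the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    assert(sock >= 0 && fd >= 0);
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd. Returns the new fd in this process, or -1. */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the diagram's terms, Process B would call `send_fd()` towards A, and A would forward the received fd to the compositor the same way. The compositor could then pass that fd to amdgpu_sched_process_priority_set(), since any fd from the target process is accepted.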


> We need something like blocking the submitter instead (bad) or detection
> of dependencies in the scheduler (good, but tricky to implement).
>

I definitely agree that detecting dependencies is tricky, which is why I prefer an approach where usermode defines the dependencies. It is simple for both the kernel and usermode to implement.


> Otherwise we can easily run into a deadlock situation with that approach.
>

The current API does allow you to deadlock yourself pretty easily if misused. But so do many other APIs, like having a thread trying to grab the same lock twice :)


Thanks for the comments,
Andres

Regards,
Christian.

On 25.05.2017 at 02:00, Andres Rodriguez wrote:
When multiple environments are running simultaneously on a system, e.g.
an X desktop + a SteamVR game session, it may be useful to sacrifice
performance in one environment in order to boost it on the other.

This series provides a mechanism for a DRM_MASTER to provide exclusive
gpu access to a group of processes.
Note: This series is built on the assumption that the drm lease patch series
will extend DRM_MASTER status to lessees.

The libdrm interface we intend to provide is as follows:

/**
  * Set the priority of all contexts in a process
  *
  * This function will change the priority of all contexts owned by
  * the process identified by fd.
  *
  * \param dev             - \c [in] device handle
  * \param fd              - \c [in] fd from target process
  * \param priority        - \c [in] target priority AMDGPU_CTX_PRIORITY_*
  *
  * \return  0 on success\n
  *         <0 - Negative POSIX error code
  *
  * \notes @fd can be *any* file descriptor from the target process.
  * \notes this function requires DRM_MASTER
  */
int amdgpu_sched_process_priority_set(amdgpu_device_handle dev,
                      int fd, int32_t priority);

/**
  * Request to raise the minimum required priority to schedule a gpu job
  *
  * Submit a request to increase the minimum required priority to schedule
  * a gpu job. Once this function returns, the gpu scheduler will no longer
  * consider jobs from contexts with priority lower than @priority.
  *
  * The minimum priority considered by the scheduler will be the highest from
  * all currently active requests.
  *
  * Requests are refcounted, and must be balanced using
  * amdgpu_sched_min_priority_put()
  *
  * \param dev             - \c [in] device handle
  * \param priority        - \c [in] target priority AMDGPU_CTX_PRIORITY_*
  *
  * \return  0 on success\n
  *         <0 - Negative POSIX error code
  *
  * \notes this function requires DRM_MASTER
  */
int amdgpu_sched_min_priority_get(amdgpu_device_handle dev,
                  int32_t priority);

/**
  * Drop a request to raise the minimum required scheduler priority
  *
  * This call balances amdgpu_sched_min_priority_get()
  *
  * If no other active requests exist for @priority, the minimum required
  * priority will decay to a lower level until one is reached with an active
  * request or the lowest priority is reached.
  *
  * \param dev             - \c [in] device handle
  * \param priority        - \c [in] target priority AMDGPU_CTX_PRIORITY_*
  *
  * \return  0 on success\n
  *         <0 - Negative POSIX error code
  *
  * \notes this function requires DRM_MASTER
  */
int amdgpu_sched_min_priority_put(amdgpu_device_handle dev,
                  int32_t priority);
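
The refcounting described in the comments above (the effective minimum is the highest level with an active request, and it decays when that level's last request is dropped) can be modelled in a few lines. This is a sketch of the documented semantics only, not the kernel implementation; the `model_*` names and the three levels are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative levels; real code would use AMDGPU_CTX_PRIORITY_*. */
enum { MODEL_PRIO_LOW = 0, MODEL_PRIO_NORMAL, MODEL_PRIO_HIGH, MODEL_PRIO_COUNT };

struct model_sched {
    unsigned int refs[MODEL_PRIO_COUNT]; /* active requests per level */
};

/* Effective minimum: highest level with an active request, else the lowest. */
static int model_min_priority(const struct model_sched *s)
{
    for (int p = MODEL_PRIO_COUNT - 1; p > MODEL_PRIO_LOW; p--)
        if (s->refs[p] > 0)
            return p;
    return MODEL_PRIO_LOW;
}

/* Models amdgpu_sched_min_priority_get(): take one reference on a level. */
static int model_min_priority_get(struct model_sched *s, int prio)
{
    if (prio < 0 || prio >= MODEL_PRIO_COUNT)
        return -EINVAL;
    s->refs[prio]++;
    return 0;
}

/* Models amdgpu_sched_min_priority_put(): unbalanced puts are rejected. */
static int model_min_priority_put(struct model_sched *s, int prio)
{
    if (prio < 0 || prio >= MODEL_PRIO_COUNT || s->refs[prio] == 0)
        return -EINVAL;
    s->refs[prio]--;
    return 0;
}

/* Walk through the decay behaviour described in the doc comments. */
void model_selfcheck(void)
{
    struct model_sched s = { { 0 } };

    model_min_priority_get(&s, MODEL_PRIO_NORMAL);
    model_min_priority_get(&s, MODEL_PRIO_HIGH);
    assert(model_min_priority(&s) == MODEL_PRIO_HIGH);

    model_min_priority_put(&s, MODEL_PRIO_HIGH);   /* decays to NORMAL */
    assert(model_min_priority(&s) == MODEL_PRIO_NORMAL);

    model_min_priority_put(&s, MODEL_PRIO_NORMAL); /* decays to the lowest */
    assert(model_min_priority(&s) == MODEL_PRIO_LOW);
}
```

Returning -EINVAL for an unbalanced put is a choice made for this model; the mail does not say how the real interface reports that case.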

Using this API, VRComposer can raise the priority of the VRapp and itself. Then
it can restrict the minimum scheduler priority in order to become exclusive gpu
clients.

One of the areas I'd like feedback is the following scenario. If a VRapp opens
a new fd and creates a new context after a call to set_priority, this specific
context will be lower priority than the rest. If the minimum required priority
is then raised, it is possible that this new context will be starved and
deadlock the VRapp.

One solution I had in mind to address this situation, is to make set_priority
also raise the priority of future contexts created by the VRapp. However, that
would require keeping track of the requested priority on a per-process data
structure. The current design appears to steer clear of keeping any process
specific data, and everything is instead stored on a per-file basis. Which is
why I did not pursue this approach. But if this is something you'd like me to
implement let me know.
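
The difference between the current behaviour and the per-process alternative can be sketched with a toy model. Everything here is hypothetical: the `model_*` names, the `default_prio` field standing in for the per-process data structure, and the `remember` flag that switches between the two behaviours. It is not how the patchset stores priorities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MODEL_PRIO_NORMAL = 0, MODEL_PRIO_HIGH = 1 };
#define MODEL_MAX_CTX 8

struct model_proc {
    int default_prio;            /* hypothetical per-process state */
    int ctx_prio[MODEL_MAX_CTX]; /* priority of each existing context */
    size_t nr_ctx;
};

/*
 * Models set_priority. With remember == false only existing contexts
 * change (current behaviour); with remember == true the priority is
 * also recorded for contexts created later (the alternative).
 */
static void model_set_priority(struct model_proc *p, int prio, bool remember)
{
    assert(p->nr_ctx <= MODEL_MAX_CTX);
    for (size_t i = 0; i < p->nr_ctx; i++)
        p->ctx_prio[i] = prio;
    if (remember)
        p->default_prio = prio;
}

/* New contexts start at the process default. Returns the index or -1. */
static int model_ctx_create(struct model_proc *p)
{
    if (p->nr_ctx >= MODEL_MAX_CTX)
        return -1;
    p->ctx_prio[p->nr_ctx] = p->default_prio;
    return (int)p->nr_ctx++;
}

/* A context below the scheduler minimum is never scheduled. */
static bool model_ctx_starved(const struct model_proc *p, int ctx, int min_prio)
{
    return p->ctx_prio[ctx] < min_prio;
}
```

With `remember == false`, a context created after the set_priority call starts at the normal level and is starved once the minimum is raised; with `remember == true` it inherits the elevated level.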

One could also argue that preventing an application deadlock should be handled
between the VRComposer and the VRApp. It is not the kernel's responsibility to
babysit userspace applications and prevent them from shooting themselves
in the foot. The same could be achieved by improper usage of shared fences
between processes.

Thoughts/feedback/comments on this issue, or others, are appreciated.

Regards,
Andres

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
