Re: [Nouveau] [PATCH] drm/ttm: Fix race condition in ttm_bo_delayed_delete

Francisco Jerez Thu, 21 Jan 2010 06:09:47 -0800

Luca Barbieri <l...@luca-barbieri.com> writes:

>> At a first glance:
>>
>> 1) We probably *will* need a delayed-destroy workqueue to avoid wasting
>> memory that otherwise should be freed to the system. At the very least, the
>> delayed delete process should optionally be run by a system shrinker.
> You are right. For VRAM we don't care since we are the only user,
> while for system backed memory some delayed destruction will be
> needed.
> The logical extension of the scheme would be for the Linux page
> allocator/swapper to check for TTM buffers to destroy when it would
> otherwise shrink caches, try to swap and/or wait on swap to happen.
> Not sure whether there are existing hooks for this or where exactly to
> hook this code.
>
>> 2) Fences in TTM are currently not necessarily strictly ordered, and
>> sequence numbers are hidden from the bo code. This means, for a given FIFO,
>> fence sequence 3 may expire before fence sequence 2, depending on the usage
>> of the buffer.
>
> My definition of "channel" (I sometimes used FIFO incorrectly as a
> synonym of that) is exactly a set of fences that are strictly ordered.
> If the card has multiple HW engines, each is considered a different
> channel (so that a channel becomes a (fifo, engine) pair).
>
> We may need however to add the concept of a "sync domain" that would
> be a set of channels that support on-GPU synchronization against each
> other.
> This would model hardware where channels with the same FIFO can be
> synchronized together but those with different FIFOs cannot, and also
> multi-core GPUs where synchronization might be available only inside
> each core and not across cores.
>
> To sum it up, a GPU consists of a set of sync domains, each consisting
> of a set of channels, each consisting of a sequence of fences, with
> the following rules:
> 1. Fences within the same channel expire in order
> 2. If channels A and B belong to the same sync domain, it's possible
> to emit a fence on A that is guaranteed to expire after an arbitrary
> fence of B
>
> Whether channels have the same FIFO or not is essentially a driver
> implementation detail, and what TTM cares about is if they are in the
> same sync domain.
>
> [I just made up "sync domain" here: is there a standard term?]
>
> This assumes that the "synchronizability" graph is a disjoint union of
> complete graphs. Is there any example where it is not so?
> Also, does this actually model Poulsbo correctly, or am I wrong?
>
> Note that we could use CPU mediation more than we currently do.
> For instance, to do inter-channel synchronization Nouveau currently
> just waits on the fence with the CPU, immediately and synchronously,
> while it could instead queue the commands in software and, with an
> interrupt/delayed mechanism, submit them to hardware once the fence to
> be waited for has expired.

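A minimal sketch of the deferred scheme described in the last quoted
paragraph, in Python for illustration only. Fence, DeferredSubmitter,
submit_after and on_fence_expired are hypothetical names, not Nouveau or
TTM code, and the "submitted" list merely stands in for the hardware ring.
It assumes some interrupt or delayed-work path reports fence expiry.

```python
from collections import defaultdict


class Fence:
    """A point in a channel's command stream (hypothetical model)."""

    def __init__(self, channel, seq):
        self.channel = channel
        self.seq = seq
        self.expired = False  # set once the GPU has passed this point


class DeferredSubmitter:
    """Holds commands in software until the fence they depend on expires."""

    def __init__(self):
        self.pending = defaultdict(list)  # fence -> commands waiting on it
        self.submitted = []               # stand-in for the hardware ring

    def submit_after(self, fence, command):
        if fence.expired:
            # Nothing to wait for: hand the command to hardware right away.
            self.submitted.append(command)
        else:
            # Queue in software instead of blocking the CPU on the fence.
            self.pending[fence].append(command)

    def on_fence_expired(self, fence):
        """Assumed to run from the fence interrupt or a delayed work item."""
        fence.expired = True
        for command in self.pending.pop(fence, []):
            self.submitted.append(command)
```

The caller of submit_after never sleeps; the cost is that commands reach
the hardware only after the expiry notification has been processed.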

Nvidia cards have a synchronization primitive, semaphores, that could be
used to synchronize several FIFOs in hardware (see [1] for an example).
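
As a rough illustration of how the two quoted rules would look if such a
primitive put several channels into one sync domain, here is a small
Python model. It is only a sketch: Channel, emit_fence and try_expire are
made-up names, the "domain" label is an assumption, and the stall in
try_expire merely imitates a channel waiting on a semaphore; it does not
describe the real hardware interface.

```python
class Channel:
    """A strictly ordered sequence of fences (rule 1). Hypothetical model."""

    def __init__(self, name, domain):
        self.name = name
        self.domain = domain   # assumed sync-domain label
        self.next_seq = 1      # sequence number of the next fence to emit
        self.expired_seq = 0   # highest sequence number that has expired
        self.waits = {}        # seq -> (other channel, other seq)

    def emit_fence(self, after=None):
        """Emit a fence; after=(channel, seq) orders it behind that fence (rule 2)."""
        seq = self.next_seq
        if after is not None:
            other, _ = after
            if other.domain != self.domain:
                raise ValueError("no on-GPU sync across sync domains")
            self.waits[seq] = after
        self.next_seq += 1
        return seq

    def try_expire(self):
        """Expire the next fence in order unless it still waits on another channel."""
        seq = self.expired_seq + 1
        if seq >= self.next_seq:
            return False  # nothing outstanding
        dep = self.waits.get(seq)
        if dep is not None and dep[0].expired_seq < dep[1]:
            return False  # stalled until the other channel's fence expires
        self.expired_seq = seq
        return True
```

With channels A and B in the same domain, a fence emitted on A with
after=(B, seq) cannot expire before B's fence does, while the same request
against a channel in another domain is rejected and would have to fall
back to CPU mediation.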

> _______________________________________________
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau

[1] http://lists.freedesktop.org/archives/nouveau/2009-December/004514.html
