On 16-11-25 03:40 PM, Christian König wrote:
> On 25.11.2016 at 20:32, Jason Gunthorpe wrote:
>> This assumes the commands are fairly short lived of course, the
>> expectation of the mmu notifiers is that a flush is reasonably prompt
>
> Correct, this is another problem. GFX command submissions usually
> don't take longer than a few milliseconds, but compute command
> submission can easily take multiple hours.
>
> I can easily imagine what would happen when kswapd is blocked by a GPU
> command submission for an hour or so while the system is under memory
> pressure :)
>
> I'm thinking on this problem for about a year now and going in circles
> for quite a while. So if you have ideas on this even if they sound
> totally crazy, feel free to come up.

Our GPUs (at least starting with VI) support compute-wave-save-restore
and can swap out compute queues with fairly low latency. Yes, there is
some overhead (both memory usage and time), but it's a fairly regular
thing with our hardware scheduler (firmware, actually) when we need to
preempt running compute queues to update runlists or we overcommit the
hardware queue resources.

Regards,
  Felix