[Nouveau] [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-25 Thread Christoph Bumiller
On 04/23/2012 06:32 PM, Marcin Slusarz wrote: > You can run fs-discard-exit-2 test first - for me it causes instant GPU > lockup. > That's because it's designed (but not supposed) to do that, it also locks up with the blob, it's a harmless shader infinite loop. (May be a bug in the MPs or a

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-25 Thread Ben Skeggs
On Tue, 2012-04-24 at 21:31 +0200, Marcin Slusarz wrote: > On Mon, Apr 23, 2012 at 06:56:44PM +0200, Martin Peres wrote: > > Le 23/04/2012 18:32, Marcin Slusarz a écrit : > > > > > > Just run piglit. Even "quick" tests can cause ~5 lockups (it eventually > > > messes > > > up DDX channel, but

Re: [Nouveau] [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-25 Thread Christoph Bumiller
On 04/23/2012 06:32 PM, Marcin Slusarz wrote: You can run fs-discard-exit-2 test first - for me it causes instant GPU lockup. That's because it's designed (but not supposed) to do that, it also locks up with the blob, it's a harmless shader infinite loop. (May be a bug in the MPs or a wrong

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-24 Thread Marcin Slusarz
On Mon, Apr 23, 2012 at 06:56:44PM +0200, Martin Peres wrote: > Le 23/04/2012 18:32, Marcin Slusarz a écrit : > > > > Just run piglit. Even "quick" tests can cause ~5 lockups (it eventually > > messes > > up DDX channel, but this patchset can't fix this case). > > You can run fs-discard-exit-2

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-24 Thread Marcin Slusarz
On Mon, Apr 23, 2012 at 06:56:44PM +0200, Martin Peres wrote: Le 23/04/2012 18:32, Marcin Slusarz a écrit : Just run piglit. Even quick tests can cause ~5 lockups (it eventually messes up DDX channel, but this patchset can't fix this case). You can run fs-discard-exit-2 test first -

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-24 Thread Ben Skeggs
On Tue, 2012-04-24 at 21:31 +0200, Marcin Slusarz wrote: On Mon, Apr 23, 2012 at 06:56:44PM +0200, Martin Peres wrote: Le 23/04/2012 18:32, Marcin Slusarz a écrit : Just run piglit. Even quick tests can cause ~5 lockups (it eventually messes up DDX channel, but this patchset can't

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Marcin Slusarz
On Mon, Apr 23, 2012 at 06:46:41PM +0200, Martin Peres wrote: > Hey, > > Just a minor mistake spotted while skimming through the patch. > > Le 23/04/2012 00:18, Marcin Slusarz a écrit : > > +static inline uint64_t nv_timeout(struct drm_device *dev) > > +{ > > + uint64_t tm = 20ULL; > >

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Martin Peres
Le 23/04/2012 18:32, Marcin Slusarz a écrit : > > Just run piglit. Even "quick" tests can cause ~5 lockups (it eventually messes > up DDX channel, but this patchset can't fix this case). > You can run fs-discard-exit-2 test first - for me it causes instant GPU > lockup. > > Marcin Great, Thanks.

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Martin Peres
Hey, Just a minor mistake spotted while skimming through the patch. Le 23/04/2012 00:18, Marcin Slusarz a écrit : > +static inline uint64_t nv_timeout(struct drm_device *dev) > +{ > + uint64_t tm = 20ULL; > + if (nouveau_gpu_reset_in_progress(dev)) > + tm /= 40; /*

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Marcin Slusarz
On Mon, Apr 23, 2012 at 10:43:08AM +0200, Martin Peres wrote: > Le 23/04/2012 00:18, Marcin Slusarz a écrit : > > Overall idea: > > Detect lockups by watching for timeouts (vm flush / fence), return -EIOs, > > handle them at ioctl level, reset the GPU and repeat last ioctl. > > > > GPU reset is

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Martin Peres
Le 23/04/2012 00:18, Marcin Slusarz a écrit : > Overall idea: > Detect lockups by watching for timeouts (vm flush / fence), return -EIOs, > handle them at ioctl level, reset the GPU and repeat last ioctl. > > GPU reset is done by doing suspend / resume cycle with few tweaks: > - CPU-only bo

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Marcin Slusarz
Overall idea: Detect lockups by watching for timeouts (vm flush / fence), return -EIOs, handle them at ioctl level, reset the GPU and repeat last ioctl. GPU reset is done by doing suspend / resume cycle with few tweaks: - CPU-only bo eviction - ignoring vm flush / fence timeouts - shortening

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Martin Peres
Le 23/04/2012 00:18, Marcin Slusarz a écrit : Overall idea: Detect lockups by watching for timeouts (vm flush / fence), return -EIOs, handle them at ioctl level, reset the GPU and repeat last ioctl. GPU reset is done by doing suspend / resume cycle with few tweaks: - CPU-only bo eviction -

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Marcin Slusarz
On Mon, Apr 23, 2012 at 10:43:08AM +0200, Martin Peres wrote: Le 23/04/2012 00:18, Marcin Slusarz a écrit : Overall idea: Detect lockups by watching for timeouts (vm flush / fence), return -EIOs, handle them at ioctl level, reset the GPU and repeat last ioctl. GPU reset is done by doing

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Martin Peres
Hey, Just a minor mistake spotted while skimming through the patch. Le 23/04/2012 00:18, Marcin Slusarz a écrit : +static inline uint64_t nv_timeout(struct drm_device *dev) +{ + uint64_t tm = 20ULL; + if (nouveau_gpu_reset_in_progress(dev)) + tm /= 40; /* 50ms

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Martin Peres
Le 23/04/2012 18:32, Marcin Slusarz a écrit : Just run piglit. Even quick tests can cause ~5 lockups (it eventually messes up DDX channel, but this patchset can't fix this case). You can run fs-discard-exit-2 test first - for me it causes instant GPU lockup. Marcin Great, Thanks. Did you

Re: [RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-23 Thread Marcin Slusarz
On Mon, Apr 23, 2012 at 06:46:41PM +0200, Martin Peres wrote: Hey, Just a minor mistake spotted while skimming through the patch. Le 23/04/2012 00:18, Marcin Slusarz a écrit : +static inline uint64_t nv_timeout(struct drm_device *dev) +{ + uint64_t tm = 20ULL; + if

[RFC PATCH 5/5] drm/nouveau: gpu lockup recovery

2012-04-22 Thread Marcin Slusarz
Overall idea: Detect lockups by watching for timeouts (vm flush / fence), return -EIOs, handle them at ioctl level, reset the GPU and repeat last ioctl. GPU reset is done by doing suspend / resume cycle with few tweaks: - CPU-only bo eviction - ignoring vm flush / fence timeouts - shortening
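
The "overall idea" quoted above (a timeout surfaces as -EIO, the ioctl layer resets the GPU and repeats the last ioctl) can be illustrated with a small userspace sketch. This is not code from the patch: every name below (fake_gpu, fake_fence_wait, fake_gpu_reset, fake_ioctl) is hypothetical, the lockup is simulated with a flag, and the retry cap is an assumption of the sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for device state; not the real nouveau structures. */
struct fake_gpu {
    bool locked_up;   /* true while the simulated GPU is hung */
    int  resets;      /* number of resets performed so far */
};

/* Simulated fence wait: a hung GPU never signals, so the wait times out
 * and the timeout is reported as -EIO. */
static int fake_fence_wait(struct fake_gpu *gpu)
{
    return gpu->locked_up ? -EIO : 0;
}

/* Simulated reset; the thread describes doing this with a
 * suspend / resume cycle, which is not modelled here. */
static void fake_gpu_reset(struct fake_gpu *gpu)
{
    gpu->locked_up = false;
    gpu->resets++;
}

/* The work an ioctl would perform; here it only waits on a fence. */
static int fake_ioctl_body(struct fake_gpu *gpu)
{
    return fake_fence_wait(gpu);
}

/* ioctl-level wrapper: on -EIO, reset the GPU and repeat the ioctl.
 * The retry cap is an assumption of this sketch, not taken from the patch. */
static int fake_ioctl(struct fake_gpu *gpu, int max_retries)
{
    int ret = fake_ioctl_body(gpu);

    while (ret == -EIO && max_retries-- > 0) {
        fake_gpu_reset(gpu);
        ret = fake_ioctl_body(gpu);
    }
    return ret;
}
```

Handling the error in one wrapper keeps the individual wait paths simple: they only report -EIO, and the decision to reset and retry lives in a single place.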