[Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up

2017-11-21 Thread Karol Herbst
This should make systems more stable where resuming the GPU fails. This can happen due to bad firmware or due to a bug within the kernel. The last thing which should happen in either case is an unusable system.

v2: do the same in nouveau_pmops_resume

Tested-by: Karl Hastings
Signed-off-by: Karol
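The idea in the patch description can be illustrated with a small stand-alone sketch. This is not the nouveau code: `struct fake_gpu`, `fake_gpu_wake` and `fake_gpu_resume` are hypothetical names invented here, and the only point shown is that the resume path returns the error as soon as the wake-up step fails instead of carrying on with later steps.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GPU device; not a nouveau structure. */
struct fake_gpu {
    bool wakes_up;        /* whether the simulated hardware responds */
    bool accel_resumed;   /* set once the later resume steps have run */
    bool display_resumed;
};

/* Simulated wake-up: 0 on success, negative errno on failure. */
int fake_gpu_wake(struct fake_gpu *gpu)
{
    return gpu->wakes_up ? 0 : -EIO;
}

/* Resume path: bail out immediately if the GPU could not be woken up,
 * so no later step touches hardware that is not responding. */
int fake_gpu_resume(struct fake_gpu *gpu)
{
    int ret = fake_gpu_wake(gpu);

    if (ret)
        return ret;

    gpu->accel_resumed = true;
    gpu->display_resumed = true;
    return 0;
}
```

A caller would be expected to propagate the negative return value rather than treat the device as usable.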

Re: [Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up

2017-11-21 Thread Thierry Reding
On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:
> This should make systems more stable where resuming the GPU fails. This
> can happen due to bad firmware or due to a bug within the kernel. The
> last thing which should happen in either case is an unusable system.
>
> v2: do the same

Re: [Nouveau] GP10B regression

2017-11-21 Thread Mikko Perttunen
&gp100_vmm_desc_12[0], NVKM_VMM_PAGE_SxHx },
{}
}

on top of next-20171121 works at least for a simple test.

Mikko

On 11/11/2017 03:02 PM, Mikko Perttunen wrote:
Bisection status report:
The latest commit I have gotten to work is
10842ba074e9 drm/nouveau: remove unused nouveau_fence_wo

Re: [Nouveau] GP10B regression

2017-11-21 Thread Ilia Mirkin
&gp100_vmm_desc_16[0], NVKM_VMM_PAGE_SxHC },*/
> { 12, &gp100_vmm_desc_12[0], NVKM_VMM_PAGE_SxHx },
> {}
> }
>
> on top of next-20171121 works at least for a simple test.

You're on your own for this one :) I think Ben's on vacation until next

Re: [Nouveau] [PATCH v2] drm: don't continue with anything after the GPU couldn't be woken up

2017-11-21 Thread Karol Herbst
On Tue, Nov 21, 2017 at 6:46 PM, Thierry Reding wrote:
> On Tue, Nov 21, 2017 at 04:01:16PM +0100, Karol Herbst wrote:
>> This should make systems more stable where resuming the GPU fails. This
>> can happen due to bad firmware or due to a bug within the kernel. The
>> last thing which should happ

[Nouveau] [Bug 93629] [NVE6] complete system freeze, PGRAPH engine fault on channel 2, SCHED_ERROR [ CTXSW_TIMEOUT ]

2017-11-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93629

--- Comment #40 from t-IX ---
Linux 4.12.12-gentoo x86_64 Gentoo
Nov 22 00:51:36 kernel: nouveau :01:00.0: gr: TRAP ch 5 [003fb72000 X[12943]]
Nov 22 00:51:36 kernel: nouveau :01:00.0: gr: GPC0/PROP trap: 0400 [RT_LINEAR_MISMATCH] x

[Nouveau] [Bug 101665] lspci blocks forever with a GP107M

2017-11-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=101665

--- Comment #5 from Etienne URBAH ---
With the following Linux kernels, 'lspci' systematically fails to answer and makes the whole machine freeze immediately:
- 4.13.0-17 from Ubuntu 17.10 (Artful)
- 4.14.1 from http://kernel.ubuntu.com/~kernel-pp

[Nouveau] [Bug 93629] [NVE6] complete system freeze, PGRAPH engine fault on channel 2, SCHED_ERROR [ CTXSW_TIMEOUT ]

2017-11-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93629

--- Comment #41 from Pierre Moreau ---
t-IX and Dustin, you are experiencing a different bug: the current bug report is about a context switch timing out on GK106/GK107 (Kepler architecture), whereas you are using different chipsets (GF106/GF1

Re: [Nouveau] [PATCH 01/32] bios/vpstate: There are some fermi vbios with no boost or tdp entry

2017-11-21 Thread Martin Peres
On 17/11/17 02:04, Karol Herbst wrote:

Please add here something like this:

If the entry size is too small, default to invalid values for both boost_id and tdp_id, so as to default to the base clock in both cases.

> Signed-off-by: Karol Herbst

With the better commit message, this patch is: Sig
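The behaviour described in the suggested commit message can be sketched in a few lines of stand-alone C. Everything here is an assumption made for illustration: the byte offsets, the `0xff` "invalid" marker, and the names `fake_vpstate`, `fake_read_id`, `fake_vpstate_parse` and `fake_effective_id` are hypothetical and are not taken from the nouveau VBIOS parser.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed marker for "no such entry"; the real value is not given here. */
#define FAKE_ID_INVALID 0xff

/* Assumed byte offsets inside a table entry; purely illustrative. */
#define FAKE_OFF_BASE  0
#define FAKE_OFF_BOOST 1
#define FAKE_OFF_TDP   2

struct fake_vpstate {
    uint8_t base_id;
    uint8_t boost_id;
    uint8_t tdp_id;
};

/* Read one id if the entry is long enough to contain it,
 * otherwise report the invalid marker. */
uint8_t fake_read_id(const uint8_t *entry, size_t len, size_t off)
{
    return off < len ? entry[off] : FAKE_ID_INVALID;
}

/* Parse an entry of 'len' bytes; fields beyond the end become invalid. */
void fake_vpstate_parse(const uint8_t *entry, size_t len,
                        struct fake_vpstate *out)
{
    out->base_id  = fake_read_id(entry, len, FAKE_OFF_BASE);
    out->boost_id = fake_read_id(entry, len, FAKE_OFF_BOOST);
    out->tdp_id   = fake_read_id(entry, len, FAKE_OFF_TDP);
}

/* An invalid boost or tdp id falls back to the base clock entry. */
uint8_t fake_effective_id(const struct fake_vpstate *vp, uint8_t wanted)
{
    return wanted == FAKE_ID_INVALID ? vp->base_id : wanted;
}
```

With a one-byte entry, both `boost_id` and `tdp_id` come out invalid and `fake_effective_id` resolves either of them to the base entry.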

Re: [Nouveau] [PATCH 02/32] debugfs: Wake up GPU before doing any reclocking

2017-11-21 Thread Martin Peres
On 17/11/17 02:04, Karol Herbst wrote:
> Fixes various reclocking related issues on prime systems.

Is that the only place that was not covered? Could you check if other places would need this code too?

In any case, this patch is (assuming you are calling the right functions to prevent the GPU fro

Re: [Nouveau] [PATCH 03/32] therm: Split return code and value in nvkm_get_temp

2017-11-21 Thread Martin Peres
On 17/11/17 02:04, Karol Herbst wrote:
> The current hwmon code doesn't check if the returned value was actually an
> error.
>
> Since Kepler temperature sensors are able to report negative values. Those
> negative values are not for error reporting, but rather when you buried
> your GPU in snow s
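The problem the quoted commit message points at is that a single signed return value cannot distinguish "the sensor read failed" from "the temperature is below zero". A minimal sketch of the split follows; `struct fake_sensor` and `fake_get_temp` are hypothetical names used only for illustration and do not reproduce the signature of nvkm_get_temp.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sensor model; not a nouveau structure. */
struct fake_sensor {
    bool present;     /* false simulates a missing or unreadable sensor */
    int  temp_c;      /* may legitimately be negative */
};

/* Return 0 on success or a negative errno on failure.
 * The temperature itself travels through the 'temp' out-parameter,
 * so a negative reading is never mistaken for an error code. */
int fake_get_temp(const struct fake_sensor *sensor, int *temp)
{
    if (sensor == NULL || temp == NULL)
        return -EINVAL;
    if (!sensor->present)
        return -ENODEV;

    *temp = sensor->temp_c;
    return 0;
}
```

A caller such as a hwmon read handler would check the return code first and only then report the value stored in `temp`.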

Re: [Nouveau] Addressing the problem of noisy GPUs under Nouveau

2017-11-21 Thread Andy Ritger
Hi Martin,

I was asked to clarify a few things:

(1) Are all the user reports of loud fans on Fermi-era GPUs?

(2) When the VBIOS POSTs the card, it loads initial ucode onto the Falcon processor (PMU), which will do basic fan management on its own. We call this init ucode "IFR" (Init From ROM).

Re: [Nouveau] [PATCH 01/32] bios/vpstate: There are some fermi vbios with no boost or tdp entry

2017-11-21 Thread Karol Herbst
On Wed, Nov 22, 2017 at 1:16 AM, Martin Peres wrote:
> On 17/11/17 02:04, Karol Herbst wrote:
>
> Please add here something like this:
>
> If the entry size is too small, default to invalid values for both
> boost_id and tdp_id, so as to default to the base clock in both cases.
>

well actually we

Re: [Nouveau] [PATCH 02/32] debugfs: Wake up GPU before doing any reclocking

2017-11-21 Thread Karol Herbst
On Wed, Nov 22, 2017 at 1:21 AM, Martin Peres wrote:
> On 17/11/17 02:04, Karol Herbst wrote:
>> Fixes various reclocking related issues on prime systems.
>
> Is that the only place that was not covered? Could you check if other
> places would need this code too?
>

I had quite a discussion with B

Re: [Nouveau] [PATCH 03/32] therm: Split return code and value in nvkm_get_temp

2017-11-21 Thread Karol Herbst
On Wed, Nov 22, 2017 at 1:32 AM, Martin Peres wrote:
> On 17/11/17 02:04, Karol Herbst wrote:
>> The current hwmon code doesn't check if the returned value was actually an
>> error.
>>
>> Since Kepler temperature sensors are able to report negative values. Those
>> negative values are not for erro

Re: [Nouveau] Addressing the problem of noisy GPUs under Nouveau

2017-11-21 Thread Ilia Mirkin
On Tue, Nov 21, 2017 at 8:29 PM, Andy Ritger wrote:
> Hi Martin,

Martin should have complete answers,

>
> I was asked to clarify a few things:
>
> (1) Are all the user reports of loud fans on Fermi-era GPUs?

Yes. Although I believe some GK208 users are also having trouble, including yours truly

Re: [Nouveau] Addressing the problem of noisy GPUs under Nouveau

2017-11-21 Thread Karol Herbst
On Wed, Nov 22, 2017 at 3:06 AM, Ilia Mirkin wrote:
> On Tue, Nov 21, 2017 at 8:29 PM, Andy Ritger wrote:
>> Hi Martin,
>
> Martin should have complete answers,
>
>>
>> I was asked to clarify a few things:
>>
>> (1) Are all the user reports of loud fans on Fermi-era GPUs?
>
> Yes. Although I beli