subject:"\[Bug 65761\] HD 7970M Hybrid \- hangs and errors and rmmod causes crash"

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2015-01-04 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #57 from Christoph Haag  ---
Long time, no update from me.

Using 3.19-rc2:

rmmod radeon works fine
modprobe radeon after rmmod stalls once:

[   53.031311] radeon :01:00.0: ring 5 stalled for more than 1msec
[   53.032230] radeon :01:00.0: GPU lockup (current fence id
0x last fence id 0x0002 on ring 5)
[   53.032322] [drm:uvd_v1_0_ib_test [radeon]] *ERROR* radeon: fence wait
failed (-35).
[   53.033262] [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed
testing IB on ring 5 (-35).
[   53.037997] [drm] Radeon Display Connectors
[   53.041287] radeon :01:00.0: No connectors reported connected with modes
[   53.041289] [drm] Cannot find any crtc or sizes - going 1024x768
[   53.041834] [drm] fb mappable at 0xE0479000
[   53.041835] [drm] vram apper at 0xE000
[   53.041835] [drm] size 3145728
[   53.041836] [drm] fb depth is 24
[   53.041837] [drm]pitch is 4096
[   53.041941] radeon :01:00.0: fb1: radeondrmfb frame buffer device
[   53.042347] [drm] Initialized radeon 2.40.0 20080528 for :01:00.0 on
minor 1

but it recovers and then works.

So rmmod radeon and modprobe radeon do not cause any critical errors anymore,
it only is a bit annoying that modprobe radeon takes 10 seconds.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-04-01 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Mike Cloaked  changed:

   What|Removed |Added

 CC||mike.cloaked at gmail.com

--- Comment #56 from Mike Cloaked  ---
It is possible that this report is directly related:

https://bugzilla.kernel.org/show_bug.cgi?id=73291

So I guess at this time the bug is still awaiting the mesa fix to test further
in arch linux at least?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-03-03 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #55 from Christoph Haag  ---
(In reply to saadnaji89 from comment #54)

> Are we going to see this patch applied to fix the  problem in future kernel
> veriosn  ?. I don't know whether you work for AMD or just someone is
> contributing to fix the problem

I wondered whether to reply, but now...

He works for AMD. It's even on Wikipedia.

This patch is not to the kernel, it is to mesa.

It has been in mesa for 14 days:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=cf0172d46ab940a691da6516057c81f28961482f

(Is it necessary to keep talking about already fixed stuff?)



I'd say the only reason this is not closed yet is because of the radeon module
unloading problems. Thinking of it, I haven't tried in a while.. Should I try
to get some more information for it or is the delay just until someone has time
to look at it?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-03-03 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #54 from saadnaji89 at gmail.com ---
(In reply to Michel D?nzer from comment #53)
> (In reply to saadnaji89 from comment #51)
> > I am runnig Manjaro (Arch based distro 64 bit) with Kernel 3.13.5 as right
> > now and I am having the same error except that I am geting repetitive blocks
> > in the kernel log like this ever certain amount of time:
> 
> See comment #35.
> 
> 
> > and this is causing freeze for my laptop for 2-3 seconds, as well as causing
> > the temperature of the gpu to raise up by 10 C .
> 
> You probably need my Mesa patch to prevent the GPU from powering up
> needlessly: https://bugzilla.kernel.org/attachment.cgi?id=126011

Thanks you.

Are we going to see this patch applied to fix the  problem in future kernel
veriosn  ?. I don't know whether you work for AMD or just someone is
contributing to fix the problem

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-03-03 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #53 from Michel D?nzer  ---
(In reply to saadnaji89 from comment #51)
> I am runnig Manjaro (Arch based distro 64 bit) with Kernel 3.13.5 as right
> now and I am having the same error except that I am geting repetitive blocks
> in the kernel log like this ever certain amount of time:

See comment #35.


> and this is causing freeze for my laptop for 2-3 seconds, as well as causing
> the temperature of the gpu to raise up by 10 C .

You probably need my Mesa patch to prevent the GPU from powering up needlessly:
https://bugzilla.kernel.org/attachment.cgi?id=126011

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-03-03 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #52 from saadnaji89 at gmail.com ---
Created attachment 127801
  --> https://bugzilla.kernel.org/attachment.cgi?id=127801=edit
kenrel log

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-03-03 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

saadnaji89 at gmail.com changed:

   What|Removed |Added

 CC||saadnaji89 at gmail.com

--- Comment #51 from saadnaji89 at gmail.com ---
Hello,

I am runnig Manjaro (Arch based distro 64 bit) with Kernel 3.13.5 as right now
and I am having the same error except that I am geting repetitive blocks in the
kernel log like this ever certain amount of time:

03/02/14 06:59:12 PM[drm] ring test on 0 succeeded in 2 usecs
03/02/14 06:59:12 PM[drm] ring test on 1 succeeded in 1 usecs
03/02/14 06:59:12 PM[drm] ring test on 2 succeeded in 1 usecs
03/02/14 06:59:12 PM[drm] ring test on 3 succeeded in 2 usecs
03/02/14 06:59:12 PM[drm] ring test on 4 succeeded in 1 usecs
03/02/14 06:59:12 PM[drm] ring test on 5 succeeded in 2 usecs
03/02/14 06:59:12 PM[drm] UVD initialized successfully.
03/02/14 06:59:12 PM[drm] Enabling audio 0 support
03/02/14 06:59:12 PM[drm] Enabling audio 1 support
03/02/14 06:59:12 PM[drm] Enabling audio 2 support
03/02/14 06:59:12 PM[drm] Enabling audio 3 support
03/02/14 06:59:12 PM[drm] Enabling audio 4 support
03/02/14 06:59:12 PM[drm] Enabling audio 5 support
03/02/14 06:59:12 PM[drm] ib test on ring 0 succeeded in 0 usecs
03/02/14 06:59:12 PM[drm] ib test on ring 1 succeeded in 0 usecs
03/02/14 06:59:12 PM[drm] ib test on ring 2 succeeded in 0 usecs
03/02/14 06:59:12 PM[drm] ib test on ring 3 succeeded in 0 usecs
03/02/14 06:59:12 PM[drm] ib test on ring 4 succeeded in 0 usecs
03/02/14 06:59:12 PM[drm] ib test on ring 5 succeeded
03/02/14 06:59:48 PM[drm] Disabling audio 0 support
03/02/14 06:59:48 PM[drm] Disabling audio 1 support
03/02/14 06:59:48 PM[drm] Disabling audio 2 support
03/02/14 06:59:48 PM[drm] Disabling audio 3 support
03/02/14 06:59:48 PM[drm] Disabling audio 4 support
03/02/14 06:59:48 PM[drm] Disabling audio 5 support

and this is causing freeze for my laptop for 2-3 seconds, as well as causing
the temperature of the gpu to raise up by 10 C . This bug was not present in
prior 3.13 kernel

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-19 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #50 from Joshua M. Thompson  ---
On Fedora Rawhide kernel 3.14.0-0.rc3.git0.2.fc21.x86_64 still doesn't fix this
(maybe it doesn't have the necessary patch yet?)

With radeon.runpm=1 I get the "UVD not responding" messages, the machine locks
up  frequently for 5-10 seconds to reinit the radeon, and the radeon
performance is horrible (~400 fps in glxgears).

With radeon.runpm=0 I don't get "UVD not responding", the lockups are gone, and
glxgears is at 3650 fps (still less than the 6-7k fps I got under 3.12 though.)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-19 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #49 from Alex Deucher  ---
(In reply to Hohahiu from comment #47)
> [   41.546747] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to
> reset the VCPU!!!
> [   41.566826] [drm:uvd_v1_0_start] *ERROR* UVD not responding, giving up!!!
> [   41.566879] [drm:si_startup] *ERROR* radeon: failed initializing UVD (-1).
> 
> Is it related to this issue or do you want me to open new bug report?

Please open a different bug for this.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-19 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #48 from Hohahiu  ---
Created attachment 126711
  --> https://bugzilla.kernel.org/attachment.cgi?id=126711=edit
dmesg

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-19 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #47 from Hohahiu  ---
The problem is fixed now in kernel 3.14-rc3. Truly speaking the laptop now is
very cold compared to what it was before. So thank you very much for your work,
Alex and Michel!
However whenever my dGPU is powered on there are following error messages:
[   32.333940] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   33.357599] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   34.381217] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   35.404882] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   36.428483] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   37.452137] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   38.475777] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   39.499447] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   40.523078] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   41.546747] [drm:uvd_v1_0_start] *ERROR* UVD not responding, trying to reset
the VCPU!!!
[   41.566826] [drm:uvd_v1_0_start] *ERROR* UVD not responding, giving up!!!
[   41.566879] [drm:si_startup] *ERROR* radeon: failed initializing UVD (-1).

Is it related to this issue or do you want me to open new bug report?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-17 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #46 from Alex Deucher  ---
Created attachment 126451
  --> https://bugzilla.kernel.org/attachment.cgi?id=126451=edit
report 0 for temp when dGPU is powered off

(In reply to Christoph Haag from comment #45)
> Yes, this is working pretty good, haven't noticed anything bad with it.
> 
> Even suspending works great, manual vgaswitcheroo was always a bit buggy
> after waking up, but this is working great.
> 
> Maybe a minor thing: When the GPU is off lm_sensors displays
> radeon-pci-0100
> Adapter: PCI adapter
> temp1:   +511.0?C
> Should maybe be 0 or -1 or the last detected temperature or something, 511?C
> could throw some tools off.
> 

Attached patch should report 0.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-15 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #45 from Christoph Haag  ---
Yes, this is working pretty good, haven't noticed anything bad with it.

Even suspending works great, manual vgaswitcheroo was always a bit buggy after
waking up, but this is working great.

Maybe a minor thing: When the GPU is off lm_sensors displays
radeon-pci-0100
Adapter: PCI adapter
temp1:   +511.0?C
Should maybe be 0 or -1 or the last detected temperature or something, 511?C
could throw some tools off.



Anyway, in comment #42 I was not right, the glx compositing black window issue
was not fixed, but it certainly had an influence on it. Might be worth checking
it out.



It's starting to work better than fglrx now, so, very nice.

Now the only issue I have is unloading the radeon kernel module. But
fortunately you don't really have to do that during normal usage so it's not a
priority for anyone I guess.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-14 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Michel D?nzer  changed:

   What|Removed |Added

 Attachment #125821|0   |1
is obsolete||

--- Comment #44 from Michel D?nzer  ---
Created attachment 126011
  --> https://bugzilla.kernel.org/attachment.cgi?id=126011=edit
r600g,radeonsi: Consolidate logic for short-circuiting flushes

This patch doesn't break OpenCL for me.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #43 from Christoph Haag  ---
But it silently breaks opencl.

Before the patch:

opencl-example (git)-[master] % sudo ./hello_world
./hello_world: /usr/lib/libOpenCL.so.1: no version information available
(required by ./hello_world)
There are 1 platforms.
There are 1 GPU devices.
clCreateContext() succeeded.
clCreateCommandQueue() succeeded.
clCreateProgramWithSource() suceeded.
clBuildProgram() suceeded.
clCreateKernel() suceeded.
clCreateBuffer() succeeded.
clSetKernelArg() succeeded.
clEnqueueNDRangeKernel() suceeded.
clFinish() succeeded.
clEnqueueReadBuffer() suceeded.
pi = 3.141590

After the patch:

opencl-example (git)-[master] % sudo ./hello_world
./hello_world: /usr/lib/libOpenCL.so.1: no version information available
(required by ./hello_world)
There are 1 platforms.
There are 1 GPU devices.
clCreateContext() succeeded.
clCreateCommandQueue() succeeded.
clCreateProgramWithSource() suceeded.
clBuildProgram() suceeded.
clCreateKernel() suceeded.
clCreateBuffer() succeeded.
clSetKernelArg() succeeded.
clEnqueueNDRangeKernel() suceeded.
clFinish() succeeded.
clEnqueueReadBuffer() suceeded.
pi = -nan

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #42 from Christoph Haag  ---
0:DIS: :DynOff::01:00.0
1:IGD:+:Pwr::00:02.0

Nice. It now powers off properly in X.

I haven't extensively tried but with the latest mesa git and this patch this
bug seems to have vanished too:
https://bugs.freedesktop.org/show_bug.cgi?id=69101
https://bugs.kde.org/show_bug.cgi?id=324864

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-13 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #41 from Michel D?nzer  ---
Created attachment 125821
  --> https://bugzilla.kernel.org/attachment.cgi?id=125821=edit
radeonsi: Short-circuit flushes with no preceding draw calls

Does this Mesa patch help?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-12 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #40 from Christoph Haag  ---
Created attachment 125701
  --> https://bugzilla.kernel.org/attachment.cgi?id=125701=edit
radeon_cs_emit did not trigger, but ProcGetInputFocus causes some activity

(In reply to Michel D?nzer from comment #39)
> (In reply to Christoph Haag from comment #33)
> > 
> > I'm still kind of searching for a tool that could create a call graph and
> > annotate it over time...
> 
> I suspect profiling is just not the right tool for this.

Yes, what I actually wanted was a tool that traces every single function call
and outputs the call chain for every single function call but there doesn't
seem to be much in that regard except gdb scripting or long abandoned tools.

> While the Radeon card is supposed to be off, can you attach gdb to the X
> server process and set a breakpoint on radeon_cs_emit, and if that triggers,
> attach the output of bt full?

It does not trigger. On the other hand it also does not trigger when running a
3d application like xonotic on the radeon gpu. So here's another maybe useless
gdb log (radeon_cs_emit was Breakpoint 1)...

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-12 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #39 from Michel D?nzer  ---
(In reply to Christoph Haag from comment #33)
> 
> I'm still kind of searching for a tool that could create a call graph and
> annotate it over time...

I suspect profiling is just not the right tool for this.

While the Radeon card is supposed to be off, can you attach gdb to the X server
process and set a breakpoint on radeon_cs_emit, and if that triggers, attach
the output of bt full?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-11 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Jani Nikula  changed:

   What|Removed |Added

 CC||michael at mijobe.de

--- Comment #38 from Jani Nikula  ---
*** Bug 69291 has been marked as a duplicate of this bug. ***

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-11 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Joshua M. Thompson  changed:

   What|Removed |Added

 CC||joshua.thompson at gmail.com

--- Comment #37 from Joshua M. Thompson  ---
I'd just like to chime in, since I've been having similar runpm problems with
both 3.13 and 3.14 kernels on my laptop with built-in Intel HD4000 and a Radeon
7730M. The dGPU is constantly suspending and resuming every few minutes, even
though nothing is using it (although xrandr offload has been configured). This
is accompanied by the burst of kernel messages mentioned above, as well as
several seconds where the mouse cursor moves but X is otherwise unresponsive.
If I let the laptop sit idle for a while and come back it'll be frozen with a
blank screen, presumably because it either froze up blanking the screen or
shortly after.

Running with radeon.runpm=0 eliminates the freezes and the lockups, but it
keeps the laptop running a lot hotter than I'd like.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #36 from Hohahiu  ---
(In reply to Alex Deucher from comment #35)
> (In reply to Christoph Haag from comment #33)
> > But I have an awful lot of this block in dmesg, I guess from every time
> > starting X:
> 
> Those are the messages from the driver each time the card is powered off or
> on.
> 
> > 
> > [Feb 6 13:39] [drm] Disabling audio 0 support
> > [  +0,04] [drm] Disabling audio 1 support
> > [  +0,02] [drm] Disabling audio 2 support
> > [  +0,01] [drm] Disabling audio 3 support
> > [  +0,01] [drm] Disabling audio 4 support
> > [  +0,02] [drm] Disabling audio 5 support
> 
> Card is powered off.
> 
> 
> > [  +1,257992] [drm] probing gen 2 caps for device 8086:151 = 261ad01/e
> > [  +0,05] [drm] PCIE gen 3 link speeds already enabled
> > [  +0,004127] [drm] PCIE GART of 1024M enabled (table at 
> > 0x00478000).
> > [  +0,000102] radeon :01:00.0: WB enabled
> > [  +0,04] radeon :01:00.0: fence driver on ring 0 use gpu addr
> > 0x8c00 and cpu addr 0x8807fd9f8c00
> > [  +0,03] radeon :01:00.0: fence driver on ring 1 use gpu addr
> > 0x8c04 and cpu addr 0x8807fd9f8c04
> > [  +0,02] radeon :01:00.0: fence driver on ring 2 use gpu addr
> > 0x8c08 and cpu addr 0x8807fd9f8c08
> > [  +0,02] radeon :01:00.0: fence driver on ring 3 use gpu addr
> > 0x8c0c and cpu addr 0x8807fd9f8c0c
> > [  +0,02] radeon :01:00.0: fence driver on ring 4 use gpu addr
> > 0x8c10 and cpu addr 0x8807fd9f8c10
> > [  +0,000390] radeon :01:00.0: fence driver on ring 5 use gpu addr
> > 0x00075a18 and cpu addr 0xc90008b35a18
> > [  +0,144839] [drm] ring test on 0 succeeded in 2 usecs
> > [  +0,04] [drm] ring test on 1 succeeded in 1 usecs
> > [  +0,03] [drm] ring test on 2 succeeded in 1 usecs
> > [  +0,59] [drm] ring test on 3 succeeded in 2 usecs
> > [  +0,07] [drm] ring test on 4 succeeded in 1 usecs
> > [  +0,187260] [drm] ring test on 5 succeeded in 2 usecs
> > [  +0,04] [drm] UVD initialized successfully.
> > [  +0,001745] [drm] Enabling audio 0 support
> > [  +0,01] [drm] Enabling audio 1 support
> > [  +0,00] [drm] Enabling audio 2 support
> > [  +0,01] [drm] Enabling audio 3 support
> > [  +0,01] [drm] Enabling audio 4 support
> > [  +0,01] [drm] Enabling audio 5 support
> > [  +0,35] [drm] ib test on ring 0 succeeded in 0 usecs
> > [  +0,29] [drm] ib test on ring 1 succeeded in 0 usecs
> > [  +0,29] [drm] ib test on ring 2 succeeded in 0 usecs
> > [  +0,19] [drm] ib test on ring 3 succeeded in 0 usecs
> > [  +0,18] [drm] ib test on ring 4 succeeded in 0 usecs
> > [  +0,158619] [drm] ib test on ring 5 succeeded
> 
> Card is powered back on.

Thanks for explanation. In my case sometimes the card is powered on and off a
lot of times without any explicit use.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #35 from Alex Deucher  ---
(In reply to Christoph Haag from comment #33)
> But I have an awful lot of this block in dmesg, I guess from every time
> starting X:

Those are the messages from the driver each time the card is powered off or on.

> 
> [Feb 6 13:39] [drm] Disabling audio 0 support
> [  +0,04] [drm] Disabling audio 1 support
> [  +0,02] [drm] Disabling audio 2 support
> [  +0,01] [drm] Disabling audio 3 support
> [  +0,01] [drm] Disabling audio 4 support
> [  +0,02] [drm] Disabling audio 5 support

Card is powered off.


> [  +1,257992] [drm] probing gen 2 caps for device 8086:151 = 261ad01/e
> [  +0,05] [drm] PCIE gen 3 link speeds already enabled
> [  +0,004127] [drm] PCIE GART of 1024M enabled (table at 0x00478000).
> [  +0,000102] radeon :01:00.0: WB enabled
> [  +0,04] radeon :01:00.0: fence driver on ring 0 use gpu addr
> 0x8c00 and cpu addr 0x8807fd9f8c00
> [  +0,03] radeon :01:00.0: fence driver on ring 1 use gpu addr
> 0x8c04 and cpu addr 0x8807fd9f8c04
> [  +0,02] radeon :01:00.0: fence driver on ring 2 use gpu addr
> 0x8c08 and cpu addr 0x8807fd9f8c08
> [  +0,02] radeon :01:00.0: fence driver on ring 3 use gpu addr
> 0x8c0c and cpu addr 0x8807fd9f8c0c
> [  +0,02] radeon :01:00.0: fence driver on ring 4 use gpu addr
> 0x8c10 and cpu addr 0x8807fd9f8c10
> [  +0,000390] radeon :01:00.0: fence driver on ring 5 use gpu addr
> 0x00075a18 and cpu addr 0xc90008b35a18
> [  +0,144839] [drm] ring test on 0 succeeded in 2 usecs
> [  +0,04] [drm] ring test on 1 succeeded in 1 usecs
> [  +0,03] [drm] ring test on 2 succeeded in 1 usecs
> [  +0,59] [drm] ring test on 3 succeeded in 2 usecs
> [  +0,07] [drm] ring test on 4 succeeded in 1 usecs
> [  +0,187260] [drm] ring test on 5 succeeded in 2 usecs
> [  +0,04] [drm] UVD initialized successfully.
> [  +0,001745] [drm] Enabling audio 0 support
> [  +0,01] [drm] Enabling audio 1 support
> [  +0,00] [drm] Enabling audio 2 support
> [  +0,01] [drm] Enabling audio 3 support
> [  +0,01] [drm] Enabling audio 4 support
> [  +0,01] [drm] Enabling audio 5 support
> [  +0,35] [drm] ib test on ring 0 succeeded in 0 usecs
> [  +0,29] [drm] ib test on ring 1 succeeded in 0 usecs
> [  +0,29] [drm] ib test on ring 2 succeeded in 0 usecs
> [  +0,19] [drm] ib test on ring 3 succeeded in 0 usecs
> [  +0,18] [drm] ib test on ring 4 succeeded in 0 usecs
> [  +0,158619] [drm] ib test on ring 5 succeeded

Card is powered back on.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #34 from Hohahiu  ---
(In reply to Christoph Haag from comment #33)
> Created attachment 125111 [details]
> possibly call chain that calls radeon stuff
> 
> @ Hohahiu
> 
> Do you mean those?
> [ 8077.648324] [drm:si_dpm_set_power_state] *ERROR* si_set_sw_state failed
> 
> No, I don't see those at all.
> 
> 
> 
> By the way, I'm using linux 3.14-rc1 now with this patch:
> https://bugzilla.kernel.org/attachment.cgi?id=124621=diff
> 
> But I have an awful lot of this block in dmesg, I guess from every time
> starting X:
> 
> [Feb 6 13:39] [drm] Disabling audio 0 support
> [  +0,04] [drm] Disabling audio 1 support
> [  +0,02] [drm] Disabling audio 2 support
> [  +0,01] [drm] Disabling audio 3 support
> [  +0,01] [drm] Disabling audio 4 support
> [  +0,02] [drm] Disabling audio 5 support
> [  +1,257992] [drm] probing gen 2 caps for device 8086:151 = 261ad01/e
> [  +0,05] [drm] PCIE gen 3 link speeds already enabled
> [  +0,004127] [drm] PCIE GART of 1024M enabled (table at 0x00478000).
> [  +0,000102] radeon :01:00.0: WB enabled
> [  +0,04] radeon :01:00.0: fence driver on ring 0 use gpu addr
> 0x8c00 and cpu addr 0x8807fd9f8c00
> [  +0,03] radeon :01:00.0: fence driver on ring 1 use gpu addr
> 0x8c04 and cpu addr 0x8807fd9f8c04
> [  +0,02] radeon :01:00.0: fence driver on ring 2 use gpu addr
> 0x8c08 and cpu addr 0x8807fd9f8c08
> [  +0,02] radeon :01:00.0: fence driver on ring 3 use gpu addr
> 0x8c0c and cpu addr 0x8807fd9f8c0c
> [  +0,02] radeon :01:00.0: fence driver on ring 4 use gpu addr
> 0x8c10 and cpu addr 0x8807fd9f8c10
> [  +0,000390] radeon :01:00.0: fence driver on ring 5 use gpu addr
> 0x00075a18 and cpu addr 0xc90008b35a18
> [  +0,144839] [drm] ring test on 0 succeeded in 2 usecs
> [  +0,04] [drm] ring test on 1 succeeded in 1 usecs
> [  +0,03] [drm] ring test on 2 succeeded in 1 usecs
> [  +0,59] [drm] ring test on 3 succeeded in 2 usecs
> [  +0,07] [drm] ring test on 4 succeeded in 1 usecs
> [  +0,187260] [drm] ring test on 5 succeeded in 2 usecs
> [  +0,04] [drm] UVD initialized successfully.
> [  +0,001745] [drm] Enabling audio 0 support
> [  +0,01] [drm] Enabling audio 1 support
> [  +0,00] [drm] Enabling audio 2 support
> [  +0,01] [drm] Enabling audio 3 support
> [  +0,01] [drm] Enabling audio 4 support
> [  +0,01] [drm] Enabling audio 5 support
> [  +0,35] [drm] ib test on ring 0 succeeded in 0 usecs
> [  +0,29] [drm] ib test on ring 1 succeeded in 0 usecs
> [  +0,29] [drm] ib test on ring 2 succeeded in 0 usecs
> [  +0,19] [drm] ib test on ring 3 succeeded in 0 usecs
> [  +0,18] [drm] ib test on ring 4 succeeded in 0 usecs
> [  +0,158619] [drm] ib test on ring 5 succeeded
> 
> But I wouldn't really thing that it's a problem. Just a bit bloat in the log.
> 
> 
> 
> Anyway, I have stared a bit longer at the callgraph and what I have attached
> seems suspicious I think. There's a whole lot of calls throughout the radeon
> driver but the kcachegrind gui is quite limited and the graph is very
> convoluted at this point so it's not clear to me whether this is during
> normal operation. I'm still kind of searching for a tool that could create a
> call graph and annotate it over time...
> Where this is coming from is in dix/dixutils.c line 719 ("(*(cbr->proc))
> (pcbl, cbr->data, call_data);") at least but to my eyes it looks like it
> could do anything and has a whole lot of callers that valgrind caught:
> 
> WriteToClient  (Xorg: io.c, ...)
> FlushAllOutput (Xorg: io.c, ...)
> XaceHook  (Xorg: xace.c, ...)
> XaceHookPropertyAccess (Xorg: xace.c, ...)
> XaceHookDispatch (Xorg: xace.c, ...)
> CloseDownConnection (Xorg: connection.c, ...)
> DeleteClientFromAnySelections (Xorg: selection.c, ...)
> CloseDownClient (Xorg: dispatch.c, ...)
> ProcSetSelectionOwner (Xorg: selection.c, ...)
> SendConnSetup (Xorg: dispatch.c, ...)
> NextAvailableClient (Xorg: dispatch.c, ...)
> 
> But maybe I'm just going in a totally wrong direction...

I meant these repeated messages not the error.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #33 from Christoph Haag  ---
Created attachment 125111
  --> https://bugzilla.kernel.org/attachment.cgi?id=125111=edit
possibly call chain that calls radeon stuff

@ Hohahiu

Do you mean those?
[ 8077.648324] [drm:si_dpm_set_power_state] *ERROR* si_set_sw_state failed

No, I don't see those at all.



By the way, I'm using linux 3.14-rc1 now with this patch:
https://bugzilla.kernel.org/attachment.cgi?id=124621=diff

But I have an awful lot of this block in dmesg, I guess from every time
starting X:

[Feb 6 13:39] [drm] Disabling audio 0 support
[  +0,04] [drm] Disabling audio 1 support
[  +0,02] [drm] Disabling audio 2 support
[  +0,01] [drm] Disabling audio 3 support
[  +0,01] [drm] Disabling audio 4 support
[  +0,02] [drm] Disabling audio 5 support
[  +1,257992] [drm] probing gen 2 caps for device 8086:151 = 261ad01/e
[  +0,05] [drm] PCIE gen 3 link speeds already enabled
[  +0,004127] [drm] PCIE GART of 1024M enabled (table at 0x00478000).
[  +0,000102] radeon :01:00.0: WB enabled
[  +0,04] radeon :01:00.0: fence driver on ring 0 use gpu addr
0x8c00 and cpu addr 0x8807fd9f8c00
[  +0,03] radeon :01:00.0: fence driver on ring 1 use gpu addr
0x8c04 and cpu addr 0x8807fd9f8c04
[  +0,02] radeon :01:00.0: fence driver on ring 2 use gpu addr
0x8c08 and cpu addr 0x8807fd9f8c08
[  +0,02] radeon :01:00.0: fence driver on ring 3 use gpu addr
0x8c0c and cpu addr 0x8807fd9f8c0c
[  +0,02] radeon :01:00.0: fence driver on ring 4 use gpu addr
0x8c10 and cpu addr 0x8807fd9f8c10
[  +0,000390] radeon :01:00.0: fence driver on ring 5 use gpu addr
0x00075a18 and cpu addr 0xc90008b35a18
[  +0,144839] [drm] ring test on 0 succeeded in 2 usecs
[  +0,04] [drm] ring test on 1 succeeded in 1 usecs
[  +0,03] [drm] ring test on 2 succeeded in 1 usecs
[  +0,59] [drm] ring test on 3 succeeded in 2 usecs
[  +0,07] [drm] ring test on 4 succeeded in 1 usecs
[  +0,187260] [drm] ring test on 5 succeeded in 2 usecs
[  +0,04] [drm] UVD initialized successfully.
[  +0,001745] [drm] Enabling audio 0 support
[  +0,01] [drm] Enabling audio 1 support
[  +0,00] [drm] Enabling audio 2 support
[  +0,01] [drm] Enabling audio 3 support
[  +0,01] [drm] Enabling audio 4 support
[  +0,01] [drm] Enabling audio 5 support
[  +0,35] [drm] ib test on ring 0 succeeded in 0 usecs
[  +0,29] [drm] ib test on ring 1 succeeded in 0 usecs
[  +0,29] [drm] ib test on ring 2 succeeded in 0 usecs
[  +0,19] [drm] ib test on ring 3 succeeded in 0 usecs
[  +0,18] [drm] ib test on ring 4 succeeded in 0 usecs
[  +0,158619] [drm] ib test on ring 5 succeeded

But I wouldn't really thing that it's a problem. Just a bit bloat in the log.



Anyway, I have stared a bit longer at the callgraph and what I have attached
seems suspicious I think. There's a whole lot of calls throughout the radeon
driver but the kcachegrind gui is quite limited and the graph is very
convoluted at this point so it's not clear to me whether this is during normal
operation. I'm still kind of searching for a tool that could create a call
graph and annotate it over time...
Where this is coming from is in dix/dixutils.c line 719 ("(*(cbr->proc)) (pcbl,
cbr->data, call_data);") at least but to my eyes it looks like it could do
anything and has a whole lot of callers that valgrind caught:

WriteToClient  (Xorg: io.c, ...)
FlushAllOutput (Xorg: io.c, ...)
XaceHook  (Xorg: xace.c, ...)
XaceHookPropertyAccess (Xorg: xace.c, ...)
XaceHookDispatch (Xorg: xace.c, ...)
CloseDownConnection (Xorg: connection.c, ...)
DeleteClientFromAnySelections (Xorg: selection.c, ...)
CloseDownClient (Xorg: dispatch.c, ...)
ProcSetSelectionOwner (Xorg: selection.c, ...)
SendConnSetup (Xorg: dispatch.c, ...)
NextAvailableClient (Xorg: dispatch.c, ...)

But maybe I'm just going in a totally wrong direction...

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #32 from Hohahiu  ---
I have the same messages in dmesg:
http://pastebin.com/vAsRRVW8

It seems like radeon driver tries to set clock and it couldn't do it. Therefore
it tries again and again and GPU does not turn off.

Christoph, does your dmesg have the same messages?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-06 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #31 from Christoph Haag  ---
Created attachment 124781
  --> https://bugzilla.kernel.org/attachment.cgi?id=124781=edit
callgrind output from X

Not much news:
I tried to run X with valgrind --tool=callgrind to try to find out about what
it does that results in the radeon gpu clocking higher and going to higher
voltage when clicking around with windows like minimizing/restoring with kwin &
plasma.


Unfortunately it does not happen when X runs in valgrind, the radeon gpu always
keeps at the lowest frequency and voltage level. But the runpm status is still
"active" all the time with or without valgrind.

With the exact same xorg and xf86* builds the clocking higher and raising the
voltages happens when it runs not in valgrind...

I still post the callgrind log, even if it did not clock higher, because in the
Callgraph Blockhandler() calls intel sna stuff but it also calls a lot of
radeon stuff.

Whether this is something that is caused by valgrind or something that should
happen with a radeon gpu that is not configured to be used for anything, I
don't know.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-02 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #30 from Christoph Haag  ---
Created attachment 124221
  --> https://bugzilla.kernel.org/attachment.cgi?id=124221=edit
sysprof with 2 xterm windows and kwin

I still kind of want to try the opposite direction, to find out what X does
that keeps the gpu awake so I'm trying to get a good dynamic call graph. The
first thing that kind of worked is sysprof...

So I started recording this after starting X and two xterms and kwin on the
intel gpu and just clicked a bit between the two because it seems this triggers
whatever happens. The radeon gpu was not even configured as offload slave, just
default X configuration.

If I saw that correctly none of the applications called functions from the
radeon driver which is probably good, but in X you can see this (more detail is
available in the file):

[/usr/bin/X]0,00%  65,99%
  In file [heap]0,00%  29,73%
ioctl   0,18%  16,17%
  - - kernel - -0,00%  15,99%
system_call_fastpath0,00%  15,99%
  sys_ioctl 0,04%  15,99%
do_vfs_ioctl0,00%  15,63%
  radeon_drm_ioctl  0,11%  15,60%
drm_ioctl   0,43%  14,63%
__pm_runtime_suspend0,00%   0,57%
__pm_runtime_resume 0,04%   0,14%
copy_user_enhanced_fast_string  0,07%   0,07%
irq_exit0,00%   0,04%
_copy_from_user 0,04%   0,04%
  drm_ioctl 0,04%   0,04%
fget_light  0,25%   0,25%
radeon_drm_ioctl0,04%   0,04%
fput0,04%   0,04%

I have only this GPU so I don't have any comparison to one where runpm works
correctly, whether it should be this active... Maybe it's interesting that
stuff like __pm_runtime_suspend and __pm_runtime_resume is called?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-02-01 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #29 from Christoph Haag  ---
No, it does not seem to help.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-31 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #28 from Alex Deucher  ---
Created attachment 124051
  --> https://bugzilla.kernel.org/attachment.cgi?id=124051=edit
return IRQ_NONE if we don't have any interrupts

Does the attached patch help?  Dave suggested it might be interrupts that
causing the problem.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-31 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #27 from Christoph Haag  ---
It does get to the end of radeon_pmops_runtime_idle every time (checked right
before the return) and crtc->enabled is never true.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-31 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #26 from Alex Deucher  ---
Can you see how far it's getting in runtime_idle()?  Does it get all the way to
the end or does it think a crtc is active?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-31 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #25 from Christoph Haag  ---
Created attachment 124041
  --> https://bugzilla.kernel.org/attachment.cgi?id=124041=edit
just some printk to easily see what's going on

Well, I don't really know how it is supposed to work. I figured the easiest
thing I could do right now was adding some printk() to the code to have an
overview of what is happening in dmesg -H | grep radeonpm.

After booting up with no X I see this:

[  +0,16] radeonpm: runtime_idle
[  +4,000483] radeonpm: runtime_suspend

After starting X this is added:

[Jan31 19:04] radeonpm: runtime_resume
[  +1,126688] radeonpm: runtime_idle
[  +0,000203] radeonpm: runtime_idle
[  +0,000400] radeonpm: runtime_idle
[  +0,000232] radeonpm: runtime_idle

But then it never does anything again.

After this, I did the usual xrandr --setprovideroffloadsink radeon Intel and
started and exited DRI_PRIME=1 glxgears and stuff, but nothing with radeonpm
was printed to the log again.

After a while I tried exiting X and sure enough, another line:

[  +2,368895] radeonpm: runtime_suspend

And after starting X for the second time:

[ +19,119650] radeonpm: runtime_resume
[  +1,131037] radeonpm: runtime_idle

And then... nothing with radeonpm again. I have read a bit in the
Documentation/power/runtime_pm.txt and maybe I did not catch it but I haven't
really seen when you call pm_runtime_put_autosuspend whether it is supposed to
eventually call back radeon_pmops_runtime_suspend but I would think that's what
it does, right? So that just never happens until X is quit.




After boot with no X running I see this with for i in
/sys/class/drm/card0/device/power/*; do echo "$i: $(cat $i)"; done

/sys/class/drm/card0/device/power/async: enabled
/sys/class/drm/card0/device/power/autosuspend_delay_ms: 5000
/sys/class/drm/card0/device/power/control: auto
/sys/class/drm/card0/device/power/runtime_active_kids: 0
/sys/class/drm/card0/device/power/runtime_active_time: 13260
/sys/class/drm/card0/device/power/runtime_enabled: enabled
/sys/class/drm/card0/device/power/runtime_status: suspended
/sys/class/drm/card0/device/power/runtime_suspended_time: 218426
/sys/class/drm/card0/device/power/runtime_usage: 0
[snipped some wakeup* stuff]

The GPU is off, fan is not running, etc.


After starting X I see this:

/sys/class/drm/card0/device/power/async: enabled
/sys/class/drm/card0/device/power/autosuspend_delay_ms: 5000
/sys/class/drm/card0/device/power/control: auto
/sys/class/drm/card0/device/power/runtime_active_kids: 0
/sys/class/drm/card0/device/power/runtime_active_time: 185230
/sys/class/drm/card0/device/power/runtime_enabled: enabled
/sys/class/drm/card0/device/power/runtime_status: active
/sys/class/drm/card0/device/power/runtime_suspended_time: 226856
/sys/class/drm/card0/device/power/runtime_usage: 0

runtime_status is always active in X...

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-31 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #24 from Alex Deucher  ---
The kernel runtime pm infrastructure is used.  See radeon_drv.c in the kernel. 
The driver registers a struct dev_pm_ops structure with the pm core.  The pm
core will then suspend and resume the device on demand.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-31 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #23 from Christoph Haag  ---
Created attachment 123891
  --> https://bugzilla.kernel.org/attachment.cgi?id=123891=edit
quitting steam after some time

So maybe radeon-profile shows garbage, but in the steam client I have found one
of the worse offenders. It doesn't happen always, but often, especially after
quitting a game I think, but maybe it's just after some time.

When "it" happens, the fan runs one "level" higher than usual and after I
noticed it I started radeon-profile and it showed what you see in the attached
screenshot.

As soon as I quit steam, the temperature (read by lm_sensors I think) drops
quickly, the voltages and frequencies (read from debugfs I think) are (mostly)
stopping to being at the top all the time and only exhibit the spikes from my
last comment, but most importantly the fan quickly drops to the usual behavior
of switching between the lowest level and off instead of running on the "second
lowest level" all the time like when steam was running.

I have had a quick look at the xf86-video-ati but I'm not sure what I'm looking
for: Is there one single function or so I can set a break point on that would
definitely tell me whether the radeon gpu is ordered to wake up/render
something? I'd be happy to try it on the kernel side too, but maybe there's a
chance to get a backtrace on the X side to see where it is coming from - all
given that it actually happens like I naively imagine. But I don't see anything
else: According to LIBGL_DEBUG=verbose steam doesn't open the radeonsi driver,
but yet sometimes manages to keep the fan of the radeon gpu running on higher
rpm. I would be happy to also test patches against whatever that could add
debug logging to point out what the radeon gpu is doing and why.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-24 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #22 from Hohahiu  ---
What does the order of GPUs in /sys/kernel/debug/vgaswitcheroo/switch mean?
Sometimes it is like this (dGPU is powered off):
0:IGD:+:Pwr::00:02.0
1:DIS: :DynOff::01:00.0
In other case it is like that (and dGPU is obviously on):
0:DIS: :PwrDyn::01:00.0
1:IGD:+:Pwr::00:02.0

Can it affect runpm?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-22 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #21 from Hohahiu  ---
Created attachment 122971
  --> https://bugzilla.kernel.org/attachment.cgi?id=122971=edit
Xorg.0.log

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-22 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #20 from Hohahiu  ---
Created attachment 122961
  --> https://bugzilla.kernel.org/attachment.cgi?id=122961=edit
dmesg

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-22 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Hohahiu  changed:

   What|Removed |Added

 CC||rakothedin at gmail.com

--- Comment #19 from Hohahiu  ---
I have similar trouble with vgaswitcheroo. My GPUs are Intel hd4000 + Amd
mobility radeon 7750M:
/sbin/lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor
Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI]
Chelsea LP [Radeon HD 7730M] (rev ff)

If I don't set anything during the boot my dGPU is powered down successfully:
cat /sys/kernel/debug/vgaswitcheroo/switch
0:IGD:+:Pwr::00:02.0
1:DIS: :DynOff::01:00.0

But in the meantime:
xrandr --listproviders
Providers: number : 1
Provider 0: id: 0x4a cap: 0xb, Source Output, Sink Output, Sink Offload crtcs:
4 outputs: 7 associated providers: 0 name:Intel

So I cannot access my discrete GPU.

If after the boot I use
xrandr --setprovideroffloadsink radeon Intel
then dGPU never switches off.

Truly speaking runpm never worked for me even with Alex's early patches.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-21 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #18 from Jack  ---
>If there is a better way to find out why the fan doesn't shut off

I'm pretty sure the reason the fans don't shut off is because they're reacting
to the dGPU usage. In my incredibly un-scientific observation, I've noticed
that the laptop gets warmer to the touch in Linux than it does in Windows. I'll
be doing some actual testing on this later tonight, but it seems to me that the
actual issue isn't the fans -- it's the fact that the dGPU is revving up and
down on a pretty constant basis, and that's driving the fans to spin up and
down with the dGPU.

One thing I noticed is that if I don't have a compositor running it only goes
up to power level 1, instead of two, when doing things in the OS, ie, browsing
in Chrome, etc. I still don't even think it should be anything but 0 unless
using DRI_PRIME, but might the compositor be interacting with the dGPU through
PRIME, accidentally? I say this without having any real technical knowledge of
how PRIME works.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-21 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #17 from Christoph Haag  ---
It reads /sys/kernel/debug/dri/0/radeon_pm_info once a second and lm_sensors I
think.

Powered down in terms of vgaswitcheroo/dynoff, I don't think so. When X is
running I don't read dynoff for the radeon gpu from
/sys/kernel/debug/vgaswitcheroo/switch but when X is not running, it powers
down.

I checked with radeon-profile because it's actually the fan that is sometimes
annoying to me.
The fan only powers off for seconds before starting to run again, mostly only
up to "level 1" for a while.
But sometimes it runs at high speed for seconds or even minutes for seemingly
no reason and then powers down again.

When monitoring with
sudo sh -c 'while true; do cat /sys/kernel/debug/dri/0/radeon_pm_info; sleep
0.5; done'
I see the same thing. When clicking on an unfocussed window and focussing it
with that, it goes from 

power level 0sclk: 3 mclk: 15000 vddc: 825 vddci: 850 pcie gen: 3
uvdvclk: 0 dclk: 0

to

power level 2sclk: 85000 mclk: 12 vddc: 1050 vddci: 975 pcie gen: 3
uvdvclk: 0 dclk: 0

stays there for ~6 seconds and then goes back to power level 0.


If it was garbage it would be pretty consistent garbage with the gpu
temperature rising pretty much exactly on every of those spike...
If constant polling from /sys/kernel/debug/dri/0/radeon_pm_info or so is
causing it, then the long periods of no spikes when actually doing nothing in X
confuse me.

Currently it's just this situation that the fan keeps going on and off all the
time and I am pretty sure it is on most of the time when I'm doing something,
like clicking things and stuff and when I'm exiting X and see dynoff in
/sys/kernel/debug/vgaswitcheroo/switch then the fan actually stays off (I
think. Going to a tty while X is running in the background does not result in
dynoff)

If there is a better way to find out why the fan doesn't shut off, I'd gladly
investigate.

The code is here, by the way: https://github.com/marazmista/radeon-profile

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-21 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #16 from Alex Deucher  ---
I'm not familiar with radeon-profile, but if it does some sort of manual
polling of the card, it might be waking up the card every time it polls or it
might just be reading garbage because the card is powered down.  Does it help
if you don't run radeon-profile?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-21 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Christoph Haag  changed:

   What|Removed |Added

 Attachment #115961|0   |1
is obsolete||
 Attachment #119501|0   |1
is obsolete||
 Attachment #119551|0   |1
is obsolete||

--- Comment #15 from Christoph Haag  ---
Created attachment 122811
  --> https://bugzilla.kernel.org/attachment.cgi?id=122811=edit
erratic dpm with window manager actions

Okay, so we got 3.13 without crashes with default settings, so far so good,
thanks so much. :)



The dynpm power management is wonky. It doesn't really fit in here, but I
already dumped a lot of stuff in here anyway and this report has been linked to
in relation to me mentioning that the power management isn't working well yet,
so here is another (not so) small point.

Here is a screenshot from the tool radeon-profile. I believe over the whole
period shown there I have not used the radeon GPU to render anything, but it is
configured as offload slave. I also hope that this tool did not cause this
behaviour, but from the fan noise I think it's happening without it too.

So there are some random "short" spikes that die off after a while of doing
nothing in X. However there are these longer spikes (about 5-7 seconds) when
doing some window management actions. An example of that is
minimizing/unminizing a window or simply changing the focus from one window to
another window via the mouse. I have tested this with kwin with compositing and
openbox without compositing and I think it's pretty clear that this actually
causes it. Changing tabs in chromium also seems to cause it.

The very quick spikes happened when I additionally ran glxgears with kwin
(still everything on the Intel GPU). They died down but appeared sometimes
again after doing something with the windows. With openbox on the other hand I
don't see such quick spikes with glxgears.

Another funny effect in openbox is that a single click on the "gears" in
glxgears doesn't have an effect, but a double click produces one of the longer
spikes. If I single click in chromium here in the bugtracker on the bakground
or in the textarea it produces a spike too. But simply typing in the textarea
does not.

I don't really know what to make of those observations, but I would guess that
the radeon gpu doesn't enter the dynoff state while X is running because of
something related to this. A short google search didn't show anything new about
radeonsi + dynoff, especially nobody who claims it's working, so maybe glamor
is worth looking at?

The intel gpu uses SNA and radeon obviously glamor. If the Xorg.0.log or so
would be helpful, just ask.

Maybe someone with a pre radeonsi gpu could test whether they have the same
issue when using glamor over exa?

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-19 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #14 from Jack  ---
Still having issues in rc8, unfortunately.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-06 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Jack  changed:

   What|Removed |Added

 CC||q at cyphernaut.org

--- Comment #13 from Jack  ---
Additionally vgaswitcheroo/switch remains at DynPwr whenever X is running, even
when nothing is using DRI_PRIME=1. This might be an issue specifically with
this card (I have a 7970M as well), as I've heard reports of users with earlier
generation cards (6xxxM) have seen vgaswitcheroo report DynOff when DRI_PRIME=1
is not set by any running application.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2014-01-06 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #12 from Christoph Haag  ---
Okay, with 3.13 rc7 I can confirm that the acpiphp problem is fixed. Now with
everything in the default configuration you can boot and use X without any
error.

rmmod radeon still causes nasty crashes.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-24 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #11 from Christoph Haag  ---
Created attachment 119551
  --> https://bugzilla.kernel.org/attachment.cgi?id=119551=edit
dmesg with acpiphp.disable=1

A quick search also showed this:
https://bugzilla.kernel.org/show_bug.cgi?id=67461

Good news: With acpiphp.disable=1 there are no errors with booting and using X.
In a tty the gpu is in DynOff state.



Not so good news: When starting X it switches to DynPwr and after several
minutes I don't think it ever powered down again. But when killing X it goes
back to DynOff.

Still bad news: rmmod radeon.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-24 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #10 from Alex Deucher  ---
See also:
https://bugzilla.kernel.org/show_bug.cgi?id=61891

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-24 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #9 from Alex Deucher  ---
This looks like a pci hotplug problem, not a radeon issue.  See:
https://bugs.freedesktop.org/show_bug.cgi?id=70687

"I seem to have discovered the root of the issue.

I've just built 3.13-rc5 kernel which has the dynamic powering of the discrete
gpu and all hell broke loose.

I've narrowed the error down to the pci hotplug driver. My machine loads shpchp
pci hotplug driver from what I can see in lsmod output. But the trick is, that
there is another pci hotplug driver, acpi pci hotplug one, which seems to break
all hell loose here. Disabling it seems to fix everything for me, at least on
kernel 3.13.

# CONFIG_HOTPLUG_PCI_ACPI is not set

This kernel config option is the culprit for this, and that also can be seen
from my backtrace:

[   22.731998]  [] ? acpiphp_check_bridge+0x72/0x88

So the trick behind this is that acpi pci hotplug driver conflicts with shpchp
one that my machine uses. And since it is a builtin driver, and can't be built
as module it is always loaded. The other possibility is that this machine
doesn't support acpi hotplug, but does support shpc pci hotplug. We need a
kernel workarround so that acpi pci hotplug is disabled and out of the way when
shpc pci hotplug is enabled."

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-24 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #8 from Christoph Haag  ---
Created attachment 119501
  --> https://bugzilla.kernel.org/attachment.cgi?id=119501=edit
rc5 with starting X (~line 1811) and trying to render something

rc5 with these patches now:
http://lists.freedesktop.org/archives/dri-devel/2013-December/050902.html

I'm not sure what I can do more without getting into trying to debug kernel
code of which I know nothing...

Not really a change with rc5, but I think when starting X the lockups go away
(not really sure). When trying to render something with DRI_PRIME=1 however, it
locks up a bit harder and can't recover apparently because when trying to start
another opengl program, it'll just segfault.

rmmod radeon still creates problems that lock up the cpus and prevent proper
rebooting etc.




With runpm=0 it's not really good either at the moment. For example the
Distance alpha causes a lockup too and after that lockup glxgears says

$ DRI_PRIME=1 glxgears -info
radeon: Failed to allocate virtual address for buffer:
radeon:size  : 4352 bytes
radeon:alignment : 4096 bytes
radeon:domains   : 4
radeon:va: 0x0080
radeon: Failed to allocate virtual address for buffer:
radeon:size  : 4352 bytes
radeon:alignment : 4096 bytes
radeon:domains   : 4
radeon:va: 0x0080
[1]2915 segmentation fault (core dumped)

I'm cramming way too much in here, but I just want to say that the HD 7970M
with the 8970M are the top mobile consumer GPUs AMD currently has and it would
be really cool if the developers could buy that hardware (hey, it's christmas!)
and fix this mess directly. From what I have seen so far it is really close to
working amazingly well. If there's something specific I could test, I'll gladly
try.
Thanks

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #7 from Christoph Haag  ---
Created attachment 117771
  --> https://bugzilla.kernel.org/attachment.cgi?id=117771=edit
Same as "117761: dmesg with drm-fixes-3.12-radeon-poweroff branch", but with
rmmod radeon and modprobe radeon

Okay, I exited X and from a tty did rmmod radeon. It produced the same error in
dmesg than it does on 3.13, see attached file...

Then with modprobe radeon I got the GPU lockup CP stall for more than
1msec, but interestingly only once and then it worked.

The radeon gpu was at DynOff then.

As soon as I started X, the gpu was powered on to DynPwr again. Not even
configured as provideroffloadsink or so, just the X startup and presumably
hardware detection powered it up and it does not go back to DynOff.

But again, knowing that runpm itself works without errors is encouraging.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-07 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #6 from Christoph Haag  ---
Created attachment 117761
  --> https://bugzilla.kernel.org/attachment.cgi?id=117761=edit
dmesg with drm-fixes-3.12-radeon-poweroff branch

Hm, I think it kind of works with drm-fixes-3.12-radeon-poweroff.

There seems to be no error in dmesg and X works fine.


I have monitored /sys/kernel/debug/vgaswitcheroo/switch a while.

Unfortunately most of the time it says
0:IGD:+:Pwr::00:02.0
1:DIS: :DynPwr::01:00.0

I have seen it say
0:IGD:+:Pwr::00:02.0
1:DIS: :DynOff::01:00.0
But I don't know really know how I got it to switch and whether it was
something I did at all.

Then, unprovoked, it switched to on ("DynPwr") again.
(There's nothing running on the radeon GPU, only X with xcompmgr on intel and
radeon is configured as provideroffloadsink)


I don't know how to coax it to power the gpu off. But at least it CAN work
already.




Maybe this is useful, maybe not:
With the radeon in idle I have this:
$ cat /sys/kernel/debug/dri/1/radeon_pm_info
uvdvclk: 0 dclk: 0
power level 0sclk: 3 mclk: 15000 vddc: 825 vddci: 850 pcie gen: 3

with glxgears I have this:
$ cat /sys/kernel/debug/dri/1/radeon_pm_info
uvdvclk: 0 dclk: 0
power level 2sclk: 85000 mclk: 12 vddc: 1050 vddci: 975 pcie gen: 3

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-06 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #5 from Alex Deucher  ---
Does runpm work on my old runpm branch prior to merging with 3.13?

http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-3.12-radeon-poweroff

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-12-06 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #4 from Christoph Haag  ---
Created attachment 117751
  --> https://bugzilla.kernel.org/attachment.cgi?id=117751=edit
kernel panic on reboot with runpm=1

3.13-rc3, same problems.

Also, here is a crappy picture of a kernel panic that happens when I try to
reboot when runpm is enabled.

When I have more time next week I'll figure out how to setup netconsole
properly and get a full log.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-11-30 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

--- Comment #3 from Christoph Haag  ---
Created attachment 116871
  --> https://bugzilla.kernel.org/attachment.cgi?id=116871=edit
dmesg from 3.13-rc2 with rmmod radeon in line 2095

Having enough time to test everything would be nice.

But in the meantime I have only tested 3.13-rc2. With defaults the issues are
still there (I have not yet checked whether mentioned commits made it into
rc2).

Now I have some more messsages in dmesg after rmmod'ing radeon. This time it
didn't hang after all that, but rebooting didn't work for some reason. Maybe
it's just me but I'd like to have proper unloading to have the option to switch
between radeon and fglrx without rebooting. :)

Anyway with runpm=0 I don't get any errors so dpm does really work fine. On an
unrelated note: WOW, has radeonsi improved in performance! Much better than
fglrx with "official" hybrid support in some applications at least.

Unfortunately using runpm=0 and using vgaswitcheroo manually is not a viable
workaround because of https://bugzilla.kernel.org/show_bug.cgi?id=51381

But with runpm=0 there still is a similar (?) problem when rmmod'ing radeon:

[ 1897.087151] [drm] radeon: finishing device.
[ 1897.518615] [ cut here ]
[ 1897.518634] WARNING: CPU: 4 PID: 2539 at drivers/gpu/drm/drm_mm.c:578
drm_mm_takedown+0x2e/0x30 [drm]()
[ 1897.518635] Memory manager not clean during takedown.
[ 1897.518636] Modules linked in: bnep bluetooth iTCO_wdt iTCO_vendor_support
arc4 iwldvm snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek joydev
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul
crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw
gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec
iwlwifi snd_hwdep psmouse cfg80211 snd_pcm serio_raw pcspkr snd_page_alloc
radeon(-) rtsx_pci_ms r8169 snd_timer memstick rfkill lpc_ich mii snd i2c_i801
soundcore ttm wmi thermal mei_me mei shpchp processor battery ac evdev nfs
lockd sunrpc fscache fuse ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
hid_generic usbhid hid rtsx_pci_sdmmc mmc_core ahci i915 libahci crc32c_intel
i2c_algo_bit libata intel_agp firewire_ohci intel_gtt
[ 1897.518659]  firewire_core crc_itu_t drm_kms_helper ehci_pci ehci_hcd
xhci_hcd scsi_mod rtsx_pci drm usbcore i2c_core usb_common video button
[ 1897.518665] CPU: 4 PID: 2539 Comm: rmmod Not tainted 3.13.0-1-mainline #1
[ 1897.518666] Hardware name: CLEVO P170EM/P170EM,
BIOS 4.6.5 08/22/2012
[ 1897.518667]  0009 8807d4b47c20 814f5570
8807d4b47c68
[ 1897.518669]  8807d4b47c58 81061bad 8807fef616c0
8807fef61768
[ 1897.518671]  8808029a47b0  01ce5090
8807d4b47cb8
[ 1897.518673] Call Trace:
[ 1897.518677]  [] dump_stack+0x4d/0x6f
[ 1897.518681]  [] warn_slowpath_common+0x7d/0xa0
[ 1897.518683]  [] warn_slowpath_fmt+0x4c/0x50
[ 1897.518688]  [] ? ttm_bo_man_takedown+0x48/0x70 [ttm]
[ 1897.518693]  [] drm_mm_takedown+0x2e/0x30 [drm]
[ 1897.518696]  [] ttm_bo_man_takedown+0x38/0x70 [ttm]
[ 1897.518699]  [] ttm_bo_clean_mm+0x49/0x80 [ttm]
[ 1897.518709]  [] radeon_ttm_fini+0xbd/0x190 [radeon]
[ 1897.518716]  [] radeon_bo_fini+0x12/0x20 [radeon]
[ 1897.518727]  [] si_fini+0xc1/0x100 [radeon]
[ 1897.518733]  [] radeon_device_fini+0x3e/0x120 [radeon]
[ 1897.518739]  [] radeon_driver_unload_kms+0x4e/0x70
[radeon]
[ 1897.518744]  [] drm_dev_unregister+0x2c/0xe0 [drm]
[ 1897.518749]  [] drm_put_dev+0x3b/0x70 [drm]
[ 1897.518754]  [] radeon_pci_remove+0x1d/0x20 [radeon]
[ 1897.518756]  [] pci_device_remove+0x3b/0xb0
[ 1897.518759]  [] __device_release_driver+0x7f/0xf0
[ 1897.518761]  [] driver_detach+0xb8/0xc0
[ 1897.518763]  [] bus_remove_driver+0x55/0xd0
[ 1897.518765]  [] driver_unregister+0x2c/0x50
[ 1897.518767]  [] pci_unregister_driver+0x29/0x90
[ 1897.518772]  [] drm_pci_exit+0x98/0xa0 [drm]
[ 1897.518778]  [] radeon_exit+0x17/0x1e [radeon]
[ 1897.518780]  [] SyS_delete_module+0x172/0x240
[ 1897.518783]  [] ? do_notify_resume+0x8c/0xa0
[ 1897.518785]  [] system_call_fastpath+0x1a/0x1f
[ 1897.518786] ---[ end trace ba1fe37dd4719714 ]---
[ 1897.518790] [TTM] Finalizing pool allocator
[ 1897.518793] [TTM] Finalizing DMA pool allocator
[ 1897.518797] [ cut here ]
[ 1897.518799] WARNING: CPU: 4 PID: 2539 at
drivers/gpu/drm/ttm/ttm_page_alloc_dma.c:534 ttm_dma_free_pool+0x12b/0x130
[ttm]()
[ 1897.518800] Modules linked in: bnep bluetooth iTCO_wdt iTCO_vendor_support
arc4 iwldvm snd_hda_codec_hdmi mac80211 snd_hda_codec_realtek joydev
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul
crct10dif_common crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw
gf128mul glue_helper ablk_helper cryptd snd_hda_intel microcode snd_hda_codec
iwlwifi snd_hwdep psmouse cfg80211 snd_pcm serio_raw pcspkr snd_page_alloc
radeon(-) rtsx_pci_ms r8169 snd_timer

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-11-26 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Alex Deucher  changed:

   What|Removed |Added

 CC||alexdeucher at gmail.com,
   ||rjw at sisk.pl

--- Comment #2 from Alex Deucher  ---
>From the bisect in https://bugs.freedesktop.org/show_bug.cgi?id=71930 it looks
like commit bbd34fcdd1b201e996235731a7c98fd5197d9e51 caused a regression in
vgaswitcheroo and hence runpm.  Adding Rafael.  Also bug 65811 looks like a
duplicate of this one.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 65761] HD 7970M Hybrid - hangs and errors and rmmod causes crash

2013-11-25 Thread bugzilla-dae...@bugzilla.kernel.org

https://bugzilla.kernel.org/show_bug.cgi?id=65761

Mike Lothian  changed:

   What|Removed |Added

 CC||mike at fireburn.co.uk

--- Comment #1 from Mike Lothian  ---
I believe this is related to https://bugs.freedesktop.org/show_bug.cgi?id=71930

I think the memory Warnings have been fixed by either:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm?id=c58f009e01c918717379c206a63baa66f56a77f9

or

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm?id=0bc254257bfd9b25f64a68b719ee70a303b6d051

As a workaround you can boot with radeon.runpm=0 but that'll leave your
discreet card powered up

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

57 matches

Mail list logo