Re: [Intel-gfx] [BUG] i915 RC6 lockup

2012-10-19 Thread Jonas Jelten
On 10/17/2012 04:30 AM, Ben Widawsky wrote:
> On Tue, 16 Oct 2012 15:19:26 +0200
> Jonas Jelten wrote:
> 
>> Hi list!
>>
>> I think I've got a problem with the Intel driver:
>>
>> Sometimes, especially (I think) after running graphics-intensive
>> applications, RC6 is disabled completely and my ThinkPad X220t heats
>> up to 90 degrees Celsius while idling.
>>
>> At first I thought that this is a CPU frequency scaling issue, as the
>> cpufreq_powersave claims to be running at 800 MHz, but i7z
>> (http://code.google.com/p/i7z/) shows all multipliers to be 25 -> 2.5
>> GHz CPU clock.
>>
>> PowerTOP 2.1 reports that the GPU is 100% active (0% RC6, 0% RC6p and
>> 0% RC6pp), while the CPU is 99.9% in C7 deep sleep, at maximum frequency.
>> /sys/kernel/debug/dri/0/i915_ring_freq_table also points to the GPU as
>> the cause.
> 
> Do you mean the GPU is 0% active? If you really mean 100%, then the
> results are expected, though I'm not sure how powertop calculates
> GPU activity. I'm guessing it's just 100 minus the rc6 residency
> percentage, which is probably reasonably close when rc6 works.
> 
>>
>> intel_gpu_top shows a total idle.
> 
> This indicates the above assumption is true.
> 
>>
>> I'm on Arch Linux, kernel 3.6.2, xf86-video-intel-git
>> b42d81b63f5b6a571faffaadd42c74adce40128a, which is 2.20.10.
>> The problem first occurred with kernel 3.6.0.
>> Core i5-2520M, HD 3000
> 
> Obviously a bisect of the exact failing commit would be fantastic.
> 
>>
>>> cat /proc/cmdline
>>> cryptdevice=/dev/sda2:cryptroot root=/dev/mapper/cryptroot ro vga=791
>>> i915.i915_enable_rc6=7 i915.modeset=1 i915.lvds_downclock=1
>>> i915.semaphores=1 drm.vblankoffdelay=1 init=/bin/systemd
>>> initrd=../initramfs-linux.img BOOT_IMAGE=../vmlinuz-linux
> 
> First and most obvious: do not set i915_enable_rc6=7. If you do, do not
> file bug reports with those results. RC6++ is known to be extremely
> broken, and why we let users hurt themselves so easily is probably
> something we need to remedy. On HD3000, even rc6+ is strongly discouraged.
> 
>>
>> Sometimes it can be fixed by suspending with pm-suspend and waking up.
>> A reboot always fixes it, until the GPU randomly locks up again.
>>
>> Please tell me how I can investigate further to catch the bug.
> 
> If you can reproduce it with i915_enable_rc6=1, then it echoes some
> other bugs we're trying to track down. Finding the most minimal test
> case that triggers it would be helpful. You can also search the mailing
> list for RPS-related patches, which seem to be relevant. Trying some of
> those and reporting your results would be helpful.
> 
> Double-check your dmesg for any GPU hangs that may have occurred before
> the laptop became a space heater.
> 
> 
>>
>> As this makes my laptop consume ~40W, it would be really nice if this
>> got fixed.
>>
>>
>> Cheers,
>>
>> Jonas
>>
>>
> 

Others are also suffering:

https://bugs.archlinux.org/task/32025



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [BUG] i915 RC6 lockup

2012-10-16 Thread Ben Widawsky
On Tue, 16 Oct 2012 15:19:26 +0200
Jonas Jelten wrote:

> Hi list!
> 
> I think I've got a problem with the Intel driver:
> 
> Sometimes, especially (I think) after running graphics-intensive
> applications, RC6 is disabled completely and my ThinkPad X220t heats
> up to 90 degrees Celsius while idling.
> 
> At first I thought that this is a CPU frequency scaling issue, as the
> cpufreq_powersave claims to be running at 800 MHz, but i7z
> (http://code.google.com/p/i7z/) shows all multipliers to be 25 -> 2.5
> GHz CPU clock.
> 
> PowerTOP 2.1 reports that the GPU is 100% active (0% RC6, 0% RC6p and
> 0% RC6pp), while the CPU is 99.9% in C7 deep sleep, at maximum frequency.
> /sys/kernel/debug/dri/0/i915_ring_freq_table also points to the GPU as
> the cause.

Do you mean the GPU is 0% active? If you really mean 100%, then the
results are expected, though I'm not sure how powertop calculates
GPU activity. I'm guessing it's just 100 minus the rc6 residency
percentage, which is probably reasonably close when rc6 works.
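
To make that guess concrete, here is a minimal sketch of how such a figure
could be derived. This is not powertop's actual code (which has not been
checked here); the function name and the clamping to 0..100 are assumptions
made purely for illustration.

```python
def guessed_gpu_activity(rc6: float, rc6p: float = 0.0, rc6pp: float = 0.0) -> float:
    """Hypothetical helper: percentage of time NOT spent in any RC6 state.

    Mirrors the guess "activity = 100 - rc6 residency"; it does not
    measure real GPU load.
    """
    residency = rc6 + rc6p + rc6pp
    # Clamp so that rounding in the inputs cannot push the result out of range.
    return max(0.0, min(100.0, 100.0 - residency))


# With the reported residencies (0% RC6, 0% RC6p, 0% RC6pp) the result is
# 100% "active", regardless of whether the GPU is doing any work.
print(guessed_gpu_activity(0.0, 0.0, 0.0))
```

If the tool works this way, a GPU that never enters RC6 would show up as
fully active even while intel_gpu_top reports it as idle.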

> 
> intel_gpu_top shows a total idle.

This indicates the above assumption is true.

> 
> I'm on Arch Linux, kernel 3.6.2, xf86-video-intel-git
> b42d81b63f5b6a571faffaadd42c74adce40128a, which is 2.20.10.
> The problem first occurred with kernel 3.6.0.
> Core i5-2520M, HD 3000

Obviously a bisect of the exact failing commit would be fantastic.

> 
> >cat /proc/cmdline
> >cryptdevice=/dev/sda2:cryptroot root=/dev/mapper/cryptroot ro vga=791
> >i915.i915_enable_rc6=7 i915.modeset=1 i915.lvds_downclock=1
> >i915.semaphores=1 drm.vblankoffdelay=1 init=/bin/systemd
> >initrd=../initramfs-linux.img BOOT_IMAGE=../vmlinuz-linux

First and most obvious: do not set i915_enable_rc6=7. If you do, do not
file bug reports with those results. RC6++ is known to be extremely
broken, and why we let users hurt themselves so easily is probably
something we need to remedy. On HD3000, even rc6+ is strongly discouraged.
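
For anyone unsure what a given value enables, the sketch below decodes the
parameter from a kernel command line. It assumes the bitmask meaning given in
the module parameter description of kernels from this era (1 = rc6, 2 = deep
rc6, 4 = deepest rc6, negative = per-chip default); the function and table
names are made up for illustration.

```python
from typing import List

# Assumed bit meanings of i915.i915_enable_rc6 (see lead-in above).
RC6_BITS = {1: "rc6", 2: "rc6p (deep)", 4: "rc6pp (deepest)"}


def rc6_states(cmdline: str) -> List[str]:
    """Hypothetical helper: list the RC6 states requested on a command line."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "i915.i915_enable_rc6":
            mask = int(value, 0)  # base 0 accepts decimal or 0x-prefixed hex
            if mask < 0:
                return ["per-chip default"]
            return [name for bit, name in RC6_BITS.items() if mask & bit]
    # Parameter not given: the driver picks its own default.
    return ["per-chip default"]


print(rc6_states("ro vga=791 i915.i915_enable_rc6=7 i915.modeset=1"))
print(rc6_states("ro vga=791 i915.i915_enable_rc6=1 i915.modeset=1"))
```

A value of 7 therefore requests all three states, including the RC6++ state
warned about above, while 1 requests plain rc6 only.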

> 
> Sometimes it can be fixed by suspending with pm-suspend and waking up.
> A reboot always fixes it, until the GPU randomly locks up again.
> 
> Please tell me how I can investigate further to catch the bug.

If you can reproduce it with i915_enable_rc6=1, then it echoes some
other bugs we're trying to track down. Finding the most minimal test
case that triggers it would be helpful. You can also search the mailing
list for RPS-related patches, which seem to be relevant. Trying some of
those and reporting your results would be helpful.

Double-check your dmesg for any GPU hangs that may have occurred before
the laptop became a space heater.
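
One way to do that check is to filter the kernel log for hang-related lines.
The sketch below is only a starting point: the search pattern is a guess at
what such messages contain, not an authoritative list of what the driver
prints, and the sample lines are made up for the example.

```python
import re

# Assumed pattern: lines mentioning "hangcheck" or "GPU hang"/"GPU hung".
HANG_PATTERN = re.compile(r"hangcheck|gpu\s+(?:hang|hung)", re.IGNORECASE)


def find_gpu_hang_lines(log_text: str) -> list:
    """Hypothetical helper: return log lines that look like GPU hang reports."""
    return [line for line in log_text.splitlines() if HANG_PATTERN.search(line)]


# Made-up sample input, NOT real kernel output. On a live system, pass in
# the text produced by running `dmesg` instead.
sample = (
    "[   10.000000] example: unrelated message\n"
    "[   99.000000] example: GPU hung\n"
)
for line in find_gpu_hang_lines(sample):
    print(line)
```

Any hits with timestamps shortly before the temperature starts to climb would
be worth including in the bug report.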


> 
> As this makes my laptop consume ~40W, it would be really nice if this
> got fixed.
> 
> 
> Cheers,
> 
> Jonas
> 
> 



[Intel-gfx] [BUG] i915 RC6 lockup

2012-10-16 Thread Jonas Jelten
Hi list!

I think I've got a problem with the Intel driver:

Sometimes, especially (I think) after running graphics-intensive
applications, RC6 is disabled completely and my ThinkPad X220t heats
up to 90 degrees Celsius while idling.

At first I thought that this is a CPU frequency scaling issue, as the
cpufreq_powersave claims to be running at 800 MHz, but i7z
(http://code.google.com/p/i7z/) shows all multipliers to be 25 -> 2.5
GHz CPU clock.
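
The multiplier-to-clock conversion above can be written out as a small
calculation. It assumes a 100 MHz base clock, which is not stated by i7z in
this report; the constant and function names are made up for illustration.

```python
ASSUMED_BCLK_MHZ = 100.0  # assumption: base clock in MHz, not taken from i7z


def core_clock_ghz(multiplier: int, bclk_mhz: float = ASSUMED_BCLK_MHZ) -> float:
    """Hypothetical helper: core clock in GHz for a given multiplier."""
    return multiplier * bclk_mhz / 1000.0


print(core_clock_ghz(25))  # multiplier reported by i7z
print(core_clock_ghz(8))   # multiplier that would match the claimed 800 MHz
```

Under that assumption a multiplier of 25 corresponds to 2.5 GHz, whereas the
800 MHz claimed by cpufreq would correspond to a multiplier of 8.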

PowerTOP 2.1 reports that the GPU is 100% active (0% RC6, 0% RC6p and
0% RC6pp), while the CPU is 99.9% in C7 deep sleep, at maximum frequency.
/sys/kernel/debug/dri/0/i915_ring_freq_table also points to the GPU as
the cause.

intel_gpu_top shows a total idle.

I'm on Arch Linux, kernel 3.6.2, xf86-video-intel-git
b42d81b63f5b6a571faffaadd42c74adce40128a, which is 2.20.10.
The problem first occurred with kernel 3.6.0.
Core i5-2520M, HD 3000

>cat /proc/cmdline
>cryptdevice=/dev/sda2:cryptroot root=/dev/mapper/cryptroot ro vga=791
>i915.i915_enable_rc6=7 i915.modeset=1 i915.lvds_downclock=1
>i915.semaphores=1 drm.vblankoffdelay=1 init=/bin/systemd
>initrd=../initramfs-linux.img BOOT_IMAGE=../vmlinuz-linux

Sometimes it can be fixed by suspending with pm-suspend and waking up. A
reboot always fixes it, until the GPU randomly locks up again.

Please tell me how I can investigate further to catch the bug.

As this makes my laptop consume ~40W, it would be really nice if this
got fixed.


Cheers,

Jonas


