[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-22 Thread Indan Zupancic
On Tue, February 22, 2011 04:25, Indan Zupancic wrote:
> Hi,
>
> On Sun, February 20, 2011 14:21, Daniel Vetter wrote:
>> Well, don't start jumping around, yet. These patches are just to rule out
>> some theories. Now: Is it fixed with just the 2nd patch alone or do you need 
>> both
>> patches? This is very important, so please test extensively whether there 
>> are really
>> no corruptions with just the 2nd patch.
>
> I managed to create some corruption with an xterm above xpdf. It looks  
> different
> than the original corruption, so I think it's safe to say it's a different 
> bug:
>
> http://img593.imageshack.us/i/ss1298340823.png
> http://img203.imageshack.us/i/ss1298340776.png
>
> This was without xcompmgr running, I don't think I would have seen it 
> otherwise.
> Actually, it turns out it's really easy to trigger as well, so I can more 
> easily
> test patches now. Unfortunately I wasn't smart enough to store the bisecting 
> kernels,
> gah!
>
> I tried with both your patches, as well as the HIC poking patch, but I still 
> get it.
>
> I'll try to pinpoint more exactly when this started to happen.

Okay, I tried older kernels and I get it even with 2.6.34, so I think it's a bug
in userspace, like intel-dri, or perhaps the intel X driver (though I get the 
same
with an older version).

Greetings,

Indan




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-22 Thread Indan Zupancic
Hi,

On Sun, February 20, 2011 14:21, Daniel Vetter wrote:
> Well, don't start jumping around, yet. These patches are just to rule out
> some theories. Now: Is it fixed with just the 2nd patch alone or do you need 
> both
> patches? This is very important, so please test extensively whether there are 
> really
> no corruptions with just the 2nd patch.

I managed to create some corruption with an xterm above xpdf. It looks  
different
than the original corruption, so I think it's safe to say it's a different bug:

http://img593.imageshack.us/i/ss1298340823.png
http://img203.imageshack.us/i/ss1298340776.png

This was without xcompmgr running, I don't think I would have seen it otherwise.
Actually, it turns out it's really easy to trigger as well, so I can more easily
test patches now. Unfortunately I wasn't smart enough to store the bisecting 
kernels,
gah!

I tried with both your patches, as well as the HIC poking patch, but I still 
get it.

I'll try to pinpoint more exactly when this started to happen.

Greetings,

Indan




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-21 Thread Indan Zupancic
Hi Peter,

The new corruption is "fixed" with daniel's patch from
https://lkml.org/lkml/2011/2/20/25

On Mon, February 21, 2011 05:10, Peter Stuge wrote:
> Indan Zupancic wrote:
>> > Confirm I also have this issue on my X40, but there are other bugs
>> > that are much more significant so I haven't bothered mentioning this.
>>
>> What issues?
>
> The one I confirmed is corrupted graphics within Gecko. I haven't had
> Xv working for a long time either. Not sure if I've tried it with
> this kernel though.

Hmm, I recommend upgrading all the graphics userspace too then,
current stuff works pretty well for me.

>
>> If it's backlight related, try my patch at:
>> http://lkml.org/lkml/2011/2/16/447
>
> Yeah also have backlight issue whenever the backlight level changes
> by other means than Fn+Home/End. (Lid switch, screen blank)
> I noticed the patch and that it solved the issue for someone. I'm not
> too inconvenienced by this issue though.

Well, it was for me, because I suspend all the time.

>
>> Or if the screen is black after suspend/screen blank
>
> Oh I've stopped using suspend since using KMS. I get way too angry
> about all the state that I lose if resume fails so I don't risk it.

I never had a failed resume with this laptop, not even once. But I
avoided the early days of KMS because of all the instability.

>
>> Actually, a lot of bugs were recently introduced and fixed, with
>> two months ago you're probably in the new-bugs-only period, so I
>> can recommend trying 2.6.38-rc5.
>
> Yeah, I think it's time to pull Linus' git. I've been keeping an eye
> on things i915 on the list for a good while already.

There were coherency bugs for 855, but those seem to be fixed in the
newer kernel too (2.6.37?), so another reason to upgrade.

>
>> This screen corruption is the only problem for me, but I don't do
>> anything fancy with my laptop. The ipw2200 wireless driver is quite
>> crappy, but it has always been as far as I know.
>
> I've used ipw2200 with great success for many years, but these days
> I'm having fun (no, not at all) with ath9k where there is some very
> fundamental hardware issue between laptop and card. I'd need to hook
> up logic analyzer to say anything concrete, but I have no end of
> problems with internal ath9k in my machine. It's completely unusable.

My impression is that ipw2200's firmware is crappy. Everything is fine
with a good signal, but with a bad one things don't work out too well
after a time.

> The only other annoying issue I have is that as wine enumerates
> available screen resolutions i915 goes out to the VGA connector,
> which on 855 always means a 600ms timeout when nothing is connected,
> but this is a bit tricky because the hardware just can not tell if
> anything is connected.

I have the impression that got better too, but I haven't followed the
development. Can't you just lower the timeout in the code, or let it
cache the VGA status? The latter is what I'd expect the code to do.

Running xrandr to see the modes takes 0.3s here.

> I would be very happy if there was a knob for enabling/disabling the
> VGA connector though.
>
>
>> Good luck,
>
> Thanks, you too!

I can recommend using PHC too, if you don't already: www.linux-phc.org
With it I almost never have the fan running and it uses a lot less power
when under load. (But don't use it till you got your system stable.)

Greetings,

Indan




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-21 Thread Peter Stuge
Indan Zupancic wrote:
> > Confirm I also have this issue on my X40, but there are other bugs
> > that are much more significant so I haven't bothered mentioning this.
> 
> What issues?

The one I confirmed is corrupted graphics within Gecko. I haven't had
Xv working for a long time either. Not sure if I've tried it with
this kernel though.


> If it's backlight related, try my patch at:
> http://lkml.org/lkml/2011/2/16/447

Yeah also have backlight issue whenever the backlight level changes
by other means than Fn+Home/End. (Lid switch, screen blank)
I noticed the patch and that it solved the issue for someone. I'm not
too inconvenienced by this issue though.


> Or if the screen is black after suspend/screen blank

Oh I've stopped using suspend since using KMS. I get way too angry
about all the state that I lose if resume fails so I don't risk it.


> Actually, a lot of bugs were recently introduced and fixed, with
> two months ago you're probably in the new-bugs-only period, so I
> can recommend trying 2.6.38-rc5.

Yeah, I think it's time to pull Linus' git. I've been keeping an eye
on things i915 on the list for a good while already.


> This screen corruption is the only problem for me, but I don't do
> anything fancy with my laptop. The ipw2200 wireless driver is quite
> crappy, but it has always been as far as I know.

I've used ipw2200 with great success for many years, but these days
I'm having fun (no, not at all) with ath9k where there is some very
fundamental hardware issue between laptop and card. I'd need to hook
up logic analyzer to say anything concrete, but I have no end of
problems with internal ath9k in my machine. It's completely unusable.

The only other annoying issue I have is that as wine enumerates
available screen resolutions i915 goes out to the VGA connector,
which on 855 always means a 600ms timeout when nothing is connected,
but this is a bit tricky because the hardware just can not tell if
anything is connected.

I would be very happy if there was a knob for enabling/disabling the
VGA connector though.


> Good luck,

Thanks, you too!


//Peter


[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-21 Thread Indan Zupancic
Hi,

Feel free to ignore this post, it's a it of rambling.

On Sun, February 20, 2011 11:55, Daniel Vetter wrote:
> I've just noticed that one of the patches (the 2nd one) doesn't work
> as advertised. Please retest with the attached one.

I was wondering why that was, so I looked a bit closer:

Old patch:

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 17bd766..fbc21e3 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -763,7 +763,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
value = HAS_BLT(dev);
break;
case I915_PARAM_HAS_RELAXED_FENCING:
-   value = 1;
+   value = 0;
break;
case I915_PARAM_HAS_COHERENT_RINGS:
value = 1;

New patch:

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 17bd766..d275c96 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -762,9 +762,6 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_BLT:
value = HAS_BLT(dev);
break;
-   case I915_PARAM_HAS_RELAXED_FENCING:
-   value = 1;
-   break;
case I915_PARAM_HAS_COHERENT_RINGS:
value = 1;
break;

Looking at userspace intel-dri code it becomes clear why:

gp.param = I915_PARAM_HAS_EXECBUF2;
ret = drmIoctl(bufmgr_gem->fd, DRM_IOCTL_I915_GETPARAM, );
if (!ret)
exec2 = 1;

gp.param = I915_PARAM_HAS_BSD;
ret = drmIoctl(bufmgr_gem->fd, DRM_IOCTL_I915_GETPARAM, );
bufmgr_gem->has_bsd = ret == 0;

gp.param = I915_PARAM_HAS_BLT;
ret = drmIoctl(bufmgr_gem->fd, DRM_IOCTL_I915_GETPARAM, );
bufmgr_gem->has_blt = ret == 0;

gp.param = I915_PARAM_HAS_RELAXED_FENCING;
ret = drmIoctl(bufmgr_gem->fd, DRM_IOCTL_I915_GETPARAM, );
bufmgr_gem->has_relaxed_fencing = ret == 0;

The stupid code assumes that if an option is supported, the value is true too,
while the kernel return code just tells whether it knows the option, and the
value is copied to gp.value, but that's ignored by intel-dri!

And a quick look at the other code gives a strong impression of lots of code
dubplication with the kernel driver.

More and more I'm getting the impression that you guys haven't gotten the
interfaces between all bits right yet. All in all you seem to be somewhat
in the same mess as filesystems are with overly complex interaction between
the MM system, VFS and individual filesystems, all doing more or less the
same in slightly different ways, instead of abstracting things properly.
(Which is also just an impression, I haven't looked that close yet.)

To give you an idea of a driver subsystem that does get it right:

All the common libata code is 23k lines. (wc liba*)
All the individual sata drivers code combined are 19k lines. (wc sata_*)

DRM does it differently (only counting .c files):

Common drm code is 21k
i915 is 37k
nouveau is 47k
radeon is 68k

And then there are the userspace drivers:
xf86-video-intel/src/*.c is 16k
drm/intel/ is 4k
mesa/drivers/dri/i915/ is 22k

Of course graphics drivers are a lot more complex than sata drivers,
but because of that extra complexity it's a lot easier to make things
even more compex by getting the interfaces wrong. So what I'm saying is,
there seems a lot of room for improvements. The intel driver code I've
seen is mostly busy with infrastructure stuff and talking to other bits
and pieces, but it doesn't seem to do much actual work.

To come back to the I915_PARAM_HAS_RELAXED_FENCING thing:

Why is this interface there at all? It seems like a driver internal detail
thing. Either it should be handled entirely in the kernel driver, and this
interface wouldn't be there, or, if the userspace driver has to know about
it, it should be entirely handled by the userspace driver and not done by
the kernel driver at all.

Another slightly annoying thing: The code is littered with gen checks,
while very often the only difference is a register or size value. Why
not put those in a gen specific hardware description structure which is
used unconditionally, instead of having a lot of almost the same code?

The current design seems overly complex and fragile, and I think you guys
make it more difficult for yourself than necessary.

Greetings,

Indan




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-21 Thread Indan Zupancic
Hi,

On Sun, February 20, 2011 14:21, Daniel Vetter wrote:
> Well, don't start jumping around, yet. These patches are just to rule
> out some theories.

I know, that's why I said I don't mind testing other patches.

> Now: Is it fixed with just the 2nd patch alone or do you
> need both patches? This is very important, so please test extensively
> whether there are really no corruptions with just the 2nd patch.

I only applied the 2nd version of the 2nd patch, no other patches.
The horizontal garbage stripes are gone for sure. It's a lot easier
to prove that something doesn't work than to prove it does, but I'll
keep running it for a while and see if I can spot any other badness.
If it's hard to trigger it's also very hard to find out what causes it.
Without your 2nd patch I always get the garbage on certain webpages,
so I'm quite confident that the original source is located for this
particular problem.

Greetings,

Indan




Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-21 Thread Indan Zupancic
Hi,

On Sun, February 20, 2011 14:21, Daniel Vetter wrote:
 Well, don't start jumping around, yet. These patches are just to rule out
 some theories. Now: Is it fixed with just the 2nd patch alone or do you need 
 both
 patches? This is very important, so please test extensively whether there are 
 really
 no corruptions with just the 2nd patch.

I managed to create some corruption with an xterm above xpdf. It looks  
different
than the original corruption, so I think it's safe to say it's a different bug:

http://img593.imageshack.us/i/ss1298340823.png
http://img203.imageshack.us/i/ss1298340776.png

This was without xcompmgr running, I don't think I would have seen it otherwise.
Actually, it turns out it's really easy to trigger as well, so I can more easily
test patches now. Unfortunately I wasn't smart enough to store the bisecting 
kernels,
gah!

I tried with both your patches, as well as the HIC poking patch, but I still 
get it.

I'll try to pinpoint more exactly when this started to happen.

Greetings,

Indan


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Daniel Vetter
On Sun, Feb 20, 2011 at 12:19:14PM +0100, Indan Zupancic wrote:
> Hi,
> 
> Good news:
> 
> On Sun, February 20, 2011 11:55, Daniel Vetter wrote:
> >> > I've created two quick patches to check a few theories, please test them
> >> > (both patches independently and both together). Patches attached.
> >>
> >> Tried with both applied, doesn't change anything.
> >
> > I've just noticed that one of the patches (the 2nd one) doesn't work as
> > advertised. Please retest with the attached one.
> 
> This one fixes it!

Well, don't start jumping around, yet. These patches are just to rule out
some theories. Now: Is it fixed with just the 2nd patch alone or do you
need both patches? This is very important, so please test extensively
whether there are really no corruptions with just the 2nd patch.

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi,

Good news:

On Sun, February 20, 2011 11:55, Daniel Vetter wrote:
>> > I've created two quick patches to check a few theories, please test them
>> > (both patches independently and both together). Patches attached.
>>
>> Tried with both applied, doesn't change anything.
>
> I've just noticed that one of the patches (the 2nd one) doesn't work as
> advertised. Please retest with the attached one.

This one fixes it!

So ignore my other email, you nailed it. Thanks a heap Daniel.

Tomorrow I'll test more and double check, but it seems fixed.

Of course, looking at the patch, I guess it isn't generally
applicable, so perhaps either more digging or a special
exception for older chipsets is needed. I'm happy to test
any further patches if that helps, triggering the corruption
is really easy for me.

Thanks,

Indan




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi Daniel,

On Sun, February 20, 2011 10:20, Daniel Vetter wrote:
> On Sun, Feb 20, 2011 at 03:20:15AM +0100, Indan Zupancic wrote:
>> Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
>> hardware, 855GM.
>
> Ah, craptastic i855gm. Is this with the HIC poke patch?

Yeah, that one. That's what I first suspected, so I tried both with and
without it, but this is something new.

> Well, these two patches completely undo the patch by Chris Wilson (and
> I've re-reviewed it to check whether there are any other hidden bugs in
> it).

I can't find anything dodgy in it either, so it's either somewhere else
or really subtle. I certainly don't see much gen2 specific stuff.

> I suspect that your bisect went wrong because the range
> a00b10c360b35d6431a94cb..5e78330126e23e00950 is known to be broken on your
> hw.

It mostly worked after a bit of twiddling (after that I didn't need to skip
commits any more), and the kind of corruption was the same too.

Only way I can see that this would make a difference is when a00b10c3 created
the bug, the bug got fixed by your patches, but something else introduced it
between that range in a different way. This actually makes sense when it's
re-introduced by a bad merge. But then it should show up in the current code.
If not for a bad merge it seems unlikely though.

> Could you recheck these two endpoints to confirm that breakage was
> introduce in that range?

I'm pretty sure the breakage was introduced starting with a00b10c3, because
the commit just before it (746537) works fine. so yes, it's in that range.
I was getting a bit tired and less sharp though, so perhaps I made a mistake
somewhere. I'll recheck, probably tomorrow (getting a bit late here).

> If so, could you rebase this range and squash the
> three patches by me you've mentioned in the original report into the patch
> by Chris and re-run git bisect on this new branch?

As I have to apply patches anyway, it seems easier to just add your patches
to the bunch, if they apply...

Annoying bug, it seems I'm always hitting the tricky ones.

Looking at: gitk a00b10c360b35d6431a94cb..5e78330126e23e00950  \
 drivers/char/agp/ drivers/gpu/drm/i915/

There were three merges around that area, so it seems a bit messy and something
might have gone wrong when merging there. But the right branch seems okay 
because
all later good bisections were from there.

---

I just saw your new email with a new patch to try, I'll try that one first and
report back, before digging further into all this.

Greetings,

Indan

---

First bisection info:

When I did the bisect I had to fix a compile error and an OOPS, patches I
applied were:

commit ff75b9bc489710ff223bc0d805bf3b862dc347ea
Author: Chris Wilson 
Date:   Sat Oct 30 22:52:31 2010 +0100

drm/i915: Fix typo from e5281ccd in i915_gem_attach_phys_object()

Accessing the uninitialised obj->pages instead of the local page lead to
an OOPs.

Reported-by: Xavier Chantry 
Signed-off-by: Chris Wilson 

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 936ddd8..e3fc333 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4925,7 +4925,7 @@ i915_gem_attach_phys_object(struct drm_device *dev,
if (IS_ERR(page))
return PTR_ERR(page);

-   src = kmap_atomic(obj_priv->pages[i]);
+   src = kmap_atomic(page);
dst = obj_priv->phys_obj->handle->vaddr + (i * PAGE_SIZE);
memcpy(dst, src, PAGE_SIZE);
kunmap_atomic(src);

diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 75a3d7f..e1cc56a 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -61,7 +61,8 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)

pagefault_disable();

-   type = kmap_atomic_idx_push();
+// type = kmap_atomic_idx_push();
+   type = 0;
idx = type + KM_TYPE_NR * smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
set_pte(kmap_pte - idx, pfn_pte(pfn, prot));
@@ -98,7 +99,8 @@ iounmap_atomic(void __iomem *kvaddr)
vaddr <= __fix_to_virt(FIX_KMAP_BEGIN)) {
int idx, type;

-   type = kmap_atomic_idx_pop();
+// type = kmap_atomic_idx_pop();
+   type = 0;
idx = type + KM_TYPE_NR * smp_processor_id();

 #ifdef CONFIG_DEBUG_HIGHMEM

With these patches it worked for me and the only difference was that corruption
showing up or not, and the kind of corruption seemed the same too.

I ran bisect on the drm/i915 and char/agp paths. My bisect log was:

# bad: [d2478521afc20227658a10a8c5c2bf1a2aa615b3] char/ipmi: fix OOPS caused by
pnp_unregister_driver on unregistered driver
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect start 'd2478521afc20227658a10a8c5c2bf1a2aa615b3' 'v2.6.37'
'drivers/gpu/drm/i915/' 

[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Daniel Vetter
Hi,

On Sun, Feb 20, 2011 at 03:20:15AM +0100, Indan Zupancic wrote:
> > I've created two quick patches to check a few theories, please test them
> > (both patches independently and both together). Patches attached.
> 
> Tried with both applied, doesn't change anything.

I've just noticed that one of the patches (the 2nd one) doesn't work as
advertised. Please retest with the attached one.

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
-- next part --
A non-text attachment was scrubbed...
Name: for-indan-2.patch
Type: text/x-diff
Size: 453 bytes
Desc: not available
URL: 



[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Daniel Vetter
Hi,

On Sun, Feb 20, 2011 at 03:20:15AM +0100, Indan Zupancic wrote:
> Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
> hardware, 855GM.

Ah, craptastic i855gm. Is this with the HIC poke patch?



> > I've created two quick patches to check a few theories, please test them
> > (both patches independently and both together). Patches attached.
> 
> Tried with both applied, doesn't change anything.

Well, these two patches completely undo the patch by Chris Wilson (and
I've re-reviewed it to check whether there are any other hidden bugs in
it). I suspect that your bisect went wrong because the range
a00b10c360b35d6431a94cb..5e78330126e23e00950 is known to be broken on your
hw. Could you recheck these two endpoints to confirm that breakage was
introduce in that range? If so, could you rebase this range and squash the
three patches by me you've mentioned in the original report into the patch
by Chris and re-run git bisect on this new branch?

Thanks, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi Peter,

On Sun, February 20, 2011 04:51, Peter Stuge wrote:
> Confirm I also have this issue on my X40, but there are other bugs
> that are much more significant so I haven't bothered mentioning this.

What issues? If it's backlight related, try my patch at:
http://lkml.org/lkml/2011/2/16/447

Or if the screen is black after suspend/screen blank then just try a
newer kernel, that got recently fixed. Actually, a lot of bugs were
recently introduced and fixed, with two months ago you're probably
in the new-bugs-only period, so I can recommend trying 2.6.38-rc5.

This screen corruption is the only problem for me, but I don't do anything
fancy with my laptop. The ipw2200 wireless driver is quite crappy, but it
has always been as far as I know. (Though not too long ago the USB got
burnt out, which probably took one RAM module with it, so now I'm running
with 512MB instead of 1GB, which is really noticeable when doing kernel
git stuff or compiles with this very very slow HD.)

Good luck,

Indan

P.S. Anyone any idea where that "Mail-Followup-To" header comes from?
Or is everyone subscribed to dri-devel automatically excluded from
Mail-Followup-To? If that's the case, pardon for the dubplicate mails.




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Peter Stuge
Daniel Vetter wrote:
> Please provide the usual details about your system (especially what gpu
> this is on). Also, screenshots of what typical corruptions look like can
> help a lot in tracking down such things.

Confirm I also have this issue on my X40, but there are other bugs
that are much more significant so I haven't bothered mentioning this.

On the other hand my kernel is nearly two months old.

Please don't mistake this as a complaint. I'm overall rather happy
even with current state and it's issues, and especially I think the
situation is consistently getting better, with some bumps here and
there along the way.

Last time I used a projector I had some trouble, because the graphics
driver was too clever; it automatically set the correct resolution
for the projector, but I needed the resolution of my internal screen. :)


//Peter


[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Huh, no idea why it didn't go to Daniel the first time, Squirrelmail never
did that before.

On Sun, February 20, 2011 03:20, Indan Zupancic wrote:
> Hi,
>
> On Sat, February 19, 2011 19:25, Daniel Vetter wrote:
>> Hi Indan,
>>
>> Please provide the usual details about your system (especially what gpu
>> this is on). Also, screenshots of what typical corruptions look like can
>> help a lot in tracking down such things.
>
> Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
> hardware, 855GM.
>
> I uploaded a screenshot of Firefox to:
> http://img406.imageshack.us/img406/9159/31827167.png
>
> It happens a lot when scrolling in FF on pages with images, pageup or pagedown
> don't exhibit it. Text gets corrupted as well, but in a less predictable way.
> It must be something that FF does that triggers it easily, I haven't managed
> to get it with other programs yet, though I do see some corruption in the
> window decoration too sometimes (which is Fluxbox). Scrolling horizontally
> doesn't show the same behaviour. The corruptions seems to be happening on
> the right side of a surface/window.
>
> Forcing a refresh makes it go away again (e.g. switching to another window or
> opening another window on top of it. Moving a window doesn't though). I get
> it with and without xcompmgr running.
>
> Okay, interesting titbit: For text corruption, the stripes are one pixel high
> for the default font size, no corruption for smaller sizes than the default,
> two pixel lines for 1 ctrl+ bigger than the default, 3 pixels for +2 etc.
> With bigger sizes (>=5) I get less corruption to the left of the text, but
> more in the text itself. The distance between the lines continues to increase,
> but the stripes thickness goes back to 1 pixel.
>
> Perhaps an off-by-one error somewhere?
>
>> I've created two quick patches to check a few theories, please test them
>> (both patches independently and both together). Patches attached.
>
> Tried with both applied, doesn't change anything.
>
> Greetings,
>
> Indan
>




[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi,

On Sat, February 19, 2011 19:25, Daniel Vetter wrote:
> Hi Indan,
>
> Please provide the usual details about your system (especially what gpu
> this is on). Also, screenshots of what typical corruptions look like can
> help a lot in tracking down such things.

Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
hardware, 855GM.

I uploaded a screenshot of Firefox to:
http://img406.imageshack.us/img406/9159/31827167.png

It happens a lot when scrolling in FF on pages with images, pageup or pagedown
don't exhibit it. Text gets corrupted as well, but in a less predictable way.
It must be something that FF does that triggers it easily, I haven't managed
to get it with other programs yet, though I do see some corruption in the
window decoration too sometimes (which is Fluxbox). Scrolling horizontally
doesn't show the same behaviour. The corruptions seems to be happening on
the right side of a surface/window.

Forcing a refresh makes it go away again (e.g. switching to another window or
opening another window on top of it. Moving a window doesn't though). I get
it with and without xcompmgr running.

Okay, interesting titbit: For text corruption, the stripes are one pixel high
for the default font size, no corruption for smaller sizes than the default,
two pixel lines for 1 ctrl+ bigger than the default, 3 pixels for +2 etc.
With bigger sizes (>=5) I get less corruption to the left of the text, but
more in the text itself. The distance between the lines continues to increase,
but the stripes thickness goes back to 1 pixel.

Perhaps an off-by-one error somewhere?

> I've created two quick patches to check a few theories, please test them
> (both patches independently and both together). Patches attached.

Tried with both applied, doesn't change anything.

Greetings,

Indan




Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi Daniel,

On Sun, February 20, 2011 10:20, Daniel Vetter wrote:
 On Sun, Feb 20, 2011 at 03:20:15AM +0100, Indan Zupancic wrote:
 Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
 hardware, 855GM.

 Ah, craptastic i855gm. Is this with the HIC poke patch?

Yeah, that one. That's what I first suspected, so I tried both with and
without it, but this is something new.

 Well, these two patches completely undo the patch by Chris Wilson (and
 I've re-reviewed it to check whether there are any other hidden bugs in
 it).

I can't find anything dodgy in it either, so it's either somewhere else
or really subtle. I certainly don't see much gen2 specific stuff.

 I suspect that your bisect went wrong because the range
 a00b10c360b35d6431a94cb..5e78330126e23e00950 is known to be broken on your
 hw.

It mostly worked after a bit of twiddling (after that I didn't need to skip
commits any more), and the kind of corruption was the same too.

Only way I can see that this would make a difference is when a00b10c3 created
the bug, the bug got fixed by your patches, but something else introduced it
between that range in a different way. This actually makes sense when it's
re-introduced by a bad merge. But then it should show up in the current code.
If not for a bad merge it seems unlikely though.

 Could you recheck these two endpoints to confirm that breakage was
 introduce in that range?

I'm pretty sure the breakage was introduced starting with a00b10c3, because
the commit just before it (746537) works fine. so yes, it's in that range.
I was getting a bit tired and less sharp though, so perhaps I made a mistake
somewhere. I'll recheck, probably tomorrow (getting a bit late here).

 If so, could you rebase this range and squash the
 three patches by me you've mentioned in the original report into the patch
 by Chris and re-run git bisect on this new branch?

As I have to apply patches anyway, it seems easier to just add your patches
to the bunch, if they apply...

Annoying bug, it seems I'm always hitting the tricky ones.

Looking at: gitk a00b10c360b35d6431a94cb..5e78330126e23e00950  \
 drivers/char/agp/ drivers/gpu/drm/i915/

There were three merges around that area, so it seems a bit messy and something
might have gone wrong when merging there. But the right branch seems okay 
because
all later good bisections were from there.

---

I just saw your new email with a new patch to try, I'll try that one first and
report back, before digging further into all this.

Greetings,

Indan

---

First bisection info:

When I did the bisect I had to fix a compile error and an OOPS, patches I
applied were:

commit ff75b9bc489710ff223bc0d805bf3b862dc347ea
Author: Chris Wilson ch...@chris-wilson.co.uk
Date:   Sat Oct 30 22:52:31 2010 +0100

drm/i915: Fix typo from e5281ccd in i915_gem_attach_phys_object()

Accessing the uninitialised obj-pages instead of the local page lead to
an OOPs.

Reported-by: Xavier Chantry chantry.xav...@gmail.com
Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 936ddd8..e3fc333 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4925,7 +4925,7 @@ i915_gem_attach_phys_object(struct drm_device *dev,
if (IS_ERR(page))
return PTR_ERR(page);

-   src = kmap_atomic(obj_priv-pages[i]);
+   src = kmap_atomic(page);
dst = obj_priv-phys_obj-handle-vaddr + (i * PAGE_SIZE);
memcpy(dst, src, PAGE_SIZE);
kunmap_atomic(src);

diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 75a3d7f..e1cc56a 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -61,7 +61,8 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)

pagefault_disable();

-   type = kmap_atomic_idx_push();
+// type = kmap_atomic_idx_push();
+   type = 0;
idx = type + KM_TYPE_NR * smp_processor_id();
vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
set_pte(kmap_pte - idx, pfn_pte(pfn, prot));
@@ -98,7 +99,8 @@ iounmap_atomic(void __iomem *kvaddr)
vaddr = __fix_to_virt(FIX_KMAP_BEGIN)) {
int idx, type;

-   type = kmap_atomic_idx_pop();
+// type = kmap_atomic_idx_pop();
+   type = 0;
idx = type + KM_TYPE_NR * smp_processor_id();

 #ifdef CONFIG_DEBUG_HIGHMEM

With these patches it worked for me and the only difference was that corruption
showing up or not, and the kind of corruption seemed the same too.

I ran bisect on the drm/i915 and char/agp paths. My bisect log was:

# bad: [d2478521afc20227658a10a8c5c2bf1a2aa615b3] char/ipmi: fix OOPS caused by
pnp_unregister_driver on unregistered driver
# good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
git bisect start 'd2478521afc20227658a10a8c5c2bf1a2aa615b3' 

Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi,

Good news:

On Sun, February 20, 2011 11:55, Daniel Vetter wrote:
  I've created two quick patches to check a few theories, please test them
  (both patches independently and both together). Patches attached.

 Tried with both applied, doesn't change anything.

 I've just noticed that one of the patches (the 2nd one) doesn't work as
 advertised. Please retest with the attached one.

This one fixes it!

So ignore my other email, you nailed it. Thanks a heap Daniel.

Tomorrow I'll test more and double check, but it seems fixed.

Of course, looking at the patch, I guess it isn't generally
applicable, so perhaps either more digging or a special
exception for older chipsets is needed. I'm happy to test
any further patches if that helps, triggering the corruption
is really easy for me.

Thanks,

Indan


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Daniel Vetter
Hi,

On Sun, Feb 20, 2011 at 03:20:15AM +0100, Indan Zupancic wrote:
 Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
 hardware, 855GM.

Ah, craptastic i855gm. Is this with the HIC poke patch?

snip

  I've created two quick patches to check a few theories, please test them
  (both patches independently and both together). Patches attached.
 
 Tried with both applied, doesn't change anything.

Well, these two patches completely undo the patch by Chris Wilson (and
I've re-reviewed it to check whether there are any other hidden bugs in
it). I suspect that your bisect went wrong because the range
a00b10c360b35d6431a94cb..5e78330126e23e00950 is known to be broken on your
hw. Could you recheck these two endpoints to confirm that breakage was
introduce in that range? If so, could you rebase this range and squash the
three patches by me you've mentioned in the original report into the patch
by Chris and re-run git bisect on this new branch?

Thanks, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi,

On Sun, February 20, 2011 14:21, Daniel Vetter wrote:
 Well, don't start jumping around, yet. These patches are just to rule
 out some theories.

I know, that's why I said I don't mind testing other patches.

 Now: Is it fixed with just the 2nd patch alone or do you
 need both patches? This is very important, so please test extensively
 whether there are really no corruptions with just the 2nd patch.

I only applied the 2nd version of the 2nd patch, no other patches.
The horizontal garbage stripes are gone for sure. It's a lot easier
to prove that something doesn't work than to prove it does, but I'll
keep running it for a while and see if I can spot any other badness.
If it's hard to trigger it's also very hard to find out what causes it.
Without your 2nd patch I always get the garbage on certain webpages,
so I'm quite confident that the original source is located for this
particular problem.

Greetings,

Indan


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi,

Feel free to ignore this post, it's a it of rambling.

On Sun, February 20, 2011 11:55, Daniel Vetter wrote:
 I've just noticed that one of the patches (the 2nd one) doesn't work
 as advertised. Please retest with the attached one.

I was wondering why that was, so I looked a bit closer:

Old patch:

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 17bd766..fbc21e3 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -763,7 +763,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
value = HAS_BLT(dev);
break;
case I915_PARAM_HAS_RELAXED_FENCING:
-   value = 1;
+   value = 0;
break;
case I915_PARAM_HAS_COHERENT_RINGS:
value = 1;

New patch:

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 17bd766..d275c96 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -762,9 +762,6 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_BLT:
value = HAS_BLT(dev);
break;
-   case I915_PARAM_HAS_RELAXED_FENCING:
-   value = 1;
-   break;
case I915_PARAM_HAS_COHERENT_RINGS:
value = 1;
break;

Looking at userspace intel-dri code it becomes clear why:

gp.param = I915_PARAM_HAS_EXECBUF2;
ret = drmIoctl(bufmgr_gem-fd, DRM_IOCTL_I915_GETPARAM, gp);
if (!ret)
exec2 = 1;

gp.param = I915_PARAM_HAS_BSD;
ret = drmIoctl(bufmgr_gem-fd, DRM_IOCTL_I915_GETPARAM, gp);
bufmgr_gem-has_bsd = ret == 0;

gp.param = I915_PARAM_HAS_BLT;
ret = drmIoctl(bufmgr_gem-fd, DRM_IOCTL_I915_GETPARAM, gp);
bufmgr_gem-has_blt = ret == 0;

gp.param = I915_PARAM_HAS_RELAXED_FENCING;
ret = drmIoctl(bufmgr_gem-fd, DRM_IOCTL_I915_GETPARAM, gp);
bufmgr_gem-has_relaxed_fencing = ret == 0;

The stupid code assumes that if an option is supported, the value is true too,
while the kernel return code just tells whether it knows the option, and the
value is copied to gp.value, but that's ignored by intel-dri!

And a quick look at the other code gives a strong impression of lots of code
dubplication with the kernel driver.

More and more I'm getting the impression that you guys haven't gotten the
interfaces between all bits right yet. All in all you seem to be somewhat
in the same mess as filesystems are with overly complex interaction between
the MM system, VFS and individual filesystems, all doing more or less the
same in slightly different ways, instead of abstracting things properly.
(Which is also just an impression, I haven't looked that close yet.)

To give you an idea of a driver subsystem that does get it right:

All the common libata code is 23k lines. (wc liba*)
All the individual sata drivers code combined are 19k lines. (wc sata_*)

DRM does it differently (only counting .c files):

Common drm code is 21k
i915 is 37k
nouveau is 47k
radeon is 68k

And then there are the userspace drivers:
xf86-video-intel/src/*.c is 16k
drm/intel/ is 4k
mesa/drivers/dri/i915/ is 22k

Of course graphics drivers are a lot more complex than sata drivers,
but because of that extra complexity it's a lot easier to make things
even more compex by getting the interfaces wrong. So what I'm saying is,
there seems a lot of room for improvements. The intel driver code I've
seen is mostly busy with infrastructure stuff and talking to other bits
and pieces, but it doesn't seem to do much actual work.

To come back to the I915_PARAM_HAS_RELAXED_FENCING thing:

Why is this interface there at all? It seems like a driver internal detail
thing. Either it should be handled entirely in the kernel driver, and this
interface wouldn't be there, or, if the userspace driver has to know about
it, it should be entirely handled by the userspace driver and not done by
the kernel driver at all.

Another slightly annoying thing: The code is littered with gen checks,
while very often the only difference is a register or size value. Why
not put those in a gen specific hardware description structure which is
used unconditionally, instead of having a lot of almost the same code?

The current design seems overly complex and fragile, and I think you guys
make it more difficult for yourself than necessary.

Greetings,

Indan


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Peter Stuge
Indan Zupancic wrote:
  Confirm I also have this issue on my X40, but there are other bugs
  that are much more significant so I haven't bothered mentioning this.
 
 What issues?

The one I confirmed is corrupted graphics within Gecko. I haven't had
Xv working for a long time either. Not sure if I've tried it with
this kernel though.


 If it's backlight related, try my patch at:
 http://lkml.org/lkml/2011/2/16/447

Yeah also have backlight issue whenever the backlight level changes
by other means than Fn+Home/End. (Lid switch, screen blank)
I noticed the patch and that it solved the issue for someone. I'm not
too inconvenienced by this issue though.


 Or if the screen is black after suspend/screen blank

Oh I've stopped using suspend since using KMS. I get way too angry
about all the state that I lose if resume fails so I don't risk it.


 Actually, a lot of bugs were recently introduced and fixed, with
 two months ago you're probably in the new-bugs-only period, so I
 can recommend trying 2.6.38-rc5.

Yeah, I think it's time to pull Linus' git. I've been keeping an eye
on things i915 on the list for a good while already.


 This screen corruption is the only problem for me, but I don't do
 anything fancy with my laptop. The ipw2200 wireless driver is quite
 crappy, but it has always been as far as I know.

I've used ipw2200 with great success for many years, but these days
I'm having fun (no, not at all) with ath9k where there is some very
fundamental hardware issue between laptop and card. I'd need to hook
up logic analyzer to say anything concrete, but I have no end of
problems with internal ath9k in my machine. It's completely unusable.

The only other annoying issue I have is that as wine enumerates
available screen resolutions i915 goes out to the VGA connector,
which on 855 always means a 600ms timeout when nothing is connected,
but this is a bit tricky because the hardware just can not tell if
anything is connected.

I would be very happy if there was a knob for enabling/disabling the
VGA connector though.


 Good luck,

Thanks, you too!


//Peter
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-20 Thread Indan Zupancic
Hi Peter,

The new corruption is fixed with daniel's patch from
https://lkml.org/lkml/2011/2/20/25

On Mon, February 21, 2011 05:10, Peter Stuge wrote:
 Indan Zupancic wrote:
  Confirm I also have this issue on my X40, but there are other bugs
  that are much more significant so I haven't bothered mentioning this.

 What issues?

 The one I confirmed is corrupted graphics within Gecko. I haven't had
 Xv working for a long time either. Not sure if I've tried it with
 this kernel though.

Hmm, I recommend upgrading all the graphics userspace too then,
current stuff works pretty well for me.


 If it's backlight related, try my patch at:
 http://lkml.org/lkml/2011/2/16/447

 Yeah also have backlight issue whenever the backlight level changes
 by other means than Fn+Home/End. (Lid switch, screen blank)
 I noticed the patch and that it solved the issue for someone. I'm not
 too inconvenienced by this issue though.

Well, it was for me, because I suspend all the time.


 Or if the screen is black after suspend/screen blank

 Oh I've stopped using suspend since using KMS. I get way too angry
 about all the state that I lose if resume fails so I don't risk it.

I never had a failed resume with this laptop, not even once. But I
avoided the early days of KMS because of all the instability.


 Actually, a lot of bugs were recently introduced and fixed, with
 two months ago you're probably in the new-bugs-only period, so I
 can recommend trying 2.6.38-rc5.

 Yeah, I think it's time to pull Linus' git. I've been keeping an eye
 on things i915 on the list for a good while already.

There were coherency bugs for 855, but those seem to be fixed in the
newer kernel too (2.6.37?), so another reason to upgrade.


 This screen corruption is the only problem for me, but I don't do
 anything fancy with my laptop. The ipw2200 wireless driver is quite
 crappy, but it has always been as far as I know.

 I've used ipw2200 with great success for many years, but these days
 I'm having fun (no, not at all) with ath9k where there is some very
 fundamental hardware issue between laptop and card. I'd need to hook
 up logic analyzer to say anything concrete, but I have no end of
 problems with internal ath9k in my machine. It's completely unusable.

My impression is that ipw2200's firmware is crappy. Everything is fine
with a good signal, but with a bad one things don't work out too well
after a time.

 The only other annoying issue I have is that as wine enumerates
 available screen resolutions i915 goes out to the VGA connector,
 which on 855 always means a 600ms timeout when nothing is connected,
 but this is a bit tricky because the hardware just can not tell if
 anything is connected.

I have the impression that got better too, but I haven't followed the
development. Can't you just lower the timeout in the code, or let it
cache the VGA status? The latter is what I'd expect the code to do.

Running xrandr to see the modes takes 0.3s here.

 I would be very happy if there was a knob for enabling/disabling the
 VGA connector though.


 Good luck,

 Thanks, you too!

I can recommend using PHC too, if you don't already: www.linux-phc.org
With it I almost never have the fan running and it uses a lot less power
when under load. (But don't use it till you got your system stable.)

Greetings,

Indan


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-19 Thread Daniel Vetter
Hi Indan,

Please provide the usual details about your system (especially what gpu
this is on). Also, screenshots of what typical corruptions look like can
help a lot in tracking down such things.

I've created two quick patches to check a few theories, please test them
(both patches independently and both together). Patches attached.

Yours, Daniel

On Sat, Feb 19, 2011 at 06:58:06AM +0100, Indan Zupancic wrote:
> Hello,
> 
> Since 2.6.38-rc I get screen corruption (mostly horizontal grabage stripes on
> the right side of the screen). After a long time bisecting the offending 
> commit
> ends up being:
> 
> commit a00b10c360b35d6431a94cbf130a4e162870d661
> Author: Chris Wilson 
> Date:   Fri Sep 24 21:15:47 2010 +0100
> 
> drm/i915: Only enforce fence limits inside the GTT.
> 
> So long as we adhere to the fence registers rules for alignment and no
> overlaps (including with unfenced accesses to linear memory) and account
> for the tiled access in our size allocation, we do not have to allocate
> the full fenced region for the object. This allows us to fight the bloat
> tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside
> the GTT we still suffer the additional alignment constraints, so it 
> doesn't
> magic allow us to render larger scenes without stalls -- we need the
> expanded GTT and fence pipelining to overcome those...]
> 
> Signed-off-by: Chris Wilson 
> 
> This commit caused other problems too, which Daniel tried to fix with commits:
> 
> 5e78330126e23e00950 drm/i915: fix relaxed tiling for gen <= 3 && !g33
> 75e9e9158f38e5cb21e drm/i915: kill mappable/fenceable disdinction
> 818f2a3cc34b0673dcc drm/i915: revert pageflip/mappable related abi breakage
> 
> But those don't fix my screen corruption.
> 
> Unfortunately, it's a big commit and it doesn't revert cleanly, and its size
> makes it unclear what the source of the problem is. Daniel's commits don't
> revert cleanly either, so reverting all of them didn't work.
> 
> I'll start poking at it and see if I can find anything.
> 
> Greetings,
> 
> Indan
> 
> 

-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
-- next part --
A non-text attachment was scrubbed...
Name: for-indan-1.patch
Type: text/x-diff
Size: 452 bytes
Desc: not available
URL: 

-- next part --
A non-text attachment was scrubbed...
Name: for-indan-2.patch
Type: text/x-diff
Size: 430 bytes
Desc: not available
URL: 



[BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-19 Thread Indan Zupancic
Hello,

Since 2.6.38-rc I get screen corruption (mostly horizontal grabage stripes on
the right side of the screen). After a long time bisecting the offending commit
ends up being:

commit a00b10c360b35d6431a94cbf130a4e162870d661
Author: Chris Wilson 
Date:   Fri Sep 24 21:15:47 2010 +0100

drm/i915: Only enforce fence limits inside the GTT.

So long as we adhere to the fence registers rules for alignment and no
overlaps (including with unfenced accesses to linear memory) and account
for the tiled access in our size allocation, we do not have to allocate
the full fenced region for the object. This allows us to fight the bloat
tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside
the GTT we still suffer the additional alignment constraints, so it doesn't
magic allow us to render larger scenes without stalls -- we need the
expanded GTT and fence pipelining to overcome those...]

Signed-off-by: Chris Wilson 

This commit caused other problems too, which Daniel tried to fix with commits:

5e78330126e23e00950 drm/i915: fix relaxed tiling for gen <= 3 && !g33
75e9e9158f38e5cb21e drm/i915: kill mappable/fenceable disdinction
818f2a3cc34b0673dcc drm/i915: revert pageflip/mappable related abi breakage

But those don't fix my screen corruption.

Unfortunately, it's a big commit and it doesn't revert cleanly, and its size
makes it unclear what the source of the problem is. Daniel's commits don't
revert cleanly either, so reverting all of them didn't work.

I'll start poking at it and see if I can find anything.

Greetings,

Indan




Re: [BUG] drm/i915 Screen corruption introduced by a00b10c360b35d6431a94cb

2011-02-19 Thread Indan Zupancic
Hi,

On Sat, February 19, 2011 19:25, Daniel Vetter wrote:
 Hi Indan,

 Please provide the usual details about your system (especially what gpu
 this is on). Also, screenshots of what typical corruptions look like can
 help a lot in tracking down such things.

Sorry, I plainly forgot to mention that. I'm on my Thinkpad X40 with gen 2
hardware, 855GM.

I uploaded a screenshot of Firefox to:
http://img406.imageshack.us/img406/9159/31827167.png

It happens a lot when scrolling in FF on pages with images, pageup or pagedown
don't exhibit it. Text gets corrupted as well, but in a less predictable way.
It must be something that FF does that triggers it easily, I haven't managed
to get it with other programs yet, though I do see some corruption in the
window decoration too sometimes (which is Fluxbox). Scrolling horizontally
doesn't show the same behaviour. The corruptions seems to be happening on
the right side of a surface/window.

Forcing a refresh makes it go away again (e.g. switching to another window or
opening another window on top of it. Moving a window doesn't though). I get
it with and without xcompmgr running.

Okay, interesting titbit: For text corruption, the stripes are one pixel high
for the default font size, no corruption for smaller sizes than the default,
two pixel lines for 1 ctrl+ bigger than the default, 3 pixels for +2 etc.
With bigger sizes (=5) I get less corruption to the left of the text, but
more in the text itself. The distance between the lines continues to increase,
but the stripes thickness goes back to 1 pixel.

Perhaps an off-by-one error somewhere?

 I've created two quick patches to check a few theories, please test them
 (both patches independently and both together). Patches attached.

Tried with both applied, doesn't change anything.

Greetings,

Indan


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel