[Bug 27722] [865] no compiz after upgrade from 7.7.1 to 7.8.1

2010-05-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=27722

Gordon Jin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||DUPLICATE

--- Comment #1 from Gordon Jin  2010-05-11 01:45:26 PDT 
---


*** This bug has been marked as a duplicate of bug 27615 ***

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


fb+drm: possible circular locking dependency detected

2010-05-11 Thread Clemens Ladisch
With radeon KMS, after using a program accessing the framebuffer,
I started X and got this:

===
[ INFO: possible circular locking dependency detected ]
2.6.34-rc6 #117
---
X/1846 is trying to acquire lock:
 (&mm->mmap_sem){++}, at: [] might_fault+0x57/0xa4

but task is already holding lock:
 (&dev->mode_config.mutex){+.+.+.}, at: [] drm_mode_getresources+0x33/0x54d

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&dev->mode_config.mutex){+.+.+.}:
   [] __lock_acquire+0x1408/0x1747
   [] lock_acquire+0x5a/0x71
   [] mutex_lock_nested+0x58/0x2b1
   [] drm_fb_helper_set_par+0x9c/0xf2
   [] fb_set_var+0x1de/0x2d9
   [] do_fb_ioctl+0x13a/0x46e
   [] fb_ioctl+0x21/0x23
   [] vfs_ioctl+0x2a/0x9e
   [] do_vfs_ioctl+0x4b7/0x4f4
   [] sys_ioctl+0x42/0x65
   [] system_call_fastpath+0x16/0x1b

-> #1 (&fb_info->lock){+.+.+.}:
   [] __lock_acquire+0x1408/0x1747
   [] lock_acquire+0x5a/0x71
   [] mutex_lock_nested+0x58/0x2b1
   [] fb_release+0x1c/0x54
   [] __fput+0x120/0x1e3
   [] fput+0x18/0x1a
   [] remove_vma+0x51/0x76
   [] do_munmap+0x30a/0x32e
   [] sys_munmap+0x3b/0x54
   [] system_call_fastpath+0x16/0x1b

-> #0 (&mm->mmap_sem){++}:
   [] __lock_acquire+0x112d/0x1747
   [] lock_acquire+0x5a/0x71
   [] might_fault+0x84/0xa4
   [] drm_mode_getresources+0x280/0x54d
   [] drm_ioctl+0x255/0x34b
   [] vfs_ioctl+0x2a/0x9e
   [] do_vfs_ioctl+0x4b7/0x4f4
   [] sys_ioctl+0x42/0x65
   [] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by X/1846:
 #0:  (&dev->mode_config.mutex){+.+.+.}, at: [] drm_mode_getresources+0x33/0x54d

stack backtrace:
Pid: 1846, comm: X Not tainted 2.6.34-rc6 #117
Call Trace:
 [] print_circular_bug+0xb3/0xc2
 [] __lock_acquire+0x112d/0x1747
 [] ? mark_held_locks+0x4d/0x6b
 [] ? mutex_lock_nested+0x296/0x2b1
 [] lock_acquire+0x5a/0x71
 [] ? might_fault+0x57/0xa4
 [] might_fault+0x84/0xa4
 [] ? might_fault+0x57/0xa4
 [] drm_mode_getresources+0x280/0x54d
 [] drm_ioctl+0x255/0x34b
 [] ? drm_mode_getresources+0x0/0x54d
 [] ? up_read+0x1e/0x35
 [] vfs_ioctl+0x2a/0x9e
 [] do_vfs_ioctl+0x4b7/0x4f4
 [] ? sysret_check+0x27/0x62
 [] sys_ioctl+0x42/0x65
 [] system_call_fastpath+0x16/0x1b
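
Read as a chain, the report names three lock classes taken in a loop:
mmap_sem is held while fb_info->lock is taken (the munmap path),
fb_info->lock is held while mode_config.mutex is taken (the fb ioctl path),
and mode_config.mutex is now held while mmap_sem may be taken (might_fault
in drm_mode_getresources). As a rough user-space illustration of that kind
of check, the sketch below records "held while acquiring" edges and refuses
the one that would close a cycle. The enum, struct, and function names are
invented for this sketch and are not lockdep's internals.

```c
#include <assert.h>
#include <stdbool.h>

/* Lock classes named in the report; the enum itself is illustrative. */
enum lock_class { MMAP_SEM, FB_INFO_LOCK, MODE_CONFIG_MUTEX, NR_LOCK_CLASSES };

/* held_before[a][b] is true when b has been acquired while a was held. */
struct lock_graph {
	bool held_before[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

/* Depth-first search: can `to` be reached from `from` along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++)
		if (g->held_before[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquired `taken` while holding `held`".  Returns false, and
 * records nothing, if the new edge would close a cycle.
 */
static bool record_dependency(struct lock_graph *g, int held, int taken)
{
	bool seen[NR_LOCK_CLASSES] = { false };

	assert(held != taken);
	if (reachable(g, taken, held, seen))
		return false;
	g->held_before[held][taken] = true;
	return true;
}
```

Feeding it the three edges in the order mmap_sem -> fb_info->lock,
fb_info->lock -> mode_config.mutex, mode_config.mutex -> mmap_sem makes the
third call return false, which corresponds to the point at which the report
above is printed.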


DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello,

With the latest git kernel, I am getting the following DRM error and
X does not start:

[   45.269075] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   45.269111] [ cut here ]
[   45.269139] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269150] Hardware name: Aspire one
[   45.269158] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269198] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #6
[   45.269208] Call Trace:
[   45.269231]  [] warn_slowpath_common+0x65/0x7c
[   45.269249]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269267]  [] warn_slowpath_null+0xd/0x10
[   45.269284]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269304]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269321]  [] kmap_atomic+0xe/0x10
[   45.269341]  [] i915_error_object_create+0xea/0x14f
[   45.269359]  [] i915_handle_error+0x369/0x868
[   45.269380]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269399]  [] run_timer_softirq+0x1c9/0x269
[   45.269417]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269435]  [] __do_softirq+0xc6/0x186
[   45.269451]  [] do_softirq+0x26/0x2b
[   45.269466]  [] irq_exit+0x29/0x66
[   45.269484]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269504]  [] apic_timer_interrupt+0x2a/0x30
[   45.269524]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.269542]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.269563]  [] cpuidle_idle_call+0x6b/0xda
[   45.269580]  [] cpu_idle+0x44/0x74
[   45.269598]  [] start_secondary+0x1b2/0x1b7
[   45.269612] ---[ end trace ce01d7ca0ae214f4 ]---
[   45.269631] [ cut here ]
[   45.269647] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269657] Hardware name: Aspire one
[   45.269665] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269700] Pid: 0, comm: swapper Tainted: GW  2.6.34-rc7-netbook #6
[   45.269710] Call Trace:
[   45.269726]  [] warn_slowpath_common+0x65/0x7c
[   45.269743]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269760]  [] warn_slowpath_null+0xd/0x10
[   45.269777]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269795]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269812]  [] kmap_atomic+0xe/0x10
[   45.269829]  [] i915_error_object_create+0xea/0x14f
[   45.269848]  [] i915_handle_error+0x369/0x868
[   45.269868]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269885]  [] run_timer_softirq+0x1c9/0x269
[   45.269903]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269920]  [] __do_softirq+0xc6/0x186
[   45.269937]  [] do_softirq+0x26/0x2b
[   45.269952]  [] irq_exit+0x29/0x66
[   45.269968]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269985]  [] apic_timer_interrupt+0x2a/0x30
[   45.270004]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.270051]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.270071]  [] cpuidle_idle_call+0x6b/0xda
[   45.270087]  [] cpu_idle+0x44/0x74
[   45.270104]  [] start_secondary+0x1b2/0x1b7
[   45.270117] ---[ end trace ce01d7ca0ae214f5 ]---
[   45.270135] [ cut here ]

dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7.txt
.config : http://userweb.kernel.org/~jaswinder/acer_netbook/config_2634-rc7.txt

How can I fix these errors?

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
 wrote:
> Hello,
> 
> With latest git kernel, I am getting following DRM error and not
> getting XWindows :

[snip]

Hmm, there are still patches for capturing error state that haven't gone
upstream, shame on me.

That error is a secondary issue to the GPU hang that is being reported. If
it is a regression caused by a kernel update it would be very useful if
you could bisect to the erroneous commit.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Tue, May 11, 2010 at 9:40 PM, Chris Wilson  wrote:
> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
>  wrote:
>> Hello,
>>
>> With latest git kernel, I am getting following DRM error and not
>> getting XWindows :
>
> [snip]
>
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
>
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.
>

Earlier I was using Moblin; I switched to Fedora and started getting
this error. I have also tested different kernel versions but get the
same error, so I do not think this is a regression.

moblin dmesg : 
http://userweb.kernel.org/~jaswinder/moblin/dmesg-moblin_2633rc5.txt

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
wrote:

> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
>  wrote:
> > Hello,
> > 
> > With latest git kernel, I am getting following DRM error and not
> > getting XWindows :
> 
> [snip]
> 
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
> 
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.

It helps if one reads the code and the trace...

i915_error_object_create() is using KM_USER0 from softirq context. 
That's a bug, and a pretty serious one.  If some innocent civilian is
writing highmem data to disk and this timer interrupt fires and trashes
his KM_USER0 slot, the disk contents will be corrupted.

Something like this...

--- a/drivers/gpu/drm/i915/i915_irq.c~a
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
 
 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_

Please let's get a tested fix for this into 2.6.34.
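
The slot collision described in this message can be modelled in user space.
The sketch below is only a toy: each slot is a single "window" that points at
whichever page was mapped last, and the interrupt is simulated by a plain
function call in the middle of the process-context code. The slot names
KM_USER0 and KM_IRQ0 come from the thread; the model_* names and
MODEL_PAGE_SIZE are invented here. Where a real kernel could end up writing
through a stale mapping, the model just reports that the mapping was lost
and skips the write.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16

/* Slot names follow the thread; everything else is a user-space model. */
enum km_slot { KM_USER0, KM_IRQ0, KM_SLOT_COUNT };

struct model_page { char data[MODEL_PAGE_SIZE]; };

/* One fixed "window" per slot; mapping a page points the window at it. */
static struct model_page *window[KM_SLOT_COUNT];

static void model_map(struct model_page *page, enum km_slot slot)
{
	window[slot] = page;	/* silently replaces any earlier mapping */
}

static void model_unmap(enum km_slot slot)
{
	window[slot] = NULL;
}

/* Stand-in for the timer handler: snapshot `src` through the given slot. */
static void model_irq_copy(struct model_page *src, char *dst, enum km_slot slot)
{
	model_map(src, slot);
	memcpy(dst, window[slot]->data, MODEL_PAGE_SIZE);
	model_unmap(slot);
}

/*
 * Process context maps `mine` at KM_USER0, is "interrupted" by a handler
 * that uses `irq_slot`, then tries to write through its window.  Returns 1
 * if the write reached `mine`, 0 if the mapping had been lost.
 */
static int model_interrupted_write(struct model_page *mine,
				   struct model_page *other,
				   enum km_slot irq_slot)
{
	char snapshot[MODEL_PAGE_SIZE];
	int ok;

	model_map(mine, KM_USER0);
	model_irq_copy(other, snapshot, irq_slot);	/* interrupt fires here */

	ok = (window[KM_USER0] == mine);
	if (ok)
		window[KM_USER0]->data[0] = 'x';
	model_unmap(KM_USER0);
	return ok;
}
```

With irq_slot set to KM_USER0 the function returns 0, which is the situation
the message warns about; with KM_IRQ0 it returns 1. The patch above also
wraps the copy in local_irq_save()/local_irq_restore(), presumably so that
another interrupt cannot reuse KM_IRQ0 in the middle of the copy; the model
does not attempt to show that part.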


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton  
wrote:
> 
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> wrote:
> 
> > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
> >  wrote:
> > > Hello,
> > > 
> > > With latest git kernel, I am getting following DRM error and not
> > > getting XWindows :
> > 
> > [snip]
> > 
> > Hmm, there are still patches for capturing error state that haven't gone
> > upstream, shame on me.
> > 
> > That error is a secondary issue to the GPU hang that is being reported. If
> > it is a regression caused by a kernel update it would be very useful if
> > you could bisect to the erroneous commit.
> 
> It helps if one reads the code and the trace...
> 
> i915_error_object_create() is using KM_USER0 from softirq context. 
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
> 
> Something like this...
> 
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>  
>   for (page = 0; page < page_count; page++) {
>   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + unsigned long flags;
> +
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> + local_irq_save(flags);
> + s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>   memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> + kunmap_atomic(s, KM_IRQ0);
> + local_irq_restore(flags);
>   dst->pages[page] = d;
>   }
>   dst->page_count = page_count;
> _
> 
> Please let's get a tested fix for this into 2.6.34.

The change that I actually want is to replace the kmap_atomic(cpu_page) with an
io_mapping_map_atomic_wc(gtt_page): in case there is an incoherency between
the CPU and the GPU, we want to record what the GPU executed. Do you know
if similar precautions are required with io_mapping_map_atomic_wc()?

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
Directly read the GTT mapping for the contents of the batch buffers
rather than relying on possibly stale CPU caches. Also for completeness
scan the flushing/inactive lists for the current buffers - we are
collecting error state after all.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_irq.c |   64 ++
 1 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 87113da..14301a4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -441,9 +441,11 @@ static struct drm_i915_error_object *
 i915_error_object_create(struct drm_device *dev,
 			 struct drm_gem_object *src)
 {
+	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_error_object *dst;
 	struct drm_i915_gem_object *src_priv;
 	int page, page_count;
+	u32 reloc_offset;
 
 	if (src == NULL)
 		return NULL;
@@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
 	if (dst == NULL)
 		return NULL;
 
+	reloc_offset = src_priv->gtt_offset;
 	for (page = 0; page < page_count; page++) {
-		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		void __iomem *s;
+		void *d;
+
+		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
-		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+
+		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
+					     reloc_offset);
+		memcpy_fromio(d, s, PAGE_SIZE);
+		io_mapping_unmap_atomic(s);
+
 		dst->pages[page] = d;
+
+		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = page_count;
 	dst->gtt_offset = src_priv->gtt_offset;
@@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device *dev)
 
 		if (batchbuffer[1] == NULL &&
 		    error->acthd >= obj_priv->gtt_offset &&
-		    error->acthd < obj_priv->gtt_offset + obj->size &&
-		    batchbuffer[0] != obj)
+		    error->acthd < obj_priv->gtt_offset + obj->size)
 			batchbuffer[1] = obj;
 
 		count++;
 	}
+	/* Scan the other lists for completeness for those bizarre errors. */
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
 
 	/* We need to copy these to an anonymous buffer as the simplest
 	 * method to avoid being overwritten by userpace.
 	 */
 	error->batchbuffer[0] = i915_error_object_create(dev, batchbuffer[0]);
-	error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	if (batchbuffer[1] != batchbuffer[0])
+		error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	else
+		error->batchbuffer[1] = NULL;
 
 	/* Record the ringbuffer */
 	error->ringbuffer = i915_error_object_create(dev, dev_priv->ring.ring_obj);
-- 
1.7.1
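
The buffer lookup in the second half of the patch (find the object whose GTT
range contains bbaddr, and the one containing ACTHD, trying further lists
only while one of them is still missing, then dropping the second if it is
the same object as the first) can be sketched in plain user-space C. This is
only an illustration under assumptions: struct model_obj, struct obj_list,
and the function names are hypothetical stand-ins for the driver's objects
and lists, and arrays replace the kernel's linked lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GEM object bound into the GTT. */
struct model_obj {
	uint32_t gtt_offset;
	uint32_t size;
};

/*
 * True when addr lies in [gtt_offset, gtt_offset + size).  Written as a
 * subtraction so that the comparison cannot wrap around in 32 bits.
 */
static int obj_contains(const struct model_obj *obj, uint32_t addr)
{
	return addr >= obj->gtt_offset && addr - obj->gtt_offset < obj->size;
}

/* First object in objs[0..count) containing addr, or NULL. */
static const struct model_obj *
find_in_list(const struct model_obj *objs, size_t count, uint32_t addr)
{
	for (size_t i = 0; i < count; i++)
		if (obj_contains(&objs[i], addr))
			return &objs[i];
	return NULL;
}

struct obj_list {
	const struct model_obj *objs;
	size_t count;
};

/*
 * Look up the objects containing bbaddr and acthd, trying each list in
 * order and stopping once both are found.  out[1] is cleared when it would
 * duplicate out[0], so the same buffer is not captured twice.
 */
static void find_batchbuffers(const struct obj_list *lists, size_t nlists,
			      uint32_t bbaddr, uint32_t acthd,
			      const struct model_obj *out[2])
{
	out[0] = out[1] = NULL;
	for (size_t i = 0; i < nlists && !(out[0] && out[1]); i++) {
		if (out[0] == NULL)
			out[0] = find_in_list(lists[i].objs, lists[i].count,
					      bbaddr);
		if (out[1] == NULL)
			out[1] = find_in_list(lists[i].objs, lists[i].count,
					      acthd);
	}
	if (out[1] == out[0])
		out[1] = NULL;
}
```

Passing the lists in the order active, flushing, inactive mirrors the order
in which the patch scans them.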


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Andrew,

On Tue, May 11, 2010 at 8:18 PM, Andrew Morton
 wrote:
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> wrote:
>
>> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
>>  wrote:
>> > Hello,
>> >
>> > With latest git kernel, I am getting following DRM error and not
>> > getting XWindows :
>>
>> [snip]
>>
>> Hmm, there are still patches for capturing error state that haven't gone
>> upstream, shame on me.
>>
>> That error is a secondary issue to the GPU hang that is being reported. If
>> it is a regression caused by a kernel update it would be very useful if
>> you could bisect to the erroneous commit.
>
> It helps if one reads the code and the trace...
>
> i915_error_object_create() is using KM_USER0 from softirq context.
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
>
> Something like this...
>
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>
>        for (page = 0; page < page_count; page++) {
>                void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +               unsigned long flags;
> +
>                if (d == NULL)
>                        goto unwind;
> -               s = kmap_atomic(src_priv->pages[page], KM_USER0);
> +               local_irq_save(flags);
> +               s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>                memcpy(d, s, PAGE_SIZE);
> -               kunmap_atomic(s, KM_USER0);
> +               kunmap_atomic(s, KM_IRQ0);
> +               local_irq_restore(flags);
>                dst->pages[page] = d;
>        }
>        dst->page_count = page_count;
> _
>
> Please let's get a tested fix for this into 2.6.34.
>

I tested your patch with the latest Linus git tree and it works: it fixes
the softirq error.

Now I am only getting DRM errors:

[   42.276059] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   42.276398] render error detected, EIR: 0x
[   42.276460] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 18 at 17)

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:19:26 +0100 Chris Wilson  
wrote:

> On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton  
> wrote:
> > 
> > On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> > wrote:
> > 
> > > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
> > >  wrote:
> > > > Hello,
> > > > 
> > > > With latest git kernel, I am getting following DRM error and not
> > > > getting XWindows :
> > > 
> > > [snip]
> > > 
> > > Hmm, there are still patches for capturing error state that haven't gone
> > > upstream, shame on me.
> > > 
> > > That error is a secondary issue to the GPU hang that is being reported. If
> > > it is a regression caused by a kernel update it would be very useful if
> > > you could bisect to the erroneous commit.
> > 
> > It helps if one reads the code and the trace...
> > 
> > i915_error_object_create() is using KM_USER0 from softirq context. 
> > That's a bug, and a pretty serious one.  If some innocent civilian is
> > writing highmem data to disk and this timer interrupt fires and trashes
> > his KM_USER0 slot, the disk contents will be corrupted.
> > 
> > Something like this...
> > 
> > --- a/drivers/gpu/drm/i915/i915_irq.c~a
> > +++ a/drivers/gpu/drm/i915/i915_irq.c
> > @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
> >  
> > for (page = 0; page < page_count; page++) {
> > void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   unsigned long flags;
> > +
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > +   local_irq_save(flags);
> > +   s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
> > memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +   kunmap_atomic(s, KM_IRQ0);
> > +   local_irq_restore(flags);
> > dst->pages[page] = d;
> > }
> > dst->page_count = page_count;
> > _
> > 
> > Please let's get a tested fix for this into 2.6.34.
> 
> The change that I actually want is to replace the kmap_atomic(cpu_page) with an
> io_mapping_map_atomic_wc(gtt_page), in case there is a incoherency between
> the CPU and the GPU, we want to record what the GPU executed. Do you know
> how if similar precautions are required with io_mapping_map_atomic_wc()?

gack, wtf is io_mapping_map_atomic_wc()?



Could do with some interface documentation. Looks too large to be inlined.

No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
KM_foo kmap slot index.
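
One possible shape for that suggestion, as a user-space sketch only: the map
call takes the slot as an argument, and a debug check counts calls that pick
a process-context slot while running in interrupt context. Every name below
(struct model_io_mapping, model_io_map_atomic, SLOT_USER0, SLOT_IRQ0,
model_in_irq, model_misuse_count) is hypothetical, and the check is far
simpler than whatever debug_kmap_atomic() really enforces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot identifiers standing in for KM_USER0 and KM_IRQ0. */
enum model_slot { SLOT_USER0, SLOT_IRQ0 };

/* Hypothetical mapping object: a base pointer plus a length in bytes. */
struct model_io_mapping {
	const char *base;
	size_t len;
};

/* In this model the caller states which context it is running in. */
static bool model_in_irq;

/* Number of calls that used a process-context slot from interrupt context. */
static unsigned int model_misuse_count;

/*
 * Map `offset` within `m` using the slot chosen by the caller.  A misuse is
 * counted rather than warned about, to keep the sketch free of I/O.
 */
static const char *model_io_map_atomic(const struct model_io_mapping *m,
				       size_t offset, enum model_slot slot)
{
	assert(offset < m->len);
	if (model_in_irq && slot == SLOT_USER0)
		model_misuse_count++;
	return m->base + offset;
}
```

With the slot in the signature, a caller running from a timer handler would
pass the interrupt slot explicitly, and a hardwired user slot would no longer
be hidden inside the helper.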




Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
wrote:

> + reloc_offset = src_priv->gtt_offset;
>   for (page = 0; page < page_count; page++) {
> - void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + void __iomem *s;
> + void *d;
> +
> + d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> - memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> +
> + s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +  reloc_offset);
> + memcpy_fromio(d, s, PAGE_SIZE);
> + io_mapping_unmap_atomic(s);

As mentioned in the other email, this will still corrupt the KM_USER0
slot, and will generate a debug_kmap_atomic() warning.



Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:37:22 -0400, Andrew Morton  
wrote:
> On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
> wrote:
> 
> > +   reloc_offset = src_priv->gtt_offset;
> > for (page = 0; page < page_count; page++) {
> > -   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   void __iomem *s;
> > +   void *d;
> > +
> > +   d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > -   memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +
> > +   s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> > +reloc_offset);
> > +   memcpy_fromio(d, s, PAGE_SIZE);
> > +   io_mapping_unmap_atomic(s);
> 
> As mentioned in the other email, this will still corrupt the KM_USER0
> slot, and will generate a debug_kmap_atomic() warning.

How, as kmap_atomic(KM_USER0) is no longer used?
-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton  
wrote:
> No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> KM_foo kmap slot index.

Argh, sorry for the noise, read the mail in the wrong order. Thanks for
the review. It would be sensible to go with your simpler patch whilst
io_mapping_map_atomic_wc() is improved.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:52:31 +0100
Chris Wilson  wrote:

> On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton  
> wrote:
> > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > KM_foo kmap slot index.
> 
> Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> the review. It would be sensible to go with your simpler patch whilst
> io_mapping_map_atomic_wc() is improved.

OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
Should I include this?


Subject: drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot
From: Andrew Morton 

i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot.  Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput 
Tested-by: Jaswinder Singh Rajput 
Cc: Chris Wilson 
Cc: Dave Airlie 
Signed-off-by: Andrew Morton 
---

 drivers/gpu/drm/i915/i915_irq.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff -puN drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot drivers/gpu/drm/i915/i915_irq.c
--- a/drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -461,11 +461,15 @@ i915_error_object_create(struct drm_devi
 
 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_



Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris and Andrew,

On Tue, May 11, 2010 at 11:52 PM, Chris Wilson  wrote:
> Directly read the GTT mapping for the contents of the batch buffers
> rather than relying on possibly stale CPU caches. Also for completeness
> scan the flushing/inactive lists for the current buffers - we are
> collecting error state after all.
>
> Signed-off-by: Chris Wilson 

Yes, I have tested this patch.

I booted 3 times, and this patch fixes the DRM errors as well as the softirq
warnings, and X starts with this patch.

I am still doing more testing.

Thanks,
--
Jaswinder Singh.

Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris and Andrew,

I did further testing and noticed that this patch fixes the boot
errors and warnings, and X starts.

However, X freezes after some time.

Thanks,
--
Jaswinder Singh.

>>        }
>>        dst->page_count = page_count;
>>        dst->gtt_offset = src_priv->gtt_offset;
>> @@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device 
>> *dev)
>>
>>                if (batchbuffer[1] == NULL &&
>>                    error->acthd >= obj_priv->gtt_offset &&
>> -                   error->acthd < obj_priv->gtt_offset + obj->size &&
>> -                   batchbuffer[0] != obj)
>> +                   error->acthd < obj_priv->gtt_offset + obj->size)
>>                        batchbuffer[1] = obj;
>>
>>                count++;
>>        }
>> +       /* Scan the other lists for completeness for those bizarre errors. */
>> +       if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
>> +               list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, 
>> list) {
>> +                       struct drm_gem_object *obj = obj_priv->obj;
>> +
>> +                       if (batchbuffer[0] == NULL &&
>> +                           bbaddr >= obj_priv->gtt_offset &&
>> +                           bbaddr < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[0] = obj;
>> +
>> +                       if (batchbuffer[1] == NULL &&
>> +                           error->acthd >= obj_priv->gtt_offset &&
>> +                           error->acthd < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[1] = obj;
>> +
>> +                       if (batchbuffer[0] && batchbuffer[1])
>> +                               break;
>> +               }
>> +       }
>> +       if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
>> +               list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, 
>> list) {
>> +                       struct drm_gem_object *obj = obj_priv->obj;
>> +
>> +                       if (batchbuffer[0] == NULL &&
>> +                           bbaddr >= obj_priv->gtt_offset &&
>> +                           bbaddr < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[0] = obj;
>> +
>> +                       if (batchbuffer[1] == NULL &&
>> +                           error->acthd >= obj_priv->gtt_offset &&
>> +                           error->acthd < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[1] = obj;
>> +
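
The patch quoted above repeats the same address-range test for the active, flushing and inactive lists. As a minimal userspace sketch of that lookup, assuming plain arrays in place of the kernel's linked lists (the sketch_* names are hypothetical and are not part of the i915 driver), the scan could be factored like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GEM object bound into the GTT. */
struct sketch_obj {
    uint32_t gtt_offset;   /* start of the object in the GTT */
    uint32_t size;         /* length of the object in bytes */
};

/* Hypothetical stand-in for one of the active/flushing/inactive lists. */
struct sketch_list {
    const struct sketch_obj *objs;
    size_t count;
};

/* Same half-open range test as in the quoted patch. */
static int sketch_obj_contains(const struct sketch_obj *obj, uint32_t addr)
{
    return addr >= obj->gtt_offset &&
           addr < obj->gtt_offset + obj->size;
}

/*
 * Walk the lists in the order given and return the first object whose
 * GTT range covers addr, or NULL if no list holds such an object.
 */
static const struct sketch_obj *
sketch_find_obj(const struct sketch_list *lists, size_t nlists, uint32_t addr)
{
    size_t l, i;

    assert(lists != NULL);

    for (l = 0; l < nlists; l++) {
        for (i = 0; i < lists[l].count; i++) {
            if (sketch_obj_contains(&lists[l].objs[i], addr))
                return &lists[l].objs[i];
        }
    }
    return NULL;
}
```

Calling it once with bbaddr and once with error->acthd would give the two batch buffer candidates; the later lists are only reached when the earlier ones have no match.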

[Bug 28069] New: maniadrive - smooth play with LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable otherwise

2010-05-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=28069

   Summary: maniadrive - smooth play with
LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable
otherwise
   Product: DRI
   Version: XOrg CVS
  Platform: Other
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: medium
 Component: DRM/Radeon
AssignedTo: dri-devel@lists.freedesktop.org
ReportedBy: p...@mandriva.com.br


The problem described in https://bugs.freedesktop.org/show_bug.cgi?id=28002
actually also happens with the ati driver.
  Just that with the ati driver, it is less visible (the car position jumps
less), but with LIBGL_ALWAYS_INDIRECT it also becomes a lot more smooth.



Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput 
 wrote:
> Hello Chris and Andrew,
> 
> I did further testing and noticed that this patch fixes the boot
> errors and warnings and I get the XWindows.
> 
> But XWindows freezes after some time.

The BUG you were hitting before is on the error collection path which
presumably is still being triggered during boot by a GPU error.
Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
recorded anything? And if not, wait until it freezes and then please file
a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
and dmesg.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton  
wrote:
> On Tue, 11 May 2010 19:52:31 +0100
> Chris Wilson  wrote:
> 
> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
> >  wrote:
> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > > KM_foo kmap slot index.
> > 
> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> > the review. It would be sensible to go with your simpler patch whilst
> > io_mapping_map_atomic_wc() is improved.
> 
> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
> Should I include this?

Yes.

Acked-by: Chris Wilson 

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Wed, May 12, 2010 at 1:23 AM, Chris Wilson  wrote:
> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput 
>  wrote:
>> Hello Chris and Andrew,
>>
>> I did further testing and noticed that this patch fixes the boot
>> errors and warnings and I get the XWindows.
>>
>> But XWindows freezes after some time.
>
> The BUG you were hitting before is on the error collection path which
> presumably is still being triggered during boot by a GPU error.

No, I am not getting any bug with your patch.

dmesg with your patch :
http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris.txt

> Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
> recorded anything?

No.

> And if not, wait until it freezes and then please file
> a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
> and dmesg.
>

Ok.

Thanks,
--
Jaswinder Singh.


Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Wed, May 12, 2010 at 1:35 AM, Jaswinder Singh Rajput
 wrote:
> Hello Chris,
>
> On Wed, May 12, 2010 at 1:23 AM, Chris Wilson  
> wrote:
>> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput 
>>  wrote:
>>> Hello Chris and Andrew,
>>>
>>> I did further testing and noticed that this patch fixes the boot
>>> errors and warnings and I get the XWindows.
>>>
>>> But XWindows freezes after some time.
>>
>> The BUG you were hitting before is on the error collection path which
>> presumably is still being triggered during boot by a GPU error.
>
> No, I am not getting any bug with your patch.
>
> dmesg with your patch :
> http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris.txt
>

I did more testing, and the test passes about 80% of the time. I get the bugs on a cold boot:

[   40.090295] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer
elapsed... GPU hung
[   40.090318] ------------[ cut here ]------------
[   40.090338] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   40.090345] Hardware name: Aspire one
[   40.090351] Modules linked in: nf_conntrack_ftp ath9k ath9k_common
battery ath9k_hw [last unloaded: scsi_wait_scan]
[   40.090378] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #8
[   40.090385] Call Trace:
[   40.090402]  [] warn_slowpath_common+0x65/0x7c
[   40.090415]  [] ? debug_kmap_atomic+0xa9/0x11e
[   40.090428]  [] warn_slowpath_null+0xd/0x10
[   40.090440]  [] debug_kmap_atomic+0xa9/0x11e
[   40.090454]  [] kmap_atomic_prot_pfn+0x1d/0x5e
[   40.090465]  [] iomap_atomic_prot_pfn+0x23/0x26
[   40.090479]  [] i915_error_object_create+0x110/0x17c
[   40.090492]  [] i915_handle_error+0x4a2/0x9ba
[   40.090506]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   40.090518]  [] run_timer_softirq+0x1c9/0x269
[   40.090531]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   40.090543]  [] __do_softirq+0xc6/0x186
[   40.090553]  [] do_softirq+0x26/0x2b
[   40.090564]  [] irq_exit+0x29/0x66
[   40.090576]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   40.090591]  [] apic_timer_interrupt+0x2a/0x30
[   40.090605]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   40.090618]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   40.090633]  [] cpuidle_idle_call+0x6b/0xda
[   40.090645]  [] cpu_idle+0x44/0x74
[   40.090657]  [] start_secondary+0x1b2/0x1b7
[   40.090666] ---[ end trace 5e47c395a6f397dc ]---
[   40.090862] ------------[ cut here ]------------

dmesg with this patch with cold boot :
http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris-cold.txt

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Dave Airlie
On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  wrote:
> On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton  
> wrote:
>> On Tue, 11 May 2010 19:52:31 +0100
>> Chris Wilson  wrote:
>>
>> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
>> >  wrote:
>> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
>> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
>> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in 
>> > > the
>> > > KM_foo kmap slot index.
>> >
>> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
>> > the review. It would be sensible to go with your simpler patch whilst
>> > io_mapping_map_atomic_wc() is improved.
>>
>> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
>> Should I include this?
>
> Yes.
>
> Acked-by: Chris Wilson 
>

I'm not sure pushing this in at this point is a good idea, if I'm
reading it correctly we've no idea what KM_IRQ is being used for, and
this codepath is called from non-irq contexts just as much as irq
contexts.

I'd rather we just backout the hangcheck stuff touching copies at all
at this point, and try again doing it properly with a slow work or
something for later.

Dave.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 08:22:49 +1000, Dave Airlie  wrote:
> I'd rather we just backout the hangcheck stuff touching copies at all
> at this point, and try again doing it properly with a slow work or
> something for later.

From my point of view, the information provided by the hangcheck has been
invaluable for delving into and fixing some obnoxious driver bugs. I
suspect its honeymoon period is now over - those bugs that it could
detect easily have been fixed (I hope). In order to capture the relevant
information for later chipset generations, we will need to parse the
command stream and include auxiliary buffers. So whilst I would prefer to
see this in a release so that I can easily diagnose bug reports, I accept
that there is more work to be done and will HTFU.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 08:22:49 +1000
Dave Airlie  wrote:

> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
> wrote:
> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
> >  wrote:
> >> On Tue, 11 May 2010 19:52:31 +0100
> >> Chris Wilson  wrote:
> >>
> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
> >> >  wrote:
> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> >> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in 
> >> > > the
> >> > > KM_foo kmap slot index.
> >> >
> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> >> > the review. It would be sensible to go with your simpler patch whilst
> >> > io_mapping_map_atomic_wc() is improved.
> >>
> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
> >> Should I include this?
> >
> > Yes.
> >
> > Acked-by: Chris Wilson 
> >
> 
> I'm not sure pushing this in at this point is a good idea, if I'm
> reading it correctly we've no idea what KM_IRQ is being used for,

It's used for taking kmaps from IRQ contexts.

> and
> this codepath is called from non-irq contexts just as much as irq
> contexts.

That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
used from both irq- and non-irq contexts.  All we need to do is to
ensure that some interrupt cannot come along on this CPU and corrupt
the slot.

> I'd rather we just backout the hangcheck stuff touching copies at all
> at this point, and try again doing it properly with a slow work or
> something for later.
> 
> Dave.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Dave Airlie
On Wed, May 12, 2010 at 8:32 AM, Andrew Morton
 wrote:
> On Wed, 12 May 2010 08:22:49 +1000
> Dave Airlie  wrote:
>
>> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
>> wrote:
>> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
>> >  wrote:
>> >> On Tue, 11 May 2010 19:52:31 +0100
>> >> Chris Wilson  wrote:
>> >>
>> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
>> >> >  wrote:
>> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
>> >> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
>> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes 
>> >> > > in the
>> >> > > KM_foo kmap slot index.
>> >> >
>> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
>> >> > the review. It would be sensible to go with your simpler patch whilst
>> >> > io_mapping_map_atomic_wc() is improved.
>> >>
>> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
>> >> Should I include this?
>> >
>> > Yes.
>> >
>> > Acked-by: Chris Wilson 
>> >
>>
>> I'm not sure pushing this in at this point is a good idea, if I'm
>> reading it correctly we've no idea what KM_IRQ is being used for,
>
> It's used for taking kmaps from IRQ contexts.
>
>> and
>> this codepath is called from non-irq contexts just as much as irq
>> contexts.
>
> That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> used from both irq- and non-irq contexts.  All we need to do is to
> ensure that some interrupt cannot come along on this CPU and corrupt
> the slot.

I don't think we do that in a lot of places, and I'd rather not add
that in to fix this problem at this point in the release cycle, as
we've no idea what it might break/regress.

It's easier to just disable the hangcheck copy and try again for 2.6.35
with a workqueue or slow work.

Dave



>
>> I'd rather we just backout the hangcheck stuff touching copies at all
>> at this point, and try again doing it properly with a slow work or
>> something for later.
>>
>> Dave.
>


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Dave Airlie
On Wed, May 12, 2010 at 8:56 AM, Andrew Morton
 wrote:
> On Wed, 12 May 2010 08:51:05 +1000
> Dave Airlie  wrote:
>
>> On Wed, May 12, 2010 at 8:32 AM, Andrew Morton
>>  wrote:
>> > On Wed, 12 May 2010 08:22:49 +1000
>> > Dave Airlie  wrote:
>> >
>> >> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
>> >> wrote:
>> >> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
>> >> >  wrote:
>> >> >> On Tue, 11 May 2010 19:52:31 +0100
>> >> >> Chris Wilson  wrote:
>> >> >>
>> >> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
>> >> >> >  wrote:
>> >> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq 
>> >> >> > > context:
>> >> >> > > it hardwires use of KM_USER0.  I suggest that 
>> >> >> > > io_mapping_create_wc(),
>> >> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller 
>> >> >> > > passes in the
>> >> >> > > KM_foo kmap slot index.
>> >> >> >
>> >> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks 
>> >> >> > for
>> >> >> > the review. It would be sensible to go with your simpler patch whilst
>> >> >> > io_mapping_map_atomic_wc() is improved.
>> >> >>
>> >> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
>> >> >> Should I include this?
>> >> >
>> >> > Yes.
>> >> >
>> >> > Acked-by: Chris Wilson 
>> >> >
>> >>
>> >> I'm not sure pushing this in at this point is a good idea, if I'm
>> >> reading it correctly we've no idea what KM_IRQ is being used for,
>> >
>> > It's used for taking kmaps from IRQ contexts.
>> >
>> >> and
>> >> this codepath is called from non-irq contexts just as much as irq
>> >> contexts.
>> >
>> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
>> > used from both irq- and non-irq contexts.  All we need to do is to
>> > ensure that some interrupt cannot come along on this CPU and corrupt
>> > the slot.
>>
>> I don't think we do that in a lot of places, and I'd rather not add
>> that in to fix this problem at this point in the release cycle, as
>> we've no idea what it might break/regress.
>
> What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
> anything.
>

disabling local cpu irqs around all these kmap mappings.

Dave.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 08:51:05 +1000
Dave Airlie  wrote:

> On Wed, May 12, 2010 at 8:32 AM, Andrew Morton
>  wrote:
> > On Wed, 12 May 2010 08:22:49 +1000
> > Dave Airlie  wrote:
> >
> >> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
> >> wrote:
> >> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
> >> >  wrote:
> >> >> On Tue, 11 May 2010 19:52:31 +0100
> >> >> Chris Wilson  wrote:
> >> >>
> >> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
> >> >> >  wrote:
> >> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq 
> >> >> > > context:
> >> >> > > it hardwires use of KM_USER0.  I suggest that 
> >> >> > > io_mapping_create_wc(),
> >> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes 
> >> >> > > in the
> >> >> > > KM_foo kmap slot index.
> >> >> >
> >> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks 
> >> >> > for
> >> >> > the review. It would be sensible to go with your simpler patch whilst
> >> >> > io_mapping_map_atomic_wc() is improved.
> >> >>
> >> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
> >> >> Should I include this?
> >> >
> >> > Yes.
> >> >
> >> > Acked-by: Chris Wilson 
> >> >
> >>
> >> I'm not sure pushing this in at this point is a good idea, if I'm
> >> reading it correctly we've no idea what KM_IRQ is being used for,
> >
> > It's used for taking kmaps from IRQ contexts.
> >
> >> and
> >> this codepath is called from non-irq contexts just as much as irq
> >> contexts.
> >
> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> > used from both irq- and non-irq contexts.  All we need to do is to
> > ensure that some interrupt cannot come along on this CPU and corrupt
> > the slot.
> 
> I don't think we do that in a lot of places, and I'd rather not add
> that in to fix this problem at this point in the release cycle, as
> we've no idea what it might break/regress.

What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
anything.

> It's easier to just disable the hangcheck copy and try again for 2.6.35
> with a workqueue or slow work.



Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 09:17:09 +1000
Dave Airlie  wrote:

> >> >> and
> >> >> this codepath is called from non-irq contexts just as much as irq
> >> >> contexts.
> >> >
> >> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> >> > used from both irq- and non-irq contexts.  All we need to do is to
> >> > ensure that some interrupt cannot come along on this CPU and corrupt
> >> > the slot.
> >>
> >> I don't think we do that in a lot of places, and I'd rather not add
> >> that in to fix this problem at this point in the release cycle, as
> >> we've no idea what it might break/regress.
> >
> > What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
> > anything.
> >
> 
> disabling local cpu irqs around all these kmap mappings.
> 

Ah.  Well if there are other uses of KM_USER0 from interrupt context
then yes, we have more problems.  CONFIG_DEBUG_HIGHMEM &&
CONFIG_TRACE_IRQFLAGS_SUPPORT will detect that and as long as Jaswinder
has hit all code paths in his testing, we're good.  Some manual review
for this would be good.



[Bug 27211] endless PROTECTION_FAULT logs, Nouveau drm, TNT card

2010-05-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=27211

--- Comment #7 from Brent  2010-05-11 23:35:47 PDT ---
Tried the above patch against the stock 2.6.33.3 kernel, with acceleration
turned on, and got  "(EE) [drm] failed to open device" in Xorg.0.log.  But it
did give me a 80x50 text console, so it seems the framebuffer works.  And it is
no longer filling the logs with error messages.  So that's something.

From what I read, nouveau and drm are very touchy about matching versions.  I
started with the kernel configuration for Arch Linux, but ended up removing
many modules as otherwise that 350 MHz Pentium II needed 6 hours to compile.  I
updated the system before I tried the patch.  These are the current versions
for Arch Linux:

nouveau-drm 0.0.16_20100313-2
xf86-video-nouveau 0.0.15_git20100314-1
kernel26 2.6.33.3-2


Here is the tail of Xorg.0.log:

(II) Module nouveau: vendor="X.Org Foundation"
compiled for 1.7.5.902, module version = 0.0.15
Module class: X.Org Video Driver
ABI class: X.Org Video Driver, version 6.0
(II) NOUVEAU driver 
(II) NOUVEAU driver for NVIDIA chipset families :
RIVA TNT    (NV04)
RIVA TNT2   (NV05)
GeForce 256 (NV10)
GeForce 2   (NV11, NV15)
GeForce 4MX (NV17, NV18)
GeForce 3   (NV20)
GeForce 4Ti (NV25, NV28)
GeForce FX  (NV3x)
GeForce 6   (NV4x)
GeForce 7   (G7x)
GeForce 8   (G8x)
(II) Primary Device is: PCI 0...@00:00:0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenByBusid: Searching for BusID pci::01:00.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenByBusid: drmOpenMinor returns 7
drmOpenByBusid: drmGetBusid reports pci::01:00.0
(EE) [drm] failed to open device
(EE) No devices detected.

Fatal server error:
no screens found



[PATCH] drm/radeon/kms: add query for crtc hw id from crtc id to get info

2010-05-11 Thread Dave Airlie
On Sat, May 8, 2010 at 1:18 AM, Jerome Glisse  wrote:
> Userspace needs to know the hw crtc id (0, 1, 2, ...) from the drm
> crtc id. Bump the minor version so userspace can conditionally enable
> features that depend on this.
>
> Signed-off-by: Jerome Glisse 
> ---
>  drivers/gpu/drm/radeon/radeon_drv.c |    3 ++-
>  drivers/gpu/drm/radeon/radeon_kms.c |   18 ++
>  include/drm/radeon_drm.h            |    1 +
>  3 files changed, 21 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c 
> b/drivers/gpu/drm/radeon/radeon_drv.c
> index b3749d4..df96ace 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -44,9 +44,10 @@
>  * - 2.1.0 - add square tiling interface
>  * - 2.2.0 - add r6xx/r7xx const buffer support
>  * - 2.3.0 - add MSPOS + 3D texture + r500 VAP regs
> + * - 2.4.0 - add crtc id query
>  */
>  #define KMS_DRIVER_MAJOR       2
> -#define KMS_DRIVER_MINOR       3
> +#define KMS_DRIVER_MINOR       4
>  #define KMS_DRIVER_PATCHLEVEL  0
>  int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags);
>  int radeon_driver_unload_kms(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
> b/drivers/gpu/drm/radeon/radeon_kms.c
> index d3657dc..04ad452 100644
> --- a/drivers/gpu/drm/radeon/radeon_kms.c
> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
> @@ -98,11 +98,15 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, 
> struct drm_file *filp)
>  {
>         struct radeon_device *rdev = dev->dev_private;
>         struct drm_radeon_info *info;
> +       struct radeon_mode_info *minfo = &rdev->mode_info;
>         uint32_t *value_ptr;
>         uint32_t value;
> +       struct drm_crtc *crtc;
> +       int i, found;
>
>         info = data;
>         value_ptr = (uint32_t *)((unsigned long)info->value);
> +       value = *value_ptr;
>         switch (info->request) {
>         case RADEON_INFO_DEVICE_ID:
>                 value = dev->pci_device;
> @@ -116,6 +120,20 @@ int radeon_info_ioctl(struct drm_device *dev, void 
> *data, struct drm_file *filp)
>         case RADEON_INFO_ACCEL_WORKING:
>                 value = rdev->accel_working;
>                 break;
> +       case RADEON_INFO_CRTC_FROM_ID:
> +               for (i = 0, found = 0; i < 6; i++) {
> +                       crtc = (struct drm_crtc *)minfo->crtcs[i];
> +                       if (crtc && crtc->base.id == value) {
> +                               value = i;
> +                               found = 1;
> +                               break;
> +                       }
> +               }
> +               if (!found) {
> +                       DRM_ERROR("unknown crtc id %d\n", value);

Don't drm error or hardcode 6 here.

we have rdev->num_crtc and DRM erroring from a path triggerable
directly from a user doing something bad is generally a bad idea.

Dave.
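
A minimal userspace sketch of what this review comment asks for: bound the loop by the device's CRTC count instead of a literal 6, and report an unknown id through the return value rather than the log. The sketch_* structures are hypothetical stand-ins for the radeon ones, and returning -EINVAL is an assumption, not something taken from the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used here. */
struct sketch_crtc {
    uint32_t id;                    /* plays the role of crtc->base.id */
};

struct sketch_mode_info {
    struct sketch_crtc *crtcs[6];   /* storage may be larger than num_crtc */
};

struct sketch_device {
    int num_crtc;                   /* CRTCs this chip actually has */
    struct sketch_mode_info mode_info;
};

/*
 * Translate a DRM CRTC object id into the hardware CRTC index.
 * Returns 0 and stores the index in *hw_id on success, or -EINVAL when
 * the id is unknown. Nothing is logged, so a caller passing a bad id
 * cannot flood the log.
 */
static int sketch_crtc_from_id(const struct sketch_device *rdev,
                               uint32_t crtc_id, uint32_t *hw_id)
{
    int i;

    assert(rdev != NULL && hw_id != NULL);

    for (i = 0; i < rdev->num_crtc; i++) {
        const struct sketch_crtc *crtc = rdev->mode_info.crtcs[i];

        if (crtc != NULL && crtc->id == crtc_id) {
            *hw_id = (uint32_t)i;
            return 0;
        }
    }
    return -EINVAL;
}
```

On failure *hw_id is left untouched, so the caller decides what to hand back to userspace.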


DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello,

With the latest git kernel, I am getting the following DRM error and
XWindows does not start:

[   45.269075] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer
elapsed... GPU hung
[   45.269111] ------------[ cut here ]------------
[   45.269139] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269150] Hardware name: Aspire one
[   45.269158] Modules linked in: nf_conntrack_ftp ath9k ath9k_common
battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269198] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #6
[   45.269208] Call Trace:
[   45.269231]  [] warn_slowpath_common+0x65/0x7c
[   45.269249]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269267]  [] warn_slowpath_null+0xd/0x10
[   45.269284]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269304]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269321]  [] kmap_atomic+0xe/0x10
[   45.269341]  [] i915_error_object_create+0xea/0x14f
[   45.269359]  [] i915_handle_error+0x369/0x868
[   45.269380]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269399]  [] run_timer_softirq+0x1c9/0x269
[   45.269417]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269435]  [] __do_softirq+0xc6/0x186
[   45.269451]  [] do_softirq+0x26/0x2b
[   45.269466]  [] irq_exit+0x29/0x66
[   45.269484]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269504]  [] apic_timer_interrupt+0x2a/0x30
[   45.269524]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.269542]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.269563]  [] cpuidle_idle_call+0x6b/0xda
[   45.269580]  [] cpu_idle+0x44/0x74
[   45.269598]  [] start_secondary+0x1b2/0x1b7
[   45.269612] ---[ end trace ce01d7ca0ae214f4 ]---
[   45.269631] [ cut here ]
[   45.269647] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269657] Hardware name: Aspire one
[   45.269665] Modules linked in: nf_conntrack_ftp ath9k ath9k_common
battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269700] Pid: 0, comm: swapper Tainted: GW  2.6.34-rc7-netbook #6
[   45.269710] Call Trace:
[   45.269726]  [] warn_slowpath_common+0x65/0x7c
[   45.269743]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269760]  [] warn_slowpath_null+0xd/0x10
[   45.269777]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269795]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269812]  [] kmap_atomic+0xe/0x10
[   45.269829]  [] i915_error_object_create+0xea/0x14f
[   45.269848]  [] i915_handle_error+0x369/0x868
[   45.269868]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269885]  [] run_timer_softirq+0x1c9/0x269
[   45.269903]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269920]  [] __do_softirq+0xc6/0x186
[   45.269937]  [] do_softirq+0x26/0x2b
[   45.269952]  [] irq_exit+0x29/0x66
[   45.269968]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269985]  [] apic_timer_interrupt+0x2a/0x30
[   45.270004]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.270051]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.270071]  [] cpuidle_idle_call+0x6b/0xda
[   45.270087]  [] cpu_idle+0x44/0x74
[   45.270104]  [] start_secondary+0x1b2/0x1b7
[   45.270117] ---[ end trace ce01d7ca0ae214f5 ]---
[   45.270135] [ cut here ]

dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7.txt
.config : http://userweb.kernel.org/~jaswinder/acer_netbook/config_2634-rc7.txt

How can I fix these errors?

Thanks,
--
Jaswinder Singh.


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput  wrote:
> Hello,
> 
> With latest git kernel, I am getting following DRM error and not
> getting XWindows :

[snip]

Hmm, there are still patches for capturing error state that haven't gone
upstream, shame on me.

That error is a secondary issue to the GPU hang that is being reported. If
it is a regression caused by a kernel update it would be very useful if
you could bisect to the erroneous commit.

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Tue, May 11, 2010 at 9:40 PM, Chris Wilson wrote:
> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
>> Hello,
>>
>> With latest git kernel, I am getting following DRM error and not
>> getting XWindows :
>
> [snip]
>
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
>
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.
>

Earlier I was using Moblin; I switched to Fedora and started getting
this error. I have also tested different kernel versions and get the
same error, so I do not think this is a regression.

moblin dmesg : 
http://userweb.kernel.org/~jaswinder/moblin/dmesg-moblin_2633rc5.txt

Thanks,
--
Jaswinder Singh.


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson wrote:

> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> > Hello,
> > 
> > With latest git kernel, I am getting following DRM error and not
> > getting XWindows :
> 
> [snip]
> 
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
> 
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.

It helps if one reads the code and the trace...

i915_error_object_create() is using KM_USER0 from softirq context. 
That's a bug, and a pretty serious one.  If some innocent civilian is
writing highmem data to disk and this timer interrupt fires and trashes
his KM_USER0 slot, the disk contents will be corrupted.

Something like this...

--- a/drivers/gpu/drm/i915/i915_irq.c~a
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi

 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_

Please let's get a tested fix for this into 2.6.34.
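
The corruption described above can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions: each kmap slot holds one mapping, the slot index stands in for the virtual address, and the interrupt is an ordinary function call. All names here (map_atomic(), writer_sees_own_page(), the SLOT_* values) are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>

/* Two slots modelled after KM_USER0 and KM_IRQ0. */
enum slot { SLOT_USER0, SLOT_IRQ0, SLOT_COUNT };

/* One mapping per slot, like a per-CPU atomic kmap table. */
static const char *slot_map[SLOT_COUNT];

static const char file_page[] = "highmem file data";
static const char gpu_page[] = "gpu error state";

/* Install `page` in slot `s`; the slot index stands in for the address. */
enum slot map_atomic(const char *page, enum slot s)
{
	slot_map[s] = page;
	return s;
}

/* A simulated timer interrupt that maps its own page in the given slot. */
void timer_interrupt(enum slot s)
{
	map_atomic(gpu_page, s);
}

/*
 * Process context maps a page in SLOT_USER0, is interrupted, then looks
 * through its mapping again. Returns true if it still sees its own page.
 */
bool writer_sees_own_page(enum slot irq_slot)
{
	enum slot va = map_atomic(file_page, SLOT_USER0);

	timer_interrupt(irq_slot);
	return slot_map[va] == file_page;
}
```

With the handler using SLOT_USER0 the writer ends up looking at the handler's page; with SLOT_IRQ0 its mapping is left alone. The model does not cover the local_irq_save() part of the patch, which guards the IRQ slot itself against nested use.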


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton  wrote:
> 
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson wrote:
> 
> > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> > > Hello,
> > > 
> > > With latest git kernel, I am getting following DRM error and not
> > > getting XWindows :
> > 
> > [snip]
> > 
> > Hmm, there are still patches for capturing error state that haven't gone
> > upstream, shame on me.
> > 
> > That error is a secondary issue to the GPU hang that is being reported. If
> > it is a regression caused by a kernel update it would be very useful if
> > you could bisect to the erroneous commit.
> 
> It helps if one reads the code and the trace...
> 
> i915_error_object_create() is using KM_USER0 from softirq context. 
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
> 
> Something like this...
> 
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>  
>   for (page = 0; page < page_count; page++) {
>   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + unsigned long flags;
> +
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> + local_irq_save(flags);
> + s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>   memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> + kunmap_atomic(s, KM_IRQ0);
> + local_irq_restore(flags);
>   dst->pages[page] = d;
>   }
>   dst->page_count = page_count;
> _
> 
> Please let's get a tested fix for this into 2.6.34.

The change that I actually want is to replace the kmap_atomic(cpu_page) with an
io_mapping_map_atomic_wc(gtt_page): in case there is an incoherency between
the CPU and the GPU, we want to record what the GPU executed. Do you know
if similar precautions are required with io_mapping_map_atomic_wc()?

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
Directly read the GTT mapping for the contents of the batch buffers
rather than relying on possibly stale CPU caches. Also for completeness
scan the flushing/inactive lists for the current buffers - we are
collecting error state after all.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_irq.c |   64 ++
 1 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 87113da..14301a4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -441,9 +441,11 @@ static struct drm_i915_error_object *
 i915_error_object_create(struct drm_device *dev,
 			 struct drm_gem_object *src)
 {
+	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_error_object *dst;
 	struct drm_i915_gem_object *src_priv;
 	int page, page_count;
+	u32 reloc_offset;

 	if (src == NULL)
 		return NULL;
@@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
 	if (dst == NULL)
 		return NULL;

+	reloc_offset = src_priv->gtt_offset;
 	for (page = 0; page < page_count; page++) {
-		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		void __iomem *s;
+		void *d;
+
+		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
-		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+
+		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
+					     reloc_offset);
+		memcpy_fromio(d, s, PAGE_SIZE);
+		io_mapping_unmap_atomic(s);
+
 		dst->pages[page] = d;
+
+		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = page_count;
 	dst->gtt_offset = src_priv->gtt_offset;
@@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device *dev)

 		if (batchbuffer[1] == NULL &&
 		    error->acthd >= obj_priv->gtt_offset &&
-		    error->acthd < obj_priv->gtt_offset + obj->size &&
-		    batchbuffer[0] != obj)
+		    error->acthd < obj_priv->gtt_offset + obj->size)
 			batchbuffer[1] = obj;

 		count++;
 	}
+	/* Scan the other lists for completeness for those bizarre errors. */
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}

 	/* We need to copy these to an anonymous buffer as the simplest
 	 * method to avoid being overwritten by userpace.
 	 */
 	error->batchbuffer[0] = i915_error_object_create(dev, batchbuffer[0]);
-	error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	if (batchbuffer[1] != batchbuffer[0])
+		error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	else
+		error->batchbuffer[1] = NULL;

 	/* Record the ringbuffer */
 	error->ringbuffer = i915_error_object_create(dev, dev_priv->ring.ring_obj);
-- 
1.7.1
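
The lookup that the patch repeats for the flushing and inactive lists is a range test: an address belongs to an object when it lies in [gtt_offset, gtt_offset + size). A standalone sketch of that search follows, assuming plain arrays in place of the kernel's linked lists. struct gtt_object, find_in_list() and find_object() are hypothetical names, not driver code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GEM object bound into the GTT. */
struct gtt_object {
	uint32_t gtt_offset;	/* start of the object in the GTT */
	uint32_t size;		/* size in bytes */
};

/*
 * True when addr lies in [gtt_offset, gtt_offset + size). Written as a
 * subtraction so that offset + size cannot wrap around.
 */
int object_contains(const struct gtt_object *obj, uint32_t addr)
{
	return addr >= obj->gtt_offset && addr - obj->gtt_offset < obj->size;
}

/* Search one list; returns the first object containing addr, or NULL. */
const struct gtt_object *find_in_list(const struct gtt_object *list,
				      size_t n, uint32_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (object_contains(&list[i], addr))
			return &list[i];
	return NULL;
}

/* Try each list in order (e.g. active, flushing, inactive) until one matches. */
const struct gtt_object *find_object(const struct gtt_object *const lists[],
				     const size_t counts[], size_t nlists,
				     uint32_t addr)
{
	for (size_t l = 0; l < nlists; l++) {
		const struct gtt_object *hit = find_in_list(lists[l], counts[l], addr);

		if (hit)
			return hit;
	}
	return NULL;
}
```

Calling find_object() once for bbaddr and once for acthd mirrors how the patch fills batchbuffer[0] and batchbuffer[1]; the two results may be the same object, which is why the patch only copies the second buffer when it differs from the first.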



DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Andrew,

On Tue, May 11, 2010 at 8:18 PM, Andrew Morton wrote:
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson wrote:
>
>> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
>> > Hello,
>> >
>> > With latest git kernel, I am getting following DRM error and not
>> > getting XWindows :
>>
>> [snip]
>>
>> Hmm, there are still patches for capturing error state that haven't gone
>> upstream, shame on me.
>>
>> That error is a secondary issue to the GPU hang that is being reported. If
>> it is a regression caused by a kernel update it would be very useful if
>> you could bisect to the erroneous commit.
>
> It helps if one reads the code and the trace...
>
> i915_error_object_create() is using KM_USER0 from softirq context.
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
>
> Something like this...
>
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>
>         for (page = 0; page < page_count; page++) {
>                 void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +               unsigned long flags;
> +
>                 if (d == NULL)
>                         goto unwind;
> -               s = kmap_atomic(src_priv->pages[page], KM_USER0);
> +               local_irq_save(flags);
> +               s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>                 memcpy(d, s, PAGE_SIZE);
> -               kunmap_atomic(s, KM_USER0);
> +               kunmap_atomic(s, KM_IRQ0);
> +               local_irq_restore(flags);
>                 dst->pages[page] = d;
>         }
>         dst->page_count = page_count;
> _
>
> Please let's get a tested fix for this into 2.6.34.
>

I tested your patch with the latest Linus git tree and it works: it fixes the
softirq error.

Now I am only getting DRM errors:

[   42.276059] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer
elapsed... GPU hung
[   42.276398] render error detected, EIR: 0x
[   42.276460] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request
returns -5 (awaiting 18 at 17)

Thanks,
--
Jaswinder Singh.


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:19:26 +0100 Chris Wilson wrote:

> On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton wrote:
> > 
> > On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson wrote:
> > 
> > > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> > > > Hello,
> > > > 
> > > > With latest git kernel, I am getting following DRM error and not
> > > > getting XWindows :
> > > 
> > > [snip]
> > > 
> > > Hmm, there are still patches for capturing error state that haven't gone
> > > upstream, shame on me.
> > > 
> > > That error is a secondary issue to the GPU hang that is being reported. If
> > > it is a regression caused by a kernel update it would be very useful if
> > > you could bisect to the erroneous commit.
> > 
> > It helps if one reads the code and the trace...
> > 
> > i915_error_object_create() is using KM_USER0 from softirq context. 
> > That's a bug, and a pretty serious one.  If some innocent civilian is
> > writing highmem data to disk and this timer interrupt fires and trashes
> > his KM_USER0 slot, the disk contents will be corrupted.
> > 
> > Something like this...
> > 
> > --- a/drivers/gpu/drm/i915/i915_irq.c~a
> > +++ a/drivers/gpu/drm/i915/i915_irq.c
> > @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
> >  
> > for (page = 0; page < page_count; page++) {
> > void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   unsigned long flags;
> > +
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > +   local_irq_save(flags);
> > +   s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
> > memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +   kunmap_atomic(s, KM_IRQ0);
> > +   local_irq_restore(flags);
> > dst->pages[page] = d;
> > }
> > dst->page_count = page_count;
> > _
> > 
> > Please let's get a tested fix for this into 2.6.34.
> 
> The change that I actually want is to replace the kmap_atomic(cpu_page) with an
> io_mapping_map_atomic_wc(gtt_page), in case there is a incoherency between
> the CPU and the GPU, we want to record what the GPU executed. Do you know
> how if similar precautions are required with io_mapping_map_atomic_wc()?

gack, wtf is io_mapping_map_atomic_wc()?



Could do with some interface documentation. Looks too large to be inlined.

No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
KM_foo kmap slot index.




[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
wrote:

> + reloc_offset = src_priv->gtt_offset;
>   for (page = 0; page < page_count; page++) {
> - void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + void __iomem *s;
> + void *d;
> +
> + d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> - memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> +
> + s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +  reloc_offset);
> + memcpy_fromio(d, s, PAGE_SIZE);
> + io_mapping_unmap_atomic(s);

As mentioned in the other email, this will still corrupt the KM_USER0
slot, and will generate a debug_kmap_atomic() warning.



[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:37:22 -0400, Andrew Morton  wrote:
> On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
> wrote:
> 
> > +   reloc_offset = src_priv->gtt_offset;
> > for (page = 0; page < page_count; page++) {
> > -   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   void __iomem *s;
> > +   void *d;
> > +
> > +   d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > -   memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +
> > +   s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> > +reloc_offset);
> > +   memcpy_fromio(d, s, PAGE_SIZE);
> > +   io_mapping_unmap_atomic(s);
> 
> As mentioned in the other email, this will still corrupt the KM_USER0
> slot, and will generate a debug_kmap_atomic() warning.

How, as kmap_atomic(KM_USER0) is no longer used?
-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton  wrote:
> No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> KM_foo kmap slot index.

Argh, sorry for the noise, read the mail in the wrong order. Thanks for
the review. It would be sensible to go with your simpler patch whilst
io_mapping_map_atomic_wc() is improved.

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:52:31 +0100
Chris Wilson  wrote:

> On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > KM_foo kmap slot index.
> 
> Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> the review. It would be sensible to go with your simpler patch whilst
> io_mapping_map_atomic_wc() is improved.

OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
Should I include this?


Subject: drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot
From: Andrew Morton 

i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot.  Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput 
Tested-by: Jaswinder Singh Rajput 
Cc: Chris Wilson 
Cc: Dave Airlie 
Signed-off-by: Andrew Morton 
---

 drivers/gpu/drm/i915/i915_irq.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff -puN drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot drivers/gpu/drm/i915/i915_irq.c
--- a/drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -461,11 +461,15 @@ i915_error_object_create(struct drm_devi

 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_



[Bug 28069] New: maniadrive - smooth play with LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable otherwise

2010-05-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=28069

   Summary: maniadrive - smooth play with
LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable
otherwise
   Product: DRI
   Version: XOrg CVS
  Platform: Other
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: medium
 Component: DRM/Radeon
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: pcpa at mandriva.com.br


The problem described in https://bugs.freedesktop.org/show_bug.cgi?id=28002
actually also happens with the ati driver.
  Just that with the ati driver, it is less visible (the car position jumps
less), but with LIBGL_ALWAYS_INDIRECT it also becomes a lot more smooth.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput  wrote:
> Hello Chris and Andrew,
> 
> I did further testing and noticed that this patch fixes the boot
> errors and warnings and I get the XWindows.
> 
> But XWindows freezes after some time.

The BUG you were hitting before is on the error collection path which
presumably is still being triggered during boot by a GPU error.
Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
recorded anything? And if not, wait until it freezes and then please file
a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
and dmesg.

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton  wrote:
> On Tue, 11 May 2010 19:52:31 +0100
> Chris Wilson  wrote:
> 
> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > > KM_foo kmap slot index.
> > 
> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> > the review. It would be sensible to go with your simpler patch whilst
> > io_mapping_map_atomic_wc() is improved.
> 
> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
> Should I include this?

Yes.

Acked-by: Chris Wilson 

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 08:22:49 +1000, Dave Airlie  wrote:
> I'd rather we just backout the hangcheck stuff touching copies at all
> at this point, and try again doing it properly with a slow work or
> something for later.



No subject

2010-05-11 Thread
invaluable for delving into and fixing some obnoxious driver bugs. I
suspect its honeymoon period is now over - those bugs that it could
detect easily have been fixed (I hope). In order to capture the relevant
information for later chipset generations, we will need to parse the
command stream and include auxiliary buffers. So whilst I would prefer to
see this in a release so that I can easily diagnose bug reports, I accept
that there is more work to be done and will HTFU.

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 08:22:49 +1000
Dave Airlie  wrote:

> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson wrote:
> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton wrote:
> >> On Tue, 11 May 2010 19:52:31 +0100
> >> Chris Wilson wrote:
> >>
> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> >> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> >> > > KM_foo kmap slot index.
> >> >
> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> >> > the review. It would be sensible to go with your simpler patch whilst
> >> > io_mapping_map_atomic_wc() is improved.
> >>
> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
> >> Should I include this?
> >
> > Yes.
> >
> > Acked-by: Chris Wilson 
> >
> 
> I'm not sure pushing this in at this point is a good idea, if I'm
> reading it correctly we've no idea what KM_IRQ is being used for,

It's used for taking kmaps from IRQ contexts.

> and
> this codepath is called from non-irq contexts just as much as irq
> contexts.

That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
used from both irq- and non-irq contexts.  All we need to do is to
ensure that some interrupt cannot come along on this CPU and corrupt
the slot.

> I'd rather we just backout the hangcheck stuff touching copies at all
> at this point, and try again doing it properly with a slow work or
> something for later.
> 
> Dave.
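
The argument above, that KM_IRQ0 is safe from both contexts as long as interrupts are disabled around the mapping, can be sketched with a toy model. It assumes one CPU, one shared slot, and an interrupt represented by a function that only runs when a flag says interrupts are enabled. Every name here (copy_with_irq0(), maybe_interrupt(), the page strings) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

static bool irqs_enabled = true;
static const char *irq0_slot;	/* the single shared slot */

static const char own_page[] = "caller page";
static const char irq_page[] = "interrupt page";

/* Simulated interrupt: the handler only runs when interrupts are enabled. */
void maybe_interrupt(void)
{
	if (irqs_enabled)
		irq0_slot = irq_page;	/* handler maps its own page in the slot */
}

/*
 * Map own_page in the shared slot, let an interrupt try to fire mid-copy,
 * and report whether the caller's mapping survived.
 */
bool copy_with_irq0(bool disable_irqs)
{
	bool saved = irqs_enabled;	/* plays the role of local_irq_save() */
	bool intact;

	if (disable_irqs)
		irqs_enabled = false;
	irq0_slot = own_page;		/* plays the role of kmap_atomic(.., KM_IRQ0) */
	maybe_interrupt();		/* an interrupt arrives during the copy */
	intact = (irq0_slot == own_page);
	irq0_slot = NULL;		/* plays the role of kunmap_atomic() */
	irqs_enabled = saved;		/* plays the role of local_irq_restore() */
	return intact;
}
```

In this model the mapping survives only when interrupts are disabled for the duration of the copy, regardless of which context the caller itself runs in.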


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 08:51:05 +1000
Dave Airlie  wrote:

> On Wed, May 12, 2010 at 8:32 AM, Andrew Morton wrote:
> > On Wed, 12 May 2010 08:22:49 +1000
> > Dave Airlie  wrote:
> >
> >> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson wrote:
> >> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton wrote:
> >> >> On Tue, 11 May 2010 19:52:31 +0100
> >> >> Chris Wilson  wrote:
> >> >>
> >> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> >> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> >> >> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> >> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> >> >> > > KM_foo kmap slot index.
> >> >> >
> >> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> >> >> > the review. It would be sensible to go with your simpler patch whilst
> >> >> > io_mapping_map_atomic_wc() is improved.
> >> >>
> >> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
> >> >> Should I include this?
> >> >
> >> > Yes.
> >> >
> >> > Acked-by: Chris Wilson 
> >> >
> >>
> >> I'm not sure pushing this in at this point is a good idea, if I'm
> >> reading it correctly we've no idea what KM_IRQ is being used for,
> >
> > It's used for taking kmaps from IRQ contexts.
> >
> >> and
> >> this codepath is called from non-irq contexts just as much as irq
> >> contexts.
> >
> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> > used from both irq- and non-irq contexts.  All we need to do is to
> > ensure that some interrupt cannot come along on this CPU and corrupt
> > the slot.
> 
> I don't think we do that in a lot of places, and I'd rather not add
> that in to fix this problem at this point in the release cycle, as
> we've no idea what it might break/regress.

What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
anything.

> It's easier to just disable the hangcheck copy and try again for 2.6.35
> with a workqueue or slow work.



DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 09:17:09 +1000
Dave Airlie  wrote:

> >> >> and
> >> >> this codepath is called from non-irq contexts just as much as irq
> >> >> contexts.
> >> >
> >> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> >> > used from both irq- and non-irq contexts.  All we need to do is to
> >> > ensure that some interrupt cannot come along on this CPU and corrupt
> >> > the slot.
> >>
> >> I don't think we do that in a lot of places, and I'd rather not add
> >> that in to fix this problem at this point in the release cycle, as
> >> we've no idea what it might break/regress.
> >
> > What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
> > anything.
> >
> 
> disabling local cpu irqs around all these kmap mappings.
> 

Ah.  Well if there are other uses of KM_USER0 from interrupt context
then yes, we have more problems.  CONFIG_DEBUG_HIGHMEM &&
CONFIG_TRACE_IRQFLAGS_SUPPORT will detect that and as long as Jaswinder
has hit all code paths in his testing, we're good.  Some manual review
for this would be good.



[Bug 27211] endless PROTECTION_FAULT logs, Nouveau drm, TNT card

2010-05-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=27211

--- Comment #7 from Brent  2010-05-11 23:35:47 PDT 
---
Tried the above patch against the stock 2.6.33.3 kernel, with acceleration
turned on, and got  "(EE) [drm] failed to open device" in Xorg.0.log.  But it
did give me a 80x50 text console, so it seems the framebuffer works.  And it is
no longer filling the logs with error messages.  So that's something.



fb+drm: possible circular locking dependency detected

2010-05-11 Thread Clemens Ladisch
With radeon KMS, after using a program accessing the framebuffer,
I started X and got this:

===
[ INFO: possible circular locking dependency detected ]
2.6.34-rc6 #117
---
X/1846 is trying to acquire lock:
 (&mm->mmap_sem){++}, at: [] might_fault+0x57/0xa4

but task is already holding lock:
 (&dev->mode_config.mutex){+.+.+.}, at: [] 
drm_mode_getresources+0x33/0x54d

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&dev->mode_config.mutex){+.+.+.}:
   [] __lock_acquire+0x1408/0x1747
   [] lock_acquire+0x5a/0x71
   [] mutex_lock_nested+0x58/0x2b1
   [] drm_fb_helper_set_par+0x9c/0xf2
   [] fb_set_var+0x1de/0x2d9
   [] do_fb_ioctl+0x13a/0x46e
   [] fb_ioctl+0x21/0x23
   [] vfs_ioctl+0x2a/0x9e
   [] do_vfs_ioctl+0x4b7/0x4f4
   [] sys_ioctl+0x42/0x65
   [] system_call_fastpath+0x16/0x1b

-> #1 (&fb_info->lock){+.+.+.}:
   [] __lock_acquire+0x1408/0x1747
   [] lock_acquire+0x5a/0x71
   [] mutex_lock_nested+0x58/0x2b1
   [] fb_release+0x1c/0x54
   [] __fput+0x120/0x1e3
   [] fput+0x18/0x1a
   [] remove_vma+0x51/0x76
   [] do_munmap+0x30a/0x32e
   [] sys_munmap+0x3b/0x54
   [] system_call_fastpath+0x16/0x1b

-> #0 (&mm->mmap_sem){++}:
   [] __lock_acquire+0x112d/0x1747
   [] lock_acquire+0x5a/0x71
   [] might_fault+0x84/0xa4
   [] drm_mode_getresources+0x280/0x54d
   [] drm_ioctl+0x255/0x34b
   [] vfs_ioctl+0x2a/0x9e
   [] do_vfs_ioctl+0x4b7/0x4f4
   [] sys_ioctl+0x42/0x65
   [] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

1 lock held by X/1846:
 #0:  (&dev->mode_config.mutex){+.+.+.}, at: [] 
drm_mode_getresources+0x33/0x54d

stack backtrace:
Pid: 1846, comm: X Not tainted 2.6.34-rc6 #117
Call Trace:
 [] print_circular_bug+0xb3/0xc2
 [] __lock_acquire+0x112d/0x1747
 [] ? mark_held_locks+0x4d/0x6b
 [] ? mutex_lock_nested+0x296/0x2b1
 [] lock_acquire+0x5a/0x71
 [] ? might_fault+0x57/0xa4
 [] might_fault+0x84/0xa4
 [] ? might_fault+0x57/0xa4
 [] drm_mode_getresources+0x280/0x54d
 [] drm_ioctl+0x255/0x34b
 [] ? drm_mode_getresources+0x0/0x54d
 [] ? up_read+0x1e/0x35
 [] vfs_ioctl+0x2a/0x9e
 [] do_vfs_ioctl+0x4b7/0x4f4
 [] ? sysret_check+0x27/0x62
 [] sys_ioctl+0x42/0x65
 [] system_call_fastpath+0x16/0x1b
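
One way to read the report above, offered as a hedged sketch and not as how
lockdep itself is implemented: the three stacks record the orderings
fb_info->lock before mode_config.mutex (#2), mmap_sem before fb_info->lock
(#1), and mode_config.mutex before mmap_sem (#0), and together these close a
loop. The small userspace C model below uses invented names (lock_graph,
lock_graph_add) and knows only these three locks; it refuses any new edge
that would complete a cycle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical IDs for the three locks named in the report. */
enum lock_id { MMAP_SEM, FB_INFO_LOCK, MODE_CONFIG_MUTEX, NR_LOCKS };

/* dep[a][b] is true when some path acquired b while already holding a. */
struct lock_graph {
	bool dep[NR_LOCKS][NR_LOCKS];
};

/* Depth-first search: can we get from 'from' to 'to' along recorded edges? */
static bool reaches(const struct lock_graph *g, enum lock_id from,
		    enum lock_id to, bool *seen)
{
	if (from == to)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;
	for (int next = 0; next < NR_LOCKS; next++)
		if (g->dep[from][next] &&
		    reaches(g, (enum lock_id)next, to, seen))
			return true;
	return false;
}

/*
 * Record "wanted is taken while held is held".  Returns false, and records
 * nothing, if 'wanted' already leads back to 'held', i.e. if the new edge
 * would create a circular dependency.
 */
bool lock_graph_add(struct lock_graph *g, enum lock_id held,
		    enum lock_id wanted)
{
	bool seen[NR_LOCKS] = { false };

	assert(held < NR_LOCKS && wanted < NR_LOCKS);
	if (reaches(g, wanted, held, seen))
		return false;
	g->dep[held][wanted] = true;
	return true;
}
```

Feeding the model the orderings in the sequence #2, #1, #0 accepts the first
two and rejects the third, which matches the point at which the warning
fired (X calling drm_mode_getresources).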


DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello,

With the latest git kernel, I am getting the following DRM error and
not getting XWindows:

[   45.269075] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   45.269111] [ cut here ]
[   45.269139] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269150] Hardware name: Aspire one
[   45.269158] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269198] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #6
[   45.269208] Call Trace:
[   45.269231]  [] warn_slowpath_common+0x65/0x7c
[   45.269249]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269267]  [] warn_slowpath_null+0xd/0x10
[   45.269284]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269304]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269321]  [] kmap_atomic+0xe/0x10
[   45.269341]  [] i915_error_object_create+0xea/0x14f
[   45.269359]  [] i915_handle_error+0x369/0x868
[   45.269380]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269399]  [] run_timer_softirq+0x1c9/0x269
[   45.269417]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269435]  [] __do_softirq+0xc6/0x186
[   45.269451]  [] do_softirq+0x26/0x2b
[   45.269466]  [] irq_exit+0x29/0x66
[   45.269484]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269504]  [] apic_timer_interrupt+0x2a/0x30
[   45.269524]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.269542]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.269563]  [] cpuidle_idle_call+0x6b/0xda
[   45.269580]  [] cpu_idle+0x44/0x74
[   45.269598]  [] start_secondary+0x1b2/0x1b7
[   45.269612] ---[ end trace ce01d7ca0ae214f4 ]---
[   45.269631] [ cut here ]
[   45.269647] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269657] Hardware name: Aspire one
[   45.269665] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269700] Pid: 0, comm: swapper Tainted: GW  2.6.34-rc7-netbook #6
[   45.269710] Call Trace:
[   45.269726]  [] warn_slowpath_common+0x65/0x7c
[   45.269743]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269760]  [] warn_slowpath_null+0xd/0x10
[   45.269777]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269795]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269812]  [] kmap_atomic+0xe/0x10
[   45.269829]  [] i915_error_object_create+0xea/0x14f
[   45.269848]  [] i915_handle_error+0x369/0x868
[   45.269868]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269885]  [] run_timer_softirq+0x1c9/0x269
[   45.269903]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269920]  [] __do_softirq+0xc6/0x186
[   45.269937]  [] do_softirq+0x26/0x2b
[   45.269952]  [] irq_exit+0x29/0x66
[   45.269968]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269985]  [] apic_timer_interrupt+0x2a/0x30
[   45.270004]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.270051]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.270071]  [] cpuidle_idle_call+0x6b/0xda
[   45.270087]  [] cpu_idle+0x44/0x74
[   45.270104]  [] start_secondary+0x1b2/0x1b7
[   45.270117] ---[ end trace ce01d7ca0ae214f5 ]---
[   45.270135] [ cut here ]

dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7.txt
.config : http://userweb.kernel.org/~jaswinder/acer_netbook/config_2634-rc7.txt

How can I fix these errors?

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> Hello,
> 
> With latest git kernel, I am getting following DRM error and not
> getting XWindows :

[snip]

Hmm, there are still patches for capturing error state that haven't gone
upstream, shame on me.

That error is a secondary issue to the GPU hang that is being reported. If
it is a regression caused by a kernel update it would be very useful if
you could bisect to the erroneous commit.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Tue, May 11, 2010 at 9:40 PM, Chris Wilson  wrote:
> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
>  wrote:
>> Hello,
>>
>> With latest git kernel, I am getting following DRM error and not
>> getting XWindows :
>
> [snip]
>
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
>
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.
>

Earlier I was using Moblin; I switched to Fedora and started getting
this error. I have also tested different kernel versions but get the
same error, so I do not think this is a regression.

moblin dmesg : 
http://userweb.kernel.org/~jaswinder/moblin/dmesg-moblin_2633rc5.txt

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson wrote:

> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
>  wrote:
> > Hello,
> > 
> > With latest git kernel, I am getting following DRM error and not
> > getting XWindows :
> 
> [snip]
> 
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
> 
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.

It helps if one reads the code and the trace...

i915_error_object_create() is using KM_USER0 from softirq context. 
That's a bug, and a pretty serious one.  If some innocent civilian is
writing highmem data to disk and this timer interrupt fires and trashes
his KM_USER0 slot, the disk contents will be corrupted.

Something like this...

--- a/drivers/gpu/drm/i915/i915_irq.c~a
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
 
 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_

Please let's get a tested fix for this into 2.6.34.
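
To make the failure mode described in this message concrete, here is a
userspace toy model. It is only a sketch under stated assumptions and is
not kernel code: each CPU is modelled as a small table of mapping slots,
and mapping a page into a slot silently replaces whatever was there. The
slot names are borrowed from the thread; cpu_kmap, model_kmap_atomic(),
model_timer_softirq() and writer_mapping_survives() are invented for the
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Slot names borrowed from the thread; the real kernel has many more. */
enum km_slot { KM_USER0, KM_IRQ0, KM_NR_SLOTS };

/* Toy stand-in for one CPU's atomic-kmap slots. */
struct cpu_kmap {
	const void *mapped[KM_NR_SLOTS];
};

/* Install 'page' in 'slot', replacing any previous mapping without notice. */
const void *model_kmap_atomic(struct cpu_kmap *cpu, const void *page,
			      enum km_slot slot)
{
	assert(slot < KM_NR_SLOTS);
	cpu->mapped[slot] = page;
	return cpu->mapped[slot];
}

void model_kunmap_atomic(struct cpu_kmap *cpu, enum km_slot slot)
{
	assert(slot < KM_NR_SLOTS);
	cpu->mapped[slot] = NULL;
}

/* Models the hangcheck timer copying one page through the given slot. */
void model_timer_softirq(struct cpu_kmap *cpu, const void *gpu_page,
			 enum km_slot slot)
{
	model_kmap_atomic(cpu, gpu_page, slot);
	/* ... the page would be copied here ... */
	model_kunmap_atomic(cpu, slot);
}

/*
 * A process-context writer maps its page in KM_USER0 and is interrupted
 * by the timer before it finishes.  Returns 1 if the writer's mapping is
 * still in place afterwards, 0 if the interrupt trashed it.
 */
int writer_mapping_survives(enum km_slot irq_slot)
{
	static const char user_page[] = "file data";
	static const char gpu_page[] = "batch buffer";
	struct cpu_kmap cpu = { { NULL } };

	model_kmap_atomic(&cpu, user_page, KM_USER0);
	model_timer_softirq(&cpu, gpu_page, irq_slot);	/* interrupt fires */
	return cpu.mapped[KM_USER0] == user_page;
}
```

In this model the writer's mapping survives when the timer path uses
KM_IRQ0 and is lost when it uses KM_USER0. The local_irq_save() and
local_irq_restore() pair in the patch has no counterpart here; it
presumably keeps a hard interrupt from reusing KM_IRQ0 while the copy is
in progress.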


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton wrote:
> 
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> wrote:
> 
> > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
> >  wrote:
> > > Hello,
> > > 
> > > With latest git kernel, I am getting following DRM error and not
> > > getting XWindows :
> > 
> > [snip]
> > 
> > Hmm, there are still patches for capturing error state that haven't gone
> > upstream, shame on me.
> > 
> > That error is a secondary issue to the GPU hang that is being reported. If
> > it is a regression caused by a kernel update it would be very useful if
> > you could bisect to the erroneous commit.
> 
> It helps if one reads the code and the trace...
> 
> i915_error_object_create() is using KM_USER0 from softirq context. 
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
> 
> Something like this...
> 
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>  
>   for (page = 0; page < page_count; page++) {
>   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + unsigned long flags;
> +
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> + local_irq_save(flags);
> + s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>   memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> + kunmap_atomic(s, KM_IRQ0);
> + local_irq_restore(flags);
>   dst->pages[page] = d;
>   }
>   dst->page_count = page_count;
> _
> 
> Please let's get a tested fix for this into 2.6.34.

The change that I actually want is to replace the kmap_atomic(cpu_page) with an
io_mapping_map_atomic_wc(gtt_page): in case there is an incoherency between
the CPU and the GPU, we want to record what the GPU executed. Do you know
if similar precautions are required with io_mapping_map_atomic_wc()?

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
Directly read the GTT mapping for the contents of the batch buffers
rather than relying on possibly stale CPU caches. Also for completeness
scan the flushing/inactive lists for the current buffers - we are
collecting error state after all.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_irq.c |   64 ++
 1 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 87113da..14301a4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -441,9 +441,11 @@ static struct drm_i915_error_object *
 i915_error_object_create(struct drm_device *dev,
 			 struct drm_gem_object *src)
 {
+	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_error_object *dst;
 	struct drm_i915_gem_object *src_priv;
 	int page, page_count;
+	u32 reloc_offset;
 
 	if (src == NULL)
 		return NULL;
@@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
 	if (dst == NULL)
 		return NULL;
 
+	reloc_offset = src_priv->gtt_offset;
 	for (page = 0; page < page_count; page++) {
-		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		void __iomem *s;
+		void *d;
+
+		d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
-		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+
+		s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
+					     reloc_offset);
+		memcpy_fromio(d, s, PAGE_SIZE);
+		io_mapping_unmap_atomic(s);
+
 		dst->pages[page] = d;
+
+		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = page_count;
 	dst->gtt_offset = src_priv->gtt_offset;
@@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device *dev)
 
 		if (batchbuffer[1] == NULL &&
 		    error->acthd >= obj_priv->gtt_offset &&
-		    error->acthd < obj_priv->gtt_offset + obj->size &&
-		    batchbuffer[0] != obj)
+		    error->acthd < obj_priv->gtt_offset + obj->size)
 			batchbuffer[1] = obj;
 
 		count++;
 	}
+	/* Scan the other lists for completeness for those bizarre errors. */
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
+	if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+		list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, list) {
+			struct drm_gem_object *obj = obj_priv->obj;
+
+			if (batchbuffer[0] == NULL &&
+			    bbaddr >= obj_priv->gtt_offset &&
+			    bbaddr < obj_priv->gtt_offset + obj->size)
+				batchbuffer[0] = obj;
+
+			if (batchbuffer[1] == NULL &&
+			    error->acthd >= obj_priv->gtt_offset &&
+			    error->acthd < obj_priv->gtt_offset + obj->size)
+				batchbuffer[1] = obj;
+
+			if (batchbuffer[0] && batchbuffer[1])
+				break;
+		}
+	}
 
 	/* We need to copy these to an anonymous buffer as the simplest
 	 * method to avoid being overwritten by userpace.
 	 */
 	error->batchbuffer[0] = i915_error_object_create(dev, batchbuffer[0]);
-	error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	if (batchbuffer[1] != batchbuffer[0])
+		error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+	else
+		error->batchbuffer[1] = NULL;
 
 	/* Record the ringbuffer */
 	error->ringbuffer = i915_error_object_create(dev, dev_priv->ring.ring_obj);
-- 
1.7.1
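
The patch repeats the same address-range test for each list it walks. As a
hedged illustration only (none of this is from the driver), the lookup can
be modelled in plain C with arrays standing in for the kernel lists. The
names gtt_obj, obj_list and find_obj_by_addr() are hypothetical, and the
range test is written as a subtraction so that gtt_offset + size cannot
wrap, which differs slightly from the comparison used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer object bound into the GTT. */
struct gtt_obj {
	uint32_t gtt_offset;	/* start of the object in the aperture */
	size_t size;		/* length in bytes */
};

/* Hypothetical stand-in for one of the active/flushing/inactive lists. */
struct obj_list {
	const struct gtt_obj *objs;
	size_t count;
};

/* True if addr lies in [gtt_offset, gtt_offset + size). */
static int obj_contains(const struct gtt_obj *obj, uint32_t addr)
{
	return addr >= obj->gtt_offset && addr - obj->gtt_offset < obj->size;
}

/*
 * Walk the lists in the order given and return the first object whose
 * range contains addr, or NULL if no list holds such an object.
 */
const struct gtt_obj *find_obj_by_addr(const struct obj_list *lists,
				       size_t nr_lists, uint32_t addr)
{
	assert(lists != NULL || nr_lists == 0);
	for (size_t l = 0; l < nr_lists; l++)
		for (size_t i = 0; i < lists[l].count; i++)
			if (obj_contains(&lists[l].objs[i], addr))
				return &lists[l].objs[i];
	return NULL;
}
```

Called once with the batch buffer address and once with the ACTHD value,
a helper of this shape would cover both lookups that the patch performs
per list.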


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Andrew,

On Tue, May 11, 2010 at 8:18 PM, Andrew Morton
 wrote:
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> wrote:
>
>> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
>>  wrote:
>> > Hello,
>> >
>> > With latest git kernel, I am getting following DRM error and not
>> > getting XWindows :
>>
>> [snip]
>>
>> Hmm, there are still patches for capturing error state that haven't gone
>> upstream, shame on me.
>>
>> That error is a secondary issue to the GPU hang that is being reported. If
>> it is a regression caused by a kernel update it would be very useful if
>> you could bisect to the erroneous commit.
>
> It helps if one reads the code and the trace...
>
> i915_error_object_create() is using KM_USER0 from softirq context.
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
>
> Something like this...
>
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>
>        for (page = 0; page < page_count; page++) {
>                void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +               unsigned long flags;
> +
>                if (d == NULL)
>                        goto unwind;
> -               s = kmap_atomic(src_priv->pages[page], KM_USER0);
> +               local_irq_save(flags);
> +               s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>                memcpy(d, s, PAGE_SIZE);
> -               kunmap_atomic(s, KM_USER0);
> +               kunmap_atomic(s, KM_IRQ0);
> +               local_irq_restore(flags);
>                dst->pages[page] = d;
>        }
>        dst->page_count = page_count;
> _
>
> Please let's get a tested fix for this into 2.6.34.
>

I tested your patch with the latest Linus git tree and it works: it fixes
the softirq error.

Now I am only getting DRM errors:

[   42.276059] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   42.276398] render error detected, EIR: 0x
[   42.276460] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 18 at 17)

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:19:26 +0100 Chris Wilson wrote:

> On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton  
> wrote:
> > 
> > On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> > wrote:
> > 
> > > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput 
> > >  wrote:
> > > > Hello,
> > > > 
> > > > With latest git kernel, I am getting following DRM error and not
> > > > getting XWindows :
> > > 
> > > [snip]
> > > 
> > > Hmm, there are still patches for capturing error state that haven't gone
> > > upstream, shame on me.
> > > 
> > > That error is a secondary issue to the GPU hang that is being reported. If
> > > it is a regression caused by a kernel update it would be very useful if
> > > you could bisect to the erroneous commit.
> > 
> > It helps if one reads the code and the trace...
> > 
> > i915_error_object_create() is using KM_USER0 from softirq context. 
> > That's a bug, and a pretty serious one.  If some innocent civilian is
> > writing highmem data to disk and this timer interrupt fires and trashes
> > his KM_USER0 slot, the disk contents will be corrupted.
> > 
> > Something like this...
> > 
> > --- a/drivers/gpu/drm/i915/i915_irq.c~a
> > +++ a/drivers/gpu/drm/i915/i915_irq.c
> > @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
> >  
> > for (page = 0; page < page_count; page++) {
> > void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   unsigned long flags;
> > +
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > +   local_irq_save(flags);
> > +   s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
> > memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +   kunmap_atomic(s, KM_IRQ0);
> > +   local_irq_restore(flags);
> > dst->pages[page] = d;
> > }
> > dst->page_count = page_count;
> > _
> > 
> > Please let's get a tested fix for this into 2.6.34.
> 
> The change that I actually want is to replace the kmap_atomic(cpu_page) with an
> io_mapping_map_atomic_wc(gtt_page): in case there is an incoherency between
> the CPU and the GPU, we want to record what the GPU executed. Do you know
> if similar precautions are required with io_mapping_map_atomic_wc()?

gack, wtf is io_mapping_map_atomic_wc()?



Could do with some interface documentation. Looks too large to be inlined.

No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
KM_foo kmap slot index.




Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson wrote:

> + reloc_offset = src_priv->gtt_offset;
>   for (page = 0; page < page_count; page++) {
> - void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + void __iomem *s;
> + void *d;
> +
> + d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> - memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> +
> + s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +  reloc_offset);
> + memcpy_fromio(d, s, PAGE_SIZE);
> + io_mapping_unmap_atomic(s);

As mentioned in the other email, this will still corrupt the KM_USER0
slot, and will generate a debug_kmap_atomic() warning.



Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:37:22 -0400, Andrew Morton wrote:
> On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
> wrote:
> 
> > +   reloc_offset = src_priv->gtt_offset;
> > for (page = 0; page < page_count; page++) {
> > -   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   void __iomem *s;
> > +   void *d;
> > +
> > +   d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > -   memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +
> > +   s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> > +reloc_offset);
> > +   memcpy_fromio(d, s, PAGE_SIZE);
> > +   io_mapping_unmap_atomic(s);
> 
> As mentioned in the other email, this will still corrupt the KM_USER0
> slot, and will generate a debug_kmap_atomic() warning.

How, as kmap_atomic(KM_USER0) is no longer used?
-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> KM_foo kmap slot index.

Argh, sorry for the noise, read the mail in the wrong order. Thanks for
the review. It would be sensible to go with your simpler patch whilst
io_mapping_map_atomic_wc() is improved.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:52:31 +0100 Chris Wilson wrote:

> On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton  
> wrote:
> > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > KM_foo kmap slot index.
> 
> Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> the review. It would be sensible to go with your simpler patch whilst
> io_mapping_map_atomic_wc() is improved.

OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
Should I include this?


Subject: drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot
From: Andrew Morton 

i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot.  Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput 
Tested-by: Jaswinder Singh Rajput 
Cc: Chris Wilson 
Cc: Dave Airlie 
Signed-off-by: Andrew Morton 
---

 drivers/gpu/drm/i915/i915_irq.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff -puN drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot drivers/gpu/drm/i915/i915_irq.c
--- a/drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -461,11 +461,15 @@ i915_error_object_create(struct drm_devi
 
 	for (page = 0; page < page_count; page++) {
 		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		unsigned long flags;
+
 		if (d == NULL)
 			goto unwind;
-		s = kmap_atomic(src_priv->pages[page], KM_USER0);
+		local_irq_save(flags);
+		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
 		memcpy(d, s, PAGE_SIZE);
-		kunmap_atomic(s, KM_USER0);
+		kunmap_atomic(s, KM_IRQ0);
+		local_irq_restore(flags);
 		dst->pages[page] = d;
 	}
 	dst->page_count = page_count;
_



Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris and Andrew,

On Tue, May 11, 2010 at 11:52 PM, Chris Wilson  wrote:
> Directly read the GTT mapping for the contents of the batch buffers
> rather than relying on possibly stale CPU caches. Also for completeness
> scan the flushing/inactive lists for the current buffers - we are
> collecting error state after all.
>
> Signed-off-by: Chris Wilson 

Yes, I have tested this patch.

I booted 3 times, and this patch fixes both the DRM errors and the softirq
warnings, and XWindows starts with it applied.

I am still doing more testing.

Thanks,
--
Jaswinder Singh.
> ---
>  drivers/gpu/drm/i915/i915_irq.c |   64 ++
>  1 files changed, 57 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 87113da..14301a4 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -441,9 +441,11 @@ static struct drm_i915_error_object *
>  i915_error_object_create(struct drm_device *dev,
>                         struct drm_gem_object *src)
>  {
> +       drm_i915_private_t *dev_priv = dev->dev_private;
>        struct drm_i915_error_object *dst;
>        struct drm_i915_gem_object *src_priv;
>        int page, page_count;
> +       u32 reloc_offset;
>
>        if (src == NULL)
>                return NULL;
> @@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
>        if (dst == NULL)
>                return NULL;
>
> +       reloc_offset = src_priv->gtt_offset;
>        for (page = 0; page < page_count; page++) {
> -               void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +               void __iomem *s;
> +               void *d;
> +
> +               d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>                if (d == NULL)
>                        goto unwind;
> -               s = kmap_atomic(src_priv->pages[page], KM_USER0);
> -               memcpy(d, s, PAGE_SIZE);
> -               kunmap_atomic(s, KM_USER0);
> +
> +               s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +                                            reloc_offset);
> +               memcpy_fromio(d, s, PAGE_SIZE);
> +               io_mapping_unmap_atomic(s);
> +
>                dst->pages[page] = d;
> +
> +               reloc_offset += PAGE_SIZE;
>        }
>        dst->page_count = page_count;
>        dst->gtt_offset = src_priv->gtt_offset;
> @@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device 
> *dev)
>
>                if (batchbuffer[1] == NULL &&
>                    error->acthd >= obj_priv->gtt_offset &&
> -                   error->acthd < obj_priv->gtt_offset + obj->size &&
> -                   batchbuffer[0] != obj)
> +                   error->acthd < obj_priv->gtt_offset + obj->size)
>                        batchbuffer[1] = obj;
>
>                count++;
>        }
> +       /* Scan the other lists for completeness for those bizarre errors. */
> +       if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
> +               list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, 
> list) {
> +                       struct drm_gem_object *obj = obj_priv->obj;
> +
> +                       if (batchbuffer[0] == NULL &&
> +                           bbaddr >= obj_priv->gtt_offset &&
> +                           bbaddr < obj_priv->gtt_offset + obj->size)
> +                               batchbuffer[0] = obj;
> +
> +                       if (batchbuffer[1] == NULL &&
> +                           error->acthd >= obj_priv->gtt_offset &&
> +                           error->acthd < obj_priv->gtt_offset + obj->size)
> +                               batchbuffer[1] = obj;
> +
> +                       if (batchbuffer[0] && batchbuffer[1])
> +                               break;
> +               }
> +       }
> +       if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
> +               list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, 
> list) {
> +                       struct drm_gem_object *obj = obj_priv->obj;
> +
> +                       if (batchbuffer[0] == NULL &&
> +                           bbaddr >= obj_priv->gtt_offset &&
> +                           bbaddr < obj_priv->gtt_offset + obj->size)
> +                               batchbuffer[0] = obj;
> +
> +                       if (batchbuffer[1] == NULL &&
> +                           error->acthd >= obj_priv->gtt_offset &&
> +                           error->acthd < obj_priv->gtt_offset + obj->size)
> +                               batchbuffer[1] = obj;
> +
> +                       if (batchbuffer[0] && batchbuffer[1])
> +                               break;
> +               }
> +       }
>
>        /* We need to copy these to an anonymous buffer as the simplest
>         * method to avoid being overwritten by userpace.
>         */
>        error->batchbuffer[0] = i915_error_object_create(dev, batchbuffer[0]);
> -       error->batchbuffer[1] = i915

Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris and Andrew,

I did further testing and noticed that this patch fixes the boot
errors and warnings, and XWindows starts.

But XWindows freezes after some time.

Thanks,
--
Jaswinder Singh.

On Wed, May 12, 2010 at 12:52 AM, Jaswinder Singh Rajput
 wrote:
> Hello Chris and Andrew,
>
> On Tue, May 11, 2010 at 11:52 PM, Chris Wilson  
> wrote:
>> Directly read the GTT mapping for the contents of the batch buffers
>> rather than relying on possibly stale CPU caches. Also for completeness
>> scan the flushing/inactive lists for the current buffers - we are
>> collecting error state after all.
>>
>> Signed-off-by: Chris Wilson 
>
> Yes, I have tested this patch.
>
> I booted 3 times, and this patch fixes the DRM as well as softirq
> warnings and I am getting Xwindows with this patch.
>
> I am still doing more testing.
>
> Thanks,
> --
> Jaswinder Singh.
>> ---
>>  drivers/gpu/drm/i915/i915_irq.c |   64 
>> ++
>>  1 files changed, 57 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c 
>> b/drivers/gpu/drm/i915/i915_irq.c
>> index 87113da..14301a4 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -441,9 +441,11 @@ static struct drm_i915_error_object *
>>  i915_error_object_create(struct drm_device *dev,
>>                         struct drm_gem_object *src)
>>  {
>> +       drm_i915_private_t *dev_priv = dev->dev_private;
>>        struct drm_i915_error_object *dst;
>>        struct drm_i915_gem_object *src_priv;
>>        int page, page_count;
>> +       u32 reloc_offset;
>>
>>        if (src == NULL)
>>                return NULL;
>> @@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
>>        if (dst == NULL)
>>                return NULL;
>>
>> +       reloc_offset = src_priv->gtt_offset;
>>        for (page = 0; page < page_count; page++) {
>> -               void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>> +               void __iomem *s;
>> +               void *d;
>> +
>> +               d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>>                if (d == NULL)
>>                        goto unwind;
>> -               s = kmap_atomic(src_priv->pages[page], KM_USER0);
>> -               memcpy(d, s, PAGE_SIZE);
>> -               kunmap_atomic(s, KM_USER0);
>> +
>> +               s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
>> +                                            reloc_offset);
>> +               memcpy_fromio(d, s, PAGE_SIZE);
>> +               io_mapping_unmap_atomic(s);
>> +
>>                dst->pages[page] = d;
>> +
>> +               reloc_offset += PAGE_SIZE;
>>        }
>>        dst->page_count = page_count;
>>        dst->gtt_offset = src_priv->gtt_offset;
>> @@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device *dev)
>>
>>                if (batchbuffer[1] == NULL &&
>>                    error->acthd >= obj_priv->gtt_offset &&
>> -                   error->acthd < obj_priv->gtt_offset + obj->size &&
>> -                   batchbuffer[0] != obj)
>> +                   error->acthd < obj_priv->gtt_offset + obj->size)
>>                        batchbuffer[1] = obj;
>>
>>                count++;
>>        }
>> +       /* Scan the other lists for completeness for those bizarre errors. */
>> +       if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
>> +               list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, list) {
>> +                       struct drm_gem_object *obj = obj_priv->obj;
>> +
>> +                       if (batchbuffer[0] == NULL &&
>> +                           bbaddr >= obj_priv->gtt_offset &&
>> +                           bbaddr < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[0] = obj;
>> +
>> +                       if (batchbuffer[1] == NULL &&
>> +                           error->acthd >= obj_priv->gtt_offset &&
>> +                           error->acthd < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[1] = obj;
>> +
>> +                       if (batchbuffer[0] && batchbuffer[1])
>> +                               break;
>> +               }
>> +       }
>> +       if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
>> +               list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, list) {
>> +                       struct drm_gem_object *obj = obj_priv->obj;
>> +
>> +                       if (batchbuffer[0] == NULL &&
>> +                           bbaddr >= obj_priv->gtt_offset &&
>> +                           bbaddr < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[0] = obj;
>> +
>> +                       if (batchbuffer[1] == NULL &&
>> +                           error->acthd >= obj_priv->gtt_offset &&
>> +                           error->acthd < obj_priv->gtt_offset + obj->size)
>> +                               batchbuffer[1] = obj;
>> +

[Bug 28069] New: maniadrive - smooth play with LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable otherwise

2010-05-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=28069

           Summary: maniadrive - smooth play with
                    LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable
                    otherwise
           Product: DRI
           Version: XOrg CVS
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: medium
         Component: DRM/Radeon
        AssignedTo: dri-devel@lists.freedesktop.org
        ReportedBy: p...@mandriva.com.br


The problem described in https://bugs.freedesktop.org/show_bug.cgi?id=28002
also happens with the ati driver. It is just less visible there (the car
position jumps less), but with LIBGL_ALWAYS_INDIRECT play also becomes much
smoother.



Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput wrote:
> Hello Chris and Andrew,
> 
> I did further testing and noticed that this patch fixes the boot
> errors and warnings and I get the XWindows.
> 
> But XWindows freezes after some time.

The BUG you were hitting before is on the error collection path which
presumably is still being triggered during boot by a GPU error.
Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
recorded anything? And if not, wait until it freezes and then please file
a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
and dmesg.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton wrote:
> On Tue, 11 May 2010 19:52:31 +0100
> Chris Wilson  wrote:
> 
> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
> >  wrote:
> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > > KM_foo kmap slot index.
> > 
> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> > the review. It would be sensible to go with your simpler patch whilst
> > io_mapping_map_atomic_wc() is improved.
> 
> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
> Should I include this?

Yes.

Acked-by: Chris Wilson 

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Wed, May 12, 2010 at 1:23 AM, Chris Wilson  wrote:
> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput 
>  wrote:
>> Hello Chris and Andrew,
>>
>> I did further testing and noticed that this patch fixes the boot
>> errors and warnings and I get the XWindows.
>>
>> But XWindows freezes after some time.
>
> The BUG you were hitting before is on the error collection path which
> presumably is still being triggered during boot by a GPU error.

No, I am not getting any bug with your patch.

dmesg with your patch :
http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris.txt

> Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
> recorded anything?

No.

> And if not, wait until it freezes and then please file
> a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
> and dmesg.
>

Ok.

Thanks,
--
Jaswinder Singh.


Re: [PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Wed, May 12, 2010 at 1:35 AM, Jaswinder Singh Rajput wrote:
> Hello Chris,
>
> On Wed, May 12, 2010 at 1:23 AM, Chris Wilson  
> wrote:
>> On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput 
>>  wrote:
>>> Hello Chris and Andrew,
>>>
>>> I did further testing and noticed that this patch fixes the boot
>>> errors and warnings and I get the XWindows.
>>>
>>> But XWindows freezes after some time.
>>
>> The BUG you were hitting before is on the error collection path which
>> presumably is still being triggered during boot by a GPU error.
>
> No, I am not getting any bug with your patch.
>
> dmesg with your patch :
> http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris.txt
>

I did more testing, and the test passes 80% of the time. I get the bugs on a cold boot:

[   40.090295] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   40.090318] [ cut here ]
[   40.090338] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   40.090345] Hardware name: Aspire one
[   40.090351] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   40.090378] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #8
[   40.090385] Call Trace:
[   40.090402]  [] warn_slowpath_common+0x65/0x7c
[   40.090415]  [] ? debug_kmap_atomic+0xa9/0x11e
[   40.090428]  [] warn_slowpath_null+0xd/0x10
[   40.090440]  [] debug_kmap_atomic+0xa9/0x11e
[   40.090454]  [] kmap_atomic_prot_pfn+0x1d/0x5e
[   40.090465]  [] iomap_atomic_prot_pfn+0x23/0x26
[   40.090479]  [] i915_error_object_create+0x110/0x17c
[   40.090492]  [] i915_handle_error+0x4a2/0x9ba
[   40.090506]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   40.090518]  [] run_timer_softirq+0x1c9/0x269
[   40.090531]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   40.090543]  [] __do_softirq+0xc6/0x186
[   40.090553]  [] do_softirq+0x26/0x2b
[   40.090564]  [] irq_exit+0x29/0x66
[   40.090576]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   40.090591]  [] apic_timer_interrupt+0x2a/0x30
[   40.090605]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   40.090618]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   40.090633]  [] cpuidle_idle_call+0x6b/0xda
[   40.090645]  [] cpu_idle+0x44/0x74
[   40.090657]  [] start_secondary+0x1b2/0x1b7
[   40.090666] ---[ end trace 5e47c395a6f397dc ]---
[   40.090862] [ cut here ]

dmesg with this patch with cold boot :
http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7-chris-cold.txt

Thanks,
--
Jaswinder Singh.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Dave Airlie
On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  wrote:
> On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton  
> wrote:
>> On Tue, 11 May 2010 19:52:31 +0100
>> Chris Wilson  wrote:
>>
>> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
>> >  wrote:
>> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
>> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
>> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in 
>> > > the
>> > > KM_foo kmap slot index.
>> >
>> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
>> > the review. It would be sensible to go with your simpler patch whilst
>> > io_mapping_map_atomic_wc() is improved.
>>
>> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
>> Should I include this?
>
> Yes.
>
> Acked-by: Chris Wilson 
>

I'm not sure pushing this in at this point is a good idea. If I'm
reading it correctly, we've no idea what KM_IRQ is being used for, and
this codepath is called from non-irq contexts just as much as irq
contexts.

I'd rather we just back out the hangcheck stuff touching copies at all
at this point, and try again later, doing it properly with a slow work
or something.

Dave.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 08:22:49 +1000, Dave Airlie  wrote:
> I'd rather we just backout the hangcheck stuff touching copies at all
> at this point, and try again doing it properly with a slow work or
> something for later.

From my point of view, the information provided by the hangcheck has been
invaluable for delving into and fixing some obnoxious driver bugs. I
suspect its honeymoon period is now over - those bugs that it could
detect easily have been fixed (I hope). In order to capture the relevant
information for later chipset generations, we will need to parse the
command stream and include auxiliary buffers. So whilst I would prefer to
see this in a release so that I can easily diagnose bug reports, I accept
that there is more work to be done and will HTFU.

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 08:22:49 +1000
Dave Airlie  wrote:

> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
> wrote:
> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
> >  wrote:
> >> On Tue, 11 May 2010 19:52:31 +0100
> >> Chris Wilson  wrote:
> >>
> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
> >> >  wrote:
> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> >> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in 
> >> > > the
> >> > > KM_foo kmap slot index.
> >> >
> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> >> > the review. It would be sensible to go with your simpler patch whilst
> >> > io_mapping_map_atomic_wc() is improved.
> >>
> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
> >> Should I include this?
> >
> > Yes.
> >
> > Acked-by: Chris Wilson 
> >
> 
> I'm not sure pushing this in at this point is a good idea, if I'm
> reading it correctly we've no idea what KM_IRQ is being used for,

It's used for taking kmaps from IRQ contexts.

> and
> this codepath is called from non-irq contexts just as much as irq
> contexts.

That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
used from both irq- and non-irq contexts.  All we need to do is to
ensure that some interrupt cannot come along on this CPU and corrupt
the slot.

> I'd rather we just backout the hangcheck stuff touching copies at all
> at this point, and try again doing it properly with a slow work or
> something for later.
> 
> Dave.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Dave Airlie
On Wed, May 12, 2010 at 8:32 AM, Andrew Morton wrote:
> On Wed, 12 May 2010 08:22:49 +1000
> Dave Airlie  wrote:
>
>> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
>> wrote:
>> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
>> >  wrote:
>> >> On Tue, 11 May 2010 19:52:31 +0100
>> >> Chris Wilson  wrote:
>> >>
>> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
>> >> >  wrote:
>> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
>> >> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
>> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes 
>> >> > > in the
>> >> > > KM_foo kmap slot index.
>> >> >
>> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
>> >> > the review. It would be sensible to go with your simpler patch whilst
>> >> > io_mapping_map_atomic_wc() is improved.
>> >>
>> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
>> >> Should I include this?
>> >
>> > Yes.
>> >
>> > Acked-by: Chris Wilson 
>> >
>>
>> I'm not sure pushing this in at this point is a good idea, if I'm
>> reading it correctly we've no idea what KM_IRQ is being used for,
>
> It's used for taking kmaps from IRQ contexts.
>
>> and
>> this codepath is called from non-irq contexts just as much as irq
>> contexts.
>
> That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> used from both irq- and non-irq contexts.  All we need to do is to
> ensure that some interrupt cannot come along on this CPU and corrupt
> the slot.

I don't think we do that in a lot of places, and I'd rather not add
that in to fix this problem at this point in the release cycle, as
we've no idea what it might break/regress.

It's easier to just disable the hangcheck copy and try again for 2.6.35
with a workqueue or slow work.

Dave



>
>> I'd rather we just backout the hangcheck stuff touching copies at all
>> at this point, and try again doing it properly with a slow work or
>> something for later.
>>
>> Dave.
>


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Dave Airlie
On Wed, May 12, 2010 at 8:56 AM, Andrew Morton wrote:
> On Wed, 12 May 2010 08:51:05 +1000
> Dave Airlie  wrote:
>
>> On Wed, May 12, 2010 at 8:32 AM, Andrew Morton
>>  wrote:
>> > On Wed, 12 May 2010 08:22:49 +1000
>> > Dave Airlie  wrote:
>> >
>> >> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
>> >> wrote:
>> >> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
>> >> >  wrote:
>> >> >> On Tue, 11 May 2010 19:52:31 +0100
>> >> >> Chris Wilson  wrote:
>> >> >>
>> >> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
>> >> >> >  wrote:
>> >> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq 
>> >> >> > > context:
>> >> >> > > it hardwires use of KM_USER0.  I suggest that 
>> >> >> > > io_mapping_create_wc(),
>> >> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller 
>> >> >> > > passes in the
>> >> >> > > KM_foo kmap slot index.
>> >> >> >
>> >> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks 
>> >> >> > for
>> >> >> > the review. It would be sensible to go with your simpler patch whilst
>> >> >> > io_mapping_map_atomic_wc() is improved.
>> >> >>
>> >> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
>> >> >> Should I include this?
>> >> >
>> >> > Yes.
>> >> >
>> >> > Acked-by: Chris Wilson 
>> >> >
>> >>
>> >> I'm not sure pushing this in at this point is a good idea, if I'm
>> >> reading it correctly we've no idea what KM_IRQ is being used for,
>> >
>> > It's used for taking kmaps from IRQ contexts.
>> >
>> >> and
>> >> this codepath is called from non-irq contexts just as much as irq
>> >> contexts.
>> >
>> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
>> > used from both irq- and non-irq contexts.  All we need to do is to
>> > ensure that some interrupt cannot come along on this CPU and corrupt
>> > the slot.
>>
>> I don't think we do that in a lot of places, and I'd rather not add
>> that in to fix this problem at this point in the release cycle, as
>> we've no idea what it might break/regress.
>
> What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
> anything.
>

disabling local cpu irqs around all these kmap mappings.

Dave.


Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 08:51:05 +1000
Dave Airlie  wrote:

> On Wed, May 12, 2010 at 8:32 AM, Andrew Morton
>  wrote:
> > On Wed, 12 May 2010 08:22:49 +1000
> > Dave Airlie  wrote:
> >
> >> On Wed, May 12, 2010 at 5:57 AM, Chris Wilson  
> >> wrote:
> >> > On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton 
> >> >  wrote:
> >> >> On Tue, 11 May 2010 19:52:31 +0100
> >> >> Chris Wilson  wrote:
> >> >>
> >> >> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton 
> >> >> >  wrote:
> >> >> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq 
> >> >> > > context:
> >> >> > > it hardwires use of KM_USER0.  I suggest that 
> >> >> > > io_mapping_create_wc(),
> >> >> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes 
> >> >> > > in the
> >> >> > > KM_foo kmap slot index.
> >> >> >
> >> >> > Argh, sorry for the noise, read the mail in the wrong order. Thanks 
> >> >> > for
> >> >> > the review. It would be sensible to go with your simpler patch whilst
> >> >> > io_mapping_map_atomic_wc() is improved.
> >> >>
> >> >> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.
> >> >> Should I include this?
> >> >
> >> > Yes.
> >> >
> >> > Acked-by: Chris Wilson 
> >> >
> >>
> >> I'm not sure pushing this in at this point is a good idea, if I'm
> >> reading it correctly we've no idea what KM_IRQ is being used for,
> >
> > It's used for taking kmaps from IRQ contexts.
> >
> >> and
> >> this codepath is called from non-irq contexts just as much as irq
> >> contexts.
> >
> > > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> > > used from both irq- and non-irq contexts.  All we need to do is to
> > ensure that some interrupt cannot come along on this CPU and corrupt
> > the slot.
> 
> I don't think we do that in a lot of places, and I'd rather not add
> that in to fix this problem at this point in the release cycle, as
> we've no idea what it might break/regress.

What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
anything.

> Its easier to just disable the hangcheck copy and try again for 2.6.35
> with a workqueue or slow work.



Re: DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Wed, 12 May 2010 09:17:09 +1000
Dave Airlie  wrote:

> >> >> and
> >> >> this codepath is called from non-irq contexts just as much as irq
> >> >> contexts.
> >> >
> >> > That's fine.  As long as we do a local_irq_disable(), KM_IRQ0 can be
> >> > used from both irq- and non-irq contexts.  All we need to do is to
> >> > ensure that some interrupt cannot come along on this CPU and corrupt
> >> > the slot.
> >>
> >> I don't think we do that in a lot of places, and I'd rather not add
> >> that in to fix this problem at this point in the release cycle, as
> >> we've no idea what it might break/regress.
> >
> > > What is "that"?  The switch to irq-protected KM_IRQ0?  That won't break
> > anything.
> >
> 
> disabling local cpu irqs around all these kmap mappings.
> 

Ah.  Well if there are other uses of KM_USER0 from interrupt context
then yes, we have more problems.  CONFIG_DEBUG_HIGHMEM &&
CONFIG_TRACE_IRQFLAGS_SUPPORT will detect that and as long as Jaswinder
has hit all code paths in his testing, we're good.  Some manual review
for this would be good.



[Bug 27211] endless PROTECTION_FAULT logs, Nouveau drm, TNT card

2010-05-11 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=27211

--- Comment #7 from Brent  2010-05-11 23:35:47 PDT ---
Tried the above patch against the stock 2.6.33.3 kernel, with acceleration
turned on, and got "(EE) [drm] failed to open device" in Xorg.0.log.  But it
did give me an 80x50 text console, so it seems the framebuffer works.  And it is
no longer filling the logs with error messages.  So that's something.

From what I read, nouveau and drm are very touchy about matching versions.  I
started with the kernel configuration for Arch Linux, but ended up removing
many modules as otherwise that 350 MHz Pentium II needed 6 hours to compile.  I
updated the system before I tried the patch.  These are the current versions
for Arch Linux:

nouveau-drm 0.0.16_20100313-2
xf86-video-nouveau 0.0.15_git20100314-1
kernel26 2.6.33.3-2


Here is the tail of Xorg.0.log:

(II) Module nouveau: vendor="X.Org Foundation"
compiled for 1.7.5.902, module version = 0.0.15
Module class: X.Org Video Driver
ABI class: X.Org Video Driver, version 6.0
(II) NOUVEAU driver 
(II) NOUVEAU driver for NVIDIA chipset families :
        RIVA TNT    (NV04)
        RIVA TNT2   (NV05)
        GeForce 256 (NV10)
        GeForce 2   (NV11, NV15)
        GeForce 4MX (NV17, NV18)
        GeForce 3   (NV20)
        GeForce 4Ti (NV25, NV28)
        GeForce FX  (NV3x)
        GeForce 6   (NV4x)
        GeForce 7   (G7x)
        GeForce 8   (G8x)
(II) Primary Device is: PCI 0...@00:00:0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenByBusid: Searching for BusID pci::01:00.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenByBusid: drmOpenMinor returns 7
drmOpenByBusid: drmGetBusid reports pci::01:00.0
(EE) [drm] failed to open device
(EE) No devices detected.

Fatal server error:
no screens found



[PATCH] drm/radeon/kms: add query for crtc hw id from crtc id to get info

2010-05-11 Thread Dave Airlie
On Sat, May 8, 2010 at 1:18 AM, Jerome Glisse  wrote:
> Userspace needs to know the hw crtc id (0, 1, 2, ...) from the drm
> crtc id. Bump the minor version so userspace can conditionally enable
> features that depend on this.
>
> Signed-off-by: Jerome Glisse 
> ---
>  drivers/gpu/drm/radeon/radeon_drv.c |    3 ++-
>  drivers/gpu/drm/radeon/radeon_kms.c |   18 ++
>  include/drm/radeon_drm.h            |    1 +
>  3 files changed, 21 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
> index b3749d4..df96ace 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -44,9 +44,10 @@
>  * - 2.1.0 - add square tiling interface
>  * - 2.2.0 - add r6xx/r7xx const buffer support
>  * - 2.3.0 - add MSPOS + 3D texture + r500 VAP regs
> + * - 2.4.0 - add crtc id query
>  */
>  #define KMS_DRIVER_MAJOR       2
> -#define KMS_DRIVER_MINOR       3
> +#define KMS_DRIVER_MINOR       4
>  #define KMS_DRIVER_PATCHLEVEL  0
>  int radeon_driver_load_kms(struct drm_device *dev, unsigned long flags);
>  int radeon_driver_unload_kms(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
> index d3657dc..04ad452 100644
> --- a/drivers/gpu/drm/radeon/radeon_kms.c
> +++ b/drivers/gpu/drm/radeon/radeon_kms.c
> @@ -98,11 +98,15 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>  {
>         struct radeon_device *rdev = dev->dev_private;
>         struct drm_radeon_info *info;
> +       struct radeon_mode_info *minfo = &rdev->mode_info;
>         uint32_t *value_ptr;
>         uint32_t value;
> +       struct drm_crtc *crtc;
> +       int i, found;
>
>         info = data;
>         value_ptr = (uint32_t *)((unsigned long)info->value);
> +       value = *value_ptr;
>         switch (info->request) {
>         case RADEON_INFO_DEVICE_ID:
>                 value = dev->pci_device;
> @@ -116,6 +120,20 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
>         case RADEON_INFO_ACCEL_WORKING:
>                 value = rdev->accel_working;
>                 break;
> +       case RADEON_INFO_CRTC_FROM_ID:
> +               for (i = 0, found = 0; i < 6; i++) {
> +                       crtc = (struct drm_crtc *)minfo->crtcs[i];
> +                       if (crtc && crtc->base.id == value) {
> +                               value = i;
> +                               found = 1;
> +                               break;
> +                       }
> +               }
> +               if (!found) {
> +                       DRM_ERROR("unknown crtc id %d\n", value);

Don't DRM_ERROR or hardcode 6 here.

We have rdev->num_crtc, and calling DRM_ERROR from a path that a user
can trigger directly by doing something bad is generally a bad idea.

Dave.
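
One way to read this review comment, as a minimal sketch and not the
actual follow-up patch: bound the loop by a per-device CRTC count instead
of a literal 6, and report an unknown id through an error return rather
than a log message. The struct definitions below are simplified stand-ins
invented so the example compiles in userspace; only the names num_crtc,
crtcs and base.id come from the thread, while MAX_CRTCS and
crtc_hw_id_from_drm_id() are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures (assumptions made for
 * this sketch, not the real radeon/drm definitions). */
#define MAX_CRTCS 6

struct mode_object { uint32_t id; };
struct crtc { struct mode_object base; };

struct device_info {
	int num_crtc;                   /* CRTCs present on this device */
	struct crtc *crtcs[MAX_CRTCS];  /* indexed by hardware CRTC id */
};

/*
 * Map a DRM CRTC object id to the hardware CRTC index.
 * Returns the index (>= 0) on success, or -EINVAL if the id is unknown.
 * Nothing is logged on failure, because the id comes from userspace.
 */
static int crtc_hw_id_from_drm_id(const struct device_info *dev,
				  uint32_t drm_id)
{
	int i;

	/* Bound by the device's CRTC count, clamped to the array size. */
	for (i = 0; i < dev->num_crtc && i < MAX_CRTCS; i++) {
		const struct crtc *crtc = dev->crtcs[i];

		if (crtc && crtc->base.id == drm_id)
			return i;
	}
	return -EINVAL;
}
```

In an ioctl handler, the negative return could be passed straight back to
the caller, so a misbehaving client gets an error code but cannot flood
the kernel log.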


DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello,

With the latest git kernel, I am getting the following DRM error and
X does not start:

[   45.269075] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[   45.269111] [ cut here ]
[   45.269139] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269150] Hardware name: Aspire one
[   45.269158] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269198] Pid: 0, comm: swapper Not tainted 2.6.34-rc7-netbook #6
[   45.269208] Call Trace:
[   45.269231]  [] warn_slowpath_common+0x65/0x7c
[   45.269249]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269267]  [] warn_slowpath_null+0xd/0x10
[   45.269284]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269304]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269321]  [] kmap_atomic+0xe/0x10
[   45.269341]  [] i915_error_object_create+0xea/0x14f
[   45.269359]  [] i915_handle_error+0x369/0x868
[   45.269380]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269399]  [] run_timer_softirq+0x1c9/0x269
[   45.269417]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269435]  [] __do_softirq+0xc6/0x186
[   45.269451]  [] do_softirq+0x26/0x2b
[   45.269466]  [] irq_exit+0x29/0x66
[   45.269484]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269504]  [] apic_timer_interrupt+0x2a/0x30
[   45.269524]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.269542]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.269563]  [] cpuidle_idle_call+0x6b/0xda
[   45.269580]  [] cpu_idle+0x44/0x74
[   45.269598]  [] start_secondary+0x1b2/0x1b7
[   45.269612] ---[ end trace ce01d7ca0ae214f4 ]---
[   45.269631] [ cut here ]
[   45.269647] WARNING: at mm/highmem.c:453 debug_kmap_atomic+0xa9/0x11e()
[   45.269657] Hardware name: Aspire one
[   45.269665] Modules linked in: nf_conntrack_ftp ath9k ath9k_common battery ath9k_hw [last unloaded: scsi_wait_scan]
[   45.269700] Pid: 0, comm: swapper Tainted: GW  2.6.34-rc7-netbook #6
[   45.269710] Call Trace:
[   45.269726]  [] warn_slowpath_common+0x65/0x7c
[   45.269743]  [] ? debug_kmap_atomic+0xa9/0x11e
[   45.269760]  [] warn_slowpath_null+0xd/0x10
[   45.269777]  [] debug_kmap_atomic+0xa9/0x11e
[   45.269795]  [] kmap_atomic_prot+0x4d/0xb2
[   45.269812]  [] kmap_atomic+0xe/0x10
[   45.269829]  [] i915_error_object_create+0xea/0x14f
[   45.269848]  [] i915_handle_error+0x369/0x868
[   45.269868]  [] i915_hangcheck_elapsed+0x9f/0xdf
[   45.269885]  [] run_timer_softirq+0x1c9/0x269
[   45.269903]  [] ? i915_hangcheck_elapsed+0x0/0xdf
[   45.269920]  [] __do_softirq+0xc6/0x186
[   45.269937]  [] do_softirq+0x26/0x2b
[   45.269952]  [] irq_exit+0x29/0x66
[   45.269968]  [] smp_apic_timer_interrupt+0x6e/0x7c
[   45.269985]  [] apic_timer_interrupt+0x2a/0x30
[   45.270004]  [] ? ftrace_raw_event_signal_generate+0x6d/0xd4
[   45.270051]  [] ? acpi_idle_enter_simple+0x13b/0x168
[   45.270071]  [] cpuidle_idle_call+0x6b/0xda
[   45.270087]  [] cpu_idle+0x44/0x74
[   45.270104]  [] start_secondary+0x1b2/0x1b7
[   45.270117] ---[ end trace ce01d7ca0ae214f5 ]---
[   45.270135] [ cut here ]

dmesg : http://userweb.kernel.org/~jaswinder/acer_netbook/dmesg_2634-rc7.txt
.config : http://userweb.kernel.org/~jaswinder/acer_netbook/config_2634-rc7.txt

How can I fix these errors?

Thanks,
--
Jaswinder Singh.


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput  wrote:
> Hello,
> 
> With latest git kernel, I am getting following DRM error and not
> getting XWindows :

[snip]

Hmm, there are still patches for capturing error state that haven't gone
upstream, shame on me.

That error is a secondary issue to the GPU hang that is being reported. If
it is a regression caused by a kernel update it would be very useful if
you could bisect to the erroneous commit.

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Chris,

On Tue, May 11, 2010 at 9:40 PM, Chris Wilson  
wrote:
> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
>> Hello,
>>
>> With latest git kernel, I am getting following DRM error and not
>> getting XWindows :
>
> [snip]
>
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
>
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.
>

Earlier I was using Moblin; I switched to Fedora and started getting
this error. I have also tested different kernel versions but get the
same error, so I do not think this is a regression.

moblin dmesg : 
http://userweb.kernel.org/~jaswinder/moblin/dmesg-moblin_2633rc5.txt

Thanks,
--
Jaswinder Singh.


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
wrote:

> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> > Hello,
> > 
> > With latest git kernel, I am getting following DRM error and not
> > getting XWindows :
> 
> [snip]
> 
> Hmm, there are still patches for capturing error state that haven't gone
> upstream, shame on me.
> 
> That error is a secondary issue to the GPU hang that is being reported. If
> it is a regression caused by a kernel update it would be very useful if
> you could bisect to the erroneous commit.

It helps if one reads the code and the trace...

i915_error_object_create() is using KM_USER0 from softirq context. 
That's a bug, and a pretty serious one.  If some innocent civilian is
writing highmem data to disk and this timer interrupt fires and trashes
his KM_USER0 slot, the disk contents will be corrupted.

Something like this...

--- a/drivers/gpu/drm/i915/i915_irq.c~a
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi

for (page = 0; page < page_count; page++) {
void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+   unsigned long flags;
+
if (d == NULL)
goto unwind;
-   s = kmap_atomic(src_priv->pages[page], KM_USER0);
+   local_irq_save(flags);
+   s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
memcpy(d, s, PAGE_SIZE);
-   kunmap_atomic(s, KM_USER0);
+   kunmap_atomic(s, KM_IRQ0);
+   local_irq_restore(flags);
dst->pages[page] = d;
}
dst->page_count = page_count;
_

Please let's get a tested fix for this into 2.6.34.
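
To make the failure mode concrete, here is a minimal user-space sketch, not
kernel code. It assumes a toy model in which each kmap slot is a single
pointer shared by everything running on one CPU. The names toy_map(),
fake_irq_copy(), write_page_interrupted() and the SLOT_* constants are
invented for illustration and only stand in for kmap_atomic() and the KM_*
indices.

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE 8

/* Hypothetical stand-ins for the KM_USER0 / KM_IRQ0 slot indices. */
enum toy_slot { SLOT_USER0, SLOT_IRQ0, SLOT_COUNT };

/* One mapping per slot: mapping a page replaces whatever was there. */
static char *slots[SLOT_COUNT];

static void toy_map(enum toy_slot slot, char *page)
{
	slots[slot] = page;
}

/* "Interrupt handler": snapshots src through the given slot. */
static void fake_irq_copy(enum toy_slot slot, char *src, char *snapshot)
{
	toy_map(slot, src);
	memcpy(snapshot, slots[slot], TOY_PAGE);
	/* Nothing is restored: the slot keeps pointing at src afterwards. */
}

/*
 * "Process context": fills dst with 'D' through SLOT_USER0 and is
 * interrupted half way. Returns how many bytes of dst were written.
 */
static int write_page_interrupted(enum toy_slot irq_slot)
{
	char dst[TOY_PAGE], gpu[TOY_PAGE], snapshot[TOY_PAGE];
	int i, written = 0;

	memset(dst, '.', sizeof(dst));
	memset(gpu, 'G', sizeof(gpu));

	toy_map(SLOT_USER0, dst);
	for (i = 0; i < TOY_PAGE; i++) {
		if (i == TOY_PAGE / 2)
			fake_irq_copy(irq_slot, gpu, snapshot);
		slots[SLOT_USER0][i] = 'D';	/* always goes through the slot */
	}

	for (i = 0; i < TOY_PAGE; i++)
		written += (dst[i] == 'D');
	return written;
}
```

In this model write_page_interrupted(SLOT_USER0) leaves half of dst
unwritten, because the remaining writes land in the page the "interrupt"
mapped, while write_page_interrupted(SLOT_IRQ0) writes all of it. The patch
also wraps the mapping in local_irq_save()/local_irq_restore(), presumably so
that a hard interrupt cannot reuse KM_IRQ0 in the middle of the copy; the
sketch leaves that out.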


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton  wrote:
> 
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> wrote:
> 
> > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> > > Hello,
> > > 
> > > With latest git kernel, I am getting following DRM error and not
> > > getting XWindows :
> > 
> > [snip]
> > 
> > Hmm, there are still patches for capturing error state that haven't gone
> > upstream, shame on me.
> > 
> > That error is a secondary issue to the GPU hang that is being reported. If
> > it is a regression caused by a kernel update it would be very useful if
> > you could bisect to the erroneous commit.
> 
> It helps if one reads the code and the trace...
> 
> i915_error_object_create() is using KM_USER0 from softirq context. 
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
> 
> Something like this...
> 
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>  
>   for (page = 0; page < page_count; page++) {
>   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + unsigned long flags;
> +
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> + local_irq_save(flags);
> + s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>   memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> + kunmap_atomic(s, KM_IRQ0);
> + local_irq_restore(flags);
>   dst->pages[page] = d;
>   }
>   dst->page_count = page_count;
> _
> 
> Please let's get a tested fix for this into 2.6.34.

The change that I actually want is to replace the kmap_atomic(cpu_page) with an
io_mapping_map_atomic_wc(gtt_page): in case there is an incoherency between
the CPU and the GPU, we want to record what the GPU executed. Do you know
if similar precautions are required with io_mapping_map_atomic_wc()?

-- 
Chris Wilson, Intel Open Source Technology Centre


[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
Directly read the GTT mapping for the contents of the batch buffers
rather than relying on possibly stale CPU caches. Also for completeness
scan the flushing/inactive lists for the current buffers - we are
collecting error state after all.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_irq.c |   64 ++
 1 files changed, 57 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 87113da..14301a4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -441,9 +441,11 @@ static struct drm_i915_error_object *
 i915_error_object_create(struct drm_device *dev,
 struct drm_gem_object *src)
 {
+   drm_i915_private_t *dev_priv = dev->dev_private;
struct drm_i915_error_object *dst;
struct drm_i915_gem_object *src_priv;
int page, page_count;
+   u32 reloc_offset;

if (src == NULL)
return NULL;
@@ -458,14 +460,23 @@ i915_error_object_create(struct drm_device *dev,
if (dst == NULL)
return NULL;

+   reloc_offset = src_priv->gtt_offset;
for (page = 0; page < page_count; page++) {
-   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+   void __iomem *s;
+   void *d;
+
+   d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
if (d == NULL)
goto unwind;
-   s = kmap_atomic(src_priv->pages[page], KM_USER0);
-   memcpy(d, s, PAGE_SIZE);
-   kunmap_atomic(s, KM_USER0);
+
+   s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
+reloc_offset);
+   memcpy_fromio(d, s, PAGE_SIZE);
+   io_mapping_unmap_atomic(s);
+
dst->pages[page] = d;
+
+   reloc_offset += PAGE_SIZE;
}
dst->page_count = page_count;
dst->gtt_offset = src_priv->gtt_offset;
@@ -621,18 +632,57 @@ static void i915_capture_error_state(struct drm_device *dev)

if (batchbuffer[1] == NULL &&
error->acthd >= obj_priv->gtt_offset &&
-   error->acthd < obj_priv->gtt_offset + obj->size &&
-   batchbuffer[0] != obj)
+   error->acthd < obj_priv->gtt_offset + obj->size)
batchbuffer[1] = obj;

count++;
}
+   /* Scan the other lists for completeness for those bizarre errors. */
+   if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+   list_for_each_entry(obj_priv, &dev_priv->mm.flushing_list, list) {
+   struct drm_gem_object *obj = obj_priv->obj;
+
+   if (batchbuffer[0] == NULL &&
+   bbaddr >= obj_priv->gtt_offset &&
+   bbaddr < obj_priv->gtt_offset + obj->size)
+   batchbuffer[0] = obj;
+
+   if (batchbuffer[1] == NULL &&
+   error->acthd >= obj_priv->gtt_offset &&
+   error->acthd < obj_priv->gtt_offset + obj->size)
+   batchbuffer[1] = obj;
+
+   if (batchbuffer[0] && batchbuffer[1])
+   break;
+   }
+   }
+   if (batchbuffer[0] == NULL || batchbuffer[1] == NULL) {
+   list_for_each_entry(obj_priv, &dev_priv->mm.inactive_list, list) {
+   struct drm_gem_object *obj = obj_priv->obj;
+
+   if (batchbuffer[0] == NULL &&
+   bbaddr >= obj_priv->gtt_offset &&
+   bbaddr < obj_priv->gtt_offset + obj->size)
+   batchbuffer[0] = obj;
+
+   if (batchbuffer[1] == NULL &&
+   error->acthd >= obj_priv->gtt_offset &&
+   error->acthd < obj_priv->gtt_offset + obj->size)
+   batchbuffer[1] = obj;
+
+   if (batchbuffer[0] && batchbuffer[1])
+   break;
+   }
+   }

/* We need to copy these to an anonymous buffer as the simplest
 * method to avoid being overwritten by userpace.
 */
error->batchbuffer[0] = i915_error_object_create(dev, batchbuffer[0]);
-   error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+   if (batchbuffer[1] != batchbuffer[0])
+   error->batchbuffer[1] = i915_error_object_create(dev, batchbuffer[1]);
+   else
+   error->batchbuffer[1] = NULL;

/* Record the ringbuffer */
error->ringbuffer = i915_error_object_create(dev, dev_priv->ring.ring_obj);
-- 
1.7.1
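
The lookup the patch repeats for each list is a half-open range test: an
object owns an address when gtt_offset <= addr < gtt_offset + size. A minimal
user-space sketch of that search follows. struct toy_obj, struct toy_list,
find_owner() and find_owner_in_lists() are hypothetical names, and plain
arrays stand in for the driver's active, flushing and inactive lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_obj {
	uint32_t gtt_offset;	/* start of the object in the aperture */
	uint32_t size;		/* length in bytes */
};

struct toy_list {
	const struct toy_obj *objs;
	size_t count;
};

/* Return the first object whose [gtt_offset, gtt_offset + size) holds addr. */
static const struct toy_obj *find_owner(const struct toy_obj *list, size_t n,
					uint32_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* Subtracting (unlike the patch, which adds) cannot overflow. */
		if (addr >= list[i].gtt_offset &&
		    addr - list[i].gtt_offset < list[i].size)
			return &list[i];
	}
	return NULL;
}

/* Try each list in order and stop at the first hit. */
static const struct toy_obj *find_owner_in_lists(const struct toy_list *lists,
						 size_t nlists, uint32_t addr)
{
	size_t i;

	for (i = 0; i < nlists; i++) {
		const struct toy_obj *hit = find_owner(lists[i].objs,
						       lists[i].count, addr);
		if (hit != NULL)
			return hit;
	}
	return NULL;
}
```

Searching the lists in a fixed order mirrors the patch, which only falls back
to the flushing and inactive lists when the earlier scan left a batchbuffer
slot empty.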



DRM Error on Acer Aspire One

2010-05-11 Thread Jaswinder Singh Rajput
Hello Andrew,

On Tue, May 11, 2010 at 8:18 PM, Andrew Morton
 wrote:
> On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson  
> wrote:
>
>> On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
>> > Hello,
>> >
>> > With latest git kernel, I am getting following DRM error and not
>> > getting XWindows :
>>
>> [snip]
>>
>> Hmm, there are still patches for capturing error state that haven't gone
>> upstream, shame on me.
>>
>> That error is a secondary issue to the GPU hang that is being reported. If
>> it is a regression caused by a kernel update it would be very useful if
>> you could bisect to the erroneous commit.
>
> It helps if one reads the code and the trace...
>
> i915_error_object_create() is using KM_USER0 from softirq context.
> That's a bug, and a pretty serious one.  If some innocent civilian is
> writing highmem data to disk and this timer interrupt fires and trashes
> his KM_USER0 slot, the disk contents will be corrupted.
>
> Something like this...
>
> --- a/drivers/gpu/drm/i915/i915_irq.c~a
> +++ a/drivers/gpu/drm/i915/i915_irq.c
> @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
>
>  	for (page = 0; page < page_count; page++) {
>  		void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> +		unsigned long flags;
> +
>  		if (d == NULL)
>  			goto unwind;
> -		s = kmap_atomic(src_priv->pages[page], KM_USER0);
> +		local_irq_save(flags);
> +		s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
>  		memcpy(d, s, PAGE_SIZE);
> -		kunmap_atomic(s, KM_USER0);
> +		kunmap_atomic(s, KM_IRQ0);
> +		local_irq_restore(flags);
>  		dst->pages[page] = d;
>  	}
>  	dst->page_count = page_count;
> _
>
> Please let's get a tested fix for this into 2.6.34.
>

I tested your patch with the latest Linus git and it works; it fixes the
softirq error.

Now I am only getting DRM errors:

[   42.276059] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer
elapsed... GPU hung
[   42.276398] render error detected, EIR: 0x
[   42.276460] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request
returns -5 (awaiting 18 at 17)

Thanks,
--
Jaswinder Singh.


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:19:26 +0100 Chris Wilson  
wrote:

> On Tue, 11 May 2010 10:48:18 -0400, Andrew Morton wrote:
> > 
> > On Tue, 11 May 2010 17:10:53 +0100 Chris Wilson wrote:
> > 
> > > On Tue, 11 May 2010 20:30:07 +0530, Jaswinder Singh Rajput wrote:
> > > > Hello,
> > > > 
> > > > With latest git kernel, I am getting following DRM error and not
> > > > getting XWindows :
> > > 
> > > [snip]
> > > 
> > > Hmm, there are still patches for capturing error state that haven't gone
> > > upstream, shame on me.
> > > 
> > > That error is a secondary issue to the GPU hang that is being reported. If
> > > it is a regression caused by a kernel update it would be very useful if
> > > you could bisect to the erroneous commit.
> > 
> > It helps if one reads the code and the trace...
> > 
> > i915_error_object_create() is using KM_USER0 from softirq context. 
> > That's a bug, and a pretty serious one.  If some innocent civilian is
> > writing highmem data to disk and this timer interrupt fires and trashes
> > his KM_USER0 slot, the disk contents will be corrupted.
> > 
> > Something like this...
> > 
> > --- a/drivers/gpu/drm/i915/i915_irq.c~a
> > +++ a/drivers/gpu/drm/i915/i915_irq.c
> > @@ -456,11 +456,15 @@ i915_error_object_create(struct drm_devi
> >  
> > for (page = 0; page < page_count; page++) {
> > void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   unsigned long flags;
> > +
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > +   local_irq_save(flags);
> > +   s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
> > memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +   kunmap_atomic(s, KM_IRQ0);
> > +   local_irq_restore(flags);
> > dst->pages[page] = d;
> > }
> > dst->page_count = page_count;
> > _
> > 
> > Please let's get a tested fix for this into 2.6.34.
> 
> The change that I actually want is to replace the kmap_atomic(cpu_page) with
> an io_mapping_map_atomic_wc(gtt_page): in case there is an incoherency between
> the CPU and the GPU, we want to record what the GPU executed. Do you know
> if similar precautions are required with io_mapping_map_atomic_wc()?

gack, wtf is io_mapping_map_atomic_wc()?



Could do with some interface documentation. Looks too large to be inlined.

No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
KM_foo kmap slot index.
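
A rough sketch of that suggestion, as a user-space model rather than a patch:
the map call takes the slot index from the caller instead of hardwiring one.
Everything here is hypothetical (struct toy_io_mapping, toy_map_atomic_wc(),
toy_unmap_atomic() and the TOY_KM_* values); it only illustrates the shape of
the interface, not how the kernel would implement it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot indices standing in for KM_USER0 and KM_IRQ0. */
enum toy_km_type { TOY_KM_USER0, TOY_KM_IRQ0, TOY_KM_TYPE_NR };

struct toy_io_mapping {
	unsigned char *base;	/* start of the modelled aperture */
	size_t size;		/* length of the aperture in bytes */
};

#define TOY_SLOT_FREE ((size_t)-1)

/* Which offset each slot currently maps, or TOY_SLOT_FREE. */
static size_t slot_offset[TOY_KM_TYPE_NR] = { TOY_SLOT_FREE, TOY_SLOT_FREE };

/* Map offset through the slot the caller chose; NULL if that slot is busy. */
static void *toy_map_atomic_wc(struct toy_io_mapping *m, size_t offset,
			       enum toy_km_type slot)
{
	if (offset >= m->size || slot_offset[slot] != TOY_SLOT_FREE)
		return NULL;
	slot_offset[slot] = offset;
	return m->base + offset;
}

/* Release the slot so it can be mapped again. */
static void toy_unmap_atomic(enum toy_km_type slot)
{
	slot_offset[slot] = TOY_SLOT_FREE;
}
```

With the slot in the caller's hands, an interrupt-context user can pass
TOY_KM_IRQ0 and leave a process-context mapping held in TOY_KM_USER0
untouched, which is the property the hardwired version cannot offer.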




[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
wrote:

> + reloc_offset = src_priv->gtt_offset;
>   for (page = 0; page < page_count; page++) {
> - void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> + void __iomem *s;
> + void *d;
> +
> + d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
>   if (d == NULL)
>   goto unwind;
> - s = kmap_atomic(src_priv->pages[page], KM_USER0);
> - memcpy(d, s, PAGE_SIZE);
> - kunmap_atomic(s, KM_USER0);
> +
> + s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> +  reloc_offset);
> + memcpy_fromio(d, s, PAGE_SIZE);
> + io_mapping_unmap_atomic(s);

As mentioned in the other email, this will still corrupt the KM_USER0
slot, and will generate a debug_kmap_atomic() warning.



[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:37:22 -0400, Andrew Morton  wrote:
> On Tue, 11 May 2010 19:22:14 +0100 Chris Wilson  
> wrote:
> 
> > +   reloc_offset = src_priv->gtt_offset;
> > for (page = 0; page < page_count; page++) {
> > -   void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > +   void __iomem *s;
> > +   void *d;
> > +
> > +   d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
> > if (d == NULL)
> > goto unwind;
> > -   s = kmap_atomic(src_priv->pages[page], KM_USER0);
> > -   memcpy(d, s, PAGE_SIZE);
> > -   kunmap_atomic(s, KM_USER0);
> > +
> > +   s = io_mapping_map_atomic_wc(dev_priv->mm.gtt_mapping,
> > +reloc_offset);
> > +   memcpy_fromio(d, s, PAGE_SIZE);
> > +   io_mapping_unmap_atomic(s);
> 
> As mentioned in the other email, this will still corrupt the KM_USER0
> slot, and will generate a debug_kmap_atomic() warning.

How, as kmap_atomic(KM_USER0) is no longer used?
-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton  wrote:
> No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> KM_foo kmap slot index.

Argh, sorry for the noise, read the mail in the wrong order. Thanks for
the review. It would be sensible to go with your simpler patch whilst
io_mapping_map_atomic_wc() is improved.

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Andrew Morton
On Tue, 11 May 2010 19:52:31 +0100
Chris Wilson  wrote:

> On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > KM_foo kmap slot index.
> 
> Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> the review. It would be sensible to go with your simpler patch whilst
> io_mapping_map_atomic_wc() is improved.

OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
Should I include this?


Subject: drivers/gpu/drm/i915/i915_irq.c:i915_error_object_create(): use correct kmap-atomic slot
From: Andrew Morton 

i915_error_object_create() is called from the timer interrupt and hence
can corrupt the KM_USER0 slot.  Use KM_IRQ0 instead.

Reported-by: Jaswinder Singh Rajput 
Tested-by: Jaswinder Singh Rajput 
Cc: Chris Wilson 
Cc: Dave Airlie 
Signed-off-by: Andrew Morton 
---

 drivers/gpu/drm/i915/i915_irq.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff -puN drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot drivers/gpu/drm/i915/i915_irq.c
--- a/drivers/gpu/drm/i915/i915_irq.c~drivers-gpu-drm-i915-i915_irqc-i915_error_object_create-use-correct-kmap-atomic-slot
+++ a/drivers/gpu/drm/i915/i915_irq.c
@@ -461,11 +461,15 @@ i915_error_object_create(struct drm_devi

for (page = 0; page < page_count; page++) {
void *s, *d = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+   unsigned long flags;
+
if (d == NULL)
goto unwind;
-   s = kmap_atomic(src_priv->pages[page], KM_USER0);
+   local_irq_save(flags);
+   s = kmap_atomic(src_priv->pages[page], KM_IRQ0);
memcpy(d, s, PAGE_SIZE);
-   kunmap_atomic(s, KM_USER0);
+   kunmap_atomic(s, KM_IRQ0);
+   local_irq_restore(flags);
dst->pages[page] = d;
}
dst->page_count = page_count;
_



[Bug 28069] New: maniadrive - smooth play with LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable otherwise

2010-05-11 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=28069

   Summary: maniadrive - smooth play with
LIBGL_ALWAYS_INDIRECT=true, (almost) unplayable
otherwise
   Product: DRI
   Version: XOrg CVS
  Platform: Other
OS/Version: All
Status: NEW
  Severity: enhancement
  Priority: medium
 Component: DRM/Radeon
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: pcpa at mandriva.com.br


The problem described in https://bugs.freedesktop.org/show_bug.cgi?id=28002
also happens with the ati driver. It is just less visible there (the car
position jumps less), but with LIBGL_ALWAYS_INDIRECT it also becomes a lot
smoother.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[PATCH] drm/i915: Record error batch buffers using iomem

2010-05-11 Thread Chris Wilson
On Wed, 12 May 2010 01:08:23 +0530, Jaswinder Singh Rajput  wrote:
> Hello Chris and Andrew,
> 
> I did further testing and noticed that this patch fixes the boot
> errors and warnings and I get the XWindows.
> 
> But XWindows freezes after some time.

The BUG you were hitting before is on the error collection path which
presumably is still being triggered during boot by a GPU error.
Can you check to see if /sys/kernel/debug/dri/0/i915_error_state has
recorded anything? And if not, wait until it freezes and then please file
a bug report at bugs.freedesktop.org with the i915_error_state, Xorg.0.log
and dmesg.
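
Checking that file needs nothing more than reading it, typically as root with
debugfs mounted. A small C sketch follows, assuming only the path quoted
above. dump_file() is a hypothetical helper name, and what the file contains
when no error has been captured depends on the kernel version.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy the file at path to out. Returns the number of bytes copied,
 * or -1 if the file cannot be opened or a read/write error occurs.
 */
static long dump_file(const char *path, FILE *out)
{
	char buf[4096];
	size_t n;
	long total = 0;
	FILE *in = fopen(path, "r");

	if (in == NULL)
		return -1;

	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			fclose(in);
			return -1;
		}
		total += (long)n;
	}
	if (ferror(in)) {
		fclose(in);
		return -1;
	}
	fclose(in);
	return total;
}
```

Calling dump_file("/sys/kernel/debug/dri/0/i915_error_state", stdout) and
redirecting the output to a file gives something that can be attached to the
bug report alongside Xorg.0.log and dmesg.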

-- 
Chris Wilson, Intel Open Source Technology Centre


DRM Error on Acer Aspire One

2010-05-11 Thread Chris Wilson
On Tue, 11 May 2010 12:10:01 -0700, Andrew Morton  wrote:
> On Tue, 11 May 2010 19:52:31 +0100
> Chris Wilson  wrote:
> 
> > On Tue, 11 May 2010 11:35:55 -0400, Andrew Morton wrote:
> > > No, io_mapping_map_atomic_wc() cannot be used from [soft]irq context:
> > > it hardwires use of KM_USER0.  I suggest that io_mapping_create_wc(),
> > > io_mapping_map_atomic_wc() etc be changed so that the caller passes in the
> > > KM_foo kmap slot index.
> > 
> > Argh, sorry for the noise, read the mail in the wrong order. Thanks for
> > the review. It would be sensible to go with your simpler patch whilst
> > io_mapping_map_atomic_wc() is improved.
> 
> OK.  I'll be sending a bunch of fixes Linuswards in an hour or two.  
> Should I include this?

Yes.

Acked-by: Chris Wilson 

-- 
Chris Wilson, Intel Open Source Technology Centre

