Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Dave Chinner
ch event, but that doesn't give
you stall times. Local testing tells me that about 40% of the
switches are from xfs_alloc_vextent, 55% are from the work threads,
and the rest are CPU idling events, which is exactly as I'd expect.

> > A perf top profile comparison might be informative,
> > too...
> > 
> 
> I'm not sure if this is what you really wanted. I thought an oprofile or
> perf report would have made more sense but I recorded perf top over time
> anyway and it's at the end of the mail.

perf report and oprofile give you CPU usage aggregated across the whole
run; they aren't instantaneous, and the instantaneous behaviour is where
all the interesting information is. e.g. a 5% sample in a 20s profile
might be 5% per second for 20s, or it might be 100% for 1s - whole-run
profiles can't give you insight into that.

As it is, the output you posted is nothing unusual.

> For just these XFS tests I've uploaded a tarball of the logs to
> http://www.csn.ul.ie/~mel/postings/xfsbisect-20120703/xfsbisect-logs.tar.gz

Ok, so the main thing I missed when first looking at this is that
you are concerned about single thread regressions. Well, I can't
reproduce your results here. Single threaded, with or without the
workqueue based allocation, I get roughly 20k +/- 0.5k files/s on
a single disk, a 12 disk RAID0 array and a RAM disk on an 8p/4GB RAM
machine.  That's the same result I've been seeing since I wrote
this patch almost 12 months ago.

So, given that this is a metadata intensive workload, the only
extent allocation is going to be through inode and directory block
allocation. These paths do not consume much stack, so we can easily
tell the allocator not to switch these allocations to the workqueue.

The patch below does this. It completely removes all the allocation
based context switches from the no-data fsmark workloads being used
for this testing. It makes no noticeable difference to performance
here, so I'm interested in whether it solves the regression you are
seeing on your machines.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


xfs: don't defer metadata allocation to the workqueue

From: Dave Chinner 

Almost all metadata allocations come from shallow stack usage
situations. Avoid the overhead of switching the allocation to a
workqueue as we are not in danger of running out of stack when
making these allocations. Metadata allocations are already marked
through the args that are passed down, so this is trivial to do.

Signed-off-by: Dave Chinner 
---
 fs/xfs/xfs_alloc.c |   15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index f654f51..4f33c32 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -2434,13 +2434,22 @@ xfs_alloc_vextent_worker(
current_restore_flags_nested(&pflags, PF_FSTRANS);
 }
 
-
-int/* error */
+/*
+ * Data allocation requests often come in with little stack to work on. Push
+ * them off to a worker thread so there is lots of stack to use. Metadata
+ * requests, OTOH, are generally from low stack usage paths, so avoid the
+ * context switch overhead here.
+ */
+int
 xfs_alloc_vextent(
-   xfs_alloc_arg_t *args)  /* allocation argument structure */
+   struct xfs_alloc_arg    *args)
 {
DECLARE_COMPLETION_ONSTACK(done);
 
+   if (!args->userdata)
+   return __xfs_alloc_vextent(args);
+
+
args->done = &done;
INIT_WORK_ONSTACK(&args->work, xfs_alloc_vextent_worker);
queue_work(xfs_alloc_wq, &args->work);
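The kernel pattern the patch short-circuits — hand the allocation to a workqueue and wait for its completion — can be modelled in userspace with pthreads. This is only an illustrative sketch of the dispatch decision, not XFS code; all the names (`alloc_args`, `do_alloc`, `alloc_vextent`) are made up:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace model: defer deep-stack (data) allocations to a worker
 * thread with its own fresh stack, but run shallow metadata
 * allocations inline.  Illustrative names only, not XFS code. */
struct alloc_args {
	bool userdata;		/* mirrors args->userdata in the patch */
	int result;
};

static int do_alloc(struct alloc_args *args)
{
	(void)args;
	return 42;		/* stand-in for __xfs_alloc_vextent() */
}

static void *alloc_worker(void *p)
{
	struct alloc_args *args = p;

	args->result = do_alloc(args);	/* worker runs on its own stack */
	return NULL;
}

static int alloc_vextent(struct alloc_args *args)
{
	pthread_t worker;

	if (!args->userdata)	/* metadata: shallow stack, run inline */
		return do_alloc(args);

	/* data: hand off and wait, analogous to queue_work() followed
	 * by wait_for_completion() in the kernel */
	pthread_create(&worker, NULL, alloc_worker, args);
	pthread_join(worker, NULL);
	return args->result;
}
```

Either path returns the same result; the only difference is whose stack the allocation runs on, which is exactly why skipping the handoff for metadata is safe.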
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: Is drm-radeon-testing broken for anyone else?

2012-07-03 Thread David Witbrodt


> Oh I wasn't aware anyone still used it, I just pushed a branch for
> Jerome the other day to test something, it's known broken.

Oh, boy... sorry for the noise, then.  This was caused by
the way I am using git: I cloned the Linus Torvalds tree,
and then used 'git remote add' to include several trees, like
Ingo's "tip", Greg's stable kernels, and (of course)
"drm-airlied".

When you added that patch to d-r-testing, I saw some
output when I ran 'git remote update', and I assumed
something interesting might be happening there. I
cherry-pick commits from
drm-fixes and drm-next into my own branch off of the
latest GregKH stable kernel; this prevents me from
facing consequences in other areas due to merge window
insanity, but gives me early warning if radeon-related
stuff is going to cause me problems in the future.


Sorry about this, Dave!
Dave Witbrodt


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #12 from Jean Delvare  2012-07-03 21:40:52 ---
Patch from comment #11 didn't work at all: not only did it not fix the original
issue, it even caused additional trouble (gdm wouldn't even show up).

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


Is drm-radeon-testing broken for anyone else?

2012-07-03 Thread Dave Airlie
On Tue, Jul 3, 2012 at 6:02 PM, David Witbrodt wrote:
> Maybe I'm doing something wrong, but my cloned repo of
> drm-radeon-testing is giving build errors.  What I'm
> seeing is

Oh I wasn't aware anyone still used it, I just pushed a branch for
Jerome the other day to test something, its known broken.

Dave.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121


Alex Deucher  changed:

            What    |Removed |Added

  Attachment #74711 |0       |1
        is obsolete |        |




--- Comment #11 from Alex Deucher  2012-07-03 19:03:52 ---
Created an attachment (id=74771)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74771)
possible fix

Another possible fix, but I don't think it will help as it touches things never
previously touched.  I don't think the issue is the USER registers, but it's
worth a shot I suppose.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #10 from Jean Delvare  2012-07-03 18:56:15 ---
Patch from comment #7 did not work either. Then I followed the instructions
from comment #8, but it also did not help.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #9 from Jean Delvare  2012-07-03 18:08:29 ---
Patch from comment #6 doesn't work, testing patch from comment #7 now.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #8 from Alex Deucher  2012-07-03 18:00:24 ---
Does booting up a clean kernel without any patches applied or reverted work if
you manually set the following registers to their "patch reverted" values using
radeonreg? Just to be sure, write all of them even if the values are the same. 
Do this without X running.

0x98F4
0x3F88
0x9B7C
0x8950
0x98FC
0x8954

e.g.,
radeonreg regset 0x8950 0xfffcf001

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #7 from Alex Deucher  2012-07-03 17:31:40 ---
Created an attachment (id=74711)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74711)
possible fix

or this variant.  Although AFAIK, programming the USER register variants
shouldn't be necessary as the default values (0) are valid.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121


Jérôme Glisse changed:

            What    |Removed |Added

  Attachment #74671 |0       |1
        is obsolete |        |




--- Comment #6 from Jérôme Glisse  2012-07-03 17:09:41 ---
Created an attachment (id=74701)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74701)
properly disable render backend

This one ?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Christian König
On 03.07.2012 16:09, Jerome Glisse wrote:
> On Tue, Jul 3, 2012 at 5:26 AM, Christian König wrote:
>> [SNIP]
>> Yeah, but I thought of fixing those oopses as a second step. I see the
>> fact that suspend/resume is unpinning all the ttm memory and then pinning it
>> again as a bug that needs to be fixed. Or as an alternative we could split
>> the suspend/resume calls into suspend/disable and resume/enable calls, where
>> we only call disable/enable in the gpu_reset path rather than a complete
>> suspend/resume (not really sure about that).
> Fixing oopses is not a second step, it is the first step. I am not saying
> that the suspend/resume as it happens right now is a good thing, I am
> saying it is what it is and we have to deal with it until we do
> something better. There is no excuse not to fix an oops with a simple
> 16-line patch.
Completely agree.

>> Also a GPU reset isn't such a rare event, currently it just occurs when
>> userspace is doing something bad, for example submitting an invalid shader
>> or stuff like that. But with VM and IO protection coming into the driver we
>> are going to need a GPU reset when there is a protection fault, and with
>> that it is really desirable to just reset the hardware in a way where we can
>> say: this IB was faulty, skip over it and resume with whatever is after it on
>> the ring.
> There is a mechanism to handle that properly from the irq handler that AMD
> needs to release; the pagefault handling could be transparent and should
> only need a few lines in the irq handler (I think I did a patch for that
> and sent it to AMD for review, but I am wondering if I wasn't lacking
> some doc).
I also searched the AMD internal docs for a good description of how the
lightweight recovery is supposed to work, but haven't found anything
clear so far. My expectation was that there should be something like an
"abort current IB" command you can issue by writing a register, but
that doesn't seem to be the case.

>> And todo that we need to keep the auxiliary data like sub allocator memory,
>> blitting shader bo, and especially vm page tables at the same place in
>> hardware memory.
> I agree that we need a lightweight reset, but we need to keep the heavy
> reset as it is right now; if you want to do a lightweight reset, do it
> as a new function. I always intended to have two reset paths (hence the
> gpu "soft reset" name vs. what is called hard reset but not released); I
> even made patches for that a long time ago but never got them cleared
> through AMD review.
My idea was to pass in some extra information, so asic_reset more
clearly knows what to do. An explicit distinction between a soft and a
hard reset also seems like a possible solution, but sounds like a bit of
code duplication.

>>> I stress it: we need at the very least a mutex to protect gpu reset, and I
>>> will stand on that position because there is no other way around it.
>>> Using a rw lock has the bonus of allowing proper handling of gpu reset
>>> failure, and that is what the patch I sent to Linus' fixes tree is about; so
>>> to make drm-next merge properly while preserving proper behaviour on
>>> gpu reset failure, the rw semaphore is the best option.
>> Oh well, I'm not arguing that we don't need a lock here. I'm just
>> questioning putting it at the ioctl level (e.g. the driver entry from
>> userspace); that wasn't such a good idea with the cs_mutex and doesn't seem
>> like a good idea now. Instead we should place the locking between the
>> ioctl/ttm and the actual hardware submission, and that brings it pretty close
>> (if not identical) to the ring mutex.
>>
>> Cheers,
>> Christian.
> No, locking at the ioctl level makes sense. Please show me figures that
> it hurts performance; I did a quick sysprof and I couldn't see the locks
> impacting anything. No matter how much you hate this, it is the best
> solution: it avoids each ioctl doing useless things in case of gpu
> lockup, and it touches a lot less code than moving a lock down the call
> path would. So again, this is the best solution for the heavy reset,
> and I am not saying that a soft reset would need to take this lock or
> that we can't improve the way it's done. All I am saying is that the
> ring lock is the wrong solution for the heavy reset; it should be ok
> for the lightweight reset.
>
I don't have any performance concerns; it just doesn't seem to be the
right place to me. So are you really sure that the
ttm_bo_delayed_workqueue, pageflips or callbacks to radeon_bo_move
can't hit us here? IIRC that was always the big concern with the
cs_mutex not being held in all cases.

Anyway, if you think it will work and it fixes the crash problem at hand
then I'm ok with committing it.

Christian.



[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #5 from Jean Delvare  2012-07-03 16:39:19 ---
With this patch applied, I get:

0x98F4    0x00000001 (1)
0x3F88    0x00000001 (1)
0x9B7C    0x00fe0000 (16646144)
0x8950    0xfffcf001 (-200703)
0x98FC    0x00000000 (0)
0x8954    0x00000000 (0)

So the value of register 0x9B7C is correct now, but this was not sufficient.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #4 from Jean Delvare  2012-07-03 16:21:27 ---
I tested the patch in comment #3 but unfortunately it doesn't solve the
problem.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[MMTests] IO metadata on XFS

2012-07-03 Thread Daniel Vetter
On Tue, Jul 03, 2012 at 02:04:14PM +0100, Mel Gorman wrote:
> On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > 
> > >
> > It was obvious very quickly that there were two distinct regressions, so I
> > ran two bisections. One led to an XFS patch and the other to an i915 patch
> > that enables RC6 to reduce power usage.
> > 
> > [c999a223: xfs: introduce an allocation workqueue]
> > [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> > 
> > gdm was running on the machine so i915 would have been in use. 
> 
> Bah, more PEBKAC. gdm was *not* running on this machine. i915 is loaded
> but X is not.

See my little explanation of rc6: just loading the driver will have
effects. But I'm happy to know that the issue also happens without using
it; that makes it really unlikely to be an issue with the gpu or i915.ko ;-)
-Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #3 from Jérôme Glisse  2012-07-03 15:42:09 ---
Created an attachment (id=74671)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74671)
 properly disable render backend

Does this patch fix it ?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #2 from Jean Delvare  2012-07-03 15:27:00 ---
With 3.5-rc5 kernel (failing) :

0x98F4    0x00000001 (1)
0x3F88    0x00000001 (1)
0x9B7C    0x00000000 (0)
0x8950    0xfffcf001 (-200703)
0x98FC    0x00000000 (0)

With commit 416a2bd2 reverted (working) :

0x98F4    0x00000001 (1)
0x3F88    0x00000001 (1)
0x9B7C    0x00fe0000 (16646144)
0x8950    0xfffcf001 (-200703)
0x98FC    0x00000000 (0)

So the value of register GC_USER_RB_BACKEND_DISABLE (0x9B7C) differs.
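As an aside, the decimal column in these radeonreg dumps is just the 32-bit register contents reinterpreted as a signed integer, which is easy to sanity-check with a standalone snippet (nothing driver-specific here):

```c
#include <assert.h>
#include <stdint.h>

/* The decimal value radeonreg prints alongside each register is the
 * 32-bit contents read back as a signed (two's-complement) integer;
 * e.g. 16646144 is 0x00fe0000, and 0xfffcf001 prints as -200703. */
static int32_t reg_as_signed(uint32_t v)
{
	return (int32_t)v;
}
```

So the large negative number for 0x8950 is not an error in the dump, just sign reinterpretation of a bitmask with the high bits set.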

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-dae...@bugzilla.kernel.org
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #1 from Alex Deucher  2012-07-03 14:53:04 ---
Can you dump the following registers using radeonreg or avivotool
(http://cgit.freedesktop.org/~airlied/radeontool/) with the patch applied and
reverted and attach both results?

CC_RB_BACKEND_DISABLE (0x98F4)
CC_SYS_RB_BACKEND_DISABLE (0x3F88)
GC_USER_RB_BACKEND_DISABLE (0x9B7C)
CC_GC_SHADER_PIPE_CONFIG (0x8950)
GB_BACKEND_MAP (0x98FC)

(as root):
radeonreg regmatch 0x98F4
etc.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.


[MMTests] IO metadata on XFS

2012-07-03 Thread Daniel Vetter
On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> On Tue, Jul 03, 2012 at 10:19:28AM +1000, Dave Chinner wrote:
> > On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > Adding dri-devel and a few others because an i915 patch contributed to
> > > the regression.
> > > 
> > > On Mon, Jul 02, 2012 at 03:32:15PM +0100, Mel Gorman wrote:
> > > > On Mon, Jul 02, 2012 at 02:32:26AM -0400, Christoph Hellwig wrote:
> > > > > > It increases the CPU overhead (dirty_inode can be called up to 4
> > > > > > times per write(2) call, IIRC), so with limited numbers of
> > > > > > threads/limited CPU power it will result in lower performance. Where
> > > > > > you have lots of CPU power, there will be little difference in
> > > > > > performance...
> > > > > 
> > > > > When I checked it it could only be called twice, and we'd already
> > > > > optimize away the second call.  I'd defintively like to track down 
> > > > > where
> > > > > the performance changes happend, at least to a major version but even
> > > > > better to a -rc or git commit.
> > > > > 
> > > > 
> > > > By all means feel free to run the test yourself and run the bisection :)
> > > > 
> > > > It's rare but on this occasion the test machine is idle so I started an
> > > > automated git bisection. As you know the milage with an automated bisect
> > > > varies so it may or may not find the right commit. Test machine is 
> > > > sandy so
> > > > http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-xfs/sandy/comparison.html
> > > > is the report of interest. The script is doing a full search between 
> > > > v3.3 and
> > > > v3.4 for a point where average files/sec for fsmark-single drops below 
> > > > 25000.
> > > > I did not limit the search to fs/xfs on the off-chance that it is an
> > > > apparently unrelated patch that caused the problem.
> > > > 
> > > 
> > > It was obvious very quickly that there were two distinct regressions, so I
> > > ran two bisections. One led to an XFS patch and the other to an i915 patch
> > > that enables RC6 to reduce power usage.
> > > 
> > > [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> > 
> > Doesn't seem to be the major cause of the regression. By itself, it
> > has impact, but the majority comes from the XFS change...
> > 
> 
> The fact it has an impact at all is weird, but let's see what the DRI
> folks think about it.

Well, presuming I understand things correctly, the cpu die only goes into
its lowest sleep state (which iirc switches off l3 caches and
interconnects) when both the cpu and gpu are in their lowest sleep states.
rc6 is that deep-sleep state for the gpu, so without it enabled your
system won't reach these deep-sleep states.

I guess the slight changes in wakeup latency, power consumption (rc6 cuts
about 10W on an idle desktop snb, with a resulting big effect on what turbo
boost can sustain for short amounts of time) and all the follow-on effects
are enough to massively change timing-critical things.

So this having an effect isn't too weird.

Obviously, if you also have X running while doing these tests, there's a
chance that the gpu dies because of an issue when waking up from rc6
(we've known a few of these), but if no drm client is up, that shouldn't
be possible. So please retest without X running if that hasn't been done
already.

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48


[PATCH] drm/radeon: fix fence related segfault in CS

2012-07-03 Thread Christian König
Don't return success if scheduling the IB fails, otherwise
we end up with an oops in ttm_eu_fence_buffer_objects.

Signed-off-by: Christian König 
Cc: stable at vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon_cs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 142f894..17238f4 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -377,7 +377,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
}
-   return 0;
+   return r;
 }

 static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
-- 
1.7.9.5
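The one-liner fixes a classic "log the error, then return success anyway" bug. A minimal model of the before/after behaviour, with stand-in names and a stand-in error code (not the actual radeon functions):

```c
#include <assert.h>

/* Stand-in for radeon_ib_schedule(); the real call can fail, for
 * example when the ring is unusable.  -22 plays the role of -EINVAL. */
static int schedule_ib(int should_fail)
{
	return should_fail ? -22 : 0;
}

/* Before the patch: the failure was logged but 0 (success) was
 * returned, so the caller went on to fence buffer objects that were
 * never scheduled and oopsed in ttm_eu_fence_buffer_objects. */
static int cs_ib_chunk_buggy(int should_fail)
{
	int r = schedule_ib(should_fail);

	if (r) {
		/* DRM_ERROR("Failed to schedule IB !\n"); */
	}
	return 0;	/* bug: error swallowed */
}

/* After the patch: the error propagates to the caller. */
static int cs_ib_chunk_fixed(int should_fail)
{
	int r = schedule_ib(should_fail);

	if (r) {
		/* DRM_ERROR("Failed to schedule IB !\n"); */
	}
	return r;
}
```

With the fix, a scheduling failure reaches the CS ioctl's error path, which unwinds instead of fencing unscheduled buffers.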



[MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
On Tue, Jul 03, 2012 at 02:31:19PM +0200, Daniel Vetter wrote:
> On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> > On Tue, Jul 03, 2012 at 10:19:28AM +1000, Dave Chinner wrote:
> > > On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > > Adding dri-devel and a few others because an i915 patch contributed to
> > > > the regression.
> > > > 
> > > > On Mon, Jul 02, 2012 at 03:32:15PM +0100, Mel Gorman wrote:
> > > > > On Mon, Jul 02, 2012 at 02:32:26AM -0400, Christoph Hellwig wrote:
> > > > > > > It increases the CPU overhead (dirty_inode can be called up to 4
> > > > > > > times per write(2) call, IIRC), so with limited numbers of
> > > > > > > threads/limited CPU power it will result in lower performance. 
> > > > > > > Where
> > > > > > > you have lots of CPU power, there will be little difference in
> > > > > > > performance...
> > > > > > 
> > > > > > When I checked it it could only be called twice, and we'd already
> > > > > > optimize away the second call.  I'd defintively like to track down 
> > > > > > where
> > > > > > the performance changes happend, at least to a major version but 
> > > > > > even
> > > > > > better to a -rc or git commit.
> > > > > > 
> > > > > 
> > > > > By all means feel free to run the test yourself and run the bisection 
> > > > > :)
> > > > > 
> > > > > It's rare but on this occasion the test machine is idle so I started 
> > > > > an
> > > > > automated git bisection. As you know the milage with an automated 
> > > > > bisect
> > > > > varies so it may or may not find the right commit. Test machine is 
> > > > > sandy so
> > > > > http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-xfs/sandy/comparison.html
> > > > > is the report of interest. The script is doing a full search between 
> > > > > v3.3 and
> > > > > v3.4 for a point where average files/sec for fsmark-single drops 
> > > > > below 25000.
> > > > > I did not limit the search to fs/xfs on the off-chance that it is an
> > > > > apparently unrelated patch that caused the problem.
> > > > > 
> > > > 
> > > > It was obvious very quickly that there were two distinct regressions, so I
> > > > ran two bisections. One led to an XFS patch and the other to an i915 patch
> > > > that enables RC6 to reduce power usage.
> > > > 
> > > > [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> > > 
> > > Doesn't seem to be the major cause of the regression. By itself, it
> > > has impact, but the majority comes from the XFS change...
> > > 
> > 
> > The fact it has an impact at all is weird, but let's see what the DRI
> > folks think about it.
> 
> Well, presuming I understand things correctly the cpu die only goes into
> the lowest sleep state (which iirc switches off l3 caches and
> interconnects) when both the cpu and gpu are in the lowest sleep state.

I made a mistake in my previous mail: gdm and X were *not* running.
Once the screen blanked I would guess the GPU was in a low sleep state
the majority of the time.

> rc6 is that deep-sleep state for the gpu, so without that enabled your
> system won't go into these deep-sleep states.
> 
> I guess the slight changes in wakeup latency, power consumption (cuts
> about 10W on an idle desktop snb with resulting big effect on what turbo
> boost can sustain for short amounts of time) and all the follow-on effects
> are good enough to massively change timing-critical things.
> 

Maybe. How aggressively is the lowest sleep state entered and how long
does it take to exit?

> So this having an effect isn't too weird.
> 
> Obviously, if you also have X running while doing these tests there's the
> chance that the gpu dies because of an issue when waking up from rc6
> (we've known a few of these), but if no drm client is up, that shouldn't
> be possible. So please retest without X running if that hasn't been done
> already.
> 

Again, sorry for the confusion but the posted results are without X running.

-- 
Mel Gorman
SUSE Labs


[MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > 
> >
> It was obvious very quickly that there were two distinct regressions, so I
> ran two bisections. One led to an XFS patch and the other to an i915 patch
> that enables RC6 to reduce power usage.
> 
> [c999a223: xfs: introduce an allocation workqueue]
> [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> 
> gdm was running on the machine so i915 would have been in use. 

Bah, more PEBKAC. gdm was *not* running on this machine. i915 is loaded
but X is not.

-- 
Mel Gorman
SUSE Labs


[MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> > Can you run latencytop to see
> > if there is excessive starvation/wait times for allocation
> > completion?
> 
> I'm not sure what format you are looking for.  latencytop is shit for
> capturing information throughout a test and it does not easily allow you to
> record a snapshot of a test. You can record all the console output of course
> but that's a complete mess. I tried capturing /proc/latency_stats over time
> instead because that can be trivially sorted on a system-wide basis but
> as I write this I find that latency_stats was bust. It was just spitting out
> 
> Latency Top version : v0.1
> 
> and nothing else.  Either latency_stats is broken or my config is. Not sure
> which it is right now and won't get enough time on this today to pinpoint it.
> 

PEBKAC. The script that monitored /proc/latency_stats was not enabling
latencytop via /proc/sys/kernel/latencytop.

-- 
Mel Gorman
SUSE Labs


[PATCH libdrm] intel: Fix build failure in test_decode.c

2012-07-03 Thread Lauri Kasanen
On Mon, 2 Jul 2012 14:54:58 -0700
Ben Widawsky  wrote:

> > +#define _GNU_SOURCE
> > +
> >  #include 
> >  #include 
> >  #include 
> 
> I can't reproduce this. Can anyone else confirm this is broken, and if
> so that the above patch fixes it?

See the manpage; it depends on the glibc version. Without this, libdrm is
broken on glibc < 2.10.

- Lauri
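The glibc < 2.10 dependency is about feature-test macros: before 2.10, several functions that are now POSIX were only declared under `_GNU_SOURCE`, which is why the macro must appear before the first include, exactly as the patch does. A hedged sketch of the issue — `open_memstream` is a plausible culprit, but the quoted patch doesn't show which function actually breaks:

```c
#define _GNU_SOURCE	/* must come before the first #include */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* open_memstream() is POSIX.1-2008; glibc >= 2.10 declares it by
 * default, but older glibc only declares it when _GNU_SOURCE is
 * defined -- and defined before any header is included.  (Whether
 * this is the exact function test_decode.c needs is an assumption.) */
static size_t capture(char **buf)
{
	size_t len = 0;
	FILE *f = open_memstream(buf, &len);

	if (!f)
		return 0;
	fputs("decoded", f);
	fclose(f);		/* closing flushes *buf and updates len */
	return len;
}
```

Without the macro, glibc < 2.10 leaves the function undeclared and the build fails on an implicit declaration.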


[Bug 51658] r200 (& possibly radeon) DRI fixes for gnome shell on Mesa 8.0.3

2012-07-03 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=51658

--- Comment #9 from Roland Scheidegger  2012-07-03 05:20:47 PDT ---
(In reply to comment #8)
> A summary of all of the patches follows.
> 
> r200 essential patches (1st 3 patches) for gnome shell:
> "Essential patch to disable texture formats that are reported unrenderable
> elsewhere in driver."
I'm not really happy with that. I guess the problem is that if we try to
attach a texture to a fbo, it is too late to notice we cannot render to that
format. But this sort of sucks for ordinary textures. Ideally we'd only
really determine our format at the first upload of data to the texture
(which presumably won't happen), so we could change it if we attach it to a
fbo. Though if there is indeed already data uploaded, we're screwed.

> "Essential patch to allow 2048 pixel blits."
This one looks good as far as I can tell. The programmed values are
width/height -1 so this clearly should work.

> "Essential patch to reapply dirtied texenv registers."
I don't understand this. If the state isn't used anyway, why re-upload it,
dirty or not? I think this is just hiding the root cause of another bug.

> 
> suggested minor fixes patch (4th patch):
> "Unessential fixes/enhancements patch."
Looks good.

> 
> proposed r200 blit patch (5th patch) - unknown importance:
> "Proposed blit register dirtying patch."
Makes sense to me.

> 
> untested but probably essential r100/radeon patch (7th patch) for gnome shell:
> "Untested but probably essential patch to allow 2048 pixel blits on r100."
Yes that should be same as for r200.

> 
> optional unrecommended patches to supplement blit patches already mentioned
> above (6th & 8th patches):
> "Optional patch to warn (once) when a blit with 2048 pixel dimension occurs on
> r200."
> "Optional untested patch to warn (once) when a blit with 2048 pixel dimension
> occurs on r100."
I don't think this is necessary - at least not there. Texture src width/height
of 2048 should clearly work. I am however not so sure about the destination:
width/height are fed into RADEON_RE_WIDTH_HEIGHT, and those fields have just 11
bits, which isn't enough for 2048. Furthermore, the docs say the values are
inclusive. Not sure what's up with that; the ddx also just uses width/height and
not width/height -1 but clamps them to 2047 in some places. The dri driver
though otherwise seems to use x2/y2 (i.e. width -1, height -1). So I think this
should be fixed everywhere to really use width/height -1.

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121


Alex Deucher  changed:

   What                         |Removed |Added
   Attachment #74711 is obsolete|0       |1

--- Comment #11 from Alex Deucher   2012-07-03 19:03:52 
---
Created an attachment (id=74771)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74771)
possible fix

Another possible fix, but I don't think it will help as it touches things never
previously touched.  I don't think the issue is the USER registers, but it's
worth a shot I suppose.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
[vmstat-style output trimmed: ~2-second-interval samples over the roughly
56-second run. While the workload runs the machine sits around 1% user /
10% system / 89% idle with 2-3k interrupts and roughly 1k context switches
per sample; once the workload completes the samples drop to fully idle.]

Vanilla average context switch rate 4278.53
Revert average context switch rate  1095

> Can you run latencytop to see
> if there is excessive starvation/wait times for allocation
> completion?

I'm not sure what format you are looking for.  latencytop is shit for
capturing information throughout a test and it does not easily allow you to
record a snapshot of a test. You can record all the console output of course
but that's a complete mess. I tried capturing /proc/latency_stats over time
instead because that can be trivially sorted on a system-wide basis but
as I write this I find that latency_stats was bust. It was just spitting out

Latency Top version : v0.1

and nothing else.  Either latency_stats is broken or my config is. Not sure
which it is right now and won't get enough time on this today to pinpoint it.
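One workaround for the capture problem is to snapshot the stats file on an interval alongside the benchmark and compute per-interval deltas afterwards. A rough sketch (paths, interval, and sample count are illustrative):

```shell
# Periodically snapshot a stats file (e.g. /proc/latency_stats) with
# timestamps so the samples can be diffed after the run.
capture_stats() {
    stats="$1" out="$2" count="$3" interval="$4"
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '=== %s ===\n' "$(date +%s)" >> "$out"
        cat "$stats" >> "$out" 2>/dev/null
        i=$((i + 1))
        sleep "$interval"
    done
}

# e.g. run alongside the benchmark:
# capture_stats /proc/latency_stats latency-samples.log 30 2
```

This obviously depends on latency_stats actually producing output, which is the problem being reported above.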

> A pert top profile comparison might be informative,
> too...
> 

I'm not sure if this is what you really wanted. I thought an oprofile or
perf report would have made more sense but I recorded perf top over time
anyway and it's at the end of the mail.  The timestamp information is poor
because the perf top information was buffered so it would receive a bunch
of updates at once. Each sample should be roughly 2 seconds apart. This
buffering can be dealt with, I just failed to do it in advance and I do
not think it's necessary to rerun the tests for it.

> (*) The stack usage below submit_bio() can be more than 5k (DM, MD,
> SCSI, driver, memory allocation), so it's really not safe to do
> allocation anywhere below about 3k of kernel stack being used. e.g.
> on a relatively trivial storage setup without the above commit:
> 
> [142296.384921] flush-253:4 used greatest stack depth: 360 bytes left
> 
> Fundamentally, 8k stacks on x86-64 are too small for our
> increasingly complex storage layers and the 100+ function deep call
> chains that occur.
> 

I understand the patch's motivation. For these tests I'm being deliberately
a bit of a dummy and just capturing information. This might allow me to
actually get through all the results and identify some of the problems
and spread them around a bit. Either that or I need to clone myself a few
times to tackle each of the problems in a reasonable timeframe :)

For just these XFS tests I've uploaded a tarball of the logs to
http://www.csn.ul.ie/~mel/postings/xfsbisect-20120703/xfsbisect-logs.tar.gz

[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #10 from Jean Delvare   2012-07-03 18:56:15 ---
Patch from comment #7 did not work either. Then I followed the instructions
from comment #8, but it also did not help.



[git pull] drm fixes

2012-07-03 Thread Dave Airlie

Hi Linus,

one regression fix, two radeon fixes (one for an oops), and an i915 fix
to unload framebuffers earlier.

We originally were going to leave the i915 fix until -next, but grub2 in 
some situations causes vesafb/efifb to be loaded now, and this causes big 
slowdowns, and I have reports in rawhide I'd like to have fixed.

Dave.

The following changes since commit 9acc7bde23ebb19a704395f76490685e1513e422:

  Merge tag 'hwmon-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging (2012-06-28 
12:38:51 -0700)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 9f846a16d213523fbe6daea17e20df6b8ac5a1e5:

  drm/i915: kick any firmware framebuffers before claiming the gtt (2012-07-03 
11:18:48 +0100)


Alex Deucher (1):
  drm/radeon: fix VM page table setup on SI

Daniel Vetter (1):
  drm/i915: kick any firmware framebuffers before claiming the gtt

Jerome Glisse (1):
  drm/radeon: fix rare segfault

Takashi Iwai (1):
  drm: edid: Don't add inferred modes with higher resolution

 drivers/gpu/drm/drm_edid.c   |   27 ++---
 drivers/gpu/drm/i915/i915_dma.c  |   37 +++---
 drivers/gpu/drm/radeon/radeon_gart.c |   13 ++--
 drivers/gpu/drm/radeon/radeon_gem.c  |   10 +
 drivers/gpu/drm/radeon/si.c  |4 ++--
 5 files changed, 73 insertions(+), 18 deletions(-)


Re: Is drm-radeon-testing broken for anyone else?

2012-07-03 Thread Dave Airlie
On Tue, Jul 3, 2012 at 6:02 PM, David Witbrodt  wrote:
> Maybe I'm doing something wrong, but my cloned repo of
> drm-radeon-testing is giving build errors.  What I'm
> seeing is

Oh I wasn't aware anyone still used it, I just pushed a branch for
Jerome the other day to test something, its known broken.

Dave.


[PATCH] drm/radeon: fix rare segfault

2012-07-03 Thread Christian König
On 02.07.2012 18:40, j.glisse at gmail.com wrote:
> From: Jerome Glisse 
>
> In the gem idle/busy ioctls the radeon object was dereferenced after
> drm_gem_object_unreference_unlocked, which in case the object
> has been destroyed leads to use of a possibly freed pointer with
> possibly wrong data.
>
> Signed-off-by: Jerome Glisse 

Reviewed-by: Christian König 

> ---
>   drivers/gpu/drm/radeon/radeon_gem.c |   10 ++
>   1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
> b/drivers/gpu/drm/radeon/radeon_gem.c
> index 74176c5..c8838fc 100644
> --- a/drivers/gpu/drm/radeon/radeon_gem.c
> +++ b/drivers/gpu/drm/radeon/radeon_gem.c
> @@ -325,6 +325,7 @@ int radeon_gem_mmap_ioctl(struct drm_device *dev, void 
> *data,
>   int radeon_gem_busy_ioctl(struct drm_device *dev, void *data,
> struct drm_file *filp)
>   {
> + struct radeon_device *rdev = dev->dev_private;
>   struct drm_radeon_gem_busy *args = data;
>   struct drm_gem_object *gobj;
>   struct radeon_bo *robj;
> @@ -350,13 +351,14 @@ int radeon_gem_busy_ioctl(struct drm_device *dev, void 
> *data,
>   break;
>   }
>   drm_gem_object_unreference_unlocked(gobj);
> - r = radeon_gem_handle_lockup(robj->rdev, r);
> + r = radeon_gem_handle_lockup(rdev, r);
>   return r;
>   }
>   
>   int radeon_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
> struct drm_file *filp)
>   {
> + struct radeon_device *rdev = dev->dev_private;
>   struct drm_radeon_gem_wait_idle *args = data;
>   struct drm_gem_object *gobj;
>   struct radeon_bo *robj;
> @@ -369,10 +371,10 @@ int radeon_gem_wait_idle_ioctl(struct drm_device *dev, 
> void *data,
>   robj = gem_to_radeon_bo(gobj);
>   r = radeon_bo_wait(robj, NULL, false);
>   /* callback hw specific functions if any */
> - if (robj->rdev->asic->ioctl_wait_idle)
> - robj->rdev->asic->ioctl_wait_idle(robj->rdev, robj);
> + if (rdev->asic->ioctl_wait_idle)
> + robj->rdev->asic->ioctl_wait_idle(rdev, robj);
>   drm_gem_object_unreference_unlocked(gobj);
> - r = radeon_gem_handle_lockup(robj->rdev, r);
> + r = radeon_gem_handle_lockup(rdev, r);
>   return r;
>   }
>   




[PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Christian König
On 02.07.2012 19:27, Jerome Glisse wrote:
> On Mon, Jul 2, 2012 at 1:05 PM, Christian König  
> wrote:
>> On 02.07.2012 18:41, Jerome Glisse wrote:
>>> On Mon, Jul 2, 2012 at 12:26 PM, Christian König
>>>  wrote:
 On 02.07.2012 17:39, j.glisse at gmail.com wrote:
> From: Jerome Glisse 
>
> GPU resets need to be exclusive, one happening at a time. For this
> add a rw semaphore so that any path that triggers GPU activity
> has to take the semaphore as a reader, thus allowing concurrency.
>
> The GPU reset path takes the semaphore as a writer, ensuring that
> no concurrent resets take place.
>
> Signed-off-by: Jerome Glisse 
 NAK, that isn't as bad as the cs mutex was but still too complicated. It's
 just too far up in the call stack, e.g. it tries to catch ioctl
 operations,
 instead of catching the underlying hardware operation which is caused by
 the
 ioctl/ttm/etc...

 Why not just take the ring look as I suggested?


>>> No we can't use ring lock, we need to protect suspend/resume path and
>>> we need an exclusive lock for that so we need a reset mutex at the
>>> very least. But instead of having a reset mutex i prefer using a rw
>>> lock so that we can stop ioctl until a reset goes through an let
>>> pending ioctl take proper action. Think about multiple context trying
>>> to reset GPU ...
>>>
>>> Really this is the best option, the rw locking won't induce any lock
>>> contention except in the gpu reset case, which is anyway bad news.
>> Why? That makes no sense to me. Well I don't want to prevent lock
>> contention, but understand why we should add locking at the ioctl level.
>> That violates locking rule number one "lock data instead of code" (or in our
>> case "lock hardware access instead of code path") and it really is the
>> reason why we ended up with the cs_mutex protecting practically everything.
>>
>> Multiple context trying to reset the GPU should be pretty fine, current it
>> would just reset the GPU twice, but in the future asic_reset should be much
>> more fine grained and only reset the parts of the GPU which really needs an
>> reset.
>>
>> Cheers,
>> Christian.
> No multiple reset is not fine, try it your self and you will see all
> kind of oops (strongly advise you to sync you hd before stress testing
> that). Yes we need to protect code path because suspend/resume code
> path is special one it touch many data in the device structure. GPU
> reset is a rare event or should be a rare event.
Yeah, but I thought of fixing those oopses as a second step. I see the 
fact that suspend/resume is unpinning all the ttm memory and then 
pinning it again as a bug that needs to be fixed. Or as an alternative 
we could split the suspend/resume calls into suspend/disable and 
resume/enable calls, where we only call disable/enable in the gpu_reset 
path rather than a complete suspend/resume (not really sure about that).

Also a GPU reset isn't such a rare event, currently it just occurs when 
userspace is doing something bad, for example submitting an invalid 
shader or stuff like that. But with VM and IO protection coming into the 
driver we are going to need a GPU reset when there is a protection 
fault, and with that it is really desirable to just reset the hardware 
in a way where we can say: This IB was faulty skip over it and resume 
with whatever is after it on the ring.

And todo that we need to keep the auxiliary data like sub allocator 
memory, blitting shader bo, and especially vm page tables at the same 
place in hardware memory.

> I stress it we need at very least a mutex to protect gpu reset and i
> will stand on that position because there is no other way around.
> Using rw lock have a bonus of allowing proper handling of gpu reset
> failure and that what the patch i sent to linus fix tree is about, so
> to make drm next merge properly while preserving proper behavior in
> gpu reset failure the rw semaphore is the best option.
Oh well, I'm not arguing that we don't need a lock here. I'm just 
questioning putting it at the ioctl level (e.g. the driver entry from 
userspace); that wasn't such a good idea with the cs_mutex and doesn't 
seem like a good idea now. Instead we should place the locking between 
the ioctl/ttm and the actual hardware submission, and that brings it 
pretty close (if not identical) to the ring mutex.

Cheers,
Christian.


[PATCH] drm: edid: Don't add inferred modes with higher resolution

2012-07-03 Thread Takashi Iwai
When a monitor EDID doesn't give the preferred bit, the driver assumes
that the mode with the highest resolution and rate is the preferred
mode.  Meanwhile, the recent changes allowing more modes in the
GTF/CVT ranges actually give more modes, and some of them may exceed
the native size.  Thus such a mode could be picked as the preferred
mode although it's not the native resolution.

To avoid this problem, this patch limits the addition of inferred
modes by checking that they are not larger than other modes.  It also
checks for duplicated mode entries at the same time.

Reviewed-by: Adam Jackson 
Signed-off-by: Takashi Iwai 
---

Dave, could you apply this as a regression fix for 3.5 kernel?

 drivers/gpu/drm/drm_edid.c |   27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 5873e48..a8743c3 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1039,6 +1039,24 @@ mode_in_range(const struct drm_display_mode *mode, 
struct edid *edid,
return true;
 }

+static bool valid_inferred_mode(const struct drm_connector *connector,
+   const struct drm_display_mode *mode)
+{
+   struct drm_display_mode *m;
+   bool ok = false;
+
+   list_for_each_entry(m, &connector->probed_modes, head) {
+   if (mode->hdisplay == m->hdisplay &&
+   mode->vdisplay == m->vdisplay &&
+   drm_mode_vrefresh(mode) == drm_mode_vrefresh(m))
+   return false; /* duplicated */
+   if (mode->hdisplay <= m->hdisplay &&
+   mode->vdisplay <= m->vdisplay)
+   ok = true;
+   }
+   return ok;
+}
+
 static int
 drm_dmt_modes_for_range(struct drm_connector *connector, struct edid *edid,
struct detailed_timing *timing)
@@ -1048,7 +1066,8 @@ drm_dmt_modes_for_range(struct drm_connector *connector, 
struct edid *edid,
struct drm_device *dev = connector->dev;

for (i = 0; i < drm_num_dmt_modes; i++) {
-   if (mode_in_range(drm_dmt_modes + i, edid, timing)) {
+   if (mode_in_range(drm_dmt_modes + i, edid, timing) &&
+   valid_inferred_mode(connector, drm_dmt_modes + i)) {
newmode = drm_mode_duplicate(dev, &drm_dmt_modes[i]);
if (newmode) {
drm_mode_probed_add(connector, newmode);
@@ -1088,7 +1107,8 @@ drm_gtf_modes_for_range(struct drm_connector *connector, 
struct edid *edid,
return modes;

fixup_mode_1366x768(newmode);
-   if (!mode_in_range(newmode, edid, timing)) {
+   if (!mode_in_range(newmode, edid, timing) ||
+   !valid_inferred_mode(connector, newmode)) {
drm_mode_destroy(dev, newmode);
continue;
}
@@ -1116,7 +1136,8 @@ drm_cvt_modes_for_range(struct drm_connector *connector, 
struct edid *edid,
return modes;

fixup_mode_1366x768(newmode);
-   if (!mode_in_range(newmode, edid, timing)) {
+   if (!mode_in_range(newmode, edid, timing) ||
+   !valid_inferred_mode(connector, newmode)) {
drm_mode_destroy(dev, newmode);
continue;
}
-- 
1.7.10.4



Bogus video resolution in Linux 3.5-rc4

2012-07-03 Thread Takashi Iwai
At Mon, 02 Jul 2012 15:46:29 -0400,
Adam Jackson wrote:
> 
> On 6/26/12 3:21 AM, Takashi Iwai wrote:
> 
> > From: Takashi Iwai 
> > Subject: [PATCH] drm: edid: Don't add inferred modes with higher resolution
> >
> > When a monitor EDID doesn't give the preferred bit, the driver assumes
> > that the mode with the highest resolution and rate is the preferred
> > mode.  Meanwhile, the recent changes allowing more modes in the
> > GTF/CVT ranges actually give more modes, and some of them may exceed
> > the native size.  Thus such a mode could be picked as the preferred
> > mode although it's not the native resolution.
> >
> > To avoid this problem, this patch limits the addition of inferred
> > modes by checking that they are not larger than other modes.  It also
> > checks for duplicated mode entries at the same time.
> 
> This is a little aggressive on CRTs, but whatever, better than what's there.
> 
> Reviewed-by: Adam Jackson 

Thanks.  I'm going to resend the patch with your tag.


Takashi


[PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Jerome Glisse
On Tue, Jul 3, 2012 at 10:52 AM, Christian König
 wrote:
> On 03.07.2012 16:09, Jerome Glisse wrote:
>>
>> On Tue, Jul 3, 2012 at 5:26 AM, Christian König 
>> wrote:
>>>
>>> [SNIP]
>>>
>>> Yeah, but I thought that fixing those oops as the second step. I see the
>>> fact that suspend/resume is unpinning all the ttm memory and then pinning
>>> it
>>> again as a bug that needs to be fixed. Or as an alternative we could
>>> split
>>> the suspend/resume calls into suspend/disable and resume/enable calls,
>>> where
>>> we only call disable/enable in the gpu_reset path rather than a complete
>>> suspend/resume (not really sure about that).
>>
>> Fixing oopses is not a second step, it is the first step. I am not saying
>> that the suspend/resume as it happens right now is a good thing, I am
>> saying it's what it is and we have to deal with it until we do
>> something better. There is no excuse not to fix an oops with a simple
>> 16-line patch.
>
> Completely agree.
>
>
>>> Also a GPU reset isn't such a rare event, currently it just occurs when
>>> userspace is doing something bad, for example submitting an invalid
>>> shader
>>> or stuff like that. But with VM and IO protection coming into the driver
>>> we
>>> are going to need a GPU reset when there is a protection fault, and with
>>> that it is really desirable to just reset the hardware in a way where we
>>> can
>>> say: This IB was faulty skip over it and resume with whatever is after it
>>> on
>>> the ring.
>>
>> There is mecanism to handle that properly from irq handler that AMD
>> need to release, the pagefault thing could be transparent and should
>> only need few lines in the irq handler (i think i did a patch for that
>> and sent it to AMD for review but i am wondering if i wasn't lacking
>> some doc).
>
> I also searched the AMD internal docs for a good description of how the
> lightweight recovery is supposed to work, but haven't found anything clear
> so far. My expectation is that there should be something like an "abort
> current IB" command you can issue by writing a register, but that doesn't
> seems to be the case.

Docs are scarce on that; my understanding is that you can't skip the IB
that triggers the fault. The handling was designed with page faults
in mind, i.e. you page in the missing page and you resume the IB. That
being said, I think one can abuse that by moving the rptr of the ring
and resuming.

>>> And todo that we need to keep the auxiliary data like sub allocator
>>> memory,
>>> blitting shader bo, and especially vm page tables at the same place in
>>> hardware memory.
>>
>> I agree that we need a lightweight reset but we need to keep the heavy
>> reset as it is right now, if you want to do a light weight reset do it
>> as a new function. I always intended to have two reset path, hint gpu
>> soft reset name vs what is call hard reset but not released, i even
>> made patch for that long time ago but never got them cleared from AMD
>> review.
>
> My idea was to pass in some extra informations, so asic_reset more clearly
> knows what todo. An explicit distinction between a soft and a hard reset
> also seems like a possible solution, but sounds like a bit of code
> duplication.

I think people should stop with the mantra of "don't duplicate code". I do
agree that it's better not to duplicate code, but in some cases there is
no way around it and it's worse to try sharing the code. I think the
worst example in the radeon kernel driver is the modesetting section, where
we try to share the same code for all the different variations of DCE, but
in the end it's a mess of if/else/switch/case and what not. Code duplication
is sometimes the cleanest and best choice; might we fix a bug in one and
not the other? Yes, it might happen, but that's life ;)

>
 I stress it we need at very least a mutex to protect gpu reset and i
 will stand on that position because there is no other way around.
 Using rw lock have a bonus of allowing proper handling of gpu reset
 failure and that what the patch i sent to linus fix tree is about, so
 to make drm next merge properly while preserving proper behavior in
 gpu reset failure the rw semaphore is the best option.
>>>
>>> Oh well, I'm not arguing that we don't need a look here. I'm just
>>> questioning to put it at the ioctl level (e.g. the driver entry from
>>> userspace), that wasn't such a good idea with the cs_mutex and doesn't
>>> seems
>>> like a good idea now. Instead we should place the looking between the
>>> ioctl/ttm and the actual hardware submission and that brings it pretty
>>> close
>>> (if not identically) to the ring mutex.
>>>
>>> Cheers,
>>> Christian.
>>
>> No, locking at the ioctl level makes sense; please show me figures that
>> it hurts performance. I did a quick sysprof and I couldn't see the locks
>> impacting anything. No matter how much you hate this, this is the best
>> solution: it avoids each ioctl doing useless things in case of gpu
>> lockup and it touches a lot less code than moving a lock down the call
>

[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #9 from Jean Delvare   2012-07-03 18:08:29 ---
Patch from comment #6 doesn't work, testing patch from comment #7 now.



[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #8 from Alex Deucher   2012-07-03 18:00:24 
---
Does booting up a clean kernel without any patches applied or reverted work if
you manually set the following registers to their "patch reverted" values using
radeonreg? Just to be sure, write all of them even if the values are the same. 
Do this without X running.

0x98F4
0x3F88
0x9B7C
0x8950
0x98FC
0x8954

e.g.,
radeonreg regset 0x8950 0xfffcf001



Is drm-radeon-testing broken for anyone else?

2012-07-03 Thread David Witbrodt
Maybe I'm doing something wrong, but my cloned repo of
drm-radeon-testing is giving build errors.  What I'm
seeing is

[...]
  CC  drivers/gpu/drm/radeon/radeon_gem.o
drivers/gpu/drm/radeon/radeon_gem.c: In function ‘radeon_gem_create_ioctl’:
drivers/gpu/drm/radeon/radeon_gem.c:221:3: error: implicit declaration of 
function ‘radeon_mutex_lock’ [-Werror=implicit-function-declaration]
drivers/gpu/drm/radeon/radeon_gem.c:221:26: error: ‘struct radeon_device’ 
has no member named ‘cs_mutex’
drivers/gpu/drm/radeon/radeon_gem.c:222:3: error: implicit declaration of 
function ‘radeon_mutex_unlock’ [-Werror=implicit-function-declaration]
drivers/gpu/drm/radeon/radeon_gem.c:222:28: error: ‘struct radeon_device’ 
has no member named ‘cs_mutex’
drivers/gpu/drm/radeon/radeon_gem.c: In function 
‘radeon_gem_set_domain_ioctl’:
drivers/gpu/drm/radeon/radeon_gem.c:268:26: error: ‘struct radeon_device’ 
has no member named ‘cs_mutex’
drivers/gpu/drm/radeon/radeon_gem.c:269:28: error: ‘struct radeon_device’ 
has no member named ‘cs_mutex’
cc1: some warnings being treated as errors
make[6]: *** [drivers/gpu/drm/radeon/radeon_gem.o] Error 1

Looking around a bit, I find this commit

commit 36ff39c4045ee71cd306f8af5f8c2a1c6e998eba
Author: Christian König 
Date:   Wed May 9 10:07:08 2012 +0200

drm/radeon: replace cs_mutex with vm_mutex v3

which replaces radeon_mutex_lock()/radeon_mutex_unlock() with
mutex_lock()/mutex_unlock()
in most places, eliminating it in others.  A more recent commit

commit 0a01063d5fd7417175512f0f916629008c0a842e
Author: Jerome Glisse 
Date:   Wed Jun 27 14:14:21 2012 -0400

drm/radeon: disable any GPU activity after unrecovered lockup v5

introduces this hunk:

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index f28bd4b..74176c5 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -217,6 +217,22 @@ int radeon_gem_create_ioctl(struct drm_device *dev, 
void *data,
uint32_t handle;
int r;
 
+   /* if in middle of gpu reset wait on the mutex, if reset
+* failed don't do anything, otherwise proceed.
+*/
+   if (!rdev->accel_working) {
+   /* lockup situation wait for cs mutex to be droped */
+   radeon_mutex_lock(&rdev->cs_mutex);
+   radeon_mutex_unlock(&rdev->cs_mutex);
+   /* we are after a gpu reset */
+   if (!rdev->accel_working) {
+   /* gpu reset failed, don't create anymore GPU
+* object
+*/
+   return -EBUSY;
+   }
+   }
+

If it's not just something stupid I've done on my local
machine, then it looks like the 0a01063d patch was
applied without noticing the changes made in 36ff39c4.
(If the problem is really just me, sorry for the spam/
noise.)


Dave W.


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #7 from Alex Deucher   2012-07-03 17:31:40 
---
Created an attachment (id=74711)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74711)
possible fix

or this variant.  Although AFAIK, programming the USER register variants
shouldn't be necessary as the default values (0) are valid.



[MMTests] IO metadata on XFS

2012-07-03 Thread Eugeni Dodonov
On Tue, Jul 3, 2012 at 9:31 AM, Daniel Vetter  wrote:

> Well, presuming I understand things correctly the cpu die only goes into
> the lowest sleep state (which iirc switches off l3 caches and
> interconnects) when both the cpu and gpu are in the lowest sleep state.
> rc6 is that deep-sleep state for the gpu, so without that enabled your
> system won't go into these deep-sleep states.
>
> I guess the slight changes in wakeup latency, power consumption (cuts
> about 10W on an idle desktop snb with resulting big effect on what turbo
> boost can sustain for short amounts of time) and all the follow-on effects
> are good enough to massively change timing-critical things.
>

The sad side effect is that the software has very little control over the
RC6 entry and exit, the hardware enters and leaves RC6 state on its own
when it detects that the GPU is idle beyond a threshold. Chances are that
if you are not running any GPU workload, the GPU simply enters RC6 state
and stays there.

It is possible to observe the current state and also time spent in rc6 by
looking at the /sys/kernel/debug/dri/0/i915_drpc_info file.

One other effect of RC6 is that it also allows the CPU to go into higher turbo
modes, as it has more watts to spend while the GPU is idle; perhaps this is
what causes the issue here?

-- 
Eugeni Dodonov
<http://eugeni.dodonov.net/>


[MMTests] IO metadata on XFS

2012-07-03 Thread Dave Chinner
On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> Adding dri-devel and a few others because an i915 patch contributed to
> the regression.
> 
> On Mon, Jul 02, 2012 at 03:32:15PM +0100, Mel Gorman wrote:
> > On Mon, Jul 02, 2012 at 02:32:26AM -0400, Christoph Hellwig wrote:
> > > > It increases the CPU overhead (dirty_inode can be called up to 4
> > > > times per write(2) call, IIRC), so with limited numbers of
> > > > threads/limited CPU power it will result in lower performance. Where
> > > > you have lots of CPU power, there will be little difference in
> > > > performance...
> > > 
> > > When I checked it, it could only be called twice, and we'd already
> > > optimize away the second call.  I'd definitely like to track down where
> > > the performance changes happened, at least to a major version but even
> > > better to a -rc or git commit.
> > > 
> > 
> > By all means feel free to run the test yourself and run the bisection :)
> > 
> > It's rare but on this occasion the test machine is idle so I started an
> > automated git bisection. As you know the mileage with an automated bisect
> > varies so it may or may not find the right commit. Test machine is sandy so
> > http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-xfs/sandy/comparison.html
> > is the report of interest. The script is doing a full search between v3.3
> > and v3.4 for a point where average files/sec for fsmark-single drops
> > below 25000.
> > I did not limit the search to fs/xfs on the off-chance that it is an
> > apparently unrelated patch that caused the problem.
> > 
> 
> It was obvious very quickly that there were two distinct regressions so I
> ran two bisections. One led to an XFS patch and the other to an i915 patch
> that enables RC6 to reduce power usage.
> 
> [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]

Doesn't seem to be the major cause of the regression. By itself, it
has impact, but the majority comes from the XFS change...

> [c999a223: xfs: introduce an allocation workqueue]

Which indicates that there is workqueue scheduling issues, I think.
The same amount of work is being done, but half of it is being
pushed off into a workqueue to avoid stack overflow issues (*).  I
tested the above patch in anger on an 8p machine, similar to the
machine you saw no regressions on, but the workload didn't drive it
to being completely CPU bound (only about 90%) so the allocation
work was probably always scheduled quickly.

How many worker threads have been spawned on these machines
that are showing the regression? What is the context switch rate on
the machines when the test is running? Can you run latencytop to see
if there is excessive starvation/wait time for allocation
completion? A perf top profile comparison might be informative,
too...

(*) The stack usage below submit_bio() can be more than 5k (DM, MD,
SCSI, driver, memory allocation), so it's really not safe to do
allocation anywhere below about 3k of kernel stack being used. e.g.
on a relatively trivial storage setup without the above commit:

[142296.384921] flush-253:4 used greatest stack depth: 360 bytes left

Fundamentally, 8k stacks on x86-64 are too small for our
increasingly complex storage layers and the 100+ function deep call
chains that occur.
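
The hand-off that commit c999a223 introduces can be pictured with a
userspace analogue: the caller bundles its arguments, hands them to a
worker running on a fresh (and therefore deep) stack, and sleeps until
the worker signals completion. All names below are illustrative; the
kernel code uses queue_work() and wait_for_completion() on a dedicated
workqueue, not pthreads.

```c
#include <pthread.h>

/* Userspace sketch of the allocation hand-off pattern: package the
 * request, run the deep call chain on another stack, wake the caller
 * when done.  Hypothetical names, not the XFS code. */
struct alloc_args {
    int nblocks;                /* the request */
    int result;                 /* filled in by the worker */
    pthread_mutex_t lock;
    pthread_cond_t done_cv;
    int done;
};

static void *alloc_worker(void *p)
{
    struct alloc_args *args = p;
    /* ...the deep allocation call chain would run here, on a stack
     * that has not already burned 5k+ below submit_bio()... */
    args->result = args->nblocks;       /* pretend it succeeded */
    pthread_mutex_lock(&args->lock);
    args->done = 1;
    pthread_cond_signal(&args->done_cv);
    pthread_mutex_unlock(&args->lock);
    return NULL;
}

/* Caller side: the equivalent of queue_work() + wait_for_completion(). */
static int alloc_vextent(int nblocks)
{
    struct alloc_args args = { .nblocks = nblocks };
    pthread_t t;

    pthread_mutex_init(&args.lock, NULL);
    pthread_cond_init(&args.done_cv, NULL);
    pthread_create(&t, NULL, alloc_worker, &args);
    pthread_mutex_lock(&args.lock);
    while (!args.done)
        pthread_cond_wait(&args.done_cv, &args.lock);
    pthread_mutex_unlock(&args.lock);
    pthread_join(t, NULL);
    return args.result;
}
```

The sleep/wake pair is also where the extra context switches in the
regression come from: every allocation now costs a hand-off.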

Cheers,

Dave.

-- 
Dave Chinner
david at fromorbit.com


[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121


Jérôme Glisse  changed:

   What|Removed |Added

  Attachment #74671 is obsolete |0       |1




--- Comment #6 from Jérôme Glisse  2012-07-03 17:09:41 ---
Created an attachment (id=74701)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74701)
properly disable render backend

This one ?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Jerome Glisse
On Tue, Jul 3, 2012 at 5:26 AM, Christian König wrote:
> On 02.07.2012 19:27, Jerome Glisse wrote:
>>
>> On Mon, Jul 2, 2012 at 1:05 PM, Christian König wrote:
>>>
>>> On 02.07.2012 18:41, Jerome Glisse wrote:

 On Mon, Jul 2, 2012 at 12:26 PM, Christian König wrote:
>
> On 02.07.2012 17:39, j.glisse at gmail.com wrote:
>>
>> From: Jerome Glisse 
>>
>> GPU reset needs to be exclusive, one happening at a time. For this,
>> add a rw semaphore so that any path that triggers GPU activity
>> has to take the semaphore as a reader, thus allowing concurrency.
>>
>> The GPU reset path takes the semaphore as a writer, ensuring that
>> no concurrent resets take place.
>>
>> Signed-off-by: Jerome Glisse 
>
> NAK, that isn't as bad as the cs mutex was but still too complicated.
> It's just too far up in the call stack, e.g. it tries to catch ioctl
> operations, instead of catching the underlying hardware operation which
> is caused by the ioctl/ttm/etc...
>
> Why not just take the ring lock as I suggested?
>
>
 No, we can't use the ring lock; we need to protect the suspend/resume
 path and we need an exclusive lock for that, so we need a reset mutex
 at the very least. But instead of having a reset mutex I prefer using
 a rw lock so that we can stop ioctls until a reset goes through, and
 let pending ioctls take proper action. Think about multiple contexts
 trying to reset the GPU ...

 Really this is the best option: the rw locking won't induce any lock
 contention except in the gpu reset case, which is bad news anyway.
>>>
>>> Why? That makes no sense to me. Well, I don't want to prevent lock
>>> contention, but I'd like to understand why we should add locking at
>>> the ioctl level. That violates locking rule number one, "lock data
>>> instead of code" (or in our case "lock hardware access instead of
>>> code path"), and it really is the reason why we ended up with the
>>> cs_mutex protecting practically everything.
>>>
>>> Multiple contexts trying to reset the GPU should be pretty fine;
>>> currently it would just reset the GPU twice, but in the future
>>> asic_reset should be much more fine grained and only reset the parts
>>> of the GPU which really need a reset.
>>>
>>> Cheers,
>>> Christian.
>>
>> No, multiple resets are not fine; try it yourself and you will see all
>> kinds of oopses (I strongly advise you to sync your hd before stress
>> testing that). Yes, we need to protect the code path because the
>> suspend/resume path is a special one: it touches many data structures
>> in the device structure. GPU reset is a rare event, or should be a
>> rare event.
>
> Yeah, but I thought of fixing those oopses as the second step. I see the
> fact that suspend/resume is unpinning all the ttm memory and then pinning
> it again as a bug that needs to be fixed. Or as an alternative we could
> split the suspend/resume calls into suspend/disable and resume/enable
> calls, where we only call disable/enable in the gpu_reset path rather
> than a complete suspend/resume (not really sure about that).

Fixing oopses is not a second step, it is the first step. I am not
saying that suspend/resume as it happens right now is a good thing; I am
saying it is what it is and we have to deal with it until we do
something better. There is no excuse not to fix an oops with a simple
16-line patch.

> Also a GPU reset isn't such a rare event; currently it just occurs when
> userspace is doing something bad, for example submitting an invalid
> shader or stuff like that. But with VM and IO protection coming into the
> driver we are going to need a GPU reset when there is a protection
> fault, and with that it is really desirable to just reset the hardware
> in a way where we can say: this IB was faulty, skip over it and resume
> with whatever is after it on the ring.

There is a mechanism to handle that properly from the irq handler that
AMD needs to release; the pagefault thing could be transparent and
should only need a few lines in the irq handler (I think I did a patch
for that and sent it to AMD for review, but I am wondering if I wasn't
lacking some doc).

> And to do that we need to keep the auxiliary data like the sub allocator
> memory, the blitting shader bo, and especially the vm page tables at the
> same place in hardware memory.

I agree that we need a lightweight reset, but we need to keep the heavy
reset as it is right now; if you want to do a lightweight reset, do it
as a new function. I always intended to have two reset paths (hint: the
gpu soft reset name vs what is called hard reset but not released); I
even made patches for that a long time ago but never got them cleared
through AMD review.

>
>> I stress it: we need at the very least a mutex to protect gpu reset and
>> I will stand on that position because there is no other way around.
>> Using a rw lock has the bonus of allowing proper handling of gpu reset
>> failure, and that is what the patch I sent to the linus fix tree is
>> about.

Is drm-radeon-testing broken for anyone else?

2012-07-03 Thread David Witbrodt
Maybe I'm doing something wrong, but my cloned repo of
drm-radeon-testing is giving build errors.  What I'm
seeing is

[...]
  CC  drivers/gpu/drm/radeon/radeon_gem.o
drivers/gpu/drm/radeon/radeon_gem.c: In function 'radeon_gem_create_ioctl':
drivers/gpu/drm/radeon/radeon_gem.c:221:3: error: implicit declaration of
function 'radeon_mutex_lock' [-Werror=implicit-function-declaration]
drivers/gpu/drm/radeon/radeon_gem.c:221:26: error: 'struct radeon_device'
has no member named 'cs_mutex'
drivers/gpu/drm/radeon/radeon_gem.c:222:3: error: implicit declaration of
function 'radeon_mutex_unlock' [-Werror=implicit-function-declaration]
drivers/gpu/drm/radeon/radeon_gem.c:222:28: error: 'struct radeon_device'
has no member named 'cs_mutex'
drivers/gpu/drm/radeon/radeon_gem.c: In function 'radeon_gem_set_domain_ioctl':
drivers/gpu/drm/radeon/radeon_gem.c:268:26: error: 'struct radeon_device'
has no member named 'cs_mutex'
drivers/gpu/drm/radeon/radeon_gem.c:269:28: error: 'struct radeon_device'
has no member named 'cs_mutex'
cc1: some warnings being treated as errors
make[6]: *** [drivers/gpu/drm/radeon/radeon_gem.o] Error 1

Looking around a bit, I find this commit

commit 36ff39c4045ee71cd306f8af5f8c2a1c6e998eba
Author: Christian König
Date:   Wed May 9 10:07:08 2012 +0200

drm/radeon: replace cs_mutex with vm_mutex v3

replaces radeon_mutex_lock()/radeon_mutex_unlock() with
mutex_lock()/mutex_unlock() in most places, eliminating it in others.

A more recent commit

commit 0a01063d5fd7417175512f0f916629008c0a842e
Author: Jerome Glisse 
Date:   Wed Jun 27 14:14:21 2012 -0400

drm/radeon: disable any GPU activity after unrecovered lockup v5

introduces this hunk:

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index f28bd4b..74176c5 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -217,6 +217,22 @@ int radeon_gem_create_ioctl(struct drm_device *dev, 
void *data,
uint32_t handle;
int r;

+   /* if in middle of gpu reset wait on the mutex, if reset
+* failed don't do anything, otherwise proceed.
+*/
+   if (!rdev->accel_working) {
+   /* lockup situation wait for cs mutex to be droped */
+   radeon_mutex_lock(&rdev->cs_mutex);
+   radeon_mutex_unlock(&rdev->cs_mutex);
+   /* we are after a gpu reset */
+   if (!rdev->accel_working) {
+   /* gpu reset failed, don't create anymore GPU
+* object
+*/
+   return -EBUSY;
+   }
+   }
+

If it's not just something stupid I've done on my local
machine, then it looks like the 0a01063d patch was
applied without noticing the changes made in 36ff39c4.
(If the problem is really just me, sorry for the spam/
noise.)


Dave W.


[PATCH] drm/radeon: fix fence related segfault in CS

2012-07-03 Thread Alex Deucher
On Tue, Jul 3, 2012 at 8:11 AM, Christian König wrote:
> Don't return success if scheduling the IB fails, otherwise
> we end up with an oops in ttm_eu_fence_buffer_objects.
>
> Signed-off-by: Christian König
> Cc: stable at vger.kernel.org

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/radeon/radeon_cs.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 142f894..17238f4 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -377,7 +377,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
> if (r) {
> DRM_ERROR("Failed to schedule IB !\n");
> }
> -   return 0;
> +   return r;
>  }
>
>  static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
> --
> 1.7.9.5
>
> ___
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
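
The one-line change above matters because the old code logged the
failure but still told the caller everything was fine, so the buffers
were fenced for an IB that never reached the ring. A stripped-down
illustration of the bug class (hypothetical names, not the driver code):

```c
/* Before the fix this function always returned 0, so callers went on
 * to fence buffer objects even when scheduling had failed; returning
 * the real error lets them unwind instead. */
static int schedule_ib(int fail)
{
    int r = fail ? -22 : 0;     /* -22 stands in for -EINVAL */

    if (r) {
        /* DRM_ERROR("Failed to schedule IB !\n"); -- the error was
         * logged here, but previously swallowed by "return 0;". */
    }
    return r;                   /* was "return 0;" before the fix */
}
```

Propagating `r` is the whole patch; everything else stays untouched.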


Re: [PATCH libdrm] intel: Fix build failure in test_decode.c

2012-07-03 Thread Ben Widawsky
On Tue, 3 Jul 2012 12:21:27 +0300
Lauri Kasanen  wrote:

> On Mon, 2 Jul 2012 14:54:58 -0700
> Ben Widawsky  wrote:
> 
> > > +#define _GNU_SOURCE
> > > +
> > >  #include 
> > >  #include 
> > >  #include 
> > 
> > I can't reproduce this. Can anyone else confirm this is broken, and if
> > so that the above patch fixes it?
> 
> See the manpage, it depends on the glibc version. Libdrm is broken on glibc < 
> 2.10.
> 
> - Lauri
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

Pushed. Thanks for the patch.

-- 
Ben Widawsky, Intel Open Source Technology Center
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
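
The fix works because glibc feature-test macros are evaluated when the
first libc header is parsed, so `_GNU_SOURCE` has to appear before any
`#include`; on glibc older than 2.10 the GNU extensions are simply not
declared otherwise. A minimal illustration (the helper is hypothetical,
not libdrm code):

```c
/* _GNU_SOURCE must come before the first header, otherwise glibc < 2.10
 * will not declare GNU extensions such as asprintf(); newer glibc
 * exposes some of them by default, which is why the breakage only
 * showed up on older systems. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper using the GNU asprintf() extension; returns the
 * number of characters written, or -1 on allocation failure. */
static int format_id(char **out, int id)
{
    return asprintf(out, "batch-%d", id);
}
```

Moving the `#define` below the includes reintroduces the implicit
declaration warning (and the build failure) on old glibc.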



[Bug 51594] opengl does not work with radeon 9600

2012-07-03 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=51594

Michel Dänzer  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||DUPLICATE

--- Comment #1 from Michel Dänzer  2012-07-03 02:55:36 PDT ---


*** This bug has been marked as a duplicate of bug 26496 ***

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[Bug 26496] OpenGL does not work on Radeon 9600 (r300)

2012-07-03 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=26496

Michel Dänzer  changed:

   What|Removed |Added

 CC||kevinbanjo at gmail.com

--- Comment #15 from Michel Dänzer  2012-07-03 02:55:36 PDT ---
*** Bug 51594 has been marked as a duplicate of this bug. ***



[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #5 from Jean Delvare   2012-07-03 16:39:19 ---
With this patch applied, I get:

0x98F4  0x00000001 (1)
0x3F88  0x00000001 (1)
0x9B7C  0x00fe0000 (16646144)
0x8950  0xfffcf001 (-200703)
0x98FC  0x00000000 (0)
0x8954  0x00000000 (0)

So the value of register 0x9B7C is correct now, but this was not sufficient.



[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #4 from Jean Delvare   2012-07-03 16:21:27 ---
I tested the patch in comment #3 but unfortunately it doesn't solve the
problem.



Re: [ANNOUNCE] libdrm 2.4.37

2012-07-03 Thread Ben Widawsky
The links generated by the release script were incorrect. They should be:
http://dri.freedesktop.org/libdrm/libdrm-2.4.37.tar.gz
http://dri.freedesktop.org/libdrm/libdrm-2.4.37.tar.bz2

On Fri, 29 Jun 2012 11:17:47 -0700
Ben Widawsky  wrote:

> I botched the 2.4.36 release quite royally. Here is 2.4.37, this time
> with the proper context APIs in place.
> 
> Ben Widawsky (2):
>   intel/context: create/destroy implementation
>   configure: bump version for release
> 
> Kristian Høgsberg (1):
>   modetest: Dump bit field names
> 
> git tag: libdrm-2.4.37
> 
> http://dri.freedesktop.org/www/libdrm/libdrm-2.4.37.tar.bz2
> MD5:  9765919c28d4a54887576db3680137cc  libdrm-2.4.37.tar.bz2
> SHA1: fa8463e390eee9b589dc369abc4cbe3e4ef16d16  libdrm-2.4.37.tar.bz2
> SHA256: e4ea39a901d4a8e59064f10f413bb037dad7790f7c16a5986e7cc1453b36488f  
> libdrm-2.4.37.tar.bz2
> 
> http://dri.freedesktop.org/www/libdrm/libdrm-2.4.37.tar.gz
> MD5:  7f762bfa0bdaa7216c926d0dc9629e87  libdrm-2.4.37.tar.gz
> SHA1: b086dc3f64570ac9aa9eccd23a1e8156e9038995  libdrm-2.4.37.tar.gz
> SHA256: b530d71ff9a7f5252f450a386540fe8512bde033e8283fa6a1bbcd3c62fc91e4  
> libdrm-2.4.37.tar.gz
> 
> ___
> xorg-announce mailing list
> xorg-annou...@lists.x.org
> http://lists.x.org/mailman/listinfo/xorg-announce



-- 
Ben Widawsky, Intel Open Source Technology Center



[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #3 from Jérôme Glisse  2012-07-03 15:42:09 ---
Created an attachment (id=74671)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=74671)
 properly disable render backend

Does this patch fix it ?



[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #2 from Jean Delvare   2012-07-03 15:27:00 ---
With 3.5-rc5 kernel (failing) :

0x98F4  0x00000001 (1)
0x3F88  0x00000001 (1)
0x9B7C  0x00000000 (0)
0x8950  0xfffcf001 (-200703)
0x98FC  0x00000000 (0)

With commit 416a2bd2 reverted (working) :

0x98F4  0x00000001 (1)
0x3F88  0x00000001 (1)
0x9B7C  0x00fe0000 (16646144)
0x8950  0xfffcf001 (-200703)
0x98FC  0x00000000 (0)

So, the value of register GC_USER_RB_BACKEND_DISABLE (0x9B7C) differs.



Re: [PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Jerome Glisse
On Tue, Jul 3, 2012 at 10:52 AM, Christian König wrote:
> On 03.07.2012 16:09, Jerome Glisse wrote:
>>
>> On Tue, Jul 3, 2012 at 5:26 AM, Christian König 
>> wrote:
>>>
>>> [SNIP]
>>>
>>> Yeah, but I thought that fixing those oops as the second step. I see the
>>> fact that suspend/resume is unpinning all the ttm memory and then pinning
>>> it
>>> again as a bug that needs to be fixed. Or as an alternative we could
>>> split
>>> the suspend/resume calls into suspend/disable and resume/enable calls,
>>> where
>>> we only call disable/enable in the gpu_reset path rather than a complete
>>> suspend/resume (not really sure about that).
>>
>> Fixing oopses is not a second step, it is the first step. I am not
>> saying that suspend/resume as it happens right now is a good thing; I
>> am saying it is what it is and we have to deal with it until we do
>> something better. There is no excuse not to fix an oops with a simple
>> 16-line patch.
>
> Completely agree.
>
>
>>> Also a GPU reset isn't such a rare event; currently it just occurs
>>> when userspace is doing something bad, for example submitting an
>>> invalid shader or stuff like that. But with VM and IO protection
>>> coming into the driver we are going to need a GPU reset when there is
>>> a protection fault, and with that it is really desirable to just
>>> reset the hardware in a way where we can say: this IB was faulty,
>>> skip over it and resume with whatever is after it on the ring.
>>
>> There is a mechanism to handle that properly from the irq handler that
>> AMD needs to release; the pagefault thing could be transparent and
>> should only need a few lines in the irq handler (I think I did a patch
>> for that and sent it to AMD for review, but I am wondering if I wasn't
>> lacking some doc).
>
> I also searched the AMD internal docs for a good description of how the
> lightweight recovery is supposed to work, but haven't found anything
> clear so far. My expectation is that there should be something like an
> "abort current IB" command you can issue by writing a register, but
> that doesn't seem to be the case.

Docs are scarce on that; my understanding is that you can't skip an IB
that triggered the fault. The handling was designed with page faults in
mind, i.e. you page in the missing page and resume the IB. That being
said, I think one can abuse that by moving the rptr of the ring and
resuming.

>>> And todo that we need to keep the auxiliary data like sub allocator
>>> memory,
>>> blitting shader bo, and especially vm page tables at the same place in
>>> hardware memory.
>>
>> I agree that we need a lightweight reset, but we need to keep the
>> heavy reset as it is right now; if you want to do a lightweight reset,
>> do it as a new function. I always intended to have two reset paths
>> (hint: the gpu soft reset name vs what is called hard reset but not
>> released); I even made patches for that a long time ago but never got
>> them cleared through AMD review.
>
> My idea was to pass in some extra information, so asic_reset more
> clearly knows what to do. An explicit distinction between a soft and a
> hard reset also seems like a possible solution, but sounds like a bit
> of code duplication.

I think people should stop with the mantra of "don't duplicate code". I
do agree that it's better not to duplicate code, but in some cases there
is no way around it and it's worse to try sharing the code. I think the
worst example in the radeon kernel code is the modesetting section,
where we try to share the same code for all the different variations of
DCE, but in the end it's a mess of if/else/switch/case and what not.
Code duplication is sometimes the cleanest and best choice; might we fix
a bug in one and not the other? Yes, it might happen, but that's life ;)

>
 I stress it: we need at the very least a mutex to protect gpu reset and
 I will stand on that position because there is no other way around.
 Using a rw lock has the bonus of allowing proper handling of gpu reset
 failure, and that is what the patch I sent to the linus fix tree is
 about; so to make drm-next merge properly while preserving proper
 behavior on gpu reset failure, the rw semaphore is the best option.
>>>
>>> Oh well, I'm not arguing that we don't need a lock here. I'm just
>>> questioning putting it at the ioctl level (e.g. the driver entry from
>>> userspace); that wasn't such a good idea with the cs_mutex and doesn't
>>> seem like a good idea now. Instead we should place the locking between
>>> the ioctl/ttm and the actual hardware submission, and that brings it
>>> pretty close (if not identically) to the ring mutex.
>>>
>>> Cheers,
>>> Christian.
>>
>> No, locking at the ioctl level makes sense; please show me figures that
>> it hurts performance. I did a quick sysprof and I couldn't see it
>> impacting anything. No matter how much you hate this, this is the best
>> solution: it avoids each ioctl doing useless things in case of gpu
>> lockup, and it touches a lot less code than moving a lock down the
>> call path would.

[Bug 44121] Reproducible GPU lockup CP stall on Radeon HD 6450

2012-07-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=44121





--- Comment #1 from Alex Deucher   2012-07-03 14:53:04 
---
Can you dump the following registers using radeonreg or avivotool
(http://cgit.freedesktop.org/~airlied/radeontool/) with the patch applied and
reverted and attach both results?

CC_RB_BACKEND_DISABLE (0x98F4)
CC_SYS_RB_BACKEND_DISABLE (0x3F88)
GC_USER_RB_BACKEND_DISABLE (0x9B7C)
CC_GC_SHADER_PIPE_CONFIG (0x8950)
GB_BACKEND_MAP (0x98FC)

(as root):
radeonreg regmatch 0x98F4
etc.



Re: [PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Christian König

On 03.07.2012 16:09, Jerome Glisse wrote:

On Tue, Jul 3, 2012 at 5:26 AM, Christian König  wrote:

[SNIP]
Yeah, but I thought of fixing those oopses as the second step. I see the
fact that suspend/resume is unpinning all the ttm memory and then pinning it
again as a bug that needs to be fixed. Or as an alternative we could split
the suspend/resume calls into suspend/disable and resume/enable calls, where
we only call disable/enable in the gpu_reset path rather than a complete
suspend/resume (not really sure about that).

Fixing oopses is not a second step, it is the first step. I am not
saying that suspend/resume as it happens right now is a good thing; I am
saying it is what it is and we have to deal with it until we do
something better. There is no excuse not to fix an oops with a simple
16-line patch.

Completely agree.


Also a GPU reset isn't such a rare event; currently it just occurs when
userspace is doing something bad, for example submitting an invalid shader
or stuff like that. But with VM and IO protection coming into the driver we
are going to need a GPU reset when there is a protection fault, and with
that it is really desirable to just reset the hardware in a way where we
can say: this IB was faulty, skip over it and resume with whatever is after
it on the ring.

There is a mechanism to handle that properly from the irq handler that
AMD needs to release; the pagefault thing could be transparent and
should only need a few lines in the irq handler (I think I did a patch
for that and sent it to AMD for review, but I am wondering if I wasn't
lacking some doc).

I also searched the AMD internal docs for a good description of how the
lightweight recovery is supposed to work, but haven't found anything
clear so far. My expectation is that there should be something like an
"abort current IB" command you can issue by writing a register, but
that doesn't seem to be the case.



And to do that we need to keep the auxiliary data like the sub allocator
memory, the blitting shader bo, and especially the vm page tables at the
same place in hardware memory.

I agree that we need a lightweight reset, but we need to keep the heavy
reset as it is right now; if you want to do a lightweight reset, do it
as a new function. I always intended to have two reset paths (hint: the
gpu soft reset name vs what is called hard reset but not released); I
even made patches for that a long time ago but never got them cleared
through AMD review.

My idea was to pass in some extra information, so asic_reset more
clearly knows what to do. An explicit distinction between a soft and a
hard reset also seems like a possible solution, but sounds like a bit
of code duplication.



I stress it: we need at the very least a mutex to protect gpu reset and
I will stand on that position because there is no other way around.
Using a rw lock has the bonus of allowing proper handling of gpu reset
failure, and that is what the patch I sent to the linus fix tree is
about; so to make drm-next merge properly while preserving proper
behavior on gpu reset failure, the rw semaphore is the best option.

Oh well, I'm not arguing that we don't need a lock here. I'm just
questioning putting it at the ioctl level (e.g. the driver entry from
userspace); that wasn't such a good idea with the cs_mutex and doesn't
seem like a good idea now. Instead we should place the locking between
the ioctl/ttm and the actual hardware submission, and that brings it
pretty close (if not identically) to the ring mutex.

Cheers,
Christian.

No, locking at the ioctl level makes sense; please show me figures that
it hurts performance. I did a quick sysprof and I couldn't see it
impacting anything. No matter how much you hate this, this is the best
solution: it avoids each ioctl doing useless things in case of gpu
lockup, and it touches a lot less code than moving a lock down the call
path would. So again, this is the best solution for the heavy reset,
and I am not saying that a soft reset would need to take this lock or
that we can't improve the way it's done. All I am saying is that the
ring lock is the wrong solution for a heavy reset; it should be ok for
a lightweight reset.

I'm not worried about performance here; it just doesn't seem to be the
right place to me. So are you really sure that the
ttm_bo_delayed_workqueue, pageflips or callbacks to radeon_bo_move
can't hit us here? IIRC that was always the big concern with the
cs_mutex not being held in all cases.


Anyway, if you think it will work and fix the crash problem at hand then
I'm ok with committing it.


Christian.



Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
On Tue, Jul 03, 2012 at 02:31:19PM +0200, Daniel Vetter wrote:
> On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> > On Tue, Jul 03, 2012 at 10:19:28AM +1000, Dave Chinner wrote:
> > > On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > > Adding dri-devel and a few others because an i915 patch contributed to
> > > > the regression.
> > > > 
> > > > On Mon, Jul 02, 2012 at 03:32:15PM +0100, Mel Gorman wrote:
> > > > > On Mon, Jul 02, 2012 at 02:32:26AM -0400, Christoph Hellwig wrote:
> > > > > > > It increases the CPU overhead (dirty_inode can be called up to 4
> > > > > > > times per write(2) call, IIRC), so with limited numbers of
> > > > > > > threads/limited CPU power it will result in lower performance. 
> > > > > > > Where
> > > > > > > you have lots of CPU power, there will be little difference in
> > > > > > > performance...
> > > > > > 
> > > > > > When I checked it it could only be called twice, and we'd already
> > > > > > optimize away the second call.  I'd defintively like to track down 
> > > > > > where
> > > > > > the performance changes happend, at least to a major version but 
> > > > > > even
> > > > > > better to a -rc or git commit.
> > > > > > 
> > > > > 
> > > > > By all means feel free to run the test yourself and run the bisection 
> > > > > :)
> > > > > 
> > > > > It's rare but on this occasion the test machine is idle so I started 
> > > > > an
> > > > > automated git bisection. As you know the milage with an automated 
> > > > > bisect
> > > > > varies so it may or may not find the right commit. Test machine is 
> > > > > sandy so
> > > > > http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-xfs/sandy/comparison.html
> > > > > is the report of interest. The script is doing a full search between 
> > > > > v3.3 and
> > > > > v3.4 for a point where average files/sec for fsmark-single drops 
> > > > > below 25000.
> > > > > I did not limit the search to fs/xfs on the off-chance that it is an
> > > > > apparently unrelated patch that caused the problem.
> > > > > 
> > > > 
> > > > It was obvious very quickly that there were two distinct regressions, so I
> > > > ran two bisections. One led to an XFS patch and the other to an i915 patch
> > > > that enables RC6 to reduce power usage.
> > > > 
> > > > [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> > > 
> > > Doesn't seem to be the major cause of the regression. By itself, it
> > > has impact, but the majority comes from the XFS change...
> > > 
> > 
> > The fact it has an impact at all is weird but lets see what the DRI
> > folks think about it.
> 
> Well, presuming I understand things correctly the cpu die only goes into
> the lowest sleep state (which iirc switches off l3 caches and
> interconnects) when both the cpu and gpu are in the lowest sleep state.

I made a mistake in my previous mail. gdm and X were *not* running.
Once the screen blanked I would guess the GPU is in a low sleep state
the majority of the time.

> rc6 is that deep-sleep state for the gpu, so without that enabled your
> system won't go into these deep-sleep states.
> 
> I guess the slight changes in wakeup latency, power consumption (cuts
> about 10W on an idle desktop snb with resulting big effect on what turbo
> boost can sustain for short amounts of time) and all the follow-on effects
> are good enough to massively change timing-critical things.
> 

Maybe. How aggressively is the lowest sleep state entered and how long
does it take to exit?

> So this having an effect isn't too weird.
> 
> Obviously, if you also have X running while doing these tests there's the
> chance that the gpu dies because of an issue when waking up from rc6
> (we've known a few of these), but if no drm client is up, that shouldn't
> be possible. So please retest without X running if that hasn't been done
> already.
> 

Again, sorry for the confusion but the posted results are without X running.

-- 
Mel Gorman
SUSE Labs


Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > 
> >
> It was obvious very quickly that there were two distinct regressions, so I
> ran two bisections. One led to an XFS patch and the other to an i915 patch
> that enables RC6 to reduce power usage.
> 
> [c999a223: xfs: introduce an allocation workqueue]
> [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> 
> gdm was running on the machine so i915 would have been in use. 

Bah, more PEBKAC. gdm was *not* running on this machine. i915 is loaded
but X is not.

-- 
Mel Gorman
SUSE Labs


Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Mel Gorman
On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> > Can you run latencytop to see
> > if there is excessive starvation/wait times for allocation
> > completion?
> 
> I'm not sure what format you are looking for.  latencytop is shit for
> capturing information throughout a test and it does not easily allow you to
> record a snapshot of a test. You can record all the console output of course
> but that's a complete mess. I tried capturing /proc/latency_stats over time
> instead because that can be trivially sorted on a system-wide basis but
> as I write this I find that latency_stats was bust. It was just spitting out
> 
> Latency Top version : v0.1
> 
> and nothing else.  Either latency_stats is broken or my config is. Not sure
> which it is right now and won't get enough time on this today to pinpoint it.
> 

PEBKAC. The script that monitored /proc/latency_stats was not enabling
latencytop via /proc/sys/kernel/latencytop.

-- 
Mel Gorman
SUSE Labs


gma500 suspend to ram fails (3.4)

2012-07-03 Thread Mattia Dongili
On Tue, Jul 03, 2012 at 07:11:31AM +0900, Mattia Dongili wrote:
> On Mon, Jul 02, 2012 at 11:43:04AM +0100, Alan Cox wrote:
> > On Mon, 2 Jul 2012 07:06:59 +0900
> > Mattia Dongili  wrote:
> ...
> > > I did put some printks here and there and psb_driver_init seems to
> > > never return from gma_backlight_init.
> > 
> > Is it ok if you just comment out gma_backlight_init (at which point it
> > should just skip backlight handling and leave the firmware configured
> > one)
> 
> ha! that gave me a good lead.
> The problem is not gma_backlight_init but rather psb_modeset_init that
> calls the errata for the chip that in turn sets the brightness before
> any initialization is done.
> 
> Commenting out the call to dev_priv->ops->errata in framebuffer.c makes
> boot get to the end like it used to on 3.4. It seems to be just an
> ordering problem in the end.

and back to the original issue, with 3.5 the laptop suspends to ram just
fine! finally this vaio X may be of some use!!

I can test patches for the errata call issue if you need any
confirmation.

-- 
mattia
:wq!


[Bug 26092] shader problems with Flightgear

2012-07-03 Thread bugzilla-dae...@freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=26092

Christian Schmitt  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||WORKSFORME

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


gma500 suspend to ram fails (3.4)

2012-07-03 Thread Mattia Dongili
On Mon, Jul 02, 2012 at 11:43:04AM +0100, Alan Cox wrote:
> On Mon, 2 Jul 2012 07:06:59 +0900
> Mattia Dongili  wrote:
...
> > I did put some printks here and there and psb_driver_init seems to
> > never return from gma_backlight_init.
> 
> Is it ok if you just comment out gma_backlight_init (at which point it
> should just skip backlight handling and leave the firmware configured
> one)

ha! that gave me a good lead.
The problem is not gma_backlight_init but rather psb_modeset_init that
calls the errata for the chip that in turn sets the brightness before
any initialization is done.

Commenting out the call to dev_priv->ops->errata in framebuffer.c makes
boot get to the end like it used to on 3.4. It seems to be just an
ordering problem in the end.

-- 
mattia
:wq!


Re: [PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Jerome Glisse
On Tue, Jul 3, 2012 at 5:26 AM, Christian König  wrote:
> On 02.07.2012 19:27, Jerome Glisse wrote:
>>
>> On Mon, Jul 2, 2012 at 1:05 PM, Christian König 
>> wrote:
>>>
>>> On 02.07.2012 18:41, Jerome Glisse wrote:

 On Mon, Jul 2, 2012 at 12:26 PM, Christian König
  wrote:
>
> On 02.07.2012 17:39, j.gli...@gmail.com wrote:
>>
>> From: Jerome Glisse 
>>
>> GPU resets need to be exclusive, one happening at a time. For this,
>> add a rw semaphore so that any path that triggers GPU activity has
>> to take the semaphore as a reader, thus allowing concurrency.
>>
>> The GPU reset path takes the semaphore as a writer, ensuring that no
>> concurrent reset takes place.
>>
>> Signed-off-by: Jerome Glisse 
>
> NAK, that isn't as bad as the cs mutex was but still too complicated.
> It's
> just too far up in the call stack, e.g. it tries to catch ioctl
> operations,
> instead of catching the underlying hardware operation which is caused
> by
> the
> ioctl/ttm/etc...
>
> Why not just take the ring lock as I suggested?
>
>
 No, we can't use the ring lock; we need to protect the suspend/resume
 path and we need an exclusive lock for that, so we need a reset mutex
 at the very least. But instead of having a reset mutex I prefer using
 a rw lock, so that we can stop ioctls until a reset goes through and
 let pending ioctls take proper action. Think about multiple contexts
 trying to reset the GPU ...

 Really this is the best option; the rw locking won't induce any lock
 contention except in the GPU reset case, which is anyway bad news.
>>>
>>> Why? That makes no sense to me. Well I don't want to prevent lock
>>> contention, but understand why we should add locking at the ioctl level.
>>> That violates locking rule number one "lock data instead of code" (or in
>>> our
>>> case "lock hardware access instead of code path") and it really is the
>>> reason why we ended up with the cs_mutex protecting practically
>>> everything.
>>>
>>> Multiple context trying to reset the GPU should be pretty fine, current
>>> it
>>> would just reset the GPU twice, but in the future asic_reset should be
>>> much
>>> more fine grained and only reset the parts of the GPU which really needs
>>> an
>>> reset.
>>>
>>> Cheers,
>>> Christian.
>>
>> No, multiple resets are not fine; try it yourself and you will see
>> all kinds of oopses (I strongly advise you to sync your hard disk
>> before stress testing that). Yes, we need to protect the code path,
>> because the suspend/resume code path is a special one that touches a
>> lot of data in the device structure. GPU reset is a rare event, or
>> should be a rare event.
>
> Yeah, but I thought of fixing those oopses as the second step. I see the
> fact that suspend/resume is unpinning all the ttm memory and then pinning it
> again as a bug that needs to be fixed. Or as an alternative we could split
> the suspend/resume calls into suspend/disable and resume/enable calls, where
> we only call disable/enable in the gpu_reset path rather than a complete
> suspend/resume (not really sure about that).

Fixing oopses is not the second step, it is the first step. I am not
saying that the suspend/resume as it happens right now is a good thing;
I am saying it is what it is and we have to deal with it until we do
something better. There is no excuse not to fix an oops with a simple
16-line patch.

> Also a GPU reset isn't such a rare event, currently it just occurs when
> userspace is doing something bad, for example submitting an invalid shader
> or stuff like that. But with VM and IO protection coming into the driver we
> are going to need a GPU reset when there is a protection fault, and with
> that it is really desirable to just reset the hardware in a way where we can
> say: This IB was faulty skip over it and resume with whatever is after it on
> the ring.

There is a mechanism to handle that properly from the IRQ handler that
AMD needs to release; the pagefault handling could be transparent and
should only need a few lines in the IRQ handler (I think I did a patch
for that and sent it to AMD for review, but I wonder whether I was
lacking some documentation).

> And todo that we need to keep the auxiliary data like sub allocator memory,
> blitting shader bo, and especially vm page tables at the same place in
> hardware memory.

I agree that we need a lightweight reset, but we need to keep the heavy
reset as it is right now; if you want to do a lightweight reset, do it
as a new function. I always intended to have two reset paths (hence the
GPU soft reset name versus the hard reset that was never released); I
even made patches for that a long time ago but never got them cleared
through AMD review.

>
>> I stress it we need at very least a mutex to protect gpu reset and i
>> will stand on that position because there is no other way around.
>> Using rw lock have a bonus of allowing proper handling of gpu reset
>> failure and that what the patch i sent to linus fix tree 

Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Daniel Vetter
On Tue, Jul 03, 2012 at 02:04:14PM +0100, Mel Gorman wrote:
> On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > 
> > >
> > It was obvious very quickly that there were two distinct regressions, so I
> > ran two bisections. One led to an XFS patch and the other to an i915 patch
> > that enables RC6 to reduce power usage.
> > 
> > [c999a223: xfs: introduce an allocation workqueue]
> > [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> > 
> > gdm was running on the machine so i915 would have been in use. 
> 
> Bah, more PEBKAC. gdm was *not* running on this machine. i915 is loaded
> but X is not.

See my little explanation of rc6: just loading the driver will have
effects. But I'm happy to know that the issue also happens without using
it; that makes it really unlikely it's an issue with the gpu or i915.ko ;-)
-Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


Re: [PATCH] drm/radeon: fix fence related segfault in CS

2012-07-03 Thread Alex Deucher
On Tue, Jul 3, 2012 at 8:11 AM, Christian König  wrote:
> Don't return success if scheduling the IB fails, otherwise
> we end up with an oops in ttm_eu_fence_buffer_objects.
>
> Signed-off-by: Christian König 
> Cc: sta...@vger.kernel.org

Reviewed-by: Alex Deucher 

> ---
>  drivers/gpu/drm/radeon/radeon_cs.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
> b/drivers/gpu/drm/radeon/radeon_cs.c
> index 142f894..17238f4 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -377,7 +377,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
> if (r) {
> DRM_ERROR("Failed to schedule IB !\n");
> }
> -   return 0;
> +   return r;
>  }
>
>  static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
> --
> 1.7.9.5
>
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Eugeni Dodonov
On Tue, Jul 3, 2012 at 9:31 AM, Daniel Vetter  wrote:

> Well, presuming I understand things correctly the cpu die only goes into
> the lowest sleep state (which iirc switches off l3 caches and
> interconnects) when both the cpu and gpu are in the lowest sleep state.
> rc6 is that deep-sleep state for the gpu, so without that enabled your
> system won't go into these deep-sleep states.
>
> I guess the slight changes in wakeup latency, power consumption (cuts
> about 10W on an idle desktop snb with resulting big effect on what turbo
> boost can sustain for short amounts of time) and all the follow-on effects
> are good enough to massively change timing-critical things.
>

The sad side effect is that the software has very little control over the
RC6 entry and exit, the hardware enters and leaves RC6 state on its own
when it detects that the GPU is idle beyond a threshold. Chances are that
if you are not running any GPU workload, the GPU simply enters RC6 state
and stays there.

It is possible to observe the current state and also time spent in rc6 by
looking at the /sys/kernel/debug/dri/0/i915_drpc_info file.

One other effect of RC6 is that it also allows the CPU to go into higher
turbo modes, as it has more watts to spend while the GPU is idle; perhaps
this is what causes the issue here?

-- 
Eugeni Dodonov



Re: [MMTests] IO metadata on XFS

2012-07-03 Thread Daniel Vetter
On Tue, Jul 03, 2012 at 11:59:51AM +0100, Mel Gorman wrote:
> On Tue, Jul 03, 2012 at 10:19:28AM +1000, Dave Chinner wrote:
> > On Mon, Jul 02, 2012 at 08:35:16PM +0100, Mel Gorman wrote:
> > > Adding dri-devel and a few others because an i915 patch contributed to
> > > the regression.
> > > 
> > > On Mon, Jul 02, 2012 at 03:32:15PM +0100, Mel Gorman wrote:
> > > > On Mon, Jul 02, 2012 at 02:32:26AM -0400, Christoph Hellwig wrote:
> > > > > > It increases the CPU overhead (dirty_inode can be called up to 4
> > > > > > times per write(2) call, IIRC), so with limited numbers of
> > > > > > threads/limited CPU power it will result in lower performance. Where
> > > > > > you have lots of CPU power, there will be little difference in
> > > > > > performance...
> > > > > 
> > > > > When I checked it it could only be called twice, and we'd already
> > > > > optimize away the second call.  I'd defintively like to track down 
> > > > > where
> > > > > the performance changes happend, at least to a major version but even
> > > > > better to a -rc or git commit.
> > > > > 
> > > > 
> > > > By all means feel free to run the test yourself and run the bisection :)
> > > > 
> > > > It's rare but on this occasion the test machine is idle so I started an
> > > > automated git bisection. As you know the milage with an automated bisect
> > > > varies so it may or may not find the right commit. Test machine is 
> > > > sandy so
> > > > http://www.csn.ul.ie/~mel/postings/mmtests-20120424/global-dhp__io-metadata-xfs/sandy/comparison.html
> > > > is the report of interest. The script is doing a full search between 
> > > > v3.3 and
> > > > v3.4 for a point where average files/sec for fsmark-single drops below 
> > > > 25000.
> > > > I did not limit the search to fs/xfs on the off-chance that it is an
> > > > apparently unrelated patch that caused the problem.
> > > > 
> > > 
> > > It was obvious very quickly that there were two distinct regressions, so I
> > > ran two bisections. One led to an XFS patch and the other to an i915 patch
> > > that enables RC6 to reduce power usage.
> > > 
> > > [aa464191: drm/i915: enable plain RC6 on Sandy Bridge by default]
> > 
> > Doesn't seem to be the major cause of the regression. By itself, it
> > has impact, but the majority comes from the XFS change...
> > 
> 
> The fact it has an impact at all is weird but lets see what the DRI
> folks think about it.

Well, presuming I understand things correctly the cpu die only goes into
the lowest sleep state (which iirc switches off l3 caches and
interconnects) when both the cpu and gpu are in the lowest sleep state.
rc6 is that deep-sleep state for the gpu, so without that enabled your
system won't go into these deep-sleep states.

I guess the slight changes in wakeup latency, power consumption (cuts
about 10W on an idle desktop snb with resulting big effect on what turbo
boost can sustain for short amounts of time) and all the follow-on effects
are good enough to massively change timing-critical things.

So this having an effect isn't too weird.

Obviously, if you also have X running while doing these tests there's the
chance that the gpu dies because of an issue when waking up from rc6
(we've known a few of these), but if no drm client is up, that shouldn't
be possible. So please retest without X running if that hasn't been done
already.

Yours, Daniel
-- 
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48


[Bug 51658] r200 (& possibly radeon) DRI fixes for gnome shell on Mesa 8.0.3

2012-07-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=51658

--- Comment #9 from Roland Scheidegger  2012-07-03 05:20:47 
PDT ---
(In reply to comment #8)
> A summary of all of the patches follows.
> 
> r200 essential patches (1st 3 patches) for gnome shell:
> "Essential patch to disable texture formats that are reported unrenderable
> elsewhere in driver."
I'm not really happy with that. I guess the problem is if we try to attach a
texture to a fbo it is too late to notice we cannot render to that format. But
this sort of sucks for ordinary textures. Ideally we'd probably be able to
really determine our format only at the first upload of data to the texture
(which presumably won't happen) so we could change it if we attach it to a fbo.
Though if there is indeed already data uploaded we're screwed.

> "Essential patch to allow 2048 pixel blits."
This one looks good as far as I can tell. The programmed values are
width/height -1 so this clearly should work.

> "Essential patch to reapply dirtied texenv registers."
I don't understand this. If the state isn't used anyway, why re-upload
it, dirty or not? I think this is just hiding the root cause of another bug.

> 
> suggested minor fixes patch (4th patch):
> "Unessential fixes/enhancements patch."
Look good.

> 
> proposed r200 blit patch (5th patch) - unknown importance:
> "Proposed blit register dirtying patch."
Makes sense to me.

> 
> untested but probably essential r100/radeon patch (7th patch) for gnome shell:
> "Untested but probably essential patch to allow 2048 pixel blits on r100."
Yes that should be same as for r200.

> 
> optional unrecommended patches to supplement blit patches already mentioned
> above (6th & 8th patches):
> "Optional patch to warn (once) when a blit with 2048 pixel dimension occurs on
> r200."
> "Optional untested patch to warn (once) when a blit with 2048 pixel dimension
> occurs on r100."
I don't think this is necessary - at least not there. texture src width/height
of 2048 should clearly work. I am however not so sure about destination.
width/height are fed into RADEON_RE_WIDTH_HEIGHT, and those have just 11 bit,
which isn't enough for 2048. Furthermore, docs say the values are inclusive.
Not sure what's up with that, the ddx also just uses width/height and not
width/height -1 but clamps them to 2047 in some places. dri driver though
otherwise seems to use x2/y2 (i.e. width -1, height -1). So I think this should
be fixed everywhere to really use width/height -1.



[PATCH] drm/radeon: fix fence related segfault in CS

2012-07-03 Thread Christian König
Don't return success if scheduling the IB fails, otherwise
we end up with an oops in ttm_eu_fence_buffer_objects.

Signed-off-by: Christian König 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/radeon/radeon_cs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/radeon_cs.c 
b/drivers/gpu/drm/radeon/radeon_cs.c
index 142f894..17238f4 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -377,7 +377,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
if (r) {
DRM_ERROR("Failed to schedule IB !\n");
}
-   return 0;
+   return r;
 }
 
 static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
-- 
1.7.9.5



[git pull] drm fixes

2012-07-03 Thread Dave Airlie

Hi Linus,

one regression fix, two radeon fixes (one for an oops), and an i915 fix
to unload framebuffers earlier.

We originally were going to leave the i915 fix until -next, but grub2 in 
some situations causes vesafb/efifb to be loaded now, and this causes big 
slowdowns, and I have reports in rawhide I'd like to have fixed.

Dave.

The following changes since commit 9acc7bde23ebb19a704395f76490685e1513e422:

  Merge tag 'hwmon-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging (2012-06-28 
12:38:51 -0700)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 9f846a16d213523fbe6daea17e20df6b8ac5a1e5:

  drm/i915: kick any firmware framebuffers before claiming the gtt (2012-07-03 
11:18:48 +0100)


Alex Deucher (1):
  drm/radeon: fix VM page table setup on SI

Daniel Vetter (1):
  drm/i915: kick any firmware framebuffers before claiming the gtt

Jerome Glisse (1):
  drm/radeon: fix rare segfault

Takashi Iwai (1):
  drm: edid: Don't add inferred modes with higher resolution

 drivers/gpu/drm/drm_edid.c   |   27 ++---
 drivers/gpu/drm/i915/i915_dma.c  |   37 +++---
 drivers/gpu/drm/radeon/radeon_gart.c |   13 ++--
 drivers/gpu/drm/radeon/radeon_gem.c  |   10 +
 drivers/gpu/drm/radeon/si.c  |4 ++--
 5 files changed, 73 insertions(+), 18 deletions(-)


Re: [PATCH] omap2+: add drm device

2012-07-03 Thread Tony Lindgren
* Gross, Andy  [120619 14:17]:
> Tony,
> 
> Please queue this patch at your earliest convenience.
> 
> We had some discussion on the splitting out of the DMM/Tiler driver
> from the omapdrm driver.  There might be some interest in leveraging
> the Tiler for omapfb.  However, we agreed this can be deferred until
> some other device (omapfb or otherwise) needs to use the Tiler.

OK, I'll apply this into devel-board as that seems to fit the platform
data area.

Regards,

Tony


[Bug 26496] OpenGL does not work on Radeon 9600 (r300)

2012-07-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=26496

Michel Dänzer  changed:

   What|Removed |Added

 CC||kevinba...@gmail.com

--- Comment #15 from Michel Dänzer  2012-07-03 02:55:36 PDT 
---
*** Bug 51594 has been marked as a duplicate of this bug. ***



[Bug 51594] opengl does not work with radeon 9600

2012-07-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=51594

Michel Dänzer  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||DUPLICATE

--- Comment #1 from Michel Dänzer  2012-07-03 02:55:36 PDT 
---


*** This bug has been marked as a duplicate of bug 26496 ***



Re: [PATCH] drm/radeon: fix rare segfault

2012-07-03 Thread Christian König

On 02.07.2012 18:40, j.gli...@gmail.com wrote:

From: Jerome Glisse 

In the GEM idle/busy ioctls the radeon object was dereferenced after
drm_gem_object_unreference_unlocked, which, in case the object had
been destroyed, led to the use of a possibly freed pointer with
possibly wrong data.

Signed-off-by: Jerome Glisse 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/radeon/radeon_gem.c |   10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gem.c 
b/drivers/gpu/drm/radeon/radeon_gem.c
index 74176c5..c8838fc 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -325,6 +325,7 @@ int radeon_gem_mmap_ioctl(struct drm_device *dev, void 
*data,
  int radeon_gem_busy_ioctl(struct drm_device *dev, void *data,
  struct drm_file *filp)
  {
+   struct radeon_device *rdev = dev->dev_private;
struct drm_radeon_gem_busy *args = data;
struct drm_gem_object *gobj;
struct radeon_bo *robj;
@@ -350,13 +351,14 @@ int radeon_gem_busy_ioctl(struct drm_device *dev, void 
*data,
break;
}
drm_gem_object_unreference_unlocked(gobj);
-   r = radeon_gem_handle_lockup(robj->rdev, r);
+   r = radeon_gem_handle_lockup(rdev, r);
return r;
  }
  
  int radeon_gem_wait_idle_ioctl(struct drm_device *dev, void *data,

  struct drm_file *filp)
  {
+   struct radeon_device *rdev = dev->dev_private;
struct drm_radeon_gem_wait_idle *args = data;
struct drm_gem_object *gobj;
struct radeon_bo *robj;
@@ -369,10 +371,10 @@ int radeon_gem_wait_idle_ioctl(struct drm_device *dev, 
void *data,
robj = gem_to_radeon_bo(gobj);
r = radeon_bo_wait(robj, NULL, false);
/* callback hw specific functions if any */
-   if (robj->rdev->asic->ioctl_wait_idle)
-   robj->rdev->asic->ioctl_wait_idle(robj->rdev, robj);
+   if (rdev->asic->ioctl_wait_idle)
+   rdev->asic->ioctl_wait_idle(rdev, robj);
drm_gem_object_unreference_unlocked(gobj);
-   r = radeon_gem_handle_lockup(robj->rdev, r);
+   r = radeon_gem_handle_lockup(rdev, r);
return r;
  }
  





Re: [PATCH] drm/radeon: add an exclusive lock for GPU reset

2012-07-03 Thread Christian König

On 02.07.2012 19:27, Jerome Glisse wrote:

On Mon, Jul 2, 2012 at 1:05 PM, Christian König  wrote:

On 02.07.2012 18:41, Jerome Glisse wrote:

On Mon, Jul 2, 2012 at 12:26 PM, Christian König
 wrote:

On 02.07.2012 17:39, j.gli...@gmail.com wrote:

From: Jerome Glisse 

GPU resets need to be exclusive, one happening at a time. For this,
add a rw semaphore so that any path that triggers GPU activity has to
take the semaphore as a reader, thus allowing concurrency.

The GPU reset path takes the semaphore as a writer, ensuring that no
concurrent reset takes place.

Signed-off-by: Jerome Glisse 

NAK, that isn't as bad as the cs mutex was but still too complicated. It's
just too far up in the call stack, e.g. it tries to catch ioctl
operations,
instead of catching the underlying hardware operation which is caused by
the
ioctl/ttm/etc...

Why not just take the ring lock as I suggested?



No, we can't use the ring lock; we need to protect the suspend/resume
path and we need an exclusive lock for that, so we need a reset mutex at
the very least. But instead of having a reset mutex I prefer using a rw
lock, so that we can stop ioctls until a reset goes through and let
pending ioctls take proper action. Think about multiple contexts trying
to reset the GPU ...

Really this is the best option; the rw locking won't induce any lock
contention except in the GPU reset case, which is anyway bad news.

Why? That makes no sense to me. Well I don't want to prevent lock
contention, but understand why we should add locking at the ioctl level.
That violates locking rule number one "lock data instead of code" (or in our
case "lock hardware access instead of code path") and it really is the
reason why we ended up with the cs_mutex protecting practically everything.

Multiple context trying to reset the GPU should be pretty fine, current it
would just reset the GPU twice, but in the future asic_reset should be much
more fine grained and only reset the parts of the GPU which really needs an
reset.

Cheers,
Christian.

No, multiple resets are not fine; try it yourself and you will see all
kinds of oopses (I strongly advise you to sync your hard disk before
stress testing that). Yes, we need to protect the code path, because the
suspend/resume code path is a special one that touches a lot of data in
the device structure. GPU reset is a rare event, or should be a rare event.
Yeah, but I thought of fixing those oopses as the second step. I see the 
fact that suspend/resume is unpinning all the ttm memory and then 
pinning it again as a bug that needs to be fixed. Or as an alternative 
we could split the suspend/resume calls into suspend/disable and 
resume/enable calls, where we only call disable/enable in the gpu_reset 
path rather than doing a complete suspend/resume (not really sure about that).


Also, a GPU reset isn't such a rare event; currently it just occurs when
userspace is doing something bad, for example submitting an invalid
shader or stuff like that. But with VM and IO protection coming into the
driver we are going to need a GPU reset when there is a protection
fault, and with that it is really desirable to just reset the hardware
in a way where we can say: this IB was faulty, skip over it and resume
with whatever is after it on the ring.


And to do that we need to keep the auxiliary data like sub-allocator
memory, the blitting shader bo, and especially the vm page tables at the
same place in hardware memory.



I stress it: we need at the very least a mutex to protect gpu reset, and I
will stand on that position because there is no other way around it.
Using a rw lock has the bonus of allowing proper handling of gpu reset
failure, and that is what the patch I sent to Linus' fix tree is about; so,
to make drm-next merge properly while preserving proper behavior on
gpu reset failure, the rw semaphore is the best option.
Oh well, I'm not arguing that we don't need a lock here. I'm just
questioning putting it at the ioctl level (e.g. the driver entry from
userspace); that wasn't such a good idea with the cs_mutex and doesn't
seem like a good idea now. Instead we should place the locking between
the ioctl/ttm and the actual hardware submission, and that brings it
pretty close (if not identical) to the ring mutex.


Cheers,
Christian.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH libdrm] intel: Fix build failure in test_decode.c

2012-07-03 Thread Lauri Kasanen
On Mon, 2 Jul 2012 14:54:58 -0700
Ben Widawsky  wrote:

> > +#define _GNU_SOURCE
> > +
> >  #include 
> >  #include 
> >  #include 
> 
> I can't reproduce this. Can anyone else confirm this is broken, and if
> so that the above patch fixes it?

See the manpage; it depends on the glibc version. Libdrm is broken on
glibc < 2.10.

- Lauri


[PATCH] drm: edid: Don't add inferred modes with higher resolution

2012-07-03 Thread Takashi Iwai
When a monitor EDID doesn't give the preferred bit, the driver assumes
that the mode with the highest resolution and rate is the preferred
mode.  Meanwhile, the recent changes allowing more modes in the
GTF/CVT ranges actually give more modes, and some modes may be over
the native size.  Thus such a mode could be picked up as the preferred
mode although it's not the native resolution.

To avoid this problem, this patch limits the addition of inferred
modes by checking that they are not greater than other modes.
Also, it checks for duplicated mode entries at the same time.

Reviewed-by: Adam Jackson 
Signed-off-by: Takashi Iwai 
---

Dave, could you apply this as a regression fix for 3.5 kernel?

 drivers/gpu/drm/drm_edid.c |   27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 5873e48..a8743c3 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1039,6 +1039,24 @@ mode_in_range(const struct drm_display_mode *mode, 
struct edid *edid,
return true;
 }
 
+static bool valid_inferred_mode(const struct drm_connector *connector,
+   const struct drm_display_mode *mode)
+{
+   struct drm_display_mode *m;
+   bool ok = false;
+
+   list_for_each_entry(m, &connector->probed_modes, head) {
+   if (mode->hdisplay == m->hdisplay &&
+   mode->vdisplay == m->vdisplay &&
+   drm_mode_vrefresh(mode) == drm_mode_vrefresh(m))
+   return false; /* duplicated */
+   if (mode->hdisplay <= m->hdisplay &&
+   mode->vdisplay <= m->vdisplay)
+   ok = true;
+   }
+   return ok;
+}
+
 static int
 drm_dmt_modes_for_range(struct drm_connector *connector, struct edid *edid,
struct detailed_timing *timing)
@@ -1048,7 +1066,8 @@ drm_dmt_modes_for_range(struct drm_connector *connector, 
struct edid *edid,
struct drm_device *dev = connector->dev;
 
for (i = 0; i < drm_num_dmt_modes; i++) {
-   if (mode_in_range(drm_dmt_modes + i, edid, timing)) {
+   if (mode_in_range(drm_dmt_modes + i, edid, timing) &&
+   valid_inferred_mode(connector, drm_dmt_modes + i)) {
newmode = drm_mode_duplicate(dev, &drm_dmt_modes[i]);
if (newmode) {
drm_mode_probed_add(connector, newmode);
@@ -1088,7 +1107,8 @@ drm_gtf_modes_for_range(struct drm_connector *connector, 
struct edid *edid,
return modes;
 
fixup_mode_1366x768(newmode);
-   if (!mode_in_range(newmode, edid, timing)) {
+   if (!mode_in_range(newmode, edid, timing) ||
+   !valid_inferred_mode(connector, newmode)) {
drm_mode_destroy(dev, newmode);
continue;
}
@@ -1116,7 +1136,8 @@ drm_cvt_modes_for_range(struct drm_connector *connector, 
struct edid *edid,
return modes;
 
fixup_mode_1366x768(newmode);
-   if (!mode_in_range(newmode, edid, timing)) {
+   if (!mode_in_range(newmode, edid, timing) ||
+   !valid_inferred_mode(connector, newmode)) {
drm_mode_destroy(dev, newmode);
continue;
}
-- 
1.7.10.4



Re: Bogus video resolution in Linux 3.5-rc4

2012-07-03 Thread Takashi Iwai
At Mon, 02 Jul 2012 15:46:29 -0400,
Adam Jackson wrote:
> 
> On 6/26/12 3:21 AM, Takashi Iwai wrote:
> 
> > From: Takashi Iwai 
> > Subject: [PATCH] drm: edid: Don't add inferred modes with higher resolution
> >
> > When a monitor EDID doesn't give the preferred bit, the driver assumes
> > that the mode with the highest resolution and rate is the preferred
> > mode.  Meanwhile, the recent changes allowing more modes in the
> > GTF/CVT ranges actually give more modes, and some modes may be over
> > the native size.  Thus such a mode could be picked up as the preferred
> > mode although it's not the native resolution.
> >
> > To avoid this problem, this patch limits the addition of inferred
> > modes by checking that they are not greater than other modes.
> > Also, it checks for duplicated mode entries at the same time.
> 
> This is a little aggressive on CRTs, but whatever, better than what's there.
> 
> Reviewed-by: Adam Jackson 

Thanks.  I'm going to resend the patch with your tag.


Takashi


[Bug 26092] shader problems with Flightgear

2012-07-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=26092

Christian Schmitt  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||WORKSFORME

-- 
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.


[PATCH] omap2+: add drm device

2012-07-03 Thread Tony Lindgren
* Gross, Andy  [120619 14:17]:
> Tony,
> 
> Please queue this patch at your earliest convenience.
> 
> We had some discussion on the splitting out of the DMM/Tiler driver
> from the omapdrm driver.  There might be some interest in leveraging
> the Tiler for omapfb.  However, we agreed this can be deferred until
> some other device (omapfb or otherwise) needs to use the Tiler.

OK, I'll apply this into devel-board as that seems to fit the platform
data area.

Regards,

Tony