Re: DRM memory manager on cards with hardware contexts

2006-09-20 Thread Thomas Hellström




Benjamin Herrenschmidt wrote:

> On Tue, 2006-09-19 at 12:49 +0200, Thomas Hellström wrote:
>
>> Benjamin Herrenschmidt wrote:
>>
>>> On Tue, 2006-09-19 at 11:27 +0200, Thomas Hellström wrote:
>>>
>>>> But this should be the same problem encountered by the agpgart driver?
>>>> x86 and x86-64 call change_page_attr() to take care of this.
>>>> On powerpc it is simply a noop. (asm/agp.h)
>>>
>>> Possibly. We sort-of ignore the issue for now on PowerPC and happen to
>>> be lucky most of the time because 32-bit PowerPC CPUs aren't that
>>> aggressive at prefetching...
>>>
>>> I haven't looked at the change_page_attr() implementation on x86 but I
>>> know they also map the linear mapping with large pages. I don't know what
>>> happens if you start trying to change a single page attribute. x86 can
>>> break up that large page into 4k pages, so maybe that's what happens.
>>
>> Yes, I think that's what happens. I know some Athlon chips had a big
>> issue with this some time ago.
>>
>> I notice there are special functions in agp.h to alloc / free GATT
>> pages, so the general idea might be to have a pool of uncached pages
>> in the future for powerpc? Even better would perhaps be to have pages
>> that aren't mapped for the kernel. (like highmem pages on x86).
>
> Yes, that's exactly what I'm thinking about doing. However, this is only
> a problem for AGP.

Right.

> For objects that are in video memory but can also be moved back to main
> memory (what I call "evicted") under pressure by the memory manager, one
> thing I was wondering is, do we have to bother about cache settings at
> all?

I don't think so.
We are not doing vram yet in the TTM code, but I think a general
"eviction" would consist of

1) locking mmap_sems for all processes mapping the buffer.
2) zap the page table. Any attempt to access will be blocked by
mmap_sem in nopage().
3) Copy contents from vram to system using either PCI SG or
video-blit-AGP-flip-system.
4) Wait for completion.
5) release the mmap sem. The page table will be refilled using nopage().

A copy back might be more efficient since in some situations we don't
have to wait for completion (if the copy is done using the command
queue). Intel chips for instance have the possibility to flip cached
pages into AGP for use with the video blitter.
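
The five eviction steps listed above can be modeled in user space. This is only a sketch under stated assumptions, not TTM code: `struct buffer_object`, `bo_evict()` and `bo_fault()` are hypothetical names, a pthread mutex stands in for the mmap_sems, a `mapped` flag stands in for valid page-table entries, and a synchronous memcpy() stands in for the PCI SG or blit copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum placement { IN_VRAM, IN_SYSTEM };

/* Hypothetical model of a buffer that can live in vram or system memory. */
struct buffer_object {
    pthread_mutex_t lock;   /* stand-in for the mmap_sems of all mappers */
    enum placement where;
    bool mapped;            /* stand-in for valid page-table entries */
    size_t size;
    unsigned char *vram;    /* stand-in for the on-card copy */
    unsigned char *sysmem;  /* backing store in main memory */
};

/* Steps 1-5: lock, zap, copy, wait, unlock. Returns 0 on success. */
static int bo_evict(struct buffer_object *bo)
{
    assert(bo != NULL);
    pthread_mutex_lock(&bo->lock);           /* 1) block faulting threads */
    if (bo->where == IN_SYSTEM) {
        pthread_mutex_unlock(&bo->lock);
        return 0;                            /* already evicted */
    }
    bo->mapped = false;                      /* 2) zap the mapping */
    if (bo->sysmem == NULL) {
        bo->sysmem = malloc(bo->size);
        if (bo->sysmem == NULL) {
            pthread_mutex_unlock(&bo->lock);
            return -1;                       /* still in vram, refault later */
        }
    }
    memcpy(bo->sysmem, bo->vram, bo->size);  /* 3) + 4) synchronous copy */
    bo->where = IN_SYSTEM;
    pthread_mutex_unlock(&bo->lock);         /* 5) faults may proceed */
    return 0;
}

/* Model of nopage(): waits for any eviction, then refills the mapping. */
static unsigned char *bo_fault(struct buffer_object *bo)
{
    unsigned char *backing;

    assert(bo != NULL);
    pthread_mutex_lock(&bo->lock);
    backing = (bo->where == IN_VRAM) ? bo->vram : bo->sysmem;
    bo->mapped = true;
    pthread_mutex_unlock(&bo->lock);
    return backing;
}
```

A caller would initialize `lock` with pthread_mutex_init(), call `bo_evict()` under memory pressure, and let the next `bo_fault()` pick up the system-memory copy. The asynchronous "copy back" variant mentioned above is not modeled here.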

> That is, have them mapped non-cacheable when in vram and cacheable when
> in main memory. Is there any reason why there would be a problem with
> userland having the same buffer being sometimes cacheable and
> non-cacheable ? I don't think so as long as userland isn't using cache
> tricks and whatever primitive is used to do the copy to/from vram
> properly accounts for it.

I agree.

> Ben.

/Thomas



-
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys -- and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[Bug 8218] Errors during World of Warcraft stress test

2006-09-20 Thread bugzilla-daemon
Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

https://bugs.freedesktop.org/show_bug.cgi?id=8218
 




--- Additional Comments From [EMAIL PROTECTED]  2006-09-20 01:01 ---
(In reply to comment #14)
> > Incidentally, I saw that the r300 implements vbos in hardware. Is this an
> > extra feature of the R300 hardware, or could the R200 hardware do this too?
> No, r200 (and r100 too) can easily do that, with basically the same code as
> r300 uses for that. It's on my TODO list...

FWIW, I think it might be better to move all these drivers to the upcoming
common memory manager instead of putting more effort into the r300 hack.
  
 
 
--   
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email 
 
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.



[Bug 8283] i915 can't swizzle TEX arguments

2006-09-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8283  
 




--- Additional Comments From [EMAIL PROTECTED]  2006-09-20 03:42 ---
Created an attachment (id=7095)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=7095&action=view)
TEXLD swizzle nr 3
  
 
 


[Bug 8283] i915 can't swizzle TEX arguments

2006-09-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8283  
 




--- Additional Comments From [EMAIL PROTECTED]  2006-09-20 03:43 ---
Does this version work for you?  I've slightly cleaned things up:
  - move calculate into its own function
  - use the existing mesa call for the number of arguments of an instruction
  - just pass the bitmask to emit_texld so that code doesn't learn too much
about mesa's fragment program representation.

Keith  
 
 


Re: DRM memory manager on cards with hardware contexts

2006-09-20 Thread Benjamin Herrenschmidt

> I don't think so.
> We are not doing vram yet in the TTM code, but I think a general
> "eviction" would consist of
>
> 1) locking mmap_sems for all processes mapping the buffer.
> 2) zap the page table. Any attempt to access will be blocked by
> mmap_sem in nopage().
> 3) Copy contents from vram to system using either PCI SG or
> video-blit-AGP-flip-system.
> 4) Wait for completion.
> 5) release the mmap sem. The page table will be refilled using
> nopage().

On Cell, for SPU mappings, we don't scan through all processes mapping
it, we use unmap_mapping_range() which does it. However, after
double-checking, I have some doubts about the locking so I'm trying to
clarify that and I'll come back to you on whether it's actually a viable
solution or not.

Ben.





[PATCH] drm: Fix 'debug' sysfs permissions (CVE-2005-3179)

2006-09-20 Thread Sergey Vlasov
DRM creates a debug file in sysfs with world-readable and world-writable
permissions, which allows local users to enable DRM debugging and obtain
sensitive information.

Signed-off-by: Sergey Vlasov [EMAIL PROTECTED]
---
 linux-core/drm_stub.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/linux-core/drm_stub.c b/linux-core/drm_stub.c
index 4708222..c8c2906 100644
--- a/linux-core/drm_stub.c
+++ b/linux-core/drm_stub.c
@@ -48,7 +48,7 @@ MODULE_PARM_DESC(cards_limit, "Maximum n
 MODULE_PARM_DESC(debug, "Enable debug output");
 
 module_param_named(cards_limit, drm_cards_limit, int, S_IRUGO);
-module_param_named(debug, drm_debug, int, S_IRUGO|S_IWUGO);
+module_param_named(debug, drm_debug, int, S_IRUSR|S_IWUSR);
 
 drm_head_t **drm_heads;
 struct drm_sysfs_class *drm_class;
-- 
1.4.2.ge6fa
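
For reference, the mode macros touched by this patch can be compared in user space. The sketch below is an illustration, not part of the patch: S_IRUGO and S_IWUGO are kernel convenience macros that user-space headers normally lack, so they are rebuilt here from the POSIX bits, and the helper names `debug_mode_before()`, `debug_mode_after()` and `world_writable()` are hypothetical.

```c
#include <sys/stat.h>

/* Kernel-only convenience macros, rebuilt here from the POSIX bits. */
#ifndef S_IRUGO
#define S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)
#endif
#ifndef S_IWUGO
#define S_IWUGO (S_IWUSR | S_IWGRP | S_IWOTH)
#endif

/* Mode of the 'debug' parameter file before the patch: rw for everyone. */
static unsigned int debug_mode_before(void)
{
    return S_IRUGO | S_IWUGO;
}

/* Mode after the patch: rw for the owner (root) only. */
static unsigned int debug_mode_after(void)
{
    return S_IRUSR | S_IWUSR;
}

/* True if "other" users may write the file. */
static int world_writable(unsigned int mode)
{
    return (mode & S_IWOTH) != 0;
}
```

The old mode evaluates to octal 0666 and the new one to 0600, so after the change only the file owner can toggle DRM debugging through sysfs.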




[PATCH] drm: Replace SPIN_LOCK_UNLOCKED with proper spinlock initializers

2006-09-20 Thread Sergey Vlasov
Old-style spinlock initialization does not work with some kernel
patches, and is not fully compatible with lockdep included in 2.6.18.

Signed-off-by: Sergey Vlasov [EMAIL PROTECTED]
---
 linux-core/drm_memory_debug.c |    2 +-
 linux-core/drm_memory_debug.h |    2 +-
 linux-core/via_dmablit.c      |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/linux-core/drm_memory_debug.c b/linux-core/drm_memory_debug.c
index 2fe7aea..4dbb2cd 100644
--- a/linux-core/drm_memory_debug.c
+++ b/linux-core/drm_memory_debug.c
@@ -45,7 +45,7 @@ typedef struct drm_mem_stats {
 	unsigned long bytes_freed;
 } drm_mem_stats_t;
 
-static spinlock_t drm_mem_lock = SPIN_LOCK_UNLOCKED;
+static DEFINE_SPINLOCK(drm_mem_lock);
 static unsigned long drm_ram_available = 0;	/* In pages */
 static unsigned long drm_ram_used = 0;
 static drm_mem_stats_t drm_mem_stats[] = {
diff --git a/linux-core/drm_memory_debug.h b/linux-core/drm_memory_debug.h
index 706b752..9a498ab 100644
--- a/linux-core/drm_memory_debug.h
+++ b/linux-core/drm_memory_debug.h
@@ -43,7 +43,7 @@ typedef struct drm_mem_stats {
 	unsigned long bytes_freed;
 } drm_mem_stats_t;
 
-static spinlock_t drm_mem_lock = SPIN_LOCK_UNLOCKED;
+static DEFINE_SPINLOCK(drm_mem_lock);
 static unsigned long drm_ram_available = 0;	/* In pages */
 static unsigned long drm_ram_used = 0;
 static drm_mem_stats_t drm_mem_stats[] =
diff --git a/linux-core/via_dmablit.c b/linux-core/via_dmablit.c
index fdc2bd6..cbb7371 100644
--- a/linux-core/via_dmablit.c
+++ b/linux-core/via_dmablit.c
@@ -562,7 +562,7 @@ via_init_dmablit(drm_device_t *dev)
 		blitq->num_outstanding = 0;
 		blitq->is_active = 0;
 		blitq->aborting = 0;
-		blitq->blit_lock = SPIN_LOCK_UNLOCKED;
+		spin_lock_init(&blitq->blit_lock);
 		for (j=0; j<VIA_NUM_BLIT_SLOTS; ++j) {
 			DRM_INIT_WAITQUEUE(blitq->blit_queue + j);
 		}
-- 
1.4.2.ge6fa
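
The patch uses two initialization idioms: a definition-time macro for statically allocated locks and an init call for locks embedded in objects set up at run time. As a loose user-space analogy only (pthread mutexes are not kernel spinlocks, and `mem_lock`, `struct blit_queue`, `blit_queue_init()` and `account_ram()` are hypothetical names), the same split looks like this:

```c
#include <assert.h>
#include <pthread.h>

/* Statically allocated lock, initialized at definition time.
 * Analogous in spirit to DEFINE_SPINLOCK(drm_mem_lock). */
static pthread_mutex_t mem_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long ram_used;

/* Lock embedded in an object that is set up at run time.
 * Analogous in spirit to spin_lock_init(&blitq->blit_lock). */
struct blit_queue {
    pthread_mutex_t blit_lock;
    int num_outstanding;
};

/* Returns 0 on success, like pthread_mutex_init() itself. */
static int blit_queue_init(struct blit_queue *q)
{
    assert(q != NULL);
    q->num_outstanding = 0;
    return pthread_mutex_init(&q->blit_lock, NULL);
}

/* Adds pages to the global counter under the static lock. */
static unsigned long account_ram(unsigned long pages)
{
    unsigned long total;

    pthread_mutex_lock(&mem_lock);
    ram_used += pages;
    total = ram_used;
    pthread_mutex_unlock(&mem_lock);
    return total;
}
```

In both idioms the lock is set up through an initializer provided for that purpose rather than by assigning a constant to it, which is the pattern the patch moves DRM to.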




[Bug 8218] Errors during World of Warcraft stress test

2006-09-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8218  
 




--- Additional Comments From [EMAIL PROTECTED]  2006-09-20 12:58 ---
(In reply to comment #15)
> > > Incidentally, I saw that the r300 implements vbos in hardware. Is this an
> > > extra feature of the R300 hardware, or could the R200 hardware do this too?
> > No, r200 (and r100 too) can easily do that, with basically the same code as
> > r300 uses for that. It's on my TODO list...
>
> FWIW, I think it might be better to move all these drivers to the upcoming
> common memory manager instead of putting more effort into the r300 hack.
For sure. That is one of the reasons I didn't work on it: I'm waiting for these
drivers to magically migrate to the new memory manager ;-).
  
 
 


Re: DRM memory manager on cards with hardware contexts

2006-09-20 Thread Benjamin Herrenschmidt
  
> OK. It seems like mmap locks are needed even for
> unmap_mapping_range().

Well, I came to the opposite conclusion :) unmap_mapping_range() uses
the truncate count mechanism to guard against a racing no_page().

The idea is that:

no_page() itself internally takes the per-object lock/mutex, mostly as a
synchronisation point, before looking for the struct page, and releases it
before returning the struct page to do_no_page().

unmap_mapping_range() is called with that mutex/lock held (and the copy
is done with that held too).

That should work without taking the mmap_sem.

Now, of course, the real problem is that we don't have struct page for
vram. There are two ways out of this:

 - Enforce use of sparsemem and create struct page for vram. That will
probably make a few people jump out of their seats in x86 land but
that's what we do for cell and SPUs for now.

 - There's a proposal that I'm pushing to add a way for no_page() to
return a NOPAGE_RETRY error, which essentially causes it to go all the
way back to userland and re-do the access. I want that to be able to
handle signals while blocked inside no_page() but that could -also- be
used to have no_page() set up the PTE mappings itself and return
NOPAGE_RETRY, thus avoiding the need for a struct page. Now I do not
-ever- want to see drivers mucking around with PTEs directly, however,
we can provide something in mm/memory.c that a driver can call from
within no_page() to perform the set_pte() along with all the necessary
locking, flushes, etc... The base code for NOPAGE_RETRY should get in
2.6.19 soon (one of these days).
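
The guard against a racing fault described above can be pictured as a sequence-count recheck. The following is a minimal single-threaded model, not kernel code: `struct object`, `object_unmap_and_move()`, `object_fault()` and the `race_hook` callback are hypothetical names, and the hook exists only to inject an unmap between the lookup and the recheck.

```c
#include <assert.h>
#include <stddef.h>

enum backing { BACKING_VRAM, BACKING_SYSTEM };

/* Hypothetical object with a counter that every unmap bumps. */
struct object {
    unsigned long truncate_count;
    enum backing where;
};

/* Unmap side: invalidate outstanding lookups, then move the object. */
static void object_unmap_and_move(struct object *obj, enum backing dest)
{
    obj->truncate_count++;
    obj->where = dest;
}

/* Test hook type: simulates an unmap racing with the fault path. */
typedef void (*race_hook)(struct object *obj);

/* Example hook: evict the object to system memory. */
static void racing_evict(struct object *obj)
{
    object_unmap_and_move(obj, BACKING_SYSTEM);
}

/*
 * Fault side: sample the counter, do the lookup, then recheck the
 * counter before using the result. A changed counter means the lookup
 * may be stale, so the whole sequence is retried.
 */
static enum backing object_fault(struct object *obj, race_hook hook,
                                 unsigned int *retries)
{
    assert(obj != NULL && retries != NULL);
    *retries = 0;
    for (;;) {
        unsigned long seq = obj->truncate_count; /* sample first */
        enum backing found = obj->where;         /* the "lookup" */

        if (hook != NULL) {                      /* inject the race once */
            hook(obj);
            hook = NULL;
        }
        if (seq == obj->truncate_count)
            return found;                        /* lookup still valid */
        (*retries)++;                            /* stale: start over */
    }
}
```

Calling `object_fault(&obj, racing_evict, &retries)` on an object that starts in vram returns the system-memory placement after one retry, which is the behaviour the recheck is meant to guarantee. Real code would also need memory barriers and the per-object lock discussed above; both are left out of this model.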

Ben.




[Bug 8218] Errors during World of Warcraft stress test

2006-09-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8218  
 




--- Additional Comments From [EMAIL PROTECTED]  2006-09-20 17:05 ---
(In reply to comment #14)
> > This also looked interesting:
> > ==17123== Invalid write of size 1
> > ==17123==    at 0x40069A6: memcpy (mac_replace_strmem.c:394)
> > ==17123==    by 0x4156245: r200UploadTexImages (string3.h:51)
> > ==17123==    by 0x41580E9: r200UpdateTextureUnit (r200_texstate.c:1546)
> Yes, somewhat interesting. Unfortunately the output is a bit useless, what's
> up with that string3.h file? No idea on which line in r200UploadTexImages
> function this happens.

I removed the static qualifiers from a few functions, and now the valgrind
output looks like this:

==28760== Invalid write of size 1
==28760==    at 0x40069A6: memcpy (mac_replace_strmem.c:394)
==28760==    by 0x415582E: r200UploadRectSubImage (string3.h:51)
==28760==    by 0x4155F13: uploadSubImage (r200_texmem.c:322)
==28760==    by 0x41564D0: r200UploadTexImages (r200_texmem.c:516)
==28760==    by 0x4158689: r200UpdateTextureUnit (r200_texstate.c:1671)
==28760==    by 0x4158B9F: r200UpdateTextureState (r200_texstate.c:1793)
==28760==    by 0x414C6C3: r200ValidateState (r200_state.c:2372)
==28760==    by 0x4145640: r200MakeCurrent (r200_context.c:718)
==28760==    by 0x414233F: driBindContext (dri_util.c:343)
==28760==    by 0x4277CABB: (within /usr/lib/libGL.so.1.2)
==28760==    by 0x4277ECAE: glXMakeContextCurrent (in /usr/lib/libGL.so.1.2)
==28760==    by 0x4277EF42: glXMakeCurrent (in /usr/lib/libGL.so.1.2)
==28760==  Address 0xB7EEC183 is not stack'd, malloc'd or (recently) free'd
==28760==
==28760== Invalid write of size 1
==28760==    at 0x40069AC: memcpy (mac_replace_strmem.c:394)
==28760==    by 0x415582E: r200UploadRectSubImage (string3.h:51)
==28760==    by 0x4155F13: uploadSubImage (r200_texmem.c:322)
==28760==    by 0x41564D0: r200UploadTexImages (r200_texmem.c:516)
==28760==    by 0x4158689: r200UpdateTextureUnit (r200_texstate.c:1671)
==28760==    by 0x4158B9F: r200UpdateTextureState (r200_texstate.c:1793)
==28760==    by 0x414C6C3: r200ValidateState (r200_state.c:2372)
==28760==    by 0x4145640: r200MakeCurrent (r200_context.c:718)
==28760==    by 0x414233F: driBindContext (dri_util.c:343)
==28760==    by 0x4277CABB: (within /usr/lib/libGL.so.1.2)
==28760==    by 0x4277ECAE: glXMakeContextCurrent (in /usr/lib/libGL.so.1.2)
==28760==    by 0x4277EF42: glXMakeCurrent (in /usr/lib/libGL.so.1.2)
==28760==  Address 0xB7EEC182 is not stack'd, malloc'd or (recently) free'd
==28760==
==28760== Invalid write of size 1
==28760==    at 0x40069B3: memcpy (mac_replace_strmem.c:394)
==28760==    by 0x415582E: r200UploadRectSubImage (string3.h:51)
==28760==    by 0x4155F13: uploadSubImage (r200_texmem.c:322)
==28760==    by 0x41564D0: r200UploadTexImages (r200_texmem.c:516)
==28760==    by 0x4158689: r200UpdateTextureUnit (r200_texstate.c:1671)
==28760==    by 0x4158B9F: r200UpdateTextureState (r200_texstate.c:1793)
==28760==    by 0x414C6C3: r200ValidateState (r200_state.c:2372)
==28760==    by 0x4145640: r200MakeCurrent (r200_context.c:718)
==28760==    by 0x414233F: driBindContext (dri_util.c:343)
==28760==    by 0x4277CABB: (within /usr/lib/libGL.so.1.2)
==28760==    by 0x4277ECAE: glXMakeContextCurrent (in /usr/lib/libGL.so.1.2)
==28760==    by 0x4277EF42: glXMakeCurrent (in /usr/lib/libGL.so.1.2)
==28760==  Address 0xB7EEC181 is not stack'd, malloc'd or (recently) free'd
==28760==
==28760== Invalid write of size 1
==28760==    at 0x40069BD: memcpy (mac_replace_strmem.c:394)
==28760==    by 0x415582E: r200UploadRectSubImage (string3.h:51)
==28760==    by 0x4155F13: uploadSubImage (r200_texmem.c:322)
==28760==    by 0x41564D0: r200UploadTexImages (r200_texmem.c:516)
==28760==    by 0x4158689: r200UpdateTextureUnit (r200_texstate.c:1671)
==28760==    by 0x4158B9F: r200UpdateTextureState (r200_texstate.c:1793)
==28760==    by 0x414C6C3: r200ValidateState (r200_state.c:2372)
==28760==    by 0x4145640: r200MakeCurrent (r200_context.c:718)
==28760==    by 0x414233F: driBindContext (dri_util.c:343)
==28760==    by 0x4277CABB: (within /usr/lib/libGL.so.1.2)
==28760==    by 0x4277ECAE: glXMakeContextCurrent (in /usr/lib/libGL.so.1.2)
==28760==    by 0x4277EF42: glXMakeCurrent (in /usr/lib/libGL.so.1.2)
==28760==  Address 0xB7EEC180 is not stack'd, malloc'd or (recently) free'd
==28760==

string3.h is actually /usr/include/bits/string3.h, which defines memcpy() thus:

#define memcpy(dest, src, len) \
  ((__bos0 (dest) != (size_t) -1)   \
   ? __builtin___memcpy_chk (dest, src, len, __bos0 (dest)) \
   : __memcpy_ichk (dest, src, len))
static __always_inline void *
__NTH (__memcpy_ichk (void *__restrict __dest, __const void *__restrict __src,
  size_t __len))
{
  return 

[Bug 8218] Errors during World of Warcraft stress test

2006-09-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=8218  
 




--- Additional Comments From [EMAIL PROTECTED]  2006-09-20 17:23 ---
(In reply to comment #17)
> I removed the static qualifiers from a few functions, and now the valgrind
> output looks like this:
>
> ==28760== Invalid write of size 1
> ==28760==    at 0x40069A6: memcpy (mac_replace_strmem.c:394)
> ==28760==    by 0x415582E: r200UploadRectSubImage (string3.h:51)
> ==28760==    by 0x4155F13: uploadSubImage (r200_texmem.c:322)
> ==28760==    by 0x41564D0: r200UploadTexImages (r200_texmem.c:516)
Ah, I didn't realize before that this is for rectangular texture upload. So this
is again a write to the (not-allocated-here) indirect buffer area, thus probably
not a bug.
  
 
 