Re: new hyperz patch

2004-11-10 Thread Keith Whitwell
Roland Scheidegger wrote:
> This is a new version of the hyperz patch. It should have a better
> chance of running on all cards.

I get the feeling that hyperz was designed with a private z-buffer in mind.
It would be interesting to see if the X server can dynamically un-hyperz the
backbuffer when it detects a second context being created - have a look at the
TransitionTo/From functions in radeon_dri.c - currently they are mainly
concerned with page flipping.

> Also, Stephane has corrected the clearing problems that occurred when other
> windows were moved on top of a rendering window. However, for now clearing
> will clear all tiles up to the end of the window completely, so it is better
> not to try multiple apps or mixed hyperz/non-hyperz apps - you have been
> warned.
> Unfortunately, I have found no way to clear only the z or only the
> stencil buffer - the respective registers such as STENCILREFMASK and
> ZSTENCILCNTL do not seem to affect the hyperz clear call in any way,
> shape or form (and thus they are gone from the drm patch). I suspect
> they are only used when using the rasterizer for writing to the
> z/stencil buffer. This can obviously lead to rendering errors (try
> stencil_wrap with the attached patch to see what I mean - of course
> that's only useful if it ran correctly before...).

Typically this is resolved by clearing by drawing two big triangles to cover
the area, enabling only depth or stencil drawing as required.
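
As a rough illustration of what such a masked clear has to achieve, here is
a CPU-side model in C. It is only a sketch: it assumes a packed 32-bit
depth/stencil word with depth in the upper 24 bits and stencil in the lower
8, and the names masked_clear, clear_depth_only and clear_stencil_only are
made up for this example. It is not radeon driver code; on hardware the same
effect comes from drawing the two triangles with only the depth or only the
stencil writes enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: depth in the upper 24 bits, stencil in the lower 8. */
#define DEPTH_MASK   0xFFFFFF00u
#define STENCIL_MASK 0x000000FFu

/* Overwrite only the bits selected by write_mask in every pixel.
 * This is the effect a masked "clear quad" has on the buffer. */
static void masked_clear(uint32_t *buf, size_t count,
                         uint32_t value, uint32_t write_mask)
{
    assert(buf != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        buf[i] = (buf[i] & ~write_mask) | (value & write_mask);
}

/* Clear depth to a 24-bit value and leave the stencil bits untouched. */
static void clear_depth_only(uint32_t *buf, size_t count, uint32_t clear_depth)
{
    masked_clear(buf, count, clear_depth << 8, DEPTH_MASK);
}

/* Clear stencil to an 8-bit value and leave the depth bits untouched. */
static void clear_stencil_only(uint32_t *buf, size_t count, uint8_t clear_stencil)
{
    masked_clear(buf, count, clear_stencil, STENCIL_MASK);
}
```

A hyperz fast clear that ignores the write masks would correspond to calling
masked_clear with a mask of 0xFFFFFFFF, which is why it cannot stand in for
either of the two helpers.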

Keith
---
This SF.Net email is sponsored by:
Sybase ASE Linux Express Edition - download now for FREE
LinuxWorld Reader's Choice Award Winner for best database on Linux.
http://ads.osdn.com/?ad_id=5588&alloc_id=12065&op=click
--
___
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel


[Bug 1809] Still required to compile libGL without PIC on i386?

2004-11-10 Thread bugzilla-daemon
Please do not reply to this email: if you want to comment on the bug, go to
the URL shown below and enter your comments there.

https://freedesktop.org/bugzilla/show_bug.cgi?id=1809

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 02:08 ---
I'm not perfectly happy with ViewPerf, there's too much stuff going on behind
the scenes (and AFAIR it is quite CPU intensive as well).

I'll try to do some performance tests with machtest (my own benchmark) to
obtain some plain geometry transformation rates. Couldn't do that until now
because I didn't have a working GL driver until yesterday... :P
   
   
-- 
Configure bugmail: https://freedesktop.org/bugzilla/userprefs.cgi?tab=email 
  
   
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.




Re: R100 readpixels acceleration

2004-11-10 Thread Brian Paul
Stephane Marchesin wrote:
> Brian Paul wrote:
>> I'll be checking in a fix for this soon.
>> I've replaced the _swrast_clip_pixelrect() functions with two new
>> functions: _mesa_clip_readpixels() and _mesa_clip_drawpixels().  The
>> main difference is the latter one obeys the scissor rectangle.
>>
>> The DRI drivers might use _mesa_clip_readpixels() but a second stage
>> of clipping which takes place in screen space (instead of window
>> space) is needed too.  Otherwise, someone could try to read pixels
>> outside of the framebuffer's bounds.
>>
>> I'd suggest writing a screen-space clipping routine and putting it in
>> the src/mesa/drivers/common/ or src/mesa/drivers/dri/common/ directory.
>
> So, did you fix this yourself ?

I didn't touch any DRI code.

> In any case, you might want to remove that fprintf near the end of the
> check_color function.

I guess I'd like to see the clipping issues resolved before checking
in the patch.
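
For readers following the clipping discussion, a screen-space clipping
routine of the kind suggested above might look roughly like the sketch
below. The struct rect type and the name clip_rect_to_bounds are
hypothetical and not part of Mesa; the sketch assumes the rectangle has
already been translated from window space to screen space and that the
framebuffer origin is (0, 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type: origin (x, y) and size (w, h) in pixels. */
struct rect { int x, y, w, h; };

/* Clip *r in place against a framebuffer of bounds_w x bounds_h pixels
 * whose origin is (0, 0). Returns false, leaving *r unchanged, if
 * nothing of the rectangle lies inside the framebuffer. */
static bool clip_rect_to_bounds(struct rect *r, int bounds_w, int bounds_h)
{
    assert(r != NULL && bounds_w >= 0 && bounds_h >= 0);

    int x0 = r->x, y0 = r->y;
    int x1 = r->x + r->w;   /* exclusive right edge */
    int y1 = r->y + r->h;   /* exclusive bottom edge */

    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 > bounds_w) x1 = bounds_w;
    if (y1 > bounds_h) y1 = bounds_h;

    if (x1 <= x0 || y1 <= y0)
        return false;        /* completely outside */

    r->x = x0;
    r->y = y0;
    r->w = x1 - x0;
    r->h = y1 - y0;
    return true;
}
```

A real readpixels path would also have to advance the destination pointer
by however many rows and columns were clipped off the left and top edges;
that bookkeeping is left out here.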

-Brian


[Bug 1809] Still required to compile libGL without PIC on i386?

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1809

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 08:28 ---
There's a pretty neat newish benchmarker called FrameGetter. You can read some
more about it at http://www.anandtech.com/linux/showdoc.aspx?i=2218 and download
it from http://www.anandtech.com/linux/showdoc.aspx?i=2229&p=2.
   
   


[Bug 1819] New: Add AGP 8x support to radeon

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1819

           Summary: Add AGP 8x support to radeon
           Product: DRI
           Version: XOrg CVS
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DDX drivers
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


Patch from Hui Yu and Michel Daenzer:
http://penguinppc.org/~daenzer/DRI/radeon-agp8x.diff

At some point we should just commit this patch or a variation on it.  Too many
users get bitten by having 8x AGP set up by the BIOS, but only 4x in the radeon
driver.  I seriously doubt it'll break anything.  Unfortunately my mobo only
does 4x or I'd test 8x support.  If it causes problems it can always be
reverted, or better yet, fixed.  Thoughts?
   
   


Re: R300 with xorg-x11-6.8.0

2004-11-10 Thread Nicolai Haehnle
On Wednesday 10 November 2004 08:10, eGore wrote:
> Hi list,
>
> I ran into some trouble getting DRI running with r300 (I have no idea if
> it is already supported or not), but it didn't work. I looked at xorg's
> logfile and found out that DRI was disabled and I also found out that
> this is caused by radeon_accelfuncs.c. So I wrote the attached patch
> to get around that. DRI does still not work, but at least my xorg log
> tells me it does :-)
>
> !! WARNING !!
> I have no idea what I'm doing, so this might be completely wrong :-)
> !! WARNING !!

You're confusing things. On the one hand, there's general support for DRI 
(and support for client side 3D acceleration), and on the other hand 
there's hardware acceleration for the Render extension. The two things are 
only loosely connected.
The R300 efforts are directed towards creating client side 3D acceleration 
(Render acceleration is a very specialised and limited subset of what the 
3D driver does), so your patch is both unnecessary and wrong, because both 
the R100 and the R200 Render acceleration paths cannot possibly work on an 
R300.

Client side 3D acceleration works without Render acceleration, where
"works" means that Clear() operations are accelerated, and I have some code
for hardware rasterization of untextured primitives which always locks up
and which I haven't found the time to fix yet.

cu,
Nicolai

> PS: The webpage of r300 is missing a tutorial ;)
> PPS: Patch has been applied for an already patched file, I guess, so line
> numbers might be completely wrong.
> PPPS: I used xorg-x11-6.8.1 from gentoo Linux (xorg-x11-6.8.0-r1 to be
> exact)
> PPPPS: I'm having a Radeon 9700 Pro from ELSA.
>
> That's it for now, regards,
>   Christoph Brill aka egore




[Bug 1809] Still required to compile libGL without PIC on i386?

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1809

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 10:12 ---
Ok, some numbers from my side: 4x4 size triangles, color per vertex.
i915 with i810 driver, R200 with radeon driver. Tested with machtest
(http://www.vis.uni-stuttgart.de/machtest/ - don't tell me that it's old, I know
that myself =P ).

                                  i915 nopic  i915 pic  R200 nopic  R200 pic
glVertex3f() w/o lighting:        2.43 M/s    2.44 M/s  5.46 M/s    5.46 M/s
glVertex3f() w/ 7 lights:          645 k/s     643 k/s  1.89 M/s    1.88 M/s
glVertexArray() size 900 w/o l:   3.82 M/s    3.82 M/s  4.72 M/s    4.72 M/s

So the difference is much smaller than the typical jitter between different
calls. The more vertices are sent down the framebuffer, the higher the
relative overhead of pic vs. nopic. So I'm especially happy to see the R200 w/
glVertex3f() equally fast.

Unfortunately, I cannot present results from modern cards, as
a) the nvidia driver only works with its own libGL
b) the beta fglrx driver for Xorg only works with its own libGL (former
   versions did work with mesa libGL) - mesa's libGL just says
   'direct rendering: no' - and vice versa! That is, the radeon driver does
   not work with the fglrx libGL!
Besides, the beta fglrx driver is broken WRT other aspects as well (e.g. array
rendering of arrays 900 vertices does not work as anticipated).

Conclusion: there is *no* measurable performance difference between pic and nopic.

Nuke the nopic config.

   
   


[Bug 1809] Still required to compile libGL without PIC on i386?

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1809

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 10:14 ---
(In reply to comment #2)
> (In reply to comment #0)
> >            non-PIC   PIC
> > drv-09     3.928     3.282
> the others show insignificant differences, but I would say the 20% here is
> quite a large difference.

I guess this was typical jitter. Stefan told me that this was a quick test.
There is nothing in libGL that requires more function calls than rendering
triangles with one call per vertex (glVertex3f()).
   
   


[Bug 1822] New: libGL (and DRI drivers) should support TLS

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1822

           Summary: libGL (and DRI drivers) should support TLS
           Product: DRI
           Version: DRI CVS
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: libGL
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


Using the TLS segment to store thread-specific data, such as the current context
pointer and the current dispatch table pointer, would be much more efficient
than the current method that uses pthread_{get,set}specific.
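
To make the comparison concrete, here is a minimal sketch of the two
approaches side by side. It is illustrative only: the names ctx_key,
set_ctx_key, get_ctx_key, tls_ctx, set_ctx_tls and get_ctx_tls are invented
for this example and do not come from libGL, and the __thread keyword is a
GCC/Clang extension (C11 spells it _Thread_local).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Variant 1: POSIX thread-specific data. Every access is a library call. */
static pthread_key_t ctx_key;
static pthread_once_t ctx_key_once = PTHREAD_ONCE_INIT;

static void make_ctx_key(void)
{
    int rc = pthread_key_create(&ctx_key, NULL);
    assert(rc == 0);
    (void)rc;
}

static void set_ctx_key(void *ctx)
{
    pthread_once(&ctx_key_once, make_ctx_key);
    pthread_setspecific(ctx_key, ctx);
}

static void *get_ctx_key(void)
{
    pthread_once(&ctx_key_once, make_ctx_key);
    return pthread_getspecific(ctx_key);
}

/* Variant 2: a thread-local variable. The compiler emits a direct load
 * relative to the thread pointer, with no function call involved. */
static __thread void *tls_ctx;

static void set_ctx_tls(void *ctx) { tls_ctx = ctx; }
static void *get_ctx_tls(void)     { return tls_ctx; }
```

Both variants give each thread its own slot; the difference lies in how
much work each read costs, which matters for functions that are called once
per vertex.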
   
   


[Bug 1822] libGL (and DRI drivers) should support TLS

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1822

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 12:21 ---
Created an attachment (id=1276)
 --> (https://freedesktop.org/bugzilla/attachment.cgi?id=1276&action=view)
Patch to add TLS support

With this patch, when GLX_USE_TLS is defined (at build time), libGL and the DRI
drivers will be built to use TLS.  Using TLS changes the libGL / DRI driver
interface in the following ways:

1. _glapi_RealDispatch no longer exists.  I don't think it was ever accessed
outside libGL.so (or outside glapi.o, for that matter), so this should have no
impact.

2. _glapi_Dispatch, _glapi_DispatchTSD, and _glapi_Context are all now
constant.  _glapi_Dispatch always points to the threadsafe table, and the
other two are NULL.

3. _glapi_tls_Dispatch is the new thread-local dispatch table pointer.  It
should never be NULL.

4. _glapi_tls_Context is the new thread-local context pointer.
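
One way to keep a thread-local dispatch pointer from ever being NULL, as
point 3 requires, is to start every thread off pointing at a table of
no-op functions. The sketch below shows that idea under stated assumptions:
the two-entry struct dispatch_table, noop_table, recording_table,
set_dispatch and my_Color3fv are all hypothetical stand-ins, not the names
or layout used by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-entry dispatch table; a real one has hundreds. */
struct dispatch_table {
    void (*Color3fv)(const float *v);
    void (*Normal3fv)(const float *v);
};

/* Fallback used when no context is current: silently do nothing. */
static void noop_vec(const float *v) { (void)v; }
static const struct dispatch_table noop_table = { noop_vec, noop_vec };

/* Stand-in for a driver function: remembers the last red component. */
static float last_red;
static void record_color(const float *v) { last_red = v[0]; }
static const struct dispatch_table recording_table = { record_color, noop_vec };

/* Every thread starts on the no-op table, so the pointer is never NULL. */
static __thread const struct dispatch_table *tls_dispatch = &noop_table;

/* Passing NULL means "no current context" and restores the no-op table. */
static void set_dispatch(const struct dispatch_table *t)
{
    tls_dispatch = t ? t : &noop_table;
}

/* Entry point: no NULL test is needed before the indirect call. */
static void my_Color3fv(const float *v)
{
    assert(tls_dispatch != NULL);
    tls_dispatch->Color3fv(v);
}
```

The benefit of this arrangement is that the hot path in the entry point is
a load and an indirect call, with no branch on whether a context is bound.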

This patch is an updated version of a much older patch that I posted to the
list a long time ago.  The old version got a lot of testing, but this version
has had only minimal testing.  It's here for review.
   
   


[Bug 1822] libGL (and DRI drivers) should support TLS

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1822

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1276 is|0                           |1
           obsolete|                            |

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 14:09 ---
Created an attachment (id=1278)
 --> (https://freedesktop.org/bugzilla/attachment.cgi?id=1278&action=view)
Updated TLS support patch

This version of the patch actually compiles and links (sorry).  It adds a new
feature.  If HAVE_ALIAS is defined at build time, all dispatch functions that
resolve to the same table entry (e.g., glPointParameterf, glPointParameterfARB,
glPointParameterfEXT, and glPointParameterfSGIS) will all resolve to the same
address.  In my build, this saves about 3KiB in libGL.so (when debug info
is also removed).
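
For readers unfamiliar with symbol aliasing, the sketch below shows the
general mechanism using the GCC/Clang alias attribute on ELF targets. It is
an assumption that the patch uses something along these lines; the my_
prefixed names and the point_size_min variable are invented for the example
and the patch's actual stubs may be generated quite differently.

```c
#include <assert.h>

/* Stand-in for real GL state touched by the call. */
static float point_size_min;

/* The one real implementation. */
void my_PointParameterf(int pname, float param)
{
    (void)pname;
    assert(param >= 0.0f);   /* point sizes are non-negative */
    point_size_min = param;
}

/* The extension-suffixed names become additional symbols for the same
 * code, so no extra stub bodies are emitted for them. */
void my_PointParameterfARB(int pname, float param)
    __attribute__((alias("my_PointParameterf")));
void my_PointParameterfEXT(int pname, float param)
    __attribute__((alias("my_PointParameterf")));
```

Without aliasing, each suffixed name would need its own small function body
that jumps through the same dispatch slot, which is where the size saving
would come from.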
   
   


Re: new hyperz patch

2004-11-10 Thread Roland Scheidegger
Keith Whitwell wrote:
> Roland Scheidegger wrote:
>> This is a new version of the hyperz patch. It should have a better
>> chance of running on all cards.
>
> I get the feeling that hyperz was designed with a private z-buffer in
> mind. It would be interesting to see if the X server can dynamically
> un-hyperz the backbuffer when it detects a second context being
> created - have a look at the TransitionTo/From functions in
> radeon_dri.c - currently they are mainly concerned with page
> flipping.

In fact, that was already discussed briefly on IRC. For now it just
seemed more important to get it working on more cards and fix the 
rendering problems than to worry about minor issues like multiple 
rendering apps :). I did get clearing only the needed tiles working 
(minus some off-by-one issues currently) though at least on my rv250 
(with a 32x8 z pixel granularity however - for clearing the hardware 
seems to work on 8x2 4x4 macro tiles, it looks like it might be even 
possible to get granularity down to 4x4 for clearing, at least on my card).
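
The "only the needed tiles" computation can be sketched as follows. This is
not the driver's code: the 32x8 tile size is taken from the rv250 figure
mentioned above and may differ on other chips, and struct tile_range and
tiles_for_rect are names made up for the illustration.

```c
#include <assert.h>

/* Assumed clear granularity in pixels (the rv250 figure quoted above). */
#define TILE_W 32
#define TILE_H 8

/* Hypothetical result type: a block of tiles in tile coordinates. */
struct tile_range { int first_col, first_row, num_cols, num_rows; };

/* Return the tiles touched by the pixel rectangle [x, x+w) x [y, y+h).
 * The last tile is derived from the last covered pixel (x + w - 1), not
 * from x + w; mixing those up is a classic off-by-one. */
static struct tile_range tiles_for_rect(int x, int y, int w, int h)
{
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);

    struct tile_range t;
    t.first_col = x / TILE_W;
    t.first_row = y / TILE_H;
    t.num_cols  = (x + w - 1) / TILE_W - t.first_col + 1;
    t.num_rows  = (y + h - 1) / TILE_H - t.first_row + 1;
    return t;
}
```

A window at (40, 4) of size 30x10 touches tile columns 1 and 2 and tile
rows 0 and 1, so four tiles get cleared even though the window covers fewer
pixels than a single tile holds.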

> Typically this is resolved by clearing by drawing two big triangles
> to cover the area, enabling only depth or stencil drawing as
> required.

That's what I was thinking too, but I'd certainly avoid it if possible - 
that's just basically switching hyperz off (well at least fast z-clear, 
z-buffer compression ought to be still possible). I just don't know if 
the hardware (could even be different for rv cards?) can do it or not. 
And even with the latest fglrx driver I was unsuccessful in measuring
clear performance with clearspd, I've even converted it to SDL instead 
of GLUT (due to fullscreen problems) and fglrx still obviously never 
used fast z-clear - dri is now 80 times faster than fglrx, isn't that 
nice ;).
Though some optimization could be done in the driver (i.e. if only 
stencil buffer is cleared, but between the last z-buffer clear and this 
clear the z-buffer was never enabled for writing, then still hyperz 
clear could be used, though care would have to be taken so the right z 
clear value is still used and similar issues). It probably means that 
z/stencil clear can only be done with hyperz clear if the stencil write 
mask contains all stencil bits too. That's lots of possibilities for 
clear fallbacks...
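
The fallback rules speculated about above can be written down as a small
decision function. This is one reading of the paragraph, not tested driver
logic: it assumes an 8-bit stencil buffer, assumes the fast clear always
overwrites both depth and stencil, and uses the hypothetical names
struct clear_request and can_use_hyperz_clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one clear call plus relevant history. */
struct clear_request {
    bool    clear_depth;
    bool    clear_stencil;
    uint8_t stencil_write_mask;        /* 0xFF means all stencil bits */
    bool    depth_written_since_clear; /* z writes since last z clear? */
};

/* Decide whether the fast clear, which ignores write masks and always
 * touches both depth and stencil, gives the same result as a masked
 * clear would. */
static bool can_use_hyperz_clear(const struct clear_request *req)
{
    assert(req != NULL);

    /* A partial stencil mask cannot be honoured by the fast clear. */
    if (req->clear_stencil && req->stencil_write_mask != 0xFF)
        return false;

    /* Both buffers requested: fast clear does exactly that. */
    if (req->clear_depth && req->clear_stencil)
        return true;

    /* Stencil only: acceptable if depth still holds its clear value, so
     * re-clearing it (to the same value) changes nothing. */
    if (req->clear_stencil)
        return !req->depth_written_since_clear;

    /* Depth only: would destroy stencil contents, so fall back. */
    return false;
}
```

In the stencil-only case the caller would still have to pass the previous
depth clear value to the hardware, which is the "right z clear value" caveat
from the paragraph above.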

Roland


[Bug 1822] libGL (and DRI drivers) should support TLS

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1822

[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
Attachment #1278 is|0                           |1
           obsolete|                            |

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 18:43 ---
Created an attachment (id=1279)
 --> (https://freedesktop.org/bugzilla/attachment.cgi?id=1279&action=view)
tls-support-3.patch

src/glx/x11/dispatch.c wouldn't build with the last patch under 'linux-dri',
needs to have glthread.h included too.  probably we don't need two copies of
dispatch.c...

builds, more or less works.  size comparison:

linux-dri, no TLS:
   text    data     bss     dec     hex filename
 437017   27572    5204  469793   72b21 libGL.so
1690845   84148  146832 1921825  1d5321 r200_dri.so

linux-dri, TLS:
   text    data     bss     dec     hex filename
 417798   27576    5176  450550   6dff6 libGL.so
1680789   84152  146832 1911773  1d2bdd r200_dri.so

linux-dri-x86, no TLS:
   text    data     bss     dec     hex filename
 406661    8920    5204  420785   66bb1 libGL.so
1777602   72004  146832 1996438  1e7696 r200_dri.so

linux-dri-x86, TLS:
   text    data     bss     dec     hex filename
 379890    8940    5176  394006   60316 libGL.so
1761114   72008  146832 1979954  1e3632 r200_dri.so

so we save about 17-23KiB on libGL, and 9KiB on r200_dri.so on all
configurations.

typical api_speed runs:

linux-dri, no TLS:
100 calls to         glColor3fv required 66275372 cycles.
100 calls to        glNormal3fv required 67313061 cycles.
100 calls to      glTexCoord2fv required 67436552 cycles.
100 calls to      glTexCoord3fv required 68557680 cycles.
100 calls to glMultiTexCoord2fv required 78982018 cycles.
100 calls to  glMultiTexCoord2f required 81005552 cycles.
100 calls to       glFogCoordfv required 65530834 cycles.
100 calls to        glFogCoordf required 66423760 cycles.

linux-dri, TLS:
100 calls to         glColor3fv required 88917879 cycles.
100 calls to        glNormal3fv required 87621478 cycles.
100 calls to      glTexCoord2fv required 88670172 cycles.
100 calls to      glTexCoord3fv required 88722107 cycles.
100 calls to glMultiTexCoord2fv required 93348227 cycles.
100 calls to  glMultiTexCoord2f required 99807264 cycles.
100 calls to       glFogCoordfv required 86875659 cycles.
100 calls to        glFogCoordf required 88212488 cycles.

linux-dri-x86, no TLS:
100 calls to         glColor3fv required 35566732 cycles.
100 calls to        glNormal3fv required 34631503 cycles.
100 calls to      glTexCoord2fv required 33493758 cycles.
100 calls to      glTexCoord3fv required 35927528 cycles.
100 calls to glMultiTexCoord2fv required 43212345 cycles.
100 calls to  glMultiTexCoord2f required 40598730 cycles.
100 calls to       glFogCoordfv required 30849407 cycles.
100 calls to        glFogCoordf required 31340343 cycles.

linux-dri-x86, TLS:
100 calls to         glColor3fv required 60313505 cycles.
100 calls to        glNormal3fv required 60259834 cycles.
100 calls to      glTexCoord2fv required 59010398 cycles.
100 calls to      glTexCoord3fv required 59690089 cycles.
100 calls to glMultiTexCoord2fv required 66056490 cycles.
100 calls to  glMultiTexCoord2f required 67972318 cycles.
100 calls to       glFogCoordfv required 54717279 cycles.
100 calls to        glFogCoordf required 54752466 cycles.

this is worrying.  why should it be this much slower?

also, the TLS libs make quake3 hang the machine.  might be able to blame that
on using a hacked hyperz drm module though.
   
   


[Bug 1822] libGL (and DRI drivers) should support TLS

2004-11-10 Thread bugzilla-daemon
https://freedesktop.org/bugzilla/show_bug.cgi?id=1822

--- Additional Comments From [EMAIL PROTECTED]  2004-11-10 23:48 ---
I just realized why the TLS version is likely slower.  In Jakub's original TLS
patch, he modified the dispatch stubs to call a local function to get the
dispatch pointer.  In TLS mode, this means a function that does something like:

get_dispatch:
        movl    %gs:[EMAIL PROTECTED], %eax
        ret

I seem to recall that this was done to eliminate a bunch of relocs.  Perhaps
something related to prelink?  Dunno.  The code that calls it is actually a
'call get_dispatch; nop'.  The intention was that other code in libGL could copy
the get_dispatch function inline.  This is what my patch also does.

The overhead of the CALL / RET pair is likely higher than the MOV / TEST / JNE
in the non-TLS dispatch stubs.  On my box, inlining get_dispatch reduces
glColor3fv from ~107 cycles to ~61 cycles.
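
At the C level, the difference between the two stub styles looks roughly
like the sketch below. It is an illustration only: the real stubs are
hand-written assembly, the one-entry struct dispatch_table and the names
set_dispatch, get_dispatch_call and get_dispatch_inline are hypothetical,
and noinline is a GCC/Clang attribute. No timings are implied by the code
itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-entry dispatch table. */
struct dispatch_table {
    void (*Color3fv)(const float *v);
};

/* Per-thread pointer to the current dispatch table. */
static __thread const struct dispatch_table *tls_dispatch;

static void set_dispatch(const struct dispatch_table *t)
{
    assert(t != NULL);
    tls_dispatch = t;
}

/* Out-of-line getter: every stub that uses it pays for a CALL and a RET
 * around what is a single thread-local load. */
__attribute__((noinline))
static const struct dispatch_table *get_dispatch_call(void)
{
    return tls_dispatch;
}

/* Inline getter: the thread-local load is emitted directly in the
 * caller, with no call overhead. */
static inline const struct dispatch_table *get_dispatch_inline(void)
{
    return tls_dispatch;
}
```

Both getters return the same pointer; only the instruction sequence at the
call site differs, which is the cost being discussed above.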
   
   