Re: [Dri-devel] little fix for install.sh
> Sergey, sorry for replying only now. For some time I was planning to apply

That's OK. Everyone understands that actual work on the driver has incomparably higher priority.

> The problem is that the existence of those dri-old files is the only proof that anything was installed before. If we default to deleting files when there is no backup, then someone mistakenly trying to restore with no previous installation (or, e.g., an accidental second restore) will delete all the drivers, and I think that this can be far more annoying than having a dummy mach64.o floating around.

I see the point. My only concern is that when I do a restore, I do not actually change the DRI section of my XF86Config. So I am a bit afraid that XFree would try to use the module (without the other parts of the snapshot). If you tell me it's not a danger, I am ready to forget about all this stuff.

Cheers,
Sergey

PS Hope you won't forget to announce AGP-mapping-enabled binaries - I am eager to hand my laptop.

___
Have big pipes? SourceForge.net is looking for download mirrors. We supply the hardware. You get the recognition. Email Us: [EMAIL PROTECTED]
___
Dri-devel mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [Dri-devel] Website
On 2002.05.15 23:17 Ian Molton wrote:
On Wed, 15 May 2002 22:42:25 +0100 José Fonseca [EMAIL PROTECTED] wrote:
Ian,
On 2002.05.15 21:39 Ian Molton wrote:
...
Here are some things I think should be worked on:

1) I *NEED* info about what cards have what features supported by DRI, like the radeon driver does already.

I think that more important than raw data is to establish an easily parsable format for that (either XML, or plain comma-separated values) and a template for the property sheet so that it can be easily maintained.

I agree. I just want the values for the 'first pass'. Do you know if the server is running any database or such? What facilities do I have with which to play?

Take a look at the SF documents (http://sourceforge.net/docman/?group_id=1), section 7, especially http://sourceforge.net/docman/display_doc.php?docid=4297&group_id=1 . You'll probably want to reuse much of the stuff that is on the site. I don't know what scripting language that is in.

3) Can people help me compile a list of janitorial tasks that could be undertaken by new developers? Perhaps installation cleanups, or trivial driver tidyups?

I don't think that we should frustrate the potential new developers' expectations with janitorial tasks.

I wasn't suggesting that we foist jan. tasks on them specifically, but I do know that having a big selection of tasks will encourage a broader range of newbies. Plus it gives a stepping-up point for the lesser skilled programmers out there.

I'm of the opinion that if one can code then one should do it. The janitorial tasks should be taken by users who want to alleviate the developers' load so that they code more.

Well, let's present all our little tasks, so people can pick and choose.

I think that the two most rewarding things that a potential developer can do are enhancing the documentation and/or bugfixing. Both these tasks help to quickly build the know-how required to start working on missing features. They give you a better understanding of the architecture and the code.

Definitely.

I can be more specific:

* Documentation:
  - update the existing documents
  - change the code comments to Doxygen so that we can automatically generate reference manuals of the subsystems and the drivers
  - add more items to the Developers' FAQ
* Bugfixing:
  - Basically pick a bug on the card of one's choice and hunt it down!

Or, pick an unimplemented feature and let rip :) All the more reason to need feature lists.

4) Can people who have information about the intimate details of card HARDWARE (e.g. register locations, DMA engines, etc.) please send me them, so that I can add them to the developer documents?

Mike Harris recently posted a link with this information for 3DFX, but this is a rare exception. Most of the documents that the current developers have are under NDA (see http://dri.sourceforge.net/doc/faq/getting-started.html#NDA).

Indeed I shall.

5) Can someone with a nice package re-draw the DRI flowcharts? The Precision Insight ones look like Fischer Price 'my first flowchart'...

I kinda like their style... ;p

Perhaps they are OK for a conference, but I think they need prettying for normal developers. They /don't/ show layering clearly.

Well, even more important, they are slightly outdated too. Jens posted an evaluation of the state of the documentation some months ago. I attached my local copy (which is meant to be included in the FAQ references eventually).

dia and/or sodipodi are two nice applications for this. (I'm a gnome desktop user).

So, let's get to it :)

Good luck,

Yep.

José Fonseca

David Johnson wrote:
Could the DRI experts offer some feedback on the relevance and usefulness of the following documents? Are they reasonably up to date? Are they moderately up to date? Are they out of date and not extremely useful with respect to the current DRI architecture?

I'll throw in my two cents...

1. Introduction to the Direct Rendering Infrastructure - Brian Paul, August 2000
   http://dri.sourceforge.net/doc/DRIintro.html

Reasonably up to date. Decent Intro.

2. Dri term glossary.
   http://dri.sourceforge.net/doc/glossary.html
   I suspect this doesn't need to be changed but is there anything that should be added?

Reasonably up to date--except the author, Nathan, no longer works for VALinux, so his e-mail address is invalid. Useful.

3. Data flow diagram
   http://dri.sourceforge.net/doc/data_flow.jpg

Very high level. Good for someone brand new to the concept of direct rendering--we'd use this at trade shows to explain what we are doing at a very high level.

4. Control flow diagram
   http://dri.sourceforge.net/doc/control_flow.jpg

This is moderately up
Re: [Dri-devel] little fix for install.sh
On 2002.05.15 22:34 Jan Schmidt wrote:
> <quote who="José Fonseca">
> ...
> The problem is that the existence of those dri-old files are the only proof that anything was installed before. If we default to delete file when there is no backup then someone mistakenly trying to restore with no previous installation (or, e.g., accidental second restore) will delete all the drivers, and I think that this can be far more annoying than having a dummy mach64.o floating around.
> </quote>
>
> But, if before you had:
>   fileA, fileB
> then install
>   fileA, fileB, fileC
> when you restore, it will copy the fileA and fileB back from dri-old, but doesn't remove fileC... so you haven't really restored the system to the way it was in that case.

Yep. I know.. but although the fileC remains it has no effect. This affects mostly new drivers which aren't supported in the stock distro (e.g., Mach64). The original fileA and fileB have no knowledge of the fileC so it won't be used.

> Possibly, you can say that the stray file doesn't matter, since if they didn't have it before, it won't be used in the config, but how about:
>
>   for FILE in *o; do
>     if [ -f $XF86_DRV_DIR/$FILE ]; then
>       mv -f $XF86_DRV_DIR/$FILE $XF86_DRV_DIR/dri-old.$FILE >> $LOGFILE
>       rm -f $XF86_DRV_DIR/dri-old.n.$FILE >> $LOGFILE
>     else
>       rm -f $XF86_DRV_DIR/dri-old.$FILE >> $LOGFILE
>       touch $XF86_DRV_DIR/dri-old.n.$FILE >> $LOGFILE
>     fi
>   done
>
> Then, to restore, check for the existence of dri-old.n.$FILE and remove the installed version if that exists...
>
>   for FILE in *o; do
>     if [ -f $XF86_DRV_DIR/dri-old.$FILE ]; then
>       mv -f $XF86_DRV_DIR/dri-old.$FILE $XF86_DRV_DIR/$FILE >> $LOGFILE
>     else
>       if [ -f $XF86_DRV_DIR/dri-old.n.$FILE ]; then
>         rm -f $XF86_DRV_DIR/$FILE >> $LOGFILE
>         # Clean up touch file
>         rm -f $XF86_DRV_DIR/dri-old.n.$FILE >> $LOGFILE
>       fi
>     fi
>   done

Yes, this would do it. When I have some time I'll use your ideas to make install and uninstall bash functions to be reused across the install.sh script.
Then I'll also try to address another problem, which is to keep two consecutive installs from irreversibly removing the original drivers...

José Fonseca
[Dri-devel] Radeon corruption with multiple contexts
Hi folks,

Just a quick query about the state of the Radeon driver, as it appears in XFree86-4.2. Seems to work like a dream, unless you have simultaneous rendering to multiple contexts/windows, in which case there's huge corruption. It's easy to see, at least on my system (Mobility 7500/M7, XFree86-4.2, drm from 2.4.18 kernel). Just fire up two copies of gears:

  /usr/X11R6/lib/xscreensaver/gears
  /usr/X11R6/lib/xscreensaver/gears

I've seen similar reports on this mailing list from 2001, and also in the SF bug database. Sorry I haven't tried CVS yet, perhaps this one is already fixed? Otherwise, I'd be happy to test any patches, though I regret I don't have the expertise to try to figure out what's wrong myself - sorry!

Andrew Gee
Re: [Dri-devel] Radeon corruption with multiple contexts
[EMAIL PROTECTED] wrote:
> Hi folks,
>
> Just a quick query about the state of the Radeon driver, as it appears in XFree86-4.2. Seems to work like a dream, unless you have simultaneous rendering to multiple contexts/windows, in which case there's huge corruption. It's easy to see, at least on my system (Mobility 7500/M7, XFree86-4.2, drm from 2.4.18 kernel). Just fire up two copies of gears:
>
>   /usr/X11R6/lib/xscreensaver/gears
>   /usr/X11R6/lib/xscreensaver/gears
>
> I've seen similar reports on this mailing list from 2001, and also in the SF bug database. Sorry I haven't tried CVS yet, perhaps this one is already fixed? Otherwise, I'd be happy to test any patches, though I regret I don't have the expertise to try to figure out what's wrong myself - sorry!
>
> Andrew Gee

This is fixed in the most recent kernel modules in the tcl branch at least. I'll check if the fix has been propagated.

Keith
Re: [Dri-devel] Radeon corruption with multiple contexts
On Thu, 2002-05-16 at 17:12, [EMAIL PROTECTED] wrote:
> Just a quick query about the state of the Radeon driver, as it appears in XFree86-4.2. Seems to work like a dream, unless you have simultaneous rendering to multiple contexts/windows, in which case there's huge corruption. It's easy to see, at least on my system (Mobility 7500/M7, XFree86-4.2, drm from 2.4.18 kernel). Just fire up two copies of gears:
>
>   /usr/X11R6/lib/xscreensaver/gears
>   /usr/X11R6/lib/xscreensaver/gears

Just tried this with current DRI CVS trunk, seems to work fine. Beware that DRI in XFree86 CVS may still be basically the same as in 4.2.

--
Earthling Michel Dänzer (MrCooper)/ Debian GNU/Linux (powerpc) developer
XFree86 and DRI project member / CS student, Free Software enthusiast
Re: [Dri-devel] Radeon corruption with multiple contexts
Keith Whitwell wrote:
> [EMAIL PROTECTED] wrote:
>> Hi folks,
>>
>> Just a quick query about the state of the Radeon driver, as it appears in XFree86-4.2. Seems to work like a dream, unless you have simultaneous rendering to multiple contexts/windows, in which case there's huge corruption. It's easy to see, at least on my system (Mobility 7500/M7, XFree86-4.2, drm from 2.4.18 kernel). Just fire up two copies of gears:
>>
>>   /usr/X11R6/lib/xscreensaver/gears
>>   /usr/X11R6/lib/xscreensaver/gears
>>
>> I've seen similar reports on this mailing list from 2001, and also in the SF bug database. Sorry I haven't tried CVS yet, perhaps this one is already fixed? Otherwise, I'd be happy to test any patches, though I regret I don't have the expertise to try to figure out what's wrong myself - sorry!
>>
>> Andrew Gee
>
> This is fixed in the most recent kernel modules in the tcl branch at least. I'll check if the fix has been propagated.

Yes, it looks like the fix is also on the DRI trunk.

Keith
[Dri-devel] [Fwd: [Xpert][PATCH] TDFX / Voodoo low tex mem hang]
Mike,

Can you test whether this patch fixes your problems w/V3 in high res modes? If this fixes it, then I like it better than the previous one you sent--as it addresses the root of the problem.

-- /\
Jens Owen/ \/\ _
[EMAIL PROTECTED] /\ \ \ Steamboat Springs, Colorado

---BeginMessage---
Two patches here, one is against xf-4_2-branch, one against DRI cvs.

Tested with 0.38mb free texture space running q3 on highest texture settings, which is probably the minimum you can have (as a comparison, 1280x960 leaves something like 8mb or so).

Regardless, if it hits the same problem it should exit the application cleanly rather than hang.

Please send feedback to me or dri-devel

Thanks,
--
Michael.

Index: lib/GL/mesa/src/drv/tdfx/tdfx_texman.c
===
RCS file: /cvs/xc/lib/GL/mesa/src/drv/tdfx/tdfx_texman.c,v
retrieving revision 1.4
diff -u -3 -p -r1.4 tdfx_texman.c
--- lib/GL/mesa/src/drv/tdfx/tdfx_texman.c 2001/08/18 02:51:07 1.4
+++ lib/GL/mesa/src/drv/tdfx/tdfx_texman.c 2002/05/15 12:29:23
@@ -77,7 +77,7 @@ static void tdfxTMVerifyFreeList( tdfxCo
          if ( t ) {
             if ( t->isInTM ) {
                numRes++;
-               assert( t->range[0] );
+/*             assert( t->range[0] ); */
                if ( t->range[unit] )
                   totalUsed += (t->range[unit]->endAddr - t->range[unit]->startAddr);
             } else {
@@ -112,7 +112,7 @@ static void tdfxTMDumpTexMem( tdfxContex
          printf( "isInTM=%d whichTMU=%ld lastTimeUsed=%d\n",
                  t->isInTM, t->whichTMU, t->lastTimeUsed );
          printf( "tm[0] = %p", t->range[0] );
-         assert( t->range[0] );
+/*       assert( t->range[0] ); */
          if ( t->range[0] ) {
             printf( "tm startAddr = %ld endAddr = %ld",
                     t->range[0]->startAddr,
@@ -389,7 +389,10 @@ tdfxTMFindOldestObject( tdfxContextPtr f
               ( t->whichTMU == TDFX_TMU_SPLIT ) ) ) {
          GLuint age, lastTime;
 
-         assert( t->range[0] );
+/*       assert( t->range[0] );*/
+         if (! t->range[unit] ) {
+            return NULL;
+         }
          lastTime = t->lastTimeUsed;
 
          if ( lastTime > bindNumber ) {
@@ -568,6 +571,8 @@ tdfxTMAllocTexMem( tdfxContextPtr fxMesa
       fprintf( stderr, "tdfxTMAllocTexMem returned NULL! unit=%ld size=%ld\n",
                unit, size );
+      UNLOCK_HARDWARE( fxMesa );
+      exit( 1 );
    }
    return range;
 }
Index: programs/Xserver/hw/xfree86/drivers/tdfx/tdfx_driver.c
===
RCS file: /cvs/xc/programs/Xserver/hw/xfree86/drivers/tdfx/tdfx_driver.c,v
retrieving revision 1.87
diff -u -3 -p -r1.87 tdfx_driver.c
--- programs/Xserver/hw/xfree86/drivers/tdfx/tdfx_driver.c 2002/01/04 21:22:35 1.87
+++ programs/Xserver/hw/xfree86/drivers/tdfx/tdfx_driver.c 2002/05/15 12:29:28
@@ -1954,8 +1954,11 @@ static void allocateMemory(ScrnInfoPtr p
   /* for giggles. */
   pTDFX->fbOffset = pTDFX->fifoOffset + pTDFX->fifoSize;
   pTDFX->texOffset = pTDFX->fbOffset + fbSize;
+  pTDFX->texSize = pTDFX->backOffset - pTDFX->texOffset;
+
   if (pTDFX->depthOffset <= pTDFX->texOffset ||
-      pTDFX->backOffset <= pTDFX->texOffset) {
+      pTDFX->backOffset <= pTDFX->texOffset ||
+      pTDFX->texSize < 256*256*6) {
     /*
      * pTDFX->texSize < 0 means that the DRI is disabled. pTDFX->backOffset
      * is used to calculate the maximum amount of memory available for
@@ -1970,7 +1973,6 @@ static void allocateMemory(ScrnInfoPtr p
                "\tand/or back buffer. Disabling DRI. To use DRI try lower\n"
                "\tresolution modes and/or a smaller virtual screen size\n");
   } else {
-    pTDFX->texSize = pTDFX->backOffset - pTDFX->texOffset;
     xf86DrvMsg(pScrn->scrnIndex, X_INFO, "Textures Memory %0.02f MB\n",
                (float)pTDFX->texSize/1024.0/1024.0);
   }

Index: lib/GL/mesa/src/drv/tdfx/tdfx_texman.c
===
RCS file: /cvsroot/dri/xc/xc/lib/GL/mesa/src/drv/tdfx/tdfx_texman.c,v
retrieving revision 1.11
diff -u -3 -p -r1.11 tdfx_texman.c
--- lib/GL/mesa/src/drv/tdfx/tdfx_texman.c 14 Feb 2002 01:59:59 - 1.11
+++ lib/GL/mesa/src/drv/tdfx/tdfx_texman.c 14 May 2002 20:45:12 -
@@ -396,7 +396,9 @@ FindOldestObject(tdfxContextPtr fxMesa,
             (info->whichTMU == TDFX_TMU_SPLIT))) {
             GLuint age, lasttime;
 
-            assert(info->tm[0]);
+            /*assert(info->tm[0]);*/
+            if (!info->tm[tmu])
+               return NULL;
             lasttime = info->lastTimeUsed;
 
             if (lasttime > bindnumber)
@@ -625,7 +627,8 @@ AllocTexMem(tdfxContextPtr fxMesa, FxU32
         sprintf(err, "AllocTexMem returned NULL! tmu=%d texmemsize=%d\n",
                 (int) tmu, (int) texmemsize);
         _mesa_problem(fxMesa->glCtx, err);
-        return NULL;
+        UNLOCK_HARDWARE(
Re: [Dri-devel] Mach64 bus mastering abilities (further test results)
Jose,

In reading this it just occurred to me what the flaw in my code was. I was setting up the descriptor table for a new pass _before_ waiting for the last one to complete, so there was a race condition. If I wait for idle _before_, I get no lockups, but the framerate drops. So here's what I'm going to do: set up ring pointers so that I start the descriptors for a new pass where the last one ended. Then I'll start building the new descriptors, wrapping if necessary, and wait for idle if I hit the start of the ring (the start of the last pass). I'll let you know how it goes. I'm sure this method could be refined, but if it works, things should be stable and I can check it in.

Leif

On Thu, 16 May 2002, José Fonseca wrote:
> On 2002.05.15 00:33 José Fonseca wrote:
>> ...
>> I still have to work out more details:
>>
>> a) check if there is no other buffering besides the FIFO going on. This can only be checked by making a full proof of concept example and checking that nothing goes wrong.
>
> I used one DMA buffer (as I don't know much of the kernel API) to hold the history of a register that I chose during the bus master operation. Here you can see BM_GUI_TABLE progressing:
>
> May 16 20:06:00 localhost kernel: [drm] REG = 0x0005
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050010
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050020
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050030
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050040
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050050
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050060
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050070
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050080
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050090
> May 16 20:06:00 localhost kernel: [drm] REG = 0x000500a0
> May 16 20:06:00 localhost kernel: [drm] REG = 0x000500b0
> May 16 20:06:00 localhost kernel: [drm] REG = 0x000500c0
> ...
> May 16 20:06:00 localhost kernel: [drm] REG = 0x000503e0
> May 16 20:06:00 localhost kernel: [drm] REG = 0x000503f0
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050400
> May 16 20:06:00 localhost kernel: [drm] REG = 0x00050410
>
> Although you see just one line per value, in reality there are several (I had to filter duplicates when printing to avoid overflowing the system log) and each buffer is just 24 bytes. Nevertheless this doesn't mean that there is no buffering when reading the descriptor table. I'm gonna devise a test for that: it will monitor the BM_GUI_TABLE value and change the value at the last moment (like waiting for the train to come to cross the line!)
>
>> b) see if the descriptor table can be made into a circular buffer. The specs mention something about this but they aren't clear. They say the circular buffer is in the card memory, but if the card was copying the whole buffer then test 3 couldn't be happening...
>
> I've allocated a 32KB buffer and put a continuation entry just before the 16KB boundary and two final entries: one right above the 16KB boundary and another right at the beginning of the table. Here you can see BM_GUI_TABLE looping around the 16k boundary! ;-)
>
> May 16 20:58:01 localhost kernel: [drm] REG = 0x00093ff0
> May 16 20:58:01 localhost kernel: [drm] REG = 0x0009
>
>> c) instead of using a GUI register it's probably better to use END_OF_LIST_STATUS@BM_COMMAND to see if the card is processing the last entry of the descriptor table. If that bit is set then there is no point in adding to the table, as the engine will surely stop. We'll still need the buffer aging register to resolve the race condition if the engine stops while we change the table.
>
> Here you see BM_COMMAND
>
> May 16 20:07:52 localhost kernel: [drm] REG = 0xc000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4010
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> ...
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4018
> May 16 20:07:52 localhost kernel: [drm] REG = 0x4000
> May 16 20:07:52 localhost kernel: [drm] REG = 0xc018
> May 16 20:07:52 localhost kernel: [drm] REG = 0xc000
>
> and the last two lines, END_OF_LIST_STATUS@BM_COMMAND being set! ;-)
>
> NOTE: For those
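
The ring scheme Leif describes at the top of this message (start each new pass where the last one ended, wrap if necessary, and only wait for idle on reaching the start of the last pass) can be sketched roughly as below. This is a hardware-free illustration in C, not the mach64 DRM code: `desc_ring`, `ring_begin_pass`, `ring_reserve`, `ring_submit`, `RING_ENTRIES` and the `engine_busy` flag are all hypothetical names, and `wait_for_idle` merely stands in for polling the card.

```c
#include <assert.h>

#define RING_ENTRIES 8  /* hypothetical ring size, in descriptors */

/* All names here are illustrative; none come from the mach64 DRM source. */
struct desc_ring {
    unsigned head;        /* next slot to be filled */
    unsigned last_start;  /* first slot of the most recently submitted pass */
    int      engine_busy; /* stand-in for the card's busy/idle status */
    unsigned stalls;      /* times the builder had to wait while filling */
};

/* Stand-in for polling the card until the current pass has completed. */
static void wait_for_idle(struct desc_ring *r)
{
    r->engine_busy = 0;
}

/* A new pass starts where the previous one ended. */
static unsigned ring_begin_pass(const struct desc_ring *r)
{
    return r->head;
}

/* Reserve the next slot, wrapping at the end of the ring.  Only when the
 * write position reaches the start of the pass the engine may still be
 * reading do we have to wait. */
static unsigned ring_reserve(struct desc_ring *r)
{
    unsigned slot = r->head;

    if (r->engine_busy && slot == r->last_start) {
        wait_for_idle(r);
        r->stalls++;
    }
    r->head = (slot + 1) % RING_ENTRIES;
    return slot;
}

/* Hand the finished pass to the engine.  The previous pass must have
 * completed before the new one is started. */
static void ring_submit(struct desc_ring *r, unsigned pass_start)
{
    if (r->engine_busy)
        wait_for_idle(r);
    r->last_start = pass_start;
    r->engine_busy = 1;
}
```

In this sketch the descriptors of a new pass are written while the previous pass may still be running, and the builder blocks only if it wraps all the way round to slots the engine could still be reading. A pass larger than the ring itself is not handled.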
[Dri-devel] OpenGL and the LinuxThreads pthread_descr structure
I would like to propose a small change to the pthread_descr structure in the latest LinuxThreads code, to better support OpenGL on GNU/Linux systems (particularly on x86, but not excluding other platforms). The purpose of this patch is to provide efficient thread-local storage for both libGL itself and loadable OpenGL driver modules, so that they can be made thread-safe without any impact on performance. Indeed, using this mechanism, an OpenGL driver can ignore the difference between running with a single thread and running with multiple threads, as global data will be accessed in the same way independent of the number of threads running.

To understand the need for such a change, one should consider what goes on inside an OpenGL implementation when an application makes an OpenGL API call. One of the primary tasks of the driver-independent libGL is to dispatch function calls to the driver backend(s), usually through a large function pointer table containing entries for the several hundred API entrypoints. Central to this process is the notion of a rendering context, or an abstraction of the OpenGL state machine. A context is required to perform OpenGL commands. The GLX specification states:

    "Each thread can have at most one current rendering context. In addition, a rendering context can be current for only one thread at a time."

The dispatch table for a context depends on the current state of OpenGL for that context, as things like display list compilation, display list playback and plain old immediate mode rendering change the behaviour of many API entrypoints. We see from the quote above that each thread has, at most, a single context, and this context has a single current dispatch table.

The top-level API entrypoints can be implemented like the following:

    struct gl_dispatch {
        ...
        void (*Foo)(GLint bar);
        ...
    };

    void glFoo(GLint bar)
    {
        struct gl_dispatch *current = __get_current_dispatch();
        current->Foo(bar);
    }

Similarly, a driver's implementation of the above entrypoint might look like the following:

    void __my_Foo(GLint bar)
    {
        struct gl_context *gc = __get_current_context();

        /* remember the current setting of bar */
        gc->state.current.bar = bar;

        /* do stuff with bar, like program hardware registers */
        ...
    }

We want __get_current_context() and __get_current_dispatch() (at a minimum) to be as efficient as possible, while still providing thread safety.

Suppose we add a libGL-specific area to pthread_descr. This would allow us to implement these (and other similar) functions like so:

    void *__get_current_context(void)
    {
        pthread_descr self = thread_self();
        return THREAD_GETMEM(self, p_libGL_specific[_LIBGL_TSD_KEY_CONTEXT]);
    }

    void *__get_current_dispatch(void)
    {
        pthread_descr self = thread_self();
        return THREAD_GETMEM(self, p_libGL_specific[_LIBGL_TSD_KEY_DISPATCH]);
    }

This would allow us to hand-code the top-level dispatch functions on x86 as:

    glFoo:
        movl %gs:__gl_context_offset, %eax
        jmp *__glapi_Foo(%eax)

where __gl_context_offset is the byte offset of the thread-local context pointer and __glapi_Foo is the byte offset of the Foo entry in the dispatch table. Clearly this is an efficient implementation of the dispatch mechanism required by OpenGL, and is completely thread-safe to boot.

With modern OpenGL applications and benchmarks dealing with datasets containing over 1 million vertices, with one or more function calls per vertex, you can see that an efficient dispatching mechanism is crucial for a high-performance OpenGL implementation.

For example, the SPEC Viewperf benchmark's Light test (as described at http://www.spec.org/gpc/opc.static/light05.html) includes a subtest that renders over half a million wireframe primitives like so:

    GLfloat color[][4];
    GLfloat position[][4];

    glBegin(GL_LINE_LOOP);
    glColor3fv(color[i]);
    glVertex3fv(position[i]);
    glColor3fv(color[i+1]);
    glVertex3fv(position[i+1]);
    glColor3fv(color[i+2]);
    glVertex3fv(position[i+2]);
    glColor3fv(color[i+3]);
    glVertex3fv(position[i+3]);
    glEnd();

With 10 function calls per primitive, this equates to over 5 million function calls per frame. This is certainly a worst-case scenario, and there are certainly more efficient methods of rendering such large amounts of data, but Viewperf (the industry-standard OpenGL benchmark) deliberately stresses this path to measure the cost of API calls, as many workstation OpenGL apps (engineering, CAD and 3D modelling tools) still operate like this.

An important point to understand is that the round trip through the API, into the driver and back out again for this immediate mode path can often be counted in tens of instructions. State of the art
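
As a rough illustration of the per-thread dispatch idea in the proposal above, the sketch below keeps the current dispatch pointer in thread-local storage and routes an entrypoint through it. Note the swap: it uses the C11 `_Thread_local` storage class rather than the proposed libGL area in pthread_descr, so it shows the access pattern only, not the proposed mechanism. Every name in it (`glFoo_sketch`, `set_current_dispatch`, the two tables, `last_bar`) is hypothetical, and the dispatch table is cut down to a single entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down dispatch table: one entry instead of several hundred. */
struct gl_dispatch {
    void (*Foo)(int bar);
};

/* One pointer per thread.  C11 _Thread_local stands in for the proposed
 * per-thread slot in pthread_descr; this is an assumption of the sketch. */
static _Thread_local struct gl_dispatch *current_dispatch;

/* Per-thread state written by the backends, so the effect is observable. */
static _Thread_local int last_bar;

/* Two backends for the same entrypoint, as with immediate mode versus
 * display list compilation. */
static void immediate_Foo(int bar) { last_bar = bar; }
static void compile_Foo(int bar)   { last_bar = -bar; }

static struct gl_dispatch immediate_table = { immediate_Foo };
static struct gl_dispatch compile_table   = { compile_Foo };

/* Switching tables is a single thread-local store. */
static void set_current_dispatch(struct gl_dispatch *d)
{
    current_dispatch = d;
}

/* Top-level entrypoint: one thread-local load, then an indirect call.
 * The code path is the same whether one thread or many are running. */
static void glFoo_sketch(int bar)
{
    current_dispatch->Foo(bar);
}
```

A caller would select a table with `set_current_dispatch(&immediate_table)` and then call `glFoo_sketch(7)`; each thread sees only its own `current_dispatch`.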
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Hi!

What percentage of applications use different dispatch tables among its threads? How often do dispatch table changes occur? If both of these are fairly low, computing a dispatch table in an awx section at dispatch table switch time might be fastest (ie. prepare something like:

    .section dispatch, "awx"
    .align 8
    .globl glFooBar
glFooBar:
    jmp something
    nop; nop; nop

and something would be changed whenever a dispatch table switch happens for all dispatch table members).

BTW: Last time I looked at libGL (in March), these were things which I came across:

1) libGL should IMHO use a version script (at least an anonymous one if you want to avoid assigning a specific GL_x.y symbol version to it), that way you get rid of thousands of expensive run-time relocations

2) last time I looked, libGL.so was linked unconditionally against libpthread. This is punishing all non-threaded apps, weak undefined symbols work very well

3) I don't think building without -fpic is a good idea, 1) together with other tricks might speed things up while avoiding DT_TEXTREL overhead

There were some other things, but I don't remember them very well. If I find time I'll build libGL again and check the disassembly.

Jakub
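
Point 2 above, weak undefined symbols, can be sketched as follows. This is a minimal example assuming GCC or Clang on an ELF system: `__attribute__((weak))` and `__typeof__` are compiler extensions, and `have_pthreads` is a hypothetical helper name, not a libGL function. On systems where the thread functions live in libc itself, the check is simply always true.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Redeclare pthread_create as a weak reference.  If no library providing it
 * is linked in, the symbol resolves to NULL at run time instead of making
 * the link fail. */
extern __typeof__(pthread_create) pthread_create __attribute__((weak));

/* Hypothetical helper: returns 1 if a pthread implementation is present
 * in the process, 0 otherwise. */
static int have_pthreads(void)
{
    return pthread_create != NULL;
}
```

A library could call `have_pthreads()` once at startup and take its thread-aware paths only when the result is non-zero, which is the property that lets it avoid linking unconditionally against libpthread.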
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Jakub Jelinek wrote:
> Hi!
>
> What percentage of applications use different dispatch tables among its threads? How often do dispatch table changes occur? If both of these are fairly low, computing a dispatch table in an awx section at dispatch table switch time might be fastest (ie. prepare something like:
>
>     .section dispatch, "awx"
>     .align 8
>     .globl glFooBar
> glFooBar:
>     jmp something
>     nop; nop; nop
>
> and something would be changed whenever a dispatch table switch happens for all dispatch table members).

That's not really feasible, as the tables can change very frequently (as often as every glBegin/glEnd, or maybe even every function call between glBegin and glEnd). Also, dispatch tables will *always* be different between threads, that's why they need to be accessed in a thread-safe manner. Finally, rewriting the instructions like this will have very bad trace cache behaviour on the Pentium 4, where touching instructions that have already been decoded causes the entire trace cache to be flushed.

> BTW: Last time I looked at libGL (in March), these were things which I came across:
>
> 1) libGL should IMHO use a version script (at least an anonymous one if you want to avoid assigning a specific GL_x.y symbol version to it), that way you get rid of thousands of expensive run-time relocations

Can you explain this in more detail? I'm not sure I understand what you're saying.

> 2) last time I looked, libGL.so was linked unconditionally against libpthread. This is punishing all non-threaded apps, weak undefined symbols work very well

I agree.

> 3) I don't think building without -fpic is a good idea, 1) together with other tricks might speed things up while avoiding DT_TEXTREL overhead

Again, could you explain this in more detail? Thanks.

-- Gareth
Re: [Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Jakub Jelinek wrote:
> Hi!
>
> What percentage of applications use different dispatch tables among its threads? How often do dispatch table changes occur? If both of these are fairly low, computing a dispatch table in an awx section at dispatch table switch time might be fastest (ie. prepare something like:
>
>     .section dispatch, "awx"
>     .align 8
>     .globl glFooBar
> glFooBar:
>     jmp something
>     nop; nop; nop
>
> and something would be changed whenever a dispatch table switch happens for all dispatch table members).
>
> BTW: Last time I looked at libGL (in March), these were things which I came across:
>
> 1) libGL should IMHO use a version script (at least an anonymous one if you want to avoid assigning a specific GL_x.y symbol version to it), that way you get rid of thousands of expensive run-time relocations

Where can I get info on this?

> 2) last time I looked, libGL.so was linked unconditionally against libpthread. This is punishing all non-threaded apps, weak undefined symbols work very well

This is because we currently use the standard way of getting thread-local data and detecting multi-thread situations. I'm not sure how Gareth is able to detect threaded vs. non-threaded situations without making any calls into the pthreads library, but once you know which one you're in, with his trick, you don't need to make any more.

Currently we do something like this in MakeCurrent:

    void _glapi_check_multithread(void)
    {
    #if defined(THREADS)
       if (!ThreadSafe) {
          static unsigned long knownID;
          static GLboolean firstCall = GL_TRUE;
          if (firstCall) {
             knownID = _glthread_GetID();
             firstCall = GL_FALSE;
          }
          else if (knownID != _glthread_GetID()) {
             ThreadSafe = GL_TRUE;
          }
       }
       if (ThreadSafe) {
          /* make sure that this thread's dispatch pointer isn't null */
          if (!_glapi_get_dispatch()) {
             _glapi_set_dispatch(NULL);
          }
       }
    #endif
    }

where _glthread_GetID() is really pthread_self(). How do you detect threading without making these calls to libpthreads.so?

> 3) I don't think building without -fpic is a good idea, 1) together with other tricks might speed things up while avoiding DT_TEXTREL overhead

The thing that really bites with -fpic is the bs you have to go through to get access to static symbols (forgive my loose terminology) like static variables or other functions you want to call. Gareth's trick means that two very important variables avoid this, but it's still going to be necessary to call other functions often enough...

> There were some other things, but I don't remember them very well. If I find time I'll build libGL again and check the disassembly.

As someone who 1) is concerned about libGL performance and 2) doesn't know much about relocation/fpic/pthreads/etc, I'd love to hear anything you've got on this.

Keith
Re: [Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Keith Whitwell wrote:

2) last time I looked, libGL.so was linked unconditionally against libpthread. This is punishing all non-threaded apps, weak undefined symbols work very well

This is because we currently use the standard way of getting thread-local data and detecting multi-thread situations. I'm not sure how Gareth is able to detect threaded vs. non-threaded situations without making any calls into the pthreads library, but once you know which one you're in, with his trick, you don't need to make any more. Currently we do something like this in MakeCurrent:

    void _glapi_check_multithread(void)
    {
    #if defined(THREADS)
       if (!ThreadSafe) {
          static unsigned long knownID;
          static GLboolean firstCall = GL_TRUE;
          if (firstCall) {
             knownID = _glthread_GetID();
             firstCall = GL_FALSE;
          }
          else if (knownID != _glthread_GetID()) {
             ThreadSafe = GL_TRUE;
          }
       }
       if (ThreadSafe) {
          /* make sure that this thread's dispatch pointer isn't null */
          if (!_glapi_get_dispatch()) {
             _glapi_set_dispatch(NULL);
          }
       }
    #endif
    }

where _glthread_GetID() is really pthread_self(). How do you detect threading without making these calls to libpthread.so?

The important point is that you don't really need to detect threading anymore. The Linux OpenGL ABI states that multithreaded apps must link with pthreads. Thus, at startup, you can detect whether pthreads is present. Basically, if pthreads is present, you just use the pthread_descr that it set up; otherwise you create a dummy one, plug it into the segment registers (or whatever), and are done with it. From that point on, you don't care how many threads there are. Accessing global data is always done the same way, independent of the number of threads running.

In any case, it would be great to remove the need for apps that link with libGL to also link with pthreads, and to stop forcing the use of pthreads even for single-threaded apps.
The thing that really bites with -fpic is the bs you have to go through to get access to static symbols (forgive my loose terminology) like static variables or other functions you want to call. Gareth's trick means that two very important variables avoid this, but it's still going to be necessary to call other functions often enough...

I'd like to hear a strong argument as to why you *would* want to link with -fpic. Like Keith, I'm also not familiar with some of the more in-depth aspects w.r.t. relocation/fpic etc, so feel free to enlighten us.

-- Gareth
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Jakub Jelinek wrote:

Hi! What percentage of applications use different dispatch tables among their threads? How often do dispatch table changes occur? If both of these are fairly low, computing a dispatch table in an awx section at dispatch table switch time might be fastest

I should also point out that display list compilation and playback is another place where the dispatch table changes (typically, you have at least one dispatch table each for regular immediate mode, display list compilation and display list playback). One of the ugliest things about the Microsoft Windows implementation of OpenGL is that the driver backend must call a function to register a new dispatch table, and the OpenGL library then makes several copies of this table internally. Being able to switch dispatch tables with a single pointer reassignment makes it easy to do very powerful optimizations.

-- Gareth
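Editorial sketch of switching dispatch tables with a single pointer reassignment, as described above. Everything here is illustrative: the two-entry table, the demo_* names and the counters are hypothetical stand-ins, not Mesa's or any driver's real structures.

```c
#include <stddef.h>

/* Hypothetical miniature dispatch table with two entry points. */
struct dispatch_table {
    void (*vertex)(int x);
    void (*color)(int c);
};

static int exec_count;      /* commands executed immediately */
static int compiled_count;  /* commands recorded into a display list */

static void exec_vertex(int x) { (void)x; exec_count++; }
static void exec_color(int c)  { (void)c; exec_count++; }
static void save_vertex(int x) { (void)x; compiled_count++; }
static void save_color(int c)  { (void)c; compiled_count++; }

static const struct dispatch_table exec_table = { exec_vertex, exec_color };
static const struct dispatch_table save_table = { save_vertex, save_color };

/* The only state that changes on a mode switch. */
static const struct dispatch_table *current = &exec_table;

/* Public entry points always jump through the current table. */
void demo_vertex(int x) { current->vertex(x); }
void demo_color(int c)  { current->color(c); }

/* Entering or leaving display list compilation is one pointer store. */
void demo_new_list(void) { current = &save_table; }
void demo_end_list(void) { current = &exec_table; }

int demo_exec_count(void)     { return exec_count; }
int demo_compiled_count(void) { return compiled_count; }
```

Calling demo_vertex before demo_new_list runs the immediate-mode function; the same call between demo_new_list and demo_end_list is recorded instead, with no per-call mode check. In a threaded library the `current` pointer would itself need to be per-thread, which is what the rest of this thread is about.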
Re: [Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Gareth Hughes wrote:

Keith Whitwell wrote:

2) last time I looked, libGL.so was linked unconditionally against libpthread. This is punishing all non-threaded apps, weak undefined symbols work very well

This is because we currently use the standard way of getting thread-local data and detecting multi-thread situations. I'm not sure how Gareth is able to detect threaded vs. non-threaded situations without making any calls into the pthreads library, but once you know which one you're in, with his trick, you don't need to make any more. Currently we do something like this in MakeCurrent:

    void _glapi_check_multithread(void)
    {
    #if defined(THREADS)
       if (!ThreadSafe) {
          static unsigned long knownID;
          static GLboolean firstCall = GL_TRUE;
          if (firstCall) {
             knownID = _glthread_GetID();
             firstCall = GL_FALSE;
          }
          else if (knownID != _glthread_GetID()) {
             ThreadSafe = GL_TRUE;
          }
       }
       if (ThreadSafe) {
          /* make sure that this thread's dispatch pointer isn't null */
          if (!_glapi_get_dispatch()) {
             _glapi_set_dispatch(NULL);
          }
       }
    #endif
    }

where _glthread_GetID() is really pthread_self(). How do you detect threading without making these calls to libpthread.so?

The important point is that you don't really need to detect threading anymore. The Linux OpenGL ABI states that multithreaded apps must link with pthreads. Thus, at startup, you can detect whether pthreads is present. Basically, if pthreads is present, you just use the pthread_descr that it set up; otherwise you create a dummy one, plug it into the segment registers (or whatever), and are done with it. From that point on, you don't care how many threads there are. Accessing global data is always done the same way, independent of the number of threads running.

Hmm. And does libpthread.so *always* set this up -- is it possible to link to libpthread.so but not actually use it, spoofing your detection?
In any case, it would be great to remove the need for apps that link with libGL to also link with pthreads, and to stop forcing the use of pthreads even for single-threaded apps.

I agree.

The thing that really bites with -fpic is the bs you have to go through to get access to static symbols (forgive my loose terminology) like static variables or other functions you want to call. Gareth's trick means that two very important variables avoid this, but it's still going to be necessary to call other functions often enough...

I'd like to hear a strong argument as to why you *would* want to link with -fpic. Like Keith, I'm also not familiar with some of the more in-depth aspects w.r.t. relocation/fpic etc, so feel free to enlighten us.

Me also.

Keith
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
On Thu, 2002-05-16 at 15:41, Gareth Hughes wrote:

I would like to propose a small change to the pthread_descr structure in the latest LinuxThreads code, to better support OpenGL on GNU/Linux systems (particularly on x86, but not excluding other platforms). The purpose of this patch is to provide efficient thread-local storage for both libGL itself and loadable OpenGL driver modules,

glibc already supports the thread-local storage extension to ELF and so does binutils. Only gcc support is missing, but you can work around this with asm. Once gcc has the support you'll be able to write

    __thread some_type some_var;

as if some_var were a global variable. In fact it'll be thread-specific. This is the only way you'll get access to thread-local storage. It is out of the question to allow third-party programs to peek and poke into the thread descriptor.

-- Ulrich Drepper, Red Hat, 1325 Chesapeake Terrace, Sunnyvale, CA 94089 USA, drepper at redhat.com
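Editorial sketch of the `__thread` declaration described above, assuming a compiler that already supports it (GCC and Clang do; C11 later standardized the same idea as `_Thread_local`). The variable and function names are hypothetical. Each thread gets its own copy of the variable, so a write in one thread is invisible to the others:

```c
#include <pthread.h>
#include <stddef.h>

/* One instance of this variable exists per thread. */
static __thread int current_context = 0;

static void *worker(void *arg)
{
    current_context = 42;           /* modifies only the worker's copy */
    *(int *)arg = current_context;  /* report what the worker saw */
    return NULL;
}

/* Returns 1 if the main thread's copy was unaffected by the worker's
 * write, 0 if not, and -1 if the thread could not be created. */
int tls_is_per_thread(void)
{
    pthread_t t;
    int seen_by_worker = 0;

    current_context = 7;
    if (pthread_create(&t, NULL, worker, &seen_by_worker) != 0)
        return -1;
    pthread_join(t, NULL);
    return seen_by_worker == 42 && current_context == 7;
}
```

The source reads like ordinary global access; the compiler and linker arrange the per-thread addressing, which is why no code outside glibc needs to know the thread descriptor layout.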
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Ulrich Drepper wrote:

This is the only way you'll get access to thread-local storage. It is out of the question to allow third-party programs to peek and poke into the thread descriptor.

What do you mean, a third-party program? We're talking about a system library (libGL.so) here. There is a similar shortcut for libc (p_libc_specific) already in there.

-- Gareth
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
On Thu, 2002-05-16 at 17:54, Gareth Hughes wrote:

What do you mean, a third-party program? We're talking about a system library (libGL.so) here.

Everything which is not part of glibc is third-party. It's the same as if some program required access to internal data structures of libGL. There are several different layouts of the thread descriptor and it's only getting worse. The actual layout doesn't matter, since everything is internal to glibc and the other libraries which come with it, so this is no problem.

Besides, I don't understand why you react like this. Using __thread is the best you can ever get. It'll be portable (Solaris 9 already has the support as well) and it's faster than anything else you can use to access the data.

-- Ulrich Drepper, Red Hat, 1325 Chesapeake Terrace, Sunnyvale, CA 94089 USA, drepper at redhat.com
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
Gareth Hughes wrote:

Let's be clear about what I'm proposing: you agree to reserve an 8*sizeof(void *) block at a well-defined and well-known offset in the TCB.

Of course, I should add that space for such a block exists, and has existed for some time. My proposal requires no real changes on the glibc side of things, other than to set in stone the agreement between pthreads and OpenGL to ensure this block is there in the future.

-- Gareth
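Editorial sketch of what "an 8*sizeof(void *) block at a well-known offset" could look like. The struct below is a hypothetical stand-in, not the real LinuxThreads pthread_descr; the field names and the chosen offset are assumptions made purely for illustration. The point is that both sides agree on a fixed offset and size, and a compile-time check keeps the layout from drifting:

```c
#include <stddef.h>

#define GL_TCB_SLOTS  8
#define GL_TCB_OFFSET (4 * sizeof(void *))   /* hypothetical agreed offset */

/* Hypothetical thread control block; the real layout is private to glibc. */
struct demo_tcb {
    void *private_fields[4];           /* stand-in for internal fields */
    void *gl_reserved[GL_TCB_SLOTS];   /* block reserved for OpenGL */
};

/* Fail the build if the reserved block ever moves or changes size. */
_Static_assert(offsetof(struct demo_tcb, gl_reserved) == GL_TCB_OFFSET,
               "reserved block moved");
_Static_assert(sizeof(((struct demo_tcb *)0)->gl_reserved)
               == GL_TCB_SLOTS * sizeof(void *),
               "reserved block changed size");

/* The consumer side knows only the base pointer and the agreed offset. */
void **gl_slots(struct demo_tcb *tcb)
{
    return (void **)((char *)tcb + GL_TCB_OFFSET);
}
```

This is exactly the kind of cross-library layout dependency objected to earlier in the thread: the consumer compiles in an offset that the owner of the structure must then never change.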
[Dri-devel] Re: OpenGL and the LinuxThreads pthread_descr structure
A question about the __thread stuff: does it require -fPIC? What happens if you don't compile a library with -fPIC, and have __thread variables declared in that library?

-- Gareth