Re: [Dri-devel] Mach64 dma fixes
Linus Torvalds wrote: A hot system call takes about 0.2 us on an Athlon (it takes significantly longer on a P4, which I'm beating up Intel over all the time). The ioctl stuff goes through slightly more layers, but we're not talking huge numbers here. The system calls are fast enough that you're better off trying to keep stuff in the cache than trying to minimize system calls. This is an education for me, too. Thanks for the info. Any idea how heavy ioctls are on a P4? NOTE NOTE NOTE! The tradeoffs are seldom all that clear. Sometimes big buffers and few system calls are better. Sometimes they aren't. It just depends on a lot of things. You bet--and the real issue we're constantly swimming upstream against is security in open source. Most hardware vendors design their hardware for closed source drivers and don't put much (or sometimes any) time into making sure their hardware is optimized for performance *and* security. Consequently, most modern graphics chips are optimized for user space DMA, and they rely on the security through obscurity of their closed source drivers. Then the DRI team comes along and has to figure out how to kludge together a secure path that doesn't sacrifice *all* the performance. Linus, if you have any ideas on how we can uphold the security strengths of Linux without leaving all this performance on the table simply because we embrace open source, then I'd love to hear it. It really hurts to be competing tooth and nail against closed source drivers (on Linux, even) and have to leave potentially large performance gains on the table. The other paradox here is that security is paramount for the server market, where Linux is strong. But we're trying to help Linux into the graphics workstation and game machine markets, where users already have full access to the machine (even physically). So how is all this security really helping us address those markets? Sorry, I'm venting.
This has been a difficult issue since the beginning of the DRI project--but I'm glad I got it off my chest :-) Regards, Jens -- Jens Owen [EMAIL PROTECTED] Steamboat Springs, Colorado ___ Don't miss the 2002 Sprint PCS Application Developer's Conference August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm ___ Dri-devel mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [Dri-devel] DRM_R128_DEPTH
Michel Dänzer wrote: I noticed there are two conflicting definitions for this in programs/Xserver/hw/xfree86/drivers/ati/r128_common.h. One of them used to be in programs/Xserver/hw/xfree86/os-support/xf86drmR128.h, the other one was introduced with the drmCommand stuff. Does this fix look good? Good catch! I see you've already committed that patch. That's great. -- Jens Owen [EMAIL PROTECTED] Steamboat Springs, Colorado
Re: [Dri-devel] Mach64 dma fixes
On Mon, 27 May 2002, Jens Owen wrote: This is an education for me, too. Thanks for the info. Any idea how heavy ioctls are on a P4? Much heavier. For some yet unexplained reason, a P4 takes about 1us to do a simple system call. That's on a 1.8GHz system, so it basically implies that a P4 takes 1800 cycles to do an int 0x80 + iret, which is just ludicrous. A 1.2GHz Athlon does the same in 0.2us, i.e. around 250 cycles (the 200+ cycles also matches a Pentium reasonably well, so it's really the P4 that stands out here). The rest of the ioctl overhead is not really noticeable compared to those 1800 cycles spent on entering/exiting kernel mode. Even so, those memcpy vs pipe throughput numbers I quoted were off my P4 machine: _despite_ the fact that a P4 is inexplicably bad at system calls, those 1800 CPU cycles are just a whole lot less than a lot of cache misses with modern hardware. It doesn't take many cache misses to make 1800 cycles just noise. And if the 1800 cycles are less than the cache misses on normal non-IO benchmarks, they are going to be _completely_ swamped by any PCI/AGP overhead. You bet--and the real issue we're constantly swimming upstream against is security in open source. Most hardware vendors design the hardware for closed source drivers and don't put much (or sometimes any) time into making sure their hardware is optimized for performance *and* security. I realize this, and I feel for you. It's nasty. I don't know what the answer is. It _might_ even be something like a bi-modal system: - apps by default get the traditional GLX behaviour: the X server does all the 3D for them. No DRI. - there is some mechanism to tell which apps are trusted, and trusted apps get direct hw access and just aren't secure. I actually think that if the abstraction level is just high enough, DRI shouldn't matter in theory. Shared memory areas with X for the high-level data (to avoid the copies for things like the obviously huge texture data).
From a game standpoint, think quake engine. The actual game doesn't need to tell the GL engine everything over and over again all the time. It tells it the basic stuff once, and then it just says render me. You don't need DRI for sending the render me command, you need DRI because you send each vertex separately. In that kind of high-level abstraction, the X client-server model should still work fine. In fact, it should work especially well on small-scale SMP (which seems inevitable). Are people thinking about the next stage, when 2D just doesn't exist any more except as a high-level abstraction on top of a 3D model? Where the X server actually gets to render the world view, and the application doesn't need to (or want to) know about things like level-of-detail? Linus
Re: [Dri-devel] Cards Specs
Jens Owen wrote: It would be interesting to hear more details from their developers regarding the comment they put in their README: If an OpenGL application is forcibly terminated by closing the X connection then there may be leftovers on the desktop. This appears to be a problem in the DRI infrastructure the driver is based upon. The problem we were seeing is that closing the connection didn't update the context stamp, so the dri driver could still get the lock without talking to the server. This meant frames were still in flight when xlib finally realized the connection was closed and exited. We saw the same behavior with a Radeon card. The easiest way to reproduce this is to start gears and then kill the connection (wm or xkill). Are there any Kyro developers listening on this list? Yes. - Tim Rowley [EMAIL PROTECTED]
Re: [Dri-devel] Mach64 dma fixes
Linus Torvalds wrote: On Mon, 27 May 2002, Jens Owen wrote: This is an education for me, too. Thanks for the info. Any idea how heavy ioctls are on a P4? Much heavier. For some yet unexplained reason, a P4 takes about 1us to do a simple system call. That's on a 1.8GHz system, so it basically implies that a P4 takes 1800 cycles to do an int 0x80 + iret, which is just ludicrous. A 1.2GHz Athlon does the same in 0.2us, i.e. around 250 cycles (the 200+ cycles also matches a Pentium reasonably well, so it's really the P4 that stands out here). This is remarkable. I thought things were getting better, not worse. ... You bet--and the real issue we're constantly swimming upstream against is security in open source. Most hardware vendors design the hardware for closed source drivers and don't put much (or sometimes any) time into making sure their hardware is optimized for performance *and* security. I realize this, and I feel for you. It's nasty. I don't know what the answer is. It _might_ even be something like a bi-modal system: - apps by default get the traditional GLX behaviour: the X server does all the 3D for them. No DRI. - there is some mechanism to tell which apps are trusted, and trusted apps get direct hw access and just aren't secure. I actually think that if the abstraction level is just high enough, DRI shouldn't matter in theory. Shared memory areas with X for the high-level data (to avoid the copies for things like the obviously huge texture data). I like this because it offers a way out, although I would keep the direct, secure approach to 3D we currently have for the other clients. Indirect rendering is pretty painful... However: The applications that most people would want to 'trust' are things like quake or other closed source games, which makes the situation a little murkier. From a game standpoint, think quake engine. The actual game doesn't need to tell the GL engine everything over and over again all the time.
It tells it the basic stuff once, and then it just says render me. You don't need DRI for sending the render me command, you need DRI because you send each vertex separately. You could view the static geometry of quake levels as a single display list and ask for the whole thing to be rendered each frame. However, the reality of the quake type games is anything but - huge amounts of effort have gone into the process of figuring out (as quickly as possible) what minimal amount of work can be done to render the visible portion of the level at each frame. Quake generates very dynamic data from quite a static environment in the name of performance... In that kind of high-level abstraction, the X client-server model should still work fine. In fact, it should work especially well on small-scale SMP (which seems inevitable). Games are free to partition themselves in other ways that help SMP but keep their ability for a tight binding with the display system -- for example, the physics (rigid body simulation) subsystem is a big and growing consumer of CPU and is quite easily separated out from the graphics engine. AI is also a target for its own thread. Are people thinking about the next stage, when 2D just doesn't exist any more except as a high-level abstraction on top of a 3D model? Where the X server actually gets to render the world view, and the application doesn't need to (or want to) know about things like level-of-detail? Yes, but there are a few steps between here and there, and there have been a few differences of opinion along the way. It would have been possible to get a lot of the X render extension via a client library emitting GL calls, for example. Keith
[Dri-devel] Re: DRI FAQ
Hello, I am translating the DRI FAQ into French and I am trying to correct it or simplify it. I don't really understand why there are two parts on Mesa: one on Mesa 3.4 and one on 4.x, because they are very similar. I can't really see what the differences are because it is too long. It would be better to point out the differences instead of writing the same things twice. Maybe I am wrong. Thanks, Roussel Jérôme.
Re: [Dri-devel] Re: DRI FAQ
On 2002.05.27 14:53 roussel jerome wrote: Hello, I am translating the DRI FAQ into French and I am trying to correct it or simplify it. I don't really understand why there are two parts on Mesa: one on Mesa 3.4 and one on 4.x, because they are very similar. I can't really see what the differences are because it is too long. It would be better to point out the differences instead of writing the same things twice. He! He! They are different, believe me - Leif and I spent a week together porting a driver from 3.4 to 4.x! The world wasn't made in one day: first there were the Mesa 3.4 internal docs, then they were adapted to Mesa 4.x. Mesa 3.4 is deprecated now, but I left it there because there are some drivers which weren't ported to Mesa 4.x yet (both in DRI and in Utah GLX). But I'd rather delete it than lose time explaining the differences. Maybe I am wrong. Thanks, Roussel Jérôme. My advice is to ignore the 3.x stuff. I'll eventually delete it too. If someone eventually needs it, one can always get an older version of the FAQ from the CVS repository. I look forward to seeing the French version of the FAQ ;-) Regards, José Fonseca
Re: [Dri-devel] Mach64 dma fixes
Keith Whitwell wrote: Linus Torvalds wrote: On Mon, 27 May 2002, Jens Owen wrote: This is an education for me, too. Thanks for the info. Any idea how heavy ioctls are on a P4? Much heavier. For some yet unexplained reason, a P4 takes about 1us to do a simple system call. That's on a 1.8GHz system, so it basically implies that a P4 takes 1800 cycles to do an int 0x80 + iret, which is just ludicrous. A 1.2GHz Athlon does the same in 0.2us, i.e. around 250 cycles (the 200+ cycles also matches a Pentium reasonably well, so it's really the P4 that stands out here). This is remarkable. I thought things were getting better, not worse. ... You bet--and the real issue we're constantly swimming upstream against is security in open source. Most hardware vendors design the hardware for closed source drivers and don't put much (or sometimes any) time into making sure their hardware is optimized for performance *and* security. I realize this, and I feel for you. It's nasty. I don't know what the answer is. It _might_ even be something like a bi-modal system: - apps by default get the traditional GLX behaviour: the X server does all the 3D for them. No DRI. - there is some mechanism to tell which apps are trusted, and trusted apps get direct hw access and just aren't secure. I actually think that if the abstraction level is just high enough, DRI shouldn't matter in theory. Shared memory areas with X for the high-level data (to avoid the copies for things like the obviously huge texture data). I like this because it offers a way out, although I would keep the direct, secure approach to 3D we currently have for the other clients. Indirect rendering is pretty painful... A bi-modal system could be very possible from an implementation perspective in the short term. We have a security mechanism in place now for validating which processes are allowed to access the direct rendering mechanism.
It is based on user IDs, and no process is allowed access to these resources unless: 1) It has access to the X server as an X client. 2) Its permission is acceptable based on how the DRI permissions are defined in the XF86Config file. Most distributions have picked up on this and now have a typical usage model that allows the DRI to work for all desktop users. If we do get some type of indirect rendering path working quicker, then perhaps we could tighten up these defaults so that the usage model required explicit administrative permission for a user before being allowed access to direct rendering. However, after going to all this trouble of making a decent level of fallback performance, I would then want to push the performance envelope for those processes that did meet the criteria for access to direct rendering resources, and soften the security requirements for just those processes. These could possibly be users that have been given explicit permission, and the X server itself (doing HW accelerated indirect rendering). There would really be three prongs of attack for this approach: 1) Audit the current DRI security model and confirm that it is strong enough to prevent unauthorized users from gaining access to the DRI mechanisms. Work with distros to tighten up the usage model (and possibly the DRI security mechanism itself) so only explicit desktop users are allowed access to the DRI. 2) Develop a device independent indirect rendering module that plugs into the X server to utilize our 3D drivers. After getting some HW accel working, look at speeding up this path by utilizing Chromium-like technologies and/or shared memory for high level data. 3) Transition the direct rendering drivers to take full advantage of their user space DMA capabilities. This is a large amount of work, but something we should consider if step 1 can be achieved to the kernel team's satisfaction.
It is even possible the direct path could be obsoleted over the long term as step 2 becomes more and more streamlined. However: The applications that most people would want to 'trust' are things like quake or other closed source games, which makes the situation a little murkier. Yes, but is this really any worse than a typical install for these apps that requires root level access? From a game standpoint, think quake engine. The actual game doesn't need to tell the GL engine everything over and over again all the time. It tells it the basic stuff once, and then it just says render me. You don't need DRI for sending the render me command, you need DRI because you send each vertex separately. You could view the static geometry of quake levels as a single display list and ask for the whole thing to be rendered each frame. However, the reality of the quake type games is anything but - huge amounts of effort have gone into the process of figuring out (as quickly as possible) what minimal amount of work can be done to render the
Re: [Dri-devel] g++-3.0 fix for lib/GLU/libnurbs/nurbtess/quicksort.cc
Felix Kühling wrote: Hi, I had trouble when compiling DRI with g++ 3.0.4 and -O3, related to function inlining. The function swap is declared static globally in quicksort.cc. In function quicksort it is redeclared. The redeclaration prevents g++ from inlining the swap function. Instead it emits function calls. In contrast to g++ 2.95, the 3.0.4 compiler did not keep a copy of swap. I assume that it does not relate the global and the local declaration to the same function. This leads to an undefined symbol as soon as a program using libGLU (like TuxRacer) is started. The fix is simple. Just leave out the useless local redeclaration of swap. This allows inlining the swap function in both 2.95 and 3.0.4. Note I checked all this in the assembler output. I'm just not sure whether the problem should be regarded as a g++ bug or not. The patch is attached. I fixed this in the Mesa tree but forgot to propagate it to the DRI tree. It's fixed now. Thanks. -Brian
Re: [Dri-devel] Mach64 dma fixes
On 2002.05.27 16:28 Jens Owen wrote: ... If we do get some type of indirect rendering path working quicker, then perhaps we could tighten up these defaults so that the usage model required explicit administrative permission for a user before being allowed access to direct rendering. However, after going to all this trouble of making a decent level of fallback performance, I would then want to push the performance envelope for those processes that did meet the criteria for access to direct rendering resources, and soften the security requirements for just those processes. These could possibly be users that have been given explicit permission, and the X server itself (doing HW accelerated indirect rendering). There would really be three prongs of attack for this approach: 1) Audit the current DRI security model and confirm that it is strong enough to prevent unauthorized users from gaining access to the DRI mechanisms. Work with distros to tighten up the usage model (and possibly the DRI security mechanism itself) so only explicit desktop users are allowed access to the DRI. 2) Develop a device independent indirect rendering module that plugs into the X server to utilize our 3D drivers. After getting some HW accel working, look at speeding up this path by utilizing Chromium-like technologies and/or shared memory for high level data. 3) Transition the direct rendering drivers to take full advantage of their user space DMA capabilities. This is a large amount of work, but something we should consider if step 1 can be achieved to the kernel team's satisfaction. It is even possible the direct path could be obsoleted over the long term as step 2 becomes more and more streamlined. ... Jens, if I understood correctly, basically you're suggesting having the OpenGL state machine in the X server process context, and therefore the GL drivers too, and most of the data (textures, display lists).
So there would be no layering between the DMA buffer construction and its submission, as both things would be carried out by the GL drivers. This means that we would have a single driver model instead of three. But the GLX protocol isn't good for this, is it? Hence the need for shared memory for big data. Am I getting the right picture, or am I way off..? José Fonseca PS: It would be nice to discuss these issues in tonight's meeting.
Re: [Dri-devel] g++-3.0 fix for lib/GLU/libnurbs/nurbtess/quicksort.cc
Alexander Stohr wrote: Good fix, Felix. I do hate local function prototypes. It's just bad coding style and laziness. Further, it shows a critical lack of knowledge of the header file organisation. They are never verified against the implementation by the compiler and might be overlooked rather quickly when the function API gets modified. There is only one way of eliminating those flaws: tuning the compiler warnings to a rather verbose level and letting the compiler consider them as errors. Just one question: Would you (and the XFree86 and the kernel folks) allow me to rework all your sources to that degree, touching lots of code lines just to let the compiler report a few more warnings? Most of them will relate to situations that are really not dangerous, and this will possibly not unveil even a single bug at all when compiling, possibly not now and not in any future. I'm always happy to fix code that causes warnings. I routinely compile Mesa using gcc's most pedantic error/warning options. Occasionally compiling with g++ produces even more warnings. I'd suggest starting with a few isolated modules or directories. -Brian
Re: [Dri-devel] Re: DRI FAQ
On Mon, 27 May 2002 15:53:49 +0200 roussel jerome [EMAIL PROTECTED] wrote: I am translating the DRI FAQ into French and I am trying to correct it or simplify it. Do bear in mind that I am re-doing the website. I will accept any other translations of documents you wish to do also.
Re: [Dri-devel] Mach64 dma fixes
On Mon, 27 May 2002, Keith Whitwell wrote: Linus Torvalds wrote: Much heavier. For some yet unexplained reason, a P4 takes about 1us to do a simple system call. That's on a 1.8GHz system, so it basically implies that a P4 takes 1800 cycles to do an int 0x80 + iret, which is just ludicrous. A 1.2GHz Athlon does the same in 0.2us, i.e. around 250 cycles (the 200+ cycles also matches a Pentium reasonably well, so it's really the P4 that stands out here). This is remarkable. I thought things were getting better, not worse. In general, they are. I suspect the P4 system call slowness is just another artifact of some first-generation issues - the same way the P4 tends to be limited when it comes to shifts etc. It will get fixed eventually. And running at 3GHz+ makes some CPU cycles seem cheap if you can make up for them elsewhere. However, you should put all of this into perspective: those 1800 cycles are just about the same time it takes to do one _single_ read from an ISA device. It's roughly the time it takes for one cacheline to be DMA'd over PCI. Linus
Re: [Dri-devel] Re: DRI FAQ
Ian Molton wrote: On Mon, 27 May 2002 15:53:49 +0200 roussel jerome [EMAIL PROTECTED] wrote: I am translating the DRI FAQ into French and I am trying to correct it or simplify it. Do bear in mind that I am re-doing the website. I will accept any other translations of documents you wish to do also. Thanks. It takes a lot of time, so I don't know if I will have the time to translate any other docs, because I do this without any software. Do you know of any software for translating DocBook docs? For the time being, I also have a general overview in French, but it doesn't add anything to the actual docs. I will change it a little and send it to you, but you will have to wait a few weeks... I am not an expert with DocBook (and I do this in my spare time; I am a student in telecom in Paris), so if someone wants to help me with the formatting of the files (with the different outputs), I would be pleased. Thanks, Roussel Jérôme.
Re: [Dri-devel] Mach64 dma fixes
From a game standpoint, think quake engine. The actual game doesn't need to tell the GL engine everything over and over again all the time. It tells it the basic stuff once, and then it just says render me. You don't need DRI for sending the render me command, you need DRI because you send each vertex separately. You could view the static geometry of quake levels as a single display list and ask for the whole thing to be rendered each frame. However, the reality of the quake type games is anything but - huge amounts of effort have gone into the process of figuring out (as quickly as possible) what minimal amount of work can be done to render the visible portion of the level at each frame. Quake generates very dynamic data from quite a static environment in the name of performance... I think I understand... even though Linus is referring to Quake's wire protocol here, you are pointing out that the real challenge is the underlying game engine, which is highly optimized for that specific application. Am I correct? I think the multiplayer aspects of the game are a separate issue. I'm talking about the difference between a big display list with the whole quake level in it and the visibility/bsp-tree/whatever-new-technique coding that quake and other games use to squeeze as much as possible out of the hardware. It may be that simple visibility issues are pretty well understood now, and that the competition between game engines is moving to the shading engines (and physics engines, if the reports about Doom 3 are right). Keith
Re: [Dri-devel] Indirect rendering strangeness / question
Mesa should handle rendering into any visual, including overlay planes. I think we just have to add some missing bits to the server-side GLX/Mesa code to support overlay rendering. But I don't have any hardware (such as FireGL) to test/fix this. Stand-alone Mesa has supported overlay rendering since 1996. If the X server advertises the SERVER_OVERLAY_VISUALS convention (a special root window atom) Mesa will query the server's overlay visual properties and do the right thing. Ian, if you run 'xprop -root | grep SERVER_OVERLAY_VISUALS' is anything found? -Brian Jens Owen wrote: Brian, Is it possible some behavioral changes were introduced into the internal GLX interface when you cleaned up this code a while back? The root of the issue appears to be how the two distinct sets of visuals (those supported by Mesa's SW renderer and those supported by a particular HW driver) are resolved into a single set of visuals that are advertised by the X server. In the original DRI implementation, the driver callback into GLX via the GlxSetVisualConfigs entry would allow the driver to notify the GLX layer of *all* the visuals it was capable of supporting, regardless of what the Mesa SW renderer could handle. The GLX layer would then externally advertise only the visuals that both the Mesa SW renderer and the driver supported. However, the newer code in init_visuals appears to advertise the *full* list of visuals the driver gives it, regardless of what Mesa's SW renderer is capable of supporting. Have I interpreted this change correctly? If so, should all drivers now only advertise visuals they know are also supported by the Mesa SW renderer? Any insight you can offer on the current behavior of this interface is appreciated. Thanks, Jens Ian Romanick wrote: So, I've run into an interesting situation, and I'm wondering what should theoretically happen. The FireGL driver (closed source, from ATI) seems to be the only DRI based hardware driver that supports overlays.
When running apps on the localhost, everything works fine. The catch is when an app is run on a remote system and indirect rendering is used. In this case, Mesa is reported as the renderer (instead of FireGL2 / FireGL3), but glxinfo still reports the set of supported visuals that the hardware driver supports. The problem is that glXGetConfig reports that GLX_USE_GL is available in the overlay window, even though, as far as I can tell, the built-in Mesa renderer does not support this. I've looked at the code, but the path through it is somewhat confusing for the indirect case. It seems as though glXGetConfig uses the screenConfigs that is exported by the driver. So, what SHOULD happen? It seems that this has worked with all of the other drivers because the hardware drivers support a subset of the visuals supported by the built-in Mesa. Is this correct? I would ask this at the Monday IRC meeting, but I don't think I'll be available. Since it's a holiday in the US, is there even going to be a meeting? I ask because I know that most of the people that still come are not in the US... -- Jens Owen [EMAIL PROTECTED] Steamboat Springs, Colorado
Re: [Dri-devel] Mach64 dma fixes
Around 18 o'clock on May 27, Keith Whitwell wrote: I think the multiplayer aspects of the game are a separate issue. I'm talking about the difference between a big display list with the whole quake level in it and the visibility/bsp-tree/whatever-new-technique coding that quake and other games use to squeeze as much as possible out of the hardware. We had a big display-list vs immediate-mode war around 1990, and immediate mode won. It's just a lot easier to send the whole frame's worth of polygons each time than to try to edit display lists. Of course, this particular battle was framed by the scientific visualization trend of that era, where each frame was generated from a completely new set of data. In that context, stored mode graphics lose pretty badly. However, given our experience with shared memory transport for images, and given the tremendous differential between CPU and bus speeds these days, it might make some sense to revisit the current 3D architecture. A system where the shared memory commands are validated by a user-level X server and passed to the graphics engine with only a small kernel level helper for DMA would allow for a greater possible level of security than the current DRI model does today. This would also provide for accelerated 3D graphics for remote applications, something that DRI doesn't support today, and which would take some significant work to enable. I would hope that it could also provide a significantly easier configuration environment; getting 3D running with the DRI is still a significant feat for the average Linux user. The question is whether this would impact performance at all; we're talking a process-process context switch instead of process-kernel for each chunk of data. However, we'd eliminate the current DRI overhead when running multiple 3D applications, and we'd be able to take better advantage of SMP systems.
One trick would be to have the X server avoid reading much of the command buffer; much of that would make SMP performance significantly worse. Keith Packard, XFree86 Core Team, HP Cambridge Research Lab
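The architecture Keith sketches implies a client-filled command ring in shared memory that the X server drains, validates, and forwards to a small kernel DMA helper. A minimal single-producer/single-consumer ring in that spirit is shown below; the layout and names are invented for illustration and are not an actual DRI interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared-memory command ring: the client (producer) appends
 * commands, the X server (consumer) validates and submits them.  A real
 * implementation would also need memory barriers between the processes. */

#define RING_SIZE 1024 /* entries; must be a power of two */

struct cmd_ring {
    volatile uint32_t head; /* written only by the client  */
    volatile uint32_t tail; /* written only by the server  */
    uint32_t cmds[RING_SIZE];
};

/* Number of commands waiting for the server to validate/submit. */
uint32_t ring_pending(const struct cmd_ring *r)
{
    return (r->head - r->tail) & (RING_SIZE - 1);
}

/* Client side: enqueue one command; returns 0 if the ring is full. */
int ring_put(struct cmd_ring *r, uint32_t cmd)
{
    if (ring_pending(r) == RING_SIZE - 1)
        return 0; /* full: leave one slot free to distinguish full/empty */
    r->cmds[r->head & (RING_SIZE - 1)] = cmd;
    r->head++;
    return 1;
}

/* Server side: dequeue one command for validation; returns 0 if empty. */
int ring_get(struct cmd_ring *r, uint32_t *cmd)
{
    if (ring_pending(r) == 0)
        return 0;
    *cmd = r->cmds[r->tail & (RING_SIZE - 1)];
    r->tail++;
    return 1;
}
```

Because only the server's copy of each dequeued command is validated and handed to the DMA helper, the client never gets to hand raw commands to the hardware directly.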
Re: [Dri-devel] Radeon 7500 lockup
On Sun, May 26, 2002 at 06:50:11PM +0100, Tim Smith wrote: The tcl-0-0-branch really doesn't like 2D menus being popped up over the 3D drawing area; it locks up after a few of these with or without RADEON_NO_TCL and with or without page flipping enabled. Following Michel's suggestion I've captured logs for three occurrences of this happening and put them at

What 2d menus? From a window manager or in the app itself? I committed a patch yesterday that fixed crashes with overlapping / moving / resizing 3d windows. If you got the tree before then you might not have it. -- Michael.
[Dri-devel] Mach64 DRM: Permission denied: even for root?
Hi all Just took the latest (27.05) drm binary snapshot for Mach64. And cannot get DRI working any more. In XFree86.0.log I see:

drmOpenDevice: minor is 0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenDevice: minor is 0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenDevice: minor is 0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmGetBusid returned ''
(II) ATI(0): [drm] drmSetBusid failed (7, PCI:1:0:0), Permission denied
(EE) ATI(0): [dri] DRIScreenInit Failed

Even if I start just X from a root shell! How is it possible? What's wrong with my configuration? It seems DRM security reached the highest possible level :) Cheers, Sergey
Re: [Dri-devel] Mach64 dma fixes
Keith Packard wrote: We had a big display-list vs immediate-mode war around 1990 and immediate mode won. It's just a lot easier to send the whole frame worth of polygons each time than to try and edit display lists. Of course, this particular battle was framed by the scientific visualization trend of that era where each frame was generated from a completely new set of data. In that context, stored mode graphics lose pretty badly.

If you're referring to the OpenGL vs PEX war, there was more than technical issues weighing in...there was the reality that Microsoft *was* willing to support OpenGL. That made OpenGL a better cross-platform choice. Kind of ironic, but predictable, that Microsoft is now trying to sink OpenGL...but that's a thread for another group.

However, given our experience with shared memory transport for images, and given the tremendous differential between CPU and bus speeds these days, it might make some sense to revisit the current 3D architecture. A system where the shared memory commands are validated by a user-level X server and passed to the graphics engine with only a small kernel level helper for DMA would allow for a greater possible level of security than the current DRI model does today.

I wouldn't say we're lacking in security today; we're in good shape now.

This would also provide for accelerated 3D graphics for remote applications, something that DRI doesn't support today, and which would take some significant work to enable.

In relative scale, getting HW acceleration for indirect rendering is *much* smaller than the more aggressive architectural changes we're discussing. Let's just keep that in perspective. It might be a less aggressive first step to get the missing module(s) for HW accelerated indirect rendering going, then move to these types of more aggressive indirect methods. 
I would hope that it could also provide a significantly easier configuration environment; getting 3D running with the DRI is still a significant feat for the average Linux user.

Hmm. I would have agreed a year ago, but most of the distributions appear to have a good handle on making this just happen...when there is driver support.

The question is whether this would impact performance at all; we're talking a process-process context switch instead of process-kernel for each chunk of data. However, we'd eliminate the current DRI overhead when running multiple 3D applications, and we'd be able to take better advantage of SMP systems. One trick would be to have the X server avoid reading much of the command buffer; much of that would make SMP performance significantly worse.

The performance path I'd like to push hardest in the short term is direct rendering completely within the user space context. Your suggestions for optimizing an indirect path are great. That path would become much more critical than it is today, as general purpose processes would no longer have access to the *faster* direct path. -- Jens Owen, [EMAIL PROTECTED], Steamboat Springs, Colorado
Re: [Dri-devel] Mach64 dma fixes
José Fonseca wrote: On 2002.05.27 16:28 Jens Owen wrote: ... If we do get some type of indirect rendering path working quicker, then perhaps we could tighten up these defaults so that the usage model required explicit administrative permission for a user before being allowed access to direct rendering. However, after going to all this trouble of making a decent level of fallback performance, I would then want to push the performance envelope for those processes that did meet the criteria for access to direct rendering resources, and soften the security requirements for just those processes. This could possibly be users that have been given explicit permission, and the X server itself (doing HW accelerated indirect rendering). There would really be three prongs of attack for this approach: 1) Audit the current DRI security model and confirm that it is strong enough to prevent non-authorized users from gaining access to the DRI mechanisms. Work with distros to tighten up the usage model (and possibly the DRI security mechanism itself) so only explicit desktop users are allowed access to the DRI. 2) Develop a device independent indirect rendering module that plugs into the X server to utilize our 3D drivers. After getting some HW accel working, look at speeding up this path by utilizing Chromium-like technologies and/or shared memory for high level data. 3) Transition the direct rendering drivers to take full advantage of their user space DMA capabilities. This is a large amount of work, but something we should consider if step 1 can be achieved to the kernel team's satisfaction. It is even possible the direct path could be obsoleted over the long term as step 2 becomes more and more streamlined. ... Jens, if I understood correctly, basically you're suggesting having the OpenGL state machine in the X server process context, and therefore the GL drivers too, and most of the data (textures, display lists). 
So there would be no layering between the DMA buffer construction and its submission - as both things would be carried out by the GL drivers. This means that we would have a single driver model instead of 3. But the GLX protocol isn't good for this, is it? Hence the need for shared memory for big data. Am I getting the right picture, or am I way off..?

Sorry, we covered a lot of things at once. Let me simplify... 1) We loosen security requirements for 3D drivers. This will allow far less data copying, memory mapping/unmapping and system calls. Many modern graphics chips can have their data managed completely in a user space AGP ring buffer, removing the need to call the kernel module at all. The primary limitation that has kept us from pursuing these implementations so far has been security holes with AGP blits. 2) We implement HW accelerated indirect rendering for those processes that don't have the permissions to use the new optimized drivers. Most of the fancy architecture discussions we had here are related to making indirect rendering faster...and could be done as a follow-on to basic HW accelerated indirect rendering. The first and easiest way to implement this is to make the X server use our direct rendering drivers. I'm not really advocating going to a different model at all. Rather, I'm just advocating moving more of the kernel side validation we're currently doing back into the 3D driver. PS: It would be nice to discuss these issues in tonight's meeting. I guess that's starting now. It's at irc.openproject.net #dri-devel for those interested in joining in... -- Jens Owen, [EMAIL PROTECTED], Steamboat Springs, Colorado
Re: [Dri-devel] Mach64 dma fixes
On Mon, 27 May 2002 15:01:47 -0600 Jens Owen [EMAIL PROTECTED] wrote: 1) We loosen security requirements for 3D drivers. This will allow far less data copying, memory mapping/unmapping and system calls. Many modern graphics chips can have their data managed completely in a user space AGP ring buffer, removing the need to call the kernel module at all. The primary limitation that has kept us from pursuing these implementations so far has been security holes with AGP blits.

I don't pretend to understand everything here, but wouldn't it be more secure, and STILL blindingly fast, to set up the data in userspace, and trigger the AGP DMA / blits from kernel space with some bounds checking? Surely 1 system call per DMA isn't that bad?
Re: [Dri-devel] Mach64 dma fixes
Ian Molton wrote: On Mon, 27 May 2002 15:01:47 -0600 Jens Owen [EMAIL PROTECTED] wrote: 1) We loosen security requirements for 3D drivers. This will allow far less data copying, memory mapping/unmapping and system calls. Many modern graphics chips can have their data managed completely in a user space AGP ring buffer, removing the need to call the kernel module at all. The primary limitation that has kept us from pursuing these implementations so far has been security holes with AGP blits.

I don't pretend to understand everything here, but wouldn't it be more secure, and STILL blindingly fast, to set up the data in userspace, and trigger the AGP DMA / blits from kernel space with some bounds checking? Surely 1 system call per DMA isn't that bad?

That's what we do for the cases where we can do so securely. All the vertex data on most cards takes this route. Some data can't go this way because the buffers are subject to attack after the checking has been performed but before they reach the hardware. Whether specific operations are vulnerable or not depends on the details of the card's DMA engine. Keith
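Keith's point is the classic check-then-use race: if a command sits in memory the client can still write, it can be rewritten between the kernel's bounds check and the hardware's read. A toy sketch of the safe pattern follows: copy the command into kernel-private storage first, then validate the private copy. The command format, sizes, and all function names here are invented for illustration and are not the real DRM interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VRAM_APERTURE_SIZE (8u * 1024u * 1024u) /* assumed 8MB of VRAM */

/* One blit command in our toy command format. */
struct blit_cmd {
    uint32_t dst_offset; /* byte offset into VRAM */
    uint32_t size;       /* bytes to write        */
};

/* Returns 1 if the blit stays inside the VRAM aperture.
 * Written to be safe against dst_offset + size overflowing. */
int blit_in_bounds(const struct blit_cmd *cmd)
{
    if (cmd->size > VRAM_APERTURE_SIZE)
        return 0;
    if (cmd->dst_offset > VRAM_APERTURE_SIZE - cmd->size)
        return 0;
    return 1;
}

/* Copy-then-check: because validation runs on a private copy, a client
 * thread scribbling on the original buffer after submission can no
 * longer bypass the check -- the race described above. */
int submit_blit(const struct blit_cmd *user_cmd)
{
    struct blit_cmd safe;
    memcpy(&safe, user_cmd, sizeof(safe)); /* kernel-private copy */
    if (!blit_in_bounds(&safe))
        return -1; /* reject: would write outside VRAM */
    /* ... hand 'safe' (never user_cmd) to the DMA engine ... */
    return 0;
}
```

The cost of this pattern is exactly the extra copying Jens wants to avoid, which is why some data paths can stay in user space (where a race only corrupts the client's own rendering) while others cannot.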
Re: [Dri-devel] Mach64 DRM: Permission denied: even for root?
On 27 May 2002, Sergey V. Udaltsov wrote: Hi all Just took the latest (27.05) drm binary snapshot for Mach64. And cannot get DRI working any more. In XFree86.0.log I see:

drmOpenDevice: minor is 0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenDevice: minor is 0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmOpenDevice: minor is 0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 7, (OK)
drmGetBusid returned ''
(II) ATI(0): [drm] drmSetBusid failed (7, PCI:1:0:0), Permission denied
(EE) ATI(0): [dri] DRIScreenInit Failed

Even if I start just X from a root shell! How is it possible? What's wrong with my configuration? It seems DRM security reached the highest possible level :) Cheers, Sergey

That's odd. Do you have a mode specified for the DRI device in XF86Config? -- Leif Delgass http://www.retinalburn.net
[Dri-devel] A few mach64 tests
Well, here's my small contribution to the mach64 dev (sorry but I'm FAR behind you in programming, especially compared to people like Jose, Leif or Linus :)). So with mach64-0-0-4-branch checked out and compiled monday morning (french time), a Mobility M1 with 8Mo, athlon-4 900Mhz, running at 1024x768:16 most of the time. To sum it up: nice work! Too bad the new Tuxracer won't work (I was planning to buy it). Maybe the big texture problem can be interesting to look up. Now more details and some facts:

-lspci says: 01:00.0 VGA compatible controller: ATI Technologies Inc Rage Mobility P/M AGP 2x (rev 64) (prog-if 00 [VGA]) Subsystem: Compaq Computer Corporation: Unknown device 005f Flags: bus master, stepping, medium devsel, latency 66, IRQ 9 Memory at f500 (32-bit, non-prefetchable) [size=16M] I/O ports at 9000 [size=256] Memory at f410 (32-bit, non-prefetchable) [size=4K] Expansion ROM at unassigned [disabled] [size=128K] Capabilities: [50] AGP version 1.0 Capabilities: [5c] Power Management version 1

-glxinfo says: name of display: :0.0 display: :0 screen: 0 direct rendering: Yes server glx vendor string: SGI server glx version string: 1.2 server glx extensions: GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_import_context client glx vendor string: SGI client glx version string: 1.2 client glx extensions: GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_import_context GLX extensions: GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_import_context OpenGL vendor string: Gareth Hughes OpenGL renderer string: Mesa DRI Mach64 20020227 [Rage Pro] AGP 1x x86/MMX/3DNow! 
OpenGL version string: 1.2 Mesa 4.0.2 OpenGL extensions: GL_ARB_imaging, GL_ARB_multitexture, GL_ARB_transpose_matrix, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_blend_color, GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_clip_volume_hint, GL_EXT_convolution, GL_EXT_compiled_vertex_array, GL_EXT_histogram, GL_EXT_packed_pixels, GL_EXT_polygon_offset, GL_EXT_rescale_normal, GL_EXT_texture3D, GL_EXT_texture_object, GL_EXT_vertex_array, GL_IBM_rasterpos_clip, GL_MESA_window_pos, GL_NV_texgen_reflection, GL_SGI_color_matrix, GL_SGI_color_table glu version: 1.3 glu extensions: GLU_EXT_nurbs_tessellator, GLU_EXT_object_space_tess Visual ID: 23 depth=16 class=TrueColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=0 accum: redSize=0 greenSize=0 blueSize=0 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=None Opaque. Visual ID: 24 depth=16 class=TrueColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=8 accum: redSize=0 greenSize=0 blueSize=0 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=Slow Opaque. Visual ID: 25 depth=16 class=TrueColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=0 accum: redSize=16 greenSize=16 blueSize=16 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=Slow Opaque. Visual ID: 26 depth=16 class=TrueColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=8 accum: redSize=16 greenSize=16 blueSize=16 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=Slow Opaque. 
Visual ID: 27 depth=16 class=DirectColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=0 accum: redSize=0 greenSize=0 blueSize=0 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=None Opaque. Visual ID: 28 depth=16 class=DirectColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=8 accum: redSize=0 greenSize=0 blueSize=0 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=Slow Opaque. Visual ID: 29 depth=16 class=DirectColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=0 accum: redSize=16 greenSize=16 blueSize=16 alphaSize=0 multiSample=0 multiSampleBuffers=0 visualCaveat=Slow Opaque. Visual ID: 2a depth=16 class=DirectColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=8 accum: redSize=16 greenSize=16 blueSize=16
[Dri-devel] Log of today's IRC
Here's today's log, if someone wants to post it on the site. I might be able to fill in some of the missing dates too, does someone have a list of the missing ones? -- Leif Delgass http://www.retinalburn.net

BEGIN LOGGING AT Mon May 27 17:00:08 2002
May 27 17:00:08 --- Topic for #dri-devel is DRI/DRM/Mesa driver development forum | DRI developmental QA meeting every Monday at 4:00pm EST (2100h UTC)
May 27 17:00:08 --- Topic for #dri-devel set by ChanServ at Tue May 14 22:55:01
May 27 17:01:25 -- jens (~[EMAIL PROTECTED]) has joined #dri-devel
May 27 17:02:21 <ldelgass> hi jens
May 27 17:02:35 <jens> hi ldelgass
May 27 17:02:52 <ldelgass> I wonder if Linus will join us ;)
May 27 17:03:24 <jens> it's been great to get his input on the mailing list.
May 27 17:04:34 -- AndyF (~[EMAIL PROTECTED]) has joined #dri-devel
May 27 17:05:10 <jens> it's a holiday here in the US, so I don't know how many people we'll get today.
May 27 17:05:58 -- jrfonseca (~[EMAIL PROTECTED]) has joined #dri-devel
May 27 17:06:04 <jrfonseca> Hi gang!
May 27 17:06:08 <ldelgass> jrfonseca: hello
May 27 17:06:36 <xorl> radeon-20020527-linux.i386.tar.bz2 (1493 KB)
May 27 17:06:40 <xorl> is there anyway
May 27 17:06:45 <xorl> to have that working in freebsd?
May 27 17:07:33 -- fxkuehl ([EMAIL PROTECTED]) has joined #dri-devel
May 27 17:07:52 <anholt> xorl: if that's the tcl driver, you would need a TCL DRM and compile bzflag for linux.
May 27 17:08:06 <anholt> I had a tcl drm for bsd at one point, but it has rotted.
May 27 17:08:18 <xorl> http://dri.sourceforge.net/
May 27 17:08:25 <xorl> Linux Intel x86 Packages
May 27 17:08:29 <xorl> :)
May 27 17:08:38 <anholt> of course, when you have a tcl drm for bsd, you might as well compile the tcl 3d driver and use that.
May 27 17:08:45 <jens> anholt: how's OS templating? Any progress?
May 27 17:09:05 <xorl> anholt, how can i compile tcl drm for bsd?
May 27 17:09:05 <anholt> jens: well, I'm kind of waiting for a response from you guys. Is it a good plan? 
May 27 17:09:09 <xorl> where can i locate this
May 27 17:09:37 <anholt> xorl: it's not written at this point.
May 27 17:09:47 <jens> anholt: sorry, I thought we gave strong tentative...yes :-)
May 27 17:09:50 <xorl> :(:(
May 27 17:10:08 <xorl> anholt how long would it take you?
May 27 17:10:24 <jens> The only tentative part was seeing if we can get complete isolation of OS parts from device specific parts.
May 27 17:10:57 <anholt> A day? I've got the TCL code in almost-decent shape and I would have to check to see if the tcl drm has been updated in CVS since then.
May 27 17:11:11 <anholt> jens: that'll take another two minutes.
May 27 17:11:14 <leahcim> is this like tnl_dd/t_dd_dmatmp2.h type stuff in kernel?
May 27 17:11:22 <xorl> anholt; would you do it for me?
May 27 17:11:51 <anholt> jens: then the only OS part of it is the fact that you must use the given macros to keep os-independent as people continue working on the code.
May 27 17:12:12 <xorl> Compiling...install.sh: 546: Syntax error: Bad fd number
May 27 17:12:31 <anholt> xorl: when I get some time. I've been playing with making glideless 3dfx recently, and it's so much more interesting than copy'n'pasting the DRM code around.
May 27 17:12:50 <jrfonseca> jens: I've just caught up with your email. When you mentioned the direct path could be obsoleted, I got the idea that you were suggesting a model based on indirect rendering alone.
May 27 17:13:28 <jens> anholt: there could be three phases...
May 27 17:13:49 <jens> anholt: 1st) prototype with one driver, proof of concept
May 27 17:14:16 <jens> anholt: 2nd) move existing drivers to new template technology
May 27 17:14:30 <jens> 3rd) move active new development to new template technology
May 27 17:15:08 <jens> anholt: does this address your concern about "as people continue working on the code"?
May 27 17:15:46 <anholt> jens: what would you want for a proof of concept? To me, the current BSD DRM is very, very close. 
May 27 17:15:51 <jens> jrfonseca: Over the long haul we could end up with an indirect only model...but I don't see that happening anytime soon..
May 27 17:16:02 <xorl> anholt; =)
May 27 17:16:12 -- alanh (~[EMAIL PROTECTED]) has joined #dri-devel
May 27 17:16:19 <anholt> jens: How about if I make a tarball of linux/drm/ and bsd/drm/ and shared/drm/ of what I think would be the almost-final result?
May 27 17:17:13 <jrfonseca> jens: Ok. But my interpretation is a possible end result?
May 27 17:17:14 -- Svartalf (~[EMAIL PROTECTED]) has joined #dri-devel
May 27 17:17:18 <jens> anholt: that would be great. I'll help getting some people to seriously look at it. I think people will be motivated when they realize that's how future DRM drivers might need to be written.
May 27 17:17:19 <Svartalf> 'lo all!
May 27 17:17:43 <jrfonseca> Svartalf: Hi
Re: [Dri-devel] A few mach64 tests
Thanks for the report! Concerning textures: the AGP texturing code is kind of a proof-of-concept and isn't very efficient yet. We need to work on reducing the amount of texture swapping that is happening. You can try AGP 2x: use Option AgpMode 2 in the Device section of XF86Config. Also, I have some code to increase the max texture size exported, but usually an app would refuse to run or use smaller textures if that's the problem. I should resurrect that bit of code. Something you can try to get a feel for what's happening with textures: export this environment var: LIBGL_PERFORMANCE_BOXES=1 You'll get some colored boxes and bars in the upper left corner of the GL window. From left to right:
- A red box means glFinish was called to wait for rendering to complete.
- A short bar showing the approx. ratio of AGP/local textures used for a frame (AGP - blue, local - purple) - based on the number of textures bound to the texture units, not size.
- A yellow box means texture(s) were swapped during the frame.
- Then a split bar for texture uploads, AGP and local (same colors as above), based on the size of the uploads.
- On another row, a pink-ish bar for the number of DMA buffers used. Note that many buffers are sent partially filled, so this is only an approximation of the size of the data.
I'm guessing that in most of the places where you see slowdowns, you'll see lots of texture swapping (yellow box) and/or the texture upload bars will be long (large texture uploads). Sound problems could be related to this and/or high CPU usage in wait loops. If things get _very_ slow, it's possible that something is requiring software rendering, like the GL_BLEND texture environment, but there isn't an easy way to test this; you'd have to look at the app's source or try changing app options. The first priority is to eliminate lockups. I haven't tried tuxkart, I'll have to check that out. Do you see any messages in the system log? Can you kill the app or kill X? 
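For reference, the AGP mode knob Leif mentions would go in the Device section roughly like this (the Identifier and Driver values below are placeholders for your own config, and the exact option spelling may vary by driver version, so check your XF86Config documentation):

```
Section "Device"
    Identifier "Rage Mobility"      # placeholder; keep your existing Identifier
    Driver     "ati"
    Option     "AgpMode" "2"        # AGP 2x, as suggested above
EndSection
```

The performance-boxes overlay is then enabled per-app from a shell, e.g. `LIBGL_PERFORMANCE_BOXES=1 glxgears`.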
As for tuxracer, does it say anything about what it's looking for in the GLX visual that isn't supported? On Tue, 28 May 2002, Bernard Cafarelli wrote: Well, here's my small contribution to the mach64 dev (sorry but I'm FAR away beyond you in programming, especially compared to people like Jose,Leif or Linus :)). Well, I _definitely_ wouldn't put myself in the same class as Linus, but thanks. :) The biggest problem is the learning curve required to get familiar with the DRI structure, but Jose's Developer FAQ is a great starting point. So with mach64-0-0-4-branch checked out and compiled monday morning (french time), a Mobility M1 with 8Mo, athlon-4 900Mhz, running at 1024x768:16 most of the time. To sum it up: nice work! Too bad the new Tuxracer won't work (I was planning to buy it). Maybe the big texture problem can be interesting to look up. Now more details and some facts: -lspci says: 01:00.0 VGA compatible controller: ATI Technologies Inc Rage Mobility P/M AGP 2x (rev 64) (prog-if 00 [VGA]) Subsystem: Compaq Computer Corporation: Unknown device 005f Flags: bus master, stepping, medium devsel, latency 66, IRQ 9 Memory at f500 (32-bit, non-prefetchable) [size=16M] I/O ports at 9000 [size=256] Memory at f410 (32-bit, non-prefetchable) [size=4K] Expansion ROM at unassigned [disabled] [size=128K] Capabilities: [50] AGP version 1.0 Capabilities: [5c] Power Management version 1 -glxinfo says: name of display: :0.0 display: :0 screen: 0 direct rendering: Yes server glx vendor string: SGI server glx version string: 1.2 server glx extensions: GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_import_context client glx vendor string: SGI client glx version string: 1.2 client glx extensions: GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_import_context GLX extensions: GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_EXT_import_context OpenGL vendor string: Gareth Hughes OpenGL renderer string: Mesa DRI Mach64 20020227 [Rage Pro] AGP 1x x86/MMX/3DNow! 
OpenGL version string: 1.2 Mesa 4.0.2 OpenGL extensions: GL_ARB_imaging, GL_ARB_multitexture, GL_ARB_transpose_matrix, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_blend_color, GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_clip_volume_hint, GL_EXT_convolution, GL_EXT_compiled_vertex_array, GL_EXT_histogram, GL_EXT_packed_pixels, GL_EXT_polygon_offset, GL_EXT_rescale_normal, GL_EXT_texture3D, GL_EXT_texture_object, GL_EXT_vertex_array, GL_IBM_rasterpos_clip, GL_MESA_window_pos, GL_NV_texgen_reflection, GL_SGI_color_matrix, GL_SGI_color_table glu version: 1.3 glu extensions: GLU_EXT_nurbs_tessellator, GLU_EXT_object_space_tess Visual ID: 23 depth=16 class=TrueColor bufferSize=16 level=0 renderType=rgba doubleBuffer=1 stereo=0 rgba: redSize=5 greenSize=6 blueSize=5 alphaSize=0 auxBuffers=0 depthSize=16 stencilSize=0 accum:
Re: [Dri-devel] Cards Specs
Tim Rowley wrote: Jens Owen wrote: It would be interesting to hear more details from their developers regarding the comment they put in their README: If an OpenGL application is forcibly terminated by closing the X connection then there may be leftovers on the desktop. This appears to be a problem in the DRI infrastructure the driver is based upon.

The problem we were seeing is that closing the connection didn't update the context stamp, so the dri driver could still get the lock without talking to the server. This meant frames were still in flight when xlib finally realized the connection was closed and exited. We saw the same behavior with a Radeon card. The easiest way to reproduce this is to start gears and then kill the connection (wm or xkill).

Are there any Kyro developers listening on this list? Yes. - Tim Rowley [EMAIL PROTECTED]

Good to hear from you Tim. I couldn't reproduce this with the Radeon TCL driver I have installed (page flipping enabled) running on a KDE desktop. Could you try this patch on your system? It bumps the clipstamp at cleanup. -- Jens Owen, [EMAIL PROTECTED], Steamboat Springs, Colorado

Index: xc/programs/Xserver/GL/dri/dri.c
===
RCS file: /cvsroot/dri/xc/xc/programs/Xserver/GL/dri/dri.c,v
retrieving revision 1.37.2.1
diff -u -r1.37.2.1 dri.c
--- xc/programs/Xserver/GL/dri/dri.c	21 May 2002 17:31:34 -	1.37.2.1
+++ xc/programs/Xserver/GL/dri/dri.c	28 May 2002 03:34:22 -
@@ -987,6 +987,10 @@
     pDRIDrawablePriv = DRI_DRAWABLE_PRIV_FROM_WINDOW(pWin);
     if (pDRIDrawablePriv->drawableIndex != -1) {
+	/* bump stamp to force outstanding 3D requests to resync */
+	pDRIPriv->pSAREA->drawableTable[pDRIDrawablePriv->drawableIndex].stamp
+	    = DRIDrawableValidationStamp++;
+
 	/* release drawable table entry */
 	pDRIPriv->DRIDrawables[pDRIDrawablePriv->drawableIndex] = NULL;
     }
Re: [Dri-devel] pte/highmem changes for DRM kernel modules ...
Has anyone taken a peek at these patches?

Stefan Dirsch wrote: Hi Just as an introduction: I'm maintaining the XFree86 packages at SuSE and therefore I'm also responsible for XFree86 4.x/DRI support on SuSE Linux. I would like to let you know about some pte/highmem changes in the SuSE kernel of SuSE 8.0 and in upcoming upstream kernel releases. Andrea Arcangeli - you might know him as a kernel developer - wrote the document. Additionally I attach two patches we apply for the DRM XFree86 modules we use for SuSE 8.0. The first one is the required pte/highmem patch for the SuSE 8.0 kernel. The second one is required since kernel 2.4.18 pre7/pre8 and we needed it for an update kernel. Hope you consider integrating these changes in upcoming DRM releases. If you have any questions feel free to contact me directly as I do not read this mailing list. Maybe I should better do this as DRI maintainer at SuSE. :-) Stefan

Public Key available. Stefan Dirsch (Res. Dev.) SuSE Linux AG, Deutschherrnstr. 15-19, D-90429 Nuernberg, Germany, Tel: 0911-740530, FAX: +49 911 741 77 55, http://www.suse.de

Attachments: pte-highmem-drivers.html (text/html), p_drm_pte-highmem.diff (text/plain), p_drm_page.diff (text/plain)

-- Jens Owen, [EMAIL PROTECTED], Steamboat Springs, Colorado
Re: [Dri-devel] Indirect rendering strangeness / question
Brian Paul wrote: Mesa should handle rendering into any visual, including overlay planes.

That's right, I forgot you upgraded the SW rasterizers to handle just about any visual up to a certain depth. -- Jens Owen, [EMAIL PROTECTED], Steamboat Springs, Colorado
[Dri-devel] r128 texture problems
Hello, I am new to DRI hacking, and I am looking for some guidance. I am trying to get the Neverwinter Nights Toolset to work under wine (http://nwwine.beergeek.net). I have hacked wine a bit to get it working for most people, but I am having problems with the r128 drivers that other people using different cards are not seeing. I am using a Rage Mobility 128 with the latest CVS DRI and X 4.2. I have two problems.

1) Some textures don't appear (or may not repeat) when in 1600x1200 mode (they work fine when X is in 1280x1024 or 1024x768). Wine makes the following series of calls:
# Set GL_TEXTURE_WRAP_S to GL_REPEAT
trace:opengl:wine_glTexParameteri (3553, 10242, 10497)
# Set GL_TEXTURE_WRAP_T to GL_REPEAT
trace:opengl:wine_glTexParameteri (3553, 10243, 10497)
When in 1600x1200 the 1st call produces a GL error of GL_INVALID_VALUE (no error in other resolutions); the second call seems to succeed. So I think that failing call is the problem. However, I looked at the mesa source in CVS and this doesn't make sense; these parameters seem valid. Is the GL_INVALID_VALUE being produced at a higher level than _mesa_TexParameterfv? How could this be a product of the resolution? The above problem occurs in both the XFree 4.1 version and in CVS.

2) Assert fails in r128_texstate.c. The texture that r128UpdateTextureUnit has been passed is null. I think what happens is that the texture is deleted and set to null. Then a new texture is bound to the old spot with glBindTexture, but no new texture is created. r128_texstate.c:494: r128UpdateTextureUnit: Assertion `t' failed. This happens when a window is created on top of another window and they both have renderings in them. The new window works fine (I believe rendering to the window below it is stopped while the new window is open), but when the new window is closed (and the old window becomes the focus), this assertion fails. This seems to be a regression. The XFree 4.1 version does not have this problem. 
I would greatly appreciate it if one of you could point me in the right direction so I can fix the bug in the driver. Thanks! ryan
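For context on problem 2: GL's texture-name semantics say that binding a name whose object was deleted must create a fresh texture object — a bound texture unit should never be left pointing at NULL, which is exactly the invariant the r128 assertion checks. A toy model of that rule (invented names, not the Mesa/r128 code) that any driver-side fix would have to preserve:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of GL texture-name lifetime: delete drops the object, and a
 * later bind of the same name creates a new object on demand.  A driver
 * that caches the old object pointer across the delete can end up with
 * NULL bound -- the r128UpdateTextureUnit assertion failure above. */

#define MAX_NAMES 16

struct tex_obj { int name; };

static struct tex_obj *objects[MAX_NAMES]; /* object per name; NULL = none */

/* glBindTexture-like: never yields NULL for a valid name. */
struct tex_obj *toy_bind(int name)
{
    if (name <= 0 || name >= MAX_NAMES)
        return NULL; /* invalid name in this toy model */
    if (objects[name] == NULL) {
        /* Name has no object (fresh, or deleted earlier): create one. */
        objects[name] = malloc(sizeof(struct tex_obj));
        objects[name]->name = name;
    }
    return objects[name];
}

/* glDeleteTextures-like: drop the object, the name stays reusable. */
void toy_delete(int name)
{
    if (name > 0 && name < MAX_NAMES) {
        free(objects[name]);
        objects[name] = NULL;
    }
}
```

If the regression is that the driver skips the "create on rebind" step after a delete, the assertion would fire exactly in the window-close scenario Ryan describes.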