Re: [Mesa3d-dev] DRI SDK and modularized drivers.
On Mon, Mar 22, 2010 at 2:24 AM, Luc Verhaegen wrote:
>> In particular, the Mesa core <-> classic driver split only makes sense if
>> there are enough people who are actually working on those drivers who
>> would support the split. Otherwise, this is bound to lead straight
>> into hell.
>>
>> In a way, the kernel people got it right: put all the drivers in one
>> repository, and make building the whole package and having parallel
>
> "put all the drivers in one repository"?
>
> So, all of:
> * drm
> * firmware
> * libdrm
> * xorg
> * mesa/dri
> * mesa/gallium
> * libxvmc
> * libvdpau
> (add more here)
> of the same driver stack, in one repository?

Why not? Mind you, I'm not advocating for any change at all, but as long as you feel the need to move stuff around, why not try finding a goal that people actually find useful? Of course, my suggestion is probably crap, too.

[snip]

> The real question is: where is the most pain, and how can we reduce it.
> And the most pain is between the driver specific parts.

Nobody has ever had to feel the pain of a separation between Mesa core and drivers. And since a git log I've just done tells me that you have committed only twice to the Mesa repository within the last year or so, maybe you should listen to the opinion of people who *have* been active in the Mesa tree when it comes to that subject, and are working on drivers that are probably significantly more involved than whatever Unichrome does.

>> 2) it wouldn't actually solve the DRM problems, because we want to
>> have the DRM in our codebase, and the kernel people want to have it in
>> theirs.
>
> The kernel people can have theirs. What stops anyone from getting the
> drm code of a released driver stack into the next kernel version?
>
> But when anyone decides they need a new driver stack which requires a
> new drm module, it should be easy to replace the stock kernel module.

And that has worked so well in the past.
cu,
Nicolai

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance. See why
Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [Mesa3d-dev] A tiny problem regarding with the mesa source layout and DRI driver development
On Mon, Mar 22, 2010 at 1:29 PM, LiYe wrote:
> I'm interested in the OpenGL implementation and DRI driver development.
> Specifically, I want to learn how an OpenGL command is implemented, how
> it is converted into a direct rendering context, and how it is
> transferred to the hardware. I know this is a quite complicated and
> time-consuming task, but it would be great if I could start the learning
> curve with my newbie background. So I'm trying to look into the Mesa
> code. However, it seems quite large and monolithic and I cannot find a
> suitable breaking point. So I wrote this to ask for some experienced
> advice. For an overview of how DRI works in code (not in theory as
> explained in documents), where should I start?

I would suggest getting an IDE that has decent code browsing capabilities (I personally like to play with the KDevelop4 beta even though it's still a bit flaky) and just start stepping through your favourite driver in a debugger.

Note that your life will be less painful if you have a second machine from which you can SSH in, so that your gdb session doesn't live on the same X server as the OpenGL application that you're debugging.

cu,
Nicolai
Re: [Mesa3d-dev] DRI SDK and modularized drivers.
On Thu, Mar 18, 2010 at 5:38 PM, Luc Verhaegen wrote:
> So, identify the volatile interfaces, and the more stable interfaces,
> and then isolate the volatile ones, and then you come to only one
> conclusion.

Except that the Mesa core <-> classic driver interface also wants to change from time to time in non-trivial ways, and trying to force a separation there on people who don't want an additional set of compatibility issues to deal with is not exactly a friendly move.

It may seem e.g. like the DRM interface is the worst because of the rather large threads caused by a certain kernel developer's problems, but that doesn't mean problems wouldn't be created by splitting other areas. In particular, the Mesa core <-> classic driver split only makes sense if there are enough people actually working on those drivers who would support the split. Otherwise, this is bound to lead straight into hell.

In a way, the kernel people got it right: put all the drivers in one repository, and make building the whole package and having parallel installations trivial. The (only?) issues with that in X.org are that:
1) there is a cultural aversion due to the bad experience with the horrible pre-modularization setup, and
2) it wouldn't actually solve the DRM problems, because we want to have the DRM in our codebase, and the kernel people want to have it in theirs.

cu,
Nicolai
Re: Framebuffer read coherency (r3xx)
On Sun, Jun 8, 2008 at 10:02 PM, Nicolai Hähnle <[EMAIL PROTECTED]> wrote:
> A workaround like the R200 driver had all along in r200_span.c should fix
> the problem, *unless* we happen to be reading from the first row of the
> framebuffer.

I've committed a workaround to Mesa (00099731195b2e5b57b8bca6342a8a711e0e427a). However, the problem currently still exists in the DDX.

cu,
Nicolai
Re: [PATCH] [r300] Fix reordering of fragment program instructions and register allocation
I just realized I didn't send it to the list: There was yet another problem with the reordering of instructions. The attached patch (which is against my earlier patch) should fix this.

~Nicolai

On 3/18/07, Oliver McFadden <[EMAIL PROTECTED]> wrote:
Another thought; the same changes are probably needed for the vertprog code. I think there are also a lot of bugs there.

On 3/18/07, Oliver McFadden <[EMAIL PROTECTED]> wrote:
> This patch seems to break one of my longer fragment programs. I believe
> this is because it's running out of registers, but I haven't looked into
> it in detail yet.
>
> I think this patch should be committed, but directly followed by a patch
> to reduce the number of registers used.
>
> On 3/18/07, Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> > There were a number of bugs related to the pairing of vector and
> > scalar operations where swizzles ended up using the wrong source
> > register, or an instruction was moved forward and ended up overwriting
> > an aliased register.
> >
> > The new algorithm for register allocation is slightly conservative and
> > may run out of registers before it's strictly necessary. On the plus
> > side, it Just Works.
> >
> > Pairing of instructions is done whenever possible, and in more cases
> > than before, so in practice this change should be a net win.
> >
> > The patch mostly fixes glean/texCombine. One remaining problem is that
> > the code duplicates constants and parameters all over the place and
> > therefore quickly runs out of resources and falls back to software.
> > I'm going to look into that as well.
> >
> > Please test and commit this patch. If you notice any regressions,
> > please tell me (but the tests are looking good).
> > ~Nicolai

commit 1ec4703585171f504180425b65dfab92be2a7782
Author: Nicolai Haehnle <[EMAIL PROTECTED]>
Date:   Sun Mar 18 13:29:18 2007 +0100

    r300: Fix fragment program reordering

    Do not move an instruction that writes to a temp forward past an
    instruction that reads the same temporary.

diff --git a/src/mesa/drivers/dri/r300/r300_context.h b/src/mesa/drivers/dri/r300/r300_context.h
index bc43953..29436ab 100644
--- a/src/mesa/drivers/dri/r300/r300_context.h
+++ b/src/mesa/drivers/dri/r300/r300_context.h
@@ -674,6 +674,11 @@ struct reg_lifetime {
 	   emitted instruction that writes to the register */
 	int vector_valid;
 	int scalar_valid;
+
+	/* Index to the slot where the register was last read.
+	   This is also the first slot in which the register may be written again */
+	int vector_lastread;
+	int scalar_lastread;
 };

diff --git a/src/mesa/drivers/dri/r300/r300_fragprog.c b/src/mesa/drivers/dri/r300/r300_fragprog.c
index 3c54830..89e9f65 100644
--- a/src/mesa/drivers/dri/r300/r300_fragprog.c
+++ b/src/mesa/drivers/dri/r300/r300_fragprog.c
@@ -1026,10 +1026,11 @@ static void emit_tex(struct r300_fragment_program *rp,
  */
 static int get_earliest_allowed_write(
 	struct r300_fragment_program* rp,
-	GLuint dest)
+	GLuint dest, int mask)
 {
 	COMPILE_STATE;
 	int idx;
+	int pos;
 	GLuint index = REG_GET_INDEX(dest);

 	assert(REG_GET_VALID(dest));
@@ -1047,7 +1048,17 @@
 		return 0;
 	}

-	return cs->hwtemps[idx].reserved;
+	pos = cs->hwtemps[idx].reserved;
+	if (mask & WRITEMASK_XYZ) {
+		if (pos < cs->hwtemps[idx].vector_lastread)
+			pos = cs->hwtemps[idx].vector_lastread;
+	}
+	if (mask & WRITEMASK_W) {
+		if (pos < cs->hwtemps[idx].scalar_lastread)
+			pos = cs->hwtemps[idx].scalar_lastread;
+	}
+
+	return pos;
 }

@@ -1070,7 +1081,8 @@ static int find_and_prepare_slot(struct r300_fragment_program* rp,
 	GLboolean emit_sop,
 	int argc,
 	GLuint* src,
-	GLuint dest)
+	GLuint dest,
+	int mask)
 {
 	COMPILE_STATE;
 	int hwsrc[3];
@@ -1092,7 +1104,7 @@ static int find_and_prepare_slot(struct r300_fragment_program* rp,
 	if (emit_sop)
 		used |= SLOT_OP_SCALAR;

-	pos = get_earliest_allowed_write(rp, dest);
+	pos = get_earliest_allowed_write(rp, dest, mask);
 	if (rp->node[rp->cur_node].alu_offset > pos)
 		pos = rp->node[rp->cur_node].alu_offset;
@@ -1191,6 +1203,21 @@ static int find_and_prepare_slot(struct r300_fragment_program* rp,
 		cs->slot[pos].ssrc[i] = tempssrc[i];
 	}

+	for(i = 0; i < argc; ++i) {
+		if (REG_GET_TYPE(src[i]) == REG_TYPE_TEMP) {
+			int regnr = hwsrc[i] & 31;
+
+			if (used & (SLOT_SRC_VECTOR << i)) {
+				if (cs->hwtemps[regnr].vector_lastread < pos)
+					cs->hwtemps[regnr].vector_lastread = pos;
+			}
+			if (used & (SLOT_SRC_SCALAR << i)) {
+				if (cs->hwtemps[regnr].scalar_lastread < pos)
+					cs->hwtemps[regnr].scalar_lastread = pos;
+			}
+		}
+	}
+
 	// Emit
[PATCH] [r300] Fix reordering of fragment program instructions and register allocation
There were a number of bugs related to the pairing of vector and scalar operations where swizzles ended up using the wrong source register, or an instruction was moved forward and ended up overwriting an aliased register.

The new algorithm for register allocation is slightly conservative and may run out of registers before it's strictly necessary. On the plus side, it Just Works.

Pairing of instructions is done whenever possible, and in more cases than before, so in practice this change should be a net win.

The patch mostly fixes glean/texCombine. One remaining problem is that the code duplicates constants and parameters all over the place and therefore quickly runs out of resources and falls back to software. I'm going to look into that as well.

Please test and commit this patch. If you notice any regressions, please tell me (but the tests are looking good).

~Nicolai

diff --git a/src/mesa/drivers/dri/r300/r300_context.h b/src/mesa/drivers/dri/r300/r300_context.h
index bd9ed6f..bc43953 100644
--- a/src/mesa/drivers/dri/r300/r300_context.h
+++ b/src/mesa/drivers/dri/r300/r300_context.h
@@ -647,38 +647,84 @@ struct r300_vertex_program_cont {

 #define PFS_NUM_TEMP_REGS	32
 #define PFS_NUM_CONST_REGS	16

-/* Tracking data for Mesa registers */
+/* Mapping Mesa registers to R300 temporaries */
 struct reg_acc {
 	int reg;		/* Assigned hw temp */
 	unsigned int refcount;	/* Number of uses by mesa program */
 };

+/**
+ * Describe the current lifetime information for an R300 temporary
+ */
+struct reg_lifetime {
+	/* Index of the first slot where this register is free in the sense
+	   that it can be used as a new destination register.
+	   This is -1 if the register has been assigned to a Mesa register
+	   and the last access to the register has not yet been emitted */
+	int free;
+
+	/* Index of the first slot where this register is currently reserved.
+	   This is used to stop e.g. a scalar operation from being moved
+	   before the allocation time of a register that was first allocated
+	   for a vector operation. */
+	int reserved;
+
+	/* Index of the first slot in which the register can be used as a
+	   source without losing the value that is written by the last
+	   emitted instruction that writes to the register */
+	int vector_valid;
+	int scalar_valid;
+};
+
+
+/**
+ * Store usage information about an ALU instruction slot during the
+ * compilation of a fragment program.
+ */
+#define SLOT_SRC_VECTOR	(1<<0)
+#define SLOT_SRC_SCALAR	(1<<3)
+#define SLOT_SRC_BOTH	(SLOT_SRC_VECTOR | SLOT_SRC_SCALAR)
+#define SLOT_OP_VECTOR	(1<<16)
+#define SLOT_OP_SCALAR	(1<<17)
+#define SLOT_OP_BOTH	(SLOT_OP_VECTOR | SLOT_OP_SCALAR)
+
+struct r300_pfs_compile_slot {
+	/* Bitmask indicating which parts of the slot are used, using SLOT_
+	   constants defined above */
+	unsigned int used;
+
+	/* Selected sources */
+	int vsrc[3];
+	int ssrc[3];
+};
+
+/**
+ * Store information during compilation of fragment programs.
+ */
 struct r300_pfs_compile_state {
-	int v_pos, s_pos;	/* highest ALU slots used */
-
-	/* Track some information gathered during opcode
-	 * construction.
-	 *
-	 * NOTE: Data is only set by the code, and isn't used yet.
-	 */
-	struct {
-		int vsrc[3];
-		int ssrc[3];
-		int umask;
-	} slot[PFS_MAX_ALU_INST];
-
-	/* Used to map Mesa's inputs/temps onto hardware temps */
-	int temp_in_use;
-	struct reg_acc temps[PFS_NUM_TEMP_REGS];
-	struct reg_acc inputs[32];	/* don't actually need 32... */
-
-	/* Track usage of hardware temps, for register allocation,
-	 * indirection detection, etc. */
-	int hwreg_in_use;
-	GLuint used_in_node;
-	GLuint dest_in_node;
+	int nrslots;	/* number of ALU slots used so far */
+
+	/* Track which (parts of) slots are already filled with instructions */
+	struct r300_pfs_compile_slot slot[PFS_MAX_ALU_INST];
+
+	/* Track the validity of R300 temporaries */
+	struct reg_lifetime hwtemps[PFS_NUM_TEMP_REGS];
+
+	/* Used to map Mesa's inputs/temps onto hardware temps */
+	int temp_in_use;
+	struct reg_acc temps[PFS_NUM_TEMP_REGS];
+	struct reg_acc inputs[32];	/* don't actually need 32... */
+
+	/* Track usage of hardware temps, for register allocation,
+	 * indirection detection, etc. */
+	GLuint used_in_node;
+	GLuint dest_in_node;
 };

+/**
+ * Store everything about a fragment program that is needed
+ * to render with that program.
+ */
 struct r300_fragment_program {
 	struct gl_fragment_program mesa_program;

diff --git a/src/mesa/drivers/dri/r300/r300_fragprog.c b/src/mesa/drivers/dri/r300/r300_fragprog.c
index 251fd26..b2c89cc 100644
--- a/src/mesa/drivers/dri/r300/r300_fragprog.c
+++ b/src/mesa/drivers/dri/r300/r300_fragprog.c
@@ -94,8 +94,9 @@
 #define REG_NEGV_SHIFT		18
 #define REG_NEGS_SHIFT		19
 #define REG_ABS_SHIFT		20
-#define REG_NO_USE_SHIFT	21
-#define REG_VALID_SHIF
Announcing Piglit, an automated testing framework
Hello,

back when I was actively working on DRI drivers almost three years ago, I always felt uneasy about the fact that I didn't have an extensive array of tests that I could rely on to test for regressions. Now I've decided to do something about it.

I've taken Glean and some code from Mesa and wrapped it with Python and cmake glue to
- execute OpenGL tests without user interaction and
- neatly format the results in HTML

You can find the current version (and a sample HTML summary, to get an idea of what they look like at the moment) at
http://homepages.upb.de/prefect/piglit/

The idea is to make testing dead simple for driver developers. I believe that Piglit already makes it quite simple, but I'm sure there's still room for improvement. My current plans are:
- Hunt some bugs in R300, to get a better feeling for how the tool fares in practice
- Integrate tests from Mesa; unfortunately, this needs manual work because those tests are mainly interactive, but it's definitely necessary to make this useful

I'm also considering setting up a public repository somewhere, perhaps on Sourceforge.

Please give it a try when you have a little time to spare and tell me if you find it useful (or more importantly, why you don't find it useful), and where it could be improved.

Thanks,
Nicolai
Re: driver level sub-pixel rendering?
On Friday 31 March 2006 19:49, Keith Packard wrote:
> On Fri, 2006-03-31 at 09:33 -0700, Brian Paul wrote:
> > AFAIK, nobody's hardware does that.
> >
> > When that kind of antialiasing is done for text, I think it's the job
> > of the font rendering code to do so.
>
> It's not the construction of the glyphs that's at issue here, I don't
> think. The glyphs are drawn to the screen using a separate alpha channel
> for each component in the pixel, an operation which isn't directly
> supported by the GL API at present. I don't know what we'd need in the
> hardware for this to be efficient though; I believe it is possible to do
> it today using three passes for each string, which seems horrendous
> until you realize how slow it will be to do the same thing with the CPU.

Surely you could just use an RGB texture instead of an ALPHA texture? Then it's just a matter of setting the appropriate texture environments and blending modes.

cu,
Nicolai
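For what it's worth, the arithmetic behind the RGB-texture suggestion can be modeled in plain C. This is only a sketch of the math, not actual GL code (in GL it would map onto the texture environment and blend mode setup mentioned above, possibly over multiple passes): treat the glyph's R, G and B as three independent coverage values, one per subpixel, and blend each channel separately.

```c
#include <stdint.h>

/* Per-channel blend: result = coverage*text + (1 - coverage)*dst,
 * computed in integer arithmetic with rounding. Applied independently
 * to the R, G and B channels, this is the component-alpha operation
 * under discussion. */
static uint8_t blend_channel(uint8_t coverage, uint8_t text, uint8_t dst)
{
	uint32_t v = (uint32_t)coverage * text + (uint32_t)(255 - coverage) * dst;
	return (uint8_t)((v + 127) / 255);
}
```

Full coverage yields the text color, zero coverage leaves the destination untouched, and intermediate values interpolate, which is why a plain RGB glyph texture can carry the three subpixel coverages.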
Re: Linux OpenGL ABI discussion
On Thursday 29 September 2005 18:30, Alan Cox wrote:
> On Iau, 2005-09-29 at 09:49 +0200, Christoph Hellwig wrote:
> > On Wed, Sep 28, 2005 at 04:07:56PM -0700, Andy Ritger wrote:
> > > Some of the topics raised include:
> > >
> > > - minimum OpenGL version required by libGL
> > > - SONAME change to libGL
> > > - libGL installation path
> >
> > I think the single most important point is to explicitly disallow
> > vendor-supplied libGL binaries in the LSB. Every other LSB component
> > relies on a single backing implementation for a reason, and in practice
>
> That is not actually true. It defines a set of API and ABI behaviours
> which are generally based on a single existing common implementation.
>
> > the Nvidia libGL just causes endless pain where people accidentally
> > link against it. The DRI libGL should be declared the one and official
> > one, and people who need extended features over it that aren't in the
> > driver-specific backend will need to contribute them back.
>
> If the LSB standard deals with libGL API/ABI interfaces then any
> application using other interfaces/feature set items would not be LSB
> compliant. Educating users to link with the base libGL is an education
> problem not directly inside the LSB remit beyond the LSB test tools.
>
> In addition, the way GL extensions work means it's fairly sane for an
> application to ask for extensions and continue using different
> approaches if they are not available. In fact this is done anyway for
> hardware reasons. There is a lack of an "is XYZ accelerated" API, but
> that is an upstream flaw.

The real issue with an IHV-supplied libGL.so is mixing vendors' graphics cards. As an OpenGL user (i.e. a developer of applications that link against libGL), I regularly switch graphics cards around to make sure things work with all the relevant major vendors. Having a vendor-supplied libGL.so makes this unnecessarily difficult on the software side (add to that the custom-installed header files that have ever so slightly different semantics, and there is a whole lot of fun to be had).

Not to mention the use case with two graphics cards installed at the same time, from different vendors. While the above problem is annoying but acceptable, there's simply no reasonable way to use two graphics cards from vendors that insist on their custom libGL.so. Having to hack around with LD_LIBRARY_PATH and the like is ridiculous.

I'm not too familiar with the exact details of the DRI client-server protocol, so maybe it would be necessary to turn the libGL.so into even more of a skeleton, and reduce the basic DRI protocol to a simple "tell me the client side driver name", so that IHVs can combine (for example) custom GLX extensions with direct rendering.

cu,
Nicolai
Re: "dual-TMU support"
On Saturday 17 September 2005 16:04, Aapo Tahkola wrote:
> On Sat, 17 Sep 2005 09:48:37 -0400 (EDT)
> Vladimir Dergachev <[EMAIL PROTECTED]> wrote:
> > The user error messages are due to the fact that glxgears sometimes
> > outputs an insufficient number of vertices to draw a primitive - for
> > example only 2 vertices for a quad.
>
> This is normal AFAIK, and since Mesa doesn't do it, we need to.
> For all I care this message should be removed.

This is by no means normal, and is the symptom of a bug in Mesa's recording of vertex arrays in display lists. At least that was the case the last time I looked at it.

https://bugs.freedesktop.org/show_bug.cgi?id=3129 is the relevant bug report.

cu,
Nicolai
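The complaint about "2 vertices for a quad" boils down to a simple invariant that a driver can check before emitting a primitive. The sketch below is hypothetical (it is not the actual Mesa or r300 check, and the enum values are made up for illustration): each primitive type implies a minimum vertex count and a step size.

```c
/* Illustrative vertex-count validation: two vertices can never form a
 * quad, triangles come in multiples of three, and a strip needs at
 * least three vertices. */
enum prim { PRIM_TRIANGLES, PRIM_QUADS, PRIM_TRIANGLE_STRIP };

static int prim_count_valid(enum prim p, unsigned count)
{
	switch (p) {
	case PRIM_TRIANGLES:      return count >= 3 && count % 3 == 0;
	case PRIM_QUADS:          return count >= 4 && count % 4 == 0;
	case PRIM_TRIANGLE_STRIP: return count >= 3;
	}
	return 0;
}
```

A display-list recorder that emits counts failing this predicate (as described above) is buggy regardless of whether the driver silently drops the primitive or warns about it.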
Re: [r300] r300 driver locks up with Xglx
On Friday 01 July 2005 16:31, Lorenzo Colitti wrote:
> Peter Zubaj wrote:
> > Some of the r300 driver lockups are card dependent (and for now I have
> > only these card dependent lockups). Cards which will lock up (sooner or
> > later) are the 9500 Pro (maybe 9500 too), 9700, 9800. What card do you
> > have?
>
> According to lspci, it's an "ATI Technologies Inc RV350 [Mobility Radeon
> 9600 M10]"

That chip is actually known to work fine. Have you tried running ordinary OpenGL applications within the normal X.Org server (e.g. Glean, TuxRacer, Cube, ...)? Are you seeing lockups there, too?

If the lockups happen only with Xglx, this could well be an Xglx-specific software issue. In this case, you should try enabling the DRM debugging options (modprobe drm debug=1) and have a look into dmesg/syslog/wherever kernel messages end up on your system to see what happens around the time of the lockup. Also, you should obviously make sure that all your components are recent (X.Org CVS, r300 CVS, Mesa CVS).

> > Try to load the fglrx 2d driver first, then unload it and then use the
> > r300 driver.
>
> I don't use fglrx, both because it's closed source and because I don't
> want to litter my system with a lot of files I don't know about. Would
> installing it help to debug the problem or is it just a workaround? If

It's a workaround only. fglrx seems to perform some initializations that we're missing, but so far this workaround seems to be relevant to plain R300 hardware anyway (i.e. *not* RV350).

> it's only a workaround, I can simply not try to use Xglx: it doesn't
> work anyway...

If it's a bug on our (i.e. the driver's) side, we should fix it, whether or not Xglx itself is in a usable state. It's likely that Xglx hits code paths that aren't used by most programs.

cu,
Nicolai
Re: [R300-commit] r300_driver/r300 r300_reg.h,1.44,1.45 r300_state.c,1.112,1.113
On Wednesday 22 June 2005 03:09, Rune Petersen wrote:
> Nicolai Haehnle wrote:
> >> Also I remember seeing that the values are different depending on chip
> >> family. Is this safe?
> >
> > Well, I have tested this on three different chips (R300, RV350 (mobile)
> > and R420, which is quite a nice sample), and:
> > - fglrx sets this on all the chips and
> > - setting it in our driver caused no regressions.
> >
> > Of course, it would be even better if people could test it on their
> > hardware (use hw_script from r300 CVS to query the register value while
> > fglrx is running, as well as test the patch).
>
> I just had a quick try, it doesn't seem to cause any regressions.
> Am I right in assuming that it should reduce lockups on Radeon 9800?

No. At least it didn't for me, and it didn't help for Jerome either, if I recall correctly. However, it apparently fixes some display issues (white horizontal lines?) that some people were seeing.

cu,
Nicolai
Re: [R300] securing r300 drm
On Tuesday 21 June 2005 20:57, Vladimir Dergachev wrote:
> Now that the driver paints usable pictures without lockups on many cards,
> including AGP versions of the X800 and Mobility M10, it would make sense
> to ready it for inclusion into the main DRI codebase.
>
> I do not think that the elusive lockups of Radeon 9800 cards, or issues
> with PowerPC, will require any drastic changes.
>
> As we discussed earlier, the major reason against inclusion into
> mainstream DRI CVS is that the driver is not secure in its current state.
>
> Below, I will attempt to list current known issues - please reply with
> your additions.
>
>  * r300_emit_unchecked_state - it is not as unchecked as it was
>    initially, however a few poorly checked registers remain:

Those poorly checked registers should be moved out of unchecked_state and into their own function. Adding these checks to unchecked_state would just add overhead to what should be a fast path. The idea would be to add something like r300_emit_special_state which doesn't use register addresses but has subcommands like the state setting for radeon/r200.

> from r300_cmdbuf.c:
>
>	ADD_RANGE(R300_RB3D_COLOROFFSET0, 1);	/* Dangerous */
>	ADD_RANGE(R300_RB3D_COLORPITCH0, 1);	/* Dangerous */
>	/* .. snip ... */
>	ADD_RANGE(R300_RB3D_DEPTHOFFSET, 2);	/* Dangerous */
>
> In principle an attacker can set these to point to AGP or system RAM and
> then cause a paint operation to overwrite a particular memory range.
>
> Ideally we should check that these point inside the framebuffer, i.e. are
> within the range specified by the MC_FB_LOCATION register.

Right. Actually, to be on the safe side, we'd have to set min/max clipping rects at the same time as setting those buffer offsets. This is currently not a problem, but it will become one when (if? ;)) we implement framebuffer_object etc.
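The range check being proposed could look roughly like the following sketch. It assumes the usual radeon MC_FB_LOCATION layout (aperture start in the low 16 bits, aperture end in the high 16 bits, both in 64K units); the function name and exact encoding here are illustrative, not actual DRM code.

```c
#include <stdint.h>

/* Return nonzero iff [offset, offset+size) lies entirely inside the
 * VRAM aperture described by an MC_FB_LOCATION-style register value. */
static int offset_in_fb(uint32_t mc_fb_location, uint32_t offset, uint32_t size)
{
	uint32_t fb_start = (mc_fb_location & 0xffffU) << 16;
	uint32_t fb_end   = ((mc_fb_location >> 16) << 16) | 0xffffU; /* inclusive */

	if (size == 0 || offset + size - 1 < offset)	/* reject wraparound */
		return 0;
	return offset >= fb_start && offset + size - 1 <= fb_end;
}
```

A kernel-side check of this shape on COLOROFFSET/DEPTHOFFSET writes would close the "paint into system RAM" hole described above, at the cost of one comparison pair per state emit.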
>	/* Texture offset is dangerous and needs more checking */
>	ADD_RANGE(R300_TX_OFFSET_0, 16);
>
> I don't think texture offsets are ever written to, however if they point
> in the wrong place they can be used to read memory directly.

Setting those texture offsets wrong can actually lock up the machine, as I found out when I temporarily put MC_FB_LOCATION into its natural position (i.e. where it's put on older radeons).

> Ideally we would check these to be within either the MC_FB_LOCATION or
> MC_AGP_LOCATION ranges. Problem is, what do we do on PCI cards? Use AIC
> controller settings?

Unfortunately, I don't know enough about PCI to comment, and what's AIC anyway? I've seen some register names with AIC in them, but they don't really seem to be used.

>  * r300_emit_raw - we do not have code that checks any of the buffered 3d
>    packets, in particular VBUF_2, IMMD_2, INDX_2 and INDX_BUFFER.
>
>    I think that none of these can be exploited except to cause a lockup -
>    please correct me if I am wrong.
>
>  * r300_emit_raw - RADEON_3D_LOAD_VBPNTR - this sets offsets and so, like
>    the texture offset registers, could be exploited to read protected
>    memory locations.
>
>    Again, we need to check the offsets against something reasonable.

Note that by putting the offset at the end of allowed memory and setting the number of vertices very high, you could read memory that you shouldn't have access to.

But the more important thing is: What's up with r300_emit_raw anyway? It was originally supposed to do what the name suggests: emit raw data into the ring buffer, purely as a hack for experimentation. People have bent it in extreme ways, so it has clearly gone beyond that, and that's a Bad Thing.

For one thing, r300_emit_raw doesn't get cliprects right. If you have more than four cliprects, you need to emit rendering commands multiple times. Seriously, all the stuff that uses emit_raw should just be migrated to use r300_emit_packet3, which will clean this up a lot.

>  * anything I forgot ?
Talking about security, have a look at radeon_state.c. Where on earth does *this* come from:

	/* Allocate an in-kernel area and copy in the cmdbuf. Do this to avoid
	 * races between checking values and using those values in other code,
	 * and simply to avoid a lot of function calls to copy in data.
	 */
	orig_bufsz = cmdbuf.bufsz;
	if (orig_bufsz != 0) {
		kbuf = drm_alloc(cmdbuf.bufsz, DRM_MEM_DRIVER);
		if (kbuf == NULL)
			return DRM_ERR(ENOMEM);
		if (DRM_COPY_FROM_USER(kbuf, cmdbuf.buf, cmdbuf.bufsz)) {
			drm_free(kbuf, orig_bufsz, DRM_MEM_DRIVER);
			return DRM_ERR(EFAULT);
		}
		cmdbuf.buf = kbuf;
	}

This just shouts insanity. It calls kmalloc every single time you emit a command buffer! The security issue mentioned in the comment is real, but there's a r
Re: [R300-commit] r300_driver/r300 r300_reg.h,1.44,1.45 r300_state.c,1.112,1.113
On Tuesday 21 June 2005 21:15, Rune Petersen wrote:
> Aapo Tahkola wrote:
> *snip*
> > +	if (info->ChipFamily >= CHIP_FAMILY_R300) {
> > +		unsigned char *RADEONMMIO = info->MMIO;
> > +		OUTREG(0x180, INREG(0x180) | 0x1100);
> > +	}
> > +
>
> 0x180 is defined as R300_MC_INIT_MISC_LAT_TIME in r300_reg.h.
> This seems unrelated to tiling.

I agree that
a) the appropriate #defines should be added in the 2D driver instead of putting magic values everywhere when we can do better, and
b) this should be split out into a different patch (note that you can do this kind of splitting with a simple editor; you just have to make sure that you do not modify the patch chunks themselves; be especially careful with the whitespace)

> Also I remember seeing that the values are different depending on chip
> family. Is this safe?

Well, I have tested this on three different chips (R300, RV350 (mobile) and R420, which is quite a nice sample), and:
- fglrx sets this on all the chips and
- setting it in our driver caused no regressions.

Of course, it would be even better if people could test it on their hardware (use hw_script from r300 CVS to query the register value while fglrx is running, as well as test the patch).

cu,
Nicolai
Re: [R300-commit] r300_driver/r300 r300_reg.h,1.44,1.45 r300_state.c,1.112,1.113
On Tuesday 21 June 2005 18:06, Aapo Tahkola wrote:
> On Thu, 16 Jun 2005 14:22:36 +0200
> Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> > On Thursday 16 June 2005 13:41, Aapo Tahkola wrote:
> > > Update of /cvsroot/r300/r300_driver/r300
> > > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6333
> > >
> > > Modified Files:
> > > 	r300_reg.h r300_state.c
> > > Log Message:
> > > Use depth tiling.
> >
> > Will this work with software fallbacks?
>
> I'm not quite sure, but the more recent r200_span.c has a few words about
> it. The attached patch enables color tiling in case someone wants to play
> with it.

You *will* have to update radeon_span.c accordingly. I haven't looked into how this surface business works, that might help a bit, but I doubt you can get away without changing anything in radeon_span.c.

In fact, enabling depth tiling did break software fallbacks, which includes depth readbacks. So stuff like glean and Cube (which uses depth readback to figure out your line of fire) is broken with depth tiling, which is why I backed that change out.

We really, really need a working fallback path. I can't stress this enough.

cu,
Nicolai
Re: [r300/ppc] lockups
On Tuesday 21 June 2005 10:54, Jerome Glisse wrote: > On 6/21/05, Vladimir Dergachev <[EMAIL PROTECTED]> wrote: > > On Sat, 18 Jun 2005, Johannes Berg wrote: > > > Any idea where I should start looking for the source of the lockups or what else to do? > > > > The problem is likely either due to the radeon memory controller - in > > particular registers like MC_FB_LOCATION MC_AGP_LOCATION - or some sort of > > AGP issue with ring buffer not working properly. > > > > IIRC Paul has a similar problem with his powerbook, and Ben provided a > patch against xorg & drm correcting the way these regs are set up. But > that patch was for the "normal" drm. Maybe once we get time we should look > at that and program these regs properly in r300... Not knowing this particular patch or whether anybody has tried our driver on PPC, what about endianness issues? I know it's obvious, but who knows... cu, Nicolai
Re: Removing the root priv requirement from DRM
On Saturday 18 June 2005 21:03, Adam Jackson wrote: > On Saturday 18 June 2005 11:20, Jon Smirl wrote: > > Access to the registers is something that should require root priv > > right? Once I can get to the registers I can program them to control > > the DMA hardware and then muck with the kernel's memory and escalate > > my privilege level. EGL avoids this possible hole by not using the > > registers from user space. > > Not all register access should require root. In fact you want to do as much > as possible directly from userspace because shuffling large amounts of data > into the kernel is painful. So what you need to restrict are those registers > which can trigger reads and writes from arbitrary system memory bypassing the > MMU, which basically means anything that can trigger bus-master writes or > DMA. > > The point to notice here is that these registers are generally segmented apart in > the card's memory map. If all those trigger regs are within a single 4k > range, then that's the only range you need to hide from userspace. I don't see any reason for mapping registers into userspace in the first place. Except for mode setting and related setup tasks (which aren't exactly performance critical), you'll never want to write to registers directly but go through a DMA'd command stream. Okay, there may be ancient hardware that doesn't support that mode of operation. But why not get rid of register maps completely for everything else? cu, Nicolai
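Jackson's point about segregating DMA-trigger registers reduces to simple page arithmetic: if every dangerous register sits in one 4k page of the register BAR, deciding what an unprivileged process may map is a mask-and-compare. The offsets below are invented for illustration, not any real card's layout:

```c
#include <assert.h>
#include <stdint.h>

#define REG_PAGE_SIZE 4096u
#define REG_PAGE_MASK (~(REG_PAGE_SIZE - 1))

/* Hypothetical layout: assume every register that can trigger
 * bus-master writes or DMA lives in the 4k page at this offset
 * into the register BAR. */
#define DMA_TRIGGER_PAGE 0x3000u

/* May this register offset be mapped into an unprivileged process?
 * Everything except the trigger page. */
static int reg_mappable_unprivileged(uint32_t reg_offset)
{
    return (reg_offset & REG_PAGE_MASK) != DMA_TRIGGER_PAGE;
}
```

In practice the kernel would apply this per-page when building the userspace mapping, withholding only the trigger page.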
Re: [R300] radeon 9800 lockup : guilty reg list
On Saturday 18 June 2005 08:20, Benjamin Herrenschmidt wrote: > On Fri, 2005-06-17 at 18:37 +0200, Jerome Glisse wrote: > > Correct value (previous were ones of a dumb test :)): > > > > 0x0148  0xf7fff000  RADEON_MC_FB_LOCATION > > 0x014c  0xfdfffc00  RADEON_MC_AGP_LOCATION > > Those look much better. If changing those helps for us, then I was right > saying that our hacks are no good :) More specifically, for r300, for > some reason, we still put the FB at 0 in card space, which isn't a > terrific idea, and for both r200 and r300, we incorrectly use > CONFIG_APER_SIZE for sizing the memory controller apertures instead of > the actual memory size. Consider the following steps: 1. Load fglrx 2. Unload fglrx 3. Load r300 (without reboot) 4. r300 runs just fine without lockups However, r300 obviously overwrites the RADEON_MC_FB/AGP_LOCATION registers. So while it is obviously a good idea to fix our behaviour here, I'm afraid it would be highly surprising if those registers were the cause of lockups. cu, Nicolai
Re: [R300-commit] r300_driver/r300 r300_reg.h,1.44,1.45 r300_state.c,1.112,1.113
On Thursday 16 June 2005 13:41, Aapo Tahkola wrote: > Update of /cvsroot/r300/r300_driver/r300 > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv6333 > > Modified Files: > r300_reg.h r300_state.c > Log Message: > Use depth tiling. Will this work with software fallbacks? cu, Nicolai
Re: [R300] new snapshot ?
On Friday 10 June 2005 18:10, Vladimir Dergachev wrote: > > On Fri, 10 Jun 2005, Aapo Tahkola wrote: > > >> Someone, I believe it was Aapo, said that they see white lines across the > >> screen when the framerate is fairly high. I didn't see this up until yesterday > >> when I had to change from my 9600pro to a 9600XT (I killed the card moving > >> it between machines somehow). > > > > Are you using SiS based motherboard by any chance? > > Following patch should fix this at the cost of some speed... > > I just committed the following patch to r300_reg.h: Thanks. By the way, I confirmed that fglrx sets those bits in 0x180 on the following cards: - 0x4E44 (R300) - 0x4E50 (RV350) - 0x4A49 (R420) ... i.e. pretty much across the board. However, there are many other registers that it touches, and I couldn't test how it affects lockups yet. > === > RCS file: /cvsroot/r300/r300_driver/r300/r300_reg.h,v > retrieving revision 1.41 > diff -u -r1.41 r300_reg.h > --- r300_reg.h 8 Jun 2005 15:05:24 - 1.41 > +++ r300_reg.h 10 Jun 2005 16:09:22 - > @@ -1,6 +1,27 @@ > #ifndef _R300_REG_H > #define _R300_REG_H > > +#define R300_MC_INIT_MISC_LAT_TIMER 0x180 > +# define R300_MC_MISC__MC_CPR_INIT_LAT_SHIFT 0 > +# define R300_MC_MISC__MC_VF_INIT_LAT_SHIFT 4 > +# define R300_MC_MISC__MC_DISP0R_INIT_LAT_SHIFT 8 > +# define R300_MC_MISC__MC_DISP1R_INIT_LAT_SHIFT 12 > +# define R300_MC_MISC__MC_FIXED_INIT_LAT_SHIFT 16 > +# define R300_MC_MISC__MC_E2R_INIT_LAT_SHIFT 20 > +# define R300_MC_MISC__MC_SAME_PAGE_PRIO_SHIFT 24 > +# define R300_MC_MISC__MC_GLOBW_INIT_LAT_SHIFT 24 Is the last 24 supposed to be a 28?
> + > + > +#define R300_MC_INIT_GFX_LAT_TIMER 0x154 > +# define R300_MC_MISC__MC_G3D0R_INIT_LAT_SHIFT 0 > +# define R300_MC_MISC__MC_G3D1R_INIT_LAT_SHIFT 4 > +# define R300_MC_MISC__MC_G3D2R_INIT_LAT_SHIFT 8 > +# define R300_MC_MISC__MC_G3D3R_INIT_LAT_SHIFT 12 > +# define R300_MC_MISC__MC_TX0R_INIT_LAT_SHIFT 16 > +# define R300_MC_MISC__MC_TX1R_INIT_LAT_SHIFT 20 > +# define R300_MC_MISC__MC_GLOBR_INIT_LAT_SHIFT 24 > +# define R300_MC_MISC__MC_GLOBW_FULL_LAT_SHIFT 0 Is the last 0 supposed to be a 28? cu, Nicolai
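The _SHIFT defines above suggest eight 4-bit latency fields packed into one 32-bit register. A sketch of composing such a value follows; the 4-bit field width is an assumption (the register layout is not documented in this thread), and the suspected 24/0 copy-paste errors called out above are taken to be 28:

```c
#include <assert.h>
#include <stdint.h>

/* A few of the shifts from the quoted r300_reg.h hunk. */
#define MC_CPR_INIT_LAT_SHIFT     0
#define MC_VF_INIT_LAT_SHIFT      4
#define MC_DISP0R_INIT_LAT_SHIFT  8

/* Pack one latency value into its (assumed 4-bit wide) field. */
static uint32_t mc_lat_field(uint32_t value, unsigned shift)
{
    return (value & 0xfu) << shift;
}
```

A full register value would then be the OR of one `mc_lat_field` per latency unit.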
Re: [R300] new snapshot ?
On Friday 10 June 2005 16:52, Aapo Tahkola wrote: > On Fri, 10 Jun 2005 14:31:48 +1000 > Ben Skeggs <[EMAIL PROTECTED]> wrote: > > > Aapo Tahkola wrote: > > > > >>Someone, I believe it was Aapo, said that they see white lines across the > > >>screen when the framerate is fairly high. I didn't see this up until yesterday > > >>when I had to change from my 9600pro to a 9600XT (I killed the card moving > > >>it between machines somehow). > > >> > > >> > > > > > >Are you using SiS based motherboard by any chance? > > > > > > > > Nope, I'm using an nforce3 based board (Gigabyte GA-K8NS Ultra-939) > > > > >Following patch should fix this at the cost of some speed... > > > > > > > > This does indeed seem to correct the problem, and I don't notice a loss > > of speed. > > glxgears rose by about 20fps, and quake3 by 5-10 fps... I updated xorg > > in the > > process of applying the patch, so it could be something from there. > > > > What exactly does the patch do? Or is it some magic we don't know about yet? > > Perhaps ATI guys could answer that. Umm... you *must* have that piece of code from *somewhere*, it can't just have fallen out of the sky. And that alone could provide at least some clue as to what this does... cu, Nicolai
Re: [R300] on lockups
On Sunday 05 June 2005 20:07, Vladimir Dergachev wrote: > >> My understanding is that dev->agp->base is the address where the AGP GART > >> mirrors the pieces of system RAM comprising AGP space. > > > > Yes, that's my understanding, too. But what is the Radeon's business knowing > > that address? Why does it need to know this address? I thought this was CPU > > address space, not card address space. > > Yes, however it is convenient to do so. > > The point is that AGP base address will not normally overlap the location > of system RAM. This is, of course, only reasonable for 32 bit systems.. I understand that part, but it's not what I meant. What I mean is this: You said, RADEON_MC_AGP_LOCATION is used to program where AGP is in the card's address space, and that's all fine and makes sense. However, we are *also* programming dev->agp->base into a register called RADEON_AGP_BASE. What is the meaning of that register? cu, Nicolai
Re: [R300] on lockups
On Sunday 05 June 2005 15:55, Vladimir Dergachev wrote: > On Sat, 4 Jun 2005, Nicolai Haehnle wrote: > >> > >> The mirroring works as follows: each time scratch register is written > > the > >> radeon controller uses PCI to write their value to a specific location in > >> system memory. > > > > Are you sure it uses PCI? I'm assuming that the destination address for > > scratch writeback is controlled by the RADEON_SCRATCH_ADDR register. This > > register is programmed to a value that falls within the AGP area (as > > defined by RADEON_MC_AGP_LOCATION) if I understand the code correctly. > > My understanding is that AGP only does transfers system RAM -> video RAM > and all transfers in the opposite direction have to use plain PCI > transfers at least as far as the bus is concerned. You mean system RAM -> graphics card, right? Does this mean that the graphics card cannot always write into memory that falls within RADEON_MC_AGP_LOCATION? > It could be that AGP GART can still decode addresses for writes to system > memory, I guess this depends on a particular architecture. > > One of the reasons to look forward to PCI Express is that it is > bi-directional, unlike AGP. > > > > >> This, of course, would not work if the memory controller is > > misprogrammed > >> - which was the cause of failures. > >> > >> Which way can memory controller be misprogrammed ? The part that > > concerns > >> us are positions of Video RAM, AGP and System Ram in Radeon address space. > >> (these are specified by RADEON_MC_AGP_LOCATION, RADEON_MC_FB_LOCATION). > > > > What's the meaning of RADEON_AGP_BASE, by the way? It is programmed to > > dev->agp->base, which is AFAIK an address from the kernel's address space. > > That doesn't make much sense to me. > > It could be anything. 
However, the recommended way to program the memory > controller is to set the BASE of video memory to its physical PCI address > and to put AGP memory where it is mirrored by the AGP GART, as, > presumably, this does not overlap with system RAM or any of other > sensitive areas. > > My understanding is that dev->agp->base is the address where the AGP GART > mirrors the pieces of system RAM comprising AGP space. Yes, that's my understanding, too. But what is the Radeon's business knowing that address? Why does it need to know this address? I thought this was CPU address space, not card address space. cu, Nicolai
Re: [R300] on lockups
On Saturday 04 June 2005 15:01, Vladimir Dergachev wrote: > I just wanted to contribute the following piece of information that might > help with R300 lockups. I do not know whether it applies or not in this > case, but just something to be aware about. > > Radeon has a memory controller which translates internal address space of > the chip into accesses of different memory - framebuffer, agp, system ram. > > So from the point of view of Radeon chip there is a single flat 32 bit > address space which contains everything. This is nice because you can > simply set texture offset to a particular number and the chip will pull it > from appropriate memory - be it video memory, agp or system ram. (albeit > system ram access is done via PCI, not AGP commands and thus is much > slower). > > It used to be that Radeon DRM driver had two modes for usage of scratch > registers - a mode when it polled Radeon chip directly and a mode when the > contents of the registers were "mirrored" in the system RAM. The driver > would try mirroring during startup and if it fails uses polling method. > > The mirroring works as follows: each time scratch register is written the > radeon controller uses PCI to write their value to a specific location in > system memory. Are you sure it uses PCI? I'm assuming that the destination address for scratch writeback is controlled by the RADEON_SCRATCH_ADDR register. This register is programmed to a value that falls within the AGP area (as defined by RADEON_MC_AGP_LOCATION) if I understand the code correctly. > This, of course, would not work if the memory controller is misprogrammed > - which was the cause of failures. > > Which way can memory controller be misprogrammed ? The part that concerns > us are positions of Video RAM, AGP and System Ram in Radeon address space. > (these are specified by RADEON_MC_AGP_LOCATION, RADEON_MC_FB_LOCATION). What's the meaning of RADEON_AGP_BASE, by the way? 
It is programmed to dev->agp->base, which is AFAIK an address from the kernel's address space. That doesn't make much sense to me. > The memory controller *always* assumes that system RAM (accessible via > PCI) starts at 0. So, if RADEON_MC_FB_LOCATION, for example, is set to 0 > then we have video RAM overlapping system RAM. However, the size of video > RAM is usually much smaller than the size of system RAM. So if the scratch > registers image in system memory had small physical address you might get > a lockup and if it was high you don't. You also would be more likely to get > a lockup when load on system memory increased. Hmm. The way RADEON_MC_(FB|AGP)_LOCATION are programmed, it seems like they actually consist of two 16-bit fields, one indicating the start of the FB/AGP area, the other indicating the end. Do you know what happens when the programmed size of the FB area is larger than the physical size of video RAM? What happens when the programmed size of the AGP area is larger than the size of the AGP aperture? > This problem has been fixed for plain Radeon drivers, but it could be > that something similar is manifesting again on R300.. How did that fix work? cu, Nicolai
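The two-16-bit-field reading can be made concrete. Assuming (this is an assumption, not something established in the thread) that the low half of the register holds the aperture start shifted right by 16 and the high half holds the last byte address shifted right by 16, the encoding would be:

```c
#include <assert.h>
#include <stdint.h>

/* Compose an MC location register for the card-space aperture
 * [base, base + size), under the assumed encoding: low 16 bits are
 * the first 64k block, high 16 bits the last 64k block.  base and
 * size are assumed 64k-aligned. */
static uint32_t radeon_mc_location(uint32_t base, uint32_t size)
{
    uint32_t start = base >> 16;              /* first 64k block */
    uint32_t end   = (base + size - 1) >> 16; /* last 64k block */
    return (end << 16) | (start & 0xffffu);
}
```

Under this reading, the RADEON_MC_FB_LOCATION value 0xf7fff000 quoted earlier in the thread would describe a 128 MB framebuffer aperture at card address 0xf0000000.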
Re: [r300] [patches] debugging lockups
On Friday 03 June 2005 00:25, Benjamin Herrenschmidt wrote: > > > > > > You guys seem to be getting closer... > > > When I had X + xfce4 + quake3 running (with this patch + > > > patch.drm-cmdbuf-more-pacifiers + patch.remove-userspace-pacifiers) X > > > locked up within 2 minutes. > > > However, X + quake3 (no window manager), I went thirty minutes before > > > my first problem. Quake3Arena crashed, and X quit. There was some > > > message on the terminal about radeon_wait and "IRQ 16". > > > > > > Here it is: > > > > radeonWaitIrq: drmRadeonIrqWait: -16 > > Have you tried David Airlie's or my latest DRM IRQ fixes ? It is unlikely that the problem is related, especially when X locks up, too. If you see this message and X is continuing to run fine (i.e. no complete lockup), you should indeed consider looking at the IRQ code. However, if X locks up completely, the most likely reason for this message is simply that the R300 locks up before it encounters the IRQ EMIT command in the ring buffer. It just happens then that the DRI client waits for an IRQ instead of busy-looping for the chip to idle. So it is perfectly likely that this message appears even when the IRQ handling code is working fine. Nevertheless, testing that patch can't hurt. cu, Nicolai
Re: [r300] [patches] debugging lockups
On Friday 03 June 2005 10:28, Aapo Tahkola wrote: > > On Thursday 02 June 2005 13:08, Boris Peterbarg wrote: > >> Aapo Tahkola wrote: > >> > I did some figuring on the CB_DPATH problem. > >> > After little testing it turns out that the lock up with > >> > progs/demos/isosurf goes away when the pacifier sequences are applied > >> to > >> > clearbuffer. > >> > > >> > I'm starting to think that this sequence is needed whenever overwriting > >> > certain states rather than whenever 3d operation begins and ends. > > > > Perhaps. I don't think it's just the pacifier sequence, though. I've been > > running applications with RADEON_DEBUG=sync, which causes idle calls > > between cmdbuf calls, and I've been seeing lockups where the read pointer > > sits after the beginning of what cmdbuf emits and long before the first > > rendering commands. > > I don't know if packet3 was issued before since I tweaked isosurf to dump > each frame portion of RADEON_DEBUG info into files using freopen. > But "DISCARD BUF" was really the only key difference in these logs. > > > This indicates that at least one lockup is caused by > > one > > of the following: > > - initial pacifier sequence in r300_cp_cmdbuf > > - emission of cliprects or > Cliprects seem to be a little off scale when comparing progs/samples/logo > against software rendering. > Perhaps the near plane is negative ? > > > - initial unchecked_state commands sent by the client > This is bad as you can see from the first frame drawn by texwrap... > Sticking: > r300_setup_textures(ctx); > r300_setup_rs_unit(ctx); > > r300SetupVertexShader(rmesa); > r300SetupPixelShader(rmesa); > or resethwstate to r300_run_vb_render should fix it. I'm not sure we're talking about the same thing here. This happens when the client sends a command buffer where all the state blocks (from r300->hw) are sent to the hardware *anyway*. It's actually the *order* of emission (e.g. the order in which insert_at_tail is called for state bits) that can make a difference.
The thing is, while the order definitely *does* affect the probability of lockups, lockups will not go away completely even if I use the exact same order that fglrx uses. So I'm beginning to believe that we can't trust radeon_do_cp_idle to completely idle the chip, or that whatever is wrong is pretty fundamentally wrong (some wrong bits in how the memory is configured?). cu, Nicolai
Re: [r300] [patches] debugging lockups
On Thursday 02 June 2005 13:08, Boris Peterbarg wrote: > Aapo Tahkola wrote: > > I did some figuring on the CB_DPATH problem. > > After little testing it turns out that the lock up with > > progs/demos/isosurf goes away when the pacifier sequences are applied to > > clearbuffer. > > > > I'm starting to think that this sequence is needed whenever overwriting > > certain states rather than whenever 3d operation begins and ends. Perhaps. I don't think it's just the pacifier sequence, though. I've been running applications with RADEON_DEBUG=sync, which causes idle calls between cmdbuf calls, and I've been seeing lockups where the read pointer sits after the beginning of what cmdbuf emits and long before the first rendering commands. This indicates that at least one lockup is caused by one of the following: - initial pacifier sequence in r300_cp_cmdbuf - emission of cliprects or - initial unchecked_state commands sent by the client I've been experimenting with rearranging the order in which certain blocks of state are emitted (by rearranging the appropriate code in r300/r300_cmdbuf.c), and this does seem to have an effect on lockups, but I haven't found a single good combination, and it's all just not very reliable. I'm curious what the line of investigation related to memory consumption will show, as I'm more or less convinced that the lockups have *something* to do with memory access. > > Any brave souls left? (patch attached) > > > I think you're going in the right direction. > ut2004demo, which previously always locked up the moment the menu > appeared, now locks up after 6-20 seconds or so. > Also, a lock up that happened to me with running other programs with > glxgears, opening their popup menus and pressing icons takes longer too. > > Adding patch.remove-userspace-pacifiers to this one causes almost > immediate lock ups. > > I get the feeling the mouse (even with silken mouse off) has some > effect. It's as if the more I move it the faster the lock up appears.
To be honest, I doubt that. I've been running a pure X server with just one or two clients a lot in the last few days, and moving the mouse doesn't make any difference at all. Most likely, moving the mouse in applications causes the behaviour of the applications to become less regular, which triggers lockups. glean, for example, which behaves in a very irregular fashion even when you don't do anything at all, seems pretty lockup-happy to me. cu, Nicolai
Re: [r300] [patches] debugging lockups
On Wednesday 01 June 2005 09:22, Benjamin Herrenschmidt wrote: > On Tue, 2005-05-31 at 21:52 +0200, Nicolai Haehnle wrote: > > Hello everybody, > > > > today's lockup-chasing wrapup follows :) > > BTW. Look at the "removing radeon_acknowledge_irqs hack.." thread and my > reply to David more specifically. I think we may be losing interrupts. I > don't know if that can explain the lockups though. I agree. There may have been an issue on old Radeon cards, but it's highly unlikely that the same problem still exists in the R3xx family. Also, I have seen no regressions with your patch (no improvements either, but that's not unexpected). So unless there are any objections, I'm going to apply your patch and the patch.drm-cmdbuf-more-pacifiers that I posted yesterday to the r300 CVS soon. cu, Nicolai
[r300] [patches] debugging lockups
Hello everybody, today's lockup-chasing wrapup follows :) Two observations about the lockups I've been seeing: (1) Lockups are more likely to occur when the ring buffer is filled with packet2s for alignment (see the attached experimental patch.drm-align-ring). (2) Lockups are a lot less likely to occur when additional synchronisation measures are taken (like waiting for the read pointer to catch up with the write pointer after every ADVANCE_RING). If we assume that (most of) the lockups are caused by a race in memory accesses, then both observations make sense: Filling the ring buffer with packet2s causes the CP to request new batches from the ring buffer more often, and waiting for the ring buffer to catch up means that less stuff happens in parallel. Of course there may be a number of other interpretations. Another observation: (3) On my system, lockups involving simple programs (like glxgears) are a lot more likely to happen when multiple 3D clients are running in parallel. In particular, starting glean while running glxgears means an almost certain lockup, at least with the patch.remove-userspace-pacifiers that I posted earlier. (The background for that patch was that fglrx never emits a pacifier sequence in between 3D commands) I have written a very unintrusive debugging facility for the DRM that basically logs which parts of the code emit commands to the ring buffer. When a lockup is detected, it prints this information out (via printk) along with a dump of the relevant part of the ring buffer. I have attached this patch, it is called patch.drm-debug-lockups-enabled (this logging facility can be disabled at compile time via the RADEON_DEBUG_LOCKUPS define in radeon_drv.h). Using this patch, I have made another observation: (4) All the lockups that happen for me occur when two cmdbuf ioctls are processed immediately after another, without an idle ioctl or similar in between.
So I have compared what our driver does at the boundary of 3D commands to what fglrx does, and I've come up with the attached patch.drm-cmdbuf-more-pacifiers, which adds an additional wait command to the end of r300_do_cp_cmdbuf. Using this patch, glean no longer locks up immediately when glxgears is running at the same time. Unfortunately, not all lockups have gone away yet... What you can do: Please test the attached patch.drm-cmdbuf-more-pacifiers, and report if there are any regressions (I don't believe there are any) and/or if it removes certain lockups you are seeing. If you're feeling adventurous, I'd appreciate it if you could also try this patch in combination with the patch.remove-userspace-pacifiers patch I posted earlier, though this patch appears to be dangerous still (even though I do not understand why). cu, Nicolai

Index: drm/shared-core/radeon_drv.h
===================================================================
RCS file: /cvsroot/r300/r300_driver/drm/shared-core/radeon_drv.h,v
retrieving revision 1.12
diff -u -3 -p -r1.12 radeon_drv.h
--- drm/shared-core/radeon_drv.h	3 Mar 2005 04:40:21 -	1.12
+++ drm/shared-core/radeon_drv.h	31 May 2005 17:36:01 -
@@ -985,14 +985,37 @@ do { \
 
 #define RING_LOCALS	int write, _nr; unsigned int mask; u32 *ring;
 
+#define ALIGN_RING() do { \
+	int _nr = 32 - (dev_priv->ring.tail & 31); \
+	int _write; \
+	if (dev_priv->ring.space <= (_nr+1) * sizeof(u32)) { \
+		COMMIT_RING(); \
+		radeon_wait_ring( dev_priv, (_nr+1) * sizeof(u32) ); \
+	} \
+	_write = dev_priv->ring.tail; \
+	if (_write & 1) { \
+		dev_priv->ring.start[_write++] = RADEON_CP_PACKET2; \
+		_write = _write % dev_priv->ring.tail_mask; \
+		_nr--; \
+	} \
+	while( _nr >= 2 ) { \
+		dev_priv->ring.start[_write++] = RADEON_CP_PACKET2; \
+		dev_priv->ring.start[_write++] = RADEON_CP_PACKET2; \
+		_write = _write % dev_priv->ring.tail_mask; \
+		_nr -= 2; \
+	} \
+	dev_priv->ring.tail = _write; \
+} while (0)
+
 #define BEGIN_RING( n ) do { \
+	ALIGN_RING(); /* TEST TEST */ \
 	if ( RADEON_VERBOSE ) { \
 		DRM_INFO( "BEGIN_RING( %d ) in %s\n", \
 			  n, __FUNCTION__ ); \
 	} \
-	if ( dev_priv->ring.space <= (n) * sizeof(u32) ) { \
+	if ( dev_priv->ring.space <= dev_priv->ring.size/2 /*(n+1) * sizeof(u32)*/ ) { \
 		COMMIT_RING(); \
-		radeon_wait_ring( dev_priv, (n) * sizeof(u32) ); \
+		radeon_wait_ring( dev_priv, dev_priv->ring.size/2/*(n+1) * sizeof(u32)*/ ); \
 	} \
 	_nr = n; dev_priv->ring.space -= (n) * sizeof(u32); \
 	ring = dev_priv->ring.start; \

Index: drm/shared-core/radeon_cp.c
===================================================================
RCS file: /cvsroot/r300/r300_driver/drm/shared-core/radeon_cp.c,v
retrieving revision 1.7
diff -u -3 -p -r1.7 radeon_cp.c
--- drm/shared-core/radeon_cp.c	19 Apr 2005 21:05:18 -	1.7
+++ drm/shared-core/radeon_cp.c	31 May 2005 19:29:50 -
@@ -846,6 +847,126 @@ static v
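The padding computation in the ALIGN_RING macro above can be checked in isolation; this standalone function mirrors its arithmetic (note that, as the macro is written, an already-aligned tail still receives a full 32 dwords of packet2 filler):

```c
#include <assert.h>
#include <stdint.h>

/* Number of RADEON_CP_PACKET2 filler dwords ALIGN_RING emits so that
 * the ring tail ends up on the next 32-dword boundary. */
static int align_ring_padding(uint32_t tail)
{
    return 32 - (int)(tail & 31);
}
```

Adding the padding count to the tail always yields a multiple of 32, which matches observation (1) above about ring buffers filled with packet2s.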
Re: r300 bugs
Hi, On Monday 30 May 2005 08:51, Vladimir Dergachev wrote: > On Mon, 30 May 2005, Bernhard Rosenkraenzer wrote: > > > Hi, > > I've just tried out the r300 driver - works remarkably well for "untested and > > broken" code. > > :)) > > > > > I've run into 2 bugs though: > > It doesn't work well if the display uses 16 bpp (24 bpp works perfectly) -- 3D > > in 16 bpp is pretty badly misrendered (sample attached; 2D works well w/ r300 > > DRI even at 16 bpp) -- mixed with a random section of the rest of the screen, > > wrong colors, and drawn way too large (but close enough to the expected > > output to recognize it). > > I don't think we ever focused on getting 16bpp right - having 32bpp > working is fun enough :) Also, all of r300 and later cards have more than > enough RAM for 32bpp modes. > > That said, it is probably just a matter of making sure some constants are > set properly (like colorbuffer parameters), I don't think anything else in > the driver is tied to that. If fglrx supports 16 bits (seriously, I've never tried that - who wants 16 bits anyway ;)), it's a matter of using glxtest to figure out the necessary color buffer setup code. Some other constants may be different, but it's unlikely. In addition to that, you will have to change radeon_span.c (for software fallbacks, as well as Read/DrawPixels functionality) accordingly, as well as probably some context creation related stuff. Also, you might want to look into the code that selects texture formats. It probably doesn't make too much sense to select a 32 bit texture format at a 16 bit screen resolution unless the user explicitly requests it. cu, Nicolai
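The texture-format suggestion at the end could be sketched as a simple policy function. The format tags below are placeholders invented for illustration, not the driver's actual format words; treat this as a sketch of the policy, not driver code:

```c
#include <assert.h>

/* Placeholder format tags (the real driver uses hardware-specific
 * format words; these names are invented for illustration). */
enum tex_format { TEX_RGB565, TEX_ARGB8888 };

/* Default to a 16-bit texture format on 16bpp screens unless the
 * application explicitly requests full precision. */
static enum tex_format choose_tex_format(int screen_bpp, int force_32bit)
{
    if (force_32bit || screen_bpp > 16)
        return TEX_ARGB8888;
    return TEX_RGB565;
}
```

This keeps texture bandwidth and memory use in line with what the 16bpp color buffer can display anyway.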
Re: [patches] Re: r300 radeon 9800 lockup
On Sunday 29 May 2005 02:31, Ben Skeggs wrote: > Morning, > > After playing UT2004 for 10 or so minutes, and then quickly checking > some other > apps known to work, I see no regressions with either patch. > > I'll be putting it through some more rigorous testing as the day > progresses, will > report back if I find anything. > > Also, out of interest, what triggered the lockup you saw? Pretty much anything could trigger it, from glxgears (unlikely lockup) over glean (regular lockups) to cube (almost instant lockup). Unfortunately, just like others are reporting, I'm still getting lockups, too. This time, however, they are more elusive, as the lockups disappear as soon as I start looking for them: I used the attached patch (in variations) to hunt for the previous lockup. What this patch does is basically commit the ring buffer after every ADVANCE_RING and wait for the read ptr to catch up with the write ptr. Feel free to try this patch if you're seeing lockups; perhaps you can find something out that way. However, as soon as I enable the call to commit_and_wait, the lockups disappear for me. I'm still going to try to find this thing, but it looks like it's going to be difficult. cu, Nicolai

Index: drm/shared-core/radeon_cp.c
===================================================================
RCS file: /cvsroot/r300/r300_driver/drm/shared-core/radeon_cp.c,v
retrieving revision 1.7
diff -u -3 -p -r1.7 radeon_cp.c
--- drm/shared-core/radeon_cp.c	19 Apr 2005 21:05:18 -	1.7
+++ drm/shared-core/radeon_cp.c	28 May 2005 20:34:04 -
@@ -923,6 +923,56 @@ static int radeon_do_wait_for_idle(drm_r
 	return DRM_ERR(EBUSY);
 }
 
+
+/* Debugging function:
+ *
+ * Commit the ring immediately and verify that the hardware is making
+ * progress on the ring.
+ */
+static int failed_once = 0;
+static const char* prev_inflight_caller = 0;
+static const char* prev2_inflight_caller = 0;
+static const char* prev3_inflight_caller = 0;
+
+void radeon_do_inflight_commit_and_wait(drm_radeon_private_t * dev_priv, const char* caller)
+{
+	const char* prev3_caller = prev3_inflight_caller;
+	const char* prev2_caller = prev2_inflight_caller;
+	const char* prev_caller = prev_inflight_caller;
+	const char* cur_caller = caller;
+	u32 old_tail = RADEON_READ(RADEON_CP_RB_WPTR);
+	u32 new_tail = dev_priv->ring.tail;
+	int i;
+
+	prev3_inflight_caller = prev2_caller;
+	prev2_inflight_caller = prev_caller;
+	prev_inflight_caller = caller;
+
+	if (failed_once)
+		return;
+
+	COMMIT_RING();
+
+	for(i = 0; i < dev_priv->usec_timeout; i++) {
+		u32 head = GET_RING_HEAD(dev_priv);
+
+		if (new_tail > old_tail) {
+			if (head > old_tail && head <= new_tail)
+				return;
+		} else {
+			if (head <= new_tail || head > old_tail)
+				return;
+		}
+
+		DRM_UDELAY(1);
+	}
+
+	DRM_ERROR("failed! (caller = %s, prev = %s <- %s <- %s)\n", cur_caller, prev_caller, prev2_caller, prev3_caller);
+	radeon_status(dev_priv);
+	failed_once = 1;
+}
+
+
 /*
  * CP control, initialization
  */

Index: drm/shared-core/radeon_drv.h
===================================================================
RCS file: /cvsroot/r300/r300_driver/drm/shared-core/radeon_drv.h,v
retrieving revision 1.12
diff -u -3 -p -r1.12 radeon_drv.h
--- drm/shared-core/radeon_drv.h	3 Mar 2005 04:40:21 -	1.12
+++ drm/shared-core/radeon_drv.h	28 May 2005 20:34:05 -
@@ -985,14 +985,39 @@ do { \
 
 #define RING_LOCALS	int write, _nr; unsigned int mask; u32 *ring;
 
+#define ALIGN_RING() do { \
+	int _nr = 32 - (dev_priv->ring.tail & 31); \
+	int _write; \
+	if (dev_priv->ring.space <= (_nr+1) * sizeof(u32)) { \
+		COMMIT_RING(); \
+		radeon_wait_ring( dev_priv, (_nr+1) * sizeof(u32) ); \
+	} \
+	_write = dev_priv->ring.tail; \
+	if (_write & 1) { \
+		dev_priv->ring.start[_write++] = RADEON_CP_PACKET2; \
+		_write = _write % dev_priv->ring.tail_mask; \
+		_nr--; \
+	} \
+	while( _nr >= 2 ) { \
+		/*dev_priv->ring.start[_write++] = CP_PACKET3( RADEON_CP_NOP, 0 );*/ \
+		/*dev_priv->ring.start[_write++] = CP_PACKET0( 0x1438, 0 );*/ \
+		dev_priv->ring.start[_write++] = RADEON_CP_PACKET2; \
+		dev_priv->ring.start[_write++] = RADEON_CP_PACKET2; \
+		_write = _write % dev_priv->ring.tail_mask; \
+		_nr -= 2; \
+	} \
+	dev_priv->ring.tail = _write; \
+} while (0)
+
 #define BEGIN_RING( n ) do { \
+	ALIGN_RING(); /* TEST TEST */ \
 	if ( RADEON_VERBOSE ) { \
 		DRM_INFO( "BEGIN_RING( %d ) in %s\n", \
 			  n, __FUNCTION__ ); \
 	} \
-	if ( dev_priv->ring.space <= (n) * sizeof(u32) ) { \
+	if ( dev_priv->ring.space <= dev_priv->ring.size/2 /*(n+1) * sizeof(u32)*/ ) { \
 		COMMIT_RING(); \
-		radeon_wait_ring( dev_priv, (n) * sizeof(u32) ); \
+		radeon_wait_ring( dev_priv, dev_priv->ring.size/2/*(n+1) * sizeof(u32)*/ ); \
 	} \
 	_nr = n; dev_priv->ring.space -= (n) * sizeof(u32
[patches] Re: r300 radeon 9800 lockup
Hi everybody,

I once again tripped upon an R300 lockup (possibly the same one that everybody's been talking about) and spent the last one and a half days chasing it down.

It turns out that writing the vertex buffer age to scratch registers (the ones that are written back to main memory) during a 3D sequence is a bad idea. Apparently, this confuses the memory controller so much that some part of the engine locks up hard.

The attached patch.out-of-loop-dispatch-age fixes this, at least for me. The attached patch.remove-userspace-pacifiers removes additional unnecessary emission of the pacifier sequence from the userspace driver. Userspace isn't supposed to emit this sequence anyway.

Could everybody please test
a) whether the first patch really does fix the lockups people are seeing, and
b) whether combining both the first and the second patch causes any regressions?

If everything's fine with these patches, I'm going to commit them in a few days or so.

cu, Nicolai

Index: drm/shared-core/r300_cmdbuf.c
===================================================================
RCS file: /cvsroot/r300/r300_driver/drm/shared-core/r300_cmdbuf.c,v
retrieving revision 1.22
diff -u -3 -p -r1.22 r300_cmdbuf.c
--- drm/shared-core/r300_cmdbuf.c	28 May 2005 05:18:42 -	1.22
+++ drm/shared-core/r300_cmdbuf.c	28 May 2005 20:56:59 -
@@ -487,21 +487,19 @@ static __inline__ void r300_pacify(drm_r
 }
 
+/**
+ * Called by r300_do_cp_cmdbuf to update the internal buffer age and state.
+ * The actual age emit is done by r300_do_cp_cmdbuf, which is why you must
+ * be careful about how this function is called.
+ */
 static void r300_discard_buffer(drm_device_t * dev, drm_buf_t * buf)
 {
-	drm_radeon_private_t *dev_priv = dev->dev_private;
-	drm_radeon_buf_priv_t *buf_priv = buf->dev_private;
-	RING_LOCALS;
-
-	buf_priv->age = ++dev_priv->sarea_priv->last_dispatch;
-
-	/* Emit the vertex buffer age */
-	BEGIN_RING(2);
-	RADEON_DISPATCH_AGE(buf_priv->age);
-	ADVANCE_RING();
+	drm_radeon_private_t *dev_priv = dev->dev_private;
+	drm_radeon_buf_priv_t *buf_priv = buf->dev_private;
 
-	buf->pending = 1;
-	buf->used = 0;
+	buf_priv->age = dev_priv->sarea_priv->last_dispatch+1;
+	buf->pending = 1;
+	buf->used = 0;
 }
 
@@ -518,6 +516,7 @@ int r300_do_cp_cmdbuf(drm_device_t* dev,
 	drm_radeon_private_t *dev_priv = dev->dev_private;
 	drm_device_dma_t *dma = dev->dma;
 	drm_buf_t *buf = NULL;
+	int emit_dispatch_age = 0;
 	int ret = 0;
 
 	DRM_DEBUG("\n");
@@ -608,14 +607,15 @@ int r300_do_cp_cmdbuf(drm_device_t* dev,
 				goto cleanup;
 			}
 
-			r300_discard_buffer(dev, buf);
+			emit_dispatch_age = 1;
+			r300_discard_buffer(dev, buf);
 			break;
 
 		case R300_CMD_WAIT:
 			/* simple enough, we can do it here */
 			DRM_DEBUG("R300_CMD_WAIT\n");
 			if(header.wait.flags==0)break; /* nothing to do */
-
+
 			{
 				RING_LOCALS;
@@ -639,6 +639,24 @@ int r300_do_cp_cmdbuf(drm_device_t* dev,
 
  cleanup:
 	r300_pacify(dev_priv);
+
+	/* We emit the vertex buffer age here, outside the pacifier "brackets"
+	 * for two reasons:
+	 * (1) This may coalesce multiple age emissions into a single one and
+	 * (2) more importantly, some chips lock up hard when scratch registers
+	 *     are written inside the pacifier bracket.
+	 */
+	if (emit_dispatch_age) {
+		RING_LOCALS;
+
+		dev_priv->sarea_priv->last_dispatch++;
+
+		/* Emit the vertex buffer age */
+		BEGIN_RING(2);
+		RADEON_DISPATCH_AGE(dev_priv->sarea_priv->last_dispatch);
+		ADVANCE_RING();
+	}
+
 	COMMIT_RING();
 
 	return ret;

Index: r300/r300_render.c
===================================================================
RCS file: /cvsroot/r300/r300_driver/r300/r300_render.c,v
retrieving revision 1.87
diff -u -3 -p -r1.87 r300_render.c
--- r300/r300_render.c	19 May 2005 00:03:52 -	1.87
+++ r300/r300_render.c	28 May 2005 20:49:43 -
@@ -489,23 +489,18 @@ static GLboolean r300_run_vb_render(GLco
 	struct vertex_buffer *VB = &tnl->vb;
 	int i, j;
 	LOCAL_VARS
-
+
 	if (RADEON_DEBUG & DEBUG_PRIMS)
 		fprintf(stderr, "%s\n", __FUNCTION__);
-
+
 	r300ReleaseArrays(ctx);
 	r300EmitArrays(ctx, GL_FALSE);
 
 //	LOCK_HARDWARE(&(rmesa->radeon));
 
-	reg_start(R300_RB3D_DSTCACHE_CTLSTAT,0);
-	e32(0x000a);
-
-	reg_start(0x4f18,0);
-	e32(0x0003);
 	r300EmitState(rmesa);
-
+
 	if(hw_tcl_on)
 		/* FIXME */
 		r300FlushCmdBuf(rmesa, __FUNCTION__);
@@ -515,16 +510,10 @@ static GLboolean r300_run_vb_render(GLco
 		GLuint prim = VB->Primitive[i].mode;
 		GLuint start = VB->Primitive[i].start;
 		GLuint length = VB->Primitive[i].count;
-
+
 		r300_render_vb_primitive(rmesa, ctx, start, start + length, prim);
 	}
 
-	reg_start(R300_RB3D_DSTCACHE_CTLSTAT,0);
-	e32(0x000a);
-
-	reg_start(0x4f18,0);
-	e32(0x0003);
-
 //	end_3d(PASS_PREFIX_VOID);
 
 	/* Flush state - we are done drawing.. */
Re: r300 radeon 9800 lockup
On Wednesday 25 May 2005 17:01, Vladimir Dergachev wrote:
> > Are you sure the read pointer is still moving 2mins after the lockup?
> > That would be rather surprising, to say the least.
>
> I think I can imagine how this might be happening. You see a lockup from
> the driver point of view is when the 3d engine busy bit is constantly on.
>
> The read pointer is updated by the CP engine, not the 3d engine. It could
> be that something would cause the CP engine to loop around sending
> commands to 3d engine forever. This would keep the 3d engine bit on,
> update the read pointer and appear to be a lockup to the driver.

What you're saying is: some command that we sent could be misinterpreted by the 3D engine (or we sent something that we didn't intend to send, considering the lack of docs etc.) as a command that takes insanely long to complete.

> One way to try to make sure this does not happen is to put code in the DRM
> driver to control the active size of the ring buffer.

That could be useful for debugging, but that's about it. The thing is, we *want* to have the ring buffer full. If we didn't want that, we could just make the ring buffer smaller. But that doesn't really *solve* the problem either, because even very small commands can take an insane amount of time to finish.

In any case, it would be interesting to know how fast the RPTR still moves and if it becomes unstuck at some point. You also need to watch out for when the X server finally decides to reset the CP. I believe there's a bug where the X server waits much longer than intended to do this, but the reset could still mess with results if you're waiting for too long.

> Also, there might be an issue where the CP engine expects the ring buffer
> to be padded with NOPs in a certain way (say to have pointers always on
> 256 bit boundaries) - I don't think we are doing this.

Yes, that's what I mentioned in an earlier mail.

cu, Nicolai
Re: r300 radeon 9800 lockup
On Tuesday 24 May 2005 22:54, Jerome Glisse wrote:
> On 5/24/05, Nicolai Haehnle <[EMAIL PROTECTED]> wrote:
> > Unfortunately, I don't think so. The thing is, all those OUT_RING and
> > ADVANCE_RING commands do not really call into the hardware immediately;
> > all they do is write stuff to the ring buffer, but the ring buffer is
> > just some memory area without any magic of its own.
> >
> > Only a call to COMMIT_RING will tell the hardware that new commands are
> > waiting in the ring buffer, and the only thing we do know is that
> > *something* in the ring buffer before the last COMMIT_RING causes the
> > chip to hang.
> >
> > So another possible way to investigate this could be:
> > - Call radeon_do_wait_for_idle() at the end of the COMMIT_RING macro,
> > and define RADEON_FIFO_DEBUG (this will print out additional information
> > when wait_for_idle fails)
> > - Increasingly add COMMIT_RING macros into r300_cmdbuf.c to pinpoint the
> > exact location of the problem, if at all possible.
> >
> > It would be very helpful if you could single out one command we send
> > using this procedure.
> >
> > Note that in the worst case (depending on the actual nature of the
> > lockup in hardware), those debugging changes could actually *remove* the
> > lockup (e.g. because they remove a race condition that caused the lockup
> > in the first place).
>
> Below a sample of what i get when a lockup occur. There is something
> that seems strange to me, i saw CP_RB_RTPR change while i am in a
> lockup and CP_RB_WTPR increase 6 by 6, I haven't let the things live
> for too much time (about 2mins before reboot) but i looks like it
> still process ring buffer but slowly.

The increase of the write pointer by 6 dwords is easily explained by radeon_do_cp_idle: this function always emits a series of 6 dwords (cache flushes and stuff) before calling wait_for_idle. My understanding is that these commands make sure the chip is in a completely clean state.
Are you sure the read pointer is still moving 2mins after the lockup? That would be rather surprising, to say the least.

> Anyway i must misunderstood this
> i have to dig up more this drm code to understand it a little more.
>
> By the way why radeon_cp_flush is disactivated ?

The only thing that calls radeon_cp_flush is radeon_cp_stop, which is never called during normal 3D operation, and COMMIT_RING should take care of posting the write pointer. I don't know the meaning of bit 31 of WPTR.

cu, Nicolai

> May 24 21:33:25 localhost kernel: [drm:radeon_do_wait_for_idle] *ERROR* failed!
> May 24 21:33:25 localhost kernel: radeon_status:
> May 24 21:33:25 localhost kernel: RBBM_STATUS = 0x80010140
> May 24 21:33:25 localhost kernel: CP_RB_RTPR = 0x0003fdf0
> May 24 21:33:25 localhost kernel: CP_RB_WTPR = 0x0d95
> May 24 21:33:25 localhost kernel: AIC_CNTL = 0x
> May 24 21:33:25 localhost kernel: AIC_STAT = 0x0004
> May 24 21:33:25 localhost kernel: AIC_PT_BASE = 0x
> May 24 21:33:25 localhost kernel: TLB_ADDR = 0x
> May 24 21:33:25 localhost kernel: TLB_DATA = 0x
Re: r300 radeon 9800 lockup
On Tuesday 24 May 2005 18:33, Adam K Kirchhoff wrote:
> Vladimir Dergachev wrote:
> >> Vladimir Dergachev wrote:
> >>> In the past I found useful not to turn drm debugging on, but, rather,
> >>> insert printk statements in various place in radeon code. This should
> >>> also provide more information about what is actually going on.
> >>
> >> I can't make any promises. My partner already thinks I spend too much
> >> time in front of the computer :-) I'll see what I can do, though.
> >> Think a printk statement at the start and end of every function? Have any
> >
> > This is probably overkill and might not be very useful
> >
> > Rather try, at first, to just print a printk logging which command is
> > being executed (r300_cmd.c) - this is not very thorough, but, maybe,
> > there is a pattern.
>
> I added a printk for each function in r300_cmdbuf.c... When Q3A locked
> up, the last thing showing up in syslog was r300_pacify. So I added
> printk's after every line in r300_pacify :-) The last thing in syslog was:
>
>   OUT_RING( CP_PACKET3( RADEON_CP_NOP, 0 ) )
>   OUT_RING( 0x0 )
>   ADVANCE_RING()
>
> So it seems to be making it all the way through r300_pacify, which had
> been called from r300_check_range, from r300_emit_unchecked_state.
>
> Here's the sequence:
>   r300_emit_raw
>   r300_emit_packet3
>   r300_emit_raw
>   r300_emit_unchecked_state
>   r300_check_range
>   r300_emit_unchecked_state
>   r300_check_range
>   r300_pacify
>   RING_LOCALS
>   BEGIN_RING(6)
>   OUT_RING( CP_PACKET0( R300_RB3D_DSTCACHE_CTLSTAT, 0 ) )
>   OUT_RING( 0xa )
>   OUT_RING( CP_PACKET0( 0x4f18, 0 ) )
>   OUT_RING( 0x3 )
>   OUT_RING( CP_PACKET3( RADEON_CP_NOP, 0 ) )
>   OUT_RING( 0x0 )
>   ADVANCE_RING()
>
> Does this tell us anything?

Unfortunately, I don't think so.
The thing is, all those OUT_RING and ADVANCE_RING commands do not really call into the hardware immediately; all they do is write stuff to the ring buffer, but the ring buffer is just some memory area without any magic of its own.

Only a call to COMMIT_RING will tell the hardware that new commands are waiting in the ring buffer, and the only thing we do know is that *something* in the ring buffer before the last COMMIT_RING causes the chip to hang.

So another possible way to investigate this could be:
- Call radeon_do_wait_for_idle() at the end of the COMMIT_RING macro, and define RADEON_FIFO_DEBUG (this will print out additional information when wait_for_idle fails)
- Incrementally add COMMIT_RING macros into r300_cmdbuf.c to pinpoint the exact location of the problem, if at all possible.

It would be very helpful if you could single out one command we send using this procedure.

Note that in the worst case (depending on the actual nature of the lockup in hardware), those debugging changes could actually *remove* the lockup (e.g. because they remove a race condition that caused the lockup in the first place).

cu, Nicolai
Re: r300 radeon 9800 lockup
On Sunday 22 May 2005 21:00, Jerome Glisse wrote:
> Hi,
>
> I setup a x86 with radeon 9800 pro or xt, trying to find
> why it locks. I see little improvement with option no silken
> mouse can you test and tell me if it dones anythings for
> you (X -nosilk).
>
> My thought on this lockups is that it's similar to the one
> r200 users report, X taking 100% of CPU waiting for
> something. I saw a mail from Felix about a lock holding
> issue will try to dig in mail archive.

If I interpret the logs correctly, all those lockups are of the form where the R300 fails to process the ring buffer any further, i.e. the R300 locks up. This in turn causes the 3D driver or the X server (depending on the exact circumstances, and probably in a rather random fashion) to wait for the R300 to become idle in an endless loop. The 100% CPU usage is merely caused by the fact that we're polling the chip instead of doing proper IRQ-based wait-for-idle.

> Anyone have any idea on that ? Could it be the mouse
> code in xorg ? Or is it in r300_mesa or drm ? I really
> suspect xorg radeon code...

It is easy to blame the DDX, but the truth is, we just don't know. The people seeing lockups should try to figure out whether there is a direct causal connection between e.g. mouse movements and lockups. If you are in a fullscreen OpenGL application, not moving the mouse, with no popups occurring from something like a panel applet, and the chip *still* locks up, it is highly unlikely that the DDX is at fault.

It is equally likely that the lockup is caused by, say, alignment or wraparound issues of the ring buffer. Note that fglrx always submits commands in indirect buffers, which are stored linearly in physical memory. We, on the other hand, always submit commands into the ring buffer, which is not linear (because it wraps around). Also, fglrx likes to emit NOPs into the command stream sometimes, though I haven't been able to find an exact pattern in those NOPs. We never emit NOPs (or do we?).
So the fact is: we just don't know whether alignment/wraparound can cause trouble. The emission of NOPs by fglrx is IMO significant evidence that there *are* issues in this area, at least on some chipsets, but it could just be some weird artifact of the fglrx codebase.

cu, Nicolai
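If the CP really does want the write pointer on fixed boundaries, the natural fix would be to pad with NOP dwords up to the next boundary before committing. A standalone sketch of just the padding computation (`nops_to_align` is a hypothetical helper, not actual driver code; it assumes a power-of-two alignment expressed in dwords, like the 32-dword boundary the ALIGN_RING experiment elsewhere in this thread uses):

```c
#include <assert.h>

/* Number of filler NOP dwords needed so that `tail` lands on the next
 * `align`-dword boundary; `align` must be a power of two. Speculative
 * sketch of the padding fglrx might be doing, not driver code. */
static unsigned int nops_to_align(unsigned int tail, unsigned int align)
{
    return (align - (tail & (align - 1))) & (align - 1);
}
```

The outer `& (align - 1)` keeps an already-aligned tail at zero padding instead of emitting a full extra block of NOPs.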
Re: R300 swizzle table
On Saturday 21 May 2005 17:42, Jerome Glisse wrote:
> On 5/21/05, Ben Skeggs <[EMAIL PROTECTED]> wrote:
> > Also, while I was debugging some problems in ut2004, I noticed that it
> > re-uses the same few programs over and over, but they are translated
> > each time. I'm thinking about adding a cache for the last 5 or so
> > texenv programs so that we don't need to translate all the time.
> > Should get a nice speedup in the more complex areas. Any thoughts on this?

That would mean that either ut2004 rewrites different TexEnv settings multiple times between rendering calls, or the Mesa core fails to detect some redundant state setting.

> ut2004 has a bad ogl attitude if so, (don't have it as i don't think there is
> a PPC linux version :)) But yes caching program could be usefull. Moreover
> IIRC r300 can have 2 fragment program in memory ?

Well, there are 64 slots for ALU instructions, and it seems to be possible to set pretty arbitrary program start offsets. So you could write two programs' ALU instructions into the chip at the same time, but I don't think you can do the same for TEX instructions, so it has very limited usability.

cu, Nicolai
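The cache Ben proposes could be as simple as a fixed-size round-robin table keyed by a hash of the TexEnv state. A toy sketch along those lines (all names invented for illustration; nothing here comes from the r300 driver, and a real version would store the translated program in each slot):

```c
#include <assert.h>

#define TEXENV_CACHE_SIZE 5

/* One slot of a toy round-robin cache for translated TexEnv programs.
 * `key` stands in for a hash of the TexEnv state that produced the program. */
struct cached_prog {
    unsigned int key;
    int valid;
    /* the translated program would live here */
};

static struct cached_prog texenv_cache[TEXENV_CACHE_SIZE];
static int texenv_next_slot;

/* Returns 1 on a hit. On a miss, claims the next slot round-robin and
 * returns 0, so the caller knows it must (re)translate into that slot. */
static int texenv_cache_lookup(unsigned int key)
{
    int i;

    for (i = 0; i < TEXENV_CACHE_SIZE; i++)
        if (texenv_cache[i].valid && texenv_cache[i].key == key)
            return 1;

    texenv_cache[texenv_next_slot].key = key;
    texenv_cache[texenv_next_slot].valid = 1;
    texenv_next_slot = (texenv_next_slot + 1) % TEXENV_CACHE_SIZE;
    return 0;
}
```

With only five entries a linear scan is fine; the win is skipping the translation, not the lookup.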
Re: [R300] new snapshot ?
On Thursday 19 May 2005 09:20, Keith Whitwell wrote:
> Vladimir Dergachev wrote:
> > Hi Aapo, Ben, Jerome, Nicolai:
> >
> > I recently checked fresh code from CVS and was pleasantly surprised
> > to see that all Quake3 levels that were broken are now perfect - in fact
> > I cannot find anything that is amiss!
> >
> > Do you think it would be a good idea to tag the current code and make
> > a snapshot ?

Sure, anytime :)

> So have you guys given any consideration to moving the r300 driver into
> mesa proper? CVS access shouldn't be a problem, fwiw...

There are two main points that have stopped me from pushing for the inclusion of the driver into Mesa proper:

1. Kernel-level security holes

We should take care of full command-stream verification before moving the driver into Mesa CVS. It's easy to say "we can do that later", but if we say that, it's likely that it won't be done for a long time.

2. DRM binary compatibility

We still don't know the meaning of many of the registers. Some registers are labelled "dangerous", which means we might have to do some more checks in the kernel to make sure user processes can't do harmful stuff. This means that we might, in the future, have to *remove* some of the cmdbuf commands that exist today.

If the others believe moving r300 to Mesa is a good idea, then I'll do some auditing of the DRM code. Once that's done, I'm okay with moving the driver as long as we don't enforce DRM binary compatibility yet.

cu, Nicolai
Re: [Mesa3d-dev] update on GL_EXT_framebuffer_object work
On Tuesday 17 May 2005 15:47, Brian Paul wrote:
> > Note that it can be easy to miss this problem. One way that should
> > trigger the issue in all drivers is:
> > 1. Make sure that you hit software rasterization fallbacks (e.g.
> > no_rast=true).
> > 2. Run any GL application in a window and resize the window. If you make
> > the window larger than its initial size, the framebuffer will be clipped
> > incorrectly.
> >
> > I've fixed this by calling _mesa_resize_framebuffer in the same place
> > where clip rectangles are recalculated after the DRI lock has been
> > regained. However, I'd like to know if this is the
> > correct/canonical/preferred way of doing it.
>
> That actually sounds like the right thing.
>
> The idea is that when the driver learns that the window has been
> resized we need to call _mesa_resize_framebuffer() on the framebuffer
> that corresponds to the window. Wherever we recompute the cliprects
> in response to a window size change, is also the right place to resize
> the Mesa framebuffer.
>
> This should be addressed in all the DRI drivers.
>
> If you can provide the details of how/where you're doing this in the
> r300 driver, we can look at doing the same in the other drivers.

In the r300 driver, the function radeonGetLock() handles all the non-fast cases of LOCK_HARDWARE. In this function, we call r300RegainedLock() after validating the drawable information. I have changed r300RegainedLock() to look like this:

static void r300RegainedLock(radeonContextPtr radeon)
{
	__DRIdrawablePrivate *dPriv = radeon->dri.drawable;

	if (radeon->lastStamp != dPriv->lastStamp) {
		/* --- Here is the interesting part --- */
		_mesa_resize_framebuffer(radeon->glCtx,
			(GLframebuffer*)dPriv->driverPrivate,
			dPriv->w, dPriv->h);

		... recalculate cliprects and scissor stuff here ...
	}
}

Inserting this call to _mesa_resize_framebuffer was the only relevant change.

cu, Nicolai
Re: [Mesa3d-dev] update on GL_EXT_framebuffer_object work
On Monday 02 May 2005 16:56, Brian Paul wrote:
> This weekend I finished updating the DRI drivers to work with the new
> framebuffer/renderbuffer changes. My DRI test system is terribly out
> of date so I haven't run any tests. I'm tempted to just check in the
> changes now and help people fix any problems that arise, rather than
> spend a few days updating my test box. I think the code changes are
> pretty safe though.
>
> Here's a summary of changes to the DRI drivers:
[snip]
> Are there any questions or concerns?

Working on the experimental R300 driver, I did come upon a question: how are DRI drivers supposed to handle window resizes? If I understand the code correctly, _mesa_resize_framebuffer would have to be called at some point when the window is resized, but I don't see when that happens in any of the DRI drivers in Mesa CVS.

Note that it can be easy to miss this problem. One way that should trigger the issue in all drivers is:
1. Make sure that you hit software rasterization fallbacks (e.g. no_rast=true).
2. Run any GL application in a window and resize the window. If you make the window larger than its initial size, the framebuffer will be clipped incorrectly.

I've fixed this by calling _mesa_resize_framebuffer in the same place where clip rectangles are recalculated after the DRI lock has been regained. However, I'd like to know if this is the correct/canonical/preferred way of doing it.

cu, Nicolai
Re: r300 patch: change some parameters to GLvoid*
On Friday 13 May 2005 04:43, Jeff Smith wrote:
> There are several places in r300_maos.c where a GLvoid* parameter is more
> appropriate than char*. This patch makes these changes (which also fixes a
> compiler warning for me).

Applied to CVS.

cu, Nicolai

> -- Jeff Smith
Re: r300 patch: correct a format/argument mismatch
On Friday 13 May 2005 04:45, Jeff Smith wrote:
> There is a format/argument mismatch in r300_texprog.c. The format given is
> '%d' while the argument is a char*. This patch corrects the format to '%s'.

Applied to CVS.

cu, Nicolai

> -- Jeff Smith
Re: licenses, R300 code, etc
On Sunday 01 May 2005 06:41, Vladimir Dergachev wrote:
> On Sun, 1 May 2005, Paul Mackerras wrote:
> > Vladimir Dergachev writes:
> >> * the R300 driver derived from it appears under the same
> >> license due to the notices left over from R200 files
> >> (as we originally thought to merge the code in R200).
> >>
> >> This needs approval from everyone who contributed to R300 -
> >> please let me know !
> >
> > What exactly needs approval? The current license, or are you
> > proposing a change to the license?
>
> Just wanted to confirm that everyone is ok with MIT/X11 license.
> It was never explicit before - my fault, I was having too much fun playing
> with the code :)

I always thought it was explicit, at least for me - I didn't just copy&paste blindly ;) So yes, I'm obviously okay with that license.

cu, Nicolai
Re: Proprosed break in libGL / DRI driver ABI
On Tuesday 05 April 2005 22:11, Brian Paul wrote:
> If you increase MAX_WIDTH/HEIGHT too far, you'll start to see
> interpolation errors in triangle rasterization (the software
> routines). The full explanation is long, but basically there needs to
> be enough fractional bits in the GLfixed datatype to accomodate
> interpolation across the full viewport width/height.
>
> In fact, I'm not sure that we've already gone too far by setting
> MAX_WIDTH/HEIGHT to 4096 while the GLfixed type only has 11 fractional
> bits. I haven't heard any reports of bad triangles so far though.
> But there probably aren't too many people generating 4Kx4K images.
>
> Before increasing MAX_WIDTH/HEIGHT, someone should do an analysis of
> the interpolation issues to see what side-effects might pop up.
>
> Finally, Mesa has a number of scratch arrays that get dimensioned to
> [MAX_WIDTH]. Some of those arrays/structs are rather large already.

Slightly off-topic, but a thought that occurred to me in this regard: tile the rendering. Basically, do a logical divide of the framebuffer into rectangles of, say, 64x64 pixels. During rasterization, all primitives are split according to those tiles and rendered separately. This has some advantages:

a) It could help reduce the interpolation issues you mentioned. It's obviously not a magic bullet, but it can avoid the need for insane precision in inner loops.
b) Better control of the size of scratch structures, possibly even better caching behaviour.
c) One could build a multi-threaded rasterizer (where work queues are per framebuffer tile), which is going to become all the more interesting once dualcore CPUs are widespread.

cu, Nicolai
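Brian's point about fractional bits can be made concrete with a back-of-the-envelope model (my sketch, not Mesa code; `worst_case_error` and the constant name are invented): if a per-pixel delta is quantized to GLfixed's 11 fractional bits, each step can be off by up to half an ulp, and the rounding error accumulates linearly across the span.

```c
#include <assert.h>
#include <math.h>

#define FIXED_FRAC_BITS 11   /* fractional bits in Mesa's GLfixed */

/* Worst-case accumulated interpolation error, in units of the interpolated
 * quantity, after stepping a quantized per-pixel delta across `width`
 * pixels: up to half an ulp of rounding error per step, summed over the
 * whole span. */
static double worst_case_error(int width)
{
    double ulp = 1.0 / (double)(1 << FIXED_FRAC_BITS);
    return width * (ulp / 2.0);
}
```

At a width of 4096 the accumulated error already reaches a full unit of the interpolated quantity, which is why 4096-wide viewports sit right at the edge with 11 fractional bits, and why going further would need either more bits or per-tile re-setup.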
Re: r300 - alpha test
On Monday 21 March 2005 12:50, Peter Zubaj wrote:
> > I just realized something - isn't the application supposed to change
> > Z test for that ?
>
> I don't know, but all application I tested and which uses alpha test -
> has z test enabled and all displays errors (tuxracer, enemy territory,
> fire - from mesa/demos)
>
> > Maybe what really happens is that disabling Z test is broken.
>
> Z test is not disabled - it is enabled. Problem is - even alpha test
> fails (and fragment is discarded) z value is still written (and this
> looks wrong).

Bingo. If setting 0x4F14 to 1 does indeed enable early Z testing, this is easily explained: for every fragment, the card *first* does the Z test. The Z test passes, so the new depth is written out. The fragment program is probably run after the Z test, but this detail doesn't matter. What matters is that the alpha test discards the fragment, but only after Z has already been written.

If, on the other hand, 0x4F14 is set to 0, Z testing happens *after* the alpha test and everything's fine.

> > On the other hand, as Nicolai points out it would be nice to know
> > what that register does and whether other bits have any function.
>
> AFAIK:
>
> fglrx initialize 0x4f14 register to 0x0001, but when alpha test is
> enabled it sets it to 0x. I have to do more tests to see if
> fglrx sets this register back to 0x0001 (for now looks, like it is
> not set back, but I need to make test program for it).

Yes, that would need further testing. If fglrx does not set the register back to 1, that would indicate that there's more to this bit than just early Z. Possible explanations could be (a) a relation to other Z acceleration tricks, (b) fglrx is just being stupid or (c) switching between early and late Z testing is very slow or broken in hardware. But if fglrx *does* reset the register to 1 when alpha test is disabled, we can pretty much say with certainty that it enables early Z testing.

cu, Nicolai
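The failure mode described above is easy to model in a few lines. This is a toy single-fragment pipeline I made up purely for illustration (none of these names come from the driver or from Mesa, and the early-Z behaviour of 0x4F14 is still a hypothesis at this point in the thread):

```c
#include <assert.h>

/* Minimal model of one fragment hitting the framebuffer. */
struct toy_fb {
    float depth;         /* current depth buffer value */
    int color_written;
};

static void run_fragment(struct toy_fb *fb, float frag_z,
                         int alpha_passes, int early_z)
{
    if (early_z) {
        /* hypothesized 0x4F14 = 1 ordering: Z test and Z write happen
         * before the alpha test, so a later discard can't undo the write */
        if (frag_z >= fb->depth)
            return;              /* Z test (GL_LESS) fails */
        fb->depth = frag_z;      /* depth already written out... */
        if (!alpha_passes)
            return;              /* ...even though the fragment dies here */
        fb->color_written = 1;
    } else {
        /* 0x4F14 = 0 ordering: alpha test first, so a discarded fragment
         * touches neither color nor depth */
        if (!alpha_passes)
            return;
        if (frag_z >= fb->depth)
            return;
        fb->depth = frag_z;
        fb->color_written = 1;
    }
}
```

Running a fragment that fails the alpha test through both orderings shows exactly the symptom Peter reports: with early Z the depth buffer is updated even though no color is written, producing the depth-fighting artifacts behind alpha-tested textures.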
Re: r300 - alpha test
On Saturday 19 March 2005 22:31, Peter Zubaj wrote:
> Hi,
>
> Looks like register 4F14 (alias unk4f10[2]) controls some sort of depth
> test (probably some optimalization).
>
> To get ALPHA TEST work correctly (not writing to depth) needs to be set
> to 0x. Then alpha textures looks correct (no more depth fighting
> in fire, et, tuxracer). Otherways is set to 0x0001.
>
> Is there any other meaning of this register? Can I change this, because
> now is set to 0x0001 ?

Hmm.. I'd still love to *know* what this register is actually about, instead of guessing blindly. Since you mentioned alpha testing, do you think it could be related to early Z testing?

cu, Nicolai
Re: [r200] Lockups...
On Sunday 13 March 2005 23:46, Adam K Kirchhoff wrote:
> Adam K Kirchhoff wrote:
> > I really am confused. This was all working (with my 9200) prior to
> > reinstalling Debian on my system on Friday. Thankfully it still works
> > (with drm 1.15.0) on my FreeBSD installation. Not really sure if that
> > tells you anything.
>
> Alright... So drm from both February 14th and January 1st are locking
> up as well... Which is odd since I never had any of these problem till
> this weekend. I'll start rolling back changes to the Mesa dri
> driver... Perhaps this isn't directly related to the drm.
>
> Oh, I've also flashed my BIOS to the latest from the motherboard
> manufacturer.. Thought it was worth a shot, but it didn't help.

If rolling back the dri driver doesn't help, what about the DDX or even the kernel?

cu, Nicolai

> Adam
Re: [r200] Lockups...
On Sunday 13 March 2005 03:10, Adam K Kirchhoff wrote:
> > Was it always shared with the USB controller? Can you try changing that?
>
> Some more info for both of you...
>
> I remarked, in an earlier e-mail, that glxgears wouldn't cause the
> lockups. That's not true. For whatever reason, it doesn't seem to
> cause the lockups if I load the drm module with debug=1... At least not
> immediately. However, if I don't load drm that way, glxgears will
> lockup my machine rather quickly. Some lockups are hard lockups, unable
> to even get the serial console to respond. With others, I can even ssh
> in still. In all cases, iirc, my machine will lockup hard if I actually
> try and 'reboot' the box after logging in.

Wait a minute... isn't that a very similar lockup to the one you got with R300? I don't understand what's going on with radeon_cp_reset though.

cu, Nicolai
Re: [R300] gliding_penguin snapshot
On Sunday 06 March 2005 14:15, Adam K Kirchhoff wrote:
> Unfortunately, I'm still getting pretty constant lockups that seem to be
> related to high framerates. From ppcracer with RADEON_DEBUG set to all:
>
> http://68.44.156.246/ppracer.txt.gz
>
> On the plus side, the texture problem that I had seen with
> neverputt/neverball seems to be resolved.

This is probably the same lockup that I have seen. Unfortunately, I can't test anything for another two weeks or so.

It may be worth it to test whether the lockup is due to some (race?) interaction between the 3D client and the X server. In particular, test if the lockup also happens with Solo. If the lockup happens with Solo as well, we at least know that it's not caused by the X server doing something.

cu, Nicolai

> Adam
Re: r300 - Saphire 9600
On Sunday 27 February 2005 23:10, Hamie wrote: > I've added in the pci-id's for the Saphire 9600 AGP card. As it has 2 > pci-id's, I've added both to the pciids file, and added it into > radeon_screen, but left the second head commented out in radeon_screen.c > as I'm unsure whether or not it should be treated separately... > > Why does it appear as two pci id's anyway? Can you treat it as a second > card? To the best of my knowledge, we don't add the second head PCI ID to drm_pciids.txt because the driver only looks for the first PCI device (in fact, loading two driver instances, one for each device, would be certain to cause lockups). So please remove the second ID. I don't know why ATI decided to publish two PCI devices, and I don't have any related documentation. However, all the features like dual head can be used by only considering the first PCI device, as far as I know. cu, Nicolai > H
Re: R300 lockups...
On Tuesday 22 February 2005 21:57, Adam K Kirchhoff wrote: > No luck. I setup my xorg.conf file to limit X to 640x480, and used > xrandr to drop the refresh rate to 60... Launched neverputt at 640x480, > fullscreen. Lockup was nearly instantaneous... The music continues, at > least till neverputt dies, and the mouse moves around. Rebooted and > tried again... Exact same result. At least when I was running it at > 1024x768 on a mergedfb desktop of 2560x1024, I was able to play a hole > or two of golf... > > Two times now, I've tried running it at 640x480 on my large mergedfb > desktop. I get further than I did when the screen resolution was > 640x480, but not much. > > I just tried two times now running it at 1280x1024 on my large mergedfb > desktop, and it plays fine for a number of holes. Usually locks up > between holes. > > My conclusion is that these lockups are occuring when the framerate is > at it's highest (ie. low resolution, low texture, low activity), which I > believe is a situation someone else described on here not to long ago. That was me, so I can confirm that, and it *is* different from the problem reported by John Clemens in the other thread (the one called "[r300] Radeon 9600se mostly working"). Unfortunately, I won't have access to my test setup for the next weeks, so I don't have anything new. cu, Nicolai > Adam pgpxmWm01RE3d.pgp Description: PGP signature
Re: [r300] Radeon 9600se mostly working..
On Monday 21 February 2005 17:40, John Clemens wrote: > > On Mon, 21 Feb 2005, John Clemens wrote: > > > >> give it a go on my fanless 9600se (RV350 AP). > > > > How much memory do you have ? What kind of CPU and motherboard ? > > Duron 1.8G, 256MB ddr, old(ish) via km266 motherboard in a shuttle sk41g. > Gentoo. The card has 128Mb ram. > > >> - glxinfo states r300 DRI is enabled. (AGP4x, NO-TCL) > >> - glxgears gives me about 250fps with drm debug=1, ~625fps without debug > >> on. > > should I be concerned that these fps are too low? others seem to be > reporting around 1000.. Well, I'm not sure about the value with debug off, it does seem rather low, but perhaps reasonable if you are using immediate mode (which is still the default in CVS, I believe - check r300_run_render in r300_render.c). Your debug FPS is rather high, actually - I only get around 50fps in glxgears with enabled DRM debugging (even less if I also enable debug messages from the userspace driver). > >> - tuxracer runs ok at 640x480 fullscreen > >> - ice textures look psychadelicly blue > >> - at 1280x1024, (and somewhat at 800x600 windowed), i get these > >>errors: > >> [drm:radeon_cp_dispatch_swap] *ERROR* Engine timed out before swap buffer > >> blit > > ... > > > The swap buffer blit is just a copy - for example a copy from back buffer to > > front buffer. Since the engine timed out before swap buffer blit it means > > that the commands before it were at fault. Which is puzzling as you point out > > that everything works in 640x480. > > Just to elaborate: 640x480 runs fine. at 800x600 windowed, it plays > fine, but if a scene gets more complicated i see some jerkyness.. i.e., > the scene freezes for a second or two and then jumps ahead, and i get a > few messages in the log. At 1280x1024, this happens all the time, so it > appears the game is locked, and I get a stream of those messages in the > log file. 
alt-F switching to the console works, and switching back i get > about 2 seconds more of movement, and then soft-lock again (presumably > because the card re-inits on VC switch). I can switch to the VC and kill > it and all's fine. Judging from what you're saying, the card isn't > locked, it just isn't able to draw a full scene before it times out. Well, this is certainly interesting, and it does sound like userspace is generating so many drawing commands that the card is simply too slow to process them all. My guess is that the one-to-two second freezes are caused by the X server when it, too, thinks that the engine has timed out and initiates a reset sequence. This is actually an interesting problem. Here are some issues to think about: 1) The SWAP ioctl should really report an error to userspace when the engine has timed out. 2) I agree that it would make sense to monitor the ring buffer somehow. Perhaps a wait_for_ringbuffer that is called at the top of wait_for_fifo? In the "fast path", this costs an additional I/O read operation; otherwise it should essentially be no different performance-wise. 3) Come to think of it, couldn't the card just issue an IRQ when it's done? 4) If a drawing command takes very long, can we identify the userspace process that is responsible for sending the command buffer that caused the delay, and can we deal with this process somehow? Perhaps we could insert an age marker before and after the processing in the command buffer ioctl. The last point actually touches on a bigger subject: scheduling access to the graphics card. To get an idea of what I'm talking about, launch a terminal emulator and glxgears side by side. Then run "yes" in the terminal emulator. glxgears will essentially "lock up". cu, Nicolai
Re: [R300 and other radeons] MergedFB lockups
On Saturday 19 February 2005 02:06, Vladimir Dergachev wrote: > I think I found the cause of lockups in VB mode - they were due to cursor > updating function in radeon_mergedfb.c calling OUTREGP() which in turn > called INREG. > > When silken mouse is enabled this function could be called at any time > which would do *bad* things when CP engine is active. > > The fix of putting RADEONWaitForIdleMMIO() works fine on my setup. > > I have *no* idea why this worked with immediate mode at all and why no > issues were reported by R200 and Radeon users (well, I only looked through > the mailing lists, perhaps there is something on bugzilla but I don't know > how to use that efficiently) > > Also, I have no idea why the code in radeon_cursor.c that writes images > directly into framebuffer memory works - according to the manual any > writes into framebuffer while GUI is active should cause a hard lock. > > However, I could not produce any lockups with it, and so left it as is. I can see no difference at all with this latest change, i.e. no regressions, but the lockup is still there. cu, Nicolai > best > > Vladimir Dergachev pgpwdAI9btdc8.pgp Description: PGP signature
Re: [r300] VB lockup found and fixed
On Saturday 19 February 2005 01:05, Adam K Kirchhoff wrote: > Nicolai Haehnle wrote: > >Please, everybody, get the latest CVS (anonymous will take some time to > >catch up...) and test vertex buffer mode with it (go to r300_run_render() > >in r300_render.c and change the #if so that r300_vb_run_render() is > >called). I want to be really sure that this fixes it for other people as > >well (after all, there may be other causes for lockups that haven't occured > >on my machine yet), and that there are no regressions for those who already > >had working VB mode. > > > > > > Correct me if I'm wrong, but to get the driver to automatically use vb > mode, all you have to do is to change: > > #if 1 > return r300_run_immediate_render(ctx, stage); > #else > return r300_run_vb_render(ctx, stage); > #endif > > to > > #if 1 > return r300_run_vb_render(ctx, stage); > #else > return r300_run_vb_render(ctx, stage); > #endif > > Correct? That's correct, although it would be easier to just change the 1 into a 0 ;) > If that's the case, I'm experiencing lockups with neverputt in both > immediate and vb modes, though the symptoms are slightly different. In > both cases, I have to ssh in and reboot. Simply killing neverputt > doesn't bring back the machine. With immediate mode, the lockup seems > to happen quicker. I can't get past the first hole. The mouse still > responds.. I can move it around though, of course, it does no good. In > vb mode, the mouse locks up, too. > > Any ideas? Interesting, I didn't have lockups that hard for quite some time. Then again, I'm only trying to get glxgears to run without lockups... So this could really be anything. The first rule of thumb is to run with the environment variable RADEON_DEBUG=all set and pipe stderr into a file (beware that this will reduce performance a lot), make sure you capture the entire file and examine that. The last line should be something like "R200 timed out... exiting" in "normal" lockups. 
cu, Nicolai
Re: radeon unified driver
Hi, On Saturday 19 February 2005 00:46, Roland Scheidegger wrote: > There is some problem with driconf, it seems to have some problems > because the driver's name (radeon) does not match what it expects > (r200). Likewise, I couldn't figure out how you'd have 2 separate config > sections for both r100 and r200, currently you'll get all options of the > r200 (though it won't work for that chip family...), some options just > won't do anything on r100. When I started working on the R300 driver, I did some similar work so that the R300 driver should in theory be able to handle R200 as well (this R200 support has certainly gone to hell by now because of all the hacking that has been going on). The point is, I also faced the driconf issue, and you can see how I attempted to tackle it at http://cvs.sourceforge.net/viewcvs.py/r300/r300_driver/r300/radeon_screen.c?rev=1.7&view=auto My solution is probably not that good, but it might give you some ideas. cu, Nicolai pgpifZcHChIua.pgp Description: PGP signature
Re: [r300] VB lockup found and fixed
On Friday 18 February 2005 20:04, Nicolai Haehnle wrote: > There's still at least one hardware lockup bug that can be triggered with > glxgears; unfortunately, this one doesn't seem to be so easily > reproducible. This bug can be triggered on my machine by a single instance of glxgears. It seems to be unrelated to 2D activity. The lockup is a lot more likely to occur for me at high framerates. This is unfortunate, because it means that I need to turn down debug message volume, otherwise the lockup is actually very unlikely to appear at all. However, the lockup seems to be unrelated to the use of "sync": It has happened both with RADEON_DEBUG= empty and with RADEON_DEBUG=sync. When I don't issue any of the magic "begin3d" sequences from the userspace driver, the lockup always happens just after one DMA block has been discarded, but I can't find a pattern as to when it happens exactly. Emitting some of those magic sequences changes the time when the lockup happens a bit, but I have no idea what's really going on. cu, Nicolai pgpMA0G4AdB6f.pgp Description: PGP signature
Re: [r300] VB lockup found and fixed
On Friday 18 February 2005 18:17, Keith Whitwell wrote: > Ben Skeggs wrote: > > I'm still rather new at this, so forgive me if this is a bad suggestion. > > How about going with option 2, but only submitting the command buffer > > anyway if nr_released_bufs != 0. > > > > Would this cause any unwanted side effects? It seems better than just > > always submitting buffers with no cliprects anyhow. > > Oh, btw - note that if you start thowing buffers out, you have to > account for the fact that the hardware hasn't been programmed with the > state that you thought it had - probably by setting a dirty flag or lost > context flag. The command buffer is always sent to the kernel now (and clipping is used to prevent any real rendering from happening), so this particular bug should be gone. There's still at least one hardware lockup bug that can be triggered with glxgears; unfortunately, this one doesn't seem to be so easily reproducible. cu, Nicolai > Keith pgpCHhXvCXfaI.pgp Description: PGP signature
Re: [r300] VB lockup found and fixed
On Friday 18 February 2005 16:03, Keith Whitwell wrote: > Ben Skeggs wrote: > >> I still have a 100% reproducable bug which I need to find the cause of, > >> but time is once again a problem for me. If I move a window over the top > >> of a glxgears window my machine locks up immediately, but sysrq still > >> works > >> fine. > > > > > > I just discovered (and should've checked before), that I can ssh in and > > successfuly > > kill glxgears, then X returns to normal. I can have a partially covered > > glxgears > > window and everything is fine, but as soon as the entire window (not > > incl. window > > decorations) is covered, it seems that the 2d driver is unable to update > > the screen. > > I think some of the other drivers do a 'sched_yeild()' or 'usleep(0)' in > the zero cliprect case to get away from this sort of behaviour. Well, I can reproduce this bug and I tracked it down. There are a number of problems here, and they all have to do with DMA buffer accounting. The first (trivial) problem is that nr_released_bufs was never reset to 0. I've already fixed that in CVS. The real problem is that the following situation can occur when we have zero cliprects: 1. The command buffer contains a DISCARD command for a DMA buffer. 2. We simply drop that command buffer because there are no cliprects, i.e. nothing can be drawn. 3. As a consequence, DMA buffers aren't freed. 4. The rendering loop continues even though DMA buffers have been leaked, which eventually causes all DMA buffers to be exhausted, and this causes an infinite loop in r300RefillCurrentDmaRegion. The root cause is that we drop the command buffers with the DISCARD. I can see two possible solutions to this problem: 1. Wait until we have a cliprect again before submitting command buffers. 2. Submit command buffers even when we have no cliprects. The kernel module would basically ignore everything but the DISCARD commands. 3. Something else? 
I don't like option (1) because it somehow assumes that the user program only cares about OpenGL (and that's quite selfish). There are many use cases where it is plainly the incorrect thing to do: - A user running something like Quake in listenserver mode; if they switch away from Quake for some reason (incoming messages, whatever), the server will stop and eventually all clients will time out. - Imagine a chat application that uses some fancy 3D graphics for whatever reason (glitz, for example). Now this application may just be in the middle of drawing something when the user moves some other application above it. The end result will be that the application essentially becomes locked up until it becomes visible again; in the meantime, the chat might time out and disconnect the user. So (1) clearly isn't a good solution. Option (2) is more correct, but it does seem a little bit hackish. Any better ideas? Perhaps tracking which buffers were discarded? That's not exactly beautiful either. cu, Nicolai > > Keith
[r300] VB lockup found and fixed
Hi everybody, As reported earlier, I had a perfectly repeatable lockup in VB mode that always happened after the exact same number of frames in glxgears. I can't explain everything about the lockup, mostly because I still don't know what the two registers in the begin3d/end3d sequence actually mean, but here's what I know: It turns out that after the first 4 DMA buffers had been used to completion, r300FlushCmdBuf() was called from r300RefillCurrentDmaRegion(). This only caused simple state setting commands as well as an upload of the current vertex program into the VAP. There was no rendering going on, and neither the begin3d nor the end3d sequence was part of the commands that were sent to the card. However, for some reason, it was this sequence that caused the lockup. This leads me to believe that there's somehow more "magic" to the begin3d/end3d sequence than just cache control as I originally assumed (or maybe it *is* cache control, but there's something weird going on in connection with it, I simply don't know). In any case, what I did is *always* emit the begin3d sequence at the top of r300_do_cp_cmdbuf and end3d at the bottom of r300_do_cp_cmdbuf (it is also emitted in the case of an error). This works for me; I can run glxgears for several minutes, even doing some stuff that sometimes tends to produce lockups, without any problems. Please, everybody, get the latest CVS (anonymous will take some time to catch up...) and test vertex buffer mode with it (go to r300_run_render() in r300_render.c and change the #if so that r300_run_vb_render() is called). I want to be really sure that this fixes it for other people as well (after all, there may be other causes for lockups that haven't occurred on my machine yet), and that there are no regressions for those who already had working VB mode. Once we can be fairly certain that VB mode is stable (i.e.
crash and lockup-free), let's talk about removing any mention of the begin3d and end3d sequence from the userspace driver. This is really far too subtle an issue to allow userspace to mess with it. This goes for the X server as well - if anybody feels like implementing Render acceleration, which I doubt at this stage, please leave the begin3d/end3d handling to the kernel module, as it's the only instance that really knows what's going on. cu, Nicolai
Re: [r300] VB mode success
On Thursday 17 February 2005 22:34, Rune Petersen wrote: > On my system it works on my X800 with no lockups. > For now I have only tested with glxgears and q3demo. > So I won't be of much help fixing this apart from being a success-vector. > > Are there any pattern in what systems works with VB? I have an R300 ND (PCI ID 0x4E44), and it doesn't work (i.e. it locks up after some time). > Also an observation: > With VB mode running glxgears I can get an extra 100 fps by moving the > window left. If I move it too much it goes back to the initial fps. > This doesn't happen with immediate mode. > are there a good reason for this? > > 14345 frames in 5.0 seconds = 2868.983 FPS > 14357 frames in 5.0 seconds = 2871.259 FPS > 14332 frames in 5.0 seconds = 2866.250 FPS > 14324 frames in 5.0 seconds = 2864.621 FPS > "move" > 14837 frames in 5.0 seconds = 2967.306 FPS > 14905 frames in 5.0 seconds = 2980.958 FPS > 14913 frames in 5.0 seconds = 2982.533 FPS > 14897 frames in 5.0 seconds = 2979.251 FPS Huh, that's really weird :) One possible cause (though this is really wild speculation) is that you're outputting debug messages somewhere, and moving the window reduces the volume of debug messages. I also have news regarding "the" (I hope there's only one) VB lockup. I can launch and exit (on 0x4E44 as stated above) glxgears basically as often as I want, as long as I don't let it run for more than a few seconds. During that timeframe I can even do some things that are (used to be) notoriously lockup-prone, such as dragging windows around. But when I let it run more than a few seconds, it invariably locks up. Now the really interesting thing is that I captured the full libGL debug output from several "lock runs", and both the line count and the word count of all the logs is *exactly* the same. The byte counts differ by a very small amount, which appears to be due to some buffer indices being smaller (only one digit vs. two digits) in some cases. 
So I assume that there is some kind of "timebomb", at least on my machine, that reliably and reproducibly causes the lockup, most likely when some counter or pointer wraps around. I haven't found the exact cause yet, but I'll look further into it. cu, Nicolai > > Rune Petersen
Re: [R300] jump_and_click retagged.
On Friday 04 February 2005 21:52, Vladimir Dergachev wrote: > On Fri, 4 Feb 2005, Adam Jackson wrote: > > Here again, ideally this would get folded upstream too, once it's at > > least secure. > > > > I can't really mandate a policy since I haven't been contributing much > > to r300, but I would like to hear how people think this should progress. > > Folding DRM driver is not difficult, in fact currently there is just one > extra file with r300-specific code. > > As for folding R300 driver, we'll see how things turn out. It is hard > for me to imagine how this folding could take place - albeit it might turn > out to be not too bad. You know, I actually started the r300 driver with this in mind, which is why you'll still see all those r200_* files around. The thing is, I neither have the hardware to test whether it still works on R200, nor can I currently contribute much to development. It really shouldn't require a complete rewrite, just a lot of careful (and tedious) refactoring. cu, Nicolai pgphN0PekkQlU.pgp Description: PGP signature
Re: ARB_vertex_program and r200...
On Saturday 29 January 2005 02:47, Dave Airlie wrote: > > I've noticed fglrx advertises this for the r200, and doom 3 wants it... > > So after I manage to beat fragment_shader into shape, going to have a look > at how to get ARB_vp working.. r300 guys you have something going on this > already? I don't have an R200, but the R200 registers related to vertex processing are *completely* different from those on the R300. Now maybe the R200 has both a dedicated fixed function pipeline *and* a programmable processor, but unless that is the case, I assume fglrx on R200 tries to map ARB_vp onto fixed function when it can, and falls back to software otherwise. Somebody with R200 hardware would have to test fglrx with the glxtest dumping tool to find out for sure. cu, Nicolai pgpRb0dLUiFNO.pgp Description: PGP signature
Re: [R300-commit] r300_driver/r300 r300_render.c,1.29,1.30
On Monday 10 January 2005 04:42, Vladimir Dergachev wrote: > Update of /cvsroot/r300/r300_driver/r300 > In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv1824 > > Modified Files: > r300_render.c > Log Message: > For some reason we need r300Flush when using textures. Perhaps the problem is > with BITBLT_MULTI call ? I haven't looked at how texturing is implemented yet, but are the GPU caches flushed after the texture upload and before the rendering? I think r300Flush() does this implicitly. cu, Nicolai pgptrCrP3wsKB.pgp Description: PGP signature
Re: [R300][PATCH] Allow use of custom compiler in drm
Hi, On Monday 13 December 2004 18:29, Kronos wrote: > Hi, > Makefile in drm/linux-core/ doesn't pass CC to the Linux kernel build > system. This prevents loading the modules if the kernel has been compiled > with a compiler different from the default (ie. gcc). > > The following patch adds CC to the kernel Makefile: > > --- a/drm/linux-core/Makefile 2004-10-23 14:43:44.0 +0200 > +++ b/drm/linux-core/Makefile 2004-12-13 18:20:16.0 +0100 > @@ -172,7 +172,7 @@ > all: modules > > modules: includes > - make -C $(LINUXDIR) $(GETCONFIG) SUBDIRS=`pwd` DRMSRCDIR=`pwd` modules > + make -C $(LINUXDIR) $(GETCONFIG) CC=$(CC) SUBDIRS=`pwd` DRMSRCDIR=`pwd` modules > > ifeq ($(HEADERFROMBOOT),1) > > > In this way calling: > CC=gcc-3.4 make > does the Right Thing The base DRM Makefile doesn't pass CC on either, and that may or may not be for a good reason. AFAIK kernel code can be rather dependent on the exact compiler version used, so it's probably a good idea to always use the same compiler for both the kernel itself and all modules. Perhaps somebody with more experience in this area wants to comment?
Re: Problems with r300 Mesa code
On Sunday 21 November 2004 04:36, Michael Lothian wrote: > Hi > > I'm having problems compiling the r300 Mesa stuff > > I get the following errors: > [snip] > > Anyone know what's causing this? You have to make sure that the compiler uses the radeon_drm.h from drm/shared-core in r300 CVS. cu, Nicolai
Re: R300 with xorg-x11-6.8.0
On Wednesday 10 November 2004 08:10, eGore wrote: > Hi list, > > I ran into some trouble getting DRI running with r300 (I have no idea if > it is already supported or not), but it didn't work. I looked at xorg's > logfile and found out that DRI was disabled and I also found out that > this is caused by radeon_accelfuncs.c. So I "wrote" the attached patch > to get around that. DRI does still not work, but at least my xorg log > tells me it does :-) > > !! WARNING !! > I have no idea what I'm doing, so this might be completely wrong :-) > !! WARNING !! You're confusing things. On the one hand, there's general support for DRI (and support for client side 3D acceleration), and on the other hand there's hardware acceleration for the Render extension. The two things are only loosely connected. The R300 efforts are directed towards creating client side 3D acceleration (Render acceleration is a very specialised and limited subset of what the 3D driver does), so your patch is both unnecessary and wrong, because both the R100 and the R200 Render acceleration paths cannot possibly work on an R300. Client side 3D acceleration "works" without Render acceleration, where "works" means that Clear() operations are accelerated, and I have some code for hardware rasterization of untextured primitives which always locks up and which I haven't found the time to fix yet. cu, Nicolai > PS: The webpage of r300 is missing a tutorial ;) > PPS: Patch has been applied for a already patched file, I guess, so line > numbers might be completely wrong. > PPPS: I used xorg-x11-6.8.1 from gentoo Linux (xorg-x11-6.8.0-r1 to be > exact) > S: I'm having a Radon 9700 Pro from ELSA. > > That's it for now, regards, > Christoph Brill aka egore pgpsG36UP5Lfs.pgp Description: PGP signature
Re: [Fwd: r300 Report.]
Hi, On Saturday 06 November 2004 10:09, Ben Skeggs wrote: > I think the AGP issues *are* related to the lockup. I've just switched > sysloggers, and switched to CVS XServer (was using release 6.8 before). > My previous problems still occurred, but I now seem to have a lot more > debugging information in my syslog. I have the same AGP problem. If I > set AGP to 4x in my BIOS (rather than 8x), the corruption, and the > lockup don't occur. Okay, the new syslog has all the debug info. I notice the following line: Nov 7 07:37:58 disoft-dc [drm:radeon_cp_init_ring_buffer] writeback test failed The Radeon DRM source code has a comment indicating that writeback doesn't work everywhere, but I think it's safe to assume that all R300-based chips should be capable of writeback. This would indeed point towards a problem in the AGP setup in one way or another, and that means that the ring buffer won't work properly. Without a working ring buffer setup, it's only natural for a lockup to occur. Perhaps we should fail completely when the writeback test failed on R300-based hardware. Unfortunately, my AGP-fu isn't strong enough to know what's really going on here. > I've attached my syslog from when the lockup occurred, in case it helps. > > I also have the problem as Pat, where glxinfo reports "direct rendering: > No", but my Xorg log says it is. As long as the X server works and uses the ring buffer, that would point towards a simple configuration problem. Perhaps you could post a log of glxinfo with LIBGL_DEBUG=all and RADEON_DEBUG=all? cu, Nicolai > Ben Skeggs pgpVLxUiyCGRC.pgp Description: PGP signature
Re: [Fwd: r300 Report.]
Hi, On Friday 05 November 2004 23:12, Pat Suwalski wrote: [snip] > I am running the following system: > - AMD 64 fx-51 I'm afraid that this is a very likely culprit, assuming you're running in 64 bit mode. The trouble is that parts of the DRM interface and also some of the code interfacing the hardware might be broken when it comes to 64 bit. I'm trying to fix code that is obviously 64bit unsafe when I notice something, but since I don't have the hardware to test it, I really can't promise anything. [snip] > One way or another, the PCI id of my card is 1002:4e48, and it seems to > have no negative effects on the hardware, so it might as well be added > to the list of pci id's. The PCI ID is already added in the DRM branch of r300_driver. As for the AGP issues, I have no idea. Maybe they're related to the lockup, maybe not. cu, Nicolai > If anyone has any insight into what's up with all of this, I'm all ears. > Again, I'll help out as much as I can. Thanks. > > --Pat pgpCKPT3cxirO.pgp Description: PGP signature
Re: r300.sf.net lockup
Hi, On Saturday 06 November 2004 01:04, Ben Skeggs wrote: > Hello, > > I've been trying to get the experimental r300.sf.net driver to work on > my machine. I've compiled and installed it ok, but everytime I start > the X server with DRI enabled, the top of the screen is corrupted, which > I'm assuming is the xterms that are supposed to be showing. However, > the mouse pointer is draw correctly, and I'm still able to move it. > > I've posted what I captured in syslog, and my xorg log below. > > The card is a Powercolor Radeon 9600 256MB (RV350 AP), I tested with > vanilla 2.6.9 with reiser4 patched in. Thanks for testing on AFAIK untested hardware. (from just before the SysRq message) > Nov 6 06:05:44 [kernel] [drm:drm_ioctl] pid=9718, cmd=0x4008642a, > nr=0x2a, dev 0xe200, auth=1 > Nov 6 06:05:44 [kernel] [drm:drm_ioctl] pid=9718, cmd=0xc010644d, > nr=0x4d, dev 0xe200, auth=1 > Nov 6 06:05:44 [kernel] [drm:drm_ioctl] pid=9718, cmd=0xc0286429, > nr=0x29, dev 0xe200, auth=1 > Nov 6 06:05:44 [kernel] [drm:drm_ioctl] pid=9718, cmd=0xc010644d, > nr=0x4d, dev 0xe200, auth=1 > Nov 6 06:05:44 [kernel] [drm:drm_ioctl] pid=9718, cmd=0xc0286429, > nr=0x29, dev 0xe200, auth=1 > Nov 6 06:05:44 [kernel] [drm:drm_ioctl] pid=9718, cmd=0xc010644d, > nr=0x4d, dev 0xe200, auth=1 This is a lock, followed by indirect and freelist_get calls. There are two things that concern me: 1. There's a lot less debug output than I get. Also, it's interesting that radeon_cp_dispatch_indirect itself appears to hang. That's not completely impossible, but I've never seen it happen. (or maybe it just seems that way in your syslog because we don't get full debug messages). 2. There are no calls to cp_idle between indirect buffer emits. This indicates that you are running an old DDX, and not X.Org CVS + patch from r300_driver CVS. The latest patch contains a workaround for a known lockup problem. Said problem shouldn't cause a lockup unless a 3D client is running, but you never know... 
Some more things: 3. A lockup on X server startup is usually a sign of bad microcode, though you do have the correct log message in syslog. 4. Your card has 256MB of memory while I can only test with 128MB. Has anybody successfully experimented in any way (r300_demo or r300_driver) with 256MB cards? I remember seeing that the large memory versions had some paging hacks, and there might be related differences that cause the lockup here. cu, Nicolai
Re: Multiple hardware locks
On Monday 01 November 2004 07:01, Thomas Hellström wrote: > You are probably right, and it would be quite easy to implement such > checks in the via command verifier as long as each lock is associated with > a certain hardware address range. > > However, I don't quite see the point in plugging such a security hole when > there are a similar ways to accomplish DOS, hardware crashes and even > complete lockups using DRI. > > On via, for example, writing random data to the framebuffer, writing > random data to the sarea, taking the hardware lock and sleeping for an > indefinite amount of time. Writing certain data sequences to the HQV locks > the north bridge etc. > > Seems like DRI allow authorized clients to do these things by design? From what I've learned, the DRI isn't exactly designed for robustness. Still, an authorized client should never be able to cause a hardware crash/lockup, and an authorized client must not be able to issue arbitrary DMA requests. As far as I know, all DRMs that are enabled by default enforce at least the latter. Personally I believe that in the long term, the DRI should have (at least) the following security properties: 1. Protect against arbitrary DMA (arbitrary DMA trivially allows circumvention of process boundaries) This can be done via command-stream checks. 2. Prevent hardware lockup or provide a robust recovery mechanism (protection of multi-user systems, as well as data protection) Should be relatively cheap via command-stream checks on most hardware (unless there are crazy hardware problems with command ordering like there seem to be on some Radeons). I believe that in the long term, recovery should be in the kernel rather than the X server. 3. Make sure that no client can cause another client to crash (malfunctioning clients shouldn't cause data loss in other applications) In other words, make sure that a DRI client can continue even if the shared memory areas are overwritten with entirely random values. 
That does seem like a daunting task. 4. Make sure that no client can block access to the hardware forever (don't force the user to reboot) A long time ago, I posted a watchdog patch that protects against the "take lock, sleep forever" problem. The patch has recently been updated by Dieter Nützel (search for updated drm.watchdog.3). However, I have to admit that the patch doesn't feel quite right to me. 5. Enable the user to kill/suspend resource hogs Even if we protect against lock abuse, a client could still use excessive amounts of texture memory (thus causing lots of swap) or emit rendering calls that take extremely long to execute. That kills latency and makes the system virtually unusable. Perhaps the process that authorizes DRI clients should be able to revoke or suspend that authorization. A suspend would essentially mean that drmGetLock waits until the suspend is lifted. I know that actually implementing these things in such a way that they Just Work is not a pleasant task. I just felt like sharing a brain dump. cu, Nicolai
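Point 1 above amounts to a register whitelist applied before a command buffer reaches the hardware. A minimal sketch of such a command-stream check, with entirely invented register ranges (the real via/radeon verifiers are considerably more involved):

```c
#include <stddef.h>

/* Hypothetical register whitelist; a real driver would derive these
 * ranges from the hardware docs.  Anything outside them -- e.g. a DMA
 * base address register -- is rejected. */
struct reg_range { unsigned start, end; };

static const struct reg_range allowed[] = {
    { 0x2000, 0x2fff },  /* say, vertex engine state */
    { 0x4000, 0x4eff },  /* say, rasterizer/blend state */
};

static int reg_allowed(unsigned reg)
{
    size_t i;
    for (i = 0; i < sizeof(allowed) / sizeof(allowed[0]); i++)
        if (reg >= allowed[i].start && reg <= allowed[i].end)
            return 1;
    return 0;
}

/* Scan a command buffer of (register, value) pairs before it is handed
 * to the hardware; return 0 if clean, -1 on a forbidden register write. */
int verify_cmdbuf(const unsigned *buf, size_t npairs)
{
    size_t i;
    for (i = 0; i < npairs; i++)
        if (!reg_allowed(buf[2 * i]))
            return -1;
    return 0;
}
```

The same walk over the buffer is also the natural place to hang the lockup checks from point 2, since the verifier already decodes every packet.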
Re: glxtest with fglrx / r200
On Thursday 28 October 2004 20:11, Roland Scheidegger wrote: > - 0x2284. This one is interesting. The script gives this the name > "X_VAP_PVS_WAITIDLE", the driver always emits this right before > R200_SE_VAP_CNTL. Apparently it exists on r200 too. Looks like it forces > the VAP (whatever that stands for...) to wait. Would we need to emit > that too? The guessed register name is from me. On the R300, fglrx almost always writes 0 to this register before changing any vertex processor-related state, so I assume that it has some kind of serialising purpose. However, I have yet to run into any situation where emitting it makes any difference in my own code, so I don't know this for sure. cu, Nicolai pgp0GdyzVtEM4.pgp Description: PGP signature
Re: R300 & depth buffer
Hi, First of all, you should really check out the r300_driver module from CVS of the r300 project on SourceForge, and especially have a look at docs/r300_reg.h, which is where I put all register information that I and others have found so far. On Tuesday 26 October 2004 14:18, Jerome Glisse wrote: > Hi, > > I was playing a little around with r300 mainly looking at depth buffer. > I'm still unable to make it work properly. > Thus i have few questions. In radeon driver it seems that default value > for z_scale & z_offset are 0 (radeon/radeon_state_init.c) > Why are they set like that ? > > I changed the depth in order to have something more conveniant on screen : > > adaptor.depth_pitch=display_width | (0x8 << 16); > maybe better to write it as : > adaptor.depth_pitch=display_width | (0x4 << 17); As long as we don't know what these constants mean, is there really a difference? > in void Emit3dRegs(void) i used informations from radeon register. > > /* do not enable either of stencil, Z or anything else */ > e32(CP_PACKET0(RADEON_RB3D_CNTL,0)); > e32(RADEON_COLOR_FORMAT_ARGB | RADEON_Z_ENABLE); > > e32(CP_PACKET0(RADEON_RB3D_ZSTENCILCNTL,0)); > e32(RADEON_Z_WRITE_ENABLE | RADEON_DEPTH_FORMAT_32BIT_FLOAT_Z | > RADEON_Z_TEST_LESS); Basically everything in the 3D hardware interface has changed from R200 to R300, so the above almost certainly doesn't do what you want. Again, have a look at r300_reg.h Also, my work-in-progress driver can already clear the depth buffer in hardware in a way that is consistent with the software fallback, so you can have a look at how the registers are set up there, in r300_ioctl.c and r300_state.c. > Maybe we should put somewhere a list of things to find and who is > working on it, thus people won't work > on the same things in the mean time or they could work together more > eviciently. Also maybe it could > be usefull to make a plan of things we want to discover. > > z buffer & stencil buffer That would be very helpful. 
I haven't looked at stencil settings at all, and I'm kind of confused about the Z-buffer format. The R300 seems to use ZZZS format for 24bit depth/8bit stencil where the R200 used SZZZ. > matrix stack for modelview, projection & texture (is the information > of radeon enought ?) I think I've pretty much got that down. The R300 has a very flexible programmable vertex processor, and the driver is responsible for setting up the correct matrices. > T&L route Again, I think I've got most of that down. If you could help with texture specification/formats, I'd be very thankful. The register addresses are already in r300_reg.h, but the texture format itself, how mipmaps work etc. is still a mystery. > I think with this feature we could make a quite good first hardware > accelerated driver. > > By the way i find that clear_depth_buffer & clear_color_buffer are quite > complex. Is all the stuff they have in really necessary ? (i will try to > look at that latter but if someone already done it). No, most of those are redundant state updates produced by ATI's proprietary driver. Again, look at how the work-in-progress DRI driver does it. > Oh yes i almost forgot :) my device id is 0x4e4a (it is a radeon 9800) I've added this and other IDs from pciids.sf.net to the experimental driver in r300_driver/drm cu, Nicolai > Jerome Glisse
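For reference, here is my reading of the two layouts as bit packings. This is pure speculation on my part: "ZZZS" read as depth in the most significant 24 bits with stencil in the low 8, and "SZZZ" the other way around; nothing here is confirmed by documentation.

```c
/* Speculative depth/stencil packing, guessed from the ZZZS vs SZZZ
 * naming above.  z24 is a 24-bit depth value, s8 an 8-bit stencil value. */
unsigned pack_r300_zzzs(unsigned z24, unsigned s8)
{
    /* depth in bits 31..8, stencil in bits 7..0 */
    return ((z24 & 0xffffffu) << 8) | (s8 & 0xffu);
}

unsigned pack_r200_szzz(unsigned z24, unsigned s8)
{
    /* stencil in bits 31..24, depth in bits 23..0 */
    return ((s8 & 0xffu) << 24) | (z24 & 0xffffffu);
}
```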
Re: [Mesa3d-dev] Doom3 works on R200!
On Sunday 24 October 2004 19:38, Bernardo Innocenti wrote: > Even though I just have a Radeon 9200, I'm very excited about the > ongoning R300 effort and with there was a similar project for NVidia > cards too. If that "with" above is a "wish" like I think it probably is, you might want to have a look at Utah-GLX which has rudimentary hw accel support. Also, somebody, somewhere (possibly in the nv driver in X, but I'm not sure) figured out how to do DMA. Of course, what's really needed is the equivalent of glxtest for NVidia and somebody with NVidia hardware who has a few weeks to spare for long nights of puzzling over register dumps :) cu, Nicolai pgppeOhHYFYMY.pgp Description: PGP signature
DRM linux-core: inter_module_get("agp")
Hi, shouldn't the inter_module_get("agp") in drm_core_init() be inter_module_get("drm_agp") instead? "drm_agp" is what the old (non-core) DRM uses, and it works for me (unlike "agp"). Also, which kernel version do I need for the symbol_get() thing to work? cu, Nicolai pgpF3XicFohtp.pgp Description: PGP signature
Re: Radeon 9600 with radeon DRM module
On Monday 18 October 2004 16:04, Tino Keitel wrote: > On Mon, Oct 18, 2004 at 09:13:57 +0200, Tino Keitel wrote: > > [...] > > > Thanks again. Looks like I used the wrong 2d driver patch before > > (xorg680.atipatch.r300). Now the glxinfo output looks right: > > > > OpenGL renderer string: Mesa DRI R300 20040924 AGP 4x NO-TCL > > > > However, glxgears only prints out this messages and exits: > > > > disabling 3D acceleration > > drmCommandWrite: -22 You're probably using the main-kernel or DRI version of the DRM. You need the DRM from r300_driver/drm, because only that version of the DRM implements the new ioctls. > Hm, this looks funny (from r300_context.c): > > if (1 || > driQueryOptionb(&r300->radeon.optionCache, "no_rast")) { > fprintf(stderr, "disabling 3D acceleration\n"); > > Is this intended behaviour? I thought the r300 only exists to provide > 3D acceleration. That's perfectly correct behaviour. My intention is to start with a purely software rendered driver and go from there. Right now, no primitives will be hardware accelerated, only glClear() actually uses the hardware path. Yes, that's a disappointment, but at least the driver is actually very stable (for me, that is ;)). If you think there should be more features, your help is always welcome :) cu, Nicolai pgpDLuMqoy58P.pgp Description: PGP signature
[r300] r300_driver update
Hi, I have uploaded my changes to the r300_driver CVS. I haven't merged any changes to the R200 driver that might apply, and I haven't merged the drm-core changes. I will do that within the next few days. Accelerated color buffer clear and basic clipping (without GL scissors) works, although I have noticed a flickering regression - this might be an interaction with updated Mesa CVS or a stupid merging mistake on my part, I'm not sure. I have also uploaded a new patch for the DDX, which is especially necessary for stability. With this patch, there are currently no known lockups, and I have tested running multiple DRI clients thoroughly. Securing against lockups comes at a price. The basic problem is that there is too little communication between what the DRM writes to the ring buffer and what the X server sends via indirect buffers. For now, the X server always issues a single cp_idle ioctl before sending indirect buffers. To quote my comment in radeon_accel.c:

/* TODO: Fix this more elegantly.
 * Sometimes (especially with multiple DRI clients), this code
 * runs immediately after a DRI client issues a rendering command.
 *
 * The accel code regularly inserts WAIT_UNTIL_IDLE into the
 * command buffer that is sent with the indirect buffer below.
 * The accel code fails to set the 3D cache flush registers for
 * the R300 before sending WAIT_UNTIL_IDLE. Sending a cache flush
 * on these new registers is not necessary for pure 2D functionality,
 * but it *is* necessary after 3D operations.
 * Without the cache flushes before WAIT_UNTIL_IDLE, the R300 locks up.
 *
 * The CP_IDLE call into the DRM indirectly flushes all caches and
 * thus avoids the lockup problem, but the solution is far from ideal.
 * Better solutions could be:
 * - always flush caches when entering the X server
 * - track the type of rendering commands somewhere and issue
 *   cache flushes when they change
 * However, I don't feel confident enough with the control flow
 * inside the X server to implement either fix.
-- nh */ My hope is that the X server will become just another DRM client as far as accelerated rendering is concerned, which will eventually allow this to be dealt with cleanly in the DRM. I am not interested in performance right now, and since these idle calls seem to be the most foolproof thing to do, I will leave them in. cu, Nicolai pgpACVOpeXt02.pgp Description: PGP signature
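One way to picture the "track the type of rendering commands" alternative from the comment above: keep a tiny mode flag and emit the R300 3D cache flush only on a transition away from 3D. All names here are invented for illustration; the real fix would live in the X server's accel paths or in the DRM.

```c
/* Sketch of command-type tracking: flush the 3D caches only when we
 * switch away from 3D commands, instead of issuing cp_idle every time.
 * Hypothetical names; not actual radeon_accel.c code. */
enum engine_mode { MODE_IDLE, MODE_2D, MODE_3D };

static enum engine_mode last_mode = MODE_IDLE;
static int flushes_emitted = 0;

static void emit_3d_cache_flush(void)
{
    /* would write the R300 3D cache flush registers here */
    flushes_emitted++;
}

/* Call before emitting commands of the given type. */
void begin_commands(enum engine_mode mode)
{
    if (last_mode == MODE_3D && mode != MODE_3D)
        emit_3d_cache_flush();  /* 3D caches must be flushed first */
    last_mode = mode;
}
```

The point of the exercise: consecutive 3D submissions cost nothing extra, and only the 3D-to-2D boundary pays for a flush, which is exactly where the lockup occurs today.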
SW fallback: clipping bug [patch]
Hi, There is disagreement about the meaning of the CLIPSPAN _n parameter in CVS. The drivers I have looked at and drivers/dri/common/spantmp.h treat _n as the number of pixels in the span after clipping. depthtmp.h and stenciltmp.h treat _n as the end+1 x coordinate of the span. This inconsistency leads to artifacts when software fallbacks are hit while clipping is used, especially with partially obscured clients. The attached patch should fix these artifacts by changing depthtmp.h and stenciltmp.h appropriately. cu, Nicolai Index: drivers/dri/common/depthtmp.h === RCS file: /cvs/mesa/Mesa/src/mesa/drivers/dri/common/depthtmp.h,v retrieving revision 1.5 diff -u -p -b -r1.5 depthtmp.h --- drivers/dri/common/depthtmp.h 8 Oct 2004 22:21:09 - 1.5 +++ drivers/dri/common/depthtmp.h 15 Oct 2004 19:48:14 - @@ -45,15 +45,15 @@ static void TAG(WriteDepthSpan)( GLconte GLint i = 0; CLIPSPAN( x, y, n, x1, n1, i ); - if ( DBG ) fprintf( stderr, "WriteDepthSpan %d..%d (x1 %d)\n", - (int)i, (int)n1, (int)x1 ); + if ( DBG ) fprintf( stderr, "WriteDepthSpan %d..%d (x1 %d) (mask %p)\n", + (int)i, (int)n1, (int)x1, mask ); if ( mask ) { - for ( ; i < n1 ; i++, x1++ ) { + for ( ; n1>0 ; i++, x1++, n1-- ) { if ( mask[i] ) WRITE_DEPTH( x1, y, depth[i] ); } } else { - for ( ; i < n1 ; i++, x1++ ) { + for ( ; n1>0 ; i++, x1++, n1-- ) { WRITE_DEPTH( x1, y, depth[i] ); } } @@ -87,11 +87,11 @@ static void TAG(WriteMonoDepthSpan)( GLc __FUNCTION__, (int)i, (int)n1, (int)x1, (GLuint)depth ); if ( mask ) { - for ( ; i < n1 ; i++, x1++ ) { + for ( ; n1>0 ; i++, x1++, n1-- ) { if ( mask[i] ) WRITE_DEPTH( x1, y, depth ); } } else { - for ( ; i < n1 ; i++, x1++ ) { + for ( ; n1>0 ; x1++, n1-- ) { WRITE_DEPTH( x1, y, depth ); } } @@ -162,8 +162,9 @@ static void TAG(ReadDepthSpan)( GLcontex { GLint i = 0; CLIPSPAN( x, y, n, x1, n1, i ); - for ( ; i < n1 ; i++ ) - READ_DEPTH( depth[i], (x1+i), y ); + for ( ; n1>0 ; i++, n1-- ) { + READ_DEPTH( depth[i], x+i, y ); + } } HW_ENDCLIPLOOP(); #endif Index: 
drivers/dri/common/stenciltmp.h === RCS file: /cvs/mesa/Mesa/src/mesa/drivers/dri/common/stenciltmp.h,v retrieving revision 1.2 diff -u -p -b -r1.2 stenciltmp.h --- drivers/dri/common/stenciltmp.h 6 Aug 2003 18:12:22 - 1.2 +++ drivers/dri/common/stenciltmp.h 15 Oct 2004 19:48:15 - @@ -41,13 +41,13 @@ static void TAG(WriteStencilSpan)( GLcon if (mask) { - for (;i<n1;i++,x1++) + for (;n1>0;i++,x1++,n1--) if (mask[i]) WRITE_STENCIL( x1, y, stencil[i] ); } else { - for (;i<n1;i++,x1++) + for (;n1>0;i++,x1++,n1--) WRITE_STENCIL( x1, y, stencil[i] ); } } @@ -107,8 +107,8 @@ static void TAG(ReadStencilSpan)( GLcont { GLint i = 0; CLIPSPAN(x,y,n,x1,n1,i); - for (;i<n1;i++) - READ_STENCIL( stencil[i], (x1+i), y ); + for (;n1>0;i++,n1--) + READ_STENCIL( stencil[i], (x+i), y ); } HW_ENDCLIPLOOP(); }
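The inconsistency is easiest to see by counting pixels under the two interpretations. Suppose CLIPSPAN clips a span and returns i = 2 and _n = 4, where 4 is the clipped pixel count (the spantmp.h convention). The old depthtmp/stenciltmp loops treated _n as an end bound instead:

```c
/* Pixels written by the old loop, which treats n1 as an end bound:
 *     for (; i < n1; i++, x1++) */
static int old_loop_pixels(int i, int n1)
{
    int count = 0;
    for (; i < n1; i++)
        count++;
    return count;
}

/* Pixels written by the patched loop, which treats n1 as a count:
 *     for (; n1 > 0; i++, x1++, n1--) */
static int new_loop_pixels(int i, int n1)
{
    int count = 0;
    for (; n1 > 0; i++, n1--)
        count++;
    return count;
}
```

With i = 2 and n1 = 4, the old loop writes only 2 pixels instead of 4 - precisely the kind of clipped-edge artifact described above for partially obscured clients.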
Re: R300 driver update
On Tuesday 28 September 2004 18:03, Alex Deucher wrote: > I think Nicolai has proven his competence as a coder, I think it'd be > ok to give him mesa cvs access. it might be easier to develop in mesa > cvs to keep synced up and such. Thoughts? I suppose you might want > to wait till 6.2 is tagged. Thank you, I'd be happy to work in a branch of Mesa CVS. The thing is, I'm moving this weekend and starting my first semester of university soon afterwards. It will take some time for me to get regular internet access and for things to settle again in general. So I'd rather wait with making this move. cu, Nicolai > Alex pgptFBEFqgfX1.pgp Description: PGP signature
Re: R300 driver update
On Tuesday 28 September 2004 15:02, Vladimir Dergachev wrote: > On Tue, 28 Sep 2004, Nicolai Haehnle wrote: > Hi Nicolai, you can just rename the driver so it produces r300_dri.so - > the 2d driver is in fact configured to tell DRI clients to use that > binary. Well, it does produce r300_dri.so right now. The reason I am doing the work in an r300/ subdirectory is that development is going on in a different CVS repository, and I really don't fancy worrying about complicated merges with Mesa CVS (across file renames, at that!). Or maybe I somehow misunderstood what you meant? On a happier note, Clear doesn't lock up anymore, but the coordinate calculation seems to be all wrong. cu, Nicolai
R300 driver update
Hi, I decided to commit what I have in terms of an R300 driver so far. You can find it in the r300 project on SourceForge in the r300_driver module: cvs -d:pserver:[EMAIL PROTECTED]:/cvsroot/r300 checkout r300_driver As you can easily see, I started with the R200 driver. Since I didn't know what to rip out and what to keep in at first, I decided to take the extra time and separate stuff out into Radeon generic and R200/R300 code. So in theory, the driver should still work on R200 hardware, although a) I couldn't test this and b) it's not quite up to date (about a week old). I have written state emission code which works, and I have started implementing hardware accelerated clear. Something does happen on the screen, but for now I get a hard lockup immediately afterwards. I'll let you know when the driver is more usable. cu, Nicolai
Re: Where is the source for DRM 1.5?
On Monday 27 September 2004 19:00, Barry Scott wrote: > I have failed to find a tar ball of CVS tag that names any > specific version of DRM. What did I miss? To my knowledge, there is no global DRM version like this. Each driver has an interface version number, but this number does not necessarily mark a particular release of DRM. Basically, if your X windows driver or the 3D client driver complains that the DRM is too old, just upgrade to the latest version from your distribution. You can do this by upgrading the kernel. If the distribution's kernel is outdated, either compile your own kernel from stock sources or, if you're feeling adventurous, get the DRM CVS (I would link you to the download page on http://dri.sourceforge.net/ , but it appears to be down right now). cu, Nicolai pgpbesNNmaMmV.pgp Description: PGP signature
R200 - save_on_next_unlock
Hi, I'm trying to completely understand the command buffer stuff for my R300 driver work, and I noticed something about the save_on_next_unlock code. I'm concerned about the state_atom::check function. The check functions use the current state of the context to figure out which atoms must be emitted. Now, consider the following scenario:
1. The driver unlocks and saves the state.
2. The application issues a rendering command. The buffer is not full yet, so FlushCmdBuf isn't called.
3. The application changes some state (e.g. texture enable/disable) and issues some more rendering commands.
4. This time, FlushCmdBuf is called. It sees the lost_context flag and triggers the state backup code.
5. The state backup code still uses check() to see which state atoms are active, but this produces wrong results, because the state at time (2), when the commands at the beginning of the buffer were emitted, is different from the state now.
So either I haven't fully understood the mechanisms yet (please enlighten me), or I found your bug ;) Unfortunately, I can't verify this because I don't have R200 hardware. cu, Nicolai
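The scenario can be boiled down to a toy model (all names invented, nothing from the actual R200 code): check() consults the *current* context state, but the buffered commands were emitted under older state.

```c
/* Toy model of the save_on_next_unlock hazard: a single flag stands
 * in for the context state, and tex_atom_check() plays the role of
 * state_atom::check. */
struct ctx { int tex_enabled; };

static int tex_atom_check(const struct ctx *c)
{
    return c->tex_enabled;
}

/* Returns 1 if the backup taken at flush time disagrees with the
 * state that was current when the buffer started filling. */
int backup_is_wrong(void)
{
    struct ctx c = { 0 };               /* 1. unlock, state saved      */
    int at_buffer_start, at_flush;

    at_buffer_start = tex_atom_check(&c); /* 2. commands get buffered  */
    c.tex_enabled = 1;                    /* 3. app changes state      */
    at_flush = tex_atom_check(&c);        /* 4./5. backup uses check() */
    return at_flush != at_buffer_start;
}
```

In this model the backup is always wrong whenever step 3 toggles the state, which is the claim made in step 5 above.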
Software fallback fixes and R300 driver work
Hi, apparently I'm the first to use a full software fallback for glClear(), as I ran into a few problems that the attached patch should fix: - spantmp.h doesn't check for NULL masks - add a WriteMonoDepthSpan function to the swrast to driver interface - use this function to clear the depth buffer in swrast when available (swrast depth buffer clearing completely ignores driver functions right now) I decided to take it to the next level and actually start hacking on a DRI driver for the R300. So far I modified the R200 driver to recognize the R300 family and use 100% software fallbacks in this case. I will put source up as soon as some rasterization is actually hardware accelerated. One thing I noticed in the process: r200Flush() unconditionally calls r200EmitState(). Is this really necessary? I was assuming that glFlush() could be a noop when it's not preceded by any rendering commands. cu, Nicolai Index: drivers/dri/common/depthtmp.h === RCS file: /cvs/mesa/Mesa/src/mesa/drivers/dri/common/depthtmp.h,v retrieving revision 1.2 diff -u -p -r1.2 depthtmp.h --- drivers/dri/common/depthtmp.h 6 Aug 2003 18:12:22 - 1.2 +++ drivers/dri/common/depthtmp.h 23 Sep 2004 23:27:25 - @@ -64,6 +64,42 @@ static void TAG(WriteDepthSpan)( GLconte HW_WRITE_UNLOCK(); } +static void TAG(WriteMonoDepthSpan)( GLcontext *ctx, + GLuint n, GLint x, GLint y, + const GLdepth depth, + const GLubyte mask[] ) +{ + HW_WRITE_LOCK() + { + GLint x1; + GLint n1; + LOCAL_DEPTH_VARS; + + y = Y_FLIP( y ); + + HW_CLIPLOOP() + { + GLint i = 0; + CLIPSPAN( x, y, n, x1, n1, i ); + + if ( DBG ) fprintf( stderr, "%s %d..%d (x1 %d) = %u\n", + __FUNCTION__, (int)i, (int)n1, (int)x1, (uint)depth ); + + if ( mask ) { + for ( ; i < n1 ; i++, x1++ ) { + if ( mask[i] ) WRITE_DEPTH( x1, y, depth ); + } + } else { + for ( ; i < n1 ; i++, x1++ ) { + WRITE_DEPTH( x1, y, depth ); + } + } + } + HW_ENDCLIPLOOP(); + } + HW_WRITE_UNLOCK(); +} + static void TAG(WriteDepthPixels)( GLcontext *ctx, GLuint n, const GLint x[], 
Index: drivers/dri/common/spantmp.h === RCS file: /cvs/mesa/Mesa/src/mesa/drivers/dri/common/spantmp.h,v retrieving revision 1.2 diff -u -p -r1.2 spantmp.h --- drivers/dri/common/spantmp.h 6 Aug 2003 18:12:22 - 1.2 +++ drivers/dri/common/spantmp.h 23 Sep 2004 23:27:25 - @@ -123,15 +123,29 @@ static void TAG(WriteRGBAPixels)( const HW_WRITE_CLIPLOOP() { - for (i=0;i0;i++,x1++,n1--) - if (mask[i]) + if (mask) + { + for (;n1>0;i++,x1++,n1--) + if (mask[i]) + WRITE_PIXEL( x1, y, p ); + } + else + { + for (;n1>0;i++,x1++,n1--) WRITE_PIXEL( x1, y, p ); + } } HW_ENDCLIPLOOP(); } @@ -186,12 +208,23 @@ static void TAG(WriteMonoRGBAPixels)( co HW_WRITE_CLIPLOOP() { - for (i=0;iVisual.depthBits == 0 - || !ctx->DrawBuffer->DepthBuffer || !ctx->Depth.Mask) { /* no depth buffer, or writing to it is disabled */ return; } + if (swrast->Driver.WriteMonoDepthSpan) { + const GLdepth clearValue = (GLdepth)(ctx->Depth.Clear * ctx->DepthMax); + const GLint x = ctx->DrawBuffer->_Xmin; + const GLint y = ctx->DrawBuffer->_Ymin; + const GLint height = ctx->DrawBuffer->_Ymax - ctx->DrawBuffer->_Ymin; + const GLint width = ctx->DrawBuffer->_Xmax - ctx->DrawBuffer->_Xmin; + GLint i; + + for (i = 0; i < height; i++) { + (*swrast->Driver.WriteMonoDepthSpan)( ctx, width, x, y + i, + clearValue, NULL ); + } + + return; + } + + if (!ctx->DrawBuffer->DepthBuffer) + return; + /* The loops in this function have been written so the IRIX 5.3 * C compiler can unroll them. Hopefully other compilers can too! */ Index: swrast/swrast.h === RCS file: /cvs/mesa/Mesa/src/mesa/swrast/swrast.h,v retrieving revision 1.40 diff -u -p -b -r1.40 swrast.h --- swrast/swrast.h 21 Mar 2004 17:05:05 - 1.40 +++ swrast/swrast.h 23 Sep 2004 23:25:41 - @@ -403,6 +403,13 @@ struct swrast_device_driver { * depth[i] value if mask[i] is nonzero. */ + void (*WriteMonoDepthSpan)( GLcontext *ctx, GLuint n, GLint x, GLint y, + const GLdepth depth, const GLubyte mask[] ); + /* Write a horizontal run of depth values. 
+* If mask is NULL, draw all pixels. +* If mask is not null, only draw pixel [i] when mask [i] is true. +*/ + void (*ReadDepthSpan)( GLcontext *ctx, GLuint n, GLint x, GLint y, GLdepth depth[] ); /* Read a horizontal span of values from th
Re: [r300] - likely compatibility w rv360?
Hi, On Wednesday 22 September 2004 00:45, Dag Bakke wrote: > If I load "dri" in my xorg.conf, the graphics gets wedged as soon > as I start X. I get more or less garbled stuff from the previous session. I > can move the cursor, but that's it. I can not exit from X with > ctrl-alt-backspace, or shift to the console with ctrl-alt-f1. I also tried > without radeonfb, just in case. I see no evidence of problems in the Xorg > log with dri, which I can review after rebooting via sysrq. Of course, if > the machine panicked, nothing got to the kernel log. And no, my wireless > keyboard does not have keyboard leds.. Whenever this has happened to me, it was caused by bad microcode. Check your syslog for a message "Loading Rx00 microcode", and make sure it says R300. If it does, then maybe this chip does need different microcode, as Vladimir said. cu, Nicolai pgpEXEqNbM9KJ.pgp Description: PGP signature
Re: [r300] - likely compatibility w rv360?
On Tuesday 21 September 2004 20:18, Dag Bakke wrote: > Hi. > > I am very happy to see that the more recent Radeons are being looked into. > > A few questions: > 1. Is the rv360 (9600xt) close enough to the developers hardware to > a) benefit from the 2d improvements already made w.r.t. CP acceleration > b) be of any use for testing purposes > > If the answer to a) is "don't know" I'll be happy to try. But I'll > skip it if somebody says "don't bother". Yes, I think the rv360 is basically the same as one of the chips Vladimir has, so it's definitely worth a try. > 2. Should I assume that you guys will stick to 6.7.0 until a more uptodate > binary driver is available? Actually, I'm using recent X.Org CVS for my DRI testing platform. > 3. Anyone tried to patch 6.8.1 with the 2D patch? Yes. There's a new patch on r300.sf.net which should work with 6.8.1, although it's not pretty. > and finally: > > 4. Is the 9250 supported by the current dri code? No idea, but it's probably something R2xx-ish, so it should be. cu, Nicolai pgpsJMyjIl8oa.pgp Description: PGP signature
Re: [R300] pixel shader
On Sunday 19 September 2004 03:53, Vladimir Dergachev wrote: > Hi Nicolai, > > I committed a modification of pretty_print_command_stream.tcl that > decodes most of PFS_INSTR* registers. > > It still prints the actual value written - as a last value after equals > sign. So, I am hoping that even if this messed up your disassembler it is > easy to fix - I am not that proficient in Python to venture modifying your > code :) > > Also, r300_demo now have headers for both vertex shader and pixel > shader. It did mess up the disassembler which uses a simple regex to catch the data, but it's an easy fix. > Lastly, I think it would be useful to have an assembler for vertex > shaders and pixel shaders that does the job similar to those DirectX > functions that translate textual description into coded on (I also believe > that OpenGL 2.0 should have something like this as well). Doesn't Mesa already support ARB_vertex_program and ARB_fragment_program? I think it would be best to add R300 programs as an additional backend for the already existing infrastructure, though I have no idea how flexible the existing code is - I haven't looked at it in detail. cu, Nicolai pgpPA5WoUzFsG.pgp Description: PGP signature
Re: R300 development
On Tuesday 14 September 2004 17:01, Vladimir Dergachev wrote: > Hi all, > > The new project name on SF is R300, the registration just went through, > so I am in the process of setting things up. > > Everyone is welcome ! Also, despite the name, this project is *not* > just about R300. If are interested in finding out more about earlier > radeons this is a place to exchange (public !) info about them as well. I have committed my latest registers file into the r300_demo directory of CVS. Changes from the version posted to the mailing list: - better understanding of how vertex program and data is uploaded - some work towards input/output control of vertex programs - start with decoding fragment programs cu, Nicolai pgpgJTUrUI4tj.pgp Description: PGP signature
R300 registers
Hi, while I've had less success (read: hard locks and reboots) with the recent drmtest and r300_demo, I did use glxtest to find out registers of the R300. Basically, what I did is run a small GL program, get the command buffer, make some small changes and rerun. Often, this results in only a small change in the command buffer (found using diff), which makes it possible to guess register addresses and constants. So while I haven't been able to cross-test my results by _sending_ commands using my new knowledge, I am pretty certain that they are mostly correct (as long as there is no explicit comment stating otherwise). So far, I have found registers for alpha blending and testing among other things. I have also decoded most of the vertex program instruction set. However, I do not have the registers for vertex program *setup* yet. All I know is that both the program and its environment/parameters (whatever you want to call it) are uploaded via 0x2208. All my findings are documented in the attached header file. cu, Nicolai

// The entire range from 0x2300 to 0x24AC inclusive seems to be used for
// immediate vertices
#define R300_VAP_VTX_COLOR_R        0x2464
#define R300_VAP_VTX_COLOR_G        0x2468
#define R300_VAP_VTX_COLOR_B        0x246C
#define R300_VAP_VTX_POS_0_X_1      0x2490 // used for glVertex2*()
#define R300_VAP_VTX_POS_0_Y_1      0x2494
#define R300_VAP_VTX_COLOR_PKD      0x249C // RGBA
#define R300_VAP_VTX_POS_0_X_2      0x24A0 // used for glVertex3*()
#define R300_VAP_VTX_POS_0_Y_2      0x24A4
#define R300_VAP_VTX_POS_0_Z_2      0x24A8
#define R300_VAP_VTX_END_OF_PKT     0x24AC // write 0 to indicate end of packet?
/* gap */

#define R300_PP_ALPHA_TEST                  0x4BD4
# define R300_REF_ALPHA_MASK                0x00ff
# define R300_ALPHA_TEST_FAIL               (0 << 8)
# define R300_ALPHA_TEST_LESS               (1 << 8)
# define R300_ALPHA_TEST_LEQUAL             (2 << 8)
# define R300_ALPHA_TEST_EQUAL              (3 << 8)
# define R300_ALPHA_TEST_GEQUAL             (4 << 8)
# define R300_ALPHA_TEST_GREATER            (5 << 8)
# define R300_ALPHA_TEST_NEQUAL             (6 << 8)
# define R300_ALPHA_TEST_PASS               (7 << 8)
# define R300_ALPHA_TEST_OP_MASK            (7 << 8)
# define R300_ALPHA_TEST_ENABLE             (1 << 11)

/* gap */

// Notes:
// - AFAIK fglrx always sets BLEND_UNKNOWN when blending is used in the application
// - AFAIK fglrx always sets BLEND_NO_SEPARATE when CBLEND and ABLEND are set to the same
//   function (both registers are always set up completely in any case)
// - Most blend flags are simply copied from R200 and not tested yet
#define R300_RB3D_CBLEND                    0x4E04
#define R300_RB3D_ABLEND                    0x4E08
/* the following only appear in CBLEND */
# define R300_BLEND_ENABLE                  (1 << 0)
# define R300_BLEND_UNKNOWN                 (3 << 1)
# define R300_BLEND_NO_SEPARATE             (1 << 3)
/* the following are shared between CBLEND and ABLEND */
# define R300_FCN_MASK                      (3 << 12)
# define R300_COMB_FCN_ADD_CLAMP            (0 << 12)
# define R300_COMB_FCN_ADD_NOCLAMP          (1 << 12)
# define R300_COMB_FCN_SUB_CLAMP            (2 << 12)
# define R300_COMB_FCN_SUB_NOCLAMP          (3 << 12)
# define R300_SRC_BLEND_GL_ZERO             (32 << 16)
# define R300_SRC_BLEND_GL_ONE              (33 << 16)
# define R300_SRC_BLEND_GL_SRC_COLOR        (34 << 16)
# define R300_SRC_BLEND_GL_ONE_MINUS_SRC_COLOR (35 << 16)
# define R300_SRC_BLEND_GL_DST_COLOR        (36 << 16)
# define R300_SRC_BLEND_GL_ONE_MINUS_DST_COLOR (37 << 16)
# define R300_SRC_BLEND_GL_SRC_ALPHA        (38 << 16)
# define R300_SRC_BLEND_GL_ONE_MINUS_SRC_ALPHA (39 << 16)
# define R300_SRC_BLEND_GL_DST_ALPHA        (40 << 16)
# define R300_SRC_BLEND_GL_ONE_MINUS_DST_ALPHA (41 << 16)
# define R300_SRC_BLEND_GL_SRC_ALPHA_SATURATE (42 << 16)
# define R300_SRC_BLEND_MASK                (63 << 16)
# define R300_DST_BLEND_GL_ZERO             (32 << 24)
# define R300_DST_BLEND_GL_ONE              (33 << 24)
# define R300_DST_BLEND_GL_SRC_COLOR        (34 << 24)
# define R300_DST_BLEND_GL_ONE_MINUS_SRC_COLOR (35 << 24)
# define R300_DST_BLEND_GL_DST_COLOR        (36 << 24)
# define R300_DST_BLEND_GL_ONE_MINUS_DST_COLOR (37 << 24)
# define R300_DST_BLEND_GL_SRC_ALPHA        (38 << 24)
# define R300_DST_BLEND_GL_ONE_MINUS_SRC_ALPHA (39 << 24)
# define R300_DST_BLEND_GL_DST_ALPHA        (40 << 24)
# define R300_DST_BLEND_GL_ONE_MINUS_DST_ALPHA (41 << 24)
# define R300_DST_BLEND_MASK                (63 << 24)
#define R300_RB3D_COLORMASK
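As a sanity check of the guessed alpha-test layout, here is how a glAlphaFunc(GL_LEQUAL, 0.5) state might be encoded with the defines above. The register layout is still guesswork, so treat this as illustration only.

```c
/* Defines copied from the guessed header above (unconfirmed layout). */
#define R300_REF_ALPHA_MASK     0x00ff
#define R300_ALPHA_TEST_LEQUAL  (2 << 8)
#define R300_ALPHA_TEST_ENABLE  (1 << 11)

/* Encode an alpha-test compare op plus a reference value in [0,1],
 * assuming the reference is an 8-bit value in the low byte. */
unsigned r300_alpha_test(unsigned op, float ref)
{
    unsigned r = (unsigned)(ref * 255.0f + 0.5f) & R300_REF_ALPHA_MASK;
    return r | op | R300_ALPHA_TEST_ENABLE;
}
```

If the guessed layout is right, fglrx command dumps for a program calling glAlphaFunc should show exactly such values written to 0x4BD4, which would be an easy way to cross-check with glxtest.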
Re: Radeon 7200 problems
On Friday 04 June 2004 12:22, Michel Dänzer wrote: > > Currently, if you set the gart size manually higher than what's possible > > (set in bios), dri will just get disabled due to missing agp support, > > which I consider bad behaviour, and that you get a useless error message > > in that case doesn't help neither. > > (II) RADEON(0): [agp] 262144 kB allocated with handle 0x0001 > > (EE) RADEON(0): [agp] Could not bind > > (EE) RADEON(0): [agp] AGP failed to initialize. Disabling the DRI. > > (II) RADEON(0): [agp] You may want to make sure the agpgart kernel > > module is loaded before the radeon kernel module. > > IMHO only the 'Could not bind' error could use some clarification, > otherwise I find this the only sane way to deal with an impossible > configuration. Would it be possible to do an automatic fallback to the largest allowed gart size, along with an appropriate warning/error message? Tell me to shut up if it's not possible to query the maximum size ;) cu, Nicolai
Re: Development setup
On Tuesday 25 May 2004 23:43, Maurice van der Pot wrote: > The modifications I made to the driver were visible when I executed an > OpenGL app, so I knew it was using the right r200_dri.so. Strangely, > I was unable to get most of the debug prints working. The general ones > with LIBGL_DEBUG seemed to work, but not the r200 specific ones. Look in r200_context.h for DO_DEBUG, and set it to 1. I assume you've already set the environment variable R200_DEBUG, as well. cu, Nicolai
R300: Recovering from lockups
As you may be aware, I was trying to get R300 support into a state where it
is possible to start OpenGL applications, let them hang the CP, and *not*
bring down the entire machine. It looks like I was successful :)

The attached patch ati.unlock.1.patch against the DDX makes sure the RBBM
(whatever that means; I'm guessing Ring Buffer something or other) is reset
in RADEONEngineReset(), before any other register is accessed that could
potentially cause a final crash (DSTCACHE_* is the major offender in this
category).

Now, since I don't have any Radeon-related documentation at all, I have no
idea whether this patch will work on any other chip. For all I know, it
might totally break the driver on R100/R200. I'm especially confused by the
fact that the bottom half of EngineReset() treats RBBM_SOFT_RESET
differently for the R300. Can anybody explain why? Maybe it would even be
safest/cleanest to move the entire RBBM_SOFT_RESET block to the top of the
function?

I can now launch glxgears several times in a row. It will be killed a few
seconds later (during this time the GUI will hang), and as far as I can
tell, everything continues to work normally afterwards. Of course, for all
I know the 3D part of the chip might still be wedged internally, which would
make this patch (partially) useless for working on the driver. I guess I'll
find out soon enough.

Important: You'll need my watchdog patch for the DRM from the other thread.
Otherwise, the reset code in the X server will never be called, and this
patch will have no effect.

I would also like to point out that the modified xf86 driver that was posted
on this list (see http://volodya-project.sourceforge.net/R300.php) does not
check the version of the DRM. I know this is a silly, minor point to make at
this time, but I've attached a small patch to fix it anyway.
cu,
Nicolai

diff -ur -x '*.o' ati-vladimir/radeon_accel.c ati/radeon_accel.c
--- ati-vladimir/radeon_accel.c	2004-05-20 16:02:24.0 +0200
+++ ati/radeon_accel.c	2004-05-25 21:14:24.0 +0200
@@ -170,6 +170,31 @@
     CARD32 rbbm_soft_reset;
     CARD32 host_path_cntl;
 
+    /* The following RBBM_SOFT_RESET sequence can help un-wedge
+     * an R300 after the command processor got stuck.
+     */
+    rbbm_soft_reset = INREG(RADEON_RBBM_SOFT_RESET);
+    OUTREG(RADEON_RBBM_SOFT_RESET, (rbbm_soft_reset |
+                                    RADEON_SOFT_RESET_CP |
+                                    RADEON_SOFT_RESET_HI |
+                                    RADEON_SOFT_RESET_SE |
+                                    RADEON_SOFT_RESET_RE |
+                                    RADEON_SOFT_RESET_PP |
+                                    RADEON_SOFT_RESET_E2 |
+                                    RADEON_SOFT_RESET_RB));
+    INREG(RADEON_RBBM_SOFT_RESET);
+    OUTREG(RADEON_RBBM_SOFT_RESET, (rbbm_soft_reset & (CARD32)
+                                    ~(RADEON_SOFT_RESET_CP |
+                                      RADEON_SOFT_RESET_HI |
+                                      RADEON_SOFT_RESET_SE |
+                                      RADEON_SOFT_RESET_RE |
+                                      RADEON_SOFT_RESET_PP |
+                                      RADEON_SOFT_RESET_E2 |
+                                      RADEON_SOFT_RESET_RB)));
+    INREG(RADEON_RBBM_SOFT_RESET);
+    OUTREG(RADEON_RBBM_SOFT_RESET, rbbm_soft_reset);
+    INREG(RADEON_RBBM_SOFT_RESET);
+
     RADEONEngineFlush(pScrn);
 
     clock_cntl_index = INREG(RADEON_CLOCK_CNTL_INDEX);
diff -ur -x '*.o' ati-vladimir/radeon_accelfuncs.c ati/radeon_accelfuncs.c
--- ati-vladimir/radeon_accelfuncs.c	2004-05-20 16:02:24.0 +0200
+++ ati/radeon_accelfuncs.c	2004-05-25 21:13:37.0 +0200
@@ -122,7 +122,7 @@
 	    xf86DrvMsg(pScrn->scrnIndex, X_ERROR,
 		       "%s: CP idle %d\n", __FUNCTION__, ret);
 	}
-    } while ((ret == -EBUSY) && (i++ < RADEON_TIMEOUT));
+    } while ((ret == -EBUSY) && (i++ < RADEON_TIMEOUT/1)); /* the ioctl has an internal delay */
 
     if (ret == 0) return;
--- ati-vladimir/radeon_dri.c	2004-05-20 16:02:24.0 +0200
+++ ati/radeon_dri.c	2004-05-20 16:13:47.0 +0200
@@ -1369,6 +1369,9 @@
     if (info->IsIGP) {
 	req_minor = 10;
 	req_patch = 0;
+    } else if (info->ChipFamily >= CHIP_FAMILY_R300) {
+	req_minor = 11;
+	req_patch = 1;
     } else if (info->ChipFamily >= CHIP_FAMILY_R200) {
 	req_minor = 5;
 	req_patch = 0;
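The version requirement added by the radeon_dri.c hunk above amounts to an ordered comparison of (minor, patchlevel) pairs. A hedged sketch of that check (the function name is invented; the real DDX spreads this logic across its DRI screen init):

```c
/* Hypothetical version check: returns 1 if the running DRM (have_*)
 * satisfies the driver's requirement (req_*), comparing the minor
 * version first and the patchlevel only on a tie. */
static int drm_version_ok(int have_minor, int have_patch,
                          int req_minor, int req_patch)
{
    if (have_minor != req_minor)
        return have_minor > req_minor;
    return have_patch >= req_patch;
}
```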
Re: [patch] Re: Some questions regarding locks
I've attached a new version of the patch. This should fix a minor bug: I put
the call to init_timer() too late, which resulted in a kernel warning when
the module was loaded/unloaded without actually being used.

On Sunday 23 May 2004 14:37, Michel Dänzer wrote:
> > 2. The timeout cannot be configured yet. I didn't find "prior art" as to
> > how something like it should be configured, so I'm open for input. For a
> > Linux driver, adding to the /proc entries seems to be the logical way to
> > go, but the DRI is very ioctl-centric. Maybe both?
>
> What's the goal of making it configurable at all, to allow for driver
> debugging? Maybe that could be dealt with better, see below.

This is actually a good point :)

> Is there a way to tell that a process is being debugged? If so, maybe it
> could be handled sanely by default? E.g., release the lock while the
> process is stopped? (That might wreak havoc once execution is resumed
> though) ...

It could be possible, but it *is* bound to wreak havoc.

Now that you mention it... in the far future, it would be *very* useful if
clients could deal with a temporary loss of access to the DRM. I'm thinking
of the recent discussions about the possible future of fbdev, DRI, etc.,
where all graphics access eventually goes through the DRM or something
similar. In that scenario, we need a way to establish a secure terminal that
is safe against
a) fake messages/dialogs created by DRI clients running in the background, and
b) screen scraping by background clients.
I don't see how this could be done without revoking authorization
temporarily, including unmapping memory regions.
Once DRI clients can deal with this, running them in a debugger should be a
piece of cake, really :)

cu,
Nicolai

diff -ur drm-base/linux/drm_drv.h drm/linux/drm_drv.h
--- drm-base/linux/drm_drv.h	2004-05-22 21:41:28.0 +0200
+++ drm/linux/drm_drv.h	2004-05-25 19:51:21.0 +0200
@@ -273,6 +273,8 @@
 MODULE_PARM( drm_opts, "s" );
 MODULE_LICENSE("GPL and additional rights");
 
+static void drm_lock_watchdog( unsigned long __data );
+
 static int DRM(setup)( drm_device_t *dev )
 {
 	int i;
@@ -415,6 +417,7 @@
 	down( &dev->struct_sem );
 
 	del_timer( &dev->timer );
+	del_timer_sync( &dev->lock.watchdog );
 
 	if ( dev->devname ) {
 		DRM(free)( dev->devname, strlen( dev->devname ) + 1,
@@ -545,6 +548,7 @@
 	if ( dev->lock.hw_lock ) {
 		dev->sigdata.lock = dev->lock.hw_lock = NULL; /* SHM removed */
 		dev->lock.filp = 0;
+		dev->lock.dontbreak = 1;
 		wake_up_interruptible( &dev->lock.lock_queue );
 	}
@@ -581,6 +585,10 @@
 	sema_init( &dev->struct_sem, 1 );
 	sema_init( &dev->ctxlist_sem, 1 );
 
+	init_timer( &dev->lock.watchdog );
+	dev->lock.watchdog.data = (unsigned long) dev;
+	dev->lock.watchdog.function = drm_lock_watchdog;
+
 	if ((dev->minor = DRM(stub_register)(DRIVER_NAME, &DRM(fops), dev)) < 0) {
 		retcode = -EPERM;
@@ -928,6 +936,11 @@
 #if __HAVE_RELEASE
 	DRIVER_RELEASE();
 #endif
+	/* Avoid potential race where the watchdog callback is still
+	 * running when filp is freed.
+	 */
+	del_timer_sync( &dev->lock.watchdog );
+
 	DRM(lock_free)( dev, &dev->lock.hw_lock->lock,
 			_DRM_LOCKING_CONTEXT(dev->lock.hw_lock->lock) );
@@ -951,6 +964,7 @@
 	}
 	if ( DRM(lock_take)( &dev->lock.hw_lock->lock, DRM_KERNEL_CONTEXT ) ) {
+		dev->lock.dontbreak = 1;
 		dev->lock.filp = filp;
 		dev->lock.lock_time = jiffies;
 		atomic_inc( &dev->counts[_DRM_STAT_LOCKS] );
@@ -1096,6 +1110,40 @@
 	return retcode;
 }
 
+
+/**
+ * Lock watchdog callback function.
+ *
+ * Whenever a privileged client must sleep on the lock waitqueue
+ * in the LOCK ioctl, the watchdog timer is started.
+ * When the UNLOCK ioctl is called, the timer is stopped.
+ *
+ * When the watchdog timer expires, the process holding the lock
+ * is killed. Privileged clients set lock.dontbreak and are exempt
+ * from this rule.
+ */
+static void drm_lock_watchdog( unsigned long __data )
+{
+	drm_device_t *dev = (drm_device_t *)__data;
+	drm_file_t *priv;
+
+	if ( !dev->lock.filp ) {
+		DRM_DEBUG( "held by kernel\n" );
+		return;
+	}
+
+	if ( dev->lock.dontbreak ) {
+		DRM_DEBUG( "privileged lock\n" );
+		return;
+	}
+
+	priv = dev->lock.filp->private_data;
+	DRM_DEBUG( "Kill pid=%d\n", priv->pid );
+
+	kill_proc( priv->pid, SIGKILL, 1 );
+}
+
+
 /**
  * Lock ioctl.
  *
@@ -1115,6 +1163,7 @@
 	DECLARE_WAITQUEUE( entry, current );
 	drm_lock_t lock;
 	int ret = 0;
+	int privileged = capable( CAP_SYS_ADMIN );
#if __HAVE_MULTIPLE_DMA_QUEUES
 	drm_queue_t *q;
#endif
@@ -1157,6 +1206,7 @@
 	}
 	if ( DRM(lock_take)( &dev->lock.hw_loc
[patch] Re: Some questions regarding locks
On Saturday 22 May 2004 16:04, Michel Dänzer wrote:
> On Sat, 2004-05-22 at 14:04, Nicolai Haehnle wrote:
> > It seems to me as if DRM(unlock) in drm_drv.h unlocks without checking
> > whether the caller actually holds the global lock. There is no
> > LOCK_TEST_WITH_RETURN or similar, and the helper function lock_transfer
> > has no check in it either. Did I miss something, or is this intended
> > behaviour? It certainly seems strange to me.
>
> True. Note that the lock ioctls are only used on contention, but still.

Unless I'm mistaken, DRM(lock) is always called when a client wants the lock
for the first time (or when it needs to re-grab it after it lost the lock).
This is necessary because the DRM makes sure that dev->lock.filp matches the
"calling" file. After that, the ioctls are only used on contention.

The entire locking scheme can be subverted anyway, because part of the lock
lives in userspace. I believe the important thing is to make sure that the X
server can force a return to a sane locking state.

> > Side question: Is killing the offending DRI client enough? When the
> > process is killed, the /dev/drm fd is closed, which should automatically
> > release the lock. On the other hand, I'm pretty sure that we can't just
> > kill a process immediately (unfortunately, I'm not familiar with process
> > handling in the kernel). What if, for some reason, the process is in a
> > state where it can't be killed yet?
>
> We're screwed? :)

Looks like it...

> This sounds like an idea for you to play with, but I'm afraid it won't
> be useful very often in my experience:
>
> * getting rid of the offending client doesn't help with a wedged
>   chip (some way to recover from that would be nice...)
> * it doesn't help if the X server itself spins with the lock held

You were right, of course, which shows my lack of experience with driver
writing.
In my case I can get the X server's reset code to run, but some way through
the reset the machine finally locks up completely (no more networking, no
more disk I/O). I'm curious, though: how can a complete lockup like this be
caused by the graphics card? My guess would be that it grabs the PCI/AGP bus
forever for some reason (the dark side of bus mastering, so to speak). Is
there anything else that could be the cause?

> > Side question #2: Is it safe to release the DRM lock in the watchdog?
> > There might be races where the offending DRI client is currently
> > executing a DRM ioctl when the watchdog fires.
>
> Not sure, but this might not be a problem when just killing the
> offending process?

You're right. On the other hand, it might sometimes be useful to be a little
nicer to the offending process (see point 4 below).

I had a go at implementing my watchdog idea for Linux; see the attached
patch. It basically works, but I couldn't test it on a system where the DRI
actually works without locking up... *sigh*

Now for some notes:

1. This only affects the DRM for Linux. I don't have a BSD installation, and
while I know a little bit about the Linux kernel, I don't know anything
about the BSD kernel(s).

2. The timeout cannot be configured yet. I didn't find "prior art" as to how
something like it should be configured, so I'm open for input. For a Linux
driver, adding to the /proc entries seems to be the logical way to go, but
the DRI is very ioctl-centric. Maybe both?

3. Privileged processes may take the hardware lock for an infinite amount of
time. This is necessary because the X server holds the lock while the VT is
switched away. Currently, "privileged" means capable(CAP_SYS_ADMIN). I would
prefer it to mean "the multiplexing controller process", i.e. the one that
authenticates other processes. Unfortunately, this distinction isn't made
anywhere in the DRM as far as I can see.
This means that runaway DRI clients owned by root aren't killed by the
watchdog either.

4. Keith mentioned single-stepping through a driver, and he does have a
point. Unfortunately, I also believe that it's not that simple. Suppose an
application developer debugs a windowed OpenGL application on the local
machine, without a dual-head setup. It may sound like a naive thing to do,
but this actually works on Windows (yes, Windows is *a lot* more stable than
Linux/BSD in that respect). Now suppose she's got a bug in her application
(e.g. a bad vertex array) that triggers a segmentation fault inside the GL
driver while the hardware lock is held. GDB will catch that signal, so the
process won't die, which in turn means that the lock is not released. Thus
the developer's machine locks up.
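The effect of the watchdog's kill_proc() call can be demonstrated with a userspace analogy: fork a "runaway client" that blocks forever, deliver SIGKILL after a timeout, and reap it. This is only an illustration under POSIX assumptions (`fork`/`kill`/`waitpid`), not kernel code; the function name is invented.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn a child that blocks forever (the "runaway DRI client"), kill it
 * after the given timeout, and return the signal that terminated it. */
static int kill_runaway_after(unsigned seconds)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed; never kill(-1, ...) */
    if (pid == 0) {
        for (;;)
            pause();            /* child: holds the "lock" forever */
    }
    sleep(seconds);             /* watchdog period elapses */
    kill(pid, SIGKILL);         /* watchdog fires */
    int status;
    waitpid(pid, &status, 0);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

As the thread notes, this only helps when the process is actually killable; a client stopped under GDB, or a wedged chip, is a different problem.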
Some questions regarding locks
It seems to me as if DRM(unlock) in drm_drv.h unlocks without checking
whether the caller actually holds the global lock. There is no
LOCK_TEST_WITH_RETURN or similar, and the helper function lock_transfer has
no check in it either. Did I miss something, or is this intended behaviour?
It certainly seems strange to me.

Also, it is possible for a DRI client to effectively lock up the entire
machine simply by entering an endless loop after taking the lock. I suppose
one could still log in remotely and kill the offending process, but that's
not a realistic option for most people. Switching to a different VT or
killing the X server does not work, because the X server has to take the DRI
lock in the process.

This is a problem that I want to fix (it makes playing around with the R300
hack Vladimir Dergachev posted an infinite-rebooting nightmare), but I am
unsure what the best solution would be. As far as I can see, the problem is
two-fold: one, the X server must be able to "break" the lock, and two, it
(or the DRM) must somehow disable the offending DRI client to prevent the
problem from recurring.

I think the simplest solution would look something like this: Whenever
DRM(lock) is called by a privileged client (i.e. the X server) and it needs
to sleep because the lock is held by an unprivileged client, a watchdog
timer is started before we schedule. DRM(unlock) unconditionally stops this
watchdog timer. When the watchdog timer fires, it releases the lock and/or
kills the offending DRI client.

Side question: Is killing the offending DRI client enough? When the process
is killed, the /dev/drm fd is closed, which should automatically release the
lock. On the other hand, I'm pretty sure that we can't just kill a process
immediately (unfortunately, I'm not familiar with process handling in the
kernel). What if, for some reason, the process is in a state where it can't
be killed yet?
I guess this isn't a problem when we're dealing with a faulty 3D driver, but
it might be a problem when dealing with malicious code.

Side question #2: Is it safe to release the DRM lock in the watchdog? There
might be races where the offending DRI client is currently executing a DRM
ioctl when the watchdog fires.

This solution involves no ABI changes. Since all changes are kernel-side and
affect only code that is shared between all drivers, everybody would benefit
immediately.

Does this all look reasonable to the DRI gurus?

cu,
Nicolai
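The arm/disarm discipline proposed in this message (start the timer before a privileged waiter sleeps in LOCK; stop it unconditionally in UNLOCK) can be modeled with a deadline field. This is a hedged userspace sketch with invented names (`fake_lock`, `WATCHDOG_TIMEOUT`), not the DRM code:

```c
#include <time.h>

#define WATCHDOG_TIMEOUT 5     /* seconds; placeholder value */

struct fake_lock {
    int    held;
    time_t deadline;           /* 0 = watchdog disarmed */
};

/* A privileged waiter arms the watchdog before sleeping on contention. */
static void lock_contended(struct fake_lock *l, time_t now)
{
    l->deadline = now + WATCHDOG_TIMEOUT;
}

/* UNLOCK unconditionally disarms it, mirroring del_timer in the proposal. */
static void lock_released(struct fake_lock *l)
{
    l->held = 0;
    l->deadline = 0;
}

/* The watchdog "fires" once a deadline is armed and has passed. */
static int watchdog_expired(const struct fake_lock *l, time_t now)
{
    return l->deadline != 0 && now >= l->deadline;
}
```

The point of the unconditional disarm is that a well-behaved client that releases in time never sees the watchdog fire, regardless of how contention interleaved.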
[Dri-devel] Typo in drm/drm_drv.h or drmP.h?
Hi,

browsing the DRI source, I stumbled upon this. I have absolutely no idea
what it does, but this just doesn't look right (drm_drv.h:317):

#ifdef __HAVE_COUNTER15
	dev->types[14] = __HAVE_COUNTER14;
#endif

It looks like this should be 15, i.e.

#ifdef __HAVE_COUNTER15
	dev->types[15] = __HAVE_COUNTER15;
#endif

However, in drmP.h, dev->types is defined to have only 15 fields, so
dev->types[15] would be out of bounds. It looks like either those three
lines should be removed, or the lines should be changed as above and struct
drm_device in drmP.h changed appropriately.

cu,
Nicolai
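The off-by-one is easy to see with a small model: an array of 15 slots has valid indices 0 through 14, so storing counter 15 requires widening the array as well, exactly as suggested above. A sketch with simplified names (this is not the drmP.h declaration):

```c
#define NUM_COUNTER_TYPES 16   /* was 15; widened so that index 15 is in bounds */

/* Model of dev->types[]: slot i holds counter type i. */
static int types[NUM_COUNTER_TYPES];

static void setup_counters(void)
{
    types[14] = 14;            /* __HAVE_COUNTER14 */
    types[15] = 15;            /* __HAVE_COUNTER15: only valid with the wider array */
}
```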