On Sun, 2004-05-23 at 12:05, Nicolai Haehnle wrote:
>
> > This sounds like an idea for you to play with, but I'm afraid it won't
> > be useful very often in my experience:
> >
> >       * getting rid of the offending client doesn't help with a wedged
> >         chip (some way to recover from that would be nice...)
> >       * it doesn't help if the X server itself spins with the lock held
>
> You were right, of course, while I show my lack of experience with driver
> writing. In my case I can get the X server's reset code to run, but some
> way through the reset the machine finally locks up completely (no more
> networking, no more disk I/O).
>
> I'm curious though, how can a complete lockup like this be caused by the
> graphics card? My guess would be that it grabs the PCI/AGP bus forever for
> some reason (the dark side of bus mastering, so to speak). Is there
> anything else that could be the cause?

I don't really know, but I also guess it's PCI/AGP related.

> 2. The timeout cannot be configured yet. I didn't find "prior art" as to how
> something like it should be configured, so I'm open for input. For a Linux
> driver, adding to the /proc entries seems to be the logical way to go, but
> the DRI is very ioctl-centric. Maybe both?

What's the goal of making it configurable at all, to allow for driver
debugging? Maybe that could be dealt with better, see below.

> 3. Privileged processes may take the hardware lock for an infinite amount of
> time. This is necessary because the X server holds the lock when VT is
> switched away.
> Currently, "privileged" means capable(CAP_SYS_ADMIN). I would prefer if it
> meant "the multiplexing controller process", i.e. the one that
> authenticates other processes. Unfortunately, this distinction isn't made
> anywhere in the DRM as far as I can see.

Maybe one could recognize it by the DRM context handle, a bit hackish
though.

> This means that runaway DRI clients owned by root aren't killed by the
> watchdog, either.

root is generally allowed to shoot itself in the foot, so this isn't all
that bad I guess.

> 4. Keith mentioned single-stepping through a driver, and he does have a
> point. Unfortunately, I also believe that it's not that simple.
> Suppose an application developer debugs a windowed OpenGL application, on
> the local machine, without a dual-head setup. It may sound like a naive
> thing to do, but this actually works on Windows (yes, Windows is *a lot*
> more stable than Linux/BSD in that respect).
> Now suppose she's got a bug in her application (e.g. bad vertex array) that
> triggers a segmentation fault inside the GL driver, while the hardware lock
> is held. GDB will catch that signal, so the process won't die, which in
> turn means that the lock is not released. Thus the developer's machine
> locks up unless the watchdog kicks in (of course, the watchdog in its
> current form will also frustrate her to no end).

Is there a way to tell that a process is being debugged? If so, maybe it
could be handled sanely by default? E.g., release the lock while the
process is stopped? (That might wreak havoc once execution is resumed
though) ...

Another random thought I had was to introduce a Magic Sysrq key combo to
release the lock / kill the process holding it / whatever, but I guess
that wouldn't help Joe User with the casual hanging 3D client either.


-- 
Earthling Michel Dänzer      |     Debian (powerpc), X and DRI developer
Libre software enthusiast    |   http://svcs.affero.net/rm.php?r=daenzer
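
On the question above of whether a debugged process can be detected: from
user space on Linux, one option is the TracerPid field in
/proc/<pid>/status, which is non-zero while a tracer such as GDB is
attached. The following is only a rough sketch of that idea. The helper
names parse_tracer_pid() and tracer_pid_of() are made up for illustration,
and a watchdog inside the DRM would presumably look at the task's ptrace
state in the kernel instead, which is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: scan the text of /proc/<pid>/status for the
 * "TracerPid:" line. Returns the tracer's PID, 0 if the process is not
 * being traced, or -1 if the field is missing.
 */
long parse_tracer_pid(const char *status_text)
{
    static const char key[] = "TracerPid:";
    const char *line = status_text;

    assert(status_text != NULL);

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, sizeof(key) - 1) == 0)
            return strtol(line + sizeof(key) - 1, NULL, 10);
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return -1;
}

/*
 * Hypothetical helper: read /proc/<pid>/status (Linux-specific) and
 * report the tracer's PID. Returns -1 if the file cannot be opened.
 */
long tracer_pid_of(pid_t pid)
{
    char path[64];
    char buf[4096];
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* TracerPid sits near the top of the file, so one read is enough. */
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    return parse_tracer_pid(buf);
}
```

A caller would treat tracer_pid_of(pid) > 0 as "being debugged". What to
do with the hardware lock in that case is still the open question raised
above.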