[Dri-devel] bug - execute permissions on data areas
Hmm... I can't seem to find a working Bugzilla for DRI; instead, things seem to be directed here. I just opened the following bug for xorg. It fixes various pieces of code that need to set execute permission on data memory. Some of these are in DRI and Mesa, and it would be great if we could get these fixes upstream. A full explanation is in the referenced Bugzilla entry:

http://pdx.freedesktop.org/cgi-bin/bugzilla/show_bug.cgi?id=399

--
John Dennis [EMAIL PROTECTED]
___
Dri-devel mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/dri-devel
[Dri-devel] DRI proprietary modules
For DRI to work correctly, several independent pieces all have to be in sync:

* The XFree86 server, which loads the drm modules (via the XFree86 driver module)
* The drm kernel module
* The agpgart kernel module

Does anybody know, for the proprietary drivers (supplied by ATI and Nvidia), which pieces they replace and which pieces they expect to be there?

The reason I'm asking is to understand the consequences of changing an API. I'm curious about the answer in general, but in this specific instance the API I'm worried about is the one between the agpgart kernel module and the drm kernel module. If the agpgart kernel module modifies its API, will that break things for someone who installs a proprietary 3D driver?

Do the proprietary drivers limit themselves to the Mesa driver and retain the existing kernel services, assuming the ioctls are the same? Or do they replace the kernel drm drivers as well? If so, do they manage AGP themselves, or do they use the system's agpgart driver? Do they replace the system's agpgart driver?

--
John Dennis [EMAIL PROTECTED]
Re: [Dri-devel] Deadlock with radeon DRI
The locking problem is solved; my original analysis was incorrect. The problem was that DRM_CAS was not correctly implemented on IA64. This was therefore an IA64-only issue, which is consistent with the other reports that showed up in a Google search describing the problem: all were on IA64. I have filed an XFree86 bug report on this. I could not find a DRI-specific bug reporting mechanism other than the dri-devel list.

The IA64 implementation of CAS was this:

    #define DRM_CAS(lock,old,new,__ret)                               \
        do {                                                          \
            unsigned int __result, __old = (old);                     \
            __asm__ __volatile__(                                     \
                "mf\n"                                                \
                "mov ar.ccv=%2\n"                                     \
                ";;\n"                                                \
                "cmpxchg4.acq %0=%1,%3,ar.ccv"                        \
                : "=r" (__result), "=m" (__drm_dummy_lock(lock))      \
                : "r" (__old), "r" (new)                              \
                : "memory");                                          \
            __ret = (__result) != (__old);                            \
        } while (0)

The problem was with the data types given to the cmpxchg4 instruction. All of the lock types in DRM are ints, and on IA64 that's 4 bytes wide. The digit suffix in cmpxchg4 signifies that the instruction operates on a 4-byte quantity. One might expect that, since this instruction operates on 4-byte values and the DRM locks are 4 bytes, everything is fine, but it isn't. The cmpxchg4 instruction operates this way:

    cmpxchg4 r1=[r3],r2,ar.ccv

4 bytes are read at the address pointed to by r3, and that 32-bit value is then zero extended to 64 bits. The 64-bit value is then compared to the 64-bit value stored in application register CCV. If the two 64-bit values are equal, the least significant 4 bytes of r2 are written back to the address pointed to by r3. The original value pointed to by r3 is stored in r1. The entire operation is atomic.

The mistake in the DRM_CAS implementation is that the comparison is 64 bits wide, so the value stored in ar.ccv (%2 in the asm) must be 64 bits wide, and for us that means zero extending the 32-bit old parameter to 64 bits. Because of the way GCC asm blocks tie C variables and data types to asm values, the promotion of old from unsigned int to unsigned long was not happening.
Thus when old was stored into ar.ccv, its most significant 32 bits contained garbage. (Actually, because of the way GCC generates constants, it turns out the upper 32 bits were 0xffffffff. This came from the OR of DRM_LOCK_HELD, which is defined as 0x80000000; the compiler generates a 64-bit OR operation using the immediate value 0xffffffff80000000, which is legal because the upper 32 bits are undefined on int (32-bit) operations.) The bottom line is that the test would fail when it shouldn't, because the high 32 bits in ar.ccv were not zero.

One might think that, because old was assigned to __old in a local block where it is an unsigned int, the compiler would know enough to zero extend it when using the value in the asm. But that's not true. In asm blocks it's critical to define the asm value correctly so the compiler can translate between the C variable and what the asm code is referring to. The line:

    : "r" (__old), "r" (new)

says %2 is mapped by "r" (__old), in other words, put __old in a general 64-bit register. We've told the compiler to put 64 bits of __old into a register, but __old is a 32-bit value with its high-order 32 bits undefined. We need to tell the compiler to widen the type when assigning it to a general register, so the asm operand needs to be modified with a cast to unsigned long:

    : "r" ((unsigned long)__old), "r" (new)

Only with this will the compiler know to widen the 32-bit __old value to 64 bits inside the asm code.

Thanks to Jakub Jelinek, who helped me understand the nuances of GCC asm templates and type conversions.

As a minor side note, definitions of bit flags should be tagged as unsigned. Thus things like:

    #define DRM_LOCK_HELD 0x80000000
    #define DRM_LOCK_CONT 0x40000000

should really be:

    #define DRM_LOCK_HELD 0x80000000U
    #define DRM_LOCK_CONT 0x40000000U

John
[Dri-devel] bug in light locks?
I've been trying to track down a DRI client and server deadlock problem. I think I now know the problem, and I'd appreciate it if others could confirm whether this is a bug or whether I have a misunderstanding. This is the scenario:

1) The client takes the heavyweight lock via ioctl. The lock now has the DRM_LOCK_HELD bit or'ed in; the high nibble is now 0x8.

2) The server requests the heavyweight lock on a different context via ioctl. The lock is held by the client, so the server is suspended pending release of the lock by the client. The DRM_LOCK_CONT flag is or'ed in; the high nibble is now 0xC.

3) The client wants to take the lightweight lock, and the client currently holds the lock. A CAS test is performed between the lock and (DRM_LOCK_HELD | context). The CAS test fails because, even though the context is the same, the high nibble now has both the DRM_LOCK_HELD and the DRM_LOCK_CONT flags or'ed into it. The test would have succeeded if the DRM_LOCK_CONT flag had not been set. Because the test fails, the client does not believe it owns the lock (but it does!) and then issues a heavyweight ioctl lock on the very context it already holds the lock on.

4) In the kernel driver, DRM(lock_take) discovers the lock is already held on that context by that process. It issues an ERROR message and returns 0. A zero return value indicates the lock cannot be taken, so the driver suspends the client waiting for the lock to be released. But it is this client that holds the lock, so both the client and the server are now suspended, both waiting for a lock release that will never occur: a classic deadlock.

Assuming my analysis is correct, I see the following possible solutions:

1) Invoke CAS twice, once with (DRM_LOCK_HELD | context) and, if that fails, once again with (DRM_LOCK_HELD | DRM_LOCK_CONT | context).

2) Remove DRM_LOCK_CONT from the lock and put the flag elsewhere.

3) Have CAS (or better, a macro that wraps it) mask out bits not belonging to the test (at the moment that's just DRM_LOCK_CONT).

4) Have DRM(lock_take) return TRUE if the lock is already held. I think this is a bad choice because it violates the locking semantics of no nested heavyweight locks in the driver. The client would continue to be confused over when to lock and unlock, so no matter what, the client needs to be fixed.

Questions:

1) Does the analysis sound correct?

2) If so, which approach is preferred? I need to make a patch to fix this, so I might as well do it in a manner that keeps the upstream developers happy. My personal preference is solution #3.

John
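The bit arithmetic in the scenario, and the masking idea in solution #3, can be sketched as follows. This only illustrates the proposal as stated in this message; it is not a claim that the diagnosis is right, and it is not DRM code. The flag values are the ones from drm.h, while holds_lock_exact and holds_lock_masked are hypothetical helper names.

```c
#include <stdint.h>

/* Flag values as defined in drm.h. */
#define DRM_LOCK_HELD 0x80000000U
#define DRM_LOCK_CONT 0x40000000U

/*
 * Exact comparison, as the scenario describes the existing test:
 * the lock word must equal (DRM_LOCK_HELD | context) bit for bit.
 */
int holds_lock_exact(uint32_t lock_word, uint32_t context)
{
    return lock_word == (DRM_LOCK_HELD | context);
}

/*
 * Solution #3 (hypothetical helper): mask out DRM_LOCK_CONT before
 * comparing, so contention from another context does not hide the
 * fact that this context holds the lock.
 */
int holds_lock_masked(uint32_t lock_word, uint32_t context)
{
    return (lock_word & ~DRM_LOCK_CONT) == (DRM_LOCK_HELD | context);
}
```

For a context id of 3, the lock word after step 1 is 0x80000003 and both helpers report ownership. After step 2 the word becomes 0xC0000003, at which point holds_lock_exact returns 0 and holds_lock_masked still returns 1.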
[Dri-devel] Deadlock with radeon DRI
[Note: this is cross posted between dri-devel and [EMAIL PROTECTED] ]

I'm trying to debug a hung X server problem with DRI using the radeon driver. Sources are XFree86 4.3.0. This happens to be on ia64, but at the moment I don't see anything architecture specific about the problem. The symptom of the problem is the following message from the drm radeon kernel driver:

    [drm:radeon_lock_take] *ERROR* x holds heavyweight lock

where x is a context id. I've tracked the sequence of events down to the following.

DRIFinishScreenInit is called during the radeon driver initialization. Inside DRIFinishScreenInit is the following code snippet:

    /* Now that we have created the X server's context, we can grab the
     * hardware lock for the X server.
     */
    DRILock(pScreen, 0);
    pDRIPriv->grabbedDRILock = TRUE;

Slightly later on, RADEONAdjustFrame is called and it does the following:

    #ifdef XF86DRI
        if (info->CPStarted) DRILock(pScrn->pScreen, 0);
    #endif

It's this DRILock which is causing the "*ERROR* x holds heavyweight lock" message. The reason is that both DRIFinishScreenInit and RADEONAdjustFrame are executing in the server and using the server's DRI lock. DRIFinishScreenInit never unlocks. It sets the grabbedDRILock flag, but no one ever references this flag. When RADEONAdjustFrame calls DRILock, the lock is already held because DRIFinishScreenInit locked and never unlocked. On the second lock call the drm kernel driver then suspends the X server process: DRM(lock_take) returns zero to DRM(lock) because the context holding the lock and the context requesting the lock are the same, and this causes DRM(lock) to put the X server on the lock wait queue. Putting the X server on the wait queue until the lock is released deadlocks the X server, because it is the process holding the lock on its context.

Questions:

The whole crux of the problem seems to me to be the taking and holding of the lock in DRIFinishScreenInit. Why is this being done? I can't see a reason for it. Why does it set a flag indicating it's holding the lock if nobody examines that flag?

Is suspending a process that already holds a lock during a lock request really the right behavior? Granted, a process that's trying to lock twice without an intervening unlock is broken, but do we really want to deadlock that process?

Any other insights into this issue?

FWIW, I googled for this error and came up with several folks who, starting around last spring, began seeing the same problem, but none of the mail threads had a follow-up solution.

Thanks,
John
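The double-lock sequence described above can be modeled in a few lines. This is a simplified, single-threaded sketch of the behavior the message attributes to DRM(lock_take), not the kernel code itself: toy_lock_take is a hypothetical name, there is no atomicity or wait queue, and the flag values are the ones from drm.h.

```c
#include <stdint.h>
#include <stdio.h>

/* Flag values as defined in drm.h. */
#define DRM_LOCK_HELD 0x80000000U
#define DRM_LOCK_CONT 0x40000000U

/*
 * Hypothetical model of taking the heavyweight lock.
 * Returns 1 if the lock was taken, 0 if the caller would have to wait.
 */
int toy_lock_take(uint32_t *lock, uint32_t context)
{
    uint32_t old = *lock;

    if (old & DRM_LOCK_HELD) {
        /* Lock is busy: record contention, caller goes on the wait queue. */
        *lock = old | DRM_LOCK_CONT;
        if ((old & ~(DRM_LOCK_HELD | DRM_LOCK_CONT)) == context)
            fprintf(stderr, "*ERROR* %u holds heavyweight lock\n",
                    (unsigned)context);
        return 0;
    }
    *lock = DRM_LOCK_HELD | context;
    return 1;
}
```

Starting from an unlocked word, the first toy_lock_take call for a context returns 1. A second call for the same context without an unlock in between returns 0, which in the real driver means the caller is queued behind a lock only it can release.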
[Dri-devel] drm vs. drm-4.0
In the Linux 2.4 kernel tree under drivers/char there is both a drm subdirectory and a drm-4.0 subdirectory. I've looked high and low for an explanation of drm vs. drm-4.0. I assume it's a version difference, but I've come up dry. Can someone either explain to me what drm-4.0 is or send me a pointer to something that documents it? I'd like to know why one implementation is picked over the other, whether there are version dependencies, why it exists in parallel to drm, and what it's trying to fix.

Thanks,
John

--
John Dennis [EMAIL PROTECTED]