Re: [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex.
On Thu, 2009-02-19 at 18:04 -0800, Eric Anholt wrote: > On Thu, 2009-02-19 at 23:26 +0100, Peter Zijlstra wrote: > > On Thu, 2009-02-19 at 22:02 +0100, Thomas Hellstrom wrote: > > > > > > It looks to me like the driver preferred locking order is > > > > > > object_mutex (which happens to be the device global struct_mutex) > > > mmap_sem > > > offset_mutex. > > > > > > So if one could avoid using the struct_mutex for object bookkeeping (A > > > separate lock) then > > > vm_open() and vm_close() would adhere to that locking order as well, > > > simply by not taking the struct_mutex at all. > > > > > > So only fault() remains, in which that locking order is reversed. > > > Personally I think the trylock ->reschedule->retry method with proper > > > commenting is a good solution. It will be the _only_ place where locking > > > order is reversed and it is done in a deadlock-safe manner. Note that > > > fault() doesn't really fail, but requests a retry from user-space with > > > rescheduling to give the process holding the struct_mutex time to > > > release it. > > > > It doesn't do the reschedule -- need_resched() will check if the current > > task was marked to be scheduled away, furthermore yield based locking > > sucks chunks. Imagine what would happen if your faulting task was the highest RT prio task in the system, you'd end up with a life-lock. > > What's so very difficult about pulling the copy_*_user() out from under > > the locks? > > That we're expecting the data movement to occur while holding device > state in place. For example, we write data through the GTT most of the > time so we: > > lock struct_mutex > pin the object to the GTT > flushing caches as needed > copy_from_user > unpin object > unlock struct_mutex So you cannot drop the lock once you've pinned the dst object? > If I'm to pull the copy_from_user out, that means I have to: > > alloc temporary storage > for each block of temp storage size: > copy_from_user > lock struct_mutex > pin the object to the GTT > flush caches as needed > memcpy > unpin object > unlock struct_mutex > > At this point of introducing our third copy of the user's data in our > hottest path, we should probably ditch the pwrite path entirely and go > to user mapping of the objects for performance. Requiring user mapping > (which has significant overhead) cuts the likelihood of moving from > user-space object caching to kernel object caching in the future, which > has the potential of saving steaming piles of memory. Or you could get_user_pages() to fault the user pages and pin them, and then do pagefault_disable() and use copy_from_user_inatomic or such, and release the pages again. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex.
On Thu, 2009-02-19 at 22:02 +0100, Thomas Hellstrom wrote: > > It looks to me like the driver preferred locking order is > > object_mutex (which happens to be the device global struct_mutex) > mmap_sem > offset_mutex. > > So if one could avoid using the struct_mutex for object bookkeeping (A > separate lock) then > vm_open() and vm_close() would adhere to that locking order as well, > simply by not taking the struct_mutex at all. > > So only fault() remains, in which that locking order is reversed. > Personally I think the trylock ->reschedule->retry method with proper > commenting is a good solution. It will be the _only_ place where locking > order is reversed and it is done in a deadlock-safe manner. Note that > fault() doesn't really fail, but requests a retry from user-space with > rescheduling to give the process holding the struct_mutex time to > release it. It doesn't do the reschedule -- need_resched() will check if the current task was marked to be scheduled away, furthermore yield based locking sucks chunks. What's so very difficult about pulling the copy_*_user() out from under the locks? -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
RE: [Intel-gfx] [PATCH] i915: add page flipping ioctl
>-Original Message- >From: intel-gfx-boun...@lists.freedesktop.org >[mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf Of Jesse Barnes >Sent: 2009年2月20日 8:48 >To: dri-devel@lists.sourceforge.net >Cc: intel-...@lists.freedesktop.org >Subject: Re: [Intel-gfx] [PATCH] i915: add page flipping ioctl > >On Thursday 19 February 2009 16:43:54 Jesse Barnes wrote: >> On Thursday 19 February 2009 11:37:01 Chris Wilson wrote: >> > With a few additional suggestions by Jesse, I've managed to get >> > tear-free compositing working on i915. Here's the diff on top of the >> > original patch (though obviously this is just a suggestion, still need >> > to prevent multiple pending flips to the same plane and ensure that the >> > old buffer is eventually unpinned, and you might choose to drop the >> > mutex around the wait_for_vblank ;-): >> >> Yeah, looks pretty reasonable. Here's what I've been working with. It >> adds a couple of more changes (and slightly different cleanups) over your >> version: - added a seqno to wait for >> - wait for flip before queuing another >> but it's missing thew 915 changes you added (btw did I get the pitch wrong? >> or are you submitting a different value for your objects?). >> >> I added a new seqno since it's possible (even likely) that the next vblank >> won't actually reflect the flip, if the GPU is busy processing a large >> batchbuffer when the ring command goes in for example. >> >> Also it might be a good idea to wait for any previous flip before queuing a >> new one. Anyway still tracking down some issues with the X side, but it >> seems like it's approaching readiness. > >Oh you want to actually see the code? > Hi Jesse, See below some comments. >diff --git a/drivers/gpu/drm/i915/i915_dma.c >b/drivers/gpu/drm/i915/i915_dma.c >index cbb8224..4e4fb61 100644 >--- a/drivers/gpu/drm/i915/i915_dma.c >+++ b/drivers/gpu/drm/i915/i915_dma.c >@@ -829,6 +829,152 @@ static int i915_set_status_page(struct drm_device *dev, >void *data, > return 0; > } > >+static int i915_pipe_to_plane(struct drm_device *dev, int pipe) >+{ >+ drm_i915_private_t *dev_priv = dev->dev_private; >+ u32 reg, pipe_mask; >+ >+ if (pipe == 0) >+ pipe_mask = DISPPLANE_SEL_PIPE_A; >+ else >+ pipe_mask = DISPPLANE_SEL_PIPE_B; >+ >+ pipe_mask |= DISPLAY_PLANE_ENABLE; >+ >+ reg = I915_READ(DSPACNTR); >+ if ((reg & (DISPLAY_PLANE_ENABLE | DISPPLANE_SEL_PIPE_MASK)) == >+ pipe_mask) >+ return 0; >+ >+ reg = I915_READ(DSPBCNTR); >+ if ((reg & (DISPLAY_PLANE_ENABLE | DISPPLANE_SEL_PIPE_MASK)) == >+ pipe_mask) >+ return 1; >+ >+ return -1; >+} >+ >+bool >+i915_gem_flip_pending(struct drm_gem_object *obj) >+{ >+ struct drm_device *dev = obj->dev; >+ drm_i915_private_t *dev_priv = dev->dev_private; >+ struct drm_i915_gem_object *obj_priv = obj->driver_private; >+ unsigned long flags; >+ bool pending = false; >+ >+ spin_lock_irqsave(&dev_priv->vblank_lock, flags); >+ if (!list_empty(&obj_priv->vblank_head)) >+ pending = true; >+ spin_unlock_irqrestore(&dev_priv->vblank_lock, flags); >+ >+ return pending; >+} >+ >+static int i915_gem_page_flip(struct drm_device *dev, void *data, >+struct drm_file *file_priv) >+{ >+ struct drm_i915_gem_page_flip *flip_data = data; >+ drm_i915_private_t *dev_priv = dev->dev_private; >+ struct drm_gem_object *obj; >+ struct drm_i915_gem_object *obj_priv; >+ unsigned long flags; >+ uint32_t offset; >+ int pitch, pipe, plane, tiled; >+ int ret = 0; >+ RING_LOCALS; >+ >+ if (!(dev->driver->driver_features & DRIVER_GEM)) >+ return -ENODEV; >+ >+ /* >+ * Reject unknown flags so future userspace knows what we (don't) >+ * support >+ */ >+ if (flip_data->flags & (~I915_PAGE_FLIP_WAIT)) { >+ DRM_ERROR("bad page flip flags\n"); >+ return -EINVAL; >+ } >+ >+ if ((pipe = flip_data->pipe) > 1) { >+ DRM_ERROR("bad pipe\n"); >+ return -EINVAL; >+ } pipe should be unsigned int, if it is int, this check should be if (pipe > 1 || pipe < 0) >+ >+ plane = i915_pipe_to_plane(dev, pipe); >+ if (plane < 0) { >+ DRM_ERROR("bad pipe (no planes enabled?)\n"); >+ return -EINVAL; >+ } >+ >+ obj = drm_gem_object_lookup(dev, file_priv, flip_data->handle); >+ if (obj == NULL) >+ return -EBADF; >+ >+ if (i915_gem_flip_pending(obj)) >+ wait_for_completion(&obj_priv->vblank); >+ >+ mutex_lock(&dev->struct_mutex); >+ >+ obj_priv = obj->driver_private; >+ if (obj_priv->tiling_mode != I915_TILING_NONE && >+ obj_priv->tiling_mode != I915_TILING_X) { >+ DRM_ERROR("can only flip non-tiled or X tiled pages\n"); >+
DRI2 flush extension
Attached is a new DRI2 flush extension that allows the driver to perform a "real flush" before dispatching a swap or Xserver copy operation. Currently we do this before a DRI2CopyRegion() call. This allows drivers a real "end of scene" flush to ensure rendering is complete prior to a swap. I've committed this already to the gallium-mesa-7.4 branch, but any comments appreciated before I push to the master branch? Thanks, Alan. commit b163d4f9ee2ab4d54daf7c17c097cae51c9c6db2 Author: Alan Hourihane Date: Thu Feb 19 18:39:08 2009 + glx: add support for a reallyFlush() function before swap occurs. diff --git a/include/GL/internal/dri_interface.h b/include/GL/internal/dri_interface.h index 27cc1be..a726b93 100644 --- a/include/GL/internal/dri_interface.h +++ b/include/GL/internal/dri_interface.h @@ -78,6 +78,7 @@ typedef struct __DRIswrastExtensionRec __DRIswrastExtension; typedef struct __DRIbufferRec __DRIbuffer; typedef struct __DRIdri2ExtensionRec __DRIdri2Extension; typedef struct __DRIdri2LoaderExtensionRec __DRIdri2LoaderExtension; +typedef struct __DRI2flushExtensionRec __DRI2flushExtension; /*...@}*/ @@ -245,6 +246,16 @@ struct __DRItexBufferExtensionRec { __DRIdrawable *pDraw); }; +/** + * Used by drivers that implement DRI2 + */ +#define __DRI2_FLUSH "DRI2_Flush" +#define __DRI2_FLUSH_VERSION 1 +struct __DRI2flushExtensionRec { +__DRIextension base; +void (*flush)(__DRIdrawable *drawable); +}; + /** * XML document describing the configuration options supported by the diff --git a/src/glx/x11/dri2_glx.c b/src/glx/x11/dri2_glx.c index 639aa19..fdda852 100644 --- a/src/glx/x11/dri2_glx.c +++ b/src/glx/x11/dri2_glx.c @@ -207,7 +207,13 @@ static void dri2CopySubBuffer(__GLXDRIdrawable *pdraw, xrect.width = width; xrect.height = height; +#ifdef __DRI2_FLUSH +if (pdraw->psc->f) + (*pdraw->psc->f->flush)(pdraw->driDrawable); +#endif + region = XFixesCreateRegion(pdraw->psc->dpy, &xrect, 1); +/* should get a fence ID back from here at some point */ DRI2CopyRegion(pdraw->psc->dpy, pdraw->drawable, region, DRI2BufferFrontLeft, DRI2BufferBackLeft); XFixesDestroyRegion(pdraw->psc->dpy, region); @@ -235,6 +241,11 @@ static void dri2WaitX(__GLXDRIdrawable *pdraw) xrect.width = priv->width; xrect.height = priv->height; +#ifdef __DRI2_FLUSH +if (pdraw->psc->f) + (*pdraw->psc->f->flush)(pdraw->driDrawable); +#endif + region = XFixesCreateRegion(pdraw->psc->dpy, &xrect, 1); DRI2CopyRegion(pdraw->psc->dpy, pdraw->drawable, region, DRI2BufferFakeFrontLeft, DRI2BufferFrontLeft); @@ -255,6 +266,11 @@ static void dri2WaitGL(__GLXDRIdrawable *pdraw) xrect.width = priv->width; xrect.height = priv->height; +#ifdef __DRI2_FLUSH +if (pdraw->psc->f) + (*pdraw->psc->f->flush)(pdraw->driDrawable); +#endif + region = XFixesCreateRegion(pdraw->psc->dpy, &xrect, 1); DRI2CopyRegion(pdraw->psc->dpy, pdraw->drawable, region, DRI2BufferFrontLeft, DRI2BufferFakeFrontLeft); diff --git a/src/glx/x11/dri_common.c b/src/glx/x11/dri_common.c index 4fda649..90c3d8c 100644 --- a/src/glx/x11/dri_common.c +++ b/src/glx/x11/dri_common.c @@ -392,6 +392,13 @@ driBindExtensions(__GLXscreenConfigs *psc, int dri2) } #endif +#ifdef __DRI2_FLUSH + if ((strcmp(extensions[i]->name, __DRI2_FLUSH) == 0) && dri2) { + psc->f = (__DRI2flushExtension *) extensions[i]; + /* internal driver extension, no GL extension exposed */ + } +#endif + /* Ignore unknown extensions */ } } diff --git a/src/glx/x11/glxclient.h b/src/glx/x11/glxclient.h index 3e70759..caf58bb 100644 --- a/src/glx/x11/glxclient.h +++ b/src/glx/x11/glxclient.h @@ -519,6 +519,10 @@ struct __GLXscreenConfigsRec { const __DRItexBufferExtension *texBuffer; #endif +#ifdef __DRI2_FLUSH +const __DRI2flushExtension *f; +#endif + #endif /** -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H-- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex.
On Thu, 2009-02-19 at 23:26 +0100, Peter Zijlstra wrote: > On Thu, 2009-02-19 at 22:02 +0100, Thomas Hellstrom wrote: > > > > It looks to me like the driver preferred locking order is > > > > object_mutex (which happens to be the device global struct_mutex) > > mmap_sem > > offset_mutex. > > > > So if one could avoid using the struct_mutex for object bookkeeping (A > > separate lock) then > > vm_open() and vm_close() would adhere to that locking order as well, > > simply by not taking the struct_mutex at all. > > > > So only fault() remains, in which that locking order is reversed. > > Personally I think the trylock ->reschedule->retry method with proper > > commenting is a good solution. It will be the _only_ place where locking > > order is reversed and it is done in a deadlock-safe manner. Note that > > fault() doesn't really fail, but requests a retry from user-space with > > rescheduling to give the process holding the struct_mutex time to > > release it. > > It doesn't do the reschedule -- need_resched() will check if the current > task was marked to be scheduled away, furthermore yield based locking > sucks chunks. > > What's so very difficult about pulling the copy_*_user() out from under > the locks? That we're expecting the data movement to occur while holding device state in place. For example, we write data through the GTT most of the time so we: lock struct_mutex pin the object to the GTT flushing caches as needed copy_from_user unpin object unlock struct_mutex If I'm to pull the copy_from_user out, that means I have to: alloc temporary storage for each block of temp storage size: copy_from_user lock struct_mutex pin the object to the GTT flush caches as needed memcpy unpin object unlock struct_mutex At this point of introducing our third copy of the user's data in our hottest path, we should probably ditch the pwrite path entirely and go to user mapping of the objects for performance. Requiring user mapping (which has significant overhead) cuts the likelihood of moving from user-space object caching to kernel object caching in the future, which has the potential of saving steaming piles of memory. -- Eric Anholt e...@anholt.net eric.anh...@intel.com signature.asc Description: This is a digitally signed message part -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H-- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[git pull] drm fixes tree.
Hi Linus, Please pull the 'drm-fixes' branch from ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-fixes Intel: This contains one lockdep fix, the other is still under discussion along with a few locking and error path fixes. It also contains some minor kms updates for PLL clocks and LVDS found during Fedora integration. Radeon: fixes a suspend/resume regression introduced by the master changes. Dave. drivers/gpu/drm/drm_crtc.c |3 +- drivers/gpu/drm/drm_crtc_helper.c | 21 +++- drivers/gpu/drm/drm_fops.c |3 + drivers/gpu/drm/drm_gem.c | 79 +++- drivers/gpu/drm/i915/i915_drv.c|2 + drivers/gpu/drm/i915/i915_drv.h|2 + drivers/gpu/drm/i915/i915_gem.c| 119 +++- drivers/gpu/drm/i915/i915_gem_tiling.c |6 +- drivers/gpu/drm/i915/intel_bios.c |8 ++ drivers/gpu/drm/i915/intel_display.c | 160 +--- drivers/gpu/drm/i915/intel_fb.c|8 +- drivers/gpu/drm/i915/intel_lvds.c |2 - drivers/gpu/drm/i915/intel_sdvo.c |2 +- drivers/gpu/drm/i915/intel_tv.c|2 +- drivers/gpu/drm/radeon/radeon_cp.c | 21 +++- include/drm/drmP.h |2 + include/drm/drm_crtc.h |2 +- include/drm/drm_crtc_helper.h | 10 +- 18 files changed, 279 insertions(+), 173 deletions(-) commit 3d16118dc825a654043dfe3e14371fdf2976994d Author: etienne Date: Fri Feb 20 09:44:45 2009 +1000 drm/radeon: update sarea copies of last_ variables on resume. This fixes a regression reported in bug #12613. [airlied: not I tweaked the patch slightly and fixed it by etienne did all the hardwork so gets authorship] Signed-off-by: etienne Signed-off-by: Dave Airlie commit ab00b3e5210954cbaff9207db874a9f03197e3ba Author: Jesse Barnes Date: Wed Feb 11 14:01:46 2009 -0800 drm/i915: Keep refs on the object over the lifetime of vmas for GTT mmap. This fixes potential fault at fault time if the object was unreferenced while the mapping still existed. Now, while the mmap_offset only lives for the lifetime of the object, the object also stays alive while a vma exists that needs it. Signed-off-by: Jesse Barnes Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit 496818f08a78476abdb307e241911536221239fc Author: Jesse Barnes Date: Wed Feb 11 13:28:14 2009 -0800 drm/i915: take struct mutex around fb unref Need to do this in case the unref ends up doing a free. Signed-off-by: Jesse Barnes Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit 43565a0648e664744ac9201c199681451355edcc Author: Kristian Høgsberg Date: Fri Feb 13 20:56:52 2009 -0500 drm: Use spread spectrum when the bios tells us it's ok. Lifted from the DDX modesetting. Signed-off-by: Kristian Høgsberg Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit a29f5ca3d691995266a4b1df313e32ff0509a03c Author: Kristian Høgsberg Date: Fri Feb 13 20:56:51 2009 -0500 drm: Collapse identical i8xx_clock() and i9xx_clock(). They used to be different. Now they're identical. Signed-off-by: Kristian Høgsberg Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit f3cade5c037054ce5f57651fe0b64eaa9781c753 Author: Kristian Høgsberg Date: Fri Feb 13 20:56:50 2009 -0500 drm: Bring PLL limits in sync with DDX values. Signed-off-by: Kristian Høgsberg Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit 7f9872e06d749afdc2029aa6b7ffe88cb3b8c5c2 Author: Kristian Høgsberg Date: Fri Feb 13 20:56:49 2009 -0500 drm: Add locking around cursor gem operations. We need to hold the struct_mutex around pinning and the phys object operations. Signed-off-by: Kristian Høgsberg Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit 5c3b82e2b229e78eb32f4ea12d16f3ebeeab3fc7 Author: Chris Wilson Date: Wed Feb 11 13:25:09 2009 + drm: Propagate failure from setting crtc base. Check the error paths within intel_pipe_set_base() to first cleanup and then report back the error. Signed-off-by: Chris Wilson Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit e62fb64e6187ea9d8bcedb17ccaa045ed92d4b55 Author: Chris Wilson Date: Wed Feb 11 16:39:21 2009 + drm: Check for a NULL encoder when reverting on error path We need to skip the connectors with a NULL encoder to match the success path and avoid an OOPS. Signed-off-by: Chris Wilson Signed-off-by: Eric Anholt Signed-off-by: Dave Airlie commit 85a7bb98582b60b7e9130159d2464eb0bbac13f7 Author: Chris Wilson Date: Wed Feb 11 14:52:44 2009 + drm/i915: Cleanup the hws on ringbuffer constrution failure.
Re: [PATCH] i915: add page flipping ioctl
On Thursday 19 February 2009 16:43:54 Jesse Barnes wrote: > On Thursday 19 February 2009 11:37:01 Chris Wilson wrote: > > With a few additional suggestions by Jesse, I've managed to get > > tear-free compositing working on i915. Here's the diff on top of the > > original patch (though obviously this is just a suggestion, still need > > to prevent multiple pending flips to the same plane and ensure that the > > old buffer is eventually unpinned, and you might choose to drop the > > mutex around the wait_for_vblank ;-): > > Yeah, looks pretty reasonable. Here's what I've been working with. It > adds a couple of more changes (and slightly different cleanups) over your > version: - added a seqno to wait for > - wait for flip before queuing another > but it's missing thew 915 changes you added (btw did I get the pitch wrong? > or are you submitting a different value for your objects?). > > I added a new seqno since it's possible (even likely) that the next vblank > won't actually reflect the flip, if the GPU is busy processing a large > batchbuffer when the ring command goes in for example. > > Also it might be a good idea to wait for any previous flip before queuing a > new one. Anyway still tracking down some issues with the X side, but it > seems like it's approaching readiness. Oh you want to actually see the code? diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index cbb8224..4e4fb61 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -829,6 +829,152 @@ static int i915_set_status_page(struct drm_device *dev, void *data, return 0; } +static int i915_pipe_to_plane(struct drm_device *dev, int pipe) +{ + drm_i915_private_t *dev_priv = dev->dev_private; + u32 reg, pipe_mask; + + if (pipe == 0) + pipe_mask = DISPPLANE_SEL_PIPE_A; + else + pipe_mask = DISPPLANE_SEL_PIPE_B; + + pipe_mask |= DISPLAY_PLANE_ENABLE; + + reg = I915_READ(DSPACNTR); + if ((reg & (DISPLAY_PLANE_ENABLE | DISPPLANE_SEL_PIPE_MASK)) == + pipe_mask) + return 0; + + reg = I915_READ(DSPBCNTR); + if ((reg & (DISPLAY_PLANE_ENABLE | DISPPLANE_SEL_PIPE_MASK)) == + pipe_mask) + return 1; + + return -1; +} + +bool +i915_gem_flip_pending(struct drm_gem_object *obj) +{ + struct drm_device *dev = obj->dev; + drm_i915_private_t *dev_priv = dev->dev_private; + struct drm_i915_gem_object *obj_priv = obj->driver_private; + unsigned long flags; + bool pending = false; + + spin_lock_irqsave(&dev_priv->vblank_lock, flags); + if (!list_empty(&obj_priv->vblank_head)) + pending = true; + spin_unlock_irqrestore(&dev_priv->vblank_lock, flags); + + return pending; +} + +static int i915_gem_page_flip(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_i915_gem_page_flip *flip_data = data; + drm_i915_private_t *dev_priv = dev->dev_private; + struct drm_gem_object *obj; + struct drm_i915_gem_object *obj_priv; + unsigned long flags; + uint32_t offset; + int pitch, pipe, plane, tiled; + int ret = 0; + RING_LOCALS; + + if (!(dev->driver->driver_features & DRIVER_GEM)) + return -ENODEV; + + /* +* Reject unknown flags so future userspace knows what we (don't) +* support +*/ + if (flip_data->flags & (~I915_PAGE_FLIP_WAIT)) { + DRM_ERROR("bad page flip flags\n"); + return -EINVAL; + } + + if ((pipe = flip_data->pipe) > 1) { + DRM_ERROR("bad pipe\n"); + return -EINVAL; + } + + plane = i915_pipe_to_plane(dev, pipe); + if (plane < 0) { + DRM_ERROR("bad pipe (no planes enabled?)\n"); + return -EINVAL; + } + + obj = drm_gem_object_lookup(dev, file_priv, flip_data->handle); + if (obj == NULL) + return -EBADF; + + if (i915_gem_flip_pending(obj)) + wait_for_completion(&obj_priv->vblank); + + mutex_lock(&dev->struct_mutex); + + obj_priv = obj->driver_private; + if (obj_priv->tiling_mode != I915_TILING_NONE && + obj_priv->tiling_mode != I915_TILING_X) { + DRM_ERROR("can only flip non-tiled or X tiled pages\n"); + ret = -EINVAL; + goto out_unref; + } + + ret = i915_gem_object_pin(obj, 0); + if (ret) { + DRM_ERROR("failed to pin object for flip\n"); + ret = -EBUSY; + goto out_unref; + } + mutex_unlock(&dev->struct_mutex); + + /* +* Put the object in the GTT domain before the flip, +* since there may be outstanding rendering +*/ + i915_gem_object_set_to_gtt_domain(obj, 0); + + offset = obj_priv->gtt
Re: [PATCH] i915: add page flipping ioctl
On Thursday 19 February 2009 11:37:01 Chris Wilson wrote: > With a few additional suggestions by Jesse, I've managed to get > tear-free compositing working on i915. Here's the diff on top of the > original patch (though obviously this is just a suggestion, still need > to prevent multiple pending flips to the same plane and ensure that the > old buffer is eventually unpinned, and you might choose to drop the > mutex around the wait_for_vblank ;-): Yeah, looks pretty reasonable. Here's what I've been working with. It adds a couple of more changes (and slightly different cleanups) over your version: - added a seqno to wait for - wait for flip before queuing another but it's missing thew 915 changes you added (btw did I get the pitch wrong? or are you submitting a different value for your objects?). I added a new seqno since it's possible (even likely) that the next vblank won't actually reflect the flip, if the GPU is busy processing a large batchbuffer when the ring command goes in for example. Also it might be a good idea to wait for any previous flip before queuing a new one. Anyway still tracking down some issues with the X side, but it seems like it's approaching readiness. Thanks, -- Jesse Barnes, Intel Open Source Technology Center -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] i915: add page flipping ioctl
On Thursday 19 February 2009 06:55:28 Chris Wilson wrote: > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > index 3795dbc..93e677a 100644 > --- a/drivers/gpu/drm/drm_irq.c > +++ b/drivers/gpu/drm/drm_irq.c > @@ -435,6 +435,8 @@ EXPORT_SYMBOL(drm_vblank_get); > */ > void drm_vblank_put(struct drm_device *dev, int crtc) > { > + BUG_ON (atomic_read (&dev->vblank_refcount[crtc]) == 0); > + > /* Last user schedules interrupt disable */ > if (atomic_dec_and_test(&dev->vblank_refcount[crtc])) > mod_timer(&dev->vblank_disable_timer, jiffies + 5*DRM_HZ); > @@ -460,8 +462,9 @@ void drm_vblank_pre_modeset(struct drm_device *dev, > int crtc) >* so that interrupts remain enabled in the interim. >*/ > if (!dev->vblank_inmodeset[crtc]) { > - dev->vblank_inmodeset[crtc] = 1; > - drm_vblank_get(dev, crtc); > + dev->vblank_inmodeset[crtc] = 0x1; > + if (drm_vblank_get(dev, crtc) == 0) > + dev->vblank_inmodeset[crtc] |= 0x2; > } > } > EXPORT_SYMBOL(drm_vblank_pre_modeset); > @@ -473,9 +476,12 @@ void drm_vblank_post_modeset(struct drm_device > *dev, int crtc) > if (dev->vblank_inmodeset[crtc]) { > spin_lock_irqsave(&dev->vbl_lock, irqflags); > dev->vblank_disable_allowed = 1; > - dev->vblank_inmodeset[crtc] = 0; > spin_unlock_irqrestore(&dev->vbl_lock, irqflags); > - drm_vblank_put(dev, crtc); > + > + if (dev->vblank_inmodeset[crtc] & 0x2) > + drm_vblank_put(dev, crtc); > + > + dev->vblank_inmodeset[crtc] = 0; > } > } > EXPORT_SYMBOL(drm_vblank_post_modeset); Looks fine to me; checking for errors when calling functions is probably a good idea! Thanks, -- Jesse Barnes, Intel Open Source Technology Center -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Preserve SHMLBA bits in hash key for _DRM_SHM mappings.
On Wed, 18 Feb 2009 15:41:02 -0800 (PST) David Miller wrote: > > Platforms such as sparc64 have D-cache aliasing issues. We > cannot allow virtual mappings in different contexts to be such > that two cache lines can be loaded for the same backing data. > Updates to one cache line won't be seen by accesses to the other > cache line. > > Code in sparc64 and other architectures solve this problem by > making sure that all userland mappings of MAP_SHARED objects have > the same virtual address base. They implement this by keying > off of the page offset, and using that to choose a suitably > consistent virtual address for mmap() requests. > > Making things even worse, getting this wrong on sparc64 can result > in hangs during DRM lock acquisition. This is because, at least on > UltraSPARC-III, normal loads consult the D-cache but atomics such > as 'cas' (which is what cmpxchg() is implement using) only consult > the L2 cache. So if a D-cache alias is inserted, the load can > see different data than the atomic, and we'll loop forever because > the atomic compare-and-exchange will never complete successfully. > > So to make this all work properly, we need to make sure that the > hash address computed by drm_map_handle() preserves the SHMLBA > relevant bits, and that's what this patch does for _DRM_SHM mappings. > > As a historical note, many years ago this bug didn't exist because we > used to just use the low 32-bits of the address as the hash and just > hope for the best. This preserved the SHMLBA bits properly. But when > the hashtab code was added to DRM, this was no longer the case. > > ... > > #include > +#include > +#include > #include "drmP.h" > The inclusion of asm/shmparam.h direct from a driver is a bit risky. It assumes that asm/shmparam.h is compileable in isolation from the additional things which include/linux/shm.h includes. In particular, asm/page.h. eg: arch/xtensa/include/asm/shmparam.h #define SHMLBA ((PAGE_SIZE > DCACHE_WAY_SIZE)? PAGE_SIZE : DCACHE_WAY_SIZE) But including linux/shm.h here seems a bit silly. We'll see.. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Fix lock order reversal between mmap_sem and struct_mutex.
Peter Zijlstra wrote: > On Tue, 2009-02-17 at 16:59 -0800, Eric Anholt wrote: > >> The basic problem was >> mmap_sem (do_mmap()) -> struct_mutex (drm_gem_mmap(), i915_gem_fault()) >> struct_mutex (i915_gem_execbuffer()) -> mmap_sem (copy_from/to_user()) >> > > That's not the only problem, there's also: > > dup_mmap() > down_write(mmap_sem) > vm_ops->open() -> drm_vm_open() > mutex_lock(struct_mutex); > > >> We have plenty of places where we want to hold device state the same >> (struct_mutex) while we move a non-trivial amount of data >> (copy_from/to_user()), such as i915_gem_pwrite(). Solve this by moving the >> easy things that needed struct_mutex with mmap_sem held to using a lock to >> cover just those data structures (offset hash and offset manager), and do >> trylock and reschedule in fault. >> > > So we establish, > > mmap_sem > offset_mutex > > i915_gem_mmap_gtt_ioctl() > mutex_lock(struct_mutex) > i915_gem_create_mmap_offset() > mutex_lock(offset_mutex) > > However we still have > > struct_mutex > mmap_sem > > in basically every copy_*_user() case > > But you cannot seem to switch ->fault() to use offset_mutex, which would > work out the inversion because you then have: > > struct_mutex > mmap_sem > offset_mutex > > So why bother with the offset_mutex? Instead you make your ->fault() > fail randomly. > > I'm not sure what Wang Chen sees after this patch, but I should not be > the exact same splat, still it would not at all surprise me if there's > plenty left. > > The locking looks very fragile and I don't think this patch is helping > anything, sorry. > > It looks to me like the driver preferred locking order is object_mutex (which happens to be the device global struct_mutex) mmap_sem offset_mutex. So if one could avoid using the struct_mutex for object bookkeeping (A separate lock) then vm_open() and vm_close() would adhere to that locking order as well, simply by not taking the struct_mutex at all. So only fault() remains, in which that locking order is reversed. Personally I think the trylock ->reschedule->retry method with proper commenting is a good solution. It will be the _only_ place where locking order is reversed and it is done in a deadlock-safe manner. Note that fault() doesn't really fail, but requests a retry from user-space with rescheduling to give the process holding the struct_mutex time to release it. /Thomas -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] i915: add page flipping ioctl
With a few additional suggestions by Jesse, I've managed to get tear-free compositing working on i915. Here's the diff on top of the original patch (though obviously this is just a suggestion, still need to prevent multiple pending flips to the same plane and ensure that the old buffer is eventually unpinned, and you might choose to drop the mutex around the wait_for_vblank ;-): diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index bc440ff..b4d347c 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -835,19 +835,18 @@ static int i915_pipe_to_plane(struct drm_device *dev, int pipe) u32 reg, pipe_mask; if (pipe == 0) - pipe_mask = DISPPLANE_SEL_PIPE_A; + pipe_mask = DISPPLANE_SEL_PIPE_A | DISPLAY_PLANE_ENABLE; else - pipe_mask = DISPPLANE_SEL_PIPE_B; + pipe_mask = DISPPLANE_SEL_PIPE_B | DISPLAY_PLANE_ENABLE; +#define DISPPLANE_ENABLED_PIPE_MASK (DISPLAY_PLANE_ENABLE | DISPPLANE_SEL_PIPE_MASK) reg = I915_READ(DSPACNTR); - if ((reg & DISPLAY_PLANE_ENABLE) && - ((reg & DISPPLANE_SEL_PIPE_MASK) == pipe_mask)) - return 0; + if ((reg & DISPPLANE_ENABLED_PIPE_MASK) == pipe_mask) + return 0; reg = I915_READ(DSPBCNTR); - if ((reg & DISPLAY_PLANE_ENABLE) && - ((reg & DISPPLANE_SEL_PIPE_MASK) == pipe_mask)) - return 1; + if ((reg & DISPPLANE_ENABLED_PIPE_MASK) == pipe_mask) + return 1; return -1; } @@ -893,6 +892,8 @@ static int i915_gem_page_flip(struct drm_device *dev, void *data, return -EBADF; obj_priv = obj->driver_private; + + mutex_lock(&dev->struct_mutex); if (obj_priv->tiling_mode != I915_TILING_NONE && obj_priv->tiling_mode != I915_TILING_X) { DRM_ERROR("can only flip non-tiled or X tiled pages\n"); @@ -903,13 +904,18 @@ static int i915_gem_page_flip(struct drm_device *dev, void *data, ret = i915_gem_object_pin(obj, 0); if (ret) { DRM_ERROR("failed to pin object for flip\n"); - ret = -EBUSY; + goto out_unref; + } + + ret = i915_gem_object_set_to_gtt_domain(obj, 0); + if (ret) { + DRM_ERROR("failed to set object to gtt domain for flip\n"); goto out_unref; } offset = obj_priv->gtt_offset; pitch = obj_priv->stride; - tiled = !!(obj_priv->tiling_mode == I915_TILING_X); + tiled = obj_priv->tiling_mode == I915_TILING_X; /* * Queue the flip with the lock held in case the flip happens @@ -919,21 +925,27 @@ static int i915_gem_page_flip(struct drm_device *dev, void *data, BEGIN_LP_RING(4); OUT_RING(CMD_OP_DISPLAYBUFFER_INFO | (plane << 20)); - OUT_RING((pitch / 8) << 3); /* flip queue and/or pitch */ - OUT_RING(offset | tiled); - OUT_RING(((flip_data->x - 1) << 16) | (flip_data->y - 1)); + OUT_RING(pitch); + if (IS_I965G(dev)) { + OUT_RING(offset | tiled); + OUT_RING(((flip_data->x - 1) << 16) | (flip_data->y - 1)); + } else { + OUT_RING(offset); + OUT_RING(0); + } ADVANCE_LP_RING(); - list_add_tail(&obj_priv->vblank_head, &dev_priv->mm.vblank_list[pipe]); - drm_vblank_get(dev, pipe); + ret = drm_vblank_get(dev, pipe); + if (ret == 0) + list_add_tail(&obj_priv->vblank_head, + &dev_priv->mm.vblank_list[pipe]); spin_unlock_irqrestore(&dev_priv->vblank_lock, flags); - if (flip_data->flags & I915_PAGE_FLIP_WAIT) + if (ret == 0 && flip_data->flags & I915_PAGE_FLIP_WAIT) wait_for_completion(&obj_priv->vblank); out_unref: - mutex_lock(&dev->struct_mutex); drm_gem_object_unreference(obj); mutex_unlock(&dev->struct_mutex); diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 8f2dc72..28e8177 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2467,6 +2467,25 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file_priv) return ret; } +static void +i915_gem_object_wait_for_vblank(struct drm_device *dev, + struct drm_gem_object *obj) +{ + drm_i915_private_t *dev_priv = dev->dev_private; + struct drm_i915_gem_object *obj_priv; + unsigned long irqflags; + int wait_for_vblank; + + obj_priv = obj->driver_private; + + spin_lock_irqsave(&dev_priv->vblank_lock, irqflags); + wait_for_vblank = !list_empty(&obj_priv->vblank_head); + spin_unlock_irqrestore(&dev_priv->vblank_lock, irqflags); + + if (wait_for_vblank) + wait_for_completion(&obj_priv->vblank); +} + int i915_gem_execbuffer(str
Re: [PATCH] drm: Take mmap_sem up front to avoid lock order violations.
On Thu, Feb 19, 2009 at 09:49:40AM -0500, Kristian Høgsberg wrote: > > > Secondly, mmap_sem is not a recursive lock (very few kernel locks are, > > and we generally frown upon recursive locking schemes), this means that > > the fault handler still cannot function properly. > > I understand, but we take it twice only as a read lock, so that should > work, right? We prevent the deadlock the lockdep validator warned about > and as far as I can see, the patch doesn't introduce a new one. But > other than that I agree with the frowning on recursive locking, it's too > often used to paper over badly thought out locking. It doesn't work. rwsems are fair (otherwise there is terrible starvation properties), so if another process does an interleaved down_write, then the 2nd down_read will block until the down_write is serviced. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Take mmap_sem up front to avoid lock order violations.
On Thu, 2009-02-19 at 16:17 +0100, Nick Piggin wrote: > On Thu, Feb 19, 2009 at 09:49:40AM -0500, Kristian Høgsberg wrote: > > > > > Secondly, mmap_sem is not a recursive lock (very few kernel locks are, > > > and we generally frown upon recursive locking schemes), this means that > > > the fault handler still cannot function properly. > > > > I understand, but we take it twice only as a read lock, so that should > > work, right? We prevent the deadlock the lockdep validator warned about > > and as far as I can see, the patch doesn't introduce a new one. But > > other than that I agree with the frowning on recursive locking, it's too > > often used to paper over badly thought out locking. > > It doesn't work. rwsems are fair (otherwise there is terrible starvation > properties), so if another process does an interleaved down_write, then > the 2nd down_read will block until the down_write is serviced. Ooh, right, yes of course, ouch. thanks, Kristian -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] i915: add page flipping ioctl
On Mon, 2009-02-16 at 10:55 -0800, Jesse Barnes wrote: > On Sunday, February 15, 2009 4:02 pm Chris Wilson wrote: > > On my i915, the flip never occurs and we wait forever on the the vblank. > > So I presume the command we sent the chip is invalid, or we miss the > > irq? (I double-checked with lockdep in case I had missed an obvious > > dead-lock.) Comments on the patch itself inline. > > Thanks a lot for looking at this and testing. > > Hm, maybe you're not getting vblank interrupts at all? Do you see the count > increasing? Is the IRQ registered? Apparently, the vblank is never being enabled... Adding a BUG_ON() to drm_vblank_put() soon identified the unbalanced culprit. So it appears that drm_vblank_pre_modeset() can fail to enable vblank, obviously because at that point we have never enabled vblank in pipeconf - but we unconditionally decrement the vblank reference counter afterwards. Ok, so now I see lots of vblank interrupts but the display is still snafu. The following patch is a crude fix. >From 1c4fa54654a604e0ea604553b983dca737a76e99 Mon Sep 17 00:00:00 2001 From: Chris Wilson Date: Thu, 19 Feb 2009 14:48:22 + Subject: [PATCH] drm: Correct unbalanced drm_vblank_put() during mode setting. The first time we install a mode, the vblank will be disabled for a pipe and so drm_vblank_get() in drm_vblank_pre_modeset() will fail. As we unconditionally call drm_vblank_put() afterwards, the vblank reference counter becomes unbalanced. Signed-off-by: Chris Wilson --- drivers/gpu/drm/drm_irq.c | 14 ++ 1 files changed, 10 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 3795dbc..93e677a 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -435,6 +435,8 @@ EXPORT_SYMBOL(drm_vblank_get); */ void drm_vblank_put(struct drm_device *dev, int crtc) { + BUG_ON (atomic_read (&dev->vblank_refcount[crtc]) == 0); + /* Last user schedules interrupt disable */ if (atomic_dec_and_test(&dev->vblank_refcount[crtc])) mod_timer(&dev->vblank_disable_timer, jiffies + 5*DRM_HZ); @@ -460,8 +462,9 @@ void drm_vblank_pre_modeset(struct drm_device *dev, int crtc) * so that interrupts remain enabled in the interim. */ if (!dev->vblank_inmodeset[crtc]) { - dev->vblank_inmodeset[crtc] = 1; - drm_vblank_get(dev, crtc); + dev->vblank_inmodeset[crtc] = 0x1; + if (drm_vblank_get(dev, crtc) == 0) + dev->vblank_inmodeset[crtc] |= 0x2; } } EXPORT_SYMBOL(drm_vblank_pre_modeset); @@ -473,9 +476,12 @@ void drm_vblank_post_modeset(struct drm_device *dev, int crtc) if (dev->vblank_inmodeset[crtc]) { spin_lock_irqsave(&dev->vbl_lock, irqflags); dev->vblank_disable_allowed = 1; - dev->vblank_inmodeset[crtc] = 0; spin_unlock_irqrestore(&dev->vbl_lock, irqflags); - drm_vblank_put(dev, crtc); + + if (dev->vblank_inmodeset[crtc] & 0x2) + drm_vblank_put(dev, crtc); + + dev->vblank_inmodeset[crtc] = 0; } } EXPORT_SYMBOL(drm_vblank_post_modeset); -- 1.6.0.4 -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: r200 lockup in radeon_cp_texture/radeon_freelist_get
On Thu, Feb 19, 2009 at 9:51 AM, Stephane Marchesin wrote: > On Thu, Feb 19, 2009 at 15:46, Alex Deucher wrote: >> On Thu, Feb 19, 2009 at 7:26 AM, Roland Scheidegger >> wrote: >>> On 19.02.2009 12:23, Arkadi Shishlov wrote: Roland Scheidegger wrote: > I suspect you're hitting a r200 asic bug, which isn't present in rv250 > and other r200 family members. There are workarounds in the driver for > this (see r200UpdateTextureState), but to my knowledge they are > insufficient for two-pass ATI_fragment_shader shaders. There's also a You're right. I changed video card to RV280 9250SE and lockup goes away. Nice picture, a little slower than fglrx, probably due to 9250 being slower than 8500. >>> doom3 is actually a performance mystery to me. On my 9000pro, >>> performance seemed to be similar to fglrx, however using another OS it >>> was vastly faster, and I always wondered how it could be tweaked... >>> Hyperz doesn't seem to help much, neither does the mipmap optimization, >>> yet still somehow it must be possible to make it run much faster. >>> > mesa test which last I heard showed errors (progs/tests/afsmultiarb) > (you can switch the test between one and two pass shaders). > If this is the problem, you could try running doom3 using the arb path I > think something like doom3 +seta r_renderer arb might work, however arb > path looks ugly (r_renderer defaults to "best" which will then choose > "r200" on this card). Yes, its ugly and incorrect, some walls are not opaque but blends over another walls. >>> Oh, that sounds like a bug. Ugly yes but I didn't see that. >>> > Unfortunately I don't know how this could be fixed, as documentation for > these asic bugs is nonexistent (or at least non-public). If lockup could be reliable reproduced with simple test - like afsmultiarb in fresh X without WM - will it be helpfull to get mmio trace from fglrx and r200 drivers to compare? >>> I think at least some of the asic bugs do not necessarily result in a >>> lockup but also could result in misrendering. Someone might be able to >>> figure out what fglrx does, I guess of particular interest would be the >>> writes to these debug regs (0x2D90 through 0x2DBC). That said, it might >>> not be easy to figure this out completely (could depend on which texture >>> units are enabled in what pass, and depending on filtering on each of >>> those). Or it could even be some completely unrelated bug. >>> >>> In the light of recent progress with AMD's attitude, can't you just ask fglrx guys about R200 bug? >>> R200 is understandably not exactly top priority, and it seemed like the >>> usual docs didn't cover it. Though maybe Alex wants to comment on this. >> >> Unfortunately, r200 is so old, it hard to find much information on it >> anymore. >> > > The r200 docs were released under NDA to selected people. So I don't > think the r200 docs have completely disappeared off the earh ;) well, we still have those, but those don't seem to cover whatever quirk or errata seems to be at play here. That's the stuff that's hard to dig up. Alex -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: r200 lockup in radeon_cp_texture/radeon_freelist_get
On Thu, Feb 19, 2009 at 15:46, Alex Deucher wrote: > On Thu, Feb 19, 2009 at 7:26 AM, Roland Scheidegger > wrote: >> On 19.02.2009 12:23, Arkadi Shishlov wrote: >>> Roland Scheidegger wrote: I suspect you're hitting a r200 asic bug, which isn't present in rv250 and other r200 family members. There are workarounds in the driver for this (see r200UpdateTextureState), but to my knowledge they are insufficient for two-pass ATI_fragment_shader shaders. There's also a >>> >>> You're right. I changed video card to RV280 9250SE and lockup goes away. >>> Nice picture, a little slower than fglrx, probably due to 9250 being >>> slower than 8500. >> doom3 is actually a performance mystery to me. On my 9000pro, >> performance seemed to be similar to fglrx, however using another OS it >> was vastly faster, and I always wondered how it could be tweaked... >> Hyperz doesn't seem to help much, neither does the mipmap optimization, >> yet still somehow it must be possible to make it run much faster. >> >>> mesa test which last I heard showed errors (progs/tests/afsmultiarb) (you can switch the test between one and two pass shaders). If this is the problem, you could try running doom3 using the arb path I think something like doom3 +seta r_renderer arb might work, however arb path looks ugly (r_renderer defaults to "best" which will then choose "r200" on this card). >>> >>> Yes, its ugly and incorrect, some walls are not opaque but blends over >>> another walls. >> Oh, that sounds like a bug. Ugly yes but I didn't see that. >> >>> Unfortunately I don't know how this could be fixed, as documentation for these asic bugs is nonexistent (or at least non-public). >>> >>> If lockup could be reliable reproduced with simple test - like >>> afsmultiarb in fresh X without WM - will it be helpfull to get mmio >>> trace from fglrx and r200 drivers to compare? >> I think at least some of the asic bugs do not necessarily result in a >> lockup but also could result in misrendering. Someone might be able to >> figure out what fglrx does, I guess of particular interest would be the >> writes to these debug regs (0x2D90 through 0x2DBC). That said, it might >> not be easy to figure this out completely (could depend on which texture >> units are enabled in what pass, and depending on filtering on each of >> those). Or it could even be some completely unrelated bug. >> >> >>> >>> In the light of recent progress with AMD's attitude, can't you just ask >>> fglrx guys about R200 bug? >> R200 is understandably not exactly top priority, and it seemed like the >> usual docs didn't cover it. Though maybe Alex wants to comment on this. > > Unfortunately, r200 is so old, it hard to find much information on it anymore. > The r200 docs were released under NDA to selected people. So I don't think the r200 docs have completely disappeared off the earh ;) Stephane -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Take mmap_sem up front to avoid lock order violations.
On Thu, 2009-02-19 at 11:33 +0100, Peter Zijlstra wrote: > On Thu, 2009-02-19 at 10:19 +0100, Peter Zijlstra wrote: > > On Wed, 2009-02-18 at 11:38 -0500, k...@bitplanet.net wrote: > > > From: Kristian Høgsberg > > > > > > A number of GEM operations (and legacy drm ones) want to copy data to > > > or from userspace while holding the struct_mutex lock. However, the > > > fault handler calls us with the mmap_sem held and thus enforces the > > > opposite locking order. This patch downs the mmap_sem up front for > > > those operations that access userspace data under the struct_mutex > > > lock to ensure the locking order is consistent. > > > > > > Signed-off-by: Kristian Høgsberg > > > --- > > > > > > Here's a different and simpler attempt to fix the locking order > > > problem. We can just down_read() the mmap_sem pre-emptively up-front, > > > and the locking order is respected. It's simpler than the > > > mutex_trylock() game, avoids introducing a new mutex. > > > > > OK let me try that again -- my initial response was a tad curt :/ No that's fair, I was aware that the patch was probably borderline and got the feedback I was looking for ;) > While I appreciate your efforts in fixing GEM (I too have an interest in > seeing it done), I cannot support your patch. > > Firstly, you're using mmap_sem well outside its problem domain, this is > bad form. Furthermore, holding it for extended durations for no good > reason affects all other users. Yup, agree. > Secondly, mmap_sem is not a recursive lock (very few kernel locks are, > and we generally frown upon recursive locking schemes), this means that > the fault handler still cannot function properly. I understand, but we take it twice only as a read lock, so that should work, right? We prevent the deadlock the lockdep validator warned about and as far as I can see, the patch doesn't introduce a new one. But other than that I agree with the frowning on recursive locking, it's too often used to paper over badly thought out locking. cheers, Kristian -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 20211] Runing Kubrick game maximized freezes X and causes kernel panic
http://bugs.freedesktop.org/show_bug.cgi?id=20211 Alex Deucher changed: What|Removed |Added Attachment #23111|application/x-trash |text/plain mime type|| Attachment #23111|0 |1 is patch|| -- Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: r200 lockup in radeon_cp_texture/radeon_freelist_get
On Thu, Feb 19, 2009 at 7:26 AM, Roland Scheidegger wrote: > On 19.02.2009 12:23, Arkadi Shishlov wrote: >> Roland Scheidegger wrote: >>> I suspect you're hitting a r200 asic bug, which isn't present in rv250 >>> and other r200 family members. There are workarounds in the driver for >>> this (see r200UpdateTextureState), but to my knowledge they are >>> insufficient for two-pass ATI_fragment_shader shaders. There's also a >> >> You're right. I changed video card to RV280 9250SE and lockup goes away. >> Nice picture, a little slower than fglrx, probably due to 9250 being >> slower than 8500. > doom3 is actually a performance mystery to me. On my 9000pro, > performance seemed to be similar to fglrx, however using another OS it > was vastly faster, and I always wondered how it could be tweaked... > Hyperz doesn't seem to help much, neither does the mipmap optimization, > yet still somehow it must be possible to make it run much faster. > >> >>> mesa test which last I heard showed errors (progs/tests/afsmultiarb) >>> (you can switch the test between one and two pass shaders). >>> If this is the problem, you could try running doom3 using the arb path I >>> think something like doom3 +seta r_renderer arb might work, however arb >>> path looks ugly (r_renderer defaults to "best" which will then choose >>> "r200" on this card). >> >> Yes, its ugly and incorrect, some walls are not opaque but blends over >> another walls. > Oh, that sounds like a bug. Ugly yes but I didn't see that. > >> >>> Unfortunately I don't know how this could be fixed, as documentation for >>> these asic bugs is nonexistent (or at least non-public). >> >> If lockup could be reliable reproduced with simple test - like >> afsmultiarb in fresh X without WM - will it be helpfull to get mmio >> trace from fglrx and r200 drivers to compare? > I think at least some of the asic bugs do not necessarily result in a > lockup but also could result in misrendering. Someone might be able to > figure out what fglrx does, I guess of particular interest would be the > writes to these debug regs (0x2D90 through 0x2DBC). That said, it might > not be easy to figure this out completely (could depend on which texture > units are enabled in what pass, and depending on filtering on each of > those). Or it could even be some completely unrelated bug. > > >> >> In the light of recent progress with AMD's attitude, can't you just ask >> fglrx guys about R200 bug? > R200 is understandably not exactly top priority, and it seemed like the > usual docs didn't cover it. Though maybe Alex wants to comment on this. Unfortunately, r200 is so old, it hard to find much information on it anymore. Alex -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86.
On Wednesday 18 February 2009, David Miller wrote: > drm: Only use DRM_IOCTL_UPDATE_DRAW compat wrapper for compat X86. > > Only X86 32-bit uses a different alignment for "unsigned long long" > than it's 64-bit counterpart. > > Therefore this compat translation is only correct, and only needed, > when either CONFIG_X86 or CONFIG_IA64. > > Signed-off-by: David S. Miller The patch is correct AFAICT, but I'd like to point out that the problem could have been avoided (besides using a non-padded layout) by using a compat_u64 member in the struct definition instead of the packed attribute: typedef struct drm_update_draw32 { drm_drawable_t handle; unsigned int type; unsigned int num; compat_u64 data;/**< Pointer */ } drm_update_draw32_t; I find that too often __attribute__((packed)) is used on whole structures where some other much more gentle solution can be applied. In fact, there are very few files (e.g. linux/unaligned/packed_struct.h) that look like they want all of the implied meanings (pack members, drop alignment on whole structure, access members as unaligned). A grep for "packed" in compat_ioctl definitions revealed the same bug as in drm_update_draw32 to be present in raw32_config_request, and I'm rather sure that there are more of these. Arnd <>< -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Take mmap_sem up front to avoid lock order violations.
On Thu, Feb 19, 2009 at 10:19:05AM +0100, Peter Zijlstra wrote: > On Wed, 2009-02-18 at 11:38 -0500, k...@bitplanet.net wrote: > > From: Kristian Høgsberg > > > > A number of GEM operations (and legacy drm ones) want to copy data to > > or from userspace while holding the struct_mutex lock. However, the > > fault handler calls us with the mmap_sem held and thus enforces the > > opposite locking order. This patch downs the mmap_sem up front for > > those operations that access userspace data under the struct_mutex > > lock to ensure the locking order is consistent. > > > > Signed-off-by: Kristian Høgsberg > > --- > > > > Here's a different and simpler attempt to fix the locking order > > problem. We can just down_read() the mmap_sem pre-emptively up-front, > > and the locking order is respected. It's simpler than the > > mutex_trylock() game, avoids introducing a new mutex. The "simple" way to fix this is to just allocate a temporary buffer to copy a snapshot of the data going to/from userspace. Then do the real usercopy to/from that buffer outside the locks. You don't have any performance critical bulk copies (ie. that will blow the L1 cache), do you? -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: r200 lockup in radeon_cp_texture/radeon_freelist_get
Roland Scheidegger wrote: > I suspect you're hitting a r200 asic bug, which isn't present in rv250 > and other r200 family members. There are workarounds in the driver for > this (see r200UpdateTextureState), but to my knowledge they are > insufficient for two-pass ATI_fragment_shader shaders. There's also a You're right. I changed video card to RV280 9250SE and lockup goes away. Nice picture, a little slower than fglrx, probably due to 9250 being slower than 8500. > mesa test which last I heard showed errors (progs/tests/afsmultiarb) > (you can switch the test between one and two pass shaders). > If this is the problem, you could try running doom3 using the arb path I > think something like doom3 +seta r_renderer arb might work, however arb > path looks ugly (r_renderer defaults to "best" which will then choose > "r200" on this card). Yes, its ugly and incorrect, some walls are not opaque but blends over another walls. > Unfortunately I don't know how this could be fixed, as documentation for > these asic bugs is nonexistent (or at least non-public). If lockup could be reliable reproduced with simple test - like afsmultiarb in fresh X without WM - will it be helpfull to get mmio trace from fglrx and r200 drivers to compare? In the light of recent progress with AMD's attitude, can't you just ask fglrx guys about R200 bug? -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: r200 lockup in radeon_cp_texture/radeon_freelist_get
On 19.02.2009 12:23, Arkadi Shishlov wrote: > Roland Scheidegger wrote: >> I suspect you're hitting a r200 asic bug, which isn't present in rv250 >> and other r200 family members. There are workarounds in the driver for >> this (see r200UpdateTextureState), but to my knowledge they are >> insufficient for two-pass ATI_fragment_shader shaders. There's also a > > You're right. I changed video card to RV280 9250SE and lockup goes away. > Nice picture, a little slower than fglrx, probably due to 9250 being > slower than 8500. doom3 is actually a performance mystery to me. On my 9000pro, performance seemed to be similar to fglrx, however using another OS it was vastly faster, and I always wondered how it could be tweaked... Hyperz doesn't seem to help much, neither does the mipmap optimization, yet still somehow it must be possible to make it run much faster. > >> mesa test which last I heard showed errors (progs/tests/afsmultiarb) >> (you can switch the test between one and two pass shaders). >> If this is the problem, you could try running doom3 using the arb path I >> think something like doom3 +seta r_renderer arb might work, however arb >> path looks ugly (r_renderer defaults to "best" which will then choose >> "r200" on this card). > > Yes, its ugly and incorrect, some walls are not opaque but blends over > another walls. Oh, that sounds like a bug. Ugly yes but I didn't see that. > >> Unfortunately I don't know how this could be fixed, as documentation for >> these asic bugs is nonexistent (or at least non-public). > > If lockup could be reliable reproduced with simple test - like > afsmultiarb in fresh X without WM - will it be helpfull to get mmio > trace from fglrx and r200 drivers to compare? I think at least some of the asic bugs do not necessarily result in a lockup but also could result in misrendering. Someone might be able to figure out what fglrx does, I guess of particular interest would be the writes to these debug regs (0x2D90 through 0x2DBC). That said, it might not be easy to figure this out completely (could depend on which texture units are enabled in what pass, and depending on filtering on each of those). Or it could even be some completely unrelated bug. > > In the light of recent progress with AMD's attitude, can't you just ask > fglrx guys about R200 bug? R200 is understandably not exactly top priority, and it seemed like the usual docs didn't cover it. Though maybe Alex wants to comment on this. Roland -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Take mmap_sem up front to avoid lock order violations.
On Thu, 2009-02-19 at 10:19 +0100, Peter Zijlstra wrote: > On Wed, 2009-02-18 at 11:38 -0500, k...@bitplanet.net wrote: > > From: Kristian Høgsberg > > > > A number of GEM operations (and legacy drm ones) want to copy data to > > or from userspace while holding the struct_mutex lock. However, the > > fault handler calls us with the mmap_sem held and thus enforces the > > opposite locking order. This patch downs the mmap_sem up front for > > those operations that access userspace data under the struct_mutex > > lock to ensure the locking order is consistent. > > > > Signed-off-by: Kristian Høgsberg > > --- > > > > Here's a different and simpler attempt to fix the locking order > > problem. We can just down_read() the mmap_sem pre-emptively up-front, > > and the locking order is respected. It's simpler than the > > mutex_trylock() game, avoids introducing a new mutex. > > OK let me try that again -- my initial response was a tad curt :/ While I appreciate your efforts in fixing GEM (I too have an interest in seeing it done), I cannot support your patch. Firstly, you're using mmap_sem well outside its problem domain, this is bad form. Furthermore, holding it for extended durations for no good reason affects all other users. Secondly, mmap_sem is not a recursive lock (very few kernel locks are, and we generally frown upon recursive locking schemes), this means that the fault handler still cannot function properly. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: Take mmap_sem up front to avoid lock order violations.
On Wed, 2009-02-18 at 11:38 -0500, k...@bitplanet.net wrote: > From: Kristian Høgsberg > > A number of GEM operations (and legacy drm ones) want to copy data to > or from userspace while holding the struct_mutex lock. However, the > fault handler calls us with the mmap_sem held and thus enforces the > opposite locking order. This patch downs the mmap_sem up front for > those operations that access userspace data under the struct_mutex > lock to ensure the locking order is consistent. > > Signed-off-by: Kristian Høgsberg > --- > > Here's a different and simpler attempt to fix the locking order > problem. We can just down_read() the mmap_sem pre-emptively up-front, > and the locking order is respected. It's simpler than the > mutex_trylock() game, avoids introducing a new mutex. > Hell no! for one, mmap_sem is not a recursive lock, so a pagefault will utterly fail with this in place. Secondly, holding mmap_sem for no good reason just sucks. > drivers/gpu/drm/i915/i915_dma.c |6 +- > drivers/gpu/drm/i915/i915_gem.c | 20 +--- > 2 files changed, 18 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 81f1cff..d8b58d9 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -642,9 +642,11 @@ static int i915_batchbuffer(struct drm_device *dev, void > *data, > sizeof(struct > drm_clip_rect))) > return -EFAULT; > > + down_read(¤t->mm->mmap_sem); > mutex_lock(&dev->struct_mutex); > ret = i915_dispatch_batchbuffer(dev, batch); > mutex_unlock(&dev->struct_mutex); > + up_read(¤t->mm->mmap_sem); > > if (sarea_priv) > sarea_priv->last_dispatch = READ_BREADCRUMB(dev_priv); > @@ -674,14 +676,16 @@ static int i915_cmdbuffer(struct drm_device *dev, void > *data, > return -EFAULT; > } > > + down_read(¤t->mm->mmap_sem); > mutex_lock(&dev->struct_mutex); > ret = i915_dispatch_cmdbuffer(dev, cmdbuf); > mutex_unlock(&dev->struct_mutex); > + up_read(¤t->mm->mmap_sem); > + > if (ret) { > DRM_ERROR("i915_dispatch_cmdbuffer failed\n"); > return ret; > } > - > if (sarea_priv) > sarea_priv->last_dispatch = READ_BREADCRUMB(dev_priv); > return 0; > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index d9cd42f..3dd8b6e 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -171,6 +171,7 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, > return -EINVAL; > } > > + down_read(¤t->mm->mmap_sem); > mutex_lock(&dev->struct_mutex); > > ret = i915_gem_object_set_cpu_read_domain_range(obj, args->offset, > @@ -196,6 +197,7 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data, > > drm_gem_object_unreference(obj); > mutex_unlock(&dev->struct_mutex); > + up_read(¤t->mm->mmap_sem); > > return 0; > } > @@ -264,7 +266,7 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct > drm_gem_object *obj, > if (!access_ok(VERIFY_READ, user_data, remain)) > return -EFAULT; > > - > + down_read(¤t->mm->mmap_sem); > mutex_lock(&dev->struct_mutex); > ret = i915_gem_object_pin(obj, 0); > if (ret) { > @@ -315,6 +317,7 @@ i915_gem_gtt_pwrite(struct drm_device *dev, struct > drm_gem_object *obj, > fail: > i915_gem_object_unpin(obj); > mutex_unlock(&dev->struct_mutex); > + up_read(¤t->mm->mmap_sem); > > return ret; > } > @@ -328,6 +331,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, struct > drm_gem_object *obj, > loff_t offset; > ssize_t written; > > + down_read(¤t->mm->mmap_sem); > mutex_lock(&dev->struct_mutex); > > ret = i915_gem_object_set_to_cpu_domain(obj, 1); > @@ -350,6 +354,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev, struct > drm_gem_object *obj, > } > > mutex_unlock(&dev->struct_mutex); > + up_read(¤t->mm->mmap_sem); > > return 0; > } > @@ -2473,22 +2478,21 @@ i915_gem_execbuffer(struct drm_device *dev, void > *data, > goto pre_mutex_err; > } > > + down_read(¤t->mm->mmap_sem); > mutex_lock(&dev->struct_mutex); > > i915_verify_inactive(dev, __FILE__, __LINE__); > > if (dev_priv->mm.wedged) { > DRM_ERROR("Execbuf while wedged\n"); > - mutex_unlock(&dev->struct_mutex); > ret = -EIO; > - goto pre_mutex_err; > + goto mutex_err; > } > > if (dev_priv->mm.suspended) { > DRM_ERROR("Execbuf while VT-switched.\n"); > - mutex_unlock(&dev->struct_mutex); > ret = -EBUSY; > - goto pre_mutex_err; > + goto mutex_err; > } >
[Bug 12613] [Suspend regression][DRM, RADEON]
http://bugzilla.kernel.org/show_bug.cgi?id=12613 mmvi...@yahoo.com changed: What|Removed |Added CC||mmvi...@yahoo.com --- Comment #7 from mmvi...@yahoo.com 2009-02-19 01:52 --- The patch posted by Etienne in http://lkml.org/lkml/2009/2/15/28 fixes a similar problem with hibernation on my HP nx9005 laptop with ATI Technologies Inc Radeon Mobility U1 (IGP 320M or something) graphics card. I can't test suspend to ram on this machine due to another problem (http://bugzilla.kernel.org/show_bug.cgi?id=5884). I can make the computer hang with the following steps without the patch (in 2.6.29-rc5-122-g... something): 1) log in to fail safe session (no 3d fancies) 2) check glxgears (works) 3) echo test > /sys/power/disk 4) echo disk > /sys/power/state (resumes itself after 5 seconds, as expected) 5) try glxgears (immediate hang (mouse doesn't move, caps lock light doesn't change); fan gets on, so it is trying to do something; power button turns the machine off normally, etc.) With Etienne's patch the computer doesn't hang anymore with the steps described above and even suspend to disk and resume with KDE 4 running works. Not bad from a "shot in the dark" patch :) -- Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug, or are watching the assignee. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 20211] Runing Kubrick game maximized freezes X and causes kernel panic
http://bugs.freedesktop.org/show_bug.cgi?id=20211 --- Comment #1 from Jure Repinc 2009-02-19 01:30:09 PST --- Created an attachment (id=23111) --> (http://bugs.freedesktop.org/attachment.cgi?id=23111) Xorg.0.log -- Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 20211] New: Runing Kubrick game maximized freezes X and causes kernel panic
http://bugs.freedesktop.org/show_bug.cgi?id=20211 Summary: Runing Kubrick game maximized freezes X and causes kernel panic Product: DRI Version: XOrg CVS Platform: x86-64 (AMD64) OS/Version: Linux (All) Status: NEW Severity: critical Priority: medium Component: DRM/Radeon AssignedTo: dri-devel@lists.sourceforge.net ReportedBy: jlp.b...@gmail.com Created an attachment (id=23110) --> (http://bugs.freedesktop.org/attachment.cgi?id=23110) xorg.conf My Xorg server is version 1.5.3, Linux kernel is 2.6.26.6, graphics card is: 01:00.0 VGA compatible controller: ATI Technologies Inc RV350 AR [Radeon 9600] If I run Kubrick game from KDE 4.2.0 at its default size all is OK. But if I maximize it over the entire screen (I use resolution 1280x1024) after some time of runing in demo mode it freezes the X server, only mouse pounter can move. When I check out the list of processes over SSH I can see kubrick using all CPU. When I kill kubrick process the system completely locks up and keyboard LEDs start flashing so I guess the kernel panics. -- Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel