Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
On Sat, Feb 14, 2009 at 4:09 PM, David Miller wrote: > From: Benjamin Herrenschmidt > Date: Thu, 12 Feb 2009 21:35:59 +1100 > >> Oh BTW something else to be careful with, though I suppose it's working >> some what by accident right now... when the GART is in the frame buffer >> it gets applied the current fb swapper setting... ouch ! >> >> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which >> afaik are readl/writel (ie, swapping) to make sure we at least >> temporarily disable that swapper while whacking the GART. > > So I've narrowed down the final problems I'm having. It's the surface > swapping settings indeed. > > Running Xorg at depth 8, the CP can execute indirect buffers just > fine, no code changes necessary. > > However, the CP hangs on indirect execution for depth 16 and 24. But > these depths work if I hard disable the surface byte swapping settings > in the radeon Xorg driver. > > I did some research, and it does appear that the GART does read the > PTEs from the VRAM using the Host Data Path. This means the surface > control byte swapping settings are applied. > > So for depths of 16 and 24, the GART is reading garbage PTEs. And > that's why the CP hangs. Wow that is ugly, truly ugly. I'm going to say I've little idea how to fix this outside KMS, The only think I can think to do is use a surface instead of the aperture swappers and make the surface cover all the VRAM except the GART table which is at the end. > I have no idea how to handle this properly. Not only does the Xorg > ATI driver set the swapping settings at an arbitray point, it also > mucks with them dynamically f.e. in the render helper functions. > > The only way this can work properly is to choose one surface swapping > setting, set the VRAM GART table so that the GART sees the PTEs in the > correct format, and retain those settings throughout > > It seems like, in at least R5xx, they tried to add a control bit in > the PCI-E GART registers to handle this. Bit 6 in PCI-E indirect > register 0x10 is named 'GART_RDREQPATH_SEL' and has description: > >Read request path > 0=HDP > 1=Direct Rd Req to MC > > This seems to be intended to be a way to have the GART PTE reads > bypass the full Host Data Path (and thus potential surface byte > swaps), by setting this bit to 1. > > I tried doing that, but it doesn't help my RV370 so perhaps older > chips don't support this bit. It's hard to tell as this is the area > where all of AMD's published GPU documents are severely lacking. You don't get this bit until rv515 and above so no pony for you. > > I tested whether reading back the PCIE_TX_GART_CNTL register shows > bit 6 after I try to write it, and it always reads back zero. > > There are even more complications, the VT enter/exit code in the Xorg > ATI driver tries to save and restore the VRAM GART table for PCI-E > cards. But this: > > 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead > of the constant 4096 to determine how many GART entries there > are. The PCI GART entries map 4096 bytes, always. So using > getpagesize() is wrong. (see RADEONDRIGetPciAperTableSize) > > I have this fixed in my local tree. Oops. > > 2) It doesn't check the surface byte swapping settings, so it > could be saving in one byte order and restoing in another. > > I guess we could force RADEON_SURFACE_CNTL to zero around > the two memcpy()'s done in radeon_driver.c Might be a good plan. > > But really this whole area is a mess, and I know KMS is coming to fix > this, but even in the traditional XORG/DRM layout XORG has no business > keeping the GART table uptodate across VT switches. It should be > calling into the DRM with an ioctl to write the GART table out to VRAM > again. You can draw an arbitrary line anywhere you want really, its all an unholy mess, ever since people decided userspace drivers were a good idea. > > Finally there is a huge issue with how the Xorg ATI driver resets the > chip using the RBBM. It soft resets the CP, but this resets the ring > read pointer. However, nothing is done to make sure the DRM resync's > the ring write pointer (which remains unchanged by a soft CP reset) as > well as it's internal software ring state. > > The result is that on the very next CP command submission, the CP > re-executes everything from ring entry zero until wherever the DRM > things the write pointer was at the time of the CP soft reset. > > Any time the Xorg ATI driver does a CP soft reset, it should do > CP stop and resume calls to the DRM to resync the ring pointers. > And this does not appear to be happening where it needs to be > happening. That's interesting, it could explain why things never work again after a reset or at least proceed to hang straight away. Dave. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest
[PATCH]: drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().
drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init(). The variable 'max_pages' is ambiguous. There are two concepts of "pages" being used in this function. First, we have ATI GART pages which are always 4096 bytes. Then, we have system pages which are of size PAGE_SIZE. Eliminate the confusion by creating max_ati_pages and max_real_pages. Calculate and use them as appropriate. Signed-off-by: David S. Miller --- drivers/gpu/drm/ati_pcigart.c | 13 +++-- 1 files changed, 7 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c index 7972ec8..4d86a62 100644 --- a/drivers/gpu/drm/ati_pcigart.c +++ b/drivers/gpu/drm/ati_pcigart.c @@ -102,7 +102,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga u32 *pci_gart, page_base, gart_idx; dma_addr_t bus_address = 0; int i, j, ret = 0; - int max_pages; + int max_ati_pages, max_real_pages; if (!entry) { DRM_ERROR("no scatter/gather memory!\n"); @@ -130,14 +130,15 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga pci_gart = (u32 *) address; - max_pages = (gart_info->table_size / sizeof(u32)); - pages = (entry->pages <= max_pages) - ? entry->pages : max_pages; + max_ati_pages = (gart_info->table_size / sizeof(u32)); + max_real_pages = max_ati_pages / (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE); + pages = (entry->pages <= max_real_pages) + ? entry->pages : max_real_pages; if (gart_info->gart_table_location == DRM_ATI_GART_MAIN) { - memset(pci_gart, 0, max_pages * sizeof(u32)); + memset(pci_gart, 0, max_ati_pages * sizeof(u32)); } else { - for (gart_idx = 0; gart_idx < max_pages; gart_idx++) + for (gart_idx = 0; gart_idx < max_ati_pages; gart_idx++) DRM_WRITE32(map, gart_idx * sizeof(u32), 0); } -- 1.6.1.2.350.g88cc -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
From: Benjamin Herrenschmidt Date: Thu, 12 Feb 2009 21:35:59 +1100 > Oh BTW something else to be careful with, though I suppose it's working > some what by accident right now... when the GART is in the frame buffer > it gets applied the current fb swapper setting... ouch ! > > So it might be a good idea, if we're going to use DRM_READ/WRITE32 which > afaik are readl/writel (ie, swapping) to make sure we at least > temporarily disable that swapper while whacking the GART. So I've narrowed down the final problems I'm having. It's the surface swapping settings indeed. Running Xorg at depth 8, the CP can execute indirect buffers just fine, no code changes necessary. However, the CP hangs on indirect execution for depth 16 and 24. But these depths work if I hard disable the surface byte swapping settings in the radeon Xorg driver. I did some research, and it does appear that the GART does read the PTEs from the VRAM using the Host Data Path. This means the surface control byte swapping settings are applied. So for depths of 16 and 24, the GART is reading garbage PTEs. And that's why the CP hangs. I have no idea how to handle this properly. Not only does the Xorg ATI driver set the swapping settings at an arbitray point, it also mucks with them dynamically f.e. in the render helper functions. The only way this can work properly is to choose one surface swapping setting, set the VRAM GART table so that the GART sees the PTEs in the correct format, and retain those settings throughout. It seems like, in at least R5xx, they tried to add a control bit in the PCI-E GART registers to handle this. Bit 6 in PCI-E indirect register 0x10 is named 'GART_RDREQPATH_SEL' and has description: Read request path 0=HDP 1=Direct Rd Req to MC This seems to be intended to be a way to have the GART PTE reads bypass the full Host Data Path (and thus potential surface byte swaps), by setting this bit to 1. I tried doing that, but it doesn't help my RV370 so perhaps older chips don't support this bit. It's hard to tell as this is the area where all of AMD's published GPU documents are severely lacking. I tested whether reading back the PCIE_TX_GART_CNTL register shows bit 6 after I try to write it, and it always reads back zero. There are even more complications, the VT enter/exit code in the Xorg ATI driver tries to save and restore the VRAM GART table for PCI-E cards. But this: 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead of the constant 4096 to determine how many GART entries there are. The PCI GART entries map 4096 bytes, always. So using getpagesize() is wrong. (see RADEONDRIGetPciAperTableSize) I have this fixed in my local tree. 2) It doesn't check the surface byte swapping settings, so it could be saving in one byte order and restoing in another. I guess we could force RADEON_SURFACE_CNTL to zero around the two memcpy()'s done in radeon_driver.c But really this whole area is a mess, and I know KMS is coming to fix this, but even in the traditional XORG/DRM layout XORG has no business keeping the GART table uptodate across VT switches. It should be calling into the DRM with an ioctl to write the GART table out to VRAM again. Finally there is a huge issue with how the Xorg ATI driver resets the chip using the RBBM. It soft resets the CP, but this resets the ring read pointer. However, nothing is done to make sure the DRM resync's the ring write pointer (which remains unchanged by a soft CP reset) as well as it's internal software ring state. The result is that on the very next CP command submission, the CP re-executes everything from ring entry zero until wherever the DRM things the write pointer was at the time of the CP soft reset. Any time the Xorg ATI driver does a CP soft reset, it should do CP stop and resume calls to the DRM to resync the ring pointers. And this does not appear to be happening where it needs to be happening. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 4/4] drm: Use spread spectrum when the bios tells us it's ok.
Lifted from the DDX modesetting. Signed-off-by: Kristian Høgsberg --- drivers/gpu/drm/i915/i915_drv.h |2 ++ drivers/gpu/drm/i915/intel_bios.c|8 drivers/gpu/drm/i915/intel_display.c | 20 ++-- 3 files changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7325363..135a08f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -184,6 +184,8 @@ typedef struct drm_i915_private { unsigned int lvds_dither:1; unsigned int lvds_vbt:1; unsigned int int_crt_support:1; + unsigned int lvds_use_ssc:1; + int lvds_ssc_freq; struct drm_i915_fence_reg fence_regs[16]; /* assume 965 */ int fence_reg_start; /* 4 if userland hasn't ioctl'd us yet */ diff --git a/drivers/gpu/drm/i915/intel_bios.c b/drivers/gpu/drm/i915/intel_bios.c index 4ca82a0..65be30d 100644 --- a/drivers/gpu/drm/i915/intel_bios.c +++ b/drivers/gpu/drm/i915/intel_bios.c @@ -135,6 +135,14 @@ parse_general_features(struct drm_i915_private *dev_priv, if (general) { dev_priv->int_tv_support = general->int_tv_support; dev_priv->int_crt_support = general->int_crt_support; + dev_priv->lvds_use_ssc = general->enable_ssc; + + if (dev_priv->lvds_use_ssc) { + if (IS_I855(dev_priv->dev)) + dev_priv->lvds_ssc_freq = general->ssc_freq ? 66 : 48; + else + dev_priv->lvds_ssc_freq = general->ssc_freq ? 100 : 96; + } } } diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 8b20387..13f9b66 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -217,7 +217,7 @@ bool intel_pipe_has_type (struct drm_crtc *crtc, int type) return false; } -#define INTELPllInvalid(s) { /* ErrorF (s) */; return false; } +#define INTELPllInvalid(s) do { DRM_DEBUG(s); return false; } while (0) /** * Returns whether the given set of divisors are valid for a given refclk with * the given connectors. @@ -726,7 +726,7 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc, int dspsize_reg = (pipe == 0) ? DSPASIZE : DSPBSIZE; int dsppos_reg = (pipe == 0) ? DSPAPOS : DSPBPOS; int pipesrc_reg = (pipe == 0) ? PIPEASRC : PIPEBSRC; - int refclk; + int refclk, num_outputs = 0; intel_clock_t clock; u32 dpll = 0, fp = 0, dspcntr, pipeconf; bool ok, is_sdvo = false, is_dvo = false; @@ -763,9 +763,14 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc, is_crt = true; break; } + + num_outputs++; } - if (IS_I9XX(dev)) { + if (is_lvds && dev_priv->lvds_use_ssc && num_outputs < 2) { + refclk = dev_priv->lvds_ssc_freq * 1000; + DRM_DEBUG("using SSC reference clock of %d MHz\n", refclk / 1000); + } else if (IS_I9XX(dev)) { refclk = 96000; } else { refclk = 48000; @@ -824,11 +829,14 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc, } } - if (is_tv) { + if (is_sdvo && is_tv) + dpll |= PLL_REF_INPUT_TVCLKINBC; + else if (is_tv) /* XXX: just matching BIOS for now */ -/* dpll |= PLL_REF_INPUT_TVCLKINBC; */ + /* dpll |= PLL_REF_INPUT_TVCLKINBC; */ dpll |= 3; - } + else if (is_lvds && dev_priv->lvds_use_ssc && num_outputs < 2) + dpll |= PLLB_REF_INPUT_SPREADSPECTRUMIN; else dpll |= PLL_REF_INPUT_DREFCLK; -- 1.6.0.rc0.42.g186458 -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 3/4] drm: Collapse identical i8xx_clock() and i9xx_clock().
They used to be different. Now they're identical. Signed-off-by: Kristian Høgsberg --- drivers/gpu/drm/i915/intel_display.c | 33 ++--- 1 files changed, 6 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 4f54ac5..8b20387 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -189,19 +189,7 @@ static const intel_limit_t *intel_limit(struct drm_crtc *crtc) return limit; } -/** Derive the pixel clock for the given refclk and divisors for 8xx chips. */ - -static void i8xx_clock(int refclk, intel_clock_t *clock) -{ - clock->m = 5 * (clock->m1 + 2) + (clock->m2 + 2); - clock->p = clock->p1 * clock->p2; - clock->vco = refclk * clock->m / (clock->n + 2); - clock->dot = clock->vco / clock->p; -} - -/** Derive the pixel clock for the given refclk and divisors for 9xx chips. */ - -static void i9xx_clock(int refclk, intel_clock_t *clock) +static void intel_clock(int refclk, intel_clock_t *clock) { clock->m = 5 * (clock->m1 + 2) + (clock->m2 + 2); clock->p = clock->p1 * clock->p2; @@ -209,15 +197,6 @@ static void i9xx_clock(int refclk, intel_clock_t *clock) clock->dot = clock->vco / clock->p; } -static void intel_clock(struct drm_device *dev, int refclk, - intel_clock_t *clock) -{ - if (IS_I9XX(dev)) - i9xx_clock (refclk, clock); - else - i8xx_clock (refclk, clock); -} - /** * Returns whether any output on the specified pipe is of the specified type */ @@ -318,7 +297,7 @@ static bool intel_find_best_PLL(struct drm_crtc *crtc, int target, clock.p1 <= limit->p1.max; clock.p1++) { int this_err; - intel_clock(dev, refclk, &clock); + intel_clock(refclk, &clock); if (!intel_PLL_is_valid(crtc, &clock)) continue; @@ -1313,7 +1292,7 @@ static int intel_crtc_clock_get(struct drm_device *dev, struct drm_crtc *crtc) } /* XXX: Handle the 100Mhz refclk */ - i9xx_clock(96000, &clock); + intel_clock(96000, &clock); } else { bool is_lvds = (pipe == 1) && (I915_READ(LVDS) & LVDS_PORT_EN); @@ -1325,9 +1304,9 @@ static int intel_crtc_clock_get(struct drm_device *dev, struct drm_crtc *crtc) if ((dpll & PLL_REF_INPUT_MASK) == PLLB_REF_INPUT_SPREADSPECTRUMIN) { /* XXX: might not be 66MHz */ - i8xx_clock(66000, &clock); + intel_clock(66000, &clock); } else - i8xx_clock(48000, &clock); + intel_clock(48000, &clock); } else { if (dpll & PLL_P1_DIVIDE_BY_TWO) clock.p1 = 2; @@ -1340,7 +1319,7 @@ static int intel_crtc_clock_get(struct drm_device *dev, struct drm_crtc *crtc) else clock.p2 = 2; - i8xx_clock(48000, &clock); + intel_clock(48000, &clock); } } -- 1.6.0.rc0.42.g186458 -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 2/4] drm: Bring PLL limits in sync with DDX values.
Signed-off-by: Kristian Høgsberg --- drivers/gpu/drm/i915/intel_display.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 94c7c09..4f54ac5 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -90,12 +90,12 @@ typedef struct { #define I9XX_DOT_MAX40 #define I9XX_VCO_MIN 140 #define I9XX_VCO_MAX 280 -#define I9XX_N_MIN 3 -#define I9XX_N_MAX 8 +#define I9XX_N_MIN 1 +#define I9XX_N_MAX 6 #define I9XX_M_MIN 70 #define I9XX_M_MAX 120 #define I9XX_M1_MIN 10 -#define I9XX_M1_MAX 20 +#define I9XX_M1_MAX 22 #define I9XX_M2_MIN 5 #define I9XX_M2_MAX 9 #define I9XX_P_SDVO_DAC_MIN 5 -- 1.6.0.rc0.42.g186458 -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 1/4] drm: Add locking around cursor gem operations.
We need to hold the struct_mutex around pinning and the phys object operations. Signed-off-by: Kristian Høgsberg --- drivers/gpu/drm/i915/intel_display.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ac92799..94c7c09 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -1043,18 +1043,19 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc, } /* we only need to pin inside GTT if cursor is non-phy */ + mutex_lock(&dev->struct_mutex); if (!dev_priv->cursor_needs_physical) { ret = i915_gem_object_pin(bo, PAGE_SIZE); if (ret) { DRM_ERROR("failed to pin cursor bo\n"); - goto fail; + goto fail_locked; } addr = obj_priv->gtt_offset; } else { ret = i915_gem_attach_phys_object(dev, bo, (pipe == 0) ? I915_GEM_PHYS_CURSOR_0 : I915_GEM_PHYS_CURSOR_1); if (ret) { DRM_ERROR("failed to attach phys object\n"); - goto fail; + goto fail_locked; } addr = obj_priv->phys_obj->handle->busaddr; } @@ -1074,10 +1075,9 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc, i915_gem_detach_phys_object(dev, intel_crtc->cursor_bo); } else i915_gem_object_unpin(intel_crtc->cursor_bo); - mutex_lock(&dev->struct_mutex); drm_gem_object_unreference(intel_crtc->cursor_bo); - mutex_unlock(&dev->struct_mutex); } + mutex_unlock(&dev->struct_mutex); intel_crtc->cursor_addr = addr; intel_crtc->cursor_bo = bo; @@ -1085,6 +1085,7 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc, return 0; fail: mutex_lock(&dev->struct_mutex); +fail_locked: drm_gem_object_unreference(bo); mutex_unlock(&dev->struct_mutex); return ret; -- 1.6.0.rc0.42.g186458 -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [RFC] page flipping/buffer swap for DRI2
On Friday, February 13, 2009 4:18 pm Peter John Garrone wrote: > In reply to Jesse Barnes post. > I'm not on top of the finer details, being a consumer rather than a > developer. I have the following queries: > > 1) Is the buffer flip to be synchronous in the hardware, or to be > implemented as a software interrupt? I'm not sure what you mean... The flip is queued in the hardware ring, and delayed by hardware until the vertical blank interval starts. So in that sense the hardware takes care of it for us (we don't rely on an interrupt to get the bits on the screen). We do require an interrupt, however, to know when the flip has occurred. This lets us know that it's ok to start drawing on the new back buffer, since it's no longer being displayed to the user. > 2) By fullscreen, do you mean covering one or all screens in a multi-headed > display? It's hard coded to deal with one screen in this patchset, but potentially needs to flip both heads in a shared (cloned or extended desktop) configuration. > 3) If fullscreen means covering an individual screen in a multi-headed > display, will it be possible to have multiple simultaneous fullscreen > tear-free applications? Yes, it should be possible. > 4) Will non-fullscreen applications be tear-free? Maybe. If you're running a compositing manager that always swaps the whole screen (like the hacked version of compiz I've been using for testing does), then any non-DRI apps running under it will be tear-free. DRI apps will need special treatment, which I haven't implemented yet. > To be honest, the speed of the window manager is not a big issue for me. > Tearing is. In production mode, efficient fullscreen output mode is > desirable. In development mode, having the output into manageable > re-sizeable windows is nicer. > > The only way I can see non-tearing non-fullscreen > applications working is with triple-buffer. i.e. a kernel screen flyback > interrupt copies a screen-full from the multi-headed working buffer to a > plane buffer, schedules that plane buffer to be swapped upon the next > flyback, and triggers approptiate returns from the application swap-buffer > function calls. I assume this is triple-buffer because there is the large > possibly virtual working buffer and also two plane buffers for each screen. That would be one way of handling non-fullscreen apps. Another way would be to queue front buffer blits from such apps to occur during the vertical blank period. That's not as reliable (nor easy to implement) as full page flipping though, but has the advantage of just blitting over a small area of the screen, which will preserve most of the compressed framebuffer (if present), saving power. > However a fullscreen application should be able to override this on an > individual screen basis, rendering directly into the non-output display > plane, and scheduling it to be swapped upon the N'th flyback. Right, that's basically what this patchset enables. When a buffer swap comes in for a full screen back/front combo, a flip is queued rather than a blit executed. > Apologies for sounding preachy, I simply outline my own analysis and am > wondering how it is possible for a window manager, which must output to > a larger virtual space, could be regarded as a good candidate for > double-buffering as you suggest. No that's fine; I'd like more eyes and comments on this patchset. A compositing window manager is a special case, since it may span multiple screens like you say, but issue a single glXSwapBuffers call for its whole rendering area. In that case we can queue flips on each head to update the base pointers, and only allow drawing to the new backbuffer when both flips have completed. That should avoid tearing on either head, at the cost of a little more complexity in the swapbuffers implementation. -- Jesse Barnes, Intel Open Source Technology Center -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [RFC] page flipping/buffer swap for DRI2
In reply to Jesse Barnes post. I'm not on top of the finer details, being a consumer rather than a developer. I have the following queries: 1) Is the buffer flip to be synchronous in the hardware, or to be implemented as a software interrupt? 2) By fullscreen, do you mean covering one or all screens in a multi-headed display? 3) If fullscreen means covering an individual screen in a multi-headed display, will it be possible to have multiple simultaneous fullscreen tear-free applications? 4) Will non-fullscreen applications be tear-free? To be honest, the speed of the window manager is not a big issue for me. Tearing is. In production mode, efficient fullscreen output mode is desirable. In development mode, having the output into manageable re-sizeable windows is nicer. The only way I can see non-tearing non-fullscreen applications working is with triple-buffer. i.e. a kernel screen flyback interrupt copies a screen-full from the multi-headed working buffer to a plane buffer, schedules that plane buffer to be swapped upon the next flyback, and triggers approptiate returns from the application swap-buffer function calls. I assume this is triple-buffer because there is the large possibly virtual working buffer and also two plane buffers for each screen. However a fullscreen application should be able to override this on an individual screen basis, rendering directly into the non-output display plane, and scheduling it to be swapped upon the N'th flyback. Apologies for sounding preachy, I simply outline my own analysis and am wondering how it is possible for a window manager, which must output to a larger virtual space, could be regarded as a good candidate for double-buffering as you suggest. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: "Can not ioremap virtual address" (was Re: [git pull] drm)
On Friday, February 13, 2009 9:28 am Diego Calleja wrote: > On Lunes 09 Febrero 2009 09:30:57 Dave Airlie escribió: > > Hi Linus, > > > > Please pull the 'drm-fixes' branch from > > ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git > > drm-fixes > > A commit from this merge is causing an error in my machine: > > commit ab657db12d7020629f26f30d287558a8d0e32b41 > Author: Eric Anholt > Date: Fri Jan 23 12:57:47 2009 -0800 > > drm/i915: Set up an MTRR covering the GTT at driver load. > > > (found using bisection) > > The error: > [ 20.270956] [drm] Initialized i915 1.6.0 20080730 on minor 0 > [ 20.289492] [drm:i915_initialize] *ERROR* can not ioremap virtual > address for ring buffer > > And in Xorg.0.log: (II) intel(0): direct rendering: Failed > > > But...excluding that printk, everything works OK, including "direct > rendering: Yes"... > > There's a related thread here: http://lkml.org/lkml/2009/2/13/139 Does this prevent the message? We map it WC in GEM but not in i915_initialize... diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 81f1cff..2d797ff 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -202,7 +202,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915 dev_priv->ring.map.flags = 0; dev_priv->ring.map.mtrr = 0; - drm_core_ioremap(&dev_priv->ring.map, dev); + drm_core_ioremap_wc(&dev_priv->ring.map, dev); if (dev_priv->ring.map.handle == NULL) { i915_dma_cleanup(dev); -- Jesse Barnes, Intel Open Source Technology Center -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: make drm_wait_vblank return immediately for very old sequence values
On Friday, February 13, 2009 2:33 am Michel Dänzer wrote: > On Thu, 2009-02-12 at 09:15 -0800, Jesse Barnes wrote: > > It does, but take a look at that code again. If interrupts are disabled > > by the timer, we'll capture the frame count. If, sometime later, they're > > re-enabled, we'll end up in drm_update_vblank_count. And if, between > > those two points, we've had a DPMS event, the frame counter will have > > been reset and will now be a lower value, so our drm_update_vblank_count > > will detect a wraparound. I think any hardware that resets its frame > > count will be susceptible to this problem unless we make wraparounds > > harmless. > > The modeset ioctl makes wraparounds harmless if used correctly. I don't think it does. And you may not be seeing this problem on radeon because your max_vblank_count is only 0x1f, which wouldn't trigger this behavior. Recall our last discussion where I outlined the cases we'd have to deal with in the modeset ioctl if we didn't use get/put to just keep interrupts on around the calls: + /* +* Cases to handle here: +* 1) interrupts are off across both calls +*be sure to add any lost frames +* 2) interrupts are on across both calls +*nothing to do, count will be updated by interupt handlers +* 3) interrupts go from on->off between calls +*add the post mode set frame count +*note that if interrupts go off before the counter resets +*we'll lose some frames. +*and if interrupts go off after it resets we'll end up +*double counting some events since the interrupt handlers will +*have added some of the new frame count value too +* 4) interrupts going from off->on between calls +*this really shouldn't happen much since clients generally don't +*start while modesetting is happening +*we need to account for lost frames in this case too, since +*there may be quite a few of them +*/ I couldn't get all of them working right, but maybe you can. If so, I'd be happy to see that patch. > > Can you think of a case where those frames would matter? > > E.g. the GLX_OML_sync_control spec says 'The graphics Media Stream > Counter (or graphics MSC) is a counter that is unique to the graphics > subsystem and increments for each vertical retrace that occurs.' > Vertical retraces that occur while nobody's listening may not matter in > simple cases, but that doesn't mean they're never relevant. But that doesn't say anything about whether the frame count must stay accurate while apps aren't running (i.e. doing anything with video/GL). That's why I asked for an example. If we assume it's important in some cases, we can still remove the update_vblank_counter code and just provide a knob to prevent interrupts from ever being disabled. > In general, I'm a little worried about the trend to just remove stuff > with problems (software engineering is hard, let's go shopping?), in > particular in a case like this where it's working fine for other > drivers. The other side of this coin is over-engineering things for theoretical problems that don't actually occur in real life. We want to avoid that too. -- Jesse Barnes, Intel Open Source Technology Center -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] [drm] Release user fbs in drm_release
On Thu, 2009-02-12 at 14:37 -0500, Kristian Høgsberg wrote: > Avoids leaking fbs and associated buffers on release. > > Signed-off-by: Kristian Høgsberg Tested-by: Chris Wilson -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [PATCH] drm: make drm_wait_vblank return immediately for very old sequence values
On Thu, 2009-02-12 at 09:15 -0800, Jesse Barnes wrote: > On Thursday, February 12, 2009 2:11 am Michel Dänzer wrote: > > On Wed, 2009-02-11 at 14:29 -0800, Jesse Barnes wrote: > > > > The current wait_vblank condition won't return if the wait sequence is > > > more than 8M behind the current counter. This causes problems in the > > > wraparound case, which can happen if vblank interrupts have been > > > disabled by the timer (after 5s of no waiters), but then re-enabled > > > after a DPMS event. > > > > Sounds like maybe the display driver isn't calling the modeset ioctl > > properly around DPMS events? > > It does, but take a look at that code again. If interrupts are disabled by > the timer, we'll capture the frame count. If, sometime later, they're > re-enabled, we'll end up in drm_update_vblank_count. And if, between those > two points, we've had a DPMS event, the frame counter will have been reset > and will now be a lower value, so our drm_update_vblank_count will detect a > wraparound. I think any hardware that resets its frame count will be > susceptible to this problem unless we make wraparounds harmless. The modeset ioctl makes wraparounds harmless if used correctly. > > > (An alternative fix might be to not account for lost frames between > > > off->on periods at all, removing the drm_update_vblank_count code > > > entirely.) > > > > You can do with your driver whatever you please, but please leave the > > drm_update_vblank_count code for drivers that don't have these problems. > > Note I'm not suggesting dropping frames due to mode set activity, just frames > that happen while no client is listening (subject to the 5s timeout window of > course). I understand that. > Can you think of a case where those frames would matter? E.g. the GLX_OML_sync_control spec says 'The graphics Media Stream Counter (or graphics MSC) is a counter that is unique to the graphics subsystem and increments for each vertical retrace that occurs.' Vertical retraces that occur while nobody's listening may not matter in simple cases, but that doesn't mean they're never relevant. In general, I'm a little worried about the trend to just remove stuff with problems (software engineering is hard, let's go shopping?), in particular in a case like this where it's working fine for other drivers. -- Earthling Michel Dänzer |http://www.vmware.com Libre software enthusiast | Debian, X and DRI developer -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 12385] [845] i915 work with beryl only at 16bpp
http://bugs.freedesktop.org/show_bug.cgi?id=12385 --- Comment #8 from Michel Dänzer 2009-02-13 00:28:38 PST --- FWIW, current xserver has a patch which allows the driver to return ~0ULL to indicate that the pixmap can't be used directly. -- Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 17723] Radeon XPRESS 200M (RC410) - xorg crash when enabling compiz with 32 Mb VRAM
http://bugs.freedesktop.org/show_bug.cgi?id=17723 --- Comment #52 from Michel Dänzer 2009-02-13 00:26:18 PST --- Finally got around to pushing the xf86-video-ati patch to Git, so it'll be in the 6.11 release. Can this report be closed? -- Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug. -- Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise -Strategies to boost innovation and cut costs with open source participation -Receive a $600 discount off the registration fee with the source code: SFAD http://p.sf.net/sfu/XcvMzF8H -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel