Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-13 Thread Dave Airlie
On Sat, Feb 14, 2009 at 4:09 PM, David Miller  wrote:
> From: Benjamin Herrenschmidt 
> Date: Thu, 12 Feb 2009 21:35:59 +1100
>
>> Oh BTW something else to be careful with, though I suppose it's working
>> some what by accident right now... when the GART is in the frame buffer
>> it gets applied the current fb swapper setting... ouch !
>>
>> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
>> afaik are readl/writel (ie, swapping) to make sure we at least
>> temporarily disable that swapper while whacking the GART.
>
> So I've narrowed down the final problems I'm having.  It's the surface
> swapping settings indeed.
>
> Running Xorg at depth 8, the CP can execute indirect buffers just
> fine, no code changes necessary.
>
> However, the CP hangs on indirect execution for depth 16 and 24.  But
> these depths work if I hard disable the surface byte swapping settings
> in the radeon Xorg driver.
>
> I did some research, and it does appear that the GART does read the
> PTEs from the VRAM using the Host Data Path.  This means the surface
> control byte swapping settings are applied.
>
> So for depths of 16 and 24, the GART is reading garbage PTEs.  And
> that's why the CP hangs.

Wow that is ugly, truly ugly. I'm going to say I've little idea how to
fix this outside KMS,

The only think I can think to do is use a surface instead of the
aperture swappers
and make the surface cover all the VRAM except the GART table which is
at the end.

> I have no idea how to handle this properly.  Not only does the Xorg
> ATI driver set the swapping settings at an arbitray point, it also
> mucks with them dynamically f.e. in the render helper functions.
>
> The only way this can work properly is to choose one surface swapping
> setting, set the VRAM GART table so that the GART sees the PTEs in the
> correct format, and retain those settings throughout
>
> It seems like, in at least R5xx, they tried to add a control bit in
> the PCI-E GART registers to handle this.  Bit 6 in PCI-E indirect
> register 0x10 is named 'GART_RDREQPATH_SEL' and has description:
>
>Read request path
> 0=HDP
> 1=Direct Rd Req to MC
>
> This seems to be intended to be a way to have the GART PTE reads
> bypass the full Host Data Path (and thus potential surface byte
> swaps), by setting this bit to 1.
>
> I tried doing that, but it doesn't help my RV370 so perhaps older
> chips don't support this bit.  It's hard to tell as this is the area
> where all of AMD's published GPU documents are severely lacking.

You don't get this bit until rv515 and above so no pony for you.

>
> I tested whether reading back the PCIE_TX_GART_CNTL register shows
> bit 6 after I try to write it, and it always reads back zero.
>
> There are even more complications, the VT enter/exit code in the Xorg
> ATI driver tries to save and restore the VRAM GART table for PCI-E
> cards.  But this:
>
> 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
>   of the constant 4096 to determine how many GART entries there
>   are.  The PCI GART entries map 4096 bytes, always.  So using
>   getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)
>
>   I have this fixed in my local tree.

Oops.

>
> 2) It doesn't check the surface byte swapping settings, so it
>   could be saving in one byte order and restoing in another.
>
>   I guess we could force RADEON_SURFACE_CNTL to zero around
>   the two memcpy()'s done in radeon_driver.c

Might be a good plan.

>
> But really this whole area is a mess, and I know KMS is coming to fix
> this, but even in the traditional XORG/DRM layout XORG has no business
> keeping the GART table uptodate across VT switches.  It should be
> calling into the DRM with an ioctl to write the GART table out to VRAM
> again.

You can draw an arbitrary line anywhere you want really, its all an unholy mess,
ever since people decided userspace drivers were a good idea.

>
> Finally there is a huge issue with how the Xorg ATI driver resets the
> chip using the RBBM.  It soft resets the CP, but this resets the ring
> read pointer.  However, nothing is done to make sure the DRM resync's
> the ring write pointer (which remains unchanged by a soft CP reset) as
> well as it's internal software ring state.
>
> The result is that on the very next CP command submission, the CP
> re-executes everything from ring entry zero until wherever the DRM
> things the write pointer was at the time of the CP soft reset.
>
> Any time the Xorg ATI driver does a CP soft reset, it should do
> CP stop and resume calls to the DRM to resync the ring pointers.
> And this does not appear to be happening where it needs to be
> happening.

That's interesting, it could explain why things never work again after a reset
or at least proceed to hang straight away.

Dave.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest

[PATCH]: drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().

2009-02-13 Thread David Miller

drm: ati_pcigart: Fix limit check in drm_ati_pcigart_init().

The variable 'max_pages' is ambiguous.  There are two concepts
of "pages" being used in this function.

First, we have ATI GART pages which are always 4096 bytes.
Then, we have system pages which are of size PAGE_SIZE.

Eliminate the confusion by creating max_ati_pages and
max_real_pages.  Calculate and use them as appropriate.

Signed-off-by: David S. Miller 
---
 drivers/gpu/drm/ati_pcigart.c |   13 +++--
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index 7972ec8..4d86a62 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -102,7 +102,7 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
u32 *pci_gart, page_base, gart_idx;
dma_addr_t bus_address = 0;
int i, j, ret = 0;
-   int max_pages;
+   int max_ati_pages, max_real_pages;
 
if (!entry) {
DRM_ERROR("no scatter/gather memory!\n");
@@ -130,14 +130,15 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct 
drm_ati_pcigart_info *ga
 
pci_gart = (u32 *) address;
 
-   max_pages = (gart_info->table_size / sizeof(u32));
-   pages = (entry->pages <= max_pages)
-   ? entry->pages : max_pages;
+   max_ati_pages = (gart_info->table_size / sizeof(u32));
+   max_real_pages = max_ati_pages / (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE);
+   pages = (entry->pages <= max_real_pages)
+   ? entry->pages : max_real_pages;
 
if (gart_info->gart_table_location == DRM_ATI_GART_MAIN) {
-   memset(pci_gart, 0, max_pages * sizeof(u32));
+   memset(pci_gart, 0, max_ati_pages * sizeof(u32));
} else {
-   for (gart_idx = 0; gart_idx < max_pages; gart_idx++)
+   for (gart_idx = 0; gart_idx < max_ati_pages; gart_idx++)
DRM_WRITE32(map, gart_idx * sizeof(u32), 0);
}
 
-- 
1.6.1.2.350.g88cc


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.

2009-02-13 Thread David Miller
From: Benjamin Herrenschmidt 
Date: Thu, 12 Feb 2009 21:35:59 +1100

> Oh BTW something else to be careful with, though I suppose it's working
> some what by accident right now... when the GART is in the frame buffer
> it gets applied the current fb swapper setting... ouch !
> 
> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
> afaik are readl/writel (ie, swapping) to make sure we at least
> temporarily disable that swapper while whacking the GART.

So I've narrowed down the final problems I'm having.  It's the surface
swapping settings indeed.

Running Xorg at depth 8, the CP can execute indirect buffers just
fine, no code changes necessary.

However, the CP hangs on indirect execution for depth 16 and 24.  But
these depths work if I hard disable the surface byte swapping settings
in the radeon Xorg driver.

I did some research, and it does appear that the GART does read the
PTEs from the VRAM using the Host Data Path.  This means the surface
control byte swapping settings are applied.

So for depths of 16 and 24, the GART is reading garbage PTEs.  And
that's why the CP hangs.

I have no idea how to handle this properly.  Not only does the Xorg
ATI driver set the swapping settings at an arbitray point, it also
mucks with them dynamically f.e. in the render helper functions.

The only way this can work properly is to choose one surface swapping
setting, set the VRAM GART table so that the GART sees the PTEs in the
correct format, and retain those settings throughout.

It seems like, in at least R5xx, they tried to add a control bit in
the PCI-E GART registers to handle this.  Bit 6 in PCI-E indirect
register 0x10 is named 'GART_RDREQPATH_SEL' and has description:

Read request path
 0=HDP
 1=Direct Rd Req to MC

This seems to be intended to be a way to have the GART PTE reads
bypass the full Host Data Path (and thus potential surface byte
swaps), by setting this bit to 1.

I tried doing that, but it doesn't help my RV370 so perhaps older
chips don't support this bit.  It's hard to tell as this is the area
where all of AMD's published GPU documents are severely lacking.

I tested whether reading back the PCIE_TX_GART_CNTL register shows
bit 6 after I try to write it, and it always reads back zero.

There are even more complications, the VT enter/exit code in the Xorg
ATI driver tries to save and restore the VRAM GART table for PCI-E
cards.  But this:

1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
   of the constant 4096 to determine how many GART entries there
   are.  The PCI GART entries map 4096 bytes, always.  So using
   getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)

   I have this fixed in my local tree.

2) It doesn't check the surface byte swapping settings, so it
   could be saving in one byte order and restoing in another.

   I guess we could force RADEON_SURFACE_CNTL to zero around
   the two memcpy()'s done in radeon_driver.c

But really this whole area is a mess, and I know KMS is coming to fix
this, but even in the traditional XORG/DRM layout XORG has no business
keeping the GART table uptodate across VT switches.  It should be
calling into the DRM with an ioctl to write the GART table out to VRAM
again.

Finally there is a huge issue with how the Xorg ATI driver resets the
chip using the RBBM.  It soft resets the CP, but this resets the ring
read pointer.  However, nothing is done to make sure the DRM resync's
the ring write pointer (which remains unchanged by a soft CP reset) as
well as it's internal software ring state.

The result is that on the very next CP command submission, the CP
re-executes everything from ring entry zero until wherever the DRM
things the write pointer was at the time of the CP soft reset.

Any time the Xorg ATI driver does a CP soft reset, it should do
CP stop and resume calls to the DRM to resync the ring pointers.
And this does not appear to be happening where it needs to be
happening.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 4/4] drm: Use spread spectrum when the bios tells us it's ok.

2009-02-13 Thread Kristian Høgsberg
Lifted from the DDX modesetting.

Signed-off-by: Kristian Høgsberg 
---
 drivers/gpu/drm/i915/i915_drv.h  |2 ++
 drivers/gpu/drm/i915/intel_bios.c|8 
 drivers/gpu/drm/i915/intel_display.c |   20 ++--
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7325363..135a08f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -184,6 +184,8 @@ typedef struct drm_i915_private {
unsigned int lvds_dither:1;
unsigned int lvds_vbt:1;
unsigned int int_crt_support:1;
+   unsigned int lvds_use_ssc:1;
+   int lvds_ssc_freq;
 
struct drm_i915_fence_reg fence_regs[16]; /* assume 965 */
int fence_reg_start; /* 4 if userland hasn't ioctl'd us yet */
diff --git a/drivers/gpu/drm/i915/intel_bios.c 
b/drivers/gpu/drm/i915/intel_bios.c
index 4ca82a0..65be30d 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -135,6 +135,14 @@ parse_general_features(struct drm_i915_private *dev_priv,
if (general) {
dev_priv->int_tv_support = general->int_tv_support;
dev_priv->int_crt_support = general->int_crt_support;
+   dev_priv->lvds_use_ssc = general->enable_ssc;
+
+   if (dev_priv->lvds_use_ssc) {
+ if (IS_I855(dev_priv->dev))
+   dev_priv->lvds_ssc_freq = general->ssc_freq ? 66 : 48;
+ else
+   dev_priv->lvds_ssc_freq = general->ssc_freq ? 100 : 96;
+   }
}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 8b20387..13f9b66 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -217,7 +217,7 @@ bool intel_pipe_has_type (struct drm_crtc *crtc, int type)
 return false;
 }
 
-#define INTELPllInvalid(s)   { /* ErrorF (s) */; return false; }
+#define INTELPllInvalid(s)   do { DRM_DEBUG(s); return false; } while (0)
 /**
  * Returns whether the given set of divisors are valid for a given refclk with
  * the given connectors.
@@ -726,7 +726,7 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc,
int dspsize_reg = (pipe == 0) ? DSPASIZE : DSPBSIZE;
int dsppos_reg = (pipe == 0) ? DSPAPOS : DSPBPOS;
int pipesrc_reg = (pipe == 0) ? PIPEASRC : PIPEBSRC;
-   int refclk;
+   int refclk, num_outputs = 0;
intel_clock_t clock;
u32 dpll = 0, fp = 0, dspcntr, pipeconf;
bool ok, is_sdvo = false, is_dvo = false;
@@ -763,9 +763,14 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc,
is_crt = true;
break;
}
+
+   num_outputs++;
}
 
-   if (IS_I9XX(dev)) {
+   if (is_lvds && dev_priv->lvds_use_ssc && num_outputs < 2) {
+   refclk = dev_priv->lvds_ssc_freq * 1000;
+   DRM_DEBUG("using SSC reference clock of %d MHz\n", refclk / 
1000);
+   } else if (IS_I9XX(dev)) {
refclk = 96000;
} else {
refclk = 48000;
@@ -824,11 +829,14 @@ static int intel_crtc_mode_set(struct drm_crtc *crtc,
}
}
 
-   if (is_tv) {
+   if (is_sdvo && is_tv)
+   dpll |= PLL_REF_INPUT_TVCLKINBC;
+   else if (is_tv)
/* XXX: just matching BIOS for now */
-/* dpll |= PLL_REF_INPUT_TVCLKINBC; */
+   /*  dpll |= PLL_REF_INPUT_TVCLKINBC; */
dpll |= 3;
-   }
+   else if (is_lvds && dev_priv->lvds_use_ssc && num_outputs < 2)
+   dpll |= PLLB_REF_INPUT_SPREADSPECTRUMIN;
else
dpll |= PLL_REF_INPUT_DREFCLK;
 
-- 
1.6.0.rc0.42.g186458


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 3/4] drm: Collapse identical i8xx_clock() and i9xx_clock().

2009-02-13 Thread Kristian Høgsberg
They used to be different.  Now they're identical.

Signed-off-by: Kristian Høgsberg 
---
 drivers/gpu/drm/i915/intel_display.c |   33 ++---
 1 files changed, 6 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 4f54ac5..8b20387 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -189,19 +189,7 @@ static const intel_limit_t *intel_limit(struct drm_crtc 
*crtc)
return limit;
 }
 
-/** Derive the pixel clock for the given refclk and divisors for 8xx chips. */
-
-static void i8xx_clock(int refclk, intel_clock_t *clock)
-{
-   clock->m = 5 * (clock->m1 + 2) + (clock->m2 + 2);
-   clock->p = clock->p1 * clock->p2;
-   clock->vco = refclk * clock->m / (clock->n + 2);
-   clock->dot = clock->vco / clock->p;
-}
-
-/** Derive the pixel clock for the given refclk and divisors for 9xx chips. */
-
-static void i9xx_clock(int refclk, intel_clock_t *clock)
+static void intel_clock(int refclk, intel_clock_t *clock)
 {
clock->m = 5 * (clock->m1 + 2) + (clock->m2 + 2);
clock->p = clock->p1 * clock->p2;
@@ -209,15 +197,6 @@ static void i9xx_clock(int refclk, intel_clock_t *clock)
clock->dot = clock->vco / clock->p;
 }
 
-static void intel_clock(struct drm_device *dev, int refclk,
-   intel_clock_t *clock)
-{
-   if (IS_I9XX(dev))
-   i9xx_clock (refclk, clock);
-   else
-   i8xx_clock (refclk, clock);
-}
-
 /**
  * Returns whether any output on the specified pipe is of the specified type
  */
@@ -318,7 +297,7 @@ static bool intel_find_best_PLL(struct drm_crtc *crtc, int 
target,
 clock.p1 <= limit->p1.max; clock.p1++) {
int this_err;
 
-   intel_clock(dev, refclk, &clock);
+   intel_clock(refclk, &clock);
 
if (!intel_PLL_is_valid(crtc, &clock))
continue;
@@ -1313,7 +1292,7 @@ static int intel_crtc_clock_get(struct drm_device *dev, 
struct drm_crtc *crtc)
}
 
/* XXX: Handle the 100Mhz refclk */
-   i9xx_clock(96000, &clock);
+   intel_clock(96000, &clock);
} else {
bool is_lvds = (pipe == 1) && (I915_READ(LVDS) & LVDS_PORT_EN);
 
@@ -1325,9 +1304,9 @@ static int intel_crtc_clock_get(struct drm_device *dev, 
struct drm_crtc *crtc)
if ((dpll & PLL_REF_INPUT_MASK) ==
PLLB_REF_INPUT_SPREADSPECTRUMIN) {
/* XXX: might not be 66MHz */
-   i8xx_clock(66000, &clock);
+   intel_clock(66000, &clock);
} else
-   i8xx_clock(48000, &clock);
+   intel_clock(48000, &clock);
} else {
if (dpll & PLL_P1_DIVIDE_BY_TWO)
clock.p1 = 2;
@@ -1340,7 +1319,7 @@ static int intel_crtc_clock_get(struct drm_device *dev, 
struct drm_crtc *crtc)
else
clock.p2 = 2;
 
-   i8xx_clock(48000, &clock);
+   intel_clock(48000, &clock);
}
}
 
-- 
1.6.0.rc0.42.g186458


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 2/4] drm: Bring PLL limits in sync with DDX values.

2009-02-13 Thread Kristian Høgsberg
Signed-off-by: Kristian Høgsberg 
---
 drivers/gpu/drm/i915/intel_display.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 94c7c09..4f54ac5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -90,12 +90,12 @@ typedef struct {
 #define I9XX_DOT_MAX40
 #define I9XX_VCO_MIN   140
 #define I9XX_VCO_MAX   280
-#define I9XX_N_MIN   3
-#define I9XX_N_MAX   8
+#define I9XX_N_MIN   1
+#define I9XX_N_MAX   6
 #define I9XX_M_MIN  70
 #define I9XX_M_MAX 120
 #define I9XX_M1_MIN 10
-#define I9XX_M1_MAX 20
+#define I9XX_M1_MAX 22
 #define I9XX_M2_MIN  5
 #define I9XX_M2_MAX  9
 #define I9XX_P_SDVO_DAC_MIN  5
-- 
1.6.0.rc0.42.g186458


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[PATCH 1/4] drm: Add locking around cursor gem operations.

2009-02-13 Thread Kristian Høgsberg
We need to hold the struct_mutex around pinning and the phys object
operations.

Signed-off-by: Kristian Høgsberg 
---
 drivers/gpu/drm/i915/intel_display.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index ac92799..94c7c09 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1043,18 +1043,19 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
}
 
/* we only need to pin inside GTT if cursor is non-phy */
+   mutex_lock(&dev->struct_mutex);
if (!dev_priv->cursor_needs_physical) {
ret = i915_gem_object_pin(bo, PAGE_SIZE);
if (ret) {
DRM_ERROR("failed to pin cursor bo\n");
-   goto fail;
+   goto fail_locked;
}
addr = obj_priv->gtt_offset;
} else {
ret = i915_gem_attach_phys_object(dev, bo, (pipe == 0) ? 
I915_GEM_PHYS_CURSOR_0 : I915_GEM_PHYS_CURSOR_1);
if (ret) {
DRM_ERROR("failed to attach phys object\n");
-   goto fail;
+   goto fail_locked;
}
addr = obj_priv->phys_obj->handle->busaddr;
}
@@ -1074,10 +1075,9 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
i915_gem_detach_phys_object(dev, 
intel_crtc->cursor_bo);
} else
i915_gem_object_unpin(intel_crtc->cursor_bo);
-   mutex_lock(&dev->struct_mutex);
drm_gem_object_unreference(intel_crtc->cursor_bo);
-   mutex_unlock(&dev->struct_mutex);
}
+   mutex_unlock(&dev->struct_mutex);
 
intel_crtc->cursor_addr = addr;
intel_crtc->cursor_bo = bo;
@@ -1085,6 +1085,7 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
return 0;
 fail:
mutex_lock(&dev->struct_mutex);
+fail_locked:
drm_gem_object_unreference(bo);
mutex_unlock(&dev->struct_mutex);
return ret;
-- 
1.6.0.rc0.42.g186458


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [RFC] page flipping/buffer swap for DRI2

2009-02-13 Thread Jesse Barnes
On Friday, February 13, 2009 4:18 pm Peter John Garrone wrote:
> In reply to Jesse Barnes post.
> I'm not on top of the finer details, being a consumer rather than a
> developer. I have the following queries:
>
> 1) Is the buffer flip to be synchronous in the hardware, or to be
> implemented as a software interrupt?

I'm not sure what you mean... The flip is queued in the hardware ring, and 
delayed by hardware until the vertical blank interval starts.  So in that 
sense the hardware takes care of it for us (we don't rely on an interrupt to 
get the bits on the screen).  We do require an interrupt, however, to know 
when the flip has occurred.  This lets us know that it's ok to start drawing 
on the new back buffer, since it's no longer being displayed to the user.

> 2) By fullscreen, do you mean covering one or all screens in a multi-headed
> display?

It's hard coded to deal with one screen in this patchset, but potentially 
needs to flip both heads in a shared (cloned or extended desktop) 
configuration.

> 3) If fullscreen means covering an individual screen in a multi-headed
> display, will it be possible to have multiple simultaneous fullscreen
> tear-free applications?

Yes, it should be possible.

> 4) Will non-fullscreen applications be tear-free?

Maybe.  If you're running a compositing manager that always swaps the whole 
screen (like the hacked version of compiz I've been using for testing does), 
then any non-DRI apps running under it will be tear-free.  DRI apps will need 
special treatment, which I haven't implemented yet.

> To be honest, the speed of the window manager is not a big issue for me.
> Tearing is. In production mode, efficient fullscreen output mode is
> desirable. In development mode, having the output into manageable
> re-sizeable windows is nicer.
>
> The only way I can see non-tearing non-fullscreen
> applications working is with triple-buffer. i.e. a kernel screen flyback
> interrupt copies a screen-full from the multi-headed working buffer to a
> plane buffer, schedules that plane buffer to be swapped upon the next
> flyback, and triggers approptiate returns from the application swap-buffer
> function calls. I assume this is triple-buffer because there is the large
> possibly virtual working buffer and also two plane buffers for each screen.

That would be one way of handling non-fullscreen apps.  Another way would be 
to queue front buffer blits from such apps to occur during the vertical blank 
period.  That's not as reliable (nor easy to implement) as full page flipping 
though, but has the advantage of just blitting over a small area of the 
screen, which will preserve most of the compressed framebuffer (if present), 
saving power.

> However a fullscreen application should be able to override this on an
> individual screen basis, rendering directly into the non-output display
> plane, and scheduling it to be swapped upon the N'th flyback.

Right, that's basically what this patchset enables.  When a buffer swap comes 
in for a full screen back/front combo, a flip is queued rather than a blit 
executed.

> Apologies for sounding preachy, I simply outline my own analysis and am
> wondering how it is possible for a window manager, which must output to
> a larger virtual space, could be regarded as a good candidate for
> double-buffering as you suggest.

No that's fine; I'd like more eyes and comments on this patchset.  A 
compositing window manager is a special case, since it may span multiple 
screens like you say, but issue a single glXSwapBuffers call for its whole 
rendering area.  In that case we can queue flips on each head to update the 
base pointers, and only allow drawing to the new backbuffer when both flips 
have completed.  That should avoid tearing on either head, at the cost of a 
little more complexity in the swapbuffers implementation.

-- 
Jesse Barnes, Intel Open Source Technology Center

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [RFC] page flipping/buffer swap for DRI2

2009-02-13 Thread Peter John Garrone
In reply to Jesse Barnes post.
I'm not on top of the finer details, being a consumer rather than a
developer. I have the following queries:

1) Is the buffer flip to be synchronous in the hardware, or to be
implemented as a software interrupt?

2) By fullscreen, do you mean covering one or all screens in a multi-headed
display?

3) If fullscreen means covering an individual screen in a multi-headed
display, will it be possible to have multiple simultaneous fullscreen tear-free
applications?

4) Will non-fullscreen applications be tear-free?

To be honest, the speed of the window manager is not a big issue for me.
Tearing is. In production mode, efficient fullscreen output mode is
desirable. In development mode, having the output into manageable
re-sizeable windows is nicer. 

The only way I can see non-tearing non-fullscreen
applications working is with triple-buffer. i.e. a kernel screen flyback 
interrupt
copies a screen-full from the multi-headed working buffer to a plane buffer,
schedules that plane buffer to be swapped upon the next flyback, and
triggers approptiate returns from the application swap-buffer function calls.
I assume this is triple-buffer because there is the large possibly
virtual working buffer and also two plane buffers for each screen.

However a fullscreen application should be able to override this on an
individual screen basis, rendering directly into the non-output display plane,
and scheduling it to be swapped upon the N'th flyback.

Apologies for sounding preachy, I simply outline my own analysis and am
wondering how it is possible for a window manager, which must output to
a larger virtual space, could be regarded as a good candidate for
double-buffering as you suggest.

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: "Can not ioremap virtual address" (was Re: [git pull] drm)

2009-02-13 Thread Jesse Barnes
On Friday, February 13, 2009 9:28 am Diego Calleja wrote:
> On Lunes 09 Febrero 2009 09:30:57 Dave Airlie escribió:
> > Hi Linus,
> >
> > Please pull the 'drm-fixes' branch from
> > ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git
> > drm-fixes
>
> A commit from this merge is causing an error in my machine:
>
> commit ab657db12d7020629f26f30d287558a8d0e32b41
> Author: Eric Anholt 
> Date:   Fri Jan 23 12:57:47 2009 -0800
>
> drm/i915: Set up an MTRR covering the GTT at driver load.
>
>
> (found using bisection)
>
> The error:
> [   20.270956] [drm] Initialized i915 1.6.0 20080730 on minor 0
> [   20.289492] [drm:i915_initialize] *ERROR* can not ioremap virtual
> address for ring buffer
>
> And in Xorg.0.log: (II) intel(0): direct rendering: Failed
>
>
> But...excluding that printk, everything works OK, including "direct
> rendering: Yes"...
>
> There's a related thread here: http://lkml.org/lkml/2009/2/13/139

Does this prevent the message?  We map it WC in GEM but not in 
i915_initialize...

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 81f1cff..2d797ff 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -202,7 +202,7 @@ static int i915_initialize(struct drm_device * dev, 
drm_i915
dev_priv->ring.map.flags = 0;
dev_priv->ring.map.mtrr = 0;

-   drm_core_ioremap(&dev_priv->ring.map, dev);
+   drm_core_ioremap_wc(&dev_priv->ring.map, dev);

if (dev_priv->ring.map.handle == NULL) {
i915_dma_cleanup(dev);


-- 
Jesse Barnes, Intel Open Source Technology Center

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] drm: make drm_wait_vblank return immediately for very old sequence values

2009-02-13 Thread Jesse Barnes
On Friday, February 13, 2009 2:33 am Michel Dänzer wrote:
> On Thu, 2009-02-12 at 09:15 -0800, Jesse Barnes wrote:
> > It does, but take a look at that code again.  If interrupts are disabled
> > by the timer, we'll capture the frame count.  If, sometime later, they're
> > re-enabled, we'll end up in drm_update_vblank_count.  And if, between
> > those two points, we've had a DPMS event, the frame counter will have
> > been reset and will now be a lower value, so our drm_update_vblank_count
> > will detect a wraparound.  I think any hardware that resets its frame
> > count will be susceptible to this problem unless we make wraparounds
> > harmless.
>
> The modeset ioctl makes wraparounds harmless if used correctly.

I don't think it does.  And you may not be seeing this problem on radeon 
because your max_vblank_count is only 0x1f, which wouldn't trigger this 
behavior.

Recall our last discussion where I outlined the cases we'd have to deal with 
in the modeset ioctl if we didn't use get/put to just keep interrupts on 
around the calls:

+   /*
+* Cases to handle here:
+* 1) interrupts are off across both calls
+*be sure to add any lost frames
+* 2) interrupts are on across both calls
+*nothing to do, count will be updated by interupt handlers
+* 3) interrupts go from on->off between calls
+*add the post mode set frame count
+*note that if interrupts go off before the counter resets
+*we'll lose some frames.
+*and if interrupts go off after it resets we'll end up
+*double counting some events since the interrupt handlers will
+*have added some of the new frame count value too
+* 4) interrupts going from off->on between calls
+*this really shouldn't happen much since clients generally don't
+*start while modesetting is happening
+*we need to account for lost frames in this case too, since
+*there may be quite a few of them
+*/

I couldn't get all of them working right, but maybe you can.  If so, I'd be 
happy to see that patch.

> > Can you think of a case where those frames would matter?
>
> E.g. the GLX_OML_sync_control spec says 'The graphics Media Stream
> Counter (or graphics MSC) is a counter that is unique to the graphics
> subsystem and increments for each vertical retrace that occurs.'
> Vertical retraces that occur while nobody's listening may not matter in
> simple cases, but that doesn't mean they're never relevant.

But that doesn't say anything about whether the frame count must stay accurate 
while apps aren't running (i.e. doing anything with video/GL).  That's why I 
asked for an example.  If we assume it's important in some cases, we can 
still remove the update_vblank_counter code and just provide a knob to 
prevent interrupts from ever being disabled.

> In general, I'm a little worried about the trend to just remove stuff
> with problems (software engineering is hard, let's go shopping?), in
> particular in a case like this where it's working fine for other
> drivers.

The other side of this coin is over-engineering things for theoretical 
problems that don't actually occur in real life.  We want to avoid that too.

-- 
Jesse Barnes, Intel Open Source Technology Center

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] [drm] Release user fbs in drm_release

2009-02-13 Thread Chris Wilson
On Thu, 2009-02-12 at 14:37 -0500, Kristian Høgsberg wrote:
> Avoids leaking fbs and associated buffers on release.
> 
> Signed-off-by: Kristian Høgsberg 
Tested-by: Chris Wilson 


--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


Re: [PATCH] drm: make drm_wait_vblank return immediately for very old sequence values

2009-02-13 Thread Michel Dänzer
On Thu, 2009-02-12 at 09:15 -0800, Jesse Barnes wrote:
> On Thursday, February 12, 2009 2:11 am Michel Dänzer wrote:
> > On Wed, 2009-02-11 at 14:29 -0800, Jesse Barnes wrote:
> 
> > > The current wait_vblank condition won't return if the wait sequence is
> > > more than 8M behind the current counter.  This causes problems in the
> > > wraparound case, which can happen if vblank interrupts have been
> > > disabled by the timer (after 5s of no waiters), but then re-enabled
> > > after a DPMS event.
> >
> > Sounds like maybe the display driver isn't calling the modeset ioctl
> > properly around DPMS events?
> 
> It does, but take a look at that code again.  If interrupts are disabled by 
> the timer, we'll capture the frame count.  If, sometime later, they're 
> re-enabled, we'll end up in drm_update_vblank_count.  And if, between those 
> two points, we've had a DPMS event, the frame counter will have been reset 
> and will now be a lower value, so our drm_update_vblank_count will detect a 
> wraparound.  I think any hardware that resets its frame count will be 
> susceptible to this problem unless we make wraparounds harmless.

The modeset ioctl makes wraparounds harmless if used correctly.

> > > (An alternative fix might be to not account for lost frames between
> > > off->on periods at all, removing the drm_update_vblank_count code
> > > entirely.)
> >
> > You can do with your driver whatever you please, but please leave the
> > drm_update_vblank_count code for drivers that don't have these problems.
> 
> Note I'm not suggesting dropping frames due to mode set activity, just frames 
> that happen while no client is listening (subject to the 5s timeout window of 
> course).

I understand that.

> Can you think of a case where those frames would matter?

E.g. the GLX_OML_sync_control spec says 'The graphics Media Stream
Counter (or graphics MSC) is a counter that is unique to the graphics
subsystem and increments for each vertical retrace that occurs.'
Vertical retraces that occur while nobody's listening may not matter in
simple cases, but that doesn't mean they're never relevant.

In general, I'm a little worried about the trend to just remove stuff
with problems (software engineering is hard, let's go shopping?), in
particular in a case like this where it's working fine for other
drivers.


-- 
Earthling Michel Dänzer   |http://www.vmware.com
Libre software enthusiast |  Debian, X and DRI developer

--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[Bug 12385] [845] i915 work with beryl only at 16bpp

2009-02-13 Thread bugzilla-daemon
http://bugs.freedesktop.org/show_bug.cgi?id=12385





--- Comment #8 from Michel Dänzer   2009-02-13 00:28:38 PST 
---
FWIW, current xserver has a patch which allows the driver to return ~0ULL to
indicate that the pixmap can't be used directly.


-- 
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel


[Bug 17723] Radeon XPRESS 200M (RC410) - xorg crash when enabling compiz with 32 Mb VRAM

2009-02-13 Thread bugzilla-daemon
http://bugs.freedesktop.org/show_bug.cgi?id=17723





--- Comment #52 from Michel Dänzer   2009-02-13 00:26:18 
PST ---
Finally got around to pushing the xf86-video-ati patch to Git, so it'll be in
the 6.11 release. Can this report be closed?


-- 
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug.
--
Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
-OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
-Strategies to boost innovation and cut costs with open source participation
-Receive a $600 discount off the registration fee with the source code: SFAD
http://p.sf.net/sfu/XcvMzF8H
--
___
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel