Re: Why was old TTM removed from drm.git?
On Wed, 2009-06-24 at 19:26 +0200, Thomas Hellström wrote: Just prior to the commit I sent out a message explaining what I was going to do and why, but apparently it didn't make it to the list (which seems to be the case of quite a few mails these days). What was the From: address and subject of that mail (or any others that were apparently lost)? I can't seem to find anything in the dri-devel moderation queue mails around the weekend, so apparently it was dropped before it reached mailman. Maybe some sf.net spam filter or something. -- Earthling Michel Dänzer |http://www.vmware.com Libre software enthusiast | Debian, X and DRI developer -- -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[PATCH 2/2] drm/i915: Adjust DisplayPort clocks to use 96MHz reference
For some reason, the DP clocks were based off a 100MHz reference instead
of the standard 96MHz reference. This caused some DP monitors to fail to
lock to the signal.

Signed-off-by: Keith Packard kei...@keithp.com
---
 drivers/gpu/drm/i915/intel_display.c |   21 +++++++++------------
 1 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 73e7b9c..a4a2a03 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -814,24 +814,21 @@ intel_find_pll_g4x_dp(const intel_limit_t *limit, struct drm_crtc *crtc,
 {
 	intel_clock_t clock;
 	if (target < 200000) {
-		clock.dot = 161670;
-		clock.p = 20;
 		clock.p1 = 2;
 		clock.p2 = 10;
-		clock.n = 0x01;
-		clock.m = 97;
-		clock.m1 = 0x10;
-		clock.m2 = 0x05;
+		clock.n = 2;
+		clock.m1 = 23;
+		clock.m2 = 8;
 	} else {
-		clock.dot = 270000;
-		clock.p = 10;
 		clock.p1 = 1;
 		clock.p2 = 10;
-		clock.n = 0x02;
-		clock.m = 108;
-		clock.m1 = 0x12;
-		clock.m2 = 0x06;
+		clock.n = 1;
+		clock.m1 = 14;
+		clock.m2 = 2;
 	}
+	clock.m = 5 * (clock.m1 + 2) + (clock.m2 + 2);
+	clock.p = (clock.p1 * clock.p2);
+	clock.dot = 96000 * clock.m / (clock.n + 2) / clock.p;
 	memcpy(best_clock, &clock, sizeof(intel_clock_t));
 	return true;
 }
-- 
1.6.3.1
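The arithmetic in the new code can be checked by hand. The following is a minimal sketch, not the driver code: the struct and function names (`dp_clock`, `dp_clock_compute`, `dp_low_rate_khz`, `dp_high_rate_khz`) are made up for illustration, and only the three formulas and the divider values are taken from the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's intel_clock_t; only the fields
 * touched by the patch are kept. */
struct dp_clock {
	int n, m1, m2, p1, p2;	/* divider settings from the patch */
	int m, p, dot;		/* derived values; dot is in kHz */
};

/* Derive m, p and the dot clock from a reference clock given in kHz,
 * using the same three formulas the patch adds. */
static void dp_clock_compute(struct dp_clock *c, int refclk_khz)
{
	assert(c->p1 > 0 && c->p2 > 0);	/* p is used as a divisor below */
	c->m = 5 * (c->m1 + 2) + (c->m2 + 2);
	c->p = c->p1 * c->p2;
	c->dot = refclk_khz * c->m / (c->n + 2) / c->p;
}

/* Dot clock for the first branch of the patch (n=2, m1=23, m2=8). */
static int dp_low_rate_khz(void)
{
	struct dp_clock c = { .n = 2, .m1 = 23, .m2 = 8, .p1 = 2, .p2 = 10 };

	dp_clock_compute(&c, 96000);
	return c.dot;
}

/* Dot clock for the second branch of the patch (n=1, m1=14, m2=2). */
static int dp_high_rate_khz(void)
{
	struct dp_clock c = { .n = 1, .m1 = 14, .m2 = 2, .p1 = 1, .p2 = 10 };

	dp_clock_compute(&c, 96000);
	return c.dot;
}
```

With the 96MHz reference these settings work out to 162000 kHz for the first branch (m = 135, p = 20) and 268800 kHz for the second (m = 84, p = 10).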
[PATCH 1/2] drm: Fix EDID detailed timing misc flags decoding
The change from bitfields to masks was done incorrectly for the misc
flags byte.

Signed-off-by: Keith Packard kei...@keithp.com
---
 drivers/gpu/drm/drm_edid.c |   19
 include/drm/drm_edid.h     |   66 ++--
 2 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 7d08352..c84b306 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -303,12 +303,16 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_device *dev,
 	if (hactive < 64 || vactive < 64)
 		return NULL;
 
-	if (pt->misc & DRM_EDID_PT_STEREO) {
+	if (DRM_EDID_DETAILED_MISC_HAS_STEREO(pt->misc)) {
 		printk(KERN_WARNING "stereo mode not supported\n");
 		return NULL;
 	}
-	if (!(pt->misc & DRM_EDID_PT_SEPARATE_SYNC)) {
-		printk(KERN_WARNING "integrated sync not supported\n");
+	if (!(pt->misc & DRM_EDID_DETAILED_MISC_DIGITAL_SYNC)) {
+		printk(KERN_WARNING "analog sync not supported\n");
+		return NULL;
+	}
+	if (!(pt->misc & DRM_EDID_DETAILED_MISC_DIGITAL_SYNC_SEPARATE)) {
+		printk(KERN_WARNING "digital composite sync not supported\n");
 		return NULL;
 	}
 
@@ -335,16 +339,17 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_device *dev,
 
 	drm_mode_set_name(mode);
 
-	if (pt->misc & DRM_EDID_PT_INTERLACED)
+	if (pt->misc & DRM_EDID_DETAILED_MISC_INTERLACED)
 		mode->flags |= DRM_MODE_FLAG_INTERLACE;
 
 	if (quirks & EDID_QUIRK_DETAILED_SYNC_PP) {
-		pt->misc |= DRM_EDID_PT_HSYNC_POSITIVE | DRM_EDID_PT_VSYNC_POSITIVE;
+		pt->misc |= (DRM_EDID_DETAILED_MISC_DIGITAL_HSYNC_POSITIVE |
+			     DRM_EDID_DETAILED_MISC_DIGITAL_VSYNC_POSITIVE);
 	}
 
-	mode->flags |= (pt->misc & DRM_EDID_PT_HSYNC_POSITIVE) ?
+	mode->flags |= (pt->misc & DRM_EDID_DETAILED_MISC_DIGITAL_HSYNC_POSITIVE) ?
 		DRM_MODE_FLAG_PHSYNC : DRM_MODE_FLAG_NHSYNC;
-	mode->flags |= (pt->misc & DRM_EDID_PT_VSYNC_POSITIVE) ?
+	mode->flags |= (pt->misc & DRM_EDID_DETAILED_MISC_DIGITAL_VSYNC_POSITIVE) ?
 		DRM_MODE_FLAG_PVSYNC : DRM_MODE_FLAG_NVSYNC;
 
 	mode->width_mm = pt->width_mm_lo | (pt->width_height_mm_hi & 0xf) << 8;
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index c263e4d..8611539 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -47,30 +47,54 @@ struct std_timing {
 	u8 vfreq_aspect;
 } __attribute__((packed));
 
-#define DRM_EDID_PT_HSYNC_POSITIVE (1 << 6)
-#define DRM_EDID_PT_VSYNC_POSITIVE (1 << 5)
-#define DRM_EDID_PT_SEPARATE_SYNC  (3 << 3)
-#define DRM_EDID_PT_STEREO         (1 << 2)
-#define DRM_EDID_PT_INTERLACED     (1 << 1)
+#define DRM_EDID_DETAILED_MISC_INTERLACED		(1 << 7)
+
+#define DRM_EDID_DETAILED_MISC_STEREO_MASK		((3 << 5) | (1 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_NONE_0		((0 << 5) | (0 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_NONE_1		((0 << 5) | (1 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_FIELD_RIGHT	((1 << 5) | (0 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_FIELD_LEFT	((2 << 5) | (0 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_2WAY_RIGHT	((1 << 5) | (1 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_2WAY_LEFT		((2 << 5) | (1 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_4WAY		((3 << 5) | (0 << 0))
+#define DRM_EDID_DETAILED_MISC_STEREO_SIDE_BY_SIDE	((3 << 5) | (1 << 0))
+#define DRM_EDID_DETAILED_MISC_HAS_STEREO(x)		(((x) & (3 << 5)) != 0)
+
+#define DRM_EDID_DETAILED_MISC_DIGITAL_SYNC		(1 << 4)
+/* Analog sync (embedded with signal) */
+#define DRM_EDID_DETAILED_MISC_ANALOG_SYNC_BIPOLAR	(1 << 3)
+#define DRM_EDID_DETAILED_MISC_ANALOG_SYNC_SERRATIONS	(1 << 2)
+#define DRM_EDID_DETAILED_MISC_ANALOG_SYNC_ALL_CHAN	(1 << 1)
+
+/* Digital sync (separate from signal) */
+#define DRM_EDID_DETAILED_MISC_DIGITAL_SYNC_SEPARATE	(1 << 3)
+
+/* Digital composite sync */
+#define DRM_EDID_DETAILED_MISC_DIGITAL_SYNC_SERRATIONS	(1 << 2)
+
+/* Digital separate sync */
+#define DRM_EDID_DETAILED_MISC_DIGITAL_HSYNC_POSITIVE	(1 << 2)
+#define DRM_EDID_DETAILED_MISC_DIGITAL_VSYNC_POSITIVE	(1 << 1)
 
 /* If detailed data is pixel timing */
 struct detailed_pixel_timing {
-	u8 hactive_lo;
-	u8 hblank_lo;
-	u8 hactive_hblank_hi;
-	u8 vactive_lo;
-	u8 vblank_lo;
-	u8 vactive_vblank_hi;
-	u8 hsync_offset_lo;
-	u8 hsync_pulse_width_lo;
-	u8 vsync_offset_pulse_width_lo;
-	u8 hsync_vsync_offset_pulse_width_hi;
-	u8 width_mm_lo;
-	u8 height_mm_lo;
-	u8 width_height_mm_hi;
-	u8 hborder;
-	u8 vborder;
-	u8 misc;
+	/* first two bytes are the clock */
+	u8 hactive_lo;			/* 2 */
+	u8 hblank_lo;
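The order of the checks in drm_mode_detailed() after this patch can be sketched as a small stand-alone decoder. This is only an illustration under stated assumptions: the short `MISC_*` names, the `misc_verdict` enum and the helper functions are invented here, and the bit positions are copied from the header changes as posted in the patch rather than checked against the EDID specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as posted in the patched drm_edid.h; the short names are
 * stand-ins for the DRM_EDID_DETAILED_MISC_* macros. */
#define MISC_INTERLACED			(1 << 7)
#define MISC_HAS_STEREO(x)		(((x) & (3 << 5)) != 0)
#define MISC_DIGITAL_SYNC		(1 << 4)
#define MISC_DIGITAL_SYNC_SEPARATE	(1 << 3)

/* Hypothetical result type: why a detailed timing would be rejected. */
enum misc_verdict {
	VERDICT_OK,
	VERDICT_STEREO,
	VERDICT_ANALOG_SYNC,
	VERDICT_COMPOSITE_SYNC,
};

/* Apply the three rejections in the same order as the patched code. */
static enum misc_verdict misc_check(uint8_t misc)
{
	if (MISC_HAS_STEREO(misc))
		return VERDICT_STEREO;
	if (!(misc & MISC_DIGITAL_SYNC))
		return VERDICT_ANALOG_SYNC;
	if (!(misc & MISC_DIGITAL_SYNC_SEPARATE))
		return VERDICT_COMPOSITE_SYNC;
	return VERDICT_OK;
}

static bool misc_is_interlaced(uint8_t misc)
{
	return (misc & MISC_INTERLACED) != 0;
}

/* Small self-check with hand-built example bytes. */
static void misc_selftest(void)
{
	assert(misc_check(0x18) == VERDICT_OK);		/* digital, separate sync */
	assert(misc_check(0x10) == VERDICT_COMPOSITE_SYNC);
	assert(misc_is_interlaced(0x98));
}
```

The point of the masks is that each test looks at one field of the byte, so the sync type has to be established before the lower bits are interpreted.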
[Bug 21814] Mesa 7.6-devel implementation error: invalid texture object Target value
http://bugs.freedesktop.org/show_bug.cgi?id=21814

Fabio fabio@libero.it changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26000|0                           |1
        is obsolete|                            |
  Attachment #26004|0                           |1
        is obsolete|                            |
  Attachment #26006|0                           |1
        is obsolete|                            |
  Attachment #26710|0                           |1
        is obsolete|                            |

--- Comment #14 from Fabio fabio@libero.it 2009-06-25 00:48:18 PST ---
Created an attachment (id=27117)
 --> (http://bugs.freedesktop.org/attachment.cgi?id=27117)
wine output with mesa master 2009-06-25 (commit cdbcb051)

(In reply to comment #13)
> I will retest when bug 21582 is fixed.

Bug 21582 is marked as fixed but I am still having the same assertion of
https://bugs.freedesktop.org/attachment.cgi?id=26710.

> See also bug 22438.

OK, bug 22438 is fixed but now the game crashes with a different backtrace
(attached).

-- 
Configure bugmail: http://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
Re: [RFC][PATCH] drm/radeon/kms: add initial colortiling support.
On Wed, 2009-06-24 at 22:32 +0200, Roland Scheidegger wrote: On 24.06.2009 20:17, Jerome Glisse wrote: I think we should let user ask at gem map ioctl time if userspace wants an surface backed mapping or not, and gem map will reply with a success or failure. So if object is in vram and there is a surface reg available it will succeed, if object is in system ram it will report to userspace that there is not automatic untiling and that userspace is on its own to untile the buffer. For the X server that the front buffer is mapped first and never unmapped, it should get a surface (assuming no other process already stole all the surface). For pixmap i think be better of not using tiling for time being (or macro tiling only benchmark below). Mesa, map/unmap things and should be able to untile on its own for front/zbuffer (we need to add texture but i am not sure it's worth it, see benchmark below). I don't see benchmark with texture tiling below... It definitely made some difference though when I implemented (and measured...) this, though I never really worried that much about tiled compressed textures, not sure micro tiled is even possible (and would make sense) but macro tiled certainly should be (but IIRC I tried to measure it and it didn't make much of a difference on r200 but it could have changed with newer chips). That said, don't forget that the performance improvement this gives is chip specific, generally giving more improvement with newer chips. IIRC you definitely don't want to micro tile the front buffer pre-r300. Roland Yeah i loose texture benchmark but it was very small 1-2% on quake3 but maybe quake3 isn't asking for much texture filtering, assuming filtering is the process which benefit from tiled texture. Cheers, Jerome -- -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 22438] radeon_common.c:1016: radeon_validate_bo: Assertion `radeon-state.validated_bo_count 32' failed.
http://bugs.freedesktop.org/show_bug.cgi?id=22438

Fabio fabio@libero.it changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #3 from Fabio fabio@libero.it 2009-06-25 00:49:29 PST ---
(In reply to comment #2)
> can you retest with mesa master, I fixed a bug that can cause this to happen.

Confirmed fixed, thanks! Can you take a look at bug 21814 (see comment 14).
Re: [PATCH] radeon: preallocate memory for command stream parsing
Hi Jerome, On Tue, 2009-06-23 at 22:52 +0300, Pekka Enberg wrote: On Tue, Jun 23, 2009 at 10:46 PM, Jerome Glissejgli...@redhat.com wrote: Command stream parsing is the most common operation and can happen hundred of times per second, we don't want to allocate/free memory each time this ioctl is call. This rework the ioctl to avoid doing so by allocating temporary memory along the ib pool. Signed-off-by: Jerome Glisse jgli...@redhat.com So how much does this help (i.e. where are the numbers)? I am bit surprised hundred of times per second is an issue for our slab allocators. Hmm? On Wed, 2009-06-24 at 10:29 +0200, Jerome Glisse wrote: I didn't have real number but the vmap path was really slower, quake3 fps goes from ~20 to ~40 on average when going from vmap to preallocated. When using kmalloc i don't thing there was so much performance hit. But i think the biggest hit was that in previous code i asked for zeroed memory so i am pretty sure kernel spend a bit of time clearing page. I reworked the code to avoid needing cleared memory and so avoid memset, this is likely why we get a performance boost. OK. If kmalloc() (without memset) really was too slow for your case, I'd be interested in looking at it in more detail. I'm not completely convinced the memory pool is needed here but I'm not a DRM expert so I'm not NAK'ing this either... Pekka -- -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
Re: [RFC][PATCH] drm/radeon/kms: add initial colortiling support.
On Thu, Jun 25, 2009 at 09:46, Jerome Glissegli...@freedesktop.org wrote: On Wed, 2009-06-24 at 22:32 +0200, Roland Scheidegger wrote: On 24.06.2009 20:17, Jerome Glisse wrote: I think we should let user ask at gem map ioctl time if userspace wants an surface backed mapping or not, and gem map will reply with a success or failure. So if object is in vram and there is a surface reg available it will succeed, if object is in system ram it will report to userspace that there is not automatic untiling and that userspace is on its own to untile the buffer. For the X server that the front buffer is mapped first and never unmapped, it should get a surface (assuming no other process already stole all the surface). For pixmap i think be better of not using tiling for time being (or macro tiling only benchmark below). Mesa, map/unmap things and should be able to untile on its own for front/zbuffer (we need to add texture but i am not sure it's worth it, see benchmark below). I don't see benchmark with texture tiling below... It definitely made some difference though when I implemented (and measured...) this, though I never really worried that much about tiled compressed textures, not sure micro tiled is even possible (and would make sense) but macro tiled certainly should be (but IIRC I tried to measure it and it didn't make much of a difference on r200 but it could have changed with newer chips). That said, don't forget that the performance improvement this gives is chip specific, generally giving more improvement with newer chips. IIRC you definitely don't want to micro tile the front buffer pre-r300. Roland Yeah i loose texture benchmark but it was very small 1-2% on quake3 but maybe quake3 isn't asking for much texture filtering, assuming filtering is the process which benefit from tiled texture. IIRC the microtiling mode will only benefit the exotic filtering modes (anisotropic for example). Did you try this? 
Stephane
Re: Why was old TTM removed from drm.git?
Michel Dänzer wrote: On Wed, 2009-06-24 at 19:26 +0200, Thomas Hellström wrote: Just prior to the commit I sent out a message explaining what I was going to do and why, but apparently it didn't make it to the list (which seems to be the case of quite a few mails these days). What was the From: address and subject of that mail (or any others that were apparently lost)? I can't seem to find anything in the dri-devel moderation queue mails around the weekend, so apparently it was dropped before it reached mailman. Maybe some sf.net spam filter or something.

Hi! Attaching the mail and another one that was stripped. Furthermore I had two patches sent by git-send-email stripped away, but when I routed them through the vmware smtp server they arrived. It might be that sf.net doesn't like my ISP's email server.

/Thomas.

---BeginMessage---
Hi! I'm about to push a commit that strips old TTM from the drm git repo. Not sure if anybody uses it for other things than libdrm, but at least that will remove some unused code. The master nouveau driver will be disabled since it depends on old ttm.

/Thomas
---End Message---
---BeginMessage---
Okias,

The documentation available is the Xorg wiki TTM page (a little outdated) and the ttm header files, which are quite well commented. Currently there are three drivers using it:

1) The Radeon KMS driver, using a subset of the TTM functionality.
2) The Intel moorestown / Poulsbo driver, which uses the full TTM functionality including modesetting. Look at the list archives for pointers to that.
3) The openchrome driver in the modesetting-newttm branch. No modesetting yet for that one.

Note that the latter 2 drivers are using a tiny TTM user-space interface which is never going to make it to the mainstream kernel. The openChrome driver will be patched up to use a driver-specific version of that interface.

/Thomas

okias wrote: Hello, is there any documentation related to newttm (other than the already existing drivers), plus any HowTo for converting an fb driver + xorg driver to support the memory manager + kms? Thanks okias
---End Message---
Re: [PATCH] radeon: preallocate memory for command stream parsing
Pekka Enberg skrev: Hi Jerome, On Tue, 2009-06-23 at 22:52 +0300, Pekka Enberg wrote: On Tue, Jun 23, 2009 at 10:46 PM, Jerome Glissejgli...@redhat.com wrote: Command stream parsing is the most common operation and can happen hundred of times per second, we don't want to allocate/free memory each time this ioctl is call. This rework the ioctl to avoid doing so by allocating temporary memory along the ib pool. Signed-off-by: Jerome Glisse jgli...@redhat.com So how much does this help (i.e. where are the numbers)? I am bit surprised hundred of times per second is an issue for our slab allocators. Hmm? On Wed, 2009-06-24 at 10:29 +0200, Jerome Glisse wrote: I didn't have real number but the vmap path was really slower, quake3 fps goes from ~20 to ~40 on average when going from vmap to preallocated. When using kmalloc i don't thing there was so much performance hit. But i think the biggest hit was that in previous code i asked for zeroed memory so i am pretty sure kernel spend a bit of time clearing page. I reworked the code to avoid needing cleared memory and so avoid memset, this is likely why we get a performance boost. OK. If kmalloc() (without memset) really was too slow for your case, I'd be interested in looking at it in more detail. I'm not completely convinced the memory pool is needed here but I'm not a DRM expert so I'm not NAK'ing this either... Pekka Hi! From previous experience with other drivers kmalloc() is just fine performance-wise. I've also never seen memsetting pages turn up on the profile. It would be interesting to see an oprofile timing of this to try and pinpoint what's happening. However, in this case, I believe Jerome was forced to use vmalloc to guarantee that the allocation would succeed, and frequent vmallocs seem to be a performance killer. One should also be careful about frame-rates. Tuning memory manager / command submission operation is usually a matter of how much CPU is consumed for a given framerate. 
If one compares framerates one must make sure that the CPU is at nearly 100% while benchmarking. /Thomas
Re: Why was old TTM removed from drm.git?
On Thu, 2009-06-25 at 10:55 +0200, Thomas Hellström wrote: Michel Dänzer skrev: On Wed, 2009-06-24 at 19:26 +0200, Thomas Hellström wrote: Just prior to the commit I sent out a message explaining what I was going to do and why, but apparently it didn't make it to the list (which seems to be the case of quite a few mails these days). What was the From: address and subject of that mail (or any others that were apparently lost)? I can't seem to find anything in the dri-devel moderation queue mails around the weekend, so apparently it was dropped before it reached mailman. Maybe some sf.net spam filter or something. Hi! Attaching the mail and another one that was stripped. Furthermore I had two patches sent by git-send-email stripped away, but when I routed them through the vmware smtp server they arrived. It might be that sf.net doesn't like my isp email server. Yeah, something like that seems like the most plausible explanation. -- Earthling Michel Dänzer |http://www.vmware.com Libre software enthusiast | Debian, X and DRI developer -- -- ___ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel
[Bug 22371] [DRI2] TORCS causes assertion failure in r200 dri driver
http://bugs.freedesktop.org/show_bug.cgi?id=22371

--- Comment #3 from Pauli suok...@gmail.com 2009-06-25 03:34:47 PST ---
No. Still the same assertion failure. I will have a look at what torcs is
trying to render there. It might help solve the problem.
Re: [PATCH] radeon: preallocate memory for command stream parsing
On Thu, 2009-06-25 at 11:03 +0200, Thomas Hellström wrote: Pekka Enberg skrev: Hi Jerome, On Tue, 2009-06-23 at 22:52 +0300, Pekka Enberg wrote: On Tue, Jun 23, 2009 at 10:46 PM, Jerome Glissejgli...@redhat.com wrote: Command stream parsing is the most common operation and can happen hundred of times per second, we don't want to allocate/free memory each time this ioctl is call. This rework the ioctl to avoid doing so by allocating temporary memory along the ib pool. Signed-off-by: Jerome Glisse jgli...@redhat.com So how much does this help (i.e. where are the numbers)? I am bit surprised hundred of times per second is an issue for our slab allocators. Hmm? On Wed, 2009-06-24 at 10:29 +0200, Jerome Glisse wrote: I didn't have real number but the vmap path was really slower, quake3 fps goes from ~20 to ~40 on average when going from vmap to preallocated. When using kmalloc i don't thing there was so much performance hit. But i think the biggest hit was that in previous code i asked for zeroed memory so i am pretty sure kernel spend a bit of time clearing page. I reworked the code to avoid needing cleared memory and so avoid memset, this is likely why we get a performance boost. OK. If kmalloc() (without memset) really was too slow for your case, I'd be interested in looking at it in more detail. I'm not completely convinced the memory pool is needed here but I'm not a DRM expert so I'm not NAK'ing this either... Pekka Hi! From previous experience with other drivers kmalloc() is just fine performance-wise. I've also never seen memsetting pages turn up on the profile. It would be interesting to see an oprofile timing of this to try and pinpoint what's happening. However, in this case, I believe Jerome was forced to use vmalloc to guarantee that the allocation would succeed, and frequent vmallocs seem to be a performance killer. One should also be careful about frame-rates. 
Tuning memory manager / command submission operation is usually a matter of how much CPU is consumed for a given framerate. If one compares framerates one must make sure that the CPU is at nearly 100% while benchmarking. /Thomas

To give a more accurate rough estimate: at 60fps we will issue something like 10 to 50 cs ioctls per frame, so that is up to several thousand cs ioctls per second, and as many 64K allocations plus memory clearing. I believe such things would show up on a profile.

I am not running a kernel without my patch, as I am working on other stuff now, but I will lower the pool size so that we don't waste too much memory. Right now I think the preallocation uses something around 8M of memory. I think I can get it down to 1M without impacting performance (even less if we are on PCIe).

Cheers, Jerome
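Taking the figures in Jerome's estimate at face value (60 fps, 10 to 50 cs ioctls per frame, one 64K buffer per ioctl), the volume of memory that would be allocated and cleared each second can be worked out directly. The helper below is a made-up name used only for this back-of-the-envelope check; it is not part of any patch in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes allocated (and cleared) per second if every cs ioctl allocates
 * one zeroed buffer of buf_bytes. All three inputs are assumptions taken
 * from the estimate in the mail above. */
static uint64_t cs_alloc_bytes_per_sec(unsigned fps, unsigned ioctls_per_frame,
				       uint64_t buf_bytes)
{
	assert(buf_bytes > 0);
	return (uint64_t)fps * ioctls_per_frame * buf_bytes;
}
```

For 64K buffers this gives 39321600 bytes/s (37.5 MiB/s) at 10 ioctls per frame and 196608000 bytes/s (187.5 MiB/s) at 50 ioctls per frame, which is the scale of clearing work being discussed.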
Re: [RFC][PATCH] drm/radeon/kms: add initial colortiling support.
On Wed, 2009-06-24 at 10:21 +1000, Dave Airlie wrote: From: Dave Airlie airl...@redhat.com This adds color tiling support for buffers in VRAM, it enables a color tiled fbcon and a color tiled X frontbuffer. It changes the API: adds two new parameters to the object creation API (is this better than a set/get tiling?) we probably still need a get but not sure for what yet. relocs are required for 2D DST_PITCH_OFFSET and SRC_PITCH_OFFSET type-0, and 3D COLORPITCH registers. TTM: adds a new check_tiling call to TTM, gets called at fault and around bo moves. Issues: Can we integrate endian swapping in with this? Depth buffer handling? When we run out of surface regs - needs better handling. Future features: texture tiling - need to relocate texture registers TXOFFSET* with tiling info.

Worked on that today and I think we can restrict ourselves to setting a surface only for pinned buffers (i.e. scanout buffers). Here is a use case which I believe defeats surfaces: a tiled bo in vram is mapped with a surface backing it, so userspace knows that it can access the surface linearly; the bo then gets evicted from vram and the userspace mapping is invalidated. There is no way (at least I don't think so, except calling some ioctl on each memory access, which is a no go) for userspace to know that it now has to untile by itself.

So I believe we should only set surfaces on scanout buffers and let userspace deal with untiling. I don't think this is a drawback: we can use a blit to untile a buffer (macro only) or simply have clever code to access the memory; if we are in a fallback we have already lost.

Am I missing something? Does it make sense to only program surfaces on scanout (this would give a simpler API for tiling)?

Cheers, Jerome
Re: [RFC][PATCH] drm/radeon/kms: add initial colortiling support.
On 25.06.2009 10:32, Stephane Marchesin wrote: On Thu, Jun 25, 2009 at 09:46, Jerome Glissegli...@freedesktop.org wrote: On Wed, 2009-06-24 at 22:32 +0200, Roland Scheidegger wrote: On 24.06.2009 20:17, Jerome Glisse wrote: I think we should let user ask at gem map ioctl time if userspace wants an surface backed mapping or not, and gem map will reply with a success or failure. So if object is in vram and there is a surface reg available it will succeed, if object is in system ram it will report to userspace that there is not automatic untiling and that userspace is on its own to untile the buffer. For the X server that the front buffer is mapped first and never unmapped, it should get a surface (assuming no other process already stole all the surface). For pixmap i think be better of not using tiling for time being (or macro tiling only benchmark below). Mesa, map/unmap things and should be able to untile on its own for front/zbuffer (we need to add texture but i am not sure it's worth it, see benchmark below). I don't see benchmark with texture tiling below... It definitely made some difference though when I implemented (and measured...) this, though I never really worried that much about tiled compressed textures, not sure micro tiled is even possible (and would make sense) but macro tiled certainly should be (but IIRC I tried to measure it and it didn't make much of a difference on r200 but it could have changed with newer chips). That said, don't forget that the performance improvement this gives is chip specific, generally giving more improvement with newer chips. IIRC you definitely don't want to micro tile the front buffer pre-r300. Roland Yeah i loose texture benchmark but it was very small 1-2% on quake3 but maybe quake3 isn't asking for much texture filtering, assuming filtering is the process which benefit from tiled texture. IIRC the microtiling mode will only benefit the exotic filtering modes (anisotropic for example). Did you try this? 
I am pretty sure it made quite a difference with normal trilinear filtering (otherwise I never would have bothered implementing it in the first place). Can't remember exactly but probably around 10% or so. Not sure if macro or micro tiling helped more, but both together were definitely accounting for more than 2% (unless using compressed textures). You are right though I think with bilinear (which q3 uses as default) there was less difference. That was on rv250 back then, and it could be different on newer chips (could depend quite a bit on if it's a chip with a lot of memory bandwidth or not too). Roland
[Bug 13610] radeon kms invalid edid data at lvds
http://bugzilla.kernel.org/show_bug.cgi?id=13610

Andrew Morton a...@linux-foundation.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |a...@linux-foundation.org

--- Comment #1 from Andrew Morton a...@linux-foundation.org 2009-06-25 19:47:11 ---
There's no such kernel version as 2.6.31-rc0. I assume that you meant
2.6.30? Thanks.

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.
[Bug 13610] radeon kms invalid edid data at lvds
http://bugzilla.kernel.org/show_bug.cgi?id=13610

Rafael J. Wysocki r...@sisk.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |r...@sisk.pl
             Blocks|                            |13615

--- Comment #2 from Rafael J. Wysocki r...@sisk.pl 2009-06-25 20:54:28 ---
I think he meant a post-2.6.30 git kernel.
[PATCH] drm/radeon: fix support for vline relocations.
From: Dave Airlie <airl...@redhat.com>

Userspace sends us a special relocation type to sync video/exa to vlines
to avoid tearing. This deals with the relocation in the kernel: it picks
the correct crtc and avoids issues where crtcs are disabled.

Signed-off-by: Dave Airlie <airl...@redhat.com>
---
 drivers/gpu/drm/radeon/r100.c     |   89 +
 drivers/gpu/drm/radeon/r300.c     |   13 +-
 drivers/gpu/drm/radeon/r500_reg.h |    2 +
 drivers/gpu/drm/radeon/rv515.c    |   23 +++--
 4 files changed, 121 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index c550932..bbc4cb5 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -753,6 +753,86 @@ int r100_cs_packet_parse(struct radeon_cs_parser *p,
 }
 
 /**
+ * r100_cs_packet_next_vline() - parse userspace VLINE packet
+ * @parser:		parser structure holding parsing context.
+ *
+ * Userspace sends a special sequence for VLINE waits.
+ * PACKET0 - VLINE_START_END + value
+ * PACKET0 - WAIT_UNTIL +_value
+ * RELOC (P3) - crtc_id in reloc.
+ *
+ * This function parses this and relocates the VLINE START END
+ * and WAIT UNTIL packets to the correct crtc.
+ * It also detects a switched off crtc and nulls out the
+ * wait in that case.
+ */
+int r100_cs_packet_parse_vline(struct radeon_cs_parser *p)
+{
+	struct radeon_cs_chunk *ib_chunk;
+	struct drm_mode_object *obj;
+	struct drm_crtc *crtc;
+	struct radeon_crtc *radeon_crtc;
+	struct radeon_cs_packet p3reloc, waitreloc;
+	int crtc_id;
+	int r;
+	uint32_t header, h_idx, reg;
+
+	/* jump over the wait until */
+	r = r100_cs_packet_parse(p, &waitreloc, p->idx);
+	if (r)
+		return r;
+
+	/* jump over the NOP */
+	r = r100_cs_packet_parse(p, &p3reloc, p->idx);
+	if (r)
+		return r;
+	h_idx = p->idx - 2;
+	p->idx += waitreloc.count;
+	p->idx += p3reloc.count;
+
+	ib_chunk = &p->chunks[p->chunk_ib_idx];
+	header = ib_chunk->kdata[h_idx];
+	crtc_id = ib_chunk->kdata[h_idx + 5];
+	reg = ib_chunk->kdata[h_idx] << 2;
+	mutex_lock(&p->rdev->ddev->mode_config.mutex);
+	obj = drm_mode_object_find(p->rdev->ddev, crtc_id, DRM_MODE_OBJECT_CRTC);
+	if (!obj) {
+		DRM_ERROR("cannot find crtc %d\n", crtc_id);
+		r = -EINVAL;
+		goto out;
+	}
+	crtc = obj_to_crtc(obj);
+	radeon_crtc = to_radeon_crtc(crtc);
+	crtc_id = radeon_crtc->crtc_id;
+
+	if (!crtc->enabled) {
+		/* if the CRTC isn't enabled - we need to nop out the wait until */
+		ib_chunk->kdata[h_idx + 2] = PACKET2(0);
+		ib_chunk->kdata[h_idx + 3] = PACKET2(0);
+	} else if (crtc_id == 1) {
+		switch (reg) {
+		case AVIVO_D1MODE_VLINE_START_END:
+			header &= R300_CP_PACKET0_REG_MASK;
+			header |= AVIVO_D2MODE_VLINE_START_END >> 2;
+			break;
+		case RADEON_CRTC_GUI_TRIG_VLINE:
+			header &= R300_CP_PACKET0_REG_MASK;
+			header |= RADEON_CRTC2_GUI_TRIG_VLINE >> 2;
+			break;
+		default:
+			DRM_ERROR("unknown crtc reloc\n");
+			r = -EINVAL;
+			goto out;
+		}
+		ib_chunk->kdata[h_idx] = header;
+		ib_chunk->kdata[h_idx + 3] |= RADEON_ENG_DISPLAY_SELECT_CRTC1;
+	}
+out:
+	mutex_unlock(&p->rdev->ddev->mode_config.mutex);
+	return r;
+}
+
+/**
  * r100_cs_packet_next_reloc() - parse next packet which should be reloc packet3
  * @parser:		parser structure holding parsing context.
  * @data:		pointer to relocation data
@@ -825,6 +905,15 @@ static int r100_packet0_check(struct radeon_cs_parser *p,
 	}
 	for (i = 0; i <= pkt->count; i++, idx++, reg += 4) {
 		switch (reg) {
+		case RADEON_CRTC_GUI_TRIG_VLINE:
+			r = r100_cs_packet_parse_vline(p);
+			if (r) {
+				DRM_ERROR("No reloc for ib[%d]=0x%04X\n",
+						idx, reg);
+				r100_cs_dump_packet(p, pkt);
+				return r;
+			}
+			break;
 		/* FIXME: only allow PACKET3 blit? easier to check for out of
 		 * range access */
 		case RADEON_DST_PITCH_OFFSET:
diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index e2ed5bc..da87869 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -44,6 +44,7 @@ int r100_gui_wait_for_idle(struct radeon_device *rdev);
 int r100_cs_packet_parse(struct radeon_cs_parser *p,
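The patch redirects a type-0 packet from a CRTC1 register to its CRTC2 counterpart by rewriting the register field of the packet header. A minimal standalone sketch of that idea follows. It assumes a header layout with the dword count in bits 29:16 and the register offset divided by four in the low 13 bits. All EX_ names and register offsets are hypothetical stand-ins for illustration, not the definitions from the radeon headers, and the sketch clears the old register field with the inverted mask before OR-ing in the new one, which is my reading of the intent rather than a copy of the patch.

```c
#include <stdint.h>

/* Hypothetical constants for illustration only (assumed layout and offsets). */
#define EX_PACKET0_REG_MASK 0x00001FFFu /* low bits hold register offset / 4 */
#define EX_CRTC_VLINE_REG   0x0218u     /* assumed CRTC1 register byte offset */
#define EX_CRTC2_VLINE_REG  0x0318u     /* assumed CRTC2 register byte offset */

/* Build a type-0 header: count in bits 29:16, register dword offset below. */
static uint32_t ex_packet0(uint32_t reg, uint32_t count)
{
	return ((count & 0x3FFFu) << 16) | ((reg >> 2) & EX_PACKET0_REG_MASK);
}

/* Recover the byte offset of the register a header addresses. */
static uint32_t ex_packet0_reg(uint32_t header)
{
	return (header & EX_PACKET0_REG_MASK) << 2;
}

/* Point the header at new_reg while preserving every other field. */
static uint32_t ex_packet0_retarget(uint32_t header, uint32_t new_reg)
{
	header &= ~EX_PACKET0_REG_MASK;                 /* drop old register */
	header |= (new_reg >> 2) & EX_PACKET0_REG_MASK; /* insert new register */
	return header;
}
```

Masking before extracting the register keeps the count bits from leaking into the decoded offset, so a `switch` on the decoded register still matches when the packet carries more than one dword.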
Re: TTM page pool allocator
On Thu, Jun 25, 2009 at 10:01 PM, Jerome Glisse <gli...@freedesktop.org> wrote:
> Hi,
>
> Thomas, I attach a reworked page pool allocator based on Dave's work;
> this one should be ok with ttm cache status tracking. It definitely
> helps on AGP systems; now the bottleneck is in mesa vertex's dma
> allocation.

My original version kept a list of WB pages as well. This proved to be
quite a useful optimisation on my test systems when I implemented it;
without it I was spending ~20% of my CPU in getting free pages. Granted,
I always used WB pages on PCIE/IGP systems.

Another optimisation I made at the time was around the populate call
(not sure if this is what still happens):

Allocate a 64K local BO for DMA object.
Write into the first 5 pages from userspace - get WB pages.
Bind to GART, swap those 5 pages to WC + flush.
Then populate the rest with WC pages from the list.

Granted, I think allocating WC in the first place from the pool might
work just as well, since most of the DMA buffers are write only.

Dave.
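The idea of keeping separate free lists per caching type can be sketched in plain userspace C. This is only an illustration under stated assumptions: the ex_ names, the counters, and the use of calloc() in place of a real page allocator are all hypothetical, and a caching "transition" is just a counter here rather than an actual attribute change plus flush. It is not the allocator attached to the mail above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical caching types: write-back and write-combined. */
enum ex_caching { EX_WB, EX_WC, EX_NCACHING };

struct ex_page {
	struct ex_page *next;
	enum ex_caching caching;
};

struct ex_pool {
	struct ex_page *free_list[EX_NCACHING]; /* one free list per type */
	size_t fresh_allocs; /* requests that missed the pool */
	size_t transitions;  /* simulated caching-attribute changes */
};

/* Take a page with the wanted caching type, preferring the pool. */
static struct ex_page *ex_pool_get(struct ex_pool *pool, enum ex_caching want)
{
	struct ex_page *page = pool->free_list[want];

	if (page) {
		pool->free_list[want] = page->next;
		page->next = NULL;
		return page;
	}

	/* Pool miss: a fresh page starts out WB (calloc zeroes the struct). */
	page = calloc(1, sizeof(*page));
	if (!page)
		return NULL;
	pool->fresh_allocs++;
	if (want != EX_WB) {
		page->caching = want;
		pool->transitions++; /* the expensive step the pool avoids */
	}
	return page;
}

/* Return a page to the list matching its current caching type. */
static void ex_pool_put(struct ex_pool *pool, struct ex_page *page)
{
	page->next = pool->free_list[page->caching];
	pool->free_list[page->caching] = page;
}
```

In this sketch a WC page that goes back to the pool and is requested again costs neither a fresh allocation nor another transition, which is the saving described above.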
Re: [PATCH 1/2] drm: Fix EDID detailed timing misc flags decoding
On Thu, 2009-06-25 at 16:50 -0700, Jesse Barnes wrote:
> Is this mixing up the pixel block definition of sync vs the basic block
> definition (which has all the composite, green etc bits)?

The VESA spec seems to allow detailed modes to send sync signals in
different ways -- embedded with the analog signal, on one separate wire
or on two separate wires.

> Looks ok, but according to wikipedia hsync+ is 1<<1 and vsync+ is 1<<2
> instead. Other than that the #defines look ok (and I wouldn't trust
> wikipedia; iirc it had a few errors when I looked at it last).

Wikipedia is correct and my code was wrong. Good catch!

--
keith.pack...@intel.com
Re: [PATCH 1/2] drm: Fix EDID detailed timing misc flags decoding
On Wed, 24 Jun 2009 23:49:02 -0700 Keith Packard <kei...@keithp.com> wrote:
> -	if (pt->misc & DRM_EDID_PT_STEREO) {
> +	if (DRM_EDID_DETAILED_MISC_HAS_STEREO(pt->misc)) {
>  		printk(KERN_WARNING "stereo mode not supported\n");
>  		return NULL;

Looks correct.

>  	}
> -	if (!(pt->misc & DRM_EDID_PT_SEPARATE_SYNC)) {
> -		printk(KERN_WARNING "integrated sync not supported\n");
> +	if (!(pt->misc & DRM_EDID_DETAILED_MISC_DIGITAL_SYNC)) {
> +		printk(KERN_WARNING "analog sync not supported\n");
> +		return NULL;
> +	}
> +	if (!(pt->misc & DRM_EDID_DETAILED_MISC_DIGITAL_SYNC_SEPARATE)) {
> +		printk(KERN_WARNING "digital composite sync not supported\n");
>  		return NULL;
>  	}

Is this mixing up the pixel block definition of sync vs the basic block
definition (which has all the composite, green etc bits)?

> @@ -335,16 +339,17 @@ static struct drm_display_mode *drm_mode_detailed(struct drm_device *dev,
>  	drm_mode_set_name(mode);
>
> -	if (pt->misc & DRM_EDID_PT_INTERLACED)
> +	if (pt->misc & DRM_EDID_DETAILED_MISC_INTERLACED)
>  		mode->flags |= DRM_MODE_FLAG_INTERLACE;

This looks right.

>  	if (quirks & EDID_QUIRK_DETAILED_SYNC_PP) {
> -		pt->misc |= DRM_EDID_PT_HSYNC_POSITIVE | DRM_EDID_PT_VSYNC_POSITIVE;
> +		pt->misc |= (DRM_EDID_DETAILED_MISC_DIGITAL_HSYNC_POSITIVE |
> +			     DRM_EDID_DETAILED_MISC_DIGITAL_VSYNC_POSITIVE);
>  	}

Looks ok, but according to wikipedia hsync+ is 1<<1 and vsync+ is 1<<2
instead. Other than that the #defines look ok (and I wouldn't trust
wikipedia; iirc it had a few errors when I looked at it last).

--
Jesse Barnes, Intel Open Source Technology Center
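For readers following the bit positions under discussion, here is a small standalone sketch of decoding the detailed timing flags byte. It assumes the layout as I understand it from the EDID detailed timing descriptor: bit 7 interlaced, bits 6:5 stereo mode, bits 4:3 sync type (binary 11 meaning digital separate sync), bit 2 vsync polarity and bit 1 hsync polarity. The EX_ names and the struct are hypothetical and are not the kernel's #defines, so check them against the VESA spec before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions are an assumption to verify against
 * the VESA E-EDID detailed timing descriptor. */
#define EX_MISC_INTERLACED            (1u << 7)
#define EX_MISC_STEREO_MASK           (3u << 5)
#define EX_MISC_SYNC_MASK             (3u << 3)
#define EX_MISC_SYNC_DIGITAL_SEPARATE (3u << 3)
#define EX_MISC_VSYNC_POSITIVE        (1u << 2)
#define EX_MISC_HSYNC_POSITIVE        (1u << 1)

struct ex_sync {
	bool supported;      /* false for stereo or non-separate sync */
	bool interlaced;
	bool hsync_positive;
	bool vsync_positive;
};

/* Decode the flags byte, rejecting the modes the patch also rejects. */
static struct ex_sync ex_decode_misc(uint8_t misc)
{
	struct ex_sync out = { false, false, false, false };

	if (misc & EX_MISC_STEREO_MASK)
		return out; /* stereo mode not supported */
	if ((misc & EX_MISC_SYNC_MASK) != EX_MISC_SYNC_DIGITAL_SEPARATE)
		return out; /* analog or digital composite sync */

	out.supported = true;
	out.interlaced = (misc & EX_MISC_INTERLACED) != 0;
	out.hsync_positive = (misc & EX_MISC_HSYNC_POSITIVE) != 0;
	out.vsync_positive = (misc & EX_MISC_VSYNC_POSITIVE) != 0;
	return out;
}
```

Comparing the two sync-type bits against the full "digital separate" pattern in one step covers both the analog case and the digital composite case that the patch tests separately.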
Re: [PATCH 1/2] drm: Fix EDID detailed timing misc flags decoding
On Thu, 25 Jun 2009 17:57:47 -0700 Keith Packard <kei...@keithp.com> wrote:
> On Thu, 2009-06-25 at 16:50 -0700, Jesse Barnes wrote:
> > Is this mixing up the pixel block definition of sync vs the basic
> > block definition (which has all the composite, green etc bits)?
>
> The VESA spec seems to allow detailed modes to send sync signals in
> different ways -- embedded with the analog signal, on one separate wire
> or on two separate wires.

Ok cool, I didn't have it in front of me so I wasn't sure of the
encoding in the detailed timing vs basic block section.

--
Jesse Barnes, Intel Open Source Technology Center
Re: [PATCH 1/2] drm: Fix EDID detailed timing misc flags decoding
On Thu, 2009-06-25 at 18:44 -0700, Jesse Barnes wrote:
> Ok cool, I didn't have it in front of me so I wasn't sure of the
> encoding in the detailed timing vs basic block section.

Any X.org member can get a complete copy of the VESA specs. Recommended
for review of spec-related patches. This may be the only real benefit of
membership, but it's a pretty useful one.

--
keith.pack...@intel.com