Re: [PATCH v2] arch: Fix image and bitmap byte order for ppc64le
On Mon, 2014-02-24 at 11:36 +0100, Egbert Eich wrote: From: Dinar Valeev dval...@suse.com So far PPC was big endian for sure. For ppc64le this is no longer true. What happened to this ? It never made it upstream... Cheers, Ben. Signed-off-by: Egbert Eich e...@freedesktop.org --- include/servermd.h | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/include/servermd.h b/include/servermd.h index 11f6c10..256d84b 100644 --- a/include/servermd.h +++ b/include/servermd.h @@ -114,8 +114,13 @@ SOFTWARE. #if defined(__powerpc__) || defined(__ppc__) || defined(__ppc64__) -#define IMAGE_BYTE_ORDERMSBFirst -#define BITMAP_BIT_ORDERMSBFirst +#if defined(__LITTLE_ENDIAN__) +#define IMAGE_BYTE_ORDER LSBFirst +#define BITMAP_BIT_ORDER LSBFirst +#else +#define IMAGE_BYTE_ORDER MSBFirst +#define BITMAP_BIT_ORDER MSBFirst +#endif #define GLYPHPADBYTES 4 #endif /* PowerPC */ ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: Bad DMA from CEDAR card
On Tue, 2012-10-23 at 18:42 +1100, Benjamin Herrenschmidt wrote: As you can see, it's not doing much before the failure: Allright, that debug output is bad, it's missing a bunch of stuff, due to a bad log level (the prink(KERN_DEBUG) in the atom debug stuff doesn't work anymore new kernel btw) Here's a better one: execute D7E2 (len 24, WS 0, PS 4) SET_ATI_PORT @ 0xD7E8 port: 0 (MM) CLEAR_REG @ 0xD7EB dst: REG[0x19EC].[31:0] - 0x AND_REG @ 0xD7EF dst: REG[0x19E4].[7:0] - 0x04 src: IMM 0xFE dst: REG[0x19E4].[7:0] - 0x04 OR_REG @ 0xD7F4 dst: REG[0x19E4].[7:0] - 0x04 src: PS[0x00,0x].[7:0] - 0x00 dst: REG[0x19E4].[7:0] - 0x04 EOT @ 0xD7F9 execute D7CA (len 24, WS 0, PS 4) SET_ATI_PORT @ 0xD7D0 port: 0 (MM) CLEAR_REG @ 0xD7D3 dst: REG[0x19AC].[31:0] - 0x AND_REG @ 0xD7D7 dst: REG[0x19A4].[7:0] - 0x04 src: IMM 0xFE dst: REG[0x19A4].[7:0] - 0x04 OR_REG @ 0xD7DC dst: REG[0x19A4].[7:0] - 0x04 src: PS[0x00,0x].[7:0] - 0x00 dst: REG[0x19A4].[7:0] - 0x04 EOT @ 0xD7E1 execute BADE (len 25, WS 0, PS 0) 0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9 ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Bad DMA from CEDAR card
Hi Folks ! I've been tracking a problem on POWER server with a CEDAR card (FirePro 2270 X1 server) on a POWER machine. The cool thing is that our PCI-E bridges fancy error handling is kicking in and freezing all to that device traffic as soon as the error is detected. By sprinkling some test code around, here is what I have found: - The initial symptom is that the system reports an EEH (our error handling system) error on the card at boot. Under our normal firmware Linux, that pretty much causes the card to be taken out. - With some hand made debug, I have got some more details about the error which is a bad DMA write from the card to an address that isn't mapped in our IOMMU. (Ie something that wasn't the result of a dma_map_sg or dma_alloc_coherent, etc...) - The error seems to always happen while ATOM is executing from the atom_enable_crtc table called from: [c01f7e206fb0] [c0440148] .atombios_enable_crtc+0x38/0x50 [c01f7e207030] [c0441784] .atombios_crtc_dpms+0x104/0x1a0 [c01f7e2070c0] [c0441d68] .atombios_crtc_disable+0x28/0x170 [c01f7e207190] [c03e6ef4] .drm_helper_disable_unused_functions+0x144/0x230 [c01f7e207230] [c03e69ec] .drm_fb_helper_initial_config+0x5c/0x310 [c01f7e207340] [c046862c] .radeon_fbdev_init+0xdc/0x190 [c01f7e2073e0] [c04620c0] .radeon_modeset_init+0x740/0xc90 [c01f7e2074b0] [c0438cfc] .radeon_driver_load_kms+0x14c/0x1a0 [c01f7e207550] [c03f6e14] .drm_get_pci_dev+0x1c4/0x2e0 [c01f7e207600] [c079aea8] .radeon_pci_probe+0xc4/0xe8 [c01f7e207690] [c0378d60] .pci_device_probe+0x1a0/0x1c0 [c01f7e207740] [c04db054] .driver_probe_device+0xe4/0x2c0 [c01f7e2077e0] [c04db33c] .__driver_attach+0x10c/0x110 [c01f7e207870] [c04d8b2c] .bus_for_each_dev+0x8c/0xe0 [c01f7e207920] [c04daaa8] .driver_attach+0x28/0x40 [c01f7e2079a0] [c04da3d8] .bus_add_driver+0x228/0x300 [c01f7e207a40] [c04db8b0] .driver_register+0xa0/0x1e0 [c01f7e207ae0] [c0378eb8] .__pci_register_driver+0x48/0x60 [c01f7e207b60] [c03f709c] .drm_pci_init+0x16c/0x1a0 [c01f7e207c10] [c09eae5c] .radeon_init+0xb4/0xd0 The error is detected on an atom_op_jump() that loops for ever due to the isolation which means that all MMIOs are returning 's. The actual error might have happened slightly earlier (see below). . From the backtrace, it seems to be trying to *disable* CRTCs (I would have understood if it was trying to incorrectly enable one which is sourcing pixels from the wrong address...) - I've added various delays in all sort of stages of radeon_modeset_init() and radeon_fbdev_init(), and the error still appears to be fairly well localized to the execution of that table, so it looks like it's not some stray DMA that happens to hit at that moment due to some timing, but something specifically triggered by that table execution. - I've turned on atom_debug right before the call to drm_fb_helper_initial_config() in radeon_fbdev_init() and added a freeze check between each op, and here's the result. I don't really have the brain cycles to try to parse that right now :-) I used to back then but heh, it's a long time ago... that's where I'm handing you the hot potato hoping it will make some obvious sense :-) As you can see, it's not doing much before the failure: execute D7E2 (len 24, WS 0, PS 4) SET_ATI_PORT @ 0xD7E8 port: 0 (MM) CLEAR_REG @ 0xD7EB dst: AND_REG @ 0xD7EF dst: src: dst: OR_REG @ 0xD7F4 dst: src: dst: EOT @ 0xD7F9 execute D7CA (len 24, WS 0, PS 4) SET_ATI_PORT @ 0xD7D0 port: 0 (MM) CLEAR_REG @ 0xD7D3 dst: AND_REG @ 0xD7D7 dst: src: dst: OR_REG @ 0xD7DC dst: src: dst: EOT @ 0xD7E1 execute BADE (len 25, WS 0, PS 0) 0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9 [ here we have detected the freeze, the stuff below is my own diagnostic code, it I will decrypt some of it later if it's of use, basically it says the freeze occurred due to a DMA error to an unmapped DMA address though I am not 100% sure of the actual DMA address, I think what it gives me is actually the address of the iommu PTE that had the valid bit clear, I need to do some work to turn that back into a page or something ... The packet hasn't been captured in the TLP header capture of the AER function unfortunately. ] PHB 1 diagnostic data: brdgCtl = 0x0002 portStatusReg= 0x rootCmplxStatus = 0x busAgentStatus = 0x deviceStatus = 0x000f slotStatus = 0x016003c0 linkStatus = 0xa0120008 devCmdStatus = 0x00100147 devSecStatus = 0x rootErrorStatus = 0x uncorrErrorStatus= 0x
Re: Bad DMA from CEDAR card
On Tue, 2012-10-23 at 18:54 +1100, Benjamin Herrenschmidt wrote: On Tue, 2012-10-23 at 18:42 +1100, Benjamin Herrenschmidt wrote: As you can see, it's not doing much before the failure: Allright, that debug output is bad, it's missing a bunch of stuff, due to a bad log level (the prink(KERN_DEBUG) in the atom debug stuff doesn't work anymore new kernel btw) More data: I've done a bit of AtomDis under Dave instructions and improved my tracing, and what it looks like is we run those 3 tables in that order: DAC1OutputControl DAC2OutputControl EnableCRTCMemReq The error happens somewhere between the end of DAC2OutputControl and early in EnableCRTCMemReq. It all comes from drm_helper_disable_unused_functions(), the first two I suspect as a result of drm_encoder_disable() and the last one as a result of crtc_funcs-disable which itself goes into dpms. I don't know (yet) whether anything happens in between that doesn't go via ATOM, in which case that wouldn't be traced. That's the next thing to check (including interrupts though we shouldn't be getting any at this stage afaik). Now here is the trace. As before, I enable atom_debug right before that sequence, after I've done a 1s pause and a freeze check which passes, so I'm reasonably confident the card is still somewhat in a sane state. I then run the tables, with a hack that adds a 200ms pause after each op, followed by a freeze check which I moved to after the ops instead of before, so we don't miss a freeze caused by the last op of a table. Here's the results so far: execute D7E2 (len 24, WS 0, PS 4) SET_ATI_PORT @ 0xD7E8 port: 0 (MM) CLEAR_REG @ 0xD7EB dst: REG[0x19EC].[31:0] - 0x AND_REG @ 0xD7EF dst: REG[0x19E4].[7:0] - 0x04 src: IMM 0xFE dst: REG[0x19E4].[7:0] - 0x04 OR_REG @ 0xD7F4 dst: REG[0x19E4].[7:0] - 0x04 src: PS[0x00,0x].[7:0] - 0x00 dst: REG[0x19E4].[7:0] - 0x04 EOT @ 0xD7F9 execute D7CA (len 24, WS 0, PS 4) SET_ATI_PORT @ 0xD7D0 port: 0 (MM) CLEAR_REG @ 0xD7D3 dst: REG[0x19AC].[31:0] - 0x AND_REG @ 0xD7D7 dst: REG[0x19A4].[7:0] - 0x04 src: IMM 0xFE dst: REG[0x19A4].[7:0] - 0x04 OR_REG @ 0xD7DC dst: REG[0x19A4].[7:0] - 0x04 src: PS[0x00,0x].[7:0] - 0x00 dst: REG[0x19A4].[7:0] - 0x04 EOT @ 0xD7E1 execute BADE (len 25, WS 0, PS 0) SET_ATI_PORT @ 0xBAE4 port: 0 (MM) 0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9 Interestingly enough, it _does_ go into EnableCRTCMemreq tho it doesn't seem to have time to do anything in there before it detects the failure. However it also doesn't appear to detect it despite the delays on the last instruction of DAC2OutputControl. I do wonder whether we end up actually doing something *else* in between those two, that isn't going through ATOM. I'll try to dig in that direction next. The disassembly of those tables matches the output: command_table d7ca #44 (DAC1OutputControl): Size 0018 Format Rev. 01 Param Rev. 00 Content Rev. 01 Attributes: Work space size00 longs Parameter space size 01 longs Table update indicator 0 0006: 37SET_ATI_PORT (INDIRECT_IO_MM) 0009: 5400ac19 CLEAR reg[19ac] [] 000d: 0725a419feANDreg[19a4] [...X] - fe 0012: 0d21a41900OR reg[19a4] [...X] - param[00] [...X] 0017: 5bEOT command_table d7e2 #45 (DAC2OutputControl): Size 0018 Format Rev. 01 Param Rev. 00 Content Rev. 01 Attributes: Work space size00 longs Parameter space size 01 longs Table update indicator 0 0006: 37SET_ATI_PORT (INDIRECT_IO_MM) 0009: 5400ec19 CLEAR reg[19ec] [] 000d: 0725e419feANDreg[19e4] [...X] - fe 0012: 0d21e41900OR reg[19e4] [...X] - param[00] [...X] 0017: 5bEOT Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: Bad DMA from CEDAR card
On Tue, 2012-10-23 at 21:45 +1100, Benjamin Herrenschmidt wrote: On Tue, 2012-10-23 at 18:54 +1100, Benjamin Herrenschmidt wrote: On Tue, 2012-10-23 at 18:42 +1100, Benjamin Herrenschmidt wrote: As you can see, it's not doing much before the failure: Allright, that debug output is bad, it's missing a bunch of stuff, due to a bad log level (the prink(KERN_DEBUG) in the atom debug stuff doesn't work anymore new kernel btw) More data: I've done a bit of AtomDis under Dave instructions and improved my tracing, and what it looks like is we run those 3 tables in that order: And more :-) .../... I don't know (yet) whether anything happens in between that doesn't go via ATOM, in which case that wouldn't be traced. That's the next thing to check (including interrupts though we shouldn't be getting any at this stage afaik). So I think it's in between. From what I can tell, the error happens somewhere inside the call to drm_vblank_pre_modeset() from atombios_crtc_dpms(). This is actually a bit nasty, but basically that pre_modeset() calls drm_vblank_get() which enables vblanks. Now, it doesn't look like it's actually taking any interrupt ... Well assuming this is indeed using the irq handler in evergreen.c, which doesn't appear to be called, but I might have confused my ASICs and missed something specific to CEDAR here. Here's what I've traced so far: evergreen_irq_set: vblank 0 evergreen_irq_set: hpd 1 evergreen_irq_set: hpd 2 evergreen_irq_set: hpd 3 evergreen_irq_set: hpd 4 0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9 msg: irqset 2 What that output means is that it called evergreen_irq_set() which enables vblank0 (and various hpd's but those were already enabled), and the freeze is detected at the tracepoint irqset 2 that I added in there. This point is basically right at the end of evergreen_irq_set(), where I do a 500ms delay and check for freeze. A previous trace point right before writing to CP_INT_CNTL didn't show any freeze. Now the interrupt being an MSI, it's a memory store ... I had a vague memory of one of you guys mentioning address limitations to 40-bit or so in the radeon, though I though that shouldn't affect MSIs right ? Well ... Our 64-bit MSIs are actually using all 64-bit address bits. If the radeon doesn't do that properly and crops the address bits, the MSIs are going to hit wrong, right in the middle of nowhere, possibly some DMA space. So I hacked my platform code to force it to only hand out 32-bit MSI addresses and guess what ? ... the problem seems to be gone. Ouch. That's really nasty. Supporting only a subset of the PCI address space for DMA was already fairly nasty to begin with, but not doing the full MSI addresses looks like a clear violation of the PCIe spec :-( I'll do some more tests tomorrow to confirm whether that is the problem or not at which point, if it is, we'll need some kind of quirk to indicate that it supports only MSI32 and not MSI64 or something along those lines. Guys, go shoot your HW engineers please. Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: Bad DMA from CEDAR card
On Tue, 2012-10-23 at 09:34 -0400, Alex Deucher wrote: I'll do some more tests tomorrow to confirm whether that is the problem or not at which point, if it is, we'll need some kind of quirk to indicate that it supports only MSI32 and not MSI64 or something along those lines. Guys, go shoot your HW engineers please. Well, we only support a 40 bit DMA mask, so I suspect MSIs are limited to 40 bits as well. Well, there is a big difference here... limiting yourself to configurations up to 1TB of RAM for the purpose of DMA is somewhat understandable (ignoring that in NUMA setups etc... memory might actually cover more address space than the amount of RAM etc...) but MSI's are a whole different thing. They do not hit memory, they are decoded as special addresses by the bridge , and can potentially be anywhere. Some of our bridges only support MSIs high up in 64-bit space for example. Anyway, I will double check today whether that is indeed the problem I'm hitting though from what I can tell from here, it looks like it. Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: [PATCH] Fix ConnectorTable crash in radeon_output.c
On Wed, 2012-05-02 at 17:59 +0200, Hans Verkuil wrote: For the latter, you could try if hacking the driver as described in http://lists.ozlabs.org/pipermail/linuxppc-dev/2012-April/097668.html helps. I tried this but that had no effect that I could see. For the former, Benjamin Herrenschmidt (CC'd) might be able to point you to some patches I keep hearing about but haven't seen yet. I'd be happy to test out any patches you might have. The radeon driver as is is actually way better than the old driver if it wasn't for the wrong colors. The 32-bit patches are still afaik in Tony's hands (CC) :-) He and his wife just had a new kid so he might not be -that- responsive but I'm hoping hell release those patches soon (Tony !) Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Multitouch regression in 3.3 on thinkpad X220 clickpad
Hi folks ! So Peter and I have been discussing a problem I observed on my brand new ThinkPad X220 and it's multitouch clickpad. I've been digging a bit more today and bisected the regression to: commit 7968a5dd492ccc38345013e534ad4c8d6eb60ed1 Input: synaptics - add support for Relative mode Without that commit, I can draw two traces when using two fingers in mtview (after removing the device from X), each follow one finger. With that commit applied, this doesn't work anymore: it appears to be unable to track more than one finger. IE. If i put a finger on the touchpad, I can draw, but a second finger is then ignored (or causes one or two spots in a random place to appear but that's about it). Now other multitouch operations such as two finger scrolling seem to work in X. I'm not familiar with the inner workings of the input layer or the synaptic touchpads. I'll dig a bit more this week-end if I have time, but it would save me plenty of that precious time if you guys could hint me at things to look at :-) I have a couple more bits of interesting info. If I boot with a bad driver. rmmod it. insmod a good one (one build from the same kernel tree set to the commit right before the one above) then things work fine. The interesting bit is that if I then rmmod it and insmod the bad one again ... it still works. So it looks like it might have something to do with how the touchpad is initialized. Some more info about the touchpad itself from dmesg: psmouse serio1: synaptics: Touchpad model: 1, fw: 8.0, id: 0x1e2b1, caps: 0xd001a3/0x940300/0x120c00 psmouse serio1: synaptics: serio: Synaptics pass-through port at isa0060/serio1/input0 input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio1/input/input13 Let me know if there's any other info of value I can collect. Cheers, Ben. ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: Multitouch regression in 3.3 on thinkpad X220 clickpad
On Fri, 2012-04-20 at 10:01 -0700, Dmitry Torokhov wrote: It looks we lost a condition in synaptics_set_advanced_gesture_mode(). It used to be: if (!(SYN_CAP_ADV_GESTURE(priv-ext_cap_0c) || SYN_CAP_IMAGE_SENSOR(priv-ext_cap_0c))) return 0; and now simply is: if (!SYN_CAP_ADV_GESTURE(priv-ext_cap_0c)) return 0; Could you try restoring the condition and see if it fixes the regression? Yes, that's it. Please shoot the patch below to Linus. Thanks, Ben. input/synaptics: Fix regression with image sensor trackpads commit 7968a5dd492ccc38345013e534ad4c8d6eb60ed1 Input: synaptics - add support for Relative mode Accidentally broke support for advanced gestures (multitouch) on some trackpads such as the one in my ThinkPad X220 by incorretly changing the condition for enabling them. This restores it. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org CC: sta...@kernel.org [3.3] diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c index a3bb49c..37b3792 100644 --- a/drivers/input/mouse/synaptics.c +++ b/drivers/input/mouse/synaptics.c @@ -273,7 +273,8 @@ static int synaptics_set_advanced_gesture_mode(struct psmouse *psmouse) static unsigned char param = 0xc8; struct synaptics_data *priv = psmouse-private; - if (!SYN_CAP_ADV_GESTURE(priv-ext_cap_0c)) + if (!(SYN_CAP_ADV_GESTURE(priv-ext_cap_0c) || + SYN_CAP_IMAGE_SENSOR(priv-ext_cap_0c))) return 0; if (psmouse_sliced_command(psmouse, SYN_QUE_MODEL)) ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] RADEONCopySwap: Fix RADEON_HOST_DATA_SWAP_16BIT case.
On Mon, 2012-04-16 at 15:39 +0200, Michel Dänzer wrote: From: Michel Dänzer michel.daen...@amd.com It was the same code as for RADEON_HOST_DATA_SWAP_32BIT. This caused bus errors on FreeBSD/PPC, but I'm not sure how it could not cause problems anywhere... What kind of bus errors ? Alignment faults ? I think we go lucky the kernel fixed them up on Linux at a performance penalty :-) Cheers, Ben. Reported-by: Andreas Tobler andre...@fgznet.ch Signed-off-by: Michel Dänzer michel.daen...@amd.com --- src/radeon_accel.c |5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/src/radeon_accel.c b/src/radeon_accel.c index 1cebcf6..784502d 100644 --- a/src/radeon_accel.c +++ b/src/radeon_accel.c @@ -982,10 +982,9 @@ void RADEONCopySwap(uint8_t *dst, uint8_t *src, unsigned int size, int swap) for (; nwords 0; --nwords, ++d, ++s) #ifdef __powerpc__ - asm volatile(stwbrx %0,0,%1 : : r (*s), r (d)); + asm volatile(sthbrx %0,0,%1 : : r (*s), r (d)); #else - *d = ((*s 24) 0xff) | ((*s 8) 0xff00) - | ((*s 0xff00) 8) | ((*s 0xff) 24); + *d = (*s 8) | (*s 8); #endif return; } ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH] ddx/evergreen: Fix endian of ALU constants
The constants are written directly into a buffer object shared with the card and we forget to swap them. This patch fixes it by doing the swap in evergreen_set_alu_consts() in-place (ie, it modifies the buffer), which should be fine with the way we use it in the ddx. This makes everything work fine on my caicos card on a G5 including some quik tests with Xv, gnome3 shell, etc... Thanks a lot to Jerome Glisse for holding my hand through debugging that (and finding the actual bug). Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- diff --git a/src/evergreen_accel.c b/src/evergreen_accel.c index 5c95e20..83320c8 100644 --- a/src/evergreen_accel.c +++ b/src/evergreen_accel.c @@ -479,6 +479,17 @@ evergreen_set_alu_consts(ScrnInfoPtr pScrn, const_config_t *const_conf, uint32_t if (size == 0) size = 1; +#if X_BYTE_ORDER == X_BIG_ENDIAN +{ + uint32_t count = size 4, *p = const_conf-cpu_ptr; + + while(count--) { + *p = cpu_to_le32(*p); + p++; + } +} +#endif + /* flush SQ cache */ evergreen_cp_set_surface_sync(pScrn, SH_ACTION_ENA_bit, const_conf-size_bytes, const_conf-const_addr, diff --git a/src/evergreen_exa.c b/src/evergreen_exa.c index 6becbb3..603d854 100644 --- a/src/evergreen_exa.c +++ b/src/evergreen_exa.c @@ -172,6 +172,7 @@ EVERGREENPrepareSolid(PixmapPtr pPix, int alu, Pixel pm, Pixel fg) ps_alu_consts = radeon_vbo_space(pScrn, accel_state-cbuf, 256); ps_const_conf.bo = accel_state-cbuf.vb_bo; ps_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + accel_state-cbuf.vb_offset; +ps_const_conf.cpu_ptr = (uint32_t *)(char *)ps_alu_consts; if (accel_state-dst_obj.bpp == 16) { r = (fg 11) 0x1f; g = (fg 5) 0x3f; @@ -1320,6 +1321,7 @@ static Bool EVERGREENPrepareComposite(int op, PicturePtr pSrcPicture, vs_const_conf.bo = accel_state-cbuf.vb_bo; vs_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + accel_state-cbuf.vb_offset; +vs_const_conf.cpu_ptr = (uint32_t *)(char *)cbuf; EVERGREENXFormSetup(pSrcPicture, pSrc, 0, cbuf); if (pMask) EVERGREENXFormSetup(pMaskPicture, pMask, 1, cbuf); diff --git a/src/evergreen_state.h b/src/evergreen_state.h index 40fec22..5fd85f8 100644 --- a/src/evergreen_state.h +++ b/src/evergreen_state.h @@ -120,6 +120,7 @@ typedef struct { int size_bytes; uint64_t const_addr; struct radeon_bo *bo; +uint32_t *cpu_ptr; } const_config_t; /* Vertex buffer / vtx resource */ diff --git a/src/evergreen_textured_videofuncs.c b/src/evergreen_textured_videofuncs.c index 6200cdc..d27a02a 100644 --- a/src/evergreen_textured_videofuncs.c +++ b/src/evergreen_textured_videofuncs.c @@ -449,6 +449,7 @@ EVERGREENDisplayTexturedVideo(ScrnInfoPtr pScrn, RADEONPortPrivPtr pPriv) ps_alu_consts = radeon_vbo_space(pScrn, accel_state-cbuf, 256); ps_const_conf.bo = accel_state-cbuf.vb_bo; ps_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + accel_state-cbuf.vb_offset; +ps_const_conf.cpu_ptr = (uint32_t *)(char *)ps_alu_consts; ps_alu_consts[0] = off[0]; ps_alu_consts[1] = off[1]; @@ -474,6 +475,7 @@ EVERGREENDisplayTexturedVideo(ScrnInfoPtr pScrn, RADEONPortPrivPtr pPriv) vs_alu_consts = radeon_vbo_space(pScrn, accel_state-cbuf, 256); vs_const_conf.bo = accel_state-cbuf.vb_bo; vs_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + accel_state-cbuf.vb_offset; +vs_const_conf.cpu_ptr = (uint32_t *)(char *)vs_alu_consts; vs_alu_consts[0] = 1.0 / pPriv-w; vs_alu_consts[1] = 1.0 / pPriv-h; ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH] xserver: Fix double free's in config file parser
Feeding the parser a bad config file, I crashed the server a few times. It looks like whoever free's val.str (yeah, val is a global .. yuck) is also responsible for clearing the pointer or something else might try to free it again some time later. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- diff -urN xorg-server-1.10.2.902.orig/hw/xfree86/parser/Input.c xorg-server-1.10.2.902/hw/xfree86/parser/Input.c --- xorg-server-1.10.2.902.orig/hw/xfree86/parser/Input.c 2011-02-25 14:27:14.0 +1100 +++ xorg-server-1.10.2.902/hw/xfree86/parser/Input.c2011-07-14 16:57:18.912426863 +1000 @@ -106,6 +106,7 @@ if (strcmp(val.str, keyboard) == 0) { ptr-inp_driver = strdup(kbd); free(val.str); + val.str = NULL; } else ptr-inp_driver = val.str; diff -urN xorg-server-1.10.2.902.orig/hw/xfree86/parser/InputClass.c xorg-server-1.10.2.902/hw/xfree86/parser/InputClass.c --- xorg-server-1.10.2.902.orig/hw/xfree86/parser/InputClass.c 2011-02-25 14:27:14.0 +1100 +++ xorg-server-1.10.2.902/hw/xfree86/parser/InputClass.c 2011-07-14 16:57:28.608402699 +1000 @@ -114,6 +114,7 @@ if (strcmp(val.str, keyboard) == 0) { ptr-driver = strdup(kbd); free(val.str); + val.str = NULL; } else ptr-driver = val.str; diff -urN xorg-server-1.10.2.902.orig/hw/xfree86/parser/Screen.c xorg-server-1.10.2.902/hw/xfree86/parser/Screen.c --- xorg-server-1.10.2.902.orig/hw/xfree86/parser/Screen.c 2011-02-25 14:27:14.0 +1100 +++ xorg-server-1.10.2.902/hw/xfree86/parser/Screen.c 2011-07-14 16:56:38.492527593 +1000 @@ -316,6 +316,7 @@ Error (QUOTE_MSG, SubSection); { free(val.str); + val.str = NULL; HANDLE_LIST (scrn_display_lst, xf86parseDisplaySubSection, XF86ConfDisplayPtr); } ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] xserver: Fix double free's in config file parser
On Thu, 2011-07-14 at 07:57 -0700, Dan Nicholson wrote: I also see a couple instances of free (val.str) in parser/Files.c that don't set it to NULL afterward. Yay for custom parsers! With those two instances fixed: Hrm, did I have a grep failure ? Oh... I see, I didn't catch the space between free and (val.str). I can send a newer patch tomorrow, or an addon, let me know, I'm off to bed now. Cheers, Ben. ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH] xserver: Fix double free's in config file parser
On Thu, 2011-07-14 at 09:26 -0700, Keith Packard wrote: On Fri, 15 Jul 2011 01:36:05 +1000, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: On Thu, 2011-07-14 at 07:57 -0700, Dan Nicholson wrote: I also see a couple instances of free (val.str) in parser/Files.c that don't set it to NULL afterward. Yay for custom parsers! With those two instances fixed: Hrm, did I have a grep failure ? Oh... I see, I didn't catch the space between free and (val.str). I can send a newer patch tomorrow, or an addon, let me know, I'm off to bed now. Might as well send a single patch which fixes all of them in one go. I also wouldn't mind seeing a patch to replace 'val' with a slightly more appropriate name :-) Somewhat I knew that's what I was going to end up doing :-) Will take a bit longer, I'm in the middle of some other more urgent stuff (unless somebody else wants to pick it up). I need to look more at how this parser works but it should be possible to have a more robust handling of the lifetime of that current token anyways. Something, like freeing it when we try to replace it (or some other common place like at the end of the loops) rather than having each case that can potentially consume a string have to have the right 2 statements in the right place, that sort of thing... Might even be possible to stop making it a global I suppose. Anyway, don't hold your breath on me doing that real soon, I'll fixup the patch for the immediate bug(s) later today hopefully and will try to go back to those other improvements later, maybe next week but no promise. Cheers, Ben. ___ xorg-devel@lists.x.org: X.Org development Archives: http://lists.x.org/archives/xorg-devel Info: http://lists.x.org/mailman/listinfo/xorg-devel
Re: [PATCH 3/6] drm/radeon: Writeback endian fixes
On Thu, 2011-07-14 at 17:19 +0200, Michel Dänzer wrote: On Mit, 2011-07-13 at 16:28 +1000, Benjamin Herrenschmidt wrote: The writeback ring pointer and IH ring pointer are read using le32_to_cpu so we do not want the chip to byteswap them on big-endian. We still want to byteswap the ring itself and the IBs, so we don't touch that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and IH_CNTL. In general, for things like that where we control all the accessors easily, we are better off doing the swap in SW rather than HW. Paradoxally, it does keep the code closer to x86 and avoid using poorly tested HW features. Absolutely. Unfortunately, when I fixed the CP writeback code to use le32_to_cpu(), I didn't realize the code for some GPU families was already using HW swappers for this. I also changed the use of RADEON_ to R600_ in a couple of cases to be more consistent with the surrounding code. That should probably be in a separate patch. Either way, though: I thought about it and decided it was trivial enough not to bother re-doing the patches. Alex/Dave/whoever's in charge, feel free to apply the current batch, I'll send further cleanups/fixes as separate patches, possibly not before next week or so. Cheers, Ben. Reviewed-by: Michel Dänzer mic...@daenzer.net ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: [PATCH 4/6] drm/radeon: Add a rmb() in IH processing
On Fri, 2011-07-15 at 04:19 +, Matt Turner wrote: On Wed, Jul 13, 2011 at 6:28 AM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: We should have a read memory barrier between reading the WPTR from memory and reading ring entries based on that value (ie, we need to ensure both loads are done in order by the CPU). It could be argued that the MMIO reads in r600_ack_irq() might be enough to get that barrier but I prefer keeping an explicit one just in case. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/r600.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3c86b15..7e5c801 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev) } restart_ih: + /* Order reading of wptr vs. reading of IH ring data */ + wmb(); + /* display interrupts */ r600_irq_ack(rdev); The subject line says rmb(), but this says wmb(). Just want to verify what you have is correct. Nice spotting, it's a typo and should have been rmb(). I'll fix it and respin. Cheers, Ben ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 6/6] drm/radeon: Fix the definition of RADEON_BUF_SWAP_32BIT
(Note that this is duplicated under various other names such as R600_BUF_SWAP_32BIT etc...). At least now all the definitions agree. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/radeon_reg.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_reg.h b/drivers/gpu/drm/radeon/radeon_reg.h index ec93a75..0828add1 100644 --- a/drivers/gpu/drm/radeon/radeon_reg.h +++ b/drivers/gpu/drm/radeon/radeon_reg.h @@ -3293,7 +3293,7 @@ # define RADEON_RB_BUFSZ_MASK (0x3f 0) # define RADEON_RB_BLKSZ_SHIFT8 # define RADEON_RB_BLKSZ_MASK (0x3f 8) -# define RADEON_BUF_SWAP_32BIT(1 17) +# define RADEON_BUF_SWAP_32BIT(2 16) # define RADEON_MAX_FETCH_SHIFT 18 # define RADEON_MAX_FETCH_MASK(0x3 18) # define RADEON_RB_NO_UPDATE (1 27) ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 4/6] drm/radeon: Add a rmb() in IH processing
We should have a read memory barrier between reading the WPTR from memory and reading ring entries based on that value (ie, we need to ensure both loads are done in order by the CPU). It could be argued that the MMIO reads in r600_ack_irq() might be enough to get that barrier but I prefer keeping an explicit one just in case. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/r600.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3c86b15..7e5c801 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev) } restart_ih: + /* Order reading of wptr vs. reading of IH ring data */ + wmb(); + /* display interrupts */ r600_irq_ack(rdev); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 5/6] drm/radeon: Do an MMIO read on interrupts when not uisng MSIs
When not using MSIs, there is no guarantee that DMA from the device has been fully flushed to point where it's visible to the CPU when taking an interrupt. To get this guarantee, we need to perform an MMIO read from the device, which will flush all outstanding DMAs from bridges between the device and the system. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/r600.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 7e5c801..25b2dab 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -3300,6 +3300,10 @@ int r600_irq_process(struct radeon_device *rdev) if (!rdev-ih.enabled || rdev-shutdown) return IRQ_NONE; + /* No MSIs, need a dummy read to flush PCI DMAs */ + if (!rdev-msi_enabled) + RREG32(IH_RB_WPTR); + wptr = r600_get_ih_wptr(rdev); rptr = rdev-ih.rptr; DRM_DEBUG(r600_irq_process start: rptr %d, wptr %d\n, rptr, wptr); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 2/6] drm/radeon: ATOM Endian fix for atombios_crtc_program_pll()
v6 of the structure was programmed incorrectly: args.v6.ulCrtcPclkFreq.ulPixelClock = cpu_to_le32(clock / 10); ulPixelClock is a 24-bit bitfield. This statement would thus do a 32-bit swap of (clock / 10) and drop the top 8 bits which are ... the LSB. Not what we want. Instead use masks shifts. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/atombios_crtc.c |5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c index 9541995..c742944 100644 --- a/drivers/gpu/drm/radeon/atombios_crtc.c +++ b/drivers/gpu/drm/radeon/atombios_crtc.c @@ -764,7 +764,7 @@ static void atombios_crtc_set_dcpll(struct drm_crtc *crtc, } static void atombios_crtc_program_pll(struct drm_crtc *crtc, - int crtc_id, + u32 crtc_id, int pll_id, u32 encoder_mode, u32 encoder_id, @@ -851,8 +851,7 @@ static void atombios_crtc_program_pll(struct drm_crtc *crtc, args.v5.ucPpll = pll_id; break; case 6: - args.v6.ulCrtcPclkFreq.ucCRTC = crtc_id; - args.v6.ulCrtcPclkFreq.ulPixelClock = cpu_to_le32(clock / 10); + args.v6.ulDispEngClkFreq = cpu_to_le32(crtc_id 24 | clock / 10); args.v6.ucRefDiv = ref_div; args.v6.usFbDiv = cpu_to_le16(fb_div); args.v6.ulFbDivDecFrac = cpu_to_le32(frac_fb_div * 10); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 1/6] drm/radeon: Remove a bunch of useless _iomem casts
Just defining rdev-rmmio properly in the first place should do the trick. In some cases, the cast were also complete dups as the original variable was already of the right type. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/radeon.h | 22 +++--- drivers/gpu/drm/radeon/rs600.c |2 +- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index ef0e0e0..c92cf2c 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -1174,7 +1174,7 @@ struct radeon_device { /* Register mmio */ resource_size_t rmmio_base; resource_size_t rmmio_size; - void*rmmio; + void __iomem*rmmio; radeon_rreg_t mc_rreg; radeon_wreg_t mc_wreg; radeon_rreg_t pll_rreg; @@ -1251,20 +1251,20 @@ int radeon_gpu_wait_for_idle(struct radeon_device *rdev); static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg) { if (reg rdev-rmmio_size) - return readl(((void __iomem *)rdev-rmmio) + reg); + return readl((rdev-rmmio) + reg); else { - writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX); - return readl(((void __iomem *)rdev-rmmio) + RADEON_MM_DATA); + writel(reg, (rdev-rmmio) + RADEON_MM_INDEX); + return readl((rdev-rmmio) + RADEON_MM_DATA); } } static inline void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v) { if (reg rdev-rmmio_size) - writel(v, ((void __iomem *)rdev-rmmio) + reg); + writel(v, (rdev-rmmio) + reg); else { - writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX); - writel(v, ((void __iomem *)rdev-rmmio) + RADEON_MM_DATA); + writel(reg, (rdev-rmmio) + RADEON_MM_INDEX); + writel(v, (rdev-rmmio) + RADEON_MM_DATA); } } @@ -1296,10 +1296,10 @@ static inline void r100_io_wreg(struct radeon_device *rdev, u32 reg, u32 v) /* * Registers read write functions. */ -#define RREG8(reg) readb(((void __iomem *)rdev-rmmio) + (reg)) -#define WREG8(reg, v) writeb(v, ((void __iomem *)rdev-rmmio) + (reg)) -#define RREG16(reg) readw(((void __iomem *)rdev-rmmio) + (reg)) -#define WREG16(reg, v) writew(v, ((void __iomem *)rdev-rmmio) + (reg)) +#define RREG8(reg) readb((rdev-rmmio) + (reg)) +#define WREG8(reg, v) writeb(v, (rdev-rmmio) + (reg)) +#define RREG16(reg) readw((rdev-rmmio) + (reg)) +#define WREG16(reg, v) writew(v, (rdev-rmmio) + (reg)) #define RREG32(reg) r100_mm_rreg(rdev, (reg)) #define DREG32(reg) printk(KERN_INFO REGISTER: #reg : 0x%08X\n, r100_mm_rreg(rdev, (reg))) #define WREG32(reg, v) r100_mm_wreg(rdev, (reg), (v)) diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c index 6e3b11e..6779576 100644 --- a/drivers/gpu/drm/radeon/rs600.c +++ b/drivers/gpu/drm/radeon/rs600.c @@ -530,7 +530,7 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, uint64_t addr) addr = addr 0xF000ULL; addr |= R600_PTE_VALID | R600_PTE_SYSTEM | R600_PTE_SNOOPED; addr |= R600_PTE_READABLE | R600_PTE_WRITEABLE; - writeq(addr, ((void __iomem *)ptr) + (i * 8)); + writeq(addr, ptr + (i * 8)); return 0; } ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 3/6] drm/radeon: Writeback endian fixes
The writeback ring pointer and IH ring pointer are read using le32_to_cpu so we do not want the chip to byteswap them on big-endian. We still want to byteswap the ring itself and the IBs, so we don't touch that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and IH_CNTL. In general, for things like that where we control all the accessors easily, we are better off doing the swap in SW rather than HW. Paradoxally, it does keep the code closer to x86 and avoid using poorly tested HW features. I also changed the use of RADEON_ to R600_ in a couple of cases to be more consistent with the surrounding code. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/evergreen.c |3 --- drivers/gpu/drm/radeon/r600.c |7 --- drivers/gpu/drm/radeon/r600_cp.c | 23 +-- 3 files changed, 9 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index e8a5ffb..23cf089 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1359,9 +1359,6 @@ int evergreen_cp_resume(struct radeon_device *rdev) /* set the wb address wether it's enabled or not */ WREG32(CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN - RB_RPTR_SWAP(2) | -#endif ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFFFC)); WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFF); WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 8) 0x); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index f79d2cc..3c86b15 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2212,9 +2212,6 @@ int r600_cp_resume(struct radeon_device *rdev) /* set the wb address whether it's enabled or not */ WREG32(CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN - RB_RPTR_SWAP(2) | -#endif ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFFFC)); WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFF); WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 8) 0x); @@ -2993,10 +2990,6 @@ int r600_irq_init(struct radeon_device *rdev) /* RPTR_REARM only works if msi's are enabled */ if (rdev-msi_enabled) ih_cntl |= RPTR_REARM; - -#ifdef __BIG_ENDIAN - ih_cntl |= IH_MC_SWAP(IH_MC_SWAP_32BIT); -#endif WREG32(IH_CNTL, ih_cntl); /* force the active interrupt state to all disabled */ diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c index c3ab959..45fd592 100644 --- a/drivers/gpu/drm/radeon/r600_cp.c +++ b/drivers/gpu/drm/radeon/r600_cp.c @@ -1802,8 +1802,8 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, /* Set ring buffer size */ #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_BUF_SWAP_32BIT | -RADEON_RB_NO_UPDATE | +R600_BUF_SWAP_32BIT | +R600_RB_NO_UPDATE | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #else @@ -1820,15 +1820,15 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_BUF_SWAP_32BIT | -RADEON_RB_NO_UPDATE | -RADEON_RB_RPTR_WR_ENA | +R600_BUF_SWAP_32BIT | +R600_RB_NO_UPDATE | +R600_RB_RPTR_WR_ENA | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #else RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_RB_NO_UPDATE | -RADEON_RB_RPTR_WR_ENA | +R600_RB_NO_UPDATE | +R600_RB_RPTR_WR_ENA | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #endif @@ -1851,13 +1851,8 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, - ((unsigned long) dev-sg-virtual) + dev_priv-gart_vm_start; } - RADEON_WRITE(R600_CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN -(2 0) | -#endif -(rptr_addr 0xfffc)); - RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI, -upper_32_bits(rptr_addr)); + RADEON_WRITE(R600_CP_RB_RPTR_ADDR, (rptr_addr 0xfffc)); + RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI, upper_32_bits(rptr_addr)); #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org
LSI problem, endianness cleanups
Hi Alex ! A couple of things I noticed while doing some of the fixes I posted recently: - There seem to be what could be a HW issue, I would appreciate if you could double check with your HW guys, at least on rv610 (I haven't been able to test on other cards). When MSI is -disabled- (using old style LSIs), my PCI-E bridge detected an invalid DMA after every interrupt. I am not sure I managed to capture the address properly, it might be 0 but I need to double check with my own HW guys. This doesn't happen when MSI is enabled. I -looks- like the chip might be trying to shoot an MSI even when MSIs aren't enabled in the config space, possibly using the (stale) content of the config space MSI address register. Can you check if your HW folks know anything about this ? - There's a lot of horrible duplication of register definitions :-) I think there's at least 3 copies of CP_RB_CNTL and associated bits. One thing I'd like to cleanup (not -that- but related) is the way we set the endian swap bits. Basically, we currently have #ifdef's that duplicate entire multi-line statements changing one bit (almost always the same). I'm thinking about instead defining in a .h something like #ifdef __BIG_ENDIAN #define CP_RB_SWAP CP_RB_SWAP_32BIT #else #define CP_RB_SWAP 0 #endif And using CP_RB_SWAP everywhere, removing all the dups (which are bug prone as always with dups). I'd like to make sure you are ok with that before I jump and do it, and especially ok with having a _single_ instance of that bit def for all radeon variants since the bits are in the same place everywhere. Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 6/6] drm/radeon: Fix the definition of RADEON_BUF_SWAP_32BIT
(Note that this is duplicated under various other names such as R600_BUF_SWAP_32BIT etc...). At least now all the definitions agree. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- drivers/gpu/drm/radeon/radeon_reg.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon_reg.h b/drivers/gpu/drm/radeon/radeon_reg.h index ec93a75..0828add1 100644 --- a/drivers/gpu/drm/radeon/radeon_reg.h +++ b/drivers/gpu/drm/radeon/radeon_reg.h @@ -3293,7 +3293,7 @@ # define RADEON_RB_BUFSZ_MASK (0x3f 0) # define RADEON_RB_BLKSZ_SHIFT8 # define RADEON_RB_BLKSZ_MASK (0x3f 8) -# define RADEON_BUF_SWAP_32BIT(1 17) +# define RADEON_BUF_SWAP_32BIT(2 16) # define RADEON_MAX_FETCH_SHIFT 18 # define RADEON_MAX_FETCH_MASK(0x3 18) # define RADEON_RB_NO_UPDATE (1 27) ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 4/6] drm/radeon: Add a rmb() in IH processing
We should have a read memory barrier between reading the WPTR from memory and reading ring entries based on that value (ie, we need to ensure both loads are done in order by the CPU). It could be argued that the MMIO reads in r600_ack_irq() might be enough to get that barrier but I prefer keeping an explicit one just in case. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- drivers/gpu/drm/radeon/r600.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3c86b15..7e5c801 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev) } restart_ih: + /* Order reading of wptr vs. reading of IH ring data */ + wmb(); + /* display interrupts */ r600_irq_ack(rdev); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 5/6] drm/radeon: Do an MMIO read on interrupts when not uisng MSIs
When not using MSIs, there is no guarantee that DMA from the device has been fully flushed to point where it's visible to the CPU when taking an interrupt. To get this guarantee, we need to perform an MMIO read from the device, which will flush all outstanding DMAs from bridges between the device and the system. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- drivers/gpu/drm/radeon/r600.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 7e5c801..25b2dab 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -3300,6 +3300,10 @@ int r600_irq_process(struct radeon_device *rdev) if (!rdev-ih.enabled || rdev-shutdown) return IRQ_NONE; + /* No MSIs, need a dummy read to flush PCI DMAs */ + if (!rdev-msi_enabled) + RREG32(IH_RB_WPTR); + wptr = r600_get_ih_wptr(rdev); rptr = rdev-ih.rptr; DRM_DEBUG(r600_irq_process start: rptr %d, wptr %d\n, rptr, wptr); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 1/6] drm/radeon: Remove a bunch of useless _iomem casts
Just defining rdev-rmmio properly in the first place should do the trick. In some cases, the cast were also complete dups as the original variable was already of the right type. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- drivers/gpu/drm/radeon/radeon.h | 22 +++--- drivers/gpu/drm/radeon/rs600.c |2 +- 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index ef0e0e0..c92cf2c 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -1174,7 +1174,7 @@ struct radeon_device { /* Register mmio */ resource_size_t rmmio_base; resource_size_t rmmio_size; - void*rmmio; + void __iomem*rmmio; radeon_rreg_t mc_rreg; radeon_wreg_t mc_wreg; radeon_rreg_t pll_rreg; @@ -1251,20 +1251,20 @@ int radeon_gpu_wait_for_idle(struct radeon_device *rdev); static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg) { if (reg rdev-rmmio_size) - return readl(((void __iomem *)rdev-rmmio) + reg); + return readl((rdev-rmmio) + reg); else { - writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX); - return readl(((void __iomem *)rdev-rmmio) + RADEON_MM_DATA); + writel(reg, (rdev-rmmio) + RADEON_MM_INDEX); + return readl((rdev-rmmio) + RADEON_MM_DATA); } } static inline void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v) { if (reg rdev-rmmio_size) - writel(v, ((void __iomem *)rdev-rmmio) + reg); + writel(v, (rdev-rmmio) + reg); else { - writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX); - writel(v, ((void __iomem *)rdev-rmmio) + RADEON_MM_DATA); + writel(reg, (rdev-rmmio) + RADEON_MM_INDEX); + writel(v, (rdev-rmmio) + RADEON_MM_DATA); } } @@ -1296,10 +1296,10 @@ static inline void r100_io_wreg(struct radeon_device *rdev, u32 reg, u32 v) /* * Registers read write functions. */ -#define RREG8(reg) readb(((void __iomem *)rdev-rmmio) + (reg)) -#define WREG8(reg, v) writeb(v, ((void __iomem *)rdev-rmmio) + (reg)) -#define RREG16(reg) readw(((void __iomem *)rdev-rmmio) + (reg)) -#define WREG16(reg, v) writew(v, ((void __iomem *)rdev-rmmio) + (reg)) +#define RREG8(reg) readb((rdev-rmmio) + (reg)) +#define WREG8(reg, v) writeb(v, (rdev-rmmio) + (reg)) +#define RREG16(reg) readw((rdev-rmmio) + (reg)) +#define WREG16(reg, v) writew(v, (rdev-rmmio) + (reg)) #define RREG32(reg) r100_mm_rreg(rdev, (reg)) #define DREG32(reg) printk(KERN_INFO REGISTER: #reg : 0x%08X\n, r100_mm_rreg(rdev, (reg))) #define WREG32(reg, v) r100_mm_wreg(rdev, (reg), (v)) diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c index 6e3b11e..6779576 100644 --- a/drivers/gpu/drm/radeon/rs600.c +++ b/drivers/gpu/drm/radeon/rs600.c @@ -530,7 +530,7 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, uint64_t addr) addr = addr 0xF000ULL; addr |= R600_PTE_VALID | R600_PTE_SYSTEM | R600_PTE_SNOOPED; addr |= R600_PTE_READABLE | R600_PTE_WRITEABLE; - writeq(addr, ((void __iomem *)ptr) + (i * 8)); + writeq(addr, ptr + (i * 8)); return 0; } ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH 3/6] drm/radeon: Writeback endian fixes
The writeback ring pointer and IH ring pointer are read using le32_to_cpu so we do not want the chip to byteswap them on big-endian. We still want to byteswap the ring itself and the IBs, so we don't touch that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and IH_CNTL. In general, for things like that where we control all the accessors easily, we are better off doing the swap in SW rather than HW. Paradoxally, it does keep the code closer to x86 and avoid using poorly tested HW features. I also changed the use of RADEON_ to R600_ in a couple of cases to be more consistent with the surrounding code. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- drivers/gpu/drm/radeon/evergreen.c |3 --- drivers/gpu/drm/radeon/r600.c |7 --- drivers/gpu/drm/radeon/r600_cp.c | 23 +-- 3 files changed, 9 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index e8a5ffb..23cf089 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1359,9 +1359,6 @@ int evergreen_cp_resume(struct radeon_device *rdev) /* set the wb address wether it's enabled or not */ WREG32(CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN - RB_RPTR_SWAP(2) | -#endif ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFFFC)); WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFF); WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 8) 0x); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index f79d2cc..3c86b15 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2212,9 +2212,6 @@ int r600_cp_resume(struct radeon_device *rdev) /* set the wb address whether it's enabled or not */ WREG32(CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN - RB_RPTR_SWAP(2) | -#endif ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFFFC)); WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFF); WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 8) 0x); @@ -2993,10 +2990,6 @@ int r600_irq_init(struct radeon_device *rdev) /* RPTR_REARM only works if msi's are enabled */ if (rdev-msi_enabled) ih_cntl |= RPTR_REARM; - -#ifdef __BIG_ENDIAN - ih_cntl |= IH_MC_SWAP(IH_MC_SWAP_32BIT); -#endif WREG32(IH_CNTL, ih_cntl); /* force the active interrupt state to all disabled */ diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c index c3ab959..45fd592 100644 --- a/drivers/gpu/drm/radeon/r600_cp.c +++ b/drivers/gpu/drm/radeon/r600_cp.c @@ -1802,8 +1802,8 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, /* Set ring buffer size */ #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_BUF_SWAP_32BIT | -RADEON_RB_NO_UPDATE | +R600_BUF_SWAP_32BIT | +R600_RB_NO_UPDATE | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #else @@ -1820,15 +1820,15 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_BUF_SWAP_32BIT | -RADEON_RB_NO_UPDATE | -RADEON_RB_RPTR_WR_ENA | +R600_BUF_SWAP_32BIT | +R600_RB_NO_UPDATE | +R600_RB_RPTR_WR_ENA | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #else RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_RB_NO_UPDATE | -RADEON_RB_RPTR_WR_ENA | +R600_RB_NO_UPDATE | +R600_RB_RPTR_WR_ENA | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #endif @@ -1851,13 +1851,8 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, - ((unsigned long) dev-sg-virtual) + dev_priv-gart_vm_start; } - RADEON_WRITE(R600_CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN -(2 0) | -#endif -(rptr_addr 0xfffc)); - RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI, -upper_32_bits(rptr_addr)); + RADEON_WRITE(R600_CP_RB_RPTR_ADDR, (rptr_addr 0xfffc)); + RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI, upper_32_bits(rptr_addr)); #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: [PATCH 2/6] drm/radeon: ATOM Endian fix for atombios_crtc_program_pll()
On Wed, 2011-07-13 at 10:38 -0400, Alex Deucher wrote: case 6: - args.v6.ulCrtcPclkFreq.ucCRTC = crtc_id; - args.v6.ulCrtcPclkFreq.ulPixelClock = cpu_to_le32(clock / 10); + args.v6.ulDispEngClkFreq = cpu_to_le32(crtc_id 24 | clock / 10); For clarity (i can never remember op precedence), you might put: cpu_to_le32((crtc_id 24) | (clock / 10)); I can't either but I have a nice chart printed on my wall :-) I can respin if you really want but the above is correct. Cheers, Ben. Other than that, Reviewed-by: Alex Deucher alexander.deuc...@amd.com args.v6.ucRefDiv = ref_div; args.v6.usFbDiv = cpu_to_le16(fb_div); args.v6.ulFbDivDecFrac = cpu_to_le32(frac_fb_div * 10); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: [PATCH 3/6] drm/radeon: Writeback endian fixes
On Wed, 2011-07-13 at 10:42 -0400, Alex Deucher wrote: On Wed, Jul 13, 2011 at 2:28 AM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: The writeback ring pointer and IH ring pointer are read using le32_to_cpu so we do not want the chip to byteswap them on big-endian. We still want to byteswap the ring itself and the IBs, so we don't touch that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and IH_CNTL. In general, for things like that where we control all the accessors easily, we are better off doing the swap in SW rather than HW. Paradoxally, it does keep the code closer to x86 and avoid using poorly tested HW features. I also changed the use of RADEON_ to R600_ in a couple of cases to be more consistent with the surrounding code. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org evergreen.c and ni.c will need similar fixes. evergreen.c -is- fixed in the patch :-) ni.c doesn't seem to set swapping on the write back of the ring pointer (can you dbl check ?), it only enables swapping on the ring itself which is correct as far as I can tell. Cheers, Ben. Reviewed-by: Alex Deucher alexander.deuc...@amd.com --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/evergreen.c |3 --- drivers/gpu/drm/radeon/r600.c |7 --- drivers/gpu/drm/radeon/r600_cp.c | 23 +-- 3 files changed, 9 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c index e8a5ffb..23cf089 100644 --- a/drivers/gpu/drm/radeon/evergreen.c +++ b/drivers/gpu/drm/radeon/evergreen.c @@ -1359,9 +1359,6 @@ int evergreen_cp_resume(struct radeon_device *rdev) /* set the wb address wether it's enabled or not */ WREG32(CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN - RB_RPTR_SWAP(2) | -#endif ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFFFC)); WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFF); WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 8) 0x); diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index f79d2cc..3c86b15 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2212,9 +2212,6 @@ int r600_cp_resume(struct radeon_device *rdev) /* set the wb address whether it's enabled or not */ WREG32(CP_RB_RPTR_ADDR, -#ifdef __BIG_ENDIAN - RB_RPTR_SWAP(2) | -#endif ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFFFC)); WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET) 0xFF); WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 8) 0x); @@ -2993,10 +2990,6 @@ int r600_irq_init(struct radeon_device *rdev) /* RPTR_REARM only works if msi's are enabled */ if (rdev-msi_enabled) ih_cntl |= RPTR_REARM; - -#ifdef __BIG_ENDIAN - ih_cntl |= IH_MC_SWAP(IH_MC_SWAP_32BIT); -#endif WREG32(IH_CNTL, ih_cntl); /* force the active interrupt state to all disabled */ diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c index c3ab959..45fd592 100644 --- a/drivers/gpu/drm/radeon/r600_cp.c +++ b/drivers/gpu/drm/radeon/r600_cp.c @@ -1802,8 +1802,8 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, /* Set ring buffer size */ #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_BUF_SWAP_32BIT | -RADEON_RB_NO_UPDATE | +R600_BUF_SWAP_32BIT | +R600_RB_NO_UPDATE | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #else @@ -1820,15 +1820,15 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev, #ifdef __BIG_ENDIAN RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_BUF_SWAP_32BIT | -RADEON_RB_NO_UPDATE | -RADEON_RB_RPTR_WR_ENA | +R600_BUF_SWAP_32BIT | +R600_RB_NO_UPDATE | +R600_RB_RPTR_WR_ENA | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #else RADEON_WRITE(R600_CP_RB_CNTL, -RADEON_RB_NO_UPDATE | -RADEON_RB_RPTR_WR_ENA | +R600_RB_NO_UPDATE | +R600_RB_RPTR_WR_ENA | (dev_priv-ring.rptr_update_l2qw 8) | dev_priv-ring.size_l2qw); #endif @@ -1851,13 +1851,8 @@ static void r600_cp_init_ring_buffer(struct drm_device *dev
Re: [PATCH 4/6] drm/radeon: Add a rmb() in IH processing
On Wed, 2011-07-13 at 10:48 -0400, Alex Deucher wrote: On Wed, Jul 13, 2011 at 10:43 AM, Alex Deucher alexdeuc...@gmail.com wrote: On Wed, Jul 13, 2011 at 2:28 AM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: We should have a read memory barrier between reading the WPTR from memory and reading ring entries based on that value (ie, we need to ensure both loads are done in order by the CPU). It could be argued that the MMIO reads in r600_ack_irq() might be enough to get that barrier but I prefer keeping an explicit one just in case. Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org Reviewed-by: Alex Deucher alexander.deuc...@amd.com evergreen.c will need a similar fix. Ok. I can do that. Cheers, Ben. Alex --- (resent adding dri-devel to the CC list to hit patchwork) drivers/gpu/drm/radeon/r600.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 3c86b15..7e5c801 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev) } restart_ih: + /* Order reading of wptr vs. reading of IH ring data */ + wmb(); + /* display interrupts */ r600_irq_ack(rdev); ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: LSI problem, endianness cleanups
On Wed, 2011-07-13 at 11:33 -0400, Alex Deucher wrote: I'm thinking about instead defining in a .h something like #ifdef __BIG_ENDIAN #define CP_RB_SWAP CP_RB_SWAP_32BIT #else #define CP_RB_SWAP 0 #endif And using CP_RB_SWAP everywhere, removing all the dups (which are bug prone as always with dups). I'd like to make sure you are ok with that before I jump and do it, and especially ok with having a _single_ instance of that bit def for all radeon variants since the bits are in the same place everywhere. Seems ok to me. But as you mentioned in IRC, I think in most cases we should just do the swap on the CPU. Also we generally use the same bits for all the endian swap fields so we could do something like: #define ENDIAN_NONE 0 #define ENDIAN_8IN16 1 #define ENDIAN_8IN32 2 #ifdef __BIG_ENDIAN #define DFLT_SWAP ENDIAN_8IN32 #else #define DFLT_SWAP ENDIAN_NONE #endif And use it for all the relevant fields. Right. As far as the CPU side swap, well, I agree mostly, but for the ring and IB this is a much bigger undertaking. We'd have to fix everything that writes (or reads, ie. validation) to ring and IB in kernel and userspace, so I'll leave that one alone for now. The IH and writeback were easy because they are contained within the kernel driver. Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
[PATCH] radeon: Fix PCI usage of 32-bit driver on 64-bit platform
Hi ! The radeon driver is storing PCI addresses in unsigned long's which won't work well on 32-bit platforms with 64-bit physical address space such as PowerPC 4xx. This fixes it by using unsigned long long instead. Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED] --- index f7ae1a8..e7ccfa9 100644 --- a/src/radeon.h +++ b/src/radeon.h @@ -648,9 +648,9 @@ typedef struct { RADEONChipFamily ChipFamily; RADEONErrata ChipErrata; -unsigned long LinearAddr; /* Frame buffer physical address */ -unsigned long MMIOAddr; /* MMIO region physical address */ -unsigned long BIOSAddr; /* BIOS physical address */ +unsigned long long LinearAddr; /* Frame buffer physical address */ +unsigned long long MMIOAddr; /* MMIO region physical address */ +unsigned long long BIOSAddr; /* BIOS physical address */ uint32_t fbLocation; uint32_t gartLocation; uint32_t mc_fb_location; index c759bd6..138bbc8 100644 --- a/src/radeon_driver.c +++ b/src/radeon_driver.c @@ -429,7 +429,7 @@ static Bool RADEONMapFB(ScrnInfoPtr pScrn) RADEONInfoPtr info = RADEONPTR(pScrn); xf86DrvMsgVerb(pScrn-scrnIndex, X_INFO, RADEON_LOGLEVEL_DEBUG, - Map: 0x%08lx, 0x%08lx\n, info-LinearAddr, info-FbMapSize); + Map: 0x%016llx, 0x%08lx\n, info-LinearAddr, info-FbMapSize); #ifndef XSERVER_LIBPCIACCESS @@ -1745,11 +1745,11 @@ static Bool RADEONPreInitChipType(ScrnInfoPtr pScrn) } from = X_PROBED; -info-LinearAddr = PCI_REGION_BASE(info-PciInfo, 0, REGION_MEM) ~0x1ffUL; +info-LinearAddr = PCI_REGION_BASE(info-PciInfo, 0, REGION_MEM) ~0x1ffULL; pScrn-memPhysBase = info-LinearAddr; if (dev-MemBase) { xf86DrvMsg(pScrn-scrnIndex, X_INFO, - Linear address override, using 0x%016lx instead of 0x%016lx\n, + Linear address override, using 0x%016lx instead of 0x%016llx\n, dev-MemBase, info-LinearAddr); info-LinearAddr = dev-MemBase; @@ -1760,7 +1760,7 @@ static Bool RADEONPreInitChipType(ScrnInfoPtr pScrn) return FALSE; } xf86DrvMsg(pScrn-scrnIndex, from, - Linear framebuffer at 0x%016lx\n, info-LinearAddr); + Linear framebuffer at 0x%016llx\n, info-LinearAddr); #ifndef XSERVER_LIBPCIACCESS /* BIOS */ @@ -2742,11 +2742,13 @@ Bool RADEONPreInit(ScrnInfoPtr pScrn, int flags) info-PciTag = pciTag(PCI_DEV_BUS(info-PciInfo), PCI_DEV_DEV(info-PciInfo), PCI_DEV_FUNC(info-PciInfo)); -info-MMIOAddr = PCI_REGION_BASE(info-PciInfo, 2, REGION_MEM) ~0xffUL; +info-MMIOAddr = PCI_REGION_BASE(info-PciInfo, 2, REGION_MEM) ~0xffULL; info-MMIOSize = PCI_REGION_SIZE(info-PciInfo, 2); + xf86DrvMsg(pScrn-scrnIndex, X_INFO, TOTO SAYS %016llx\n, + (unsigned long long)info-PciInfo-regions[2].base_addr); if (info-pEnt-device-IOBase) { xf86DrvMsg(pScrn-scrnIndex, X_CONFIG, - MMIO address override, using 0x%08lx instead of 0x%08lx\n, + MMIO address override, using 0x%08lx instead of 0x%016llx\n, info-pEnt-device-IOBase, info-MMIOAddr); info-MMIOAddr = info-pEnt-device-IOBase; @@ -2755,7 +2757,7 @@ Bool RADEONPreInit(ScrnInfoPtr pScrn, int flags) goto fail1; } xf86DrvMsg(pScrn-scrnIndex, X_INFO, - MMIO registers at 0x%016lx: size %ldKB\n, info-MMIOAddr, info-MMIOSize / 1024); + MMIO registers at 0x%016llx: size %ldKB\n, info-MMIOAddr, info-MMIOSize / 1024); if(!RADEONMapMMIO(pScrn)) { xf86DrvMsg(pScrn-scrnIndex, X_ERROR, ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
RE: [radeonhd] 0x94CC:0x1545:0x3650: Radeonhd RV610 HD2400Pro
On Wed, 2008-08-13 at 03:52 -0700, Tai Tri Nguyen wrote: Hi all, Thanks for your helps. That seems an interesting issue. Ben, I'm working on a powerpc sequoia using AMCC 440EPX CPU. Ok, I don't think I have one of these, but I have an SPe and a 405EX so that should be similar. In any case, it will take a little bit of time before I can get to these r6xx endian issues so I suggest you look at what you get with an r5xx so we can properly separate the issues caused by the lack of surface swapping from whatever other pixman issues we may have. Cheers, Ben. Best regards, Tai Nguyen -Original Message- From: Benjamin Herrenschmidt [mailto:[EMAIL PROTECTED] Sent: Wednesday, August 13, 2008 5:38 PM To: Dave Airlie Cc: Tai Tri Nguyen; Alex Deucher; xorg-driver-ati@lists.x.org; Tuan Thanh Phan Subject: Re: [radeonhd] 0x94CC:0x1545:0x3650: Radeonhd RV610 HD2400Pro On Wed, 2008-08-13 at 20:07 +1000, Dave Airlie wrote: On Wed, Aug 13, 2008 at 7:54 PM, Tai Tri Nguyen [EMAIL PROTECTED] wrote: Hi Alex, The driver I'm using is from git, the latest version of xf86-video-ati git clone git://anongit.freedesktop.org/git/xorg/driver/xf86-video-ati I have just downloaded and compiled again but the issue still exist. Video can display in 24 bits depth color but perform wrong color. Do you have any idea for the issue? I think this is a pixman bug, but I'm not sure, Ben you have any luck tracking it down further? Actually R600 doesn't have surfaces like previous cards so actually we need to program some bits of r600. Hi ! There are two different things at play here: - There may be a problem with pixman, I've had report of color issues with some apps / themes and I've observed some myself with the fedora default theme, on cards like r500 where we pretty much know the surface swapping works. - The r600 has no surfaces, at least not the way they used to work, so it's expected that colors will be wrong on that one. For the later, I don't know yet what can be done, ie. I don't know what they replaced surfaces with and how we can deal with endianness on it, I'll have to catch up with you and Alex, look at available docs, etc... At this stage, it's hard to say which of these two problems is affecting Tai Tri. I would suggest he tries an R500 card such as X1650 or similar, this is what we use on Bimini and appears to work reasonably well, at least we have working surface swappers. I'll have to track down the pixman issue (if that's one, btw, I could use your help here Dave in isolating what X rendering ops are actually going wrong). For R6xx, I can't tell yet what we'll do. I have a 2400HD too so I should hopefully be able to work on that as soon as I find some spare time. Tai: What machine is this ? I can try to do a similar setup, I have a whole range of AMCC eval boards here, I was planning to give the X1650 a go some time next week or so in fact... Cheers, Ben. CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and contains information that is confidential and proprietary to Applied Micro Circuits Corporation or its subsidiaries. It is to be used solely for the purpose of furthering the parties' business relationship. All unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati
Re: [radeonhd] 0x94CC:0x1545:0x3650: Radeonhd RV610 HD2400Pro
On Wed, 2008-08-13 at 20:07 +1000, Dave Airlie wrote: On Wed, Aug 13, 2008 at 7:54 PM, Tai Tri Nguyen [EMAIL PROTECTED] wrote: Hi Alex, The driver I'm using is from git, the latest version of xf86-video-ati git clone git://anongit.freedesktop.org/git/xorg/driver/xf86-video-ati I have just downloaded and compiled again but the issue still exist. Video can display in 24 bits depth color but perform wrong color. Do you have any idea for the issue? I think this is a pixman bug, but I'm not sure, Ben you have any luck tracking it down further? Actually R600 doesn't have surfaces like previous cards so actually we need to program some bits of r600. Hi ! There are two different things at play here: - There may be a problem with pixman, I've had report of color issues with some apps / themes and I've observed some myself with the fedora default theme, on cards like r500 where we pretty much know the surface swapping works. - The r600 has no surfaces, at least not the way they used to work, so it's expected that colors will be wrong on that one. For the later, I don't know yet what can be done, ie. I don't know what they replaced surfaces with and how we can deal with endianness on it, I'll have to catch up with you and Alex, look at available docs, etc... At this stage, it's hard to say which of these two problems is affecting Tai Tri. I would suggest he tries an R500 card such as X1650 or similar, this is what we use on Bimini and appears to work reasonably well, at least we have working surface swappers. I'll have to track down the pixman issue (if that's one, btw, I could use your help here Dave in isolating what X rendering ops are actually going wrong). For R6xx, I can't tell yet what we'll do. I have a 2400HD too so I should hopefully be able to work on that as soon as I find some spare time. Tai: What machine is this ? I can try to do a similar setup, I have a whole range of AMCC eval boards here, I was planning to give the X1650 a go some time next week or so in fact... Cheers, Ben. ___ xorg-driver-ati mailing list xorg-driver-ati@lists.x.org http://lists.x.org/mailman/listinfo/xorg-driver-ati