Re: [PATCH v2] arch: Fix image and bitmap byte order for ppc64le

2014-06-16 Thread Benjamin Herrenschmidt
On Mon, 2014-02-24 at 11:36 +0100, Egbert Eich wrote:
 From: Dinar Valeev dval...@suse.com
 
 So far PPC was big endian for sure. For ppc64le this is no longer
 true.

What happened to this ? It never made it upstream...

Cheers,
Ben.

 Signed-off-by: Egbert Eich e...@freedesktop.org
 ---
  include/servermd.h | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)
 
 diff --git a/include/servermd.h b/include/servermd.h
 index 11f6c10..256d84b 100644
 --- a/include/servermd.h
 +++ b/include/servermd.h
 @@ -114,8 +114,13 @@ SOFTWARE.
  
  #if defined(__powerpc__) || defined(__ppc__) || defined(__ppc64__)
  
 -#define IMAGE_BYTE_ORDERMSBFirst
 -#define BITMAP_BIT_ORDERMSBFirst
 +#if defined(__LITTLE_ENDIAN__)
 +#define IMAGE_BYTE_ORDER  LSBFirst
 +#define BITMAP_BIT_ORDER  LSBFirst
 +#else
 +#define IMAGE_BYTE_ORDER  MSBFirst
 +#define BITMAP_BIT_ORDER  MSBFirst
 +#endif
  #define GLYPHPADBYTES   4
  
  #endif  /* PowerPC */


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: Bad DMA from CEDAR card

2012-10-23 Thread Benjamin Herrenschmidt
On Tue, 2012-10-23 at 18:42 +1100, Benjamin Herrenschmidt wrote:
 
 As you can see, it's not doing much before the failure:

Allright, that debug output is bad, it's missing a bunch of stuff,
due to a bad log level (the prink(KERN_DEBUG) in the atom debug
stuff doesn't work anymore new kernel btw)

Here's a better one:

 execute D7E2 (len 24, WS 0, PS 4)
   SET_ATI_PORT @ 0xD7E8
  port: 0 (MM)
   CLEAR_REG @ 0xD7EB
  dst: REG[0x19EC].[31:0] - 0x
   AND_REG @ 0xD7EF
  dst: REG[0x19E4].[7:0] - 0x04
  src: IMM 0xFE
  dst: REG[0x19E4].[7:0] - 0x04
   OR_REG @ 0xD7F4
  dst: REG[0x19E4].[7:0] - 0x04
  src: PS[0x00,0x].[7:0] - 0x00
  dst: REG[0x19E4].[7:0] - 0x04
   EOT @ 0xD7F9

 execute D7CA (len 24, WS 0, PS 4)
   SET_ATI_PORT @ 0xD7D0
  port: 0 (MM)
   CLEAR_REG @ 0xD7D3
  dst: REG[0x19AC].[31:0] - 0x
   AND_REG @ 0xD7D7
  dst: REG[0x19A4].[7:0] - 0x04
  src: IMM 0xFE
  dst: REG[0x19A4].[7:0] - 0x04
   OR_REG @ 0xD7DC
  dst: REG[0x19A4].[7:0] - 0x04
  src: PS[0x00,0x].[7:0] - 0x00
  dst: REG[0x19A4].[7:0] - 0x04
   EOT @ 0xD7E1

 execute BADE (len 25, WS 0, PS 0)
0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Bad DMA from CEDAR card

2012-10-23 Thread Benjamin Herrenschmidt
Hi Folks !

I've been tracking a problem on POWER server with a CEDAR card (FirePro
2270 X1 server) on a POWER machine.

The cool thing is that our PCI-E bridges fancy error handling is kicking
in and freezing all to that device traffic as soon as the error is
detected. By sprinkling some test code around, here is what I have
found:

 - The initial symptom is that the system reports an EEH (our error
handling system) error on the card at boot. Under our normal firmware 
Linux, that pretty much causes the card to be taken out. 

 - With some hand made debug, I have got some more details about the
error which is a bad DMA write from the card to an address that isn't
mapped in our IOMMU. (Ie something that wasn't the result of a
dma_map_sg or dma_alloc_coherent, etc...)

 - The error seems to always happen while ATOM is executing from the
atom_enable_crtc table called from:

[c01f7e206fb0] [c0440148] .atombios_enable_crtc+0x38/0x50
[c01f7e207030] [c0441784] .atombios_crtc_dpms+0x104/0x1a0
[c01f7e2070c0] [c0441d68] .atombios_crtc_disable+0x28/0x170
[c01f7e207190] [c03e6ef4] 
.drm_helper_disable_unused_functions+0x144/0x230
[c01f7e207230] [c03e69ec] .drm_fb_helper_initial_config+0x5c/0x310
[c01f7e207340] [c046862c] .radeon_fbdev_init+0xdc/0x190
[c01f7e2073e0] [c04620c0] .radeon_modeset_init+0x740/0xc90
[c01f7e2074b0] [c0438cfc] .radeon_driver_load_kms+0x14c/0x1a0
[c01f7e207550] [c03f6e14] .drm_get_pci_dev+0x1c4/0x2e0
[c01f7e207600] [c079aea8] .radeon_pci_probe+0xc4/0xe8
[c01f7e207690] [c0378d60] .pci_device_probe+0x1a0/0x1c0
[c01f7e207740] [c04db054] .driver_probe_device+0xe4/0x2c0
[c01f7e2077e0] [c04db33c] .__driver_attach+0x10c/0x110
[c01f7e207870] [c04d8b2c] .bus_for_each_dev+0x8c/0xe0
[c01f7e207920] [c04daaa8] .driver_attach+0x28/0x40
[c01f7e2079a0] [c04da3d8] .bus_add_driver+0x228/0x300
[c01f7e207a40] [c04db8b0] .driver_register+0xa0/0x1e0
[c01f7e207ae0] [c0378eb8] .__pci_register_driver+0x48/0x60
[c01f7e207b60] [c03f709c] .drm_pci_init+0x16c/0x1a0
[c01f7e207c10] [c09eae5c] .radeon_init+0xb4/0xd0

The error is detected on an atom_op_jump() that loops for ever due to
the isolation which means that all MMIOs are returning 's. The
actual error might have happened slightly earlier (see below).

. From the backtrace, it seems to be trying to *disable* CRTCs (I would
have understood if it was trying to incorrectly enable one which is
sourcing pixels from the wrong address...)

 - I've added various delays in all sort of stages of
radeon_modeset_init() and radeon_fbdev_init(), and the error still
appears to be fairly well localized to the execution of that table, so
it looks like it's not some stray DMA that happens to hit at that moment
due to some timing, but something specifically triggered by that table
execution.

 - I've turned on atom_debug right before the call to
drm_fb_helper_initial_config() in radeon_fbdev_init() and added a freeze
check between each op, and here's the result. I don't really have the
brain cycles to try to parse that right now :-) I used to back then but
heh, it's a long time ago... that's where I'm handing you the hot potato
hoping it will make some obvious sense :-)

As you can see, it's not doing much before the failure:

 execute D7E2 (len 24, WS 0, PS 4)
   SET_ATI_PORT @ 0xD7E8
  port: 0 (MM)
   CLEAR_REG @ 0xD7EB
  dst: 
   AND_REG @ 0xD7EF
  dst: 
  src: 
  dst: 
   OR_REG @ 0xD7F4
  dst: 
  src: 
  dst: 
   EOT @ 0xD7F9

 execute D7CA (len 24, WS 0, PS 4)
   SET_ATI_PORT @ 0xD7D0
  port: 0 (MM)
   CLEAR_REG @ 0xD7D3
  dst: 
   AND_REG @ 0xD7D7
  dst: 
  src: 
  dst: 
   OR_REG @ 0xD7DC
  dst: 
  src: 
  dst: 
   EOT @ 0xD7E1

 execute BADE (len 25, WS 0, PS 0)
0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9

[ here we have detected the freeze, the stuff below is my own diagnostic
  code, it I will decrypt some of it later if it's of use, basically it
  says the freeze occurred due to a DMA error to an unmapped DMA address
  though I am not 100% sure of the actual DMA address, I think what it
  gives me is actually the address of the iommu PTE that had the valid
  bit clear, I need to do some work to turn that back into a page or
  something ... The packet hasn't been captured in the TLP header capture
  of the AER function unfortunately.
 ]

PHB 1 diagnostic data:
  brdgCtl  = 0x0002
  portStatusReg= 0x
  rootCmplxStatus  = 0x
  busAgentStatus   = 0x
  deviceStatus = 0x000f
  slotStatus   = 0x016003c0
  linkStatus   = 0xa0120008
  devCmdStatus = 0x00100147
  devSecStatus = 0x
  rootErrorStatus  = 0x
  uncorrErrorStatus= 0x
  

Re: Bad DMA from CEDAR card

2012-10-23 Thread Benjamin Herrenschmidt
On Tue, 2012-10-23 at 18:54 +1100, Benjamin Herrenschmidt wrote:
 On Tue, 2012-10-23 at 18:42 +1100, Benjamin Herrenschmidt wrote:
  
  As you can see, it's not doing much before the failure:
 
 Allright, that debug output is bad, it's missing a bunch of stuff,
 due to a bad log level (the prink(KERN_DEBUG) in the atom debug
 stuff doesn't work anymore new kernel btw)

More data: I've done a bit of AtomDis under Dave instructions and
improved my tracing, and what it looks like is we run those 3 tables in
that order:

DAC1OutputControl
DAC2OutputControl
EnableCRTCMemReq

The error happens somewhere between the end of DAC2OutputControl and
early in EnableCRTCMemReq.

It all comes from drm_helper_disable_unused_functions(), the first two I
suspect as a result of drm_encoder_disable() and the last one as a
result of crtc_funcs-disable which itself goes into dpms.

I don't know (yet) whether anything happens in between that doesn't go
via ATOM, in which case that wouldn't be traced. That's the next thing
to check (including interrupts though we shouldn't be getting any at
this stage afaik).

Now here is the trace. As before, I enable atom_debug right before that
sequence, after I've done a 1s pause and a freeze check which passes, so
I'm reasonably confident the card is still somewhat in a sane state.

I then run the tables, with a hack that adds a 200ms pause after each
op, followed by a freeze check which I moved to after the ops instead of
before, so we don't miss a freeze caused by the last op of a table.

Here's the results so far:

 execute D7E2 (len 24, WS 0, PS 4)
   SET_ATI_PORT @ 0xD7E8
  port: 0 (MM)
   CLEAR_REG @ 0xD7EB
  dst: REG[0x19EC].[31:0] - 0x
   AND_REG @ 0xD7EF
  dst: REG[0x19E4].[7:0] - 0x04
  src: IMM 0xFE
  dst: REG[0x19E4].[7:0] - 0x04
   OR_REG @ 0xD7F4
  dst: REG[0x19E4].[7:0] - 0x04
  src: PS[0x00,0x].[7:0] - 0x00
  dst: REG[0x19E4].[7:0] - 0x04
   EOT @ 0xD7F9

 execute D7CA (len 24, WS 0, PS 4)
   SET_ATI_PORT @ 0xD7D0
  port: 0 (MM)
   CLEAR_REG @ 0xD7D3
  dst: REG[0x19AC].[31:0] - 0x
   AND_REG @ 0xD7D7
  dst: REG[0x19A4].[7:0] - 0x04
  src: IMM 0xFE
  dst: REG[0x19A4].[7:0] - 0x04
   OR_REG @ 0xD7DC
  dst: REG[0x19A4].[7:0] - 0x04
  src: PS[0x00,0x].[7:0] - 0x00
  dst: REG[0x19A4].[7:0] - 0x04
   EOT @ 0xD7E1

 execute BADE (len 25, WS 0, PS 0)
   SET_ATI_PORT @ 0xBAE4
  port: 0 (MM)
0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9

Interestingly enough, it _does_ go into EnableCRTCMemreq tho it
doesn't seem to have time to do anything in there before it
detects the failure. However it also doesn't appear to detect
it despite the delays on the last instruction of DAC2OutputControl.

I do wonder whether we end up actually doing something *else*
in between those two, that isn't going through ATOM. I'll try to
dig in that direction next.

The disassembly of those tables matches the output:

command_table  d7ca  #44  (DAC1OutputControl):

  Size 0018
  Format Rev.  01
  Param Rev.   00
  Content Rev. 01
  Attributes:  Work space size00 longs
   Parameter space size   01 longs
   Table update indicator 0

  0006: 37SET_ATI_PORT    (INDIRECT_IO_MM)
  0009: 5400ac19  CLEAR  reg[19ac]  []
  000d: 0725a419feANDreg[19a4]  [...X]  -  fe
  0012: 0d21a41900OR reg[19a4]  [...X]  -  param[00]  [...X]
  0017: 5bEOT

command_table  d7e2  #45  (DAC2OutputControl):

  Size 0018
  Format Rev.  01
  Param Rev.   00
  Content Rev. 01
  Attributes:  Work space size00 longs
   Parameter space size   01 longs
   Table update indicator 0

  0006: 37SET_ATI_PORT    (INDIRECT_IO_MM)
  0009: 5400ec19  CLEAR  reg[19ec]  []
  000d: 0725e419feANDreg[19e4]  [...X]  -  fe
  0012: 0d21e41900OR reg[19e4]  [...X]  -  param[00]  [...X]
  0017: 5bEOT

Cheers,
Ben.


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: Bad DMA from CEDAR card

2012-10-23 Thread Benjamin Herrenschmidt
On Tue, 2012-10-23 at 21:45 +1100, Benjamin Herrenschmidt wrote:
 On Tue, 2012-10-23 at 18:54 +1100, Benjamin Herrenschmidt wrote:
  On Tue, 2012-10-23 at 18:42 +1100, Benjamin Herrenschmidt wrote:
   
   As you can see, it's not doing much before the failure:
  
  Allright, that debug output is bad, it's missing a bunch of stuff,
  due to a bad log level (the prink(KERN_DEBUG) in the atom debug
  stuff doesn't work anymore new kernel btw)
 
 More data: I've done a bit of AtomDis under Dave instructions and
 improved my tracing, and what it looks like is we run those 3 tables in
 that order:

And more :-)

 .../...

 I don't know (yet) whether anything happens in between that doesn't go
 via ATOM, in which case that wouldn't be traced. That's the next thing
 to check (including interrupts though we shouldn't be getting any at
 this stage afaik).

So I think it's in between. From what I can tell, the error happens
somewhere inside the call to drm_vblank_pre_modeset() from
atombios_crtc_dpms().

This is actually a bit nasty, but basically that pre_modeset() calls
drm_vblank_get() which enables vblanks.

Now, it doesn't look like it's actually taking any interrupt ... Well
assuming this is indeed using the irq handler in evergreen.c, which
doesn't appear to be called, but I might have confused my ASICs and
missed something specific to CEDAR here.

Here's what I've traced so far:

evergreen_irq_set: vblank 0
evergreen_irq_set: hpd 1
evergreen_irq_set: hpd 2
evergreen_irq_set: hpd 3
evergreen_irq_set: hpd 4
0001:01:00.0: EEH freeze detected, fstate=3 pcierr=9 msg: irqset 2

What that output means is that it called evergreen_irq_set() which
enables vblank0 (and various hpd's but those were already enabled), and
the freeze is detected at the tracepoint irqset 2 that I added in
there.

This point is basically right at the end of evergreen_irq_set(), where I
do a 500ms delay and check for freeze. A previous trace point right
before writing to CP_INT_CNTL didn't show any freeze.

Now the interrupt being an MSI, it's a memory store ... I had a vague
memory of one of you guys mentioning address limitations to 40-bit or so
in the radeon, though I though that shouldn't affect MSIs right ?
Well ...

Our 64-bit MSIs are actually using all 64-bit address bits. If the
radeon doesn't do that properly and crops the address bits, the MSIs are
going to hit wrong, right in the middle of nowhere, possibly some DMA
space.

So I hacked my platform code to force it to only hand out 32-bit MSI
addresses and guess what ? ... the problem seems to be gone. Ouch.

That's really nasty. Supporting only a subset of the PCI address space
for DMA was already fairly nasty to begin with, but not doing the full
MSI addresses looks like a clear violation of the PCIe spec :-(

I'll do some more tests tomorrow to confirm whether that is the problem
or not at which point, if it is, we'll need some kind of quirk to
indicate that it supports only MSI32 and not MSI64 or something along
those lines. Guys, go shoot your HW engineers please.

Cheers,
Ben.


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: Bad DMA from CEDAR card

2012-10-23 Thread Benjamin Herrenschmidt
On Tue, 2012-10-23 at 09:34 -0400, Alex Deucher wrote:
 
  I'll do some more tests tomorrow to confirm whether that is the problem
  or not at which point, if it is, we'll need some kind of quirk to
  indicate that it supports only MSI32 and not MSI64 or something along
  those lines. Guys, go shoot your HW engineers please.
 
 
 Well, we only support a 40 bit DMA mask, so I suspect MSIs are limited
 to 40 bits as well.

Well, there is a big difference here... limiting yourself to
configurations up to 1TB of RAM for the purpose of DMA is somewhat
understandable (ignoring that in NUMA setups etc... memory might
actually cover more address space than the amount of RAM etc...) but
MSI's are a whole different thing.

They do not hit memory, they are decoded as special addresses by the
bridge , and can potentially be anywhere.

Some of our bridges only support MSIs high up in 64-bit space for
example.

Anyway, I will double check today whether that is indeed the problem I'm
hitting though from what I can tell from here, it looks like it.

Cheers,
Ben.


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: [PATCH] Fix ConnectorTable crash in radeon_output.c

2012-05-02 Thread Benjamin Herrenschmidt
On Wed, 2012-05-02 at 17:59 +0200, Hans Verkuil wrote:
  For the latter, you could try if hacking the driver as described in
  http://lists.ozlabs.org/pipermail/linuxppc-dev/2012-April/097668.html
  helps.
 
 I tried this but that had no effect that I could see.
 
  For the former, Benjamin Herrenschmidt (CC'd) might be able to
  point you to some patches I keep hearing about but haven't seen yet. 
 
 I'd be happy to test out any patches you might have. The radeon driver as
 is is actually way better than the old driver if it wasn't for the wrong
 colors. 

The 32-bit patches are still afaik in Tony's hands (CC) :-) He and his
wife just had a new kid so he might not be -that- responsive but I'm
hoping hell release those patches soon (Tony !)

Cheers,
Ben.


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Multitouch regression in 3.3 on thinkpad X220 clickpad

2012-04-20 Thread Benjamin Herrenschmidt
Hi folks !

So Peter and I have been discussing a problem I observed on my brand new
ThinkPad X220 and it's multitouch clickpad. I've been digging a bit
more today and bisected the regression to:

commit 7968a5dd492ccc38345013e534ad4c8d6eb60ed1
Input: synaptics - add support for Relative mode

Without that commit, I can draw two traces when using two fingers in
mtview (after removing the device from X), each follow one finger.

With that commit applied, this doesn't work anymore: it appears to be
unable to track more than one finger. IE. If i put a finger on the
touchpad, I can draw, but a second finger is then ignored (or causes one
or two spots in a random place to appear but that's about it).

Now other multitouch operations such as two finger scrolling seem to
work in X. I'm not familiar with the inner workings of the input layer
or the synaptic touchpads. I'll dig a bit more this week-end if I have
time, but it would save me plenty of that precious time if you guys
could hint me at things to look at :-)

I have a couple more bits of interesting info.

If I boot with a bad driver. rmmod it. insmod a good one (one build
from the same kernel tree set to the commit right before the one above)
then things work fine. The interesting bit is that if I then rmmod it
and insmod the bad one again ... it still works. So it looks like it
might have something to do with how the touchpad is initialized.

Some more info about the touchpad itself from dmesg:

psmouse serio1: synaptics: Touchpad model: 1, fw: 8.0, id: 0x1e2b1, caps: 
0xd001a3/0x940300/0x120c00
psmouse serio1: synaptics: serio: Synaptics pass-through port at 
isa0060/serio1/input0
input: SynPS/2 Synaptics TouchPad as 
/devices/platform/i8042/serio1/input/input13

Let me know if there's any other info of value I can collect. 

Cheers,
Ben.


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: Multitouch regression in 3.3 on thinkpad X220 clickpad

2012-04-20 Thread Benjamin Herrenschmidt
On Fri, 2012-04-20 at 10:01 -0700, Dmitry Torokhov wrote:

 It looks we lost a condition in synaptics_set_advanced_gesture_mode().
 It used to be:
 
 
   if (!(SYN_CAP_ADV_GESTURE(priv-ext_cap_0c) ||
   SYN_CAP_IMAGE_SENSOR(priv-ext_cap_0c)))
   return 0;
 
 and now simply is:
 
   if (!SYN_CAP_ADV_GESTURE(priv-ext_cap_0c))
   return 0;
 
 Could you try restoring the condition and see if it fixes the
 regression?

Yes, that's it. Please shoot the patch below to Linus.

Thanks,
Ben.

input/synaptics: Fix regression with image sensor trackpads

commit 7968a5dd492ccc38345013e534ad4c8d6eb60ed1
Input: synaptics - add support for Relative mode

Accidentally broke support for advanced gestures (multitouch)
on some trackpads such as the one in my ThinkPad X220 by
incorretly changing the condition for enabling them. This
restores it.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
CC: sta...@kernel.org [3.3]

diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index a3bb49c..37b3792 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -273,7 +273,8 @@ static int synaptics_set_advanced_gesture_mode(struct 
psmouse *psmouse)
static unsigned char param = 0xc8;
struct synaptics_data *priv = psmouse-private;
 
-   if (!SYN_CAP_ADV_GESTURE(priv-ext_cap_0c))
+   if (!(SYN_CAP_ADV_GESTURE(priv-ext_cap_0c) ||
+ SYN_CAP_IMAGE_SENSOR(priv-ext_cap_0c)))
return 0;
 
if (psmouse_sliced_command(psmouse, SYN_QUE_MODEL))


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] RADEONCopySwap: Fix RADEON_HOST_DATA_SWAP_16BIT case.

2012-04-16 Thread Benjamin Herrenschmidt
On Mon, 2012-04-16 at 15:39 +0200, Michel Dänzer wrote:
 From: Michel Dänzer michel.daen...@amd.com
 
 It was the same code as for RADEON_HOST_DATA_SWAP_32BIT. This caused bus 
 errors
 on FreeBSD/PPC, but I'm not sure how it could not cause problems anywhere...

What kind of bus errors ? Alignment faults ? I think we go lucky the
kernel fixed them up on Linux at a performance penalty :-)

Cheers,
Ben.

 Reported-by: Andreas Tobler andre...@fgznet.ch
 Signed-off-by: Michel Dänzer michel.daen...@amd.com
 ---
  src/radeon_accel.c |5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)
 
 diff --git a/src/radeon_accel.c b/src/radeon_accel.c
 index 1cebcf6..784502d 100644
 --- a/src/radeon_accel.c
 +++ b/src/radeon_accel.c
 @@ -982,10 +982,9 @@ void RADEONCopySwap(uint8_t *dst, uint8_t *src, unsigned 
 int size, int swap)
  
   for (; nwords  0; --nwords, ++d, ++s)
  #ifdef __powerpc__
 - asm volatile(stwbrx %0,0,%1 : : r (*s), r (d));
 + asm volatile(sthbrx %0,0,%1 : : r (*s), r (d));
  #else
 - *d = ((*s  24)  0xff) | ((*s  8)  0xff00)
 - | ((*s  0xff00)  8) | ((*s  0xff)  24);
 + *d = (*s  8) | (*s  8);
  #endif
   return;
   }


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH] ddx/evergreen: Fix endian of ALU constants

2011-11-20 Thread Benjamin Herrenschmidt
The constants are written directly into a buffer object shared with the
card and we forget to swap them. This patch fixes it by doing the swap
in evergreen_set_alu_consts() in-place (ie, it modifies the buffer),
which should be fine with the way we use it in the ddx.

This makes everything work fine on my caicos card on a G5 including some
quik tests with Xv, gnome3 shell, etc...

Thanks a lot to Jerome Glisse for holding my hand through debugging that
(and finding the actual bug).

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

diff --git a/src/evergreen_accel.c b/src/evergreen_accel.c
index 5c95e20..83320c8 100644
--- a/src/evergreen_accel.c
+++ b/src/evergreen_accel.c
@@ -479,6 +479,17 @@ evergreen_set_alu_consts(ScrnInfoPtr pScrn, const_config_t 
*const_conf, uint32_t
 if (size == 0)
size = 1;
 
+#if X_BYTE_ORDER == X_BIG_ENDIAN
+{
+   uint32_t count = size  4, *p = const_conf-cpu_ptr;
+
+   while(count--) {
+   *p = cpu_to_le32(*p);
+   p++;
+   }
+}
+#endif
+
 /* flush SQ cache */
 evergreen_cp_set_surface_sync(pScrn, SH_ACTION_ENA_bit,
  const_conf-size_bytes, 
const_conf-const_addr,
diff --git a/src/evergreen_exa.c b/src/evergreen_exa.c
index 6becbb3..603d854 100644
--- a/src/evergreen_exa.c
+++ b/src/evergreen_exa.c
@@ -172,6 +172,7 @@ EVERGREENPrepareSolid(PixmapPtr pPix, int alu, Pixel pm, 
Pixel fg)
 ps_alu_consts = radeon_vbo_space(pScrn, accel_state-cbuf, 256);
 ps_const_conf.bo = accel_state-cbuf.vb_bo;
 ps_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + 
accel_state-cbuf.vb_offset;
+ps_const_conf.cpu_ptr = (uint32_t *)(char *)ps_alu_consts;
 if (accel_state-dst_obj.bpp == 16) {
r = (fg  11)  0x1f;
g = (fg  5)  0x3f;
@@ -1320,6 +1321,7 @@ static Bool EVERGREENPrepareComposite(int op, PicturePtr 
pSrcPicture,
 vs_const_conf.bo = accel_state-cbuf.vb_bo;
 vs_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + 
accel_state-cbuf.vb_offset;
 
+vs_const_conf.cpu_ptr = (uint32_t *)(char *)cbuf;
 EVERGREENXFormSetup(pSrcPicture, pSrc, 0, cbuf);
 if (pMask)
 EVERGREENXFormSetup(pMaskPicture, pMask, 1, cbuf);
diff --git a/src/evergreen_state.h b/src/evergreen_state.h
index 40fec22..5fd85f8 100644
--- a/src/evergreen_state.h
+++ b/src/evergreen_state.h
@@ -120,6 +120,7 @@ typedef struct {
 int size_bytes;
 uint64_t const_addr;
 struct radeon_bo *bo;
+uint32_t *cpu_ptr;
 } const_config_t;
 
 /* Vertex buffer / vtx resource */
diff --git a/src/evergreen_textured_videofuncs.c 
b/src/evergreen_textured_videofuncs.c
index 6200cdc..d27a02a 100644
--- a/src/evergreen_textured_videofuncs.c
+++ b/src/evergreen_textured_videofuncs.c
@@ -449,6 +449,7 @@ EVERGREENDisplayTexturedVideo(ScrnInfoPtr pScrn, 
RADEONPortPrivPtr pPriv)
 ps_alu_consts = radeon_vbo_space(pScrn, accel_state-cbuf, 256);
 ps_const_conf.bo = accel_state-cbuf.vb_bo;
 ps_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + 
accel_state-cbuf.vb_offset;
+ps_const_conf.cpu_ptr = (uint32_t *)(char *)ps_alu_consts;
 
 ps_alu_consts[0] = off[0];
 ps_alu_consts[1] = off[1];
@@ -474,6 +475,7 @@ EVERGREENDisplayTexturedVideo(ScrnInfoPtr pScrn, 
RADEONPortPrivPtr pPriv)
 vs_alu_consts = radeon_vbo_space(pScrn, accel_state-cbuf, 256);
 vs_const_conf.bo = accel_state-cbuf.vb_bo;
 vs_const_conf.const_addr = accel_state-cbuf.vb_mc_addr + 
accel_state-cbuf.vb_offset;
+vs_const_conf.cpu_ptr = (uint32_t *)(char *)vs_alu_consts;
 
 vs_alu_consts[0] = 1.0 / pPriv-w;
 vs_alu_consts[1] = 1.0 / pPriv-h;




___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH] xserver: Fix double free's in config file parser

2011-07-14 Thread Benjamin Herrenschmidt
Feeding the parser a bad config file, I crashed the server a few times.

It looks like whoever free's val.str (yeah, val is a global ..  yuck)
is also responsible for clearing the pointer or something else might try
to free it again some time later.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

diff -urN xorg-server-1.10.2.902.orig/hw/xfree86/parser/Input.c 
xorg-server-1.10.2.902/hw/xfree86/parser/Input.c
--- xorg-server-1.10.2.902.orig/hw/xfree86/parser/Input.c   2011-02-25 
14:27:14.0 +1100
+++ xorg-server-1.10.2.902/hw/xfree86/parser/Input.c2011-07-14 
16:57:18.912426863 +1000
@@ -106,6 +106,7 @@
 if (strcmp(val.str, keyboard) == 0) {
 ptr-inp_driver = strdup(kbd);
 free(val.str);
+   val.str = NULL;
 }
 else
ptr-inp_driver = val.str;
diff -urN xorg-server-1.10.2.902.orig/hw/xfree86/parser/InputClass.c 
xorg-server-1.10.2.902/hw/xfree86/parser/InputClass.c
--- xorg-server-1.10.2.902.orig/hw/xfree86/parser/InputClass.c  2011-02-25 
14:27:14.0 +1100
+++ xorg-server-1.10.2.902/hw/xfree86/parser/InputClass.c   2011-07-14 
16:57:28.608402699 +1000
@@ -114,6 +114,7 @@
 if (strcmp(val.str, keyboard) == 0) {
 ptr-driver = strdup(kbd);
 free(val.str);
+   val.str = NULL;
 }
 else
 ptr-driver = val.str;
diff -urN xorg-server-1.10.2.902.orig/hw/xfree86/parser/Screen.c 
xorg-server-1.10.2.902/hw/xfree86/parser/Screen.c
--- xorg-server-1.10.2.902.orig/hw/xfree86/parser/Screen.c  2011-02-25 
14:27:14.0 +1100
+++ xorg-server-1.10.2.902/hw/xfree86/parser/Screen.c   2011-07-14 
16:56:38.492527593 +1000
@@ -316,6 +316,7 @@
Error (QUOTE_MSG, SubSection);
{
free(val.str);
+   val.str = NULL;
HANDLE_LIST (scrn_display_lst, 
xf86parseDisplaySubSection,
 XF86ConfDisplayPtr);
}


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] xserver: Fix double free's in config file parser

2011-07-14 Thread Benjamin Herrenschmidt
On Thu, 2011-07-14 at 07:57 -0700, Dan Nicholson wrote:
 I also see a couple instances of free (val.str) in parser/Files.c
 that don't set it to NULL afterward. Yay for custom parsers! With
 those two instances fixed:

Hrm, did I have a grep failure ? Oh... I see, I didn't catch the space
between free and (val.str). I can send a newer patch tomorrow, or an
addon, let me know, I'm off to bed now.

Cheers,
Ben.


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH] xserver: Fix double free's in config file parser

2011-07-14 Thread Benjamin Herrenschmidt
On Thu, 2011-07-14 at 09:26 -0700, Keith Packard wrote:
 On Fri, 15 Jul 2011 01:36:05 +1000, Benjamin Herrenschmidt 
 b...@kernel.crashing.org wrote:
  On Thu, 2011-07-14 at 07:57 -0700, Dan Nicholson wrote:
   I also see a couple instances of free (val.str) in parser/Files.c
   that don't set it to NULL afterward. Yay for custom parsers! With
   those two instances fixed:
  
  Hrm, did I have a grep failure ? Oh... I see, I didn't catch the space
  between free and (val.str). I can send a newer patch tomorrow, or an
  addon, let me know, I'm off to bed now.
 
 Might as well send a single patch which fixes all of them in one go.
 
 I also wouldn't mind seeing a patch to replace 'val' with a slightly
 more appropriate name :-)

Somewhat I knew that's what I was going to end up doing :-)

Will take a bit longer, I'm in the middle of some other more urgent
stuff (unless somebody else wants to pick it up).

I need to look more at how this parser works but it should be possible
to have a more robust handling of the lifetime of that current token
anyways.

Something, like freeing it when we try to replace it (or some other
common place like at the end of the loops) rather than having each
case that can potentially consume a string have to have the right 2
statements in the right place, that sort of thing...

Might even be possible to stop making it a global I suppose.

Anyway, don't hold your breath on me doing that real soon, I'll fixup
the patch for the immediate bug(s) later today hopefully and will try to
go back to those other improvements later, maybe next week but no
promise.

Cheers,
Ben.


___
xorg-devel@lists.x.org: X.Org development
Archives: http://lists.x.org/archives/xorg-devel
Info: http://lists.x.org/mailman/listinfo/xorg-devel


Re: [PATCH 3/6] drm/radeon: Writeback endian fixes

2011-07-14 Thread Benjamin Herrenschmidt
On Thu, 2011-07-14 at 17:19 +0200, Michel Dänzer wrote:
 On Mit, 2011-07-13 at 16:28 +1000, Benjamin Herrenschmidt wrote:
  The writeback ring pointer and IH ring pointer are read using le32_to_cpu
  so we do not want the chip to byteswap them on big-endian.
  
  We still want to byteswap the ring itself and the IBs, so we don't touch
  that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and
  IH_CNTL.
  
  In general, for things like that where we control all the accessors easily,
  we are better off doing the swap in SW rather than HW. Paradoxally, it does
  keep the code closer to x86 and avoid using poorly tested HW features.
 
 Absolutely. Unfortunately, when I fixed the CP writeback code to use
 le32_to_cpu(), I didn't realize the code for some GPU families was
 already using HW swappers for this. 
 
 
  I also changed the use of RADEON_ to R600_ in a couple of cases to be more
  consistent with the surrounding code.
 
 That should probably be in a separate patch. Either way, though:

I thought about it and decided it was trivial enough not to bother
re-doing the patches.

Alex/Dave/whoever's in charge, feel free to apply the current batch,
I'll send further cleanups/fixes as separate patches, possibly not
before next week or so.

Cheers,
Ben.

 Reviewed-by: Michel Dänzer mic...@daenzer.net



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: [PATCH 4/6] drm/radeon: Add a rmb() in IH processing

2011-07-14 Thread Benjamin Herrenschmidt
On Fri, 2011-07-15 at 04:19 +, Matt Turner wrote:
 On Wed, Jul 13, 2011 at 6:28 AM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:
  We should have a read memory barrier between reading the WPTR from
  memory and reading ring entries based on that value (ie, we need to
  ensure both loads are done in order by the CPU).
 
  It could be argued that the MMIO reads in r600_ack_irq() might be
  enough to get that barrier but I prefer keeping an explicit one just
  in case.
 
  Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
  ---
 
  (resent adding dri-devel to the CC list to hit patchwork)
 
  drivers/gpu/drm/radeon/r600.c |3 +++
   1 files changed, 3 insertions(+), 0 deletions(-)
 
  diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
  index 3c86b15..7e5c801 100644
  --- a/drivers/gpu/drm/radeon/r600.c
  +++ b/drivers/gpu/drm/radeon/r600.c
  @@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev)
 }
 
   restart_ih:
  +   /* Order reading of wptr vs. reading of IH ring data */
  +   wmb();
  +
 /* display interrupts */
 r600_irq_ack(rdev);
 
 The subject line says rmb(), but this says wmb(). Just want to verify
 what you have is correct.

Nice spotting, it's a typo and should have been rmb(). I'll fix it and
respin.

Cheers,
Ben

___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 6/6] drm/radeon: Fix the definition of RADEON_BUF_SWAP_32BIT

2011-07-13 Thread Benjamin Herrenschmidt
(Note that this is duplicated under various other names such
as R600_BUF_SWAP_32BIT etc...). At least now all the definitions
agree.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

(resent adding dri-devel to the CC list to hit patchwork)

 drivers/gpu/drm/radeon/radeon_reg.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_reg.h 
b/drivers/gpu/drm/radeon/radeon_reg.h
index ec93a75..0828add1 100644
--- a/drivers/gpu/drm/radeon/radeon_reg.h
+++ b/drivers/gpu/drm/radeon/radeon_reg.h
@@ -3293,7 +3293,7 @@
 #  define RADEON_RB_BUFSZ_MASK (0x3f  0)
 #  define RADEON_RB_BLKSZ_SHIFT8
 #  define RADEON_RB_BLKSZ_MASK (0x3f  8)
-#  define RADEON_BUF_SWAP_32BIT(1  17)
+#  define RADEON_BUF_SWAP_32BIT(2  16)
 #  define RADEON_MAX_FETCH_SHIFT   18
 #  define RADEON_MAX_FETCH_MASK(0x3  18)
 #  define RADEON_RB_NO_UPDATE  (1  27)



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 4/6] drm/radeon: Add a rmb() in IH processing

2011-07-13 Thread Benjamin Herrenschmidt
We should have a read memory barrier between reading the WPTR from
memory and reading ring entries based on that value (ie, we need to
ensure both loads are done in order by the CPU).

It could be argued that the MMIO reads in r600_ack_irq() might be
enough to get that barrier but I prefer keeping an explicit one just
in case.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 
(resent adding dri-devel to the CC list to hit patchwork)

drivers/gpu/drm/radeon/r600.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 3c86b15..7e5c801 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev)
}
 
 restart_ih:
+   /* Order reading of wptr vs. reading of IH ring data */
+   wmb();
+
/* display interrupts */
r600_irq_ack(rdev);
 



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 5/6] drm/radeon: Do an MMIO read on interrupts when not uisng MSIs

2011-07-13 Thread Benjamin Herrenschmidt
When not using MSIs, there is no guarantee that DMA from the device
has been fully flushed to point where it's visible to the CPU when
taking an interrupt. To get this guarantee, we need to perform an
MMIO read from the device, which will flush all outstanding DMAs
from bridges between the device and the system.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

 (resent adding dri-devel to the CC list to hit patchwork)

drivers/gpu/drm/radeon/r600.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 7e5c801..25b2dab 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -3300,6 +3300,10 @@ int r600_irq_process(struct radeon_device *rdev)
if (!rdev-ih.enabled || rdev-shutdown)
return IRQ_NONE;
 
+   /* No MSIs, need a dummy read to flush PCI DMAs */
+   if (!rdev-msi_enabled)
+   RREG32(IH_RB_WPTR);
+
wptr = r600_get_ih_wptr(rdev);
rptr = rdev-ih.rptr;
DRM_DEBUG(r600_irq_process start: rptr %d, wptr %d\n, rptr, wptr);



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 2/6] drm/radeon: ATOM Endian fix for atombios_crtc_program_pll()

2011-07-13 Thread Benjamin Herrenschmidt
v6 of the structure was programmed incorrectly:

  args.v6.ulCrtcPclkFreq.ulPixelClock = cpu_to_le32(clock / 10);

ulPixelClock is a 24-bit bitfield. This statement would thus
do a 32-bit swap of (clock / 10) and drop the top 8 bits which
are ... the LSB. Not what we want. Instead use masks  shifts.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

(resent adding dri-devel to the CC list to hit patchwork)

 drivers/gpu/drm/radeon/atombios_crtc.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c 
b/drivers/gpu/drm/radeon/atombios_crtc.c
index 9541995..c742944 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -764,7 +764,7 @@ static void atombios_crtc_set_dcpll(struct drm_crtc *crtc,
 }
 
 static void atombios_crtc_program_pll(struct drm_crtc *crtc,
- int crtc_id,
+ u32 crtc_id,
  int pll_id,
  u32 encoder_mode,
  u32 encoder_id,
@@ -851,8 +851,7 @@ static void atombios_crtc_program_pll(struct drm_crtc *crtc,
args.v5.ucPpll = pll_id;
break;
case 6:
-   args.v6.ulCrtcPclkFreq.ucCRTC = crtc_id;
-   args.v6.ulCrtcPclkFreq.ulPixelClock = cpu_to_le32(clock 
/ 10);
+   args.v6.ulDispEngClkFreq = cpu_to_le32(crtc_id  24 | 
clock / 10);
args.v6.ucRefDiv = ref_div;
args.v6.usFbDiv = cpu_to_le16(fb_div);
args.v6.ulFbDivDecFrac = cpu_to_le32(frac_fb_div * 
10);



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 1/6] drm/radeon: Remove a bunch of useless _iomem casts

2011-07-13 Thread Benjamin Herrenschmidt
Just defining rdev-rmmio properly in the first place should do
the trick. In some cases, the cast were also complete dups as
the original variable was already of the right type.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

(resent adding dri-devel to the CC list to hit patchwork)

 drivers/gpu/drm/radeon/radeon.h |   22 +++---
 drivers/gpu/drm/radeon/rs600.c  |2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index ef0e0e0..c92cf2c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1174,7 +1174,7 @@ struct radeon_device {
/* Register mmio */
resource_size_t rmmio_base;
resource_size_t rmmio_size;
-   void*rmmio;
+   void __iomem*rmmio;
radeon_rreg_t   mc_rreg;
radeon_wreg_t   mc_wreg;
radeon_rreg_t   pll_rreg;
@@ -1251,20 +1251,20 @@ int radeon_gpu_wait_for_idle(struct radeon_device 
*rdev);
 static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg)
 {
if (reg  rdev-rmmio_size)
-   return readl(((void __iomem *)rdev-rmmio) + reg);
+   return readl((rdev-rmmio) + reg);
else {
-   writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX);
-   return readl(((void __iomem *)rdev-rmmio) + RADEON_MM_DATA);
+   writel(reg, (rdev-rmmio) + RADEON_MM_INDEX);
+   return readl((rdev-rmmio) + RADEON_MM_DATA);
}
 }
 
 static inline void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, 
uint32_t v)
 {
if (reg  rdev-rmmio_size)
-   writel(v, ((void __iomem *)rdev-rmmio) + reg);
+   writel(v, (rdev-rmmio) + reg);
else {
-   writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX);
-   writel(v, ((void __iomem *)rdev-rmmio) + RADEON_MM_DATA);
+   writel(reg, (rdev-rmmio) + RADEON_MM_INDEX);
+   writel(v, (rdev-rmmio) + RADEON_MM_DATA);
}
 }
 
@@ -1296,10 +1296,10 @@ static inline void r100_io_wreg(struct radeon_device 
*rdev, u32 reg, u32 v)
 /*
  * Registers read  write functions.
  */
-#define RREG8(reg) readb(((void __iomem *)rdev-rmmio) + (reg))
-#define WREG8(reg, v) writeb(v, ((void __iomem *)rdev-rmmio) + (reg))
-#define RREG16(reg) readw(((void __iomem *)rdev-rmmio) + (reg))
-#define WREG16(reg, v) writew(v, ((void __iomem *)rdev-rmmio) + (reg))
+#define RREG8(reg) readb((rdev-rmmio) + (reg))
+#define WREG8(reg, v) writeb(v, (rdev-rmmio) + (reg))
+#define RREG16(reg) readw((rdev-rmmio) + (reg))
+#define WREG16(reg, v) writew(v, (rdev-rmmio) + (reg))
 #define RREG32(reg) r100_mm_rreg(rdev, (reg))
 #define DREG32(reg) printk(KERN_INFO REGISTER:  #reg  : 0x%08X\n, 
r100_mm_rreg(rdev, (reg)))
 #define WREG32(reg, v) r100_mm_wreg(rdev, (reg), (v))
diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
index 6e3b11e..6779576 100644
--- a/drivers/gpu/drm/radeon/rs600.c
+++ b/drivers/gpu/drm/radeon/rs600.c
@@ -530,7 +530,7 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, 
uint64_t addr)
addr = addr  0xF000ULL;
addr |= R600_PTE_VALID | R600_PTE_SYSTEM | R600_PTE_SNOOPED;
addr |= R600_PTE_READABLE | R600_PTE_WRITEABLE;
-   writeq(addr, ((void __iomem *)ptr) + (i * 8));
+   writeq(addr, ptr + (i * 8));
return 0;
 }
 



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 3/6] drm/radeon: Writeback endian fixes

2011-07-13 Thread Benjamin Herrenschmidt
The writeback ring pointer and IH ring pointer are read using le32_to_cpu
so we do not want the chip to byteswap them on big-endian.

We still want to byteswap the ring itself and the IBs, so we don't touch
that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and
IH_CNTL.

In general, for things like that where we control all the accessors easily,
we are better off doing the swap in SW rather than HW. Paradoxally, it does
keep the code closer to x86 and avoid using poorly tested HW features.

I also changed the use of RADEON_ to R600_ in a couple of cases to be more
consistent with the surrounding code.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---

(resent adding dri-devel to the CC list to hit patchwork)

 drivers/gpu/drm/radeon/evergreen.c |3 ---
 drivers/gpu/drm/radeon/r600.c  |7 ---
 drivers/gpu/drm/radeon/r600_cp.c   |   23 +--
 3 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index e8a5ffb..23cf089 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1359,9 +1359,6 @@ int evergreen_cp_resume(struct radeon_device *rdev)
 
/* set the wb address wether it's enabled or not */
WREG32(CP_RB_RPTR_ADDR,
-#ifdef __BIG_ENDIAN
-  RB_RPTR_SWAP(2) |
-#endif
   ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET)  0xFFFC));
WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + 
RADEON_WB_CP_RPTR_OFFSET)  0xFF);
WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET)  
8)  0x);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index f79d2cc..3c86b15 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2212,9 +2212,6 @@ int r600_cp_resume(struct radeon_device *rdev)
 
/* set the wb address whether it's enabled or not */
WREG32(CP_RB_RPTR_ADDR,
-#ifdef __BIG_ENDIAN
-  RB_RPTR_SWAP(2) |
-#endif
   ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET)  0xFFFC));
WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + 
RADEON_WB_CP_RPTR_OFFSET)  0xFF);
WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET)  
8)  0x);
@@ -2993,10 +2990,6 @@ int r600_irq_init(struct radeon_device *rdev)
/* RPTR_REARM only works if msi's are enabled */
if (rdev-msi_enabled)
ih_cntl |= RPTR_REARM;
-
-#ifdef __BIG_ENDIAN
-   ih_cntl |= IH_MC_SWAP(IH_MC_SWAP_32BIT);
-#endif
WREG32(IH_CNTL, ih_cntl);
 
/* force the active interrupt state to all disabled */
diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c
index c3ab959..45fd592 100644
--- a/drivers/gpu/drm/radeon/r600_cp.c
+++ b/drivers/gpu/drm/radeon/r600_cp.c
@@ -1802,8 +1802,8 @@ static void r600_cp_init_ring_buffer(struct drm_device 
*dev,
/* Set ring buffer size */
 #ifdef __BIG_ENDIAN
RADEON_WRITE(R600_CP_RB_CNTL,
-RADEON_BUF_SWAP_32BIT |
-RADEON_RB_NO_UPDATE |
+R600_BUF_SWAP_32BIT |
+R600_RB_NO_UPDATE |
 (dev_priv-ring.rptr_update_l2qw  8) |
 dev_priv-ring.size_l2qw);
 #else
@@ -1820,15 +1820,15 @@ static void r600_cp_init_ring_buffer(struct drm_device 
*dev,
 
 #ifdef __BIG_ENDIAN
RADEON_WRITE(R600_CP_RB_CNTL,
-RADEON_BUF_SWAP_32BIT |
-RADEON_RB_NO_UPDATE |
-RADEON_RB_RPTR_WR_ENA |
+R600_BUF_SWAP_32BIT |
+R600_RB_NO_UPDATE |
+R600_RB_RPTR_WR_ENA |
 (dev_priv-ring.rptr_update_l2qw  8) |
 dev_priv-ring.size_l2qw);
 #else
RADEON_WRITE(R600_CP_RB_CNTL,
-RADEON_RB_NO_UPDATE |
-RADEON_RB_RPTR_WR_ENA |
+R600_RB_NO_UPDATE |
+R600_RB_RPTR_WR_ENA |
 (dev_priv-ring.rptr_update_l2qw  8) |
 dev_priv-ring.size_l2qw);
 #endif
@@ -1851,13 +1851,8 @@ static void r600_cp_init_ring_buffer(struct drm_device 
*dev,
- ((unsigned long) dev-sg-virtual)
+ dev_priv-gart_vm_start;
}
-   RADEON_WRITE(R600_CP_RB_RPTR_ADDR,
-#ifdef __BIG_ENDIAN
-(2  0) |
-#endif
-(rptr_addr  0xfffc));
-   RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI,
-upper_32_bits(rptr_addr));
+   RADEON_WRITE(R600_CP_RB_RPTR_ADDR, (rptr_addr  0xfffc));
+   RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI, upper_32_bits(rptr_addr));
 
 #ifdef __BIG_ENDIAN
RADEON_WRITE(R600_CP_RB_CNTL,



___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org

LSI problem, endianness cleanups

2011-07-13 Thread Benjamin Herrenschmidt
Hi Alex !

A couple of things I noticed while doing some of the fixes I posted
recently:

 - There seem to be what could be a HW issue, I would appreciate if you
could double check with your HW guys, at least on rv610 (I haven't been
able to test on other cards).

When MSI is -disabled- (using old style LSIs), my PCI-E bridge detected
an invalid DMA after every interrupt. I am not sure I managed to capture
the address properly, it might be 0 but I need to double check with my
own HW guys.

This doesn't happen when MSI is enabled.

I -looks- like the chip might be trying to shoot an MSI even when MSIs
aren't enabled in the config space, possibly using the (stale) content
of the config space MSI address register. Can you check if your HW folks
know anything about this ?

 - There's a lot of horrible duplication of register definitions :-) I
think there's at least 3 copies of CP_RB_CNTL and associated bits. One
thing I'd like to cleanup (not -that- but related) is the way we set the
endian swap bits.

Basically, we currently have #ifdef's that duplicate entire multi-line
statements changing one bit (almost always the same).

I'm thinking about instead defining in a .h something like

#ifdef __BIG_ENDIAN
#define CP_RB_SWAP  CP_RB_SWAP_32BIT
#else
#define CP_RB_SWAP  0
#endif

And using CP_RB_SWAP everywhere, removing all the dups (which are bug prone
as always with dups). I'd like to make sure you are ok with that before I
jump and do it, and especially ok with having a _single_ instance of that bit
def for all radeon variants since the bits are in the same place everywhere.

Cheers,
Ben.

___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 6/6] drm/radeon: Fix the definition of RADEON_BUF_SWAP_32BIT

2011-07-13 Thread Benjamin Herrenschmidt
(Note that this is duplicated under various other names such
as R600_BUF_SWAP_32BIT etc...). At least now all the definitions
agree.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 drivers/gpu/drm/radeon/radeon_reg.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_reg.h 
b/drivers/gpu/drm/radeon/radeon_reg.h
index ec93a75..0828add1 100644
--- a/drivers/gpu/drm/radeon/radeon_reg.h
+++ b/drivers/gpu/drm/radeon/radeon_reg.h
@@ -3293,7 +3293,7 @@
 #  define RADEON_RB_BUFSZ_MASK (0x3f  0)
 #  define RADEON_RB_BLKSZ_SHIFT8
 #  define RADEON_RB_BLKSZ_MASK (0x3f  8)
-#  define RADEON_BUF_SWAP_32BIT(1  17)
+#  define RADEON_BUF_SWAP_32BIT(2  16)
 #  define RADEON_MAX_FETCH_SHIFT   18
 #  define RADEON_MAX_FETCH_MASK(0x3  18)
 #  define RADEON_RB_NO_UPDATE  (1  27)


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 4/6] drm/radeon: Add a rmb() in IH processing

2011-07-13 Thread Benjamin Herrenschmidt
We should have a read memory barrier between reading the WPTR from
memory and reading ring entries based on that value (ie, we need to
ensure both loads are done in order by the CPU).

It could be argued that the MMIO reads in r600_ack_irq() might be
enough to get that barrier but I prefer keeping an explicit one just
in case.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 drivers/gpu/drm/radeon/r600.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 3c86b15..7e5c801 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev)
}
 
 restart_ih:
+   /* Order reading of wptr vs. reading of IH ring data */
+   wmb();
+
/* display interrupts */
r600_irq_ack(rdev);
 


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 5/6] drm/radeon: Do an MMIO read on interrupts when not uisng MSIs

2011-07-13 Thread Benjamin Herrenschmidt
When not using MSIs, there is no guarantee that DMA from the device
has been fully flushed to point where it's visible to the CPU when
taking an interrupt. To get this guarantee, we need to perform an
MMIO read from the device, which will flush all outstanding DMAs
from bridges between the device and the system.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 drivers/gpu/drm/radeon/r600.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 7e5c801..25b2dab 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -3300,6 +3300,10 @@ int r600_irq_process(struct radeon_device *rdev)
if (!rdev-ih.enabled || rdev-shutdown)
return IRQ_NONE;
 
+   /* No MSIs, need a dummy read to flush PCI DMAs */
+   if (!rdev-msi_enabled)
+   RREG32(IH_RB_WPTR);
+
wptr = r600_get_ih_wptr(rdev);
rptr = rdev-ih.rptr;
DRM_DEBUG(r600_irq_process start: rptr %d, wptr %d\n, rptr, wptr);


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 1/6] drm/radeon: Remove a bunch of useless _iomem casts

2011-07-13 Thread Benjamin Herrenschmidt
Just defining rdev-rmmio properly in the first place should do
the trick. In some cases, the cast were also complete dups as
the original variable was already of the right type.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 drivers/gpu/drm/radeon/radeon.h |   22 +++---
 drivers/gpu/drm/radeon/rs600.c  |2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index ef0e0e0..c92cf2c 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -1174,7 +1174,7 @@ struct radeon_device {
/* Register mmio */
resource_size_t rmmio_base;
resource_size_t rmmio_size;
-   void*rmmio;
+   void __iomem*rmmio;
radeon_rreg_t   mc_rreg;
radeon_wreg_t   mc_wreg;
radeon_rreg_t   pll_rreg;
@@ -1251,20 +1251,20 @@ int radeon_gpu_wait_for_idle(struct radeon_device 
*rdev);
 static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg)
 {
if (reg  rdev-rmmio_size)
-   return readl(((void __iomem *)rdev-rmmio) + reg);
+   return readl((rdev-rmmio) + reg);
else {
-   writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX);
-   return readl(((void __iomem *)rdev-rmmio) + RADEON_MM_DATA);
+   writel(reg, (rdev-rmmio) + RADEON_MM_INDEX);
+   return readl((rdev-rmmio) + RADEON_MM_DATA);
}
 }
 
 static inline void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, 
uint32_t v)
 {
if (reg  rdev-rmmio_size)
-   writel(v, ((void __iomem *)rdev-rmmio) + reg);
+   writel(v, (rdev-rmmio) + reg);
else {
-   writel(reg, ((void __iomem *)rdev-rmmio) + RADEON_MM_INDEX);
-   writel(v, ((void __iomem *)rdev-rmmio) + RADEON_MM_DATA);
+   writel(reg, (rdev-rmmio) + RADEON_MM_INDEX);
+   writel(v, (rdev-rmmio) + RADEON_MM_DATA);
}
 }
 
@@ -1296,10 +1296,10 @@ static inline void r100_io_wreg(struct radeon_device 
*rdev, u32 reg, u32 v)
 /*
  * Registers read  write functions.
  */
-#define RREG8(reg) readb(((void __iomem *)rdev-rmmio) + (reg))
-#define WREG8(reg, v) writeb(v, ((void __iomem *)rdev-rmmio) + (reg))
-#define RREG16(reg) readw(((void __iomem *)rdev-rmmio) + (reg))
-#define WREG16(reg, v) writew(v, ((void __iomem *)rdev-rmmio) + (reg))
+#define RREG8(reg) readb((rdev-rmmio) + (reg))
+#define WREG8(reg, v) writeb(v, (rdev-rmmio) + (reg))
+#define RREG16(reg) readw((rdev-rmmio) + (reg))
+#define WREG16(reg, v) writew(v, (rdev-rmmio) + (reg))
 #define RREG32(reg) r100_mm_rreg(rdev, (reg))
 #define DREG32(reg) printk(KERN_INFO REGISTER:  #reg  : 0x%08X\n, 
r100_mm_rreg(rdev, (reg)))
 #define WREG32(reg, v) r100_mm_wreg(rdev, (reg), (v))
diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
index 6e3b11e..6779576 100644
--- a/drivers/gpu/drm/radeon/rs600.c
+++ b/drivers/gpu/drm/radeon/rs600.c
@@ -530,7 +530,7 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, 
uint64_t addr)
addr = addr  0xF000ULL;
addr |= R600_PTE_VALID | R600_PTE_SYSTEM | R600_PTE_SNOOPED;
addr |= R600_PTE_READABLE | R600_PTE_WRITEABLE;
-   writeq(addr, ((void __iomem *)ptr) + (i * 8));
+   writeq(addr, ptr + (i * 8));
return 0;
 }
 


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH 3/6] drm/radeon: Writeback endian fixes

2011-07-13 Thread Benjamin Herrenschmidt
The writeback ring pointer and IH ring pointer are read using le32_to_cpu
so we do not want the chip to byteswap them on big-endian.

We still want to byteswap the ring itself and the IBs, so we don't touch
that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and
IH_CNTL.

In general, for things like that where we control all the accessors easily,
we are better off doing the swap in SW rather than HW. Paradoxally, it does
keep the code closer to x86 and avoid using poorly tested HW features.

I also changed the use of RADEON_ to R600_ in a couple of cases to be more
consistent with the surrounding code.

Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
 drivers/gpu/drm/radeon/evergreen.c |3 ---
 drivers/gpu/drm/radeon/r600.c  |7 ---
 drivers/gpu/drm/radeon/r600_cp.c   |   23 +--
 3 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c 
b/drivers/gpu/drm/radeon/evergreen.c
index e8a5ffb..23cf089 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -1359,9 +1359,6 @@ int evergreen_cp_resume(struct radeon_device *rdev)
 
/* set the wb address wether it's enabled or not */
WREG32(CP_RB_RPTR_ADDR,
-#ifdef __BIG_ENDIAN
-  RB_RPTR_SWAP(2) |
-#endif
   ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET)  0xFFFC));
WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + 
RADEON_WB_CP_RPTR_OFFSET)  0xFF);
WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET)  
8)  0x);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index f79d2cc..3c86b15 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2212,9 +2212,6 @@ int r600_cp_resume(struct radeon_device *rdev)
 
/* set the wb address whether it's enabled or not */
WREG32(CP_RB_RPTR_ADDR,
-#ifdef __BIG_ENDIAN
-  RB_RPTR_SWAP(2) |
-#endif
   ((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET)  0xFFFC));
WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + 
RADEON_WB_CP_RPTR_OFFSET)  0xFF);
WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET)  
8)  0x);
@@ -2993,10 +2990,6 @@ int r600_irq_init(struct radeon_device *rdev)
/* RPTR_REARM only works if msi's are enabled */
if (rdev-msi_enabled)
ih_cntl |= RPTR_REARM;
-
-#ifdef __BIG_ENDIAN
-   ih_cntl |= IH_MC_SWAP(IH_MC_SWAP_32BIT);
-#endif
WREG32(IH_CNTL, ih_cntl);
 
/* force the active interrupt state to all disabled */
diff --git a/drivers/gpu/drm/radeon/r600_cp.c b/drivers/gpu/drm/radeon/r600_cp.c
index c3ab959..45fd592 100644
--- a/drivers/gpu/drm/radeon/r600_cp.c
+++ b/drivers/gpu/drm/radeon/r600_cp.c
@@ -1802,8 +1802,8 @@ static void r600_cp_init_ring_buffer(struct drm_device 
*dev,
/* Set ring buffer size */
 #ifdef __BIG_ENDIAN
RADEON_WRITE(R600_CP_RB_CNTL,
-RADEON_BUF_SWAP_32BIT |
-RADEON_RB_NO_UPDATE |
+R600_BUF_SWAP_32BIT |
+R600_RB_NO_UPDATE |
 (dev_priv-ring.rptr_update_l2qw  8) |
 dev_priv-ring.size_l2qw);
 #else
@@ -1820,15 +1820,15 @@ static void r600_cp_init_ring_buffer(struct drm_device 
*dev,
 
 #ifdef __BIG_ENDIAN
RADEON_WRITE(R600_CP_RB_CNTL,
-RADEON_BUF_SWAP_32BIT |
-RADEON_RB_NO_UPDATE |
-RADEON_RB_RPTR_WR_ENA |
+R600_BUF_SWAP_32BIT |
+R600_RB_NO_UPDATE |
+R600_RB_RPTR_WR_ENA |
 (dev_priv-ring.rptr_update_l2qw  8) |
 dev_priv-ring.size_l2qw);
 #else
RADEON_WRITE(R600_CP_RB_CNTL,
-RADEON_RB_NO_UPDATE |
-RADEON_RB_RPTR_WR_ENA |
+R600_RB_NO_UPDATE |
+R600_RB_RPTR_WR_ENA |
 (dev_priv-ring.rptr_update_l2qw  8) |
 dev_priv-ring.size_l2qw);
 #endif
@@ -1851,13 +1851,8 @@ static void r600_cp_init_ring_buffer(struct drm_device 
*dev,
- ((unsigned long) dev-sg-virtual)
+ dev_priv-gart_vm_start;
}
-   RADEON_WRITE(R600_CP_RB_RPTR_ADDR,
-#ifdef __BIG_ENDIAN
-(2  0) |
-#endif
-(rptr_addr  0xfffc));
-   RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI,
-upper_32_bits(rptr_addr));
+   RADEON_WRITE(R600_CP_RB_RPTR_ADDR, (rptr_addr  0xfffc));
+   RADEON_WRITE(R600_CP_RB_RPTR_ADDR_HI, upper_32_bits(rptr_addr));
 
 #ifdef __BIG_ENDIAN
RADEON_WRITE(R600_CP_RB_CNTL,


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: [PATCH 2/6] drm/radeon: ATOM Endian fix for atombios_crtc_program_pll()

2011-07-13 Thread Benjamin Herrenschmidt
On Wed, 2011-07-13 at 10:38 -0400, Alex Deucher wrote:

 case 6:
  -   args.v6.ulCrtcPclkFreq.ucCRTC = crtc_id;
  -   args.v6.ulCrtcPclkFreq.ulPixelClock = 
  cpu_to_le32(clock / 10);
  +   args.v6.ulDispEngClkFreq = cpu_to_le32(crtc_id  
  24 | clock / 10);
 
 For clarity (i can never remember op precedence), you might put:
 
 cpu_to_le32((crtc_id  24) | (clock / 10));

I can't either but I have a nice chart printed on my wall :-)

I can respin if you really want but the above is correct.

Cheers,
Ben.

 Other than that,
 
 Reviewed-by: Alex Deucher alexander.deuc...@amd.com
 
 
 args.v6.ucRefDiv = ref_div;
 args.v6.usFbDiv = cpu_to_le16(fb_div);
 args.v6.ulFbDivDecFrac = cpu_to_le32(frac_fb_div * 
  10);
 
 
 
 


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: [PATCH 3/6] drm/radeon: Writeback endian fixes

2011-07-13 Thread Benjamin Herrenschmidt
On Wed, 2011-07-13 at 10:42 -0400, Alex Deucher wrote:
 On Wed, Jul 13, 2011 at 2:28 AM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:
  The writeback ring pointer and IH ring pointer are read using le32_to_cpu
  so we do not want the chip to byteswap them on big-endian.
 
  We still want to byteswap the ring itself and the IBs, so we don't touch
  that but we remove setting of the byteswap bits in CP_RB_RPTR_ADDR and
  IH_CNTL.
 
  In general, for things like that where we control all the accessors easily,
  we are better off doing the swap in SW rather than HW. Paradoxally, it does
  keep the code closer to x86 and avoid using poorly tested HW features.
 
  I also changed the use of RADEON_ to R600_ in a couple of cases to be more
  consistent with the surrounding code.
 
  Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
 
 evergreen.c and ni.c will need similar fixes.

evergreen.c -is- fixed in the patch :-) ni.c doesn't seem to set
swapping on the write back of the ring pointer (can you dbl check ?), it
only enables swapping on the ring itself which is correct as far as I
can tell.

Cheers,
Ben.

 Reviewed-by: Alex Deucher alexander.deuc...@amd.com
 
  ---
 
  (resent adding dri-devel to the CC list to hit patchwork)
 
   drivers/gpu/drm/radeon/evergreen.c |3 ---
   drivers/gpu/drm/radeon/r600.c  |7 ---
   drivers/gpu/drm/radeon/r600_cp.c   |   23 +--
   3 files changed, 9 insertions(+), 24 deletions(-)
 
  diff --git a/drivers/gpu/drm/radeon/evergreen.c 
  b/drivers/gpu/drm/radeon/evergreen.c
  index e8a5ffb..23cf089 100644
  --- a/drivers/gpu/drm/radeon/evergreen.c
  +++ b/drivers/gpu/drm/radeon/evergreen.c
  @@ -1359,9 +1359,6 @@ int evergreen_cp_resume(struct radeon_device *rdev)
 
 /* set the wb address wether it's enabled or not */
 WREG32(CP_RB_RPTR_ADDR,
  -#ifdef __BIG_ENDIAN
  -  RB_RPTR_SWAP(2) |
  -#endif
((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET)  
  0xFFFC));
 WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + 
  RADEON_WB_CP_RPTR_OFFSET)  0xFF);
 WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 
   8)  0x);
  diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
  index f79d2cc..3c86b15 100644
  --- a/drivers/gpu/drm/radeon/r600.c
  +++ b/drivers/gpu/drm/radeon/r600.c
  @@ -2212,9 +2212,6 @@ int r600_cp_resume(struct radeon_device *rdev)
 
 /* set the wb address whether it's enabled or not */
 WREG32(CP_RB_RPTR_ADDR,
  -#ifdef __BIG_ENDIAN
  -  RB_RPTR_SWAP(2) |
  -#endif
((rdev-wb.gpu_addr + RADEON_WB_CP_RPTR_OFFSET)  
  0xFFFC));
 WREG32(CP_RB_RPTR_ADDR_HI, upper_32_bits(rdev-wb.gpu_addr + 
  RADEON_WB_CP_RPTR_OFFSET)  0xFF);
 WREG32(SCRATCH_ADDR, ((rdev-wb.gpu_addr + RADEON_WB_SCRATCH_OFFSET) 
   8)  0x);
  @@ -2993,10 +2990,6 @@ int r600_irq_init(struct radeon_device *rdev)
 /* RPTR_REARM only works if msi's are enabled */
 if (rdev-msi_enabled)
 ih_cntl |= RPTR_REARM;
  -
  -#ifdef __BIG_ENDIAN
  -   ih_cntl |= IH_MC_SWAP(IH_MC_SWAP_32BIT);
  -#endif
 WREG32(IH_CNTL, ih_cntl);
 
 /* force the active interrupt state to all disabled */
  diff --git a/drivers/gpu/drm/radeon/r600_cp.c 
  b/drivers/gpu/drm/radeon/r600_cp.c
  index c3ab959..45fd592 100644
  --- a/drivers/gpu/drm/radeon/r600_cp.c
  +++ b/drivers/gpu/drm/radeon/r600_cp.c
  @@ -1802,8 +1802,8 @@ static void r600_cp_init_ring_buffer(struct 
  drm_device *dev,
 /* Set ring buffer size */
   #ifdef __BIG_ENDIAN
 RADEON_WRITE(R600_CP_RB_CNTL,
  -RADEON_BUF_SWAP_32BIT |
  -RADEON_RB_NO_UPDATE |
  +R600_BUF_SWAP_32BIT |
  +R600_RB_NO_UPDATE |
  (dev_priv-ring.rptr_update_l2qw  8) |
  dev_priv-ring.size_l2qw);
   #else
  @@ -1820,15 +1820,15 @@ static void r600_cp_init_ring_buffer(struct 
  drm_device *dev,
 
   #ifdef __BIG_ENDIAN
 RADEON_WRITE(R600_CP_RB_CNTL,
  -RADEON_BUF_SWAP_32BIT |
  -RADEON_RB_NO_UPDATE |
  -RADEON_RB_RPTR_WR_ENA |
  +R600_BUF_SWAP_32BIT |
  +R600_RB_NO_UPDATE |
  +R600_RB_RPTR_WR_ENA |
  (dev_priv-ring.rptr_update_l2qw  8) |
  dev_priv-ring.size_l2qw);
   #else
 RADEON_WRITE(R600_CP_RB_CNTL,
  -RADEON_RB_NO_UPDATE |
  -RADEON_RB_RPTR_WR_ENA |
  +R600_RB_NO_UPDATE |
  +R600_RB_RPTR_WR_ENA |
  (dev_priv-ring.rptr_update_l2qw  8) |
  dev_priv-ring.size_l2qw);
   #endif
  @@ -1851,13 +1851,8 @@ static void r600_cp_init_ring_buffer(struct 
  drm_device *dev

Re: [PATCH 4/6] drm/radeon: Add a rmb() in IH processing

2011-07-13 Thread Benjamin Herrenschmidt
On Wed, 2011-07-13 at 10:48 -0400, Alex Deucher wrote:
 On Wed, Jul 13, 2011 at 10:43 AM, Alex Deucher alexdeuc...@gmail.com wrote:
  On Wed, Jul 13, 2011 at 2:28 AM, Benjamin Herrenschmidt
  b...@kernel.crashing.org wrote:
  We should have a read memory barrier between reading the WPTR from
  memory and reading ring entries based on that value (ie, we need to
  ensure both loads are done in order by the CPU).
 
  It could be argued that the MMIO reads in r600_ack_irq() might be
  enough to get that barrier but I prefer keeping an explicit one just
  in case.
 
  Signed-off-by: Benjamin Herrenschmidt b...@kernel.crashing.org
 
  Reviewed-by: Alex Deucher alexander.deuc...@amd.com
 
 evergreen.c will need a similar fix.

Ok. I can do that.

Cheers,
Ben.

 Alex
 
 
  ---
 
  (resent adding dri-devel to the CC list to hit patchwork)
 
  drivers/gpu/drm/radeon/r600.c |3 +++
   1 files changed, 3 insertions(+), 0 deletions(-)
 
  diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
  index 3c86b15..7e5c801 100644
  --- a/drivers/gpu/drm/radeon/r600.c
  +++ b/drivers/gpu/drm/radeon/r600.c
  @@ -3312,6 +3312,9 @@ int r600_irq_process(struct radeon_device *rdev)
 }
 
   restart_ih:
  +   /* Order reading of wptr vs. reading of IH ring data */
  +   wmb();
  +
 /* display interrupts */
 r600_irq_ack(rdev);
 
 
 
 
 
 


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: LSI problem, endianness cleanups

2011-07-13 Thread Benjamin Herrenschmidt
On Wed, 2011-07-13 at 11:33 -0400, Alex Deucher wrote:

  I'm thinking about instead defining in a .h something like
 
  #ifdef __BIG_ENDIAN
  #define CP_RB_SWAP  CP_RB_SWAP_32BIT
  #else
  #define CP_RB_SWAP  0
  #endif
 
  And using CP_RB_SWAP everywhere, removing all the dups (which are bug prone
  as always with dups). I'd like to make sure you are ok with that before I
  jump and do it, and especially ok with having a _single_ instance of that 
  bit
  def for all radeon variants since the bits are in the same place everywhere.
 
 
 Seems ok to me.  But as you mentioned in IRC, I think in most cases we
 should just do the swap on the CPU.  Also we generally use the same
 bits for all the endian swap fields so we could do something like:
 
 #define ENDIAN_NONE 0
 #define ENDIAN_8IN16 1
 #define ENDIAN_8IN32 2
 
 #ifdef __BIG_ENDIAN
 #define DFLT_SWAP ENDIAN_8IN32
 #else
 #define DFLT_SWAP ENDIAN_NONE
 #endif
 
 And use it for all the relevant fields.

Right.

As far as the CPU side swap, well, I agree mostly, but for the ring and
IB this is a much bigger undertaking. We'd have to fix everything that
writes (or reads, ie. validation) to ring and IB in kernel and
userspace, so I'll leave that one alone for now.

The IH and writeback were easy because they are contained within the
kernel driver.

Cheers,
Ben.

___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


[PATCH] radeon: Fix PCI usage of 32-bit driver on 64-bit platform

2008-11-24 Thread Benjamin Herrenschmidt
Hi !

The radeon driver is storing PCI addresses in unsigned long's which
won't work well on 32-bit platforms with 64-bit physical address space
such as PowerPC 4xx. This fixes it by using unsigned long long instead.

Signed-off-by: Benjamin Herrenschmidt [EMAIL PROTECTED]
---

index f7ae1a8..e7ccfa9 100644
--- a/src/radeon.h
+++ b/src/radeon.h
@@ -648,9 +648,9 @@ typedef struct {
 RADEONChipFamily  ChipFamily;
 RADEONErrata  ChipErrata;
 
-unsigned long LinearAddr;   /* Frame buffer physical address */
-unsigned long MMIOAddr; /* MMIO region physical address  */
-unsigned long BIOSAddr; /* BIOS physical address */
+unsigned long long LinearAddr;   /* Frame buffer physical address  
   */
+unsigned long long MMIOAddr; /* MMIO region physical address   
   */
+unsigned long long BIOSAddr; /* BIOS physical address  
   */
 uint32_t  fbLocation;
 uint32_t  gartLocation;
 uint32_t  mc_fb_location;
index c759bd6..138bbc8 100644
--- a/src/radeon_driver.c
+++ b/src/radeon_driver.c
@@ -429,7 +429,7 @@ static Bool RADEONMapFB(ScrnInfoPtr pScrn)
 RADEONInfoPtr  info = RADEONPTR(pScrn);
 
 xf86DrvMsgVerb(pScrn-scrnIndex, X_INFO, RADEON_LOGLEVEL_DEBUG,
-  Map: 0x%08lx, 0x%08lx\n, info-LinearAddr, 
info-FbMapSize);
+  Map: 0x%016llx, 0x%08lx\n, info-LinearAddr, 
info-FbMapSize);
 
 #ifndef XSERVER_LIBPCIACCESS
 
@@ -1745,11 +1745,11 @@ static Bool RADEONPreInitChipType(ScrnInfoPtr pScrn)
 }
 
 from   = X_PROBED;
-info-LinearAddr   = PCI_REGION_BASE(info-PciInfo, 0, REGION_MEM)  
~0x1ffUL;
+info-LinearAddr   = PCI_REGION_BASE(info-PciInfo, 0, REGION_MEM)  
~0x1ffULL;
 pScrn-memPhysBase = info-LinearAddr;
 if (dev-MemBase) {
xf86DrvMsg(pScrn-scrnIndex, X_INFO,
-  Linear address override, using 0x%016lx instead of 
0x%016lx\n,
+  Linear address override, using 0x%016lx instead of 
0x%016llx\n,
   dev-MemBase,
   info-LinearAddr);
info-LinearAddr = dev-MemBase;
@@ -1760,7 +1760,7 @@ static Bool RADEONPreInitChipType(ScrnInfoPtr pScrn)
return FALSE;
 }
 xf86DrvMsg(pScrn-scrnIndex, from,
-  Linear framebuffer at 0x%016lx\n, info-LinearAddr);
+  Linear framebuffer at 0x%016llx\n, info-LinearAddr);
 
 #ifndef XSERVER_LIBPCIACCESS
/* BIOS */
@@ -2742,11 +2742,13 @@ Bool RADEONPreInit(ScrnInfoPtr pScrn, int flags)
 info-PciTag  = pciTag(PCI_DEV_BUS(info-PciInfo),
   PCI_DEV_DEV(info-PciInfo),
   PCI_DEV_FUNC(info-PciInfo));
-info-MMIOAddr = PCI_REGION_BASE(info-PciInfo, 2, REGION_MEM)  ~0xffUL;
+info-MMIOAddr = PCI_REGION_BASE(info-PciInfo, 2, REGION_MEM)  ~0xffULL;
 info-MMIOSize = PCI_REGION_SIZE(info-PciInfo, 2);
+   xf86DrvMsg(pScrn-scrnIndex, X_INFO, TOTO SAYS %016llx\n, 
+   (unsigned long long)info-PciInfo-regions[2].base_addr);
 if (info-pEnt-device-IOBase) {
xf86DrvMsg(pScrn-scrnIndex, X_CONFIG,
-  MMIO address override, using 0x%08lx instead of 0x%08lx\n,
+  MMIO address override, using 0x%08lx instead of 
0x%016llx\n,
   info-pEnt-device-IOBase,
   info-MMIOAddr);
info-MMIOAddr = info-pEnt-device-IOBase;
@@ -2755,7 +2757,7 @@ Bool RADEONPreInit(ScrnInfoPtr pScrn, int flags)
goto fail1;
 }
 xf86DrvMsg(pScrn-scrnIndex, X_INFO,
-  MMIO registers at 0x%016lx: size %ldKB\n, info-MMIOAddr, 
info-MMIOSize / 1024);
+  MMIO registers at 0x%016llx: size %ldKB\n, info-MMIOAddr, 
info-MMIOSize / 1024);
 
 if(!RADEONMapMMIO(pScrn)) {
xf86DrvMsg(pScrn-scrnIndex, X_ERROR,


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


RE: [radeonhd] 0x94CC:0x1545:0x3650: Radeonhd RV610 HD2400Pro

2008-08-13 Thread Benjamin Herrenschmidt
On Wed, 2008-08-13 at 03:52 -0700, Tai Tri Nguyen wrote:
 Hi all,
 
 Thanks for your helps. That seems an interesting issue. 
 
 Ben,
  I'm working on a powerpc sequoia using AMCC 440EPX CPU.

Ok, I don't think I have one of these, but I have an SPe and a 405EX so
that should be similar. In any case, it will take a little bit of time
before I can get to these r6xx endian issues so I suggest you look at
what you get with an r5xx so we can properly separate the issues caused
by the lack of surface swapping from whatever other pixman issues we may
have.

Cheers,
Ben.

 Best regards,
 
 Tai Nguyen
 
 -Original Message-
 From: Benjamin Herrenschmidt [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, August 13, 2008 5:38 PM
 To: Dave Airlie
 Cc: Tai Tri Nguyen; Alex Deucher; xorg-driver-ati@lists.x.org; Tuan
 Thanh Phan
 Subject: Re: [radeonhd] 0x94CC:0x1545:0x3650: Radeonhd RV610 HD2400Pro
 
 On Wed, 2008-08-13 at 20:07 +1000, Dave Airlie wrote:
  On Wed, Aug 13, 2008 at 7:54 PM, Tai Tri Nguyen [EMAIL PROTECTED]
 wrote:
   Hi Alex,
  
   The driver I'm using is from git, the latest version of 
   xf86-video-ati  git clone 
   git://anongit.freedesktop.org/git/xorg/driver/xf86-video-ati
  
   I have just downloaded and compiled again but the issue still exist.
   Video can display in 24 bits depth color but perform wrong color.
   Do you have any idea for the issue?
  
  
  I think this is a pixman bug, but I'm not sure,
  
  Ben you have any luck tracking it down further?
  
  Actually R600 doesn't have surfaces like previous cards so actually we
 
  need to program some bits of r600.
 
 Hi !
 
 There are two different things at play here:
 
  - There may be a problem with pixman, I've had report of color issues
 with some apps / themes and I've observed some myself with the fedora
 default theme, on cards like r500 where we pretty much know the surface
 swapping works.
 
  - The r600 has no surfaces, at least not the way they used to work, so
 it's expected that colors will be wrong on that one.
 
 For the later, I don't know yet what can be done, ie. I don't know what
 they replaced surfaces with and how we can deal with endianness on it,
 I'll have to catch up with you and Alex, look at available docs, etc...
 
 At this stage, it's hard to say which of these two problems is affecting
 Tai Tri. I would suggest he tries an R500 card such as X1650 or similar,
 this is what we use on Bimini and appears to work reasonably well, at
 least we have working surface swappers. I'll have to track down the
 pixman issue (if that's one, btw, I could use your help here Dave in
 isolating what X rendering ops are actually going wrong).
 
 For R6xx, I can't tell yet what we'll do. I have a 2400HD too so I
 should hopefully be able to work on that as soon as I find some spare
 time.
 
 Tai: What machine is this ? I can try to do a similar setup, I have a
 whole range of AMCC eval boards here, I was planning to give the X1650 a
 go some time next week or so in fact...
 
 Cheers,
 Ben.
 
 
 CONFIDENTIALITY NOTICE: This e-mail message, including any attachments, is 
 for the sole use of the intended recipient(s) and contains information that 
 is confidential and proprietary to Applied Micro Circuits Corporation or its 
 subsidiaries. It is to be used solely for the purpose of furthering the 
 parties' business relationship. All unauthorized review, use, disclosure or 
 distribution is prohibited. If you are not the intended recipient, please 
 contact the sender by reply e-mail and destroy all copies of the original 
 message.

___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati


Re: [radeonhd] 0x94CC:0x1545:0x3650: Radeonhd RV610 HD2400Pro

2008-08-13 Thread Benjamin Herrenschmidt
On Wed, 2008-08-13 at 20:07 +1000, Dave Airlie wrote:
 On Wed, Aug 13, 2008 at 7:54 PM, Tai Tri Nguyen [EMAIL PROTECTED] wrote:
  Hi Alex,
 
  The driver I'm using is from git, the latest version of xf86-video-ati
   git clone
  git://anongit.freedesktop.org/git/xorg/driver/xf86-video-ati
 
  I have just downloaded and compiled again but the issue still exist.
  Video can display in 24 bits depth color but perform wrong color.
  Do you have any idea for the issue?
 
 
 I think this is a pixman bug, but I'm not sure,
 
 Ben you have any luck tracking it down further?
 
 Actually R600 doesn't have surfaces like previous cards so actually we
 need to program some bits of r600.

Hi !

There are two different things at play here:

 - There may be a problem with pixman, I've had report of color
issues with some apps / themes and I've observed some myself with
the fedora default theme, on cards like r500 where we pretty much
know the surface swapping works.

 - The r600 has no surfaces, at least not the way they used to work, so
it's expected that colors will be wrong on that one.

For the later, I don't know yet what can be done, ie. I don't know what
they replaced surfaces with and how we can deal with endianness on it,
I'll have to catch up with you and Alex, look at available docs, etc...

At this stage, it's hard to say which of these two problems is affecting
Tai Tri. I would suggest he tries an R500 card such as X1650 or similar,
this is what we use on Bimini and appears to work reasonably well, at
least we have working surface swappers. I'll have to track down the
pixman issue (if that's one, btw, I could use your help here Dave in
isolating what X rendering ops are actually going wrong).

For R6xx, I can't tell yet what we'll do. I have a 2400HD too so I
should hopefully be able to work on that as soon as I find some spare
time.

Tai: What machine is this ? I can try to do a similar setup, I have a
whole range of AMCC eval boards here, I was planning to give the X1650 a
go some time next week or so in fact...

Cheers,
Ben.


___
xorg-driver-ati mailing list
xorg-driver-ati@lists.x.org
http://lists.x.org/mailman/listinfo/xorg-driver-ati