Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-03-21 Thread Alexey Brodkin

On 02/12/2013 09:25 PM, Arnd Bergmann wrote:

On Tuesday 12 February 2013, Michal Simek wrote:

But on Microblaze LE is necessary to use different datain/out_le16
functions as below
which are also not compatible with Microblaze and PPC BE.

diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
index bbad046..8dd192c 100644
--- a/drivers/block/xsysace.c
+++ b/drivers/block/xsysace.c
@@ -314,7 +314,7 @@ static void ace_datain_le16(struct ace_device *ace)
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace->data_ptr;
 while (i--)
-   *dst++ = ioread16be(ace->baseaddr + 0x40);
+   *dst++ = ioread16(ace->baseaddr + 0x40);
 ace->data_ptr = dst;
  }


Hmm, that actually seems sane then: You can replace the loop around
ioread16be with a single ioread32_rep() call, which will use
the correct native endian accesses on both LE Microblaze
and on all the big-endian platforms. Note that the asm-generic
definition of that function was broken until quite recently
when Will Deacon repaired it in order to support ARM64.

Arnd



It seems like this thread was not updated for many weeks now.
I'm wondering if I missed something or if I may do something to help you 
guys move forward with proposed patch?


-Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-03-21 Thread Alexey Brodkin

On 02/12/2013 09:25 PM, Arnd Bergmann wrote:

On Tuesday 12 February 2013, Michal Simek wrote:

But on Microblaze LE is necessary to use different datain/out_le16
functions as below
which are also not compatible with Microblaze and PPC BE.

diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
index bbad046..8dd192c 100644
--- a/drivers/block/xsysace.c
+++ b/drivers/block/xsysace.c
@@ -314,7 +314,7 @@ static void ace_datain_le16(struct ace_device *ace)
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace-data_ptr;
 while (i--)
-   *dst++ = ioread16be(ace-baseaddr + 0x40);
+   *dst++ = ioread16(ace-baseaddr + 0x40);
 ace-data_ptr = dst;
  }


Hmm, that actually seems sane then: You can replace the loop around
ioread16be with a single ioread32_rep() call, which will use
the correct native endian accesses on both LE Microblaze
and on all the big-endian platforms. Note that the asm-generic
definition of that function was broken until quite recently
when Will Deacon repaired it in order to support ARM64.

Arnd



It seems like this thread was not updated for many weeks now.
I'm wondering if I missed something or if I may do something to help you 
guys move forward with proposed patch?


-Alexey
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Michal Simek wrote:
> But on Microblaze LE is necessary to use different datain/out_le16
> functions as below
> which are also not compatible with Microblaze and PPC BE.
> 
> diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
> index bbad046..8dd192c 100644
> --- a/drivers/block/xsysace.c
> +++ b/drivers/block/xsysace.c
> @@ -314,7 +314,7 @@ static void ace_datain_le16(struct ace_device *ace)
> int i = ACE_FIFO_SIZE / 2;
> u16 *dst = ace->data_ptr;
> while (i--)
> -   *dst++ = ioread16be(ace->baseaddr + 0x40);
> +   *dst++ = ioread16(ace->baseaddr + 0x40);
> ace->data_ptr = dst;
>  }

Hmm, that actually seems sane then: You can replace the loop around
ioread16be with a single ioread32_rep() call, which will use
the correct native endian accesses on both LE Microblaze
and on all the big-endian platforms. Note that the asm-generic
definition of that function was broken until quite recently
when Will Deacon repaired it in order to support ARM64.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/7 Grant Likely :
> On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
>  wrote:
>> On 02/07/2013 06:51 PM, Grant Likely wrote:
>>>
>>> On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely 
>>> wrote:

 On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
  wrote:
>>
>>   In fact, the driver already knows about this and figures
>> out at runtime how the device is wired up to the bus. This is not the
>> problem.
>
>
> Except that this is very gross, especially when you observe that in the
> busted "big endian" case, it has to byteswap the bloody data port.
>
> So you end up having to do that gross hack with separate accessors for
> registers vs. data and not able to use the _rep variants, which also
> means that on platforms like ppc, you end up with a memory barrier on
> every access (or more), which is going to slow things down enormously.


 I don't see why the _rep variants aren't usable here. The only reason
 I didn't use them when I wrote the driver in the first place was I was
 a n00b kernel hacker and I didn't know they were there.
>>>
>>>
>>> The 8-bit variant is different though because the hardware requires
>>> pingponging between odd and even byte addresses to flush the fifo.
>>> Reading a data port even address (0x40) gives the least significant
>>> byte. Reading from an odd address (0x41) give the MSB and pops the
>>> data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
>>> mode. It should still be fine in 16-bit.
>>>
>>> page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf
>>>
>>
>> Ok, so may I do a re-spin with these changes:
>
> There are two things here. 1) changing the accessors used. 2)
> switching the endianess as a bug fix. Any changes to the endian access
> should be a separate patch which a description of what is needed.
>
>> 1. In "ace_in_be16" use "ioread16be"
>> 2. In "ace_out_be16" use "iowrite16be"
>> 3. In "ace_in_le16" use "ioread16"
>> 4. In "ace_out_le16" use "iowrite16"
>
> Yes
>
>> 5. In "ace_datain_le16" use "ioread16_rep"
>> 6. In "ace_dataout_le16" use "iowrite16_rep"
>
> Maybe. In a separate patch. Hmmm... I guess there isn't an
> ioread16be_rep variant. Oh well. Check first with Michal on LE
> microblaze before making the change. If it doesn't work for him the
> more understanding is needed. I was pretty sure the LE variant already
> worked.

Sorry it took me some time to get 16bit LE to work for test but here
are test results.

I have tested it on ppc BE, Microblaze BE and Microblaze LE systems
and surprisingly on all of them only ace_reg_le16_ops are used.

But on Microblaze LE is necessary to use different datain/out_le16
functions as below
which are also not compatible with Microblaze and PPC BE.

diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
index bbad046..8dd192c 100644
--- a/drivers/block/xsysace.c
+++ b/drivers/block/xsysace.c
@@ -314,7 +314,7 @@ static void ace_datain_le16(struct ace_device *ace)
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace->data_ptr;
while (i--)
-   *dst++ = ioread16be(ace->baseaddr + 0x40);
+   *dst++ = ioread16(ace->baseaddr + 0x40);
ace->data_ptr = dst;
 }

@@ -323,7 +323,7 @@ static void ace_dataout_le16(struct ace_device *ace)
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace->data_ptr;
while (i--)
-   iowrite16be(*src++, ace->baseaddr + 0x40);
+   iowrite16(*src++, ace->baseaddr + 0x40);
ace->data_ptr = src;
 }

Grant: How did you tested BE mode? It is fpga it means you can switch
some connections
and it could work.

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Michal Simek wrote:
> ok but still there should be well defined how to do it. Let's say
> generic Kconfig option.

You cannot solve it in a generic way, since every device has different
needs. In a single SoC, you may have one device that only ever exists
with little-endian I/O, one that is always wired the same way as the
CPU, one that always swaps endianess and one that exists in two
variants but you know which one you have at compile time.

> Or maybe there is also third option to modified code to use proper writes.

You mean run-time code patching? That is seriously wrong here. As Ben
explained, MMIO is slow by definition, so if you add five cycles for
an indirect function call on every MMIO read, that will not even
be noticeable in the 20µs latency that the read has on the bus.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Benjamin Herrenschmidt
On Tue, 2013-02-12 at 11:11 +0100, Michal Simek wrote:
> For high performance IPs using accessors functions is still
> problematic
> because there will be performance regression it means that
> from my point of view there still should be any option to "setup"
> proper endians for the driver and it can't be setup at run-time.
> 
> Sure the question is if drivers like this should be in the mainline.
> But if yes, there should be an option to do it in acceptable way.

No. High performance IP should rely on DMA essentially and be less
affected if not at all. Endianess should affect the control path, not
the data path (unless you got your wiring wrong again :-)

If you want an Kconfig option for an "optimal" version, put the ifdef
trickery in the accessors and make them inline.

We do something like that iirc with OHCI where the type of endianness
needed is a pair of select's and the conditional only happens when both
are set. You can probably do better with feature bits & a mask of
"possible" features relying on the compiler optimisations rather than
the preprocessor.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/12 Arnd Bergmann :
> On Tuesday 12 February 2013, Michal Simek wrote:
>> > In particular, ARM can run both big- and little-endian even though
>> > big-endian is rarely used, so you need to know the endianess for
>> > the device you are talking to rather than assume that it knows
>> > what the CPU does at the time.
>>
>> For high performance IPs using accessors functions is still problematic
>> because there will be performance regression it means that
>> from my point of view there still should be any option to "setup"
>> proper endians for the driver and it can't be setup at run-time.
>
> I did not mean you have to use a run-time detection here, although
> that is often the easiest solution. If you know the endianess of
> a device for a specific architecture or platform, it is totally
> fine to pick that endianess at compile-time and use e.g. the
> readl_relaxed() accessors on ARM to give you the lowest access
> latency.

ok but still there should be well defined how to do it. Let's say
generic Kconfig option.
Or maybe there is also third option to modified code to use proper writes.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Benjamin Herrenschmidt wrote:
> It depends how the ARM core operates vs. IO when switched between BE and
> LE, does it keep the same lines doing byte 0 or does it keep the MSB/LSB
> in the same place (and thus changes which lanes contain byte 0) ?

IIRC it changed between older and more recent ARM, at least we have
separate configuration options for "BE-8" big-endian mode on ARMv6
and above, and "BE-32" on older ones.

Currently we don't support platforms upstream that get configured
as BE-32, but the code is there and I know of people using it.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Michal Simek wrote:
> > In particular, ARM can run both big- and little-endian even though
> > big-endian is rarely used, so you need to know the endianess for
> > the device you are talking to rather than assume that it knows
> > what the CPU does at the time.
> 
> For high performance IPs using accessors functions is still problematic
> because there will be performance regression it means that
> from my point of view there still should be any option to "setup"
> proper endians for the driver and it can't be setup at run-time.

I did not mean you have to use a run-time detection here, although
that is often the easiest solution. If you know the endianess of
a device for a specific architecture or platform, it is totally
fine to pick that endianess at compile-time and use e.g. the
readl_relaxed() accessors on ARM to give you the lowest access
latency.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/11 Arnd Bergmann :
> On Monday 11 February 2013, Michal Simek wrote:
>> Unfortunately no. Another is spi/i2c (sysace as we discuss in this
>> thread), probably icap
>> network drivers are ok because they are not shared.
>> Timer when it is moved to clocksource(not important right now)
>> xilinx gpio is using __raw IO functions which is incorrect on arm in
>> connection to barriers.
>>
>> But it reminds me that maybe the easiest solution is not to use endian
>> accessors just use two simple macros which should work on all systems.
>>
>> #define _readreg(offset)  ({__raw_readl(offset); rmb(); })
>> #define _writereg(offset, val)({wmb(); __raw_writel(val, offset); 
>> })
>>
>> which is probably the faster solution which add minimum additional code
>> to driver and can also remove endian detection code.
>
> I tend to disagree. You should never assume that a device is the same
> endianess as the the CPU, and you should try not to use the __raw_*
> accessors in device drivers either.
>
> In particular, ARM can run both big- and little-endian even though
> big-endian is rarely used, so you need to know the endianess for
> the device you are talking to rather than assume that it knows
> what the CPU does at the time.

For high performance IPs using accessors functions is still problematic
because there will be performance regression it means that
from my point of view there still should be any option to "setup"
proper endians for the driver and it can't be setup at run-time.

Sure the question is if drivers like this should be in the mainline.
But if yes, there should be an option to do it in acceptable way.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/12 Benjamin Herrenschmidt :
> On Mon, 2013-02-11 at 16:57 +0100, Michal Simek wrote:
>> But it reminds me that maybe the easiest solution is not to use endian
>> accessors just use two simple macros which should work on all systems.
>>
>> #define _readreg(offset)  ({__raw_readl(offset); rmb(); })
>> #define _writereg(offset, val)({wmb(); __raw_writel(val, offset); 
>> })
>>
>> which is probably the faster solution which add minimum additional code
>> to driver and can also remove endian detection code.
>
> Except that rmb & wmb might not be the right barriers for IOs ...

Arm is using __iormb in readX macros
#define __iormb()   rmb()
but it is arm specific only that's why I have added there rmb().

Thanks,
Michal




-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/12 Benjamin Herrenschmidt b...@kernel.crashing.org:
 On Mon, 2013-02-11 at 16:57 +0100, Michal Simek wrote:
 But it reminds me that maybe the easiest solution is not to use endian
 accessors just use two simple macros which should work on all systems.

 #define name_readreg(offset)  ({__raw_readl(offset); rmb(); })
 #define name_writereg(offset, val)({wmb(); __raw_writel(val, offset); 
 })

 which is probably the faster solution which add minimum additional code
 to driver and can also remove endian detection code.

 Except that rmb  wmb might not be the right barriers for IOs ...

Arm is using __iormb in readX macros
#define __iormb()   rmb()
but it is arm specific only that's why I have added there rmb().

Thanks,
Michal




-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/11 Arnd Bergmann a...@arndb.de:
 On Monday 11 February 2013, Michal Simek wrote:
 Unfortunately no. Another is spi/i2c (sysace as we discuss in this
 thread), probably icap
 network drivers are ok because they are not shared.
 Timer when it is moved to clocksource(not important right now)
 xilinx gpio is using __raw IO functions which is incorrect on arm in
 connection to barriers.

 But it reminds me that maybe the easiest solution is not to use endian
 accessors just use two simple macros which should work on all systems.

 #define name_readreg(offset)  ({__raw_readl(offset); rmb(); })
 #define name_writereg(offset, val)({wmb(); __raw_writel(val, offset); 
 })

 which is probably the faster solution which add minimum additional code
 to driver and can also remove endian detection code.

 I tend to disagree. You should never assume that a device is the same
 endianess as the the CPU, and you should try not to use the __raw_*
 accessors in device drivers either.

 In particular, ARM can run both big- and little-endian even though
 big-endian is rarely used, so you need to know the endianess for
 the device you are talking to rather than assume that it knows
 what the CPU does at the time.

For high performance IPs using accessors functions is still problematic
because there will be performance regression it means that
from my point of view there still should be any option to setup
proper endians for the driver and it can't be setup at run-time.

Sure the question is if drivers like this should be in the mainline.
But if yes, there should be an option to do it in acceptable way.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Michal Simek wrote:
  In particular, ARM can run both big- and little-endian even though
  big-endian is rarely used, so you need to know the endianess for
  the device you are talking to rather than assume that it knows
  what the CPU does at the time.
 
 For high performance IPs using accessors functions is still problematic
 because there will be performance regression it means that
 from my point of view there still should be any option to setup
 proper endians for the driver and it can't be setup at run-time.

I did not mean you have to use a run-time detection here, although
that is often the easiest solution. If you know the endianess of
a device for a specific architecture or platform, it is totally
fine to pick that endianess at compile-time and use e.g. the
readl_relaxed() accessors on ARM to give you the lowest access
latency.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Benjamin Herrenschmidt wrote:
 It depends how the ARM core operates vs. IO when switched between BE and
 LE, does it keep the same lines doing byte 0 or does it keep the MSB/LSB
 in the same place (and thus changes which lanes contain byte 0) ?

IIRC it changed between older and more recent ARM, at least we have
separate configuration options for BE-8 big-endian mode on ARMv6
and above, and BE-32 on older ones.

Currently we don't support platforms upstream that get configured
as BE-32, but the code is there and I know of people using it.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/12 Arnd Bergmann a...@arndb.de:
 On Tuesday 12 February 2013, Michal Simek wrote:
  In particular, ARM can run both big- and little-endian even though
  big-endian is rarely used, so you need to know the endianess for
  the device you are talking to rather than assume that it knows
  what the CPU does at the time.

 For high performance IPs using accessors functions is still problematic
 because there will be performance regression it means that
 from my point of view there still should be any option to setup
 proper endians for the driver and it can't be setup at run-time.

 I did not mean you have to use a run-time detection here, although
 that is often the easiest solution. If you know the endianess of
 a device for a specific architecture or platform, it is totally
 fine to pick that endianess at compile-time and use e.g. the
 readl_relaxed() accessors on ARM to give you the lowest access
 latency.

ok but still there should be well defined how to do it. Let's say
generic Kconfig option.
Or maybe there is also third option to modified code to use proper writes.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Benjamin Herrenschmidt
On Tue, 2013-02-12 at 11:11 +0100, Michal Simek wrote:
 For high performance IPs using accessors functions is still
 problematic
 because there will be performance regression it means that
 from my point of view there still should be any option to setup
 proper endians for the driver and it can't be setup at run-time.
 
 Sure the question is if drivers like this should be in the mainline.
 But if yes, there should be an option to do it in acceptable way.

No. High performance IP should rely on DMA essentially and be less
affected if not at all. Endianess should affect the control path, not
the data path (unless you got your wiring wrong again :-)

If you want an Kconfig option for an optimal version, put the ifdef
trickery in the accessors and make them inline.

We do something like that iirc with OHCI where the type of endianness
needed is a pair of select's and the conditional only happens when both
are set. You can probably do better with feature bits  a mask of
possible features relying on the compiler optimisations rather than
the preprocessor.

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Michal Simek wrote:
 ok but still there should be well defined how to do it. Let's say
 generic Kconfig option.

You cannot solve it in a generic way, since every device has different
needs. In a single SoC, you may have one device that only ever exists
with little-endian I/O, one that is always wired the same way as the
CPU, one that always swaps endianess and one that exists in two
variants but you know which one you have at compile time.

 Or maybe there is also third option to modified code to use proper writes.

You mean run-time code patching? That is seriously wrong here. As Ben
explained, MMIO is slow by definition, so if you add five cycles for
an indirect function call on every MMIO read, that will not even
be noticeable in the 20µs latency that the read has on the bus.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Michal Simek
2013/2/7 Grant Likely grant.lik...@secretlab.ca:
 On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
 alexey.brod...@synopsys.com wrote:
 On 02/07/2013 06:51 PM, Grant Likely wrote:

 On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely grant.lik...@secretlab.ca
 wrote:

 On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:

   In fact, the driver already knows about this and figures
 out at runtime how the device is wired up to the bus. This is not the
 problem.


 Except that this is very gross, especially when you observe that in the
 busted big endian case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.


 I don't see why the _rep variants aren't usable here. The only reason
 I didn't use them when I wrote the driver in the first place was I was
 a n00b kernel hacker and I didn't know they were there.


 The 8-bit variant is different though because the hardware requires
 pingponging between odd and even byte addresses to flush the fifo.
 Reading a data port even address (0x40) gives the least significant
 byte. Reading from an odd address (0x41) give the MSB and pops the
 data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
 mode. It should still be fine in 16-bit.

 page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf


 Ok, so may I do a re-spin with these changes:

 There are two things here. 1) changing the accessors used. 2)
 switching the endianess as a bug fix. Any changes to the endian access
 should be a separate patch which a description of what is needed.

 1. In ace_in_be16 use ioread16be
 2. In ace_out_be16 use iowrite16be
 3. In ace_in_le16 use ioread16
 4. In ace_out_le16 use iowrite16

 Yes

 5. In ace_datain_le16 use ioread16_rep
 6. In ace_dataout_le16 use iowrite16_rep

 Maybe. In a separate patch. Hmmm... I guess there isn't an
 ioread16be_rep variant. Oh well. Check first with Michal on LE
 microblaze before making the change. If it doesn't work for him the
 more understanding is needed. I was pretty sure the LE variant already
 worked.

Sorry it took me some time to get 16bit LE to work for test but here
are test results.

I have tested it on ppc BE, Microblaze BE and Microblaze LE systems
and surprisingly on all of them only ace_reg_le16_ops are used.

But on Microblaze LE is necessary to use different datain/out_le16
functions as below
which are also not compatible with Microblaze and PPC BE.

diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
index bbad046..8dd192c 100644
--- a/drivers/block/xsysace.c
+++ b/drivers/block/xsysace.c
@@ -314,7 +314,7 @@ static void ace_datain_le16(struct ace_device *ace)
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace-data_ptr;
while (i--)
-   *dst++ = ioread16be(ace-baseaddr + 0x40);
+   *dst++ = ioread16(ace-baseaddr + 0x40);
ace-data_ptr = dst;
 }

@@ -323,7 +323,7 @@ static void ace_dataout_le16(struct ace_device *ace)
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace-data_ptr;
while (i--)
-   iowrite16be(*src++, ace-baseaddr + 0x40);
+   iowrite16(*src++, ace-baseaddr + 0x40);
ace-data_ptr = src;
 }

Grant: How did you tested BE mode? It is fpga it means you can switch
some connections
and it could work.

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-12 Thread Arnd Bergmann
On Tuesday 12 February 2013, Michal Simek wrote:
 But on Microblaze LE is necessary to use different datain/out_le16
 functions as below
 which are also not compatible with Microblaze and PPC BE.
 
 diff --git a/drivers/block/xsysace.c b/drivers/block/xsysace.c
 index bbad046..8dd192c 100644
 --- a/drivers/block/xsysace.c
 +++ b/drivers/block/xsysace.c
 @@ -314,7 +314,7 @@ static void ace_datain_le16(struct ace_device *ace)
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace-data_ptr;
 while (i--)
 -   *dst++ = ioread16be(ace-baseaddr + 0x40);
 +   *dst++ = ioread16(ace-baseaddr + 0x40);
 ace-data_ptr = dst;
  }

Hmm, that actually seems sane then: You can replace the loop around
ioread16be with a single ioread32_rep() call, which will use
the correct native endian accesses on both LE Microblaze
and on all the big-endian platforms. Note that the asm-generic
definition of that function was broken until quite recently
when Will Deacon repaired it in order to support ARM64.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Benjamin Herrenschmidt
On Mon, 2013-02-11 at 16:08 +, Arnd Bergmann wrote:
> I tend to disagree. You should never assume that a device is the same
> endianess as the the CPU, and you should try not to use the __raw_*
> accessors in device drivers either.
> 
> In particular, ARM can run both big- and little-endian even though
> big-endian is rarely used, so you need to know the endianess for
> the device you are talking to rather than assume that it knows
> what the CPU does at the time.

Part of the problem he might be having is that the way a device is wired
to the bus may be different depending on whether the CPU is running LE
or BE ... or rather should be (in order to preserve byte addresses).

However, not all logic out there might do it right, which means that you
may end up with the wrong wiring if you switch the core around in SW.

It depends how the ARM core operates vs. IO when switched between BE and
LE, does it keep the same lines doing byte 0 or does it keep the MSB/LSB
in the same place (and thus changes which lanes contain byte 0) ?

Depending on how the core does the mode change it may require a wiring
change of IO devices as well... If it changes where byte 0 is, then data
cycles must have the lanes reversed. If it changes where the MSB is,
then address cycles must have the lanes reversed.

I wouldn't be surprised if most SOC bus logic out there gets it wired up
for one and only one case.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Benjamin Herrenschmidt
On Mon, 2013-02-11 at 16:57 +0100, Michal Simek wrote:
> But it reminds me that maybe the easiest solution is not to use endian
> accessors just use two simple macros which should work on all systems.
> 
> #define _readreg(offset)  ({__raw_readl(offset); rmb(); })
> #define _writereg(offset, val)({wmb(); __raw_writel(val, offset); })
> 
> which is probably the faster solution which add minimum additional code
> to driver and can also remove endian detection code.

Except that rmb & wmb might not be the right barriers for IOs ...

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Arnd Bergmann
On Monday 11 February 2013, Michal Simek wrote:
> Unfortunately no. Another is spi/i2c (sysace as we discuss in this
> thread), probably icap
> network drivers are ok because they are not shared.
> Timer when it is moved to clocksource(not important right now)
> xilinx gpio is using __raw IO functions which is incorrect on arm in
> connection to barriers.
> 
> But it reminds me that maybe the easiest solution is not to use endian
> accessors just use two simple macros which should work on all systems.
> 
> #define _readreg(offset)  ({__raw_readl(offset); rmb(); })
> #define _writereg(offset, val)({wmb(); __raw_writel(val, offset); })
> 
> which is probably the faster solution which add minimum additional code
> to driver and can also remove endian detection code.

I tend to disagree. You should never assume that a device is the same
endianess as the the CPU, and you should try not to use the __raw_*
accessors in device drivers either.

In particular, ARM can run both big- and little-endian even though
big-endian is rarely used, so you need to know the endianess for
the device you are talking to rather than assume that it knows
what the CPU does at the time.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Michal Simek
2013/2/11 Arnd Bergmann :
> On Monday 11 February 2013, Michal Simek wrote:
>> I have just found that it won't be so easy as I thought because I have found
>> that microblaze wrong implementation was done because of others device 
>> drivers.
>> It means that I have to fix all device drivers to support big and
>> little endian accessors
>> first before I can switch microblaze to asm-generic/io.h. :-(
>>
>> BTW: If you want to take this patch though your tree then ok, or it
>> can go through my
>> microblaze tree. Both ways should just work.
>
> Please use your tree then. I don't have any asm-generic patches
> lined up for 3.9 myself, so we can save me and Linus a little work
> with a trivial pull request that way.
>
>> I will send the patch for uarlite when I finish my testing on
>> microblaze, ppc and arm.
>
> Ok. Is that the only one that you found to require the "wrong-endian" mode
> in microblaze.

Unfortunately no. Another is spi/i2c (sysace as we discuss in this
thread), probably icap
network drivers are ok because they are not shared.
Timer when it is moved to clocksource(not important right now)
xilinx gpio is using __raw IO functions which is incorrect on arm in
connection to barriers.

But it reminds me that maybe the easiest solution is not to use endian
accessors just use two simple macros which should work on all systems.

#define _readreg(offset)  ({__raw_readl(offset); rmb(); })
#define _writereg(offset, val)({wmb(); __raw_writel(val, offset); })

which is probably the faster solution which add minimum additional code
to driver and can also remove endian detection code.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Arnd Bergmann
On Monday 11 February 2013, Michal Simek wrote:
> I have just found that it won't be so easy as I thought because I have found
> that microblaze wrong implementation was done because of others device 
> drivers.
> It means that I have to fix all device drivers to support big and
> little endian accessors
> first before I can switch microblaze to asm-generic/io.h. :-(
> 
> BTW: If you want to take this patch though your tree then ok, or it
> can go through my
> microblaze tree. Both ways should just work.

Please use your tree then. I don't have any asm-generic patches
lined up for 3.9 myself, so we can save me and Linus a little work
with a trivial pull request that way.
 
> I will send the patch for uarlite when I finish my testing on
> microblaze, ppc and arm.

Ok. Is that the only one that you found to require the "wrong-endian" mode
in microblaze.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Michal Simek
2013/2/11 Arnd Bergmann :
> On Monday 11 February 2013, Michal Simek wrote:
>>
>> 2013/2/8 Arnd Bergmann :
>> > On Thursday 07 February 2013, Michal Simek wrote:
>> >> Subject: "asm-generic: io: Fix ioread16/32be and iowrite16/32be"
>> >
>> > Ok, thanks. If you don't mind, I think this can just go with the other 
>> > patches
>> > for xsysace that come out of this discussion.
>>
>> I want to move microblaze to use this asm-generic/io.h that's why I
>> prefer to send
>> this with microblaze patch directly to Linus. For me this make more sense 
>> that
>> with sending block device driver changes.
>>
>> What do you think?
>
> Sure, please do it that way. My policy for asm-generic patches is that they
> I prefer to have them go through whatever tree needs them, I was just confused
> about which one that would be in this case.
>

I have just found that it won't be so easy as I thought because I have found
that microblaze wrong implementation was done because of others device drivers.
It means that I have to fix all device drivers to support big and
little endian accessors
first before I can switch microblaze to asm-generic/io.h. :-(

BTW: If you want to take this patch though your tree then ok, or it
can go through my
microblaze tree. Both ways should just work.

I will send the patch for uarlite when I finish my testing on
microblaze, ppc and arm.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Arnd Bergmann
On Monday 11 February 2013, Michal Simek wrote:
> 
> 2013/2/8 Arnd Bergmann :
> > On Thursday 07 February 2013, Michal Simek wrote:
> >> Subject: "asm-generic: io: Fix ioread16/32be and iowrite16/32be"
> >
> > Ok, thanks. If you don't mind, I think this can just go with the other 
> > patches
> > for xsysace that come out of this discussion.
> 
> I want to move microblaze to use this asm-generic/io.h that's why I
> prefer to send
> this with microblaze patch directly to Linus. For me this make more sense that
> with sending block device driver changes.
> 
> What do you think?

Sure, please do it that way. My policy for asm-generic patches is that they
I prefer to have them go through whatever tree needs them, I was just confused
about which one that would be in this case.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Michal Simek
2013/2/8 Arnd Bergmann :
> On Thursday 07 February 2013, Michal Simek wrote:
>> Subject: "asm-generic: io: Fix ioread16/32be and iowrite16/32be"
>
> Ok, thanks. If you don't mind, I think this can just go with the other patches
> for xsysace that come out of this discussion.

I want to move microblaze to use this asm-generic/io.h that's why I
prefer to send
this with microblaze patch directly to Linus. For me this make more sense that
with sending block device driver changes.

What do you think?

Thanks,
Michal



-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Michal Simek
2013/2/8 Arnd Bergmann a...@arndb.de:
 On Thursday 07 February 2013, Michal Simek wrote:
 Subject: asm-generic: io: Fix ioread16/32be and iowrite16/32be

 Ok, thanks. If you don't mind, I think this can just go with the other patches
 for xsysace that come out of this discussion.

I want to move microblaze to use this asm-generic/io.h that's why I
prefer to send
this with microblaze patch directly to Linus. For me this make more sense that
with sending block device driver changes.

What do you think?

Thanks,
Michal



-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Arnd Bergmann
On Monday 11 February 2013, Michal Simek wrote:
 
 2013/2/8 Arnd Bergmann a...@arndb.de:
  On Thursday 07 February 2013, Michal Simek wrote:
  Subject: asm-generic: io: Fix ioread16/32be and iowrite16/32be
 
  Ok, thanks. If you don't mind, I think this can just go with the other 
  patches
  for xsysace that come out of this discussion.
 
 I want to move microblaze to use this asm-generic/io.h that's why I
 prefer to send
 this with microblaze patch directly to Linus. For me this make more sense that
 with sending block device driver changes.
 
 What do you think?

Sure, please do it that way. My policy for asm-generic patches is that they
I prefer to have them go through whatever tree needs them, I was just confused
about which one that would be in this case.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Michal Simek
2013/2/11 Arnd Bergmann a...@arndb.de:
 On Monday 11 February 2013, Michal Simek wrote:

 2013/2/8 Arnd Bergmann a...@arndb.de:
  On Thursday 07 February 2013, Michal Simek wrote:
  Subject: asm-generic: io: Fix ioread16/32be and iowrite16/32be
 
  Ok, thanks. If you don't mind, I think this can just go with the other 
  patches
  for xsysace that come out of this discussion.

 I want to move microblaze to use this asm-generic/io.h that's why I
 prefer to send
 this with microblaze patch directly to Linus. For me this make more sense 
 that
 with sending block device driver changes.

 What do you think?

 Sure, please do it that way. My policy for asm-generic patches is that they
 I prefer to have them go through whatever tree needs them, I was just confused
 about which one that would be in this case.


I have just found that it won't be so easy as I thought because I have found
that microblaze wrong implementation was done because of others device drivers.
It means that I have to fix all device drivers to support big and
little endian accessors
first before I can switch microblaze to asm-generic/io.h. :-(

BTW: If you want to take this patch though your tree then ok, or it
can go through my
microblaze tree. Both ways should just work.

I will send the patch for uarlite when I finish my testing on
microblaze, ppc and arm.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Arnd Bergmann
On Monday 11 February 2013, Michal Simek wrote:
 I have just found that it won't be so easy as I thought because I have found
 that microblaze wrong implementation was done because of others device 
 drivers.
 It means that I have to fix all device drivers to support big and
 little endian accessors
 first before I can switch microblaze to asm-generic/io.h. :-(
 
 BTW: If you want to take this patch though your tree then ok, or it
 can go through my
 microblaze tree. Both ways should just work.

Please use your tree then. I don't have any asm-generic patches
lined up for 3.9 myself, so we can save me and Linus a little work
with a trivial pull request that way.
 
 I will send the patch for uarlite when I finish my testing on
 microblaze, ppc and arm.

Ok. Is that the only one that you found to require the wrong-endian mode
in microblaze.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Michal Simek
2013/2/11 Arnd Bergmann a...@arndb.de:
 On Monday 11 February 2013, Michal Simek wrote:
 I have just found that it won't be so easy as I thought because I have found
 that microblaze wrong implementation was done because of others device 
 drivers.
 It means that I have to fix all device drivers to support big and
 little endian accessors
 first before I can switch microblaze to asm-generic/io.h. :-(

 BTW: If you want to take this patch though your tree then ok, or it
 can go through my
 microblaze tree. Both ways should just work.

 Please use your tree then. I don't have any asm-generic patches
 lined up for 3.9 myself, so we can save me and Linus a little work
 with a trivial pull request that way.

 I will send the patch for uarlite when I finish my testing on
 microblaze, ppc and arm.

 Ok. Is that the only one that you found to require the wrong-endian mode
 in microblaze.

Unfortunately no. Another is spi/i2c (sysace as we discuss in this
thread), probably icap
network drivers are ok because they are not shared.
Timer when it is moved to clocksource(not important right now)
xilinx gpio is using __raw IO functions which is incorrect on arm in
connection to barriers.

But it reminds me that maybe the easiest solution is not to use endian
accessors just use two simple macros which should work on all systems.

#define name_readreg(offset)  ({__raw_readl(offset); rmb(); })
#define name_writereg(offset, val)({wmb(); __raw_writel(val, offset); })

which is probably the faster solution which add minimum additional code
to driver and can also remove endian detection code.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Arnd Bergmann
On Monday 11 February 2013, Michal Simek wrote:
 Unfortunately no. Another is spi/i2c (sysace as we discuss in this
 thread), probably icap
 network drivers are ok because they are not shared.
 Timer when it is moved to clocksource(not important right now)
 xilinx gpio is using __raw IO functions which is incorrect on arm in
 connection to barriers.
 
 But it reminds me that maybe the easiest solution is not to use endian
 accessors just use two simple macros which should work on all systems.
 
 #define name_readreg(offset)  ({__raw_readl(offset); rmb(); })
 #define name_writereg(offset, val)({wmb(); __raw_writel(val, offset); })
 
 which is probably the faster solution which add minimum additional code
 to driver and can also remove endian detection code.

I tend to disagree. You should never assume that a device is the same
endianess as the the CPU, and you should try not to use the __raw_*
accessors in device drivers either.

In particular, ARM can run both big- and little-endian even though
big-endian is rarely used, so you need to know the endianess for
the device you are talking to rather than assume that it knows
what the CPU does at the time.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Benjamin Herrenschmidt
On Mon, 2013-02-11 at 16:57 +0100, Michal Simek wrote:
 But it reminds me that maybe the easiest solution is not to use endian
 accessors just use two simple macros which should work on all systems.
 
 #define name_readreg(offset)  ({__raw_readl(offset); rmb(); })
 #define name_writereg(offset, val)({wmb(); __raw_writel(val, offset); })
 
 which is probably the faster solution which add minimum additional code
 to driver and can also remove endian detection code.

Except that rmb  wmb might not be the right barriers for IOs ...

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-11 Thread Benjamin Herrenschmidt
On Mon, 2013-02-11 at 16:08 +, Arnd Bergmann wrote:
 I tend to disagree. You should never assume that a device is the same
 endianess as the the CPU, and you should try not to use the __raw_*
 accessors in device drivers either.
 
 In particular, ARM can run both big- and little-endian even though
 big-endian is rarely used, so you need to know the endianess for
 the device you are talking to rather than assume that it knows
 what the CPU does at the time.

Part of the problem he might be having is that the way a device is wired
to the bus may be different depending on whether the CPU is running LE
or BE ... or rather should be (in order to preserve byte addresses).

However, not all logic out there might do it right, which means that you
may end up with the wrong wiring if you switch the core around in SW.

It depends how the ARM core operates vs. IO when switched between BE and
LE, does it keep the same lines doing byte 0 or does it keep the MSB/LSB
in the same place (and thus changes which lanes contain byte 0) ?

Depending on how the core does the mode change it may require a wiring
change of IO devices as well... If it changes where byte 0 is, then data
cycles must have the lanes reversed. If it changes where the MSB is,
then address cycles must have the lanes reversed.

I wouldn't be surprised if most SOC bus logic out there gets it wired up
for one and only one case.

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 5:22 PM, Alexey Brodkin
 wrote:
> On 02/07/2013 09:16 PM, Grant Likely wrote:
>>
>> On Thu, Feb 7, 2013 at 4:56 PM, Alexey Brodkin
>>  wrote:
>>>
>>> On 02/07/2013 08:44 PM, Grant Likely wrote:

 So, if I'm correct that means that for the data port (specifically
 copying between RAM and the data port) using ioread16/iowrite16 on the
 data port results in the correct behaviour. It also means that LE
 support in the current driver is broken.
>>>
>>>
>>> That matches my earlier note when I wrote that for correct work on LE
>>> (note
>>> I'm on ARC, not PPC/MB) I needed to use "io{read|write}16" in
>>> "ace_data{in|out}_le16".
>>>
>>> With original "io{read|write}16be" instead data was corrupted.
>>
>>
>> In which case your bug-fix patch should drop the
>> ace_datain_le16/ace_dataout_le16 variants entirely and use the be16
>> ones for both (renaming appropriately).
>>
>> g.
>>
>> --
>> Grant Likely, B.Sc., P.Eng.
>> Secret Lab Technologies Ltd.
>>
>
> Sorry, do you mean to replace original lines:
> ===
>
> static void ace_datain_be16(struct ace_device *ace)
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *dst = ace->data_ptr;
> while (i--)
> *dst++ = in_le16(ace->baseaddr + 0x40);
> ace->data_ptr = dst;
>
> }
>
> static void ace_dataout_be16(struct ace_device *ace)
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *src = ace->data_ptr;
> while (i--)
> ioread16(*src++, ace->baseaddr + 0x40);
> ace->data_ptr = src;
> }
> ===
>
> with something like:
> ===
> static void ace_datain_16(struct ace_device *ace)
>
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *dst = ace->data_ptr;
> while (i--)
> *dst++ = in_le16(ace->baseaddr + 0x40);
> ace->data_ptr = dst;
> }
>
> static void ace_dataout_16(struct ace_device *ace)
>
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *src = ace->data_ptr;
> while (i--)
> iowrite16(*src++, ace->baseaddr + 0x40);
> ace->data_ptr = src;
> }

Ummm, I think you finger fumbled that because the above doesn't make sense.

I think that your original patch should be applied as-is first. It is
just a mechanical replacement of the ppc accessors with ioread/iowrite
variants. Nothing controversial there.

I was suggesting to use a second patch to drop
ace_datain_le16/ace_dataout_le16 and rename
ace_datain_be16/ace_dataout_be16 to ace_datain_16/ace_dataout_16.\,
and in that same patch you can switch the read loop to use
ioread16_rep/iowrite16_rep.

*However*... I think my first analysis is actually wrong for the BE
case. The current BE code works using ioread16() for the data port
which does a swap, but ioread16_rep() doesn't swap . I hadn't gone an
actually looked at how the _rep variants are implemented. (Thanks for
your help Ben). That means ioread16_rep will work fine for LE, but
we're still stuck with the slow loop on BE.

So, craft your bug fix for the LE case to use _rep variants for
ace_{datain,dataout}_le16() but don't change the BE support yet.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 11:05 PM, Arnd Bergmann  wrote:
> On Thursday 07 February 2013, Grant Likely wrote:
>> On Thu, Feb 7, 2013 at 3:28 PM, Alexey Brodkin
> Of course, as long as the driver is only ever used to access
> the same non-removable block device and you don't change
> the driver, it does not matter at all whether you swap bytes
> on the data port or not, because they are swapped on both
> read and write, and it's just storage. Only if you try to
> read the device with a "correct" driver, you will see a problem
> if it was written with a "wrong" driver.

It's actually a removable compact flash card interface. I know that
the current big-endian support code is correct because my laptop
doesn't complain about the data.  :-)

g.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Geert Uytterhoeven
On Fri, Feb 8, 2013 at 12:05 AM, Arnd Bergmann  wrote:
>> The same is true for the data port. A BE read does the right thing on
>> a BE platform, and LE read on a LE platform. (again, this is all about
>> bus attachment). However, the difference is what is then done with the
>> data.
>
> Well, except that we cannot use the ioread16be_rep function,

ioread16_rep

> which is made for the case where the bus does not swap.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Arnd Bergmann
On Thursday 07 February 2013, Michal Simek wrote:
> Subject: "asm-generic: io: Fix ioread16/32be and iowrite16/32be"

Ok, thanks. If you don't mind, I think this can just go with the other patches
for xsysace that come out of this discussion.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Arnd Bergmann
On Thursday 07 February 2013, Grant Likely wrote:
> On Thu, Feb 7, 2013 at 3:28 PM, Alexey Brodkin

> Starting with register (non-data) access. The bus bindings are such
> that on both BE and LE systems a native 16 bit read results in the
> bits being in the correct order. On powerpc, you want to do a BE read
> because the device appears as a BE device, and on LE systems a LE read
> is wanted because that is how the device is wired. So far this makes
> sense and matches the driver.

Well, until you get a little-endian PowerPC system with this
device ;). I heard people are plannng to put those into
production systems. I would assume that the device would
still use big-endian registers, so the rule about PowerPC
using the BE accessors still holds, but it stops making
sense.

> The same is true for the data port. A BE read does the right thing on
> a BE platform, and LE read on a LE platform. (again, this is all about
> bus attachment). However, the difference is what is then done with the
> data.

Well, except that we cannot use the ioread16be_rep function,
which is made for the case where the bus does not swap.

Of course, as long as the driver is only ever used to access
the same non-removable block device and you don't change
the driver, it does not matter at all whether you swap bytes
on the data port or not, because they are swapped on both
read and write, and it's just storage. Only if you try to
read the device with a "correct" driver, you will see a problem
if it was written with a "wrong" driver.

> The ironic thing is that if the BE PPC/MB hardware engineers hadn't
> swapped the bytelanes then we would *still* have to do special thinks
> in the hardware; except it would be the data port accesses that would
> need extra work, instead of the other registers.

No, we would juse use the _rep() functions if they had done it right,
just like any other driver with this problem.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 20:56 +0400, Alexey Brodkin wrote:
> > So, if I'm correct that means that for the data port (specifically
> > copying between RAM and the data port) using ioread16/iowrite16 on the
> > data port results in the correct behaviour. It also means that LE
> > support in the current driver is broken.
> 
> That matches my earlier note when I wrote that for correct work on LE 
> (note I'm on ARC, not PPC/MB) I needed to use "io{read|write}16" in 
> "ace_data{in|out}_le16".
> 
> With original "io{read|write}16be" instead data was corrupted.

For the "backward wiring case" your data port will always be the opposite
endian as your host. For the "correct wiring" case, the data port will always
be the same endian as your host.

So the correct wiring case (the one using ioread16/iowrite16 for registers),
the data port should just use ioread16_rep.

For the other case, you need some ifdef'ery or raw variants with home made
barriers, or a bounce buffer.

Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 16:44 +, Grant Likely wrote:
> 
> I've just spent some quality time with a piece of paper, and I think
> I've figured it out...

http://linuxplumbers.ubicast.tv/videos/big-and-little-endian-inside-out/

Watch the last part on IO busses

This all has to do which which byte of the bus is the lowest byte
*address* regardless of significance. That's how the wiring should be
consistent between CPU and device, in order for a data port to work
properly.

The data port then requires no swapping (which also means it works
nicely with dumb DMA engines etc...)

Whether the registers need swapping or not depends on which half is the
MSB, which is somewhat a semantically higher level than the bus
transport, and depends on whether the device exposes them as BE or LE
registers. But data ports are just "windows" to a byte stream and
shouldn't be affected by endianess (again, unless you get a moron doing
the HW which seems to be still too common).

There is only one right way to connect devices to CPUs basically, which
is called byte address invariance, and preserves the order of bytes in
term of byte addresses.

(Note that for busses that also carry addresses such as PCI, this can
get tricky as you might need to have the lanes routed in a different
order for address vs. data cycles).

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 15:23 +, Grant Likely wrote:
> 
> Maybe. In a separate patch. Hmmm... I guess there isn't an
> ioread16be_rep variant. 

Because it would not make sense. The ioread16_rep isn't "LE" ... it
should be usable for any kind if data port since such a fifo should
never require byteswap as long as the bus is wired properly (whether the
device is LE or BE).

The problem we have here is that we have an LE device that can be wired
backward. The fact that this only happens when the CPU is BE is somewhat
irrelevant (well, let's say that the confusion is easier to make for the
HW guys when using a BE CPU but fundamentally it's not relevant, a
similar backard wiring could have been done for a BE device on a LE CPU
for example). 

So in that case, you need some kind of hand made loop.

> Oh well. Check first with Michal on LE microblaze before making the
> change. If it doesn't work for him the more understanding is needed. I
> was pretty sure the LE variant already worked.
> 
> > not sure about items for "ace_datain/out_be16" - what about _rep
> options
> > here?
> 
> ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
> to use the LE accessor. The existing code is definitely correct in
> this respect.

Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 19:12 +0400, Alexey Brodkin wrote:
> not sure about items for "ace_datain/out_be16" - what about _rep
> options here?

Well, you have a backward wiring of an LE device so you can't use the
_rep variants, unless you ping pong, so you either use a loop of
ioread/write16 (le) and bite the bullet on extra barriers, or use _rep &
bounce buffer for a separate swap pass.

Point is, the backward wiring will require byteswap on both BE and LE
hosts for data (which is why it's so stupid).

Cheers,
Ben.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 09:16 PM, Grant Likely wrote:

On Thu, Feb 7, 2013 at 4:56 PM, Alexey Brodkin
 wrote:

On 02/07/2013 08:44 PM, Grant Likely wrote:

So, if I'm correct that means that for the data port (specifically
copying between RAM and the data port) using ioread16/iowrite16 on the
data port results in the correct behaviour. It also means that LE
support in the current driver is broken.


That matches my earlier note when I wrote that for correct work on LE (note
I'm on ARC, not PPC/MB) I needed to use "io{read|write}16" in
"ace_data{in|out}_le16".

With original "io{read|write}16be" instead data was corrupted.


In which case your bug-fix patch should drop the
ace_datain_le16/ace_dataout_le16 variants entirely and use the be16
ones for both (renaming appropriately).

g.

--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.



Sorry, do you mean to replace original lines:
===
static void ace_datain_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace->data_ptr;
while (i--)
*dst++ = in_le16(ace->baseaddr + 0x40);
ace->data_ptr = dst;
}

static void ace_dataout_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace->data_ptr;
while (i--)
ioread16(*src++, ace->baseaddr + 0x40);
ace->data_ptr = src;
}
===

with something like:
===
static void ace_datain_16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace->data_ptr;
while (i--)
*dst++ = in_le16(ace->baseaddr + 0x40);
ace->data_ptr = dst;
}

static void ace_dataout_16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace->data_ptr;
while (i--)
iowrite16(*src++, ace->baseaddr + 0x40);
ace->data_ptr = src;
}
===

and then the next patch should replace "io{read|write}16" with 
"io{read|write}16_rep" I guess, correct?


-Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 4:56 PM, Alexey Brodkin
 wrote:
> On 02/07/2013 08:44 PM, Grant Likely wrote:
>> So, if I'm correct that means that for the data port (specifically
>> copying between RAM and the data port) using ioread16/iowrite16 on the
>> data port results in the correct behaviour. It also means that LE
>> support in the current driver is broken.
>
> That matches my earlier note when I wrote that for correct work on LE (note
> I'm on ARC, not PPC/MB) I needed to use "io{read|write}16" in
> "ace_data{in|out}_le16".
>
> With original "io{read|write}16be" instead data was corrupted.

In which case your bug-fix patch should drop the
ace_datain_le16/ace_dataout_le16 variants entirely and use the be16
ones for both (renaming appropriately).

g.

--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 08:44 PM, Grant Likely wrote:


Then that data needs to be stored into memory. This is where things
are different with native 16 bit stores:
   On a BE memory system. MSB in byte address 0 and LSB in byte address 1.
   On a LE memory system. MSB in byte address 1 and LSB in byte address 0.

The block subsystem deals with byte oriented buffers; so given the way
data is arranged in the data port we want to always make sure the LSB
goes into address 0. The cause of problems isn't the BE vs LE bus
attachment. It is the different memory system orientation. Using
ioread16/iowrite16 has the right behaviour, but it's kind of a
backwards way to go about it It isn't that we want a be16_to_cpu()
or le16_to_cpu() on the data port read, but rather a cpu_to_le16() on
the store to memory regardless of the endianess of the platform.

So, if I'm correct that means that for the data port (specifically
copying between RAM and the data port) using ioread16/iowrite16 on the
data port results in the correct behaviour. It also means that LE
support in the current driver is broken.


That matches my earlier note when I wrote that for correct work on LE 
(note I'm on ARC, not PPC/MB) I needed to use "io{read|write}16" in 
"ace_data{in|out}_le16".


With original "io{read|write}16be" instead data was corrupted.

-Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 3:28 PM, Alexey Brodkin
 wrote:
> On 02/07/2013 07:23 PM, Grant Likely wrote:
>>
>> On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
>>  wrote:
>>>
>>> On 02/07/2013 06:51 PM, Grant Likely wrote:


 On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely 
 wrote:
>
>
> On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
>  wrote:
>>>
>>>
>>>In fact, the driver already knows about this and figures
>>> out at runtime how the device is wired up to the bus. This is not the
>>> problem.
>>
>>
>>
>> Except that this is very gross, especially when you observe that in
>> the
>> busted "big endian" case, it has to byteswap the bloody data port.
>>
>> So you end up having to do that gross hack with separate accessors for
>> registers vs. data and not able to use the _rep variants, which also
>> means that on platforms like ppc, you end up with a memory barrier on
>> every access (or more), which is going to slow things down enormously.
>
>
>
> I don't see why the _rep variants aren't usable here. The only reason
> I didn't use them when I wrote the driver in the first place was I was
> a n00b kernel hacker and I didn't know they were there.



 The 8-bit variant is different though because the hardware requires
 pingponging between odd and even byte addresses to flush the fifo.
 Reading a data port even address (0x40) gives the least significant
 byte. Reading from an odd address (0x41) give the MSB and pops the
 data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
 mode. It should still be fine in 16-bit.

 page 45:
 http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf

>>>
>>> Ok, so may I do a re-spin with these changes:
>>
>>
>> There are two things here. 1) changing the accessors used. 2)
>> switching the endianess as a bug fix. Any changes to the endian access
>> should be a separate patch which a description of what is needed.
>>
>>> 1. In "ace_in_be16" use "ioread16be"
>>> 2. In "ace_out_be16" use "iowrite16be"
>>> 3. In "ace_in_le16" use "ioread16"
>>> 4. In "ace_out_le16" use "iowrite16"
>>
>>
>> Yes
>>
>>> 5. In "ace_datain_le16" use "ioread16_rep"
>>> 6. In "ace_dataout_le16" use "iowrite16_rep"
>>
>>
>> Maybe. In a separate patch. Hmmm... I guess there isn't an
>> ioread16be_rep variant. Oh well. Check first with Michal on LE
>> microblaze before making the change. If it doesn't work for him the
>> more understanding is needed. I was pretty sure the LE variant already
>> worked.
>>
>>> not sure about items for "ace_datain/out_be16" - what about _rep options
>>> here?
>>
>>
>> ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
>> to use the LE accessor. The existing code is definitely correct in
>> this respect.
>
>
> Well, if "ioread16_rep" is used in both "ace_data{in,out}_be16" and
> "ace_data{in,out}_le16" then what is a difference between them?
> Whether there's a subtle difference still exists and I cannot see it or
> unified accessor could be used from now on at least for data access.

I've just spent some quality time with a piece of paper, and I think
I've figured it out...

Starting with register (non-data) access. The bus bindings are such
that on both BE and LE systems a native 16 bit read results in the
bits being in the correct order. On powerpc, you want to do a BE read
because the device appears as a BE device, and on LE systems a LE read
is wanted because that is how the device is wired. So far this makes
sense and matches the driver.

The same is true for the data port. A BE read does the right thing on
a BE platform, and LE read on a LE platform. (again, this is all about
bus attachment). However, the difference is what is then done with the
data.

For register reads, the driver uses the value directly. Either by
writing a 16 bit value to a register or reading and interpreting a 16
bit value. In both case the drive is directly interested in the 16bit
word

However, for the data port, the data is streamed out of the FIFO and
stored in memory for the block subsystem to use. A read from the data
port (assuming 'native' accesses are used as described above) obtains
16bits of data from the disk. Disk offsets are byte oriented, but
since it they are 16 bit reads, two bytes are read at a time:
First read:  offset 0 in LSB, offset 1 MSB
Second read: offset 2 in LSB, offset 3 in MSB
Third read: offset 4 in LSB, offset 5 in MSB

etc.

Then that data needs to be stored into memory. This is where things
are different with native 16 bit stores:
  On a BE memory system. MSB in byte address 0 and LSB in byte address 1.
  On a LE memory system. MSB in byte address 1 and LSB in byte address 0.

The block subsystem deals with byte oriented buffers; so given the way
data is arranged in the data port we want to always make sure the LSB
goes into address 0. The cause of 

Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 07:23 PM, Grant Likely wrote:

On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
 wrote:

On 02/07/2013 06:51 PM, Grant Likely wrote:


On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely 
wrote:


On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 wrote:


   In fact, the driver already knows about this and figures
out at runtime how the device is wired up to the bus. This is not the
problem.



Except that this is very gross, especially when you observe that in the
busted "big endian" case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.



I don't see why the _rep variants aren't usable here. The only reason
I didn't use them when I wrote the driver in the first place was I was
a n00b kernel hacker and I didn't know they were there.



The 8-bit variant is different though because the hardware requires
pingponging between odd and even byte addresses to flush the fifo.
Reading a data port even address (0x40) gives the least significant
byte. Reading from an odd address (0x41) give the MSB and pops the
data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
mode. It should still be fine in 16-bit.

page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf



Ok, so may I do a re-spin with these changes:


There are two things here. 1) changing the accessors used. 2)
switching the endianess as a bug fix. Any changes to the endian access
should be a separate patch which a description of what is needed.


1. In "ace_in_be16" use "ioread16be"
2. In "ace_out_be16" use "iowrite16be"
3. In "ace_in_le16" use "ioread16"
4. In "ace_out_le16" use "iowrite16"


Yes


5. In "ace_datain_le16" use "ioread16_rep"
6. In "ace_dataout_le16" use "iowrite16_rep"


Maybe. In a separate patch. Hmmm... I guess there isn't an
ioread16be_rep variant. Oh well. Check first with Michal on LE
microblaze before making the change. If it doesn't work for him the
more understanding is needed. I was pretty sure the LE variant already
worked.


not sure about items for "ace_datain/out_be16" - what about _rep options
here?


ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
to use the LE accessor. The existing code is definitely correct in
this respect.


Well, if "ioread16_rep" is used in both "ace_data{in,out}_be16" and 
"ace_data{in,out}_le16" then what is a difference between them?
Whether there's a subtle difference still exists and I cannot see it or 
unified accessor could be used from now on at least for data access.


What do you think?

-Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
 wrote:
> On 02/07/2013 06:51 PM, Grant Likely wrote:
>>
>> On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely 
>> wrote:
>>>
>>> On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
>>>  wrote:
>
>   In fact, the driver already knows about this and figures
> out at runtime how the device is wired up to the bus. This is not the
> problem.


 Except that this is very gross, especially when you observe that in the
 busted "big endian" case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.
>>>
>>>
>>> I don't see why the _rep variants aren't usable here. The only reason
>>> I didn't use them when I wrote the driver in the first place was I was
>>> a n00b kernel hacker and I didn't know they were there.
>>
>>
>> The 8-bit variant is different though because the hardware requires
>> pingponging between odd and even byte addresses to flush the fifo.
>> Reading a data port even address (0x40) gives the least significant
>> byte. Reading from an odd address (0x41) give the MSB and pops the
>> data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
>> mode. It should still be fine in 16-bit.
>>
>> page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf
>>
>
> Ok, so may I do a re-spin with these changes:

There are two things here. 1) changing the accessors used. 2)
switching the endianess as a bug fix. Any changes to the endian access
should be a separate patch which a description of what is needed.

> 1. In "ace_in_be16" use "ioread16be"
> 2. In "ace_out_be16" use "iowrite16be"
> 3. In "ace_in_le16" use "ioread16"
> 4. In "ace_out_le16" use "iowrite16"

Yes

> 5. In "ace_datain_le16" use "ioread16_rep"
> 6. In "ace_dataout_le16" use "iowrite16_rep"

Maybe. In a separate patch. Hmmm... I guess there isn't an
ioread16be_rep variant. Oh well. Check first with Michal on LE
microblaze before making the change. If it doesn't work for him the
more understanding is needed. I was pretty sure the LE variant already
worked.

> not sure about items for "ace_datain/out_be16" - what about _rep options
> here?

ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
to use the LE accessor. The existing code is definitely correct in
this respect.

g.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 06:51 PM, Grant Likely wrote:

On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely  wrote:

On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 wrote:

  In fact, the driver already knows about this and figures
out at runtime how the device is wired up to the bus. This is not the
problem.


Except that this is very gross, especially when you observe that in the
busted "big endian" case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.


I don't see why the _rep variants aren't usable here. The only reason
I didn't use them when I wrote the driver in the first place was I was
a n00b kernel hacker and I didn't know they were there.


The 8-bit variant is different though because the hardware requires
pingponging between odd and even byte addresses to flush the fifo.
Reading a data port even address (0x40) gives the least significant
byte. Reading from an odd address (0x41) give the MSB and pops the
data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
mode. It should still be fine in 16-bit.

page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf



Ok, so may I do a re-spin with these changes:
1. In "ace_in_be16" use "ioread16be"
2. In "ace_out_be16" use "iowrite16be"
3. In "ace_in_le16" use "ioread16"
4. In "ace_out_le16" use "iowrite16"
5. In "ace_datain_le16" use "ioread16_rep"
6. In "ace_dataout_le16" use "iowrite16_rep"

not sure about items for "ace_datain/out_be16" - what about _rep options 
here?


-Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely  wrote:
> On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
>  wrote:
>>>  In fact, the driver already knows about this and figures
>>> out at runtime how the device is wired up to the bus. This is not the
>>> problem.
>>
>> Except that this is very gross, especially when you observe that in the
>> busted "big endian" case, it has to byteswap the bloody data port.
>>
>> So you end up having to do that gross hack with separate accessors for
>> registers vs. data and not able to use the _rep variants, which also
>> means that on platforms like ppc, you end up with a memory barrier on
>> every access (or more), which is going to slow things down enormously.
>
> I don't see why the _rep variants aren't usable here. The only reason
> I didn't use them when I wrote the driver in the first place was I was
> a n00b kernel hacker and I didn't know they were there.

The 8-bit variant is different though because the hardware requires
pingponging between odd and even byte addresses to flush the fifo.
Reading a data port even address (0x40) gives the least significant
byte. Reading from an odd address (0x41) give the MSB and pops the
data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
mode. It should still be fine in 16-bit.

page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 wrote:
> On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:
>>
>> Huh? That makes no sense. This device out in the wild with both big
>> and little endian bus attachments. You can argue all day that one of
>> them is wrong, but it doesn't matter. It exists, is used, and must be
>> supported.
>
> No. That's where you are VERY wrong. We don't have to support crap and
> arguably shouldn't if that can give an incentive to vendors to fix their
> stuff. If you don't believe me, ask Linus :-)

As far as horrible hardware goes, this device comes no where close to
some of the stuff I've worked on. The driver is self contained. It
doesn't have any nasty hooks into the rest of the kernel and most
importantly it currently *works* and is used.

>>  In fact, the driver already knows about this and figures
>> out at runtime how the device is wired up to the bus. This is not the
>> problem.
>
> Except that this is very gross, especially when you observe that in the
> busted "big endian" case, it has to byteswap the bloody data port.
>
> So you end up having to do that gross hack with separate accessors for
> registers vs. data and not able to use the _rep variants, which also
> means that on platforms like ppc, you end up with a memory barrier on
> every access (or more), which is going to slow things down enormously.

I don't see why the _rep variants aren't usable here. The only reason
I didn't use them when I wrote the driver in the first place was I was
a n00b kernel hacker and I didn't know they were there.

g.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 06:31 PM, Michal Simek wrote:

The origin patch (after some microblaze ioread/iowrite fixes) works ok,
These additional changes is breaking sysace on microblaze big endian
ml505 16bit mode.
8bit mode just works

Also I have looked at our tree and I see that the mainline kernel misses
one patch which was sent by Graeme Smecher and it was also Acked-by Grant.
It is called "Fix device name assignment for SystemACE (from "xs`" to "xsa")."
Let me resend it too.

Alexey: Can you please confirm that on your arc platform you see broken CF
partitions name?

Thanks,
Michal



Indeed I had it and fixed myself)
I didn't know about this fix so I've just sent another fix for it - 
https://patchwork.kernel.org/patch/2110811/


Anyways good to know now it is fixed.

Thanks,
Alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Michal Simek
2013/2/7 Alexey Brodkin :
> On 02/07/2013 01:35 AM, Benjamin Herrenschmidt wrote:
>>
>> On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:
>>>
>>>
>>> Huh? That makes no sense. This device out in the wild with both big
>>> and little endian bus attachments. You can argue all day that one of
>>> them is wrong, but it doesn't matter. It exists, is used, and must be
>>> supported.
>>
>>
>> No. That's where you are VERY wrong. We don't have to support crap and
>> arguably shouldn't if that can give an incentive to vendors to fix their
>> stuff. If you don't believe me, ask Linus :-)
>>
>>>   In fact, the driver already knows about this and figures
>>> out at runtime how the device is wired up to the bus. This is not the
>>> problem.
>>
>>
>> Except that this is very gross, especially when you observe that in the
>> busted "big endian" case, it has to byteswap the bloody data port.
>>
>> So you end up having to do that gross hack with separate accessors for
>> registers vs. data and not able to use the _rep variants, which also
>> means that on platforms like ppc, you end up with a memory barrier on
>> every access (or more), which is going to slow things down enormously.
>
>
> BTW I've just realized that in case if there's no bridge between CPU and
> CF-controller or if this bridge is "transparent" (does no swapping
> neither bytes nor bits) our data accessors here should be changed.
>
> Isn't it strange in "ace_datain_le16" use "ioread16be" or the one it was
> here initially "in_be16"?
> With BE ones I'd say similar changes should be done.
>
> So finally I see them implemented this way:
> ===
> /* BE part */
>
> static void ace_datain_be16(struct ace_device *ace)
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *dst = ace->data_ptr;
> while (i--)
> *dst++ = ioread16be(ace->baseaddr + 0x40);
> ace->data_ptr = dst;
> }
>
> static void ace_dataout_be16(struct ace_device *ace)
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *src = ace->data_ptr;
> while (i--)
> iowrite16be(*src++, ace->baseaddr + 0x40);
> ace->data_ptr = src;
> }
>
> /* LE part*/
>
> static void ace_datain_le16(struct ace_device *ace)
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *dst = ace->data_ptr;
> while (i--)
> *dst++ = ioread16(ace->baseaddr + 0x40);
> ace->data_ptr = dst;
> }
>
> static void ace_dataout_le16(struct ace_device *ace)
> {
> int i = ACE_FIFO_SIZE / 2;
> u16 *src = ace->data_ptr;
> while (i--)
> iowrite16(*src++, ace->baseaddr + 0x40);
> ace->data_ptr = src;
> }
> ===
>
> Correct me if I'm wrong here.
>
> And at least these accessors for LE got xsysace perfectly working on our
> FPGA platform (little-endian ARC700 on Xilinx ml-509 with our own
> BVCI-to-MPU bridge that does no swapping).
>
> I have to confess that I didn't properly tested initial patch on real HW -
> it was only sort of cosmetic clean-up.

The origin patch (after some microblaze ioread/iowrite fixes) works ok,
These additional changes is breaking sysace on microblaze big endian
ml505 16bit mode.
8bit mode just works

Also I have looked at our tree and I see that the mainline kernel misses
one patch which was sent by Graeme Smecher and it was also Acked-by Grant.
It is called "Fix device name assignment for SystemACE (from "xs`" to "xsa")."
Let me resend it too.

Alexey: Can you please confirm that on your arc platform you see broken CF
partitions name?

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Michal Simek
2013/2/7 Arnd Bergmann :
> On Thursday 07 February 2013, Geert Uytterhoeven wrote:
>>
>> On Thu, Feb 7, 2013 at 9:01 AM, Michal Simek  wrote:
>> > ok. Can you please confirm with me that the same problem is also for
>> > iowrite32be
>> > ioread16be and ioread32be?
>> >
>> > This description seems to me correct for BE and LE.
>> > #define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
>> > #define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
>> >
>> > #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
>> > #define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)
>> >
>> > What do you think?
>>
>> Looks fine to me. Arnd, Ben?
>
> Yes, that would be better. Can you prepare a patch?

Done:
Subject: "asm-generic: io: Fix ioread16/32be and iowrite16/32be"

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 23:20 +1100, Benjamin Herrenschmidt wrote:

> For the "swapped" case, I would suggest using ioread16be for the registers
> for the data port, use ioread16_rep followed by a pass of byteswap. I assume
> that this incorrect wiring case only happens on BE platforms right ?

Of course that won't work on writes unless you use a bounce buffer, but
heh, who cares, it's busted anyway ?

Another option is to stick with a loop of ioread16be/iowrite16be, it
*should* work, but it will have too many memory barriers, the impact
will vary depending on the core.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 16:09 +0400, Alexey Brodkin wrote:
> 
> BTW I've just realized that in case if there's no bridge between CPU and 
> CF-controller or if this bridge is "transparent" (does no swapping
> neither bytes nor bits) our data accessors here should be changed.
> 
> Isn't it strange in "ace_datain_le16" use "ioread16be" or the one it was 
> here initially "in_be16"?
> With BE ones I'd say similar changes should be done.

This is part of the problem with those bloody attempts at dealing with
broken shit, they get the "right" case wrong :-)

So yes, if the bridge is wired up properly:

 - Register access is LE

 - Data port access is native endian (always) because what matters here
is not endianness but byte ordering (ie, which byte is 0) and if things
are wired properly, this is preserved.

So the correct things is in that case is to use ioread16_rep which will
do the right thing (and avoid the extra barriers and C loop) for the
data in/out code, and ioread16/iowrite16 for the registers.

For the "swapped" case, I would suggest using ioread16be for the registers
for the data port, use ioread16_rep followed by a pass of byteswap. I assume
that this incorrect wiring case only happens on BE platforms right ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 01:35 AM, Benjamin Herrenschmidt wrote:

On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:


Huh? That makes no sense. This device out in the wild with both big
and little endian bus attachments. You can argue all day that one of
them is wrong, but it doesn't matter. It exists, is used, and must be
supported.


No. That's where you are VERY wrong. We don't have to support crap and
arguably shouldn't if that can give an incentive to vendors to fix their
stuff. If you don't believe me, ask Linus :-)


  In fact, the driver already knows about this and figures
out at runtime how the device is wired up to the bus. This is not the
problem.


Except that this is very gross, especially when you observe that in the
busted "big endian" case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.


BTW I've just realized that in case if there's no bridge between CPU and 
CF-controller or if this bridge is "transparent" (does no swapping

neither bytes nor bits) our data accessors here should be changed.

Isn't it strange in "ace_datain_le16" use "ioread16be" or the one it was 
here initially "in_be16"?

With BE ones I'd say similar changes should be done.

So finally I see them implemented this way:
===
/* BE part */
static void ace_datain_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace->data_ptr;
while (i--)
*dst++ = ioread16be(ace->baseaddr + 0x40);
ace->data_ptr = dst;
}

static void ace_dataout_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace->data_ptr;
while (i--)
iowrite16be(*src++, ace->baseaddr + 0x40);
ace->data_ptr = src;
}

/* LE part*/
static void ace_datain_le16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace->data_ptr;
while (i--)
*dst++ = ioread16(ace->baseaddr + 0x40);
ace->data_ptr = dst;
}

static void ace_dataout_le16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace->data_ptr;
while (i--)
iowrite16(*src++, ace->baseaddr + 0x40);
ace->data_ptr = src;
}
===

Correct me if I'm wrong here.

And at least these accessors for LE got xsysace perfectly working on our 
FPGA platform (little-endian ARC700 on Xilinx ml-509 with our own 
BVCI-to-MPU bridge that does no swapping).


I have to confess that I didn't properly tested initial patch on real HW 
- it was only sort of cosmetic clean-up.


-Alexey


BTW, that document describes only one of the systemace bus
attachments. There is a different on used on Microblaze little-endian,
and some boards have the SystemACE directly wired to the SoC external
bus (no adapter IP).

The only problem that I see is that the ARM and Microblaze
ioread16be/iowrite16be helpers are missing barriers which smells like
a bug and should be fixed.

Michal, have you tested Alexey's patch? If it works for you then I'm
comfortable with acking it. It looks correct to me.


No, the real problem is that the "big endian" wiring is totally busted
and the HW guys who came with it must be taught a lesson. Not supporting
that crap might be one way to do it.

Ben.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Arnd Bergmann
On Thursday 07 February 2013, Geert Uytterhoeven wrote:
> 
> On Thu, Feb 7, 2013 at 9:01 AM, Michal Simek  wrote:
> > ok. Can you please confirm with me that the same problem is also for
> > iowrite32be
> > ioread16be and ioread32be?
> >
> > This description seems to me correct for BE and LE.
> > #define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
> > #define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
> >
> > #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
> > #define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)
> >
> > What do you think?
> 
> Looks fine to me. Arnd, Ben?

Yes, that would be better. Can you prepare a patch?

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Geert Uytterhoeven
On Thu, Feb 7, 2013 at 9:01 AM, Michal Simek  wrote:
> ok. Can you please confirm with me that the same problem is also for
> iowrite32be
> ioread16be and ioread32be?
>
> This description seems to me correct for BE and LE.
> #define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
> #define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
>
> #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
> #define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)
>
> What do you think?

Looks fine to me. Arnd, Ben?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Michal Simek

On 02/07/2013 08:38 AM, Geert Uytterhoeven wrote:

On Thu, Feb 7, 2013 at 8:23 AM, Michal Simek  wrote:

#define iowrite16be(v, addr)   iowrite16(be16_to_cpu(v), (addr))
#define iowrite16(v, addr)  writew((v), (addr))
#define writew(b,addr) __raw_writew(__cpu_to_le16(b),addr)

static inline void __raw_writew(u16 b, volatile void __iomem *addr)
{
 *(volatile u16 __force *) addr = b;
}

How is this suppose to work on Big Endian?
be16_to_cpu(v) is (v)
and
__cpu_to_le16(b) is swab16(v)


Yes.


But on native BE system ( I expect that v is in big endian)
iowrite16be(v, addr) should be just *(volatile u16 __force *) addr =
v; not *(volatile u16 __force *) addr = swab16(v);



What I would expect is
#define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)


Indeed, it should be "__cpu_to_be16(v)" instead of "be16_to_cpu(v)".


What do you mean by that?


Bummer, I missed that current iowrite16be() uses (the little endian)
iowrite16(),
not _raw_writew(), and thought the only difference between the original
and your version was the endianness conversion macro.

Yes,

 #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)

should be correct.


ok. Can you please confirm with me that the same problem is also for iowrite32be
ioread16be and ioread32be?

This description seems to me correct for BE and LE.
#define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
#define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
#define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
#define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)

What do you think?

Thanks,
Michal

--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 19:12 +0400, Alexey Brodkin wrote:
 not sure about items for ace_datain/out_be16 - what about _rep
 options here?

Well, you have a backward wiring of an LE device so you can't use the
_rep variants, unless you ping pong, so you either use a loop of
ioread/write16 (le) and bite the bullet on extra barriers, or use _rep 
bounce buffer for a separate swap pass.

Point is, the backward wiring will require byteswap on both BE and LE
hosts for data (which is why it's so stupid).

Cheers,
Ben.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 15:23 +, Grant Likely wrote:
 
 Maybe. In a separate patch. Hmmm... I guess there isn't an
 ioread16be_rep variant. 

Because it would not make sense. The ioread16_rep isn't LE ... it
should be usable for any kind if data port since such a fifo should
never require byteswap as long as the bus is wired properly (whether the
device is LE or BE).

The problem we have here is that we have an LE device that can be wired
backward. The fact that this only happens when the CPU is BE is somewhat
irrelevant (well, let's say that the confusion is easier to make for the
HW guys when using a BE CPU but fundamentally it's not relevant, a
similar backard wiring could have been done for a BE device on a LE CPU
for example). 

So in that case, you need some kind of hand made loop.

 Oh well. Check first with Michal on LE microblaze before making the
 change. If it doesn't work for him the more understanding is needed. I
 was pretty sure the LE variant already worked.
 
  not sure about items for ace_datain/out_be16 - what about _rep
 options
  here?
 
 ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
 to use the LE accessor. The existing code is definitely correct in
 this respect.

Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 16:44 +, Grant Likely wrote:
 
 I've just spent some quality time with a piece of paper, and I think
 I've figured it out...

http://linuxplumbers.ubicast.tv/videos/big-and-little-endian-inside-out/

Watch the last part on IO busses

This all has to do which which byte of the bus is the lowest byte
*address* regardless of significance. That's how the wiring should be
consistent between CPU and device, in order for a data port to work
properly.

The data port then requires no swapping (which also means it works
nicely with dumb DMA engines etc...)

Whether the registers need swapping or not depends on which half is the
MSB, which is somewhat a semantically higher level than the bus
transport, and depends on whether the device exposes them as BE or LE
registers. But data ports are just windows to a byte stream and
shouldn't be affected by endianess (again, unless you get a moron doing
the HW which seems to be still too common).

There is only one right way to connect devices to CPUs basically, which
is called byte address invariance, and preserves the order of bytes in
term of byte addresses.

(Note that for busses that also carry addresses such as PCI, this can
get tricky as you might need to have the lanes routed in a different
order for address vs. data cycles).

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 20:56 +0400, Alexey Brodkin wrote:
  So, if I'm correct that means that for the data port (specifically
  copying between RAM and the data port) using ioread16/iowrite16 on the
  data port results in the correct behaviour. It also means that LE
  support in the current driver is broken.
 
 That matches my earlier note when I wrote that for correct work on LE 
 (note I'm on ARC, not PPC/MB) I needed to use io{read|write}16 in 
 ace_data{in|out}_le16.
 
 With original io{read|write}16be instead data was corrupted.

For the backward wiring case your data port will always be the opposite
endian as your host. For the correct wiring case, the data port will always
be the same endian as your host.

So the correct wiring case (the one using ioread16/iowrite16 for registers),
the data port should just use ioread16_rep.

For the other case, you need some ifdef'ery or raw variants with home made
barriers, or a bounce buffer.

Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Arnd Bergmann
On Thursday 07 February 2013, Grant Likely wrote:
 On Thu, Feb 7, 2013 at 3:28 PM, Alexey Brodkin

 Starting with register (non-data) access. The bus bindings are such
 that on both BE and LE systems a native 16 bit read results in the
 bits being in the correct order. On powerpc, you want to do a BE read
 because the device appears as a BE device, and on LE systems a LE read
 is wanted because that is how the device is wired. So far this makes
 sense and matches the driver.

Well, until you get a little-endian PowerPC system with this
device ;). I heard people are plannng to put those into
production systems. I would assume that the device would
still use big-endian registers, so the rule about PowerPC
using the BE accessors still holds, but it stops making
sense.

 The same is true for the data port. A BE read does the right thing on
 a BE platform, and LE read on a LE platform. (again, this is all about
 bus attachment). However, the difference is what is then done with the
 data.

Well, except that we cannot use the ioread16be_rep function,
which is made for the case where the bus does not swap.

Of course, as long as the driver is only ever used to access
the same non-removable block device and you don't change
the driver, it does not matter at all whether you swap bytes
on the data port or not, because they are swapped on both
read and write, and it's just storage. Only if you try to
read the device with a correct driver, you will see a problem
if it was written with a wrong driver.

 The ironic thing is that if the BE PPC/MB hardware engineers hadn't
 swapped the bytelanes then we would *still* have to do special thinks
 in the hardware; except it would be the data port accesses that would
 need extra work, instead of the other registers.

No, we would juse use the _rep() functions if they had done it right,
just like any other driver with this problem.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Arnd Bergmann
On Thursday 07 February 2013, Michal Simek wrote:
 Subject: asm-generic: io: Fix ioread16/32be and iowrite16/32be

Ok, thanks. If you don't mind, I think this can just go with the other patches
for xsysace that come out of this discussion.

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Geert Uytterhoeven
On Fri, Feb 8, 2013 at 12:05 AM, Arnd Bergmann a...@arndb.de wrote:
 The same is true for the data port. A BE read does the right thing on
 a BE platform, and LE read on a LE platform. (again, this is all about
 bus attachment). However, the difference is what is then done with the
 data.

 Well, except that we cannot use the ioread16be_rep function,

ioread16_rep

 which is made for the case where the bus does not swap.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 11:05 PM, Arnd Bergmann a...@arndb.de wrote:
 On Thursday 07 February 2013, Grant Likely wrote:
 On Thu, Feb 7, 2013 at 3:28 PM, Alexey Brodkin
 Of course, as long as the driver is only ever used to access
 the same non-removable block device and you don't change
 the driver, it does not matter at all whether you swap bytes
 on the data port or not, because they are swapped on both
 read and write, and it's just storage. Only if you try to
 read the device with a correct driver, you will see a problem
 if it was written with a wrong driver.

It's actually a removable compact flash card interface. I know that
the current big-endian support code is correct because my laptop
doesn't complain about the data.  :-)

g.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 5:22 PM, Alexey Brodkin
alexey.brod...@synopsys.com wrote:
 On 02/07/2013 09:16 PM, Grant Likely wrote:

 On Thu, Feb 7, 2013 at 4:56 PM, Alexey Brodkin
 alexey.brod...@synopsys.com wrote:

 On 02/07/2013 08:44 PM, Grant Likely wrote:

 So, if I'm correct that means that for the data port (specifically
 copying between RAM and the data port) using ioread16/iowrite16 on the
 data port results in the correct behaviour. It also means that LE
 support in the current driver is broken.


 That matches my earlier note when I wrote that for correct work on LE
 (note
 I'm on ARC, not PPC/MB) I needed to use io{read|write}16 in
 ace_data{in|out}_le16.

 With original io{read|write}16be instead data was corrupted.


 In which case your bug-fix patch should drop the
 ace_datain_le16/ace_dataout_le16 variants entirely and use the be16
 ones for both (renaming appropriately).

 g.

 --
 Grant Likely, B.Sc., P.Eng.
 Secret Lab Technologies Ltd.


 Sorry, do you mean to replace original lines:
 ===

 static void ace_datain_be16(struct ace_device *ace)
 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace-data_ptr;
 while (i--)
 *dst++ = in_le16(ace-baseaddr + 0x40);
 ace-data_ptr = dst;

 }

 static void ace_dataout_be16(struct ace_device *ace)
 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *src = ace-data_ptr;
 while (i--)
 ioread16(*src++, ace-baseaddr + 0x40);
 ace-data_ptr = src;
 }
 ===

 with something like:
 ===
 static void ace_datain_16(struct ace_device *ace)

 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace-data_ptr;
 while (i--)
 *dst++ = in_le16(ace-baseaddr + 0x40);
 ace-data_ptr = dst;
 }

 static void ace_dataout_16(struct ace_device *ace)

 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *src = ace-data_ptr;
 while (i--)
 iowrite16(*src++, ace-baseaddr + 0x40);
 ace-data_ptr = src;
 }

Ummm, I think you finger fumbled that because the above doesn't make sense.

I think that your original patch should be applied as-is first. It is
just a mechanical replacement of the ppc accessors with ioread/iowrite
variants. Nothing controversial there.

I was suggesting to use a second patch to drop
ace_datain_le16/ace_dataout_le16 and rename
ace_datain_be16/ace_dataout_be16 to ace_datain_16/ace_dataout_16.\,
and in that same patch you can switch the read loop to use
ioread16_rep/iowrite16_rep.

*However*... I think my first analysis is actually wrong for the BE
case. The current BE code works using ioread16() for the data port
which does a swap, but ioread16_rep() doesn't swap . I hadn't gone an
actually looked at how the _rep variants are implemented. (Thanks for
your help Ben). That means ioread16_rep will work fine for LE, but
we're still stuck with the slow loop on BE.

So, craft your bug fix for the LE case to use _rep variants for
ace_{datain,dataout}_le16() but don't change the BE support yet.

g.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Michal Simek

On 02/07/2013 08:38 AM, Geert Uytterhoeven wrote:

On Thu, Feb 7, 2013 at 8:23 AM, Michal Simek mon...@monstr.eu wrote:

#define iowrite16be(v, addr)   iowrite16(be16_to_cpu(v), (addr))
#define iowrite16(v, addr)  writew((v), (addr))
#define writew(b,addr) __raw_writew(__cpu_to_le16(b),addr)

static inline void __raw_writew(u16 b, volatile void __iomem *addr)
{
 *(volatile u16 __force *) addr = b;
}

How is this suppose to work on Big Endian?
be16_to_cpu(v) is (v)
and
__cpu_to_le16(b) is swab16(v)


Yes.


But on native BE system ( I expect that v is in big endian)
iowrite16be(v, addr) should be just *(volatile u16 __force *) addr =
v; not *(volatile u16 __force *) addr = swab16(v);



What I would expect is
#define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)


Indeed, it should be __cpu_to_be16(v) instead of be16_to_cpu(v).


What do you mean by that?


Bummer, I missed that current iowrite16be() uses (the little endian)
iowrite16(),
not _raw_writew(), and thought the only difference between the original
and your version was the endianness conversion macro.

Yes,

 #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)

should be correct.


ok. Can you please confirm with me that the same problem is also for iowrite32be
ioread16be and ioread32be?

This description seems to me correct for BE and LE.
#define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
#define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
#define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
#define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)

What do you think?

Thanks,
Michal

--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Geert Uytterhoeven
On Thu, Feb 7, 2013 at 9:01 AM, Michal Simek mon...@monstr.eu wrote:
 ok. Can you please confirm with me that the same problem is also for
 iowrite32be
 ioread16be and ioread32be?

 This description seems to me correct for BE and LE.
 #define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
 #define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))

 #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
 #define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)

 What do you think?

Looks fine to me. Arnd, Ben?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Arnd Bergmann
On Thursday 07 February 2013, Geert Uytterhoeven wrote:
 
 On Thu, Feb 7, 2013 at 9:01 AM, Michal Simek mon...@monstr.eu wrote:
  ok. Can you please confirm with me that the same problem is also for
  iowrite32be
  ioread16be and ioread32be?
 
  This description seems to me correct for BE and LE.
  #define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
  #define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
 
  #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
  #define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)
 
  What do you think?
 
 Looks fine to me. Arnd, Ben?

Yes, that would be better. Can you prepare a patch?

Arnd
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 01:35 AM, Benjamin Herrenschmidt wrote:

On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:


Huh? That makes no sense. This device out in the wild with both big
and little endian bus attachments. You can argue all day that one of
them is wrong, but it doesn't matter. It exists, is used, and must be
supported.


No. That's where you are VERY wrong. We don't have to support crap and
arguably shouldn't if that can give an incentive to vendors to fix their
stuff. If you don't believe me, ask Linus :-)


  In fact, the driver already knows about this and figures
out at runtime how the device is wired up to the bus. This is not the
problem.


Except that this is very gross, especially when you observe that in the
busted big endian case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.


BTW I've just realized that in case if there's no bridge between CPU and 
CF-controller or if this bridge is transparent (does no swapping

neither bytes nor bits) our data accessors here should be changed.

Isn't it strange in ace_datain_le16 use ioread16be or the one it was 
here initially in_be16?

With BE ones I'd say similar changes should be done.

So finally I see them implemented this way:
===
/* BE part */
static void ace_datain_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace-data_ptr;
while (i--)
*dst++ = ioread16be(ace-baseaddr + 0x40);
ace-data_ptr = dst;
}

static void ace_dataout_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace-data_ptr;
while (i--)
iowrite16be(*src++, ace-baseaddr + 0x40);
ace-data_ptr = src;
}

/* LE part*/
static void ace_datain_le16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace-data_ptr;
while (i--)
*dst++ = ioread16(ace-baseaddr + 0x40);
ace-data_ptr = dst;
}

static void ace_dataout_le16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace-data_ptr;
while (i--)
iowrite16(*src++, ace-baseaddr + 0x40);
ace-data_ptr = src;
}
===

Correct me if I'm wrong here.

And at least these accessors for LE got xsysace perfectly working on our 
FPGA platform (little-endian ARC700 on Xilinx ml-509 with our own 
BVCI-to-MPU bridge that does no swapping).


I have to confess that I didn't properly tested initial patch on real HW 
- it was only sort of cosmetic clean-up.


-Alexey


BTW, that document describes only one of the systemace bus
attachments. There is a different on used on Microblaze little-endian,
and some boards have the SystemACE directly wired to the SoC external
bus (no adapter IP).

The only problem that I see is that the ARM and Microblaze
ioread16be/iowrite16be helpers are missing barriers which smells like
a bug and should be fixed.

Michal, have you tested Alexey's patch? If it works for you then I'm
comfortable with acking it. It looks correct to me.


No, the real problem is that the big endian wiring is totally busted
and the HW guys who came with it must be taught a lesson. Not supporting
that crap might be one way to do it.

Ben.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 16:09 +0400, Alexey Brodkin wrote:
 
 BTW I've just realized that in case if there's no bridge between CPU and 
 CF-controller or if this bridge is transparent (does no swapping
 neither bytes nor bits) our data accessors here should be changed.
 
 Isn't it strange in ace_datain_le16 use ioread16be or the one it was 
 here initially in_be16?
 With BE ones I'd say similar changes should be done.

This is part of the problem with those bloody attempts at dealing with
broken shit, they get the right case wrong :-)

So yes, if the bridge is wired up properly:

 - Register access is LE

 - Data port access is native endian (always) because what matters here
is not endianness but byte ordering (ie, which byte is 0) and if things
are wired properly, this is preserved.

So the correct things is in that case is to use ioread16_rep which will
do the right thing (and avoid the extra barriers and C loop) for the
data in/out code, and ioread16/iowrite16 for the registers.

For the swapped case, I would suggest using ioread16be for the registers
for the data port, use ioread16_rep followed by a pass of byteswap. I assume
that this incorrect wiring case only happens on BE platforms right ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Benjamin Herrenschmidt
On Thu, 2013-02-07 at 23:20 +1100, Benjamin Herrenschmidt wrote:

 For the swapped case, I would suggest using ioread16be for the registers
 for the data port, use ioread16_rep followed by a pass of byteswap. I assume
 that this incorrect wiring case only happens on BE platforms right ?

Of course that won't work on writes unless you use a bounce buffer, but
heh, who cares, it's busted anyway ?

Another option is to stick with a loop of ioread16be/iowrite16be, it
*should* work, but it will have too many memory barriers, the impact
will vary depending on the core.

Cheers,
Ben.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Michal Simek
2013/2/7 Arnd Bergmann a...@arndb.de:
 On Thursday 07 February 2013, Geert Uytterhoeven wrote:

 On Thu, Feb 7, 2013 at 9:01 AM, Michal Simek mon...@monstr.eu wrote:
  ok. Can you please confirm with me that the same problem is also for
  iowrite32be
  ioread16be and ioread32be?
 
  This description seems to me correct for BE and LE.
  #define ioread16be(addr)   __be16_to_cpu(__raw_readw(addr))
  #define ioread32be(addr)   __be32_to_cpu(__raw_readl(addr))
 
  #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
  #define iowrite32be(v, addr)   __raw_writel(__cpu_to_be32(v), addr)
 
  What do you think?

 Looks fine to me. Arnd, Ben?

 Yes, that would be better. Can you prepare a patch?

Done:
Subject: asm-generic: io: Fix ioread16/32be and iowrite16/32be

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Michal Simek
2013/2/7 Alexey Brodkin alexey.brod...@synopsys.com:
 On 02/07/2013 01:35 AM, Benjamin Herrenschmidt wrote:

 On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:


 Huh? That makes no sense. This device out in the wild with both big
 and little endian bus attachments. You can argue all day that one of
 them is wrong, but it doesn't matter. It exists, is used, and must be
 supported.


 No. That's where you are VERY wrong. We don't have to support crap and
 arguably shouldn't if that can give an incentive to vendors to fix their
 stuff. If you don't believe me, ask Linus :-)

   In fact, the driver already knows about this and figures
 out at runtime how the device is wired up to the bus. This is not the
 problem.


 Except that this is very gross, especially when you observe that in the
 busted big endian case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.


 BTW I've just realized that in case if there's no bridge between CPU and
 CF-controller or if this bridge is transparent (does no swapping
 neither bytes nor bits) our data accessors here should be changed.

 Isn't it strange in ace_datain_le16 use ioread16be or the one it was
 here initially in_be16?
 With BE ones I'd say similar changes should be done.

 So finally I see them implemented this way:
 ===
 /* BE part */

 static void ace_datain_be16(struct ace_device *ace)
 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace-data_ptr;
 while (i--)
 *dst++ = ioread16be(ace-baseaddr + 0x40);
 ace-data_ptr = dst;
 }

 static void ace_dataout_be16(struct ace_device *ace)
 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *src = ace-data_ptr;
 while (i--)
 iowrite16be(*src++, ace-baseaddr + 0x40);
 ace-data_ptr = src;
 }

 /* LE part*/

 static void ace_datain_le16(struct ace_device *ace)
 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *dst = ace-data_ptr;
 while (i--)
 *dst++ = ioread16(ace-baseaddr + 0x40);
 ace-data_ptr = dst;
 }

 static void ace_dataout_le16(struct ace_device *ace)
 {
 int i = ACE_FIFO_SIZE / 2;
 u16 *src = ace-data_ptr;
 while (i--)
 iowrite16(*src++, ace-baseaddr + 0x40);
 ace-data_ptr = src;
 }
 ===

 Correct me if I'm wrong here.

 And at least these accessors for LE got xsysace perfectly working on our
 FPGA platform (little-endian ARC700 on Xilinx ml-509 with our own
 BVCI-to-MPU bridge that does no swapping).

 I have to confess that I didn't properly tested initial patch on real HW -
 it was only sort of cosmetic clean-up.

The origin patch (after some microblaze ioread/iowrite fixes) works ok,
These additional changes is breaking sysace on microblaze big endian
ml505 16bit mode.
8bit mode just works

Also I have looked at our tree and I see that the mainline kernel misses
one patch which was sent by Graeme Smecher and it was also Acked-by Grant.
It is called Fix device name assignment for SystemACE (from xs` to xsa).
Let me resend it too.

Alexey: Can you please confirm that on your arc platform you see broken CF
partitions name?

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 06:31 PM, Michal Simek wrote:

The origin patch (after some microblaze ioread/iowrite fixes) works ok,
These additional changes is breaking sysace on microblaze big endian
ml505 16bit mode.
8bit mode just works

Also I have looked at our tree and I see that the mainline kernel misses
one patch which was sent by Graeme Smecher and it was also Acked-by Grant.
It is called Fix device name assignment for SystemACE (from xs` to xsa).
Let me resend it too.

Alexey: Can you please confirm that on your arc platform you see broken CF
partitions name?

Thanks,
Michal



Indeed I had it and fixed myself)
I didn't know about this fix so I've just sent another fix for it - 
https://patchwork.kernel.org/patch/2110811/


Anyways good to know now it is fixed.

Thanks,
Alexey
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:
 On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:

 Huh? That makes no sense. This device out in the wild with both big
 and little endian bus attachments. You can argue all day that one of
 them is wrong, but it doesn't matter. It exists, is used, and must be
 supported.

 No. That's where you are VERY wrong. We don't have to support crap and
 arguably shouldn't if that can give an incentive to vendors to fix their
 stuff. If you don't believe me, ask Linus :-)

As far as horrible hardware goes, this device comes no where close to
some of the stuff I've worked on. The driver is self contained. It
doesn't have any nasty hooks into the rest of the kernel and most
importantly it currently *works* and is used.

  In fact, the driver already knows about this and figures
 out at runtime how the device is wired up to the bus. This is not the
 problem.

 Except that this is very gross, especially when you observe that in the
 busted big endian case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.

I don't see why the _rep variants aren't usable here. The only reason
I didn't use them when I wrote the driver in the first place was I was
a n00b kernel hacker and I didn't know they were there.

g.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely grant.lik...@secretlab.ca wrote:
 On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:
  In fact, the driver already knows about this and figures
 out at runtime how the device is wired up to the bus. This is not the
 problem.

 Except that this is very gross, especially when you observe that in the
 busted big endian case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.

 I don't see why the _rep variants aren't usable here. The only reason
 I didn't use them when I wrote the driver in the first place was I was
 a n00b kernel hacker and I didn't know they were there.

The 8-bit variant is different though because the hardware requires
pingponging between odd and even byte addresses to flush the fifo.
Reading a data port even address (0x40) gives the least significant
byte. Reading from an odd address (0x41) give the MSB and pops the
data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
mode. It should still be fine in 16-bit.

page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 06:51 PM, Grant Likely wrote:

On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely grant.lik...@secretlab.ca wrote:

On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:

  In fact, the driver already knows about this and figures
out at runtime how the device is wired up to the bus. This is not the
problem.


Except that this is very gross, especially when you observe that in the
busted big endian case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.


I don't see why the _rep variants aren't usable here. The only reason
I didn't use them when I wrote the driver in the first place was I was
a n00b kernel hacker and I didn't know they were there.


The 8-bit variant is different though because the hardware requires
pingponging between odd and even byte addresses to flush the fifo.
Reading a data port even address (0x40) gives the least significant
byte. Reading from an odd address (0x41) give the MSB and pops the
data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
mode. It should still be fine in 16-bit.

page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf



Ok, so may I do a re-spin with these changes:
1. In ace_in_be16 use ioread16be
2. In ace_out_be16 use iowrite16be
3. In ace_in_le16 use ioread16
4. In ace_out_le16 use iowrite16
5. In ace_datain_le16 use ioread16_rep
6. In ace_dataout_le16 use iowrite16_rep

not sure about items for ace_datain/out_be16 - what about _rep options 
here?


-Alexey
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
alexey.brod...@synopsys.com wrote:
 On 02/07/2013 06:51 PM, Grant Likely wrote:

 On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely grant.lik...@secretlab.ca
 wrote:

 On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:

   In fact, the driver already knows about this and figures
 out at runtime how the device is wired up to the bus. This is not the
 problem.


 Except that this is very gross, especially when you observe that in the
 busted big endian case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.


 I don't see why the _rep variants aren't usable here. The only reason
 I didn't use them when I wrote the driver in the first place was I was
 a n00b kernel hacker and I didn't know they were there.


 The 8-bit variant is different though because the hardware requires
 pingponging between odd and even byte addresses to flush the fifo.
 Reading a data port even address (0x40) gives the least significant
 byte. Reading from an odd address (0x41) give the MSB and pops the
 data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
 mode. It should still be fine in 16-bit.

 page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf


 Ok, so may I do a re-spin with these changes:

There are two things here. 1) changing the accessors used. 2)
switching the endianess as a bug fix. Any changes to the endian access
should be a separate patch which a description of what is needed.

 1. In ace_in_be16 use ioread16be
 2. In ace_out_be16 use iowrite16be
 3. In ace_in_le16 use ioread16
 4. In ace_out_le16 use iowrite16

Yes

 5. In ace_datain_le16 use ioread16_rep
 6. In ace_dataout_le16 use iowrite16_rep

Maybe. In a separate patch. Hmmm... I guess there isn't an
ioread16be_rep variant. Oh well. Check first with Michal on LE
microblaze before making the change. If it doesn't work for him the
more understanding is needed. I was pretty sure the LE variant already
worked.

 not sure about items for ace_datain/out_be16 - what about _rep options
 here?

ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
to use the LE accessor. The existing code is definitely correct in
this respect.

g.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 07:23 PM, Grant Likely wrote:

On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
alexey.brod...@synopsys.com wrote:

On 02/07/2013 06:51 PM, Grant Likely wrote:


On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely grant.lik...@secretlab.ca
wrote:


On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:


   In fact, the driver already knows about this and figures
out at runtime how the device is wired up to the bus. This is not the
problem.



Except that this is very gross, especially when you observe that in the
busted big endian case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.



I don't see why the _rep variants aren't usable here. The only reason
I didn't use them when I wrote the driver in the first place was I was
a n00b kernel hacker and I didn't know they were there.



The 8-bit variant is different though because the hardware requires
pingponging between odd and even byte addresses to flush the fifo.
Reading a data port even address (0x40) gives the least significant
byte. Reading from an odd address (0x41) give the MSB and pops the
data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
mode. It should still be fine in 16-bit.

page 45: http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf



Ok, so may I do a re-spin with these changes:


There are two things here. 1) changing the accessors used. 2)
switching the endianess as a bug fix. Any changes to the endian access
should be a separate patch which a description of what is needed.


1. In ace_in_be16 use ioread16be
2. In ace_out_be16 use iowrite16be
3. In ace_in_le16 use ioread16
4. In ace_out_le16 use iowrite16


Yes


5. In ace_datain_le16 use ioread16_rep
6. In ace_dataout_le16 use iowrite16_rep


Maybe. In a separate patch. Hmmm... I guess there isn't an
ioread16be_rep variant. Oh well. Check first with Michal on LE
microblaze before making the change. If it doesn't work for him the
more understanding is needed. I was pretty sure the LE variant already
worked.


not sure about items for ace_datain/out_be16 - what about _rep options
here?


ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
to use the LE accessor. The existing code is definitely correct in
this respect.


Well, if ioread16_rep is used in both ace_data{in,out}_be16 and 
ace_data{in,out}_le16 then what is a difference between them?
Whether there's a subtle difference still exists and I cannot see it or 
unified accessor could be used from now on at least for data access.


What do you think?

-Alexey
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 3:28 PM, Alexey Brodkin
alexey.brod...@synopsys.com wrote:
 On 02/07/2013 07:23 PM, Grant Likely wrote:

 On Thu, Feb 7, 2013 at 3:12 PM, Alexey Brodkin
 alexey.brod...@synopsys.com wrote:

 On 02/07/2013 06:51 PM, Grant Likely wrote:


 On Thu, Feb 7, 2013 at 2:39 PM, Grant Likely grant.lik...@secretlab.ca
 wrote:


 On Wed, Feb 6, 2013 at 9:35 PM, Benjamin Herrenschmidt
 b...@kernel.crashing.org wrote:


In fact, the driver already knows about this and figures
 out at runtime how the device is wired up to the bus. This is not the
 problem.



 Except that this is very gross, especially when you observe that in
 the
 busted big endian case, it has to byteswap the bloody data port.

 So you end up having to do that gross hack with separate accessors for
 registers vs. data and not able to use the _rep variants, which also
 means that on platforms like ppc, you end up with a memory barrier on
 every access (or more), which is going to slow things down enormously.



 I don't see why the _rep variants aren't usable here. The only reason
 I didn't use them when I wrote the driver in the first place was I was
 a n00b kernel hacker and I didn't know they were there.



 The 8-bit variant is different though because the hardware requires
 pingponging between odd and even byte addresses to flush the fifo.
 Reading a data port even address (0x40) gives the least significant
 byte. Reading from an odd address (0x41) give the MSB and pops the
 data off the FIFO. So, yes, the _rep variant can't be used in 8-bit
 mode. It should still be fine in 16-bit.

 page 45:
 http://www.xilinx.com/support/documentation/data_sheets/ds080.pdf


 Ok, so may I do a re-spin with these changes:


 There are two things here. 1) changing the accessors used. 2)
 switching the endianess as a bug fix. Any changes to the endian access
 should be a separate patch which a description of what is needed.

 1. In ace_in_be16 use ioread16be
 2. In ace_out_be16 use iowrite16be
 3. In ace_in_le16 use ioread16
 4. In ace_out_le16 use iowrite16


 Yes

 5. In ace_datain_le16 use ioread16_rep
 6. In ace_dataout_le16 use iowrite16_rep


 Maybe. In a separate patch. Hmmm... I guess there isn't an
 ioread16be_rep variant. Oh well. Check first with Michal on LE
 microblaze before making the change. If it doesn't work for him the
 more understanding is needed. I was pretty sure the LE variant already
 worked.

 not sure about items for ace_datain/out_be16 - what about _rep options
 here?


 ioread16_rep should be fine. The ace_data{in,out}_be16 routines need
 to use the LE accessor. The existing code is definitely correct in
 this respect.


 Well, if ioread16_rep is used in both ace_data{in,out}_be16 and
 ace_data{in,out}_le16 then what is a difference between them?
 Whether there's a subtle difference still exists and I cannot see it or
 unified accessor could be used from now on at least for data access.

I've just spent some quality time with a piece of paper, and I think
I've figured it out...

Starting with register (non-data) access. The bus bindings are such
that on both BE and LE systems a native 16 bit read results in the
bits being in the correct order. On powerpc, you want to do a BE read
because the device appears as a BE device, and on LE systems a LE read
is wanted because that is how the device is wired. So far this makes
sense and matches the driver.

The same is true for the data port. A BE read does the right thing on
a BE platform, and LE read on a LE platform. (again, this is all about
bus attachment). However, the difference is what is then done with the
data.

For register reads, the driver uses the value directly. Either by
writing a 16 bit value to a register or reading and interpreting a 16
bit value. In both case the drive is directly interested in the 16bit
word

However, for the data port, the data is streamed out of the FIFO and
stored in memory for the block subsystem to use. A read from the data
port (assuming 'native' accesses are used as described above) obtains
16bits of data from the disk. Disk offsets are byte oriented, but
since it they are 16 bit reads, two bytes are read at a time:
First read:  offset 0 in LSB, offset 1 MSB
Second read: offset 2 in LSB, offset 3 in MSB
Third read: offset 4 in LSB, offset 5 in MSB

etc.

Then that data needs to be stored into memory. This is where things
are different with native 16 bit stores:
  On a BE memory system. MSB in byte address 0 and LSB in byte address 1.
  On a LE memory system. MSB in byte address 1 and LSB in byte address 0.

The block subsystem deals with byte oriented buffers; so given the way
data is arranged in the data port we want to always make sure the LSB
goes into address 0. The cause of problems isn't the BE vs LE bus
attachment. It is the different memory system orientation. Using
ioread16/iowrite16 has the right behaviour, but it's kind of a
backwards way to go about it It isn't that we want a be16_to_cpu()
or le16_to_cpu() 

Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 08:44 PM, Grant Likely wrote:


Then that data needs to be stored into memory. This is where things
are different with native 16 bit stores:
   On a BE memory system. MSB in byte address 0 and LSB in byte address 1.
   On a LE memory system. MSB in byte address 1 and LSB in byte address 0.

The block subsystem deals with byte oriented buffers; so given the way
data is arranged in the data port we want to always make sure the LSB
goes into address 0. The cause of problems isn't the BE vs LE bus
attachment. It is the different memory system orientation. Using
ioread16/iowrite16 has the right behaviour, but it's kind of a
backwards way to go about it It isn't that we want a be16_to_cpu()
or le16_to_cpu() on the data port read, but rather a cpu_to_le16() on
the store to memory regardless of the endianess of the platform.

So, if I'm correct that means that for the data port (specifically
copying between RAM and the data port) using ioread16/iowrite16 on the
data port results in the correct behaviour. It also means that LE
support in the current driver is broken.


That matches my earlier note when I wrote that for correct work on LE 
(note I'm on ARC, not PPC/MB) I needed to use io{read|write}16 in 
ace_data{in|out}_le16.


With original io{read|write}16be instead data was corrupted.

-Alexey
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Grant Likely
On Thu, Feb 7, 2013 at 4:56 PM, Alexey Brodkin
alexey.brod...@synopsys.com wrote:
 On 02/07/2013 08:44 PM, Grant Likely wrote:
 So, if I'm correct that means that for the data port (specifically
 copying between RAM and the data port) using ioread16/iowrite16 on the
 data port results in the correct behaviour. It also means that LE
 support in the current driver is broken.

 That matches my earlier note when I wrote that for correct work on LE (note
 I'm on ARC, not PPC/MB) I needed to use io{read|write}16 in
 ace_data{in|out}_le16.

 With original io{read|write}16be instead data was corrupted.

In which case your bug-fix patch should drop the
ace_datain_le16/ace_dataout_le16 variants entirely and use the be16
ones for both (renaming appropriately).

g.

--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-07 Thread Alexey Brodkin

On 02/07/2013 09:16 PM, Grant Likely wrote:

On Thu, Feb 7, 2013 at 4:56 PM, Alexey Brodkin
alexey.brod...@synopsys.com wrote:

On 02/07/2013 08:44 PM, Grant Likely wrote:

So, if I'm correct that means that for the data port (specifically
copying between RAM and the data port) using ioread16/iowrite16 on the
data port results in the correct behaviour. It also means that LE
support in the current driver is broken.


That matches my earlier note when I wrote that for correct work on LE (note
I'm on ARC, not PPC/MB) I needed to use io{read|write}16 in
ace_data{in|out}_le16.

With original io{read|write}16be instead data was corrupted.


In which case your bug-fix patch should drop the
ace_datain_le16/ace_dataout_le16 variants entirely and use the be16
ones for both (renaming appropriately).

g.

--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.



Sorry, do you mean to replace original lines:
===
static void ace_datain_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace-data_ptr;
while (i--)
*dst++ = in_le16(ace-baseaddr + 0x40);
ace-data_ptr = dst;
}

static void ace_dataout_be16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace-data_ptr;
while (i--)
ioread16(*src++, ace-baseaddr + 0x40);
ace-data_ptr = src;
}
===

with something like:
===
static void ace_datain_16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *dst = ace-data_ptr;
while (i--)
*dst++ = in_le16(ace-baseaddr + 0x40);
ace-data_ptr = dst;
}

static void ace_dataout_16(struct ace_device *ace)
{
int i = ACE_FIFO_SIZE / 2;
u16 *src = ace-data_ptr;
while (i--)
iowrite16(*src++, ace-baseaddr + 0x40);
ace-data_ptr = src;
}
===

and then the next patch should replace io{read|write}16 with 
io{read|write}16_rep I guess, correct?


-Alexey
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Geert Uytterhoeven
On Thu, Feb 7, 2013 at 8:23 AM, Michal Simek  wrote:
>>> #define iowrite16be(v, addr)   iowrite16(be16_to_cpu(v), (addr))
>>> #define iowrite16(v, addr)  writew((v), (addr))
>>> #define writew(b,addr) __raw_writew(__cpu_to_le16(b),addr)
>>>
>>> static inline void __raw_writew(u16 b, volatile void __iomem *addr)
>>> {
>>> *(volatile u16 __force *) addr = b;
>>> }
>>>
>>> How is this suppose to work on Big Endian?
>>> be16_to_cpu(v) is (v)
>>> and
>>> __cpu_to_le16(b) is swab16(v)
>>
>> Yes.
>
> But on native BE system ( I expect that v is in big endian)
> iowrite16be(v, addr) should be just *(volatile u16 __force *) addr =
> v; not *(volatile u16 __force *) addr = swab16(v);

>>> What I would expect is
>>> #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
>>
>> Indeed, it should be "__cpu_to_be16(v)" instead of "be16_to_cpu(v)".
>
> What do you mean by that?

Bummer, I missed that current iowrite16be() uses (the little endian)
iowrite16(),
not _raw_writew(), and thought the only difference between the original
and your version was the endianness conversion macro.

Yes,

#define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)

should be correct.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Michal Simek
2013/2/6 Geert Uytterhoeven :
> On Wed, Feb 6, 2013 at 6:40 PM, Michal Simek  wrote:
>> One question regarding to asm-generic/io.h about iowrite16be implementation
>>
>> #define iowrite16be(v, addr)   iowrite16(be16_to_cpu(v), (addr))
>> #define iowrite16(v, addr)  writew((v), (addr))
>> #define writew(b,addr) __raw_writew(__cpu_to_le16(b),addr)
>>
>> static inline void __raw_writew(u16 b, volatile void __iomem *addr)
>> {
>> *(volatile u16 __force *) addr = b;
>> }
>>
>> How is this suppose to work on Big Endian?
>> be16_to_cpu(v) is (v)
>> and
>> __cpu_to_le16(b) is swab16(v)
>
> Yes.

But on native BE system ( I expect that v is in big endian)
iowrite16be(v, addr) should be just *(volatile u16 __force *) addr =
v; not *(volatile u16 __force *) addr = swab16(v);


>> On little
>> be16_to_cpu(v) is swab16(v)
>
> Yes.
>
>> and
>> __cpu_to_le16(swab(b)) is swab16(v)
>
> ???
>
> Don't you mean "__cpu_to_le16(b) is (b)"?

BTW: I took output value from the first line (be16_to_cpu(v) is swab16(v))

Grrr - on LE this code works. (I expect that v is in little endian)
iowrite16be(v, addr) is *(volatile u16 __force *) addr = swab16(v);


>> What I would expect is
>> #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)
>
> Indeed, it should be "__cpu_to_be16(v)" instead of "be16_to_cpu(v)".

What do you mean by that?

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Benjamin Herrenschmidt
On Wed, 2013-02-06 at 10:14 +, Grant Likely wrote:
> 
> Huh? That makes no sense. This device out in the wild with both big
> and little endian bus attachments. You can argue all day that one of
> them is wrong, but it doesn't matter. It exists, is used, and must be
> supported.

No. That's where you are VERY wrong. We don't have to support crap and
arguably shouldn't if that can give an incentive to vendors to fix their
stuff. If you don't believe me, ask Linus :-)

>  In fact, the driver already knows about this and figures
> out at runtime how the device is wired up to the bus. This is not the
> problem.

Except that this is very gross, especially when you observe that in the
busted "big endian" case, it has to byteswap the bloody data port.

So you end up having to do that gross hack with separate accessors for
registers vs. data and not able to use the _rep variants, which also
means that on platforms like ppc, you end up with a memory barrier on
every access (or more), which is going to slow things down enormously.

> BTW, that document describes only one of the systemace bus
> attachments. There is a different on used on Microblaze little-endian,
> and some boards have the SystemACE directly wired to the SoC external
> bus (no adapter IP).
> 
> The only problem that I see is that the ARM and Microblaze
> ioread16be/iowrite16be helpers are missing barriers which smells like
> a bug and should be fixed.
> 
> Michal, have you tested Alexey's patch? If it works for you then I'm
> comfortable with acking it. It looks correct to me.

No, the real problem is that the "big endian" wiring is totally busted
and the HW guys who came with it must be taught a lesson. Not supporting
that crap might be one way to do it.

Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Geert Uytterhoeven
On Wed, Feb 6, 2013 at 6:40 PM, Michal Simek  wrote:
> One question regarding to asm-generic/io.h about iowrite16be implementation
>
> #define iowrite16be(v, addr)   iowrite16(be16_to_cpu(v), (addr))
> #define iowrite16(v, addr)  writew((v), (addr))
> #define writew(b,addr) __raw_writew(__cpu_to_le16(b),addr)
>
> static inline void __raw_writew(u16 b, volatile void __iomem *addr)
> {
> *(volatile u16 __force *) addr = b;
> }
>
> How is this suppose to work on Big Endian?
> be16_to_cpu(v) is (v)
> and
> __cpu_to_le16(b) is swab16(v)

Yes.

> On little
> be16_to_cpu(v) is swab16(v)

Yes.

> and
> __cpu_to_le16(swab(b)) is swab16(v)

???

Don't you mean "__cpu_to_le16(b) is (b)"?

> What I would expect is
> #define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)

Indeed, it should be "__cpu_to_be16(v)" instead of "be16_to_cpu(v)".

Same for iowrite32be().

> on Big endian:
> __cpu_to_be16(v) is (v)
> on Little endian:
> __cpu_to_be16(v) is swab(v)
>
> What am I missing here?

But as both conversions are identical (just swapping 2 bytes), It Just Works.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Michal Simek

On 02/07/2013 01:34 AM, Arnd Bergmann wrote:

On Wednesday 06 February 2013 17:21:37 Michal Simek wrote:


I have looked at the patches from more practical side and I have tested it on
microblaze big endian in 16bit mode and I have found that sysace driver
stop to work.
After that I have looked at ioread/iowrite microblaze implementation
and implementation of that functions is wrong.
I have fixed it but looking at using asm-generic/io.h for microblaze.

I will do more tests and let you know.


Well, I think they are only wrong in the way that they ignore
endianess. You can fix that by changing them to be identical
to the in_le/in_be families.


But still it is wrong.



However, I would also recommend changing your __raw_* accessors
to inline assembly functions rather than pointer dereferences,
because we have had problems in the past where gcc (when faced
with undefined C) silently turned 32-bit accesses into multiples
of byte accesses, which can be fatal for MMIO. The asm-generic
version obviously cannot get this right.

The PCI I/O space handling, as mentioned, is completely broken
on microblaze, and you can either use the approach from asm-generic
when you set PCI_IOBASE match your isa_io_base.


One question regarding to asm-generic/io.h about iowrite16be implementation

#define iowrite16be(v, addr)   iowrite16(be16_to_cpu(v), (addr))
#define iowrite16(v, addr)  writew((v), (addr))
#define writew(b,addr) __raw_writew(__cpu_to_le16(b),addr)

static inline void __raw_writew(u16 b, volatile void __iomem *addr)
{
*(volatile u16 __force *) addr = b;
}

How is this suppose to work on Big Endian?
be16_to_cpu(v) is (v)
and
__cpu_to_le16(b) is swab16(v)

On little
be16_to_cpu(v) is swab16(v)
and
__cpu_to_le16(swab(b)) is swab16(v)


What I would expect is
#define iowrite16be(v, addr)   __raw_writew(__cpu_to_be16(v), addr)

on Big endian:
__cpu_to_be16(v) is (v)
on Little endian:
__cpu_to_be16(v) is swab(v)

What am I missing here?

Thanks,
Michal


--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Arnd Bergmann
On Wednesday 06 February 2013 17:21:37 Michal Simek wrote:

> I have looked at the patches from more practical side and I have tested it on
> microblaze big endian in 16bit mode and I have found that sysace driver
> stop to work.
> After that I have looked at ioread/iowrite microblaze implementation
> and implementation of that functions is wrong.
> I have fixed it but looking at using asm-generic/io.h for microblaze.
> 
> I will do more tests and let you know.

Well, I think they are only wrong in the way that they ignore
endianess. You can fix that by changing them to be identical
to the in_le/in_be families.

However, I would also recommend changing your __raw_* accessors
to inline assembly functions rather than pointer dereferences,
because we have had problems in the past where gcc (when faced
with undefined C) silently turned 32-bit accesses into multiples
of byte accesses, which can be fatal for MMIO. The asm-generic
version obviously cannot get this right.

The PCI I/O space handling, as mentioned, is completely broken
on microblaze, and you can either use the approach from asm-generic
when you set PCI_IOBASE match your isa_io_base.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Arnd Bergmann
On Wednesday 06 February 2013, Grant Likely wrote:
> The only problem that I see is that the ARM and Microblaze
> ioread16be/iowrite16be helpers are missing barriers which smells like
> a bug and should be fixed.

It looks correct to me on ARM, we use the same barriers in ioread and
iowrite that we use in readl/writel.

Microblaze uses no barriers in any of the I/O accessor families
(readl, ioread32, in_le32, inl, inl_p), so it presumably doesn't
require any because it is fully ordered, at least changing from
in_le32 to ioread32 won't make it worse.
On a related note, the inb/outb style accessors are bogus on microblaze
and should be removed, but that is a different discussion. I should
have updated my old patches to introduce a meaningful CONFIG_NO_IOPORT
ages ago, and without them it probably breaks.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Michal Simek
2013/2/6 Grant Likely :
> On Tue, Feb 5, 2013 at 3:12 PM, Arnd Bergmann  wrote:
>> On Tuesday 05 February 2013 18:03:31 Alexey Brodkin wrote:
>>> The Xilinx System ACE Compact Flash chip is a true little-endian device
>>> and the PLB is a big-endian bus. Therefore the XPS System ACE Interface
>>> Controller will do a bit-swap in each byte when connecting the PLB data
>>> bus to the System ACE data bus as shown in Table 2.
>>>
>>> Note however, that the XPS System ACE Interface Controller does not
>>> perform the byte swapping necessary to interface to a little-endian
>>> device when configured to use 16-bit mode. Therefore, the software
>>> drivers provided for this core will perform the necessary byte-swapping
>>> to correctly interface to the Xilinx System ACE Compact Flash chip as
>>> shown in Table 3.
>>
>> Ok. In this case, I would recommend making the default for this driver
>> little-endian, and adding a quirk for broken hardware bridges like the
>> one you cited to have a mixed-endian mode if configured so at compile
>> time.
>>
>> It seems that on all normal platforms, this device should behave as
>> little-endian, while the Xilinx bridge can be either big-endian
>> or little-endian, depending on whether it is used in 8-bit or 16-bit
>> mode, so if we are using this, it cannot be known at compile time.
>
> The driver already handles this. It has three sets of accessors;
> 8-bit, 16-bit LE and 16-bit BE *and* when doing 16-bit it figures out
> on its own which set to use at runtime. There is nothing controversial
> here. The only problem is that the driver is currently using in_/out_
> IO accessors instead of ioread/iowrite variants.

I have looked at the patches from more practical side and I have tested it on
microblaze big endian in 16bit mode and I have found that sysace driver
stop to work.
After that I have looked at ioread/iowrite microblaze implementation
and implementation of that functions is wrong.
I have fixed it but looking at using asm-generic/io.h for microblaze.

I will do more tests and let you know.

Thanks,
Michal

-- 
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Arnd Bergmann
On Wednesday 06 February 2013, Grant Likely wrote:
> The driver already handles this. It has three sets of accessors;
> 8-bit, 16-bit LE and 16-bit BE and when doing 16-bit it figures out
> on its own which set to use at runtime. There is nothing controversial
> here. The only problem is that the driver is currently using in_/out_
> IO accessors instead of ioread/iowrite variants.

Ah right. It certainly helps to look at the source code.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] drivers/block/xsysace - replace in(out)_8/in(out)_be16/in(out)_le16 with generic iowrite(read)8/16(be)

2013-02-06 Thread Arnd Bergmann
On Wednesday 06 February 2013, Geert Uytterhoeven wrote:
> On Wed, Feb 6, 2013 at 12:03 AM, Arnd Bergmann  wrote:
> > On Tuesday 05 February 2013, Michal Simek wrote:
> >> I want to be sure about this. I have parsed this again with closer look and
> >> seems to me that ioread32 is equal to readl and iowrite32 to writel.
> >> Arnd: Am I right?
> >
> > Correct. On all the architectures you care about (most importantly, not
> > x86), readl and ioread32 are defined to have the same semantics. There
> > are a few exceptions where ioread32 provides a wrapper for PCI
> > PIO accesses that are not memory mapped, making ioread32 slightly
> > slower than readl, but for all practical purposes you don't have
> > to worry about it ;-)
> 
> PCI PIO? Do you mean "PCI I/O space"?

Yes.

> That's different, and accessed through inb() and friends, right?

Sort of. The native methods are readl() etc for MMIO and inb() etc. for PIO,
as you say. However, the ioread32() family does the superset of the two,
either using a wrapper function with a mangled port number / pointer
argument (e.g. on x86), or it gets defined to do the same as readl on
platforms where the PCI I/O space is memory mapped. The requirement
here is that whatever the result of ioport_map() is must be turned
back into a PIO access when passed into ioread32(). On memory mapped
platforms, there is typically just an offset that gets added to the
port number to turn that into an __iomem pointer.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   >