[RFC] SystemACE driver - abstract register ops

2007-04-27 Thread John Williams

Grant,

Thanks for your work on the SystemACE driver - I'll be porting/merging 
this across to MicroBlaze very shortly.


Given that SysACE can be hooked up in any number of ways, bit widths, 
endians, PPC/Microblaze, can you please offer your comments on the 
attached patch?


It introduces a private ace_reg_ops structure, with various member 
functions for the different kinds of accesses to the HW.


This patch should not change the functionality of your original driver 
at all, it's just groundwork for what's to come.


I recognise that it adds indirection into the various access paths, and 
potentially a little bloat.  Whether this is better than #if 
1...#else...#endif is debatable.
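
As a rough illustration of the trade-off, here is a minimal userspace sketch of the ops-table pattern. It is not the patch itself: plain byte arrays stand in for the in_le16/out_be16 style accessors, and the names le16_ops, be16_ops and struct ace_device are assumptions made for the example. The point is that the access flavour becomes a per-device pointer chosen at run time rather than an #if chosen at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the patch's struct reg_ops. */
struct reg_ops {
	uint16_t (*read16)(const uint8_t *addr);
	void (*write16)(uint8_t *addr, uint16_t val);
};

/* Little-endian 16-bit accessors: low byte at the lower address. */
static uint16_t le16_read16(const uint8_t *addr)
{
	return (uint16_t)(addr[0] | (addr[1] << 8));
}

static void le16_write16(uint8_t *addr, uint16_t val)
{
	addr[0] = (uint8_t)(val & 0xff);
	addr[1] = (uint8_t)(val >> 8);
}

/* Big-endian 16-bit accessors: high byte at the lower address. */
static uint16_t be16_read16(const uint8_t *addr)
{
	return (uint16_t)((addr[0] << 8) | addr[1]);
}

static void be16_write16(uint8_t *addr, uint16_t val)
{
	addr[0] = (uint8_t)(val >> 8);
	addr[1] = (uint8_t)(val & 0xff);
}

static const struct reg_ops le16_ops = {
	.read16 = le16_read16,
	.write16 = le16_write16,
};

static const struct reg_ops be16_ops = {
	.read16 = be16_read16,
	.write16 = be16_write16,
};

/* Hypothetical device: a register window plus the ops that match its wiring. */
struct ace_device {
	uint8_t *baseaddr;
	const struct reg_ops *ops;
};

/* One indirect call per access; this is the indirection cost discussed above. */
static uint16_t ace_reg_read16(const struct ace_device *ace, size_t reg)
{
	assert(ace && ace->ops);
	return ace->ops->read16(ace->baseaddr + reg);
}

static void ace_reg_write16(struct ace_device *ace, size_t reg, uint16_t val)
{
	assert(ace && ace->ops);
	ace->ops->write16(ace->baseaddr + reg, val);
}
```

Switching a device from one wiring to the other is then a matter of pointing ace->ops at a different table, at the price of an indirect call on every register access.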


Similar issues will arise for most (all?) of the Xilinx drivers that we 
will share between PPC and MicroBlaze.  Hopefully we can converge on a 
nice consistent and clean way of handling these dual arch drivers.


Cheers,

John
Index: linux-2.6.x-petalogix/drivers/block/xsysace.c
===
--- linux-2.6.x-petalogix/drivers/block/xsysace.c   (revision 2628)
+++ linux-2.6.x-petalogix/drivers/block/xsysace.c   (working copy)
@@ -73,6 +73,11 @@
  *interrupt, then the kernel timer will expire and the driver can
  *continue where it left off.
  *
+ *  SystemACE can be wired up in different endian orders and data widths.
+ *  It works on both PPC and MicroBlaze architectures.  For this reason,
+ *  an ace_reg_ops structure is used that abstracts away low level
+ *  endian/width/arch access to the HW registers.
+
  * To Do:
  *- Add FPGA configuration control interface.
  *- Request major number from lanana
@@ -161,32 +166,120 @@
 /* -
  * Low level register access
  */
+struct reg_ops {
+   u8 (*read8)(void *addr);
+   u16 (*read16)(void *addr);
+   u32 (*read32)(void *addr);
+   u32 (*readdata)(void *addr);
 
-/* register access macros */
+   void (*write16)(void *addr, u16 val);
+   void (*write32)(void *addr, u32 val);
+   void (*writedata)(void *addr, u32 val);
+};
+
+#define ace_reg_read8(ace, reg) (ace->ops->read8(ace->baseaddr + reg))
+#define ace_reg_read16(ace, reg) (ace->ops->read16(ace->baseaddr + reg))
+#define ace_reg_readdata(ace, reg) (ace->ops->readdata(ace->baseaddr + reg))
+#define ace_reg_read32(ace, reg) (ace->ops->read32(ace->baseaddr+reg))
+#define ace_reg_write16(ace, reg, val) (ace->ops->write16(ace->baseaddr+reg, val))
+#define ace_reg_writedata(ace, reg, val) (ace->ops->writedata(ace->baseaddr + reg, val))
+#define ace_reg_write32(ace, reg, val) (ace->ops->write32(ace->baseaddr+reg, val))
+
+/* register access functions */
 #if 1 /* Little endian 16-bit regs */
-#define ace_reg_read8(ace, reg) in_8(ace->baseaddr + reg)
-#define ace_reg_read16(ace, reg) in_le16(ace->baseaddr + reg)
-#define ace_reg_readdata(ace, reg) in_be16(ace->baseaddr + reg)
-#define ace_reg_read32(ace, reg) ((in_le16(ace->baseaddr + reg+2) << 16) | \
-  (in_le16(ace->baseaddr + reg)))
-#define ace_reg_write16(ace, reg, val) out_le16(ace->baseaddr + reg, val)
-#define ace_reg_writedata(ace, reg, val) out_be16(ace->baseaddr + reg, val)
-#define ace_reg_write32(ace, reg, val) { \
-   out_le16(ace->baseaddr + reg+2, (val) >> 16); \
-   out_le16(ace->baseaddr + reg, val); \
-   }
+static u8 ace_le16_read8(void *addr)
+{
+   return in_8(addr);
+}
+
+static u16 ace_le16_read16(void *addr)
+{
+   return in_le16(addr);
+}
+
+static u32 ace_le16_read32(void *addr)
+{
+   return ((in_le16(addr+2) << 16) | (in_le16(addr)));
+}
+
+static u32 ace_le16_readdata(void *addr)
+{
+   return in_be16(addr);
+}
+
+static void ace_le16_write16(void *addr, u16 val)
+{
+   out_le16(addr, val);
+}
+
+static void ace_le16_write32(void *addr, u32 val)
+{
+   out_le16(addr+2,(val) >> 16);
+   out_le16(addr, val);
+}
+
+static void ace_le16_writedata(void *addr, u32 val)
+{
+   out_be16(addr, val);
+}
+
+static struct reg_ops ace_ops = {
+   .read8  =   ace_le16_read8,
+   .read16 =   ace_le16_read16,
+   .read32 =   ace_le16_read32,
+   .readdata = ace_le16_readdata,
+   .write16 =  ace_le16_write16,
+   .write32 =  ace_le16_write32,
+   .writedata =ace_le16_writedata
+} ;
+
 #else /* Big endian 16-bit regs */
-#define ace_reg_read8(ace, reg) in_8(ace->baseaddr + reg)
-#define ace_reg_read16(ace, reg) in_be16(ace->baseaddr + reg)
-#define ace_reg_readdata(ace, reg) in_le16(ace->baseaddr + reg)
-#define ace_reg_read32(ace, reg) ((in_be16(ace->baseaddr + reg+2) << 16) | \
-  (in_be16(ace->baseaddr + reg)))
-#define ace_reg_write16(ace, reg, val) out_be16(ace->baseaddr + reg, val)
-#define ace_reg_writedata(ace, reg, val) out_le16(ace->baseaddr + reg, val)
-#define ac

Re: [RFC] SystemACE driver - abstract register ops

2007-04-30 Thread John Williams
Grant,

Grant Likely wrote:
> On 4/27/07, John Williams <[EMAIL PROTECTED]> wrote:
> 
>> Thanks for your work on the SystemACE driver - I'll be porting/merging
>> this across to MicroBlaze very shortly.
> 
> Very cool; I hope it works well.

Indeed it does - your latest patchset version of the systemACE driver 
"just works" on MicroBlaze 2.6.20.

I have only tested the 16-bit buswidth (standard ML401 reference design).

> For your reading pleasure, I've attached the bus attachment changes
> that I've made in my tree.  I hope to get this driver accepted into
> mainline during the 2.6.22 merge window; so please get any comments
> you have back to me ASAP.

Acked-by: John Williams <[EMAIL PROTECTED]>

John
___
Linuxppc-embedded mailing list
Linuxppc-embedded@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-embedded


[RFC] uartlite driver MicroBlaze compatability

2007-04-30 Thread John Williams

Hi Peter,

The attached patch gets your uartlite driver going on MicroBlaze.

All readb/writeb ops are converted to ioread32/iowrite32.

On MicroBlaze readb/writeb are picking up the MSB, instead of LSB, and 
thus reading all zeros instead of the 8-bit control/status/FIFO 
registers that you intended.
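
The byte-lane effect can be sketched in plain C, with a byte array standing in for one 32-bit big-endian register. store_be32 and read_byte are hypothetical helpers for this illustration, not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian byte order, MSB at the lowest address. */
static void store_be32(uint8_t *mem, uint32_t val)
{
	mem[0] = (uint8_t)(val >> 24);
	mem[1] = (uint8_t)(val >> 16);
	mem[2] = (uint8_t)(val >> 8);
	mem[3] = (uint8_t)val;
}

/* A byte-wide read at a byte offset from the register address, like readb(). */
static uint8_t read_byte(const uint8_t *reg, unsigned int offset)
{
	assert(offset < 4);
	return reg[offset];
}
```

With the register holding 0x00000041, a byte read at offset 0 returns the (zero) most significant byte, and the data only shows up at offset 3.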


Can you please confirm if this works on PPC?

I note that Grant's recent bootloader driver uses in_be32/out_be32 - 
would you prefer that instead of ioread32/iowrite32?


Thanks,

John
Convert readb/writeb ops into ioread32/iowrite32.

This gets the driver working with MicroBlaze (2.6.20).

Signed-off-by: John Williams <[EMAIL PROTECTED]>

Index: linux-2.6.x/drivers/serial/uartlite.c
===
--- linux-2.6.x/drivers/serial/uartlite.c   (revision 2561)
+++ linux-2.6.x/drivers/serial/uartlite.c   (working copy)
@@ -61,7 +61,7 @@
/* stats */
if (stat & ULITE_STATUS_RXVALID) {
port->icount.rx++;
-   ch = readb(port->membase + ULITE_RX);
+   ch = ioread32(port->membase + ULITE_RX);
 
if (stat & ULITE_STATUS_PARITY)
port->icount.parity++;
@@ -106,7 +106,7 @@
return 0;
 
if (port->x_char) {
-   writeb(port->x_char, port->membase + ULITE_TX);
+   iowrite32(port->x_char, port->membase + ULITE_TX);
port->x_char = 0;
port->icount.tx++;
return 1;
@@ -115,7 +115,7 @@
if (uart_circ_empty(xmit) || uart_tx_stopped(port))
return 0;
 
-   writeb(xmit->buf[xmit->tail], port->membase + ULITE_TX);
+   iowrite32(xmit->buf[xmit->tail], port->membase + ULITE_TX);
xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE-1);
port->icount.tx++;
 
@@ -132,7 +132,7 @@
int busy;
 
do {
-   int stat = readb(port->membase + ULITE_STATUS);
+   int stat = ioread32(port->membase + ULITE_STATUS);
busy  = ulite_receive(port, stat);
busy |= ulite_transmit(port, stat);
} while (busy);
@@ -148,7 +148,7 @@
unsigned int ret;
 
spin_lock_irqsave(&port->lock, flags);
-   ret = readb(port->membase + ULITE_STATUS);
+   ret = ioread32(port->membase + ULITE_STATUS);
spin_unlock_irqrestore(&port->lock, flags);
 
return ret & ULITE_STATUS_TXEMPTY ? TIOCSER_TEMT : 0;
@@ -171,7 +171,7 @@
 
 static void ulite_start_tx(struct uart_port *port)
 {
-   ulite_transmit(port, readb(port->membase + ULITE_STATUS));
+   ulite_transmit(port, ioread32(port->membase + ULITE_STATUS));
 }
 
 static void ulite_stop_rx(struct uart_port *port)
@@ -200,17 +200,17 @@
if (ret)
return ret;
 
-   writeb(ULITE_CONTROL_RST_RX | ULITE_CONTROL_RST_TX,
+   iowrite32(ULITE_CONTROL_RST_RX | ULITE_CONTROL_RST_TX,
   port->membase + ULITE_CONTROL);
-   writeb(ULITE_CONTROL_IE, port->membase + ULITE_CONTROL);
+   iowrite32(ULITE_CONTROL_IE, port->membase + ULITE_CONTROL);
 
return 0;
 }
 
 static void ulite_shutdown(struct uart_port *port)
 {
-   writeb(0, port->membase + ULITE_CONTROL);
-   readb(port->membase + ULITE_CONTROL); /* dummy */
+   iowrite32(0, port->membase + ULITE_CONTROL);
+   ioread32(port->membase + ULITE_CONTROL); /* dummy */
free_irq(port->irq, port);
 }
 
@@ -314,7 +314,7 @@
 
/* wait up to 10ms for the character(s) to be sent */
for (i = 0; i < 10000; i++) {
-   if (readb(port->membase + ULITE_STATUS) & ULITE_STATUS_TXEMPTY)
+   if (ioread32(port->membase + ULITE_STATUS) & ULITE_STATUS_TXEMPTY)
break;
udelay(1);
}
@@ -323,7 +323,7 @@
 static void ulite_console_putchar(struct uart_port *port, int ch)
 {
ulite_console_wait_tx(port);
-   writeb(ch, port->membase + ULITE_TX);
+   iowrite32(ch, port->membase + ULITE_TX);
 }
 
 static void ulite_console_write(struct console *co, const char *s,
@@ -340,8 +340,8 @@
spin_lock_irqsave(&port->lock, flags);
 
/* save and disable interrupt */
-   ier = readb(port->membase + ULITE_STATUS) & ULITE_STATUS_IE;
-   writeb(0, port->membase + ULITE_CONTROL);
+   ier = ioread32(port->membase + ULITE_STATUS) & ULITE_STATUS_IE;
+   iowrite32(0, port->membase + ULITE_CONTROL);
 
uart_console_write(port, s, count, ulite_console_putchar);
 
@@ -349,7 +349,7 @@
 
/* restore interrupt state */
if (ier)
-   writeb(ULITE_CONTROL_IE, port->membase + ULITE_CONTROL);
+   iowrite32(ULITE_CONTROL_IE, port->membase + ULITE_CONTROL);
 
if (locked)
spi

Re: [RFC] uartlite driver MicroBlaze compatability

2007-04-30 Thread John Williams
Grant Likely wrote:
> On 4/30/07, John Williams <[EMAIL PROTECTED]> wrote:
> 
>> All readb/writeb ops are converted to ioread32/iowrite32.
>>
>> On MicroBlaze readb/writeb are picking up the MSB, instead of LSB, and
>> thus reading all zeros instead of the 8-bit control/status/FIFO
>> registers that you intended.
>>
>> Can you please confirm if this works on PPC?
> 
> Yes, I've confirmed this does work on PPC; but I don't think it's
> quite the correct fix.
> 
> ioread/write32 is mapped to in/out_le32, yet the bootloader driver
> must use in/out_be32.  This is because the uartlite driver follows the
> lead of 8250 and requires an offset of 3 from the base address in
> order to find the relevant byte wise address.  In fact, I believe the
> driver should work as-is on microblaze if the offset-by-3 is not used
> when registering it to the platform bus.

ugh.  I missed the off-by-3 stuff in the PPC platform setup.  Smells bad!
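
To make the le32/be32 point concrete, here is a small userspace sketch. load_be32 and load_le32 are hypothetical stand-ins that assemble bytes the way in_be32 and in_le32 would; the register contents are made up for the example. It also shows one plausible reason the 32-bit little-endian read can appear to work when the base address has already been offset by 3: the low byte of the result is then the register's data byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble four bytes most-significant first, like in_be32(). */
static uint32_t load_be32(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Assemble four bytes least-significant first, like in_le32(). */
static uint32_t load_le32(const uint8_t *p)
{
	assert(p != NULL);
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

For a register holding the bytes 00 00 00 41, load_be32 at the real base gives 0x41, load_le32 at the real base gives 0x41000000, and load_le32 at base+3 has 0x41 in its low byte.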

> However, the uartlite is *not* an 8250.  The 8250 turns up all over
> the place and its registers are defined as 8 bit wide.  The
> offset-by-3 stuff is part of the plat_serial8250_port structure which
> is also used to specify .regshift (increment between registers).
> Whereas the UARTLITE is defined as a 32 bit device and it doesn't show
> up in anywhere near as many designs.  Registers are always 4 bytes
> wide and are always located at multiples of 4 bytes off the base
> address.

Agreed.

> So; starting with your patch and modifying it, I've attached what I think
> the change should be.  It should work for microblaze, but I've only
> tested w/ ppc.  Unfortunately the (void*) casts are ugly; there might
> be a way around that, but it's due to the type used for the (struct
> uart_port)->membase variable.

Looks good - boot tested on MicroBlaze 2.6.20

Acked-by: John Williams <[EMAIL PROTECTED]>

John


Re: [RFC] uartlite driver MicroBlaze compatability

2007-05-01 Thread John Williams
Grant Likely wrote:
> On 5/1/07, John Williams <[EMAIL PROTECTED]> wrote:
> 
>> Grant Likely wrote:
>> > However, the uartlite is *not* an 8250.  The 8250 turns up all over
>> > the place and its registers are defined as 8 bit wide.  The
>> > offset-by-3 stuff is part of the plat_serial8250_port structure which
>> > is also used to specify .regshift (increment between registers).
>> > Whereas the UARTLITE is defined as a 32 bit device and it doesn't show
>> > up in anywhere near as many designs.  Registers are always 4 bytes
>> > wide and are always located at multiples of 4 bytes off the base
> 
> Hmm, I think I was smoking something last night.  Address used for 8
> bit access should not be affected by CPU endianness.  After David's
> comments, I reread the uartlite documentation.  The current design is
> definitely for 32bit OPB bus connections, but it looks like there is a
> possibility for xilinx to add a 16 or 8 bit attachment.  Since the
> uartlite design explicitly supports 8, 16 and 32 bit access, sticking
> with 8 bit io may be the safest.  

To be honest I don't think that will ever happen - just because the OPB 
bus data width is parameterisable, doesn't mean that it actually *works* 
or has been tested on anything other than 32-bits wide.  I've certainly 
never heard of anyone doing so, on either MicroBlaze or PPC.

but, I won't fight over it :)

Either way, it will still require a code change if/when someone does a 
16/8 bit wide OPB bus.  Whether they change the IO access operation, or 
a hardcoded constant, it's still not perfect.

Of course the real solution here is to create an OPB bus driver, with a 
'width' field that you can pull out of XPAR, and so on... Use that 
instead of platform bus, and all this rubbish can be dealt with cleanly.
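
A very rough sketch of what such a width-aware bus descriptor could look like follows. Everything here is hypothetical: struct opb_device, its width field and opb_read_reg are invented for illustration, and the layout assumed (big-endian registers spaced one bus word apart) is an assumption, not something taken from a datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus-level description of one peripheral. */
struct opb_device {
	const uint8_t *base;    /* real base address of the peripheral */
	unsigned int width;     /* bus data width in bits: 8, 16 or 32 */
};

/*
 * Read register number `index`, assuming big-endian registers that are
 * each one bus word wide and one bus word apart.
 */
static uint32_t opb_read_reg(const struct opb_device *dev, unsigned int index)
{
	const uint8_t *p;
	unsigned int bytes, i;
	uint32_t val = 0;

	assert(dev != NULL);
	assert(dev->width == 8 || dev->width == 16 || dev->width == 32);

	bytes = dev->width / 8;
	p = dev->base + (size_t)index * bytes;
	for (i = 0; i < bytes; i++)
		val = (val << 8) | p[i];
	return val;
}
```

A driver written against something like this would name registers by index and never see the width, which is the kind of clean-up hinted at above.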

> However, I still think the
> application of the 3 byte offset should be done in the driver, and not
> in the platform bus registration.

If it has to be done, I agree the driver is the place to put it.

> I've reworked the patch with the following changes
> - remove 3 byte offset from platform bus registration.
> - added ulite_in/ulite_out macros to make changing bus attachment
> details simpler if xilinx changes the uartlite design.
> - stick with 8 bit IO.

It works fine, however perhaps a comment explaining the +3 offset might 
be appreciated by those who follow.
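
For example, a commented accessor pair along these lines might do. This is a userspace sketch only, not the reworked patch: the function bodies, the ULITE_BYTE_LANE name and the byte-array register window are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets from the peripheral's real base address. */
#define ULITE_RX	0x00
#define ULITE_TX	0x04

/*
 * Each register is 32 bits wide on a big-endian bus, and only the least
 * significant 8 bits carry data.  Those bits live in the last byte of
 * the register, hence the +3 when doing byte-wide I/O.
 */
#define ULITE_BYTE_LANE	3

/* Byte-wide read of a register's data byte. */
static uint8_t ulite_in(const uint8_t *membase, unsigned int reg)
{
	assert((reg & 3) == 0);	/* registers sit at multiples of 4 */
	return membase[reg + ULITE_BYTE_LANE];
}

/* Byte-wide write of a register's data byte. */
static void ulite_out(uint8_t *membase, unsigned int reg, uint8_t val)
{
	assert((reg & 3) == 0);
	membase[reg + ULITE_BYTE_LANE] = val;
}
```

Keeping the offset in one named constant inside the driver means the platform code can register the real base address.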

> Tested on PPC.  John, can you please test on microblaze?

Acked-by: John Williams <[EMAIL PROTECTED]>

John


Re: [RFC] uartlite driver MicroBlaze compatability

2007-05-02 Thread John Williams
Hi Peter,

Peter Korsgaard wrote:

> JW> The attached patch gets your uartlite driver going on MicroBlaze.
> 
> Nice!
> 
> JW> All readb/writeb ops are converted to ioread32/iowrite32.
> 
> JW> On MicroBlaze readb/writeb are picking up the MSB, instead of LSB,
> JW> and thus reading all zeros instead of the 8-bit
> JW> control/status/FIFO registers that you intended.
> 
> I take it that the microblaze is big endian? Then you just need to add
> 3 to the base address and everything should work without your patch.

I struggle to see adding 3 to the base address in the platform driver as 
a clean solution.  The base address of the peripheral is 0x10240000, or 
whatever, not 0x10240003.

I understand the reasoning for it, but from the platform's perspective 
it seems wrong.

If you read the opb_uartlite datasheet, it says that bits 0-26 of the 
FIFO, CTRL and STATUS regs are "reserved".  It doesn't say, this is an 
8-bit peripheral that is mapped onto a 32bit bus with a stride of 4.

If you also read page 6, under address map, it says

BASE_ADDRESS+0 : read from receive FIFO
BASE_ADDRESS+4 : write to transmit FIFO
and so on.

It is a 32-bit peripheral, it just so happens that 24 of those bits are 
currently "reserved".
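
Treating the device that way could look like the sketch below: read the full 32-bit big-endian word at the real base address and mask off the reserved bits. This is a userspace illustration only. ulite_read32, ulite_rx_char and ULITE_DATA_MASK are hypothetical names, and a byte array stands in for the register window.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets as quoted from the datasheet's address map. */
#define ULITE_RX	0x00	/* read from receive FIFO */
#define ULITE_TX	0x04	/* write to transmit FIFO */

/* Only the low 8 bits of the FIFO registers carry data. */
#define ULITE_DATA_MASK	0xffu

/* Full-width big-endian read of one register, in the spirit of in_be32(). */
static uint32_t ulite_read32(const uint8_t *base, unsigned int reg)
{
	const uint8_t *p = base + reg;

	assert((reg & 3) == 0);	/* registers sit at multiples of 4 */
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Fetch one received character, ignoring whatever the reserved bits hold. */
static uint8_t ulite_rx_char(const uint8_t *base)
{
	return (uint8_t)(ulite_read32(base, ULITE_RX) & ULITE_DATA_MASK);
}
```

No +3 appears anywhere: the platform supplies the real base address and the driver decides which bits matter.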

Grant's recanting may have been triggered by the figure on page 4 of the 
datasheet, which is generic Xilinx IP Core datasheet material explaining 
the endian interpretation for different data widths.

> JW> Can you please confirm if this works on PPC?
> 
> It won't as ioread/write does big/little endian byte swapping. Isn't
> that done on microblaze?

Not presently, but I will fix that.

I think that Grant's approach of using in/out_be32, and the real base 
address (ie not +3), is the only logically correct solution.

Regards,

John


Re: Xilinx git tree at source.mvista.com

2007-05-28 Thread John Williams
Hi Wolfgang,

Wolfgang Reissnegger wrote:
> David H. Lynch Jr. wrote:
> 
>>For me the most significant issue is the bazillion layers of nested
>>macro's and includes.
> 
> 
> I don't see the macros as an issue, just look at the implementation of,
> for example, spin_lock_irq() and Xilinx's macros seem like child's play :-)
> As for includes, yes, there are a few too many header files. But, as
> time progresses and the need arises they can be merged into fewer files.

It seems the kernel.org decision has been made re: the style issue. 
None of the *_i.[ch], *_g.[ch] + adapter.c drivers will make it to 
mainline.

I understand why Xilinx did it this way, but to be honest never agreed 
with it myself either.  Style issues aside, three levels of function 
calls in an interrupt handler might be portable, but it still isn't a 
good thing!

The effort to refactor these drivers is not huge, but it is an effort. 
If Xilinx is committed to good quality Linux support for their silicon, 
it will require tangible investment in the form of labour or resources. 
I know you understand this, but Xilinx as a company still needs a good 
hard shove in this direction.

Alternatively, drivers will trickle into kernel.org as the community 
gets around to it, witness the uartlite and system ace drivers.

Same old story, if you want it "some day", then it will be free, if you 
want it now, you've got to pay!

Cheers,

John



Xilinx OPB_PCI support in 2.6

2007-06-18 Thread John Williams
Hi folks,

Has anyone tried getting the Xilinx opb_pci bridge going in a recent 2.6 
kernel?  - ML410 board or similar?

Mailing list archives seem silent about it.

Thanks,

John


suspend/resume (Xilinx V4)

2007-08-02 Thread John Williams
Hi,

Any comments on the status/feasibility of suspend/resume for LinuxPPC on 
Xilinx devices?

The only suspend-related traffic I see here in the last 12 months is for 
the lite5200b board.

Any guesstimates of level of difficulty for such a task on Xilinx 
PPC405?  The core lite5200b patchset didn't look too hairy (just a 
modest bit of ASM :), but there would also be Xilinx device driver hooks 
to consider.

Also, are there kernel revision dependencies here?  Project is currently 
on an MVL 2.6.10 tree - but have seen mention of 2.6.17 (?) being 
earliest with PPC suspend/resume capability.

All info appreciated.  Thanks,

John


Re: [PATCH] Xilinx TEMAC driver

2007-11-12 Thread John Williams
Hi David, Grant

Grant Likely wrote:
> On 7/24/07, David H. Lynch Jr. <[EMAIL PROTECTED]> wrote:
> 
>>Hopefully this is not too much of a mess.

> Hooray!  Thanks for posting your work.  I'm keen to try this on my
> platform.  Comments below.
> 

[snip]

Any progress on the ll_temac driver since July?  In EDK9.2, ll_temac is 
really the only supported ethernet solution, apart from ethernet lite 
(yuck).

If there's a PPC version in a reasonable state, I'm happy to see what's 
required to port it across to MicroBlaze.

cheers,

John


Re: Xilinx devicetrees

2007-11-27 Thread John Williams
Hi folks,

Stephen Neuendorffer wrote:

>  >Binding it to a kernel, is a non-starter for us.
> 
> I agree that this is not the best way of leveraging the power of device 
> trees.  The point is that by using a device tree, you haven't lost 
> anything you can do currently.  In the future we might have one kernel 
> which supports all versions of all our IP, along with all flavors of 
> microblaze and powerpc...  You would only ever need to recompile this 
> kernel as a final optimization, if at all.

I strongly support the OF / device tree work being done, from its own 
perspective and also as a path to MicroBlaze/PPC unification, however 
there is one critical difference that I have not seen adequately 
addressed yet.

MicroBlaze is a highly configurable CPU in terms of its instruction set, 
features and so on.  To make use of this, it is essential that each 
kernel image be compiled to match the CPU configuration.  While a 
generic kernel, making use of no features (MUL, DIV, Shift, MSR ops etc) 
would run on any CPU, performance would be abysmal.

In my view it's not acceptable to present these as options for the user 
to select at kernel config time. With N yes/no parameters, there is 1 
correct configuration, and 2^N-1 incorrect ones.  The odds of the user 
falling upon a configuration that at worst fails to boot, or at best is 
not optimally matched to the hardware, are high.
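
The arithmetic is easy to check. The helper below is purely illustrative (incorrect_configs is a made-up name): with N independent yes/no options there are 2^N combinations, exactly one of which matches the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Number of wrong settings for n independent yes/no options: 2^n - 1. */
static uint64_t incorrect_configs(unsigned int n)
{
	assert(n < 64);	/* keep the shift within the 64-bit type */
	return (UINT64_C(1) << n) - 1;
}
```

Four options (say MUL, DIV, shift and MSR ops) already leave 15 ways to get it wrong, and ten options leave 1023.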

This same issue also applies to C libraries and apps - they must be 
compiled with prior knowledge of the CPU.  This is why our 
microblaze-uclinux-gcc toolchain, with multilib'd uClibc, is almost 400meg!

Wrapping every mul, div, shift etc in a function call is clearly not 
feasible.  Things like the msrset/msrclr ops have a modest but 
measurable impact on kernel code size and performance - it's just not 
reasonable to add any level of indirection in there.

I have thought about dynamic (boot-time) code re-writing as one 
possibility here, but it very quickly gets nasty.  All of the 
"optimised" opcodes (MUL/DIV/Shift etc) are smaller than their emulated 
counterparts, so in principle we could re-write the text segment with 
the optimised opcode, then NOP pad, but that's still inefficient.  As 
soon as we start talking about dynamic code relocation, re-writing 
relative offsets in loops, ... yuck..  We'd be putting half of mb-ld 
into the arch early boot code (or bootloader...)

The opposite approach, to build with all instructions enabled and 
install exception handlers to deal with the fixups, is also pretty awful.

I find myself asking the question - for what use cases does the current 
static approach used in MicroBlaze (with the PetaLinux BSP / 
Kconfig.auto) *not* work?

One compromise approach might be to have a script in 
arch/microblaze/scripts, called by the arch Makefile, that cracks open 
the DT at build time and extracts appropriate cpu flags.
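
As a rough idea of how little such a tool needs to do, here is a sketch in C (a real one would more likely be a short script). It scans DTS source text for single-cell properties and turns them into compiler flags. The property names (xlnx,use-barrel, xlnx,use-hw-mul, xlnx,use-div) and the mapping to flags are assumptions for this example, and the parser is deliberately naive: it ignores comments, node boundaries and multiple CPUs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Find "name = <value>;" in DTS source text and return the value, or
 * `fallback` when the property is absent or not a single cell.
 */
static long dts_cell_property(const char *dts, const char *name, long fallback)
{
	const char *p;
	size_t len;
	long value;

	assert(dts && name);
	len = strlen(name);
	for (p = strstr(dts, name); p; p = strstr(p + 1, name)) {
		/* %li accepts both decimal and 0x-prefixed cells. */
		if (sscanf(p + len, " = < %li >", &value) == 1)
			return value;
	}
	return fallback;
}

/* Append one flag, space separated; flags that do not fit are dropped. */
static void append_flag(char *out, size_t outlen, const char *flag)
{
	size_t used = strlen(out);

	if (used + strlen(flag) + 2 > outlen)
		return;
	if (used)
		out[used++] = ' ';
	strcpy(out + used, flag);
}

/* Build a CPU flag string from the (assumed) properties in the DTS text. */
static void cpu_flags_from_dts(const char *dts, char *out, size_t outlen)
{
	assert(out && outlen > 0);
	out[0] = '\0';
	if (dts_cell_property(dts, "xlnx,use-barrel", 0))
		append_flag(out, outlen, "-mxl-barrel-shift");
	if (dts_cell_property(dts, "xlnx,use-hw-mul", 0))
		append_flag(out, outlen, "-mno-xl-soft-mul");
	if (dts_cell_property(dts, "xlnx,use-div", 0))
		append_flag(out, outlen, "-mno-xl-soft-div");
}
```

The arch Makefile would then only need to capture the tool's output into its CPU flags, so the DTS stays the single source of truth for the CPU configuration.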

Finally, what is the LKML position on DT files going into the kernel 
source tree?

Regards,

John


Re: Xilinx devicetrees

2007-11-27 Thread John Williams
Stephen Neuendorffer wrote:

>>From: John Williams [mailto:[EMAIL PROTECTED] 

>>MicroBlaze is a highly configurable CPU in terms of its 
>>instruction set, 
>>features and so on.  To make use of this, it is essential that each 
>>kernel image be compiled to match the CPU configuration.  While a 
>>generic kernel, making use of no features (MUL, DIV, Shift, 
>>MSR ops etc) 
>>would run on any CPU, performance would be abysmal.
> 
> I think the userspace is actually much more critical than the kernel for
> most of these things (with the exception of msrset/msrclr, and the
> barrel shifter perhaps).  Unfortunately, even if you implement an
> alternatives-style mechanism for the kernel, you're still hosed for
> userspace.  

I haven't benchmarked each option on the kernel - you are right that 
shift is a big one, but mul I think is also important, given that every 
array access generates an integer multiply.

> Once I have a big enough system, it's just unfeasible to
> recompile everything anyway.  I think this is where autoconfig starts to
> break down.

I'm not sure I agree here.  Given that most people building MicroBlaze 
systems are doing so with uClinux-dist (or PetaLinux), you can do a full 
rebuild of kernel, libs and apps in a couple of minutes.  Much shorter than 
synthesis and P&R, that's for sure (and runtime is linear in size, 
unlike P&R :)

> It's not nice, I agree.  I think the key principle should be that I
> should be able to get a system working as quickly as possible, and I
> might optimize things later.  One thing that device trees will allow is
> for *all* the drivers to get compiled in to the kernel, and only as a
> late stage operation does a designer need to start throwing things away.
> Using traps I can easily start with a 'kitchen sink' design, and start
> discarding processor features, relying on the traps.  When I get low
> enough down on the performance curve, I can use an autoconfig-type
> mechanism to regain a little of what I lost by optimizing away the trap
> overhead. 

OK, but now we have a kernel dependent on *3* files - a DTS, a 
Kconfig.auto, and (indirectly) the bitstream itself.

> Personally, I think the easiest way out of all this is to just have less
> configurability.  For microblaze in general, this is too much of a
> restriction, but microblaze used as a control processor running linux,
> there are probably just a few design points that really make sense
> (probably size optimized: no options except maybe msrset/msrclr, and the
> kitchen sink).  If we go that far, we don't really need people to ever
> run autoconfig, or kernels, or anything.  Especially considering there
> is no easy way of selecting which of the 2^N design points I want
> *anyway*. :)

My experience tells me that if the microblaze can be configured in a 
particular way, *someone* will want to do it (and still boot linux on 
it!)  We still have people building MicroBlaze 3.00 in Spartan2E, with 
EDK 6.3.  And autoconfig works!  Exceptions on/off, MMU on/off (runtime 
configurable on that?).

Our ability to plug into the backend design database of EDK presents a 
great opportunity - truly automatically configured kernels.  I think we 
have a responsibility to leverage that power.  We are already there 
with the static approach, I think we just need to make sure that 
persists into the dynamic approach, and that we find a good mix of the two.

There are of course some semantic issues that the EDK cannot 
automatically resolve - relative ordering and priority of multiple 
peripheral instances for example.

>>One compromise approach might be to have a script in 
>>arch/microblaze/scripts, called by the arch Makefile, that 
>>cracks open 
>>the DT at build time and extracts appropriate cpu flags.
> 
> Hmm... interesting idea, although parsing the source is likely
> difficult...  It's probably not worth it to go this far, I think.   As
> you say, why doesn't autoconfig of today work fine for this.

Well, copying multiple configuration files into the kernel is not ideal.
Surely a little perl or python script would do the trick?  DTS syntax 
is pretty clean, just find the CPU node and off we go.  Multiple CPU, 
well... :)

>>Finally, what is the LKML position on DT files going into the kernel 
>>source tree?
> 
> 
> Source .dts go in and get compiled to binary blobs at compile time.  The
> 'big' recent controversy is whether the source->binary compiler dtc
> should be mirrored in the Linux tree or not.

OK.

Another thing I suggested to Michal recently, perhaps we need 
kernel/lib/libof to store common OF / DT handling code.  Much better 
than duplicating it across microblaze and PPC, and maybe other arch's 
would also see the light..  That would also add a claim for the DTC to 
go in scripts/

John


Re: SMP on linux with Microblaze?

2007-12-03 Thread John Williams
Hi,

khollan wrote:

>Now that Full Linux can run on Microblaze with the addition of the MMU, are
>there plans to enable Symmetric Multi-Processing of two or more Microblaze
>cores running Linux?
>  
>
This isn't really the right list for direct microblaze discussion, but
since you asked..  The challenge with SMP is not an MMU, but cache
coherency.  This is why native SMP on dual PPC on V4/V5 is also a
non-starter.  It is possible to build software driven snoop/invalidate
mechanisms that might allow a crippled SMP on MicroBlaze, but I think
the performance would be pretty nasty. 

The Blackfin Linux team have done some interesting things towards SMP on
non cache-coherent dual CPUs.  Basically they do a local cache
invalidation upon acquiring any kernel lock, on the theory that if you
are accessing a shared data structure you will grab a lock first.  Thus,
the cache flush will make sure you get the "true" value, not some stale
locally cached result.  But, it's still pretty inefficient, and cannot
do things like processor affinity and process migration.  Google the
bfin lists for details and patches.
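
The idea can be modelled in a few lines of single-threaded C. This is a toy simulation, not the Blackfin code: struct local_cache, struct sim_lock and the helper functions are invented for illustration, and a real implementation would invalidate hardware cache lines rather than clear a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* One word of shared memory, and one CPU's private cached copy of it. */
struct shared_word {
	int memory;
};

struct local_cache {
	int value;
	bool valid;
};

struct sim_lock {
	bool held;
};

/* A read hits the local cache if it is valid, otherwise refills from memory. */
static int cached_read(struct local_cache *c, const struct shared_word *m)
{
	if (!c->valid) {
		c->value = m->memory;
		c->valid = true;
	}
	return c->value;
}

/* Taking the lock throws away the local copy, so the next read refills. */
static void lock_acquire(struct sim_lock *l, struct local_cache *c)
{
	assert(!l->held);
	l->held = true;
	c->valid = false;
}

static void lock_release(struct sim_lock *l)
{
	assert(l->held);
	l->held = false;
}
```

A read without the lock can return a stale value after another CPU has written to memory, while a read after lock_acquire sees the current one. The cost is a refill after every lock acquisition, which is the inefficiency mentioned above.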

Regards,

John
