[MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread Jean-Michel Hautbois
Hi lists !

I measured the latency and the jitter of the RX and TX ethernet paths
on my MPC5200 board.
The RX path is quite good, but the TX path can be slow.

[ 1218.976762] [mpc52xx_fec_start_xmit]Delay >30us for dma_map_single
=> 76364 ns
[ 1219.188405] [mpc52xx_fec_tx_interrupt]Delay >30us for
dma_unmap_single => 34515 ns
[ 1220.628785] [mpc52xx_fec_start_xmit]Delay >30us for
bcom_submit_next_buffer => 97273 ns
[ 1225.776784] [mpc52xx_fec_tx_interrupt]Delay >30us for
dma_unmap_single => 95273 ns

As one can see, this is obviously problematic.
The first function I analyzed is bcom_submit_next_buffer() => This
function doesn't do much apart from a call to mb().

I have been looking at the "MPC603e RISC Microprocessor User's Manual"
and especially the chapter named "2.3.4.7 Memory Synchronization
Instructions—UISA".

Here is a paragraph which explains a lot :

"The functions performed by the sync instruction normally take a
significant amount of time
to complete; as a result, frequent use of this instruction may
adversely affect performance.
In addition, the number of cycles required to complete a sync
instruction depends on system
parameters and on the processor's state when the instruction is issued."

I am using a real-time kernel, and this is a problem, because the
instruction does not take a deterministic amount of time.
Is there a way to avoid this ?

I will now focus on the dma_map_single() and dma_unmap_single functions...

Thanks in advance for your help,
Best Regards,

JM
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: PHY/FEC Network adapter failed to initialize on MPC52xx Board

2010-12-01 Thread Peter kuennem...@crane-soft

Thanks to all for the replies to this thread. I got the tip from the thread Tiejun
mentioned.

Quotation: 'St. Strobel' in Xenomai-help

"I had this problem too. In my case the problem was caused by an incorrect port
multiplex configuration in U-Boot, see CONFIG_SYS_GPS_PORT_CONFIG (I use 0x
xxx5, which corresponds to 10/100Mbit Eth with MD)."

That fixed it for me.

Regards, Peter

Am 29.11.2010 03:48, schrieb tiejun.chen:
> Peter wrote:
>> Hi all
>>
>> I got completely stuck with a network adapter problem on my
>> ppc board (MPC52xx style). The network adapter does not seem
>> to initialize correctly when booted without 'help from uboot'
>>
> Looks like your problem is very similar to one I replied to here not long ago :)
> That one also came from an MPC5200, and I remember there was a wrong Port
> Multiplex Configuration.
>
> If possible, maybe you can check the email with the subject "Problem Ethernet
> Initialization MPC5200 +  LXT971A" on the linuxppc-dev list.
>
> Hope it's useful.
>
> Tiejun
>
>> The adapter works properly when I first use it with uboot. E.g.
>> using tftp to load the kernel or just issuing a dummy sntp
>> command. It does not get initialized if I boot linux without
>> using any network relevant command in uboot
>>
>> The difference manifests on the boot message: (working)
>> PHY working
>> ...
>> mpc52xx MII bus: probed
>> TCP cubic registered
>> NET: Registered protocol family 17
>> IP-Config: Complete:
>>  device=eth0, addr=192.168.1.245, mask=255.254.0.0, gw=192.168.1.2,
>>  host=192.168.1.245, domain=, nis-domain=(none),
>>  bootserver=192.168.1.244, rootserver=192.168.1.244, rootpath=
>> Looking up port of RPC 13/2 on 192.168.1.244
>> Looking up port of RPC 15/1 on 192.168.1.244
>> VFS: Mounted root (nfs filesystem) on device 0:11.
>> Freeing unused kernel memory: 124k init
>> PHY: f0003000:00 - Link is Up - 100/Full
>>
>> # ping 192.168.1.2  returns proper results.
>>
>> PHY Not working:
>> ...
>> mpc52xx MII bus: probed
>> TCP cubic registered
>> NET: Registered protocol family 17
>> IP-Config: Complete:
>>  device=eth0, addr=192.168.1.245, mask=255.254.0.0, gw=192.168.1.2,
>>  host=192.168.1.245, domain=, nis-domain=(none),
>>  bootserver=192.168.1.244, rootserver=192.168.1.244, rootpath=
>> VFS: Mounted root (squashfs filesystem) readonly on device 31:3.
>> Freeing unused kernel memory: 124k init
>>
>> # ping 192.168.1.2  hangs
>>
>>
>> The second snippet does not have "Looking up.." messages because it
>> boots from flash. The main difference is "PHY: f0003000:00 - Link is Up -
>> 100/Full",
>> which does not appear in the failing case.
>>
>> Linux Version is 2.6.35.7 patched with xenomai 2.5
>> U-Boot 2010.06 (Aug 05 2010 - 19:54:45)
>>
>> Linux configuration see below: (I left out most entries that are not set)
>> I also experimented with different settings but finally only
>> CONFIG_FEC_MPC52xx=y  and CONFIG_FEC_MPC52xx_MDIO=y
>> seem to be of any relevance. If both are set, the adapter works
>> when initialized by uboot.
>>
>> Any help or tips will be very much appreciated,
>>
>> Regards, Peter
>>
>>
>> Linux .config
>>
>> ...
>> #
>> # Platform support
>> #
>> # CONFIG_PPC_CHRP is not set
>> # CONFIG_MPC5121_ADS is not set
>> # CONFIG_MPC5121_GENERIC is not set
>> CONFIG_PPC_MPC52xx=y
>> CONFIG_PPC_MPC5200_SIMPLE=y
>> # CONFIG_PPC_EFIKA is not set
>> CONFIG_PPC_LITE5200=y
>> # CONFIG_PPC_MEDIA5200 is not set
>> CONFIG_PPC_MPC5200_BUGFIX=y
>> # CONFIG_PPC_MPC5200_GPIO is not set
>> CONFIG_PPC_MPC5200_LPBFIFO=y
>>
>> CONFIG_PPC_BESTCOMM=y
>> CONFIG_PPC_BESTCOMM_FEC=y
>> CONFIG_PPC_BESTCOMM_GEN_BD=y
>> # CONFIG_SIMPLE_GPIO is not set
>> ..
>> # Bus options
>> #
>> CONFIG_ZONE_DMA=y
>> CONFIG_NEED_SG_DMA_LENGTH=y
>> CONFIG_GENERIC_ISA_DMA=y
>> CONFIG_PPC_PCI_CHOICE=y
>> ...
>> #
>> # Generic Driver Options
>> #
>> CONFIG_STANDALONE=y
>> CONFIG_PREVENT_FIRMWARE_BUILD=y
>> CONFIG_MTD=y
>> CONFIG_MTD_PARTITIONS=y
>> CONFIG_MTD_CMDLINE_PARTS=y
>>
>> #
>> # MII PHY device drivers
>> #
>> CONFIG_LXT_PHY=y           ## Does not seem to have any influence
>> CONFIG_NET_ETHERNET=y
>> CONFIG_MII=y
>> CONFIG_ETHOC=y             ## Does not seem to have any influence
>> CONFIG_FEC_MPC52xx=y       ## Must be Y in order to get adapter working with uboot's init
>> CONFIG_FEC_MPC52xx_MDIO=y  ## Must be Y in order to get adapter working with uboot's init


[PATCH] of/address: use proper endianness in get_flags

2010-12-01 Thread Sebastian Andrzej Siewior
This patch changes u32 to __be32 for all "ranges", "prop" and "addr" and
such. Those variables are pointing to the device tree, which contains
integers in big endian format.
Most functions are doing it right because of_read_number() is doing the
right thing for them. of_bus_isa_get_flags(), of_bus_pci_get_flags() and
of_bus_isa_map() were accessing the data directly and were doing it wrong.
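
To illustrate the class of bug being fixed, here is a minimal userspace sketch (not kernel code). The helper names read_be32() and pci_space_code() are hypothetical and exist only for this example; the patch itself uses be32_to_cpup(). The sketch decodes a big-endian cell byte by byte, so the result does not depend on the host byte order, and then extracts the two-bit field that of_bus_pci_get_flags() switches on.

```c
#include <stdint.h>

/* Decode one big-endian 32-bit device-tree cell, independent of host
 * byte order (hypothetical stand-in for the kernel's be32_to_cpup()). */
static uint32_t read_be32(const uint8_t *cell)
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/* Extract the field that of_bus_pci_get_flags() switches on:
 * bits 25:24 of the first address cell. */
static unsigned int pci_space_code(const uint8_t *addr_cell)
{
	return (read_be32(addr_cell) >> 24) & 0x03;
}
```

For a cell holding the bytes 01 00 00 00, pci_space_code() yields 1. Dereferencing the same bytes as a plain u32 on a little-endian host would give 0x00000001, and the shift by 24 would then yield 0, which is the kind of wrong result the direct accesses could produce.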

Signed-off-by: Sebastian Andrzej Siewior 
---
 arch/powerpc/include/asm/prom.h |2 +-
 drivers/of/address.c|   54 --
 include/linux/of_address.h  |6 ++--
 3 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index ae26f2e..ab34f60 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -42,7 +42,7 @@ extern void pci_create_OF_bus_map(void);
 
 /* Translate a DMA address from device space to CPU space */
 extern u64 of_translate_dma_address(struct device_node *dev,
-   const u32 *in_addr);
+   const __be32 *in_addr);
 
 #ifdef CONFIG_PCI
 extern unsigned long pci_address_to_pio(phys_addr_t address);
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 3a1c7e7..b4559c5 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -12,13 +12,13 @@
(ns) > 0)
 
 static struct of_bus *of_match_bus(struct device_node *np);
-static int __of_address_to_resource(struct device_node *dev, const u32 *addrp,
-   u64 size, unsigned int flags,
+static int __of_address_to_resource(struct device_node *dev,
+   const __be32 *addrp, u64 size, unsigned int flags,
struct resource *r);
 
 /* Debug utility */
 #ifdef DEBUG
-static void of_dump_addr(const char *s, const u32 *addr, int na)
+static void of_dump_addr(const char *s, const __be32 *addr, int na)
 {
printk(KERN_DEBUG "%s", s);
while (na--)
@@ -26,7 +26,7 @@ static void of_dump_addr(const char *s, const u32 *addr, int na)
printk("\n");
 }
 #else
-static void of_dump_addr(const char *s, const u32 *addr, int na) { }
+static void of_dump_addr(const char *s, const __be32 *addr, int na) { }
 #endif
 
 /* Callbacks for bus specific translators */
@@ -36,10 +36,10 @@ struct of_bus {
int (*match)(struct device_node *parent);
void(*count_cells)(struct device_node *child,
   int *addrc, int *sizec);
-   u64 (*map)(u32 *addr, const u32 *range,
+   u64 (*map)(u32 *addr, const __be32 *range,
int na, int ns, int pna);
int (*translate)(u32 *addr, u64 offset, int na);
-   unsigned int(*get_flags)(const u32 *addr);
+   unsigned int(*get_flags)(const __be32 *addr);
 };
 
 /*
@@ -55,7 +55,7 @@ static void of_bus_default_count_cells(struct device_node *dev,
*sizec = of_n_size_cells(dev);
 }
 
-static u64 of_bus_default_map(u32 *addr, const u32 *range,
+static u64 of_bus_default_map(u32 *addr, const __be32 *range,
int na, int ns, int pna)
 {
u64 cp, s, da;
@@ -85,7 +85,7 @@ static int of_bus_default_translate(u32 *addr, u64 offset, int na)
return 0;
 }
 
-static unsigned int of_bus_default_get_flags(const u32 *addr)
+static unsigned int of_bus_default_get_flags(const __be32 *addr)
 {
return IORESOURCE_MEM;
 }
@@ -110,10 +110,10 @@ static void of_bus_pci_count_cells(struct device_node *np,
*sizec = 2;
 }
 
-static unsigned int of_bus_pci_get_flags(const u32 *addr)
+static unsigned int of_bus_pci_get_flags(const __be32 *addr)
 {
unsigned int flags = 0;
-   u32 w = addr[0];
+   u32 w = be32_to_cpup(addr);
 
switch((w >> 24) & 0x03) {
case 0x01:
@@ -129,7 +129,8 @@ static unsigned int of_bus_pci_get_flags(const u32 *addr)
return flags;
 }
 
-static u64 of_bus_pci_map(u32 *addr, const u32 *range, int na, int ns, int pna)
+static u64 of_bus_pci_map(u32 *addr, const __be32 *range, int na, int ns,
+   int pna)
 {
u64 cp, s, da;
unsigned int af, rf;
@@ -160,7 +161,7 @@ static int of_bus_pci_translate(u32 *addr, u64 offset, int na)
return of_bus_default_translate(addr + 1, offset, na - 1);
 }
 
-const u32 *of_get_pci_address(struct device_node *dev, int bar_no, u64 *size,
+const __be32 *of_get_pci_address(struct device_node *dev, int bar_no, u64 *size,
unsigned int *flags)
 {
const __be32 *prop;
@@ -207,7 +208,7 @@ EXPORT_SYMBOL(of_get_pci_address);
 int of_pci_address_to_resource(struct device_node *dev, int bar,
   struct resource *r)
 {
-   const u32   *addrp;
+   const __be32*addrp;
u64 size;
unsigned intflags;
 
@@ -

Re: [MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread Jean-Michel Hautbois
2010/12/1 Jean-Michel Hautbois :
> Hi lists !
>
> I measured the latency and the jitter of the RX and TX ethernet paths
> on my MPC5200 board.
> The RX path is quite good, but the TX path can be slow.
>
> [ 1218.976762] [mpc52xx_fec_start_xmit]Delay >30us for dma_map_single
> => 76364 ns
> [ 1219.188405] [mpc52xx_fec_tx_interrupt]Delay >30us for
> dma_unmap_single => 34515 ns
> [ 1220.628785] [mpc52xx_fec_start_xmit]Delay >30us for
> bcom_submit_next_buffer => 97273 ns
> [ 1225.776784] [mpc52xx_fec_tx_interrupt]Delay >30us for
> dma_unmap_single => 95273 ns
>
> As one can see, this is obviously problematic.
> The first function I analyzed is bcom_submit_next_buffer() => This
> function doesn't do lots of things, except a call to mb().
>
> I have been looking to the "MPC603e RISC Microprocessor User's Manual"
> and especially the chapter named "2.3.4.7 Memory Synchronization
> Instructions—UISA".
>
> Here is a paragraph which explains a lot :
>
> "The functions performed by the sync instruction normally take a
> significant amount of time
> to complete; as a result, frequent use of this instruction may
> adversely affect performance.
> In addition, the number of cycles required to complete a sync
> instruction depends on system
> parameters and on the processor's state when the instruction is issued."
>
> I am using a real time kernel, and this is a problem, as it is not
> deterministic to use this instruction.
> Is there a way to avoid this ?
>
> I will now focus on the dma_map_single() and dma_unmap_single functions...
>
> Thanks in advance for your help,
> Best Regards,
>
> JM
>

dma_map_single() and dma_unmap_single() both end up executing the same
instruction (sync), because both clean the cache.
The eieio instruction doesn't seem to be any faster, and since the cache is
not inhibited here, I don't think it is the right instruction to use anyway.

The delay introduced by these instructions can be really big (about
70-90µs), whereas in most cases it is relatively good (about 10-20µs).
This jitter is a problem in my use case, and I think I am not the only one :).

One more thing: I am using small packets (about 200 bytes).

JM

[PATCH] eSPI: change the read behavior of the SPIRF

2010-12-01 Thread Mingkai Hu
The user must read N bytes of SPIRF (1 <= N <= 4) that do not exceed the
amount of data in the receive FIFO, so read the SPIRF byte by byte when
the data in receive FIFO is less than 4 bytes.

On Simics, when reading N bytes that exceed the amount of data in the receive
FIFO, we can't read the data out, that is, we can't clear the rx FIFO,
and the CPU then loops on the espi rx interrupt.
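
The byte-by-byte path can be sketched in userspace as follows. This is only an illustration of the assembly logic in the patch below: struct fake_fifo, fifo_read8() and read_partial_word() are hypothetical names standing in for the SPIRF register and in_8(), and the sketch assumes 1 to 3 bytes remain.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the receive FIFO: each call to
 * fifo_read8() pops one byte, like in_8() on the SPIRF register. */
struct fake_fifo {
	const uint8_t *data;
	size_t pos;
};

static uint8_t fifo_read8(struct fake_fifo *f)
{
	return f->data[f->pos++];
}

/* Assemble a left-aligned 32-bit word from `len` (1..3) remaining bytes,
 * reading one byte at a time so no more than `len` bytes are consumed. */
static uint32_t read_partial_word(struct fake_fifo *f, unsigned int len)
{
	uint32_t word = 0;
	unsigned int left = len;

	/* The first byte read ends up in the most significant position. */
	while (left--)
		word |= (uint32_t)fifo_read8(f) << (left * 8);

	/* Shift so the data occupies the top `len` bytes of the word. */
	return word << ((4 - len) * 8);
}
```

With three bytes AA BB CC left in the FIFO, read_partial_word() returns 0xAABBCC00 and consumes exactly three bytes, so the FIFO is never over-read.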

Signed-off-by: Mingkai Hu 
---
 drivers/spi/spi_fsl_espi.c |   19 ---
 1 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi_fsl_espi.c b/drivers/spi/spi_fsl_espi.c
index e3b4f64..ae78926 100644
--- a/drivers/spi/spi_fsl_espi.c
+++ b/drivers/spi/spi_fsl_espi.c
@@ -507,16 +507,29 @@ void fsl_espi_cpu_irq(struct mpc8xxx_spi *mspi, u32 events)
 
/* We need handle RX first */
if (events & SPIE_NE) {
-   u32 rx_data;
+   u32 rx_data, tmp;
+   u8 rx_data_8;
 
/* Spin until RX is done */
while (SPIE_RXCNT(events) < min(4, mspi->len)) {
cpu_relax();
events = mpc8xxx_spi_read_reg(&reg_base->event);
}
-   mspi->len -= 4;
 
-   rx_data = mpc8xxx_spi_read_reg(&reg_base->receive);
+   if (mspi->len >= 4) {
+   rx_data = mpc8xxx_spi_read_reg(&reg_base->receive);
+   } else {
+   tmp = mspi->len;
+   rx_data = 0;
+   while (tmp--) {
+   rx_data_8 = in_8((u8 *)&reg_base->receive);
+   rx_data |= (rx_data_8 << (tmp * 8));
+   }
+
+   rx_data <<= (4 - mspi->len) * 8;
+   }
+
+   mspi->len -= 4;
 
if (mspi->rx)
mspi->get_rx(rx_data, mspi);
-- 
1.7.0.4




Basic driver devel questions ?

2010-12-01 Thread Guillaume Dargaud
Hello all,
is it OK if I ask basic driver development questions here ?
Could you recommend a better forum for that maybe ?
Thanks
-- 
Guillaume Dargaud
http://www.gdargaud.net/


Re: Basic driver devel questions ?

2010-12-01 Thread Michael Ellerman
On Wed, 2010-12-01 at 11:15 +0100, Guillaume Dargaud wrote:
> Hello all,
> is it OK if I ask basic driver development questions here ?
> Could you recommend a better forum for that maybe ?

Hi Guillaume,

I guess it depends how basic they are :)

If they're basic _powerpc_ driver questions then this is probably the
right place.

But I'd say just ask and maybe someone will be able to help, or maybe
they'll point you somewhere else.

cheers



_extending_ platform support options?

2010-12-01 Thread Joachim Foerster

Hi all,

currently I'm wondering what the preferred/recommended way of _extending_ an existing 
"Platform support" option is?


We are working with custom design/boards based on Virtex4/5. So we are primarily using the 
 CONFIG_XILINX_VIRTEX*_GENERIC_BOARD options. In our case we have some special needs, 
like custom ppc_md.restart(), ppc_md.power_off() or ppc_md.show_cpuinfo().


Till now, we just duplicated arch/powerpc/platforms/4?x/virtex.c and added our special 
stuff. Properly renaming everything, etc ...


An alternative could be to add a virtex_my.c which extends virtex.c, like this
(also like virtex_ml510.c extends virtex.c):

static void virtex_my_show_cpuinfo(struct seq_file *m)
{
seq_printf(m, something);
}

static int __init virtex_my_init(void)
{
ppc_md.show_cpuinfo = virtex_my_show_cpuinfo;
return 0;
}
machine_core_initcall(virtex, virtex_my_init);

Though, to me, it does not seem really OK to assign ppc_md members that way. The original 
struct machdep for "virtex" (which is defined in virtex.c with define_machine()) is not 
adjusted either. Ok, we could modify that one, too.

Especially I'm not sure if it is OK to use machine_core_initcall() for such 
modifications.

So my question is: Is there any recommended way of doing such "extensions"? Or is it OK 
to just duplicate virtex.c (which does not seem really OK either)?


Thanks,
 Joachim


Re: _extending_ platform support options?

2010-12-01 Thread Josh Boyer
On Wed, Dec 01, 2010 at 02:25:43PM +0100, Joachim Foerster wrote:
>Hi all,
>
>currently I'm wondering what the preferred/recommended way of
>_extending_ an existing "Platform support" option is?
>
>We are working with custom design/boards based on Virtex4/5. So we
>are primarily using the  CONFIG_XILINX_VIRTEX*_GENERIC_BOARD options.
>In our case we have some special needs, like custom ppc_md.restart(),
>ppc_md.power_off() or ppc_md.show_cpuinfo().
>
>Till now, we just duplicated arch/powerpc/platforms/4?x/virtex.c and
>added our special stuff. Properly renaming everything, etc ...
>
>An alternative could be to add a virtex_my.c which extends virtex.c, like this
>(also like virtex_ml510.c extends virtex.c):
>
>static void virtex_my_show_cpuinfo(struct seq_file *m)
>{
>   seq_printf(m, something);
>}
>
>static int __init virtex_my_init(void)
>{
>   ppc_md.show_cpuinfo = virtex_my_show_cpuinfo;
>   return 0;
>}
>machine_core_initcall(virtex, virtex_my_init);
>
>Though, to me, it does not seem really OK to assign ppc_md members
>that way. The original struct machdep for "virtex" (which is defined
>in virtex.c with define_machine()) is not adjusted either. Ok, we
>could modify that one, too.
>Especially I'm not sure if it is OK to use machine_core_initcall() for such 
>modifications.
>
>So my question is: Is there any recommended way for doing such
>"extensions"? Or is it OK to just duplicate virtex.c (which does not
>seem really OK, too)?

Duplicate it as you have done, naming the file something unique.  We try
to prevent unnecessary duplication of code, but sometimes it's cleaner
to just have a separate board file instead.

josh


Re: [MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread Steven Rostedt
On Wed, 2010-12-01 at 09:16 +0100, Jean-Michel Hautbois wrote:
> Hi lists !
> 
> I measured the latency and the jitter of the RX and TX ethernet paths
> on my MPC5200 board.
> The RX path is quite good, but the TX path can be slow.
> 
> [ 1218.976762] [mpc52xx_fec_start_xmit]Delay >30us for dma_map_single
> => 76364 ns
> [ 1219.188405] [mpc52xx_fec_tx_interrupt]Delay >30us for
> dma_unmap_single => 34515 ns
> [ 1220.628785] [mpc52xx_fec_start_xmit]Delay >30us for
> bcom_submit_next_buffer => 97273 ns
> [ 1225.776784] [mpc52xx_fec_tx_interrupt]Delay >30us for
> dma_unmap_single => 95273 ns
> 
> As one can see, this is obviously problematic.
> The first function I analyzed is bcom_submit_next_buffer() => This
> function doesn't do lots of things, except a call to mb().
> 
> I have been looking to the "MPC603e RISC Microprocessor User's Manual"
> and especially the chapter named "2.3.4.7 Memory Synchronization
> Instructions—UISA".
> 
> Here is a paragraph which explains a lot :
> 
> "The functions performed by the sync instruction normally take a
> significant amount of time
> to complete; as a result, frequent use of this instruction may
> adversely affect performance.
> In addition, the number of cycles required to complete a sync
> instruction depends on system
> parameters and on the processor's state when the instruction is issued."
> 
> I am using a real time kernel, and this is a problem, as it is not
> deterministic to use this instruction.
> Is there a way to avoid this ?

Don't use that hardware.

When working with drivers there are times you must sync with the device.
And if the device is nondeterministic, then find another set of hardware
to use. Unfortunately, I think you may not find any.

A mb() is usually used if you write to a device and then read from it.
Without it, the CPU could perform the read before the write, which
would give you an incorrect result. There's no other way around that.
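
As a rough userspace analogy only (this is not how the kernel's mb() is implemented, and real MMIO needs the kernel accessors), the write-then-read ordering can be sketched with C11 atomics. struct fake_dev and issue_and_poll() are hypothetical names; the "device" is plain memory so that the sketch runs anywhere.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device with a command and a status register, modelled
 * as atomics in ordinary memory so the sketch runs in userspace. */
struct fake_dev {
	_Atomic uint32_t command;
	_Atomic uint32_t status;
};

/* Write a command, then read the status. The full fence between the
 * two accesses forbids reordering the load ahead of the store; in a
 * driver, mb() plays the corresponding role. */
static uint32_t issue_and_poll(struct fake_dev *dev, uint32_t cmd)
{
	atomic_store_explicit(&dev->command, cmd, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&dev->status, memory_order_relaxed);
}
```

Without the fence, the relaxed store and load may be reordered by the compiler or the CPU, which is the failure described above. On PowerPC a full fence of this kind is typically a sync instruction, which is where the cost under discussion comes from.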

-- Steve

> 
> I will now focus on the dma_map_single() and dma_unmap_single functions...
> 
> Thanks in advance for your help,
> Best Regards,
> 
> JM



RE: [MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread David Laight
 
> A mb() is usually used if you do a write to device and read from it.
> Without it, the CPU could perform the read before the write, which
> would give you an incorrect result. There's no other way around that.

Possibly the synchronisation functions are doing significantly
more work than is required.

I was looking at the in_le32() and out_le32() functions for the
ppc e300 (and maybe others).

The out_le32() contains a 'sync' instruction - this may only
be needed after a series of writes (eg just before a command).

The iosync() function just adds a 'sync' and can be used as needed.

The in_le32() not only contains the unwanted 'sync', but also
a 'twi' (trap immediate - NFI exactly what this does) and 'isync'.
The 'isync' is particularly horrid and unnecessary (aborts
the instruction queue and refetches the opcode bytes).

The very slow in_le32() might be there to give semi-synchronous
traps on address fault - but unless the hardware is being probed
that really isn't necessary.

I did find st_le32() and ld_le32() in arch/powerpc/include/asm/swab.h
but had difficulty #including that version of swab.h!
#include <../arch/powerpc/include/asm/swab.h>
worked - but isn't that nice.

David




Re: [MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread Jean-Michel Hautbois
2010/12/1 David Laight :
>
>> A mb() is usually used if you do a write to device and read from it.
>> Without it, the CPU could perform the read before the write, which
>> would give you an incorrect result. There's no other way around that.
>
> Possibly the synchronisation functions are doing significantly
> more work than is required.
>
> I was looking at the in_le32() and out_le32() functions for the
> ppc e300 (and maybe others).
>
> The out_le32() contains a 'sync' instruction - this may only
> be needed after a series of writes (eg just before a command).
>
> The iosync() function just adds a 'sync' and can be used as needed.
>
> The in_le32() not only contains the unwanted 'sync', but also
> a 'twi' (trap immediate - NFI exactly what this does) and 'isync'.
> The 'isync' is particularly horrid and unnecessary (aborts
> the instruction queue and refetches the opcode bytes).
>
> The very slow in_le32() might be there to give semi-synchronous
> traps on address fault - but unless the hardware is being probed
> that really isn't necessary.
>
> I did find st_le32() and ld_le32() in arch/powerpc/include/asm/swab.h
> but had difficulty #including that version of swab.h!
>    #include <../arch/powerpc/include/asm/swab.h>
> worked - but isn't that nice.
>
>        David

Yes, I was also looking at in_be16 and out_be16, and my thoughts were
exactly the same.
I think the HW I am using is not a good one, but this is not
sufficient to explain the behaviour.
These instructions are called, for instance, from bcom_enable_task().

Getting the IRQ number (Was: Basic driver devel questions ?)

2010-12-01 Thread Guillaume Dargaud
> I guess it depends how basic they are :)
> 
> If they're basic _powerpc_ driver questions then this is probably the
> right place.
> 
> But I'd say just ask and maybe someone will be able to help, or maybe
> they'll point you somewhere else.

OK, here goes then: how do I get the IRQ number so that I can install an 
interrupt handler on it ?

In my dts file I have:
xps_acqui_data_0: xps-acqui-d...@c980 {
compatible = "xlnx,xps-acqui-data-3.00.a";
interrupt-parent = <&xps_intc_0>;
interrupts = < 0 2 >;
reg = < 0xc980 0x1 >;
xlnx,family = "virtex4";
xlnx,include-dphase-timer = <0x1>;
xlnx,mplb-awidth = <0x20>;
xlnx,mplb-clk-period-ps = <0x2710>;
xlnx,mplb-dwidth = <0x40>;
xlnx,mplb-native-dwidth = <0x40>;
xlnx,mplb-p2p = <0x0>;
xlnx,mplb-smallest-slave = <0x20>;
} ;

In my minimal driver init, I have:
  first = MKDEV (my_major, my_minor);
  register_chrdev_region(first, count, NAME);
  cdev_init(my_cdev, &fops);
  cdev_add (my_cdev, first, count);
So far so good.

Now how do I connect the dots between the hardware definitions from the dts and 
my driver ?
I have:
# ll /proc/device-tree/p...@0/xps-acqui-d...@c980/
-r--r--r--1 root root  27 Dec  1 16:26 compatible
-r--r--r--1 root root   4 Dec  1 16:26 interrupt-parent
-r--r--r--1 root root   8 Dec  1 16:26 interrupts
-r--r--r--1 root root  15 Dec  1 16:26 name
-r--r--r--1 root root   8 Dec  1 16:26 reg
-r--r--r--1 root root   8 Dec  1 16:26 xlnx,family
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,include-dphase-timer
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,mplb-awidth
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,mplb-clk-period-ps
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,mplb-dwidth
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,mplb-native-dwidth
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,mplb-p2p
-r--r--r--1 root root   4 Dec  1 16:26 xlnx,mplb-smallest-slave

But first I'm not sure where to find the IRQ in there, and also I'm not sure if 
reading the filesystem from a module is allowed.

How do I know if this interrupt is shared or not (is it important ?)

Thanks
-- 
Guillaume Dargaud
http://www.gdargaud.net/


Re: Getting the IRQ number (Was: Basic driver devel questions ?)

2010-12-01 Thread Philipp Ittershagen
On 12/01/2010 05:35 PM, Guillaume Dargaud wrote:
> Now how do I connect the dots between the hardware definitions from the dts
> and my driver ?

You can get the interrupt number from the dt by calling
irq_of_parse_and_map(). Be sure to pass the node of your device to this
function.

Then you have to request the interrupt by calling request_irq. This is
where you specify the interrupt handler.

> But first I'm not sure where to find the IRQ in there, and also I'm not sure
> if reading the filesystem from a module is allowed.

Why do you want to read the file system?


Re: Getting the IRQ number (Was: Basic driver devel questions ?)

2010-12-01 Thread Scott Wood
On Wed, 1 Dec 2010 17:35:58 +0100
Guillaume Dargaud  wrote:

> OK, here goes then: how do I get the IRQ number so that I can install an 
> interrupt handler on it ?
> 
> In my dts file I have:
>   xps_acqui_data_0: xps-acqui-d...@c980 {
>   compatible = "xlnx,xps-acqui-data-3.00.a";
>   interrupt-parent = <&xps_intc_0>;
>   interrupts = < 0 2 >;
>   reg = < 0xc980 0x1 >;
>   xlnx,family = "virtex4";
>   xlnx,include-dphase-timer = <0x1>;
>   xlnx,mplb-awidth = <0x20>;
>   xlnx,mplb-clk-period-ps = <0x2710>;
>   xlnx,mplb-dwidth = <0x40>;
>   xlnx,mplb-native-dwidth = <0x40>;
>   xlnx,mplb-p2p = <0x0>;
>   xlnx,mplb-smallest-slave = <0x20>;
>   } ;
> 
> In my minimal driver init, I have:
>   first = MKDEV (my_major, my_minor);
>   register_chrdev_region(first, count, NAME);
>   cdev_init(my_cdev, &fops);
>   cdev_add (my_cdev, first, count);
> So far so good.
> 
> Now how do I connect the dots between the hardware definitions from the dts
> and my driver ?

How was your driver probed?  If you can get a pointer to the device
node, use irq_of_parse_and_map() to get a virtual irq that you can pass
to request_irq().

> But first I'm not sure where to find the IRQ in there, and also I'm not sure
> if reading the filesystem from a module is allowed.

There's no need; there are much easier ways to access the device tree
from within the kernel.

> How do I know if this interrupt is shared or not (is it important ?)

Can your driver tolerate it being shared?  If so, request it as shared.
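
Put together, the steps could look roughly like the sketch below. It is kernel code and only builds inside a kernel tree. It assumes the node is looked up by its compatible string from the module init; a driver probed through the OF platform bus would instead receive the node from its probe callback. The names acqui_isr, acqui_init, acqui_exit, acqui_cookie and the "xps-acqui" label are made up for the example.

```c
#include <linux/module.h>
#include <linux/interrupt.h>
#include <linux/of.h>
#include <linux/of_irq.h>

static unsigned int acqui_irq;	/* virtual IRQ; 0 means "not mapped" */
static int acqui_cookie;	/* any unique non-NULL dev_id works for a shared IRQ */

static irqreturn_t acqui_isr(int irq, void *dev_id)
{
	/* A shared handler should check its own device and return IRQ_NONE
	 * when the interrupt was raised by someone else. That check is
	 * device specific and omitted here. */
	return IRQ_HANDLED;
}

static int __init acqui_init(void)
{
	struct device_node *np;

	/* Find the node by the compatible string from the dts. */
	np = of_find_compatible_node(NULL, NULL, "xlnx,xps-acqui-data-3.00.a");
	if (!np)
		return -ENODEV;

	/* Index 0 selects the first entry of the "interrupts" property. */
	acqui_irq = irq_of_parse_and_map(np, 0);
	of_node_put(np);
	if (!acqui_irq)
		return -ENODEV;

	/* Request it as shared, as suggested above. */
	return request_irq(acqui_irq, acqui_isr, IRQF_SHARED,
			   "xps-acqui", &acqui_cookie);
}

static void __exit acqui_exit(void)
{
	free_irq(acqui_irq, &acqui_cookie);
}

module_init(acqui_init);
module_exit(acqui_exit);
MODULE_LICENSE("GPL");
```

IRQF_SHARED requires a non-NULL dev_id, which is why the sketch passes the address of a private variable; a real driver would normally pass its device structure.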

-Scott



Re: [PATCH] drivers: char: hvc: add arm JTAG DCC console support

2010-12-01 Thread Daniel Walker
On Tue, 2010-11-30 at 21:30 -0800, Stephen Boyd wrote:
> On 11/30/2010 11:25 AM, Daniel Walker wrote:
> > @@ -682,6 +682,15 @@ config HVC_UDBG
> > select HVC_DRIVER
> > default n
> >  
> > +config HVC_DCC
> > +   bool "ARM JTAG DCC console"
> > +   depends on ARM
> > +   select HVC_DRIVER
> > +   help
> > + This console uses the JTAG DCC on ARM to create a console under the HVC
> 
> Looks like you added one too many spaces for indent here.

The first line is fine, but the other two might need an extra space.

> > diff --git a/drivers/char/hvc_dcc.c b/drivers/char/hvc_dcc.c
> > new file mode 100644
> > index 0000000..6470f63
> > --- /dev/null
> > +++ b/drivers/char/hvc_dcc.c
> > +static inline u32 __dcc_getstatus(void)
> > +{
> > +   u32 __ret;
> > +
> > +   asm("mrc p14, 0, %0, c0, c1, 0  @ read comms ctrl reg"
> > +   : "=r" (__ret) : : "cc");
> > +
> > +   return __ret;
> > +}
> 
> Without marking this asm volatile my compiler decides it can cache the
> value of __ret in a register and then check the value of it continually
> in hvc_dcc_put_chars() (I had to replace get_wait/put_wait with 1 and
> fixup the branch otherwise my disassembler barfed on __dcc_(get|put)char).
> 
> 
>  :
>    0:   ee103e11    mrc 14, 0, r3, cr0, cr1, {0}
>    4:   e3a0c000    mov ip, #0  ; 0x0
>    8:   e2033202    and r3, r3, #536870912  ; 0x20000000
>    c:   ea000006    b   2c
>   10:   e3530000    cmp r3, #0  ; 0x0
>   14:   1afffffd    bne 10
>   18:   e7d1000c    ldrb    r0, [r1, ip]
>   1c:   ee10fe11    mrc 14, 0, pc, cr0, cr1, {0}
>   20:   2afffffd    bcs 1c
>   24:   ee000e15    mcr 14, 0, r0, cr0, cr5, {0}
>   28:   e28cc001    add ip, ip, #1  ; 0x1
>   2c:   e15c0002    cmp ip, r2
>   30:   bafffff6    blt 10
>   34:   e1a00002    mov r0, r2
>   38:   e12fff1e    bx  lr
> 
> As you can see, the value of the mrc is checked against DCC_STATUS_TX
> (bit 29) and then stored in r3 for later use. Marking this volatile
> produces the following:
> 
>  :
>    0:   e3a03000    mov r3, #0  ; 0x0
>    4:   ea000007    b   28
>    8:   ee100e11    mrc 14, 0, r0, cr0, cr1, {0}
>    c:   e3100202    tst r0, #536870912  ; 0x20000000
>   10:   1afffffc    bne 8
>   14:   e7d10003    ldrb    r0, [r1, r3]
>   18:   ee10fe11    mrc 14, 0, pc, cr0, cr1, {0}
>   1c:   2afffffd    bcs 18
>   20:   ee000e15    mcr 14, 0, r0, cr0, cr5, {0}
>   24:   e2833001    add r3, r3, #1  ; 0x1
>   28:   e1530002    cmp r3, r2
>   2c:   bafffff5    blt 8
>   30:   e1a00002    mov r0, r2
>   34:   e12fff1e    bx  lr
> 
> which looks better.
> 
> I marked all the asm in this driver as volatile. Is that correct?

Are you talking about __dcc_getstatus only? I don't think adding
volatile is going to hurt anything, if not having it causes problems.
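
For reference, here is a minimal sketch of what the volatile marking buys. It assumes GCC-style inline asm. Without volatile, an asm with only outputs may be treated as a pure computation, so the compiler is free to issue it once and reuse the result. The names dcc_getstatus(), dcc_wait_tx_ready() and the fake_dcc_status fallback are made up for this illustration and are not taken from the patch; the fallback only exists so the sketch builds on a non-ARM host.

```c
#include <stdint.h>

#define DCC_STATUS_TX (1u << 29)	/* bit 29, as discussed in the thread */

#if defined(__arm__)
/*
 * volatile tells the compiler the read has side effects, so it must be
 * re-issued on every call instead of being cached in a register.
 */
static inline uint32_t dcc_getstatus(void)
{
	uint32_t ret;

	__asm__ __volatile__("mrc p14, 0, %0, c0, c1, 0	@ read comms ctrl reg"
			     : "=r" (ret) : : "cc");
	return ret;
}
#else
/* Hypothetical stand-in register so the sketch builds on a non-ARM host. */
static volatile uint32_t fake_dcc_status;

static inline uint32_t dcc_getstatus(void)
{
	return fake_dcc_status;
}
#endif

/* Poll until the TX-full bit clears; give up after max_polls reads. */
static int dcc_wait_tx_ready(unsigned int max_polls)
{
	while (max_polls--) {
		if (!(dcc_getstatus() & DCC_STATUS_TX))
			return 0;
	}
	return -1;
}
```

The bounded poll is a choice made for the sketch only, so that it cannot hang when run on a host; the driver under review loops without a limit.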

> > +#if defined(CONFIG_CPU_V7)
> > +static inline char __dcc_getchar(void)
> > +{
> > +   char __c;
> > +
> > +   asm("get_wait:  mrc p14, 0, pc, c0, c1, 0  \n\
> > +   bne get_wait   \n\
> > +   mrc p14, 0, %0, c0, c5, 0   @ read comms data reg"
> > +   : "=r" (__c) : : "cc");
> > +
> > +   return __c;
> > +}
> > +#else
> > +static inline char __dcc_getchar(void)
> > +{
> > +   char __c;
> > +
> > +   asm("mrc p14, 0, %0, c0, c5, 0  @ read comms data reg"
> > +   : "=r" (__c));
> > +
> > +   return __c;
> > +}
> > +#endif
> > +
> > +#if defined(CONFIG_CPU_V7)
> > +static inline void __dcc_putchar(char c)
> > +{
> > +   asm("put_wait:  mrc p14, 0, pc, c0, c1, 0 \n\
> > +   bcs put_wait  \n\
> > +   mcr p14, 0, %0, c0, c5, 0   "
> > +   : : "r" (c) : "cc");
> > +}
> > +#else
> > +static inline void __dcc_putchar(char c)
> > +{
> > +   asm("mcr p14, 0, %0, c0, c5, 0  @ write a char"
> > +   : /* no output register */
> > +   : "r" (c));
> > +}
> > +#endif
> > +
> 
> I don't think both the v7 and v6 functions are necessary. It seems I can
> get away with just the second version of __dcc_(get|put)char() on a v7.
> The mrc p14, 0, pc, c0, c1, 0 will assign the top 4 bits (31-28) to the
> condition codes NZCV on v7. It also looks like an ARM11 (a v6) does
> the same thing, if I read the manuals right. The test in the inline
> assembly says: wait for a character to be ready (or for the previous
> one to be consumed), then actually read (or write) a character. The
> code in hvc_dcc_put_chars() is already doing the same thing, albeit in a
> slightly different form. Instead of getting the status bits put into the
> condition codes and looping with bne or bcs it will read the register,
> a

Re: [PATCH] drivers: char: hvc: add arm JTAG DCC console support

2010-12-01 Thread Greg KH
On Wed, Dec 01, 2010 at 10:54:56AM -0800, Daniel Walker wrote:
> On Tue, 2010-11-30 at 21:30 -0800, Stephen Boyd wrote:
> > On 11/30/2010 11:25 AM, Daniel Walker wrote:
> > > @@ -682,6 +682,15 @@ config HVC_UDBG
> > > select HVC_DRIVER
> > > default n
> > >  
> > > +config HVC_DCC
> > > +   bool "ARM JTAG DCC console"
> > > +   depends on ARM
> > > +   select HVC_DRIVER
> > > +   help
> > > + This console uses the JTAG DCC on ARM to create a console under 
> > > the HVC
> > 
> > Looks like you added one too many spaces for indent here.
> 
> The first line is fine, but the other two might need an extra space.

For this, or any other changes you want, I'll gladly take a follow-on
patch as this one is already in my tty-next tree.

thanks,

greg k-h
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


[PATCH 0/4] V2 Add ability to link device blob(s) into vmlinux

2010-12-01 Thread dirk . brandewie
From: Dirk Brandewie 

This patch set adds the ability to link device tree blobs into
vmlinux. 

Patch 1 implements the changes to include/asm-generic/vmlinux.lds.h and
adds a generic rule for generating DTB objects to be linked into vmlinux.

Patch 2 implements linking a DTB into an x86 image.

Patch 3-4 move {powerpc,microblaze}/boot/Makefile to use the dtc rule
in patch 1.

This patch set has been tested on x86.

Powerpc and Microblaze have been compile tested with and without patch
3 and 4 applied.

Changes from V1:

Documentation added for dtc command in Makefile.lib to
Documentation/kbuild/makefiles.txt
Separate DTB_ALIGNMENT define removed.
FORCE removed from dtc rule.
Removed hardcoded path to dts files from dtc command.  
Moved %.dtb: %.dts rule to arch specific makefiles. 
 
The patch adding a kernel command line option to pass in the dtb_compat
string has been dropped from this set and will be submitted separately.

Dirk Brandewie (4):
  of: Add support for linking device tree blobs into vmlinux
  x86/of: Add building device tree blob(s) into image.
  of/powerpc: Use generic rule to build dtb's
  microblaze/of: Use generic rule to build dtb's

 Documentation/kbuild/makefiles.txt |   15 +++
 arch/microblaze/boot/Makefile  |   10 ++
 arch/powerpc/boot/Makefile |8 +++-
 arch/x86/platform/ce4100/Makefile  |   10 ++
 include/asm-generic/vmlinux.lds.h  |   15 ---
 scripts/Makefile.lib   |   21 -
 6 files changed, 62 insertions(+), 17 deletions(-)

-- 
1.7.2.3



[PATCH 1/4] of: Add support for linking device tree blobs into vmlinux

2010-12-01 Thread dirk . brandewie
From: Dirk Brandewie 

This patch adds support for linking device tree blob(s) into
vmlinux. It modifies asm-generic/vmlinux.lds.h so that .dtb sections
are linked into vmlinux. To maintain compatibility with the of/fdt
driver code, platforms MUST copy the blob to a non-init memory location
before the kernel frees the .init.* sections in the image.
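
A minimal user-space sketch of that copy step follows, under stated assumptions. It relies only on the flattened device tree header layout (big-endian magic 0xd00dfeed at offset 0, big-endian totalsize at offset 4). copy_dtb() and be32_read() are hypothetical helpers, and malloc() stands in for whatever early allocator a platform would really use.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FDT_MAGIC 0xd00dfeedu

/* Read a big-endian 32-bit value regardless of host endianness. */
static uint32_t be32_read(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Copy one blob out of the region [start, end) into newly allocated
 * memory. Returns NULL if the header is not a valid FDT header or the
 * advertised size does not fit in the region.
 */
static void *copy_dtb(const uint8_t *start, const uint8_t *end,
		      uint32_t *size_out)
{
	uint32_t size;
	void *copy;

	if (end < start || (size_t)(end - start) < 8)
		return NULL;
	if (be32_read(start) != FDT_MAGIC)
		return NULL;

	size = be32_read(start + 4);	/* totalsize field */
	if (size < 8 || size > (size_t)(end - start))
		return NULL;

	copy = malloc(size);
	if (copy) {
		memcpy(copy, start, size);
		if (size_out)
			*size_out = size;
	}
	return copy;
}
```

In platform code the region bounds would be the __dtb_start and __dtb_end symbols that this patch adds to the linker script.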

Modifies scripts/Makefile.lib to add a kbuild command to
compile DTS files to device tree blobs and a rule to create objects to
wrap the blobs for linking.

STRUCT_ALIGNMENT is defined in vmlinux.lds.h for use in the rule to
create wrapper objects for the dtb in Makefile.lib.  The
STRUCT_ALIGN() macro in vmlinux.lds.h is modified to use the
STRUCT_ALIGNMENT definition.

The DTBs are placed on 32-byte boundaries to allow parsing the blob
with drivers/of/fdt.c during early boot without having to copy the blob
to get the structure alignment GCC expects.
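
The alignment arithmetic can be sketched as below. align_up() and is_aligned() are illustrative helpers, not kernel API, and both assume the alignment is a power of two, which holds for 32.

```c
#include <stdint.h>

#define STRUCT_ALIGNMENT 32	/* same value the patch defines */

/* Round addr up to the next multiple of align (align is a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Non-zero if addr already sits on an align-byte boundary. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
	return (addr & (align - 1)) == 0;
}
```

For example, a blob that would otherwise start at 0x1001 is pushed up to 0x1020, while one already at 0x1020 stays where it is.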

A DTB is linked in by adding the DTB object to the list of objects to
be linked into vmlinux in the architecture-specific Makefile using
   obj-y += foo.dtb.o

Signed-off-by: Dirk Brandewie 
---
 Documentation/kbuild/makefiles.txt |   15 +++
 include/asm-generic/vmlinux.lds.h  |   15 ---
 scripts/Makefile.lib   |   21 -
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt
index 0ef00bd..fc18bb1 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -1136,6 +1136,21 @@ When kbuild executes, the following steps are followed (roughly):
  resulting in the target file being recompiled for no
  obvious reason.
 
+dtc
+   Create a flattened device tree blob object suitable for linking
+   into vmlinux. Device tree blobs linked into vmlinux are placed
+   in an init section in the image. Platform code *must* copy the
+   blob to non-init memory prior to calling unflatten_device_tree().
+
+   Example:
+   #arch/x86/platform/ce4100/Makefile
+   clean-files := *dtb.S
+
+   DTC_FLAGS := -p 1024
+   obj-y += foo.dtb.o
+
+   $(obj)/%.dtb: $(src)/%.dts
+   $(call if_changed,dtc)
 
 --- 6.7 Custom kbuild commands
 
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bd69d79..024d3b9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -23,7 +23,7 @@
  * _etext = .;
  *
  *  _sdata = .;
- * RO_DATA_SECTION(PAGE_SIZE)
+*  RO_DATA_SECTION(PAGE_SIZE)
  * RW_DATA_SECTION(...)
  * _edata = .;
  *
@@ -67,7 +67,8 @@
  * Align to a 32 byte boundary equal to the
  * alignment gcc 4.5 uses for a struct
  */
-#define STRUCT_ALIGN() . = ALIGN(32)
+#define STRUCT_ALIGNMENT 32
+#define STRUCT_ALIGN() . = ALIGN(STRUCT_ALIGNMENT)
 
 /* The actual configuration determine if the init/exit sections
  * are handled as text/data or they can be discarded (which
@@ -146,6 +147,13 @@
 #define TRACE_SYSCALLS()
 #endif
 
+
+#define KERNEL_DTB()   \
+   STRUCT_ALIGN(); \
+   VMLINUX_SYMBOL(__dtb_start) = .;\
+   *(.dtb.init.rodata) \
+   VMLINUX_SYMBOL(__dtb_end) = .;
+
 /* .data section */
 #define DATA_DATA  \
*(.data)\
@@ -468,7 +476,8 @@
MCOUNT_REC()\
DEV_DISCARD(init.rodata)\
CPU_DISCARD(init.rodata)\
-   MEM_DISCARD(init.rodata)
+   MEM_DISCARD(init.rodata)\
+   KERNEL_DTB()
 
 #define INIT_TEXT  \
*(.init.text)   \
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 4c72c11..937eabbb 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -200,7 +200,26 @@ quiet_cmd_gzip = GZIP$@
 cmd_gzip = (cat $(filter-out FORCE,$^) | gzip -f -9 > $@) || \
(rm -f $@ ; false)
 
-
+# DTC
+#  ---
+
+# Generate an assembly file to wrap the output of the device tree compiler
+$(obj)/%.dtb.S: $(obj)/%.dtb
+   @echo '#include ' > $@
+   @echo '.section .dtb.init.rodata,"a"' >> $@
+   @echo '.balign STRUCT_ALIGNMENT' >> $@
+   @echo '.global __dtb_$(*F)_begin' >> $@
+   @echo '__dtb_$(*F)_begin:' >> $@
+   @echo '.incbin "$<" ' >> $@
+   @echo '__dtb_$(*F)_end:' >> $@
+   @echo '.global __dtb_$(*F)_end' >> $@
+   @echo '.balign STRUCT_ALIGNMEN

[PATCH 2/4] x86/of: Add building device tree blob(s) into image.

2010-12-01 Thread dirk . brandewie
From: Dirk Brandewie 

This patch links a device tree blob into vmlinux. DTBs are added by
appending the blob object name to the list of objects to be linked
into the image.

Signed-off-by: Dirk Brandewie 
---
 arch/x86/platform/ce4100/Makefile |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/x86/platform/ce4100/Makefile b/arch/x86/platform/ce4100/Makefile
index 91fc929..3b49187 100644
--- a/arch/x86/platform/ce4100/Makefile
+++ b/arch/x86/platform/ce4100/Makefile
@@ -1 +1,11 @@
 obj-$(CONFIG_X86_INTEL_CE) += ce4100.o
+clean-files := *dtb.S
+
+ifdef CONFIG_X86_OF
+###
+# device tree blob
+obj-$(CONFIG_X86_INTEL_CE) += ce4100.dtb.o
+
+$(obj)/%.dtb: $(src)/%.dts
+   $(call if_changed,dtc)
+endif
-- 
1.7.2.3



[PATCH 3/4] of/powerpc: Use generic rule to build dtb's

2010-12-01 Thread dirk . brandewie
From: Dirk Brandewie 

Modify arch/powerpc/boot/Makefile to use the dtc command in
scripts/Makefile.lib.

Signed-off-by: Dirk Brandewie 
---
 arch/powerpc/boot/Makefile |8 +++-
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index fae8192..3afb33a 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -35,7 +35,7 @@ endif
 
 BOOTCFLAGS += -I$(obj) -I$(srctree)/$(obj)
 
-DTS_FLAGS  ?= -p 1024
+DTC_FLAGS  ?= -p 1024
 
 $(obj)/4xx.o: BOOTCFLAGS += -mcpu=405
 $(obj)/ebony.o: BOOTCFLAGS += -mcpu=405
@@ -332,10 +332,8 @@ $(obj)/treeImage.%: vmlinux $(obj)/%.dtb $(wrapperbits)
$(call if_changed,wrap,treeboot-$*,,$(obj)/$*.dtb)
 
 # Rule to build device tree blobs
-DTC = $(objtree)/scripts/dtc/dtc
-
-$(obj)/%.dtb: $(dtstree)/%.dts
-   $(DTC) -O dtb -o $(obj)/$*.dtb -b 0 $(DTS_FLAGS) $(dtstree)/$*.dts
+$(obj)/%.dtb: $(src)/dts/%.dts
+   $(call if_changed,dtc)
 
 # If there isn't a platform selected then just strip the vmlinux.
 ifeq (,$(image-y))
-- 
1.7.2.3



[PATCH 4/4] microblaze/of: Use generic rule to build dtb's

2010-12-01 Thread dirk . brandewie
From: Dirk Brandewie 

Modify arch/microblaze/boot/Makefile to use the dtc command in
scripts/Makefile.lib.

Signed-off-by: Dirk Brandewie 
---
 arch/microblaze/boot/Makefile |   10 ++
 1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/microblaze/boot/Makefile b/arch/microblaze/boot/Makefile
index be01d78..52430e5 100644
--- a/arch/microblaze/boot/Makefile
+++ b/arch/microblaze/boot/Makefile
@@ -10,9 +10,6 @@ targets := linux.bin linux.bin.gz simpleImage.%
 
 OBJCOPYFLAGS := -O binary
 
-# Where the DTS files live
-dtstree := $(srctree)/$(src)/dts
-
 # Ensure system.dtb exists
 $(obj)/linked_dtb.o: $(obj)/system.dtb
 
@@ -51,14 +48,11 @@ $(obj)/simpleImage.%: vmlinux FORCE
$(call if_changed,strip)
@echo 'Kernel: $@ is ready' ' (#'`cat .version`')'
 
-# Rule to build device tree blobs
-DTC = $(objtree)/scripts/dtc/dtc
 
 # Rule to build device tree blobs
-quiet_cmd_dtc = DTC $@
-   cmd_dtc = $(DTC) -O dtb -o $(obj)/$*.dtb -b 0 -p 1024 $(dtstree)/$*.dts
+DTC_FLAGS := -p 1024
 
-$(obj)/%.dtb: $(dtstree)/%.dts FORCE
+$(obj)/%.dtb: $(src)/dts/%.dts FORCE
$(call if_changed,dtc)
 
 clean-files += *.dtb simpleImage.*.unstrip linux.bin.ub
-- 
1.7.2.3



Re: [MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread Micha Nelissen

David Laight wrote:

The in_le32() not only contains the unwanted 'sync', but also
a 'twi' (trap immediate - NFI exactly what this does) and 'isync'.
The 'isync' is particularly horrid and unnecessary (aborts
the instruction queue and refetches the opcode bytes).


I wondered about that too some time ago, and this is what I could find:
it's a special sequence that is detected by the bus error handler
(a machine check exception happens on an I/O error, i.e. an aborted PCI
transaction or some such), so that it can 'recover' by continuing at the
next instruction (and setting an error variable).


Perhaps there is no other way to recover reliably from bus errors?

Micha


Re: _extending_ platform support options?

2010-12-01 Thread Benjamin Herrenschmidt
On Wed, 2010-12-01 at 08:47 -0500, Josh Boyer wrote:
> >Though, to me, it does not seem really OK to assign ppc_md members
> >that way. The original struct machdep for "virtex" (which is defined
> >in virtex.c with define_machine()) is not adjusted either. Ok, we
> >could modify that one, too.
> >Especially I'm not sure if it is OK to use machine_core_initcall()
> for such modifications.
> >
> >So my question is: Is there any recommended way for doing such
> >"extensions"? Or is it OK to just duplicate virtex.c (which does not
> >seem really OK, too)?
> 
> Duplicate it as you have done, naming the file something unique.  We
> try
> to prevent unnecessary duplication of code, but sometimes it's cleaner
> to just have a separate board file instead. 

Right. The best way is to turn the common code in virtex.c into "library"
code that you can hook up from your platform's ppc_md, so you avoid
duplication that way for most things.

You can do that by just linking in virtex.c and changing the stuff you
want to be non-static, or better if that becomes a habit, separate
virtex-lib.c (for example) from virtex-simple.c (generic platform for
example). You don't have to follow my proposed names :-)
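
A rough sketch of that split, as plain C that builds outside the kernel. Everything here is a stand-in: struct machdep_sketch is a cut-down imitation of the real ppc_md structure, and the virtex_lib_* and myboard_* names are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's machdep hooks. */
struct machdep_sketch {
	const char *name;
	int (*probe)(void);
	void (*setup_arch)(void);
	void (*init_IRQ)(void);
};

static int setup_calls, irq_calls;

/* "Library" half: common code, non-static so other boards can hook it. */
void virtex_lib_setup_arch(void);
void virtex_lib_init_irq(void);

void virtex_lib_setup_arch(void)
{
	setup_calls++;		/* common setup would go here */
}

void virtex_lib_init_irq(void)
{
	irq_calls++;		/* common interrupt setup would go here */
}

/* Board half: override only what differs, reuse the rest. */
static int myboard_probe(void)
{
	return 1;
}

static void myboard_setup_arch(void)
{
	virtex_lib_setup_arch();	/* reuse the common part */
	/* board-specific additions would follow */
}

static const struct machdep_sketch myboard_md = {
	.name		= "myboard",
	.probe		= myboard_probe,
	.setup_arch	= myboard_setup_arch,
	.init_IRQ	= virtex_lib_init_irq,
};
```

The board file then carries only its own probe and the hooks it really changes, and nothing in the library half has to know about the board.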

Cheers,
Ben.




Re: [PATCH] drivers: char: hvc: add arm JTAG DCC console support

2010-12-01 Thread Stephen Boyd
On 12/01/2010 10:54 AM, Daniel Walker wrote:
> Are you talking about __dcc_getstatus only? I don't think adding
> volatile is going to hurt anything, if not having it causes problems.
>

Just marking __dcc_getstatus volatile gives me

0038 :
  38:   ee10fe11    mrc   14, 0, pc, cr0, cr1, {0}
  3c:   1afffffd    bne   38
  40:   ee103e15    mrc   14, 0, r3, cr0, cr5, {0}
  44:   e3a00000    mov   r0, #0  ; 0x0
  48:   e6ef3073    uxtb  r3, r3
  4c:   ea000004    b     64
  50:   ee10ce11    mrc   14, 0, ip, cr0, cr1, {0}
  54:   e31c0101    tst   ip, #1073741824 ; 0x40000000
  58:   012fff1e    bxeq  lr
  5c:   e7c13000    strb  r3, [r1, r0]
  60:   e2800001    add   r0, r0, #1  ; 0x1
  64:   e1500002    cmp   r0, r2
  68:   bafffff8    blt   50
  6c:   e12fff1e    bx    lr

Seems the compiler keeps the value of __dcc_getchar() in r3 for the
duration of the loop. So we need to mark that one volatile too. I don't
think __dcc_putchar() needs to be marked volatile but it probably
doesn't hurt.

>
> We could maybe drop the looping for TX, but RX has no C based looping
> even tho for v7 it's recommended that we loop (presumably for v6 it's
> not recommended).
>

Definitely for TX since it seems like a redundant loop, but I agree RX
code has changed. Instead of

If RX buffer full
Poll for RX buffer full
Read character from RX buffer

we would have

If RX buffer full
Read character from RX buffer

which doesn't seem all that different assuming the RX buffer doesn't go
from full to empty between the If and Poll steps. Hopefully Tony knows more.

> Like this?
>
>   for (i = 0; i < count; ++i) {
>
>   if (__dcc_getstatus() & DCC_STATUS_RX)
>   buf[i] = __dcc_getchar();
>   else
>   break;
>   }
>
> It's a micro clean up I guess ..

Yes, it's much clearer that way.
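
To make the behaviour of that loop concrete, here is a host-side sketch with the DCC replaced by a simulated receive queue. sim_rx, sim_dcc_getstatus() and sim_dcc_getchar() are stand-ins invented for the example; only the loop body follows the snippet quoted above.

```c
#include <stdint.h>

#define DCC_STATUS_RX (1u << 30)	/* bit 30, per the tst in the listing */

/* Hypothetical stand-in for the hardware: characters waiting to be read. */
static const char *sim_rx = "";

static uint32_t sim_dcc_getstatus(void)
{
	return *sim_rx ? DCC_STATUS_RX : 0;
}

static char sim_dcc_getchar(void)
{
	return *sim_rx++;
}

/* Read up to count characters; stop as soon as the RX buffer is empty. */
static int dcc_get_chars(char *buf, int count)
{
	int i;

	for (i = 0; i < count; ++i) {
		if (sim_dcc_getstatus() & DCC_STATUS_RX)
			buf[i] = sim_dcc_getchar();
		else
			break;
	}
	return i;
}
```

With "ok" queued, a call with count 8 returns 2 and a second call returns 0, which is the "read what is there, never block" behaviour the cleaned-up loop is after.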

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.



Re: [MPC52xx]Latency issue with DMA on FEC

2010-12-01 Thread Scott Wood
On Wed, 1 Dec 2010 15:09:54 +
David Laight  wrote:

> The in_le32() not only contains the unwanted 'sync', but also
> a 'twi' (trap immediate - NFI exactly what this does) and 'isync'.

It turns a data dependency into a flow dependency.  It's basically equivalent 
to:

	lwz	rX, ...
	cmpw	rX, rX
	bne	1f
1:	isync

> The 'isync' is particularly horrid and unnecessary (aborts
> the instruction queue and refetches the opcode bytes)

The isync makes sure that the twi has completed before proceeding.

Note that the guarded, cache-inhibited load itself can be pretty
painful -- the core can't restart it, so it must complete before you
can take an interrupt.

> The very slow in_le32() might be there to give semi-synchronous
> traps on address fault - but unless the hardware is being probed
> that really isn't necessary.

There are times when you really want to be sure that the I/O is
finished before proceeding with something that isn't a load/store and
thus can't be serialized with normal barriers.

E.g. you're about to execute instructions in a physical address window
that you just set up (or even just create a non-guarded mapping to it
-- could get speculative accesses any time), or you just masked an
interrupt at the PIC (with a readback to flush) and are about to enable
MSR[EE].

Most of the time, though, it's overkill.  It should probably be an
alternate accessor form, or maybe a wait_for_io() wrapper -- if it can
be shown to make a real performance difference.
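
A sketch of what such a split could look like, under stated assumptions: mmio_read32() and mmio_read32_wait() are hypothetical names, not existing kernel accessors, and the PowerPC branch is my reading of the sync/load/twi/isync sequence discussed in this thread. On any other host the heavyweight form simply falls back to the plain load so that the sketch still builds and runs.

```c
#include <stdint.h>

/* Plain accessor: a single volatile load and nothing more. */
static inline uint32_t mmio_read32(const volatile uint32_t *addr)
{
	return *addr;
}

/*
 * Heavyweight accessor: on PowerPC, order the load after earlier
 * accesses (sync), then make following instructions wait for the
 * loaded value (twi on the result, then isync).
 */
static inline uint32_t mmio_read32_wait(const volatile uint32_t *addr)
{
#if defined(__powerpc__)
	uint32_t val;

	__asm__ __volatile__("sync\n\t"
			     "lwz %0,0(%1)\n\t"
			     "twi 0,%0,0\n\t"
			     "isync"
			     : "=r" (val) : "b" (addr) : "memory");
	return val;
#else
	/* Non-PowerPC host: no equivalent sequence, use the plain load. */
	return mmio_read32(addr);
#endif
}
```

Callers that only need the value would use the plain form with whatever barrier their ordering rules require, and keep the _wait form for the cases above, such as a readback before enabling MSR[EE].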

-Scott



Re: [PATCH V2 2/3] powerpc: Poll VPA for topology changes and update NUMA maps

2010-12-01 Thread Jesse Larrew
On 11/28/2010 10:44 PM, Benjamin Herrenschmidt wrote:
> On Tue, 2010-11-09 at 16:25 -0700, Jesse Larrew wrote:
>> From: Jesse Larrew 
>>
>> This patch sets a timer during boot that will periodically poll the
>> associativity change counters in the VPA. When a change in 
>> associativity is detected, it retrieves the new associativity domain 
>> information via the H_HOME_NODE_ASSOCIATIVITY hcall and updates the 
>> NUMA node maps and sysfs entries accordingly. Note that since the 
>> ibm,associativity device tree property does not exist on configurations 
>> with both NUMA and SPLPAR enabled, no device tree updates are necessary.
>>
>> Signed-off-by: Jesse Larrew 
>> ---
> 
> No fundamental objection, just quick nits before I merge:

Thanks for the review!

>> +
> >> +/* Virtual Processor Home Node (VPHN) support */
>> +#define VPHN_NR_CHANGE_CTRS (8)
>> +static u8 vphn_cpu_change_counts[NR_CPUS][VPHN_NR_CHANGE_CTRS];
>> +static cpumask_t cpu_associativity_changes_mask;
>> +static void topology_work_fn(struct work_struct *work);
>> +static DECLARE_WORK(topology_work, topology_work_fn);
> 
> Remove the prototype for topology_work_fn() and put the DECLARE_WORK
> right below the function itself.
> 

No problem.

>> +static void topology_timer_fn(unsigned long ignored);
>> +static struct timer_list topology_timer =
>> +TIMER_INITIALIZER(topology_timer_fn, 0, 0);
> 
> Same deal.
> 

Ditto.

>> +static void set_topology_timer(void);
>> +int stop_topology_update(void);
>> +
>> +/*
>> + * Store the current values of the associativity change counters in the
>> + * hypervisor.
>> + */
>> +static void setup_cpu_associativity_change_counters(void)
>> +{
>> +int cpu = 0;
>> +
>> +for_each_possible_cpu(cpu) {
>> +int i = 0;
>> +u8 *counts = vphn_cpu_change_counts[cpu];
>> +volatile u8 *hypervisor_counts = lppaca[cpu].vphn_assoc_counts;
>> +
>> +for (i = 0; i < VPHN_NR_CHANGE_CTRS; i++) {
>> +counts[i] = hypervisor_counts[i];
>> +}
>> +}
>> +
>> +return;
>> +}
>> +
>> +/*
>> + * The hypervisor maintains a set of 8 associativity change counters in
>> + * the VPA of each cpu that correspond to the associativity levels in the
>> + * ibm,associativity-reference-points property. When an associativity
>> + * level changes, the corresponding counter is incremented.
>> + *
>> + * Set a bit in cpu_associativity_changes_mask for each cpu whose home
>> + * node associativity levels have changed.
>> + */
>> +static void update_cpu_associativity_changes_mask(void)
>> +{
>> +int cpu = 0;
>> +cpumask_t *changes = &cpu_associativity_changes_mask;
>> +
>> +cpumask_clear(changes);
>> +
>> +for_each_possible_cpu(cpu) {
>> +int i;
>> +u8 *counts = vphn_cpu_change_counts[cpu];
>> +volatile u8 *hypervisor_counts = lppaca[cpu].vphn_assoc_counts;
>> +
>> +for (i = 0; i < VPHN_NR_CHANGE_CTRS; i++) {
>> +if (hypervisor_counts[i] > counts[i]) {
>> +counts[i] = hypervisor_counts[i];
>> +
>> +if (!(cpumask_test_cpu(cpu, changes)))
>> +cpumask_set_cpu(cpu, changes);
>> +}
>> +}
> 
> This is a tad sub-optimal. I'd just set a local variable, and
> after the inside loop set the cpumask bit when that variable is set.
> 
> Also, keep another variable that accumulates all bits set and return
> its value so you don't have to re-check the mask in the caller.
> 
> cpumask operations can be expensive.
> 

You're right. That's more efficient.
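
Roughly what I have in mind, as a host-side sketch rather than the final patch. A plain unsigned long stands in for cpumask_t, the arrays stand in for the VPA counters, and NR_CPUS_SKETCH and update_changes_mask() are made-up names. One deliberate difference from the posted code: the comparison is != instead of >, on the assumption that the u8 counters can wrap.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH		4
#define VPHN_NR_CHANGE_CTRS	8

static uint8_t saved_counts[NR_CPUS_SKETCH][VPHN_NR_CHANGE_CTRS];
static volatile uint8_t hyp_counts[NR_CPUS_SKETCH][VPHN_NR_CHANGE_CTRS];
static unsigned long changes_mask;	/* stand-in for cpumask_t */

/*
 * Refresh the saved counters and rebuild the mask. Returns the number
 * of cpus that changed, so the caller need not re-scan the mask.
 */
static int update_changes_mask(void)
{
	int cpu, nr_changed = 0;

	changes_mask = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		int i, changed = 0;

		for (i = 0; i < VPHN_NR_CHANGE_CTRS; i++) {
			if (hyp_counts[cpu][i] != saved_counts[cpu][i]) {
				saved_counts[cpu][i] = hyp_counts[cpu][i];
				changed = 1;
			}
		}

		/* One mask update per cpu, after the inner loop. */
		if (changed) {
			changes_mask |= 1UL << cpu;
			nr_changed++;
		}
	}

	return nr_changed;
}
```

The mask is touched at most once per cpu, and the return value tells the caller whether there is any work to queue.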

>> +}
>> +
>> +return;
> 
> You don't need a return; at the end of a function.
> 

Ok.

>> +}
>> +
>> +/* 6 64-bit registers unpacked into 12 32-bit associativity values */
>> +#define VPHN_ASSOC_BUFSIZE (6*sizeof(u64)/sizeof(u32))
>> +
>> +/*
>> + * Convert the associativity domain numbers returned from the hypervisor
>> + * to the sequence they would appear in the ibm,associativity property.
>> + */
> >> +static int vphn_unpack_associativity(const long *packed, unsigned int *unpacked)
>> +{
>> +int i = 0;
>> +int nr_assoc_doms = 0;
>> +const u16 *field = (const u16*) packed;
>> +
> >> +#define VPHN_FIELD_UNUSED   (0xffff)
>> +#define VPHN_FIELD_MSB  (0x8000)
>> +#define VPHN_FIELD_MASK (~VPHN_FIELD_MSB)
>> +
>> +for (i = 0; i < VPHN_ASSOC_BUFSIZE; i++) {
>> +if (*field == VPHN_FIELD_UNUSED) {
>> +/* All significant fields processed, and remaining
>> + * fields contain the reserved value of all 1's.
>> + * Just store them.
>> + */
>> +unpacked[i] = *((u32*)field);
>> +field += 2;
>> +}
>> +else if (*field & VPHN_FIELD_MSB) {
>> +/* Data is in the lower 15 bits of this field */
>> +u

Re: [PATCH V2 2/3] powerpc: Poll VPA for topology changes and update NUMA maps

2010-12-01 Thread Benjamin Herrenschmidt
On Wed, 2010-12-01 at 16:50 -0500, Jesse Larrew wrote:
> 
> Hmmm... Good point. That would eliminate a lot of complexity, and if
> we wrap the code in the timer interrupt so that it only executes on
> systems with the VPHN feature, then partition migration pretty much
> takes care of itself as well. :) I'll repost this patch set with the
> tweaks that you mentioned above, then I'll post a separate patch to
> remove the cpumask and timer.

Right. First fixup that patch and we can merge that, then we can look at
the "better approach" as a separate step.

Cheers,
Ben.




[PATCH V3 0/3][RFC] Add Support for Virtual Processor Home Node (VPHN)

2010-12-01 Thread Jesse Larrew
From: Jesse Larrew 

The SPLPAR option allows the platform to dispatch virtual processors on
physical processors that, due to the variable nature of work loads, are
temporarily free, thus improving the utilization of computing resources.
However, SPLPAR implies inconsistent mapping of virtual to physical
processors, thus defeating resource allocation software that attempts to
optimize performance on platforms that implement the NUMA option.

To bridge the gap between these two options, the VPHN option maintains a
substantially consistent mapping of a given virtual processor to a physical
processor or set of processors within a given associativity domain. When
allocating computing resources, the kernel can take advantage of this
statistically consistent mapping to improve processing performance.

VPHN mappings are substantially consistent but not static. For any given
dispatch cycle, a best effort is made by the hypervisor to dispatch the
virtual processor on a physical processor within a targeted associativity
domain (the virtual processor's home node). However, if processing capacity
within the home node is not available, some other physical processor is
assigned to meet the processing capacity entitlement. From time to time,
to optimize the total platform performance, it may be necessary for the
platform to change the home node of a given virtual processor.

The Virtual Processor Home Node feature addresses this by adding the
H_HOME_NODE_ASSOCIATIVITY hcall to retrieve the current associativity
domain information directly from the hypervisor for a given virtual
processor's home node. It also exposes a set of associativity change
counters in the Virtual Processor Area (VPA) of each processor to indicate
when associativity changes occur.

This patch set sets a timer during boot that will periodically poll the
associativity change counters. When a change in associativity is detected,
it retrieves the new associativity domain information via the
H_HOME_NODE_ASSOCIATIVITY hcall and updates the NUMA node maps and sysfs
entries accordingly. The polling mechanism is also tied into the
ibm,suspend-me rtas call to stop/restart polling before/after a suspend,
hibernate, migrate, or checkpoint restart operation.

This patch set applies to v2.6.37-rc4 and includes the following:

[PATCH 1/3] powerpc: Add VPHN firmware feature
[PATCH 2/3] powerpc: Poll VPA for topology changes and update NUMA maps
[PATCH 3/3] powerpc: Disable VPHN polling during a suspend operation

Changes since V2:

* Rebased on 2.6.37-rc4.
* Rearranged work declarations and timer initializations to eliminate
  unnecessary function declarations.
* Eliminated redundant cpumask operations in 
  update_cpu_associativity_changes_mask().
* Eliminated unnecessary return statements from functions with void
  return types.
* Coding-style cleanups.
* Replaced del_timer() with del_timer_sync() in stop_topology_update(),
  and added the "vphn_enabled" flag to prevent the timer function from
  reinstalling itself.
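
The interaction between the flag and the timer in the last item can be sketched in user space as follows. This is a simulation only: timer_pending stands in for a pending kernel timer, polls_done and start_topology_update() are invented for the example, and clearing timer_pending stands in for del_timer_sync().

```c
/* User-space sketch: a self-rearming poll timer guarded by an enable flag. */
static int vphn_enabled;	/* mirrors the flag named above */
static int timer_pending;	/* hypothetical stand-in for a pending timer */
static int polls_done;

static void set_topology_timer(void)
{
	timer_pending = 1;
}

/* Timer callback: do one poll, then re-arm only while still enabled. */
static void topology_timer_fn(void)
{
	timer_pending = 0;
	if (!vphn_enabled)
		return;
	polls_done++;
	set_topology_timer();
}

static void start_topology_update(void)
{
	vphn_enabled = 1;
	set_topology_timer();
}

static void stop_topology_update(void)
{
	vphn_enabled = 0;	/* clear first so the callback cannot re-arm */
	timer_pending = 0;	/* stands in for del_timer_sync() */
}
```

Clearing the flag before removing the timer is the point: a callback that is already running sees the flag cleared and returns without re-arming itself.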

Signed-off-by: Jesse Larrew 
---
 arch/powerpc/include/asm/firmware.h   |3 +-
 arch/powerpc/include/asm/hvcall.h |3 +-
 arch/powerpc/include/asm/lppaca.h |5 +-
 arch/powerpc/include/asm/topology.h   |   10 +
 arch/powerpc/kernel/rtas.c|3 +
 arch/powerpc/mm/numa.c|  274 +++-
 arch/powerpc/platforms/pseries/firmware.c |1 +
 7 files changed, 286 insertions(+), 13 deletions(-)



[PATCH V3 1/3] powerpc: Add VPHN firmware feature

2010-12-01 Thread Jesse Larrew
From: Jesse Larrew 

This simple patch adds the firmware feature for VPHN to the firmware 
features bitmask.

Signed-off-by: Jesse Larrew 
---
 arch/powerpc/include/asm/firmware.h   |3 ++-
 arch/powerpc/include/asm/hvcall.h |3 ++-
 arch/powerpc/platforms/pseries/firmware.c |1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
index 20778a4..4ef662e 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -46,6 +46,7 @@
 #define FW_FEATURE_PS3_LV1 ASM_CONST(0x0000000000800000)
 #define FW_FEATURE_BEAT    ASM_CONST(0x0000000001000000)
 #define FW_FEATURE_CMO     ASM_CONST(0x0000000002000000)
+#define FW_FEATURE_VPHN    ASM_CONST(0x0000000004000000)
 
 #ifndef __ASSEMBLY__
 
@@ -59,7 +60,7 @@ enum {
FW_FEATURE_VIO | FW_FEATURE_RDMA | FW_FEATURE_LLAN |
FW_FEATURE_BULK_REMOVE | FW_FEATURE_XDABR |
FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
-   FW_FEATURE_CMO,
+   FW_FEATURE_CMO | FW_FEATURE_VPHN,
FW_FEATURE_PSERIES_ALWAYS = 0,
FW_FEATURE_ISERIES_POSSIBLE = FW_FEATURE_ISERIES | FW_FEATURE_LPAR,
FW_FEATURE_ISERIES_ALWAYS = FW_FEATURE_ISERIES | FW_FEATURE_LPAR,
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index de03ca5..6de1e5f 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -232,7 +232,8 @@
 #define H_GET_EM_PARMS 0x2B8
 #define H_SET_MPP  0x2D0
 #define H_GET_MPP  0x2D4
-#define MAX_HCALL_OPCODE   H_GET_MPP
+#define H_HOME_NODE_ASSOCIATIVITY 0x2EC
+#define MAX_HCALL_OPCODE   H_HOME_NODE_ASSOCIATIVITY
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c
index 0a14d8c..0b0eff0 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -55,6 +55,7 @@ firmware_features_table[FIRMWARE_MAX_FEATURES] = {
{FW_FEATURE_XDABR,  "hcall-xdabr"},
{FW_FEATURE_MULTITCE,   "hcall-multi-tce"},
{FW_FEATURE_SPLPAR, "hcall-splpar"},
+   {FW_FEATURE_VPHN,   "hcall-vphn"},
 };
 
 /* Build up the firmware features bitmask using the contents of
-- 
1.7.2.3



[PATCH V3 2/3] powerpc: Poll VPA for topology changes and update NUMA maps

2010-12-01 Thread Jesse Larrew
From: Jesse Larrew 

This patch sets a timer during boot that will periodically poll the
associativity change counters in the VPA. When a change in 
associativity is detected, it retrieves the new associativity domain 
information via the H_HOME_NODE_ASSOCIATIVITY hcall and updates the 
NUMA node maps and sysfs entries accordingly. Note that since the 
ibm,associativity device tree property does not exist on configurations 
with both NUMA and SPLPAR enabled, no device tree updates are necessary.

Signed-off-by: Jesse Larrew 
---
 arch/powerpc/include/asm/lppaca.h |5 +-
 arch/powerpc/mm/numa.c|  277 +++--
 2 files changed, 271 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 7f5e0fe..380d48b 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -62,7 +62,10 @@ struct lppaca {
volatile u32 dyn_pir;   // Dynamic ProcIdReg value  x20-x23
u32 dsei_data;  // DSEI datax24-x27
u64 sprg3;  // SPRG3 value  x28-x2F
-   u8  reserved3[80];  // Reserved x30-x7F
+   u8  reserved3[40];  // Reserved x30-x57
+   volatile u8 vphn_assoc_counts[8]; // Virtual processor home node
+   // associativity change counters x58-x5F
+   u8  reserved4[32];  // Reserved x60-x7F
 
 //=
 // CACHE_LINE_2 0x0080 - 0x00FF Contains local read-write data
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 74505b2..42aa7d1 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -20,10 +20,14 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
 
 static int numa_enabled = 1;
 
@@ -246,32 +250,41 @@ static void initialize_distance_lookup_table(int nid,
 /* Returns nid in the range [0..MAX_NUMNODES-1], or -1 if no useful numa
  * info is found.
  */
-static int of_node_to_nid_single(struct device_node *device)
+static int associativity_to_nid(const unsigned int *associativity)
 {
int nid = -1;
-   const unsigned int *tmp;
 
if (min_common_depth == -1)
goto out;
 
-   tmp = of_get_associativity(device);
-   if (!tmp)
-   goto out;
-
-   if (tmp[0] >= min_common_depth)
-   nid = tmp[min_common_depth];
+   if (associativity[0] >= min_common_depth)
+   nid = associativity[min_common_depth];
 
/* POWER4 LPAR uses 0xffff as invalid node */
if (nid == 0xffff || nid >= MAX_NUMNODES)
nid = -1;
 
-   if (nid > 0 && tmp[0] >= distance_ref_points_depth)
-   initialize_distance_lookup_table(nid, tmp);
+   if (nid > 0 && associativity[0] >= distance_ref_points_depth)
+   initialize_distance_lookup_table(nid, associativity);
 
 out:
return nid;
 }
 
+/* Returns the nid associated with the given device tree node,
+ * or -1 if not found.
+ */
+static int of_node_to_nid_single(struct device_node *device)
+{
+   int nid = -1;
+   const unsigned int *tmp;
+
+   tmp = of_get_associativity(device);
+   if (tmp)
+   nid = associativity_to_nid(tmp);
+   return nid;
+}
+
 /* Walk the device tree upwards, looking for an associativity id */
 int of_node_to_nid(struct device_node *device)
 {
@@ -1248,3 +1261,247 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
 }
 
 #endif /* CONFIG_MEMORY_HOTPLUG */
+
+/* Virtual Processor Home Node (VPHN) support */
+#define VPHN_NR_CHANGE_CTRS (8)
+static u8 vphn_cpu_change_counts[NR_CPUS][VPHN_NR_CHANGE_CTRS];
+static cpumask_t cpu_associativity_changes_mask;
+static int vphn_enabled;
+static void set_topology_timer(void);
+int stop_topology_update(void);
+
+/*
+ * Store the current values of the associativity change counters in the
+ * hypervisor.
+ */
+static void setup_cpu_associativity_change_counters(void)
+{
+   int cpu = 0;
+
+   for_each_possible_cpu(cpu) {
+   int i = 0;
+   u8 *counts = vphn_cpu_change_counts[cpu];
+   volatile u8 *hypervisor_counts = lppaca[cpu].vphn_assoc_counts;
+
+   for (i = 0; i < VPHN_NR_CHANGE_CTRS; i++) {
+   counts[i] = hypervisor_counts[i];
+   }
+   }
+}
+
+/*
+ * The hypervisor maintains a set of 8 associativity change counters in
+ * the VPA of each cpu that correspond to the associativity levels in the
+ * ibm,associativity-reference-points property. When an associativity
+ * level changes, the corresponding counter is incremented.
+ *
+ * Set a bit in cpu_associativity_changes_mask for each cpu whose home
+ * node associativity levels have ch

[PATCH V3 3/3] powerpc: Disable VPHN polling during a suspend operation

2010-12-01 Thread Jesse Larrew
From: Jesse Larrew 

Tie the polling mechanism into the ibm,suspend-me rtas call to
stop/restart polling before/after a suspend, hibernate, migrate,
or checkpoint restart operation. This ensures that the system has a
chance to disable the polling if the partition is migrated to a system
that does not support VPHN (and vice versa).
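
The bracketing described above can be sketched in userspace roughly as
follows. This is only an illustration under stated assumptions:
polling_active, firmware_has_vphn and fake_suspend_me() are hypothetical
stand-ins, and the two topology functions here are stubs, not the kernel
implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state; the real kernel tracks this differently. */
static bool polling_active = true;
static bool firmware_has_vphn = true;

/* Stub: stop polling unconditionally. */
static int stop_topology_update(void)
{
	polling_active = false;
	return 0;
}

/* Stub: restart polling only if the (possibly new) system supports VPHN. */
static int start_topology_update(void)
{
	polling_active = firmware_has_vphn;
	return 0;
}

/*
 * Stand-in for the ibm,suspend-me call. The partition may resume on a
 * different system, so the VPHN capability can change across the call.
 */
static int fake_suspend_me(bool vphn_on_target)
{
	assert(!polling_active);	/* no poll may run while suspended */
	firmware_has_vphn = vphn_on_target;
	return 0;
}

/* Bracket the suspend with stop/start, as the patch does in rtas.c. */
static int suspend_with_polling_paused(bool vphn_on_target)
{
	int rc;

	stop_topology_update();
	rc = fake_suspend_me(vphn_on_target);
	start_topology_update();
	return rc;
}
```

Re-evaluating support in the start path is what gives the system its
chance to leave polling disabled after a migration to a system without
VPHN, and to enable it again in the opposite case.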

Signed-off-by: Jesse Larrew 
---
 arch/powerpc/include/asm/topology.h |   10 ++
 arch/powerpc/kernel/rtas.c  |3 +++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index afe4aaa..aed188b 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -93,6 +93,8 @@ extern void __init dump_numa_cpu_topology(void);
 extern int sysfs_add_device_to_node(struct sys_device *dev, int nid);
 extern void sysfs_remove_device_from_node(struct sys_device *dev, int nid);
 
+extern int start_topology_update(void);
+extern int stop_topology_update(void);
 #else
 
 static inline void dump_numa_cpu_topology(void) {}
@@ -107,6 +109,14 @@ static inline void sysfs_remove_device_from_node(struct sys_device *dev,
 {
 }
 
+static inline int start_topology_update(void)
+{
+   return 0;
+}
+static inline int stop_topology_update(void)
+{
+   return 0;
+}
 #endif /* CONFIG_NUMA */
 
 #include 
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 8fe8bc6..2097f2b 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct rtas_t rtas = {
.lock = __ARCH_SPIN_LOCK_UNLOCKED
@@ -713,6 +714,7 @@ static int __rtas_suspend_last_cpu(struct rtas_suspend_me_data *data, int wake_w
int cpu;
 
slb_set_size(SLB_MIN_SIZE);
+   stop_topology_update();
printk(KERN_DEBUG "calling ibm,suspend-me on cpu %i\n", smp_processor_id());
 
while (rc == H_MULTI_THREADS_ACTIVE && !atomic_read(&data->done) &&
@@ -728,6 +730,7 @@ static int __rtas_suspend_last_cpu(struct rtas_suspend_me_data *data, int wake_w
rc = atomic_read(&data->error);
 
atomic_set(&data->error, rc);
+   start_topology_update();
 
if (wake_when_done) {
atomic_set(&data->done, 1);
-- 
1.7.2.3


[PATCH] mpc52xx: gpt: include fs.h

2010-12-01 Thread Wolfram Sang
Fix build errors like these from a randconfig:

src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:549: error: dereferencing pointer to incomplete type: 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:636: error: implicit declaration of function 'nonseekable_open': 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:657: error: variable 'mpc52xx_wdt_fops' has initializer but incomplete type: 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:658: error: excess elements in struct initializer: 1 errors in 1 logs
src/arch/powerpc/platforms/52xx/mpc52xx_gpt.c:658: error: unknown field 'owner' specified in initializer: 1 errors in 1 logs
...

Reported-by: Geert Uytterhoeven 
Signed-off-by: Wolfram Sang 
Cc: Grant Likely 
---
 arch/powerpc/platforms/52xx/mpc52xx_gpt.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
index fea833e..e0d703c 100644
--- a/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
+++ b/arch/powerpc/platforms/52xx/mpc52xx_gpt.c
@@ -63,6 +63,7 @@
 #include 
 #include 
 #include 
+#include <linux/fs.h>
 #include 
 #include 
 #include 
-- 
1.7.2.3


Re: [PATCH] of/address: use propper endianess in get_flags

2010-12-01 Thread Benjamin Herrenschmidt
On Wed, 2010-12-01 at 10:54 +0100, Sebastian Andrzej Siewior wrote:
> This patch changes u32 to __be32 for all "ranges", "prop" and "addr" and
> such. Those variables are pointing to the device tree which contains
> integers in big endian format.
> Most functions are doing it right because of_read_number() is doing the
> right thing for them. of_bus_isa_get_flags(), of_bus_pci_get_flags() and
> of_bus_isa_map() were accessing the data directly and were doing it wrong.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
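
To illustrate the point of the description above: a device tree cell is
stored big endian, so loading it as a native u32 only gives the right
value on a big endian host. A minimal userspace sketch of the portable
read follows; read_be32() and pci_space_code() are hypothetical names
used here in place of be32_to_cpup() and the switch expression in
of_bus_pci_get_flags().

```c
#include <stdint.h>

/* Assemble a 32-bit value from four big endian bytes; host independent. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Extract the two-bit field that the PCI get_flags callback switches on. */
static unsigned int pci_space_code(const uint8_t *cell)
{
	return (read_be32(cell) >> 24) & 0x03;
}
```

On a little endian host, a direct addr[0] load of the bytes 82 00 00 00
would yield 0x00000082, and the field extracted from bits 24-25 would be
0 instead of 2.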

Acked-by: Benjamin Herrenschmidt 
---

> ---
>  arch/powerpc/include/asm/prom.h |2 +-
>  drivers/of/address.c|   54 --
>  include/linux/of_address.h  |6 ++--
>  3 files changed, 32 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
> index ae26f2e..ab34f60 100644
> --- a/arch/powerpc/include/asm/prom.h
> +++ b/arch/powerpc/include/asm/prom.h
> @@ -42,7 +42,7 @@ extern void pci_create_OF_bus_map(void);
>  
>  /* Translate a DMA address from device space to CPU space */
>  extern u64 of_translate_dma_address(struct device_node *dev,
> - const u32 *in_addr);
> + const __be32 *in_addr);
>  
>  #ifdef CONFIG_PCI
>  extern unsigned long pci_address_to_pio(phys_addr_t address);
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index 3a1c7e7..b4559c5 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -12,13 +12,13 @@
>   (ns) > 0)
>  
>  static struct of_bus *of_match_bus(struct device_node *np);
> -static int __of_address_to_resource(struct device_node *dev, const u32 *addrp,
> - u64 size, unsigned int flags,
> +static int __of_address_to_resource(struct device_node *dev,
> + const __be32 *addrp, u64 size, unsigned int flags,
>   struct resource *r);
>  
>  /* Debug utility */
>  #ifdef DEBUG
> -static void of_dump_addr(const char *s, const u32 *addr, int na)
> +static void of_dump_addr(const char *s, const __be32 *addr, int na)
>  {
>   printk(KERN_DEBUG "%s", s);
>   while (na--)
> @@ -26,7 +26,7 @@ static void of_dump_addr(const char *s, const u32 *addr, int na)
>   printk("\n");
>  }
>  #else
> -static void of_dump_addr(const char *s, const u32 *addr, int na) { }
> +static void of_dump_addr(const char *s, const __be32 *addr, int na) { }
>  #endif
>  
>  /* Callbacks for bus specific translators */
> @@ -36,10 +36,10 @@ struct of_bus {
>   int (*match)(struct device_node *parent);
>   void(*count_cells)(struct device_node *child,
>  int *addrc, int *sizec);
> - u64 (*map)(u32 *addr, const u32 *range,
> + u64 (*map)(u32 *addr, const __be32 *range,
>   int na, int ns, int pna);
>   int (*translate)(u32 *addr, u64 offset, int na);
> - unsigned int(*get_flags)(const u32 *addr);
> + unsigned int(*get_flags)(const __be32 *addr);
>  };
>  
>  /*
> @@ -55,7 +55,7 @@ static void of_bus_default_count_cells(struct device_node *dev,
>   *sizec = of_n_size_cells(dev);
>  }
>  
> -static u64 of_bus_default_map(u32 *addr, const u32 *range,
> +static u64 of_bus_default_map(u32 *addr, const __be32 *range,
>   int na, int ns, int pna)
>  {
>   u64 cp, s, da;
> @@ -85,7 +85,7 @@ static int of_bus_default_translate(u32 *addr, u64 offset, int na)
>   return 0;
>  }
>  
> -static unsigned int of_bus_default_get_flags(const u32 *addr)
> +static unsigned int of_bus_default_get_flags(const __be32 *addr)
>  {
>   return IORESOURCE_MEM;
>  }
> @@ -110,10 +110,10 @@ static void of_bus_pci_count_cells(struct device_node *np,
>   *sizec = 2;
>  }
>  
> -static unsigned int of_bus_pci_get_flags(const u32 *addr)
> +static unsigned int of_bus_pci_get_flags(const __be32 *addr)
>  {
>   unsigned int flags = 0;
> - u32 w = addr[0];
> + u32 w = be32_to_cpup(addr);
>  
>   switch((w >> 24) & 0x03) {
>   case 0x01:
> @@ -129,7 +129,8 @@ static unsigned int of_bus_pci_get_flags(const u32 *addr)
>   return flags;
>  }
>  
> -static u64 of_bus_pci_map(u32 *addr, const u32 *range, int na, int ns, int pna)
> +static u64 of_bus_pci_map(u32 *addr, const __be32 *range, int na, int ns,
> + int pna)
>  {
>   u64 cp, s, da;
>   unsigned int af, rf;
> @@ -160,7 +161,7 @@ static int of_bus_pci_translate(u32 *addr, u64 offset, int na)
>   return of_bus_default_translate(addr + 1, offset, na - 1);
>  }
>  
> -const u32 *of_get_pci_address(struct device_node *dev, int bar_no, u64 *size,
> +const __be32 *of_get_pci_address(struct device_node *dev, int bar_no, u64 *size,
>   unsigned int *flags)
>  {
>   const __be32 *prop;
> @@ -207,7