[PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2007-11-06 Thread Yuri Tikhonov

 Hello all,

 Here is a patch-set that adds L2-cache synchronization routines for the
ppc44x processor family. I know that the "ppc" branch is for bug-fixing only,
thus the patch-set is just FYI [though an enabled but non-coherent L2-cache may
appear as a bug to someone who uses one of the boards listed below :)].

[PATCH 1/2] [PPC 4xx] invalidate_l2cache_range() implementation for ppc44x;
[PATCH 2/2] [PPC 44x] enable L2-cache for the following ppc44x-based boards:
ALPR, Katmai, Ocotea, and Taishan.

 Regards, Yuri

-- 
Yuri Tikhonov, Senior Software Engineer
Emcraft Systems, www.emcraft.com

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2008-01-11 Thread Yuri Tikhonov

 Hello, Eugene,

 The h/w snooping mechanism you are talking about is limited to the Low 
Latency (LL) segment of the PLB bus in ppc440sp and ppc440spe chips (see 
section "7.2.7 L2 Cache Coherency" of the ppc440spe spec), whereas DMA and 
XOR engines use the High Bandwidth (HB) segment of PLB bus (see 
section "1.1.2 Internal Buses" of the ppc440spe spec).

 Thus, the h/w snooping mechanism cannot observe the results of operations 
performed by the DMA and XOR engines and keep the L2-cache coherent with 
SDRAM, because that data flows through the HB PLB segment. This leads to, for 
example, incorrect results of RAID-parity calculations if one uses the h/w 
accelerated ppc440spe ADMA driver with L2-cache enabled.

 The s/w synchronization algorithms proposed in my patches have no LL PLB 
limitation, as opposed to h/w snooping, but this is probably not the best 
way to implement it. Even though with these patches the h/w accelerated RAID 
starts to operate correctly (with L2-cache enabled), a performance degradation 
(induced by the loops in the L2-cache synchronization routines) is observed in 
most cases. As a result, there is no benefit from using L2-cache for these 
RAID cases at all.
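
 To illustrate where the loop cost comes from, here is a minimal sketch of a
range walk of the kind a software invalidate has to perform. It is not the
code from the patches: the 32-byte line size, the l2_for_each_line() name and
the per-line callback are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed L2 line size in bytes; the real value is chip-specific. */
#define L2_LINE_SIZE 32u

/* Hypothetical hook standing in for the per-line hardware invalidate. */
typedef void (*l2_line_op_t)(uintptr_t line_addr, void *ctx);

/*
 * Walk [start, start + len) one cache line at a time and call op on
 * every line the range touches. Returns the number of lines visited.
 * Address wrap-around at the top of the address space is not handled.
 */
static size_t l2_for_each_line(uintptr_t start, size_t len,
                               l2_line_op_t op, void *ctx)
{
    size_t count = 0;
    uintptr_t addr, last;

    if (len == 0)
        return 0;

    /* Round both ends down to a line boundary. */
    addr = start & ~(uintptr_t)(L2_LINE_SIZE - 1);
    last = (start + len - 1) & ~(uintptr_t)(L2_LINE_SIZE - 1);

    for (;;) {
        if (op)
            op(addr, ctx);
        count++;
        if (addr == last)
            break;
        addr += L2_LINE_SIZE;
    }
    return count;
}
```

 Under these assumptions a 4 KiB buffer costs 128 iterations per
synchronization, and an unaligned buffer costs one extra line at each
partially covered end.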

 Regards, Yuri

On Wednesday 28 November 2007 22:50, Eugene Surovegin wrote:
> On Wed, Nov 07, 2007 at 01:40:10AM +0300, Yuri Tikhonov wrote:
> > 
> >  Hello all,
> > 
> >  Here is a patch-set for support L2-cache synchronization routines for
> > the ppc44x processors family. I know that the "ppc" branch is for
> > bug-fixing only, thus
> > the patch-set is just FYI [though enabled but non-coherent L2-cache may
> > appear as a bug for
> > someone who uses one of the boards listed below :)].
> > 
> > [PATCH 1/2] [PPC 4xx] invalidate_l2cache_range() implementation for
> > ppc44x;
> > [PATCH 2/2] [PPC 44x] enable L2-cache for the following ppc44x-based
> > boards: ALPR,
> > Katmai, Ocotea, and Taishan.
> 
> Why is this all needed?
> 
> IIRC ibm440gx_l2c_enable() configures 64G snoop region for L2C.
> 
> Did AMCC made non-only-coherent L2C chips recently?
> 
> -- 
> Eugene
> 
> 

-- 
Yuri Tikhonov, Senior Software Engineer
Emcraft Systems, www.emcraft.com


Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2008-01-11 Thread Eugene Surovegin
On Fri, Jan 11, 2008 at 06:24:46PM +0300, Yuri Tikhonov wrote:
> 
>  Hello, Eugene,
> 
>  The h/w snooping mechanism you are talking about is limited to the Low 
> Latency (LL) segment of the PLB bus in ppc440sp and ppc440spe chips (see 
> section "7.2.7 L2 Cache Coherency" of the ppc440spe spec), whereas DMA and 
> XOR engines use the High Bandwidth (HB) segment of PLB bus (see 
> section "1.1.2 Internal Buses" of the ppc440spe spec).
> 
>  Thus, the h/w snooping mechanism is not able to trace the results of 
> operations performed by DMA and XOR engines and keep L2-cache coherent with 
> SDRAM, because the data flow through the HB PLB segment. This leads to, for 
> example, incorrect results of RAID-parity calculations if one uses the h/w 
> accelerated ppc440spe ADMA driver with L2-cache enabled.
> 
>  The s/w synchronization algorithms proposed in my patches has no LL PLB 
> limitations as opposed to h/w snooping, but, probably, this is not the best 
> way of how it might be implemented. Even though with these patches the h/w 
> accelerated RAID starts to operate correctly (with L2-cache enabled) there is 
> a performance degradation (induced by loops in the L2-cache synchronization 
> routines) observed in the most cases. So, as a result, there is no benefit 
> from using L2-cache for these, RAID, cases at all.

Thanks a lot for the explanation, Yuri. I'd never have imagined they were so 
stupid as to make new chips with such behaviour.

-- 
Eugene



Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2008-01-11 Thread Benjamin Herrenschmidt

> >  The s/w synchronization algorithms proposed in my patches has no LL PLB 
> > limitations as opposed to h/w snooping, but, probably, this is not the best 
> > way of how it might be implemented. Even though with these patches the h/w 
> > accelerated RAID starts to operate correctly (with L2-cache enabled) there 
> > is 
> > a performance degradation (induced by loops in the L2-cache synchronization 
> > routines) observed in the most cases. So, as a result, there is no benefit 
> > from using L2-cache for these, RAID, cases at all.
> 
> Thanks a lot for explanation, Yuri. I'd never imagine they were so 
> stupid to make new chips with such behaviour.

Indeed. Now the question is whether we want to make that configurable by the
platform, so it can select whether to enable snooping or use this
mechanism (in which case we can disable snooping on the L2).

Another option would be to make the dma_ops smart enough to know whether
a given device is on the snooped portion of the bus, which would be
easier to do after I merge the 32-bit and 64-bit DMA ops, so we get the
ability to change the dma-ops per bus or even per device.
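
A rough sketch of that per-device idea, assuming each device records which
PLB segment it sits on. All names here (demo_dma_ops, demo_device,
demo_set_dma_ops) are made up for illustration and are not the kernel's
dma_ops interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA ops table. */
struct demo_dma_ops {
    const char *name;
    bool needs_l2_sync;   /* true: software must sync L2 around DMA */
};

static const struct demo_dma_ops demo_snooped_ops = { "snooped", false };
static const struct demo_dma_ops demo_sw_sync_ops = { "sw-sync", true };

/* The two PLB segments described in the thread. */
enum demo_plb_segment { DEMO_PLB_LL, DEMO_PLB_HB };

struct demo_device {
    const char *name;
    enum demo_plb_segment segment;
    const struct demo_dma_ops *ops;
};

/*
 * Pick ops per device: masters on the snooped LL segment rely on h/w
 * snooping, masters on the HB segment get the software-sync variant.
 */
static void demo_set_dma_ops(struct demo_device *dev)
{
    dev->ops = (dev->segment == DEMO_PLB_LL) ? &demo_snooped_ops
                                             : &demo_sw_sync_ops;
}
```

With this shape, only devices on the non-snooped segment pay for the
synchronization loops.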

What do you guys think ?

Cheers,
Ben.




Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2008-01-11 Thread Eugene Surovegin
On Sat, Jan 12, 2008 at 09:05:35AM +1100, Benjamin Herrenschmidt wrote:
> 
> > >  The s/w synchronization algorithms proposed in my patches has no LL PLB 
> > > limitations as opposed to h/w snooping, but, probably, this is not the 
> > > best 
> > > way of how it might be implemented. Even though with these patches the 
> > > h/w 
> > > accelerated RAID starts to operate correctly (with L2-cache enabled) 
> > > there is 
> > > a performance degradation (induced by loops in the L2-cache 
> > > synchronization 
> > > routines) observed in the most cases. So, as a result, there is no 
> > > benefit 
> > > from using L2-cache for these, RAID, cases at all.
> > 
> > Thanks a lot for explanation, Yuri. I'd never imagine they were so 
> > stupid to make new chips with such behaviour.
> 
> Indeed. Now the question is do we want to make that configurable by the
> platform so it can select whether to enable snooping, or use this
> mechanism (in which case we can disable snooping on the L2) ?

I don't think we should punish platforms with sane L2 caches just because 
there are some brain-dead ones.

> Another option would be to make the dma_ops smart enough to know whether
> a given device is on the snooped portion of the bus, which would be
> easier to do after I merge 32 and 64 bits DMA ops, so we get the ability
> to change the dma-ops per bus or per device even.
> 
> What do you guys think ?

I like the idea of having smart DMA routines with different 
per-bus/device behaviour.

-- 
Eugene

 


Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2008-01-11 Thread Benjamin Herrenschmidt

On Fri, 2008-01-11 at 14:38 -0800, Eugene Surovegin wrote:
> On Sat, Jan 12, 2008 at 09:05:35AM +1100, Benjamin Herrenschmidt wrote:
> > 
> > > >  The s/w synchronization algorithms proposed in my patches has no LL 
> > > > PLB 
> > > > limitations as opposed to h/w snooping, but, probably, this is not the 
> > > > best 
> > > > way of how it might be implemented. Even though with these patches the 
> > > > h/w 
> > > > accelerated RAID starts to operate correctly (with L2-cache enabled) 
> > > > there is 
> > > > a performance degradation (induced by loops in the L2-cache 
> > > > synchronization 
> > > > routines) observed in the most cases. So, as a result, there is no 
> > > > benefit 
> > > > from using L2-cache for these, RAID, cases at all.
> > > 
> > > Thanks a lot for explanation, Yuri. I'd never imagine they were so 
> > > stupid to make new chips with such behaviour.
> > 
> > Indeed. Now the question is do we want to make that configurable by the
> > platform so it can select whether to enable snooping, or use this
> > mechanism (in which case we can disable snooping on the L2) ?
> 
> I don't think we should panish platforms with sane L2 caches, because 
> there are some brain-dead ones.

I agree, which is why I'm thinking about making it some kind of explicit
thing that a given platform would call from its setup_arch() callback
to turn on manual L2 synchronization.
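
Something along these lines, as a sketch only. The flag and the function
names are hypothetical and nothing like this exists in the tree yet.

```c
#include <stdbool.h>

/* Off by default, so platforms with coherent L2 pay nothing. */
static bool demo_l2_manual_sync;

/* Hypothetical opt-in that a platform would call during setup. */
static void demo_l2_enable_manual_sync(void)
{
    demo_l2_manual_sync = true;
}

/* The DMA sync path would consult this before looping over the L2. */
static bool demo_dma_needs_l2_sync(void)
{
    return demo_l2_manual_sync;
}

/* Example board hook, standing in for a platform's setup_arch(). */
static void demo_board_setup_arch(void)
{
    demo_l2_enable_manual_sync();
}
```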

> > Another option would be to make the dma_ops smart enough to know whether
> > a given device is on the snooped portion of the bus, which would be
> > easier to do after I merge 32 and 64 bits DMA ops, so we get the ability
> > to change the dma-ops per bus or per device even.
> > 
> > What do you guys think ?
> 
> I like the idea of having smart DMA routines with different 
> per-bus/device behaviour.

That would be longer term. When I merge the dma ops, I'll look into a
way to provide 44x-specific DMA ops that handle that case, and then a
way for devices to be tagged (maybe via the device-tree) as being on an
L2-coherent or non-L2-coherent segment of the bus.

Cheers,
Ben.




Re: [PATCH 0/2] [PPC 4xx] L2-cache synchronization for ppc44x

2007-11-28 Thread Eugene Surovegin
On Wed, Nov 07, 2007 at 01:40:10AM +0300, Yuri Tikhonov wrote:
> 
>  Hello all,
> 
>  Here is a patch-set for support L2-cache synchronization routines for
> the ppc44x processors family. I know that the "ppc" branch is for bug-fixing 
> only, thus
> the patch-set is just FYI [though enabled but non-coherent L2-cache may 
> appear as a bug for
> someone who uses one of the boards listed below :)].
> 
> [PATCH 1/2] [PPC 4xx] invalidate_l2cache_range() implementation for ppc44x;
> [PATCH 2/2] [PPC 44x] enable L2-cache for the following ppc44x-based boards: 
> ALPR,
> Katmai, Ocotea, and Taishan.

Why is this all needed?

IIRC ibm440gx_l2c_enable() configures a 64G snoop region for L2C.

Did AMCC make non-only-coherent L2C chips recently?

-- 
Eugene
