Hi Greg,

Thanks a lot. The problem was with cache flushing, as you said.

But I have one more question. Is it possible to add support for a split
D-cache and I-cache by enabling it in mcfcache.h and writing additional
functions for flushing each cache region (adding support for the functions
__flush_icache_all and __flush_dcache_all)? Or does enabling the D-cache
need some serious patch to the kernel?
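
To make the question concrete, here is a rough sketch of the kind of helpers
I have in mind. It only computes the word that would be written to CACR; it
does not touch hardware. The bit names and positions (CINV at bit 24, INVI at
bit 21, INVD at bit 20) are my assumption from the ColdFire V2 cache register
description and need checking against the MCF5275 manual, and the function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed CACR bit positions; verify against the processor manual. */
#define CACR_CINV 0x01000000u /* bit 24: start a cache invalidate */
#define CACR_INVI 0x00200000u /* bit 21: restrict invalidate to the I-cache */
#define CACR_INVD 0x00100000u /* bit 20: restrict invalidate to the D-cache */

/*
 * Build the CACR word for an I-cache-only invalidate.
 * cacr_config is the steady-state configuration that must be preserved,
 * so that flushing never changes how the cache is set up.
 */
uint32_t cacr_flush_icache_word(uint32_t cacr_config)
{
    assert((cacr_config & CACR_CINV) == 0); /* config must not contain CINV */
    return cacr_config | CACR_CINV | CACR_INVI;
}

/* Build the CACR word for a D-cache-only invalidate. */
uint32_t cacr_flush_dcache_word(uint32_t cacr_config)
{
    assert((cacr_config & CACR_CINV) == 0);
    return cacr_config | CACR_CINV | CACR_INVD;
}
```

On the board the resulting word would be written to CACR in place of the
constant used by the current flush code. The base value for a split
configuration is left as a parameter on purpose, because I do not know the
right one yet.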

2008/10/5 Greg Ungerer <g...@snapgear.com>

> Hi Alexander,
>
> Alexander Eremeenkov wrote:
>
>> I have performance problems with applications on this kernel (uClinux
>> 2.6.25-uc0).
>> First of all, I have a custom-made board with these features:
>> ColdFire 5274 @ 150 MHz
>> External Bus Frequency 75 MHz
>>
>> uClinux 2.6.25 boots and runs perfectly stably, but very slowly. For this
>> board I also have a 2.4.x kernel compiled, and it runs much faster (about
>> 2-20 times faster, depending on the test application).
>> After reading the mailing list, I found that some people have had similar
>> problems, and the cause was a disabled cache. But in my case, the cache is
>> enabled normally during startup.
>> The delay calibration reports a good value:
>> Calibrating delay loop... 98.71 BogoMIPS (lpj=493568)
>>
>> Also, if I disable the cache in /include/asm-m68knommu/mcfcache.h, I get a
>> value that looks normal for a disabled cache:
>> Calibrating delay loop... 5.82 BogoMIPS
>> So I concluded that the cache is enabled correctly.
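
For reference, the printed BogoMIPS figure follows directly from lpj. The
sketch below reproduces the arithmetic as I understand it from
init/calibrate.c. HZ=100 is an assumption (it is the value that makes
lpj=493568 come out as 98.71), and the helper name is hypothetical.

```c
#include <assert.h>

/*
 * Split loops-per-jiffy into the integer and two-digit fractional parts
 * of the BogoMIPS value, using the same integer arithmetic as the
 * kernel's calibration message.
 */
void bogomips_from_lpj(unsigned long lpj, unsigned long hz,
                       unsigned long *whole, unsigned long *frac)
{
    assert(hz > 0 && hz <= 5000); /* keep both divisors non-zero */
    *whole = lpj / (500000 / hz);
    *frac = (lpj / (5000 / hz)) % 100;
}
```

With lpj=493568 and HZ=100 this gives 98 and 71, matching the boot message.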
>>
>> Now, about performance. I tested it with several applications.
>> 1. Dhrystone test. It gives 4065 dhrystones/second, which is critically
>> low for this CPU. Curiously, the result is the same when I disable the
>> cache, as though the cache were constantly being flushed or were working
>> incorrectly.
>> 2. A simple endless loop that increments one value in memory. This test
>> also shows only about 5 real MIPS.
>> When I watched the CPU's accesses to external memory, I saw that it
>> accesses external RAM (where the application resides) all the time. But I
>> think a simple application with 5 instructions and 1 value in memory should
>> fit in the cache, and the CPU should work from the cache only.
>> 3. Some real work applications. They do massive data processing and moving,
>> work with peripherals, etc. Performance is also quite bad. These
>> applications run about 4 times faster on the 2.4.x kernel.
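
A loop test of the kind described in item 2 can be sketched as follows. This
is an illustrative reconstruction, not the original test program: the
original loops forever, while this version is bounded so that it can be
timed, and the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Increment a single word in memory `iterations` times.
 * volatile forces a real load and store on every pass, so the compiler
 * cannot collapse the loop into one addition.
 */
uint32_t spin_increment(uint32_t iterations)
{
    volatile uint32_t counter = 0;
    uint32_t i;

    for (i = 0; i < iterations; i++)
        counter++;

    assert(counter == iterations); /* sanity check on the final count */
    return counter;
}
```

Timing a call such as spin_increment(10000000) and dividing the iteration
count by the elapsed time gives a rough increments-per-second figure. With
working instruction caching, the fetches for a loop this small would be
expected to come from the cache after the first pass.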
>>
>> So, my conclusions:
>> 1) I don't think I have hardware problems, because the 2.4.x kernel runs
>> fast.
>> 2) The cache is enabled during startup: 98 BogoMIPS is a good value for my
>> CPU, and the 2.4 kernel calculates the same result.
>>
>> Now my question: has anybody had the same problems with this kernel?
>> Maybe 2.6.xx is simply this slow, and the only ways to speed things up are
>> to use the old kernel or a more modern, faster CPU? Maybe there are bugs in
>> the port to the ColdFire family? Something else?
>>
>
> There is no reason that application code performance (independent of
> use of kernel system calls) should give noticeably different results
> on a 2.6 kernel vs a 2.4 kernel.
>
>
>> Maybe it is a problem with the cache at run time? Some drivers, for
>> example, can flush the cache. Has anyone had a similar problem? I will
>> check this with a kernel built for the 5275EVB, but I think the result will
>> be the same, because I removed every driver/module I could from the kernel,
>> so an almost "empty" kernel starts, and when I then run the Dhrystone test
>> I get the same result.
>> Maybe it's a toolchain problem? I use m68k-uclinux-tools-20061214.sh, which
>> has some minor problems; maybe that is the cause? But my second test with
>> the simple application rules out this possibility.
>>
>> If anyone has had these problems, I would be glad of any help.
>>
>
> I recall a problem a little while back where the cache flush code
> was changing the cache configuration and not just flushing. (I think
> it was the 5282 cache support code that was broken). In this scenario
> the initial cache setup was good (so everything was fast), and after
> the first cache flush the setup was wrong. Now that would make something
> like the bogomips calculation look good, but later performance bad.
>
> Looking at the 2 places this is done:
>
>  linux-2.6.x/include/asm-m68knommu/mcfcache.h
>  linux-2.6.x/include/asm-m68knommu/cacheflush.h
>
> I suspect this may have broken the cache support for the 527x
> series (so the 5270/5271 and 5274/5275).
>
> To verify if this is what you are seeing, can you change the cache
> flush code for CONFIG_M527x in cacheflush.h from:
>
>    "movel  #0x81000200, %%d0\n\t"
>
> to
>
>    "movel  #0x81400100, %%d0\n\t"
>
> This is just to prove this is the problem. A real fix would
> need the 528x and 527x cache flushing code separated out.
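
To see why the flush constant matters, it helps to compare the two words bit
by bit. The sketch below assumes that bit 24 is the invalidate bit and that
the remaining bits stay in effect as cache configuration after the write,
which is how I read the explanation above. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CACR_CINV 0x01000000u /* bit 24, assumed to be the invalidate bit */

/* Bits of a flush word that remain as configuration after the flush. */
uint32_t cacr_config_bits(uint32_t flush_word)
{
    return flush_word & ~CACR_CINV;
}

/* Configuration bits that differ between two flush words. */
uint32_t cacr_config_diff(uint32_t a, uint32_t b)
{
    return cacr_config_bits(a) ^ cacr_config_bits(b);
}
```

cacr_config_diff(0x81000200, 0x81400100) evaluates to 0x00400300, so the two
flush words leave the cache configured differently in bits 22, 9 and 8. A
flush that writes the wrong word therefore changes the cache setup as a side
effect.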
>
> Regards
> Greg
>
>
>
>> And here are my kernel messages:
>>
>> /> cat /proc/kmsg
>>
>> <5>Linux version 2.6.25-uc0 (wa...@arch) (gcc version 4.1.1) #36 Mon Jan 5 13:05:16 PST 9
>> <6>
>> <4>
>> <4>uClinux/COLDFIRE(m5274/5275)
>> <6>COLDFIRE port done by Greg Ungerer, g...@snapgear.com
>> <6>Flat model support (C) 1998,1999 Kenneth Albanowski, D. Jeff Dionne
>> <7>On node 0 totalpages: 8192
>> <7>  DMA zone: 0 pages used for memmap
>> <7>  Normal zone: 64 pages used for memmap
>> <7>  Normal zone: 8128 pages, LIFO batch:0
>> <7>  Movable zone: 0 pages used for memmap
>> <4>Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 8128
>> <5>Kernel command line:
>> <6>Dentry cache hash table entries: 4096 (order: 2, 16384 bytes)
>> <6>Inode-cache hash table entries: 2048 (order: 1, 8192 bytes)
>> <6>Memory available: 29264k/32768k RAM, (1504k kernel code, 243k data)
>> <7>Calibrating delay loop... 98.71 BogoMIPS (lpj=493568)
>> <4>Mount-cache hash table entries: 512
>> <6>net_namespace: 144 bytes
>> <6>NET: Registered protocol family 16
>> <6>NET: Registered protocol family 2
>> <6>IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
>> <6>TCP established hash table entries: 1024 (order: 1, 8192 bytes)
>> <6>TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
>> <6>TCP: Hash tables configured (established 1024 bind 1024)
>> <6>TCP reno registered
>> <6>Installing knfsd (copyright (C) 1996 o...@monad.swb.de).
>> <6>JFFS2 version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
>> <6>io scheduler noop registered
>> <6>io scheduler cfq registered (default)
>> <4>ColdFire internal UART serial driver
>> <6>ttyS0 at MMIO 0x40000200 (irq = 77) is a ColdFire UART
>> <6>console [ttyS0] enabled
>> <6>ttyS1 at MMIO 0x40000240 (irq = 78) is a ColdFire UART
>> <6>ttyS2 at MMIO 0x40000280 (irq = 79) is a ColdFire UART
>> <6>brd: module loaded
>> <4>FEC ENET Version 0.2
>> <4>fec: PHY @ 0x1, ID 0x20005c90 -- DP83848
>> <4>eth0: ethernet 00:04:24:23:34:44
>> <4>uclinux[mtd]: RAM probe address=0x1d50f4 size=0x14d000
>> <5>Creating 1 MTD partitions on "RAM":
>> <5>0x00000000-0x0014d000 : "ROMfs"
>> <4>uclinux[mtd]: set ROMfs to be root filesystem
>> <6>Physically mapped flash: Found 1 x16 devices at 0x0 in 16-bit bank
>> <4> Intel/Sharp Extended Query Table at 0x010A
>> <4> Intel/Sharp Extended Query Table at 0x010A
>> <4> Intel/Sharp Extended Query Table at 0x010A
>> <4> Intel/Sharp Extended Query Table at 0x010A
>> <4> Intel/Sharp Extended Query Table at 0x010A
>> <6>Using buffer write method
>> <6>Using auto-unlock on power-up/resume
>> <5>cfi_cmdset_0001: Erase suspend on write enabled
>> <7>erase region 0: offset=0x0,size=0x8000,blocks=4
>> <7>erase region 1: offset=0x20000,size=0x20000,blocks=63
>> <6>TCP cubic registered
>> <6>NET: Registered protocol family 1
>> <6>RPC: Registered udp transport module.
>> <6>RPC: Registered tcp transport module.
>> <6>NET: Registered protocol family 33
>> <4>VFS: Mounted root (romfs filesystem) readonly.
>> <5>Freeing unused kernel memory: 52k freed (0x1a7000 - 0x1b3000)
>> <4>eth0: config: auto-negotiation on, 100FDX, 100HDX, 10FDX, 10HDX.
>>
> --
> ------------------------------------------------------------------------
> Greg Ungerer  --  Principal Engineer        EMAIL:     g...@snapgear.com
> SnapGear, a McAfee Company                  PHONE:       +61 7 3435 2888
> 825 Stanley St,                             FAX:         +61 7 3891 3630
> Woolloongabba, QLD, 4102, Australia         WEB: http://www.SnapGear.com
> _______________________________________________
> uClinux-dev mailing list
> uClinux-dev@uclinux.org
> http://mailman.uclinux.org/mailman/listinfo/uclinux-dev
> This message was resent by uclinux-dev@uclinux.org
> To unsubscribe see:
> http://mailman.uclinux.org/mailman/options/uclinux-dev
>
