Re: [coreboot] fam10/h8dmr: extreme slowness in CBFS memset / memcpy

2009-07-21 Thread Myles Watson
 Following up on this - Patrick helped in IRC this evening, and we came to the
 conclusion that it's probably *not* an MTRR issue, since we figured out the
 code seems to set MTRRs properly.
I wonder what else could cause it to be so slow?  It's especially surprising
for the memset, which is pretty simple.  Does it use movnti for that?

 
 We found out after adding an extra MTRR over the flash chip, which did not
 change anything.

Did you disable and re-enable the cache so that the settings take effect?

I guess I would:
1. Add some little benchmark loops reading/writing different areas
   a. read ROM and time it
   b. read from RAM (cached area) and time it
   c. read from RAM (non-cached area)
   d. write to RAM (cached area)
   ...
2. disable MTRRs to see if it would go even slower.
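
A minimal sketch of such benchmark loops, written as hosted C so it can be
tried outside the firmware first. The names read_ticks, time_read and
time_write are hypothetical and not part of coreboot, and clock() only stands
in for whatever tick source the firmware has (on this platform one would more
likely read the TSC). The region pointers for ROM, cached RAM and uncached RAM
are left to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical tick source for this hosted sketch. In firmware this would
 * be replaced by a TSC read or another free-running counter. */
static uint64_t read_ticks(void)
{
	return (uint64_t)clock();
}

/* Read every byte of [p, p + len) and return the elapsed ticks.
 * The checksum is stored through 'sum' so the loop cannot be optimized
 * away; 'volatile' forces one load per byte. */
static uint64_t time_read(const volatile uint8_t *p, size_t len, uint32_t *sum)
{
	uint64_t start = read_ticks();
	uint32_t s = 0;

	for (size_t i = 0; i < len; i++)
		s += p[i];

	*sum = s;
	return read_ticks() - start;
}

/* Write 'value' to every byte of [p, p + len) and return the elapsed ticks.
 * 'volatile' forces one store per byte, like the byte-wise memset. */
static uint64_t time_write(volatile uint8_t *p, size_t len, uint8_t value)
{
	uint64_t start = read_ticks();

	for (size_t i = 0; i < len; i++)
		p[i] = value;

	return read_ticks() - start;
}
```

Calling time_read on a window of the flash, then on a buffer in cached RAM and
on one in a non-cached range, and printing the three tick counts side by side
should show which kind of access is the slow one. Only read from the ROM
window; time_write is meant for RAM.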

Sorry that's not much help, but I don't have a fam10 box to try things on.

Thanks,
Myles




-- 
coreboot mailing list: coreboot@coreboot.org
http://www.coreboot.org/mailman/listinfo/coreboot


Re: [coreboot] fam10/h8dmr: extreme slowness in CBFS memset / memcpy

2009-07-21 Thread Ward Vandewege
On Tue, Jul 21, 2009 at 06:25:38AM -0600, Myles Watson wrote:
  Following up on this - Patrick helped in IRC this evening, and we came to the
  conclusion that it's probably *not* an MTRR issue, since we figured out the
  code seems to set MTRRs properly.
 I wonder what else could cause it to be so slow?  It's especially surprising
 for the memset, which is pretty simple.  Does it use movnti for that?

It's actually just a plain byte-by-byte assignment in C, see
src/lib/memset.c.

  We found out after adding an extra MTRR over the flash chip, which did not
  change anything.
 
 Did you disable and re-enable the cache so that the settings take effect?

Hmm, we tried adding it here

  src/cpu/amd/car/clear_init_ram.c

in function set_init_ram_access, which already sets an mtrr.

This gets called just before CAR is disabled I think.

And then we found the mtrr set in 

  src/cpu/amd/car/cache_as_ram.inc

which looks like it *should* do the right thing. But that's assembler of
course. I don't suppose there's a way to print debug info from right there?

 I guess I would:
 1. Add some little benchmark loops reading/writing different areas
   a. read ROM and time it
   b. read from RAM (cached area) and time it
   c. read from RAM (non-cached area)
   d. write to RAM (cached area)
   ...
 2. disable MTRRs to see if it would go even slower.
 
 Sorry that's not much help, but I don't have a fam10 box to try things on.

Thanks - will see if I can try some of these things.

Thanks,
Ward.

-- 
Ward Vandewege w...@gnu.org



Re: [coreboot] fam10/h8dmr: extreme slowness in CBFS memset / memcpy

2009-07-21 Thread Myles Watson

 On Tue, Jul 21, 2009 at 06:25:38AM -0600, Myles Watson wrote:
   Following up on this - Patrick helped in IRC this evening, and we came to the
   conclusion that it's probably *not* an MTRR issue, since we figured out the
   code seems to set MTRRs properly.
  I wonder what else could cause it to be so slow?  It's especially surprising
  for the memset, which is pretty simple.  Does it use movnti for that?
 
 It's actually just a plain byte-by-byte assignment in c, see
 src/lib/memset.c.
It would be interesting to see whether it gets 4x faster if you make it write
4 bytes at a time.
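
A rough sketch of what a 4-bytes-at-a-time variant could look like. The name
memset_words is hypothetical; this is not the code in src/lib/memset.c, and
whether it would actually be 4x faster on the affected board is exactly the
open question. It assumes the build tolerates storing through a cast 32-bit
pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical word-wise memset; same contract as memset(). */
static void *memset_words(void *s, int c, size_t n)
{
	uint8_t *p = s;
	uint8_t b = (uint8_t)c;

	/* Store single bytes until the pointer is 4-byte aligned. */
	while (n > 0 && ((uintptr_t)p & 3) != 0) {
		*p++ = b;
		n--;
	}

	/* Replicate the byte into all four lanes of a 32-bit word and
	 * store whole words while at least four bytes remain. */
	uint32_t w = (uint32_t)b * 0x01010101u;
	uint32_t *wp = (uint32_t *)p;
	while (n >= 4) {
		*wp++ = w;
		n -= 4;
	}

	/* Store the remaining zero to three bytes. */
	p = (uint8_t *)wp;
	while (n-- > 0)
		*p++ = b;

	return s;
}
```

If the time is dominated by a fixed cost per memory access, as with uncached
accesses, the word-wise version should be close to 4x faster. If it barely
changes, the cost is probably somewhere else.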

   We found out after adding an extra MTRR over the flash chip, which did not
   change anything.
 
  Did you disable and re-enable the cache so that the settings take effect?
 
 Hmm, we tried adding it here
 
   src/cpu/amd/car/clear_init_ram.c
 
 in function set_init_ram_access, which already sets an mtrr.
I always wondered about that one.

The thing that makes it hard to debug is that it will read back correctly
even if it hasn't taken effect.

 Thanks - will see if I can try some of these things.
Good luck,
Myles




Re: [coreboot] fam10/h8dmr: extreme slowness in CBFS memset / memcpy

2009-07-20 Thread Ward Vandewege
Following up on this - Patrick helped in IRC this evening, and we came to the
conclusion that it's probably *not* an MTRR issue, since we figured out the
code seems to set MTRRs properly.

We found out after adding an extra MTRR over the flash chip, which did not
change anything.
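
For reference, a sketch of how the base/mask register pair for one variable
MTRR is formed, following the x86 variable-range MTRR layout (memory type in
the low bits of the base register, valid bit at bit 11 of the mask register).
The helper name make_var_mtrr and the struct are hypothetical and not taken
from the coreboot tree. The flash size and the physical address width are
assumptions that the caller has to supply; the thread does not state them.

```c
#include <stdint.h>

#define MTRR_TYPE_UNCACHEABLE 0
#define MTRR_TYPE_WRPROT      5
#define MTRR_TYPE_WRBACK      6
#define MTRR_PHYS_MASK_VALID  (1ull << 11)

/* Values for one variable MTRR pair (PHYSBASEn / PHYSMASKn). */
struct var_mtrr {
	uint64_t base;
	uint64_t mask;
};

/* Hypothetical helper: compute the register values covering
 * [base, base + size) with the given memory type.
 * 'size' must be a power of two of at least 4 KiB and 'base' must be
 * aligned to 'size'. 'addr_bits' is the physical address width.
 * Returns 0 on success, -1 if the range cannot be expressed. */
static int make_var_mtrr(uint64_t base, uint64_t size, unsigned int type,
			 unsigned int addr_bits, struct var_mtrr *out)
{
	if (size < 4096 || (size & (size - 1)) != 0)
		return -1;
	if ((base & (size - 1)) != 0)
		return -1;

	uint64_t addr_mask = (addr_bits >= 64) ? ~0ull
					       : ((1ull << addr_bits) - 1);

	out->base = base | type;
	out->mask = (~(size - 1) & addr_mask) | MTRR_PHYS_MASK_VALID;
	return 0;
}
```

Comparing values computed this way against what the code writes only confirms
that the registers hold the intended contents. It does not show whether the
setting has taken effect.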

The system boots fairly normally after the slowdowns, and appears to work
normally. It sets three MTRRs further in the bootup process:

  reg00: base=0x (   0MB), size=32768MB: write-back, count=1
  reg01: base=0x8 (32768MB), size= 512MB: write-back, count=1
  reg02: base=0xe000 (3584MB), size= 512MB: uncachable, count=1

Any thoughts on something else I should look at to debug this?

Thanks,
Ward.

On Sun, Jul 19, 2009 at 09:23:21PM -0400, Ward Vandewege wrote:
 Hi all,
 
 I'm working on a fam10 tree for supermicro h8dmr. I'm using CBFS.
 
 It boots, but I'm struggling with some extreme slowness during boot. In
 particular, the memset function in src/lib/memset.c takes *minutes* to clear
 1.2MB of ram. A little further CBFS does a memcpy which takes another 20 or
 30 seconds:
 
   Stage: load fallback/coreboot_ram @ 2097152/1245184 bytes, enter @ 20
 
   LOONG pause
 
   Stage: after memset
   on-stack variables at 00ffbec8 and 00ffbed4
   cbfs_decompress: algo: 0
   cbfs_decompress: uncompressed
 
   another lengthy pause
 
   cbfs_decompress: memcpy from 0xffbecc to 0xffbed0 for 0x2d304 bytes done
   Stage: done loading.
 
 The first, lengthy pause is new; it is apparently caused by something
 introduced between r4368 and r4440.
 
 The second pause was there already in r4368.
 
 I understand this may have something to do with MTRRs - looking at the logs
 it seems MTRRs are not set up until well after CBFS has dealt with
 coreboot_ram. 
 
 This box has 32GB of ram, in case that makes a difference.
 
 Any suggestions?
 
 Thanks,
 Ward.
 
-- 
Ward Vandewege w...@gnu.org



[coreboot] fam10/h8dmr: extreme slowness in CBFS memset / memcpy

2009-07-19 Thread Ward Vandewege
Hi all,

I'm working on a fam10 tree for supermicro h8dmr. I'm using CBFS.

It boots, but I'm struggling with some extreme slowness during boot. In
particular, the memset function in src/lib/memset.c takes *minutes* to clear
1.2MB of ram. A little further CBFS does a memcpy which takes another 20 or
30 seconds:

  Stage: load fallback/coreboot_ram @ 2097152/1245184 bytes, enter @ 20

  LOONG pause

  Stage: after memset
  on-stack variables at 00ffbec8 and 00ffbed4
  cbfs_decompress: algo: 0
  cbfs_decompress: uncompressed

  another lengthy pause

  cbfs_decompress: memcpy from 0xffbecc to 0xffbed0 for 0x2d304 bytes done
  Stage: done loading.

The first, lengthy pause is new; it is apparently caused by something
introduced between r4368 and r4440.

The second pause was there already in r4368.

I understand this may have something to do with MTRRs - looking at the logs
it seems MTRRs are not set up until well after CBFS has dealt with
coreboot_ram. 

This box has 32GB of ram, in case that makes a difference.

Any suggestions?

Thanks,
Ward.

-- 
Ward Vandewege w...@gnu.org
