On Sun, Feb 01, 2015 at 03:38:42AM +0100, Albert ARIBAUD wrote:
> Hello Przemyslaw,
> 
> On Wed, 28 Jan 2015 13:55:42 +0100, Przemyslaw Marczak
> <p.marc...@samsung.com> wrote:
> > For the ARM architecture, enabling CONFIG_USE_ARCH_MEMSET/MEMCPY
> > greatly increases memset/memcpy performance. This is possible
> > thanks to the ARM multiple-register instructions.
> > 
> > Unfortunately, the relocation is done without the cache enabled,
> > so it takes some time, but zeroing the BSS memory takes much
> > longer, especially for configs with big static buffers.
> > 
> > A quick test confirms that using the arch memcpy for relocation
> > brings no significant boot time improvement.
> > The same test confirms that enabling the arch memset for zeroing
> > the BSS reduces the boot time.
> > 
> > So this patch enables the arch memset for zeroing the BSS after
> > the relocation process. For ARM boards, this can be enabled
> > in board configs by defining: 'CONFIG_USE_ARCH_MEMSET'.
> 
> Since the issue is that zeroing is done one word at a time, could we
> not simply clear r3 as well as r2 (possibly even r4 and r5 too) and do
> a double (possibly quadruple) write loop? That would avoid calling a
> libc routine from nearly the only file in U-Boot where a C environment
> is not necessarily guaranteed.

I want to jump in here again.  Note that the arch memset/memcpy routines
are in asm and I don't believe they require a C environment.  Why don't we
simply use the asm versions for everyone and backport whatever we need
from the kernel to re-sync?  It's not a choice there, and it's a
performance win too.
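
For comparison, the unrolled loop Albert describes (clearing several
registers and storing them all per iteration) can be sketched in C as
below.  This is only an illustration of the idea, not code from the
patch or from U-Boot: the name zero_words_x4 and its start/end
parameters are hypothetical, and the real loop would be written in asm
with a multiple-register store.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a U-Boot function): zero the word range
 * [start, end), writing four words per loop iteration.
 */
static void zero_words_x4(uint32_t *start, uint32_t *end)
{
	uint32_t *p = start;

	/* Main loop: four stores per pass instead of one. */
	while ((size_t)(end - p) >= 4) {
		p[0] = 0;
		p[1] = 0;
		p[2] = 0;
		p[3] = 0;
		p += 4;
	}

	/* Tail: fewer than four words remain. */
	while (p < end)
		*p++ = 0;
}
```

The tail loop is there because nothing here assumes the BSS size is a
multiple of four words.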

-- 
Tom

_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot