I applied the new armasm memcpy patch to my DirectFB-1.1.1 and ran it on my
DaVinci-based system using gcc-3.4 and the real libc. It is now faster than
the libc memcpy. Nice job; the original patch actually slowed things down.
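
For anyone who wants to repeat this kind of comparison, here is a minimal
sketch of the test shape Niels describes further down: one cold, unmeasured
run, then a single timed copy of a large region, with an optional source
misalignment since alignment is the unpredictable part. This is not the test
program from the patch. The names copy_fn and bench_copy are made up for
illustration, and it assumes a POSIX system with clock_gettime().

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by every routine under test. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/*
 * Time one copy of `len` bytes, with the source offset by `misalign`
 * bytes from the start of its allocation.
 * Returns elapsed seconds (>= 0), or -1.0 on allocation failure or
 * if the copied data does not match the source.
 */
static double bench_copy(copy_fn fn, size_t len, size_t misalign)
{
    unsigned char *src = malloc(len + misalign);
    unsigned char *dst = malloc(len);
    struct timespec t0, t1;
    double secs;
    int ok;

    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }

    memset(src, 0xA5, len + misalign);
    memset(dst, 0, len);

    /* Cold, unmeasured run to rule out previous cache state. */
    fn(dst, src + misalign, len);

    /* Single timed pass over the whole region, no repeats. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(dst, src + misalign, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    secs = (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

    /* Check the result so a broken routine cannot "win". */
    ok = memcmp(dst, src + misalign, len) == 0;

    free(src);
    free(dst);
    return ok ? secs : -1.0;
}
```

Calling bench_copy(memcpy, 8u << 20, 0) and then again with misalign set to 1,
2 or 3 gives one number per alignment for libc; the assembler routines would
be passed in the same way. An optimising compiler may rearrange a benchmark
this simple, so treat the numbers as a rough guide only.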

Thanks,
Craig


On Wednesday 25 March 2009 2:50:35 am vince wrote:
> Niels,
>
> Here is a new version of the patch with the second version of memcpy and
> a conditional to remove big-endian.
>
> Let me know if you have any trouble with it.
>
> Regards,
>
> Vince
>
> On Tue, 2009-03-24 at 16:36 +0100, Niels Roest wrote:
> > Hi John,
> > thanks for the comments,
> > just want to mention 1 or 2 things too.
> >
> > The testing routines do have a single cold, unmeasured, run first to
> > rule out previous cache state influence.
> >
> > The test itself is in fact really simple - a continuous copy of a large
> > region. So no repeats. This does focus on the use case that is most
> > obvious for DirectFB, namely copying chunks and lines of graphics
> > between surfaces, which will normally lead to cache misses anyway. I am
> > most concerned about alignment, since this is really unpredictable.
> >
> > I am not sure if we will benefit much from shuffling the code or using
> > different memory regions; you have to remember that the testing routines
> > produce a single score only, so these will need to be fine tuned a lot,
> > and we may even need to revert to multiple memcpy routines which are
> > optimised for multiple use cases. This might be an interesting approach,
> > it is one I will follow if performance measurements show that we can
> > expect a proper benefit from this - forgetting that DirectFB is mainly
> > about hardware acceleration anyway.
> >
> > For me I am very happy with the changes that Vince made, thanks Vince,
> > and if I have a BE/LE lock, I will include the patch.
> >
> > Greets
> > Niels
> >
> > John Williams wrote:
> > > Hi Vince,
> > >
> > > On Wed, Mar 25, 2009 at 12:57 AM, vince <vi...@bluush.com> wrote:
> > > >> I've changed my benchmark to invalidate the cache before every test.
> > > >> My results are the same. Attached is my test program.
> > >
> > > No worries - just wanted to make sure we weren't missing the obvious!
> > >
> > > Might also be worth shuffling the sequencing of the tests (armasm,
> > > armasm2, libc), see if that has any impact.  I'm not intimate with ARM
> > > cache details, but with a write-back cache you could be stalling on
> > > cacheline evictions later in the test.
> > >
> > > Another safety would be to perform the tests in different memory
> > > regions, with a complete cache flush and invalidate between each run.
> > >
> > > > Not saying there's anything wrong with your code, just know it's easy
> > > to get false results from simple benchmark code. Memory tests are
> > > another one where the obvious approach is often wrong.
> > >
> > > Cheers,
> > >
> > > John
> > > _______________________________________________
> > > directfb-dev mailing list
> > > directfb-dev@directfb.org
> > > http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev

-- 
Craig Matsuura - Principal Engineer
Control4
11734 South Election Road - Suite 200
Salt Lake City, UT 84020-6432
PH: 801-523-3161
FX: 801-523-3199
