Which config.h are you referring to in armasm_memcpy.S?

Craig

On Wednesday 25 March 2009 12:55:07 pm Craig Matsuura wrote:
> I took the new armasm memcpy patch and applied it to my DirectFB-1.1.1 and
> ran it on my davinci based system using gcc-3.4 and real libc.  And it is
> now faster than the libc. Nice job, the original patch actually slowed
> things down.
>
> Thanks,
> Craig
>
> On Wednesday 25 March 2009 2:50:35 am vince wrote:
> > Niels,
> >
> > Here is a new version of the patch with the second version of memcpy and
> > a conditional to remove big-endian.
> >
> > Let me know if you have any trouble with it.
> >
> > Regards,
> >
> > Vince
> >
> > On Tue, 2009-03-24 at 16:36 +0100, Niels Roest wrote:
> > > Hi John,
> > > thanks for the comments,
> > > just want to mention 1 or 2 things too.
> > >
> > > The testing routines do have a single cold, unmeasured, run first to
> > > rule out previous cache state influence.
> > >
> > > The test itself is in fact really simple - a continuous copy of a large
> > > region. So no repeats. This does focus on the use case that is most
> > > obvious for DirectFB, namely copying chunks and lines of graphics
> > > between surfaces, which will normally lead to cache misses anyway. I am
> > > most concerned about alignment, since this is really unpredictable.
> > >
> > > I am not sure if we will benefit much from shuffling the code or using
> > > different memory regions; you have to remember that the testing
> > > routines produce a single score only, so these will need to be fine
> > > tuned a lot, and we may even need to revert to multiple memcpy routines
> > > which are optimised for multiple use cases. This might be an
> > > interesting approach, it is one I will follow if performance
> > > measurements show that we can expect a proper benefit from this -
> > > forgetting that DirectFB is mainly about hardware acceleration anyway.
> > >
> > > For me I am very happy with the changes that Vince made, thanks Vince,
> > > and if I have a BE/LE lock, I will include the patch.
> > >
> > > Greets
> > > Niels
> > >
> > > John Williams wrote:
> > > > Hi Vince,
> > > >
> > > > On Wed, Mar 25, 2009 at 12:57 AM, vince <vi...@bluush.com> wrote:
> > > > >> I've changed my benchmark to invalidate the cache before every test.
> > > > >> My results are the same. Attached is my test program.
> > > >
> > > > No worries - just wanted to make sure we weren't missing the obvious!
> > > >
> > > > Might also be worth shuffling the sequencing of the tests (armasm,
> > > > armasm2, libc), see if that has any impact.  I'm not intimate with
> > > > ARM cache details, but with a write-back cache you could be stalling
> > > > on cacheline evictions later in the test.
> > > >
> > > > Another safety would be to perform the tests in different memory
> > > > regions, with a complete cache flush and invalidate between each run.
> > > >
> > > > > Not saying there's anything wrong with your code, just know it's easy
> > > > > to get false results from simple benchmark code. Memory tests are
> > > > > another one where the obvious approach is often wrong.
> > > >
> > > > Cheers,
> > > >
> > > > John
> > > > _______________________________________________
> > > > directfb-dev mailing list
> > > > directfb-dev@directfb.org
> > > > http://mail.directfb.org/cgi-bin/mailman/listinfo/directfb-dev
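
For reference, the measurement approach described in the quoted thread (one cold, unmeasured run, then a single timed copy of a large region, with the alignment under the caller's control) could be sketched roughly as below. This is not Vince's attached test program. The names time_copy and copy_fn are made up for illustration, and the timing assumes a POSIX system that provides clock_gettime with CLOCK_MONOTONIC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime; assumes a POSIX libc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Common signature for every routine under test: libc memcpy, or an
 * assembly variant linked in separately (hypothetical here). */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Time one copy of `len` bytes with the source and destination shifted by
 * the given byte offsets, so unaligned cases can be measured too.
 * Returns elapsed seconds, or -1.0 on allocation failure or a bad copy. */
static double time_copy(copy_fn fn, size_t len, size_t src_off, size_t dst_off)
{
    unsigned char *src = malloc(len + src_off);
    unsigned char *dst = malloc(len + dst_off);
    struct timespec t0, t1;
    double elapsed = -1.0;

    assert(fn != NULL);

    if (src && dst) {
        memset(src, 0xA5, len + src_off);

        /* Cold, unmeasured run: touches both buffers so the timed run
         * does not depend on whatever cache state came before. */
        fn(dst + dst_off, src + src_off, len);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        fn(dst + dst_off, src + src_off, len);   /* measured run */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        elapsed = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;

        /* A fast routine that copies wrongly is of no use. */
        if (memcmp(dst + dst_off, src + src_off, len) != 0)
            elapsed = -1.0;
    }

    free(src);
    free(dst);
    return elapsed;
}
```

Calling time_copy(memcpy, len, s, d) for s and d in 0..3, and then again with each alternative routine in a different order, would cover both the alignment concern and the suggestion to shuffle the test sequence. Each call allocates fresh buffers, which only loosely approximates "different memory regions" and does not flush or invalidate the cache.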

-- 
Craig Matsuura - Principal Engineer
Control4
11734 South Election Road - Suite 200
Salt Lake City, UT 84020-6432
PH: 801-523-3161
FX: 801-523-3199