Re: Small memcpy optimization

Stefan Fritsch Sat, 10 Nov 2012 08:52:42 -0800

On Thursday 08 November 2012, Ted Unangst wrote:
> On Thu, Nov 01, 2012 at 22:43, Stefan Fritsch wrote:
> > On Tuesday 21 August 2012, Stefan Fritsch wrote:
> >> On x86, the xchg operation between reg and mem has an implicit
> >> lock prefix, i.e. it is a relatively expensive atomic
> >> operation. This is not needed here.
> > 
> > OKs, anyone?
> 
> What do other implementations do?  Benchmarks?  I'm sure
> someone somewhere has spent a lot of effort making the world's
> fastest memcpy.  Taking that work seems better than home grown
> fiddling.


Other implementations don't call bcopy, so that's not really
applicable here. To make bcopy faster, one would probably use
different implementations depending on CPU features (SSE3, ...). Or
maybe use gcc's __builtin_memcpy(), but one would need to check
whether there are any caveats when using that in kernel space.
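
For illustration only, here is a rough sketch of what selecting a copy
routine by CPU feature could look like. It is userland C, not kernel
code, and none of it reflects the actual bcopy implementation. The names
copy_fn, copy_bytes, copy_builtin, select_copy, copy_init and fast_copy
are made up for this example. __builtin_cpu_init() and
__builtin_cpu_supports() are GCC/clang builtins on x86, and copy_builtin
is only a stand-in for where a real SSE3-tuned routine would go.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical common signature for all copy routines. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t len);

/* Portable fallback: a plain byte loop, no CPU-specific assumptions. */
static void *copy_bytes(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len-- > 0)
		*d++ = *s++;
	return dst;
}

/* Let the compiler choose the expansion via its memcpy builtin. */
static void *copy_builtin(void *dst, const void *src, size_t len)
{
	return __builtin_memcpy(dst, src, len);
}

/* Pick a routine once, based on what the running CPU reports. */
static copy_fn select_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
	/* Assumption: x86 build with GCC or clang. */
	__builtin_cpu_init();
	if (__builtin_cpu_supports("sse3"))
		return copy_builtin;	/* stand-in for an SSE3 routine */
#endif
	return copy_bytes;
}

/* Resolved once at startup; callers then go through the pointer. */
static copy_fn fast_copy;

static void copy_init(void)
{
	fast_copy = select_copy();
	assert(fast_copy != NULL);
}
```

After copy_init() has run, callers would use fast_copy(dst, src, len) in
place of a direct call. Whether the indirect call pays off for small
copies is exactly the kind of thing that would need benchmarks.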
