> > memcpy already takes care of copying in the fastest way possible.
> 
> That's right, but we still have a call, a ret, and a conditional or two ;-)

I was going to say exactly that ;-)

> By inlining we can get rid of these things (especially if size is known up-front).
> Moreover, due to the JIT's dynamic nature it's possible to generate faster
> code at run-time. For example, the following (generic) memcpy is faster on
> pre-Pentium x86s (Intel syntax):
>   mov esi, $src    ; source pointer
>   mov ecx, $size   ; byte count
>   mov edi, $dest   ; destination pointer
>   shr ecx, 1       ; ecx = word count, CF = odd-byte flag
>   rep movsw        ; copy the 16-bit words
>   adc cl, cl       ; cl = leftover byte count (0 or 1, from CF)
>   rep movsb        ; copy the odd byte, if any
> 
> For const size==1 we could just mov al, [src]; mov [dest],al
> etc., etc.
> BTW, MS JIT uses similar optimizations for cpblk/initblk.
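For reference, the quoted generic copy boils down to roughly the following portable C. This is only a sketch of the same word-then-byte idea (the function name is made up; a real JIT would emit this as machine code, not call a C function):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirror of the quoted asm sequence: copy size/2 16-bit words
   (shr ecx,1; rep movsw), then pick up the leftover odd byte
   (adc cl,cl; rep movsb). */
static void copy_words_then_byte(void *dest, const void *src, size_t size)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t words = size >> 1;               /* shr ecx, 1 */

    for (size_t i = 0; i < words; i++) {    /* rep movsw */
        uint16_t w;
        memcpy(&w, s + 2 * i, sizeof w);    /* memcpy avoids unaligned UB */
        memcpy(d + 2 * i, &w, sizeof w);
    }
    if (size & 1)                           /* adc cl,cl; rep movsb */
        d[size - 1] = s[size - 1];
}
```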

Exactly.  The same size-quantum logic that lives in memmove() can be
inlined trivially by the JIT engine.
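In C terms, that kind of constant-size specialization might look like the sketch below. The function name and the switch structure are mine, purely for illustration; in a real JIT the size would be known while compiling, so the dispatch would happen at compile time and only the straight-line moves would be emitted:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration: when the cpblk size is a compile-time
   constant, emit direct loads/stores instead of a call to memcpy. */
static void copy_const_size(void *dest, const void *src, size_t size)
{
    switch (size) {
    case 1:  /* the "mov al, [src]; mov [dest], al" case */
        *(unsigned char *)dest = *(const unsigned char *)src;
        break;
    case 2: case 4: case 8:
        /* one word-sized move; compilers inline a fixed-size memcpy */
        memcpy(dest, src, size);
        break;
    default:
        memcpy(dest, src, size);  /* fall back to the generic copy */
    }
}
```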

However, how often does this happen?  Until a couple of days ago we did
not have cpblk, so my guess is that the performance impact might not be
immediately noticeable.

I would very much like to see this at some point.

Miguel.

_______________________________________________
Mono-list maillist  -  [EMAIL PROTECTED]
http://lists.ximian.com/mailman/listinfo/mono-list