> > memcpy already takes care of copying in the fastest way possible.
>
> That's right, but we still have a call, a ret, and a conditional or two ;-)
I was going to say exactly that ;-)

> By inlining we can get rid of these things (especially if size is known
> up-front).
> Moreover, due to the JIT's dynamic nature it's possible to generate faster
> code at run-time.
> For example, the following (generic) memcpy is faster on pre-Pentium x86s
> (Intel syntax):
>
>     mov esi, $src
>     mov ecx, $size
>     mov edi, $dest
>     shr ecx, 1
>     rep movsw
>     adc cl, cl
>     rep movsb
>
> For const size==1 we could just mov al, [src]; mov [dest], al
> etc. etc.
> BTW, the MS JIT uses similar optimizations for cpblk/initblk.

Exactly. The same logic that lives in memmove() for the data-size quantum
can be inlined by the JIT engine trivially.

However, how often does this happen? Until a couple of days ago we did not
have cpblk, so my guess is that the performance impact might not be
immediately noticeable when measured. I would very much like to see this at
some point.

Miguel.

_______________________________________________
Mono-list maillist  -  [EMAIL PROTECTED]
http://lists.ximian.com/mailman/listinfo/mono-list