Heinz:

> This results in much more robust code.

That's the right way to do it, with a D fallback.
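
A minimal sketch of such a fallback, assuming x86 inline asm guarded by the predefined D_InlineAsm_X86 / D_InlineAsm_X86_64 versions. The function addInts and the version name UseInlineAsm are made up for illustration; they are not taken from your code:

```d
// Define one umbrella version for both x86 flavours of DMD-style inline asm.
version (D_InlineAsm_X86)    version = UseInlineAsm;
version (D_InlineAsm_X86_64) version = UseInlineAsm;

// Hypothetical example function: adds two ints.
int addInts(int a, int b)
{
    version (UseInlineAsm)
    {
        // Copy the arguments into locals so the asm block reads them from the stack.
        int x = a;
        int y = b;
        asm
        {
            mov EAX, x;
            add EAX, y;
            mov x, EAX;
        }
        return x;
    }
    else
    {
        // Plain D fallback for compilers or targets without this asm syntax.
        return a + b;
    }
}

void main()
{
    assert(addInts(2, 3) == 5);
}
```

On a compiler or CPU where neither version is defined, only the plain D branch is compiled, so the code still builds and runs.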


> You are right too about the "load-load-load processing-processing-processing 
> store-store-store instead of a load-processing-store load-processing-store 
> load-processing-store" thing. I'll modify my code to this model, though it 
> will require moving some elements to the stack, but no big deal; I think this 
> won't hurt performance as it is designed to work this way.<

Moving elements to the stack may be slow, so you need to benchmark both versions.



>-Does ASM kill inlining for the function where the asm block is present or for 
>the whole compilation?<

It prevents inlining only of the function that contains the asm block.


>-In your opinion, how bad can it be if function inlining is not present?<

Inlining is an important optimization if your function does very little; 
otherwise it's not important, or it can even make the code slower. Your function 
contains only a few asm instructions, so inlining becomes important. This is why 
I suggest you benchmark your asm code against equivalent D code compiled 
with -O -release -inline. The D code with inlining may turn out to be faster. 
In my opinion, this is the most probable outcome here.
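
A rough sketch of such a benchmark, using std.datetime.stopwatch.benchmark from Phobos. The functions twiceD and twiceAsm, the loop sizes, and the sink variable are all invented for illustration; put your real asm function and its D equivalent in their place:

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

version (D_InlineAsm_X86)    version = UseInlineAsm;
version (D_InlineAsm_X86_64) version = UseInlineAsm;

// Plain D version: a candidate for inlining with -inline.
int twiceD(int x)
{
    return x + x;
}

// Asm version (hypothetical): falls back to D where this asm syntax is unavailable.
int twiceAsm(int x)
{
    version (UseInlineAsm)
    {
        int v = x;
        asm
        {
            mov EAX, v;
            add EAX, EAX;
            mov v, EAX;
        }
        return v;
    }
    else
    {
        return x + x;
    }
}

// Results are stored here so the work is not trivially discarded.
__gshared int sink;

void loopD()
{
    int s = 0;
    foreach (i; 0 .. 1_000)
        s += twiceD(i);
    sink = s;
}

void loopAsm()
{
    int s = 0;
    foreach (i; 0 .. 1_000)
        s += twiceAsm(i);
    sink = s;
}

void main()
{
    // Run each loop 10_000 times and print the total time of each.
    auto times = benchmark!(loopD, loopAsm)(10_000);
    writefln("plain D:    %s", times[0]);
    writefln("inline asm: %s", times[1]);
}
```

Compile it with -O -release -inline, and keep in mind that with a function this small the optimizer may simplify the plain D loop a lot, so time your real function rather than trusting a toy like this one.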

If you use the LDC compiler, there are two different ways (pragma(allow_inline) 
and inline asm expressions) to keep inlining even when you use asm code, so the 
situation is better there.

Bye,
bearophile
