Re: Performance gain through dereferencing?

Peter Schneider Wed, 16 Apr 2014 09:18:12 -0700

Hi David,

Sorry, I had included more information in an earlier draft which I edited out for brevity.


> You cannot learn useful timing
> information from a single run of a short
> test like this - there are far too many
> other factors that come into play.

I didn't mention that I have run it dozens of times. I know that blunt runtime measurements on a non-realtime system tend to be non-reproducible, and that they are inadequate for exact measurements. But the difference here is so large that the result is highly significant, in spite of the "amateurish" setup. The run I am showing here is typical. One of my four cores is surely idle at any given moment, and there is no I/O, so the variations are small.

> You cannot learn useful timing information from unoptimised code.

I beg to disagree. While in this case the problem (and indeed eventually the whole program ;-) ) goes away with optimization, that may not be the case in less trivial scenarios. And optimization or not -- I would always contend that i = n is **not slower** than *p = n. But it is. Something is wrong ;-).

So I'd like to direct our attention to the generated code and its performance (because such code conceivably could appear as the result of an optimized compiler run as well, in less trivial scenarios). What puzzles me is: How can it be that two instructions are slower than a very similar pair of instructions plus another one? (And that question is totally unrelated to optimization.)
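
For readers without the original test program at hand, here is a minimal sketch of the kind of comparison being discussed. It is not the program that was actually measured: the function names, the `volatile` qualifiers (used only so that the stores survive at any optimization level) and the best-of-several-runs timing are all assumptions made for illustration.

```c
#include <time.h>

/* Direct store: assign the loop counter to a local variable (i = n). */
long store_direct(long iters)
{
    volatile long i = 0;      /* volatile: the store is kept even with -O2 */
    for (long n = 0; n < iters; ++n)
        i = n;
    return i;
}

/* Indirect store: the same assignment through a pointer (*p = n). */
long store_indirect(long iters)
{
    volatile long i = 0;
    volatile long *p = &i;    /* p itself is an ordinary local pointer */
    for (long n = 0; n < iters; ++n)
        *p = n;
    return i;
}

/* Call fn `runs` times and return the smallest CPU time in seconds.
 * Returns -1.0 if runs is zero or negative. */
double best_of(long (*fn)(long), int runs, long iters)
{
    double best = -1.0;
    for (int r = 0; r < runs; ++r) {
        clock_t start = clock();
        (void)fn(iters);
        double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || secs < best)
            best = secs;
    }
    return best;
}
```

Calling `best_of(store_direct, 20, 100000000L)` and `best_of(store_indirect, 20, 100000000L)` from a `main` function and printing both values gives two numbers to compare. Taking the minimum over several runs is one common way to reduce scheduling noise; it does not replace looking at the generated assembly.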

> Otherwise the
> result could be nothing more than a quirk of the way caching worked out.

Could you explain how caching could play a role here if all variables and addresses are on the stack and are likely to be in the same memory page? (I'm not being sarcastic -- I may be missing something obvious).

I can imagine that somehow the processor architecture is better utilized by the faster version (e.g. because short inner loops pipeline worse or whatever). For what it's worth, the programs were running on an i7-3632QM.
