On 05.06.2018 21:05, DigitalDesigns wrote:
On Tuesday, 5 June 2018 at 18:46:41 UTC, Timon Gehr wrote:
On 05.06.2018 18:50, DigitalDesigns wrote:
A for loop is pretty much a wrapper on internal CPU logic, so it will be nearly as fast as possible.

This is not even close to being true for modern CPUs. There are a lot of architectural and micro-architectural details that affect performance but are not visible or accessible in your for loop. If you care about performance, you will need to test anyway, as even rather sophisticated models of CPU performance don't get everything right.

Those optimizations are not part of the instruction set so are irrelevant. They will occur with ranges too.
...

I was responding to claims that for loops are basically a wrapper on internal CPU logic and nearly as fast as possible. Both of those claims were wrong.

For loops HAVE a direct cpu semantic! Do you doubt this?
...

You'd have to define what that means. (E.g., Google currently shows no hits for "direct CPU semantics".)


CPUs do not have range semantics. Ranges are layers on top of compiler semantics... you act like they are equivalent; they are not!

I don't understand why you bring this up nor what you think it means.

The compiler takes a program and produces some machine code that has the right behavior. Performance is usually not formally specified. In terms of resulting behavior, code with explicit for loops and range-based code may have identical semantics. Which one executes faster depends on internal details of the compiler and the target architecture, and it may change over time, e.g. between compiler releases.
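To illustrate the point about identical semantics, here is a minimal sketch in C++ (chosen only for illustration; the function names are hypothetical, and an iterator-based standard algorithm stands in for range-based code). Both functions compute the same result; which one runs faster is not determined by how they are written and would have to be measured for a given compiler and target.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Explicit index-based for loop.
long long sum_loop(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); i++) {
        total += v[i];
    }
    return total;
}

// Library-based version with the same observable behavior.
long long sum_algorithm(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// Hypothetical helper: checks that both versions agree on an input.
void check_equivalent(const std::vector<int>& v) {
    assert(sum_loop(v) == sum_algorithm(v));
}
```

An optimizing compiler is free to generate the same machine code for both functions, or different code for either of them between releases.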

All range semantics must go through the library code, then to the compiler, then to the CPU. For loops in all major systems languages go almost directly to CPU instructions.

for(int i = 0; i < N; i++)

translates into either increment-and-loop or jump instructions.
...

Sure, or whatever else the compiler decides to do. It might even be translated into a memcpy call. Even if you want to restrict yourself to use only for loops, my point stands. Write maintainable code by default and let the compiler do what it does. Then optimize further in those cases where the resulting code is actually too slow. Test for performance regressions.
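As a sketch of the memcpy remark (C++ for illustration; the function names are hypothetical), the two functions below have the same effect on non-overlapping buffers. An optimizing compiler may recognize the element-by-element loop as a block copy and emit a library call instead of a literal increment-and-jump sequence. Whether it does depends on the compiler, its flags, and what it can prove about aliasing, so this is an assumption to verify by inspecting the generated code, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Element-by-element copy written as a plain for loop.
void copy_loop(const std::vector<int>& src, std::vector<int>& dst) {
    assert(dst.size() >= src.size());
    for (std::size_t i = 0; i < src.size(); i++) {
        dst[i] = src[i];
    }
}

// The same copy expressed as an explicit memcpy call.
void copy_memcpy(const std::vector<int>& src, std::vector<int>& dst) {
    assert(dst.size() >= src.size());
    if (src.empty()) {
        return;  // avoid passing a possibly null data() pointer to memcpy
    }
    std::memcpy(dst.data(), src.data(), src.size() * sizeof(int));
}
```

Either way, the source-level for loop does not fix which instructions end up in the binary.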

There is absolutely no reason why any decent compiler would not use what the CPU has to offer. For loops are language semantics; ranges are library semantics.

Not really. Also, irrelevant.

To pretend they are equivalent is wrong and no amount of justifying will make them the same.

Again, I don't think this point is part of this discussion.

I actually do not know of any commercially viable CPU that exists without loop semantics.

What does it mean for a CPU to have "loop semantics"? CPUs typically have an instruction pointer register and possibly some built-in instructions to manipulate said instruction pointer. x86 has some built-in loop instructions, but I think they are just there for legacy support and not actually something you want to use in performant code.

I also know of no commercially viable compiler that does not wrap those instructions in a for-loop-like (or while, or whatever) syntax that maps almost directly to the CPU instructions.
...

The compiler takes your for loop and generates some machine code. I don't think there is a "commercially viable" compiler that does not sometimes do things that are not direct. And even then, there is no very simple mapping from CPU instructions to observed performance, so the entire point is a bit moot.

Also, it is often not necessary to be "as fast as possible". It is usually more helpful to figure out where the bottleneck is for your code and concentrate optimization effort there, which you can do more effectively if you can save time and effort for the remaining parts of your program by writing simple and obviously correct range-based code, which often will be fast as well.

It's also often not necessary to be "as slow as possible".

This seems to be quoting an imaginary person. My point is that to get even faster code, you need to spend effort and often get lower maintainability. This is not always a good trade-off, in particular if the optimization does not improve performance a lot and/or the code in question is not executed very often.
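One way to decide whether such a trade-off pays off is to measure both versions. The following is a minimal timing sketch in C++ (for illustration only; `best_time_ms` is a hypothetical helper, not part of any library). It runs a callable several times and reports the fastest wall-clock time, which is a rough measure at best.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs f `repeats` times and returns the fastest
// observed wall-clock time in milliseconds.
template <typename F>
double best_time_ms(F f, int repeats) {
    assert(repeats > 0);
    double best = -1.0;
    for (int r = 0; r < repeats; r++) {
        auto start = std::chrono::steady_clock::now();
        f();
        auto stop = std::chrono::steady_clock::now();
        double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) {
            best = ms;
        }
    }
    return best;
}
```

When timing real code, the work inside `f` should produce a result that is actually used afterwards; otherwise the compiler may remove the work and the measurement becomes meaningless.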

I'm not asking about generalities but specifics. It's great to make generalizations about how things should be, but I would like to know how they are.

That's a bit unspecific.

Maybe in theory ranges could be more optimal than other semantics, but theory never equals practice.


I don't know who this is addressed to. My point was entirely practical.
