Re: stride in slices

Timon Gehr via Digitalmars-d Tue, 05 Jun 2018 17:02:25 -0700

On 05.06.2018 21:05, DigitalDesigns wrote:

On Tuesday, 5 June 2018 at 18:46:41 UTC, Timon Gehr wrote:
On 05.06.2018 18:50, DigitalDesigns wrote:
With a for loop, it is pretty much a wrapper on internal cpu logic so it will be near as fast as possible.
This is not even close to being true for modern CPUs. There are a lot of architectural and micro-architectural details that affect performance but are not visible or accessible in your for loop. If you care about performance, you will need to test anyway, as even rather sophisticated models of CPU performance don't get everything right.
Those optimizations are not part of the instruction set so are irrelevant. They will occur with ranges too.
...

I was responding to claims that for loops are basically a wrapper on internal CPU logic and nearly as fast as possible. Both of those claims were wrong.

For loops HAVE a direct cpu semantic! Do you doubt this?
...

You'd have to define what that means. (E.g., Google currently shows no hits for "direct CPU semantics".)

CPUs do not have range semantics. Ranges are layers on top of compiler semantics... you act like they are equivalent, they are not!


I don't understand why you bring this up nor what you think it means.

The compiler takes a program and produces some machine code that has the right behavior. Performance is usually not formally specified. In terms of resulting behavior, code with explicit for loops and range-based code may have identical semantics. Which one executes faster depends on internal details of the compiler and the target architecture, and it may change over time, e.g. between compiler releases.
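
As a small illustration of the point about identical semantics, here is a sketch that sums a vector twice: once with an explicit index loop and once through a standard-library algorithm. It is written in C++ rather than D, with `std::accumulate` standing in for range-based code (an assumption, since the thread is about D ranges), and the names `sum_loop`, `sum_algorithm` and `check_equivalent` are hypothetical. Both functions return the same value for every input; which one runs faster is left to the compiler and the target.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Explicit index-based for loop.
long long sum_loop(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); i++) {
        total += v[i];
    }
    return total;
}

// Library-based version. std::accumulate is only a stand-in for
// range-style code here (assumption: the thread discusses D ranges).
long long sum_algorithm(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// Checks that both versions agree on one input.
void check_equivalent(const std::vector<int>& v) {
    assert(sum_loop(v) == sum_algorithm(v));
}
```

Calling `check_equivalent` on a handful of inputs confirms the behavior matches; it says nothing about speed, which has to be measured separately.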

All range semantics must go through the library code, then to the compiler, then to the CPU. For loops of all major systems languages go almost directly to CPU instructions.
for(int i = 0; i < N; i++)

translates into either increment and loop or jump instructions.
...

Sure, or whatever else the compiler decides to do. It might even be translated into a memcpy call. Even if you want to restrict yourself to use only for loops, my point stands. Write maintainable code by default and let the compiler do what it does. Then optimize further in those cases where the resulting code is actually too slow. Test for performance regressions.
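
To make the memcpy remark concrete, here is a minimal C++ sketch of a copy written as a plain for loop next to the same copy written as a direct `std::memcpy` call. The function names are hypothetical. An optimizing compiler is allowed to turn the first into something like the second when it can show the behavior is unchanged; whether it actually does depends on the compiler, its version, its flags, and what it can prove about the two pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Element-by-element copy written as a plain for loop.
// The compiler may or may not replace this with a library copy routine.
void copy_loop(int* dst, const int* src, std::size_t n) {
    for (std::size_t i = 0; i < n; i++) {
        dst[i] = src[i];
    }
}

// The same copy expressed directly as a memcpy call.
// Requires that the two buffers do not overlap.
void copy_memcpy(int* dst, const int* src, std::size_t n) {
    std::memcpy(dst, src, n * sizeof(int));
}

// Runs both copies from the same source and checks that the
// destinations a and b end up with identical bytes.
void check_copies_agree(const int* src, std::size_t n, int* a, int* b) {
    copy_loop(a, src, n);
    copy_memcpy(b, src, n);
    assert(std::memcmp(a, b, n * sizeof(int)) == 0);
}
```

For non-overlapping buffers the two functions are indistinguishable by their results, which is exactly why the generated machine code need not resemble the loop as written.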

There is absolutely no reason why any decent compiler would not use what the cpu has to offer. For loops are language semantics, Ranges are library semantics.


Not really. Also, irrelevant.

To pretend they are equivalent is wrong and no amount of justifying will make them the same.


Again, I don't think this point is part of this discussion.

I actually do not even know of any commercially viable CPU that exists without loop semantics.

What does it mean for a CPU to have "loop semantics"? CPUs typically have an instruction pointer register and possibly some built-in instructions to manipulate said instruction pointer. x86 has some built-in loop instructions, but I think they are just there for legacy support and not actually something you want to use in performant code.

I also know of no commercially viable compiler that does not wrap those instructions in a for loop (or while, or whatever) like syntax that almost maps directly to the CPU instructions.
...

The compiler takes your for loop and generates some machine code. I don't think there is a "commercially viable" compiler that does not sometimes do things that are not direct. And even then, there is no very simple mapping from CPU instructions to observed performance, so the entire point is a bit moot.
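
Since observed performance cannot be read off the instructions, the practical route is to measure. Here is a rough C++ sketch of that idea, under stated assumptions: `time_once`, `sum_index_loop`, `sum_accumulate` and `compare_timings` are hypothetical names, `std::accumulate` stands in for range-style code, and a single timed run like this is far too noisy to draw conclusions from.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: runs f() once and returns {result, elapsed nanoseconds}.
template <typename F>
auto time_once(F f) -> std::pair<decltype(f()), long long> {
    const auto start = std::chrono::steady_clock::now();
    auto result = f();
    const auto stop = std::chrono::steady_clock::now();
    const long long ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    return {result, ns};
}

// Candidate 1: explicit index-based for loop.
long long sum_index_loop(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); i++) {
        total += v[i];
    }
    return total;
}

// Candidate 2: standard-library algorithm (stand-in for range-style code).
long long sum_accumulate(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// Times both candidates on the same input and checks that they agree.
// Returns {loop nanoseconds, accumulate nanoseconds}.
std::pair<long long, long long> compare_timings(const std::vector<int>& v) {
    auto a = time_once([&v] { return sum_index_loop(v); });
    auto b = time_once([&v] { return sum_accumulate(v); });
    assert(a.first == b.first);
    return {a.second, b.second};
}
```

A usable benchmark would repeat each measurement many times, use an optimized build and realistic input sizes, and be rerun after compiler upgrades, since the ranking of the two candidates can change between releases.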

Also, it is often not necessary to be "as fast as possible". It is usually more helpful to figure out where the bottleneck is for your code and concentrate optimization effort there, which you can do more effectively if you can save time and effort for the remaining parts of your program by writing simple and obviously correct range-based code, which often will be fast as well.
It's also often not necessary to be "as slow as possible".

This seems to be quoting an imaginary person. My point is that to get even faster code, you need to spend effort and often get lower maintainability. This is not always a good trade-off, in particular if the optimization does not improve performance a lot and/or the code in question is not executed very often.

I'm not asking about generalities but specifics. It's great to make generalizations about how things should be but I would like to know how they are.


That's a bit unspecific.

Maybe in theory ranges could be more optimal than other semantics but theory never equals practice.


I don't know who this is addressed to. My point was entirely practical.
