On Thursday, 27 February 2020 at 08:52:09 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 04:44:56 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 03:58:15 UTC, Bruce Carneal wrote:

Maybe you talked about another implementation of decimalLength9?

Yes. It's one I wrote after I saw your post. Pseudo-code here:

auto d9_branchless(uint v) { return 1 + (v >= 10) + (v >= 100) ... }

Compiling the above with ldc for an x86 target yields a series of cmpl/seta instruction pairs in the function body, followed by a sum and a return. No branching.
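For reference, here is one way the elided "..." in that sketch could be filled out in D (my reading of the pattern, assuming the same 1..9 digit range as decimalLength9_0 below):

ubyte d9_branchless(const uint v)
{
    // Each comparison evaluates to 0 or 1; their sum, plus 1, is the digit count.
    return cast(ubyte)(1
        + (v >= 10)
        + (v >= 100)
        + (v >= 1_000)
        + (v >= 10_000)
        + (v >= 100_000)
        + (v >= 1_000_000)
        + (v >= 10_000_000)
        + (v >= 100_000_000));
}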

Let me know if the above is unclear or insufficient.

No thanks, it's crystal clear now, although I don't see the performance gain.
snip

ubyte decimalLength9_0(const uint v)
{
    if (v >= 100000000) { return 9; }
    if (v >= 10000000) { return 8; }
    if (v >= 1000000) { return 7; }
    if (v >= 100000) { return 6; }
    if (v >= 10000) { return 5; }
    if (v >= 1000) { return 4; }
    if (v >= 100) { return 3; }
    if (v >= 10) { return 2; }
    return 1;
}

snip

Could you share your benchmarking method, maybe?

OK. I'm working with equi-probable digit counts; you're working with equi-probable values. The actual use case may be either of these or, more likely, a third, as-yet-unspecified distribution.

The predictability of the test input sequence clearly influences the performance of the branching implementations. Uniform random input is highly predictable with respect to digit count (there are *many* more high-digit numbers than low-digit numbers).
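To put a rough number on that (assuming, purely for illustration, values uniform over [0, 10^9), the range decimalLength9 handles): 90% of such values have 9 digits, 9% have 8, and only about one in a hundred million has a single digit. A quick D check:

// Digit-count histogram for uniform random uints in [0, 10^9).
import std.random : uniform;
import std.stdio : writefln;

void main()
{
    enum samples = 10_000_000;
    ulong[10] counts;
    foreach (i; 0 .. samples)
    {
        uint v = uniform(0u, 1_000_000_000u);
        ubyte d = 1;
        while (v >= 10) { v /= 10; ++d; } // naive digit count, for clarity
        counts[d]++;
    }
    // Expect roughly 90% nine-digit values, 9% eight-digit, and so on,
    // so the first branch of a branching implementation predicts very well.
    foreach (d; 1 .. 10)
        writefln("%s digits: %8.4f%%", d, 100.0 * counts[d] / samples);
}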

Take the implementation above as an example. If, as in the random number (equi-probable value) scenario, your inputs skew *strongly* toward a higher number of digits, then the code should perform very well.

If you believe the function will see uniform random values as input, then testing with uniform random inputs is the way to go. If you believe some other digit distribution will be seen, then you should alter your inputs to match. (Testing with uniform random inputs in addition to the supposed target distribution is almost always a good idea for sensitivity analysis.)

My testing methodology: To obtain an equi-digit distribution, I fill an array with N 1-digit values, followed by N 2-digit values, ... followed by N 9-digit values. I use N == 1000 and loop over the total array until I've presented about 1 billion values to the function under test.

I test against that equi-digit array before shuffling (this favors branching implementations) and then again after shuffling (this favors branchless).
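For concreteness, a minimal sketch of that methodology in D (not my actual harness; buildEquiDigitArray, bench, and the constants are illustrative, and it reuses decimalLength9_0 from above):

import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : randomShuffle, uniform;
import std.stdio : writefln;

enum N = 1000; // values per digit count

// N 1-digit values, then N 2-digit values, ..., then N 9-digit values.
uint[] buildEquiDigitArray()
{
    uint[] a;
    a.reserve(9 * N);
    uint lo = 1;
    foreach (digits; 1 .. 10)
    {
        immutable uint hi = (digits == 9) ? 999_999_999 : lo * 10 - 1;
        foreach (i; 0 .. N)
            a ~= uniform!"[]"(lo, hi);
        lo *= 10;
    }
    return a;
}

void bench(string label, const uint[] a, ubyte function(const uint) f)
{
    enum totalCalls = 1_000_000_000; // ~1 billion presentations
    ulong sink; // keeps the optimizer from discarding the calls
    auto sw = StopWatch(AutoStart.yes);
    foreach (rep; 0 .. totalCalls / a.length)
        foreach (v; a)
            sink += f(v);
    writefln("%s: %s ms (checksum %s)", label, sw.peek.total!"msecs", sink);
}

void main()
{
    auto a = buildEquiDigitArray();                      // grouped by digit count
    bench("equi-digit, ordered ", a, &decimalLength9_0); // favors branching
    randomShuffle(a);
    bench("equi-digit, shuffled", a, &decimalLength9_0); // favors branchless
}

The same harness can be handed &d9_branchless (or any other ubyte function(const uint)) for comparison.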

If you wish to work with equi-digit distributions, I'd prefer that you implement the functionality anew. It's a small duplicated effort, but it will help guard against my having screwed up even that simple task (less than 6 LOC, IIRC).

I will post my code if there is any meaningful difference in your subsequent results.
