On 10/26/2015 08:35 AM, Don wrote:
On Sunday, 25 October 2015 at 19:37:32 UTC, Iakh wrote:
Here is my implementation of SIMD find. The function returns the index of
a ubyte in a static 16-byte array with unique values.

[snip]

You need to be very careful with doing benchmarks on tiny test cases,
they can be very misleading.

Be aware that the speed of bsf() and bsr() is very, very strongly
processor dependent. On some machines it is utterly pathetic: e.g. on AMD
K7, BSR is 23 micro-operations; on the original Pentium it was up to 73 (!);
even on AMD Bobcat it is 11 micro-ops; but on recent Intel it is one
micro-op. This factor of 73 can totally screw up your performance
comparisons.

Just because it is a single machine instruction does not mean it is fast.

One other note: don't compare with binary search; it's not an appropriate baseline. Use it only if you have implemented a SIMD-based binary search.

Good baselines: std.find, memchr, a naive version with pointers (no bounds checking).
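
For concreteness, here is a minimal sketch of two of those baselines, written in C rather than D because memchr is a C library function. The names find_naive and find_memchr are hypothetical, picked only for this illustration; they are not from the original post. Both return the index of the byte, or -1 if it is absent.

```c
#include <stddef.h>
#include <string.h>

/* Baseline: naive pointer scan. The only check is the end pointer;
   there is no per-element bounds checking.
   Returns the index of needle in hay[0..n), or -1 if it is absent. */
static ptrdiff_t find_naive(const unsigned char *hay, size_t n,
                            unsigned char needle)
{
    const unsigned char *p = hay;
    const unsigned char *end = hay + n;

    while (p != end && *p != needle)
        ++p;

    return p != end ? p - hay : -1;
}

/* Baseline: the C library's memchr, which returns a pointer to the
   first match or NULL. Converted here to the same index convention. */
static ptrdiff_t find_memchr(const unsigned char *hay, size_t n,
                             unsigned char needle)
{
    const unsigned char *p = memchr(hay, needle, n);

    return p ? p - hay : -1;
}
```

When timing these against the SIMD version, feed all of them the same inputs, and vary both the position of the match and the input itself so that the tiny 16-byte case does not get optimized away or perfectly predicted.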


Andrei

