On Tuesday, 31 May 2016 at 22:50:37 UTC, David Nadlinger wrote:
Another thing that might be interesting to do (now that you have a "clever" baseline) is to start counting cycles and make some comparisons against manual asm/intrinsics implementations. For short(-ish) needles, PCMPESTRI is probably the most promising candidate, although I suspect that for \r\n scanning in long strings in particular, an optimised AVX2 solution might have higher throughput.

Of course these observations are not very valuable without backing them up with measurements, but it seems like before optimising a generic search algorithm for short-needle test cases, having one's eyes on a solid SIMD baseline would be a prudent thing to do.

The current algorithm is generic with respect to the predicate. Once we use SSE/AVX tricks, it becomes a special case that only handles equality.
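
To make the generic-versus-equality distinction concrete, here is a rough sketch in C rather than D, since that is where the intrinsics are most commonly written. It is not what Phobos does. The names find_generic, find_crlf and bytes_equal are made up for illustration, the SIMD path assumes SSE2 plus the GCC/Clang builtin __builtin_ctz, and a scalar loop is used when SSE2 is not available. The generic routine calls an arbitrary predicate per element; the specialized one can only test for equality against "\r\n", which is exactly what lets it compare 16 positions at once.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical predicate type: any binary test on two bytes. */
typedef int (*pred_fn)(unsigned char a, unsigned char b);

static int bytes_equal(unsigned char a, unsigned char b) { return a == b; }

/* Generic path: naive search that works for any predicate.
   Returns the index of the first match, or n if there is none. */
size_t find_generic(const unsigned char *h, size_t n,
                    const unsigned char *needle, size_t m, pred_fn pred)
{
    if (m == 0) return 0;
    if (m > n) return n;
    for (size_t i = 0; i + m <= n; ++i) {
        size_t j = 0;
        while (j < m && pred(h[i + j], needle[j])) ++j;
        if (j == m) return i;
    }
    return n;
}

/* Equality-only path: search for "\r\n".
   Returns the index of the '\r', or n if there is none. */
size_t find_crlf(const unsigned char *h, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i cr = _mm_set1_epi8('\r');
    const __m128i lf = _mm_set1_epi8('\n');
    /* Each step reads h[i..i+15] and h[i+1..i+16], so 17 bytes must be valid. */
    while (i + 17 <= n) {
        __m128i a = _mm_loadu_si128((const __m128i *)(h + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(h + i + 1));
        /* Bit k is set when h[i+k] == '\r' and h[i+k+1] == '\n'. */
        int mask = _mm_movemask_epi8(
            _mm_and_si128(_mm_cmpeq_epi8(a, cr), _mm_cmpeq_epi8(b, lf)));
        if (mask != 0)
            return i + (size_t)__builtin_ctz((unsigned)mask);
        i += 16;
    }
#endif
    /* Scalar tail (and the whole search when SSE2 is unavailable). */
    for (; i + 1 < n; ++i)
        if (h[i] == '\r' && h[i + 1] == '\n') return i;
    return n;
}
```

Both functions should return the same index when find_generic is given bytes_equal and the needle "\r\n". Whether the SIMD version is actually faster, and by how much, still has to be measured.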

As a next step in Phobos, such a special case is probably worth it for strings. We could probably borrow a well-optimized strcmp from somewhere.
