On Thursday, 1 June 2017 at 04:39:17 UTC, Jonathan M Davis wrote:
On Wednesday, May 31, 2017 16:03:54 H. S. Teoh via
Digitalmars-d-learn wrote:
[...]
If you're really trying to make it fast, there may be something
that you can do with SIMD. IIRC, Brian Schott did that with his
lexer (or maybe he was just talking about it - I don't remember
for sure).
See my link above to realworldtech. Using SIMD can give good
results in micro-benchmarks but completely screw up the performance
of other things in practice (the alignment requirements are heavy
and result in code bloat, cache misses, TLB misses, cost of
context switches, AVX warm-up time (Agner Fog observed around
10000 cycles before AVX switches from 128-bit to 256-bit
operations), reduced turboing, etc.).