On Thursday, June 01, 2017 04:52:40 Patrick Schluter via Digitalmars-d-learn wrote:
> On Thursday, 1 June 2017 at 04:39:17 UTC, Jonathan M Davis wrote:
> > On Wednesday, May 31, 2017 16:03:54 H. S. Teoh via Digitalmars-d-learn wrote:
> >> [...]
> >
> > If you're really trying to make it fast, there may be something
> > that you can do with SIMD. IIRC, Brian Schott did that with his
> > lexer (or maybe he was just talking about it - I don't remember
> > for sure).
>
> See my link above to realworldtech. Using SIMD can give good
> results in micro-benchmarks but completely screw up performance
> of other things in practice (the alignment requirements are heavy
> and result in code bloat, cache misses, TLB misses, cost of
> context switches, AVX warm up time (Agner Fog observed around
> 10000 cycles before AVX switches from 128 bits to 256 bits
> operations), reduced turboing, etc.).

Whenever you attempt more complicated optimizations, it becomes harder to get them right, and you always have the problem of figuring out whether you really did make things better in general. It's the sort of thing that's easier when you have a specific use case, and it's very difficult to get right when dealing with a general solution for a standard library. So, it doesn't surprise me at all if a particular optimization turns out to be a bad idea for Phobos even if it's great for some use cases.

- Jonathan M Davis