On 4/2/2014 4:49 AM, monarch_dodra wrote:
That said, I'm abusing the fact that 50% of your bench is for chars over 0x80. If I loop only on the actual ASCII you find in text (0x20 - 0x80), then those numbers "only" go from "320" => "300". Only slightly better, but still a win.
Surely a better approach would be to do a statistical analysis of character frequency in a representative corpus, and tune the compares that way. But the table lookup offered such a dramatic improvement, I didn't bother.
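
For anyone following along, here is a rough sketch in C of the two ideas above: a chain of compares versus a 256-entry table lookup, plus a byte histogram of the kind you'd gather from a corpus before reordering the compares. The predicate (ASCII letter or digit) and every name here (is_alnum_compare, is_alnum_table, init_alnum_table, count_bytes) are made up for illustration; this is not the function from the benchmark, and it says nothing about the timings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate: ASCII letter or digit, as a chain of compares.
   The order of the range tests is what a frequency analysis would tune. */
static int is_alnum_compare(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9');
}

/* 256-entry table, one byte per possible input value. */
static unsigned char alnum_table[256];

/* Fill the table once from the compare version, so the two cannot drift. */
static void init_alnum_table(void)
{
    for (int i = 0; i < 256; i++)
        alnum_table[i] = (unsigned char)is_alnum_compare((unsigned char)i);
}

/* Table version: a single indexed load, no branches on the value.
   Requires init_alnum_table() to have been called first. */
static int is_alnum_table(unsigned char c)
{
    return alnum_table[c];
}

/* Byte histogram of a sample text: the raw data for deciding which
   range to test first in the compare version. */
static void count_bytes(const unsigned char *text, size_t len,
                        size_t counts[256])
{
    for (int i = 0; i < 256; i++)
        counts[i] = 0;
    for (size_t i = 0; i < len; i++)
        counts[text[i]]++;
}

/* Sanity check: both versions must agree on every byte value. */
static void check_agreement(void)
{
    init_alnum_table();
    for (int i = 0; i < 256; i++)
        assert(is_alnum_table((unsigned char)i) ==
               is_alnum_compare((unsigned char)i));
}
```

The table costs the same for every input, including bytes at or above 0x80, which is why the mix of characters in the benchmark matters so much more for the compare version than for the lookup.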