On 2014-03-09 13:00:45 +0000, "monarch_dodra" <monarchdo...@gmail.com> said:

> AFAIK, the most common algorithm "case insensitive search" *must* decode.

Not necessarily. While the Unicode collation algorithms (which should be used to compare text) are defined in terms of code points, you could build a collation element table that uses code units as keys and look entries up without a decoding step. I'm not sure there would be a significant performance gain, though.
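To make the idea concrete, here is a rough sketch in Python of a table keyed on UTF-8 code units. It is only an illustration under loud assumptions: the table is a toy built from a handful of Latin letters, `str.casefold()` stands in for real collation elements, and the names `SAMPLE_CHARS`, `build_table`, `to_keys` and `contains_ignoring_case` are made up for this example. A real implementation would load the Unicode collation data and deal with contractions, expansions and weight levels. The input is assumed to be valid UTF-8.

```python
# Toy stand-in for a collation element table (assumption: a real one
# would be generated from the Unicode collation data, not from this string).
SAMPLE_CHARS = "abcdefghijklmnopqrstuvwxyzàçèéöü"
LEAF = "key"  # marker for the entry stored at the end of a code unit sequence


def build_table(chars):
    """Build a trie keyed on UTF-8 code units; leaves hold a folded key."""
    root = {}
    for ch in chars:
        key = ch.casefold()  # simplified substitute for a collation element
        for variant in (ch.lower(), ch.upper()):
            node = root
            for unit in variant.encode("utf-8"):
                node = node.setdefault(unit, {})
            node[LEAF] = key
    return root


def to_keys(data, table):
    """Map UTF-8 code units to keys by walking the trie, without decoding."""
    keys, node, pending = [], table, []
    for unit in data:
        pending.append(unit)
        node = node.get(unit)
        if node is None:
            # Sequence not in the toy table: pass the raw code units through.
            keys.extend(pending)
            node, pending = table, []
        elif LEAF in node:
            # UTF-8 is prefix-free, so the first leaf reached is the match.
            keys.append(node[LEAF])
            node, pending = table, []
    keys.extend(pending)  # trailing units of an incomplete sequence
    return keys


def contains_ignoring_case(haystack, needle, table):
    """Naive substring search over the key sequences."""
    h, n = to_keys(haystack, table), to_keys(needle, table)
    return any(h[i:i + len(n)] == n for i in range(len(h) - len(n) + 1))


TABLE = build_table(SAMPLE_CHARS)
found = contains_ignoring_case(
    "Un CAFÉ noir".encode("utf-8"), "café".encode("utf-8"), TABLE
)
```

The strings are encoded once for the demonstration, but the search itself only ever looks at bytes. Whether walking the trie is actually cheaper than decoding and then doing a code point lookup is exactly the open question.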

That remains an optimization though. The natural way to implement a Unicode algorithm is to base it on code points.

--
Michel Fortin
michel.for...@michelf.ca
http://michelf.ca
