On Feb 1, 2013, at 6:07 AM, "Costello, Roger L." <coste...@mitre.org> wrote:
> So why would one ever generate text in decomposed form (NFD)?

The Unihan database is stored in NFD because it makes the regular expressions used to qualify its contents much, *much* simpler. I imagine that things like fuzzy text matching are easier in NFD. At worst, it's about as useful as UTF-32: occasionally very handy in internal processing, but not terribly attractive overall.
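To make the regex point concrete, here is a rough Python sketch. It is not the actual Unihan tooling; the helper names (`strip_marks`, `fuzzy_equal`, `count_e_variants`) are made up for illustration, and I'm assuming the accents of interest all fall in the Combining Diacritical Marks block (U+0300..U+036F). In NFD, "any accented e" is just a base letter followed by combining marks, so one short pattern replaces a long list of precomposed characters.

```python
import re
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop nonspacing combining marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def fuzzy_equal(a: str, b: str) -> bool:
    """Hypothetical fuzzy comparison: ignore accents and case."""
    return strip_marks(a).casefold() == strip_marks(b).casefold()


# In NFD, an "e" with any accents is the base letter plus combining marks.
# Assumption: only marks in U+0300..U+036F matter here.
BASE_E = re.compile("e[\u0300-\u036f]*")


def count_e_variants(text: str) -> int:
    """Count plain and accented lowercase e's after decomposing to NFD."""
    return len(BASE_E.findall(unicodedata.normalize("NFD", text)))


if __name__ == "__main__":
    print(strip_marks("caf\u00e9"))
    print(fuzzy_equal("R\u00e9sum\u00e9", "resume"))
    print(count_e_variants("\u00e9\u00e8\u00eae"))
```

The same pattern written against NFC text would have to enumerate every precomposed accented e individually, which is the kind of bloat NFD avoids.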