On Sat, Feb 21, 2009 at 3:24 PM, Eli Barzilay <[email protected]> wrote:
>> Sure, if you want.  Or you can use various forms of normalization,
>> some of which are standardized by Unicode and some not, to throw
>> away any unwanted distinctions.  For example, if you are analyzing
>> Chinese text, you may want to throw away the difference between
>> Simplified and Traditional characters -- not that it's trivial to do
>> so.
>
> Exactly my point -- these various forms of normalizations are more
> fragile, and the selection of the normalizations you'd want to have is
> also less obvious, and they're all things that are inherently
> cultural.  So, as a hacker, I find it much easier to just ignore it
> all and look at the bits instead.  (It's convenient to have Unicode as
> a very difficult piece of work that I didn't have to deal with...)

And, for some people, I suspect that normalizing traditional into
simplified would be a political no-no (and the other way would
probably make your text hard to read for lots of people).

Robby

_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
