On Feb 21, John Cowan wrote:
>
> /enable deep-typographic-geek-mode
> [...]
> /disable deep-typographic-geek-mode
>
> > As a programmer, I really don't want to care about all of this.
> > Someone made some decision like that "double vav" that
> > distinguishes two strings that render in exactly the same way, and
> > as a Hebrew speaker it doesn't make any sense to me --
>
> Hopefully you are more enlightened now.
Yes -- interesting to know all that. (I (vaguely) knew that Yiddish
uses letters as done in Latin, but not the reasoning that propagates
this to having separate characters that way.)
> > but as a hacker I still don't want to care about the differences.
> > The bits that represent the two in a (for example) UTF-8 text file
> > are different, so the easiest way for me to avoid it is to just
> > look at the bits.
>
> Sure, if you want. Or you can use various forms of normalization,
> some of which are standardized by Unicode and some not, to throw
> away any unwanted distinctions. For example, if you are analyzing
> Chinese text, you may want to throw away the difference between
> Simplified and Traditional characters -- not that it's trivial to do
> so.
Exactly my point -- these various forms of normalization are more
fragile, the choice of which normalizations you'd want is also less
obvious, and they're all things that are inherently
cultural. So, as a hacker, I find it much easier to just ignore it
all and look at the bits instead. (It's convenient to have Unicode as
a very difficult piece of work that I didn't have to deal with...)
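
To make the contrast concrete, here is a minimal sketch in Python (my
choice of language for illustration, not something from the thread) of
"looking at the bits" versus comparing after a Unicode-standardized
normalization. The helper names same_bits and same_after are
hypothetical, and the accented-e and "fi" ligature pairs are stand-in
examples added here; only the double-vav pair comes from the
discussion above. The sketch deliberately does not claim what any
normalization form does to the double vav, it just prints the result.

```python
import unicodedata

# The "double vav" case from the thread: one code point vs. two.
ligature = "\u05F0"        # HEBREW LIGATURE YIDDISH DOUBLE VAV
two_vavs = "\u05D5\u05D5"  # HEBREW LETTER VAV, twice

# Stand-in pairs (assumptions for illustration, not from the thread).
composed = "\u00E9"        # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT
fi_ligature = "\uFB01"     # LATIN SMALL LIGATURE FI


def same_bits(a: str, b: str) -> bool:
    """Compare the UTF-8 encodings byte for byte (the 'just look at the bits' approach)."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after(form: str, a: str, b: str) -> bool:
    """Compare after applying one of the standard forms: NFC, NFD, NFKC, NFKD."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    # The two double-vav spellings have different bytes.
    print("double vav, bits:", same_bits(ligature, two_vavs))

    # Whether a given form unifies them depends on the Unicode data; print rather than assume.
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print("double vav,", form + ":", same_after(form, ligature, two_vavs))

    # Canonical equivalence: different bits, equal under NFC.
    print("e-acute, bits:", same_bits(composed, decomposed))
    print("e-acute, NFC:", same_after("NFC", composed, decomposed))

    # Compatibility equivalence: NFC keeps the ligature, NFKC folds it to "fi".
    print("fi, NFC:", same_after("NFC", fi_ligature, "fi"))
    print("fi, NFKC:", same_after("NFKC", fi_ligature, "fi"))
```

The point of the sketch is that the byte comparison needs no choices
at all, while the normalized comparison already forces a choice of
form, and the answer changes with it.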
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://www.barzilay.org/ Maze is Life!
_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss