On 09/07/2004 01:41, Michael (michka) Kaplan wrote:

From: "Michael Everson" <[EMAIL PROTECTED]>



I think it's stupid (in general) to argue for stripping a letter of
diacritics. If a reader is ignorant of their meaning, that can be
cured. But if they are meaningful, stripping them is just misspelling
the words they belong to. Why would anyone want to do that?



I think it's inadvisable (in general) to call things stupid merely because one does not see the need. On the whole, that is a better time to ask the question than to make the judgment.

There is actually a great deal of both European and American data (in
programs like Microsoft Exchange and Outlook, as well as in web search)
showing that folding away diacritics as a part of giving full lists of
possible matches is indeed preferred by users. Now they would (also) prefer
the exact matches to have priority, but having additional matches without
the diacritics is a common request, and one that has been built into many
scenarios.



It seems to me that you two Michaels are talking at cross purposes.

Everson was apparently referring to the practice of stripping diacritics from foreign words as rendered typographically, e.g. in magazines and presumably online texts. And I tend to agree with him (from my European perspective) that this is unnecessary. On the other hand, if some people want to do it, they should not be prevented.

But Kaplan is referring to something quite different: optionally ignoring diacritics in search operations. This is indeed desirable, so that a single search can match both Dvorak and Dvořák, for example, and so that the one doing the search does not need to remember exactly which diacritics are used in the name. And it is already covered by the Unicode collation algorithm and default table, in which diacritics are distinguished only at the second level and so are folded by a collation that compares at the top level only.
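
As a rough illustration of that kind of search, here is a minimal Python sketch. It does not implement the Unicode collation algorithm itself; it only approximates a top-level-only comparison by decomposing each string (NFD) and dropping combining marks, using the standard unicodedata module. The function names fold_diacritics and search are hypothetical, chosen for this example, and letters that have no canonical decomposition (such as "ø") are not folded by this approach. Exact matches are listed first, as the users described above would prefer.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Approximate a top-level-only comparison key by removing combining marks."""
    # NFD splits a precomposed letter such as "ř" into "r" + U+030C (combining caron).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a nonzero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(query: str, candidates: list[str]) -> list[str]:
    """Return candidates matching the query when diacritics and case are ignored.

    Exact matches come first; the remaining matches keep their original order.
    """
    key = fold_diacritics(query).casefold()
    matches = [c for c in candidates if fold_diacritics(c).casefold() == key]
    # sorted() is stable, and False sorts before True, so exact matches lead.
    return sorted(matches, key=lambda c: c != query)


if __name__ == "__main__":
    names = ["Dvořák", "Dvorak", "Dvorský"]
    print(search("Dvorak", names))
```

A real application would more likely ask a collation library for a comparison at the first level only, since that follows the default table rather than this simple stripping rule.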

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/



