At 01:52 PM 7/14/2004, Doug Ewell wrote:
> It's not German data (with umlauts) that will be affected by this
> solution, but non-German data (with diaereses) in German bibliographic
> systems.  That makes it a much smaller problem.

The use of a diaeresis is perfectly valid for words in fields that have the language ID 'German'.


> The DIN request and the USNB solution didn't address this, because the
> problem to be solved was disambiguating {a, o, u}-with-tréma from {a, o,
> u}-with-umlaut.  If there are combinations of (for example)
> a-with-tréma-and-something-else AND ALSO
> a-with-umlaut-and-something-else, then those two will need to be
> disambiguated somehow.  But I strongly doubt that the latter case exists
> in German bibliographic data, though of course one never knows.

First off, there have to be corresponding entries in the sorting tables used for such data for that distinction to have the correct effect. Since the sorting tables would not support anything other than <BASE, CGJ, DIAERESIS>, there's no reason to introduce other sequences into the data.


Secondly, the diaeresis is used to indicate that two vowels are pronounced separately. I haven't seen a case where the vowels would already be accented.

Finally, one additional reason that phonetic sorting is relevant in this instance, beyond the fact that the pronunciations are in fact different, is that the diaeresis is not mandatory to the same degree as the umlaut. You can find Kapernaum spelled with and without a diaeresis, but if you spell Hauser with an umlaut (Häuser), it's the plural of Haus; without it, it's a name. Personal names, however, are sometimes spelled with vowel + e (Moeller).

By sorting the diaeresis as a secondary difference, related terms do sort together, and names sort near their variant spellings. The suggested approach solves the problem at hand for those data where somebody took the trouble to decide (on input) which was which, so that huge catalogs of subject keywords or authors come out correctly.
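
To make the effect concrete, here is a rough sketch in Python of a two-level sort key. It is not the DIN or USNB tailoring itself, just a toy under these assumptions of mine: a diaeresis (tréma) is encoded as <BASE, CGJ, DIAERESIS>, a plain <BASE, DIAERESIS> is an umlaut, the umlaut gets a primary expansion to vowel + e, and the tréma only adds a secondary weight. The name sort_key, the weight values, and the placement of the tréma in the Kapernaum example are made up for illustration.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS


def sort_key(word):
    """Toy two-level key: (primary letters, secondary marks).

    Assumption: <BASE, CGJ, DIAERESIS> is a tréma, <BASE, DIAERESIS> an umlaut.
    """
    # NFD splits precomposed letters such as U+00F6 into base + U+0308.
    # CGJ has combining class 0, so it stays in front of the diaeresis.
    text = unicodedata.normalize("NFD", word).lower()
    primary, secondary = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if text.startswith(CGJ + DIAERESIS, i + 1):
            # Tréma: same primary weight as the bare letter,
            # distinguished only at the secondary level.
            primary.append(ch)
            secondary.append(1)
            i += 3
        elif text.startswith(DIAERESIS, i + 1):
            # Umlaut: primary expansion to vowel + e (assumed tailoring),
            # plus a secondary weight to separate it from a literal "oe".
            primary.append(ch + "e")
            secondary.extend((1, 0))
            i += 2
        else:
            primary.append(ch)
            secondary.append(0)
            i += 1
    return ("".join(primary), tuple(secondary))


words = [
    "M\u00F6ller",              # Möller, with umlaut
    "Kapernau\u034F\u0308m",    # Kapernaum with a tréma on the u (illustrative)
    "Hauser",
    "Moeller",
    "H\u00E4user",              # Häuser, with umlaut
    "Kapernaum",
]
ordered = sorted(words, key=sort_key)
```

With this key, Kapernaum with and without the tréma share a primary key and end up adjacent, Möller lands next to Moeller, and Häuser is separated from Hauser at the primary level.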

Note that the bulk of all possible data in German won't make that distinction, and won't be used on systems that support the special sorting method.

A./





