From: "Addison Phillips [wM]" <[EMAIL PROTECTED]>

For example, Dutch sometimes treats the sequence "ij" as a single letter (it turns out that there are characters for the letter 'ij' in Unicode too, but they are for compatibility with an ancient non-Unicode character set). Software must be modified or tailored to provide behavior consistent with the specific language and context.

Not sure about that: not all Dutch "ij" letter pairs are a single grapheme, so there are cases where the two letters must be treated as distinct rather than as a single letter. For this reason, Dutch needs a distinct "ij" letter, coded as a single character, with its own capitalization rules: the uppercase or titlecase form of "ij" is the single letter "IJ", not two letters and not "Ij". There are also cases where diacritics are added on top of the "ij" letter, which then behaves even more like a single letter than like a simple digraph.
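To make the capitalization point concrete, here is a rough Python sketch of my own. It assumes the default (untailored) case mappings that Python's str methods apply; the helper dutch_capitalize is hypothetical and assumes that a word-initial "ij" is always the digraph, which is a simplification.

```python
IJ_SMALL = "\u0133"    # LATIN SMALL LIGATURE IJ
IJ_CAPITAL = "\u0132"  # LATIN CAPITAL LIGATURE IJ


def dutch_capitalize(word: str) -> str:
    """Hypothetical helper: capitalize a word, treating a leading "ij" as one letter.

    Assumption: a word-initial "ij" is always the digraph.
    """
    if word[:2].lower() == "ij":
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]


if __name__ == "__main__":
    # Untailored capitalization treats "i" and "j" as two letters.
    print("ijsselmeer".capitalize())        # Ijsselmeer
    # The Dutch-aware sketch capitalizes both halves of the digraph.
    print(dutch_capitalize("ijsselmeer"))   # IJsselmeer
    # The single-character form already carries its own case mapping.
    print(IJ_SMALL.upper() == IJ_CAPITAL)   # True
```

With the two-letter encoding the software needs language-specific tailoring to get "IJ"; with the single character the default case mapping is enough.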


This distinction is also often made visible in the typography. When "ij" is a single letter, the leg of the "j" is kerned deeply below (and sometimes to the left of) the leading "i"; when they are two letters, no kerning occurs and the 'i' is shown completely to the left of the bottom-left leg of the 'j'. It is even more evident in the uppercase style: two distinct letters keep the standard small distance between the I and J glyphs, whereas in the single letter the uppercase I may be drawn in the middle of the left leg of the J.

Note the very close resemblance of the "ij" single letter to a y with diaeresis, so you will also find Dutch texts that use y with diaeresis instead of the correct "ij" letter, notably in texts coded with legacy charsets. The same substitution is carried over to uppercase, where the missing "IJ" single letter appears encoded as Y with diaeresis...

The cases in Dutch where the single-letter digraph must be distinguished from two letters are rare, so it is often acceptable to encode the digraph with two letters, without creating linguistic ambiguities (in most cases...), or with y with diaeresis/umlaut (which is otherwise not a letter used in Dutch).
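The three encodings can be related to one another in code. The following is only a sketch under stated assumptions: both helper names are hypothetical, and mapping y with diaeresis back to "ij" is safe only if the text really is Dutch, where that letter is otherwise unused.

```python
import unicodedata

IJ_SMALL = "\u0133"             # LATIN SMALL LIGATURE IJ
IJ_CAPITAL = "\u0132"           # LATIN CAPITAL LIGATURE IJ
Y_DIAERESIS_SMALL = "\u00ff"    # LATIN SMALL LETTER Y WITH DIAERESIS
Y_DIAERESIS_CAPITAL = "\u0178"  # LATIN CAPITAL LETTER Y WITH DIAERESIS


def ij_to_two_letters(text: str) -> str:
    """Hypothetical helper: replace the single-character forms with letter pairs."""
    return text.replace(IJ_SMALL, "ij").replace(IJ_CAPITAL, "IJ")


def legacy_y_to_ij(text: str) -> str:
    """Hypothetical helper: undo the legacy y-with-diaeresis substitution.

    Assumption: the input is Dutch, so every y with diaeresis stands for "ij".
    """
    return text.replace(Y_DIAERESIS_SMALL, "ij").replace(Y_DIAERESIS_CAPITAL, "IJ")


if __name__ == "__main__":
    # The single characters have compatibility decompositions to the letter pairs,
    # so NFKC normalization performs the same folding as ij_to_two_letters.
    print(unicodedata.normalize("NFKC", IJ_SMALL) == "ij")
    print(ij_to_two_letters(IJ_CAPITAL + "sselmeer"))
    print(legacy_y_to_ij(Y_DIAERESIS_CAPITAL + "sselmeer"))
```

Note that both foldings lose information: once the digraph is written as two letters, the rare cases where "i" and "j" really are separate letters can no longer be told apart.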

For me, your allusion to legacy charsets is about the deprecated use of y with diaeresis, not about the use of a distinct "IJ" letter, which is needed for Dutch and should be treated as distinct from the "I then J" letter pair.




