John Hudson responded to Michael Everson:

> Michael Everson wrote:
>
> >> This would make the mid-dot too high. The top dot of the colon usually
> >> sits toward the top of the x-height; the *mid*-dot should sit lower,
>
> > John, I just don't believe you. I don't believe that in all the history
> > of Greek and Catalan typography this careful hairsplitting has *always*
> > taken place; certainly in scientific transcription the HALF TRIANGULAR
> > COLON is just the top dot in the TRIANGULAR COLON, and in Americanist
> > transcription where the dot-colons are used instead of triangles I would
> > say the same applies.
>
> I never contested that the dots of a colon correspond to the triangles of
> the linguistic long vowel marker. They clearly do. What I contested was
> that the typographic mid-point (U+00B7) corresponded to the top dot of a
> colon. It clearly does not. It is called a mid-point because it sits
> midway up the x-height. It is used in this position for a variety of
> stylistic purposes, ...

I think we have two typographers here arguing somewhat at cross-purposes. Clearly the typographic "mid-point" behaves as John has mentioned, and is designed as such in many fine fonts (examples seen among the exhibits that Asmus gathered). But just as clearly, there is a long, long tradition in Americanist orthographic practice (which is used widely for linguistic orthographies outside of Native America as well) of using a "raised dot" as an indication of vocalic (and occasionally consonantal) length.

For 100 years, that raised dot was mechanically generated by, among other means, filing the lower dot off a colon key on a mechanical typewriter. (I have such a typewriter sitting on my desk.) Linguists got used to this raised dot height, coordinated with a colon in design (which then could be used, among other things, to indicate a prolonged length, when two degrees of length were in question), and that preference made its way into print, at least for many North American languages, where the raised dot could be printed at x-height, rather than midway up the x-height, which would be too low for most of the linguistic usage.

Enter the electronic age. ASCII had no MIDDLE DOT.
It was period (.), colon (:), or the highway. Early linguistic material on computers made do with those, because they had no choice.

The IBM PC and the Macintosh introduced a MIDDLE DOT (0xFA [= IBM CDRA SD630000 "Middle Dot"] and 0xE1, respectively). When ISO 8859-1 was defined, it also had a MIDDLE DOT (0xB7). *Everybody* made use of that MIDDLE DOT for anything that was vaguely in the ballpark -- the typographical mid-point, the linguistic length mark, the mathematical multiplication operator, the Greek ano teleia, the dictionary hyphenation point, and, yes, the Catalan middle dot. The fact that each of those usages might have extremely fine typographical hairs to split regarding the rendering was so much horsepucky as far as the character identity was concerned. You used what you had available to represent your data.

The Unicode Standard, for a variety of reasons -- some of which included compatibility mapping concerns to other standards which had started to proliferate middle dots -- added a collection of middle dots *besides* U+00B7, *the* middle dot, to its repertoire. Those other middle dots give people textual representation alternatives now, if they need to make distinctions, and textual rendering alternatives, if they need to make middle dots which display with slightly different heights, sizes, or spacings, depending on the rendering requirements.

What is clear, however, is that it is utterly impossible to satisfy everybody regarding middle dots. Typographical purists will always want plain text to make more distinctions. Text processing requirements will abhor the splitting of text representation into more and more difficult-to-distinguish glyph representations without clear semantic differences. And dot proliferation *always* poses difficulty for establishing character properties.
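(As a concrete illustration of that proliferation and the property trouble it brings, here is a minimal sketch using Python's standard unicodedata module; the particular selection of dots is just an illustrative sample, not an exhaustive inventory:)

```python
# A sample of the middle-dot-like characters Unicode ended up encoding,
# with their character names and General_Category properties. Note that
# the categories disagree: punctuation (Po), math symbol (Sm), and
# modifier letter (Lm) -- the property difficulty described above.
import unicodedata

dots = [
    "\u00B7",  # MIDDLE DOT -- *the* middle dot, inherited from Latin-1
    "\u0387",  # GREEK ANO TELEIA
    "\u2027",  # HYPHENATION POINT (the dictionary hyphenation point)
    "\u22C5",  # DOT OPERATOR (mathematical multiplication)
    "\u02D0",  # MODIFIER LETTER TRIANGULAR COLON (IPA length mark)
    "\u02D1",  # MODIFIER LETTER HALF TRIANGULAR COLON
]

for ch in dots:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")

# The encoding history has normalization consequences too: GREEK ANO
# TELEIA canonically decomposes to U+00B7, so the two are identical
# under every Unicode normalization form.
print(unicodedata.decomposition("\u0387"))                 # "00B7"
print(unicodedata.normalize("NFD", "\u0387") == "\u00B7")  # True
```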
Before people bluster on too much further on this thread, it would be good for everyone to recall that the *reason* why U+00B7 has problematical properties is that it was inherently ambiguous in *preexisting* usage (that is, prior to Unicode altogether) as punctuation versus length mark (and other things as well). This puts it in the same grab bag of very difficult, ambiguous ASCII characters, such as "~", "*", and "'", which also acquired conflicting usages during their reign among the small set of available punctuation and symbols in ASCII.

History has consequences. The history of a character's encoding also has consequences for how the Unicode Standard is to be used and interpreted.

--Ken