Re: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))
----- Original Message -----
From: Asmus Freytag [EMAIL PROTECTED]
To: Karl Pentzlin [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: 31 January 2002 22:09
Subject: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))

> A more productive distinction would be along these lines:
>
> a) is the feature necessary for correctly expressing the content

Yes.

> b) is the feature rule based, and

Yes.

> b.1) is the rule implementable w/o knowledge of semantics, or

No.

> c) when implementing the feature, is it necessary to
> c.1) provide scope information, or

Yes.

> c.2) is local context sufficient

No.

> Leaving out italics from a document can not only change the level of emphasis, but for example in English, there are occasional circumstances where the use of italics removes a possible ambiguity in interpreting a sentence. Nevertheless (except for mathematics) italics were left to a higher level protocol (style markup).

Italics is better supported than Fraktur, as most word processors have an option for using italics with any font installed on the computer. For Fraktur one has to use a different font. No Fraktur font is widely installed on Windows computers, so it's almost impossible to use Fraktur text in any public document without using bitmaps to display the characters.

Why was Fraktur supported for mathematics, but not for old Swedish/German/etc.?

Stefan
Re: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))
At 07:35 2/3/2002, Stefan Persson wrote:

> Italics is better supported than Fraktur, as most word processors have an option for using italics with any font installed on the computer. For Fraktur one has to use a different font.

Um, for italics one has to use a different font also. Many programs provide an italics button that activates the italic member of a font family, but this still involves selecting a separate font.

> There is no Fraktur font widely spread on all Windows computers or something like that, so it's almost impossible to use Fraktur text in any public document or similar w/o using bitmaps.

There are plenty of Fraktur and other blackletter fonts available. Many of the best ones are available from Linotype in Germany. If you think that a Fraktur font should come installed on operating systems, you should petition your OS developer. I don't see that these font availability issues have anything to do with Unicode.

John Hudson

Tiro Typeworks www.tiro.com
Vancouver, BC [EMAIL PROTECTED]

... es ist ein unwiederbringliches Bild der Vergangenheit, das mit jeder Gegenwart zu verschwinden droht, die sich nicht in ihm gemeint erkannte.

... every image of the past that is not recognized by the present as one of its own concerns threatens to disappear irretrievably.
Walter Benjamin
Re: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))
From: John Hudson [EMAIL PROTECTED]

> Um, for italics one has to use a different font also. Many programs provide an italics button that activates the italic member of a font family, but this still involves selecting a separate font.

Au contraire, sir! Many fonts *do* have separate .TTF files for the italic version, but there are just as many that do not, yet the italic option does not find itself disabled in programs.

MichKa

Michael Kaplan
Trigeminal Software, Inc. -- http://www.trigeminal.com/
Re: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))
At 10:25 AM 2/3/02, John Hudson wrote:

> Um, for italics one has to use a different font also. Many programs provide an italics button that activates the italic member of a font family, but this still involves selecting a separate font.

And it would be simple to set up a font family so that Fraktur would be the normal state, and the italic button on the word processor would select a Roman member of the family (if you still needed sloped italics, those could be assigned to the bold italic slot).

--
Curtis Clark http://www.csupomona.edu/~jcclark/
Mockingbird Font Works http://www.mockfont.com/
Re: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))
At 10:55 2/3/2002, Michael (michka) Kaplan wrote:

>> Um, for italics one has to use a different font also. Many programs provide an italics button that activates the italic member of a font family, but this still involves selecting a separate font.
>
> Au contraire, sir! Many fonts *do* have separate .TTF files for the italic version, but there are just as many that do not, yet the italic option does not find itself disabled in programs.

Ah. Those 'italics'. Those are not italics. Those are slanted romans. Sorry, I thought we were talking about typography.

In Adobe InDesign, the italic function is disabled if an italic font is not available. There is a separate control for slanting text, but it is not possible to accidentally produce a sloped roman in the absence of an italic font. This is how it should be.

John Hudson
RE: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
Hi,

Ken wrote:

> <fraktur>Das sinkende Schiff sandte</fraktur> SOS<fraktur>-Rufe.</fraktur>
>
> or conversely, perhaps better:
>
> Das sinkende Schiff sandte <antiqua>SOS</antiqua>-Rufe.

In the end, it may be more useful to mark up the semantics rather than the formatting properties, e.g.

This is not a question of <foreign origin="DE">Zeitgeist</foreign>.

It is the responsibility of the rendering engine (style sheet, ...) to map that markup to whatever font/script/typeface should be used, according to users' (or typesetters') preferences, current environment and purpose.

- The author or some post-authoring process would (hopefully ;-) ) have the knowledge about where the linguistic expression originates from and can apply appropriate (semantic) markup, but doesn't need to care about typesetting conventions (which the author may not be expert in).

- The rendering engine/typesetter doesn't need to have any linguistic information (such as a database of loan words), but only needs to know how to map foreign content to formatting properties in a given context.

- Third, depending on the environment and purpose, different stylistic conventions may be necessary for the same linguistic expression (Fraktur in one document, no special formatting in another), so any formatting-oriented markup (or encoding, for that matter) will potentially reduce the reusability of the document.

Cheers, Oli

Oliver Christ
TRADOS GmbH
Stuttgart
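
The separation described in this message can be sketched in a few lines of Python. This is only an illustration, not anything proposed on the list: the `foreign` element follows the example above, while the `render` helper, the `p` wrapper element and the two style tables are hypothetical names made up for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical style sheets: the same semantic markup is mapped to
# different typefaces depending on where the text is rendered.
ITALIC_SHEET = {"body": "roman", "foreign": "italic"}
PLAIN_SHEET = {"body": "roman", "foreign": "roman"}


def render(xml_text, sheet):
    """Return (typeface, text) runs for one paragraph with inline elements."""
    root = ET.fromstring(xml_text)
    runs = []
    if root.text:
        runs.append((sheet["body"], root.text))
    for child in root:
        if child.text:
            # Elements the sheet does not know fall back to the body face.
            runs.append((sheet.get(child.tag, sheet["body"]), child.text))
        if child.tail:
            runs.append((sheet["body"], child.tail))
    return runs


SOURCE = '<p>This is not a question of <foreign origin="DE">Zeitgeist</foreign>.</p>'

for typeface, text in render(SOURCE, ITALIC_SHEET):
    print(f"{typeface:7} {text!r}")
```

The document itself never names a typeface; swapping `ITALIC_SHEET` for `PLAIN_SHEET` (or for a Fraktur/antiqua table) changes the output without touching the text, which is the reusability argument made in the third bullet.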
RE: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
> quite a lot of space. However, Fraktur is already encoded in the Mathematical whatever-it's-called block. This variant selector would mean that lots of characters can be displayed in two *different* ways. I'd prefer that Fraktur diacritics were added instead, and that the mathematical letters were to be used for Fraktur texts.

I hope not. These were encoded there because they convey a specific meaning when used for mathematics. If you use them to spell out names, then you're abusing them and potentially confusing software that would rely on their mathematical semantics.

I think it's time to have another proposal for French, FRENCH VARIANT SELECTOR, where we do not use Fraktur but some other font variation. And we may need a QUEBEC VARIANT SELECTOR if they have different rules... Or should it be a QUEBEC FRENCH VARIANT SELECTOR to show the relationship?

YA
Re: Proposing Fraktur
----- Original Message -----
From: Kenneth Whistler [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: 31 January 2002 01:04
Subject: Re: Proposing Fraktur

>> And so what? I thought the meaning of Unicode was that all languages should be fully supported in plain text, using one single font to display all of the characters. With old Swedish, this isn't possible.
>
> I think this misconstrues the mission of Unicode as an encoding. The goal is to encode sufficient characters to enable the correct and legible representation of *plain* text in any script (modern or historic).

This distinction has to be made everywhere (read: including in plain text), otherwise the text is grammatically wrong.

----- Original Message -----
From: Yves Arrouye [EMAIL PROTECTED]
To: 'Stefan Persson' [EMAIL PROTECTED]; Karl Pentzlin [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: 31 January 2002 09:54
Subject: RE: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)

>> quite a lot of space. However, Fraktur is already encoded in the Mathematical whatever-it's-called block. This variant selector would mean that lots of characters can be displayed in two *different* ways. I'd prefer that Fraktur diacritics were added instead, and that the mathematical letters were to be used for Fraktur texts.
>
> I hope not. These were encoded there because they convey a specific meaning when used for mathematics. If you use them to spell out names, then you're abusing them and potentially confusing software that would rely on their mathematical semantics.

Letters A through Z and ALPHA through OMEGA are used in *both* text and mathematics, and I see no problem with this. Why would this cause problems with the Fraktur letters in the Mathematical Alphanumeric Symbols block?

> I think it's time to have another proposal for French, FRENCH VARIANT SELECTOR, where we do not use Fraktur but some other font variation. And we may need a QUEBEC VARIANT SELECTOR if they have different rules... Or should it be a QUEBEC FRENCH VARIANT SELECTOR to show the relationship?

Do you have to use *both* kinds of characters at the same time in the same document? In old Swedish you have to use *both* a's at the same time, otherwise the text is grammatically wrong, even in plain text.

Stefan
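
For readers who want to see what the mathematical Fraktur letters discussed here actually cover, the following Python sketch queries the character database that ships with the interpreter. The helper name `math_fraktur` is made up for this illustration; the character names it looks up are the ones in the standard, as far as the local `unicodedata` tables know them.

```python
import unicodedata


def math_fraktur(ch):
    """Return the mathematical Fraktur symbol for a basic Latin letter, or None."""
    if not (ch.isascii() and ch.isalpha()):
        return None  # no Fraktur symbols exist for ä, ö, ß, long s, ...
    case = "CAPITAL" if ch.isupper() else "SMALL"
    # A few capitals live in Letterlike Symbols under a BLACK-LETTER name.
    for name in (f"MATHEMATICAL FRAKTUR {case} {ch.upper()}",
                 f"BLACK-LETTER {case} {ch.upper()}"):
        try:
            return unicodedata.lookup(name)
        except KeyError:
            continue
    return None


for letter in "ACaä\u017F":
    symbol = math_fraktur(letter)
    if symbol is None:
        print(f"{letter!r}: no mathematical Fraktur symbol")
    else:
        folded = unicodedata.normalize("NFKC", symbol)
        print(f"{letter!r}: U+{ord(symbol):04X}, NFKC folds it back to {folded!r}")
```

Two things are visible from the output: the symbols exist only for the 52 basic letters, so accented letters and long s would need something else, and compatibility normalization folds each symbol back to the ordinary letter, which fits the view that they are font variants kept for mathematical use.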
When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))
At 09:42 AM 1/30/02 +0100, Karl Pentzlin wrote:

> The question is, are typesetting rules part of the script? (I mean rules in the sense of obligatory regulations, not guidelines).

This distinction is a very German way of approaching the question.

> If yes, (in my opinion) the plain text must carry the information that is needed to follow them. If no, their execution can be left to higher level protocols (which then have to decide whether a word is a foreign word [to be set in Roman letters] or a name [to be set in Fraktur letters], such at least according to German typesetting rules).

A more productive distinction would be along these lines:

a) is the feature necessary for correctly expressing the content
b) is the feature rule based, and
b.1) is the rule implementable w/o knowledge of semantics, or
c) when implementing the feature, is it necessary to
c.1) provide scope information, or
c.2) is local context sufficient

Looking at this list, roughly in reverse order:

Higher level protocols, understood as markup languages in particular, do really well when implementing something requires defining a scope, since in them all text data and the effect of all syntax are scoped already.

If layout features can be determined algorithmically, it makes little sense to duplicate in the markup what can be derived from the existing text data. Allowing duplicate representation of information always allows the possibility of something getting out of step.

If semantic knowledge is required to implement a feature, this knowledge must be supplied. If the extra information can be expressed as point-like, local context, then it makes much *less* sense to use higher level markup compared to character codes. Character codes, in a way, provide the ideal representation of point-like context in a data stream.

Finally, we get back to the original argument. Whether a typesetting rule (and by rule I mean both conventions and legislated rules) is supported by information added to the plain text or not does not depend on whether a national authority promulgates it, or whether it just represents the consensus of the users of the language.

If, in practice, such a rule can be ignored without changing the meaning of the text, it's a good candidate for not being implemented via plain text. However, this is not absolute:

Leaving out italics from a document can not only change the level of emphasis, but for example in English, there are occasional circumstances where the use of italics removes a possible ambiguity in interpreting a sentence. Nevertheless (except for mathematics) italics were left to a higher level protocol (style markup).

Overriding bad hyphenation, or bad line breaks, is supported by SHY and NBSP, even though hyphenation is not required at all to express the content of a text, nor would bad line breaks, e.g. after "Dr.", change the meaning of the text.

In the latter two cases, character codes were added (fairly early) to plain text, because using point-like context to support these very common algorithms (hyphenation and line breaking) is an elegant solution, while adding markup for the same purpose would be inelegant in the extreme.

Like everything else in character encoding, there are shades of gray, and levels of gradation, so not everything is clear cut. But recognizing up front that character codes may legitimately serve the support of algorithms, even where the feature implemented by the algorithm is merely common, and not absolutely and minimally required, is useful.

A./
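
The SHY/NBSP point can be made concrete with a toy line breaker in Python. It is a sketch under simplifying assumptions (greedy breaking, one code point per column, no real hyphenation algorithm), and the function name `wrap` is made up here; the only facts it relies on are that U+00AD SOFT HYPHEN marks a permitted hyphenation point and U+00A0 NO-BREAK SPACE is a space at which no break may be taken.

```python
import re

SHY = "\u00AD"   # SOFT HYPHEN: point-like "you may hyphenate here"
NBSP = "\u00A0"  # NO-BREAK SPACE: point-like "do not break here"


def wrap(text, width):
    """Greedy toy line breaker driven only by characters in the text."""
    # Each piece is an unbreakable chunk plus the break opportunity after it.
    pieces = re.findall(f"[^ {SHY}]+[ {SHY}]?", text)
    lines, line, pending = [], "", ""
    for piece in pieces:
        if piece[-1] in (" ", SHY):
            chunk, opportunity = piece[:-1], piece[-1]
        else:
            chunk, opportunity = piece, ""
        joiner = " " if pending == " " else ""
        # Reserve one column for a visible hyphen if a SHY follows this chunk.
        reserve = 1 if opportunity == SHY else 0
        if line and len(line) + len(joiner) + len(chunk) + reserve > width:
            # Break here; a soft hyphen only becomes visible when it is used.
            lines.append(line + ("-" if pending == SHY else ""))
            line = chunk
        else:
            line += joiner + chunk
        pending = opportunity
    if line:
        lines.append(line)
    return [ln.replace(NBSP, " ") for ln in lines]


sample = f"Ask Dr.{NBSP}Freytag about hy{SHY}phen{SHY}ation"
for ln in wrap(sample, 12):
    print(ln)
```

No scope is needed anywhere: each control character affects exactly the position where it sits, which is why markup would be the clumsier carrier for this kind of information.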
Re: Proposing Fraktur
On Thu, Jan 31, 2002 at 07:32:40PM +0100, Stefan Persson wrote:

> Do you have to use *both* kinds of characters at the same time in the same document? In old Swedish you have to use *both* a's at the same time, otherwise the text is grammatically wrong, be it so in plain text.

Being grammatically wrong implies that there's an error in the normal form of the language - that is, the spoken form for most languages. And I don't see it as any different from the rule that you must put the titles of books in italics.

--
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing with the youth. -- Information Society, Peace and Love, Inc.
Re: When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR(was: Re: Proposing Fraktur))
.. about Fraktur vs. Roman being a codepoint difference rather than a markup difference..

> Like everything else in character encoding, there are shades of gray, and levels of gradation, so not everything is clear cut. But recognizing up front that character codes may legitimately serve the support of algorithms, even where the feature implemented by the algorithm is merely common, and not absolutely and minimally required, is useful.
>
> A./

かたかなむようぞ！ひらがなをつかえる！！
[Katakana are unnecessary! I can use hiragana!!]

I DON'T NEED LOWERCASE! I CAN USE CAPITAL LETTERS!

→　じゅういっちゃん　←
　だんせいらしさむよう [No need for manliness]
Re: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
On Wednesday, 30 January 2002 at 00:39, Philipp Reichmuth wrote:

PR> ... for example, in German hyphenation the consonant cluster ck gets hyphenated as k-k under some circumstances. This is a rule as well, but still it is a clear case where putting it into the encoding by means of a hypothetical UNUSUAL HYPHENATION SELECTOR would be a bit inappropriate.

This is a completely algorithmic decision. "Some circumstances" is practically identical to "using the old (i.e. pre-1998) orthography" (at least I don't know a German compound word whose first part ends in -c and whose second part starts with k-). The new orthography hyphenates before the -ck. (Thus, the decision how to hyphenate ck is made for the whole text, not for the individual position, and does not need to be marked there.)

PR> I think most of these cases, including the Fraktur problem, deal with _typesetting_ rules and should thus be left to _typesetting_ software, i.e. the now-famous higher level protocol.

The question is, are typesetting rules part of the script? (I mean rules in the sense of obligatory regulations, not guidelines). If yes, (in my opinion) the plain text must carry the information that is needed to follow them. If no, their execution can be left to higher level protocols (which then have to decide whether a word is a foreign word [to be set in Roman letters] or a name [to be set in Fraktur letters], at least according to German typesetting rules).

PR> Would this mean much of an advantage over selecting a different font for the respective character by means of markup?

The advantage is that you can encode text to be displayed correctly (i.e. according to the obligatory typesetting rules) in Fraktur as plain text. You can even display this text correctly in Fraktur or Roman without change (as you can encode a Serbocroatian plain text to be displayed in Latin or Cyrillic correctly without change).

Fraktur and Roman are script variants, not font variants. Both script variants have a lot of fonts, but they are not fonts themselves.

If you regard the typesetting rules as part of the script, you can look at Fraktur as a script variant which has four cases: upper/lower for foreign words and upper/lower for the rest. The former accidentally happen to look like the two cases of the Roman script variant; thus you can use a Roman font for these two cases and another real Fraktur letter font for the other two. Cases could be left to higher level protocols, but for good reasons they are not.

-- Karl Pentzlin
ACS Analysis Consulting Software GmbH
München, Germany
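
On the ck point at the top of this message, a tiny Python sketch shows why no per-position marker is needed: the choice depends on a single setting for the whole text. The function `break_ck` is a hypothetical helper that handles only the first ck with letters on both sides; it is nowhere near a real hyphenation routine.

```python
def break_ck(word, old_orthography):
    """Show the hyphenation at the first word-internal 'ck' (toy rule)."""
    i = word.find("ck", 1)
    if i == -1 or i + 2 >= len(word):
        return word  # no ck, or nothing after it to carry to the next line
    if old_orthography:
        # Old rule: ck is split and respelled as k-k.
        return word[:i] + "k-k" + word[i + 2:]
    # New rule: the break goes before the undivided ck.
    return word[:i] + "-" + word[i:]


for old in (True, False):
    print("old" if old else "new", break_ck("Zucker", old))
```

The flag is a property of the document (which orthography it follows), so it belongs in document-level settings rather than next to each ck.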
RE: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
Karl Pentzlin wrote:

> [...] (as you can encode a Serbocroatian plain text to be displayed in Latin or Cyrillic correctly without change).

I guess you are talking about old Yugoslav character sets, as this would not be possible in Unicode. Another case of a single encoding which overlaps more than one script is ISCII, the Indian standard encoding.

> Fraktur and Roman are script variants, not font variants. Both script variants have a lot of fonts, but they are not fonts themselves.

In rich text, you don't necessarily have to set a different font for roman words in Fraktur text: the higher level protocol could be designed to have a "roman" or "loanword" tag which is independent of font choice.

In plain text, I think that plane 14 language tags could be used: imagine defining a language "old Swedish" and a sub-language "old Swedish/LOANWORD".

But I know that these language tags are not very popular, and perhaps I am stretching their usage scope too much...

Marco
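
To show what the plane 14 suggestion would look like in a data stream, here is a small Python sketch. The tag characters themselves are real (U+E0001 LANGUAGE TAG followed by tag copies of printable ASCII at U+E0020..U+E007E); the tag values `sv` and `sv-x-loanword` are invented for this illustration, and as far as I know later versions of the standard discourage using these characters for language tagging at all.

```python
import unicodedata

LANGUAGE_TAG = "\U000E0001"


def tag(lang):
    """Encode an ASCII language identifier as a Plane 14 tag sequence."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("tag characters mirror printable ASCII only")
    return LANGUAGE_TAG + "".join(chr(0xE0000 + ord(c)) for c in lang)


def strip_tags(text):
    """Drop every Plane 14 tag character, leaving the visible text."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


# Invented tag values: plain Swedish versus a loanword "sub-language".
text = tag("sv") + "Et barn, " + tag("sv-x-loanword") + "et cetera"

print(strip_tags(text))
print(len(text), "code points, of which", len(text) - len(strip_tags(text)), "are tags")
print([unicodedata.name(c) for c in tag("sv")])
```

Unlike a variant selector, a tag of this kind is stateful: it applies until the next tag, so it behaves more like scoped markup smuggled into plain text than like point-like context.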
Re: Proposing Fraktur
>> origin, while katakana and hiragana letters are very different and generally derive from completely different ideographs.
>
> Mark

Actually no. Of the 46 syllables, 31 have a shared root; only the derivation is different (block writing for katakana and fast handwriting for hiragana) ... not quite what I'd call "generally" ; )

Mìcheal
RE: Proposing Fraktur
Michael Bauer wrote:

>> origin, while katakana and hiragana letters are very different and generally derive from completely different ideographs.
>
> Mark

Mark or Marco? Well, anyway, the root is shared. :-)

> Actually no. Of the 46 syllables, 31 have a shared root, only the derivation is different (block writing for katakana and fast handwriting for hiragana) ... not quite what I'd call "generally" ; )

Oh, right! Although I count 48 syllables and 30 shared roots, that doesn't change the basic fact that my "generally" is to be corrected to "sometimes", or "often" at best...

> Mìcheal

Mìcheal or Michael? Well, anyway, the root is shared. :-)

Marco
Re: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
On Wed, Jan 30, 2002 at 09:42:08AM +0100, Karl Pentzlin wrote:

> The advantage is that you can encode text to be displayed correctly (i.e. according to the obligatory typesetting rules) in Fraktur as plain text. You even can display this text correctly in Fraktur or Roman without change (as you can encode a Serbocroatian plain text to be displayed in Latin or Cyrillic correctly without change).

What happens to the long s? That needs changing if you're talking about Roman script since the 19th century.

--
David Starner
Re: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
----- Original Message -----
From: Karl Pentzlin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: 29 January 2002 23:39
Subject: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)

> While in Swedish this is a *tradition* according to Stefan, in German it is even a *rule*.

Also in Swedish, this was a rule. But from the end of the 18th century, people began publishing books in Fraktur *only*, or antiqua *only*. In some books, the antiqua part was written in italics instead.

NOTE: This italic thing should be considered as a glyph variant.

> Maybe something like a ROMAN VARIANT SELECTOR would be appropriate:

In any case, it'd be better to have *two* selectors, one to turn on Fraktur, and a different one to turn it off. Otherwise, you'd have to put the variant selector after *every* letter you want to be in antiqua, which would require quite a lot of space. However, Fraktur is already encoded in the Mathematical whatever-it's-called block. This variant selector would mean that lots of characters can be displayed in two *different* ways. I'd prefer that Fraktur diacritics were added instead, and that the mathematical letters were to be used for Fraktur texts.

NOTE: Sometimes part of a word is in Fraktur, and a different part in antiqua. Example: the Swedish word "latin" is a Latin loan word, and should thus be written in antiqua. However, if you add the Swedish ending -sk, you'll get "latinsk" (Latin-like). The ending is Swedish and can, but doesn't have to, be written in Fraktur. It's up to the author to decide which.

> This selector could fulfill another important purpose: If this selector appears after a U+017F (long s), this character is only to be displayed as long s when it is (by means of a higher level protocol) to be displayed in Fraktur. Otherwise it is to be displayed as U+0073 (lower case s).

Long s is displayed as long s in antiqua words used in Fraktur Swedish. So this wouldn't work. Instead, one would have to write s in German texts.

A comma after a Fraktur word is displayed as *either* , or / (glyph difference), while a comma after an antiqua word is *always* displayed as ,. So I guess that a Fraktur comma would also have to be added…

Stefan
Re: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
----- Original Message -----
From: Karl Pentzlin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: 30 January 2002 09:42
Subject: Re: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)

> PR> I think most of these cases, including the Fraktur problem, deal with _typesetting_ rules and should thus be left to _typesetting_ software, i.e. the now-famous higher level protocol.
>
> The question is, are typesetting rules part of the script? (I mean rules in the sense of obligatory regulations, not guidelines). If yes, (in my opinion) the plain text must carry the information that is needed to follow them. If no, their execution can be left to higher level protocols (which then have to decide whether a word is a foreign word [to be set in Roman letters] or a name [to be set in Fraktur letters], such at least according to German typesetting rules).

In this case:

* The program would have to know which language it's dealing with, and which spelling rules are used in the text (in Swedish: free spelling (as preferred), pre-1905, and post-1905).
* It would have to know every loan word and personal name.

Here's a difficult case:

* Et: Latin word. Used in Swedish in cases such as "et cetera". Written in antiqua.
* Et: old spelling for "ett" (a, one). Written in Fraktur.

How would the program know which of them I'm referring to?

Stefan
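
The difficulty with "Et" is easy to reproduce. The Python sketch below uses a three-entry toy lexicon built only from the words mentioned in this message; the names `LEXICON` and `typeface_for`, and the default of treating unknown words as native, are assumptions made for the illustration.

```python
# Toy lexicon: which typeface(s) a spelling may require in old Swedish.
LEXICON = {
    "cetera": {"antiqua"},         # Latin
    "ett": {"fraktur"},            # Swedish
    "et": {"antiqua", "fraktur"},  # Latin "et", or old spelling of "ett"
}


def typeface_for(word):
    """Return the typeface if spelling alone decides it, otherwise None."""
    faces = LEXICON.get(word.lower(), {"fraktur"})  # assume native by default
    return next(iter(faces)) if len(faces) == 1 else None


for word in ("Et", "cetera", "ett"):
    print(word, "->", typeface_for(word) or "undecidable from spelling alone")
```

However large the word list gets, a homograph like this one stays undecidable without semantic knowledge, which has to come from the author in some form.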
Re: Proposing Fraktur
Stefan Persson wrote:

> AFAIK, the criteria for adding any character to the Standard is that there should be a difference between the character and all the other characters already supported by the Standard. Here we have such a difference; doesn't this mean that Fraktur ought to be added to the Standard?

Asmus pretty thoroughly laid out the issues for kana and Fraktur. I won't say anything further about that.

But stepping back a little further, I would like to point out that the assertion that:

"the criteria for adding any character to the Standard is that there should be a difference between the character and all the other characters already supported by the Standard"

<ital>ipsissima verba</ital> <== irony warning

begs the questions which arise about the identity of the character in the first place. Every marking on paper (or papyrus, or clay, or stone, for that matter) is not necessarily a character deserving of encoding as a character in the universal character encoding, even if I can show systematic differences between it and existing characters in the standard.

On the one hand, one must show that the differences don't fall within the range of acceptable variation for an already existing encoded character. And one must show that the entity in question has some verifiable existence as an abstract character, or that some processing requirement forces consideration of its encoding as a character. Merely being a distinct glyph is not enough.

> And so what? I thought the meaning of Unicode was that all languages should be fully supported in plain text, using one single font to display all of the characters. With old Swedish, this isn't possible.

I think this misconstrues the mission of Unicode as an encoding. The goal is to encode sufficient characters to enable the correct and legible representation of *plain* text in any script (modern or historic).

The goal is not and has never been to enable the plain text representation of *all* extant and future texts of any form. For that, markup, high-level layout, and font selection have always been required.

> Again: one language, one font.

No. One font is sufficient for monofont display of a language, tautologously. But there is no presumption that any and all text in a language need be displayed in a single font, or that such a goal would even be desirable.

--Ken

> Stefan
RE: Proposing Fraktur
Stefan Persson wrote:

> In old Swedish there was a tradition of writing words of foreign origin in the Roman type of letters (in Swedish referred to as antikva), while the rest of the words were written in Fraktur.

I have seen the same usage in German, in an old Duden dictionary: words of foreign origin and etymologies were in Roman, the rest being in Fraktur.

> This is similar to the difference between katakana and hiragana/kanji in modern Japanese.

And a similar difference is used in all modern European languages: roman for normal text and italics for foreign words.

But notice that roman, italics and Fraktur all look alike and share a common origin, while katakana and hiragana letters are very different and generally derive from completely different ideographs.

> [...] I know that the letters A-z are already supported in the Mathematical Alphanumeric Symbol block (and some in the Letterlike Symbols block),

AFAIK, those characters should not be used to compose text: they are supposed to be *symbols* to be used by mathematicians too busy to set a different font. ;-)

Marco
Re: Proposing Fraktur
----- Original Message -----
From: Marco Cimarosti [EMAIL PROTECTED]
To: 'Stefan Persson' [EMAIL PROTECTED]; Unicode-listan [EMAIL PROTECTED]
Sent: 29 January 2002 19:39
Subject: RE: Proposing Fraktur

> Stefan Persson wrote:
>> In old Swedish there was a tradition of writing words of foreign origin in the Roman type of letters (in Swedish referred to as antikva), while the rest of the words were written in Fraktur.
>
> I have seen the same usage in German, on an old Duden dictionary: words of foreign origins and etymologies were in Roman, the rest being in Fraktur.
>
> [...]
>
> And a similar difference is used in all modern European languages: roman for normal text and italics for foreign words.

The only case where I've seen this in use is for some special phrases of French origin when used in English. Besides, this is no rule (i.e. you don't have to use italics), while this rule was applied to *all* occurrences of such words in old Swedish.

> But notice that roman, italics and Fraktur all look alike and share a common origin, while katakana and hiragana letters are very different and generally derive from completely different ideographs.

And so what? I thought the meaning of Unicode was that all languages should be fully supported in plain text, using one single font to display all of the characters. With old Swedish, this isn't possible.

>> [...] I know that the letters A-z are already supported in the Mathematical Alphanumeric Symbol block (and some in the Letterlike Symbols block),
>
> AFAIK, those characters should not be used to compose text: they are supposed to be *symbols* to be used by mathematicians too busy to set a different font. ;-)

Again: one language, one font.

Stefan
Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
On Tuesday, 29 January 2002 at 17:07, Stefan Persson wrote:

SP In old Swedish there was a tradition of writing words of foreign origin in the Roman type of letters (in Swedish referred to as antikva), while the rest of the words were written in Fraktur. ...

On Tuesday, 29 January 2002 at 19:39, Marco Cimarosti wrote:

MC I have seen the same usage in German, on an old Duden dictionary: words of foreign origins and etymologies were in Roman, the rest being in Fraktur.

This is still valid for Fraktur typesetting according to the *current* Duden, at least in the 1996 edition, which I have (21st edition; the one which introduced the new German orthography that became effective in 1998). See page 66. The Duden uses e.g. the following example:

Das sinkende Schiff sandte SOS-Rufe. (The sinking ship emitted SOS calls.)
fff Ffff ff Ff aaaff
(f=Fraktur (i.e. Blackletter), a=Antiqua (i.e. Roman), F=U+017F in Fraktur)

While in Swedish this is a *tradition* according to Stefan, in German it is even a *rule*. The Duden says: "Fremdsprachige Wörter und Wortgruppen ... sind im Fraktursatz als Antiqua zu setzen", i.e. "Words of foreign languages and groups of them ... have to be typeset in Roman within Fraktur typesetting."

This may be an argument proving that the Fraktur/Roman differentiation can be a matter of text rather than of higher level protocols, as in fact claimed by Stefan.

On the other hand, Fraktur is too obviously a variant of the Latin script to be encoded separately. Maybe something like a ROMAN VARIANT SELECTOR would be appropriate: If this appears after a character which is (by means of a higher level protocol) to be displayed in Fraktur otherwise, that character is to be displayed in Roman. In other circumstances, this selector can be ignored.
This selector could fulfill another important purpose: If this selector appears after a U+017F (long s), this character is only to be displayed as long s when it is (by means of a higher level protocol) to be displayed in Fraktur. Otherwise it is to be displayed as U+0073 (lower case s). This would allow German (and maybe Swedish etc.) texts to be encoded in a way that they can be displayed correctly in Fraktur as well as in Roman. The German orthographic rules require that a normal s (U+0073) is to be used when using Roman script where a long s (U+017F) is to be used when using Fraktur script.

--
Karl Pentzlin
ACS Analysis Consulting Software GmbH
München, Germany
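As a rough illustration of the two rules proposed above, here is a minimal Python sketch. No such selector exists in Unicode: the code point standing in for it (U+E000, from the Private Use Area), the function name `resolve` and the style labels "fraktur"/"roman" are all placeholders invented for this example, and `base_style` stands in for whatever the higher level protocol has chosen.

```python
# Hypothetical stand-in for the proposed ROMAN VARIANT SELECTOR.
# U+E000 is a Private Use code point chosen only for this sketch.
RVS = "\uE000"
LONG_S = "\u017F"  # LATIN SMALL LETTER LONG S


def resolve(text, base_style):
    """Return a list of (character, style) pairs for display.

    base_style is "fraktur" or "roman", as chosen by the higher level
    protocol (an assumption of this sketch).
    """
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == RVS:
            # A selector with no preceding base character is ignored.
            i += 1
            continue
        selected = i + 1 < len(text) and text[i + 1] == RVS
        style = base_style
        if selected:
            if ch == LONG_S:
                # Long s stays long s in Fraktur, becomes plain s otherwise.
                if base_style != "fraktur":
                    ch = "s"
            elif base_style == "fraktur":
                # Any other selected character is forced to Roman.
                style = "roman"
            # In Roman context the selector is simply ignored.
            i += 2
        else:
            i += 1
        out.append((ch, style))
    return out


# "sandte SOS" from the Duden example, encoded with the placeholder selector.
sample = LONG_S + RVS + "andte " + "".join(c + RVS for c in "SOS")
for style in ("fraktur", "roman"):
    print(style, resolve(sample, style))
```

With the same encoded string, the Fraktur rendering keeps the long s and sets SOS in Roman, while the Roman rendering falls back to a plain s throughout.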
Re: Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur)
Hello Karl and others,

KP While in Swedish this is a *tradition* according to Stefan, in German it is even a *rule*. The Duden says: "Fremdsprachige Wörter und Wortgruppen ... sind im Fraktursatz als Antiqua zu setzen", i.e. "Words of foreign languages and groups of them ... have to be typeset in Roman within Fraktur typesetting."

KP This may be an argument proving that the Fraktur/Roman differentiation can be a matter of text rather than of higher level protocols, as in fact claimed by Stefan.

On the other hand, for example, in German hyphenation the consonant cluster ck gets hyphenated as k-k under some circumstances. This is a rule as well, but still it is a clear case where putting it into the encoding by means of a hypothetical UNUSUAL HYPHENATION SELECTOR would be a bit inappropriate. I think most of these cases, including the Fraktur problem, deal with _typesetting_ rules and should thus be left to _typesetting_ software, i.e. the now-famous higher level protocol.

KP If this appears after a character which is (by means of a higher level protocol) to be displayed in Fraktur otherwise, that character is to be displayed in Roman. In other circumstances, this selector can be ignored.

Would this be much of an advantage over selecting a different font for the respective character by means of markup?

Philipp
RE: Proposing Fraktur
David Starner said:

Fraktur is not a different script from the Latin script, and therefore is not encoded separately.

True, but Fraktur math characters are encoded in plane 1 for use in mathematics. These characters are not intended to be used for natural language purposes (unless you think of mathematics as a natural language :-) In which case, it's probably the only truly international natural language.

Thanks
Murray
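For readers who want to see where these characters live, here is a small Python sketch. The function name `math_fraktur` is made up for this example; the code points are those of the Mathematical Alphanumeric Symbols block in plane 1, with five capitals (C, H, I, R, Z) that sit in the Letterlike Symbols block instead. The last lines show that compatibility normalization (NFKC) folds the symbols back to ordinary letters, which fits the view that they are symbols rather than letters for running text.

```python
import unicodedata

FRAKTUR_CAPITAL_A = 0x1D504  # MATHEMATICAL FRAKTUR CAPITAL A
FRAKTUR_SMALL_A = 0x1D51E    # MATHEMATICAL FRAKTUR SMALL A

# Capitals encoded in the Letterlike Symbols block; the corresponding
# positions in the plane 1 range are reserved.
LETTERLIKE = {
    "C": 0x212D,
    "H": 0x210C,
    "I": 0x2111,
    "R": 0x211C,
    "Z": 0x2128,
}


def math_fraktur(ch):
    """Map a basic Latin letter to its mathematical Fraktur symbol."""
    if ch in LETTERLIKE:
        return chr(LETTERLIKE[ch])
    if "A" <= ch <= "Z":
        return chr(FRAKTUR_CAPITAL_A + ord(ch) - ord("A"))
    if "a" <= ch <= "z":
        return chr(FRAKTUR_SMALL_A + ord(ch) - ord("a"))
    return ch  # anything else is left unchanged


for letter in "ACz":
    symbol = math_fraktur(letter)
    print(f"U+{ord(symbol):04X}", unicodedata.name(symbol))

# Compatibility normalization maps the symbols back to plain letters.
word = "".join(math_fraktur(c) for c in "Fraktur")
print(unicodedata.normalize("NFKC", word))
```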
Re: Proposing Fraktur
Kana (Hiragana/Katakana):

Two (essentially) iso-phonic(?) systems, where each symbol in one set has a corresponding symbol in the other set, both denoting the same sound value.
The sets of forms are historically unrelated.
There is little overlap in the forms.
Competent readers will know both sets, but will learn them separately.
Convention decides which set to use, but innovative uses are known that flout these conventions.
Use of Katakana for foreign words is conventional.
Having longer texts (book length) available in both forms, however, is very uncommon (never say never).
The rules of layout are identical; spelling rules differ in the demarcation of vowel length.
Widespread daily modern use.
Monofont support is a practical everyday requirement.
Encoded as two scripts.

Latin (Fraktur/Roman/italic):

Three isophonic systems.
Forms historically related.
Some overlap in the forms. (Some forms of Fraktur have what I call 'embellished roman' capitals, instead of true Fraktur shapes.)
Knowledge of roman/italic only is widespread, but reading Fraktur can be self-taught.
Convention decides which one to use, when they occur together, but innovative uses of Fraktur are common for names and titles, and misuse of italics is rampant.
Use of roman for foreign words is a common feature of Fraktur texts. Use of italic for emphasis is a common feature of roman texts.
Books published in Fraktur have commonly been republished in roman style, as Fraktur has fallen out of common use.
The rules of layout (ligating, hyphenating, etc.) are different. In Fraktur, emphasis is denoted by s e p a r a t i n g the letters, whereas in many languages w/o a Fraktur tradition, italics have taken on this role, and character spacing is used to justify lines. For languages with Fraktur tradition, separation is still used with roman, and automatic use for character spacing is an example of poor localization (!).
No longer in widespread use. Limited to attention-grabbing (titles, names) and specialized (math) uses.
Monofont support is not an everyday requirement, except in specialized notation (mathematics).
Encoded as one script plus extension for mathematics.

I think this is a complete summary. My belief is that if Fraktur were still common today, and more commonly used together with roman, and/or if Japanese usage rules for Kana were somewhat different, then the resulting encodings might well have been different in each case.

From a purely rich-text point of view there is nothing that prevents treating the Kana as a single script. On the other hand, the layout rule differences make a simple font substitution awkward for Fraktur text of any length. So does the use of a length mark for vowels in Katakana, vs. vowel doubling in Hiragana. No such issues exist for roman/italic.

A./

PS: The set of 'scripts' unified with Latin is in fact a bit larger, if manuscript and handwriting styles are considered as well. Some handwriting styles (Suetterlin) are so different that considerable training is required to read them.

PPS: I don't care to distinguish between 'conventions' and 'rules'. A tendency to consider conventions as, or in terms of, rules is quite conventional in Germany ;-)
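The symbol-for-symbol correspondence between the two kana sets, and the spelling difference that keeps a plain substitution from being enough, can be sketched in a few lines of Python. The function name `hira_to_kata` is made up for this example; it relies only on the fact that the Hiragana letters U+3041..U+3096 and the Katakana letters U+30A1..U+30F6 are laid out in parallel, 0x60 code points apart.

```python
HIRAGANA_FIRST = 0x3041  # HIRAGANA LETTER SMALL A
HIRAGANA_LAST = 0x3096   # HIRAGANA LETTER SMALL KE
KANA_OFFSET = 0x60       # distance to the parallel Katakana letter


def hira_to_kata(text):
    """Replace each Hiragana letter with the corresponding Katakana letter."""
    out = []
    for ch in text:
        cp = ord(ch)
        if HIRAGANA_FIRST <= cp <= HIRAGANA_LAST:
            out.append(chr(cp + KANA_OFFSET))
        else:
            out.append(ch)  # Katakana, the length mark, etc. pass through
    return "".join(out)


print(hira_to_kata("ひらがな"))

# A long vowel written by vowel doubling in Hiragana stays doubled after
# the substitution; the mapping does not introduce the Katakana length
# mark (U+30FC), so spelling conventions are not converted.
print(hira_to_kata("おかあさん"))
```

This is the sense in which the mapping works character by character while the vowel-length convention still has to be handled separately.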