UTR#17 comments (was RE: Unicode Public Review Issues update)
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rick McGowan

> The following public review issues are new:
>
> 25  Proposed Update UTR #17 Character Encoding Model  2004.01.27

I have submitted the following comments, copied here in case anyone wishes to discuss them:

The draft text for TR17, section 5, says: "A simple character encoding scheme is a mapping of each code unit of a CCS into a unique serialized byte sequence." It goes on to define a compound CES. While not stated explicitly, Unicode's CESs do not fit the definition of a compound CES, and so the definition for simple CES must apply. The problem is that this definition cannot accommodate all seven Unicode CESs. Since it defines a CES as a mapping from each code unit, there are only two possible byte-order-dependent mappings for 16- and 32-bit code units. In other words, the distinction between UTF-16BE and UTF-16 data that is big-endian cannot be a CES distinction, because individual code units are mapped in exactly the same way in both cases.

A definition for simple CES must, at a minimum, refer to a mapping of *streams* of code units if it is to include details about a byte-order mark that may or may not occur at the beginning of a stream. I would suggest that, in order to accommodate the UTF-16 and UTF-32 CESs, an appropriate definition should actually be a level of abstraction away from a mapping: a CES is a *specification* for mappings. Any mapping is necessarily deterministic, giving a specific output for each input. A mapping itself cannot serialize in either big-endian or little-endian format; it must be one or the other, unambiguously. On the other hand, a specification for how to map into byte sequences can be ambiguous in this regard. Thus, the UTF-16 CES can be considered a specification for mapping into byte sequences that allows a little-endian mapping or a big-endian mapping.

Peter

Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division
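[A minimal Python sketch of the distinction Peter argues for, using only standard library codecs: UTF-16BE and UTF-16LE each map a code unit sequence to exactly one byte sequence, while the UTF-16 scheme admits either byte order, signaled by a BOM. The sketch is illustrative and is not part of the submitted comments.]

    # UTF-16BE / UTF-16LE: one deterministic serialization each, no BOM.
    text = "A\u00E9"  # code units 0x0041, 0x00E9
    assert text.encode("utf-16-be").hex() == "004100e9"
    assert text.encode("utf-16-le").hex() == "4100e900"

    # The UTF-16 CES: *two* admissible serializations of the same code
    # units, disambiguated by an initial byte-order mark (FE FF / FF FE).
    big    = b"\xfe\xff" + text.encode("utf-16-be")
    little = b"\xff\xfe" + text.encode("utf-16-le")
    assert big.decode("utf-16") == little.decode("utf-16") == text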
Ethiopic numbers (was RE: Unicode Public Review Issues update)
> 26  Update properties for Ethiopic and Tamil non-decimal digits  2004.01.27
>
> Decimal numbers are those used in decimal-radix number systems. In
> particular, the sequence of the ONE character followed by the TWO
> character is interpreted as having the value of twelve. We have gotten
> feedback that this is not the case for Ethiopic or Tamil. Details are on
> the public issues page.

Comments I've submitted:

PRI #26: It is my understanding that Ethiopic numerals do not use a decimal radix. Most sources describing Ethiopic script will list the characters representing tens, 100 and 10,000. The existence of these characters, which are not combinations made from sequences of digits for 0-9, already indicates that this is not a decimal-radix system. Traditionally, each syllabic character has a numeric value associated with it. This is described on page 8 of http://www.intelligirldesign.com/paper_gabriella.pdf, which shows Arabic decimal values, and at http://www.library.cornell.edu/africana/Writing_Systems/Numeric.html, which shows traditional Ethiopic numerals. By comparing these two documents, one can get an idea of how the numbers work.

The following are some other useful discussions of Ethiopic numbering:

http://www.geez.org/Numerals/
http://www.abyssiniacybergateway.net/fidel/sera-faq_4.html
http://www.ethiopic.com/ethiopic/numerals.pdf

The last of these proposes the addition of a digit 0 in order to allow decimal-radix numbers in Ethiopic. I have no idea whether this has caught on at all, but it is not the traditional system.

Peter

Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division
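[The non-decimal structure Peter describes is visible directly in the Unicode character data. A quick check with Python's unicodedata, as it looks in a modern UCD, i.e. after this review issue was resolved; output depends on the Unicode version bundled with your interpreter:]

    import unicodedata

    # Separate characters exist for tens, hundred and ten thousand;
    # they carry plain numeric values, not decimal-digit values.
    for cp in (0x1369, 0x1372, 0x137A, 0x137B, 0x137C):
        ch = chr(cp)
        print(f"U+{cp:04X} {unicodedata.name(ch)} = {unicodedata.numeric(ch)}")
    # U+1369 ETHIOPIC DIGIT ONE = 1.0
    # U+1372 ETHIOPIC NUMBER TEN = 10.0
    # U+137A ETHIOPIC NUMBER NINETY = 90.0
    # U+137B ETHIOPIC NUMBER HUNDRED = 100.0
    # U+137C ETHIOPIC NUMBER TEN THOUSAND = 10000.0

    # No decimal-digit value, so positional interpretation does not apply:
    try:
        unicodedata.decimal(chr(0x1369))
    except ValueError:
        print("U+1369 has no decimal digit value")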
Re: Ethiopic numbers (was RE: Unicode Public Review Issues update)
On Fri, 28 Nov 2003, Peter Constable wrote:

> > 26  Update properties for Ethiopic and Tamil non-decimal digits  2004.01.27
> >
> > Decimal numbers are those used in decimal-radix number systems. In
> > particular, the sequence of the ONE character followed by the TWO
> > character is interpreted as having the value of twelve. We have gotten
> > feedback that this is not the case for Ethiopic or Tamil. Details...
>
> PRI #26: It is my understanding that Ethiopic numerals do not use a
> decimal radix. Most sources describing Ethiopic script will list the
> characters representing tens, 100 and 10,000. The existence of these
> characters, which are not combinations made from sequences of digits
> for 0-9, already indicates that this is not a decimal-radix system.

Thank you for providing many useful and interesting links. FYI, three list styles for Ethiopic scripts were implemented in Mozilla a long time ago. One of them was an Ethiopic numeric list style. All three styles are in the CSS3 draft. (http://bugzilla.mozilla.org/show_bug.cgi?id=102252)

BTW, isn't this covered in TUS 3.0, section 11.1, as well as in TUS 4.0, section 12.1 (pp. 323-324)?

Jungshik
RE: Unicode Public Review Issues update (braille)
I noticed that this message had not gotten a reply.

At 05:07 PM 10/7/03 +0200, Kent Karlsson wrote:

> > A question about the issues already open: What is the justification
> > for proposing to make Braille Lo?
>
> Shortly before this came up as a Public Review Issue, I suggested that
> Braille characters should not be regarded as ignorable symbols when
> collating texts, i.e. that they should have level one weights in the
> default weighting table. The reason being that they are more often used
> for letters than for other things. (However, I did not ask to make them
> Lo...)

That reasoning doesn't really apply. When data streams contain a mixture of Braille and other character codes, one would assume that the Braille is merely cited for the sighted, in which case it is used as a symbol. When data streams contain exclusively Braille, they are actually being used as intended. To sort such data would require a tailoring based on the Braille mapping being used. Which you recognize implicitly:

> That would be for the default ordering. Wanting a more alphabetically
> proper ordering would still require tailoring for that particular
> correspondence between ordinary letters and Braille, but not require
> converting to the ordinary letters. Each such tailoring would give
> level 1 weights to most of the Braille characters used in that system
> of usage.

That may be true, but I still don't see what difference a change in the default mapping would offer. For people reading the Braille, it provides a random, almost binary ordering, which would seem to swamp all benefits of it being level 1. [If all data to be sorted has weights only on the same level(s), the results should be unaffected by which those levels are. Or am I missing something here?] For people using data with Braille embedded, e.g. instructional material, I don't see the benefit of sorting the Braille as if it were letters by default. If you wanted to sort a list, like a Braille-to-English phrasebook, then you would need a tailored sort anyway.

A./
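[Asmus's bracketed point can be checked with a toy model of multi-level sort keys. The weights below are invented purely for illustration, not real UCA data: if every character in the data takes its weight on the same level, moving those weights to a different level leaves the resulting order unchanged.]

    # Toy three-level collation key: all weights for a string are placed
    # on a single level, the other levels left empty.
    def sort_key(s, level):
        key = [(), (), ()]
        key[level] = tuple(ord(ch) for ch in s)  # arbitrary but fixed order
        return tuple(key)

    data = ["\u2809\u2801", "\u2801\u2803", "\u2803"]  # pure-Braille strings
    assert (sorted(data, key=lambda s: sort_key(s, 0))     # weights on level 1
            == sorted(data, key=lambda s: sort_key(s, 2)))  # or on level 3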
Re: Unicode Public Review Issues update: BRAILLE
On 06/10/2003 19:08, Christopher John Fynn wrote:

> ----- Original Message -----
> From: Jony Rosenne [EMAIL PROTECTED]
>
> > Please note that Braille is used also for Hebrew. We use the same
> > codes, but they are assigned a different meaning. The reader has to
> > know or guess which language it is.
> >
> > I don't remember whether Hebrew Braille is written RTL or LTR.
> >
> > Jony
>
> Braille is probably used for a lot of scripts, maybe even *most*
> scripts used for modern languages - in Bhutan I know they use Braille
> for writing Dzongkha (Tibetan script).
>
> - Chris

Presumably it is no more difficult for a multilingual reader to know or guess what language is being used than it is for sighted readers to tell the difference between e.g. English and French in a Latin script text.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
RE: Unicode Public Review Issues update
Jony Rosenne wrote:

> I don't remember whether Hebrew Braille is written RTL or LTR.

Braille is always LTR, even for Hebrew and Arabic.

To be more precise, Braille is always LTR when you read it, but RTL when you write it manually (because it is embossed on the back side of the paper, using a slate and stylus).

_ Marco
RE: Unicode Public Review Issues update (braille)
> > A question about the issues already open: What is the justification
> > for proposing to make Braille Lo?

Shortly before this came up as a Public Review Issue, I suggested that Braille characters should not be regarded as ignorable symbols when collating texts, i.e. that they should have level one weights in the default weighting table. The reason being that they are more often used for letters than for other things. (However, I did not ask to make them Lo...) That would be for the default ordering. Wanting a more alphabetically proper ordering would still require tailoring for that particular correspondence between ordinary letters and Braille, but not require converting to the ordinary letters. Each such tailoring would give level 1 weights to most of the Braille characters used in that system of usage.

> Among other things it would make it part of identifiers. However,
> there's been some suggestion that this is a bad idea. Whether or not a
> braille symbol actually stands for a letter or a digit or a punctuation
> mark is entirely dependent on a higher-level protocol.

I would agree with your reasoning here. I don't think Braille should be used for identifiers.

> Also, by making them Lo, any parser that tries to collect words would
> run them together with any surrounding regular letters and digits. That
> seems odd, but perhaps it's not any more odd than mixing Devanagari and
> Han.

I.e., this is not so odd at all (a quite different case from identifiers).

> The original model for these was that your text processing is done in
> non-Braille, and on the last leg to a device, you would transcode the
> regular text to a Braille sequence using a domain- and language-specific
> mapping. Having the codes in Unicode allows you to preserve 'final form'
> and transmit that as needed w/o having to also transmit the
> text-to-braille mapping(s) that were used to generate the Braille
> version of the text. (This assumes that the eventual human reader can do
> 'autodetection'.)

This does not apply to texts that have been manually written in or translated to Braille (for a particular language). As I have understood it, writers/transcribers often use(d) peculiar writings, e.g. abbreviations, that would not occur in normal text; the abbreviations varied from scribe to scribe. I'm not familiar with the details, though. Braille can be used also for math and music notation.

B.t.w., Braille often uses state shifts, e.g. for digits. There is a digits Braille code, followed by one or more codes for the letters a-j (if the basic script is Latin), which then stand for 1, ..., 9, 0 (the list is terminated by any non-a-j code; but decoding the Braille for e.g. 12a is ambiguous, IIUC).

/kent k
Re: Unicode Public Review Issues update: BRAILLE
At 10:32 AM 10/7/03 +0530, [EMAIL PROTECTED] wrote:

> The only justification mentioned so far for changing Braille from So to
> Lo is to be able to use Braille in identifiers. I'm not sure why someone
> would want to use Braille in this way; for a start, how would these
> identifiers be translated into Braille?

Braille identifiers only make sense when the whole source file has been translated to Braille. However, the parsing semantics applied to it should then be determined by the properties of the original characters (before applying the Braille mapping). If one does want to work directly with a Braille-transcoded stream, then such a system must support *dynamic* property assignments. That's something that's outside the scope of the Unicode Standard.

In conclusion, it seems that the correct set of *default* properties for Braille would be determined by the needs of inserting Braille strings into other text (for educational manuals and similar specifications). As Marco has pointed out, that means BIDI=L, and I believe it also means GC=So, with other properties assigned as they are for other characters that share BIDI=L and GC=So.

A./
Re: Unicode Public Review Issues update
----- Original Message -----
From: Marco Cimarosti [EMAIL PROTECTED]
To: 'Jony Rosenne' [EMAIL PROTECTED]; 'Asmus Freytag' [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, October 07, 2003 11:47 AM
Subject: RE: Unicode Public Review Issues update

> Jony Rosenne wrote:
> > I don't remember whether Hebrew Braille is written RTL or LTR.
>
> Braille is always LTR, even for Hebrew and Arabic.
>
> To be more precise, Braille is always LTR when you read it, but RTL
> when you write it manually (because it is embossed on the back side of
> the paper, using a slate and stylus).
>
> _ Marco

Looks like those dots in paper have the Mirrored property: if I take a piece of paper, make dots 2, 3 and 4 with the 1,2,3,7 column to the left and the 4,5,6,8 column to the right, and then look at the back side, I see the 1,2,3,7 column displayed to the right.
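[That column swap can be written out as a bit manipulation on the Braille pattern code points, since U+2800 plus the dot bitmask encodes each cell. A sketch, purely to illustrate the mirroring just described:]

    # Flipping the sheet swaps dots 1,2,3,7 with dots 4,5,6,8 (same rows).
    def mirror(cell):
        m = ord(cell) - 0x2800
        left  = m & 0b00000111           # dots 1-3
        right = (m & 0b00111000) >> 3    # dots 4-6
        d7    = (m & 0b01000000) >> 6    # dot 7
        d8    = (m & 0b10000000) >> 7    # dot 8
        return chr(0x2800 | right | (left << 3) | (d8 << 6) | (d7 << 7))

    assert mirror("\u2801") == "\u2808"  # dots-1 -> dots-4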
Re: Unicode Public Review Issues update (braille)
Kent Karlsson wrote:

> The original model for these was that your text processing is done in
> non-Braille, and on the last leg to a device, you would transcode the
> regular text to a Braille sequence using a domain- and language-specific
> mapping. [...] This does not apply to texts that have been manually
> written in or translated to Braille (for a particular language). As I
> have understood it, writers/transcribers often use(d) peculiar writings,
> e.g. abbreviations, that would not occur in normal text; the
> abbreviations varied from scribe to scribe. I'm not familiar with the
> details, though. Braille can be used also for math and music notation.

I'm not sure about variation among users. I know that Braille as used for English (at least in America) has a standard set of short forms (I studied Grade II Braille, as it is called, a bit), including symbols for common letter combinations, one-letter abbreviations for common words, and sort-of escape symbol+letter abbreviations for common word endings and suffixes.

> B.t.w., Braille often uses state shifts, e.g. for digits. There is a
> digits Braille code, followed by one or more codes for the letters a-j
> (if the basic script is Latin), which then stand for 1, ..., 9, 0 (the
> list is terminated by any non-a-j code; but decoding the Braille for
> e.g. 12a is ambiguous, IIUC).

Not so. There is, indeed, a Braille symbol for numbers which, when followed by one or more letters a-j, makes the following string digits instead of letters. There is also, however, a corresponding letter sign that can be used to cancel the effect of the number shift, or to disambiguate in the case of an isolated symbol that might otherwise be confusing. Both of these, I believe (and can look up), are also used as escape characters in making suffix short forms, and are unambiguous because as letter/number shifts they appear at the beginning of a string, and not in the middle as they would for suffix short forms. (Which raises the question of how to encode a12. I presume that lettersign-a-numbersign-ab would work, for the same reason that numbersign-ab-lettersign-a works for 12a.)

~mark
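[A sketch of the shift mechanism Mark describes, for uncontracted English Braille only. The cell values are the standard dot patterns; the number sign is dots 3456 and the letter sign dots 56; everything beyond a-j is out of scope here, so this is an illustration, not a transcription tool:]

    # Number sign (dots 3456) shifts a-j to 1..9,0; letter sign (dots 56)
    # cancels the shift.
    LETTERS = {0x01: "a", 0x03: "b", 0x09: "c", 0x19: "d", 0x11: "e",
               0x0B: "f", 0x1B: "g", 0x13: "h", 0x0A: "i", 0x1A: "j"}
    TO_DIGIT = dict(zip("abcdefghij", "1234567890"))
    NUMBER_SIGN, LETTER_SIGN = "\u283C", "\u2830"

    def decode(cells):
        out, number_mode = [], False
        for cell in cells:
            if cell == NUMBER_SIGN:
                number_mode = True
            elif cell == LETTER_SIGN:
                number_mode = False
            else:
                letter = LETTERS.get(ord(cell) - 0x2800, "?")
                if number_mode and letter in TO_DIGIT:
                    out.append(TO_DIGIT[letter])
                else:
                    out.append(letter)
                    number_mode = False  # any non-a-j cell ends the run
        return "".join(out)

    # number sign, a, b, letter sign, a  ->  "12a"
    assert decode("\u283C\u2801\u2803\u2830\u2801") == "12a"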
Re: Unicode Public Review Issues update: BRAILLE
Asmus said:

> In conclusion, it seems that the correct set of *default* properties
> for Braille would be determined by the needs of inserting Braille
> strings into other text (for educational manuals and similar
> specifications). As Marco has pointed out, that means BIDI=L, and I
> believe it also means GC=So, with other properties assigned as they are
> for other characters that share BIDI=L and GC=So.

Which is a ton of them, including all the squared CJK, circled symbols, and most of the musical symbols.

The upshot so far seems to be that there is little reason to change gc=So --> gc=Lo, but that some feel that there is better reason to change bc=ON --> bc=L for the Braille symbols.

--Ken
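[For the record, that is where the properties ended up. In a sufficiently recent UCD this can be checked directly; a sketch whose output depends on the Unicode version your Python build carries:]

    import unicodedata

    ch = "\u2801"  # BRAILLE PATTERN DOTS-1
    print(unicodedata.category(ch))       # 'So'  (gc unchanged)
    print(unicodedata.bidirectional(ch))  # 'L'   (bc changed from ON)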
Re: Unicode Public Review Issues update
> The Unicode Technical Committee has posted some new issues for public
> review and comment. Details are on the following web page:
>
> http://www.unicode.org/review/

A question about the issues already open: What is the justification for proposing to make Braille Lo?
Re: Unicode Public Review Issues update
At 10:29 AM 10/6/03 +0530, [EMAIL PROTECTED] wrote:

> > The Unicode Technical Committee has posted some new issues for public
> > review and comment. Details are on the following web page:
> >
> > http://www.unicode.org/review/
>
> A question about the issues already open: What is the justification for
> proposing to make Braille Lo?

Among other things it would make it part of identifiers. However, there's been some suggestion that this is a bad idea. Whether or not a braille symbol actually stands for a letter or a digit or a punctuation mark is entirely dependent on a higher-level protocol. Also, by making them Lo, any parser that tries to collect words would run them together with any surrounding regular letters and digits. That seems odd, but perhaps it's not any more odd than mixing Devanagari and Han. We've given Braille a script ID, since it's used for running text, unlike a string of symbols. There was a lot of discussion in the meeting, which is the reason why the UTC is asking for public input before deciding.

The original model for these was that your text processing is done in non-Braille, and on the last leg to a device, you would transcode the regular text to a Braille sequence using a domain- and language-specific mapping. Having the codes in Unicode allows you to preserve 'final form' and transmit that as needed w/o having to also transmit the text-to-braille mapping(s) that were used to generate the Braille version of the text. (This assumes that the eventual human reader can do 'autodetection'.)

Needless to say, conceived this way, Braille does not fit neatly into Unicode's text handling model. The General Category, being very simplistic, can only express a single aspect of a character's use. Usually we can agree on what that primary aspect is, so gc is reasonably useful as a quick cut. However, Braille is a bit resistant if put to the question: are you symbol or letter? In reality, the Braille codes are glyph codes. We decided at some point not to allow any new types of gc values. If we didn't have that restriction, we could assign them an *Sb or *Lb (for *Symbol-Braille or *Letter-Braille). But that's an option we don't have.

One thing that we are hoping to learn is whether people are actually using these Braille codes, and are using them in ways that are or are not compatible with the model we describe in http://www.unicode.org/versions/Unicode4.0.0/ch14.pdf (see section 14.9). In terms of the organization of the book, we've clearly sorted Braille among the symbols, by the way.

Any comments?

A./
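[A minimal sketch of the "last leg" transcoding model Asmus describes. The dot patterns map to code points as U+2800 plus the dot bitmask; the three-letter table is a deliberately tiny illustrative fragment, since real mappings are language- and domain-specific:]

    # Illustrative fragment of an (uncontracted) letters-to-dots table.
    DOTS = {"a": (1,), "b": (1, 2), "c": (1, 4)}

    def to_braille(text):
        """Transcode regular text to Braille patterns on the last leg."""
        cells = []
        for ch in text:
            mask = sum(1 << (d - 1) for d in DOTS[ch])
            cells.append(chr(0x2800 + mask))  # U+2800 + dot bitmask
        return "".join(cells)

    assert to_braille("abc") == "\u2801\u2803\u2809"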
Re: Unicode Public Review Issues update
Rick McGowan wrote:

> The Unicode Technical Committee has posted some new issues for public
> review and comment. Details are on the following web page:
>
> http://www.unicode.org/review/

Maybe I'm missing something, but I still can't find any reference that the Unihan.txt file will be released under a license that permits redistribution (which has been announced in other documents).
RE: Unicode Public Review Issues update
Please note that Braille is used also for Hebrew. We use the same codes, but they are assigned a different meaning. The reader has to know or guess which language it is.

I don't remember whether Hebrew Braille is written RTL or LTR.

Jony

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Asmus Freytag
Sent: Monday, October 06, 2003 8:58 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Unicode Public Review Issues update

> [...]
Re: Unicode Public Review Issues update
Florian Weimer asked:

> > http://www.unicode.org/review/
>
> Maybe I'm missing something, but I still can't find any reference that
> the Unihan.txt file will be released under a license that permits
> redistribution (which has been announced in other documents).

Ah, you're right. It will have the same distribution as everything else; nobody's had time to update it yet since the last draft was posted. It's already in the works. If you read the 4.0 UCD, it says the Unihan file is intended to have the same distribution.

Rick
Re: Unicode Public Review Issues update: BRAILLE
----- Original Message -----
From: Jony Rosenne [EMAIL PROTECTED]

> Please note that Braille is used also for Hebrew. We use the same
> codes, but they are assigned a different meaning. The reader has to
> know or guess which language it is.
>
> I don't remember whether Hebrew Braille is written RTL or LTR.
>
> Jony

Braille is probably used for a lot of scripts, maybe even *most* scripts used for modern languages - in Bhutan I know they use Braille for writing Dzongkha (Tibetan script).

- Chris
Soft-dotted (was: RE: Unicode Public Review Issues update)
Re. the ij ligature and soft-dottedness: this is a compatibility letter, both in the sense that it has a compatibility mapping and in that it is taken from a legacy character encoding. It is, however, not necessarily a character that should not be used. Even though in most cases it is sufficient to use ordinary i and j in sequence to write the Dutch ij, in some cases it may still be best to use the ij ligature character, for best spacing, titlecasing, and vertical layout. This was discussed on this list a while ago.

Also discussed on this list was the following: the ij can in some cases be acute accented, in which case both of the dots should be replaced by acute accents. The representation of this is quite straightforward if ordinary i and j are used. If, however, the ij ligature is used, the ij ligature must first of all be soft-dotted. Applying just an ordinary combining acute accent to produce an accent on each part of the ligature may be a bit strange; it may be more logical to apply a combining double acute accent to the ij ligature to get the desired effect. A (small) typographic problem is to align the two acute accents over the constituent letter bodies. It is not certain that a grave accent is applicable to the ij, but if it is, the story is similar.

At least one Dutch dictionary (Ter Laan, Nieuw Groninger Woordenboek, 1929) uses a macron over the ij. If coded as separate i and j, one would use U+035E COMBINING DOUBLE MACRON in between. If the ij ligature is used, a combining macron should be applied after the ligature character to get a macron that goes over both of the dotless constituent letter bodies. In each of these cases (of accented ij ligature), the ij ligature must be soft-dotted.

> Interesting issue for the Latin Small ij Ligature (U+0133): normally
> Soft_Dotted is supposed to make the dot disappear when there's an
> additional diacritic above, but many applications may keep these two
> dots above, fitting the diacritic in the middle.

Examples? (Of actual use of such a letter+marks combination, not of applications that currently do what you say.)

> This proposal would mean that this becomes illegal, and it promotes the
> use of an additional intermediate dot-above diacritic if the dot must
> be kept. What would be the interpretation of this dot added on top of
> the ligature? Should it still be a single dot centered above the ij

Probably. Examples of use?

> digraph, requiring two dots to be encoded if both i and j must have
> their own dot above?

They would stack on top of each other (unless you want additional special rules, which seems very uncalled for).

> Or would this require using a diaeresis instead, centered above the
> digraph?

Probably. But are there any examples of this in use (ever, not necessarily Unicode encoded, or at all digitally encoded)? If that kind of thing never has occurred before, it does not really matter very much, and some coarse approximation will do fine.

> For the modifier letter j or Greek letter yot, this is less ambiguous.
> The proposal however is fine for the mathematical variants of i and j
> (including the double-struck italic, for unification reasons).

I think so too (though I don't know what you mean by unification here). But there are also cases where math diacritics go over a larger expression than just a single-letter variable name (the span being expressed via markup). It is probably not wise to automatically remove the dots on i's and j's in such cases. However, for cases where a math diacritic goes on top of just a single-letter i-like or j-like name, the dot should automatically be removed (or, rather, an alternate dotless glyph be used). For other cases, like more-than-a-variable expressions getting a diacritic, or when an actual undotted i or j is desired (compare TeX's \imath and \jmath), the dotless i and dotless j characters should be used. (But that is another matter, though related to the soft-dotted issue.)

/kent k

> (Note I also posted this comment in the online report form)
>
> -- Philippe.
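[The two encodings Kent contrasts, written out as code point sequences. A sketch: the compatibility mapping is standard UCD data, while the rendering comments restate the proposal rather than guaranteed behavior:]

    import unicodedata

    lig = "\u0133"  # LATIN SMALL LIGATURE IJ
    assert unicodedata.normalize("NFKD", lig) == "ij"  # compatibility mapping

    # A macron spanning the ij, the two ways discussed above:
    as_pair     = "i\u035Ej"      # i + COMBINING DOUBLE MACRON + j
    as_ligature = lig + "\u0304"  # ij ligature + COMBINING MACRON
    # Under the proposal, U+0133 is soft-dotted, so both sequences should
    # render as a macron over dotless letter bodies.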
Re: Soft-dotted (was: RE: Unicode Public Review Issues update)
On Monday, June 30, 2003 1:33 PM, Kent Karlsson [EMAIL PROTECTED] wrote:

> > Or would this require using a diaeresis instead, centered above the
> > digraph?
>
> Probably. But are there any examples of this in use (ever, not
> necessarily Unicode encoded, or at all digitally encoded)? If that kind
> of thing never has occurred before, it does not really matter very
> much, and some coarse approximation will do fine.

I admit this is quite coarse. But as there does not seem to exist any language in which a single or double dot above would be used over this character, I think that a sequence like <ij, combining dot above> would be rendered as an undotted ij pair with a single centered dot above, with the diaeresis used to make a version with two dots. So if one really wants to emulate the past bad behavior of some old fonts for <ij, combining acute accent>, where the dots were kept, one could now use <ij, combining diaeresis, combining acute accent> (using the new Soft_Dotted property of ij, which removes its dots when combining with any combining class 230 (above) diacritic, including the diaeresis, so that this will produce exactly two dots, and not four).

> > For the modifier letter j or Greek letter yot, this is less
> > ambiguous. The proposal however is fine for the mathematical variants
> > of i and j (including the double-struck italic, for unification
> > reasons).
>
> I think so too (though I don't know what you mean by unification here).

I am speaking about the few holes in the mathematics block, which were unified with pre-existing characters in other blocks. So if the update is accepted for the new mathematics block, it must be accepted also for those characters that are not present in these holes but are unified with characters of previously encoded blocks.

-- Philippe.
Re: Unicode Public Review Issues update
On Friday, June 27, 2003 10:29 PM, Rick McGowan [EMAIL PROTECTED] wrote:

> The Unicode Technical Committee has posted a new issue for public
> review and comment. Details are on the following web page:
>
> http://www.unicode.org/review/
>
> Briefly, the new issue is:
>
> Issue #11 Soft Dotted Property Proposal: The Unicode Standard has the
> principle that if an accent is applied to an i or j, the base character
> loses its dot. Such characters are called soft-dotted. The UTC proposes
> to extend this property to a number of characters that do not currently
> have the property. The accompanying document lists the characters.

Interesting issue for the Latin Small ij Ligature (U+0133): normally Soft_Dotted is supposed to make the dot disappear when there's an additional diacritic above, but many applications may keep these two dots above, fitting the diacritic in the middle. This proposal would mean that this becomes illegal, and it promotes the use of an additional intermediate dot-above diacritic if the dot must be kept. What would be the interpretation of this dot added on top of the ligature? Should it still be a single dot centered above the ij digraph, requiring two dots to be encoded if both i and j must have their own dot above? Or would this require using a diaeresis instead, centered above the digraph?

For the modifier letter j or Greek letter yot, this is less ambiguous. The proposal however is fine for the mathematical variants of i and j (including the double-struck italic, for unification reasons).

(Note I also posted this comment in the online report form)

-- Philippe.
Re: Unicode Public Review Issues update
URL please?

Rick McGowan wrote:

> The Unicode Public Review Issues page has been updated today.
> Highlights:
>
> - Closed issue #1 (Language tag deprecation) without any change.
> - Updated some deadlines on other issues to June 1, 2003.
> - Added a document for issue #7 (tailored normalizations).
> - Added an issue #8 regarding properties of math digits.
>
> Regards,
> Rick McGowan
> Unicode, Inc.
Re: Unicode Public Review Issues update
On Tue, Mar 18, 2003 at 10:26:49AM -0800, Yung-Fong Tang wrote:

> URL please?
>
> Rick McGowan wrote:
> > The Unicode Public Review Issues page has been updated today.

http://www.google.com

(Yes, it would have been nice to have a URL in the message, but it's not hard to find the page.)

-- 
David Starner - [EMAIL PROTECTED]
"Einstein once said that it would be hard to teach in a co-ed college,
since the guys would only be looking at the girls and not listening to
the teacher. It was objected that they would be listening to _him_ very
attentively, forgetting about any girls. 'But such guys won't be worth
teaching,' replied the great man."