RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-11 Thread Philippe Verdy
Christopher John Fynn wrote: > Peter Kirk wrote: > >Consider the following: > > (1) {U+00E9} > > (2) e{U+0301} > > (3) e > class="black-text">{U+0301} > > (4) e > class="red-text">{U+0301} > > > > I would expect (1), (2) and (3) to be rendered identically, and (4) to > > differ only in the colour

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-10 Thread Christopher John Fynn
Peter Kirk wrote: >Consider the following: > (1) {U+00E9} > (2) e{U+0301} > (3) e class="black-text">{U+0301} > (4) e{U+0301} > I would expect (1), (2) and (3) to be rendered identically, and (4) to > differ only in the colour of the accent, just as it would be (apart from > (1) if U+0301 were
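Editorial sketch of the point behind cases (1), (2) and (4): the precomposed and decomposed spellings are canonically equivalent, while a markup-blind normalizer leaves the styled form alone because the mark follows ">" rather than "e". The element name `span` is an assumption (the archive stripped the tag name from the quoted examples); the class name "red-text" is taken from the thread.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC (composed) form of text."""
    return unicodedata.normalize("NFC", text)


precomposed = "\u00e9"    # case (1): U+00E9
decomposed = "e\u0301"    # case (2): e followed by U+0301
# case (4), with an assumed <span> wrapper around the combining mark
styled = 'e<span class="red-text">\u0301</span>'

# (1) and (2) are the same text under canonical equivalence.
same_text = nfc(decomposed) == precomposed

# Normalizing the raw markup changes nothing: U+0301 now follows ">",
# and there is no precomposed character for that pair.
markup_untouched = nfc(styled) == styled

print(same_text, markup_untouched)
```

Whether a renderer then draws (4) like (2), apart from the colour, is the rendering question the thread goes on to argue about; the code says nothing about that.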

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-10 Thread jon
> > I've seen text/cpp and text/java, but really there are no such > > types. I've also > > seen text/x-source-code which is at least legal, if of little value to > > interoperability. > > > > The correct MIME type for C and C++ source files is text/plain. > > This is where I disagree: Brin

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread D. Starner
> Just imagine what would be created with your assumption with this source: > const wchar_t c = L'?'; > where ? is a combining character. The programmer would get bit. At best, there's no reason to assume that every compiler accepts UTF-8, besides the fact that you can't assume that the co

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Philippe Verdy
[EMAIL PROTECTED] writes: > > > You might as well say that C code is not plain text because it too is > > > subject to special canons of interpretation. > > > > C, C++ and Java source files are not plain text as well (they > > have their own > > C, C++ and Java source files are plain text. > >

Re: plain text (was RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jcowan
Peter Constable scripsit: > Perhaps we need some new terminology here. It might be helpful to > describe an XML file as a "plain-text-markup file" (PTM, for acronym > lovers), but reserve the term "plain text file" for files that contain > text with no markup. Note that the terms being defined are

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Kirk
On 09/12/2003 06:36, [EMAIL PROTECTED] wrote: Perhaps so does yours. It isn't clear whether the CSS for .red-text would have to over-ride the default behaviour whereby an inline element like is rendered by stacking it to the left or right (depending on text directionality) of the previous inli

plain text (was RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Constable
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf > Of [EMAIL PROTECTED] > XML files most certainly are plain text XML *can* be interpreted as plain text, or it can be interpreted as something *other* than plain text (i.e. XML). This ambiguity exists for any other plain-text-based ma

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Constable
From: Philippe Verdy [mailto:[EMAIL PROTECTED] >> I see no particular value in this. The font rendering of base >> diacritic should be exactly the same as that for >> basediacritic provided the font >> characteristics are the same or do not affect metrics. > >This is wrong here: there's no guaran

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jon
> > You might as well say that C code is not plain text because it too is > > subject to special canons of interpretation. > > C, C++ and Java source files are not plain text as well (they have their own C, C++ and Java source files are plain text. > "text/*" MIME type, which is NOT "text/plain"

Re: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-09 Thread Mark Davis
[EMAIL PROTECTED]> Sent: Tue, 2003 Dec 09 00:30 Subject: RE: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)) > From: [EMAIL PROTECTED] on behalf of Kenneth Whistler > > >> Unicode doesn't prevent styling

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Philippe Verdy
> You might as well say that C code is not plain text because it too is > subject to special canons of interpretation. C, C++ and Java source files are not plain text as well (they have their own "text/*" MIME type, which is NOT "text/plain" notably because of the rules associated with end-of-line

Re: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-09 Thread Jungshik Shin
On Mon, 8 Dec 2003, Peter Jacobi wrote: > It would be most interesting, if someone can point out a wordprocessor > or even a rendering library (shouldn't Pango be the solution to > everything?), > which enables styling of individual Tamil letters. I think Pango's attributed string ( http://de

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jon
> Your alternative suggestion using svg seemed to require the user to > handle the details of glyph positioning with specified horizontal > advances, which is surely a very strange requirement. Or maybe I have > misunderstood what was going on here. Perhaps so does yours. It isn't clear whether

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Kirk
On 09/12/2003 05:13, [EMAIL PROTECTED] wrote: So, let's get this clear. Within an XML or HTML document, if I want an e with a red acute accent on it, it is quite permissible to write: e{U+0301} where {U+0301} is replaced by the actual Unicode character, and "red-text" is defined in the stylesh

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Philippe Verdy
[EMAIL PROTECTED] writes: > Philippe Verdy scripsit: > > XML files are definitely NOT plain text (if this was the case, > > then it would be forbidden to interpret "<" as a special markup > > character instead of the standard Unicode base character with > > its associated glyph)... > > You migh

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Jacobi
Hi Peter, All, Peter Kirk <[EMAIL PROTECTED]> wrote: > [...] > [About é being correct HTML] > [...] > If this is correct, then the Tamil problem which Peter J is concerned > about has gone away completely, or at least it is reduced to a tricky > rendering issue. Jungshik and Martin already vot

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jcowan
Philippe Verdy scripsit: > XML files are definitely NOT plain text (if this was the case, then it would > be forbidden to interpret "<" as a special markup character instead of the > standard Unicode base character with its associated glyph)... You might as well say that C code is not plain text

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Philippe Verdy
> -Original Message- > From: Peter Kirk [mailto:[EMAIL PROTECTED] > Sent: Tuesday 9 December 2003 13:17 > To: [EMAIL PROTECTED] > Cc: [EMAIL PROTECTED] > Subject: Re: Coloured diacritics (Was: Transcoding Tamil in the presence > of markup) > > >

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Philippe Verdy
[EMAIL PROTECTED] writes: > What is not allowed, and this makes XML technically non-conformant to the > Unicode Standard Where did you see that XML files need to be conformant to the Unicode standard? XML files are definitely NOT plain text (if this was the case, then it would be forbidden to int

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jon
> So, let's get this clear. Within an XML or HTML document, if I want an e > with a red acute accent on it, it is quite permissible to write: > > e{U+0301} > > where {U+0301} is replaced by the actual Unicode character, and > "red-text" is defined in the stylesheet. So it is not a problem that

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jcowan
Philippe Verdy scripsit: > When in doubt, don't perform any normalization of XML _files_ as they are > NOT plain text: you need an XML parser to do it safely only in relevant > sections of this file. All you could do safely is to possibly reencode XML > files (for example from UTF-8 to UTF-16 encod

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jon
> Anyone, please, is it or is it not true that XML forbids, or will forbid > in future versions, combining characters immediately after markup? XML does not forbid it, it does recommend you avoid it. Charmod defines "include-normalization" and "full-normalization" which go beyond Unicode normal
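Editorial sketch of the kind of check the Charmod recommendation implies: parse the document, then report elements whose text content begins with a combining mark. This is a simplification under stated assumptions. It treats "combining" as a nonzero canonical combining class, which is narrower than Charmod's notion of a composing character, and it ignores text that follows a child element. The function names and the sample document (including the `span` element) are hypothetical.

```python
import unicodedata
import xml.etree.ElementTree as ET


def starts_with_combining(text):
    """True if text is non-empty and its first character has a nonzero
    canonical combining class (a rough stand-in for 'composing character')."""
    return bool(text) and unicodedata.combining(text[0]) != 0


def elements_starting_with_mark(xml_source):
    """Return the tags of elements whose own text begins with a combining mark."""
    root = ET.fromstring(xml_source)
    return [el.tag for el in root.iter() if starts_with_combining(el.text)]


# Hypothetical document in the shape discussed in the thread.
doc = '<p>e<span class="red-text">\u0301</span></p>'
print(elements_starting_with_mark(doc))
```

Such a document is still well-formed XML and parses without error, which matches the answer above: the construct is discouraged, not forbidden.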

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread jcowan
Peter Kirk scripsit: > Anyone, please, is it or is it not true that XML forbids, or will forbid > in future versions, combining characters immediately after markup? XML 1.0 is silent on the subject. The W3C Character Model (which is not official yet) says that "content developers SHOULD avoid c

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Kirk
On 09/12/2003 03:41, Philippe Verdy wrote: Peter Kirk writes: Philippe, you have now stated this (several times). But just a day earlier you yourself stated that the rule forbidding combining marks at the start of a string would never be relaxed because it is fundamental to the XML containme

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Philippe Verdy
Peter Kirk writes: > Philippe, you have now stated this (several times). But just a day > earlier you yourself stated that the rule forbidding combining marks at > the start of a string would never be relaxed because it is fundamental > to the XML containment model. You don't usually contradict

Re: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-09 Thread Peter Kirk
On 08/12/2003 16:17, Kenneth Whistler wrote: ... Having an 'invisible consonant' to call for rendering of the vowel sign in isolation (and without the dotted circle), would also help the limited number of cases where the styled single character is needed - but in a rather hackish way. That i

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-09 Thread Peter Kirk
On 08/12/2003 15:51, Philippe Verdy wrote: ... Peter Kirk writes: Agreed. But now we are told that the latter is illegal XML because a combining mark is not permitted (by XML, not by Unicode) after . It is not forbidden by XML. It's just that handling an XML file (which is not plain-text)

RE: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-09 Thread Peter Constable
From: [EMAIL PROTECTED] on behalf of Kenneth Whistler >> Unicode doesn't prevent styling, of course. But having 'logical' order >> instead of 'visual' makes it a hard task for the application and the >> renderer. >> This is witnessed by the thin-spread support for this. > >Yes... Ken conceded th

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Chris Jacobs
- Original Message - From: "Christopher John Fynn" <[EMAIL PROTECTED]> To: "Unicode List" <[EMAIL PROTECTED]> Sent: Monday, December 08, 2003 6:03 PM Subject: Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup) > Andrew West wro

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Philippe Verdy
Peter Constable writes: > > A very tentative suggestion for some glue: a character which can take > > combining marks but whose function is to throw those marks back on to > > the preceding base character, preceding any markup. > > I see no particular value in this. The font rendering of base > di

RE: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-08 Thread Kenneth Whistler
Peter Jacobi said: > Unicode doesn't prevent styling, of course. But having 'logical' order > instead of 'visual' makes it a hard task for the application and the > renderer. > This is witnessed by the thin-spread support for this. Yes, but having visual order instead of logical order makes *othe

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Philippe Verdy
-Original Message- From: Philippe Verdy [mailto:[EMAIL PROTECTED] Sent: Tuesday 9 December 2003 00:11 To: Peter Kirk Cc: [EMAIL PROTECTED] Subject: RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup) Peter Kirk writes: > Agreed. But no

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Philippe Verdy
Peter Kirk writes: > Agreed. But now we are told that the latter is illegal XML because a > combining mark is not permitted (by XML, not by Unicode) after . It is not forbidden by XML. It's just that handling an XML file (which is not plain-text) as if it was a Unicode plain-text when performing n
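Editorial sketch of why normalizing an XML file as if it were plain text can be unsafe. The one case where a markup delimiter itself gets rewritten is ">" followed by U+0338 COMBINING LONG SOLIDUS OVERLAY, which NFC composes into U+226F NOT GREATER-THAN. The `<b>` element here is only a hypothetical stand-in for any tag.

```python
import unicodedata

# A tag whose content begins with U+0338 (combining long solidus overlay).
raw = "<b>\u0338</b>"

# Markup-blind NFC composes ">" + U+0338 into U+226F, so the character
# that closed the start tag is gone.
normalized = unicodedata.normalize("NFC", raw)

tag_survives = normalized.startswith("<b>")
print(tag_survives)  # False
```

A normalizer that works on the parsed text content, rather than on the serialized file, never sees ">" next to the mark and so cannot make this mistake.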

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Mete Kural
Being able to color diacritics and other characters in rendering would be great. We are trying to develop some tools to research the Quran and one of the tools is a sophisticated search engine that can search for substrings and display the search results while emphasizing the searched substrings

RE: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-08 Thread Peter Jacobi
Dear Peter Constable, Peter Kirk, All, "Peter Constable" <[EMAIL PROTECTED]> wrote: > SIL's Graphite definitely *will* permit exactly what you want to do > (assuming the font is properly designed). [...] Thanks for this clarification. Having tried SIL WorldPad with Tamil Graphite font, and not

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Peter Kirk
On 08/12/2003 11:35, Peter Constable wrote: ... I see no particular value in this. The font rendering of base diacritic should be exactly the same as that for basediacritic provided the font characteristics are the same or do not affect metrics. Agreed. But now we are told that the latter is i

RE: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-08 Thread Philippe Verdy
Peter Jacobi > To re-iterate - in the original post, the string in question did > consist of side by side characters, not ligated in any font known > to me. And the legacy Tamil encodings have for obvious reasons no > problem to style any single character. This specific case is not the one of "side

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Peter Constable
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf > Of Peter Kirk > And what if you want to colour just the dot on i? Or just the crossbar > on a t? Use Illustrator or Photoshop or Freehand or whatever your favourite graphics application is. > A very tentative suggestion for some

Re: Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-08 Thread Peter Kirk
On 08/12/2003 10:16, Peter Jacobi wrote: ... So, to promote Unicode usage, in a community, which partly sees ISCII unification as a conspiracy against the Dravidian languages, it would be very helpful to demonstrate, that everything that can be done with the legacy encodings, can also be done usin

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Peter Kirk
On 08/12/2003 10:57, Jungshik Shin wrote: ... You're another 'victim'(?!) of the multi-level representability of the Korean script. Although I consistently used syllables, letters (Jamos: complex/compound vs simple/basic), it may not have been clear to you. ... Peter, can you just open up TUS 4

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Jungshik Shin
On Mon, 8 Dec 2003, Peter Kirk wrote: > On 08/12/2003 08:37, Doug Ewell wrote: > > >Peter Kirk wrote: > >>I may have missed or misunderstood the details, but it has been > >>clearly stated here in the last few days that (a) there are more > >>than 11,000 redundant Korean characters in the BMP, a

Transcoding Tamil in the presence of markup (was Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup))

2003-12-08 Thread Peter Jacobi
Dear All, I find it rather disappointing that the question of coloring the horizontal line of 't' attracts more attention than the original question. To re-iterate - in the original post, the string in question did consist of side by side characters, not ligated in any font known to me. And

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Philippe Verdy
Christopher John Fynn wrote: > Andrew West wrote: > > > ... and similar stroke-by-stroke incremental diagrams showing > > how to write CJK ideographs are even more common in (Chinese, > > Japanese, etc.) pedagogical texts intended for both native > > children and for foreigners. I've also seen

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Peter Kirk
On 08/12/2003 08:37, Doug Ewell wrote: Peter Kirk wrote: I may have missed or misunderstood the details, but it has been clearly stated here in the last few days that (a) there are more than 11,000 redundant Korean characters in the BMP, and (b) many precomposed Korean characters lack canonic

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Christopher John Fynn
Andrew West wrote: > ... and similar stroke-by-stroke incremental diagrams showing how to write CJK > ideographs are even more common in (Chinese, Japanese, etc.) pedagogical texts > intended for both native children and for foreigners. I've also seen such > diagrams in Tibetan pedagogical texts,

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Doug Ewell
Peter Kirk wrote: > I may have missed or misunderstood the details, but it has been > clearly stated here in the last few days that (a) there are more > than 11,000 redundant Korean characters in the BMP, and (b) many > precomposed Korean characters lack canonical or even compatibility > decompos
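Editorial sketch relevant to claim (b): every precomposed Hangul syllable in U+AC00..U+D7A3 has a canonical decomposition into conjoining jamos, computed algorithmically by the normalization forms. This covers canonical decomposition only; it does not address how compound jamos relate to simple ones, which is a separate point in the thread.

```python
import unicodedata

syllable = "\uac00"  # HANGUL SYLLABLE GA

# NFD splits the syllable into its leading consonant and vowel jamos.
jamos = unicodedata.normalize("NFD", syllable)
print([f"U+{ord(c):04X}" for c in jamos])  # ['U+1100', 'U+1161']

# NFC puts them back together, so the mapping round-trips.
round_trip = unicodedata.normalize("NFC", jamos) == syllable

# Count the precomposed syllables that NFD actually decomposes.
decomposable = sum(
    1
    for cp in range(0xAC00, 0xD7A4)
    if unicodedata.normalize("NFD", chr(cp)) != chr(cp)
)
print(round_trip, decomposable)
```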

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Doug Ewell
Andrew C. West <[EMAIL PROTECTED]> > ... and similar stroke-by-stroke incremental diagrams showing how to > write CJK ideographs are even more common in (Chinese, Japanese, > etc.) pedagogical texts intended for both native children and for > foreigners. I've also seen such diagrams in Tibetan ped

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Peter Kirk
On 07/12/2003 17:40, Doug Ewell wrote: Peter Kirk wrote: Well, this is W3C's problem. They seem to have backed themselves into a corner which they need to get out of but have no easy way of doing so. Only if this issue of applying style to individual combining marks is considered a suffi

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Michael Everson
Of course, display of coloured diacritics isn't plain text. -- Michael Everson * * Everson Typography * * http://www.evertype.com

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-08 Thread Andrew C. West
On Sun, 7 Dec 2003 17:40:25 -0800, "Doug Ewell" wrote: > There are plenty of things one can do with writing that aren't supported > by computer encodings, and aren't really expected to be. The idea of a > black "i" with a red dot was mentioned. Here's another: the > piece-by-piece "exploded diagr

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Philippe Verdy
>Doug Ewell [mailto:[EMAIL PROTECTED] writes: >> Peter Kirk wrote: >> > Unicode is of course very familiar with this kind of situation e.g. >> > with character name errors, combining class errors, 11000+ redundant >> > Korean characters without decompositions, etc etc. >> >> "Without decompositio

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Doug Ewell
Peter Kirk wrote: > Well, this is W3C's problem. They seem to have backed themselves into > a corner which they need to get out of but have no easy way of doing > so. Only if this issue of applying style to individual combining marks is considered a sufficiently important text operation do they

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Philippe Verdy
Peter Kirk wrote: > On 07/12/2003 15:40, Philippe Verdy wrote: > > Peter Kirk wrote: > > > Of course there is an even simpler way to provide the glue I > > > was talking about. W3C simply needs to relax the rule forbidding > > > combining marks at the start of a string (and interpret the one > >

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Philippe Verdy
> Of course there is an even simpler way to provide the glue I was talking > about. W3C simply needs to relax the rule forbidding combining marks at > the start of a string (and interpret the one precomposed character with > ">" as base as if it were decomposed, as I suggested before), and, > r

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Peter Kirk
On 07/12/2003 15:40, Philippe Verdy wrote: Of course there is an even simpler way to provide the glue I was talking about. W3C simply needs to relax the rule forbidding combining marks at the start of a string (and interpret the one precomposed character with ">" as base as if it were decompose

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Peter Kirk
On 07/12/2003 12:10, Philippe Verdy wrote: The glue seems good in appearance but much too complex to implement in Unicode. I do think that specific occurrences of complex styles must be handled with a stylesheet, where any given grapheme cluster is applied a composite style as a whole. ...

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Philippe Verdy
Peter Kirk writes: > A very tentative suggestion for some glue: a character which can take > combining marks but whose function is to throw those marks back on to > the preceding base character, preceding any markup. This would have to > be a zero width base character, not a format character or

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Peter Kirk
On 07/12/2003 02:40, Philippe Verdy wrote: ... Just one example, suppose that you want to color the circumflex above a lowercase i or above a uppercase A: the base letters have distinct widths (meaning that the diacritic has a different horizontal position), distinct height (meaning that the diac

RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-07 Thread Philippe Verdy
John Hudson writes: > At 03:53 PM 12/6/2003, Philippe Verdy wrote: > > >Still this is an interesting problem: some texts for example want to > >exhibit some diacritics added to a base letter with a distinct color, > >notably in linguistic texts related to grammar or orthography. > > > >So for exam

Re: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

2003-12-06 Thread John Hudson
I wrote: The way to do this is to decompose bases and marks at the glyph level if they are not already decomposed at the character level... I meant to say *one* way to do this... I didn't mean to imply that it was the only way, or necessarily the best way. It would be interesting and useful to