Re: Questions on ZWNBS - for line initial holam plus alef

2003-09-19 Thread Peter Kirk
On 19/09/2003 00:47, Kent Karlsson wrote: ... How should a text rendering library deal with dbl_diacritic>? Should the character after the diacritic be drawn under the right half of the diacritic, yes Nitpick: under or above, as appropriate, the right "half" of the dbl diacritic. T

RE: Questions on ZWNBS - for line initial holam plus alef

2003-09-19 Thread Kent Karlsson
... > >How should a text rendering library deal with >dbl_diacritic>? Should the character after the diacritic be > >drawn under the right half of the diacritic, > > yes Nitpick: under or above, as appropriate, the right "half" of the dbl diacritic. There are some dbl diacritics that are below

Re: Questions on ZWNBS - for line initial holam plus alef

2003-09-18 Thread Asmus Freytag
At 08:36 PM 9/18/03 -0400, Noah Levitt wrote: On Mon, Aug 11, 2003 at 12:57:11 -0700, Kenneth Whistler wrote: > Kent asked: > > > How should a freestanding double diacritic be encoded (for purposes of > > meta-discussions, and the like): or > diacritic, SPACE>? > > It *could* be represented as ,

Re: Questions on ZWNBS - for line initial holam plus alef

2003-09-18 Thread Noah Levitt
On Mon, Aug 11, 2003 at 12:57:11 -0700, Kenneth Whistler wrote: > Kent asked: > > > How should a freestanding double diacritic be encoded (for purposes of > > meta-discussions, and the like): or > diacritic, SPACE>? > > It *could* be represented as , of course, > or for that matter , or other

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Kent Karlsson said: > I see no particular *technical* problem with using WJ, though. In > contrast > to the suggestion of using CGJ (re. another problem) anywhere else but > at the end of a combining sequence. CGJ has combining class 0, despite > being invisible and not ("visually") interfering w
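
Editorial sketch (not from the thread): the point about CGJ having combining class 0 can be checked with Python's standard unicodedata module. The marks U+0307 and U+0323 and the base letter "a" are illustrative choices, not taken from the messages. Because CGJ has class 0, canonical reordering does not move marks across it, so the sequences with and without CGJ do not normalize to the same string.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DOT_ABOVE = "\u0307"  # combining class 230
DOT_BELOW = "\u0323"  # combining class 220

# Canonical combining class of CGJ, as discussed in the message above.
cgj_class = unicodedata.combining(CGJ)

# Without CGJ, NFD reorders the two marks by ascending combining class.
plain = "a" + DOT_ABOVE + DOT_BELOW
nfd_plain = unicodedata.normalize("NFD", plain)

# With CGJ (class 0) between them, the marks keep their original order.
with_cgj = "a" + DOT_ABOVE + CGJ + DOT_BELOW
nfd_with_cgj = unicodedata.normalize("NFD", with_cgj)

print(cgj_class)
print([f"U+{ord(c):04X}" for c in nfd_plain])
print([f"U+{ord(c):04X}" for c in nfd_with_cgj])
```

This only demonstrates the reordering behaviour; it takes no position on whether CGJ should be used in the middle of a combining sequence.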

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Philippe Verdy
On Wednesday, August 06, 2003 12:36 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote: > > The NFD decompositions of spacing marks are already defined as a SPACE > > plus a non-spacing combining character. > > Philippe, please! Those are *compatibility* decompositions. The > normal form NFD only uses *c

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Mark Davis
> On 05/08/2003 09:42, Jim Allan wrote: > > > Peter Kirk posted: > > > >> If I want to do this, should I explicitly encode a dotted circle, or > >> should I encode nothing and expect the font

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Kenneth Whistler scripsit: > D17a Defective combining character sequence: A combining character > sequence that does not start with a base character. > > * Defective combining character sequences occur when a sequence >of combining characters appears at the start of a strin
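
Editorial sketch (not from the thread): a minimal check for the "start of a string" case of D17a quoted above. It assumes that "combining character" can be approximated by general category M* (Mn, Mc, Me), and the helper names are hypothetical. It does not cover any other case of the definition.

```python
import unicodedata

def is_combining(ch):
    """Approximate 'combining character' as general category Mn, Mc or Me."""
    return unicodedata.category(ch).startswith("M")

def starts_defective(text):
    """True if the string begins with a combining mark, i.e. has no base first."""
    return bool(text) and is_combining(text[0])

HOLAM = "\u05b9"  # HEBREW POINT HOLAM
ALEF = "\u05d0"   # HEBREW LETTER ALEF

print(starts_defective(HOLAM + ALEF))        # holam with nothing before it
print(starts_defective(" " + HOLAM + ALEF))  # SPACE supplies a base
```

The holam plus alef strings are only there to match the thread subject; any nonspacing mark would behave the same way.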

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Wednesday, August 06, 2003 12:38 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote: > Since I think should be canonically > equivalent to , but cannot be made > so (now), the only ways out seem to be to either formally deprecate > CGJ, or at least confine it to very specific uses. Other occurrences >

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Jon Hanna
> In the context of XML processing, where strings should (must?) be FYI. It's "should" for XML 1.1, and it's quite explicitly stated that normalisation is not required for a document to be well-formed. XML1.0 doesn't mention Unicode normalisation, although plenty of applications built on top of

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 06/08/2003 15:47, Philippe Verdy wrote: On Wednesday, August 06, 2003 11:48 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: OK, what kind of markup should I use, in any well-known markup language, to ensure that an isolated diacritic is centred in the space between the words before and after it?

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk followed up: > On 07/08/2003 07:27, Philippe Verdy wrote: > > >On Thursday, August 07, 2003 2:40 AM, Doug Ewell <[EMAIL PROTECTED]> wrote: > > > >>Kenneth Whistler wrote: > >> > >>>But I challenge you to find anything in the standard that > >>>*prohibits* such sequences from occurring

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Michael Everson
At 14:22 -0700 2003-08-08, Kenneth Whistler wrote: Philippe, you are tilting at windmills, here. There is no chance that the UTC is going to consider such a character, in my assessment, let alone give it the properties you suggest. Nor WG2 either. -- Michael Everson * * Everson Typography * * h

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Friday, August 08, 2003 9:54 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > On 08/08/2003 08:54, Philippe Verdy wrote: > > But I'm not sure that ZERO WIDTH SYMBOL is the best name, unless you > are suggesting other uses in which it really has zero width. Well, it > might have in a case like line

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Sunday, August 10, 2003 11:53 AM, Kent Karlsson <[EMAIL PROTECTED]> wrote: > <> > > Spam to Philippe Verdy is not tolerated: any unsolicited message will be > reported to the sender's Internet service provider. There was no spam in the message you deleted. This was a single post to the list, no

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 12:26, Kenneth Whistler wrote: Peter Kirk wrote: I think this may be a "Peter mistake". I meant to refer to spacing diacritics. Sorry. It is certainly highly inappropriate for spacing diacritics to be considered word boundaries. Why? It is entirely dependent on the orthog

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 08:39, Doug Ewell wrote: Peter Kirk wrote: Thank you, Ken. Well, you make it sound as if the problems are minimal, and that version I can just about accept. But if Philippe is correct about what he says about UAX#29 and UAX#14, there are some more serious problems. It is certain

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
> For me the term "difficult" is inappropriate. In fact it is invalid for > interoperability (even though it is valid, not forbidden, for > ISO10646/Unicode, as a string fragment for intermediate processing), > and such a sequence should not occur in actual documents, out of any > external processin

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "Jon Hanna" <[EMAIL PROTECTED]> > If this is > > different, then it is not XML but a derived language (for example HTML or > > SGML which are using more "relaxed" syntaxes). > > XML is derived from SGML, not the other way around. Still doesn't matter. I did not say that, despite the senten

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Mark Davis
There are a number of incorrect statements. My comments below. - Original Message - From: "Peter Kirk" <[EMAIL PROTECTED]> To: "Kenneth Whistler" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Monday, August 11, 2003 16:28 Subject: Re: Questions

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "Jon Hanna" <[EMAIL PROTECTED]> > Lots of different things happen that affect the whitespace of an XML > document (whether a DOM tree is constructed or not, since it isn't the only > legal way to process an XML document). Of course one is not required to build an actual DOM tree, however XM

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 18:46, Mark Davis wrote: There are a number of incorrect statements. My comments below. Thanks for the clarifications. Sorry about the inaccuracies. On some maybe Philippe misled me, on others it is just my inadequate understanding. ... In practice, looking at a character past

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > So far so good, but when I get to an accent with no predefined spacing > variant, I have a problem! No you don't. If you want to say is the diacritic used to represent linguolabial sounds in the IPA, then you just encode U+0020 U+033C at the beginning of the next line.
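
Editorial sketch (not from the thread): the U+0020 U+033C encoding suggested above, built and inspected with the standard unicodedata module. The helper names isolated and describe are hypothetical, and NBSP is included only because it is raised elsewhere in the thread as an alternative base.

```python
import unicodedata

SPACE = "\u0020"
NBSP = "\u00a0"
SEAGULL_BELOW = "\u033c"  # the linguolabial diacritic mentioned above

def isolated(mark, base=SPACE):
    """Return base + mark, refusing anything that is not a combining mark."""
    if not unicodedata.category(mark).startswith("M"):
        raise ValueError("expected a combining mark")
    return base + mark

def describe(text):
    """List each code point with its Unicode character name."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]

print(describe(isolated(SEAGULL_BELOW)))
print(describe(isolated(SEAGULL_BELOW, base=NBSP)))
```

How the result is rendered (with or without a dotted circle, and with what advance width) depends on the font and rendering engine and is not something this sketch can show.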

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 12/08/2003 04:17, Jon Hanna wrote: Thanks for the clarification. I probably misunderstood Jon's intention. But is there a problem if, for example, an application sees the string and regularises it (wrongly!) to combining mark>? Yes, I was not saying that it wouldn't be sensible to begin

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > Philippe or anyone else, would it be "XML-safe" to use NBSP rather than > SP as the base character for spacing diacritics in XML? Perhaps that's > the answer here. I know there are still some issues of detail concerning > the line breaking, but apart from that is there an

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > Sorry, I'm confused. Are you saying that the input processing will > translate line breaks into spaces within attribute values, unless > inserted as ? Well, I suppose this is fair enough as it is up to > the user not to enter garbage. Yes, that is how attribute values

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
- Original Message - From: "John Cowan" <[EMAIL PROTECTED]> To: "Peter Kirk" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Wednesday, August 13, 2003 5:31 AM Subject: Re: Questions on ZWNBS - for line initial holam plus alef > Peter Kir

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jony Rosenne
2003 8:43 PM > To: Philippe Verdy > Cc: [EMAIL PROTECTED] > Subject: Re: Questions on ZWNBS - for line initial holam plus alef > > > On 13/08/2003 11:09, Philippe Verdy wrote: > > >... For this reason, defective > >combining sequences (combining characters w

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 12/08/2003 07:05, John Cowan wrote: Very true. But what is this whitespace normalization? 1) Throughout the document, line-end characters and sequences are normalized to LF. Not relevant here. 2) In attribute values, LF, CR, and TAB characters are normalized to spaces. Not relevant here.
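
Editorial sketch (not from the thread): rule 2 above can be observed with Python's xml.etree.ElementTree, whose underlying parser applies attribute-value normalization. The element name w and attribute name form are made up for the illustration. A literal line feed inside the attribute value comes back as a space, so a mark that began a line ends up after SPACE; a numeric character reference for the line feed is preserved.

```python
import xml.etree.ElementTree as ET

MARK = "\u033c"  # any combining mark would do; this one appears in the thread

# A literal LF inside the attribute value, followed by a combining mark.
literal = ET.fromstring('<w form="x\n' + MARK + '"/>')

# The same value with the LF written as a character reference.
escaped = ET.fromstring('<w form="x&#10;' + MARK + '"/>')

literal_value = literal.get("form")  # LF has been normalized to SPACE
escaped_value = escaped.get("form")  # LF survives as LF

print([f"U+{ord(c):04X}" for c in literal_value])
print([f"U+{ord(c):04X}" for c in escaped_value])
```

Whether the resulting SPACE plus mark sequence is then treated as a spacing diacritic is up to the consuming application; the parser itself only substitutes the character.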

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 12/08/2003 09:00, Philippe Verdy wrote: This is really a shame that there is no "XML-safe" base character in Unicode to represent leading spacing diacritics in actual documents (either in HTML, XML, SGML, or even for other Rich-Text format, including TeX, RTF, or proprietary text formats like M

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 13/08/2003 14:07, Philippe Verdy wrote: I did not notice that the discussion about Hebrew holam male was related. In fact I don't know anything about the Hebrew alphabet so I could not understand the semantics discussed, and so did not note that was a "defective" encoding (in terms of combining

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
> I do agree: an XML document could require the use at some place of a > given attribute or element. If this attribute name follows the element > name > after a line break, which gets changed into a space during parsing, > forcing > XML parsers to treat SPACE+combining as an unbreakable grapheme > cl

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > >2) In attribute values, LF, CR, and TAB characters are normalized to > >spaces. Not relevant here. > > This would be relevant if it is legal for the character after LF, CR, > and TAB to be a combining mark. Is this legal? In this case what was > previously a defective

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Mark Davis
- Original Message - From: "Kenneth Whistler" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Monday, August 11, 2003 12:26 Subject: Re: Questions on ZWNBS - for line initial holam plus alef > Pet

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 12/08/2003 20:28, John Cowan wrote: Peter Kirk scripsit: 2) In attribute values, LF, CR, and TAB characters are normalized to spaces. Not relevant here. This would be relevant if it is legal for the character after LF, CR, and TAB to be a combining mark. Is this legal? In this ca

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 13/08/2003 11:09, Philippe Verdy wrote: ... For this reason, defective combining sequences (combining characters without a leading base character) should be forbidden (invalid for XML). If there is even the remotest possibility of this happening, we need to know quickly! Defective combining

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > These processes cannot > simply take a space as a space and process it as such. Every time they > come across a space (which is very often!) they have to test whether it > is followed by a combining character, and if it is they have to treat > that space specially. Thi

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > Really? It looks to me as if U+00B4 and U+02D8 to U+02DD have only a > compatibility equivalences to space plus diacritic, and U+005E and > U+0060 don't even have compatibility equivalences. Indeed. The last two, BTW, are because the ASCII repertoire has taken on a life
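
Editorial sketch (not from the thread): the distinction drawn above can be inspected with the standard unicodedata module. The code points U+00B4, U+02D8, U+02DD, U+005E and U+0060 are the ones named in the message; U+00E9 is added here only as a contrasting character with a canonical decomposition. The helper name decomposition_kind is hypothetical.

```python
import unicodedata

def decomposition_kind(ch):
    """Classify a character's decomposition mapping from the character database."""
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings carry a tag such as <compat>; canonical ones do not.
    return "compatibility" if mapping.startswith("<") else "canonical"

for cp in (0x00B4, 0x02D8, 0x02DD, 0x005E, 0x0060, 0x00E9):
    ch = chr(cp)
    print(f"U+{cp:04X}", decomposition_kind(ch), repr(unicodedata.decomposition(ch)))

acute = "\u00b4"  # ACUTE ACCENT (spacing)
nfd = unicodedata.normalize("NFD", acute)    # canonical only: unchanged
nfkd = unicodedata.normalize("NFKD", acute)  # compatibility: SPACE + U+0301
print([f"U+{ord(c):04X}" for c in nfd], [f"U+{ord(c):04X}" for c in nfkd])
```

So a spacing accent such as U+00B4 becomes SPACE plus a combining mark only under the compatibility forms (NFKD, NFKC), never under NFD or NFC.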

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
> Of course one is not required to build an actual DOM tree, > however XML, HTML > and alike is now defined in terms of the DOM, where the text/xml syntax is > just a serialization, which is the only place where whitespaces > normalization is defined (such normalization does not occur at the DOM >

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Thomas M. Widmann
Peter Kirk <[EMAIL PROTECTED]> writes: > On 08/08/2003 08:54, Philippe Verdy wrote: > > > ... Could there be another codepoint assigned that has > > > >these properties: > > > >20CF;ZERO WIDTH SYMBOL;Sk;0;ON; 0020N; > > [...] > But I'm not sure that ZERO WIDTH SYMBOL is the best name, unl

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > On 13/08/2003 11:09, Philippe Verdy wrote: > > >... For this reason, defective > >combining sequences (combining characters without a leading base > >character) should be forbidden (invalid for XML). > > > > > If there is even the remotest possibility of this happening, we

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
John Hudson scripsit: > Again, you are working on the assumption that U+0020 is represented by an > actual painted glyph and not e.g. by a horizontal offset. In my experience, > the more sophisticated the application -- e.g. a professional page layout > application rather than a word processor

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Monday, August 11, 2003 12:27 AM, Kenneth Whistler <[EMAIL PROTECTED]> wrote: > A point I keep trying to make, but which often gets overlooked > by people trying to code Unicode mechanisms for dealing with > edge cases, is that the design goal of the Unicode Standard is, > and always has been,

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
- Original Message - From: "Doug Ewell" <[EMAIL PROTECTED]> To: "Unicode Mailing List" <[EMAIL PROTECTED]> Cc: "Peter Kirk" <[EMAIL PROTECTED]>; "Kenneth Whistler" <[EMAIL PROTECTED]> Sent: Monday, August 11, 2003 5:39 PM Su

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Jim Allan
Philippe Verdy posted: Could ZWS+combining diacritic be the best solution for isolated diacritics in text? From http://www.unicode.org/book/ch04.pdf: << * Such characters may be large enough to affect the placement of their base character relative to preceding and succeeding base characters. F

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 13/08/2003 04:44, Jon Hanna wrote: No, the safe thing to do (and the thing that is done) is to treat the space as a space ignoring the fact that the NMTOKEN contains a combining character, this is even safer than your suggestion since it can't mis-identify the combining properties of a characte

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 06:59, Jon Hanna wrote: There are only two theoretical problems that I can see here, the first is that a whitespace character other than space gets converted to space by attribute value normalisation, and that this changes the meaning of the text in some way. This could only occur if

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
- Original Message - From: "Peter Kirk" <[EMAIL PROTECTED]> To: "Jon Hanna" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Wednesday, August 13, 2003 3:05 PM Subject: Re: Questions on ZWNBS - for line initial holam plus alef > On 13/08/2003

Fw: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "Jon Hanna" <[EMAIL PROTECTED]> > Some of these only apply to elements that do not allow any > character data apart from whitespace to appear directly within them, and > hence are not an issue here. Some happen at relatively high level of > processing, e.g. rendering (not parsing) of HTML, an

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 11:45, Kenneth Whistler wrote: Peter Kirk responded: On 11/08/2003 06:59, Jon Hanna wrote: There are only two theoretical problems that I can see here, the first is that a whitespace character other than space gets converted to space by attribute value normalisation, and th

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "Kenneth Whistler" <[EMAIL PROTECTED]> > It is perfectly reasonable, as I see it, to consider the > in a sequence to be: > a. significant > b. part of the characters in a document that are not markup > (at least in the cases we are talking about, since the > problem is not abo

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Philippe Verdy scripsit: > Of course one is not required to build an actual DOM tree, however XML, HTML > and alike is now defined in terms of the DOM, where the text/xml syntax is > just a serialization, This is absolutely false. XML is defined by the XML Recommendation, which is entirely synta

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 16:06, Mark Davis wrote: Some of this seems to be in reference to an earlier contention that Text Boundaries (inc. Lines) break between the space and the non-spacing mark. I think this was attributed to Philippe. [This may not be true: I don't actually read his email, because the inf
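
Editorial sketch (not from the thread): a deliberately simplified version of the grapheme cluster rule that forbids a break before an extending character. It assumes that "extending" can be approximated by general categories Mn and Me, and it ignores every other rule of the real segmentation algorithm (CR LF, Hangul, spacing marks and so on). Under that one rule, SPACE followed by a nonspacing mark stays in a single cluster.

```python
import unicodedata

def simple_clusters(text):
    """Group each character with the nonspacing/enclosing marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch   # no boundary before an extending mark
        else:
            clusters.append(ch)  # anything else starts a new cluster
    return clusters

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

print(simple_clusters("a " + ACUTE + "b"))  # the space and the mark stay together
print(simple_clusters(ACUTE + "a"))         # a leading mark forms its own cluster
```

Line breaking is a separate algorithm with its own rules; this sketch says nothing about where lines may break.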

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kent Karlsson
Michael wrote: > The Name Police reject this utterly. ZERO WIDTH cannot have an > expanding dynamic width. Then what about ZERO WIDTH SPACE, which, according to TUS3, p. 238, "can grow to have a visible width when justified"? And it has the NamesList comment: * nominally zero width, but

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 09/08/2003 13:23, Noah Levitt wrote: According to the docs at http://www.microsoft.com/typography/otfntdev/indicot/other.htm, uniscribe renders combining marks in isolation when they are applied to SPACE + ZWJ. (Without the ZWJ, it uses a dotted circle.) Perhaps this is an acceptable solution t

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Hudson
At 11:36 AM 8/11/2003, John Cowan wrote: > So far so good, but when I get to an accent with no predefined spacing > variant, I have a problem! No you don't. If you want to say is the diacritic used to represent linguolabial sounds in the IPA, then you just encode U+0020 U+033C at the beginning o

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 13/08/2003 15:54, Jony Rosenne wrote: Suggested but not accepted. I am inherently suspicious when pressure is being exerted to decide complex and difficult questions in a hurry. Jony Jony, I am not trying to hurry anything. I am putting a lot of time and effort into trying to reach proper

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
> The only way to bypass this would be to use entity references to encode > the base space needed by the Unicode convention, so this is related to > what Unicode defines as a higher level protocol, needed here to bypass > the limitations of basic text. However it still creates a problem within > C

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 06/08/2003 15:24, Doug Ewell wrote: Like Freud's cigar, sometimes a "may" is just a "may." And I suspect the phrase "any intelligent typographer" MAY generate some flak from typographers on this list who consider themselves "intelligent enough" yet have a different opinion. I'm not a typograph

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Jony Rosenne
From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk > Sent: Wednesday, August 06, 2003 12:11 PM > To: Curtis Clark > Cc: Unicode List > Subject: Re: Display of Isolated Nonspacing Marks (was Re: > Questions on ZWNBS...) > > > On 05/

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kent Karlsson
Kenneth Whistler wrote: > Kent Karlsson said: > > > I see no particular *technical* problem with using WJ, though. In > > contrast > > to the suggestion of using CGJ (re. another problem) > anywhere else but > > at the end of a combining sequence. CGJ has combining class > 0, despite > > bein

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Philippe Verdy
On Wednesday, August 06, 2003 11:48 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > OK, what kind of markup should I use, in any well-known markup > language, to ensure that an isolated diacritic is centred in the > space between the words before and after it? In plain text, I think that this encodin

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "John Cowan" <[EMAIL PROTECTED]> > Peter Kirk scripsit: > > > So far so good, but when I get to an accent with no predefined spacing > > variant, I have a problem! > > No you don't. If you want to say is the diacritic used to > represent linguolabial sounds in the IPA, then you just encode

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Saturday, August 09, 2003 3:11 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote: > Michael wrote: > > The Name Police reject this utterly. ZERO WIDTH cannot have an > > expanding dynamic width. > > Then what about ZERO WIDTH SPACE, which, according to TUS3, p. 238, > "can grow to have a visible wid

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Hudson
At 05:27 PM 8/8/2003, Kenneth Whistler wrote: Because the mechanism for doing so -- application to SPACE or to NBSP -- has been specified by the standard for a decade now. True enough, but I'm also a bit concerned about this mechanism because white space characters are another pesky thing that no

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Kent asked: > How should a freestanding double diacritic be encoded (for purposes of > meta-discussions, and the like): or diacritic, SPACE>? It *could* be represented as , of course, or for that matter , or other possibilities. The combining character sequence, in either case, is the sequenc

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Saturday, August 09, 2003 12:49 AM, Michael Everson <[EMAIL PROTECTED]> wrote: > At 14:22 -0700 2003-08-08, Kenneth Whistler wrote: > > > Philippe, you are tilting at windmills, here. There is no chance > > that the UTC is going to consider such a character, in my > > assessment, let alone giv

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 06/08/2003 03:38, Kent Karlsson wrote: Kenneth Whistler wrote: Kent Karlsson said: I see no particular *technical* problem with using WJ, though. In contrast to the suggestion of using CGJ (re. another problem) anywhere else but at the end of a combining sequence. CGJ ha

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kent Karlsson
> The NFD decompositions of spacing marks are already defined as a SPACE > plus a non-spacing combining character. Philippe, please! Those are *compatibility* decompositions. The normal form NFD only uses *canonical* decompositions. And there is no such thing as "NFD decompositions". /ke

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
> >3) In attribute values that have a declared type other than > CDATA, multiple > > spaces are compressed to a single space, and leading and > trailing spaces > > are removed. After this is done, there can be no spaces in attributes > > of type ID, IDREF, ENTITY, NMTOKEN, NOTATION, or enume
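
Editorial sketch (not from the thread): rule 3 quoted above, written out as a small function so its effect on a space that carries a combining mark can be seen. The function name normalize_tokenized is hypothetical, and the sketch assumes the earlier normalization step has already turned other whitespace into U+0020. Only U+0020 is collapsed and trimmed; the combining mark is not consulted at all.

```python
def normalize_tokenized(value):
    """Collapse runs of U+0020 to one space and strip leading/trailing U+0020."""
    return " ".join(token for token in value.split(" ") if token)

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

# A leading SPACE intended as the base of an isolated accent is removed,
# leaving the accent at the very start of the normalized value.
raw = "  " + ACUTE + "abc   def "
normalized = normalize_tokenized(raw)

print([f"U+{ord(c):04X}" for c in normalized])
```

This is why the thread treats SPACE as a fragile base character inside non-CDATA attribute values: the space can be dropped without any regard to the mark that follows it.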

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 05/08/2003 09:42, Jim Allan wrote: Peter Kirk posted: If I want to do this, should I explicitly encode a dotted circle, or should I encode nothing and expect the font to generate the dotted circle, as it often does? I think that practice of a font or application automatically inserting a do

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Doug Ewell
Peter Kirk wrote: > Thank you, Ken. Well, you make it sound as if the problems are > minimal, and that version I can just about accept. But if Philippe is > correct about what he says about UAX#29 and UAX#14, there are some > more serious problems. It is certainly highly inappropriate for > non-s

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
> Thanks for the clarification. I probably misunderstood Jon's intention. > But is there a problem if, for example, an application sees the string > and regularises it (wrongly!) to combining mark>? Yes, I was not saying that it wouldn't be sensible to begin a line of text with a spacing diacrit

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Wednesday, August 06, 2003 10:19 PM, Kenneth Whistler <[EMAIL PROTECTED]> wrote: > Kent Karlsson responded: > > > > > I see no particular *technical* problem with using WJ, though. > > > > In contrast > > > > to the suggestion of using CGJ (re. another problem) > > > anywhere else but > > > >

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 18:03, John Cowan wrote: You don't have (nor do I) the vaguest idea why Microsoft produced this particular nonconforming implementation, or whether they consider it a bug or not. Don't make assumptions about things you don't know anything about. I have been working closely and pe

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 08/08/2003 13:56, Thomas M. Widmann wrote: Peter Kirk <[EMAIL PROTECTED]> writes: On 08/08/2003 08:54, Philippe Verdy wrote: ... Could there be another codepoint assigned that has these properties: 20CF;ZERO WIDTH SYMBOL;Sk;0;ON; 0020N; [...] But I'm not sure that ZER

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk asked: > Thanks for the clarification. I probably misunderstood Jon's intention. > But is there a problem if, for example, an application sees the string > and regularises it (wrongly!) to combining mark>? Then you have a problem, of course. What the Unicode Standard says about ap

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk wrote: > I think this may be a "Peter mistake". I meant to refer to spacing > diacritics. Sorry. > > It is certainly highly inappropriate for spacing diacritics to > be considered word boundaries. Why? It is entirely dependent on the orthography and conventions involved. There is pr

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jim Allan
Ken Whistler posted: Of course a standard which mandates space folding is also within its rights to mandate, for example, the non-use of nonspacing marks applied to SPACE characters. It can simply rule out such sequences as valid for its context, in which case the problem goes away. And for such

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Ted Hopp asked: > I believe that reasonable people might reasonably conclude from factoids 1 > and 2 that SPACE is indeed a format character. > > Reasonable, but evidently wrong. Explanation, please? I provided the text deconstruction in my last email, but to continue, the confusion arises from

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Philippe Verdy
On Thursday, August 07, 2003 8:06 PM, Peter Kirk <[EMAIL PROTECTED]> wrote: > On 06/08/2003 15:47, Philippe Verdy wrote: > > > On Wednesday, August 06, 2003 11:48 PM, Peter Kirk > > <[EMAIL PROTECTED]> wrote: > > > > > > > > > OK, what kind of markup should I use, in any well-known markup > >

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread John Cowan
Peter Kirk scripsit: > This is a clear demonstration that Microsoft also has problems with the > mechanism which has been defined in the standard for ten years, This is a clear demonstration that Uniscribe fails to implement a standard correctly, a property unique neither to Microsoft nor to t

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "Peter Kirk" <[EMAIL PROTECTED]> > On 13/08/2003 11:09, Philippe Verdy wrote: > > >... For this reason, defective > >combining sequences (combining characters without a leading base > >character) should be forbidden (invalid for XML). > > > > > If there is even the remotest possibility of th

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: "Jon Hanna" <[EMAIL PROTECTED]> > I was saying that it wouldn't be sensible to begin a line with a > combining diacritic, since that combining diacritic would be combining > with a newline character which it's difficult to think of any possible > sensible meaning for. A newline is a contro

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
conformity. Jony -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Kirk Sent: Wednesday, August 06, 2003 12:11 PM To: Curtis Clark Cc: Unicode List Subject: Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...) On 05/08/2003 16:59

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Michael Everson
At 15:11 +0200 2003-08-09, Kent Karlsson wrote: Michael wrote: The Name Police reject this utterly. ZERO WIDTH cannot have an expanding dynamic width. Then what about ZERO WIDTH SPACE, which, according to TUS3, p. 238, "can grow to have a visible width when justified"? And it has the NamesList co

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 05/08/2003 15:53, Ted Hopp wrote: On Tuesday, August 05, 2003 5:40 PM, Mark Davis wrote: Where did you get the notion that space is not a base character? And base characters include those that are not control or format characters. Space is neither one. Well, I think Jim Allan pointed to

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 08/08/2003 09:54, Jim Allan wrote: ... It certainly makes sense that in the case of space characters that have a defined width that this width is innate to the definition of the character and in such a case should take precedence over the width of the normally non-spacing combining characte

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Jon Hanna
> (provided that the whitespace normalization algorithm will not > include in the whitespaces sequence and treat it > isolately, something that a conforming HTML or XML processor > should not do, as it should unify only sequences of , > , , , and only according to the context of the > containing e

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 05/08/2003 14:40, Mark Davis wrote: Where did you get the notion that space is not a base character? And base characters include those that are not control or format characters. Space is neither one. The standard specifically states in a number of places that to exhibit a combining mark in isol
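Editorial sketch for the point quoted above (SPACE is neither a control nor a format character, so it can serve as a base for a combining mark): the following Python uses only the standard `unicodedata` module to look up general categories. The helper names are hypothetical and chosen for illustration.

```python
import unicodedata

SPACE = "\u0020"
COMBINING_ACUTE = "\u0301"
ZERO_WIDTH_JOINER = "\u200D"

def is_control_or_format(ch: str) -> bool:
    # Cc = control, Cf = format; everything else is outside both classes.
    return unicodedata.category(ch) in ("Cc", "Cf")

def is_nonspacing_mark(ch: str) -> bool:
    # Mn = nonspacing mark.
    return unicodedata.category(ch) == "Mn"

# The sequence discussed for showing a mark in isolation: SPACE + mark.
isolated_acute = SPACE + COMBINING_ACUTE
```

SPACE has general category Zs (space separator), whereas a character such as ZERO WIDTH JOINER is Cf. The property data alone says nothing about how a renderer should size the result, which is the open question in this thread.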

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Mark Davis
___ http://www.macchiato.com ► “Eppur si muove” ◄ - Original Message - From: "Kenneth Whistler" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Wednesday, August 06, 2003 15:48 Subject: Re: Display of Isolate

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Jon Hanna scripsit: > If this is not the case (I'm not entirely sure this bans what XML does with > spaces) then all we would need is a change so that rather than a de facto > ban on space+combining within names and nmtokens we would have an explicit > ban on the same; then we'd all be happy, exce

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Philippe Verdy
On Sunday, August 10, 2003 9:30 AM, Mark Davis <[EMAIL PROTECTED]> wrote: > > As for oe-ligature, the > > French representative to WG3 (or its predecessor) said that France > > could live without it. > > Even worse; the story I heard was that the committee had planned from > the start to have Œ a

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Curtis Clark
on 2003-08-06 15:24 Doug Ewell wrote: I'm not a typographer (intelligent or otherwise), but I'm having a tough time seeing how Section 2.10 *requires* fonts and rendering engines to give a space-plus-combining-diacritic combination the exact minimum width of the diacritic alone, or to leave equal s

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 08/08/2003 08:54, Philippe Verdy wrote: ... Could there be another codepoint assigned that has these properties: 20CF;ZERO WIDTH SYMBOL;Sk;0;ON; 0020N; i.e. being considered symbolic, not a whitespace, with combining class 0 (not combining), and used as an explicit base for a isolate

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Noah Levitt
According to the docs at http://www.microsoft.com/typography/otfntdev/indicot/other.htm, Uniscribe renders combining marks in isolation when they are applied to SPACE + ZWJ. (Without the ZWJ, it uses a dotted circle.) Perhaps this is an acceptable solution to the people calling for a new character.

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kent Karlsson
> > there is no such thing as "NFD decompositions". > > Sorry for the confusion. Still even with a NFKD decomposition, And there is no such thing as NFKD decomposition either. It goes as follows, in steps: 1. Canonical and compatibility decomposition mappings (one-step), and canonical class
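Editorial sketch for the distinction drawn above between a one-step decomposition mapping and a full normalization form: Python's standard `unicodedata` module exposes both, so the difference can be seen directly. This only illustrates the terminology; it does not reconstruct the rest of the truncated message.

```python
import unicodedata

# One-step decomposition *mapping* as recorded in the character database.
acute_accent_mapping = unicodedata.decomposition("\u00B4")   # ACUTE ACCENT
u_diaeresis_macron_mapping = unicodedata.decomposition("\u01D5")

# Normalization *forms* apply the mappings recursively and then reorder
# combining marks by canonical combining class.
nfkd_acute_accent = unicodedata.normalize("NFKD", "\u00B4")
nfd_u_diaeresis_macron = unicodedata.normalize("NFD", "\u01D5")

# Canonical reordering: U+0323 (class 220) sorts before U+0301 (class 230).
reordered = unicodedata.normalize("NFD", "a\u0301\u0323")
```

The mapping for U+00B4 is the compatibility mapping to SPACE plus COMBINING ACUTE ACCENT, which is why that character surfaces in discussions of isolated marks. The mapping for U+01D5 still contains the precomposed U+00DC, which full NFD decomposes further.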

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Doug Ewell
Peter Kirk wrote: > Point taken. But when different fonts and rendering engines give > different results because the standard is unclear or ambiguous, that > is a matter for the discussion here. And when conforming fonts and > rendering engines fail to give the required results, that may also be

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
the > solution with > SPACE is really tricky due to the special treatment of SPACE notably > in HTML, SGML, XML I disagree. There are a few different things that happen with whitespace in such technologies. Some of these only apply to elements that do not allow any character data apart from whites

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Peter responded to Mark: > On 05/08/2003 14:40, Mark Davis wrote: > > >Where did you get the notion that space is not a base character? And > >base characters include those that are not control or format > >characters. Space is neither one. > > > >The standard specifically states in a number of p
