RE: informative due to variation across languages
On Tue, 19 Jun 2001, Marco Cimarosti wrote:

> Peter Constable wrote:
> > Can anyone think of other examples of informative properties
> > that are so because the property is typical but not true for
> > all languages?
>
> [snip]

I arrived late to this discussion. Is "culturally correct" sorting/filing such a property? I believe the Japanese and Koreans sort/file Kanji/Hani phonetically, as if they were written in kana and hangul, and that software cannot be expected to derive the kana from the kanji. I think it is also the case that "good" sorting of the Latin, Cyrillic, and Arabic scripts is language dependent (and maybe other scripts too).

Regards,
Jim Agenbroad ( [EMAIL PROTECTED] )

The above are purely personal opinions, not necessarily the official views of any government or any agency of any.
Phone: 202 707-9612; Fax: 202 707-0955
US mail: I.T.S. Dev.Gp.4, Library of Congress, 101 Independence Ave. SE, Washington, D.C. 20540-9334 U.S.A.
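A minimal Python sketch of the point that "good" sorting of Latin-script text depends on the language. The two key functions and their tables are hypothetical, hand-written tailorings covering only a few letters; they are an assumption for illustration, not real collation data, and a real implementation would rely on a full tailored collation library.

```python
# Hypothetical tailorings for illustration only; not real collation data.

# German (dictionary-style) treats an umlauted vowel like its base letter.
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})

# Swedish treats å, ä, ö as separate letters placed after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def german_key(word: str) -> tuple:
    """Sort key: umlauts folded to base letters, raw string breaks ties."""
    lowered = word.lower()
    return (lowered.translate(GERMAN_FOLD), lowered)


def swedish_key(word: str) -> list:
    """Sort key: position of each letter in the Swedish alphabet."""
    return [SWEDISH_RANK.get(ch, len(SWEDISH_RANK)) for ch in word.lower()]


words = ["zebra", "äpple", "apple", "öl", "ost"]
german_order = sorted(words, key=german_key)
swedish_order = sorted(words, key=swedish_key)

print(german_order)
print(swedish_order)
```

The same five strings come out in two different orders, which is why a single character-level "sort weight" could at best be a default that language-specific rules override.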
RE: informative due to variation across languages
Peter Constable wrote:

> Can anyone think of other examples of informative properties
> that are so because the property is typical but not true for
> all languages?

Is it stretching things too much to say that glyphs (the representative glyphs as published in TUS) are informative character properties?

If not, the fact that different languages may use different glyphs can be seen as one reason why there cannot be "normative glyphs". Of course it is just one of a zillion reasons, and probably not even the most important one.

However, the fact that glyphs may depend on language is particularly important in some contexts, namely CJK characters. E.g., I think that all the stroke count information in UniHan.txt is informative (is that right?) because the counting depends on the actual glyphs, and the glyphs partly depend on which language is considered.

> Can anyone give me a specific example of why Line Breaking or
> East Asian Width properties aren't normative?

East Asian Width could be seen as another example of a property which is informative because it depends on actual glyphs, which in turn depend on the actual language. E.g., the whole East Asian Width property is meaningful only for systems which implement East Asian typography.

_ Marco
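For readers who want to look at the East Asian Width property directly, here is a small Python sketch using the standard `unicodedata` module. The `display_columns` helper and its two-columns-for-wide convention are assumptions made for illustration (a common terminal-style convention), not something the property itself mandates; the `ambiguous_wide` flag stands in for the "are we in an East Asian typographic context?" decision described above.

```python
import unicodedata

# A few sample characters and their East Asian Width values.
samples = {
    "LATIN CAPITAL LETTER A": "A",
    "CJK ideograph U+4E2D": "\u4e2d",
    "FULLWIDTH LATIN CAPITAL LETTER A": "\uff21",
    "HALFWIDTH KATAKANA LETTER A": "\uff71",
}
widths = {label: unicodedata.east_asian_width(ch) for label, ch in samples.items()}


def display_columns(text: str, ambiguous_wide: bool = False) -> int:
    """Hypothetical helper: count display columns for a string.

    Assumes wide (W) and fullwidth (F) characters take two columns,
    ambiguous (A) characters take two only in an East Asian context,
    and everything else takes one. Combining marks are not handled.
    """
    total = 0
    for ch in text:
        width = unicodedata.east_asian_width(ch)
        if width in ("W", "F") or (width == "A" and ambiguous_wide):
            total += 2
        else:
            total += 1
    return total


for label, value in widths.items():
    print(f"{label}: {value}")
print(display_columns("A\u4e2d"))
```

Outside a fixed-pitch East Asian layout the column counts have little meaning, which matches the observation that the property only matters to systems implementing East Asian typography.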
Re: informative due to variation across languages
>Well, not exactly. "It's normative" *means* that xyz. But "It's normative"
>*because* the Unicode Standard says so, which in turn is because the
>UTC voted that it be so.
>
>*Why* they voted so may be an interesting historical question in
>particular instances, but it may be beyond the necessities of
>didactic explanation. A little bit like asking why cardinals are
>red and bluebirds are blue, when you get down to it. Maybe there
>actually *is* a real reason (or reasons) for that, but it is probably too
>complicated to figure out, and ultimately besides the point for
>people who just need to be able to distinguish cardinals from bluebirds.

Precisely. If I'm teaching someone about Unicode, I need to give them some hook on which to hang the normative vs. informative distinction, and the most accurate answer, viz. "because the UTC decided so", is a bit too abstract. Something like

- "it doesn't work the same way in all cases" or
- "it's just additional documentation that has no implications for how processes should behave" or
- "the issues aren't yet well enough understood to set it all in stone" or
- "there are a bunch of compatibility characters for which the values are unclear or controversial"

works, though. But I guess the first of these explanations can't easily be used anymore, now that case mappings are normative.

- Peter

---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>
Re: informative due to variation across languages
>> But normative explicitly does *not* mean unchangeable.
>
>It quite specifically means that others can use it and reference it. Anyone
>knows you cannot build a house on a shifting foundation, which is why making
>something "normative" should be something reserved for things that one is
>*not* going to change.

Sorry, but check out the text on p. 73, TUS 3.0:

    The term normative when applied to a character property does *not*
    mean that the value of the property will never change. Corrections
    and extensions to the standard in the future may require minor
    changes to normative values, even though the Unicode Technical
    Committee strives to minimize such changes.

It is true that *some* normative properties (and some informative properties, e.g. Unicode 1.0 Name) are unchangeable, but it is not true that *all* are. Case in point: the combining classes underwent a lot of changes from TUS 2.1.9 to 3.0, and consideration is being given to further changes (though decidedly less drastic).

- Peter
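To make concrete why changes to combining classes matter to implementations, here is a short Python sketch using the standard `unicodedata` module. It only shows the values in whatever version of the Unicode Character Database ships with the running interpreter; it says nothing about which values changed between 2.1.9 and 3.0.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW

# Canonical combining classes as recorded in the bundled database.
acute_class = unicodedata.combining(ACUTE)
dot_below_class = unicodedata.combining(DOT_BELOW)

# Canonical ordering sorts adjacent marks by combining class, so the
# mark with the lower class moves in front during normalization.
typed = "a" + ACUTE + DOT_BELOW
normalized = unicodedata.normalize("NFD", typed)

print("database version:", unicodedata.unidata_version)
print("acute:", acute_class, "dot below:", dot_below_class)
print([f"U+{ord(ch):04X}" for ch in normalized])
```

Because normalized forms depend on these numbers, a change to a combining class can change what a conformant normalizer must output for the same input, which is the practical cost of revising a normative value.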
Re: informative due to variation across languages
Peter continued:

> Indeed: e.g. that is true for the Unicode 1.0 Name property. My question,
> though, is whether there are some properties that are informative because
> they may be typical for most languages but not true for all. It was always
> my impression that that was the reason for case mappings having been
> informative. Was I wrong in that assumption?

No, you are probably right. Everybody knew from the start (of Unicode) that case mappings were going to have exceptions. In the absence of a Character Property Model to guide thinking about how to pigeonhole things, that turned into an implicit assumption that case-mapping as a whole should be informative, since we knew that there were going to be locale-based exceptions for a few well-defined cases.

Recently, it became clearer that the case properties themselves and the default case mappings had all kinds of firm implications for many processes that people were implementing, and that it was inadvisable to keep on saying that case mapping in toto was informative. This led to the switchover to say that all of this was normative, together with the formal SpecialCasing.txt way of enumerating the exceptions.

Offhand, I can't think of any other instances of properties that were explicitly labelled "informative" because of language-specific behavior. These are *character* properties after all, and the characters themselves are not language-specific. I suppose it would be possible to manufacture another instance comparable to case mapping, but it would probably be rather odd and presumably wouldn't apply to any existing property lists in the Unicode Character Database.

> The real issue is that I'm trying to find ways to explain to someone why
> there are distinctions between normative and informative behaviours and
> properties.

That's easy. Just as for any Dad responding to the kid's "Why is, Daddy?" questions, you ultimately end up giving the ultimate answer: "Because that's just the way it is."

;-)

> Which isn't really helpful for my purposes here, which are didactic: "It's
> normative because conformant implementations have to follow it, and they
> have to follow it because it's normative."

Well, not exactly. "It's normative" *means* that xyz. But "It's normative" *because* the Unicode Standard says so, which in turn is because the UTC voted that it be so.

*Why* they voted so may be an interesting historical question in particular instances, but it may be beyond the necessities of didactic explanation. A little bit like asking why cardinals are red and bluebirds are blue, when you get down to it. Maybe there actually *is* a real reason (or reasons) for that, but it is probably too complicated to figure out, and ultimately besides the point for people who just need to be able to distinguish cardinals from bluebirds.

--Ken

> >Because no one is yet convinced that the specifics of either are
> >so widely agreed upon that the UTC would want to make
> >some strong claim about conformance to the particular properties
> >and their values for implementations of the behavior.
>
> Now that works.
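The split between default case mappings and locale-based exceptions can be sketched in a few lines of Python. `upper_default` simply wraps the built-in `str.upper`, which applies the language-independent mappings; `upper_turkish` is a hypothetical helper written for this illustration and covers only the dotted/dotless i exception, not a full Turkish tailoring.

```python
DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE


def upper_default(text: str) -> str:
    """Default, language-independent uppercasing."""
    return text.upper()


def upper_turkish(text: str) -> str:
    """Hypothetical tailoring: in Turkish, 'i' uppercases to dotted capital I.

    Only this one exception is handled here; the default mapping already
    sends dotless i (U+0131) to plain 'I'.
    """
    return text.replace("i", DOTTED_CAPITAL_I).upper()


print(upper_default("istanbul"))
print(upper_turkish("istanbul"))
print(upper_default("stra\u00dfe"))
```

The default result is well defined for every string, and the locale-specific behaviour is expressed as a small, enumerable set of overrides on top of it, which is roughly the arrangement that SpecialCasing.txt formalizes.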
Re: informative due to variation across languages
From: <[EMAIL PROTECTED]>

> On 06/15/2001 06:29:51 PM "Michael (michka) Kaplan" wrote:
> >Why be more specific than that there are a lot of people who think they
> >might possibly have made TOO MUCH normative and do not want to make things
> >unchangeable that might be in error or might need to change later?
>
> But normative explicitly does *not* mean unchangeable.

It quite specifically means that others can use it and reference it. Anyone knows you cannot build a house on a shifting foundation, which is why making something "normative" should be something reserved for things that one is *not* going to change.

michka
Re: informative due to variation across languages
On 06/15/2001 06:28:34 PM Kenneth Whistler wrote:

>Peter asked:
>
>> It used to be that one could describe informative properties saying, "some
>> properties are valid for most languages but not all and so are informative,
>> such as case mappings".
>
>This never really was the case, since from the moment that the UTC started
>posting informative properties, there were some that had nothing to do
>with language differences.

Indeed: e.g. that is true for the Unicode 1.0 Name property. My question, though, is whether there are some properties that are informative because they may be typical for most languages but not true for all. It was always my impression that that was the reason for case mappings having been informative. Was I wrong in that assumption?

The real issue is that I'm trying to find ways to explain to someone why there are distinctions between normative and informative behaviours and properties. The Unicode 1.0 Name typifies one reason for having an informative property (which I take to be that it is historical documentation that is relevant for implementations based on TUS 1.0 but that otherwise has no bearing on implementations). I'm trying to motivate the reason for other informative properties.

>Chapter 4 *does* define normative and informative properties, but
>does so in terms of what a claim of conformance to the property
>means.
>
>I think this is basically correct: normativity has to do with what
>a claim of conformance means, rather than what kind of real-world
>property we are dealing with. This is part of the reason why
>a formerly informative property can change its status to become
>normative.

Which isn't really helpful for my purposes here, which are didactic: "It's normative because conformant implementations have to follow it, and they have to follow it because it's normative."

>Because no one is yet convinced that the specifics of either are
>so widely agreed upon that the UTC would want to make
>some strong claim about conformance to the particular properties
>and their values for implementations of the behavior.

Now that works.

- Peter
Re: informative due to variation across languages
On 06/15/2001 06:29:51 PM "Michael (michka) Kaplan" wrote:

>> Can anyone give me a specific example of why Line Breaking or East Asian
>> Width properties aren't normative?
>
>Why be more specific than that there are a lot of people who think they
>might possibly have made TOO MUCH normative and do not want to make things
>unchangeable that might be in error or might need to change later?

But normative explicitly does *not* mean unchangeable.

- Peter
Re: informative due to variation across languages
Peter asked:

> It used to be that one could describe informative properties saying, "some
> properties are valid for most languages but not all and so are informative,
> such as case mappings".

This never really was the case, since from the moment that the UTC started posting informative properties, there were some that had nothing to do with language differences.

> Case mappings gave an easy example for why to have
> informative properties. Now that the mappings are informative (with
> normative exceptions listed in SpecialCasing.txt), vice-versa, actually
> it's harder to give an
> easy explanation for why some properties are informative.

This comes down to the lack of what I call a "Character Properties Model" for Unicode. Asmus Freytag has been working on one side of this problem in an as yet not public draft for UTR #23 "Survey of Unicode Character Properties and Guidelines" that the UTC has been kicking around.

Chapter 4 *does* define normative and informative properties, but does so in terms of what a claim of conformance to the property means.

I think this is basically correct: normativity has to do with what a claim of conformance means, rather than what kind of real-world property we are dealing with. This is part of the reason why a formerly informative property can change its status to become normative.

> Can anyone think of other examples of informative properties that are so
> because the property is typical but not true for all languages?
>
> Can anyone give me a specific example of why Line Breaking or East Asian
> Width properties aren't normative?

Because no one is yet convinced that the specifics of either are so widely agreed upon that the UTC would want to make some strong claim about conformance to the particular properties and their values for implementations of the behavior.

To put it another way, if someone claims that they are doing "Unicode line breaking", are we yet ready to examine their line breaks and declare them non-conformant if they make some different choices than the informative values specified in LineBreak.txt?

On the other hand, if an API purports to be returning the "Unicode General Category" of a character, and it returns "Ps" instead of "Lo" for an ideograph at some version of Unicode, I think we could now agree that that was a non-conformant API, even though formerly both "Ps" and "Lo" were considered "informative" values of the General Category.

--Ken
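The General Category example lends itself to a tiny conformance check. The following Python sketch uses the standard `unicodedata.category` function as the reference; `check_reported_category` is a hypothetical helper named for this illustration, and the reference values are those of whatever Unicode version the running interpreter bundles.

```python
import unicodedata


def check_reported_category(ch: str, reported: str) -> bool:
    """Hypothetical check: does a reported General Category match the
    value in the bundled Unicode Character Database?"""
    return reported == unicodedata.category(ch)


ideograph = "\u4e2d"   # a CJK unified ideograph
open_paren = "("       # LEFT PARENTHESIS

print(unicodedata.category(ideograph))
print(unicodedata.category(open_paren))
print(check_reported_category(ideograph, "Ps"))
```

A check like this is only meaningful for a property whose values an implementation is required to match; for line breaking, differing from LineBreak.txt would not by itself show non-conformance under the reasoning above.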
Re: informative due to variation across languages
From: <[EMAIL PROTECTED]>

> Can anyone give me a specific example of why Line Breaking or East Asian
> Width properties aren't normative?

Why be more specific than that there are a lot of people who think they might possibly have made TOO MUCH normative and do not want to make things unchangeable that might be in error or might need to change later?

Seems like a nice, conservative course to me to not lock down stuff that you might want to change.

michka