RE: What things are called (was Non-ascii string processing)
I have invented a new system, Unilib, for organising books in a library. ... Except that you're not allowed to call them books any more, because I've already redefined the word book to mean the physical expression of a catalogue entry. Since what the user normally experiences as a book may actually require several catalogue entries, we can no longer use the word book for this object. Consequently, we need a new word or phrase to describe what the user normally experiences as a book. We tried calling them volumes back in Unilib 3.0, but it turned out that that word was also used for something else. So now we call them default chapter clusters. Hey - the public will just have to get used to it! :-) Jill
Re: What things are called (was Non-ascii string processing)
(2) The object currently called a character be renamed as something like mapped codepoint or encoded codepoint, or possibly (coming in from the other end) something like sub-character or character component or characterette (which can be shortened to charette and pronounced carrot. :-) )

Charette would just get confused with caret :)
RE: What things are called (was Non-ascii string processing)
Jill Ramonsky wrote: Hey - the public will just have to get used to it! No, the public should not be bored with these technical details: in the user manual, a book will still be a book. The fact that, in the source code of the application, book means something else is of interest only to programmers. _ Marco
RE: What things are called (was Non-ascii string processing)
Er, dude. It's called a sense of humor. Hence the smiley (which you snipped). Jill
Re: What things are called (was Non-ascii string processing)
Jill Ramonsky <Jill dot Ramonsky at Aculab dot com> wrote...

Well, one thing she wrote was:

:-)

OK, that's out of the way. What follows is not necessarily 100% serious.

I have invented a new system, Unilib, for organising books in a library. ... Except that you're not allowed to call them books any more, because I've already redefined the word book to mean the physical expression of a catalogue entry. Since what the user normally experiences as a book may actually require several catalogue entries, we can no longer use the word book for this object. Consequently, we need a new word or phrase to describe what the user normally experiences as a book. We tried calling them volumes back in Unilib 3.0, but it turned out that that word was also used for something else. So now we call them default chapter clusters.

Actually, this is a great analogy to what is going on with Unicode terminology, but probably not for the reason Jill had in mind.

There are plenty of examples of books as the user sees them that contain one or more books as the author sees them. The Old and New Testaments, and similar scriptural and philosophical material in many belief systems, consist of many books that are bound together within a hard cover. The Book of Genesis would be an awfully thin book if it appeared on the shelf individually. Likewise, many great (and not-so-great) literary works have been divided into Book I and Book II by their authors.

This overloading of the word book can indeed lead to confusion and misunderstanding, as when a high-school student with an assignment to read and compare two books chooses Book I and Book II of the same jointly bound work. When the Springfield Public Library takes an inventory, they will probably continue to count each copy of the Bible as one book, not as dozens.

My point is that Jill's Unilib didn't invent this confusion and ambiguity.
Likewise, any character encoding standard that incorporates the concept of combining characters is bound to experience the same sort of confusion and ambiguity over the term character. This is not unique to Unicode; ISO 6937 has this problem as well with its (leading) non-spacing marks. In ISO 6937, 0x61 is a, while 0xC2 0x61 is á. Are both the one-byte and two-byte sequences characters? Does that mean 0x61 is both a character in its own right and *part* of another character? Do we need a separate word for whatever 0x61 represents?

Unicode greatly expanded the potential for this sort of complication, by encoding all the lexical symbols (or whatever) of almost all modern scripts and many archaic ones, and introducing many more types of combining marks and interactions between them than any previous character encoding.

Unicode has also tried to reduce the confusion, by introducing new terms. Sometimes the terms add confusion here as they take it away there, but our only real alternative is to go back to the days when we couldn't really talk about these things because they had no name. :-)

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
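
For readers who want to see the ambiguity concretely, here is a minimal Python sketch of the 0x61 versus 0xC2 0x61 case. It is only an illustration, not anything from the thread: the function name decode_iso6937_sketch and its one-entry table are hypothetical, it handles only the acute accent (0xC2) in front of ASCII base letters, and it assumes the usual mapping of that mark to Unicode's COMBINING ACUTE ACCENT (U+0301), which follows the base letter rather than leading it.

```python
import unicodedata

# Hypothetical one-entry table: ISO 6937 non-spacing acute (0xC2)
# mapped to Unicode COMBINING ACUTE ACCENT. A real table has many more marks.
ISO6937_NONSPACING = {0xC2: "\u0301"}


def decode_iso6937_sketch(data: bytes) -> str:
    """Decode ASCII plus the leading acute mark into Unicode code points."""
    out = []
    pending = None  # mark waiting for the base letter that follows it
    for byte in data:
        if byte in ISO6937_NONSPACING:
            pending = ISO6937_NONSPACING[byte]
        elif byte < 0x80:
            out.append(chr(byte))
            if pending is not None:
                out.append(pending)  # Unicode puts the mark after the base
                pending = None
        else:
            raise ValueError(f"byte 0x{byte:02X} is outside this sketch")
    return "".join(out)


plain = decode_iso6937_sketch(b"\x61")         # one code point: a
accented = decode_iso6937_sketch(b"\xC2\x61")  # two code points: a + U+0301
composed = unicodedata.normalize("NFC", accented)  # one code point: U+00E1

for label, text in [("plain", plain), ("accented", accented), ("composed", composed)]:
    print(label, len(text), [unicodedata.name(ch) for ch in text])
```

Under those assumptions the same letter a turns up three ways: alone, as the first of two code points, and absorbed into a single precomposed code point after NFC normalization, which is exactly the "is 0x61 a character or part of one?" question.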
Re: What things are called (was Non-ascii string processing)
On 07/10/2003 08:42, Doug Ewell wrote:

... The Book of Genesis would be an awfully thin book if it appeared on the shelf individually. ...

Not that thin, actually - 85 pages in my Hebrew Bible. But some of the books, e.g. Obadiah and 2 and 3 John, fit easily on one page. So your point stands.

Likewise, many great (and not-so-great) literary works have been divided into Book I and Book II by their authors.

This was, I think, based on the custom in classical times when a book had a fixed maximum size rather smaller than it is today, based on the size of a scroll or whatever, and so authors were forced to divide their works into separate books. Of course many authors still do it even though we now have printed books large enough. Well, actually books have been large enough at least since the 4th century CE, when the first one-volume copies of the full Greek Bible were produced. Three of these 4th-5th century copies survive, two of them in the British Library.

This overloading of the word book can indeed lead to confusion and misunderstanding, as when a high-school student with an assignment to read and compare two books chooses Book I and Book II of the same jointly bound work. When the Springfield Public Library takes an inventory, they will probably continue to count each copy of the Bible as one book, not as dozens.

Then there is also the confusion of whether a multi-volume work counts as one book or several. How many entries in the inventory for a ten-volume encyclopedia? Ten or one? What if one volume is missing? What of a supposedly multi-volume work whose volumes are published at wide intervals? Some Bible commentary series are presented as multi-volume works but volumes have been published in an arbitrary order, by various authors, and sometimes replaced one at a time, in extreme cases for as long as a century (the International Critical Commentary series). So the concept of book becomes even more slippery than the concept of character.
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/