Re: Dates in Japanese Era Names in Unicode Standard
Philippe,
>>Is it possible that these eras start at midday instead of noon ?
I assume you mean midnight.
RM
www.raymondm.co.uk
Re: Turned Capital letter L (pointing to the left, with serifs)
I have looked at both the collected works of Gauss and the English version of the Theoria Motus, in order to see what a later editor made of this symbol. In the Werke the symbol ’7’ continues to be used: C F Gauss, Werke, Vol. 7, ed. E J Schering, Gotha, 1871; § 77, M = N + n’7’ − Π. In the translation the ‘7’ is replaced by the lower case tau: Theory of the motion of the heavenly bodies moving about the sun in conic sections: a translation of Gauss's "Theoria motus." With an appendix. By Charles Henry Davis, Boston : Little, Brown and company, 1857; § 77, M = N + nτ − Π. So this seems to settle the matter of the identity, and just leaves one to puzzle over the German use of this sign for tau. Raymond
Re: Turned Capital letter L (pointing to the left, with serifs)
On further reflection I can well agree that it is tau. The attached images from R. Barbour, Greek Literary Hands, show clearly (scan 3) the large upper case tau in several lines, and in scan 4, in the first and other lines, a hooked version of tau. So I withdraw my suggestion of pi.
Raymond

From: Asmus Freytag (t)
Sent: Monday, January 04, 2016 7:58 PM
To: unicode@unicode.org
Subject: Re: Turned Capital letter L (pointing to the left, with serifs)

On 1/4/2016 10:41 AM, Michael Everson wrote:
> Certainly it does look more like a very common variant of “tau” than “pi”

Variant of uppercase tau?
A./
Re: Turned Capital letter L (pointing to the left, with serifs)
The sign described as like 7 is surely a cursive form of π. The form used by Gauss (Disquisitio de elementis ellipticis Palladis) is much the same as that shown in manuals of Greek palaeography as a cursive π. This is given by E. M. Thompson in two works, An Introduction to Greek and Latin Palaeography, Oxford, 1912, p. 83, and A Handbook of Greek and Latin Palaeography, Chicago, 1975, p. 95. Raymond Mercier
Re: Unicode 7.0 Paperback Available
Well, why not print a good clean copy with Acrobat and a high quality printer, and do the rest of the volume printing as camera-ready ? I have had complex texts published that way.
R.
___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
Re: Unicode 7.0 Paperback Available
Asmus, Thanks. Indeed I am surprised that a publisher cannot get results as clean and reliable as I do when printing from Acrobat.
R

From: Asmus Freytag (t)
Sent: Saturday, January 17, 2015 5:52 PM
To: Raymond Mercier ; unicode@unicode.org
Subject: Re: Unicode 7.0 Paperback Available

Raymond, even though the source is PDF, the nature of the fonts used for the charts makes this extremely challenging for the printers. Experiments run by some volunteers have determined that you can expect very inconsistent results, because the way these printing services and their contractors handle PDF is just not the same as when you use Acrobat or some browser plug-in to view them on screen. You may find this a surprising state of affairs, but those are the facts on the ground. It was found that even the same service may get you different results for each order. And by different, I mean, with different discrepancies from the desired output. These services apparently subcontract with a number of printing presses, all of which may have different software.
A./
Re: Unicode 7.0 Paperback Available
Since the new printed volume is so expensive when shipping is included, why not try one of the commercial binding services, such as https://www.doxdirect.com/products/specialist-document-printing/pdf-printing/ ? The PDF files that make up Unicode 7.0 can all be downloaded from http://www.unicode.org/versions/Unicode7.0.0/. It would have been easier of course if the individual PDFs had been gathered together into larger groups, although one can do that easily within Acrobat. Best of all would be a volume (or two ?) like that for Unicode 5 produced by Addison Wesley. When I once asked about that for Unicode 6 I was told that it was just too difficult to get the pages formatted suitably for book production. But if the charts can be presented as PDF, why is it difficult to print and bind them ? Regards Raymond Mercier
Re: Word reversal from Adobe to Word
Thanks to all for your comments on the problem of copying a Hebrew phrase from Adobe to Word. The Hebrew phrase is מהלך חמה הבינוני בשנים מחוברות ופרוטות וחדשים
I have had another look at the problem, with these results.
Copy from Adobe Acrobat 6 (with Select text):
- Paste into Word as rtf: words in the correct order, but reversal within each word
- Paste into Word using Paste Special and unformatted Unicode: word order reversed, as well as reversal within each word
Copy from Adobe Reader X:
- Paste into Word as rtf: word order reversed, but characters not reversed within each word
- Paste into Word using Paste Special and unformatted Unicode: everything correct.
Clearly my old Adobe Acrobat 6 is not up to the job, while the latest Reader is, provided Paste Special is used. There are options within Word Options/Advanced/Cut Copy Paste, but a choice of non-default options has no bearing on the issue. Regards Raymond Mercier
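The four outcomes listed above can be modelled on a string held in logical order. The following is only a sketch of the symptoms: the function names are made up for illustration, and nothing here is claimed about what Acrobat, Reader or Word actually do internally.

```python
# Sketch: reproduce the observed reversal patterns on a logical-order string.
# PHRASE is the Hebrew phrase quoted in the message; the helper names are
# hypothetical and exist only for this illustration.
PHRASE = "מהלך חמה הבינוני בשנים מחוברות ופרוטות וחדשים"


def reverse_within_words(text: str) -> str:
    """Keep the word order, reverse the characters inside each word."""
    return " ".join(word[::-1] for word in text.split(" "))


def reverse_word_order(text: str) -> str:
    """Keep each word intact, reverse the order of the words."""
    return " ".join(reversed(text.split(" ")))


outcomes = {
    "Acrobat 6 -> rtf": reverse_within_words(PHRASE),
    "Acrobat 6 -> unformatted Unicode": PHRASE[::-1],
    "Reader X -> rtf": reverse_word_order(PHRASE),
    "Reader X -> unformatted Unicode": PHRASE,
}

for label, pasted in outcomes.items():
    print(f"{label}: {pasted}")
```

Each helper is its own inverse, so a damaged paste of one of these kinds could in principle be repaired by applying the same function again.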
Re: Word reversal from Adobe to Word
From: "Jukka K. Korpela"
>>Do you mean the commercial Adobe Acrobat software for creating PDF documents, or the free Adobe Reader (previously called Adobe Acrobat Reader) for viewing and printing (and commenting on) PDF documents? <<
I am using the full commercial Adobe Acrobat version 6, running on XP. In constructing the example I realize that I had the wrong final 'm', but that does not affect the point. If there is more than one word, the order of words IS correct, but the order of characters in each word is reversed. Regards Raymond
Word reversal from Adobe to Word
This problem is not precisely about Unicode - or is it? If I have a Hebrew text displayed in Adobe Acrobat I can select part of it and can paste it into Word. The trouble is that while individual characters are correctly displayed the order is reversed. Thus if I have in Acrobat קודמ (meaning 'prior') when pasted into Word I get םדוק Every effort to put this right has failed, and yet it must have been met by many others. It's not really about Word as such, since pasting into Notepad has the same result. What does one do ? Regards Raymond Mercier
Re: Greek Astrology
Byzantine authors had a great penchant for ligatures. Although I do not have expertise in Greek astrology, I do have some competence in other aspects of Byzantine literature (including some familiarity with manuscripts and inscriptions). Based on that experience, I feel I can safely say that any attempt to encode the four ligatures on the grounds discussed here would be an invitation to encode a host of other Byzantine Greek ligatures (for example, the standard cruciform invocative monograms: V. Laurent, La collection Orghidian [Paris, 1952], pl. lxx). A formal proposal for these four ligatures would be premature. One should first understand the entire culture of Byzantine ligation, then determine what parts of that culture should be encoded, and which not.
Sincerely,
jk

I have done enough text editing in Greek to know well that there is a large and bewildering range of ligatures and abbreviations, and I have absolutely no thought of suggesting they should all be encoded. However there are peculiarities of notation associated with individual disciplines, such as mathematics and music. The sign for the ascendant, at least, is part of the notation in astrology and astronomy, along with, for example, the sign used for zero in sexagesimal notation. This approach can be compared with the Unicode block of Byzantine musical notation.
Raymond Mercier

--
Joel Kalvesmaki
Editor in Byzantine Studies
Dumbarton Oaks
1703 32nd St. NW
Washington, DC 20007
(202) 339-6435

From: "Szelp, A. Sz." <a.sz.sz...@gmail.com>
Date: Thursday, November 1, 2012 3:56 AM
To: CE Whitehead <cewcat...@hotmail.com>
Cc: "unicode@unicode.org" <unicode@unicode.org>
Subject: Re: Greek Astrology

Is there evidence that these have been used consistently, on most charts of the time? These could be ad-hoc notations (as given the contemporary praxis, ligation per se does not make a "symbol").
--
Szelp, André Szabolcs
+43 (650) 79 22 400

On Thu, Nov 1, 2012 at 2:38 AM, CE Whitehead <cewcat...@hotmail.com> wrote:
Hi.
From: Raymond Mercier <rm459_at_cam.ac.uk>
Date: Mon, 29 Oct 2012 08:52:43 -
I think I had somehow assumed that the symbols used in Greek Horoscopes had already been encoded, but it seems not. The four signs used to mark the principal corners (ascendant, etc) of the horoscope diagram are shown in the attachment, taken from http://www.skyscript.co.uk/greek_horoscope.html These four signs should be encoded along with the zodiacal signs U+2648 to U+2653. Perhaps they are already in the pipeline ?

Perhaps these should be in the pipeline, as the online templates I could find for astrological charts do not have them; they have to be added in (although it would be possible to have these built into the chart template also, as the houses are always in the same place and the ascendant is always located between the 12th and the 1st, etc.); see: http://www.skyscript.co.uk/charttemp.html Similarly Paul Wade's copiable template is void of the symbols http://books.google.com/books?id=WY8hjKtSaP0C&pg=PA40&lpg=PA40&dq=natal+charts+astrological+charts+templates&source=bl&ots=By-xF3UGWB&sig=KvomOKgo999CwuJPKaq1LmeoqHc&hl=fr&sa=X&ei=oMCRUK-wF5Sc8gTWi4DYAg&ved=0CDQQ6AEwAzgK#v=onepage&q=natal%20charts%20astrological%20charts%20templates&f=false (I'll try to check an offline guide, too, but the few actual online templates, not sample charts, seem void of the symbols for the ascendant, midheaven, etc., so they seem to be separate from the actual chart of the houses, so go for it. Happy Halloween in any case.)
Best,
--C. E. Whitehead
cewcat...@hotmail.com

Best wishes
Raymond Mercier
Greek astrology
The first sign (ascendant or horoscope) is common enough I believe, and I attach a small portion of a most valuable (10th century ?) Syriac manuscript (photographed in UV light), a palimpsest, where the inferior text is Greek. There is clearly a list of 'values' for this ascendant (16,15,14...), but I cannot make out the rest of the lines. When such material is transcribed, people (as Neugebauer, Greek Horoscopes) just use H, or some other conventional sign to indicate the horoscope, but it would be nice if the ancient sign were encoded. For the transcription I made a horoscope sign using Paint, and used an ad hoc uncial script (non Unicode) for the Greek, with sigma instead of stigma. All this shows the problems of making a faithful transcription of ancient texts.
Raymond Mercier

- Original Message -
From: Szelp, A. Sz.
To: CE Whitehead
Cc: unicode@unicode.org
Sent: Thursday, November 01, 2012 7:56 AM
Subject: Re: Greek Astrology

Is there evidence that these have been used consistently, on most charts of the time? These could be ad-hoc notations (as given the contemporary praxis, ligation per se does not make a "symbol").
--
Szelp, André Szabolcs
+43 (650) 79 22 400
Greek astrology
I think I had somehow assumed that the symbols used in Greek Horoscopes had already been encoded, but it seems not. The four signs used to mark the principal corners (ascendant, etc) of the horoscope diagram are shown in the attachment, taken from http://www.skyscript.co.uk/greek_horoscope.html These four signs should be encoded along with the zodiacal signs U+2648 to U+2653. Perhaps they are already in the pipeline ? Best wishes Raymond Mercier
Re: Unicode Core
Michael Everson: Perhaps less than us character mavens would imagine. Books don't publish themselves, and publishing takes resources of various kinds. Julian Bradfield: Not much, if they use the Lulu route, as they already have an account set up. An hour of somebody's time should do it. And at a Lulu price, there'll be a lot more of a market than at an Addison-Wesley price! I haven't worked out the number of pages needed for all the charts, but even if it needed two volumes, what is the problem with that ? It is not just for private libraries like mine; this is something, complete with the charts, that should be in the reference section of every university library, and every computer library. Or do we tell the library user that they can always download the charts ? Raymond Mercier
Unicode Core
Today I received from Lulu the Unicode Standard 6.1 -Core specification http://www.lulu.com/shop/unicode-consortium/the-unicode-standard-version-61-core-specification/paperback/product-20082926.html . While I am very glad to have this, I really do wonder why there was not a full publication of Unicode 6 or 6.1 from the corporation itself, with all the charts, as we have had with Unicode 1 to 5. Surely there is a market for this ? Raymond Mercier
Re: Notice of brief Unicode.org system outage on Friday
From: "Cristian Secară" Just wondering why the time zone reference is not given in a universal format, like UTC±n, so one in other part of the world can calculate. Excellent point ! Raymond Mercier
Re: Hittite cuneiform
From: "Michael Everson" > Hittite cuneiform is a subset of http://www.unicode.org/charts/PDF/U12000.pdf I agree that the Hittite signs are a relatively small subset of those used in Sumerian, Akkadian, etc., but the values so often differ that it seems to me that a separate listing of the Hittite usage is appropriate. Compare the CJK signs: the Japanese -kun and -on readings are included in unihan along with the Chinese readings. Raymond Mercier
Hittite cuneiform
Why has Hittite cuneiform not yet been included ? As one sees from the table in http://www.ancientscripts.com/hittite.html, it should be easy enough, just as Old Persian and Ugaritic are included under the general heading Cuneiform. Best wishes Raymond Mercier
Re: Missing old Greek ligature/letter "omicron+upsilon above"
Philippe Verdy : >>Clearly there does seem to be missing a Greek letter, I hope there is no suggestion of encoding the huge variety of Greek abbreviations and ligatures. The early printed Greek texts used type designed to follow the manuscripts, but thank goodness that was dropped before long. There is a bewildering number of these signs. See, for example http://commons.wikimedia.org/wiki/File:Greek_alphabet_ligatures.jpg and that's just the start of it. Raymond Mercier
Re: TeX: insert Unicode character
From: "Julian Bradfield" In principle, find a suitable cuneiform package, and use it. I don't know which package has what characters in it, unfortunately, and at a first glance, I can't see one that has that character. We have a font, since I adapted an older experimental cuneiform TrueType font, essentially by changing the encoding to conform to Unicode cuneiform. I have not made this publicly available, but have just used it for an article written in Word. My colleague wants to convert that to LaTeX, but he ran into some problems. The publisher does not insist on LaTeX, but it has a loyal fan club, especially among mathematicians. Raymond
Re: TeX: insert Unicode character
Thanks for both those suggestions, which I will pass on. Raymond
TeX: insert Unicode character
I am trying to help a colleague who is writing an article in LaTeX, and who needs to insert an isolated character, U+1212D, from the Unicode Cuneiform block. I am not very familiar with LaTeX myself, so what do I suggest to him ? Raymond Mercier
Re: Reasonable to propose stability policy on numeric type = decimal
"John Dlugosz" writes
>>I can imagine supporting national representations for numbers for outputting reports, but I don't imagine anyone writing in a programming language would be compelled to type 四佰六十 instead of 560.
Especially since 四佰六十 is 460.
Raymond Mercier
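The arithmetic behind that correction can be checked mechanically. The following is a deliberately small sketch, not a general parser: the tables and the function name are made up for the illustration, and it handles only digits with the multipliers ten, hundred and thousand (ordinary and financial forms), with no support for 万 or zero.

```python
# Sketch: evaluate a simple Chinese numeral as digit-times-unit terms.
# DIGITS, UNITS and parse_cjk_number are illustrative names, not a library API.
DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9}
UNITS = {"十": 10, "拾": 10, "百": 100, "佰": 100, "千": 1000, "仟": 1000}


def parse_cjk_number(text: str) -> int:
    total = 0
    pending = 0  # digit waiting for its unit
    for ch in text:
        if ch in DIGITS:
            pending = DIGITS[ch]
        elif ch in UNITS:
            # A bare unit such as 十 counts as one times the unit.
            total += (pending or 1) * UNITS[ch]
            pending = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + pending


print(parse_cjk_number("四佰六十"))  # 4 * 100 + 6 * 10
print(parse_cjk_number("五佰六十"))  # 5 * 100 + 6 * 10
```

Under these assumptions the quoted string evaluates to 460, and 560 would need 五 in place of 四.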
Re: Pronunciation of the word emoji
Please, Mr Overington, enough ! enough ! Raymond Mercier
Re: Value of U+1E20
> Raymond mentions Arabic ghayn, but I would expect this to be > transliterated more commonly with U+011F or U+0121. I can assure you that 1E20 and the l.c. companion 1E21 are very clearly used by Wehr in his Arabic Dictionary. As to U+011F or U+0121, I see that Socin, in his old Arabic Grammar (1895), uses U+011F for jim U+062C, and the U+0121 for ghain U+063A. Wright, Arabic Grammar, as old as Socin, also uses U+0121 for ghain U+063A. It may be that these usages of a century ago survive in some quarters even today. Raymond
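For readers who want to check which characters the code points in this thread refer to, a short sketch using Python's standard `unicodedata` module prints their formal names. The helper name `describe` is made up for the illustration.

```python
import unicodedata

# Code points mentioned in the thread: the two G-with-macron letters,
# the two alternatives suggested, and the Arabic letters they transliterate.
CODE_POINTS = (0x1E20, 0x1E21, 0x011F, 0x0121, 0x063A, 0x062C)


def describe(cp: int) -> str:
    """Return 'U+XXXX <char> <formal Unicode name>' for one code point."""
    ch = chr(cp)
    return f"U+{cp:04X} {ch} {unicodedata.name(ch)}"


for cp in CODE_POINTS:
    print(describe(cp))

# U+1E20 is canonically equivalent to G followed by a combining macron.
decomposed = unicodedata.normalize("NFD", "\u1E20")
print([f"U+{ord(c):04X}" for c in decomposed])
```

The names confirm the distinctions drawn above: macron for U+1E20 and U+1E21, breve for U+011F, dot above for U+0121.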
Re: Value of U+1E20
> Would any one know what is the value of U+1E20 ? > Is this (also) used in Semitic transliterations ? For which value ? > Could it be a fricative G ? It is used sometimes in transcribing Arabic, where it represents ghain U+063A غ. You will see it for example in Wehr's Arabic Dictionary, even in the English version of Cowan. In most English transcriptions of Arabic gh would be used. Raymond Mercier
Re: sign for anti-neutrino - greek nu with diacritical line above workaround ?
Herbert, >have you been to: >http://titus.fkidg1.uni-frankfurt.de/database/unicode/unicself.htm >there you can combine NU and MACRON - and they are using IE on newest >WINDOWS… Well that's very pretty - for me however it works only in Mozilla, not in IE. As far as I know IE6 is the latest. Raymond
Re: sign for anti-neutrino - greek nu with diacritical line above workaround ?
Herbert, Sorry, no change in IE6: still nu+ empty squares. However it works in Mozilla, and so did the previous one. Raymond
Re: sign for anti-neutrino - greek nu with diacritical line above workaround ?
Mark: > It is probably a really bad idea to have the base letter in one span and the > combining mark in another. That is very likely to throw a monkey wrench into > whatever you are trying, on most text layout systems. If I start in Word with a clear nu+macron, and save as html, I get the division into two spans. What I posted is derived from that by dropping all the style padding. Raymond
Re: sign for anti-neutrino - greek nu with diacritical line above workaround ?
Herbert, Well, when I open yours in IE 6 I just get the character nu, followed by a blank square. To add to this comedy, however, when I look at your source in Notepad I see there indeed a correct nu+macron ! There is some odd instability going on here. BTW in my previous message I intended ν̄ to be all in one line - but it does not come out that way in the mail display, at least not in Outlook Express. Raymond
Re: sign for anti-neutrino - greek nu with diacritical line above workaround ?
Peter Jacobi: >Testing with Mozilla 1.7, ν̄ displays a fine Anti-Neutrino sign. So you say, but the following works in IE6, and Opera, but not in Mozilla 1.7. What is the problem ? "nu.htm" ν̄ In Mozilla only the nu shows. And if I change ν̄ to ν ̄ then in IE6 the macron is shifted to the right. What is going on ? Raymond Mercier
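The sequence under discussion is a base letter followed by a combining mark, and Unicode has no precomposed nu with macron, so normalization cannot turn it into a single character. A short Python sketch makes that concrete; the variable names are illustrative, and the decimal character references shown are simply one way the pair could be written in HTML, not necessarily what the test pages contained.

```python
import unicodedata

NU = "\u03BD"       # GREEK SMALL LETTER NU
MACRON = "\u0304"   # COMBINING MACRON
anti_neutrino = NU + MACRON

# NFC composes where a precomposed character exists; here none does,
# so the result is still two code points.
composed = unicodedata.normalize("NFC", anti_neutrino)
print(len(composed), [unicodedata.name(c) for c in composed])

# The mark's combining class shows it attaches above the preceding base,
# which is why a space or a markup boundary between the two can detach it.
print(unicodedata.combining(MACRON))

# The same pair written as decimal numeric character references.
ncr = "".join(f"&#{ord(c)};" for c in anti_neutrino)
print(ncr)
```

Since the rendering depends entirely on the font and layout engine placing the mark over the base, differences between browsers of the kind reported here are plausible.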
Re: Morohashi in unihan
From: "Allen Haaheim" <[EMAIL PROTECTED]> > I spot-checked a few random characters from Blocks A and B, and some of them > were in Morohashi. So that means that the Morohashi numbers have just not been included in A or B. Raymond Mercier
Morohashi in unihan
A lot of characters in unihan.txt have Morohashi values 0 or 9. I take it there is then no Morohashi equivalent, but what is the distinction between these two, and what is the point of putting anything if there is no Morohashi equivalent ? Also the Morohashi equivalent is not given for CJK-A or B. Is that really true, I mean are these characters really not in Morohashi ? Raymond Mercier
Re: Much better Latin-1 keyboard for Windows
John Cowan writes > http://www.livejournal.com/users/gwalla/39856.html is a page about > (and a link to) a truly excellent Windows keyboard driver that > provides full access to the Latin-1 range Latin-1 is not everything! If you need to transcribe Arabic/Hebrew/Sanskrit/Farsi, you will need the macrons on vowels (Latin Extended-A) and various dot-under letters (Latin Extended Additional). I made my own layout using the DDK. Raymond Mercier
Re: Looking for transcription or transliteration standards latin->arabic
Peter Kirk writes
> This is more complicated than it looks. The Greek form Istimboli is
> impossible for the period as Greek had no [b] sound, for β was
> pronounced [v] except that later and perhaps already at that period μπ
> was pronounced [b] at least in foreign words. So is the Greek consonant
> cluster μβ, or μπ, or ÎÎÏ, or what? Also is the previous consonant
> cluster στ as transliterated, or σθ corresponding to "isthmus"? And then
> what are the Greek vowels?
I was only trying to grasp the sense of Gerd's throw-away remark (which I hope he will explain), but I appreciate the difficulties you raise, especially the point about the Greek beta as the phoneme /v/. That particular difficulty at least doesn't apply to the Ottoman b, if we look for a Turkish -bul < πολις.
Raymond Mercier
http://ourworld.compuserve.com/homepages/RaymondM
Re: Looking for transcription or transliteration standards latin->arabic
Gerd Schumacher wrote
> I think, the underlying meaning of Istimboli must be
> "town at the isthmus", which makes sense, indeed.
How does that work ? Do you mean istim<ισθμος , bol<πολις ?
Raymond Mercier
Re: Philippe's Management of Microsoft (was: Re: Yoruba Keyboard)
James Kass writes:
> IE6 displays CJK(A) in UTF8 just fine. It can't seem to handle
> CJK(B) in UTF-8, though.
Isn't it the other way round ? I attach a file with three characters all in UTF8, representing CJK(A), CJK and CJK(B). The CJK(A) displays in IE6 only if ... is included, but it *does* handle the CJK(B) without any reference to lang. In Mozilla all three display without the "lang=ZH". Of course to see the CJK(B) you need the font Simsun (Founder Extended).
Raymond

Title: Definition Search
㖾 35BE
址 5740
𨀣 28023
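The three sample characters differ in how they are encoded, which may help explain why a browser could treat them differently: the Extension A and basic CJK characters sit in the Basic Multilingual Plane, while the Extension B character is supplementary. A minimal Python sketch shows the encoding forms; the function name `encoding_forms` is made up for the illustration.

```python
# Sketch: UTF-8 bytes and UTF-16 code units for the three sample characters.
SAMPLES = {
    "CJK(A)": "\u35BE",
    "CJK": "\u5740",
    "CJK(B)": "\U00028023",
}


def encoding_forms(ch: str) -> dict:
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8": " ".join(f"{b:02X}" for b in utf8),
        "utf16": [f"{u:04X}" for u in units],
    }


for label, ch in SAMPLES.items():
    print(label, encoding_forms(ch))
```

The BMP characters take three UTF-8 bytes and one UTF-16 unit each; the Extension B character takes four UTF-8 bytes and a surrogate pair in UTF-16.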
Re: Philippe's Management of Microsoft (was: Re: Yoruba Keyboard)
Kenneth Whistler writes, replying to Philippe > This kind of long-winded harangue about how Microsoft should manage its > business is OT for this list and is generally insulting to the Microsoft > participants as well. Please take it elsewhere and do not bother the > Unicode list with your management plans for Microsoft's internal > business. It is all very well to mock Philippe, but IE6 fails badly if it cannot even display CJK(A) in UTF8, something Mozilla does perfectly well. If there are Microsoft participants in this list perhaps they could explain this failure. Broadly speaking I am pro-Microsoft, but this behaviour in IE6 reflects badly on them. Raymond Mercier
Re: CJK(B) and IE6
[Earlier posting lost, it seems.] James Kass writes: > The lack of support for supplementary characters expressed in UTF-8 > in the Internet Explorer is a bug. As Philippe Verdy mentions, the > Mozilla browser does not have this same bug. Also it should be > noted that the Opera browser handles non-BMP UTF-8 just fine. As I said in my starting message Mozilla copes with everything, both UTF8 and NCR, over the whole CJK range. However Opera (in my experience) cannot do Ext B in either UTF8 or NCR. IE6 cannot cope with Ext A in UTF8, but will do so in NCR. I attach two short files (produced by Hanfind) that include both extensions, one in UTF8 and the other NCR (except that characters given within the text are all NCR). > While working with NCRs may be an ugly nightmare, there are some shortcuts. BabelPad is great, but it chokes in converting all the UTF8 in unihan.txt to NCR at one go. I wrote a dedicated program to do that. > I *think* that Windows 2000 uses Unicode always internally and uses an > internal conversion chart if material is non-Unicode like GB-18030. > That at least is declared http://www.i18nguy.com/surrogates.html. 
Raymond Mercier

Title: Definition Search
㖾 35BE, E4 (same as 咢) to beat a drum; to startle, to argue; to debate; to dispute, (interchangeable 愕) to be surprised; to be amazed; to marvel, (interchangeable 鍔) the blade or edge of a sword, beams of a house
㝔 3754, YAO4 deep bottom; the southeast corner of a house
㝢 3762, YU3 (same as 宇) a house; a roof, look; appearance, space
㝪 376A, DIAN4 DING3 a slanting house, nightmare
㡯 386F, ZHAI2 (ancient form of 宅) wall of a building, a house, to keep in the house, thriving; flourishing, blazing, (ancient form of 度) legal system; laws and institutions, to think; to consider; to ponder; to contemplate
㡯 386F, ZHAI2 (ancient form of 宅) wall of a building, a house, to keep in the house, thriving; flourishing, blazing, (ancient form of 度) legal system; laws and institutions, to think; to consider; to ponder; to contemplate
㡰 3870, YU3 (large seal type 宇) a house; a roof, appearance, space; the canopy of heaven, to cover
㡵 3875, LING2 roof of the house connected
㡸 3878, ZHA3 ZHA4 a house; an unfinished house, uneven; irregular; unsuitable; ill-matched, tenon
㡸 3878, ZHA3 ZHA4 a house; an unfinished house, uneven; irregular; unsuitable; ill-matched, tenon
㡺 387A, DAN4 a cottage; a small house, a small cup
㢂 3882, YAN3 (terrains) of highly strategic; precipitious (hill, etc.
a big mound, (same as VEA 3888) a collapse house, to hit, to catch something
㢈 3888, TUI2 a collapsed house, (same as 堆) to heap up; to pile
㢎 388E, CHA4 ZE2 ZHAI2 ZHE2 hide; conceal, a house not so high
㢑 3891, TUI2 (corrupted form of VEA 3888) a collapsed house, (same as 堆) to heap up; to pile
㢒 3892, CHA2 an almost collapsing house
㢖 3896, not available a store house, to store
㢗 3897, QIAO4 a high house; a high building
㢚 389A, LU3 a corridor; a hallway; rooms around the hall (the middle room of a Chinese house), a nunnery; a convent, a cottage; a hut, a mansion
㢝 389D, not available cottage; a coarse hourse, house with flat roof
㢞 389E, YI4 rooms connected, moveable house ( a yurt, a portable, tentlike dwelling used by nomadic Mongols)
㭽 3B7D, DI3 (non-classical form of 柢) root; foundation; base, eaves of a house; brim
㯪 3BEA, LING2 (same as 櫺) carved or patterned window-railings; sills, the wooden planks which join eaves with a house
㰃 3C03, MIAN2 (same as U+6AB0) a tree, the bark of which is used in medicine-- Eucommia ulmoides, an awning of the house
㰅 3C05, DI2 (same as 樀) eaves of a house; brim, part of a loom, the cross beams on the frame on which silkworms spin, a bookcase, to abandon or give up
㼟 3F1F, BAI2 a tiled house, brick wall of a well
䅊 414A, DU4 a spacious house, (corrupted form of 秺) bundle of rice plant, name of a place
䆖 4196, HONG2 a big house, (same as 宏) great; vast; wide; ample
䆧 41A7, not available (same as 窩) a cave; a den, living quarters; a house, to hide; to harbor
䆲 41B2, not available a spacious house, emptiness
䆵 41B5, CHENG2 an echo, a high and deep; large; big; specious house
䆸 41B8, CHENG2 spacious; capacious, sound (of the house), a picture (on silk) scroll
䗔 45D4, HOU2 a house-lizard or gecko, a kind of insect; living in the water
䦗 4997, XU4 (same as 侐) quiet (house, surrounding, etc.)
䳸 4CF8, MA2 MAI2 the wild goose, sparrow; the house-sparrow
䵇 4D47, XIAN4 to dislike; to reject; to hate, a house; a building
䵺 4D7A, TING3 (same as 圢 町)boundary between agricultural lands, (in Japan) a street; a city block, ant hill; formicary, vacant land by the side of a house; a paddock, deer trace; deer track
址 5740, ZHI3 site, location, land for house
墅 5885, SHU4 villa, country house
壁 58C1, BI4 partition wall; walls of a house
宇 5B87, YU3 house; building, structure; eaves
室 5BA4, SHI4 room, home, house, chamber
家 5BB6, JIA1 JIE5 GU1 house, home, residence; family
屋 5C4B, WU1 house; room; building, shelter
庳 5EB
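The message mentions a dedicated program for turning the UTF-8 in unihan.txt into numeric character references without loading the whole file at once. That program is not shown, so the following is only a sketch of one way such a conversion could be done line by line; the function names are hypothetical and the hexadecimal `&#x...;` form is an assumption about the desired output.

```python
import io


def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal NCR."""
    return "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in text)


def convert_stream(src, dst) -> int:
    """Copy text from src to dst one line at a time, converting as we go.

    Only one line is held in memory, so the size of the input does not matter.
    Returns the number of lines processed.
    """
    count = 0
    for line in src:
        dst.write(to_ncr(line))
        count += 1
    return count


# Small in-memory demonstration using entries from the attached listing.
sample = io.StringIO("㖾 35BE\n址 5740\n𨀣 28023\n")
out = io.StringIO()
convert_stream(sample, out)
print(out.getvalue())
```

For a real file the two streams would be opened with `open(path, encoding="utf-8")` for reading and an ASCII-safe encoding for writing.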
Re: CJK(B) and IE6
Raymond Mercier wrote:
> However, I am disappointed to find that IE6 will not display
> U+2, etc.
See http://www.i18nguy.com/surrogates.html, may help.
-- François

Thanks very much. With these changes in the Registry the font Simsun (Founder Extended) displays in IE, and in my Hanfind too, since that relies on the browser.
Hanfind: http://ourworld.compuserve.com/homepages/RaymondM
Raymond
CJK(B) and IE6
Having installed the large font Simsun (Founder Extended), which covers much of CJK(Ext B)(U+2, etc), I find that these characters appear in MS Word, Wordpad and Notepad. However, I am disappointed to find that IE6 will not display U+2, etc. Of course in Tools/Internet Options, I have set the Asian Font display to this new font. The same browser that is used in IE6 can be coupled with other applications (compiled in VC6, for example), but the result is the same. On the other hand Mozilla will show these characters. I know that applications can be arranged to use the Mozilla browser, but that is a whole new programming ball game, that frankly I could do without. Raymond Mercier
Re: GB18030 and super font
Eric, Amazin' Amazon!! Now why didn't I think of that ? In fact the UK Amazon.co.uk says it is discontinued, so I would have to get it from Amazon in the US. It is not the first time that the two Amazons fail to connect. Many thanks for the tip,
Raymond

- Original Message -
From: Eric Muller
To: [EMAIL PROTECTED]
Sent: Thursday, April 22, 2004 5:40 PM
Subject: Re: GB18030 and super font

Raymond Mercier wrote:
> But that link to proofing tools leads nowhere. Maybe it's not so easy to get the CHS version.

Includes ~140 fonts, mostly for CJK, Arabic, Hebrew but other scripts as well. Includes "Simsun (Founder Extended)" aka "宋体-方正超大字符集", with 65,531 glyphs!
Eric.
GB18030 and super font
Mark Shoulson writes
> their Super Font is bundled with Microsoft Office XP, and
> even Microsoft's prices haven't gotten that high!
From Microsoft, http://www.microsoft.com/globaldev/DrIntl/columns/015/default.mspx :
"A font that contains Simplified Chinese glyphs from both CJK Extension A and B sets is "SimSun (Founder Extended)" (SurSong.ttf in the system), or 宋体-方正超大字符集 (in Chinese). It is currently available in the Simplified Chinese (CHS) version of Office XP, or the Microsoft Office Proofing Tools. Click the link for more information and how to buy."
But that link to proofing tools leads nowhere. Maybe it's not so easy to get the CHS version.
Raymond
GB18030 and super font
I am intrigued by GB18030 encoding. There is a table of equivalences in http://oss.software.ibm.com/cvs/icu/~checkout~/charset/data/xml/gb-18030-2000.xml No doubt Unihan will at some stage include these 2 & 4 byte values. I enquired about the 'super font' created by a Beijing foundry, http://font.founder.com.cn/english/web/index.htm, and am fairly astonished at the prices, as you see from the attached. I suppose this is the only source for such a full font.
Raymond Mercier

- Original Message -
From: "GaoZhiQing (高志青)" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, April 22, 2004 12:09 PM
Subject: re:GB18030 super font

Hello Mercier,
The price of our GB18030 font: 20,000US$/1 font per year license. 80,000US$/perpetual license.
The price of our GB2312 bitmap font: 15,000US$/4 years license. (ONE SIZE)
(We provide China standard bitmap fonts; the price and term are set by the Chinese government. Your company must make an agreement with the Chinese government. We can act as an agent.)
Best regards,
Gao Zhiqing
Beijing Founder Electronic Co., Ltd. Network Circulation Division
Add: 9, No.5 Street, Shangdi Information Industry Base, Haidian District, Beijing 100085, China
Tel: 86-10-62981432 Fax: 86-10-62981438 Mobile: 13501204825
E-mail: [EMAIL PROTECTED]
http://font.founder.com.cn

- Original Message -
From: Raymond Mercier [mailto:[EMAIL PROTECTED]
Sent: April 19, 2004 2:08
To: Font Support Mailbox
Subject: GB18030 super font

It would be helpful to learn availability and cost of the full GB18030 font. The bitmap fonts (GB2312) are also of interest.
Dr R.Mercier St Ives, Cambs UK
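The "2 & 4 byte values" mentioned at the start of this message can be inspected without the mapping table, since Python's standard library ships a `gb18030` codec. The sketch below is an aside using that codec, not the ICU table linked above; the helper name `gb18030_hex` is made up for the illustration.

```python
def gb18030_hex(ch: str) -> str:
    """Return the GB18030 bytes of one character as spaced hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("gb18030"))


# ASCII is one byte, a common ideograph is two bytes, and a supplementary
# (CJK Extension B) character takes the four-byte form.
for ch in ("A", "\u5740", "\U00028023"):
    encoded = ch.encode("gb18030")
    print(f"U+{ord(ch):04X}", len(encoded), "byte(s):", gb18030_hex(ch))
    # The mapping is reversible, so decoding recovers the original character.
    assert encoded.decode("gb18030") == ch
```

Because GB18030 covers every Unicode code point, a font is the limiting factor for display, not the encoding.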
Re: Unihan.txt and the four dictionary sorting algorithm
John Jenkins writes >>Also, even though the full Unihan database is 25+ Mb in size, given the cheapness of disk space nowadays, it's not all *that* big, surely. << The problem of the size of Unihan has nothing at all to do with the cost of storage, and everything to do with the functioning of programs that might open and read it. Since the lines in Unihan are separated by 0x0A (LF) alone, not 0x0D 0x0A (CR LF), the lines are not separated when the file is opened in Notepad. Notepad does have the advantage that the UTF-8 encoding is recognized, and the characters are displayed. If the file is opened in WordPad the Chinese characters do not appear, perhaps because the UTF-8 encoding is not recognized. If I try MS Word the machine grinds to a halt - and this is a good modern machine (XP with a 120 GB HD and 512 MB RAM). Similarly if I open it in IE6, with UTF-8 encoding, the text opens up to around U+4C00, and then grinds to a halt. I can open it in the HexWorkshop byte editor, or in the editor in Visual C 6, but these do not recognize UTF-8 encoding, and they hardly count as suitable readers for such a file. I wish the people who designed this file would accept the need for a more structured and sophisticated approach. Why not, for example, have a basic HTML file, with links to the various sections ? Raymond Mercier
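For the line-ending problem specifically, one possible workaround is to make a copy of the file with CR LF line ends before opening it in Notepad. This is only a sketch: the function name and the file names in the usage note are hypothetical. It works on raw bytes, which is safe for UTF-8 because the bytes 0x0A and 0x0D never occur inside a multi-byte sequence.

```python
# Sketch: copy a large LF-only text file to a CR LF copy, one chunk at a time,
# so the whole file never has to sit in memory at once.
def lf_to_crlf(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> None:
    """Copy src_path to dst_path, turning each bare LF (0x0A) into CR LF (0x0D 0x0A)."""
    prev_was_cr = False  # remembered across chunk boundaries
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            out = bytearray()
            for byte in chunk:
                if byte == 0x0A and not prev_was_cr:
                    out.append(0x0D)  # supply the missing CR
                out.append(byte)
                prev_was_cr = byte == 0x0D
            dst.write(out)
```

A call such as `lf_to_crlf("Unihan.txt", "Unihan-crlf.txt")` would produce the copy; lines that already end in CR LF are left alone.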
Re: Unihan.txt and the four dictionary sorting algorithm
Ernest Cline writes >>I'm trying to pare Unihan.txt down to a less unwieldy size for my own use by eliminating properties that are of no interest to me << The sheer size of Unihan creates problems, hence the need to extract manageable subsets. This is the basis of my Hanfind (http://ourworld.compuserve.com/homepages/RaymondM), which isolates Pinyin, definitions, etc. Andrew West once suggested that Unihan be converted to an XML file, which would appear to help isolate the different fields. Raymond Mercier
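A subset of this kind can be extracted with very little code. The following Python sketch assumes the usual layout of Unihan.txt: comment lines starting with '#', and data lines of the form code point, tab, property name, tab, value. It is not how Hanfind works, and the sample lines are illustrative rather than quoted from the file.

```python
# Sketch: keep only selected properties from Unihan-style lines.
from typing import Iterable, Iterator, Set, Tuple

# Illustrative lines in the tab-separated layout; not copied from the real file.
SAMPLE = (
    "# comment line\n"
    "U+4E00\tkDefinition\tone; a, an; alone\n"
    "U+4E00\tkMandarin\tYI1\n"
    "U+4E00\tkTotalStrokes\t1\n"
)


def select_fields(lines: Iterable[str], wanted: Set[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (code point, property, value) for each line whose property is wanted."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t", 2)
        if len(parts) == 3 and parts[1] in wanted:
            yield parts[0], parts[1], parts[2]


for row in select_fields(SAMPLE.splitlines(), {"kMandarin", "kDefinition"}):
    print(row)
```

For the real file one would pass an open file object instead, for example `open("Unihan.txt", encoding="utf-8")`; the function reads one line at a time, so the size of the file is not an obstacle.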
Re: Web Form: Subj: Unicode conversion- Microsoft Visual C++ compiler
Mino, This is not at all clear: the character U+0427 is Ч in the Cyrillic block, and what does this have to do with the two characters Ð and §, which are U+00D0 and U+00A7 ? Are you wondering how to store 0x0427 in a binary file ? Or what ? Raymond Mercier > > Contact: [EMAIL PROTECTED] > > Report Type: Other Question, Problem, or Feedback > > Opt Subject: Unicode conversion > > > > I would like to convert a 2 byte Unicode code into its > > corresponding Unicode character (for instance the decimal 1063 or the > > hexadecimal 0427 into 'Ð§'). Is there a C function in order to make the > > conversion? What file .h do I need to include in the C program? Can I > > use the 6.0 version of the Microsoft Visual C++ compiler, or do I > > need a newer version? > > Thanks a lot in advance. > > Mino Napoletano > > > > -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- > > (End of Report) >
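One possible reading of the question, and it is only my assumption, is that the two characters are what U+0427 looks like when its UTF-8 bytes are displayed as Latin-1. The sketch below is in Python rather than C, purely to show the arithmetic: the code point, its two UTF-8 bytes, and the two characters those bytes become if each is read on its own.

```python
# Sketch: from a numeric code point to the character, and how a two-character
# garble can arise when UTF-8 bytes are read as Latin-1 (an assumed explanation).
code_point = 0x0427                  # decimal 1063
ch = chr(code_point)                 # CYRILLIC CAPITAL LETTER CHE
utf8 = ch.encode("utf-8")            # the two bytes D0 A7
misread = utf8.decode("latin-1")     # each byte taken as one Latin-1 character

print(f"U+{code_point:04X} as UTF-8: {utf8.hex().upper()}")
print("read as Latin-1:", [f"U+{ord(c):04X}" for c in misread])
```

The second line of output lists U+00D0 and U+00A7, the two characters mentioned in the reply above.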
Re: CJK U+3ADA and U+66F6
James Kass writes: > Is there a difference between U+66F6 and U+3ADA? > > The newest UNIHAN.TXT file doesn't have a definition field for > U+66F6. The glyphs in the Unicode 4.0 book appear identical > for these two characters. One is placed with radical 72, the > other with radical 73, although UNIHAN.TXT gives both as > having radical 73. Only experts with access to all the references will sort this out, but at least note that both characters are placed under radical 73 in both Unicode 4.0 (p.1237) and the revised Unihan. Raymond Mercier
Greek zero
I have made a proposal to the UTC to encode the Greek symbol for zero, as used in astronomical texts. An extended version of this is available on my site http://ourworld.compuserve.com/homepages/RaymondM/. It is a rather long pdf file. Raymond Mercier
RE: Code points on Windows
>>RichEdit 4.1 (used in Windows XP SP1 WordPad and later) also have the toggle I am using WordPad on Win2000 (SP4), and Word 2002. I found after rebooting that Alt-X now works in WordPad, but not the reverse. According to WordPad's About box, I am using 'Version 5 SP 4'. In a program of mine (Hanfind, at http://ourworld.compuserve.com/homepages/RaymondM/), where I used a RichEdit control, the reverse also fails. If Alt-X fails in Word, you need to check the assignment of shortcuts for Word commands, as follows: in Tools/Macros, select Word Commands; look for ListCommands; click Run. This gives a list of commands and their shortcuts: look for 'Toggle Character Code'. The corresponding shortcut should be Alt-X, but if it has been reassigned for any reason, it has to be reset. Raymond Mercier - Original Message - From: Mike Ayers To: 'Murray Sargent' ; Raymond Mercier Cc: [EMAIL PROTECTED] Sent: Wednesday, January 14, 2004 11:19 PM Subject: RE: Code points on Windows > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On > Behalf Of Murray Sargent > Sent: Wednesday, January 14, 2004 3:01 PM > WordPad on Windows 2000 and XP support Alt+x. Win95 and Win98 WordPads > don't, since they used earlier RichEdit's than version 3.0. > Version 3.0 > doesn't have the toggle: Alt+x converts a hex code string to > the Unicode > character; Alt+X does the reverse. Word 2002 added the Alt+x facility ^ Alt+Shift+X > with the nice wrinkle of making it a toggle. Accordingly RichEdit 4.0 > (used in Office 2002) and RichEdit 4.1 (used in Windows XP SP1 WordPad > and later) also have the toggle, as does RichEdit 5.0 shipped with > Office 2003. I'm sure he knew that - his fingers just forgot. ;-) /|/|ike
Re: Code points on Windows
In MS Word if you type the hexadecimal Unicode code point, followed by Alt-X, you get the character (if you have the font). This works in reverse. Sometimes in a RichEdit control window it will work in the first direction, but not in reverse. It does not work in WordPad, in spite of its use of RichEdit. I don't know why not. Raymond Mercier - Original Message - From: Mike Ayers To: '[EMAIL PROTECTED]' Sent: Wednesday, January 14, 2004 7:34 PM Subject: Code points on Windows On Windows, it is well known that you can generate a character from its code point by holding down the alt key and typing the code point in decimal, with a leading 0, on the numeric keypad. I recall that there is also a method to do this in reverse - given a character in, say, WordPad, one can get the Unicode code point for that character (copied to the clipboard, I believe). However, I have forgotten how to do this. Can anyone help me out here? Thanks, /|/|ike
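The two directions of the conversion can be imitated in a few lines of Python. This is only a rough emulation, under my own assumptions about what counts as a code (a trailing run of 4 to 6 hex digits); it makes no claim to reproduce what Word or RichEdit do internally, and the function names are hypothetical.

```python
# Sketch: a rough imitation of the two directions of the Alt-X conversion.
import re

_TRAILING_HEX = re.compile(r"([0-9A-Fa-f]{4,6})$")  # assumed rule: 4 to 6 hex digits at the end


def hex_to_char(text: str) -> str:
    """Replace a trailing run of hex digits with the character it names."""
    match = _TRAILING_HEX.search(text)
    if not match:
        return text
    value = int(match.group(1), 16)
    if value > 0x10FFFF:  # beyond the Unicode code space: leave the text alone
        return text
    return text[: match.start()] + chr(value)


def char_to_hex(text: str) -> str:
    """Replace the last character with its code point as at least four hex digits."""
    if not text:
        return text
    return text[:-1] + f"{ord(text[-1]):04X}"


print(char_to_hex(hex_to_char("01CE")))
```

Note the ambiguity such a scheme has to live with: in a string like "abc0427" the letters a, b and c are themselves hex digits, so the code has to be set off from preceding text, for example by a space.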
Re: Chinese rod numerals
Christopher, This is an excellent suggestion. A submission can be made using n2352-form.pdf, which you can get from this site: http://www.dkuug.dk/JTC1/SC2/WG2/docs/summaryform.html Raymond Mercier - Original Message - From: "Christopher Cullen" <[EMAIL PROTECTED]> To: "Unicode list" <[EMAIL PROTECTED]> Sent: Saturday, January 10, 2004 12:23 PM Subject: Chinese rod numerals > > I am an academic with research interests in the history of ancient > Chinese mathematics, and I should like to propose the encoding of > traditional Chinese rod numerals. > > These represent the arrays of "counting rods" on a counting board as > used in China for complex calculations before the invention of the > abacus. There are eighteen forms in all, representing the numerals one > to nine in two forms which are basically versions of each other with a > 90 degrees rotation. One form is used for units, the other for > tens, then back to the first form for hundreds, and so on. A zero is > represented by a gap in the array. For pictures of these and an > explanatory text, see: > > http://www.math.sfu.ca/histmath/China/Beginning/Rod.html > > These forms appear in pre-modern mathematical books in China, and in > modern books discussing ancient mathematics. They are not to be > confused with the related "Hangzhou numerals", which are already > encoded at 3021-303a. It would be a great convenience to have these > as a standard resource rather than having to create a special private > font in order to represent them. > > From a private source, I have been told that these forms are neither in > any current Unicode encoding initiative, nor indeed anywhere in the > proposal pipeline. I should therefore be grateful for any comments or > advice that might guide me towards making a formal submission. > > > Christopher Cullen >
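The alternation described in the quoted message (one form for units, the other for tens, the first again for hundreds, a gap for zero) can be sketched in Python. The code points used here are an assumption relative to this thread: they belong to the Counting Rod Numerals block that was added to Unicode some time after these messages (unit digits at U+1D360 to U+1D368, tens digits at U+1D369 to U+1D371). The function name is hypothetical, and an ordinary space stands in for the gap.

```python
# Sketch: write a positive integer with alternating rod-numeral forms.
# Assumption: the Counting Rod Numerals block, encoded after this thread was written.
UNIT_ONE = 0x1D360  # COUNTING ROD UNIT DIGIT ONE; the digits 1..9 are consecutive
TENS_ONE = 0x1D369  # COUNTING ROD TENS DIGIT ONE; the digits 1..9 are consecutive
GAP = " "           # a zero is shown as an empty place


def to_rods(n: int) -> str:
    """Return n written with unit forms in even places and tens forms in odd places."""
    if n <= 0:
        raise ValueError("only positive integers are handled in this sketch")
    places = []
    for position, digit_char in enumerate(reversed(str(n))):
        digit = int(digit_char)
        if digit == 0:
            places.append(GAP)
        else:
            base = UNIT_ONE if position % 2 == 0 else TENS_ONE
            places.append(chr(base + digit - 1))
    return "".join(reversed(places))


print([f"U+{ord(c):04X}" for c in to_rods(231)])
```

For 231 this gives a unit-form 2 for the hundreds, a tens-form 3, and a unit-form 1, following the rule in the quoted description.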
Re: Today is neither Thursday nor Friday
> > Michael Everson scripsit: > > > On 21 December 2012 the Mayan Long Count calendar will tick over from > > 12.19.19.17.19 to 13.0.0.0.0. Isn't that cool > --- subject to considerable uncertainty about the alignment between the Mayan cycles and our own calendar (I mean the "Ahau constant"). See my Kairos 3, at http://ourworld.compuserve.com/homepages/RaymondM/ [I know this is OT, but it is a holiday.] Raymond Mercier
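To make the dependence on the constant concrete, here is a small Python sketch that turns a Gregorian date into a Long Count by way of the Julian Day Number. The value 584283 is an assumption: it is one commonly quoted correlation constant, and it is the one under which the quoted date works out. Passing a different constant shifts every result by the same number of days.

```python
# Sketch: Long Count from a Gregorian date, for a given correlation constant.
from datetime import date

AHAU_CONSTANT = 584283  # assumption: one commonly quoted value; others have been proposed
JDN_OFFSET = 1721425    # Julian Day Number = date.toordinal() + 1721425
UNITS = (144000, 7200, 360, 20, 1)  # baktun, katun, tun, uinal, kin, in days


def long_count(day: date, constant: int = AHAU_CONSTANT) -> tuple:
    """Return (baktun, katun, tun, uinal, kin) for a proleptic Gregorian date."""
    days = day.toordinal() + JDN_OFFSET - constant  # days elapsed since 0.0.0.0.0
    parts = []
    for unit in UNITS:
        parts.append(days // unit)
        days %= unit
    return tuple(parts)


print(".".join(map(str, long_count(date(2012, 12, 20)))))  # 12.19.19.17.19
print(".".join(map(str, long_count(date(2012, 12, 21)))))  # 13.0.0.0.0
```

With a constant two days larger, for instance, the same rollover would fall on 23 December instead, which is the kind of uncertainty referred to above.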
Re: MS Windows and Unicode 4.0 ?
Well can we be perfectly clear about this: I read that OS X is Unicode compliant, yet I understand you to say that Word (as part of Office) on OS X is not. If that is true of Word on OS X then I am surprised - even amazed, but that seems to be what you said. Is it really the case that characters in Word in OS X are not stored as Unicode, even though they are so stored in Word in Windows NT (and later) on a PC ? If not stored as Unicode on a Mac, then how are they stored ? Raymond Mercier - Original Message - From: "Michael Everson" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, December 04, 2003 3:43 PM Subject: Re: MS Windows and Unicode 4.0 ? > > At 15:00 + 2003-12-04, Raymond Mercier wrote: > >Arcane Jill writes > >My next OS will be a Mac. > > > >Before you rush off to the nearest Mac showroom: > > > >Michael Everson 25/11/03 wrote > >>Microsoft Office on OS X does not support Unicode. Quark XPress on > >>OS X does not support Unicode. Adobe InDesign on OS X does not > >>support Unicode inputting via keyboard, and doesn't shape > >>Devanagari properly. Eudora on OS X does not support Unicode. > >> > >>These companies have work to do if their products are to be > >>Unicode-enabled for Mac OS X. It is frustrating. > > Do ***NOT*** quote me as a reason not to buy a Macintosh. > > Using a Macintosh is a joy. Unicode support at the OS level is strong > and stable. That Microsoft, Quark, Adobe, and Qualcomm have work to > do to allow their customers to take advantage of the richness Apple > provides us is *their* challenge. And when they do, using a Macintosh > will be even more of a pleasure than it is now. > -- > Michael Everson * * Everson Typography * * http://www.evertype.com
Re: MS Windows and Unicode 4.0 ?
Right. And they even have the nerve to charge for it. I use OE. Raymond - Original Message - From: "Stefan Persson" <[EMAIL PROTECTED]> To: "Raymond Mercier" <[EMAIL PROTECTED]> Cc: "Arcane Jill" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Thursday, December 04, 2003 3:49 PM Subject: Re: MS Windows and Unicode 4.0 ? > > Raymond Mercier wrote: > > Eudora on OS X does not support Unicode. > > Eudora doesn't support Unicode on *any* OS, right? > > Stefan
Re: MS Windows and Unicode 4.0 ?
Arcane Jill writes > My next OS will be a Mac. Before you rush off to the nearest Mac showroom: Michael Everson 25/11/03 wrote >> Microsoft Office on OS X does not support Unicode. Quark XPress on OS X does not support Unicode. Adobe InDesign on OS X does not support Unicode inputting via keyboard, and doesn't shape Devanagari properly. Eudora on OS X does not support Unicode. These companies have work to do if their products are to be Unicode-enabled for Mac OS X. It is frustrating. << Raymond Mercier
Re: Free Fonts
Philippe Verdy writes > Simple: for now the fonts are in beta, and do not include the hinting > instructions. This may be in development, but faces some legal issues > with Apple patents. So until there's a patent-free hinting mechanism, > for use in fonts, or Apple agrees with a royalty-free license on > hinting mechanisms, hinted fonts cannot be freely distributed. > What is the legal position if these fonts are taken into FontLab and rehinted ? Surely if I make my own hinted font in FontLab I do not owe royalties to Apple. Raymond Mercier
Re: Fonts on Web Pages
Of course Adobe Acrobat was designed to solve just the problem you describe, and it works well, with your embedded fonts, etc., so the recipient sees just what you write. OTOH, what about using Word with your embedded fonts, and then saving the document as mht (Web Archive File)? Have a look at: http://www.softcities.com/WebArchiveX/download/6912.htm >>The WebArchiveX Component API lets you programmatically save a complete Web page as a single Web Archive file (.mht) file. (Same as "Save as Web Archive" in Microsoft Internet Explorer). Web Archive is an Internet standard for keeping HTML documents within a MIME formatted message including graphics, scripts and style sheets as its body parts. Packing HTML files with the WebArchiveX COM Component helps to avoid errors when you send the Web page by email or publish it electronically. WebArchiveX can be used with any programming language and supports multi-threaded environments << Raymond - Original Message - From: Arcane Jill To: [EMAIL PROTECTED] Sent: Tuesday, December 02, 2003 12:02 PM Subject: RE: Fonts on Web Pages The use of PDF files does solve a problem, yes, but it solves a different problem from the one about which I had asked. I specifically want to know the current state-of-the-art regarding the use of fonts on web pages. I believe someone was working on this, but I don't know if it was the W3C or some other bunch. Jill
Re: Fonts on Web Pages
Surely Adobe Acrobat will solve both problems ? The recipient only needs to have the Acrobat Reader installed, and who does not already have that ? Raymond Mercier - Original Message - From: Arcane Jill To: [EMAIL PROTECTED] Sent: Tuesday, December 02, 2003 10:29 AM Subject: Fonts on Web Pages Anyone know the current status on embedded fonts in web pages? I basically have two questions. (1) Assume the existence of a font to which I legally own the copyright. For example, let's say I invented it. Now, I design a web page which uses this font. Now, it's easy (but terribly inconvenient) to say on the web page "Please download and install this font in order to view this web page correctly", but the truth is I know damn well that no-one will ever do that. So, short of using small image files, what's the current state-of-the-art technical solution to this? Question (2) is the same as question (1), except that I don't own the copyright. Suppose, for example I want to use this font called Garamond. It's on my machine. (I don't know how it got there - I think it came pre-installed with the OS). But of course, I can't guarantee that it will be installed on someone else's machine. And since I don't own the copyright, and don't have explicit permission to distribute it, I don't think I'm even allowed to say "Please download and install this font in order to view this web page correctly". How do we solve this one? Jill
Re: How can I have OTF for MacOS
OK, I stand corrected on Mozilla ! Raymond Mercier
Re: How can I have OTF for MacOS
Michael Everson writes > Eudora on OS X does not support Unicode. > Eudora doesn't support Unicode anywhere, surely ? To my knowledge on a PC the only mail handler that is Unicode compliant is Outlook Express. Raymond Mercier
unicode@unicode.org
I am not sure if this is a point that really involves Unicode blocks, but someone on this list might have a comment. In Word 2002 there is a bug that is cleared up in Word 2003 (at least in the Beta, which I have been playing with). In Word 2002 the Style may assign one particular font for Latin characters, but when certain Latin characters are inserted, the font switches to the Asian font even when the characters are found in the Latin font. For example, if the Latin font is Times and the Asian font is SimSun, and the character U+01CE is inserted, the font switches to SimSun, even though U+01CE is available in Times. (U+01CE is the letter a with the Chinese 3rd tone mark, a small v placed over the a.) If the character is selected and the font is switched back to Times, then the glyph is taken from the Times font. The problem is of course not restricted to this character, or the Times font, but happens in many other situations in Word 2002. This bug has disappeared in Word 2003, where all Latin-based characters are taken from the Latin font, as long as the character is really present in that font. In Word 2002 the problem extends to Greek fonts when accented characters are inserted: in that case the font switches to Arial Unicode, even when the accented character is in the default font, such as Cardo. This too is cleared up in Word 2003. Raymond Mercier
Re: How can I input any Unicode character if I know its hexadecimal code?
John Cowan wrote > It's an XML editor (recte XMetaL), and an XML editor that can't handle > Unicode would be a sorry specimen indeed. A quick glance at the program's site suggests that there cannot be such a serious problem: http://www.corel.com/servlet/Satellite?pagename=Corel/Products/productInfo&id=1042152756365&did=1042152754863&content=FAQ#top >>How does Corel® XMetaL® 4 support Unicode? Corel XMetaL 4 features UTF-8 and UTF-16 encoding in conformance with Unicode 3.0 for the transparent display and editing of all left-to-right reading languages. Unicode support is available in the document window, as well as in the customizable interface elements (e.g., menu items and toolbar names) found in Corel XMetaL 4 and the macro script-editing interface.<< There are a number of other encouraging paragraphs as well. So what is the problem ? Raymond Mercier - Original Message - From: "John Cowan" <[EMAIL PROTECTED]> To: "Raymond Mercier" <[EMAIL PROTECTED]> Cc: "Patrick Andries" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Saturday, November 15, 2003 5:16 PM Subject: Re: How can I input any Unicode character if I know its hexadecimal code? > > Raymond Mercier scripsit: > > > You cannot complain to Unicode if there are software developers who fail to > > make their programs Unicode compliant. What makes you think that XMetal > > (whatever it is) can handle Unicode internally ? > > It's an XML editor (recte XMetaL), and an XML editor that can't handle > Unicode would be a sorry specimen indeed. > > -- > John Cowan <[EMAIL PROTECTED]> > http://www.reutershealth.com http://www.ccil.org/~cowan > .e'osai ko sarji la lojban. > Please support Lojban! http://www.lojban.org
Re: How can I input any Unicode character if I know its hexadecimal code?
> > Then it is a request for enhancement to address to the author of XMetal. > > This is not an issue of Unicode. > > Funnily enough, I thought I wanted to input Unicode characters You cannot complain to Unicode if there are software developers who fail to make their programs Unicode compliant. What makes you think that XMetal (whatever it is) can handle Unicode internally ? Even in Win2000 some features are not Unicode compliant - for example the global search facility applied to folders and files. Certainly Alt-X doesn't work there, and even if you paste a non-ASCII character into the window, you will get nonsense when you run the search. Raymond Mercier
Re: New contribution N2676
Richard, >>Now, unless Zero shares the same glyphic range as Artabe, I’m not sure that they can be unified.<< The last thing I wish is to unify artabe and zero, but the artabe symbol listed in N2676 really is just the same as the zero used in papyri of the age. >>If you look at e.g. ‘Siglae’ in RE 2.2 (1923) 2279-2315 you’ll see that Bilabel lists 16 glyph variants for the Artabe. The most common variants are the ones with a horizontal line (like a dash) with an arrangement of between one and three dots around it, sometimes the dots are solid and sometimes they are hollow circles.<< I will have a look at RE, since I have so far only seen the examples in Kenyon's photos, but it was already clear to me that the entry in N2676 will not do, if only because it fails to represent the confused variety of forms used in the papyri. The question of the zero is separate, and perhaps easier. Anyway I will not try to summarise here my as yet incomplete collection. Raymond - Original Message - From: Richard Peevers To: [EMAIL PROTECTED] Sent: Monday, October 27, 2003 5:07 PM Subject: Re: New contribution N2676 Raymond, Apropos 10186 G GREEK ARTABE SIGN The identity of one glyph variant of ‘zero’ and one of ‘artabe’ raises an interesting problem. For the ‘Zero’ there are, it seems to me, two main characters used for this: one is identical to the letter omicron and the other is a circle (more or less like an omicron) with a more or less elaborate bar over it. It’s only the second that we’d be looking to propose. It seems to me that here we need two characters: 1) Artabe (horizontal line surrounded by 1-3 dots/hollow circles) and 2) Zero (hollow circle with more or less elaborate line above). Richard Richard Peevers Research Associate Thesaurus Linguae Graecae 3450 Berkeley Place Irvine CA 92697-5550 www.tlg.uci.edu www.digressus.org
Re: New contribution N2676
> Should we continue to encode this as ARTABE SIGN and just note the use of > this shape for 'zero' in an annotation? > Should we change it to another name and add the annotation for 'artabe'? > Should we take any other actions? Well, I don't quite know. My real interest is in the changing shape of the zero, but I am not yet ready with a proposal. Besides, in the papyri where Kenyon read artabe this symbol is much of the time coupled with another, the two written rather cursively together. Kenyon carefully records all the different forms, and after seeing that I am in some doubt about what exactly should be encoded. I suspect that the new list is based not on the very many symbols given by Kenyon in his many volumes of transcribed papyri, but on a summary list that he published before that. I wish I could be more definite. Raymond - Original Message - From: "Asmus Freytag" <[EMAIL PROTECTED]> To: "Raymond Mercier" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Sent: Saturday, October 25, 2003 8:26 PM Subject: Re: New contribution N2676 > > At 05:51 PM 10/25/03 +0100, Raymond Mercier wrote: > > Among the new characters in N2676 there is > > > > 10186 G GREEK ARTABE SIGN > > > > This is one of the many signs found in papyri, such as those edited by > >Kenyon. This symbol represents apparently a measure of volume used for > >grain. It appears as a small circle, smaller than omicron, with a long > >overline, much longer than a macron. > > > > While I have been looking for the various forms of the symbol for zero I > >find in other papyri quite exactly the same character used for 'zero'. I make > >this comparison after studying many photographs of papyri, those provided > >with Kenyon's editions on the one hand, and on the other, Alexander Jones' > >recent volume of horoscopes, Astronomical Papyri from Oxyrhynchus. > > The attached image is taken from Jones, part of a column of zeroes written > >this way. > > This is fascinating information. > > However, I'm unclear what you propose. > > Should we continue to encode this as ARTABE SIGN and just note the use of > this shape for 'zero' in an annotation? > > Should we change it to another name and add the annotation for 'artabe'? > > Should we take any other actions? > > A./
Re: New contribution N2676
Among the new characters in N2676 there is 10186 G GREEK ARTABE SIGN This is one of the many signs found in papyri, such as those edited by Kenyon. This symbol apparently represents a measure of volume used for grain. It appears as a small circle, smaller than omicron, with a long overline, much longer than a macron. While I have been looking for the various forms of the symbol for zero I find in other papyri quite exactly the same character used for 'zero'. I make this comparison after studying many photographs of papyri, those provided with Kenyon's editions on the one hand, and on the other, Alexander Jones' recent volume of horoscopes, Astronomical Papyri from Oxyrhynchus. The attached image is taken from Jones, part of a column of zeroes written this way. Raymond Mercier > - Original Message - > From: "Michael Everson" <[EMAIL PROTECTED]> > To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> > Sent: Friday, October 24, 2003 7:36 PM > Subject: New contribution N2676 > > > > > > A new contribution: > > N2676 > > Repertoire additions from meeting 44 > > Asmus Freytag > > 2003-10-23 > > http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2676.pdf > > > > -- > > Michael Everson * * Everson Typography * * http://www.evertype.com
Re: Web Form: Other Question, Problem, or Feedback
> > -Original Message- > > Date/Time:Tue Oct 21 07:54:01 EDT 2003 > > Contact: [EMAIL PROTECTED] > > Report Type: Other Question, Problem, or Feedback > > > > Hello Unicode-Team, > > > > i'm looking for a tool or a tutorial to convert japanese > > signs in numeric unicode signs (e.g. 留). Can you help me? > > > > Greetings from Germany > > T. Nikolai > > please mail to: [EMAIL PROTECTED] Hello Nikolai, You might find some features of interest in my program Hanfind, which you can download from: http://ourworld.compuserve.com/homepages/RaymondM/ It works however from the Pinyin reading of the characters, seen as Chinese not Japanese. If you are working from a text in Word you can always write a macro which gives directly the Unicode value of any character, based on the calculation Unicode = Hex(AscW(Selection.text)). Raymond Mercier
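Outside Word, the same lookup takes a line or two in most languages. As a sketch only, and not a description of how Hanfind does it, the following Python function lists each character of a string with its code point in hex and as a decimal numeric character reference of the kind used in HTML. The function name is hypothetical.

```python
# Sketch: list each character with its code point and its decimal numeric
# character reference (the "&#...;" form used in HTML).
def describe(text: str) -> list:
    """Return (character, 'U+XXXX', '&#NNNN;') for every character in text."""
    return [(ch, f"U+{ord(ch):04X}", f"&#{ord(ch)};") for ch in text]


for ch, code_point, reference in describe("\u7559"):
    print(code_point, reference)
```

For the character 留 given as the example in the question this prints U+7559 and &#30041;.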
Re: Mac fonts
Thanks - I have passed on your messages. Raymond
Re: Mac fonts
Thanks to all who reassure me that TitusCyberbitBasic and Code2000 as I use them on my PC can also be used on Mac with OS X. This is really for a colleague, who has tried without success to install the Titus font that I passed on to her. She tells me she has OS X, and I will just have to discuss it further with her. Raymond Mercier
Mac fonts
I am looking for Mac versions of the fonts TitusCyberbitBasic and Code2000. Any suggestions ? I would like a serif font like Times, with the Latin Extended Additional block. Raymond Mercier
Re: TLG and Beta code
John, I am glad to hear from you. I shall do what I can to get a proposal together. Raymond - Original Message - From: "John Hudson" <[EMAIL PROTECTED]> To: "Raymond Mercier" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]> Sent: Wednesday, August 27, 2003 7:20 PM Subject: Re: TLG and Beta code > > At 05:37 AM 8/27/2003, Raymond Mercier wrote: > > >I know this is common in the TLG, but as you say, they assume it is just > >omicron (an assumption repeated in a message just received from them). > >But, I am trying to get across that that is wrong: it represents neither > >papyri nor Byzantine MSS. > > ... > > >So is there not a good reason to treat this as a distinct character, to be > >assigned to a Unicode codepoint ? > > Raymond, based on what you have said, I would agree. A variety of visual > representations, clearly distinct from the omicron as formed in the same > documents, suggests a separate character. Would you be able to write up a > proposal to encode such a character, or at least an informational document > including illustrations of different forms of the Greek zero, preferably in > proximity to differently formed omicrons? Nothing is going to happen unless > the UTC receive such a document, and you sound like the best person to > prepare one. > > John Hudson > > Tiro Typeworks www.tiro.com > Vancouver, BC [EMAIL PROTECTED] > > You need a good operator to make type. If it were a > DIY affair the caster would only run for about five > minutes before the DIYer burned his butt off. > - Jim Rimmer >
Re: TLG and Beta code
- Original Message - From: "Nick Nicholas" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Wednesday, August 27, 2003 12:33 PM Subject: Re: TLG and Beta code > The equivalent glyph the TLG has posted for #130 is omicron,
Re: TLG and Beta code
> In a Greek text, shouldn't you be using omicron and a combining macron > rather than Latin o with macron? If omicron plus combining macron is an > adequate representation of the glyph, then maybe there is no need for a > new character. > > -- > Peter Kirk > [EMAIL PROTECTED] (personal) > [EMAIL PROTECTED] (work) > http://www.qaya.org/ > Well, it is just simpler to use the Latin, since the combination is a single codepoint. The real point is that it would be nice to have an appropriate Greek form. The TLG assumption is that the Greek texts used omicron for zero, but that is not what you find in the MSS. Against that assumption, I have just written to the TLG as follows: I know that you will find support in Heath, whose Greek Mathematics, Vol.1, p. 45, is surprisingly misleading in just saying that they used omicron. Also in his ed. of Ptolemy's Hypotheses Heiberg has rather perversely put a macron on all the letters except omicron ! (Opera Minora 78.29, for example). This does not adequately represent the Byzantine MSS. I don't have Heiberg's Syntaxis in front of me, but Halma's edition of the Syntaxis is closer to the MSS, and uses o+macron. Elsewhere in the MSS one finds a variety of forms, according to the age etc. In the ninth century MSS zero is represented by a rather small o with a long overline with serifs at either end, much bigger than our macron. In late Byzantine mathematics one finds sometimes a form like the Cyrillic che (U+0447). Certainly the form varies a good deal, but I have not seen a simple omicron, whatever the editors may have put. In the texts edited in Georges Gémiste Pléthon (by Anne Tihon and myself), which I see you include in the TLG, we use a macron on the o, and are doing the same in our edition of Ptolemy's Handy Tables. If we had something closer to the forms used in ninth century MSS we would use it. Raymond
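The point about a single codepoint can be checked with Python's standard unicodedata module. This sketch only compares the two spellings under normalization; it says nothing about which is the better representation of the manuscripts.

```python
# Sketch: Latin o with macron is one precomposed code point, whereas Greek
# omicron plus combining macron stays as two code points even after NFC.
import unicodedata

latin_o_macron = "\u014d"         # LATIN SMALL LETTER O WITH MACRON
greek_sequence = "\u03bf\u0304"   # GREEK SMALL LETTER OMICRON + COMBINING MACRON

latin_decomposed = unicodedata.normalize("NFD", latin_o_macron)
greek_composed = unicodedata.normalize("NFC", greek_sequence)

print([f"U+{ord(c):04X}" for c in latin_decomposed])  # the Latin letter split into base and mark
print([f"U+{ord(c):04X}" for c in greek_composed])    # the Greek sequence after composition
```

The Latin letter decomposes to o plus U+0304, while the Greek sequence is unchanged by composition, so in Greek the mark has to be typed and stored as a separate combining character.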
TLG and Beta code
David,

I am glad to see this much progress, yet, as I noticed after posting, the zero symbol is actually missing in Beta code, so your Beta code to Unicode equivalences would not have it. I think it is fair to say that the TLG have avoided the parts of mathematical texts where the symbol is common, as in the various tables in Ptolemy's Almagest (where all the tables are omitted by TLG). This symbol is in reality more common than the rarities listed in quickbeta. In the editions I am involved with we use U+014D, ō, which is near enough I suppose.

Raymond

- Original Message -
From: "David J. Perry" <[EMAIL PROTECTED]>
To: "'Raymond Mercier'" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, August 27, 2003 1:11 AM
Subject: RE: TLG and Beta code

> Raymond,
>
> If you go to http://www.tlg.uci.edu/Uni.prop.html you will see all the
> proposals; the site indicates very clearly which ones have been accepted
> by the UTC and which are pending (only one still pending at this point).
> They must of course be voted on by WG2 before they are officially a part
> of Unicode. The TLG folks have prepared a very useful document at
> http://www.tlg.uci.edu/quickbeta.pdf that shows the Unicode equivalent
> for each beta code character (some of these are existing Unicode
> characters, some newly proposed, and some so rare or poorly understood
> that TLG did not think them appropriate to propose for Unicode).
>
> David
TLG and Beta code
Last January when I asked if the Greek symbol for one-half might be included somewhere in Unicode I was led to understand that not only that but a whole range of Greek symbols were being proposed by the TLG people. There was for example http://www.tlg.uci.edu/Uni.prop.html. Indeed the Beta code (http://www.tlg.uci.edu/~tlg/BetaCode.html) used by the TLG covers a huge range of odd symbols which are needed in Unicode if the classical texts which they have digitised are ever to be "unicoded". I was reminded of the need to enlarge the Greek coverage when converting some Greek numerical texts, and saw that not even the symbol for zero was part of the Greek block, so that I had to use U+014D, latin l.c. 'o' +macron, ō, which is admittedly near enough. Yet when I search the Unicode site now for TLG or beta code I find nothing. Are the TLG proposals somewhere in the pipeline ? Raymond Mercier
Re: [Way OT] Beer measurements (was: Re: Handwritten EURO sign)
Ted Hopp writes:

> Since we're speaking of the French (we are, aren't we?) what ever happened
> to French Revolutionary Metric Time?

The other French attempts were less successful, such as the 12 30-day months. The French names for the months, Vendémiaire, etc., were parodied in an English version: wheezy, sneezy, freezy, slippy, etc. One decimal system that survives is the grad (400 grad = 360 degrees), still used at least by surveyors, and Laplace used it in astronomical calculations. The Americans won't have the meter now, unless it's renamed the freedom-yard, I suppose.

Raymond
Re: [Way OT] Beer measurements (was: Re: Handwritten EURO sign)
At some time in the 70's, when I was at a conference to mark the centenary of the Greenwich meridian, I learned that the French agreed to give up the Paris meridian if the British agreed to go metric, and that was over a century ago! Maybe the U.S. could be bribed to go metric if they were allowed to have Washington as the standard meridian.

Raymond Mercier
Re: Which ancestral links
John Clews writes: >I've never seen a description of the Sogdian > alphabet (i.e. I have never come across one): is there a good article > or URL which illustrates such links? Here is a Unicode proposal for just that: http://wwwold.dkuug.dk/jtc1/sc2/wg2/docs/n2422.pdf See also http://www.gengo.l.u-tokyo.ac.jp/~hkum/pdf/SIE3.pdf Raymond Mercier
Aramaic scripts
There are omissions in Michael Everson's chart in http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2311.pdf

The chart was based on Semitic languages, although purporting to be about scripts. After all, Greek and Latin also derive from the same family of scripts, as we all learn from page 1 of Greek grammars. There are less obvious omissions:

1. Kharoshthi, a RtoL script much used in North West India, and regarded by everyone as a derivative from a form of the Aramaic script used in that region. It is found on coins, Ashokan edicts, various inscriptions and manuscripts. It was used to write mainly Prakrits, although some Sanskrit text is known. See, for example, A.H. Dani, Indian Palaeography, Oxford 1963.

2. Pahlavi, widely used to write Middle Persian. This involved a troublesome mixture of Persian reading of Aramaic words, a subject requiring more elaboration than is needed here.

Raymond Mercier
Re: Which ancestral links
Indeed, pardon my haste, that was a matter of an addition to the Syriac script. For a comparison of the various scripts used for Sogdian, http://iranianlanguages.com/midiranian/sogdian.htm#Alphabet Raymond - Original Message - From: "Michael Everson" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Friday, August 08, 2003 5:43 PM Subject: Re: Which ancestral links > > At 17:26 +0100 2003-08-08, Raymond Mercier wrote: > >John Clews writes: > > > >> I've never seen a description of the Sogdian > >> alphabet (i.e. I have never come across one): is there a good article > >> or URL which illustrates such links? > > > >Here is a Unicode proposal for just that: > > > >http://wwwold.dkuug.dk/jtc1/sc2/wg2/docs/n2422.pdf > > That is not the Sogdian script. > -- > Michael Everson * * Everson Typography * * http://www.evertype.com
Re: UTF-8 and HTML import into MS Word 2000
Both the HTML files open in Word 2002 without problems, Polish and Japanese characters included.

Raymond Mercier

- Original Message -
From: "Janusz S. Bień" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, July 29, 2003 9:56 AM
Subject: UTF-8 and HTML import into MS Word 2000

> I try to convert a LaTeX document into Word through UTF-8 coded HTML.
>
> When I import a small test
>
> http://www.mimuw.edu.pl/~jsbien/poufne/utf8-pjk.html
> http://www.mimuw.edu.pl/~jsbien/poufne/utf8-pjk.css
>
> into Word, I see it correctly. To be precise, sufficiently correctly
> (a single Polish letter is displayed in a strange way) as all Japanese
> characters are displayed properly.
>
> When I import the real document
>
> http://www.mimuw.edu.pl/~jsbien/poufne/JSB-EAJS03.html
> http://www.mimuw.edu.pl/~jsbien/poufne/JSB-EAJS03.css
>
> exactly in the same way, I see empty boxes instead of Japanese
> characters.
>
> Am I making some silly error? I will appreciate comments and
> suggestions.
>
> Best regards
>
> Janusz
>
> --
> dr hab. Janusz S. Bien, prof. UW
> Prof. Janusz S. Bien, Warsaw Uniwersity
> [EMAIL PROTECTED], [EMAIL PROTECTED]
> http://www.orient.uw.edu.pl/~jsbien/
> http://www.mimuw.edu.pl/~jsbien/
Greek polytonique kybd for AZERT
People using an AZERTY keyboard (French or Belgian) might find it useful to know of the excellent keyboard layout for polytonic Unicode Greek available at http://club.euronet.be/frederique.bouras/kbdhept.htm

Raymond Mercier
RE: International Font to be Used
In Babelmap the choice of unicode block is restricted to just one block at a time, whereas in Unicode Search you can select any number of blocks. For example, if you want to handle Classical Greek, you need to know which fonts cover both the blocks Greek and Greek Extended. Such combinations are not available in Babelmap. Raymond And one free tool that will do both is BabelMap (just press F7) : uk.geocities.com/BabelStone1357/Software/BabelMap.html And for those without access to such a tool, a comparative table of Unicode 4.0 block coverage for some of the more common "pan-Unicode" fonts such as Arial Unicode MS and Code2000 is given at : uk.geocities.com/BabelStone1357/Unicode/fonts.html#FontsByRange Andrew
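The multi-block question ("which fonts cover both Greek and Greek Extended?") can be sketched in a few lines of Python. This is only an illustration: the font names and their code point sets are made up, and reading a real font's character map would need a font library, which is not shown. The function names and the 0.95 threshold are my own assumptions, not anything taken from BabelMap or Unicode Search.

```python
import unicodedata

# Block ranges as given in the Unicode block list.
BLOCKS = {
    "Greek and Coptic": range(0x0370, 0x0400),
    "Greek Extended": range(0x1F00, 0x2000),
}

def assigned(block):
    """Code points in the block that this Python's Unicode data names."""
    return {cp for cp in BLOCKS[block] if unicodedata.name(chr(cp), None)}

def coverage(font_codepoints, block):
    """Fraction of the block's assigned characters present in the font."""
    wanted = assigned(block)
    return len(wanted & font_codepoints) / len(wanted)

def fonts_covering(fonts, blocks, threshold=0.95):
    """Names of fonts that reach the threshold in every requested block."""
    return sorted(
        name for name, cps in fonts.items()
        if all(coverage(cps, b) >= threshold for b in blocks)
    )

# Hypothetical fonts, described only by the code points they contain.
fonts = {
    "FontA": set(range(0x0370, 0x0400)) | set(range(0x1F00, 0x2000)),
    "FontB": set(range(0x0370, 0x0400)),
}
result = fonts_covering(fonts, ["Greek and Coptic", "Greek Extended"])
print(result)
```

Only the hypothetical FontA passes, since FontB has nothing in Greek Extended.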
RE: International Font to be Used
At 12:16 09/06/2003 -0400, you wrote: One (free) tool that will allow you to investigate what blocks of Unicode are actually covered in a font file is: http://pfaedit.sourceforge.net/ And to see what fonts on your disk support specified unicode blocks, another free tool at http://ourworld.compuserve.com/homepages/RaymondM/unisearch.htm Raymond Mercier
Re: Fw: Unicode filename problems
Much obliged!

Raymond

At 15:26 02/06/2003 +0200, you wrote:
> Raymond Mercier wrote:
>> Doesn't one have to know the binary format of a Zip file to be sure of that? I suppose that is proprietary, and in any case, I don't have it.
>
> http://www.pkware.com/products/enterprise/white_papers/appnote.html
>
> Stefan
Re: Fw: Unicode filename problems
At 00:11 01/06/2003 +0200, you wrote:
> but certainly not for the file index stored in a ZIP file where there's no reason why it should not contain correctly encoded and portable UTF-8 names

Doesn't one have to know the binary format of a Zip file to be sure of that? I suppose that is proprietary, and in any case, I don't have it.

-Raymond
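For readers wondering what "UTF-8 names in the file index" looks like in practice, here is a small sketch with Python's standard zipfile module. As far as I know, revisions of the ZIP specification later than this thread define general-purpose flag bit 11 (0x800) to mark a name as UTF-8, and Python's zipfile sets it for non-ASCII names. This says nothing about what WinZip or WinRAR did at the time; the file name is just an example.

```python
import io
import zipfile

name = "\u0444.doc"  # a one-letter Cyrillic file name (U+0444)

# Write an archive in memory with a single non-ASCII member name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(name, b"test")

# Read it back and inspect the stored name and the general-purpose flags.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    info = zf.infolist()[0]
    stored_name = info.filename
    utf8_flag = bool(info.flag_bits & 0x800)  # bit 11: name is UTF-8

print(stored_name == name, utf8_flag)
```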
Re: Fw: Unicode filename problems
Well, you would expect that, since Win9* and WinNT/2000/XP differ fundamentally regarding unicode compliance. -Raymond At 21:01 31/05/2003 +0200, you wrote: Am Samstag, 31. Mai 2003 um 13:18 schrieb Raymond Mercier: RM> Certainly more work is needed on RAR (at least on the Win 2000 version). The author of WinRar wrote me in a private mail that he hopes to fix the problem in a future version. The problem seems to be that he then has to maintain two versions for Win9x/ME and WinNT/2000/XP. -- Karl
Re: [OT] Unicode filename problems
Before I am accused of being altogether OT I will try to answer a couple of points.

>> The name of the ZIP archive is not relevant: you don't really need to internationalize it, and can restrict it to ASCII with a classic .zip or .jar extension.

Someone using the Chinese or Russian Win2000 would naturally name files with Chinese or Russian characters. He hasn't chosen to "internationalize". He just has a right to function within that part of the Unicode universe of filenames that he happens to occupy.

>> Zip files should have no problems to contain files with UTF-8 names.

For example, a Cyrillic filename such as U+0444.doc becomes, in UTF-8, ¿Ñ.doc. Neither is accepted by WinZip. In writing a program like WinZip there is no barrier to the use of file-opening routines with wide-character names. All the API routines are defined so as to accept them. It is just that programmers have been rather lazy about it.

Raymond
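The UTF-8 example above can be reproduced in a few lines of Python. This is only a sketch: the stray characters one actually sees depend on which single-byte code page the bytes are displayed in, and Windows-1252 is my assumption here, so the result need not match the characters quoted in the message.

```python
name = "\u0444.doc"  # Cyrillic small letter ef, then ".doc"

# UTF-8 turns the one Cyrillic letter into two bytes, D1 84.
utf8_bytes = name.encode("utf-8")

# Viewing those bytes as Windows-1252 text gives two unrelated characters.
as_cp1252 = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(" "))
print(as_cp1252)
```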
Re: Fw: Unicode filename problems
This question of non-ASCII filenames is a real problem: hardly any software out there can cope with this. I did not know of RAR, but have given it a try. Even here there is a serious problem, because if the filename is non-ASCII the name of the compressed file comes out as _.rar, with as many underlines as there were characters in the original name. In fact it is a bit less predictable: if the name is Greek, for example, you get Latin letters; if it is Cyrillic, just the underline. This is useless, then, if you have a number of filenames all with the same number of characters. Certainly more work is needed on RAR (at least on the Win 2000 version). I know about that, since I made my Fontlist 5 work properly with arbitrary non-ASCII names: http://ourworld.compuserve.com/homepages/RaymondM/fontlist5.htm .

Raymond Mercier

At 22:58 30/05/2003 -0500, you wrote:

I wonder if anyone here has ideas on these matters.

Peter

- Forwarded by Peter Constable/IntlAdmin/WCT on 05/30/2003 10:56 PM -

I have 3 LinguaLinks lexicons that I have converted into HTML pages - one for each entry. The languages use non-ANSI characters, so I also did a Unicode conversion at the same time. [snip] Everything works very well except that I cannot burn the files onto a CD because of the Unicode values in the filenames. Roxio and Nero CD-burners don't accept some of the higher values found in the file names (using Joliet, ISO 9660 and UDF). Anyone have any ideas how to deal with this? For example, a filename with Unicode value 026B, a tilde lower case L, causes problems.

In the meantime, to get it onto CD, I decided to try and zip all the files. Turns out almost all the zippers out there DO NOT support Unicode filenames. Doug Rintoul found WinRAR (http://www.rarlab.com/rar_archiver.htm) which does the trick in the RAR format only. There is a RAR expander for Macintosh and Linux systems as well (all of these are $29 USD). So far, I have not found a freeware solution that meets Unicode filename needs.

Have any of you run into this yet? I could try to determine what Unicode values are causing problems on the CD burner and do an unacceptable-to-acceptable character translation in the filenames and the links to those filenames ... but that seems like a huge compromise. Also, it will be difficult to come up with a generic solution ... that is to say, I don't know what RANGE of values are unacceptable for characters in a CD filename. Joliet is supposed to allow Unicode filenames according to the documentation I have seen.

Larry
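The "unacceptable-to-acceptable character translation" mentioned above could be sketched roughly as follows. This is a hypothetical scheme of my own, not something any CD burner or archiver prescribes: every non-ASCII character is replaced by an ASCII escape of the form _uXXXX_, which can later be reversed. The same function would have to be applied to the links inside the HTML pages.

```python
import re

def ascii_safe(name):
    """Replace each non-ASCII character with an escape such as _u026B_."""
    return "".join(c if c.isascii() else f"_u{ord(c):04X}_" for c in name)

def restore(name):
    """Undo ascii_safe by turning each _uXXXX_ escape back into a character."""
    return re.sub(
        r"_u([0-9A-F]{4,6})_",
        lambda m: chr(int(m.group(1), 16)),
        name,
    )

original = "ba\u026ba.htm"   # contains U+026B, the letter mentioned above
safe = ascii_safe(original)
print(safe)
print(restore(safe) == original)
```

One limitation to keep in mind: a name that already contains a literal sequence like _u0041_ would be altered by restore, so a real tool would need to escape such sequences as well.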
Re: unicode in Mac
Tom Gewecke writes:

> PS The FEFF could well be the BOM (Byte Order Mark) which NotePad puts at the beginning of UTF-8 encoded files (even though it is not needed or customary for other apps to do so). It does not have any significance.

The opening bytes are FF FE (or FEFF read as a short integer), imposed when the file is saved in Word as plain-text Unicode. If these two bytes are deleted the text still opens correctly in Notepad. The Mac OS is OS 9; my colleague has been put off by an attempt to install OS X, although the CD for it came with his new machine. I am surprised that he should expect such difficulties with OS 9.

Raymond
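The relation between the bytes FF FE and the value FEFF can be checked with a short Python sketch. The sample text is an arbitrary choice of mine; the point is only that FF FE is the byte order mark of little-endian UTF-16, and that a decoder which knows the byte order can do without it.

```python
import codecs

text = "\u1f00\u03b2\u03b3"  # a few Greek letters, one from Greek Extended

# A "plain text Unicode" file of the kind described: BOM, then UTF-16LE data.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

first_two = data[:2]                            # the bytes FF FE
as_short = int.from_bytes(first_two, "little")  # read as a 16-bit integer

# The generic "utf-16" codec uses the BOM to pick the byte order and drops it.
decoded = data.decode("utf-16")

# With the two bytes deleted, the byte order has to be stated explicitly.
decoded_without_bom = data[2:].decode("utf-16-le")

print(first_two.hex(" "), hex(as_short), decoded == decoded_without_bom)
```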
unicode in Mac
Given a plain-text Unicode file that opens with the bytes FEFF and displays correctly in Notepad on a PC, what facility is available on a Mac to make this file display correctly? I am trying to help a colleague, who has Mac OS 9, and I need to tell him what font will cover Greek and Greek Extended.

Raymond Mercier
Re: Greek fractions
Just a final note on the 'half' symbol. I was wrong to criticise TLG in stating that this symbol is omitted in their reproduction of the texts. They represent the text in Beta-code, http://www.tlg.uci.edu/BetaCode.html and this certainly has the 'half' in a variety of forms. The problem is rather: when are Unicode going to include the great many symbols covered in Betacode so that TLG files can be converted to Unicode ? I understand that they hope to have this conversion in about two years. Raymond Mercier
Re: Greek fractions
At 11:59 AM 1/22/2003 -0800, you wrote: Does this affect Euclid at all? Also, do you know of any source for Euclid in Greek other than the full TLG or Perseus CD-ROMS? I have read a fair chunk of the Elements online, but would like a print copy that I can write on, or read outside with Heath and Strong handy. Much as I admire your work, I have no need for all of the other authors in either collection I would be surprised if fractions occurred anywhere in Euclid, but someone may correct me. I don't know of any on line version other than TLG. Raymond
Re: Greek fractions
At 11:47 PM 1/21/2003 -0800, Doug Ewell wrote:

> Thanks for the link, John. Indeed, the TLG proposal for numerals [1] does include a GREEK HALF SIGN, although their preferred glyph does not include the "prime" sign Raymond mentioned (it is listed as a glyph variant, however).

I am glad to learn of the proposal, and realize that I should have checked the Unicode site first. There are indeed many variants for this sign, with none, one, or two primes. Sometimes the sign is as I described, at other times it is a sort of scrunched up stigma, and so on. What troubles me a little about the proposal is that it may depend more on the way editors have handled it than on what is used in the manuscripts. For example, Heiberg's edition of Heron is quoted in the proposal.

Raymond
Greek fractions
In Classical Greek scientific texts the fraction 'one half' is represented very commonly by a symbol which looks a bit like 'less than', or like 'angle' U+2220, but followed by a prime. Is there no place for this in the Unicode scheme of things? Other symbols are also found for common fractions, apart from the general usage where a prime is added to indicate the reciprocal. I have been converting some TLG* files to Unicode, and I notice that even in the original TLG file the symbol is just replaced by a space. This makes a nonsense of Ptolemy's geographical coordinates.

*TLG = Thesaurus Linguae Graecae

Raymond Mercier
Re: Status of Unihan Mandarin readings?
At 08:44 AM 12/20/2002 -0700, you wrote: That's because the file was converted to UTF-8. Previously it had not been in any single encoding, which was creating problems Well, OK, but should you have created by now some sort of program that checks the file whenever you make a change - a sort of spellcheck ? Should not be too hard to write something that displays the effects of any changes. Raymond Mercier
Re: Status of Unihan Mandarin readings?
On the errors in kMandarin:

Apart from the kMandarin errors of the kind that Andrew West has noted, there is another corruption, namely the loss of ü, and this happened between "3.0b1" and "3.0b2", when the ü became the two bytes C393.

As to Han/Yi, "U+6C49 YI4 HAN4" is found not only in "3.0b1" and "3.0b2", but also in "2.0". The HAN4 was dropped only in 3.2. While I admire the effort to "explain" the intrusion of YI4, I feel it is a bit misplaced, and that some more mechanical/clerical explanation is in order. After all, look at the number of times "same as U+ " is written as "sama as U+... " in 3.2: 6 to be precise.

Raymond Mercier
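The sort of mechanical check suggested in this thread might look like the following Python sketch. It rests on assumptions of mine: that each data line has the tab-separated form "U+XXXX, kMandarin, readings", and that a valid reading is upper-case letters (including Ü) followed by a tone digit 1 to 5. The sample lines are made up for illustration and are not quoted from any release of the file.

```python
import re

# Assumed shape of a valid reading: letters (Ü allowed) plus a tone digit.
READING = re.compile(r"^[A-ZÜ]+[1-5]$")

def check_kmandarin(lines):
    """Return (line number, code point, reading) for each suspicious reading."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or fields[1] != "kMandarin":
            continue  # only the kMandarin field is checked here
        for reading in fields[2].split():
            if not READING.match(reading):
                problems.append((lineno, fields[0], reading))
    return problems

sample = [
    "U+6C49\tkMandarin\tHAN4",
    "U+5973\tkMandarin\tN\u00d33",  # hypothetical: Ü replaced by another letter
    "U+6F22\tkMandarin\tHAN4",
]
problems = check_kmandarin(sample)
print(problems)
```

A check like this catches encoding damage such as the lost ü, but not a well-formed wrong reading such as YI4 for U+6C49; that needs comparison against an independent source.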
Re: CJK fonts
At 09:04 AM 12/11/2002 -0700, you wrote:
> On Wednesday, December 11, 2002, at 08:27 AM, Raymond Mercier wrote:
>> For example, the simplified form of the character Han itself (U+6C49) is given the Pinyin reading Yi, while the traditional form U+6F22 is given the correct reading Han.
>
> Have you reported this?

Not yet, since I have only just noticed it. I know there is an address on the Unicode site for such reporting. Andrew West in a recent message mentioned a number of serious mis-readings. These are the tip of the iceberg. The file needs a total overhaul.

> BTW, there's the official Unihan lookup Web page at <http://www.unicode.org/charts/unihan.html>.
>
> ==
> John H. Jenkins
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> http://www.tejat.net/
Re: CJK fonts
The display in Hanfind uses the HTML browser embedded in the program. Any Unicode reference in an HTML page can be written as a numeric character reference, such as &#x3C69; (㱩) for U+3C69. This displays without any problem, as long as you have the font. Or have I missed your point?

Raymond

At 10:08 AM 12/11/2002 -0800, you wrote:
> http://ourworld.compuserve.com/homepages/RaymondM. I clicked on "Hanfind". Something is wrong with that page. It's HTML encoded directly in Unicode, which as far as I know is invalid HTML.
>
> Rick
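For anyone who wants to produce such references without typing them by hand, here is a short Python sketch using only the standard library. It builds the hexadecimal and decimal forms for U+3C69 and checks that both decode back to the same character; the variable names are mine.

```python
import html

ch = "\u3c69"  # the character mentioned above

# Hexadecimal form, built from the code point.
hex_ref = f"&#x{ord(ch):X};"

# Decimal form, produced by Python's built-in error handler for ASCII output.
dec_ref = ch.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(hex_ref, dec_ref)
print(html.unescape(hex_ref) == ch, html.unescape(dec_ref) == ch)
```

Either form keeps the page itself in plain ASCII, so it does not matter how the file is encoded on disk.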