Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED] At 23:21 +0200 2004-05-20, Philippe Verdy wrote: There is still a conflict of Code for Mandaean, is it Mand or Mnda? Mand. OK This is now corrected on the new HTML pages. But the new normative plain-text file now contains... Mnda !!! --- Beside this, an important note to other users trying to get the corrected pages: I know I need to clear my browser cache to get the newest files, because the Unicode server does not mark page changes with document dates, so they will rarely expire in browser caches. For HTML pages this is not a problem, as you can use the Refresh button of your browser to load the new pages. But for downloads (even with right-click and Save as...), the browser cache may keep a copy of the old archive, which MUST be removed manually by the visitor to get the new file and not an old copy! --- As a side note to Michael or the other 6 RA members (Ken and Rick notably), I don't think it's even a good idea to ZIP this reference plain-text file, given its very small size (it is smaller than each of the HTML versions of the code lists). It could be presented directly under the URL: http://www.unicode.org/iso15924/iso15924-code.UTF-8.txt The plain text would appear directly in the browser window, where it could be saved as well, without needing any ZIP tool... This will work provided that the text is actually encoded with the MIME plain-text conventions for end-of-lines (as used in DOS, Windows, OS/2, CP/M, VMS, and all RFCs... i.e. with CRLF end-of-lines, as stated in the text/plain MIME document type RFC), and provided that the Unicode web site server correctly identifies *.txt files with a Content-Type: text/plain header, or even better identifies *.UTF-8.txt as Content-Type: text/plain; charset=UTF-8. --- I hope that the published versions will soon be acceptable for moving from the current FDIS status (Final Draft International Standard) to the Standard status in ISO. 
I looked into the TC46 web site and for now ISO 15924 is still not a final standard but a final draft (the last step before standard and technical revisions which ISO may publish yearly): http://comelec.afnor.fr/servlet/ServletComelec?form_name=cFormIndexlogin=invitepassword=inviteorganisme=isocomite=tc46 (then click on the Work programme link, which can't be posted here as it requires a session ID set by the AFNOR web server)
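The two conditions named in this message (CRLF line ends and a Content-Type header that carries the charset) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the rule "a name ending in .UTF-8.txt gets charset=UTF-8" are hypothetical and do not describe how the Unicode web server is actually configured.

```python
def content_type_for(filename: str) -> str:
    """Pick a Content-Type value from a file name (hypothetical naming rule)."""
    lower = filename.lower()
    if lower.endswith(".utf-8.txt"):
        # The encoding is spelled out in the name, so it can be declared.
        return "text/plain; charset=UTF-8"
    if lower.endswith(".txt"):
        return "text/plain"
    return "application/octet-stream"


def to_crlf(text: str) -> str:
    """Normalize CRLF, lone CR and lone LF line ends to CRLF."""
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n")


if __name__ == "__main__":
    print(content_type_for("iso15924-code.UTF-8.txt"))
    print(repr(to_crlf("Mand\nSylo\r\nGoth\r")))
```

Normalizing to LF first keeps the conversion idempotent: running `to_crlf` on text that already uses CRLF leaves it unchanged.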
Re: ISO 15924 draft fixes
On Thursday, May 20th, 2004 23:56, Philippe Verdy wrote: I see no real problem if not all the different orthographies are listed or if they are not used universally. As long as the name is unambiguous. What will be important for interchange of data will not be this name but the Code (or N°, or even ID in UAX#24 properties). I disagree. When I put content on the web, under my signature, I care about whether it is written correctly or not. And when there are different possibilities, I prefer the best one given any other constraints (such as technical limitations here or there.) So there's nothing wrong if Han'gul is shown to users Sorry: this is meaningless to me as a French reader. And it is a mistake (missing breve) when it comes to the McCune-Reischauer scheme. Half-good fallback mechanisms are usually better than nothing, but worse than anything else. And we do have better possibilities here. French normally has no caron and no breve, and the circumflex is used to mark a slight alteration of the vowel because of an assimilated consonant in the historical orthography (most often this circumflex in French denotes a lost s after the vowel). Or it can be for other reasons. Which consonant is involved in dû? So the circumflex on Hangul would be inappropriate for French, Please go to Langues'O for this commentary. As I wrote, you will probably be answered with the historical context. Also, there are a number of circumflexes already in the names, which have nothing to do with swallowed s (like in dévanâgarî), which furthermore are the main entries, unlike the case at hand. Are you proposing to drop them? Perhaps in favour of macrons (as is done in a number of dictionaries, by the way)? [Comments-OT] The problem of apostrophes is that French keyboards don't have it, but only have a single-quote. Huh ??? It has been quite a while since I used a French keyboard on NT/2000, but until now they all sent apostrophes, not single quotes. Antoine
Re: ISO 15924 draft fixes
Philippe Verdy wrote: Michael Everson wrote: Philippe Verdy wrote: There is still a conflict of Code for Mandaean, is it Mand or Mnda? Mand. OK This is now corrected on the new HTML pages. But the new normative plain-text file now contains... Mnda !!! I updated my own Excel sheet at: http://www.rodage.org/pub/iso15924-sheets.xls with the addition of 'Phags-Pa, and new fixes by Michael. Also browsable in HTML: http://www.rodage.org/pub/iso15924-sheets.html Or downloadable: http://www.rodage.org/pub/iso15924-sheets.zip About cell background colors: - light blue signals the English or French names that have been kept when removing duplicate rows with alternate names. - yellow signals the changes that have already been applied with the previous published version. (I see that some dates have been changed for a row without any change in the other fields in any of the 5 previous tables, when compared to their first published version.) - light red signals missing changes: * dates that should be changed but still have not been, * the case of Asomtavruli, which has been removed but is not signaled in changes, * the PropertyValueAlias=Common for Code=Zyyy, as found in the UCD * the missing change from Mnda to Mand in the current plain-text version (change already applied in the currently published HTML versions of table 1 and 2).
Re: ISO 15924 draft fixes
From: Antoine Leca [EMAIL PROTECTED] On Thursday, May 20th, 2004 23:56, Philippe Verdy wrote: I see no real problem if not all the different orthographies are listed or if they are not used universally. As long as the name is unambiguous. What will be important for interchange of data will not be this name but the Code (or N°, or even ID in UAX#24 properties). I disagree. When I put content on the web, under my signature, I care about whether it is written correctly or not. And when there are different possibilities, I prefer the best one given any other constraints (such as technical limitations here or there.) So there's nothing wrong if Han'gul is shown to users Sorry: this is meaningless to me as a French reader. I don't know whether you have noticed, but I am French too and live in France... How do you pronounce the very common French words hanche or hangar? With a nasal vowel, but without the n sound! The obvious orthographic proximity to these two words leads a French-speaking reader to pronounce it, quite normally, with a nasal vowel, since the term hangul is very little known to French people, or is recognized as a non-French term... The apostrophe is distinctly more correct, because its use marks the elision of at least one vowel after the consonant, the latter (the n in our case) then being read distinctly and fused with the following letter (vowel or consonant). In this case the consonant sequence n'g will be read much more correctly, separately from the a that precedes it in Han'gul, pronounced /*h a: ng l/, whereas Hangul would normally be pronounced /*h : g l/ (here /*h/ denotes the aspirated h, normally not pronounced in French but which forbids liaisons and elisions before the word). And it is a mistake (missing breve) when it comes to the McCune-Reischauer scheme. Half-good fallback mechanisms are usually better than nothing, but worse than anything else. And we do have better possibilities here. 
Are this McCune or this Reischauer native French speakers? They no doubt know French, but the academic choice that led to their standard for _transliteration_ (and not _translation_) is foreign to any consideration of how well this Latin transliteration fits the French language and orthography... French normally has no caron and no breve, and the circumflex is used to mark a slight alteration of the vowel because of an assimilated consonant in the historical orthography (most often this circumflex in French denotes a lost s after the vowel). Or it can be for other reasons. Which consonant is involved in dû? Hard to say, given that it is a conjugated form of a VERY irregular verb (devoir), in which even the stem is modified, or its nominalization. I suppose the presence of this circumflex is justified historically by the wish to distinguish it from the contracted indefinite article du. Note in passing that the circumflex disappears in the feminine and (according to some authors) in the plural... Some readers make the distinction because of this accent, and pronounce du with a short u and dû with a long u. So the circumflex on Hangul would be inappropriate for French, Given that the circumflex in French often lengthens the vowel, using the circumflex in place of a breve is very incorrect... Please go to Langues'O for this commentary. As I wrote, you will probably be answered with the historical context. What is Langues'O? Where is it? Also, there are a number of circumflexes already in the names, which have nothing to do with swallowed s (like in dévanâgarî), which furthermore are the main entries, unlike the case at hand. Are you proposing to drop them? Perhaps in favour of macrons (as is done in a number of dictionaries, by the way)? Here [Comments-OT] The problem of apostrophes is that French keyboards don't have it, but only have a single-quote. Huh ??? 
It has been quite a while since I used a French keyboard on NT/2000, but until now they all sent apostrophes, not single quotes. The standard French keyboard shows an apostrophe on the key but only generates a single quote that can be used on the right as well as on the left (and is therefore generally rendered vertically in many fonts). In that sentence I was referring to the difference in encoding between the ASCII single quote and the true apostrophe character (or raised comma). It is true that some fonts make no difference between the two, but the two codes have separate uses. So I insist: the standard French keyboard does not generate the apostrophe (which is not the ASCII character) but a single quote (in the ASCII set)...
Re: ISO 15924 draft fixes
Philippe Verdy scripsit: Please go to Langues'O for this commentary. As I wrote, you will probably be answered with the historical context. What is Langues'O? Where is it? Please forgive me for intruding into an internal francophone matter, but whenever I see Langues'O, my mind insists on correcting it into Langues d'O, as in Histoire d'O. Not that I read French. -- John Cowan [EMAIL PROTECTED] http://www.reutershealth.com Not to know The Smiths is not to know K.X.U. --K.X.U.
Re: ISO 15924 draft fixes
At 10:28 +0200 2004-05-21, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] At 23:21 +0200 2004-05-20, Philippe Verdy wrote: There is still a conflict of Code for Mandaean, is it Mand or Mnda? Mand. OK This is now corrected on the new HTML pages. But the new normative plain-text file now contains... Mnda !!! Whoops. It was late, and that change was made by hand. As a side note to Michael or the other 6 RA members (Ken and Rick notably), I don't think it's even a good idea to ZIP this reference plain-text file, given its very small size (it is smaller than each of the HTML versions of the code lists). Surely it is not harmful. It could be presented directly under the URL: http://www.unicode.org/iso15924/iso15924-code.UTF-8.txt The plain text would appear directly in the browser window, where it could be saved as well, without needing any ZIP tool... Everyone has a zip tool. I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. I hope that the published versions will soon be acceptable for moving from the current FDIS status (Final Draft International Standard) to the Standard status in ISO. I looked into the TC46 web site and for now ISO 15924 is still not a final standard but a final draft It *has* been published by ISO, though the TC46 web site doesn't reflect this. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
on 2004-05-21 07:10 Michael Everson wrote: I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. This *may* be a server issue. Iirc, the server has to be told to mark the text/plain MIME-type as UTF-8, since there are no meta tags (as there could be in HTML) and since browsers generally lack the heuristics to decide on coding of plain text. -- Curtis Clark http://www.csupomona.edu/~jcclark/ Mockingbird Font Works http://www.mockfont.com/
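The symptom Michael describes is what one would expect when UTF-8 bytes are decoded with a Latin 1 fallback. The Python sketch below reproduces it with a French name taken from this thread (sylotî nâgrî). The helper name is hypothetical, and the ISO-8859-1 default is an assumption based on the HTTP/1.1 rule of that period (RFC 2616) for text types served without a charset; actual browser behaviour may have differed.

```python
from email.message import Message


def charset_from_content_type(header_value: str, default: str = "iso-8859-1") -> str:
    """Return the charset parameter of a Content-Type value, or a fallback."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


name = "sylotî nâgrî"
raw = name.encode("utf-8")

# Served as bare text/plain: the fallback charset turns each accented
# letter into two Latin 1 characters.
wrong = raw.decode(charset_from_content_type("text/plain"))

# Served with an explicit charset: the text round-trips intact.
right = raw.decode(charset_from_content_type("text/plain; charset=UTF-8"))

if __name__ == "__main__":
    print(wrong)
    print(right)
```

With the charset declared by the server, no guessing is needed on the client side, which is the point Curtis makes about plain text having no place for a meta tag.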
Re: ISO 15924 draft fixes
Michael Everson everson at evertype dot com wrote: The plain text would appear directly in the browser window where it could be saved as well, without needing any ZIP tool... Everyone has a zip tool. I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. Why not post both copies, zipped and unzipped, and let the user decide which one she wants to download? -Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/
Re: ISO 15924 draft fixes
I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. Michael, you just need to put a BOM at the start of the file. Direct access to the plain text file, would be much preferred. The file is small -- there is no need to zip it (unlike, say, Unihan!). Mark __ http://www.macchiato.com - Original Message - From: Michael Everson [EMAIL PROTECTED] To: Unicode List [EMAIL PROTECTED] Sent: Fri, 2004 May 21 07:10 Subject: Re: ISO 15924 draft fixes
RE: ISO 15924 draft fixes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Verdy I updated my own Excel sheet at: Philippe, I really appreciate the content you posted for its potential value in guiding the RA in doing a better job with their data. I hope, however, that you do not plan to leave it online once Michael has his content corrected. In the long run, it really is unhelpful to have alternate sources for data. Inevitably, the mirrors get out of sync as the owners move on to other interests, and inevitably someone points to the copy, not the source. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
Re: ISO 15924 draft fixes
At 12:31 +0200 2004-05-21, Philippe Verdy wrote: - light blue signals the English or French names that have been kept when removing duplicate rows with alternate names. Those duplicate rows did not appear in the plain-text data files, so will not be considered further or tracked on the code changes page. Khar and Khmr should have been yellow, not blue. - yellow signals the changes that have already been applied with the previous published version. (I see that some dates have been changed for a row without any change in the other fields in any of the 5 previous tables, when compared to their first published version.) You didn't highlight Java. I'm not tracking the change to Latf because it was just a missing parenthesis, not an actual change. Malayalam and Oriya did not change between the plain text versions (as I have said before). Hanunoo had a name change. - light red signals missing changes: * dates that should be changed but still have not been Only changes between the plain-text documents are going to be tracked. * the case of Asomtavruli, which has been removed but is not signaled in changes, Only changes between the plain-text documents are going to be tracked. * the PropertyValueAlias=Common for Code=Zyyy, as found in the UCD Thanks! It's good that you found this. * the missing change from Mnda to Mand in the current plain-text version (change already applied in the currently published HTML versions of table 1 and 2). That should be fixed now. I'm going to ask Rick to regenerate the four tables. Change dates will be 2004-05-21. When they're posted we'll consider it the final beta. -- Michael Everson * * Everson Typography * * http://www.evertype.com
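Tracking only the changes between two plain-text versions amounts to a keyed comparison of their rows. A minimal Python sketch, assuming each version has already been loaded into a dictionary keyed by the 4-letter code; the function name and the sample rows are illustrative only and are not the registry's actual data.

```python
def diff_records(old: dict, new: dict):
    """Compare two versions keyed by 4-letter code.

    Returns three sorted lists: codes added, codes removed, and codes
    whose remaining fields differ between the two versions.
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(code for code in old.keys() & new.keys()
                     if old[code] != new[code])
    return added, removed, changed


# Illustrative rows only: (English name,) per code.
OLD = {"Mnda": ("Mandaean",), "Hano": ("Hanunoo",), "Goth": ("Gothic",)}
NEW = {"Mand": ("Mandaean",), "Hano": ("Hanunoo (renamed)",), "Goth": ("Gothic",)}

if __name__ == "__main__":
    added, removed, changed = diff_records(OLD, NEW)
    print("added:", added)
    print("removed:", removed)
    print("changed:", changed)
```

Because the comparison is keyed on the code, a corrected code such as Mnda to Mand shows up as one removal plus one addition, which a changes page would probably want to present as a single rename.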
Re: ISO 15924 draft fixes
At 08:31 -0700 2004-05-21, Mark Davis wrote: I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. Michael, you just need to put a BOM at the start of the file. Direct access to the plain text file, would be much preferred. The file is small -- there is no need to zip it (unlike, say, Unihan!). I asked you yesterday How?. -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson As a side note to Michael or the other 6 RA members (Ken and Rick notably), I don't think it's even a good idea to ZIP this reference plain-text file, given its very small size (it is smaller than each of the HTML versions of the code lists). Surely it is not harmful. I agree, it's not harmful. But I agree with Philippe, it's not particularly helpful or necessary. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
Re: ISO 15924 draft fixes
At 07:57 -0700 2004-05-21, Curtis Clark wrote: on 2004-05-21 07:10 Michael Everson wrote: I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. This *may* be a server issue. Iirc, the server has to be told to mark the text/plain MIME-type as UTF-8, since there are no meta tags (as there could be in HTML) and since browsers generally lack the heuristics to decide on coding of plain text. I am going to keep the file zipped. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
Antoine Leca Antoine10646 at leca dash marti dot org wrote: So there's nothing wrong if Han'gul is shown to users Sorry: this is meaningless to me as French reader. And it is a mistake (missing breve) when it comes about the McCune-Reischauer scheme. Half-good fallback mechanisms are usually better than nothing, but worse than anything else. And we do have better possibilities here. This question of how to spell Hangul in French was discussed on the ISO 15924 discussion list back in 2000. (Antoine, you may remember that discussion; you were involved in it.) Michael wasn't happy about having to maintain three separate spellings, which, as he correctly pointed out, adds nothing to the standard but draws attention to the fact that Korean transliteration is unstandardized. But apparently somebody deemed it necessary, because there they are. In any case, the question of *which* French-based transliteration(s) to use seems to have been decided already. -Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/
Re: ISO 15924 draft fixes
From: Peter Constable [EMAIL PROTECTED] From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Verdy I updated my own Excel sheet at: Philippe, I really appreciate the content you posted for its potential value in guiding the RA in doing a better job with their data. I hope, however, that you do not plan to leave it online once Michael has his content corrected. In the long run, it really is unhelpful to have alternate sources for data. Inevitably, the mirrors get out of sync as the owners move on to other interests, and inevitably someone points to the copy, not the source. In fact what I'll do is replace that with a JavaScript version, whose source data will be fed and cached (on the server) from the Unicode normative text file, to generate a JavaScript array. The colored version was there to help show what I found. Michael said that he will ignore all differences found in the previous HTML files, considering only the text file as the source and adding the missing elements. Since then, there has been no clear justification for the removal of Georgian Asomtavruli (I was told that the two scripts were being disunified in Unicode, and it is already for bibliographic references, and considered distinct by most Georgian readers, who can't read it but can perfectly read the default Mkhedruli script variant with various combinations of diacritics for transliteration, which could not work correctly if written with the Asomtavruli variant). So the 4-letter code has been published for some time, but only with a conflicting 3-digit numeric code. As most users of ISO 15924 will ignore the numeric code in most applications, they may already have started to tag their Asomtavruli references with Geoa (it was said that it was valid and standard...) instead of Private Use codes (in Qaaa to Qabx). Will they need to revert them? What if documents or books have already been printed in Georgia using the Geoa code in their references? 
Or if this has already been used to feed librarian indices for interchange? Maybe there was no prior approval of this code, and its publication was meant to be delayed and should not have happened... Oh well... --- Thanks to Michael for the addition of PropertyValueAlias=Common for Code=Zyyy, and the correction of the incorrect HTML syntax of NCRs. I would have much preferred the absence of line wrap in this code (copy/paste operations by developers will insert an undesirable additional character that may go unnoticed in sources). Conversely, there was no real need to prohibit line wraps in the Date column.
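The Private Use range mentioned above (Qaaa to Qabx) is easy to test for in code, which is one way an application could tell a privately assigned tag from a registered one such as Geoa. A minimal Python sketch; the function name is hypothetical, and the check assumes codes are written in the usual form of one capital letter followed by three lowercase letters.

```python
import string


def is_private_use(code: str) -> bool:
    """True if code falls in the private-use range Qaaa..Qabx (inclusive)."""
    if len(code) != 4 or not code.isascii() or not code.isalpha():
        return False
    if not (code[0].isupper() and code[1:].islower()):
        return False
    # Plain string comparison is enough once the shape has been checked.
    return "Qaaa" <= code <= "Qabx"


if __name__ == "__main__":
    for sample in ("Qaaa", "Qabx", "Qaby", "Geoa"):
        print(sample, is_private_use(sample))
    total = sum(is_private_use("Qa" + a + b)
                for a in string.ascii_lowercase
                for b in string.ascii_lowercase)
    print("codes in range:", total)
```

The range covers Qaaa through Qaaz plus Qaba through Qabx, so the count printed at the end is 26 + 24 = 50.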
Re: ISO 15924 draft fixes
From: Doug Ewell [EMAIL PROTECTED] In any case, the question of *which* French-based transliteration(s) to use seems to have been decided already. Is it true also for N°=206, Code=Goth, English_Name=Gothic, Nom_français=Gotique, Property_Value_Alias=Gothic ? My French dictionaries (Petit Larousse, Robert de la Langue Française) refer to Gothique (with an h), including my French-German dictionary: * [fr] Goth (n.m.) = [de] Gote (n.m.), Gotin (n.f.), Gotik (adj. in Archeology). * [fr] gothique (adj.) = [de] gotisch (adj.) The French name of a script is built on the adjective (used to qualify caractère or écriture), written in the masculine singular form, as it can also be a nominalization of the adjective used alone (some exceptions exist in the other current French names containing syllabaire, codet, parole and hiéroglyphes, where a nominal group is used rather than an adjective, with only hiéroglyphes using the plural in both French and English). I have no reference in my French dictionaries for Gotique, but LOTS of references to écriture gothique or caractères gothiques (including on the web and in calligraphy/typography books). I think it's a typo here... So this should be Nom_français=gothique.
Re: ISO 15924 draft fixes - UTF-8 BOM
Michael Everson wrote: At 08:31 -0700 2004-05-21, Mark Davis wrote: I am not very happy about loading the plain-text in browsers. Three of my browsers load it and *all* the French UTF-8 is displayed in Latin 1. Michael, you just need to put a BOM at the start of the file. Direct access to the plain text file, would be much preferred. The file is small -- there is no need to zip it (unlike, say, Unihan!). I asked you yesterday How?. In Windows Notepad, if you save as UTF-8, then you get a BOM/signature. (It has been suggested in the past to define a charset name that implies the use of the signature with UTF-8 much like the UTF-16 charset/CES works.) markus
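For anyone not using Notepad, the same signature can be added with a few lines of Python. This is a minimal sketch under stated assumptions: the helper name is hypothetical, and it works on bytes that are already valid UTF-8. `codecs.BOM_UTF8` is the three-byte sequence EF BB BF, and the standard `utf-8-sig` codec strips it again on decoding.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Prefix UTF-8 data with the BOM/signature unless it is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


if __name__ == "__main__":
    original = "sylotî nâgrî\n".encode("utf-8")
    signed = add_utf8_bom(original)
    print(signed[:3])                  # the three signature bytes
    print(signed.decode("utf-8-sig"))  # decodes without the signature
```

Writing a file with `open(path, "w", encoding="utf-8-sig")` has the same effect for text produced within Python itself.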
Re: ISO 15924 draft fixes
At 21:38 +0200 2004-05-21, Philippe Verdy wrote: Michael said that he will ignore all differences found in the previous HTML files, considering only the text file as the source and adding the missing elements. Yes, I did. Since then, there has been no clear justification for the removal of Georgian Asomtavruli (I was told that the two scripts were being disunified in Unicode, and it is already for bibliographic references, and considered distinct by most Georgian readers, who can't read it but can perfectly read the default Mkhedruli script variant with various combinations of diacritics for transliteration, which could not work correctly if written with the Asomtavruli variant). Good gods. Philippe, an early version of the draft had Asomtavruli and Nuskhuri. Both were removed before publication. One instance of Asomtavruli was left in by ACCIDENT. It is likely that Khutsuri will be added, since it encompasses the casing pair. So the 4-letter code has been published for some time, but only with a conflicting 3-digit numeric code. As most users of ISO 15924 will ignore the numeric code in most applications, they may already have started to tag their Asomtavruli references with Geoa Might they? In the last 20 days? Be serious. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
At 22:04 +0200 2004-05-21, Philippe Verdy wrote: I have no reference in my French dictionaries for Gotique, but LOTS of references to écriture gothique or caractères gothiques (including on the web and in calligraphy/typography books). I think it's a typo here... So this should be Nom_français=gothique. 1. I don't want to entertain this sort of thing until the other problems are dealt with. 2. There is an online form to fill out if you want suchlike to be considered. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
Philippe Verdy verdy underscore p at wanadoo dot fr wrote: In any case, the question of *which* French-based transliteration(s) to use seems to have been decided already. Is it true also for N°=206, Code=Goth, English_Name=Gothic, Nom_français=Gotique, Property_Value_Alias=Gothic ? I have no idea. I was only referring to the Hangul question. -Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED] At 22:04 +0200 2004-05-21, Philippe Verdy wrote: I have no reference in my French dictionaries for Gotique, but LOTS of references to écriture gothique or caractères gothiques (including on the web and in calligraphy/typography books). I think it's a typo here... So this should be Nom_français=gothique. 1. I don't want to entertain this sort of thing until the other problems are dealt with. 2. There is an online form to fill out if you want suchlike to be considered. Maybe that's a question that the ISO 15924 member representing TC46 and working for Encyclopedia Universalis (I can't remember his name) could answer. He's French too and he has access to large collections of documents. I don't have his encyclopedia, which is much too expensive for my budget, and like most French people, my French reference works are limited to the common Petit Larousse and Petit Robert, and a much less expensive encyclopedia... These are now old editions... dated 1982, with much less concern for imports of foreign words. OK, it's not critical for now. The script names are described in Unicode, and in the first edition they could be left informative but still descriptive enough to avoid ambiguities, rather than becoming normative. Only the codes and aliases will be normative for now.
Re: ISO 15924 draft fixes
From: "Michael Everson" [EMAIL PROTECTED] At 03:28 +0200 2004-05-20, Philippe Verdy wrote: It was in the previous list (see the online HTML table 2). What does that refer to? See http://www.unicode.org/iso15924/iso15924-codes.html (sorry it was Table 1): Sylo 316 Syloti Nagri sylotî nâgrî 2004-01-09 Can't you get the same page from the Unicode web site?
Re: ISO 15924 draft fixes
[Mailed _and_ posted to the list; UTF-8] On Wednesday, May 19th, 2004 10:40 PM, Michael Everson wrote: I would appreciate it if interested persons could look this over and inform me if they find any further discrepancies between the two which are worth troubling about. Then we will proceed to generate the other files. The French name for Hang looks strange. It happened to be hangul (hangul, hangeul) (after quite a bit of discussion.) Antoine
Re: ISO 15924 draft fixes
At 11:16 +0200 2004-05-20, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] At 03:28 +0200 2004-05-20, Philippe Verdy wrote: It was in the previous list (see the online HTML table 2). What does that refer to? See http://www.unicode.org/iso15924/iso15924-codes.html (sorry it was Table 1): Sylo 316 Syloti Nagri sylotî nâgrî 2004-01-09 Can't you get the same page from the Unicode web site? There are a number of pages, Philippe. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
At 11:52 +0200 2004-05-20, Antoine Leca wrote: [Mailed _and_ posted to the list; UTF-8] On Wednesday, May 19th, 2004 10:40 PM, Michael Everson wrote: I would appreciate it if interested persons could look this over and inform me if they find any further discrepancies between the two which are worth troubling about. Then we will proceed to generate the other files. The French name for Hang looks strange. It happened to be hangul (hangul, hangeul) (after quite a bit of discussion.) That's an error in the file. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED] At 11:16 +0200 2004-05-20, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] At 03:28 +0200 2004-05-20, Philippe Verdy wrote: It was in the previous list (see the online HTML table 2). What does that refer to? See http://www.unicode.org/iso15924/iso15924-codes.html (sorry it was Table 1): Sylo 316 Syloti Nagri sylotî nâgrî 2004-01-09 Can't you get the same page from the Unicode web site? There are a number of pages, Philippe. Not so many: 4 pages only (the links for the English left column and the French right column are the same), plus 1 link to the downloadable zipped plain-text version (I wonder why this file is zipped, given its small size, and the fact that the text file is coded in Unix-style end-of-line format, not in MIME/DOS/Windows format, which one could assume as Zip was primarily developed on DOS/Windows... If you want a Unix-style format, compress it with gzip instead) Keep this in mind: - table 1 is sorted alphabetically by 4-letter codes http://www.unicode.org/iso15924/iso15924-codes.html - table 2 is sorted numerically by 3-digit codes http://www.unicode.org/iso15924/iso15924-num.html - table 3 is sorted alphabetically by English script name http://www.unicode.org/iso15924/iso15924-en.html - table 4 is sorted alphabetically by French script name http://www.unicode.org/iso15924/iso15924-fr.html Table numbers correspond to the order of fields in the plain text version. You did not reply about the change of spelling for the English name of Malayalam (a dot-below diacritic removed), which was not shown in your proposed list of changes (in HTML format, within your zip archive).
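Since the four tables differ only in their sort key, all of them can be derived from one list of records. The Python sketch below assumes a record with four fields in the order described above (code, number, English name, French name). The Sylo and Goth rows use values quoted in this thread; the Qaaa row is a placeholder added for illustration. Sorting accented French names with `str.casefold` is a simplification and not real French collation.

```python
from collections import namedtuple

Script = namedtuple("Script", "code number english french")

# Sylo and Goth values are quoted in the thread; the Qaaa row is illustrative.
ROWS = [
    Script("Sylo", "316", "Syloti Nagri", "sylotî nâgrî"),
    Script("Goth", "206", "Gothic", "gotique"),
    Script("Qaaa", "900", "Reserved for private use (start)",
           "réservé à l'usage privé (début)"),
]


def build_tables(rows):
    """Return the four orderings: by code, number, English and French name."""
    return {
        "table1_code": sorted(rows, key=lambda r: r.code),
        "table2_number": sorted(rows, key=lambda r: int(r.number)),
        "table3_english": sorted(rows, key=lambda r: r.english.casefold()),
        "table4_french": sorted(rows, key=lambda r: r.french.casefold()),
    }


if __name__ == "__main__":
    for name, table in build_tables(ROWS).items():
        print(name, [r.code for r in table])
```

Generating every table from the same records is one way to avoid the kind of mismatch between the HTML pages and the plain-text file discussed in this thread.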
Re: ISO 15924 draft fixes
At 13:00 +0200 2004-05-20, Philippe Verdy wrote: (I wonder why this file is zipped, given its small size, If uncompressed, downloading it opens it in the browser rather than downloading it. and the fact that the text file is coded in Unix-style end-of-line format, I used Mac OS X TextEdit. not in MIME/DOS/Windows format which one could assume as Zip was primarily developed on DOS/Windows... If you want a Unix-style format, compress it with gzip instead) Can everyone un-gzip? Everyone can un-zip. You did not reply to the change of orthograph for the English name of Malayalam (a dot below diacritic removed), which was not shown in your proposed list of changes (in HTML format, within your zip archive). I am NOT going to track all the problems in all of those tables. I am tracking the changes between the two plain-text files ONLY, and Malayalam was not spelled differently in the first one. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
- Original Message - From: Michael Everson [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, May 19, 2004 10:40 PM Subject: ISO 15924 draft fixes The Registrar wishes to thank everyone who has taken an interest in the ISO 15924 data pages, and regrets the imperfections which are contained there. I am not sure how we will manage the generation of the pages, but it is clear that the base should be the plain-text document. I have made changes to the plain-text document and placed it, a draft Changes page, and the original plain-text document available at http://www.unicode.org/iso15924/iso15924-fixes.zip I would appreciate it if interested persons could look this over and inform me if they find any further discrepancies between the two which are worth troubling about. Then we will proceed to generate the other files. I deleted some duplicate lines: Ethiopic was on two lines, under Ethiopic and under Ge'ez. It seemed inappropriate to burden the tables with such duplication. I added Coptic unilaterally. I can't see Coptic for now in your source zip file. There are other duplicate lines for name aliases that should be listed in changes: - Berber (Tifinagh) = Tifinagh (Berber) - (Burmese) Myanmar = Myanmar (Burmese) - Fraktur (variant of Latin) = Latin (Fraktur variant) - Gaelic (variant of Latin) = Latin (Gaelic variant) - Harappan (Indus) = Indus (Harappan) - Mormon (Deseret) = Deseret (Mormon) - Nagari (Devanagari) = Devanagari (Nagari) - Old Church Slavonic (variant of Cyrillic) = Cyrillic (Old Church Slavonic variant) Note that the French names for Han variants are identical 'idéogrammes han, when the English names correctly indicates the distinction between Traditional and Simplified variants. 
These French names should be:
idéogrammes han (Hanzi, Kanji, Hanja);Hani;500;Han (Hanzi, Kanji, Hanja)
idéogrammes han (variante simplifiée);Hans;501;
idéogrammes han (variante traditionnelle);Hant;502;
For the French name of Hangul, I also found Hang quite strange (never seen this orthograph before). Documents in French from Korea or from Korean users in French refer to Hangul, Hangoul, or Hangûl, rarely Hangeul, whose French reading as *Ha:n'jeul or *Hãjeul would cause problems. Some sources use Hangueul, which is spelled correctly for French but may be offensive, as it is too close to the popular French slang verb engueuler, conjugated as engueule (a correct synonym for this verb is gronder, sometimes enguirlander in popular language, because the radical gueule is normally used to speak about animal faces/mouths).
Re: ISO 15924 draft fixes
At 13:37 +0200 2004-05-20, Philippe Verdy wrote: I added Coptic unilaterally. I can't see Coptic for now in your source zip file. It isn't in that file. There are other duplicate lines for name aliases that should be listed in changes: I'm not going to list those changes. There is no code or name change involved. - Berber (Tifinagh) = Tifinagh (Berber) [...] Note that the French names for Han variants are identical 'idéogrammes han, when the English names correctly indicates the distinction between Traditional and Simplified variants. These French names should be: idéogrammes han (Hanzi, Kanji, Hanja);Hani;500;Han (Hanzi, Kanji, Hanja) idéogrammes han (variante simplifiée);Hans;501; idéogrammes han (variante traditionnelle);Hant;502; This has been corrected. For the French name of Hangul, I also found Hang quite strange (never seen this orthograph before) Orthograph is not the word you want. You want the word spelling. I already said, this error has been corrected. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED] It can't be Unicode's UTC alone, as there are already codes for bibliographic references that are not (and will never) be encoded separately in Unicode,so I suppose that there are librarian or publishers members with which you have to discuss, independantly of the work of Unicode, which should only be the registrar for these codes. May be there's still no formal procedure, and for now the codes are maintainable without lots of administration. Read the standard. Stop this easy argument (that I find offensive here), you could have read it too before publishing tables with errors (most probably because you forgot to consult the relevant sources to check that your document were correct; I note that you are taking some freedom with you own decisions, regarding Coptic and the removal of Georgian (Asomtavruli) coded Geoa). I have read it and that's why I propose corrections... OK there are lots of corrections, but that's not a reason of ignoring some elements that were already published (and are still published for now on the Unicode web site, which is the only reference for the ISO15924 Registration Authority. Unicode has just appointed you to perform administrative updates for the RA, not to take your own decisions.) Sorry if you think that these sentences are a bit aggressive but for now the RA has made a bad start, and it's mainly because of your work... If the publication was preliminary (waiting for comments) it should have been documented as such on the Unicode web site (like for the proposals in Unicode, which pass by a testbed before being listed as standard). For now I suggest an immediate warning in the ISO15924 web pages, explicitly stating that these published tables were in beta, and contain incoherences, which are being corrected. A link should list the incoherences and the proposed changes. 
I have such a list and all it takes for me is a simple Excel spreadsheet, used to sort the tables and detecting differences between published tables and proposed corrections.
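The spreadsheet comparison described here can also be sketched as a small script. The following Python sketch assumes six semicolon-separated fields in the order English name, code, number, French name, PVA, date, and keys records on the numeric code so that a changed 4-letter code still pairs up. The function names and the two sample files are illustrative, not the actual published data:

```python
from typing import Dict, List, Tuple

FIELDS = ("english", "code", "number", "french", "pva", "date")
Record = Dict[str, str]


def parse(text: str) -> Dict[str, Record]:
    """Parse semicolon-separated rows into records keyed by numeric code."""
    records: Dict[str, Record] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        values = line.split(";")
        if len(values) != len(FIELDS):
            raise ValueError(f"line {lineno}: expected {len(FIELDS)} fields, got {len(values)}")
        record = dict(zip(FIELDS, values))
        records[record["number"]] = record
    return records


def delta(old: Dict[str, Record], new: Dict[str, Record]):
    """Return (added, removed, changed); the date field is ignored on purpose."""
    added: List[str] = sorted(set(new) - set(old))
    removed: List[str] = sorted(set(old) - set(new))
    changed: Dict[str, Dict[str, Tuple[str, str]]] = {}
    for key in sorted(set(old) & set(new)):
        diffs = {
            field: (old[key][field], new[key][field])
            for field in FIELDS
            if field != "date" and old[key][field] != new[key][field]
        }
        if diffs:
            changed[key] = diffs
    return added, removed, changed


# Illustrative old and new files (assumed layout, not the real files).
OLD = "Arabic;Arab;160;arabe;Arabic;2004-05-01\nMandaean;Mnda;140;mandéen;;2004-05-01\n"
NEW = "Arabic;Arab;160;arabe;Arabic;2004-05-01\nMandaean;Mand;140;mandéen;;2004-05-20\n"

print(delta(parse(OLD), parse(NEW)))
```

Ignoring the date column is a design choice for this sketch: it keeps the report focused on substantive changes to codes and names.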
Re: ISO 15924 draft fixes
At 14:44 +0200 2004-05-20, Philippe Verdy wrote: From: Michael Everson [EMAIL PROTECTED] It can't be Unicode's UTC alone, as there are already codes for bibliographic references that are not (and will never) be encoded separately in Unicode,so I suppose that there are librarian or publishers members with which you have to discuss, independantly of the work of Unicode, which should only be the registrar for these codes. May be there's still no formal procedure, and for now the codes are maintainable without lots of administration. Read the standard. Stop this easy argument (that I find offensive here), you could have read it too before publishing tables with errors Errors are errors. The RA-JAC had an opportunity to review all the tables. Do not blame me alone. People err. People have kindly pointed out discrepancies. (most probably because you forgot to consult the relevant sources to check that your document were correct; Don't presume. I note that you are taking some freedom with you own decisions, regarding Coptic and the removal of Georgian (Asomtavruli) coded Geoa). I have (properly) proposed the addition of Coptic (and some other scripts) to the JAC. Asomtavruli was removed for good reasons. Live with it. It will be reinstated in due course. I have read it and that's why I propose corrections... And that's why I am communicating with you, to get relevant feedback. The only delta we are going to deal with is the one between the plain-text documents; it is that which is going to be considered authoritative and which will be used (somehow) to generate the other tables. Sorry if you think that these sentences are a bit aggressive but for now the RA has made a bad start, and it's mainly because of your work... Nonsense. I am not ashamed. It was a hell of a lot of work getting that standard together. It is, as you have pointed out, difficult to maintain different tables by hand. 
If the publication was preliminary (waiting for comments) it should have been documented as such on the Unicode web site (like for the proposals in Unicode, which pass by a testbed before being listed as standard). It does NOT matter, Philippe. The corrections are being made. For now I suggest an immediate warning in the ISO15924 web pages, explicitly stating that these published tables were in beta, and contain incoherences, which are being corrected. No. This is purely cosmetic. Let us move on. A link should list the incoherences and the proposed changes. I have such a list and all it takes for me is a simple Excel spreadsheet, used to sort the tables and detecting differences between published tables and proposed corrections. The only delta we are going to deal with is the one between the plain-text documents; it is that which is going to be considered authoritative and which will be used (somehow) to generate the other tables. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
Antoine Leca a écrit : The French name for Hang looks strange. It happened to be hangul (hangul, hangeul) (after quite a bit of discussion.) The name in ISO/CEI 10646 (F) is « hangûl » from a Corean dictionary and a Corean grammar published by the Inalco (Langues O'). Another suggested form in some sources, to appromixate the pronounciation. is « hangueul » P. A.
Re: ISO 15924 draft fixes
To wrap up this discussion, I have put the corrected tables online: http://www.rodage.org/pub/iso15924-sheets.html (this is an Excel workbook in HTML format, with frames but without Excel interactivity, that references other URLs in a subfolder; it can be navigated by the tabs at the bottom). Also available as a plain Excel file: http://www.rodage.org/pub/iso15924-sheets.xls The above collection is also archived in http://www.rodage.org/pub/iso15924-sheets.zip
Re: ISO 15924 draft fixes
At 06:51 -0700 2004-05-20, Patrick Andries wrote: Antoine Leca a écrit : The French name for Hang looks strange. It happened to be hangul (hangul, hangeul) (after quite a bit of discussion.) The name in ISO/CEI 10646 (F) is « hangûl » from a Corean dictionary and a Corean grammar published by the Inalco (Langues O'). Another suggested form in some sources, to appromixate the pronounciation. is « hangueul » transliterations of Korean that the Korean NB insisted upon. Hangul instead of hangûl we will treat as a spelling error (so you don't have to file a change form). -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
For now I suggest an immediate warning in the ISO15924 web pages, explicitly stating that these published tables were in beta, and contain incoherences, which are being corrected. No. This is purely cosmetic. Let us move on. I find this cavalier attitude a bit disconcerting. Errors in the tables are not purely cosmetic. An IT standard is created to support IT implementations, and people have been and will be referring to those tables to create their implementations. Each view of the data should be reliable, and if it is found that it was not, then that needs to be communicated in some way. IMO, it is essential that there be a place on the site for errata. I'm inclined to agree with Philippe: the errata notes should indicate that there were errors in the original tables and what the nature of those errors were. If IDs were misspelled or missing, those should be enumerated. If English or French names were misspelled, I think a general note is sufficient. A link should list the incoherences and the proposed changes. I have such a list and all it takes for me is a simple Excel spreadsheet, used to sort the tables and detecting differences between published tables and proposed corrections. The only delta we are going to deal with is the one between the plain-text documents; it is that which is going to be considered authoritative Is that document*s* (plural)? I strongly encourage you to maintain *one* master source from which all others are derived. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
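The "one master source" recommendation can be illustrated with a short script that derives all four sorted views from a single list of records. This is only a sketch under stated assumptions: the master rows use the order Code;Nº;English name;French name;PVA;Date, the view names and the `build_view` helper are hypothetical, and the PVA and date cells of the sample rows are placeholders:

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[str, ...]

# Illustrative master data (names taken from the thread, other cells are placeholders).
MASTER = """\
Sylo;316;Syloti Nagri;sylotî nâgrî;;2004-01-09
Hani;500;Han (Hanzi, Kanji, Hanja);idéogrammes han (Hanzi, Kanji, Hanja);;2004-05-01
Mand;140;Mandaean;mandéen;;2004-05-01
"""


def load(text: str) -> List[Record]:
    """Split each non-empty line of the master file into a tuple of cells."""
    return [tuple(line.split(";")) for line in text.splitlines() if line.strip()]


# view name -> (column index used for sorting, key function applied to that cell)
VIEWS: Dict[str, Tuple[int, Callable[[str], object]]] = {
    "codes": (0, str),        # table 1: by 4-letter code
    "num": (1, int),          # table 2: by 3-digit number
    "en": (2, str.casefold),  # table 3: by English name
    "fr": (3, str.casefold),  # table 4: by French name
}


def build_view(records: List[Record], view: str) -> List[Record]:
    """Return the records sorted for the requested view; the input is not modified."""
    column, key = VIEWS[view]
    return sorted(records, key=lambda record: key(record[column]))


records = load(MASTER)
for name in VIEWS:
    print(name, [record[0] for record in build_view(records, name)])
```

Sorting with `str.casefold` compares code points, which is not French collation; a real generator would likely need locale-aware sorting for accented names.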
RE: ISO 15924 draft fixes
At 08:10 -0700 2004-05-20, Peter Constable wrote: For now I suggest an immediate warning in the ISO15924 web pages, explicitly stating that these published tables were in beta, and contain incoherences, which are being corrected. No. This is purely cosmetic. Let us move on. I find this cavalier attitude a bit disconcerting. Errors in the tables are not purely cosmetic. Look, Peter. I'm glad people found errors and inconsistencies. We are working on fixing that, and expect it to be fixed very soon. You're ALL listening. Taking time to put up an immediate warning isn't a good use of my time. IMO, it is essential that there be a place on the site for errata. I'm inclined to agree with Philippe: the errata notes should indicate that there were errors in the original tables and what the nature of those errors were. If IDs were misspelled or missing, those should be enumerated. If English or French names were misspelled, I think a general note is sufficient. The changes will be noted at http://www.unicode.org/iso15924/codechanges.html Please be a little bit patient. The only delta we are going to deal with is the one between the plain-text documents; it is that which is going to be considered authoritative Is that document*s* (plural)? I strongly encourage you to maintain *one* master source from which all others are derived. That would be THE old plain-text document and THE new plain-text document which will replace it. -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
Peter, Philippe, I hope this satisfies you. http://www.unicode.org/iso15924/codelists.html It is enough work finding and fixing and figuring out whatever it is that a perl script is and how to make it work. It may seem obvious to you, but it is not obvious to me. -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson Taking time to put up an immediate warning isn't a good use of my time. I didn't ask for an immediate warning. I will note, though, that incorporating bad data into a product may not be a good use of time for someone else -- and it may be far more costly for them than it will be for you. The changes will be noted at http://www.unicode.org/iso15924/codechanges.html Please be a little bit patient. I don't think I'm being at all impatient. I didn't ask you to do anything yesterday; I just ask that it be done carefully. And not to think that bad data files can be relegated to cosmetics, which is what you seemed to be saying. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
RE: ISO 15924 draft fixes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson I hope this satisfies you. http://www.unicode.org/iso15924/codelists.html If they are consistent and reliable, I'm satisfied with them. I hope you will be preparing a page for corrigenda / errata. It's not a big issue, but I don't understand why the dates don't match: was Arab added on January 9 or May 1? So, they're not entirely consistent. Also, it appears you have not fixed a serious error in the plain-text file: it is not well-structured. Some rows have 6 columns, and some have 7. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
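The structural problem reported here (rows with 6 columns next to rows with 7) is easy to detect automatically. A minimal Python sketch, assuming a semicolon separator and six expected fields; the function name `malformed_rows` and the sample text are illustrative:

```python
from typing import List, Tuple

EXPECTED_FIELDS = 6


def malformed_rows(text: str, expected: int = EXPECTED_FIELDS, sep: str = ";") -> List[Tuple[int, int, str]]:
    """Return (line number, field count, line) for every row with the wrong field count."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are not data rows
        count = line.count(sep) + 1
        if count != expected:
            bad.append((lineno, count, line))
    return bad


# Illustrative sample: the second row carries one separator too many.
SAMPLE = (
    "Mandaean;Mand;140;mandéen;;2004-05-01\n"
    "Syloti Nagri;Sylo;316;sylotî nâgrî;;;2004-01-09\n"
)

for lineno, count, line in malformed_rows(SAMPLE):
    print(f"line {lineno}: {count} fields: {line}")
```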
RE: ISO 15924 draft fixes
I concur with Peter. If there are multiple documents now, then I'd like to see a single normative document... and furthermore I would like it to *be* normative (and I'd like to know which one it is). The text file is listed on the web site as the alternative... By all means correct errors. Spelling or nomenclatural (non-substantive) changes in the descriptions are errata. But I view changes, additions, and deletions to/from the data tables as changes to the standard and they should, in my opinion, be treated as such even if they are only to correct errors. Best Regards, Addison Addison P. Phillips Director, Globalization Architecture webMethods | Delivering Global Business Visibility http://www.webMethods.com Chair, W3C Internationalization (I18N) Working Group Chair, W3C-I18N-WG, Web Services Task Force http://www.w3.org/International Internationalization is an architecture. It is not a feature. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Peter Constable Sent: 2004-05-20 8:10 To: Unicode List Subject: RE: ISO 15924 draft fixes [...]
RE: ISO 15924 draft fixes
At 09:49 -0700 2004-05-20, Peter Constable wrote: From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson I hope this satisfies you. http://www.unicode.org/iso15924/codelists.html If they are consistent and reliable, I'm satisfied with them. I hope you will be preparing a page for corrigenda / errata. That's what http://www.unicode.org/iso15924/codechanges.html is for. It's not a big issue, but I don't understand why the dates don't match: was Arab added on January 9 or May 1? So, they're not entirely consistent. Because long long ago when I thought that ISO was going to publish the document on my birthday (sigh) I put 2004-01-09 on the document; that didn't happen, and it wasn't published until 2004-05-01. Also, it appears you have not fixed a serious error in the plain-text file: it is not well-structured. Some rows have 6 columns, and some have 7. That might be fixed in the newest one. -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
At 10:00 -0700 2004-05-20, Addison Phillips [wM] wrote: I concur with Peter. If there are multiple documents now, then I'd like to see a single normative document... It will be the plain-text version, and for the purposes of fixing the current regrettable mess I'm taking it as read that the plain text version was always the normative version. and furthermore I would like it to *be* normative (and I'd like to know which one it is). The text file is listed on the web site as the alternative... It should say normative. Is the format order satisfactory? English_Name;Code;Nº;Nom_français;PVA;Date Or would it be preferable to have it in the format of Table 1 (Code;Nº;English_Name;Nom_français;PVA;Date) -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
Antoine Leca a écrit : The French name for Hang looks strange. It happened to be hangul (hangul, hangeul) (after quite a bit of discussion.) Sorry guys. For reasons known to itself, my mailer refused to post in UTF-8 this morning. I meant hangul(hangul, hangeul). According to a native ftp://dkuug.dk/ftp.anonymous/email/iso15924/277 the correct form are the ones between parenthesis (with an added apostrophe between han'gul). : From: Jian YANG [EMAIL PROTECTED] : Subject: Re: Re: (iso15924.275) Hangul (Hang~ul, Hangeul) : as script name (~is adiacritical mark) : Date: Mon, 29 May 2000 15:49:25 -0400 : : : «Hangeul» = Norme de romanisation du Ministère de : l'Éducation de la Corée du Sud; : «Hangul» = Romanisation Mc-Cune-Reischauer (la forme exacte : est «Han'gul» : «u» with breve, et non caron; mais on a : enlevé le signe diacritique pour accommoder la convention de : ascii, sans doute); On Thursday, May 20, 2004 3:51 PM, Patrick Andries va escriure: The name in ISO/CEI 10646 (F) is « hangûl » from a Corean dictionary and a Corean grammar published by the Inalco (Langues O'). Clearly, the Langues'O did adapt it to French typographical possibilities, reversing the breve accent into a circumflex. Another suggested form in some sources, to appromixate the pronounciation. is « hangueul » This is the other form, with an added, euphonical u after the g, to avoid a complete misprononciation. About whether all this right or not, I do not know. But I believe this text did go through two ballots against the very people of Langues'O (?), so we have no reason to correct now what was accepted in the standard. The only choice right now is to type exactly what was printed, since I understand we do not have any more the master that served to the [F]DIS texts. Since I am not a member of TC46, and furthermore I was away from the process last year, I might very easily be wrong. Antoine
RE: ISO 15924 draft fixes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson Also, it appears you have not fixed a serious error in the plain-text file: it is not well-structured. Some rows have 6 columns, and some have 7. That might be fixed in the newest one. It is not fixed in the file that's on the site now. If this is the normative file, I'd suggest you fix it as soon as possible. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
RE: ISO 15924 draft fixes
At 12:07 -0700 2004-05-20, Peter Constable wrote: From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson Also, it appears you have not fixed a serious error in the plain-text file: it is not well-structured. Some rows have 6 columns, and some have 7. That might be fixed in the newest one. It is not fixed in the file that's on the site now. If this is the normative file, I'd suggest you fix it as soon as possible. Which file on the site? -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
I don't care about the order, so long as it is stable over time. Personally I find the latter form more logical (with the identifier, i.e. the code, first). I view the English and French names and the PVA as merely descriptive or informative information. The code and the ID number should go first, IMO. But if the file is in some other format, that's fine, so long as the format is stable. Best Regards, Addison Addison P. Phillips Director, Globalization Architecture webMethods | Delivering Global Business Visibility http://www.webMethods.com Chair, W3C Internationalization (I18N) Working Group Chair, W3C-I18N-WG, Web Services Task Force http://www.w3.org/International Internationalization is an architecture. It is not a feature. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Michael Everson Sent: 2004-05-20 10:59 To: [EMAIL PROTECTED] Subject: RE: ISO 15924 draft fixes At 10:00 -0700 2004-05-20, Addison Phillips [wM] wrote: I concur with Peter. If there are multiple documents now, then I'd like to see a single normative document... It will be the plain-text version, and for the purposes of fixing the current regrettable mess I'm taking it as read that the plain text version was always the normative version. and furthermore I would like it to *be* normative (and I'd like to know which one it is). The text file is listed on the web site as the alternative... It should say normative. Is the format order satisfactory? English_Name;Code;Nº;Nom_français;PVA;Date Or would it be preferable to have it in the format of Table 1 (Code;Nº;English_Name;Nom_français;PVA;Date) -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
I could use a little help rendering this into French, lest I embarrass myself: "The Property Value Alias is defined as part of the Unicode Standard and is provided informatively in the tables here to show how entries in the ISO 15924 code table relate to script names defined in Unicode." -- Michael Everson * * Everson Typography * * http://www.evertype.com
RE: ISO 15924 draft fixes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Addison Phillips [wM] I don't care about the order, so long as it is stable over time. Personally I find the latter form more logical (with the identifier, i.e. the code, first). I agree with Addison here: the most important thing is stability, but it makes sense that the first and second columns be the symbolic code and the numeric code, especially if this is *the* plain-text version and normative reference. Peter Peter Constable Globalization Infrastructure and Font Technologies Microsoft Windows Division
Re: ISO 15924 draft fixes
From: Peter Constable [EMAIL PROTECTED] From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael Everson I hope this satisfies you. http://www.unicode.org/iso15924/codelists.html If they are consistent and reliable, I'm satisfied with them. I hope you will be preparing a page for corrigenda / errata. It's not a big issue, but I don't understand why the dates don't match: was Arab added on January 9 or May 1? So, they're not entirely consistent. Also, it appears you have not fixed a serious error in the plain-text file: it is not well-structured. Some rows have 6 columns, and some have 7. No, the structure is correct; however, the text file was prepared by copy/pasting HTML text that had been inserted in empty cells, namely the &nbsp; character reference (which contains a syntactic semicolon that conflicts with the CSV separator). That's the first thing I had signaled to Michael several days ago, and he has acknowledged it and corrected it in his new update. I have already signaled almost all bugs and inconsistencies to Michael, and prepared corrected files. Michael has just changed the online version (but with the wrong dates... that's irritating). There is still a conflict of Code for Mandaean, is it Mand or Mnda?
- Table 1 (HTML by Code): Mand;140;Mandaean;mandéen;;2004-05-01
- Table 2 (HTML by N°): 140;Mand;Mandaean;mandéen;;2004-05-01
- Table 3 (HTML by Name): Mandaean;140;Mnda;mandéen;;2004-05-01
- same thing for Table 3 (plain-text by Name)
- Table 4 (HTML by Nom): mandéen;140;Mnda;Mandaean;;2004-05-01
As Michael indicates that the plain-text should be the reference, then it suggests Mnda and not Mand... But the new plain-text version uses Mand... I had already signaled it in a past message... So it's also irritating.
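The extra column explained in this message comes from the HTML non-breaking-space reference (`&nbsp;`), whose trailing semicolon is read as a field separator. A minimal Python sketch of the effect and one possible repair, assuming the only HTML residue in a row is character references; `split_row` is a hypothetical helper and the sample row is the Table 3 row quoted above with `&nbsp;` placed in its empty cell:

```python
import html
from typing import List


def split_row(line: str, sep: str = ";") -> List[str]:
    """Decode HTML character references before splitting, then trim each cell."""
    # html.unescape turns "&nbsp;" into U+00A0, which removes the stray ";".
    decoded = html.unescape(line)
    return [cell.replace("\u00a0", " ").strip() for cell in decoded.split(sep)]


# Row as it might look after copy/pasting from the HTML table (assumption).
RAW = "Mandaean;140;Mnda;mandéen;&nbsp;;2004-05-01"

print(len(RAW.split(";")))  # naive split counts the entity's semicolon too
print(split_row(RAW))
```

Decoding before splitting is a design choice for this sketch; it would misbehave if a legitimate cell contained a literal ampersand sequence that looks like a character reference.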
RE: ISO 15924 draft fixes
At 13:49 -0700 2004-05-20, Peter Constable wrote: I agree with Addison here: the most important thing is stability, but it makes sense that the first and second columns be the symbolic code and the numeric code, especially if this is *the* plain-text version and normative reference. That's going to happen. -- Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED] Is the format order satisfactory? English_Name;Code;Nº;Nom_français;PVA;Date Or would it be preferable to have it in the format of Table 1 (Code;Nº;English_Name;Nom_français;PVA;Date) I vote for the order of table 1; the Code is the most important one, and the start of line will have a fixed format, easing its parsing, or simply easing its legibility for readers. Make the plain-text normative and published online (out of a zip file), and make the HTML pages only informative... I have done another scripted page (with PHP; however the PHP generated page may be stored in a cached static page if you don't want scripts on the Unicode server itself) using the text file as the reference, to generate all the HTML pages for browsing.
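Moving the file from the name-first layout to the Table 1 layout is a pure column permutation. A minimal Python sketch, assuming every row has exactly the six fields named in the question above; the `reorder` function and the sample row are illustrative:

```python
from typing import List

OLD_ORDER = ["English_Name", "Code", "Nº", "Nom_français", "PVA", "Date"]
NEW_ORDER = ["Code", "Nº", "English_Name", "Nom_français", "PVA", "Date"]


def reorder(line: str, old: List[str] = OLD_ORDER, new: List[str] = NEW_ORDER, sep: str = ";") -> str:
    """Rewrite one row from the old column order to the new one."""
    cells = line.split(sep)
    if len(cells) != len(old):
        raise ValueError(f"expected {len(old)} fields, got {len(cells)}: {line!r}")
    by_name = dict(zip(old, cells))
    return sep.join(by_name[name] for name in new)


# Illustrative row in the name-first layout.
print(reorder("Mandaean;Mand;140;mandéen;;2004-05-20"))
```

With the code and number first, every line starts with a fixed-width prefix (four letters, a separator, three digits), which is the parsing convenience mentioned in this message.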
Re: ISO 15924 draft fixes
At 23:21 +0200 2004-05-20, Philippe Verdy wrote: Micheal has just changed the online version (but with the wrong dates...that's irritating). Patience... Unchanged codes will retain 2004-05-01 as the starting date. Changed codes have (as of the current BETA draft which is uploaded for testing purposes now; look at it if you like to do that sort of thing) the date of 2004-05-20. If there are further changes before we go to RELEASE then I will adjust that date accordingly. OK? There is still a conflict of Code for Mandaean, is it Mand or Mnda? Mand. As Michael indicates that the plain-text should be the reference, then it suggests Mnda and not Mand... But the new plain-text version uses Mand... I had already signaled it in a past message... So it's also irritating. They all say Mand now. -- Michael Everson * * Everson Typography * * http://www.evertype.com
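A claim such as "they all say Mand now" can be verified by checking that each numeric code maps to a single 4-letter code across all tables. The Python sketch below uses the four Mandaean rows quoted earlier in the thread (before the fix); the table names, the column positions and the `conflicting_codes` helper are assumptions made for illustration:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# table name -> ((code column, number column), rows), positions follow the quoted rows
TABLES: Dict[str, Tuple[Tuple[int, int], List[str]]] = {
    "table1": ((0, 1), ["Mand;140;Mandaean;mandéen;;2004-05-01"]),
    "table2": ((1, 0), ["140;Mand;Mandaean;mandéen;;2004-05-01"]),
    "table3": ((2, 1), ["Mandaean;140;Mnda;mandéen;;2004-05-01"]),
    "table4": ((2, 1), ["mandéen;140;Mnda;Mandaean;;2004-05-01"]),
}


def conflicting_codes(tables) -> Dict[str, Dict[str, List[str]]]:
    """Return {number: {code: [tables using it]}} for numbers with more than one code."""
    seen = defaultdict(lambda: defaultdict(list))
    for name, ((code_col, num_col), rows) in tables.items():
        for row in rows:
            cells = row.split(";")
            seen[cells[num_col]][cells[code_col]].append(name)
    return {num: dict(codes) for num, codes in seen.items() if len(codes) > 1}


print(conflicting_codes(TABLES))
```

An empty result would mean the tables agree on every code; any entry names the tables that disagree, which is the information an errata note needs.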
Re: ISO 15924 draft fixes
From: Antoine Leca [EMAIL PROTECTED] Antoine Leca a écrit : The French name for Hang looks strange. It happened to be hangul (hangul, hangeul) (after quite a bit of discussion.) Sorry guys. For reasons known to itself, my mailer refused to post in UTF-8 this morning. I meant hangul(hangul, hangeul). According to a native ftp://dkuug.dk/ftp.anonymous/email/iso15924/277 the correct form are the ones between parenthesis (with an added apostrophe between han'gul). : From: Jian YANG [EMAIL PROTECTED] : Subject: Re: Re: (iso15924.275) Hangul (Hang~ul, Hangeul) : as script name (~is adiacritical mark) : Date: Mon, 29 May 2000 15:49:25 -0400 : : : «Hangeul» = Norme de romanisation du Ministère de : l'Éducation de la Corée du Sud; : «Hangul» = Romanisation Mc-Cune-Reischauer (la forme exacte : est «Han'gul» : «u» with breve, et non caron; mais on a : enlevé le signe diacritique pour accommoder la convention de : ascii, sans doute); On Thursday, May 20, 2004 3:51 PM, Patrick Andries va escriure: The name in ISO/CEI 10646 (F) is « hangûl » from a Corean dictionary and a Corean grammar published by the Inalco (Langues O'). Clearly, the Langues'O did adapt it to French typographical possibilities, reversing the breve accent into a circumflex. Another suggested form in some sources, to appromixate the pronounciation. is « hangueul » This is the other form, with an added, euphonical u after the g, to avoid a complete misprononciation. About whether all this right or not, I do not know. But I believe this text did go through two ballots against the very people of Langues'O (?), so we have no reason to correct now what was accepted in the standard. The only choice right now is to type exactly what was printed, since I understand we do not have any more the master that served to the [F]DIS texts. Since I am not a member of TC46, and furthermore I was away from the process last year, I might very easily be wrong.
I see no real problem if not all the different orthographies are listed, or if they are not used universally, as long as the name is unambiguous. What matters for data interchange is not this name but the Code (or N, or even the ID in UAX #24 properties). So there's nothing wrong if Han'gul is shown to users without the preferred apostrophe (I mean the apostrophe here, not the single quote!), or with a caron or circumflex instead of the breve (to adapt to the rendering or encoding context in which this name is exposed to users), or even without any diacritic. My opinion is that substituting one diacritic for another is worse than simply removing a diacritic that can't be displayed or encoded.

French normally has no caron and no breve, and the circumflex marks a slight alteration of the vowel caused by an assimilated consonant in the historical orthography (most often the French circumflex denotes a lost s after the vowel). So the circumflex on Hangul would be inappropriate for French, as would Hangeul (which breaks the common reading rules). Hangul and Han'gul are more acceptable, as well as Hangoeul with an oe ligature, or Han'goeul with an additional apostrophe, which would be even more accurate but has been seen nowhere so far.

[Comments-OT]
The problem with apostrophes is that French keyboards don't have one, only a single quote. In French, reading a quote as an apostrophe is limited to very few words, where it marks the elision of some characters, not a phonetic feature. In Han'gul there is no elision, but without the apostrophe a French reader nasalizes the preceding letter a. A solution would be to write Hanngul. However, there are now lots of proper names _ending_ in -an (such as Alan) for which readers easily avoid the nasalization. So Han, i.e. the ideographic script of Chinese, is appropriate in French, but not Hanzi, Hanja, or Hangul, where almost all native readers would not pronounce the n but would nasalize the preceding vowel a.

The simplest way to avoid nasalization in French is to place another n after the first one (nasalization never occurs with a double n in French). This would give in French: Hanngul (or Hanngoeul, or is it Hanngoul?), Hannzi (to avoid pronouncing it as in enzyme), Hannja (to avoid pronouncing it as in en japonais), but still Han (in preference to Hann)...
[/Comments-OT]

Philippe.
Re: ISO 15924 draft fixes
Peter Constable wrote: Michael Everson wrote: Also, it appears you have not fixed a serious error in the plain-text file: it is not well-structured. Some rows have 6 columns, and some have 7. That might be fixed in the newest one. It is not fixed in the file that's on the site now. If this is the normative file, I'd suggest you fix it as soon as possible.

This (below) is my own plain text version (still using the field and row order of table 3 by English name, instead of the order of table 1 by code)... Some entries are commented out with %.

Philippe.

---
% The format is Name;Code;N;Nom;ID;Date
% Codes for the representation of names of scripts
% Codes pour la représentation des noms d'écritures
% Alphabetical list of English script names
English_Name;Code;N;Nom_français;ID;Date
(alias for Hiragana + Katakana);Hrkt;412;(alias pour hiragana + katakana);Katakana_Or_Hiragana;2004-05-01
Arabic;Arab;160;arabe;Arabic;2004-05-01
Armenian;Armn;230;arménien;Armenian;2004-05-01
Balinese;Bali;360;balinais;;2004-05-18
Batak;Batk;365;batak;;2004-05-01
Bengali;Beng;325;bengalî;Bengali;2004-05-01
Blissymbols;Blis;550;symboles Bliss;;2004-05-01
Bopomofo;Bopo;285;bopomofo;Bopomofo;2004-05-01
Brahmi;Brah;300;brâhmî;;2004-05-01
Braille;Brai;570;braille;Braille;2004-05-01
Buginese;Bugi;367;bouguis;;2004-05-01
Buhid;Buhd;372;bouhide;Buhid;2004-05-01
Cham;Cham;358;cham (čam, tcham);;2004-05-01
Cherokee;Cher;445;tchérokî;Cherokee;2004-05-01
Cirth;Cirt;291;cirth;;2004-05-01
Code for uncoded script;Zzzz;999;codet pour écriture non codée;;2004-05-01
Code for undetermined script;Zyyy;998;codet pour écriture indéterminée;;2004-05-01
Code for unwritten languages;Zxxx;997;codet pour les langues non écrites;;2004-05-01
% Still missing...
%Coptic;copt;201;copte;;2004-05-20
Cuneiform, Sumero-Akkadian;Xsux;020;cunéiforme suméro-akkadien;;2004-05-01
Cypriot;Cprt;403;syllabaire chypriote;Cypriot;2004-05-01
Cyrillic;Cyrl;220;cyrillique;Cyrillic;2004-05-01
Cyrillic (Old Church Slavonic variant);Cyrs;221;cyrillique (variante slavonne);;2004-05-01
Deseret (Mormon);Dsrt;250;déseret (mormon);Deseret;2004-05-01
Devanagari (Nagari);Deva;315;dévanâgarî;Devanagari;2004-05-01
Egyptian demotic;Egyd;070;démotique égyptien;;2004-05-01
Egyptian hieratic;Egyh;060;hiératique égyptien;;2004-05-01
Egyptian hieroglyphs;Egyp;050;hiéroglyphes égyptiens;;2004-05-01
Ethiopic (Geez);Ethi;430;éthiopique (éthiopien, geez);Ethiopic;2004-05-01
% Why was this removed? Wasn't it present for bibliographic references?
%Georgian (Asomtavruli);Geoa;241;géorgien (assomtavrouli);;2004-05-18
Georgian (Mkhedruli);Geor;240;géorgien (mkhédrouli);Georgian;2004-05-18
Glagolitic;Glag;225;glagolitique;;2004-05-01
Gothic;Goth;206;gotique;Gothic;2004-05-01
Greek;Grek;200;grec;Greek;2004-05-01
Gujarati;Gujr;320;goudjarâtî (gujrâtî);Gujarati;2004-05-01
Gurmukhi;Guru;310;gourmoukhî;Gurmukhi;2004-05-01
Han (Hanzi, Kanji, Hanja);Hani;500;idéogrammes han;Han;2004-05-01
Han (Simplified variant);Hans;501;idéogrammes han (variante simplifiée);;2004-05-01
Han (Traditional variant);Hant;502;idéogrammes han (variante traditionelle);;2004-05-01
% This should better be:
%Hangul (Hangŭl, Hangeul);Hang;286;hangul (hangul);Hangul;2004-05-01
Hangul (Hangŭl, Hangeul);Hang;286;hangul (hangŭl, hangeul);Hangul;2004-05-01
Hanunóo;Hano;371;hanounóo;Hanunoo;2004-05-01
Hebrew;Hebr;125;hébreu;Hebrew;2004-05-01
Hiragana;Hira;410;hiragana;Hiragana;2004-05-01
Indus (Harappan);Inds;610;indus;;2004-05-01
Javanese;Java;361;javanais;;2004-05-18
Kannada;Knda;345;kannara (canara);Kannada;2004-05-18
Katakana;Kana;411;katakana;Katakana;2004-05-01
Kayah Li;Kali;357;kayah li;;2004-05-01
Kharoshthi;Khar;305;kharochthî;;2004-05-18
Khmer;Khmr;355;khmer;Khmer;2004-05-18
Lao;Laoo;356;laotien;Lao;2004-05-01
Latin;Latn;215;latin;Latin;2004-05-01
Latin (Fraktur variant);Latf;217;latin (variante brisée);;2004-05-01
Latin (Gaelic variant);Latg;216;latin (variante gaélique);;2004-05-01
Lepcha (Róng);Lepc;335;lepcha (róng);;2004-05-01
Limbu;Limb;336;limbou;Limbu;2004-05-18
Linear A;Lina;400;linéaire A;;2004-05-01
Linear B;Linb;401;linéaire B;Linear_B;2004-05-18
Malayalam;Mlym;347;malayâlam;Malayalam;2004-05-01
Mandaean;Mnda;140;mandéen;;2004-05-01
Mayan hieroglyphs;Maya;090;hiéroglyphes mayas;;2004-05-01
Meroitic;Mero;100;méroïtique;;2004-05-01
Mongolian;Mong;145;mongol;Mongolian;2004-05-01
Myanmar (Burmese);Mymr;350;birman;Myanmar;2004-05-01
Ogham;Ogam;212;ogam;Ogham;2004-05-01
Old Hungarian;Hung;176;ancien hongrois;;2004-05-01
Old Italic (Etruscan, Oscan, etc.);Ital;210;ancien italique (étrusque, osque, etc.);Old_Italic;2004-05-18
Old Permic;Perm;227;ancien permien;;2004-05-01
Old Persian;Xpeo;030;cunéiforme persépolitain;;2004-05-01
Oriya;Orya;327;oriyâ;Oriya;2004-05-01
Orkhon;Orkh;175;orkhon;;2004-05-01
Osmanya;Osma;260;osmanais;Osmanya;2004-05-01
Pahawh Hmong;Hmng;450;pahawh hmong;;2004-05-01
Phoenician;Phnx;115;phénicien;;2004-05-01
Pollard Phonetic;Plrd;282;phonétique de Pollard;;2004-05-01
Reserved for private use (start);Qaaa;900;réservé à l'usage privé (début);;2004-05-18
RE: ISO 15924 draft fixes
From: Philippe Verdy [mailto:[EMAIL PROTECTED]
> No, the structure is correct; however, the text file was prepared by copy/pasting HTML text inserted in empty cells, namely the &nbsp; character reference (which contains a syntactic semicolon conflicting with the CSV separator).

IMO, the structure of data is effectively determined by how processes will interpret the data. A process won't see 6 columns one of which contains &nbsp;. It will see seven columns one of which contains &nbsp.

He's said the file has been fixed (though I don't know if he's posted the fixed file).

Peter

Peter Constable
Globalization Infrastructure and Font Technologies
Microsoft Windows Division
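The point about how a process interprets the data can be illustrated with a minimal Python sketch. It assumes only what the thread states about the file: fields separated by semicolons and comment lines starting with %. The function names are illustrative, and the second sample row is hypothetical, built to show what a pasted &nbsp; entity does to a naive splitter.

```python
def split_fields(line, sep=";"):
    """Split one record the way a naive separator-based reader would."""
    return line.rstrip("\r\n").split(sep)


def column_counts(lines, sep=";"):
    """Map each field count to the 1-based numbers of the lines that have it."""
    counts = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("%"):
            continue  # blank lines and % comments carry no record
        counts.setdefault(len(split_fields(line, sep)), []).append(number)
    return counts


sample = [
    "Arabic;Arab;160;arabe;Arabic;2004-05-01",
    # Hypothetical row: an empty ID cell pasted from HTML as "&nbsp;".
    "Batak;Batk;365;batak;&nbsp;;2004-05-01",
]

for count, numbers in sorted(column_counts(sample).items()):
    print(count, "fields on lines", numbers)
```

With this sample the first row is reported under 6 fields and the second under 7, because the semicolon that terminates the entity is read as one more separator.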
Re: ISO 15924 draft fixes
At 00:05 +0200 2004-05-21, Philippe Verdy wrote:
> This (below) is my own plain text version (still using the field and row order of table 3 by English name, instead of the order of table 1 by code)... Some entries are commented out with %.

The RA has no intention whatsoever of making use of this file. Absolutely not.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED]
> I could use a little help rendering this into French, lest I embarrass myself:
> The Property Value Alias is defined as part of the Unicode Standard and is provided informatively in the tables here to show how entries in the ISO 15924 code table relate to script names defined in Unicode.

Tip: a French translation is:

Le synonyme de valeur de propriété est défini au sein du Standard Unicode et est fourni ici de façon informative dans les tables, afin de montrer comment les entrées des tables de codets ISO 15924 correspondent aux noms de scripts définis dans Unicode.

(There should be a reference to the PropertyValueAliases.txt file in the UCD, and to the section in the UTS or its annexes that describes this UCD text file.)

It's true that the PropertyValueAliases.txt file in the UCD already contains long aliases for the shorter ISO-15924 codes:

(...)
sc ; Arab ; Arabic
sc ; Armn ; Armenian
(...)
sc ; Zyyy ; Common
(...)

It's also true that this same file does not list all possible values (the long value Inherited has no other alias defined in that file). Maybe this file in the UCD could also list the ISO-15924 numeric codes, but there's no obligation to add them there. The mere existence of the "sc ; ..." lines is enough to indicate that the preferred alias is the ISO-15924 code when it exists, so that Arab is preferred to Arabic, and Linb is preferred to Linear_B.

With regard to semantics, however, there's no difference between Arab and Arabic, or between Linb and Linear_B, meaning that these values are in the same value space. That's a good reason not to pollute that value space with new, unneeded long aliases. The long aliases only exist for legacy reasons, in Unicode as well, and the ID column in the ISO-15924 tables is mostly informative and should not be normative. This ID column in ISO-15924 already has the semantics of a Unicode Script Property Value Alias, but it could be any other alias needed for some other legacy application.

I just wonder why this column was placed there, before the required Date column, given that several legacy aliases, defined in standards other than Unicode, may have to be listed in ISO-15924. If you want to keep a master table for the long term, I would either drop this ID column or put it at the end of the row, after the Date field, so that more than one alias could be added to each code. For example, there are some numeric script ids defined in OpenType that could be listed as X_OT_17, if they are bound directly to standard script codes.
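The relationship between the short codes and the long aliases can be sketched in Python. This is only an illustration under stated assumptions: it relies on the "sc ; short ; long" line shape quoted in the message and on # starting a comment in UCD files; the function name and the sample list are made up here, and the Linb line is inferred from the message's mention of Linb and Linear_B.

```python
def parse_script_aliases(lines):
    """Return {short_code: long_alias} for the 'sc' (script) property lines."""
    aliases = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) >= 3 and fields[0] == "sc":
            aliases[fields[1]] = fields[2]
    return aliases


# Sample lines in the shape quoted in the message (not the full UCD file).
sample = [
    "# script property value aliases",
    "sc ; Arab ; Arabic",
    "sc ; Armn ; Armenian",
    "sc ; Linb ; Linear_B",
    "sc ; Zyyy ; Common",
]

aliases = parse_script_aliases(sample)
print(aliases["Zyyy"])
```

Looking up Zyyy in the resulting mapping yields Common, which is the pairing the ID column of the ISO 15924 tables is meant to show.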
Re: ISO 15924 draft fixes
From: Peter Constable [EMAIL PROTECTED]
> From: Philippe Verdy [mailto:[EMAIL PROTECTED]
> > No, the structure is correct; however, the text file was prepared by copy/pasting HTML text inserted in empty cells, namely the &nbsp; character reference (which contains a syntactic semicolon conflicting with the CSV separator).
> IMO, the structure of data is effectively determined by how processes will interpret the data. A process won't see 6 columns one of which contains &nbsp;. It will see seven columns one of which contains &nbsp.
> He's said the file has been fixed (though I don't know if he's posted the fixed file).

It's not fixed in the zipped archive linked from the ISO 15924/RA web pages (no change has occurred so far for this download), but it is fixed in the corrected archive that Michael indicated here:
http://www.unicode.org/iso15924/iso15924-fixes.zip
(This link is not officially published for now, because Michael wanted comments about it first; fortunately so, because it was still not perfect.)

Michael has started the corrections in the HTML tables 1 and 2, but table 3 (and its downloadable alternative plain-text version) and table 4 are still not corrected. I said this was a lot of files to change, but in fact it can all be done with one spreadsheet saved into 5 files. Michael could also have used a very basic database application (an Access, FileMaker, dBase or Paradox database with 1 table and 5 query-views, or other similar tools that every programmer or data maintainer should have in order to perform such basic tasks easily, without lots of manual editing and even without programming a script).
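The "one table, several views" idea can be sketched in a few lines of Python rather than a spreadsheet or database. This is a hedged illustration, not the RA's procedure: the record type, function names and sample rows are chosen here, the master layout Name;Code;N;Nom;ID;Date is the one given in the thread, and which sort key belongs to which numbered table beyond "by code" and "by English name" is an assumption.

```python
from collections import namedtuple

# Field order of the master file as described in the thread.
Script = namedtuple("Script", "name code number nom alias date")


def parse_master(lines):
    """Parse semicolon-separated records, skipping blanks and % comments."""
    records = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("%"):
            continue
        fields = line.split(";")
        if len(fields) != 6:
            raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
        records.append(Script(*fields))
    return records


def to_line(record, order):
    """Serialize one record with its fields in the given order."""
    return ";".join(getattr(record, field) for field in order)


master = parse_master([
    "Arabic;Arab;160;arabe;Arabic;2004-05-01",
    "Latin;Latn;215;latin;Latin;2004-05-01",
    "Ogham;Ogam;212;ogam;Ogham;2004-05-01",
    "Old Hungarian;Hung;176;ancien hongrois;;2004-05-01",
])

# Each published table is the same data under a different sort key.
by_code = sorted(master, key=lambda r: r.code)
by_number = sorted(master, key=lambda r: int(r.number))
by_name = sorted(master, key=lambda r: r.name)

code_order = ("code", "number", "name", "nom", "alias", "date")
for record in by_code:
    print(to_line(record, code_order))
```

Because every view is derived from the same parsed records, a correction made once in the master file reaches all generated tables, and a row with the wrong number of fields is rejected instead of being published.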
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED]
> At 00:05 +0200 2004-05-21, Philippe Verdy wrote:
> > This (below) is my own plain text version (still using the field and row order of table 3 by English name, instead of the order of table 1 by code)... Some entries are commented out with %.
> The RA has no intention whatsoever of making use of this file. Absolutely not.

OK. But you also argued incorrectly against one of my questions about the Common script ID, when I asked what Common and Inherited (defined in UAX #24) correspond to in ISO-15924. If I look at the standard Property Value Aliases defined in the UCD files, I see this rule:

sc ; Zyyy ; Common

Clearly it states that Common is an alias of the Zyyy script code. So the ISO-15924 tables should reflect it in their ID columns. For example, in Table 1 (list by code):

Zyyy;998;Code for undetermined script;codet pour écriture indéterminée;Common;2004-05-01

If your arguments about the usage of the ISO-15924 Zyyy;998 codes are valid, and that usage differs from the definition of the Common script ID in UAX #24, then there's a problem in the PropertyValueAliases.txt file of UCD 4.0, and the "sc ; Zyyy ; Common" line should be removed... This will require an amendment to Unicode.
Re: ISO 15924 draft fixes
I see some differences:

- For Georgian, your new file contains only:
Georgian (Mkhedruli);Geor;240;géorgien (mkhédrouli);Georgian;2004-05-18
But the previous version also contained, in one of the online tables:
Georgian (Asomtavruli);Geoa;242;géorgien (assomtavrouli);Georgian;2004-01-05

- Where is this line?:
Syloti Nagri;Sylo;316;sylotî nâgrî;;2004-09-01

Limbu has been adjusted to a more appropriate numeric code within South-Asian scripts (401 to 336).

I also think that the removal of duplicate rows for English or French name aliases was a good decision (after all, the aliases are already listed between parentheses).

I also think that splitting the line for the start and end codes of private scripts was a good idea.

----- Original Message -----
From: Michael Everson [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 10:40 PM
Subject: ISO 15924 draft fixes

The Registrar wishes to thank everyone who has taken an interest in the ISO 15924 data pages, and regrets the imperfections which are contained there. I am not sure how we will manage the generation of the pages, but it is clear that the base should be the plain-text document. I have made changes to the plain-text document and placed it, a draft Changes page, and the original plain-text document available at http://www.unicode.org/iso15924/iso15924-fixes.zip

I would appreciate it if interested persons could look this over and inform me if they find any further discrepancies between the two which are worth troubling about. Then we will proceed to generate the other files.

I deleted some duplicate lines: Ethiopic was on two lines, under Ethiopic and under Ge'ez. It seemed inappropriate to burden the tables with such duplication.

I added Coptic unilaterally.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
I also note that the list of changes (the HTML file in your archive) does not include the change of orthography in English names for consonants with dots below (such as Malayalam). As this ISO-15924 standard should make the English and French names unambiguous, their orthography is important.

----- Original Message -----
From: Michael Everson [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, May 19, 2004 10:40 PM
Subject: ISO 15924 draft fixes

The Registrar wishes to thank everyone who has taken an interest in the ISO 15924 data pages, and regrets the imperfections which are contained there. I am not sure how we will manage the generation of the pages, but it is clear that the base should be the plain-text document. I have made changes to the plain-text document and placed it, a draft Changes page, and the original plain-text document available at http://www.unicode.org/iso15924/iso15924-fixes.zip
Re: ISO 15924 draft fixes
At 01:08 +0200 2004-05-20, Philippe Verdy wrote:
> I see some differences:
> - For Georgian, your new file contains only:
> Georgian (Mkhedruli);Geor;240;géorgien (mkhédrouli);Georgian;2004-05-18
> But the previous version also contained, in one of the online tables:
> Georgian (Asomtavruli);Geoa;242;géorgien (assomtavrouli);Georgian;2004-01-05

That's correct. Asomtavruli has been deleted for now.

> - Where is this line?:
> Syloti Nagri;Sylo;316;sylotî nâgrî;;2004-09-01

A new script? Oh, it's in the old file and not in the new one? It, Coptic, and Phags-pa need to be in the list (they are all under ballot).

> Limbu has been adjusted to a more appropriate numeric code within South-Asian scripts (401 to 336).

Error corrected.

> I also think that the removal of duplicate rows for English or French name aliases was a good decision (after all, the aliases are already listed between parentheses).

No, it would allow a huge number of aliases. People can search the online files with command-F or control-F.

> I also think that splitting the line for the start and end codes of private scripts was a good idea.

It wasn't mine. I forget whose it was, but it makes the tables print more nicely.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
At 01:26 +0200 2004-05-20, Philippe Verdy wrote:
> I also note that the list of changes (the HTML file in your archive) does not include the change of orthography in English names for consonants with dots below (such as Malayalam). As this ISO-15924 standard should make the English and French names unambiguous, their orthography is important.

I understand that there are many problems with the online files; I made a comparison only with the plain-text files, and Malayalam was not spelled differently in that file, so I judged it irrelevant to the task of correcting the basic database.
--
Michael Everson * * Everson Typography * * http://www.evertype.com
Re: ISO 15924 draft fixes
From: Michael Everson [EMAIL PROTECTED]
> > - Where is this line?:
> > Syloti Nagri;Sylo;316;sylotî nâgrî;;2004-09-01
> A new script? Oh, it's in the old file and not in the new one? It, Coptic, and Phags-pa need to be in the list (they are all under ballot).

It was in the previous list (see the online HTML table 2).

Who decides on the addition of scripts in ISO-15924? I thought there was a separate technical committee and that you were just the bookkeeper of the decisions made by this subcommittee. It can't be Unicode's UTC alone, as there are already codes for bibliographic references that are not (and never will be) encoded separately in Unicode, so I suppose there are librarian or publisher members with whom you have to discuss, independently of the work of Unicode, which should only be the registrar for these codes. Maybe there's still no formal procedure, and for now the codes are maintainable without lots of administration.

Do you want a script that generates HTML tables from the reference text file? I'm not an expert in Perl, but my knowledge of PHP or awk is enough to create it. Or maybe a simple JavaScript could generate the presentation in browsers. I suggest you use a spreadsheet for now to allow sorting or moving columns.

One final note: there's still a missing closing parenthesis in the French name latin (variante brisée for the Fraktur script.
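The kind of script offered here could look like the following minimal Python sketch (Python is used only for illustration; the message mentions PHP, awk and JavaScript). The function name, header labels and sample rows are assumptions, and the input format is the semicolon-separated, %-commented layout described in the thread. Empty cells get their &nbsp; only at HTML generation time, so the entity never has to appear in the plain-text reference file.

```python
from html import escape


def rows_to_html(lines, header):
    """Render semicolon-separated records as a plain HTML table."""
    out = ["<table>"]
    out.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in header) + "</tr>")
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("%"):
            continue  # skip blank lines and % comments
        cells = []
        for field in line.split(";"):
            # Escape the text; fill empty cells for display purposes only.
            cells.append(f"<td>{escape(field) if field else '&nbsp;'}</td>")
        out.append("<tr>" + "".join(cells) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


header = ["English name", "Code", "N", "Nom français", "ID", "Date"]
sample = [
    "% Alphabetical list of English script names",
    "Arabic;Arab;160;arabe;Arabic;2004-05-01",
    "Batak;Batk;365;batak;;2004-05-01",
]

page = rows_to_html(sample, header)
print(page)
```

Keeping the text file as the single source and treating the HTML as generated output is the design choice this illustrates; sorting for the different tables would happen before rendering.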
Re: ISO 15924 draft fixes
At 03:28 +0200 2004-05-20, Philippe Verdy wrote:
> It was in the previous list (see the online HTML table 2).

What does that refer to?

> Who decides on the addition of scripts in ISO-15924?

The ISO 15924 RA-JAC.

> I thought there was a separate technical committee and that you were just the bookkeeper of the decisions made by this subcommittee.

With regard to Coptic, and the need to sort out the initial difficulties we are having, it seems prudent that I do what is necessary to correct faults. It is unlikely that the RA-JAC will object to this.

> It can't be Unicode's UTC alone, as there are already codes for bibliographic references that are not (and never will be) encoded separately in Unicode, so I suppose there are librarian or publisher members with whom you have to discuss, independently of the work of Unicode, which should only be the registrar for these codes. Maybe there's still no formal procedure, and for now the codes are maintainable without lots of administration.

Read the standard.

> Do you want a script that generates HTML tables from the reference text file?

No. We will handle that in due course.

> One final note: there's still a missing closing parenthesis in the French name latin (variante brisée for the Fraktur script.

I think that has been corrected by now.
--
Michael Everson * * Everson Typography * * http://www.evertype.com