On 29 March 2010 13:35, Paul Gilmartin <paulgboul...@aim.com> wrote:
> The question isn't whether the IBM-1047 glyphs exist in
> UTF-8, but whether it's possible to perform the lookup
> mechanically. Your suggestions require artificial
> intelligence since it appears no natural intelligence
> was applied in coding the charmaps.

I'm not sure the names are normative. Certainly they're not consistent from one scheme to another. The IBM scheme (CDRA) gives each character a name along the lines of LA010000 for LATIN SMALL LETTER A, and Unicode has similar but different names. Some of the Unicode (and maybe the IBM) names have changed over time, though I think Unicode has now stopped changing them, even when they are egregiously wrong. The UNIX standard (not sure which one that is; POSIX, I imagine) seems to be based on a very old version of the IBM one, with short names like LA01.

I'm still not quite sure what you are trying to accomplish, but if you want to look things up, some kind of mapping between the CDRA and the Unicode names is the place to start. Some of these charts are online, for example

http://www-304.ibm.com/jct01003c/software/globalization/gcgid/latin.html

but of course one would like them to be on the running system.

Tony H.
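
As a rough illustration of the kind of mechanical lookup discussed above, here is a minimal Python sketch that maps CDRA-style names to Unicode names and then to characters. The table is a hypothetical two-entry excerpt: only LA010000 comes from this message, the LA020000 entry is my assumption, and a real table would have to be built from the IBM charts. The function names are made up for the example, and treating the short POSIX-style name as the first four characters of the long one is also an assumption, not something I have checked against the charmaps.

```python
import unicodedata

# Hypothetical excerpt of a CDRA (GCGID) -> Unicode name table.
# Only LA010000 is taken from the message; LA020000 is an assumed entry.
GCGID_TO_UNICODE_NAME = {
    "LA010000": "LATIN SMALL LETTER A",
    "LA020000": "LATIN CAPITAL LETTER A",
}


def gcgid_to_char(gcgid):
    """Return the Unicode character for a GCGID, via its Unicode name."""
    unicode_name = GCGID_TO_UNICODE_NAME[gcgid]
    # unicodedata.lookup raises KeyError if the name is not a Unicode name.
    return unicodedata.lookup(unicode_name)


def short_name(gcgid):
    """Assumed short form: the first four characters of the GCGID."""
    return gcgid[:4]


def char_to_gcgid(char):
    """Reverse lookup: find the GCGID whose Unicode name matches char."""
    unicode_name = unicodedata.name(char)
    for gcgid, name in GCGID_TO_UNICODE_NAME.items():
        if name == unicode_name:
            return gcgid
    return None  # not present in this (partial) table
```

With a complete table, `gcgid_to_char("LA010000")` gives the character and `char_to_gcgid` goes the other way. The Unicode side is entirely mechanical, so the only hand-built part is the name table itself.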