> * ASCII is a well defined standard, and all ASCII is UTF-8 (but the
>   converse is not true).
>
> * The command iconv -f UTF-8 -t ASCII file will fail unless all the
>   characters in file are already ASCII. Hence it isn't a very useful
>   command.
>
> * The coding standards say that we should prefer ASCII wherever
>   possible. If it is not possible, then we should use UTF-8.
>
> I think that Therese is saying that there are some files which are using
> UTF-8 when ASCII would have sufficed.
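To make the first two quoted points concrete, here is a minimal Python sketch. It is not part of the original exchange; the helper names `is_pure_ascii` and `to_ascii_strict` are made up for illustration. It checks that every byte is below octal 200, and shows that a strict UTF-8 to ASCII conversion succeeds only when the input is already ASCII.

```python
def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte is below 0x80 (octal 200), i.e. 7-bit ASCII."""
    return all(b < 0x80 for b in data)


def to_ascii_strict(data: bytes) -> bytes:
    """Decode as UTF-8, then re-encode as ASCII.

    Raises UnicodeEncodeError if any character falls outside ASCII.
    """
    return data.decode("utf-8").encode("ascii")


# Hypothetical sample inputs: one 7-bit clean, one with accented letters.
plain = "Therese".encode("utf-8")
accented = "Thérèse".encode("utf-8")

for sample in (plain, accented):
    try:
        to_ascii_strict(sample)
        print(sample, "converts cleanly; pure ASCII:", is_pure_ascii(sample))
    except UnicodeEncodeError:
        print(sample, "cannot be converted; pure ASCII:", is_pure_ascii(sample))
```

For the pure-ASCII sample the conversion returns the very same bytes, which is the sense in which all ASCII is already UTF-8.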

   Thanks. This is exactly what I meant. (Thérèse forwarded my question
   from webmasters to this list.)

   ASCII is the common subset of both UTF-8 and most single-byte
   charsets, especially ISO-8859-X. Coding 'maintain.txt' and
   'maintain.info' in pure ASCII (as was done until some time ago) makes
   them backwards compatible with almost all machines and OSs at no
   cost.

Can you define what you mean by "pure ASCII"? What is "unpure ASCII"?
Are you referring to ASCII (which is 7 bits) represented in an 8-bit
byte, with the high bit set being unpure and the high bit clear being
pure? I.e., chars >= #o200 and < #o400 being unpure, and chars < #o200
being "pure"?

Initially you mentioned issues viewing things on terminals. That has
little to do with not using the eighth bit in a byte, since about one
third of the ASCII table is not viewable on dumb terminals. And many
such combinations are in the Info format, and have been since before
Info was used by the GNU project.

Can you clarify, give examples, etc., of what exactly you are having an
issue with? Seeing that ASCII is a subset of UTF-8, and UTF-8 is by
definition compatible with anything that represents a character as an
8-bit byte, I cannot see what system it wouldn't work on, so a better
description of where it doesn't work would be helpful to understand the
issue.

Can you clarify what issues you are seeing, on what systems, and what
architectures?

   This is especially important for 'maintain.info' because it can't be
   converted (the tag table becomes incorrect). Any user of a
   single-byte terminal will need to rebuild 'maintain.info' from
   source (as I need to do to see it on one of my machines).

Why should it be converted? Info files are meant for an Info viewer; it
would be the task of the Info viewer to adjust its locale.

I don't know what a single-byte terminal is, but anything that
represents a character as an 8-bit byte will handle UTF-8 just fine --
this includes ancient VT100s, so hearing where you have issues with
UTF-8 would be helpful.

/Alfred
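
On the quoted remark that the tag table becomes incorrect after conversion, a small Python sketch may help. It assumes the tag table records byte offsets into the file, which is my understanding of the Info format rather than something stated in this thread. The sample text and the `Node: Top` marker are made up for illustration and are not taken from 'maintain.info'.

```python
# Hypothetical file content: some accented text followed by a marker
# whose byte position a tag table would record.
text = "Thérèse\nNode: Top\n"
marker = b"Node: Top"

utf8_bytes = text.encode("utf-8")         # é and è take two bytes each
latin1_bytes = text.encode("iso-8859-1")  # é and è take one byte each

utf8_offset = utf8_bytes.find(marker)
latin1_offset = latin1_bytes.find(marker)

print("offset in UTF-8 file:     ", utf8_offset)    # 10
print("offset in ISO-8859-1 file:", latin1_offset)  # 8
```

Every non-ASCII character ahead of a node shifts that node's position, so offsets computed for the UTF-8 file no longer point at the right place after a byte-level conversion to a single-byte charset.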
