Hi,

Thanks for the clear explanation and the simple solution to the problem.

Best regards,
Ranier Vilela

________________________________________
From: Theron <[email protected]>
Sent: Wednesday, 18 July 2018 21:03
To: [email protected]
Subject: Re: [Iup-users] Use utf-8 source encoding rather than ISO8859-1

Returning to the original issue of string encoding in source: This needn't be a Windows/non-Windows problem, or UTF-8 vs. UTF-16, or any other conflict over platform and standard compatibility. I encountered the string encoding problem just last month when I worked on porting IUP to a system using the Clang compiler. The following is more a summary of the problem and its workaround than anything that hasn't already been discovered or acknowledged in this thread.

The intent in iup_str.c and iup_strmessage.c appears to be that the exact byte or byte string between quotes be copied into the constant string data compiled into the library. The compiler should not try to reinterpret the byte sequences in any way; that is the job of IUP, the OS, and/or third-party libraries at run-time. However, not all compilers expect character and string literals to be used in this way. The lowest common denominator of the various ways a compiler may try to interpret the string may reasonably be assumed to be ASCII. Therefore any bytes in the range 128-255 should be represented using escapes rather than pasted directly into the source.

So although Cloud Wu suggested converting the source encoding "from ISO8859-1 to UTF-8", the intent is to restrict the source itself to ASCII (no UTF-8!), while preserving the raw byte strings the source intends to represent in whichever ISO8859-1/UTF-8/UTF-16 encoding is already used. If done correctly, this would not change the compiled library in any way.

The remaining issue, of course, is this:

On 07/18/18 07:06, 云风 Cloud Wu wrote:
> Antonio Scuri <[email protected]> wrote on Wed, 18 Jul 2018 at 6:15 PM:
>> Although being harmless, it turns maintenance of these strings more difficult. I wouldn't like that.
>
> For iupmatex_units.c, I agree it turns maintenance more difficult. I suggest using a macro to improve it.

A potentially cleaner solution is to maintain a single master file of string constants, encoded in whatever encoding is preferred for editing, and to generate the relevant parts of iup*_str*, iup*_units*, etc. programmatically.

In porting IUP to FreeBSD, where Clang is the default and preferred compiler, I used small conversion programs to convert the string literals as needed. These are attached. Once compiled, the conversion utilities read source from stdin and write adjusted source to stdout.

Should the IUP project insist on continuing to release iup_str.c and iup_strmessage.c with the existing mixed-encoding literals, these conversion utilities would need to become part of the routine procedure the maintainer of the FreeBSD port follows to import upstream changes. (This also applies to a port of IUP to macOS, another system where Clang is the default. More on this later...)

Instead of maintaining this conversion scheme independently, it seems sensible to try to integrate it with the master IUP project. In the interest of making IUP more portable across diverse platforms, I would be happy to help with this.

Theron
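
The attached utilities are not reproduced in this message. As a rough sketch only of the kind of conversion described above, the following C code replaces every byte in the range 128-255 with an octal escape, so that the source becomes pure ASCII while a string literal still compiles to the same bytes. The names escape_high_bytes and filter_stream are hypothetical, chosen for this illustration, and are not taken from IUP or from the attached programs. The sketch is also simpler than a real converter is likely to be: it escapes such bytes everywhere in the input, including inside comments, rather than only inside literals.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of IUP or the attached utilities.
 * Copies n bytes from in to out as a NUL-terminated string, replacing
 * each byte >= 0x80 with a three-digit octal escape such as \347.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out (of size cap) is too small. */
int escape_high_bytes(const unsigned char *in, size_t n, char *out, size_t cap)
{
    size_t used = 0;

    if (cap == 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (in[i] >= 0x80) {
            /* Needs 4 characters plus room for the final NUL. */
            if (used + 4 >= cap)
                return -1;
            used += (size_t)snprintf(out + used, cap - used,
                                     "\\%03o", (unsigned)in[i]);
        } else {
            /* ASCII byte: copied unchanged. */
            if (used + 1 >= cap)
                return -1;
            out[used++] = (char)in[i];
        }
    }
    out[used] = '\0';
    return (int)used;
}

/* Hypothetical filter loop: reads bytes from in and writes the escaped
 * form to out. Returns 0 on success, -1 on an I/O error. */
int filter_stream(FILE *in, FILE *out)
{
    int c;

    while ((c = fgetc(in)) != EOF) {
        unsigned char byte = (unsigned char)c;
        char buf[8];
        int len = escape_high_bytes(&byte, 1, buf, sizeof buf);

        if (len < 0 || fwrite(buf, 1, (size_t)len, out) != (size_t)len)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

A main function that simply calls filter_stream(stdin, stdout) would turn this into a stdin-to-stdout filter of the kind described above. Octal escapes are used here instead of \x escapes because a hexadecimal escape has no length limit in C: "\xE7a" would absorb the following 'a' as another hex digit, whereas an octal escape always stops after three digits.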
