Hi,
Thanks for the clear explanation and the simple solution to the problem.

Best regards,
Ranier Vilela
________________________________________
From: Theron <[email protected]>
Sent: Wednesday, July 18, 2018 21:03
To: [email protected]
Subject: Re: [Iup-users] Use utf-8 source encoding rather than ISO8859-1

Returning to the original issue of string encoding in source:

This needn't be a Windows/non-Windows problem, or UTF-8 vs. -16, or any other 
conflict over platform and standard compatibility.

I encountered the string encoding problem just last month when I worked on 
porting IUP to a system using the Clang compiler; the following is more of a 
summary of the problem and its workaround than anything which hasn't already 
been discovered or acknowledged in this thread:

The intent in iup_str.c and iup_strmessage.c appears to be that the exact byte 
or byte string between quotes be copied into the constant string data compiled 
into the library: the compiler should not try to reinterpret the byte sequences 
in any way; this is the job of IUP, the OS, and/or third-party libraries at 
run-time.

However, not all compilers expect to use character and string literals in this 
way.  The lowest common denominator among the ways a compiler may interpret a 
string literal can reasonably be assumed to be ASCII; therefore, any bytes in 
the range 128-255 should be represented using escapes rather than pasted 
directly into the source.

So although Cloud Wu suggested converting source encoding "from ISO8859-1 to 
UTF-8", the intent is to restrict source itself to ASCII (no UTF-8!), while 
preserving the raw byte strings the source intends to represent in whichever 
ISO8859-1/UTF-8/UTF-16 encoding is already used.

If done correctly, this would not change the compiled library in any way.
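
As a minimal sketch of what "ASCII source, same bytes" means, the literals 
below spell out high bytes with escapes.  The words and variable names are 
illustrative only and are not taken from the IUP sources.  Octal escapes are 
used because they stop after three digits; a hex escape keeps consuming hex 
digits, so it has to be split off from a following letter such as 'c'.

```c
#include <string.h>

/* "ação" in ISO8859-1: c-cedilla = 0xE7 (octal 347), a-tilde = 0xE3 (343) */
const char word_latin1[] = "a\347\343o";

/* The same word in UTF-8: c-cedilla = C3 A7, a-tilde = C3 A3 */
const char word_utf8[] = "a\303\247\303\243o";

/* "década" in ISO8859-1 (e-acute = 0xE9, octal 351).
   Writing "d\xE9cada" would not work: the hex escape would swallow
   the following hex digits "cada" as well. */
const char decade_octal[] = "d\351cada";

/* Hex form, split into two adjacent literals to end the escape */
const char decade_hex[] = "d\xE9" "cada";
```

Either way the source file contains only ASCII, and the compiled string data 
holds exactly the bytes of the chosen run-time encoding.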

The remaining issue, of course, is this:

On 07/18/18 07:06, 云风 Cloud Wu wrote:
> On Wed, Jul 18, 2018 at 6:15 PM, Antonio Scuri <[email protected]> wrote:
>>   Although being harmless, it turns maintenance of these strings more
>> difficult. I wouldn't like that.
>
> For iupmatex_units.c, I agree it turns maintenance more difficult. I suggest
> using a macro to improve it.


A potentially cleaner solution is to maintain a single master file of string 
constants, encoded in whatever encoding is preferred for editing, and to 
generate the relevant parts of iup*_str*, iup*_units*, etc. programmatically.
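
One possible building block for such a generator is sketched below, under the 
assumption that the generator has already read a string from the master file 
and converted it to the target encoding.  The function name emit_c_literal is 
hypothetical; it only formats raw bytes as an ASCII-only C string literal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the bytes of `text` into `out` as a quoted,
   ASCII-only C string literal.  Returns the literal's length, or -1 if
   `out` (capacity `cap`) is too small. */
int emit_c_literal(const char *text, char *out, size_t cap)
{
    const unsigned char *p = (const unsigned char *)text;
    size_t len = 0;

    if (cap < 3)                      /* room for two quotes and the NUL */
        return -1;
    out[len++] = '"';
    for (; *p; p++) {
        char piece[5];
        int n;
        if (*p == '"' || *p == '\\')
            n = snprintf(piece, sizeof piece, "\\%c", *p);
        else if (*p < 0x20 || *p >= 0x7F)   /* control or non-ASCII byte */
            n = snprintf(piece, sizeof piece, "\\%03o", (unsigned)*p);
        else
            n = snprintf(piece, sizeof piece, "%c", *p);
        if (len + (size_t)n + 2 > cap)      /* keep room for '"' and NUL */
            return -1;
        memcpy(out + len, piece, (size_t)n);
        len += (size_t)n;
    }
    out[len++] = '"';
    out[len] = '\0';
    return (int)len;
}
```

A generator built around this would print one such literal per master-file 
entry into the generated source, so the hand-edited file can stay in any 
convenient encoding.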

In porting IUP to FreeBSD, where the Clang compiler is the default and 
preferred compiler, I used small conversion programs to convert the string 
literals as needed.  These are attached.  Once compiled, the conversion 
utilities read source from stdin and write adjusted source to stdout.
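
The attached utilities are not reproduced in this message.  Purely as an 
illustration of the stdin-to-stdout approach, and not as the actual programs, 
a naive filter could look like the sketch below.  The names escape_byte and 
filter_stream are hypothetical, and the sketch assumes every byte in the range 
128-255 may be replaced by a three-digit octal escape.

```c
#include <stdio.h>

/* Write the ASCII-only spelling of byte c into out (at least 5 chars).
   Returns the number of characters written, not counting the NUL. */
int escape_byte(unsigned char c, char out[5])
{
    if (c < 0x80) {                 /* plain ASCII: copy unchanged */
        out[0] = (char)c;
        out[1] = '\0';
        return 1;
    }
    /* high byte: backslash plus three octal digits, e.g. 0xE7 -> \347 */
    return snprintf(out, 5, "\\%03o", (unsigned)c);
}

/* Copy src to dst, escaping every byte in the range 128-255. */
void filter_stream(FILE *src, FILE *dst)
{
    char buf[5];
    int c;
    while ((c = getc(src)) != EOF) {
        int n = escape_byte((unsigned char)c, buf);
        fwrite(buf, 1, (size_t)n, dst);
    }
}
```

A main function would simply call filter_stream(stdin, stdout).  Note that 
this naive version also rewrites high bytes inside comments, where the escape 
is just text; a more careful filter would track whether it is inside a string 
or character literal.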

Should the IUP project insist on continuing to release iup_str.c and 
iup_strmessage.c with the existing mixed-encoding literals, these conversion 
utilities would need to become part of the routine procedure by which the 
maintainer of the FreeBSD port imports upstream changes.  (This also applies to 
a port of IUP to macOS, another system where Clang is the default.  More on 
this later...)  Instead of maintaining this conversion scheme independently, it 
seems sensible to try to integrate it with the master IUP project.  In the 
interest of making IUP more portable across diverse platforms, I would be happy 
to help with this.

Theron
