Hi,

> About working all the time in utf-8, do you mean (for example)
> converting utf-16 or anything else to utf-8 then working in utf-8 ? Or
> only supporting utf-8 files?

What I meant is that if Source-highlight were to use only one of the various Unicode encodings internally, then I would vote for UTF-8. It is the most common one (except perhaps in CJK countries), so in most cases neither an external application nor Source-highlight's frontend would need to convert between encodings.

I am not familiar with Source-highlight's internals, so I cannot tell you which choice is best architecture-wise. Nevertheless, I see two broad options:

a) Parameterise the encoding, so that the internal functions that operate on strings change depending on whether we are dealing with single-byte, UTF-8, UTF-16, etc.

b) Use only one Unicode encoding internally (e.g. UTF-8), and make it the responsibility of the frontend or an external application to convert to and from this encoding. If whoever implements this option is not comfortable with variable-length encodings, then by all means use a fixed-length encoding like UTF-32 (aka UCS-4).

Cheers,
Dario Teixeira

_______________________________________________
Help-source-highlight mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/help-source-highlight
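
P.S. To make option (b) a bit more concrete, here is a minimal sketch in Python (not Source-highlight's actual code, which I have not looked at). The function names to_internal_utf8, from_internal_utf8 and code_point_at_utf32 are hypothetical, made up for this illustration. The idea is that the frontend converts whatever comes in to UTF-8 once, the core only ever sees UTF-8, and the result is converted back on the way out. The last function shows why a fixed-length encoding like UTF-32 is easier to index.

```python
def to_internal_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Frontend step (hypothetical): convert input bytes to the internal UTF-8 form."""
    return raw.decode(source_encoding).encode("utf-8")


def from_internal_utf8(data: bytes, target_encoding: str) -> bytes:
    """Frontend step (hypothetical): convert internal UTF-8 back to the output encoding."""
    return data.decode("utf-8").encode(target_encoding)


def code_point_at_utf32(data: bytes, index: int) -> str:
    """With UTF-32, code point number `index` always sits at byte offset 4 * index."""
    return data[4 * index:4 * index + 4].decode("utf-32-le")


# "naive" with a diaeresis on the i, a space, and a euro sign: 7 code points.
sample = "na\u00efve \u20ac"

# Pretend the input file arrived as UTF-16 (Python adds a byte-order mark here).
utf16_input = sample.encode("utf-16")
internal = to_internal_utf8(utf16_input, "utf-16")

# UTF-8 is variable-length: 7 code points take 10 bytes in this case.
utf8_length = len(internal)

# UTF-32 is fixed-length: 7 code points always take 7 * 4 = 28 bytes.
utf32_data = sample.encode("utf-32-le")
utf32_length = len(utf32_data)

# Direct indexing works in UTF-32 without scanning from the start.
last_char = code_point_at_utf32(utf32_data, 6)

# Converting back out to another encoding loses nothing.
round_trip = from_internal_utf8(internal, "utf-16-le").decode("utf-16-le")
```

With UTF-8 internally, finding the n-th character means walking the bytes from the beginning, which is the part that people uncomfortable with variable-length encodings tend to get wrong. UTF-32 trades memory for that simplicity.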
