On Saturday 05 January 2013 11:30:42 Jonas Maebe wrote:
[...]
>
> For example, I said that basically nothing changed in 2.7.x compared to
> 2.6.x, except that certain string constants are no longer automatically
> converted to utf-16 at compile time, and then you ask "Or should we not
> touch the theme strings and FPC anymore?". Since basically nothing changed
> except for a few less blind auto-conversions at compile time, why should
> you no longer be able to touch anything anymore?
>
> Let me repeat: your string constants will be parsed by the compiler into
> character sequences with exactly the same content in both 2.6.x and 2.7.x
> (and with content I mean that if they would be converted to the same code
> page in 2.6.x and in 2.7.x, you would end up with exactly the same binary
> data). Whether or not they contain character literals whose value is >#127
> in the source code's code page, or explicit "#xx", "#xxx" etc expressions
> has no influence, nothing changed in the compiler in that account.
>
> The *only* difference is that the compiler can now internally represent
> ansistrings with arbitrary code pages, and as a result the aforementioned
> character sequences may now be stored internally in the compiler in a
> different format, and also stored in the program in a different format if
> that can avoid conversions at run time. In particular, character sequences
> are no longer all converted immediately/by default/under all circumstances
> to UTF-16 in case characters >#127 need to be interpreted according to a
> particular code page (i.e., if a {$codepage xxx} directive is present).
>
> The compiler will now only convert such character sequences to UTF-16,
> still at compile time (just like it was able to do in 2.6.x), if it is
> actually assigned to an UTF-16-encoded string, passed to an UTF-16
> parameter etc. And the compiler will also convert it to another ansistring
> code page in case the character sequence appeared in e.g. a file with
> {$codepage utf-8} and is then assigned to a variable whose type is declared
> as "type ansistring(850)".
>

Thank you very much for the detailed explanation. What I could not find in
all the answers (probably due to my ignorance of the English language) is
this: does #n mean a UTF-16 code unit, as in Delphi XE3, or does it denote
something else? You write:
> Whether or not they contain character literals whose value is >#127
> in the source code's code page, or explicit "#xx", "#xxx" etc expressions
> has no influence, nothing changed in the compiler in that account.

Assuming {$codepage utf-8}, how should we enter Russian character constants
in #n form? And how should we enter them in #n form if {$codepage 8859-5} is
defined instead?

And again, sorry for the impertinence: how do resource strings fit into the
string handling scenario of Free Pascal trunk?

Martin
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel