On 11/25/2014 09:39 PM, Hans-Peter Diettrich wrote:
The Delphi model already broke that claimed type safety, by omitting
conversions of RawByteString results, for speed optimization. That's
dangerous, because the compiler can *only* check the static type of
string variables, but not the dynam
Mattias Gaertner schrieb:
On Tue, 25 Nov 2014 14:49:52 +0100
Felipe Monteiro de Carvalho wrote:
On Tue, Nov 25, 2014 at 2:45 PM, Mattias Gaertner
wrote:
Retype "Char" to "String" and the compiler will bark. For example in
Graphics.
What about changing to WideChar then?
If you mean unit
Mattias Gaertner schrieb:
On Tue, 25 Nov 2014 13:10:26 +0100
Hans-Peter Diettrich wrote:
[...]
Maybe I don't understand the question, but it seems to me this is
documented where static-, dynamic cp and rawbytestring are explained.
More concrete questions:
How can a user be sure that a strin
Mattias Gaertner schrieb:
On Tue, 25 Nov 2014 11:53:00 +0100
Hans-Peter Diettrich wrote:
[...]
Correction: *This* Char type needs to be extended.
Please specify.
The ThousandSeparator type is "Char", which does not work with
Russian in UTF-8. Well, at least if you want the non breakable sp
2014-11-25 14:45 GMT+01:00 Mattias Gaertner :
> On Tue, 25 Nov 2014 11:53:00 +0100
> Hans-Peter Diettrich wrote:
>
> >[...]
> > > Correction: *This* Char type needs to be extended.
> >
> > Please specify.
>
> The ThousandSeparator type is "Char", which does not work with
> Russian in UTF-8. Well,
On Tue, Nov 25, 2014 at 3:14 PM, Mattias Gaertner
wrote:
>> What about changing to WideChar then?
>
> If you mean unit Graphics: It checks for ASCII characters. So a change
> to WideChar would add implicit conversions without any gain.
>
> In case of ThousandSeparator:
> That would probably be suf
On Tue, 25 Nov 2014 14:49:52 +0100
Felipe Monteiro de Carvalho wrote:
> On Tue, Nov 25, 2014 at 2:45 PM, Mattias Gaertner
> wrote:
> > Retype "Char" to "String" and the compiler will bark. For example in
> > Graphics.
>
> What about changing to WideChar then?
If you mean unit Graphics: It chec
On Tue, Nov 25, 2014 at 2:45 PM, Mattias Gaertner
wrote:
> Retype "Char" to "String" and the compiler will bark. For example in
> Graphics.
What about changing to WideChar then?
--
Felipe Monteiro de Carvalho
--
___
Lazarus mailing list
Lazarus@lists
On Tue, 25 Nov 2014 11:53:00 +0100
Hans-Peter Diettrich wrote:
>[...]
> > Correction: *This* Char type needs to be extended.
>
> Please specify.
The ThousandSeparator type is "Char", which does not work with
Russian in UTF-8. Well, at least if you want the non breakable space
instead of the nor
On Tue, 25 Nov 2014 13:10:26 +0100
Hans-Peter Diettrich wrote:
>[...]
> > Maybe I don't understand the question, but it seems to me this is
> > documented where static-, dynamic cp and rawbytestring are explained.
>
> More concrete questions:
>
> How can a user be sure that a string parameter i
Mattias Gaertner schrieb:
On Mon, 24 Nov 2014 22:15:29 +0100
Hans-Peter Diettrich wrote:
[...]
The Delphi (and FPC) encoding model allows for strings of different
static (declared) and dynamic (true content) encoding, see the special
handling of RawByteString (Wiki).
So far it's not a good
Mattias Gaertner schrieb:
On Mon, 24 Nov 2014 22:53:44 +0100
Hans-Peter Diettrich wrote:
Graeme Geldenhuys schrieb:
How is ThousandSeparator and DecimalSeparator supposed to work in
TFormatSettings? If you switched the RTL to UTF-8 or UTF-16 a Russian
thousand separator (4-byte non-breaking
On 11/24/2014 10:15 PM, Hans-Peter Diettrich wrote:
I'm missing documentation for working safely (and efficiently) with
such irregular strings. Most probably none of the FPC (and Delphi)
developers ever noticed how users are left alone with this problem :-(
Hmm. In the fpc-devel, lazarus-de
On 2014-11-24 23:13, Mattias Gaertner wrote:
> In case of the new LCL mode we can extend the "LCL Unicode support" page.
I don't know if that is the correct place though. The "not implemented
yet" features affect other toolkits, console and web applications too,
not just LCL based ones.
So for no
On Mon, 24 Nov 2014 16:40:06 +
Graeme Geldenhuys wrote:
>[...]
> Where should we report this? Mantis or Unicode page of the Wiki?
On second thought, a programmer needs to know what might fail and the
alternative/workaround. The latter depends on settings.
In case of the new LCL mode we can
On Mon, 24 Nov 2014 22:53:44 +0100
Hans-Peter Diettrich wrote:
> Graeme Geldenhuys schrieb:
>
> > How is ThousandSeparator and DecimalSeparator supposed to work in
> > TFormatSettings? If you switched the RTL to UTF-8 or UTF-16 a Russian
> > thousand separator (4-byte non-breaking white space ch
On Mon, 24 Nov 2014 22:15:29 +0100
Hans-Peter Diettrich wrote:
>[...]
> The Delphi (and FPC) encoding model allows for strings of different
> static (declared) and dynamic (true content) encoding, see the special
> handling of RawByteString (Wiki).
>
> So far it's not a good idea to simply *as
Graeme Geldenhuys schrieb:
How is ThousandSeparator and DecimalSeparator supposed to work in
TFormatSettings? If you switched the RTL to UTF-8 or UTF-16 a Russian
thousand separator (4-byte non-breaking white space character) for
example will not fit into a Char type.
The Char type is quite us
luiz americo pereira camara schrieb:
When DefaultSystemCodePage is CP_ACP the variable S will have UTF-8
content but the encoding will be ACP (in my case 1252), just
as it is today.
With DefaultSystemCodePage as CP_UTF8 both content and code page will match.
The Delphi (and FPC) encoding
On 2014-11-24 16:36, Mattias Gaertner wrote:
> It has not yet been converted.
Many thanks for confirming that.
> We can help the FPC team by collecting all places.
Where should we report this? Mantis or Unicode page of the Wiki?
Regards,
- Graeme -
--
fpGUI Toolkit - a cross-platform GUI
On Mon, 24 Nov 2014 16:25:15 +
Graeme Geldenhuys wrote:
>[...]
> Or is TFormatSettings just something that hasn't yet been converted to
> be Unicode friendly?
It has not yet been converted.
We can help the FPC team by collecting all places.
Mattias
--
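To make the limitation discussed above concrete, here is a minimal sketch for FPC 2.7.1 or later. It assumes the separator in question is U+00A0 NO-BREAK SPACE, whose UTF-8 encoding is the two bytes C2 A0. The program name and the variable Nbsp are made up for illustration and do not come from the thread.

```pascal
program SeparatorSizeDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  // Hypothetical variable holding the raw UTF-8 bytes of U+00A0.
  Nbsp: RawByteString;
begin
  Nbsp := #$C2#$A0;
  // In this mode Char is a single byte, and so is the separator field.
  WriteLn('SizeOf(ThousandSeparator) = ',
    SizeOf(DefaultFormatSettings.ThousandSeparator));
  // Length of a RawByteString counts bytes, not characters.
  WriteLn('UTF-8 bytes needed        = ', Length(Nbsp));
end.
```

If the second number is larger than the first, the separator cannot be stored in the field without loss, which is the problem raised in the thread.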
On 2014-11-22 16:38, Michael Van Canneyt wrote:
> The exact behaviour of the RTL is controlled by a couple of variables:
> DefaultSystemCodePage, DefaultFileSystemCodePage ,
> DefaultRTLFileSystemCodePage.
I've read the updated wiki page, but still confused about something...
TFormatSettings =
On Mon, 24 Nov 2014 12:45:54 -0300
luiz americo pereira camara wrote:
> 2014-11-24 8:15 GMT-03:00 Mattias Gaertner :
>[...]
> > This works with or without {$codepage utf8}:
> >
> > S := 'João'; // constant to (Ansi or Short)string
> >
>
> Without {$codepage utf8}
> When DefaultSystemCodePage is
2014-11-24 8:15 GMT-03:00 Mattias Gaertner :
> On Sun, 23 Nov 2014 21:37:56 -0300
> luiz americo pereira camara wrote:
>
> > The attached program shows how data loss can occur
>
> The program uses writeln, which converts to console CP.
> When you save the strings to a file you can see what they co
On 2014-11-24 10:52, Michael Schnell wrote:
> I don't know the internals of the program(s). It's a huge system and
> does anything that somehow might be possible :-) .
Luckily you have everything unit tested, right? So it would simply be a
case of running the test suite to see what works and what
Please don't start a UTF war again.
This has been discussed at length and a zillion times.
Mattias
--
___
Lazarus mailing list
Lazarus@lists.lazarus.freepascal.org
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
On 11/24/2014 02:50 PM, Hans-Peter Diettrich wrote:
code, the user should be allowed to use the string encoding (and byte
count per character) he finds the most convenient for his application.
I'm not sure what exactly you mean here.
Here I meant that for a *new project* the user might be wil
Am 24.11.2014 14:55 schrieb "Hans-Peter Diettrich" :
> Please note that until now Windows did the Ansi to UTF conversions
> itself, in every API call with strings involved. If this was not noticed
> before, the conversions won't be noticeable afterwards as well.
This is something that one definitely s
On 11/24/2014 02:19 PM, Hans-Peter Diettrich wrote:
A move to UTF-16 instead will only favor Windows,
Regarding the RTL interface, you of course are right.
Doing the user software with UTF-16 instead of UTF-8 strings, in many
cases (but of course not perfectly) allows for keeping old-style 1
Michael Schnell schrieb:
On 11/23/2014 07:52 PM, Felipe Monteiro de Carvalho wrote:
Well, the first reports of what the unicode rtl would look like were
pretty scary: Total break of the string part of millions of lines of
code that people have written with Lazarus for years.
That is why I stopped re
On Sun, 23 Nov 2014 18:27:12 -0300
luiz americo pereira camara wrote:
> 2014-11-20 13:21 GMT-03:00 Mattias Gaertner :
>[...]
> Please test and tell what you find out.
> >
> >
> The FormatSettings fields are still encoded with System Code Page
> regardless of DefaultSystemCodePage value.
>
> Whil
On Mon, 24 Nov 2014 13:12:04 +0100
Michael Schnell wrote:
> On 11/24/2014 12:01 PM, Juha Manninen wrote:
> > See the request from Mattias : "Please test and tell what you find out."
>
> I have not enough knowledge to be able to patch the compiler :-(
I asked for testing compiling with -dEnableU
On 11/24/2014 12:01 PM, Juha Manninen wrote:
See the request from Mattias : "Please test and tell what you find out."
I have not enough knowledge to be able to patch the compiler :-(
let's keep this thread at a more concrete level.
Agreed (even if I don't think that will lead to anything fai
On Mon, 24 Nov 2014 12:15:03 +0100
Mattias Gaertner wrote:
>[...]
> I guess it would be a good idea to pass -Fcutf8 with FPC 2.7.1. For
> both modes.
On second thought: only for new mode.
Passing it in the old mode will make the wide/unicode/utf8string work,
but the Ansi/Shortstring will be wro
On Sun, 23 Nov 2014 21:37:56 -0300
luiz americo pereira camara wrote:
> 2014-11-20 13:21 GMT-03:00 Mattias Gaertner :
>[...]
First of all: Thanks for testing.
> Without {$codepage utf8} directive String constants will get Code Page 0
> (CP_ACP) and not the 1200 (UTF16 - UnicodeString).
Beware:
On Mon, Nov 24, 2014 at 11:33 AM, Michael Schnell wrote:
> IMHO that would be just GREAT to allow for doing portable software. The RTL
> and LCL interface should be OS ignorant for portability. In user code, the
> user should be allowed to use the string encoding (and byte count per
> character), h
On 11/24/2014 11:44 AM, luiz americo pereira camara wrote:
If the program does not explicitly assume a specific encoding, i.e.
uses only the String type and does not do low-level string handling, there
will be no need to change.
I don't know the internals of the program(s). It's a huge system and
2014-11-24 6:29 GMT-03:00 Michael Schnell :
> On 11/23/2014 07:52 PM, Felipe Monteiro de Carvalho wrote:
>
>>
>> Well, the first reports of what the unicode rtl would look like were
>> pretty scary: Total break of the string part of millions of lines of
>> code that people wrote with Lazarus since
On 11/22/2014 05:18 PM, Hans-Peter Diettrich wrote:
Does this mean that Lazarus (new mode) ignores the OS system codepage
setting?
IMHO that would be just GREAT to allow for doing portable software. The
RTL and LCL interface should be OS ignorant for portability. In user
code, the user should
On 11/23/2014 07:52 PM, Felipe Monteiro de Carvalho wrote:
Well, the first reports of what the unicode rtl would look like were
pretty scary: Total break of the string part of millions of lines of
code that people have written with Lazarus for years.
That is why I stopped recommending Lazarus to my c
On 24.11.2014 03:19, luiz americo pereira camara wrote:
I updated the test app to show the hexadecimal representation of the string.
When {$codepage utf8} is set, the encoding and content of every string
match each other regardless of MultiByteConversionCodePage.
Without {$codepage utf8}:
Wh
On 24.11.2014 01:37, luiz americo pereira camara wrote:
2014-11-20 13:21 GMT-03:00 Mattias Gaertner:
Please test and tell what you find out.
Without {$codepage utf8} directive String constants will get Code Page 0
(CP_ACP) and not the 1200 (UTF16 - Un
I updated the test app to show the hexadecimal representation of the string.
When {$codepage utf8} is set, the encoding and content of every string
match each other regardless of MultiByteConversionCodePage.
Without {$codepage utf8}:
When MultiByteConversionCodePage is CP_ACP (default) one str
I added {.$codepage utf8} and all strings output as "Joao".
Got confused. I did not expect changes in the constant assigned to the
UnicodeString variable
Need to check what is the correct UTF8 output: "JoÃ£o" or "Joao"
Luiz
2014-11-23 21:37 GMT-03:00 luiz americo pereira camara :
>
>
> 201
2014-11-20 13:21 GMT-03:00 Mattias Gaertner :
>
> Please test and tell what you find out.
>
Without {$codepage utf8} directive String constants will get Code Page 0
(CP_ACP) and not the 1200 (UTF16 - UnicodeString).
String variables assigned to those constants will also have Code Page = 0
This
2014-11-20 13:21 GMT-03:00 Mattias Gaertner :
>
> 2. The new mode: The LCL, FCL and RTL treat all "String" as UTF-8
> encoded. Most RTL file functions now work with full Unicode.
> For example FileExists and aStringList.LoadFromFile(Filename) now
> support full Unicode.
>
[..]
Please test and te
On Sun, Nov 23, 2014 at 1:56 PM, Michael Van Canneyt
wrote:
> Don't worry. Computers are not scary, not really. Just look at "Terminator"
> (or any other Sci-Fi involving computers), the humans always win in the
> end... :-)
Well, the first reports of what the unicode rtl would look like were
pret
On 2014-11-23 12:56, Michael Van Canneyt wrote:
> the humans always win in the end... :-)
ROFL
> Phew... At least something we did better in the whole string mess ... ;)
9/10 times FPC does everything better than Delphi.
Regards,
- Graeme -
--
fpGUI Toolkit - a cross-platform GUI toolkit u
On Sun, 23 Nov 2014 13:56:42 +0100 (CET)
Michael Van Canneyt wrote:
>[...]
> Anyway, I was just trying to say that a 1-byte string is not necessarily
> UTF-8 in FPC 2.7.1.
Yes, you can still store anything you like in strings.
And you can store UTF-8 in a string and say it is not.
Mattias
--
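The last point can be tried out with SetCodePage and StringCodePage from the System unit of FPC 2.7.1 or later. The following is only a sketch: the program name is made up, and code page 1252 is picked as an arbitrary non-UTF-8 label.

```pascal
program MislabelDemo;
{$mode objfpc}{$H+}

var
  S: RawByteString;
begin
  // The two bytes C3 A3 are the UTF-8 encoding of U+00E3 (a with tilde).
  S := #$C3#$A3;
  // Label the bytes as UTF-8 without converting them.
  SetCodePage(S, CP_UTF8, False);
  WriteLn('code page: ', StringCodePage(S), ', bytes: ', Length(S));
  // Relabel the very same bytes as Windows-1252, again without conversion.
  SetCodePage(S, 1252, False);
  WriteLn('code page: ', StringCodePage(S), ', bytes: ', Length(S));
end.
```

Passing False as the third argument changes only the label; with True the RTL would also convert the bytes to the new code page.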
On Sun, 23 Nov 2014, Mattias Gaertner wrote:
True. Although many programmers misunderstand what this means. It is not
as scary as it sounds.
To all the scared people:
Don't worry. Computers are not scary, not really.
Just look at "Terminator" (or any other Sci-Fi involving computers),
the
Am 23.11.2014 00:15 schrieb "Mattias Gaertner" :
> > Additionally, most basic File I/O routines now correctly call the
> > underlying OS-es file routines with the codepage the OS expects (which is
> > WideString on Windows).
>
> Is it safe to say UTF-16? Or are there still UCS-2 Windows?
Till NT 4 inc
On Sat, 22 Nov 2014 17:38:33 +0100 (CET)
Michael Van Canneyt wrote:
>[...]
> > Yes, with the UTF8 RTL. The default RTL uses system codepage.
>
> Careful, there is no such thing as the "UTF8 RTL".
>
> There is now a "Unicode and CodePage-aware RTL".
Well, yes, you are right of course.
But "Unic
On Sat, 22 Nov 2014 17:18:35 +0100
Hans-Peter Diettrich wrote:
> Mattias Gaertner schrieb:
>
> > // GetCommandLineW returns a UTF-16 PWideChar
> > // the compiler adds code to convert this to the
> > // default system codepage (CP_ACP = CP_UTF8)
> > // the resulting string has StringCode
Mattias Gaertner schrieb:
// GetCommandLineW returns a UTF-16 PWideChar
// the compiler adds code to convert this to the
// default system codepage (CP_ACP = CP_UTF8)
// the resulting string has StringCodePage CP_ACP
// and is encoded in UTF-8.
Does this mean that Lazarus (new mode)
On Sat, 22 Nov 2014, Mattias Gaertner wrote:
On Sat, 22 Nov 2014 16:18:09 +0100
Jürgen Hestermann wrote:
Am 2014-11-22 um 15:06 schrieb Mattias Gaertner:
> procedure TForm1.FormCreate(Sender: TObject);
> var s: string; // String = AnsiString because of $H+
> begin
> s:=GetCommandLineW
On Sat, 22 Nov 2014 16:18:09 +0100
Jürgen Hestermann wrote:
> Am 2014-11-22 um 15:06 schrieb Mattias Gaertner:
> > procedure TForm1.FormCreate(Sender: TObject);
> > var s: string; // String = AnsiString because of $H+
> > begin
> > s:=GetCommandLineW;
> > // GetCommandLineW returns a UTF
Am 2014-11-22 um 15:06 schrieb Mattias Gaertner:
> procedure TForm1.FormCreate(Sender: TObject);
> var s: string; // String = AnsiString because of $H+
> begin
> s:=GetCommandLineW;
> // GetCommandLineW returns a UTF-16 PWideChar
> // the compiler adds code to convert this to the
> // defa
On Sat, 22 Nov 2014 14:37:00 +0100
Jürgen Hestermann wrote:
> Am 2014-11-20 um 17:21 schrieb Mattias Gaertner:
> > The development version of FPC 2.7.1 has extended Strings and many RTL
> > functions now work for codepages other than the system codepage.
>
> > 2. The new mode: The LCL, FC
Am 2014-11-20 um 17:21 schrieb Mattias Gaertner:
> The development version of FPC 2.7.1 has extended Strings and many RTL
> functions now work for codepages other than the system codepage.
> 2. The new mode: The LCL, FCL and RTL treat all "String" as UTF-8 encoded.
...
> When accessing the Wi
On Thu, Nov 20, 2014 at 1:21 PM, Mattias Gaertner wrote:
> Hi all, especially Windows users,
>
> The development version of FPC 2.7.1 has extended Strings and many RTL
> functions now work for codepages other than the system codepage.
>
> This means Lazarus can now be compiled in two modes:
>
> 1
Hi all, especially Windows users,
The development version of FPC 2.7.1 has extended Strings and many RTL
functions now work for codepages other than the system codepage.
This means Lazarus can now be compiled in two modes:
1. The old mode: LCL treats all "String" as UTF-8 encoded. When
accessing