Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Rainer Stratmann
On Saturday 22 December 2012 12:26:09, dev.d...@gmail.com wrote:
> Users can define the internal type with e.g. {$STRING UTF8} for their
> *whole* project.

Should that (*whole* project) also include the 3rd-party units (with
available source code)?
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Michael Van Canneyt



On Sat, 22 Dec 2012, dev.d...@gmail.com wrote:

> Hi,
> concerning the string topic, for me (using fpc since 2.0.4 on a regular basis;
> TP experience ~ average user) there really should be a decision on which way to
> go as early as possible.


- We'll implement the capability to have a code-page aware string type, as
  Delphi has. (Well, it is there already.)

- The {$H } directive will be extended so you can choose which string type you
  need per unit (ansi/wide/utf16/utf8...).
  This is different from Delphi, where you don't have this choice:
  String=Widestring.

- The necessary conversions will be inserted by the compiler.
  (In fact, they are there already.)

- Supplying the type on the command line will make it the project default.
  We can think about a directive that can be specified in a project file.
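
For reference, the existing {$H} switch already changes what the bare String keyword means, and the compiler already converts between string types on assignment. A minimal sketch of today's behaviour, assuming FPC in objfpc mode; the type names TShortDefault and TLongDefault are made up for illustration, and the extended choices (utf8/utf16...) listed above are not shown because their syntax is not settled in this thread:

```pascal
program hswitch;
{$mode objfpc}

{$H-}
type
  TShortDefault = String;   // with {$H-}, String means ShortString
{$H+}
type
  TLongDefault = String;    // with {$H+}, String means AnsiString

var
  S: TShortDefault;
  L: TLongDefault;
begin
  L := 'assigned as AnsiString';
  S := L;  // the compiler inserts the AnsiString -> ShortString conversion

  // A ShortString is a length byte plus 255 characters.
  WriteLn('SizeOf(TShortDefault) = ', SizeOf(TShortDefault));
  // An AnsiString variable is a pointer to reference-counted data.
  WriteLn('TLongDefault is pointer sized: ',
          SizeOf(TLongDefault) = SizeOf(Pointer));
  WriteLn(S);
end.
```

The same source text ("String") ends up as two different types depending on the directive in effect, which is the mechanism the proposal would widen to more string types.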

Because of the requirement for backwards compatibility with FPC itself, 
we'll make 2 RTLs: one backwards compatible, one with the new unicode string.


Michael.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread dev . dliw
Hi,
> > Users can define the internal type with e.g. {$STRING UTF8} for their
> > *whole* project.
>
> Should that (*whole* project) include also the 3rd party units (with
> available sourcecode)?

Yes, that's the idea...
... the only problem is that many still use old-style hacking, which of
course only works with {$STRING ANSI}, to stay with my example.

Btw.
> and uses a general string unit (e.g. Sysutils).
should be (e.g. strutils) :)


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread dev . dliw
Hi,
thanks for the quick reply.

So the direction seems to be quite clear...
... unfortunately this seemingly wasn't communicated clearly enough to the 
surroundings.

> Because of the requirement for backwards compatibility with FPC itself,
> we'll make 2 RTLs: one backwards compatible, one with the new unicode string.
Do you really mean separate sources, or a compiler switch?

What's the problem with using the new RTL with Ansistring? I can't see a
problem, if it doesn't use direct string access...
... of course you have to compile it accordingly..

> so you can choose which string type you need per unit
Can a project-wide choice override this for *any* unit (3rd party), or will
there be conversion going on between the different units?

d.l.i.w


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Michael Van Canneyt



On Sat, 22 Dec 2012, dev.d...@gmail.com wrote:

> Hi,
> thanks for the quick reply.
>
> So the direction seems to be quite clear...
> ... unfortunately this seemingly wasn't communicated clearly enough to the
> surroundings.
>
>> Because of the requirement for backwards compatibility with FPC itself,
>> we'll make 2 RTLs: one backwards compatible, one with the new unicode string.
>
> Do you really mean separate sources, or a compiler switch?

A bit of both :-)

> What's the problem with using the new RTL with Ansistring? I can't see a
> problem, if it doesn't use direct string access...

For obvious reasons, the RTL uses direct string memory access in many places.

> ... of course you have to compile it accordingly..

You can use ansistring, no problem.

The problem is classes. If I define

TComponent = Class(TPersistent)
Published
  Property Name : String;
end;

then Name will be of the string type in effect in the unit where TComponent is
defined. For properties this is not so much of a problem, but if you try to
override something like

Procedure(Arg : String); virtual;

then the string type must match the one in the original declaration,
no matter what the string type is in your particular unit.
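
The matching requirement can be reproduced today with the two string types that {$H} already selects. A sketch, assuming objfpc mode; TBase, TChild and SetText are hypothetical names, not RTL declarations:

```pascal
program overridematch;
{$mode objfpc}

type
  {$H+}  // while TBase is declared, String means AnsiString
  TBase = class
    procedure SetText(Arg: String); virtual;
  end;

  {$H-}  // from here on, String means ShortString
  TChild = class(TBase)
    // Writing "Arg: String" here would now mean ShortString, which does not
    // match the inherited declaration, so the override would be rejected.
    // Spelling out the original type keeps the signatures identical.
    procedure SetText(Arg: AnsiString); override;
  end;

// The implementations name AnsiString explicitly because {$H-} is still
// in effect at this point in the file.
procedure TBase.SetText(Arg: AnsiString);
begin
  WriteLn('TBase: ', Arg);
end;

procedure TChild.SetText(Arg: AnsiString);
begin
  WriteLn('TChild: ', Arg);
end;

var
  Obj: TBase;
begin
  Obj := TChild.Create;
  Obj.SetText('hello');  // virtual dispatch reaches TChild.SetText
  Obj.Free;
end.
```

With per-unit string types the same situation would arise across unit boundaries: the descendant's unit has to follow the ancestor's parameter type rather than its own default.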



> so you can choose which string type you need per unit
>
> Can a project-wide choice override this for *any* unit (3rd party), or will
> there be conversion going on between the different units?


There will always be conversion if
1) a unit specifies a string type by itself.
2) the unit comes in compiled form.

Michael.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread dev . dliw
Hi,
thx, got it...

> There will always be conversion if
> 1) a unit specifies a string type by itself.
> 2) the unit comes in compiled form.

One more question:
If a particular unit (maybe 3rd party) does not define its string type, what 
string type is used:
(a) the type defined in project,
(b) a fpc default type?

In other words, can I force *all* sources in my project to use the same string
type, provided I know they don't do direct access?

The wiki says [http://wiki.freepascal.org/FPC_Unicode_support], that
- shortstring
- ansistring
- widestring
- utf8string
- utf16string
- utf32string
- ucs2string (?)
- ucs4string (?)
may be supported.

So in future I will be able to define any of these for my source (and switch 
between them), without changing code?
Thus:
Any function with string as param will be automatically overloaded for all 
supported string types?

d.l.i.w





Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Michael Van Canneyt



On Sat, 22 Dec 2012, dev.d...@gmail.com wrote:

> Hi,
> thx, got it...
>
>> There will always be conversion if
>> 1) a unit specifies a string type by itself.
>> 2) the unit comes in compiled form.
>
> One more question:
> If a particular unit (maybe 3rd party) does not define its string type, what
> string type is used:
> (a) the type defined in project,
> (b) a fpc default type?

That depends. See below.

> In other words, can I force *all* sources in my project to use the same string
> type, provided I know they don't do direct access?

Yes, *if* the units are recompiled.

> The wiki says [http://wiki.freepascal.org/FPC_Unicode_support], that
> - shortstring
> - ansistring
> - widestring
> - utf8string
> - utf16string
> - utf32string
> - ucs2string (?)
> - ucs4string (?)
> may be supported.

They will just be pre-defined string types.

> So in future I will be able to define any of these for my source (and switch
> between them), without changing code?
> Thus:
> Any function with string as param will be automatically overloaded for all
> supported string types?

Provided
a) you recompile everything
b) the unit does not specify a string type
the answer seems to be yes.

Michael.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Martin Schreiber
On Saturday 22 December 2012 12:55:12 Michael Van Canneyt wrote:
[...]
> - The {$H } directive will be extended so you can choose which string type
> you need per unit. (ansi/wide/utf16/utf8...)
>   This is different from Delphi, where you don't have this choice:
> String=Widestring.

You probably mean String = UnicodeString = reference-counted, UTF-16 encoded
16-bit string?

> Because of the requirement for backwards compatibility with FPC itself,
> we'll make 2 RTLs: one backwards compatible, one with the new unicode
> string.

UnicodeString or codepage aware cpstrnew?

Martin


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Michael Van Canneyt



On Sat, 22 Dec 2012, Martin Schreiber wrote:

> On Saturday 22 December 2012 12:55:12 Michael Van Canneyt wrote:
> [...]
>> - The {$H } directive will be extended so you can choose which string type
>> you need per unit. (ansi/wide/utf16/utf8...)
>>   This is different from Delphi, where you don't have this choice:
>> String=Widestring.
>
> You probably mean String = UnicodeString = reference counted utf-16 encoded 16
> bit string?

Yes.

>> Because of the requirement for backwards compatibility with FPC itself,
>> we'll make 2 RTLs: one backwards compatible, one with the new unicode
>> string.
>
> UnicodeString or codepage aware cpstrnew?


'codepage aware' just means that you can specify a charsize/codepage. 
A unicode string is just one type of codepage aware string.

(unless the compiler guys now tell me otherwise).

So, string will mean Unicode string.

The effect is that
  TComponent.Name
etc. will be a unicode string.

In the backwards compatible RTL,
  TComponent.Name
will be an ansistring, i.e. as it is now.

Michael.

Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Paul Ishenin

On 22.12.12, 22:58, Martin Schreiber wrote:

> That was so in the beginning but Delphi later changed it. So a Delphi
> UnicodeString variable currently always is utf-16 encoded.


The same in FPC.

Best regards,
Paul Ishenin



Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread luiz americo pereira camara
On 22/12/2012 09:55, Michael Van Canneyt mich...@freepascal.org wrote:

> Because of the requirement for backwards compatibility with FPC itself,
> we'll make 2 RTLs: one backwards compatible, one with the new unicode
> string.

Will it be possible to compile a utf8 RTL?

Will there be an RtlString?

Luiz



Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Michael Van Canneyt



On Sat, 22 Dec 2012, luiz americo pereira camara wrote:

> On 22/12/2012 09:55, Michael Van Canneyt mich...@freepascal.org wrote:
>
>> Because of the requirement for backwards compatibility with FPC itself, we'll
>> make 2 RTLs: one backwards compatible, one with the new unicode string.
>
> Will it be possible to compile a utf8 RTL?

No.

> Will there be an RtlString?


No.

Michael.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Marco van de Voort
In our previous episode, Michael Van Canneyt said:
> - The {$H } directive will be extended so you can choose which string type
> you need per unit.
> (ansi/wide/utf16/utf8...)
>
> This is different from Delphi, where you don't have this choice:
> String=Widestring.

unicodestring, actually.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Marco van de Voort
In our previous episode, dev.d...@gmail.com said:
> Users can define the internal type with e.g. {$STRING UTF8} for their *whole*
> project.

This is technically impossible. Neither FPC nor Lazarus has a complete
overview of all units and include files in a project, and a compile can also
cover only part of a program.

Rule of thumb: anything global must be passed on the cmdline every time, and
directives are only for unit level. (a few special ones for library units
like $libsuffix excluded)



Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Michael Van Canneyt



On Sat, 22 Dec 2012, Marco van de Voort wrote:

> In our previous episode, dev.d...@gmail.com said:
>
>> Users can define the internal type with e.g. {$STRING UTF8} for their *whole*
>> project.
>
> This is technically impossible. Neither FPC nor Lazarus has a complete
> overview of all units and include files in a project, and a compile can also
> cover only part of a program.
>
> Rule of thumb: anything global must be passed on the cmdline every time, and
> directives are only for unit level. (a few special ones for library units
> like $libsuffix excluded)


While this is correct, I think it is possible to construct a (new) directive that 
can be inserted in a program/library file only, and which will have the same 
effect as the command-line argument: for all units that will be compiled as 
part of the normal run, the string type is set.


Whether or not this is desirable, is another matter.

(personally, I think we have enough with the command-line version)

Michael.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread Marco van de Voort
In our previous episode, Michael Van Canneyt said:
 
> > Rule of thumb: anything global must be passed on the cmdline everytime, and
> > directives are only for unit level. (a few special ones for library units
> > like $libsuffix excluded)
>
> While this is correct, I think it is possible to construct a (new) directive
> that can be inserted in a program/library file only, and which will have the
> same effect as the command-line argument: for all units that will be compiled
> as part of the normal run, the string type is set.
>
> Whether or not this is desirable, is another matter.

Exactly. In theory you can lift a directive to global state for that
compiler invocation (as done with linker-related switches atm).

But it can cause so much confusion (e.g. another project that partially uses
the same units) that it causes more problems than it solves.

During the unicode discussions I thought about this, and even thought of a
mitigation (though not complete solution) of that problem:

1. if such directive is in a normal unit, mark the PPU as such.
2. if such a PPU is loaded, make it global.
3. if that causes a conflict (multiple such units with different 
  directives), abort.   
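
The three steps above can be sketched as a small resolution routine. This is only an illustration of the idea, not compiler code: TStringKind, TUnitInfo and ResolveGlobalKind are hypothetical names, and a real PPU would carry the flag in its own format.

```pascal
program ppuconflict;
{$mode objfpc}

type
  // Hypothetical stand-in for the string-type directive recorded in a PPU.
  TStringKind = (skNone, skAnsi, skUtf8, skUtf16);

  TUnitInfo = record
    Name: ShortString;
    Kind: TStringKind;  // skNone: the unit carries no such directive (step 1)
  end;

// Step 2: the first marked unit sets the global kind.
// Step 3: a later unit with a different mark is a conflict (returns False).
function ResolveGlobalKind(const Infos: array of TUnitInfo;
                           out Global: TStringKind): Boolean;
var
  i: Integer;
begin
  Global := skNone;
  Result := True;
  for i := Low(Infos) to High(Infos) do
  begin
    if Infos[i].Kind = skNone then
      Continue;
    if Global = skNone then
      Global := Infos[i].Kind
    else if Global <> Infos[i].Kind then
      Exit(False);
  end;
end;

const
  // Made-up set of loaded units; two carry the same mark, one carries none.
  LoadedUnits: array[0..2] of TUnitInfo = (
    (Name: 'unita'; Kind: skNone),
    (Name: 'unitb'; Kind: skUtf8),
    (Name: 'unitc'; Kind: skUtf8)
  );

var
  Global: TStringKind;
begin
  if ResolveGlobalKind(LoadedUnits, Global) then
    WriteLn('global string kind (ordinal): ', Ord(Global))
  else
    WriteLn('conflicting directives: abort');
end.
```

Changing one of the marked entries to a different kind makes the routine report a conflict, which corresponds to the abort in step 3.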

This way, putting such a directive in a unit at the top of the
dependency graph will cause it to be pulled in quickly on independent
compiles.

It mainly fixes the case where a user generally uses only one style. It
does require some restraint though, which is why I think the solution is only
theoretical.

So either go to a mandatory project system (but not even Delphi does that),
or not. This will also allow throwing errors when using a unit not compiled
with the same project file etc.

Delphi seems to go in that direction though, I found it is hard to quickly
compile FPC programs on the cmdline with XE3, since then it misses the namespace
prefixes and complains it can't find sysutils because it wants
system.sysutils.

There are backwards compat aliases (actually, already since D7, and the
winprocs,wintypes unit aliases are even older), but with XE3 they are less

> (personally, I think we have enough with the command-line version)

Yes. And even if not, I would go for a mandatory project file, not in
source.


Re: [fpc-pascal] Strings - suggestions

2012-12-22 Thread dev . dliw
Hi,
that's great news...

Thanks for the effort to clarify,
d.l.i.w