Re: [fpc-pascal] Managed properties idea

2017-10-04 Thread Sven Barth via fpc-pascal
On 05.10.2017 05:04, "Ryan Joseph" wrote:
> I’ve been wanting to learn how to contribute to the compiler for years
> now but maybe this is an easy enough project to start with. I don’t know if
> this is a problem people have, though, but I assume it may be since
> Objective-C had a system like this for memory management and properties.
> What do you think? How do most FPC programmers handle memory these days?

The way to go in Object Pascal is reference-counted interfaces (aka
COM-style interfaces). After all, one should program against an interface
anyway, and the fact that the compiler handles reference-counted interfaces
automatically is an added bonus.
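
A minimal sketch of what this can look like, assuming COM-style interfaces
(the FPC default) and TInterfacedObject as the base class. The names IData,
TData and TMyClass are made up for illustration and are not from the thread.
The property holds an interface reference, so no hand-written retain/release
accessor is needed:

```pascal
program RefCountedProperty;
{$mode objfpc}{$H+}
{$interfaces com}   // reference-counted (COM-style) interfaces

type
  // Hypothetical interface, for illustration only.
  IData = interface
    function GetValue: Integer;
  end;

  // TInterfacedObject supplies the reference counting.
  TData = class(TInterfacedObject, IData)
  private
    FValue: Integer;
  public
    constructor Create(AValue: Integer);
    destructor Destroy; override;
    function GetValue: Integer;
  end;

  TMyClass = class
  private
    FData: IData;   // managed field: the compiler adjusts the count on assignment
  public
    property Data: IData read FData write FData;
  end;

constructor TData.Create(AValue: Integer);
begin
  inherited Create;
  FValue := AValue;
end;

destructor TData.Destroy;
begin
  // Runs once the last interface reference to this instance is gone.
  WriteLn('TData(', FValue, ') destroyed');
  inherited Destroy;
end;

function TData.GetValue: Integer;
begin
  Result := FValue;
end;

var
  Obj: TMyClass;
begin
  Obj := TMyClass.Create;
  try
    Obj.Data := TData.Create(1);  // Obj now holds a counted reference
    Obj.Data := TData.Create(2);  // the reference to the first TData is dropped
    WriteLn('Current value: ', Obj.Data.GetValue);
  finally
    Obj.Free;                     // drops the reference held in FData
  end;
end.
```

Exactly when each destructor runs can depend on temporaries the compiler
keeps alive, so the sketch only relies on every TData being destroyed after
its last reference goes away.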

Regards,
Sven
___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

[fpc-pascal] Managed properties idea

2017-10-04 Thread Ryan Joseph
Yesterday, while I was thinking about those enumerators, I had an idea for a
language feature, something which I saw introduced in Objective-C some years
ago.

Retaining a member variable through a property is a pattern I use nearly
every day. In my root class I manage a simple retain count which frees the
object when it reaches 0. When the receiver wants to take ownership of a
value on assignment, I have to write an accessor which manages the retain
count of the object (boring boilerplate).

Consider the example below, where the root class (TObject) overrides a class
function which performs this retain count management (ManageObject). The
other class has a member of type TObject which TMyClass will take ownership
of when assigned. If the property “Data” includes the “managed” keyword, the
compiler will automatically call ClassType.ManageObject() and take the
return value as the value which is written to “m_data”.

I’ve been wanting to learn how to contribute to the compiler for years now but
maybe this is an easy enough project to start with. I don’t know if this is a
problem people have, though, but I assume it may be since Objective-C had a
system like this for memory management and properties. What do you think? How
do most FPC programmers handle memory these days?

type
  TObject = class
    class function ManageObject (obj: pointer): TObject; override;
  end;

type
  TMyClass = class (TObject)
  private
    m_data: TObject;
  public
    // "managed" is the proposed keyword; it is not existing FPC syntax
    property Data: TObject read m_data write m_data; managed;
    // if there’s a write method defined the compiler calls
    // ManageObject before the write method
    property MyData: TObject read m_data write SetData; managed;
  end;

// no "override" here: directives belong on the declaration only
class function TObject.ManageObject (obj: pointer): TObject;
begin
  if obj <> nil then
    result := TObject(obj).Retain   // no semicolon before "else"
  else
    begin
      TObject(obj).Release;
      result := nil;
    end;
end;

Regards,
Ryan Joseph


[fpc-pascal] Finding long file names

2017-10-04 Thread James Richters
I'm trying to use findfirst()/findnext to obtain a list of files.  Here's my 
code:
Searchfile:=Tap_Drive+Tap_Path+'\'+Tap_SubDirectory+'\*.TAP';
If FindFirst(Searchfile, FAAnyfile-FAHidden, FileDirInfo)=0 then
..

It finds most files, even ones with really long file names; however, it can't 
find files with additional periods in the file name.
So it will find:
This is a TEST.Tap

But it will not find:
This.is.a.TEST.tap

If I change my search string to:
Searchfile:=Tap_Drive+Tap_Path+'\'+Tap_SubDirectory+'\*.*';

Then it DOES find the files with more than one period in them... along with 
everything else.

I could filter them out myself I suppose, but that seems to defeat the way 
findfirst is supposed to work.

Any ideas how to make this work?  Is there a better method to use than 
findfirst() ?

I notice that if I use ExtractFileExt() with This.is.a.TEST.tap it correctly 
returns '.tap' as the extension.  Maybe FindFirst is an obsolete way of listing 
the files? Or maybe it just never got fixed to handle valid files with more 
than one period?
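
For reference, the "filter them myself" workaround mentioned above could look
roughly like the sketch below. It does not explain why the '*.TAP' mask
misses those files; it simply searches with a catch-all mask and compares the
extension reported by ExtractFileExt, ignoring case. The procedure name
CollectByExtension and the use of a command-line argument for the directory
are made up for this example:

```pascal
program ListTapFiles;
{$mode objfpc}{$H+}
uses
  SysUtils, Classes;

// Adds to Names every non-directory entry in Dir whose extension matches
// Ext (for example '.tap'), compared case-insensitively.
procedure CollectByExtension(const Dir, Ext: string; Names: TStrings);
var
  Info: TSearchRec;
begin
  if FindFirst(IncludeTrailingPathDelimiter(Dir) + '*',
               faAnyFile and not faHidden, Info) = 0 then
  try
    repeat
      if ((Info.Attr and faDirectory) = 0) and
         SameText(ExtractFileExt(Info.Name), Ext) then
        Names.Add(Info.Name);
    until FindNext(Info) <> 0;
  finally
    FindClose(Info);   // release the search handle
  end;
end;

var
  Names: TStringList;
  Dir: string;
  i: Integer;
begin
  // Directory to scan: first command-line argument, or the current directory.
  if ParamCount >= 1 then
    Dir := ParamStr(1)
  else
    Dir := '.';

  Names := TStringList.Create;
  try
    CollectByExtension(Dir, '.tap', Names);
    for i := 0 to Names.Count - 1 do
      WriteLn(Names[i]);
  finally
    Names.Free;
  end;
end.
```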

Any thoughts on this?

James



Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Ryan Joseph

> On Oct 4, 2017, at 10:03 PM, Michael Van Canneyt  
> wrote:
> 
> Newinstance allocates the memory for a new instance of the class.
> By default this is GetMem(instanceSize).

So you override the class method NewInstance in the enumerator class and return 
the same block of memory? Then when I override FreeInstance and do nothing I 
need to manually free the memory in the calling class, I guess.

That’s an interesting solution, and I didn’t even know TObject did that, but 
using records still seems like a better solution and about as efficient 
(perhaps). I’ll try this tomorrow as well to make sure I understand it. Thanks.

Regards,
Ryan Joseph


Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Ryan Joseph

> On Oct 4, 2017, at 5:19 PM, Marco van de Voort  wrote:
> 
> Yup, or a record.  See e.g. http://www.stack.nl/~marcov/lightcontainers.zip

This seems like the simplest and most efficient method. Does FPC just know it’s 
a record internally and not try to dealloc it? GetEnumerator seems like a magic 
method so I’m not sure how the memory is being managed, but I assume the stack 
space for the method return value is reserved and I just fill in the values of 
the record at that location.
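
A small sketch of a record-based enumerator, assuming the advancedrecords
mode switch so that the record can carry MoveNext and Current. The types
TIntList and TIntEnumerator are invented for this example and are not taken
from lightcontainers. The enumerator is returned by value, so there is no
constructor or destructor call and no heap allocation for the enumerator
itself:

```pascal
program RecordEnumeratorDemo;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}

type
  TIntArray = array of Integer;

  // Enumerator as a plain record value; nothing to Create or Free.
  TIntEnumerator = record
  private
    FItems: TIntArray;
    FIndex: Integer;
    function GetCurrent: Integer;
  public
    function MoveNext: Boolean;
    property Current: Integer read GetCurrent;
  end;

  // Hypothetical container used only to drive the for..in loop.
  TIntList = class
  public
    Items: TIntArray;
    function GetEnumerator: TIntEnumerator;
  end;

function TIntEnumerator.GetCurrent: Integer;
begin
  Result := FItems[FIndex];
end;

function TIntEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex < Length(FItems);
end;

function TIntList.GetEnumerator: TIntEnumerator;
begin
  // Fill in the returned record; the index starts before the first item.
  Result.FItems := Items;
  Result.FIndex := -1;
end;

var
  List: TIntList;
  Value, Sum: Integer;
begin
  List := TIntList.Create;
  try
    SetLength(List.Items, 3);
    List.Items[0] := 10;
    List.Items[1] := 20;
    List.Items[2] := 30;

    Sum := 0;
    for Value in List do
      Sum := Sum + Value;
    WriteLn('Sum: ', Sum);
  finally
    List.Free;
  end;
end.
```

The dynamic array field is reference-counted, so copying it into the record
does not copy the elements.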

Regards,
Ryan Joseph


Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Mattias Gaertner
On Wed, 4 Oct 2017 15:41:27 +0700
Ryan Joseph  wrote:

> As I understand it, the for..in loop's GetEnumerator method is expected to
> create a new object each time it’s called, and FPC destroys it later when
> the loop is finished. Can I retain the enumerator and just reset it in
> between calls? I’d like to remove all these alloc/deallocs so I can use
> for..in more efficiently in tight loops.

Here is an example of how to use a global enumerator object:

type
  TMyEnumerator = class
  private
    FOwner: TComponent;
    FCurrent: TComponent;
  public
    procedure Init(Owner: TComponent);
    function MoveNext: boolean;
    property Current: TComponent read FCurrent;
    function GetEnumerator: TMyEnumerator;
  end;

  TForm1 = class(TForm)
  public
    function GetEnumerator: TMyEnumerator;
  end;

var
  Form1: TForm1;
  MyEnumerator: TMyEnumerator;

implementation

procedure TMyEnumerator.Init(Owner: TComponent);
begin
  FOwner:=Owner;
  FCurrent:=nil;
end;

function TMyEnumerator.MoveNext: boolean;
var
  i: Integer;
begin
  if FCurrent=nil then begin
    // first call: start at the owner's first component
    if FOwner.ComponentCount=0 then exit(false);
    FCurrent:=FOwner.Components[0];
  end else begin
    // later calls: advance to the next sibling component
    i:=FCurrent.ComponentIndex+1;
    if i>=FCurrent.Owner.ComponentCount then exit(false);
    FCurrent:=FCurrent.Owner.Components[i];
  end;
  Result:=true;
end;

function TMyEnumerator.GetEnumerator: TMyEnumerator;
begin
  Result:=Self;
end;

function TForm1.GetEnumerator: TMyEnumerator;
begin
  // create the shared enumerator on first use, then reset it
  if MyEnumerator=nil then
    MyEnumerator:=TMyEnumerator.Create;
  MyEnumerator.Init(Self);
  Result:=MyEnumerator;
end;

finalization
  MyEnumerator.Free;
end.

Re: [fpc-pascal] Yet another thread on Unicode Strings

2017-10-04 Thread Marco van de Voort
In our previous episode, Tony Whyman said:
> Unicode Character String handling is a question that keeps coming up on 
> the Free Pascal Mailing lists and, empirically, it is hard to avoid the 
> conclusion that there is something wrong with the way these character 
> string types are handled. Otherwise, why does this issue keep arising?

Because people have old code that is ASCII, or handles Unicode in a
different, ad-hoc manner. Moreover, FPC/Lazarus is also still usable in an
ASCII-only mode for old projects.

> The programmer is too often forced to be aware of how strings 
> are encoded and must make a choice as to which is the preferred 
> character encoding for their program. There then follows confusion over 
> how to make that choice.

To avoid confusion, make sure it is unicode. It doesn't matter that
much if it is utf16 or not.

> Is Delphi compatibility the goal? What 
> Languages must I support? If I want platform independence which is the 
> best encoding? Which encoding gives the best performance for my 
> algorithm? And so on.
 
> Another problem is that there is no character type for a Unicode 
> Character. The built-in type “WideChar” is only two bytes and cannot 
> hold a code point that UTF-16 encodes as a surrogate pair. There is no 
> char type for a UTF-8 character and, while UCS4Char exists, the Lazarus 
> UTF-8 utilities use “cardinal” as the type for a code point (not exactly 
> strong typing).

Most code will simply use "string" to hold a character. Only special code,
and code that really must be performant, will do other things.
 
> In order to stop all this confusion I believe that there has to be a 
> return to Pascal's original fundamental concept. That is the value of a 
> character type represents a character, while the encoding of the 
> character is platform dependent and a choice the compiler makes and not 
> the programmer. Likewise a character string is an array of characters 
> that can be indexed by character (not byte) number, from which 
> substrings can be selected and compared with other strings according to 
> the locale and the unicode standard collating sequence. Let the 
> programmer worry about the algorithm and the compiler worry about the 
> best implementation.
>
> I want to propose a new character type called “UniChar” - short for 
> Unicode Character, along with a new string type “UniString” and a new 
> collection “TUniStrings”. I have presented my thoughts here in a 
> detailed paper
>
This doesn't work, and it seems you haven't read the backlog for unicode
related messages all the way back to early 2009. What you suggest was one of
the null hypotheses back then, and we are now 8 years further.

Search for the Unicode meanings of (1) glyph, (2) character, (3) codepoint,
(4) denormalized strings.
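
To make two of these distinctions concrete, here is a small sketch using
UnicodeString (UTF-16). The helper CodePointAt is written for this example
and is not an RTL routine. It shows one code point that needs two WideChar
elements, and one visible character that consists of two code points:

```pascal
program CodeUnitsVsCodePoints;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Decodes the code point starting at S[Index]. Units receives the number of
// UTF-16 code units consumed: 1, or 2 for a valid surrogate pair.
function CodePointAt(const S: UnicodeString; Index: Integer;
  out Units: Integer): Cardinal;
var
  HighUnit, LowUnit: Cardinal;
begin
  HighUnit := Ord(S[Index]);
  Result := HighUnit;
  Units := 1;
  if (HighUnit >= $D800) and (HighUnit <= $DBFF) and (Index < Length(S)) then
  begin
    LowUnit := Ord(S[Index + 1]);
    if (LowUnit >= $DC00) and (LowUnit <= $DFFF) then
    begin
      Result := $10000 + ((HighUnit - $D800) shl 10) + (LowUnit - $DC00);
      Units := 2;
    end;
  end;
end;

var
  Smiley, Accented: UnicodeString;
  Units: Integer;
  CP: Cardinal;
begin
  // U+1F600 lies outside the Basic Multilingual Plane, so UTF-16 stores it
  // as a surrogate pair: one code point, two WideChar elements.
  SetLength(Smiley, 2);
  Smiley[1] := WideChar($D83D);
  Smiley[2] := WideChar($DE00);
  CP := CodePointAt(Smiley, 1, Units);
  WriteLn('WideChar elements: ', Length(Smiley),
          ', units consumed: ', Units,
          ', code point: U+', IntToHex(Int64(CP), 5));

  // 'e' followed by U+0301 (combining acute accent): two code points that a
  // reader perceives as a single character (a decomposed form).
  SetLength(Accented, 2);
  Accented[1] := 'e';
  Accented[2] := WideChar($0301);
  WriteLn('Code points in decomposed e-acute: ', Length(Accented));
end.
```

So even a 32-bit character type would hold a code point, not necessarily what
a user sees as one character.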

If you digest all that, you need to define the unichar type very large,
blowing up strings enormously, and then again converting it back to either
utf16 or utf8 to communicate with nearly anything (APIs, libraries etc)

Moreover, it will just require yet another conversion and more confusion with
more competing systems. So the number of problems will only rise. And the
incompatibility with Delphi is still there, so it will create trouble ad
infinitum.

This argument is best summed up by this cartoon: https://xkcd.com/927/

In short, there is no substitute for actively learning what Unicode is
about and living with it.

Some of the problems were summed up in the discussion back then:
http://www.stack.nl/~marcov/unicode.pdf

Note that in hindsight I don't think Florian's proposal was that bad, and
Florian was somewhat vindicated by Delphi's choice for multi encoding
ansistring type.

My new opinion is that, whatever the choice is, choosing differently
from Delphi (despite all its flaws, perceived OR real, doesn't matter) was
wrong.

Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Michael Van Canneyt



On Wed, 4 Oct 2017, Ryan Joseph wrote:

>> On Oct 4, 2017, at 4:26 PM, Michael Van Canneyt wrote:
>>
>> You can do so by overriding the newinstance and its sister method of
>> your enumerator class.
>
> Can you explain how this works or give an example? Not sure how this gets
> around the problem of alloc/dealloc for each iterator.

Careful, the iterator is instantiated only once per loop?

Newinstance allocates the memory for a new instance of the class.
By default this is GetMem(instanceSize).

What you can do is allocate somewhere a block of the correct size, 
once, and always return this single block.


Of course, you need to know for sure that only 1 instance of your enumerator
will be active at any given point in your program. 
If there are multiple, you could allocate a static array and allocate from that.


The FreeInstance procedure does a freemem, by default, but you can change
that to do nothing. (in case of an array, you mark the element unused)
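
A rough sketch of the single-block variant described above, assuming that
only one enumerator instance is alive at any time. The class names and the
global EnumBlock are invented for this example. NewInstance hands out the
same block on every call and FreeInstance keeps it instead of calling
FreeMem:

```pascal
program ReusedEnumeratorDemo;
{$mode objfpc}{$H+}

type
  TIntArray = array of Integer;
  TIntList = class;

  // Enumerator class whose instances all live in one preallocated block.
  TReusedEnumerator = class
  private
    FList: TIntList;
    FIndex: Integer;
    function GetCurrent: Integer;
  public
    constructor Create(AList: TIntList);
    class function NewInstance: TObject; override;
    procedure FreeInstance; override;
    function MoveNext: Boolean;
    property Current: Integer read GetCurrent;
  end;

  TIntList = class
  public
    Items: TIntArray;
    function GetEnumerator: TReusedEnumerator;
  end;

var
  EnumBlock: Pointer = nil;   // the single shared block (assumption: one user at a time)

class function TReusedEnumerator.NewInstance: TObject;
begin
  if EnumBlock = nil then
    GetMem(EnumBlock, InstanceSize);   // allocated once
  Result := InitInstance(EnumBlock);   // zero the block and set up the instance
end;

procedure TReusedEnumerator.FreeInstance;
begin
  CleanupInstance;   // finalize fields, but keep the memory for reuse
end;

constructor TReusedEnumerator.Create(AList: TIntList);
begin
  inherited Create;
  FList := AList;
  FIndex := -1;
end;

function TReusedEnumerator.GetCurrent: Integer;
begin
  Result := FList.Items[FIndex];
end;

function TReusedEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex < Length(FList.Items);
end;

function TIntList.GetEnumerator: TReusedEnumerator;
begin
  Result := TReusedEnumerator.Create(Self);
end;

var
  List: TIntList;
  Pass, Value: Integer;
begin
  List := TIntList.Create;
  SetLength(List.Items, 3);
  List.Items[0] := 1;
  List.Items[1] := 2;
  List.Items[2] := 3;

  // Each pass constructs and destroys an enumerator, reusing EnumBlock.
  for Pass := 1 to 2 do
  begin
    for Value in List do
      Write(Value, ' ');
    WriteLn;
  end;

  List.Free;
  FreeMem(EnumBlock);   // release the shared block once, at the end
end.
```

Nested loops over the same enumerator class would overwrite each other's
state here, which is why the single-instance assumption matters.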


Michael.


Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Ryan Joseph

> On Oct 4, 2017, at 4:26 PM, Michael Van Canneyt  
> wrote:
> 
> You can do so by overriding the newinstance and its sister method of
> your enumerator class.

Can you explain how this works or give an example? Not sure how this gets 
around the problem of alloc/dealloc for each iterator.

Regards,
Ryan Joseph


Re: [fpc-pascal] Yet another thread on Unicode Strings

2017-10-04 Thread Mattias Gaertner
On Wed, 4 Oct 2017 13:10:02 +0100
Tony Whyman  wrote:

> Unicode Character String handling is a question that keeps coming up on 
> the Free Pascal Mailing lists and, empirically, it is hard to avoid the 
> conclusion that there is something wrong with the way these character 
> string types are handled. Otherwise, why does this issue keep arising?

Mixing string types, mixing encodings, mixing legacy code, confusing
UCS-2 with UTF-16, ...


>[...]
> Another problem is that there is no character type for a Unicode 
> Character.

I'm curious: What languages have such a type?

> The built-in type “WideChar” is only two bytes and cannot 
> hold a code point that UTF-16 encodes as a surrogate pair. There is no 
> char type for a UTF-8 character and, while UCS4Char exists, the Lazarus 
> UTF-8 utilities use “cardinal” as the type for a code point (not exactly 
> strong typing).

Should be remedied.

>[...]
> Let the programmer worry about the algorithm and the compiler worry about
> the best implementation.

A UTF-32 string type is seldom the best choice for memory
and/or speed.

>[...]
> I want to propose a new character type called “UniChar” - short for 
> Unicode Character, along with a new string type “UniString” and a new 
> collection “TUniStrings”. I have presented my thoughts here in a 
> detailed paper
> 
> see https://mwasoftware.co.uk/docs/unistringproposal.pdf
> 
> This is intended to be a fully worked proposal and I have circulated it 
> to provoke discussion and in the hope that it may be useful.

Adding another string type without disabling some old string types will
increase the confusion. Please provide a proposal for disabling old
string types.

Also keep in mind that there is still no UTF-16 RTL, even though
many people need that for Delphi compatibility. Starting yet another
RTL, for UTF-32, needs some heavily dedicated programmers.

Mattias

[fpc-pascal] Yet another thread on Unicode Strings

2017-10-04 Thread Tony Whyman
Unicode Character String handling is a question that keeps coming up on 
the Free Pascal Mailing lists and, empirically, it is hard to avoid the 
conclusion that there is something wrong with the way these character 
string types are handled. Otherwise, why does this issue keep arising?


Supporters of the current implementation point to the rich set of 
functions available to handle both UTF-8 and UTF-16 in addition to 
legacy ANSI code pages. That is true – but it may be that it is also the 
problem. The programmer is too often forced to be aware of how strings 
are encoded and must make a choice as to which is the preferred 
character encoding for their program. There then follows confusion over 
how to make that choice. Is Delphi compatibility the goal? What 
languages must I support? If I want platform independence which is the 
best encoding? Which encoding gives the best performance for my 
algorithm? And so on.


Another problem is that there is no character type for a Unicode 
Character. The built-in type “WideChar” is only two bytes and cannot 
hold a code point that UTF-16 encodes as a surrogate pair. There is no 
char type for a UTF-8 character and, while UCS4Char exists, the Lazarus 
UTF-8 utilities use “cardinal” as the type for a code point (not exactly 
strong typing).


In order to stop all this confusion I believe that there has to be a 
return to Pascal's original fundamental concept. That is the value of a 
character type represents a character, while the encoding of the 
character is platform dependent and a choice the compiler makes and not 
the programmer. Likewise a character string is an array of characters 
that can be indexed by character (not byte) number, from which 
substrings can be selected and compared with other strings according to 
the locale and the unicode standard collating sequence. Let the 
programmer worry about the algorithm and the compiler worry about the 
best implementation.


I want to propose a new character type called “UniChar” - short for 
Unicode Character, along with a new string type “UniString” and a new 
collection “TUniStrings”. I have presented my thoughts in a 
detailed paper; see https://mwasoftware.co.uk/docs/unistringproposal.pdf

This is intended to be a fully worked proposal and I have circulated it 
to provoke discussion and in the hope that it may be useful.


The intent is to create a character and string handling design that is 
natural to use with the programmer rarely if ever having to think about 
the character or string encoding. They are dealing with Unicode 
Characters and strings of Unicode Characters and that is all. When 
necessary, transliteration happens naturally and as a consequence of 
string concatenation, input/output, or in the rare cases when 
performance demands a specific character encoding.


There is also a strong desire to avoid creating more choice and hence 
more confusion. The intent is to “embrace and replace”. Both AnsiString 
and UnicodeString should be seen as subsets or special cases of the 
proposed UniString, and with concrete types such as AnsiChar, WideChar 
and WideString, other than for legacy reasons, existing primarily to 
define external interfaces.


Tony Whyman

MWA Software


Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Graeme Geldenhuys

On 2017-10-04 09:41, Ryan Joseph wrote:

> I’d like to remove all these alloc/deallocs so I can use for..in more 
> efficiently in tight loops.


I've had the same requirement, and also needed that functionality before 
the for..in syntax existed in FPC. Take a look at my Iterator interface 
and implementation. It allows you to move forward, backwards, reset, 
filter data etc. You can hold on to the instance reference as long as 
you like.


This code lives in the tiOPF project, but can be used outside of the 
tiOPF project too (I do that often) - simply delete the iterator 
implementations for TtiObjectList.



https://github.com/graemeg/tiopf/blob/tiopf2/Options/tiIteratorIntf.pas

https://github.com/graemeg/tiopf/blob/tiopf2/Options/tiIteratorImpl.pas

Some years ago I wrote an article for a magazine about this, and that is 
when I implemented the code. You can still find the "Iterator" article 
at the link below, along with the accompanying source code (though the 
tiOPF code is newer).


  http://geldenhuys.co.uk/articles/


Regards,
  Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

My public PGP key:  http://tinyurl.com/graeme-pgp

Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Marco van de Voort
In our previous episode, Michael Van Canneyt said:
> As an alternative you can create an object enumerator. 
> It's simply allocated on the stack, and you can reset it in the enumerator
> operator.

Yup, or a record.  See e.g. http://www.stack.nl/~marcov/lightcontainers.zip


Re: [fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Michael Van Canneyt



On Wed, 4 Oct 2017, Ryan Joseph wrote:

> As I understand it, the for..in loop's GetEnumerator method is expected to
> create a new object each time it’s called, and FPC destroys it later when
> the loop is finished. Can I retain the enumerator and just reset it in
> between calls? I’d like to remove all these alloc/deallocs so I can use
> for..in more efficiently in tight loops.

You can do so by overriding the newinstance and its sister method of
your enumerator class.

As an alternative you can create an object enumerator.
It's simply allocated on the stack, and you can reset it in the enumerator
operator.

See the example in http://wiki.freepascal.org/for-in_loop
section: Using any identifiers instead of builtin MoveNext and Current

Michael.

[fpc-pascal] For ..in GetEnumerator Allocation

2017-10-04 Thread Ryan Joseph
As I understand it, the for..in loop's GetEnumerator method is expected to
create a new object each time it’s called, and FPC destroys it later when the
loop is finished. Can I retain the enumerator and just reset it in between
calls? I’d like to remove all these alloc/deallocs so I can use for..in more
efficiently in tight loops.

Regards,
Ryan Joseph
