Re: [fpc-pascal] Managed properties idea
On 05.10.2017 05:04, "Ryan Joseph" wrote:
> I’ve been wanting to learn how to contribute to the compiler for years now but maybe this is an easy enough project to start with. I don’t know if this is a problem people have though but I assume it may be since Objective-C had a system like this for memory management and properties. What do you think? How do most FPC programmers handle memory these days?

The way to go in Object Pascal is reference counted interfaces (aka COM-style interfaces). After all, one should program against an interface anyway, and the fact that reference counted interfaces are handled automatically by the compiler is an added bonus.

Regards,
Sven
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
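To make the suggestion concrete, here is a minimal sketch of a property backed by a reference counted interface. The names IData, TData and TMyClass are made up for illustration and are not taken from the thread; the only thing it relies on is that FPC counts references for COM-style interfaces.

```pascal
program RefCountedProperty;
{$mode objfpc}{$H+}
{$interfaces com}  // COM-style (reference counted) interfaces, the FPC default

type
  // Hypothetical interface, for illustration only.
  IData = interface
    function GetValue: Integer;
  end;

  // TInterfacedObject supplies the reference counting.
  TData = class(TInterfacedObject, IData)
  private
    FValue: Integer;
  public
    constructor Create(AValue: Integer);
    function GetValue: Integer;
  end;

  TMyClass = class
  private
    FData: IData;  // managed field: the compiler adjusts the reference count
  public
    property Data: IData read FData write FData;
  end;

constructor TData.Create(AValue: Integer);
begin
  inherited Create;
  FValue := AValue;
end;

function TData.GetValue: Integer;
begin
  Result := FValue;
end;

var
  Holder: TMyClass;
begin
  Holder := TMyClass.Create;
  try
    Holder.Data := TData.Create(42);  // no hand-written retain call
    WriteLn(Holder.Data.GetValue);
    Holder.Data := nil;               // drops the reference held by Holder
  finally
    Holder.Free;                      // any interface fields still set are released here
  end;
end.
```

The property needs no setter at all: assigning to an interface-typed field is enough for the compiler to insert the reference count updates.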
[fpc-pascal] Managed properties idea
Yesterday when I was thinking about those enumerators I had an idea for a language feature, something I saw introduced in Objective-C some years ago. Retaining a member variable through a property is a common pattern I use nearly every day. In my root class I manage a simple retain count which frees the object when it reaches 0. When assigning a variable that the receiver wants to take ownership of, I have to write an accessor which manages the retain count of the object (boring boilerplate).

Consider the example below, where the root class (TObject) overrides a class function which performs this retain count management (ManageObject). In the other class there is a member of type TObject which TMyClass will take ownership of when assigned. If the property “Data” includes the “managed” keyword, the compiler will automatically call ClassType.ManageObject() and take the return value as the value which is written to “m_data”.

I’ve been wanting to learn how to contribute to the compiler for years now but maybe this is an easy enough project to start with. I don’t know if this is a problem people have though but I assume it may be since Objective-C had a system like this for memory management and properties. What do you think? How do most FPC programmers handle memory these days?
type
  TObject = class
    class function ManageObject (obj: pointer): TObject; override;
  end;

type
  TMyClass = class (TObject)
  private
    m_data: TObject;
  public
    property Data: TObject read m_data write m_data; managed;
    // if there’s a write method defined the compiler calls ManageObject before the write method
    property MyData: TObject read m_data write SetData; managed;
  end;

class function TObject.ManageObject (obj: pointer): TObject;
begin
  if obj <> nil then
    result := TObject(obj).Retain
  else
    begin
      TObject(obj).Release;
      result := nil;
    end;
end;

Regards,
Ryan Joseph
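For comparison, the hand-written accessor that the proposal wants to get rid of might look roughly like this. It is only a sketch: TRetainObject, Retain and Release are hypothetical names standing in for the poster's own root class, and the rule that the count starts at 1 is an assumption.

```pascal
program ManualRetain;
{$mode objfpc}{$H+}

type
  // Hypothetical root class with a simple retain count (assumed to start at 1).
  TRetainObject = class
  private
    FRetainCount: Integer;
  public
    constructor Create;
    function Retain: TRetainObject;
    procedure Release;
  end;

  TMyClass = class(TRetainObject)
  private
    FData: TRetainObject;
    procedure SetData(AValue: TRetainObject);
  public
    destructor Destroy; override;
    property Data: TRetainObject read FData write SetData;
  end;

constructor TRetainObject.Create;
begin
  inherited Create;
  FRetainCount := 1;  // the creator holds the first reference
end;

function TRetainObject.Retain: TRetainObject;
begin
  Inc(FRetainCount);
  Result := Self;
end;

procedure TRetainObject.Release;
begin
  Dec(FRetainCount);
  if FRetainCount = 0 then
    Free;  // nothing may touch Self after this point
end;

// The boilerplate accessor: retain the new value, release the old one.
procedure TMyClass.SetData(AValue: TRetainObject);
begin
  if AValue = FData then
    Exit;
  if AValue <> nil then
    AValue.Retain;
  if FData <> nil then
    FData.Release;
  FData := AValue;
end;

destructor TMyClass.Destroy;
begin
  if FData <> nil then
    FData.Release;
  inherited Destroy;
end;

var
  Holder: TMyClass;
  Item: TRetainObject;
begin
  Holder := TMyClass.Create;
  Item := TRetainObject.Create;
  Holder.Data := Item;  // Holder retains Item
  Item.Release;         // creator gives up its reference; Holder still owns Item
  Holder.Release;       // frees Holder, which in turn releases Item
end.
```

Retaining the new value before releasing the old one keeps the setter safe even when both happen to be the same object.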
[fpc-pascal] Finding long file names
I'm trying to use findfirst()/findnext() to obtain a list of files. Here's my code:

  Searchfile:=Tap_Drive+Tap_Path+'\'+Tap_SubDirectory+'\*.TAP';
  If FindFirst(Searchfile, FAAnyfile-FAHidden, FileDirInfo)=0 then
  ..

It finds most files, even ones with really long file names; however, it can't find files with periods in the file name. So it will find:

  This is a TEST.Tap

But it will not find:

  This.is.a.TEST.tap

If I change my search string to:

  Searchfile:=Tap_Drive+Tap_Path+'\'+Tap_SubDirectory+'\*.*';

then it DOES find the files with more than one period in them... along with everything else. I could filter them out myself I suppose, but that seems to defeat the way findfirst is supposed to work.

Any ideas how to make this work? Is there a better method to use than findfirst()? I notice that if I use Extractfileext() with This.is.a.TEST.tap it correctly returns '.tap' as the extension. Maybe findfirst is an obsolete way of listing the files? Or maybe it just never got fixed to handle valid files with more than one period? Any thoughts on this?

James
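The "filter them out myself" workaround mentioned above can be sketched as follows. This does not explain why the '*.TAP' mask misses those files; it only sidesteps the mask by listing everything and comparing ExtractFileExt case-insensitively. CollectByExt is a made-up helper name, and the current directory is used purely as an example.

```pascal
program ListTapFiles;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes;

// Hypothetical helper: add every file in Dir whose extension equals Ext
// (for example '.tap'), ignoring case, to Files.
procedure CollectByExt(const Dir, Ext: string; Files: TStrings);
var
  Info: TSearchRec;
begin
  if FindFirst(IncludeTrailingPathDelimiter(Dir) + '*', faAnyFile, Info) = 0 then
  begin
    repeat
      // Skip directories, then match on the real extension.
      if ((Info.Attr and faDirectory) = 0) and
         SameText(ExtractFileExt(Info.Name), Ext) then
        Files.Add(Info.Name);
    until FindNext(Info) <> 0;
    FindClose(Info);
  end;
end;

var
  Names: TStringList;
  I: Integer;
begin
  Names := TStringList.Create;
  try
    CollectByExt(GetCurrentDir, '.tap', Names);
    for I := 0 to Names.Count - 1 do
      WriteLn(Names[I]);
  finally
    Names.Free;
  end;
end.
```

Since ExtractFileExt only looks at the last period, a name such as This.is.a.TEST.tap is matched the same way as This is a TEST.Tap.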
Re: [fpc-pascal] For ..in GetEnumerator Allocation
> On Oct 4, 2017, at 10:03 PM, Michael Van Canneyt wrote:
>
> Newinstance allocates the memory for a new instance of the class.
> By default this is GetMem(instanceSize).

So you override the class method NewInstance in the enumerator class and return the same block of memory? Then when I override FreeInstance and do nothing, I need to manually free the memory in the calling class, I guess.

That’s an interesting solution and I didn’t even know TObject did that, but using records still seems like a better solution and about as efficient (perhaps). I’ll try this tomorrow also to make sure I understand it. Thanks.

Regards,
Ryan Joseph
Re: [fpc-pascal] For ..in GetEnumerator Allocation
> On Oct 4, 2017, at 5:19 PM, Marco van de Voort wrote:
>
> Yup, or a record. See e.g. http://www.stack.nl/~marcov/lightcontainers.zip

This seems like the simplest and most efficient method. Does FPC just know it’s a record internally and not try to dealloc it? GetEnumerator seems like a magic method so I’m not sure how the memory is being managed, but I assume the stack space for the method return value is reserved and I just fill in the values of the record at that location.

Regards,
Ryan Joseph
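A minimal sketch of the record approach, assuming a toy container (TIntList and TIntEnumerator are illustrative names, not code from lightcontainers.zip). Because the enumerator is a plain record value, there is no heap object to create or free for each loop.

```pascal
program RecordEnumerator;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}  // records with methods and properties

type
  TIntArray = array of Integer;

  // The enumerator is a record, so it is returned by value.
  TIntEnumerator = record
  private
    FItems: TIntArray;
    FIndex: Integer;
    function GetCurrent: Integer;
  public
    procedure Init(const AItems: TIntArray);
    function MoveNext: Boolean;
    property Current: Integer read GetCurrent;
  end;

  TIntList = class
  public
    Items: TIntArray;
    function GetEnumerator: TIntEnumerator;
  end;

procedure TIntEnumerator.Init(const AItems: TIntArray);
begin
  FItems := AItems;
  FIndex := -1;  // MoveNext advances to the first element
end;

function TIntEnumerator.GetCurrent: Integer;
begin
  Result := FItems[FIndex];
end;

function TIntEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex < Length(FItems);
end;

function TIntList.GetEnumerator: TIntEnumerator;
begin
  Result.Init(Items);
end;

var
  List: TIntList;
  V, Sum: Integer;
begin
  List := TIntList.Create;
  try
    SetLength(List.Items, 3);
    List.Items[0] := 10;
    List.Items[1] := 20;
    List.Items[2] := 30;
    Sum := 0;
    for V in List do
      Inc(Sum, V);
    WriteLn(Sum);  // 60
  finally
    List.Free;
  end;
end.
```

The loop only needs MoveNext and Current to exist on whatever GetEnumerator returns, which is why a record works as well as a class here.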
Re: [fpc-pascal] For ..in GetEnumerator Allocation
On Wed, 4 Oct 2017 15:41:27 +0700 Ryan Joseph wrote:

> As I understand the for..in loop GetEnumerator method is expected to create a
> new object each time it’s called and FPC destroys it later when the loop is
> finished. Can I retain the enumerator and just reset it in-between calls? I’d
> like to remove all these alloc/deallocs so I can use for..in more efficiently
> in tight loops.

Here is an example of how to use a global enumerator object:

type
  TMyEnumerator = class
  private
    FOwner: TComponent;
    FCurrent: TComponent;
  public
    procedure Init(Owner: TComponent);
    function MoveNext: boolean;
    property Current: TComponent read FCurrent;
    function GetEnumerator: TMyEnumerator;
  end;

  TForm1 = class(TForm)
  public
    function GetEnumerator: TMyEnumerator;
  end;

var
  Form1: TForm1;
  MyEnumerator: TMyEnumerator;

implementation

procedure TMyEnumerator.Init(Owner: TComponent);
begin
  FOwner:=Owner;
  FCurrent:=nil;
end;

function TMyEnumerator.MoveNext: boolean;
var
  i: Integer;
begin
  if FCurrent=nil then
  begin
    // first call: start at the first component, if there is one
    if FOwner.ComponentCount=0 then exit(false);
    FCurrent:=FOwner.Components[0];
  end else
  begin
    // later calls: advance to the next sibling component
    i:=FCurrent.ComponentIndex+1;
    if i>=FCurrent.Owner.ComponentCount then exit(false);
    FCurrent:=FCurrent.Owner.Components[i];
  end;
  Result:=true;
end;

function TMyEnumerator.GetEnumerator: TMyEnumerator;
begin
  Result:=Self;
end;

function TForm1.GetEnumerator: TMyEnumerator;
begin
  if MyEnumerator=nil then
    MyEnumerator:=TMyEnumerator.Create;
  MyEnumerator.Init(Self);
  Result:=MyEnumerator;
end;

finalization
  MyEnumerator.Free;
end.
Re: [fpc-pascal] Yet another thread on Unicode Strings
In our previous episode, Tony Whyman said:

> Unicode Character String handling is a question that keeps coming up on
> the Free Pascal Mailing lists and, empirically, it is hard to avoid the
> conclusion that there is something wrong with the way these character
> string types are handled. Otherwise, why does this issue keep arising?

Because people have old code that is ascii, or handles unicode in a different, ad-hoc manner. Moreover, FPC/Lazarus is also still usable in an ascii-only mode for old projects.

> The programmer is too often forced to be aware of how strings
> are encoded and must make a choice as to which is the preferred
> character encoding for their program. There then follows confusion over
> how to make that choice.

To avoid confusion, make sure it is unicode. It doesn't matter that much if it is utf16 or not.

> Is Delphi compatibility the goal? What
> Languages must I support? If I want platform independence which is the
> best encoding? Which encoding gives the best performance for my
> algorithm? And so on.
> Another problem is that there is no character type for a Unicode
> Character. The built-in type “WideChar” is only two bytes and cannot
> hold a UTF-16 code point encoded as a surrogate pair. There is no
> char type for a UTF-8 character and, while UCS4Char exists, the Lazarus
> UTF-8 utilities use “cardinal” as the type for a code point (not exactly
> strong typing).

Most code will simply use "string" to hold a character. Only special code, and code that really must be performant, will do other things.

> In order to stop all this confusion I believe that there has to be a
> return to Pascal's original fundamental concept. That is the value of a
> character type represents a character, while the encoding of the
> character is platform dependent and a choice the compiler makes and not
> the programmer. Likewise a character string is an array of characters
> that can be indexed by character (not byte) number, from which
> substrings can be selected and compared with other strings according to
> the locale and the unicode standard collating sequence. Let the
> programmer worry about the algorithm and the compiler worry about the
> best implementation.
>
> I want to propose a new character type called “UniChar” - short for
> Unicode Character, along with a new string type “UniString” and a new
> collection “TUniStrings”. I have presented my thoughts here in a
> detailed paper
>

This doesn't work, and it seems you haven't read the backlog of unicode related messages all the way back to early 2009. What you suggest was one of the null hypotheses back then, and we are now 8 years further.

Search for the unicode meanings of (1) glyph, (2) character, (3) codepoint, (4) denormalized strings.

If you digest all that, you need to define the unichar type very large, blowing up strings enormously, and then convert it back to either utf16 or utf8 again to communicate with nearly anything (APIs, libraries etc).

Moreover, it will just require yet another conversion and create more confusion with more competing systems. So the number of problems will only rise. And the incompatibility with Delphi is still there, so it will create trouble ad infinitum.

This argument is best summed up by this cartoon: https://xkcd.com/927/

In short, there is no substitute for actively learning what unicode is about and living with it. Some of the problems were summed up in the discussion back then: http://www.stack.nl/~marcov/unicode.pdf

Note that in hindsight I don't think Florian's proposal was that bad, and Florian was somewhat vindicated by Delphi's choice of a multi-encoding ansistring type. My new opinion is that whatever the choice is, choosing differently from Delphi (despite all its flaws, perceived OR real, doesn't matter) was wrong.
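The distinction between code units, code points and what a user sees as one character can be illustrated with a small sketch. CodePointCount is a made-up helper, and the two sample strings are my own choice, not from the discussion.

```pascal
program CodePointsVsUnits;
{$mode objfpc}{$H+}

// Count Unicode code points in a UTF-16 string by not counting the
// low (second) half of each surrogate pair. Assumes well-formed UTF-16.
function CodePointCount(const S: UnicodeString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) < $DC00) or (Ord(S[I]) > $DFFF) then
      Inc(Result);
end;

var
  Emoji, Accented: UnicodeString;
begin
  // U+1F600 lies outside the BMP, so it takes two WideChar code units.
  SetLength(Emoji, 2);
  Emoji[1] := WideChar($D83D);
  Emoji[2] := WideChar($DE00);

  // 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points
  // that are normally displayed as a single accented letter.
  SetLength(Accented, 2);
  Accented[1] := WideChar($0065);
  Accented[2] := WideChar($0301);

  WriteLn(Length(Emoji), ' ', CodePointCount(Emoji));        // 2 1
  WriteLn(Length(Accented), ' ', CodePointCount(Accented));  // 2 2
end.
```

The second string is the awkward case for any fixed-size character type: even a 32-bit code point type would still need two elements for what is displayed as one letter.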
Re: [fpc-pascal] For ..in GetEnumerator Allocation
On Wed, 4 Oct 2017, Ryan Joseph wrote:

>> On Oct 4, 2017, at 4:26 PM, Michael Van Canneyt wrote:
>>
>> You can do so by overriding the newinstance and its sister method of your enumerator class.
>
> Can you explain how this works or give an example? Not sure how this gets around the problem of alloc/dealloc for each iterator.

Careful, the iterator is instantiated only once per loop.

Newinstance allocates the memory for a new instance of the class. By default this is GetMem(instanceSize).

What you can do is allocate a block of the correct size somewhere, once, and always return this single block. Of course, you need to know for sure that only 1 instance of your enumerator will be active at any given point in your program. If there are multiple, you could allocate a static array and allocate from that.

The FreeInstance procedure does a freemem by default, but you can change that to do nothing. (In the case of an array, you mark the element unused.)

Michael.
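The description above might translate into something like the following sketch. It assumes, as stated, that at most one enumerator of this class is alive at any time, so nested loops over the same type would break it. TIntEnumerator, TIntList and the global EnumBlock are illustrative names only.

```pascal
program ReuseEnumeratorBlock;
{$mode objfpc}{$H+}

type
  TIntArray = array of Integer;

  TIntEnumerator = class
  private
    FItems: TIntArray;
    FIndex: Integer;
    function GetCurrent: Integer;
  public
    constructor Create(const AItems: TIntArray);
    class function NewInstance: TObject; override;
    procedure FreeInstance; override;
    function MoveNext: Boolean;
    property Current: Integer read GetCurrent;
  end;

  TIntList = class
  public
    Items: TIntArray;
    function GetEnumerator: TIntEnumerator;
  end;

var
  EnumBlock: Pointer = nil;  // the single block shared by all enumerators

class function TIntEnumerator.NewInstance: TObject;
begin
  // Allocate once, then hand out the same block every time.
  if EnumBlock = nil then
    GetMem(EnumBlock, InstanceSize);
  Result := InitInstance(EnumBlock);  // clears the block and sets up the instance
end;

procedure TIntEnumerator.FreeInstance;
begin
  // Release managed fields (the array reference) but keep the memory.
  CleanupInstance;
end;

constructor TIntEnumerator.Create(const AItems: TIntArray);
begin
  inherited Create;
  FItems := AItems;
  FIndex := -1;
end;

function TIntEnumerator.GetCurrent: Integer;
begin
  Result := FItems[FIndex];
end;

function TIntEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);
  Result := FIndex < Length(FItems);
end;

function TIntList.GetEnumerator: TIntEnumerator;
begin
  Result := TIntEnumerator.Create(Items);
end;

var
  List: TIntList;
  V, Sum, Pass: Integer;
begin
  List := TIntList.Create;
  SetLength(List.Items, 3);
  List.Items[0] := 1;
  List.Items[1] := 2;
  List.Items[2] := 3;
  Sum := 0;
  for Pass := 1 to 1000 do
    for V in List do  // each loop reuses EnumBlock for its enumerator
      Inc(Sum, V);
  WriteLn(Sum);
  List.Free;
  FreeMem(EnumBlock);  // give the shared block back at the end
end.
```

FreeInstance still calls CleanupInstance here because InitInstance clears the block on the next use; without it the dynamic array reference held in FItems would never be released.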
Re: [fpc-pascal] For ..in GetEnumerator Allocation
> On Oct 4, 2017, at 4:26 PM, Michael Van Canneyt wrote:
>
> You can do so by overriding the newinstance and its sister method of
> your enumerator class.

Can you explain how this works or give an example? Not sure how this gets around the problem of alloc/dealloc for each iterator.

Regards,
Ryan Joseph
Re: [fpc-pascal] Yet another thread on Unicode Strings
On Wed, 4 Oct 2017 13:10:02 +0100 Tony Whyman wrote:

> Unicode Character String handling is a question that keeps coming up on
> the Free Pascal Mailing lists and, empirically, it is hard to avoid the
> conclusion that there is something wrong with the way these character
> string types are handled. Otherwise, why does this issue keep arising?

Mixing string types, mixing encodings, mixing legacy code, confusing UCS-2 with UTF-16,

> [...]
> Another problem is that there is no character type for a Unicode
> Character.

I'm curious: What languages have such a type?

> The built-in type “WideChar” is only two bytes and cannot
> hold a UTF-16 code point encoded as a surrogate pair. There is no
> char type for a UTF-8 character and, while UCS4Char exists, the Lazarus
> UTF-8 utilities use “cardinal” as the type for a code point (not exactly
> strong typing).

Should be remedied.

> [...]
> Let the programmer worry about the algorithm and the compiler worry about the best implementation.

A UTF-32 string type is seldom the best choice for memory and/or speed.

> [...]
> I want to propose a new character type called “UniChar” - short for
> Unicode Character, along with a new string type “UniString” and a new
> collection “TUniStrings”. I have presented my thoughts here in a
> detailed paper
>
> see https://mwasoftware.co.uk/docs/unistringproposal.pdf
>
> This is intended to be a fully worked proposal and I have circulated it
> to provoke discussion and in the hope that it may be useful.

Adding another string type without disabling some old string types will increase the confusion. Please provide a proposal for disabling old string types.

Also keep in mind that there is still no UTF-16 RTL, even though many people need that for Delphi compatibility. Starting yet another RTL, this time for UTF-32, needs some heavily dedicated programmers.

Mattias
[fpc-pascal] Yet another thread on Unicode Strings
Unicode Character String handling is a question that keeps coming up on the Free Pascal Mailing lists and, empirically, it is hard to avoid the conclusion that there is something wrong with the way these character string types are handled. Otherwise, why does this issue keep arising?

Supporters of the current implementation point to the rich set of functions available to handle both UTF-8 and UTF-16 in addition to legacy ANSI code pages. That is true, but it may be that it is also the problem. The programmer is too often forced to be aware of how strings are encoded and must make a choice as to which is the preferred character encoding for their program. There then follows confusion over how to make that choice. Is Delphi compatibility the goal? What Languages must I support? If I want platform independence which is the best encoding? Which encoding gives the best performance for my algorithm? And so on.

Another problem is that there is no character type for a Unicode Character. The built-in type “WideChar” is only two bytes and cannot hold a UTF-16 code point encoded as a surrogate pair. There is no char type for a UTF-8 character and, while UCS4Char exists, the Lazarus UTF-8 utilities use “cardinal” as the type for a code point (not exactly strong typing).

In order to stop all this confusion I believe that there has to be a return to Pascal's original fundamental concept. That is, the value of a character type represents a character, while the encoding of the character is platform dependent and a choice the compiler makes and not the programmer. Likewise, a character string is an array of characters that can be indexed by character (not byte) number, from which substrings can be selected and compared with other strings according to the locale and the unicode standard collating sequence. Let the programmer worry about the algorithm and the compiler worry about the best implementation.

I want to propose a new character type called “UniChar” - short for Unicode Character, along with a new string type “UniString” and a new collection “TUniStrings”. I have presented my thoughts here in a detailed paper

see https://mwasoftware.co.uk/docs/unistringproposal.pdf

This is intended to be a fully worked proposal and I have circulated it to provoke discussion and in the hope that it may be useful.

The intent is to create a character and string handling design that is natural to use, with the programmer rarely if ever having to think about the character or string encoding. They are dealing with Unicode Characters and strings of Unicode Characters and that is all. When necessary, transliteration happens naturally and as a consequence of string concatenation, input/output, or in the rare cases when performance demands a specific character encoding.

There is also a strong desire to avoid creating more choice and hence more confusion. The intent is to “embrace and replace”. Both AnsiString and UnicodeString should be seen as subsets or special cases of the proposed UniString, with concrete types such as AnsiChar, WideChar and WideString existing, other than for legacy reasons, primarily to define external interfaces.

Tony Whyman
MWA Software
Re: [fpc-pascal] For ..in GetEnumerator Allocation
On 2017-10-04 09:41, Ryan Joseph wrote:
> I’d like to remove all these alloc/deallocs so I can use for..in more efficiently in tight loops.

I've had the same requirement, and also needed that functionality before the for..in syntax existed in FPC. Take a look at my Iterator interface and implementation. It allows you to move forward and backwards, reset, filter data etc. You can hold on to the instance reference as long as you like.

This code lives in the tiOPF project, but can be used outside of the tiOPF project too (I do that often) - simply delete the iterator implementations for TtiObjectList.

https://github.com/graemeg/tiopf/blob/tiopf2/Options/tiIteratorIntf.pas
https://github.com/graemeg/tiopf/blob/tiopf2/Options/tiIteratorImpl.pas

Some years ago I wrote an article for a magazine about this, and that is when I implemented the code. You can still find the "Iterator" article at the link below, and the accompanying source code too (though the tiOPF code is newer).

http://geldenhuys.co.uk/articles/

Regards,
Graeme

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/
My public PGP key: http://tinyurl.com/graeme-pgp
Re: [fpc-pascal] For ..in GetEnumerator Allocation
In our previous episode, Michael Van Canneyt said:
> As an alternative you can create an object enumerator.
> It's simply allocated on the stack, and you can reset it in the enumerator
> operator.

Yup, or a record. See e.g. http://www.stack.nl/~marcov/lightcontainers.zip
Re: [fpc-pascal] For ..in GetEnumerator Allocation
On Wed, 4 Oct 2017, Ryan Joseph wrote:

> As I understand the for..in loop GetEnumerator method is expected to create a new object each time it’s called and FPC destroys it later when the loop is finished. Can I retain the enumerator and just reset it in-between calls? I’d like to remove all these alloc/deallocs so I can use for..in more efficiently in tight loops.

You can do so by overriding the newinstance and its sister method of your enumerator class.

As an alternative you can create an object enumerator. It's simply allocated on the stack, and you can reset it in the enumerator operator.

See the example in http://wiki.freepascal.org/for-in_loop
section: Using any identifiers instead of builtin MoveNext and Current

Michael.
[fpc-pascal] For ..in GetEnumerator Allocation
As I understand the for..in loop GetEnumerator method is expected to create a new object each time it’s called and FPC destroys it later when the loop is finished. Can I retain the enumerator and just reset it in-between calls? I’d like to remove all these alloc/deallocs so I can use for..in more efficiently in tight loops.

Regards,
Ryan Joseph