Re: RE : [fpc-devel] Unicode support (yet again)
On 18.09.2011 02:22, Flávio Etrusco wrote:
> On Sat, Sep 17, 2011 at 10:59 AM, DaWorm wrote:
>> This might be total crap, so bear with me a moment. In an object like a StringList, there is a default property such as Strings, such that List.Strings[1] is equivalent to List[1], is there not? If, as in .NET or Java, all strings become objects, then you could have a String object whose default property is Chars, whose type isn't really a char, but another String whose length is one entity.
>
> That's somewhat what I was thinking. Actually something like
>
>   UnicodeString = object
>   strict private
>     FEncoding: Integer;
>     FBuffer: AnsiString;
>     function GetCodePointAt(AIndex: SizeInt): Integer;
>     procedure SetCodePoint(AIndex: SizeInt; p_Value: Integer);
>   public
>     property CodePoint[AIndex: SizeInt]: Integer read GetCodePointAt write SetCodePoint; default;
>   end;
>
> I just don't know whether something like this is already implemented in the test branches, at least for -err- testing...

Well... you can now take a look at trunk as well, because the changes from cpstrnew were merged yesterday.

Regards,
Sven
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
[fpc-devel] implementation AVX for Intel/AMD Prozessors
Hi,

I'm starting on an implementation of AVX (Intel/AMD) in the FPC assembler code.

As a first step I use an external assembler (Linux "as"). In this case it is easy to use the XMM registers (max. 3 parameters); only the file "x86ins.dat" has to be changed. Example:

[VMOVDQA]
(Ch All, Ch None, Ch None)
xmmreg, xmmrm \361\...

To use the YMM registers I have changed the file "x86reg.dat":

NR_YMM0,$0700,ymm0,%ymm0,ymm0,ymm0,21,21,17,OT_YMMREG,0
NR_YMM1,$0701,ymm1,%ymm1,ymm1,ymm1,22,22,18,OT_YMMREG,1
NR_YMM2,$0702,ymm2,%ymm2,ymm2,ymm2,23,23,19,OT_YMMREG,2
NR_YMM3,$0703,ymm3,%ymm3,ymm3,ymm3,24,24,20,OT_YMMREG,3
NR_YMM4,$0704,ymm4,%ymm4,ymm4,ymm4,25,25,21,OT_YMMREG,4
NR_YMM5,$0705,ymm5,%ymm5,ymm5,ymm5,26,26,22,OT_YMMREG,5
NR_YMM6,$0706,ymm6,%ymm6,ymm6,ymm6,27,27,23,OT_YMMREG,6
NR_YMM7,$0707,ymm7,%ymm7,ymm7,ymm7,28,28,24,OT_YMMREG,7
NR_YMM8,$0708,ymm8,%ymm8,ymm8,ymm8,-1,-1,25,OT_YMMREG,0,64
NR_YMM9,$0709,ymm9,%ymm9,ymm9,ymm9,-1,-1,26,OT_YMMREG,1,64
NR_YMM10,$070a,ymm10,%ymm10,ymm10,ymm10,-1,-1,27,OT_YMMREG,2,64
NR_YMM11,$070b,ymm11,%ymm11,ymm11,ymm11,-1,-1,28,OT_YMMREG,3,64
NR_YMM12,$070c,ymm12,%ymm12,ymm12,ymm12,-1,-1,29,OT_YMMREG,4,64
NR_YMM13,$070d,ymm13,%ymm13,ymm13,ymm13,-1,-1,30,OT_YMMREG,5,64
NR_YMM14,$070e,ymm14,%ymm14,ymm14,ymm14,-1,-1,31,OT_YMMREG,6,64
NR_YMM15,$070f,ymm15,%ymm15,ymm15,ymm15,-1,-1,32,OT_YMMREG,7,64

The next step is to change the files "cpubase, cgbase, aasmcpu, ...".
What do I need to consider?

Regards
Torsten
Re: [fpc-devel] implementation AVX for Intel/AMD Prozessors
On 18.09.2011 10:54, Torsten wrote:
> Hi,
>
> I'm starting on an implementation of AVX (Intel/AMD) in the FPC assembler code.
>
> As a first step I use an external assembler (Linux "as"). In this case it is easy to use the XMM registers (max. 3 parameters); only the file "x86ins.dat" has to be changed. Example:
>
> [VMOVDQA]
> (Ch All, Ch None, Ch None)
> xmmreg, xmmrm \361\...

The problem is probably getting the encoding sequence right.

> To use the YMM registers I have changed the file "x86reg.dat":
>
> NR_YMM0,$0700,ymm0,%ymm0,ymm0,ymm0,21,21,17,OT_YMMREG,0
> NR_YMM1,$0701,ymm1,%ymm1,ymm1,ymm1,22,22,18,OT_YMMREG,1
> NR_YMM2,$0702,ymm2,%ymm2,ymm2,ymm2,23,23,19,OT_YMMREG,2
> NR_YMM3,$0703,ymm3,%ymm3,ymm3,ymm3,24,24,20,OT_YMMREG,3
> NR_YMM4,$0704,ymm4,%ymm4,ymm4,ymm4,25,25,21,OT_YMMREG,4
> NR_YMM5,$0705,ymm5,%ymm5,ymm5,ymm5,26,26,22,OT_YMMREG,5
> NR_YMM6,$0706,ymm6,%ymm6,ymm6,ymm6,27,27,23,OT_YMMREG,6
> NR_YMM7,$0707,ymm7,%ymm7,ymm7,ymm7,28,28,24,OT_YMMREG,7
> NR_YMM8,$0708,ymm8,%ymm8,ymm8,ymm8,-1,-1,25,OT_YMMREG,0,64
> NR_YMM9,$0709,ymm9,%ymm9,ymm9,ymm9,-1,-1,26,OT_YMMREG,1,64
> NR_YMM10,$070a,ymm10,%ymm10,ymm10,ymm10,-1,-1,27,OT_YMMREG,2,64
> NR_YMM11,$070b,ymm11,%ymm11,ymm11,ymm11,-1,-1,28,OT_YMMREG,3,64
> NR_YMM12,$070c,ymm12,%ymm12,ymm12,ymm12,-1,-1,29,OT_YMMREG,4,64
> NR_YMM13,$070d,ymm13,%ymm13,ymm13,ymm13,-1,-1,30,OT_YMMREG,5,64
> NR_YMM14,$070e,ymm14,%ymm14,ymm14,ymm14,-1,-1,31,OT_YMMREG,6,64
> NR_YMM15,$070f,ymm15,%ymm15,ymm15,ymm15,-1,-1,32,OT_YMMREG,7,64

I'm not sure whether the ymm registers should be a register class of their own. After all, they are a superset of xmm.

> The next step is to change the files "cpubase, cgbase, aasmcpu, ...".
> What do I need to consider?

In which regard?
Re: [fpc-devel] Unicode support (yet again)
On Sunday 18 September 2011 10.50:26 Sven Barth wrote:
> Well... you can now take a look at trunk as well, because the changes
> from cpstrnew have been merged yesterday.

[...]
make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linux'
/home/mse/packs/standard/svn/fp/trunk/compiler/ppc2 -Ur -Ur -Xs -O2 -n -Fi../inc -Fi../i386 -Fi../unix -Fii386 -FE. -FU/home/mse/packs/standard/svn/fp/trunk/rtl/units/i386-linux -di386 -dRELEASE -Us -Sg system.pp
This binary has no unicodestrings support compiled in.
Recompile the application with a unicodestrings-manager in the program uses clause.
Fatal: Compilation aborted
An unhandled exception occurred at $08056B3F :
ENoWideStringSupport : SIGQUIT signal received.
  $08056B3F
  $08092A8C
  $080926DD
  $080F121E
  $080F48D5
  $080F5CEF
  $08067532

Martin
Re: [fpc-devel] Unicode support (yet again)
On 18.09.2011 11:27, Martin Schreiber wrote:
> On Sunday 18 September 2011 10.50:26 Sven Barth wrote:
>> Well... you can now take a look at trunk as well, because the changes
>> from cpstrnew have been merged yesterday.
>
> [...]
> make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linux'
> /home/mse/packs/standard/svn/fp/trunk/compiler/ppc2 -Ur -Ur -Xs -O2 -n -Fi../inc -Fi../i386 -Fi../unix -Fii386 -FE. -FU/home/mse/packs/standard/svn/fp/trunk/rtl/units/i386-linux -di386 -dRELEASE -Us -Sg system.pp
> This binary has no unicodestrings support compiled in.
> Recompile the application with a unicodestrings-manager in the program uses clause.
> Fatal: Compilation aborted
> An unhandled exception occurred at $08056B3F :
> ENoWideStringSupport : SIGQUIT signal received.
>   $08056B3F
>   $08092A8C
>   $080926DD
>   $080F121E
>   $080F48D5
>   $080F5CEF
>   $08067532

Currently the POSIX-based systems seem to be broken (the Windows ones work). That is already known. The other devs are working on that.

Regards,
Sven
Re: RE : [fpc-devel] Unicode support (yet again)
In our previous episode, Flávio Etrusco said:
> That's somewhat what I was thinking. Actually something like
>
>   UnicodeString = object
>   strict private
>     FEncoding: Integer;
>     FBuffer: AnsiString;
>     function GetCodePointAt(AIndex: SizeInt): Integer;
>     procedure SetCodePoint(AIndex: SizeInt; p_Value: Integer);
>   public
>     property CodePoint[AIndex: SizeInt]: Integer read GetCodePointAt write SetCodePoint; default;
>   end;
>
> I just don't know whether something like this is already implemented in the
> test branches, at least for -err- testing...

Such an ability is not unique to objects. One can also do something like that with a native type.

It was discussed and rejected. The trouble is that it is not that easy. Consider that the first thing a long-time Pascal user will do is fix his existing code, which has many constructs that loop over a string:

  setlength(s2,length(s1));
  for i:=1 to length(s1) do
    s2[i]:=s1[i];

Now, to return codepoint[i], you need to parse all codepoints before [i].

So instead of O(n) this loop suddenly becomes O(n^2).
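To make the cost argument concrete, here is a minimal sketch, assuming UTF-8 data held in a plain AnsiString and well-formed input. The helpers Utf8SeqLen and CodePointOffset are hypothetical names made up for this illustration and are not part of the RTL. Finding the i-th code point means walking over every code point before it, so a loop that asks for code point 1, 2, ..., n does 1 + 2 + ... + n steps in total.

```pascal
program CodePointIndexDemo;
{$mode objfpc}{$H+}

{ Hypothetical helper (not an RTL routine): byte length of the UTF-8
  sequence that starts with lead byte B. Assumes well-formed UTF-8. }
function Utf8SeqLen(B: Byte): SizeInt;
begin
  if B < $80 then
    Result := 1
  else if (B and $E0) = $C0 then
    Result := 2
  else if (B and $F0) = $E0 then
    Result := 3
  else
    Result := 4;
end;

{ Hypothetical helper: 1-based byte offset of the Index-th code point.
  Every preceding code point has to be skipped, so one call is O(Index). }
function CodePointOffset(const S: AnsiString; Index: SizeInt): SizeInt;
var
  I: SizeInt;
begin
  Result := 1;
  for I := 2 to Index do
    Inc(Result, Utf8SeqLen(Ord(S[Result])));
end;

var
  S: AnsiString;
begin
  { 'a', then U+00E9 encoded as the two bytes C3 A9, then 'z' }
  S := 'a' + Chr($C3) + Chr($A9) + 'z';
  WriteLn(Length(S));              // byte count (4), not code point count (3)
  WriteLn(CodePointOffset(S, 3));  // the third code point starts at byte 4
end.
```

Calling CodePointOffset inside a "for i := 1 to n" loop is what turns the linear copy loop quoted above into a quadratic one.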
Re: [fpc-devel] Unicode support (yet again)
On 18.09.2011 11:27, Martin Schreiber wrote:
> On Sunday 18 September 2011 10.50:26 Sven Barth wrote:
>> Well... you can now take a look at trunk as well, because the changes
>> from cpstrnew have been merged yesterday.
>
> [...]
> make[7]: Entering directory `/home/mse/packs/standard/svn/fp/trunk/rtl/linux'
> /home/mse/packs/standard/svn/fp/trunk/compiler/ppc2 -Ur -Ur -Xs -O2 -n -Fi../inc -Fi../i386 -Fi../unix -Fii386 -FE. -FU/home/mse/packs/standard/svn/fp/trunk/rtl/units/i386-linux -di386 -dRELEASE -Us -Sg system.pp
> This binary has no unicodestrings support compiled in.
> Recompile the application with a unicodestrings-manager in the program uses clause.
> Fatal: Compilation aborted
> An unhandled exception occurred at $08056B3F :
> ENoWideStringSupport : SIGQUIT signal received.
>   $08056B3F
>   $08092A8C
>   $080926DD
>   $080F121E
>   $080F48D5
>   $080F5CEF
>   $08067532

For now you can apply the following patch as a workaround. The compiler (and fpmake) will depend on the C library then (which should not be the case in the final solution).

Regards,
Sven

--- fpc/compiler/pp.pas 2011-06-21 20:29:50.340618090 +0200
+++ fpc-build/compiler/pp.pas 2011-09-18 12:11:46.823598604 +0200
@@ -166,6 +166,9 @@
 {$ifdef profile}
 profile,
 {$endif profile}
+{$ifdef unix}
+ cwstring,
+{$endif}
 {$ifndef NOCATCH}
 {$if defined(Unix) or defined(Go32v2) or defined(Watcom)}
 catch,
--- fpc/packages/fpmkunit/src/fpmkunit.pp 2011-09-18 11:44:03.176488622 +0200
+++ fpc-build/packages/fpmkunit/src/fpmkunit.pp 2011-09-18 12:14:34.279871962 +0200
@@ -45,6 +45,9 @@
 uses
 SysUtils, Classes, StrUtils
+{$ifdef unix}
+ ,cwstring
+{$endif}
 {$ifdef HAS_UNIT_PROCESS}
 ,process
 {$endif HAS_UNIT_PROCESS}
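The error message in the log asks for a Unicode string manager in the program's uses clause, and the patch does exactly that for the compiler and fpmake. For an ordinary program the same idea might look like the following minimal sketch. It assumes the stock cwstring unit on a Unix target; the program name and variables are made up for illustration.

```pascal
program WideManagerDemo;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}
  cwstring,   // installs a C-library based wide string manager on Unix targets
  {$endif}
  SysUtils;

var
  U: UnicodeString;
  A: AnsiString;
begin
  U := 'abc';
  // Converting between UnicodeString and AnsiString goes through the
  // installed wide string manager.
  A := AnsiString(U);
  WriteLn(A);
end.
```

As noted in the message, the price is a dependency on the C library, which is why this is described as a workaround rather than the final solution.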
Re: [fpc-devel] Unicode support (yet again)
On 18 Sep 2011, at 12:26, Sven Barth wrote:
> For now you can apply the following patch as a workaround. The compiler (and
> fpmake) will depend on the C-library then (which should not be the case in
> the final solution).

Not only that: even with cwstring (and under Windows) the result is wrong, because doing ansistr1=ansistr2 will currently be evaluated as a case-insensitive string comparison under certain circumstances.

Jonas
Re: [fpc-devel] Unicode support (yet again)
On Sunday 18 September 2011 12.44:26 Jonas Maebe wrote:
> On 18 Sep 2011, at 12:26, Sven Barth wrote:
>> For now you can apply the following patch as a workaround. The compiler
>> (and fpmake) will depend on the C-library then (which should not be the
>> case in the final solution).
>
> Not only that: even with cwstring (and under Windows) the result is wrong,
> because doing ansistr1=ansistr2 will currently be evaluated as a
> case-insensitive string comparison under certain circumstances.

shudder. ;-)

Martin
Re: [fpc-devel] Unicode support (yet again)
On 18/09/2011, Sven Barth wrote:
> Currently the POSIX-based systems seem to be broken (the Windows ones
> work). That is already known. The other devs are working on that.

And it boggles the mind why something so broken / incomplete was merged into trunk in the first place. Isn't that the whole point of branches? Make sure it works after being synced with the latest trunk. If the branch works successfully, then merge it into trunk.

--
Regards,
  - Graeme -

___
fpGUI - a cross-platform Free Pascal GUI toolkit
http://fpgui.sourceforge.net
Re: [fpc-devel] Unicode support (yet again)
On 18 Sep 2011, at 13:16, Graeme Geldenhuys wrote:
> And it boggles the mind why something so broken / incomplete was
> merged into Trunk in the first place?

Yes, we suck from time to time (in this case: testsuite runs were performed, but the sync and merge were done by a person who only had access to a Windows system). Can we move on now?

Jonas
Re: [fpc-devel] Unicode support (yet again)
On 18.09.2011 13:20, Jonas Maebe wrote:
> On 18 Sep 2011, at 13:16, Graeme Geldenhuys wrote:
>> And it boggles the mind why something so broken / incomplete was
>> merged into Trunk in the first place?
>
> Yes, we suck from time to time (in this case: testsuite runs were performed, but the sync and merge were done by a person who only had access to a Windows system). Can we move on now?

Also the developers decided that it's better to continue the work on that feature in trunk, as errors are then spotted more easily and thus can also be fixed faster (you remember how orphaned cpstrnew basically was, Graeme?).

Regards,
Sven
Re: RE : [fpc-devel] Unicode support (yet again)
On Sun, Sep 18, 2011 at 6:50 AM, Marco van de Voort wrote:
> In our previous episode, Flávio Etrusco said:
>>
>> That's somewhat what I was thinking. Actually something like
>>
>> UnicodeString = object
>> (...)
>
> Such ability is not unique for an object. One can also do something like
> that with a native type.

Of course. That wasn't meant as a real implementation; I just decided to write some code instead of explaining in words. Basically my point, addressed to the people discussing endlessly without any data or observations, was that FPC already provides many of the tools needed to build a non-native implementation and gather real, practical data.

> It was discussed and rejected.
> The trouble is that it is not that easy, consider the first thing a
> long time pascal user will do is fix his existing code which has many
> constructs that loop over a string:
>
> setlength(s2,length(s1));
> for i:=1 to length(s1) do
>   s2[i]:=s1[i];
>
> Now, to return codepoint[i], you need to parse all codepoints before [i].
>
> So instead of O(n) this loop suddenly becomes O(n^2)

I hope then that either I'm wrong or that you change your mind ;-)
IMHO what must be changed is the way to deal with strings.
I must assume from this preoccupation that you're talking about a directive to make the String keyword instantiate a UnicodeString? Also, IMVHO, in that compiler mode the code just needs to work, not be fast, and the user code can be updated/fixed.
One obvious way to mitigate this would be to store the last CodePoint->Char in the string record, so that at least the most common case is covered.

Best regards,
Flávio

PS. Sorry for the double post, Marco.
Re: [fpc-devel] implementation AVX for Intel/AMD Prozessors
On Sun, 18 Sep 2011, Florian Klämpfl wrote:
> I'am not sure if the ymm registers should be an own register class.
> After all, they are a superset of xmm

Exactly. Since ymm0 is allocated too whenever xmm0 is allocated, the register allocator should treat them as a single register. xmm0 and ymm0 are subregisters of the same superregister.

Daniël
Re: [fpc-devel] Unicode support (yet again)
Luiz Americo Pereira Camara wrote:
> On 17/9/2011 11:46, Hans-Peter Diettrich wrote:
>> Luiz Americo Pereira Camara wrote:
>>> The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to changing to the RawByteString CodePage (65535) as I thought previously
>>
>> Delphi defines RawByteString=AnsiString, so there is no room for UTF-16 in such a string.
>
> No. I was wrong. See Florian's email. RawByteString will keep the codepage (1200 = UTF16) and the data of the assigned string be UTF8, be UTF8.
>
>>> So the implementation would be:
>>>
>>>   function FileGetAttr(const FileName: RawByteString): Longint;
>>>   begin
>>>     SetCodePage(FileName, 1200, True);
>>
>> Won't work, because of "const",
>
> Yes
>
>> and because UTF-16 is not a Byte (AnsiChar) string :-(
>
> No. See above. Look on the net for the Delphi and Unicode doc by Marco Cantu

Can you give me a link? I checked the XE documentation and RTL, and could not find that RawByteString can hold UTF-16, and my test confirms that:

  var
    a: AnsiString;
    u: UnicodeString;

  procedure test(r: RawByteString; cp: word);
  begin
    WriteLn('in: ', StringElementSize(r), ' cp: ', StringCodePage(r), ' len=', length(r));
    WriteLn('"', r, '"'); //writes garbage for non-OEM chars, of course
    SetCodePage(r, cp, true);
    WriteLn('out: ', StringElementSize(r), ' cp: ', StringCodePage(r), ' len=', length(r));
    a := r; //use the result, so that nothing can be optimized away
    WriteLn('"', r, '"');
  end;

This reveals the following behaviour:
1) UnicodeString is converted to AnsiString before being passed to test.
2) Setting the codepage to 1200 doesn't change anything.
3) Conversion to UTF-8 seems to work (length changed).
4) Conversion from UTF-8 to Ansi results in an empty string.

I'll ask in an Embarcadero group, in detail for [4].

DoDi
[fpc-devel] more question on -g commandline switches / -gv valgind
Trying to improve the Lazarus user interface for selecting what debug info to generate, I am trying to understand the relations between the settings (for example, to indicate to the user that -gl causes the same debug info to be generated as -g does).

Valgrind requires debug info too, so -g-v is nonsense. It also understands Stabs and DWARF, according to its docs.

The question is what FPC does if -gv is specified.

Does -gv:
- force either stabs or dwarf
- work with both, and add extended info (never mind if stabs or dwarf)
- observe the stabs or dwarf setting (-gs / -gw) but only add info for one of the 2 formats?

In other words, does it make sense in Lazarus to allow the user to choose Stabs or DWARF explicitly, and then optionally add -gv on top of that? Will it work as expected in any combination?

Or should Lazarus only allow Stabs, or only DWARF, if the user gives -gv?

Or should Lazarus prevent the user from choosing Stabs/DWARF at all if -gv is selected, so that it simply passes -g -gv (or just -gv) to FPC, but NOT -gs / -gw?
Re: [fpc-devel] Unicode support (yet again)
On 18/9/2011 10:07, Hans-Peter Diettrich wrote:
> Luiz Americo Pereira Camara wrote:
>> On 17/9/2011 11:46, Hans-Peter Diettrich wrote:
>>> Luiz Americo Pereira Camara wrote:
>>>> The codepage of a RawByteString at runtime will keep the previous CodePage (65001 for UTF8, 1200 for UTF16) as opposed to changing to the RawByteString CodePage (65535) as I thought previously
>>>
>>> Delphi defines RawByteString=AnsiString, so there is no room for UTF-16 in such a string.
>>
>> No. I was wrong. See Florian's email. RawByteString will keep the codepage (1200 = UTF16) and the data of the assigned string be UTF8, be UTF8.
>>
>>>> So the implementation would be:
>>>>
>>>>   function FileGetAttr(const FileName: RawByteString): Longint;
>>>>   begin
>>>>     SetCodePage(FileName, 1200, True);
>>>
>>> Won't work, because of "const",
>>
>> Yes
>>
>>> and because UTF-16 is not a Byte (AnsiChar) string :-(
>>
>> No. See above. Look on the net for the Delphi and Unicode doc by Marco Cantu
>
> Can you give me a link? I checked the XE documentation and RTL, and could not find that RawByteString can hold UTF-16, and my test confirms that:

http://edn.embarcadero.com/article/38980

You may also read:
http://www.micro-isv.asia/2008/08/using-rawbytestring-effectively/

>   var
>     a: AnsiString;
>     u: UnicodeString;
>
>   procedure test(r: RawByteString; cp: word);
>   begin
>     WriteLn('in: ', StringElementSize(r), ' cp: ', StringCodePage(r), ' len=', length(r));
>     WriteLn('"', r, '"'); //writes garbage for non-OEM chars, of course
>     SetCodePage(r, cp, true);
>     WriteLn('out: ', StringElementSize(r), ' cp: ', StringCodePage(r), ' len=', length(r));
>     a := r; //use the result, so that nothing can be optimized away
>     WriteLn('"', r, '"');
>   end;
>
> This reveals the following behaviour:
> 1) UnicodeString is converted to AnsiString before being passed to test.
> 2) Setting the codepage to 1200 doesn't change anything.
> 3) Conversion to UTF-8 seems to work (length changed).
> 4) Conversion from UTF-8 to Ansi results in an empty string.
>
> I'll ask in an Embarcadero group, in detail for [4].

Are you using Delphi XE or FPC?

I don't have Delphi XE. What I know is from those docs and these discussions.

Luiz
[fpc-devel] Re: Request to merge commit 18230 (STABS fix) to FPC 2.6
Sorry, I want to repeat the question because I didn't get any answer, and I would really like to know whether the whole 2.6.x line will have this bug or only 2.6.0, so that I can either skip 2.6.0 or discuss switching to DWARF.

Can the commit nr 18230 be merged into FPC 2.6?

Thanks in advance, a simple yes or no will suffice.

--
cobines
Re: [fpc-devel] Re: Request to merge commit 18230 (STABS fix) to FPC 2.6
In our previous episode, cobines said:
> Can the commit nr 18230 be merged into FPC 2.6 ?
>
> Thanks in advance, simple yes or no will suffice.

I do most of the merging, but I don't merge compiler revs unless I get a green light from the compiler devs.
Re: RE : [fpc-devel] Unicode support (yet again)
On 18 Sep 2011, at 13:57, Flávio Etrusco wrote:
> One obvious way to mitigate this would be to store the last
> CodePoint->Char in the string record, so that at least the most common
> case is covered.

... and so that the common case is broken in multithreaded environments.

Directly indexing a string will most likely always work using fixed-length steps (8, 16, 32 bit). If you want to iterate based on anything else (such as code points), use some kind of iterator model instead.

Jonas
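One way such an iterator model could look is sketched below. It assumes well-formed UTF-8 stored in an AnsiString; TUtf8Iterator and its members are hypothetical names chosen for this illustration, not an existing RTL type. The position is kept in the caller's own variable rather than inside the shared string data, so two threads walking the same string do not disturb each other, and a full pass over the string stays linear.

```pascal
program CodePointIterDemo;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}

type
  { Hypothetical iterator over the code points of a UTF-8 string.
    All state lives in this record. Assumes well-formed UTF-8. }
  TUtf8Iterator = record
  private
    FData: AnsiString;
    FPos: SizeInt;        // 1-based byte offset of the next code point
    FCurrent: LongInt;    // code point decoded by the last MoveNext
  public
    procedure Init(const S: AnsiString);
    function MoveNext: Boolean;
    property Current: LongInt read FCurrent;
  end;

procedure TUtf8Iterator.Init(const S: AnsiString);
begin
  FData := S;
  FPos := 1;
  FCurrent := 0;
end;

function TUtf8Iterator.MoveNext: Boolean;
var
  B: Byte;
  Extra: LongInt;
begin
  Result := FPos <= Length(FData);
  if not Result then
    Exit;
  B := Ord(FData[FPos]);
  // The lead byte tells how many continuation bytes follow.
  if B < $80 then
  begin
    FCurrent := B;
    Extra := 0;
  end
  else if (B and $E0) = $C0 then
  begin
    FCurrent := B and $1F;
    Extra := 1;
  end
  else if (B and $F0) = $E0 then
  begin
    FCurrent := B and $0F;
    Extra := 2;
  end
  else
  begin
    FCurrent := B and $07;
    Extra := 3;
  end;
  Inc(FPos);
  // Each continuation byte contributes its low six bits.
  while (Extra > 0) and (FPos <= Length(FData)) do
  begin
    FCurrent := (FCurrent shl 6) or (Ord(FData[FPos]) and $3F);
    Inc(FPos);
    Dec(Extra);
  end;
end;

var
  It: TUtf8Iterator;
begin
  { 'a', U+00E9 as bytes C3 A9, 'z' }
  It.Init('a' + Chr($C3) + Chr($A9) + 'z');
  while It.MoveNext do
    WriteLn('U+', HexStr(It.Current, 4));
end.
```

Each MoveNext call does a constant amount of work per code point, in contrast to recomputing the offset from the start of the string on every indexed access.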
Re: [fpc-devel] more question on -g commandline switches / -gv valgind
On 18 Sep 2011, at 14:34, Martin wrote:
> Does -gv :
> - force either stabs or dwarf

-gv by itself enables the default debug format, just like -g, -gh, -gt, -gc, ...

> - work with both, and add extended info (never mind if stabs or dwarf)

It works with both.

> - observe stabs or dwarf setting (-gs / -gw) but only add info for one of the
> 2 formats?

It mainly removes debug information with which Valgrind cannot deal, in addition to including the cmem unit. And yes, -gw -gv will force DWARF in combination with Valgrind provisions, and -gs -gv will force Stabs in combination with Valgrind provisions.

Jonas
Re: RE : [fpc-devel] Unicode support (yet again)
On Sep 18, 2011 5:50 AM, "Marco van de Voort" wrote:
> The trouble is that it is not that easy, consider the first thing a
> long time pascal user will do is fix his existing code which has many
> constructs that loop over a string:
>
> setlength(s2,length(s1));
> for i:=1 to length(s1) do
>   s2[i]:=s1[i];
>
> Now, to return codepoint[i], you need to parse all codepoints before [i].
>
> So instead of O(n) this loop suddenly becomes O(n^2)

Sure it does. So what? The point is, it will do what the user expects. And for most users, the fact that it does it slowly won't even matter. For those for whom it does matter, it is a chance to learn the right way.

Like I said in my first post, this is an extremely complex subject. I think trying to optimize user code before they even write it adds even more complexity, which slows implementation down. Get something that works and gives the expected results first, worry about speed later. By the time you finish, the CPU speed will have caught up to you.
Re: RE : [fpc-devel] Unicode support (yet again)
In our previous episode, DaWorm said:
>> So instead of O(n) this loop suddenly becomes O(n^2)
>
> Sure it does. So what?

So much!

> The point is, it will do what the user expects.

No it doesn't. The user has no clue, and will just stumble on the next detail (like codepoints not being characters).

> Like I said in my first post, this is an extremely complex subject. I think
> trying to optimize user code before they even write it adds even more
> complexity, which slows implementation down.

As often repeated: IMHO users can make such decisions for their app logic. The libraries (and most of Lazarus) can't make such speed and Unicode subset assumptions, and they are heavy "string" users too.

And of course, finally, there is the matter of Delphi compatibility.
Re: RE : [fpc-devel] Unicode support (yet again)
On 18.09.2011 17:48, DaWorm wrote:
> On Sep 18, 2011 5:50 AM, "Marco van de Voort" wrote:
>> The trouble is that it is not that easy, consider the first thing a
>> long time pascal user will do is fix his existing code which has many
>> constructs that loop over a string:
>>
>> setlength(s2,length(s1));
>> for i:=1 to length(s1) do
>>   s2[i]:=s1[i];
>>
>> Now, to return codepoint[i], you need to parse all codepoints before [i].
>>
>> So instead of O(n) this loop suddenly becomes O(n^2)
>
> Sure it does. So what? The point is, it will do what the user expects. And for most users, the fact that it does it slowly won't even matter. For those whom it does matter, it is a chance for them to learn the right way.
>
> Like I said in my first post, this is an extremely complex subject. I think trying to optimize user code before they even write it adds even more complexity, which slows implementation down. Get something that works and gives the expected results first, worry about speed later. By the time you finish, the CPU speed will have caught up to you.

Let me quote a saying by Pascal's father Wirth: "Software is getting slower more rapidly than hardware becomes faster." (see also here: http://en.wikipedia.org/wiki/Wirth%27s_law )

I personally see no reason for Pascal to become (much) slower only because we want to support code page aware strings (and O(n^2) IS much slower than O(n)).

Regards,
Sven
Re: RE : [fpc-devel] Unicode support (yet again)
On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth wrote:
> On 18.09.2011 17:48, DaWorm wrote:

But isn't it O(n^2) only when actually using unicode strings? Wouldn't you also be able to do something like String.Encoding := Ansi and then all String[i] accesses would be O(n) + x (where x is the overhead of checking at run time that it is safe to just use a memory offset, presumably fairly short)? Of course it would be up to the user to choose to reencode some string he got from the RTL or FCL that way and understand the consequences.

What assumptions is the typical String[i] user going to make about what is returned? There will be the types that are checking whether the fifth character is a 'C' or something like that, and for those there probably isn't too much that is going to go wrong; they might have to switch to "C" instead, or the compiler can make the 'C' literal a "unicode char which is really a string" conversion at compile time. There may be the ones that want to turn a 'C' into a 'c' by flipping the 6th bit, and that will indeed break, and in a Unicode world perhaps that should break, forcing the use of LowerCase as needed. And there are those (such as myself) who often use strings as buffers for things like serial comms. That code will totally break if I were to try to use a unicode string buffer, but a simple addition of String.Encoding := ANSI or RawByteString or ShortString in the first line would fix that, or I could bite the bullet and recode that quick and dirty code the right way.

My point is that trying to keep the bad habits of a single byte string world in a unicode world is counterproductive. They aren't the same, and all attempts to make them the same just cause more problems than they solve.

As for the RTL and FCL, presumably they wouldn't be doing any of this String[i] stuff in the first place, would they? So they aren't going to suffer that speed penalty. Just because one type of code is slow doesn't mean everything is slow.

Jeff.
Re: RE : [fpc-devel] Unicode support (yet again)
In our previous episode, DaWorm said:
> But isn't it O(n^2) only when actually using unicode strings?
> Wouldn't you also be able to do something like String.Encoding := Ansi
> and then all String[i] accesses would then be O(n) + x (where x is the
> overhead of run time checking that it is safe to just use a memory
> offset, presumably fairly short)? Of course it would be up to the user
> to choose to reencode some string he got from the RTL or FCL that way
> and understand the consequences.

It is possible, but that state can't be in the string/object, because strings are shared for read-only access (not doing so incurs a lot of copying overhead).

So that means that you need to allocate that state locally, either explicitly by manually allocating an iterator object (as Jonas already explained) or implicitly on the stack. The latter requires a native string type though, and is therefore hard with objects.

Implicit methods also have the disadvantage that the compiler must recognize the access pattern. So usually that means only the simplest of cases (or e.g. only when for..in is used).

> What assumptions are the typical String[i] user going to make about
> what is returned?

IMHO development should not be driven by the users' assumptions. If so, we would now have UIs with one red button with the text "do what I think", since that seems to be what most users want and expect :-)

> There will be the types that are seeing if the fifth character is a 'C' or
> something like that, and for those there probably isn't too much that is
> going to go wrong, they might have to switch to "C" instead, or the
> compiler can make the 'C' literal a "unicode char which is really a
> string" conversion at compile time.

This is a very rare case. With the increasing internationalization of applications, operations on such literals are even rarer.
Re: RE : [fpc-devel] Unicode support (yet again)
2011/9/18 Marco van de Voort:
> The trouble is that it is not that easy, consider the first thing a
> long time pascal user will do is fix his existing code which has many
> constructs that loop over a string:
>
> setlength(s2,length(s1));
> for i:=1 to length(s1) do
>   s2[i]:=s1[i];
>
> Now, to return codepoint[i], you need to parse all codepoints before [i].

Correct me if I'm wrong, but length(s1) wouldn't return the number of code points anyway?

--
cobines
Re: RE : [fpc-devel] Unicode support (yet again)
DaWorm wrote:
On Sun, Sep 18, 2011 at 12:01 PM, Sven Barth wrote:
On 18.09.2011 17:48, DaWorm wrote:
> But isn't it O(n^2) only when actually using unicode strings?

All MBCS encodings, with no fixed character size, suffer from that problem.

> Wouldn't you also be able to do something like String.Encoding := Ansi
> and then all String[i] accesses would then be O(n) + x (where x is the
> overhead of run time checking that it is safe to just use a memory
> offset, presumably fairly short)? Of course it would be up to the user
> to choose to reencode some string he got from the RTL or FCL that way
> and understand the consequences.

Calling subroutines for indexed access, instead of direct array access, will add another factor (10..100?) to single character access, including register save/restore and disallowed optimizations.

> What assumptions is the typical String[i] user going to make about what
> is returned? There will be the types that are seeing if the fifth
> character is a 'C' or something like that, and for those there probably
> isn't too much that is going to go wrong, they might have to switch to
> "C" instead, or the compiler can make the 'C' literal a "unicode char
> which is really a string" conversion at compile time. There may be the
> ones that want to turn a 'C' into a 'c' by flipping the 6th bit, and
> that will indeed break, and in a Unicode world, perhaps that should
> break, forcing using LowerCase as needed.

The simple upper/lower conversion works only for ASCII, not for Ansi chars.

> And there are those (such as myself) who often use strings as buffers
> for things like serial comms. That code will totally break if I were to
> try to use a unicode string buffer, but a simple addition of
> String.Encoding := ANSI or RawByteString or ShortString in the first
> line would fix that, or I could bite the bullet and recode that quick
> and dirty code the right way.

Delphi introduced TBytes for non-character byte data.

> My point is that trying to keep the bad habits of a single byte string
> world in a unicode world is counterproductive. They aren't the same,
> and all attempts to make them the same just cause more problems than
> they solve.

That's why I still suggest using UTF-16 in user code. When the user skips all unknown chars, nothing can go wrong.

> As for the RTL and FCL, presumably they wouldn't be doing any of this
> String[i] stuff in the first place, would they? So they aren't going to
> suffer that speed penalty. Just because one type of code is slow,
> doesn't mean everything is slow.

It's absolutely safe, even with UTF-8 strings, to e.g. search for all '\' separators, and to replace these in place with '/'. It's also safe to search for a set of (ASCII) separator chars, and to split strings at these positions (e.g. CSV). Bytewise case-insensitive comparison also works for all encodings, at least for equality. Other comparisons are much slower, due to the required lookup of the sort order values (maybe alphabetic, dictionary etc.), and again with every encoding. Even with ASCII there exists a choice of sorting 'a' like 'A', after 'A' or after 'Z'.

DoDi
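The separator case mentioned in this message can be sketched as follows. This is only an illustration under the assumption that the string holds UTF-8; FixSeparators is a hypothetical name. It relies on the fact that every byte of a multi-byte UTF-8 sequence is $80 or higher, so a byte equal to '\' can never be part of another character.

```pascal
program slashfix;
{$mode objfpc}{$H+}

{ Replace every '\' byte with '/' in place, without decoding anything.
  Safe for UTF-8 because lead and continuation bytes are all >= $80. }
procedure FixSeparators(var S: AnsiString);
var
  i: SizeInt;
begin
  for i := 1 to Length(S) do
    if S[i] = '\' then
      S[i] := '/';
end;

var
  path: AnsiString;
begin
  { A path with U+00E9 and U+00FC written as raw UTF-8 bytes }
  path := 'C:\caf'#$C3#$A9'\men'#$C3#$BC'.txt';
  FixSeparators(path);
  WriteLn(Length(path));     { byte length is unchanged: 18 }
  WriteLn(Pos('\', path));   { no backslash left: 0 }
end.
```

The loop is plain byte indexing, so it keeps the O(n) cost of the single-byte version while leaving the non-ASCII characters untouched.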
Re: [fpc-devel] Unicode support (yet again)
Luiz Americo Pereira Camara wrote:
>> Can you give me a link? I checked the XE documentation and RTL, and
>> could not find that RawByteString can hold UTF-16, and my test
>> confirms that:
>
> http://edn.embarcadero.com/article/38980
>
> You may read also:
> http://www.micro-isv.asia/2008/08/using-rawbytestring-effectively/

Thanks, but that's nothing new to me in general, and the RawByteString handling doesn't work as documented.

> Are you using Delphi XE or fpc?

I tested with XE, because I wanted to learn more about the possible use of RawByteString in the RTL. The result was not very enlightening, except for the fact that many years after the introduction of string encodings and RawByteString the Delphi RTL still contains severe bugs :-(

> I don't have Delphi XE. What I know is from those docs and these
> discussions.

That's why I wanted to test with Delphi first, before I start finding bugs or incompatibilities in the FPC implementation.

DoDi
Re: [fpc-devel] Re: Request to merge commit 18230 (STABS fix) to FPC 2.6
2011/9/18 Marco van de Voort:
> I do most of the merging, but I don't merge compiler revs unless I get a
> green light from the compiler devs.

OK. I see; maybe only Florian can answer this question, because he made the commit, and maybe he's simply too busy. That's fine. I hope he can read it before the release.

--
cobines
Re: [fpc-devel] implementation AVX for Intel/AMD Prozessors
On 18.09.2011 11:02, Florian Klämpfl wrote:
> On 18.09.2011 10:54, Torsten wrote:
>> Hi,
>> I'm starting with the implementation of AVX (Intel/AMD) in the FPC
>> assembler code. As a first step I use an external assembler (Linux
>> "as"). In this case, it is easy to use the XMM registers (max. 3
>> parameters); I only change the file "x86ins.dat", for example:
>>
>> [VMOVDQA]
>> (Ch All, Ch None, Ch None)
>> xmmreg, xmmrm \361\...
>
> Problem is probably the correct encoding sequence.

Yes, I know. I think that is important for the internal assembler, but not for the external one.

>> To use the YMM registers I have changed the file "x86reg.dat":
>>
>> NR_YMM0,$0700,ymm0,%ymm0,ymm0,ymm0,21,21,17,OT_YMMREG,0
>> NR_YMM1,$0701,ymm1,%ymm1,ymm1,ymm1,22,22,18,OT_YMMREG,1
>> NR_YMM2,$0702,ymm2,%ymm2,ymm2,ymm2,23,23,19,OT_YMMREG,2
>> NR_YMM3,$0703,ymm3,%ymm3,ymm3,ymm3,24,24,20,OT_YMMREG,3
>> NR_YMM4,$0704,ymm4,%ymm4,ymm4,ymm4,25,25,21,OT_YMMREG,4
>> NR_YMM5,$0705,ymm5,%ymm5,ymm5,ymm5,26,26,22,OT_YMMREG,5
>> NR_YMM6,$0706,ymm6,%ymm6,ymm6,ymm6,27,27,23,OT_YMMREG,6
>> NR_YMM7,$0707,ymm7,%ymm7,ymm7,ymm7,28,28,24,OT_YMMREG,7
>> NR_YMM8,$0708,ymm8,%ymm8,ymm8,ymm8,-1,-1,25,OT_YMMREG,0,64
>> NR_YMM9,$0709,ymm9,%ymm9,ymm9,ymm9,-1,-1,26,OT_YMMREG,1,64
>> NR_YMM10,$070a,ymm10,%ymm10,ymm10,ymm10,-1,-1,27,OT_YMMREG,2,64
>> NR_YMM11,$070b,ymm11,%ymm11,ymm11,ymm11,-1,-1,28,OT_YMMREG,3,64
>> NR_YMM12,$070c,ymm12,%ymm12,ymm12,ymm12,-1,-1,29,OT_YMMREG,4,64
>> NR_YMM13,$070d,ymm13,%ymm13,ymm13,ymm13,-1,-1,30,OT_YMMREG,5,64
>> NR_YMM14,$070e,ymm14,%ymm14,ymm14,ymm14,-1,-1,31,OT_YMMREG,6,64
>> NR_YMM15,$070f,ymm15,%ymm15,ymm15,ymm15,-1,-1,32,OT_YMMREG,7,64
>
> I'm not sure if the ymm registers should be an own register class.
> After all, they are a superset of xmm.

OK.

>> The next step is to change the files "cpubase, cgbase, aasmcpu, ...".
>> What do I need to consider?
>
> In which regard?

I do not know exactly which functions need to be changed. I'm hoping for tips.
Re: [fpc-devel] Unicode support (yet again)
On 18/9/2011 17:04, Hans-Peter Diettrich wrote:
> Luiz Americo Pereira Camara wrote:
>>> Can you give me a link? I checked the XE documentation and RTL, and
>>> could not find that RawByteString can hold UTF-16, and my test
>>> confirms that:
>>
>> http://edn.embarcadero.com/article/38980
>>
>> You may read also:
>> http://www.micro-isv.asia/2008/08/using-rawbytestring-effectively/
>
> Thanks, but that's nothing new to me in general, and the RawByteString
> handling doesn't work as documented.

  procedure ShowCodePage(const S: RawByteString);
  begin
    Form1.Caption := IntToStr(StringCodePage(S));
  end;

Strange. What values do you get when passing UTF8 and UTF16? According to that site you should get 65001 and 1200.

You may also try with UTF16 and UTF8; it should implicitly convert to UnicodeString:

  procedure ShowWithAnyCodePage(const S: RawByteString);
  begin
    Form1.Caption := S;
  end;

Luiz
Re: [fpc-devel] Unicode support (yet again)
Luiz Americo Pereira Camara wrote:
>> Thanks, but that's nothing new to me in general, and the RawByteString
>> handling doesn't work as documented.
>
>   procedure ShowCodePage(const S: RawByteString);
>   begin
>     Form1.Caption := IntToStr(StringCodePage(S));
>   end;
>
> Strange. What values do you get when passing UTF8 and UTF16? According
> to that site you should get 65001 and 1200.

UTF-8 comes in as 65001, and UTF-16 as Ansi.

> You may also try with UTF16 and UTF8; it should implicitly convert to
> UnicodeString.

Right.

DoDi
Re: RE : [fpc-devel] Unicode support (yet again)
On Sun, Sep 18, 2011 at 11:45 AM, Jonas Maebe wrote:
>
> On 18 Sep 2011, at 13:57, Flávio Etrusco wrote:
>
>> One obvious way to mitigate this would be to store the last
>> CodePoint->Char in the string record, so that at least the most common
>> case is covered.
>
> ... and so that the common case is broken in multithreaded environments.
>
> Directly indexing a string will most likely always work using fixed-length
> steps (8, 16, 32 bit).
> If you want to iterate based on anything else (such as code points), use
> some kind of iterator model instead.
>
> Jonas

By "the most common case" I meant non-threaded ;-)
But no, I don't see any trivial and efficient solution to avoid the worst case (but among threadvars, per-string fixed lookup table, shared lookup caches, per-reference data (like Object), etc., there must be a good solution).

Basically I think the UnicodeString should move farther (than AnsiString) away from PChar, from the compiler/RTL POV.
I think that the user should (have to) use the iterator model to *efficiently* iterate over the string, but I see indexed access as a compatibility feature, and as such it should care more about correctness and ease of use than about performance.
I thought the endless bugs WRT char vs codepoint indexes, even in Java-developed software, would buy my argument...

-Flávio
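A minimal sketch of the iterator model discussed in this message, assuming a well-formed UTF-8 AnsiString. TUtf8Cursor and NextCodePoint are hypothetical names chosen for illustration and are not part of the RTL. The position lives in a cursor owned by the caller instead of being cached inside the string, so nothing is shared between threads.

```pascal
program cpiter;
{$mode objfpc}{$H+}

type
  { Hypothetical cursor over a UTF-8 encoded AnsiString }
  TUtf8Cursor = record
    Offset: SizeInt;   { 1-based byte offset of the next code point }
  end;

{ Decode the code point at the cursor and advance past it.
  Returns False once the end of the string is reached.
  Assumes well-formed UTF-8; no validation is done. }
function NextCodePoint(const S: AnsiString; var C: TUtf8Cursor;
  out CodePoint: LongInt): Boolean;
var
  b: Byte;
  n, k: Integer;
begin
  CodePoint := 0;
  Result := C.Offset <= Length(S);
  if not Result then
    Exit;
  b := Ord(S[C.Offset]);
  { n = number of continuation bytes that follow the lead byte }
  if b < $80 then begin n := 0; CodePoint := b; end
  else if (b and $E0) = $C0 then begin n := 1; CodePoint := b and $1F; end
  else if (b and $F0) = $E0 then begin n := 2; CodePoint := b and $0F; end
  else begin n := 3; CodePoint := b and $07; end;
  for k := 1 to n do
    CodePoint := (CodePoint shl 6) or (Ord(S[C.Offset + k]) and $3F);
  Inc(C.Offset, n + 1);
end;

var
  s: AnsiString;
  cur: TUtf8Cursor;
  value: LongInt;
begin
  s := 'a'#$C3#$A9#$E2#$82#$AC;   { "a", U+00E9, U+20AC as raw UTF-8 bytes }
  cur.Offset := 1;
  while NextCodePoint(s, cur, value) do
    WriteLn('U+', HexStr(value, 4));   { prints U+0061, U+00E9, U+20AC }
end.
```

Each call does a constant amount of work per code point, so a full pass over the string stays O(n), in contrast to repeated indexed access by code point number.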
Re: [fpc-devel] TField.Validate in FPC 2.6
Joost van der Sluis wrote:
> On Sat, 2011-09-17 at 10:56 +0200, Michael Van Canneyt wrote:
>> On Sat, 17 Sep 2011, Martin Schreiber wrote:
>>> Hi,
>>> TField.SetData() in fixes_2_6 calls TField.Validate(). 2.4.4 does
>>> not, trunk neither.
>>
>> Strange that 2.6 does this if trunk does not?
>>
>>> What are the plans for the upcoming release? Will FPC 2.6.0 call
>>> TField.Validate() in TField.SetData()?
>>
>> It should not.
>
> There were some changes regarding this, and I still didn't look at
> them. I will, and then decide how to solve this. (I think I'll revert
> it all.)

Please look at
http://svn.freepascal.org/cgi-bin/viewvc.cgi?view=rev&revision=18872
This commit (patch by Luiz Americo) is OK. It was a long-awaited feature, so please do not drop it.

Thanks
-Laco.
Re: [fpc-devel] implementation AVX for Intel/AMD Prozessors
On Sun, 18 Sep 2011, Torsten wrote:
> I do not know exactly which functions need to be changed. I'm hoping
> for tips.

You will have to do a bit of exploring here; AVX is a major upgrade to the x86 instruction set, and it will likely be more than just a few routines that need to be changed.

The first step is to make sure the instructions can be used in assembler routines. The assembler is largely table driven, so if you have added them to the tables, a lot should work already. Nevertheless, I expect that modifications are necessary in both the assembler generators (ag*.pas) and the assembler readers (ra*.pas) due to the additional operand that needs to be written/parsed.

Only when the point is reached that the instructions are handled well by the assembler readers/writers could you start adding code generator support.

Daniël