So to better optimise the original program with regular optimisation
options, I guess we need a) full SSA support
Yes.
b) better (node) CSE support
Enabling node CSE on loadparentfpn nodes already helps slightly:
diff a/compiler/ncgld.pas b/compiler/ncgld.pas
index 028e51e..1ab6f11
On 24.09.2010 15:36, Jonas Maebe wrote:
On 24 Sep 2010, at 14:35, Jonas Maebe wrote:
On 24 Sep 2010, at 11:48, Adrian Veith wrote:
Register allocation is on a comparable level for both versions.
Delphi keeps the Bar pointer in a register, while FPC spills it to
the stack. Because Bar
With asm CSE enabled as in 2.5, I think it should also be doable to use
ebp as a general-purpose register if the stack frame is omitted; this
should squeeze out another few percent.
___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
On 26.09.2010 23:53, Florian Klämpfl wrote:
With asm CSE enabled as in 2.5,
Disabled, I mean.
On 26 Sep 2010, at 23:25, Florian Klämpfl wrote:
On 24.09.2010 15:36, Jonas Maebe wrote:
Correction: Delphi keeps the hidden parent frame pointer parameter in a
register (which is required every time Bar is accessed), while FPC puts
it on the stack. The end result is the same though: 1
On 23.09.2010 17:03, Jonas Maebe wrote:
On 23 Sep 2010, at 16:59, Adrian Veith wrote:
I analyzed your code - I think the problem is the array element address
calculation of the FPC compiler. You have a lot of code like
Bar[MinValley] etc. The Delphi compiler uses the lea assembler instruction for
On 24 Sep 2010, at 08:06, Adrian Veith wrote:
On 23.09.2010 17:03, Jonas Maebe wrote:
It may help a lot, but only because it will reduce register pressure,
not because the multiplications are gone.
It reduces the total number of multiplications by about 70% - I gave the
code to one of my guys
On 24 Sep 2010, at 11:48, Adrian Veith wrote:
Changing to pointers reduces the number of multiplications needed to
access the nth element of an array - if you compare the Delphi code to
the FPC code at the assembler level, this is the main difference between
the two generated codes.
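A minimal sketch of the kind of change being discussed, not the original program: TBar, PBar, Bars, AreaByIndex and AreaByPointer are made-up names for illustration. The indexed version computes the element address for every Bars[Idx] access (unless the compiler merges them), while the pointer version computes it once and reuses it. Whether this changes the generated code depends on the compiler version and optimization settings.

```pascal
program IndexVsPointer;
{$ifdef FPC}{$mode delphi}{$endif}

type
  // Hypothetical record; 12 bytes, so the element offset is Idx * 12.
  TBar = record
    Height: Integer;
    Width: Integer;
    Next: Integer;
  end;
  PBar = ^TBar;

const
  N = 1000;

var
  Bars: array[0..N - 1] of TBar;

// Every Bars[Idx] access needs base + Idx * SizeOf(TBar).
function AreaByIndex(Idx: Integer): Int64;
begin
  Result := Int64(Bars[Idx].Height) * Bars[Idx].Width;
end;

// The element address is computed once and kept in a pointer.
function AreaByPointer(Idx: Integer): Int64;
var
  P: PBar;
begin
  P := @Bars[Idx];
  Result := Int64(P^.Height) * P^.Width;
end;

var
  I: Integer;
begin
  for I := 0 to N - 1 do
  begin
    Bars[I].Height := I;
    Bars[I].Width := 2 * I;
    Bars[I].Next := I + 1;
  end;
  // Both variants must agree on every element.
  for I := 0 to N - 1 do
    if AreaByIndex(I) <> AreaByPointer(I) then
    begin
      WriteLn('mismatch at ', I);
      Halt(1);
    end;
  WriteLn('both variants agree');
end.
```

Comparing the assembler output of the two functions (for example with fpc -al) shows whether the address calculation is actually repeated.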
Did you actually try
Hi Adrian,
Adrian Veith adr...@veith-system.de wrote:
[...]
we optimized the code further, eliminated all the Next, Prev: Integer
etc. and changed them to pointers again. Here are the results:
[...]
first optimization - saving redundant array access to pointers:
[...]
next optimization -
On 24 Sep 2010, at 14:35, Jonas Maebe wrote:
On 24 Sep 2010, at 11:48, Adrian Veith wrote:
Register allocation is on a comparable level for both versions.
Delphi keeps the Bar pointer in a register, while FPC spills it to
the stack. Because Bar is used in most of the most-executed
On 24.09.2010 14:35, Jonas Maebe wrote:
On 24 Sep 2010, at 11:48, Adrian Veith wrote:
Changing to pointers reduces the number of multiplications for accessing
the nth element in an array - if you compare the Delphi code to the FPC
code at the assembler level, this is the main difference in both
stefan...@web.de wrote:
My experience is that linked lists with pointers are much slower than linked lists realized by arrays.
That's my experience too. I converted a few programs from linked lists to arrays
of pointers and the speed increase was always dramatic.
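For readers unfamiliar with the two layouts being compared, here is a small sketch under stated assumptions: TNode, TSlot, Slots, SumPointerList and SumArrayList are hypothetical names, not taken from any program in this thread. It only shows that the same list can be linked through heap pointers or through integer indices into one array; it makes no claim about which is faster, since that depends on the data, the allocator and the compiler.

```pascal
program ListLayouts;
{$ifdef FPC}{$mode delphi}{$endif}

const
  N = 8;
  NilIndex = -1;  // plays the role of nil for index links

type
  // Pointer-linked node, each one allocated separately on the heap.
  PNode = ^TNode;
  TNode = record
    Value: Integer;
    Next: PNode;
  end;

  // Index-linked node, all nodes stored contiguously in one array.
  TSlot = record
    Value: Integer;
    Next: Integer;
  end;

var
  Slots: array[0..N - 1] of TSlot;

function SumPointerList(Head: PNode): Integer;
begin
  Result := 0;
  while Head <> nil do
  begin
    Result := Result + Head^.Value;
    Head := Head^.Next;
  end;
end;

function SumArrayList(Head: Integer): Integer;
begin
  Result := 0;
  while Head <> NilIndex do
  begin
    Result := Result + Slots[Head].Value;
    Head := Slots[Head].Next;
  end;
end;

var
  I: Integer;
  Head, Node: PNode;
begin
  Head := nil;
  // Build the same list 1..N in both representations.
  for I := N - 1 downto 0 do
  begin
    Slots[I].Value := I + 1;
    if I = N - 1 then
      Slots[I].Next := NilIndex
    else
      Slots[I].Next := I + 1;
    New(Node);
    Node^.Value := I + 1;
    Node^.Next := Head;
    Head := Node;
  end;
  if SumArrayList(0) <> SumPointerList(Head) then
    Halt(1);
  WriteLn('both lists hold the same values');
  // Release the heap nodes.
  while Head <> nil do
  begin
    Node := Head^.Next;
    Dispose(Head);
    Head := Node;
  end;
end.
```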
Message: 6
Date: Wed, 22 Sep 2010 16:08:37 +0200 (CEST)
From: stefan...@web.de
Subject: Re: [fpc-pascal] code optimization
To: fpc-pascal@lists.freepascal.org
Message-ID:
1487431390.1512221.1285164517310.javamail.fm...@mwmweb065
Content-Type: text/plain; charset=UTF-8
Hi Adrian
karl-michael.schind...@web.de wrote:
My 2 cents:
Looking at the code, I would assume that you can gain by using linked lists
with pointers instead of arrays and working with the index. This would reduce
the number of offset calculations. However, it means quite a rewrite. So, do
you really
any suggestions?
Stefan
-Original Message-
From: Adrian Veith adr...@veith-system.de
Sent: 22.09.2010 08:08:45
To: FPC-Pascal users discussions fpc-pascal@lists.freepascal.org
Subject: Re: [fpc-pascal] code optimization
Hi Stefan,
is this a benchmark program or a complex
On 23 Sep 2010, at 16:59, Adrian Veith wrote:
I analyzed your code - I think the problem is the array element address
calculation of the FPC compiler. You have a lot of code like
Bar[MinValley] etc. The Delphi compiler uses the lea assembler instruction for
this, whereas FPC calculates the address
Eduardo nec...@retena.com wrote:
Can you try optimizing for size? In some cases, it reduces L2 / L3 cache
misses and runs faster than O3. It happens in other compilers and
languages too.
I just tried it: The code gets even slightly larger and much slower (almost a
factor of 2).
On 23/09/2010 18:09, stefan...@web.de wrote:
Eduardo nec...@retena.com wrote:
Can you try optimizing for size? In some cases, it reduces L2 / L3 cache
misses and runs faster than O3. It happens in other compilers and
languages too.
I just tried it: The code gets even slightly larger and much
Hi Stefan,
is this a benchmark program or a complex program you are talking about?
If it is a benchmark, then it would be interesting to see the code,
because from my experience I doubt that Delphi produces better code than
fpc (in general it is the other way round). If it is a complex program,
fpc-pascal-boun...@lists.freepascal.org wrote on 22/09/2010 08.08.45
is this a benchmark program or a complex program you are talking about?
If it is a benchmark, then it would be interesting to see the code,
because from my experience I doubt that Delphi produces better code than
fpc (in
compiler version is 2.4.0, I run under Windows XP.
any suggestions?
Stefan
-Original Message-
From: Adrian Veith adr...@veith-system.de
Sent: 22.09.2010 08:08:45
To: FPC-Pascal users discussions fpc-pascal@lists.freepascal.org
Subject: Re: [fpc-pascal] code optimization
Hi
On 22 Sep 2010, at 16:08, stefan...@web.de wrote:
Thus it looks like FPC Pascal is doing very badly at optimizing the
code.
I agree that I have also seen examples where FPC Pascal code is
about 10% faster than Delphi code.
So why does FPC Pascal fail on this code?
At first sight it looks
Hi all,
I am currently trying to get the fastest possible code from fpc on a modern
CPU (Intel Xeon) under Windows XP. I use the compiler options:
fpc -Mdelphi -O3 -OpPENTIUMM -Cfsse2 -Cr- -Co- -CO- -Ci- myprogram.dpr
Are there any better settings I should use? Compiling exactly the same