On 16.11.2018 at 23:36, Jonas Maebe wrote:
> On 16/11/18 22:44, Florian Klämpfl wrote:
>> With some compiler tuning and a few tricks (two changes to the code and
>> hand-simulated peephole optimizations, but I think the compiler can also
>> do these tricks):
>
> You can improve performance further by devirtualising all method calls using
> wpo. First compile it with -FWvipri.wpo -OWDEVIRTCALLS,OPTVMTS and next with
> -Fwvipri.wpo -OwDEVIRTCALLS,OPTVMTS (at least on my machine it gives a small
> boost, and also makes the results more stable).
>
> Since I only have a preliminary llvm version (with Dwarf EH) running on
> macOS, I can't provide a direct Kylix comparison. The versions below are both
> x86-64. As mentioned before, a 32 bit FPC/LLVM is still quite a way off.
>
> * FPC 3.0.4 -MDelphi -O2 -Fwvipri.wpo -OwDEVIRTCALLS,OPTVMTS:
>
> $ time ./vipribenchmemcache_nodeps
> VipriBenchThreaded - RunningTimeSeconds=5, TestCount=100, StartSeq=0,
> NumberOfChannels=6, BufferPackets=5000, NumberOfSynchroThreads=4
> .................................................................................................
> Time: 5016ms = 9669059 pkts/s = 14680 MB/s
>
> real 0m5.137s
> user 0m5.042s
> sys 0m0.017s
>
> * FPC 3.3.1 + llvm (clang from Xcode 10.1 with -O3 on FPC-generated llvm IR)
>   and -Fwvipri.wpo -OwDEVIRTCALLS,OPTVMTS (no LLVM link-time optimization):
>
> $ time ./vipribenchmemcache_nodeps_llvm
> VipriBenchThreaded - RunningTimeSeconds=5, TestCount=100, StartSeq=0,
> NumberOfChannels=6, BufferPackets=5000, NumberOfSynchroThreads=4
> .................................................................................................................
> Time: 5018ms = 11259466 pkts/s = 17094 MB/s
>
> real 0m5.161s
> user 0m5.060s
> sys 0m0.017s
>
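For reference, the gap between the two quoted runs can be put into numbers with a few lines of Python. This is only a rough sketch: the figures are copied from the two "Time:" lines above, the names `runs`, `speedup` and `bytes_per_packet` are made up for illustration, and reading "MB" as 10^6 bytes is an assumption, not something the benchmark output states.

```python
# Figures copied from the two "Time:" lines in the quoted benchmark output.
runs = {
    "FPC 3.0.4 native": {"pkts_per_s": 9_669_059, "mb_per_s": 14_680},
    "FPC 3.3.1 + llvm": {"pkts_per_s": 11_259_466, "mb_per_s": 17_094},
}


def speedup(base, other):
    """Packet throughput of `other` relative to `base`."""
    return other["pkts_per_s"] / base["pkts_per_s"]


def bytes_per_packet(run, bytes_per_mb=1_000_000):
    """Implied packet size; assumes 'MB' means 10**6 bytes (not confirmed)."""
    return run["mb_per_s"] * bytes_per_mb / run["pkts_per_s"]


native = runs["FPC 3.0.4 native"]
llvm = runs["FPC 3.3.1 + llvm"]

ratio = speedup(native, llvm)
print(f"llvm vs native: {ratio:.3f}x ({(ratio - 1) * 100:.1f}% more pkts/s)")
for name, run in runs.items():
    print(f"{name}: about {bytes_per_packet(run):.0f} bytes per packet")
```

Both runs last about 5 seconds by design (RunningTimeSeconds=5), so the pkts/s figure is the number to compare, not the wall-clock time.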
Can you test with FPC 3.1.1 native, -O4 and the following patch?

 compiler/nmem.pas | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/compiler/nmem.pas b/compiler/nmem.pas
index d5c1d85e8f..52add1fd81 100644
--- a/compiler/nmem.pas
+++ b/compiler/nmem.pas
@@ -1176,7 +1176,7 @@ implementation
       begin
         include(flags,nf_write);
         { see comment in tsubscriptnode.mark_write }
-        if not(is_implicit_pointer_object_type(left.resultdef)) then
+        if not(is_implicit_array_pointer(left.resultdef)) then
           left.mark_write;
       end;