Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2019-01-06 Thread Simon Kissel
e... > Thank you! > Cheers, > Simon > ___ > fpc-devel maillist - fpc-devel@lists.freepascal.org > http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel Best regards, Simon Kissel -- Nerdherrschaft GmbH Mainzer Str. 40 55411 Bingen am Rhein Germany Phone

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2019-01-06 Thread Simon Kissel
Hi & happy new year, > https://bugs.freepascal.org/view.php?id=34646 > https://bugs.freepascal.org/view.php?id=34647 any chance that someone could have a look at those tickets? We are kinda blocked in advancing in the bounty if we can't use the improved compiler to benchmark our code... Thank yo

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-12-04 Thread Simon Kissel
Hi Florian, > Do you compile with -Aas? The internal assemblers do not support TLS yet, > this is WIP. Ah wow! -Aas does indeed help. Both the assembler errors and the internal error are gone, both in Linux i386 and ARM. And the created binaries even work. Nice! Thank you! Cheers, Simon

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-12-04 Thread Simon Kissel
Hi Gareth, > A regression like this is quite serious. I'd recommend opening a > bug report with a reproducible case so we can investigate and hopefully fix > it within the day. created a test project, and opened two tickets: https://bugs.freepascal.org/view.php?id=34646 https://bugs.freepascal

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-12-03 Thread Simon Kissel
Hi Florian, we are currently trying to run some real-life benchmarks with our products; however, with rev. 40346 compilation fails with the following two showstoppers: 1.) The assembler parser appears to be broken - the following perfectly valid opcodes get rejected: SBMath.pas(1932,9) Error: Asm: [cm

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-27 Thread Simon Kissel
Hi guys, that platform is not relevant for us, but to provide some motivational boost: CrossFPC 4.14 beta Win64: C:\Users\BeRo\Documents\Projects\Tests\threadingtest0\aa>vipribenchmemcache_nodeps VipriBenchThreaded - RunningTimeSeconds=5, TestCount=100, StartSeq=0, NumberOfChannels=6, BufferPack

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-23 Thread Simon Kissel
Hi Adriaan, In case you aren't just trolling and the subject really is of interest to you, I would recommend reading the discussion thread in full. That works much better than treating this like a write-only system. > You didn't answer any of my questions. The goal is to get the > code faster, is

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-23 Thread Simon Kissel
Hi Florian, > Actually, most of the improvements so far are not related to > threading. In particular r40339 helped a lot, it was a bug > fix: the compiler assumed that a certain subexpression was written > while it was not, and this prevented CSE. Even better, that means there is still gold to be

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-23 Thread Simon Kissel
Hi Adriaan, > I find the phrase "FPC's terrible multi-threading performance" > unjust. Well, see the complete thread to better understand what this is about, and what progress is being made. So far a 20% improvement has been made, which is kind of proof that there was something to improve ;

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-20 Thread Simon Kissel
11 Bingen am Rhein Germany Phone:+49-6721-9492994 Fax: +49-6721-9492996 simon.kis...@nerdherrschaft.com http://www.nerdherrschaft.com Registered office/Sitz der Gesellschaft: Bingen am Rhein, Germany CEO/Geschäftsführer: Simon Kissel Commercial register/Handelsregister: Amtsg

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-18 Thread Simon Kissel
Hi Florian, > Compile the benchmark with (where fpcnew is the newly built fpc): Bero has confirmed, works for us as well. This rocks! > The changes also help on ARM, and ARM can be built using the same > command line; however, at least on a Raspi3B+ the > improvement is less significant than on i

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-17 Thread Simon Kissel
Hi Jonas, Nice results! > Since I only have a preliminary llvm version (with Dwarf EH) running on > macOS, I can't provide a direct Kylix comparison. The versions below are > both x86-64. As mentioned before, a 32 bit FPC/LLVM is still quite a way > off. How far of a way is that? Sadly we'll hav

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-17 Thread Simon Kissel
Hi Florian, > With some compiler tuning and a few tricks (two changes to the code > and hand-simulated peephole optimizations, but I > think the compiler can also do these tricks): Nice - what changes did you make? Changing the code of course is cheating, but there might be something to learn for

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-16 Thread Simon Kissel
.207 total Benchmark results for ARM will follow. Cheers, Simon Thursday, November 15, 2018, 10:31:55 PM, you wrote: > Am 14.11.2018 um 14:46 schrieb Simon Kissel: >> >> We have not yet tested this on ARM (does it work on ARM?). >> > Aft

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-11-14 Thread Simon Kissel
Hi Florian, you are a hero. In a very artificial benchmark which just consists of threads and exception handlers, a 32 bit Linux executable now is *twice as fast*! In a real-life scenario we are "only" seeing an improvement of about 10%. But really, this is huge progress. I think everyone will be

[fpc-devel] Kylix Open Edition - was: The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi, I've packed together a minimal CrossKylix build that includes the old Kylix 3 Open Edition, for those who wish to have a look and/or test the bounty test project (to be provided later) without violating any Borland (RIP) licenses. Please note that this has only been tested using my CrossKy

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Florian, [DWARF-EH] > This is something I have wanted to work on for years already. So > maybe it's now a good opportunity to start with it. *hugs* Simon

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Sven, > Borland's Fastcall is more famously known as the Register calling > convention aka the default calling convention in Object Pascal. As > you admitted in your mail further down you have quite some assembly > code and as such you rely on the calling convention for parameter > passing. Her

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Jonas, >> - Complete the LLVM branch of FPC. It looks like Jonas has stopped >>working on it two years ago, which is a pity. > I didn't stop working on it, but I didn't make real progress anymore > either. So, would you be interested in making progress again? :) > a) exception handling

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Sven, > The thing is that we can't enable or disable a feature based on > whether a program links third party libraries or a unit is included > in a library or not, because we might need to work with precompiled > units. So either you'll need to enable this feature for a locally > built FPC and b

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Florian, > But there is another pretty simple optimization opportunity in this > area: make the FPC heap manager capable of using > os-based memory reallocation. Kernel-based memory reallocation of > large blocks has the big advantage that the OS can > move the memory contents only by re-mappin

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Sven, > And no one said that it is. But points like table based exception > handling and section based threadvars can be relatively easily > achieved and benefit more targets, while working on the optimizer > is usually per-platform work. I agree that this very likely will make a big boost.

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Ben, > There's one more problem I forgot to mention in my first post, and it is > probably a deal breaker for the original bounty: LLVM does not support > Borland's fastcall calling convention for i386. So you would need to add > support for Borland fastcall on i386 to LLVM if it has to sup

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Florian, > The %gs based approach works only for object files linked statically to > the executable. In general there are four TLS access models on linux and > at least three of them need to be supported, if one wants to support > dyn. libraries in a useful manner. Are you talking about bein

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Gareth, > And unfortunately not many of us have > access to Kylix. I can have a look if I can package something that works based on the old Open Edition of Kylix for those who don't have a Kylix ISO floating around. Simon

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-28 Thread Simon Kissel
Hi Michael, > I think that specific improvements should be specified, and a bounty for > each of these improvements should be specified, instead of an overall > bounty. I agree. Let's agree on a list of improvements, and spread the bounty accordingly. Simon

[fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-24 Thread Simon Kissel
Hi, I assume everybody here still knows who I am, so I'll drop the introduction part. In our products, we use FPC for a couple of targets. However, for all Linux x86 platforms, we still have to use Kylix (CrossKylix). This is because for our code, FPC on these platforms compiles code that is 2