[fpc-devel] Another blocking bug

2018-10-25 Thread J. Gareth Moreton
 Hi everyone,

 Sorry to be a pain, but it looks like there's another bug in FPC that
blocks compilation on a particular platform:

 https://bugs.freepascal.org/view.php?id=34458

 It seems a recent change to the PasToJS packages causes it to fail
compilation on x86_64-win64.  SVN says the user "mattias" was the last
person to modify the culprit file.

 Gareth aka. Kit
 ___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Sven Barth via fpc-devel

On 25.10.2018 at 20:34, Jonas Maebe wrote:

On 25/10/18 20:13, Florian Klämpfl wrote:

On 25.10.2018 at 18:59, Jonas Maebe wrote:

On 20/10/18 16:07, Simon Kissel wrote:

- Complete the LLVM branch of FPC. It looks like Jonas stopped
    working on it two years ago, which is a pity.
I didn't stop working on it, but I didn't make real progress anymore 
either. The current state of the LLVM code generator is that everything
works on Darwin/x86-64, except for
a) exception handling in general: indeed needs DWARF-EH support in 
the RTL,
This is something I have wanted to work on for years. So maybe 
it's now a good opportunity to start on it.


I started a branch for it:
https://svn.freepascal.org/svn/fpc/branches/debug_eh


As a first step, I'll depend on libgcc unwinding, let's see how far 
we get.


Using libgcc's foreign exception support works somewhat, but is not 
very usable in practice due to the limitation of having only one 
exception in flight. I simply started translating all of libgcc's 
exception support to Pascal, since it's also licensed under LGPL + 
linking exception (I took the one from gcc 4.2.1 for the people who 
don't like (L)GPL3).
As you already started working on translating that part of libgcc, would 
you please provide what you have so far? :)


Regards,
Sven


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Jeppe Johansen

On 10/20/18 4:07 PM, Simon Kissel wrote:

The requirements for my bounty would be:

- Must bring executable speed for non-floating-point load
   on both multithreaded and non-multithreaded workloads to
   the speed of Kylix-compiled binaries

- Improvements should also help on ARM targets

- An LLVM-based solution must allow inline assembler for
   all x86 and ARM

- Must be completed by February 2019

So, any suggestions on how to move forward on this?

Cheers,

Simon


Hi,

Can you create some benchmarks showing typical workloads on which you 
experience a large performance difference?


Best Regards,
Jeppe



Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Jonas Maebe

On 25/10/18 20:13, Florian Klämpfl wrote:

On 25.10.2018 at 18:59, Jonas Maebe wrote:

On 20/10/18 16:07, Simon Kissel wrote:

- Complete the LLVM branch of FPC. It looks like Jonas stopped
    working on it two years ago, which is a pity.

I didn't stop working on it, but I didn't make real progress anymore either. 
The current state of the LLVM code
generator is that everything works on Darwin/x86-64, except for
a) exception handling in general: indeed needs DWARF-EH support in the RTL,

This is something I have wanted to work on for years. So maybe it's now a 
good opportunity to start on it.

I started a branch for it: https://svn.freepascal.org/svn/fpc/branches/debug_eh

As a first step, I'll depend on libgcc unwinding, let's see how far we get.


Using libgcc's foreign exception support works somewhat, but is not very 
usable in practice due to the limitation of having only one exception in 
flight. I simply started translating all of libgcc's exception support 
to Pascal, since it's also licensed under LGPL + linking exception (I 
took the one from gcc 4.2.1 for the people who don't like (L)GPL3).
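
For readers unfamiliar with the unwinder interface being discussed: the sketch below is not taken from the debug_eh branch or from any FPC code. It only illustrates, in C, the libgcc unwinder API declared in <unwind.h> that DWARF-based exception handling builds on. It walks the call stack with _Unwind_Backtrace and counts frames; the helper names (frame_walk, count_frame, count_stack_frames) are hypothetical, and raising or catching exceptions is deliberately left out.

```c
#include <assert.h>
#include <unwind.h>

/* Hypothetical bookkeeping passed to the unwinder callback. */
struct frame_walk {
    int depth;  /* frames seen so far */
    int limit;  /* stop after this many frames */
};

/* Called by the unwinder once per stack frame. */
static _Unwind_Reason_Code count_frame(struct _Unwind_Context *ctx, void *arg)
{
    struct frame_walk *walk = arg;
    (void)ctx;                      /* frame details are not inspected here */
    walk->depth++;
    if (walk->depth >= walk->limit)
        return _URC_END_OF_STACK;   /* any value other than _URC_NO_REASON stops the walk */
    return _URC_NO_REASON;          /* continue with the next outer frame */
}

/* Returns how many frames the unwinder can see from here, capped at max_frames. */
int count_stack_frames(int max_frames)
{
    struct frame_walk walk = { 0, max_frames };
    assert(max_frames > 0);
    _Unwind_Backtrace(count_frame, &walk);
    return walk.depth;
}
```

Calling count_stack_frames(64) from main should report at least one frame on targets that emit unwind tables (the default on x86-64 Linux). A Pascal translation of libgcc's exception support would need to provide the same kind of frame walk inside the RTL instead of importing it.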



Jonas



Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Karoly Balogh (Charlie/SGR)
Hi,

On Thu, 25 Oct 2018, Florian Klaempfl wrote:

> >> That is good news.  The contours of a TODO list are becoming visible :)
> >>
> >> But we may also need a solution for other platforms, which means the
> >> current system should remain in place for those platforms where such a
> >> system is not present ?
> >
> > FPC already has some code to support section threadvars via the GS segment
> > on i386 at least, but it doesn't seem to be enabled by default? (Couldn't
> > test it, but the tf_section_threadvars target flag, which enables this, is
> > actually behind a define in i_linux.pas, which I couldn't find enabled
> > anywhere?). Also tf_section_threadvars flag has some code to support it
> > all over the compiler, including the x86 cg. I have some really vague
> > memories I actually enabled it in some experimental local version I had,
> > and it worked on first sight at least, but I could be completely off here.
> >
> > I wonder why it was never enabled by default.
>
> The %gs based approach works only for object files linked statically to
> the executable. In general there are four TLS access models on Linux and
> at least three of them need to be supported, if one wants to support
> dyn. libraries in a useful manner. Of course, this comes with the
> requirement to offer means to control the used model. The tls.pdf by U.
> Drepper describes it very well.

Ah, right. It's been a while. Ironically, it would have been enough for
the actual use case at hand, when I fiddled with it.

Charlie


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Jonas Maebe

On 20/10/18 16:07, Simon Kissel wrote:

- Complete the LLVM branch of FPC. It looks like Jonas stopped
   working on it two years ago, which is a pity.


I didn't stop working on it, but I didn't make real progress anymore 
either. The current state of the LLVM code generator is that everything 
works on Darwin/x86-64, except for
a) exception handling in general: indeed needs DWARF-EH support in the 
RTL, and also support for the LLVM exception handling intrinsics in the 
code generator. I've worked on and off on this and have some local 
patches, but it's not complete
b) hardware exceptions (null pointer, floating point): the LLVM versions 
I worked with back then did not support any form of hardware 
exceptions. If a memory access faults, the result is undefined behaviour 
(even with full exception support in the LLVM IR). If a floating point 
instruction throws an exception, the result is undefined (although they 
have been working a bit on it since then). This is not something that 
can be changed/fixed in FPC, and is quite different from how FPC's 
current code generator works (I don't know how Embarcadero deals with 
it in their LLVM-based code generator).


Additionally, in the current FPC code generator global variables behave 
mostly as volatile variables. With LLVM, that won't be the case (unless 
we mark all of their accesses as volatile, but that would obviously 
inhibit LLVM optimizations). This may break some multithreaded code that 
currently works, and would probably require the introduction of a 
volatile() operator (similar to the unaligned() one). On the other 
hand, I already added support for tracking the volatile state of 
references in the past, so that should be easy to do.
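
To illustrate the idea of marking a single access as volatile rather than the whole variable, here is a minimal C sketch. It is an analogy only, not FPC code and not the proposed operator: VOLATILE_LOAD, VOLATILE_STORE, shared_flag, poll_flag and set_flag are hypothetical names, and __typeof__ is a GCC/Clang extension.

```c
#include <assert.h>

/* A plain global: an optimizing back end may keep it in a register
   or merge repeated reads of it. */
static int shared_flag = 0;

/* Hypothetical helpers that make one individual access volatile,
   loosely analogous to a volatile() operator applied at the use site. */
#define VOLATILE_LOAD(x)      (*(volatile __typeof__(x) *)&(x))
#define VOLATILE_STORE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Polls shared_flag until it becomes non-zero or max_polls is reached;
   returns the number of polls that saw zero. Each poll is a real load. */
int poll_flag(int max_polls)
{
    int polls = 0;
    assert(max_polls >= 0);
    while (polls < max_polls && VOLATILE_LOAD(shared_flag) == 0)
        polls++;
    return polls;
}

/* Writes the flag with a store the compiler may not drop or delay past
   other volatile accesses. */
void set_flag(int value)
{
    VOLATILE_STORE(shared_flag, value);
}
```

All other accesses to shared_flag stay fully optimizable, which is the trade-off described above. Note that in C a volatile access is not a synchronization primitive; real multithreaded code would still need atomics or barriers.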



Jonas


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Florian Klaempfl

On 25.10.2018 at 09:06, Sven Barth via fpc-devel wrote:
Simon Kissel wrote on Thu, 25 Oct 2018, 08:54:


- Complete the LLVM branch of FPC. It looks like Jonas stopped
   working on it two years ago, which is a pity.


I personally don't think that LLVM is the way to go. It's essentially a 
moving target and adds an unnecessary dependency to the compiler.


Me neither :)



- Rewrite the code generator, for example in a SSA-IR way


Didn't Florian work on that already? I wonder how far he is by now 🤔


Got distracted by other stuff but also because I do not believe that it 
matters much for a lot of real-world programs (small benchmarks are another 
story).



Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Florian Klaempfl

On 25.10.2018 at 11:18, Sven Barth via fpc-devel wrote:
Michael Van Canneyt wrote on Thu, 25 Oct 2018, 09:38:




On Sat, 20 Oct 2018, Simon Kissel wrote:

 > - Make Exception handling, TLS etc use the infrastructure that
 >  libpthread is providing

TLS is handled already by libpthread. I doubt you will gain much there.

However, Exception handling is a problem. There are 2 possible ways
ahead:
- DWARF exception handling as mentioned by Sven.
- Port SEH to be cross platform, this is the approach taken by Kylix.
Kylix has a small rtlunwind library that mimics the needed run-time
functionality
offered by Windows.

Conceivably, it can be duplicated. wine probably has such a library
which
can be used as an inspiration.

The needed compiler infrastructure for SEH  already exists, so this
is most likely
the fastest way to proceed.


I'm against emulating SEH. Better implement DWARF exceptions. 


Yes.



Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Florian Klaempfl

On 25.10.2018 at 17:38, Karoly Balogh (Charlie/SGR) wrote:

Hi,

On Thu, 25 Oct 2018, Michael Van Canneyt wrote:


- Make Exception handling, TLS etc use the infrastructure that
  libpthread is providing


TLS is handled already by libpthread. I doubt you will gain much there.



GCC has (depending on the platform) a faster implementation for "__thread"
variables. E.g. on x86 it uses the GS segment and the data is stored in ELF
sections. There were experiments in the past to support this in FPC as
well, so maybe we're on a good way there already.


That is good news.  The contours of a TODO list are becoming visible :)

But we may also need a solution for other platforms, which means the
current system should remain in place for those platforms where such a
system is not present ?


FPC already has some code to support section threadvars via the GS segment
on i386 at least, but it doesn't seem to be enabled by default? (Couldn't
test it, but the tf_section_threadvars target flag, which enables this, is
actually behind a define in i_linux.pas, which I couldn't find enabled
anywhere?). Also tf_section_threadvars flag has some code to support it
all over the compiler, including the x86 cg. I have some really vague
memories I actually enabled it in some experimental local version I had,
and it worked on first sight at least, but I could be completely off here.

I wonder why it was never enabled by default. 


The %gs based approach works only for object files linked statically to 
the executable. In general there are four TLS access models on Linux and 
at least three of them need to be supported, if one wants to support 
dyn. libraries in a useful manner. Of course, this comes with the 
requirement to offer means to control the used model. The tls.pdf by U. 
Drepper describes it very well.
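
As an illustration of what "means to control the used model" look like in GCC, the C sketch below requests each of the four ELF TLS access models per variable through the tls_model attribute (a GCC/Clang extension that overrides -ftls-model). The variable and function names are hypothetical, and this says nothing about how FPC would expose such a choice.

```c
#include <assert.h>

/* One thread-local counter per ELF TLS access model. */
static __thread int tls_gd __attribute__((tls_model("global-dynamic")));
static __thread int tls_ld __attribute__((tls_model("local-dynamic")));
static __thread int tls_ie __attribute__((tls_model("initial-exec")));
static __thread int tls_le __attribute__((tls_model("local-exec")));

/* Adds `step` to every counter of the calling thread and returns their sum.
   Each thread sees its own zero-initialized copies. */
int bump_all(int step)
{
    assert(step > 0);
    tls_gd += step;  /* most general model: usable from dlopen()ed libraries */
    tls_ld += step;  /* like global-dynamic, for symbols local to the module */
    tls_ie += step;  /* offset from the thread pointer read from the GOT */
    tls_le += step;  /* fixed offset from the thread pointer: executable only */
    return tls_gd + tls_ld + tls_ie + tls_le;
}
```

This assumes the code is linked into an executable; the local-exec variable would fail to link in a shared library, which is the limitation described above. The thread pointer is reached through %gs on i386 and through %fs on x86-64 Linux.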




Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Karoly Balogh (Charlie/SGR)
Hi,

On Thu, 25 Oct 2018, Michael Van Canneyt wrote:

> >>> - Make Exception handling, TLS etc use the infrastructure that
> >>>  libpthread is providing
> >>
> >> TLS is handled already by libpthread. I doubt you will gain much there.
> >>
> >
> > GCC has (depending on the platform) a faster implementation for "__thread"
> > variables. E.g. on x86 it uses the GS segment and the data is stored in ELF
> > sections. There were experiments in the past to support this in FPC as
> > well, so maybe we're on a good way there already.
>
> That is good news.  The contours of a TODO list are becoming visible :)
>
> But we may also need a solution for other platforms, which means the
> current system should remain in place for those platforms where such a
> system is not present ?

FPC already has some code to support section threadvars via the GS segment
on i386 at least, but it doesn't seem to be enabled by default? (Couldn't
test it, but the tf_section_threadvars target flag, which enables this, is
actually behind a define in i_linux.pas, which I couldn't find enabled
anywhere?). Also tf_section_threadvars flag has some code to support it
all over the compiler, including the x86 cg. I have some really vague
memories I actually enabled it in some experimental local version I had,
and it worked on first sight at least, but I could be completely off here.

I wonder why it was never enabled by default. Maybe to keep compatibility
with some older Linux version, which didn't support this yet?

IOW, it might be a one-line change. Can I take some of the bounty now? :P

Charlie


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Michael Van Canneyt



On Thu, 25 Oct 2018, Sven Barth via fpc-devel wrote:


Michael Van Canneyt wrote on Thu, 25 Oct 2018, 09:38:




On Sat, 20 Oct 2018, Simon Kissel wrote:


- Make Exception handling, TLS etc use the infrastructure that
 libpthread is providing


TLS is handled already by libpthread. I doubt you will gain much there.



GCC has (depending on the platform) a faster implementation for "__thread"
variables. E.g. on x86 it uses the GS segment and the data is stored in ELF
sections. There were experiments in the past to support this in FPC as
well, so maybe we're on a good way there already.


That is good news.  The contours of a TODO list are becoming visible :)

But we may also need a solution for other platforms, which means the
current system should remain in place for those platforms where such a
system is not present ?

Michael.


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Sven Barth via fpc-devel
Michael Van Canneyt wrote on Thu, 25 Oct 2018, 09:38:

>
>
> On Sat, 20 Oct 2018, Simon Kissel wrote:
>
> > - Make Exception handling, TLS etc use the infrastructure that
> >  libpthread is providing
>
> TLS is handled already by libpthread. I doubt you will gain much there.
>

GCC has (depending on the platform) a faster implementation for "__thread"
variables. E.g. on x86 it uses the GS segment and the data is stored in ELF
sections. There were experiments in the past to support this in FPC as
well, so maybe we're on a good way there already.

Regards,
Sven

>


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Sven Barth via fpc-devel
Michael Van Canneyt wrote on Thu, 25 Oct 2018, 14:55:

>
>
> On Thu, 25 Oct 2018, Sven Barth via fpc-devel wrote:
>
> >
> >> Personally I am also in favour of a more open technique instead of a
> >> technique which is proprietary to a platform, and in this sense I
> >> understand
> >> and endorse your point of view, but beggars can't be choosers.
> >>
> >> There is no problem to have both techniques available. As I wrote, the
> SEH
> >> is the fastest path.
> >>
> >
> > I have my doubts especially as the rtlunwind stuff of Kylix only works on
> > i386. The SEH mechanism between i386 and all other Windows platforms
> > differs significantly and I doubt that Simon only wants i386 to benefit.
>
> If 'SEH is the fastest path.' is not correct, then all the more reason to
> use DWARF...
>

A further obstacle for SEH on non-i386: GNU AS supports the pseudo
instructions needed for SEH only for PE/COFF, but not ELF. This would mean
that we'd need to add them manually to the assembly files, which would
definitely be more bothersome...

Regards,
Sven

>


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Michael Van Canneyt



On Thu, 25 Oct 2018, Sven Barth via fpc-devel wrote:




Personally I am also in favour of a more open technique instead of a
technique which is proprietary to a platform, and in this sense I
understand
and endorse your point of view, but beggars can't be choosers.

There is no problem to have both techniques available. As I wrote, the SEH
is the fastest path.



I have my doubts especially as the rtlunwind stuff of Kylix only works on
i386. The SEH mechanism between i386 and all other Windows platforms
differs significantly and I doubt that Simon only wants i386 to benefit.


If 'SEH is the fastest path.' is not correct, then all the more reason to use 
DWARF...

Michael.


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Joao Schuler
Hello Simon - wondering if you have code examples that provoke problems you
are experiencing? It will be easier to measure/test improvements with test
cases. Solutions might not come from a single person/team and therefore not
sure how to apply the bounty in the most effective/fair way.


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Sven Barth via fpc-devel
Michael Van Canneyt wrote on Thu, 25 Oct 2018, 11:51:

>
>
> On Thu, 25 Oct 2018, Sven Barth via fpc-devel wrote:
>
> > Michael Van Canneyt wrote on Thu, 25 Oct 2018, 09:38:
> >
> >>
> >>
> >> On Sat, 20 Oct 2018, Simon Kissel wrote:
> >>
> >>> - Make Exception handling, TLS etc use the infrastructure that
> >>>  libpthread is providing
> >>
> >> TLS is handled already by libpthread. I doubt you will gain much there.
> >>
> >> However, Exception handling is a problem. There are 2 possible ways
> ahead:
> >> - DWARF exception handling as mentioned by Sven.
> >> - Port SEH to be cross platform, this is the approach taken by Kylix.
> >> Kylix has a small rtlunwind library that mimics the needed run-time
> >> functionality
> >> offered by Windows.
> >>
> >> Conceivably, it can be duplicated. wine probably has such a library
> which
> >> can be used as an inspiration.
> >>
> >> The needed compiler infrastructure for SEH  already exists, so this is
> >> most likely
> >> the fastest way to proceed.
> >>
> >
> > I'm against emulating SEH. Better implement DWARF exceptions. The
> > infrastructure that was created for SEH inside the compiler should help
> > nevertheless.
>
> You can be against, and  you don't need to work on it,
> but if someone supplies a patch, I don't think we should refuse it.
>

I don't agree here.


> Personally I am also in favour of a more open technique instead of a
> technique which is proprietary to a platform, and in this sense I
> understand
> and endorse your point of view, but beggars can't be choosers.
>
> There is no problem to have both techniques available. As I wrote, the SEH
> is the fastest path.
>

I have my doubts especially as the rtlunwind stuff of Kylix only works on
i386. The SEH mechanism between i386 and all other Windows platforms
differs significantly and I doubt that Simon only wants i386 to benefit.

Regards,
Sven


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Michael Van Canneyt



On Thu, 25 Oct 2018, Martin Schreiber wrote:


On Thursday 25 October 2018 11:18:58 Sven Barth via fpc-devel wrote:


I'm against emulating SEH. Better implement DWARF exceptions. The
infrastructure that was created for SEH inside the compiler should help
nevertheless.

MSElang has some code for "Itanium ABI Zero-cost Exception Handling" supported 
by LLVM, for example the runtime part:

https://gitlab.com/mseide-msegui/mselang/blob/master/mselang/compiler/__mla__personality.pas
Works well so far.


Great, thank you for this info. The more choice, the better!

Michael.


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Michael Van Canneyt



On Thu, 25 Oct 2018, Sven Barth via fpc-devel wrote:


Michael Van Canneyt wrote on Thu, 25 Oct 2018, 09:38:




On Sat, 20 Oct 2018, Simon Kissel wrote:


- Make Exception handling, TLS etc use the infrastructure that
 libpthread is providing


TLS is handled already by libpthread. I doubt you will gain much there.

However, Exception handling is a problem. There are 2 possible ways ahead:
- DWARF exception handling as mentioned by Sven.
- Port SEH to be cross platform, this is the approach taken by Kylix.
Kylix has a small rtlunwind library that mimics the needed run-time
functionality
offered by Windows.

Conceivably, it can be duplicated. wine probably has such a library which
can be used as an inspiration.

The needed compiler infrastructure for SEH  already exists, so this is
most likely
the fastest way to proceed.



I'm against emulating SEH. Better implement DWARF exceptions. The
infrastructure that was created for SEH inside the compiler should help
nevertheless.


You can be against, and  you don't need to work on it, 
but if someone supplies a patch, I don't think we should refuse it.


Personally I am also in favour of a more open technique instead of a
technique which is proprietary to a platform, and in this sense I understand
and endorse your point of view, but beggars can't be choosers.

There is no problem to have both techniques available. As I wrote, the SEH
is the fastest path.

So hopefully we will be able to compare and can still choose the better/faster 
one.

Michael.


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Martin Schreiber
On Thursday 25 October 2018 11:18:58 Sven Barth via fpc-devel wrote:
>
> I'm against emulating SEH. Better implement DWARF exceptions. The
> infrastructure that was created for SEH inside the compiler should help
> nevertheless.
>
MSElang has some code for "Itanium ABI Zero-cost Exception Handling" supported 
by LLVM, for example the runtime part:
https://gitlab.com/mseide-msegui/mselang/blob/master/mselang/compiler/__mla__personality.pas
Works well so far.

Martin


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Sven Barth via fpc-devel
Michael Van Canneyt wrote on Thu, 25 Oct 2018, 09:38:

>
>
> On Sat, 20 Oct 2018, Simon Kissel wrote:
>
> > - Make Exception handling, TLS etc use the infrastructure that
> >  libpthread is providing
>
> TLS is handled already by libpthread. I doubt you will gain much there.
>
> However, Exception handling is a problem. There are 2 possible ways ahead:
> - DWARF exception handling as mentioned by Sven.
> - Port SEH to be cross platform, this is the approach taken by Kylix.
> Kylix has a small rtlunwind library that mimics the needed run-time
> functionality
> offered by Windows.
>
> Conceivably, it can be duplicated. wine probably has such a library which
> can be used as an inspiration.
>
> The needed compiler infrastructure for SEH  already exists, so this is
> most likely
> the fastest way to proceed.
>

I'm against emulating SEH. Better implement DWARF exceptions. The
infrastructure that was created for SEH inside the compiler should help
nevertheless.

Regards,
Sven

>


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Michael Van Canneyt



On Thu, 25 Oct 2018, J. Gareth Moreton wrote:

I would question how such a bounty would be awarded here, because
overall performance gains have been achieved by multiple submitters.
For example, I've submitted a number of improvements to the optimiser
to produce both smaller and faster machine code.


I think that specific improvements should be specified, and a bounty for
each of these improvements should be specified, instead of an overall
bounty.

Michael.


[fpc-devel] Fwd: Re: [FPC 0034456]: r40027 breaks build

2018-10-25 Thread J. Gareth Moreton
Accidentally sent to just Pierre.


----- Original Message -----
From: J. Gareth Moreton gar...@moreton-family.com
To: "Pierre Muller" pie...@freepascal.org
Sent: Thu 25/10/18 07:14
Subject: Fwd: Re: [FPC 0034456]: r40027 breaks build
 Hi Pierre,
 I realised afterwards that my additions to compiler/x86/x86ins.dat were
incorrect, as the far-call version of RET is, by default, 32-bit even on
64-bit platforms, whereas the near-call version is 64-bit, so adding
NOX86_64 was wrong.  I corrected the patch.

 I did wonder about whether adding 64-bit support was correct or not.
Thanks for checking that.  The bug report was set to "fixed" since that's
the default option when I select "resolved".  Is there a better option to
select when submitting a patch?

 Gareth

 On Thu 25/10/18 07:53 , Pierre Muller pie...@freepascal.org sent:
  Hi all,

 sorry about this ...

 This is completely my fault,
 I forgot that the list of instructions
 is not the same for i8086, i386 and x86_64 ...

 I just committed revision 40028, that fixes the compilation failure.

 I would need some advice on the notes added by J. Gareth Moreton.

 Hi Gareth,

 I have a few comments on your notes on that bug report:

 First, let me thank you for the fast reaction to the bug report,
 but I think that your patch proposal is not correct,
 because as_i386_wasm is an i386-specific assembler, and thus
 adding code to handle x86_64 versions of the same instruction is useless
 as long as the Watcom assembler does not support 64-bit instructions.

 Even if it did later, we would add a new as_x86_64_wasm assembler id
for this.

 Second, I have no idea about the correctness of your patch related
 to x86ins.dat

 Index: compiler/x86/x86ins.dat
 ===================================================================
 --- compiler/x86/x86ins.dat (revision 40027)
 +++ compiler/x86/x86ins.dat (working copy)
 @@ -1754,8 +1754,8 @@

 [RETFD,lret] 
 (Ch_All) 
 -void 3251xCB 386 
 -imm 3251xCA30 386,SW 
 +void 3251xCB 386,NOX86_64 
 +imm 3251xCA30 386,SW,NOX86_64 

 [RETND,ret] 
 (Ch_All) 

 I have no idea if this instruction might be valid for embedded
 target when trying to switch between 32 and 64 bit code...

 Can someone else from core team please comment on that part?

 Third, I do not understand why you changed the bug status to 'fixed'.
 Shouldn't we wait until a fix has been committed to the relevant branch to
change the bug report status to 'fixed'?

 I did re-open and changed it back to fixed, adding reference to commit
40028.

 Thanks! 

 Pierre 

 



Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread J. Gareth Moreton
I would question how such a bounty would be awarded here, because
overall performance gains have been achieved by multiple submitters.
For example, I've submitted a number of improvements to the optimiser
to produce both smaller and faster machine code.

And unfortunately not many of us have access to Kylix.

Saying all that though, any improvement to FPC is greatly welcomed.

Gareth aka. Kit


On Thu 25/10/18 08:38 , Michael Van Canneyt mich...@freepascal.org sent:
>
> On Sat, 20 Oct 2018, Simon Kissel wrote:
>
> > - Make Exception handling, TLS etc use the infrastructure that
> >  libpthread is providing
>
> TLS is handled already by libpthread. I doubt you will gain much there.
>
> However, Exception handling is a problem. There are 2 possible ways ahead:
> - DWARF exception handling as mentioned by Sven.
> - Port SEH to be cross platform, this is the approach taken by Kylix.
> Kylix has a small rtlunwind library that mimics the needed run-time
> functionality
> offered by Windows.
>
> Conceivably, it can be duplicated. wine probably has such a library which
> can be used as an inspiration.
>
> The needed compiler infrastructure for SEH  already exists, so this is most
> likely
> the fastest way to proceed.
>
> Michael..



Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Michael Van Canneyt



On Sat, 20 Oct 2018, Simon Kissel wrote:


- Make Exception handling, TLS etc use the infrastructure that
 libpthread is providing


TLS is handled already by libpthread. I doubt you will gain much there.

However, Exception handling is a problem. There are 2 possible ways ahead:
- DWARF exception handling as mentioned by Sven.
- Port SEH to be cross platform, this is the approach taken by Kylix.
Kylix has a small rtlunwind library that mimics the needed run-time 
functionality
offered by Windows.

Conceivably, it can be duplicated. wine probably has such a library which
can be used as an inspiration.

The needed compiler infrastructure for SEH  already exists, so this is most 
likely
the fastest way to proceed.

Michael..


Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Sven Barth via fpc-devel
Simon Kissel wrote on Thu, 25 Oct 2018, 08:54:

> - Complete the LLVM branch of FPC. It looks like Jonas stopped
>   working on it two years ago, which is a pity.
>

I personally don't think that LLVM is the way to go. It's essentially a
moving target and adds an unnecessary dependency to the compiler.

- Rewrite the code generator, for example in a SSA-IR way
>

Didn't Florian work on that already? I wonder how far he is by now 樂

- Make Exception handling, TLS etc use the infrastructure that
>   libpthread is providing
>

I'm against having such a basic functionality depend on an external library
as I quite enjoy that FPC can be used without any dependencies on Linux.
However I am in favor of introducing DWARF exception handling that should
have similar benefits as SEH on Win64 if I remember correctly.
And for threadvars we could try to implement a different mechanism as well.
I think there was some experiment for that some time ago 樂

A further problem is that not all of us have access to Kylix so that not
everyone can compare the performance.

Regards,
Sven


Re: [fpc-devel] r40027 break. Patch available.

2018-10-25 Thread Pierre Muller
Hi all,

On 25/10/2018 at 07:50, J. Gareth Moreton wrote:
> Hi everyone,
> 
> As Pascal Riekenberg reported, the recent update to the trunk causes FPC to 
> not compile on some platforms, specifically x86_64.  I have provided what is 
> hopefully a long-term fix over here: 
> https://bugs.freepascal.org/view.php?id=34456
> 
> Considering this is a critical bug, I request that the patch be tested by 
> others as soon as is practical to ensure it is up to standard and fixes the 
> problem.


  I was the cause of this compilation error,
and I apologize for this; the problem should
be fixed by commit #40028.

Hopefully this fixes the problem for all targets.


Pierre Muller


[fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Simon Kissel
Hi,

I assume everybody here still knows who I am, so I'll drop the
introduction part.

In our products, we use FPC for a couple of targets. However, for all
Linux x86 platforms, we still have to use Kylix (CrossKylix). This
is because, for our code, FPC on these platforms produces code that
is 25% slower than Kylix, and up to 50% slower when it comes to
multi-threaded stuff.

We know about a couple of bottlenecks (fpc_pushexceptaddr /
RelocateThreadVar etc.) which explain FPC's terrible multi-threading
performance, but in general, FPC's code generator really is quite
a mess, which we learned the hard way a couple of years ago when we
did optimization work on the ARM target.
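
To illustrate why frame-based exception handling costs something even when nothing is raised, here is a minimal C sketch of a setjmp-style scheme in which every "try" pushes a frame onto a per-thread list. All names (`except_frame`, `push_frame`, `raise_exception`, `try_div`) are hypothetical; this only mimics the general technique and is not the code behind fpc_pushexceptaddr.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* One frame per active "try" block, linked into a per-thread stack. */
typedef struct except_frame {
    jmp_buf env;
    struct except_frame *prev;
} except_frame;

/* Both variables are thread-local, so every try touches TLS. */
static _Thread_local except_frame *frame_top;
static _Thread_local int current_exception;

static void push_frame(except_frame *f) {
    f->prev = frame_top;
    frame_top = f;
}

static void pop_frame(void) {
    assert(frame_top != NULL);
    frame_top = frame_top->prev;
}

/* Transfers control to the innermost active handler. */
static _Noreturn void raise_exception(int code) {
    except_frame *f = frame_top;
    if (f == NULL) abort();      /* unhandled exception */
    current_exception = code;
    frame_top = f->prev;         /* the handler runs with the outer frame active */
    longjmp(f->env, 1);
}

static int checked_div(int a, int b) {
    if (b == 0) raise_exception(42);
    return a / b;
}

/* Returns a / b, or the negated exception code if one was raised. */
int try_div(int a, int b) {
    except_frame f;
    int result;
    push_frame(&f);              /* paid on every call, raised or not */
    if (setjmp(f.env) == 0) {
        result = checked_div(a, b);
        pop_frame();
    } else {
        result = -current_exception;
    }
    return result;
}
```

Here `try_div(10, 2)` takes the normal path and `try_div(1, 0)` takes the handler path, yet both pay for `push_frame`, `setjmp` and the thread-local accesses. Table-based schemes such as SEH on Win64 or DWARF EH avoid that setup cost on the non-exceptional path.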

Because we have to stick to Kylix, we cannot use any of the
Object Pascal language features introduced in the last 15 years,
which is frustrating. It also prevents us from fully moving over
to Unicode.

I'd therefore like to put out a 15.000 Euro bounty for whoever
brings FPC at least on par with Kylix when it comes to executable
speed in multi-threaded scenarios, but first would like to discuss
with you guys what route should be taken (the list is not
complete and not mutually exclusive, of course):

- Complete the LLVM branch of FPC. It looks like Jonas has stopped
  working on it two years ago, which is a pity.

- Rewrite the code generator, for example in a SSA-IR way

- Make Exception handling, TLS etc use the infrastructure that
  libpthread is providing

The requirements for my bounty would be:

- Must bring executable speed for non-floating-point load
  on both multithreaded and non-multithreaded workloads to
  the speed of Kylix-compiled binaries

- Improvements should also help on ARM targets

- An LLVM-based solution must allow inline assembler for
  all x86 and ARM

- Must be completed by February 2019

So, any suggestions on how to move forward on this?

Cheers,

Simon



[fpc-devel] r40027 break. Patch available.

2018-10-25 Thread J. Gareth Moreton
 Hi everyone,
 As Pascal Riekenberg reported, the recent update to the trunk causes FPC
to not compile on some platforms, specifically x86_64.  I have provided
what is hopefully a long-term fix over here:
https://bugs.freepascal.org/view.php?id=34456

 Considering this is a critical bug, I request that the patch be tested by
others as soon as is practical to ensure it is up to standard and fixes the
problem.

 Gareth aka. Kit