[fpc-devel] IsMultiThread always true issue 30535

2017-03-19 Thread Dimitrios Chr. Ioannidis via fpc-devel

Hi,

  Is the commit in revision 35567 compatible with the 3.0.x fixes branch? If
so, could someone commit it there as well?


regards,

--

Dimitrios Chr. Ioannidis

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Staticaly link C/C++ library (.lib) into FreePascal on Windows

2017-03-19 Thread silvioprog
On Sun, Mar 19, 2017 at 5:43 AM, Sven Barth via fpc-devel <
fpc-devel@lists.freepascal.org> wrote:

> On 19.03.2017 at 04:53, "silvioprog" wrote:
> >
> > Unfortunately you can't use the static libraries (.a) of Intel because
> > they are generated for Linux, even though static libraries are
> > cross-platform.
>
> Nonsense. Static libraries are as platform-specific as any other binary
> code; after all, they need to call OS functions.
>

Well, by cross-platform I mean anything that is implemented on multiple
platforms, so since ar archives can be generated for several of them, the
term makes sense to me. :-)

> > I'm not sure about the .lib files. MS's COFF files adopt the .lib
> > extension, but the sizes below are a little bit strange:
> >
> > `libippi.a`:
> > . original - 251 MB;
> > . stripped - 192 MB.
> >
> > `libippi.lib`:
> > . original - 853 KB;
> > . stripped - no strip needed, it is already small.
>
> Seems like the second one is merely an import library for the DLL instead
> of a real static library.
>

Indeed.

> And of course that is COFF as well. MSVC only supports COFF.
>

There isn't only one kind of COFF; AFAIK MS has its own COFF flavour and MSVC
supports only that one. Sure, Intel must have used the MSVC flavour. However, my
*suggestion* that LacaK confirm this was only because he could then generate
an object file or a shared library from the MS COFF file, solving his problem!

--
Silvio Clécio


Re: [fpc-devel] Optimization of redundant mov's

2017-03-19 Thread Jonas Maebe

Martok wrote:

>  a:= CurrentHash[0]; b:= CurrentHash[1]; c:= CurrentHash[2]; d:= CurrentHash[3];
> 000100074943 488b8424a002 mov 0x2a0(%rsp),%rax
> 00010007494B 4c8b5038 mov 0x38(%rax),%r10
> 00010007494F 488b8424a002 mov 0x2a0(%rsp),%rax
> 000100074957 4c8b5840 mov 0x40(%rax),%r11
> 00010007495B 488b9424a002 mov 0x2a0(%rsp),%rdx
> 000100074963 488b4248 mov 0x48(%rdx),%rax
> 000100074967 488b9424a002 mov 0x2a0(%rsp),%rdx
> 00010007496F 488b6a50 mov 0x50(%rdx),%rbp
>
> Every single one of the "mov 0x2a0(%rsp), %rxx" instructions except the first is
> redundant and causes another memory round-trip. At the same time, more registers
> are used, which probably makes other optimizations more difficult, especially
> when something similar happens on i386.
>
> Now, the fun part: I haven't been able to build a simple test that causes the
> same issue (the self-pointer already is in %rcx and not fetched from the stack
> each time), so I have a feeling this may be a side effect of some other part of
> the code.

It's called register spilling: once there are no registers left to hold
values, the compiler has to pick registers whose value will be kept in
memory instead. Register allocation is an NP-complete problem, so the
result will never be 100% optimal (at least if you don't want to wait
forever while the compiler checks out all possible assignments). One
possible heuristic, which is used by FPC's register allocator, is to
spill the register that conflicts with the largest number of other
registers (to minimise the number of registers spilled to memory).
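
To make that heuristic concrete, here is a minimal sketch in Free Pascal. It
is not FPC's actual register allocator: the interference matrix, the five
virtual registers and the names AddConflict and PickSpillCandidate are all
made up for illustration. It only shows the selection rule described above,
i.e. choosing the virtual register that conflicts with the most others.

```pascal
program SpillHeuristic;
{$mode objfpc}

const
  NumVRegs = 5; // hypothetical number of virtual registers

type
  // G[A, B] = True means A and B are live at the same time (they conflict)
  TInterference = array[0..NumVRegs - 1, 0..NumVRegs - 1] of Boolean;

// Record a conflict in both directions (hypothetical helper)
procedure AddConflict(var G: TInterference; A, B: Integer);
begin
  G[A, B] := True;
  G[B, A] := True;
end;

// Return the virtual register with the largest number of conflicts
function PickSpillCandidate(const G: TInterference): Integer;
var
  I, J, Degree, BestDegree: Integer;
begin
  Result := -1;
  BestDegree := -1;
  for I := 0 to NumVRegs - 1 do
  begin
    Degree := 0;
    for J := 0 to NumVRegs - 1 do
      if (I <> J) and G[I, J] then
        Inc(Degree);
    if Degree > BestDegree then
    begin
      BestDegree := Degree;
      Result := I;
    end;
  end;
end;

var
  Graph: TInterference;
begin
  FillChar(Graph, SizeOf(Graph), 0);
  // v0 stands for a value that is live across everything else
  AddConflict(Graph, 0, 1);
  AddConflict(Graph, 0, 2);
  AddConflict(Graph, 0, 3);
  AddConflict(Graph, 0, 4);
  AddConflict(Graph, 1, 2);
  WriteLn('spill candidate: v', PickSpillCandidate(Graph));
end.
```

In this made-up graph v0 conflicts with every other value, much like a
self-pointer that is needed throughout a long method, so it is the one that
gets picked. Whether that is exactly what happened in the code above is an
assumption.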

There are techniques to more optimally spill (e.g. live range
splitting), and there are also other kinds of optimisations that could
be run after register allocation to make the code more optimal. CSE at
the assembler level could be used in this case. That's a very complex
undertaking for relatively little gain though. E.g. those memory loads
are probably optimised by the processor itself (not necessarily coming
even from the L1 cache, but possibly from the write-back buffer).


Jonas


Re: [fpc-devel] Staticaly link C/C++ library (.lib) into FreePascal on Windows

2017-03-19 Thread Sven Barth via fpc-devel
On 19.03.2017 at 04:53, "silvioprog" wrote:
>
> On Wed, Mar 15, 2017 at 4:38 AM, LacaK  wrote:
>>>
>>> I forgot a question, could you send your ippi .a files for us? If so, I
>>> can try a test here. :-)
>>
>>
>> Yes of course: I have uploaded them here
>> http://uschovna.zoznam.sk/download?code=1342688547227-EZyyeVzToDVVkkbJNCbN
>> But be aware that I am on Windows, not Linux. (Despite this, I have also
>> added to the ZIP the .a files as they are installed by Intel into the
>> directory "Linux". Only .lib files are installed in the directory "Windows".)
>> If I may repeat my question: can I use ".a" libraries on Windows as well?
>> If not, can I use a ".lib" created by C/C++? (I do not know how they are built.)
>> Thank you
>>
>> -Laco.
>
>
> Unfortunately you can't use the static libraries (.a) of Intel because
> they are generated for Linux, even though static libraries are
> cross-platform.

Nonsense. Static libraries are as platform-specific as any other binary
code; after all, they need to call OS functions.

> I'm not sure about the .lib files. MS's COFF files adopt the .lib
> extension, but the sizes below are a little bit strange:
>
> `libippi.a`:
> . original - 251 MB;
> . stripped - 192 MB.
>
> `libippi.lib`:
> . original - 853 KB;
> . stripped - no strip needed, it is already small.

Seems like the second one is merely an import library for the DLL instead
of a real static library.
And of course that is COFF as well. MSVC only supports COFF.

Regards,
Sven


Re: [fpc-devel] Some questions about compiler work on x86_64-win64

2017-03-19 Thread Sven Barth via fpc-devel
On 18.03.2017 at 23:11, "Bishop" wrote:
>
> 03/18/17 00:51:05, Sven Barth via fpc-devel <fpc-devel@lists.freepascal.org>:
>
>
> > Bo, the main purpose of this is to detect when a new thread is started
> > and, more importantly, terminated, because only then can we free the
> > threadvar area of the thread accordingly (if the thread is an external one,
> > not one started using BeginThread or TThread).
>
> Thanks, now I understand how it works. Also, as I understand it, on Linux
> (and other Unixes) the threadvar area for external threads is allocated on
> first access (and freed via pthread's ability to call a destructor for a key).

Correct.
(and my first word should have been "No", not "Bo"; stupid smartphone
keyboard)

>
> > Why *should* it be auto generated if one can use a table and let the
> > RTL do the rest.
> Isn't it better to do everything that can be done at compile time? It's not
> a more complex solution for the compiler code, and as I see it, it's more
> harmonious. (This concerns not only INITFINAL but all tables that the RTL
> uses to do the work of the compiler/linker, for example FPC_THREADVARTABLES.
> Different modules, I mean DLLs or SOs, use different TLS keys for their
> threadvar regions. But why must the offset of a variable from the start of
> the threadvar region be computed at run time? Isn't that the linker's job?)
> Perhaps this depends on those "dynamic packages"?

If you have different modules (the binary and the libraries) then they are
*separate* entities, because a Pascal library could be used with a C binary,
and thus a library has the whole RTL statically linked (or at least the
part that is used).
Only dynamic packages allow one to transparently have units be part of
different binary modules while still forming one whole application. Package
libraries can, however, only be used by a binary compiled with the same
compiler, as they rely on quite a bit of compiler magic.

>
> > Also with the addition of dynamic packages this will move even more
> > towards a table based approach.
> Where can I read about what they are and why we need them? What kind of
> problems are they meant to solve? We already have DLLs/SOs, and as far as I
> know and can see, that is enough for now. Possibly my knowledge is not
> enough to see the whole problem.

With dynamic packages you can share classes, strings, memory, etc. between
the modules (the main binary and the different package libraries), because
the RTL will only exist once. And all this transparently for the user.
When you use ordinary libraries you need to use a shared memory manager to
pass strings around and you can't use the "as" and "is" operators inside
the main binary on classes passed in from the libraries (and by extension
this also applies to exceptions).
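
The reason "is" fails across ordinary libraries can be sketched as follows.
This is a simplified illustration, not the RTL's actual code: the helper name
IsInstanceOf and the classes TBase, TDerived and TOther are hypothetical. The
check boils down to comparing class references by address while walking up
the parent chain, so a class that exists once in the main binary and once in
a library with its own copy of the RTL is represented by two different
addresses and never matches.

```pascal
program IsOperatorSketch;
{$mode objfpc}

type
  TBase = class end;
  TDerived = class(TBase) end;
  TOther = class end;

// Simplified stand-in for what "Obj is AClass" has to decide:
// walk the parent chain and compare class references by address.
function IsInstanceOf(Obj: TObject; AClass: TClass): Boolean;
var
  C: TClass;
begin
  Result := False;
  if Obj = nil then
    Exit;
  C := Obj.ClassType;
  while C <> nil do
  begin
    if C = AClass then
      Exit(True);
    C := C.ClassParent; // nil once we step past TObject
  end;
end;

var
  Obj: TObject;
begin
  Obj := TDerived.Create;
  try
    // Within one binary the sketch and the built-in operator agree
    WriteLn(IsInstanceOf(Obj, TBase), ' ', Obj is TBase);
    WriteLn(IsInstanceOf(Obj, TOther), ' ', Obj is TOther);
  finally
    Obj.Free;
  end;
end.
```

With dynamic packages there is only one RTL and therefore only one class
reference per class, so the comparison works across module boundaries.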

>
> > But you can set the corresponding PE flag for ASLR using $SetPEOpts (or
> > so). No recompilation needed in that case.
> I can. But what if I don't want an ASLR binary? That's totally valid.

Since ASLR is disabled by default in FPC that question is useless.

>
> > Microsoft recommended that approach for Win64 so why should we do the
> > work and implement it differently even if ASLR isn't enabled by default for
> > FPC executables?
> A recommendation is not a law (as it is with SEH on Win64). C compilers
> allow both types of programs, depending on what the programmer needs. Does
> it take much work to change it? As I see it, it's just one check in the
> compiler code for global variables (if PIC is selected, use RIP-relative
> addressing; if not, use direct addressing). It's already done on Linux. I
> think it would be better to give the compiler user more possibilities when
> it costs almost nothing.

If it is so important to you: patches are welcome. But keep in mind that
the default needs to be the status quo.

Regards,
Sven


[fpc-devel] Optimization of redundant mov's

2017-03-19 Thread Martok
Hi all,

there has been some discussion about FPC's optimizer in #31444, prompting me to
investigate some of my own code. Generally speaking the generated assembler is
not all that bad (I like how it uses LEA for almost all integer arithmetic),
but I keep seeing sections with lots of redundant MOVs.

Example, from a SHA512 implementation:
CurrentHash is a field of the current class, compiled with anything above -O2,
-CpCOREAVX2, -Px86_64.

 a:= CurrentHash[0]; b:= CurrentHash[1]; c:= CurrentHash[2]; d:= CurrentHash[3];
000100074943 488b8424a002 mov 0x2a0(%rsp),%rax
00010007494B 4c8b5038 mov 0x38(%rax),%r10
00010007494F 488b8424a002 mov 0x2a0(%rsp),%rax
000100074957 4c8b5840 mov 0x40(%rax),%r11
00010007495B 488b9424a002 mov 0x2a0(%rsp),%rdx
000100074963 488b4248 mov 0x48(%rdx),%rax
000100074967 488b9424a002 mov 0x2a0(%rsp),%rdx
00010007496F 488b6a50 mov 0x50(%rdx),%rbp

Every single one of the "mov 0x2a0(%rsp), %rxx" instructions except the first is
redundant and causes another memory round-trip. At the same time, more registers
are used, which probably makes other optimizations more difficult, especially
when something similar happens on i386.

Now, the fun part: I haven't been able to build a simple test that causes the
same issue (the self-pointer already is in %rcx and not fetched from the stack
each time), so I have a feeling this may be a side effect of some other part of
the code.

Does this sound familiar to anyone? If so, what could I do about it?


Regards,

Martok
