On 30.06.2023 01:14, Detlef Riekenberg wrote:
Hi Herman

On 25.06.2023 20:30, Herman ten Brugge via Tinycc-devel wrote:

I just pushed a patch to fix this.

Hi Hermann,

some numbers from Win32:

before:
   # 6.334 s, 85768 lines/s, 27.9 MB/s
after first patch:
   # 11.825 s, 45941 lines/s, 14.9 MB/s
after second patch:
   # 10.406 s, 52206 lines/s, 17.0 MB/s

Hm ...

I do not think that we really need a 64-bit hash (with a 64-bit multiply)
for the complete file content.

Actually it has nothing to do with the hash yet.  Also, #pragma once
is not used at all.  Here is some more data:

# 25401 idents, 4838227 lines, 176764178 bytes (168.6 MB)
# 10.405 s, 52211 lines/s, 17.0 MB/s
# text 4705836, data.rw 3084, data.ro 483724, bss 524940 bytes

# 172 files compiled, 13771 included, 5087 skipped, 43749 not found
# 72308 files stat'ed, 0 hashed

Which means tcc compiled 172 files on one command line, each of them
including on average ~110 headers, of which ~30 are skipped by the
include cache mechanism (checking the #ifndef _XXX_H_ around the file).

The result now is that the new stat() is called 72308 times, mostly
failing (due to include path search).  Which means that at least on
Windows just those stat() calls are taking about the same time as
tcc parsing ~169MB of source code, and that the cache makes tcc
much slower than no cache at all.

(If you step a bit into what that stat() from msvcr90.dll does, then
it's no surprise really.)
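
To get a feeling for the cost of those failing lookups on a given system, a
small stand-alone timing helper can be used.  This is only a sketch, not code
from tcc: the function names and the generated file names are made up for
illustration, and clock() is a coarse timer.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Call stat() n times on paths that are expected not to exist and count
   the failures.  This mimics an include-path search in which most
   candidate locations do not contain the header. */
static long count_failed_stats(const char *prefix, long n)
{
    struct stat st;
    char path[512];
    long i, failed = 0;

    assert(prefix != NULL);
    for (i = 0; i < n; i++) {
        /* hypothetical file name, chosen only so that the lookup fails */
        snprintf(path, sizeof path, "%s/no_such_header_%ld.h", prefix, i);
        if (stat(path, &st) != 0)
            failed++;
    }
    return failed;
}

/* Return the seconds reported by clock() for n stat() calls and store
   the number of failed calls in *failed. */
static double time_failed_stats(const char *prefix, long n, long *failed)
{
    clock_t t0 = clock();

    *failed = count_failed_stats(prefix, n);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling time_failed_stats() with n = 72308 and a directory from the real
include path gives a rough idea of how much of the total run time goes into
stat() alone on that machine.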

In addition to the filename and the file size, I suggest using "st_mtime":
  much cheaper and available for free.

gcc, at least 3.4.6, checks st_size and st_mtime, and then does a plain
memcmp() over the entire buffers (cppfiles.c:should_stack_file()).
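
The scheme described above (cheap stat() checks first, full comparison only
when they match) could be sketched roughly as follows.  This is not the gcc
code and not what tcc does; same_header() and read_all() are hypothetical
helpers written only to illustrate the order of the checks.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Read 'size' bytes of a file into a malloc'ed buffer; NULL on error. */
static char *read_all(const char *path, size_t size)
{
    FILE *f = fopen(path, "rb");
    char *buf;

    if (!f)
        return NULL;
    buf = malloc(size ? size : 1);
    if (buf && fread(buf, 1, size, f) != size) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    return buf;
}

/* Return 1 if both paths hold the same content, 0 if not, -1 on error. */
static int same_header(const char *a, const char *b)
{
    struct stat sa, sb;
    char *ba, *bb;
    int same;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    /* cheap checks first: a different size or mtime means "not the same" */
    if (sa.st_size != sb.st_size || sa.st_mtime != sb.st_mtime)
        return 0;
    /* only now pay for reading and comparing the complete contents */
    ba = read_all(a, (size_t)sa.st_size);
    bb = read_all(b, (size_t)sb.st_size);
    if (!ba || !bb) {
        free(ba);
        free(bb);
        return -1;
    }
    same = memcmp(ba, bb, (size_t)sa.st_size) == 0;
    free(ba);
    free(bb);
    return same;
}
```

Note that two identical copies written at different times have different
st_mtime values and are therefore reported as different by this sketch.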

BUT: tinycc does have a mission that gcc does not have, which is to be
fast and simple.  So I guess it will have to make some restrictions
to the feature as to what extent it can be supported sensibly.

For example, tinycc could require at least the same basenames (as in the reported
case).  That would drastically reduce the number of possible candidates
and still would work for all purposes of #pragma once except when 'b.h' is
a link to 'a.h' and both are used in the same translation unit.
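
As an illustration of the basename restriction, a pre-filter could look like
the sketch below.  The helpers base_name() and same_base_name() are
hypothetical and not taken from the tcc sources; the comparison is
case-sensitive, which may not be what is wanted on Windows.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer to the last path component.
   Both '/' and '\\' are accepted as separators. */
static const char *base_name(const char *path)
{
    const char *p, *base = path;

    assert(path != NULL);
    for (p = path; *p; p++)
        if (*p == '/' || *p == '\\')
            base = p + 1;
    return base;
}

/* Return 1 if both paths end in the same file name, else 0.
   Only files passing this test would need any further (expensive)
   comparison. */
static int same_base_name(const char *a, const char *b)
{
    return strcmp(base_name(a), base_name(b)) == 0;
}
```

With such a filter, 'dir1/a.h' and 'dir2/a.h' remain candidates for being the
same file, while 'a.h' and 'b.h' are never compared at all.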

-- grischka

I tested this also before committing. I could not find a problem.
I only have an x86_64 machine on redhat linux and a raspberry pi with 32
and 64 bits.
I also have no Windows any more and my i386 machine died about 10 years ago.

So I did the measurement with wine (32/64 bits) and saw no difference
before and after commit.

Your machine is too recent / too fast / has too much memory.
* Multiplications on recent processors are much faster than on older processors.
* An SSD is so fast that loading many includes many times causes no noticeable 
delay.
* With a huge amount of system memory, your include directory entries and many 
include files are cached.

For speed comparison tests, a low-resource VM or an old system with less RAM 
and an HDD will show the slowdown.
(disable kernel VM support / force JIT mode to make the emulated processor 
slower)

I cannot currently think of a better solution for pragma once. Maybe you
can?

* replace the hash with "st_mtime"



