On Sat, 1 May 2021, Yakov wrote:

> On this sample using macros speeds the program up 400%

Be that as it may, it's not representative of most applications. For instance, CPython's performance increases by only 10-15% with the inliner turned on.

(And actually that's misleading, because inlining enables many other optimizations. The impact for tcc if it _only_ added inlining would probably be much less. Unfortunately gcc doesn't seem to be willing to inline at -O0.)


> I have recently read a paper about a Linear Scan Register Allocator[1]; they claim it gives you 95% of the performance of a Graph Coloring Register Allocator in basically no time, and requires no SSA.

Yes, linear scan is quite nice. It's not really compatible with tcc's compilation model--nor are most other optimizations, including inlining--but I mentioned it because it's probably the most worthwhile optimization a compiler can perform and it's not too difficult.

In the context of a compiler like gcc or llvm, linear scan takes almost no time at all. However, it depends on a certain model of code that tcc does not currently provide. Gcc already produces such a model, even without optimizations, and linear scan takes advantage of the information which is already there; a big part of the reason why tcc is so fast is that it produces no such model. For gcc this is a sunk cost; for tcc, not.
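
To make the cost argument concrete, here is a minimal Python sketch of linear scan, loosely following the well-known Poletto and Sarkar formulation (which may or may not be the paper cited as [1]). It assumes the "model" in question boils down to precomputed live intervals, one (start, end) pair per virtual register. That reading, the function name linear_scan, and the sample intervals are illustrative assumptions, not anything taken from gcc, llvm, or tcc.

```python
import bisect

def linear_scan(intervals, num_regs):
    """Assign registers to live intervals (hypothetical helper, not a real API).

    intervals: dict mapping a variable name to an inclusive (start, end) pair.
    Returns (reg_of, spilled): a name -> register dict and a set of spilled names.
    """
    free = list(range(num_regs - 1, -1, -1))  # pop() hands out register 0 first
    active = []      # (end, name) pairs currently holding a register, sorted by end
    reg_of = {}
    spilled = set()

    # Visit intervals in order of increasing start point.
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(reg_of[old])

        if free:
            reg_of[name] = free.pop()
            bisect.insort(active, (end, name))
        elif active and active[-1][0] > end:
            # No free register: spill the active interval that ends last
            # and give its register to the current interval.
            _, victim = active.pop()
            reg_of[name] = reg_of.pop(victim)
            spilled.add(victim)
            bisect.insort(active, (end, name))
        else:
            # The current interval ends last (or there are no registers): spill it.
            spilled.add(name)

    return reg_of, spilled

# Assumed sample input: four variables competing for two registers.
sample = {"a": (0, 9), "b": (1, 3), "c": (2, 6), "d": (4, 8)}
regs, spilled = linear_scan(sample, 2)
print(regs)             # {'b': 1, 'c': 0, 'd': 1}
print(sorted(spilled))  # ['a']
```

The allocation itself is one sort plus a single pass over the intervals. The sketch takes the intervals as given; producing them requires liveness information over the whole function, which is the part a compiler that builds no such model would have to pay for.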

 -E

_______________________________________________
Tinycc-devel mailing list
Tinycc-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/tinycc-devel