Single-pass compilers can still do peephole optimization. I once
demonstrated this with cases where tcc generates
MOV EAX,const
MOV [location],EAX
and caught this case to generate instead
MOV DWORD [location],const
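
As a rough illustration of that rewrite, here is a minimal sketch in C over
a toy instruction list. The `struct insn` layout, the `op_kind` names and
the `peephole` function are hypothetical and are not tcc's internal
representation. The sketch also assumes the register value is not needed
after the store, which a real implementation would have to check.

```c
#include <stddef.h>

/* Hypothetical toy instruction forms; not tcc's internal representation. */
enum op_kind {
    OP_MOV_REG_IMM,   /* MOV reg,const       */
    OP_MOV_MEM_REG,   /* MOV [location],reg  */
    OP_MOV_MEM_IMM,   /* MOV DWORD [location],const */
    OP_OTHER
};

struct insn {
    enum op_kind kind;
    int reg;   /* register number (0 stands for EAX in this sketch) */
    int loc;   /* symbolic id of the memory location */
    int imm;   /* immediate constant */
};

/*
 * Scan adjacent pairs and fuse "MOV reg,const / MOV [location],reg"
 * into "MOV DWORD [location],const".
 * Assumption: reg is dead after the store.
 * Rewrites the array in place and returns the new instruction count.
 */
size_t peephole(struct insn *code, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n
            && code[i].kind == OP_MOV_REG_IMM
            && code[i + 1].kind == OP_MOV_MEM_REG
            && code[i].reg == code[i + 1].reg) {
            struct insn fused = { OP_MOV_MEM_IMM, -1,
                                  code[i + 1].loc, code[i].imm };
            code[out++] = fused;
            i++;                    /* skip the second half of the pair */
        } else {
            code[out++] = code[i];  /* copy everything else unchanged */
        }
    }
    return out;
}
```

Because the window is only two instructions wide, a check of this kind can
run while code is being emitted and does not need a second pass.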

Back to the topic: AVX-512 is difficult to support at the language level.
In the case of tcc it is impossible, because converting array operations
into AVX-512 instructions requires a global view of the program.

However, it is possible at the library level. Simply write the matrix
manipulation functions in assembly language. Inside each function, load the
data into AVX registers, do the calculations in registers, and write the
result back to memory on exit.
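
A minimal sketch of such a library routine follows. It uses GCC/Clang
intrinsics in place of hand-written assembly, so it would have to be built
with one of those compilers rather than with tcc. The names `vec_add`,
`vec_add_avx512` and `vec_add_scalar` are hypothetical. It adds two float
arrays, 16 elements per 512-bit register, and falls back to a scalar loop
when AVX-512F is not available at run time.

```c
#include <stddef.h>

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* AVX-512 path: load into registers, compute, store back to memory. */
__attribute__((target("avx512f")))
static void vec_add_avx512(float *dst, const float *a, const float *b,
                           size_t n)
{
    size_t i = 0;

    for (; i + 16 <= n; i += 16) {
        __m512 va = _mm512_loadu_ps(a + i);   /* 16 floats from a */
        __m512 vb = _mm512_loadu_ps(b + i);   /* 16 floats from b */
        _mm512_storeu_ps(dst + i, _mm512_add_ps(va, vb));
    }
    for (; i < n; i++)                        /* leftover elements */
        dst[i] = a[i] + b[i];
}
#endif

/* Portable fallback used when AVX-512F is not available. */
static void vec_add_scalar(float *dst, const float *a, const float *b,
                           size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Hypothetical library entry point: dst[i] = a[i] + b[i]. */
void vec_add(float *dst, const float *a, const float *b, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
    if (__builtin_cpu_supports("avx512f")) {
        vec_add_avx512(dst, a, b, n);
        return;
    }
#endif
    vec_add_scalar(dst, a, b, n);
}
```

The caller only sees an ordinary C function such as
`vec_add(dst, a, b, n)`, so the calling code needs no knowledge of AVX-512.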


On Sat, Feb 5, 2022 at 4:55 PM Christian Jullien <eli...@orange.fr> wrote:

> An optimizing compiler needs several passes to operate:
> - constant folding
> - register allocation
> - peephole optimization
> - branch prediction
> ...
>
> When it knows the target, it can reorganize code to keep as much data as
> possible in L1 cache and to have the longest series of instructions that
> can be executed without breaking the pipeline, i.e. instructions run
> nearly in parallel.
>
> Tcc, which is a one-pass compiler, definitely loses on this point. On the
> other hand, one pass makes it damn fast and that's why we love it.
>
> We can't have the butter and the money for the butter
>
> -----Original Message-----
> From: rem...@tutanota.com [mailto:rem...@tutanota.com]
> Sent: Saturday, February 05, 2022 16:10
> To: Jullien; Tinycc Devel
> Cc: Tinycc Devel
> Subject: Re: [Tinycc-devel] Optimizing for avx512
>
> Feb 5, 2022, 11:01 by eli...@orange.fr:
>
> > The price to pay for its really fast compilation is that the generated
> > code is poor compared to gcc, clang or vc++ (among others). Depending on
> > your program, expect it to be roughly 2 to 4x slower.
> >
> I would say that this is not always the case. And correct me if I'm wrong,
> but aren't optimizations (except a few of them) mostly needed because the
> programmer wrote bad code and the compiler found better instructions to do
> the same thing? Inline assembly exists after all, so if you really, really
> care about performance, you should probably use inline assembly in the most
> critical algorithms/functions. I've seen some code run the same on TCC and
> GCC, so I suppose optimization doesn't always work magic. Or you may get a
> 5% increase or even less. In any case, I would suggest using both: TCC in
> general, and GCC/Clang for the critical parts that will benefit hugely from
> the optimizations these compilers can do.
>
> But of course just my opinion on the topic.
>
>
> _______________________________________________
> Tinycc-devel mailing list
> Tinycc-devel@nongnu.org
> https://lists.nongnu.org/mailman/listinfo/tinycc-devel
>