OK. I prefer to just keep scalar load + vmv.v.x as the default, since I
believe most machines prefer it this way.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-05-31 06:09
To: 钟居哲; andrew; rdapp.gcc
CC: gcc-patches; kito.cheng; palmer
Subject: Re: [PATCH] RISC-V: Synthesize power-of-two constants.
 
 
On 5/30/23 16:01, 钟居哲 wrote:
> I agree with Andrew.
> 
> And I don't think this patch is appropriate, for the following reasons:
> 1. This patch increases the vector workload on the machine, since
>       it converts scalar load + vmv.v.x into vmv.v.i + vsll.vi.
This is probably uarch dependent.  I can probably construct cases where 
the first will be better and I can probably construct cases where the 
latter will be better.  In fact the recommendation from our uarch team 
is to generally do this stuff on the vector side.
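
For reference, a minimal C sketch of the check behind the two sequences being compared. It is not code from the patch; the function names are made up for illustration, and it assumes vsll.vi takes a 5-bit unsigned shift immediate (0..31) and that vmv.v.i can materialize the value 1.

```c
#include <stdint.h>

/* Return log2(c) if c is a power of two, otherwise -1. */
int pow2_shift(uint64_t c)
{
    if (c == 0 || (c & (c - 1)) != 0)
        return -1;
    int k = 0;
    while ((c >>= 1) != 0)
        k++;
    return k;
}

/* Hypothetical helper (not a GCC function): returns 1 when the splat of c
   could be built entirely on the vector side as
       vmv.v.i  v1, 1
       vsll.vi  v1, v1, k
   instead of the scalar route
       <materialize or load c into a scalar register, e.g. a5>
       vmv.v.x  v1, a5
   Assumption: the vsll.vi shift amount is limited to 0..31.  */
int can_synthesize_on_vector_side(uint64_t c)
{
    int k = pow2_shift(c);
    return k >= 0 && k <= 31;
}
```

With c = 1024 the shift amount is 10, so the vector-side form would be vmv.v.i + vsll.vi by 10. Which of the two forms is actually cheaper is exactly the uarch-dependent part.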
 
 
 
> 2. For a multi-issue OoO machine, scalar instructions are very cheap
>      when they are interleaved with vector codegen. For example, a sequence
>      like this:
>        scalar insn
>        scalar insn
>        vector insn
>        scalar insn
>        vector insn
>        ....
>        In such a situation, we can issue multiple instructions simultaneously,
>        and the latency of the scalar instructions will be hidden, so the
>        scalar instructions are cheap. Whereas this patch, by increasing the
>        vector pipeline workload, is not friendly to the OoO machines I
>        mentioned above.
I probably need to be careful what I say here :-)  I'll go with: mixing 
vector/scalar code may incur certain penalties on some 
microarchitectures, depending on the exact code sequences involved.
 
 
> 3.   I can imagine the only benefit of this patch is that we can reduce
>        scalar register pressure in some extreme circumstances. However, I
>        don't think this benefit is "real", since GCC should schedule the
>        instruction sequence well once we have tuned the vector instruction
>        scheduling model and cost model well, making such register live
>        ranges very short when the scalar register pressure is very high.
> 
> Overall, I disagree with this patch.
What I think this all argues is that it'll likely need to be uarch 
dependent.    I'm not yet sure how to describe the properties of the 
uarch in a concise manner to put into our costing structure, though.
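
Purely as an illustration of what "uarch dependent" could look like, here is a small C sketch. Everything in it is hypothetical: the struct, the flag, and the function are invented for this example and do not correspond to any existing GCC tuning structure or hook.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-uarch tuning record; NOT an existing GCC structure. */
struct vec_const_tune {
    /* True if the core prefers building constants on the vector side. */
    bool prefer_vector_const_synthesis;
};

enum const_strategy {
    CONST_SCALAR_BROADCAST, /* scalar constant + vmv.v.x */
    CONST_VECTOR_SYNTH      /* vmv.v.i + vsll.vi */
};

/* Choose a strategy for splatting constant c under the given tuning.
   Only power-of-two constants are candidates for vector-side synthesis;
   everything else falls back to the scalar broadcast.  */
enum const_strategy choose_const_strategy(const struct vec_const_tune *tune,
                                          uint64_t c)
{
    bool is_pow2 = c != 0 && (c & (c - 1)) == 0;
    if (tune->prefer_vector_const_synthesis && is_pow2)
        return CONST_VECTOR_SYNTH;
    return CONST_SCALAR_BROADCAST;
}
```

A single boolean is almost certainly too coarse to capture the real trade-offs (issue width, scalar-to-vector transfer cost, register pressure); it only shows where such a property would plug in.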
 
jeff
 
