> On Nov 20, 2023, at 17:52, Alexander Monakov <amona...@ispras.ru> wrote:
> 
> 
> On Mon, 20 Nov 2023, Maxim Kuvyrkov wrote:
> 
>> This patch avoids sched-deps.cc:find_inc() creating exponential number
>> of dependencies, which become memory and compilation time hogs.
>> Consider example (simplified from PR96388) ...
>> ===
>> sp=sp-4 // sp_insnA
>> mem_insnA1[sp+A1]
>> ...
>> mem_insnAN[sp+AN]
>> sp=sp-4 // sp_insnB
>> mem_insnB1[sp+B1]
>> ...
>> mem_insnBM[sp+BM]
>> ===
>> ... in this example find_modifiable_mems() will arrange for mem_insnA*
>> to be able to pass sp_insnA, and, while doing this, will create
>> dependencies between all mem_insnA*s and sp_insnB -- because sp_insnB
>> is a consumer of sp_insnA.  After this sp_insnB will have N new
>> backward dependencies.
>> Then find_modifiable_mems() gets to mem_insnB*s and starts to create
>> N new dependencies for _every_ mem_insnB*.  This gets us N*M new
>> dependencies.

[For avoidance of doubt, below discussion is about the general implementation 
of find_modifiable_mems() and not about the patch.]

> 
> It's a bit hard to read this without knowing which value of 'backwards'
> is assumed.
> 
> Say 'backwards' is true and we are inspecting producer sp_insnB of mem_insnB1.
> This is a true dependency. We know we can break it by adjusting B1 by -4, but
> we need to be careful not to move such modified mem_insnB1 above sp_insnA, so
> we need to iterate over *incoming true dependencies* of sp_insnB and add them.
> 
> But the code seems to be iterating over *all incoming dependencies*, so it
> will e.g. take anti-dependency mem_insnA1 -> sp_insnB and create a true
> dependency mem_insnA1 -> mem_insnB1'. This seems utterly inefficient, if my
> understanding is correct.

Yeap, your understanding is correct.  However, this is what 
find_modifiable_mems() has to do to avoid complicated analysis of second-level 
dependencies.

I think, the optimization that find_modifiable_mems() does can be implemented 
as part of main dependency analysis, and I'm going to discuss that with Bernd.  
I might be missing something, but it seems that instruction transformations 
that find_modifiable_mems() is doing are simpler than what ia64 speculation is 
doing.  So, I think, we should be able to leverage speculation infrastructure, 
rather than doing this optimization "on the side" of normal scheduling.

--
Maxim Kuvyrkov
https://www.linaro.org

Reply via email to