Hi Nico,

I finally got around to trying to make a prototype and I'm unsure how to 
proceed. As an overview, I want to find a triangle CFG and hoist the 
conditional 'middle' block into its only predecessor. Something like this:
B0:
  x = ComparisonOp a, b, kind
  BranchOp x, B2, B1
B1:
  y = ComparisonOp c, d, kind
  BranchOp y, B2, B3
B2:
  z = ....
-------------------------------------
B0:
  x = ComparisionOp a, b, kind
  y = ConditionalComparisonOp a, b, x, kind
  BranchOp y, B2, B3
B2:
  z = ...
B3:

My current implementation would, from `Bind`, search the CFG from B2 and 
decide that B1 can be hoisted into B0. From there, we need to replace the 
BranchOp in B0 with the one in B1. We'r'e then replacing the condition of 
that branch with a new ConditonalCompareOp.

If this sounds feasible, where/how should I do the replacing? I'm still yet 
to figure out if there is an interface which a reducer has to implement, 
but I've seen several examples which sound promising: REDUCE(Branch), 
ReduceBranchCondition and ReduceInputGraphBranch. Would any of these be 
right place or would they be a misuse of how the reducer stacks operate..?

cheers,
sam
On Thursday, July 27, 2023 at 3:47:08 PM UTC+1 Nico Hartmann wrote:

> Great to hear that it's working as expected.
>
> The phase ordering is a hard problem indeed. There is no one answer to 
> that question. Every phase has a cost in the sense that the graph has to be 
> iterated and a new graph has to be emitted. Although the cost is relatively 
> small, it is not free. In other words, you might not want to put every 
> reducer in its own phase. Besides some reducers that have a few additional 
> constraints (e.g. they always run on the input graph or they always need to 
> be at the bottom of the stack), if you have a situation where ReducerA 
> recognizes and optimizes a certain pattern and you have a ReducerB that may 
> produce those as part of some other reductions/optimizations, then it might 
> typically be a good idea to run ReducerA (somewhere) behind ReducerB in a 
> stack.
>
> Other than that, for the pipeline we have so far, we just tried different 
> combinations and compared performance of generated code and of the 
> compilation itself.
>
> Hope that helps a bit,
> Nico
>
> On Thu, Jul 27, 2023 at 3:49 PM Sam Parker-Haynes <sam.p...@arm.com> 
> wrote:
>
>> Right, it is now working as expected. I've broken the WasmOptimizePhase 
>> into two:
>>
>> void WasmOptimizePhase::Run(Zone* temp_zone) {                           
>>                                                       
>>   UnparkedScopeIfNeeded scope(PipelineData::Get().broker(),
>>                               v8_flags.turboshaft_trace_reduction);
>>   // TODO(14108): Add more reducers as needed.
>>   OptimizationPhase<WasmLoweringReducer,
>>                     
>> MachineOptimizationReducerSignallingNanPossible>::Run(temp_zone);
>>   OptimizationPhase<ValueNumberingReducer>::Run(temp_zone);
>>
>> And this indeed cleans up the graph to remove dead code, which allows the 
>> backend to perform optimisations under CanCover:
>>
>> B5: AO#4 (no frame)  instructions: [11, 12)
>>  predecessors: B4
>>       11: gap () ()
>>           Arm64Cmp32 && branch if equal v0(R) #50 [rpo_immediate:7] 
>> [rpo_immediate:6]
>>
>> So, my final question :) how does one decide how to organize passes? 
>> There's always going to be an ordering problem, but what are the trade-offs 
>> to grouping/splitting reducers into OptimizationPhase(s)? 
>>
>> Sam
>> On Thursday, July 27, 2023 at 1:24:25 PM UTC+1 Sam Parker-Haynes wrote:
>>
>>> Ah, I think I was miss-reading the Arm64 instructions: 
>>> Arm64CompareAndBranch is cbz (?) so the control-flow would work out... Too 
>>> many levels of conditionals for me!
>>> On Thursday, July 27, 2023 at 11:34:02 AM UTC+1 Sam Parker-Haynes wrote:
>>>
>>>> If I'm reading the IR correctly, the branch has already been rewritten 
>>>> to directly use #26, so I'm not expecting there to be two comparisons. The 
>>>> branch targets are the same, but the condition is being inverted (again) 
>>>> during ISel. I think the IR should be translated to this:
>>>>  
>>>> v7(R) = Arm64Cmp32 && branch if equal v0(R) #50 [rpo_immediate:7] 
>>>> [rpo_immediate:6]
>>>>
>>>> And many thanks for the pipeline overview! I will continue playing.
>>>>
>>>>
>>>> On Thursday, July 27, 2023 at 9:52:27 AM UTC+1 Nico Hartmann wrote:
>>>>
>>>>> Hi Sam,
>>>>>
>>>>> Briefly summarized, the Turboshaft pipeline is organized into phases 
>>>>> where each phase is defined by a stack of reducers (e.g. 
>>>>> `OptimizationPhase<SomeReducer, SomeOtherReducer, 
>>>>> AnotherReducer>::Run()`). 
>>>>> A Turboshaft graph is passed to each phase (the "input graph") and the 
>>>>> phase then iterates over the graph and passes each operation through the 
>>>>> stack. At the bottom of the stack, a new operation (or a sequence of 
>>>>> operations) is generated into the phase's output (the "output graph"). 
>>>>> The  
>>>>> output graph is then the next phase's input. However, some phases run 
>>>>> analyses. Those always process the phase's input graph regardless of 
>>>>> where 
>>>>> they are in the stack. The `DeadCodeEliminationReducer` is one of these 
>>>>> that runs an analysis on the input graph and then skips emitting dead 
>>>>> operations during reduction. As a result, operations that became dead 
>>>>> during a previous reduction in the same phase (reducer stack), will not 
>>>>> be 
>>>>> removed.
>>>>>
>>>>> There are two ways to get rid of unused operations:
>>>>> - When a phase iterates over the input graph, all operations that 
>>>>> don't have uses are just skipped. So once an operation that doesn't have 
>>>>> any uses is emitted into the output graph, it will be removed (or rather 
>>>>> ignored when generating the next graph) at the beginning of the next 
>>>>> phase.
>>>>> - Using the `DeadCodeEliminationReducer`, which is a more powerful 
>>>>> liveness analysis that can do much more. It will run a full fixpoint 
>>>>> analysis on the input graph to find all live operations to remove all 
>>>>> dead 
>>>>> operations in one step. It will also eliminate unnecessary control flow 
>>>>> by 
>>>>> rewriting BranchOp and GotoOp targets and it will rewrite PhiOps and 
>>>>> such.  
>>>>> If you want to make use of this, it has to be in the following phase, 
>>>>> because - as described above - it always runs on the phase's input graph. 
>>>>> However, this is a more expensive analysis+reducer, so I would not 
>>>>> recommend using it for the example you provided above but just rely on 
>>>>> the 
>>>>> next OptimizationPhase run to remove the unused operations.
>>>>>
>>>>> Regarding the ISel, I think the generated code looks okay. `v7(R) = 
>>>>> ...` is the comparison of `v0` with `50` (so that's basically `26: 
>>>>> Word32Equal(4, 25)`) and then we branch based on this result. Please 
>>>>> clarify what you think is wrong here.
>>>>>
>>>>> Nico
>>>>>
>>>>> On Wed, Jul 26, 2023 at 12:35 PM Sam Parker-Haynes <sam.p...@arm.com> 
>>>>> wrote:
>>>>>
>>>>>> Hi Nico,
>>>>>>
>>>>>> I've been having a poke around using a toy wasm example. I'm trying 
>>>>>> to understand how the wasm optimization pipeline works. I have this 
>>>>>> basic 
>>>>>> block:
>>>>>>
>>>>>> BLOCK B7 <- B4                                                       
>>>>>>                                                           
>>>>>>    26: Constant()[word32: 50]                                         
>>>>>>                                                          
>>>>>>    27: Equal(#3, #26)[Word32]                                         
>>>>>>                                                          
>>>>>>    28: Constant()[word32: 0]                                         
>>>>>>                                                           
>>>>>>    29: Equal(#27, #28)[Word32]                                       
>>>>>>                                                           
>>>>>>    30: Branch(#29)[B10, B8, None]
>>>>>>
>>>>>> And I can see that MachineOptimizationReducer changes the branch (29) 
>>>>>> condition, but it doesn't remove (28) and (27):
>>>>>>
>>>>>> BLOCK B6 <- B4
>>>>>>    25: Constant()[word32: 50]
>>>>>>    26: Equal(#3, #25)[Word32]
>>>>>>    27: Constant()[word32: 0]
>>>>>>    28: Equal(#26, #27)[Word32]
>>>>>>    29: Branch(#26)[B7, B8, None]
>>>>>>
>>>>>> I've tried adding DeadCodeElimination, but it always seems to run 
>>>>>> *before* MachineOptimizationReducer, how can I get it to run 
>>>>>> afterwards?
>>>>>>
>>>>>> What I find somewhat worrying is that this IR is lowered to 
>>>>>> equivalent looking Turbofan IR but then ISel appears to be using 
>>>>>> Word32Equal (28) as the branch condition again:
>>>>>>
>>>>>> --- BLOCK B5 id7 <- B4 ---
>>>>>>   24: IfFalse(22)
>>>>>>   25: Int32Constant[50]
>>>>>>   26: Word32Equal(4, 25)
>>>>>>   27: Int32Constant[0]
>>>>>>   28: Word32Equal(26, 27)
>>>>>>   29: Branch[Unspecified, None](26) -> B7, B6
>>>>>>
>>>>>> B5: AO#4 (no frame)  instructions: [11, 13)
>>>>>>  predecessors: B4
>>>>>>       11: gap () ()
>>>>>>           v7(R) = Arm64Cmp32 && set if equal v0(R) #50
>>>>>>       12: gap () ()
>>>>>>           Arm64CompareAndBranch32 && branch if not equal v7(R) 
>>>>>> [rpo_immediate:7] [rpo_immediate:6]
>>>>>>
>>>>>> Any idea of what's going here?
>>>>>>
>>>>>> cheers!
>>>>>> On Tuesday, July 25, 2023 at 10:53:01 AM UTC+1 Sam Parker-Haynes 
>>>>>> wrote:
>>>>>>
>>>>>>> Great, thanks for the info, I'll take a look around.
>>>>>>>
>>>>>>> On Tuesday, July 25, 2023 at 10:47:07 AM UTC+1 Nico Hartmann wrote:
>>>>>>>
>>>>>>>> There is not a lot of (up to date) documentation. A very brief 
>>>>>>>> summary:
>>>>>>>>
>>>>>>>> The new IR's operations (which replace Turbofan's nodes) are 
>>>>>>>> defined in turboshaft/operations.h 
>>>>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/operations.h>
>>>>>>>> .
>>>>>>>> The current state of the pipeline is as follows: We run 
>>>>>>>> Turbofan's frontend and then translate to the Turboshaft IR (in 
>>>>>>>> turboshaft/graph-builder.cc 
>>>>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/graph-builder.cc>).
>>>>>>>>  
>>>>>>>> Next, the Turboshaft optimization and lowering phases run on the CFG 
>>>>>>>> (see 
>>>>>>>> primarily pipeline.cc 
>>>>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/pipeline.cc>
>>>>>>>>  and 
>>>>>>>> the turboshaft/*-phase.h included from there). Eventually, we 
>>>>>>>> translate the 
>>>>>>>> CFG back into Turbofan's sea-of-nodes IR (in 
>>>>>>>> turboshaft/recreate-schedule.cc 
>>>>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/recreate-schedule.cc>)
>>>>>>>>  
>>>>>>>> to run the backend.
>>>>>>>>
>>>>>>>> We are currently working on reimplementing the rest of 
>>>>>>>> Turbofan's phases to have a full Turboshaft CFG pipeline eventually, 
>>>>>>>> but we 
>>>>>>>> plan to ship a first version of Turboshaft soonish, which still only 
>>>>>>>> covers 
>>>>>>>> part of the pipeline and relies on the translation back and forth.
>>>>>>>>
>>>>>>>> On Tue, Jul 25, 2023 at 11:05 AM Sam Parker-Haynes <
>>>>>>>> sam.p...@arm.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Nico,
>>>>>>>>>
>>>>>>>>> A CFG IR sounds good to me :) I've been hesitate to look as I 
>>>>>>>>> didn't know the status, is there any documentation? Is it currently 
>>>>>>>>> piggybacking off turbofan for some functionality? I can't immediately 
>>>>>>>>> see 
>>>>>>>>> where/how graph nodes are defined.
>>>>>>>>>
>>>>>>>>> cheers, 
>>>>>>>>>
>>>>>>>>> On Tuesday, July 25, 2023 at 9:37:26 AM UTC+1 Nico Hartmann wrote:
>>>>>>>>>
>>>>>>>>>> Hi Sam,
>>>>>>>>>>
>>>>>>>>>> We are currently migrating Turbofan to a new CFG-based IR 
>>>>>>>>>> (Turboshaft). This could help.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Nico
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 25, 2023 at 10:23 AM Sam Parker-Haynes <
>>>>>>>>>> sam.p...@arm.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I've been wondering how to go about adding general support for 
>>>>>>>>>>> conditional operations in Turbofan, specifically conditional 
>>>>>>>>>>> compares for 
>>>>>>>>>>> AArch64. But I now see that Intel APX extensions also support more 
>>>>>>>>>>> conditional instructions, beyond cmov.
>>>>>>>>>>>
>>>>>>>>>>> I'm not a huge fan of flag continuations, as it's been too easy 
>>>>>>>>>>> for me to introduce bugs while using them, so is there a safer way 
>>>>>>>>>>> to 
>>>>>>>>>>> approach this?
>>>>>>>>>>>
>>>>>>>>>>> Machine-level predication would seem to require a CFG in a form 
>>>>>>>>>>> that isn't really available until during/after jump-threading. So a 
>>>>>>>>>>> few 
>>>>>>>>>>> questions about jump threading:
>>>>>>>>>>> - Could predication happen there?
>>>>>>>>>>> - Why does jump threading run so late, why not before regalloc?
>>>>>>>>>>>
>>>>>>>>>>> And, in general, what is the policy on having optimisation 
>>>>>>>>>>> passes that run on machine code?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Sam
>>>>>>>>>>>
>>>>>>>>>>> -- 
>>>>>>>>>>> -- 
>>>>>>>>>>> v8-dev mailing list
>>>>>>>>>>> v8-...@googlegroups.com
>>>>>>>>>>> http://groups.google.com/group/v8-dev
>>>>>>>>>>> --- 
>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>> Google Groups "v8-dev" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>> it, send an email to v8-dev+un...@googlegroups.com.
>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>> https://groups.google.com/d/msgid/v8-dev/80169017-f248-4424-89ad-97260f28643dn%40googlegroups.com
>>>>>>>>>>>  
>>>>>>>>>>> <https://groups.google.com/d/msgid/v8-dev/80169017-f248-4424-89ad-97260f28643dn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> Nico Hartmann | Software Engineer | nicoha...@google.com | Chrome 
>>>>>>>>>> - V8
>>>>>>>>>>
>>>>>>>>> -- 
>>>>>>>>> -- 
>>>>>>>>> v8-dev mailing list
>>>>>>>>> v8-...@googlegroups.com
>>>>>>>>> http://groups.google.com/group/v8-dev
>>>>>>>>> --- 
>>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>>> Groups "v8-dev" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>> send an email to v8-dev+un...@googlegroups.com.
>>>>>>>>>
>>>>>>>> To view this discussion on the web visit 
>>>>>>>>> https://groups.google.com/d/msgid/v8-dev/f4cb3d5e-d703-4e0a-b65d-1fbacf1573a0n%40googlegroups.com
>>>>>>>>>  
>>>>>>>>> <https://groups.google.com/d/msgid/v8-dev/f4cb3d5e-d703-4e0a-b65d-1fbacf1573a0n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Nico Hartmann | Software Engineer | nicoha...@google.com | Chrome 
>>>>>>>> - V8
>>>>>>>>
>>>>>>> -- 
>>>>>> -- 
>>>>>> v8-dev mailing list
>>>>>> v8-...@googlegroups.com
>>>>>> http://groups.google.com/group/v8-dev
>>>>>> --- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "v8-dev" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>> send an email to v8-dev+un...@googlegroups.com.
>>>>>>
>>>>> To view this discussion on the web visit 
>>>>>> https://groups.google.com/d/msgid/v8-dev/87824ff9-a7a0-4a22-b69b-44e686d07f1an%40googlegroups.com
>>>>>>  
>>>>>> <https://groups.google.com/d/msgid/v8-dev/87824ff9-a7a0-4a22-b69b-44e686d07f1an%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Nico Hartmann | Software Engineer | nicoha...@google.com | Chrome - V8
>>>>>
>>>> -- 
>> -- 
>> v8-dev mailing list
>> v8-...@googlegroups.com
>> http://groups.google.com/group/v8-dev
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "v8-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to v8-dev+un...@googlegroups.com.
>>
> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/v8-dev/2565d8b5-4153-4d11-81f2-76b0bbcd223an%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/v8-dev/2565d8b5-4153-4d11-81f2-76b0bbcd223an%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> -- 
> Nico Hartmann | Software Engineer | nicoha...@google.com | Chrome - V8
>

-- 
-- 
v8-dev mailing list
v8-dev@googlegroups.com
http://groups.google.com/group/v8-dev
--- 
You received this message because you are subscribed to the Google Groups 
"v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to v8-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/v8-dev/7df09dff-8493-4e78-9b1c-071023ef6d05n%40googlegroups.com.

Reply via email to