I can confirm that by disabling value numbering, a second 
Int32AddWithOverflow does exist:

--- BLOCK B1158 id1159 <- B1153, B1157 ---
  5768: Phi[kRepFloat64](5753, 5767)
  5769: Float64RoundDown(5768)
  5770: Int64Constant[19]
  5771: Load[kRepTaggedPointer|kTypeAny](27, 5770)
  5772: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3)
  5773: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, 
sparse:^^](5690, 5689)
  5774: TypedStateValues[, sparse:..]
  5775: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, 
sparse:^^......](5773, 5774)
  5776: TypedStateValues[, sparse:.]
  5777: FrameState[UNOPTIMIZED_FRAME, 92, PokeAt(0), 0x191e0015eab5 
<SharedFunctionInfo test_div>](5772, 5775, 5776, 2, 8, 0)
  5778: Int32AddWithOverflow(5689, 5689)
  5779: Projection[0](5778)
  5780: Projection[1](5778)
  5781: Branch[Unspecified, False](5780) -> B1160, B1159

So could Overflow operations and/or Projections[1] be special-cased..? And 
why is there a difference here with a normal comparison?

Thanks,
Sam

On Thursday, November 9, 2023 at 9:50:53 AM UTC Sam Parker-Haynes wrote:

> Hi,
>
> Thanks both!
>
> Looking again at the IR, I wonder if it is control related. This is the 
> well-behaved IR, where both instances of the Int32AddWithOverflow have a 
> control input.
>
> --- BLOCK B1145 id5053 <- B1143, B1144 ---                                 
>                                                     
>   43: Merge(42, 13024)                                                     
>                                                     
>   53: Phi[kRepWord32](18, 13025, 43)                                       
>                                                     
>   52: Phi[kRepWord32](17, 13021, 43)                                       
>                                                     
>   51: EffectPhi(15, 13024, 43)                                             
>                                                     
>   13034: ChangeInt32ToFloat64(53)
>   6670: Load[kRepTaggedPointer|kTypeAny](6114, 23562, 51, 43) 
>   13026: Int32AddWithOverflow(53, 53, 43)                                 
>                                                      
>   13027: Projection[1](13026, 43)                                         
>                                                      
>   13028: Branch[Machine, False](13027, 43) -> B1147, B1146
>
> --- BLOCK B1158 id5027 <- B1157, B1153 ---
>   13061: Merge(13054, 13044)
>   13063: Phi[kRepFloat64](13056, 13060, 13061)
>   13062: EffectPhi(13056, 78, 13061)
>   6672: Load[kRepTaggedPointer|kTypeAny](6111, 23562, 13062, 13061)
>   13068: Int32AddWithOverflow(53, 53, 13061)
>   13069: Projection[1](13068, 13061)
>   13070: Branch[Machine, False](13069, 13061) -> B1160, B1159
>
> However, with Turboshaft, the blocks look like this:
>
> --- BLOCK B1145 id1146 <- B1143, B1144 ---
>   5473: Phi[kRepWord32](5449, 5470)
>   5474: Phi[kRepWord32](5450, 5466)
>   5475: Int64Constant[19]
>   5476: Load[kRepTaggedPointer|kTypeAny](28, 5475)
>   5477: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3)
>   5478: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, 
> sparse:^^](5474, 5473)
>   5479: TypedStateValues[, sparse:..]
>   5480: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, 
> kRepTagged|kTypeAny, kRepTagged|kTypeAny, kRepTagged|kTypeAny, 
> sparse:^^^^^...](5478, 5479, 27, 26, 25)
>   5481: TypedStateValues[, sparse:.]
>   5482: FrameState[UNOPTIMIZED_FRAME, 59, PokeAt(0), 0x191e0015eab5 
> <SharedFunctionInfo test_div>](5477, 5480, 5481, 2, 8, 0)
>   5483: Int32AddWithOverflow(5473, 5473)
>   5484: Projection[0](5483)
>   5485: Projection[1](5483)
>   5486: Branch[Unspecified, False](5485) -> B1147, B1146
>
> --- BLOCK B1158 id1159 <- B1153, B1157 ---
>   5549: Phi[kRepFloat64](5534, 5548)
>   5550: Float64RoundDown(5549)
>   5551: Int64Constant[19]
>   5552: Load[kRepTaggedPointer|kTypeAny](27, 5551)
>   5553: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3)
>   5554: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, 
> sparse:^^](5474, 5473)
>   5555: TypedStateValues[, sparse:..]
>   5556: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, 
> sparse:^^......](5554, 5555)
>   5557: TypedStateValues[, sparse:.]
>   5558: FrameState[UNOPTIMIZED_FRAME, 92, PokeAt(0), 0x191e0015eab5 
> <SharedFunctionInfo test_div>](5553, 5556, 5557, 2, 8, 0)
>   5559: Branch[Unspecified, False](5485) -> B1160, B1159
>
>
> I have seen that Turboshaft has an EmitProjectionReducer (
> https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/assembler.h;drc=f5a7962861b208e9cf82e61c1fa9f8dc0d216a87;l=606)
>  
> and seems to be closely related to ValueNumbering. Without control inputs, 
> I can't see why ValueNumbering, or something similar, wouldn't remove the 
> second instance.  
>
> >  It could be an artifact of scheduling
>
> I will have a look at the scheduler.
>
> > However, this should be prevented by this check
>
> Yes, it's that check that I'm trying to remove :) and this works for 
> everything but projections. Surely we should be able to combine an 
> operation even if it has multiple users? 
>
> Regards,
> Sam
>
> On Wednesday, November 8, 2023 at 5:20:37 PM UTC dmerc...@google.com 
> wrote:
>
>> Hi,
>>
>> > With a turbofan-only flow, the overflow operation and the projection 
>> will be duplicated for the second branch, and everything is fine.
>>
>> I'm not sure which phase could duplicate a Int32AddWithOverflow. I don't 
>> think that the Instruction Selector does this. It could be an artifact of 
>> scheduling (which happens before effect control linearization or before 
>> instruction selection). I'm not sure it's a conscious optimization though, 
>> or just a side-effect of the scheduler.
>>
>> > Is it legal for all projected values to be live across basic blocks? If 
>> so, who should be responsible for ensuring values are live where they need 
>> to be?
>>
>> Yes it is legal (and probably happens fairly often). As to who should 
>> ensure that the values are live... It sounds like here the bug is in the 
>> Instruction Selector. What probably happens is that when emitting the 1st 
>> branch, we decide that it can cover the Projection[1] and don't need to 
>> emit it, which is wrong. However, this should be prevented by this check 
>> https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/backend/x64/instruction-selector-x64.cc;l=4278:
>>  
>> when user=the_first_branch and value=projection[1], CanCover should return 
>> false, because the projection has another use.
>>
>> If you can provide an example that we can reproduce, I'd be happy to have 
>> a look, but without being able to reproduce, we're just speculating :/
>>
>> > I don't understand why it has a control input
>>
>> Projections don't really need control inputs, but I think that having 
>> control inputs in projections is somewhat useful for scheduling in 
>> Turbofan. We trying removing the control input (
>> https://crrev.com/c/4570503), and performance regressed in some cases.
>>
>> Cheers,
>> Darius
>>
>> On Wednesday, November 8, 2023 at 6:02:40 PM UTC+1 Matthias Liedtke wrote:
>>
>>> Hi Sam,
>>>
>>> Honestly I don't feel knowledgeable enough to answer this question, so I 
>>> hope some Turbofan experts can chime in.
>>> This line 
>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/common-operator.cc;l=1622>
>>>  defines 
>>> that the Projection in Turbofan has one value input, a control input and a 
>>> value output but no control output.
>>> I don't understand why it has a control input or what it exactly means 
>>> that an operation has a control input vs. let's say a Select operation that 
>>> doesn't have one.
>>> Still, I'd assume that the Branch in B4 should have a control input that 
>>> at some point leads to the same control input as "2: Projection [1]" and 
>>> therefore I'd assume that it's valid as the control input of the projection 
>>> dominates the branch that uses it.
>>> Looking at the schedule, as the Projection is in B0, it is guaranteed to 
>>> be "executed" before B4, so the virtual register should be defined and 
>>> should be "available" / usable by all instructions following itself in B0 
>>> and in all instructions in blocks that are reachable from B0 directly or 
>>> indirectly.
>>> I don't think there is anything limiting the liveness of a virtual 
>>> register, so as long as a block A dominates another Block B, all virtual 
>>> registers defined in A are usable in B.
>>> So your snippet looks totally fine based on my (limited) understanding 
>>> of Turbofan.
>>>
>>> However, I'd assume it should be
>>>   1: Projection[0] (0)
>>>   2: Projection[1] (0)
>>> and not
>>>   1: Projection[0] (0)
>>>   2: Projection[1] (1)
>>> (i.e. both projections should have Int32AddWithOverflow as an input)
>>>
>>> Best regards,
>>> Matthias
>>>
>>>
>>> On Wed, Nov 8, 2023 at 5:24 PM Sam Parker-Haynes <sam.p...@arm.com> 
>>> wrote:
>>>
>>>> Hi Matthias,
>>>>
>>>> My RecreateSchedule example looks like this:
>>>>
>>>> B0:
>>>>   0: Int32AddWithOverflow
>>>>   1: Projection[0] (0)
>>>>   2: Projection[1] (1)
>>>>   Branch (2)
>>>> ...
>>>>
>>>> B4:
>>>>   Branch (2)    <-- how should this branch receive (2)
>>>>
>>>> I know I said it looks okay at the IR level, but since the IR isn't 
>>>> documented that may not be true ;) So, what are the semantics of a 
>>>> Projection..?
>>>> - Is the Projection[1] of an Overflow operation a part of the control 
>>>> chain?
>>>> - Is it legal for all projected values to be live across basic blocks? 
>>>> If so, who should be responsible for ensuring values are live where they 
>>>> need to be?
>>>>
>>>> Thanks,
>>>> Sam
>>>>
>>>> On Wednesday, November 8, 2023 at 2:15:58 PM UTC Sam Parker-Haynes 
>>>> wrote:
>>>>
>>>> Also, are any instruction selection tests being added for Turboshaft? 
>>>> Something like the equivalent of 
>>>> test/unittests/compiler/arm64/instruction-selector-arm64-unittest.cc?
>>>>
>>>> On Wednesday, November 8, 2023 at 1:26:12 PM UTC Sam Parker-Haynes 
>>>> wrote:
>>>>
>>>> EDIT: only passes with --no-turboshaft
>>>>
>>>> On Wednesday, November 8, 2023 at 1:20:03 PM UTC Sam Parker-Haynes 
>>>> wrote:
>>>>
>>>> Hi Matthias,
>>>>
>>>> Ah, I hadn't seen that, so yes I've been tweaking the OG selector 
>>>> function for arm64. On the x64 side of things, everything is fine with 
>>>> 'pure' turboshaft pipeline but, as with my arm64 checkout, the issue 
>>>> arises 
>>>> when modifying the turbofan selector function and using ---no-turboshaft.
>>>>
>>>> So, I guess the problem is with the RecreateSchedule?
>>>>
>>>> Thanks,
>>>> Sam 
>>>>
>>>> On Wednesday, November 8, 2023 at 1:12:56 PM UTC mlie...@google.com 
>>>> wrote:
>>>>
>>>> Hi Sam,
>>>>
>>>> I don't have any insights on x64 for VisitWordCompareZero but as a 
>>>> quick note, the TurboshaftAdapter instruction selection implementation for 
>>>> VisitWordCompareZero for arm64 was just added yesterday by me (
>>>> https://chromium-review.googlesource.com/c/v8/v8/+/5002003).
>>>>
>>>> So, just to clarify, I assume you are talking about the Turbofan {Node* 
>>>> / TurbofanAdapter} based instruction selection with Turboshaft 
>>>> optimizations followed by a RecreateSchedule?
>>>> If you do changes for the TurbofanAdapter instruction selection on 
>>>> arm64 for code that already has an implementation for the 
>>>> TurboshaftAdapter, could you add a TODO(mliedtke) on the Turboshaft 
>>>> implementation with some nice description and CC me on the gerrit changes, 
>>>> so these don't get lost when we switch over to the TurboshaftAdapter?
>>>>
>>>> Thanks and best regards,
>>>> Matthias
>>>>
>>>> On Wed, Nov 8, 2023 at 1:54 PM Sam Parker-Haynes <sam.p...@arm.com> 
>>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I've been investigating rematerialising flag setting instructions in 
>>>> VisitWordCompareZero, just by removing the CanCover check, and something 
>>>> is 
>>>> going wrong with turboshaft.
>>>>
>>>> I have an arithmetic overflow operation, Int32AddWithOverflow, and the 
>>>> overflow check is used by two branch operations. With a turbofan-only 
>>>> flow, 
>>>> the overflow operation and the projection will be duplicated for the 
>>>> second 
>>>> branch, and everything is fine. But with turboshaft, neither the overflow 
>>>> operation or projection are duplicated and the second branch uses the 
>>>> original projection[1] value. This looks fine at the IR level, but ends up 
>>>> broken after isel when the register allocator comes across a virtual 
>>>> register without a definition. This happens for both x64 and arm64, so I'm 
>>>> assuming turboshaft is making some assumptions, based on the current 
>>>> behaviour, that are non-obvious to me.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks!
>>>> Sam
>>>>
>>>>
>>>>
>>>> -- 
>>>> -- 
>>>> v8-dev mailing list
>>>> v8-...@googlegroups.com
>>>> http://groups.google.com/group/v8-dev
>>>> --- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "v8-dev" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to v8-dev+un...@googlegroups.com.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/v8-dev/175a4ba2-9d33-45bb-84e3-91d41a722846n%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/v8-dev/175a4ba2-9d33-45bb-84e3-91d41a722846n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>> -- 
>>>> -- 
>>>> v8-dev mailing list
>>>> v8-...@googlegroups.com
>>>> http://groups.google.com/group/v8-dev
>>>> --- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "v8-dev" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to v8-dev+un...@googlegroups.com.
>>>>
>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/v8-dev/46693e6c-1216-493b-a104-2c9d41075987n%40googlegroups.com
>>>>  
>>>> <https://groups.google.com/d/msgid/v8-dev/46693e6c-1216-493b-a104-2c9d41075987n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>

-- 
-- 
v8-dev mailing list
v8-dev@googlegroups.com
http://groups.google.com/group/v8-dev
--- 
You received this message because you are subscribed to the Google Groups 
"v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to v8-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/v8-dev/d539c990-e8ed-4134-bba5-98ae45c51c38n%40googlegroups.com.

Reply via email to