I can confirm that by disabling value numbering, a second Int32AddWithOverflow does exist:
--- BLOCK B1158 id1159 <- B1153, B1157 --- 5768: Phi[kRepFloat64](5753, 5767) 5769: Float64RoundDown(5768) 5770: Int64Constant[19] 5771: Load[kRepTaggedPointer|kTypeAny](27, 5770) 5772: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3) 5773: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, sparse:^^](5690, 5689) 5774: TypedStateValues[, sparse:..] 5775: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, sparse:^^......](5773, 5774) 5776: TypedStateValues[, sparse:.] 5777: FrameState[UNOPTIMIZED_FRAME, 92, PokeAt(0), 0x191e0015eab5 <SharedFunctionInfo test_div>](5772, 5775, 5776, 2, 8, 0) 5778: Int32AddWithOverflow(5689, 5689) 5779: Projection[0](5778) 5780: Projection[1](5778) 5781: Branch[Unspecified, False](5780) -> B1160, B1159 So could Overflow operations and/or Projections[1] be special-cased..? And why is there a difference here with a normal comparison? Thanks, Sam On Thursday, November 9, 2023 at 9:50:53 AM UTC Sam Parker-Haynes wrote: > Hi, > > Thanks both! > > Looking again at the IR, I wonder if it is control related. This is the > well-behaved IR, where both instances of the Int32AddWithOverflow have a > control input. > > --- BLOCK B1145 id5053 <- B1143, B1144 --- > > 43: Merge(42, 13024) > > 53: Phi[kRepWord32](18, 13025, 43) > > 52: Phi[kRepWord32](17, 13021, 43) > > 51: EffectPhi(15, 13024, 43) > > 13034: ChangeInt32ToFloat64(53) > 6670: Load[kRepTaggedPointer|kTypeAny](6114, 23562, 51, 43) > 13026: Int32AddWithOverflow(53, 53, 43) > > 13027: Projection[1](13026, 43) > > 13028: Branch[Machine, False](13027, 43) -> B1147, B1146 > > --- BLOCK B1158 id5027 <- B1157, B1153 --- > 13061: Merge(13054, 13044) > 13063: Phi[kRepFloat64](13056, 13060, 13061) > 13062: EffectPhi(13056, 78, 13061) > 6672: Load[kRepTaggedPointer|kTypeAny](6111, 23562, 13062, 13061) > 13068: Int32AddWithOverflow(53, 53, 13061) > 13069: Projection[1](13068, 13061) > 13070: Branch[Machine, False](13069, 13061) -> B1160, B1159 > > However, with Turboshaft, the blocks look like this: > > --- BLOCK B1145 id1146 <- B1143, B1144 --- > 5473: Phi[kRepWord32](5449, 5470) > 5474: Phi[kRepWord32](5450, 5466) > 5475: Int64Constant[19] > 5476: Load[kRepTaggedPointer|kTypeAny](28, 5475) > 5477: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3) > 5478: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, > sparse:^^](5474, 5473) > 5479: TypedStateValues[, sparse:..] > 5480: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, > kRepTagged|kTypeAny, kRepTagged|kTypeAny, kRepTagged|kTypeAny, > sparse:^^^^^...](5478, 5479, 27, 26, 25) > 5481: TypedStateValues[, sparse:.] > 5482: FrameState[UNOPTIMIZED_FRAME, 59, PokeAt(0), 0x191e0015eab5 > <SharedFunctionInfo test_div>](5477, 5480, 5481, 2, 8, 0) > 5483: Int32AddWithOverflow(5473, 5473) > 5484: Projection[0](5483) > 5485: Projection[1](5483) > 5486: Branch[Unspecified, False](5485) -> B1147, B1146 > > --- BLOCK B1158 id1159 <- B1153, B1157 --- > 5549: Phi[kRepFloat64](5534, 5548) > 5550: Float64RoundDown(5549) > 5551: Int64Constant[19] > 5552: Load[kRepTaggedPointer|kTypeAny](27, 5551) > 5553: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3) > 5554: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, > sparse:^^](5474, 5473) > 5555: TypedStateValues[, sparse:..] > 5556: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, > sparse:^^......](5554, 5555) > 5557: TypedStateValues[, sparse:.] > 5558: FrameState[UNOPTIMIZED_FRAME, 92, PokeAt(0), 0x191e0015eab5 > <SharedFunctionInfo test_div>](5553, 5556, 5557, 2, 8, 0) > 5559: Branch[Unspecified, False](5485) -> B1160, B1159 > > > I have seen that Turboshaft has an EmitProjectionReducer ( > https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/assembler.h;drc=f5a7962861b208e9cf82e61c1fa9f8dc0d216a87;l=606) > > and seems to be closely related to ValueNumbering. Without control inputs, > I can't see why ValueNumbering, or something similar, wouldn't remove the > second instance. > > > It could be an artifact of scheduling > > I will have a look at the scheduler. > > > However, this should be prevented by this check > > Yes, it's that check that I'm trying to remove :) and this works for > everything but projections. Surely we should be able to combine an > operation even if it has multiple users? > > Regards, > Sam > > On Wednesday, November 8, 2023 at 5:20:37 PM UTC dmerc...@google.com > wrote: > >> Hi, >> >> > With a turbofan-only flow, the overflow operation and the projection >> will be duplicated for the second branch, and everything is fine. >> >> I'm not sure which phase could duplicate a Int32AddWithOverflow. I don't >> think that the Instruction Selector does this. It could be an artifact of >> scheduling (which happens before effect control linearization or before >> instruction selection). I'm not sure it's a conscious optimization though, >> or just a side-effect of the scheduler. >> >> > Is it legal for all projected values to be live across basic blocks? If >> so, who should be responsible for ensuring values are live where they need >> to be? >> >> Yes it is legal (and probably happens fairly often). As to who should >> ensure that the values are live... It sounds like here the bug is in the >> Instruction Selector. What probably happens is that when emitting the 1st >> branch, we decide that it can cover the Projection[1] and don't need to >> emit it, which is wrong. However, this should be prevented by this check >> https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/backend/x64/instruction-selector-x64.cc;l=4278: >> >> when user=the_first_branch and value=projection[1], CanCover should return >> false, because the projection has another use. >> >> If you can provide an example that we can reproduce, I'd be happy to have >> a look, but without being able to reproduce, we're just speculating :/ >> >> > I don't understand why it has a control input >> >> Projections don't really need control inputs, but I think that having >> control inputs in projections is somewhat useful for scheduling in >> Turbofan. We trying removing the control input ( >> https://crrev.com/c/4570503), and performance regressed in some cases. >> >> Cheers, >> Darius >> >> On Wednesday, November 8, 2023 at 6:02:40 PM UTC+1 Matthias Liedtke wrote: >> >>> Hi Sam, >>> >>> Honestly I don't feel knowledgeable enough to answer this question, so I >>> hope some Turbofan experts can chime in. >>> This line >>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/common-operator.cc;l=1622> >>> defines >>> that the Projection in Turbofan has one value input, a control input and a >>> value output but no control output. >>> I don't understand why it has a control input or what it exactly means >>> that an operation has a control input vs. let's say a Select operation that >>> doesn't have one. >>> Still, I'd assume that the Branch in B4 should have a control input that >>> at some point leads to the same control input as "2: Projection [1]" and >>> therefore I'd assume that it's valid as the control input of the projection >>> dominates the branch that uses it. >>> Looking at the schedule, as the Projection is in B0, it is guaranteed to >>> be "executed" before B4, so the virtual register should be defined and >>> should be "available" / usable by all instructions following itself in B0 >>> and in all instructions in blocks that are reachable from B0 directly or >>> indirectly. >>> I don't think there is anything limiting the liveness of a virtual >>> register, so as long as a block A dominates another Block B, all virtual >>> registers defined in A are usable in B. >>> So your snippet looks totally fine based on my (limited) understanding >>> of Turbofan. >>> >>> However, I'd assume it should be >>> 1: Projection[0] (0) >>> 2: Projection[1] (0) >>> and not >>> 1: Projection[0] (0) >>> 2: Projection[1] (1) >>> (i.e. both projections should have Int32AddWithOverflow as an input) >>> >>> Best regards, >>> Matthias >>> >>> >>> On Wed, Nov 8, 2023 at 5:24 PM Sam Parker-Haynes <sam.p...@arm.com> >>> wrote: >>> >>>> Hi Matthias, >>>> >>>> My RecreateSchedule example looks like this: >>>> >>>> B0: >>>> 0: Int32AddWithOverflow >>>> 1: Projection[0] (0) >>>> 2: Projection[1] (1) >>>> Branch (2) >>>> ... >>>> >>>> B4: >>>> Branch (2) <-- how should this branch receive (2) >>>> >>>> I know I said it looks okay at the IR level, but since the IR isn't >>>> documented that may not be true ;) So, what are the semantics of a >>>> Projection..? >>>> - Is the Projection[1] of an Overflow operation a part of the control >>>> chain? >>>> - Is it legal for all projected values to be live across basic blocks? >>>> If so, who should be responsible for ensuring values are live where they >>>> need to be? >>>> >>>> Thanks, >>>> Sam >>>> >>>> On Wednesday, November 8, 2023 at 2:15:58 PM UTC Sam Parker-Haynes >>>> wrote: >>>> >>>> Also, are any instruction selection tests being added for Turboshaft? >>>> Something like the equivalent of >>>> test/unittests/compiler/arm64/instruction-selector-arm64-unittest.cc? >>>> >>>> On Wednesday, November 8, 2023 at 1:26:12 PM UTC Sam Parker-Haynes >>>> wrote: >>>> >>>> EDIT: only passes with --no-turboshaft >>>> >>>> On Wednesday, November 8, 2023 at 1:20:03 PM UTC Sam Parker-Haynes >>>> wrote: >>>> >>>> Hi Matthias, >>>> >>>> Ah, I hadn't seen that, so yes I've been tweaking the OG selector >>>> function for arm64. On the x64 side of things, everything is fine with >>>> 'pure' turboshaft pipeline but, as with my arm64 checkout, the issue >>>> arises >>>> when modifying the turbofan selector function and using ---no-turboshaft. >>>> >>>> So, I guess the problem is with the RecreateSchedule? >>>> >>>> Thanks, >>>> Sam >>>> >>>> On Wednesday, November 8, 2023 at 1:12:56 PM UTC mlie...@google.com >>>> wrote: >>>> >>>> Hi Sam, >>>> >>>> I don't have any insights on x64 for VisitWordCompareZero but as a >>>> quick note, the TurboshaftAdapter instruction selection implementation for >>>> VisitWordCompareZero for arm64 was just added yesterday by me ( >>>> https://chromium-review.googlesource.com/c/v8/v8/+/5002003). >>>> >>>> So, just to clarify, I assume you are talking about the Turbofan {Node* >>>> / TurbofanAdapter} based instruction selection with Turboshaft >>>> optimizations followed by a RecreateSchedule? >>>> If you do changes for the TurbofanAdapter instruction selection on >>>> arm64 for code that already has an implementation for the >>>> TurboshaftAdapter, could you add a TODO(mliedtke) on the Turboshaft >>>> implementation with some nice description and CC me on the gerrit changes, >>>> so these don't get lost when we switch over to the TurboshaftAdapter? >>>> >>>> Thanks and best regards, >>>> Matthias >>>> >>>> On Wed, Nov 8, 2023 at 1:54 PM Sam Parker-Haynes <sam.p...@arm.com> >>>> wrote: >>>> >>>> Hi, >>>> >>>> I've been investigating rematerialising flag setting instructions in >>>> VisitWordCompareZero, just by removing the CanCover check, and something >>>> is >>>> going wrong with turboshaft. >>>> >>>> I have an arithmetic overflow operation, Int32AddWithOverflow, and the >>>> overflow check is used by two branch operations. With a turbofan-only >>>> flow, >>>> the overflow operation and the projection will be duplicated for the >>>> second >>>> branch, and everything is fine. But with turboshaft, neither the overflow >>>> operation or projection are duplicated and the second branch uses the >>>> original projection[1] value. This looks fine at the IR level, but ends up >>>> broken after isel when the register allocator comes across a virtual >>>> register without a definition. This happens for both x64 and arm64, so I'm >>>> assuming turboshaft is making some assumptions, based on the current >>>> behaviour, that are non-obvious to me. >>>> >>>> Any ideas? >>>> >>>> Thanks! >>>> Sam >>>> >>>> >>>> >>>> -- >>>> -- >>>> v8-dev mailing list >>>> v8-...@googlegroups.com >>>> http://groups.google.com/group/v8-dev >>>> --- >>>> You received this message because you are subscribed to the Google >>>> Groups "v8-dev" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to v8-dev+un...@googlegroups.com. >>>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/v8-dev/175a4ba2-9d33-45bb-84e3-91d41a722846n%40googlegroups.com >>>> >>>> <https://groups.google.com/d/msgid/v8-dev/175a4ba2-9d33-45bb-84e3-91d41a722846n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>> . >>>> >>>> -- >>>> -- >>>> v8-dev mailing list >>>> v8-...@googlegroups.com >>>> http://groups.google.com/group/v8-dev >>>> --- >>>> You received this message because you are subscribed to the Google >>>> Groups "v8-dev" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to v8-dev+un...@googlegroups.com. >>>> >>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/v8-dev/46693e6c-1216-493b-a104-2c9d41075987n%40googlegroups.com >>>> >>>> <https://groups.google.com/d/msgid/v8-dev/46693e6c-1216-493b-a104-2c9d41075987n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>> . >>>> >>> -- -- v8-dev mailing list v8-dev@googlegroups.com http://groups.google.com/group/v8-dev --- You received this message because you are subscribed to the Google Groups "v8-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to v8-dev+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/d539c990-e8ed-4134-bba5-98ae45c51c38n%40googlegroups.com.