On 11/7/25 9:18 AM, Mark Wielaard wrote:
Hi Ben,

On Fri, 2025-11-07 at 08:09 -0800, Ben Woodard wrote:
On 11/7/25 6:58 AM, Mark Wielaard wrote:
On Thu, 2025-11-06 at 11:34 -0800, Ben Woodard via Dwarf-discuss wrote:

I think renaming is really confusing. And I think extending to
supporting floating point types should be a separate issue that would
also look at the other operators.

Maybe a compromise would be to keep DW_OP_mod (and make DW_OP_rem an
alias?)
I would do it the other way around: make DW_OP_mod a legacy alias and
call the same operation DW_OP_rem.
I think that is fine, as long as they have the same constant value
(0x1d).

Agreed. Same encoding. The header files would just have two defines that point to the same constant value. Old consumers can continue to print DW_OP_mod (just like DW_OP_push_object_{address,location}), but consumers' human-readable strings should be updated to DW_OP_rem.
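
A minimal sketch of what that could look like in a C header and a consumer's opcode printer. This is only an illustration: DW_OP_rem is just the name proposed in this thread, and dw_op_name() is a hypothetical helper, not taken from any existing consumer.

```c
#include <assert.h>

/* 0x1d is the existing DW_OP_mod encoding discussed above.
 * DW_OP_rem is only the name proposed in this thread. */
#define DW_OP_mod 0x1d        /* legacy spelling, kept so old sources still build */
#define DW_OP_rem DW_OP_mod   /* proposed spelling, same encoding */

/* Compile-time check that the alias really shares the encoding. */
static_assert(DW_OP_rem == DW_OP_mod, "alias must share the encoding");

/* Hypothetical consumer helper: maps an opcode byte to the string
 * shown to users. Only the opcode under discussion is handled here. */
static const char *dw_op_name(unsigned char opcode)
{
    switch (opcode) {
    case DW_OP_rem:           /* DW_OP_mod is the same value, so it lands here too */
        return "DW_OP_rem";
    default:
        return "DW_OP_<unknown>";
    }
}
```

Producers that still spell the opcode DW_OP_mod keep compiling and emit the same byte; only the string printed by an updated consumer changes.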

Honestly, as a concession: while I think it would be less confusing to rename the operator, I'm really fine with keeping DW_OP_mod as a name so long as the domain is expanded to include signed and unsigned integral types and the algorithm and domain of the operator are documented in the standard.

With Jakub's example, I think that we have a compelling reason to expand
the domain of DW_OP_rem (the former DW_OP_mod) to include signed
integral types as well as unsigned integral types. His example seems to
require the semantics of C99's % operator (truncated division).
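
To make "truncated division" concrete, here is a small C sketch of the remainder on 64-bit signed and unsigned operands. The helper names are made up for illustration, the 64-bit width is an assumption, and the handling of a zero divisor (an assert) is a placeholder rather than anything the standard says.

```c
#include <assert.h>
#include <stdint.h>

/* Truncated remainder on signed operands: the quotient is rounded
 * toward zero, so the result takes the sign of the dividend.
 * Hypothetical helper; a zero divisor is treated as a caller error. */
static int64_t rem_signed(int64_t dividend, int64_t divisor)
{
    assert(divisor != 0);
    if (divisor == -1)
        return 0;             /* avoids INT64_MIN % -1, which overflows in C */
    return dividend % divisor; /* C99 % is defined via truncated division */
}

/* The same operation on unsigned operands. */
static uint64_t rem_unsigned(uint64_t dividend, uint64_t divisor)
{
    assert(divisor != 0);
    return dividend % divisor;
}
```

For example, rem_signed(-7, 2) yields -1, while the same bit pattern read as unsigned, rem_unsigned((uint64_t)-7, 2), yields 1. That difference is why the domain of the operator has to be stated explicitly.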

If we do this, then it will be backward compatible. The only thing that
we would be changing is the domain over which the current DW_OP_mod
operates. We are not changing any of the semantics.
Because the semantics weren't really defined for the expanded domain,
which means e.g. gdb does interpret it differently.

Agreed, GDB will have to change. However, when we discovered that it implemented DW_OP_mod using language-specific semantics from the UI, everyone, including the GDB developers on the call, agreed that was the wrong thing to do. IIRC there are some 200 languages that are supported by DWARF and not all of them have specified behavior for signed mod. Thus trying to interpret DW_OP_mod in a language-specific way was universally deemed "wrong" or even "insane". Evidently, in the DWARF committee meeting, having the implementation tied to the source language was universally panned.

In the DWARF for GPUs meeting, we decided to put forth a proposal specifying that DWARF operations are their own thing and not tied to the source language. I was assigned the task of drafting that proposal but have yet to do so.


I'm ambivalent about expanding the domain to floating point values. If
someone has a reason for having it work on floating point types, then
sure why not. It is a bit of extra code in every consumer but whatever.
And it makes us have to define how floating point values are
represented and pick a specific interpretation of the operations.

My understanding is that the DWARF committee already decided to allow floating point numbers on the stack. I was not around for this and I do not know the exact reasoning. My guess as to why that was done is so that if a FP number was optimized out, it could still be reconstructed with DWARF expressions and then represented in implicit storage. This would suggest encoding of FP numbers would have to follow the consumer's target architecture and interpretation of operators would need to be specific enough to allow the unambiguous reconstruction of the optimized out variable on that target architecture.

That being the case, certainly many of the arithmetic operators would need to be defined for floating point base types. However, the thing that gives me pause with DW_OP_rem (or its old name DW_OP_mod) is that even in modern C or C++ modulo for floating point numbers is a function call, fmod(), not a primitive operation the way that % is.


What I really care about is that when we update the description of
DW_OP_rem (the operation formerly known as DW_OP_mod) we specify both
the domain of the operator as well as the algorithm used.
I think doing both a renaming and redefining the interpretation of the
operation and domain at the same time is super confusing. Better to
just introduce two new operations for the expanded domain.

I disagree with you on this particular point.

As a counterargument, I point to DW_OP_push_object_addr, which is now a legacy alias with the same encoding as the current DW_OP_push_object_location. The information extracted from the execution context is now a fully specified location rather than just a generic value for the address. We changed both the name and what it does in a backward compatible way. I argue that *expanding* the domain of operators in a backward compatible way is a minor change.

As I went through all the operators while making https://github.com/woodard/dwarf-locations/blob/op-formatting/024-revise-operations.md, I made a bunch of notes where I think the domain of operations should be specified or, in some cases, expanded. I have to write all of those up. They include:

Does DW_OP_regval_type really need to be limited to a base type? (vector registers)
Does DW_OP_regval_bits really need to be limited to the number of bits of a generic? (large vector predicate registers)
Why can't the logical operators also be applied to integral vector registers and vector predicate registers?
Why can't we mix vector integral types with integral types when doing arithmetic operations? There are literally opcodes in many ISAs for this.
DW_OP_shl and DW_OP_shr should also work on vector registers. This can be used for lane shifting.
Are DW_OP_shl and DW_OP_shr defined for negative shifts? (clarification)
...

I believe major versions like the DWARF6 we are building toward are the time to clean things like this up.

We are also down to only about 50 available opcodes in the single byte operation encoding space, so we need to be a bit careful about how many new ones we allocate until we all agree to have a flag day and break compatibility with DWARF2-?.


   Then introduce new DW_OP_modulo (defined using floored
division)
Again I'm personally ambivalent about the need for this. I don't think
that it is going to be used very often and I think if we do define it we
should consider pushing it into the new DW_OP_extended operation
encoding space. This will make its encoding a two byte operation but it
will reserve more of the one byte encodings for more frequently used
operations.
Sure DW_OP_modulo and DW_OP_remainder could be "extended" operations.

Agree.

My ambivalence toward a true modulo is because, unlike truncated division (aka remainder), which is used for address arithmetic within both the signed and unsigned domains, true modulo on FP numbers, and even truncated division on FP numbers, is a function call in C/C++.

I'm happy to let everyone else discuss and decide if we need an actual modulo in DWARF.
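
For anyone weighing that decision, here is a C sketch of how the two definitions discussed here differ on signed integers. The function names are invented for illustration, 64-bit operands are an assumption, and nothing here is proposed wording for the standard.

```c
#include <assert.h>
#include <stdint.h>

/* Remainder from truncated division: the result has the sign of the
 * dividend. Hypothetical helper; zero divisors are a caller error. */
static int64_t remainder_truncated(int64_t a, int64_t b)
{
    assert(b != 0);
    if (b == -1)
        return 0;             /* avoids INT64_MIN % -1, which overflows in C */
    return a % b;
}

/* Modulo from floored division: the result has the sign of the
 * divisor. Built on the truncated remainder plus one correction. */
static int64_t modulo_floored(int64_t a, int64_t b)
{
    int64_t r = remainder_truncated(a, b);
    /* A nonzero remainder whose sign differs from the divisor's is
     * moved into the divisor's sign. |r| < |b|, so this cannot overflow. */
    if (r != 0 && ((r < 0) != (b < 0)))
        r += b;
    return r;
}
```

The two only disagree when the operands have opposite signs: for -7 and 2 the truncated remainder is -1 and the floored modulo is 1. For unsigned operands they are identical.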


and DW_OP_remainder (defined using truncated division)
operators that are only to be used with typed DWARF stack values?
I really do believe that a better approach is to rename and expand the
domain of the current DW_OP_mod rather than adding another new special
purpose operator.
I disagree. I think just leave DW_OP_mod for legacy operation on the
generic type and have two clearly defined new DW_OP_modulo and
DW_OP_remainder for typed DWARF stack values is much clearer.
I am happy to let the overall committee decide this.

We agree on most points and have a minor disagreement on a couple of narrow points. We can sort those out in committee.

-ben


Cheers,

Mark
-- 
Dwarf-discuss mailing list
[email protected]
https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
