Re: [Dwarf-discuss] Clarification: DW_OP_mod doesn't specify which definition of modulo

Ron Brender via Dwarf-discuss Thu, 25 Sep 2025 08:26:14 -0700

"I am either too young or have lived a too sheltered life to be aware of
any architectures which actually have signed addressing."


Yes, those of us who grew up in computing using PDP-11, VAX and Alpha
architectures, especially with the VMS operating system, learned to know
and love (or at least make peace with) signed addresses. (And also stacks
that grow downward -- but that is a different issue!:-)

Researching some more, my eyes were greatly opened and my mind greatly
boggled by the Wikipedia article "Modulo" (see
https://en.wikipedia.org/wiki/Modulo). Among other things, that article
ends with a table of 150 or more programming languages, showing which of
the *four* major versions of modulo each one uses (Euclidean, Truncated,
Rounded and Floored (this last includes Knuth)). A casual visual scan of
that table suggests that no single one of the four dominates the others in
frequency of occurrence. [A couple of languages even leave the matter
undefined or implementation-defined.]
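
To make the differences concrete, here is a minimal C sketch of three of
the four variants for integer operands (the rounded variant is left out for
brevity). The function names are made up for illustration, and the sketch
assumes a non-zero divisor and ignores the INT64_MIN / -1 overflow case.

```c
#include <stdint.h>

/* Truncated: the remainder takes the sign of the dividend.
   This is what C99's % operator does. Assumes n != 0. */
int64_t mod_truncated(int64_t a, int64_t n) {
    return a % n;
}

/* Floored: the remainder takes the sign of the divisor.
   This matches a - n * floor(a / n). Assumes n != 0. */
int64_t mod_floored(int64_t a, int64_t n) {
    int64_t r = a % n;
    if (r != 0 && ((r < 0) != (n < 0)))
        r += n;
    return r;
}

/* Euclidean: the remainder is never negative. Assumes n != 0. */
int64_t mod_euclidean(int64_t a, int64_t n) {
    int64_t r = a % n;
    if (r < 0)
        r += (n < 0) ? -n : n;
    return r;
}
```

With these definitions, the operands (-7, 3) give -1, 2 and 2 respectively,
while (7, -3) give 1, -2 and 1, so the three only agree when both operands
are non-negative.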

My latest thoughts...

*First*, this article absolutely must be at least cited in any discussion
in any Issue proposal we might consider. There is a huge amount of
information in it that should be taken into account.

*Second*, I conclude that any single choice that DWARF might make would be
a terrible choice for most languages. I think that DWARF should specify
that the definition of modulo uses the definition for the language that
applies in its execution context. I think I saw that GDB does this already
so this is not a novel, never-been-tried idea.
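
As a rough sketch of what "language-dependent" could look like inside a
consumer, assume the consumer has already mapped the compilation unit's
language to a modulo convention. The enum and function below are
hypothetical names, not taken from any existing debugger, and only two
conventions are shown.

```c
#include <stdint.h>

/* Hypothetical: the modulo convention of the source language,
   for example truncated for C's % and floored for Python's %. */
enum mod_kind { MOD_TRUNCATED, MOD_FLOORED };

/* Evaluate a signed DW_OP_mod under the given convention.
   Assumes n != 0 and ignores the INT64_MIN / -1 overflow case. */
int64_t eval_mod(enum mod_kind kind, int64_t a, int64_t n) {
    int64_t r = a % n;              /* C99 truncates toward zero */
    if (kind == MOD_FLOORED && r != 0 && ((r < 0) != (n < 0)))
        r += n;                     /* move r to the divisor's sign */
    return r;
}
```

How the language is mapped to a convention is deliberately left out here;
that mapping is exactly the part a proposal would have to pin down.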

If we do that, then applicability to floating-point types is also naturally
left as language-dependent.

An alternative might be to replace the current modulo operator with four
new operators, one for each of the four kinds of modulo mentioned above. I
don't really like this idea but I think it is defensible.

*Third*, the generic type is still a problem. Since the generic type is
DWARF's invention, it is DWARF's problem to solve, I guess. So far my best
suggestion is to say that an operand of the modulo operation whose type is
generic is treated by that operator as an unsigned integer. I think that is
the more common case and more consistent with our historical (if
unpublished) position. [I prefer to keep this resolution specific to the
modulo operator if possible and not venture into broader consideration of
the nature of the generic type generally.]
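
A small C sketch of that suggestion, applied to the DW_OP_lit5, DW_OP_lit2,
DW_OP_neg, DW_OP_mod expression quoted further down. The type and function
names are hypothetical, the generic type is assumed to be 64 bits wide, and
the signed variant is included only for contrast.

```c
#include <stdint.h>

/* Hypothetical stand-in for the generic type on a 64-bit target. */
typedef uint64_t generic_t;

/* DW_OP_mod with both generic operands treated as unsigned.
   Assumes n != 0. */
generic_t generic_mod_unsigned(generic_t a, generic_t n) {
    return a % n;
}

/* For contrast: the same bits read as signed, with floored modulo.
   The unsigned-to-signed conversion is implementation-defined before
   C23; two's complement is assumed here. Assumes n != 0. */
int64_t generic_mod_signed_floored(generic_t a, generic_t n) {
    int64_t sa = (int64_t)a;
    int64_t sn = (int64_t)n;
    int64_t r = sa % sn;
    if (r != 0 && ((r < 0) != (sn < 0)))
        r += sn;
    return r;
}
```

Passing 5 and (generic_t)0 - 2 (the result of DW_OP_neg on 2) yields 5 from
the unsigned version and -1 from the signed one, which are the two answers
discussed in the quoted message.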

Ron


On Wed, Sep 24, 2025 at 9:03 PM Ben Woodard <[email protected]> wrote:

>
> On 9/24/25 4:46 PM, Ron Brender wrote:
>
> For starters, the proposed text is a non-starter (forgive the play on
> words) because there is no Chapter 2 Section 4 in Knuth's The Art of
> Computer Programming Volume 1. Chapter 2 is entitled Information
> Structures, in which section 2.4 (is that what you mean by "Section 4"?)
> is entitled Multilinked Structures, and has nothing to do with the modulo
> operation.
>
> The discussion mentions Section 1.2.4, which is actually in Volume 1,
> Chapter 1, is entitled Integer Functions and Elementary Number Theory,
> and does define and discuss the modulo operation.
>
> I can correct that. When we were discussing the issue, Cary pulled out his
> copy of Knuth; mine is tucked away in a box that never got unpacked the last
> time that I moved. I misunderstood his citation.
>
>
> Even if the citation were correct, I would object on the grounds that I
> believe the DWARF text should provide the definition, not a citation that
> the reader needs to consult. A footnote to an external source might be OK
> if there were complicated issues of possible supplementary interest.
>
> I was kind of hoping that the dwarf-discuss community could propose what
> they think it should be and then I'll be happy to write that into the text
> of the proposal. I really don't have a strong opinion on the matter.
>
> I just think that whatever the algorithm is, it should be in the standard.
> Right now we have:
> 1) GDB implementing Knuth's algorithm, but limited to signed and unsigned
> integers.
> 2) IIUC John DelSignore said TotalView implements DW_OP_mod using C's %
> operator.
> 3) and we have an email from Michael Eager from 2011 saying that DW_OP_mod
> should only apply to unsigned integral types.
>
> Let's all just get on the same page. (I don't care what page it is.)
>
>
> Finally, the Knuth definition is given in terms of real numbers, of which
> integers are a special case, using floor and ceiling operations. This
> would be appropriate if DWARF DW_OP_mod were intended to apply
> to floating-point operands but is rather pedantic overkill for just
> integers.
>
> Yeah, but in that section we have two broad classes of operators:
> arithmetic and logical. All the other arithmetic operators are defined
> with a domain that includes non-integral types. Should we exclude this one
> arithmetic operator from that more expansive range?
>
> It is not as if the algorithm is unknown or particularly complicated.
>
> *x* mod *y* = *x* − *y* × floor(*x* / *y*), if *y* ≠ 0; *x* mod 0 = *x*.
>
>
> But I think the real problem is not the definition of DW_OP_mod per se but
> the definition of the generic type. DWARF Section 2.5.2 defines the generic
> type as an "integral type that has the size of an address on the target
> machine and unspecified signedness." We know that some architectures treat
> addresses as signed and some as unsigned integers, and DWARF is trying not
> to care.
>
> Most of the time it mostly doesn't matter. But to be concrete, what does
> one make of
>
>      DW_OP_lit5
>      DW_OP_lit2
>      DW_OP_neg
>      DW_OP_mod
>
> If the generic type is signed, then the result is -1 (using the Knuth
> definition). However, if the generic type is thought to be unsigned, then
> "-2" is just a very large positive number and the result is 5.
>
> exactly!
>
> We might think about solving this problem by defining the generic type to
> be
>      a) signed
>      b) unsigned
>      c) signedness implementation-defined
> I would not advocate either a) or b). Moreover, I would be very cautious in
> overturning the "non-signedness" of generic type which has been
> characteristic of DWARF from the beginning (even before the name "generic
> type" was introduced).
>
> I am either too young or have lived a too sheltered life to be aware of
> any architectures which actually have signed addressing. I've seen plenty
> of cases where there are signed offsets added to unsigned addresses to make
> unsigned addresses. This address arithmetic avoids the complication of
> mixed signed and unsigned arithmetic which can slip into naively written C
> code.
>
> It seems to me that what you are really wanting with the generic type is
> "address arithmetic" where you can do "unsigned_address + signed_offset"
> and not have to worry about the C rules that can cause mixed unsigned and
> signed arithmetic to yield unexpected results. For example, I believe that
> "unsigned_address + signed_offset" is actually defined to be
> "unsigned_address + (unsigned) signed_offset" causing a very large
> unsigned value to be added to the unsigned address in the case where
> signed_offset happens to be a small negative number.
>
> I think that both "signed" and "unsigned" are kind of C concepts and I
> would suggest that another option which may provide a way out of this
> dilemma is:
>
> d) the generic type is defined to be an integral type suitable
> for address arithmetic.
>
> and more specifically this means that when implementing a consumer in C
> you must be careful when mixing signed and unsigned values doing something
> like:
>
> if (signed_offset >= 0)
>   unsigned_address += signed_offset;
> else
>   unsigned_address -= abs(signed_offset);
>
> There may be other cases where the arithmetic of the "generic type"
> diverges from integers in C or other languages in subtle ways. DW_OP_mod may
> be one of those but I haven't thought about it enough to be sure.
>
> -ben
>
>
>
>
> Restricting DW_OP_mod to unsigned integers seems like overkill, and is
> unnecessary when no generic type operands are involved.
>
> A more permissive approach is to specify that an operand of the generic
> type is implicitly treated as unsigned. Then use the Knuth definition
> restricted to integers. This is close to Ben's second alternative but
> further resolves the ambiguity of generic signedness.
>
> Ben has raised a definite problem for which further thought is surely
> warranted...
>
> Ron
>
>
>
>
>
>
> On Wed, Sep 24, 2025 at 2:32 PM Ben Woodard via Dwarf-discuss <
> [email protected]> wrote:
>
>> Background:
>>
>> Evidently, originally DWARF didn't allow arithmetic operations on
>> floating point numbers and most uses of the DWARF stack were done with
>> the assumption that the values being acted upon were addresses and so
>> the computation was assumed to be acting upon unsigned numbers.
>>
>> At some point, DWARF began to allow the arithmetic operations to work on
>> floating point numbers and several operations were explicitly defined to
>> work over non-integral values. This led to the paragraph in the current
>> DWARF working draft that says in section 2.5.2.4 on page 37 lines 24-27:
>>
>> "Operations other than DW_OP_abs, DW_OP_div, DW_OP_minus,
>> DW_OP_mul, DW_OP_neg and DW_OP_plus require integral types of the
>> operand (either integral base type or the generic type). Operations do
>> not cause an exception on overflow."
>>
>> Unlike all the other arithmetic operations this explicitly limits
>> DW_OP_mod to integral base types and the generic type. It lumps
>> DW_OP_mod in with the logical operations. Furthermore, there are
>> multiple definitions of the modulo operator which vary in how they
>> handle signed values.
>>
>> According to the dwarf-discuss archives, this issue came up back in 2011
>> and at that time Michael Eager made a pronouncement that DW_OP_mod used
>> the modulo algorithm for unsigned arithmetic. However, this decision was
>> not recorded in the standard. Since that time, consumers have
>> implemented DW_OP_mod in different ways.
>>
>> This proposal seeks to clarify and harmonize the consumer
>> implementations of the DW_OP_mod operator by defining which algorithm to
>> use for signed arithmetic and by defining it for floating-point numbers.
>>
>> Proposal:
>>
>> Add DW_OP_mod to the list of operators which do not require integral
>> base types by changing:
>>
>> Operations other than DW_OP_abs, DW_OP_div, DW_OP_minus, DW_OP_mul,
>> DW_OP_neg and DW_OP_plus require integral types of the operand (either
>> integral base type or the generic type).
>>
>> To:
>>
>> Operations other than DW_OP_abs, DW_OP_div, DW_OP_mod, DW_OP_minus,
>> DW_OP_mul, DW_OP_neg and DW_OP_plus require integral types of the
>> operand (either integral base type or the generic type).
>>
>> Then append the following sentence to the description of DW_OP_mod:
>>
>> The algorithm used to implement modulo shall be the one defined in The
>> Art of Computer Programming Volume 1: Fundamental Algorithms Chapter 2
>> Section 4. Knuth.
>>
>> Alternative proposals:
>>
>> 1) Explicitly state in the standard that DW_OP_mod is only defined for
>> unsigned integral arithmetic. This effectively standardizes Michael
>> Eager's pronouncement from 2011.
>>
>> 2) Pick any algorithm for modulo that works for signed as well as unsigned
>> arithmetic and specify that DW_OP_mod shall follow it. The current
>> GDB implementation follows Knuth 1.2.4 for signed and unsigned integral
>> arithmetic but excludes the algorithm for reals and floating point
>> numbers.
>>
>>
>> --
>> Dwarf-discuss mailing list
>> [email protected]
>> https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
>>
>

