"I am either too young or have lived a too sheltered life to be aware of any architectures which actually have signed addressing."
Yes, those of us who grew up in computing using PDP-11, VAX and Alpha architectures, especially with the VMS operating system, learned to know and love (or at least make peace with) signed addresses. (And also stacks that grow downward -- but that is a different issue! :-)

Researching some more, my eyes were greatly opened and my mind greatly boggled by the article "Modulo" in Wikipedia (see https://en.wikipedia.org/wiki/Modulo). Among other things, that article ends with a table of 150 or more programming languages, together with which of the *four* major versions of modulo each one uses (Euclidean, Truncated, Rounded and Floored (this last includes Knuth)). A casual visual scan of that table suggests that no single one of the four dominates the others in frequency of occurrence. [A couple of languages even leave the matter undefined or implementation-defined.]

My latest thoughts...

*First*, this article absolutely must be at least cited in any discussion in any Issue proposal we might consider. There is a huge amount of information in it that should be taken into account.

*Second*, I conclude that any single choice that DWARF might make would be a terrible choice for most languages. I think that DWARF should specify that modulo uses the definition for the language that applies in its execution context. I think I saw that GDB does this already, so this is not a novel, never-been-tried idea. If we do that, then applicability to floating-point types is also naturally left as language-dependent. An alternative might be to replace the current modulo operator with four new operators, one for each of the four kinds of modulo mentioned above. I don't really like this idea but I think it is defensible.

*Third*, the generic type is still a problem. Since the generic type is DWARF's invention, it is DWARF's problem to solve, I guess.
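To make the four conventions named in the Wikipedia table concrete, here is a minimal C sketch for 64-bit signed integers. The function names are hypothetical and not taken from DWARF, GDB, or any other consumer. It assumes "Rounded" means round-to-nearest with ties to even, and it ignores the INT64_MIN edge cases.

```c
#include <assert.h>
#include <stdint.h>

/* Truncated: the quotient rounds toward zero, which is what C's %
   does. The result takes the sign of the dividend x. */
int64_t mod_truncated(int64_t x, int64_t y) {
    assert(y != 0);
    return x % y;
}

/* Floored (Knuth): the quotient rounds toward negative infinity.
   The result takes the sign of the divisor y. */
int64_t mod_floored(int64_t x, int64_t y) {
    int64_t r = mod_truncated(x, y);
    if (r != 0 && ((r < 0) != (y < 0)))
        r += y;
    return r;
}

/* Euclidean: the result always lies in [0, |y|). */
int64_t mod_euclidean(int64_t x, int64_t y) {
    int64_t r = mod_truncated(x, y);
    if (r < 0)
        r += (y < 0) ? -y : y;
    return r;
}

/* Rounded: the quotient rounds to the nearest integer, ties to even
   (an assumption of this sketch). The result lies in [-|y|/2, |y|/2]. */
int64_t mod_rounded(int64_t x, int64_t y) {
    int64_t r = mod_truncated(x, y);
    int64_t q = x / y;
    int64_t ar = (r < 0) ? -r : r;
    int64_t ay = (y < 0) ? -y : y;
    /* Compare 2*|r| with |y| without risking overflow. */
    int past_half = ar > ay - ar;
    int tie_on_odd = (ar == ay - ar) && (q % 2 != 0);
    if (past_half || tie_on_odd) {
        /* Move the quotient one step away from zero. */
        if ((x < 0) == (y < 0))
            r -= y;
        else
            r += y;
    }
    return r;
}
```

For x = 7 and y = -2 the truncated and Euclidean results are 1, while the floored and rounded results are -1, so even a small operand pair separates the conventions.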
So far my best suggestion is to say that an operand of the modulo operation whose type is generic is treated by that operator as an unsigned integer. I think that is the more common case and more consistent with our historical (if unpublished) position. [I prefer to keep this resolution specific to the modulo operator if possible and not venture into broader consideration of the nature of the generic type generally.]

Ron

On Wed, Sep 24, 2025 at 9:03 PM Ben Woodard <[email protected]> wrote:

> On 9/24/25 4:46 PM, Ron Brender wrote:
>
> For starters, the proposed text is a non-starter (forgive the play on words) because there is no Chapter 2 Section 4 in Knuth's The Art of Computer Programming Volume 1. Chapter 2 is entitled Information Structures, in which section 2.4 (is that what you mean by "Section 4"?) is entitled Multilinked Structures, and has nothing to do with the modulo operation.
>
> The discussion mentions Section 1.2.4, which is actually in Volume 1, Chapter 1, is entitled Integer Functions and Elementary Number Theory, and does define and discuss the modulo operation.
>
> I can correct that. When we were discussing the issue, Cary pulled out his copy of Knuth; mine is tucked away in a box that never got unpacked the last time that I moved. I misunderstood his citation.
>
> Even if the citation were correct, I would object on the grounds that I believe the DWARF text should provide the definition, not a citation that the reader needs to consult. A footnote to an external source might be OK if there were complicated issues of possible supplementary interest.
>
> I was kind of hoping that the dwarf-discuss community could propose what they think it should be and then I'll be happy to write that into the text of the proposal. I really don't have a strong opinion on the matter.
>
> I just think that whatever the algorithm is, it should be in the standard.
> Right now we have:
> 1) GDB implementing Knuth's algorithm, but limited to signed and unsigned integers.
> 2) IIUC John DelSignore said TotalView implements DW_OP_mod using C's % operator.
> 3) and we have an email from Michael Eager from 2011 saying that DW_OP_mod should only apply to unsigned integral types.
>
> Let's all just get on the same page. (I don't care what page it is.)
>
> Finally, the Knuth definition is given in terms of real numbers, of which integers are a special case, using floor and ceiling operations. This would be appropriate if DWARF DW_OP_mod were intended to apply to floating-point operands but is rather pedantic overkill for just integers.
>
> Yeah but in that section we have two broad classes of operators: arithmetic and logical. All the other arithmetic operators are defined with a domain that includes non-integral types. Should we exclude this one arithmetic operator from that more expansive range?
>
> It is not as if the algorithm is unknown or particularly complicated.
>
> *x* mod *y* = *x* − *y* × floor(*x* / *y*), if *y* ≠ 0; *x* mod 0 = *x*.
>
> But I think the real problem is not the definition of DW_OP_mod per se but the definition of the generic type. DWARF Section 2.5.2 defines the generic type as an "integral type that has the size of an address on the target machine and unspecified signedness." We know that some architectures treat addresses as signed and some as unsigned integers, and DWARF is trying not to care.
>
> Most of the time it mostly doesn't matter. But to be concrete, what does one make of
>
>     DW_OP_lit5
>     DW_OP_lit2
>     DW_OP_neg
>     DW_OP_mod
>
> If the generic type is signed, then the result is -1. However, if the generic type is thought to be unsigned, then "-2" is just a very large positive number and the result is 5.
>
> exactly!
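The four-operation expression quoted above can be traced in a few lines of C. This is only a sketch under stated assumptions: a 64-bit generic type, two's-complement conversion between `uint64_t` and `int64_t`, and hypothetical function names that do not come from any real consumer. The signed branch uses the floored (Knuth) definition restricted to integers, including the `x mod 0 = x` case from the formula.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Knuth's definition restricted to integers:
   x mod y = x - y * floor(x / y) if y != 0, and x mod 0 = x. */
int64_t knuth_mod_signed(int64_t x, int64_t y) {
    if (y == 0)
        return x;
    int64_t r = x % y;               /* C truncates toward zero */
    if (r != 0 && ((r < 0) != (y < 0)))
        r += y;                      /* shift to the floored result */
    return r;
}

/* Evaluate DW_OP_lit5, DW_OP_lit2, DW_OP_neg, DW_OP_mod on a
   hypothetical 64-bit generic type, read either as signed or as
   unsigned. The result is returned as the raw 64-bit pattern. */
uint64_t eval_lit5_lit2_neg_mod(bool generic_is_signed) {
    uint64_t stack[2];
    stack[0] = 5;                    /* DW_OP_lit5 */
    stack[1] = 2;                    /* DW_OP_lit2 */
    stack[1] = 0 - stack[1];         /* DW_OP_neg: bit pattern of -2 */

    /* DW_OP_mod: former second entry modulo former top of stack. */
    uint64_t x = stack[0];
    uint64_t y = stack[1];
    if (generic_is_signed)
        return (uint64_t)knuth_mod_signed((int64_t)x, (int64_t)y);
    return (y == 0) ? x : x % y;     /* unsigned: y is 2^64 - 2 */
}
```

Read as signed, the result is the bit pattern of -1; read as unsigned, it is 5. Applying C's `%` directly to the signed operands (`5 % -2`) gives 1, so the three behaviours listed in the message can produce three different answers for this one expression.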
> We might think about solving this problem by defining the generic type to be
> a) signed
> b) unsigned
> c) signedness implementation-defined
> I would not advocate either a) or b). Moreover, I would be very cautious in overturning the "non-signedness" of generic type which has been characteristic of DWARF from the beginning (even before the name "generic type" was introduced).
>
> I am either too young or have lived a too sheltered life to be aware of any architectures which actually have signed addressing. I've seen plenty of cases where there are signed offsets added to unsigned addresses to make unsigned addresses. This address arithmetic avoids the complication of mixed signed and unsigned arithmetic which can slip into naively written C code.
>
> It seems to me that what you are really wanting with the generic type is "address arithmetic" where you can do "unsigned_address + signed_offset" and not have to worry about the C rules that can cause mixed unsigned and signed arithmetic to yield unexpected results. For example I believe that "unsigned_address + signed_offset" is actually defined to be "unsigned_address + (unsigned) signed_offset", causing a very large unsigned value to be added to the unsigned address in the case where signed_offset happens to be a small negative.
>
> I think that both "signed" and "unsigned" are kind of C concepts and I would suggest that another option which may provide a way out of this dilemma is:
>
> d) the generic type is defined to be an integral type suitable for address arithmetic.
> and more specifically this means that when implementing a consumer in C you must be careful when mixing signed and unsigned values doing something like:
>
>     if (signed_offset >= 0)
>         unsigned_address += signed_offset;
>     else
>         unsigned_address -= abs(signed_offset);
>
> There may be other cases where the arithmetic of the "generic type" diverges from integers in C or other language in subtle ways. DW_OP_mod may be one of those but I haven't thought about it enough to be sure.
>
> -ben
>
> Defining DW_OP_mod to be defined only for unsigned integers seems overkill and unnecessary when no generic type operands are involved.
>
> A more permissive approach is to specify that an operand of the generic type is implicitly treated as unsigned. Then use the Knuth definition restricted to integers. This is close to Ben's second alternative but further resolves the ambiguity of generic signedness.
>
> Ben has raised a definite problem for which further thought is surely warranted...
>
> Ron
>
> On Wed, Sep 24, 2025 at 2:32 PM Ben Woodard via Dwarf-discuss <[email protected]> wrote:
>
>> Background:
>>
>> Evidently, originally DWARF didn't allow arithmetic operations on floating point numbers and most uses of the DWARF stack were done with the assumption that the values being acted upon were addresses and so the computation was assumed to be acting upon unsigned numbers.
>>
>> At some point, DWARF began to allow the arithmetic operations to work on floating point numbers and several operations were explicitly defined to work over non-integral values. This led to the paragraph in the current DWARF working draft that says in section 2.5.2.4 on page 37 lines 24-27:
>>
>> "Operations other than DW_OP_abs, DW_OP_div, DW_OP_minus, DW_OP_mul, DW_OP_neg and DW_OP_plus require integral types of the operand (either integral base type or the generic type). Operations do not cause an exception on overflow."
>>
>> Unlike all the other arithmetic operations this explicitly limits DW_OP_mod to integral base types and the generic type. It lumps DW_OP_mod in with the logical operations. Furthermore, there are multiple definitions of the modulo operator which vary in how they handle signed values.
>>
>> According to the dwarf-discuss archives, this issue came up back in 2011 and at that time Michael Eager made a pronouncement that DW_OP_mod used the modulo algorithm for unsigned arithmetic. However, this decision was not recorded in the standard. Since that time, consumers have implemented DW_OP_mod in different ways.
>>
>> This proposal seeks to clarify and harmonize the consumer implementations of the DW_OP_mod operator by defining which algorithm to use for signed arithmetic as well as defining it for floating point numbers.
>>
>> Proposal:
>>
>> Add DW_OP_mod to the list of operators which do not require integral base types by changing:
>>
>> Operations other than DW_OP_abs, DW_OP_div, DW_OP_minus, DW_OP_mul, DW_OP_neg and DW_OP_plus require integral types of the operand (either integral base type or the generic type).
>>
>> To:
>>
>> Operations other than DW_OP_abs, DW_OP_div, DW_OP_mod, DW_OP_minus, DW_OP_mul, DW_OP_neg and DW_OP_plus require integral types of the operand (either integral base type or the generic type).
>>
>> Then append the following sentence to the description of the DW_OP_mod:
>>
>> The algorithm used to implement modulo shall be the one defined in The Art of Computer Programming Volume 1: Fundamental Algorithms Chapter 2 Section 4. Knuth.
>>
>> Alternative proposals:
>>
>> 1) Explicitly state in the standard that DW_OP_mod is only defined for unsigned integral arithmetic. This effectively standardizes Michael Eager's pronouncement from 2011.
>>
>> 2) Pick any algorithm for modulo that works for signed as well as unsigned arithmetic and specify that DW_OP_mod shall follow it. The current GDB implementation follows Knuth 1.2.4 for signed and unsigned integral arithmetic but excludes the algorithm for reals and floating point numbers.
>>
>> --
>> Dwarf-discuss mailing list
>> [email protected]
>> https://lists.dwarfstd.org/mailman/listinfo/dwarf-discuss
