https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118837
--- Comment #8 from Tom Tromey <tromey at gcc dot gnu.org> ---
(In reply to Simon Marchi from comment #7)
> ... so I am afraid that any attempt at addressing this problem will be met
> with a "it's a quality of implementation issue", but I think it's worth
> trying.

I think it would be better if DWARF didn't allow ambiguity. Even just making this a hard rule would be an improvement.

IIRC in one of the old threads the answer was that producers and consumers should agree, but to me this is clearly a bad answer, since the DWARF standard is precisely the mechanism by which they agree.

> So far my understanding is the problem is: you could have an attribute,
> let's say DW_AT_const_value, with form DW_FORM_data1, and value 0x80. As a
> consumer, how do you know if that 0x80 means -128 or 128? You could have
> compiler-1 people saying "it should obviously be interpreted as a signed
> constant" and compiler-2 people saying "it should obviously be interpreted
> as an unsigned constant". And then, as a consumer, you are in a pickle.

Correct. And the decision varies based on context.

> 1. The easy way: remove the DW_FORM_data<n> forms from the constant class.
> This only leaves DW_FORM_udata and DW_FORM_sdata, which define the
> signedness explicitly. The advantages: it's an easy change for everybody
> (in the spec, in producers, in consumers). How many ways of describing a
> constant does DWARF really need? The downside is obviously a possible
> increase in debug info size. But would it be significant? I would like to
> prototype it and see how many values in a real-world DWARF file would now
> take an extra byte because of this.

Unfortunately DWARF seems to really love these space-saving micro-optimizations. Personally I think sleb/uleb is enough for nearly everything (basically all values not involving relocations). But, e.g., DWARF added DW_FORM_strx3, I guess to save one byte sometimes?
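For a rough sense of the size cost raised in option 1, here is a minimal sketch that compares a fixed-size DW_FORM_data<n> encoding against the LEB128 encoding that DW_FORM_udata / DW_FORM_sdata would use instead. The helper names (uleb128, sleb128, extra_bytes) are made up for illustration and are not taken from gdb or gcc; it measures single values only, not a real-world DWARF file.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def sleb128(value: int) -> bytes:
    """Encode a signed integer as signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: Python ints keep their sign
        sign_bit_set = (byte & 0x40) != 0
        # Stop once the remaining bits are just sign extension of bit 6.
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def extra_bytes(value: int, fixed_size: int, signed: bool) -> int:
    """Bytes gained (positive) or saved (negative) by switching from a
    fixed-size DW_FORM_data<fixed_size> to the matching LEB128 form."""
    encoded = sleb128(value) if signed else uleb128(value)
    return len(encoded) - fixed_size
```

Under these assumptions, a value that fits DW_FORM_data1 grows by one byte only when it falls in 128..255 (unsigned) or in -128..-65 or 64..127 (signed); everything else stays at one byte, and small values stored in wider data forms shrink.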
Anyway, one problem with this approach is that it provides no guidance for DWARF 3-5. Still, it would be hugely better, easy to implement, etc. But I guess it would be a pretty big change from existing practice.

> 2. A more complicated way: for each attribute that can be of the constant
> class, define a default signedness (I imagine an extra column in Table 7.5:
> Attribute encodings). If the form does not specify the signedness (i.e.
> DW_FORM_data<n>), then the consumer would refer to that table to know if the
> value should be treated as signed or unsigned.

This is more or less the approach I took to fixing this in gdb: I went through every spot and tried to determine the correct answer. I don't think I quite finished.

And there are spots that are "confused". That is, compilers in practice will emit a DW_FORM_sdata if the value in question is signed, but will emit DW_FORM_data1 and expect this to be zero-extended. This, to me, undermines the idea that the value or the context is "signed" or "unsigned".

The main problem with this approach is that the answer doesn't just depend on the tag or the attribute. It can depend on other DIEs as well; for instance, I believe a variant part's discriminant value is sign-extended, or not, depending on the type of the relevant field. This of course is difficult to implement, test, etc.
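To make the ambiguity and the table idea from option 2 concrete, here is a minimal sketch of how a consumer might pick a signedness for a DW_FORM_data<n> value. Everything in it is an assumption for illustration: the DEFAULT_SIGN table, its entries, and the function names are hypothetical, and this is not how gdb actually does it. The optional type_is_signed argument stands in for the case where the answer comes from another DIE, such as the type of a variant part's discriminant field.

```python
from enum import Enum
from typing import Optional


class Sign(Enum):
    SIGNED = "signed"
    UNSIGNED = "unsigned"


# Hypothetical per-attribute defaults, standing in for the "extra column"
# suggested for the attribute encodings table. Entries are illustrative only.
DEFAULT_SIGN = {
    "DW_AT_byte_size": Sign.UNSIGNED,
    "DW_AT_lower_bound": Sign.SIGNED,
}


def decode_data(raw: bytes, sign: Sign, byteorder: str = "little") -> int:
    """Interpret the raw bytes of a DW_FORM_data<n> value with a given sign."""
    return int.from_bytes(raw, byteorder, signed=(sign is Sign.SIGNED))


def resolve_constant(attr: str, raw: bytes,
                     type_is_signed: Optional[bool] = None) -> int:
    """Decode a DW_FORM_data<n> constant for attribute `attr`.

    If the caller knows the signedness from context (for example, from the
    type of the discriminant field), that wins; otherwise fall back to the
    per-attribute default, and refuse to guess when there is none.
    """
    if type_is_signed is not None:
        sign = Sign.SIGNED if type_is_signed else Sign.UNSIGNED
    elif attr in DEFAULT_SIGN:
        sign = DEFAULT_SIGN[attr]
    else:
        raise ValueError(f"no signedness rule for {attr}")
    return decode_data(raw, sign)
```

With this sketch, the single byte 0x80 decodes as 128 for DW_AT_byte_size, but as -128 for a DW_AT_discr_value whose discriminant field has a signed type. The table alone cannot express the second case, which is the difficulty described above.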
