On Tue, 2014-02-04 at 18:24 -0800, Josh Stone wrote:
> On 02/04/2014 03:12 PM, Josh Stone wrote:
> > There are only a few internal dwarf_formsdata calls: for the decls as I
> > mentioned, and in array_size() for DW_AT_lower/upper_bound.  AFAICS the
> > spec doesn't explicitly call bounds signed or unsigned, but only
> > unsigned makes sense to me, so these also ought to use dwarf_formudata.
>
> http://www.dwarfstd.org/ShowIssue.php?issue=020702.1
>
> So Fortran allows negative bounds, yay, and this is the origin of the
> standard's vague statements about data[1248] context.

Thanks for finding this, it explains the context nicely.

> Here's a little experiment with gcc-gfortran-4.8.2-7.fc20.x86_64:
> (and forgive my fortran ignorance, but at least this compiles)
>
>       PROGRAM main
>         INTEGER A(10:199)
>         INTEGER B(-20:-10)
>         A(10) = B(-10)
>       END
>
> yields:
>
>  [ 67]  array_type
>         type          (ref4) [ 7f]
>         sibling       (ref4) [ 78]
>  [ 70]    subrange_type
>           type          (ref4) [ 78]
>           lower_bound   (data1) 10
>           upper_bound   (data1) 199
>  [ 78]  base_type
>         byte_size     (data1) 8
>         encoding      (data1) signed (5)
>         name          (strp) "integer(kind=8)"
>  [ 7f]  base_type
>         byte_size     (data1) 4
>         encoding      (data1) signed (5)
>         name          (strp) "integer(kind=4)"
>  [ 86]  array_type
>         type          (ref4) [ 7f]
>         sibling       (ref4) [ a5]
>  [ 8f]    subrange_type
>           type          (ref4) [ 78]
>           lower_bound   (data8) 18446744073709551596
>           upper_bound   (data8) 18446744073709551606
>
> Thus gfortran appears to support the current elfutils behavior - read it
> as unsigned and cast it without sign extension.  It happily put 199 in
> data1, and went all the way to data8 for negative values.  It could have
> been more compact with sdata instead of data8 though.
>
> Also, apparently eu-readelf is not using dwarf_formsdata for bounds, and
> it should.

It doesn't because it is very low-level and doesn't use any context. So
it just sees the DW_FORM_data8 and prints its value. But if I read the
DWARF issue correctly, a higher-level interface seeing a
DW_TAG_subrange_type would first look up the DW_AT_type of the DIE to see
whether the type is signed or not, and use that to decide how to
interpret the DW_AT_lower_bound and DW_AT_upper_bound values. A bound can
even be a reference or an exprloc that represents the actual value. We
might want to introduce a dwarf_subrange_bounds () function that does
that.

> Binutils readelf prints those as hex, no better.
>
> FWIW, libdwarf's dwarfdump just reveals its indecision:
>     DW_AT_lower_bound           10
>     DW_AT_upper_bound           199(as signed = -57)
> and
>     DW_AT_lower_bound           18446744073709551596(as signed = -20)
>     DW_AT_upper_bound           18446744073709551606(as signed = -10)

Right, because dwarfdump is similar to eu-readelf, it doesn't use any
context and so it doesn't know how to represent the value encoded with
DW_FORM_data8. I actually like that it also prints the signed value if
different. Maybe we should make eu-readelf do the same? Printing in hex,
as binutils readelf does, is another way to mask the ambiguity at the
low level.

> So now I'm not sure anything needs to change.  At least dwarf_formsdata
> should stay as-is for gcc.

Are you sure? I think your original analysis is correct that
dwarf_formsdata () is wrong and really should sign-extend.

> We could conceivably use dwarf_formudata for
> DW_AT_decl_file/line/column, since those really are specified unsigned,
> but this is unlikely to ever make a difference.  The values for
> dwarf_decl_line/column are asserted 0..INT_MAX, and people with more
> than INT64_MAX files are already insane.

You are right. Still, using dwarf_formudata () would be more correct
IMHO.

Cheers,

Mark
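
P.S. To make the ambiguity concrete, here is a minimal standalone sketch of
what sign-extending a DW_FORM_data[1248] value by its encoded width would
look like, plus a dwarfdump-style printer that shows the signed reading
only when it differs. Neither function exists in libdw or libdwarf; the
names sign_extend and format_bound are made up for illustration, and the
"(as signed = ...)" layout is simply copied from the dwarfdump output
quoted above.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a libdw function): reinterpret the raw bits of
   a fixed-size data form as a signed value.  WIDTH is the encoded size in
   bytes: 1, 2, 4 or 8.  */
int64_t
sign_extend (uint64_t raw, unsigned int width)
{
  unsigned int bits = width * 8;
  if (bits >= 64)
    /* Full width already.  The conversion is implementation-defined in
       older C standards but yields the two's complement value on every
       platform elfutils targets.  */
    return (int64_t) raw;

  uint64_t sign_bit = UINT64_C (1) << (bits - 1);
  raw &= (sign_bit << 1) - 1;           /* Keep only the encoded bits.  */
  return (int64_t) ((raw ^ sign_bit) - sign_bit);
}

/* Hypothetical printer: always show the unsigned reading, and append the
   signed reading only when the two differ.  */
void
format_bound (char *buf, size_t len, uint64_t raw, unsigned int width)
{
  int64_t sval = sign_extend (raw, width);
  if (sval < 0)
    snprintf (buf, len, "%" PRIu64 "(as signed = %" PRId64 ")", raw, sval);
  else
    snprintf (buf, len, "%" PRIu64, raw);
}
```

With this, 199 encoded as data1 has a second reading of -57, while the
data8 value 18446744073709551596 reads as -20. A context-aware interface
would instead pick one reading based on the encoding of the subrange's
base type, so the upper bound 199 stays 199.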
