https://sourceware.org/bugzilla/show_bug.cgi?id=31454
Bug ID: 31454
Summary: Add constant tracking to disassembly (objdump -d, gdb disas)
Product: binutils
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: binutils
Assignee: unassigned at sourceware dot org
Reporter: jakub at redhat dot com
Target Milestone: ---

Consider

unsigned foo (void) { return 0xdeadbeefU; }
unsigned long long bar (void) { return 0xdeadbeefcafebabeULL; }
static int p;
int *baz (void) { return &p; }
int main () {}

When linked on x86_64 with -O2 -fpic, objdump -d and gdb disassemble already do some immediate visualization to help the user read the code:

0000000000401140 <baz>:
  401140:   48 8d 05 d9 2e 00 00    lea    0x2ed9(%rip),%rax        # 404020 <__TMC_END__>
  401147:   c3                      ret

or

Dump of assembler code for function baz:
   0x0000000000401140 <+0>:     lea    0x2ed9(%rip),%rax        # 0x404020 <p>
   0x0000000000401147 <+7>:     ret

Both know how to handle lea with an immediate and (%rip): they add the 0x2ed9 to the address of the end of the instruction and print the resulting immediate, and perhaps a symbolic rendering of it, in the comment. The 0xdeadbeef and 0xdeadbeefcafebabe immediates are clearly shown in the assembly, so there is no need to help users reading that.

Now, let's try the same on other arches, e.g. aarch64:

  400140:   5297dde0    mov     w0, #0xbeef                     // #48879
  400144:   72bbd5a0    movk    w0, #0xdead, lsl #16

in foo,

  400160:   d29757c0    mov     x0, #0xbabe                     // #47806
  400164:   f2b95fc0    movk    x0, #0xcafe, lsl #16
  400168:   f2d7dde0    movk    x0, #0xbeef, lsl #32
  40016c:   f2fbd5a0    movk    x0, #0xdead, lsl #48

in bar and

  400180:   f00000e0    adrp    x0, 41f000 <baz+0x1ee80>
  400184:   913fa000    add     x0, x0, #0xfe8

in baz.

It would be helpful if the disassembler could, for a small set of instructions that are usually involved in creating constants in GPRs, propagate constants through them: for each GPR, remember whether it is set to a known constant (and if so, the constant value) or not.

When seeing the start of a function (new symbol?), reset this knowledge; maybe also reset it at possible conditional/unconditional jump destinations within the same function (though computing that might require another pass through the instructions). When seeing a GPR set to a constant by a handled instruction, remember that constant. When seeing a handled instruction where all the inputs have known constant values, try to evaluate the instruction, remember the resulting constant, and then show in a comment, like in the lea case above, the immediate plus its symbolic rendering, if any. And when seeing an unhandled instruction that sets or clobbers some GPR (or might do that), forget the value of that register.

So, for foo above, remember that w0 is set to 0xbeef, interpret the movk instruction to find that the result is 0xdeadbeef, and tell that to the user; ditto for the second case; similarly, remember the value for adrp and handle the add too, printing 41ffe8 <p> there.

Now, repeat this on other arches, powerpc{,64,64le}, sparc{,64}, ... On s390x, one can also see code that loads some constants from .rodata/.data.rel.ro* and similar sections; those too would be nice to track and print.

This would help users so that they don't have to scratch their heads interpreting the instructions, or watch what the code does at runtime, to find out what it actually computes.

In gdb, one sometimes disassembles just part of a function, not the whole one. I think it would be perfectly fine to start with a nothing-known state at the start of such a block and print only what is discovered in that block.
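
To make the proposed bookkeeping concrete, here is a minimal sketch in C of the per-register state and its update rules, restricted to the four aarch64 instruction forms from the listings above. It is only an illustration under stated assumptions: struct insn, struct const_state, const_state_reset and const_state_step are hypothetical names, not existing binutils or gdb interfaces, and the instructions are assumed to be already decoded into a simple struct, with the adrp target page precomputed the way objdump prints it.

```c
#include <stdbool.h>
#include <stdint.h>

enum { NUM_GPRS = 31 };  /* x0..x30; register 31 (sp/zr) is not tracked */

/* Hypothetical, already-decoded instruction forms (not binutils types). */
enum op_kind { OP_MOVZ, OP_MOVK, OP_ADRP, OP_ADD_IMM, OP_OTHER };

struct insn {
  enum op_kind kind;
  int rd;          /* register written, or -1 if no GPR is written */
  int rn;          /* source register (used by OP_ADD_IMM only) */
  uint64_t imm;    /* immediate; for OP_ADRP the precomputed page address */
  unsigned shift;  /* left shift for MOVZ/MOVK: 0, 16, 32 or 48 */
  bool is_64bit;   /* true for x registers, false for w registers */
};

/* For each GPR: is it a known constant, and if so which one. */
struct const_state {
  bool known[NUM_GPRS];
  uint64_t value[NUM_GPRS];
};

static bool tracked(int reg) { return reg >= 0 && reg < NUM_GPRS; }

/* Forget everything, e.g. at the start of a function or of a block. */
void const_state_reset(struct const_state *st)
{
  for (int i = 0; i < NUM_GPRS; i++) {
    st->known[i] = false;
    st->value[i] = 0;
  }
}

/* Update the state for one instruction.  Returns true and stores the
   constant in *out if the destination register now holds a known value. */
bool const_state_step(struct const_state *st, const struct insn *in,
                      uint64_t *out)
{
  uint64_t v = 0;
  bool ok = false;

  if (!tracked(in->rd))
    return false;  /* writes no tracked GPR: nothing to record */

  switch (in->kind) {
  case OP_MOVZ:    /* constant set directly */
    v = in->imm << in->shift;
    ok = true;
    break;
  case OP_MOVK:    /* replaces one 16-bit field, needs the old value */
    if (st->known[in->rd]) {
      uint64_t mask = (uint64_t) 0xffff << in->shift;
      v = (st->value[in->rd] & ~mask) | (in->imm << in->shift);
      ok = true;
    }
    break;
  case OP_ADRP:    /* page address, assumed resolved by the decoder */
    v = in->imm;
    ok = true;
    break;
  case OP_ADD_IMM: /* evaluable only if the input is a known constant */
    if (tracked(in->rn) && st->known[in->rn]) {
      v = st->value[in->rn] + in->imm;
      ok = true;
    }
    break;
  case OP_OTHER:   /* unhandled instruction: the register gets forgotten */
    break;
  }

  if (ok && !in->is_64bit)
    v &= 0xffffffffu;  /* writes to w registers zero the upper half */

  st->known[in->rd] = ok;
  st->value[in->rd] = ok ? v : 0;
  if (ok)
    *out = v;
  return ok;
}
```

A disassembler using something like this would call const_state_reset at each function symbol (or at the start of a partial gdb disassembly), call const_state_step once per instruction, and emit the comment only for derived results such as movk and add, since a plain mov already shows its immediate. Resetting at jump targets, handling loads from .rodata, and treating calls as clobbering the call-clobbered registers are deliberately left out of the sketch.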