On 15/11/16 16:48, Jiong Wang wrote:
>
>
> On 15/11/16 16:18, Jakub Jelinek wrote:
>> On Tue, Nov 15, 2016 at 04:00:40PM +0000, Jiong Wang wrote:
>>>>> Takes one signed LEB128 offset and retrieves 8-byte contents from
>>>>> the address calculated by CFA plus this offset, the contents then
>>>>> authenticated as per A key for instruction pointer using current
>>>>> CFA as salt. The result is pushed onto the stack.
>>>> I'd like to point out that especially the vendor range of DW_OP_* is
>>>> extremely scarce resource, we have only a couple of unused values,
>>>> so taking 3 out of the remaining unused 12 for a single architecture
>>>> is IMHO too much.
>>>> Can't you use just a single opcode and encode which of the 3
>>>> operations it is in say the low 2 bits of a LEB 128 operand?
>>>> We'll likely need to do RSN some multiplexing even for the generic GNU
>>>> opcodes if we need just a few further ones (say 0xff as an extension,
>>>> followed by uleb128 containing the opcode - 0xff).
>>>> In the non-vendor area we still have 54 values left, so there is
>>>> more space for future expansion.
>>> Seperate DWARF operations are introduced instead of combining all of
>>> them into one are mostly because these operations are going to be used
>>> for most of the functions once return address signing are enabled, and
>>> they are used for describing frame unwinding that they will go into
>>> unwind table for C++ program or C program compiled with -fexceptions,
>>> the impact on unwind table size is significant. So I was trying to
>>> lower the unwind table size overhead as much as I can.
>>>
>>> IMHO, three numbers actually is not that much for one architecture in
>>> DWARF operation vendor extension space as vendors can overlap with
>>> each other. The only painful thing from my understand is there are
>>> platform vendors, for example "GNU" and "LLVM" etc, for which
>>> architecture vendor can't overlap with.
>> For DW_OP_*, there aren't two vendor ranges like e.g. in ELF, there is
>> just one range, so ideally the opcodes would be unique everywhere, if
>> not, there is just a single GNU vendor, there is no separate range for
>> Aarch64, that can overlap with range for x86_64, and powerpc, etc.
>>
>> Perhaps we could declare that certain opcode subrange for the GNU
>> vendor is architecture specific and document that the meaning of
>> opcodes in that range and count/encoding of their arguments depends on
>> the architecture, but then we should document how to figure out the
>> architecture too (e.g. for ELF base it on the containing EM_*). All
>> the tools that look at DWARF (readelf, objdump, eu-readelf, libdw,
>> libunwind, gdb, dwz, ...) would need to agree on that though.
>>
>> I know nothing about the aarch64 return address signing, would all 3
>> or say 2 usually appear together without any separate pc advance, or
>> are they all going to appear frequently and at different pcs?
>
> I think it's the latter, the DW_OP_AARCH64_paciasp and
> DW_OP_AARCH64_paciasp_deref are going to appear frequently and at
> different pcs.
> For example, the following function prologue, there are three
> instructions at 0x0, 0x4, 0x8.
>
> After the first instruction at 0x0, LR/X30 will be mangled. The
> "paciasp" always mangle LR register using SP as salt and write back the
> value into LR. We then generate DW_OP_AARCH64_paciasp to notify any
> unwinder that the original LR is mangled in this way so they can unwind
> the original value properly.
>
> After the second instruction at 0x4, The mangled value of LR/X30 will
> be pushed on to stack, unlike usual .cfi_offset, the unwind rule for
> LR/X30 becomes: first fetch the mangled value from stack offset -16,
> then do whatever to restore the original value from the mangled value.
> This is represented by (DW_OP_AARCH64_paciasp_deref, offset).
>
>         .cfi_startproc
>    0x0  paciasp (this instruction sign return address register LR/X30)
>         .cfi_val_expression 30, DW_OP_AARCH64_paciasp
>    0x4  stp x29, x30, [sp, -32]!
>         .cfi_val_expression 30, DW_OP_AARCH64_paciasp_deref, -16
>         .cfi_offset 29, -32
>         .cfi_def_cfa_offset 32
>    0x8  add x29, sp, 0
>
Now I'm confused.

I was thinking that we needed one opcode for the sign operation in the
prologue and one for the unsign/validate operation in the epilogue (to
support non-call exceptions). But why do we need a separate opcode to
say that a previously signed value has now been pushed on the stack?
Surely that's just a normal store operation that can be tracked through
the unwinding state machine.

I was expecting the third opcode to be needed for the special operations
that are not frequently used by the compiler.

R.

>> Perhaps if there is just 1 opcode and has all the info encoded just
>> in one bigger uleb128 or something similar...
>
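To make the single-opcode suggestion concrete, here is a rough sketch of how one operand could carry both the operation selector (low 2 bits) and a signed CFA offset. Nothing here is a proposed encoding: the selector values, the constant names and the helper names are all hypothetical, and I have used a signed LEB128 operand (rather than uleb128) only because the offset in the example above is negative.

```python
# Hypothetical selector values for the low 2 bits of the operand.
# These names and numbers are assumptions for illustration only.
KIND_PACIASP = 0
KIND_PACIASP_DEREF = 1
KIND_THIRD_OP = 2  # placeholder for the third, unnamed operation


def encode_sleb128(value: int) -> bytes:
    """Encode a signed integer as DWARF signed LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift, keeps the sign
        sign_bit_set = bool(byte & 0x40)
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # more bytes follow


def decode_sleb128(data: bytes) -> int:
    """Decode a DWARF signed LEB128 value from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign-extend the final group
                result -= 1 << shift
            return result
    raise ValueError("truncated SLEB128")


def pack_operand(kind: int, offset: int = 0) -> bytes:
    """Put the selector in the low 2 bits and the offset above it."""
    if not 0 <= kind <= 3:
        raise ValueError("selector must fit in 2 bits")
    return encode_sleb128((offset << 2) | kind)


def unpack_operand(data: bytes) -> tuple[int, int]:
    """Return (kind, offset) from a packed operand."""
    value = decode_sleb128(data)
    return value & 3, value >> 2


if __name__ == "__main__":
    operand = pack_operand(KIND_PACIASP_DEREF, -16)
    print(operand.hex(), unpack_operand(operand))
```

With this packing the (deref, -16) case from the prologue above still fits in a one-byte operand, so it costs the same as a dedicated opcode plus a one-byte offset. The plain paciasp case, however, grows from a bare opcode to an opcode plus one operand byte, which is the table-size overhead Jiong is worried about.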