dwblaikie wrote: > Carlos is away at the moment so I'm happy to take the nop-sled question. > Sorry for the slight wait, I've also been in and out of the office this week. > > > Problem is that DWARF isn't enough to identify ICF sites for indirect calls > > - because it doesn't record the originally intended destination of the > > call, only the actual call - so there's nothing to compare to to detect the > > difference. > > That makes sense. However, I think there's more than just ICF detection at > play here as I believe it's also the ability to disambiguate between tail > calls and ICF that is valuable. The SCE debugger recovers entry values using > call site parameter DIEs, so it's important to know whether or not there has > been a tail call between that call site info (in the next frame up/parent > frame) and the current call frame, as that would mean the call site > information does not describe the call to the current frame. > > Where there is a mismatch between the parent frame call_origin and current > frame it could be due to either ICF or a tail call. Today, we can > disambiguate those cases for direct calls using the call_origin as already > mentioned. > > However, I don't think there's a foolproof way to determine whether there's > been a tail call for indirect calls currently. Right now the debugger > defensively hides the entry_values if the parent frame doesn't have a > call_origin at the call site (e.g., because it's indirect). Adding > `virutal_call_origin` for virtual calls lets us perform that check over some > indirect call sites, albeit imperfectly (a tail call through to another > implementation of the virtual function would be missed, I think).
tail calls have `DW_AT_call_tail_call` on them, right? > > I know this is way off in a different direction, but... > > > Have you considered using a NOP-sled at the start of ICF'd functions (this > > would be a linker change - when ICF'ing, rather than resolving every > > address to the ICF'd version - resolve each unique original function to a > > unique NOP at the start of/before the start of the chosen preserved copy - > > that way each original function would still have a unique address at the > > call sites)? This would solve any cases of indirect calls, not just virtual > > ones, and would work in a debugger, would preserve uniqueness of addresses > > for C++ function address comparison semantics, etc. > > That sounds interesting, I had not seen this idea before. I wonder, does this > only help you detect ICF specifically on the first step into a call? I can't > picture how this would work if you paused execution in the middle of the > code-folded function. So you already do this for statically known call sites by comparing the function you're in to the caller's call_site's target function - if they're different, it's ICF (and/or a tail call - though the caller's call site should tell you if it's a tail call, so you can know if you're ICF-that's-not-a-tail-call, but if it's a tail call you could be in tail call or tail call+ICF) With a NOP sled you could do this for dynamic calls too - because you could differentiate between the start of the function you're in and the caller's call_site's call_target, if you can still evaluate that (if registers haven't been clobbered/not saved/etc). Which you couldn't do without the nop sled, because the caller's call_target would already be resolved to point to the ICF'd function. > I'm not sure it helps us with the tail call detection for entry value > resolution for indirect calls. I appreciate that isn't what you were > suggesting or responding to, but I think we'd still like to solve that. Rather than trying to detect tail calls - can't you rely the call_site to tell you if it's a tail call? The compiler knows and should be able to emit this information reliably (& seems to at least in some cases - if there's gaps, hopefully those are fixable) & yeah, if you see a tail call - if you can get to the function the tail call is for (even if it's a dynamic dispatch - you can then lookup the function it's dispatching to in the DWARF because you know at runtime/post-runtime what it is) and if it isn't the callee, don't use the call_site_parameters, and if it is the callee you have to analyze all that direct call's calls (recursively), and up from the callee, until you have a graph from caller to callee following only tail call edges and check that the callee appears nowhere else in that graph (eg: in the simple case if the callee is tail recursive - you don't know whether you're in the first call or the Nth call, so you can't reliably use the call site parameters). > > It seems like a misuse of DW_OP_call_origin & I think it'd be more suitable > > to use a distinct attribute for this since it has different semantics from > > the standard attribute. > > That seems fair to me, and it looks like Carlos has taken this on board and > added a new attribute. Yep yep - wondering if we can avoid that/have some more general solution that solves some other problems too. > > (but, yeah, I'd consider NOP sleds as a more general solution - they have > > some code size impact, but maybe it's small enough to keep them even in > > fully optimized builds) > > Some of our users are quite constrained by code size (and it seems reasonable > to assume it's likely to be those users that are using ICF), so that could > potentially be an issue. Are there performance implications? Assuming that > the call sites also get patched to the unique nops (otherwise the linker > needs to potentially do special things for debug info, which sounds like it's > not ideal, unless I'm not seeing the full picture?). Yep, it'd have non-zero size and performance implications - I don't know how small the epsilon is & whether it's below noise/below some acceptable cost threshold. https://github.com/llvm/llvm-project/pull/167666 _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
