Hi Szabolcs, Answers from a linux dev perspective below.
On Mon, Jan 20, 2020 at 10:53:33AM +0000, Szabolcs Nagy wrote: > On 19/01/2020 08:53, Fāng-ruì Sòng via gcc-patches wrote: > > It'd be great to have some tests, e.g. > > > > 1. -fpatchable-function-entry=0 -mbranch-protection=bti > > 2. -fpatchable-function-entry=2 -mbranch-protection=bti > > > > I have updated clang to emit `.Lfunc_begin0: bti c; nop; nop` for case 2. > > The __patchable_function_entries entry points to .Lfunc_begin0 (bti c). > > > > (The change is not included in the llvm 10.0 branch.) > > i have to ask some linux developers which way they prefer: > > e.g. -fpatchable-function-entry=3,1 is > > .section __patchable_function_entries > .8byte .Lpatch > .text > .Lpatch: > nop > func: > nop > nop > ... > > with bti the code will be emitted as: > > .Lpatch: > nop > func: > bti c > nop > nop > ... That looks good to me. > but e.g. -fpatchable-function-entry=2,0 has two reasonable > approaches with bti: > > (a) > > func: > .Lpatch: > bti c > nop > nop > ... > > (b) > > func: > bti c > .Lpatch: > nop > nop > ... I had assumed (b); that means that .Lpatch consistently points to the first NOP. To my mental model, that seems more consistent than (a). However, I can work with either so long as it's consistent. > i think (a) is more consistent across fancy N,M settings > (bti is always included into the patch area, user needs > to know to skip it), but (b) is more compatible with > existing usage (M=0 is i believe the common setting and > with that or with M=N the patching code does not need to > know about bti, existing patching code works unmodified). > > current llvm fix does (a), proposed gcc fix does (b), > i guess we have to pick one. > > (solution (a) is a bit messier in gcc, because currently > there is no target hook between the emission of .Lpatch > and the nops, i avoided refactoring that code to get a > backend only fix that is easy to backport, but (a) is > possible to do with a bit more changes.) As above, my weak preference is (b), but I can work with either. I just need the behaviour to be consistent. Was there a rationale for LLVM choosing (a) rather than (b), e.g. was that also ease of implementation? If there isn't a rationale otherwise, and if LLVM could also do (b), that would be nice from my PoV. How big is "a bit more changes" for GCC? Thanks, Mark.