* Peter Zijlstra <pet...@infradead.org> wrote:
> On Sun, Nov 11, 2018 at 08:53:07PM +0000, Nadav Amit wrote: > > > >> + /* > > >> + * The lock is not really needed, but this allows to avoid > > >> open-coding. > > >> + */ > > >> + ptep = get_locked_pte(poking_mm, poking_addr, &ptl); > > >> + > > >> + /* > > >> + * If we failed to allocate a PTE, fail. This should *never* > > >> happen, > > >> + * since we preallocate the PTE. > > >> + */ > > >> + if (WARN_ON_ONCE(!ptep)) > > >> + goto out; > > > > > > Since we hard rely on init getting that right; can't we simply get rid > > > of this? > > > > This is a repeated complaint of yours, which I do not feel comfortable with. > > One day someone will run some static analysis tool and start finding that > > all these checks are missing. > > > > The question is why do you care about them. > > Mostly because they should not be happening, ever. Since get_locked_pte() might in principle return NULL, it's an entirely routine pattern to check the return for NULL. This will save reviewer time in the future. > [...] And if they happen, there really isn't anything sensible we can > do about it. Warning about it is 'something', even if we cash afterwards, isn't it? > > If it is because they affect the > > generated code and make it less efficient, I can fully understand and > > perhaps > > we should have something like PARANOID_WARN_ON_ONCE() which compiles into > > nothing > > unless a certain debug option is set. > > > > If it is about the way the source code looks - I guess it doesn’t sore my > > eyes as hard as some other stuff, and I cannot do much about it (other than > > removing it as you asked). > > And yes on the above two points. It adds both runtime overhead (albeit > trivially small) and code complexity. It's trivially small cycle level overhead in something that will be burdened by two TLB flushes anyway is is utterly slow. > > >> +out: > > >> + if (memcmp(addr, opcode, len)) > > >> + r = -EFAULT; > > > > > > How could this ever fail? And how can we reliably recover from that? > > > > This code has been there before (with slightly uglier code). Before this > > patch, a BUG_ON() was used here. However, I noticed that kgdb actually > > checks that text_poke() succeeded after calling it and gracefully fail. > > However, this was useless, since text_poke() would panic before kgdb gets > > the chance to do anything (see patch 7). > > Yes, I know it was there before, and I did see kgdb do it too. But aside > from that out-label case, which we also should never hit, how can we > realistically ever fail that memcmp()? > > If we fail here, something is _seriously_ buggered. So wouldn't it be better to just document and verify our assumptions of this non-trivial code by using return values intelligently? I mean, being worried about overhead would be legitimate in the syscall entry code. In code patching code, which is essentially a slow path, we should be much more worried about *robustness*. Thanks, Ingo