On Tue, 17 Nov 2020 20:54:24 -0800
Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote:

> >  extern int
> > @@ -310,7 +312,12 @@ static inline struct tracepoint 
> > *tracepoint_ptr_deref(tracepoint_ptr_t *p)
> >                 do {                                                    \
> >                         it_func = (it_func_ptr)->func;                  \
> >                         __data = (it_func_ptr)->data;                   \
> > -                       ((void(*)(void *, proto))(it_func))(__data, args); \
> > +                       /*                                              \
> > +                        * Removed functions that couldn't be allocated \
> > +                        * are replaced with TRACEPOINT_STUB.           \
> > +                        */                                             \
> > +                       if (likely(it_func != TRACEPOINT_STUB))         \
> > +                               ((void(*)(void *, proto))(it_func))(__data, 
> > args); \  
> 
> I think you're overreacting to the problem.

I will disagree with that.

> Adding run-time check to extremely unlikely problem seems wasteful.

Show me a real benchmark that you can notice a problem here. I bet that
check is even within the noise of calling an indirect function. Especially
on a machine with retpolines.

> 99.9% of the time allocate_probes() will do kmalloc from slab of small
> objects.
> If that slab is out of memory it means it cannot allocate a single page.
> In such case so many things will be failing to alloc that system
> is unlikely operational. oom should have triggered long ago.
> Imo Matt's approach to add __GFP_NOFAIL to allocate_probes()

Looking at the GFP_NOFAIL comment:

 * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
 * cannot handle allocation failures. The allocation could block
 * indefinitely but will never return with failure. Testing for
 * failure is pointless.
 * New users should be evaluated carefully (and the flag should be
 * used only when there is no reasonable failure policy) but it is
 * definitely preferable to use the flag rather than opencode endless
 * loop around allocator.

I realized I made a mistake in my patch for using it, as my patch is a
failure policy. It looks like something we want to avoid in general.

Thanks, I'll send a v3 that removes it.

> when it's called from func_remove() is much better.
> The error was reported by syzbot that was using
> memory fault injections. ENOMEM in allocate_probes() was
> never seen in real life and highly unlikely will ever be seen.

And the biggest thing you are missing here, is that if you are running on a
machine that has static calls, this code is never hit unless you have more
than one callback on a single tracepoint. That's because when there's only
one callback, it gets called directly, and this loop is not involved.

-- Steve

Reply via email to