On 2017-08-11 02:36, Richard Guy Briggs wrote:
> On 2017-06-28 15:03, Paul Moore wrote:
> > On Tue, Jun 27, 2017 at 5:11 PM, Richard Guy Briggs <r...@redhat.com> wrote:
> > > On 2017-05-30 17:21, Paul Moore wrote:
> > >> On Tue, Apr 4, 2017 at 5:21 AM, Richard Guy Briggs <r...@redhat.com> 
> > >> wrote:
> > 
> > ...
> > 
> > >> > diff --git a/kernel/audit.c b/kernel/audit.c
> > >> > index 25dd70a..7d83c5a 100644
> > >> > --- a/kernel/audit.c
> > >> > +++ b/kernel/audit.c
> > >> > @@ -66,6 +66,7 @@
> > >> >  #include <linux/freezer.h>
> > >> >  #include <linux/pid_namespace.h>
> > >> >  #include <net/netns/generic.h>
> > >> > +#include <linux/dcache.h>
> > >> >
> > >> >  #include "audit.h"
> > >> >
> > >> > @@ -1884,6 +1885,10 @@ void audit_copy_inode(struct audit_names *name, 
> > >> > const struct dentry *dentry,
> > >> >         name->gid   = inode->i_gid;
> > >> >         name->rdev  = inode->i_rdev;
> > >> >         security_inode_getsecid(inode, &name->osid);
> > >> > +       if (name->dentry) {
> > >> > +               dput(name->dentry);
> > >> > +               name->dentry = NULL;
> > >> > +       }
> > >>
> > >> Out of curiosity, what terrible things happen if we take a reference
> > >> to a non-NULL dentry passed to audit_copy_inode() and store it in
> > >> name->dentry?  Does performance tank?
> > >
> > > Interesting idea.  Right now it is optimized to only take a reference to
> > > the dentry's parent dentry in the case we're handed an anonymous entry.
> > > Most of the time it will never be used even though we invest in the
> > > overhead of taking a reference count.  Besides, __audit_inode_child()
> > > hands in a NULL for the dentry parameter to audit_copy_inode().
> > 
> > [NOTE: audit_copy_inode() hands a NULL dentry only in the anonymous parent 
> > case]
> > 
> > I believe I was just thinking of less conditional handling, especially
> > when reference counts are concerned.  I'm just trying to limit future
> > headaches, but I suspect the perf cost would be problematic, and as
> > you point out, there is no *need* for the majority of cases.
> > 
> > Looking at this again today, why would we want to clear name->dentry
> > in audit_copy_inode() if it is already set?  Does that ever happen?
> > I'm not sure it does ...
> 
> Ok, I just tried re-writing that part to dput that dentry only for the
> child of an anonymous parent rather than whenever audit_copy_inode() was
> called and I ended up with a:
> 
>       BUG: Dentry ffff8800338f1dc0{i=369a,n=stderr}  still in use (1) 
> [unmount of tmpfs tmpfs]
>       WARNING: CPU: 0 PID: 387 at fs/dcache.c:1445 umount_check+0x99/0xc0
> 
> This was after rebasing on a recent audit-next based on 4.11 rather than
> the previous based on a 4.8 audit-next.  Something else changed in the
> kernel or kernel config between these two points.  I was getting some
> "INFO: suspicious RCU usage." warnings in idmap for NFS on the earlier
> kernel and that is no longer happenning, and I'm no longer getting any
> tracefs audit PATH entries in the more recent kernel which suggests that
> whatever was causing the anonymous parent PATH records from tracefs has
> been changed/fixed/configured-out.  This tmpfs Dentry issue happens on
> both 4.8 and 4.11 kernels.

Minor correction here: I'm still getting tracefs PATH records on 4.11,
so whatever happenned in that test appears to be a test procedure error.

> So the tmpfs is being unmounted within the time of a task that has taken
> an anonymous parent audit_name and it appears that the dput in
> audit_copy_inode() called from elsewhere had reset it to avoid
> needlessly extending the life of this dentry.
> 
> I see two obvious ways to solve this:
>       - Return to freeing this dentry from audit_copy_inode()
>       - Add tmpfs to the list of filesystems to not create PATH records
> 
> I'm really not crazy aobut this second one unless I know why tmpfs is
> generating calls with anonymous parents.
> 
> For reference, the rest of the call trace is:
>       dump_stack+0x85/0xc9
>       __warn+0xd1/0xf0
>       ? d_genocide_kill+0x40/0x40
>       warn_slowpath_null+0x1d/0x20
>       umount_check+0x99/0xc0
>       d_walk+0x10b/0x580
>       ? do_one_tree+0x26/0x40
>       do_one_tree+0x26/0x40
>       shrink_dcache_for_umount+0x5d/0xd0
>       generic_shutdown_super+0x1f/0xf0
>       kill_litter_super+0x29/0x40
>       deactivate_locked_super+0x43/0x70
>       deactivate_super+0x88/0xa0
>       cleanup_mnt+0x8e/0xe0
>       __cleanup_mnt+0x12/0x20
>       task_work_run+0x83/0xc0
>       do_exit+0x45c/0xfe0
>       ? syscall_trace_enter+0x2e4/0x400
>       do_group_exit+0x68/0xe0
>       SyS_exit_group+0x14/0x20
>       do_syscall_64+0x82/0x270
>       entry_SYSCALL64_slow_path+0x25/0x25
> 
> > > I'm assuming you are hinting at also using that dentry to compare the
> > > audit_names entry, which I think it a bad idea since there could be
> > > multiple paths to access a dentry.  I did orignially have another patch
> > > that would have tried to use that as well, which didn't seem to hurt,
> > > but I didn't think was worth upstreaming.
> > 
> > No, I wasn't thinking that, the dev/inode numbers should be sufficient
> > in those cases I believe; I'm not sure the dentry would help us here.
> > 
> > >> Also out of curiosity, why do we want to drop a dentry reference here
> > >> if one already exists?
> > >
> > > I think we want to drop a dentry reference here because this inode child
> > > could be a subsequent access to the same dentry with a full path,
> > > removing the need to cache this dentry information in the first place.
> > 
> > Related to my comment above from today ... what code path please?
> 
> So it appears that there is a code path that does free this dentry in a
> timely fashion to avoid needlessly extending its life and simply leaving
> it there until the audit_context is freed is too long.
> 
> > >> > @@ -1925,6 +1930,17 @@ void audit_log_name(struct audit_context 
> > >> > *context, struct audit_names *n,
> > >> >                         audit_log_n_untrustedstring(ab, n->name->name,
> > >> >                                                     n->name_len);
> > >> >                 }
> > >> > +       } else if (n->dentry) {
> > >> > +               char *fullpath;
> > >> > +               const char *fullpathp;
> > >> > +
> > >> > +               fullpath = kmalloc(PATH_MAX, GFP_KERNEL);
> > >> > +               if (!fullpath)
> > >> > +                       return;
> > >>
> > >> I'm wondering if there is some value in still emitting the record if
> > >> the kmalloc() fails, just with the name field set as the unset "?"
> > >> value, e.g. "name=?".  Thoughts?
> > >
> > > Possibly.  We've got much bigger problems if that happens, but this
> > > sounds like a good defensive coding approach.  I'm even tempted to call
> > > audit_panic().
> > 
> > No audit_panic().  We've still got good information that we can
> > record, e.g. dev/inode numbers; let's just print "name=?" and go on
> > our way recording the rest of the information.  This is in keeping
> > with the current audit_log_name() error handling.
> > 
> > At the very least you need to clean up here instead of just returning.
> > As the patch currently stands I believe this will end up leaking an
> > audit_buffer.
> 
> This has been fixed along with a fullpath kmalloc leak.
> 
> > paul moore
> 
> - RGB
> 
> --
> Richard Guy Briggs <r...@redhat.com>
> Sr. S/W Engineer, Kernel Security, Base Operating Systems
> Remote, Ottawa, Red Hat Canada
> IRC: rgb, SunRaycer
> Voice: +1.647.777.2635, Internal: (81) 32635
> 
> --
> Linux-audit mailing list
> linux-au...@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-audit

- RGB

--
Richard Guy Briggs <r...@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

Reply via email to