Re: [PATCH -next] cred: conditionally declare groups-related functions
On Fri, Jun 22, 2018 at 3:47 AM Ondrej Mosnacek wrote: > št 21. 6. 2018 o 23:17 Paul Moore napísal(a): > > On Thu, Jun 21, 2018 at 4:33 AM Ondrej Mosnacek wrote: > > > The groups-related functions declared in include/linux/cred.h are > > > defined in kernel/groups.c, which is compiled only when > > > CONFIG_MULTIUSER=y. Move all these function declarations under #ifdef > > > CONFIG_MULTIUSER to help avoid accidental usage in contexts where > > > CONFIG_MULTIUSER might be disabled. > > > > > > This patch also adds a fallback for groups_search(). Currently this > > > function is only called from kernel/groups.c itself and > > > keys/permissions.c, which depends on CONFIG_MULTIUSER. However, the > > > audit subsystem (which does not depend on CONFIG_MULTIUSER) calls this > > > function in -next, so the fallback will be needed to avoid compilation > > > errors or ugly workarounds. > > > > > > See also: > > > https://lkml.org/lkml/2018/6/20/670 > > > https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git/commit/?h=next&id=af85d1772e31fed34165a1b3decef340cf4080c0 > > > > > > Reported-by: Randy Dunlap > > > Signed-off-by: Ondrej Mosnacek > > > --- > > > include/linux/cred.h | 16 +++- > > > 1 file changed, 11 insertions(+), 5 deletions(-) > > > > > > diff --git a/include/linux/cred.h b/include/linux/cred.h > > > index 631286535d0f..8917768453cc 100644 > > > --- a/include/linux/cred.h > > > +++ b/include/linux/cred.h > > > @@ -65,6 +65,12 @@ extern void groups_free(struct group_info *); > > > > > > extern int in_group_p(kgid_t); > > > extern int in_egroup_p(kgid_t); > > > + > > > +extern int set_current_groups(struct group_info *); > > > +extern void set_groups(struct cred *, struct group_info *); > > > +extern int groups_search(const struct group_info *, kgid_t); > > > +extern bool may_setgroups(void); > > > +extern void groups_sort(struct group_info *); > > > #else > > > static inline void groups_free(struct group_info *group_info) > > > { > > > @@ -78,12 +84,12 @@ static inline int in_egroup_p(kgid_t grp) > > > { > > > return 1; > > > } > > > + > > > +static inline int groups_search(const struct group_info *group_info, > > > kgid_t grp) > > > +{ > > > + return 0; > > > > Is this the right fallback value? If CONFIG_MULTIUSER is disabled, > > wouldn't we always want to indicate a group match? The in_group_p() > > and in_egroup_p() dummy functions would seem to indicate that is the > > correct behavior ... > > Hm, indeed this is a bit tricky and I'm guilty of not noticing this... > > The way I see it (now that I though about it a little), there are > basically two possible semantics of groups_search(): > 1. as an (auxiliary) permissions checking function (like > in_[e]group_p()) -- in this case we would expect the same return value > as in_group_p(), i.e. 1. > 2. as a function that simply checks if a group is contained in a list > of groups (taken from a cred struct) -- in this case we would expect > it to return 0 in single-user mode, since there will be always no > supplemental groups set for any task (if I understand it right). > > I guess no matter which semantic we pick, we might confuse someone > expecting the other one, so I would suggest dropping this patch (or at > least the fallbacks for groups_search) and explicitly handle the > single-user case in audit. > > We should probably default to 1 in audit anyway, because the original > code used in_[e]group_p(). Even though 0 would seem more logical to > me, comparing GIDs doesn't really make sense in single-user mode > anyway, so keeping the legacy behavior will be safer. (In fact now > that I think of it, having audit enabled (or even compiled) in > single-user mode does not make much sense either... maybe we should > just make CONFIG_AUDIT depend on CONFIG_MULTIUSER...). If CONFIG_MULTIUSER is unset then basically everything runs as root:root, there is no user separation anymore. With that in mind it seems the proper behavior is to always return 1 when groups_search() is called. Please make the adjustment and resubmit your patch. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH -next v2] cred: conditionally declare groups-related functions
On Mon, Jun 25, 2018 at 2:56 AM Ondrej Mosnacek wrote: > The groups-related functions declared in include/linux/cred.h are > defined in kernel/groups.c, which is compiled only when > CONFIG_MULTIUSER=y. Move all these function declarations under #ifdef > CONFIG_MULTIUSER to help avoid accidental usage in contexts where > CONFIG_MULTIUSER might be disabled. > > This patch also adds a fallback for groups_search(). Currently this > function is only called from kernel/groups.c itself and > keys/permissions.c, which depends on CONFIG_MULTIUSER. However, the > audit subsystem (which does not depend on CONFIG_MULTIUSER) calls this > function in -next, so the fallback will be needed to avoid compilation > errors or ugly workarounds. > > See also: > https://lkml.org/lkml/2018/6/20/670 > https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git/commit/?h=next&id=af85d1772e31fed34165a1b3decef340cf4080c0 > > Reported-by: Randy Dunlap > Signed-off-by: Ondrej Mosnacek > --- > include/linux/cred.h | 15 ++- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/include/linux/cred.h b/include/linux/cred.h > index 631286535d0f..7eed6101c791 100644 > --- a/include/linux/cred.h > +++ b/include/linux/cred.h > @@ -65,6 +65,12 @@ extern void groups_free(struct group_info *); > > extern int in_group_p(kgid_t); > extern int in_egroup_p(kgid_t); > +extern int groups_search(const struct group_info *, kgid_t); > + > +extern int set_current_groups(struct group_info *); > +extern void set_groups(struct cred *, struct group_info *); > +extern bool may_setgroups(void); > +extern void groups_sort(struct group_info *); > #else > static inline void groups_free(struct group_info *group_info) > { > @@ -78,12 +84,11 @@ static inline int in_egroup_p(kgid_t grp) > { > return 1; > } > +static inline int groups_search(const struct group_info *group_info, kgid_t > grp) > +{ > + return 1; > +} > #endif > -extern int set_current_groups(struct group_info *); > -extern void set_groups(struct cred *, struct group_info *); > -extern int groups_search(const struct group_info *, kgid_t); > -extern bool may_setgroups(void); > -extern void groups_sort(struct group_info *); I was going through the list of functions to make sure it was safe to move all of these under CONFIG_MULTIUSER and I believe there may be an issue with key_task_permission() and groups_search(), specifically one can disable CONFIG_MULTIUSER yet keep CONFIG_KEYS enabled. I'm going to guess that the fix is going to be to have CONFIG_KEYS depend on CONFIG_MULTIUSER, but I would talk with the keys folks (I've CC'd David Howells) to see what they would prefer. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] kernel: audit_tree: Fix a sleep-in-atomic-context bug
On Mon, Jun 25, 2018 at 5:23 AM Jan Kara wrote: > On Fri 22-06-18 14:56:09, Paul Moore wrote: > > On Fri, Jun 22, 2018 at 5:23 AM Jan Kara wrote: > > > On Wed 20-06-18 21:29:12, Matthew Wilcox wrote: > > > > On Thu, Jun 21, 2018 at 11:32:45AM +0800, Jia-Ju Bai wrote: > > > > > The kernel may sleep with holding a spinlock. > > > > > The function call paths (from bottom to top) in Linux-4.16.7 are: > > > > > > > > > > [FUNC] kmem_cache_alloc(GFP_KERNEL) > > > > > fs/notify/mark.c, 439: > > > > > kmem_cache_alloc in fsnotify_attach_connector_to_object > > > > > fs/notify/mark.c, 520: > > > > > fsnotify_attach_connector_to_object in > > > > > fsnotify_add_mark_list > > > > > fs/notify/mark.c, 590: > > > > > fsnotify_add_mark_list in fsnotify_add_mark_locked > > > > > kernel/audit_tree.c, 437: > > > > > fsnotify_add_mark_locked in tag_chunk > > > > > kernel/audit_tree.c, 423: > > > > > spin_lock in tag_chunk > > > > > > > > There are several locks here; your report would be improved by saying > > > > which one is the problem. I'm assuming it's old_entry->lock. > > > > > > > > spin_lock(&old_entry->lock); > > > > ... > > > > if (fsnotify_add_inode_mark_locked(chunk_entry, > > > > old_entry->connector->inode, 1)) { > > > > ... > > > > return fsnotify_add_mark_locked(mark, inode, NULL, allow_dups); > > > > ... > > > > ret = fsnotify_add_mark_list(mark, inode, mnt, allow_dups); > > > > ... > > > > if (inode) > > > > connp = &inode->i_fsnotify_marks; > > > > conn = fsnotify_grab_connector(connp); > > > > if (!conn) { > > > > err = fsnotify_attach_connector_to_object(connp, inode, > > > > mnt); > > > > > > > > It seems to me that this is safe because old_entry is looked up from > > > > fsnotify_find_mark, and it can't be removed while its lock is held. > > > > Therefore there's always a 'conn' returned from > > > > fsnotify_grab_connector(), > > > > and so this path will never be taken. > > > > > > > > But this code path is confusing to me, and I could be wrong. Jan, > > > > please > > > > confirm my analysis is correct? > > > > > > Yes, you are correct. The presence of another mark in the list (and the > > > fact we pin it there using refcount & mark_mutex) guarantees we won't need > > > to allocate the connector. I agree the audit code's use of fsnotify would > > > deserve some cleanup. > > > > I'm always open to suggestions and patches (hint, hint) from the > > fsnotify experts ;) > > Yeah, I was looking into it on Friday and today :). Currently I've got a > bit stuck because I think I've found some races in audit_tree code and I > haven't yet decided how to fix them. E.g. am I right the following can > happen? > > CPU1CPU2 > tag_chunk(inode, tree1) tag_chunk(inode, tree2) > old_entry = fsnotify_find_mark(); old_entry = fsnotify_find_mark(); > old = container_of(old_entry); old = container_of(old_entry); > chunk = alloc_chunk(old->count + 1);chunk = alloc_chunk(old->count + 1); > mutex_lock(&group->mark_mutex); > adds new mark > replaces chunk > old->dead = 1; > mutex_unlock(&group->mark_mutex); > mutex_lock(&group->mark_mutex); > if (!(old_entry->flags & > FSNOTIFY_MARK_FLAG_ATTACHED)) > { > Check fails as old_entry is > not yet destroyed > adds new mark > replaces old chunk again -> > list corruption, lost refs, ... > mutex_unlock(&group->mark_mutex); > > Generally there's a bigger problem that audit_tree code can have multiple > marks attached to one inode but only one of them is the
Re: [PATCH -next v2] cred: conditionally declare groups-related functions
On Tue, Jun 26, 2018 at 4:13 AM Ondrej Mosnacek wrote: > po 25. 6. 2018 o 23:04 Paul Moore napísal(a): > > On Mon, Jun 25, 2018 at 2:56 AM Ondrej Mosnacek wrote: > > > The groups-related functions declared in include/linux/cred.h are > > > defined in kernel/groups.c, which is compiled only when > > > CONFIG_MULTIUSER=y. Move all these function declarations under #ifdef > > > CONFIG_MULTIUSER to help avoid accidental usage in contexts where > > > CONFIG_MULTIUSER might be disabled. > > > > > > This patch also adds a fallback for groups_search(). Currently this > > > function is only called from kernel/groups.c itself and > > > keys/permissions.c, which depends on CONFIG_MULTIUSER. However, the > > > audit subsystem (which does not depend on CONFIG_MULTIUSER) calls this > > > function in -next, so the fallback will be needed to avoid compilation > > > errors or ugly workarounds. > > > > > > See also: > > > https://lkml.org/lkml/2018/6/20/670 > > > https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git/commit/?h=next&id=af85d1772e31fed34165a1b3decef340cf4080c0 > > > > > > Reported-by: Randy Dunlap > > > Signed-off-by: Ondrej Mosnacek > > > --- > > > include/linux/cred.h | 15 ++- > > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > > > diff --git a/include/linux/cred.h b/include/linux/cred.h > > > index 631286535d0f..7eed6101c791 100644 > > > --- a/include/linux/cred.h > > > +++ b/include/linux/cred.h > > > @@ -65,6 +65,12 @@ extern void groups_free(struct group_info *); > > > > > > extern int in_group_p(kgid_t); > > > extern int in_egroup_p(kgid_t); > > > +extern int groups_search(const struct group_info *, kgid_t); > > > + > > > +extern int set_current_groups(struct group_info *); > > > +extern void set_groups(struct cred *, struct group_info *); > > > +extern bool may_setgroups(void); > > > +extern void groups_sort(struct group_info *); > > > #else > > > static inline void groups_free(struct group_info *group_info) > > > { > > > @@ -78,12 +84,11 @@ static inline int in_egroup_p(kgid_t grp) > > > { > > > return 1; > > > } > > > +static inline int groups_search(const struct group_info *group_info, > > > kgid_t grp) > > > +{ > > > + return 1; > > > +} > > > #endif > > > -extern int set_current_groups(struct group_info *); > > > -extern void set_groups(struct cred *, struct group_info *); > > > -extern int groups_search(const struct group_info *, kgid_t); > > > -extern bool may_setgroups(void); > > > -extern void groups_sort(struct group_info *); > > > > I was going through the list of functions to make sure it was safe to > > move all of these under CONFIG_MULTIUSER and I believe there may be an > > issue with key_task_permission() and groups_search(), specifically one > > can disable CONFIG_MULTIUSER yet keep CONFIG_KEYS enabled. > > Oh, you're right, CONFIG_KEYS does not depend on CONFIG_MULTIUSER, for > some reason I was looking at CONFIG_SECURITY instead, which does :/ I > will need to update the commit message... > > > I'm going to guess that the fix is going to be to have CONFIG_KEYS > > depend on CONFIG_MULTIUSER, but I would talk with the keys folks (I've > > CC'd David Howells) to see what they would prefer. > > I was wondering why the security/keys/permissions.c did not give the > link error and it turns out that it is due to the definition of > __kgid_val(), which is hard-coded to 0 if CINFIG_MULTIUSER=n, so the > compiler just optimizes out most of the code in key_task_permission() > (including the groups_search() call). Thus, for permissions.c it does > not matter which fallback we choose (and since this patch always > defines groups_search(), it will not lead to compile errors in > permissions.c either). > > TL;DR: There is no need to fix anything in the keys code (beyond this > patch). I also checked the other functions and they are only used in > places that either depend on MULTIUSER or use also groups_alloc(), > which is already declared only when CONFIG_MULTIUSER=y. ... and you've already defined a dummy implementation of groups_search(); that's the important part. I clearly was in a rush while reviewing this patch. /me shakes his head -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH -next v3] cred: conditionally declare groups-related functions
On Tue, Jun 26, 2018 at 7:04 AM Ondrej Mosnacek wrote: > > The groups-related functions declared in include/linux/cred.h are > defined in kernel/groups.c, which is compiled only when > CONFIG_MULTIUSER=y. Move all these function declarations under #ifdef > CONFIG_MULTIUSER to help avoid accidental usage in contexts where > CONFIG_MULTIUSER might be disabled. > > This patch also adds a fallback for groups_search(). Currently this > function is only called from kernel/groups.c itself and > security/keys/permissions.c, where the call is (by coincidence) > optimized away in case CONFIG_MULTIUSER=n. However, the audit subsystem > (which does not depend on CONFIG_MULTIUSER) calls this function in > -next, so the fallback will be needed to avoid compilation errors or > ugly workarounds. > > See also: > https://lkml.org/lkml/2018/6/20/670 > https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git/commit/?h=next&id=af85d1772e31fed34165a1b3decef340cf4080c0 > > Reported-by: Randy Dunlap > Tested-by: Randy Dunlap > Signed-off-by: Ondrej Mosnacek > --- > > Changes: > v2->v3: commit message corrections > v1->v2: change default return value of groups_search() to 1 > > include/linux/cred.h | 15 ++- > 1 file changed, 10 insertions(+), 5 deletions(-) Merged, thanks guys. > diff --git a/include/linux/cred.h b/include/linux/cred.h > index 631286535d0f..7eed6101c791 100644 > --- a/include/linux/cred.h > +++ b/include/linux/cred.h > @@ -65,6 +65,12 @@ extern void groups_free(struct group_info *); > > extern int in_group_p(kgid_t); > extern int in_egroup_p(kgid_t); > +extern int groups_search(const struct group_info *, kgid_t); > + > +extern int set_current_groups(struct group_info *); > +extern void set_groups(struct cred *, struct group_info *); > +extern bool may_setgroups(void); > +extern void groups_sort(struct group_info *); > #else > static inline void groups_free(struct group_info *group_info) > { > @@ -78,12 +84,11 @@ static inline int in_egroup_p(kgid_t grp) > { > return 1; > } > +static inline int groups_search(const struct group_info *group_info, kgid_t > grp) > +{ > + return 1; > +} > #endif > -extern int set_current_groups(struct group_info *); > -extern void set_groups(struct cred *, struct group_info *); > -extern int groups_search(const struct group_info *, kgid_t); > -extern bool may_setgroups(void); > -extern void groups_sort(struct group_info *); > > /* > * The security context of a task > -- > 2.17.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak59 V1 1/2] audit: tree: check audit_enabled
On Thu, Jun 14, 2018 at 4:22 PM Richard Guy Briggs wrote: > > Respect the audit_enabled flag when printing tree rule config change > records. > > See: https://github.com/linux-audit/audit-kernel/issues/50 > Signed-off-by: Richard Guy Briggs > --- > kernel/audit_tree.c | 2 ++ > 1 file changed, 2 insertions(+) Merged, thanks. > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 67e6956..5e9d1e5 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -497,6 +497,8 @@ static void audit_tree_log_remove_rule(struct audit_krule > *rule) > { > struct audit_buffer *ab; > > + if (!audit_enabled) > + return; > ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > if (unlikely(!ab)) > return; > -- > 1.8.3.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak59 V1 2/2] audit: watch: simplify audit_enabled check
On Thu, Jun 14, 2018 at 4:22 PM Richard Guy Briggs wrote: > > Check the audit_enabled flag and bail immediately. This does not change > the functionality, but brings the code format in line with similar > checks in audit_tree_log_remove_rule(), audit_mark_log_rule_change(), > and elsewhere in the audit code. > > See: https://github.com/linux-audit/audit-kernel/issues/50 > Signed-off-by: Richard Guy Briggs > --- > kernel/audit_watch.c | 29 +++-- > 1 file changed, 15 insertions(+), 14 deletions(-) Merged, thanks. As a FYI for future patches, please don't use "audit: X: " as a subject line unless you are crossing subsystem boundaries. As an example, the following is okay: audit: selinux: make things more awesomer ... while this isn't something I like seeing: audit: watch: simplify audit_enabled check ... because the "watch" in this case refers to the audit watch code which is part of the audit subsystem already. > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > index f1ba889..9b4836b 100644 > --- a/kernel/audit_watch.c > +++ b/kernel/audit_watch.c > @@ -238,20 +238,21 @@ static struct audit_watch *audit_dupe_watch(struct > audit_watch *old) > > static void audit_watch_log_rule_change(struct audit_krule *r, struct > audit_watch *w, char *op) > { > - if (audit_enabled) { > - struct audit_buffer *ab; > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE); > - if (unlikely(!ab)) > - return; > - audit_log_format(ab, "auid=%u ses=%u op=%s", > -from_kuid(&init_user_ns, > audit_get_loginuid(current)), > -audit_get_sessionid(current), op); > - audit_log_format(ab, " path="); > - audit_log_untrustedstring(ab, w->path); > - audit_log_key(ab, r->filterkey); > - audit_log_format(ab, " list=%d res=1", r->listnr); > - audit_log_end(ab); > - } > + struct audit_buffer *ab; > + > + if (!audit_enabled) > + return; > + ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE); > + if (!ab) > + return; > + audit_log_format(ab, "auid=%u ses=%u op=%s", > +from_kuid(&init_user_ns, > audit_get_loginuid(current)), > +audit_get_sessionid(current), op); > + audit_log_format(ab, " path="); > + audit_log_untrustedstring(ab, w->path); > + audit_log_key(ab, r->filterkey); > + audit_log_format(ab, " list=%d res=1", r->listnr); > + audit_log_end(ab); > } > > /* Update inode info in audit rules based on filesystem event. */ > -- > 1.8.3.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] audit: fix use-after-free in audit_add_watch
On Mon, Jun 18, 2018 at 2:37 PM Richard Guy Briggs wrote: > > On 2018-06-18 15:27, Ronny Chevalier wrote: > > audit_add_watch stores locally krule->watch without taking a reference > > on watch. Then, it calls audit_add_to_parent, and uses the watch stored > > locally. > > > > Unfortunately, it is possible that audit_add_to_parent updates > > krule->watch. > > When it happens, it also drops a reference of watch which > > could free the watch. > > > > How to reproduce (with KASAN enabled): > > > > auditctl -w /etc/passwd -F success=0 -k test_passwd > > auditctl -w /etc/passwd -F success=1 -k test_passwd2 > > > > The second call to auditctl triggers the use-after-free, because > > audit_to_parent updates krule->watch to use a previous existing watch > > and drops the reference to the newly created watch. > > > > To fix the issue, we grab a reference of watch and we release it at the > > end of the function. > > Nice catch. > > This doesn't quite look right though. What if audit_get_nd() fails? > How about we put that audit_get_watch(watch) right before > audit_find_parent()? Since we end up using the watch in audit_get_nd() shouldn't we bump the refcnt in the watch before we call audit_get_nd()? Further, I think we need to bump the refcnt before we drop audit_filter_mutex so we don't end up in a race with audit_find_parent(), yes? > > Signed-off-by: Ronny Chevalier > > --- > > kernel/audit_watch.c | 9 + > > 1 file changed, 9 insertions(+) > > > > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > > index f1ba88994508..4bb8d7718fbc 100644 > > --- a/kernel/audit_watch.c > > +++ b/kernel/audit_watch.c > > @@ -421,6 +421,14 @@ int audit_add_watch(struct audit_krule *krule, struct > > list_head **list) > > > > mutex_unlock(&audit_filter_mutex); > > > > + /* > > + * When we will be calling audit_add_to_parent, krule->watch might > > have > > + * been updated and watch might have been freed. > > + * So we need to keep a reference of watch. > > + */ > > + > > + audit_get_watch(watch); > > + > > /* Avoid calling path_lookup under audit_filter_mutex. */ > > ret = audit_get_nd(watch, &parent_path); > > > > @@ -446,6 +454,7 @@ int audit_add_watch(struct audit_krule *krule, struct > > list_head **list) > > *list = &audit_inode_hash[h]; > > error: > > path_put(&parent_path); > > + audit_put_watch(watch); > > return ret; > > } > > > > - RGB > > -- > Richard Guy Briggs > Sr. S/W Engineer, Kernel Security, Base Operating Systems > Remote, Ottawa, Red Hat Canada > IRC: rgb, SunRaycer > Voice: +1.647.777.2635, Internal: (81) 32635 > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 1/6] audit: give a clue what CONFIG_CHANGE op was involved
On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs wrote: > The failure to add an audit rule due to audit locked gives no clue > what CONFIG_CHANGE operation failed. > Similarly the set operation is the only other operation that doesn't > give the "op=" field to indicate the action. > All other CONFIG_CHANGE records include an op= field to give a clue as > to what sort of configuration change is being executed. > > Since these are the only CONFIG_CHANGE records that that do not have an > op= field, add them to bring them in line with the rest. Normally this would be an immediate reject because this patch inserts a field into an existing record, but the CONFIG_CHANGE record is so variable (supposedly bad in its own right) that I don't this really matters. With that out of the way, I think this patch is fine, but I don't think it is complete. At the very least there is another CONFIG_CHANGE record in audit_watch_log_rule_change() that doesn't appear to include an "op" field. If we want to make sure we have an "op" field in every CONFIG_CHANGE record, let's actually add them all :) There appears to be another one in audit_mark_log_rule_change() ... and one more in audit_receive_msg(). There may be more. > Old records: > type=CONFIG_CHANGE msg=audit(1519812997.781:374): pid=610 uid=0 auid=0 ses=1 > subj=... audit_enabled=2 res=0 > type=CONFIG_CHANGE msg=audit(2018-06-14 14:55:04.507:47) : audit_enabled=1 > old=1 auid=unset ses=unset subj=... res=yes > > New records: > type=CONFIG_CHANGE msg=audit(1520958477.855:100): pid=610 uid=0 auid=0 ses=1 > subj=... op=add_rule audit_enabled=2 res=0 > > type=CONFIG_CHANGE msg=audit(2018-06-14 14:55:04.507:47) : op=set > audit_enabled=1 old=1 auid=unset ses=unset subj=... res=yes > > See: https://github.com/linux-audit/audit-kernel/issues/59 > Signed-off-by: Richard Guy Briggs > --- > kernel/audit.c | 6 -- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index e7478cb..ad54339 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -403,7 +403,7 @@ static int audit_log_config_change(char *function_name, > u32 new, u32 old, > ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > if (unlikely(!ab)) > return rc; > - audit_log_format(ab, "%s=%u old=%u", function_name, new, old); > + audit_log_format(ab, "op=set %s=%u old=%u", function_name, new, old); > audit_log_session_info(ab); > rc = audit_log_task_context(ab); > if (rc) > @@ -1365,7 +1365,9 @@ static int audit_receive_msg(struct sk_buff *skb, > struct nlmsghdr *nlh) > return -EINVAL; > if (audit_enabled == AUDIT_LOCKED) { > audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE); > - audit_log_format(ab, " audit_enabled=%d res=0", > audit_enabled); > + audit_log_format(ab, " op=%s_rule audit_enabled=%d > res=0", > + msg_type == AUDIT_ADD_RULE ? "add" : > "remove", > +audit_enabled); > audit_log_end(ab); > return -EPERM; > } > -- > 1.8.3.1 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 2/6] audit: add syscall information to CONFIG_CHANGE records
On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs wrote: > > Tie syscall information to all CONFIG_CHANGE calls since they are all a > result of user actions. > > See: https://github.com/linux-audit/audit-kernel/issues/59 > See: https://github.com/linux-audit/audit-kernel/issues/50 > Signed-off-by: Richard Guy Briggs > --- > kernel/audit.c | 4 ++-- > kernel/audit_fsnotify.c | 2 +- > kernel/audit_tree.c | 2 +- > kernel/audit_watch.c| 2 +- > kernel/auditfilter.c| 2 +- > 5 files changed, 6 insertions(+), 6 deletions(-) Merged, thanks. > diff --git a/kernel/audit.c b/kernel/audit.c > index ad54339..e469234 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -400,7 +400,7 @@ static int audit_log_config_change(char *function_name, > u32 new, u32 old, > struct audit_buffer *ab; > int rc = 0; > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > + ab = audit_log_start(audit_context(), GFP_KERNEL, > AUDIT_CONFIG_CHANGE); > if (unlikely(!ab)) > return rc; > audit_log_format(ab, "op=set %s=%u old=%u", function_name, new, old); > @@ -1067,7 +1067,7 @@ static void audit_log_common_recv_msg(struct > audit_buffer **ab, u16 msg_type) > return; > } > > - *ab = audit_log_start(NULL, GFP_KERNEL, msg_type); > + *ab = audit_log_start(audit_context(), GFP_KERNEL, msg_type); > if (unlikely(!*ab)) > return; > audit_log_format(*ab, "pid=%d uid=%u", pid, uid); > diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c > index 52f368b..1640eb6 100644 > --- a/kernel/audit_fsnotify.c > +++ b/kernel/audit_fsnotify.c > @@ -127,7 +127,7 @@ static void audit_mark_log_rule_change(struct > audit_fsnotify_mark *audit_mark, c > > if (!audit_enabled) > return; > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE); > + ab = audit_log_start(audit_context(), GFP_NOFS, AUDIT_CONFIG_CHANGE); > if (unlikely(!ab)) > return; > audit_log_format(ab, "auid=%u ses=%u op=%s", > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 5e9d1e5..a01b9da 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -499,7 +499,7 @@ static void audit_tree_log_remove_rule(struct audit_krule > *rule) > > if (!audit_enabled) > return; > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > + ab = audit_log_start(audit_context(), GFP_KERNEL, > AUDIT_CONFIG_CHANGE); > if (unlikely(!ab)) > return; > audit_log_format(ab, "op=remove_rule"); > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > index 9b4836b..da2978b 100644 > --- a/kernel/audit_watch.c > +++ b/kernel/audit_watch.c > @@ -242,7 +242,7 @@ static void audit_watch_log_rule_change(struct > audit_krule *r, struct audit_watc > > if (!audit_enabled) > return; > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE); > + ab = audit_log_start(audit_context(), GFP_NOFS, AUDIT_CONFIG_CHANGE); > if (!ab) > return; > audit_log_format(ab, "auid=%u ses=%u op=%s", > diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c > index eaa3201..6e19acb 100644 > --- a/kernel/auditfilter.c > +++ b/kernel/auditfilter.c > @@ -1093,7 +1093,7 @@ static void audit_log_rule_change(char *action, struct > audit_krule *rule, int re > if (!audit_enabled) > return; > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > + ab = audit_log_start(audit_context(), GFP_KERNEL, > AUDIT_CONFIG_CHANGE); > if (!ab) > return; > audit_log_session_info(ab); > -- > 1.8.3.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 2/6] audit: add syscall information to CONFIG_CHANGE records
On Thu, Jun 28, 2018 at 5:47 PM Paul Moore wrote: > > On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs wrote: > > > > Tie syscall information to all CONFIG_CHANGE calls since they are all a > > result of user actions. > > > > See: https://github.com/linux-audit/audit-kernel/issues/59 > > See: https://github.com/linux-audit/audit-kernel/issues/50 > > Signed-off-by: Richard Guy Briggs > > --- > > kernel/audit.c | 4 ++-- > > kernel/audit_fsnotify.c | 2 +- > > kernel/audit_tree.c | 2 +- > > kernel/audit_watch.c| 2 +- > > kernel/auditfilter.c| 2 +- > > 5 files changed, 6 insertions(+), 6 deletions(-) > > Merged, thanks. Actually, I take that back (good thing I hadn't pushed yet). Why don't you squash this patch with 3/6 so that a bisect or cherry-pick backport doesn't end up splitting 2/6 and 3/6 and causing a regression with the AUDIT_USER_* records. > > diff --git a/kernel/audit.c b/kernel/audit.c > > index ad54339..e469234 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -400,7 +400,7 @@ static int audit_log_config_change(char *function_name, > > u32 new, u32 old, > > struct audit_buffer *ab; > > int rc = 0; > > > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > > + ab = audit_log_start(audit_context(), GFP_KERNEL, > > AUDIT_CONFIG_CHANGE); > > if (unlikely(!ab)) > > return rc; > > audit_log_format(ab, "op=set %s=%u old=%u", function_name, new, > > old); > > @@ -1067,7 +1067,7 @@ static void audit_log_common_recv_msg(struct > > audit_buffer **ab, u16 msg_type) > > return; > > } > > > > - *ab = audit_log_start(NULL, GFP_KERNEL, msg_type); > > + *ab = audit_log_start(audit_context(), GFP_KERNEL, msg_type); > > if (unlikely(!*ab)) > > return; > > audit_log_format(*ab, "pid=%d uid=%u", pid, uid); > > diff --git a/kernel/audit_fsnotify.c b/kernel/audit_fsnotify.c > > index 52f368b..1640eb6 100644 > > --- a/kernel/audit_fsnotify.c > > +++ b/kernel/audit_fsnotify.c > > @@ -127,7 +127,7 @@ static void audit_mark_log_rule_change(struct > > audit_fsnotify_mark *audit_mark, c > > > > if (!audit_enabled) > > return; > > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE); > > + ab = audit_log_start(audit_context(), GFP_NOFS, > > AUDIT_CONFIG_CHANGE); > > if (unlikely(!ab)) > > return; > > audit_log_format(ab, "auid=%u ses=%u op=%s", > > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > > index 5e9d1e5..a01b9da 100644 > > --- a/kernel/audit_tree.c > > +++ b/kernel/audit_tree.c > > @@ -499,7 +499,7 @@ static void audit_tree_log_remove_rule(struct > > audit_krule *rule) > > > > if (!audit_enabled) > > return; > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > > + ab = audit_log_start(audit_context(), GFP_KERNEL, > > AUDIT_CONFIG_CHANGE); > > if (unlikely(!ab)) > > return; > > audit_log_format(ab, "op=remove_rule"); > > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > > index 9b4836b..da2978b 100644 > > --- a/kernel/audit_watch.c > > +++ b/kernel/audit_watch.c > > @@ -242,7 +242,7 @@ static void audit_watch_log_rule_change(struct > > audit_krule *r, struct audit_watc > > > > if (!audit_enabled) > > return; > > - ab = audit_log_start(NULL, GFP_NOFS, AUDIT_CONFIG_CHANGE); > > + ab = audit_log_start(audit_context(), GFP_NOFS, > > AUDIT_CONFIG_CHANGE); > > if (!ab) > > return; > > audit_log_format(ab, "auid=%u ses=%u op=%s", > > diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c > > index eaa3201..6e19acb 100644 > > --- a/kernel/auditfilter.c > > +++ b/kernel/auditfilter.c > > @@ -1093,7 +1093,7 @@ static void audit_log_rule_change(char *action, > > struct audit_krule *rule, int re > > if (!audit_enabled) > > return; > > > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > > + ab = audit_log_start(audit_context(), GFP_KERNEL, > > AUDIT_CONFIG_CHANGE); > > if (!ab) > > return; > > audit_log_session_info(ab); > > -- > > 1.8.3.1 > > > > > -- > paul moore > www.paul-moore.com -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 3/6] audit: exclude user records from syscall context
On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs wrote: > Since the function audit_log_common_recv_msg() is shared by a number of > AUDIT_CONFIG_CHANGE and the entire range of AUDIT_USER_* record types, > and since the AUDIT_CONFIG_CHANGE message type has been converted to a > syscall accompanied record type, special-case the AUDIT_USER_* range of > messages so they remain standalone records. > > See: https://github.com/linux-audit/audit-kernel/issues/59 > Signed-off-by: Richard Guy Briggs > --- > kernel/audit.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) I think this is fine, but see my previous comment about combining 2/6 and 3/6 as a safety measure. > diff --git a/kernel/audit.c b/kernel/audit.c > index e469234..c8c2efc 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1057,7 +1057,8 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 > msg_type) > return err; > } > > -static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type) > +static void __audit_log_common_recv_msg(struct audit_context *context, > + struct audit_buffer **ab, u16 > msg_type) > { > uid_t uid = from_kuid(&init_user_ns, current_uid()); > pid_t pid = task_tgid_nr(current); > @@ -1067,7 +1068,7 @@ static void audit_log_common_recv_msg(struct > audit_buffer **ab, u16 msg_type) > return; > } > > - *ab = audit_log_start(audit_context(), GFP_KERNEL, msg_type); > + *ab = audit_log_start(context, GFP_KERNEL, msg_type); > if (unlikely(!*ab)) > return; > audit_log_format(*ab, "pid=%d uid=%u", pid, uid); > @@ -1075,6 +1076,11 @@ static void audit_log_common_recv_msg(struct > audit_buffer **ab, u16 msg_type) > audit_log_task_context(*ab); > } > > +static inline void audit_log_common_recv_msg(struct audit_buffer **ab, u16 > msg_type) > +{ > + __audit_log_common_recv_msg(audit_context(), ab, msg_type); > +} > + > int is_audit_feature_set(int i) > { > return af.features & AUDIT_FEATURE_TO_MASK(i); > @@ -1341,7 +1347,7 @@ static int audit_receive_msg(struct sk_buff *skb, > struct nlmsghdr *nlh) > if (err) > break; > } > - audit_log_common_recv_msg(&ab, msg_type); > + __audit_log_common_recv_msg(NULL, &ab, msg_type); > if (msg_type != AUDIT_USER_TTY) > audit_log_format(ab, " msg='%.*s'", > AUDIT_MESSAGE_TEXT_MAX, > -- > 1.8.3.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 4/6] audit: hand taken context to audit_kill_trees for syscall logging
list_del_init(&victim->list); > > mutex_unlock(&audit_filter_mutex); > @@ -972,7 +974,7 @@ static void evict_chunk(struct audit_chunk *chunk) > list_del_init(&owner->same_root); > spin_unlock(&hash_lock); > if (!postponed) { > - kill_rules(owner); > + kill_rules(audit_context(), owner); > list_move(&owner->list, &prune_list); > need_prune = 1; > } else { > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index ceb1c45..2590c9e 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -1490,7 +1490,7 @@ void __audit_free(struct task_struct *tsk) > if (context->in_syscall && context->current_state == > AUDIT_RECORD_CONTEXT) > audit_log_exit(context, tsk); > if (!list_empty(&context->killed_trees)) > - audit_kill_trees(&context->killed_trees); > + audit_kill_trees(context); See my comment below about the ordering of audit_kill_trees() and audit_log_exit(). > audit_free_context(context); > } > @@ -1577,7 +1577,7 @@ void __audit_syscall_exit(int success, long return_code) > context->prio = context->state == AUDIT_RECORD_CONTEXT ? ~0ULL : 0; > > if (!list_empty(&context->killed_trees)) > - audit_kill_trees(&context->killed_trees); > + audit_kill_trees(context); I wonder if we should move the kill_trees if-block above the audit_log_exit() block so that any records that are emitted will be before the SYSCALL record. I didn't chase down all the code paths, but it seems like it should be safe, no? > audit_free_names(context); > unroll_tree_refs(context, NULL, 0); > -- > 1.8.3.1 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 5/6] audit: move EOE record after kill_trees for exit/free
On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs wrote: > The EOE record was being issued prior to the pruning of the killed_tree > list. > > Move the EOE record creation out of audit_log_exit() and into its > callers __audit_free() and __audit_syscall_exit() so that > audit_kill_trees() can be called prior to the EOE record creation and > any purged trees CONFIG_CHANGE records included in the syscall record > event. > > See: https://github.com/linux-audit/audit-kernel/issues/50 > See: https://github.com/linux-audit/audit-kernel/issues/59 > Signed-off-by: Richard Guy Briggs > --- > kernel/auditsc.c | 24 +--- > 1 file changed, 17 insertions(+), 7 deletions(-) See my comments in 4/6. Assuming we are able to shuffle the ordering of audit_log_exit() and audit_kill_trees() this patch would no longer be needed, yes? > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index 2590c9e..d56aead 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -1460,10 +1460,6 @@ static void audit_log_exit(struct audit_context > *context, struct task_struct *ts > > audit_log_proctitle(tsk, context); > > - /* Send end of event record to help user space know we are finished */ > - ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE); > - if (ab) > - audit_log_end(ab); > if (call_panic) > audit_panic("error converting sid to string"); > } > @@ -1491,6 +1487,14 @@ void __audit_free(struct task_struct *tsk) > audit_log_exit(context, tsk); > if (!list_empty(&context->killed_trees)) > audit_kill_trees(context); > + if (context->in_syscall && context->current_state == > AUDIT_RECORD_CONTEXT) { > + struct audit_buffer *ab; > + > + /* Send end of event record to help user space know we are > finished */ > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE); > + if (ab) > + audit_log_end(ab); > + } > > audit_free_context(context); > } > @@ -1572,13 +1576,19 @@ void __audit_syscall_exit(int success, long > return_code) > > if (context->in_syscall && context->current_state == > AUDIT_RECORD_CONTEXT) > audit_log_exit(context, current); > + if (!list_empty(&context->killed_trees)) > + audit_kill_trees(context); > + if (context->in_syscall && context->current_state == > AUDIT_RECORD_CONTEXT) { > + struct audit_buffer *ab; > > + /* Send end of event record to help user space know we are > finished */ > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE); > + if (ab) > + audit_log_end(ab); > + } > context->in_syscall = 0; > context->prio = context->state == AUDIT_RECORD_CONTEXT ? ~0ULL : 0; > > - if (!list_empty(&context->killed_trees)) > - audit_kill_trees(context); > - > audit_free_names(context); > unroll_tree_refs(context, NULL, 0); > audit_free_aux(context); > -- > 1.8.3.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 6/6] audit: extend config_change mark/watch/tree rule changes
uct audit_tree *tree, char *trig) > { > struct list_head *p, *q; > spin_lock(&hash_lock); > @@ -584,7 +584,7 @@ static void trim_marked(struct audit_tree *tree) > tree->goner = 1; > spin_unlock(&hash_lock); > mutex_lock(&audit_filter_mutex); > - kill_rules(audit_context(), tree); > + kill_rules(audit_context(), tree, trig); > list_del_init(&tree->list); > mutex_unlock(&audit_filter_mutex); > prune_one(tree); > @@ -665,7 +665,7 @@ void audit_trim_trees(void) > node->index &= ~(1U<<31); > } > spin_unlock(&hash_lock); > - trim_marked(tree); > + trim_marked(tree, "trim"); > drop_collected_mounts(root_mnt); > skip_it: > put_tree(tree); > @@ -798,7 +798,7 @@ int audit_add_tree_rule(struct audit_krule *rule) > node->index &= ~(1U<<31); > spin_unlock(&hash_lock); > } else { > - trim_marked(tree); > + trim_marked(tree, "add"); > goto Err; > } > > @@ -900,7 +900,7 @@ int audit_tag_tree(char *old, char *new) > node->index &= ~(1U<<31); > spin_unlock(&hash_lock); > } else { > - trim_marked(tree); > + trim_marked(tree, "equiv"); > } > > put_tree(tree); > @@ -924,7 +924,7 @@ static void audit_schedule_prune(void) > * ... and that one is done if evict_chunk() decides to delay until the end > * of syscall. Runs synchronously. > */ > -void audit_kill_trees(struct audit_context *context) > +void audit_kill_trees(struct audit_context *context, char *trig) > { > struct list_head *list = &context->killed_trees; > > @@ -935,7 +935,7 @@ void audit_kill_trees(struct audit_context *context) > struct audit_tree *victim; > > victim = list_entry(list->next, struct audit_tree, list); > - kill_rules(context, victim); > + kill_rules(context, victim, trig); > list_del_init(&victim->list); > > mutex_unlock(&audit_filter_mutex); > @@ -974,7 +974,7 @@ static void evict_chunk(struct audit_chunk *chunk) > list_del_init(&owner->same_root); > spin_unlock(&hash_lock); > if (!postponed) { > - kill_rules(audit_context(), owner); > + kill_rules(audit_context(), owner, "evict"); > list_move(&owner->list, &prune_list); > need_prune = 1; > } else { > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > index da2978b..693d0a8 100644 > --- a/kernel/audit_watch.c > +++ b/kernel/audit_watch.c > @@ -317,7 +317,9 @@ static void audit_update_watch(struct audit_parent > *parent, > if (oentry->rule.exe) > audit_remove_mark(oentry->rule.exe); > > - audit_watch_log_rule_change(r, owatch, > "updated_rules"); > + audit_watch_log_rule_change(r, owatch, invalidating ? > + > "updated_rules(watch:inval)" : > + > "updated_rules(watch:set)"); > > call_rcu(&oentry->rcu, audit_free_rule_rcu); > } > @@ -345,7 +347,7 @@ static void audit_remove_parent_watches(struct > audit_parent *parent) > list_for_each_entry_safe(w, nextw, &parent->watches, wlist) { > list_for_each_entry_safe(r, nextr, &w->rules, rlist) { > e = container_of(r, struct audit_entry, rule); > - audit_watch_log_rule_change(r, w, "remove_rule"); > + audit_watch_log_rule_change(r, w, > "remove_rule(watch:parent)"); > if (e->rule.exe) > audit_remove_mark(e->rule.exe); > list_del(&r->rlist); > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index d56aead..32428a3 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -1486,7 +1486,7 @@ void __audit_free(struct task_struct *tsk) >
Re: [PATCH 0/6] audit: Fix various races when tagging and untagging mounts
On Fri, Jun 29, 2018 at 7:44 AM Amir Goldstein wrote: > On Thu, Jun 28, 2018 at 7:40 PM, Jan Kara wrote: > > Hello, > > > > this series addresses the problems I have identified when trying to > > understand > > how exactly is kernel/audit_tree.c using generic fsnotify framework. I hope > > I have understood all the interactions right but careful review is certainly > > welcome (CCing Al as he was the one implementing this code originally). > > > > The patches have been tested by a stress test I have written which mounts & > > unmounts filesystems in the directory tree while adding and removing audit > > rules for this tree in parallel and accessing the tree to generate events. > > Still some real-world testing would be welcome. > > > > This sort of stress test sound really useful to fanotify/inotify as well. > Do plan to upstream that stress test? Agreed. I would be interested in having something like this in the audit-testsuite so that we can include it in our regular regression testing. * https://github.com/linux-audit/audit-testsuite -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] audit: use ktime_get_coarse_ts64() for time access
On Fri, Jun 22, 2018 at 4:26 PM Richard Guy Briggs wrote: > On 2018-06-18 16:58, Arnd Bergmann wrote: > > The API got renamed for consistency with the other time accessors, > > this changes the audit caller as well. > > > > Signed-off-by: Arnd Bergmann > > This looks reasonable to me. > Not that I think it matters, but any reason it was swapped in a > different code location other than aesthetics? Time and serial are > usually used together in our code. > > Reviewed-by: Richard Guy Briggs Looks fine to me too. Merged, thanks. > > --- > > kernel/audit.c | 2 +- > > kernel/auditsc.c | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > index e7478cb58079..3e34b9f03cfe 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -1724,7 +1724,7 @@ static inline void audit_get_stamp(struct > > audit_context *ctx, > > struct timespec64 *t, unsigned int *serial) > > { > > if (!ctx || !auditsc_get_stamp(ctx, t, serial)) { > > - *t = current_kernel_time64(); > > + ktime_get_coarse_ts64(t); > > *serial = audit_serial(); > > } > > } > > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > > index ceb1c4596c51..dbbf02f513a8 100644 > > --- a/kernel/auditsc.c > > +++ b/kernel/auditsc.c > > @@ -1540,10 +1540,10 @@ void __audit_syscall_entry(int major, unsigned long > > a1, unsigned long a2, > > context->argv[2]= a3; > > context->argv[3]= a4; > > context->serial = 0; > > - context->ctime = current_kernel_time64(); > > context->in_syscall = 1; > > context->current_state = state; > > context->ppid = 0; > > + ktime_get_coarse_ts64(&context->ctime); > > } > > > > /** > > -- > > 2.9.0 > > > > - RGB > > -- > Richard Guy Briggs > Sr. S/W Engineer, Kernel Security, Base Operating Systems > Remote, Ottawa, Red Hat Canada > IRC: rgb, SunRaycer > Voice: +1.647.777.2635, Internal: (81) 32635 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH v2] audit: fix use-after-free in audit_add_watch
On Mon, Jun 18, 2018 at 3:12 PM Ronny Chevalier wrote: > > audit_add_watch stores locally krule->watch without taking a reference > on watch. Then, it calls audit_add_to_parent, and uses the watch stored > locally. > > Unfortunately, it is possible that audit_add_to_parent updates > krule->watch. > When it happens, it also drops a reference of watch which > could free the watch. > > How to reproduce (with KASAN enabled): > > auditctl -w /etc/passwd -F success=0 -k test_passwd > auditctl -w /etc/passwd -F success=1 -k test_passwd2 > > The second call to auditctl triggers the use-after-free, because > audit_to_parent updates krule->watch to use a previous existing watch > and drops the reference to the newly created watch. > > To fix the issue, we grab a reference of watch and we release it at the > end of the function. > > Signed-off-by: Ronny Chevalier > --- > v2: > - Move audit_get_watch before audit_find_parent. In the case of > audit_get_nd failing. > --- > kernel/audit_watch.c | 9 + > 1 file changed, 9 insertions(+) I haven't made it through Jan's latest patches, which may change things related to this patch, but did you see my comment on the original version of your patch? * https://www.redhat.com/archives/linux-audit/2018-June/msg00166.html > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > index f1ba88994508..6d9b3f2bb1e2 100644 > --- a/kernel/audit_watch.c > +++ b/kernel/audit_watch.c > @@ -430,6 +430,14 @@ int audit_add_watch(struct audit_krule *krule, struct > list_head **list) > if (ret) > return ret; > > + /* > +* When we will be calling audit_add_to_parent, krule->watch might > have > +* been updated and watch might have been freed. > +* So we need to keep a reference of watch. > +*/ > + > + audit_get_watch(watch); > + > /* either find an old parent or attach a new one */ > parent = audit_find_parent(d_backing_inode(parent_path.dentry)); > if (!parent) { > @@ -446,6 +454,7 @@ int audit_add_watch(struct audit_krule *krule, struct > list_head **list) > *list = &audit_inode_hash[h]; > error: > path_put(&parent_path); > + audit_put_watch(watch); > return ret; > } > > -- > 2.17.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH v2] audit: fix use-after-free in audit_add_watch
On Tue, Jul 3, 2018 at 11:56 AM Ronny Chevalier wrote: > > On 03/07/2018 17:34, Paul Moore wrote: > > > > I haven't made it through Jan's latest patches, which may change > > things related to this patch, but did you see my comment on the > > original version of your patch? > > > > * https://www.redhat.com/archives/linux-audit/2018-June/msg00166.html > > Yes, I have, but I did not have time to work on it yet. > (this patch was sent before your comments) No worries, or rush, I just wanted to make sure. Thanks for your help. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 0/6] audit: Fix various races when tagging and untagging mounts
On Tue, Jul 3, 2018 at 10:14 AM Jan Kara wrote: > On Fri 29-06-18 14:01:44, Paul Moore wrote: > > On Fri, Jun 29, 2018 at 7:44 AM Amir Goldstein wrote: > > > On Thu, Jun 28, 2018 at 7:40 PM, Jan Kara wrote: > > > > Hello, > > > > > > > > this series addresses the problems I have identified when trying to > > > > understand > > > > how exactly is kernel/audit_tree.c using generic fsnotify framework. I > > > > hope > > > > I have understood all the interactions right but careful review is > > > > certainly > > > > welcome (CCing Al as he was the one implementing this code originally). > > > > > > > > The patches have been tested by a stress test I have written which > > > > mounts & > > > > unmounts filesystems in the directory tree while adding and removing > > > > audit > > > > rules for this tree in parallel and accessing the tree to generate > > > > events. > > > > Still some real-world testing would be welcome. > > > > > > > > > > This sort of stress test sound really useful to fanotify/inotify as well. > > > Do plan to upstream that stress test? > > > > Agreed. > > > > I would be interested in having something like this in the > > audit-testsuite so that we can include it in our regular regression > > testing. > > > > * https://github.com/linux-audit/audit-testsuite > > OK, I'll look into integrating the test script into audit testsuite. Great, thank you. Even if you don't get around to it, posting it somewhere could still be helpful, e.g. I could use it to hammer on your patches too. Speaking of which, thank you very much for doing this work; I know how painful the audit code can be and I suspect this wasn't easy. I see you've already got some feedback from Amir (thank you Amir!) and I'm working my way through them too, but some vacation time is going to make progress a bit slow. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
[PATCH] audit: use ktime_get_coarse_real_ts64() for timestamps
Commit c72051d5778a ("audit: use ktime_get_coarse_ts64() for time access") converted audit's use of current_kernel_time64() to the new ktime_get_coarse_ts64() function. Unfortunately this resulted in incorrect timestamps, e.g. events stamped with the year 1969 despite it being 2018. This patch corrects this by using ktime_get_coarse_real_ts64() just like the current_kernel_time64() wrapper. Fixes: c72051d5778a ("audit: use ktime_get_coarse_ts64() for time access") Signed-off-by: Paul Moore --- kernel/audit.c |2 +- kernel/auditsc.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index e17bc697d11c..2a8058764aa6 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1721,7 +1721,7 @@ static inline void audit_get_stamp(struct audit_context *ctx, struct timespec64 *t, unsigned int *serial) { if (!ctx || !auditsc_get_stamp(ctx, t, serial)) { - ktime_get_coarse_ts64(t); + ktime_get_coarse_real_ts64(t); *serial = audit_serial(); } } diff --git a/kernel/auditsc.c b/kernel/auditsc.c index f6a0cb32d76e..fb207466e99b 100644 --- a/kernel/auditsc.c +++ b/kernel/auditsc.c @@ -1543,7 +1543,7 @@ void __audit_syscall_entry(int major, unsigned long a1, unsigned long a2, context->in_syscall = 1; context->current_state = state; context->ppid = 0; - ktime_get_coarse_ts64(&context->ctime); + ktime_get_coarse_real_ts64(&context->ctime); } /** -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH v3] audit: fix use-after-free in audit_add_watch
On Fri, Jul 13, 2018 at 5:00 PM Richard Guy Briggs wrote: > On 2018-07-11 14:39, Ronny Chevalier wrote: > > audit_add_watch stores locally krule->watch without taking a reference > > on watch. Then, it calls audit_add_to_parent, and uses the watch stored > > locally. > > > > Unfortunately, it is possible that audit_add_to_parent updates > > krule->watch. > > When it happens, it also drops a reference of watch which > > could free the watch. > > > > How to reproduce (with KASAN enabled): > > > > auditctl -w /etc/passwd -F success=0 -k test_passwd > > auditctl -w /etc/passwd -F success=1 -k test_passwd2 > > > > The second call to auditctl triggers the use-after-free, because > > audit_to_parent updates krule->watch to use a previous existing watch > > and drops the reference to the newly created watch. > > > > To fix the issue, we grab a reference of watch and we release it at the > > end of the function. > > > > Signed-off-by: Ronny Chevalier > > --- > > Ronnie, This looks good to me, but let's see how Jan Kara's patches > interplay with it. > > Reviewed-by: Richard Guy Briggs Agreed, this looks better, thanks Ronny, I'm merging this into audit/next. Yes, this obviously treads in the same space as Jan's patches, but we will deal with that as Jan's patches are reviewed; in the meantime this fixes a trivially reproduced problem, the additional reference is limited to a single function instantiation, and is easily understood. > > v3: > > - Move audit_get_watch before audit_get_nd since it is using it > > and call audit_put_watch if it fails. > > v2: > > - Move audit_get_watch before audit_find_parent. In the case of > > audit_get_nd failing. > > --- > > kernel/audit_watch.c | 12 +++- > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c > > index 6f249bdf2d84..787c7afdf829 100644 > > --- a/kernel/audit_watch.c > > +++ b/kernel/audit_watch.c > > @@ -420,6 +420,13 @@ int audit_add_watch(struct audit_krule *krule, struct > > list_head **list) > > struct path parent_path; > > int h, ret = 0; > > > > + /* > > + * When we will be calling audit_add_to_parent, krule->watch might > > have > > + * been updated and watch might have been freed. > > + * So we need to keep a reference of watch. > > + */ > > + audit_get_watch(watch); > > + > > mutex_unlock(&audit_filter_mutex); > > > > /* Avoid calling path_lookup under audit_filter_mutex. */ > > @@ -428,8 +435,10 @@ int audit_add_watch(struct audit_krule *krule, struct > > list_head **list) > > /* caller expects mutex locked */ > > mutex_lock(&audit_filter_mutex); > > > > - if (ret) > > + if (ret) { > > + audit_put_watch(watch); > > return ret; > > + } > > > > /* either find an old parent or attach a new one */ > > parent = audit_find_parent(d_backing_inode(parent_path.dentry)); > > @@ -447,6 +456,7 @@ int audit_add_watch(struct audit_krule *krule, struct > > list_head **list) > > *list = &audit_inode_hash[h]; > > error: > > path_put(&parent_path); > > + audit_put_watch(watch); > > return ret; > > } > > > > -- > > 2.18.0 > > > > - RGB > > -- > Richard Guy Briggs > Sr. S/W Engineer, Kernel Security, Base Operating Systems > Remote, Ottawa, Red Hat Canada > IRC: rgb, SunRaycer > Voice: +1.647.777.2635, Internal: (81) 32635 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak10 v3 0/3] audit: Log modifying adjtimex(2) calls
On Tue, Jul 3, 2018 at 8:44 AM Ondrej Mosnacek wrote: > I tried to implement separate records for each variable as suggested by > Richard and it turned out to be quite straightforward and results in > more compact and readable records (even though there is now a bit more > of them). > > Changes in v3: > - Switched to separate records for each variable > - Both old and new value is now reported for each change > - Injecting offset is reported via a separate record (since this > offset consists of two values and is added directly to the clock, > i.e. it doesn't make sense to log old and new value) > - Added example records produced by chronyd -q (see the commit message > of the last patch) > > Changes in v2: > - The audit_adjtime() function has been modified to only log those > fields that contain values that are actually used, resulting in more > compact records. > - The audit_adjtime() call has been moved to do_adjtimex() in > timekeeping.c > - Added an additional patch (for review) that simplifies the detection > if the syscall is read-only. Looking at these new records, and trying to guess a bit at the original intent of the feature request, I think we may be going a bit overboard with the information we are logging. I'm thinking all we really need to capture in the audit log is the system time both before and after the change (for the sake of simplicity I suggest using a data format similar to the audit record timestamp). While I created the GH issue for this, I believe the original request came from a Red Hat BZ that Steve created; Steve, what sort of certification requirements (if any?) are there for logging system time changes? -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak10 v3 0/3] audit: Log modifying adjtimex(2) calls
On Wed, Jul 18, 2018 at 3:36 PM Steve Grubb wrote: > On Wednesday, July 18, 2018 2:36:11 PM EDT Paul Moore wrote: > > > Changes in v2: > > > - The audit_adjtime() function has been modified to only log those > > > fields that contain values that are actually used, resulting in more > > > compact records. > > > - The audit_adjtime() call has been moved to do_adjtimex() in > > > timekeeping.c > > > - Added an additional patch (for review) that simplifies the detection > > > if the syscall is read-only. > > > > Looking at these new records, and trying to guess a bit at the > > original intent of the feature request, I think we may be going a bit > > overboard with the information we are logging. I'm thinking all we > > really need to capture in the audit log is the system time both before > > and after the change (for the sake of simplicity I suggest using a > > data format similar to the audit record timestamp). > > > > While I created the GH issue for this, I believe the original request > > came from a Red Hat BZ that Steve created; Steve, what sort of > > certification requirements (if any?) are there for logging system time > > changes? > > That we record any attempts to change the system time. The problem is that > adjtimex passes a data structure that is opaque to user space. So, we can't > tell if someone is setting time, adjusting a tolerance, or simply retrieving > status. > > With stime, we can clearly see the time that was sent into the kernel and it > unconditionally sets time. With settimeofday, it uses a data structure that > we cannot see, but whatever the contents are we are definitely setting time. > Same goes for clock_settime. Only in 1 case do we actually see what the time > is. So, that is not really needed. So, I think what we need to know is did > the syscall do anything that adjusted the system's notion of time? And that's > all. So presumably my above suggestion of simply recording the system time both before and after the change would be sufficient, yes? -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak9 0/3] audit: Record the path of FDs passed to *at(2) syscalls
On Thu, Jul 12, 2018 at 7:36 AM Ondrej Mosnacek wrote: > This patchset is a prototype implementation of the feature requested in GHAK > issue #9 [1]. I decided for a simple auxiliary record with just 2 fields (fd > and path) that is emitted whenever we want to record the full path for a file > descriptor passed to a syscall (e.g. the dirfd argument of openat(2)). I > choose this approach because for some syscalls there is more than one file > descriptor we might be interested in (a good example is the renameat(2) > syscall). > > The motivation for this feature (as I understand it) is to avoid the need to > reconstruct the paths corresponding to the file descriptors passed to > syscalls, as this might be difficult and time consuming or even impossible in > case not all of the right sycalls are being logged. Note that it is always > possible to disable these records by simply adding an exclude filter rule > matching all records of type FD_PATH. > > At this moment I only implement logging for a single syscall (openat(2)) to > keep it simple. In the final version I plan to add support for other similar > syscalls ()mkdirat, mknodeat, fchownat, ...). > > Please let me know if the general approach and the proposed record format > make sense to you so I can improve/complete the solution. > > Thanks, > Ondrej > > [1] https://github.com/linux-audit/audit-kernel/issues/9 While I recognize that the GH issue did raise the idea of possibly creating a new record type, looking at these patches I'm not sure a new record type is justified, I think reusing the existing PATH record type would be more beneficial. I recognize that this proposed FD_PATH record also contains the file descriptor number, but that information should also be contained in the associated SYSCALL record and arguably the fd number is only useful if you are logging the SYSCALL information. Can you explain what advantage the FS_PATH record type has over reusing the existing PATH record? I know you mention multiple fd/paths as in the case of renameat(2), but we already have to deal with this in the non-at rename(2) case. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 10/10] rfkill: fix spelling mistake contidion to condition
On Wed, Jun 6, 2018 at 1:02 PM Richard Guy Briggs wrote: > Signed-off-by: Richard Guy Briggs > --- > net/rfkill/core.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) I would suggest splitting this patch from the audit container ID patchset and sending it to the netdev folks. It really has nothing to do with the audit container ID work, although I suspect I know why you bumped into this spelling mistake ;) > diff --git a/net/rfkill/core.c b/net/rfkill/core.c > index 59d0eb9..e89a009 100644 > --- a/net/rfkill/core.c > +++ b/net/rfkill/core.c > @@ -494,7 +494,7 @@ void rfkill_remove_epo_lock(void) > /** > * rfkill_is_epo_lock_active - returns true EPO is active > * > - * Returns 0 (false) if there is NOT an active EPO contidion, > + * Returns 0 (false) if there is NOT an active EPO condition, > * and 1 (true) if there is an active EPO contition, which > * locks all radios in one of the BLOCKED states. > * > -- > 1.8.3.1 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 1/6] audit: give a clue what CONFIG_CHANGE op was involved
On Thu, Jul 12, 2018 at 8:43 PM Richard Guy Briggs wrote: > On 2018-06-28 15:41, Paul Moore wrote: > > On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs wrote: > > > The failure to add an audit rule due to audit locked gives no clue > > > what CONFIG_CHANGE operation failed. > > > Similarly the set operation is the only other operation that doesn't > > > give the "op=" field to indicate the action. > > > All other CONFIG_CHANGE records include an op= field to give a clue as > > > to what sort of configuration change is being executed. > > > > > > Since these are the only CONFIG_CHANGE records that that do not have an > > > op= field, add them to bring them in line with the rest. > > > > Normally this would be an immediate reject because this patch inserts > > a field into an existing record, but the CONFIG_CHANGE record is so > > variable (supposedly bad in its own right) that I don't this really > > matters. > > > > With that out of the way, I think this patch is fine, but I don't > > think it is complete. At the very least there is another > > CONFIG_CHANGE record in audit_watch_log_rule_change() that doesn't > > appear to include an "op" field. If we want to make sure we have an > > "op" field in every CONFIG_CHANGE record, let's actually add them all > > :) > > The version I'm looking at already had it when it was added in 2009. Yup, there it is ... now I'm wondering what tree I was looking at as a reference while reviewing this? /me scratches head > This one doesn't add the auid and ses fields because they will be > covered by the linking of this record with the syscall record via the > audit_context() introduced in another patch. Yeah, I'm not concerned about that for the reasons you state. > > and one more in audit_receive_msg(). There may be more. > > I believe they're covered by other patches in the ghak59 set. If they are in the later patches it might be good to move those "op=" additions into this patch. > > > Old records: > > > type=CONFIG_CHANGE msg=audit(1519812997.781:374): pid=610 uid=0 auid=0 > > > ses=1 subj=... audit_enabled=2 res=0 > > > type=CONFIG_CHANGE msg=audit(2018-06-14 14:55:04.507:47) : > > > audit_enabled=1 old=1 auid=unset ses=unset subj=... res=yes > > > > > > New records: > > > type=CONFIG_CHANGE msg=audit(1520958477.855:100): pid=610 uid=0 auid=0 > > > ses=1 subj=... op=add_rule audit_enabled=2 res=0 > > > > > > type=CONFIG_CHANGE msg=audit(2018-06-14 14:55:04.507:47) : op=set > > > audit_enabled=1 old=1 auid=unset ses=unset subj=... res=yes > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/59 > > > Signed-off-by: Richard Guy Briggs > > > --- > > > kernel/audit.c | 6 -- > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > index e7478cb..ad54339 100644 > > > --- a/kernel/audit.c > > > +++ b/kernel/audit.c > > > @@ -403,7 +403,7 @@ static int audit_log_config_change(char > > > *function_name, u32 new, u32 old, > > > ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > > > if (unlikely(!ab)) > > > return rc; > > > - audit_log_format(ab, "%s=%u old=%u", function_name, new, old); > > > + audit_log_format(ab, "op=set %s=%u old=%u", function_name, new, > > > old); > > > audit_log_session_info(ab); > > > rc = audit_log_task_context(ab); > > > if (rc) > > > @@ -1365,7 +1365,9 @@ static int audit_receive_msg(struct sk_buff *skb, > > > struct nlmsghdr *nlh) > > > return -EINVAL; > > > if (audit_enabled == AUDIT_LOCKED) { > > > audit_log_common_recv_msg(&ab, > > > AUDIT_CONFIG_CHANGE); > > > - audit_log_format(ab, " audit_enabled=%d res=0", > > > audit_enabled); > > > + audit_log_format(ab, " op=%s_rule > > > audit_enabled=%d res=0", > > > +msg_type == AUDIT_ADD_RULE ? > > > "add" : "remove", > > > +audit_enabled); > > > audit_log_end(ab); > > > return -EPERM; > > > } > > > -- > > > 1.8.3.1 > > > > -- > > paul moore > > www.paul-moore.com > > - RGB > > -- > Richard Guy Briggs > Sr. S/W Engineer, Kernel Security, Base Operating Systems > Remote, Ottawa, Red Hat Canada > IRC: rgb, SunRaycer > Voice: +1.647.777.2635, Internal: (81) 32635 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak10 v3 0/3] audit: Log modifying adjtimex(2) calls
On Wed, Jul 18, 2018 at 6:34 PM Steve Grubb wrote: > On Wednesday, July 18, 2018 3:59:24 PM EDT Paul Moore wrote: > > On Wed, Jul 18, 2018 at 3:36 PM Steve Grubb wrote: > > > On Wednesday, July 18, 2018 2:36:11 PM EDT Paul Moore wrote: > > > > > Changes in v2: > > > > > - The audit_adjtime() function has been modified to only log those > > > > > fields that contain values that are actually used, resulting in more > > > > > compact records. > > > > > - The audit_adjtime() call has been moved to do_adjtimex() in > > > > > timekeeping.c > > > > > - Added an additional patch (for review) that simplifies the > > > > > detection > > > > > if the syscall is read-only. > > > > > > > > Looking at these new records, and trying to guess a bit at the > > > > original intent of the feature request, I think we may be going a bit > > > > overboard with the information we are logging. I'm thinking all we > > > > really need to capture in the audit log is the system time both before > > > > and after the change (for the sake of simplicity I suggest using a > > > > data format similar to the audit record timestamp). > > > > > > > > While I created the GH issue for this, I believe the original request > > > > came from a Red Hat BZ that Steve created; Steve, what sort of > > > > certification requirements (if any?) are there for logging system time > > > > changes? > > > > > > That we record any attempts to change the system time. The problem is > > > that > > > adjtimex passes a data structure that is opaque to user space. So, we > > > can't tell if someone is setting time, adjusting a tolerance, or simply > > > retrieving status. > > > > > > With stime, we can clearly see the time that was sent into the kernel and > > > it unconditionally sets time. With settimeofday, it uses a data > > > structure that we cannot see, but whatever the contents are we are > > > definitely setting time. Same goes for clock_settime. Only in 1 case do > > > we actually see what the time is. So, that is not really needed. So, I > > > think what we need to know is did the syscall do anything that adjusted > > > the system's notion of time? And that's all. > > > > So presumably my above suggestion of simply recording the system time > > both before and after the change would be sufficient, yes? > > Well, I think that we need to make it obvious that a time setting has changed > as in a y/n answer. Doing a comparison of the time to get a y/n answer is > less obvious if you are doing a grep. Presumably we would only be emitting this new record when the system time changes as the result of a time setting change so the simple presence of this record should be sufficient to indicate a time change. > Also, what time will the event have? I > presume that the event will have the new time since we are on the exit > filter. So, do we really need to collect that? The audit event timestamp is not guaranteed to be the same as the user supplied value so we should explicitly log the old/new system time. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 00/10] audit: implement container identifier
On Thu, Jun 7, 2018 at 10:24 AM Richard Guy Briggs wrote: > On 2018-06-07 09:10, Stefan Berger wrote: > > On 06/06/2018 12:58 PM, Richard Guy Briggs wrote: > > > Implement kernel audit container identifier. > > > > What tree does this series build upon as a base? I don't seem to find one > > with the necessary base patches applied. > > It is Paul Moore's audit next tree, with patch ("collect audit task > parameters") of ghak81: > > git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git > https://www.redhat.com/archives/linux-audit/2018-May/msg00091.html > https://www.redhat.com/archives/linux-audit/2018-May/msg00100.html To help reduce the confusion on the next re-spin of this patchset please include the "collect audit task parameters" patch. I just lost a few minutes trying to figure out where the task_struct changes myself ... -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 1/6] audit: give a clue what CONFIG_CHANGE op was involved
On Thu, Jul 19, 2018 at 12:10 PM Richard Guy Briggs wrote: > On 2018-07-18 17:45, Paul Moore wrote: > > On Thu, Jul 12, 2018 at 8:43 PM Richard Guy Briggs wrote: > > > On 2018-06-28 15:41, Paul Moore wrote: > > > > On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs > > > > wrote: > > > > > The failure to add an audit rule due to audit locked gives no clue > > > > > what CONFIG_CHANGE operation failed. > > > > > Similarly the set operation is the only other operation that doesn't > > > > > give the "op=" field to indicate the action. > > > > > All other CONFIG_CHANGE records include an op= field to give a clue as > > > > > to what sort of configuration change is being executed. > > > > > > > > > > Since these are the only CONFIG_CHANGE records that that do not have > > > > > an > > > > > op= field, add them to bring them in line with the rest. > > > > > > > > Normally this would be an immediate reject because this patch inserts > > > > a field into an existing record, but the CONFIG_CHANGE record is so > > > > variable (supposedly bad in its own right) that I don't this really > > > > matters. > > > > > > > > With that out of the way, I think this patch is fine, but I don't > > > > think it is complete. At the very least there is another > > > > CONFIG_CHANGE record in audit_watch_log_rule_change() that doesn't > > > > appear to include an "op" field. If we want to make sure we have an > > > > "op" field in every CONFIG_CHANGE record, let's actually add them all > > > > :) > > > > > > The version I'm looking at already had it when it was added in 2009. > > > > Yup, there it is ... now I'm wondering what tree I was looking at as a > > reference while reviewing this? > > > > /me scratches head > > > > > This one doesn't add the auid and ses fields because they will be > > > covered by the linking of this record with the syscall record via the > > > audit_context() introduced in another patch. > > > > Yeah, I'm not concerned about that for the reasons you state. > > > > > > and one more in audit_receive_msg(). There may be more. > > > > > > I believe they're covered by other patches in the ghak59 set. > > > > If they are in the later patches it might be good to move those "op=" > > additions into this patch. > > I don't see any CONFIG_CHANGE records generated in audit_receive_msg() > that are missing op= field. Can you narrow it down? Well, just grep'ing my way through audit_receive_msg() I see that AUDIT_ADD/DEL_RULE generates a CONFIG_CHANGE record. > > > > > Old records: > > > > > type=CONFIG_CHANGE msg=audit(1519812997.781:374): pid=610 uid=0 > > > > > auid=0 ses=1 subj=... audit_enabled=2 res=0 > > > > > type=CONFIG_CHANGE msg=audit(2018-06-14 14:55:04.507:47) : > > > > > audit_enabled=1 old=1 auid=unset ses=unset subj=... res=yes > > > > > > > > > > New records: > > > > > type=CONFIG_CHANGE msg=audit(1520958477.855:100): pid=610 uid=0 > > > > > auid=0 ses=1 subj=... op=add_rule audit_enabled=2 res=0 > > > > > > > > > > type=CONFIG_CHANGE msg=audit(2018-06-14 14:55:04.507:47) : op=set > > > > > audit_enabled=1 old=1 auid=unset ses=unset subj=... res=yes > > > > > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/59 > > > > > Signed-off-by: Richard Guy Briggs > > > > > --- > > > > > kernel/audit.c | 6 -- > > > > > 1 file changed, 4 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > > > index e7478cb..ad54339 100644 > > > > > --- a/kernel/audit.c > > > > > +++ b/kernel/audit.c > > > > > @@ -403,7 +403,7 @@ static int audit_log_config_change(char > > > > > *function_name, u32 new, u32 old, > > > > > ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > > > > > if (unlikely(!ab)) > > > > > return rc; > > > > > - audit_log_format(ab, "%s=%u old=%u", function_name, new, old); > > > > > + audit_log_format(ab, "
Re: [RFC PATCH ghak10 v3 0/3] audit: Log modifying adjtimex(2) calls
On Thu, Jul 19, 2018 at 3:36 AM Ondrej Mosnacek wrote: > On Wed, Jul 18, 2018 at 9:59 PM Paul Moore wrote: > > On Wed, Jul 18, 2018 at 3:36 PM Steve Grubb wrote: > > > On Wednesday, July 18, 2018 2:36:11 PM EDT Paul Moore wrote: > > > > > Changes in v2: > > > > > - The audit_adjtime() function has been modified to only log those > > > > > fields that contain values that are actually used, resulting in more > > > > > compact records. > > > > > - The audit_adjtime() call has been moved to do_adjtimex() in > > > > > timekeeping.c > > > > > - Added an additional patch (for review) that simplifies the detection > > > > > if the syscall is read-only. > > > > > > > > Looking at these new records, and trying to guess a bit at the > > > > original intent of the feature request, I think we may be going a bit > > > > overboard with the information we are logging. I'm thinking all we > > > > really need to capture in the audit log is the system time both before > > > > and after the change (for the sake of simplicity I suggest using a > > > > data format similar to the audit record timestamp). > > > > > > > > While I created the GH issue for this, I believe the original request > > > > came from a Red Hat BZ that Steve created; Steve, what sort of > > > > certification requirements (if any?) are there for logging system time > > > > changes? > > > > > > That we record any attempts to change the system time. The problem is that > > > adjtimex passes a data structure that is opaque to user space. So, we > > > can't > > > tell if someone is setting time, adjusting a tolerance, or simply > > > retrieving > > > status. > > > > > > With stime, we can clearly see the time that was sent into the kernel and > > > it > > > unconditionally sets time. With settimeofday, it uses a data structure > > > that > > > we cannot see, but whatever the contents are we are definitely setting > > > time. > > > Same goes for clock_settime. Only in 1 case do we actually see what the > > > time > > > is. So, that is not really needed. So, I think what we need to know is did > > > the syscall do anything that adjusted the system's notion of time? And > > > that's > > > all. > > > > So presumably my above suggestion of simply recording the system time > > both before and after the change would be sufficient, yes? > > I wouldn't say this is the right approach here, for two reasons that > I'll try to explain below. > > adjtimex(2) can be basically used for two major types of adjustment: > 1. for (immediately) injecting offset into system time, > 2. for changing NTP status variables. > > (1.) is the more obvious and simple adjustment, which *directly* > alters the system time. For this adjustment the current proposed > solution simply logs the delta, by which the time is adjusted (it can > be retrieved easily from the supplied timex struct). In my opinion, > logging of this delta is more explicit than logging the timestamps > (which might not give the precise offset anyway, since time might flow > between the two timestamps) and is also more easily grepped in the > logs, as Steve pointed out in his reply. For example, you can quickly > filter out adjustments of less than +/-1 second by simply comparing > the offset in the records. I remain unconvinced for this case, I would argue that the fact that the new system time is more useful than the delta, but it might not matter considering the second case below. > (2.) is a lot more tricky... These adjustments modify the variables of > the NTP algorithm, which is quite complex and it is not obvious how it > influences the system time. From my understanding, the algorithm is > used for adjustments that require smooth transition over time - such > as (or maybe this is the only use case) inserting a leap second. Note > that in this case, just logging timestamp before and after the syscall > is not sufficient, since the change doesn't happen immediately, but in > small increments that are gradually added *after* the syscall already > finished. I don't know if (or which of) these adjustments can be used > for a real attack, but at least the leap second insertion sounds like > something that could be used to shift the time without being easily > detected. This is why I am wary of throwing things out from the > logging until we understand what is truly important and what is not. Okay, it sounds like we need to log some of these variables, but which ones are relevant is still unknown (all?). I would suggest a bit more research on this so that we can say with some certainty which fields are important, please include this information in the commit descriptions so we have it in the git log for future use. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 1/6] audit: give a clue what CONFIG_CHANGE op was involved
On Fri, Jul 20, 2018 at 9:30 AM Richard Guy Briggs wrote: > On 2018-07-19 18:47, Paul Moore wrote: > > On Thu, Jul 19, 2018 at 12:10 PM Richard Guy Briggs wrote: > > > On 2018-07-18 17:45, Paul Moore wrote: > > > > On Thu, Jul 12, 2018 at 8:43 PM Richard Guy Briggs > > > > wrote: > > > > > On 2018-06-28 15:41, Paul Moore wrote: > > > > > > On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs > > > > > > wrote: > > > > > > > The failure to add an audit rule due to audit locked gives no clue > > > > > > > what CONFIG_CHANGE operation failed. > > > > > > > Similarly the set operation is the only other operation that > > > > > > > doesn't > > > > > > > give the "op=" field to indicate the action. > > > > > > > All other CONFIG_CHANGE records include an op= field to give a > > > > > > > clue as > > > > > > > to what sort of configuration change is being executed. > > > > > > > > > > > > > > Since these are the only CONFIG_CHANGE records that that do not > > > > > > > have an > > > > > > > op= field, add them to bring them in line with the rest. > > > > > > > > > > > > Normally this would be an immediate reject because this patch > > > > > > inserts > > > > > > a field into an existing record, but the CONFIG_CHANGE record is so > > > > > > variable (supposedly bad in its own right) that I don't this really > > > > > > matters. > > > > > > > > > > > > With that out of the way, I think this patch is fine, but I don't > > > > > > think it is complete. At the very least there is another > > > > > > CONFIG_CHANGE record in audit_watch_log_rule_change() that doesn't > > > > > > appear to include an "op" field. If we want to make sure we have an > > > > > > "op" field in every CONFIG_CHANGE record, let's actually add them > > > > > > all > > > > > > :) > > > > > > > > > > The version I'm looking at already had it when it was added in 2009. > > > > > > > > Yup, there it is ... now I'm wondering what tree I was looking at as a > > > > reference while reviewing this? > > > > > > > > /me scratches head > > > > > > > > > This one doesn't add the auid and ses fields because they will be > > > > > covered by the linking of this record with the syscall record via the > > > > > audit_context() introduced in another patch. > > > > > > > > Yeah, I'm not concerned about that for the reasons you state. > > > > > > > > > > and one more in audit_receive_msg(). There may be more. > > > > > > > > > > I believe they're covered by other patches in the ghak59 set. > > > > > > > > If they are in the later patches it might be good to move those "op=" > > > > additions into this patch. > > > > > > I don't see any CONFIG_CHANGE records generated in audit_receive_msg() > > > that are missing op= field. Can you narrow it down? > > > > Well, just grep'ing my way through audit_receive_msg() I see that > > AUDIT_ADD/DEL_RULE generates a CONFIG_CHANGE record. > > The failure case is addressed in this patch. The success case is > addressed in audit_log_rule_change(). The latter already has it. What > is the problem? What tree are you looking at? What am I missing? So it does. This discussion dragged out long enough that I forgot to check the actual patch submission. I think this patch is fine, I would recommend updating this patchset using the feedback on the other individual patches and resubmitting. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 03/10] audit: add containerid support for ptrace and signals
On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs wrote: > Add audit container identifier support to ptrace and signals. In > particular, the "op" field provides a way to label the auxiliary record > to which it is associated. > > Signed-off-by: Richard Guy Briggs > --- > include/linux/audit.h | 11 +-- > kernel/audit.c| 13 +++-- > kernel/audit.h| 2 ++ > kernel/auditsc.c | 21 - > 4 files changed, 30 insertions(+), 17 deletions(-) ... > diff --git a/kernel/audit.c b/kernel/audit.c > index 5e150c6..ba304a8 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -142,6 +142,7 @@ struct audit_net { > kuid_t audit_sig_uid = INVALID_UID; > pid_t audit_sig_pid = -1; > u32audit_sig_sid = 0; > +u64audit_sig_cid = AUDIT_CID_UNSET; > > /* Records can be lost in several ways: > 0) [suppressed in audit_alloc] > @@ -1437,6 +1438,7 @@ static int audit_receive_msg(struct sk_buff *skb, > struct nlmsghdr *nlh) > memcpy(sig_data->ctx, ctx, len); > security_release_secctx(ctx, len); > } > + sig_data->cid = audit_sig_cid; > audit_send_reply(skb, seq, AUDIT_SIGNAL_INFO, 0, 0, > sig_data, sizeof(*sig_data) + len); > kfree(sig_data); > @@ -2050,23 +2052,22 @@ void audit_log_session_info(struct audit_buffer *ab) > > /* > * audit_log_contid - report container info > - * @tsk: task to be recorded > * @context: task or local context for record > * @op: contid string description > + * @contid: container ID to report > */ > -int audit_log_contid(struct task_struct *tsk, > -struct audit_context *context, char *op) > +int audit_log_contid(struct audit_context *context, > + char *op, u64 contid) > { > struct audit_buffer *ab; > > - if (!audit_contid_set(tsk)) > + if (!cid_valid(contid)) > return 0; > /* Generate AUDIT_CONTAINER record with container ID */ > ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER); > if (!ab) > return -ENOMEM; > - audit_log_format(ab, "op=%s contid=%llu", > -op, audit_get_contid(tsk)); > + audit_log_format(ab, "op=%s contid=%llu", op, contid); > audit_log_end(ab); > return 0; > } You might as well just make these changes to audit_log_contid() in patch 2 when you first define audit_log_contid(). -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records
On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs wrote: > Standalone audit records have the timestamp and serial number generated > on the fly and as such are unique, making them standalone. This new > function audit_alloc_local() generates a local audit context that will > be used only for a standalone record and its auxiliary record(s). The > context is discarded immediately after the local associated records are > produced. > > Signed-off-by: Richard Guy Briggs > --- > include/linux/audit.h | 8 > kernel/auditsc.c | 25 +++-- > 2 files changed, 31 insertions(+), 2 deletions(-) ... > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -916,8 +916,11 @@ static inline void audit_free_aux(struct audit_context > *context) > static inline struct audit_context *audit_alloc_context(enum audit_state > state) > { > struct audit_context *context; > + gfp_t gfpflags; > > - context = kzalloc(sizeof(*context), GFP_KERNEL); > + /* We can be called in atomic context via audit_tg() */ > + gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL; Instead of trying to guess the proper gfp flags, and likely getting it wrong at some point (the in_atomic() comment in preempt.h don't give me the warm fuzzies), why not pass the gfp flags as an argument? Right now it looks like we would only have two callers: audit_alloc() and audit_audit_local(). The audit_alloc() invocation would simply pass GFP_KERNEL and we could allow the audit_alloc_local() callers to specify the gfp flags when calling audit_alloc_local() (although I suspect that will always be GFP_ATOMIC since we should only be calling audit_alloc_local() from interrupt-like context, in all other cases we should use the audit_context from the current task_struct. > + context = kzalloc(sizeof(*context), gfpflags); > if (!context) > return NULL; > context->state = state; > @@ -993,8 +996,26 @@ struct audit_task_info init_struct_audit = { > .ctx = NULL, > }; > > -static inline void audit_free_context(struct audit_context *context) > +struct audit_context *audit_alloc_local(void) > { Let's see where this goes, but we may want to rename this slightly to indicate that this should only be called from interrupt context when we can't rely on current's audit_context. Bonus points if we can find a way to enforce this with a WARN() assertion so we can better catch abuse. > + struct audit_context *context; > + > + if (!audit_ever_enabled) > + return NULL; /* Return if not auditing. */ > + > + context = audit_alloc_context(AUDIT_RECORD_CONTEXT); > + if (!context) > + return NULL; > + context->serial = audit_serial(); > + context->ctime = current_kernel_time64(); > + context->in_syscall = 1; Setting in_syscall is both interesting and a bit troubling, if for no other reason than I expect most (all?) callers to be in an interrupt context when audit_alloc_local() is called. Setting in_syscall would appear to be conceptually in this case. Can you help explain why this is the right thing to do, or necessary to ensure things are handled correctly? Looking quickly at the audit code, it seems to only be used on record and/or syscall termination to end things properly as well as in some of the PATH record code paths to limit filename collection to actual syscalls. However, this was just a quick look so I could be missing some important points. > + return context; > +} > + > +void audit_free_context(struct audit_context *context) > +{ > + if (!context) > + return; > audit_free_names(context); > unroll_tree_refs(context, NULL, 0); > free_tree_refs(context); > -- > 1.8.3.1 > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls
On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs wrote: > Create a new audit record AUDIT_CONTAINER to document the audit > container identifier of a process if it is present. > > Called from audit_log_exit(), syscalls are covered. > > A sample raw event: > type=SYSCALL msg=audit(1519924845.499:257): arch=c03e syscall=257 > success=yes exit=3 a0=ff9c a1=56374e1cef30 a2=241 a3=1b6 items=2 ppid=606 > pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 > tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash" > subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 > key="tmpcontainerid" > type=CWD msg=audit(1519924845.499:257): cwd="/root" > type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863 > dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:tmp_t:s0 > nametype= PARENT cap_fp= cap_fi= cap_fe=0 > cap_fver=0 > type=PATH msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid" > inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00 > obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE > cap_fp= cap_fi= cap_fe=0 cap_fver=0 > type=PROCTITLE msg=audit(1519924845.499:257): > proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D702F746D70636F6E7461696E65726964 > type=CONTAINER msg=audit(1519924845.499:257): op=task contid=123458 > > See: https://github.com/linux-audit/audit-kernel/issues/90 > See: https://github.com/linux-audit/audit-userspace/issues/51 > See: https://github.com/linux-audit/audit-testsuite/issues/64 > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID > Signed-off-by: Richard Guy Briggs > --- > include/linux/audit.h | 7 +++ > include/uapi/linux/audit.h | 1 + > kernel/audit.c | 23 +++ > kernel/auditsc.c | 3 +++ > 4 files changed, 34 insertions(+) ... > --- a/include/uapi/linux/audit.h > +++ b/include/uapi/linux/audit.h > @@ -115,6 +115,7 @@ > #define AUDIT_REPLACE 1329/* Replace auditd if this packet > unanswerd */ > #define AUDIT_KERN_MODULE 1330/* Kernel Module events */ > #define AUDIT_FANOTIFY 1331/* Fanotify access decision */ > +#define AUDIT_CONTAINER1332/* Container ID */ I'm not sure I'm completely sold on the AUDIT_CONTAINER_ID and AUDIT_CONTAINER record type names. From what I can tell AUDIT_CONTAINER_ID seems to be used for audit container ID management operations, e.g. setting the ID, whereas the AUDIT_CONTAINER is used to tag events with the corresponding audit container ID. Assuming that is correct, it seems like AUDIT_CONTAINER might be better served if it was named AUDIT_CONTAINER_ID and if we could change AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc. Thoughts? > #define AUDIT_AVC 1400/* SE Linux avc denial or grant */ > #define AUDIT_SELINUX_ERR 1401/* Internal SE Linux Errors */ > diff --git a/kernel/audit.c b/kernel/audit.c > index e7478cb..5e150c6 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab) > audit_log_format(ab, " auid=%u ses=%u", auid, sessionid); > } > > +/* > + * audit_log_contid - report container info > + * @tsk: task to be recorded > + * @context: task or local context for record > + * @op: contid string description > + */ > +int audit_log_contid(struct task_struct *tsk, > +struct audit_context *context, char *op) > +{ > + struct audit_buffer *ab; > + > + if (!audit_contid_set(tsk)) > + return 0; > + /* Generate AUDIT_CONTAINER record with container ID */ > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER); > + if (!ab) > + return -ENOMEM; > + audit_log_format(ab, "op=%s contid=%llu", > +op, audit_get_contid(tsk)); Can you explain your reason for including an "op" field in this record type? I've been looking at the rest of the patches in this patchset and it seems to be used more as an indicator of the record's generating context rather than any sort of audit container ID operation. > + audit_log_end(ab); > + return 0; > +} -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 05/10] audit: add containerid support for tty_audit
On Wed, Jun 6, 2018 at 1:04 PM Richard Guy Briggs wrote: > Add audit container identifier auxiliary record to tty logging rule > event standalone records. > > Signed-off-by: Richard Guy Briggs > --- > drivers/tty/tty_audit.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c > index e30aa6b..66bd850 100644 > --- a/drivers/tty/tty_audit.c > +++ b/drivers/tty/tty_audit.c > @@ -66,8 +66,9 @@ static void tty_audit_log(const char *description, dev_t > dev, > uid_t uid = from_kuid(&init_user_ns, task_uid(tsk)); > uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk)); > unsigned int sessionid = audit_get_sessionid(tsk); > + struct audit_context *context = audit_alloc_local(); We should be using current's audit_context in tty_audit_log(). Actually, we should probably just get rid of the tsk variable in tty_audit_log() and use current directly to make things a bit more obvious. I did some digging and I have a two year old, half-baked patch that cleans up this tsk/current usage as well as a few others. I just rebased it against audit/next and surprisingly it seems to pass a basic smoke test (kernel boots and audit-testsuite passes); I'll post it to the list as a RFC once I'm done reviewing these patches. > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY); > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_TTY); > if (ab) { > char name[sizeof(tsk->comm)]; > > @@ -80,6 +81,8 @@ static void tty_audit_log(const char *description, dev_t > dev, > audit_log_n_hex(ab, data, size); > audit_log_end(ab); > } > + audit_log_contid(context, "tty", audit_get_contid(tsk)); > + audit_free_context(context); > } -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id
or is not single-threaded, deny */ > + else if (!list_empty(&task->children)) > + rc = -EBUSY; Is this safe without holding tasklist_lock? I worry we might be vulnerable to a race with fork(). > + else if (!(thread_group_leader(task) && thread_group_empty(task))) > + rc = -EALREADY; Similar concern here as well, although related to threads. > + /* it is already set, and not inherited from the parent, reject */ > + else if (cid_valid(oldcontid) && !task->audit->inherited) > + rc = -EEXIST; Maybe I'm missing something, but why do we care about preventing reassigning the audit container ID in this case? The task is single threaded and has no descendants at this point so it should be safe, yes? So long as the task changing the audit container ID has capable(CAP_AUDIT_CONTOL) it shouldn't matter, right? Related, I'm questioning if we would ever care if the audit container ID was inherited or not? > + if (!rc) { > + task_lock(task); > + task->audit->contid = contid; > + task->audit->inherited = false; > + task_unlock(task); I suspect the task_lock() may not be what we want here, but if we are using task_lock() to protect the audit fields two things come to mind: 1. We should update the header comments for task_lock() in task.h to indicate that it also protects ->audit. 2. Where else do we need to worry about taking this lock? At the very least we should take this lock near the top of this function before we check task->audit and not drop it until after we have set it, or failed the operation for one of the reasons above. > + } > + > + if (!audit_enabled) > + return rc; > + > + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID); > + if (!ab) > + return rc; > + > + uid = from_kuid(&init_user_ns, task_uid(current)); > + tty = audit_get_tty(current); > + audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu > pid=%d uid=%u auid=%u tty=%s ses=%u", > +task_tgid_nr(task), oldcontid, contid, > +task_tgid_nr(current), uid > +from_kuid(&init_user_ns, > audit_get_loginuid(current)), > +tty ? tty_name(tty) : "(none)", > +audit_get_sessionid(current)); > + audit_put_tty(tty); > + audit_log_task_context(ab); > + audit_log_format(ab, " comm="); > + audit_log_untrustedstring(ab, get_task_comm(comm, current)); > + audit_log_d_path_exe(ab, current->mm); > + audit_log_format(ab, " res=%d", !rc); > + audit_log_end(ab); > + return rc; > +} > + > +/** > * __audit_mq_open - record audit data for a POSIX MQ open > * @oflag: open flag > * @mode: mode bits -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS
ctitle_free(context); > kfree(context); > } > +EXPORT_SYMBOL(audit_free_context); Same. > static int audit_log_pid_context(struct audit_context *context, pid_t pid, > kuid_t auid, kuid_t uid, unsigned int > sessionid, > diff --git a/net/netfilter/xt_AUDIT.c b/net/netfilter/xt_AUDIT.c > index f368ee6..10d2707 100644 > --- a/net/netfilter/xt_AUDIT.c > +++ b/net/netfilter/xt_AUDIT.c > @@ -71,10 +71,13 @@ static bool audit_ip6(struct audit_buffer *ab, struct > sk_buff *skb) > { > struct audit_buffer *ab; > int fam = -1; > + struct audit_context *context; > + struct net *net; > > if (audit_enabled == 0) > - goto errout; > - ab = audit_log_start(NULL, GFP_ATOMIC, AUDIT_NETFILTER_PKT); > + goto out; > + context = audit_alloc_local(); > + ab = audit_log_start(context, GFP_ATOMIC, AUDIT_NETFILTER_PKT); > if (ab == NULL) > goto errout; > > @@ -104,7 +107,12 @@ static bool audit_ip6(struct audit_buffer *ab, struct > sk_buff *skb) > > audit_log_end(ab); > > + net = xt_net(par); > + audit_log_contid_list(net, context); > + > errout: > + audit_free_context(context); > +out: > return XT_CONTINUE; > } > > -- > 1.8.3.1 > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 06/10] audit: add containerid filtering
} > entry->rule.exe = audit_mark; > break; > + case AUDIT_CONTID: > + if (f->val != sizeof(u64)) > + goto exit_free; > + str = audit_unpack_string(&bufp, &remain, f->val); > + if (IS_ERR(str)) > + goto exit_free; > + f->val64 = ((u64 *)str)[0]; > + break; > } > } > > @@ -666,6 +675,11 @@ static struct audit_rule_data > *audit_krule_to_data(struct audit_krule *krule) > data->buflen += data->values[i] = > audit_pack_string(&bufp, > audit_mark_path(krule->exe)); > break; > + case AUDIT_CONTID: > + data->buflen += data->values[i] = sizeof(u64); > + for (i = 0; i < sizeof(u64); i++) > + ((char *)bufp)[i] = ((char *)&f->val64)[i]; > + break; > case AUDIT_LOGINUID_SET: > if (krule->pflags & AUDIT_LOGINUID_LEGACY && !f->val) > { > data->fields[i] = AUDIT_LOGINUID; > @@ -752,6 +766,10 @@ static int audit_compare_rule(struct audit_krule *a, > struct audit_krule *b) > if (!gid_eq(a->fields[i].gid, b->fields[i].gid)) > return 1; > break; > + case AUDIT_CONTID: > + if (a->fields[i].val64 != b->fields[i].val64) > + return 1; > + break; > default: > if (a->fields[i].val != b->fields[i].val) > return 1; > @@ -1208,6 +1226,31 @@ int audit_comparator(u32 left, u32 op, u32 right) > } > } > > +int audit_comparator64(u64 left, u32 op, u64 right) > +{ > + switch (op) { > + case Audit_equal: > + return (left == right); > + case Audit_not_equal: > + return (left != right); > + case Audit_lt: > + return (left < right); > + case Audit_le: > + return (left <= right); > + case Audit_gt: > + return (left > right); > + case Audit_ge: > + return (left >= right); > + case Audit_bitmask: > + return (left & right); > + case Audit_bittest: > + return ((left & right) == right); > + default: > + BUG(); > + return 0; > + } > +} > + > int audit_uid_comparator(kuid_t left, u32 op, kuid_t right) > { > switch (op) { > @@ -1346,6 +1389,10 @@ int audit_filter(int msgtype, unsigned int listtype) > result = > audit_comparator(audit_loginuid_set(current), > f->op, f->val); > break; > + case AUDIT_CONTID: > + result = > audit_comparator64(audit_get_contid(current), > + f->op, > f->val64); > + break; > case AUDIT_MSGTYPE: > result = audit_comparator(msgtype, f->op, > f->val); > break; > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index 81c9765..ea1ee35 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -622,6 +622,9 @@ static int audit_filter_rules(struct task_struct *tsk, > case AUDIT_LOGINUID_SET: > result = audit_comparator(audit_loginuid_set(tsk), > f->op, f->val); > break; > + case AUDIT_CONTID: > + result = audit_comparator64(audit_get_contid(tsk), > f->op, f->val64); > + break; > case AUDIT_SUBJ_USER: > case AUDIT_SUBJ_ROLE: > case AUDIT_SUBJ_TYPE: > -- > 1.8.3.1 > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces
ct net *net) > +{ > + struct audit_net *aunet = net_generic(net, audit_net_id); > + > + return &aunet->contid_list; > +} > + > +void audit_contid_add(struct net *net, u64 contid) > +{ > + struct list_head *contid_list = audit_get_contid_list(net); > + struct audit_contid *cont; > + > + if (!cid_valid(contid)) > + return; > + if (!list_empty(contid_list)) > + list_for_each_entry(cont, contid_list, list) > + if (cont->id == contid) { > + refcount_inc(&cont->refcount); > + return; > + } I think this is fine for right now, but we may need to be a bit clever about how we store the IDs - walking an unsorted list with lots of entries may prove to be too painful. > + cont = kmalloc(sizeof(struct audit_contid), GFP_KERNEL); > + if (!cont) > + return; > + INIT_LIST_HEAD(&cont->list); > + cont->id = contid; > + refcount_set(&cont->refcount, 1); > + list_add(&cont->list, contid_list); > +} > + > +void audit_contid_del(struct net *net, u64 contid) > +{ > + struct list_head *contid_list = audit_get_contid_list(net); > + struct audit_contid *cont = NULL; > + int found = 0; > + > + if (!cid_valid(contid)) > + return; > + if (!list_empty(contid_list)) > + list_for_each_entry(cont, contid_list, list) > + if (cont->id == contid) { > + found = 1; > + break; You don't really need the found variable, you can just move all of the work you need to do up into the if statement and return from inside the if statement. > + } > + if (!found) > + return; > + list_del(&cont->list); > + if (refcount_dec_and_test(&cont->refcount)) > + kfree(cont); Don't you want to dec_and_test first and only remove it from the list if there are no other references? > +} > > +void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p) > +{ > + u64 contid = audit_get_contid(p); > + struct nsproxy *new = p->nsproxy; > + > + if (!cid_valid(contid)) > + return; > + audit_contid_del(ns->net_ns, contid); > + if (new) > + audit_contid_add(new->net_ns, contid); > +} > + > void audit_panic(const char *message) > { > switch (audit_failure) { > @@ -1550,6 +1621,7 @@ static int __net_init audit_net_init(struct net *net) > return -ENOMEM; > } > aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT; > + INIT_LIST_HEAD(&aunet->contid_list); > > return 0; > } > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index ea1ee35..6ab5e5e 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -75,6 +75,7 @@ > #include > #include > #include > +#include > > #include "audit.h" > > @@ -2165,6 +2166,7 @@ int audit_set_contid(struct task_struct *task, u64 > contid) > uid_t uid; > struct tty_struct *tty; > char comm[sizeof(current->comm)]; > + struct net *net = task->nsproxy->net_ns; > > /* Can't set if audit disabled */ > if (!task->audit) > @@ -2185,10 +2187,13 @@ int audit_set_contid(struct task_struct *task, u64 > contid) > else if (cid_valid(oldcontid) && !task->audit->inherited) > rc = -EEXIST; > if (!rc) { > + if (cid_valid(oldcontid)) > + audit_contid_del(net, oldcontid); > task_lock(task); > task->audit->contid = contid; > task->audit->inherited = false; > task_unlock(task); > + audit_contid_add(net, contid); > } > > if (!audit_enabled) > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c > index f6c5d33..dcb69fe 100644 > --- a/kernel/nsproxy.c > +++ b/kernel/nsproxy.c > @@ -27,6 +27,7 @@ > #include > #include > #include > +#include > > static struct kmem_cache *nsproxy_cachep; > > @@ -140,6 +141,7 @@ int copy_namespaces(unsigned long flags, struct > task_struct *tsk) > struct nsproxy *old_ns = tsk->nsproxy; > struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns); > struct nsproxy *new_ns; > + u64 contid = audit_get_contid(tsk); > > if (like
Re: [RFC PATCH ghak90 (was ghak32) V3 09/10] debug audit: read container ID of a process
On Wed, Jun 6, 2018 at 1:02 PM Richard Guy Briggs wrote: > Add support for reading the audit container identifier from the proc > filesystem. > > This is a read from the proc entry of the form > /proc/PID/audit_containerid where PID is the process ID of the task > whose audit container identifier is sought. > > The read expects up to a u64 value (unset: 18446744073709551615). > > Signed-off-by: Richard Guy Briggs > --- > fs/proc/base.c | 20 ++-- > 1 file changed, 18 insertions(+), 2 deletions(-) > > diff --git a/fs/proc/base.c b/fs/proc/base.c > index 318dff4..ca8bfe2 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -1303,6 +1303,21 @@ static ssize_t proc_sessionid_read(struct file * file, > char __user * buf, > .llseek = generic_file_llseek, > }; > > +static ssize_t proc_contid_read(struct file *file, char __user *buf, > + size_t count, loff_t *ppos) > +{ > + struct inode *inode = file_inode(file); > + struct task_struct *task = get_proc_task(inode); > + ssize_t length; > + char tmpbuf[TMPBUFLEN*2]; > + > + if (!task) > + return -ESRCH; > + length = scnprintf(tmpbuf, TMPBUFLEN*2, "%llu", > audit_get_contid(task)); > + put_task_struct(task); > + return simple_read_from_buffer(buf, count, ppos, tmpbuf, length); > +} While I still remain very nervous about opening the audit container ID up for abuse by making it accessible, I understand that this would make things a lot easier us (e.g. testing) and perhaps the container engines as well. In order to limit the potential for abuse, what do you think about restricting read access to those processes which have CAP_AUDIT_CONTROL, similar to what we do for setting the audit container ID? > static ssize_t proc_contid_write(struct file *file, const char __user *buf, >size_t count, loff_t *ppos) > { > @@ -1333,6 +1348,7 @@ static ssize_t proc_contid_write(struct file *file, > const char __user *buf, > } > > static const struct file_operations proc_contid_operations = { > + .read = proc_contid_read, > .write = proc_contid_write, > .llseek = generic_file_llseek, > }; > @@ -3030,7 +3046,7 @@ static int proc_pid_patch_state(struct seq_file *m, > struct pid_namespace *ns, > #ifdef CONFIG_AUDITSYSCALL > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations), > REG("sessionid", S_IRUGO, proc_sessionid_operations), > - REG("audit_containerid", S_IWUSR, proc_contid_operations), > + REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations), > #endif > #ifdef CONFIG_FAULT_INJECTION > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations), > @@ -3422,7 +3438,7 @@ static int proc_tid_comm_permission(struct inode > *inode, int mask) > #ifdef CONFIG_AUDITSYSCALL > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations), > REG("sessionid", S_IRUGO, proc_sessionid_operations), > - REG("audit_containerid", S_IWUSR, proc_contid_operations), > + REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations), > #endif > #ifdef CONFIG_FAULT_INJECTION > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations), -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
[RFC PATCH] audit: use current whenever possible
There are many places, notably audit_log_task_info() and audit_log_exit(), that take task_struct pointers but in reality they are always working on the current task. This patch eliminates the task_struct arguments and uses current directly which allows a number of cleanups as well. Signed-off-by: Paul Moore --- drivers/tty/tty_audit.c | 13 ++-- include/linux/audit.h|6 +- kernel/audit.c | 34 +-- kernel/audit.h |2 - kernel/auditsc.c | 118 +- security/integrity/ima/ima_api.c |2 - 6 files changed, 81 insertions(+), 94 deletions(-) diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c index 50f567b6a66e..28f87fd6a28e 100644 --- a/drivers/tty/tty_audit.c +++ b/drivers/tty/tty_audit.c @@ -61,20 +61,19 @@ static void tty_audit_log(const char *description, dev_t dev, unsigned char *data, size_t size) { struct audit_buffer *ab; - struct task_struct *tsk = current; - pid_t pid = task_pid_nr(tsk); - uid_t uid = from_kuid(&init_user_ns, task_uid(tsk)); - uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk)); - unsigned int sessionid = audit_get_sessionid(tsk); + pid_t pid = task_pid_nr(current); + uid_t uid = from_kuid(&init_user_ns, task_uid(current)); + uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(current)); + unsigned int sessionid = audit_get_sessionid(current); ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY); if (ab) { - char name[sizeof(tsk->comm)]; + char name[sizeof(current->comm)]; audit_log_format(ab, "%s pid=%u uid=%u auid=%u ses=%u major=%d" " minor=%d comm=", description, pid, uid, loginuid, sessionid, MAJOR(dev), MINOR(dev)); - get_task_comm(name, tsk); + get_task_comm(name, current); audit_log_untrustedstring(ab, name); audit_log_format(ab, " data="); audit_log_n_hex(ab, data, size); diff --git a/include/linux/audit.h b/include/linux/audit.h index 9334fbef7bae..108dd9706a30 100644 --- a/include/linux/audit.h +++ b/include/linux/audit.h @@ -153,8 +153,7 @@ extern void audit_log_link_denied(const char *operation); extern voidaudit_log_lost(const char *message); extern int audit_log_task_context(struct audit_buffer *ab); -extern void audit_log_task_info(struct audit_buffer *ab, - struct task_struct *tsk); +extern void audit_log_task_info(struct audit_buffer *ab); extern int audit_update_lsm_rules(void); @@ -202,8 +201,7 @@ static inline int audit_log_task_context(struct audit_buffer *ab) { return 0; } -static inline void audit_log_task_info(struct audit_buffer *ab, - struct task_struct *tsk) +static inline void audit_log_task_info(struct audit_buffer *ab) { } #define audit_enabled AUDIT_OFF #endif /* CONFIG_AUDIT */ diff --git a/kernel/audit.c b/kernel/audit.c index 2a8058764aa6..160144f7e5f9 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1096,10 +1096,11 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature if (audit_enabled == AUDIT_OFF) return; + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_FEATURE_CHANGE); if (!ab) return; - audit_log_task_info(ab, current); + audit_log_task_info(ab); audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u res=%d", audit_feature_names[which], !!old_feature, !!new_feature, !!old_lock, !!new_lock, res); @@ -2247,15 +2248,15 @@ void audit_log_d_path_exe(struct audit_buffer *ab, audit_log_format(ab, " exe=(null)"); } -struct tty_struct *audit_get_tty(struct task_struct *tsk) +struct tty_struct *audit_get_tty(void) { struct tty_struct *tty = NULL; unsigned long flags; - spin_lock_irqsave(&tsk->sighand->siglock, flags); - if (tsk->signal) - tty = tty_kref_get(tsk->signal->tty); - spin_unlock_irqrestore(&tsk->sighand->siglock, flags); + spin_lock_irqsave(¤t->sighand->siglock, flags); + if (current->signal) + tty = tty_kref_get(current->signal->tty); + spin_unlock_irqrestore(¤t->sighand->siglock, flags); return tty; } @@ -2264,25 +2265,24 @@ void audit_put_tty(struct tty_struct *tty) tty_kref_put(tty); } -void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk) +void audit_log_task_info(struct audit_buffer *ab) { const st
Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls
On Sat, Jul 21, 2018 at 4:32 PM Richard Guy Briggs wrote: > On 2018-07-20 18:13, Paul Moore wrote: > > On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs wrote: > > > Create a new audit record AUDIT_CONTAINER to document the audit > > > container identifier of a process if it is present. > > > > > > Called from audit_log_exit(), syscalls are covered. > > > > > > A sample raw event: > > > type=SYSCALL msg=audit(1519924845.499:257): arch=c03e syscall=257 > > > success=yes exit=3 a0=ff9c a1=56374e1cef30 a2=241 a3=1b6 items=2 > > > ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 > > > fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash" > > > subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 > > > key="tmpcontainerid" > > > type=CWD msg=audit(1519924845.499:257): cwd="/root" > > > type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863 > > > dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00 > > > obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp= > > > cap_fi= cap_fe=0 cap_fver=0 > > > type=PATH msg=audit(1519924845.499:257): item=1 > > > name="/tmp/tmpcontainerid" inode=17729 dev=00:27 mode=0100644 ouid=0 > > > ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE > > > cap_fp= cap_fi= cap_fe=0 cap_fver=0 > > > type=PROCTITLE msg=audit(1519924845.499:257): > > > proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D702F746D70636F6E7461696E65726964 > > > type=CONTAINER msg=audit(1519924845.499:257): op=task contid=123458 > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/90 > > > See: https://github.com/linux-audit/audit-userspace/issues/51 > > > See: https://github.com/linux-audit/audit-testsuite/issues/64 > > > See: > > > https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID > > > Signed-off-by: Richard Guy Briggs > > > --- > > > include/linux/audit.h | 7 +++ > > > include/uapi/linux/audit.h | 1 + > > > kernel/audit.c | 23 +++ > > > kernel/auditsc.c | 3 +++ > > > 4 files changed, 34 insertions(+) > > > > ... > > > > > --- a/include/uapi/linux/audit.h > > > +++ b/include/uapi/linux/audit.h > > > @@ -115,6 +115,7 @@ > > > #define AUDIT_REPLACE 1329/* Replace auditd if this packet > > > unanswerd */ > > > #define AUDIT_KERN_MODULE 1330/* Kernel Module events */ > > > #define AUDIT_FANOTIFY 1331/* Fanotify access decision */ > > > +#define AUDIT_CONTAINER1332/* Container ID */ > > > > I'm not sure I'm completely sold on the AUDIT_CONTAINER_ID and > > AUDIT_CONTAINER record type names. From what I can tell > > AUDIT_CONTAINER_ID seems to be used for audit container ID management > > operations, e.g. setting the ID, whereas the AUDIT_CONTAINER is used > > to tag events with the corresponding audit container ID. Assuming > > that is correct, it seems like AUDIT_CONTAINER might be better served > > if it was named AUDIT_CONTAINER_ID and if we could change > > AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc. Thoughts? > > Please see discussion at: > https://www.redhat.com/archives/linux-audit/2018-May/msg00101.html > > I'm fine with changing AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc. Noted, and while I'm generally a big fan of consistency for things like this, I think these things are different enough (the loginuid is recorded as a field, the audit container ID is recorded in a dedicated record) that we don't need to be bound by LOGINUID's naming convention. > > > #define AUDIT_AVC 1400/* SE Linux avc denial or grant */ > > > #define AUDIT_SELINUX_ERR 1401/* Internal SE Linux Errors */ > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > index e7478cb..5e150c6 100644 > > > --- a/kernel/audit.c > > > +++ b/kernel/audit.c > > > @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer > > > *ab) > > > audit_log_format(ab, " auid=%u ses=%u", auid, sessionid); > > > } > > > > > > +/* > > > + * audit_log_contid - report container info > > > + * @tsk: task to be recorded > > > + * @context: task o
Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls
On Mon, Jul 23, 2018 at 12:48 PM Steve Grubb wrote: > On Monday, July 23, 2018 11:11:48 AM EDT Richard Guy Briggs wrote: > > On 2018-07-23 09:19, Steve Grubb wrote: > > > On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote: > > > > On 2018-07-22 09:32, Steve Grubb wrote: > > > > > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote: > > > > > > > > + * audit_log_contid - report container info > > > > > > > > + * @tsk: task to be recorded > > > > > > > > + * @context: task or local context for record > > > > > > > > + * @op: contid string description > > > > > > > > + */ > > > > > > > > +int audit_log_contid(struct task_struct *tsk, > > > > > > > > +struct audit_context *context, > > > > > > > > char > > > > > > > > *op) > > > > > > > > +{ > > > > > > > > + struct audit_buffer *ab; > > > > > > > > + > > > > > > > > + if (!audit_contid_set(tsk)) > > > > > > > > + return 0; > > > > > > > > + /* Generate AUDIT_CONTAINER record with container ID */ > > > > > > > > + ab = audit_log_start(context, GFP_KERNEL, > > > > > > > > AUDIT_CONTAINER); > > > > > > > > + if (!ab) > > > > > > > > + return -ENOMEM; > > > > > > > > + audit_log_format(ab, "op=%s contid=%llu", > > > > > > > > +op, audit_get_contid(tsk)); > > > > > > > > > > > > > > Can you explain your reason for including an "op" field in this > > > > > > > record > > > > > > > type? I've been looking at the rest of the patches in this > > > > > > > patchset > > > > > > > and it seems to be used more as an indicator of the record's > > > > > > > generating context rather than any sort of audit container ID > > > > > > > operation. > > > > > > > > > > > > "action" might work, but that's netfilter and numeric... "kind"? > > > > > > Nothing else really seems to fit from a field name, type or lack of > > > > > > searchability perspective. > > > > > > > > > > > > Steve, do you have an opinion? > > > > > > > > > > We only have 1 sample event where we have op=task. What are the other > > > > > possible values? > > > > > > > > For the AUDIT_CONTAINER record we have op= "task", "target" (from the > > > > ptrace and signals patch), "tty". > > > > > > > > For the AUDIT_CONTAINER_ID record we have "op=set". > > > > > > Since the purpose of this record is to log the container id, I think that > > > is all that is needed. We can get the context from the other records in > > > the event. I'd suggest dropping the "op" field. > > > > Ok, the information above it for two different audit container > > identifier records. Which one should drop the "op=" field? Both? Or > > just the AUDIT_CONTAINER record? The AUDIT_CONTAINER_ID record (which > > might be renamed) could use it to distinguish a "set" record from a > > dropped audit container identifier that is no longer registered by any > > task or namespace. > > Neither of them need it. All they need to do is state the container that is > being acted upon. I think we should keep the "op" field for audit container ID management operations, even though we really only have a "set" operation at the moment, but the others should drop the "op" field (see my previous emails in this thread). -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak9 0/3] audit: Record the path of FDs passed to *at(2) syscalls
On Fri, Jul 20, 2018 at 6:12 AM Ondrej Mosnacek wrote: > On Wed, Jul 18, 2018 at 10:41 PM Paul Moore wrote: > > On Thu, Jul 12, 2018 at 7:36 AM Ondrej Mosnacek wrote: > > > This patchset is a prototype implementation of the feature requested in > > > GHAK issue #9 [1]. I decided for a simple auxiliary record with just 2 > > > fields (fd and path) that is emitted whenever we want to record the full > > > path for a file descriptor passed to a syscall (e.g. the dirfd argument > > > of openat(2)). I choose this approach because for some syscalls there is > > > more than one file descriptor we might be interested in (a good example > > > is the renameat(2) syscall). > > > > > > The motivation for this feature (as I understand it) is to avoid the need > > > to reconstruct the paths corresponding to the file descriptors passed to > > > syscalls, as this might be difficult and time consuming or even > > > impossible in case not all of the right sycalls are being logged. Note > > > that it is always possible to disable these records by simply adding an > > > exclude filter rule matching all records of type FD_PATH. > > > > > > At this moment I only implement logging for a single syscall (openat(2)) > > > to keep it simple. In the final version I plan to add support for other > > > similar syscalls ()mkdirat, mknodeat, fchownat, ...). > > > > > > Please let me know if the general approach and the proposed record format > > > make sense to you so I can improve/complete the solution. > > > > > > Thanks, > > > Ondrej > > > > > > [1] https://github.com/linux-audit/audit-kernel/issues/9 > > > > While I recognize that the GH issue did raise the idea of possibly > > creating a new record type, looking at these patches I'm not sure a > > new record type is justified, I think reusing the existing PATH record > > type would be more beneficial. I recognize that this proposed FD_PATH > > record also contains the file descriptor number, but that information > > should also be contained in the associated SYSCALL record and arguably > > the fd number is only useful if you are logging the SYSCALL > > information. > > To be honest, I'm not sure what is the exact semantic of the PATH > record... Is it simply "log information about files that the syscall > somehow touched"? Or "log information about any string syscall > argument that represents a file path"? The PATH record exists to record path name information about files related to the audit event, depending on the circumstances, both relative and absolute paths could be expected. Look at the functions, and callers, that add to the audit_context->names_list to get a better understanding of where the path name information is collected. I doubt you will be surprised at this, but fair warning, it's a bit ugly in places. > In the PATH record samples I > have seen, the name=... field sometimes contains just the last segment > of the path, other times it contains the full path (huh?). Depending on the syscall, the syscall arguments, and the "nametype" of the PATH record you could see either a full or partial path. In the cases of a partial path, there *should* be additional PATH records which can help recreate a full path to the file for that particular event instance. > When we log > the full path in PATH, then I guess the FD_PATH record would be > (almost) redundant, but it seems that whenever openat(2) is called > with dirfd != AT_FDCWD, PATH name=... contains just the relative path > supplied to the syscall (thus FD_PATH actually provides the missing > directory path). > > So... assuming we would want to provide the missing information in the > existing PATH record, how should it look like? Should the name=... > field simply always contain the full path? Should there be another > PATH record for the directory? The only real hard requirement is that there be enough to information to determine the file full path(s) for any given audit event using one, or a combination, of PATH records associated with the event. > If so, how do we indicate the association between the two PATH records? Look at how the "nametype" field is used. > > Can you explain what advantage the FS_PATH record type has over > > reusing the existing PATH record? I know you mention multiple > > fd/paths as in the case of renameat(2), but we already have to deal > > with this in the non-at rename(2) case. > > As I said above, I don't fully understand the PATH record so I can't > compare them right now. Multiple fd arguments was meant as more of a > justification for the presence of the fd=... field in the new FD_PATH > record. I suggest taking some time to better understand how the current file path auditing works and then revisiting this patchset. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 3/6] audit: exclude user records from syscall context
On Mon, Jul 23, 2018 at 12:43 PM Richard Guy Briggs wrote: > On 2018-07-12 17:46, Richard Guy Briggs wrote: > > On 2018-06-28 18:11, Paul Moore wrote: > > > On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs > > > wrote: > > > > Since the function audit_log_common_recv_msg() is shared by a number of > > > > AUDIT_CONFIG_CHANGE and the entire range of AUDIT_USER_* record types, > > > > and since the AUDIT_CONFIG_CHANGE message type has been converted to a > > > > syscall accompanied record type, special-case the AUDIT_USER_* range of > > > > messages so they remain standalone records. > > > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/59 > > > > Signed-off-by: Richard Guy Briggs > > > > --- > > > > kernel/audit.c | 12 +--- > > > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > > > I think this is fine, but see my previous comment about combining 2/6 > > > and 3/6 as a safety measure. > > > > This one I left as a seperate patch for discussion. We'd previously > > talked about connecting all possible records with syscall records if > > they exist, but this one I'm unsure about, since we don't really care > > what userspace process is issuing this message. It is just the message > > content itself that is important. Or is it? Are we concerned about > > CAP_AUDIT_WRITE holders/abusers and want as much info about them as we > > can get in case they go rogue or pear-shaped? > > I'm waiting on re-spinning this patchset because of this open question. > > Is connecting AUDIT_USER* records desirable or a liability? Like I said, I think special casing the AUDIT_USER* records so they are *not* associated with other records is okay, and perhaps even the right thing to do. The problem is that we don't have the necessary context (har har) to match any kernel events (and there is the possibility that there are none) to the userspace generated AUDIT_USER* event. Further, I don't think this is something we would ever be able to solve in a reasonable manner. > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > > index e469234..c8c2efc 100644 > > > > --- a/kernel/audit.c > > > > +++ b/kernel/audit.c > > > > @@ -1057,7 +1057,8 @@ static int audit_netlink_ok(struct sk_buff *skb, > > > > u16 msg_type) > > > > return err; > > > > } > > > > > > > > -static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 > > > > msg_type) > > > > +static void __audit_log_common_recv_msg(struct audit_context *context, > > > > + struct audit_buffer **ab, u16 > > > > msg_type) > > > > { > > > > uid_t uid = from_kuid(&init_user_ns, current_uid()); > > > > pid_t pid = task_tgid_nr(current); > > > > @@ -1067,7 +1068,7 @@ static void audit_log_common_recv_msg(struct > > > > audit_buffer **ab, u16 msg_type) > > > > return; > > > > } > > > > > > > > - *ab = audit_log_start(audit_context(), GFP_KERNEL, msg_type); > > > > + *ab = audit_log_start(context, GFP_KERNEL, msg_type); > > > > if (unlikely(!*ab)) > > > > return; > > > > audit_log_format(*ab, "pid=%d uid=%u", pid, uid); > > > > @@ -1075,6 +1076,11 @@ static void audit_log_common_recv_msg(struct > > > > audit_buffer **ab, u16 msg_type) > > > > audit_log_task_context(*ab); > > > > } > > > > > > > > +static inline void audit_log_common_recv_msg(struct audit_buffer **ab, > > > > u16 msg_type) > > > > +{ > > > > + __audit_log_common_recv_msg(audit_context(), ab, msg_type); > > > > +} > > > > + > > > > int is_audit_feature_set(int i) > > > > { > > > > return af.features & AUDIT_FEATURE_TO_MASK(i); > > > > @@ -1341,7 +1347,7 @@ static int audit_receive_msg(struct sk_buff *skb, > > > > struct nlmsghdr *nlh) > > > > if (err) > > > > break; > > > > } > > > > - audit_log_common_recv_msg(&ab, msg_type); > > > > + __audit_log_common_recv_msg(NULL, &ab, > > > > msg_type); &g
Re: [RFC PATCH] audit: use current whenever possible
On Mon, Jul 23, 2018 at 3:40 PM Richard Guy Briggs wrote: > On 2018-07-20 18:17, Paul Moore wrote: > > There are many places, notably audit_log_task_info() and > > audit_log_exit(), that take task_struct pointers but in reality they > > are always working on the current task. This patch eliminates the > > task_struct arguments and uses current directly which allows a number > > of cleanups as well. > > I came across and removed a several in the audit task struct cleanup, > but it looks like you've rebased over those and caught a few more. I just based this patch against audit/next to make life easier. Since the earliest it would possibly go into the audit tree would be after the next merge window it will likely get rebased/merged again. If there is another patch that does some of this work and gets merged first, awesome, if not, that's fine too. > I'm fine with delaying setting task's context to NULL for > __audit_free(). Yeah, it really shouldn't matter when it happens in __audit_free() as we should be the only ones who are touching that task_struct at that point in time. > Why was the context originally taken for __audit_syscall_exit() and > given back once the syscall event records have been issued? Is there a > possible race with something else? That was a bit bizarre, wasn't it? There shouldn't be a race condition as the audit_context is private to the individual task and at the point in time where __audit_syscall_exit() is being called we shouldn't have to worry about other things hitting the task_struct. If anything, this patch should actually make things better by not setting the current->context to NULL at the start of __audit_syscall_exit() only to reset it back to the original value at the end (the audit_take_context() function, and it's relationship with audit_log_exit() was ... odd ... and that is me being kind). I'm chalking this up to "audit being audit" :/ > > Signed-off-by: Paul Moore > > Otherwise, this cleanup looks like a good simplification. > Reviewed-by: Richard Guy Briggs Diffstats that remove more lines than they add always make me happy. Thanks for taking a look. It boots and passes our tests but I still haven't convinced myself all those changes are correct. I'll send a note if/when it gets merged, but like I said that won't happen until after the merge window closes as we are at -rc6 right now. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] audit: fix potential null dereference 'context->module.name'
On Tue, Jul 24, 2018 at 7:39 AM Eric Paris wrote: > Would it make more sense to actually check for failure on allocation > rather than try to remember to deal with it later? How about we just > have audit_log_kern_module return an error and fail if we are OOM? Generally I agree, checking on allocation is the right thing to do. However, in this particular case the error recovery for a failed allocation would likely ignore the record entirely, and I think a module load record with a "(null)" module name is still better than no module load record at all ... and yes, I understand that if the module name allocation fails there is a good chance other allocations will fail and we might not emit the record, but I'll take my chances that the load is transient and the system is able to recover in a timely manner. > (also this seems like a good place to use kstrdup, instead of > kmalloc+strcpy) Yes, I agree. Yi Wang, could you submit a v2 of this patch using kstrdup()? > On Tue, 2018-07-24 at 13:57 +0800, Yi Wang wrote: > > The variable 'context->module.name' may be null pointer when > > kmalloc return null, so it's better to check it before using > > to avoid null dereference. > > > > Signed-off-by: Yi Wang > > Reviewed-by: Jiang Biao > > --- > > kernel/auditsc.c | 11 --- > > 1 file changed, 8 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > > index e80459f..4830b83 100644 > > --- a/kernel/auditsc.c > > +++ b/kernel/auditsc.c > > @@ -1272,8 +1272,12 @@ static void show_special(struct audit_context > > *context, int *call_panic) > > break; > > case AUDIT_KERN_MODULE: > > audit_log_format(ab, "name="); > > - audit_log_untrustedstring(ab, context->module.name); > > - kfree(context->module.name); > > + if (context->module.name) { > > + audit_log_untrustedstring(ab, context- > > >module.name); > > + kfree(context->module.name); > > + } else > > + audit_log_format(ab, "(null)"); > > + > > break; > > } > > audit_log_end(ab); > > @@ -2409,7 +2413,8 @@ void __audit_log_kern_module(char *name) > > struct audit_context *context = current->audit_context; > > > > context->module.name = kmalloc(strlen(name) + 1, GFP_KERNEL); > > - strcpy(context->module.name, name); > > + if (context->module.name) > > + strcpy(context->module.name, name); > > context->type = AUDIT_KERN_MODULE; > > } > > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak59 V1 3/6] audit: exclude user records from syscall context
On Tue, Jul 24, 2018 at 9:05 AM Richard Guy Briggs wrote: > On 2018-07-23 17:00, Paul Moore wrote: > > On Mon, Jul 23, 2018 at 12:43 PM Richard Guy Briggs wrote: > > > On 2018-07-12 17:46, Richard Guy Briggs wrote: > > > > On 2018-06-28 18:11, Paul Moore wrote: > > > > > On Thu, Jun 14, 2018 at 4:23 PM Richard Guy Briggs > > > > > wrote: > > > > > > Since the function audit_log_common_recv_msg() is shared by a > > > > > > number of > > > > > > AUDIT_CONFIG_CHANGE and the entire range of AUDIT_USER_* record > > > > > > types, > > > > > > and since the AUDIT_CONFIG_CHANGE message type has been converted > > > > > > to a > > > > > > syscall accompanied record type, special-case the AUDIT_USER_* > > > > > > range of > > > > > > messages so they remain standalone records. > > > > > > > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/59 > > > > > > Signed-off-by: Richard Guy Briggs > > > > > > --- > > > > > > kernel/audit.c | 12 +--- > > > > > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > > > > > > > I think this is fine, but see my previous comment about combining 2/6 > > > > > and 3/6 as a safety measure. > > > > > > > > This one I left as a seperate patch for discussion. We'd previously > > > > talked about connecting all possible records with syscall records if > > > > they exist, but this one I'm unsure about, since we don't really care > > > > what userspace process is issuing this message. It is just the message > > > > content itself that is important. Or is it? Are we concerned about > > > > CAP_AUDIT_WRITE holders/abusers and want as much info about them as we > > > > can get in case they go rogue or pear-shaped? > > > > > > I'm waiting on re-spinning this patchset because of this open question. > > > > > > Is connecting AUDIT_USER* records desirable or a liability? > > > > Like I said, I think special casing the AUDIT_USER* records so they > > are *not* associated with other records is okay, and perhaps even the > > right thing to do. The problem is that we don't have the necessary > > context (har har) to match any kernel events (and there is the > > possibility that there are none) to the userspace generated > > AUDIT_USER* event. Further, I don't think this is something we would > > ever be able to solve in a reasonable manner. > > Ok, having said that, I think I'd still prefer to keep this patch > seperate, partly to retain the simplicity of the previous patch and make > very clear what each one is doing, and partly if we decide to change our > mind in the future that these AUDIT_USER* records should be autonomous. Okay, I'll buy that argument. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS
On Tue, Jul 24, 2018 at 3:48 PM Steve Grubb wrote: > On Friday, July 20, 2018 6:15:00 PM EDT Paul Moore wrote: > > On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs wrote: > > > Add audit container identifier auxiliary record(s) to NETFILTER_PKT > > > event standalone records. Iterate through all potential audit container > > > identifiers associated with a network namespace. > > > > > > Signed-off-by: Richard Guy Briggs > > > --- > > > include/linux/audit.h| 5 + > > > kernel/audit.c | 20 +++- > > > kernel/auditsc.c | 2 ++ > > > net/netfilter/xt_AUDIT.c | 12 ++-- > > > 4 files changed, 36 insertions(+), 3 deletions(-) > > > > ... > > > > > diff --git a/include/linux/audit.h b/include/linux/audit.h > > > index 7e2e51c..4560a4e 100644 > > > --- a/include/linux/audit.h > > > +++ b/include/linux/audit.h > > > @@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context > > > *context, extern void audit_contid_add(struct net *net, u64 contid); > > > extern void audit_contid_del(struct net *net, u64 contid); > > > extern void audit_switch_task_namespaces(struct nsproxy *ns, struct > > > task_struct *p); +extern void audit_log_contid_list(struct net *net, > > > +struct audit_context *context); > > > > See my comment in previous patches about changing the function name to > > better indicate it's dedicate use for network namespaces. > > > > > extern int audit_update_lsm_rules(void); > > > > > > @@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net, > > > u64 contid) { } > > > static inline void audit_switch_task_namespaces(struct nsproxy *ns, > > > struct task_struct *p) { } > > > +static inline void audit_log_contid_list(struct net *net, > > > + struct audit_context *context) > > > +{ } > > > > > > #define audit_enabled 0 > > > #endif /* CONFIG_AUDIT */ > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > index ecd2de4..8cca41a 100644 > > > --- a/kernel/audit.c > > > +++ b/kernel/audit.c > > > @@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy > > > *ns, struct task_struct *p) audit_contid_add(new->net_ns, contid); > > > } > > > > > > +void audit_log_contid_list(struct net *net, struct audit_context > > > *context) +{ > > > + struct audit_contid *cont; > > > + int i = 0; > > > + > > > + list_for_each_entry(cont, audit_get_contid_list(net), list) { > > > + char buf[14]; > > > + > > > + sprintf(buf, "net%u", i++); > > > + audit_log_contid(context, buf, cont->id); > > > > Hmm. It looks like this will generate multiple audit container ID > > records with "op=netX contid=Y" (X=netns number, Y=audit container > > ID), is that what we want? I've mentioned my concern around the "op" > > values in these records earlier in the patchset, that still applies > > here, but now I'm also concerned about the multiple records. I'm > > thinking we might be better served with a single record with either > > multiple "contid" fields, or a single "contid" field with a set of > > comma separated values (or some other delimiter that Steve's tools > > will tolerate). > > > > Steve, thoughts? > > A single record is best. Maybe pattern this after the args listed in an > execve record. I'm concerned that an execve-like approach might not scale very well as would could potentially have a lot of containers sharing a single network namespace ("a%d=%d" vs ",%d"). Further, with execve we log the argument position in addition to the argument itself, that isn't something we need to worry about with the audit container IDs. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces
On Tue, Jul 24, 2018 at 10:06 AM Richard Guy Briggs wrote: > On 2018-07-20 18:14, Paul Moore wrote: > > On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs wrote: > > > Audit events could happen in a network namespace outside of a task > > > context due to packets received from the net that trigger an auditing > > > rule prior to being associated with a running task. The network > > > namespace could in use by multiple containers by association to the > > > tasks in that network namespace. We still want a way to attribute > > > these events to any potential containers. Keep a list per network > > > namespace to track these audit container identifiiers. > > > > > > Add/increment the audit container identifier on: > > > - initial setting of the audit container identifier via /proc > > > - clone/fork call that inherits an audit container identifier > > > - unshare call that inherits an audit container identifier > > > - setns call that inherits an audit container identifier > > > Delete/decrement the audit container identifier on: > > > - an inherited audit container identifier dropped when child set > > > - process exit > > > - unshare call that drops a net namespace > > > - setns call that drops a net namespace > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/92 > > > See: https://github.com/linux-audit/audit-testsuite/issues/64 > > > See: > > > https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID > > > Signed-off-by: Richard Guy Briggs > > > --- > > > include/linux/audit.h | 23 > > > kernel/audit.c| 72 > > > +++ > > > kernel/auditsc.c | 5 > > > kernel/nsproxy.c | 4 +++ > > > 4 files changed, 104 insertions(+) ... > > > + } > > > + if (!found) > > > + return; > > > + list_del(&cont->list); > > > + if (refcount_dec_and_test(&cont->refcount)) > > > + kfree(cont); > > > > Don't you want to dec_and_test first and only remove it from the list > > if there are no other references? > > I don't think so. Let me try to describe it in prose to see if I > understood this properly and see if this makes more sense: I want to > remove this audit_contid list member from this net's audit_contid list > and decrement unconditionally this member's refcount so it knows there > is one less thing pointing at it and when there is no longer anything > pointing at it, free it. Yep, sorry, my mistake, I was thinking the other way around (netns going away) ... which actually, this patchset doesn't handle that does it (I don't see any new code in audit_net_exit())? Is is in a later patch? If so, it really should be in the same patch as this code to prevent bisect nasties. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 05/10] audit: add containerid support for tty_audit
On Tue, Jul 24, 2018 at 10:10 AM Richard Guy Briggs wrote: > On 2018-07-20 18:14, Paul Moore wrote: > > On Wed, Jun 6, 2018 at 1:04 PM Richard Guy Briggs wrote: > > > Add audit container identifier auxiliary record to tty logging rule > > > event standalone records. > > > > > > Signed-off-by: Richard Guy Briggs > > > --- > > > drivers/tty/tty_audit.c | 5 - > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c > > > index e30aa6b..66bd850 100644 > > > --- a/drivers/tty/tty_audit.c > > > +++ b/drivers/tty/tty_audit.c > > > @@ -66,8 +66,9 @@ static void tty_audit_log(const char *description, > > > dev_t dev, > > > uid_t uid = from_kuid(&init_user_ns, task_uid(tsk)); > > > uid_t loginuid = from_kuid(&init_user_ns, > > > audit_get_loginuid(tsk)); > > > unsigned int sessionid = audit_get_sessionid(tsk); > > > + struct audit_context *context = audit_alloc_local(); > > > > We should be using current's audit_context in tty_audit_log(). > > Actually, we should probably just get rid of the tsk variable in > > tty_audit_log() and use current directly to make things a bit more > > obvious. > > Ok, agreed. At this point, it it current passed in anyways so no harm > other than efficiency. > > > > > > > I did some digging and I have a two year old, half-baked patch that > > cleans up this tsk/current usage as well as a few others. I just > > rebased it against audit/next and surprisingly it seems to pass a > > basic smoke test (kernel boots and audit-testsuite passes); I'll post > > it to the list as a RFC once I'm done reviewing these patches. > > I'll leave this patch the way it is since there should be no difference > and trust this other patch will work its way through the system and > solve that. Yep, that's a merge issue I'll deal with when we get to that point. Although, I would expect that when you post an updated patchset that it applies cleanly to the then-current audit/next tree (or Linus' tree, but audit/next is preferable). In other words, please rebase your development branch before doing your final dev testing and posting. > > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY); > > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_TTY); > > > if (ab) { > > > char name[sizeof(tsk->comm)]; > > > > > > @@ -80,6 +81,8 @@ static void tty_audit_log(const char *description, > > > dev_t dev, > > > audit_log_n_hex(ab, data, size); > > > audit_log_end(ab); > > > } > > > + audit_log_contid(context, "tty", audit_get_contid(tsk)); > > > + audit_free_context(context); > > > } > > > > -- > > paul moore > > www.paul-moore.com > > - RGB > > -- > Richard Guy Briggs > Sr. S/W Engineer, Kernel Security, Base Operating Systems > Remote, Ottawa, Red Hat Canada > IRC: rgb, SunRaycer > Voice: +1.647.777.2635, Internal: (81) 32635 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id
On Tue, Jul 24, 2018 at 3:09 PM Richard Guy Briggs wrote: > On 2018-07-20 18:13, Paul Moore wrote: > > On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs wrote: > > > Implement the proc fs write to set the audit container identifier of a > > > process, emitting an AUDIT_CONTAINER_ID record to document the event. > > > > > > This is a write from the container orchestrator task to a proc entry of > > > the form /proc/PID/audit_containerid where PID is the process ID of the > > > newly created task that is to become the first task in a container, or > > > an additional task added to a container. > > > > > > The write expects up to a u64 value (unset: 18446744073709551615). > > > > > > The writer must have capability CAP_AUDIT_CONTROL. > > > > > > This will produce a record such as this: > > > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set > > > opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root > > > uid=root tty=ttyS0 ses=1 > > > subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash > > > exe=/usr/bin/bash res=yes > > > > > > The "op" field indicates an initial set. The "pid" to "ses" fields are > > > the orchestrator while the "opid" field is the object's PID, the process > > > being "contained". Old and new audit container identifier values are > > > given in the "contid" fields, while res indicates its success. > > > > > > It is not permitted to unset or re-set the audit container identifier. > > > A child inherits its parent's audit container identifier, but then can > > > be set only once after. > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/90 > > > See: https://github.com/linux-audit/audit-userspace/issues/51 > > > See: https://github.com/linux-audit/audit-testsuite/issues/64 > > > See: > > > https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID > > > > > > Signed-off-by: Richard Guy Briggs > > > --- > > > fs/proc/base.c | 37 > > > include/linux/audit.h | 25 > > > include/uapi/linux/audit.h | 2 ++ > > > kernel/auditsc.c | 71 > > > ++ > > > 4 files changed, 135 insertions(+) ... > > > @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid) > > > } > > > > > > /** > > > + * audit_set_contid - set current task's audit_context contid > > > + * @contid: contid value > > > + * > > > + * Returns 0 on success, -EPERM on permission failure. > > > + * > > > + * Called (set) from fs/proc/base.c::proc_contid_write(). > > > + */ > > > +int audit_set_contid(struct task_struct *task, u64 contid) > > > +{ > > > + u64 oldcontid; > > > + int rc = 0; > > > + struct audit_buffer *ab; > > > + uid_t uid; > > > + struct tty_struct *tty; > > > + char comm[sizeof(current->comm)]; > > > + > > > + /* Can't set if audit disabled */ > > > + if (!task->audit) > > > + return -ENOPROTOOPT; > > > + oldcontid = audit_get_contid(task); > > > + /* Don't allow the audit containerid to be unset */ > > > + if (!cid_valid(contid)) > > > + rc = -EINVAL; > > > + /* if we don't have caps, reject */ > > > + else if (!capable(CAP_AUDIT_CONTROL)) > > > + rc = -EPERM; > > > + /* if task has children or is not single-threaded, deny */ > > > + else if (!list_empty(&task->children)) > > > + rc = -EBUSY; > > > > Is this safe without holding tasklist_lock? I worry we might be > > vulnerable to a race with fork(). > > > > > + else if (!(thread_group_leader(task) && thread_group_empty(task))) > > > + rc = -EALREADY; > > > > Similar concern here as well, although related to threads. > > I think you are correct here and tasklist_lock should cover both. Do we > also want rcu_read_lock() immediately preceeding it? You'll need to take a closer look and determine the locking scheme. I simply took a quick look while reviewing this patch to see what of the existing locks, if any, would be most applicable here; ta
Re: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records
On Tue, Jul 24, 2018 at 3:40 PM Richard Guy Briggs wrote: > On 2018-07-20 18:14, Paul Moore wrote: > > On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs wrote: > > > Standalone audit records have the timestamp and serial number generated > > > on the fly and as such are unique, making them standalone. This new > > > function audit_alloc_local() generates a local audit context that will > > > be used only for a standalone record and its auxiliary record(s). The > > > context is discarded immediately after the local associated records are > > > produced. > > > > > > Signed-off-by: Richard Guy Briggs > > > --- > > > include/linux/audit.h | 8 > > > kernel/auditsc.c | 25 +++-- > > > 2 files changed, 31 insertions(+), 2 deletions(-) ... > > > + struct audit_context *context; > > > + > > > + if (!audit_ever_enabled) > > > + return NULL; /* Return if not auditing. */ > > > + > > > + context = audit_alloc_context(AUDIT_RECORD_CONTEXT); > > > + if (!context) > > > + return NULL; > > > + context->serial = audit_serial(); > > > + context->ctime = current_kernel_time64(); > > > + context->in_syscall = 1; > > > > Setting in_syscall is both interesting and a bit troubling, if for no > > other reason than I expect most (all?) callers to be in an interrupt > > context when audit_alloc_local() is called. Setting in_syscall would > > appear to be conceptually in this case. Can you help explain why this > > is the right thing to do, or necessary to ensure things are handled > > correctly? > > I'll admit this is cheating a bit, but seemed harmless. It is needed so > that auditsc_get_stamp() from audit_get_stamp() from audit_log_start() > doesn't bail on me without giving me its already assigned time and > serial values rather than generating a new one. I did look to see if > there were any other undesireable side effects and found none, so I'm > tmepted to rename the ->in_syscall to something a bit more helpful. I > could add a new audit_context structure member to make > auditsc_get_stamp() co-operative, but this seems wasteful and > unnecessary. That's what I suspected. Let's look into renaming the "in_syscall" field, it borderline confusing now, and hijacking it for something which is very obviously not "in syscall" is A Very Bad Thing. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak9 0/3] audit: Record the path of FDs passed to *at(2) syscalls
On Tue, Jul 24, 2018 at 10:12 AM Ondrej Mosnacek wrote: > OK, so I just wrote a small script to see what PATH records would be > generated for a renameat(2) syscall with non-AT_FDCWD fd arguments, > and it seems the current implementation is not lacking information, > but actually buggy. > > strace output: > openat(AT_FDCWD, "/tmp/tmp.CXtBRafonK/a", O_RDONLY|O_PATH|O_DIRECTORY) = 3 > openat(AT_FDCWD, "/tmp/tmp.CXtBRafonK/b", O_RDONLY|O_PATH|O_DIRECTORY) = 4 > renameat(3, "f", 4, "g")= 0 > close(3)= 0 > close(4)= 0 > > Corresponding audit records for renameat(): > type=SYSCALL msg=audit(1532439957.660:5): arch=c03e syscall=264 > success=yes exit=0 a0=3 a1=7ffcc364de2a a2=4 a3=7ffcc364de42 items=4 > ppid=594 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 > sgid=0 fsgid=0 tty=(none) ses=1 comm="trigger-renamea" > exe="/tmp/tmp.GEfuEtn1II/trigger-renameat" > subj=system_u:system_r:kernel_t:s0 key=(null) > type=CWD msg=audit(1532439957.660:5): cwd="/root/Dokumenty/Kernel" ... > type=PATH msg=audit(1532439957.660:5): item=0 > name="/root/Dokumenty/Kernel" inode=2155 dev=00:1a mode=040755 ouid=0 > ogid=0 rdev=00:00 obj=system_u:object_r:tmpfs_t:s0 nametype=PARENT > cap_fp= cap_fi= cap_fe=0 cap_fver=0 > type=PATH msg=audit(1532439957.660:5): item=1 > name="/root/Dokumenty/Kernel" inode=2156 dev=00:1a mode=040755 ouid=0 > ogid=0 rdev=00:00 obj=system_u:object_r:tmpfs_t:s0 nametype=PARENT > cap_fp= cap_fi= cap_fe=0 cap_fver=0 > type=PATH msg=audit(1532439957.660:5): item=2 name="f" inode=2157 > dev=00:1a mode=0100644 ouid=0 ogid=0 rdev=00:00 > obj=system_u:object_r:tmpfs_t:s0 nametype=DELETE > cap_fp= cap_fi= cap_fe=0 cap_fver=0 > type=PATH msg=audit(1532439957.660:5): item=3 name="g" inode=2157 > dev=00:1a mode=0100644 ouid=0 ogid=0 rdev=00:00 > obj=system_u:object_r:tmpfs_t:s0 nametype=CREATE > cap_fp= cap_fi= cap_fe=0 cap_fver=0 ... > The PARENT paths are incorrectly reporting the CWD path instead of the > path of the source/destination directories, even though the inode > numbers seem to be correct. Yes, that's odd, and not desirable. > Beyond that, there is really no information in the records that would > allow reconstructing which PARENT path belongs to which CREATE/DELETE > path... (Intuitively you can guess that src will come before dst, but > that is not very reliable.) I think a "parent inode" field in the PATH > records could fix this, but maybe there is a better solution... I have my suspicions, but I would be curious to hear from Steve how the reconstruction is typically handled. > I'll see if/how I can fix these issues (especially the first one) and > then I'll get back to the original issue. renameat() and maybe a few > other syscalls should be OK after the fix, but at least openat() will > need some further work (right now it only emits just one PATH record > with only relative path). Yes, let's fix this first and go from there. Thanks for looking into this. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] audit: fix potential null dereference 'context->module.name'
On Tue, Jul 24, 2018 at 6:38 PM Eric Paris wrote: > On Tue, 2018-07-24 at 15:55 -0400, Paul Moore wrote: > > On Tue, Jul 24, 2018 at 7:39 AM Eric Paris wrote: > > > Would it make more sense to actually check for failure on > > > allocation > > > rather than try to remember to deal with it later? How about we > > > just > > > have audit_log_kern_module return an error and fail if we are OOM? > > > > Generally I agree, checking on allocation is the right thing to do. > > However, in this particular case the error recovery for a failed > > allocation would likely ignore the record entirely, and I think a > > module load record with a "(null)" module name is still better than > > no > > module load record at all ... and yes, I understand that if the > > module > > name allocation fails there is a good chance other allocations will > > fail and we might not emit the record, but I'll take my chances that > > the load is transient and the system is able to recover in a timely > > manner. > > On the load_module() code path passing the error up the stack would > cause the module to not be loaded, instead of being loaded with loss of > name information. I'd rather have the module not loaded and a > name=(null) record than it BE loaded and get name=(null) > > You've sold me on why Yi Wang's patch is good, but not on why we > wouldn't propagate the error up the stack on load/delete. (even if we > may choose to delete the module on OOM) For better or worse most (all?) of the various audit_log_*() functions don't return an error code, or if they do they aren't reliably checked by callers. I agree with you that in a perfect world it would be nice if an audit failure here would block the operation, but that isn't how audit is typically used. We could presumably fail the record generation on an allocation failure and signal a lost record via audit_log_lost()? Perhaps that's the best option as it would leave the decision up to administrator (continue without the record, printk, or panic). -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 04/10] audit: Embed key into chunk
On Tue, Jul 10, 2018 at 6:02 AM Jan Kara wrote: > Currently chunk hash key (which is in fact pointer to the inode) is > derived as chunk->mark.conn->obj. It is tricky to make this dereference > reliable for hash table lookups only under RCU as mark can get detached > from the connector and connector gets freed independently of the > running lookup. Thus there is a possible use after free / NULL ptr > dereference issue: > > CPU1CPU2 > untag_chunk() > ... > audit_tree_lookup() > list_for_each_entry_rcu(p, list, hash) { > list_del_rcu(&chunk->hash); > fsnotify_destroy_mark(entry); > fsnotify_put_mark(entry) > chunk_to_key(p) > if (!chunk->mark.connector) > ... > > hlist_del_init_rcu(&mark->obj_list); > if (hlist_empty(&conn->list)) { > inode = > fsnotify_detach_connector_from_object(conn); > mark->connector = NULL; > ... > frees connector from workqueue > chunk->mark.connector->obj > > This race is probably impossible to hit in practice as the race window > on CPU1 is very narrow and CPU2 has a lot of code to execute. Still it's > better to have this fixed. Since the inode the chunk is attached to is > constant during chunk's lifetime it is easy to cache the key in the > chunk itself and thus avoid these issues. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 27 --- > 1 file changed, 8 insertions(+), 19 deletions(-) This looks okay to me. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 07/10] audit: Remove pointless check in insert_hash()
On Tue, Jul 10, 2018 at 6:02 AM Jan Kara wrote: > > The audit_tree_group->mark_mutex is held all the time while we create > the fsnotify mark, add it to the inode, and insert chunk into the hash. > Hence mark cannot get detached during this time and so the check whether > the mark is attached in insert_hash() is pointless. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 2 -- > 1 file changed, 2 deletions(-) This makes sense. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 02/10] audit: Fix possible spurious -ENOSPC error
On Tue, Jul 10, 2018 at 6:02 AM Jan Kara wrote: > When an inode is tagged with a tree, tag_chunk() checks whether there is > audit_tree_group mark attached to the inode and adds one if not. However > nothing protects another tag_chunk() to add the mark between we've > checked and try to add the fsnotify mark thus resulting in an error from > fsnotify_add_mark() and consequently an ENOSPC error from tag_chunk(). > > Fix the problem by holding mark_mutex over the whole check-insert code > sequence. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 26 -- > 1 file changed, 16 insertions(+), 10 deletions(-) ... > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 1c82eb6674c4..de8d344d91b1 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -342,25 +342,29 @@ static void untag_chunk(struct node *p) > spin_lock(&hash_lock); > } > > +/* Call with group->mark_mutex held, releases it */ Stuff like that always makes me nervous. Could we defer releasing the mutex to the caller, after create_chunk() returns? It looks like fsnotify_destroy_mark() allows a single level of nesting so it should be okay, yes? -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 01/10] audit_tree: Remove mark->lock locking
On Tue, Jul 10, 2018 at 6:02 AM Jan Kara wrote: > Currently, audit_tree code uses mark->lock to protect against detaching > of mark from an inode. In most places it however also uses > mark->group->mark_mutex (as we need to atomically replace attached > marks) and this provides protection against mark detaching as well. So > just remove protection with mark->lock from audit tree code and replace > it with mark->group->mark_mutex protection in all the places. It > simplifies the code and gets rid of some ugly catches like calling > fsnotify_add_mark_locked() with mark->lock held (which cannot sleep only > because we hold a reference to another mark attached to the same inode). > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 24 > 1 file changed, 4 insertions(+), 20 deletions(-) ... > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 02feef939560..1c82eb6674c4 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -360,12 +355,12 @@ static int create_chunk(struct inode *inode, struct > audit_tree *tree) > return -ENOSPC; > } > > - spin_lock(&entry->lock); > + mutex_lock(&entry->group->mark_mutex); I wonder if we could move the lock up above the fsnotify_add_inode_mark() earlier in create_chunk() and use fsnotify_add_mark_locked()? -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 10/10] audit: Replace chunk attached to mark instead of replacing mark
On Tue, Jul 10, 2018 at 6:02 AM Jan Kara wrote: > Audit tree code currently associates new fsnotify mark with each new > chunk. As chunk attached to an inode is replaced when new tag is added / > removed, we also need to remove old fsnotify mark and add a new one on > such occasion. This is cumbersome and makes locking rules somewhat > difficult to follow. > > Fix these problems by allocating fsnotify mark independently of chunk > and keeping it all the time while there is some chunk attached to an > inode. Also add documentation about the locking rules so that things are > easier to follow. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 163 > +++- > 1 file changed, 85 insertions(+), 78 deletions(-) This is a really nice improvement, thanks! > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index aec9b27a20ff..40f61de77dd0 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -272,6 +273,20 @@ static struct audit_chunk *find_chunk(struct node *p) > return container_of(p, struct audit_chunk, owners[0]); > } > > +static void replace_mark_chunk(struct fsnotify_mark *entry, > + struct audit_chunk *chunk) > +{ > + struct audit_chunk *old; > + > + assert_spin_locked(&hash_lock); > + old = AUDIT_M(entry)->chunk; > + AUDIT_M(entry)->chunk = chunk; > + if (chunk) > + chunk->mark = entry; > + if (old) > + old->mark = NULL; Is it necessary that we check to see if chunk and old are non-NULL? It seems like we would always want to set chunk->mark to entry and set old->mark to NULL, yes? > @@ -321,29 +341,31 @@ static void untag_chunk(struct node *p) > > mutex_lock(&entry->group->mark_mutex); > /* > -* mark_mutex protects mark from getting detached and thus also from > -* mark->connector->obj getting NULL. > +* mark_mutex protects mark stabilizes chunk attached to the mark so > we > +* can check whether it didn't change while we've dropped hash_lock. I think your new text could use some revision, the "protects mark stabilizes chunk" is odd. > */ > - if (chunk->dead || !(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED)) { > + if (!(entry->flags & FSNOTIFY_MARK_FLAG_ATTACHED) || > + AUDIT_M(entry)->chunk != chunk) { > mutex_unlock(&entry->group->mark_mutex); > if (new) > - fsnotify_put_mark(new->mark); > + kfree(new); Since we are just calling kfree() now we can do away with the "if (new)" check. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 09/10] audit: Allocate fsnotify mark independently of chunk
On Tue, Jul 10, 2018 at 6:02 AM Jan Kara wrote: > Allocate fsnotify mark independently instead of embedding it inside > chunk. This will allow us to just replace chunk attached to mark when > growing / shrinking chunk instead of replacing mark attached to inode > which is a more complex operation. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 59 > - > 1 file changed, 45 insertions(+), 14 deletions(-) ... > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index bce3b04a365d..aec9b27a20ff 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -38,6 +38,11 @@ struct audit_chunk { > } owners[]; > }; > > +struct audit_tree_mark { > + struct fsnotify_mark fsn_mark; > + struct audit_chunk *chunk; > +}; It's probably okay to just call it "mark" considering we call fsnotify_mark fields "mark" elsewhere. If we are going to change it in one spot we should probably change it other places as well for the sake of readability. 2,10 +148,28 @@ static void audit_mark_put_chunk(struct audit_chunk *chunk) > call_rcu(&chunk->head, __put_chunk); > } > > +static inline struct audit_tree_mark *AUDIT_M(struct fsnotify_mark *entry) > +{ > + return container_of(entry, struct audit_tree_mark, fsn_mark); > +} When I see a symbol in all caps I think "macro definition", but this isn't a macro definition. I would suggest a different name, dropping the caps, or converting it into a macro. Also, unless I missed a call, it seems like after patch 10 all callers of AUDIT_M end up getting the chunk field; maybe it is better if AUDIT_M() ends up returning the audit_chunk pointer instead of the audit_tree_mark pointer (and of course a name change if this is a reasonable change)? > static void audit_tree_destroy_watch(struct fsnotify_mark *entry) > { > - struct audit_chunk *chunk = container_of(entry, struct audit_chunk, > mark); > + struct audit_chunk *chunk = AUDIT_M(entry)->chunk; > audit_mark_put_chunk(chunk); > + kmem_cache_free(audit_tree_mark_cachep, entry); > +} I think you've said you've already fixed the above (thanks for the review Amir!). > +static struct fsnotify_mark *alloc_fsnotify_mark(void) > +{ > + struct audit_tree_mark *mark; > + > + mark = kmem_cache_zalloc(audit_tree_mark_cachep, GFP_KERNEL); > + if (!mark) > + return NULL; > + fsnotify_init_mark(&mark->fsn_mark, audit_tree_group); > + mark->fsn_mark.mask = FS_IN_IGNORED; > + return &mark->fsn_mark; > } Can't we just call it alloc_mark()? Or did you create the function earlier in the patchset and I'm missing it now? [SIDE NOTE: I have some rather big disagreements with the current naming in this file, but keeping things consistent seems to be a Good Thing (once again, this is an existing problem not specific to your patches).] -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 06/10] audit: Factor out chunk replacement code
audit_tree *tree) > { > struct fsnotify_mark *old_entry, *chunk_entry; > - struct audit_tree *owner; > struct audit_chunk *chunk, *old; > struct node *p; > int n; > @@ -464,35 +472,21 @@ static int tag_chunk(struct inode *inode, struct > audit_tree *tree) > fsnotify_put_mark(old_entry); > return 0; > } > - chunk->key = old->key; > - list_replace_init(&old->trees, &chunk->trees); > - for (n = 0, p = chunk->owners; n < old->count; n++, p++) { > - struct audit_tree *s = old->owners[n].owner; > - p->owner = s; > - p->index = old->owners[n].index; > - if (!s) /* result of fallback in untag */ > - continue; > - get_tree(s); > - list_replace_init(&old->owners[n].list, &p->list); > - } > + p = &chunk->owners[chunk->count - 1]; > p->index = (chunk->count - 1) | (1U<<31); > p->owner = tree; > get_tree(tree); > list_add(&p->list, &tree->chunks); > - list_for_each_entry(owner, &chunk->trees, same_root) > - owner->root = chunk; > old->dead = 1; > if (!tree->root) { > tree->root = chunk; > list_add(&tree->same_root, &chunk->trees); > } > /* > -* Make sure chunk is fully initialized before making it visible in > the > -* hash. Pairs with a data dependency barrier in READ_ONCE() in > -* audit_tree_lookup(). > +* This has to go last when updating chunk as once replace_chunk() is > +* called, new RCU readers can see the new chunk. > */ > - smp_wmb(); > - list_replace_rcu(&old->hash, &chunk->hash); > + replace_chunk(chunk, old, NULL); > spin_unlock(&hash_lock); > fsnotify_detach_mark(old_entry); > mutex_unlock(&audit_tree_group->mark_mutex); > -- > 2.16.4 -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH v2] audit: fix potential null dereference 'context->module.name'
On Tue, Jul 24, 2018 at 10:28 PM Yi Wang wrote: > The variable 'context->module.name' may be null pointer when > kmalloc return null, so it's better to check it before using > to avoid null dereference. > Another one more thing this patch does is using kstrdup instead > of (kmalloc + strcpy), and signal a lost record via audit_log_lost. > > Signed-off-by: Yi Wang > Reviewed-by: Jiang Biao > --- > v2: use kstrdup instead of kmalloc + strcpy, and signal a lost > record. Thanks to Eric and Paul. > > kernel/auditsc.c | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) Thanks, this looks good to me. I'm also going to tag this for -stable. I'm building a test kernel right now and if all goes well I'll send this up to Linus for v4.18; if he doesn't pull it for v4.18 I'll add this to the audit/next branch. > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index e80459f..713386a 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -1272,8 +1272,12 @@ static void show_special(struct audit_context > *context, int *call_panic) > break; > case AUDIT_KERN_MODULE: > audit_log_format(ab, "name="); > - audit_log_untrustedstring(ab, context->module.name); > - kfree(context->module.name); > + if (context->module.name) { > + audit_log_untrustedstring(ab, context->module.name); > + kfree(context->module.name); > + } else > + audit_log_format(ab, "(null)"); > + > break; > } > audit_log_end(ab); > @@ -2408,8 +2412,9 @@ void __audit_log_kern_module(char *name) > { > struct audit_context *context = current->audit_context; > > - context->module.name = kmalloc(strlen(name) + 1, GFP_KERNEL); > - strcpy(context->module.name, name); > + context->module.name = kstrdup(name, GFP_KERNEL); > + if (!context->module.name) > + audit_log_lost("out of memory in __audit_log_kern_module"); > context->type = AUDIT_KERN_MODULE; > } > > -- > 1.8.3.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
[GIT PULL] Audit fixes for v4.18 (#1)
Hi Linus, A single small audit fix to guard against memory allocation failures when logging information about a kernel module load. It's small, easy to understand, and self-contained; while nothing is zero risk, this should be pretty low. Please merge for v4.18, thanks. -Paul -- The following changes since commit 5b71388663c0920848c0ee7de946970a2692b76d: audit: Fix wrong task in comparison of session ID (2018-05-21 14:27:43 -0400) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git tags/audit-pr-20180731 for you to fetch changes up to b305f7ed0f4f494ad6f3ef5667501535d5a8fa31: audit: fix potential null dereference 'context->module.name' (2018-07-30 18:09:37 -0400) audit/stable-4.18 PR 20180731 Yi Wang (1): audit: fix potential null dereference 'context->module.name' kernel/auditsc.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: What time do the auditd timestamps represent?
On Wed, Aug 1, 2018 at 1:03 PM James Davis wrote: > Hi, > > Here is my general question. I have not found an answer in the auditd docs. > > auditd records timestamps. What time do these timestamps represent? > - After syscall is issued but before any action begins in kernel-space? > - During syscall? > - After syscall completes but before it returns to userspace? > - "Asynchronously"? > - Other? > > I was expecting the answer to be "Sometime during the kernel-space execution > of the syscall". > I was surprised this morning to find that this does not appear to be the case. > > Here is an example: > > 1. I configure auditd to watch the exit_group syscall. > > # auditctl -l > -a always,exit -F arch=b32 -S exit,fork,execve,setpgid,clone,exit_group -F > key=foo > -a always,exit -F arch=b64 -S clone,fork,execve,exit,setpgid,exit_group -F > key=foo > > 2. I strace a dd process, watching its syscalls to see when it issues > exit_group. > > # strace -tt dd if=/dev/zero of=/fs1/timestampTest bs=1K count=1 2>&1 | egrep > 'exit' > 09:28:42.829212 exit_group(0) = ? > 09:28:42.829278 +++ exited with 0 +++ > > strace records that the dd process calls exit_group at 42.829. > > 3. auditd records a slightly different time -- one millisecond off. > > I found the pid of the strace process and found where it clones the dd > process. > The dd process had pid 16642. > Let's see what auditd has to say about pid 16642. > > [root@fin13p 09:49:08 ~] # ausearch --pid 16642 --interpret --start '09:00:00' > > type=PATH msg=audit(08/01/2018 09:28:42.823:84286412) : item=1 > name=/lib64/ld-linux-x86-64.so.2 inode=403084422 dev=fd:00 mode=file,755 > ouid=root ogid=root rdev=00:00 objtype=NORMAL > type=PATH msg=audit(08/01/2018 09:28:42.823:84286412) : item=0 name=/bin/dd > inode=269218970 dev=fd:00 mode=file,755 ouid=root ogid=root rdev=00:00 > objtype=NORMAL > type=CWD msg=audit(08/01/2018 09:28:42.823:84286412) : > cwd=/ghome/jamiedavis/src > type=EXECVE msg=audit(08/01/2018 09:28:42.823:84286412) : argc=5 a0=dd > a1=if=/dev/zero a2=of=/fs1/timestampTest a3=bs=1K a4=count=1 > type=SYSCALL msg=audit(08/01/2018 09:28:42.823:84286412) : arch=x86_64 > syscall=execve success=yes exit=0 a0=0x7ffe5a2db320 a1=0x7ffe5a2dc6a8 > a2=0x7ffe5a2dc6d8 a3=0x7ffe5a2dae20 items=2 ppid=16639 pid=16642 auid=root > uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root > fsgid=root tty=pts6 ses=6437 comm=dd exe=/usr/bin/dd key=prov-auditd > > type=SYSCALL msg=audit(08/01/2018 09:28:42.828:84286414) : arch=x86_64 > syscall=exit_group a0=EXIT_SUCCESS a1=0x0 a2=0x0 a3=0xff70 > items=0 ppid=16639 pid=16642 auid=root uid=root gid=root euid=root suid=root > fsuid=root egid=root sgid=root fsgid=root tty=pts6 ses=6437 comm=dd > exe=/usr/bin/dd key=prov-auditd > > -- > > Here is my specific question: Why does strace say that exit_group was called > at 42.829 while auditd says that the time was 42.828? The kernel captures the time for the timestamp early in the syscall's invocation, before the syscall actually executes anything. I just looked quickly at the strace code and it appears it captures the the timestamp it uses for printing whenever strace.c:printleader() is called. Considering these two timestamp captures are happening at different times it doesn't surprise me that the times are off slightly. I'm also looking quickly at the kernel code, and it appears that there is a slightly different approach towards calculating the time used by the audit subsystem (what you see in the audit record) and the clock_gettime() syscall (what is used by systrace). I'm not a time expert, but it looks like the audit subsystem uses a more coarse approach than the clock_gettime() syscall (for performance reasons) and I suspect that might be another source of the time difference, although I think the first issue I mentioned is likely to be the source of more deviation than algorithmic differences. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
[RFC PATCH] audit: minimize our use of audit_log_format()
WARNING: completely untested patch! There are several cases where we are using audit_log_format() when we could be using audit_log_string(), which should be quicker. There are also some cases where we are making multiple audit_log_format() calls in a row, for no apparent reason. This patch fixes the problems above in the core audit code, the other kernel subsystems are left for another time. Signed-off-by: Paul Moore --- kernel/audit.c | 37 ++--- kernel/audit_fsnotify.c |2 +- kernel/audit_tree.c |3 +-- kernel/audit_watch.c|2 +- kernel/auditsc.c| 19 +-- 5 files changed, 30 insertions(+), 33 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 160144f7e5f9..a0f57f4f9944 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1347,7 +1347,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) else { int size; - audit_log_format(ab, " data="); + audit_log_string(ab, " data="); size = nlmsg_len(nlh); if (size > 0 && ((unsigned char *)data)[size - 1] == '\0') @@ -1375,7 +1375,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) case AUDIT_TRIM: audit_trim_trees(); audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE); - audit_log_format(ab, " op=trim res=1"); + audit_log_string(ab, " op=trim res=1"); audit_log_end(ab); break; case AUDIT_MAKE_EQUIV: { @@ -1406,9 +1406,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE); - audit_log_format(ab, " op=make_equiv old="); + audit_log_string(ab, " op=make_equiv old="); audit_log_untrustedstring(ab, old); - audit_log_format(ab, " new="); + audit_log_string(ab, " new="); audit_log_untrustedstring(ab, new); audit_log_format(ab, " res=%d", !err); audit_log_end(ab); @@ -2021,7 +2021,7 @@ void audit_log_d_path(struct audit_buffer *ab, const char *prefix, char *p, *pathname; if (prefix) - audit_log_format(ab, "%s", prefix); + audit_log_string(ab, prefix); /* We will allow 11 spaces for ' (deleted)' to be appended */ pathname = kmalloc(PATH_MAX+11, ab->gfp_mask); @@ -2048,11 +2048,11 @@ void audit_log_session_info(struct audit_buffer *ab) void audit_log_key(struct audit_buffer *ab, char *key) { - audit_log_format(ab, " key="); + audit_log_string(ab, " key="); if (key) audit_log_untrustedstring(ab, key); else - audit_log_format(ab, "(null)"); + audit_log_string(ab, "(null)"); } void audit_log_cap(struct audit_buffer *ab, char *prefix, kernel_cap_t *cap) @@ -2134,7 +2134,7 @@ void audit_log_name(struct audit_context *context, struct audit_names *n, switch (n->name_len) { case AUDIT_NAME_FULL: /* log the full path */ - audit_log_format(ab, " name="); + audit_log_string(ab, " name="); audit_log_untrustedstring(ab, n->name->name); break; case 0: @@ -2144,12 +2144,12 @@ void audit_log_name(struct audit_context *context, struct audit_names *n, break; default: /* log the name's directory component */ - audit_log_format(ab, " name="); + audit_log_string(ab, " name="); audit_log_n_untrustedstring(ab, n->name->name, n->name_len); } } else - audit_log_format(ab, " name=(null)"); + audit_log_string(ab, " name=(null)"); if (n->ino != AUDIT_INO_UNSET) audit_log_format(ab, " inode=%lu" @@ -2178,22 +2178,21 @@ void audit_log_name(struct audit_context *context, struct audit_names *n, } /* log the audit_names record type */ - audit_log_format(ab, " nametype="); switch(n->type) { case AUDIT_TYPE_NORMAL: - audit_log_format(ab, "NORMAL"); + audit_log_string(ab, "nametype=NORMAL&q
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Thu, Aug 2, 2018 at 7:45 AM Ondrej Mosnacek wrote: > > When a relative path has just a single component and we want to emit a > nametype=PARENT record, the current implementation just reports the full > CWD path (which is alrady available in the audit context). > > This is wrong for three reasons: > 1. Wasting log space for redundant data (CWD path is already in the CWD >record). > 2. Inconsistency with other PATH records (if a relative PARENT directory >path contains at least one component, only the verbatim relative path >is logged). > 3. In some syscalls (e.g. openat(2)) the relative path may not even be >relative to the CWD, but to another directory specified as a file >descriptor. In that case the logged path is simply plain wrong. > > This patch modifies this behavior to simply report "." in the > aforementioned case, which is equivalent to an "empty" directory path > and can be concatenated with the actual base directory path (CWD or > dirfd from openat(2)-like syscall) once support for its logging is added > later. In the meantime, defaulting to CWD as base directory on relative > paths (as already done by the userspace tools) will be enough to achieve > results equivalent to the current behavior. > > See: https://github.com/linux-audit/audit-kernel/issues/95 > > Fixes: 9c937dcc7102 ("[PATCH] log more info for directory entry change > events") > Signed-off-by: Ondrej Mosnacek > --- > kernel/audit.c | 9 - > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index 2a8058764aa6..4f18bd48eb4b 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -2127,28 +2127,27 @@ void audit_log_name(struct audit_context *context, > struct audit_names *n, > > audit_log_format(ab, "item=%d", record_num); > > + audit_log_format(ab, " name="); > if (path) > - audit_log_d_path(ab, " name=", path); > + audit_log_d_path(ab, NULL, path); > else if (n->name) { > switch (n->name_len) { > case AUDIT_NAME_FULL: > /* log the full path */ > - audit_log_format(ab, " name="); > audit_log_untrustedstring(ab, n->name->name); > break; > case 0: > /* name was specified as a relative path and the > * directory component is the cwd */ > - audit_log_d_path(ab, " name=", &context->pwd); > + audit_log_untrustedstring(ab, "."); This isn't a comprehensive review, I only gave this a quick look, but considering you are only logging "." above we can safely use audit_log_string() and safe a few cycles. Honestly, looking at the rest of this function, why are we using audit_log_format() in the case where we are simply writing a string literal? While I haven't tested it, simple code inspection would seem to indicate that audit_log_string() should be much quicker than audit_log_format()? I realize this isn't strictly a problem from this patch, but we probably shouldn't be propagating this mistake any further. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak9 0/3] audit: Record the path of FDs passed to *at(2) syscalls
On Thu, Jul 26, 2018 at 4:12 AM Ondrej Mosnacek wrote: > On Wed, Jul 25, 2018 at 3:11 PM Steve Grubb wrote: > > On Wednesday, July 25, 2018 9:02:50 AM EDT Ondrej Mosnacek wrote: > > > On Wed, Jul 25, 2018 at 2:48 PM Steve Grubb wrote: > > > > On Wednesday, July 25, 2018 3:44:07 AM EDT Ondrej Mosnacek wrote: > > > > > On Wed, Jul 25, 2018 at 3:11 AM Steve Grubb wrote: > > > > > > On Tuesday, July 24, 2018 6:15:54 PM EDT Paul Moore wrote: > > > > > > > On Tue, Jul 24, 2018 at 10:12 AM Ondrej Mosnacek > > > > > > > > > > > > > > > > > > > > > > Beyond that, there is really no information in the records that > > > > > > > > would > > > > > > > > allow reconstructing which PARENT path belongs to which > > > > > > > > CREATE/DELETE > > > > > > > > path... (Intuitively you can guess that src will come before > > > > > > > > dst, > > > > > > > > but > > > > > > > > that is not very reliable.) I think a "parent inode" field in > > > > > > > > the > > > > > > > > PATH > > > > > > > > records could fix this, but maybe there is a better solution... > > > > > > > > > > > > > > I have my suspicions, but I would be curious to hear from Steve > > > > > > > how > > > > > > > the reconstruction is typically handled. > > > > > > > > > > > > For any *at function when the dirfd is not AT_FDCWD, it goes badly. > > > > > > If > > > > > > its a old style syscall without the dirfd, then if the first > > > > > > character > > > > > > is '/' use that. Otherwise concatonate cwd and path and pass it to > > > > > > realpath to sort out. > > > > > > > > > > In that case it seems the best fix for openat() et al. would be to > > > > > somehow always force outputting the full path when dirfd != AT_FDCWD. > > > > > Hopefully that won't require too much hacking around... > > > > > > > > What is asked for is the full path that dirfd was opened with. I can > > > > take > > > > care of everything else. > > > > > > But where/how should that path be logged? In case of renameat(), for > > > example, we have 6 (!) path components: > > > // and // > > > > > > (I am assuming the child paths always represent just the last path > > > component based on the observed inodes of the parent/child records.) > > > > > > Current record format can distinguish between PARENT and child > > > (DELETE/CREATE), but there is no nametype for the dirfd path. That's > > > why I am leaning towards just logging the full "<*_dir>/<*_parent>" > > > path in the PARENT record. Or do you prefer that we add a new nametype > > > for the dirfd path? > > > > You could make a new nametype so that we can make sense of it. But do you > > have all of the required information for a PATH record? I thought that you > > were making a new record type since you have abbreviated information. > > I think it should be possible to collect that information by putting > hooks in the right places of the filesystem code (and fixing the > current ones). > > To be honest, the reason why I had jumped right to making a new record > type was Paul's wording in the issue description ("We may need a new > auxiliary record type since this is neither the cwd or path." ... For future reference, when I say "may" I do really mean it; it *may* be a good idea, but it *may* also be a garbage idea ;) -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak9 0/3] audit: Record the path of FDs passed to *at(2) syscalls
On Thu, Jul 26, 2018 at 5:13 AM Ondrej Mosnacek wrote: > On Thu, Jul 26, 2018 at 10:12 AM Ondrej Mosnacek wrote: > > I think it should be possible to collect that information by putting > > hooks in the right places of the filesystem code (and fixing the > > current ones). > > Hm, after closer look, it seems this won't be doable (at least not > easily). The PATH record always logs the original path string from > userspace (and I think we need to preserve this behavior in case > someone relies on it). In case of PARENT records, it truncates away > the last component (because it wants to log inode information also for > the parent directory). If this truncated string ends up empty (i.e. > the original string had just one component), it just smashes in the > absolute path of the CWD (which is known), because it pretends no > *at() syscalls exist and all relative paths are relative to current > CWD. > > So to fix this, we need to do one of the following: > 1. Add a new field to the PATH records that would specify the path of > the directory that the value of name=... is relative to. If this is > CWD, we can just use some special value > ("(null)"/"(none)"/"(cwd)"/...) or omit the field completely. I prefer > this approach, because it will best solve the case of renameat(), > where different PATH records can have different base directories. It is worth calling extra attention to the fact that we would now be effectively recording two paths in a single record. I'm not sure how much we care about that, but it does increase the chances we blow past the end of the netlink buffer; although it is worth noting that a single PATH_MAX entry would do that today. Also, would this new field remain empty for non-*at syscalls? > 2. If adding fields is considered A Bad Thing, we could alternatively > provide this information in separate records (either PATH with special > nametype or a new record). However in such case we need to somehow > specify to which PATH records each base directory corresponds. For > PATH records this could be guessed from their order, but this is a > fragile thing (changes in filesystem code could change the order). While this may be a bit more difficult it seems like this is more in keeping with the current methodology, and would keep the overall audit logs smaller. In the case of the *at syscalls I presume you would create PARENT records for the base directory, omitting it if AT_FDCWD was used? I imagine this would also help the traditional rename() syscall? -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Thu, Aug 2, 2018 at 7:45 AM Ondrej Mosnacek wrote: > When a relative path has just a single component and we want to emit a > nametype=PARENT record, the current implementation just reports the full > CWD path (which is alrady available in the audit context). > > This is wrong for three reasons: > 1. Wasting log space for redundant data (CWD path is already in the CWD >record). > 2. Inconsistency with other PATH records (if a relative PARENT directory >path contains at least one component, only the verbatim relative path >is logged). > 3. In some syscalls (e.g. openat(2)) the relative path may not even be >relative to the CWD, but to another directory specified as a file >descriptor. In that case the logged path is simply plain wrong. > > This patch modifies this behavior to simply report "." in the > aforementioned case, which is equivalent to an "empty" directory path > and can be concatenated with the actual base directory path (CWD or > dirfd from openat(2)-like syscall) once support for its logging is added > later. In the meantime, defaulting to CWD as base directory on relative > paths (as already done by the userspace tools) will be enough to achieve > results equivalent to the current behavior. I have to ask the obvious question, if we already have the necessary parent path in the CWD record, why do we need a nametype=parent PATH record anyway? Can we safely remove it or will that cause problems for Steve's userspace tools? > See: https://github.com/linux-audit/audit-kernel/issues/95 > > Fixes: 9c937dcc7102 ("[PATCH] log more info for directory entry change > events") > Signed-off-by: Ondrej Mosnacek > --- > kernel/audit.c | 9 - > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index 2a8058764aa6..4f18bd48eb4b 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -2127,28 +2127,27 @@ void audit_log_name(struct audit_context *context, > struct audit_names *n, > > audit_log_format(ab, "item=%d", record_num); > > + audit_log_format(ab, " name="); > if (path) > - audit_log_d_path(ab, " name=", path); > + audit_log_d_path(ab, NULL, path); > else if (n->name) { > switch (n->name_len) { > case AUDIT_NAME_FULL: > /* log the full path */ > - audit_log_format(ab, " name="); > audit_log_untrustedstring(ab, n->name->name); > break; > case 0: > /* name was specified as a relative path and the > * directory component is the cwd */ > - audit_log_d_path(ab, " name=", &context->pwd); > + audit_log_untrustedstring(ab, "."); > break; > default: > /* log the name's directory component */ > - audit_log_format(ab, " name="); > audit_log_n_untrustedstring(ab, n->name->name, > n->name_len); > } > } else > - audit_log_format(ab, " name=(null)"); > + audit_log_format(ab, "(null)"); > > if (n->ino != AUDIT_INO_UNSET) > audit_log_format(ab, " inode=%lu" > -- > 2.17.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH] audit: minimize our use of audit_log_format()
On Thu, Aug 2, 2018 at 8:05 PM Richard Guy Briggs wrote: > On 2018-08-02 18:24, Paul Moore wrote: > > WARNING: completely untested patch! > > > > There are several cases where we are using audit_log_format() when we > > could be using audit_log_string(), which should be quicker. There are > > also some cases where we are making multiple audit_log_format() calls > > in a row, for no apparent reason. > > > > This patch fixes the problems above in the core audit code, the other > > kernel subsystems are left for another time. > > > > Signed-off-by: Paul Moore > > This all looks reasonable to me. Thanks for the review. I still need to actually build a kernel with it and make sure I didn't do something stupid, but like the current (ref: task_struct) RFC, I'll probably revisit this after the merge window. > For what my opinion's worth... > Reviewed-by: Richard Guy Briggs I'm always happy to get additional eyes on patches. > > --- > > kernel/audit.c | 37 ++--- > > kernel/audit_fsnotify.c |2 +- > > kernel/audit_tree.c |3 +-- > > kernel/audit_watch.c|2 +- > > kernel/auditsc.c| 19 +-- > > 5 files changed, 30 insertions(+), 33 deletions(-) > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > index 160144f7e5f9..a0f57f4f9944 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -1347,7 +1347,7 @@ static int audit_receive_msg(struct sk_buff *skb, > > struct nlmsghdr *nlh) > > else { > > int size; > > > > - audit_log_format(ab, " data="); > > + audit_log_string(ab, " data="); > > size = nlmsg_len(nlh); > > if (size > 0 && > > ((unsigned char *)data)[size - 1] == '\0') > > @@ -1375,7 +1375,7 @@ static int audit_receive_msg(struct sk_buff *skb, > > struct nlmsghdr *nlh) > > case AUDIT_TRIM: > > audit_trim_trees(); > > audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE); > > - audit_log_format(ab, " op=trim res=1"); > > + audit_log_string(ab, " op=trim res=1"); > > audit_log_end(ab); > > break; > > case AUDIT_MAKE_EQUIV: { > > @@ -1406,9 +1406,9 @@ static int audit_receive_msg(struct sk_buff *skb, > > struct nlmsghdr *nlh) > > > > audit_log_common_recv_msg(&ab, AUDIT_CONFIG_CHANGE); > > > > - audit_log_format(ab, " op=make_equiv old="); > > + audit_log_string(ab, " op=make_equiv old="); > > audit_log_untrustedstring(ab, old); > > - audit_log_format(ab, " new="); > > + audit_log_string(ab, " new="); > > audit_log_untrustedstring(ab, new); > > audit_log_format(ab, " res=%d", !err); > > audit_log_end(ab); > > @@ -2021,7 +2021,7 @@ void audit_log_d_path(struct audit_buffer *ab, const > > char *prefix, > > char *p, *pathname; > > > > if (prefix) > > - audit_log_format(ab, "%s", prefix); > > + audit_log_string(ab, prefix); > > > > /* We will allow 11 spaces for ' (deleted)' to be appended */ > > pathname = kmalloc(PATH_MAX+11, ab->gfp_mask); > > @@ -2048,11 +2048,11 @@ void audit_log_session_info(struct audit_buffer *ab) > > > > void audit_log_key(struct audit_buffer *ab, char *key) > > { > > - audit_log_format(ab, " key="); > > + audit_log_string(ab, " key="); > > if (key) > > audit_log_untrustedstring(ab, key); > > else > > - audit_log_format(ab, "(null)"); > > + audit_log_string(ab, "(null)"); > > } > > > > void audit_log_cap(struct audit_buffer *ab, char *prefix, kernel_cap_t > > *cap) > > @@ -2134,7 +2134,7 @@ void audit_log_name(struct audit_context *context, > > struct audit_names *n, > > switch (n->name_len) { > > case AUDIT_NAME_FULL: > > /* log the full path */ > > - audit_log_format(ab, " name="); > > + audit_log_string(ab, " name=")
[GIT PULL] Audit patches for v4.19
Hi Linus, Twelve audit patches for v4.19 and they run the full gamut from fixes to features. Notable changes include the ability to use the "exe" audit filter field in a wider variety of filter types, a fix for our comparison of GID/EGID in audit filter rules, better association of related audit records (connecting related audit records together into one audit event), and a fix for a potential use-after-free in audit_add_watch(). All the patches pass the audit-testsuite and merge cleanly on your current master branch. Please pull, thanks. -Paul -- The following changes since commit ce397d215ccd07b8ae3f71db689aedb85d56ab40: Linux 4.18-rc1 (2018-06-17 08:04:49 +0900) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git tags/audit-pr-20180814 for you to fetch changes up to baa2a4fdd525c8c4b0f704d20457195b29437839: audit: fix use-after-free in audit_add_watch (2018-07-18 11:43:36 -0400) audit/stable-4.18 PR 20180814 Arnd Bergmann (1): audit: use ktime_get_coarse_ts64() for time access Ondrej Mosnáček (3): audit: allow other filter list types for AUDIT_EXE audit: Fix extended comparison of GID/EGID cred: conditionally declare groups-related functions Paul Moore (1): audit: use ktime_get_coarse_real_ts64() for timestamps Richard Guy Briggs (6): audit: tie SECCOMP records to syscall audit: tie ANOM_ABEND records to syscall audit: rename FILTER_TYPE to FILTER_EXCLUDE audit: eliminate audit_enabled magic number comparison audit: check audit_enabled in audit_tree_log_remove_rule() audit: simplify audit_enabled check in audit_watch_log_rule_change() Ronny Chevalier (1): audit: fix use-after-free in audit_add_watch drivers/tty/tty_audit.c | 2 +- include/linux/audit.h| 5 - include/linux/cred.h | 15 ++- include/net/xfrm.h | 2 +- include/uapi/linux/audit.h | 3 ++- kernel/audit.c | 7 ++- kernel/audit_tree.c | 2 ++ kernel/audit_watch.c | 41 -- kernel/auditfilter.c | 17 ++--- kernel/auditsc.c | 14 +++--- net/netfilter/xt_AUDIT.c | 2 +- net/netlabel/netlabel_user.c | 2 +- 12 files changed, 67 insertions(+), 45 deletions(-) -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: Kernel 4.9.51 null pointer dereference add_rule with null key
ug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.206822] RSP: >> 0018:c900079a7c38 EFLAGS: 00010246 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.211065] RAX: >> RBX: RCX: >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.216779] RDX: >> 1ffe RSI: 024000c0 RDI: 0014 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.222473] RBP: >> c900079a7c68 R08: 0004 R09: 88062a375414 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.228167] R10: >> 8808004032c0 R11: 88062a375400 R12: >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.233738] R13: >> 8804200a0100 R14: 52f1 R15: 00016d42 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.238999] FS: >> 7f3d40cc4700() GS:880800a0() knlGS: >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.244796] CS: >> 0010 DS: ES: CR0: 80050033 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.249555] CR2: >> 01e0 CR3: 00059e63d000 CR4: 001406f0 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.255401] DR0: >> DR1: DR2: >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.260725] DR3: >> DR6: fffe0ff0 DR7: 0400 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.266038] Stack: >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.267680] >> 079a7c58 52f1 52f1 ffa1 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.273540] >> 88062a376800 8804200a1d00 c900079a7cf8 811107d2 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.279379] >> c904 811e2f85 c900079a7ca8 8145a08c >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.285201] Call >> Trace: >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.287137] >> [] audit_receive_msg+0x992/0xc20 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.291561] >> [] ? __kmalloc_node_track_caller+0x35/0x260 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.296642] >> [] ? release_sock+0x8c/0xa0 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.300718] >> [] audit_receive+0x52/0xa0 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.304713] >> [] netlink_unicast+0x15f/0x230 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.308943] >> [] netlink_sendmsg+0x31c/0x390 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.313236] >> [] sock_sendmsg+0x38/0x50 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.317219] >> [] SYSC_sendto+0xef/0x170 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.321142] >> [] ? __audit_syscall_entry+0xeb/0x100 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.325853] >> [] ? syscall_trace_enter+0x1b7/0x290 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.330464] >> [] ? __audit_syscall_exit+0x1f3/0x280 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.335189] >> [] SyS_sendto+0xe/0x10 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.338952] >> [] do_syscall_64+0x54/0xc0 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.342995] >> [] entry_SYSCALL64_slow_path+0x25/0x25 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.347840] Code: >> 65 8b 05 a2 5b b6 7e 25 00 ff 00 00 83 f8 01 19 f6 81 e6 a0 00 38 00 81 c6 >> 20 00 08 02 e8 af cf ff ff 49 89 c5 31 c0 85 db 75 08 <49> 8b 84 24 e0 01 00 >> 00 48 89 45 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.368625] RIP >> [] netlink_unicast+0x4a/0x230 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.373253] RSP >> >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.376013] CR2: >> 01e0 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.379307] ---[ >> end trace 6a41b2274729ba2a ]--- >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.663029] audit: >> type=1300 audit(1534170895.379:117649235): arch=c03e syscall=59 >> success=yes exit=0 a0=271cce0 a1=2741dd0 a2=26dc3b0 a3=7ffdd7f1ea80 items=2 >> ppid=4556 pid=2407 >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.682741] audit: >> type=1309 audit(1534170895.379:117649235): argc=2 a0="date" a1="+%s" >> Aug 13 14:34:55 packer_default-10-180-21-24 kernel: [4749520.688585] audit: >> type=1307 audit(1534170895.379:117649235): cwd="/" >> Aug 13 14:34:55 packer_default-10-180-21-24 abrt-dump-oops: Reported 1 >> kernel oopses to Abrt > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: Kernel 4.9.51 null pointer dereference add_rule with null key
On Fri, Aug 17, 2018 at 5:24 PM Preston Bennes wrote: > Hey Paul, > > My expectations are aligned to what you've pointed out already (not-vanilla > old kernel, user space client other than auditd). > > Primarily what I'm looking for is a sanity check on the null pointer deref on > current vanilla kernel code because I think the rest of the issues stem from > that. My C isn't sharp enough to try my hand at a reproduction case, though I > would hope someone with the requisite expertise could put one together > relatively quickly. specifically this message: > >> audit: type=1305 audit(1534170894.852:117649234): auid=501 ses=102936 >> op="add_rule" key=(null) list=4 res=0 Unfortunately without digging into Amazon's source for that kernel build it is difficult to come to any strong conclusion. The audit message you are seeing above is an AUDIT_CONFIG_CHANGE (type == 1305), and based on the 'op="add_rule"' field it appears as though the system is adding an exit (list == 4) filter rule to the kernel. The operation failed (res == 0), which shouldn't be surprising given the kernel oops. I'm not very familiar with osquery, does it load it's own audit filter rules? Are you loading some of your own? > On Fri, Aug 17, 2018 at 2:12 PM Paul Moore wrote: >> >> On Fri, Aug 17, 2018 at 4:26 PM Preston Bennes >> wrote: >> > Hey Audit team, >> >> Hey Audit user! >> >> > I've got a kernel null pointer deref oops on Amazon Linux kernel >> > 4.9.51-10.52.amzn1.x86_64. After the oops all new cron processes that >> > spawned were stuck in D wait on some audit related syscall. This exhausted >> > system file handles and the box ended up needing to be rebooted >> > out-of-bounds. The audit handle was held by >> > https://github.com/facebook/osquery/. >> > >> > It looks like there was an issue with audit sending to osquery >> > (netlink_unicast sending to pid 21233, error -111) followed by something >> > (presumably also osquery) attempting to 'op="add_rule" key=(null)'. >> >> The first thing I need to comment on is that since this is a distro >> specific kernel (4.9.51-10.52.amzn1.x86_64) and not an upstream kernel >> there is a limit in the amount of help we can provide. Generally the >> upstream list focuses on the upstream code and we leave support for >> the distro specific builds/packages up to the distro. If you can >> reproduce your problem on an upstream kernel, with a strong preference >> for a *recent* upstream kernel, we can probably be of more help. I >> simply don't know what patches Amazon merges into it's amzn1 kernels. >> >> With that out of the way, I will add that v4.9.x is a rather old >> kernel, and we've made a number of relevant improvements since the >> v4.9 release and it is unlikely all of them have been backported to >> your kernel. If possible, I would suggest upgrading to a newer kernel >> as it may resolve your problem. >> >> There is also a possible issue of osquery as an auditd replacement; >> it's unlikely this is being caused by osquery, but we don't test with >> osquery to that save level as we do with Steve's audit userspace >> (auditd). >> >> > Here's from /var/log/messages. >> > >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.133904] >> >> audit: netlink_unicast sending to audit_pid=21233 returned error: -111 >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137982] >> >> audit_log_lost: 12 callbacks suppressed >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137983] >> >> audit: audit_lost=36081041 audit_rate_limit=8192 audit_backlog_limit=4096 >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137985] >> >> audit: type=1305 audit(1534170894.852:117649229): audit_pid=21233 >> >> old=21233 auid=501 ses=102936 res=0 >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137986] >> >> audit: type=1305 audit(1534170894.852:117649230): audit_enabled=1 old=1 >> >> auid=501 ses=102936 res=1 >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137987] >> >> audit: type=1305 audit(1534170894.852:117649231): >> >> audit_backlog_wait_time=1 old=1 auid=501 ses=102936 res=1 >> >> Aug 13 14:34:54 packer_default-10-180-21-24 kernel: [4749520.137989] >> >> audit: type=1305 audit(1534170894.852:117649232): >> >> audit_bac
Re: [RFC PATCH ghak10 v4 0/2] audit: Log modifying adjtimex(2) calls
On Tue, Aug 21, 2018 at 3:21 AM Miroslav Lichvar wrote: > > On Mon, 20 Aug 2018, Ondrej Mosnacek wrote: > > > @John or other timekeeping/NTP folks: We had a discussion on the audit > > > ML on which of the internal timekeeping/NTP variables we should actually > > > log changes for. We are only interested in variables that can (directly > > > or indirectly) cause noticeable changes to the system clock, but since we > > > have only limited understanding of the NTP code, we would like to ask > > > you for advice on which variables are security relevant. > > I guess that mostly depends on whether you consider setting the clock > to run faster or slower than real time to be an important event for > the audit. > > > > - NTP value adjustments: > > > - time_offset (probably important) > > This can adjust the clock by up to 0.5 seconds per call and also speed > it up or slow down by up to about 0.05% (43 seconds per day). This seems worthwhile. > > > - time_freq (maybe not important?) > > This can speed up or slow down by up to about 0.05%. This too. > > > - time_status (likely important, can cause leap second injection) > > Yes, it can insert/delete leap seconds and it also enables/disables > synchronization of the hardware real-time clock. This one as well. > > > - time_maxerror (maybe not important?) > > > - time_esterror (maybe not important?) > > These two change the error estimates that are reported to applications > using ntp_gettime()/adjtimex(). If an application was periodically > checking that the clock is synchronized with some specified accuracy > and setting the maxerror to a larger value would cause the application > to abort, would it be an important event in the audit? Since these don't really affect the time, just the expected error, I'm not sure this is important. > > > - time_constant (???) > > This controls the speed of the clock adjustments that are made when > time_offset is set. Probably not important for the audit. Agreed. I think we can skip this. > > > - time_adjust (sounds important) > > This is similar to time_freq. It can temporarily speed up or slow down > the clock by up to 0.05%. Like time_freq, we should probably log this too. > > > - tick_usec (???) > > This is a more extreme version of time_freq. It can speed up or slow > down the clock by up to 10%. Let's audit this one too. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [RFC PATCH ghak10 v4 0/2] audit: Log modifying adjtimex(2) calls
On Thu, Aug 23, 2018 at 5:14 AM Ondrej Mosnacek wrote: > On Wed, Aug 22, 2018 at 11:27 PM Paul Moore wrote: > > On Tue, Aug 21, 2018 at 3:21 AM Miroslav Lichvar > > wrote: > > > > On Mon, 20 Aug 2018, Ondrej Mosnacek wrote: > > > > > @John or other timekeeping/NTP folks: We had a discussion on the audit > > > > > ML on which of the internal timekeeping/NTP variables we should > > > > > actually > > > > > log changes for. We are only interested in variables that can > > > > > (directly > > > > > or indirectly) cause noticeable changes to the system clock, but > > > > > since we > > > > > have only limited understanding of the NTP code, we would like to ask > > > > > you for advice on which variables are security relevant. > > > > > > I guess that mostly depends on whether you consider setting the clock > > > to run faster or slower than real time to be an important event for > > > the audit. > > > > > > > > - NTP value adjustments: > > > > > - time_offset (probably important) > > > > > > This can adjust the clock by up to 0.5 seconds per call and also speed > > > it up or slow down by up to about 0.05% (43 seconds per day). > > > > This seems worthwhile. > > > > > > > - time_freq (maybe not important?) > > > > > > This can speed up or slow down by up to about 0.05%. > > > > This too. > > > > > > > - time_status (likely important, can cause leap second injection) > > > > > > Yes, it can insert/delete leap seconds and it also enables/disables > > > synchronization of the hardware real-time clock. > > > > This one as well. > > > > > > > - time_maxerror (maybe not important?) > > > > > - time_esterror (maybe not important?) > > > > > > These two change the error estimates that are reported to applications > > > using ntp_gettime()/adjtimex(). If an application was periodically > > > checking that the clock is synchronized with some specified accuracy > > > and setting the maxerror to a larger value would cause the application > > > to abort, would it be an important event in the audit? > > > > Since these don't really affect the time, just the expected error, I'm > > not sure this is important. > > > > > > > - time_constant (???) > > > > > > This controls the speed of the clock adjustments that are made when > > > time_offset is set. Probably not important for the audit. > > > > Agreed. I think we can skip this. > > > > > > > - time_adjust (sounds important) > > > > > > This is similar to time_freq. It can temporarily speed up or slow down > > > the clock by up to 0.05%. > > > > Like time_freq, we should probably log this too. > > > > > > > - tick_usec (???) > > > > > > This is a more extreme version of time_freq. It can speed up or slow > > > down the clock by up to 10%. > > > > Let's audit this one too. > > I agree with Paul on all counts. I will go ahead and prepare a > patchset that logs everything except maxerror, esterror, and constant. Please make sure you include explanations similar to the above in the patch descriptions; not just the cover letter, the goal is to have this information captured in the git log. > Thank you, Miroslav, for the explanations! Yes, those explanations were helpful. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Fri, Aug 3, 2018 at 3:08 AM Ondrej Mosnacek wrote: > On Fri, Aug 3, 2018 at 12:24 AM Paul Moore wrote: > > On Thu, Aug 2, 2018 at 7:45 AM Ondrej Mosnacek wrote: > > > > > > When a relative path has just a single component and we want to emit a > > > nametype=PARENT record, the current implementation just reports the full > > > CWD path (which is alrady available in the audit context). > > > > > > This is wrong for three reasons: > > > 1. Wasting log space for redundant data (CWD path is already in the CWD > > >record). > > > 2. Inconsistency with other PATH records (if a relative PARENT directory > > >path contains at least one component, only the verbatim relative path > > >is logged). > > > 3. In some syscalls (e.g. openat(2)) the relative path may not even be > > >relative to the CWD, but to another directory specified as a file > > >descriptor. In that case the logged path is simply plain wrong. > > > > > > This patch modifies this behavior to simply report "." in the > > > aforementioned case, which is equivalent to an "empty" directory path > > > and can be concatenated with the actual base directory path (CWD or > > > dirfd from openat(2)-like syscall) once support for its logging is added > > > later. In the meantime, defaulting to CWD as base directory on relative > > > paths (as already done by the userspace tools) will be enough to achieve > > > results equivalent to the current behavior. > > > > > > See: https://github.com/linux-audit/audit-kernel/issues/95 > > > > > > Fixes: 9c937dcc7102 ("[PATCH] log more info for directory entry change > > > events") > > > Signed-off-by: Ondrej Mosnacek > > > --- > > > kernel/audit.c | 9 - > > > 1 file changed, 4 insertions(+), 5 deletions(-) > > > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > > index 2a8058764aa6..4f18bd48eb4b 100644 > > > --- a/kernel/audit.c > > > +++ b/kernel/audit.c > > > @@ -2127,28 +2127,27 @@ void audit_log_name(struct audit_context > > > *context, struct audit_names *n, > > > > > > audit_log_format(ab, "item=%d", record_num); > > > > > > + audit_log_format(ab, " name="); > > > if (path) > > > - audit_log_d_path(ab, " name=", path); > > > + audit_log_d_path(ab, NULL, path); > > > else if (n->name) { > > > switch (n->name_len) { > > > case AUDIT_NAME_FULL: > > > /* log the full path */ > > > - audit_log_format(ab, " name="); > > > audit_log_untrustedstring(ab, n->name->name); > > > break; > > > case 0: > > > /* name was specified as a relative path and the > > > * directory component is the cwd */ > > > - audit_log_d_path(ab, " name=", &context->pwd); > > > + audit_log_untrustedstring(ab, "."); > > > > This isn't a comprehensive review, I only gave this a quick look, but > > considering you are only logging "." above we can safely use > > audit_log_string() and safe a few cycles. > > I used audit_log_untrustedstring() to maintain the current norm that > the name= field would always contain a quoted string (either in > double-quotes or hex-escaped). I don't know if such consistency is > important for parsing in userspace (it should probably be robust > enough to handle any format), but I wanted to be on the safe side. In this particular case there should be no visible difference in the resulting record and audit_log_string() is going to have less overhead in the kernel. > > Honestly, looking at the rest of this function, why are we using > > audit_log_format() in the case where we are simply writing a string > > literal? While I haven't tested it, simple code inspection would seem > > to indicate that audit_log_string() should be much quicker than > > audit_log_format()? I realize this isn't strictly a problem from this > > patch, but we probably shouldn't be propagating this mistake any > > further. > > Good point. If I happen to be sending a v2 of the patch, I will switch > to audit_log_string() where possible. Otherwise, I'll leave it to you > to fix up when applying (it will probably clash with your patch > anyway). Please fit it yourself and resubmit. As a general rule I shouldn't need to fix things like this when merging. While things like this sometimes happen, they are special cases and are usually the result of short deadlines, missing devs, etc. In general I try to limit my modifications to merge fuzz and minor code changes like whitespace fixes, silly checkpatch warnings, etc. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Thu, Aug 2, 2018 at 8:03 PM Paul Moore wrote: > > On Thu, Aug 2, 2018 at 7:45 AM Ondrej Mosnacek wrote: > > When a relative path has just a single component and we want to emit a > > nametype=PARENT record, the current implementation just reports the full > > CWD path (which is alrady available in the audit context). > > > > This is wrong for three reasons: > > 1. Wasting log space for redundant data (CWD path is already in the CWD > >record). > > 2. Inconsistency with other PATH records (if a relative PARENT directory > >path contains at least one component, only the verbatim relative path > >is logged). > > 3. In some syscalls (e.g. openat(2)) the relative path may not even be > >relative to the CWD, but to another directory specified as a file > >descriptor. In that case the logged path is simply plain wrong. > > > > This patch modifies this behavior to simply report "." in the > > aforementioned case, which is equivalent to an "empty" directory path > > and can be concatenated with the actual base directory path (CWD or > > dirfd from openat(2)-like syscall) once support for its logging is added > > later. In the meantime, defaulting to CWD as base directory on relative > > paths (as already done by the userspace tools) will be enough to achieve > > results equivalent to the current behavior. > > I have to ask the obvious question, if we already have the necessary > parent path in the CWD record, why do we need a nametype=parent PATH > record anyway? Can we safely remove it or will that cause problems > for Steve's userspace tools? As a reminder, these questions still need answers. > > See: https://github.com/linux-audit/audit-kernel/issues/95 > > > > Fixes: 9c937dcc7102 ("[PATCH] log more info for directory entry change > > events") > > Signed-off-by: Ondrej Mosnacek > > --- > > kernel/audit.c | 9 - > > 1 file changed, 4 insertions(+), 5 deletions(-) > > > > diff --git a/kernel/audit.c b/kernel/audit.c > > index 2a8058764aa6..4f18bd48eb4b 100644 > > --- a/kernel/audit.c > > +++ b/kernel/audit.c > > @@ -2127,28 +2127,27 @@ void audit_log_name(struct audit_context *context, > > struct audit_names *n, > > > > audit_log_format(ab, "item=%d", record_num); > > > > + audit_log_format(ab, " name="); > > if (path) > > - audit_log_d_path(ab, " name=", path); > > + audit_log_d_path(ab, NULL, path); > > else if (n->name) { > > switch (n->name_len) { > > case AUDIT_NAME_FULL: > > /* log the full path */ > > - audit_log_format(ab, " name="); > > audit_log_untrustedstring(ab, n->name->name); > > break; > > case 0: > > /* name was specified as a relative path and the > > * directory component is the cwd */ > > - audit_log_d_path(ab, " name=", &context->pwd); > > + audit_log_untrustedstring(ab, "."); > > break; > > default: > > /* log the name's directory component */ > > - audit_log_format(ab, " name="); > > audit_log_n_untrustedstring(ab, n->name->name, > > n->name_len); > > } > > } else > > - audit_log_format(ab, " name=(null)"); > > + audit_log_format(ab, "(null)"); > > > > if (n->ino != AUDIT_INO_UNSET) > > audit_log_format(ab, " inode=%lu" > > -- > > 2.17.1 > > > > > -- > paul moore > www.paul-moore.com -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Thu, Sep 13, 2018 at 9:58 AM Ondrej Mosnacek wrote: > Paul, could you please answer this question so I can move forward? :) Yep, sorry for the delay; there have been some serious bugs over in SELinux land (as well as some employer related things that should be public soon) that have taken my focus since I've returned from the Labor Day holidays in the US. I'm going to start working through the audit backlog this afternoon/evening; your patches and Jan's fixes will likely get priority (yours are small and easy-ish to review, Jan's fixes some potentially ugly problems). -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak10 v5 1/2] audit: Add functions to log time adjustments
On Thu, Sep 13, 2018 at 9:59 AM Ondrej Mosnacek wrote: > On Mon, Aug 27, 2018 at 6:38 PM Steve Grubb wrote: > > On Monday, August 27, 2018 5:13:17 AM EDT Ondrej Mosnacek wrote: > > > On Mon, Aug 27, 2018 at 9:50 AM Miroslav Lichvar > > wrote: > > > > On Fri, Aug 24, 2018 at 02:00:00PM +0200, Ondrej Mosnacek wrote: > > > > > This patch adds two auxiliary record types that will be used to > > > > > annotate > > > > > the adjtimex SYSCALL records with the NTP/timekeeping values that have > > > > > been changed. > > > > > > > > It seems the "adjust" function intentionally logs also calls/modes > > > > that don't actually change anything. Can you please explain it a bit > > > > in the message? > > > > > > > > NTP/PTP daemons typically don't read the adjtimex values in a normal > > > > operation and overwrite them on each update, even if they don't > > > > change. If the audit function checked that oldval != newval, the > > > > number of messages would be reduced and it might be easier to follow. > > > > > > We actually want to log any attempt to change a value, as even an > > > intention to set/change something could be a hint that the process is > > > trying to do something bad (see discussion at [1]). > > > > One of the problems is that these applications can flood the logs very > > quickly. An attempt to change is not needed unless it fails for permissions > > reasons. So, limiting to actual changes is probably a good thing. > > Well, Richard seemed to "violently" agree with the opposite, so now I > don't know which way to go... Paul, you are the official tie-breaker > here, which do you prefer? The general idea is that we only care about *changes* to the system state, so if a process is setting a variable to with a value that matches it's current value I see no reason why we need to generate a change record. Another thing to keep in mind, we can always change the behavior to be more verbose (*always* generate a record, regardless of value) without likely causing a regression, but limiting records is more difficult and more likely to cause regressions. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak10 v5 1/2] audit: Add functions to log time adjustments
On Fri, Aug 24, 2018 at 8:00 AM Ondrej Mosnacek wrote: > This patch adds two auxiliary record types that will be used to annotate > the adjtimex SYSCALL records with the NTP/timekeeping values that have > been changed. > > Next, it adds two functions to the audit interface: > - audit_tk_injoffset(), which will be called whenever a timekeeping >offset is injected by a syscall from userspace, > - audit_ntp_adjust(), which will be called whenever an NTP internal >variable is changed by a syscall from userspace. > > Quick reference for the fields of the new records: > AUDIT_TIME_INJOFFSET > sec - the 'seconds' part of the offset > nsec - the 'nanoseconds' part of the offset > AUDIT_TIME_ADJNTPVAL > op - which value was adjusted: > offset - corresponding to the time_offset variable > freq - corresponding to the time_freq variable > status - corresponding to the time_status variable > adjust - corresponding to the time_adjust variable > tick - corresponding to the tick_usec variable > tai- corresponding to the timekeeping's TAI offset I understand that reusing "op" is tempting, but the above aren't really operations, they are state variables which are being changed. Using the CONFIG_CHANGE record as a basis, I wonder if we are better off with something like the following: type=TIME_CHANGE = old= ... you might need to preface the variable names with something like "ntp_" or "offset_". You'll notice I'm also suggesting we use a single record type here; is there any reason why two records types are required? > old - the old value > new - the new value > > Signed-off-by: Ondrej Mosnacek > --- > include/linux/audit.h | 21 + > include/uapi/linux/audit.h | 2 ++ > kernel/auditsc.c | 15 +++ > 3 files changed, 38 insertions(+) A reminder that we need tests for these new records and a RFE page on the wiki: * https://github.com/linux-audit/audit-testsuite * https://github.com/linux-audit/audit-kernel/wiki -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak10 v5 1/2] audit: Add functions to log time adjustments
On Fri, Sep 14, 2018 at 11:21 AM Richard Guy Briggs wrote: > On 2018-09-13 23:18, Paul Moore wrote: > > On Fri, Aug 24, 2018 at 8:00 AM Ondrej Mosnacek wrote: > > > This patch adds two auxiliary record types that will be used to annotate > > > the adjtimex SYSCALL records with the NTP/timekeeping values that have > > > been changed. > > > > > > Next, it adds two functions to the audit interface: > > > - audit_tk_injoffset(), which will be called whenever a timekeeping > > >offset is injected by a syscall from userspace, > > > - audit_ntp_adjust(), which will be called whenever an NTP internal > > >variable is changed by a syscall from userspace. > > > > > > Quick reference for the fields of the new records: > > > AUDIT_TIME_INJOFFSET > > > sec - the 'seconds' part of the offset > > > nsec - the 'nanoseconds' part of the offset > > > AUDIT_TIME_ADJNTPVAL > > > op - which value was adjusted: > > > offset - corresponding to the time_offset variable > > > freq - corresponding to the time_freq variable > > > status - corresponding to the time_status variable > > > adjust - corresponding to the time_adjust variable > > > tick - corresponding to the tick_usec variable > > > tai- corresponding to the timekeeping's TAI offset > > > > I understand that reusing "op" is tempting, but the above aren't > > really operations, they are state variables which are being changed. > > Using the CONFIG_CHANGE record as a basis, I wonder if we are better > > off with something like the following: > > > > type=TIME_CHANGE = old= > > > > ... you might need to preface the variable names with something like > > "ntp_" or "offset_". You'll notice I'm also suggesting we use a > > single record type here; is there any reason why two records types are > > required? > > Why not do something like: > > type=TIME_CHANGE var= new= old= > > So that we don't pollute the field namespace *and* create 8 variants on > the same record format? This shouldn't be much of a concern with binary > record formats, but we're stuck with the current parsing scheme for now. Since there is already some precedence with the "=" format, and the field namespace is already a bit of a mess IMHO, I'd like us to stick with the style used by CONFIG_CHANGE. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak10 v5 1/2] audit: Add functions to log time adjustments
On Mon, Sep 17, 2018 at 8:38 AM Ondrej Mosnacek wrote: > > On Fri, Sep 14, 2018 at 5:19 AM Paul Moore wrote: > > On Fri, Aug 24, 2018 at 8:00 AM Ondrej Mosnacek wrote: > > > This patch adds two auxiliary record types that will be used to annotate > > > the adjtimex SYSCALL records with the NTP/timekeeping values that have > > > been changed. > > > > > > Next, it adds two functions to the audit interface: > > > - audit_tk_injoffset(), which will be called whenever a timekeeping > > >offset is injected by a syscall from userspace, > > > - audit_ntp_adjust(), which will be called whenever an NTP internal > > >variable is changed by a syscall from userspace. > > > > > > Quick reference for the fields of the new records: > > > AUDIT_TIME_INJOFFSET > > > sec - the 'seconds' part of the offset > > > nsec - the 'nanoseconds' part of the offset > > > AUDIT_TIME_ADJNTPVAL > > > op - which value was adjusted: > > > offset - corresponding to the time_offset variable > > > freq - corresponding to the time_freq variable > > > status - corresponding to the time_status variable > > > adjust - corresponding to the time_adjust variable > > > tick - corresponding to the tick_usec variable > > > tai- corresponding to the timekeeping's TAI offset > > > > I understand that reusing "op" is tempting, but the above aren't > > really operations, they are state variables which are being changed. > > I remember Steve (or was it Richard?) convincing me at one of the > meetings that "op" is the right filed name to use, despite it not > being a name for an operation... But I don't really care, I'm okay > with changing it to e.g. "var" as Richard suggests later in this > thread. As I said before, this seems like an abuse of the "op" field. > > Using the CONFIG_CHANGE record as a basis, I wonder if we are better > > off with something like the following: > > > > type=TIME_CHANGE = old= > > > > ... you might need to preface the variable names with something like > > "ntp_" or "offset_". You'll notice I'm also suggesting we use a > > single record type here; is there any reason why two records types are > > required? > > There are actually two reasons: > 1. The injected offset is a timespec64, so it consists of two integer > values (and it would be weird to produce two records for it, since IMO > it is conceptually still a single variable). > 2. In all other cases the variable is reset to the (possibly > transformed) input value, while in this case the input value is added > directly to the system time. This can be viewed as a kind of variable > too, but it would be weird to report old and new value for it, since > its value flows with time. > > Plus, when I look at: > type=TIME_INJOFFSET [...]: sec=-16 nsec=124887145 > > I can immediately see that the time was shifted back by 16-something > seconds, while when I look at something like: > > type=TIME_CHANGE [...]: var=time_sec new=1537185685 old=1537185701 > type=TIME_CHANGE [...]: var=time_nsec new=664373417 old=789260562 > > I can just see some big numbers that I need to do math with before I > get an idea of what is the magnitude (or sign) of the change. Okay, with that in mind, perhaps when recording the offset values we omit the "old" values (arguably that doesn't make much sense here) and keep the sec/nsec split: type=TIME_CHANGE [...]: offset_sec= offset_nsec= ... and for all others we stick with: type=TIME_CHANGE [...]: ntp_= old= ... and if that results in multiple TIME_CHANGE records for a given event, that's fine with me. > > A reminder that we need tests for these new records and a RFE page on the > > wiki: > > > > * https://github.com/linux-audit/audit-testsuite > > I was going to start working on this once the format issues are > settled. (Although I probably should have kept the RFC in the subject > until then...) > > > * https://github.com/linux-audit/audit-kernel/wiki > > I admit I forgot about this duty, but again I would like to wait for > the discussions to settle before writing that up. That is fine, do it in whatever order works best for you, just understand that I'm probably not going to merge patches like this until I see both documentation and tests. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Thu, Sep 13, 2018 at 10:13 AM Paul Moore wrote: > On Thu, Sep 13, 2018 at 9:58 AM Ondrej Mosnacek wrote: > > Paul, could you please answer this question so I can move forward? :) > > Yep, sorry for the delay ... I just went back over the original problem, your proposed fix, and all of the discussion in this thread. Sadly, I don't think the patch you have proposed is the right fix. As Steve has pointed out, the CWD path is the working directory from which the current process was executed. I believe we should log the full path, or as complete a path as possible, in the nametype=CWD PATH records. While the nametype=PARENT PATH records have a connection with some of the other PATH records (e.g. DELETE and CREATE), the nametype=PARENT PATH records are independent of the current working directory, although they sometimes may be the same; in the cases where they are the same, this is purely a coincidence and is due to operation being performed, not something that should be seen as a flaw. >From what I can tell, there are issues involving the nametype=PARENT PATH records, especially when it comes to the *at() syscalls, but no issue where the nametype=CWD PATH records have been wrong, is that correct? -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak95] audit: Do not log full CWD path on empty relative paths
On Wed, Sep 19, 2018 at 7:01 AM Ondrej Mosnacek wrote: > On Wed, Sep 19, 2018 at 3:35 AM Paul Moore wrote: > > On Thu, Sep 13, 2018 at 10:13 AM Paul Moore wrote: > > > On Thu, Sep 13, 2018 at 9:58 AM Ondrej Mosnacek > > > wrote: > > > > Paul, could you please answer this question so I can move forward? :) > > > > > > Yep, sorry for the delay ... > > > > I just went back over the original problem, your proposed fix, and all > > of the discussion in this thread. > > > > Sadly, I don't think the patch you have proposed is the right fix. > > > > As Steve has pointed out, the CWD path is the working directory from > > which the current process was executed. I believe we should log the > > full path, or as complete a path as possible, in the nametype=CWD PATH > > records. While the nametype=PARENT PATH records have a connection > > with some of the other PATH records (e.g. DELETE and CREATE), the > > nametype=PARENT PATH records are independent of the current working > > directory, although they sometimes may be the same; in the cases where > > they are the same, this is purely a coincidence and is due to > > operation being performed, not something that should be seen as a > > flaw. > > > > From what I can tell, there are issues involving the nametype=PARENT > > PATH records, especially when it comes to the *at() syscalls, but no > > issue where the nametype=CWD PATH records have been wrong, is that > > correct? > > Sorry, but I think you are completely misunderstanding the problem... > I tried to explain it several times in different ways, but apparently > I'm still not doing it right... > > First of all, there is no "nametype=CWD" PATH record. There is only > the classic CWD record that is associated to every syscall and I don't > touch that one at all. The information in the CWD record is perfectly > fine. Yes, that was a casualty of me looking at too many audit logs too late in the day, I mistakenly typed it as a nametype PATH record when CWD is its own record type. My apologies. > Let me try to demonstrate it with some more verbose examples. (TL;DR: > The info in the CWD record is correct, but it is being abused in > audit_log_name().) > > EXAMPLE #1 (The non-edge case): > 1. A userspace process calls rename("dir1/file1", "dir2/file2") with > CWD set to "/home/user". > 2. The syscall causes four calls to __audit_inode(), which generate > four 'struct audit_names' objects with the following information > (maybe not in this specific order, but that doesn't matter): > .name = "dir1/file1", .type = AUDIT_TYPE_PARENT, .name_len = 5 > .name = "dir1/file1", .type = AUDIT_TYPE_DELETE, .name_len = AUDIT_NAME_FULL > .name = "dir2/file2", .type = AUDIT_TYPE_PARENT, .name_len = 5 > .name = "dir2/file2", .type = AUDIT_TYPE_CREATE, .name_len = AUDIT_NAME_FULL > 3. At the end of the syscall, audit_log_name() is called on each of > these objects and produces the following PATH records (simplifed): > nametype=PARENT name="dir1/" > nametype=DELETE name="dir1/file1" > nametype=PARENT name="dir2/" > nametype=CREATE name="dir2/file2" > > Notice that for the PARENT objects the .name_len is truncated to only > the directory component. > > EXAMPLE #2 (The single-path-component case): > 1. A userspace process calls rename("file1", "file2") with CWD set to > "/home/user". > 2. The 'struct audit_names' objects will now be: > .name = "file1", .type = AUDIT_TYPE_PARENT, .name_len = 0 > .name = "file1", .type = AUDIT_TYPE_DELETE, .name_len = AUDIT_NAME_FULL > .name = "file2", .type = AUDIT_TYPE_PARENT, .name_len = 0 > .name = "file2", .type = AUDIT_TYPE_CREATE, .name_len = AUDIT_NAME_FULL > 3. At the end of the syscall, audit_log_name() is called on each of > these objects and produces the following PATH records (simplifed): > nametype=PARENT name="/home/user" > nametype=DELETE name="file1" > nametype=PARENT name="/home/user" > nametype=CREATE name="file2" > > Notice that in this case, the "clever" logic in audit_log_name() > wanted to avoid logging an empty path (name="") in the PARENT records, > so it instead put the CWD path in there ("/home/user"). In this case > this is perfectly valid (although could be a bit surprising that there > is suddenly a full path instead of a relative one), since the full > path of "file1" is actually "/home/user/file1". A quick co
FYI: email change
A quick note that my @redhat.com email address is going to stop working in the next day or two, so if you are using my Red Hat email address to reach me please start using my @paul-moore.com address. Everything else, e.g. my community involvement, will remain unaffected. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] audit: fixed a braces coding style issue
On Fri, Sep 21, 2018 at 7:38 PM Przemysław Głębocki wrote: > Fixed a coding style issue. > > Signed-off-by: Przemysław Głębocki > --- > kernel/audit_tree.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) I generally dislike accepting coding style patches from people who haven't made any significant code contributions. Would you be interested in participating in upstream audit development? We have a number of open issues that you could take a look at if you are interested ... * https://github.com/linux-audit/audit-kernel/issues > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index ea43181cde4a..3d0e45fa765d 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -112,10 +112,9 @@ static void free_chunk(struct audit_chunk *chunk) > { > int i; > > - for (i = 0; i < chunk->count; i++) { > + for (i = 0; i < chunk->count; i++) > if (chunk->owners[i].owner) > put_tree(chunk->owners[i].owner); > - } > kfree(chunk); > } > > -- > 2.17.1 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak10 v5 1/2] audit: Add functions to log time adjustments
On Fri, Sep 21, 2018 at 7:21 AM Ondrej Mosnacek wrote: > On Mon, Sep 17, 2018 at 4:51 PM Paul Moore wrote: > > On Mon, Sep 17, 2018 at 8:38 AM Ondrej Mosnacek wrote: > > > On Fri, Sep 14, 2018 at 5:19 AM Paul Moore wrote: > > > > On Fri, Aug 24, 2018 at 8:00 AM Ondrej Mosnacek > > > > wrote: ... > > Okay, with that in mind, perhaps when recording the offset values we > > omit the "old" values (arguably that doesn't make much sense here) and > > keep the sec/nsec split: > > > > type=TIME_CHANGE [...]: offset_sec= offset_nsec= > > > > ... and for all others we stick with: > > > > type=TIME_CHANGE [...]: ntp_= old= > > Alright, that format would work. However, I would still like to have a > separate type for the offset injection, since it has different field > structure and semantics (difference vs. new+old). I don't see any > reason to sacrifice the distinction for just one record type slot > (AFAIK we technically still have about 2 billion left...). > > (Maybe you just duplicated the record type by mistake, in that case > please disregard the last sentence.) A reasonable guess on the typo, but no that was intentional :) As described above, there is no set field ordering for the TIME_CHANGE record, just like there is not set field ordering for the CONFIG_CHANGE record. Why? We only include the state variables that are being changed instead of including all of the available state variables. Yes, historically there are the "new" and "old" fields, but I don't see that as a strong convention; the special "old=" field name helps prevent confusion. Yes, we aren't really at risk of running out of record types, but why do we *need* two types here? I don't believe the ordering/structure argument is significant in this particular case, and I would much rather see all the time related state changes included in one TIME_CHANGE record. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 09/11] audit: Allocate fsnotify mark independently of chunk
On Tue, Sep 4, 2018 at 12:06 PM Jan Kara wrote: > Allocate fsnotify mark independently instead of embedding it inside > chunk. This will allow us to just replace chunk attached to mark when > growing / shrinking chunk instead of replacing mark attached to inode > which is a more complex operation. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 64 > + > 1 file changed, 50 insertions(+), 14 deletions(-) > > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 0cd08b3581f1..481fdc190c2f 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -142,10 +148,33 @@ static void audit_mark_put_chunk(struct audit_chunk > *chunk) > call_rcu(&chunk->head, __put_chunk); > } > > +static inline struct audit_tree_mark *audit_mark(struct fsnotify_mark *entry) > +{ > + return container_of(entry, struct audit_tree_mark, mark); > +} > + > +static struct audit_chunk *mark_chunk(struct fsnotify_mark *mark) > +{ > + return audit_mark(mark)->chunk; > +} > + ... > @@ -426,7 +460,7 @@ static int tag_chunk(struct inode *inode, struct > audit_tree *tree) > if (!old_entry) > return create_chunk(inode, tree); > > - old = container_of(old_entry, struct audit_chunk, mark); > + old = mark_chunk(old_entry)->chunk; I'm pretty sure you mean the following instead? old = mark_chunk(old_entry); > /* are we already there? */ > spin_lock(&hash_lock); -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 09/11] audit: Allocate fsnotify mark independently of chunk
On Mon, Sep 17, 2018 at 12:46 PM Jan Kara wrote: > On Fri 14-09-18 10:09:09, Richard Guy Briggs wrote: > > On 2018-09-04 18:06, Jan Kara wrote: > > > Allocate fsnotify mark independently instead of embedding it inside > > > chunk. This will allow us to just replace chunk attached to mark when > > > growing / shrinking chunk instead of replacing mark attached to inode > > > which is a more complex operation. > > > > > > Signed-off-by: Jan Kara > > > --- > ... > > > +static struct audit_chunk *mark_chunk(struct fsnotify_mark *mark) > > > +{ > > > + return audit_mark(mark)->chunk; > > > +} > > > + > > > static void audit_tree_destroy_watch(struct fsnotify_mark *entry) > > > { > > > - struct audit_chunk *chunk = container_of(entry, struct audit_chunk, > > > mark); > > > + struct audit_chunk *chunk = mark_chunk(entry); > > > audit_mark_put_chunk(chunk); > > > + kmem_cache_free(audit_tree_mark_cachep, audit_mark(entry)); > > > +} > > > + > > > +static struct fsnotify_mark *alloc_mark(void) > > > +{ > > > + struct audit_tree_mark *mark; > > > > Would it make sense to call this local variable "amark" to indicate it > > isn't a struct fsnotify_mark, but in fact an audit helper variant? > > > > > + > > > + mark = kmem_cache_zalloc(audit_tree_mark_cachep, GFP_KERNEL); > > > + if (!mark) > > > + return NULL; > > > + fsnotify_init_mark(&mark->mark, audit_tree_group); > > > + mark->mark.mask = FS_IN_IGNORED; > > > + return &mark->mark; > > > > There are no other places where it is used in this patch to name a > > variable, but this one I found a bit confusing to follow the > > "mark->mark" > > Yeah, makes sense. I can do the change. Unless you have to respin this patchset for some other reason I wouldn't worry about it. I've been working my way through the patchset this week (currently on 09/11) and I expect to finish it up today. Assuming everything looks good, I'm going to merge this into a working branch, include it in my weekly -rc test builds, and beat on it for a couple of weeks. If all is good I'll merge it into audit/next after the upcoming merge window. Thanks for your patience. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 0/11 v3] audit: Fix various races when tagging and untagging mounts
On Mon, Sep 17, 2018 at 12:57 PM Jan Kara wrote: > On Fri 14-09-18 15:13:28, Richard Guy Briggs wrote: > > On 2018-09-04 18:06, Jan Kara wrote: > > > Hello, > > > > Jan, > > > > > this is the third revision of the series that addresses problems I have > > > identified when trying to understand how exactly is kernel/audit_tree.c > > > using > > > generic fsnotify framework. I hope I have understood all the interactions > > > right > > > but careful review is certainly welcome. > > > > I've tried to review it as carefully as I am able. As best I understand > > it, this all looks reasonable and an improvement over the previous > > state. Thanks for the hard work. > > > > FWIW, > > Reviewed-by: Richard Guy Briggs > > Thanks for review! Paul should I send you updated patch 9 with that one > variable renamed or will you do that small change while merging the series? Hi Jan, Thanks again for these patches and your patience; some travel, holidays, and a job change delayed my review. However, everything looks okay to me (minus the one problem I noted in patch 09/11). I've added the patches to audit/working-fsnotify_fixes and I'm going to start stressing them as soon as I get a test kernel built with the idea of merging them into audit/next as soon as the upcoming merge window closes. As far as the variable rename is concerned, that's not something I would prefer to change during a merge, but if you or Richard wanted to submit a renaming patch I would be okay with that in this case. If you do submit the rename patch, please base it on top of this patchset (or audit/working-fsnotify_fixes). Thanks! -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH] audit: Rename audit mark variable to amark
On Thu, Oct 4, 2018 at 3:07 AM Jan Kara wrote: > Variable of audit_tree_mark type was called 'mark' which is confusing as > we usually call fsnotify_mark variables this way. Rename it to 'amark' > to make it explicit this a actually a different thing. > > Signed-off-by: Jan Kara > --- > kernel/audit_tree.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > Hello Paul, > > here is the patch to rename mark to amark as Richard suggested. Thanks, merged into audit/working-fsnotify_fixes. > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 9c53f7c37bdf..232b8b18cb5b 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -175,14 +175,14 @@ static void audit_tree_destroy_watch(struct > fsnotify_mark *mark) > > static struct fsnotify_mark *alloc_mark(void) > { > - struct audit_tree_mark *mark; > + struct audit_tree_mark *amark; > > - mark = kmem_cache_zalloc(audit_tree_mark_cachep, GFP_KERNEL); > - if (!mark) > + amark = kmem_cache_zalloc(audit_tree_mark_cachep, GFP_KERNEL); > + if (!amark) > return NULL; > - fsnotify_init_mark(&mark->mark, audit_tree_group); > - mark->mark.mask = FS_IN_IGNORED; > - return &mark->mark; > + fsnotify_init_mark(&amark->mark, audit_tree_group); > + amark->mark.mask = FS_IN_IGNORED; > + return &amark->mark; > } > > static struct audit_chunk *alloc_chunk(int count) > -- > 2.16.4 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 12/11 TESTSUITE] audit_testsuite: Add stress test for tree watches
{ > + mkdir "$dir/mnt/mnt$i"; > + mkdir "$dir/leaf$i"; > +} > + > +my $stat_pid = fork(); > + > +if ($stat_pid == 0) { > + run_stat($dir, $dirs); > + # Never reached > + exit; > +} > + > +my $mount_pid = fork(); > + > +if ($mount_pid == 0) { > + run_mount($dir, $dirs); > + # Never reached > + exit; > +} > + > +my $key = key_gen(); > + > +my $audit_pid = fork(); > + > +if ($audit_pid == 0) { > + run_mark_audit($dir, $dirs, $key); > + # Never reached > + exit; > +} > + > +# Sleep for a minute to let stress test run... > +sleep(60); > +ok(1); > + > +### > +# cleanup > + > +kill('KILL', $stat_pid, $mount_pid, $audit_pid); > +# Wait for children to terminate > +waitpid($stat_pid, 0); > +waitpid($mount_pid, 0); > +waitpid($audit_pid, 0); > +system("auditctl -D >& /dev/null"); > +umount_all($dir, $dirs, 1); > -- > 2.16.4 > -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 12/11 TESTSUITE] audit_testsuite: Add stress test for tree watches
On Tue, Oct 9, 2018 at 3:40 AM Jan Kara wrote: > On Fri 05-10-18 17:06:22, Paul Moore wrote: > > On Tue, Sep 4, 2018 at 12:06 PM Jan Kara wrote: > > > Add stress test for stressing audit tree watches by adding and deleting > > > rules while events are generated and watched filesystems are mounted and > > > unmounted in parallel. > > > > > > Signed-off-by: Jan Kara > > > --- > > > tests/stress_tree/Makefile | 8 +++ > > > tests/stress_tree/test | 171 > > > + > > > 2 files changed, 179 insertions(+) > > > create mode 100644 tests/stress_tree/Makefile > > > create mode 100755 tests/stress_tree/test > > > > No commentary on the test itself, other than perhaps it should live > > under test_manual/, but in running the tests in a loop today I am > > reliably able to panic my test kernel after ~30m or so. > > Interesting. How do you run the test? Nothing fancy, just a simple bash loop: # cd tests/stress_tree # while ./test; do /bin/true; done > > I'm using the kernel linked below which is Fedora Rawhide + > > selinux/next + audit/next + audit/working-fsnotify_fixes; a link to > > the patches added to the Rawhide kernel can be found in the list > > archive linked below. > > > > * https://groups.google.com/forum/#!topic/kernel-secnext/SFv0d-ij3z8 > > * > > https://copr.fedorainfracloud.org/coprs/pcmoore/kernel-secnext/build/805949 > > > > The initial panic dump is below: > > I can try to reproduce this but could you perhaps find out which of the > list manipulations in audit_remove_tree_rule() hit the corrution? A quick dump of the audit_remove_tree_rule() function makes it look like it is the inner most list_del_init() call. Dump of assembler code for function audit_remove_tree_rule: 646 { 0x811ad900 <+0>: callq 0x81c01850 <__fentry__> 647 struct audit_tree *tree; 648 tree = rule->tree; 0x811ad905 <+5>: push %r14 0x811ad907 <+7>: xor%eax,%eax 0x811ad909 <+9>: push %r13 0x811ad90b <+11>:push %r12 0x811ad90d <+13>:push %rbp 0x811ad90e <+14>:push %rbx 0x811ad90f <+15>:mov0x140(%rdi),%rbp 649 if (tree) { 0x811ad916 <+22>:test %rbp,%rbp 0x811ad919 <+25>:je 0x811ad990 0x811ad91b <+27>:mov%rdi,%rbx 650 spin_lock(&hash_lock); 651 list_del_init(&rule->rlist); 0x811ad92a <+42>:lea0x150(%rbx),%r12 652 if (list_empty(&tree->rules) && !tree->goner) { 653 tree->root = NULL; 0x811ad999 <+153>: movq $0x0,0x8(%rbp) 654 list_del_init(&tree->same_root); 0x811ad9a1 <+161>: lea0x40(%rbp),%r12 655 tree->goner = 1; 0x811ad9c8 <+200>: lea0x30(%rbp),%r12 0x811ad9cc <+204>: movl $0x1,0x4(%rbp) 656 list_move(&tree->list, &prune_list); 657 rule->tree = NULL; 0x811ada21 <+289>: movq $0x0,0x140(%rbx) 658 spin_unlock(&hash_lock); 659 audit_schedule_prune(); 660 return 1; 0x811ada44 <+324>: mov$0x1,%eax 0x811ada49 <+329>: pop%rbx 0x811ada4a <+330>: pop%rbp 0x811ada4b <+331>: pop%r12 0x811ada4d <+333>: pop%r13 0x811ada4f <+335>: pop%r14 0x811ada51 <+337>: retq 0x811ada52: data16 nopw %cs:0x0(%rax,%rax,1) 0x811ada5d: nopl (%rax) 661 } 662 rule->tree = NULL; 0x811ad974 <+116>: movq $0x0,0x140(%rbx) 663 spin_unlock(&hash_lock); 664 return 1; 0x811ad98b <+139>: mov$0x1,%eax 0x811ad990 <+144>: pop%rbx 0x811ad991 <+145>: pop%rbp 0x811ad992 <+146>: pop%r12 0x811ad994 <+148>: pop%r13 0x811ad996 <+150>: pop%r14 0x811ad998 <+152>: retq 665 } 666 return 0; 667 } > > [ 139.619065] list_del corruption. prev->next should be > > 985fa98d4100, but was 985fae91e370 > > [ 139.622504] [ cut here ]-
Re: [PATCH 12/11 TESTSUITE] audit_testsuite: Add stress test for tree watches
On October 11, 2018 7:39:39 AM Jan Kara wrote: > On Wed 10-10-18 02:43:46, Paul Moore wrote: >> On Tue, Oct 9, 2018 at 3:40 AM Jan Kara wrote: >>> On Fri 05-10-18 17:06:22, Paul Moore wrote: >>>> On Tue, Sep 4, 2018 at 12:06 PM Jan Kara wrote: >>>>> Add stress test for stressing audit tree watches by adding and deleting >>>>> rules while events are generated and watched filesystems are mounted and >>>>> unmounted in parallel. >>>>> >>>>> Signed-off-by: Jan Kara >>>>> --- >>>>> tests/stress_tree/Makefile | 8 +++ >>>>> tests/stress_tree/test | 171 >>>>> + >>>>> 2 files changed, 179 insertions(+) >>>>> create mode 100644 tests/stress_tree/Makefile >>>>> create mode 100755 tests/stress_tree/test >>>> >>>> No commentary on the test itself, other than perhaps it should live >>>> under test_manual/, but in running the tests in a loop today I am >>>> reliably able to panic my test kernel after ~30m or so. >>> >>> Interesting. How do you run the test? >> >> Nothing fancy, just a simple bash loop: >> >> # cd tests/stress_tree >> # while ./test; do /bin/true; done > > OK, I did succeed in reproducing some problems with my patches - once I was > able to trigger a livelock and following softlockup warning - this is > actually a problem introduced by my patches, and once a use after free > issue (not sure what that was since after I've added some debugging I > wasn't able to trigger it anymore). Anyway, I'll try more after fixing the > livelock. Do you want me to add fixes on top of my series or just fixup the > original series? Since these are pretty serious bugs, and I try to avoid merging known-broken patches which will go up to Linus, why don't you go ahead and respin the patchset with the new fixes included. You can also use the opportunity to squash in the rename patch and fix that mid-patchset compilation problem that I fixed up during the merge. Thanks. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: 4.9 kernel panic in netlink unicast because audit replace passing audit sock as NULL
On October 11, 2018 10:44:01 PM Kassey Li wrote: > hi, Paul: >we got one kernel panic on 4.9 kernel > >[16237.397896] [2018:10:09 23:06:55]audit: audit_pid=20802 > reset >[16238.098916] [2018:10:09 23:06:57]Unable to handle kernel > NULL pointer dereference at virtual address 0280 > >audit_sock is set to NULL in kauditd_send_skb, but later we > are access it again in audit_replace caused this panic. >is there patch for such SW issue on 4.9 kernel ? > >static int audit_replace(pid_t pid) > { > struct sk_buff *skb = audit_make_reply(0, 0, AUDIT_REPLACE, 0, 0, > &pid, sizeof(pid)); > > if (!skb) > return -ENOMEM; > return netlink_unicast(audit_sock, skb, audit_nlk_portid, 0); > } Hi. Have you been able to reproduce this problem on a recent kernel? Unfortunately there have been some major changes to that area of the code since v4.9 and it is very likely that we have fixed this issue in the current upstream kernel. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: 4.9 kernel panic in netlink unicast because audit replace passing audit sock as NULL
On Fri, Oct 12, 2018 at 3:33 AM Kassey Li wrote: > hi, Paul: > > it is hard to running on upstream kernel on my hardware. > I checked the kernel log that we can see that we are trying to send the sock > to task 20802, but it was killed already. > is there any suggest that we can pick up some patches to backport to 4.9 to > try ? There have been numerous patches which could be applicable to your problem, starting in v4.10 through v4.15. Since this is an old kernel from an upstream community perspective I would suggest reaching out to your Linux distribution's support channels to see if they can help you. If that is not an option, you can see all of the audit patches from the audit team in the audit kernel repo broken down by release at the following location: * git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git * https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git ... patches for Linux vX.Y can be found in branches stable-X.Y. -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH 12/11 TESTSUITE] audit_testsuite: Add stress test for tree watches
On Mon, Oct 15, 2018 at 6:04 AM Jan Kara wrote: > On Thu 11-10-18 19:03:53, Paul Moore wrote: > > On October 11, 2018 7:39:39 AM Jan Kara wrote: > > > On Wed 10-10-18 02:43:46, Paul Moore wrote: > > >> On Tue, Oct 9, 2018 at 3:40 AM Jan Kara wrote: > > >>> On Fri 05-10-18 17:06:22, Paul Moore wrote: > > >>>> On Tue, Sep 4, 2018 at 12:06 PM Jan Kara wrote: > > >>>>> Add stress test for stressing audit tree watches by adding and > > >>>>> deleting > > >>>>> rules while events are generated and watched filesystems are mounted > > >>>>> and > > >>>>> unmounted in parallel. > > >>>>> > > >>>>> Signed-off-by: Jan Kara > > >>>>> --- > > >>>>> tests/stress_tree/Makefile | 8 +++ > > >>>>> tests/stress_tree/test | 171 > > >>>>> + > > >>>>> 2 files changed, 179 insertions(+) > > >>>>> create mode 100644 tests/stress_tree/Makefile > > >>>>> create mode 100755 tests/stress_tree/test > > >>>> > > >>>> No commentary on the test itself, other than perhaps it should live > > >>>> under test_manual/, but in running the tests in a loop today I am > > >>>> reliably able to panic my test kernel after ~30m or so. > > >>> > > >>> Interesting. How do you run the test? > > >> > > >> Nothing fancy, just a simple bash loop: > > >> > > >> # cd tests/stress_tree > > >> # while ./test; do /bin/true; done > > > > > > OK, I did succeed in reproducing some problems with my patches - once I > > > was > > > able to trigger a livelock and following softlockup warning - this is > > > actually a problem introduced by my patches, and once a use after free > > > issue (not sure what that was since after I've added some debugging I > > > wasn't able to trigger it anymore). Anyway, I'll try more after fixing the > > > livelock. Do you want me to add fixes on top of my series or just fixup > > > the > > > original series? > > > > Since these are pretty serious bugs, and I try to avoid merging > > known-broken patches which will go up to Linus, why don't you go ahead > > and respin the patchset with the new fixes included. You can also use > > the opportunity to squash in the rename patch and fix that mid-patchset > > compilation problem that I fixed up during the merge. > > OK, I'm now testing a version with the softlockup fixed and some locking > around untag_chunk() simplified when I had to meddle with that anyway. I'll > see if I can hit further failures... Thanks for the update, let me know how the testing goes ... -- paul moore www.paul-moore.com -- Linux-audit mailing list Linux-audit@redhat.com https://www.redhat.com/mailman/listinfo/linux-audit
Re: [PATCH ghak90 (was ghak32) V4 02/10] audit: add container id
Ooops, I hit send prematurely on this :/ My comments below should stand, but for things like this I usually try to get through the entire patchset before sending my comments as later patches can affect my comments on the earlier patches. On Fri, Oct 19, 2018 at 3:38 PM Paul Moore wrote: > On Tue, Jul 31, 2018 at 4:11 PM Richard Guy Briggs wrote: > > > > Implement the proc fs write to set the audit container identifier of a > > process, emitting an AUDIT_CONTAINER_OP record to document the event. > > > > This is a write from the container orchestrator task to a proc entry of > > the form /proc/PID/audit_containerid where PID is the process ID of the > > newly created task that is to become the first task in a container, or > > an additional task added to a container. > > > > The write expects up to a u64 value (unset: 18446744073709551615). > > > > The writer must have capability CAP_AUDIT_CONTROL. > > > > This will produce a record such as this: > > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set > > opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root > > uid=root tty=ttyS0 ses=1 > > subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash > > exe=/usr/bin/bash res=yes > > You need to update the record type in the example above. > > > The "op" field indicates an initial set. The "pid" to "ses" fields are > > the orchestrator while the "opid" field is the object's PID, the process > > being "contained". Old and new audit container identifier values are > > given in the "contid" fields, while res indicates its success. > > I understand Steve's concern around the "op" field, but I think it > might be a bit premature to think we might not need to do some sort of > audit container ID management in the future that would want to make > use of the CONTAINER_OP message type. I would like to see the "op" > field preserved. > > > It is not permitted to unset the audit container identifier. > > A child inherits its parent's audit container identifier. > > > > See: https://github.com/linux-audit/audit-kernel/issues/90 > > See: https://github.com/linux-audit/audit-userspace/issues/51 > > See: https://github.com/linux-audit/audit-testsuite/issues/64 > > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID > > > > Signed-off-by: Richard Guy Briggs > > Acked-by: Serge Hallyn > > Acked-by: Steve Grubb > > --- > > fs/proc/base.c | 37 + > > include/linux/audit.h | 24 > > include/uapi/linux/audit.h | 2 ++ > > kernel/auditsc.c | 68 > > ++ > > 4 files changed, 131 insertions(+) > > ... > > > @@ -2112,6 +2114,72 @@ int audit_set_loginuid(kuid_t loginuid) > > } > > > > /** > > + * audit_set_contid - set current task's audit_context contid > > + * @contid: contid value > > + * > > + * Returns 0 on success, -EPERM on permission failure. > > + * > > + * Called (set) from fs/proc/base.c::proc_contid_write(). > > + */ > > +int audit_set_contid(struct task_struct *task, u64 contid) > > +{ > > + u64 oldcontid; > > + int rc = 0; > > + struct audit_buffer *ab; > > + uid_t uid; > > + struct tty_struct *tty; > > + char comm[sizeof(current->comm)]; > > + > > + task_lock(task); > > + /* Can't set if audit disabled */ > > + if (!task->audit) { > > + task_unlock(task); > > + return -ENOPROTOOPT; > > + } > > + oldcontid = audit_get_contid(task); > > + read_lock(&tasklist_lock); > > I assume lockdep was happy with nesting the tasklist_lock inside the task > lock? > > > + /* Don't allow the audit containerid to be unset */ > > + if (!audit_contid_valid(contid)) > > + rc = -EINVAL; > > + /* if we don't have caps, reject */ > > + else if (!capable(CAP_AUDIT_CONTROL)) > > + rc = -EPERM; > > + /* if task has children or is not single-threaded, deny */ > > + else if (!list_empty(&task->children)) > > + rc = -EBUSY; > > + else if (!(thread_group_leader(task) && thread_group_empty(task))) > > + rc = -EALREADY; > > + read_unlock(&tasklist_lock); > > + if (!rc)