On 2020-07-05 11:11, Paul Moore wrote:
> On Sat, Jun 27, 2020 at 9:23 AM Richard Guy Briggs wrote:
> > Require the target task to be a descendant of the container
> > orchestrator/engine.
> >
> > You would only change the audit container ID from one set or inherited
> > value to another if you were nesting containers.
> >
> > If changing the contid, the container orchestrator/engine must be a
> > descendant and not same orchestrator as the one that set it so it is not
> > possible to change the contid of another orchestrator's container.
Are we able to agree on the premises above? Is anything asserted that
should not be and is there anything missing?
I've been sitting on my response below for more than a week trying to
understand the issues raised and to give it the proper attention to a
reply. Please excuse my tardiness at replying on this issue since I'm
still having trouble thinking through all the scenarios for nesting.
> > Since the task_is_descendant() function is used in YAMA and in audit,
> > remove the duplication and pull the function into kernel/core/sched.c
> >
> > Signed-off-by: Richard Guy Briggs
> > ---
> > include/linux/sched.h| 3 +++
> > kernel/audit.c | 23 +--
> > kernel/sched/core.c | 33 +
> > security/yama/yama_lsm.c | 33 -
> > 4 files changed, 57 insertions(+), 35 deletions(-)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 2213ac670386..06938d0b9e0c 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2047,4 +2047,7 @@ static inline void rseq_syscall(struct pt_regs *regs)
> >
> > const struct cpumask *sched_trace_rd_span(struct root_domain *rd);
> >
> > +extern int task_is_descendant(struct task_struct *parent,
> > + struct task_struct *child);
> > +
> > #endif
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index a862721dfd9b..efa65ec01239 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -2713,6 +2713,20 @@ int audit_signal_info(int sig, struct task_struct *t)
> > return audit_signal_info_syscall(t);
> > }
> >
> > +static bool audit_contid_isnesting(struct task_struct *tsk)
> > +{
> > + bool isowner = false;
> > + bool ownerisparent = false;
> > +
> > + rcu_read_lock();
> > + if (tsk->audit && tsk->audit->cont) {
> > + isowner = current == tsk->audit->cont->owner;
> > + ownerisparent = task_is_descendant(tsk->audit->cont->owner,
> > current);
>
> I want to make sure I'm understanding this correctly and I keep
> mentally tripping over something: it seems like for a given audit
> container ID a task is either the owner or a descendent, there is no
> third state, is that correct?
Sure there is. It could be another owner (which is addressed when we
search for an existing contobj match), or in the next patch, the
owner's parent if nested or a peer.
> Assuming that is true, can the descendent check simply be a negative
> owner check given they both have the same audit container ID?
There isn't actually a check in my code for the orchestrator contid and
task contid being the same. Maybe I was making this check more
complicated than necessary, and still incomplete, but see below for more...
> > + }
> > + rcu_read_unlock();
> > + return !isowner && ownerisparent;
> > +}
> > +
> > /*
> > * audit_set_contid - set current task's audit contid
> > * @task: target task
> > @@ -2755,8 +2769,13 @@ int audit_set_contid(struct task_struct *task, u64
> > contid)
> > rc = -EBUSY;
> > goto unlock;
> > }
> > - /* if contid is already set, deny */
> > - if (audit_contid_set(task))
> > + /* if task is not descendant, block */
> > + if (task == current || !task_is_descendant(current, task)) {
>
> I'm also still fuzzy on why we can't let a task set it's own audit
> container ID, assuming it meets all the criteria established in patch
> 2/13. It somewhat made sense when you were tracking inherited vs
> explicitly set audit container IDs, but that doesn't appear to be the
> case so far in this patchset, yes?
I'm still having a strong reluctance to permit this but can't come up
with a solid technical reason right now, but it feels like a layer
violation. If we forbid it and discover it necessary and harmless, then
permitting it won't break the API. If we permit it and later discover a
reason it causes a problem, then blocking it will break the API. I have
heard that there are cases where there is no orchestrator/engine, so in
those cases I conclude that a process would need to set its own contid
but I'm having trouble recalling what those circumstances are.
I also was seriously considering blocking any contid set on the initial
user or PID namespaces to avoid polluting them, and even had a tested
patch to implement it, but this starts