Re: hit a KASan bug related to Perf during stress test

2016-10-26 Thread Oleg Nesterov
On 10/26, Peter Zijlstra wrote: > > On Tue, Oct 25, 2016 at 04:41:26PM +0200, Oleg Nesterov wrote: > > > > > > So what serialization would close that race? __task_pid_nr_ns() only > > > seems to use RCU nothing more. > > > > I do not see how can we close this race, we obviously do not want to use >

Re: hit a KASan bug related to Perf during stress test

2016-10-26 Thread Peter Zijlstra
On Tue, Oct 25, 2016 at 04:41:26PM +0200, Oleg Nesterov wrote: > > > > So what serialization would close that race? __task_pid_nr_ns() only > > seems to use RCU nothing more. > > I do not see how can we close this race, we obviously do not want to use > any locking. > > That is why I tried to sug

Re: hit a KASan bug related to Perf during stress test

2016-10-25 Thread Oleg Nesterov
On 10/25, Peter Zijlstra wrote: > > On Mon, Oct 24, 2016 at 05:39:08PM +0200, Oleg Nesterov wrote: > > On 10/24, Peter Zijlstra wrote: > > > > > > --- a/kernel/events/core.c > > > +++ b/kernel/events/core.c > > > @@ -1257,7 +1257,14 @@ static u32 perf_event_pid(struct perf_event > > > *event, stru

Re: hit a KASan bug related to Perf during stress test

2016-10-25 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 05:39:08PM +0200, Oleg Nesterov wrote: > On 10/24, Peter Zijlstra wrote: > > > > --- a/kernel/events/core.c > > +++ b/kernel/events/core.c > > @@ -1257,7 +1257,14 @@ static u32 perf_event_pid(struct perf_event *event, > > struct task_struct *p) > > if (event->parent) >

RE: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Ni, BaoleX
shish...@linux.intel.com; Liu, Chuansheng Subject: Re: hit a KASan bug related to Perf during stress test On 10/24, Oleg Nesterov wrote: > > On 10/24, Peter Zijlstra wrote: > > > > --- a/kernel/events/core.c > > +++ b/kernel/events/core.c > > @@ -1257,7 +1257,14

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Oleg Nesterov wrote: > > On 10/24, Peter Zijlstra wrote: > > > > --- a/kernel/events/core.c > > +++ b/kernel/events/core.c > > @@ -1257,7 +1257,14 @@ static u32 perf_event_pid(struct perf_event *event, > > struct task_struct *p) > > if (event->parent) > > event = event->p

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Peter Zijlstra wrote: > > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -1257,7 +1257,14 @@ static u32 perf_event_pid(struct perf_event *event, > struct task_struct *p) > if (event->parent) > event = event->parent; > > - return task_tgid_nr_ns(p,

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 03:25:55PM +0200, Oleg Nesterov wrote: > Well, if we add that PIDTYPE_TGID hack, I think we can do something > like below... > > Or do you think we should add a perf_alive() check into perf_event_pid() > for a quick fix? That is what I was thinking. Then we don't need to d

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 03:40:13PM +0200, Oleg Nesterov wrote: > On 10/24, Oleg Nesterov wrote: > > > > -static u32 perf_event_pid(struct perf_event *event, struct task_struct *p) > > +static u32 perf_event_xxx(struct perf_event *event, struct task_struct *p, > > + enum pi

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Oleg Nesterov wrote: > > -static u32 perf_event_pid(struct perf_event *event, struct task_struct *p) > +static u32 perf_event_xxx(struct perf_event *event, struct task_struct *p, > + enum pid_type type) > { > + pid_t nr; > /* >* only top leve

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Peter Zijlstra wrote: > > On Mon, Oct 24, 2016 at 02:29:42PM +0200, Oleg Nesterov wrote: > > On 10/24, Peter Zijlstra wrote: > > > > > > Still, I wonder if returning 0 is the right thing. 0 is a 'valid' PID > > > for the init/idle task. > > > > Yes, now I think that -1 would make more sen

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 02:29:42PM +0200, Oleg Nesterov wrote: > On 10/24, Peter Zijlstra wrote: > > > > On Mon, Oct 24, 2016 at 02:10:31PM +0200, Oleg Nesterov wrote: > > > --- x/kernel/pid.c > > > +++ x/kernel/pid.c > > > @@ -526,8 +526,11 @@ pid_t __task_pid_nr_ns(struct task_struc > > > if (!

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Peter Zijlstra wrote: > > On Mon, Oct 24, 2016 at 02:10:31PM +0200, Oleg Nesterov wrote: > > --- x/kernel/pid.c > > +++ x/kernel/pid.c > > @@ -526,8 +526,11 @@ pid_t __task_pid_nr_ns(struct task_struc > > if (!ns) > > ns = task_active_pid_ns(current); > > if (likely(pi

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 02:21:23PM +0200, Oleg Nesterov wrote: > > Should we do the same for perf_event_tid() and report -1 as the pid/tid > > in the !alive case? -1 should be an obvious invalid pid since we limit > > the pid-space to less than 32 bits. > > task_pid_nr_ns() is always safe, it call

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Peter Zijlstra wrote: > > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > > > > --- x/kernel/events/core.c > > +++ x/kernel/events/core.c > > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev > > if (event->parent) > > event = event->parent; >

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 02:10:31PM +0200, Oleg Nesterov wrote: > --- x/kernel/pid.c > +++ x/kernel/pid.c > @@ -526,8 +526,11 @@ pid_t __task_pid_nr_ns(struct task_struc > if (!ns) > ns = task_active_pid_ns(current); > if (likely(pid_alive(task))) { > - if (type

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 02:02:32PM +0200, Oleg Nesterov wrote: > Perhaps. Or into task_tgid(). Or even the patch below, __task_pid_nr_ns() > is always safe. This certainly needs some cleanups. > --- x/include/linux/pid.h > +++ x/include/linux/pid.h > @@ -8,7 +8,8 @@ enum pid_type > PIDTYPE_

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Oleg Nesterov wrote: > > On 10/24, Peter Zijlstra wrote: > > > > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > > > > > > --- x/kernel/events/core.c > > > +++ x/kernel/events/core.c > > > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev > > > if (event->p

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > Yes, current is still valid. > > But nothing protects current->group_leader or parent/real_parent, they > can point to the exited/freed task. We really need to nullify them in > __unhash_process() to catch the problems like this, I w

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 02:04:11PM +0200, Jiri Olsa wrote: > On Mon, Oct 24, 2016 at 01:29:45PM +0200, Peter Zijlstra wrote: > > Hurm, then again, I imagine that after unhash_process the PID/TID could > > be instantly re-used and then we're still confused. > > sounds bad.. I haven't checked the r

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Jiri Olsa
On Mon, Oct 24, 2016 at 01:29:45PM +0200, Peter Zijlstra wrote: > On Mon, Oct 24, 2016 at 01:27:32PM +0200, Peter Zijlstra wrote: > > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > > > How about the trivial fix below? > > > > > > Oleg. > > > > > > --- x/kernel/events/core.c > >

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Peter Zijlstra wrote: > > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > > > > --- x/kernel/events/core.c > > +++ x/kernel/events/core.c > > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev > > if (event->parent) > > event = event->parent; >

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 01:27:32PM +0200, Peter Zijlstra wrote: > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > > How about the trivial fix below? > > > > Oleg. > > > > --- x/kernel/events/core.c > > +++ x/kernel/events/core.c > > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > How about the trivial fix below? > > Oleg. > > --- x/kernel/events/core.c > +++ x/kernel/events/core.c > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev > if (event->parent) > event = event->parent

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > On 10/24, Peter Zijlstra wrote: > > > > > [32738.867020] [] task_tgid_nr_ns+0x35/0xb0 > > > > So here we did: perf_event_[pt]id(event, current); > > > > How can _current_ not be valid anymore? > > ... > > > > [32739.040207] [] __c

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Oleg Nesterov
On 10/24, Peter Zijlstra wrote: > > > [32738.867020] [] task_tgid_nr_ns+0x35/0xb0 > > So here we did: perf_event_[pt]id(event, current); > > How can _current_ not be valid anymore? ... > > [32739.040207] [] __call_rcu+0x12c/0x450 > > And while we just called release_task(), that call_rcu() shou

Re: hit a KASan bug related to Perf during stress test

2016-10-24 Thread Peter Zijlstra
On Mon, Oct 24, 2016 at 09:35:46AM +, Ni, BaoleX wrote: > > [32736.018823] BUG: KASan: use after free in task_tgid_nr_ns+0x35/0xb0 at > addr 8800265568c0 > [32736.028309] Read of size 8 by task dumpsys/11268 > [32736.033511] > =