Re: [PATCH v2] fix use-after-free in perf_sched__lat
On Thu, Jul 04, 2019 at 07:21:28PM +0800, liwei (GF) wrote:
> Hi Arnaldo,
> I found this issue has not been fixed in mainline yet, please take a glance
> at this.

See below.

> On 2019/5/23 10:50, Namhyung Kim wrote:
> > On Wed, May 22, 2019 at 08:08:23AM -0300, Arnaldo Carvalho de Melo wrote:
> >> I'll try to analyse this one soon, but my first impression was that we
> >> should just grab reference counts when keeping a pointer to those
> >> threads instead of keeping _all_ threads alive when supposedly we could
> >> throw away unreferenced data structures.
> >>
> >> But this is just a first impression from just reading the patch
> >> description, probably I'm missing something.
> >
> > No, thread refcounting is fine.  We already do that, and only threads
> > holding a refcount are accessed.
> >
> > But the problem is the head of the list.  After the thread is used, its
> > refcount is dropped and the thread is removed from the list and destroyed.
> > However, the head of the list lives in a struct machine that was already
> > freed along with the session.

I see, and sorry for the delay. Thanks for bringing this up again. How
about the following instead? I tested it with 'perf top', which exercises
this code in a multithreaded fashion, as well as with the set of steps in
your patch commit log, and it seems to work.

- Arnaldo

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 86fede2b7507..cf826eca3aaf 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -210,6 +210,18 @@ void machine__exit(struct machine *machine)
 
 	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
 		struct threads *threads = &machine->threads[i];
+		struct thread *thread, *n;
+		/*
+		 * Forget about the dead, at this point whatever threads were
+		 * left in the dead lists better have a reference count taken
+		 * by who is using them, and then, when they drop those references
+		 * and it finally hits zero, thread__put() will check and see that
+		 * it's not in the dead threads list and will not try to remove it
+		 * from there, just calling thread__delete() straight away.
+		 */
+		list_for_each_entry_safe(thread, n, &threads->dead, node)
+			list_del_init(&thread->node);
+
 		exit_rwsem(&threads->lock);
 	}
 }
@@ -1759,9 +1771,11 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
 	if (threads->last_match == th)
 		threads__set_last_match(threads, NULL);
 
-	BUG_ON(refcount_read(&th->refcnt) == 0);
 	if (lock)
 		down_write(&threads->lock);
+
+	BUG_ON(refcount_read(&th->refcnt) == 0);
+
 	rb_erase_cached(&th->rb_node, &threads->entries);
 	RB_CLEAR_NODE(&th->rb_node);
 	--threads->nr;
@@ -1771,9 +1785,16 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
 	 * will be called and we will remove it from the dead_threads list.
 	 */
 	list_add_tail(&th->node, &threads->dead);
+
+	/*
+	 * We need to do the put here because if this is the last refcount,
+	 * then we will be touching the threads->dead head when removing the
+	 * thread.
+	 */
+	thread__put(th);
+
 	if (lock)
 		up_write(&threads->lock);
-	thread__put(th);
 }
 
 void machine__remove_thread(struct machine *machine, struct thread *th)
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 4db8cd2a33ae..873ab505ca80 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -125,10 +125,27 @@ void thread__put(struct thread *thread)
 {
 	if (thread && refcount_dec_and_test(&thread->refcnt)) {
 		/*
-		 * Remove it from the dead_threads list, as last reference
-		 * is gone.
+		 * Remove it from the dead threads list, as last reference is
+		 * gone, if it is in a dead threads list.
+		 *
+		 * We may not be there anymore if say, the machine where it was
+		 * stored was already deleted, so we already removed it from
+		 * the dead threads and some other piece of code still keeps a
+		 * reference.
+		 *
+		 * This is what 'perf sched' does and finally drops it in
+		 * perf_sched__lat(), where it calls perf_sched__read_events(),
+		 * that processes the events by creating a session and deleting
+		 * it, which ends up destroying the list heads for the dead
+		 * threads, but before it does that it removes all threads from
+		 * it using list_del_init().
+		 *
+		 * So we need to check here if it is in a dead threads list and
+		 * if so, remove it before finally deleting the thread, to avoid
+		 * a use-after-free situation.
 		 */
-
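The approach described in the comments above (detach every entry when the owner goes away, then have the final put check whether the node is still on a list) can be sketched in isolation. This is only an illustration under stated assumptions: `struct owner`, `struct obj`, `owner_exit()` and `obj_put()` are hypothetical stand-ins for `struct machine`, `struct thread`, `machine__exit()` and `thread__put()`, the list helpers are simplified re-implementations rather than the kernel-style headers perf uses, and the `list_empty()` check is inferred from the patch comment because the corresponding code line is not visible in the message as archived.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified circular doubly-linked list, modelled on the kernel-style API. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink the node and leave it self-linked, so list_empty(node) is true. */
static void list_del_init(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	list_init(node);
}

/* Hypothetical stand-ins: owner ~ struct machine, obj ~ struct thread. */
struct owner { struct list_head dead; };
struct obj { int refcnt; struct list_head node; };

static int objs_deleted;

static struct obj *obj_new(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o) {
		o->refcnt = 1;	/* plain int: this sketch is not thread-safe */
		list_init(&o->node);
	}
	return o;
}

/* Owner teardown: detach every entry so nothing points at the head anymore. */
static void owner_exit(struct owner *ow)
{
	while (!list_empty(&ow->dead))
		list_del_init(ow->dead.next);
}

/* Final put: unlink only if the node is still on some list, then free. */
static void obj_put(struct obj *o)
{
	if (o && --o->refcnt == 0) {
		if (!list_empty(&o->node))
			list_del_init(&o->node);
		free(o);
		objs_deleted++;
	}
}

/* Drops the last reference after the owner is freed; returns deletions so far. */
int put_after_owner_is_gone(void)
{
	struct owner *ow = malloc(sizeof(*ow));
	struct obj *o = obj_new();

	if (!ow || !o) {
		free(ow);
		free(o);
		return -1;
	}
	list_init(&ow->dead);
	list_add_tail(&o->node, &ow->dead);

	owner_exit(ow);		/* entries detached first ... */
	free(ow);		/* ... so freeing the head is safe */

	obj_put(o);		/* node is self-linked, freed head is never touched */
	return objs_deleted;
}
```

Because `list_del_init()` leaves the node pointing at itself, a late unlink would only write to the node's own fields; the explicit check simply makes the "not on any list" case visible at the call site.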
Re: [PATCH v2] fix use-after-free in perf_sched__lat
Hi Arnaldo,
I found this issue has not been fixed in mainline yet, please take a glance
at this.

On 2019/5/23 10:50, Namhyung Kim wrote:
> On Wed, May 22, 2019 at 08:08:23AM -0300, Arnaldo Carvalho de Melo wrote:
>> On Wed, May 22, 2019 at 03:56:10PM +0900, Namhyung Kim wrote:
>>> On Wed, May 08, 2019 at 10:36:48PM +0800, Wei Li wrote:
>>>> After a thread is added to machine->threads[i].dead in
>>>> __machine__remove_thread, the machine->threads[i].dead is freed
>>>> when calling free(session) in perf_session__delete(). So it gets a
>>>> Segmentation fault when accessing it in thread__put().
>>>>
>>>> In this patch, we delay the perf_session__delete until all threads
>>>> have been deleted.
>>>>
>>>> This can be reproduced by the following steps:
>>>>   ulimit -c unlimited
>>>>   export MALLOC_MMAP_THRESHOLD_=0
>>>>   perf sched record sleep 10
>>>>   perf sched latency --sort max
>>>>   Segmentation fault (core dumped)
>>>>
>>>> Signed-off-by: Zhipeng Xie
>>>> Signed-off-by: Wei Li
>>>
>>> Acked-by: Namhyung Kim
>>
>> I'll try to analyse this one soon, but my first impression was that we
>> should just grab reference counts when keeping a pointer to those
>> threads instead of keeping _all_ threads alive when supposedly we could
>> throw away unreferenced data structures.
>>
>> But this is just a first impression from just reading the patch
>> description, probably I'm missing something.
>
> No, thread refcounting is fine.  We already do that, and only threads
> holding a refcount are accessed.
>
> But the problem is the head of the list.  After the thread is used, its
> refcount is dropped and the thread is removed from the list and destroyed.
> However, the head of the list lives in a struct machine that was already
> freed along with the session.
>
> Thanks,
> Namhyung
>
>>
>> Thanks for providing instructions on readily triggering the segfault.
>>
>> - Arnaldo
>
> .
>

Thanks,
Wei
Re: [PATCH v2] fix use-after-free in perf_sched__lat
On Wed, May 22, 2019 at 08:08:23AM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, May 22, 2019 at 03:56:10PM +0900, Namhyung Kim wrote:
> > On Wed, May 08, 2019 at 10:36:48PM +0800, Wei Li wrote:
> > > After a thread is added to machine->threads[i].dead in
> > > __machine__remove_thread, the machine->threads[i].dead is freed
> > > when calling free(session) in perf_session__delete(). So it gets a
> > > Segmentation fault when accessing it in thread__put().
> > >
> > > In this patch, we delay the perf_session__delete until all threads
> > > have been deleted.
> > >
> > > This can be reproduced by the following steps:
> > >   ulimit -c unlimited
> > >   export MALLOC_MMAP_THRESHOLD_=0
> > >   perf sched record sleep 10
> > >   perf sched latency --sort max
> > >   Segmentation fault (core dumped)
> > >
> > > Signed-off-by: Zhipeng Xie
> > > Signed-off-by: Wei Li
> >
> > Acked-by: Namhyung Kim
>
> I'll try to analyse this one soon, but my first impression was that we
> should just grab reference counts when keeping a pointer to those
> threads instead of keeping _all_ threads alive when supposedly we could
> throw away unreferenced data structures.
>
> But this is just a first impression from just reading the patch
> description, probably I'm missing something.

No, thread refcounting is fine.  We already do that, and only threads
holding a refcount are accessed.

But the problem is the head of the list.  After the thread is used, its
refcount is dropped and the thread is removed from the list and destroyed.
However, the head of the list lives in a struct machine that was already
freed along with the session.

Thanks,
Namhyung

>
> Thanks for providing instructions on readily triggering the segfault.
>
> - Arnaldo
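The point about the list head can be made concrete with a small sketch. It shows that unlinking the only entry of a circular list writes into the head itself, which is why the head must still be valid memory when the last reference is dropped. The names `struct owner`, `struct item` and `unlink_touches_head()` are hypothetical, and the list helpers are simplified re-implementations, not the headers perf actually uses. Nothing is freed here, so the sketch stays well-defined.

```c
#include <assert.h>

/* Simplified circular doubly-linked list, modelled on the kernel-style API. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlinking writes through node->prev and node->next, i.e. into the neighbours. */
static void list_del_init(struct list_head *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	list_init(node);
}

/* Hypothetical stand-ins for struct machine (owner) and struct thread (item). */
struct owner { struct list_head dead; };
struct item { struct list_head node; };

/*
 * Returns 1 if unlinking the only item rewrote both pointers in the owner's
 * list head. Had the owner been freed before the unlink, those two stores
 * would have landed in freed memory.
 */
int unlink_touches_head(void)
{
	struct owner o;
	struct item it;

	list_init(&o.dead);
	list_add_tail(&it.node, &o.dead);
	assert(o.dead.next == &it.node && o.dead.prev == &it.node);

	list_del_init(&it.node);	/* stores into o.dead.next and o.dead.prev */
	return o.dead.next == &o.dead && o.dead.prev == &o.dead;
}
```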
Re: [PATCH v2] fix use-after-free in perf_sched__lat
On Wed, May 22, 2019 at 03:56:10PM +0900, Namhyung Kim wrote:
> On Wed, May 08, 2019 at 10:36:48PM +0800, Wei Li wrote:
> > After a thread is added to machine->threads[i].dead in
> > __machine__remove_thread, the machine->threads[i].dead is freed
> > when calling free(session) in perf_session__delete(). So it gets a
> > Segmentation fault when accessing it in thread__put().
> >
> > In this patch, we delay the perf_session__delete until all threads
> > have been deleted.
> >
> > This can be reproduced by the following steps:
> >   ulimit -c unlimited
> >   export MALLOC_MMAP_THRESHOLD_=0
> >   perf sched record sleep 10
> >   perf sched latency --sort max
> >   Segmentation fault (core dumped)
> >
> > Signed-off-by: Zhipeng Xie
> > Signed-off-by: Wei Li
>
> Acked-by: Namhyung Kim

I'll try to analyse this one soon, but my first impression was that we
should just grab reference counts when keeping a pointer to those
threads instead of keeping _all_ threads alive when supposedly we could
throw away unreferenced data structures.

But this is just a first impression from just reading the patch
description, probably I'm missing something.

Thanks for providing instructions on readily triggering the segfault.

- Arnaldo
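The "grab a reference when keeping a pointer" idea mentioned above follows the usual get/put pattern. A minimal sketch, with all names hypothetical (`struct obj`, `obj_get()`, `obj_put()`, `holder_outlives_table()`), and using a plain int counter where perf uses refcount_t, so this version is not thread-safe:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for struct thread. */
struct obj { int refcnt; int id; };

static int live_objs;

static struct obj *obj_new(int id)
{
	struct obj *o = malloc(sizeof(*o));

	if (o) {
		o->refcnt = 1;	/* the creator owns the first reference */
		o->id = id;
		live_objs++;
	}
	return o;
}

/* Anyone who stores the pointer for later takes a reference of their own. */
static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcnt++;
	return o;
}

/* The object is freed only when the last reference is dropped. */
static void obj_put(struct obj *o)
{
	if (o && --o->refcnt == 0) {
		free(o);
		live_objs--;
	}
}

/* Returns the id read through the kept pointer, or -1 on failure. */
int holder_outlives_table(void)
{
	struct obj *table_ref = obj_new(42);	/* reference held by a lookup table */
	struct obj *kept;
	int id;

	if (!table_ref)
		return -1;

	kept = obj_get(table_ref);	/* long-lived user takes its own reference */
	obj_put(table_ref);		/* table goes away, object survives */

	id = kept->id;			/* still valid: kept holds a reference */
	obj_put(kept);			/* last reference, object is freed here */
	return live_objs == 0 ? id : -1;
}
```

This keeps the object itself alive, but on its own it says nothing about the memory of a list head that the object is still linked into.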
Re: [PATCH v2] fix use-after-free in perf_sched__lat
On Wed, May 08, 2019 at 10:36:48PM +0800, Wei Li wrote:
> After a thread is added to machine->threads[i].dead in
> __machine__remove_thread, the machine->threads[i].dead is freed
> when calling free(session) in perf_session__delete(). So it gets a
> Segmentation fault when accessing it in thread__put().
>
> In this patch, we delay the perf_session__delete until all threads
> have been deleted.
>
> This can be reproduced by the following steps:
>   ulimit -c unlimited
>   export MALLOC_MMAP_THRESHOLD_=0
>   perf sched record sleep 10
>   perf sched latency --sort max
>   Segmentation fault (core dumped)
>
> Signed-off-by: Zhipeng Xie
> Signed-off-by: Wei Li

Acked-by: Namhyung Kim

Thanks,
Namhyung

> ---
>  tools/perf/builtin-sched.c | 63 ++
>  1 file changed, 43 insertions(+), 20 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 275f2d92a7bf..8a4841fa124c 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -1774,7 +1774,8 @@ static int perf_sched__process_comm(struct perf_tool *tool __maybe_unused,
>  	return 0;
>  }
>
> -static int perf_sched__read_events(struct perf_sched *sched)
> +static int __perf_sched__read_events(struct perf_sched *sched,
> +					struct perf_session *session)
>  {
>  	const struct perf_evsel_str_handler handlers[] = {
>  		{ "sched:sched_switch", process_sched_switch_event, },
> @@ -1783,30 +1784,17 @@ static int perf_sched__read_events(struct perf_sched *sched)
>  		{ "sched:sched_wakeup_new", process_sched_wakeup_event, },
>  		{ "sched:sched_migrate_task", process_sched_migrate_task_event, },
>  	};
> -	struct perf_session *session;
> -	struct perf_data data = {
> -		.path = input_name,
> -		.mode = PERF_DATA_MODE_READ,
> -		.force = sched->force,
> -	};
> -	int rc = -1;
> -
> -	session = perf_session__new(&data, false, &sched->tool);
> -	if (session == NULL) {
> -		pr_debug("No Memory for session\n");
> -		return -1;
> -	}
>
>  	symbol__init(&session->header.env);
>
>  	if (perf_session__set_tracepoints_handlers(session, handlers))
> -		goto out_delete;
> +		return -1;
>
>  	if (perf_session__has_traces(session, "record -R")) {
>  		int err = perf_session__process_events(session);
>  		if (err) {
>  			pr_err("Failed to process events, error %d", err);
> -			goto out_delete;
> +			return -1;
>  		}
>
>  		sched->nr_events = session->evlist->stats.nr_events[0];
> @@ -1814,9 +1802,28 @@ static int perf_sched__read_events(struct perf_sched *sched)
>  		sched->nr_lost_chunks = session->evlist->stats.nr_events[PERF_RECORD_LOST];
>  	}
>
> -	rc = 0;
> -out_delete:
> +	return 0;
> +}
> +
> +static int perf_sched__read_events(struct perf_sched *sched)
> +{
> +	struct perf_session *session;
> +	struct perf_data data = {
> +		.path = input_name,
> +		.mode = PERF_DATA_MODE_READ,
> +		.force = sched->force,
> +	};
> +	int rc;
> +
> +	session = perf_session__new(&data, false, &sched->tool);
> +	if (session == NULL) {
> +		pr_debug("No Memory for session\n");
> +		return -1;
> +	}
> +
> +	rc = __perf_sched__read_events(sched, session);
>  	perf_session__delete(session);
> +
>  	return rc;
>  }
>
> @@ -3130,12 +3137,25 @@ static void perf_sched__merge_lat(struct perf_sched *sched)
>
>  static int perf_sched__lat(struct perf_sched *sched)
>  {
> +	struct perf_session *session;
> +	struct perf_data data = {
> +		.path = input_name,
> +		.mode = PERF_DATA_MODE_READ,
> +		.force = sched->force,
> +	};
>  	struct rb_node *next;
> +	int rc = -1;
>
>  	setup_pager();
>
> -	if (perf_sched__read_events(sched))
> +	session = perf_session__new(&data, false, &sched->tool);
> +	if (session == NULL) {
> +		pr_debug("No Memory for session\n");
>  		return -1;
> +	}
> +
> +	if (__perf_sched__read_events(sched, session))
> +		goto out_delete;
>
>  	perf_sched__merge_lat(sched);
>  	perf_sched__sort_lat(sched);
> @@ -3164,7 +3184,10 @@ static int perf_sched__lat(struct perf_sched *sched)
>  	print_bad_events(sched);
>  	printf("\n");
>
> -	return 0;
> +	rc = 0;
> +out_delete:
> +	perf_session__delete(session);
> +	return rc;
>  }
>
>  static int setup_map_cpus(struct perf_sched *sched)
> --
> 2.17.1
>
[PATCH v2] fix use-after-free in perf_sched__lat
After a thread is added to machine->threads[i].dead in
__machine__remove_thread, the machine->threads[i].dead is freed
when calling free(session) in perf_session__delete(). So it gets a
Segmentation fault when accessing it in thread__put().

In this patch, we delay the perf_session__delete until all threads
have been deleted.

This can be reproduced by the following steps:
  ulimit -c unlimited
  export MALLOC_MMAP_THRESHOLD_=0
  perf sched record sleep 10
  perf sched latency --sort max
  Segmentation fault (core dumped)

Signed-off-by: Zhipeng Xie
Signed-off-by: Wei Li
---
 tools/perf/builtin-sched.c | 63 ++
 1 file changed, 43 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 275f2d92a7bf..8a4841fa124c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1774,7 +1774,8 @@ static int perf_sched__process_comm(struct perf_tool *tool __maybe_unused,
 	return 0;
 }

-static int perf_sched__read_events(struct perf_sched *sched)
+static int __perf_sched__read_events(struct perf_sched *sched,
+					struct perf_session *session)
 {
 	const struct perf_evsel_str_handler handlers[] = {
 		{ "sched:sched_switch", process_sched_switch_event, },
@@ -1783,30 +1784,17 @@ static int perf_sched__read_events(struct perf_sched *sched)
 		{ "sched:sched_wakeup_new", process_sched_wakeup_event, },
 		{ "sched:sched_migrate_task", process_sched_migrate_task_event, },
 	};
-	struct perf_session *session;
-	struct perf_data data = {
-		.path = input_name,
-		.mode = PERF_DATA_MODE_READ,
-		.force = sched->force,
-	};
-	int rc = -1;
-
-	session = perf_session__new(&data, false, &sched->tool);
-	if (session == NULL) {
-		pr_debug("No Memory for session\n");
-		return -1;
-	}

 	symbol__init(&session->header.env);

 	if (perf_session__set_tracepoints_handlers(session, handlers))
-		goto out_delete;
+		return -1;

 	if (perf_session__has_traces(session, "record -R")) {
 		int err = perf_session__process_events(session);
 		if (err) {
 			pr_err("Failed to process events, error %d", err);
-			goto out_delete;
+			return -1;
 		}

 		sched->nr_events = session->evlist->stats.nr_events[0];
@@ -1814,9 +1802,28 @@ static int perf_sched__read_events(struct perf_sched *sched)
 		sched->nr_lost_chunks = session->evlist->stats.nr_events[PERF_RECORD_LOST];
 	}

-	rc = 0;
-out_delete:
+	return 0;
+}
+
+static int perf_sched__read_events(struct perf_sched *sched)
+{
+	struct perf_session *session;
+	struct perf_data data = {
+		.path = input_name,
+		.mode = PERF_DATA_MODE_READ,
+		.force = sched->force,
+	};
+	int rc;
+
+	session = perf_session__new(&data, false, &sched->tool);
+	if (session == NULL) {
+		pr_debug("No Memory for session\n");
+		return -1;
+	}
+
+	rc = __perf_sched__read_events(sched, session);
 	perf_session__delete(session);
+
 	return rc;
 }

@@ -3130,12 +3137,25 @@ static void perf_sched__merge_lat(struct perf_sched *sched)

 static int perf_sched__lat(struct perf_sched *sched)
 {
+	struct perf_session *session;
+	struct perf_data data = {
+		.path = input_name,
+		.mode = PERF_DATA_MODE_READ,
+		.force = sched->force,
+	};
 	struct rb_node *next;
+	int rc = -1;

 	setup_pager();

-	if (perf_sched__read_events(sched))
+	session = perf_session__new(&data, false, &sched->tool);
+	if (session == NULL) {
+		pr_debug("No Memory for session\n");
 		return -1;
+	}
+
+	if (__perf_sched__read_events(sched, session))
+		goto out_delete;

 	perf_sched__merge_lat(sched);
 	perf_sched__sort_lat(sched);
@@ -3164,7 +3184,10 @@ static int perf_sched__lat(struct perf_sched *sched)
 	print_bad_events(sched);
 	printf("\n");

-	return 0;
+	rc = 0;
+out_delete:
+	perf_session__delete(session);
+	return rc;
 }

 static int setup_map_cpus(struct perf_sched *sched)
--
2.17.1
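The shape of this patch, stripped of perf specifics, is to move creation and deletion of the session out of the helper and into the caller, so the session outlives everything that depends on it. The sketch below illustrates only that ownership pattern with a single cleanup label; `struct session`, `session_new()`, `session_delete()`, `read_events()` and `report()` are hypothetical names, not perf APIs, and the event count of 3 is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_session. */
struct session { int nr_events; };

static int sessions_alive;

static struct session *session_new(void)
{
	struct session *s = calloc(1, sizeof(*s));

	if (s)
		sessions_alive++;
	return s;
}

static void session_delete(struct session *s)
{
	free(s);
	sessions_alive--;
}

/* The helper only uses the session it is given; it no longer owns it. */
static int read_events(struct session *s, int fail)
{
	if (fail)
		return -1;
	s->nr_events = 3;	/* placeholder for real event processing */
	return 0;
}

/* Returns the number of events seen, or -1 on error. */
int report(int fail)
{
	struct session *s;
	int rc = -1;

	s = session_new();
	if (s == NULL)
		return -1;

	if (read_events(s, fail))
		goto out_delete;

	/* Everything that depends on session-owned data runs here. */
	rc = s->nr_events;
out_delete:
	session_delete(s);	/* single exit path, after the last user */
	return rc;
}
```

Calling `report(0)` takes the success path and `report(1)` the error path; both go through `out_delete`, so the session is released exactly once either way.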