Hi Tao, On Wed, Mar 13, 2024 at 11:23:56AM +0800, Tao Liu wrote: > Hi Aditya, > > On Tue, Mar 12, 2024 at 5:12 PM Aditya Gupta <adit...@linux.ibm.com> wrote: > > > > Hi Tao, > > > > > <...> > > > > > > +crash_target *target = NULL; > > > + > > > void > > > crash_target_init (void) > > > { > > > int nr_cpus = crash_get_nr_cpus(); > > > - crash_target *target = new crash_target (); > > > + target = new crash_target (); > > > > > > /* Own the target until it is successfully pushed. */ > > > target_ops_up target_holder (target); > > > @@ -137,29 +139,35 @@ crash_target_init (void) > > > reinit_frame_cache (); > > > } > > > > > > -/* > > > - * Change gdb's thread context to the thread on given CPU > > > - **/ > > > extern "C" int > > > -gdb_change_cpu_context(unsigned int cpu) > > > +gdb_change_thread_context (ulong task) > > > { > > > - ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu); > > > - inferior *inf = current_inferior (); > > > - thread_info *tp = find_thread_ptid (inf, ptid); > > > - > > > - if (tp == nullptr) > > > + int tried = 0; > > > + inferior* inf = current_inferior(); > > > + int cpu = crash_set_thread(task); > > > + if (cpu < 0) > > > return FALSE; > > > > > > - /* Making sure that crash's context is same */ > > > - set_cpu(cpu, FALSE); > > > - > > > - /* Switch to the thread */ > > > - switch_to_thread(tp); > > > - > > > - /* Fetch/Refresh thread's registers */ > > > - gdb_refresh_regcache(cpu); > > > + ptid_t ptid = ptid_t(CRASH_INFERIOR_PID, 0, cpu); > > > > > > - return TRUE; > > > +retry: > > > + thread_info *tp = find_thread_ptid (inf, ptid); > > > + if (tp == nullptr && !tried) { > > > + thread_info *thread = add_thread_silent(target, > > > + ptid_t(CRASH_INFERIOR_PID, 0, cpu)); > > > > A minor comment. > > Is it possible to do it without needing 'target' as global ? I see > > there's 'inf->current_top_target()', but it's return type if target_ops, > > don't know if it can be used. > > > Thanks for your comments, however I guess it is not workable for the > code change. What we want is to add a thread to the target. I found > one function "target.c:new_thread", which can be used as > "new_thread(current_inferior(), ptid_t(CRASH_INFERIOR_PID, 0, cpu));" > to replace the "add_thread_silent()" code, except we need to make > "new_thread" to be non-static. > > I prefer to keep the code as current, because I don't see the side > effect of making target as global variable. What do you think?
Then there doesn't seem to be an alternative, then I guess we will have to keep the global variable, seems okay to me. > > > > <...> > > > > > > /* > > > * Collect the irq_desc[] entry along with its associated handler and > > > diff --git a/task.c b/task.c > > > index a405b05..ef79f53 100644 > > > --- a/task.c > > > +++ b/task.c > > > @@ -715,7 +715,8 @@ task_init(void) > > > * crash_target::fetch_registers, so CPU 0's registers are shown as > > > * <unavailable> in gdb mode > > > * */ > > > - gdb_refresh_regcache(0); > > > + for (int i = 0; i < get_cpus_online(); i++) > > > + gdb_refresh_regcache(i); > > > > Does x86 require all regcache(s) to be refreshed at init time ? > > > > Till v4 of my ppc series, I also was doing refresh of all CPUs, but it's > > better to do it when required, and there was a review in v4, patch 5 > > from Lianbo, quoting it here: > > > > > In addition, I haven't tested how far it takes the time in the > > > multi-core environment, is it possible to implement a smaller > > > granularity refresh operation? For example: When we need switch to > > > another cpu, execute the refresh operation for corresponding regcache. > > > > CPU 0's refresh is required due to gdb having already tried to > > initialised registers before we initialised crash_target > > > > This might add a small noticeable delay on vmcores from big systems. > > > > Adds around 4.6 seconds for a ppc64 vmcore with 64 CPUs, in crash init > > time, while reading it on a fast 48 cpu Power8 system. > > This overhead is less for systems with less CPUs, eg. ~0.43 seconds, on > > a vmcore of 8 CPUs x86_64 system. > > Yes, thanks for the testing results. There do have performance > downgrades of refreshing all cpu caches. But this will give correct > results when user type "info threads" after showing "crash>": > > With all cpu cache refreshed: > crash> info threads > Id Target Id Frame > 1 CPU 0 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 2 CPU 1 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 3 CPU 2 <unavailable> in ?? () > 4 CPU 3 <unavailable> in ?? () > 5 CPU 4 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 6 CPU 5 <unavailable> in ?? () > 7 CPU 6 <unavailable> in ?? () > * 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 9 CPU 8 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 10 CPU 9 <unavailable> in ?? () > 11 CPU 10 <unavailable> in ?? () > 12 CPU 11 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 13 CPU 12 <unavailable> in ?? () > 14 CPU 13 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 15 CPU 14 <unavailable> in ?? () > 16 CPU 15 <unavailable> in ?? () > 17 CPU 16 <unavailable> in ?? () > 18 CPU 17 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 19 CPU 18 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 20 CPU 19 <unavailable> in ?? () > 21 CPU 20 <unavailable> in ?? () > 22 CPU 21 <unavailable> in ?? () > 23 CPU 22 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 24 CPU 23 <unavailable> in ?? () > crash> q > > With only cpu0 refreshed: > crash> info threads > Id Target Id Frame > 1 CPU 0 native_safe_halt () at > arch/x86/include/asm/irqflags.h:54 > 2 CPU 1 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 3 CPU 2 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 4 CPU 3 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 5 CPU 4 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 6 CPU 5 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 7 CPU 6 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > * 8 CPU 7 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 9 CPU 8 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 10 CPU 9 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 11 CPU 10 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 12 CPU 11 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 13 CPU 12 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 14 CPU 13 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 15 CPU 14 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 16 CPU 15 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 17 CPU 16 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 18 CPU 17 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 19 CPU 18 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 20 CPU 19 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 21 CPU 20 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 22 CPU 21 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 23 CPU 22 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > 24 CPU 23 blk_mq_rq_timed_out (req=0xffff880fdb246000, > reserved=reserved@entry=false) at block/blk-mq.c:640 > > As we can see, other cpus[1-6, 8-23] just take the reg cache of > cpu[7], which is incorrect. And if users go further like, "thread 20" > and "gdb bt", it will also give incorrect stack traces. > > The cpu cache will only get refreshed once user type "set <pid>", so > the cpu cache will be refreshed by the <pid> task's context. > > I doubt a user will understand all the details and constraints, so I'm > afraid the user will be confused by the faulty output. But I also have > no objection if the performance is the priority. Basically it is a > balance of pays and gains. In addition, since cmd "info" and "thread" > is a command provided by gdb, currently I don't know how to hack > those, so cpu cache can be refreshed when "info threads" or "thread > <num>" have been invoked. > > Do you have any thoughts? I also had faced that issue initially, ie. the other CPUs using up same regcache, if all are not refreshed. While iterating through all threads, gdb switches it's context temporarily, while crash's context remained same, thus causing gdb to get same registers for all threads other than 0. This was solved in patch #3 (synchronise cpu context changes between crash/gdb) in the ppc's 'Improve stack unwind on ppc64' series, by syncing gdb's context with crash. Can this change in thread.c in gdb-10.2.patch in patch #2 be reverted ? That will fix it. ``` ---- gdb-10.2/gdb/thread.c.orig -+++ gdb-10.2/gdb/thread.c -@@ -60,7 +60,7 @@ static thread_info *current_thread_; - - #ifdef CRASH_MERGE - /* Function to set cpu, defined by crash-utility */ --extern "C" void set_cpu (int); -+extern "C" void set_cpu(int cpu, int print_context); - #endif - - /* RAII type used to increase / decrease the refcount of each thread -@@ -1098,6 +1098,9 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads, - uiout->table_body (); - } - -+#ifdef CRASH_MERGE -+ int current_cpu = current_thread ? current_thread->ptid.tid() : 0; -+#endif - for (inferior *inf : all_inferiors ()) - for (thread_info *tp : inf->threads ()) - { -@@ -1113,6 +1116,11 @@ print_thread_info_1 (struct ui_out *uiout, const char *requested_threads, - - ui_out_emit_tuple tuple_emitter (uiout, NULL); - -+ /* Switch to the thread's CPU in crash */ -+#ifdef CRASH_MERGE -+ set_cpu(tp->ptid.tid(), FALSE); -+#endif -+ ``` 'info threads' keeps changing the gdb context temporarily, while it prints each thread's frame and other information. Correspondingly this changes ensured that crash's context also syncs according to change in gdb's context. Then, when user does 'info threads', it will refresh all regcache since it has to go through each thread. In that case that issue of the same frame being printed for all threads should not happen. That should remove the need to refresh all regcache at init time, and leave it to later when user needs it. Refreshing all regcache at init time is the sure shot way that all regcache are correct, but I guess everyone will anyways do a 'info threads', but we can delay the overhead to later. What do you think ? Thanks, Aditya Gupta > > Thanks, > Tao Liu > > > > > > > Thanks, > > Aditya Gupta > > > > > > > > tt->flags |= TASK_INIT_DONE; > > > } > > > @@ -5315,7 +5316,7 @@ set_context(ulong task, ulong pid, uint > > > update_gdb_thread) > > > > > > /* change the selected thread in gdb, according to current > > > context */ > > > if (update_gdb_thread) > > > - return gdb_change_cpu_context(tc->processor); > > > + return gdb_change_thread_context(tc->task); > > > else > > > return TRUE; > > > } else { > > > -- > > > 2.40.1 > > > > > > -- Crash-utility mailing list -- devel@lists.crash-utility.osci.io To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki