On Tue, Jun 3, 2025 at 10:09 AM lijiang <liji...@redhat.com> wrote:
> > > On Tue, May 13, 2025 at 1:26 AM <devel-requ...@lists.crash-utility.osci.io> > wrote: > >> Date: Mon, 12 May 2025 10:22:43 +1200 >> From: Tao Liu <l...@redhat.com> >> Subject: [Crash-utility] Re: [PATCH] Fix incorrect task state during >> exit >> To: Stephen Brennan <stephen.s.bren...@oracle.com> >> Cc: devel@lists.crash-utility.osci.io >> Message-ID: >> <CAO7dBbWHa5x+BdiVF3y_QL-JgQhDLgEqKzaQS+YfBrL= >> jy6...@mail.gmail.com> >> Content-Type: text/plain; charset="UTF-8" >> >> Hi Stephen, >> >> Thanks for your fix and detailed explanation! >> >> On Sat, May 3, 2025 at 8:19 AM Stephen Brennan >> <stephen.s.bren...@oracle.com> wrote: >> > >> > task_state() assumes that exit_state is a unsigned long, when in >> > reality, it has been declared as an int since 97dc32cdb1b53 ("reduce >> > size of task_struct on 64-bit machines"), in Linux 2.6.22. So on 64-bit >> > machines, task_state() reads 8 bytes rather than 4, and gets the wrong >> > exit_state value by including the next field. >> > >> > This has gone unnoticed because directly after exit_state comes >> > exit_code, which is generally zero while the task is alive. When the >> > exit_code is set, exit_state is usually set not long after. Since >> > task_state_string() only checks whether exit_state bits are set, it >> > never notices the presence of the exit code inside of the state. >> > >> > But this leaves open a window during the process exit, when the >> > exit_code has been set (in do_exit()), but the exit_state has not (in >> > exit_notify()). In this case, crash reports a state of "??", but in >> > reality, the task is still running -- it's just running the exit() >> > system call. This race window can be long enough to be observed in core >> > dumps, for example if the mmput() takes a long time. >> > >> > This should be considered a bug. A task state of "??" or "(unknown)" is >> > frequently of concern when debugging, as it could indicate that the >> > state fields had some sort of corruption, and draw the attention of the >> > debugger. To handle it properly, record the size of exit_state, and read >> > it conditionally as a UINT or ULONG, just like the state. This ensures >> > we retain compatibility with kernel before v2.6.22. Whether that is >> > actually desirable is anybody's guess. >> > >> > Reported-by: Jeffery Yoder <jeffery.yo...@oracle.com> >> > Signed-off-by: Stephen Brennan <stephen.s.bren...@oracle.com> >> > --- >> > defs.h | 1 + >> > task.c | 11 +++++++++-- >> > 2 files changed, 10 insertions(+), 2 deletions(-) >> > >> > diff --git a/defs.h b/defs.h >> > index 4cf169c..58362d0 100644 >> > --- a/defs.h >> > +++ b/defs.h >> > @@ -2435,6 +2435,7 @@ struct size_table { /* stash of >> commonly-used sizes */ >> > long prb_desc; >> > long wait_queue_entry; >> > long task_struct_state; >> > + long task_struct_exit_state; >> > long printk_safe_seq_buf_buffer; >> > long sbitmap_word; >> > long sbitmap; >> >> The patch looks good to me, except for the above code. Any newly added >> > > In addition, also need to dump it, please see the "help -o". > > The current value(task_struct_exit_state) is not dumped, please see the handling of "task_struct_state". E.g: $ grep -nr "task_struct_state" * symbols.c:9806: fprintf(fp, " task_struct_state: %ld\n", symbols.c:9807: OFFSET(task_struct_state)); symbols.c:11899: fprintf(fp, " task_struct_state: %ld\n", SIZE(task_struct_state)); Otherwise, for the patch: Ack. > > Thanks > Lianbo > > >> members for size/offset_table should go to the end of the struct, >> rather than the middle of it. See the 2nd item of >> https://github.com/crash-utility/crash/wiki#writing-patches. But this >> is a slight error, I can get it corrected when merging. Other than >> that, Ack for the patch. >> >> Thanks, >> Tao Liu >> >> > diff --git a/task.c b/task.c >> > index 3bafe79..e07b479 100644 >> > --- a/task.c >> > +++ b/task.c >> > @@ -306,6 +306,7 @@ task_init(void) >> > MEMBER_SIZE_INIT(task_struct_state, "task_struct", >> "__state"); >> > } >> > MEMBER_OFFSET_INIT(task_struct_exit_state, "task_struct", >> "exit_state"); >> > + MEMBER_SIZE_INIT(task_struct_exit_state, "task_struct", >> "exit_state"); >> > MEMBER_OFFSET_INIT(task_struct_pid, "task_struct", "pid"); >> > MEMBER_OFFSET_INIT(task_struct_comm, "task_struct", "comm"); >> > MEMBER_OFFSET_INIT(task_struct_next_task, "task_struct", >> "next_task"); >> > @@ -5965,8 +5966,14 @@ task_state(ulong task) >> > state = ULONG(tt->task_struct + >> OFFSET(task_struct_state)); >> > else >> > state = UINT(tt->task_struct + >> OFFSET(task_struct_state)); >> > - exit_state = VALID_MEMBER(task_struct_exit_state) ? >> > - ULONG(tt->task_struct + OFFSET(task_struct_exit_state)) >> : 0; >> > + >> > + if (VALID_MEMBER(task_struct_exit_state) >> > + && SIZE(task_struct_exit_state) == sizeof(ulong)) >> > + exit_state = ULONG(tt->task_struct + >> OFFSET(task_struct_exit_state)); >> > + else if (VALID_MEMBER(task_struct_exit_state)) >> > + exit_state = UINT(tt->task_struct + >> OFFSET(task_struct_exit_state)); >> > + else >> > + exit_state = 0; >> > >> > return (state | exit_state); >> > } >> > -- >> >
-- Crash-utility mailing list -- devel@lists.crash-utility.osci.io To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki