Utrace .report_jctl serialization issue ?
Is a utrace engine with .report_jctl enabled supposed to handle do_notify_parent_cldstop(current, notify) processing for the last stopping task? Or should it muck with task->ptrace to force tracehook_notify_jctl() to return a non-zero value?

I ask because I have a simple multi-threaded process with a utrace engine attached to the process group leader; .report_jctl is enabled. If I SIGTSTP the process, occasionally control is not returned to the shell.

On my 2.6.32-44.2.el6 kernel this happens because utrace_report_jctl() releases task->sighand->siglock (via spin_unlock_irq()), which breaks the serialization on sig->group_stop_count that do_signal_stop() relies on for its do_notify_parent_cldstop(current, notify) processing. Let me explain.

Consider the following code fragment from do_signal_stop() in kernel/signal.c, as released in the RHEL 6.x beta2 2.6.32-44.2.el6 kernel:

    1707         /*
    1708          * If there are no other threads in the group, or if there is
    1709          * a group stop in progress and we are the last to stop, report
    1710          * to the parent.  When ptraced, every thread reports itself.
    1711          */
    1712         notify = sig->group_stop_count == 1 ? CLD_STOPPED : 0;
    1713         notify = tracehook_notify_jctl(notify, CLD_STOPPED);
    1714         /*
    1715          * tracehook_notify_jctl() can drop and reacquire siglock, so
    1716          * we keep ->group_stop_count != 0 before the call. If SIGCONT
    1717          * or SIGKILL comes in between ->group_stop_count == 0.
    1718          */
    1719         if (sig->group_stop_count) {
    1720                 if (!--sig->group_stop_count)
    1721                         sig->flags = SIGNAL_STOP_STOPPED;
    1722                 current->exit_code = sig->group_exit_code;
    1723                 __set_current_state(TASK_STOPPED);
    1724         }
    1725         spin_unlock_irq(&current->sighand->siglock);
    1726
    1727         if (notify) {
    1728                 read_lock(&tasklist_lock);
    1729                 do_notify_parent_cldstop(current, notify);
    1730                 read_unlock(&tasklist_lock);
    1731         }
    1732
    1733         /* Now we don't run again until woken by SIGCONT or SIGKILL */
    1734         do {
    1735                 schedule();
    1736         } while (try_to_freeze());
    1737
    1738         tracehook_finish_jctl();
    1739         current->exit_code = 0;
    1740
    1741         return 1;

For the sake of discussion:

* Let the task group have 2 tasks; therefore initially sig->group_stop_count == 2.
* For both tasks task_ptrace(current) returns zero (see tracehook_notify_jctl() for why this matters).
* Let task1 be the process group leader and let it be the first task to execute do_signal_stop().
* Let task1 have a utrace engine attached with .report_jctl enabled, and let all engine ops be no-ops that simply return UTRACE_RESUME.

Now when I send a SIGTSTP via ctrl-z on the terminal of this multi-threaded process, the following can happen:

* At line 1713 task1 calls tracehook_notify_jctl() with notify == 0 because sig->group_stop_count == 2.
* Because task1 has a utrace engine with .report_jctl, it releases task->sighand->siglock in utrace_report_jctl().
* Now task2 may enter do_signal_stop() with task->sighand->siglock held.
* For task2 sig->group_stop_count == 2 is still true, because task1 is either off executing utrace code or waiting on the task->sighand->siglock held by task2; task1 has not yet executed line 1720.
* For task2, because sig->group_stop_count == 2 and because tracehook_notify_jctl(notify, CLD_STOPPED) returns zero, notify == 0.
* Therefore when task2 reaches line 1727, do_notify_parent_cldstop() is not executed.
* After task2 releases the lock, task1 continues. Unfortunately, task1 computed its notify value while sig->group_stop_count == 2, and tracehook_notify_jctl(notify, CLD_STOPPED) returned zero because notify was initially zero and task_ptrace(current) returned zero.
* Therefore for task1, after tracehook_notify_jctl(), notify == 0.
* Finally, when task1 reaches line 1727, do_notify_parent_cldstop() is not executed.

The result is a ctrl-z that does not return control to the parent, because line 1729 was never executed by either task.

One possible fix is to re-examine sig->group_stop_count after tracehook_notify_jctl() with something like:

    notify = notify ?: sig->group_stop_count == 1 ? CLD_STOPPED : 0;
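
The interleaving described above can be replayed in a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct group, compute_notify(), commit_stop(), parent_notifications() and MODEL_CLD_STOPPED are all hypothetical names invented for the illustration, there are no real locks or threads, and the one racy ordering is simply executed step by step. It counts how many times the parent would be notified, with and without re-examining the stop count after the hook returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's CLD_STOPPED; any non-zero value works. */
enum { MODEL_CLD_STOPPED = 5 };

/* Hypothetical model of the one signal_struct field that matters here. */
struct group {
    int stop_count;
};

/* Models line 1712: only the task that sees a count of 1 reports. */
static int compute_notify(const struct group *g)
{
    return g->stop_count == 1 ? MODEL_CLD_STOPPED : 0;
}

/* Models lines 1719-1724: this task joins the group stop. */
static void commit_stop(struct group *g)
{
    if (g->stop_count)
        --g->stop_count;
}

/*
 * Replays the interleaving for a two-task group and returns how many
 * times the parent would be notified. With recheck set, task1
 * re-examines the stop count after its hook returns, as in the
 * proposed one-line fix.
 */
int parent_notifications(bool recheck)
{
    struct group g = { .stop_count = 2 };
    int sent = 0;

    /* task1: line 1712, then the jctl hook drops the lock. */
    int notify1 = compute_notify(&g);

    /* task2 runs lines 1712-1731 entirely inside that window. */
    int notify2 = compute_notify(&g);
    commit_stop(&g);
    if (notify2)
        sent++;

    /* task1: hook returns with the lock held again. */
    if (recheck && !notify1)
        notify1 = compute_notify(&g);
    commit_stop(&g);
    if (notify1)
        sent++;

    assert(g.stop_count == 0); /* both tasks have stopped either way */
    return sent;
}
```

In this model parent_notifications(false) yields no notification at all, which corresponds to the shell never regaining control, while parent_notifications(true) yields exactly one.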
Re: Utrace .report_jctl serialization issue ?
> Is a utrace engine with .report_jctl enabled supposed to handle
> do_notify_parent_cldstop(current, notify) processing for the last
> stopping task? Or should it muck with task->ptrace to force
> tracehook_notify_jctl() to return a non-zero value?

I can't figure out exactly how to construe this as a question about the utrace API. The documented API is that each thread gets a report_jctl callback, and the @notify argument is zero in all threads but one.

> I ask because I have a simple multi-threaded process with a utrace
> engine attached to the process group leader; .report_jctl is enabled.

Do you mean the thread_group leader of one process in the process group? Or do you mean multiple utrace engines, one per thread, all in the process whose pid==pgid (that's what process group leader means in POSIX)?

> If I SIGTSTP the process, occasionally control is not returned to the
> shell.

That sounds like a kernel bug. There should be nothing your report_jctl callback could do (assuming it doesn't send more signals itself) that affects the normal job control semantics.

The kernels that are appropriate to discuss here are upstream kernels with the current utrace patches applied (i.e. what you get in the current utrace-ptrace git branch), or the most recent Fedora kernels that should include that same code. The code in question might well be the same in RHEL6 kernels, but RHEL issues need to be addressed through the proper RHEL support channels. What we will help you with here is the current utrace development code.

Perhaps Oleg and/or I will get time soon to look into this issue. Chances are better if you provide a test case in the form of a simple utrace module and a test scenario using it.

Thanks,
Roland
Re: Utrace .report_jctl serialization issue ?
Roland McGrath wrote on 11/11/2010 02:53 PM:

> > Is a utrace engine with .report_jctl enabled supposed to handle
> > do_notify_parent_cldstop(current, notify) processing for the last
> > stopping task? Or should it muck with task->ptrace to force
> > tracehook_notify_jctl() to return a non-zero value?
>
> I can't figure out exactly how to construe this as a question about the
> utrace API. The documented API is that each thread gets a report_jctl
> callback, and the @notify argument is zero in all threads but one.

notify originates from do_signal_stop(); it is only non-zero when sig->group_stop_count is 1:

    notify = sig->group_stop_count == 1 ? CLD_STOPPED : 0;

Herein lies the issue: because utrace_report_jctl() releases the lock, sig->group_stop_count may be greater than 1 for all tasks as they execute the above line.

> > I ask because I have a simple multi-threaded process with a utrace
> > engine attached to the process group leader; .report_jctl is enabled.
>
> Do you mean the thread_group leader of one process in the process group?
> Or do you mean multiple utrace engines, one per thread, all in the
> process whose pid==pgid (that's what process group leader means in
> POSIX)?

Sorry, that is an old term from older RHEL, task->group_leader. I mean the first task; it does not really matter.

> > If I SIGTSTP the process, occasionally control is not returned to the
> > shell.
>
> That sounds like a kernel bug. There should be nothing your report_jctl
> callback could do (assuming it doesn't send more signals itself) that
> affects the normal job control semantics.

The bug comes about because utrace_report_jctl() releases the lock, as detailed.

> The kernels that are appropriate to discuss here are upstream kernels
> with the current utrace patches applied (i.e. what you get in the
> current utrace-ptrace git branch), or the most recent Fedora kernels
> that should include that same code. The code in question might well be
> the same in RHEL6 kernels, but RHEL issues need to be addressed through
> the proper RHEL support channels. What we will help you with here is
> the current utrace development code.
>
> Perhaps Oleg and/or I will get time soon to look into this issue.
> Chances are better if you provide a test case in the form of a simple
> utrace module and a test scenario using it.
>
> Thanks,
> Roland

I apologize for posting to the wrong forum. If I get a chance I will post a module and test case, but as you point out I am not on the correct code base for this forum.

Thank you for your time.
Dave