Sometimes I have trouble capturing the "correct" state of a multithreaded 
process using gcore. That is, it looks like the target process may have done 
some work between the time the command was issued and the time the core file 
was generated.

Looking at the code, gcore calls ptrace(PT_ATTACH...), which internally issues 
SIGSTOP, and calls waitpid() to wait until the process stops. So, it's quite 
possible that some threads that are not sleeping interruptibly will continue to 
run until the process notices the signal. Signals are only checked when a 
thread that is tagged to handle the signal crosses the user boundary (return 
from syscall, trap). When the thread finally handles SIGSTOP, it needs to stop 
all threads, which is done by setting a flag bit in each thread. This bit is 
checked as each thread crosses the user boundary. So, there will always be some 
state change in the target process from the time SIGSTOP is posted to the time 
all threads are actually stopped. 
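
To make the window concrete, here is a minimal sketch of the same attach 
sequence (PT_ATTACH followed by waitpid()) run against a forked child that 
spins on a counter in shared memory. The helper names attach_and_wait() and 
measure_attach_window() are made up for illustration and are not taken from 
gcore. The child here is single-threaded, so this only shows the delay between 
posting the stop and observing it, not the per-thread flag handling.

```c
#define _DEFAULT_SOURCE	/* glibc: expose MAP_ANON/usleep; harmless elsewhere */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Attach the way described above: PT_ATTACH posts SIGSTOP to the target,
 * then waitpid() blocks until the target reports the stop.
 * Returns 0 on success, -1 on error.
 */
static int
attach_and_wait(pid_t pid)
{
	int status;

	if (ptrace(PT_ATTACH, pid, 0, 0) == -1)
		return (-1);
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return (WIFSTOPPED(status) ? 0 : -1);
}

/*
 * Hypothetical helper: fork a child that increments a shared counter in a
 * tight user-mode loop, attach to it, and return how far the counter moved
 * between issuing the attach and seeing the stop.  Returns -1 on error.
 */
static long
measure_attach_window(void)
{
	volatile unsigned long *counter;
	unsigned long before, after;
	pid_t pid;
	int rc;

	counter = mmap(NULL, sizeof(*counter), PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_ANON, -1, 0);
	if (counter == MAP_FAILED)
		return (-1);
	*counter = 0;

	pid = fork();
	if (pid == -1) {
		munmap((void *)counter, sizeof(*counter));
		return (-1);
	}
	if (pid == 0) {
		for (;;)
			(*counter)++;	/* no syscalls, so no early signal check */
	}

	usleep(10000);			/* give the child time to start spinning */
	before = *counter;		/* state when the "command is issued" */
	rc = attach_and_wait(pid);
	after = *counter;		/* child is stopped, value no longer moves */

	/* Clean up: release the child if we attached, then kill and reap it. */
	if (rc == 0)
		ptrace(PT_DETACH, pid, (void *)1, 0);
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	munmap((void *)counter, sizeof(*counter));

	return (rc == 0 ? (long)(after - before) : -1);
}
```

Calling measure_attach_window() and printing the result gives a rough idea of 
how much work the target gets done before it actually stops. The number varies 
from run to run and with the number of CPUs.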

I was wondering if I could improve this a bit by calling PT_SUSPEND on all 
threads, instead of posting SIGSTOP and waiting for all threads to stop. Once 
the core is generated, all threads would be resumed. As with SIGSTOP, 
individual threads will only notice the suspension as they cross the user 
boundary. But there is no overhead of tagging a thread to handle the signal 
and having that thread do the suspension. The idea is to generate a core file 
that reflects the running state of the process as closely as possible.

Does this sound reasonable?

Thanks,
Sushanth   