[observability-discuss] pmap -L, agent LWP and ^C

Alexander Kolbasov Fri, 23 Dec 2005 14:56:21 -0800

>>>>> "Mike" == Mike Shapiro <mws at sun.com> writes:



  >> The current implementation of pmap -L and plgrp from the NUMA observability
  >> toolkit uses an agent LWP to issue system calls on behalf of another
  >> process.  If the creator of the agent LWP is killed midway, the target
  >> process remains stopped and requires an explicit kick with prun.
  >> 
  >> Is there some way to automatically restart the target process when the 
  >> originator of agent LWP dies?
  >> 
  >> I tried calling Psetflags(PR_RLC) before Pcreate_agent() but it doesn't
  >> seem to help.
  >> 
  >> - Alexander Kolbasov

  Mike> Examples would include if the agent had grabbed locks, was doing some
  Mike> manipulation of memory mapping protections, performing text
  Mike> instrumentation, or performing a sequence of address space modifications
  Mike> which hadn't yet reached a sane state.  In any of those scenarios, you're
  Mike> kind of screwed.  And in some sense, leaving the agent in place makes the
  Mike> failure mode much more clear: you can use prun(1), but you better be aware
  Mike> of the consequences.

  Mike> The fundamental issue is that the kernel can only have knowledge of and
  Mike> know how to run-on-last-close mechanisms that it fully implements, namely
  Mike> any tracing bits (system calls, signals, faults) and watchpoints.  Anything
  Mike> involving more complex interactions between debugger and victim means that
  Mike> the kernel can't know how to restore the state of the world.  The analogy
  Mike> here is to the more well-known case of a debugger leaving breakpoint
  Mike> instructions behind: if a debugger dies and a process w/ no /proc
  Mike> controller hits a FLT_BPT, we kill the process with SIGTRAP.  So in some
  Mike> sense, the only other logical choice besides leave frozen/prun or kill -9
  Mike> would be for us to do the same thing: if you set RLC and the agent exists,
  Mike> kill the process with SIGTRAP as part of setting it running.

Would it make sense to have a special "I am not doing any dangerous operations"
flag that can be used by programs that know what they are doing with a target
process? If the flag is supplied, the kernel may automatically kill the agent
and continue running the process. If the flag is not supplied, it may kill the
process with SIGTRAP. This would allow p-tools that need to issue system calls
on behalf of other processes to be implemented without a potentially
destructive impact on observed processes.

- Alexander Kolbasov
