Re: linux-next: add utrace tree

2010-01-24 Thread Chris Moller

On 01/24/10 13:01, Frank Ch. Eigler wrote:


... all add up to a mere nudge away from entirely evil.  If so, I
wonder if your sort of grossly bimodal view of ethical virtue is going
to foster the right sorts of change in the linux kernel community.
   


Nothing like a good religious debate to liven up your Sunday...



- FChE

   




Yet another froggy update

2008-11-07 Thread Chris Moller
I've just committed a good, stable, version of froggy, including
documentation complete with respect to current function: cd to
froggy/docs and bang in 'make', then point your browser at
topdir/froggy/docs/html/book1.html.  It's still a bit rough, and I
need to add more details in various places, and maybe an index, but it
gets the point across.

-- 
Chris Moller

  I know that you believe you understand what you think I said, but
  I'm not sure you realize that what you heard is not what I meant.
  -- Robert McCloskey







Froggy status 2008-11-05

2008-11-05 Thread Chris Moller
Last week:

* Added a little more documentation.
* Added a fair amount of code to disable various report_* callbacks
  unless the client has registered a corresponding client-side
  callback--no point in doing the utrace callback if the client
  doesn't need it.
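
A minimal sketch of that idea, assuming a client-side callback table and a set of event bits. All names here are hypothetical stand-ins, not froggy's or utrace's real identifiers: an event is only switched on when the client has supplied a handler for it.

```c
#include <stddef.h>

/* Hypothetical event bits; the real utrace event flags differ. */
enum {
    EV_SYSCALL_ENTRY = 1u << 0,
    EV_SYSCALL_EXIT  = 1u << 1,
    EV_SIGNAL        = 1u << 2,
    EV_CLONE         = 1u << 3
};

/* Hypothetical client-side callback table. */
struct client_callbacks {
    void (*syscall_entry)(long nr, int pid);
    void (*syscall_exit)(long nr, int pid);
    void (*signal)(int signo, int pid);
    void (*clone)(int parent, int child);
};

/* Enable an event only if the client registered a handler for it. */
unsigned int
events_wanted(const struct client_callbacks *cb)
{
    unsigned int mask = 0;

    if (cb->syscall_entry) mask |= EV_SYSCALL_ENTRY;
    if (cb->syscall_exit)  mask |= EV_SYSCALL_EXIT;
    if (cb->signal)        mask |= EV_SIGNAL;
    if (cb->clone)         mask |= EV_CLONE;
    return mask;
}

/* Example handler: does nothing, exists only to be registered. */
static void on_signal(int signo, int pid) { (void)signo; (void)pid; }

/* A client that only cares about signals ends up with EV_SIGNAL alone. */
unsigned int
example_mask(void)
{
    struct client_callbacks cb = { NULL, NULL, on_signal, NULL };
    return events_wanted(&cb);
}
```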

Next week(s):

* Add more documentation.
* Add syscall entry filtering.
* Add signal reporting and filtering.
* Add an attach_tree capability.
* Add more testcase/demos.

-- 
Chris Moller



Re: abt froggy

2008-10-31 Thread Chris Moller


Srikar Dronamraju wrote:
 Hi Chris, 

 Thanks for your quick reply. 

 I have been regularly updating from cvs. So I thought I was always on
 the latest copy.  To confirm I checked out a new copy using 
 cvs -z9 -d :ext:[EMAIL PROTECTED]:/cvs/systemtap co froggy 
 and compared it to the copy I update regularly and they seem to be identical.

   
 The printf()s in froggy.c:resp_listener() are all temporary diagnostics
 and will ultimately be removed.  I don't know which version of froggy
 you have, but if you'll look at the latest under the RESP_SYSCALL_ENTRY
 case you'll see:

 if (froggy->syscall_fcn)
   (*froggy->syscall_fcn)(resp.resp_syscall.nr,
                          (pid_t)resp.resp_syscall.pid,
                          regs);
 
 I see this:
  if (froggy->syscall_fcn)
    (*froggy->syscall_fcn)(resp.resp_syscall.nr,
                           (pid_t)resp.resp_syscall.pid,
                           resp.resp_syscall.args,
                           regs);
  }
   

Yeah, that's the bleeding edge as of yesterday afternoon after I sent
that note--froggy's changing fairly rapidly.

   
 In froggy/module/control.c,
 do_process_attach_task() and do_process_attach() share a lot of code and we
 could probably use a common function for the shared code.
   
   
 do_process_attach_task() has been removed.  Originally, I had it set up
 such that client-thread report_clone was used to automatically attach
 

 I am not sure what you mean by removed. 
 grep -n do_process_attach_task froggy/froggy/module/control.c  shows
 65:do_process_attach_task (struct utrace_attached_engine * engine,

Look two lines above that--there's a #ifdef DO_AUTOATTACH and
DO_AUTOATTACH isn't being set in the build.  (This is my way of doing
hacks I may want to reverse if they don't work right.  Since things are
playing the way I expect them to, for the next snapshot I'll probably
delete all the code currently wrapped up in DO_AUTOATTACH #ifdefs--it
exists in five places in three separate files.)
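
For anyone unfamiliar with the trick, here is a minimal sketch of that kind of reversible hack. Only the DO_AUTOATTACH macro name comes from the code discussed above; the function names and bodies are hypothetical. With the macro undefined, the guarded code is still in the file (and still shows up in grep) but never reaches the compiler.

```c
/* DO_AUTOATTACH is deliberately left undefined in the build. */
#ifdef DO_AUTOATTACH
/* Hypothetical stand-in for the old auto-attach path. */
static int auto_attach_child(int child_pid)
{
    return child_pid > 0;
}
#endif

/* Hypothetical clone hook: returns 1 if the child was auto-attached. */
int on_clone(int child_pid)
{
#ifdef DO_AUTOATTACH
    return auto_attach_child(child_pid);
#else
    (void)child_pid;   /* explicit attach is used instead */
    return 0;
#endif
}
```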


   
 child processes that were ultimately to be execve()ed as code to be
 debugged, but I've replaced that with something more explicitly like
 ptrace(PTRACE_TRACEME,...) and the older code has been #ifdef-ed out.
 (I'll remove it entirely when I'm sure I don't need it anymore...)

 
 Similarly do_shutdown() shares code with do_process_detach and we could
 probably make do_shutdown() call do_process_detach.
   
   
 It sounds like you don't have the latest snapshot--I did a /lot/ of
 cleanup in the shutdown/detach code last week and the fcns you mention split
 the work up differently now.

 

 I am not sure if we are looking at the same version from different viewpoints
 or at different versions of the sources.
   

The absolute latest snapshot was at 1704 East Coast U.S. time
yesterday.   Yes,  there's a small overlap between the code in
do_process_detach and do_shutdown, but that's a micro-optimisation I can
do something about when the major stuff is working.  (Shutdown and
detach are fairly sensitive w.r.t. race conditions and what happens
depending on whether the client kills a child, the child is killed
externally, or the client is killed or dies--I don't tinker casually in
that area.)

   
 Do we plan to use report_jctl in the near future?  utrace_ops has
 report_jctl set to NULL; however, we have defined a report_jctl() function
 which never gets used.  Similarly for unsafe_exec and tracer_task.
   
   
 All the report_* callbacks will ultimately be returned to the client for
 handling as described above--keep in mind froggy is still in its early
 stages and there's a lot more that's yet to be done.
 

 Do you have a document that talks of the features and stuff that you
 intend to provide other than the README which talks of froggy being a
 sandbox for utrace.
 (I haven't read the documents that were updated today. Sorry if it's
 covered in those docs.)
   


I'll be doing a lot of documentation work over the next few days--no
reason I can't include a tentative roadmap.

   
 Can you explain why you would want to attach with
 UTRACE_ATTACH_EXCLUSIVE always? So even  if I want to trace the same
 program twice from froggy then I shouldn't be able to do it?
   
   
 To be honest, I had no good reason for using UTRACE_ATTACH_EXCLUSIVE and
 can certainly experiment with removing it.
 

 Ok.

   

-- 
Chris Moller



Froggy status 2008-10-23

2008-10-23 Thread Chris Moller
Last couple of weeks:

* Added code to catch client forks and attach the child proc--needed
  for attaching to target binaries started by the client via
  fork()/exec()  (kinda like PTRACE_TRACEME). 
* Fixed a killer interaction between client reap and the module
  .release in freeing up resources.

Next week(s):

* Add documentation.
* Add syscall entry filtering.
* Add signal reporting and filtering.
* Add attach_tree capability (i.e., attach to a given proc and
  recursively to all child procs thereof with auto attach/detach as
  those procs fork and die.)  
* More testcase/demo tinkering.
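
The recursive part of the planned attach_tree can be sketched in user space, assuming an in-memory model of a process and its children. The struct and function below are hypothetical illustrations, not froggy code, and they ignore the auto attach/detach on later forks and deaths.

```c
#include <stddef.h>

#define MAX_KIDS 8

/* Hypothetical in-memory model of a process and its children. */
struct proc {
    int pid;
    int attached;
    size_t nkids;
    struct proc *kids[MAX_KIDS];
};

/* Attach to p and, recursively, to every descendant.
 * Returns the number of procs newly attached by this call. */
int attach_tree(struct proc *p)
{
    int n = 0;
    size_t i;

    if (p == NULL)
        return 0;
    if (!p->attached) {
        p->attached = 1;
        n = 1;
    }
    for (i = 0; i < p->nkids; i++)
        n += attach_tree(p->kids[i]);
    return n;
}
```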

-- 
Chris Moller



Re: Froggy status 2008-10-08

2008-10-13 Thread Chris Moller


Petr Tesarik wrote:


 All sounds very nice, Chris. But I'm having trouble finding the sources.
 Could you give me a pointer, please?
   

Sorry--you can get the source with cvs -d
:ext:sources.redhat.com:/cvs/systemtap co froggy

cm


 Petr Tesarik

   

-- 
Chris Moller



froggy status 2008-09-09

2008-09-09 Thread Chris Moller
Last week:

More or less completed migrating to Roland's new utrace API--in test
mode now.  I've made no effort to keep froggy compatible with both
the new API and the old one--the new version won't compile or run
with other than a rawhide kernel.  Since I've no idea when
F10--presumably what's rawhide now--or the more-or-less equivalent
RHEL will be available, I've been reluctant to commit changes to the
existing froggy, thereby breaking it for F9/RHEL5 kernels.   I guess
I could just tag the old-API version, but I think it might be easier
all around if I just created a new froggy2--especially if it becomes
necessary to backport functional changes made for the new API back
to the original API version.

Next week:

   1. Refactor relevant parts of froggy-test.c--those parts dealing
      directly with the module interface--to a callable interface
      lib.  This will hide some of the messy and/or dangerous bits,
      e.g., getting out of sync in the transport stream.  I think it
      was Frank who once expressed a concern for maintaining access
      to the file descriptor for use in select()/poll(), and I'll
      make sure that happens, but I do want to have a clean i/f.
   2. Hack together an strace-like demo that exercises
      report_syscall callbacks.
   3. Hack together a graphical pstree-like utility that exercises
      life-cycle callbacks (report_clone, report_exec, report_death,
      report_reap).   This, BTW, brings up an exposure: for such a
      utility or, for that matter, an original-frysk whole-system
      monitor thingy, user-space processes probably have to be able
      to attach system processes, but it would be a remarkably bad
      idea for them to be able to control those processes.  My
      thought was to add some code that will limit user-space (or
      perhaps all) clients to only passively attaching system
      processes, letting them get report_* callbacks, but inhibiting
      any action--quiescing, tinkering with signals, etc.--that
      would actually affect the processes.
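
The passive-attach restriction in item 3 might boil down to a check like the one below. This is only a sketch under assumptions: the action names and the notion of a "system process" flag are hypothetical, and nothing like this exists in froggy yet.

```c
#include <stdbool.h>

/* Hypothetical set of things a client can ask for. */
enum client_action {
    ACT_REPORT,        /* receive report_* callbacks only */
    ACT_QUIESCE,       /* stop the target */
    ACT_SIGNAL_TINKER, /* intercept or alter signals */
    ACT_WRITE_REGS     /* modify target state */
};

/* System processes may be watched but never controlled. */
bool action_allowed(enum client_action act, bool target_is_system)
{
    if (!target_is_system)
        return true;
    return act == ACT_REPORT;
}
```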

Comments welcome.

-- 
Chris Moller



Re: tasklist_lock question

2008-09-03 Thread Chris Moller


Roland McGrath wrote:
 y'all might get multiple copies of this--i think the stuttgart smtp
 server i've been using might not be working right...
 

 I got two copies, but one of them stole an R from you!

   
 looks like stuff that used to be wrapped up in
 rcu_read_lock()/rcu_read_unlock() is now wrapped up in
 read_lock(tasklist_lock)/read_unlock(tasklist_lock) but tasklist_lock
 is showing up as undefined in module build stage 2 and Unknown symbol
 in module in insmod.  am i balling something up?  or can i go on using
 rcu_read_(un)?lock?
 

 In the new utrace API there is nothing that requires you to use either RCU
 or tasklist_lock.  If you are using RCU for your own purposes, that's up to
 you.  The tasklist_lock has not been exported for modules to use in a long
 time.  Please be specific in your questions.  What are you doing that you
 think needs some locking you aren't doing?
   

A little fnc to get the task associated with a given pid:

static struct task_struct *
get_task (long utraced_pid)
{
  struct task_struct * task;

  read_lock (&tasklist_lock);
  task = find_task_by_vpid (utraced_pid);
  if (task) get_task_struct (task);
  read_unlock (&tasklist_lock);

  return task;
}


I know you've got utrace_attach_pid(), but there are other circumstances
I've used in the past, like accessing user mem via get_user_pages(),
where I needed access to the task struct.  Even if I don't need task
struct any more for utrace, I'm kinda trying to plan ahead a bit.

Or are pid_task() and get_pid_task() the new/modern way of doing that
w/o needing your own locking?

cm


 Thanks,
 Roland
   

-- 
Chris Moller



syscall -1?

2008-08-14 Thread Chris Moller
Weirdness I don't get, but maybe someone else will.

I have few testcases for exercising froggy, one of which, to exercise
report_signal, is:

#define _GNU_SOURCE
#define __USE_GNU
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>

static void
signal_action (int signo)
{
  fprintf (stderr, "signal %d received by sigtest\n", signo);
}

int
main (int ac, char * av[])
{
  signal (SIGUSR1, signal_action);
  signal (SIGUSR2, signal_action);

  while (1) {
    pause ();
    fprintf (stderr, "looping\n");
  }

  exit (0);
}

Here's the curious thing:  when I attach to this and fire off kill -s
SIGUSR1 pid to it, every pass through the loop gets:

signal 10 received by sigtest
[ 15264] got syscall exit  29, pause
[ 15264] got syscall entry 4, write
[ 15264] got syscall exit  4, write
[ 15264] got syscall entry 119, sigreturn
looping
got unknown syscall exit -1
[ 15264] got syscall entry 4, write
[ 15264] got syscall exit  4, write
[ 15264] got syscall entry 29, pause

See that unknown syscall exit -1?  It's from the froggy-test code
that decodes syscalls, the same hunk of code that's dumping the "[
pid] got syscall entry..." stuff.  Anyone have a clue what that's all
about? 

It's real--if I stick

if (-1 == regs->orig_ax) printk (KERN_ALERT "got a syscall -1\n");

into the report_syscall code, the msg gets dumped to the console.
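
For reference, the decoding side that produces those log lines can be sketched roughly as follows. This is not the actual froggy-test code: the table and function names are hypothetical, and the table only holds the four i386 syscall numbers that appear in this thread. Anything not in the table, including -1, falls through to the "unknown" branch.

```c
#include <stdio.h>
#include <stddef.h>

struct sc_name { long nr; const char *name; };

/* Only the i386 syscalls mentioned in this thread. */
static const struct sc_name sc_table[] = {
    { 3, "read" }, { 4, "write" }, { 29, "pause" }, { 119, "sigreturn" }
};

/* Returns the syscall name, or NULL if nr is not in the table. */
const char *syscall_name(long nr)
{
    size_t i;

    for (i = 0; i < sizeof sc_table / sizeof sc_table[0]; i++)
        if (sc_table[i].nr == nr)
            return sc_table[i].name;
    return NULL;
}

/* Formats one report line; what is "entry" or "exit". */
int format_report(char *buf, size_t len, int pid, const char *what, long nr)
{
    const char *name = syscall_name(nr);

    if (name == NULL)
        return snprintf(buf, len, "got unknown syscall %s %ld", what, nr);
    return snprintf(buf, len, "[%6d] got syscall %s %ld, %s",
                    pid, what, nr, name);
}
```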

-- 
Chris Moller



Re: syscall -1?

2008-08-14 Thread Chris Moller


Roland McGrath wrote:
 Please always mention the kernel version and machine you are using when
 asking any question like this.  The output of uname -a is a good, 

Linux hotbox.mollernet.net 2.6.25.6 #8 SMP Tue Jun 24 10:11:08 EDT 2008
i686 i686 i386 GNU/Linux



 For real code, you should be using syscall_get_nr (<asm/syscall.h>),

Hmm-- no such thing, apparently, as syscall_get_nr in 2.6.25.6.  Is it
supposed to trickle out into the world sometime soon?


Thanks,
Chris



 Thanks,
 Roland
   

-- 
Chris Moller



Froggy update

2008-08-13 Thread Chris Moller
Just for the curious, here's an update on the current capabilities of 
the froggy module, expressed as a description of the command line 
options that exercise them.  (This is cut'n'pasted from the latest 
froggy README.)


   Just to put it all in one place, current usage of froggy-test is as
   follows:

   ./froggy-test [control options] (process specification) ...

   where the optional control options are:

   -q | --quiesce
   Quiesce the specified process after attaching it.
   Running processes attached via the -p option (see
   below) are quiesced according to utrace rules.
   Processes started via the -E option (e.g., binaries
   executed by fork/execv) are quiesced when the
   process is ready to run but has not yet started to
   do so (in report_exec).

   -e arg | --syscall-entry arg
   Enable reporting of entry into syscalls specified by
   arg.  arg can be a syscall number (e.g., 3 to
   specify entry into the i386 read syscall) or it can be
   the syscall name (as shown in
   /usr/include/bits/syscall.h with the SYS_ prefix
   removed, e.g.,  'read').  It can also be specified as
   the string "all", in which case reporting of all
   syscall entries will be enabled.

   -x arg | --syscall-exit arg
   Just like --syscall-entry but controls the reporting
   of syscall exits.

   -s arg | --signal arg
   Similar to -e and -x, using the same argument
   conventions, but for signals.  (NOT YET FULLY
   IMPLEMENTED.)

   -w | --wait
   Causes the client to wait until the specified process
   is attached.  (Hasn't been tested in a while, might
   not work, and may be removed.)

   The process specification is one of the following:

   -p arg | --pid arg
   Attach to the running process the pid of which is
   arg.

   -E arg | --exec arg
   For this option, arg is a string specifying a
   program to be executed and attached, along with any
   necessary arguments.  E.g.:

   -E "./reader -i 50"

   starts the test program reader, passing "-i 50"
   to it as arguments, and attaching the resultant
   process.

   Any number of [control option] (process specification) sequences can
   be specified.  Once a process has been specified, any previously
   supplied control options are applied to that process and the options
   are cleared.  This permits usages such as:

   ./froggy-test -e read -p 1234 -x write -p 5678

   which would enable reporting of read syscalls from process 1234 and
   write syscalls from process 5678.  -p and -E specifications can be
   used concurrently and will be dealt with in the order specified,
   though there is no guarantee I'm aware of that they will be attached by
   utrace in any particular order.  (A -p attach is much simpler than
   the fork()/exec() process needed by -E, which also requires
   synchronisation with the client in various places.)
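
The "apply, then clear" rule for control options can be sketched as below. This is an illustration only, not the froggy-test parser: it handles just the short forms -q, -e, -x and -p, and the struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_TARGETS 8

/* Hypothetical per-process record built from the command line. */
struct target {
    long pid;
    int quiesce;
    const char *entry;   /* -e argument, or NULL */
    const char *exit_;   /* -x argument, or NULL */
};

/* Returns the number of targets parsed, or -1 on a malformed line. */
int parse_args(int argc, char *argv[], struct target out[MAX_TARGETS])
{
    struct target pending = { 0, 0, NULL, NULL };
    int n = 0, i;

    for (i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-q") == 0) {
            pending.quiesce = 1;
            continue;
        }
        /* every other option handled here takes an argument */
        if (i + 1 >= argc)
            return -1;
        if (strcmp(argv[i], "-e") == 0) {
            pending.entry = argv[++i];
        } else if (strcmp(argv[i], "-x") == 0) {
            pending.exit_ = argv[++i];
        } else if (strcmp(argv[i], "-p") == 0) {
            if (n == MAX_TARGETS)
                return -1;
            pending.pid = strtol(argv[++i], NULL, 10);
            out[n++] = pending;                            /* apply ... */
            pending = (struct target){ 0, 0, NULL, NULL }; /* ... then clear */
        } else {
            return -1;
        }
    }
    return n;
}
```

Fed the example command line above, this yields two targets: 1234 with only the entry filter set and 5678 with only the exit filter set.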

--
Chris Moller



Re: Froggy status

2008-07-29 Thread Chris Moller



Chris Moller wrote:


BTW, until further notice--tomorrow morning, probably--don't use a 
froggy any later than 2006-07-28 11:20.  Somehow a kernel-crasher 
snuck in which I'll debug in the morning.




Fixed and committed.


--
Chris Moller



Froggy update.

2008-07-29 Thread Chris Moller

From the README:

   Added module code that utrace_attach()es the client process, e.g., the
   debugger, itself.  What this does is allow fork()s of the debugger to be
   caught by report_clone(), thereby allowing the resultant child process
   to be utrace_attach()ed, which has the effect of attaching a binary.
   I.e.,

   ./froggy-test -E ./reader

   fork()s, attaches the child, and then execvp()s ./reader, thereby
   allowing ./reader to be traced, debugged, or whatever.

   It's still primitive--at the moment, the process is attached running
   (i.e., it's not quiesced), all syscall entries and exits are reported,
   and that's it.  More to come.
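
The user-space half of that sequence looks roughly like the sketch below. It is not the froggy-test source: run_attached is a hypothetical name, and the attach step is only marked by a comment, since in froggy it happens inside the module when report_clone() fires for the fork.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* fork(), exec argv in the child, wait for it in the parent.
 * Returns the child's exit status, or -1 on error or abnormal exit. */
int run_attached(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* In froggy the module has already seen this child via
         * report_clone() and attached it; nothing to do here. */
        execvp(argv[0], argv);
        _exit(127);               /* only reached if execvp() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```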

--
Chris Moller



Re: Froggy status

2008-07-28 Thread Chris Moller



David Smith wrote:


I took a quick look at froggy.c (the kernel module).  I'm not sure how
finished 


A long, long, way there from--it's still a hacker's sandbox.


you consider it, but in general I'm not sure it is paranoid
enough.  For instance, what happens if there is an error in a client and
it aborts?  What happens if someone does a rmmod on the module while
clients are still running?
  


Both circumstances on the to-do list.


--
Chris Moller



Re: Froggy status

2008-07-28 Thread Chris Moller



David Smith wrote:


I took a quick look at froggy.c (the kernel module).  I'm not sure how
finished you consider it, but in general I'm not sure it is paranoid
enough.  For instance, what happens if there is an error in a client and
it aborts? 


Just looked into that:  when a task crashes the kernel apparently 
closes any open file descriptors, thereby invoking froggy_release() 
which gives froggy a chance to free any resources, etc.



 What happens if someone does a rmmod on the module while
clients are still running?
  


The rmmod fails as long as there are any open fds on the froggy 
pseudo-file /sys/kernel/debug/froggy.


BTW, until further notice--tomorrow morning, probably--don't use a 
froggy any later than 2006-07-28 11:20.  Somehow a kernel-crasher 
snuck in which I'll debug in the morning.



--
Chris Moller



Re: Kernel oops with froggy test

2008-07-13 Thread Chris Moller
Try it now (from the sources.r.c repo):  I think what was happening was 
that the response thread was terminating before the report_quiesce hit, 
thereby invalidating the pointer to the buffer.  I rewrote all that code.


Phil Muldoon wrote:

760 is a bash process. I'm heading out the door, be back later tonight!

[EMAIL PROTECTED] froggy]$ ./froggy-test -t760
sending syscal vecs
sending signal vec
attaching tasks
attach status = 
resp buf = quiescing





--
Chris Moller



Re: ntrace: interface ideas

2008-07-10 Thread Chris Moller



Phil Muldoon wrote:

Chris Moller wrote:


No guarantees, of course--the next-generation stuff may do things 
differently--but even as I type this, I'm hacking together a 
boilerplate framework that does exactly as described above.
Missed this in the last question, so apologies for the multiple 
emails. When can I get the ntrace/boilerplate code? Perhaps studying 
the code would render answers as well as just asking questions on the 
list ;)


I'll see if I can put something in some repo sometime in the next couple 
of days--with the proviso that I don't guarantee it won't slag your 
machine.  It won't have much real function--perhaps nothing that ever 
sees the real light of day and using mechanisms that may change 
radically--but my still-a-bit-buggy sandbox code right now attaches to 
any number of command-line specified PIDs and then, asynchronously, 
reports the advent of command-line specified syscall entries and exits 
to stdout, and that's it.  No interactivity.




Regards

Phil


--
Chris Moller



Re: ntrace: interface ideas

2008-07-10 Thread Chris Moller



Phil Muldoon wrote:


Thanks for the description Chris. From an ntrace client implementation 
point of view, non-ordered replies to asynchronous requests present 
an ordering conundrum in the client - especially as it scales to many 
many inferiors. 


I tend to think of the asynch stuff not as replies but as responses, 
of which there are three types.  The most common (this is from 
utracer--things may change) is the response to an operational change 
like a quiesce command.  The command itself is non-blocking and 
returns immediately to the caller, and the kernel action is 
immediate--setting the QUIESCE bit in the utrace engine--but the task 
won't actually quiesce until it's in, to use Roland's description, a 
safe state.  At that point the engine issues a report_quiesce() which 
results in an asynchronous response, on the response thread, to the 
debugger.


A second kind of response is less direct.  For example, the debugger can 
issue a command to report the arrival of specified signals or entry into 
(or exit from, or both) specified syscalls.  Again, the response to the 
command is non-blocking, but there obviously won't be a response-thread 
response unless and until a qualifying signal arrives or a 
qualifying syscall is actually entered or exited.


Finally, there are truly immediate responses.  For example, a request 
for the contents of one or more registers (utracer allowed the app to 
request a range of regs, e.g., all the GPRs, down to and including a 
single specified reg) will be responded to immediately, on the control 
thread, with that information.  (Assuming the task is quiesced; an error 
condition will be raised otherwise.)


(An additional case in utracer--I've no idea if we'll keep it--was a 
request for memory contents.  (The equivalent of reading 
/proc/$pid/mem but without the hassle of opening the file, lseeking, 
etc.)  The operation looked synchronous to the application and returned 
immediately just like register reads did, but in fact could block 
pending paging.)


Some operations, like setting registers, were completely synchronous and 
never resulted in any kind of asynch response.


All response-thread responses are structured and contain an enum 
specifying the nature of the response--here's a syscall entry of the 
kind you asked for--as well as the PID of the task that generated the 
response.  If it would be useful, I suppose some sort of sequence number 
could be returned as well, or maybe an application-supplied transaction 
identification number could be returned.  That might help in dealing 
with out-of-order problems.
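
As a rough picture of such a structured response, assuming the sequence-number idea were adopted: the sketch below is hypothetical throughout (type names, field names and the matching helper are not taken from utracer.h) and is meant only to show how a type enum, a PID and a sequence number would let a client pair a response with its request.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical response kinds; utracer's real IF_RESP_* list differs. */
enum resp_type {
    RESP_QUIESCE,
    RESP_SYSCALL_ENTRY,
    RESP_SYSCALL_EXIT,
    RESP_SIGNAL
};

/* Hypothetical response record read from the response thread. */
struct response {
    enum resp_type type;   /* nature of the response */
    pid_t pid;             /* task that generated it */
    uint32_t seq;          /* proposed sequence / transaction number */
    union {
        struct { long nr; } syscall;
        struct { int signo; } signal;
    } u;
};

/* Nonzero if r answers the request tagged seq for task pid. */
int response_matches(const struct response *r, pid_t pid, uint32_t seq)
{
    return r->pid == pid && r->seq == seq;
}
```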


HTH,
cm

But without knowing how events are structured and how they can be paired 
with the original requester in the client, not sure how much of an 
issue it will be. Frysk uses the observer design pattern for a lot of 
these type requests, to solve this problem. Do you think this will be 
a similar pattern? Anyway not a big deal, it is a known problem with 
known solutions. Tom mentioned that the X protocol has something 
similar in this regard. To look a little closer at your synchronous 
event description, I'm struggling to classify what type of events 
would be synchronous. I don't want to get tied up here too much, as 
both you and Roland mention this is all fluid. But for the sake of 
understanding and documentation, can you give me a use-case where a 
synchronous request would be needed/used, and also an asynchronous one?


Regards

Phil



--
Chris Moller



Re: ntrace: interface ideas

2008-07-10 Thread Chris Moller



Phil Muldoon wrote:

Chris Moller wrote:



Phil Muldoon wrote:


Thanks for the description Chris. From an ntrace client 
implementation point of view, non-ordered replies to asynchronous 
requests present an ordering conundrum in the client - especially as 
it scales to many many inferiors. 


I tend to think of the asynch stuff not as replies but as 
responses, of which there are three types.  The most common (this 
is from utracer--things may change) is the response to an operational 
change like a quiesce command.  The command itself is non-blocking 
and returns immediately to the caller, and the kernel action is 
immediate--setting the QUIESCE bit in the utrace engine--but the task 
won't actually quiesce until it's in, to use Roland's description, a 
safe state.  At that point the engine issues a report_quiesce() 
which results in an asynchronous response, on the response thread, to 
the debugger.


Sounds sensible. Perhaps from the old utracer, or this new interface, 
do you have a running api? 


The original utracer API header is in  utracer/utracer/include/utracer.h 
(from the CVS repo cvs -d :ext:cvs.ges.redhat.com:/cvs/cvsfiles co 
utracer) and the docs are in  utracer/docs.  (make pdf, make html, 
and make txt make more readable PDF, unified HTML, and text files 
respectively.  Just make makes man pages.)


It doesn't have to be finished, or even set in stone. 


It's mostly complete--some stuff  I never got around to documenting, but 
you'll get the idea.  The one thing you'll note, however, is there's not 
much in the way of reference to actual protocol to and from the module 
in the docs.  That's because I spent the last six months or so of active 
work on utracer hiding all that messy stuff from Java, which is clueless 
about C stuff like unions of structs, which is what the module protocol 
was based on.  You can get that from the utracer.h headers and 
utracer/utracer/utracer.c where the actual ioctl()s are.  (Turns out I 
wasn't following the conventions on the use of ioctl in utracer--my new 
boilerplate is doing it right.)


Just an idea.  I'm guessing your answer will be, look in the code when 
I get the repository up ;) But it would be interesting to see this in 
the making. The Quiesce command above for example. I'm pondering what 
happens if a task dies before it reached a safe state to be quiesced.


That's more a utrace internals question that Roland can handle better 
than I can, but the off-the-cuff answer would be that the engine would 
generate something like a report_signal(), report_exit(), or 
report_death() , depending on the manner of death, instead of the 
report_quiesce() and that would result in an asynch response to the app.


Over on Frysk we are taking a good, long look at our goals and one of 
those is interfacing to the new user-land api of utrace. Is there a 
moral equivalent available yet to the ptrace like control enums that 
one used to use with ptrace?


Not for the real thing, but near the top of utracer.h is a list of 
IF_CMD_* enums utracer supported.  Further down is a list of IF_RESP_* 
enums detailing the possible response types.






A second kind of response is less direct.  For example, the debugger 
can issue a command to report the arrival of specified signals or 
entry into (or exit from, or both) specified syscalls.  Again, the 
response to the command is non-blocking, but there obviously won't be 
a response-thread response unless and until a qualifying signal 
arrives or a qualifying syscall is actually entered or exited.


Finally, there are truly immediate responses.  For example, a request 
for the contents of one or more registers (utracer allowed the app to 
request a range of regs, e.g., all the GPRs, down to and including a 
single specified reg) will be responded to immediately, on the 
control thread, with that information.  (Assuming the task is 
quiesced; an error condition will be raised otherwise.)


I'm not sure I hold truck with responses appearing on a control thread 
and the response thread. It would require a bit of knowledge beyond 
the api (I'm calling the interface an api, for want of a better term). 
What's preventing scheduling a reply to an immediate response on the 
response thread over the control thread?


Immediate responses are generally done by way of a pointer passed 
through the ioctl() and into which the module copies the appropriate 
data.  Register reads, for example, do this.  The actual response from 
the ioctl is just a success/fail return value that sets errno.  Nothing 
says, though, that even for immediately available data like register 
values there can't be both synch and asynch versions of the request.
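
The calling convention for such an immediate request can be modelled in user space as below. This is a sketch under assumptions, not the utracer ioctl: the task model, the function name and the choice of EBUSY and EINVAL are all hypothetical. What it shows is the shape described above: the caller passes a pointer, the data is copied through it, and the return value only signals success or failure with errno set.

```c
#include <errno.h>
#include <stddef.h>

#define NREGS 4

/* Hypothetical stand-in for a traced task's state. */
struct task_model {
    int quiesced;
    long regs[NREGS];
};

/* Copy count registers starting at first into out.
 * Returns 0 on success, -1 with errno set on failure. */
int read_regs(const struct task_model *t, size_t first, size_t count,
              long *out)
{
    size_t i;

    if (!t->quiesced) {            /* task must be quiesced first */
        errno = EBUSY;
        return -1;
    }
    if (first >= NREGS || count > NREGS - first) {
        errno = EINVAL;            /* range falls outside the register set */
        return -1;
    }
    for (i = 0; i < count; i++)
        out[i] = t->regs[first + i];
    return 0;
}
```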






All response-thread responses are structured and contain an enum 
specifying the nature of the response--here's a syscall entry of the 
kind you asked for--as well as the PID of the task that generated 
the response.  If it would be useful, I suppose some sort of sequence 
number

Re: ntrace: interface ideas

2008-07-10 Thread Chris Moller
Forgot to mention, BTW, there's a mini-app in utracer/*.[ch] that shows 
the utracer API in use in a fake GTK-based debugger.


Phil Muldoon wrote:

Chris Moller wrote:



Phil Muldoon wrote:


Thanks for the description Chris. From an ntrace client 
implementation point of view, non-ordered replies to asynchronous 
requests present an ordering conundrum in the client - especially as 
it scales to many many inferiors. 


--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: ntrace: interface ideas

2008-07-09 Thread Chris Moller



Phil Muldoon wrote:

Roland McGrath wrote:

I think of the interface as asynchronous at base.  There may at some
point be some synchronous calls to optimize the round trips.  But we
know that by its nature an interface for handling many threads at once
has to be asynchronous, because an event notification can always be
occurring while you are doing an operation on another thread.  So what
keeping it simple means for the first version is that we only worry
about the asynchronous model.  Think of it like a network protocol
between the application and the tracing server inside the kernel.
(Don't get hung up on that metaphor if it sounds like a complication to
you.)
  


Hi. I'm absorbing the email you wrote, but I keep coming back to this 
issue:


When mixing asynchronous and synchronous requests to utrace in one 
event loop, how are these events handled? If the client sends five 
asynchronous requests (and has a thread waiting on replies), and then 
sends one synchronous request, does the synchronous request always 
return before the previous five? Is it basically a blocking call? And 
are the events returned asynchronously in a guaranteed order?


I'm not sure what Roland has in mind, but the way utracer worked was 
that every attached task had associated with it two debugger threads, 
what I called the control and response threads.  All requests, 
either synchronous or asynchronous, originated in the control thread and 
were non-blocking.  Synchronous reqs were, by definition, trivial in 
the sense that they could only request information immediately available 
to the kernel.


Since the threads were independent, in answer to the questions, there 
was no guarantee either that the synchronous request would return, on 
the control thread, before any of the prior asynch reqs generated 
responses on the response thread, or that the asynch reqs would generate 
results in any particular order.


At least in utracer, asynch reqs were typically task-control commands 
like a quiesce request, which could generate an asynch response if the 
appropriate utrace report_* had been enabled to do that.  Asynch ops 
could also enable syscall entries or exits or signal reports, resulting 
in responses at arbitrary times in no deterministic order.  I also had a 
built-in memory-access request (like /proc/$pid/mem) that could generate 
an asynch response, depending, e.g., on paging.


Under the covers, all requests, either for synchronous or asynchronous 
results, are made via ioctl()s which can pass a pointer to a user-space 
struct that contains not only a precise specification of the command or 
data request but also, optionally, pointers to user-space memory into 
which the module can store synchronous results.  Asynchronous results 
are implemented as blocking read()s.
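
A rough sketch of the asynchronous half, assuming a fixed-size response record that carries a kind tag and the PID of the reporting task (the thread describes response-thread responses that way, but gives no layout). Everything named demo_* is hypothetical. demo_post() stands in for the module queueing a response, and demo_wait() is what a response thread would loop on; any readable fd works for trying it out, for example the read end of a pipe().

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical response record: what happened, and to which task. */
enum demo_resp_kind {
    DEMO_RESP_QUIESCED,
    DEMO_RESP_SYSCALL_ENTRY,
    DEMO_RESP_SYSCALL_EXIT,
    DEMO_RESP_SIGNAL
};

struct demo_resp {
    enum demo_resp_kind kind;
    pid_t               pid;
};

/* Stand-in for the module queueing one response on the response fd. */
static int demo_post(int wfd, enum demo_resp_kind kind, pid_t pid)
{
    struct demo_resp r = { kind, pid };
    return write(wfd, &r, sizeof r) == (ssize_t)sizeof r ? 0 : -1;
}

/* Block in read() until one whole record has arrived.
 * Returns 0 on success, -1 on error or end-of-file. */
static int demo_wait(int rfd, struct demo_resp *out)
{
    char  *p = (char *)out;
    size_t got = 0;

    while (got < sizeof *out) {
        ssize_t n = read(rfd, p + got, sizeof *out - got);
        if (n < 0 && errno == EINTR)
            continue;             /* interrupted by a signal: retry */
        if (n <= 0)
            return -1;            /* real error, or writer went away */
        got += (size_t)n;
    }
    return 0;
}
```

A pipe delivers records in the order they were written, which is a property of the stand-in only; as noted above, the order in which requests produce responses is not guaranteed.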


No guarantees, of course--the next-generation stuff may do things 
differently--but even as I type this, I'm hacking together a boilerplate 
framework that does exactly as described above.



Feedback welcome.

cm


Regards

Phil



--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: The demise of utracer.

2008-07-01 Thread Chris Moller



Roland McGrath wrote:

Sorry to be blunt, Chris.  But I think you're headed down a useless rat hole.

I agree that the usage of /proc you've described is a bad interface.
I am slightly mystified as to how that came to be what you settled on.
  


At the time Andrew Cagney and I decided to go that route, it was because 
frysk had no means of effecting kernel changes other than by use of a 
loadable module and the use of /proc entries was a common, easily 
accessible, means of communicating with modules.



I don't think it's worthwhile to hash over that.  Let's move on.
  


I'd love to, but it would be nice to have a clue as to which 
direction.   All I'm getting from The World is a list of stuff I 
shouldn't be doing, and that helps not at all with regard to what I 
/should/ be doing.



Please forget ptrace.  Please forget about adding syscalls.  At this
point I think I just need you to give me the benefit of the doubt when
I tell you I am sure this is not the way, and even dabbling sidetracks
us from really useful progress.  Let's move on.
  


Again, ptrace hacks and new syscall hacks are things I can actually do 
and in the absence of any other clue concerning what I should be doing 
it's what I've been doing--I know it's been a near-total waste of my 
time, but it kinda beats staring at a blank screen all day.  I'll be 
glad to give you the benefit of the doubt--you've been kernel hacking 
longer than I have--but if you have cool notions about which way to go, 
you kinda need to let the rest of us know what they are.  (And, reading 
ahead, yeah, I know, that's what the rest of this note is...)


I'll commence to hackin'.

cm

--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: The demise of utracer.

2008-07-01 Thread Chris Moller



K.Prasad wrote:

On Tue, Jul 01, 2008 at 03:28:04PM -0400, Chris Moller wrote:
  

K.Prasad wrote:


Hi All,
Sorry if I have missed out something I need to know before I
respond to this email. But the trace infrastructure (lib/trace.c)
already provides such a facility, with more features such as a per-cpu
buffer for faster transmission (it is a wrapper over relay which
sits on top of debugfs).

The interfaces provided by trace are much simpler/functional than
setting up a debugfs interface manually (see
samples/trace/fork_trace.c) and the directory structure and control
files setup by trace are already familiar to the systemtap code.

Thanks,
K.Prasad
P.S.: trace is currently in -mm tree.
  
  
Thought it might be interesting to check this out--the patched  
2.6.26-rc5 kernel built fine but panicked when I tried to boot it.  So  


You might want to try out 2.6.26-rc5-mm1 directly. It boots
fine on my T60p with default configs.
  


Hmmm.  Okay, I'll try that, thx.  (I was using the -mm3 patch and an 
oldconfig from Fedora i686.)



Thanks,
K.Prasad

  


--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: The demise of utracer.

2008-06-30 Thread Chris Moller
.  But Gibbon did it, though occasionally if his 
principal footnote was in Latin, its subordinate footnote would be in 
Greek.  This scales poorly, limiting the depth of subordinate footnotes 
to the number of available obscure languages.  Accordingly, I'm 
refraining from doing that.  (And, as well, I've no desire to learn 
Aramaic.)




Roland McGrath wrote:

I have numerous more specific thoughts about interface features.  Some
time back, I almost got started towards my preferred directions in the
ntrace tree, but not quite.  I think I have a good sense of all the
moving parts going into the interface/fancy-debugger-module picture.
In the next day or two, I'll start a new thread here about each facet of
the problem as I see it and what I've had in mind; for lack of a better
name, under the heading ntrace.


Thanks,
Roland
  


--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: The demise of utracer.

2008-06-25 Thread Chris Moller



Phil Muldoon wrote:

Chris Moller wrote:
Due to a complete lack of interest, I'm shelving utracer.  May it 
collect dust in peace.




I was (and still am) interested in it from a direct api point of view. 
As an end-tool not so much.




Systemtap is cool, and, based on tinkering with it for a while, is 
great for finding deep, hairy, stuff, but doesn't appear to lend 
itself to highly interactive use in debuggers.


When you mean a Systemtap like approach, what do you mean? I am doing 
poor service in describing Systemtap, but do you mean compiling a 
script/api into a utracer kernel module and aggregating data from 
that? If this were the approach would not an assembly of scripts be 
the debugger itself?



I'm certainly no systemtap expert, but at least one limiting factor I 
see is that compiling and loading a kernel module takes time and 
resources way beyond the trivial--I can't imagine such an approach being 
workable in an interactive environment.  You almost certainly couldn't 
do it on a per-user-interaction basis with anything like decent response 
time.




The whole-new-syscall thing, at which I started some tentative 
hacking a while ago, was based on my aversion to hacking into Other 
People's Code, mostly kernel/ptrace.c in this case--I just thought it 
more expedient to be as unobtrusive as possible.  At least one 
downside of this, of course, is the redundancy with ptrace.  (And I 
suspect it might be difficult to coordinate getting a new syscall 
number universally accepted upstream, but I don't have much of a clue 
about that.)


I am not sure how this would scale.  It would be cool to see how it 
did in a basic prototype, if only to rule it out due to performance 
and scalability.


I'm not sure what you mean by scale.  My sandbox code added one new 
entry in unistd.h:  __NR_utrace, 327.  The redundancy I'm talking about 
is that ptrace() capabilities, stepping, peeking, poking, etc., would 
continue to exist in ptrace (in kernel/ptrace.c), but would be 
replicated as a subset of the capabilities of utrace() (actually, 
syscall(SYS_utrace,...) until libc caught up) in a hypothetical 
kernel/utrace_ui.c.  (Not so hypothetical, actually--that's what I called 
it.)
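
For anyone unfamiliar with the "until libc caught up" idiom: a client reaches a syscall that has no libc wrapper by calling syscall() with the raw number. Since __NR_utrace was a sandbox-only number and must not be invoked on a stock kernel, the sketch below borrows SYS_getpid purely to show the shape such a call takes. demo_raw_getpid is a made-up name; a utrace() stub would presumably have looked the same with its own number and arguments.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Same shape a hand-rolled utrace() stub would have had:
 *     syscall(__NR_utrace, ...);
 * SYS_getpid is used here only because it exists on every Linux kernel
 * and is harmless to call. */
static long demo_raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

On failure syscall() returns -1 and sets errno, so a wrapper for an unimplemented number would typically see ENOSYS.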






That leaves, unless someone has a better idea, extending ptrace.  I 
started a bit of tentative hacking on that yesterday.  Turns out, 
actually, that it was easy to do that minimally intrusively. (It's 
about a ten-line patch to kernel/ptrace.c to add a default case to 
the ptrace_common() request switch that calls an external fcn to 
decode extensions to the PTRACE_* requests, plus a patch to 
include/linux/ptrace.h to add the new request number #defines.  All 
the real hacking is in the external fcn.)  At least one upside of 
this, of course, is that there's no redundancy with ptrace.
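
The shape of that patch can be sketched in user space as follows. This is not the actual kernel/ptrace.c code: the request numbers, demo_common_request and demo_ext_request are all invented for illustration. The point is only that the existing switch gains a default case that hands anything it doesn't recognise to an external decoder.

```c
#include <errno.h>

/* Illustrative request numbers only; not real PTRACE_* values. */
enum {
    DEMO_PEEK     = 1,
    DEMO_POKE     = 2,
    DEMO_EXT_BASE = 0x5000,
    DEMO_EXT_WAIT = DEMO_EXT_BASE + 1
};

/* The "external fcn": decodes requests the core switch doesn't know. */
static long demo_ext_request(long request, long data)
{
    switch (request) {
    case DEMO_EXT_WAIT:
        return data;        /* placeholder for the real extension work */
    default:
        return -EINVAL;     /* not an extension request either */
    }
}

/* Stand-in for the existing request switch.  The only change is the
 * default case, which forwards unknown requests to the decoder above. */
static long demo_common_request(long request, long data)
{
    switch (request) {
    case DEMO_PEEK:
        return 0;           /* existing behaviour left untouched */
    case DEMO_POKE:
        return 0;
    default:
        return demo_ext_request(request, data);
    }
}
```

Keeping the extension decoder in its own function is what keeps the patch to the core file down to a few lines.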


I've no opinion on this, and it always tends to generate debate on why 
ptrace works, does not work, ptrace rules, ptrace sucks type 
arguments. I'm curious to why you think this approach is 
relevant/useful though? Is it from a maintainability viewpoint? 
Performance?



Several things:  extensions of ptrace can do arbitrary things completely 
unlike the capabilities of existing ptrace()--there'd be an entire 
new range of requests extending the usual PTRACE_POKE*, PTRACE_PEEK*, 
etc., stuff to include, hypothetically, things like UTRACE_WAIT.  
(Again, not so hypothetical: I've already sandboxed that: it acts like a 
super, highly controllable, waitpid(), that unblocks on a call-defined 
subset of task state changes, signals, syscall entries/exits, etc.)  It 
certainly is more maintainable than replicating ptrace() capability in a 
comprehensive utrace(), and, either as an extension of the ptrace 
syscall or as its own utrace syscall, it's got a lot better performance 
than the /proc entry approach--there's a lot more overhead associated 
with read(), write(), and ioctl() than with a direct syscall.
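
One way to picture the "call-defined subset" idea is as a bitmask the caller supplies, against which pending events are tested. The event bits and demo_first_wanted below are assumptions made for this sketch; the thread does not give UTRACE_WAIT's real argument layout.

```c
#include <stddef.h>

/* Hypothetical event bits, one per class of thing a waiter might want. */
#define DEMO_EV_SIGNAL        (1u << 0)
#define DEMO_EV_SYSCALL_ENTRY (1u << 1)
#define DEMO_EV_SYSCALL_EXIT  (1u << 2)
#define DEMO_EV_STATE_CHANGE  (1u << 3)

/* Return the index of the first pending event the caller asked for, or
 * -1 if none qualifies (the point at which a real wait would keep
 * blocking instead of returning). */
static int demo_first_wanted(const unsigned *pending, size_t n,
                             unsigned wanted)
{
    for (size_t i = 0; i < n; i++)
        if (pending[i] & wanted)
            return (int)i;
    return -1;
}
```

A caller interested only in signals and state changes would pass DEMO_EV_SIGNAL | DEMO_EV_STATE_CHANGE and never be woken for syscall traffic.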




Thanks

Phil


--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: The demise of utracer.

2008-06-25 Thread Chris Moller



Daniel Jacobowitz wrote:

On Wed, Jun 25, 2008 at 12:30:03PM -0400, Chris Moller wrote:
  

Extending ptrace seems like a sad idea.  If Linux is going to grow a
new userspace-accessible debug interface, can't it go in /proc or
something?
  
  
Actually, that's exactly what utracer does.  It's a module that creates  
entries under /proc (/proc/utracer/*) that client apps can  
read()/write()/ioctl() to access utracer capabilities.  Maybe the coolest 
thing about utracer is that it gave every app its own /proc entry that 
blocked on read() until an app-defined interesting thing happened: 
specified signals, task state changes, specified syscall entry/exit, all 
the stuff accessible through utrace report_* callbacks.



So, how'd it demise in a way that a syscall interface would be any
better?  This sounds like the right way to do it (barring scaling
details; maintaining one fd per thread becomes impractical).
  


No, utracer only required two fds per client app: two instances of a 
gdb-replacement, e.g., would require a total of four fds, regardless of 
how many threads each gdb-thing was following.  The module was designed 
to accept client requests from any number of apps and assign each 
its pair of fds.  (One fd was for write()ing various things to the 
module to control operations and ioctl()ing, mostly to extract 
synchronous data.  The other fd was a read-only that blocked pending 
user-defined interesting stuff, kinda like a super waitpid().)


The main reason I'm moving away from this approach is that the overhead 
of read()/write()/ioctl() to/from a /proc pseudo-entry is a lot higher 
than a simple syscall.  It's mostly a performance thing, but there's 
also less code to maintain if I get rid of the /proc stuff.  (Plus, I 
couldn't find any existing examples of anyone doing ioctl() to a /proc 
entry and I kinda had to invent the method myself.  It works fine, but 
that may just be because I haven't found the right way to break it yet.)



Side note: every time someone talks about a ptrace replacement I
suggest stealing one from Solaris :-) It seems one of the areas that
Sun thought out properly, although in my limited brushes with it in
the last year I'm becoming less convinced of that.
  



What's Solaris ptrace() do that Linux ptrace() doesn't?  Nothing says I 
can't at least hack at putting it in.


--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey






Re: The demise of utracer.

2008-06-25 Thread Chris Moller



David Miller wrote:

From: Daniel Jacobowitz [EMAIL PROTECTED]
Date: Wed, 25 Jun 2008 11:38:25 -0400

  

The single biggest pain in GDB's process management is dealing with
signals, especially the ways that ptrace interferes with normal
operation.



Because of this, and other similar examples, I believe the only
way to design a new debugging interface is to walk through a
significant debugging tool like GDB and guide the interface
design by what something like GDB is trying to accomplish.

There are years and years of experience in debugging codified
into a code base like GDB, and therefore the perfect place
to mine interface guiding experience from.
  


I agree in principle, but there are years and years of old cruft in gdb 
too and I'm not altogether sure that separating the experience from the 
cruft is possible or, at least, any less work than  just starting over 
and accumulating new cruft.


--
Chris Moller

 I know that you believe you understand what you think I said, but
 I'm not sure you realize that what you heard is not what I meant.
 -- Robert McCloskey



