Hi Ryan,

Ryan wrote:
Hi all,

While experimenting with the sched provider, I found some unexpected behaviors 
of the enqueue and dequeue probes. I had naïvely assumed that a cpu's currently 
running thread would always be the head of that cpu's runqueue, and that 
enqueue and dequeue events would sum to zero, but neither seems to be the case.
The current running thread on a cpu is not on any dispq. The runqueue (dispq) is where
runnable threads wait.  So when threads are enqueued, they are either
switching out, or going to sleep. Threads are dequeued when they are switching in.
I'll say more about your dtrace output below...
1. Threads can apparently be moved between runqueues while on-cpu (and without 
being migrated)
2. Moving a thread from one queue to another (or between front/back of a queue) 
generates an enqueue event but no dequeue
3. A thread won't necessarily be rescheduled on the same cpu as the one it was 
last enqueued on
4. A thread which is both dequeued and descheduled can be rescheduled later 
with no intervening enqueue events.

Does anybody know how to make sense of this behavior? FWIW, I had been hoping to use sched:::*-cpu and sched:::*queue to implement per-process load monitoring at fairly high frequency (1-10msec intervals). /proc/$pid/usage is too expensive because it iterates over all threads in the process at every call while holding locks that serialize other syscalls.
Thanks!
Ryan

The script:
sched::resume:on-cpu/pid == $1/ { printf("t...@%d running on cpu %d\n", tid, 
cpu); }
sched::resume:off-cpu/pid == $1/ { printf("t...@%d descheduled\n", tid); }
sched:::enqueue/pid == $1/ { printf("t...@%d enqueued on cpu %d\n", 
args[0]->pr_lwpid, args[2]->cpu_id); }
sched:::dequeue/pid == $1/ { printf("t...@%d dequeued\n", args[0]->pr_lwpid); }

Exerpts from (and commentary about) the output produced when tracing a bash 
shell that executes 'ls':

 16     16                    resume:on-cpu t...@1 running on cpu 16
 16   5065               setfrontdq:enqueue t...@1 enqueued on cpu 40
 16   5055                   resume:off-cpu t...@1 descheduled
 16     16                    resume:on-cpu t...@1 running on cpu 16

Why did it come back to cpu 16 instead of cpu 40?
It didn't. It was running on cpu16. Then it was preempted and place on the runqueue for cpu 40. After it places itself on the runqueue for cpu 40, it switches off (that's
the off-cpu).
Then it began running again on cpu 16.
 16   5065               setfrontdq:enqueue t...@1 enqueued on cpu 40
 16   5066                setbackdq:enqueue t...@1 enqueued on cpu 56
 16   5066                setbackdq:enqueue t...@1 enqueued on cpu 16

Was it not dequeued from cpus {16, 40, 56} or the dequeue events simply not 
reported?
Your predicate only looks for one thread.  The dequeue is being done by the
running thread (not t...@1).  You might find it more informational to not use
a predicate to see better what events are taking place, by whom, and
for whom.
 16   5068                     disp:dequeue t...@1 dequeued
 16   5055                   resume:off-cpu t...@1 descheduled
 16     16                    resume:on-cpu t...@1 running on cpu 16

Doesn't it need to join a runqueue before running?
Enqueues can be done either by the current thread enqueueing itself,
or if the thread is woken up following a blocking synchronization primitive.
Try the script without the predicate, or change the predicate to watch
the thread being switched instead of the current executing thread.
Also, if the process you are looking at is multithreaded, the predicate
can match different threads at different times.

max


_______________________________________________
dtrace-discuss mailing list
[email protected]

Reply via email to