On Fri, Apr 03, 2009 at 07:55:06PM +0100, Andrew Gabriel wrote:
> Adam Leventhal wrote:
> >On Fri, Apr 03, 2009 at 05:40:44PM +0100, Andrew Gabriel wrote:
> >>I have a dtrace variable which is a counter which I'm incrementing on
> >>entry to a kernel function, and decrementing on function return, so I
> >>expect the value to be the number of threads which are currently in that
> >>function. However, the value slowly climbs to values way higher than the
> >>number of threads I could possibly have in the function. It's rather like
> >>what I'd expect if I was manipulating a counter in C from lots of threads
> >>without protection from a mutex, and indeed if I turn off all except 1
> >>CPU, the counter seems to work perfectly and give me the values I'd
> >>expect.
> >>
> >>So, is there any way to safely have multiple threads increment and
> >>decrement a counter? I guess I could use an array indexed by cpu, but I
> >>can't think how to add together all the array elements to get the total,
> >>which I'm using in a tick probe.
> >
> >Hi Andrew,
> >
> >Take a look at the documentation for aggregations: that's the mechanism you
> >want to use for what you describe above.
>
>
> Thanks Adam, but I'm struggling to see how that helps.
>
> Here's a mini cut down version of what I was trying to do...
>
>
> #!/usr/sbin/dtrace -qs
>
> /*
> * Catch ufs/zfs filesystem VOP_WRITE calls.
> */
>
> fbt:zfs:zfs_write:entry, fbt:ufs:ufs_write:entry
> {
> self->start = 1;
> queued++;
> }
>
> fbt:zfs:zfs_write:return, fbt:ufs:ufs_write:return
> /self->start/
> {
> queued--;
> self->start = 0;
> }
>
> profile-5000
> {
> @QueueLength["queued writes"] = lquantize(queued, 0, 100, 1);
> }
>
>
> Short of indexing the 'queued' and '@QueueLength' by cpu (which works,
> but doesn't give me quite what I'm after, the spread of total numbers of
> threads in these functions), I can't see how I can use aggregations
> further to help here.
% cat > getqueue.d <<EOF
#!/usr/sbin/dtrace -s
#pragma D option quiet
#pragma D option aggsortkey=1
hrtime_t start;
inline hrtime_t bin = (timestamp - start) / 200000; /* bin every 200usecs */
BEGIN
{
start = timestamp;
}
fbt:zfs:zfs_write:entry, fbt:ufs:ufs_write:entry
{
self->start = 1;
@queued[bin] = sum(1ull);
}
fbt:zfs:zfs_write:return, fbt:ufs:ufs_write:return
/self->start/
{
@dequeued[bin] = sum(-1ull);
self->start = 0;
}
tick-1sec
/ ++seconds > 10 / /* gather about 10 seconds of data */
{
exit(0);
}
END {
printf("%20s %10s %10s\n", "BIN", "ENQUEUED", "DEQUEUED");
  printf("%20d %10@d %10@d\n", "BIN");
  printa("%20d %10@d %10@d\n", @queued, @dequeued);
}
EOF
This gathers data in the form:
BIN ENQUEUED DEQUEUED
6492 1 0
6494 2 -2
6495 18 -18
6496 18 -18
6497 17 -17
6498 18 -18
6499 18 -18
6500 18 -18
6501 17 -18
6502 18 -18
6503 18 -17
6504 18 -18
6505 18 -18
6506 18 -18
6507 17 -18
6508 18 -18
...
You can process the data with a simple AWK script:
% chmod +x getqueue.d
% ./getqueue.d -o /var/tmp/data
% nawk '
$1 + 0 != 0 {
        queued = queued + $2 + $3
        printf("%10d %10d\n", $1, queued)
}' < /var/tmp/data
6492 1
6494 1
6495 1
6496 1
6497 1
6498 1
...
Note that this gives you more data than you were asking for; you also know
how many items are being processed at any given time.
Most problems you might think you want global variables for can be re-cast
in terms of aggregations, and the resultant scripts will run faster,
more accurately, and with less contention than they would even if global
variables were atomic.
Cheers,
- jonathan
_______________________________________________
dtrace-discuss mailing list
[email protected]