On Fri, Apr 03, 2009 at 07:55:06PM +0100, Andrew Gabriel wrote:
> Adam Leventhal wrote:
> >On Fri, Apr 03, 2009 at 05:40:44PM +0100, Andrew Gabriel wrote:
> >>I have a dtrace variable which is a counter which I'm incrementing on 
> >>entry to a kernel function, and decrementing on function return, so I 
> >>expect the value to be the number of threads which are currently in that 
> >>function. However, the value slowly climbs to values way higher than the 
> >>number of threads I could possibly have in the function. It's rather like 
> >>what I'd expect if I was manipulating a counter in C from lots of threads 
> >>without protection from a mutex, and indeed if I turn off all except 1 
> >>CPU, the counter seems to work perfectly and give me the values I'd 
> >>expect.
> >>
> >>So, is there any way to safely have multiple threads increment and 
> >>decrement a counter? I guess I could use an array indexed by cpu, but I 
> >>can't think how to add together all the array elements to get the total, 
> >>which I'm using in a tick probe.
> >
> >Hi Andrew,
> >
> >Take a look at the documentation for aggregations: that's the mechanism you
> >want to use for what you describe above.
> 
> 
> Thanks Adam, but I'm struggling to see how that helps.
> 
> Here's a mini cut down version of what I was trying to do...
> 
> 
> #!/usr/sbin/dtrace -qs
> 
> /*
>  * Catch ufs/zfs filesystem VOP_WRITE calls.
>  */
> 
> fbt:zfs:zfs_write:entry, fbt:ufs:ufs_write:entry
> {
>         self->start = 1;
>         queued++;
> }
> 
> fbt:zfs:zfs_write:return, fbt:ufs:ufs_write:return
> /self->start/
> {
>         queued--;
>         self->start = 0;
> }
> 
> profile-5000
> {
>         @QueueLength["queued writes"] = lquantize(queued, 0, 100, 1);
> }
> 
> 
> Short of indexing the 'queued' and '@QueueLength' by cpu (which works, 
> but doesn't give me quite what I'm after, the spread of total numbers of 
> threads in these functions), I can't see how I can use aggregations 
> further to help here.

% cat > getqueue.d <<EOF
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option aggsortkey=1

hrtime_t start;
inline hrtime_t bin = (timestamp - start) / 200000;     /* bin every 200usecs */

BEGIN
{
        start = timestamp;
}

fbt:zfs:zfs_write:entry, fbt:ufs:ufs_write:entry
{
        self->start = 1;
        @queued[bin] = sum(1ull);
}

fbt:zfs:zfs_write:return, fbt:ufs:ufs_write:return
/self->start/
{
        @dequeued[bin] = sum(-1ull);
        self->start = 0;
}

tick-1sec
/ ++seconds > 10 /              /* gather about 10 seconds of data */
{
        exit(0);
}

END {
        printf("%20s %10s %10s\n", "BIN", "ENQUEUED", "DEQUEUED");
        printa("%20d %10@d %10@d\n", @queued, @dequeued);
}
EOF

This gathers data in the form:

                 BIN   ENQUEUED   DEQUEUED
                6492          1          0
                6494          2         -2
                6495         18        -18
                6496         18        -18
                6497         17        -17
                6498         18        -18
                6499         18        -18
                6500         18        -18
                6501         17        -18
                6502         18        -18
                6503         18        -17
                6504         18        -18
                6505         18        -18
                6506         18        -18
                6507         17        -18
                6508         18        -18
...

You can process the data with a simple AWK script:

% chmod +x getqueue.d
% ./getqueue.d -o /var/tmp/data
% nawk '
    $1 + 0 != 0 {
        queued = queued + $2 + $3
        printf("%10d %10d\n", $1, queued)
    }' < /var/tmp/data
      6492          1
      6494          1
      6495          1
      6496          1
      6497          1
      6498          1
...
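
If what you are after is the spread of queue lengths (what the lquantize()
in your script was aiming for), the same data can be folded into a
distribution. The following is only a sketch under some assumptions: the
function name queue_hist is made up for illustration, the depth is sampled
at the end of each 200usec bin, and a bin missing from the output is taken
to mean the depth did not change during it.

```shell
# Hypothetical helper: print "DEPTH BINS" pairs, i.e. for how many 200usec
# bins each queue depth was observed, from getqueue.d's output rows.
queue_hist() {
    awk '
    $1 + 0 != 0 {
        # The depth left by the previous row held until this bin,
        # including any bins that recorded no events at all.
        if (seen)
            weight[depth] += $1 - prev
        depth += $2 + $3        # DEQUEUED is already negative
        prev = $1
        seen = 1
    }
    END {
        if (seen)
            weight[depth] += 1  # count the final bin once
        for (d in weight)
            printf("%10d %10d\n", d, weight[d])
    }' "$@" | sort -n
}
```

With the data file from above you would run it as: queue_hist /var/tmp/data
Anything that happens inside a single bin is collapsed, so shrink the bin
size in getqueue.d if you need finer resolution.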


Note that this gives you more data than you were asking for;  you also know
how many items are being processed at any given time.

Most problems you might think you want global variables for can be re-cast
in terms of aggregations, and the resultant scripts will run faster,
more accurately, and with less contention than they would even if global
variables were atomic.

Cheers,
- jonathan
