I've been using it in another CR where destroying one of the snapshots
was helping performance. Nevertheless, here it is on that server:

Short period of time:

bash-3.00# ./metaslab-6495013.d

^C

  Loops count
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@               17674
               1 |@@@@@@@                                  4418
               2 |@@@                                      2123
               4 |@@                                       1257
               8 |@                                        753
              16 |@                                        416
              32 |                                         220
              64 |                                         103
             128 |                                         58
             256 |                                         38
             512 |                                         21
            1024 |                                         13
            2048 |                                         10
            4096 |                                         8
            8192 |                                         3
           16384 |                                         3
           32768 |                                         2
           65536 |                                         1
          131072 |                                         26
          262144 |                                         7
          524288 |                                         0

bash-3.00#


Looks like that's it.

Yeah, sometimes doing over 200,000 avl_walks isn't going to be good.
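
To make the cost concrete, here is a minimal C sketch of a first-fit search. It is not the ZFS code: a plain array stands in for the AVL tree of free segments, and the names free_seg_t and first_fit_walks are made up for illustration. It only shows that when every free segment ahead of a fit is too small, a first-fit search has to step past each one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one free segment in a metaslab. */
typedef struct free_seg {
    uint64_t start;
    uint64_t size;
} free_seg_t;

/*
 * Walk the segments in address order, as a first-fit allocator would,
 * and return the index of the first segment that can hold `size` bytes.
 * Returns -1 if none fits.  *walks is set to the number of segments
 * stepped past because they were too small, which is the analogue of
 * the avl_walk() count in the histogram.
 */
static long
first_fit_walks(const free_seg_t *segs, size_t nsegs, uint64_t size,
    uint64_t *walks)
{
    *walks = 0;
    for (size_t i = 0; i < nsegs; i++) {
        if (segs[i].size >= size)
            return ((long)i);
        (*walks)++;
    }
    return (-1);
}
```

With 999 free 4K segments sitting ahead of a single 128K one, a 128K request steps past 999 segments, while a 4K request is satisfied by the very first one.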

...



IIRC you've written before that someone is actively working on it
right now, right? Any update? Any approx. ETA? I would like to test it
ASAP even before putback.

I believe this is next on George's list (he's got a couple of fixes he needs to putback first).

So since you're in this state, would you mind looking at the 'size' (arg1) distribution for metaslab_ff_alloc() instead of the avl_walk() counts? Something like:
"
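#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Thread-local flag: set while this thread is inside metaslab_ff_alloc(). */
BEGIN
{
        self->in_metaslab = 0;
}

/* On entry, record the requested size (arg1) in a power-of-two histogram. */
fbt::metaslab_ff_alloc:entry
/self->in_metaslab == 0/
{
        self->in_metaslab = 1;
        @sizes["metaslab sizes"] = quantize(arg1);
}

/* On return, clear the flag so the next call on this thread is counted. */
fbt::metaslab_ff_alloc:return
/self->in_metaslab/
{
        self->in_metaslab = 0;
}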
"

I'm wondering if we can just lower the amount of space we're trying to allocate as the pool becomes more fragmented - we'll lose a little I/O performance, but it should limit this bug.
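
A rough C sketch of one way to read that idea, purely illustrative and not taken from ZFS: cap how far a first-fit search may walk, and if the cap is hit, halve the request and retry. The types and names here (seg_t, bounded_first_fit, alloc_with_backoff) and the halving policy itself are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free segment; not a ZFS structure. */
typedef struct seg {
    uint64_t start;
    uint64_t size;
} seg_t;

/*
 * First-fit search that examines at most `budget` segments.  Returns the
 * index of the first segment with at least `size` bytes, or -1 if the
 * budget runs out or no segment fits.
 */
static long
bounded_first_fit(const seg_t *segs, size_t nsegs, uint64_t size,
    size_t budget)
{
    for (size_t i = 0; i < nsegs && i < budget; i++) {
        if (segs[i].size >= size)
            return ((long)i);
    }
    return (-1);
}

/*
 * Try the requested size first; each time the bounded search fails,
 * halve the size and retry, never going below min_size.  On success,
 * stores the size actually found in *got and returns the segment index.
 * On failure, sets *got to 0 and returns -1.
 */
static long
alloc_with_backoff(const seg_t *segs, size_t nsegs, uint64_t want,
    uint64_t min_size, size_t budget, uint64_t *got)
{
    for (uint64_t size = want; size >= min_size && size > 0; size /= 2) {
        long idx = bounded_first_fit(segs, nsegs, size, budget);
        if (idx >= 0) {
            *got = size;
            return (idx);
        }
    }
    *got = 0;
    return (-1);
}
```

The trade-off is the one mentioned above: a caller that gets back a smaller size has to issue more, smaller allocations, but no single search can walk more than `budget` segments.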

eric
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
