Hello everybody.

I am experiencing terribly slow writes on my home server. This is from
zpool iostat:

                              capacity     operations    bandwidth
pool                       alloc   free   read  write   read  write
-------------------------  -----  -----  -----  -----  -----  -----
tank                       3.69T   812G    148    255   755K  1.61M
  raidz1                   2.72T   192G     86    112   554K   654K
  raidz1                    995G   621G     61    143   201K   962K

The situation is that one vdev is almost full, while the other one has
plenty of space. I remember that at one point it was known that
writes slow down when the filesystem gets full, due to CPU time spent
looking for free space.

I am seeing that "zpool-tank" is using about half a CPU (on a quad-Opteron
setup) the whole time this write load is running. That seems odd, as
I'd expect nearly a full CPU to be consumed if CPU were the
bottleneck. The disks aren't saturated according to iostat -xcn, and
there are multiple writers writing large files, so I would not expect
a bottleneck there either.

History of the pool: I added the second vdev when the first vdev was
about 70% full. The pool holds large files as well as small files,
virtual machines, and a heavily loaded database, probably leaving all
the free space very fragmented. After adding the second vdev, writes
didn't seem to be biased toward the new device strongly enough; the
older one filled up anyway, and now write speed has slowed to a crawl.
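As a sanity check on the numbers above: if, as a toy model (my assumption, not necessarily how the real allocator weights things), writes were spread across vdevs purely in proportion to free space, the nearly-full vdev would still take a sizeable share. A quick back-of-the-envelope using the free figures from the iostat output:

```shell
# Toy model (assumption, not the actual ZFS allocator): new writes land
# on each vdev in proportion to its free space.
free_old=192   # GB free on the old raidz1, from zpool iostat above
free_new=621   # GB free on the new raidz1
# Share of new writes that would still hit the nearly-full vdev:
pct=$(awk -v a="$free_old" -v b="$free_new" 'BEGIN { printf "%.1f", 100*a/(a+b) }')
echo "$pct% of writes still go to the old vdev"
```

So even with proportional biasing, roughly a quarter of new data keeps landing on the old vdev, which would match it filling up anyway.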

If I delete old snapshots or some data, write speeds bump up to a
healthier 70 MB/s, but after a while the problem comes back. I know I
could rewrite most of the old data to move half of it to the new
device, but that seems a rather inelegant solution to the problem.

I had a look with zdb, and many metaslabs have several hundred
megabytes of free space; the best ones have almost a gigabyte free
(out of 4 gigabytes), or in other words are something like 75-90%
full. Is that too heavy for the allocator? Maybe the space map could
be reformatted into a more optimal structure when a metaslab is opened
for writing. Or maybe that is exactly what causes the high CPU usage,
I don't know. And there are still perfectly empty metaslabs on the
other device.
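In case anyone wants to reproduce the zdb check: below is the sort of one-liner I used to summarize per-metaslab free space. The sample input is fabricated and the `zdb -m` field layout is from memory, so treat it as a sketch; on a live system you'd pipe `zdb -m tank` straight into the awk instead.

```shell
# Fabricated sample of `zdb -m tank` output (field layout assumed from
# memory; verify against your build).
cat > /tmp/zdb-m.sample <<'EOF'
metaslab      0   offset            0   spacemap     39   free    963M
metaslab      1   offset    100000000   spacemap     40   free    1.02G
metaslab      2   offset    200000000   spacemap     41   free    412M
EOF
# Count metaslabs and report the largest per-metaslab free space, in MB.
summary=$(awk '$1 == "metaslab" {
        v = $8 + 0
        if ($8 ~ /G$/) v *= 1024      # normalize G to MB
        if ($8 ~ /K$/) v /= 1024      # normalize K to MB
        n++; if (v > max) max = v
    }
    END { printf "%d metaslabs, best free: %.0f MB", n, max }' /tmp/zdb-m.sample)
echo "$summary"
```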

Last time this occurred I devised some synthetic tests to recreate
this condition repeatedly, and noticed that at some point ZFS appeared
to stop allocating space on the fuller device, except for ditto
metadata blocks. This time around that doesn't seem to happen, so
maybe the trigger is less obvious than a simple free-space percentage.
Such an 'emergency bias' seems simple enough, IIRC, from the source
code for choosing which vdev to allocate from, aside from the
triggering condition perhaps being complicated to set accurately.
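One related knob I half-remember (an unverified recollection, not something I've confirmed against current source): the "df" space map allocator supposedly switches from first-fit to the more expensive best-fit once a metaslab's free space drops below a percentage cutoff, and that cutoff is tunable. If so, something along these lines in /etc/system might move the switch point, with the name and value to be checked against your build:

```
* Unverified sketch: raise the df allocator's first-fit/best-fit
* cutoff so the switch happens while metaslabs are less full.
* Verify the tunable name against your build before using.
set zfs:metaslab_df_free_pct = 35
```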

Is there such a trigger and can it be adjusted to occur earlier? Any
other remedies?

Is there a way to confirm that finding free space is indeed the cause
of the slow writes, or whether there is possibly another reason?

I wonder if the write-balancing code should bias more aggressively.
This condition should be expected if, say, one has a system 80% full,
adds another rack of disks, and does not touch the existing data.
Having speed slow to a crawl a month later is a bit unexpected.

Thanks,
Tuomas
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
