> -Still playing with 'recsize' values but it doesn't seem to be doing
> much...I don't think I have a good understand of what exactly is being
> written...I think the whole file might be overwritten each time
> because it's in binary format.

The other thing to keep in mind is that the tunables like compression
and recsize only affect newly written blocks.  If you have a bunch of
data that was already laid down on disk and then you change the tunable,
this will only cause new blocks to have the new size.  If you experiment
with this, make sure all of your data has the same blocksize by copying
it over to the new pool once you've changed the properties.

> -Setting zfs_nocacheflush, though got me drastically increased
> throughput--client requests took, on average, less than 2 seconds
> each!
> 
> So, in order to use this, I should have a storage array, w/battery
> backup, instead of using the internal drives, correct?

zfs_nocacheflush should only be used on arrays with a battery backed
cache.  If you use this option on a disk, and you lose power, there's no
guarantee that your write successfully made it out of the cache.

A performance problem when flushing the cache of an individual disk
implies that there's something wrong with the disk or its firmware.  You
can disable the write cache of an individual disk using format(1M).  When you
do this, ZFS won't lose any data, whereas enabling zfs_nocacheflush can
lead to problems.

I'm attaching a DTrace script that will show the cache-flush times
per-vdev.  Remove the zfs_nocacheflush tuneable and re-run your test
while using this DTrace script.  If one particular disk takes longer
than the rest to flush, this should show us.  In that case, we can
disable the write cache on that particular disk.  Otherwise, we'll need
to disable the write cache on all of the disks.

The script is attached as zfs_flushtime.d

Use format(1M) with the -e option to adjust the write_cache settings for
SCSI disks.

-j
#!/usr/sbin/dtrace -Cs
/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */

/*
 * Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

#define DKIOC                   (0x04 << 8)
#define DKIOCFLUSHWRITECACHE    (DKIOC|34)

fbt:zfs:vdev_disk_io_start:entry
/(args[0]->io_cmd == DKIOCFLUSHWRITECACHE) && (self->traced == 0)/
{
        self->traced = args[0];
        self->start = timestamp;
}

fbt:zfs:vdev_disk_ioctl_done:entry
/args[0] == self->traced/
{
        @a[stringof(self->traced->io_vd->vdev_path)] =
                quantize(timestamp - self->start);
        self->start = 0;
        self->traced = 0;
}

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Reply via email to