Paul,
While testing iscsi targets exported from thumpers via 10GbE and
imported via 10GbE on T2000s, I am not seeing the throughput I expect,
and more importantly there is a tremendous amount of read IO
happening on a purely sequential write workload. (Note: all systems
have Sun 10GbE cards and are running Nevada b65.)
The read IO activity you are seeing is a direct result of re-writes
on the ZFS storage pool. If you were to recreate the test from
scratch, you would notice that on the very first pass of write I/Os
from 'dd', there would be no reads. This is an artifact of using
zvols as backing store for iSCSI Targets.
The iSCSI Target software supports raw SCSI disks, Solaris raw
devices (/dev/rdsk/...), Solaris block devices (/dev/dsk/...),
zvols, SVM volumes, and files in file systems (including tmpfs).
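For reference, a minimal sketch of how a zvol-backed target can be set
up on the thumper side (assuming the zvol name thumper1-vdev0/iscsi
used in your local test below; the 200g volsize is illustrative only,
and the shareiscsi property is just one way to do it, iscsitadm with a
backing store is another):
# zfs create -V 200g thumper1-vdev0/iscsi
# zfs set shareiscsi=on thumper1-vdev0/iscsi
# iscsitadm list target -v
On the very first sequential write pass from the initiator you should
see only writes in 'zpool iostat'; the reads show up once the same
blocks are re-written.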
Simple write workload (from T2000):
# time dd if=/dev/zero of=/dev/rdsk/c6t010000144F210ECC00002A004675E957d0 \
    bs=64k count=1000000
A couple of things are maybe missing here, or the commands are not a
true cut-and-paste of what is being tested.
1). From the iSCSI initiator, there is no device at
/dev/rdsk/c6t010000144F210ECC00002A004675E957d0; note the missing
slice (s0, s1, s2, etc.).
2). Even if one were to specify a slice, as in
/dev/rdsk/c6t010000144F210ECC00002A004675E957d0s2, it is unlikely that
the LUN has been formatted. When I run format the first time, I get
the error message "Please run fdisk first" (see the sketch after this
explanation).
Of course this does not have to be the case: if the zvol backing this
LUN had previously been formatted with either a Solaris VTOC or an
Intel EFI label, then the disk would show up correctly.
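If the LUN has never been labeled, here is a rough sketch of getting to
a usable slice from the T2000 (format and fdisk are interactive, and
the exact prompts depend on whether an SMI or EFI label is chosen):
# format
  (select c6t010000144F210ECC00002A004675E957d0, run fdisk to lay down
   a default Solaris partition if prompted, then label the disk)
# time dd if=/dev/zero of=/dev/rdsk/c6t010000144F210ECC00002A004675E957d0s2 \
    bs=64k count=1000000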
Performance of iscsi target pool on new blocks:
bash-3.00# zpool iostat thumper1-vdev0 1
                   capacity     operations    bandwidth
pool             used  avail   read  write   read  write
--------------  -----  -----  -----  -----  -----  -----
thumper1-vdev0  17.4G  2.70T      0    526      0  63.6M
thumper1-vdev0  17.5G  2.70T      0    564      0  60.5M
thumper1-vdev0  17.5G  2.70T      0      0      0      0
thumper1-vdev0  17.5G  2.70T      0      0      0      0
thumper1-vdev0  17.5G  2.70T      0      0      0      0
Configuration of zpool/iscsi target:
# zpool status thumper1-vdev0
pool: thumper1-vdev0
state: ONLINE
scrub: none requested
config:
        NAME              STATE     READ WRITE CKSUM
        thumper1-vdev0    ONLINE       0     0     0
          c0t7d0          ONLINE       0     0     0
          c1t7d0          ONLINE       0     0     0
          c5t7d0          ONLINE       0     0     0
          c6t7d0          ONLINE       0     0     0
          c7t7d0          ONLINE       0     0     0
          c8t7d0          ONLINE       0     0     0
errors: No known data errors
The first thing is that for this pool I was expecting 200-300MB/s of
throughput, since it is a simple stripe across six 500G disks. In
fact, a direct local workload of the same type (run directly on
thumper1) confirms what I expected:
bash-3.00# dd if=/dev/zero of=/dev/zvol/rdsk/thumper1-vdev0/iscsi \
    bs=64k count=1000000 &
bash-3.00# zpool iostat thumper1-vdev0 1
                   capacity     operations    bandwidth
pool             used  avail   read  write   read  write
--------------  -----  -----  -----  -----  -----  -----
thumper1-vdev0  20.4G  2.70T      0  2.71K      0   335M
thumper1-vdev0  20.4G  2.70T      0  2.92K      0   374M
thumper1-vdev0  20.4G  2.70T      0  2.88K      0   368M
thumper1-vdev0  20.4G  2.70T      0  2.84K      0   363M
thumper1-vdev0  20.4G  2.70T      0  2.57K      0   327M
The second thing is that when overwriting already-written blocks
via the iscsi target (from the T2000), I see a lot of read bandwidth
for blocks that are being completely overwritten. This does not
seem to slow down the write performance, but 1) it is not seen in
the direct case; and 2) it consumes channel bandwidth unnecessarily.
bash-3.00# zpool iostat thumper1-vdev0 1
                   capacity     operations    bandwidth
pool             used  avail   read  write   read  write
--------------  -----  -----  -----  -----  -----  -----
thumper1-vdev0  8.90G  2.71T    279    783  31.7M  95.9M
thumper1-vdev0  8.90G  2.71T    281    318  31.7M  29.1M
thumper1-vdev0  8.90G  2.71T    139      0  15.8M      0
thumper1-vdev0  8.90G  2.71T    279      0  31.7M      0
thumper1-vdev0  8.90G  2.71T    139      0  15.8M      0
Can anyone help to explain what I am seeing, or give me some
guidance on diagnosing the cause of the following:
- The bottleneck in accessing the iscsi target from the T2000
From the iSCSI Initiator's point of view, there are various
(Negotiated) Login Parameters, which may have a direct effect on
performance. Take a look at "iscsiadm list target --verbose", then
consult the iSCSI man pages or the documentation online at docs.sun.com.
Remember to keep track of what you change on a per-target basis,
change only one parameter at a time, and measure your results.
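A minimal sketch of that workflow from the T2000 (the digest change is
only an example of a single-parameter tweak; check iscsiadm(1M) for
the exact options supported by your build):
# iscsiadm list target --verbose
  (note the negotiated login parameters)
# iscsiadm list initiator-node
  (initiator-side defaults)
# iscsiadm modify initiator-node -h none -d none
  (example: make sure header/data digests are off)
# time dd if=/dev/zero of=/dev/rdsk/c6t010000144F210ECC00002A004675E957d0s2 \
    bs=64k count=1000000
  (re-measure after each single change)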
- The cause of the extra read bandwidth when overwriting blocks on
the iscsi target from the T2000.
This is caused by ZFS as the backing store, and its use of COW
(copy-on-write) in maintaining the ZFS zvols within the storage pool.
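One thing worth checking (my assumption, not something you have shown
above) is whether the initiator's writes line up with the zvol's block
size. Writes that are smaller than, or misaligned with, the
volblocksize force ZFS to read the rest of each block before rewriting
it, and that shows up as read bandwidth on the pool. On thumper1:
# zfs get volblocksize,volsize thumper1-vdev0/iscsi
  (the zvol default volblocksize is 8K; it can only be set at creation
   time, e.g. 'zfs create -V 200g -b 64k thumper1-vdev0/iscsi')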
Any help is much appreciated,
paul
Jim Dunham
Solaris, Storage Software Group
Sun Microsystems, Inc.
1617 Southwood Drive
Nashua, NH 03063
Email: [EMAIL PROTECTED]
http://blogs.sun.com/avs
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss