[zfs-discuss] No write coalescing after upgrade to Solaris 11 Express

2011-04-27 Thread Matthew Anderson
Hi All,

I've run into a massive performance problem after upgrading to Solaris 11 
Express from oSol 134.

Previously the server was performing a batch write every 10-15 seconds and the 
client servers (connected via NFS and iSCSI) had very low wait times. Now I'm 
seeing constant writes to the array with a very low throughput and high wait 
times on the client servers. Zil is currently disabled. There is currently one 
failed disk that is being replaced shortly.

Is there any ZFS tunable to revert Solaris 11 back to the behaviour of oSol 134?
I attempted to remove Sol 11 and reinstall 134 but it keeps freezing during 
install which is probably another issue entirely...
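For reference, a hedged sketch of inspecting and adjusting the transaction group
timeout, assuming that is the tunable behind the change in write batching (the
variable name comes from the OpenSolaris-era ZFS module, so verify it exists on
your build before changing it):

  # Check the current txg timeout, in seconds, on the live kernel
  echo zfs_txg_timeout/D | mdb -k

  # Temporarily set it back to 30 seconds on the running system
  echo zfs_txg_timeout/W 0t30 | mdb -kw

  # Or make it persistent by adding this line to /etc/system and rebooting:
  #   set zfs:zfs_txg_timeout=30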

zpool iostat output is below. When running zpool iostat -v 2, the write ops and
throughput stay very constant at that level.

                capacity     operations    bandwidth
pool           alloc   free   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
MirrorPool    12.2T  4.11T    153  4.63K  6.06M  33.6M
  mirror      1.04T   325G     11    416   400K  2.80M
    c7t0d0        -      -      5    114   163K  2.80M
    c7t1d0        -      -      6    114   237K  2.80M
  mirror      1.04T   324G     10    374   426K  2.79M
    c7t2d0        -      -      5    108   190K  2.79M
    c7t3d0        -      -      5    107   236K  2.79M
  mirror      1.04T   324G     15    425   537K  3.15M
    c7t4d0        -      -      7    115   290K  3.15M
    c7t5d0        -      -      8    116   247K  3.15M
  mirror      1.04T   325G     13    412   572K  3.00M
    c7t6d0        -      -      7    115   313K  3.00M
    c7t7d0        -      -      6    116   259K  3.00M
  mirror      1.04T   324G     13    381   580K  2.85M
    c7t8d0        -      -      7    111   362K  2.85M
    c7t9d0        -      -      5    111   219K  2.85M
  mirror      1.04T   325G     15    408   654K  3.10M
    c7t10d0       -      -      7    122   336K  3.10M
    c7t11d0       -      -      7    123   318K  3.10M
  mirror      1.04T   325G     14    461   681K  3.22M
    c7t12d0       -      -      8    130   403K  3.22M
    c7t13d0       -      -      6    132   278K  3.22M
  mirror       749G   643G      1    279   140K  1.07M
    c4t14d0       -      -      0      0      0      0
    c7t15d0       -      -      1     83   140K  1.07M
  mirror      1.05T   319G     18    333   672K  2.74M
    c7t16d0       -      -     11     96   406K  2.74M
    c7t17d0       -      -      7     96   266K  2.74M
  mirror      1.04T   323G     13    353   540K  2.85M
    c7t18d0       -      -      7     98   279K  2.85M
    c7t19d0       -      -      6    100   261K  2.85M
  mirror      1.04T   324G     12    459   543K  2.99M
    c7t20d0       -      -      7    118   285K  2.99M
    c7t21d0       -      -      4    119   258K  2.99M
  mirror      1.04T   324G     11    431   465K  3.04M
    c7t22d0       -      -      5    116   195K  3.04M
    c7t23d0       -      -      6    117   272K  3.04M
  c8t2d0          0  29.5G      0      0      0      0
cache             -      -      -      -      -      -
  c8t3d0      59.4G  3.88M    113     64  6.51M  7.31M
  c8t1d0      59.5G    48K     95     69  5.69M  8.08M


Thanks
-Matt


Re: [zfs-discuss] No write coalescing after upgrade to Solaris 11 Express

2011-04-27 Thread Matthew Anderson
NAME                   PROPERTY  VALUE     SOURCE
MirrorPool             sync      disabled  local
MirrorPool/CCIT        sync      disabled  local
MirrorPool/EX01        sync      disabled  inherited from MirrorPool
MirrorPool/EX02        sync      disabled  inherited from MirrorPool
MirrorPool/FileStore1  sync      disabled  inherited from MirrorPool


Sync was disabled on the main pool and then left to inherit to everything else.
The reason for disabling it in the first place was to fix bad NFS write
performance (even with the ZIL on an X25-E SSD it was under 1MB/s).
I've also tried setting logbias to both throughput and latency, but they
perform at around the same level.
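For reference, a hedged sketch of checking and setting these properties per
dataset (dataset names are taken from the listing above; selectively
re-enabling sync is shown purely as an illustration):

  # Show sync and logbias across the pool
  zfs get -r sync,logbias MirrorPool

  # Re-enable synchronous semantics pool-wide...
  zfs set sync=standard MirrorPool
  # ...but keep it disabled on one dataset where the NFS write penalty hurts most
  zfs set sync=disabled MirrorPool/FileStore1

  # Bias a streaming-write dataset away from the slog
  zfs set logbias=throughput MirrorPool/FileStore1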

Thanks
-Matt


-----Original Message-----
From: Andrew Gabriel [mailto:andrew.gabr...@oracle.com] 
Sent: Wednesday, 27 April 2011 3:41 PM
To: Matthew Anderson
Cc: 'zfs-discuss@opensolaris.org'
Subject: Re: [zfs-discuss] No write coalescing after upgrade to Solaris 11 
Express

Matthew Anderson wrote:
> Hi All,
>
> I've run into a massive performance problem after upgrading to Solaris 11
> Express from oSol 134.
>
> Previously the server was performing a batch write every 10-15 seconds and
> the client servers (connected via NFS and iSCSI) had very low wait times. Now
> I'm seeing constant writes to the array with a very low throughput and high
> wait times on the client servers. Zil is currently disabled.

How/Why?

> There is currently one failed disk that is being replaced shortly.
>
> Is there any ZFS tunable to revert Solaris 11 back to the behaviour of oSol
> 134?

What does zfs get sync report?

-- 
Andrew Gabriel


[zfs-discuss] NTFS on NFS and iSCSI always generates small IO's

2011-03-10 Thread Matthew Anderson
Hi All,

I've run into a problem with my OpenSolaris system and NTFS, and I can't seem
to make sense of it.

The server running the virtual machines is Ubuntu Server 10.04 running KVM.
Storage is presented via NFS over InfiniBand. ZFS is not running compression or
dedup. The ZIL is also currently disabled because it was causing terrible NFS
performance.
ESXi also displayed the same behaviour when running Windows VMs.

OpenSolaris box is running -
OS b147
Xeon 5520
24GB DDR3
24x 1.5TB 7200rpm SATA drives
24-bay expander connected to an LSI RAID card via a 4-lane SAS cable
2x 64GB Intel X25-M L2ARC
2x 32GB Intel X25-M SLOG

Host -
4KB reads and writes are full speed, tested using dd. Also tested using Samba
writing to the NFS share (bare-metal PC -> Samba -> NFS) and it was writing as
fast as the server could dish out the data (50MB/s).
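For reference, a hedged sketch of the kind of dd run described here (the mount
point and sizes are hypothetical, not from the original post):

  # 4KB sequential writes to a file on the NFS mount, flushing at the end
  dd if=/dev/zero of=/mnt/nfs/ddtest bs=4k count=262144 conv=fdatasync

  # 4KB sequential reads back, after dropping the client page cache
  echo 3 > /proc/sys/vm/drop_caches
  dd if=/mnt/nfs/ddtest of=/dev/null bs=4k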

Test VM 1 -
OS is Ubuntu Server 10.10 on ext4. Reads and writes are full speed at 4KB,
tested using dd. Sequential read/write seems fine. Tested using Samba as well,
with the same results as above (50MB/s).

Test VM 2 -
OS is Windows Server 2008 on NTFS with a 4KB cluster size. 4KB reads and writes
are 30MB/s, tested using the ATTO disk benchmark. Sequential reads and writes
also max out at 30MB/s and generate a lot of IO on the storage box.
iostat on the storage box shows 3K+ IOPS and only 30MB/s of throughput.

I also tested a bare-metal system running iSCSI over gigabit, with NTFS
(default 4KB cluster size) directly on the ZFS block volume. Results are the
same as Test VM 2: lots of small IO (3K+ IOPS) and low throughput (30MB/s).
I also did the same test using SRP; 4KB reads and writes show the same 30MB/s.
Any IO at or above 128KB picks up speed drastically, easily reaching 700MB/s.

I have a feeling it's to do with ZFS's recordsize property but haven't been 
able to find any solid testing done with NTFS. I'm going to do some testing 
using smaller record sizes tonight to see if that helps the issue. 
At the moment I'm surviving on cache and am quickly running out of capacity.
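For reference, a hedged sketch of the record-size experiment (dataset and
volume names are hypothetical; recordsize only affects files written after the
change, and a zvol's volblocksize has to be chosen at creation time):

  # Try a smaller recordsize on an NFS-shared filesystem dataset
  zfs set recordsize=4k tank/vmstore

  # For the iSCSI/SRP LUNs, create a fresh zvol with a matching block size
  zfs create -b 4k -V 200G tank/ntfs-lun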

Can anyone suggest any further tests or have any idea about what's going on?

Thanks
-Matt


[zfs-discuss] SCSI timeouts with rPool on usb

2010-11-15 Thread Matthew Anderson
I'm currently having a few problems with my storage server. Server specs are -
OpenSolaris snv_134
Supermicro X8DTi motherboard
Intel Xeon 5520
6x 4GB DDR3
LSI RAID Card - running 24x 1.5TB SATA drives
Adaptec 2405 - running 4x Intel SSD X25-E's
Boots from an 8GB USB flash drive

The initial problem started after a while when the console showed SCSI timeouts
and the whole server locked up. After rebooting, the USB boot drive failed to
show up in the BIOS; a power cycle fixed the problem and the server worked
flawlessly for another 30-40 days until it locked up again. I inserted a second
8GB USB key and mirrored the rpool; again it only lasted 30 days and then
locked up. It definitely seems as though the USB drives are at fault, as the
other disks have so far performed perfectly.

The next step I was going to take was to convert the rpool to a 2.5-inch SATA
laptop drive. The problem is that when I attempted to run the SSDs from the
onboard controller they lasted about 30 minutes before the OS marked them as
degraded due to checksum errors. Running them from the 2405 controller has been
error-free so far, so I'm very wary of using the onboard controller. I was
thinking I could probably slice up the SSDs to take over the rpool. The two
read-cache SSDs are 60GB each, and missing 8GB from each wouldn't cause an issue.

Can anyone help out with the process to swing the rpool from the USB key to an
SSD? Or does anyone have a suggestion on what I could do to sort out the
problem?
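For reference, a hedged sketch of the usual attach/resilver/detach approach for
moving an rpool (device names and slice layout are hypothetical, and this
assumes an SMI-labeled slice 0 on the target that is at least as large as the
current rpool):

  # Attach the SSD slice as a mirror of the existing USB rpool device
  zpool attach rpool c8t0d0s0 c8t3d0s0

  # Wait for the resilver to finish
  zpool status rpool

  # Install the boot blocks on the new device (x86/GRUB)
  installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c8t3d0s0

  # Once resilvered and bootable, detach the USB key
  zpool detach rpool c8t0d0s0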

Thanks
-Matt




Re: [zfs-discuss] COMSTAR dropouts with dedup enabled

2010-06-15 Thread Matthew Anderson
Thanks Brandon,

This system has 24GB of RAM and currently no L2ARC. The total deduplicated data
was about 250GB, so I wouldn't have thought I would be out of RAM. I've removed
the LUN for the time being so I can't get the DDT size at the moment. I have
some X25-Es to go in as L2ARC and SLOG, so I'll revisit dedup soon to see if
that helps the issue.
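For reference, a hedged sketch of sizing the DDT once a dedup LUN is back in
place (the pool name 'tank' is a placeholder; a common rule of thumb is on the
order of a few hundred bytes of RAM per unique block held in the table):

  # Dedup histogram, including entry counts and in-core/on-disk sizes
  zdb -DD tank

  # Overall dedup ratio for the pool
  zpool get dedupratio tank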

-Matt


[zfs-discuss] COMSTAR dropouts with dedup enabled

2010-06-14 Thread Matthew Anderson
Hi All,

I currently use b134 and COMSTAR to deploy SRP targets for virtual machine
storage (VMware ESXi 4) and have run into some unusual behaviour when dedup is
enabled for a particular LUN. The target seems to lock up (ESX reports it as
unavailable) when writing large amounts of data or overwriting data; reads are
unaffected. The easiest way for me to replicate the problem was to restore a
2GB SQL database inside a VM. The dropouts lasted anywhere from 3 seconds to a
few minutes, and when connectivity is restored the other LUNs (without dedup)
drop out for a few seconds.

The problem didn't seem to occur with only a small amount of data on the LUN
(50GB) and happened more frequently as the LUN filled up. I've since moved all
data to non-dedup LUNs and I haven't seen a dropout for over a month. Does
anyone know why this is happening? I've also seen the same behaviour when
exporting iSCSI targets with COMSTAR. I haven't had a chance to install the
SSDs for L2ARC and SLOG yet, so I'm unsure if that will help the issue.

System specs are-
Single Xeon 5620
24GB DDR3
24x 1.5TB 7200rpm
LSI RAID card

Thanks
-Matt