I'm back to monitoring the file server situation, and just finished a perfmon session covering the 3rd through the 7th of this month. At the same time, I set up perfmon on the same workstation to monitor the backup server.
If anyone cares to help, I'd be deeply appreciative. I set up perfmon on a Win7 VM on an ESXi 4.1 host to take measurements at 60-second intervals of a whole bunch of counters, many of them probably just noise.

First, some history on the configuration: The file server is a Win2k3 R2 VM running on an ESX 3.5 host with 16 GB of RAM - it's one of 10 VMs on that host, and is definitely the heaviest hitter in terms of disk I/O. About 2.5-3 months ago we noticed that the time to completion for the weekly full backups spiked dramatically. Prior to that, the fulls would start around 7pm on a Friday and finish by about 7pm on Sunday. Now they take until Thursday or Friday to complete. This coincided with some changes to the environment: I had to move the VM to a new host (it was a manual copy - we don't have vMotion licensed and configured for these hosts), and at about the same time I also had to expand 2 of the 4 LUNs. Finally, the OS drive for the VM on the old host was on a LUN on our LeftHand unit - I had to migrate it to local disk storage on the VM's new home.

The 4 data drives for this VM are attached via the Microsoft iSCSI initiator running in the guest, not through VMware's iSCSI stack. At that point, all of the LUNs were on the LeftHand SAN, which is a 3-node cluster, and we use 2-way replication for all LUNs. The 2 LUNs that were expanded went to 2 TB or slightly beyond. The LeftHand cluster has two NSM 2060s and a P4300 G2, with 6, 6, and 8 disks respectively - a total of 20 disks.

Since that time, I've also added our EMC VNXe 3100, with 6 disks in a RAID 6 array, and have currently migrated 3 of the 4 data LUNs for the VM to it. I mention this because it means all of the file systems now on the VNXe are clean and defragged. I made sure to align the partitions on the EMC to a megabyte boundary.
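As a side note, alignment can be double-checked from inside the guest with "wmic partition get Name, StartingOffset" - any offset evenly divisible by 1,048,576 sits on a 1 MB boundary. A quick sketch of that check (the sample offsets below are hypothetical, not from this server):

```python
MB = 1024 * 1024

def is_aligned(offset_bytes, boundary=MB):
    """True if a partition's starting offset lands on the given boundary."""
    return offset_bytes % boundary == 0

# Hypothetical StartingOffset values, as reported by:
#   wmic partition get Name, StartingOffset
offsets = {
    "Disk #1, Partition #0": 1048576,  # 1 MB - aligned
    "Disk #2, Partition #0": 32256,    # old 63-sector offset, common on Win2k3 - not aligned
}
for name, off in offsets.items():
    print(f"{name}: {'aligned' if is_aligned(off) else 'NOT aligned'}")
```

Win2k3 creates partitions at the 63-sector (32,256-byte) offset by default, so it's worth checking any volumes that were built before the migration, not just the new ones on the EMC.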
So, to make this simpler to visualize, a little table:

c: - local disk on the ESX 3.5 host, 40 GB, 23.6 GB free
j: - iSCSI LUN on LeftHand, 2.5 TB, 900 GB free
k: - iSCSI LUN on VNXe, 1.98 TB, 336 GB free
l: - iSCSI LUN on VNXe, 1 TB, 79 GB free
m: - iSCSI LUN on VNXe, 750 GB, 425 GB free

I tried to capture separate disk queue stats for each LUN, but in spite of selecting and adding each drive letter separately in the perfmon interface, all I got was _Total. Selected stats are as follows:

PhysicalDisk counters:
Current Disk Queue Length - average 0.483, maximum 33.000
Avg. Disk Read Queue Length - average 0.037, maximum 1.294
% Disk Time - average 34.068, maximum 153.877
Avg. Disk Write Queue Length - average 0.645, maximum 2.828
Avg. Disk Queue Length - average 0.681, maximum 3.078

I have more data on PhysicalDisk, and data on other objects, including Memory, Network Interface, Paging File, Processor, and Server Work Queues. If anyone has thoughts, I'd surely like to hear them.

Thanks,
Kurt
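P.S. On the _Total problem: one thing worth trying is defining the counter log from the command line with logman instead of the perfmon GUI, using the (*) wildcard so every PhysicalDisk instance (each drive letter) is captured rather than just _Total. A sketch of such a collector definition - the collector name and output path are made up, and I haven't tested this against this particular server:

```shell
:: Define a counter-log data collector; "\PhysicalDisk(*)\..." expands
:: to one column per disk instance in the resulting CSV.
logman create counter PerLUN-Disk ^
  -c "\PhysicalDisk(*)\Avg. Disk Queue Length" ^
     "\PhysicalDisk(*)\Current Disk Queue Length" ^
  -si 60 -f csv -o C:\PerfLogs\perlun

logman start PerLUN-Disk
```

logman also takes -s <computername> if you want to keep collecting remotely from the Win7 monitoring VM.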