Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-10-01 Thread William D. Hathaway
You might also want to try toggling the Nagle TCP setting to see if that helps
with your workload:
ndd -get /dev/tcp tcp_naglim_def
(save that value; the default is 4095)
ndd -set /dev/tcp tcp_naglim_def 1

If there is no (or a negative) difference, set it back to the original value:
ndd -set /dev/tcp tcp_naglim_def 4095 (or whatever it was)
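
If the lower value helps and you want it to survive a reboot (ndd settings are
not persistent), one common approach -- only a sketch, and the script name is
hypothetical -- is a small rc script that reapplies the setting at boot:

  # /etc/rc2.d/S99tcptune (hypothetical name and location)
  /usr/sbin/ndd -set /dev/tcp tcp_naglim_def 1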


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread gm_sjo
2008/9/30 Jean Dion [EMAIL PROTECTED]:
> iSCSI requires a dedicated network, not a shared network or even a VLAN.
> Backups cause large I/O that fills your network quickly, like any SAN today.

Could you clarify why it is not suitable to use VLANs for iSCSI?


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Jean Dion
Simple.  You cannot go faster than the slowest link.

VLANs share the bandwidth of the underlying link and do not provide dedicated
bandwidth for each of them.  That means if you have multiple VLANs coming out
of the same wire on your server, you do not have "n" times the bandwidth but
only a fraction of it.  Simple network math.

Also, iSCSI works better on segregated IP network switches.  Beware that some
switches do not guarantee full 1 Gbit/s on all ports when all are active at
the same time.  Plan multiple uplinks if you have more than one switch.  Once
again, you cannot go faster than the slowest link.

Jean




Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Gary Mills
On Mon, Sep 29, 2008 at 06:01:18PM -0700, Jean Dion wrote:
> Do you have dedicated iSCSI ports from your server to your NetApp?

Yes, it's a dedicated redundant gigabit network.

> iSCSI requires a dedicated network, not a shared network or even a VLAN.
> Backups cause large I/O that fills your network quickly, like any SAN today.
>
> Backups are extremely demanding on hardware (CPU, memory, I/O ports, disks,
> etc.).  It is not rare to see performance issues during backups of several
> thousand small files.  Each small file causes seeks on your disks and file
> system.
>
> As the number and size of files grow, you will be impacted.  That means
> thousands of small files cause thousands of small I/Os but not a lot of
> throughput.

What statistics can I generate to observe this contention?  ZFS pool I/O
statistics are not that different when the backup is running.
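
One way to make that comparison (a sketch; the 10-second interval is
arbitrary) is to capture per-device service times and pool I/O both during
and outside the backup window:

  # per-device utilization, queue lengths and service times
  iostat -xnz 10
  # per-LUN view of the same pool as ZFS sees it
  zpool iostat -v space 10

If the backup is saturating the LUNs, %b and asvc_t in the iostat output
should climb noticeably during the window.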

> The bigger your files are, the more likely the blocks will be consecutive on
> the file system.  Small files can be spread across the entire file system,
> causing seeks, latency, and bottlenecks.
>
> Legato client and server contain tuning parameters to avoid such small-file
> problems.  Check your Legato buffer parameters.  These buffers will use your
> server memory as disk cache.

I'll ask our backup person to investigate those settings.  I assume that
Networker should not be buffering files since those files won't be read
again.  How can I see memory usage by ZFS and by applications?
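
A rough way to see where memory is going (a sketch using standard Solaris 10
tools) is to compare the ARC size against kernel and application memory:

  # ZFS ARC statistics, including current size and target
  kstat -n arcstats
  # kernel vs. anonymous (application) vs. free memory breakdown
  echo ::memstat | mdb -k
  # largest processes by resident set size
  prstat -s rss -n 10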

> Here is a good source of network tuning parameters for your T2000:
> http://www.solarisinternals.com/wiki/index.php/Networks#Tunable_for_general_workloads_on_T1000.2FT2000
>
> The soft_ring setting is one of the best ones.
>
> Here is another interesting place to look:
> http://www.solarisinternals.com/wiki/index.php/Solaris_Internals_and_Performance_FAQ

Thanks.  I'll review those documents.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Jean Dion
For Solaris internal debugging tools, look here:
http://opensolaris.org/os/community/advocacy/events/techdays/seattle/OS_SEA_POD_JMAURO.pdf

ZFS specifics are available here:
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

Jean





Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread William D. Hathaway
Gary -
   Besides the network questions...

   What does your zpool status look like?


   Are you using compression on the file systems?
   (Compression used to be single-threaded; that was fixed in s10u4 or
   equivalent patches.)


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread gm_sjo
2008/9/30 Jean Dion [EMAIL PROTECTED]:
> Simple.  You cannot go faster than the slowest link.

That is indeed correct, but what is the slowest link when using a
Layer 2 VLAN? You made a broad statement that iSCSI 'requires' a
dedicated, standalone network. I do not believe this is the case.

> VLANs share the bandwidth of the underlying link and do not provide dedicated
> bandwidth for each of them.  That means if you have multiple VLANs coming out
> of the same wire on your server, you do not have "n" times the bandwidth but
> only a fraction of it.  Simple network math.

I can only assume that you are referring to VLAN trunks, e.g. using
a NIC on a server for both 'normal' traffic and having another virtual
interface on it bound to a 'storage' VLAN. If this is the case then
what you say is true: of course you are sharing the same physical link,
so ultimately that will be the limit.

However, and this should be clarified before anyone gets the wrong
idea, there is nothing wrong with segmenting a switch by using VLANs
to have some ports for storage traffic and some ports for 'normal'
traffic. You can have one or more NICs for storage and one or more
NICs for everything else (or however you please to use your
interfaces!). These can be hooked up to switch ports that are on
different VLANs with no performance degradation.

It's best not to assume that every use of a VLAN is a trunk.

> Also, iSCSI works better on segregated IP network switches.  Beware that some
> switches do not guarantee full 1 Gbit/s on all ports when all are active at
> the same time.  Plan multiple uplinks if you have more than one switch.  Once
> again, you cannot go faster than the slowest link.

I think it's fairly safe to assume that you're going to get per-port
line speed on anything other than the cheapest budget switches.
Most SMB (and above) switches will be rated at, say, 48 Gbit/s of
backplane on a 24-port unit, for example.

However, I am keen to see any benchmarks you may have that show the
performance difference between running a single switch with Layer 2
VLANs vs. two separate switches.


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Jean Dion
A normal iSCSI setup splits network traffic at the physical layer, not the
logical layer.  That means separate physical ports and, if you can, separate
physical PCI bridge chips.  That will be fine for small traffic, but we are
talking about backup performance issues, and the IP network and the number of
small files are very often the bottlenecks.

If you want performance, you do not put all your I/O across the same physical
wire.  Once again, you cannot go faster than the physical wire can support
(CAT5E, CAT6, fibre), no matter whether it is Layer 2 or not.  Using VLANs on
a single port, you "share" the bandwidth; Layer 2 does not create more
gigabits of speed.

iSCSI best practice calls for a separate physical network.  Many books and
white papers have been written about this.

This is like any FC SAN implementation: we always split the workload between
disk and tape using more than one HBA.  Never forget, backups are intensive
I/O and will fill the entire I/O path.

Jean




Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Gary Mills
On Tue, Sep 30, 2008 at 10:32:50AM -0700, William D. Hathaway wrote:
> Gary -
>    Besides the network questions...

Yes, I suppose I should see if traffic on the iSCSI network is
hitting a limit of some sort.

>    What does your zpool status look like?

Pretty simple:

  $ zpool status
    pool: space
   state: ONLINE
   scrub: none requested
  config:

          NAME                                     STATE     READ WRITE CKSUM
          space                                    ONLINE       0     0     0
            c4t60A98000433469764E4A2D456A644A74d0  ONLINE       0     0     0
            c4t60A98000433469764E4A2D456A696579d0  ONLINE       0     0     0
            c4t60A98000433469764E4A476D2F6B385Ad0  ONLINE       0     0     0
            c4t60A98000433469764E4A476D2F664E4Fd0  ONLINE       0     0     0

  errors: No known data errors

The four LUNs use the built-in I/O multipathing, with separate iSCSI
networks, switches, and ethernet interfaces.

>    Are you using compression on the file systems?
>    (Compression used to be single-threaded; that was fixed in s10u4 or
>    equivalent patches.)

No, I've never enabled compression there.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Gary Mills
On Mon, Sep 29, 2008 at 06:01:18PM -0700, Jean Dion wrote:
> Legato client and server contain tuning parameters to avoid such small-file
> problems.  Check your Legato buffer parameters.  These buffers will use your
> server memory as disk cache.

Our backup person tells me that there are no settings in Networker
that affect buffering on the client side.

> Here is a good source of network tuning parameters for your T2000:
> http://www.solarisinternals.com/wiki/index.php/Networks#Tunable_for_general_workloads_on_T1000.2FT2000
>
> The soft_ring setting is one of the best ones.

Those references are for network tuning.  I don't want to change
things blindly.  How do I tell if they are necessary, that is, if
the network is the bottleneck in the I/O system?
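
As a rough first check (a sketch; the interface name e1000g0 and the kstat
statistic names are only examples and vary by driver), per-interface counters
will show whether the gigabit links saturate during the backup window:

  # packet and error counters for one interface, 10-second samples
  netstat -i -I e1000g0 10
  # raw byte counters for the same interface
  kstat -p 'e1000g:0::rbytes64' 'e1000g:0::obytes64' 10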

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread gm_sjo
2008/9/30 Jean Dion [EMAIL PROTECTED]:
> If you want performance, you do not put all your I/O across the same physical
> wire.  Once again, you cannot go faster than the physical wire can support
> (CAT5E, CAT6, fibre), no matter whether it is Layer 2 or not.  Using VLANs on
> a single port, you "share" the bandwidth; Layer 2 does not create more
> gigabits of speed.
>
> iSCSI best practice calls for a separate physical network.  Many books and
> white papers have been written about this.

Yes, that's true, but I don't believe you mentioned single-NIC
implementations in your original statement. Just seeking clarification
to help others :-)

I think it's worth clarifying that iSCSI over VLANs is okay as long as
people appreciate that you will require separate interfaces to get the
best performance.


Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-30 Thread Richard Elling
gm_sjo wrote:
> 2008/9/30 Jean Dion [EMAIL PROTECTED]:
>> If you want performance, you do not put all your I/O across the same
>> physical wire.  Once again, you cannot go faster than the physical wire can
>> support (CAT5E, CAT6, fibre), no matter whether it is Layer 2 or not.  Using
>> VLANs on a single port, you "share" the bandwidth; Layer 2 does not create
>> more gigabits of speed.
>>
>> iSCSI best practice calls for a separate physical network.  Many books and
>> white papers have been written about this.
>
> Yes, that's true, but I don't believe you mentioned single-NIC
> implementations in your original statement. Just seeking clarification
> to help others :-)
>
> I think it's worth clarifying that iSCSI over VLANs is okay as long as
> people appreciate that you will require separate interfaces to get the
> best performance.

Separate interfaces or networks may not be required, but properly sized
networks are highly desirable.  For example, a back-of-the-envelope analysis
shows that a single 10GbE pipe is sufficient to drive 8 T10KB drives.
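
(For the arithmetic behind that estimate, assuming roughly 120 MB/s of native
throughput per T10000B drive: 8 x 120 MB/s = 960 MB/s, or about 7.7 Gbit/s,
which fits within the roughly 1.25 GB/s a single 10GbE link can carry.)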
 -- richard



Re: [zfs-discuss] ZFS performance degradation when backups are running

2008-09-29 Thread Jean Dion
Do you have dedicated iSCSI ports from your server to your NetApp?  

iSCSI requires a dedicated network, not a shared network or even a VLAN.
Backups cause large I/O that fills your network quickly, like any SAN today.

Backups are extremely demanding on hardware (CPU, memory, I/O ports, disks,
etc.).  It is not rare to see performance issues during backups of several
thousand small files.  Each small file causes seeks on your disks and file
system.

As the number and size of files grow, you will be impacted.  That means
thousands of small files cause thousands of small I/Os but not a lot of
throughput.

The bigger your files are, the more likely the blocks will be consecutive on
the file system.  Small files can be spread across the entire file system,
causing seeks, latency, and bottlenecks.

Legato client and server contain tuning parameters to avoid such small-file
problems.  Check your Legato buffer parameters.  These buffers will use your
server memory as disk cache.

Here is a good source of network tuning parameters for your T2000:
http://www.solarisinternals.com/wiki/index.php/Networks#Tunable_for_general_workloads_on_T1000.2FT2000

The soft_ring setting is one of the best ones.
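
As a sketch of what that tunable looks like (the value shown is only an
example; check the wiki page above before applying it), soft rings are
typically raised in /etc/system and take effect after a reboot:

  * /etc/system -- spread inbound packet processing over more soft rings
  * (example value only)
  set ip:ip_soft_rings_cnt=16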

Here is another interesting place to look:
http://www.solarisinternals.com/wiki/index.php/Solaris_Internals_and_Performance_FAQ


Jean Dion
Storage Architect 
Data Management Ambassador


[zfs-discuss] ZFS performance degradation when backups are running

2008-09-27 Thread Gary Mills
We have a moderately sized Cyrus installation with 2 TB of storage
and a few thousand simultaneous IMAP sessions.  When one of the
backup processes is running during the day, there's a noticeable
slowdown in IMAP client performance.  When I start my `mutt' mail
reader, it pauses for several seconds at `Selecting INBOX'.  That
behavior disappears when the backup finishes.

The IMAP server is a T2000 with six ZFS filesystems that correspond to
Cyrus partitions.  Storage is four iSCSI LUNs from our NetApp filer.
The backup in question is done with EMC Networker.  I've looked at
zpool I/O statistics when the backup is running, but there's nothing
clearly wrong.

I'm wondering if perhaps all the read activity by the backup system
is causing trouble with ZFS' caching.  Is there some way to examine
this area?
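
One way to examine that area (a sketch; arcstats is the standard ZFS ARC kstat
module) is to watch the ARC hit rate and size while the backup runs:

  # print all ARC counters (hits, misses, size, target) every 10 seconds
  kstat -n arcstats 10

A falling hit rate during the backup window would suggest the backup's read
stream is evicting the IMAP working set from the ARC.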

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-