Re: [zfs-discuss] Ideal SATA/SAS Controllers for ZFS

2010-05-18 Thread Marc Bevand
The LSI SAS1064E slipped through the cracks when I built the list.
This is a 4-port PCIe x8 HBA with very good Solaris (and Linux)
support. I don't remember having seen it mentioned on zfs-discuss@
before, even though many were looking for 4-port controllers. Perhaps
the fact that it is priced so close to 8-port models explains why it has
gone relatively unnoticed. That said, the wide x8 PCIe link makes it the
*cheapest* controller able to feed 300-350 MB/s to at least 4 ports
concurrently. Now added to my list.
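For anyone who wants to sanity-check that bandwidth claim, a rough back-of-the-envelope
calculation (the ~250 MB/s of usable bandwidth per PCIe 1.x lane is my assumption, not a
figure from the datasheet):

    echo $(( 8 * 250 ))   # x8 PCIe 1.x link: ~2000 MB/s on the host side (assumed per-lane rate)
    echo $(( 4 * 350 ))   # four ports streaming at 350 MB/s each: ~1400 MB/s needed

So even with protocol overhead an x8 link has headroom for four fast ports, whereas a
narrower link would be marginal.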

-mrb

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using WD Green drives?

2010-05-18 Thread Pasi Kärkkäinen
On Mon, May 17, 2010 at 03:12:44PM -0700, Erik Trimble wrote:
 On Mon, 2010-05-17 at 12:54 -0400, Dan Pritts wrote:
  On Mon, May 17, 2010 at 06:25:18PM +0200, Tomas Ögren wrote:
   Resilver does a whole lot of random io itself, not bulk reads.. It reads
   the filesystem tree, not block 0, block 1, block 2... You won't get
   60MB/s sustained, not even close.
  
  Even with large, unfragmented files?  
  
  danno
  --
  Dan Pritts, Sr. Systems Engineer
  Internet2
  office: +1-734-352-4953 | mobile: +1-734-834-7224
 
 Having large, unfragmented files will certainly help keep sustained
 throughput.  But, also, you have to consider the amount of deletions
 done on the pool.
 
 For instance, let's say you wrote files A, B, and C one right after
 another, and they're all big files.  Doing a resilver, you'd be pretty
 well off getting reasonable throughput reading A, then B, then C,
 since they're going to be contiguous on the drive (both internally, and
 across the three files).  However, if you have deleted B at some point,
 and say wrote a file D (where D < B in size) into B's old space, then,
 well, you seek to A, read A, seek forward to C, read C, seek back to D,
 etc.
 
 Thus, you'll get good throughput for resilver on these drives pretty
 much in just ONE case:  large files with NO deletions.  If you're using
 them for write-once/read-many/no-delete archives, then you're OK.
 Anything else is going to suck.
 
 :-)
 

So basically, if you have a lot of small files with a lot of changes
and deletions, resilver is going to be really slow.

Sounds like traditional RAID would be better/faster to rebuild in this 
case.

-- Pasi

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Pool recovery from replaced disks.

2010-05-18 Thread Demian Phillips
Is it possible to recover a pool (as it was) from a set of disks that
were replaced during a capacity upgrade?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pool recovery from replaced disks.

2010-05-18 Thread Thomas Burgess
Wow, that's a truly excellent question.

If you COULD do it, it might work with a simple import...

But I have no idea... I'd love to know myself.


On Tue, May 18, 2010 at 7:06 AM, Demian Phillips
demianphill...@gmail.com wrote:

 Is it possible to recover a pool (as it was) from a set of disks that
 were replaced during a capacity upgrade?
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Very serious performance degradation

2010-05-18 Thread Philippe
Hi,

I'm running OpenSolaris 2009.06, and I'm facing a serious performance loss with 
ZFS! It's a raidz1 pool, made of 4 x 1TB SATA disks:

        zfs_raid    ONLINE   0 0 0
          raidz1    ONLINE   0 0 0
            c7t2d0  ONLINE   0 0 0
            c7t3d0  ONLINE   0 0 0
            c7t4d0  ONLINE   0 0 0
            c7t5d0  ONLINE   0 0 0

In the beginning, when the pool was just created (and empty!), I had the 
following performance:
 - Read : 200 MB/s
 - Write : 20 MB/s (10 MB/s with compression enabled)
This performance was OK at the time.

However, after 2 months of production use, and a data volume of only 1TB, 
performance is now near:
 - Read : 5 MB/s
 - Write : 500 KB/s
The write speed is so low that it breaks any network copy (Samba or SFTP). The 
only solution I found to copy large files to the pool without outages is to use 
SFTP via FileZilla, with the bandwidth limit enabled (limit=300 KB/s)!!!

In this pool, I have 18 filesystems defined:
 - 4 FS have a recordsize of 16KB (with a total of 100 GB of data)
 - 14 FS have a recordsize of 128KB (with a total of 900 GB of data)
There is a total of 284 snapshots on the pool, and compression is enabled.
There is 3 GB of physical RAM.

The pool is used for daily backups, with rsync. Some big files are updated 
simultaneously, in different FS. So I suspect heavy fragmentation of the 
files! Or maybe... a need for more RAM??
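For readers trying to reproduce this kind of report, a few commands that gather the
numbers mentioned above (a sketch only; the pool name zfs_raid is taken from the status
output above, and the property list is just an example):

    zpool list zfs_raid                                  # capacity and how full the pool is
    zfs list -r -o name,used,available,recordsize,compression zfs_raid
    zfs list -H -t snapshot -r zfs_raid | wc -l          # count snapshots
    prtconf | grep -i memory                             # physical RAM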

Thank you for any thoughts !!
Philippe
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread Bob Friesenhahn

On Tue, 18 May 2010, Philippe wrote:


The pool is used for daily backups, with rsync. Some big files are updated 
simultaneously, in different FS. So I suspect heavy fragmentation of the 
files! Or maybe... a need for more RAM??


You forgot to tell us what brand/model of disks you are using, and the 
controller type.


It seems likely that one or more of your disks have been barely working since 
the time of initial installation.  Even 20 MB/s is quite slow.


Use 'iostat -x 30' with an I/O load to see if one disk is much slower 
than the others.
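A small addition that may save some head-scratching: the plain -x output labels the
disks sd0, sd1, ..., while zpool status shows c7t2d0-style names. Adding -n makes
iostat print the latter, so a slow disk can be matched to the pool member directly
(a sketch, not a prescription):

    iostat -xn 30    # watch asvc_t and %b; one disk far above its siblings is the suspect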


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using WD Green drives?

2010-05-18 Thread Dan Pritts
On Tue, May 18, 2010 at 09:40:15AM +0300, Pasi Kärkkäinen wrote:
  Thus, you'll get good throughput for resilver on these drives pretty
  much in just ONE case:  large files with NO deletions.  If you're using
  them for write-once/read-many/no-delete archives, then you're OK.
  Anything else is going to suck.

thanks for pointing out the obvious.  :)

Still, though, this is basically true for ANY drive.

It's worse for slower RPM drives, but it's not like resilvers will
exactly be fast with 7200rpm drives, either.

danno
--
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224

Visit our website: www.internet2.edu
Follow us on Twitter: www.twitter.com/internet2
Become a Fan on Facebook: www.internet2.edu/facebook
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS in campus clusters

2010-05-18 Thread John Hoogerdijk
I'm building a campus cluster with identical storage in two locations, with ZFS 
mirrors spanning both storage frames. Data will be mirrored using ZFS.  I'm 
looking for the best way to add log devices to this campus cluster.

I am considering building a separate mirrored zpool of Flash disks that spans the 
frames, then creating zvols to use as log devices for the data zpool.  Will 
this work?  Any other suggestions?

regards,

jmh
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS in campus clusters

2010-05-18 Thread Darren J Moffat

On 18/05/2010 15:40, John Hoogerdijk wrote:

I'm building a campus cluster with identical storage in two locations with ZFS 
mirrors spanning both storage frames. Data will be mirrored using zfs.  I'm 
looking for the best way to add log devices to this campus cluster.


So this is a single pool with one side of the mirror in location A and 
one side in location B ?


Log devices can be mirrored too, so why not just put a log device in 
each frame and mirror them just like you do the normal pool disks.
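For the archives, the command form for that suggestion looks roughly like this
(the pool name and device names are placeholders for whatever LUN each frame presents):

    zpool add datapool log mirror c1t10d0 c2t10d0   # one log device per frame, mirrored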


What am I missing about your setup that means that won't work ?

--
Darren J Moffat
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread Philippe
Hi,

The 4 disks are Western Digital ATA 1TB (one is slightly different):
 1 x ATA-WDC WD10EACS-00D-1A01-931.51GB
 3 x ATA-WDC WD10EARS-00Y-0A80-931.51GB

I've done lots of tests (speed tests + SMART reports) with each of these 4 disks 
on another system (another computer, running Windows 2003 x64), and everything 
was fine! The 4 disks operate well, at 50-100 MB/s (tested with HD Tune), and 
the access time is 14 ms.

The controller is an LSI Logic SAS 1068-IR (MPT BIOS 6.12.00.00 - 31/10/2006)

Here are some stats:

1) cp of a big file to a ZFS filesystem (128K recordsize):
==========================================================
iostat -x 30
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          0.3    0.3    17.6     2.3  0.0  0.0   19.5   0   0
sd2         11.5    6.0   350.1   154.5  0.0  0.3   19.5   0   4
sd3         12.5    5.7   351.4   154.5  0.0  0.5   27.1   0   5
sd4         15.9    6.3   615.1   153.8  0.0  1.3   58.2   0   8
sd5         15.1    8.1   600.4   150.7  0.0  7.6  326.7   0  31
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1         41.3    0.0  5289.7     0.0  0.0  1.3   31.0   0   4
sd2          4.2   24.1   214.0  1183.0  0.0  0.5   19.4   0   4
sd3          3.7   23.6   227.2  1183.0  0.0  2.1   78.5   0  12
sd4          6.6   26.4   374.2  1179.4  0.0 10.1  306.5   0  35
sd5          4.3   31.0   369.6   973.3  0.0 22.0  622.0   0  96
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1         17.1    0.0  2184.6     0.0  0.0  0.5   30.6   0   2
sd2          1.6   12.3   116.4   570.9  0.0  0.6   41.3   0   3
sd3          1.6   12.1   107.6   570.9  0.0 10.3  754.7   0  33
sd4          2.1   12.6   187.1   569.4  0.0  9.4  634.7   0  28
sd5          0.4   21.7    25.6   700.6  0.0 29.5 1338.1   0  96


2) cp of a big file to a ZFS filesystem (16K recordsize):
==========================================================
iostat -x 30
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          0.2    0.3    16.7     2.3  0.0  0.0   19.3   0   0
sd2         11.5    6.0   350.7   154.5  0.0  0.3   19.5   0   4
sd3         12.5    5.7   352.0   154.5  0.0  0.5   27.0   0   5
sd4         15.9    6.3   616.2   153.8  0.0  1.3   58.0   0   8
sd5         15.1    8.1   601.3   150.7  0.0  7.5  324.6   0  31
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1         32.0    0.0  4095.9     0.0  0.0  1.0   30.8   0   3
sd2          2.0   22.4   124.2   425.0  0.0  0.1    2.3   0   2
sd3          1.9   19.4   115.9   425.0  0.0  0.6   28.7   0  14
sd4          2.3   23.6   170.9   421.8  0.0  3.2  124.7   0  15
sd5          3.2   24.5   290.6   306.6  0.0 22.5  810.5   0  94
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd2          0.0    2.0     0.0     3.0  0.0  0.0    0.7   0   0
sd3          0.1    1.1     4.3     2.0  0.0  0.0   15.9   0   1
sd4          0.1    1.4     4.3     1.9  0.0  0.0    2.9   0   0
sd5          0.2   19.8    10.7   101.8  0.0 32.1 1606.9   0 100
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          8.6    0.0  1096.2     0.0  0.0  0.3   29.7   0   1
sd2          0.2    4.8    10.7   267.2  0.0  0.0    7.8   0   0
sd3          0.2    5.5     6.8   268.2  0.0  0.6  107.0   0   3
sd4          0.2    9.1    11.0   265.4  0.0  6.3  678.4   0  21
sd5          0.2   21.4     6.8   104.5  0.0 31.6 1467.8   0  92
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd2          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd3          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd4          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd5          0.0   18.9     0.0   101.7  0.0 35.0 1851.6   0 100
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd2          0.3    5.9    15.3   279.2  0.0  0.0    6.7   0   1
sd3          0.4    5.7    23.5   279.2  0.0  1.0  161.5   0   5
sd4          0.4   11.6    23.8   275.6  0.0 11.6  964.3   0  36
sd5          0.2   20.6    13.1   107.2  0.0 30.2 1452.7   0  99
                  extended 

Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread John J Balestrini
Howdy,

Is dedup on? I was having some pretty strange problems including slow 
performance when dedup was on. Disabling dedup helped out a whole bunch. My 
system only has 4gig of ram, so that may have played a part too.

Good luck!

John


On May 18, 2010, at 7:51 AM, Philippe wrote:

 Hi,
 
 The 4 disks are Western Digital ATA 1TB (one is slightly different):
 1 x ATA-WDC WD10EACS-00D-1A01-931.51GB
 3 x ATA-WDC WD10EARS-00Y-0A80-931.51GB
 
 I've done lots of tests (speed tests + SMART reports) with each of these 4 
 disks on another system (another computer, running Windows 2003 x64), and 
 everything was fine! The 4 disks operate well, at 50-100 MB/s (tested with 
 HD Tune), and the access time is 14 ms.
 
 The controller is an LSI Logic SAS 1068-IR (MPT BIOS 6.12.00.00 - 31/10/2006)
 

Re: [zfs-discuss] ZFS in campus clusters

2010-05-18 Thread John Hoogerdijk
 On 18/05/2010 15:40, John Hoogerdijk wrote:
  I'm building a campus cluster with identical
 storage in two locations with ZFS mirrors spanning
 both storage frames. Data will be mirrored using zfs.
 I'm looking for the best way to add log devices to
  this campus cluster.
 So this is a single pool with one side of the mirror
 in location A and 
 one side in location B ?
 
 Log devices can be mirrored too, so why not just put
 a log device in 
 each frame and mirror them just like you do the
 normal pool disks.
 
 What am I missing about your setup that means that
 won't work ?

Yes - mirrored log devices will work. I want to share the Flash devices with 
more than one clustered zone/zpool (I should have stated this earlier...), hence 
the use of zvols.
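In case it helps anyone searching the archives later, the zvol approach described
here would look roughly like the following (pool, volume and device names are made
up for the example, and whether this layering is wise is exactly the open question
in this thread):

    zpool create flashpool mirror c1t20d0 c2t20d0               # mirrored Flash pool spanning the frames
    zfs create -V 8G flashpool/log-datapool                     # one zvol per clustered zpool
    zpool add datapool log /dev/zvol/dsk/flashpool/log-datapool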

jmh
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread Tomas Ögren
On 18 May, 2010 - Philippe sent me these 6,0K bytes:

 Hi,
 
 The 4 disks are Western Digital ATA 1TB (one is slightly different):
  1 x ATA-WDC WD10EACS-00D-1A01-931.51GB
  3 x ATA-WDC WD10EARS-00Y-0A80-931.51GB
 
 I've done lots of tests (speed tests + SMART reports) with each of these 4 
 disks on another system (another computer, running Windows 2003 x64), and 
 everything was fine! The 4 disks operate well, at 50-100 MB/s (tested with 
 HD Tune), and the access time is 14 ms.
 
 The controller is an LSI Logic SAS 1068-IR (MPT BIOS 6.12.00.00 - 31/10/2006)
 
 Here are some stats:
 
 1) cp of a big file to a ZFS filesystem (128K recordsize):
 ==========================================================
 iostat -x 30
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1          0.3    0.3    17.6     2.3  0.0  0.0   19.5   0   0
 sd2         11.5    6.0   350.1   154.5  0.0  0.3   19.5   0   4
 sd3         12.5    5.7   351.4   154.5  0.0  0.5   27.1   0   5
 sd4         15.9    6.3   615.1   153.8  0.0  1.3   58.2   0   8
 sd5         15.1    8.1   600.4   150.7  0.0  7.6  326.7   0  31
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1         41.3    0.0  5289.7     0.0  0.0  1.3   31.0   0   4
 sd2          4.2   24.1   214.0  1183.0  0.0  0.5   19.4   0   4
 sd3          3.7   23.6   227.2  1183.0  0.0  2.1   78.5   0  12
 sd4          6.6   26.4   374.2  1179.4  0.0 10.1  306.5   0  35
 sd5          4.3   31.0   369.6   973.3  0.0 22.0  622.0   0  96
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1         17.1    0.0  2184.6     0.0  0.0  0.5   30.6   0   2
 sd2          1.6   12.3   116.4   570.9  0.0  0.6   41.3   0   3
 sd3          1.6   12.1   107.6   570.9  0.0 10.3  754.7   0  33
 sd4          2.1   12.6   187.1   569.4  0.0  9.4  634.7   0  28
 sd5          0.4   21.7    25.6   700.6  0.0 29.5 1338.1   0  96

Umm... the service times of sd3..5 are way too high for healthy disks.
21 writes shouldn't take 1.3 seconds.

Some of your disks are not feeling well, possibly doing
block reallocation like mad all the time, or block recovery of some
form. Service times should be closer to what sd1 and sd2 are doing.
sd2, sd3 and sd4 seem to be getting about the same amount of read+write,
yet the service times of sd3 and sd4 are 15-20 times higher than sd2's.
This will lead to crap performance (and probably a broken array before long).
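Two more commands that can help tell a sick disk from a sick cable (both ship with
OpenSolaris; this is just a pointer, not a diagnosis):

    iostat -En          # per-device soft/hard/transport error counters, plus vendor/model/serial
    fmdump -eV | more   # FMA error telemetry, including disk and transport events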

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread Matt Cowger
I note in your iostat data below that one drive (sd5) consistently performs 
MUCH worse than the others, even when doing less work.

-Original Message-
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of John J Balestrini
Sent: Tuesday, May 18, 2010 8:11 AM
To: OpenSolaris ZFS discuss
Subject: Re: [zfs-discuss] Very serious performance degradation

Howdy,

Is dedup on? I was having some pretty strange problems including slow 
performance when dedup was on. Disabling dedup helped out a whole bunch. My 
system only has 4gig of ram, so that may have played a part too.

Good luck!

John


On May 18, 2010, at 7:51 AM, Philippe wrote:

 Hi,
 
 The 4 disks are Western Digital ATA 1TB (one is slightly different):
 1 x ATA-WDC WD10EACS-00D-1A01-931.51GB
 3 x ATA-WDC WD10EARS-00Y-0A80-931.51GB
 
 I've done lots of tests (speed tests + SMART reports) with each of these 4 
 disks on another system (another computer, running Windows 2003 x64), and 
 everything was fine! The 4 disks operate well, at 50-100 MB/s (tested with 
 HD Tune), and the access time is 14 ms.
 
 The controller is an LSI Logic SAS 1068-IR (MPT BIOS 6.12.00.00 - 31/10/2006)
 
 Here are some stats:
 
 1) cp of a big file to a ZFS filesystem (128K recordsize):
 ==========================================================
 iostat -x 30
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1          0.3    0.3    17.6     2.3  0.0  0.0   19.5   0   0
 sd2         11.5    6.0   350.1   154.5  0.0  0.3   19.5   0   4
 sd3         12.5    5.7   351.4   154.5  0.0  0.5   27.1   0   5
 sd4         15.9    6.3   615.1   153.8  0.0  1.3   58.2   0   8
 sd5         15.1    8.1   600.4   150.7  0.0  7.6  326.7   0  31
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1         41.3    0.0  5289.7     0.0  0.0  1.3   31.0   0   4
 sd2          4.2   24.1   214.0  1183.0  0.0  0.5   19.4   0   4
 sd3          3.7   23.6   227.2  1183.0  0.0  2.1   78.5   0  12
 sd4          6.6   26.4   374.2  1179.4  0.0 10.1  306.5   0  35
 sd5          4.3   31.0   369.6   973.3  0.0 22.0  622.0   0  96
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1         17.1    0.0  2184.6     0.0  0.0  0.5   30.6   0   2
 sd2          1.6   12.3   116.4   570.9  0.0  0.6   41.3   0   3
 sd3          1.6   12.1   107.6   570.9  0.0 10.3  754.7   0  33
 sd4          2.1   12.6   187.1   569.4  0.0  9.4  634.7   0  28
 sd5          0.4   21.7    25.6   700.6  0.0 29.5 1338.1   0  96
 
 
 2) cp of a big file to a ZFS filesystem (16K recordsize):
 ==========================================================
 iostat -x 30
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1          0.2    0.3    16.7     2.3  0.0  0.0   19.3   0   0
 sd2         11.5    6.0   350.7   154.5  0.0  0.3   19.5   0   4
 sd3         12.5    5.7   352.0   154.5  0.0  0.5   27.0   0   5
 sd4         15.9    6.3   616.2   153.8  0.0  1.3   58.0   0   8
 sd5         15.1    8.1   601.3   150.7  0.0  7.5  324.6   0  31
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1         32.0    0.0  4095.9     0.0  0.0  1.0   30.8   0   3
 sd2          2.0   22.4   124.2   425.0  0.0  0.1    2.3   0   2
 sd3          1.9   19.4   115.9   425.0  0.0  0.6   28.7   0  14
 sd4          2.3   23.6   170.9   421.8  0.0  3.2  124.7   0  15
 sd5          3.2   24.5   290.6   306.6  0.0 22.5  810.5   0  94
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd2          0.0    2.0     0.0     3.0  0.0  0.0    0.7   0   0
 sd3          0.1    1.1     4.3     2.0  0.0  0.0   15.9   0   1
 sd4          0.1    1.4     4.3     1.9  0.0  0.0    2.9   0   0
 sd5          0.2   19.8    10.7   101.8  0.0 32.1 1606.9   0 100
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1          8.6    0.0  1096.2     0.0  0.0  0.3   29.7   0   1
 sd2          0.2    4.8    10.7   267.2  0.0  0.0    7.8   0   0
 sd3          0.2    5.5     6.8   268.2  0.0  0.6  107.0   0   3
 sd4          0.2    9.1    11.0   265.4  0.0  6.3  678.4   0  21
 sd5          0.2   21.4     6.8   104.5  0.0 31.6 1467.8   0  92
                   extended device statistics
 device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
 sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
 sd1          0.0    0.0  

Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread Bob Friesenhahn

On Tue, 18 May 2010, Philippe wrote:


The 4 disks are Western Digital ATA 1TB (one is slightly different):
1 x ATA-WDC WD10EACS-00D-1A01-931.51GB
3 x ATA-WDC WD10EARS-00Y-0A80-931.51GB

I've done lots of tests (speed tests + SMART reports) with each of these 4 disks 
on another system (another computer, running Windows 2003 x64), and everything 
was fine! The 4 disks operate well, at 50-100 MB/s (tested with HD Tune), and 
the access time is 14 ms.

The controller is an LSI Logic SAS 1068-IR (MPT BIOS 6.12.00.00 - 31/10/2006)

Here are some stats:
                  extended device statistics
device       r/s    w/s    kr/s    kw/s wait actv  svc_t  %w  %b
sd0          0.0    0.0     0.0     0.0  0.0  0.0    0.0   0   0
sd1          8.6    0.0  1096.2     0.0  0.0  0.3   29.7   0   1
sd2          0.2    4.8    10.7   267.2  0.0  0.0    7.8   0   0
sd3          0.2    5.5     6.8   268.2  0.0  0.6  107.0   0   3
sd4          0.2    9.1    11.0   265.4  0.0  6.3  678.4   0  21
sd5          0.2   21.4     6.8   104.5  0.0 31.6 1467.8   0  92


It looks like your 'sd5' disk is performing horribly, and if it weren't 
for 'sd5' (which bottlenecks the I/O), 'sd4' would look just as bad.  
Regardless, the first step would be to investigate 'sd5'.  If 'sd4' is 
also a terrible performer, then resilvering a replacement for 'sd5' may 
take a very long time.


Use 'iostat -xen' to obtain more information, including the number of 
reported errors.
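For anyone following along, a concrete invocation (the interval matches the earlier runs;
the column behaviour is as documented for Solaris iostat):

    iostat -xen 30
    # -e appends per-device error counters (s/w, h/w, trn, tot),
    # -n prints cXtYdZ device names instead of sdN,
    # so slow service times and error counts can be read off the same line.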


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideal SATA/SAS Controllers for ZFS

2010-05-18 Thread Marc Nicholas
Nice write-up, Marc.

Aren't the SuperMicro cards their funny UIO form factor? Wouldn't want
someone buying a card that won't work in a standard chassis.

-marc

On Tue, May 18, 2010 at 2:26 AM, Marc Bevand m.bev...@gmail.com wrote:

 The LSI SAS1064E slipped through the cracks when I built the list.
 This is a 4-port PCIe x8 HBA with very good Solaris (and Linux)
 support. I don't remember having seen it mentioned on zfs-discuss@
 before, even though many were looking for 4-port controllers. Perhaps
 the fact that it is priced so close to 8-port models explains why it has
 gone relatively unnoticed. That said, the wide x8 PCIe link makes it the
 *cheapest* controller able to feed 300-350 MB/s to at least 4 ports
 concurrently. Now added to my list.

 -mrb

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideal SATA/SAS Controllers for ZFS

2010-05-18 Thread Marc Bevand
Marc Nicholas geekything at gmail.com writes:
 
 Nice write-up, Marc.Aren't the SuperMicro cards their funny UIO form
 factor? Wouldn't want someone buying a card that won't work in a standard
 chassis.

Yes, 4 of the 6 Supermicro cards are UIO cards. I added a warning about it.
Thanks.

-mrb

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideal SATA/SAS Controllers for ZFS

2010-05-18 Thread Thomas Burgess
A really great alternative to the UIO cards, for those who don't want the
headache of modifying the brackets or cases, is the Intel SASUC8I.

This is a rebranded LSI SAS3081E-R.

It can be flashed with the LSI IT firmware from the LSI website and is
physically identical to the LSI card.  It is really the exact same card, and
typically around 140-160 dollars.

These are what I went with.

On Tue, May 18, 2010 at 12:28 PM, Marc Bevand m.bev...@gmail.com wrote:

 Marc Nicholas geekything at gmail.com writes:
 
  Nice write-up, Marc.Aren't the SuperMicro cards their funny UIO form
  factor? Wouldn't want someone buying a card that won't work in a standard
  chassis.

 Yes, 4 of the 6 Supermicro cards are UIO cards. I added a warning about it.
 Thanks.

 -mrb

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] spares bug: explain to me status of bug report.

2010-05-18 Thread eXeC001er
Hi.

In Bugster I found a bug about spares.
I can reproduce the problem, but the developer set its status to "Not a defect".
Why?

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6905317

Thanks.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] spares bug: explain to me status of bug report.

2010-05-18 Thread Cindy Swearingen

Hi--

The scenario in the bug report below is that the pool is exported.

The spare can't kick in if the pool is exported. It looks like the
issue reported in this CR's See Also section, CR 6887163 is still
open.

Thanks,

Cindy

On 05/18/10 11:19, eXeC001er wrote:

Hi.

In Bugster I found a bug about spares.
I can reproduce the problem, but the developer set its status to "Not a defect".
Why?


http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6905317

Thanks.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] spares bug: explain to me status of bug report.

2010-05-18 Thread eXeC001er
6887163 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6887163 11-Closed: Duplicate (Closed)
6945634 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6945634 11-Closed: Duplicate (Closed)


2010/5/18 Cindy Swearingen cindy.swearin...@oracle.com

 Hi--

 The scenario in the bug report below is that the pool is exported.

 The spare can't kick in if the pool is exported. It looks like the
 issue reported in this CR's See Also section, CR 6887163 is still
 open.

 Thanks,

 Cindy


 On 05/18/10 11:19, eXeC001er wrote:

 Hi.

 In Bugster I found a bug about spares. I can reproduce the problem, but the
 developer set its status to "Not a defect". Why?

 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6905317

 Thanks.


 

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] spares bug: explain to me status of bug report.

2010-05-18 Thread Cindy Swearingen

I think the remaining CR is this one:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6911420

cs

On 05/18/10 12:08, eXeC001er wrote:
6887163 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6887163 11-Closed: Duplicate (Closed)
6945634 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6945634 11-Closed: Duplicate (Closed)



2010/5/18 Cindy Swearingen cindy.swearin...@oracle.com


Hi--

The scenario in the bug report below is that the pool is exported.

The spare can't kick in if the pool is exported. It looks like the
issue reported in this CR's See Also section, CR 6887163 is still
open.

Thanks,

Cindy


On 05/18/10 11:19, eXeC001er wrote:

Hi.

In Bugster I found a bug about spares. I can reproduce the
problem, but the developer set its status to "Not a defect". Why?

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6905317

Thanks.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideal SATA/SAS Controllers for ZFS

2010-05-18 Thread Marc Bevand
Thomas Burgess wonslung at gmail.com writes:
 
 A really great alternative to the UIO cards for those who don't want the
 headache of modifying the brackets or cases is the Intel SASUC8I
 
 This is a rebranded LSI SAS3081E-R
 
 It can be flashed with the LSI IT firmware from the LSI website and
 is physically identical to the LSI card.  It is really the exact same
 card, and typically around 140-160 dollars.

The SASUC8I is already in my list. In fact I bought one last week. I
did not need to flash its firmware though - drives were used in JBOD
mode by default.

-mrb

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Is dedupe ready for prime time?

2010-05-18 Thread Paul Choi
I've been reading this list for a while, there's lots of discussion 
about b134 and deduplication. I see some stuff about snapshots not being 
destroyed, and maybe some recovery issues. What I'd like to know is, is 
ZFS with deduplication stable enough to use?


I have two NFS servers, each running OpenSolaris 2009.06 (111b), as 
datastores for VMWare ESX hosts. It works great right now, with ZIL 
offload and L2ARC SSDs. I still get occasional complaints from 
developers saying the storage is slow - which I'm guessing is that read 
latency is not stellar on shared storage. Write latency is probably 
not an issue due to the ZIL offload. I'm guessing deduplication would 
solve a lot of this read latency problem, having to do fewer read IOs.


But is it stable? Can I do nightly recursive snapshots and periodically 
destroy old snapshots without worrying about a dozen VMs suddenly losing 
their datastore? I'd love to hear from your experience.


Thanks,

-Paul Choi
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is dedupe ready for prime time?

2010-05-18 Thread Roy Sigurd Karlsbakk
- Paul Choi paulc...@plaxo.com wrote:

 I've been reading this list for a while, there's lots of discussion 
 about b134 and deduplication. I see some stuff about snapshots not
 being 
 destroyed, and maybe some recovery issues. What I'd like to know is,
 is 
 ZFS with deduplication stable enough to use?

No, currently ZFS dedup is not ready for production. Several bugs have been 
filed, and the most problematic ones mean the system can be rendered unusable 
for days in some situations. Also, if using dedup, plan your memory carefully 
and spend money on L2ARC, since it _will_ require either massive amounts of RAM 
or some good SSDs for L2ARC.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hashing files rapidly on ZFS

2010-05-18 Thread Bertrand Augereau
Thanks Dan, this is exactly what I had in mind (hashing the block checksums).
You convinced me to do it independently of ZFS.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is dedupe ready for prime time?

2010-05-18 Thread Paul Choi

Roy,

Thanks for the info. Yeah, the bug you mentioned is pretty critical. In 
terms of SSDs, I have Intel X25-M for L2ARC and X25-E for ZIL. And the 
host has 24G RAM. I'm just waiting for that 2010.03 release or 
whatever we want to call it when it's released...


-Paul

On 5/18/10 12:49 PM, Roy Sigurd Karlsbakk wrote:

- Paul Choi paulc...@plaxo.com wrote:


I've been reading this list for a while, there's lots of discussion
about b134 and deduplication. I see some stuff about snapshots not
being
destroyed, and maybe some recovery issues. What I'd like to know is,
is
ZFS with deduplication stable enough to use?
 

No, currently ZFS dedup is not ready for production. Several bugs have been 
filed, and the most problematic ones mean the system can be rendered unusable 
for days in some situations. Also, if using dedup, plan your memory carefully 
and spend money on L2ARC, since it _will_ require either massive amounts of RAM 
or some good SSDs for L2ARC.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is dedupe ready for prime time?

2010-05-18 Thread Roy Sigurd Karlsbakk

- Paul Choi paulc...@plaxo.com wrote:

 Roy,
 
 Thanks for the info. Yeah, the bug you mentioned is pretty critical.
 In
 terms of SSDs, I have Intel X25-M for L2ARC and X25-E for ZIL. And the
 host has 24G RAM. I'm just waiting for that 2010.03 release or
 whatever we want to call it when it's released...

IIRC the memory requirement for dedup is something like 150 bytes per block 
for the DDT, meaning about 1GB per 1TB of space if it's all 128kB blocks. With 
smaller blocks, it gets greedy. A ZIL is good for write performance, but 
remember, OpenSolaris won't use more than half the memory size for the ZIL, so 
you can probably slice up the SSD and use the rest for L2ARC.
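That rule of thumb is easy to re-derive; here is the arithmetic spelled out (shell
arithmetic, using the assumed 150 bytes per DDT entry and a uniform 128 KB block size):

    echo $(( 1024 * 1024 * 1024 / 128 ))      # blocks in 1 TiB of unique data: 8388608
    echo $(( 8388608 * 150 / 1024 / 1024 ))   # DDT size in MiB: ~1200, i.e. roughly 1.2 GB per TB

With 16 KB blocks the same pool needs eight times as many entries, which is where the
"gets greedy" comment comes from.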

Kind regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is 
an elementary imperative for all pedagogues to avoid excessive use of idioms of 
foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Pool recovery from replaced disks.

2010-05-18 Thread Peter Jeremy
On 2010-May-18 19:06:11 +0800, Demian Phillips demianphill...@gmail.com wrote:
Is it possible to recover a pool (as it was) from a set of disks that
were replaced during a capacity upgrade?

If no other writes occurred during the capacity upgrade, then I'd
suspect it would be possible.  The transaction numbers would still
vary across the drives and the pool information would be inconsistent,
but I suspect a recent version of ZFS could manage to recover.

It might be possible to test this by creating a small, file-backed
RAIDZn zpool, simulating a capacity upgrade, exporting that pool
and trying to import the original zpool from the detached files.
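A concrete version of that experiment might look like the following (file names and
sizes are arbitrary; mkfile is the Solaris way to create backing files, and whether
the final import actually succeeds is exactly what the test would reveal):

    cd /var/tmp
    mkfile 128m old1 old2 old3; mkfile 256m new1 new2 new3
    zpool create testpool raidz /var/tmp/old1 /var/tmp/old2 /var/tmp/old3
    # ... write some data, then simulate the capacity upgrade
    #     (let each resilver finish before the next replace):
    zpool replace testpool /var/tmp/old1 /var/tmp/new1
    zpool replace testpool /var/tmp/old2 /var/tmp/new2
    zpool replace testpool /var/tmp/old3 /var/tmp/new3
    zpool export testpool
    # move the old files aside and see whether the pre-upgrade pool shows up:
    mkdir /var/tmp/old; mv /var/tmp/old[123] /var/tmp/old/
    zpool import -d /var/tmp/old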

-- 
Peter Jeremy


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] New SSD options

2010-05-18 Thread Don
I'm looking for alternative SSD options to the Intel X25-E and the ZEUS IOPS.

The ZEUS IOPS would probably cost as much as my entire current disk system (80 
15k SAS drives) - and that's just silly.

The Intel is much less expensive and, while fast, pales in comparison to the 
ZEUS.

I've allocated 4 disk slots in my array for ZIL SSD's and I'm trying to find 
the best performance for my dollar.

With that in mind- Is anyone using the new OCZ Vertex 2 SSD's as a ZIL?

http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/performance-enterprise-solid-state-drives/ocz-vertex-2-sata-ii-2-5--ssd.html

They're claiming 50k IOPS (4k write, aligned), 2 million hour MTBF, TRIM 
support, etc. That's more write IOPS than the ZEUS (40k IOPS, $) but at 
half the price of an Intel X25-E (3.3k IOPS, $400).

Needless to say, I'd love to know if anyone has evaluated these drives to see if 
they make sense as a ZIL - for example, do they honor cache flush requests? Are 
those sustained IOPS numbers?
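One cheap sanity check once a candidate drive is installed: add it as a log device
and watch whether it actually absorbs the synchronous write load (the pool name here
is a placeholder):

    zpool iostat -v tank 1   # the log vdev gets its own line; compare its write ops to the data vdevs

That doesn't answer the cache-flush question, but it does show whether the advertised
IOPS hold up under a sustained sync-write workload.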
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS in campus clusters

2010-05-18 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of John Hoogerdijk
 
 I'm building a campus cluster with identical storage in two locations
 with ZFS mirrors spanning both storage frames. Data will be mirrored
 using zfs.  I'm looking for the best way to add log devices to this
 campus cluster.

Either I'm crazy, or I completely miss what you're asking.  You want to have
one side of a mirror attached locally, and the other side of the mirror
attached ... via iscsi or something ... across the WAN?  Even if you have a
really fast WAN (1Gb or so) your performance is going to be terrible, and I
would be very concerned about reliability.  What happens if a switch reboots
or crashes?  Then suddenly half of the mirror isn't available anymore
(redundancy is degraded on all pairs) and ... Will it be a degraded mirror?
Or will the system just hang, waiting for iSCSI I/O to time out?  When it
comes back online, will it intelligently resilver only the parts which have
changed since?  Since the mirror is now broken, and local operations can
happen faster than the WAN can carry them across, will the resilver ever
complete, ever?  I don't know.

anyway, it just doesn't sound like a good idea to me.  It sounds like
something that was meant for a clustering filesystem of some kind, not
particularly for ZFS.

If you are adding log devices to this, I have a couple of things to say:

The whole point of a log device is to accelerate sync writes, by providing
nonvolatile storage which is faster than the primary storage.  You're not
going to get this if any part of the log device is at the other side of a
WAN.  So either add a mirror of log devices locally and not across the WAN,
or don't do it at all.


 I am considering building a separate mirrored zpool of Flash disk that
 span the frames,  then creating zvols to use as log devices for the
 data zpool.  Will this work?   Any other suggestions?

This also sounds nonsensical to me.  If your primary pool devices are Flash,
then there's no point in adding separate log devices, unless you have another
type of even faster nonvolatile storage.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Very serious performance degradation

2010-05-18 Thread Edward Ned Harvey
How full is your filesystem?  Give us the output of zfs list
You might be having a hardware problem, or maybe it's extremely full.

Also, if you have dedup enabled, on a 3TB filesystem, you surely want more
RAM.  I don't know if there's any rule of thumb you could follow, but
offhand I'd say 16G or 32G.  Numbers based on the vapor passing around the
room I'm in right now.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] scsi messages and mpt warning in log - harmless, or indicating a problem?

2010-05-18 Thread Willard Korfhage
This afternoon, messages like the following started appearing in 
/var/adm/messages:

May 18 13:46:37 fs8 scsi: [ID 365881 kern.info] 
/p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 18 13:46:37 fs8 Log info 0x3108 received for target 5.
May 18 13:46:37 fs8 scsi_status=0x0, ioc_status=0x804b, scsi_state=0x1
May 18 13:46:38 fs8 scsi: [ID 365881 kern.info] 
/p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 18 13:46:38 fs8 Log info 0x3108 received for target 5.
May 18 13:46:38 fs8 scsi_status=0x0, ioc_status=0x804b, scsi_state=0x0
May 18 13:46:40 fs8 scsi: [ID 365881 kern.info] 
/p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 18 13:46:40 fs8 Log info 0x3108 received for target 5.
May 18 13:46:40 fs8 scsi_status=0x0, ioc_status=0x804b, scsi_state=0x0

The pool has no errors, so I don't know if these represent a potential problem 
or not.

During this time I was copying files from one fileset to another in the same 
pool, so it was fairly I/O intensive.  Typically you get one every 1-5 seconds 
for 10 to 20 seconds, sometimes longer, and then it is quiet for many minutes 
before they occur again. Is this indicating a problem, or just a harmless 
message?

I just kicked off a scrub on the pool as I was writing this, and I am seeing a 
lot of these messages. I see that zpool status shows c4t5d0 has 12.5K repaired 
already. The scrub has been in progress for just 6 minutes, and it says I have 
170629h54m to go, and it gets longer every time I check the status. I ran a 
scrub on this a few weeks ago, and had no such problem.

I also see two warnings earlier today:

May 18 19:14:09 fs8 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 18 19:14:09 fs8 mpt_handle_event_sync: IOCStatus=0x8000, 
IOCLogInfo=0x31110900
May 18 19:14:09 fs8 scsi: [ID 243001 kern.warning] WARNING: 
/p...@0,0/pci8086,2...@1/pci15d9,a...@0 (mpt0):
May 18 19:14:09 fs8 mpt_handle_event: IOCStatus=0x8000, 
IOCLogInfo=0x31110900

and two more of these 1 minute and 10 seconds later. 

So, is my system in trouble or not?

Particulars of my system:

% uname -a
SunOS fs8 5.11 snv_134 i86pc i386 i86pc

The hardware is an Asus server motherboard carrying 4GB of ECC memory and a 
current Xeon CPU, and a SuperMicro AOC-USASLP-L8I card (it uses the 1068E) with 
8 Samsung Spinpoint F3EG HD203WI 2TB disks attached.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zhist good enough for usage

2010-05-18 Thread Edward Ned Harvey
The purpose of zhist is to simplify access to past snapshots.  For example,
if you run "zhist ls somefile", the result will be a list of all the
previous snapshot versions of that file or directory.  No need to find the
right .zfs directory, or to check which ones have changed.  Some
reasonable steps (stat) are taken inside zhist to identify the
previous snaps and to identify unique snaps of the requested object.
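For comparison, the manual way to do the same thing today is to grovel through the
snapshot directories yourself, something like (the mountpoint and file name are made up):

    ls -l /tank/home/.zfs/snapshot/*/somefile 2>/dev/null

which lists every snapshot's copy whether it changed or not; zhist's value is in
collapsing that down to the versions that actually differ.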

There is a long way still to go, but this is a 90/10 rule.  The
first "good enough for usage" version is available, and that's what most
people would care about.

You can get the present release as follows:

svn export https://zhist.googlecode.com/svn/tags/0.6beta
sudo chown root:root 0.6beta/zhist
sudo chmod 755 0.6beta/zhist
(optionally, edit the first line of 0.6beta/zhist to match your
environment's preferred python location)
sudo mv 0.6beta/zhist /usr/local/bin

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss