Re: [zfs-discuss] Ext. UPS-backed SATA SSD ZIL?

2010-11-28 Thread Erik Trimble

On 11/27/2010 7:42 PM, Tim Cook wrote:



On Sat, Nov 27, 2010 at 9:29 PM, Erik Trimble erik.trim...@oracle.com wrote:


On 11/27/2010 6:50 PM, Christopher George wrote:

Furthermore, I don't think 1 hour sustained is a very
accurate benchmark.
Most workloads are bursty in nature.

The IOPS degradation is additive; the length of the first and
second one-hour
sustained periods is completely arbitrary.  The takeaway from
slides 1 and 2 is
that drive inactivity has no effect on the eventual outcome.  So
with either a bursty
or sustained workload the end result is always the same:
dramatic write IOPS
degradation after unpackaging or secure erase of the tested
Flash-based SSDs.

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com


Without commenting on other threads, I often see sustained IO in
my setups for extended periods of time - particularly small IO
which eats up my IOPS.  At this moment, I run with the ZIL turned off
for that pool, as it's a scratch pool and I don't care if it gets
corrupted. I suspect that a DDRdrive or one of the STEC Zeus
drives might help me, but I can overwhelm any other SSD quickly.

I'm doing compiles of the JDK, with a single ZFS-backed system
handling the files for 20-30 clients, each trying to compile a
15-million-line JDK at the same time.

Lots and lots of small I/O.

:-)



Sounds like you need lots and lots of 15krpm drives instead of 7200rpm 
SATA ;)


--Tim




That's the scary part.  I've got 24 2.5" 15k SAS drives with a 512MB
caching RAID controller, and it still gets hammered by my workload.



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ext. UPS-backed SATA SSD ZIL?

2010-11-28 Thread Erik Trimble

On 11/27/2010 11:08 PM, Christopher George wrote:

I'm doing compiles of the JDK, with a single ZFS-backed system handling
the files for 20-30 clients, each trying to compile a 15-million-line
JDK at the same time.

Very cool application!

Can you share any metrics, such as the aggregate size of source files
compiled and the size of the resultant binaries?

Thanks,

Christopher George
Founder/CTO
www.ddrdrive.com



My biggest issue is that I eventually flood my network bandwidth. I've 
got 4 bonded GigE links into my NFS server, and I'll still overwhelm them 
all with my clients.


It's the JDK.  Figure copying a 700MB tarball to each client machine, then 
exploding it into an NFS-mounted directory: about 50,000 files, averaging 
under 4k each.


Final binary size is not that big:  figure 400MB total, but the 
intermediate build tree is ~4GB.


Tar up the results and save them elsewhere. Erase the whole filesystem 
after the build is complete. Figure 1 build on 8 platforms over 20 
machines total takes 3 hours.
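
Roughly, the cycle looks like this (just a sketch - the pool, dataset,
and path names below are made up, not my actual setup):

# On the NFS server: a throwaway dataset per build cycle
zfs create scratch/jdkbuild

# On each client: unpack the ~700MB source tarball into the NFS-mounted
# directory (hypothetical paths)
cd /net/buildsrv/scratch/jdkbuild && tar xzf /var/tmp/jdk-src.tar.gz

# ... 20-30 clients compile against that directory ...

# Afterwards: archive the results elsewhere and throw the dataset away
tar czf /export/archive/jdk-build.tar.gz -C /scratch/jdkbuild .
zfs destroy -r scratch/jdkbuild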




--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Data set busy

2010-11-28 Thread bhanu prakash
Hi Team,

When I am trying to take a snapshot of a ZFS file system, it gives the
error below.

cannot create snapshot 'sanpool_new/wlsdva13_d...@new': dataset is busy

Please give me suggestions on this.


Regards,
Bhanu
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ RevoDrive ZFS support

2010-11-28 Thread David Magda

On Nov 27, 2010, at 16:14, Tim Cook wrote:

You don't need drivers for any SATA based SSD.  It shows up as a
standard hard drive and plugs into a standard SATA port.  By the time
the G3 Intel drive is out, the next gen Sandforce should be out as
well.  Unless Intel does something revolutionary, they will still be
behind the Sandforce drives.


Are you referring to the SF-2000 chips?

http://www.sandforce.com/index.php?id=133
http://www.legitreviews.com/article/1429/1/
http://www.google.com/search?q=sandforce+sf-2000

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ RevoDrive ZFS support

2010-11-28 Thread Orvar Korvar
There are problems with Sandforce controllers, according to forum posts. Buggy 
firmware. And in practice, Sandforce is far below its theoretical values. I 
expect Intel to have fewer problems.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ RevoDrive ZFS support

2010-11-28 Thread Tim Cook
On Sun, Nov 28, 2010 at 1:41 PM, Orvar Korvar 
knatte_fnatte_tja...@yahoo.com wrote:

 There are problems with Sandforce controllers, according to forum posts.
 Buggy firmware. And in practice, Sandforce is far below its theoretical
 values. I expect Intel to have fewer problems.


According to what forum posts?  There were issues when Crucial and a few
others released alpha firmware into production...  Anandtech has put those
drives through the wringer without issue.  Several people on this list are
running them as well.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ RevoDrive ZFS support

2010-11-28 Thread Tim Cook
On Sun, Nov 28, 2010 at 10:42 AM, David Magda dma...@ee.ryerson.ca wrote:

 On Nov 27, 2010, at 16:14, Tim Cook wrote:

  You don't need drivers for any SATA based SSD.  It shows up as a standard
 hard drive and plugs into a standard SATA port.  By the time the G3 Intel
 drive is out, the next gen Sandforce should be out as well.  Unless Intel
 does something revolutionary, they will still be behind the Sandforce
 drives.


 Are you referring to the SF-2000 chips?

http://www.sandforce.com/index.php?id=133
http://www.legitreviews.com/article/1429/1/
http://www.google.com/search?q=sandforce+sf-2000



Yup.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Recommendations

2010-11-28 Thread Paul Piscuc
Hi,

We are a company that wants to replace our current storage layout with one
that uses ZFS. We have been testing it for a month now, and everything looks
promising. One element that we cannot determine is the optimum number of
disks in a raid-z pool. In the ZFS Best Practices Guide, 7, 9, and 11 disks
are recommended for a single raid-z2.  On the other hand, another
user specifies that the most important part is the distribution of the
default 128k record size across all the disks. So the recommended layout
would be:

4-disk RAID-Z2 = 128KiB / 2 = 64KiB = good
5-disk RAID-Z2 = 128KiB / 3 = ~43KiB = not good
6-disk RAID-Z2 = 128KiB / 4 = 32KiB = good
10-disk RAID-Z2 = 128KiB / 8 = 16KiB = good

What are your recommendations regarding the number of disks? We are planning
to use 2 raid-z2 pools with 8+2 disks, 2 spares, 2 SSDs for L2ARC, 2 SSDs for
ZIL, 2 for the syspool, and a similar machine for replication.

Thanks in advance,
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OCZ RevoDrive ZFS support

2010-11-28 Thread Krunal Desai
 There are problems with Sandforce controllers, according to forum posts. 
 Buggy firmware. And in practice, Sandforce is far below its theoretical 
 values. I expect Intel to have fewer problems.

I believe it's more the firmware (and the pace of firmware updates) from companies 
making Sandforce-based drives than it is the controller itself. Enthusiasts can 
tolerate OCZ and others releasing alpha/beta firmware in forum posts.

While the G2 Intel drives may not be the performance kings anymore (or the most 
cost-effective), I'd argue they're certainly the most stable when it comes to 
firmware. I have my eye on a G3 Intel drive for my laptop, where I can't really 
afford beta firmware updates biting me on the road.

--khd

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recommendations

2010-11-28 Thread Erik Trimble

On 11/28/2010 1:51 PM, Paul Piscuc wrote:

Hi,

We are a company that wants to replace our current storage layout with 
one that uses ZFS. We have been testing it for a month now, and 
everything looks promising. One element that we cannot determine is 
the optimum number of disks in a raid-z pool. In the ZFS Best Practices 
Guide, 7, 9, and 11 disks are recommended for a single 
raid-z2.  On the other hand, another user specifies that the most 
important part is the distribution of the default 128k record size 
across all the disks. So the recommended layout would be:


4-disk RAID-Z2 = 128KiB / 2 = 64KiB = good
5-disk RAID-Z2 = 128KiB / 3 = ~43KiB = not good
6-disk RAID-Z2 = 128KiB / 4 = 32KiB = good
10-disk RAID-Z2 = 128KiB / 8 = 16KiB = good

What are your recommendations regarding the number of disks? We are 
planning to use 2 raid-z2 pools with 8+2 disks, 2 spares, 2 SSDs for 
L2ARC, 2 SSDs for ZIL, 2 for the syspool, and a similar machine for 
replication.


Thanks in advance,



You've hit on one of the hardest parts of using ZFS - optimization.   
The truth of the matter is that there is NO one-size-fits-all best 
solution. It depends heavily on your workload: access patterns, 
write patterns, type of I/O, and average I/O request size.


A couple of things here:

(1) Unless you are using zvols as raw disk partitions (for use with 
something like a database), the recordsize value is a MAXIMUM value, NOT 
an absolute value.  Thus, if you have a ZFS filesystem with a record 
size of 128k, it will break up I/O into 128k chunks for writing, but it 
will also write smaller chunks.  I forget what the minimum size is (512b 
or 1k, IIRC), but what ZFS does is use a variable block size, up to the 
maximum size specified in the recordsize property.   So, if 
recordsize=128k and you have a 190k write I/O op, it will write a 128k 
chunk and a 64k chunk (64k being the smallest power of two greater than 
the remaining 62k of data).  It WON'T write two 128k chunks.
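
As a quick illustration of the above (the dataset name and object number
are hypothetical - this is just a sketch):

# recordsize is a per-dataset *cap* on block size, not a fixed
# allocation unit
zfs set recordsize=128k tank/home
zfs get recordsize tank/home

# zdb can show the block sizes ZFS actually chose for a given object
# (the object number here is made up)
zdb -dddd tank/home 12345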


(2) #1 comes up a bit when you have a mix of file sizes - for instance, 
home directories, where you have lots of small files (initialization 
files, source code, etc.) combined with some much larger files (images, 
mp3s, executable binaries, etc.).  Thus, such a filesystem will have a 
wide variety of chunk sizes, which makes optimization difficult, to say 
the least.


(3) For *random* I/O, a raidZ of any number of disks performs roughly 
like a *single* disk in terms of IOPs and a little better than a single 
disk in terms of throughput.  So, if you have considerable amounts of 
random I/O, you should really either use small raidz configs (no more 
than 4 data disks), or switch to mirrors instead.


(4) For *sequential* or large-size I/O, a raidZ performs roughly 
equivalently to a stripe of the same number of data disks. That is, an 
N-disk raidz2 will perform about the same as an (N-2)-disk stripe in 
terms of throughput and IOPS.
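
To put rough numbers on #3 and #4, compare two ways of laying out the
same 10 disks (device names are hypothetical, and the performance notes
are rules of thumb, not measurements):

# (a) one 10-disk raidz2 vdev: random IOPS of roughly a single disk,
#     sequential throughput of roughly an 8-disk stripe
zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
                         c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0

# (b) five 2-way mirrors: random write IOPS of roughly 5 vdevs, and reads
#     can be served from either side of each mirror, at the cost of half
#     the raw capacity
zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
                  mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0 \
                  mirror c1t8d0 c1t9d0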


(5) As I mentioned in #1, *all* ZFS I/O is broken up into 
powers-of-two-sized chunks, even if the last chunk must have some 
padding in it to get to a power-of-two.   This has implications as to 
the best number of disks in a raidZ(n).



I'd have to re-look at the ZFS Best Practices Guide, but I'm pretty sure 
the recommendation of 7, 9, or 11 disks was for a raidz1, NOT a raidz2.  
Due to #5 above, best performance comes with an EVEN number of data 
disks in any raidZ, so a write to any disk is always a full portion of 
the chunk, rather than a partial one (that sounds funny, but trust me).  
The best balance of size, IOPS, and throughput is found in the mid-size 
raidZ(n) configs, where there are 4, 6 or 8 data disks.



Honestly, even with you describing a workload, it will be hard for us to 
give you an exact answer. My best suggestion is to do some testing 
with raidZ(n) vdevs of different sizes to see the tradeoffs between 
capacity and performance.



Also, in your sample config, unless you plan to use the spare disks for 
redundancy on the boot mirror, it would be better to configure 2 x 
11-disk raidZ3 than 2 x 10-disk raidZ2 + 2 spares. Better reliability.
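
Something like the following, with hypothetical device names (22 disks
either way):

# (a) the proposed layout: 2 x 10-disk raidz2 plus 2 hot spares
zpool create tank \
  raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 \
         c2t5d0 c2t6d0 c2t7d0 c2t8d0 c2t9d0 \
  raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 \
         c3t5d0 c3t6d0 c3t7d0 c3t8d0 c3t9d0 \
  spare c2t10d0 c3t10d0

# (b) the alternative: 2 x 11-disk raidz3 -- same usable capacity
#     (8 data disks per vdev), but each vdev now survives three disk
#     failures without waiting on a spare to resilver
zpool create tank \
  raidz3 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 \
         c2t6d0 c2t7d0 c2t8d0 c2t9d0 c2t10d0 \
  raidz3 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 \
         c3t6d0 c3t7d0 c3t8d0 c3t9d0 c3t10d0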



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss