Re: [zfs-discuss] ZFS performance falls off a cliff

2011-05-13 Thread Don
~# uname -a
SunOS nas01a 5.11 oi_147 i86pc i386 i86pc Solaris
~# zfs get version pool0
NAME   PROPERTY  VALUE  SOURCE
pool0  version   5      -
~# zpool get version pool0
NAME   PROPERTY  VALUE  SOURCE
pool0  version   28     default

Re: [zfs-discuss] Performance problem suggestions?

2011-05-12 Thread Don
This is a slow operation which can only be done about 180-250 times per second for very random I/Os (may be more with HDD/Controller caching, queuing and faster spindles). I'm afraid that seeking to very dispersed metadata blocks, such as traversing the tree during a scrub on a fragmented
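
As a rough back-of-the-envelope check (assuming typical 15k RPM seek and rotation figures, not anything measured on this pool), that 180-250 range matches simple seek arithmetic:

  ~3.5 ms average seek + ~2 ms half-rotation at 15k RPM ~= 5.5 ms per random I/O
  1000 ms / 5.5 ms ~= 180 random IOPS per spindle (a 7200 RPM disk lands closer to 75-100)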

Re: [zfs-discuss] Modify stmf_sbd_lu properties

2011-05-11 Thread Don
I can't actually disable the STMF framework to do this but I can try renaming things and dumping the properties from one device to another and see if it works- it might actually do it. I will let you know.

Re: [zfs-discuss] Performance problem suggestions?

2011-05-11 Thread Don
ZFS sends a series of blocks to write from the queue; the newer disks write them and then sit idle, while the older disks seek around to find room for that piece of data... When the old disks complete the writes, ZFS hands them a new set of tasks. The thing is- as far as I know the OS doesn't ask the disk to find

Re: [zfs-discuss] Modify stmf_sbd_lu properties

2011-05-11 Thread Don
It turns out this was actually as simple as: stmfadm create-lu -p guid=XXX.. I kept looking at modify-lu to change this and never thought to check the create-lu options. Thanks to Evaldas for the suggestion.
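
For later readers, a minimal sketch of that approach (the GUID and zvol path are placeholders, not values from this system):

~# stmfadm list-lu -v                                    # record the GUID of the existing LU
~# stmfadm delete-lu <old-guid>                          # removes the LU entry, not the backing zvol
~# stmfadm create-lu -p guid=<old-guid> /dev/zvol/rdsk/pool0/<volume>

Any views may need to be re-added with stmfadm add-view afterwards.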

Re: [zfs-discuss] Performance problem suggestions?

2011-05-10 Thread Don
I've been going through my iostat, zilstat, and other outputs all to no avail. None of my disks ever seem to show outrageous service times, the load on the box is never high, and if the darned thing is CPU bound- I'm not even sure where to look. (traversing DDT blocks even if in memory, etc -
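
For anyone following along, the usual first-pass checks being referred to here (pool name is a placeholder):

~# iostat -xnz 5             # asvc_t and %b per device; slow or saturated spindles stand out
~# zpool iostat -v pool0 5   # per-vdev ops and bandwidth
~# prstat -mL 5              # per-thread CPU, to see whether anything really is CPU bound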

Re: [zfs-discuss] Performance problem suggestions?

2011-05-10 Thread Don
> # dd if=/dev/zero of=/dcpool/nodedup/bigzerofile
Ahh- I misunderstood your pool layout earlier. Now I see what you were doing. People on this forum have seen and reported that adding a 100 MB file tanked their multiterabyte pool's performance, and removing the file boosted it back up. Sadly I

Re: [zfs-discuss] zfs send receive problem/questions

2010-12-03 Thread Don Jackson
> Try using the -d option to zfs receive. The ability to do zfs send -R ... | zfs receive [without -d] was added relatively recently, and you may be encountering a bug that is specific to receiving a send of a whole pool.
I just tried this, didn't work, new error: # zfs send -R
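
For context, the general form being attempted, assuming a whole-pool replication (pool and snapshot names are placeholders):

~# zfs snapshot -r oldpool@migrate
~# zfs send -R oldpool@migrate | zfs receive -d -F newpool   # -d recreates the source dataset paths under newpool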

[zfs-discuss] zfs send receive problem/questions

2010-12-01 Thread Don Jackson
send them. Once they are received on the new zpool, I really don't need nor want this snapshot on the receiving side. Is it OK to zfs destroy that snapshot? I've been pounding my head against this problem for a couple of days, and I would definitely appreciate any tips/pointers/advice. Don
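
Assuming no later incremental sends will use it as their base, cleaning that snapshot up on the receiving side is just (names are placeholders):

~# zfs destroy -r newpool@migrate

If incremental updates (zfs send -i) are planned, the snapshot has to remain on both sides as the common base.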

Re: [zfs-discuss] zfs send receive problem/questions

2010-12-01 Thread Don Jackson
Here is some more info on my system: This machine is running Solaris 10 U9, with all the patches as of 11/10/2010. The source zpool I am attempting to transfer from was originally created on an older OpenSolaris (specifically Nevada) release, I think it was 111. I did a zpool export on that

[zfs-discuss] Resizing ZFS block devices and sbdadm

2010-11-30 Thread Don
sbdadm can be used with a regular ZFS file or a ZFS block device. Is there an advantage to using a ZFS block device and exporting it to comstar via sbdadm as opposed to using a file and exporting it? (e.g. performance or manageability?) Also- let's say you have a 5G block device called
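
A rough sketch of the zvol-backed variant, including a later resize (names and sizes are examples; the modify-lu step is my reading of the sbdadm man page, so verify it on a test LU first):

~# zfs create -V 5g pool0/lun0
~# sbdadm create-lu /dev/zvol/rdsk/pool0/lun0
~# zfs set volsize=10g pool0/lun0        # grow the zvol
~# sbdadm modify-lu -s 10g <lu-guid>     # GUID as shown by sbdadm list-lu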

Re: [zfs-discuss] zpool lockup and dedupratio meanings

2010-11-26 Thread Don
I've previously posted about some lockups I've experienced with ZFS. There were two suspected causes at the time: one was deduplication, and one was the 2009.06 code we were running. After upgrading the zpools and adding some more disks to the pool I initiated a zpool scrub and was rewarded

[zfs-discuss] ZFS performance questions

2010-11-24 Thread Don
I have an OpenSolaris (technically OI 147) box running ZFS with Comstar (zpool version 28, zfs version 5). The box is a 2950 with 32 GB of RAM, Dell SAS5/e card connected to 6 Promise vTrak J610sD (dual controller SAS) disk shelves spread across both channels of the card (2 chains of 3

[zfs-discuss] zpool lockup and dedupratio meanings

2010-11-20 Thread Don
I've previously posted about some lockups I've experienced with ZFS. There were two suspected causes at the time: one was deduplication, and one was the 2009.06 code we were running. I upgraded to OpenIndiana 147 (initially without upgrading the zpool and zfs disk versions). The lockups
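
For reference, the numbers in question can be pulled with (pool name is a placeholder):

~# zpool get dedupratio pool0   # logically referenced data vs. unique data actually stored
~# zdb -DD pool0                # DDT histogram; gives a feel for how big the dedup table is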

Re: [zfs-discuss] Interesting experience with Nexenta - anyone seen it?

2010-05-21 Thread Don
> Now, if someone would make a Battery FOB that gives a broken SSD 60 seconds of power, then we could use the consumer SSDs in servers again with real value instead of CYA value.
You know- it would probably be sufficient to provide the SSD with _just_ a big capacitor bank. If the host lost

Re: [zfs-discuss] New SSD options

2010-05-21 Thread Don
I just spoke with a co-worker about doing something about it. He says he can design a small in-line UPS that will deliver 20-30 seconds of 3.3V, 5V, and 12V to the SATA power connector for about $50 in parts. It would be even less if only one voltage was needed. That should be enough for

Re: [zfs-discuss] New SSD options

2010-05-21 Thread Don
The SATA power connector supplies 3.3, 5 and 12v. A complete solution will have all three. Most drives use just the 5v, so you can probably ignore 3.3v and 12v. I'm not interested in building something that's going to work for every possible drive config- just my config :) Both the Intel X25-e

Re: [zfs-discuss] New SSD options

2010-05-20 Thread Don
> So, IMHO, a cheap consumer ssd used as a zil may still be worth it (for some use cases) to narrow the window of data loss from ~30 seconds to a sub-second value.
There are lots of reasons to enable the ZIL now- I can throw four very inexpensive SSD's in there now in a pair of mirrors, and then
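
That layout, four inexpensive SSDs as two mirrored log pairs which ZFS then stripes across, would be added roughly like this (device names are placeholders):

~# zpool add pool0 log mirror c7t0d0 c7t1d0 mirror c7t2d0 c7t3d0
~# zpool status pool0          # the slogs appear under a separate "logs" section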

Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
Well- 40k IOPS is the current claim from ZEUS- and they're the benchmark. They used to be 17k IOPS. How real any of these numbers are from any manufacturer is a guess. Given Intel's refusal to honor a cache flush, and their performance problems with the cache disabled- I don't trust them

Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
Well, the larger size of the Vertex, coupled with their smaller claimed write amplification, should result in sufficient service life for my needs. Their claimed MTBF also matches the Intel X25-E's.

Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
> Since it ignores the Cache Flush command and it doesn't have any persistent buffer storage, disabling the write cache is the best you can do.
This actually brings up another question I had: What is the risk, beyond a few seconds of lost writes, if I lose power, there is no capacitor and the cache

Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
> You can lose all writes from the last committed transaction (i.e., the one before the currently open transaction).
And I don't think that bothers me. As long as the array itself doesn't go belly up- then a few seconds of lost transactions are largely irrelevant- all of the QA virtual machines

Re: [zfs-discuss] New SSD options

2010-05-19 Thread Don
> You can lose all writes from the last committed transaction (i.e., the one before the currently open transaction).
I'll pick one- performance :) Honestly- I wish I had a better grasp on the real world performance of these drives. 50k IOPS is nice- and considering the incredible likelihood of

[zfs-discuss] New SSD options

2010-05-18 Thread Don
I'm looking for alternative SSD options to the Intel X25-E and the ZEUS IOPS. The ZEUS IOPS would probably cost as much as my entire current disk system (80 15k SAS drives)- and that's just silly. The Intel is much less expensive, and while fast- pales in comparison to the ZEUS. I've

Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?

2010-04-20 Thread Don Turnbull
Not to be a conspiracy nut but anyone anywhere could have registered that gmail account and supplied that answer. It would be a lot more believable from Mr Kay's Oracle or Sun account. On 4/20/2010 9:40 AM, Ken Gunderson wrote: On Tue, 2010-04-20 at 13:57 +0100, Dominic Kay wrote:

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
Yes yes- /etc/zfs/zpool.cache - we all hate typos :)

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
I must note that you haven't answered my question... If the zpool.cache file differs between the two heads for some reason- how do I ensure that the second head has an accurate copy without importing the ZFS pool?

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
I'm not certain if I'm misunderstanding you- or if you didn't read my post carefully. Why would the zpool.cache file be current on the _second_ node? The first node is where I've added my zpools and so on. The second node isn't going to have an updated cache file until I export the zpool from

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
Ok- I think perhaps I'm failing to explain myself. I want to know if there is a way for a second node- connected to a set of shared disks- to keep its zpool.cache up to date _without_ actually importing the ZFS pool. As I understand it- keeping the zpool up to date on the second node would

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
That section of the man page is actually helpful- as I wasn't sure what I was going to do to ensure the nodes didn't try to bring up the zpool on their own- outside of clustering software or my own intervention. That said- it still doesn't explain how I would keep the secondary nodes

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
Now I'm simply confused. Do you mean one cachefile shared between the two nodes for this zpool? How, may I ask, would this work? The rpool should be in /etc/zfs/zpool.cache. The shared pool should be in /etc/cluster/zpool.cache (or wherever you prefer to put it) so it won't come up on system
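
A sketch of that arrangement, using the paths suggested above; the copy step is just one simple way to keep the standby head's copy fresh:

~# zpool set cachefile=/etc/cluster/zpool.cache pool0             # keep the shared pool out of /etc/zfs/zpool.cache
~# scp /etc/cluster/zpool.cache node2:/etc/cluster/zpool.cache    # re-copy whenever the pool config changes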

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
I apologize- I didn't mean to come across as rude- I'm just not sure if I'm asking the right question. I'm not ready to use the ha-cluster software yet as I haven't finished testing it. For now I'm manually failing over from the primary to the backup node. That will change- but I'm not ready

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
I understand that the important bit about having the cachefile is the GUIDs (although the disk record is, I believe, helpful in improving import speeds) so we can recover in certain oddball cases. As such- I'm still confused why you say it's unimportant. Is it enough to simply copy the

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
Continuing on the best practices theme- how big should the ZIL slog disk be? The ZFS evil tuning guide suggests enough space for 10 seconds of my synchronous write load- even assuming I could cram 20 gigabits/sec into the host (two 10 GigE NICs), that only comes out to 200 gigabits, which = 25
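
Spelling out that arithmetic with the worst-case figures above (actual synchronous write load will be far lower than line rate):

  2 x 10 Gbit/s NICs = 20 Gbit/s peak inbound writes
  20 Gbit/s x 10 s   = 200 Gbit
  200 Gbit / 8       = 25 GB of slog that could ever be dirty at once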

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
> I think the size of the ZIL log is basically irrelevant
That was the understanding I got from reading the various blog posts and the tuning guide.
> only a single SSD, just due to the fact that you've probably got dozens of disks attached, and you'll probably use multiple log devices striped just

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
I always try to plan for the worst case- I just wasn't sure how to arrive at the worst case. Thanks for providing the information- and I will definitely checkout the dtrace zilstat script. Considering the smallest SSD I can buy from a manufacturer that I trust seems to be 32GB- that's probably

Re: [zfs-discuss] SSD best practices

2010-04-19 Thread Don
> A STEC Zeus IOPS SSD (45K IOPS) will behave quite differently than an Intel X-25E (~3.3K IOPS).
Where can you even get the Zeus drives? I thought they were only in the OEM market and last time I checked they were ludicrously expensive. I'm looking for between 5k and 10k IOPS using up to 4

Re: [zfs-discuss] SSD best practices

2010-04-18 Thread Don
So if the Intel X25E is a bad device- can anyone recommend an SLC device with good firmware? (Or an MLC drive that performs as well?) I've got 80 spindles in five 16-bay drive shelves (76 15k RPM SAS drives in 19 four-disk raidz sets, 2 hot spares, and 2 bays set aside for a mirrored ZIL) connected

Re: [zfs-discuss] SSD best practices

2010-04-18 Thread Don
If you have a pair of heads talking to shared disks with ZFS- what can you do to ensure the second head always has a current copy of the zpool.cache file? I'd prefer not to lose the ZIL, fail over, and then suddenly find out I can't import the pool on my second head.
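
On failover, the surviving head would then import with something like this (pool name and cachefile path follow the cluster-cachefile idea discussed above):

~# zpool import -c /etc/cluster/zpool.cache pool0   # use the copied cachefile
~# zpool import pool0                               # fallback: full device scan if the cachefile is stale or missing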

Re: [zfs-discuss] SSD best practices

2010-04-18 Thread Don
But if the X25E doesn't honor cache flushes then it really doesn't matter if they are mirrored- they both may cache the data, not write it out, and leave me screwed. I'm running 2009.06 and not one of the newer developer candidates that handle ZIL losses gracefully (or at all- at least as far

Re: [zfs-discuss] SSD best practices

2010-04-18 Thread Don
I'm not sure what you are referring to when you say "my running BE". I haven't looked at the zpool.cache file too closely but if the devices don't match between the two systems for some reason- isn't that going to cause a problem? I was really asking if there is a way to build the cache file

Re: [zfs-discuss] Replacing faulty disk in ZFS pool

2009-08-06 Thread Don Turnbull
I believe there are a couple of ways that work. The commands I've always used are to attach the new disk as a spare (if not already) and then replace the failed disk with the spare. I don't know if there are advantages or disadvantages but I also have never had a problem doing it this way.

Re: [zfs-discuss] Replacing faulty disk in ZFS pool

2009-08-06 Thread Don Turnbull
If he adds the spare and then manually forces a replace, it will take no more time than any other way. I do this quite frequently and without needing the scrub, which does take quite a lot of time. cindy.swearin...@sun.com wrote: Hi Andreas, Good job for using a mirrored configuration. :-)
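
A sketch of that add-spare-then-replace sequence (pool and device names are examples):

~# zpool add tank spare c2t5d0         # register the new disk as a hot spare
~# zpool replace tank c2t3d0 c2t5d0    # resilver onto the spare in place of the failed disk
~# zpool detach tank c2t3d0            # after resilvering, detach the failed disk so the spare becomes permanent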

Re: [zfs-discuss] Fed up with ZFS causing data loss

2009-08-03 Thread Don Turnbull
This may have been mentioned elsewhere and, if so, I apologize for repeating. Is it possible your difficulty here is with the Marvell driver and not, strictly speaking, ZFS? The Solaris Marvell driver has had many, MANY bug fixes and continues to this day to be supported by IDR patches and

[zfs-discuss] Losts of small files vs fewer big files

2009-07-07 Thread Don Turnbull
I work with Greenplum which is essentially a number of Postgres database instances clustered together. Being Postgres, the data is held in a lot of individual files, each of which can be fairly big (hundreds of MB or several GB) or very small (50MB or less). We've noticed a performance

Re: [zfs-discuss] Losts of small files vs fewer big files

2009-07-07 Thread Don Turnbull
Thanks for the suggestion! We've fiddled with this in the past. Our app uses 32k blocks instead of 8k, and it is data warehousing, so the I/O model is generally long sequential reads. Changing the blocksize has very little effect on us. I'll have to look at fsync; hadn't considered
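
For completeness, the recordsize experiment mentioned here is just (dataset name is a placeholder; the setting only affects blocks written after the change):

~# zfs set recordsize=32k dbpool/gpdata
~# zfs get recordsize dbpool/gpdata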

[zfs-discuss] Large zpool design considerations

2008-07-03 Thread Don Enrique
should I need to worry? Anyone have experience with this kind of setup? /Don E.

Re: [zfs-discuss] Large zpool design considerations

2008-07-03 Thread Don Enrique
Don Enrique wrote:
> Now, my initial plan was to create one large pool comprised of X RAIDZ-2 vdevs (7+2) with one hotspare per 10 drives and just continue to expand that pool as needed. Between calculating the MTTDL and performance models I was hit by a rather scary thought
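
That layout, 7+2 RAIDZ-2 vdevs with roughly one hot spare per ten drives, would be assembled along these lines (device names are placeholders):

~# zpool create bigpool \
     raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0 \
     raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 c2t8d0 \
     spare c3t0d0 c3t1d0

Later growth is then just zpool add bigpool raidz2 with another nine disks, plus additional spares as the drive count climbs.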