Re: [zfs-discuss] SSD's and ZFS...
It just so happens I have one of the 128G and two of the 32G versions in my drawer, waiting to go into our "DR" disk array when it arrives. I dropped the 128G into a spare Dell 745 (2GB RAM) and used an Ubuntu live CD to run some simple iozone tests on it. I had some stability issues with iozone crashing, but I did get some results; they are attached.

I intended to do two sets of tests, one for each of sequential reads, writes, and a "random IO" mix. I also wanted to do a second set of tests, running a streaming read or streaming write in parallel with the random IO mix, as I understand many SSDs have trouble with that kind of workload. As it turns out, so did my test PC. :-)

I've used 8K IO sizes for all the stage-one tests. I know I might get it to go faster with a larger size, but I like to know how well systems will do when I treat them badly! The Stage_1_Ops_thru_run is interesting: 2000+ ops/sec on random writes, 5000 on reads.

The streaming write load and "random over writes" were started at the same time, although I didn't see which one finished first, so it's possible that the stream finished first and allowed the random run to finish strong. Basically, take these numbers with several large grains of salt! Interestingly, the random IO mix doesn't slow down much, but the streaming writes are hurt a lot.

Regards,
Tristan.

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of thomas
Sent: Friday, 24 July 2009 5:23 AM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] SSD's and ZFS...

> I think it is a great idea, assuming the SSD has good write performance.
> This one claims up to 230MB/s read and 180MB/s write and it's only $196.
>
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393
>
> Compared to this one (250MB/s read and 170MB/s write) which is $699.
>
> Are those claims really trustworthy? They sound too good to be true!

MB/s numbers are not a good indication of performance. What you should pay attention to are usually random IOPS, write and read. They tend to correlate a bit, but those numbers on newegg are probably just best case from the manufacturer. In the world of consumer-grade SSDs, Intel has crushed everyone on IOPS performance.. but the other manufacturers are starting to catch up a bit.

        Iozone: Performance Test of File I/O
                Version $Revision: 3.287 $
                Compiled for 64 bit mode.
                Build: linux

        Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                      Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                      Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                      Randy Dunlap, Mark Montague, Dan Million, Jean-Marc Zucconi,
                      Jeff Blomberg, Benny Halevy, Erik Habbinga, Kris Strecker,
                      Walter Wong, Joshua Root.

        Run began: Fri Jul 24 20:59:50 2009

        Excel chart generation enabled
        POSIX Async I/O (no bcopy). Depth 12
        Microseconds/op Mode. Output is in microseconds per operation.
        Record Size 8 KB
        File size set to 6291456 KB
        Command line used: iozone -R -k 12 -i 0 -i 2 -N -r 8K -s 6G
        Time Resolution = 0.01 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
                                                        random   random
               KB  reclen   write  rewrite  read  reread  read    write
          6291456       8      46       51                  200      401

        iozone test complete.
        Excel output is below:

        "Writer report"
                "8"
        "6291456"    46

        "Re-writer report"
                "8"
        "6291456"    51

        "Random read report"
                "8"
        "6291456"    200

        "Random write report"
                "8"
        "6291456"    401

        Iozone: Performance Test of File I/O
                Version $Revision: 3.287 $
                Compiled for 64 bit mode.
                Build: linux

        Contributors: William Norcott, Don Capps, Isom Crawford, Kirby Collins,
                      Al Slater, Scott Rhine, Mike Wisner, Ken Goss,
                      Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR
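The stage-two mix described above (a streaming write running alongside the random-IO run) can be approximated with two iozone instances in parallel. A rough sketch; file paths and sizes are placeholders, not what Tristan used:

        # streaming writer in the background
        iozone -i 0 -r 128k -s 4G -f /ssd/stream.tmp &
        # random read/write mix against a separate file, reported in ops/sec
        iozone -i 0 -i 2 -O -r 8k -s 4G -f /ssd/random.tmp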
Re: [zfs-discuss] Opensolaris attached to 70 disk HP array
That is an interesting bit of kit. I wish a "white box" manufacturer would create something like this (hint hint, Supermicro).
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
> Note that the 'ecccheck.pl' script depends on the 'pcitweak' utility
> which is no longer present in OpenSolaris 2009.06 and Ubuntu 8.10
> because of Xorg changes.

This is exactly the kind of hidden trap I fear. One does everything right and then discovers that xx is missing, or has been changed, or depends on yy, or doesn't work with zz. And that discovery comes after hours/days/weeks of trying to find out why something misbehaves. Thanks for the heads-up!

2008.11 would be a safer bet then? Or Solaris CE?
Re: [zfs-discuss] ZFS Mirror cloning
On Fri, Jul 24, 2009 at 9:24 AM, Jorgen Lundman wrote:
>> However, "zpool detach" appears to mark the disk as blank, so nothing will
>> find any pools (import, import -D etc). zdb -l will show labels,

If both disks are bootable (with installboot or installgrub), removing one half of the mirror and putting it in the new server should create an exact clone (including IP address and hostname). I don't think this is recommended, though.

This page provides root pool recovery methods, which should also be usable for cloning purposes:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery

-- Fajar
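Making both halves of the root mirror bootable beforehand is the key step. A minimal sketch; the device name is a placeholder and the slice must already be attached to the root pool:

        # x86: install GRUB stage1/stage2 on the second mirror half
        installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0

        # SPARC: install the ZFS boot block instead
        installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0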
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
More choice is good! It seems Intel's server boards sometimes accept desktop CPUs, but don't support SpeedStep. Is all OK with those?
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Cheers Miles, and thanks also for the tip to look in the BIOS options to see if ECC is actually used. Which mode would you use? Max seems the most appealing; why would anyone use something called "basic"? But there must be a catch if they provide several ECC support modes. I am glad this thread seems to be going somewhere, with lots of valuable contributions =:^)
Re: [zfs-discuss] ZFS Mirror cloning
Jorgen Lundman wrote:
> However, "zpool detach" appears to mark the disk as blank, so nothing will
> find any pools (import, import -D etc). zdb -l will show labels,

For kicks, I tried to demonstrate that this does indeed happen: I dd'ed the first 1024 1k blocks from the disk, ran "zpool detach" on it, then dd'ed the image back out to the HDD. Pulled the disk out and it boots directly, without any intervention.

If only "zpool detach" had a flag to tell it not to scribble over the detached disk. I guess I could diff the before and after disk images and work out what it is that it does, and write a tool to undo it, or figure out if I can undo it using "zdb".

Lund

--
Jorgen Lundman     | Unix Administrator | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo  | +81 (0)90-5578-8500 (cell)
Japan              | +81 (0)3-3375-1767 (home)
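A sketch of that label save/restore trick; pool, device, and file names are placeholders, and it is untested, so use with care:

        # save the first 1 MB of the mirror half (covers the two front ZFS labels)
        dd if=/dev/rdsk/c0t1d0s0 of=/var/tmp/labels.img bs=1k count=1024
        # detach it -- this is the step that scribbles over the labels
        zpool detach rpool c0t1d0s0
        # write the saved labels back so the disk still looks like a pool member
        dd if=/var/tmp/labels.img of=/dev/rdsk/c0t1d0s0 bs=1k count=1024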
[zfs-discuss] Opensolaris attached to 70 disk HP array
Looking at this external array by HP:

http://h18006.www1.hp.com/products/storageworks/600mds/index.html

70 disks in 5U, which could probably be configured as JBOD. Has anyone attempted to connect this to a box running OpenSolaris to create a 70-disk pool?

--
Brent Jones
br...@servuhome.net
Re: [zfs-discuss] triple-parity: RAID-Z3
Robert,

On Fri, Jul 24, 2009 at 12:59:01AM +0100, Robert Milkowski wrote:
>> To what analysis are you referring? Today the absolute fastest you can
>> resilver a 1TB drive is about 4 hours. Real-world speeds might be half
>> that. In 2010 we'll have 3TB drives, meaning it may take a full day to
>> resilver. The odds of hitting a latent bit error are already reasonably
>> high, especially with a large pool that's infrequently scrubbed. What
>> then are the odds of a second drive failing in the 24 hours it takes
>> to resilver?
>
> I wish it was so good with raid-zN.
> In real life, at least from my experience, it can take several days to
> resilver a disk for vdevs in raid-z2 made of 11x SATA disk drives with
> real data.
> While the way ZFS synchronizes data is way faster under some circumstances,
> it is also much slower under others.
> IIRC some builds ago there were some fixes integrated, so maybe it is
> different now.

Absolutely. I was talking more or less about optimal timing. I realize that due to the priorities within ZFS and real-world loads it can take far longer.

Adam

--
Adam Leventhal, Fishworks            http://blogs.sun.com/ahl
Re: [zfs-discuss] ZFS Mirror cloning
Ok, so it seems that with DiskSuite, detaching a mirror does nothing to the disk you detached.

However, "zpool detach" appears to mark the disk as blank, so nothing will find any pools (import, import -D etc). zdb -l will show labels, but no amount of work that we have found will bring the HDD back online in the new server. Grub is blank, and findroot can not see any pool.

zpool will not let you offline the 2nd disk in a mirror. This is incorrect behaviour. You can not "cfgadm unconfigure" the sata device while zpool has the disk. We can just yank the disk, but we had issues getting a new blank disk recognised after that: cfgadm would not release the old disk. However, we found we can do this:

# cfgadm -x sata_port_deactivate sata0/1::dsk/c0t1d0

This will make zpool mark it:

c0t1d0s0  REMOVED  0  0  0

and eventually:

c0t1d0s0  FAULTED  0  0  0  too many errors

After that, we pull out the disk, and issue:

# zpool detach zboot c0t1d0s0
# cfgadm -x sata_port_activate sata0/1::dsk/c0t1d0
# cfgadm -c configure sata0/1::dsk/c0t1d0
# format   (fdisk, partition as required to be the same)
# zpool attach zboot c0t0d0s0 c0t1d0s0

There is one final thing to address: when the disk is used in a new machine, it will generally panic with "pool was used previously with system-id xx", which requires more miniroot work. It would be nice to be able to avoid this as well. But you can't export the "/" pool before pulling out the disk, either.

Jorgen Lundman wrote:

Hello list,

Before we started changing to ZFS bootfs, we used DiskSuite mirrored ufs boot. Very often, if we needed to grow a cluster by another machine or two, we would simply clone a running live server. Generally the procedure for this would be:

1. detach the "2nd" HDD, metaclear, and delete metadb on the 2nd disk.
2. mount the 2nd HDD under /mnt, and change system/vfstab to be a single boot HDD, and no longer "mirrored", as well as the host name and IP addresses.
3. bootadm update-archive -R /mnt
4. unmount, cfgadm unconfigure, and pull out the HDD.

and generally, in about ~4 minutes, we have a new live server in the cluster.

We tried to do the same thing today, but with a ZFS bootfs. We did:

1. zpool detach on the "2nd HDD".
2. cfgadm unconfigure the HDD, and pull out the disk.

The source server was fine: we could insert a new disk, attach it, and it resilvered. However, the new destination server had lots of issues. At first, grub would give no menu at all, just the grub? command prompt. The command:

findroot(pool_zboot,0,a)

would return "Error 15: No such file". After booting a Solaris Live CD, I could "zpool import" the pool, but of course it was in Degraded mode etc. Now it would show a menu, but if you boot it, it would flash the message that the pool was last accessed by Solaris $sysid, and "panic". After a lot of reboots and fiddling, I managed to get miniroot to at least boot; then, only after inserting a new HDD and letting the pool become completely "good" would it let me boot into multi-user.

Is there something we should do perhaps, that will let the cloning procedure go smoothly? Should I "export" the 'now separated disk' somehow? In fact, can I mount that disk to make changes to it before pulling out the disk?

Most documentation on cloning uses "zfs send", which would be possible, but 4 minutes is hard to beat when your cluster is under heavy load.
Lund

--
Jorgen Lundman     | Unix Administrator | +81 (0)3-5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo  | +81 (0)90-5578-8500 (cell)
Japan              | +81 (0)3-3375-1767 (home)
Re: [zfs-discuss] triple-parity: RAID-Z3
Adam Leventhal wrote:
> I just blogged about triple-parity RAID-Z (raidz3):
>
> http://blogs.sun.com/ahl/entry/triple_parity_raid_z
>
> As for performance, on the system I was using (a max config Sun Storage 7410),
> I saw about a 25% improvement to 1GB/s for a streaming write workload.
> YMMV, but I'd be interested in hearing your results.

25% improvement when comparing what exactly to what?

--
Robert Milkowski
http://milek.blogspot.com
Re: [zfs-discuss] triple-parity: RAID-Z3
Adam Leventhal wrote:
> Hey Bob,
>
>> MTTDL analysis shows that given normal environmental conditions, the MTTDL
>> of RAID-Z2 is already much longer than the life of the computer or the
>> attendant human. Of course sometimes one encounters unusual conditions
>> where additional redundancy is desired.
>
> To what analysis are you referring? Today the absolute fastest you can
> resilver a 1TB drive is about 4 hours. Real-world speeds might be half
> that. In 2010 we'll have 3TB drives, meaning it may take a full day to
> resilver. The odds of hitting a latent bit error are already reasonably
> high, especially with a large pool that's infrequently scrubbed. What
> then are the odds of a second drive failing in the 24 hours it takes
> to resilver?

I wish it was so good with raid-zN. In real life, at least from my experience, it can take several days to resilver a disk for vdevs in raid-z2 made of 11x SATA disk drives with real data. While the way ZFS synchronizes data is way faster under some circumstances, it is also much slower under others. IIRC some builds ago there were some fixes integrated, so maybe it is different now.

>> I do think that it is worthwhile to be able to add another parity disk to
>> an existing raidz vdev but I don't know how much work that entails.
>
> It entails a bunch of work:
>
> http://blogs.sun.com/ahl/entry/expand_o_matic_raid_z
>
> Matt Ahrens is working on a key component after which it should all be
> possible.

A lot of people are waiting for it! :) :) :)

ps. thank you for raid-z3!

--
Robert Milkowski
http://milek.blogspot.com
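For reference, a triple-parity vdev is created the same way as raidz or raidz2, just with the raidz3 keyword. A sketch on a build that has the feature; disk names are placeholders:

        # zpool create tank raidz3 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
        # zpool status tank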
Re: [zfs-discuss] Another user loses his pool (10TB) in this case and 40 days work
On 07/21/09 01:21 PM, Richard Elling wrote:
> I never win the lottery either :-)

Let's see. Your chance of winning a 49-ball lottery is apparently around 1 in 14*10^6, although it's much better than that because of submatches (smaller payoffs for matches on fewer than 6 balls). There are about 32*10^6 seconds in a year. If ZFS saves its writes for 30 seconds and batches them out, that means one write leaves the buffer exposed for roughly one millionth of a year. If you have 4GB of memory, you might get 50 errors a year, but you say ZFS uses only 1/10 of this for writes, so that memory could see 5 errors/year. If your single write was 1/70th of that (say around 6 MB), your chance of a hit is around (5/70)*10^-6, or 1 in 14*10^6, so you are correct! So if you do one 6MB write/year, your chances of a hit in a year are about the same as that of winning a grand slam lottery.

Hopefully not every hit will trash a file or pool, but odds are that you'll do many more writes than that, so on the whole I think a ZFS hit is quite a bit more likely than winning the lottery each year :-). Conversely, if you average one big write every 3 minutes or so (20% occupancy), odds are almost certain that you'll get one hit a year. So some SOHO users who do far fewer writes won't see any hits (say) over a 5-year period. But some will, and they will be most unhappy -- calculate your odds and then make a decision! I daresay the PC makers have done this calculation, which is why PCs don't have ECC, and hence IMO make for insufficiently reliable servers.

Conclusions from what I've gleaned from all the discussions here: if you are too cheap to opt for mirroring, your best bet is to disable checksumming and set copies=2. If you mirror but don't have ECC, then at least set copies=2 and consider disabling checksums. Actually, set copies=2 regardless, so that you have some redundancy if one half of the mirror fails and you have a 10-hour resilver, in which time you could easily get a (real) disk read error.

It seems to me some vendor is going to cotton on to the SOHO server problem and make a bundle at the right price point. Sun's offerings seem unfortunately mostly overkill for the SOHO market, although the X4140 looks rather interesting... Shame there aren't any entry-level SPARCs any more :-(. Now what would doctors' front offices do if they couldn't blame the computer for being down all the time?

> It is quite simple -- ZFS sent the flush command and VirtualBox ignored it.
> Therefore the bits on the persistent store are consistent.

But even on the most majestic of hardware, a flush command could be lost, could it not? An obvious case in point is ZFS over iSCSI and a router glitch. But the discussion seems to be moot since CR 6667683 is being addressed. Now about those writes to mirrored disks :)

Cheers -- Frank
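The copies setting mentioned above is a per-dataset property. A quick sketch; the dataset name is a placeholder, and note it only affects blocks written after the property is set:

        # zfs set copies=2 tank/home
        # zfs get copies tank/home
        # and, if you really want to, checksums can be disabled per dataset:
        # zfs set checksum=off tank/home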
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
chris wrote:
> Ok, so the choice for a MB boils down to:
> - Intel desktop MB, no ECC support

This is mostly true. The exceptions are some implementations of the Socket T LGA 775 (i.e. late Pentium 4 series, and Core 2) D975X and X38 chipsets, and possibly some X48 boards as well. Intel's other desktop chipsets do not support ECC. Some motherboard examples include:

Intel DX38BT - ECC support is mentioned in the documentation and is a BIOS option
Gigabyte GA-X38-DS4, GA-EX38-DS4 - ECC support is mentioned in the documentation and is listed in the website FAQ

The Sun Ultra 24 also uses the X38 chipset. It's not clear how well ECC support has actually been implemented on the Intel and Gigabyte boards, i.e. whether it is simply compatible with unbuffered ECC memory, or actually able to initialize and use the ECC capability. I mentioned the X48 chipset above because discussions surrounding it say it is just a higher-binned X38 chip.

On Linux, the EDAC project maintains software to manage the motherboard's ECC capability. A list of memory controllers currently supported by Linux EDAC is here:

http://buttersideup.com/edacwiki/Main_Page

A prior discussion thread in 'fm' titled 'X38/975x ECC memory support' is here:

http://opensolaris.org/jive/thread.jspa?threadID=52440&tstart=60

Thread links:

http://www.madore.org/~david/linux/#ECC_for_82x
http://developmentonsolaris.wordpress.com/2008/03/12/intel-82975x-mch-and-logging-of-ecc-events-on-solaris/

Note that the 'ecccheck.pl' script depends on the 'pcitweak' utility which is no longer present in OpenSolaris 2009.06 and Ubuntu 8.10 because of Xorg changes. One Linux user needing the utility copied it from another distro. The version of pcitweak included with previous versions of OpenSolaris might work on 2009.06.

http://opensolaris.org/jive/thread.jspa?threadID=105975&tstart=90
http://ubuntuforums.org/showthread.php?t=1054516

Finally, on unbuffered ECC memory prices and speeds: they are a bit behind in price and speed vs. regular unbuffered RAM, but both are still reasonable. When comparing prices, keep in mind that ECC RAM uses 9 chips where non-ECC uses 8, so expect at least a 12.5% price increase. Consider:

DDR2: $64 for Crucial 4GB kit (2GBx2), 240-pin DIMM, Unbuffered DDR2 PC2-6400 memory module
http://www.crucial.com/store/partspecs.aspx?IMODULE=CT2KIT25672AA800

DDR3: $108 for Crucial 6GB (3 x 2GB) 240-Pin DDR3 SDRAM ECC Unbuffered DDR3 1333 (PC3 10600) Triple Channel Kit Server Memory Model CT3KIT25672BA1339 - Retail
http://www.newegg.com/Product/Product.aspx?Item=N82E16820148259

-hk

> - Intel server MB, ECC support, expensive (requires a Xeon for speedstep
>   support). It is a shame to waste top kit doing nothing 24/7.
> - AMD K8: ECC support (right?), no Cool'n'Quiet support (but maybe still
>   cool enough with the right CPU?)
> - AMD K10: should have the best of both worlds: ECC support, Cool'n'Quiet,
>   cheap-ish and lowish-power CPU like the Athlon II 250
>
> Is my understanding correct? Like many I want a reliable, cheap, low-power,
> ECC-supporting MB. Integrated video and a low-power chipset would be best.
> The sata ports will have to come from an additional controller it seems,
> but that's life.
>
> Intel gear is best supported, but they shoot themselves (or is that us?) in
> the foot with their ECC-on-server-MB policy. AMD K10 seems the most tempting
> as it has it all. I wonder about solaris support though. For example, is an
> AM3 MB OK with solaris?
> I'd like this hopefully to work right away with OpenSolaris 2009.06, without
> fiddling with drivers; I don't have much time or skills. What AM3 MB do you
> guys know that is trouble-free with solaris? If none, maybe top-quality RAM
> (suggestions?) would allow me to forego ECC and use a well-supported
> low-power Intel board (suggestions?) instead? And an E5200?
>
> Thanks for your insight.
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
I'm going the other route here, and using an Intel small server motherboard. I'm currently trying the Supermicro X7SBE, which supports a non-Xeon CPU, and _should_ actually use the (unbuffered) ECC RAM I have in it. It can also support a network KVM IPMI board, which is nice (though not cheap - i.e. $100 or so). The Supermicro X7SBL-LN[12] boards also look good, though they won't support the network KVM option.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] SSD's and ZFS...
I didn't mean using the slog for the root pool. I meant using the slog for a data pool, where the data pool consists of (rotating) hard disks complemented by an SSD-based slog. But instead of a dedicated SSD for the slog, I want the root pool to share the SSD with the slog. Both can be mirrored to a second SSD.

I think that in this scenario my initial concern remains. Since you cannot remove a slog from a pool, if you want to move the pool or something bad happens, you're in trouble.

Richard, I'm under the impression that most current SSDs have a DRAM buffer. Some are used only for reading, some are also used for writing. I'm pretty sure the Sun LogZilla devices (the STEC Zeus) have a DRAM write buffer. Some have a supercap to flush the caches, others don't.

I'm trying to compile some guidelines regarding write caching, SSDs and ZFS. I don't like the posts like "I can't import my pool", "my pool went down the Niagara Falls", etc. So in order to prevent more of these stories I think it's important to get it in the open whether write caching can be enabled on SSDs (full-disk and slice usage). I'm really looking for a conclusive test to determine whether or not it can be enabled.

Regards,
Frederik
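For what it's worth, the slices-on-one-SSD layout attaches to a data pool like this. A sketch only; pool and slice names are placeholders, and the slog still cannot be removed afterwards:

        # root pool lives in s0 of each SSD; use another slice of each as a mirrored slog:
        # zpool add tank log mirror c2t0d0s3 c3t0d0s3
        # a further slice could serve as L2ARC (cache devices can be removed later):
        # zpool add tank cache c2t0d0s4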
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
> "c" == chris writes: c> do you know what the ECC BIOS modes mean? It's about the hardware scrubbing feature I mentioned. pgpcOJUfEwhmS.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
On Thu, 2009-07-23 at 14:24 -0700, Richard Elling wrote: > On Jul 23, 2009, at 11:09 AM, Greg Mason wrote: > > I think it is a great idea, assuming the SSD has good write > performance. > >>> This one claims up to 230MB/s read and 180MB/s write and it's only > >>> $196. > >>> > >>> http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393 > >>> > >>> Compared to this one (250MB/s read and 170MB/s write) which is $699. > >>> > >> Oops. Forgot the link: > >> > >> http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014 > >>> Are those claims really trustworthy? They sound too good to be true! > >>> > >>> -Kyle > > > > Kyle- > > > > The less expensive SSD is an MLC device. The Intel SSD is an SLC > > device. > > Some newer designs use both SLC and MLC. It is no longer possible > to use SLC vs MLC as a primary differentiator. Use the specifications. > -- richard > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss I'm finding the new-gen MLC w/ large DRAM cache & improved microcontroller to be more than sufficient for workgroup server. e.g. the OCZ Summit series and similar. I suspect the Intel X25-M is likely good enough, too. I'm using one SSD for both read and write caches, and it's good enough for a 20-person small workgroup server doing NFS. I suspect that write caches are much more sensitive to IOPS performance than read ones, but that's just my feeling. In any case, I'd pay more attention to the IOPS rating for things, than the sync read/write speeds. I'm testing that set up right now for iSCSI-based xVM guests, so we'll see if it can stand the IOPs. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Oh, and another unrelated question: would I be better off using OpenSolaris or Solaris Community Edition? I suspect SCE has more drivers (though maybe in a more beta state?), but its huge download size (several days in backward New Zealand, thanks Telecom NZ!) means I would only try it if there is good justification. What would you guys recommend (I know, this is an OpenSolaris forum, but at least can you tell me how these two differ)?
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
The Asus M4N78-VM uses an Nvidia GeForce 8200 chipset (this board only has one PCIe x16 slot though; I should look at those that have two slots).
Re: [zfs-discuss] SSD's and ZFS...
On Jul 23, 2009, at 11:09 AM, Greg Mason wrote: I think it is a great idea, assuming the SSD has good write performance. This one claims up to 230MB/s read and 180MB/s write and it's only $196. http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393 Compared to this one (250MB/s read and 170MB/s write) which is $699. Oops. Forgot the link: http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014 Are those claims really trustworthy? They sound too good to be true! -Kyle Kyle- The less expensive SSD is an MLC device. The Intel SSD is an SLC device. Some newer designs use both SLC and MLC. It is no longer possible to use SLC vs MLC as a primary differentiator. Use the specifications. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Thanks for this, good news! Yes, I would try to use onboard video.

> Please note that frequency scaling is only supported
> on the K10 architecture. But don't expect too much
> power saving from it. A lower voltage yields far
> greater savings than a lower frequency.

Doesn't Cool'n'Quiet step the voltage as well? An Athlon X2 5050e and an Athlon II X2 250 are the same price. The former has a TDP of 45W, while the latter is 65W. But the 250 uses 45nm technology and the K10 architecture, so I would hope that its power consumption at idle would be lower. Would you agree?

Also, out of interest, do you know what the ECC BIOS modes mean?
Re: [zfs-discuss] No ZFS snapshot since upgrade to 2009.06
This is probably bug #6462803. The work-around goes something like this: $ pfexec bash # beadm mount opensolaris /mnt # beadm unmount opensolaris # svcadm clear svc:/system/filesystem/zfs/auto-snapshot:frequent # svcadm clear svc:/system/filesystem/zfs/auto-snapshot:hourly # svcadm clear svc:/system/filesystem/zfs/auto-snapshot:daily # svcs -a | grep snapshot The last command should show that the services are back online. n On Thu, Jul 23, 2009 at 12:31 PM, Axelle Apvrille wrote: > I've upgrade my OpenSolaris 2008.11 to 2009.06. During that process it > created a new boot environment: > BE Active Mountpoint Space Policy Created > -- -- -- - -- --- > opensolaris NR / 7.53G static 2009-01-03 13:18 > opensolaris-1 - - 2.80G static 2009-07-20 22:38 > > But now, all my zfs snapshot services are in maintenance mode: > svc:/system/filesystem/zfs/auto-snapshot:frequent (ZFS automatic snapshots) > State: maintenance since Thu Jul 23 20:21:29 2009 > Reason: Restarter svc:/system/svc/restarter:default gave no explanation. > See: http://sun.com/msg/SMF-8000-9C > See: /var/svc/log/system-filesystem-zfs-auto-snapshot:frequent.log > Impact: 1 dependent service is not running: > svc:/application/time-slider:default > etc > > The logs say that my host tried to create a snapshot, but couldn't because > 'dataset is busy': > cannot create snapshot > 'rpool/ROOT/opensolari...@zfs-auto-snap:frequent-2009-07-23-20:21': dataset > is busy > no snapshots were created > Error: Unable to take recursive snapshots of > rpool/r...@zfs-auto-snap:frequent-2009-07-23-20:21. > Moving service svc:/system/filesystem/zfs/auto-snapshot:frequent to > maintenance mode. > > Anyone knows how to fix this ? > What is that boot environment opensolaris-1 for ? can I erase it safely ? > > Regards > -- > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] [SOLVED] Re: No ZFS snapshot since upgrade to 2009.06
Ok, I've found the solution to my problem on the Internet here:

http://sigtar.com/2009/03/17/troubleshooting-time-slider-zfs-snapshots/

This was indeed caused by the old boot environment. This is how to solve it:

- disable snapshots on the old boot environment:
  pfexec zfs set com.sun:auto-snapshot=false rpool/ROOT/opensolaris-1
- clear the zfs snapshot services:
  pfexec svcadm clear auto-snapshot:frequent
  pfexec svcadm clear auto-snapshot:hourly
  pfexec svcadm clear auto-snapshot:daily
- launch the time slider and enable zfs snapshots on the appropriate systems
- check the services are online with svcs:
  online 21:50:20 svc:/system/filesystem/zfs/auto-snapshot:hourly
  online 21:50:26 svc:/system/filesystem/zfs/auto-snapshot:frequent
  online 21:50:36 svc:/system/filesystem/zfs/auto-snapshot:daily

Hurray!

-- Axelle
Re: [zfs-discuss] SSD's and ZFS...
> I don't think this is limited to root pools. None of my pools (root or
> non-root) seem to have the write cache enabled. Now that I think about
> it, all my disks are "hidden" behind an LSI1078 controller so I'm not
> sure what sort of impact that would have on the situation.

I have a few of those controllers as well. I wouldn't believe for a second that ZFS could change the controller config for an LD; I can't see how it could.

# /usr/local/bin/CLI/MegaCli -LdGetProp -DskCache -LAll -a0

Adapter 0-VD 0(target id: 0): Disk Write Cache : Disabled
Adapter 0-VD 1(target id: 1): Disk Write Cache : Disabled
Adapter 0-VD 2(target id: 2): Disk Write Cache : Disabled
Adapter 0-VD 3(target id: 3): Disk Write Cache : Disabled
Adapter 0-VD 4(target id: 4): Disk Write Cache : Disabled

The comment later about defining a pool with and without the sX syntax warrants a test :) Good to keep in mind...

jlc
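If you did want to flip it at the controller instead, MegaCli has a matching set property. A sketch, assuming I remember the option names correctly (verify against MegaCli's help output before trusting this):

# /usr/local/bin/CLI/MegaCli -LDSetProp EnDskCache -LAll -a0
# /usr/local/bin/CLI/MegaCli -LdGetProp -DskCache -LAll -a0    (verify)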
[zfs-discuss] No ZFS snapshot since upgrade to 2009.06
I've upgraded my OpenSolaris 2008.11 to 2009.06. During that process it created a new boot environment:

BE            Active Mountpoint Space Policy Created
--            ------ ---------- ----- ------ -------
opensolaris   NR     /          7.53G static 2009-01-03 13:18
opensolaris-1 -      -          2.80G static 2009-07-20 22:38

But now, all my zfs snapshot services are in maintenance mode:

svc:/system/filesystem/zfs/auto-snapshot:frequent (ZFS automatic snapshots)
 State: maintenance since Thu Jul 23 20:21:29 2009
 Reason: Restarter svc:/system/svc/restarter:default gave no explanation.
 See: http://sun.com/msg/SMF-8000-9C
 See: /var/svc/log/system-filesystem-zfs-auto-snapshot:frequent.log
 Impact: 1 dependent service is not running:
 svc:/application/time-slider:default
etc.

The logs say that my host tried to create a snapshot, but couldn't because the 'dataset is busy':

cannot create snapshot 'rpool/ROOT/opensolari...@zfs-auto-snap:frequent-2009-07-23-20:21': dataset is busy
no snapshots were created
Error: Unable to take recursive snapshots of rpool/r...@zfs-auto-snap:frequent-2009-07-23-20:21.
Moving service svc:/system/filesystem/zfs/auto-snapshot:frequent to maintenance mode.

Does anyone know how to fix this? What is that boot environment opensolaris-1 for? Can I erase it safely?

Regards
Re: [zfs-discuss] SSD's and ZFS...
> I think it is a great idea, assuming the SSD has good write performance.
> This one claims up to 230MB/s read and 180MB/s write and it's only $196.
>
> http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393
>
> Compared to this one (250MB/s read and 170MB/s write) which is $699.
>
> Are those claims really trustworthy? They sound too good to be true!

MB/s numbers are not a good indication of performance. What you should pay attention to are usually random IOPS, write and read. They tend to correlate a bit, but those numbers on newegg are probably just best case from the manufacturer. In the world of consumer grade SSDs, Intel has crushed everyone on IOPS performance.. but the other manufacturers are starting to catch up a bit.
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
On 07/23/09 09:19 AM, Richard Elling wrote: On Jul 23, 2009, at 5:42 AM, F. Wessels wrote: Hi, I'm using asus m3a78 boards (with the sb700) for opensolaris and m2a* boards (with the sb600) for linux some of them with 4*1GB and others with 4*2Gb ECC memory. Ecc faults will be detected and reported. I tested it with a small tungsten light. By moving the light source slowly towards the memory banks you'll heat them up in a controlled way and at a certain point bit flips will occur. I am impressed! I don't know very many people interested in inducing errors in their garage. This is an excellent way to demonstrate random DRAM errors. Well done! I recommend you to go for a m4a board since they support up to 16 GB. I don't know if you can run opensolaris without a videocard after installation I think you can disable the "halt on no video card" in the bios. But Simon Breden had some trouble with it, see his homeserver blog. But you can go for one of the three m4a boards with a 780g onboard. Those will give you 2 pci-e x16 connectors. I don't think the onboard nic is supported. What is the specific model of the onboard nic chip? We may be working on it right now. Neal I always put an intel (e1000) in, just to prevent any trouble. I don't have any trouble with the sb700 in ahci mode. Hotplugging works like a charm. Transfering a couple of GB's over esata takes considerable less time than via usb. I have a pata to dual cf adapter and two industrial 16gb cf cards as mirrored root pool. It takes for ever to install nevada, at least 14 hours. I suspect the cf cards lack caches. But I don't update that regularly, still on snv104. And have 2 mirrors and a hot spare. The sixth port is an esata port I use to transfer large amounts of data. This system consumes about 73 watts idle and 82 under load i/o load. (5 disks , a separate nic ,8 gb ram and a be2400 all using just 73 watts!!!) How much power does the tungsten light burn? :-) -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
Adam Sherman wrote:
> In the context of a low-volume file server, for a few users, is the low-end
> Intel SSD sufficient?

You're right, it supposedly has less than half the write speed, and that probably won't matter for me, but I can't find a 64GB version of it for sale, and the 80GB version is over 50% more at $314.

-Kyle A.
Re: [zfs-discuss] SSD's and ZFS...
In the context of a low-volume file server, for a few users, is the low-end Intel SSD sufficient? A. -- Adam Sherman +1.613.797.6819 On 2009-07-23, at 14:09, Greg Mason wrote: I think it is a great idea, assuming the SSD has good write performance. This one claims up to 230MB/s read and 180MB/s write and it's only $196. http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393 Compared to this one (250MB/s read and 170MB/s write) which is $699. Oops. Forgot the link: http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014 Are those claims really trustworthy? They sound too good to be true! -Kyle Kyle- The less expensive SSD is an MLC device. The Intel SSD is an SLC device. That right there accounts for the cost difference. The SLC device (Intel X25-E) will last quite a bit longer than the MLC device. -Greg -- Greg Mason System Administrator Michigan State University High Performance Computing Center ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
Greg Mason wrote:
>>> I think it is a great idea, assuming the SSD has good write performance.
>> This one claims up to 230MB/s read and 180MB/s write and it's only $196.
>>
>> http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393
>>
>> Compared to this one (250MB/s read and 170MB/s write) which is $699.
>> Oops. Forgot the link:
>> http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014
>>
>> Are those claims really trustworthy? They sound too good to be true!
>>
>> -Kyle
>
> Kyle-
>
> The less expensive SSD is an MLC device. The Intel SSD is an SLC device.
> That right there accounts for the cost difference. The SLC device (Intel
> X25-E) will last quite a bit longer than the MLC device.
>
> -Greg

I understand that. That's why I picked that one to compare. It was my understanding that the MLC drives weren't even close, performance-wise, to the SLC ones. This one seems pretty close. How can that be?

-Kyle
Re: [zfs-discuss] SSD's and ZFS...
> >> I think it is a great idea, assuming the SSD has good write performance. > > This one claims up to 230MB/s read and 180MB/s write and it's only $196. > > > > http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393 > > > > Compared to this one (250MB/s read and 170MB/s write) which is $699. > > > Oops. Forgot the link: > > http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014 > > Are those claims really trustworthy? They sound too good to be true! > > > > -Kyle Kyle- The less expensive SSD is an MLC device. The Intel SSD is an SLC device. That right there accounts for the cost difference. The SLC device (Intel X25-E) will last quite a bit longer than the MLC device. -Greg -- Greg Mason System Administrator Michigan State University High Performance Computing Center ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
Kyle McDonald wrote: Richard Elling wrote: On Jul 23, 2009, at 9:37 AM, Kyle McDonald wrote: Richard Elling wrote: On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote: F. Wessels wrote: Thanks posting this solution. But I would like to point out that bug 6574286 "removing a slog doesn't work" still isn't resolved. A solution is under it's way, according to George Wilson. But in the mean time, IF something happens you might be in a lot of trouble. Even without some unfortunate incident you cannot for example export your data pool, pull the drives and leave the root pool. In my case the slog slice wouldn't be the slog for the root pool, it would be the slog for a second data pool. If the device went bad, I'd have to replace it, true. But if the device goes bad, then so did a good part of my root pool, and I'd have to replace that too. Mirror the slog to match your mirrored root pool. Yep. That was the plan. I was just explaining that not being able to remove the slog wasn't an issue for me since I planned on always having that device available. I was more curious about whether there were any diown sides to sharing the SSD between the root pool and the slog? I think it is a great idea, assuming the SSD has good write performance. This one claims up to 230MB/s read and 180MB/s write and it's only $196. http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393 Compared to this one (250MB/s read and 170MB/s write) which is $699. Oops. Forgot the link: http://www.newegg.com/Product/Product.aspx?Item=N82E16820167014 Are those claims really trustworthy? They sound too good to be true! -Kyle -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
Richard Elling wrote: On Jul 23, 2009, at 9:37 AM, Kyle McDonald wrote: Richard Elling wrote: On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote: F. Wessels wrote: Thanks posting this solution. But I would like to point out that bug 6574286 "removing a slog doesn't work" still isn't resolved. A solution is under it's way, according to George Wilson. But in the mean time, IF something happens you might be in a lot of trouble. Even without some unfortunate incident you cannot for example export your data pool, pull the drives and leave the root pool. In my case the slog slice wouldn't be the slog for the root pool, it would be the slog for a second data pool. If the device went bad, I'd have to replace it, true. But if the device goes bad, then so did a good part of my root pool, and I'd have to replace that too. Mirror the slog to match your mirrored root pool. Yep. That was the plan. I was just explaining that not being able to remove the slog wasn't an issue for me since I planned on always having that device available. I was more curious about whether there were any diown sides to sharing the SSD between the root pool and the slog? I think it is a great idea, assuming the SSD has good write performance. This one claims up to 230MB/s read and 180MB/s write and it's only $196. http://www.newegg.com/Product/Product.aspx?Item=N82E16820609393 Compared to this one (250MB/s read and 170MB/s write) which is $699. Are those claims really trustworthy? They sound too good to be true! -Kyle -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
On Jul 23, 2009, at 9:37 AM, Kyle McDonald wrote:
> Richard Elling wrote:
>> On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote:
>>> F. Wessels wrote:
>>>> Thanks for posting this solution. But I would like to point out that bug
>>>> 6574286 "removing a slog doesn't work" still isn't resolved. A solution
>>>> is on its way, according to George Wilson. But in the meantime, IF
>>>> something happens you might be in a lot of trouble. Even without some
>>>> unfortunate incident you cannot, for example, export your data pool,
>>>> pull the drives and leave the root pool.
>>> In my case the slog slice wouldn't be the slog for the root pool, it would
>>> be the slog for a second data pool. If the device went bad, I'd have to
>>> replace it, true. But if the device goes bad, then so did a good part of
>>> my root pool, and I'd have to replace that too.
>> Mirror the slog to match your mirrored root pool.
> Yep. That was the plan. I was just explaining that not being able to remove
> the slog wasn't an issue for me since I planned on always having that device
> available.
>
> I was more curious about whether there were any downsides to sharing the SSD
> between the root pool and the slog?

I think it is a great idea, assuming the SSD has good write performance.
-- richard
Re: [zfs-discuss] SSD's and ZFS...
Richard Elling wrote: On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote: F. Wessels wrote: Thanks posting this solution. But I would like to point out that bug 6574286 "removing a slog doesn't work" still isn't resolved. A solution is under it's way, according to George Wilson. But in the mean time, IF something happens you might be in a lot of trouble. Even without some unfortunate incident you cannot for example export your data pool, pull the drives and leave the root pool. In my case the slog slice wouldn't be the slog for the root pool, it would be the slog for a second data pool. If the device went bad, I'd have to replace it, true. But if the device goes bad, then so did a good part of my root pool, and I'd have to replace that too. Mirror the slog to match your mirrored root pool. Yep. That was the plan. I was just explaining that not being able to remove the slog wasn't an issue for me since I planned on always having that device available. I was more curious about whether there were any diown sides to sharing the SSD between the root pool and the slog? Thanks for the valuable input, Richard. -Kyle Don't get me wrong I would like such a setup a lot. But I'm not going to implement it until the slog can be removed or the pool be imported without the slog. In the mean time can someone confirm that in such a case, root pool and zil in two slices and mirrored, that the write cache can be enabled with format? Only zfs is using the disk, but perhaps I'm wrong on this. There have been post's regarding enabling the write_cache. But I couldn't find a conclusive answer for the above scenario. When you have just the root pool on a disk, ZFS won't enable the write cache by default. I think you can manually enable it but I don't know the dangers. Adding the slog shouldn't be any different. To be honest, I don't know how closely the write caching on a SSD matches what a moving disk has. Write caches only help hard disks. Most (all?) SSDs do not have volatile write buffers. Volatile write buffers are another "bad thing" you can forget when you go to SSDs :-) -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
On Jul 23, 2009, at 7:28 AM, Kyle McDonald wrote: F. Wessels wrote: Thanks posting this solution. But I would like to point out that bug 6574286 "removing a slog doesn't work" still isn't resolved. A solution is under it's way, according to George Wilson. But in the mean time, IF something happens you might be in a lot of trouble. Even without some unfortunate incident you cannot for example export your data pool, pull the drives and leave the root pool. In my case the slog slice wouldn't be the slog for the root pool, it would be the slog for a second data pool. If the device went bad, I'd have to replace it, true. But if the device goes bad, then so did a good part of my root pool, and I'd have to replace that too. Mirror the slog to match your mirrored root pool. Don't get me wrong I would like such a setup a lot. But I'm not going to implement it until the slog can be removed or the pool be imported without the slog. In the mean time can someone confirm that in such a case, root pool and zil in two slices and mirrored, that the write cache can be enabled with format? Only zfs is using the disk, but perhaps I'm wrong on this. There have been post's regarding enabling the write_cache. But I couldn't find a conclusive answer for the above scenario. When you have just the root pool on a disk, ZFS won't enable the write cache by default. I think you can manually enable it but I don't know the dangers. Adding the slog shouldn't be any different. To be honest, I don't know how closely the write caching on a SSD matches what a moving disk has. Write caches only help hard disks. Most (all?) SSDs do not have volatile write buffers. Volatile write buffers are another "bad thing" you can forget when you go to SSDs :-) -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
On Jul 23, 2009, at 5:42 AM, F. Wessels wrote: Hi, I'm using asus m3a78 boards (with the sb700) for opensolaris and m2a* boards (with the sb600) for linux some of them with 4*1GB and others with 4*2Gb ECC memory. Ecc faults will be detected and reported. I tested it with a small tungsten light. By moving the light source slowly towards the memory banks you'll heat them up in a controlled way and at a certain point bit flips will occur. I am impressed! I don't know very many people interested in inducing errors in their garage. This is an excellent way to demonstrate random DRAM errors. Well done! I recommend you to go for a m4a board since they support up to 16 GB. I don't know if you can run opensolaris without a videocard after installation I think you can disable the "halt on no video card" in the bios. But Simon Breden had some trouble with it, see his homeserver blog. But you can go for one of the three m4a boards with a 780g onboard. Those will give you 2 pci-e x16 connectors. I don't think the onboard nic is supported. I always put an intel (e1000) in, just to prevent any trouble. I don't have any trouble with the sb700 in ahci mode. Hotplugging works like a charm. Transfering a couple of GB's over esata takes considerable less time than via usb. I have a pata to dual cf adapter and two industrial 16gb cf cards as mirrored root pool. It takes for ever to install nevada, at least 14 hours. I suspect the cf cards lack caches. But I don't update that regularly, still on snv104. And have 2 mirrors and a hot spare. The sixth port is an esata port I use to transfer large amounts of data. This system consumes about 73 watts idle and 82 under load i/o load. (5 disks , a separate nic ,8 gb ram and a be2400 all using just 73 watts!!!) How much power does the tungsten light burn? :-) -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
Brian Hechinger wrote:
> On Thu, Jul 23, 2009 at 10:28:38AM -0400, Kyle McDonald wrote:
>> In my case the slog slice wouldn't be the slog for the root pool, it would
>> be the slog for a second data pool.
>
> I didn't think you could add a slog to the root pool anyway. Or has that
> changed in recent builds? I'm a little behind on my SXCE versions, been too
> busy to keep up. :)

I don't know either. It's not really what I was looking to do, so I never even thought of it. :)

>> When you have just the root pool on a disk, ZFS won't enable the write
>> cache by default.
>
> I don't think this is limited to root pools. None of my pools (root or
> non-root) seem to have the write cache enabled. Now that I think about it,
> all my disks are "hidden" behind an LSI1078 controller so I'm not sure what
> sort of impact that would have on the situation.
>
> -brian

When you give ZFS the full disk (device name 'cWtXdY', with no 'sZ'), it will usually instruct the drive to enable write caching. You're right, though: if your drives are really something like single-drive RAID 0 LUNs, then who knows what happens.

-Kyle
Re: [zfs-discuss] SSD's and ZFS...
On Thu, Jul 23, 2009 at 10:28:38AM -0400, Kyle McDonald wrote: > > > In my case the slog slice wouldn't be the slog for the root pool, it > would be the slog for a second data pool. I didn't think you could add a slog to the root pool anyway. Or has that changed in recent builds? I'm a little behind on my SXCE versions, been too busy to keep up. :) > When you have just the root pool on a disk, ZFS won't enable the write > cache by default. I don't think this is limited to root pools. None of my pools (root or non-root) seem to have the write cache enabled. Now that I think about it, all my disks are "hidden" behind an LSI1078 controller so I'm not sure what sort of impact that would have on the situation. -brian -- "Coding in C is like sending a 3 year old to do groceries. You gotta tell them exactly what you want or you'll end up with a cupboard full of pop tarts and pancake mix." -- IRC User (http://www.bash.org/?841435) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SSD's and ZFS...
F. Wessels wrote: Thanks posting this solution. But I would like to point out that bug 6574286 "removing a slog doesn't work" still isn't resolved. A solution is under it's way, according to George Wilson. But in the mean time, IF something happens you might be in a lot of trouble. Even without some unfortunate incident you cannot for example export your data pool, pull the drives and leave the root pool. In my case the slog slice wouldn't be the slog for the root pool, it would be the slog for a second data pool. If the device went bad, I'd have to replace it, true. But if the device goes bad, then so did a good part of my root pool, and I'd have to replace that too. Don't get me wrong I would like such a setup a lot. But I'm not going to implement it until the slog can be removed or the pool be imported without the slog. In the mean time can someone confirm that in such a case, root pool and zil in two slices and mirrored, that the write cache can be enabled with format? Only zfs is using the disk, but perhaps I'm wrong on this. There have been post's regarding enabling the write_cache. But I couldn't find a conclusive answer for the above scenario. When you have just the root pool on a disk, ZFS won't enable the write cache by default. I think you can manually enable it but I don't know the dangers. Adding the slog shouldn't be any different. To be honest, I don't know how closely the write caching on a SSD matches what a moving disk has. -Kyle ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Motherboard for home zfs/solaris file server
Hi,

I'm using Asus M3A78 boards (with the SB700) for OpenSolaris and M2A*
boards (with the SB600) for Linux, some of them with 4x1GB and others with
4x2GB ECC memory. ECC faults will be detected and reported. I tested this
with a small tungsten light: by moving the light source slowly towards the
memory banks you heat them up in a controlled way, and at a certain point
bit flips will occur.

I recommend you go for an M4A board, since they support up to 16GB. I don't
know if you can run OpenSolaris without a video card after installation; I
think you can disable the "halt on no video card" option in the BIOS, but
Simon Breden had some trouble with it - see his home server blog. You can,
however, go for one of the three M4A boards with a 780G onboard; those will
give you two PCIe x16 connectors. I don't think the onboard NIC is
supported; I always put an Intel (e1000) in, just to prevent any trouble.

I don't have any trouble with the SB700 in AHCI mode. Hotplugging works
like a charm, and transferring a couple of GBs over eSATA takes
considerably less time than via USB. I have a PATA-to-dual-CF adapter and
two industrial 16GB CF cards as a mirrored root pool. It takes forever to
install Nevada, at least 14 hours; I suspect the CF cards lack caches. But
I don't update that regularly - still on snv_104. I also have 2 mirrors and
a hot spare. The sixth port is an eSATA port I use to transfer large
amounts of data.

This system consumes about 73 watts idle and 82 under I/O load. (5 disks, a
separate NIC, 8GB RAM and a BE-2400, all using just 73 watts!) Please note
that frequency scaling is only supported on the K10 architecture, but don't
expect too much power saving from it: a lower voltage yields far greater
savings than a lower frequency.

In September I'll do a post about the aforementioned M4A boards and an LSI
SAS controller in one of the PCIe x16 slots.
Re: [zfs-discuss] SSD's and ZFS...
Thanks for posting this solution. But I would like to point out that bug
6574286, "removing a slog doesn't work", still isn't resolved. A fix is on
its way, according to George Wilson. But in the meantime, IF something
happens you might be in a lot of trouble. Even without some unfortunate
incident you cannot, for example, export your data pool, pull the drives
and leave the root pool.

Don't get me wrong, I would like such a setup a lot. But I'm not going to
implement it until the slog can be removed or the pool can be imported
without the slog.

In the meantime, can someone confirm that in such a case - root pool and
ZIL in two slices, and mirrored - the write cache can be enabled with
format? Only ZFS is using the disk, but perhaps I'm wrong on this. There
have been posts regarding enabling the write_cache, but I couldn't find a
conclusive answer for the above scenario.

Regards,

Frederik
Re: [zfs-discuss] why is zpool import still hanging in opensolaris 2009.06 ??? no fix yet ???
Follow-up: happy end...

It took quite some tinkering, but... I have my data back...

I ended up starting without the troublesome ZFS storage array, de-installed
the iSCSI target software and re-installed it... just to have Solaris boot
without complaining about missing modules. That left me with a system that
would boot as long as the storage was disconnected. Reconnecting it made
the boot stop at the hostname; then the disk activity light would flash
every second or so, forever...

I then rebooted using milestone=none. That worked, even with the storage
attached! So now I was sure that some software process was causing a
hang-up (or what appeared to be a hang-up). In milestone none I could
verify that the pool was intact - and so it was. Fortunately I had not
broken the pool itself: all online, with no errors to report.

I then went to milestone all, which again made the system hang with the
disk activity every second, forever. I think the task doing this was
devfsadm. I "assumed on a gut feeling" that somehow the system was scanning
or checking the pool, so I left the system overnight in a desperate
attempt, because I calculated the 500GB check would take about 8 hours if
the system would *really* scan everything. (I copied a 1TB drive last week,
which took nearly 20 hours, so I learned that sometimes I need to wait...
copying these big disks takes a *lot* of time!)

This morning I switched on the monitor and, lo and behold: a login screen.
The store was there!

Lesson for myself and others: you MUST wait at the hostname line; the
system WILL eventually come online... but don't ask how long it takes... I
hate to think how long it would take if I had a 10TB system. (But then
again, a file-system check on an ext2 disk also takes forever...)

I re-enabled the iscsitgt daemon and did a list: it saw one of the two
targets! (Which was OK, because I remembered that I had turned off the
shareiscsi flag on the second share.) I then went ahead, connected the
system back into the network and "repaired" the iSCSI target on the virtual
mainframe: WORKED! I copied the virtual disks over to local storage so I
can at least start up those servers asap again. Then I set the iscsishare
on the second and most important share: OK! Listed the targets: THERE,
BOTH! Repaired its connection too: WORKED!

I am copying everything away from the ZFS pools now, but my data is
recovered... fortunately.

I now have mixed feelings about the ordeal: yes, Sun Solaris kept its
promise, I did not lose my data. But the time and trouble it took to
recover in this case (just restarting a system taking an overnight wait,
for example!) is something that a few of my customers would *seriously*
dislike...

But: a happy end after all... most important, the data was rescued, and
second most important, I learned a lot in the process...

Bye,

Luc De Meyer
Belgium
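For anyone who runs into the same thing, the sequence that got me there
looked roughly like this - from memory, so treat it as a sketch rather than
a recipe (pool, volume and target names are just examples and will
obviously differ):

  # At the GRUB menu, edit the kernel line and append: -m milestone=none
  # then boot and log in as root on the console.

  zpool status -x        # confirm the pool itself is healthy
  svcadm milestone all   # let the rest of the services come up...
                         # ...and then WAIT, possibly overnight

  # Once the login screen finally appears:
  svcadm enable iscsitgt            # bring the iSCSI target daemon back
  zfs set shareiscsi=on tank/vol2   # re-share the volume (hypothetical name)
  iscsitadm list target -v          # check that both targets are exported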
Re: [zfs-discuss] triple-parity: RAID-Z3
> On 22.07.09 10:45, Adam Leventhal wrote:
>>>> which gap? 'RAID-Z should mind the gap on writes'?
>>>> Message was edited by: thometal
>>>
>>> I believe this is in reference to the RAID-5 write hole, described here:
>>> http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5_performance
>>
>> It's not.
>
> So I'm not sure what the 'RAID-Z should mind the gap on writes' comment
> is getting at either. Clarification?

I'm planning to write a blog post describing this, but the basic problem is
that RAID-Z, by virtue of supporting variable stripe writes (the insight
that allows us to avoid the RAID-5 write hole), must round the number of
sectors up to a multiple of nparity+1. This means that we may have sectors
that are effectively skipped. ZFS generally lays down data in large
contiguous streams, but these skipped sectors can stymie both ZFS's write
aggregation and the hard drive's ability to group I/Os and write them
quickly.

Jeff Bonwick added some code to mind these gaps on reads. The key insight
there is that if we're going to read 64K, say, with a 512-byte hole in the
middle, we might as well do one big read rather than two smaller reads and
just throw out the data we don't care about. Of course, doing this for
writes is a bit trickier, since we can't just blithely write over gaps as
those might contain live data on the disk. To solve this, we push the
knowledge of those skipped sectors down to the I/O aggregation layer in the
form of 'optional' I/Os, purely for the purpose of coalescing writes into
larger chunks.

This exact issue was discussed here almost three years ago:

http://www.opensolaris.org/jive/thread.jspa?messageID=60241
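To put a number on the rounding (my arithmetic, assuming 512-byte sectors
and a block that fits in a single stripe row): on a raidz1, a 1KB block is
2 data sectors plus 1 parity sector, 3 sectors in total; 3 isn't a multiple
of nparity+1 = 2, so the allocation is rounded up to 4 and one sector is
skipped. The same 1KB block on a raidz2 is 2 data plus 2 parity = 4
sectors, rounded up to the next multiple of 3, i.e. 6 - two skipped
sectors.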