Re: [zfs-discuss] RAIDZ one of the disk showing unavail
Srinivas Chadalavada wrote:
> I see the first disk as unavailable. How do I make it online?

By replacing it with a non-broken one.

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA
Tel. +49-721-91374-3963
[EMAIL PROTECTED] - http://web.de/

1&1 Internet AG, Brauerstraße 48, 76135 Karlsruhe
Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Thomas Gottschlich, Matthias Greve, Robert Hoffmann, Markus Huhn, Oliver Mauss, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
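For what it's worth, the usual commands for that situation are zpool online (if the disk only dropped off temporarily) and zpool replace (if it really is dead). A minimal sketch, using the pool and device names that come up later in this thread plus a made-up replacement device:

  # If the disk came back (cabling/controller glitch), try bringing it online:
  zpool online export_content c0t2d0s0

  # If it is genuinely dead, swap the hardware and resilver onto the new disk.
  # Same slot:
  zpool replace export_content c0t2d0s0
  # Different slot (c0t5d0s0 is a hypothetical replacement device):
  zpool replace export_content c0t2d0s0 c0t5d0s0

  # Watch the resilver progress:
  zpool status export_content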
Re: [zfs-discuss] Which is better for root ZFS: mlc or slc SSD?
For a root device it doesn't matter that much. You're not going to be writing to the device at a high data rate, so write/erase cycles don't factor much (SLC can sustain about a factor of 10 more). With MLC you'll get 2-4x the capacity for the same price, but again that doesn't matter much for a root device. Performance is typically a bit better with SLC -- especially on the write side -- but it's not such a huge difference. The reason you'd use a flash SSD for a boot device is power (with maybe a dash of performance), and either SLC or MLC will do just fine.

Adam

On Sep 24, 2008, at 11:41 AM, Erik Trimble wrote:
> I was under the impression that MLC is the preferred type of SSD, but I want to prevent myself from having a think-o. I'm looking to get (2) SSDs to use as my boot drive. It looks like I can get 32GB SSDs composed of either SLC or MLC for roughly equal pricing. Which would be the better technology? (I'll worry about rated access times/etc of the drives, I'm just wondering about the general tech for an OS boot drive usage...)
>
> --
> Erik Trimble
> Java System Support
> Mailstop: usca22-123
> Phone: x17195
> Santa Clara, CA
> Timezone: US/Pacific (GMT-0800)

--
Adam Leventhal, Fishworks    http://blogs.sun.com/ahl

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs resilvering
Hi,

I've searched without luck, so I'm asking instead. I have a Solaris 10 box:

# cat /etc/release
                       Solaris 10 11/06 s10s_u3wos_10 SPARC
           Copyright 2006 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                            Assembled 14 November 2006

This box was rebooted this morning, and after the boot I noticed a resilver was in progress. But the suggested time seemed a bit long, so is this a problem which can be patched or remediated in another way?

# zpool status -x
  pool: zonedata
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.04% done, 4398h43m to go
config:

        NAME                               STATE     READ WRITE CKSUM
        zonedata                           ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A0d0  ONLINE       0     0     0
            c6t60060E8004283300283310A0d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A1d0  ONLINE       0     0     0
            c6t60060E8004283300283310A1d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A2d0  ONLINE       0     0     0
            c6t60060E8004283300283310A2d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A4d0  ONLINE       0     0     0
            c6t60060E8004283300283310A4d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A5d0  ONLINE       0     0     0
            c6t60060E8004283300283310A5d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B10A6d0  ONLINE       0     0     0
            c6t60060E8004283300283310A6d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B2022d0  ONLINE       0     0     0
            c6t60060E800428330028332022d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B2023d0  ONLINE       0     0     0
            c6t60060E800428330028332024d0  ONLINE       0     0     0
          mirror                           ONLINE       0     0     0
            c6t60060E8004282B00282B2024d0  ONLINE       0     0     0
            c6t60060E800428330028332023d0  ONLINE       0     0     0

I also have a question about sharing a zfs from the global zone to a local zone. Are there any issues with this? We had an unfortunate sysadmin who did this and our systems hung. We have no logs that show anything at all, but I thought I'd ask just to be sure.

cheers,
//Mike

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Scripting zfs send / receive
Hey folks,

Is anybody able to help a Solaris scripting newbie with this? I want to put together an automatic script to take snapshots on one system and send them across to another. I've shown the manual process works, but I only have a very basic idea about how I'm going to automate this.

My current thinking is that I want to put together a cron job that will work along these lines:

- Run every 15 mins
- take a new snapshot of the pool
- send the snapshot to the remote system with zfs send / receive and ssh (am I right in thinking I can get ssh to work with no password if I create a public/private key pair? http://www.go2linux.org/ssh-login-using-no-password)
- send an e-mail alert if zfs send / receive fails for any reason (with the text of the failure message)
- send an e-mail alert if zfs send / receive takes longer than 15 minutes and clashes with the next attempt
- delete the oldest snapshot on both systems if the send / receive worked

Can anybody think of any potential problems I may have missed? Bearing in mind I've next to no experience in bash scripting, how does the following look?

**
#!/bin/bash

# Prepare variables for e-mail alerts
SUBJECT="zfs send / receive error"
EMAIL=[EMAIL PROTECTED]

NEWSNAP="build filesystem + snapshot name here"
RESULTS=$(/usr/sbin/zfs snapshot $NEWSNAP)

# how do I check for a snapshot failure here? Just look for non blank $RESULTS?
if $RESULTS; then
    # send e-mail
    /bin/mail -s $SUBJECT $EMAIL $RESULTS
    exit
fi

PREVIOUSSNAP="build filesystem + snapshot name here"
RESULTS=$(/usr/sbin/zfs send -i $PREVIOUSSNAP $NEWSNAP | ssh -l *user* *remote-system* /usr/sbin/zfs receive *filesystem*)

# again, how do I check for error messages here? Do I just look for a blank $RESULTS to indicate success?
if $RESULTS ok; then
    OBSOLETESNAP="build filesystem + name here"
    zfs destroy $OBSOLETESNAP
    ssh -l *user* *remote-system* /usr/sbin/zfs destroy $OBSOLETESNAP
else
    # send e-mail with error message
    /bin/mail -s $SUBJECT $EMAIL $RESULTS
fi
**

One concern I have is what happens if the send / receive takes longer than 15 minutes. Do I need to check that manually, or will the script cope with this already? Can anybody confirm that it will behave as I am hoping, in that the script will take the next snapshot, but the send / receive will fail and generate an e-mail alert?

thanks,
Ross

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
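On the two "how do I check for errors" questions in the draft above: zfs (and ssh) return a non-zero exit status on failure, so testing the exit status is more robust than looking for blank or non-blank output, and stderr has to be captured explicitly if you want the failure text in the mail. A rough sketch of that pattern, assuming bash and placeholder dataset/host names -- not a drop-in replacement for the script above:

  #!/bin/bash
  set -o pipefail                 # a pipeline now fails if any stage fails

  SUBJECT="zfs send / receive error"
  EMAIL="admin@example.com"       # placeholder address
  FS=tank/data                    # placeholder filesystem
  NEWSNAP="$FS@$(date '+%Y%m%d-%H%M')"
  PREVIOUSSNAP="$FS@previous"     # placeholder: newest snapshot already on the remote side

  fail() {
      echo "$1" | /bin/mail -s "$SUBJECT" "$EMAIL"
      exit 1
  }

  # Non-zero exit status means the snapshot failed; 2>&1 keeps the error text.
  OUT=$(/usr/sbin/zfs snapshot "$NEWSNAP" 2>&1) || fail "$OUT"

  # -i takes the *older* snapshot first. Redirect stderr outside the pipe so
  # error text never gets mixed into the replication stream itself.
  OUT=$( { /usr/sbin/zfs send -i "$PREVIOUSSNAP" "$NEWSNAP" | \
           ssh -l user remote-system /usr/sbin/zfs receive tank/backup; } 2>&1 ) \
      || fail "$OUT"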
Re: [zfs-discuss] Scripting zfs send / receive
Hi,

Clive King has a nice blog entry showing this in action:
  http://blogs.sun.com/clive/entry/replication_using_zfs
with the associated script at:
  http://blogs.sun.com/clive/resource/zfs_repl.ksh

which I think answers most of your questions.

Enda

Ross wrote:
(snip -- original message quoted in full above)

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
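On the other open question from Ross -- a run taking longer than 15 minutes -- cron will simply start a second copy, so the usual safeguard is a lock taken at the top of the script. A small sketch of that, reusing the SUBJECT/EMAIL variables from the draft script and a hypothetical lock path:

  LOCKDIR=/var/run/zfs-repl.lock   # hypothetical lock path

  # mkdir either creates the directory or fails, atomically, so it works as a lock.
  if ! mkdir "$LOCKDIR" 2>/dev/null; then
      echo "previous zfs send/receive still running; skipping this run" | \
          /bin/mail -s "$SUBJECT" "$EMAIL"
      exit 1
  fi
  trap 'rmdir "$LOCKDIR"' EXIT     # release the lock however the script exits

  # ... snapshot / send / receive / destroy steps go here ...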
Re: [zfs-discuss] Slow zpool import with b98
Hi again... today I maybe had the same problem you described. I had an on-disk format (zpool version) of 11; after upgrading to version 13, everything works fine.

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs resilvering
On Fri, Sep 26, 2008 at 1:27 AM, Mikael Kjerrman [EMAIL PROTECTED] wrote:
(snip -- full original message, including the zpool status output, quoted above)

Do you have a lot of competing I/Os on the box which would slow down the resilver?

--
Brent Jones
[EMAIL PROTECTED]

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs resilvering
Define "a lot" :-)

We are doing about 7-8 MB per second, which I don't think is a lot, but perhaps it is enough to screw up the estimates? Anyhow, the resilvering completed about 4386 hours earlier than expected, so everything is OK now, but I still feel that the way it figures out the number is wrong.

Any thoughts on my other issue?

cheers,
//Mike

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs resilvering
On Fri, Sep 26, 2008 at 4:02 PM, [EMAIL PROTECTED] wrote:
> Note the progress so far: 0.04%. In my experience the time estimate has no basis in reality until it's about 1% done or so. I think there is some bookkeeping or something ZFS does at the start of a scrub or resilver that throws off the time estimate for a while. That's just my experience with it, but it's been like that pretty consistently for me.
>
> Jonathan Stewart

I agree here. I've watched iostat -xnc 5 while I start scrubbing a few times, and the first minute or so is spent doing very little I/O. Thereafter the transfers shoot up to near what I think is the maximum the drives can do and stay there until the scrub is completed.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
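If you want to watch the estimate settle down yourself, a trivial loop like the one below (pool name taken from the earlier post) prints the resilver line once a minute; run the iostat command mentioned above in another window to see per-disk throughput climb once the initial phase is over.

  # Print the resilver/scrub progress line once a minute.
  while sleep 60; do
      zpool status zonedata | grep 'in progress'
  done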
[zfs-discuss] ZFS poor performance on Areca 1231ML
Well, I just got in a system I am intending to be a BIG fileserver. Background: I work for a SAN startup, and we're expecting in our first year to collect 30-60 terabytes of Fibre Channel traces. The purpose of this is to be a large repository for those traces with statistical analysis run against them. Looking at that storage figure, I decided this would be a perfect application for ZFS.

I purchased a Super Micro chassis that's 4U and has 24 slots for SATA drives. I've put in one quad-core 2.66 GHz processor and 8 GB of ECC RAM. I put in two Areca 1231ML ( http://www.areca.com.tw/products/pcie341.htm ) controllers, which come with Solaris drivers. I've half-populated the chassis with 12 1 TB drives to begin with, and I'm running some experiments. I loaded OpenSolaris 2008.05 on the system.

I configured an 11-drive RAID6 set + 1 hot spare on the Areca controller, put ZFS on that RAID volume, and ran bonnie++ against it (16 GB size), and achieved 150 MB/s write, 200 MB/s read. I then blew that away, configured the Areca to present JBOD, and configured ZFS with RAIDZ2 across 11 disks, plus a hot spare. Running bonnie++ against that, it achieved 40 MB/s read and 40 MB/s write.

I wasn't expecting RAIDZ to outrun the controller-based RAID, but I wasn't expecting 1/3rd to 1/4 the performance. I've looked at the ZFS tuning info on the Solaris site, and mostly what they said is "tuning is evil", with a few things for database tuning. Anyone got suggestions on whether there's something I might poke at to at least get this puppy up closer to 100 MB/s? Otherwise, I may dump the JBOD and go back to the controller-based RAID.

Cheers
Ross

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
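For reference, the kind of bonnie++ run described above looks roughly like the following (the mount point and user are placeholders); the -s size should be at least twice RAM so the ARC can't simply cache the whole working set:

  # 16 GB working set against a directory on the pool, run as a non-root user.
  bonnie++ -d /tank/bench -s 16g -u nobody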
Re: [zfs-discuss] RAIDZ one of the disk showing unavail
sc == Srinivas Chadalavada [EMAIL PROTECTED] writes:
rr == Ralf Ramge [EMAIL PROTECTED] writes:

    sc> I see the first disk as unavailable. How do I make it online?

    rr> By replacing it with a non-broken one.

Ralf, aren't you missing this obstinence-error:

    sc> the following errors must be manually repaired:
    sc> /dev/dsk/c0t2d0s0 is part of active ZFS pool export_content.

and he used the -f flag.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs resilvering
Mikael Kjerrman wrote:
> Define "a lot" :-) We are doing about 7-8 MB per second, which I don't think is a lot, but perhaps it is enough to screw up the estimates? Anyhow, the resilvering completed about 4386 hours earlier than expected, so everything is OK now, but I still feel that the way it figures out the number is wrong.

Yes, the algorithm is conservative and very often wrong until you get close to the end. In part this is because resilvering works in time order, not spatial order. In ZFS, the oldest data is resilvered first. This is also why you will see a lot of "thinking" before you see a lot of I/O, because ZFS is determining the order in which to resilver the data. Unfortunately, this makes completion-time prediction somewhat difficult to get right.

> Any thoughts on my other issue?

Try the zones-discuss forum.
-- richard

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
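On the zones question (which zones-discuss will answer better): the two common ways to hand ZFS storage to a local zone are delegating a dataset, or lofs-mounting a filesystem from the global zone. A sketch of both, with hypothetical zone and dataset names:

  # Delegate a dataset so the zone administers it (and its descendants) itself:
  zonecfg -z myzone
  zonecfg:myzone> add dataset
  zonecfg:myzone:dataset> set name=zonedata/myzone
  zonecfg:myzone:dataset> end
  zonecfg:myzone> commit

  # Or loop-back mount a filesystem that stays managed in the global zone:
  zonecfg -z myzone
  zonecfg:myzone> add fs
  zonecfg:myzone:fs> set dir=/data
  zonecfg:myzone:fs> set special=/zonedata/shared
  zonecfg:myzone:fs> set type=lofs
  zonecfg:myzone:fs> end
  zonecfg:myzone> commit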
Re: [zfs-discuss] working closed blob driver
t == Tim [EMAIL PROTECTED] writes:

     t> http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm

I'm not sure. A different thing is wrong with it depending on what driver attaches to it. I can't tell for sure, because this page:

  http://linuxmafia.com/faq/Hardware/sas.html

says the LSI SAS 3800 series uses a 1068E chip, and James says (1) the 1068E is supported by mpt, and (2) the LSI SAS 3800 uses mega_sas. So I don't know which driver applies to that card, which means I don't know which applies to this card.

If it's mpt:

 * does not come with source, according to:
     http://www.openbsd.org/papers/opencon06-drivers/mgp00024.html
     http://www.opensolaris.org/os/about/no_source/

If it's mega_sas:

 * does not come with source
 * driver is new and unproven. We believed the Marvell driver was good for the first few months too, and that is the same amount of experience we have with mega_sas.
 * not sure if it's available in stable Solaris.

In either case:

 * may require expensive cables

Uncertain problems:

 * might not support hotplug
 * might not support NCQ
 * probably doesn't support port multipliers
 * probably doesn't support smartctl
 * none of these features can be fixed by the community without source.

All are available with cheaper cards on Linux, and on Linux both mptsas and megaraid_sas come with source, as far as I can tell maintained by Dell and LSI, though they might not support the above features.

HTH, HAND.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS poor performance on Areca 1231ML
On Fri, 26 Sep 2008, Ross Becker wrote:
> I configured an 11-drive RAID6 set + 1 hot spare on the Areca controller, put ZFS on that RAID volume, and ran bonnie++ against it (16 GB size), and achieved 150 MB/s write, 200 MB/s read. I then blew that away, configured the Areca to present JBOD, and configured ZFS with RAIDZ2 across 11 disks, plus a hot spare. Running bonnie++ against that, it achieved 40 MB/s read and 40 MB/s write. I wasn't expecting RAIDZ to outrun the controller-based RAID, but I wasn't expecting 1/3rd to 1/4 the performance. I've looked at the ZFS

Terrible! Have you tested the I/O performance of each drive to make sure that they are all performing ok? If the individual drives are found to be performing ok with your JBOD setup, then I would suspect a device driver, card slot, or card firmware performance problem.

If RAID6 is done by the RAID card, then backplane I/O to the card is not very high. If raidz2 is used, then the I/O to the card is much higher. With a properly behaving device driver and card, it is quite likely that ZFS raidz2 will outperform the on-card RAID6.

You might try disabling the card's NVRAM to see if that makes a difference.

Bob
======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
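A quick way to run that per-drive check from the JBOD side is a large sequential read with dd against each raw device (device names below are placeholders; reads are safe on disks that already hold data, writes are not):

  # ~4 GB sequential read from one drive, 1 MB at a time; time it to get MB/s.
  time dd if=/dev/rdsk/c3t0d0s0 of=/dev/null bs=1024k count=4096

  # Repeat for each of the 12 drives, and run two or three in parallel to see
  # whether the controller, slot, or driver becomes the bottleneck.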
Re: [zfs-discuss] working closed blob driver
On Thu, Sep 25, 2008 at 18:51, Tim [EMAIL PROTECTED] wrote:
> So what's wrong with this card?
> http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm

If you have a UIO slot (many recent Supermicro boards do) then it's a fine choice. But if you have a non-Supermicro board, you may be in for a shock when you get it -- it's swapped left for right; compare it to a regular PCI-E card. It won't fit in a standard case.

AIUI it uses a standard PCI Express edge connector, just shifted over a bit so the backwards slot cover fits into normal cases, so perhaps you could try fastening a normal slot cover to it and using it in a normal PCI-E slot... but that doesn't sound particularly elegant, and it would take up the slot on the other side as well.

Will

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] working closed blob driver
On Thu, Sep 25, 2008 at 21:59, Miles Nordin [EMAIL PROTECTED] wrote:
> wm == Will Murnane [EMAIL PROTECTED] writes:
>
>     wm> I'd rather have a working closed blob than a driver that is
>     wm> Free Software for a device that is faulty. Ideals are very
>     wm> nice, but broken hardware isn't.
>
> except,
>
> 1. part of the reason the closed Solaris drivers are (also) broken, IMHO, is that they're closed, so highly-invested competent people can't fix them if they happen to be on the wrong side of the wall.

I agree this is an issue. But as I said, I'd rather have a working closed driver than a broken open one.

> 2. Linux has open drivers for the Marvell chip that work better than Sun's closed driver (snip)

That's not my experience. I bought my Marvell card around 2005, and at that point I used Linux drivers. Drivers for the card at that point did not support DMA, but were fairly reliable. In late 2006 or so, DMA support was finally added, so I gleefully installed a new kernel and was happy. Until I realized that my data was corrupt. This is for a home system, so I didn't have checksums for the data before the corruption, but I started to hear glitches in music playback.

At that point I switched to Solaris, and was very glad for the drivers that didn't cause corruption -- and the filesystem that could tell me when things went wrong. I did have a problem with disks falling off the card, so I posted to the storage-discuss mailing list [2]. Despite being on the wrong side of the wall, the drivers were updated fairly soon thereafter, and my problem was solved [1]. The system worked quite well for me in this instance.

> 3. The position is incredibly short-sighted. Imagine the quality of driver we'd have right now if _everyone_ refused to sign that damned paper, not just the Linux people. We would have a better driver. It would be open, too, but open or not it would be better.

Not necessarily. Suppose that the corporation making the hardware released its own drivers, for Windows and Linux, say, and didn't release specs to anyone else, even under NDA conditions. Then nobody gets good drivers (ones that correctly use all the features the hardware has). I agree that having complete hardware specs is a very helpful thing for making drivers. But they're not strictly necessary, as the Linux/BSD folks have shown.

> 4. there are missing features like NCQ, hotplug, port-multiplier support, all highly relevant to ZFS, for which we will have to wait longer because we've accepted closed drivers.

That's true. But honestly, I don't see those features (with the exception of hot-plug) as being all that necessary. Port multipliers are uncommon and don't perform as well as they could, and NCQ seems to me to be something the OS could do better than the drive firmware.

> 5. The Sil 3124 chip works fine on Linux. I have not tried the 3114, but at least on Linux it is part of libata, their SATA framework, not supported in remedial PATA mode, so it's at least more of a first-class driver in Linux than in Solaris, if not simply a better one.

IMHO, attempting to make SilImage controllers work well is lipstick on a pig. Working around the bugs in the hardware is not worth the effort.

> I just want an open driver that works well for some fairly-priced card I can actually buy.

This I can agree with. Despite my objections to free drivers being inherently better than closed ones, I do like the idea of being able to have a completely transparent machine, where I can inspect every piece of software. I would be more than happy to buy such hardware were it available, but in the interim I will continue to suggest and buy LSI's products, which are not free but which have good drivers for them.

> The open driver isn't obtainable as an add-on card

The ICH series would indeed be nice to see as an add-on card of some sort.

> If there _is_ an open vs. closed trade-off, the track record so far suggests a different trade-off than what you suggest: you can have closed drivers if you really want them, but they'll be more broken than the open ones.

That may be the case in the larger picture, but in my experience I've seen otherwise.

Will

[1]: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg14188.html
[2]: http://osdir.com/ml/os.solaris.opensolaris.storage.general/2007-08/msg00054.html

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zpool import of bootable root pool renders it unbootable
I am running OpenSolaris 2008.05 as a PV guest under Xen. If you import the bootable root pool of a VM into another Solaris VM, the root pool is no longer bootable. It is related to the device associated with the pool, which is originally c4d0s0, but on import (-f) becomes c0d2s0 in this case. Afterwards, booting the original image results in a kernel panic because, I think, zfs_mountroot() cannot mount the root path (which is evidently now wrong).

Is this fixable? How does one mount (import) a bootable zpool without wrecking it? This is something that is commonly done under virtualization platforms, e.g., to manage the contents of a VM from another VM, or to perform a file-system-level copy of the contents of a VM to another device.

Any insight would be appreciated.

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] working closed blob driver
On Fri, Sep 26, 2008 at 1:02 PM, Will Murnane [EMAIL PROTECTED] wrote:
> On Thu, Sep 25, 2008 at 18:51, Tim [EMAIL PROTECTED] wrote:
>> So what's wrong with this card?
>> http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm
>
> (snip)

This is not a UIO card. It's a standard PCI-E card. What the description is telling you is that you can combine it with a UIO card to add RAID functionality, as there is none built in.

--Tim

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] working closed blob driver
On Fri, Sep 26, 2008 at 12:29 PM, Miles Nordin [EMAIL PROTECTED] wrote:
> I'm not sure. A different thing is wrong with it depending on what driver attaches to it. I can't tell for sure, because this page:
>   http://linuxmafia.com/faq/Hardware/sas.html
> says the LSI SAS 3800 series uses a 1068E chip, and James says (1) the 1068E is supported by mpt, and (2) the LSI SAS 3800 uses mega_sas. So I don't know which driver applies to that card, which means I don't know which applies to this card.
>
> If it's mpt:
>  * does not come with source, according to:
>      http://www.openbsd.org/papers/opencon06-drivers/mgp00024.html
>      http://www.opensolaris.org/os/about/no_source/
>
> If it's mega_sas:
>  * does not come with source
>  * driver is new and unproven. We believed the Marvell driver was good for the first few months too, the same amount of experience we have with mega_sas.
>  * not sure if it's available in stable Solaris.

Someone's already gotten it working; if they're watching I'm sure they'll pipe up on what driver it uses.

> In either case:
>  * may require expensive cables

Nope, cables are standardized. I'm not sure what your definition of expensive is, but I believe they were roughly $15 for a SAS-to-4x-SATA cable.

> Uncertain problems:
>  * might not support hotplug
>  * might not support NCQ
>  * probably doesn't support port multipliers
>  * probably doesn't support smartctl
>  * none of these features can be fixed by the community without source.
>
> All are available with cheaper cards on Linux, and on Linux both mptsas and megaraid_sas come with source, as far as I can tell maintained by Dell and LSI, though they might not support the above features.
>
> HTH, HAND.

I know it supports hotplug and NCQ. Can't say smartctl was ever on my list of important features, so I haven't bothered to research whether it does. I'm also not sure what good port multipliers are going to do you in this instance... the cables it uses already support 4 SATA drives per physical card port.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] working closed blob driver
On Fri, Sep 26, 2008 at 21:51, Tim [EMAIL PROTECTED] wrote:
> This is not a UIO card. It's a standard PCI-E card. What the description is telling you is that you can combine it with a UIO card to add RAID functionality, as there is none built in.

Not so. The description [1] mentions that this is UIO, and says only that it negotiates PCI-E link speeds, not that it fits in a PCI Express slot. UIO is PCI Express, but the slots are positioned differently from PCI-E ones. Compare this to the picture of an equivalent LSI card [2]. The pictures are similar, but compare the position of the bracket: the components are mounted on the wrong sides.

Take a look at a UIO board [3]: the PCI-X slot is shared with the blue UIO slot on the left side, like PCI and ISA slots used to be shared. This is why the components are backwards.

Will

[1]: http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm
[2]: http://www.lsi.com/storage_home/products_home/internal_raid/megaraid_sas/megaraid_sas_8208elp/index.html
[3]: http://www.supermicro.com/products/motherboard/Xeon1333/5400/X7DWE.cfm

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] working closed blob driver
On Fri, Sep 26, 2008 at 5:07 PM, Will Murnane [EMAIL PROTECTED] wrote:
> Not so. The description [1] mentions that this is UIO, and says only that it negotiates PCI-E link speeds, not that it fits in a PCI Express slot. (snip)

Well, there are people who have it working in a PCI-E slot, so I don't know what to tell you.

http://www.opensolaris.org/jive/thread.jspa?messageID=272283#272283

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS poor performance on Areca 1231ML
Okay, after doing some testing, it appears that the issue is on the ZFS side. I fiddled around for a while with options on the Areca card, and never got any better performance results than my first test. So, my best out of the raidz2 is 42 MB/s write and 43 MB/s read. I also tried turning off checksums (not how I'd run production, but for testing), and got no performance gain.

After fiddling with options, I destroyed my ZFS zpool and tried some single-drive bits. I simply used newfs to create filesystems on single drives, mounted them, and ran some single-drive bonnie++ tests. On a single drive, I got 50 MB/s write and 70 MB/s read. I also ran two benchmarks on two drives simultaneously, and on each of the tests the result dropped by about 2 MB/s, so I got a combined 96 MB/s write and 136 MB/s read with two separate UFS filesystems on two separate disks.

So, next steps?

--ross

-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS poor performance on Areca 1231ML
On Fri, Sep 26, 2008 at 5:46 PM, Ross Becker [EMAIL PROTECTED] wrote:
> Okay, after doing some testing, it appears that the issue is on the ZFS side. (snip)

Did you try disabling the card cache as others advised?

--Tim

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] working closed blob driver
Tim wrote:
> On Fri, Sep 26, 2008 at 12:29 PM, Miles Nordin [EMAIL PROTECTED] wrote:
>> says the LSI SAS 3800 series uses a 1068E chip, and James says (1) the 1068E is supported by mpt, and (2) the LSI SAS 3800 uses mega_sas. So I don't know which driver applies to that card, which means I don't know which applies to this card.

There are several LSI cards which use the 1068 and 1068E chips. Some of these use mpt(7d), some use mega_sas(7d). It all depends on the firmware of the card, basically. You could also have a look at the PCI IDs database at http://pciids.sourceforge.net to see what the name-to-PCI-vid/did mapping is. That provides a fairly good indicator of whether you'll need mpt(7d) or mega_sas(7d).

>> If it's mega_sas:
>>  * does not come with source
>>  * driver is new and unproven. We believed the Marvell driver was good for the first few months too, the same amount of experience we have with mega_sas.
>>  * not sure if it's available in stable Solaris.
>
> Someone's already gotten it working; if they're watching I'm sure they'll pipe up on what driver it uses.

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/io/mega_sas

We've got this driver into Solaris 10 Update 6. I'm still keen to find out from Miles why mega_sas is "new and unproven", given that it's been in NV since build 88. Miles -- if you're seeing problems with it, please let us know so that we can fix them. If you don't tell us, how will we ever know?

>> In either case:
>>  * may require expensive cables
>
> Nope, cables are standardized. I'm not sure what your definition of expensive is, but I believe they were roughly $15 for a SAS-to-4x-SATA cable.

If you want to get an external SAS cable (particularly if it's got the InfiniBand-style SFF-8088 connector), then that might cost you a bit. If you just want to connect devices internally, then I would expect the cables to be somewhat cheaper. Either way, with more and more volume of cards and devices on the market, the pricing for cables should decrease too.

[snip]

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp    http://www.jmcp.homeunix.com/blog

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
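If you want to do that check on a live system rather than in the web database, the PCI vendor/device IDs are visible from the device tree and can be matched against the driver bindings. A rough sketch (LSI's PCI vendor ID is 1000):

  # List vendor-id / device-id properties for devices on the PCI tree:
  prtconf -pv | egrep 'vendor-id|device-id'

  # See which driver claims a given vendor,device alias (here, anything from LSI):
  grep 'pci1000' /etc/driver_aliases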
Re: [zfs-discuss] ZFS poor performance on Areca 1231ML
Ross Becker wrote:
> Well, I just got in a system I am intending to be a BIG fileserver. (snip)
> I wasn't expecting RAIDZ to outrun the controller-based RAID, but I wasn't expecting 1/3rd to 1/4 the performance. (snip) Anyone got suggestions on whether there's something I might poke at to at least get this puppy up closer to 100 MB/s? Otherwise, I may dump the JBOD and go back to the controller-based RAID.

While running pre-integration testing of arcmsr(7d), I noticed that random IO was pretty terrible. My results matched what I saw in the benchmark PDFs from http://www.areca.com.tw/support/main.htm (bottom of page), but I'd still like to improve the results. Were you doing more random or more sequential IO?

The source is here:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/intel/io/scsi/adapters/arcmsr

... and I'm keen to talk with you in detail about the issues you're seeing with arcmsr too.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp    http://www.jmcp.homeunix.com/blog

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS poor performance on Areca 1231ML
Ross Becker wrote:
> Okay, after doing some testing, it appears that the issue is on the ZFS side. (snip) On a single drive, I got 50 MB/s write and 70 MB/s read. (snip) So, next steps?

Raidz(2) vdevs can sustain the max IOPS of a single drive in the vdev. I'm curious what zpool iostat would say while bonnie++ is running its "writing intelligently" test. The throughput sounds very low to me, but the clue here is that the single-drive speed is in line with the raidz2 vdev, so if a single drive is being limited by IOPS, not by raw throughput, then this IO result makes sense.

For fun, you should make two vdevs out of two raidz sets to see if you get twice the throughput, more or less. I'll bet the answer is yes.

Jon

--
Jonathan Loran
IT Manager
Space Sciences Laboratory, UC Berkeley
(510) 643-5146    [EMAIL PROTECTED]
AST:7731^29u18e3

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
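Jon's suggestion in concrete terms, as a rough sketch only (pool and device names are made up): split the 12 disks into two 6-disk raidz2 vdevs in one pool, rerun the benchmark, and watch the per-vdev numbers:

  # One pool built from two 6-disk raidz2 vdevs instead of a single 11-disk one:
  zpool create tank \
      raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 \
      raidz2 c3t6d0 c3t7d0 c3t8d0 c3t9d0 c3t10d0 c3t11d0

  # Per-vdev throughput and IOPS, sampled every 5 seconds while bonnie++ runs:
  zpool iostat -v tank 5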