Re: [zfs-discuss] zfs send/receive as backup - reliability?
Allen Eastwood wrote:
On Jan 19, 2010, at 22:54, Ian Collins wrote:
Allen Eastwood wrote:
On Jan 19, 2010, at 18:48, Richard Elling wrote:

Many people use send/recv or AVS for disaster recovery on the inexpensive side. Obviously, enterprise backup systems also provide DR capabilities. Since ZFS has snapshots that actually work, and you can use send/receive or other backup solutions on snapshots, I assert the problem is low priority.

What I have issue with is the idea that no one uses/should use tape any more. There are places for tape and it still has value as a backup device. In many cases in the past, ufsdump, despite its many issues, was able to restore working OSes, or individual files. Perfect? Not by a long shot. But it did get the job done. As was pointed out earlier, all I needed was a Solaris CD (or network boot) and I could restore. Entire OS gone: boot and ufsrestore. Critical files deleted: same thing… and I can restore just the file(s) I need. And while it's been a few years since I've read the man pages on ufsdump, ufsrestore and fssnap, those tools have proven useful when dealing with a downed system.

For a full recovery, you can archive a send stream and receive it back. With ZFS snapshots, the need for individual file recovery from tape is much reduced. The backup server I manage for a large client has 60 days of snaps and I can't remember when they had to go to tape to recover a file. -- Ian.

Let's see… For full recovery, I have to zfs send to something, preferably something that understands tape (yes, I know I can send to tape directly, but how well does zfs send handle the end of the tape? auto-changers?).

I keep a stream (as a file) of my root pool on a USB stick. It could be on tape, but root pools are small.

Then for individual file recovery, I have snapshots… which I also have to get on to tape… if I want to have them available on something other than the boot devices.

No, just keep the snapshots in place.
If a file is lost, just grab it from the snapshot directory. If the root filesystem is munted, roll back to the last snapshot.

Now… to recover the entire OS, perhaps not so bad… but that's one tool. And to recover the one file, say a messed up /etc/system, that's preventing my OS from booting? Have to get that snapshot where I can use it first… oh, and restoring individual files and not the entire snapshot?

As I said, roll back. Boot from install media, import the root pool, get the file from a snapshot, or roll back to the last good snapshot, export and reboot.

At best, it's an unwieldy process. But does it offer the simplicity that ufsdump/ufsrestore (or dump/restore on how many Unix variants…) did? No way.

It certainly does for file recovery. Do you run incremental dumps every hour, or every 15 minutes? Periodic snapshots are quick and cheap. As I said before, careful use of snapshots all but removes the need to recover files from tape. We have 60 days of 4-hourly and 24-hourly snapshots in place, so the odds of finding a recent copy of a lost file are way better than they would be on daily incrementals. I certainly don't miss the pain of loading a sequence of incrementals to recover lost data. So ZFS solves the problems in a different way. For fat-finger recovery, it's way better than ufsdump/ufsrestore.

A simple, effective dump/restore that deals with all the supported file systems, can deal with tape or disk, allows for complete OS restore or individual file restore, and can be run from an install CD/DVD. As much as I love ZFS and as many problems as it does solve, leaving this out was a mistake, IMO.

It possibly was, but it has encouraged us to find better solutions. -- Ian.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
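The send-stream-on-a-stick approach and the snapshot file recovery described above can be sketched with a few commands. This is a minimal sketch: the pool, snapshot and mount-point names are illustrative, not taken from the thread.

```shell
# Archive a recursive send stream of the root pool on removable media.
zfs snapshot -r rpool@backup
zfs send -R rpool@backup > /media/usbstick/rpool-backup.zfs

# Full recovery: boot from install media, recreate the pool, then
# receive the archived stream back into it (-F rolls the target back,
# -d recreates the descendant filesystem hierarchy).
zfs receive -Fd rpool < /media/usbstick/rpool-backup.zfs

# Single-file recovery never needs the stream at all: copy the file
# straight out of the hidden snapshot directory...
cp /.zfs/snapshot/backup/etc/system /etc/system

# ...or roll the whole filesystem back to the last good snapshot.
zfs rollback rpool/ROOT@backup
```

The `-R` flag on send preserves descendant filesystems, snapshots and properties, which is what makes the archived stream usable for a complete restore.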
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 19 jan 2010, at 20.11, Ian Collins wrote:

Julian Regel wrote: Based on what I've seen in other comments, you might be right. Unfortunately, I don't feel comfortable backing up ZFS filesystems because the tools aren't there to do it (built into the operating system or using Zmanda/Amanda).

Commercial backup solutions are available for ZFS. I know tape backup isn't sexy, but it's a reality for many of us and it's not going away anytime soon.

True, but I wonder how viable its future is. One of my clients requires 17 LTO4 tapes for a full backup, which cost more and take up more space than the equivalent in removable hard drives. In the past few years growth in hard drive capacities has outstripped tapes to the extent that removable hard drives and ZFS snapshots have become a more cost effective and convenient backup medium.

LTO media is still cheaper than equivalently sized disks, maybe by a factor of 5 or so. LTO drives cost a little, but so do disk shelves. So, now that there is no big price issue, there is choice instead. Use it!

Hard drives are good for random access - both restore of individual files and partial rewrite. Hard drives aren't faster than tape for data transfer, but they might be cheaper to run in parallel and therefore you could potentially gain speed. Hard drives have shorter seek times, which may be important.

Hard drives are probably bad for long-term storage - especially since you will never know how long one can be stored before it will fail. A month? Probably. A year? Maybe. Five years? Well... Ten years? Probably not. LTO tapes are supposed to be able to keep their data for at least 30 years if stored properly. Hard drives are probably best when used online, or at least very often.

So - it is wrong to say that one is better or cheaper than the other. They have different properties, and could be used to solve different problems.
/ragge s
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Richard Elling richard.ell...@gmail.com wrote:

ufsdump/restore was perfect in that regard. The lack of equivalent functionality is a big problem for the situations where this functionality is a business requirement.

How quickly we forget ufsdump's limitations :-). For example, it is not supported for use on an active file system (known data corruption possibility) and UFS snapshots are, well, a poor hack and often not usable for backups. As the ufsdump(1m) manpage says,

It seems you forgot that zfs also needs snapshots. There is nothing bad with snapshots. When I was talking with Jeff Bonwick in September 2004 (before ZFS became public), the only feature that was missing in Solaris for a 100% correct backup based on star was an interface for holey files, so we designed it. I believe the only mistake of ufsdump is that it does not use standard OS interfaces and that it does not use a standard archive format. You get both with star, and star is even faster than ufsdump.

Jörg
-- EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de (uni) joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
[zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
Hello, we tested clustering with ZFS and the setup looks like this:
- 2 head nodes (nodea, nodeb)
- head nodes contain L2ARC devices (nodea_l2arc, nodeb_l2arc)
- two external JBODs
- two mirror zpools (pool1, pool2)
- each mirror is a mirror of one disk from each JBOD
- no ZIL (anyone know a well-priced SAS SSD?)

We want active/active and added the L2ARC devices to the pools:
- pool1 has nodea_l2arc as cache
- pool2 has nodeb_l2arc as cache

Everything is great so far. One thing to note is that nodea_l2arc and nodeb_l2arc are named identically (c0t2d0 on both nodes)!

What we found is that during tests, the pool just picked up the device nodeb_l2arc automatically, although it was never explicitly added to pool1. We had a setup stage where pool1 was configured on nodea with nodea_l2arc and pool2 was configured on nodeb without an L2ARC. Then we did a failover. Then pool1 picked up the (until then) unconfigured nodeb_l2arc. Is this intended? Why is an L2ARC device automatically picked up if the device name is the same?

In a later stage we had both pools configured with the corresponding L2ARC device (po...@nodea with nodea_l2arc and po...@nodeb with nodeb_l2arc). Then we also did a failover. The L2ARC device of the pool failing over was marked as "too many corruptions" instead of missing.

So from these tests it looks like ZFS just picks up the device with the same name and replaces the L2ARC without looking at the device signatures to only consider devices being part of a pool. We have not tested with a data disk as c0t2d0, but if the same behaviour occurs - god save us all.

Can someone clarify the logic behind this? Can someone also give a hint how to rename SAS disk devices in OpenSolaris? (To work around this, I would like to rename c0t2d0 on nodea (nodea_l2arc) to c0t24d0 and c0t2d0 on nodeb (nodeb_l2arc) to c0t48d0.)

P.s. Release is build 104 (NexentaCore 2). Thanks!
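One way to investigate a mix-up like the one described above is to read the on-disk labels directly: every ZFS label records the pool_guid of the pool the device belongs to, plus the device's own guid. Comparing the labels on both nodes' c0t2d0 would show whether the import is matching the cache device by path rather than by guid. A sketch, using the device path from this setup:

```shell
# Dump the ZFS labels of the local c0t2d0 on each head node.
# Each label prints pool_guid and the device guid; if the two nodes'
# devices report different pool_guids and pool1 still grabs the
# "wrong" one after failover, the match was done by device path,
# not by the label's guid.
zdb -l /dev/rdsk/c0t2d0s0
```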
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Ian Collins i...@ianshome.com wrote:

The correct way to archive ACLs would be to put them into extended POSIX tar attributes as star does. See http://cdrecord.berlios.de/private/man/star/star.4.html for a format documentation or have a look at ftp://ftp.berlios.de/pub/star/alpha, e.g. ftp://ftp.berlios.de/pub/star/alpha/acl-test.tar.gz. The ACL format used by Sun is undocumented.

man acltotext

We are talking about TAR and I did give a pointer to the star archive format documentation, so it is obvious that I was talking about the ACL format from Sun tar. That format is not documented.

Jörg
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Edward Ned Harvey sola...@nedharvey.com wrote:

Star implements this in a very effective way (by using libfind) that is even faster than the find(1) implementation from Sun.

Even if I just run find(1) over my filesystem, it will run for 7 hours. But zfs can create my whole incremental snapshot in a minute or two. There is no way star or any other user-space utility that walks the filesystem can come remotely close to this performance. Such performance can only be implemented at the filesystem level, or lower.

You claim that it is fast for you, but this is because it is block oriented and because you probably changed only a small amount of data. If you would like a backup that allows access to individual files, you need a file-based backup, and I am sure that even a filesystem-level scan for recently changed files will not be much faster than what you may achieve with e.g. star. Note that ufsdump directly accesses the raw disk device and thus _is_ at the filesystem level, but is still slower than star on UFS.

Jörg
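The performance gap being argued about comes down to what each approach has to walk: a snapshot-based incremental only visits blocks born between two transaction groups, while any file-based tool must traverse the entire namespace even when almost nothing changed. A rough way to see this on a live system (dataset and path names are hypothetical):

```shell
# Block-level incremental: ZFS walks only the blocks written between
# the two snapshots, regardless of how many files the dataset holds.
zfs snapshot tank/data@today
time zfs send -i tank/data@yesterday tank/data@today > /backup/incr.zfs

# File-level incremental: a full directory-tree walk is unavoidable,
# even when almost nothing has changed since the last run.
time find /tank/data -newer /backup/last-run-timestamp -print > /dev/null
```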
Re: [zfs-discuss] Mirror of SAN Boxes with ZFS ? (split site mirror)
Actually I found some time (and a reason) to test this. Environment:
- 1 OSol server
- one SLES10 iSCSI target
- two LUNs exported via iSCSI to the OSol server

I did some resilver tests to see how ZFS resilvers devices.

Prep: osol: create a pool (myiscsi) with one mirror pair made from the two iSCSI backend disks of SLES10

Test:
osol: both disks ok
osol: txn in uberblock of pool = 86
sles10: remove one disk (lun=1)
osol: disk is detected failed, pool degraded
osol: write with oflag=direct; sync multiple times to the pool
osol: create fs myiscsi/test
osol: txn in uberblock = 107
osol: power off (hard)
sles10: add lun 1 again (the one with txn 86)
sles10: remove lun 0 (the one with txn 107)
osol: power on
osol: txn in uberblock = 92
osol: zfs myiscsi/test does not exist
osol: create fs myiscsi/mytest_old
osol: txn in uberblock = 96
osol: power off (hard)
sles10: add lun 0 again (with txn 107)
sles10: both luns are there
osol: resilvering happens automatically
osol: txn in uberblock = 112
osol: filesystem myiscsi/test exists

... same thing the other way around, to see if the resilver direction is persistent ...

osol: both disks ok
osol: txn in uberblock = 120
sles10: remove one disk (lun=0)
osol: write with oflag=sync; sync multiple times
osol: create fs myiscsi/test
osol: txn in uberblock = 142
osol: power off (hard)
sles10: add lun 0 again (the one with txn 120)
sles10: remove lun 1 (the one with txn 142)
osol: boot
osol: txn in uberblock = 127
osol: filesystem myiscsi/test does not exist
osol: create fs myiscsi/mytest_old
osol: txn in uberblock = 133
osol: power off
sles10: add lun 1 again (with txn 142)
sles10: both luns are there
osol: boot
osol: resilvering happens automatically
osol: txn in uberblock = 148
osol: filesystem myiscsi/test exists

---

From these tests it seems that the latest txn always wins. This practically means that the JBOD with the most changes (in terms of transactions) will always sync over the one with the fewest modifications.
Could someone confirm this assumption? Could someone explain resilvering direction selection?

Regards, Robert

p.s. I did not test split brain, but this is next. (The planned setup is clustered over SAS rather than iSCSI, so the split brain is more academic in this case.)
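For anyone wanting to repeat the experiment, the transaction-group numbers logged in the test above can be read with zdb. Pool and device names are the ones from this setup; the exact output format varies by build.

```shell
# Show the pool's active uberblock, including its txg; comparing this
# on both sides before a failover shows which half of the mirror
# should "win" the subsequent resilver.
zdb -u myiscsi

# The most recent txg is also recorded in each device's labels, which
# is what makes the "latest txg wins" comparison possible after the
# two sides of the mirror have diverged.
zdb -l /dev/rdsk/c0t2d0s0 | grep txg
```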
Re: [zfs-discuss] zfs send/receive as backup - reliability?
While I can appreciate that ZFS snapshots are very useful in being able to recover files that users might have deleted, they do not do much to help when the entire disk array experiences a crash/corruption or catches fire. Backing up to a second array helps if a) the array is off-site (and for many of us the cost of remote links with sufficient bandwidth is still prohibitive), or b) it is on the local network but sufficiently far away from the original array that a fire does not damage the backup as well. This leaves some form of removable storage. I'm not sure I'm aware of any enterprise-level removable disk solution, primarily because disk isn't really designed to be used for offsite backup whereas tape is. The biggest problem with tape was finding a sufficiently large window in which to perform the backup. ZFS snapshots completely solve this issue, but Sun have failed to provide the mechanism to protect the data off-site.
Re: [zfs-discuss] New Supermicro SAS/SATA controller: AOC-USAS2-L8e in SOHO NAS and HD HT
Yes, this model looks to be interesting. SuperMicro seem to have produced two new models that satisfy the SATA III requirement of 6Gbps per channel:
1. AOC-USAS2-L8e: http://www.supermicro.com/products/accessories/addon/AOC-USAS2-L8i.cfm?TYP=E
2. AOC-USAS2-L8i: http://www.supermicro.com/products/accessories/addon/AOC-USAS2-L8i.cfm?TYP=I

The main difference appears to be that the L8i model has RAID capabilities, whereas the L8e model does not. As ZFS does its own RAID calculations in software it needs JBOD, and doesn't need the adapter to have RAID capabilities, so the AOC-USAS2-L8e model looks to be ideal. If we're lucky, maybe it's also a little cheaper. Sorry I can't help you with your questions though. Hopefully someone else will be able to help. I will also be interested to hear any further info on this card.

Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/
Re: [zfs-discuss] zfs send/receive as backup - reliability?
If you would like a backup that allows access to individual files, you need a file-based backup, and I am sure that even a filesystem-level scan for recently changed files will not be much faster than what you may achieve with e.g. star. Note that ufsdump directly accesses the raw disk device and thus _is_ at the filesystem level, but is still slower than star on UFS.

While I am sure that star is technically a fine utility, the problem is that it is effectively an unsupported product. If our customers find a bug in their backup that is caused by a failure in a Sun-supplied utility, then they have a legal course of action. The customer's system administrators are covered because they were using tools provided by the vendor. The wrath of the customer would be upon Sun, not the supplier (us) or the supplier's technical lead (me).

If the system administrator has chosen star (or if the supplier recommends star), then the conversation becomes a lot more awkward. From the perspective of the business, the system administrator will have acted irresponsibly by choosing a tool that has no vendor support. Alternatively, the supplier will be held responsible for recommending a product that has broken the customer's ability to restore, and with no legal recourse, I wouldn't dare touch it. Sorry.

This is why Sun need to provide the solution themselves (or adopt and provide support for star or similar third party products).

JR
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
I see also that Samsung have very recently released the HD203WI 2TB 4-platter model. It seems to have good customer ratings so far at newegg.com, but currently there are only 13 reviews so it's a bit early to tell if it's reliable. Has anyone tried this model with ZFS?

Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Julian Regel jrmailgate-zfsdisc...@yahoo.co.uk wrote:

If you would like a backup that allows access to individual files, you need a file-based backup, and I am sure that even a filesystem-level scan for recently changed files will not be much faster than what you may achieve with e.g. star. Note that ufsdump directly accesses the raw disk device and thus _is_ at the filesystem level, but is still slower than star on UFS.

While I am sure that star is technically a fine utility, the problem is that it is effectively an unsupported product.

From this viewpoint, you may call most of Solaris unsupported.

If our customers find a bug in their backup that is caused by a failure in a Sun-supplied utility, then they have a legal course of action. The customer's system administrators are covered because they were using tools provided by the vendor. The wrath of the customer would be upon Sun, not the supplier (us) or the supplier's technical lead (me).

Do you really believe that Sun will help such a customer? There are many bugs in Solaris (I remember e.g. some showstopper bugs in the multimedia area) that are not fixed although they have been known for a very long time (more than a year). There is a bug in ACL handling in Sun's tar (reported by me in 2004 or even before) that is not fixed. As a result, in many cases ACLs are not restored.

Note that bugs in star are fixed much faster and, looking back at the 28 years of history with star, I know of not a single bug that took more than 3 months to get a fix. Typically, bugs are fixed within less than a week - many bugs even within a few hours. This is a support quality that Sun does not offer.

So please explain to us where you see a problem with star.
Jörg
Re: [zfs-discuss] zfs send/receive as backup - reliability?
While I am sure that star is technically a fine utility, the problem is that it is effectively an unsupported product.

From this viewpoint, you may call most of Solaris unsupported.

From the perspective of the business, the contract with Sun provides that support.

If our customers find a bug in their backup that is caused by a failure in a Sun-supplied utility, then they have a legal course of action. The customer's system administrators are covered because they were using tools provided by the vendor. The wrath of the customer would be upon Sun, not the supplier (us) or the supplier's technical lead (me).

Do you really believe that Sun will help such a customer? There are many bugs in Solaris (I remember e.g. some showstopper bugs in the multimedia area) that are not fixed although they have been known for a very long time (more than a year). There is a bug in ACL handling in Sun's tar (reported by me in 2004 or even before) that is not fixed. As a result, in many cases ACLs are not restored.

If Sun don't fix a critical bug that is affecting the availability of a server that is under support, then it becomes a problem for the legal department. In the ACL example, it's possible the affected users didn't have a support contract.

Note that bugs in star are fixed much faster; looking back at the 28 years of history with star, I know of not a single bug that took more than 3 months to get a fix. Typically, bugs are fixed within less than a week - many bugs even within a few hours. This is a support quality that Sun does not offer.

Possibly, but there is no guarantee that it will be fixed, no-one to call when there is a problem, no-one to escalate the problem to if it is ignored, and no company to sue if it all goes wrong.

So please explain to us where you see a problem with star.

Hopefully my above comments explain sufficiently. It's not a technical issue with star, it's a business issue.
The rules there are very different and not based on merit (this is also why many companies prefer running their mission critical apps on Red Hat Enterprise Linux instead of CentOS, even though technically they are almost identical).

JR
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Julian Regel jrmailgate-zfsdisc...@yahoo.co.uk wrote:

While I am sure that star is technically a fine utility, the problem is that it is effectively an unsupported product.

From this viewpoint, you may call most of Solaris unsupported.

From the perspective of the business, the contract with Sun provides that support.

From a perspective of reality, such a contract will not help.

Do you really believe that Sun will help such a customer? There are many bugs in Solaris (I remember e.g. some showstopper bugs in the multimedia area) that are not fixed although they have been known for a very long time (more than a year). There is a bug in ACL handling in Sun's tar (reported by me in 2004 or even before) that is not fixed. As a result, in many cases ACLs are not restored.

If Sun don't fix a critical bug that is affecting the availability of a server that is under support, then it becomes a problem for the legal department. In the ACL example, it's possible the affected users didn't have a support contract.

What you seem to point out is that in case of a problem for a customer with a contract, the legal department gets involved. Unfortunately, lawyers do not fix bugs.

Note that bugs in star are fixed much faster; looking back at the 28 years of history with star, I know of not a single bug that took more than 3 months to get a fix. Typically, bugs are fixed within less than a week - many bugs even within a few hours. This is a support quality that Sun does not offer.

Possibly, but there is no guarantee that it will be fixed, no-one to call when there is a problem, no-one to escalate the problem to if it is ignored, and no company to sue if it all goes wrong.

Escalating a problem does not fix it.

So please explain to us where you see a problem with star.

Hopefully my above comments explain sufficiently. It's not a technical issue with star, it's a business issue.
The rules there are very different and not based on merit (this is also why many companies prefer running their mission critical apps on Red Hat Enterprise Linux instead of CentOS, even though technically they are almost identical).

Now we are back to reality. A person who is interested in a solution will usually check what happened in similar cases before. If you compare star with Sun-supplied tools against this background, Sun cannot outperform star. Red Hat Enterprise Linux may offer something you cannot get with CentOS. But I don't see that Sun can offer something you don't get with star.

Let me make another reality check: many people use GNU tar for backup purposes, but my first automated test case with incremental backups using GNU tar failed so miserably that I was unable to use GNU tar as a test reference at all. On the other hand, I have been doing incremental backup _and_ restore tests with gigabytes of real delta data on a daily basis since 2004, and I have not seen any problem since April 2005.

Jörg
[zfs-discuss] ZFS default compression and file size limit?
I have a 13GB text file. I turned ZFS compression on with zfs set compression=on mypool. When I copy the 13GB file into another file, it does not get compressed (checking via du -sh). However, if I set compression=gzip, then the file does get compressed. Is there a limit on file size with the default compression algorithm? I did experiment with a much smaller file of 0.5GB with the default compression and it did get compressed. I am using S10 U8 x86/64.

Regards,
-- Wajih Ahmed Principal Field Technologist 877.274.6589 / x40572 Skype: wajih_ahmed
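One likely explanation (an assumption on my part, not something confirmed in this thread): ZFS only stores a block compressed when compression saves a minimum fraction of the block (roughly one eighth), so data that the default lzjb algorithm barely shrinks is written uncompressed, while the stronger gzip still clears the threshold. File size should not matter. A way to check what actually happened, using the names from the question:

```shell
# Compare the file's logical size with the space actually allocated;
# if the two match, the blocks were stored uncompressed.
ls -lh /mypool/bigfile.txt    # logical size (13G)
du -h /mypool/bigfile.txt     # blocks actually allocated on disk

# The dataset-wide view of how well compression is doing:
zfs get compression,compressratio mypool
```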
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
Hi, I'm using 2 x 1.5 TB drives from Samsung (EcoGreen, I believe) in my current home server. One reported 14 read errors a few weeks ago, roughly 6 months after install, which went away during the next scrub/resilver. This reminded me to order a 3rd drive, a 2.0 TB WD20EADS from Western Digital, and I now have a 3-way mirror, which is effectively a 2-way mirror with its hot spare already synced in.

The idea behind notching up the capacity is threefold:
- No "sorry, this disk happens to have 1 block too few" problems on attach.
- When the 1.5 TB disks _really_ break, I'll just order another 2 TB one and use the opportunity to upgrade pool capacity. Since at least one of the 1.5 TB drives will still be attached, there won't be any "slightly smaller drive" problems either when attaching the second 2 TB drive.
- After building in 2 bigger drives, it becomes easy to figure out which of the drives to phase out: just go for the smaller drives. This solves the headache of trying to figure out the right drive to build out when you replace drives that aren't hot spares and don't have blinking lights.

Frankly, I don't care whether the Samsung or the WD drives are better or worse; they're both consumer drives and they're both dirt cheap. Just assume that they'll break soon (since you're probably using them more intensely than their designed purpose) and make sure their replacements are already there. It also helps mixing vendors, so one glitch that affects multiple disks in the same batch won't affect your setup too much. (And yes, I broke that rule with my initial 2 Samsung drives, but I'm now glad I have both vendors :)).

Hope this helps, Constantin

Simon Breden wrote: I see also that Samsung have very recently released the HD203WI 2TB 4-platter model. It seems to have good customer ratings so far at newegg.com, but currently there are only 13 reviews so it's a bit early to tell if it's reliable. Has anyone tried this model with ZFS?
Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/

-- Sent from OpenSolaris, http://www.opensolaris.org/ Constantin Gonzalez Sun Microsystems GmbH, Germany Principal Field Technologist http://blogs.sun.com/constantin Tel.: +49 89/4 60 08-25 91 http://google.com/search?q=constantin+gonzalez Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten Amtsgericht Muenchen: HRB 161028 Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel Vorsitzender des Aufsichtsrates: Martin Haering
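The grow-by-attaching upgrade path described above can be sketched with zpool commands. A minimal sketch; the pool and device names are hypothetical, not from the message.

```shell
# Turn a two-way mirror into a three-way mirror by attaching the new,
# larger disk alongside an existing member; a resilver starts at once.
zpool attach tank c1t0d0 c1t2d0
zpool status tank      # watch the resilver complete

# When one of the smaller disks is phased out later, detach it; once
# the last small disk is gone, the pool can grow into the larger size
# (on older builds this may require an export/import of the pool).
zpool detach tank c1t0d0
```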
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Joerg Schilling wrote: Julian Regel jrmailgate-zfsdisc...@yahoo.co.uk wrote:

If you would like a backup that allows access to individual files, you need a file-based backup, and I am sure that even a filesystem-level scan for recently changed files will not be much faster than what you may achieve with e.g. star. Note that ufsdump directly accesses the raw disk device and thus _is_ at the filesystem level, but is still slower than star on UFS.

While I am sure that star is technically a fine utility, the problem is that it is effectively an unsupported product.

From this viewpoint, you may call most of Solaris unsupported.

what is that supposed to mean?

Michael
-- Michael Schuster http://blogs.sun.com/recursion Recursion, n.: see 'Recursion'
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
Hi Constantin, It's good to hear your setup with the Samsung drives is working well. Which model/revision are they? My personal preference is to use drives of the same model revision. However, in order to help ensure that the drives will perform reliably, I prefer to do a fair amount of research first, in order to find drives that are reported by many users to be working reliably in their systems. I did this for my current WD7500AAKS drives and have never seen even one read/write or checksum error in 2 years - they have worked flawlessly. As a crude method of checking reliability of any particular drive, I take a look at newegg.com and see the percentage of users rating the drives with 4 or 5 stars, and read the problems listed to see what kind of problems the drives may have. If you read the WDC links I list in the first post above, there does appear to be some problem that many users are experiencing with the most recent revisions of the WD Green 'EADS' drives and also the new Green models in the 'EARS' range. I don't know the cause of the problem though. I did wonder if the problems people are experiencing might be caused by spindown/power-saving features of the drives, which might cause a long delay before data is accessible again after spin-up, but this is just a guess. For now, I am looking at the 1.5TB Samsung HD154UI (revision 1AG01118?), or possibly the 2TB Samsung HD203WI when more user ratings are available.

Cheers, Simon http://breden.org.uk/2008/03/02/a-home-fileserver-using-zfs/
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 19/01/2010 19:11, Ian Collins wrote: Julian Regel wrote: Based on what I've seen in other comments, you might be right. Unfortunately, I don't feel comfortable backing up ZFS filesystems because the tools aren't there to do it (built into the operating system or using Zmanda/Amanda). Commercial backup solutions are available for ZFS. I know tape backup isn't sexy, but it's a reality for many of us and it's not going away anytime soon. True, but I wonder how viable its future is. One of my clients requires 17 LTO4 tapes for a full backup, which cost more and take up more space than the equivalent in removable hard drives. In the past few years growth in hard drive capacities has outstripped tapes to the extent that removable hard drives and ZFS snapshots have become a more cost effective and convenient backup medium. What do people with many tens of TB use for backup these days? http://milek.blogspot.com/2009/12/my-presentation-at-losug.html -- Robert Milkowski http://milek.blogspot.com
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 20/01/2010 10:48, Ragnar Sundblad wrote: On 19 jan 2010, at 20.11, Ian Collins wrote: Julian Regel wrote: Based on what I've seen in other comments, you might be right. Unfortunately, I don't feel comfortable backing up ZFS filesystems because the tools aren't there to do it (built into the operating system or using Zmanda/Amanda). Commercial backup solutions are available for ZFS. I know tape backup isn't sexy, but it's a reality for many of us and it's not going away anytime soon. True, but I wonder how viable its future is. One of my clients requires 17 LTO4 tapes for a full backup, which cost more and take up more space than the equivalent in removable hard drives. In the past few years growth in hard drive capacities has outstripped tapes to the extent that removable hard drives and ZFS snapshots have become a more cost effective and convenient backup medium. LTO media is still cheaper than equivalent sized disks, maybe by a factor of 5 or so. LTO drives cost a little, but so do disk shelves. So, now that there is no big price issue, there is choice instead. Use it! Hard drives are good for random access - both restore of individual files and partial rewrite. Hard drives aren't faster than tape for data transfer, but they might be cheaper to run in parallel and therefore you could potentially gain speed. Hard drives have shorter seek times, which may be important. Hard drives are probably bad for storing for longer times - especially as you will never know how long one can be stored before it will fail. A month? Probably. A year? Maybe. Five years? Well... Ten years? Probably not. LTO tapes are supposed to be able to keep their data for at least 30 years if stored properly. Hard drives are probably best when used online or at least very often. So - it is wrong to say that one is better or cheaper than the other. They have different properties, and could be used to solve different problems. It is actually not that easy. 
Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box/data for some reason you would still have a spare copy. Now compare it to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html -- Robert Milkowski http://milek.blogspot.com
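A minimal sketch of the rsync-per-client scheme described above. The pool name "backup", client "web01", and replica host "backup2" are all hypothetical, and it assumes a live ZFS system with at least one earlier snapshot in place (the very first run would need a full, non-incremental send, which is omitted for brevity):

```shell
#!/bin/sh
# Hypothetical names: pool "backup", client "web01", replica host "backup2".
CLIENT=web01
SNAP=$(date +%Y%m%d)

# One dedicated filesystem per client.
zfs list "backup/$CLIENT" >/dev/null 2>&1 || zfs create "backup/$CLIENT"

# Pull the client's data; after the first run rsync transfers only changes,
# so every subsequent run is effectively an incremental.
rsync -a --delete "root@$CLIENT:/export/" "/backup/$CLIENT/"

# The snapshot marks this backup run; older snapshots are the history.
zfs snapshot "backup/$CLIENT@$SNAP"

# Incrementally replicate the new snapshot to the second box.
PREV=$(zfs list -H -t snapshot -o name -s creation -r "backup/$CLIENT" | tail -2 | head -1)
zfs send -i "$PREV" "backup/$CLIENT@$SNAP" | ssh backup2 zfs receive -F "backup/$CLIENT"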
Re: [zfs-discuss] ZFS default compression and file size limit?
On 20/01/2010 13:39, Wajih Ahmed wrote: I have a 13GB text file. I turned ZFS compression on with zfs set compression=on mypool. When I copy the 13GB file into another file, it does not get compressed (checking via du -sh). However if I set compression=gzip, then the file gets compressed. Is there a limit on file size with the default compression algorithm? I did experiment with a much smaller file of 0.5GB with the default compression and it did get compressed. If a given block does not gain more than 12.5% from compression, it will not be stored compressed. It might be that with the default compression algorithm (lzjb) you are gaining less than 12.5%, while with gzip you are gaining more, so the blocks end up being compressed. -- Robert Milkowski http://milek.blogspot.com
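The 12.5% rule means that for the default 128 KiB recordsize, a block must compress to at most 7/8 of its size before ZFS will store it compressed. A quick sanity check of that cutoff (plain arithmetic, not taken from the ZFS source):

```shell
# ZFS keeps a compressed block only when compression saves >= 1/8 (12.5%).
RECORDSIZE=$((128 * 1024))                    # default recordsize in bytes
THRESHOLD=$((RECORDSIZE - RECORDSIZE / 8))
echo "must compress to <= $THRESHOLD bytes"   # 114688 bytes = 112 KiB
```

The ratio actually achieved on a dataset can be read back with `zfs get compressratio mypool`.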
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Wed, January 20, 2010 09:23, Robert Milkowski wrote: Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. Is there an rsync out there that can reliably replicate all file characteristics between two ZFS/Solaris systems? I haven't found one. The ZFS ACLs seem to be beyond all of them, in particular. (Losing just that, and preserving the data, is clearly far, far better than losing everything! And a system built *knowing* it was losing the protections could preserve them some other way.) -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Wed, January 20, 2010 04:48, Ragnar Sundblad wrote: LTO media is still cheaper than equivalent sized disks, maybe by a factor of 5 or so. LTO drives cost a little, but so do disk shelves. So, now that there is no big price issue, there is choice instead. Use it! Depends on the scale you're operating at. Backing up my 800GB home data pool onto a couple of external 1TB USB drives is *immensely* cheaper than buying tape equipment. At sufficiently large scales, I accept that tape is still cheaper. Makes sense, since the tapes are relatively simple compared to drives, and you only need a small number of drives to use a large number of tapes. I think hard drives are still cheaper at small-enterprise levels, actually. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info
Re: [zfs-discuss] Unavailable device
Hi John, In general, ZFS will warn you when you attempt to add a device that is already part of an existing pool. One exception is when the system is being re-installed. I'd like to see the set of steps that led to the notification failure. Thanks, Cindy On 01/19/10 20:58, John wrote: I was able to solve it, but it actually worried me more than anything. Basically, I had created the second pool using the mirror as a primary device. So three disks but two full disk root mirrors. Shouldn't zpool have detected an active pool and prevented this? The other LDOM was claiming a corrupted device, which I was able to replace and clear easily. But the one pool I originally posted about looks to be permanently gone, since it believes there is another device, but doesn't know where the device is or what it was ever called. If I could import it and re-do the mirror somehow, or something similar, it'd be great. Is there any way to force it to realize it's wrong? Obviously, I should've kept better track of the WWNs - But I've made the mistake before and zpool always prevented it.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Wed, 20 Jan 2010, Julian Regel wrote: If our customers find a bug in their backup that is caused by a failure in a Sun supplied utility, then they have a legal course of action. The customer's system administrators are covered because they were using tools provided by the vendor. The wrath of the customer would be upon Sun, not the supplier (us) or the supplier's technical lead (me). I would love to try whatever you are smoking because it must be really good stuff. It would be a bold new step for me, but the benefits are clear. While your notions of the transitive protection offered by vendor support are interesting, I will be glad to meet you in the unemployment line, and then we can share some coffee and discuss the good old days. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] zfs send/receive as backup - reliability?
It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box/data for some reason you would still have a spare copy. Now compare it to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 tape has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. Which isn't to say tape would be a better solution since it's going to be slower to restore etc. But it does show that tape can work out cheaper, especially since the cost of a high speed WAN link isn't required. 
JR
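Julian's back-of-envelope figures can be tallied directly (GBP list prices as quoted above):

```shell
# Tape side: two SL48 libraries plus 100 LTO3 tapes (5 packs of 20).
LIBRARIES=$((2 * 14000))
TAPES=$((5 * 340))
TAPE_TOTAL=$((LIBRARIES + TAPES))
echo "tape total: GBP $TAPE_TOTAL"    # 29700, i.e. just under 30000

# Disk side: one x4540 at UK list price.
DISK_TOTAL=30900
echo "disk total: GBP $DISK_TOTAL"
```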
[zfs-discuss] ZFS default compression and file size limit?
I have a 13GB text file. I turned ZFS compression on with zfs set compression=on mypool. When I copy the 13GB file into another file, it does not get compressed (checking via du -sh). However if I set compression=gzip, then the file gets compressed. Is there a limit on file size with the default compression algorithm? I did experiment with a much smaller file of 0.5GB with the default compression and it did get compressed. I am using S10 U8. Regards, -- Wajih Ahmed Principal Field Technologist 877.274.6589 / x40572 Skype: wajih_ahmed
Re: [zfs-discuss] zfs send/receive as backup - reliability?
ae == Allen Eastwood mi...@paconet.us writes: ic == Ian Collins i...@ianshome.com writes: If people are really still backing up to tapes or DVD's, just use file vdev's, export the pool, and then copy the unmounted vdev onto the tape or DVD. ae And some of those enterprises require a backup mechanism that ae can be easily used in a DR situation. ae ufsdump/restore was perfect in that regard. The lack of ae equivalent functionality is a big problem for the situations ae where this functionality is a business requirement. ae For example, one customer, local government, requires a backup ae that can be taken offsite and used in a DR situation. Were you confused by some part of: Use file vdevs, export the pool, and then copy the unmounted vdev onto the tape. or do you find that this doesn't do what you want? because it seems fine to me. And the fact that it doesn't need any extra tools means it's unlikely to break (1) far into the future or (2) for a few unlucky builds, and (3) that the restore environment is simple and doesn't involve prepopulating some Legato database with the TOC of every tape in the library or some such nonsense, which ought to all be among your ``requirements'' but if you're substituting for those ``works in exactly the way we were used to it working before'' then you may as well use 'zfs send' since you're more concerned with identical-feeling invocation syntax than the problems I mentioned. ic For a full recovery, you can archive a send stream and receive ic it back. You can send the stream to the tape, transport the tape to the DR site, and receive it. You can do this weekly as part of your offsite backup plan provided that you receive each tape you transport immediately. Then the data should be permanently stored on disk at the DR site, and the tapes used only for transport. If you store the backup permanently on tape then it's a step backwards from tar/cpio/ufsrestore because the 'zfs send' format is more fragile and has to be restored in its entirety. 
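The file-vdev workaround suggested above might look like this in outline. The sizes, paths, and tape device are hypothetical, and a multi-tape set would still need the blocking/changer handling that plain dd does not provide:

```shell
# Build a pool on top of an ordinary file, fill it, and export it.
mkfile 10g /export/vdevs/backup.img
zpool create tankfile /export/vdevs/backup.img
# ... copy the data to be archived into /tankfile ...
zpool export tankfile

# Write the (now quiescent) vdev file to tape.
dd if=/export/vdevs/backup.img of=/dev/rmt/0 bs=1024k

# Restore: read the file back from tape, then re-import the pool.
dd if=/dev/rmt/0 of=/export/vdevs/backup.img bs=1024k
zpool import -d /export/vdevs tankfile
```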
If you receive the tape immediately this is an improvement because under the old convention tapes could be damaged in transit, or over the years by heat/dust/sunlight, without your knowledge, while on disks it's simple to scrub periodically. I am not trying to take away your tapes, Allen, so please quote Ian instead if that's the thing you object to. I've instead suggested a different way to use them if you really do need them archivally: store file vdev's on them. If you're just using them to replicate data to the DR site then you needn't even go as far as my workaround. I do agree that there's a missing tool: it's not possible to copy one subdirectory to another while preserving holes, forkey extended attributes, and ACL's. Also if Windows ACL's are going to be stored right in the filesystem, then Windows ACL's probably ought to be preserved over an rsync pipe between Solaris and EnTee, or a futuristic tarball written on one and extracted on the other. I don't agree that the missing tool is designed primarily for the narrow use-case of writing to ancient backup tapes: it's a more general tool. or, really, it's just a matter of documenting and committing the extra-OOB-gunk APIs and then fixing rsync and GNUtar.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 20/01/2010 16:22, Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box/data for some reason you would still have a spare copy. Now compare it to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 tape has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. Which isn't to say tape would be a better solution since it's going to be slower to restore etc. 
JR You would also need to add at least one server to your library with FC cards. Then with most software you would need more tapes due to data fragmentation and the need to do regular full backups (with zfs+rsync you only do a full backup once). So in the best case a library will cost about the same as a disk based solution but generally will be less flexible, etc. If you would add any enterprise software on top of it (Legato, NetBackup, ...) then the price would change dramatically. Additionally, with ZFS one could start using deduplication (in testing already).
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 20/01/2010 17:21, Robert Milkowski wrote: On 20/01/2010 16:22, Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box/data for some reason you would still have a spare copy. Now compare it to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 tape has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. Which isn't to say tape would be a better solution since it's going to be slower to restore etc. 
But it does show that tape can work out cheaper, especially since the cost of a high speed WAN link isn't required. JR You would also need to add at least one server to your library with FC cards. Then with most software you would need more tapes due to data fragmentation and the need to do regular full backups (with zfs+rsync you only do a full backup once). So in the best case a library will cost about the same as a disk based solution but generally will be less flexible, etc. If you would add any enterprise software on top of it (Legato, NetBackup, ...) then the price would change dramatically. Additionally, with ZFS one could start using deduplication (in testing already). What I really mean is that a disk based solution used to be much more expensive than tape, but currently they are comparable in cost, while the disk based solution is often more flexible.
Re: [zfs-discuss] ZFS default compression and file size limit?
Mike, Thank you for your quick response... Is there a way for me to test the compression from the command line to see if lzjb is giving me more or less than the 12.5% mark? I guess it will depend on whether there is an lzjb command line utility. I am just a little surprised because gzip-6 is able to compress it to 4.4GB from 14GB (and gzip-1 to 4.8GB), and from what I read lzjb should be giving me better than 12.5% compression. For example, the *compress* command (which I think uses LZW, a slightly different variant of Lempel-Ziv) manages to reduce it to 8.0GB. That is a 57% ratio. Regards, -- Wajih Ahmed Principal Field Technologist 877.274.6589 / x40572 Skype: wajih_ahmed Robert Milkowski wrote: On 20/01/2010 13:39, Wajih Ahmed wrote: I have a 13GB text file. I turned ZFS compression on with zfs set compression=on mypool. When I copy the 13GB file into another file, it does not get compressed (checking via du -sh). However if I set compression=gzip, then the file gets compressed. Is there a limit on file size with the default compression algorithm? I did experiment with a much smaller file of 0.5GB with the default compression and it did get compressed. If a given block does not gain more than 12.5% from compression, it will not be stored compressed. It might be that with the default compression algorithm (lzjb) you are gaining less than 12.5%, while with gzip you are gaining more, so the blocks end up being compressed.
Re: [zfs-discuss] Unavailable device
John wrote: I was able to solve it, but it actually worried me more than anything. Basically, I had created the second pool using the mirror as a primary device. So three disks but two full disk root mirrors. Shouldn't zpool have detected an active pool and prevented this? The other LDOM was claiming a corrupted device, which I was able to replace and clear easily. But the one pool I originally posted about looks to be permanently gone, since it believes there is another device, but doesn't know where the device is or what it was ever called. If I could import it and re-do the mirror somehow, or something similar, it'd be great. Is there any way to force it to realize it's wrong? You can try limiting access to one device at a time by removing one device from the LDOM configuration, or by creating a separate directory such as /tmp/dsk, copying a symlink for the device you want to try into it, and then running zpool import (if the device is removed at the LDOM level) or zpool import -d /tmp/dsk (in case you prefer the trick with symlinks). Posting label 0 (from zdb -l /dev/rdsk/... output) of both involved disks may provide more clues. regards, victor Obviously, I should've kept better track of the WWNs - But I've made the mistake before and zpool always prevented it.
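Victor's symlink trick, spelled out (the device names are examples only):

```shell
# zpool import only scans the directory given with -d, so a directory
# containing a single symlink restricts it to one device at a time.
mkdir -p /tmp/dsk
ln -s /dev/dsk/c2t0d0s0 /tmp/dsk/c2t0d0s0   # hypothetical device
zpool import -d /tmp/dsk

# Dump the on-disk labels for comparison, as suggested:
zdb -l /dev/rdsk/c2t0d0s0
```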
Re: [zfs-discuss] Clearing a directory with more than 60 million files
ml == Mikko Lammi mikko.la...@lmmz.net writes: ml rm -rf to problematic directory from parent level. Running ml this command shows directory size decreasing by 10,000 ml files/hour, but this would still mean close to ten months ml (over 250 days) to delete everything! interesting. does 'zpool scrub' take unusually long, too? or is it pretty close to normal speed?
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
To those concerned about this issue, there is a patched version of smartmontools that enables the querying and setting of TLER/ERC/CCTL values (well, except for recent desktop drives from Western Digital). It's available here: http://www.csc.liv.ac.uk/~greg/projects/erc/ Unfortunately, smartmontools has limited SATA drive support in OpenSolaris, and you cannot query or set the values. I'm looking into booting into Linux, setting the values, and then rebooting into OpenSolaris, since the settings will survive a warm reboot (but not a powercycle).
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Jan 20, 2010, at 3:15 AM, Joerg Schilling wrote: Richard Elling richard.ell...@gmail.com wrote: ufsdump/restore was perfect in that regard. The lack of equivalent functionality is a big problem for the situations where this functionality is a business requirement. How quickly we forget ufsdump's limitations :-). For example, it is not supported for use on an active file system (known data corruption possibility) and UFS snapshots are, well, a poor hack and often not usable for backups. As the ufsdump(1m) manpage says, It seems you forgot that zfs also needs snapshots. There is nothing bad about snapshots. Yes, snapshots are a good thing. But most people who try fssnap on the UFS root file system will discover that it doesn't work, for reasons mentioned in the NOTES section of fssnap_ufs(1m). fssnap_ufs is simply a butt-ugly hack. So if you believe you can reliably use ufsdump to store a DR copy of root for a 7x24x365 production environment, then you probably believe the Backup Fairy will leave a coin under your pillow when your restore fails :-) Fortunately, ZFS snapshots do the right thing. -- richard
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540, so even if you were to lose an entire box/data for some reason you would still have a spare copy. Now compare it to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4-equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 tape has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. So you could provision a tape backup for just under £30000 (~$49000). In comparison, the cost of one X4540 with ~36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. A more apples to apples comparison would be to compare the storage only. Both removable drive and tape options require a server with FC or SCSI ports, so that can be excluded from the comparison. 
So for 30TB, assuming 2TB drives @ ~£100 with a pool built of six-drive raidz vdevs, 18 drives would be required plus two 16-drive shelves. So each backup set would cost about £1800. So there's not a great deal of difference. With drives you also get the added benefit of keeping all your incrementals (as snapshots) on the archive set. HDD price per GB will continue to drop faster than tape, so it will be interesting to do the same comparison in 12 months. -- Ian.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Joerg Schilling wrote: Ian Collins i...@ianshome.com wrote: The correct way to archive ACLs would be to put them into extended POSIX tar attributes as star does. See http://cdrecord.berlios.de/private/man/star/star.4.html for a format documentation or have a look at ftp://ftp.berlios.de/pub/star/alpha, e.g. ftp://ftp.berlios.de/pub/star/alpha/acl-test.tar.gz The ACL format used by Sun is undocumented. man acltotext We are talking about TAR and I did give a pointer to the star archive format documentation, so it is obvious that I was talking about the ACL format from Sun tar. This format is not documented. It is, Sun's ZFS ACL aware tools use acltotext() to format ACLs. -- Ian.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
jr == Julian Regel jrmailgate-zfsdisc...@yahoo.co.uk writes: jr While I am sure that star is technically a fine utility, the jr problem is that it is effectively an unsupported product. I have no problems with this whatsoever. jr If our customers find a bug in their backup that is caused by jr a failure in a Sun supplied utility, then they have a legal jr course of action. The customer's system administrators are jr covered because they were using tools provided by the jr vendor. The wrath of the customer would be upon Sun, not the jr supplier (us) or the supplier's technical lead (me). We were just talking about this somewhere else, actually: ``if something goes wrong, it's their ass. but if nothing ever gets done, it's nobody's fault.'' It's sad for me how much money is to be made supporting broken corporate cultures like that. I'm not saying you're wrong, just that you might not want to contribute to such a culture because you've chosen to endure it for a scratch. You need to have a better way to evaluate employees than micromanagement-by-the-clueless and vindictive hindsight. But the point that there's money to be made by bleeding it out of ossified broken American companies is well-taken. jr From the perspective of the business, the system administrator jr will have acted irresponsibly by choosing a tool that has no jr vendor support. From the perspective of MY business, I would much rather have the dark OOB acl/fork/whatever-magic that's gone into ZFS and NFSv4 supported in standard tools like rsync and GNUtar. This is, for example, what Apple achieved with CUPS and why I can share printers between Ubuntu and Mac OS effortlessly, and this increases the amount of money I'm willing to give Apple for their proprietary platform. 
The purpose of the tool I'm discussing definitely includes the same level of cooperation, so working with the existing best-in-class and most-popular tools, and reasonableness, might be better than brittle CYA support in some fringey '/opt/SUNWbkpkit/bin/VendorCP -Rf' tool. Even if you get your cyaCP tool you may find it doesn't achieve the ass-covering you wanted because these tools can be cheeky little bastards. Most of the other quirky little balkanized-platform Solaris-only tools are littered with straightjacketing assertions to avoid ``call generators'' and push the blame back onto the sysadmin, then there is some ``all bets are off'' flag to allow you to actually accomplish the job, like 'NOINUSECHECK=1 format -e'. Honestly...why bother playing this game?
[zfs-discuss] Filesystem Quotas
I currently have one filesystem / (root), is it possible to put a quota on let's say /var? Or would I have to move /var to its own filesystem in the same pool? Thanks
Re: [zfs-discuss] ZFS default compression and file size limit?
On Wed, Jan 20, 2010 at 12:42:35PM -0500, Wajih Ahmed wrote: Mike, Thank you for your quick response... Is there a way for me to test the compression from the command line to see if lzjb is giving me more or less than the 12.5% mark? I guess it will depend if there is a lzjb command line utility. I am just a little surprised because gzip-6 is able to compress it to 4.4GB from 14GB (and gzip-1 to 4.8GB) and from what i read lzjb should be giving me better than 12.5% compression. For example the *compress* command (which i think uses LZW, a different Lempel-Ziv variant) manages to reduce it to 8.0GB. That is a 57% ratio. That's over the whole file as a single compression stream. ZFS has to compress each block (128k or maybe less) independently. This can't do as well. -- Dan.
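The ratios quoted above can be sanity-checked with a little arithmetic; a quick sketch using the sizes from the post (not measured here):

```shell
# Back-of-the-envelope check of the figures quoted in the thread,
# all sizes in GB (14 GB original file).
gzip6_pct=$(awk 'BEGIN { printf "%.1f", 4.4 / 14 * 100 }')
compress_pct=$(awk 'BEGIN { printf "%.1f", 8.0 / 14 * 100 }')
echo "gzip-6 output is ${gzip6_pct}% of the original; compress output is ${compress_pct}%"
```

So gzip-6 leaves roughly a third of the original size, while compress leaves 57% — consistent with the post's numbers.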
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Jan 20, 2010, at 12:21, Robert Milkowski wrote: On 20/01/2010 16:22, Julian Regel wrote: [...] So you could provision a tape backup for just under £3 (~ $49000). In comparison, the cost of one X4540 with ~ 36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. [...] You would also need to add at least one server to your library with FC cards. Then with most software you would need more tapes due to data fragmentation and a need to do regular full backups (with zfs+rsync you only do a full backup once). So in the best case a library will cost about the same as a disk based solution but generally will be less flexible, etc. If you would add any enterprise software on top of it (Legato, NetBackup, ...) then the price would change dramatically. Additionally with ZFS one could start using deduplication (in testing already). Regardless of the economics of tape, nowadays you generally need to go to disk first because trying to stream at 120 MB/s (LTO-4) really isn't practical over the network, directly from the client. So in the end you'll be starting with disk (either DAS or VTL or whatever), and generally going to tape if you need to keep stuff that's older than (say) 3-6 months. Tape also doesn't rotate while it's sitting there, so if it's going to be sitting around for a while (e.g., seven years) better to use tape than something that sucks up power. LTO-5 is expected to be released RSN, with a native capacity of 1.6 TB and (uncompressed) writes at 180 MB/s. The only way to realistically feed that is from disk.
Re: [zfs-discuss] Filesystem Quotas
On 20 January, 2010 - Mr. T Doodle sent me these 1,0K bytes: I currently have one filesystem / (root), is it possible to put a quota on let's say /var? Or would I have to move /var to it's own filesystem in the same pool? Only filesystems can have different settings. /Tomas -- Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se
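A minimal sketch of the answer above — /var has to become its own dataset before it can carry a quota. The dataset names and the 4G figure are illustrative assumptions, and migrating the live /var contents into the new dataset is out of scope here (it typically has to be done from single-user mode or a live CD):

```shell
# Create a dedicated dataset for /var, then cap it with a quota.
zfs create -o mountpoint=/var rpool/var
zfs set quota=4G rpool/var
zfs get quota rpool/var    # confirm the setting took
```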
Re: [zfs-discuss] Panic running a scrub
Hi Frank, I couldn't reproduce this problem on SXCE build 130 by failing a disk in a mirrored pool and then immediately running a scrub on the pool. It works as expected. Any other symptoms (like a power failure?) before the disk went offline? Is it possible that both disks went offline? We would like to review the crash dump if you still have it, just let me know when it's uploaded. Thanks, Cindy On 01/19/10 12:30, Frank Middleton wrote: This is probably unreproducible, but I just got a panic whilst scrubbing a simple mirrored pool on SXCE snv124. Evidently one of the disks went offline for some reason and shortly thereafter the panic happened. I have the dump and the /var/adm/messages containing the trace. Is there any point in submitting a bug report? The panic starts with: Jan 19 13:27:13 host6 ^Mpanic[cpu1]/thread=2a1009f5c80: Jan 19 13:27:13 host6 unix: [ID 403854 kern.notice] assertion failed: 0 == zap_update(dp->dp_meta_objset, DMU_POOL_DIRECTORY_OBJECT, DMU_POOL_SCRUB_BOOKMARK, sizeof (uint64_t), 4, &dp->dp_scrub_bookmark, tx), file: ../../common/fs/zfs/dsl_scrub.c, line: 853 FWIW when the system came back up, it resilvered with no problem and now I'm rerunning the scrub.
Re: [zfs-discuss] Unavailable device
Unfortunately, since we got a new priority on the project, I had to scrap and recreate the pool, so I don't have any of the information anymore. -- This message posted from opensolaris.org
Re: [zfs-discuss] can i make a COMSTAR zvol bigger?
On Wed, Jan 20, 2010 02:38 PM, Thomas Burgess wonsl...@gmail.com wrote: I finally got iscsi working, and it's amazing... it took a minute for me to figure out... i didn't realize it required 2 tools, but anyways. my original zvol is too small. i created a 120 gb zvol for time machine but i really need more like 250 gb, so this is a two-part question. First, can i make the zvol/iscsi drive bigger... and also, let's assume i can't (and just for my general knowledge), how can i delete the comstar iscsi volume? I noticed zfs destroy won't work if it's shared iscsi even if i try to force it (i was hoping it would just destroy it and i could make a new one) Yes you can. Size of the vol is a ZFS property.

-bash-3.2# zfs get all datapool/stores/axigen/lun2
NAME                         PROPERTY              VALUE                  SOURCE
datapool/stores/axigen/lun2  type                  volume                 -
datapool/stores/axigen/lun2  creation              Sun Sep 27 21:40 2009  -
datapool/stores/axigen/lun2  used                  250G                   -
datapool/stores/axigen/lun2  available             516G                   -
datapool/stores/axigen/lun2  referenced            87.1G                  -
datapool/stores/axigen/lun2  compressratio         1.00x                  -
datapool/stores/axigen/lun2  reservation           none                   default
datapool/stores/axigen/lun2  volsize               250G                   -
datapool/stores/axigen/lun2  volblocksize          4K                     -
datapool/stores/axigen/lun2  checksum              on                     default
datapool/stores/axigen/lun2  compression           off                    default
datapool/stores/axigen/lun2  readonly              off                    default
datapool/stores/axigen/lun2  shareiscsi            off                    default
datapool/stores/axigen/lun2  copies                1                      default
datapool/stores/axigen/lun2  refreservation        250G                   local
datapool/stores/axigen/lun2  primarycache          all                    default
datapool/stores/axigen/lun2  secondarycache        all                    default
datapool/stores/axigen/lun2  usedbysnapshots       0                      -
datapool/stores/axigen/lun2  usedbydataset         87.1G                  -
datapool/stores/axigen/lun2  usedbychildren        0                      -
datapool/stores/axigen/lun2  usedbyrefreservation  163G                   -

Set the volsize property to what you want, then modify the logical unit, e.g.
Usage: stmfadm modify-lu [OPTIONS] LU-name
OPTIONS:
  -p, --lu-prop  logical-unit-property=value
  -s, --size     size K/M/G/T/P
  -f, --file
Description: Modify properties of a logical unit.
Valid properties for -p, --lu-prop are:
  alias     - alias for logical unit (up to 255 chars)
  mgmt-url  - Management URL address
  wcd       - write cache disabled (true, false)
  wp        - write protect (true, false)

You will probably want to offline your target before making these changes. Now of course, this doesn't mean the space is immediately usable on the target host. If it's Windows you can use diskpart extend. If it's Linux, then you may need another method depending upon the file system. -Errol
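Pulling the steps above together, a condensed sketch using the dataset from the example; the LU GUID is a placeholder you would get from `stmfadm list-lu`:

```shell
# 1. Grow the backing zvol; volsize is just a ZFS property.
zfs set volsize=250G datapool/stores/axigen/lun2

# 2. Tell COMSTAR the logical unit grew (offline the target first
#    if initiators are connected). The GUID below is a placeholder.
stmfadm modify-lu -s 250G 600144f0...

# 3. Grow the filesystem inside the LUN from the initiator side
#    (diskpart extend on Windows, resize2fs or similar on Linux).
```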
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Ian Collins i...@ianshome.com wrote: We are talking about TAR and I did give a pointer to the star archive format documentation, so it is obvious that I was talking about the ACL format from Sun tar. This format is not documented. It is, Sun's ZFS ACL aware tools use acltotext() to format ACLs. Please don't reply without checking facts. The fact that you know that there is salt in the soup does not give you the whole list of ingredients. Please look into the Sun tar format to understand that you are wrong. Jörg -- EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de (uni) joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
Hi Lutz, On Jan 20, 2010, at 3:17 AM, Lutz Schumann wrote: Hello, we tested clustering with ZFS and the setup looks like this: - 2 head nodes (nodea, nodeb) - head nodes contain l2arc devices (nodea_l2arc, nodeb_l2arc) This makes me nervous. I suspect this is not in the typical QA test plan. - two external jbods - two mirror zpools (pool1, pool2) - each mirror is a mirror of one disk from each jbod - no ZIL (anyone know a well priced SAS SSD?) We want active/active and added the l2arc to the pools. - pool1 has nodea_l2arc as cache - pool2 has nodeb_l2arc as cache Everything is great so far. One thing to note is that nodea_l2arc and nodeb_l2arc are named identically (c0t2d0 on both nodes)! What we found is that during tests, the pool just picked up the device nodeb_l2arc automatically, although it was never explicitly added to the pool pool1. This is strange. Each vdev is supposed to be uniquely identified by its GUID. This is how ZFS can identify the proper configuration when two pools have the same name. Can you check the GUIDs (using zdb) to see if there is a collision? -- richard We had a setup stage when pool1 was configured on nodea with nodea_l2arc and pool2 was configured on nodeb without an l2arc. Then we did a failover. Then pool1 picked up the (until then) unconfigured nodeb_l2arc. Is this intended? Why is an L2ARC device automatically picked up if the device name is the same? In a later stage we had both pools configured with the corresponding l2arc device. (po...@nodea with nodea_l2arc and po...@nodeb with nodeb_l2arc). Then we also did a failover. The l2arc device of the pool failing over was marked as too many corruptions instead of missing. So from these tests it looks like ZFS just picks up the device with the same name and replaces the l2arc without looking at the device signatures to only consider devices being part of a pool. We have not tested with a data disk as c0t2d0 but if the same behaviour occurs - god save us all. 
Can someone clarify the logic behind this? Can someone also give a hint how to rename SAS disk devices in OpenSolaris? (To work around this I would like to rename c0t2d0 on nodea (nodea_l2arc) to c0t24d0 and c0t2d0 on nodeb (nodeb_l2arc) to c0t48d0.) P.S. Release is build 104 (NexentaCore 2). Thanks! -- This message posted from opensolaris.org
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Miles Nordin car...@ivy.net wrote: From the perspective of MY business, I would much rather have the dark OOB acl/fork/whatever-magic that's gone into ZFS and NFSv4 supported in standard tools like rsync and GNUtar. This is, for example, what GNU tar does not support: any platform-specific feature on any OS. Don't expect that GNU tar will ever add such properties. Jörg -- EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de (uni) joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On 20 January, 2010 - Richard Elling sent me these 2,7K bytes: Hi Lutz, On Jan 20, 2010, at 3:17 AM, Lutz Schumann wrote: Hello, we tested clustering with ZFS and the setup looks like this: - 2 head nodes (nodea, nodeb) - head nodes contain l2arc devices (nodea_l2arc, nodeb_l2arc) This makes me nervous. I suspect this is not in the typical QA test plan. - two external jbods - two mirror zpools (pool1, pool2) - each mirror is a mirror of one disk from each jbod - no ZIL (anyone know a well priced SAS SSD?) We want active/active and added the l2arc to the pools. - pool1 has nodea_l2arc as cache - pool2 has nodeb_l2arc as cache Everything is great so far. One thing to note is that nodea_l2arc and nodeb_l2arc are named identically (c0t2d0 on both nodes)! What we found is that during tests, the pool just picked up the device nodeb_l2arc automatically, although it was never explicitly added to the pool pool1. This is strange. Each vdev is supposed to be uniquely identified by its GUID. This is how ZFS can identify the proper configuration when two pools have the same name. Can you check the GUIDs (using zdb) to see if there is a collision? 
Reproducible:

itchy:/tmp/blah# mkfile 64m disk1
itchy:/tmp/blah# zfs create -V 64m rpool/blahcache
itchy:/tmp/blah# zpool create blah /tmp/blah/disk1
itchy:/tmp/blah# zpool add blah cache /dev/zvol/dsk/rpool/blahcache
itchy:/tmp/blah# zpool status blah
  pool: blah
 state: ONLINE
 scrub: none requested
config:

        NAME                             STATE   READ WRITE CKSUM
        blah                             ONLINE     0     0     0
          /tmp/blah/disk1                ONLINE     0     0     0
        cache
          /dev/zvol/dsk/rpool/blahcache  ONLINE     0     0     0

errors: No known data errors
itchy:/tmp/blah# zpool export blah
itchy:/tmp/blah# zdb -l /dev/zvol/dsk/rpool/blahcache
LABEL 0
    version=15
    state=4
    guid=6931317478877305718
itchy:/tmp/blah# zfs destroy rpool/blahcache
itchy:/tmp/blah# zfs create -V 64m rpool/blahcache
itchy:/tmp/blah# dd if=/dev/zero of=/dev/zvol/dsk/rpool/blahcache bs=1024k count=64
64+0 records in
64+0 records out
67108864 bytes (67 MB) copied, 0.559299 seconds, 120 MB/s
itchy:/tmp/blah# zpool import -d /tmp/blah
  pool: blah
    id: 16691059548146709374
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:

        blah                             ONLINE
          /tmp/blah/disk1                ONLINE
        cache
          /dev/zvol/dsk/rpool/blahcache

itchy:/tmp/blah# zdb -l /dev/zvol/dsk/rpool/blahcache
LABEL 0
LABEL 1
LABEL 2
LABEL 3
itchy:/tmp/blah# zpool import -d /tmp/blah blah
itchy:/tmp/blah# zpool status
  pool: blah
 state: ONLINE
 scrub: none requested
config:

        NAME                             STATE   READ WRITE CKSUM
        blah                             ONLINE     0     0     0
          /tmp/blah/disk1                ONLINE     0     0     0
        cache
          /dev/zvol/dsk/rpool/blahcache  ONLINE     0     0     0

errors: No known data errors
itchy:/tmp/blah# zdb -l /dev/zvol/dsk/rpool/blahcache
LABEL 0
    version=15
    state=4
    guid=6931317478877305718
...

It did indeed overwrite my formerly clean blahcache. Smells like a serious bug. /Tomas -- Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se -- richard We had a setup stage when pool1 was configured on nodea with nodea_l2arc and pool2 was configured on nodeb without an l2arc. Then we did a failover. 
Then pool1 picked up the (until then) unconfigured nodeb_l2arc. Is this intended? Why is an L2ARC device automatically picked up if the device name is the same? In a later stage we had both pools configured with the corresponding l2arc device. (po...@nodea with nodea_l2arc and po...@nodeb with nodeb_l2arc). Then we also did a failover. The l2arc device of the pool failing over was marked as too many corruptions instead of missing. So from these tests it looks like ZFS just picks up the device with the same name and replaces the l2arc without looking at the device signatures to only consider devices being part of a pool. We have not tested with a data disk as c0t2d0 but if the same behaviour
Re: [zfs-discuss] Panic running a scrub
On 01/20/10 04:27 PM, Cindy Swearingen wrote: Hi Frank, I couldn't reproduce this problem on SXCE build 130 by failing a disk in mirrored pool and then immediately running a scrub on the pool. It works as expected. The disk has to fail whilst the scrub is running. It has happened twice now, once with the bottom half of the mirror, and again with the top half. Any other symptoms (like a power failure?) before the disk went offline? It is possible that both disks went offline? Neither. The system is on a pretty beefy UPS, and one half of the mirror was definitely online (zpool status just before panic showed one disk offline and the pool as degraded). We would like to review the crash dump if you still have it, just let me know when its uploaded. Do you need the unix.0, vmcore.0 or both? I'll add either or both as attachments to newly created Bug 14012, Panic running a scrub, when you let me know which one(s) you want. Thanks -- Frank
Re: [zfs-discuss] Panic running a scrub
Hi Frank, We need both files. Thanks, Cindy On 01/20/10 15:43, Frank Middleton wrote: On 01/20/10 04:27 PM, Cindy Swearingen wrote: Hi Frank, I couldn't reproduce this problem on SXCE build 130 by failing a disk in mirrored pool and then immediately running a scrub on the pool. It works as expected. The disk has to fail whilst the scrub is running. It has happened twice now, once with the bottom half of the mirror, and again with the top half. Any other symptoms (like a power failure?) before the disk went offline? It is possible that both disks went offline? Neither. The system is on a pretty beefy UPS, and one half of the mirror was definitely online (zpool status just before panic showed one disk offline and the pool as degraded). We would like to review the crash dump if you still have it, just let me know when its uploaded. Do you need the unix.0, vmcore.0 or both? I'll add either or both as attachments to newly created Bug 14012, Panic running a scrub, when you let me know which one(s) you want. Thanks -- Frank
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
Though the ARC case (PSARC/2007/618) is unpublished, I gather from googling and the source that L2ARC devices are considered auxiliary, in the same category as spares. If so, then it is perfectly reasonable to expect that it gets picked up regardless of the GUID. This also implies that it is shareable between pools until assigned. Brief testing confirms this behaviour. I learn something new every day :-) So, I suspect Lutz sees a race when both pools are imported onto one node. This still makes me nervous though... -- richard
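Given that behaviour, it may be worth inspecting a candidate cache device's label before any pool can adopt it; a hedged sketch (the device path is illustrative):

```shell
# Inspect the ZFS label on a would-be cache device. A populated label
# (version/state/guid lines) means some pool already claims it; an
# empty label means it is genuinely unassigned.
zdb -l /dev/rdsk/c0t2d0s0 | egrep 'version|state|guid'
```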
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On Wed, Jan 20, 2010 at 2:52 PM, David Magda dma...@ee.ryerson.ca wrote: On Jan 20, 2010, at 12:21, Robert Milkowski wrote: On 20/01/2010 16:22, Julian Regel wrote: [...] So you could provision a tape backup for just under £3 (~$49000). In comparison, the cost of one X4540 with ~ 36TB usable storage is UK list price £30900. I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. [...] You would also need to add at least one server to your library with FC cards. Then with most software you would need more tapes due to data fragmentation and a need to do regular full backups (with zfs+rsync you only do a full backup once). So in the best case a library will cost about the same as a disk based solution but generally will be less flexible, etc. If you would add any enterprise software on top of it (Legato, NetBackup, ...) then the price would change dramatically. Additionally with ZFS one could start using deduplication (in testing already). Regardless of the economics of tape, nowadays you generally need to go to disk first because trying to stream at 120 MB/s (LTO-4) really isn't practical over the network, directly from the client. I remember from about 5 years ago (before LTO-4 days) that streaming tape drives would go to great lengths to ensure that the drive kept streaming - because it took so much time to stop, back up and stream again. And one way the drive firmware accomplished that was to write blocks of zeros when there was no data available. This also occurred when the backup source was sending a bunch of small files, which took longer to stream and didn't produce enough data to keep the drive writing useful data. And if you had the tape hardware setup to do compression, then, assuming a normal 2:1 compression ratio, you'd need to source 240 MB/s in order to keep the tape writing 120 MB/s. The net result was the consumption of a lot more tape than a back-of-the-napkin calculation told you was required. 
Obviously at higher compression ratios or with the higher stream data write rates you quote below - this problem becomes more troublesome. So I agree with your conclusion: The only way to realistically feed that is from disk. So in the end you'll be starting with disk (either DAS or VTL or whatever), and generally going to tape if you need to keep stuff that's older than (say) 3-6 months. Tape also doesn't rotate while it's sitting there, so if it's going to be sitting around for a while (e.g., seven years) better to use tape than something that sucks up power. LTO-5 is expected to be released RSN, with a native capacity of 1.6 TB and (uncompressed) writes at 180 MB/s. The only way to realistically feed that is from disk. Regards, -- Al Hopper Logical Approach Inc, Plano, TX a...@logical-approach.com Voice: 972.379.2133 Timezone: US CDT OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007 http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
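The stream-rate arithmetic in the post, spelled out (LTO-4 native rate, with an assumed 2:1 hardware compression ratio):

```shell
# With hardware compression on, the host must feed the drive at the
# native rate times the compression ratio, or the drive falls out of
# streaming mode.
native=120   # MB/s, LTO-4 native write speed
ratio=2      # assumed 2:1 hardware compression
needed=$((native * ratio))
echo "source must sustain ${needed} MB/s to keep the drive streaming"
```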
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 20 jan 2010, at 17.22, Julian Regel wrote: It is actually not that easy. Compare the cost of 2x x4540 with 1TB disks to an equivalent solution on LTO. Each x4540 could be configured as: 4x 11 disks in raidz-2 + 2x hot spare + 2x OS disks. The four raidz2 groups form a single pool. This would provide well over 30TB of logical storage per box. Now you rsync all the data from your clients to a dedicated filesystem per client, then create a snapshot. All snapshots are replicated to a 2nd x4540 so even if you would lose the entire box/data for some reason you would still have a spare copy. Now compare it to the cost of a library, LTO drives, tapes, software + licenses, support costs, ... See more details at http://milek.blogspot.com/2009/12/my-presentation-at-losug.html I've just read your presentation Robert. Interesting stuff. I've also just done a pen and paper exercise to see how much 30TB of tape would cost as a comparison to your disk based solution. Using list prices from Sun's website (and who pays list..?), an SL48 with 2 x LTO3 drives would cost £14000. I couldn't see a price on an LTO4 equipped SL48 despite the Sun website saying it's a supported option. Each LTO3 has a native capacity of 300GB and the SL48 can hold up to 48 tapes in the library (14.4TB native per library). To match the 30TB in your solution, we'd need two libraries totalling £28000. LTO3 has a native capacity of 400 GB, LTO4 has 800. The price is about the same per tape and per drive, a little higher for LTO4. You would also need 100 LTO3 tapes to provide 30TB of native storage. I recently bought a pack of 20 tapes for £340, so five packs would be £1700. Or rather 38 LTO4 tapes, and only one 48 tape library. But that doesn't matter; the interesting part is that one now can use whatever best solves the problem at hand. So you could provision a tape backup for just under £3 (~$49000). In comparison, the cost of one X4540 with ~ 36TB usable storage is UK list price £30900. 
I've not factored in backup software since you could use an open source solution such as Amanda or Bacula. Which isn't to say tape would be a better solution since it's going to be slower to restore etc. But it does show that tape can work out cheaper, especially since the cost of a high speed WAN link isn't required. Reading from tape is normally faster than reading from (a single) disk. Seek time of course isn't. /ragge
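The tape counts being traded back and forth can be checked directly; a sketch using the native capacities cited in the thread (LTO-3 400 GB per the correction, LTO-4 800 GB) against the 30 TB target, with ceiling division since partial tapes still cost a whole tape:

```shell
# Tapes needed for 30 TB of native (uncompressed) capacity.
total_gb=30000
lto3=400
lto4=800
tapes3=$(( (total_gb + lto3 - 1) / lto3 ))
tapes4=$(( (total_gb + lto4 - 1) / lto4 ))
echo "LTO-3 needs ${tapes3} tapes; LTO-4 needs ${tapes4} tapes"
```

Note the count at LTO-3's often-quoted 300 GB figure would be 100 tapes, matching Julian's original estimate.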
Re: [zfs-discuss] zfs send/receive as backup - reliability?
On 21 jan 2010, at 00.20, Al Hopper wrote: I remember from about 5 years ago (before LTO-4 days) that streaming tape drives would go to great lengths to ensure that the drive kept streaming - because it took so much time to stop, back up and stream again. And one way the drive firmware accomplished that was to write blocks of zeros when there was no data available. This also occurred when the backup source was sending a bunch of small files, which took longer to stream and didn't produce enough data to keep the drive writing useful data. And if you had the tape hardware setup to do compression, then, assuming a normal 2:1 compression ratio, you'd need to source 240 MB/s in order to keep the tape writing 120 MB/s. The net result was the consumption of a lot more tape than a back-of-the-napkin calculation told you was required. Obviously at higher compression ratios or with the higher stream data write rates you quote below - this problem becomes more troublesome. So I agree with your conclusion: The only way to realistically feed that is from disk. Yes! Modern LTO drives can typically vary their speed by a factor of four or so, so even if you can't keep up with the tape drive's maximum speed, it will typically work pretty well anyway. If you can't keep up even then, it will have to stop, back up a bit, and restart, which will be _very_ slow. Having a disk system deliver data at 240 MB/s at the same time as you are writing to it can be a bit of a challenge. I haven't seen drives that fill out with zeros. Sounds like an ugly solution, but maybe it could be useful in some strange case. /ragge
Re: [zfs-discuss] Panic running a scrub
On 01/20/10 05:55 PM, Cindy Swearingen wrote: Hi Frank, We need both files. The vmcore is 1.4GB. An http upload is never going to complete. Is there an ftp-able place to send it, or can you download it if I post it somewhere? Cheers -- Frank
Re: [zfs-discuss] L2ARC in Cluster is picked up althought not part of the pool
On Wed, Jan 20, 2010 at 03:20:20PM -0800, Richard Elling wrote: Though the ARC case, PSARC/2007/618 is unpublished, I gather from googling and the source that L2ARC devices are considered auxiliary, in the same category as spares. If so, then it is perfectly reasonable to expect that it gets picked up regardless of the GUID. This also implies that it is shareable between pools until assigned. Brief testing confirms this behaviour. I learn something new every day :-) So, I suspect Lutz sees a race when both pools are imported onto one node. This still makes me nervous though... Yes. What if device reconfiguration renumbers my controllers, will l2arc suddenly start trashing a data disk? The same problem used to be a risk for swap, but less so now that we swap to named zvol. There's work afoot to make l2arc persistent across reboot, which implies some organised storage structure on the device. Fixing this shouldn't wait for that. -- Dan.
Re: [zfs-discuss] zfs send/receive as backup - reliability?
Ragnar Sundblad ra...@csc.kth.se wrote: Yes! Modern LTO drives can typically vary their speed by a factor of four or so, so even if you can't keep up with the tape drive's maximum speed, it will typically work pretty well anyway. If you can't keep up even then, it will have to stop, back up a bit, and restart, which will be _very_ slow. Having a disk system deliver data at 240 MB/s at the same time as you are writing to it can be a bit of a challenge. And star implements a FIFO that is written in a way that dramatically reduces the sawtooth behavior seen typically with other applications. You just need to tell star to use a, say, 2 GB FIFO and star will be able to keep the tape streaming for a longer time before it waits until there is enough data for the next longer streaming period. Jörg -- EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de (uni) joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
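For illustration, an invocation along these lines — assuming star's fs= option sets the FIFO size and the usual Solaris no-rewind tape device path (check the star man page for your build before relying on the exact spelling):

```shell
# Archive /export to tape with a 2 GB FIFO smoothing out bursty reads,
# so the drive stays in streaming mode longer.
star -c fs=2g f=/dev/rmt/0n -C /export .
```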
Re: [zfs-discuss] can i make a COMSTAR zvol bigger?
Yes you can. Size of the vol is a ZFS property. yes, i knew this but i wasn't sure how to do the REST. =) Set the volsize property to what you want then, then modify the logical unit e.g. Usage: stmfadm modify-lu [OPTIONS] LU-name OPTIONS: -p, --lu-prop logical-unit-property=value -s, --size size K/M/G/T/P -f, --file Description: Modify properties of a logical unit. Valid properties for -p, --lu-prop are: alias - alias for logical unit (up to 255 chars) mgmt-url - Management URL address wcd - write cache disabled (true, false) wp - write protect (true, false) You will probably want to offline your target before making these changes. Now of course, this doesn't mean the space is immediately usable on the target host. If it's Windows you can use diskpart extend. If it's Linux, then you may need another method depending upon the file system. -Errol I don't even care if i need to reformat it on my target host, so long as i can make it bigger. Thanks for the help.
Re: [zfs-discuss] hard drive choice, TLER/ERC/CCTL
On Wed, Jan 20, 2010 at 10:04:34AM -0800, Willy wrote: To those concerned about this issue, there is a patched version of smartmontools that enables the querying and setting of TLER/ERC/CCTL values (well, except for recent desktop drives from Western Digital). [Joining together two recent threads...] Can you (or anyone else on the list) confirm if this works with the Samsung drives discussed here recently? (HD145UI and the 2TB version) I've been a regular purchaser of WD drives for some time, and they have been good to me. However, I found this recent change disturbing and annoying; now that I realise it is actually against the standards I'm even more annoyed. It's coming time to purchase another batch of disks, so I have begun paying closer attention again recently. WD may try to force customers to buy the more expensive drives, but find instead that their customers choose another drive manufacturer altogether. Users of RAID (to whom this change matters) are, by definition, purchasers of larger numbers of drives. I was also interested in the 4k-sector WD advanced format WD-EARS drives, but if they have the same limitation, and the Samsung drives allow ERC, my choice is made. It's available here http://www.csc.liv.ac.uk/~greg/projects/erc/ Unfortunately, smartmontools has limited SATA drive support in opensolaris, and you cannot query or set the values. I'm looking into booting into linux, setting the values, and then rebooting into opensolaris since the settings will survive a warm reboot (but not a powercycle). This clearly needs to be fixed and is a project worth someone taking on! Any volunteers? -- Dan.
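For reference, mainline smartmontools later gained this capability via the SCT ERC log; a sketch assuming a Linux host, a drive that supports SCT commands, and an illustrative device path:

```shell
# Query the current error-recovery-control timers:
smartctl -l scterc /dev/sda

# Set read and write recovery limits to 7.0 seconds
# (the arguments are in units of 100 ms):
smartctl -l scterc,70,70 /dev/sda
```

As the post notes, on many drives the setting survives a warm reboot but not a power cycle, so it has to be reapplied at boot.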
[zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Can anyone recommend an optimum and redundant striped configuration for an X4500? We'll be using it for an OLTP (Oracle) database and will need the best performance. Is it also true that the reads will be load-balanced across the mirrors? Is this considered a RAID 1+0 configuration?

zpool create -f testpool \
  mirror c0t0d0 c1t0d0 mirror c4t0d0 c6t0d0 \
  mirror c0t1d0 c1t1d0 mirror c4t1d0 c5t1d0 mirror c6t1d0 c7t1d0 \
  mirror c0t2d0 c1t2d0 mirror c4t2d0 c5t2d0 mirror c6t2d0 c7t2d0 \
  mirror c0t3d0 c1t3d0 mirror c4t3d0 c5t3d0 mirror c6t3d0 c7t3d0 \
  mirror c0t4d0 c1t4d0 mirror c4t4d0 c6t4d0 \
  mirror c0t5d0 c1t5d0 mirror c4t5d0 c5t5d0 mirror c6t5d0 c7t5d0 \
  mirror c0t6d0 c1t6d0 mirror c4t6d0 c5t6d0 mirror c6t6d0 c7t6d0 \
  mirror c0t7d0 c1t7d0 mirror c4t7d0 c5t7d0 mirror c6t7d0 c7t7d0 \
  mirror c7t0d0 c7t4d0

Is it even possible to do a RAID 0+1?

zpool create -f testpool \
  c0t0d0 c4t0d0 c0t1d0 c4t1d0 c6t1d0 c0t2d0 c4t2d0 c6t2d0 \
  c0t3d0 c4t3d0 c6t3d0 c0t4d0 c4t4d0 c0t5d0 c4t5d0 c6t5d0 \
  c0t6d0 c4t6d0 c6t6d0 c0t7d0 c4t7d0 c6t7d0 c7t0d0 \
  mirror c1t0d0 c6t0d0 c1t1d0 c5t1d0 c7t1d0 c1t2d0 c5t2d0 c7t2d0 \
  c1t3d0 c5t3d0 c7t3d0 c1t4d0 c6t4d0 c1t5d0 c5t5d0 c7t5d0 \
  c1t6d0 c5t6d0 c7t6d0 c1t7d0 c5t7d0 c7t7d0 c7t4d0

-- This message posted from opensolaris.org
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Wed, 20 Jan 2010, Brad wrote:

Can anyone recommend a optimum and redundant striped configuration for a X4500? We'll be using it for a OLTP (Oracle) database and will need best performance. Is it also true that the reads will be load-balanced across the mirrors? Is this considered a raid 1+0 configuration?

Zfs does not strictly support RAID 1+0. However, your sample command will create a pool based on mirror vdevs which is written to in a load-shared fashion (not striped). This type of pool is ideal for databases since it consumes the fewest of those precious IOPS, and with SATA drives you need to preserve those precious IOPS as much as possible.

Zfs does not do striping across vdevs; its load-share approach writes on (roughly) a round-robin basis, but it will also prefer a less loaded vdev when under a heavy write load, or prefer to write to an empty vdev rather than to an almost full one. Due to this zfs behavior, it is best to provision the full number of disks to start with so that the disks are evenly filled and the data is well distributed.

Reads from mirror pairs use a simple load-share algorithm to select the mirror side, which does not attempt to strictly balance the reads. This does provide more performance than one disk, but not twice the performance.

Is it even possible to do a raid 0+1?

No.

Bob

-- Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Have you looked at using Oracle ASM instead of, or with, ZFS? Recent Sun docs concerning the F5100 seem to recommend a hybrid of both.

If you don't go that route, you should generally separate redo logs from actual data so they don't compete for I/O, since a lagging redo log switch stalls the database. If you use archive logs, separate those onto yet another pool.

Realistically, it takes lots of analysis with different configurations. Every workload and database is different. A decent overview of configuring JBOD-type storage for databases is here, though it doesn't use ASM:

https://www.sun.com/offers/docs/j4000_oracle_db.pdf

It's a couple of years old, and that might explain the lack of an ASM mention.
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
@hortnon - ASM is not within the scope of this project.
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
Zfs does not do striping across vdevs, but its load share approach will write based on (roughly) a round-robin basis, but will also prefer a less loaded vdev when under a heavy write load, or will prefer to write to an empty vdev rather than write to an almost full one.

I'm trying to visualize this... can you elaborate or give an ASCII example?

So with the syntax below, load sharing is implemented?

zpool create testpool disk1 disk2 disk3
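One way to see the load-share behavior, rather than visualize it, is to watch per-vdev write activity while the pool is busy; with a pool of three single-disk top-level vdevs like the one above (disk names are hypothetical), writes should land on all three in roughly equal shares rather than in rigid fixed-width stripes:

```shell
# Hypothetical pool of three single-disk top-level vdevs
zpool create testpool disk1 disk2 disk3

# Show per-vdev I/O statistics every 5 seconds; under a write load,
# each top-level vdev should receive a similar share of the writes
zpool iostat -v testpool 5
```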
Re: [zfs-discuss] CR# 6574286, remove slog device
Hi George. Any news on this?
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
I was reading your old posts about load-shares: http://opensolaris.org/jive/thread.jspa?messageID=294580#294580

So between raidz and load-share striping: raidz stripes a file system block evenly across each vdev, but with load sharing the file system block is written to a vdev that's not filled up (a slab??), and then for the next file system block it continues filling up the 1MB slab until it's full before moving on to the next one?

Richard, can you comment? :)
Re: [zfs-discuss] Panic running a scrub
On 01/20/10 04:27 PM, Cindy Swearingen wrote:

Hi Frank, I couldn't reproduce this problem on SXCE build 130 by failing a disk in a mirrored pool and then immediately running a scrub on the pool. It works as expected.

As noted, the disk mustn't go offline until well after the scrub has started. There's another wrinkle: there are some COMSTAR iscsi targets on this pool. If no initiators are accessing any of them, the scrub completes with no errors after 6 hours. If one specific target is active, the panic ensues reproducibly at about 5h30m or so.

The precise configuration has 2 disks on one LSI controller as a mirrored pool (whole disks - no slices). Around 750GB of 1.3TB was in use when the most recent iscsi target was created. The pool is read-mostly, so it probably isn't fragmented. The zvol has copies=1 and compression off (no dedup with snv124). The initiator is VirtualBox running on Fedora C10 on AMD64, and the target disk has 32-bit Fedora C12 installed as a whole disk, which I believe is EFI.

To reproduce this it might require setting up a COMSTAR iscsi target on a mirrored pool, formatting it with an EFI label, and then running a scrub. Another, similar, target has OpenSolaris installed on it, and it doesn't seem to cause a panic on a scrub while it is running; AFAIK it doesn't use EFI, but I have not run a scrub with it active since converting to COMSTAR either.

This wouldn't explain why one or the other disk randomly goes offline, and it may be a red herring. But the scrub now runs to completion just as it always has. Since I can't get FC12 to boot from the EFI disk in VirtualBox, I may reinstall FC12 without EFI and see if that makes a difference, but it is an extremely slow process, since it takes almost 6 hours for the panic to occur each time and there's no practical way to relocate the zvol to the start of the pool.

HTH
-- Frank
Re: [zfs-discuss] x4500...need input and clarity on striped/mirrored configuration
On Jan 20, 2010, at 8:14 PM, Brad wrote:

I was reading your old posts about load-shares http://opensolaris.org/jive/thread.jspa?messageID=294580#294580 . So between raidz and load-share striping, raidz stripes a file system block evenly across each vdev but with load sharing the file system block is written on a vdev that's not filled up (slab??) then for the next file system block it continues filling up the 1MB slab until its full being moving on to the next one? Richard can you comment? :)

That seems to be a reasonable interpretation. The nit is that the 1MB changeover is not the slab size; slab sizes are usually much larger.

In my list of things to remember for Oracle and ZFS:
1. recordsize is the biggest tuning knob
2. put the redo log on a low-latency device, SSD if possible
3. avoid raidz, when possible
4. prefer to give memory to the SGA rather than the ARC

Roch provides some good guidelines for when you have an SSD and a ZFS release which offers the logbias property here:
http://blogs.sun.com/roch/entry/synchronous_write_bias_property

-- richard
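Translated into commands, points 1 and 2 plus Roch's logbias guidance might look like the sketch below. The dataset names are hypothetical, the 8K recordsize assumes an Oracle db_block_size of 8K, and logbias requires a ZFS release that has the property:

```shell
# Match recordsize to the Oracle block size on the data-file dataset
zfs set recordsize=8k tank/oradata

# Steer bulk data-file writes away from the slog so the SSD stays
# dedicated to low-latency redo commits (per Roch's logbias post)
zfs set logbias=throughput tank/oradata

# The redo log dataset keeps the default logbias=latency,
# so its synchronous writes go to the SSD slog
zfs set logbias=latency tank/oraredo
```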
[zfs-discuss] Hang zpool detach while scrub is running..
I know, we should have done zpool scrub -s first.. but.. sigh..

bits...@zfs:/opt/StorMan# zpool status -v tankmir1
  pool: tankmir1
 state: ONLINE
 scrub: scrub in progress for 0h16m, 0.14% done, 187h17m to go
config:

        NAME        STATE     READ WRITE CKSUM
        tankmir1    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            c7t6d0  ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0

errors: No known data errors

bits...@zfs:/opt/StorMan# zpool detach tankmir1 c7t7d0
(hung)

At this point a new SSH session to the server hangs during login. I was connected via KVM over IP to the console and don't seem to have any problem with that session (although I'm not trying to log off and back in). iostat shows all activity on c7t6d0 (as expected); however, I/O is extremely slow (< 1 megabyte/second and 100% busy).

We've been fighting a slow I/O problem with Seagate ES2 drives; some needed firmware flashing, which we didn't catch before they were in a pool, so remove, flash, reinstall, resilver, etc. is a long process.

Anything I can try to dump that's not overly intrusive? The system still seems to be working for iSCSI and CIFS, which is its purpose, so a reboot isn't planned unless this hangs in more ways. Hopefully it will respond in a while. snv129 installed.

Steve Radich - Founder and Principal of Business Information Technology Shop - www.bitshop.com
Developer Resources Site: www.ASPDeveloper.Net - www.VirtualServerFAQ.com
LinkedIn Public Profile: http://www.linkedin.com/in/steveradich
Re: [zfs-discuss] New Supermicro SAS/SATA controller: AOC-USAS2-L8e in SOHO NAS and HD HTPC
Hello, I have done basic testing of the Supermicro mainboard X8DTH-6F together with Nexenta:
http://www.supermicro.com/products/motherboard/QPI/5500/X8DTH-6F.cfm
(same SAS-II LSI 2008 chipset)

Nexenta 2: did not work.
Nexenta 3 (snv 124+): installs without problems, but no further testing.

See also my reference hardware for Nexenta: http://www.napp-it.org/hardware/index.html

Gea