Re: [zfs-discuss] ZFS Perfomance
On Wed, Apr 14, 2010 at 09:58:50AM -0700, Richard Elling wrote:
> On Apr 14, 2010, at 8:57 AM, Yariv Graf wrote:
> > From my experience dealing with > 4TB you stop writing after 80% of
> > zpool utilization
>
> YMMV. I have routinely completely filled zpools. There have been some
> improvements in performance of allocations when free space gets low in
> the past 6-9 months, so later releases are more efficient.

Some weeks ago, I read with interest an excellent discussion from Roch
Bourbonnais of changes that produced performance benefits for the fishworks
platform. After all the analysis, three key changes are described in the
penultimate paragraph. The first two of these basically adjust thresholds
for existing behavioural changes (e.g. the switch from first-fit to
best-fit); the last is an actual code change.

I meant to ask at the time, and never followed up to do so:

 - whether these changes are also/yet in onnv-gate zfs
 - if so, in which builds
 - whether the altered thresholds are accessible as tunables, for older
   builds, in the meantime.

I've just added the above as a comment on the blog post, in the hope of
attracting Roch's attention there. There have also been recent commits
(>b134) going by that look promising.

-- Dan.
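For reference, the first-fit/best-fit switchover in onnv builds of this era
appears to be governed by the metaslab_df_free_pct kernel variable; assuming
that symbol exists in the running build, it can be inspected (and, with
care, adjusted) live with mdb. A minimal sketch, not persistent across
reboots:

    # Show the current threshold (percent free space in a metaslab below
    # which the allocator switches from first-fit to best-fit).
    echo metaslab_df_free_pct/D | mdb -k

    # Experimentally raise it to 15% (decimal) on a live kernel.
    echo metaslab_df_free_pct/W 0t15 | mdb -kw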
[zfs-discuss] zdb zpool show as 'Value too large for defined data type"
zdb output for the pool is as below:

bash-3.00# zdb ttt
    version=15
    name='ttt'
    state=0
    txg=4
    pool_guid=4724975198934143337
    hostid=69113181
    hostname='cdc-x4100s8'
    vdev_tree
        type='root'
        id=0
        guid=4724975198934143337
        children[0]
            type='disk'
            id=0
            guid=7112095013338462572
            path='/dev/dsk/emcpower19c'
            phys_path='/pseudo/e...@19:c,blk'
            whole_disk=0
            metaslab_array=27
            metaslab_shift=26
            ashift=9
            asize=10704388096
            is_log=0
        children[1]
            type='disk'
            id=1
            guid=2075488245026441048
            path='/dev/dsk/emcpower18c'
            phys_path='/pseudo/e...@18:c,blk'
            whole_disk=0
            metaslab_array=25
            metaslab_shift=26
            ashift=9
            asize=10679746560
            is_log=0
        children[2]
            type='disk'
            id=2
            guid=2555205856571100029
            path='/dev/dsk/emcpower17c'
            phys_path='/pseudo/e...@17:c,blk'
            whole_disk=0
            metaslab_array=23
            metaslab_shift=26
            ashift=9
            asize=10679746560
            is_log=0
zdb: can't open ttt: Value too large for defined data type.

And another question about creating zpools: what is the difference between
creating a zpool on slice 2 and on the whole disk, and which one is
recommended?

Thanks,
Ming
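On the slice 2 vs whole disk question, the general guidance is to give ZFS
whole disks where possible: ZFS then writes an EFI label and can manage the
drive write cache itself. A minimal sketch, with a hypothetical device name
(not the emcpower devices above):

    # Whole disk: ZFS labels the disk (EFI) and can enable the write cache.
    zpool create ttt c4t0d0

    # Slice (e.g. the traditional s2 "backup" slice on an SMI label): ZFS
    # treats it as just a slice and does not manage the label or cache.
    zpool create ttt c4t0d0s2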
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
Yesterday I received a victim: "SuperServer 5026T-3RF 19" 2U, Intel X58,
1x CPU LGA1366, 8x SAS/SATA hot-swap drive bays, 8-port SAS LSI 1068E,
6-port SATA-II Intel ICH10R, 2x Gigabit Ethernet", and I have two options:
Openfiler vs OpenSolaris :)
Re: [zfs-discuss] dedup screwing up snapshot deletion
On Wed, Apr 14, 2010 at 09:04:50PM -0500, Paul Archer wrote:
> I realize that I did things in the wrong order. I should have removed the
> oldest snapshot first, on to the newest, and then removed the data in the
> FS itself.

For the problem in question, this is irrelevant. As discussed in the rest
of the thread, you'll hit this when doing anything that requires updating
the ref counts on a large number of DDT entries.

The only way snapshot order can really make a big difference is if you
arrange for it to do so in advance. If you know you have a large amount of
data to delete from a filesystem (see the sketch after this list):

 - snapshot at the start
 - start deleting
 - snapshot fast and frequently during the deletion
 - let the snapshots go, later, at a controlled pace, to limit the rate of
   actual block frees.

-- Dan.
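A minimal sketch of that pacing approach, with hypothetical dataset and
snapshot names (rm_still_running is a placeholder for however you detect
that the bulk delete is still in progress):

    # Take frequent snapshots while the bulk delete runs.
    while rm_still_running; do
        zfs snapshot tank/data@purge-$(date +%H%M%S)
        sleep 60
    done

    # Later, release the snapshots one at a time, pausing between each so
    # the resulting block frees (and DDT updates) are spread out over time.
    for snap in $(zfs list -H -t snapshot -o name -r tank/data | grep @purge-); do
        zfs destroy "$snap"
        sleep 600
    done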
Re: [zfs-discuss] dedup causing problems with NFS?(was Re: snapshots taking too much space)
Daniel Carosone wrote:
> On Wed, Apr 14, 2010 at 08:48:42AM -0500, Paul Archer wrote:
> > So I turned deduplication on on my staging FS (the one that gets
> > mounted on the database servers) yesterday, and since then I've been
> > seeing the mount hang for short periods of time off and on. (It lights
> > nagios up like a Christmas tree 'cause the disk checks hang and
> > timeout.)
>
> Does it have enough (really, lots) of memory? Do you have an l2arc cache
> device attached (as well)?

The OP said he had 8GB of RAM, and I suspect that a cheap SSD in the
40-60GB range for L2ARC would actually be the best choice to speed things
up in the future, rather than adding another 8GB of RAM.

> Dedup has a significant memory requirement, or it has to go to disk for
> lots of DDT entries. While it's doing that, NFS requests can time out.
> Lengthening the timeouts on the client (for the fs mounted as a backup
> destination) might help you around the edges of the problem.

Also, destroying the zpool where the deduped snapshots exist is fast,
though not really an option if there are other filesystems on it that
matter.

-- Erik Trimble
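Adding such an SSD as an L2ARC device is a one-line operation; a minimal
sketch, with a hypothetical device name (cache devices can be added to and
removed from a live pool at any time):

    # Attach the SSD as a cache (L2ARC) device.
    zpool add tank cache c2t1d0

    # The device then appears under a "cache" section in zpool status.
    zpool status tank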
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
David Dyer-Bennet wrote:
> On 14-Apr-10 22:44, Ian Collins wrote:
>> Hint: the southern hemisphere does exist!
>
> I've even been there. But the month/season relationship is too deeply
> built into too many things I follow (like the Christmas books come out
> of the publisher's fall list; for that matter, like that Christmas is in
> the winter) to go away at all easily.
>
> California doesn't have seasons anyway.

Yes, we do: Wet Season and Dry Season (if you're in the Bay Area), or Dry
Season and Burn-Baby-Burn Season (if you live in LA or thereabouts).

Oops. Forgot San Francisco: Fog Season and, well... Ummm... Fog Season.

-- Erik Trimble
Re: [zfs-discuss] dedup causing problems with NFS?(was Re: snapshots taking too much space)
On Wed, Apr 14, 2010 at 08:48:42AM -0500, Paul Archer wrote:
> So I turned deduplication on on my staging FS (the one that gets mounted
> on the database servers) yesterday, and since then I've been seeing the
> mount hang for short periods of time off and on. (It lights nagios up
> like a Christmas tree 'cause the disk checks hang and timeout.)

Does it have enough (really, lots) of memory? Do you have an l2arc cache
device attached (as well)?

Dedup has a significant memory requirement, or it has to go to disk for
lots of DDT entries. While it's doing that, NFS requests can time out.
Lengthening the timeouts on the client (for the fs mounted as a backup
destination) might help you around the edges of the problem.

As a related issue, are your staging (export) and backup filesystems in
the same pool? If they are, moving from staging to final will involve
another round of updating lots of DDT entries.

What might be worth trying:

 - turning dedup *off* on the staging filesystem, so NFS isn't waiting for
   it, and then deduping later as you move to the backup area at leisure
   (effectively, asynchronously to the nfs writes).
 - or, perhaps, eliminating this double work by writing directly to the
   main backup fs.

-- Dan.
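A minimal sketch of the first option, with hypothetical dataset names.
Note the dedup property only affects data written after it is changed:

    # Stop deduplicating the synchronous NFS write path.
    zfs set dedup=off tank/staging

    # Keep dedup on the backup destination, so data is deduplicated as it
    # is copied there later, off the NFS critical path.
    zfs set dedup=on tank/backup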
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On 14-Apr-10 22:44, Ian Collins wrote:
> On 04/15/10 06:16 AM, David Dyer-Bennet wrote:
>> Because 132 was the most current last time I paid much attention :-).
>> As I say, I'm currently holding out for 2010.$Spring, but knowing how
>> to get to a particular build via package would be potentially
>> interesting for the future still.
>
> I hope it's 2010.$Autumn, I don't fancy waiting until October.
>
> Hint: the southern hemisphere does exist!

I've even been there. But the month/season relationship is too deeply
built into too many things I follow (like the Christmas books come out of
the publisher's fall list; for that matter, like that Christmas is in the
winter) to go away at all easily.

California doesn't have seasons anyway.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] Why would zfs have too many errors when underlying raid array is fine?
As I mentioned earlier, I removed the hardware-based RAID6 array, changed
all the disks to passthrough disks, and made a raidz2 pool using all the
disks. I used my backup program to copy 55GB of data to the pool, and now
I have errors all over the place.

# zpool status -v
  pool: bigraid
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 0h4m with 0 errors on Wed Apr 14 22:56:36 2010
config:

        NAME        STATE     READ WRITE CKSUM
        bigraid     DEGRADED     0     0     0
          raidz2-0  DEGRADED     0     0    24
            c4t0d0  ONLINE       0     0     3
            c4t0d1  ONLINE       0     0     2
            c4t0d2  ONLINE       0     0     2
            c4t0d3  DEGRADED     0     0     2  too many errors
            c4t0d4  ONLINE       0     0     2
            c4t0d5  ONLINE       0     0     2
            c4t0d6  ONLINE       0     0     1
            c4t0d7  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c4t1d1  ONLINE       0     0     2
            c4t1d2  ONLINE       0     0     2
            c4t1d3  ONLINE       0     0     4

errors: No known data errors

So, zfs on hardware-supported RAID was fine, but zfs on passthrough disks
is not. I'm at a loss to explain it. Any ideas?
[zfs-discuss] How can I get the unuse partition size or usage after it is been used.
Hello all,

I am running OpenSolaris snv_133 and use COMSTAR to share a volume
(werr/pwd) over iSCSI to a Windows client. I copied a 1.05GB file onto the
formatted disk, and the volume usage rose to about 80%. Then I deleted the
file, left the disk idle, and cancelled the iSCSI share. I expected the
usage to drop back to about 3%, but that is not what happened:

# zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool  3.75G  2.34G  1.41G    62%  1.00x  ONLINE  -
werr   1.98G  1.10G   908M    55%  1.00x  ONLINE  -

# zfs list werr/pwd
NAME       USED  AVAIL  REFER  MOUNTPOINT
werr/pwd  1.35G   876M  1.10G  -

How can I get the space back after deleting the file and removing the
iSCSI share?
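Most likely nothing is wrong with the accounting: when the Windows client
deletes a file, the guest filesystem just marks its own blocks free and
never tells ZFS, so the zvol keeps referencing the blocks it already wrote
(there is no TRIM/UNMAP passthrough here). A few hedged ways to look at and
influence this, using the dataset names above:

    # See how the zvol itself accounts for space.
    zfs get volsize,used,referenced,refreservation werr/pwd

    # A volume created without -s carries a full refreservation, so its
    # space stays allocated regardless of what the client deletes.
    # A sparse volume can be created with -s (hypothetical new volume):
    zfs create -s -V 1g werr/pwd2

Zeroing the free space inside the guest before deleting the LUN, or simply
recreating the zvol, are the usual ways to actually reclaim the space.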
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On 04/15/10 06:16 AM, David Dyer-Bennet wrote:
> Because 132 was the most current last time I paid much attention :-).
> As I say, I'm currently holding out for 2010.$Spring, but knowing how to
> get to a particular build via package would be potentially interesting
> for the future still.

I hope it's 2010.$Autumn, I don't fancy waiting until October.

Hint: the southern hemisphere does exist!

As to which build is more stable, that depends what you want to do with it.

-- Ian.
Re: [zfs-discuss] dedup screwing up snapshot deletion
On 04/14/10 19:51, Richard Jahnel wrote:
> This sounds like the known issue about the dedupe map not fitting in ram.

Indeed, but this is not correct:

> When blocks are freed, dedupe scans the whole map to ensure each block
> is not in use before releasing it.

That's not correct. Dedup uses a data structure which is indexed by the
hash of the contents of each block. That hash function is effectively
random, so it needs to access a *random* part of the map for each free,
which means that it (as you correctly stated):

> ... takes a veeery long time if the map doesn't fit in ram. If you can,
> try adding more ram to the system.

Adding a flash-based ssd as a cache/L2ARC device is also very effective;
random i/o to ssd is much faster than random i/o to spinning rust.

- Bill
Re: [zfs-discuss] dedup screwing up snapshot deletion
7:51pm, Richard Jahnel wrote:
> This sounds like the known issue about the dedupe map not fitting in ram.
>
> When blocks are freed, dedupe scans the whole map to ensure each block
> is not in use before releasing it.
>
> This takes a veeery long time if the map doesn't fit in ram. If you can,
> try adding more ram to the system.

Thanks for the info. Unfortunately, I'm not sure I'll be able to add more
RAM any time soon. But I'm certainly going to try, as this is the primary
backup server for our Oracle databases.

Thanks again,
Paul

PS It's got 8GB right now. You think doubling that to 16GB would cut it?
Is there a way to see how big the map is, anyway?
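One way to answer that last question: zdb can print dedup table (DDT)
statistics, including the on-disk and in-core size of each entry. A minimal
sketch, assuming the pool is named tank:

    # Summary of the dedup table, plus a simulated dedup histogram.
    zdb -D tank

    # More detail: per-entry sizes and the full histogram. Total entries
    # multiplied by the in-core entry size gives a rough estimate of the
    # RAM needed to keep the whole table in memory.
    zdb -DD tank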
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On Wed, Apr 14 at 13:16, David Dyer-Bennet wrote:
> I don't get random hangs in normal use; so I haven't done anything to
> "get past" this.

Interesting. Win7-64 clients were locking up our 2009.06 server within
seconds while performing common operations like searching and copying
large directory trees. Luckily I could still roll back to 101b, which
worked fine (except for a CIFS bug because of its age), and my
roll-forward to b130 was successful as well. We now have our primary on
b130 and our slave server on b134, with no stability issues in either one.

> I DO get hangs when funny stuff goes on, which may well be related to
> that problem (at least they require a reboot). Hmmm; I get hangs
> sometimes when trying to send a full replication stream to an external
> backup drive, and I have to reboot to recover from them.
>
> I can live with this, in the short term. But now I'm feeling hopeful
> that they're fixed in what I'm likely to be upgrading to next.

Yes, hopefully.

--eric

-- Eric D. Mudama edmud...@mail.bounceswoosh.org
Re: [zfs-discuss] dedup screwing up snapshot deletion
This sounds like the known issue about the dedupe map not fitting in ram.

When blocks are freed, dedupe scans the whole map to ensure each block is
not in use before releasing it.

This takes a veeery long time if the map doesn't fit in ram. If you can,
try adding more ram to the system.
[zfs-discuss] dedup screwing up snapshot deletion
I have an approx 700GB (of data) FS that I had dedup turned on for. (See
previous posts.) I turned on dedup after the FS was populated, and was not
sure dedup was working. I had another copy of the data, so I removed the
data, and then tried to destroy the snapshots I had taken. The first two
didn't take too long, but the last one (the oldest) has taken literally
hours now. I've rebooted and tried starting over, but it hasn't made a
difference.

I realize that I did things in the wrong order. I should have removed the
oldest snapshot first, on to the newest, and then removed the data in the
FS itself. But still, it shouldn't take hours, should it? I made sure the
machine was otherwise idle, and did an 'iostat', which shows about 5KB/sec
reads and virtually no writes to the pool.

Any ideas where to look? I'd just remove the FS entirely at this point,
but I'd have to destroy the snapshot first, so I'm in the same boat, yes?

TIA,
Paul
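A couple of hedged ways to watch what the destroy is actually doing,
assuming the pool is named tank:

    # Per-vdev I/O while the destroy runs; lots of small, scattered reads
    # usually points at random DDT lookups rather than bulk block frees.
    zpool iostat -v tank 5

    # Rough view of the ARC, including how much of it is metadata (which
    # is where cached DDT blocks live).
    echo ::arc | mdb -k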
Re: [zfs-discuss] ZFS Perfomance
Richard Elling wrote:
> On Apr 14, 2010, at 8:57 AM, Yariv Graf wrote:
> > From my experience dealing with > 4TB you stop writing after 80% of
> > zpool utilization
>
> YMMV. I have routinely completely filled zpools. There have been some
> improvements in performance of allocations when free space gets low in
> the past 6-9 months, so later releases are more efficient.
> -- richard

I would echo Richard here, and add that it also seems to be dependent on
the usage characteristics. That is, using the pool in a write-mostly (or
write-almost-exclusively) fashion seems to result in no problems filling
it to 100%; so, if you're going to use the zpool for (say) storing your
DVD images (or other media-server applications), then go ahead and plan to
fill the pool up.

On the other hand, doing lots of write/erase work (particularly with a
wide mix of file sizes) does indeed seem to cause performance to drop off
rather quickly once 80% (or thereabouts) capacity is reached. For
instance, I routinely back up my local developers' workstation disks to a
ZFS box using rsync, rapidly changing the contents of my zpool each night
(as I also expire old backups more than 1 week old). That machine hits a
brick wall on performance at about 82% full (6-disk 250GB 7200RPM SATA in
a raidz1).

-- Erik Trimble
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On Wed, April 14, 2010 15:28, Miles Nordin wrote:
>> "dd" == David Dyer-Bennet writes:
>
> dd> Is it possible to switch to b132 now, for example?
>
> yeah, this is not so bad. I know of two approaches:

Thanks, I've filed and flagged this for reference.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
> "dd" == David Dyer-Bennet writes:

dd> Is it possible to switch to b132 now, for example?

yeah, this is not so bad. I know of two approaches:

 * genunix.org assembles livecd's of each b tag. You can burn one, unplug
   from the internet, install it. It is nice to have a livecd capable of
   mounting whatever zpool and zfs version you are using. I'm not sure how
   they do this, but they do it.

 * see these untested but relatively safe-looking instructions (apologies
   to whoever posted them; I didn't write down the credit):

   formal IPS docs:
   http://dlc.sun.com/osol/docs/content/2009.06/IMGPACKAGESYS/index.html

   how to get a specific snv build with ips
   -8<-
   Starting from OpenSolaris 2009.06 (snv_111b) active BE.

    1) beadm create snv_111b-dev
    2) beadm activate snv_111b-dev
    3) reboot
    4) pkg set-authority -O http://pkg.opensolaris.org/dev opensolaris.org
    5) pkg install SUNWipkg
    6) pkg list 'entire*'
    7) beadm create snv_118
    8) beadm mount snv_118 /mnt
    9) pkg -R /mnt refresh
   10) pkg -R /mnt install entire@0.5.11-0.118
   11) bootadm update-archive -R /mnt
   12) beadm umount snv_118
   13) beadm activate snv_118
   14) reboot

   Now you have a snv_118 development environment.

   also see: http://defect.opensolaris.org/bz/show_bug.cgi?id=3436
   which currently says about the same thing.
   -8<-

you see the build is specified in step 10 (entire@0.5.11-0.118). There is
no ``failsafe'' boot archive with opensolaris like the ramdisk-based one
that was in the now-terminated SXCE, so you should make a failsafe boot
option yourself by cloning a working BE and leaving that clone alone.
and... make the failsafe clone new enough to understand your pool version
or else it's not very useful. :)
Re: [zfs-discuss] Suggestions about current ZFS setup
On 04/14/10 12:37, Christian Molson wrote:
> First I want to thank everyone for their input, it is greatly appreciated.
>
> To answer a few questions:
>
> Chassis I have:
> http://www.supermicro.com/products/chassis/4U/846/SC846E2-R900.cfm
> Motherboard: http://www.tyan.com/product_board_detail.aspx?pid=560
> RAM: 24 GB (12 x 2GB)
> 10 x 1TB Seagates 7200.11
> 10 x 1TB Hitachi
> 4 x 2TB WD WD20EARS (4K blocks)

If you have the spare change for it, I'd add one or two SSDs to the mix,
with space on them allocated to the root pool plus l2arc cache, and slog
for the data pool(s).

- Bill
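A minimal sketch of carving one SSD up that way, with hypothetical slice
names (a mirrored slog across two SSDs would be safer than a single one):

    # Small slice as a separate intent log (slog) for synchronous writes.
    zpool add tank log c3t0d0s0

    # Larger slice as an L2ARC cache device.
    zpool add tank cache c3t0d0s1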
Re: [zfs-discuss] Suggestions about current ZFS setup
Just a quick update,

Tested using bonnie++, just during its "Intelligent write" phase: my 5
vdevs of 4 x 1TB drives wrote around 300-350 MB/sec on that test. The one
vdev of 4 x 2TB drives wrote more inconsistently, between 200 and 300.

This is not a complete test; I was just looking at iostat output while
bonnie++ ran. I will do a complete test later on, but initially it seems
the new drives are not horrible in a pool of 4-disk raidz vdevs.
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On Wed, Apr 14, 2010 at 5:41 AM, Dmitry wrote:
> Which build is the most stable, mainly for NAS?
> I plan a NAS with zfs + CIFS, iSCSI

I'm using b133. My current box was installed with 118, upgraded to 128a,
then 133. I'm avoiding b134 due to changes in the CIFS service that affect
ACLs. http://bugs.opensolaris.org/view_bug.do?bug_id=6706181

For any new installation, I would suggest b134, or wait for the 10.spring
release, which should be based on b134 or b135.

-B

-- Brandon High : bh...@freaks.com
Re: [zfs-discuss] Suggestions about current ZFS setup
First I want to thank everyone for their input; it is greatly appreciated.

To answer a few questions:

Chassis I have:
http://www.supermicro.com/products/chassis/4U/846/SC846E2-R900.cfm
Motherboard: http://www.tyan.com/product_board_detail.aspx?pid=560
RAM: 24 GB (12 x 2GB)
10 x 1TB Seagates 7200.11
10 x 1TB Hitachi
4 x 2TB WD WD20EARS (4K blocks)

I used to have (selling it now) a 3Ware 9690SA controller, and had it set
up rather poorly. I added drives to the RAID6 array, and then created new
partitions on it which were concatenated via LVM. I ran EXT3 as the
filesystem as well. The 3Ware controller was OK, but the limitations of HW
RAID were what brought me to ZFS: mainly the fact that you cannot shrink
the size of a RAID6 HW array, since it has no knowledge of the FS.

Most of the VMs and data files I store here are not critical. I make
backups of the important stuff (family pictures, work, etc). I also back
up the data within the VMs, so if their disk files are ever lost it is not
too much of a problem.

From what you guys have said, adding slow drives to the pool will cause
them to be a bottleneck in the pool. I am just finishing up some copying,
and will benchmark a test pool with the 2TB drives. Even if they are fast
enough, I think it would be better to create a separate pool for them, and
store only data which can be lost.

Also, do you have any suggestions for this: I have a desktop running
Windows 7 off an SSD. Would setting up NFS or iSCSI (even possible?) to
install other apps on be worth it? I am guessing it would be easier to
just pop a regular drive in it instead (see the sketch below).

Thanks again!
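iSCSI to the Windows 7 box is possible with COMSTAR; a minimal hedged
sketch, with hypothetical pool/volume names, assuming the COMSTAR packages
and the stmf/iscsi target services are already installed and enabled:

    # Create a zvol to back the Windows LUN.
    zfs create -V 100G tank/win7-apps

    # Register it as a SCSI logical unit and expose it (a careful setup
    # would restrict access with host and target groups).
    sbdadm create-lu /dev/zvol/rdsk/tank/win7-apps
    stmfadm add-view <GUID-printed-by-sbdadm>

    # Create an iSCSI target; the Windows iSCSI initiator can then connect
    # and the new disk is formatted NTFS from Disk Management.
    itadm create-target

Whether that beats just dropping a local SATA drive into the desktop mostly
comes down to network speed and how much you value having the data on the
ZFS box.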
Re: [zfs-discuss] [shell-discuss] getconf test for case insensitive ZFS?
Olga Kryzhanovska wrote:
> There is no way in the SUS standard to determine if a file system is
> case insensitive, i.e. with pathconf?

SUS requires a case sensitive filesystem. There is no need to request this
from a POSIX view.

Jörg

-- Jörg Schilling, EMail: jo...@schily.isdn.cs.tu-berlin.de (home)
Blog: http://schily.blogspot.com/
Re: [zfs-discuss] Suggestions about current ZFS setup
On Wed, April 14, 2010 12:29, Bob Friesenhahn wrote:
> On Wed, 14 Apr 2010, David Dyer-Bennet wrote:
>>>> Not necessarily for a home server. While mine so far is all mirrored
>>>> pairs of 400GB disks, I don't even think about "performance" issues,
>>>> I never come anywhere near the limits of the hardware.
>>>
>>> I don't see how the location of the server has any bearing on required
>>> performance. If these 2TB drives are the new 4K sector variety, even
>>> you might notice.
>>
>> The location does not, directly, of course; but the amount and type of
>> work being supported does, and most home servers see request streams
>> very different from commercial servers.
>
> If it was not clear, the performance concern is primarily for writes
> since zfs will load-share the writes across the available vdevs using an
> algorithm which also considers the write queue/backlog for each vdev. If
> a vdev is slow, then it may be filled more slowly than the other vdevs.
> This is also the reason why zfs encourages that all vdevs use the same
> organization.

As I said, I don't think of performance issues on mine. So I wasn't
thinking of that particular detail, and it's good to call it out
explicitly. If the performance of the new drives isn't adequate, then the
performance of the entire pool will become inadequate, it looks like.

I expect it's routine to have disks of different generations in the same
pool at this point (and if it isn't now, it will be in 5 years), just due
to what's available, replacing bad drives, and so forth.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On Wed, April 14, 2010 11:51, Tonmaus wrote:
>> On Wed, April 14, 2010 08:52, Tonmaus wrote:
>>> safe to say: 2009.06 (b111) is unusable for the purpose, and CIFS is
>>> dead in this build.
>>
>> That's strange; I run it every day (my home Windows "My Documents"
>> folder and all my photos are on 2009.06).
>>
>> -bash-3.2$ cat /etc/release
>>                  OpenSolaris 2009.06 snv_111b X86
>>      Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
>>                   Use is subject to license terms.
>>                        Assembled 07 May 2009
>
> I would be really interested how you got past this
> http://defect.opensolaris.org/bz/show_bug.cgi?id=11371
> which I was so badly bitten by that I considered giving up on
> OpenSolaris.

I don't get random hangs in normal use; so I haven't done anything to "get
past" this.

I DO get hangs when funny stuff goes on, which may well be related to that
problem (at least they require a reboot). Hmmm; I get hangs sometimes when
trying to send a full replication stream to an external backup drive, and
I have to reboot to recover from them.

I can live with this, in the short term. But now I'm feeling hopeful that
they're fixed in what I'm likely to be upgrading to next.

>> not sure if this is best choice. I'd like to hear from others as well.
>> Well, it's technically not a stable build.
>>
>> I'm holding off to see what 2010.$Spring ends up being; I'll convert to
>> that unless it turns into a disaster.
>>
>> Is it possible to switch to b132 now, for example? I don't think the
>> old builds are available after the next one comes out; I haven't been
>> able to find them.
>
> There are methods to upgrade to any dev build by pkg. Can't tell you
> from the top of my head, but I have done it with success.
>
> I wouldn't know why to go to 132 instead of 133, though. 129 seems to be
> an option.

Because 132 was the most current last time I paid much attention :-). As I
say, I'm currently holding out for 2010.$Spring, but knowing how to get to
a particular build via package would be potentially interesting for the
future still. Having been told it's possible helps, makes it worth looking
harder.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] getconf test for case insensitive ZFS?
Roy, I was looking for a C API which works for all types of file systems,
including ZFS, CIFS, PCFS and others.

Olga

On Wed, Apr 14, 2010 at 7:46 PM, Roy Sigurd Karlsbakk wrote:
> root@urd:~# zfs get casesensitivity dpool/test
> NAME        PROPERTY         VALUE      SOURCE
> dpool/test  casesensitivity  sensitive  -
>
> this seems to be settable only by create, not later. See man zfs for
> more info
>
> Best regards
>
> roy
Re: [zfs-discuss] getconf test for case insensitive ZFS?
root@urd:~# zfs get casesensitivity dpool/test
NAME        PROPERTY         VALUE      SOURCE
dpool/test  casesensitivity  sensitive  -

this seems to be settable only at create time, not later. See man zfs for
more info.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.

- "ольга крыжановская" wrote:
> Can I use getconf to test if a ZFS file system is mounted in case
> insensitive mode?
>
> Olga
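For reference, the property has to be chosen when the dataset is created; a
minimal sketch with a hypothetical dataset name:

    # Case-insensitive but case-preserving names, which is what the in-kernel
    # CIFS server expects for Windows clients.
    zfs create -o casesensitivity=mixed dpool/winshare

    # Verify; this cannot be changed after creation.
    zfs get casesensitivity dpool/winshare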
Re: [zfs-discuss] [shell-discuss] getconf test for case insensitive ZFS?
There is no way in the SUS standard to determine if a file system is case
insensitive, e.g. with pathconf?

Olga

On Wed, Apr 14, 2010 at 7:48 PM, Glenn Fowler wrote:
> On Wed, 14 Apr 2010 17:54:02 +0200, Olga Kryzhanovska wrote:
>> Can I use getconf to test if a ZFS file system is mounted in case
>> insensitive mode?
>
> we would have to put in the zfs query (hopefully more generic than just
> for zfs)
> the only current working case-insensitive checks are for uwin
Re: [zfs-discuss] Suggestions about current ZFS setup
On Wed, 14 Apr 2010, David Dyer-Bennet wrote:
>>>> It should be "safe" but chances are that your new 2TB disks are
>>>> considerably slower than the 1TB disks you already have. This should
>>>> be as much cause for concern (or more so) than the difference in
>>>> raidz topology.
>>>
>>> Not necessarily for a home server. While mine so far is all mirrored
>>> pairs of 400GB disks, I don't even think about "performance" issues, I
>>> never come anywhere near the limits of the hardware.
>>
>> I don't see how the location of the server has any bearing on required
>> performance. If these 2TB drives are the new 4K sector variety, even
>> you might notice.
>
> The location does not, directly, of course; but the amount and type of
> work being supported does, and most home servers see request streams
> very different from commercial servers.

If it was not clear, the performance concern is primarily for writes since
zfs will load-share the writes across the available vdevs using an
algorithm which also considers the write queue/backlog for each vdev. If a
vdev is slow, then it may be filled more slowly than the other vdevs. This
is also the reason why zfs encourages that all vdevs use the same
organization.

Bob
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
Jonathan,

For a different diagnostic perspective, you might use the fmdump -eV
command to identify what FMA indicates for this device. This level of
diagnostics is below the ZFS level and definitely more detailed, so you
can see when these errors began and for how long.

Cindy

On 04/14/10 11:08, Jonathan wrote:
> Yeah,
>
> $ smartctl -d sat,12 -i /dev/rdsk/c5t0d0
> smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
> Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
>
> Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)
>
> I'm thinking between 111 and 132 (mentioned in post) something changed.
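A hedged example of that kind of digging (the error classes shown are just
typical ones to look for, not output from this system):

    # One summary line per error report, with class and timestamp.
    fmdump -e

    # Full verbose telemetry; look for ereports naming the suspect device,
    # e.g. classes like ereport.io.scsi.* or ereport.fs.zfs.io.
    fmdump -eV | less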
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
> Do worry about media errors. Though this is the most common HDD error,
> it is also the cause of data loss. Fortunately, ZFS detected this and
> repaired it for you.

Right. I assume you do recommend swapping the faulted drive out though?

> Other file systems may not be so gracious.
> -- richard

As we are all too aware, I'm sure :)
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
On Apr 14, 2010, at 5:13 AM, fred pam wrote:
> I have a similar problem that differs in a subtle way. I moved a zpool
> (single disk) from one system to another. Due to my inexperience I did
> not import the zpool but (doh!) 'zpool create'-ed it (I may also have
> used a -f somewhere in there...)

You have destroyed the previous pool. There is a reason the "-f" flag is
required, though it is human nature to ignore such reasons.

> Interestingly the script still gives me the old uberblocks but in this
> case the first couple (lowest TXGs) are actually younger (later
> timestamp) than the higher TXG ones. Obviously removing the highest TXGs
> will actually remove the uberblocks I want to keep.

This is because creation of the new pool did not zero out the uberblocks.

> Is there a way to copy an uberblock over another one? Or could I perhaps
> remove the low-TXG uberblocks instead of the high-TXG ones (and would
> that mean the old pool becomes available again)? Or are more things
> missing than just the uberblocks, and should I move to a file-based
> approach (on ZFS?)

I do not believe you can recover the data on the previous pool without
considerable time and effort.

-- richard
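For completeness, the usual first step for a pool that was destroyed with
'zpool destroy' is shown below. In this case the labels were overwritten by
the new 'zpool create', so it is unlikely to find anything, but it costs
nothing to check:

    # List pools marked destroyed whose labels are still intact.
    zpool import -D

    # If the old pool were listed, it could be re-imported by name or GUID.
    zpool import -D -f <poolname-or-guid>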
Re: [zfs-discuss] Checksum errors on and after resilver
[this seems to be the question of the day, today...]

On Apr 14, 2010, at 2:57 AM, bonso wrote:
> Hi all,
> I recently experienced a disk failure on my home server and observed
> checksum errors while resilvering the pool and on the first scrub after
> the resilver had completed. Now everything seems fine, but I'm posting
> this to get help with calming my nerves and detecting any possible
> future faults.
>
> Let's start with some specs:
> OSOL 2009.06
> Intel SASUC8i (w/ LSI 1.30IT FW)
> Gigabyte MA770-UD3 mobo w/ 8GB ECC RAM
> Hitachi P7K500 harddrives
>
> When checking the condition of my pool some days ago (yes, I should make
> it mail me if something like this happens again) one disk in my pool was
> labeled as "Removed" with a small number of read errors, nineish I
> think; all other disks were fine. I removed and tested the drive (DFT
> crashed, so the disk seemed very broken), replaced it and started a
> resilver.
>
> Checking the status of the resilver, everything looked good from the
> start, but when it was finished the status report looked like this:
>
>   pool: sasuc8i
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the
>         errors using 'zpool clear' or replace the device with 'zpool
>         replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: resilver completed after 4h9m with 0 errors on Mon Apr 12
>         18:12:26 2010
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         sasuc8i      ONLINE       0     0     0
>           raidz2     ONLINE       0     0     0
>             c12t4d0  ONLINE       0     0     5  108K resilvered
>             c12t8d0  ONLINE       0     0     0  254G resilvered
>             c12t6d0  ONLINE       0     0     0
>             c12t7d0  ONLINE       0     0     0
>             c12t0d0  ONLINE       0     0     1  21.5K resilvered
>             c12t1d0  ONLINE       0     0     2  43K resilvered
>             c12t2d0  ONLINE       0     0     4  86K resilvered
>             c12t3d0  ONLINE       0     0     1  21.5K resilvered
>
> errors: No known data errors
>
> All I really cared about at this point was the "Applications are
> unaffected" and "No known data errors", and I thought that the checksum
> errors might be down to the failing drive going out during a write
> (c12t5d0 failed; the controller labeled the new drive as c12t8d0). Then
> again, ZFS is atomic; better clear the errors and run a scrub. It came
> out like this:
>
>   pool: sasuc8i
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the
>         errors using 'zpool clear' or replace the device with 'zpool
>         replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: scrub completed after 1h16m with 0 errors on Tue Apr 13
>         01:29:32 2010
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         sasuc8i      ONLINE       0     0     0
>           raidz2     ONLINE       0     0     0
>             c12t4d0  ONLINE       0     0     5
>             c12t8d0  ONLINE       0     0     0
>             c12t6d0  ONLINE       0     0     0
>             c12t7d0  ONLINE       0     0     4  86K repaired
>             c12t0d0  ONLINE       0     0     1
>             c12t1d0  ONLINE       0     0     6  86K repaired
>             c12t2d0  ONLINE       0     0     4
>             c12t3d0  ONLINE       0     0     6  108K repaired
>
> errors: No known data errors
>
> Now I'm getting nervous. Checksum errors, some repaired, others not. Am
> I going to end up with multiple drive failures, or what the * is going
> on here?

When I see many disks suddenly reporting errors, I suspect a common
element: HBA, cables, backplane, mobo, CPU, power supply, etc. If you
search the zfs-discuss archives you can find instances where HBA firmware,
driver issues, or firmware+driver interactions caused such reports. Cabling
and power supplies are less commonly reported.

> Ran one more scrub and everything came up roses.
> Checked smart status on the drives with checksum errors and they are
> fine, although I expect only read/write errors would show up there.
>
> I'm not sure how to get this into a proper question, but what I'm after
> is "is this normal to be expected after a resilver, and can I start
> breathing again?". Checksum errors are, as far as I can gather, dodgy
> data on disk, and read/write errors are somewhere in the physical link
> (more or less).

Breathing is good. Then check your firmware releases.

-- richard
Re: [zfs-discuss] ZIL errors but device seems OK
comment below...

On Apr 14, 2010, at 1:49 AM, Richard Skelton wrote:
> Hi,
> I have installed OpenSolaris snv_134 from the iso at genunix.org
> (Mon Mar 8 2010, New OpenSolaris preview, based on build 134).
> I created a zpool:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>           c7t4d0    ONLINE       0     0     0
>           c7t5d0    ONLINE       0     0     0
>           c7t6d0    ONLINE       0     0     0
>           c7t8d0    ONLINE       0     0     0
>           c7t9d0    ONLINE       0     0     0
>         logs
>           c5d1p1    ONLINE       0     0     0
>         cache
>           c5d1p2    ONLINE       0     0     0
>
> The log device and cache are each one half of a 128GB OCZ VERTEX-TURBO
> flash card.
>
> I am getting good NFS performance but have seen this error:
>
> root@brszfs02:~# zpool status tank
>   pool: tank
>  state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
>         Sufficient replicas exist for the pool to continue functioning
>         in a degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the
>         device repaired.
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        DEGRADED     0     0     0
>           c7t4d0    ONLINE       0     0     0
>           c7t5d0    ONLINE       0     0     0
>           c7t6d0    ONLINE       0     0     0
>           c7t8d0    ONLINE       0     0     0
>           c7t9d0    ONLINE       0     0     0
>         logs
>           c5d1p1    FAULTED      0     4     0  too many errors
>         cache
>           c5d1p2    ONLINE       0     0     0
>
> errors: No known data errors
>
> root@brszfs02:~# fmadm faulty
> TIME            EVENT-ID                              MSG-ID     SEVERITY
> Mar 25 13:14:34 6c0bd163-56bf-ee92-e393-ce2063355b52  ZFS-8000-FD Major
>
> Host        : brszfs02
> Platform    : HP-Compaq-dc7700-Convertible-Minitower
> Chassis_id  : CZC7264JN4
> Product_sn  :
>
> Fault class : fault.fs.zfs.vdev.io
> Affects     : zfs://pool=tank/vdev=4ec464b5bf74a898
>               faulted but still in service
> Problem in  : zfs://pool=tank/vdev=4ec464b5bf74a898
>               faulted but still in service
>
> Description : The number of I/O errors associated with a ZFS device
>               exceeded acceptable levels.  Refer to
>               http://sun.com/msg/ZFS-8000-FD for more information.
>
> Response    : The device has been offlined and marked as faulted. An
>               attempt will be made to activate a hot spare if available.
>
> Impact      : Fault tolerance of the pool may be compromised.
>
> Action      : Run 'zpool status -x' and replace the bad device.
>
> root@brszfs02:~# iostat -En c5d1
> c5d1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
> Model: OCZ VERTEX-TURB Revision: Serial No: 062F97G71C5T676
> Size: 128.04GB <128035160064 bytes>
> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 0
>
> As there seem to be no hardware errors reported by iostat, I ran
> 'zpool clear tank' and a scrub on Monday. Up to now I have seen no new
> errors, and I have set up a cron job to scrub at 01:30 each day.
>
> Is the flash card faulty or is this a ZFS problem?

In my testing of Flash-based SSDs, this is the most common error. Since the
drive is not reporting media errors or hard errors, the only interim
conclusion is that something in the data path caused data to be corrupted.
This can mean the drive doesn't report these errors, the errors are
transient, or an error occurred which is not related to the data (eg.
phantom writes). For example, my current bad-boy says:

$ iostat -En
...
c7t0d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: USB2.0 Product: VAULT DRIVE Revision: 1100 Serial No:
Size: 8.12GB <8120172544 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 103 Predictive Failure Analysis: 0
...

$ pfexec zpool status -v syspool
  pool: syspool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 0h1m with 325 errors on Wed Apr 14 11:06:58 2010
config:

        NAME        STATE     READ WRITE CKSUM
        syspool     ONLINE       0     0   330
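For reference, the nightly scrub mentioned above can be scheduled from
root's crontab; a minimal sketch using the pool name from the post:

    # crontab entry for root: scrub the pool at 01:30 every day.
    30 1 * * * /usr/sbin/zpool scrub tank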
Re: [zfs-discuss] Suggestions about current ZFS setup
On Wed, April 14, 2010 12:06, Bob Friesenhahn wrote:
> On Wed, 14 Apr 2010, David Dyer-Bennet wrote:
>>> It should be "safe" but chances are that your new 2TB disks are
>>> considerably slower than the 1TB disks you already have. This should
>>> be as much cause for concern (or more so) than the difference in raidz
>>> topology.
>>
>> Not necessarily for a home server. While mine so far is all mirrored
>> pairs of 400GB disks, I don't even think about "performance" issues, I
>> never come anywhere near the limits of the hardware.
>
> I don't see how the location of the server has any bearing on required
> performance. If these 2TB drives are the new 4K sector variety, even you
> might notice.

The location does not, directly, of course; but the amount and type of
work being supported does, and most home servers see request streams very
different from commercial servers.

The last server software I worked on was able to support 80,000
simultaneous HD video streams. Coming off Thumpers, in fact (well, coming
out of a truly obscene amount of DRAM buffer on the streaming board, which
was in turn loaded from Thumpers); this was the thing that Thumper was
originally designed for, known when I worked there as the Sun Streaming
System, I believe. You don't see loads like that on home servers :-).

And a big database server would have an equally extreme but totally
different access pattern.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
Yeah,

$ smartctl -d sat,12 -i /dev/rdsk/c5t0d0
smartctl 5.39.1 2010-01-28 r3054 [i386-pc-solaris2.11] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Smartctl: Device Read Identity Failed (not an ATA/ATAPI device)

I'm thinking between 111 and 132 (mentioned in post) something changed.
Re: [zfs-discuss] Suggestions about current ZFS setup
On Wed, 14 Apr 2010, David Dyer-Bennet wrote:
>> It should be "safe" but chances are that your new 2TB disks are
>> considerably slower than the 1TB disks you already have. This should be
>> as much cause for concern (or more so) than the difference in raidz
>> topology.
>
> Not necessarily for a home server. While mine so far is all mirrored
> pairs of 400GB disks, I don't even think about "performance" issues, I
> never come anywhere near the limits of the hardware.

I don't see how the location of the server has any bearing on required
performance. If these 2TB drives are the new 4K sector variety, even you
might notice.

Bob
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
On Apr 14, 2010, at 9:56 AM, Jonathan wrote:
> I just ran 'iostat -En'. This is what was reported for the drive in
> question (all other drives showed 0 errors across the board).
>
> All drives indicated the "illegal request... predictive failure
> analysis" line:
>
> c7t1d0 Soft Errors: 0 Hard Errors: 36 Transport Errors: 0
> Vendor: ATA Product: SAMSUNG HD203WI Revision: 0002 Serial No:
> Size: 2000.40GB <2000398934016 bytes>
> Media Error: 36 Device Not Ready: 0 No Device: 0 Recoverable: 0
> Illegal Request: 126 Predictive Failure Analysis: 0

Don't worry about illegal requests, they are not permanent. Do worry about
media errors. Though this is the most common HDD error, it is also the
cause of data loss. Fortunately, ZFS detected this and repaired it for
you. Other file systems may not be so gracious.

-- richard
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
> I'm on snv 111b. I attempted to get smartmontools working, but it
> doesn't seem to want to work as these are all sata drives.

Have you tried using '-d sat,12' when using smartmontools?

opensolaris.org/jive/thread.jspa?messageID=473727
Re: [zfs-discuss] ZFS Perfomance
On Apr 14, 2010, at 8:57 AM, Yariv Graf wrote:
> From my experience dealing with > 4TB you stop writing after 80% of
> zpool utilization

YMMV. I have routinely completely filled zpools. There have been some
improvements in performance of allocations when free space gets low in the
past 6-9 months, so later releases are more efficient.
-- richard
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
I just ran 'iostat -En'. This is what was reported for the drive in
question (all other drives showed 0 errors across the board).

All drives indicated the "illegal request... predictive failure analysis"
line:

c7t1d0 Soft Errors: 0 Hard Errors: 36 Transport Errors: 0
Vendor: ATA Product: SAMSUNG HD203WI Revision: 0002 Serial No:
Size: 2000.40GB <2000398934016 bytes>
Media Error: 36 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 126 Predictive Failure Analysis: 0
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
>> On Wed, April 14, 2010 08:52, Tonmaus wrote:
>>> safe to say: 2009.06 (b111) is unusable for the purpose, and CIFS is
>>> dead in this build.
>>
>> That's strange; I run it every day (my home Windows "My Documents"
>> folder and all my photos are on 2009.06).
>>
>> -bash-3.2$ cat /etc/release
>>                  OpenSolaris 2009.06 snv_111b X86
>>      Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
>>                   Use is subject to license terms.
>>                        Assembled 07 May 2009

I would be really interested how you got past this
http://defect.opensolaris.org/bz/show_bug.cgi?id=11371
which I was so badly bitten by that I considered giving up on OpenSolaris.

>> not sure if this is best choice. I'd like to hear from others as well.
>> Well, it's technically not a stable build.
>>
>> I'm holding off to see what 2010.$Spring ends up being; I'll convert to
>> that unless it turns into a disaster.
>>
>> Is it possible to switch to b132 now, for example? I don't think the
>> old builds are available after the next one comes out; I haven't been
>> able to find them.

There are methods to upgrade to any dev build by pkg. Can't tell you from
the top of my head, but I have done it with success.

I wouldn't know why to go to 132 instead of 133, though. 129 seems to be
an option.

Regards,
Tonmaus
Re: [zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
On Apr 14, 2010, at 12:05 AM, Jonathan wrote:
> I just started replacing drives in this zpool (to increase storage). I
> pulled the first drive, and replaced it with a new drive and all was
> well. It resilvered with 0 errors. This was 5 days ago. Just today I was
> looking around and noticed that my pool was degraded (I see now that
> this occurred last night). Sure enough there are 12 read errors on the
> new drive.
>
> I'm on snv 111b. I attempted to get smartmontools working, but it
> doesn't seem to want to work as these are all sata drives. fmdump
> indicates that the read errors occurred within about 10 minutes of one
> another.

Use "iostat -En" to see the nature of the I/O errors.

> Is it safe to say this drive is bad, or is there anything else I can do
> about this?

It is safe to say that there was trouble reading from the drive at some
time in the past. But you have not determined the root cause; the info
available in zpool status is not sufficient.
-- richard

> Thanks,
> Jon
>
> $ zpool status MyStorage
>   pool: MyStorage
>  state: DEGRADED
> status: One or more devices are faulted in response to persistent errors.
>         Sufficient replicas exist for the pool to continue functioning
>         in a degraded state.
> action: Replace the faulted device, or use 'zpool clear' to mark the
>         device repaired.
>  scrub: scrub completed after 8h7m with 0 errors on Sun Apr 11 13:07:40
>         2010
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         MyStorage   DEGRADED     0     0     0
>           raidz1    DEGRADED     0     0     0
>             c5t0d0  ONLINE       0     0     0
>             c5t1d0  ONLINE       0     0     0
>             c6t1d0  ONLINE       0     0     0
>             c7t1d0  FAULTED     12     0     0  too many errors
>
> errors: No known data errors
>
> $ fmdump
> TIME                 UUID                                 SUNW-MSG-ID
> Apr 09 16:08:04.4660 1f07d23f-a4ba-cbbb-8713-d003d9771079 ZFS-8000-D3
> Apr 13 22:29:02.8063 e26c7e32-e5dd-cd9c-cd26-d5715049aad8 ZFS-8000-FD
>
> That first log is the original drive being replaced. The second is the
> read errors on the new drive.
Re: [zfs-discuss] ZFS Perfomance
From my experience dealing with > 4TB, you stop writing after 80% of zpool
utilization.

10

On Apr 14, 2010, at 6:53 PM, "eXeC001er" wrote:
> 20% - that is a lot of space to give up on large volumes, right?
>
> 2010/4/14 Yariv Graf
>> Hi
>> Keep below 80%
>>
>> On Apr 14, 2010, at 6:49 PM, "eXeC001er" wrote:
>>> Hi All.
>>>
>>> How much disk space do I need to reserve to preserve ZFS performance?
>>>
>>> Any official doc?
>>>
>>> Thanks.
Re: [zfs-discuss] brtfs on Solaris? (Re: [osol-discuss] [indiana-discuss] So when are we gonna fork this sucker?)
I would like to see btrfs under a BSD license so that FreeBSD/OpenBSD/NetBSD can adopt it, too. Olga 2010/4/14 : > >>btrfs could be supported on OpenSolaris, too. IMO it could even >>complement ZFS and spawn some concurrent development between both. ZFS >>is too high end and works very poorly with less than 2GB of RAM, while btrfs >>reportedly works well with 128MB on ARM. > > Both have license issues; Oracle can now re-license either, I believe, > unless btrfs has escaped. > > Casper > > -- , __ , { \/`o;-Olga Kryzhanovska -;o`\/ } .'-/`-/ olga.kryzhanov...@gmail.com \-`\-'. `'-..-| / Solaris/BSD//C/C++ programmer \ |-..-'` /\/\ /\/\ `--` `--` ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS Perfomance
20% - that is a lot of space to give up on large volumes, right? 2010/4/14 Yariv Graf > Hi > Keep below 80% > > On Apr 14, 2010, at 6:49 PM, "eXeC001er" wrote: > > Hi all. > > How much disk space do I need to reserve to preserve ZFS performance? > > Is there any official doc? > > Thanks. > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] getconf test for case insensitive ZFS?
Can I use getconf to test if a ZFS file system is mounted in case insensitive mode? Olga -- , __ , { \/`o;-Olga Kryzhanovska -;o`\/ } .'-/`-/ olga.kryzhanov...@gmail.com \-`\-'. `'-..-| / Solaris/BSD//C/C++ programmer \ |-..-'` /\/\ /\/\ `--` `--` ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
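I'm not aware of a getconf/pathconf variable that exposes this, so the closest thing I know of is to ask ZFS directly for the dataset backing the path; a rough sketch, path and dataset name assumed:
df -h /export/foo
zfs get -H -o value casesensitivity pool003/arch
The property is per-dataset and fixed at creation time, so wrapping 'zfs get' in a small script is probably the most practical test.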
Re: [zfs-discuss] ZFS Perfomance
Hi, keep below 80%. On Apr 14, 2010, at 6:49 PM, "eXeC001er" wrote: Hi all. How much disk space do I need to reserve to preserve ZFS performance? Is there any official doc? Thanks. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] brtfs on Solaris? (Re: [osol-discuss] [indiana-discuss] So when are we gonna fork this sucker?)
>btrfs could be supported on OpenSolaris, too. IMO it could even >complement ZFS and spawn some concurrent development between both. ZFS >is too high end and works very poorly with less than 2GB of RAM, while btrfs >reportedly works well with 128MB on ARM. Both have license issues; Oracle can now re-license either, I believe, unless btrfs has escaped. Casper ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] casesensitivity mixed and CIFS
On 14/04/2010 16:04, John wrote: Hello, we set our ZFS filesystems to casesensitivity=mixed when we created them. However, CIFS access to these files is still case sensitive. Here is the configuration: # zfs get casesensitivity pool003/arch NAME PROPERTY VALUE SOURCE pool003/arch casesensitivity mixed - # At the pool level it's set as follows: # zfs get casesensitivity pool003 NAME PROPERTY VALUE SOURCE pool003 casesensitivity sensitive - # From a Windows client, accessing \\filer\arch\MYFOLDER\myfile.txt fails, while accessing \\filer\arch\myfolder\myfile.txt works. Any ideas? We are running snv_130. You are not using the Samba daemon, are you? -- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
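A quick way to check which SMB server is actually answering (service names assumed, they can differ slightly between builds):
svcs -a | egrep 'smb|samba'
zfs get sharesmb pool003/arch
If the in-kernel CIFS service (network/smb/server) is online and sharesmb is set, the dataset property should apply; if a Samba daemon is in the path instead, its own smb.conf case handling overrides whatever casesensitivity says.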
[zfs-discuss] ZFS Perfomance
Hi all. How much disk space do I need to reserve to preserve ZFS performance? Is there any official doc? Thanks. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
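As far as I know there is no built-in knob for this, so if you want to enforce a cushion rather than just watch it, one (hypothetical) trick is to park a reservation on an unmounted dataset sized at roughly 20% of the pool; names and sizes below are made up:
zpool list tank
pfexec zfs create -o mountpoint=none -o reservation=800G tank/slack
pfexec zfs destroy tank/slack
The reservation keeps the rest of the pool from ever eating into that last 20% by accident, and destroying the dataset gives the space back if you really need it.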
Re: [zfs-discuss] Suggestions about current ZFS setup
On Tue, April 13, 2010 10:38, Bob Friesenhahn wrote: > On Tue, 13 Apr 2010, Christian Molson wrote: >> >> Now I would like to add my 4 x 2TB drives, I get a warning message >> saying that: "Pool uses 5-way raidz and new vdev uses 4-way raidz" >> Do you think it would be safe to use the -f switch here? > > It should be "safe" but chances are that your new 2TB disks are > considerably slower than the 1TB disks you already have. This should > be as much cause for concern (or more so) than the difference in raidz > topology. Not necessarily for a home server. While mine so far is all mirrored pairs of 400GB disks, I don't even think about "performance" issues, I never come anywhere near the limits of the hardware. Your suggestion (snipped) that he test performance on the new drives to see how they differ is certainly good if he needs to worry about performance. Testing actual performance in your own exact hardware is always smart. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
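If you do want a rough feel for how the new 2TB drives compare before committing them, a crude sequential test is enough to spot a big gap; paths made up, and note the numbers are only meaningful with compression off and files big enough to defeat caching:
dd if=/dev/zero of=/tank/ddtest bs=1024k count=8192
dd if=/tank/ddtest of=/dev/null bs=1024k
rm /tank/ddtest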
Re: [zfs-discuss] casesensitivity mixed and CIFS
No, the filesystem was created with b103 or earlier. Just to add more details, the issue only occurs on the first direct access to the file. From a Windows client that has never accessed the file, you can issue: dir \\filer\arch\myfolder\myfile.TXT and you will get "file not found" if the file is named myfile.txt on the filesystem. If you browse to the folder using Windows Explorer, everything works. Subsequent direct access to the file (using 'dir') works as well, as the Windows client may be caching the folder information. > was b130 also the version that created the data set? > > -Tonmaus -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Suggestions about current ZFS setup
On Tue, April 13, 2010 09:48, Christian Molson wrote: > > Now I would like to add my 4 x 2TB drives, I get a warning message saying > that: "Pool uses 5-way raidz and new vdev uses 4-way raidz" Do you think > it would be safe to use the -f switch here? Yes. 4-way on the bigger drive is *more* redundancy (25%, rather than 20%) (though not necessarily "safer", since the bigger drive increases recovery time) than 5-way on the smaller drive. I'd describe these as "vaguely" the same level of redundancy, and hence not especially inappropriate to put in the same pool. Putting a single disk into a pool that's otherwise RAIDZ would be a bad idea, obviously, and that's what that message is particularly to warn you about I believe. However, I have some doubts about using 2TB drives with single redundancy in general. It takes a LONG time to resilver a drive that big, and during the resilver you have no redundancy and are hence subject to data loss if one of the remaining drives also fails. And resilvering puts extra stress on the IO system and drives, so probably the risk of failure is increased. (If your backups are good enough, you may plan to cover the possibility of that second failure by restoring from backups. That works, if they're really good enough; it just takes more work and time.) 24 hot-swap bays in your home chassis? Now that does sound pretty extreme. I felt like my 8-bay chassis is a bit excessive for home; and it only has 6 bays populated with data-disks, and they're just 400GB. And I store a lot of RAW files from DSLRs on it it, I feel like I use quite a bit of space (until I see somebody come along casually talking about vaguely 10 times more space). How DO you deal with backup at that data size? I can back up to a single external USB disk (I have 3 I rotate), and a full backup completes overnight. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
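On the backup question: at that kind of size what I'd probably try is incremental zfs send to a pool on the external disk rather than a file-level copy, so only the deltas travel after the first pass; a sketch, with pool and snapshot names invented:
zfs snapshot -r tank@backup-20100414
zfs send -R tank@backup-20100414 | zfs receive -Fd usbpool
# subsequent nights: send only the changes since the previous snapshot
zfs snapshot -r tank@backup-20100415
zfs send -R -i tank@backup-20100414 tank@backup-20100415 | zfs receive -Fd usbpool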
[zfs-discuss] brtfs on Solaris? (Re: [osol-discuss] [indiana-discuss] So when are we gonna fork this sucker?)
btrfs could be supported on OpenSolaris, too. IMO it could even complement ZFS and spawn some concurrent development between both. ZFS is too high end and works very poorly with less than 2GB of RAM, while btrfs reportedly works well with 128MB on ARM. Olga On Wed, Apr 14, 2010 at 5:31 PM, wrote: > > >>Just a completely different question...are there any plans for btrfs? >>Will ZFS and btrfs co-exist, or is there a chance that the less used one >>would be dropped? > > Which OS supports both? > > Linux supports btrfs, Solaris supports ZFS. > > Casper > > ___ > opensolaris-discuss mailing list > opensolaris-disc...@opensolaris.org > -- , __ , { \/`o;-Olga Kryzhanovska -;o`\/ } .'-/`-/ olga.kryzhanov...@gmail.com \-`\-'. `'-..-| / Solaris/BSD//C/C++ programmer \ |-..-'` /\/\ /\/\ `--` `--` ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] casesensitivity mixed and CIFS
was b130 also the version that created the data set? -Tonmaus -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
On Wed, April 14, 2010 08:52, Tonmaus wrote: > safe to say: 2009.06 (b111) is unusable for the purpose, and CIFS is dead > in this build. That's strange; I run it every day (my home Windows "My Documents" folder and all my photos are on 2009.06). -bash-3.2$ cat /etc/release OpenSolaris 2009.06 snv_111b X86 Copyright 2009 Sun Microsystems, Inc. All Rights Reserved. Use is subject to license terms. Assembled 07 May 2009 > I am using B133, but I am not sure if this is the best choice. I'd like to > hear from others as well. Well, it's technically not a stable build. I'm holding off to see what 2010.$Spring ends up being; I'll convert to that unless it turns into a disaster. Is it possible to switch to b132 now, for example? I don't think the old builds are available after the next one comes out; I haven't been able to find them. -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] casesensitivity mixed and CIFS
Hello, we set our ZFS filesystems to casesensitivity=mixed when we created them. However, CIFS access to these files is still case sensitive. Here is the configuration: # zfs get casesensitivity pool003/arch NAME PROPERTY VALUE SOURCE pool003/arch casesensitivity mixed - # At the pool level it's set as follows: # zfs get casesensitivity pool003 NAME PROPERTY VALUE SOURCE pool003 casesensitivity sensitive - # From a Windows client, accessing \\filer\arch\MYFOLDER\myfile.txt fails, while accessing \\filer\arch\myfolder\myfile.txt works. Any ideas? We are running snv_130. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dedup causing problems with NFS?(was Re: snapshots taking too much space)
Hi, Maybe your zfs box used for dedup has a big load, therefore giving timeouts in nagios checks? I ask you this because I also suffer from that effect in a system with 2 Intel Xeon 3.0GHz ;) Bruno On 14-4-2010 15:48, Paul Archer wrote: > So I turned deduplication on on my staging FS (the one that gets > mounted on the database servers) yesterday, and since then I've been > seeing the mount hang for short periods of time off and on. (It lights > nagios up like a Christmas tree 'cause the disk checks hang and timeout.) > > I haven't turned dedup off again yet, because I'd like to figure out > how to get past this problem. > > Can anyone give me an idea of why the mounts might be hanging, or > where to look for clues? And has anyone had this problem with dedup > and NFS before? FWIW, the clients are a mix of Solaris and Linux. > > Paul > > > > > Yesterday, Paul Archer wrote: > >> Yesterday, Arne Jansen wrote: >> >>> Paul Archer wrote: Because it's easier to change what I'm doing than what my DBA does, I decided that I would put rsync back in place, but locally. So I changed things so that the backups go to a staging FS, and then are rsync'ed over to another FS that I take snapshots on. The only problem is that the snapshots are still in the 500GB range. So, I need to figure out why these snapshots are taking so much more room than they were before. This, BTW, is the rsync command I'm using (and essentially the same command I was using when I was rsync'ing from the NetApp): rsync -aPH --inplace --delete /staging/oracle_backup/ /backups/oracle_backup/ >>> >>> Try adding --no-whole-file to rsync. rsync disables block-by-block >>> comparison if used locally by default. >>> >> >> Thanks for the tip. I didn't realize rsync had that behavior. It >> looks like that got my snapshots back to the 50GB range. I'm going to >> try dedup on the staging FS as well, so I can do a side-by-side of >> which gives me the better space savings. >> >> Paul >> ___ >> zfs-discuss mailing list >> zfs-discuss@opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >> > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
Safe to say: 2009.06 (b111) is unusable for the purpose, and CIFS is dead in this build. I am using B133, but I am not sure if this is the best choice. I'd like to hear from others as well. -Tonmaus -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] dedup causing problems with NFS?(was Re: snapshots taking too much space)
So I turned deduplication on on my staging FS (the one that gets mounted on the database servers) yesterday, and since then I've been seeing the mount hang for short periods of time off and on. (It lights nagios up like a Christmas tree 'cause the disk checks hang and timeout.) I haven't turned dedup off again yet, because I'd like to figure out how to get past this problem. Can anyone give me an idea of why the mounts might be hanging, or where to look for clues? And has anyone had this problem with dedup and NFS before? FWIW, the clients are a mix of Solaris and Linux. Paul Yesterday, Paul Archer wrote: Yesterday, Arne Jansen wrote: Paul Archer wrote: Because it's easier to change what I'm doing than what my DBA does, I decided that I would put rsync back in place, but locally. So I changed things so that the backups go to a staging FS, and then are rsync'ed over to another FS that I take snapshots on. The only problem is that the snapshots are still in the 500GB range. So, I need to figure out why these snapshots are taking so much more room than they were before. This, BTW, is the rsync command I'm using (and essentially the same command I was using when I was rsync'ing from the NetApp): rsync -aPH --inplace --delete /staging/oracle_backup/ /backups/oracle_backup/ Try adding --no-whole-file to rsync. rsync disables block-by-block comparison if used locally by default. Thanks for the tip. I didn't realize rsync had that behavior. It looks like that got my snapshots back to the 50GB range. I'm going to try dedup on the staging FS as well, so I can do a side-by-side of which gives me the better space savings. Paul ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
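Before turning dedup back off it may be worth checking how large the dedup table actually is and whether it still fits in memory, since DDT lookups that miss the ARC and go to disk are the usual cause of multi-second stalls like this; a sketch, pool name assumed:
pfexec zdb -DD poolname
echo ::arc | pfexec mdb -k
zdb -DD prints the number of DDT entries and their in-core size; very roughly, each unique block costs a few hundred bytes of RAM once deduped, so a few hundred GB of unique data can already outgrow a small ARC.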
[zfs-discuss] Which build is the most stable, mainly for NAS (zfs)?
Which build is the most stable, mainly for NAS? I plan a NAS with ZFS + CIFS and iSCSI. Thanks -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS forensics/revert/restore shellscript and how-to.
I have a similar problem that differs in a subtle way. I moved a zpool (single disk) from one system to another. Due to my inexperience I did not import the zpool but (doh!) 'zpool create'-ed it (I may also have used a -f somewhere in there...) Interestingly the script still gives me the old uberblocks but in this case the first couple (lowest TXG's) are actually younger (later timestamp) than the higher TXG ones. Obviously removing the highest TXG's will actually remove the uberblocks I want to keep. Is there a way to copy an uberblock over another one? Or could I perhaps remove the low-TXG uberblocks instead of the high-TXG ones (and would that mean the old pool becomes available again). Or are more things missing than just the uberblocks and should I move to a file-based approach (on ZFS?) Regards, Fred -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
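Before trying anything destructive it may help to see what is actually left in the four labels on that device, since that tells you whether the old pool's GUID and uberblocks survived the re-create at all; a hedged sketch, device name invented and the -u flag depends on how recent your zdb is:
pfexec zdb -l /dev/rdsk/c1t0d0s0
pfexec zdb -lu /dev/rdsk/c1t0d0s0
If only the new pool's GUID shows up in all four labels, then the problem is bigger than which uberblock is active.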
[zfs-discuss] Checksum errors on and after resilver
Hi all, I recently experienced a disk failure on my home server and observed checksum errors while resilvering the pool and on the first scrub after the resilver had completed. Now everything seems fine, but I'm posting this to get help with calming my nerves and detecting any possible future faults. Let's start with some specs. OSOL 2009.06 Intel SASUC8i (w LSI 1.30IT FW) Gigabyte MA770-UD3 mobo w 8GB ECC RAM Hitachi P7K500 harddrives When checking the condition of my pool some days ago (yes, I should make it mail me if something like this happens again) one disk in my pool was labeled as "Removed" with a small number of read errors, nineish I think; all other disks were fine. I removed and tested the disk (DFT crashed, so it seemed very broken), replaced the drive and started a resilver. Checking the status of the resilver everything looked good from the start, but when it was finished the status report looked like this: pool: sasuc8i state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. see: http://www.sun.com/msg/ZFS-8000-9P scrub: resilver completed after 4h9m with 0 errors on Mon Apr 12 18:12:26 2010 config: NAME STATE READ WRITE CKSUM sasuc8i ONLINE 0 0 0 raidz2 ONLINE 0 0 0 c12t4d0 ONLINE 0 0 5 108K resilvered c12t8d0 ONLINE 0 0 0 254G resilvered c12t6d0 ONLINE 0 0 0 c12t7d0 ONLINE 0 0 0 c12t0d0 ONLINE 0 0 1 21.5K resilvered c12t1d0 ONLINE 0 0 2 43K resilvered c12t2d0 ONLINE 0 0 4 86K resilvered c12t3d0 ONLINE 0 0 1 21.5K resilvered errors: No known data errors All I really cared about at this point was the "Applications are unaffected" and "No known data errors", and I thought that the checksum errors might be down to the failing drive (c12t5d0 failed, the controller labeled the new drive as c12t8d0) going out during a write. Then again ZFS is atomic, so better clear the errors and run a scrub; it came out like this: pool: sasuc8i state: ONLINE status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. see: http://www.sun.com/msg/ZFS-8000-9P scrub: scrub completed after 1h16m with 0 errors on Tue Apr 13 01:29:32 2010 config: NAME STATE READ WRITE CKSUM sasuc8i ONLINE 0 0 0 raidz2 ONLINE 0 0 0 c12t4d0 ONLINE 0 0 5 c12t8d0 ONLINE 0 0 0 c12t6d0 ONLINE 0 0 0 c12t7d0 ONLINE 0 0 4 86K repaired c12t0d0 ONLINE 0 0 1 c12t1d0 ONLINE 0 0 6 86K repaired c12t2d0 ONLINE 0 0 4 c12t3d0 ONLINE 0 0 6 108K repaired errors: No known data errors Now I'm getting nervous. Checksum errors, some repaired, others not. Am I going to end up with multiple drive failures, or what the * is going on here? Ran one more scrub and everything came up roses. Checked SMART status on the drives with checksum errors and they are fine, although I expect only read/write errors would show up there. I'm not sure how to get this into a proper question, but what I'm after is "is this normal and to be expected after a resilver, and can I start breathing again?". Checksum errors are, as far as I can gather, dodgy data on disk, while read/write errors point at somewhere in the physical link (more or less). Thank you!
-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
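If it helps with the nerves: the fault manager keeps the raw error reports that sit behind those checksum counters, and looking at them can show whether the errors clustered on one point in time or one controller port; for example (time arguments made up, check fmdump(1M) for the accepted formats):
pfexec fmdump -eV | more
pfexec fmdump -e -t 12Apr10
The ereport class (e.g. checksum vs. io) and the timestamps are usually enough to decide whether this was a one-off event during the resilver or something ongoing.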
Re: [zfs-discuss] ZFS panic
On 04/ 2/10 10:25 AM, Ian Collins wrote: Is this callstack familiar to anyone? It just happened on a Solaris 10 update 8 box: genunix: [ID 655072 kern.notice] fe8000d1b830 unix:real_mode_end+7f81 () genunix: [ID 655072 kern.notice] fe8000d1b910 unix:trap+5e6 () genunix: [ID 655072 kern.notice] fe8000d1b920 unix:_cmntrap+140 () genunix: [ID 655072 kern.notice] fe8000d1ba40 zfs:zfs_space_delta_cb+46 () genunix: [ID 655072 kern.notice] fe8000d1ba80 zfs:dmu_objset_do_userquota_callbacks+b9 () genunix: [ID 655072 kern.notice] fe8000d1bae0 zfs:dsl_pool_sync+df () genunix: [ID 655072 kern.notice] fe8000d1bb90 zfs:spa_sync+29d () genunix: [ID 655072 kern.notice] fe8000d1bc40 zfs:txg_sync_thread+1f0 () genunix: [ID 655072 kern.notice] fe8000d1bc50 unix:thread_start+8 () I've seen a couple more of these, they look very similar (same stack, slightly different offsets) to 6886691 and 6885428. Both of these are closed as not reproducible. I guess I'd better open a new case... -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
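For anyone wanting to compare, the panic stack and message buffer can be pulled back out of a saved dump with mdb; a sketch, assuming the dump was saved as vmdump.0 under /var/crash/<hostname>:
cd /var/crash/`hostname`
pfexec savecore -vf vmdump.0
mdb unix.0 vmcore.0
::status
::msgbuf
$c
::status shows the panic string, ::msgbuf the console messages leading up to it, and $c the stack of the panicking thread.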
[zfs-discuss] ZIL errors but device seems OK
Hi, I have installed OpenSolaris snv_134 from the iso at genunix.org. Mon Mar 8 2010 New OpenSolaris preview, based on build 134 I created a zpool:- NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 c7t4d0 ONLINE 0 0 0 c7t5d0 ONLINE 0 0 0 c7t6d0 ONLINE 0 0 0 c7t8d0 ONLINE 0 0 0 c7t9d0 ONLINE 0 0 0 logs c5d1p1 ONLINE 0 0 0 cache c5d1p2 ONLINE 0 0 0 The log device and cache are each one half of a 128GB OCZ VERTEX-TURBO flash card. I am getting good NFS performance but have seen this error:- r...@brszfs02:~# zpool status tank pool: tank state: DEGRADED status: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Replace the faulted device, or use 'zpool clear' to mark the device repaired. scrub: none requested config: NAME STATE READ WRITE CKSUM tank DEGRADED 0 0 0 c7t4d0 ONLINE 0 0 0 c7t5d0 ONLINE 0 0 0 c7t6d0 ONLINE 0 0 0 c7t8d0 ONLINE 0 0 0 c7t9d0 ONLINE 0 0 0 logs c5d1p1 FAULTED 0 4 0 too many errors cache c5d1p2 ONLINE 0 0 0 errors: No known data errors r...@brszfs02:~# fmadm faulty --- -- - TIME EVENT-ID MSG-ID SEVERITY --- -- - Mar 25 13:14:34 6c0bd163-56bf-ee92-e393-ce2063355b52 ZFS-8000-FD Major Host: brszfs02 Platform: HP-Compaq-dc7700-Convertible-Minitower Chassis_id : CZC7264JN4 Product_sn : Fault class : fault.fs.zfs.vdev.io Affects : zfs://pool=tank/vdev=4ec464b5bf74a898 faulted but still in service Problem in : zfs://pool=tank/vdev=4ec464b5bf74a898 faulted but still in service Description : The number of I/O errors associated with a ZFS device exceeded acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information. Response : The device has been offlined and marked as faulted. An attempt will be made to activate a hot spare if available. Impact : Fault tolerance of the pool may be compromised. Action : Run 'zpool status -x' and replace the bad device. r...@brszfs02:~# iostat -En c5d1 c5d1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 Model: OCZ VERTEX-TURB Revision: Serial No: 062F97G71C5T676 Size: 128.04GB <128035160064 bytes> Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 Illegal Request: 0 As there seem to be no hardware errors reported by iostat, I ran 'zpool clear tank' and a scrub on Monday. Up to now I have seen no new errors; I have set up a cron job to scrub at 01:30 each day. Is the flash card faulty or is this a ZFS problem? Cheers Richard -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
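For reference, the cron entry is nothing fancier than the following (pool name and time assumed); on a larger pool it's worth making sure one scrub finishes before the next starts, or scheduling it weekly instead of daily:
30 1 * * * /usr/sbin/zpool scrub tank
zpool status will tell you whether the previous scrub is still running before the next one is due.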
Re: [zfs-discuss] b134 panic in ddt_sync_entry()
On Wed, Apr 14, 2010 at 3:01 AM, Victor Latushkin wrote: > > On Apr 13, 2010, at 9:52 PM, Cyril Plisko wrote: > >> Hello ! >> >> I've had a laptop that crashed a number of times during last 24 hours >> with this stack: >> >> panic[cpu0]/thread=ff0007ab0c60: >> assertion failed: ddt_object_update(ddt, ntype, nclass, dde, tx) == 0, >> file: ../../common/fs/zfs/ddt.c, line: 968 >> >> >> ff0007ab09a0 genunix:assfail+7e () >> ff0007ab0a20 zfs:ddt_sync_entry+2f1 () >> ff0007ab0a80 zfs:ddt_sync_table+dd () >> ff0007ab0ae0 zfs:ddt_sync+136 () >> ff0007ab0ba0 zfs:spa_sync+41f () >> ff0007ab0c40 zfs:txg_sync_thread+24a () >> ff0007ab0c50 unix:thread_start+8 () >> >> >> Is that a known issue ? > > There is CR 6912741 with similar stack reported. It is now closed, as problem > was seen on some custom kernel, and was not reproducible. > >> I have vmdump files available in case people want to have a look. > > > If you can pack and upload your dumps to e.g. supportfiles.sun.com (or > provide a link to download), it is definitely interesting to have a look and > reopen the bug (or even file a new one). Hi Victor, Here we go: === Thanks for your upload Your file has been stored as "/cores/vmdump.0.7z" on the Supportfiles service. Size of the file (in bytes) : 169866288. The file has a cksum of : 2780601688 . You can verify the checksum of the file by comparing this value with the output of /usr/bin/cksum filename on your local machine. If there is any difference in the checksum values, please re-upload the file. -- Regards, Cyril ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Replaced drive in zpool, was fine, now degraded - ohno
I just started replacing drives in this zpool (to increase storage). I pulled the first drive, and replaced it with a new drive and all was well. It resilvered with 0 errors. This was 5 days ago. Just today I was looking around and noticed that my pool was degraded (I see now that this occurred last night). Sure enough there are 12 read errors on the new drive. I'm on snv_111b. I attempted to get smartmontools working, but it doesn't seem to want to work as these are all SATA drives. fmdump indicates that the read errors occurred within about 10 minutes of one another. Is it safe to say this drive is bad, or is there anything else I can do about this? Thanks, Jon $ zpool status MyStorage pool: MyStorage state: DEGRADED status: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state. action: Replace the faulted device, or use 'zpool clear' to mark the device repaired. scrub: scrub completed after 8h7m with 0 errors on Sun Apr 11 13:07:40 2010 config: NAME STATE READ WRITE CKSUM MyStorage DEGRADED 0 0 0 raidz1 DEGRADED 0 0 0 c5t0d0 ONLINE 0 0 0 c5t1d0 ONLINE 0 0 0 c6t1d0 ONLINE 0 0 0 c7t1d0 FAULTED 12 0 0 too many errors errors: No known data errors $ fmdump TIME UUID SUNW-MSG-ID Apr 09 16:08:04.4660 1f07d23f-a4ba-cbbb-8713-d003d9771079 ZFS-8000-D3 Apr 13 22:29:02.8063 e26c7e32-e5dd-cd9c-cd26-d5715049aad8 ZFS-8000-FD That first log is the original drive being replaced. The second is the read errors on the new drive. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
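On the smartmontools point: with SATA disks behind a SAS/SCSI-style HBA on Solaris you usually have to force the ATA pass-through type by hand, so it may be worth one more try before giving up on SMART; device path assumed, and the right -d value varies by controller:
pfexec smartctl -d sat,12 -a /dev/rdsk/c7t1d0s0
pfexec smartctl -d sat -a /dev/rdsk/c7t1d0s0
If neither works, the error counters from 'iostat -En c7t1d0' and 'fmdump -eV' are the next best thing.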