Re: [zfs-discuss] x4500 vs AVS ?
[EMAIL PROTECTED] wrote:
> War wounds? Could you please expand on the why a bit more?

- ZFS is not aware of AVS. On the secondary node, you'll always have to force the `zpool import` due to the unnoticed changes of metadata (zpool in use). No mechanism to prevent data loss exists, e.g. zpools can be imported while the replicator is *not* in logging mode.

- AVS is not ZFS aware. For instance, if ZFS resilvers a mirrored disk, e.g. after replacing a drive, the complete disk is sent over the network to the secondary node, even though the replicated data on the secondary is intact. That's a lot of fun with today's disk sizes of 750 GB and 1 TB drives, resulting in usually 10+ hours without real redundancy (customers who use Thumpers to store important data usually don't have the budget to connect their data centers with 10 Gbit/s, so expect 10+ hours *per disk*).

- ZFS + AVS + X4500 leads to bad error handling. The zpool must not be imported on the secondary node during replication, and the X4500 has no RAID controller that signals (and handles) drive faults. Drive failures on the secondary node may therefore go unnoticed until the primary node goes down and you want to import the zpool on the secondary node with the broken drive. Since ZFS doesn't offer a recovery mechanism like fsck, data loss of up to 20 TB may occur. If you use AVS with ZFS, make sure that you have storage which handles drive failures without OS interaction.

- 5 hours for scrubbing a 1 TB drive, if you're lucky. Up to 48 drives in total.

- An X4500 has no battery-buffered write cache. ZFS uses the server's RAM as a cache, 15 GB+. I don't want to find out how much time a resilver over the network after a power outage may take (a full reverse replication would take up to 2 weeks and is no valid option in a serious production environment). But the underlying question I asked myself is why I should want to replicate data in such an expensive way when I think the 48 TB of data itself is not important enough to be protected by a battery.

- I gave AVS a set of 6 drives just for the bitmaps (using SVM soft partitions). That wasn't enough; the replication was still very slow, probably because of an insane amount of head movement, and it scales badly. Putting the bitmap of a drive on the drive itself (if I remember correctly, this is recommended in one of the most referenced howto blog articles) is a bad idea. Always use ZFS on whole disks if performance and caching matter to you.

- AVS seems to require additional shared storage when building failover clusters with 48 TB of internal storage. That may be hard to explain to the customer. But I'm not 100% sure about this, because I just didn't find a way; I didn't ask on a mailing list for help.

If you want a fail-over solution for important data, use external JBODs. Use AVS only to mirror complete clusters, don't use it to replicate single boxes with local drives. And, in case OpenSolaris is not an option for you due to your company policies or support contracts, building a real cluster is also a lot cheaper.

-- Ralf Ramge, Senior Solaris Administrator, SCNA, SCSA
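A minimal sketch of the forced-import sequence described above, assuming AVS's sndradm(1M) CLI and a pool named "tank" (all names hypothetical):

    # on the secondary node: stop applying replicated writes first,
    # otherwise the import races against incoming metadata changes
    sndradm -n -l                  # put the replica set(s) into logging mode
    # the pool still looks "in use" by the primary, so the import
    # must be forced -- this is exactly the data-loss window above
    zpool import -f tank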
[zfs-discuss] [install-discuss] Will OpenSolaris and Nevada co-exist in peace on the same root zpool
Well, I want to give OpenSolaris a try, but have not yet worked up the confidence to just try it. So a few questions:

When I start the OpenSolaris installer, will it install into my existing root zpool, which is called RPOOL, not rpool? Without destroying my existing Nevada installations? Or killing my existing GRUB menu? And will it be intelligent about my existing Live Upgrade BEs? And other existing shareable ZFS datasets (e.g. /export and /var/shared)?

Related to this: Can I have the same directory used for my home directory under both Nevada and OpenSolaris? I am guessing the answer is YMMV depending on the differences in versions of, for example, Firefox, Gnome, Thunderbird, etc., and based on how well these cope with settings that were changed by another, potentially newer, version of themselves.

--
ZFS snapshots are your friend. ZFS + LiveUpgrade: a match made in heaven.
"Any sufficiently advanced technology is indistinguishable from magic." - Arthur C. Clarke
Re: [zfs-discuss] Terabyte scrub
You are right! Seeing the numbers I could not think very well ;-) What matters is the used size, not the storage capacity! My fault... Thanks a lot for the answers.

Leal.
Re: [zfs-discuss] send/receive statistics
Thanks a lot for the answers! Relling did say something about checksums; I asked him for a more detailed explanation, because I did not understand what checksum the receive part has to check, as the send stream can be redirected to a file on a disk or tape... In the end, I think if we can import (receive) the snapshot, and that procedure ends fine, we are in good shape.

Leal.
[zfs-discuss] A question about recordsize...
Hello! Assuming the default recordsize (FSB) in ZFS is 128k:

1 - If I have a file of 10k, ZFS will allocate an FSB of 10k. Right? As ZFS is not static like the other filesystems, I don't have that old internal fragmentation...

2 - If the above is right, I don't need to adjust the recordsize (FSB) if I will handle a lot of tiny files. Right?

3 - If the two above are right, then tuning the recordsize is only important for files greater than the FSB. Let's say, 129k... but then, another question: if the file is 129k, will ZFS allocate one filesystem block of 128k and another of... 1k? Or two of 128k?

4 - The last one... ;-) For the FSB allocation, how does ZFS know the file size, to know if the file is smaller than the FSB? Something related to the txg? When the write goes to the disk, does ZFS know (some way) if that write is a whole file or a piece of it?

Thanks a lot! Leal.
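A quick way to experiment with this (a sketch; pool and file system names are hypothetical):

    # recordsize is an upper bound: a file smaller than one record is
    # stored in a single block sized to the file, so tiny files do not
    # each consume a full 128k record
    zfs get recordsize tank/data
    # lowering it mainly matters for large files rewritten in small,
    # fixed-size chunks (e.g. databases), not for lots of tiny files
    zfs set recordsize=8k tank/data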
Re: [zfs-discuss] [install-discuss] Will OpenSolaris and Nevada co-exist in peace on the same root zpool
Johan Hartzenberg writes:
> I am guessing the answer is YMMV depending on the differences in versions of, for example, Firefox, Gnome, Thunderbird, etc., and based on how well these cope with settings that were changed by another, potentially newer, version of themselves.

The answers to your questions are basically all no. The new installer wants a primary partition or a whole disk. However, there are helpful blogs from folks who've made the transition. Poor Ed seems to have a broken 'shift' key, but he gives great details here:

http://blogs.sun.com/edp/entry/moving_from_nevada_and_live

-- James Carlson, Solaris Networking, Sun Microsystems
Re: [zfs-discuss] x4500 vs AVS ?
[jumping ahead and quoting myself] AVS is not a mirroring technology, it is a remote replication technology. So, yes, I agree 100% that people should not expect AVS to be a mirror.

Ralf Ramge wrote:
> - ZFS is not aware of AVS. On the secondary node, you'll always have to force the `zpool import` due to the unnoticed changes of metadata (zpool in use). No mechanism to prevent data loss exists, e.g. zpools can be imported while the replicator is *not* in logging mode.

ZFS isn't special in this regard; AFAIK all file systems, databases and other data stores suffer from the same issue with remote replication.

> - AVS is not ZFS aware. For instance, if ZFS resilvers a mirrored disk, e.g. after replacing a drive, the complete disk is sent over the network to the secondary node, even though the replicated data on the secondary is intact. [...]

ZFS only resilvers data. Other LVMs, like SVM, will resilver the entire disk, though.

> - ZFS + AVS + X4500 leads to bad error handling. [...] Since ZFS doesn't offer a recovery mechanism like fsck, data loss of up to 20 TB may occur. If you use AVS with ZFS, make sure that you have storage which handles drive failures without OS interaction.

If this were the case, then array-based replication would also be similarly affected by this architectural problem. In other words, if you say that a software RAID system cannot be replicated by a software replicator, then TrueCopy, SRDF, and other RAID-array-based (also software) replicators also do not work. I think there is enough empirical evidence that they do work. I can see where there might be a best practice here, but I see no fundamental issue. Also, fsck does not recover data, it only recovers metadata.

> - 5 hours for scrubbing a 1 TB drive, if you're lucky. Up to 48 drives in total.

ZFS only scrubs data, but it is not unusual for scrubbing a lot of data to take a long time. ZFS only performs read scrubs, so no replication is required during a ZFS scrub unless data is repaired.

> - An X4500 has no battery-buffered write cache. ZFS uses the server's RAM as a cache, 15 GB+. [...] why I should want to replicate data in such an expensive way when I think the 48 TB of data itself is not important enough to be protected by a battery.

ZFS will not be storing 15 GBytes of unflushed data on any system I can imagine today. While we can all agree that 48 TBytes will be painful to replicate, that is not caused by ZFS -- though it is enabled by ZFS, because some other file systems (UFS) cannot be as large as 48 TBytes.

> - I gave AVS a set of 6 drives just for the bitmaps (using SVM soft partitions). That wasn't enough; the replication was still very slow [...] Always use ZFS on whole disks if performance and caching matter to you.

I think there are opportunities for performance improvement, but I don't know who is currently actively working on this. Actually, the cases where ZFS on whole disks is a big win are small. And, of course, you can enable disk write caches by hand.

> - AVS seems to require additional shared storage when building failover clusters with 48 TB of internal storage. [...] Use AVS only to mirror complete clusters, don't use it to replicate single boxes with local drives.

AVS is not a mirroring technology, it is a remote replication technology. So, yes, I agree 100% that people should not expect AVS to be a mirror.
-- richard
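On the write-cache point, a sketch of enabling a disk's write cache by hand (device name hypothetical; the exact format(1M) expert-mode menu entries vary by driver and disk type):

    format -e -d c2t0d0
    #   format> cache
    #   cache> write_cache
    #   write_cache> enable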
Re: [zfs-discuss] zfs metadata corrupted
LyeBeng Ong wrote:
> I made a bad judgment and now my raidz pool is corrupted. I have a raidz pool running on OpenSolaris b85. I wanted to try out FreeNAS 0.7 and tried to add my pool to FreeNAS. After adding the zfs disk, vdev and pool, I decided to back out and went back to OpenSolaris. Now my raidz pool will not mount, and I got the following errors. I hope some expert can help me recover from this.

The symptoms are consistent with a disk-repartitioning event. First check that the disks are labeled now as they were originally. When you run zdb -l, make sure you use the same devices as before. For example, zdb -l /dev/rdsk/c2d0 (see below) is not the same as:

    zdb -l /dev/dsk/c2d0s0

which is where ZFS thinks the data should be. Also, /dev/ad10 is something I don't recognize... what is it?
-- richard

[EMAIL PROTECTED]:/dev/rdsk# zpool status
  pool: syspool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        syspool     ONLINE       0     0     0
          c1d0s0    ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-72
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        FAULTED      0     0     4  corrupted data
          raidz1    ONLINE       0     0     4
            c2d0    ONLINE       0     0     0
            c2d1    ONLINE       0     0     0
            c3d0    ONLINE       0     0     0
            c3d1    ONLINE       0     0     0

[EMAIL PROTECTED]:/dev/rdsk# zdb -vvv
syspool
    version=10
    name='syspool'
    state=0
    txg=13
    pool_guid=7417064082496892875
    hostname='elatte_installcd'
    vdev_tree
        type='root'
        id=0
        guid=7417064082496892875
        children[0]
            type='disk'
            id=0
            guid=16996723219710622372
            path='/dev/dsk/c1d0s0'
            devid='id1,[EMAIL PROTECTED]/a'
            phys_path='/[EMAIL PROTECTED],0/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
            whole_disk=0
            metaslab_array=14
            metaslab_shift=30
            ashift=9
            asize=158882856960
            is_log=0
tank
    version=10
    name='tank'
    state=0
    txg=9305484
    pool_guid=6165551123815947851
    hostname='cempedak'
    vdev_tree
        type='root'
        id=0
        guid=6165551123815947851
        children[0]
            type='raidz'
            id=0
            guid=18029757455913565148
            nparity=1
            metaslab_array=14
            metaslab_shift=33
            ashift=9
            asize=1280228458496
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=14740261559114907785
                path='/dev/dsk/c2d0s0'
                devid='id1,[EMAIL PROTECTED]/a'
                phys_path='/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
                whole_disk=1
            children[1]
                type='disk'
                id=1
                guid=7618479640615121644
                path='/dev/dsk/c2d1s0'
                devid='id1,[EMAIL PROTECTED]/a'
                phys_path='/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
                whole_disk=1
            children[2]
                type='disk'
                id=2
                guid=1801493855297946488
                path='/dev/dsk/c3d0s0'
                devid='id1,[EMAIL PROTECTED]/a'
                phys_path='/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
                whole_disk=1
            children[3]
                type='disk'
                id=3
                guid=15710901655082836445
                path='/dev/dsk/c3d1s0'
                devid='id1,[EMAIL PROTECTED]/a'
                phys_path='/[EMAIL PROTECTED],0/pci10de,[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
                whole_disk=1
[EMAIL PROTECTED]:/dev/rdsk# zdb -l /dev/rdsk/c2d0
LABEL 0
    version=6
    name='tank'
[zfs-discuss] resilver speed.
Is there any way to control the resilver speed? Having attached a third disk to a mirror (so I can replace the other disks with larger ones), the resilver goes at a fraction of the speed of the same operation using DiskSuite. However, it still renders the system pretty much unusable for anything else. So I would like to control the rate of the resilver: either slow it down a lot so that the system is still usable, or tell it to go as fast as possible to get it over with.

Also, does the resilver deliberately pause? Running iostat I see that it will pause for five to ten seconds where no IO is done at all, then it continues on at a more reasonable pace.
Re: [zfs-discuss] zfs metadata corrupted
--On 05 September 2008 07:37 -0700 Richard Elling wrote:
> Also, /dev/ad10 is something I don't recognize... what is it?

'/dev/ad10' is a FreeBSD disk device, which would kind of be fitting, as:

LyeBeng Ong wrote:
> I made a bad judgment and now my raidz pool is corrupted. I have a raidz pool running on OpenSolaris b85. I wanted to try out FreeNAS 0.7 and tried to add my pool to FreeNAS.

FreeNAS is FreeBSD based...

-Kp
Re: [zfs-discuss] zfs metadata corrupted
Karl Pielorz wrote:
> '/dev/ad10' is a FreeBSD disk device, which would kind of be fitting [...] FreeNAS is FreeBSD based...

Ah, ok, so perhaps an export and import would clear the cobwebs? What happens when you try to import?
-- richard
[zfs-discuss] ZFS in Solaris 10 5/08
All, I realize this is an OpenSolaris forum but I need help. I'm getting conflicting information about whether ZFS in the Solaris 10 5/08 release supports gzip compression. When I run a `zpool upgrade -v` command it reports version 4, and that doesn't include gzip... yes?? Is there a way to get gzip compression (gzip-9) enabled in ZFS on Solaris 10 5/08??

Thanks in advance for the discussion. ---Kenny
Re: [zfs-discuss] resilver speed.
Chris Gerhard wrote:
> Is there any way to control the resilver speed? Having attached a third disk to a mirror (so I can replace the other disks with larger ones), the resilver goes at a fraction of the speed of the same operation using DiskSuite. However, it still renders the system pretty much unusable for anything else.

Resilvers work at low priority in the ZFS scheduler. In general, they work at the media speed of the disk being resilvered. However, anecdotal evidence suggests that this may be impacted by the number and extent of snapshots. I have a lot of characterization data for resilvers, but without varying the scope and number of snapshots (which is a hard thing to identify). ZFS resilvers in time sequence, not by disk block location, so there are many more variables at play here than might be immediately obvious.

> So I would like to control the rate of the resilver: either slow it down a lot so that the system is still usable, or tell it to go as fast as possible to get it over with.

There are two competing RFEs for this:
http://bugs.opensolaris.org/view_bug.do?bug_id=6592835
http://bugs.opensolaris.org/view_bug.do?bug_id=6494473

> Also, does the resilver deliberately pause? Running iostat I see that it will pause for five to ten seconds where no IO is done at all, then it continues on at a more reasonable pace.

I have not seen such behaviour during resilver characterization. Which OS release are you using? Also, are you using IDE disks or disks which do not handle multiple outstanding operations? You may also be seeing
http://bugs.opensolaris.org/view_bug.do?bug_id=6729696
-- richard
Re: [zfs-discuss] ZFS in Solaris 10 5/08
Kenny wrote:
> I'm getting conflicting information about whether ZFS in the Solaris 10 5/08 release supports gzip compression. When I run a `zpool upgrade -v` command it reports version 4, and that doesn't include gzip... yes??

Correct. gzip arrives with zpool version 5.

> Is there a way to get gzip compression (gzip-9) enabled in ZFS on Solaris 10 5/08??

Not today, AFAIK. This will appear as a patch for Solaris 10 5/08 (aka Solaris 10 update 5) and should be in Solaris 10 update 6. But I do not know the schedule beyond "later this year" or "real soon now".
-- richard
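Once on a release with pool version 5 or later, enabling it would look something like this (a sketch; pool and dataset names are hypothetical):

    # upgrading is one-way: older releases cannot import the pool afterwards
    zpool upgrade tank
    # gzip levels 1-9 are accepted; plain "gzip" means gzip-6
    zfs set compression=gzip-9 tank/data
    # only newly written blocks are compressed; existing data is untouched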
Re: [zfs-discuss] resilver speed.
Thanks.

Richard Elling wrote:
> Also, are you using IDE disks or disks which do not handle multiple outstanding operations?

SATA with the cmdk driver, which is only sending 2 commands at a time.

> You may also be seeing http://bugs.opensolaris.org/view_bug.do?bug_id=6729696

That could well be the case. Fortunately, none of the users would know how to run the sync command.

-- Chris Gerhard, Systems TSC Chief Technologist, Sun Microsystems Limited
Phone: +44 (0) 1252 426033 (ext 26033) http://blogs.sun.com/chrisg
[zfs-discuss] ?: any effort for snapshot management
I have seen Tim Foster's auto-snapshot service and it looks interesting. Is there a bug ID or effort to deliver a snapshot policy and space management framework? I'm not looking for a GUI, although a CLI-based UI might be helpful.

The customer needs something that allows the use of snapshots on 100s of systems and minimizes the administration needed to handle disks filling up. I imagine one component is a time- or condition-based auto-delete of older snapshot(s).

Thanks
Steffen
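A minimal sketch of such an auto-delete policy as a cron-driven script (the file system name and retention count are hypothetical; this is not an existing tool):

    #!/usr/bin/ksh
    # take a snapshot of $FS, then keep only the newest $KEEP auto snapshots
    FS=tank/data
    KEEP=7
    zfs snapshot "$FS@auto-$(date '+%Y%m%d-%H%M')"
    # list this file system's auto snapshots, oldest first
    zfs list -H -t snapshot -o name -s creation | grep "^$FS@auto-" > /tmp/snaps.$$
    N=$(wc -l < /tmp/snaps.$$)
    if [ $N -gt $KEEP ]; then
        head -n $((N - KEEP)) /tmp/snaps.$$ | while read snap; do
            zfs destroy "$snap"
        done
    fi
    rm -f /tmp/snaps.$$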
[zfs-discuss] Snapshots during a scrub
I have a weekly scrub set up, and I've seen at least once now where it says "don't snapshot while scrubbing". Is this a data integrity issue, or will it make one or both of the processes take longer?

Thanks
Re: [zfs-discuss] Snapshots during a scrub
mike wrote:
> I have a weekly scrub set up, and I've seen at least once now where it says "don't snapshot while scrubbing". Is this a data integrity issue, or will it make one or both of the processes take longer? Thanks

That problem has been fixed in build 94. Here is the bug that people have been referring to:

6343667 scrub/resilver has to start over when a snapshot is taken

-Mark
Re: [zfs-discuss] Snapshots during a scrub
mike wrote:
> I have a weekly scrub set up, and I've seen at least once now where it says "don't snapshot while scrubbing". Is this a data integrity issue, or will it make one or both of the processes take longer?

The problem prior to NV b94 is that a snapshot would restart a scrub. This has been fixed with:
http://bugs.opensolaris.org/view_bug.do?bug_id=6343667
-- richard
Re: [zfs-discuss] ?: any effort for snapshot management
Steffen,

The most complete and serious ZFS snapshot management I know of -- integrated ZFS send/recv replication over RSYNC with a CLI, integrated AVS, a GUI, and a management server which provides a rich API for C/C++/Perl/Python/Ruby integrators -- is available here:

http://www.nexenta.com/nexentastor-overview

It's ZFS+ with a lot of reliability fixes: an enterprise-quality, production-ready solution. Demos of advanced CLI usage are here:

http://www.nexenta.com/demos/automated-snapshots.html
http://www.nexenta.com/demos/auto-tier-basic.html

As a side note, I think that the disintegrated general-purpose scripting which is available on the Internet simply cannot provide production quality and ease of use.

On Fri, 2008-09-05 at 13:14 -0400, Steffen Weiberle wrote:
> I have seen Tim Foster's auto-snapshot service and it looks interesting. Is there a bug ID or effort to deliver a snapshot policy and space management framework? [...]
Re: [zfs-discuss] Snapshots during a scrub
Okay, well I am running snv_94 already, so I guess I'm good :)

On Fri, Sep 5, 2008 at 10:23 AM, Mark Shellenbaum wrote:
> That problem has been fixed in build 94. Here is the bug that people have been referring to:
> 6343667 scrub/resilver has to start over when a snapshot is taken
[zfs-discuss] Error: value too large for defined data type
I am having a very odd problem on one of our ZFS filesystems. On certain files, when accessed locally on the Solaris server itself where the ZFS fs sits, we get an error like the following:

[EMAIL PROTECTED] # ls -l
./README: Value too large for defined data type
total 36
-rw-r-   1 mreuter  mreuter   1019 Sep 25  2006 Makefile
-rw-r-   1 mreuter  mreuter   3185 Feb 22  2000 lcompgre.cc
-rw-r-   1 mreuter  mreuter   3238 Feb 22  2000 lcompgsh.cc
-rw-r-   1 mreuter  mreuter   2485 Feb 22  2000 lcompreg.cc
-rw-r-   1 mreuter  mreuter   2774 Feb 22  2000 lcompshf.cc

The odd thing is that when the filesystem is accessed from our Linux boxes over NFS, there is no error accessing the same file:

vader:complex[84] ls -l
total 24
drwxr-x---+  2 mreuter  mreuter      8 Sep 25  2006 .
drwxr-x---+  5 mreuter  mreuter      5 Mar 31  1997 ..
-rw-r-+  1 mreuter  mreuter   3185 Feb 22  2000 lcompgre.cc
-rw-r-+  1 mreuter  mreuter   3238 Feb 22  2000 lcompgsh.cc
-rw-r-+  1 mreuter  mreuter   2485 Feb 22  2000 lcompreg.cc
-rw-r-+  1 mreuter  mreuter   2774 Feb 22  2000 lcompshf.cc
-rw-r-+  1 mreuter  mreuter   1019 Sep 25  2006 Makefile
-rw-r-+  1 mreuter  mreuter   1435 Jan  4  1945 README

vader:mreuter:complex[85] wc README
  40  181 1435 README

The file is obviously small, so this is not a large-file problem. Anyone have an idea what gives?

-- Paul Raines, email: raines at nmr.mgh.harvard.edu
MGH/MIT/HMS Athinoula A. Martinos Center for Biomedical Imaging
149 (2301) 13th Street, Charlestown, MA 02129 USA
Re: [zfs-discuss] ?: any effort for snapshot management
I think I can answer those questions. Firstly, Tim is working with some of Sun's desktop guys to put together a GUI for his stuff. I don't know how long it will be, but I'd guess that you'll see that as a full part of OpenSolaris sometime this year.

Regarding snapshot management, I believe there are two types of filesystem quota in ZFS now. The original quotas include snapshot space, and you can also create a quota that applies to the main filesystem only. I don't know the details of which settings they are, I'm afraid; I just remember reading that the two types of quota had been implemented. I would have thought those quotas, plus the auto-delete ability of Tim's snapshot tools, should fulfil most needs.

Ross
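If memory serves, these are the quota and refquota properties; a quick sketch (dataset name hypothetical):

    # quota limits the dataset plus all descendants and snapshots
    zfs set quota=10G tank/home/fred
    # refquota limits only the space the file system itself references,
    # so snapshot growth cannot push users over their limit
    zfs set refquota=8G tank/home/fred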
[zfs-discuss] ZIL NVRAM partitioning?
I understand that if you want to use ZIL, then the requirement is one or more ZILs per pool. With an SSD you can partition the disk to allow usage of a single disk for multiple ZILs. Can we do the same thing with a PCIe-based NVRAM card (like http://www.vmetro.com/category4304.html)?

Thanks
Narayan
[zfs-discuss] zfs cksum errors
I am having trouble with ZFS: if I scrub the pool, I get cksum errors. If I scrub, `zpool clear rpool`, and then re-scrub, the cksum errors remain. This appears to be a systematic error and not hardware related. I am also having trouble with beadm create; please see:
http://www.opensolaris.org/jive/thread.jspa?threadID=71960&tstart=0

# zpool status -v
  pool: rpool
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: resilver in progress for 0h13m, 93.11% done, 0h0m to go
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     2
          mirror    ONLINE       0     0     2
            c6d0s0  ONLINE       0     0     4
            c7d0s2  ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
Re: [zfs-discuss] Error: value too large for defined data type
On Fri, Sep 05, 2008 at 03:17:44PM -0400, Paul Raines wrote:
> [EMAIL PROTECTED] # ls -l
> ./README: Value too large for defined data type
> [...]
> -rw-r-+  1 mreuter  mreuter   1435 Jan  4  1945 README
>
> The file is obviously small, so this is not a large-file problem.

Probably the date. I don't think 'ls' is isaexec-wrapped by default. You might try running the 64-bit version of ls.

-- Darren
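A quick way to test this theory (a sketch; the path assumes an x86 box, substitute sparcv9 on SPARC):

    # invoke the 64-bit ls directly rather than the default 32-bit one;
    # a pre-1970 mtime like "Jan 4 1945" is a negative time_t value,
    # which can trip up 32-bit builds
    /usr/bin/amd64/ls -l README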
Re: [zfs-discuss] resilver speed.
On Fri, 2008-09-05 at 09:41 -0700, Richard Elling wrote:
> > Also, does the resilver deliberately pause? Running iostat I see that it will pause for five to ten seconds where no IO is done at all, then it continues on at a more reasonable pace.
>
> I have not seen such behaviour during resilver characterization.

I have, post nv_94, and I filed a bug:

6729696 sync causes scrub or resilver to pause for up to 30s

- Bill
Re: [zfs-discuss] Error: value too large for defined data type
Paul Raines wrote:
> I am having a very odd problem on one of our ZFS filesystems. On certain files, when accessed locally on the Solaris server itself where the ZFS fs sits, we get an error like the following:
>
> [EMAIL PROTECTED] # ls -l
> ./README: Value too large for defined data type
> [...]

Do you by chance have /usr/gnu/bin, or any directory with a Gnu 'ls', in your path before /usr/bin? (What does 'which ls' show?)

I've seen this with a Gnu ls that I have compiled myself as far back as Solaris 9, maybe earlier. By default, Gnu ls compiled on Solaris doesn't know how to handle large files (and therefore probably not 64-bit dates either). When I've seen this, explicitly running /usr/bin/ls -l worked fine, and I suspect it will for you too.

-Kyle
Re: [zfs-discuss] ZIL NVRAM partitioning?
On 09/05/08 14:42, Narayan Venkat wrote:
> I understand that if you want to use ZIL, then the requirement is one or more ZILs per pool.

A little clarification of ZFS terms may help here. The term "ZIL" is somewhat overloaded. I think what you mean here is a separate log device (slog), because intent logs are always present in ZFS. Without a slog, the logs live in the main pool: there is one log per file system, and it allocates blocks in the main pool to form a chain. When a slog is defined, it can be made up of multiple devices (in which case the writes are striped across the devices) or it can be in the form of an N-way mirror, to provide redundancy.

> With an SSD you can partition the disk to allow usage of a single disk for multiple ZILs. Can we do the same thing with a PCIe-based NVRAM card (like http://www.vmetro.com/category4304.html)?

I don't think there's a Solaris-supported driver for that device. However, any Solaris device, whether a partition or not, will work with ZFS provided it's at least 64MB. Its performance is another matter.
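For reference, attaching a slog to an existing pool looks like this (pool and device names are hypothetical):

    # stripe the intent log across two devices:
    zpool add tank log c4t0d0 c4t1d0
    # or, as described above, mirror it for redundancy:
    zpool add tank log mirror c4t0d0 c4t1d0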
Re: [zfs-discuss] ZIL NVRAM partitioning?
Thanks Neil for the clarification.

Regards,
Narayan