Re: [zfs-discuss] b134 pool borked!
90 reads and not a single comment? Not the slightest hint of what's going on? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
This is what the output of my zpool import command looks like; attached you'll find the output of zdb -l for each device. pool: tank id: 10904371515657913150 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: tank ONLINE raidz1-0 ONLINE c13t4d0 ONLINE c13t5d0 ONLINE c13t6d0 ONLINE c13t7d0 ONLINE raidz1-1 ONLINE c13t3d0 ONLINE c13t1d0 ONLINE c13t2d0 ONLINE c13t0d0 ONLINE cache c8t2d0 logs c8t0d0 ONLINE -- This message posted from opensolaris.org zdbl.gz Description: Binary data ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
Hi, It definitely seems like a hardware-related issue, as panics from common tools like format aren't to be expected. Anyhow, you might want to start by getting all your disks to show up in iostat / cfgadm before trying to import the pool. You should replace the controller if you have not already done so, and the RAM should be all OK, I guess? Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
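For what it's worth, a pre-import sanity check along those lines might look roughly like this (the commands are only a suggestion; the device names come from the zpool import listing earlier in the thread):

  # cfgadm -al                  # do all SATA/SAS ports show up as connected/configured?
  # iostat -En | grep -i error  # are any devices accumulating hard/soft/transport errors?
  # format < /dev/null          # do c13t0d0-c13t7d0, c8t0d0 and c8t2d0 all enumerate without a panic?
  # zpool import                # list-only: does every vdev in tank still show ONLINE?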
[zfs-discuss] Another MPT issue - kernel crash
Hi all, I have faced yet another kernel panic that seems to be related to the mpt driver. This time I was trying to add a new disk to a running system (snv_134) and the new disk was not being detected. Following a tip, I ran the lsitool to reset the bus and this led to a system panic. MPT driver: BAD TRAP: type=e (#pf Page fault) rp=ff001fc98020 addr=4 occurred in module mpt due to a NULL pointer dereference. If someone has a similar problem it might be worthwhile to report it here or to add information to the filed bug, available at https://defect.opensolaris.org/bz/show_bug.cgi?id=15879 Thanks, Bruno -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [storage-discuss] iscsitgtd failed request to share on zpool import after upgrade from b104 to b134
Przem, Anybody has an idea what I can do about it? zfs set shareiscsi=off vol01/zvol01 zfs set shareiscsi=off vol01/zvol02 Doing this will have no impact on the LUs if configured under COMSTAR. This will also transparently go away with b136, when ZFS ignores the shareiscsi property. - Jim On 04/05/2010 16:43, eXeC001er execoo...@gmail.com wrote: Perhaps the problem is that the old version of pool have shareiscsi, but new version have not this option, and for share LUN via iscsi you need to make lun-mapping. 2010/5/4 Przemyslaw Ceglowski prze...@ceglowski.netmailto:prze...@ceglowski.net Jim, On May 4, 2010, at 3:45 PM, Jim Dunham wrote: On May 4, 2010, at 2:43 PM, Richard Elling wrote: On May 4, 2010, at 5:19 AM, Przemyslaw Ceglowski wrote: It does not look like it is: r...@san01a:/export/home/admin# svcs -a | grep iscsi online May_01 svc:/network/iscsi/initiator:default online May_01 svc:/network/iscsi/target:default This is COMSTAR. Thanks Richard, I am aware of that. Since you upgrade to b134, not b136 the iSCSI Target Daemon is still around, just not on our system. IPS packaging changes have not installed the iSCSI Target Daemon (among other things) by default. It is contained in IPS package known as either SUNWiscsitgt or network/iscsi/target/legacy. Visit your local package repository for updates: http://pkg.opensolaris.org/dev/ Of course starting with build 136..., iSCSI Target Daemon (and ZFS shareiscsi) are gone, so you will need to reconfigure your two ZVOLs 'vol01/zvol01' and 'vol01/zvol02', under COMSTAR soon. http://wikis.sun.com/display/OpenSolarisInfo/How+to+Configure+iSCSI+Target+Po rts http://wikis.sun.com/display/OpenSolarisInfo/COMSTAR+Administration - Jim The migrated zVols have been running under COMSTAR originally on b104 which makes me wonder even more. Is there any way I can get rid of those messages? _ Przem From: Rick McNeal [ramcn...@gmail.commailto:ramcn...@gmail.com] Sent: 04 May 2010 13:14 To: Przemyslaw Ceglowski Subject: Re: [storage-discuss] iscsitgtd failed request to share on zpool import after upgrade from b104 to b134 Look and see if the target daemon service is still enabled. COMSTAR has been the official scsi target project for a while now. In fact, the old iscscitgtd was removed in build 136. For Nexenta, the old iscsi target was removed in 3.0 (based on b134). -- richard It does not answer my original question. -- Przem Rick McNeal On May 4, 2010, at 5:38 AM, Przemyslaw Ceglowski prze...@ceglowski.netmailto:prze...@ceglowski.net wrote: Hi, I am posting my question to both storage-discuss and zfs-discuss as I am not quite sure what is causing the messages I am receiving. I have recently migrated my zfs volume from b104 to b134 and upgraded it from zfs version 14 to 22. It consist of two zvol's 'vol01/zvol01' and 'vol01/zvol02'. During zpool import I am getting a non-zero exit code, however the volume is imported successfuly. Could you please help me to understand what could be the reason of those messages? 
r...@san01a:/export/home/admin#zpool import vol01 r...@san01a:/export/home/admin#cannot share 'vol01/zvol01': iscsitgtd failed request to share r...@san01a:/export/home/admin#cannot share 'vol01/zvol02': iscsitgtd failed request to share Many thanks, Przem ___ storage-discuss mailing list storage-disc...@opensolaris.orgmailto:storage-disc...@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/storage-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.orgmailto:zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- ZFS storage and performance consulting at http://www.RichardElling.com ___ storage-discuss mailing list storage-disc...@opensolaris.orgmailto:storage-disc...@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/storage-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.orgmailto:zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
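If the two zvols do need to be re-exposed over iSCSI under COMSTAR (instead of the legacy iscsitgtd), the usual sequence is roughly as follows; this is only a sketch, and the GUID placeholder has to come from the sbdadm output on the system itself:

  # svcadm enable -r svc:/system/stmf:default
  # svcadm enable -r svc:/network/iscsi/target:default
  # sbdadm create-lu /dev/zvol/rdsk/vol01/zvol01   # prints the GUID of the new logical unit
  # stmfadm add-view <GUID-from-sbdadm>            # default view: visible to all hosts and ports
  # itadm create-target                            # creates an iSCSI target with a generated IQN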
Re: [zfs-discuss] Another MPT issue - kernel crash
On 5/05/10 10:42 PM, Bruno Sousa wrote: Hi all, I have faced yet another kernel panic that seems to be related to mpt driver. This time i was trying to add a new disk to a running system (snv_134) and this new disk was not being detected...following a tip i ran the lsitool to reset the bus and this lead to a system panic. MPT driver : BAD TRAP: type=e (#pf Page fault) rp=ff001fc98020 addr=4 occurred in module mpt due to a NULL pointer dereference If someone has a similar problem it might be worthwhile to expose it here or to add information to the filled bug , available at https://defect.opensolaris.org/bz/show_bug.cgi?id=15879 That's an already-known CR, tracked in Bugster. I've updated defect.o.o and transferred your info to the Bugster CR, 6895862. Until the nightly inside-outside bugs.o.o sync up it'll still show up as closed, but don't worry, I've re-opened it. James C. McPherson -- Senior Software Engineer, Solaris Oracle http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Another MPT issue - kernel crash
Hi James, Thanks for the information, and if there's any test/command to be done on this server, just let me know it. Regards, Bruno On 5-5-2010 15:38, James C. McPherson wrote: On 5/05/10 10:42 PM, Bruno Sousa wrote: Hi all, I have faced yet another kernel panic that seems to be related to mpt driver. This time i was trying to add a new disk to a running system (snv_134) and this new disk was not being detected...following a tip i ran the lsitool to reset the bus and this lead to a system panic. MPT driver : BAD TRAP: type=e (#pf Page fault) rp=ff001fc98020 addr=4 occurred in module mpt due to a NULL pointer dereference If someone has a similar problem it might be worthwhile to expose it here or to add information to the filled bug , available at https://defect.opensolaris.org/bz/show_bug.cgi?id=15879 That's an already-known CR, tracked in Bugster. I've updated defect.o.o and transferred your info to the Bugster CR, 6895862. Until the nightly inside-outside bugs.o.o sync up it'll still show up as closed, but don't worry, I've re-opened it. James C. McPherson -- Senior Software Engineer, Solaris Oracle http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
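If a crash dump was saved, the panic stack and message buffer can be pulled out of it with mdb and attached to the CR; the paths below assume the default /var/crash/<hostname> location and dump number 0:

  # cd /var/crash/`hostname`
  # mdb unix.0 vmcore.0
  > ::status   # panic string and dump details
  > ::stack    # stack of the panicking thread (should show the mpt frame)
  > ::msgbuf   # console messages leading up to the panic
  > $q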
[zfs-discuss] Using local raw disks for an opensolaris b134 virtualized host under ESXi 4
Hi all, I would like to install a virtual SAN using OpenSolaris b134 under an ESXi 4 host. Instead of using VMFS datastores I would like to use local raw disks on the ESXi 4 host: http://www.mattiasholm.com/node/33. Has anybody tried this? Are there any problems in doing so? Or is it better to use VMFS than raw local disks? Thanks. -- CL Martinez carlopmart {at} gmail {d0t} com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [indiana-discuss] image-update doesn't work anymore (bootfs not supported on EFI)
On 5/5/10 1:44 AM, Christian Thalinger wrote: On Tue, 2010-05-04 at 16:19 -0600, Evan Layton wrote: Can you try the following and see if it really thinks it's an EFI lable? # dd if=/dev/dsk/c12t0d0s2 of=x skip=512 bs=1 count=10 # cat x This may help us determine if this is another instance of bug 6860320 # dd if=/dev/dsk/c12t0d0s2 of=x skip=512 bs=1 count=10 10+0 records in 10+0 records out 10 bytes (10 B) copied, 0.0259365 s, 0.4 kB/s # cat x # od x 000 00 00 00 00 00 012 # Doesn't look like an EFI label. No that doesn't appear like an EFI label. So it appears that ZFS is seeing something there that it's interpreting as an EFI label. Then the command to set the bootfs property is failing due to that. To restate the problem the BE can't be activated because we can't set the bootfs property of the root pool and even the ZFS command to set it fails with property 'bootfs' not supported on EFI labeled devices for example the following command: # zfs set bootfs=rpool/ROOT/opensolaris rpool fails with that same error message. Do you have any of the older BEs like build 134 that you can boot back to and see if those will allow you to set the bootfs property on the root pool? It's just really strange that out of nowhere it started thinking that the device is EFI labeled. I'm including zfs-discuss to get the ZFS folks thoughts on the issue. -evan More info from original thread: On 05/ 4/10 10:45 AM, Christian Thalinger wrote: On Tue, 2010-05-04 at 10:36 -0500, Shawn Walker wrote: What confuses me is that the update from b133 to b134 obviously worked before--because I have a b134 image--but it doesn't now. I'm on b135 myself and haven't seen this issue yet. I can't think of anything I did that changed anything on the disk or the partition table, whatever that could be. Or is this because I tried to install b137 and that changed something? What does your partition layout look like? Not sure how I can print the partition to show what you want to see. Maybe this: format current Current Disk = c12t0d0 DEFAULT cyl 38910 alt 2 hd 255 sec 63 /p...@0,0/pci10de,c...@b/d...@0,0 format verify Primary label contents: Volume name = ascii name =DEFAULT cyl 38910 alt 2 hd 255 sec 63 pcyl = 38912 ncyl = 38910 acyl = 2 bcyl = 0 nhead = 255 nsect = 63 Part Tag Flag Cylinders Size Blocks 0 root wm 1 - 38909 298.06GB (38909/0/0) 625073085 1 unassigned wm 0 0 (0/0/0) 0 2 backup wu 0 - 38909 298.07GB (38910/0/0) 625089150 3 unassigned wm 0 0 (0/0/0) 0 4 unassigned wm 0 0 (0/0/0) 0 5 unassigned wm 0 0 (0/0/0) 0 6 unassigned wm 0 0 (0/0/0) 0 7 unassigned wm 0 0 (0/0/0) 0 8 boot wu 0 - 0 7.84MB (1/0/0) 16065 9 unassigned wm 0 0 (0/0/0) 0 format How are you booting the system? (rEFIt?) No, I just installed OpenSolaris. Ah, only OS then? ... Only bug I see possibly related is 6929493 (in the sense that changes for the bug may have triggered this issue possibly). A few days ago I noticed that the new boot environment is actually there and can be booted despite the ZFS error. I installed b138 today and it works, but I get this error on updating. So, there are some ZFS bugs that seem related, although some of them are supposedly already fixed and I'm not certain that others relate: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6740164 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6860320 Did you recently attach or import any zpools? Also, when you originally installed the OS, did you completely erase the drive before installing? 
I've run into problems in the past where fixes or changes have caused the OS to check partition headers and other areas for signatures that were leftover by other disk utilities and gave me grief. So, to be clear, sometime after you updated to b134, you could no longer update to any other builds because it gave you a message like this? be_get_uuid: failed to get uuid property from BE root dataset user ... set_bootfs: failed to set bootfs property for pool rpool: property 'bootfs' not supported on EFI labeled devices be_activate: failed to set bootfs pool property for rpool/ROOT/opensolaris-135 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
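A couple of additional checks that might help narrow down where the phantom EFI label is coming from; the device name is the one from the thread, and the commands are only suggestions:

  # prtvtoc /dev/rdsk/c12t0d0s2                           # a normal SMI/VTOC label should print cleanly
  # zdb -l /dev/rdsk/c12t0d0s0 | egrep 'path|whole_disk'  # does the ZFS label claim the whole disk?
  # zpool set bootfs=rpool/ROOT/opensolaris-135 rpool     # retry from an older BE, as suggested above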
Re: [zfs-discuss] b134 pool borked!
Thanks for your reply! I ran memtest86 and it did not report any errors. The disk controller I have not replaced yet. The server is up in multi-user mode with the broken pool in an un-imported state. format now works and properly lists all my devices without panicking. zpool import poolname panics the box with the same stack trace as above. Could it still be the disk controller? I'd jump through the roof with happiness if that's the case. It's one of those Supermicro thumper controllers. Does anyone know any good non-destructive diagnostics to run? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
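Some non-destructive checks that are often used to implicate (or clear) a controller, offered only as a starting point:

  # fmdump -eV | more   # any PCI/transport error telemetry logged around the time of the panics?
  # fmadm faulty        # has FMA already diagnosed a fault against the HBA or a disk?
  # iostat -En          # per-device hard/soft/transport error counters
  # cfgadm -al          # do all targets behind the suspect controller still enumerate?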
Re: [zfs-discuss] [indiana-discuss] image-update doesn't work anymore (bootfs not supported on EFI)
On 5/5/10 10:22 AM, Christian Thalinger wrote: On Wed, 2010-05-05 at 09:45 -0600, Evan Layton wrote: No that doesn't appear like an EFI label. So it appears that ZFS is seeing something there that it's interpreting as an EFI label. Then the command to set the bootfs property is failing due to that. To restate the problem the BE can't be activated because we can't set the bootfs property of the root pool and even the ZFS command to set it fails with property 'bootfs' not supported on EFI labeled devices for example the following command: # zfs set bootfs=rpool/ROOT/opensolaris rpool fails with that same error message. I guess you mean zpool, but yes: Yes that's what I meant (I hate when my fingers betray me like that) ;-) # zpool set bootfs=rpool/ROOT/opensolaris-138 rpool cannot set property for 'rpool': property 'bootfs' not supported on EFI labeled devices Do you have any of the older BEs like build 134 that you can boot back to and see if those will allow you to set the bootfs property on the root pool? It's just really strange that out of nowhere it started thinking that the device is EFI labeled. I have a couple of BEs I could boot to: $ beadm list BE Active Mountpoint Space Policy Created -- -- -- - -- --- opensolaris - - 1.00G static 2009-10-01 08:00 opensolaris-124 - - 20.95M static 2009-10-03 13:30 opensolaris-125 - - 30.00M static 2009-10-17 15:18 opensolaris-126 - - 25.33M static 2009-10-29 20:18 opensolaris-127 - - 1.37G static 2009-11-14 13:20 opensolaris-128 - - 1.91G static 2009-12-04 14:28 opensolaris-129 - - 22.49M static 2009-12-12 11:31 opensolaris-130 - - 21.64M static 2009-12-26 19:46 opensolaris-131 - - 24.72M static 2010-01-22 22:51 opensolaris-132 - - 57.32M static 2010-02-09 23:05 opensolaris-133 - - 1.07G static 2010-02-20 12:55 opensolaris-134 N / 43.17G static 2010-03-08 21:58 opensolaris-138 R - 1.81G static 2010-05-04 12:03 I will try on 132 or 133. Get back to you later. Thanks! ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On May 4, 2010, at 7:55 AM, Bob Friesenhahn wrote: On Mon, 3 May 2010, Richard Elling wrote: This is not a problem on Solaris 10. It can affect OpenSolaris, though. That's precisely the opposite of what I thought. Care to explain? In Solaris 10, you are stuck with LiveUpgrade, so the root pool is not shared with other boot environments. Richard, You have fallen out of touch with Solaris 10, which is still a moving target. While the Live Upgrade commands you are familiar with in Solaris 10 still mostly work as before, they *do* take advantage of zfs's features and boot environments do share the same root pool just like in OpenSolaris. Solaris 10 Live Upgrade is dramatically improved in conjunction with zfs boot. I am not sure how far behind it is from OpenSolaris new boot administration tools but under zfs its function can not be terribly different. Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( -- richard -- ZFS storage and performance consulting at http://www.RichardElling.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
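For reference, with a ZFS root the Solaris 10 Live Upgrade workflow really is just a clone inside the same root pool, roughly (the BE name and media path are examples only):

  # lucreate -n s10-patched                        # clones the current BE within rpool
  # luupgrade -u -n s10-patched -s /path/to/media  # or simply patch the new BE instead
  # luactivate s10-patched
  # init 6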
[zfs-discuss] zfs destroy -f and dataset is busy?
We have a pair of OpenSolaris systems running snv_124. Our main zpool 'z' is running ZFS pool version 18. Problem: #zfs destroy -f z/Users/harri...@zfs-auto-snap:daily-2010-04-09-00:00 cannot destroy 'z/Users/harri...@zfs-auto-snap:daily-2010-04-09-00:00': dataset is busy I have tried unmounting the filesystem and remounting it (same problem) and exporting and importing the zpool; I am unable to destroy numerous datasets even with the -f option. The zpool scrub completes without errors: 12:51pm taurus/harrison [~] 182#zpool status z pool: z state: ONLINE scrub: scrub completed after 17h32m with 0 errors on Sun May 2 18:47:38 2010 config: NAME STATE READ WRITE CKSUM z ONLINE 0 0 0 raidz2 ONLINE 0 0 0 Any suggestions would be greatly appreciated. Thanks in advance, -C ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
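One thing worth checking on snv_124 (pool version 18 supports user holds) is whether the snapshot is pinned by a hold or by a clone; the snapshot name below is abbreviated the same way as above, so substitute the full name:

  # zfs holds z/Users/<user>@zfs-auto-snap:daily-2010-04-09-00:00           # any user holds on it?
  # zfs release <tag> z/Users/<user>@zfs-auto-snap:daily-2010-04-09-00:00   # if so, release them first
  # zfs list -t filesystem,volume -o name,origin | grep daily-2010-04-09    # is a clone descended from it?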
[zfs-discuss] Does Opensolaris support thin reclamation?
Support for thin reclamation depends on the SCSI WRITE SAME command; see this draft of a document from T10: http://www.t10.org/ftp/t10/document.05/05-270r0.pdf. I spent some time searching the source code for support for WRITE SAME, but I wasn't able to find much. I assume that if it was supported, it would be listed in this header file: http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/sys/scsi/generic/commands.h Does anyone know for certain whether Opensolaris supports thin reclamation on thinly-provisioned LUNs? If not, is anyone interested in or actively working on this? I'm especially interested in ZFS' support for thin reclamation, but I would be interested in hearing about support (or lack of) for UFS and SVM as well. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZIL behavior on import
All, I had a question regarding how the ZIL interacts with zpool import: Given that the intent log is replayed in the event of a system failure, does the replay behavior differ if -f is passed to zpool import? For example, if I have a system which fails prior to completing a series of writes and I reboot using a failsafe (i.e. install disc), will the log be replayed after a zpool import -f ? Regards, Steve ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZIL behavior on import
On 05/05/2010 20:45, Steven Stallion wrote: All, I had a question regarding how the ZIL interacts with zpool import: Given that the intent log is replayed in the event of a system failure, does the replay behavior differ if -f is passed to zpool import? For example, if I have a system which fails prior to completing a series of writes and I reboot using a failsafe (i.e. install disc), will the log be replayed after a zpool import -f ? yes -- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] How to completely eradicate ZFS
Stupid question time. I have a CF card on which I placed a ZFS volume. Now I want to put a UFS volume on it instead, but I cannot seem to get rid of the ZFS information on the drive. I have tried clearing and recreating the partition table with fdisk. I have tried clearing the labels and VTOC, but when I put the Solaris partition on the disk again the ZFS information seemingly reappears and the system complains that it cannot mount the ZFS rpool. Any help would be appreciated. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Matt Keenan Just wondering whether mirroring a USB drive with main laptop disk for backup purposes is recommended or not. Plan would be to connect the USB drive, once or twice a week, let it resilver, and then disconnect again. Connecting USB drive 24/7 would AFAIK have performance issues for the Laptop. MMmmm... If it works, sounds good. But I don't think it'll work as expected, for a number of reasons, outlined below. The suggestion I would have instead, would be to make the external drive its own separate zpool, and then you can incrementally zfs send | zfs receive onto the external. Here are the obstacles I think you'll have with your proposed solution: #1 I think all the entire used portion of the filesystem needs to resilver every time. I don't think there's any such thing as an incremental resilver. #2 How would you plan to disconnect the drive? If you zpool detach it, I think it's no longer a mirror, and not mountable. If you simply yank out the plug ... although that might work, it would certainly be nonideal. If you power off, disconnect, and power on ... Again, it should probably be fine, but it's not designed to be used that way intentionally, so your results ... are probably as-yet untested. I don't want to go on. This list could go on forever. I will strongly encourage you to simply use zfs send | zfs receive because that's a standard practice thing to do. It is known that the external drive is not bootable this way, but that's why you have this article on how to make it bootable: http://docs.sun.com/app/docs/doc/819-5461/ghzur?l=ena=view This would have the added benefit of the USB drive being bootable. By default, AFAIK, that's not correct. When you mirror rpool to another device, by default the 2nd device is not bootable, because it's just got an rpool in there. No boot loader. Even if you do this mirror idea, which I believe will be slower and less reliable than zfs send | zfs receive you still haven't gained anything as compared to the zfs send | zfs receive procedure, which is known to work reliable with optimal performance. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
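A minimal sketch of that send/receive workflow, assuming the external disk carries its own pool (pool, device and snapshot names are invented):

  # zpool create usbbackup c9t0d0                                # one-time setup of the external pool
  # zfs snapshot -r rpool@weekly-1
  # zfs send -R rpool@weekly-1 | zfs receive -Fdu usbbackup
  (a week later, only the changes need to travel)
  # zfs snapshot -r rpool@weekly-2
  # zfs send -R -I rpool@weekly-1 rpool@weekly-2 | zfs receive -Fdu usbbackup
  # zpool export usbbackup                                       # before unplugging the drive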
Re: [zfs-discuss] How to completely eradicate ZFS
It probably put an EFI label on the disk. Try wiping the first AND last 2MB. --M -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of nich romero Sent: Wednesday, May 05, 2010 1:00 PM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] How to completely erradicate ZFS Stupid question time. I have a CF Card that I place a ZFS volume. Now I want to put a UFS volume on it instead but I can not seem to get ride of the ZFS information on the drive. I have tried clearing and recreating the Partition Table with fdisk. I have tried clearing the labels and VTOC but when I put the Solaris partition on the disk again the ZFS information seeming reapears and the system complains that is cannot mount ZFS rpool. Any help would be appreciated. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
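A sketch of that wipe; the device is the whole-disk node and SECTORS is whatever total the label reports (both values below are examples only, so substitute the real device and the size reported by format or prtvtoc):

  # DISK=/dev/rdsk/c7t0d0p0
  # SECTORS=15622144                                                          # example value only
  # dd if=/dev/zero of=$DISK bs=512 count=4096                                # first 2 MB
  # dd if=/dev/zero of=$DISK bs=512 seek=`expr $SECTORS - 4096` count=4096    # last 2 MB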
Re: [zfs-discuss] How to completely eradicate ZFS
On Wed, May 5, 2010 at 1:36 PM, Matt Cowger mcow...@salesforce.com wrote: It probably put an EFI label on the disk. Try doing a wiping the first AND last 2MB. If nothing else works, the following should definitely do it: dd if=/dev/zero of=/dev/whatever bs=1M That will write zeroes to every bit of the drive, start to finish. You can play around with the block size (bs). -- Freddie Cash fjwc...@gmail.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] b134 pool borked!
I got a suggestion to check what fmdump -eV gave to look for PCI errors if the controller might be broken. Attached you'll find the last panic's fmdump -eV. It indicates that ZFS can't open the drives. That might suggest a broken controller, but my slog is on the motherboard's internal controller. One might think that the motherboard itself might be toast or do we have a case of unstable power? -- This message posted from opensolaris.orgMay 04 2010 19:44:31.716566239 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x645834a4c69584e5 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x645834a4c69584e5 vdev_type = disk vdev_path = /dev/dsk/c13t1d0s0 vdev_devid = id1,s...@sata_wdc_wd5001aals-0_wd-wmasy3260051/a parent_guid = 0x6041a7903a345374 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5eedf May 04 2010 19:44:31.716565705 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x928ecd01b281b313 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x928ecd01b281b313 vdev_type = disk vdev_path = /dev/dsk/c13t2d0s0 vdev_devid = id1,s...@sata_samsung_hd103si___s1vsj90sc22634/a parent_guid = 0x6041a7903a345374 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5ecc9 May 04 2010 19:44:31.716565713 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0xc6c893601f1263cb (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0xc6c893601f1263cb vdev_type = disk vdev_path = /dev/dsk/c8t0d0s0 vdev_devid = id1,s...@sata_intel_ssdsa2m080__cvpo003401vt080bgn/a parent_guid = 0x97541c1ea1ad833e parent_type = root prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5ecd1 May 04 2010 19:44:31.716566468 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x381e0480469b4ed7 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x381e0480469b4ed7 vdev_type = disk vdev_path = /dev/dsk/c13t3d0s0 vdev_devid = id1,s...@sata_samsung_hd103si___s1vsj90sc22045/a parent_guid = 0x6041a7903a345374 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5efc4 May 04 2010 19:44:31.716566182 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x6e5ce9b416a3f8a4 (end detector) pool = tank pool_guid = 0x97541c1ea1ad833e pool_context = 1 pool_failmode = wait vdev_guid = 0x6e5ce9b416a3f8a4 vdev_type = disk vdev_path = /dev/dsk/c13t6d0s0 vdev_devid = id1,s...@sata_wdc_wd6400aacs-0_wd-wcauf0934679/a parent_guid = 0x4491e617ebc26c75 parent_type = raidz prev_state = 0x1 __ttl = 0x1 __tod = 0x4be05cff 0x2ab5eea6 May 04 2010 
19:44:31.716565740 ereport.fs.zfs.vdev.open_failed nvlist version: 0 class = ereport.fs.zfs.vdev.open_failed ena = 0xeeed67dca00c01 detector = (embedded nvlist) nvlist version: 0 version = 0x0 scheme = zfs pool = 0x97541c1ea1ad833e vdev = 0x69f0986c92adda53
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
* Edward Ned Harvey (solar...@nedharvey.com) wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Matt Keenan Just wondering whether mirroring a USB drive with main laptop disk for backup purposes is recommended or not. Plan would be to connect the USB drive, once or twice a week, let it resilver, and then disconnect again. Connecting USB drive 24/7 would AFAIK have performance issues for the Laptop. MMmmm... If it works, sounds good. But I don't think it'll work as expected, for a number of reasons, outlined below. It used to work for James Gosling. http://blogs.sun.com/jag/entry/solaris_and_os_x [snip] This would have the added benefit of the USB drive being bootable. By default, AFAIK, that's not correct. When you mirror rpool to another device, by default the 2nd device is not bootable, because it's just got an rpool in there. No boot loader. That's true, but easily fixed (just like for any other mirrored pool configuration). installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/{disk} Even if you do this mirror idea, which I believe will be slower and less reliable than zfs send | zfs receive you still haven't gained anything as compared to the zfs send | zfs receive procedure, which is known to work reliable with optimal performance. How about ease-of-use, all you have to do is plug in the usb disk and zfs will 'do the right thing'. You don't have to remember to run zfs send | zfs receive, or bother with figuring out what to send/recv etc etc etc. Cheers, -- Glenn ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On 05/ 6/10 05:32 AM, Richard Elling wrote: On May 4, 2010, at 7:55 AM, Bob Friesenhahn wrote: On Mon, 3 May 2010, Richard Elling wrote: This is not a problem on Solaris 10. It can affect OpenSolaris, though. That's precisely the opposite of what I thought. Care to explain? In Solaris 10, you are stuck with LiveUpgrade, so the root pool is not shared with other boot environments. Richard, You have fallen out of touch with Solaris 10, which is still a moving target. While the Live Upgrade commands you are familiar with in Solaris 10 still mostly work as before, they *do* take advantage of zfs's features and boot environments do share the same root pool just like in OpenSolaris. Solaris 10 Live Upgrade is dramatically improved in conjunction with zfs boot. I am not sure how far behind it is from OpenSolaris new boot administration tools but under zfs its function can not be terribly different. Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( ZFS boot? -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
Glenn Lagasse wrote: How about ease-of-use, all you have to do is plug in the usb disk and zfs will 'do the right thing'. You don't have to remember to run zfs send | zfs receive, or bother with figuring out what to send/recv etc etc etc. It should be possible to automate that via syseventd/syseventconfd. Sadly the documentation is a bit... um... sparse... -- Carson ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
On Wed, May 05, 2010 at 04:34:13PM -0400, Edward Ned Harvey wrote: The suggestion I would have instead, would be to make the external drive its own separate zpool, and then you can incrementally zfs send | zfs receive onto the external. I'd suggest doing both, to different destinations :) Each kind of backup serves different, complementary purposes. #1 I think all the entire used portion of the filesystem needs to resilver every time. I don't think there's any such thing as an incremental resilver. Incorrect. It will play forward all the (still-live) blocks from txg's between the time it was last online and now. That said, I'd also recommend a scrub on a regular basis, once the resilver has completed, and that will trawl through all the data and take all that time you were worried about anyway. For a 200G disk, full, over usb, I'd expect around 4-5 hours. That's fine for a leave running overnight workflow. This is the benefit of this kind of backup - as well as being almost brainless to initiate, it's able to automatically repair marginal sectors on the laptop disk if they become unreadable, saving you from the hassle of trying to restore damaged files. The send|recv kind of backup is much better for restoring data from old snapshots (if the target is larger than the source and keeps them longer), and recovering from accidentally destroying both mirrored copies of data due to operator error. #2 How would you plan to disconnect the drive? If you zpool detach it, I think it's no longer a mirror, and not mountable. That's correct - which is why you would use zpool offline. -- Dan. pgpgbQjfYhj6R.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
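Putting the pieces of this thread together, the mirror-based variant would look roughly like this (device names are placeholders):

  # zpool attach rpool c0t0d0s0 c5t0d0s0      # one-time: add the USB disk as a mirror of the laptop disk
  # installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c5t0d0s0
  (before unplugging the drive)
  # zpool offline rpool c5t0d0s0
  (after reconnecting it; the incremental resilver of the intervening txgs runs automatically)
  # zpool online rpool c5t0d0s0
  # zpool scrub rpool                         # periodically, as recommended above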
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
Hi Euan, You might find some of this useful: http://breden.org.uk/2009/08/29/home-fileserver-mirrored-ssd-zfs-root-boot/ http://breden.org.uk/2009/08/30/home-fileserver-zfs-boot-pool-recovery/ I backed up the rpool to a single file, which I believe is frowned upon due to the consequences of an error occurring within the sent stream, but sending to a filesystem instead will fix this aspect, and you may still find the rest of it useful. Cheers, Simon -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to completely eradicate ZFS
On May 5, 2010, at 12:59 PM, nich romero wrote: Stupid question time. I have a CF Card that I place a ZFS volume. Now I want to put a UFS volume on it instead but I can not seem to get ride of the ZFS information on the drive. I have tried clearing and recreating the Partition Table with fdisk. I have tried clearing the labels and VTOC but when I put the Solaris partition on the disk again the ZFS information seeming reapears and the system complains that is cannot mount ZFS rpool. Any help would be appreciated. The system won't care unless it is expected to import rpool. Use zdb -C and see if the cache expects to import the pool. If so, export it. If not, please show the exact error message you see. -- richard -- ZFS storage and performance consulting at http://www.RichardElling.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
On Wed, 5 May 2010, Edward Ned Harvey wrote: Here are the obstacles I think you'll have with your proposed solution: #1 I think all the entire used portion of the filesystem needs to resilver every time. I don't think there's any such thing as an incremental resilver. It sounds like you are not sure. Maybe you should be sure. Yes, I do think that it is a wise idea if you are really sure. See Transactional pruning at http://blogs.sun.com/bonwick/entry/smokin_mirrors and then Top-down resilvering. This would have the added benefit of the USB drive being bootable. By default, AFAIK, that's not correct. When you mirror rpool to another device, by default the 2nd device is not bootable, because it's just got an rpool in there. No boot loader. Unless it was added at install time, or the user added a boot loader. It is quite doable since it is the normal case as when a system is installed onto a mirror pair of disks. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Mirroring USB Drive with Laptop for Backup purposes
On Thu, 6 May 2010, Daniel Carosone wrote: That said, I'd also recommend a scrub on a regular basis, once the resilver has completed, and that will trawl through all the data and take all that time you were worried about anyway. For a 200G disk, full, over usb, I'd expect around 4-5 hours. That's fine for a leave running overnight workflow. When I have simply powered down a mirror disk in a USB-based mirrored pair, I have noticed that it seems that zfs must be doing its own little secret scrub of the restored disk without me requesting it even though 'zpool status' does not mention it and it says the disk is resilvered. The flashing lights annoyed me so I exported and imported the pool and then the flashing lights were gone. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Thu, 6 May 2010, Ian Collins wrote: Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( ZFS boot? I think that Richard is referring to the fact that the PowerPC/Cell Solaris 10 port for the Sony Playstation III never emerged. ;-) Other than desktop features, as a Solaris 10 user I have seen OpenSolaris kernel features continually percolate down to Solaris 10 so I don't feel as left out as Richard would like me to feel. From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, May 05, 2010 at 04:31:08PM -0700, Bob Friesenhahn wrote: On Thu, 6 May 2010, Ian Collins wrote: Bob and Ian are right. I was trying to remember the last time I installed Solaris 10, and the best I can recall, it was around late fall 2007. The fine folks at Oracle have been making improvements to the product since then, even though no new significant features have been added since that time :-( ZFS boot? I think that Richard is referring to the fact that the PowerPC/Cell Solaris 10 port for the Sony Playstation III never emerged. ;-) Other than desktop features, as a Solaris 10 user I have seen OpenSolaris kernel features continually percolate down to Solaris 10 so I don't feel as left out as Richard would like me to feel. From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Bob Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Different devices with the same name in zpool status
I know for certain that my rpool and tank pool are not both using c6t0d0 and c6t1d0, but that's what zpool status is showing. It appears to be an output bug, or a problem with the zpool.cache, since format shows my rpool devices at c8t0d0 and c8t1d0. What's the right way to fix this? Do nothing? boot -r? Remove /etc/zfs/zpool.cache? Edit or remove /etc/path_to_inst and let boot-r fix it? -B bh...@basestar:~$ zpool status pool: rpool state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM rpool ONLINE 0 0 0 mirror-0ONLINE 0 0 0 c6t0d0s0 ONLINE 0 0 0 c6t1d0s0 ONLINE 0 0 0 errors: No known data errors pool: tank state: ONLINE scrub: scrub completed after 6h35m with 0 errors on Tue May 4 16:29:46 2010 config: NAMESTATE READ WRITE CKSUM tankONLINE 0 0 0 raidz2-0 ONLINE 0 0 0 c6t0d0 ONLINE 0 0 0 c6t1d0 ONLINE 0 0 0 c6t2d0 ONLINE 0 0 0 c6t3d0 ONLINE 0 0 0 c6t4d0 ONLINE 0 0 0 c6t5d0 ONLINE 0 0 0 c6t6d0 ONLINE 0 0 0 c6t7d0 ONLINE 0 0 0 logs c7t0d0s0 ONLINE 0 0 0 cache c7t0d0s1 ONLINE 0 0 0 bh...@basestar:~$ pfexec format -e /dev/null Searching for disks...done AVAILABLE DISK SELECTIONS: 0. c6t0d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@0,0 1. c6t1d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@1,0 2. c6t2d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@2,0 3. c6t3d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@3,0 4. c6t4d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@4,0 5. c6t5d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@5,0 6. c6t6d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@6,0 7. c6t7d0 ATA-WDC WD10EADS-00L-1A01-931.51GB /p...@0,0/pci10de,3...@a/pci8086,3...@0/pci11ab,1...@4/d...@7,0 8. c7t0d0 DEFAULT cyl 3889 alt 2 hd 255 sec 63 /p...@0,0/pci1043,8...@5,2/d...@0,0 9. c8t0d0 ATA-WDCWD1200BEVT-0-1A01 cyl 14590 alt 2 hd 255 sec 63 /p...@0,0/pci1043,8...@5,1/d...@0,0 10. c8t1d0 ATA-WDCWD1200BEVT-0-1A01 cyl 14590 alt 2 hd 255 sec 63 /p...@0,0/pci1043,8...@5,1/d...@1,0 -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Different devices with the same name in zpool status
On 05/ 6/10 11:48 AM, Brandon High wrote: I know for certain that my rpool and tank pool are not both using c6t0d0 and c6t1d0, but that's what zpool status is showing. It appears to be an output bug, or a problem with the zpool.cache, since format shows my rpool devices at c8t0d0 and c8t1d0. Have you hot swapped any drives? I had a similar oddity after swapping drives and running cfgadm. What's the right way to fix this? Do nothing? boot -r? Remove /etc/zfs/zpool.cache? Edit or remove /etc/path_to_inst and let boot-r fix it? After the system rebooted, the drives all matched up. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
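If a reboot is inconvenient, exporting and re-importing the data pool is sometimes enough to get the stale device paths in zpool.cache rewritten; this is a guess rather than something verified against this exact case:

  # zpool export tank
  # zpool import -d /dev/dsk tank   # rescans /dev/dsk and records the current paths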
Re: [zfs-discuss] How to completely eradicate ZFS
You are right; the system does not really care that it can not mount it automatically but it still tries since it sees the zpool data. [b]pfexec zdb -l /dev/rdsk/c7t0d0s2[/b] LABEL 0 failed to unpack label 0 LABEL 1 failed to unpack label 1 LABEL 2 version: 19 name: 'rpool' state: 0 txg: 604 pool_guid: 15191080926808974889 hostid: 215494 hostname: '' top_guid: 10231941211973911013 guid: 10231941211973911013 vdev_children: 1 vdev_tree: type: 'disk' id: 0 guid: 10231941211973911013 path: '/dev/dsk/c4t0d0s0' devid: 'id1,s...@ast68022cf=4nx017qk/a' phys_path: '/p...@0,0/pci10f1,2...@5/d...@0,0:a' whole_disk: 0 metaslab_array: 23 metaslab_shift: 26 ashift: 9 asize: 7985430528 is_log: 0 create_txg: 4 LABEL 3 version: 19 name: 'rpool' state: 0 txg: 604 pool_guid: 15191080926808974889 hostid: 215494 hostname: '' top_guid: 10231941211973911013 guid: 10231941211973911013 vdev_children: 1 vdev_tree: type: 'disk' id: 0 guid: 10231941211973911013 path: '/dev/dsk/c4t0d0s0' devid: 'id1,s...@ast68022cf=4nx017qk/a' phys_path: '/p...@0,0/pci10f1,2...@5/d...@0,0:a' whole_disk: 0 metaslab_array: 23 metaslab_shift: 26 ashift: 9 asize: 7985430528 is_log: 0 create_txg: 4 What I finally ended up doing was dd'ing the the disk: [b]prtvtoc /dev/rdsk/c7t0d0s2[/b] * /dev/rdsk/c7t0d0s2 partition map * * Dimensions: * 512 bytes/sector * 32 sectors/track * 128 tracks/cylinder *4096 sectors/cylinder *3813 cylinders *3811 accessible cylinders * * Flags: * 1: unmountable * 10: read-only * * Unallocated space: * First SectorLast * Sector CountSector *4096 15605760 15609855 * * First SectorLast * Partition Tag FlagsSector CountSector Mount Directory 2 501 0 15609856 15609855 8 101 0 4096 4095 [b]pfexec dd if=/dev/zero of=/dev/rdsk/c7t0d0p0 bs=512 count=8192 pfexec dd if=/dev/zero of=/dev/rdsk/c7t0d0p0 bs=512 seek=15613952 count=8192 pfexec fdisk -B /dev/rdsk/c7t0d0p0[/b] [b]pfexec newfs -v /dev/dsk/c7t0d0s2[/b] newfs: construct a new file system /dev/rdsk/c7t0d0s2: (y/n)? y pfexec mkfs -F ufs /dev/rdsk/c7t0d0s2 15609856 32 -1 8192 1024 224 1 1056 8192 t 0 -1 8 8 n mkfs: bad value for rps: 1056 must be between 1 and 1000 mkfs: rps reset to default 60 Warning: 2048 sector(s) in last cylinder unallocated /dev/rdsk/c7t0d0s2: 15609856 sectors in 2541 cylinders of 48 tracks, 128 sectors 7622.0MB in 159 cyl groups (16 c/g, 48.00MB/g, 5824 i/g) super-block backups (for fsck -F ufs -o b=#) at: 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920, The only real reason I am doing this anyway it to experiment with the R/W speeds and comparing PCFS (FAT32), UFS and ZFS on the removable media. Apparently the slow PCFS speeds are not going to be fixed any time soon and copying 8G files to a CF was becoming tedious. Just switching to UFS took me from 1.3MB/s to 6.9MB/s on an old microdrive. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, 5 May 2010, Ray Van Dolson wrote: From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) While the development versions of OpenSolaris are clearly well beyond Solaris 10, I don't believe that the supported version of OpenSolaris (a year old already) has this feature yet either and Solaris 10 has been released several times since then already. When the forthcoming OpenSolaris release emerges in 2011, the situation will be far different. Solaris 10 can then play catch-up with the release of U9 in 2012. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, 2010-05-05 at 19:03 -0500, Bob Friesenhahn wrote: On Wed, 5 May 2010, Ray Van Dolson wrote: From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) While the development versions of OpenSolaris are clearly well beyond Solaris 10, I don't believe that the supported version of OpenSolaris (a year old already) has this feature yet either and Solaris 10 has been released several times since then already. When the forthcoming OpenSolaris release emerges in 2011, the situation will be far different. Solaris 10 can then play catch-up with the release of U9 in 2012. Bob Pessimist. ;-) s/2011/2010/ s/2012/2011/ -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
On Wed, May 05, 2010 at 05:09:40PM -0700, Erik Trimble wrote: On Wed, 2010-05-05 at 19:03 -0500, Bob Friesenhahn wrote: On Wed, 5 May 2010, Ray Van Dolson wrote: From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) While the development versions of OpenSolaris are clearly well beyond Solaris 10, I don't believe that the supported version of OpenSolaris (a year old already) has this feature yet either and Solaris 10 has been released several times since then already. When the forthcoming OpenSolaris release emerges in 2011, the situation will be far different. Solaris 10 can then play catch-up with the release of U9 in 2012. Bob Pessimist. ;-) s/2011/2010/ s/2012/2011/ Yeah, U9 in 2012 makes me very sad. I would really love to see the hot-removable ZIL's this year. Otherwise I'll need to rebuild a few zpools :) Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Different devices with the same name in zpool status
On Wed, May 5, 2010 at 5:00 PM, Ian Collins i...@ianshome.com wrote: Have you hot swapped any drives? I had a similar oddity after swapping drives and running cfgadm. No hot-swapping. I'd imported and exported both pools from a LiveCD environment, but I'd also rebooted at least twice since then. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] why both dedup and compression?
I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for your patience and answers. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] why both dedup and compression?
Dedup came much later than compression. Also, compression saves both space and therefore load time even when there's only one copy. It is especially good for e.g. HTML or man page documentation which tends to compress very well (versus binary formats like images or MP3s that don't). It gives me an extra, say, 10g on my laptop's 80g SSD which isn't bad. Alex Sent from my (new) iPhone On 6 May 2010, at 02:06, Richard Jahnel rich...@ellipseinc.com wrote: I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for you patience and answers. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] why both dedup and compression?
I've googled this for a bit, but can't seem to find the answer. What does compression bring to the party that dedupe doesn't cover already? Thank you for you patience and answers. That almost sounds like a classroom question. Pick a simple example: large text files, of which each is unique, maybe lines of data or something. Not likely to be much in the way of duplicate blocks to share, but very likely to be highly compressible. Contrast that with binary files, which might have blocks of zero bytes in them (without being strictly sparse, sometimes). With deduping, one such block is all that's actually stored (along with all the references to it, of course). In the 30 seconds or so I've been thinking about it to type this, I would _guess_ that one might want one or the other, but rarely both, since compression might tend to work against deduplication. So given the availability of both, and how lightweight zfs filesystems are, one might want to create separate filesystems within a pool with one or the other as appropriate, and separate the data according to which would likely work better on it. Also, one might as well put compressed video, audio, and image formats in a filesystem that was _not_ compressed, since compressing an already compressed file seldom gains much if anything more. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
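Since both properties are per-dataset, the split described above is easy to express; the dataset names here are invented for illustration:

  # zfs create -o compression=on tank/text        # man pages, HTML, logs: compresses well
  # zfs create -o dedup=on tank/vmimages          # many near-identical blocks: dedups well
  # zfs create tank/media                         # already-compressed audio/video/images: leave both off
  # zfs get compressratio tank/text               # check whether compression is actually paying off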
Re: [zfs-discuss] why both dedup and compression?
Another thought is this: _unless_ the CPU is the bottleneck on a particular system, compression (_when_ it actually helps) can speed up overall operation, by reducing the amount of I/O needed. But storing already-compressed files in a filesystem with compression is likely to result in wasted effort, with little or no gain to show for it. Even deduplication requires some extra effort. Looking at the documentation, it implies a particular checksum algorithm _plus_ verification (if the checksum or digest matches, then make sure by doing a byte-for-byte compare of the blocks, since nothing shorter than the data itself can _guarantee_ that they're the same, just like no lossless compression can possibly work for all possible bitstreams). So doing either of these where the success rate is likely to be too low is probably not helpful. There are stats that show the savings for a filesystem due to compression or deduplication. What I think would be interesting is some advice as to how much (percentage) savings one should be getting to expect to come out ahead not just on storage, but on overall system performance. Of course, no such guidance would exactly fit any particular workload, but I think one might be able to come up with some approximate numbers, or at least a range, below which those features probably represented a waste of effort unless space was at an absolute premium. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] why both dedup and compression?
One of the big things to remember with dedup is that it is block-oriented (as is compression) - it deals with things in discrete chunks, (usually) not the entire file as a stream. So, let's do a thought-experiment here: File A is 100MB in size. From ZFS's standpoint, let's say it's made up of 100 1MB blocks (or chunks, or slabs). Let's also say that none of the blocks are identical (which is highly likely) - that is, no block checksums identically. Thus, with dedup on, this file takes up 100MB of space. If I do a cp fileA fileB, no more additional space will be taken up. However, let's say I then add 1 bit of data to the very front of file A. Now, block alignments have changed for the entire file, so all the 1MB blocks checksum differently. Thus, in this case, adding 1 bit of data to file A actually causes 100MB+1bit of new data to be used, as now none of file B's block are the same as file A. Therefore, after 1 additional bit has been written, total disk usage is 200MB+1 bit. If compression were being used, file A originally would likely take up 100MB, and file B would take up the same amount; thus, the two together could take up, say 150MB together (with a conservative 25% compression ratio). After writing 1 new bit to file A, file A almost certainly compresses the same as before, so the two files will continue to occupy 150MB of space. Compression is not obsoleted by dedup. They both have their places, depending on the data being stored, and the usage pattern of that data. -Erik On Wed, 2010-05-05 at 19:11 -0700, Richard L. Hamilton wrote: Another thought is this: _unless_ the CPU is the bottleneck on a particular system, compression (_when_ it actually helps) can speed up overall operation, by reducing the amount of I/O needed. But storing already-compressed files in a filesystem with compression is likely to result in wasted effort, with little or no gain to show for it. Even deduplication requires some extra effort. Looking at the documentation, it implies a particular checksum algorithm _plus_ verification (if the checksum or digest matches, then make sure by doing a byte-for-byte compare of the blocks, since nothing shorter than the data itself can _guarantee_ that they're the same, just like no lossless compression can possibly work for all possible bitstreams). So doing either of these where the success rate is likely to be too low is probably not helpful. There are stats that show the savings for a filesystem due to compression or deduplication. What I think would be interesting is some advice as to how much (percentage) savings one should be getting to expect to come out ahead not just on storage, but on overall system performance. Of course, no such guidance would exactly fit any particular workload, but I think one might be able to come up with some approximate numbers, or at least a range, below which those features probably represented a waste of effort unless space was at an absolute premium. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
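The whole-file-copy case from this thought experiment is easy to observe on a scratch dataset (a sketch only; it assumes a build with dedup, b128 or later, and an existing pool named tank):

  # zfs create -o dedup=on tank/dtest
  # dd if=/dev/urandom of=/tank/dtest/fileA bs=1024k count=100
  # cp /tank/dtest/fileA /tank/dtest/fileB    # an identical copy adds no new unique blocks
  # sync
  # zpool get dedupratio tank                 # should approach 2.00x for this test data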
Re: [zfs-discuss] Reverse lookup: inode to name lookup
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Edward Ned Harvey Thanks to Victor, here is at least proof of concept that yes, it is possible to reverse resolve inode number -> pathname, and yes, it is almost infinitely faster than doing something like find: Root can reverse lookup names of inodes with this command: zdb - dataset_name object_number (on a tangent) Surprisingly, it is not limited to just looking up directories. It finds files too (sort of). Apparently a file inode does contain *one* reference to its latest parent. But if you hardlink more than once, you'll only find the latest parent, and if you rm the latest hardlink, then it'll still find only the latest parent, which has been unlinked and is therefore no longer valid. But it works perfectly for directories. (back from tangent) Regardless of how big the filesystem is, regardless of cache warmness, regardless of how many inodes you want to reverse-lookup, this zdb command takes between 1 and 2 seconds per filesystem, fixed. In other words, the operation of performing a reverse-lookup per inode is essentially zero time, but there is some kind of startup overhead. In theory at least, the reverse lookup could be as fast as a regular forward lookup, such as ls or stat. But my measurements also show that a forward lookup incurs some form of startup overhead. A forward lookup on an already mounted filesystem should require a few ms, but in my example below it takes several hundred ms per snapshot, which means there's a warmup period for some reason, to open up each snapshot. Find, of course, scales linearly with the total number of directories/files in the filesystem. On my company filer, I got these results: A forward lookup, time ls -d /tank/somefilesystem/.zfs/snapshot/*/some_object, took 24 sec across my 53 snapshots (that's 0.45 sec per snapshot). A for loop using zdb to reverse lookup the same things took 1m 3s across the 53 snapshots (1.19 sec per snapshot). Using find -inum to locate all those things ... I only let it complete 4 snapshots. It took 33 mins per snapshot. So that's a marvelous proof of concept. Yes, reverse lookup is possible, and it's essentially infinitely faster than find -inum can be. I have a feeling a reverse-lookup application could be even faster, if it were an application designed specifically for this purpose. Zdb is not a suitable long term solution for this purpose. Zdb is only sufficient here as a proof of concept. Here's the problem with zdb: man zdb DESCRIPTION The zdb command is used by support engineers to diagnose failures and gather statistics. Since the ZFS file system is always consistent on disk and is self-repairing, zdb should only be run under the direction by a support engineer. If no arguments are specified, zdb, performs basic consistency checks on the pool and associated datasets, and report any problems detected. Any options supported by this command are internal to Sun and subject to change at any time. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
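The exact zdb invocation got mangled in the archive; a hedged reconstruction follows - the flags, dataset name, and object number here are assumptions (repeating -d simply raises verbosity until the object dump includes its path):

    ls -di /tank/somefilesystem/somedir          # forward: path -> object (inode) number, say 12345
    zdb -ddddd tank/somefilesystem 12345         # reverse: dumps object 12345, including a "path" line
    # The brute-force comparison measured above (hypothetical snapshot name; very slow):
    find /tank/somefilesystem/.zfs/snapshot/snap1 -inum 12345 -print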
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Bob Friesenhahn From a zfs standpoint, Solaris 10 does not seem to be behind the currently supported OpenSolaris release. I'm sorry, I'll have to disagree with you there. In solaris 10, fully updated, you can only get up to zpool version 15. This is lacking many later features ... For me in particular, zpool 19 is when zpool remove log was first supported. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
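To see where a given system stands, and what the version-19 feature mentioned here actually enables (a sketch; the pool and device names are invented):

    zpool upgrade -v            # lists every pool version this build supports and what each added
    zpool get version tank      # the version a particular pool is currently at
    # On version 19 or later, a dedicated log device can be removed again:
    zpool remove tank c8t0d0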
Re: [zfs-discuss] Best practice for full system backup - equivalent of ufsdump/ufsrestore
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Ray Van Dolson Well, being able to remove ZIL devices is one important feature missing. Hopefully in U9. :) I did have a support rep confirm for me that both the log device removal, and the ability to mirror slightly smaller devices will be present in U9. But he couldn't say when that would be. And if I happen to remember my facts wrong (or not remember my facts when I think I do) ... Please throw no stones. ;-) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Michael Sullivan I have a question I cannot seem to find an answer to. Google for ZFS Best Practices Guide (on solarisinternals). I know this answer is there. I know if I set up ZIL on SSD and the SSD goes bad, the ZIL will be relocated back to the pool. I'd probably have it mirrored anyway, just in case. However you cannot mirror the L2ARC, so... Careful. The log device removal feature exists, and is present in the developer builds of opensolaris today. However, it's not included in opensolaris 2009.06, and it's not included in the latest and greatest solaris 10 yet. Which means, right now, if you lose an unmirrored ZIL (log) device, your whole pool is lost, unless you're running a developer build of opensolaris. What I want to know is what happens if one of those SSD's goes bad? What happens to the L2ARC? Is it just taken offline, or will it continue to perform even with one drive missing? In the L2ARC (cache) there is no ability to mirror, because cache device removal has always been supported. You can't mirror a cache device, because you don't need it. If one of the cache devices fails, no harm is done. That device goes offline. The rest stay online. Sorry, if these questions have been asked before, but I cannot seem to find an answer. Since you said this twice, I'll answer it twice. ;-) I think the best advice regarding cache/log device mirroring is in the ZFS Best Practices Guide. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
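A sketch of the layout being discussed (device names are invented); note the asymmetry the thread keeps coming back to - log devices can be mirrored, cache devices cannot be and need not be:

    zpool add tank log mirror c2t0d0 c2t1d0   # dedicated ZIL, mirrored (ideally across controllers)
    zpool add tank cache c3t0d0 c3t1d0        # L2ARC devices are added individually; no mirror syntax exists
    zpool remove tank c3t0d0                  # a cache device can always be removed without risk to the pool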
Re: [zfs-discuss] Performance of the ZIL
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Robert Milkowski if you can disable ZIL and compare the performance to when it is on, it will give you an estimate of what's the absolute maximum performance increase (if any) by having a dedicated ZIL device. I'll second this suggestion. It'll cost you nothing to disable the ZIL temporarily. (You have to remount the filesystem twice: once to disable the ZIL, and once to re-enable it.) Then you can see how much performance improves. If it improves a lot, then you'll know you need to accelerate your ZIL. (Because a disabled ZIL is the fastest thing you could possibly ever do.) Generally speaking, you should not disable your ZIL for the long run. But in some cases, it makes sense. Here's how you determine if you want to disable your ZIL permanently: First, understand that with the ZIL disabled, all sync writes are treated as async writes. These are buffered in RAM before being written to disk, so the kernel can optimize and aggregate the write operations into one big chunk. No matter what, if you have an ungraceful system shutdown, you will lose all the async writes that were waiting in RAM. If you have the ZIL disabled, you will also lose the sync writes that were waiting in RAM (because those are being handled as async.) In neither case do you have data or filesystem corruption. The risk of running with no ZIL is: in the case of ungraceful shutdown, in addition to the (up to 30 sec) of async writes that will be lost, you will also lose up to 30 sec of sync writes. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
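One hedged sketch of how that temporary test was typically done on builds of this vintage (the per-dataset sync property came later); the zil_disable tunable is host-wide, only takes effect when a filesystem is mounted, and the dataset name below is hypothetical:

    echo zil_disable/W0t1 | mdb -kw               # set the live tunable (persistent form:
                                                  # "set zfs:zil_disable = 1" in /etc/system)
    zfs umount tank/data && zfs mount tank/data   # remount so the setting applies
    # ... run the benchmark ...
    echo zil_disable/W0t0 | mdb -kw               # re-enable the ZIL
    zfs umount tank/data && zfs mount tank/data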
Re: [zfs-discuss] why both dedup and compression?
Hmm... To clarify. Every discussion or benchmarking that I have seen always show both off, compression only or both on. Why never compression off and dedup on? After some further thought... perhaps it's because compression works at the byte level and dedup is at the block level. Perhaps I have answered my own question. Some confirmation would be nice though. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
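For what it's worth, the combination being asked about is easy to express per dataset (the names here are hypothetical):

    zfs set compression=off tank/vmimages
    zfs set dedup=on tank/vmimages
    zfs get compression,dedup tank/vmimages
    zpool get dedupratio tank          # watch the dedup savings independently of compression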
Re: [zfs-discuss] why both dedup and compression?
On 05/ 6/10 03:35 PM, Richard Jahnel wrote: Hmm... To clarify. Every discussion or benchmarking that I have seen always show both off, compression only or both on. Why never compression off and dedup on? After some further thought... perhaps it's because compression works at the byte level and dedup is at the block level. Perhaps I have answered my own question. Data that doesn't compress well also tends to be data that doesn't dedup well (media files, for example). -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
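A crude way to test that on a sample before turning either feature on (the file path is hypothetical, and gzip is only a stand-in for ZFS's faster, weaker lzjb):

    ls -l /tank/media/sample.avi                 # original size
    gzip -c /tank/media/sample.avi | wc -c       # if this is barely smaller, compression (and, per the
                                                 # point above, likely dedup too) has little to work with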
Re: [zfs-discuss] ZIL behavior on import
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Steven Stallion I had a question regarding how the ZIL interacts with zpool import: Given that the intent log is replayed in the event of a system failure, does the replay behavior differ if -f is passed to zpool import? For example, if I have a system which fails prior to completing a series of writes and I reboot using a failsafe (i.e. install disc), will the log be replayed after a zpool import -f ? If your log devices are present, and you zpool import (even without the -f), then the log will be replayed. Regardless of which version of zpool you have. If your log device is not present, and you zpool import ... If you have zpool version <19, you simply cannot import. If you have zpool >=19, the system will prompt you: Warning, log device not present. If you import -f, you will lose any unplayed events on the missing log device, but the pool will import. FYI, in solaris 10, you cannot have zpool version 19 yet. In opensolaris 2009.06, zpool version 19 is not available unless you upgrade to a developer build. In the latest developer build, zpool version is >=19 by default. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
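A sketch of the two paths described above (the pool name is invented, and the missing-log behavior is as described in this thread for pool version >= 19):

    zpool import                  # from failsafe/install media: scan attached devices for importable pools
    zpool import tank             # log device present: any committed-but-unplayed ZIL records replay here
    zpool import -f tank          # log device missing on a >=19 pool: proceeds as described above,
                                  # discarding whatever was only on the missing log device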
Re: [zfs-discuss] zpool mirror (dumb question)
OK, I've installed OpenSolaris DEV 134 and created 2 files: mkfile 128m /disk1 mkfile 127m /disk2 zpool create stapler /disk1 zpool attach stapler /disk1 /disk2 cannot attach /disk2 to /disk1: device is too small (that's what she said.. lol) But if I create 128m and 128m minus 10 bytes, it works. I can attach the smaller drive. And if I create 1000m, I can attach a 999m virtual disk... So, my question is, what is the ratio on this? How much smaller can drive2 be than drive1? I was trying to find developer notes on what's been upgraded, but my searching didn't turn up anything of interest :( Steve Another question: is zpool shrinking available? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
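One way to probe the threshold empirically, along the lines of the experiment above - a sketch only, for a scratch box; the pool name, file paths, and probe sizes are arbitrary:

    #!/bin/ksh
    mkfile 1000m /var/tmp/d1
    zpool create -f stapler /var/tmp/d1
    for sz in 1000m 999m 998m 997m; do
        mkfile $sz /var/tmp/d2
        if zpool attach -f stapler /var/tmp/d1 /var/tmp/d2 2>/dev/null; then
            echo "attach ok at $sz"
            zpool detach stapler /var/tmp/d2
        else
            echo "too small at $sz"
        fi
        rm -f /var/tmp/d2
    done
    zpool destroy stapler; rm -f /var/tmp/d1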
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
Hi Ed, Thanks for your answers. Seems to make sense, sort of… On 6 May 2010, at 12:21 , Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Michael Sullivan I have a question I cannot seem to find an answer to. Google for ZFS Best Practices Guide (on solarisinternals). I know this answer is there. My Google is very strong and I have the Best Practices Guide committed to bookmark as well as most of it to memory. While it explains how to implement these, there is no information regarding failure of a device in a striped L2ARC set of SSD's. I have been hard pressed to find this information anywhere, short of testing it myself, but I don't have the necessary hardware in a lab to test correctly. If someone has pointers to references, could you please provide them, chapter and verse, rather than the advice to "go read the manual". I know if I set up ZIL on SSD and the SSD goes bad, the ZIL will be relocated back to the pool. I'd probably have it mirrored anyway, just in case. However you cannot mirror the L2ARC, so... Careful. The log device removal feature exists, and is present in the developer builds of opensolaris today. However, it's not included in opensolaris 2009.06, and it's not included in the latest and greatest solaris 10 yet. Which means, right now, if you lose an unmirrored ZIL (log) device, your whole pool is lost, unless you're running a developer build of opensolaris. I'm running 2009.11 which is the latest OpenSolaris. I should have made that clear, and that I don't intend this to be on a Solaris 10 system, and am waiting for the next production build anyway. As you say, it does not exist in 2009.06, this is not the latest production OpenSolaris which is 2009.11, and I'd be more interested in its behavior than an older release. I am also well aware that losing a ZIL device will cause loss of the entire pool. Which is why I would never have a ZIL device unless it was mirrored and on different controllers. From the information I've been reading about the loss of a ZIL device, it will be relocated to the storage pool it is assigned to. I'm not sure which version this is in, but it would be nice if someone could provide the release number it is included in (and actually works). Also, will this functionality be included in the mythical 2010.03 release? Also, I'd be interested to know what features along these lines will be available in 2010.03 if it ever sees the light of day. What I want to know is what happens if one of those SSD's goes bad? What happens to the L2ARC? Is it just taken offline, or will it continue to perform even with one drive missing? In the L2ARC (cache) there is no ability to mirror, because cache device removal has always been supported. You can't mirror a cache device, because you don't need it. If one of the cache devices fails, no harm is done. That device goes offline. The rest stay online. So what you are saying is that if a single device fails in a striped L2ARC VDEV, then the entire VDEV is taken offline and the fallback is to simply use the regular ARC and fetch from the pool whenever there is a cache miss. Or, does what you are saying here mean that if I have 4 SSD's in a stripe for my L2ARC, and one device fails, the L2ARC will be reconfigured dynamically using the remaining SSD's for L2ARC? 
It would be good to get an answer to this from someone who has actually tested this or is more intimately familiar with the ZFS code rather than all the speculation I've been getting so far. Sorry, if these questions have been asked before, but I cannot seem to find an answer. Since you said this twice, I'll answer it twice. ;-) I think the best advice regarding cache/log device mirroring is in the ZFS Best Practices Guide. Been there read that, many, many times. It's an invaluable reference, I agree. Thanks Mike --- Michael Sullivan michael.p.sulli...@me.com http://www.kamiogi.net/ Japan Mobile: +81-80-3202-2599 US Phone: +1-561-283-2034 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
From: Michael Sullivan [mailto:michael.p.sulli...@mac.com] My Google is very strong and I have the Best Practices Guide committed to bookmark as well as most of it to memory. While it explains how to implement these, there is no information regarding failure of a device in a striped L2ARC set of SSD's. I have http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Cache_Devices It is not possible to mirror or use raidz on cache devices, nor is it necessary. If a cache device fails, the data will simply be read from the main pool storage devices instead. I guess I didn't write this part, but: If you have multiple cache devices, they are all independent from each other. Failure of one does not negate the functionality of the others. I'm running 2009.11 which is the latest OpenSolaris. Quoi?? 2009.06 is the latest available from opensolaris.com and opensolaris.org. If you want something newer, AFAIK, you have to go to a developer build, such as osol-dev-134. Sure you didn't accidentally get 2008.11? I am also well aware that losing a ZIL device will cause loss of the entire pool. Which is why I would never have a ZIL device unless it was mirrored and on different controllers. Um ... the log device is not special. If you lose *any* unmirrored device, you lose the pool. Except for cache devices, or log devices on zpool >=19 From the information I've been reading about the loss of a ZIL device, it will be relocated to the storage pool it is assigned to. I'm not sure which version this is in, but it would be nice if someone could provide the release number it is included in (and actually works). What the heck? Didn't I just answer that question? I know I said this is answered in ZFS Best Practices Guide. http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices Prior to pool version 19, if you have an unmirrored log device that fails, your whole pool is permanently lost. Prior to pool version 19, mirroring the log device is highly recommended. In pool version 19 or greater, if an unmirrored log device fails during operation, the system reverts to the default behavior, using blocks from the main storage pool for the ZIL, just as if the log device had been gracefully removed via the zpool remove command. Also, will this functionality be included in the mythical 2010.03 release? Zpool 19 was released in build 125. Oct 16, 2009. You can rest assured it will be included in 2010.03, or 04, or whenever that thing comes out. So what you are saying is that if a single device fails in a striped L2ARC VDEV, then the entire VDEV is taken offline and the fallback is to simply use the regular ARC and fetch from the pool whenever there is a cache miss. It sounds like you're only going to believe it if you test it. Go for it. That's what I did before I wrote that section of the ZFS Best Practices Guide. In ZFS, there is no such thing as striping, although the term is commonly used, because adding multiple devices creates all the benefit of striping, plus all the benefit of concatenation, but colloquially, people think concatenation is weird or unused or something, so people just naturally gravitated to calling it a stripe in ZFS too, although that's not technically correct according to the traditional RAID definition. But nobody bothered to create a new term stripecat or whatever, for ZFS. 
Or, does what you are saying here mean that if I have 4 SSD's in a stripe for my L2ARC, and one device fails, the L2ARC will be reconfigured dynamically using the remaining SSD's for L2ARC? No reconfiguration necessary, because it's not a stripe. It's 4 separate devices, which ZFS can use simultaneously if it wants to. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
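If it helps make that concrete, this is roughly how independent cache devices look from the command line (pool and device names are invented):

    zpool add tank cache c3t0d0 c3t1d0 c3t2d0 c3t3d0   # four separate L2ARC devices, no mirroring possible
    zpool status tank                                   # each appears individually under "cache"
    zpool remove tank c3t1d0                            # removing (or losing) one leaves the other three caching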
Re: [zfs-discuss] Loss of L2ARC SSD Behaviour
On 6 May 2010, at 13:18 , Edward Ned Harvey wrote: From: Michael Sullivan [mailto:michael.p.sulli...@mac.com] While it explains how to implement these, there is no information regarding failure of a device in a striped L2ARC set of SSD's. I have http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Cache_Devices It is not possible to mirror or use raidz on cache devices, nor is it necessary. If a cache device fails, the data will simply be read from the main pool storage devices instead. I understand this. I guess I didn't write this part, but: If you have multiple cache devices, they are all independent from each other. Failure of one does not negate the functionality of the others. Ok, this is what I wanted to know. That the L2ARC devices assigned to the pool are not striped but are independent. Loss of one drive will just cause a cache miss and force ZFS to go out to the pool for its objects. But then I'm not talking about using RAIDZ on a cache device. I'm talking about a striped device which would be RAID-0. If the SSD's are all assigned to L2ARC, then they are not striped in any fashion (RAID-0), but are completely independent and the L2ARC will continue to operate, just missing a single SSD. I'm running 2009.11 which is the latest OpenSolaris. Quoi?? 2009.06 is the latest available from opensolaris.com and opensolaris.org. If you want something newer, AFAIK, you have to go to a developer build, such as osol-dev-134. Sure you didn't accidentally get 2008.11? My mistake… snv_111b which is 2009.06. I know it went up to 11 somewhere. I am also well aware that losing a ZIL device will cause loss of the entire pool. Which is why I would never have a ZIL device unless it was mirrored and on different controllers. Um ... the log device is not special. If you lose *any* unmirrored device, you lose the pool. Except for cache devices, or log devices on zpool >=19 Well, if I've got a separate ZIL for performance, mirrored because I think my data is valuable and important, I will have something more than RAID-0 on my main storage pool too. More than likely RAIDZ2, since I plan on using L2ARC to help improve performance along with separate SSD mirrored ZIL devices. From the information I've been reading about the loss of a ZIL device, it will be relocated to the storage pool it is assigned to. I'm not sure which version this is in, but it would be nice if someone could provide the release number it is included in (and actually works). What the heck? Didn't I just answer that question? I know I said this is answered in ZFS Best Practices Guide. http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices Prior to pool version 19, if you have an unmirrored log device that fails, your whole pool is permanently lost. Prior to pool version 19, mirroring the log device is highly recommended. In pool version 19 or greater, if an unmirrored log device fails during operation, the system reverts to the default behavior, using blocks from the main storage pool for the ZIL, just as if the log device had been gracefully removed via the zpool remove command. No need to get defensive here; all I'm looking for is the zpool version number which supports it and the version of OpenSolaris which supports that zpool version. I think that if you are building for performance, it would be almost intuitive to have a mirrored ZIL in the event of failure, and perhaps even a hot spare available as well. 
I don't like the idea of my ZIL being transferred back to the pool, but having it transferred back is better than the alternative which would be data loss or corruption. Also, will this functionality be included in the mythical 2010.03 release? Zpool 19 was released in build 125. Oct 16, 2009. You can rest assured it will be included in 2010.03, or 04, or whenever that thing comes out. Thanks, build 125. So what you are saying is that if a single device fails in a striped L2ARC VDEV, then the entire VDEV is taken offline and the fallback is to simply use the regular ARC and fetch from the pool whenever there is a cache miss. It sounds like you're only going to believe it if you test it. Go for it. That's what I did before I wrote that section of the ZFS Best Practices Guide. In ZFS, there is no such thing as striping, although the term is commonly used, because adding multiple devices creates all the benefit of striping, plus all the benefit of concatenation, but colloquially, people think concatenation is weird or unused or something, so people just naturally gravitated to calling it a stripe in ZFS too, although that's not technically correct according to the traditional RAID definition. But nobody bothered to create a new term stripecat or whatever, for ZFS. Ummm, yes
Re: [zfs-discuss] why both dedup and compression?
On May 5, 2010, at 8:35 PM, Richard Jahnel wrote: Hmm... To clarify. Every discussion or benchmarking that I have seen always show both off, compression only or both on. Why never compression off and dedup on? I've seen this quite often. The decision to compress is based on the compressibility of the data. The decision to dedup is based on the duplication of the data. After some further thought... perhaps it's because compression works at the byte level and dedup is at the block level. Perhaps I have answered my own question. Both work at the block level. Hence, they are complementary. Two identical blocks will compress identically, and then dedup. -- richard -- ZFS storage and performance consulting at http://www.RichardElling.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss