Re: [zfs-discuss] Asymmetric zpool load
Aha, found it! It was this thread, also started by Carsten :) http://www.opensolaris.org/jive/thread.jspa?threadID=78921&tstart=45 -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Asymmetric zpool load
Guys, this looks to me like the second time we've had something like this reported on the forums for an x4500, again with the first zvol having much lower load than the other two, despite being created at the same time. I can't find the thread to check, can anybody else remember it?
Re: [zfs-discuss] zfs free space
Hi,

A good rough estimate is the total of the space displayed under the "USED" column of "zfs list" for those snapshots. Here is an example:

-- snip --
[EMAIL PROTECTED] zfs list -r tank
NAME                USED  AVAIL  REFER  MOUNTPOINT
tank               24.6M  38.9M    19K  /tank
tank/fs1           24.4M  38.9M    18K  /tank/fs1
tank/[EMAIL PROTECTED]  24.4M      -  24.4M  -
-- snip --

In the above case tank/[EMAIL PROTECTED] is using 24.4M, so deleting that snapshot should free up about 24.4M. Let's delete it and see what we get:

-- snip --
[EMAIL PROTECTED] zfs destroy tank/[EMAIL PROTECTED]
[EMAIL PROTECTED] zfs list -r tank
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank        220K  63.3M    19K  /tank
tank/fs1     18K  63.3M    18K  /tank/fs1
-- snip --

So we did get the 24.4M back (38.9M + 24.4M = 63.3M).

Note that this can get a little complicated when multiple snapshots refer to the same set of blocks: even after deleting one snapshot you might not see the space freed up, because a second snapshot may still be referring to some of those blocks.

Hope that helps.

Thanks and regards,
Sanjeev

On Wed, Dec 03, 2008 at 12:26:48AM +, Robert Milkowski wrote:
> Hello none,
>
> Thursday, November 6, 2008, 7:55:42 PM, you wrote:
>
> n> Hi Milek,
> n> Thanks for your reply.
> n> What I really need is a way to tell how much space will be freed
> n> for any particular set of snapshots that I delete.
>
> n> So I would like to query zfs,
> n> "if I delete these snapshots
> n> storage/[EMAIL PROTECTED]
> n> storage/[EMAIL PROTECTED]
> n> how much space will be freed?"
>
> I'm afraid you can do only one at a time.
>
> --
> Best regards,
> Robert  mailto:[EMAIL PROTECTED]
>         http://milek.blogspot.com
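Sanjeev's estimate can be scripted: with -H (no header, tab-separated) and -p (exact byte counts), the output of `zfs list` is easy to total with awk. A minimal sketch, assuming a pool named tank; the here-doc below stands in for the output of `zfs list -H -p -o name,used -t snapshot -r tank` on a live system, and the snapshot names and sizes are invented:

```shell
# Sum the USED column over a set of snapshots to estimate space freed.
# On a real box, replace the canned here-doc with:
#   zfs list -H -p -o name,used -t snapshot -r tank | awk '{ sum += $2 } END { print sum }'
total=$(awk '{ sum += $2 } END { print sum }' <<'EOF'
tank/fs1@monday	25589964
tank/fs1@tuesday	1048576
EOF
)
echo "estimated bytes freed: $total"
```

As noted above, USED counts only space unique to each snapshot, so blocks shared between snapshots make this a rough estimate either way.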
Re: [zfs-discuss] Separate /var
On Tue, Dec 2, 2008 at 6:13 PM, Lori Alt <[EMAIL PROTECTED]> wrote:
> On 12/02/08 10:24, Mike Gerdts wrote:
> I follow you up to here. But why do the next steps?
>
>> zonecfg -z $zone
>> remove fs dir=/var
>>
>> zfs set mountpoint=/zones/$zone/root/var rpool/zones/$zone/var

It's not strictly required to perform this last set of commands, but the lofs mount point is not really needed. Longer term it will likely look cleaner (e.g. to live upgrade) to not have this lofs mount. That is, I suspect that live upgrade is more likely to look at /var in the zone and say "ahhh, that is a zfs file system - I know how to deal with that" than it is for it to say "ahhh, that is a lofs file system to some other zfs file system in the global zone - I know how to deal with that."

--
Mike Gerdts
http://mgerdts.blogspot.com/
Re: [zfs-discuss] zpool replace - choke point
It's something we've considered here as well.
Re: [zfs-discuss] zfs free space
Hello none,

Thursday, November 6, 2008, 7:55:42 PM, you wrote:

n> Hi Milek,
n> Thanks for your reply.
n> What I really need is a way to tell how much space will be freed
n> for any particular set of snapshots that I delete.

n> So I would like to query zfs,
n> "if I delete these snapshots
n> storage/[EMAIL PROTECTED]
n> storage/[EMAIL PROTECTED]
n> how much space will be freed?"

I'm afraid you can do only one at a time.

--
Best regards,
Robert  mailto:[EMAIL PROTECTED]
        http://milek.blogspot.com
Re: [zfs-discuss] zpool replace - choke point
Would any of this have to do with the system being a T2000? Would ZFS resilvering be affected by single-threadedness, the slowish US-T1 clock speed, or lack of strong FPU performance?

On 12/1/08, Alan Rubin <[EMAIL PROTECTED]> wrote:
> We will be considering it in the new year, but that will not happen in time
> to affect our current SAN migration.

--
Matt Walburn
http://mattwalburn.com
Re: [zfs-discuss] Separate /var
On 12/02/08 11:04, Brian Wilson wrote:

- Original Message -
From: Lori Alt <[EMAIL PROTECTED]>
Date: Tuesday, December 2, 2008 11:19 am
Subject: Re: [zfs-discuss] Separate /var
To: Gary Mills <[EMAIL PROTECTED]>
Cc: zfs-discuss@opensolaris.org

On 12/02/08 09:00, Gary Mills wrote:

On Mon, Dec 01, 2008 at 04:45:16PM -0700, Lori Alt wrote:

On 11/27/08 17:18, Gary Mills wrote:

On Fri, Nov 28, 2008 at 11:19:14AM +1300, Ian Collins wrote:

On Fri 28/11/08 10:53 , Gary Mills [EMAIL PROTECTED] sent:

On Fri, Nov 28, 2008 at 07:39:43AM +1100, Edward Irvine wrote:

I'm currently working with an organisation who want to use ZFS for their full zones. Storage is SAN attached, and they also want to create a separate /var for each zone, which causes issues when the zone is installed. They believe that a separate /var is still good practice.

If your mount options are different for /var and /, you will need a separate filesystem. In our case, we use `setuid=off' and `devices=off' on /var for security reasons. We do the same thing for home directories and /tmp.

For zones?

Sure, if you require different mount options in the zones.

I looked into this and found that, using ufs, you can indeed set up the zone's /var directory as a separate file system. I don't know how LiveUpgrade works with that configuration (I didn't try it), but I was at least able to get the zone to install and boot. With zfs, however, I couldn't even get a zone with a separate /var dataset to install, let alone be manageable with LiveUpgrade. I configured the zone like so:

# zonecfg -z z4
z4: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:z4> create
zonecfg:z4> set zonepath=/zfszones/z4
zonecfg:z4> add fs
zonecfg:z4:fs> set dir=/var
zonecfg:z4:fs> set special=rpool/ROOT/s10x_u6wos_07b/zfszones/z4/var
zonecfg:z4:fs> set type=zfs
zonecfg:z4:fs> end
zonecfg:z4> exit

I then get this result from trying to install the zone:

prancer# zoneadm -z z4 install
Preparing to install zone .
ERROR: No such file or directory: cannot mount

I think you're running into the problem of defining /var as the filesystem that already exists under the zone root. We had issues with that, so any time I've been doing filesystems, I don't push in zfs datasets; I create a zfs filesystem in the global zone and mount that directory into the zone with lofs. For example, I've got a pool zdisk with a filesystem down the path

zdisk/zones/zvars/(zonename)

which mounts itself to

/zdisk/zones/zvars/(zonename)

It's a ZFS filesystem with quota and reservation set up, and I just do an lofs to it via these lines in the /etc/zones/(zonename).xml file - I think that's the equivalent of the following zonecfg lines -

zonecfg:z4> add fs
zonecfg:z4:fs> set dir=/var
zonecfg:z4:fs> set special=/zdisk/zones/zvars/z4/var
zonecfg:z4:fs> set type=lofs
zonecfg:z4:fs> end

I think to put the zfs into the zone, you need to do an add dataset instead of an add fs. I tried that once and didn't completely like the results, though. The dataset was controllable inside the zone (which is what I wanted at the time), but it wasn't controllable from the global zone anymore, and I couldn't easily access it from the global zone to get the backup software to pick it up. Doing it this way means you have to manage the zfs datasets from the global zone, but that's not really an issue here.

So I tried your suggestion and it appears to work, at least initially. (I have a feeling that it will cause problems later if I want to clone the BE using LiveUpgrade, but first things first.)

So, create the separate filesystems you want in the global zone (without stacking them under the zoneroot - separate directory somewhere

Why does it have to be in a separate directory?

lori

setup the zfs stuff you want, then lofs it into the local zone. I've had that install successfully before. Hope that's helpful in some way!
Re: [zfs-discuss] Separate /var
On 12/02/08 10:24, Mike Gerdts wrote:

On Tue, Dec 2, 2008 at 11:17 AM, Lori Alt <[EMAIL PROTECTED]> wrote:

I did pre-create the file system. Also, I tried omitting "special" and zonecfg complains. I think that there might need to be some changes to zonecfg and the zone installation code to get separate /var datasets in non-global zones to work.

You could probably do something like:

zfs create rpool/zones/$zone
zfs create rpool/zones/$zone/var
zonecfg -z $zone
add fs
set dir=/var
set special=/zones/$zone/var
set type=lofs
end
...
zoneadm -z $zone install

I follow you up to here. But why do the next steps?

zonecfg -z $zone
remove fs dir=/var

zfs set mountpoint=/zones/$zone/root/var rpool/zones/$zone/var
Re: [zfs-discuss] Problem importing degraded Pool
Hi,

Eeemmm, I think it's safe to say your zpool and its data are gone forever. Use the Samsung disk checker boot CD and see if it can fix your faulty disk. Then connect all 3 drives to your system and use raidz. Your data will then be well protected.

Brian,
Re: [zfs-discuss] continuous replication
Hello Mattias,

Saturday, November 15, 2008, 12:24:05 AM, you wrote:

MP> On Sat, Nov 15, 2008 at 00:46, Richard Elling <[EMAIL PROTECTED]> wrote:
>> Adam Leventhal wrote:
>>> On Fri, Nov 14, 2008 at 10:48:25PM +0100, Mattias Pantzare wrote:

That is _not_ active-active, that is active-passive. If you have an active-active system I can access the same data via both controllers at the same time. I can't if it works like you just described. You can't call it active-active just because different volumes are controlled by different controllers. Most active-passive RAID controllers can do that. The data sheet talks about active-active clusters, how does that work?

>>> What the Sun Storage 7000 Series does would more accurately be described
>>> as dual active-passive.
>>
>> This is ambiguous in the cluster market. It is common to describe
>> HA clusters where each node can be offering services concurrently
>> as active/active, even though the services themselves are active/passive.
>> This is to appease folks who feel that idle secondary servers are a bad
>> thing.

MP> But this product is not in the cluster market. It is in the storage market.
MP> By your definition virtually all dual controller RAID boxes are active/active.
MP> You should talk to Veritas so that they can change all their documentation...

MP> Active/active and active/passive have a real technical meaning, don't
MP> let marketing destroy that!

I thought that when you can access the same LUN via different controllers then you have a symmetric disk array, and when you can't, you have an asymmetric one. It has nothing to do with active-active or active-standby. Most of the disk arrays on the market are active-active and asymmetric.

--
Best regards,
Robert Milkowski  mailto:[EMAIL PROTECTED]
http://milek.blogspot.com
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
On 2-Dec-08, at 3:35 PM, Miles Nordin wrote:

>> "r" == Ross <[EMAIL PROTECTED]> writes:
>
> r> style before I got half way through your post :) [...status
> r> problems...] could be a case of oversimplifying things.
> ...
> And yes, this is a religious argument. Just because it spans decades
> of experience and includes ideas of style doesn't mean it should be
> dismissed as hocus-pocus. And I don't like all these binary config
> files either. Not even Mac OS X is pulling that baloney any more.

OS X never used binary config files; it standardised on XML property lists for the new subsystems (plus a lot of good old fashioned UNIX config). Perhaps you are thinking of Mac OS 9 and earlier (resource forks).

--Toby
Re: [zfs-discuss] HP Smart Array and b99?
OK, in the end I managed to install OpenSolaris snv_101b on an hp blade on a smart array drive directly from the install cd. Everything is fine. The problems I experienced with hangs on boot on snv_99+ are related to the Qlogic driver, but that is a different story.

Simon
Re: [zfs-discuss] RE : rsync using 100% of a cpu
Francois Dion wrote:
> >> "Francois Dion" wrote:
> >> Source is local to rsync, copying from a zfs file system,
> >> destination is remote over a dsl connection. Takes forever to just
> >> go through the unchanged files. Going the other way is not a
> >> problem, it takes a fraction of the time. Anybody seen that?
> >> Suggestions?
>
> >De: Blake Irvin [mailto:[EMAIL PROTECTED]
> >Upstream when using DSL is much slower than downstream?
>
> No, that's not the problem. I know ADSL is asymmetrical. When there is
> an actual data transfer going on, the cpu drops to 0.2%. It's only when
> rsync is doing its thing (reading, not writing) locally that it pegs the
> cpu. We are talking 15 minutes in one direction, while in the other it
> looks like I'll pass the 24-hour mark before the rsync is complete. And
> there were less than 100MB added on each side.
>
> BTW, the only other process I've seen that pegs the cpu solid for as
> long as it runs on my v480 is when I downloaded Belenix through a python
> script (btdownloadheadless).

Is the list of files long? rsync 3.0.x does not use a monolithic file-list pull and uses less memory. Are you using the -c option, or another option that causes rsync to checksum every block of all the files? Is the zfs file system compressed, so that it has to decompress each block for rsync to checksum it?
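For reference, the distinction being asked about looks like this (hypothetical paths and host name; -c is the flag that forces checksumming):

```shell
# Default quick check: skip files whose size and mtime match -- cheap on CPU.
rsync -av /tank/data/ backuphost:/backup/data/

# With -c, every file is read and checksummed on both ends even when
# unchanged -- this is the kind of option that pegs a CPU on large trees.
rsync -avc /tank/data/ backuphost:/backup/data/
```

If -c is in use and not strictly needed, dropping it usually restores the fast pass over unchanged files.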
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
> I don't want to steer you wrong under the circumstances,
> so I think we need more information.
>
> First, is the failure the same as in the earlier part of this
> thread. I.e., when you boot, do you get a failure like this?
>
> Warning: Fcode sequence resulted in a net stack depth change of 1
> Evaluating:
> Evaluating:
> The file just loaded does not appear to be executable

Nope:

Sun Fire T200, No Keyboard
Copyright 2007 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.27.4, 16256 MB memory available, Serial #75621394.
Ethernet address 0:14:4f:81:e4:12, Host ID: 8481e412.

Boot device: /[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]/[EMAIL PROTECTED]  File and args:
ufs-file-system
Loading: /platform/SUNW,Sun-Fire-T200/boot_archive
Loading: /platform/sun4v/boot_archive
Can't open boot_archive

Evaluating:
The file just loaded does not appear to be executable.
===

> Second, at least at first glance, this looks like more of
> a generic patch problem than a problem specifically
> related to zfs boot. Since this is S10, not OpenSolaris,
> perhaps you should be escalating this through the
> standard support channels. This alias probably
> won't get you any really useful answers on general
> problems with patching.

Yeah, I just thought since I'd followed this thread before it might be useful to add to it, since there might be crossover issues. I'll keep pushing on the string. I hate being the annoying customer who says "I won't follow your suggestion because (blah), please escalate this ticket." I hope to move these systems over to 10u6 in a few months and streamline our patching so problems like this won't exist.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
I don't want to steer you wrong under the circumstances, so I think we need more information. First, is the failure the same as in the earlier part of this thread. I.e., when you boot, do you get a failure like this? Warning: Fcode sequence resulted in a net stack depth change of 1 Evaluating: Evaluating: The file just loaded does not appear to be executable Second, at least at first glance, this looks like more of a generic patch problem than a problem specifically related to zfs boot. Since this is S10, not OpenSolaris, perhaps you should be escalating this through the standard support channels. This alias probably won't get you any really useful answers on general problems with patching. Lori On 12/02/08 14:42, Vincent Fox wrote: > The SupportTech responding to case #66153822 so far > has only suggested "boot from cdrom and patchrm 137137-09" > which tells me I'm dealing with a level-1 binder monkey. > It's the idle node of a cluster holding 10K email accounts > so I'm proceeding cautiously. It is unfortunate the admin doing > the original patching did them from multi-user but here we are. > > I am attempting to boot net:dhcp -s just to collect more info: > > My patchadd output shows 138866-01 & 137137-09 being applied OK: > > bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/138866-01 > Validating patches... > > Loading patches installed on the system... > > Done! > > Loading patches requested to install. > > Done! > > Checking patches that you specified for installation. > > Done! > > > Approved patches will be installed in this order: > > 138866-01 > > > Checking installed patches... > Verifying sufficient filesystem capacity (dry run method)... > Installing patch packages... > > Patch 138866-01 has been successfully installed. > See /var/sadm/patch/138866-01/log for details > > Patch packages installed: > SUNWcsr > > bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/137137-09 > Validating patches... 
> > Loading patches installed on the system... > > Done! > > Loading patches requested to install. > > Version of package SUNWcakr from directory SUNWcakr.u in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWcar from directory SUNWcar.u in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWkvm from directory SUNWkvm.c in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWkvm from directory SUNWkvm.d in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWkvm from directory SUNWkvm.m in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWkvm from directory SUNWkvm.u in patch 137137-09 > differs from the package installed on the system. > Architecture for package SUNWnxge from directory SUNWnxge.u in patch > 137137-09 differs from the package installed on the system. > Version of package SUNWcakr from directory SUNWcakr.us in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWcar from directory SUNWcar.us in patch 137137-09 > differs from the package installed on the system. > Version of package SUNWkvm from directory SUNWkvm.us in patch 137137-09 > differs from the package installed on the system. > Done! > > The following requested patches have packages not installed on the system > Package SUNWcpr from directory SUNWcpr.u in patch 137137-09 is not installed > on the system. Changes for package SUNWcpr will not be applied to the system. > Package SUNWefc from directory SUNWefc.u in patch 137137-09 is not installed > on the system. Changes for package SUNWefc will not be applied to the system. > Package SUNWfruip from directory SUNWfruip.u in patch 137137-09 is not > installed on the system. Changes for package SUNWfruip will not be applied to > the system. 
> Package SUNWluxd from directory SUNWluxd.u in patch 137137-09 is not > installed on the system. Changes for package SUNWluxd will not be applied to > the system. > Package SUNWs8brandr from directory SUNWs8brandr in patch 137137-09 is not > installed on the system. Changes for package SUNWs8brandr will not be applied > to the system. > Package SUNWs8brandu from directory SUNWs8brandu in patch 137137-09 is not > installed on the system. Changes for package SUNWs8brandu will not be applied > to the system. > Package SUNWs9brandr from directory SUNWs9brandr in patch 137137-09 is not > installed on the system. Changes for package SUNWs9brandr will not be applied > to the system. > Package SUNWs9brandu from directory SUNWs9brandu in patch 137137-09 is not > installed on the system. Changes for package SUNWs9brandu will not be applied > to the system. > Package SUNWus from directory SUNWus.u in patch 137137-09 is not installed on > the system. Changes for package SUNWus will not be applied to the system. > Package SUNWefc from directory SUNW
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
The SupportTech responding to case #66153822 so far has only suggested "boot from cdrom and patchrm 137137-09" which tells me I'm dealing with a level-1 binder monkey. It's the idle node of a cluster holding 10K email accounts so I'm proceeding cautiously. It is unfortunate the admin doing the original patching did them from multi-user but here we are. I am attempting to boot net:dhcp -s just to collect more info: My patchadd output shows 138866-01 & 137137-09 being applied OK: bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/138866-01 Validating patches... Loading patches installed on the system... Done! Loading patches requested to install. Done! Checking patches that you specified for installation. Done! Approved patches will be installed in this order: 138866-01 Checking installed patches... Verifying sufficient filesystem capacity (dry run method)... Installing patch packages... Patch 138866-01 has been successfully installed. See /var/sadm/patch/138866-01/log for details Patch packages installed: SUNWcsr bash-3.00# patchadd /net/matlock/local/d02/patches/all_patches/137137-09 Validating patches... Loading patches installed on the system... Done! Loading patches requested to install. Version of package SUNWcakr from directory SUNWcakr.u in patch 137137-09 differs from the package installed on the system. Version of package SUNWcar from directory SUNWcar.u in patch 137137-09 differs from the package installed on the system. Version of package SUNWkvm from directory SUNWkvm.c in patch 137137-09 differs from the package installed on the system. Version of package SUNWkvm from directory SUNWkvm.d in patch 137137-09 differs from the package installed on the system. Version of package SUNWkvm from directory SUNWkvm.m in patch 137137-09 differs from the package installed on the system. Version of package SUNWkvm from directory SUNWkvm.u in patch 137137-09 differs from the package installed on the system. 
Architecture for package SUNWnxge from directory SUNWnxge.u in patch 137137-09 differs from the package installed on the system. Version of package SUNWcakr from directory SUNWcakr.us in patch 137137-09 differs from the package installed on the system. Version of package SUNWcar from directory SUNWcar.us in patch 137137-09 differs from the package installed on the system. Version of package SUNWkvm from directory SUNWkvm.us in patch 137137-09 differs from the package installed on the system. Done! The following requested patches have packages not installed on the system Package SUNWcpr from directory SUNWcpr.u in patch 137137-09 is not installed on the system. Changes for package SUNWcpr will not be applied to the system. Package SUNWefc from directory SUNWefc.u in patch 137137-09 is not installed on the system. Changes for package SUNWefc will not be applied to the system. Package SUNWfruip from directory SUNWfruip.u in patch 137137-09 is not installed on the system. Changes for package SUNWfruip will not be applied to the system. Package SUNWluxd from directory SUNWluxd.u in patch 137137-09 is not installed on the system. Changes for package SUNWluxd will not be applied to the system. Package SUNWs8brandr from directory SUNWs8brandr in patch 137137-09 is not installed on the system. Changes for package SUNWs8brandr will not be applied to the system. Package SUNWs8brandu from directory SUNWs8brandu in patch 137137-09 is not installed on the system. Changes for package SUNWs8brandu will not be applied to the system. Package SUNWs9brandr from directory SUNWs9brandr in patch 137137-09 is not installed on the system. Changes for package SUNWs9brandr will not be applied to the system. Package SUNWs9brandu from directory SUNWs9brandu in patch 137137-09 is not installed on the system. Changes for package SUNWs9brandu will not be applied to the system. Package SUNWus from directory SUNWus.u in patch 137137-09 is not installed on the system. 
Changes for package SUNWus will not be applied to the system. Package SUNWefc from directory SUNWefc.us in patch 137137-09 is not installed on the system. Changes for package SUNWefc will not be applied to the system. Package SUNWluxd from directory SUNWluxd.us in patch 137137-09 is not installed on the system. Changes for package SUNWluxd will not be applied to the system. Package FJSVvplr from directory FJSVvplr.u in patch 137137-09 is not installed on the system. Changes for package FJSVvplr will not be applied to the system. Package FJSVvplr from directory FJSVvplr.us in patch 137137-09 is not installed on the system. Changes for package FJSVvplr will not be applied to the system. Checking patches that you specified for installation. Done! Approved patches will be installed in this order: 137137-09 Checking installed patches... Executing prepatch script... Verifying sufficient filesystem capacity (dry run method)... Dec 2 10:05:58 cyrus2-2 cfenvd[706]: LDT(3) in loadavg chi = 19.18 thresh 11.58 D
Re: [zfs-discuss] A failed disk can bring down a machine?
On Tue, Dec 02, 2008 at 12:50:08PM -0600, Tim wrote:
> On Tue, Dec 2, 2008 at 11:42 AM, Brian Hechinger <[EMAIL PROTECTED]> wrote:
>
> I believe the issue you're running into is the failmode you currently have
> set. Take a look at this:
> http://prefetch.net/blog/index.php/2008/03/01/configuring-zfs-to-gracefully-deal-with-failures/

Ah ha! It's now set to continue. Hopefully that'll save me next time this happens. Which I hope isn't too soon. ;)

Sadly this has rid me of my urgent need to replace that box, which I suppose isn't a bad thing as I can now take my time. Anyone have any opinions of that ASUS box running the latest OpenSolaris?

-brian

--
"Coding in C is like sending a 3 year old to do groceries. You gotta
tell them exactly what you want or you'll end up with a cupboard full of
pop tarts and pancake mix." -- IRC User (http://www.bash.org/?841435)
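For anyone else hitting this, the pool property in question can be inspected and changed directly (a sketch; substitute your own pool name for `tank`). failmode accepts wait (the default, which blocks I/O until the device comes back), continue (new writes return EIO while the pool stays up), and panic:

```shell
# Check the current setting, then switch to continue.
zpool get failmode tank
zpool set failmode=continue tank
```

Whether continue or wait is the right choice depends on whether you would rather applications see I/O errors or hang until the device recovers.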
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Vincent Fox wrote:
> Reviving this thread.
>
> We have a Solaris 10u4 system recently patched with 137137-09.
> Unfortunately the patch was applied from multi-user mode; I wonder if this
> may have been the original poster's problem as well? Anyhow, we are now stuck
> with an unbootable system as well.
>
> I have submitted a case to Sun about it, will add details as that proceeds.

Hi

There are basically two possible issues that we are aware of:

6772822, where the root fs has insufficient space to hold the failsafe archive (181M) and the boot archive (80M approx), plus a rebuild of same when rebooting, leading to some possible different outcomes. If you see "seek failed" it indicates that the new bootblk installed ok, but it couldn't rebuild on reboot.

There are also issues where, if running svm on mpxio, the bootblk won't get installed: 6772083 or 6775167.

Let us know the exact error seen and, if possible, the exact output from patchadd 137137-09.

Enda
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
> "r" == Ross <[EMAIL PROTECTED]> writes: r> style before I got half way through your post :) [...status r> problems...] could be a case of oversimplifying things. yeah I was a bit inappropriate, but my frustration comes from the (partly paranoid) imagining of how the idea ``we need to make it simple'' might have spooled out through a series of design meetings to a culturally-insidious mind-blowing condescention toward the sysadmin. ``simple'', to me, means that a 'status' tool does not read things off disks, and does not gather a bunch of scraps to fabricate a pretty (``simple''?) fantasy-world at invocation which is torn down again when it exits. The Linux status tools are pretty-printing wrappers around 'cat /proc/$THING/status'. That, is SIMPLE! And, screaming monkeys though they often are, the college kids writing Linux are generally disciplined enough not to grab a bunch of locks and then go to sleep for minutes when delivering things from /proc. I love that. The other, broken, idea of ``simple'' is what I come to Unix to avoid. And yes, this is a religious argument. Just because it spans decades of experience and includes ideas of style doesn't mean it should be dismissed as hocus-pocus. And I don't like all these binary config files either. Not even Mac OS X is pulling that baloney any more. r> There's no denying the ease of admin is one of ZFS' strengths, I deny it! It is not simple to start up 'format' and 'zpool iostat' and RoboCopy on another machine because you cannot trust the output of the status command. And getting visibility into something by starting a bunch of commands in different windows and watching when which one unfreezes is hilarious, not simple. r> the problems you've reported with resilvering. I think we were watching this bug: http://bugs.opensolaris.org/view_bug.do?bug_id=6675685 so that ought to be fixed in your test system but not in s10u6. 
but it might not be completely fixed yet:

http://bugs.opensolaris.org/view_bug.do?bug_id=6747698
Re: [zfs-discuss] Asymmetric zpool load
On Tue, 2 Dec 2008, Carsten Aulbert wrote:
>
> Hmm, since I only started with Solaris this year, is there a way to
> identify a "slow" disk? In principle these should all be identical
> Hitachi Deathstar^WDeskstar drives and should only have the standard
> deviation during production.

Look at the output of 'iostat -xn 30' when the system is under load. Possibly ignore the initial output entry, since that is an aggregate since the dawn of time. You will need to know which disks are in each vdev. Check to see if the asvc_t value for one of the disks is much more than the others in the same vdev. If a disk is acting as the bottleneck then it is likely that its asvc_t value is far greater than the others.

In order to get zfs's view of I/O, use

zpool iostat -v poolname 30

Bob

==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
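Bob's check can be automated: pick out the device with the largest asvc_t from an `iostat -xn` interval. A sketch using awk; the here-doc below stands in for one interval of real `iostat -xn 30` output (device names and figures are invented), and on a live box you would pipe the second and later intervals through the same awk. In `iostat -xn` data lines, asvc_t is field 8 and the device name is field 11:

```shell
# Report the disk with the highest average service time in the sample.
# The NF/number guard skips the two header lines.
out=$(awk 'NF == 11 && $1 ~ /^[0-9.]+$/ {
             if ($8 + 0 > max + 0) { max = $8; dev = $11 }
           }
           END { printf "slowest: %s asvc_t=%s ms", dev, max }' <<'EOF'
    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    5.0   40.2  320.1 2570.3  0.0  1.1    0.0    8.2   0  31 c1t0d0
    5.1   40.0  322.4 2568.8  0.0  3.9    0.0   95.7   0  88 c1t1d0
    4.9   40.1  318.9 2571.0  0.0  1.2    0.0    8.5   0  33 c1t2d0
EOF
)
echo "$out"
```

Here c1t1d0 stands out with roughly 10x the service time of its neighbours, which is exactly the pattern Bob describes for a disk dragging down its vdev.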
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Reviving this thread. We have a Solaris 10u4 system recently patched with 137137-09. Unfortunately the patch was applied from multi-user mode; I wonder if this may have been the original poster's problem as well? Anyhow, we are now stuck with an unbootable system as well. I have submitted a case to Sun about it and will add details as that proceeds. -- This message posted from opensolaris.org
Re: [zfs-discuss] Asymmetric zpool load
On Tue, 2 Dec 2008, Carsten Aulbert wrote:
>
> No, I think a single disk would be much less performant; however, I'm a
> bit disappointed by the overall performance of the boxes and just now we
> have users who experience extremely slow performance.

If all of the disks in the vdev need to be written at once prior to the next write, then the write latency will surely be higher than that of a single disk.

Bob
==
Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Asymmetric zpool load
Bob Friesenhahn wrote:
> You may have one or more "slow" disk drives which slow down the whole
> vdev due to long wait times. If you can identify those slow disk drives
> and replace them, then overall performance is likely to improve.
>
> The problem is that under severe load, the vdev with the highest backlog
> will be used the least. One or more slow disks in the vdev will slow
> down the whole vdev. It takes only one slow disk to slow down the whole
> vdev.

Hmm, since I only started with Solaris this year, is there a way to identify a "slow" disk? In principle these should all be identical Hitachi Deathstar^WDeskstar drives and should only have the standard deviation during production.

> ZFS commits the writes to all involved disks in a raidz2 before
> proceeding with the next write. With so many disks, you are asking for
> quite a lot of fortuitous luck in that everything must be working
> optimally. Compounding the problem is that I understand that when the
> stripe width exceeds the number of segmented blocks from the data to be
> written (ZFS is only willing to dice to a certain minimum size), then
> only a subset of the disks will be used, wasting potential I/O
> bandwidth. Your stripes are too wide.

Ah, ok, that's one of the first reasonable explanations (which I understand) of why large zpools might be bad. So far I was not able to track that down and only found the standard "magic" rule not to exceed 10 drives - but our (synthetic) tests had not shown any significant drawbacks. But I guess we might be bitten by it now.

>> (c) Would the use of several smaller vdevs help much? And which
>> layout would be a good compromise for getting space as well as
>> performance and reliability? 46 disks have so few prime factors
>
> Yes, more vdevs should definitely help quite a lot for dealing with
> real-world multi-user loads. One raidz/raidz2 vdev provides (at most)
> the IOPS of a single disk.
> There is a point of diminishing returns and your layout has gone far
> beyond this limit.

Thanks for the insight, I guess I need to experiment with empty boxes to get into a better state!

Cheers

Carsten
Re: [zfs-discuss] Asymmetric zpool load
Hi Miles,

Miles Nordin wrote:
>> "ca" == Carsten Aulbert <[EMAIL PROTECTED]> writes:
>
> ca> (a) Why the first vdev does not get an equal share
> ca> of the load
>
> I don't know. but, if you don't add all the vdev's before writing
> anything, there's no magic to make them balance themselves out. Stuff
> stays where it's written. I'm guessing you did add them at the same
> time, and they still filled up unevenly?

Yes, they are created all in one go (even on the same command line) and only then are filled - either "naturally" over time or via zfs send/receive (all on Sol10u5). So yes, it seems they fill up unevenly.

> 'zpool iostat' that you showed is the place I found to see how data is
> spread among vdev's.
>
> ca> (b) Why is a large raidz2 so bad? When I use a
> ca> standard Linux box with hardware raid6 over 16 disks I usually
> ca> get more bandwidth and at least about the same small file
> ca> performance
>
> obviously there are all kinds of things going on but...the standard
> answer is, traditional RAID5/6 doesn't have to do full stripe I/O.
> ZFS is more like FreeBSD's RAID3: it gets around the NVRAMless-RAID5
> write hole by always writing a full stripe, which means all spindles
> seek together and you get the seek performance of 1 drive (per vdev).
> Linux RAID5/6 just gives up and accepts a write hole, AIUI, but
> because the stripes are much fatter than a filesystem block, you'll
> sometimes get the record you need by seeking a subset of the drives
> rather than all of them, which means the drives you didn't seek have
> the chance to fetch another record.
>
> If you're saying you get worse performance than a single spindle, I'm
> not sure why.

No, I think a single disk would be much less performant; however, I'm a bit disappointed by the overall performance of the boxes, and just now we have users who experience extremely slow performance.
But thanks already for the insight.

Cheers

Carsten
Re: [zfs-discuss] Asymmetric zpool load
> "ca" == Carsten Aulbert <[EMAIL PROTECTED]> writes: ca> (a) Why the first vdev does not get an equal share ca> of the load I don't know. but, if you don't add all the vdev's before writing anything, there's no magic to make them balance themselves out. Stuff stays where it's written. I'm guessing you did add them at the same time, and they still filled up unevenly? 'zpool iostat' that you showed is the place I found to see how data is spread among vdev's. ca> (b) Why is a large raidz2 so bad? When I use a ca> standard Linux box with hardware raid6 over 16 disks I usually ca> get more bandwidth and at least about the same small file ca> performance obviously there are all kinds of things going on but...the standard answer is, traditional RAID5/6 doesn't have to do full stripe I/O. ZFS is more like FreeBSD's RAID3: it gets around the NVRAMless-RAID5 write hole by always writing a full stripe, which means all spindles seek together and you get the seek performance of 1 drive (per vdev). Linux RAID5/6 just gives up and accepts a write hole, AIUI, but because the stripes are much fatter than a filesystem block, you'll sometimes get the record you need by seeking a subset of the drives rather than all of them, which means the drives you didn't seek have the chance to fetch another record. If you're saying you get worse performance than a single spindle, I'm not sure why. pgpUTlo8kBKC2.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Asymmetric zpool load
On Tue, 2 Dec 2008, Carsten Aulbert wrote:
>
> Questions:
> (a) Why the first vdev does not get an equal share of the load

You may have one or more "slow" disk drives which slow down the whole vdev due to long wait times. If you can identify those slow disk drives and replace them, then overall performance is likely to improve.

The problem is that under severe load, the vdev with the highest backlog will be used the least. One or more slow disks in the vdev will slow down the whole vdev. It takes only one slow disk to slow down the whole vdev.

> (b) Why is a large raidz2 so bad? When I use a standard Linux box with
> hardware raid6 over 16 disks I usually get more bandwidth and at least
> about the same small file performance

ZFS commits the writes to all involved disks in a raidz2 before proceeding with the next write. With so many disks, you are asking for quite a lot of fortuitous luck in that everything must be working optimally. Compounding the problem is that I understand that when the stripe width exceeds the number of segmented blocks from the data to be written (ZFS is only willing to dice to a certain minimum size), then only a subset of the disks will be used, wasting potential I/O bandwidth. Your stripes are too wide.

> (c) Would the use of several smaller vdevs help much? And which
> layout would be a good compromise for getting space as well as
> performance and reliability? 46 disks have so few prime factors

Yes, more vdevs should definitely help quite a lot for dealing with real-world multi-user loads. One raidz/raidz2 vdev provides (at most) the IOPS of a single disk.

There is a point of diminishing returns and your layout has gone far beyond this limit.

Bob
==
Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
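Bob's rule of thumb - one raidz/raidz2 vdev delivering roughly the random IOPS of a single disk - makes the layout trade-off easy to put numbers on. A back-of-envelope sketch (not from the thread; the 100 IOPS/disk figure and the candidate widths are assumptions for illustration only):

```python
# Toy estimate: random IOPS vs. usable data disks for different ways of
# carving 46 disks into raidz2 vdevs, using the one-disk-of-IOPS-per-vdev
# rule of thumb from this thread. 100 IOPS/disk is an assumed figure for
# 7200 rpm SATA drives, not a measurement.
DISK_IOPS = 100

def layout_estimate(n_disks, vdev_width, parity=2):
    vdevs = n_disks // vdev_width          # leftover disks become spares
    return {
        "vdevs": vdevs,
        "iops": vdevs * DISK_IOPS,         # ~1 disk of IOPS per vdev
        "capacity_disks": vdevs * (vdev_width - parity),
    }

for width in (46, 15, 11, 6):
    print(width, layout_estimate(46, width))
```

With these assumptions the three wide vdevs on the X4500 land around 300 IOPS for the whole pool, while seven 6-disk raidz2 vdevs would more than double that at the cost of roughly 11 disks' worth of capacity.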
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
Hi Miles,

It's probably a bad sign that although that post came through as anonymous in my e-mail, I recognised your style before I got half way through your post :)

I agree, the zpool status being out of date is weird. I'll dig out the bug number for that at some point as I'm sure I've mentioned it before. It looks to me like there are two separate pieces of code that work out the status of the pool. There's the stuff ZFS uses internally to run the pool, and then there's a completely separate piece that does the reporting to the end user. I agree that it could be a case of oversimplifying things.

There's no denying the ease of admin is one of ZFS' strengths, but I think the whole zpool status thing needs looking at again. Neither the way the command freezes, nor the out of date information make any sense to me.

And yes, I'm aware of the problems you've reported with resilvering. That's on my list of things to test with this. I've already done a quick test of running a scrub after the resilver (which appeared ok at first glance), and tomorrow I'll be testing the reboot status too.
Re: [zfs-discuss] [install-discuss] differences.. why?
On 12/02/08 11:29, dick hoogendijk wrote:

Lori Alt wrote:

On 12/02/08 03:21, jan damborsky wrote:

Hi Dick,

I am redirecting your question to the zfs-discuss mailing list, where people are more knowledgeable about this problem and your question could be better answered.

Best regards,
Jan

dick hoogendijk wrote:

I have s10u6 installed on my server. zfs list (partly):

NAME               USED   AVAIL  REFER  MOUNTPOINT
rpool              88.8G   140G  27.5K  /rpool
rpool/ROOT         20.0G   140G    18K  /rpool/ROOT
rpool/ROOT/s10BE2  20.0G   140G  7.78G  /

But just now, on a newly installed s10u6 system I got rpool/ROOT with a mountpoint "legacy"

The mount point for rpool/ROOT is supposed to be "legacy" because that dataset should never be mounted. It's just a "container" dataset to group all the BEs.

The drives were different. On the latter (legacy) system it was not formatted (yet) (in VirtualBox). On my server I switched from UFS to ZFS, so I first created a rpool and then did a luupgrade into it. This could explain the mountpoint /rpool/ROOT, but WHY the difference? Why can't s10u6 install the same mountpoint on the new disk? The server runs very well; is this "legacy" thing really needed?

When you created the rpool, did you also explicitly create the rpool/ROOT dataset? If you did create it and didn't set the mount point to "legacy", that explains why you ended up with your original configuration. If you didn't create the rpool/ROOT dataset yourself, and instead let LiveUpgrade create it automatically, and LiveUpgrade set the mountpoint to /rpool/ROOT, then that's a bug in LiveUpgrade (though a minor one, I think).

NO, I'm quite positive all I did was "zfs create rpool" and after that I did a "lucreate -n zfsBE -p rpool" followed by "luupgrade -u -n zfsBE -s /iso". So, it must have been LU that "forgot" to set the mountpoint to legacy.

yes, we verified that and filed a bug against LU.

What is the correct syntax to correct this situation?
I'm not sure you really need to, but you should be able to do this:

zfs unmount rpool/ROOT
zfs set mountpoint=legacy rpool/ROOT

Lori
[zfs-discuss] Asymmetric zpool load
Hi all,

We are running pretty large vdevs since the initial testing showed that our setup was not too much off the optimum. However, under real world load we do see quite some weird behaviour:

The system itself is an X4500 with 500 GB drives, and right now the system seems to be under heavy load, e.g. ls takes minutes to return on only a few hundred entries; top shows 10% kernel, rest idle. 'zpool iostat -v atlashome 60' shows (not the first output):

                 capacity     operations    bandwidth
pool           used  avail   read  write   read  write
------------  -----  -----  -----  -----  -----  -----
atlashome    2.11T  18.8T  2.29K     36  71.7M   138K
  raidz2      466G  6.36T    493     11  14.9M  34.1K
    c0t0d0       -      -     48      5  1.81M  3.52K
    c1t0d0       -      -     48      5  1.81M  3.46K
    c4t0d0       -      -     48      5  1.81M  3.27K
    c6t0d0       -      -     48      5  1.81M  3.40K
    c7t0d0       -      -     47      5  1.81M  3.40K
    c0t1d0       -      -     47      5  1.81M  3.20K
    c1t1d0       -      -     47      6  1.81M  3.59K
    c4t1d0       -      -     47      6  1.81M  3.53K
    c5t1d0       -      -     47      5  1.81M  3.33K
    c6t1d0       -      -     48      6  1.81M  3.67K
    c7t1d0       -      -     48      6  1.81M  3.66K
    c0t2d0       -      -     48      5  1.82M  3.42K
    c1t2d0       -      -     48      6  1.81M  3.56K
    c4t2d0       -      -     48      6  1.81M  3.54K
    c5t2d0       -      -     48      5  1.81M  3.41K
  raidz2      732G  6.10T    800     12  24.6M  52.3K
    c6t2d0       -      -    139      5  7.52M  4.54K
    c7t2d0       -      -    139      5  7.52M  4.81K
    c0t3d0       -      -    140      5  7.52M  4.98K
    c1t3d0       -      -    139      5  7.51M  4.47K
    c4t3d0       -      -    139      5  7.51M  4.82K
    c5t3d0       -      -    139      5  7.51M  4.99K
    c6t3d0       -      -    139      5  7.52M  4.44K
    c7t3d0       -      -    139      5  7.52M  4.78K
    c0t4d0       -      -    139      5  7.52M  4.97K
    c1t4d0       -      -    139      5  7.51M  4.60K
    c4t4d0       -      -    139      5  7.51M  4.86K
    c6t4d0       -      -    139      5  7.51M  4.99K
    c7t4d0       -      -    139      5  7.51M  4.52K
    c0t5d0       -      -    139      5  7.51M  4.78K
    c1t5d0       -      -    138      5  7.51M  4.94K
  raidz2      960G  6.31T  1.02K     12  32.2M  52.0K
    c4t5d0       -      -    178      5  9.29M  4.79K
    c5t5d0       -      -    178      5  9.28M  4.64K
    c6t5d0       -      -    179      5  9.29M  4.44K
    c7t5d0       -      -    178      4  9.26M  4.26K
    c0t6d0       -      -    178      5  9.28M  4.78K
    c1t6d0       -      -    178      5  9.20M  4.58K
    c4t6d0       -      -    178      5  9.26M  4.25K
    c5t6d0       -      -    177      4  9.21M  4.18K
    c6t6d0       -      -    178      5  9.29M  4.69K
    c7t6d0       -      -    177      5  9.26M  4.61K
    c0t7d0       -      -    177      5  9.29M  4.34K
    c1t7d0       -      -    177      5  9.24M  4.28K
    c4t7d0       -      -    177      5  9.29M  4.78K
    c5t7d0       -      -    177      5  9.27M  4.75K
    c6t7d0       -      -    177      5  9.29M  4.34K
    c7t7d0       -      -    177      5  9.27M  4.28K
------------  -----  -----  -----  -----  -----  -----
Questions:
(a) Why does the first vdev not get an equal share of the load?
(b) Why is a large raidz2 so bad? When I use a standard Linux box with hardware raid6 over 16 disks I usually get more bandwidth and at least about the same small file performance.
(c) Would the use of several smaller vdevs help much? And which layout would be a good compromise for getting space as well as performance and reliability? 46 disks have so few prime factors.

Thanks a lot

Carsten
Re: [zfs-discuss] A failed disk can bring down a machine?
On Tue, Dec 2, 2008 at 11:42 AM, Brian Hechinger <[EMAIL PROTECTED]> wrote: > I was not in front of the machine, I had remote hands working with me, so I > appologize in advance for any lack of detail I'm about to give. > > The server in question is running snv_81 booting ZFS Root using Tim's > scripts to > "convert" it over to ZFS Root. > > My server in colo stopped responding. I had a screen session open and I > could > switch between screen windows and create new windows but I could not run > any > commands. I also could not log into the box. > > The hands on person saw this on the console (transcribed from a video > console): > > SYNCHRONIZE CACHE command failed (5) > scsi: WARNING: /[EMAIL PROTECTED],0/pci1095,[EMAIL PROTECTED]/[EMAIL > PROTECTED],0 (sd1) > > sd1 is one of two SATA disks connected to the machine via a SiL3124 > controller. > > I had the remote hands pull sd1 and reboot the machine. It came right up > and has > been running fine since. Lacking its mirrored disks, however. > > Due to other issues I've had with this box (If you think you can get away > with running > ZFS on a 32-bit machine, you are mistaken) I'm looking to replace it > anyway. What > concerns me is that a single disk having gone bad like that can take out > the whole > machine. This is not what I would consider an ideal or acceptable setup > for a machine > that is in colo that doesn't have 24x7 onsite support. > > What was to blame for this disk failure causing my machine to become > unresponsive? Was > it the SiL3124? Is it something else? Is this what I should expect from > SATA? > > I ask all these questions as I want to make sure that if this is indeed > connected to the > use of a SATA controller, or the use of a specific SATA controller that I > certainly avoid > that with this next machine. > > I've got a very slim budget on this, and based on that I found what looks > like a pretty > nice little server that is in my budget. 
It's an ASUS RS161-E2/PA2 which > is based on the > nForce Professional 2200, which from what I can tell is what the Ultra 40 > is based on, so > I would expect it to pretty much just work. > > Will the nv_sata driver behave in a more sane fashion in a case like what > I've just gone > through? If this is a shortcoming of SATA, does anyone have any > recommendations on a not > too expensive setup based on a SAS controller? > > As much as I would like this thing to do a great job in the performance > arena, stability is > definitely higher on the list of what's really important to me. > > Thanks, > > -brian > I believe the issue you're running into is the failmode you currently have set. Take a look at this: http://prefetch.net/blog/index.php/2008/03/01/configuring-zfs-to-gracefully-deal-with-failures/ --Tim
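For reference, the pool property that blog post discusses is 'failmode'. A hypothetical example follows (the pool name 'tank' is assumed, and failmode needs a pool version recent enough to support it, so check 'zpool upgrade -v' on older builds like snv_81):

```shell
# Show the current failure policy; the default 'wait' blocks all pool
# I/O until the faulted device returns, which matches the hang above.
zpool get failmode tank

# 'continue' returns EIO for new writes instead of wedging the machine;
# 'panic' forces a crash dump, sometimes preferable on an unattended box.
zpool set failmode=continue tank
```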
[zfs-discuss] A failed disk can bring down a machine?
I was not in front of the machine, I had remote hands working with me, so I apologize in advance for any lack of detail I'm about to give.

The server in question is running snv_81 booting ZFS Root using Tim's scripts to "convert" it over to ZFS Root.

My server in colo stopped responding. I had a screen session open and I could switch between screen windows and create new windows, but I could not run any commands. I also could not log into the box.

The hands on person saw this on the console (transcribed from a video console):

SYNCHRONIZE CACHE command failed (5)
scsi: WARNING: /[EMAIL PROTECTED],0/pci1095,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd1)

sd1 is one of two SATA disks connected to the machine via a SiL3124 controller.

I had the remote hands pull sd1 and reboot the machine. It came right up and has been running fine since. Lacking its mirrored disks, however.

Due to other issues I've had with this box (if you think you can get away with running ZFS on a 32-bit machine, you are mistaken) I'm looking to replace it anyway. What concerns me is that a single disk having gone bad like that can take out the whole machine. This is not what I would consider an ideal or acceptable setup for a machine that is in colo that doesn't have 24x7 onsite support.

What was to blame for this disk failure causing my machine to become unresponsive? Was it the SiL3124? Is it something else? Is this what I should expect from SATA?

I ask all these questions as I want to make sure that if this is indeed connected to the use of a SATA controller, or the use of a specific SATA controller, I certainly avoid that with this next machine.

I've got a very slim budget on this, and based on that I found what looks like a pretty nice little server that is in my budget. It's an ASUS RS161-E2/PA2 which is based on the nForce Professional 2200, which from what I can tell is what the Ultra 40 is based on, so I would expect it to pretty much just work.
Will the nv_sata driver behave in a more sane fashion in a case like what I've just gone through? If this is a shortcoming of SATA, does anyone have any recommendations on a not too expensive setup based on a SAS controller?

As much as I would like this thing to do a great job in the performance arena, stability is definitely higher on the list of what's really important to me.

Thanks,

-brian

--
"Coding in C is like sending a 3 year old to do groceries. You gotta tell them exactly what you want or you'll end up with a cupboard full of pop tarts and pancake mix." -- IRC User (http://www.bash.org/?841435)
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
> "rs" == Ross Smith <[EMAIL PROTECTED]> writes: rs> 4. zpool status still reports out of date information. I know people are going to skim this message and not hear this. They'll say ``well of course zpool status says ONLINE while the pool is hung. ZFS is patiently waiting. It doesn't know anything is broken yet.'' but you are NOT saying it's out of date because it doesn't say OFFLINE the instant you power down an iSCSI target. You're saying: rs> - After 3 minutes, the iSCSI drive goes offline. rs> The pool carries on with the remaining two drives, CIFS rs> carries on working, iostat carries on working. "zpool status" rs> however is still out of date. rs> - zpool status eventually rs> catches up, and reports that the drive has gone offline. so, there is a ~30sec window when it's out of date. When you say ``goes offline'' in the first bullet, you're saying ``ZFS must have marked it offline internally, because the pool unfroze.'' but you found that even after it ``goes offline'' 'zpool status' still reports it ONLINE. The question is, what the hell is 'zpool status' reporting? not the status, apparently. It's supposed to be a diagnosis tool. Why should you have to second-guess it and infer the position of ZFS's various internal state machines through careful indirect observation, ``oops, CIFS just came back,'' or ``oh sometihng must have changed because zpool iostat isn't hanging any more''? Why not have a tool that TELLS you plainly what's going on? 'zpool status' isn't. Is it trying to oversimplify things, to condescend to the sysadmin or hide ZFS's rough edges? Are there more states for devices that are being compressed down to ONLINE OFFLINE DEGRADED FAULTED? Is there some tool in zdb or mdb that is like 'zpool status -simonsez'? I already know sometimes it'll report everything as ONLINE but refuse 'zpool offline ... ' with 'no valid replicas', so I think, yes there are ``secret states'' for devices? Or is it trying to do too many things with one output format? 
rs> 5. When iSCSI targets finally do come back online, ZFS is
rs> resilvering all of them (again, this rings a bell, Miles might
rs> have reported something similar).

my zpool status is so old it doesn't say ``xxkB resilvered'' so I've no indication which devices are the source vs. target of the resilver. What I found was, the auto-resilver isn't sufficient. If you wait for it to complete, then 'zpool scrub', you'll get thousands of CKSUM errors on the dirty device, so the resilver isn't covering all the dirtyness. Also ZFS seems to forget about the need to resilver if you shut down the machine, bring back the missing target, and boot---it marks everything ONLINE and then resilvers as you hit the dirty data, counting CKSUM errors.

This has likely been fixed between b71 and b101. It's easy to test: (a) shut down one iSCSI target, (b) write to the pool, (c) bring the iSCSI target back, (d) wait for auto-resilver to finish, (e) 'zpool scrub', (f) look for CKSUM errors. I suspect you're more worried about your own problems though---I'll try to retest it soon.
Re: [zfs-discuss] [install-discuss] differences.. why?
Lori Alt wrote:
> On 12/02/08 03:21, jan damborsky wrote:
>> Hi Dick,
>>
>> I am redirecting your question to zfs-discuss
>> mailing list, where people are more knowledgeable
>> about this problem and your question could be
>> better answered.
>>
>> Best regards,
>> Jan
>>
>> dick hoogendijk wrote:
>>
>>> I have s10u6 installed on my server.
>>> zfs list (partly):
>>> NAME               USED   AVAIL  REFER  MOUNTPOINT
>>> rpool              88.8G   140G  27.5K  /rpool
>>> rpool/ROOT         20.0G   140G    18K  /rpool/ROOT
>>> rpool/ROOT/s10BE2  20.0G   140G  7.78G  /
>>>
>>> But just now, on a newly installed s10u6 system I got rpool/ROOT with a
>>> mountpoint "legacy"
>>>
> The mount point for rpool/ROOT is supposed
> to be "legacy" because that dataset should never be mounted.
> It's just a "container" dataset to group all the BEs.
>
>>> The drives were different. On the latter (legacy) system it was not
>>> formatted (yet) (in VirtualBox). On my server I switched from UFS to
>>> ZFS, so I first created a rpool and then did a luupgrade into it.
>>> This could explain the mountpoint /rpool/ROOT but WHY the difference?
>>> Why can't s10u6 install the same mountpoint on the new disk?
>>> The server runs very well; is this "legacy" thing really needed?
>>>
> When you created the rpool, did you also explicitly create the rpool/ROOT
> datasets? If you did create it and didn't set the mount point to "legacy",
> that explains why you ended up with your original configuration. If
> you didn't create the rpool/ROOT dataset yourself, and instead let
> LiveUpgrade create it automatically, and LiveUpgrade set the mountpoint
> to /rpool/ROOT, then that's a bug in LiveUpgrade (though a minor one,
> I think).

NO, I'm quite positive all I did was "zfs create rpool" and after that I did a "lucreate -n zfsBE -p rpool" followed by "luupgrade -u -n zfsBE -s /iso". So, it must have been LU that "forgot" to set the mountpoint to legacy. What is the correct syntax to correct this situation?
-- Dick Hoogendijk -- PGP/GnuPG key: F86289CE +http://nagual.nl/ | SunOS 10u6 10/08 ZFS+
Re: [zfs-discuss] Separate /var
- Original Message - From: Lori Alt <[EMAIL PROTECTED]> Date: Tuesday, December 2, 2008 11:19 am Subject: Re: [zfs-discuss] Separate /var To: Gary Mills <[EMAIL PROTECTED]> Cc: zfs-discuss@opensolaris.org > On 12/02/08 09:00, Gary Mills wrote: > > On Mon, Dec 01, 2008 at 04:45:16PM -0700, Lori Alt wrote: > > > >>On 11/27/08 17:18, Gary Mills wrote: > >> On Fri, Nov 28, 2008 at 11:19:14AM +1300, Ian Collins wrote: > >> On Fri 28/11/08 10:53 , Gary Mills [EMAIL PROTECTED] sent: > >> On Fri, Nov 28, 2008 at 07:39:43AM +1100, Edward Irvine wrote: > >> > >> I'm currently working with an organisation who > >> want use ZFS for their > full zones. Storage is SAN attached, and > they > >> also want to create a > separate /var for each zone, which causes > issues > >> when the zone is > installed. They believe that a separate /var is > >> still good practice. > >> If your mount options are different for /var and /, you will need > >> a separate filesystem. In our case, we use `setuid=off' and > >> `devices=off' on /var for security reasons. We do the same thing > >> for home directories and /tmp . > >> > >> For zones? > >> > >> Sure, if you require different mount options in the zones. > >> > >>I looked into this and found that, using ufs, you can indeed > set up > >>the zone's /var directory as a separate file system. I don't know > >>about > >>how LiveUpgrade works with that configuration (I didn't try it). > >>But I was at least able to get the zone to install and boot. > >>But with zfs, I couldn't even get a zone with a separate /var > >>dataset to install, let alone be manageable with LiveUpgrade. > >>I configured the zone like so: > >># zonecfg -z z4 > >>z4: No such zone configured > >>Use 'create' to begin configuring a new zone. 
> >>zonecfg:z4> create
> >>zonecfg:z4> set zonepath=/zfszones/z4
> >>zonecfg:z4> add fs
> >>zonecfg:z4:fs> set dir=/var
> >>zonecfg:z4:fs> set special=rpool/ROOT/s10x_u6wos_07b/zfszones/z4/var
> >>zonecfg:z4:fs> set type=zfs
> >>zonecfg:z4:fs> end
> >>zonecfg:z4> exit
> >>I then get this result from trying to install the zone:
> >>prancer# zoneadm -z z4 install
> >>Preparing to install zone .
> >>ERROR: No such file or directory: cannot mount

I think you're running into the problem of defining the var as the filesystem that already exists under the zone root. We had issues with that, so any time I've been doing filesystems, I don't push in zfs datasets; I create a zfs filesystem in the global zone and mount that directory into the zone with lofs.

For example, I've got a pool zdisk with a filesystem down the path - zdisk/zones/zvars/(zonename) which mounts itself to - /zdisk/zones/zvars/(zonename)

It's a ZFS filesystem with quota and reservation setup, and I just do an lofs to it via these lines in the /etc/zones/(zonename).xml file - I think that's the equivalent of the following zonecfg lines -

zonecfg:z4> add fs
zonecfg:z4:fs> set dir=/var
zonecfg:z4:fs> set special=/zdisk/zones/zvars/z4/var
zonecfg:z4:fs> set type=lofs
zonecfg:z4:fs> end

I think to put the zfs into the zone, you need to do an add dataset instead of an add fs. I tried that once, though, and didn't completely like the results. The dataset was controllable inside the zone (which is what I wanted at the time), but it wasn't controllable from the global zone anymore. And I couldn't access it from the global zone easily to get the backup software to pick it up. Doing it this way means you have to manage the zfs datasets from the global zone, but that's not really an issue here.

So, create the separate filesystems you want in the global zone (without stacking them under the zoneroot - separate directory somewhere), setup the zfs stuff you want, then lofs it into the local zone.
I've had that install successfully before. Hope that's helpful in some way! > >> > > > > You might have to pre-create this filesystem. `special' may not be > > needed at all. > > > I did pre-create the file system. Also, I tried omitting "special" and > zonecfg complains. > > I think that there might need to be some changes > to zonecfg and the zone installation code to get separate > /var datasets in non-global zones to work. > > Lori > > > >>in non-global zone to install: the source block device or directory > >> cannot be accessed > >>ERROR: cannot setup zone inherited and configured file systems > >>ERROR: cannot setup zone file systems inherited and configured > >>from the global zone > >>ERROR: cannot create zone boot environment > >>I don't fully understand the failures here. I suspect that > there are > >>problems both in the zfs code and zones code. It SHOULD work though. > >>The fact that it doesn't seems like a bug. > >>In the meantime, I guess we have to conclude that a separate /var > >>in a non-global zone
Re: [zfs-discuss] Problem importing degraded Pool
It seems that my devices have several settings of pools :-(

zdb -l /dev/rdsk/c0t5d0 tells me

LABEL 0
failed to unpack label 0
LABEL 1
failed to unpack label 1
LABEL 2
    version=6
    name='tank'
    state=0
    txg=4
    pool_guid=1230498626424814687
    hostid=2180312168
    hostname='sunny.local'
    top_guid=7409377091667366359
    guid=7409377091667366359
    vdev_tree
        type='disk'
        id=1
        guid=7409377091667366359
        path='/dev/ad6'
        devid='ad:S13UJDWQ726303'
        whole_disk=0
        metaslab_array=14
        metaslab_shift=32
        ashift=9
        asize=750151532544
LABEL 3
    version=6
    name='tank'
    state=0
    txg=4
    pool_guid=1230498626424814687
    hostid=2180312168
    hostname='sunny.local'
    top_guid=7409377091667366359
    guid=7409377091667366359
    vdev_tree
        type='disk'
        id=1
        guid=7409377091667366359
        path='/dev/ad6'
        devid='ad:S13UJDWQ726303'
        whole_disk=0
        metaslab_array=14
        metaslab_shift=32
        ashift=9
        asize=750151532544

zdb -l /dev/rdsk/c0t5d0s0 tells me

LABEL 0
    version=10
    name='tank'
    state=0
    txg=72220
    pool_guid=1717390511944489
    hostname='sunny'
    top_guid=2169144823532120681
    guid=2169144823532120681
    vdev_tree
        type='disk'
        id=1
        guid=2169144823532120681
        path='/dev/dsk/c0t1d0s0'
        devid='id1,[EMAIL PROTECTED]/a'
        phys_path='/[EMAIL PROTECTED],0/pci1002,[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
        whole_disk=1
        metaslab_array=14
        metaslab_shift=32
        ashift=9
        asize=750142881792
        is_log=0
        DTL=93
LABEL 1
    version=10
    name='tank'
    state=0
    txg=72220
    pool_guid=1717390511944489
    hostname='sunny'
    top_guid=2169144823532120681
    guid=2169144823532120681
    vdev_tree
        type='disk'
        id=1
        guid=2169144823532120681
        path='/dev/dsk/c0t1d0s0'
        devid='id1,[EMAIL PROTECTED]/a'
        phys_path='/[EMAIL PROTECTED],0/pci1002,[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
        whole_disk=1
        metaslab_array=14
        metaslab_shift=32
        ashift=9
        asize=750142881792
        is_log=0
        DTL=93
LABEL 2
    version=10
    name='tank'
    state=0
    txg=72220
    pool_guid=1717390511944489
    hostname='sunny'
    top_guid=2169144823532120681
    guid=2169144823532120681
    vdev_tree
        type='disk'
        id=1
        guid=2169144823532120681
        path='/dev/dsk/c0t1d0s0'
        devid='id1,[EMAIL PROTECTED]/a'
        phys_path='/[EMAIL PROTECTED],0/pci1002,[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
        whole_disk=1
        metaslab_array=14
        metaslab_shift=32
        ashift=9
        asize=750142881792
        is_log=0
        DTL=93
LABEL 3
    version=10
    name='tank'
    state=0
    txg=72220
    pool_guid=1717390511944489
    hostname='sunny'
    top_guid=2169144823532120681
    guid=2169144823532120681
    vdev_tree
        type='disk'
        id=1
        guid=2169144823532120681
        path='/dev/dsk/c0t1d0s0'
        devid='id1,[EMAIL PROTECTED]/a'
        phys_path='/[EMAIL PROTECTED],0/pci1002,[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a'
        whole_disk=1
        metaslab_array=14
        metaslab_shift=32
        ashift=9
        asize=750142881792
        is_log=0
        DTL=93

So the right pool data of pool # 1717390511944489 is in c0t2d0s0 and c0t5d0s0. But somehow there is a second pool setting in c0t2d0 and c0t5d0. Thanks to Richard Elling for pointing that out.

So is it possible to clear the invalid pool setting and just use the valid ones in the s0 partitions?
Re: [zfs-discuss] How often to scrub?
On Tue, 2 Dec 2008, Toby Thain wrote:
> Even that is probably more frequent than necessary. I'm sure somebody
> has done the MTTDL math. IIRC, the big win is doing any scrubbing at
> all. The difference between scrubbing every 2 weeks and every 2
> months may be negligible. (IANAMathematician tho)

This surely depends on the type of hardware used. If the disks are not true "enterprise" grade (e.g. ordinary SATA drives), then scrubbing more often is likely warranted, since these are much more likely to exhibit user-visible decay over a period of time, and the scrub will find (and correct) the decaying bits before it is too late. Enterprise-class disks should not require scrubbing very often; they are almost as likely to go entirely belly up as they are to produce a bad sector.

Bob
==
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
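Toby's "the big win is doing any scrubbing at all" intuition can be illustrated with a deliberately crude toy model (my own assumption for illustration, not anything from this thread): if latent sector errors appear at a constant rate and each scrub clears them, the average number of latent errors lying in wait scales linearly with the scrub interval, so the jump from "never" to "sometimes" dwarfs the jump from monthly to weekly:

```python
# Toy model (an illustrative assumption, not from the thread): latent
# sector errors appear at `rate` errors/disk/day, and a scrub every
# `interval_days` clears them all, so on average rate*interval_days/2
# latent errors are present at any instant.
def expected_latent_errors(rate: float, interval_days: float) -> float:
    return rate * interval_days / 2

RATE = 1e-3  # hypothetical: about one latent error per disk per 3 years

never     = expected_latent_errors(RATE, 5 * 365)  # no scrub over a 5y life
biweekly  = expected_latent_errors(RATE, 14)
bimonthly = expected_latent_errors(RATE, 60)

# Any scrubbing at all cuts exposure by more than an order of magnitude;
# tightening from 2-month to 2-week scrubs only buys the ratio 60/14.
```

Under this model the rate constant cancels out of the comparisons, which is the point: the conclusion (scrub at all, the exact period matters much less) holds for any error rate.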
Re: [zfs-discuss] Separate /var
On Tue, Dec 2, 2008 at 11:17 AM, Lori Alt <[EMAIL PROTECTED]> wrote:
> I did pre-create the file system. Also, I tried omitting "special" and
> zonecfg complains.
>
> I think that there might need to be some changes
> to zonecfg and the zone installation code to get separate
> /var datasets in non-global zones to work.

You could probably do something like:

zfs create rpool/zones/$zone
zfs create rpool/zones/$zone/var
zonecfg -z $zone
  add fs
  set dir=/var
  set special=/zones/$zone/var
  set type=lofs
  end
  ...
zoneadm -z $zone install
zonecfg -z $zone remove fs dir=/var
zfs set mountpoint=/zones/$zone/root/var rpool/zones/$zone/var

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
Re: [zfs-discuss] Separate /var
On 12/02/08 09:00, Gary Mills wrote:
> On Mon, Dec 01, 2008 at 04:45:16PM -0700, Lori Alt wrote:
>> On 11/27/08 17:18, Gary Mills wrote:
>>> On Fri, Nov 28, 2008 at 11:19:14AM +1300, Ian Collins wrote:
>>>> On Fri 28/11/08 10:53 , Gary Mills [EMAIL PROTECTED] sent:
>>>>> On Fri, Nov 28, 2008 at 07:39:43AM +1100, Edward Irvine wrote:
>>>>>> I'm currently working with an organisation who want to use ZFS
>>>>>> for their full zones. Storage is SAN attached, and they also want
>>>>>> to create a separate /var for each zone, which causes issues when
>>>>>> the zone is installed. They believe that a separate /var is still
>>>>>> good practice.
>>>>> If your mount options are different for /var and /, you will need
>>>>> a separate filesystem. In our case, we use `setuid=off' and
>>>>> `devices=off' on /var for security reasons. We do the same thing
>>>>> for home directories and /tmp .
>>>> For zones?
>>> Sure, if you require different mount options in the zones.
>> I looked into this and found that, using ufs, you can indeed set up
>> the zone's /var directory as a separate file system. I don't know how
>> LiveUpgrade works with that configuration (I didn't try it), but I
>> was at least able to get the zone to install and boot.
>>
>> With zfs, though, I couldn't even get a zone with a separate /var
>> dataset to install, let alone be manageable with LiveUpgrade.
>> I configured the zone like so:
>>
>> # zonecfg -z z4
>> z4: No such zone configured
>> Use 'create' to begin configuring a new zone.
>> zonecfg:z4> create
>> zonecfg:z4> set zonepath=/zfszones/z4
>> zonecfg:z4> add fs
>> zonecfg:z4:fs> set dir=/var
>> zonecfg:z4:fs> set special=rpool/ROOT/s10x_u6wos_07b/zfszones/z4/var
>> zonecfg:z4:fs> set type=zfs
>> zonecfg:z4:fs> end
>> zonecfg:z4> exit
>>
>> I then get this result from trying to install the zone:
>>
>> prancer# zoneadm -z z4 install
>> Preparing to install zone .
>> ERROR: No such file or directory: cannot mount  in non-global zone
>> to install: the source block device or directory cannot be accessed
>> ERROR: cannot setup zone  inherited and configured file systems
>> ERROR: cannot setup zone  file systems inherited and configured
>> from the global zone
>> ERROR: cannot create zone boot environment
>
> You might have to pre-create this filesystem. `special' may not be
> needed at all.
>
> I haven't tried ZFS zone roots myself, but I do have a few comments.
> ZFS filesystems are cheap because they don't require separate disk
> slices. As well, they are attribute boundaries. Those are necessary
> or convenient in some cases.

I did pre-create the file system. Also, I tried omitting "special" and zonecfg complains.

I think that there might need to be some changes to zonecfg and the zone installation code to get separate /var datasets in non-global zones to work. I don't fully understand the failures here. I suspect that there are problems in both the zfs code and the zones code. It SHOULD work, though; the fact that it doesn't seems like a bug. In the meantime, I guess we have to conclude that a separate /var in a non-global zone is not supported on zfs. A separate /var in the global zone is supported, however, even when the root is zfs.

Lori
Re: [zfs-discuss] [install-discuss] differences.. why?
On 12/02/08 03:21, jan damborsky wrote:
> Hi Dick,
>
> I am redirecting your question to the zfs-discuss mailing list, where
> people are more knowledgeable about this problem and your question
> could be better answered.
>
> Best regards,
> Jan
>
> dick hoogendijk wrote:
>> I have s10u6 installed on my server. zfs list (partly):
>>
>> NAME               USED  AVAIL  REFER  MOUNTPOINT
>> rpool             88.8G   140G  27.5K  /rpool
>> rpool/ROOT        20.0G   140G    18K  /rpool/ROOT
>> rpool/ROOT/s10BE2 20.0G   140G  7.78G  /
>>
>> But just now, on a newly installed s10u6 system I got rpool/ROOT with
>> a mountpoint "legacy"

The mount point for <pool>/ROOT is supposed to be "legacy" because that dataset should never be mounted. It's just a "container" dataset to group all the BEs.

>> The drives were different. On the latter (legacy) system it was not
>> formatted (yet) (in VirtualBox). On my server I switched from UFS to
>> ZFS, so I first created a rpool and then did a luupgrade into it.
>> This could explain the mountpoint /rpool/ROOT, but WHY the difference?
>> Why can't s10u6 install the same mountpoint on the new disk?
>> The server runs very well; is this "legacy" thing really needed?

When you created the rpool, did you also explicitly create the rpool/ROOT dataset? If you did create it and didn't set the mount point to "legacy", that explains why you ended up with your original configuration. If you didn't create the rpool/ROOT dataset yourself, and instead let LiveUpgrade create it automatically, and LiveUpgrade set the mountpoint to /rpool/ROOT, then that's a bug in LiveUpgrade (though a minor one, I think).

Lori
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
Hi Richard,

Thanks, I'll give that a try. I think I just had a kernel dump while trying to boot this system back up, though; I don't think it likes it if the iscsi targets aren't available during boot. Again, that rings a bell, so I'll go see if that's another known bug.

Changing that setting on the fly didn't seem to help; if anything, things are worse this time around. I changed the timeout to 15 seconds, but didn't restart any services:

# echo iscsi_rx_max_window/D | mdb -k
iscsi_rx_max_window:
iscsi_rx_max_window:    180
# echo iscsi_rx_max_window/W0t15 | mdb -kw
iscsi_rx_max_window:    0xb4 = 0xf
# echo iscsi_rx_max_window/D | mdb -k
iscsi_rx_max_window:
iscsi_rx_max_window:    15

After making those changes and repeating the test, offlining an iscsi volume hung all the commands running on the pool. I had three ssh sessions open, running the following:

# zpool iostat -v iscsipool 10 100
# format < /dev/null
# time zpool status

They hung for what felt like a minute or so. After that, the CIFS copy timed out. After the CIFS copy timed out, I tried immediately restarting it. It took a few more seconds, but restarted with no problem. Within a few seconds of that restarting, iostat recovered, and format returned its result too. Around 30 seconds later, zpool status reported two drives, paused again, then showed the status of the third:

# time zpool status
  pool: iscsipool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Tue Dec  2 16:39:21 2008
config:

        NAME                                   STATE   READ WRITE CKSUM
        iscsipool                              ONLINE     0     0     0
          raidz1                               ONLINE     0     0     0
            c2t600144F04933FF6C5056967AC800d0  ONLINE     0     0     0  15K resilvered
            c2t600144F04934FAB35056964D9500d0  ONLINE     0     0     0  15K resilvered
            c2t600144F04934119E50569675FF00d0  ONLINE     0   200     0  24K resilvered

errors: No known data errors

real    3m51.774s
user    0m0.015s
sys     0m0.100s

Repeating that a few seconds later gives:

# time zpool status
  pool: iscsipool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: resilver completed after 0h0m with 0 errors on Tue Dec  2 16:39:21 2008
config:

        NAME                                   STATE     READ WRITE CKSUM
        iscsipool                              DEGRADED     0     0     0
          raidz1                               DEGRADED     0     0     0
            c2t600144F04933FF6C5056967AC800d0  ONLINE       0     0     0  15K resilvered
            c2t600144F04934FAB35056964D9500d0  ONLINE       0     0     0  15K resilvered
            c2t600144F04934119E50569675FF00d0  UNAVAIL      3 5.80K     0  cannot open

errors: No known data errors

real    0m0.272s
user    0m0.029s
sys     0m0.169s

On Tue, Dec 2, 2008 at 3:58 PM, Richard Elling <[EMAIL PROTECTED]> wrote:
...
> iSCSI timeout is set to 180 seconds in the client code. The only way
> to change it is to recompile, or use mdb. Since you have this test rig
> set up, and I don't, do you want to experiment with this timeout?
> The variable is actually called "iscsi_rx_max_window", so if you do
>   echo iscsi_rx_max_window/D | mdb -k
> you should see "180".
> Change it using something like:
>   echo iscsi_rx_max_window/W0t30 | mdb -kw
> to set it to 30 seconds.
> -- richard
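For anyone repeating this experiment, the mdb pokes above can be wrapped in a tiny dry-run helper (my own sketch, not a Sun-supplied tool) that only prints the commands you would pipe into `mdb -kw`, so you can review them first; mdb's `W0t<n>` syntax writes the decimal value <n>:

```shell
# Sketch, assuming the variable name iscsi_rx_max_window quoted by
# Richard. This function emits the mdb commands rather than running
# them, so nothing touches the kernel until you pipe the output into
# `mdb -kw` yourself (as root, at your own risk).
iscsi_timeout_cmds() {
    secs="$1"
    printf 'iscsi_rx_max_window/D\n'                # read current value (decimal)
    printf 'iscsi_rx_max_window/W0t%s\n' "$secs"    # write new value (decimal)
}

# e.g. review, then: iscsi_timeout_cmds 30 | mdb -kw
iscsi_timeout_cmds 30
```

Note this only changes the running kernel; the value reverts to the compiled-in 180 on reboot unless set again.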
Re: [zfs-discuss] How often to scrub?
On 2-Dec-08, at 8:24 AM, Glaser, David wrote: > Ok, thanks for all the responses. I'll probably do every other week > scrubs, as this is the backup data (so doesn't need to be checked > constantly). Even that is probably more frequent than necessary. I'm sure somebody has done the MTTDL math. IIRC, the big win is doing any scrubbing at all. The difference between scrubbing every 2 weeks and every 2 months may be negligible. (IANAMathematician tho) --T > I'm a little concerned about the time involved to do 33TB (after > the 48TB has been RAIDed fully) when it is fully populated with > filesystems and snapshots, but I'll keep an eye on it. > > Thanks all. > > Dave > > > -Original Message- > From: Paul Weaver [mailto:[EMAIL PROTECTED] > Sent: Tuesday, December 02, 2008 8:11 AM > To: Glaser, David; zfs-discuss@opensolaris.org > Subject: RE: [zfs-discuss] How often to scrub? > >> I have a Thumper (ok, actually 3) with each having one large pool, > multiple >> filesystems and many snapshots. They are holding rsync copies of > multiple >> clients, being synced every night (using snapshots to keep > 'incremental' >> backups). >> >> I'm wondering how often (if ever) I should do scrubs of the pools, or > if >> the internal zfs integrity is enough that I don't need to do manual > scrubs >> of the pool? I read through a number of tutorials online as well as > the zfs >> wiki entry, but I didn't see anything very pertinent. Scrubs are I/O >> intensive, but is the Pool able to be used normally during a scrub? I >> think the answer is yes, but some confirmation helps me sleep at > night. > > Scrubs are the lowest priority, so I understand it should > theoretically > work fine. > > We've got two 48TB thumpers, with a nightly rsync from the main to the > reserve. I'm currently running a scrub every Friday at 23:02, which > last > week took 5h15 to scrub the 7TB of used data (about 5TB of real, > 2TB of > snapshots) on the single pool. That's about 380MBytes/second. 
> > > -- > > "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" > > Paul Weaver > Systems Development Engineer > News Production Facilities, BBC News > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How often to scrub?
On Tue, Dec 2, 2008 at 10:15, Paul Weaver <[EMAIL PROTECTED]> wrote:
> So you've got a zpool across 46 (48?) of the disks?
>
> When I was looking into our thumpers everyone seemed to think a raidz
> over more than 10 disks was a hideous idea.

A vdev that size is bad; a pool that size composed of multiple vdevs is fine. Raidz2 is always recommended over raidz.

Will
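The wide-vdev vs many-vdevs distinction can be made concrete with a deliberately simplified reliability sketch (my own toy model: independent disk failures with a fixed instantaneous failure probability, ignoring rebuild times and correlated failures). A single 46-disk raidz1 dies if any two disks are down at once, while a pool of five 9-disk raidz2 vdevs needs three simultaneous failures inside the same vdev:

```python
from math import comb

def p_vdev_loss(n_disks: int, parity: int, p_fail: float) -> float:
    """Probability that more than `parity` of `n_disks` are failed at
    once (binomial, independent failures) -- enough to lose the vdev."""
    return sum(comb(n_disks, k) * p_fail**k * (1 - p_fail)**(n_disks - k)
               for k in range(parity + 1, n_disks + 1))

P = 0.02  # hypothetical chance a given disk is down at any instant

wide_raidz1 = p_vdev_loss(46, 1, P)              # one 46-disk raidz1
five_raidz2 = 1 - (1 - p_vdev_loss(9, 2, P))**5  # five 9-disk raidz2 vdevs
# Losing any vdev loses the pool, yet the narrow-raidz2 layout is still
# orders of magnitude safer than the single wide raidz1.
```

The exact numbers depend entirely on the assumed per-disk probability, but the ordering does not, which is why the wide-raidz advice holds generally.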
Re: [zfs-discuss] Separate /var
On Mon, Dec 01, 2008 at 04:45:16PM -0700, Lori Alt wrote:
> On 11/27/08 17:18, Gary Mills wrote:
>> On Fri, Nov 28, 2008 at 11:19:14AM +1300, Ian Collins wrote:
>>> On Fri 28/11/08 10:53 , Gary Mills [EMAIL PROTECTED] sent:
>>>> On Fri, Nov 28, 2008 at 07:39:43AM +1100, Edward Irvine wrote:
>>>>> I'm currently working with an organisation who want to use ZFS for
>>>>> their full zones. Storage is SAN attached, and they also want to
>>>>> create a separate /var for each zone, which causes issues when the
>>>>> zone is installed. They believe that a separate /var is still good
>>>>> practice.
>>>> If your mount options are different for /var and /, you will need
>>>> a separate filesystem. In our case, we use `setuid=off' and
>>>> `devices=off' on /var for security reasons. We do the same thing
>>>> for home directories and /tmp .
>>> For zones?
>> Sure, if you require different mount options in the zones.
> I looked into this and found that, using ufs, you can indeed set up
> the zone's /var directory as a separate file system. I don't know how
> LiveUpgrade works with that configuration (I didn't try it), but I was
> at least able to get the zone to install and boot.
>
> With zfs, though, I couldn't even get a zone with a separate /var
> dataset to install, let alone be manageable with LiveUpgrade.
> I configured the zone like so:
>
> # zonecfg -z z4
> z4: No such zone configured
> Use 'create' to begin configuring a new zone.
> zonecfg:z4> create
> zonecfg:z4> set zonepath=/zfszones/z4
> zonecfg:z4> add fs
> zonecfg:z4:fs> set dir=/var
> zonecfg:z4:fs> set special=rpool/ROOT/s10x_u6wos_07b/zfszones/z4/var
> zonecfg:z4:fs> set type=zfs
> zonecfg:z4:fs> end
> zonecfg:z4> exit
>
> I then get this result from trying to install the zone:
>
> prancer# zoneadm -z z4 install
> Preparing to install zone .
> ERROR: No such file or directory: cannot mount  in non-global zone
> to install: the source block device or directory cannot be accessed
> ERROR: cannot setup zone  inherited and configured file systems
> ERROR: cannot setup zone  file systems inherited and configured
> from the global zone
> ERROR: cannot create zone boot environment

You might have to pre-create this filesystem. `special' may not be needed at all.

> I don't fully understand the failures here. I suspect that there are
> problems both in the zfs code and zones code. It SHOULD work though.
> The fact that it doesn't seems like a bug.
> In the meantime, I guess we have to conclude that a separate /var
> in a non-global zone is not supported on zfs. A separate /var in
> the global zone is supported however, even when the root is zfs.

I haven't tried ZFS zone roots myself, but I do have a few comments. ZFS filesystems are cheap because they don't require separate disk slices. As well, they are attribute boundaries. Those are necessary or convenient in some cases.

-- 
-Gary Mills--Unix Support--U of M Academic Computing and Networking-
Re: [zfs-discuss] Problem importing degraded Pool
Thanks for your suggestions couper88, but this did not help :-/. I tried the latest live CD of 2008.11 and got new information. A zpool import now shows me:

[EMAIL PROTECTED]:~# zpool import
  pool: tank
    id: 1717390511944489
 state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        tank        UNAVAIL  insufficient replicas
          c3t5d0    ONLINE

  pool: tank
    id: 1230498626424814687
 state: FAULTED
status: The pool was last accessed by another system.
action: The pool cannot be imported due to damaged devices or data.
        The pool may be active on another system, but can be imported using
        the '-f' flag.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

        tank        FAULTED  corrupted data
          c3t5d0p0  FAULTED  corrupted data
          c3t2d0p0  ONLINE

So I think the second pool is the right one... BUT I really do not know how to import it. I tried both:

[EMAIL PROTECTED]:~# zpool import -f tank
cannot import 'tank': more than one matching pool
import by numeric ID instead

[EMAIL PROTECTED]:/dev/rdsk# zpool import -f 1230498626424814687
cannot import 'tank': one or more devices is currently unavailable

So I have a little bit more hope now... but there is still the problem that I cannot import that specific pool :-/
Re: [zfs-discuss] How often to scrub?
So you've got a zpool across 46 (48?) of the disks? When I was looking into our thumpers everyone seemed to think a raidz over more than 10 disks was a hideous idea. -- Paul Weaver Systems Development Engineer News Production Facilities, BBC News Work: 020 822 58109 Room 1244 Television Centre, Wood Lane, London, W12 7RJ > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] On Behalf Of > Glaser, David > Sent: 02 December 2008 13:24 > To: zfs-discuss@opensolaris.org > Subject: Re: [zfs-discuss] How often to scrub? > > Ok, thanks for all the responses. I'll probably do every > other week scrubs, as this is the backup data (so doesn't > need to be checked constantly). I'm a little concerned about > the time involved to do 33TB (after the 48TB has been RAIDed > fully) when it is fully populated with filesystems and > snapshots, but I'll keep an eye on it. > > Thanks all. > > Dave > > > -Original Message- > From: Paul Weaver [mailto:[EMAIL PROTECTED] > Sent: Tuesday, December 02, 2008 8:11 AM > To: Glaser, David; zfs-discuss@opensolaris.org > Subject: RE: [zfs-discuss] How often to scrub? > > > I have a Thumper (ok, actually 3) with each having one large pool, > multiple > > filesystems and many snapshots. They are holding rsync copies of > multiple > > clients, being synced every night (using snapshots to keep > 'incremental' > > backups). > > > > I'm wondering how often (if ever) I should do scrubs of the > pools, or > if > > the internal zfs integrity is enough that I don't need to do manual > scrubs > > of the pool? I read through a number of tutorials online as well as > the zfs > > wiki entry, but I didn't see anything very pertinent. > Scrubs are I/O > > intensive, but is the Pool able to be used normally during > a scrub? I > > think the answer is yes, but some confirmation helps me sleep at > night. > > Scrubs are the lowest priority, so I understand it should > theoretically work fine. 
> > We've got two 48TB thumpers, with a nightly rsync from the > main to the reserve. I'm currently running a scrub every > Friday at 23:02, which last week took 5h15 to scrub the 7TB > of used data (about 5TB of real, 2TB of > snapshots) on the single pool. That's about 380MBytes/second. > > > -- > > "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" > > Paul Weaver > Systems Development Engineer > News Production Facilities, BBC News > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] RE : rsync using 100% of a cpu
How are the two sides different? If you run something like 'openssl md5sum' on both sides is it much faster on one side? Does one machine have a lot more memory/ARC and allow it to skip the physical reads? Is the dataset compressed on one side? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How often to scrub?
Ok, thanks for all the responses. I'll probably do every other week scrubs, as this is the backup data (so doesn't need to be checked constantly). I'm a little concerned about the time involved to do 33TB (after the 48TB has been RAIDed fully) when it is fully populated with filesystems and snapshots, but I'll keep an eye on it. Thanks all. Dave -Original Message- From: Paul Weaver [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 02, 2008 8:11 AM To: Glaser, David; zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] How often to scrub? > I have a Thumper (ok, actually 3) with each having one large pool, multiple > filesystems and many snapshots. They are holding rsync copies of multiple > clients, being synced every night (using snapshots to keep 'incremental' > backups). > > I'm wondering how often (if ever) I should do scrubs of the pools, or if > the internal zfs integrity is enough that I don't need to do manual scrubs > of the pool? I read through a number of tutorials online as well as the zfs > wiki entry, but I didn't see anything very pertinent. Scrubs are I/O > intensive, but is the Pool able to be used normally during a scrub? I > think the answer is yes, but some confirmation helps me sleep at night. Scrubs are the lowest priority, so I understand it should theoretically work fine. We've got two 48TB thumpers, with a nightly rsync from the main to the reserve. I'm currently running a scrub every Friday at 23:02, which last week took 5h15 to scrub the 7TB of used data (about 5TB of real, 2TB of snapshots) on the single pool. That's about 380MBytes/second. -- "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" Paul Weaver Systems Development Engineer News Production Facilities, BBC News ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How often to scrub?
> I have a Thumper (ok, actually 3) with each having one large pool, multiple > filesystems and many snapshots. They are holding rsync copies of multiple > clients, being synced every night (using snapshots to keep 'incremental' > backups). > > I'm wondering how often (if ever) I should do scrubs of the pools, or if > the internal zfs integrity is enough that I don't need to do manual scrubs > of the pool? I read through a number of tutorials online as well as the zfs > wiki entry, but I didn't see anything very pertinent. Scrubs are I/O > intensive, but is the Pool able to be used normally during a scrub? I > think the answer is yes, but some confirmation helps me sleep at night. Scrubs are the lowest priority, so I understand it should theoretically work fine. We've got two 48TB thumpers, with a nightly rsync from the main to the reserve. I'm currently running a scrub every Friday at 23:02, which last week took 5h15 to scrub the 7TB of used data (about 5TB of real, 2TB of snapshots) on the single pool. That's about 380MBytes/second. -- "09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0" Paul Weaver Systems Development Engineer News Production Facilities, BBC News ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
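As a quick sanity check on the quoted scrub rate (assuming decimal TB/MB and that the scrub read all 7 TB of used data in the stated 5h15):

```python
# Back-of-envelope check of the quoted scrub numbers (assumption:
# decimal units; 7 TB of used data scrubbed in 5 hours 15 minutes).
used_bytes = 7e12
duration_s = 5 * 3600 + 15 * 60          # 5h15m = 18,900 s
mb_per_s = used_bytes / duration_s / 1e6
# comes out near 370 MB/s, consistent with the "about 380 MBytes/second"
# figure in the post
```

Spread over a 46-disk pool, that is under 10 MB/s per disk, which is why a weekly scrub at lowest priority can coexist with normal I/O.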
[zfs-discuss] RE : rsync using 100% of a cpu
>>"Francois Dion" wrote: >> Source is local to rsync, copying from a zfs file system, >> destination is remote over a dsl connection. Takes forever to just >> go through the unchanged files. Going the other way is not a >> problem, it takes a fraction of the time. Anybody seen that? >> Suggestions? >De: Blake Irvin [mailto:[EMAIL PROTECTED] >Upstream when using DSL is much slower than downstream? No, that's not the problem. I know ADSL is assymetrical. When there is an actual data transfer going on, the cpu drops to 0.2%. It's only when rsync is doing its thing (reading, not writing) locally that it pegs the cpu. We are talking 15 minutes in one direction while in the other it looks like I'll pass the 24 hours mark before the rsync is complete. And there were less than 100MB added on each side. BTW, the only other process I've seen that pegs the cpu solid for as long as it runs on my v480 is when I downloaded Belenix through a python script (btdownloadheadless). ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450
Hi,

Has anyone implemented hardware RAID 1/5 on the Sun X4150/X4450 class of servers? Is there any comparison between ZFS and hardware RAID? I would like to know about the experience (good or bad) and the pros and cons.

Regards,
Vikash
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
Incidentally, while I've reported this again as an RFE, I still haven't seen a CR number for it. Could somebody from Sun check whether it's been filed, please.

thanks,

Ross
Re: [zfs-discuss] ZFS fragmentation with MySQL databases
t. johnson wrote:
>>> One would expect so, yes. But the usefulness of this is limited to
>>> the cases where the entire working set will fit into an SSD cache.
>>
>> Not entirely out of the question. SSDs can be purchased today
>> with more than 500 GBytes in a 2.5" form factor. One or more of
>> these would make a dandy L2ARC.
>> http://www.stecinc.com/product/mach8mlc.php
>
> Speaking of which.. what's the current limit on L2ARC size? Gathering
> tidbits here and there (7000 storage line config limits, FAST talk
> given by Bill Moore) there are indications that L2ARC can only be
> ~500GB?

There is no limit on the size of the L2ARC that I could find implemented in the source code. However, every buffer that is cached on an L2ARC device needs an ARC header in the in-memory ARC that points to it, so in practical terms there will be a limit on the size of an L2ARC based on the size of physical RAM. For example, a machine with 512 MBytes of RAM and a 500 GByte SSD L2ARC is probably pretty silly. I'll leave it as an exercise to the reader to work out how much core memory is needed, based on the sizes of arc_buf_t (0x30) and arc_buf_hdr_t (0xf8).

-- 
Darren J Moffat
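Taking Darren's exercise at face value, here is one way the arithmetic could go (purely illustrative assumptions: one arc_buf_hdr_t pinned in RAM per cached record, and a 128 KByte recordsize; the real per-buffer accounting may differ):

```python
ARC_BUF_HDR_T = 0xf8   # 248 bytes, the size quoted in the post
ARC_BUF_T     = 0x30   # 48 bytes, also quoted (not counted here)

def l2arc_header_ram(l2arc_bytes: float, recordsize: int = 128 * 1024) -> float:
    """Rough RAM consumed by ARC headers for an L2ARC of the given size,
    assuming one arc_buf_hdr_t per cached record (an illustrative
    assumption, not the exact ZFS accounting)."""
    n_buffers = l2arc_bytes / recordsize
    return n_buffers * ARC_BUF_HDR_T

ram = l2arc_header_ram(500e9)   # the 500 GByte SSD from the thread
# roughly 0.9 GByte of RAM just for headers -- which is why fronting a
# 500 GByte L2ARC with a 512 MByte machine makes no sense
```

Smaller records make it dramatically worse: at an 8 KByte recordsize the same device would need around 15 GBytes of headers under this model.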
Re: [zfs-discuss] Problem importing degraded Pool
Hi,

Attach both original drives to the system; the faulty one may only have had a few checksum errors. "zpool status -v" should hopefully show your data pool, provided you have not started to replace the faulty drive yet. If it doesn't see the pool, zpool export then zpool import, and hope for the best.

If you get back to the original failed state, with your pool degraded but readable, it can be fixed easily, most of the time.

Do a zpool status -v  <- mind the -v. What's it saying about your pool? I suspect the faulty drive has checksum errors and has been off-lined.

Power down the system and add the spare 3rd drive, so you have all 3 drives connected. DO NOT MOVE the original drives to different connections in the system; that's just going to cause more trouble. While you're inside the system, check all the connections to the hard drives. Then power up the system.

Look up the ZFS commands, and read and understand what you're about to do.

You need to force the failed drive online:
# zpool online pool device

Do a zpool clear to clear the error log on the faulty pool:
# zpool clear pool

Now you have 2 choices here: back up your critical data to the new 3rd drive, or replace the faulty drive:
zpool replace [-f] pool device [new_device]

Now zfs is almost certainly going to complain like hell about the faulty pool during the copy/replace operation. To be blunt, your data is either readable or it's not. Run zpool clear and force the faulty drive online every time it gets put offline; this may be several times! Zfs will tell you exactly what files have been lost, if any. The process could take several hours.

Do a zpool scrub once it's finished, then back up your data. Use zpool status -v to monitor progress.

If you don't get a lot of errors from the faulty drive, you could try a low-level format to fix the drive -- after you have got the data off it ;)

One final word: a striped zpool with copies=2 is about as much use as a chocolate fire guard when it comes to protecting data. Use 3+ drives and raidz; it's far better.

I'm no expert; I've been using zfs for 7 months. When I first started using it, ZFS found 4 faulty drives in my setup, and other operating systems said they were good drives!!! So I have used ZFS to its full recovery potential!!!

Brian
Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better
Hey folks, I've just followed up on this, testing iSCSI with a raided pool, and it still appears to be struggling when a device goes offline.

>>> I don't see how this could work except for mirrored pools. Would
>>> that carry enough market to be worthwhile?
>>> -- richard
>>
>> I have to admit, I've not tested this with a raided pool, but since
>> all ZFS commands hung when my iSCSI device went offline, I assumed
>> that you would get the same effect of the pool hanging if a raid-z2
>> pool is waiting for a response from a device. Mirrored pools do work
>> particularly well with this since it gives you the potential to have
>> remote mirrors of your data, but if you had a raid-z2 pool, you still
>> wouldn't want that hanging if a single device failed.
>
> zpool commands hanging is CR 6667208, and has been fixed in b100.
> http://bugs.opensolaris.org/view_bug.do?bug_id=6667208
>
>> I will go and test the raid scenario though on a current build, just
>> to be sure.
>
> Please.
> -- richard

I've just created a pool using three snv_103 iSCSI targets, with a fourth install of snv_103 collating those targets into a raidz pool and sharing that out over CIFS. To test the server, while transferring files from a Windows workstation, I powered down one of the three iSCSI targets. It took a few minutes to shut down, but once that happened the Windows copy halted with the error: "The specified network name is no longer available."

At this point, the zfs admin tools still work fine (which is a huge improvement, well done!), but zpool status still reports that all three devices are online. A minute later, I can open the share again and start another copy. Thirty seconds after that, zpool status finally reports that the iscsi device is offline. So it looks like we have the same problems with that 3 minute delay, with zpool status reporting wrong information, and the CIFS service having problems too.
At this point I restarted the iSCSI target, but had problems bringing it back online. It appears there's a bug in the initiator, but it's easily worked around: http://www.opensolaris.org/jive/thread.jspa?messageID=312981

What was great was that as soon as the iSCSI initiator reconnected, ZFS started resilvering. What might not be so great is the fact that all three devices are showing that they've been resilvered:

# zpool status
  pool: iscsipool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h2m with 0 errors on Tue Dec 2 11:04:10 2008
config:

        NAME                                   STATE   READ WRITE CKSUM
        iscsipool                              ONLINE     0     0     0
          raidz1                               ONLINE     0     0     0
            c2t600144F04933FF6C5056967AC800d0  ONLINE     0     0     0  179K resilvered
            c2t600144F04934FAB35056964D9500d0  ONLINE     5 9.88K     0  311M resilvered
            c2t600144F04934119E50569675FF00d0  ONLINE     0     0     0  179K resilvered

errors: No known data errors

It's proving a little hard to know exactly what's happening when, since I've only got a few seconds to log times, and there are delays with each step.
However, I ran another test using robocopy and was able to observe the behaviour a little more closely:

Test 2: using robocopy for the transfer, and iostat plus zpool status on the server

10:46:30 - iSCSI server shutdown started
10:52:20 - all drives still online according to zpool status
10:53:30 - robocopy error: "The specified network name is no longer available"
         - zpool status shows all three drives as online
         - zpool iostat appears to have hung, taking much longer than the 30s specified to return a result
         - robocopy is now retrying the file, but appears to have hung
10:54:30 - robocopy, CIFS and iostat all start working again, pretty much simultaneously
         - zpool status now shows the drive as offline

I could probably do with using DTrace to get a better look at this, but I haven't learnt that yet. My guess as to what's happening:

- The iSCSI target goes offline.
- ZFS will not be notified for 3 minutes, but I/O to that device is essentially hung.
- CIFS times out (I suspect this is on the client side with around a 30s timeout, but I can't find the timeout documented anywhere).
- zpool iostat is now waiting; I may be wrong, but this doesn't appear to have benefited from the changes made to zpool status.
- After 3 minutes, the iSCSI drive goes offline. The pool carries on with the remaining two drives, CIFS carries on working, and iostat carries on working. "zpool status", however, is still out of date.
- zpool status eventually catches up.
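Since logging times by hand only gives a few seconds of precision, one low-tech alternative to DTrace is to poll `zpool status` in a loop and log the moment a device's reported state changes. The sketch below is mine, not from the original posts: it shows just the parsing step as an awk filter, fed a here-doc copy of the sample output above so it can be checked offline.

```shell
# parse_states: extract "device state" pairs from `zpool status` output.
# Matches lines whose first field looks like a cNtN... device name.
parse_states() {
    awk '$1 ~ /^c[0-9]+t/ { print $1, $2 }'
}

# Fed with a captured copy of the output above, so it runs without a pool:
parse_states <<'EOF'
        NAME                                   STATE   READ WRITE CKSUM
        iscsipool                              ONLINE     0     0     0
          raidz1                               ONLINE     0     0     0
            c2t600144F04933FF6C5056967AC800d0  ONLINE     0     0     0
            c2t600144F04934FAB35056964D9500d0  ONLINE     5 9.88K     0
            c2t600144F04934119E50569675FF00d0  ONLINE     0     0     0
EOF
```

Wrapped in a `while true; do ...; sleep 1; done` loop with `date` prepended to each changed line, this would timestamp exactly when zpool status changes its story about a device.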
Re: [zfs-discuss] ZFS fragmentation with MySQL databases
>> One would expect so, yes. But the usefulness of this is limited to the cases
>> where the entire working set will fit into an SSD cache.
>
> Not entirely out of the question. SSDs can be purchased today
> with more than 500 GBytes in a 2.5" form factor. One or more of
> these would make a dandy L2ARC.
> http://www.stecinc.com/product/mach8mlc.php

Speaking of which, what's the current limit on L2ARC size? Gathering tidbits here and there (7000 storage line config limits, a FAST talk given by Bill Moore), there are indications that the L2ARC can only be ~500GB. Is this the case? If so, is that a raw size limitation, a limit on the number of devices used to form the L2ARC, or something else? I'm sure some of us can come up with examples where we really would like to use much more than a 500GB L2ARC :)

-- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
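For readers following along, cache devices are attached to a pool with `zpool add`; this transcript is illustrative only (the pool and device names are made up, not from the thread):

```shell
# Attach two hypothetical SSDs as L2ARC cache devices to pool "tank":
zpool add tank cache c3t0d0 c3t1d0

# They then appear under a "cache" section in:
zpool status tank
```

Whether the combined size of such devices can usefully exceed ~500GB is exactly the question raised above.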
[zfs-discuss] zfs_nocacheflush, nvram, and root pools
hi,

i have a system connected to an external DAS (SCSI) array, using ZFS. the array has an nvram write cache, but it honours SCSI cache flush commands by flushing the nvram to disk. the array has no way to disable this behaviour.

a well-known behaviour of ZFS is that it often issues cache flush commands to storage in order to ensure data integrity; while this is important with normal disks, it's useless for nvram write caches, and it effectively disables the cache. so far, i've worked around this by setting zfs_nocacheflush, as described at [1], which works fine.

but now i want to upgrade this system to Solaris 10 Update 6, and use a ZFS root pool on its internal SCSI disks (previously, the root was UFS). the problem is that zfs_nocacheflush applies to all pools, which will include the root pool.

my understanding of ZFS is that when run on a root pool, which uses slices (instead of whole disks), ZFS won't enable the write cache itself. i also didn't enable the write cache manually. so, it _should_ be safe to use zfs_nocacheflush, because there is no caching on the root pool. am i right, or could i encounter problems here?

(the system is an NFS server, which means lots of synchronous writes (and therefore ZFS cache flushes), so i *really* want the performance benefit from using the nvram write cache.)

- river.

[1] http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes
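For completeness, the workaround described at [1] is a one-line /etc/system setting. Note that, as the post says, it is global: it affects every pool on the host, including the root pool, which is exactly the concern here.

```
* /etc/system fragment: disable ZFS cache-flush commands for ALL pools
* on this host (including the root pool). A reboot is required for the
* setting to take effect.
set zfs:zfs_nocacheflush = 1
```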
Re: [zfs-discuss] [install-discuss] differences.. why?
Hi Dick,

I am redirecting your question to the zfs-discuss mailing list, where people are more knowledgeable about this problem and your question can be better answered.

Best regards,
Jan

dick hoogendijk wrote:
> I have s10u6 installed on my server.
> zfs list (partly):
> NAME                USED  AVAIL  REFER  MOUNTPOINT
> rpool              88.8G   140G  27.5K  /rpool
> rpool/ROOT         20.0G   140G    18K  /rpool/ROOT
> rpool/ROOT/s10BE2  20.0G   140G  7.78G  /
>
> But just now, on a newly installed s10u6 system I got rpool/ROOT with a
> mountpoint "legacy".
>
> The drives were different. On the latter (legacy) system the drive was not
> formatted yet (in VirtualBox). On my server I switched from UFS to
> ZFS, so I first created a rpool and then did a luupgrade into it.
> This could explain the mountpoint /rpool/ROOT, but WHY the difference?
> Why can't s10u6 install the same mountpoint on the new disk?
> The server runs very well; is this "legacy" thing really needed?
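A quick way to compare the two systems is to inspect the mountpoint property directly; the transcript below is illustrative only. A "legacy" mountpoint simply means the dataset is mounted via /etc/vfstab or mount(1M) rather than by ZFS itself, and the property can be changed if desired:

```shell
# Show how rpool/ROOT and its descendants are mounted:
zfs get -r mountpoint rpool/ROOT

# Switch a dataset to the legacy convention (or back to a path):
zfs set mountpoint=legacy rpool/ROOT
```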