Re: [zfs-discuss] How to make an extended LUN size known to ZFS and Solaris
On 9/29/06, Michael Phua - PTS <[EMAIL PROTECTED]> wrote:
> Our customer has a Sun Fire X4100 with Solaris 10 using ZFS and a HW RAID
> array (STK D280). He has extended a LUN on the storage array and wants to
> make this new size known to ZFS and Solaris. Does anyone know if this can
> be done, and how it can be done?

Hi Michael,
the customer needs to export the pool which contains the LUN, run devfsadm,
then re-import the pool.

That's the high-level summary; there's probably some more in the archives on
the really detailed specifics, but I can't recall them at the moment.

The reasoning behind the sequence of operations is that the LUN's inquiry
data is read on pool import or creation, and not at any other time. At
least, that's how it is at the moment; perhaps Eric, Matt, Mark or Bill
might have some project underway to make this a bit easier.

cheers,
James C. McPherson
--
Solaris kernel software engineer, system admin and troubleshooter
http://www.jmcp.homeunix.com/blog
Find me on LinkedIn @ http://www.linkedin.com/in/jamescmcpherson
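For reference, a minimal sketch of the sequence James describes, assuming
the pool sitting on the expanded LUN is named tank (the pool name is an
assumption, not something from the thread):

    # Quiesce and export the pool so ZFS lets go of the LUN.
    zpool export tank
    # Rebuild the device links so the new capacity is visible to Solaris.
    devfsadm -Cv
    # Re-import; the LUN's inquiry/capacity data is read at import time.
    zpool import tank
    # Sanity-check the reported size.
    zpool list tank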
[zfs-discuss] How to make an extended LUN size known to ZFS and Solaris
Hi,

Our customer has a Sun Fire X4100 with Solaris 10 using ZFS and a HW RAID
array (STK D280). He has extended a LUN on the storage array and wants to
make this new size known to ZFS and Solaris.

Does anyone know if this can be done, and how it can be done?

Cheers!
Warm regards,

Michael Phua
SUN Microsystems - APAC PTS
Block 750, #04-04 Chai Chee Road, Singapore 469000
Phone (65) 395-9546 | FAX (65) 242-7166
Email: [EMAIL PROTECTED]
SWAN: 88546  MS: CSIN05
Re: [zfs-discuss] mkdir == zfs create
On Thu, Sep 28, 2006 at 05:36:16PM +0200, Robert Milkowski wrote:
> Hello Chris,
>
> Thursday, September 28, 2006, 4:55:13 PM, you wrote:
>
> CG> I keep thinking that it would be useful to be able to define a
> CG> zfs file system where all calls to mkdir resulted not just in a
> CG> directory but in a file system. Clearly such a property would not
> CG> be inherited, but in a number of situations here it would be a
> CG> really useful feature.
>
> CG> I can see there would be issues with accessing these new file
> CG> systems over NFS, as NFS is currently not too good when new file
> CG> systems are created, but has anyone considered this?
>
> dtrace -w -n syscall::mkdir:entry'{system("zfs create %s\n",
> copyinstr(arg0+1));}'
>
> Should do the job :)
>
> ps. just kidding - but it will work to some extent :)

You'd have to stop() the victim and prun it after the mkdir. And you'd have
to use predicates to do this only for the mkdirs of interest. And you'd have
to be willing to deal with drops (where the mkdir just doesn't happen).

Nico
--
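A slightly fuller sketch of what Nico outlines, restricted to invocations of
mkdir(1) as a stand-in for a real predicate on the paths of interest; this
is a fragile destructive-actions hack for illustration only, not a supported
mechanism:

    # Requires dtrace -w (destructive actions). The stopped caller is
    # resumed with prun(1) once the filesystem has been created.
    dtrace -w -q -n '
    syscall::mkdir:entry
    /execname == "mkdir"/
    {
        stop();                            /* stop the caller */
        system("zfs create %s; prun %d",
            copyinstr(arg0 + 1), pid);     /* arg0+1 drops the leading "/" */
    }'

As Nico notes, the system() actions can be dropped under load, in which case
the filesystem simply never gets created.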
Re: [zfs-discuss] Metaslab alignment on RAID-Z
Frank Cusack wrote:
> On September 27, 2006 11:27:16 AM -0700 Richard Elling - PAE
> <[EMAIL PROTECTED]> wrote:
>> In the interim, does it make sense for a simple rule of thumb? For
>> example, in the above case, I would not have the hole if I did any of
>> the following:
>>   1. add one disk
>>   2. remove one disk
>>   3. use raidz2 instead of raidz
>> More generally, I could suggest that we use an odd number of vdevs for
>> raidz and an even number for mirrors and raidz2.

These rules are not generally accurate. To eliminate the blank "round up"
sectors for power-of-two blocksizes of 8k or larger, you should use a
power-of-two plus 1 number of disks in your raid-z group -- that is, 3, 5,
or 9 disks (for double parity, use a power-of-two plus 2 -- that is, 4, 6,
or 10). Smaller blocksizes are more constrained: for 4k, use 3 or 5 disks
(for double parity, 4 or 6), and for 2k, use 3 disks (for double parity, 4).

> I have a 12-drive array (JBOD). I was going to carve out a 5-way raidz
> and a 6-way raidz, both in one pool. Should I do 5-3-3 instead?

If you know you'll be using lots of small, same-size blocks (eg, a database
where you're changing the recordsize property), AND you need the best
possible performance, AND you can't afford to use mirroring, then you should
do a performance comparison on both and see how it works. Otherwise (ie. in
the common case), don't worry about it.

--matt
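As a concrete illustration of the two layouts Frank is weighing, here is a
sketch with hypothetical device names (c1t0d0 through c1t10d0 are not from
the original posts; the twelfth drive is left over in both cases):

    # 5-way + 6-way: the 6-way group does not follow the
    # power-of-two-plus-1 guideline above.
    zpool create tank \
        raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
        raidz c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 c1t10d0

    # 5-3-3: every raidz group is a power of two plus 1 (4+1, 2+1, 2+1).
    zpool create tank \
        raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 \
        raidz c1t5d0 c1t6d0 c1t7d0 \
        raidz c1t8d0 c1t9d0 c1t10d0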
Re: [zfs-discuss] Re: mkdir == zfs create
On Fri, Sep 29, 2006 at 01:26:04AM +0200, [EMAIL PROTECTED] wrote:
> > Please elaborate: "CIFS just requires the automount hack."
>
> CIFS currently accesses the files through the local file system, so it
> can invoke the automounter and can there use "tricky maps".

Well, Samba does. And it doesn't even need the automounter :)
Re: [zfs-discuss] Metaslab alignment on RAID-Z
On September 27, 2006 11:27:16 AM -0700 Richard Elling - PAE
<[EMAIL PROTECTED]> wrote:
> In the interim, does it make sense for a simple rule of thumb? For
> example, in the above case, I would not have the hole if I did any of
> the following:
>   1. add one disk
>   2. remove one disk
>   3. use raidz2 instead of raidz
> More generally, I could suggest that we use an odd number of vdevs for
> raidz and an even number for mirrors and raidz2.

I have a 12-drive array (JBOD). I was going to carve out a 5-way raidz and
a 6-way raidz, both in one pool. Should I do 5-3-3 instead?

-frank
Re: [zfs-discuss] Re: mkdir == zfs create
> Please elaborate: "CIFS just requires the automount hack."

CIFS currently accesses the files through the local file system, so it can
invoke the automounter and can there use "tricky maps".

Casper
Re: [zfs-discuss] Re: mkdir == zfs create
On Thu, Sep 28, 2006 at 12:40:17PM -0500, Ed Plese wrote:
> This can be elaborated on to do neat things like create a ZFS clone when
> a client connects and then destroy the clone when the client disconnects
> (via "root postexec"). This could possibly be useful for the shared build
> system that was mentioned by an earlier post.

To follow up on this, here's an example share from smb.conf that
accomplishes this:

[build]
    comment = Build Share
    writable = yes
    path = /tank/clones/%d
    root preexec = zfs clone tank/clones/[EMAIL PROTECTED] tank/clones/%d
    root postexec = zfs destroy tank/clones/%d

This gives you a fresh clone every time you connect to the share.

Ed Plese
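Note that zfs clone needs an existing snapshot as its origin (the snapshot
name in the root preexec line above was eaten by the list's address
munging). A one-time setup along these lines is assumed, with
tank/clones/template and @golden as purely illustrative names:

    # Create and populate the template, then snapshot it; the share's
    # root preexec clones that snapshot for each connecting client.
    zfs create tank/clones/template
    #   ... populate /tank/clones/template with the master build tree ...
    zfs snapshot tank/clones/template@golden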
[zfs-discuss] Re: Re: mkdir == zfs create
This works like a champ:

    smbclient -U foobar samba_server\\home

Given a UNIX account foobar with an smbpasswd entry for the account, the
ZFS filesystem pool/home/foobar was created and smbclient connected to it.

Thank you.

Ron Halstead
Re: [zfs-discuss] jbod questions
On Thu, 2006-09-28 at 10:51 -0700, Richard Elling - PAE wrote:
> Keith Clay wrote:
> > We are in the process of purchasing new san/s that our mail server runs
> > on (JES3). We have moved our mailstores to zfs and continue to have
> > checksum errors -- they are corrected, but this improves on the ufs
> > inode errors that require system shutdown and fsck.
> >
> > So, I am recommending that we buy small jbods, do raidz2 and let zfs
> > handle the raiding of these boxes. As we need more storage, we can add
> > boxes and place them in a pool. This would allow more controllers and
> > more spindles, which I would think would add reliability and
> > performance. I am thinking SATA II drives.
> >
> > Any recommendations and/or advice is welcome.

Also, I can't remember how JES3 does its mailstore, but lots of little
writes to a RAIDZ volume aren't good for performance, even though ZFS is
better about waiting for sufficient write data to do a full-stripe-width
write (vs. RAID-5). That is, using RAIDZ on SATA isn't a good performance
idea for the small-write usage pattern, so I'd be careful and get a demo
unit first to check out the actual numbers.

--
Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] jbod questions
Keith Clay wrote:
> We are in the process of purchasing new san/s that our mail server runs
> on (JES3). We have moved our mailstores to zfs and continue to have
> checksum errors -- they are corrected, but this improves on the ufs inode
> errors that require system shutdown and fsck.
>
> So, I am recommending that we buy small jbods, do raidz2 and let zfs
> handle the raiding of these boxes. As we need more storage, we can add
> boxes and place them in a pool. This would allow more controllers and
> more spindles, which I would think would add reliability and performance.
> I am thinking SATA II drives.
>
> Any recommendations and/or advice is welcome.

I would take a look at the Hitachi Enterprise-class SATA drives. Also, try
to keep them cool.

-- richard
Re: [zfs-discuss] Re: mkdir == zfs create
On Thu, Sep 28, 2006 at 09:00:47AM -0700, Ron Halstead wrote:
> Please elaborate: "CIFS just requires the automount hack."

Samba's smb.conf supports a "root preexec" parameter that allows a program
to be run when a share is connected to. For example, with a simple script,
createhome.sh, like:

#!/usr/bin/bash
# Create the user's ZFS home filesystem the first time they connect.
if [ ! -e /tank/home/$1 ]; then
    zfs create tank/home/$1
fi

and a [homes] share in smb.conf like:

[homes]
    comment = User Home Directories
    browseable = no
    writable = yes
    root preexec = createhome.sh '%U'

Samba will automatically create a ZFS filesystem for each user's home
directory the first time the user connects to the server. You'd likely want
to expand on this to have it properly set the permissions, perform some
additional checks and logging, etc.

This can be elaborated on to do neat things like create a ZFS clone when a
client connects and then destroy the clone when the client disconnects (via
"root postexec"). This could possibly be useful for the shared build system
that was mentioned by an earlier post.

To truly replace every mkdir call you can write a fairly simple VFS module
for Samba that would replace every mkdir call with a call to "zfs create".
This method is a bit more involved than the above method since the VFS
modules are coded in C, but it's definitely a possibility.

Ed Plese
Re: [zfs-discuss] Automounting ? (idea ?)
On Wed, Sep 27, 2006 at 08:55:48AM -0600, Mark Maybee wrote:
> Patrick wrote:
> > So ... how about an automounter? Is this even possible? Does it exist?
>
> *sigh*, one of the issues we recognized, when we introduced the new
> cheap/fast file system creation, was that this new model would stress
> the scalability (or lack thereof) of other parts of the operating
> system. This is a prime example. I think the notion of an automount
> option for zfs directories is an excellent one. Solaris does support
> automount, and it should be possible, by setting the mountpoint property
> to "legacy", to set up automount tables to achieve what you want now;
> but it would be nice if zfs had a property to do this for you
> automatically.

Perhaps ZFS could write a cache on shutdown that could be used to speed up
mounting on startup by avoiding all that I/O? Sounds difficult; if the cache
is ever wrong there has to be some way to recover.

Alternatively, it'd be neat if ZFS could do the automounting of ZFS
filesystems mounted on ZFS filesystems as needed and without autofs. It'd
have to work server-side (i.e., when the trigger comes from NFS). And
because of the MOUNT protocol, ZFS would still have to keep a cache of the
whole hierarchy so that the MOUNT protocol can serve it without everything
having to be mounted (and also so 'zfs list' can show what's there even if
not yet mounted).

Nico
--
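A rough sketch of the legacy-mountpoint route Mark mentions, with the
dataset name, mount path, and map entry all being illustrative assumptions
rather than anything from this thread:

    # Hand mount control back to the traditional machinery.
    zfs set mountpoint=legacy tank/home/alice

    # The dataset can then be mounted the old way...
    mount -F zfs tank/home/alice /home/alice

    # ...or on demand from an autofs map, assuming the automounter is
    # given the dataset as the "location", e.g. an auto_home entry like:
    #   alice   -fstype=zfs   :tank/home/alice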
[zfs-discuss] Re: mkdir == zfs create
Please elaborate: "CIFS just requires the automount hack."

Ron
Re: [zfs-discuss] mkdir == zfs create
> > (And if you don't need it to work remotely, automount could take care
> > of it if you think "cd" should be sufficient reason to create a
> > directory.)
>
> Maybe on unmount empty filesystems could be destroyed.

More general? If we had "events" like a library call to "mkdir" or
"change-dir", or "no open file descriptors", or "unmount", then
operator-predefined actions could be triggered: actions like "zfs create",
"take a snapshot", a "zfs send" of a snapshot, and others.

Thomas
--
Re: [zfs-discuss] mkdir == zfs create
Hello Chris,

Thursday, September 28, 2006, 4:55:13 PM, you wrote:

CG> I keep thinking that it would be useful to be able to define a
CG> zfs file system where all calls to mkdir resulted not just in a
CG> directory but in a file system. Clearly such a property would not
CG> be inherited, but in a number of situations here it would be a
CG> really useful feature.

CG> I can see there would be issues with accessing these new file
CG> systems over NFS, as NFS is currently not too good when new file
CG> systems are created, but has anyone considered this?

dtrace -w -n syscall::mkdir:entry'{system("zfs create %s\n",
copyinstr(arg0+1));}'

Should do the job :)

ps. just kidding - but it will work to some extent :)

--
Best regards,
Robert                          mailto:[EMAIL PROTECTED]
                                http://milek.blogspot.com
Re: [zfs-discuss] mkdir == zfs create
On Thu, Sep 28, 2006 at 05:29:27PM +0200, [EMAIL PROTECTED] wrote:
> > Any mkdir in a builds directory on a shared build machine. Would be
> > very cool because then every user/project automatically gets a ZFS
> > filesystem.
> >
> > Why map it to mkdir rather than using zfs create? Because mkdir means
> > it will work over NFS or CIFS.
>
> "NFS" will be fairly difficult because you will then have a filesystem
> you cannot reach. (You can't traverse file systems that easily at the
> moment over NFS.) CIFS just requires the automount hack.

For v4 the fix is coming, though.

> (And if you don't need it to work remotely, automount could take care of
> it if you think "cd" should be sufficient reason to create a directory.)

Maybe on unmount empty filesystems could be destroyed.
Re: [zfs-discuss] mkdir == zfs create
> Any mkdir in a builds directory on a shared build machine. Would be
> very cool because then every user/project automatically gets a ZFS
> filesystem.
>
> Why map it to mkdir rather than using zfs create? Because mkdir means
> it will work over NFS or CIFS.

"NFS" will be fairly difficult because you will then have a filesystem you
cannot reach. (You can't traverse file systems that easily at the moment
over NFS.)

CIFS just requires the automount hack.

(And if you don't need it to work remotely, automount could take care of it
if you think "cd" should be sufficient reason to create a directory.)

Casper
[zfs-discuss] jbod questions
Folks,

We are in the process of purchasing new san/s that our mail server runs on
(JES3). We have moved our mailstores to zfs and continue to have checksum
errors -- they are corrected, but this improves on the ufs inode errors
that require system shutdown and fsck.

So, I am recommending that we buy small jbods, do raidz2 and let zfs handle
the raiding of these boxes. As we need more storage, we can add boxes and
place them in a pool. This would allow more controllers and more spindles,
which I would think would add reliability and performance. I am thinking
SATA II drives.

Any recommendations and/or advice is welcome.

thanks,
keith
Re: [zfs-discuss] mkdir == zfs create
Jeremy Teo wrote:
>> I keep thinking that it would be useful to be able to define a zfs file
>> system where all calls to mkdir resulted not just in a directory but in
>> a file system. Clearly such a property would not be inherited, but in a
>> number of situations here it would be a really useful feature.
>
> Any example use cases? :)

Keeping data associated with a customer case. Currently we do this in a
number of file systems and then fake them up using symlinks to look like a
flat directory, as a single file system is not big enough. It would be
really cool to use ZFS so that there was just one file system. However it
would be even cooler if each directory in that pool was a file system.

Then there would be a file system per customer case, without having to do
the rbac thing to have the file systems created. We could easily see where
the space was, and could archive file systems rather than directories, plus
snapshot based on activity in the case.

--
Chris Gerhard
Sun Microsystems Limited
Phone: +44 (0) 1252 426033 (ext 26033)
http://blogs.sun.com/chrisg
Re: [zfs-discuss] mkdir == zfs create
Jeremy Teo wrote:
>> I keep thinking that it would be useful to be able to define a zfs file
>> system where all calls to mkdir resulted not just in a directory but in
>> a file system. Clearly such a property would not be inherited, but in a
>> number of situations here it would be a really useful feature.
>
> Any example use cases? :)

Taking into account that Chris said this would NOT be inherited by default:

Any mkdir in a builds directory on a shared build machine. Would be very
cool because then every user/project automatically gets a ZFS filesystem.

Why map it to mkdir rather than using zfs create? Because mkdir means it
will work over NFS or CIFS.

--
Darren J Moffat
Re: [zfs-discuss] mkdir == zfs create
> I keep thinking that it would be useful to be able to define a zfs file
> system where all calls to mkdir resulted not just in a directory but in
> a file system. Clearly such a property would not be inherited, but in a
> number of situations here it would be a really useful feature.

Any example use cases? :)

--
Regards,
Jeremy
[zfs-discuss] mkdir == zfs create
I keep thinking that it would be useful to be able to define a zfs file
system where all calls to mkdir resulted not just in a directory but in a
file system. Clearly such a property would not be inherited, but in a
number of situations here it would be a really useful feature.

I can see there would be issues with accessing these new file systems over
NFS, as NFS is currently not too good when new file systems are created,
but has anyone considered this?
[zfs-discuss] Question: vxvm/dmp or zfs/mpxio
Hello experts,

My client is thinking about changing one of their servers running Solaris 8
(a NIS hub, to mention something) to a server running Solaris 10. The
question I have been given is to check the differences -- or, if you wish,
the status/stability -- of changing VxVM with DMP (Dynamic Multi Pathing)
to a system with ZFS and MPxIO. I have done my share of RTFM but have not
come any closer to an answer, so now I am turning to you guys!

Is ZFS/MPxIO strong enough to be an alternative to VxVM/DMP today? Has
anyone done this change and if so, what was your experience?

Regards,
Pierre
Re: [zfs-discuss] Re: Metaslab alignment on RAID-Z
Gino Ruopolo writes:
> Thank you Bill for your clear description.
>
> Now I have to find a way to justify myself to my head office: after
> spending 100k+ on hw and migrating to "the most advanced OS", we are
> running about 8 times slower :)
>
> Anyway I have a problem much more serious than rsync process speed. I
> hope you'll help me solve it!
>
> Our situation:
>
>   /data/a
>   /data/b
>   /data/zones/ZONEX    (whole root zone)
>
> As you know, I have a process running "rsync -ax /data/a/* /data/b" for
> about 14hrs. The problem is that, while that rsync process is running,
> ZONEX is completely unusable because of the rsync I/O load. Even if
> we're using FSS, Solaris seems unable to give a small amount of I/O
> resource to ZONEX's activity ...
>
> I know that FSS doesn't deal with I/O, but I think Solaris should be
> smarter ...
>
> To draw a comparison, FreeBSD Jail doesn't suffer from this problem ...
>
> thanks,
> Gino

Under a streaming write load, we kind of overwhelm the devices and reads
are few and far between. To alleviate this we somewhat need to throttle
writers more:

    6429205 each zpool needs to monitor its throughput
            and throttle heavy writers

This is in a state of "fix in progress". At the same time, the notion of
reserved slots for reads is being investigated. That should do wonders for
your issue.

I don't know how to work around this for now (apart from starving the rsync
process of cpu access).

-r
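One way to apply Roch's interim workaround of starving the rsync of CPU is
through FSS project shares, which Gino says is already the scheduler in use;
the project name and share value below are illustrative assumptions, and
this only throttles CPU, not the I/O itself:

    # Put the bulk copy in a project with a single CPU share
    # (all names and values here are examples).
    projadd -K "project.cpu-shares=(privileged,1,none)" bulkcopy
    newtask -p bulkcopy rsync -ax /data/a/ /data/b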
Re: [zfs-discuss] problem ZFS / NFS from FreeBSD nfsv3 client -- periodic NFS server not resp
On Sep 26, 2006, at 12:26 PM, Chad Leigh -- Shire.Net LLC wrote:
> On Sep 26, 2006, at 12:24 PM, Mike Kupfer wrote:
>> >>>>> "Chad" == Chad Leigh -- Shire.Net LLC <[EMAIL PROTECTED]> writes:
>>
>> Chad> snoop does not show me the reply packets going back. What do I
>> Chad> need to do to go both ways?
>>
>> It's possible that performance issues are causing snoop to miss the
>> replies. If your server has multiple network interfaces, it's more
>> likely that the server is routing the replies back on a different
>> interface. We've run into that problem many times with the NFS server
>> that has my home directory on it.
>>
>> If that is what's going on, you need to fire up multiple instances of
>> snoop, one per interface.
>
> OK, I will try that. I did run tcpdump on the BSD client as well, so the
> responses should show up there too, as it only has the 1 interface on
> that net while the Solaris box has 3.

That got me thinking. Since I had 3 "dedicated" ports to use for nfs, I
changed it so each is on its own network (192.168.2, .3, .4), so there is
no port switcheroo on incoming and outgoing ports. I also upgraded the
FreeBSD box to catch any bge updates and patches (there were some, I think,
but I am not sure they had anything to do with my issue).

Anyway, after doing both of these my issue seems to have gone away... I am
still testing / watching, but I have not seen or experienced the issue in a
day. I am not sure which one "fixed" my problem, but it seems to have gone
away.

Thanks
Chad

---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net
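For completeness, a sketch of Mike's one-snoop-per-interface suggestion; the
interface names and client addresses are assumptions based on Chad's
description of three bge ports on the 192.168.2/3/4 networks:

    # One capture per server interface, filtered to the NFS client.
    snoop -d bge0 -o /tmp/nfs-bge0.cap host 192.168.2.10 &
    snoop -d bge1 -o /tmp/nfs-bge1.cap host 192.168.3.10 &
    snoop -d bge2 -o /tmp/nfs-bge2.cap host 192.168.4.10 &
    # After reproducing the hang, inspect just the NFS traffic:
    snoop -i /tmp/nfs-bge0.cap rpc nfs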
[zfs-discuss] Re: Re: Metaslab alignment on RAID-Z
> > Even if we're using FSS, Solaris seems unable to give a small amount
> > of I/O resource to ZONEX's activity ...
> >
> > I know that FSS doesn't deal with I/O, but I think Solaris should be
> > smarter ..
>
> What about using ipqos (man ipqos)?

I'm not referring to "network I/O" but "storage I/O" ...

thanks,
Gino