Re: [zfs-discuss] Trying to determine if this box will be compatible with Opensolaris or Solaris
yeah, I really wish the HCL was easier to work with, and allowed comments. For instance, that HCL entry was updated sometime in 2007. Since then, like you've said, it could have gotten better or been dropped altogether. Some sort of more community-oriented aspect might help beef it up some. Also, making the tools simpler - absolutely no UI, for instance. Does it really need one just to dump things out? :)

On Wed, Mar 11, 2009 at 7:15 PM, David Magda dma...@ee.ryerson.ca wrote:

On Mar 11, 2009, at 21:59, mike wrote: On Wed, Mar 11, 2009 at 6:53 PM, David Magda dma...@ee.ryerson.ca wrote: If you know someone who already has the hardware, you can ask them to run the Sun Device Detection Tool: http://www.sun.com/bigadmin/hcl/hcts/device_detect.jsp It runs under other operating systems (Windows, Linux, BSD) AFAIK, so a re-install or reboot isn't necessary to see what it comes up with.

doesn't it require java and x11?

Yes, it requires Java 1.5+; a GUI is needed, but I don't think X11 is specifically required (X is the GUI on Unix-y systems, of course). Java doesn't specifically need X; it simply uses whatever the OS has. Looking at the page a bit more, you can run commands on the system and save the output to a file that can be processed by the tool on another system:

Apart from testing the current system on which Sun Device Detection Tool is invoked, you can also test the device data files that are generated from the external systems. To test the external device data files, print the PCI configuration of the external systems to a text file by using the following commands:
• prtconf -pv on Solaris OS.
• lspci -vv -n on Linux OS.
• reg query hklm\system\currentcontrolset\enum\pci /s on Windows OS.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
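For reference, generating the device data file described above and checking it later amounts to something like this (the file names are placeholders, and the tool's import step is an assumption rather than a quote from its docs; the two commands themselves are the ones listed above):

    # on the Solaris box whose hardware you want checked
    prtconf -pv > /tmp/mybox-devices.txt

    # or, on a Linux box
    lspci -vv -n > /tmp/mybox-devices.txt

    # copy the file to any machine that has Java 1.5+ and the Device Detection
    # Tool installed, then load it via the tool's external device data option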
Re: [zfs-discuss] reboot when copying large amounts of data
I start the cp, and then, with prstat -a, watch the cpu load for the cp process climb to 25% on a 4-core machine. Load, measured for example with 'uptime', climbs steadily until the reboot. Note that the machine does not dump properly, panic or hang - rather, it reboots. I attached a screenshot earlier in this thread of the little bit of error message I could see on the console. The machine is trying to dump to the dump zvol, but fails to do so. Only sometimes do I see an error on the machine's local console - most times, it simply reboots.

On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert nathan.kroen...@sun.com wrote:

Hm - Crashes, or hangs? Moreover - how do you know a CPU is pegged? Seems like we could do a little more discovery on what the actual problem here is, as I can read it about 4 different ways. By this last piece of information, I'm guessing the system does not crash, but goes really really slow??

Crash == panic == we see a stack dump on the console and try to take a dump.
Hang == nothing works == no response - might be worth looking at mdb -K or booting with a -k on the boot line.

So - are we crashing, hanging, or something different? It might simply be that you are eating up all your memory, and your physical backing storage is taking a while to catch up? Nathan.

Blake wrote:

My dump device is already on a different controller - the motherboard's built-in nVidia SATA controller. The raidz2 vdev is the one I'm having trouble with (copying the same files to the mirrored rpool on the nVidia controller works nicely). I do notice that, when using cp to copy the files to the raidz2 pool, load on the machine climbs steadily until the crash, and one proc core pegs at 100%. Frustrating, yes.

On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J maidakalexand...@johndeere.com wrote:

If you're having issues with a disk controller or disk IO driver, it's highly likely that a savecore to disk after the panic will fail. I'm not sure how to work around this; maybe a dedicated dump device on a controller that uses a different driver than the one that you're having issues with?

-Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data

I guess I didn't make it clear that I had already tried using savecore to retrieve the core from the dump device. I added a larger zvol for dump, to make sure that I wasn't running out of space on the dump device:

r...@host:~# dumpadm
      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated)
Savecore directory: /var/crash/host
  Savecore enabled: yes

I was using the -L option only to try to get some idea of why the system load was climbing to 1 during a simple file copy.

On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling richard.ell...@gmail.com wrote:

Blake wrote: I'm attaching a screenshot of the console just before reboot. The dump doesn't seem to be working, or savecore isn't working. On Wed, Mar 11, 2009 at 11:33 AM, Blake blake.ir...@gmail.com wrote: I'm working on testing this some more by doing a savecore -L right after I start the copy.

savecore -L is not what you want. By default, for OpenSolaris, savecore on boot is disabled. But the core will have been dumped into the dump slice, which is not used for swap. So you should be able to run savecore at a later time to collect the core from the last dump.
-- richard

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
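A minimal sketch of the dump/savecore sequence being discussed, assuming a working dump device (host name and crash directory are placeholders):

    # confirm where a panic would be dumped
    dumpadm

    # after the box comes back up, pull the crash dump out of the dump zvol
    savecore -v /var/crash/host

    # then inspect it post-mortem
    mdb /var/crash/host/unix.0 /var/crash/host/vmcore.0
      ::status       # panic string and dump details
      ::msgbuf       # last kernel messages before the panic
      $c             # stack trace of the panicking thread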
Re: [zfs-discuss] reboot when copying large amounts of data
definitely time to bust out some mdb -k and see what it's moaning about. I did not see the screenshot earlier... sorry about that. Nathan. Blake wrote: I start the cp, and then, with prstat -a, watch the cpu load for the cp process climb to 25% on a 4-core machine. Load, measured for example with 'uptime', climbs steadily until the reboot. Note that the machine does not dump properly, panic or hang - rather, it reboots. I attached a screenshot earlier in this thread of the little bit of error message I could see on the console. The machine is trying to dump to the dump zvol, but fails to do so. Only sometimes do I see an error on the machine's local console - mos times, it simply reboots. On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert nathan.kroen...@sun.com wrote: Hm - Crashes, or hangs? Moreover - how do you know a CPU is pegged? Seems like we could do a little more discovery on what the actual problem here is, as I can read it about 4 different ways. By this last piece of information, I'm guessing the system does not crash, but goes really really slow?? Crash == panic == we see stack dump on console and try to take a dump hang == nothing works == no response - might be worth looking at mdb -K or booting with a -k on the boot line. So - are we crashing, hanging, or something different? It might simply be that you are eating up all your memory, and your physical backing storage is taking a while to catch up? Nathan. Blake wrote: My dump device is already on a different controller - the motherboards built-in nVidia SATA controller. The raidz2 vdev is the one I'm having trouble with (copying the same files to the mirrored rpool on the nVidia controller work nicely). I do notice that, when using cp to copy the files to the raidz2 pool, load on the machine climbs steadily until the crash, and one proc core pegs at 100%. Frustrating, yes. On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J maidakalexand...@johndeere.com wrote: If you're having issues with a disk contoller or disk IO driver its highly likely that a savecore to disk after the panic will fail. I'm not sure how to work around this, maybe a dedicated dump device not on a controller that uses a different driver then the one that you're having issues with? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data I guess I didn't make it clear that I had already tried using savecore to retrieve the core from the dump device. I added a larger zvol for dump, to make sure that I wasn't running out of space on the dump device: r...@host:~# dumpadm Dump content: kernel pages Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated) Savecore directory: /var/crash/host Savecore enabled: yes I was using the -L option only to try to get some idea of why the system load was climbing to 1 during a simple file copy. On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling richard.ell...@gmail.com wrote: Blake wrote: I'm attaching a screenshot of the console just before reboot. The dump doesn't seem to be working, or savecore isn't working. On Wed, Mar 11, 2009 at 11:33 AM, Blake blake.ir...@gmail.com wrote: I'm working on testing this some more by doing a savecore -L right after I start the copy. savecore -L is not what you want. By default, for OpenSolaris, savecore on boot is disabled. 
But the core will have been dumped into the dump slice, which is not used for swap. So you should be able to run savecore at a later time to collect the core from the last dump. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- // // Nathan Kroenert nathan.kroen...@sun.com // // Systems Engineer Phone: +61 3 9869-6255 // // Sun Microsystems Fax:+61 3 9869-6288 // // Level 7, 476 St. Kilda Road Mobile: 0419 305 456// // Melbourne 3004 VictoriaAustralia // // -- // // Nathan Kroenert nathan.kroen...@sun.com // // Systems Engineer Phone: +61 3 9869-6255 // // Sun Microsystems Fax:+61 3 9869-6288 // // Level 7, 476 St. Kilda Road Mobile: 0419 305 456// // Melbourne 3004 VictoriaAustralia //
Re: [zfs-discuss] reboot when copying large amounts of data
definitely time to bust out some mdb -K or boot -k and see what it's moaning about. I did not see the screenshot earlier... sorry about that. Nathan. Blake wrote: I start the cp, and then, with prstat -a, watch the cpu load for the cp process climb to 25% on a 4-core machine. Load, measured for example with 'uptime', climbs steadily until the reboot. Note that the machine does not dump properly, panic or hang - rather, it reboots. I attached a screenshot earlier in this thread of the little bit of error message I could see on the console. The machine is trying to dump to the dump zvol, but fails to do so. Only sometimes do I see an error on the machine's local console - mos times, it simply reboots. On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert nathan.kroen...@sun.com wrote: Hm - Crashes, or hangs? Moreover - how do you know a CPU is pegged? Seems like we could do a little more discovery on what the actual problem here is, as I can read it about 4 different ways. By this last piece of information, I'm guessing the system does not crash, but goes really really slow?? Crash == panic == we see stack dump on console and try to take a dump hang == nothing works == no response - might be worth looking at mdb -K or booting with a -k on the boot line. So - are we crashing, hanging, or something different? It might simply be that you are eating up all your memory, and your physical backing storage is taking a while to catch up? Nathan. Blake wrote: My dump device is already on a different controller - the motherboards built-in nVidia SATA controller. The raidz2 vdev is the one I'm having trouble with (copying the same files to the mirrored rpool on the nVidia controller work nicely). I do notice that, when using cp to copy the files to the raidz2 pool, load on the machine climbs steadily until the crash, and one proc core pegs at 100%. Frustrating, yes. On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J maidakalexand...@johndeere.com wrote: If you're having issues with a disk contoller or disk IO driver its highly likely that a savecore to disk after the panic will fail. I'm not sure how to work around this, maybe a dedicated dump device not on a controller that uses a different driver then the one that you're having issues with? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data I guess I didn't make it clear that I had already tried using savecore to retrieve the core from the dump device. I added a larger zvol for dump, to make sure that I wasn't running out of space on the dump device: r...@host:~# dumpadm Dump content: kernel pages Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated) Savecore directory: /var/crash/host Savecore enabled: yes I was using the -L option only to try to get some idea of why the system load was climbing to 1 during a simple file copy. On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling richard.ell...@gmail.com wrote: Blake wrote: I'm attaching a screenshot of the console just before reboot. The dump doesn't seem to be working, or savecore isn't working. On Wed, Mar 11, 2009 at 11:33 AM, Blake blake.ir...@gmail.com wrote: I'm working on testing this some more by doing a savecore -L right after I start the copy. savecore -L is not what you want. By default, for OpenSolaris, savecore on boot is disabled. 
But the core will have been dumped into the dump slice, which is not used for swap. So you should be able to run savecore at a later time to collect the core from the last dump. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- // // Nathan Kroenert nathan.kroen...@sun.com // // Systems Engineer Phone: +61 3 9869-6255 // // Sun Microsystems Fax:+61 3 9869-6288 // // Level 7, 476 St. Kilda Road Mobile: 0419 305 456// // Melbourne 3004 VictoriaAustralia // // -- // // Nathan Kroenert nathan.kroen...@sun.com // // Systems Engineer Phone: +61 3 9869-6255 // // Sun Microsystems Fax:+61 3 9869-6288 // // Level 7, 476 St. Kilda Road Mobile: 0419 305 456// // Melbourne 3004 VictoriaAustralia
Re: [zfs-discuss] reboot when copying large amounts of data
So, if I boot with the -k boot flags (to load the kernel debugger?) what do I need to look for? I'm no expert at kernel debugging. I think this is a pci error judging by the console output, or at least is i/o related... thanks for your feedback, Blake On Thu, Mar 12, 2009 at 2:18 AM, Nathan Kroenert nathan.kroen...@sun.com wrote: definitely time to bust out some mdb -K or boot -k and see what it's moaning about. I did not see the screenshot earlier... sorry about that. Nathan. Blake wrote: I start the cp, and then, with prstat -a, watch the cpu load for the cp process climb to 25% on a 4-core machine. Load, measured for example with 'uptime', climbs steadily until the reboot. Note that the machine does not dump properly, panic or hang - rather, it reboots. I attached a screenshot earlier in this thread of the little bit of error message I could see on the console. The machine is trying to dump to the dump zvol, but fails to do so. Only sometimes do I see an error on the machine's local console - mos times, it simply reboots. On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert nathan.kroen...@sun.com wrote: Hm - Crashes, or hangs? Moreover - how do you know a CPU is pegged? Seems like we could do a little more discovery on what the actual problem here is, as I can read it about 4 different ways. By this last piece of information, I'm guessing the system does not crash, but goes really really slow?? Crash == panic == we see stack dump on console and try to take a dump hang == nothing works == no response - might be worth looking at mdb -K or booting with a -k on the boot line. So - are we crashing, hanging, or something different? It might simply be that you are eating up all your memory, and your physical backing storage is taking a while to catch up? Nathan. Blake wrote: My dump device is already on a different controller - the motherboards built-in nVidia SATA controller. The raidz2 vdev is the one I'm having trouble with (copying the same files to the mirrored rpool on the nVidia controller work nicely). I do notice that, when using cp to copy the files to the raidz2 pool, load on the machine climbs steadily until the crash, and one proc core pegs at 100%. Frustrating, yes. On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J maidakalexand...@johndeere.com wrote: If you're having issues with a disk contoller or disk IO driver its highly likely that a savecore to disk after the panic will fail. I'm not sure how to work around this, maybe a dedicated dump device not on a controller that uses a different driver then the one that you're having issues with? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data I guess I didn't make it clear that I had already tried using savecore to retrieve the core from the dump device. I added a larger zvol for dump, to make sure that I wasn't running out of space on the dump device: r...@host:~# dumpadm Dump content: kernel pages Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated) Savecore directory: /var/crash/host Savecore enabled: yes I was using the -L option only to try to get some idea of why the system load was climbing to 1 during a simple file copy. On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling richard.ell...@gmail.com wrote: Blake wrote: I'm attaching a screenshot of the console just before reboot. 
The dump doesn't seem to be working, or savecore isn't working. On Wed, Mar 11, 2009 at 11:33 AM, Blake blake.ir...@gmail.com wrote: I'm working on testing this some more by doing a savecore -L right after I start the copy. savecore -L is not what you want. By default, for OpenSolaris, savecore on boot is disabled. But the core will have been dumped into the dump slice, which is not used for swap. So you should be able to run savecore at a later time to collect the core from the last dump. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- // // Nathan Kroenert nathan.kroen...@sun.com // // Systems Engineer Phone: +61 3 9869-6255 // // Sun Microsystems Fax: +61 3 9869-6288 // // Level 7, 476 St. Kilda Road Mobile: 0419 305 456 // // Melbourne 3004 Victoria Australia //
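In practice, the kernel-debugger route Nathan suggests looks roughly like this (the GRUB line is from memory and the dcmds are only a starting point, not a recipe):

    # either boot with kmdb loaded: add -k to the kernel$ line in GRUB, e.g.
    #   kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS -k -v

    # or load it on the running system before starting the copy
    mdb -K

    # with kmdb loaded, a panic should drop to the kmdb prompt instead of
    # rebooting; from there:
    #   ::msgbuf    - console messages (including any PCI/IO errors)
    #   $c          - stack trace of the panicking thread
    #   ::cont      - resume, or $q to quit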
[zfs-discuss] Encryption through compression?
Hello everyone, My understanding is that the ZFS crypto framework will not release until 2010. In light of that, I'm wondering if the following approach to encryption could make sense for some subset of users: The idea is to use the compression framework to do both compression and encryption in one pass. This would be done by defining a new compression type, which might be called compress-encrypt or something like that. There could be two levels, one that does both compress and encrypt and another that does encrypt only. I see the following issues with this approach: 1. ZFS compression framework presently takes compressed data only if there was at least 12.5% reduction. For data that didn't compress, you would wind up storing it unencrypted, even if encryption was on. 2. Meta-data would not be encrypted. I.e., even if you don't have the key, you will be able to do directory listings and see file names, etc. 3. There is no key management framework. I would deal with these as follows: Issue #1 can be solved by changing ZFS code such that it always accepts the compressed data. I guess this is an easy change. Issue #2 may be a limitation to some and feature to others. May be OK. Issue #3 can be solved using encryption hardware (which my company happens to make). The keys are stored in hardware and can be used directly from that. Of course, this means that the solution will be specific to our hardware, but that's fine by me. The idea is that we would do this project on our own and supply this modified ZFS with our compression/encryption hardware to our customers. We may submit the patch for inclusion in some future version of OS, if the developers are amenable to that. Does anyone see any problems with this? There are probably various gotchas here that I haven't thought of. If you can think of any, please let me know. Thanks, Monish Monish Shah CEO, Indra Networks, Inc. www.indranetworks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
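To make the proposed knob concrete, it would presumably surface roughly like this - note that the crypt values below are purely hypothetical illustrations of the proposal and are not real ZFS compression settings:

    # what exists today
    zfs set compression=gzip tank/data

    # hypothetical values under this proposal (NOT implemented anywhere)
    zfs set compression=crypt-gzip tank/data   # compress, then encrypt, in one pass
    zfs set compression=crypt tank/data        # encrypt only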
Re: [zfs-discuss] Encryption through compression?
Monish Shah wrote: Hello everyone, My understanding is that the ZFS crypto framework will not release until 2010.

That is incorrect information - where did you get that from?

In light of that, I'm wondering if the following approach to encryption could make sense for some subset of users: The idea is to use the compression framework to do both compression and encryption in one pass. This would be done by defining a new compression type, which might be called compress-encrypt or something like that. There could be two levels, one that does both compress and encrypt and another that does encrypt only. I see the following issues with this approach: 1. ZFS compression framework presently takes compressed data only if there was at least 12.5% reduction. For data that didn't compress, you would wind up storing it unencrypted, even if encryption was on. 2. Meta-data would not be encrypted. I.e., even if you don't have the key, you will be able to do directory listings and see file names, etc. 3. There is no key management framework.

That is impossible; there has to be key management somewhere.

I would deal with these as follows: Issue #1 can be solved by changing ZFS code such that it always accepts the compressed data. I guess this is an easy change. Issue #2 may be a limitation to some and a feature to others. May be OK. Issue #3 can be solved using encryption hardware (which my company happens to make). The keys are stored in hardware and can be used directly from that. Of course, this means that the solution will be specific to our hardware, but that's fine by me. The idea is that we would do this project on our own and supply this modified ZFS with our compression/encryption hardware to our customers. We may submit the patch for inclusion in some future version of OS, if the developers are amenable to that.

If it is specific to your company's hardware I doubt it would ever get integrated into OpenSolaris, particularly given the existing zfs-crypto project has no hardware dependencies at all. The better way to use your encryption hardware is to get it plugged into the OpenSolaris cryptographic framework (see the crypto project on OpenSolaris.org).

Does anyone see any problems with this? There are probably various gotchas here that I haven't thought of. If you can think of any, please let me know.

The various gotchas are the things that have been taking me and the rest of the ZFS team a large part of the zfs-crypto project to resolve. It really isn't as simple as you think it is - if it were, then the zfs-crypto project would be done by now! If you really want to help get encryption for ZFS then please come and join the already existing project rather than starting another one from scratch.

-- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Encryption through compression?
Monish Shah wrote: Hello Darren, Monish Shah wrote: Hello everyone, My understanding is that the ZFS crypto framework will not release until 2010. That is incorrect information, where did you get that from? It was in Mike Shapiro's presentation at the Open Solaris Storage Summit that took place a couple of weeks ago. Perhaps I mis-read the slide, but I'm pretty sure it listed encryption as a feature for 2010.

That is for its availability in the S7000 appliance. It will be in OpenSolaris before that (it has to be, because the S7000 is based on an OpenSolaris build).

If the schedule is much sooner than 2010, I would definitely do so. What is your present schedule estimate?

I can't commit to this yet, but I expect somewhere around August 2009. Note that the code in hg.opensolaris.org/hg/zfs-crypto/gate actually works today and encrypts more than what your proposal would. It is just that we are making some design changes to simplify the model and ensure that encryption integrates with other ZFS features coming along. There will be a design update posted to zfs-crypto-discuss@ later this month.

-- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Encryption through compression?
Hello Darren,

Monish Shah wrote: Hello everyone, My understanding is that the ZFS crypto framework will not release until 2010. That is incorrect information, where did you get that from?

It was in Mike Shapiro's presentation at the Open Solaris Storage Summit that took place a couple of weeks ago. Perhaps I mis-read the slide, but I'm pretty sure it listed encryption as a feature for 2010.

... 3. There is no key management framework. That is impossible; there has to be key management somewhere.

What I meant was, the compression framework does not have a key management framework. Using our hardware (which I mentioned later in my mail), the key management would come with the hardware, since we store keys in the hardware. We provide a utility to manage the keys stored in the hardware.

... If it is specific to your company's hardware I doubt it would ever get integrated into OpenSolaris, particularly given the existing zfs-crypto project has no hardware dependencies at all. The better way to use your encryption hardware is to get it plugged into the OpenSolaris cryptographic framework (see the crypto project on OpenSolaris.org)

That was precisely what I was thinking originally. However, if it is out in 2010, there is a temptation to do our own project, which I thought could be done in a couple of months. (In light of your comment below, my estimate may have been wildly optimistic, but the foregoing is merely an explanation of what I was thinking.)

Does anyone see any problems with this? There are probably various gotchas here that I haven't thought of. If you can think of any, please let me know. The various gotchas are the things that have been taking me and the rest of the ZFS team a large part of the zfs-crypto project to resolve. It really isn't as simple as you think it is - if it were then the zfs-crypto project would be done by now! If you really want to help get encryption for ZFS then please come and join the already existing project rather than starting another one from scratch.

If the schedule is much sooner than 2010, I would definitely do so. What is your present schedule estimate?

-- Darren J Moffat

Monish ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Export ZFS via ISCSI to Linux - Is it stable for production use now?
Hi,

On Thu, Mar 12, 2009 at 1:19 AM, Darren J Moffat darr...@opensolaris.org wrote: That is all that has to be done on the OpenSolaris side to make a 10g lun available over iSCSI. The rest of it is all how Linux sets up its iSCSI client side which I don't know but I know on Solaris it is very easy using iscsiadm(1M).

Thanks for your detailed steps. But I think with this setup, only one client can mount the shared block device at a time? So there must be a need for a clustered file system (e.g. GFS).

Just out of curiosity, what is the clustered file system used in the Sun Unified Storage 7000 series for data sharing among clients?

Thanks. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
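For context, the OpenSolaris-side setup being referred to is presumably something along these lines (names are placeholders; the shareiscsi property and iscsitadm were the mechanism of that era, so treat this as a sketch rather than Darren's exact steps):

    # create a 10 GB zvol and export it as an iSCSI target
    zfs create -V 10g tank/iscsi/lun0
    zfs set shareiscsi=on tank/iscsi/lun0

    # verify the target exists
    iscsitadm list target -v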
Re: [zfs-discuss] ZFS on a SAN
On Thu, Mar 12, 2009 at 2:12 AM, Erik Trimble erik.trim...@sun.com wrote: snip/ On the SAN, create (2) LUNs - one for your primary data, and one for your snapshots/backups. On hostA, create a zpool on the primary data LUN (call it zpool A), and another zpool on the backup LUN (zpool B). Take snapshots on A, then use 'zfs send' and 'zfs receive' to copy the clone/snapshot over to zpool B. then 'zpool export B' Shouldn't this be 'zpool export A' ? -- Sriram ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] CLI grinds to a halt during backups
Hi, I have an X4150 with a J4200 connected, populated with 12 x 1 TB disks (SATA). I run backup_pc as my backup software. Is there anything I can do to make the command line more responsive during backup windows? At the moment it grinds to a complete standstill. Thanks ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
On Thu, 12 Mar 2009, Jorgen Lundman wrote: User-land will then have a daemon, whether or not it is one daemon per file-system or really just one daemon does not matter. This process will open '/dev/quota' and empty the transaction log entries constantly. Take the uid,gid entries and update the byte-count in its database. How we store this database is up to us, but since it is in user-land it should have more flexibility, and is not as critical to be fast as it would have to be in kernel. In order for this to work, ZFS data blocks need to somehow be associated with a POSIX user ID. To start with, the ZFS POSIX layer is implemented on top of a non-POSIX Layer which does not need to know about POSIX user IDs. ZFS also supports snapshots and clones. The support for snapshots, clones, and potentially non-POSIX data storage, results in ZFS data blocks which are owned by multiple users at the same time, or multiple users over a period of time spanned by multiple snapshots. If ZFS clones are modified, then files may have their ownership changed, while the unmodified data continues to be shared with other users. If a cloned file has its ownership changed, then it would be quite tedious to figure out which blocks are now wholely owned by the new user, and which blocks are shared with other users. By the time the analysis is complete, it will be wrong. Before ZFS can apply per-user quota management, it is necessary to figure out how individual blocks can be charged to a user. This seems to be a very complex issue and common usage won't work with your proposal. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS on a SAN
Hi Erik, A couple of questions about what you said in your email. In synopsis 2, if hostA has gone belly up and is no longer accessible, then a step that is implied (or maybe I'm just inferring it) is to go to the SAN and reassign the LUN from hostA to hostB. Correct? - Original Message From: Erik Trimble erik.trim...@sun.com To: Grant Lowe gl...@sbcglobal.net Cc: zfs-discuss@opensolaris.org Sent: Wednesday, March 11, 2009 1:42:06 PM Subject: Re: [zfs-discuss] ZFS on a SAN I'm not 100% sure what your question here is, but let me give you a (hopefully) complete answer: (1) ZFS is NOT a clustered file system, in the sense that it is NOT possible for two hosts to have the same LUN mounted at the same time, even if both are hooked to a SAN and can normally see that LUN. (2) ZFS can do failover, however. If you have a LUN from a SAN on hostA, create a ZFS pool in it, and use as normal. Should you with to failover the LUN to hostB, you need to do a 'zpool export zpool' on hostA, then 'zpool import zpool' on hostB. If hostA has been lost completely (hung/died/etc) and you are unable to do an 'export' on it, you can force the import on hostB via 'zpool import -f zpool' ZFS requires that you import/export entire POOLS, not just filesystems. So, given what you seem to want, I'd recommend this: On the SAN, create (2) LUNs - one for your primary data, and one for your snapshots/backups. On hostA, create a zpool on the primary data LUN (call it zpool A), and another zpool on the backup LUN (zpool B). Take snapshots on A, then use 'zfs send' and 'zfs receive' to copy the clone/snapshot over to zpool B. then 'zpool export B' On hostB, import the snapshot pool: 'zfs import B' It might just be as easy to have two independent zpools on each host, and just do a 'zfs send' on hostA, and 'zfs receive' on hostB to copy the snapshot/clone over the wire. -Erik On Wed, 2009-03-11 at 13:18 -0700, Grant Lowe wrote: Hi All, I'm new on ZFS, so I hope this isn't too basic a question. I have a host where I setup ZFS. The Oracle DBAs did their thing and I know have a number of ZFS datasets with their respective clones and snapshots on serverA. I want to export some of the clones to serverB. Do I need to zone serverB to see the same LUNs as serverA? Or does it have to have preexisting, empty LUNs to import the clones? Please help. Thanks. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
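Spelled out as commands, the snapshot-and-copy step Erik describes might look like this (pool, dataset and snapshot names are invented for the example):

    # on hostA: snapshot the primary pool and replicate it into the backup pool
    zfs snapshot A/data@backup1
    zfs send A/data@backup1 | zfs receive B/data

    # or, with two independent pools, stream it straight to the other host
    zfs send A/data@backup1 | ssh hostB zfs receive tank/data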
Re: [zfs-discuss] User quota design discussion..
Note that: 6501037 want user/group quotas on ZFS Is already committed to be fixed in build 113 (i.e. in the next month). - Eric On Thu, Mar 12, 2009 at 12:04:04PM +0900, Jorgen Lundman wrote: In the style of a discussion over a beverage, and talking about user-quotas on ZFS, I recently pondered a design for implementing user quotas on ZFS after having far too little sleep. It is probably nothing new, but I would be curious what you experts think of the feasibility of implementing such a system and/or whether or not it would even realistically work. I'm not suggesting that someone should do the work, or even that I will, but rather in the interest of chatting about it. Feel free to ridicule me as required! :) Thoughts: Here at work we would like to have user quotas based on uid (and presumably gid) to be able to fully replace the NetApps we run. Current ZFS are not good enough for our situation. We simply can not mount 500,000 file-systems on all the NFS clients. Nor do all servers we run support mirror-mounts. Nor do auto-mount see newly created directories without a full remount. Current UFS-style-user-quotas are very exact. To the byte even. We do not need this precision. If a user has 50MB of quota, and they are able to reach 51MB usage, then that is acceptable to us. Especially since they have to go under 50MB to be able to write new data, anyway. Instead of having complicated code in the kernel layer, slowing down the file-system with locking and semaphores (and perhaps avoiding learning indepth ZFS code?), I was wondering if a more simplistic setup could be designed, that would still be acceptable. I will use the word 'acceptable' a lot. Sorry. My thoughts are that the ZFS file-system will simply write a 'transaction log' on a pipe. By transaction log I mean uid, gid and 'byte count changed'. And by pipe I don't necessarily mean pipe(2), but it could be a fifo, pipe or socket. But currently I'm thinking '/dev/quota' style. User-land will then have a daemon, whether or not it is one daemon per file-system or really just one daemon does not matter. This process will open '/dev/quota' and empty the transaction log entries constantly. Take the uid,gid entries and update the byte-count in its database. How we store this database is up to us, but since it is in user-land it should have more flexibility, and is not as critical to be fast as it would have to be in kernel. The daemon process can also grow in number of threads as demand increases. Once a user's quota reaches the limit (note here that /the/ call to write() that goes over the limit will succeed, and probably a couple more after. This is acceptable) the process will blacklist the uid in kernel. Future calls to creat/open(CREAT)/write/(insert list of calls) will be denied. Naturally calls to unlink/read etc should still succeed. If the uid goes under the limit, the uid black-listing will be removed. If the user-land process crashes or dies, for whatever reason, the buffer of the pipe will grow in the kernel. If the daemon is restarted sufficiently quickly, all is well, it merely needs to catch up. If the pipe does ever get full and items have to be discarded, a full-scan will be required of the file-system. Since even with UFS quotas we need to occasionally run 'quotacheck', it would seem this too, is acceptable (if undesirable). If you have no daemon process running at all, you have no quotas at all. But the same can be said about quite a few daemons. The administrators need to adjust their usage. 
I can see a complication with doing a rescan. How could this be done efficiently? I don't know if there is a neat way to make this happen internally to ZFS, but from a user-land only point of view, perhaps a snapshot could be created (synchronised with the /dev/quota pipe reading?) and start a scan on the snapshot, while still processing kernel log. Once the scan is complete, merge the two sets. Advantages are that only small hooks are required in ZFS. The byte updates, and the blacklist with checks for being blacklisted. Disadvantages are that it is loss of precision, and possibly slower rescans? Sanity? But I do not really know the internals of ZFS, so I might be completely wrong, and everyone is laughing already. Discuss? Lund -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Eric Schrock, Fishworkshttp://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org
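If it ships along the lines sketched in that CR, the admin-visible interface would presumably be a per-user property, something like the following (the syntax is a guess until the code actually integrates):

    # hypothetical usage once 6501037 lands
    zfs set userquota@lundman=50m tank/home
    zfs get userquota@lundman tank/home

    # and some way to report per-user consumption, e.g.
    zfs userspace tank/home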
Re: [zfs-discuss] Export ZFS via ISCSI to Linux - Is it stable for production use now?
howard chen wrote: Hi, On Thu, Mar 12, 2009 at 1:19 AM, Darren J Moffat darr...@opensolaris.org wrote: That is all that has to be done on the OpenSolaris side to make a 10g lun available over iSCSI. The rest of it is all how Linux sets up its iSCSI client side which I don't know but I know on Solaris it is very easy using iscsiadm(1M). Thanks for your detail steps. Bbut I think using this setup, only one client can mount the share blocks at a time? So there must be a need of clustered file system. (e.g. gfs) iSCSI doesn't enforce that but the filesystem you run on top of the LUNs might. All the Linux side sees is a block device - that is the whole point of using iSCSI. If you don't want a block device then iSCSI (and FCoE) are the wrong protocols to be using. Just out of curious, what is the clustered file system used in Sun Unified Storage 7000 series for data sharing ammong clients? The S7000 doesn't use a cluster filesystem it exports ZFS datasets using one or more of iSCSI, NFS, CIFS, WebDAV, FTP, ie network filesystems or filetransfer protocols or a block protocol. When there is an S7000 cluster configuration the cluster is Active/Active with each head controlling one data pool and the services for it. When a cluster head fails the other head takes over the pool and the network addresses and starts to provide the services from a single head. This doesn't require a cluster filesystem -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
That is pretty freaking cool. On Thu, Mar 12, 2009 at 11:38 AM, Eric Schrock eric.schr...@sun.com wrote: Note that: 6501037 want user/group quotas on ZFS Is already committed to be fixed in build 113 (i.e. in the next month). - Eric On Thu, Mar 12, 2009 at 12:04:04PM +0900, Jorgen Lundman wrote: In the style of a discussion over a beverage, and talking about user-quotas on ZFS, I recently pondered a design for implementing user quotas on ZFS after having far too little sleep. It is probably nothing new, but I would be curious what you experts think of the feasibility of implementing such a system and/or whether or not it would even realistically work. I'm not suggesting that someone should do the work, or even that I will, but rather in the interest of chatting about it. Feel free to ridicule me as required! :) Thoughts: Here at work we would like to have user quotas based on uid (and presumably gid) to be able to fully replace the NetApps we run. Current ZFS are not good enough for our situation. We simply can not mount 500,000 file-systems on all the NFS clients. Nor do all servers we run support mirror-mounts. Nor do auto-mount see newly created directories without a full remount. Current UFS-style-user-quotas are very exact. To the byte even. We do not need this precision. If a user has 50MB of quota, and they are able to reach 51MB usage, then that is acceptable to us. Especially since they have to go under 50MB to be able to write new data, anyway. Instead of having complicated code in the kernel layer, slowing down the file-system with locking and semaphores (and perhaps avoiding learning indepth ZFS code?), I was wondering if a more simplistic setup could be designed, that would still be acceptable. I will use the word 'acceptable' a lot. Sorry. My thoughts are that the ZFS file-system will simply write a 'transaction log' on a pipe. By transaction log I mean uid, gid and 'byte count changed'. And by pipe I don't necessarily mean pipe(2), but it could be a fifo, pipe or socket. But currently I'm thinking '/dev/quota' style. User-land will then have a daemon, whether or not it is one daemon per file-system or really just one daemon does not matter. This process will open '/dev/quota' and empty the transaction log entries constantly. Take the uid,gid entries and update the byte-count in its database. How we store this database is up to us, but since it is in user-land it should have more flexibility, and is not as critical to be fast as it would have to be in kernel. The daemon process can also grow in number of threads as demand increases. Once a user's quota reaches the limit (note here that /the/ call to write() that goes over the limit will succeed, and probably a couple more after. This is acceptable) the process will blacklist the uid in kernel. Future calls to creat/open(CREAT)/write/(insert list of calls) will be denied. Naturally calls to unlink/read etc should still succeed. If the uid goes under the limit, the uid black-listing will be removed. If the user-land process crashes or dies, for whatever reason, the buffer of the pipe will grow in the kernel. If the daemon is restarted sufficiently quickly, all is well, it merely needs to catch up. If the pipe does ever get full and items have to be discarded, a full-scan will be required of the file-system. Since even with UFS quotas we need to occasionally run 'quotacheck', it would seem this too, is acceptable (if undesirable). If you have no daemon process running at all, you have no quotas at all. 
But the same can be said about quite a few daemons. The administrators need to adjust their usage. I can see a complication with doing a rescan. How could this be done efficiently? I don't know if there is a neat way to make this happen internally to ZFS, but from a user-land only point of view, perhaps a snapshot could be created (synchronised with the /dev/quota pipe reading?) and start a scan on the snapshot, while still processing kernel log. Once the scan is complete, merge the two sets. Advantages are that only small hooks are required in ZFS. The byte updates, and the blacklist with checks for being blacklisted. Disadvantages are that it is loss of precision, and possibly slower rescans? Sanity? But I do not really know the internals of ZFS, so I might be completely wrong, and everyone is laughing already. Discuss? Lund -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell) Japan | +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Eric Schrock, Fishworks http://blogs.sun.com/eschrock ___
Re: [zfs-discuss] User quota design discussion..
Jorgen Lundman wrote: In the style of a discussion over a beverage, and talking about user-quotas on ZFS, I recently pondered a design for implementing user quotas on ZFS after having far too little sleep. It is probably nothing new, but I would be curious what you experts think of the feasibility of implementing such a system and/or whether or not it would even realistically work. I'm not suggesting that someone should do the work, or even that I will, but rather in the interest of chatting about it. As it turns out, I'm working on zfs user quotas presently, and expect to integrate in about a month. My implementation is in-kernel, integrated with the rest of ZFS, and does not have the drawbacks you mention below. Feel free to ridicule me as required! :) Thoughts: Here at work we would like to have user quotas based on uid (and presumably gid) to be able to fully replace the NetApps we run. Current ZFS are not good enough for our situation. We simply can not mount 500,000 file-systems on all the NFS clients. Nor do all servers we run support mirror-mounts. Nor do auto-mount see newly created directories without a full remount. Current UFS-style-user-quotas are very exact. To the byte even. We do not need this precision. If a user has 50MB of quota, and they are able to reach 51MB usage, then that is acceptable to us. Especially since they have to go under 50MB to be able to write new data, anyway. Good, that's the behavior that user quotas will have -- delayed enforcement. Instead of having complicated code in the kernel layer, slowing down the file-system with locking and semaphores (and perhaps avoiding learning indepth ZFS code?), I was wondering if a more simplistic setup could be designed, that would still be acceptable. I will use the word 'acceptable' a lot. Sorry. My thoughts are that the ZFS file-system will simply write a 'transaction log' on a pipe. By transaction log I mean uid, gid and 'byte count changed'. And by pipe I don't necessarily mean pipe(2), but it could be a fifo, pipe or socket. But currently I'm thinking '/dev/quota' style. User-land will then have a daemon, whether or not it is one daemon per file-system or really just one daemon does not matter. This process will open '/dev/quota' and empty the transaction log entries constantly. Take the uid,gid entries and update the byte-count in its database. How we store this database is up to us, but since it is in user-land it should have more flexibility, and is not as critical to be fast as it would have to be in kernel. The daemon process can also grow in number of threads as demand increases. Once a user's quota reaches the limit (note here that /the/ call to write() that goes over the limit will succeed, and probably a couple more after. This is acceptable) the process will blacklist the uid in kernel. Future calls to creat/open(CREAT)/write/(insert list of calls) will be denied. Naturally calls to unlink/read etc should still succeed. If the uid goes under the limit, the uid black-listing will be removed. If the user-land process crashes or dies, for whatever reason, the buffer of the pipe will grow in the kernel. If the daemon is restarted sufficiently quickly, all is well, it merely needs to catch up. If the pipe does ever get full and items have to be discarded, a full-scan will be required of the file-system. Since even with UFS quotas we need to occasionally run 'quotacheck', it would seem this too, is acceptable (if undesirable). My implementation does not have this drawback. 
Note that you would need to use the recovery mechanism in the case of a system crash / power loss as well. Adding potentially hours to the crash recovery time is not acceptable. If you have no daemon process running at all, you have no quotas at all. But the same can be said about quite a few daemons. The administrators need to adjust their usage. I can see a complication with doing a rescan. How could this be done efficiently? I don't know if there is a neat way to make this happen internally to ZFS, but from a user-land only point of view, perhaps a snapshot could be created (synchronised with the /dev/quota pipe reading?) and start a scan on the snapshot, while still processing kernel log. Once the scan is complete, merge the two sets. Advantages are that only small hooks are required in ZFS. The byte updates, and the blacklist with checks for being blacklisted. Disadvantages are that it is loss of precision, and possibly slower rescans? Sanity? Not to mention that this information needs to get stored somewhere, and dealt with when you zfs send the fs to another system. But I do not really know the internals of ZFS, so I might be completely wrong, and everyone is laughing already. Discuss? --matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org
Re: [zfs-discuss] User quota design discussion..
On 12 March, 2009 - Matthew Ahrens sent me these 5,0K bytes: Jorgen Lundman wrote: In the style of a discussion over a beverage, and talking about user-quotas on ZFS, I recently pondered a design for implementing user quotas on ZFS after having far too little sleep. It is probably nothing new, but I would be curious what you experts think of the feasibility of implementing such a system and/or whether or not it would even realistically work. I'm not suggesting that someone should do the work, or even that I will, but rather in the interest of chatting about it. As it turns out, I'm working on zfs user quotas presently, and expect to integrate in about a month. My implementation is in-kernel, integrated with the rest of ZFS, and does not have the drawbacks you mention below. Is there any chance of this getting into S10? /Tomas -- Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
Bob Friesenhahn wrote: On Thu, 12 Mar 2009, Jorgen Lundman wrote: User-land will then have a daemon, whether or not it is one daemon per file-system or really just one daemon does not matter. This process will open '/dev/quota' and empty the transaction log entries constantly. Take the uid,gid entries and update the byte-count in its database. How we store this database is up to us, but since it is in user-land it should have more flexibility, and is not as critical to be fast as it would have to be in kernel. In order for this to work, ZFS data blocks need to somehow be associated with a POSIX user ID. To start with, the ZFS POSIX layer is implemented on top of a non-POSIX Layer which does not need to know about POSIX user IDs. ZFS also supports snapshots and clones. Yes, the DMU needs to communicate with the ZPL to determine the uid gid to charge each file to. This is done using a callback. The support for snapshots, clones, and potentially non-POSIX data storage, results in ZFS data blocks which are owned by multiple users at the same time, or multiple users over a period of time spanned by multiple snapshots. If ZFS clones are modified, then files may have their ownership changed, while the unmodified data continues to be shared with other users. If a cloned file has its ownership changed, then it would be quite tedious to figure out which blocks are now wholely owned by the new user, and which blocks are shared with other users. By the time the analysis is complete, it will be wrong. Before ZFS can apply per-user quota management, it is necessary to figure out how individual blocks can be charged to a user. This seems to be a very complex issue and common usage won't work with your proposal. Indeed. We have decided to charge for referenced space. This is the same concept used by the referenced, refquota, and refreservation properties, and reported by stat(2) in st_blocks, and du(1) on files today. This makes the issue much simpler. We don't need to worry about blocks being shared between clones or snapshots, because we charge for every time a block is referenced. When a clone is created, it starts with the same user accounting information as its origin snapshot. --matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
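In other words, the number a user gets charged should roughly track what the existing referenced-space accounting already reports, e.g.:

    # what a user's files reference today, as seen from POSIX
    du -sk /tank/home/lundman

    # the dataset-level equivalent (the value refquota constrains)
    zfs get referenced,refquota tank/home/lundman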
Re: [zfs-discuss] reboot when copying large amounts of data
maj == Maidak Alexander J maidakalexand...@johndeere.com writes: maj If you're having issues with a disk contoller or disk IO maj driver its highly likely that a savecore to disk after the maj panic will fail. I'm not sure how to work around this not in Solaris, but as a concept for solving the problem: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/kdump/kdump.txt;h=3f4bc840da8b7c068076dd057216e846e098db9f;hb=4a6908a3a050aacc9c3a2f36b276b46c0629ad91 They load a second kernel into a reserved spot of RAM, like 64MB or so, and forget about it. After a crash, they boot the second kernel. The second kernel runs using the reserved area of RAM as its working space, not touching any other memory, as if you were running on a very old machine with tiny RAM. It reprobes all the hardware, and then performs the dump. I don't know if it actually works, but the approach is appropriate if you are trying to debug the storage stack. You could even have a main kernel which crashes while taking an ordinary coredump, and then use the backup dumping-kernel to coredump the main kernel in mid-coredump---a dump of a dumping kernel. I think some Solaris developers were discussing putting coredump features into Xen, so the host could take the dump (or, maybe even something better than a dump---for example, if you built host/target debugging features into Xen for debugging running kernels, then you could just force a breakpoint in the guest instead of panic. Since Xen can hibernate domU's onto disk (it can, right?), you can treat the hibernated Xen-specific representation of the domU as the-dump, groveling through the ``dump'' with the same host/target tools you could use on a running kernel without any special dump support in the debugger itself). IIRC NetBSD developers discussed the same idea years ago but neither implementation exists. pgpsmSOamFWH7.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] usedby* properties for datasets created before v13
Gavin Maltby wrote: Hi, The manpage says: Specifically, used = usedbychildren + usedbydataset + usedbyrefreservation + usedbysnapshots. These properties are only available for datasets created on zpool version 13 pools. .. and I now realize that created at v13 is the important bit, rather than created pre-v13 and upgraded, and I see that datasets created on a version prior to 13 show '-' for these properties (might be nice to note that in the manpage - I took '-' to mean zero for a while). Anyway, is there any way to retrospectively populate these statistics (avoiding dataset reconstruction, that is)? No chance a scrub would/could do it?

In theory one could add code to calculate these after the fact. A tricky part is differentiating between usedbydataset and usedbysnapshots for clones. In that case you would need to examine all block pointers in the clone. Those born after the origin are usedbydataset, and usedbysnapshots is whatever's left over. Doing this while things are changing may be nontrivial.

--matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
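For a dataset that was created at v13 or later, the breakdown can be pulled out directly, e.g. (dataset name is a placeholder):

    zfs get used,usedbydataset,usedbysnapshots,usedbychildren,usedbyrefreservation tank/fs

    # recent builds also accept a columnar shorthand, if memory serves
    zfs list -o space tank/fs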
Re: [zfs-discuss] ZFS on a SAN
Grant, Yes this is correct. If host A goes belly up, you can deassign the LUN from host A and assign it to host B. Because host A has not gracefully exported its zpool, you will need to run 'zpool import -f poolname' on host B to force the pool to be imported, since it was never cleanly exported due to the unexpected inaccessibility of host A. It is possible to have the LUN visible to both machines at the same time, just not in use by both machines. This is in general how clusters work. Be aware that if you do this and access the disk on both systems, then you run a very real risk of corruption of the volume. I use the first approach here quite regularly in what I call 'poor man's clustering'. ;) I tend to install all my software and data environments on SAN-based LUNs, which makes them easy to move just by exporting the zpool, reassigning the LUN, then importing on the new system. Works well as long as the target system is at the same OS revision or greater. /Scott. Grant Lowe wrote: Hi Erik, A couple of questions about what you said in your email. In synopsis 2, if hostA has gone belly up and is no longer accessible, then a step that is implied (or maybe I'm just inferring it) is to go to the SAN and reassign the LUN from hostA to hostB. Correct? - Original Message From: Erik Trimble erik.trim...@sun.com To: Grant Lowe gl...@sbcglobal.net Cc: zfs-discuss@opensolaris.org Sent: Wednesday, March 11, 2009 1:42:06 PM Subject: Re: [zfs-discuss] ZFS on a SAN I'm not 100% sure what your question here is, but let me give you a (hopefully) complete answer: (1) ZFS is NOT a clustered file system, in the sense that it is NOT possible for two hosts to have the same LUN mounted at the same time, even if both are hooked to a SAN and can normally see that LUN. (2) ZFS can do failover, however. If you have a LUN from a SAN on hostA, create a ZFS pool on it, and use it as normal. Should you wish to fail over the LUN to hostB, you need to do a 'zpool export zpool' on hostA, then 'zpool import zpool' on hostB. If hostA has been lost completely (hung/died/etc) and you are unable to do an 'export' on it, you can force the import on hostB via 'zpool import -f zpool'. ZFS requires that you import/export entire POOLS, not just filesystems. So, given what you seem to want, I'd recommend this: On the SAN, create (2) LUNs - one for your primary data, and one for your snapshots/backups. On hostA, create a zpool on the primary data LUN (call it zpool A), and another zpool on the backup LUN (zpool B). Take snapshots on A, then use 'zfs send' and 'zfs receive' to copy the clone/snapshot over to zpool B, then 'zpool export B'. On hostB, import the snapshot pool: 'zpool import B'. It might just be as easy to have two independent zpools on each host, and just do a 'zfs send' on hostA, and 'zfs receive' on hostB to copy the snapshot/clone over the wire. -Erik On Wed, 2009-03-11 at 13:18 -0700, Grant Lowe wrote: Hi All, I'm new on ZFS, so I hope this isn't too basic a question. I have a host where I set up ZFS. The Oracle DBAs did their thing and I now have a number of ZFS datasets with their respective clones and snapshots on serverA. I want to export some of the clones to serverB. Do I need to zone serverB to see the same LUNs as serverA? Or does it have to have preexisting, empty LUNs to import the clones? Please help. Thanks.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- ___ Scott Lawson Systems Architect Manukau Institute of Technology Information Communication Technology Services Private Bag 94006 Manukau City Auckland New Zealand Phone : +64 09 968 7611 Fax: +64 09 968 7641 Mobile : +64 27 568 7611 mailto:sc...@manukau.ac.nz http://www.manukau.ac.nz perl -e 'print $i=pack(c5,(41*2),sqrt(7056),(unpack(c,H)-2),oct(115),10);' ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
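Pulling the commands from the exchange above together, the "poor man's clustering" failover sequence looks roughly like this (pool and dataset names are hypothetical):

    # planned move while hostA is still healthy
    hostA# zpool export mypool
    hostB# zpool import mypool

    # hostA died without exporting - force the import on hostB
    hostB# zpool import -f mypool

    # alternative: keep independent pools and copy snapshots across the wire
    hostA# zfs snapshot pool_a/data@backup
    hostA# zfs send pool_a/data@backup | ssh hostB zfs receive pool_b/data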
Re: [zfs-discuss] reboot when copying large amounts of data
I've managed to get the data transfer to work by rearranging my disks so that all of them sit on the integrated SATA controller. So, I feel pretty certain that this is either an issue with the Supermicro AOC-SAT2-MV8 card, or with PCI-X on the motherboard (though I would think that the integrated SATA would also be using the PCI bus?). The motherboard, for those interested, is an H8DME-2 (not, I now find after buying this box from Silicon Mechanics, a board that's on the Solaris HCL...) http://www.supermicro.com/Aplus/motherboard/Opteron2000/MCP55/h8dme-2.cfm So now I'm considering one of LSI's HBAs - what do list members think about this device: http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm On Thu, Mar 12, 2009 at 2:18 AM, Nathan Kroenert nathan.kroen...@sun.com wrote: definitely time to bust out some mdb -K or boot -k and see what it's moaning about. I did not see the screenshot earlier... sorry about that. Nathan. Blake wrote: I start the cp, and then, with prstat -a, watch the cpu load for the cp process climb to 25% on a 4-core machine. Load, measured for example with 'uptime', climbs steadily until the reboot. Note that the machine does not dump properly, panic or hang - rather, it reboots. I attached a screenshot earlier in this thread of the little bit of error message I could see on the console. The machine is trying to dump to the dump zvol, but fails to do so. Only sometimes do I see an error on the machine's local console - most times, it simply reboots. On Thu, Mar 12, 2009 at 1:55 AM, Nathan Kroenert nathan.kroen...@sun.com wrote: Hm - Crashes, or hangs? Moreover - how do you know a CPU is pegged? Seems like we could do a little more discovery on what the actual problem here is, as I can read it about 4 different ways. By this last piece of information, I'm guessing the system does not crash, but goes really really slow?? Crash == panic == we see stack dump on console and try to take a dump hang == nothing works == no response - might be worth looking at mdb -K or booting with a -k on the boot line. So - are we crashing, hanging, or something different? It might simply be that you are eating up all your memory, and your physical backing storage is taking a while to catch up? Nathan. Blake wrote: My dump device is already on a different controller - the motherboard's built-in nVidia SATA controller. The raidz2 vdev is the one I'm having trouble with (copying the same files to the mirrored rpool on the nVidia controller works nicely). I do notice that, when using cp to copy the files to the raidz2 pool, load on the machine climbs steadily until the crash, and one proc core pegs at 100%. Frustrating, yes. On Thu, Mar 12, 2009 at 12:31 AM, Maidak Alexander J maidakalexand...@johndeere.com wrote: If you're having issues with a disk controller or disk IO driver it's highly likely that a savecore to disk after the panic will fail. I'm not sure how to work around this, maybe a dedicated dump device not on a controller that uses a different driver than the one that you're having issues with? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Blake Sent: Wednesday, March 11, 2009 4:45 PM To: Richard Elling Cc: Marc Bevand; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reboot when copying large amounts of data I guess I didn't make it clear that I had already tried using savecore to retrieve the core from the dump device.
I added a larger zvol for dump, to make sure that I wasn't running out of space on the dump device: r...@host:~# dumpadm Dump content: kernel pages Dump device: /dev/zvol/dsk/rpool/bigdump (dedicated) Savecore directory: /var/crash/host Savecore enabled: yes I was using the -L option only to try to get some idea of why the system load was climbing to 1 during a simple file copy. On Wed, Mar 11, 2009 at 4:58 PM, Richard Elling richard.ell...@gmail.com wrote: Blake wrote: I'm attaching a screenshot of the console just before reboot. The dump doesn't seem to be working, or savecore isn't working. On Wed, Mar 11, 2009 at 11:33 AM, Blake blake.ir...@gmail.com wrote: I'm working on testing this some more by doing a savecore -L right after I start the copy. savecore -L is not what you want. By default, for OpenSolaris, savecore on boot is disabled. But the core will have been dumped into the dump slice, which is not used for swap. So you should be able to run savecore at a later time to collect the core from the last dump. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org
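As a side note on retrieving the dump discussed above: with a dedicated dump zvol configured as shown in that dumpadm output, the usual sequence after an unexpected reboot is to run savecore against the savecore directory and then inspect the result with mdb. A minimal sketch (the dump is assumed to be number 0; adjust the file names to match what savecore reports):

    # pull the last crash dump off the dump device
    savecore -v /var/crash/host

    # quick look at the panic summary and the panicking thread's stack
    echo ::status | mdb /var/crash/host/unix.0 /var/crash/host/vmcore.0
    echo ::stack  | mdb /var/crash/host/unix.0 /var/crash/host/vmcore.0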
Re: [zfs-discuss] CLI grinds to a halt during backups
Maybe you're also seeing this one? 6586537 async zio taskqs can block out userland commands -Jeff Blake wrote: I think we need some data to look at to find out what's being slow. Try some commands like this to get data: prstat -a iostat -x 5 zpool iostat 5 (if you are using ZFS) and then report sample output to this list. You might also consider enabling sar (svcadm enable sar), then reading the sar manpage. On Thu, Mar 12, 2009 at 10:36 AM, Marius van Vuuren mar...@breakpoint.co.za wrote: Hi, I have a X4150 with a J4200 connected populated with 12 x 1 TB Disks (SATA) I run backup_pc as my software for backing up. Is there anything I can do to make the command line more responsive during backup windows? At the moment it grinds to a complete standstill. Thanks ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
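For reference, the diagnostics suggested above, tidied into a copy-and-paste form:

    # per-user and per-process CPU/memory, refreshed every 5 seconds
    prstat -a 5

    # per-device I/O - watch the busy and service-time columns
    iostat -x 5

    # pool-level I/O (if you are using ZFS)
    zpool iostat 5

    # enable periodic system activity recording, then see the sar(1) manpage
    svcadm enable sar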
Re: [zfs-discuss] reboot when copying large amounts of data
On Thu, Mar 12, 2009 at 2:22 PM, Blake blake.ir...@gmail.com wrote: I've managed to get the data transfer to work by rearranging my disks so that all of them sit on the integrated SATA controller. So, I feel pretty certain that this is either an issue with the Supermicro aoc-sat2-mv8 card, or with PCI-X on the motherboard (though I would think that the integrated SATA would also be using the PCI bus?). The motherboard, for those interested, is an HD8ME-2 (not, I now find after buying this box from Silicon Mechanics, a board that's on the Solaris HCL...) http://www.supermicro.com/Aplus/motherboard/Opteron2000/MCP55/h8dme-2.cfm So I'm not considering one of LSI's HBA's - what do list members think about this device: http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htmhttp://www.provantage.com/lsi-logic-lsi00117%7E7LSIG03X.htm I believe the MCP55's SATA controllers are actually PCI-E based. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reboot when copying large amounts of data
Tim wrote: On Thu, Mar 12, 2009 at 2:22 PM, Blake blake.ir...@gmail.com mailto:blake.ir...@gmail.com wrote: I've managed to get the data transfer to work by rearranging my disks so that all of them sit on the integrated SATA controller. So, I feel pretty certain that this is either an issue with the Supermicro aoc-sat2-mv8 card, or with PCI-X on the motherboard (though I would think that the integrated SATA would also be using the PCI bus?). The motherboard, for those interested, is an HD8ME-2 (not, I now find after buying this box from Silicon Mechanics, a board that's on the Solaris HCL...) http://www.supermicro.com/Aplus/motherboard/Opteron2000/MCP55/h8dme-2.cfm So I'm not considering one of LSI's HBA's - what do list members think about this device: http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm http://www.provantage.com/lsi-logic-lsi00117%7E7LSIG03X.htm I believe the MCP55's SATA controllers are actually PCI-E based. I use Tyan 2927 motherboards. They have on-board nVidia MCP55 chipsets, which is the same chipset at the X4500 (IIRC). I wouldn't trust the MCP55 chipset in OpenSolaris. I had random disk hangs even while the machine was mostly idle. In Feb 2008 I bought AOC-SAT2-MV8 cards and moved all my drives to these add-in cards. I haven't had any issues with drive hanging since. There does not seem to be any problems with the SAT2-MV8 under heavy load in my servers from what I've seen. When the SuperMicro AOC-USAS-L8i came out later last year, I started using them instead. They work better than the SAT2-MV8s. This card needs a 3U or bigger case: http://www.supermicro.com/products/accessories/addon/AOC-USAS-L8i.cfm This is the low profile card that will fit in a 2U: http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm They both work in normal PCI-E slots on my Tyan 2927 mobos. Finding good non-Sun hardware that works very well under OpenSolaris is frustrating to say the least. Good luck. -- Dave ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reboot when copying large amounts of data
b == Blake blake.ir...@gmail.com writes: b http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm I'm having trouble matching up chips, cards, drivers, platforms, and modes with the LSI stuff. The more I look at it the mroe confused I get. Platforms: x86 SPARC Drivers: mpt mega_sas mfi Chips: 1068 (SAS, PCI-X) 1068E (SAS, PCIe) 1078 ??? -- from supermicro, seems to be SAS, PCIe, with support for 256 - 512MB RAM instead of the 16 - 32MB RAM on the others 1030 (parallel scsi) Cards: LSI cards http://www.lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/index.html I love the way they use the numbers 3800 and 3080, so you are constantly transposing them thus leaving google littered with all this confusingly wrong information. LSISAS3800X(PCI-X, external ports) LSISAS3080X-R (PCI-X, internal ports) LSISAS3801X(PCI-X, external ports) LSISAS3801E(PCIe, external ports) LSISAS3081E-R (PCIe, internal ports) I would have thought -R meant ``suports RAID'' but all I can really glean through the foggy marketing-glass behind which all the information is hidden, is -R means ``all the ports are internal''. Supermicro cards http://www.supermicro.com/products/accessories/index.cfm wow, this is even more of a mess. These are all UIO cards so I assume they have the PCIe bracket on backwards AOC-USAS-L4i(PCIe, 4 internal 4 external) AOC-USAS-L8i, AOC-USASLP-L8i(PCIe, internal ports) based on 1068E sounds similar to LSISAS3081E. Is that also 1068E? supports RAID0, RAID1, RAID10 AOC-USAS-L4iR identical to the above, but ``includes iButton'' which is an old type of smartcard-like device with sometimes crypto and javacard support. apparently some kind of license key to unlock RAID5? no L8iR exists though, only L4iR. I have the L8i, and it does have an iButton socket with no button in it. AOC-USAS-H4iR AOC-USAS-H8iR, AOC-USASLP-H8iR (PCIe, internal ports) based on 1078 low-profile version has more memory than fullsize version?! but here is the most fun thing about the supermicro cards. All cards have one driver *EXCEPT* the L8i, which has three drivers for three modes: IT, IR, and SR. When I google for this I find notes on some of their integrated motherboards like: * The onboard LSI 1068E supported SR and IT mode but not IR mode. I also found this: * SR = Software RAID IT = Integrate. Target mode. IR mode is not supported. but no idea what the three modes are. searching for SAS SR IT IR doesn't work either, so it's not some SAS thing. What *is* it? also there seem to be two different kinds of quad-SATA connector on these SAS cards so there are two different kinds of octopus cable. Questions: * which chips are used by each of the LSI boards? I can guess, but in particular LSISAS3800X and LSISAS3801X seem to be different chips, while from the list of chips I'd have no choice but to guess they are both 1068. * which drivers work on x86 and which SPARC? I know some LSI cards work in SPARC but maybe not all---do the drivers support the same set of cards on both platforms? Or will normal cards not work in SPARC for lack of Forth firmware to perform some LSI-proprietary ``initialization'' ritual? * which chips go with which drivers? Is it even that simple---will adding an iButton RAID5 license to a SuperMicro board make the same card change from mega_sas to mpt attachment, or something similar? 
For example there is a bug here about a 1068E card which doesn't work, even though most 1068E cards do work: http://bugs.opensolaris.org/view_bug.do?bug_id=6736187 Maybe the Solaris driver needs IR mode and won't work with the onboard supermicro chip which supports only ``software raid'' whatever that means, which is maybe denoted by SR? What does the iButton unlock, then, features of IR mode which are abstracted from the OS driver? * What are SR, IT, and IR mode? Which modes do the Solaris drivers use, or does it matter? * Has someone found the tool mentioned here by some above-the-table means, or only by request from LSI?: http://www.opensolaris.org/jive/message.jspa?messageID=184811#184811 The mention that a SPARC version of the tool exists is encouraging. The procedure to clear persistent mappings through the BIOS obviously won't work on SPARC. Here are the notes I have so far: -8- The driver for LSI's MegaRAID SAS card is mega_sas which was integrated into snv_88. It's planned for backporting to a Solaris 10 update. There is also a BSD-licensed driver for that hardware, called mfi. It's available from http://www.itee.uq.edu.au/~dlg/mfi a scsi_vhci sort of driver for the LSI card in the Ultra {20,25} Well yes, that's mpt(7d) as delivered
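One way to start answering the "which chip goes with which driver" questions is to look at what a running system has actually bound. A rough sketch using standard Solaris tools (the grep patterns are only illustrative):

    # which driver attached to each device node (look for mpt or mega_sas)
    prtconf -D | egrep -i 'lsi|mpt|mega'

    # PCI vendor/device IDs of the HBA (vendor 1000 is LSI)
    prtconf -pv | grep -i 'pci1000'

    # which PCI IDs each driver claims
    egrep 'mpt|mega_sas' /etc/driver_aliases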
Re: [zfs-discuss] reboot when copying large amounts of data
For what it's worth, I have been running Nevada (so, same kernel as opensolaris) for ages (at least 18 months) on a Gigabyte board with the MCP55 chipset and it's been flawless. I liked it so much, I bought it's newer brother, based on the nvidia 750SLI chipset... M750SLI-DS4 Cheers! Nathan. On 13/03/09 09:21 AM, Dave wrote: Tim wrote: On Thu, Mar 12, 2009 at 2:22 PM, Blake blake.ir...@gmail.com mailto:blake.ir...@gmail.com wrote: I've managed to get the data transfer to work by rearranging my disks so that all of them sit on the integrated SATA controller. So, I feel pretty certain that this is either an issue with the Supermicro aoc-sat2-mv8 card, or with PCI-X on the motherboard (though I would think that the integrated SATA would also be using the PCI bus?). The motherboard, for those interested, is an HD8ME-2 (not, I now find after buying this box from Silicon Mechanics, a board that's on the Solaris HCL...) http://www.supermicro.com/Aplus/motherboard/Opteron2000/MCP55/h8dme-2.cfm So I'm not considering one of LSI's HBA's - what do list members think about this device: http://www.provantage.com/lsi-logic-lsi00117~7LSIG03X.htm http://www.provantage.com/lsi-logic-lsi00117%7E7LSIG03X.htm I believe the MCP55's SATA controllers are actually PCI-E based. I use Tyan 2927 motherboards. They have on-board nVidia MCP55 chipsets, which is the same chipset at the X4500 (IIRC). I wouldn't trust the MCP55 chipset in OpenSolaris. I had random disk hangs even while the machine was mostly idle. In Feb 2008 I bought AOC-SAT2-MV8 cards and moved all my drives to these add-in cards. I haven't had any issues with drive hanging since. There does not seem to be any problems with the SAT2-MV8 under heavy load in my servers from what I've seen. When the SuperMicro AOC-USAS-L8i came out later last year, I started using them instead. They work better than the SAT2-MV8s. This card needs a 3U or bigger case: http://www.supermicro.com/products/accessories/addon/AOC-USAS-L8i.cfm This is the low profile card that will fit in a 2U: http://www.supermicro.com/products/accessories/addon/AOC-USASLP-L8i.cfm They both work in normal PCI-E slots on my Tyan 2927 mobos. Finding good non-Sun hardware that works very well under OpenSolaris is frustrating to say the least. Good luck. -- Dave ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- // // Nathan Kroenert nathan.kroen...@sun.com // // Senior Systems Engineer Phone: +61 3 9869 6255 // // Global Systems Engineering Fax:+61 3 9869 6288 // // Level 7, 476 St. Kilda Road // // Melbourne 3004 VictoriaAustralia // // ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reboot when copying large amounts of data
On Thu, Mar 12, 2009 at 18:30, Miles Nordin car...@ivy.net wrote: I love the way they use the numbers 3800 and 3080, so you are constantly transposing them thus leaving google littered with all this confusingly wrong information. Think of the middle two digits as (number of external ports, number of internal ports). For example, I have a 3442E-R which has 4 internal and 4 external ports, the 3800 has 8 external ports and 0 internal, and so forth. One place this breaks down is with cards like the ; it has a total of 8 ports, any group of 4 of which can be mapped to internal or external ports. AOC-USAS-L4iR identical to the above, but ``includes iButton'' which is an old type of smartcard-like device with sometimes crypto and javacard support. apparently some kind of license key to unlock RAID5? no L8iR exists though, only L4iR. I have the L8i, and it does have an iButton socket with no button in it. I think the iButton is just used as an unlock code for the builtin RAID 5 functionality. Nothing the end user cares about, unless they want RAID and have to spend the extra money. * SR = Software RAID IT = Integrate. Target mode. IR mode is not supported. Integrated target mode lets you export some storage attached to the host system (through another adapter, presumably) as a storage device. IR mode is almost certainly Internal RAID, which that card doesn't have support for. also there seem to be two different kinds of quad-SATA connector on these SAS cards so there are two different kinds of octopus cable. Yes---SFF-8484 and SFF-8087 are the key words. SATA disks will always show up when attached to a SAS HBA, because that's one of the requirements of the SAS specification. I'm not sure what you mean by this. SAS controllers can control SATA disks, and interact with them. They don't just show up; they're first-class citizens. Will ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
Bob Friesenhahn wrote: In order for this to work, ZFS data blocks need to somehow be associated with a POSIX user ID. To start with, the ZFS POSIX layer is implemented on top of a non-POSIX Layer which does not need to know about POSIX user IDs. ZFS also supports snapshots and clones. This I did not know, but now that you point it out, this would be the right way to design it. So the advantage of requiring less ZFS integration is no longer the case. Lund -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
Eric Schrock wrote: Note that: 6501037 want user/group quotas on ZFS Is already committed to be fixed in build 113 (i.e. in the next month). - Eric Wow, that would be fantastic. We have the Sun vendors camped out at the data center trying to apply fresh patches. I believe 6798540 fixed the largest issue but it would be desirable to be able to use just ZFS. Is this a project needing donations? I see your address is at Sun.com, and we already have 9 x4500s, but maybe you need some pocky, asse, collon or pocari sweat... Lundy [1] BugID:6798540 3-way deadlock happens in ufs filesystem on zvol when writng ufs log -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
As it turns out, I'm working on zfs user quotas presently, and expect to integrate in about a month. My implementation is in-kernel, integrated with the rest of ZFS, and does not have the drawbacks you mention below. I merely suggested my design as it may have been something I _could_ have implemented, as it required little ZFS knowledge. (Adding hooks is usually easier.) But naturally that has already been shown not to be the case. A proper implementation is always going to be much more desirable :) Good, that's the behavior that user quotas will have -- delayed enforcement. There probably are situations where precision is required, or perhaps historical reasons, but for us delayed enforcement may even be better. Perhaps it is better for the delivery of an email message that goes over the quota to be allowed to finish writing the entire message than to abort a write() call somewhere in the middle and return failures all the way back up to generating a bounce message. Maybe - I can't say I have thought about it. My implementation does not have this drawback. Note that you would need to use the recovery mechanism in the case of a system crash / power loss as well. Adding potentially hours to the crash recovery time is not acceptable. Great! Will there be any particular limits on how many uids, or on the size of uids, in your implementation? UFS generally does not, but I did note that if uids go over 1000 it flips out and changes the quotas file to 128GB in size. Not to mention that this information needs to get stored somewhere, and dealt with when you zfs send the fs to another system. That is a good point; I had not even planned to support quotas for ZFS send, but consider a rescan to be the answer. We don't ZFS send very often as it is far too slow. Lund -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] User quota design discussion..
Jorgen Lundman wrote: Great! Will there be any particular limits on how many uids, or size of uids in your implementation? UFS generally does not, but I did note that if uid go over 1000 it flips out and changes the quotas file to 128GB in size. All UIDs, as well as SIDs (from the SMB server), are permitted. Any number of users and quotas are permitted, and handled efficiently. Note, UID on Solaris is a 31-bit number. --matt ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
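For what it's worth, the per-user quota work being discussed here surfaced as per-dataset userquota@/userused@ properties plus a 'zfs userspace' summary. A sketch of that usage (syntax recalled from the delivered feature rather than taken from this thread; dataset and user names are hypothetical):

    zfs set userquota@lundman=10G tank/home
    zfs get userquota@lundman,userused@lundman tank/home

    # space charged to every user of the dataset
    zfs userspace tank/home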
Re: [zfs-discuss] reboot when copying large amounts of data
wm == Will Murnane will.murn...@gmail.com writes: * SR = Software RAID IT = Integrate. Target mode. IR mode is not supported. wm Integrated target mode lets you export some storage attached wm to the host system (through another adapter, presumably) as a wm storage device. IR mode is almost certainly Internal RAID, wm which that card doesn't have support for. no, the supermicro page for AOC-USAS-L8i does claim support for all three, and supermicro has an ``IR driver'' available for download for Linux and Windows, or at least a link to one. I'm trying to figure out what's involved in determining and switching modes, why you'd want to switch them, what cards support which modes, which solaris drivers support which modes, u.s.w. The answer may be very simple, like ``the driver supports only IR. Most cards support IR, and cards that don't support IR won't work. IR can run in single-LUN mode. Some IR cards support RAID5, others support only RAID 0, 1, 10.'' Or it could be ``the driver supports only SR. The driver is what determines the mode, and it does this by loading firmware into the card, and the first step in initializing the card is always for the driver to load in a firmware blob. All currently-produced cards support SR.'' so...actually, now that I say it, I guess the answer cannot be very simple. It's going to have to be a little complicated. Anyway, I can guess, too. I was hoping someone would know for sure off-hand. pgpv7SB8wKna7.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reboot when copying large amounts of data
On Thu, 12 Mar 2009 22:24:12 -0400 Miles Nordin car...@ivy.net wrote: wm == Will Murnane will.murn...@gmail.com writes: * SR = Software RAID IT = Integrate. Target mode. IR mode is not supported. wm Integrated target mode lets you export some storage attached wm to the host system (through another adapter, presumably) as a wm storage device. IR mode is almost certainly Internal RAID, wm which that card doesn't have support for. no, the supermicro page for AOC-USAS-L8i does claim support for all three, and supermicro has an ``IR driver'' available for download for Linux and Windows, or at least a link to one. I'm trying to figure out what's involved in determining and switching modes, why you'd want to switch them, what cards support which modes, which solaris drivers support which modes, u.s.w. The answer may be very simple, like ``the driver supports only IR. Most cards support IR, and cards that don't support IR won't work. IR can run in single-LUN mode. Some IR cards support RAID5, others support only RAID 0, 1, 10.'' Or it could be ``the driver supports only SR. The driver is what determines the mode, and it does this by loading firmware into the card, and the first step in initializing the card is always for the driver to load in a firmware blob. All currently-produced cards support SR.'' so...actually, now that I say it, I guess the answer cannot be very simple. It's going to have to be a little complicated. Anyway, I can guess, too. I was hoping someone would know for sure off-hand. Hi Miles, the mpt(7D) driver supports that card. mpt(7D) supports both IT and IR firmware variants. You can find out the specifics for what RAID volume levels are supported by reading the raidctl(1M) manpage. I don't think you can switch between IT and IR firmware, but not having needed to know this before, I haven't tried it. James C. McPherson -- Senior Kernel Software Engineer, Solaris Sun Microsystems http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
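For completeness, the raidctl(1M) interface mentioned above is also the quickest way to see what the mpt driver thinks it is managing. A minimal sketch (disk names are hypothetical; check the manpage for the volume levels your particular firmware supports):

    # list RAID controllers/volumes known to the driver
    raidctl -l

    # create a hardware mirror from two disks behind the controller
    raidctl -c c1t0d0 c1t1d0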