Re: [zfs-discuss] Odd prioritisation issues.
Dickon Hood writes:

On Fri, Dec 07, 2007 at 13:14:56 +, I wrote:
: On Fri, Dec 07, 2007 at 12:58:17 +, Darren J Moffat wrote:
: : Dickon Hood wrote:
: : : On Fri, Dec 07, 2007 at 12:38:11 +, Darren J Moffat wrote:
: : : : Dickon Hood wrote:
: : : : : We're seeing the writes stall in favour of the reads. For normal workloads I can understand the reasons, but I was under the impression that real-time processes essentially trump all others, and I'm surprised by this behaviour; I had a dozen or so RT processes sat waiting for disc for about 20s.
: : : : Are the files opened with O_DSYNC or does the application call fsync?
: : : No. O_WRONLY|O_CREAT|O_LARGEFILE|O_APPEND. Would that help?
: : Don't know if it will help, but it will be different :-). I suspected that since you put the processes in the RT class you would also be doing synchronous writes.
: Right. I'll let you know on Monday; I'll need to restart it in the morning.

I was a tad busy yesterday and didn't have the time, but I've switched one of our recorder processes (the one doing the HD stream; ~17Mb/s, broadcasting a preview we don't mind trashing) to a version of the code which opens its file O_DSYNC as suggested. We've gone from ~130 write ops per second and 10MB/s to ~450 write ops per second and 27MB/s, with marginally higher CPU usage. This is roughly what I'd expect.

We've artificially throttled the reads, which has helped the starvation problem (but not fixed it; it isn't as deterministic as we'd like) at the expense of increasing a latency we'd rather have as close to zero as possible.

Any ideas?

O_DSYNC was a good idea. Then, if you have a recent Nevada build, you can use the separate intent log (the log keyword in zpool create) to absorb those writes without spindle competition with the reads. Your write workload should then be well handled here (unless the incoming network processing is itself delayed).

-r

Thanks.
-- Dickon Hood Due to digital rights management, my .sig is temporarily unavailable. Normal service will be resumed as soon as possible. We apologise for the inconvenience in the meantime. No virus was found in this outgoing message as I didn't bother looking. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
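For reference, the flag change Dickon describes — adding O_DSYNC to the recorder's open call so each write is committed to stable storage before returning — can be sketched as below. This is an illustrative Python sketch, not the actual recorder code (which is presumably C); the file name is invented, and O_DSYNC availability varies by platform, hence the fallback to 0.

```python
import os
import tempfile

def open_for_append(path, dsync=False):
    # Flags from the thread: O_WRONLY|O_CREAT|O_APPEND, plus O_DSYNC
    # when requested (fall back to 0 where the platform lacks it).
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    if dsync:
        flags |= getattr(os, "O_DSYNC", 0)
    return os.open(path, flags, 0o644)

# Invented file name, standing in for the recorder's output stream.
path = os.path.join(tempfile.mkdtemp(), "hd-stream.ts")
fd = open_for_append(path, dsync=True)
n = os.write(fd, b"payload")   # with O_DSYNC, returns only once data is on stable storage
os.close(fd)
print(n, os.path.getsize(path))
```

With O_DSYNC each write becomes synchronous, which is what lets ZFS direct it through the intent log rather than batching it behind competing reads.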
[zfs-discuss] LSI SAS3081E = unstable drive numbers?
Based on recommendations from this list, I asked the company that built my box to use an LSI SAS3081E controller. The first problem I noticed was that the drive numbers were ordered incorrectly. That is, given that my system has 24 bays (6 rows, 4 bays/row), the drive numbers from top-to-bottom, left-to-right were 6, 1, 0, 2, 4, 5 - even though when the system boots, each drive is scanned in perfect order (I can tell by watching the LEDs blink). I contacted LSI tech support and they explained:

start response
SAS treats device IDs differently than SCSI. LSI SAS controllers remember devices in the order they were discovered by the controller. This memory is persistent across power cycles. It is based on the world wide name (WWN) given uniquely to every SAS device. This allows your boot device to remain your boot device no matter where it migrates in the SAS topology. In order to clear the memory of existing devices you need at least one device that will not be present in your final configuration. Re-boot the machine and enter the LSI configuration utility (CTRL-C). Then find your way to SAS Topology. To see more options, press CTRL-M. Choose the option to clear all non-present device IDs. This clears the persistent memory of all devices not present at that time. Exchange the drives. The system will now remember the order it finds the drives after the next boot cycle.
end response

Sure enough, I was able to physically reorder my drives so they were 0, 1, 2, 4, 5, 6 - so, apparently, the company that put my system together moved the drives around after they were initially scanned. But where is 3? (Answer below.) Then I tried another test:

1. Make the first disk blink:
# dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
10+0 records in
10+0 records out

2. Pull disk '0' out and replace it with a brand new disk:
# dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
dd: /dev/dsk/c2t0d0p0: open: No such file or directory

3. Scratch head and try again with '3' (I had previously cleared the LSI controller's memory):
# dd if=/dev/dsk/c2t3d0p0 of=/dev/null count=10
10+0 records in
10+0 records out

So, it seems my SAS controller is being too smart for its own good - it tracks the drives themselves, not the drive bays. If I hot-swap a brand new drive into a bay, Solaris will see it as a new disk, not a replacement for the old disk. How can ZFS support this? I asked LSI tech support again and got:

start quote
I don't have the knowledge to answer that, so I'll just say this: most vendors, including Sun, set up the SAS HBA to use enclosure/slot naming, which means that if a drive is swapped, it does NOT get a new name (after all, the enclosure and slot did not change).
end quote

So, now I turn to you... Here is some information about my system:

Specs:
* Motherboard: SuperMicro H8DME-2 Rev 2.01 - BIOS: AMI v2.58
* HBA: LSI SAS3081E (SN: P068170707) installed in slot #5 - LSI Configuration Utility v6.16.00.00 (2007-05-07)
* Backplane: CI-Design 12-6412-01BR
* HBA connected to BP via two SFF-8087-to-SFF-8087 cables
* OS: SXCE b74

Details:
* Chassis has 24 SAS/SATA bays
* There are 6 backplanes - one for each *row* of drives
* I currently have only 6 drives installed (see pic)
* The LSI card is plugged into backplanes 1 and 2
* The LSI card is NOT configured to do any RAID - it's JBOD only, as I'm using Solaris's ZFS (software RAID)

Question:
* I only plan to use SATA drives - would using a SATA controller like Supermicro's AOC-SAT2-MV8 help?

Thanks again, Kent
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
Hi Kent, I'm one of the team that works on Solaris' mpt driver, which we recently enhanced to deliver mpxio support with SAS. I have a bit of knowledge about your issue :-)

Kent Watsen wrote:

Based on recommendations from this list, I asked the company that built my box to use an LSI SAS3081E controller. The first problem I noticed was that the drive numbers were ordered incorrectly. That is, given that my system has 24 bays (6 rows, 4 bays/row), the drive numbers from top-to-bottom, left-to-right were 6, 1, 0, 2, 4, 5 - even though when the system boots, each drive is scanned in perfect order (I can tell by watching the LEDs blink). I contacted LSI tech support and they explained:

start response
SAS treats device IDs differently than SCSI. LSI SAS controllers remember devices in the order they were discovered by the controller. This memory is persistent across power cycles. It is based on the world wide name (WWN) given uniquely to every SAS device. This allows your boot device to remain your boot device no matter where it migrates in the SAS topology. In order to clear the memory of existing devices you need at least one device that will not be present in your final configuration. Re-boot the machine and enter the LSI configuration utility (CTRL-C). Then find your way to SAS Topology. To see more options, press CTRL-M. Choose the option to clear all non-present device IDs. This clears the persistent memory of all devices not present at that time. Exchange the drives. The system will now remember the order it finds the drives after the next boot cycle.
end response

Firstly, yes, the LSI SAS HBAs do use persistent mapping, with a logical target id by default. This is where the HBA does the translation between the physical disk device's SAS address (which you'll see in prtconf -v as the devid) and an essentially arbitrary target number which gets passed up to the OS - in this case Solaris. The support person at LSI was correct about deleting all those mappings.
Yes, the controller is being smart and tracking the actual device rather than a particular bay/slot mapping. This isn't so bad, mostly. The effect for you is that you can't assume that the replaced device is going to have the same target number as the old one (in fact, I'd call that quite unlikely), so you'll have to see what the new device name is by checking your dmesg or iostat -En output.

Kent Watsen wrote:

Sure enough, I was able to physically reorder my drives so they were 0, 1, 2, 4, 5, 6 - so, apparently, the company that put my system together moved the drives around after they were initially scanned. But where is 3? (Answer below.) Then I tried another test:

1. Make the first disk blink:
# dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
10+0 records in
10+0 records out

2. Pull disk '0' out and replace it with a brand new disk:
# dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
dd: /dev/dsk/c2t0d0p0: open: No such file or directory

3. Scratch head and try again with '3' (I had previously cleared the LSI controller's memory):
# dd if=/dev/dsk/c2t3d0p0 of=/dev/null count=10
10+0 records in
10+0 records out

So, it seems my SAS controller is being too smart for its own good - it tracks the drives themselves, not the drive bays. If I hot-swap a brand new drive into a bay, Solaris will see it as a new disk, not a replacement for the old disk. How can ZFS support this? I asked LSI tech support again and got:

start quote
I don't have the knowledge to answer that, so I'll just say this: most vendors, including Sun, set up the SAS HBA to use enclosure/slot naming, which means that if a drive is swapped, it does NOT get a new name (after all, the enclosure and slot did not change).
end quote

Now here's where things get murky. At this point in time at least (it may change!) Solaris' mpt driver uses LSI's logical target id mapping method. This is *NOT* an enclosure/slot naming method - at least, not from the OS' point of view.
Additionally, unless you're using an actual real SCSI Enclosure Services (ses) device, there's no enclosure to provide enclosure/slot mapping with either. Since mpt uses logical target ids, the target id which Solaris sees _will definitely change_ if you swap a disk. (I'm a tad annoyed that the LSI support person appears to have made an assumption based on a total lack of understanding about how Solaris' mpt driver works.)

(My assumption here is that you're using Solaris' mpt(7d) driver rather than LSI's itmpt driver.)

So how do you use your system and its up to 24 drives with ZFS? (a) Ensure that you note what Solaris's idea of the target id is when you replace a drive, then (b) use zpool replace to tell ZFS what to do with the new device in your enclosure.

I hope the above helps you along the way... but I'm sure you'll have
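James's two-step procedure can be mechanised by snapshotting the device list before the swap and diffing it afterwards. A minimal sketch (Python for illustration only; the ctNdN names and the pool name are invented, and in practice the lists would come from format, dmesg, or iostat -En):

```python
def diff_devices(before, after):
    """Return (added, removed) device names between two snapshots."""
    added = sorted(set(after) - set(before))
    removed = sorted(set(before) - set(after))
    return added, removed

# Hypothetical snapshots: disk c2t0d0 pulled, replacement shows up as c2t3d0.
before = ["c2t0d0", "c2t1d0", "c2t2d0"]
after = ["c2t1d0", "c2t2d0", "c2t3d0"]
added, removed = diff_devices(before, after)

# Template for step (b); 'tank' is a made-up pool name.
print("zpool replace tank %s %s" % (removed[0], added[0]))
```

The printed command is only a template - you would confirm both device names against your own system before running anything.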
[zfs-discuss] Thumper with many NFS-export ZFS filesystems
[0] andromeda:/2common/sge# wc /etc/dfs/sharetab
    1853    7412  157646 /etc/dfs/sharetab

This machine (a Thumper) currently runs Solaris 10 Update 3 (with some patches) and things work just fine. Now, I'm a bit worried about reboot times due to the number of exported filesystems, and I'm thinking of installing some version of Nevada instead. What's other people's experience? I assume just going to Update 4 will not be enough, since that one doesn't contain the in-kernel sharetab and such.

I'm going to move about 600 of those filesystems off that machine in the near future, and I think I could (if forced) group another 600 of those together into one single filesystem. Suggestions? Or is it no big deal (I'm thinking of the test report where you would see more than a week of boot time with many filesystems)?

Hmm... now that I come to think about it, we have another server (located at the computer club here) that boots just fine with 1582 entries in /etc/dfs/sharetab, and that's a *much* slower server (a Sun Enterprise 450), also with ZFS-backed storage.
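A rough way to gauge how much grouping would shrink the share count is to bucket sharetab entries by top-level directory. A sketch (Python for illustration; the sample lines are invented, assuming sharetab's tab-separated fields of path, resource, fstype, options):

```python
from collections import Counter

# Invented sample entries in sharetab's format:
# path <TAB> resource <TAB> fstype <TAB> options
sample_sharetab = (
    "/export/home/alice\t-\tnfs\trw\n"
    "/export/home/bob\t-\tnfs\trw\n"
    "/2common/sge\t-\tnfs\trw\n"
)

def shares_by_top_dir(text):
    """Count exported filesystems grouped by top-level directory."""
    counts = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue
        path = line.split("\t")[0]
        top = "/" + path.strip("/").split("/")[0]
        counts[top] += 1
    return counts

print(sorted(shares_by_top_dir(sample_sharetab).items()))
```

Run against the real /etc/dfs/sharetab, the buckets with hundreds of entries are the candidates for consolidation into a single shared filesystem.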
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
Wow, how fortunate for me that you are on this list! I guess I do have a follow-up question...

If each new drive gets a new id when plugged into the system - and I learn to discover that drive's id using dmesg or iostat and use `zpool replace` correctly - when a drive fails, what will it take for me to physically find it? I'm hoping there is a command, like dd, that I can use to make that drive's LED blink, but I don't know if I can trust that `dd` will work at all when the drive is failed! Since I don't have enclosure services, does that mean my best option is to manually track id-to-bay mappings? (Envision a clipboard hanging on the rack.)

Given that manually tracking shifting ids doesn't sound appealing to me, would using a SATA controller like the AOC-SAT2-MV8 resolve the issue? Given that I currently only have one LSI HBA, I'd need to get 2 more for all 24 drives, or I could get 3 of these SATA controllers plus 6 discrete-to-8087 reverse breakout cables. Going down the LSI route would cost about $600 while going down the AOC-SAT2-MV8 route would cost about $400. I understand that the SATA controllers are less performant, but I'd gladly trade some performance I'm likely never to need for simpler administration...

Thanks, Kent

James C. McPherson wrote:

Hi Kent, I'm one of the team that works on Solaris' mpt driver, which we recently enhanced to deliver mpxio support with SAS. I have a bit of knowledge about your issue :-)

Kent Watsen wrote:

Based on recommendations from this list, I asked the company that built my box to use an LSI SAS3081E controller. The first problem I noticed was that the drive numbers were ordered incorrectly. That is, given that my system has 24 bays (6 rows, 4 bays/row), the drive numbers from top-to-bottom, left-to-right were 6, 1, 0, 2, 4, 5 - even though when the system boots, each drive is scanned in perfect order (I can tell by watching the LEDs blink).
I contacted LSI tech support and they explained:

start response
SAS treats device IDs differently than SCSI. LSI SAS controllers remember devices in the order they were discovered by the controller. This memory is persistent across power cycles. It is based on the world wide name (WWN) given uniquely to every SAS device. This allows your boot device to remain your boot device no matter where it migrates in the SAS topology. In order to clear the memory of existing devices you need at least one device that will not be present in your final configuration. Re-boot the machine and enter the LSI configuration utility (CTRL-C). Then find your way to SAS Topology. To see more options, press CTRL-M. Choose the option to clear all non-present device IDs. This clears the persistent memory of all devices not present at that time. Exchange the drives. The system will now remember the order it finds the drives after the next boot cycle.
end response

Firstly, yes, the LSI SAS HBAs do use persistent mapping, with a logical target id by default. This is where the HBA does the translation between the physical disk device's SAS address (which you'll see in prtconf -v as the devid) and an essentially arbitrary target number which gets passed up to the OS - in this case Solaris. The support person at LSI was correct about deleting all those mappings.

Yes, the controller is being smart and tracking the actual device rather than a particular bay/slot mapping. This isn't so bad, mostly. The effect for you is that you can't assume that the replaced device is going to have the same target number as the old one (in fact, I'd call that quite unlikely), so you'll have to see what the new device name is by checking your dmesg or iostat -En output.

Sure enough, I was able to physically reorder my drives so they were 0, 1, 2, 4, 5, 6 - so, apparently, the company that put my system together moved the drives around after they were initially scanned. But where is 3? (Answer below.)

Then I tried another test:

1. Make the first disk blink:
# dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
10+0 records in
10+0 records out

2. Pull disk '0' out and replace it with a brand new disk:
# dd if=/dev/dsk/c2t0d0p0 of=/dev/null count=10
dd: /dev/dsk/c2t0d0p0: open: No such file or directory

3. Scratch head and try again with '3' (I had previously cleared the LSI controller's memory):
# dd if=/dev/dsk/c2t3d0p0 of=/dev/null count=10
10+0 records in
10+0 records out

So, it seems my SAS controller is being too smart for its own good - it tracks the drives themselves, not the drive bays. If I hot-swap a brand new drive into a bay, Solaris will see it as a new disk, not a replacement for the old disk. How can ZFS support this? I asked LSI tech support again and got:

start quote
I don't have the knowledge to answer that, so I'll just
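Kent's clipboard idea can at least be a file rather than paper: record which WWN lives in which bay once, then diff that map against the device identifiers currently visible (per James's earlier note, prtconf -v shows the devid). A sketch with invented WWNs and bay labels:

```python
def diff_bay_map(saved, visible_wwns):
    """Compare a saved bay->WWN map against the WWNs visible right now.

    Returns (missing, unknown): bays whose recorded drive is gone
    (candidate failed/replaced slots), and visible WWNs not recorded
    in any bay (newly inserted drives).
    """
    visible = set(visible_wwns)
    missing = {bay: wwn for bay, wwn in saved.items() if wwn not in visible}
    unknown = sorted(visible - set(saved.values()))
    return missing, unknown

# Invented example: the drive recorded in bay0 was swapped for a new one.
saved = {"bay0": "5000c500a0000001", "bay1": "5000c500a0000002"}
missing, unknown = diff_bay_map(saved, ["5000c500a0000002", "5000c500a0000003"])
print(missing, unknown)
```

The bay labels only mean something because you wrote them down at install time - which is exactly the manual step enclosure services would otherwise do for you.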
Re: [zfs-discuss] Odd prioritisation issues.
On Wed, Dec 12, 2007 at 10:27:56 +0100, Roch - PAE wrote:
: O_DSYNC was a good idea. Then if you have recent Nevada you can use the separate intent log (log keyword in zpool create) to absorb those writes without having spindle competition with the reads. Your write workload should then be well handled here (unless the incoming network processing is itself delayed).

Thanks for the suggestion -- I'll see if we can give that a go.

-- Dickon Hood
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
Hi Paul,

Already in my LSI Configuration Utility I have an option to clear the persistent mapping for drives not present, but then the card resumes its normal persistent-mapping logic. What I really want is to disable the persistent-mapping logic completely - is `lsiutil` doing that for you?

Thanks, Kent

Paul Jochum wrote:

Hi Kent: I have run into the same problem before, and have worked with LSI and Sun support to fix it. LSI calls this persistent drive mapping, and here is how to clear it:

1) Obtain the latest version of the program lsiutil from LSI. They don't seem to have the Solaris versions on their website, but I got it by email when entering a ticket into their support system. I know that they have a version for Solaris x86 (and I believe a SPARC version also). The version I currently have is: LSI Logic MPT Configuration Utility, Version 1.52, September 7, 2007.

2) Execute the lsiutil program on your target box:
a) First it will ask you to select which card to use (I have multiple cards in my machine; I don't know if it will ask if you only have 1 card in your box).
b) Then you need to select option 15 (it is a hidden option, not shown on the menu).
c) Then you select option 10 (clear all persistent mappings).
d) Then option 0 multiple times to get out of the program.
e) I normally then reboot the box, and the next time it comes up, the drives are back in order.
f) Or (instead of rebooting) option 99, to reset the chip (causes new mappings to be established), then option 8 (to verify lower target IDs), then devfsadm. After devfsadm completes, lsiutil option 42 should display valid device names (in /dev/rdsk), and format should find the devices so that you can label them.

Hope this helps. I happened to need it last night again (I normally have to run it after re-imaging a box, assuming that I don't want to save the data that was on those drives).

Paul Jochum
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
Hi Kent:

I have run into the same problem before, and have worked with LSI and Sun support to fix it. LSI calls this persistent drive mapping, and here is how to clear it:

1) Obtain the latest version of the program lsiutil from LSI. They don't seem to have the Solaris versions on their website, but I got it by email when entering a ticket into their support system. I know that they have a version for Solaris x86 (and I believe a SPARC version also). The version I currently have is: LSI Logic MPT Configuration Utility, Version 1.52, September 7, 2007.

2) Execute the lsiutil program on your target box:
a) First it will ask you to select which card to use (I have multiple cards in my machine; I don't know if it will ask if you only have 1 card in your box).
b) Then you need to select option 15 (it is a hidden option, not shown on the menu).
c) Then you select option 10 (clear all persistent mappings).
d) Then option 0 multiple times to get out of the program.
e) I normally then reboot the box, and the next time it comes up, the drives are back in order.
f) Or (instead of rebooting) option 99, to reset the chip (causes new mappings to be established), then option 8 (to verify lower target IDs), then devfsadm. After devfsadm completes, lsiutil option 42 should display valid device names (in /dev/rdsk), and format should find the devices so that you can label them.

Hope this helps. I happened to need it last night again (I normally have to run it after re-imaging a box, assuming that I don't want to save the data that was on those drives).

Paul Jochum
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
Hi Kent:

What lsiutil does for me is clear the persistent mapping for all of the drives on a card. I don't know of a way to disable the mapping completely (but that does sound like a nice option). Since Sun is reselling this card now (that is how I got my cards), I wonder if they can put in a request to LSI to provide this enhancement?

Paul

Kent Watsen wrote:

Hi Paul, Already in my LSI Configuration Utility I have an option to clear the persistent mapping for drives not present, but then the card resumes its normal persistent-mapping logic. What I really want is to disable the persistent-mapping logic completely - is `lsiutil` doing that for you? Thanks, Kent

Paul Jochum wrote:

Hi Kent: I have run into the same problem before, and have worked with LSI and Sun support to fix it. LSI calls this persistent drive mapping, and here is how to clear it:

1) Obtain the latest version of the program lsiutil from LSI. They don't seem to have the Solaris versions on their website, but I got it by email when entering a ticket into their support system. I know that they have a version for Solaris x86 (and I believe a SPARC version also). The version I currently have is: LSI Logic MPT Configuration Utility, Version 1.52, September 7, 2007.

2) Execute the lsiutil program on your target box:
a) First it will ask you to select which card to use.
b) Then you need to select option 15 (it is a hidden option, not shown on the menu).
c) Then you select option 10 (clear all persistent mappings).
d) Then option 0 multiple times to get out of the program.
e) I normally then reboot the box, and the next time it comes up, the drives are back in order.
f) Or (instead of rebooting) option 99, to reset the chip (causes new mappings to be established), then option 8 (to verify lower target IDs), then devfsadm. After devfsadm completes, lsiutil option 42 should display valid device names (in /dev/rdsk), and format should find the devices so that you can label them.

Hope this helps. I happened to need it last night again (I normally have to run it after re-imaging a box, assuming that I don't want to save the data that was on those drives).

Paul Jochum
Re: [zfs-discuss] Yager on ZFS
Hello can, Tuesday, December 11, 2007, 6:57:43 PM, you wrote: Monday, December 10, 2007, 3:35:27 AM, you wrote: cyg and it made them slower cyg That's the second time you've claimed that, so you'll really at cyg least have to describe *how* you measured this even if the cyg detailed results of those measurements may be lost in the mists of time. cyg So far you don't really have much of a position to defend at cyg all: rather, you sound like a lot of the disgruntled TOPS users cyg of that era. Not that they didn't have good reasons to feel cyg disgruntled - but they frequently weren't very careful about aiming their ire accurately. cyg Given that RMS really was *capable* of coming very close to the cyg performance capabilities of the underlying hardware, your cyg allegations just don't ring true. Not being able to jump into And where is your proof that it was capable of coming very close to the...? cyg It's simple: I *know* it, because I worked *with*, and *on*, it cyg - for many years. So when some bozo who worked with people with cyg a major known chip on their shoulder over two decades ago comes cyg along and knocks its capabilities, asking for specifics (not even cyg hard evidence, just specific allegations which could be evaluated cyg and if appropriate confronted) is hardly unreasonable. Bill, you openly criticize people (their work) who have worked on ZFS for years... not that there's anything wrong with that, just please realize that because you were working on it it doesn't mean it is/was perfect - just the same as with ZFS. Of course it doesn't - and I never claimed that RMS was anything close to 'perfect' (I even gave specific examples of areas in which it was *far* from perfect). Just as I've given specific examples of where ZFS is far from perfect. 
What I challenged was David's assertion that RMS was severely deficient in its *capabilities* - and demanded not 'proof' of any kind but only specific examples (comparable in specificity to the examples of ZFS's deficiencies that *I* have provided) that could actually be discussed.

I know, everyone loves their baby...

No, you don't know: you just assume that everyone is as biased as you and others here seem to be.

Nevertheless just because you were working on and with it, it's not a proof. The person you were replying to was also working with it (but not on it, I guess). Not that I'm interested in such a proof. Just noticed that you're demanding some proof, while you yourself just write statements about its performance without any actual proof.

You really ought to spend a lot more time understanding what you've read before responding to it, Robert. I *never* asked for anything like 'proof': I asked for *examples* specific enough to address - and repeated that explicitly in responding to your previous demand for 'proof'. Perhaps I should at that time have observed that your demand for 'proof' (your use of quotes suggesting that it was something that *I* had demanded) was ridiculous, but I thought my response made that obvious.

Let me use your own words: In other words, you've got nothing, but you'd like people to believe it's something. The phrase Put up or shut up comes to mind. Where are your proofs on some of your claims about ZFS?

cyg Well, aside from the fact that anyone with even half a clue cyg knows what the effects of uncontrolled file fragmentation are on cyg sequential access performance (and can even estimate those cyg effects within moderately small error bounds if they know what cyg the disk characteristics are and how bad the fragmentation is), cyg if you're looking for additional evidence that even someone cyg otherwise totally ignorant could appreciate there's the fact that

I've never said there are not fragmentation problems with ZFS.
Not having made a study of your collected ZFS contributions here I didn't know that. But some of ZFS's developers are on record stating that they believe there is no need to defragment (unless they've changed their views since and not bothered to make us aware of it), and in the entire discussion in the recent 'ZFS + DB + fragments' thread there were only three contributors (Roch, Anton, and I) who seemed willing to admit that any problem existed. So since one of my 'claims' for which you requested substantiation involved fragmentation problems, it seemed appropriate to address them. Well, actually I've been hit by the issue in one environment. But didn't feel any impulse to mention that during all the preceding discussion, I guess. Also, you haven't done your homework properly, as one of the ZFS developers actually stated they are going to work on ZFS de-fragmentation and disk removal (pool shrinking). See http://www.opensolaris.org/jive/thread.jspa?messageID=139680 Hmmm - there were at least two Sun ZFS personnel participating in the database thread, and they never mentioned
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
James C. McPherson wrote: Now here's where things get murky. At this point in time at least (it may change!) Solaris' mpt driver uses LSI's logical target id mapping method. This is *NOT* an enclosure/slot naming method - at least, not from the OS' point of view. Additionally, unless you're using an actual real SCSI Enclosure Services (ses) device, there's no enclosure to provide enclosure/slot mapping with either. Since mpt uses logical target id, therefore the target id which Solaris sees _will definitely change_ if you swap a disk. JMCP, For Sun systems, we have 3 LEDs on the drives: 1. Ready to remove (blue) 2. Service required (amber) 3. OK/Activity (green) So there must be a way to set the ready to remove LED from Solaris. In the old days, we could use luxadm(1m). Does that still work, or is there some new equivalent? -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] LSI SAS3081E = unstable drive numbers?
On Wed, Dec 12, 2007 at 09:22:07AM -0800, Richard Elling wrote: For Sun systems, we have 3 LEDs on the drives: 1. Ready to remove (blue) 2. Service required (amber) 3. OK/Activity (green) So there must be a way to set the ready to remove LED from Solaris. In the old days, we could use luxadm(1m). Does that still work, or is there some new equivalent? For x86 systems, you can use ipmitool to manipulate the LED state (ipmitool sunoem led ...). On older Galaxy systems, you can only set the fail LED ('io.hdd0.led'), as the ok2rm LED is not physically connected to anything. On newer systems, you can set both the 'fail' and 'ok2rm' LEDs. You cannot change the activity LED except by manually sending the 'set sensor reading' IPMI command (not available via ipmitool). For external enclosures, you'll need an SES control program. Both of these problems are being worked on under the FMA sensor framework to create a unified view through libtopo. Until that's complete, you'll be stuck using ad hoc methods. - Eric -- Eric Schrock, FishWorks http://blogs.sun.com/eschrock
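Eric's ipmitool approach can be wrapped in a small script. This is only a sketch: the exact 'sunoem led set' argument form and the 'io.hdd0.led' target name are platform-dependent assumptions, so check the output of 'ipmitool sunoem led get all' on your own machine before relying on it.

```shell
#!/bin/sh
# Sketch only: drive the disk-fault LED through the service processor.
# The 'sunoem led set' argument form and the 'io.hdd0.led' name are
# assumptions -- verify against your platform's ipmitool first.

set_disk_led() {
    led_name=$1 state=$2
    if ! command -v ipmitool >/dev/null 2>&1; then
        echo "ipmitool not available on this host"
        return 1
    fi
    ipmitool sunoem led set "$led_name" "$state"
}

# e.g. light the fail LED for the first disk before pulling it
set_disk_led io.hdd0.led on || true
```

Until the libtopo work Eric mentions lands, something like this is as good as the ad hoc methods get.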
[zfs-discuss] reset a disk?
I can't find how to do this... I used a disk for a ZFS pool and now I want to use it for normal UFS stuff. But my partition table now looks like:

Part         Tag   Flag   First Sector      Size   Last Sector
  0          usr    wm              34   33.91GB      71116541
  1   unassigned    wm               0         0             0
  2   unassigned    wm               0         0             0
  3   unassigned    wm               0         0             0
  4   unassigned    wm               0         0             0
  5   unassigned    wm               0         0             0
  6   unassigned    wm               0         0             0
  8     reserved    wm        71116542    8.00MB      71132925

I'm assuming zpool did this when I used the disk device c1t1d0 instead of the c1t1d0s2 partition? How do I release or reset this disk so I get the normal 0-7 partitions? I've already destroyed the pool. -Doug
Re: [zfs-discuss] reset a disk?
Hi Doug,

ZFS uses an EFI label, so you need to use format -e to set the disk back to a VTOC (SMI) label, like this:

# format -e
Specify disk (enter its number)[4]: 3
selecting c0t4d0
[disk formatted]
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all current partitions. Continue?

Cindy

Doug Schwabauer wrote: I can't find how to do this... I used a disk for a ZFS pool and now I want to use it for normal UFS stuff. But my partition table now looks like:

Part         Tag   Flag   First Sector      Size   Last Sector
  0          usr    wm              34   33.91GB      71116541
  1   unassigned    wm               0         0             0
  2   unassigned    wm               0         0             0
  3   unassigned    wm               0         0             0
  4   unassigned    wm               0         0             0
  5   unassigned    wm               0         0             0
  6   unassigned    wm               0         0             0
  8     reserved    wm        71116542    8.00MB      71132925

I'm assuming zpool did this when I used the disk device c1t1d0 instead of the c1t1d0s2 partition? How do I release or reset this disk so I get the normal 0-7 partitions? I've already destroyed the pool. -Doug
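Once the relabel above is done, you can sanity-check the result. A minimal sketch, assuming the poster's device name c1t1d0 (adjust to your own disk): under an SMI label, prtvtoc prints the traditional 0-7 slice table again.

```shell
#!/bin/sh
# Sketch: confirm a disk carries an SMI (VTOC) label after 'format -e'.
# The device name below is the one from the original post -- an assumption.

check_label() {
    disk=$1
    if [ ! -e "$disk" ]; then
        echo "device $disk not present on this system"
        return 1
    fi
    # under an SMI label prtvtoc lists the familiar 0-7 slice table
    prtvtoc "$disk"
}

check_label /dev/rdsk/c1t1d0s2 || true
```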
Re: [zfs-discuss] Yager on ZFS
Hello can, I haven't been wasting so much time as in this thread... but from time to time it won't hurt :) More below :) Wednesday, December 12, 2007, 4:46:42 PM, you wrote: Hello Bill, I know, everyone loves their baby... cyg No, you don't know: you just assume that everyone is as biased cyg as you and others here seem to be. Which in turn is just your assumption :) I've never said there are not fragmentation problems with ZFS. cyg Not having made a study of your collected ZFS contributions here cyg I didn't know that. But some of ZFS's developers are on record cyg stating that they believe there is no need to defragment (unless cyg they've changed their views since and not bothered to make us cyg aware of it), and in the entire discussion in the recent 'ZFS + cyg DB + fragments' thread there were only three contributors cyg (Roch, Anton, and I) who seemed willing to admit that any problem existed. Which ZFS developer said that there's no need to defragment in ZFS? cyg So since one of my 'claims' for which you requested cyg substantiation involved fragmentation problems, it seemed appropriate to address them. I would say that right now there are other, more important things to be done in ZFS than addressing fragmentation. While in one environment it looks like lowering fragmentation would help with some issues, in all the other environments I haven't run into a fragmentation problem. Also, you haven't done your homework properly, as one of the ZFS developers actually stated they are going to work on ZFS de-fragmentation and disk removal (pool shrinking). See http://www.opensolaris.org/jive/thread.jspa?messageID=139680 cyg Hmmm - there were at least two Sun ZFS personnel participating cyg in the database thread, and they never mentioned this. I guess cyg they didn't do their 'homework' properly either (and unlike me they're paid to do it). Maybe they don't know? Different project, different group? 
My understanding (I might be wrong) is that what they are actually working on is disk removal from a pool (which looks like it is much more requested by people than fixing the fragmentation 'problem'). In order to accomplish it you need a mechanism to re-arrange data in a pool, which as a side effect could also be used as a defragmentation tool. That doesn't mean the pool won't fragment again in the future - if that's a real problem in a given environment. The point is, and you as a long-time developer (I guess) should know it, you can't have everything done at once (lack of resources, and it takes some time anyway), so you must prioritize. cyg The issues here are not issues of prioritization but issues of cyg denial. Your citation above is the first suggestion that I've cyg seen (and by all appearances the first that anyone else cyg participating in these discussions has seen) that the ZFS crew cyg considers the fragmentation issue important enough to merit active attention in the future. Jeeez... now you need some kind of acknowledgment from the ZFS developers every time you think you've found something? Are you paying their bills or what? While it's fine to talk about theoretical/hypothetical problems, I'm not entirely sure this is a good place to do it. On the other hand you can very often find ZFS developers responding on this list (and not only here) to actual user problems. Another problem, I guess, could be that they have already spent a lot of their time on projects they have to deliver - do you really expect them to spend still more time analyzing some loose statements of yours? Come on, they also have their private lives and other things to do. Ignoring their customers/users would be unwise; responding to everyone with every problem, especially one that is not a real user-experience problem, would be just impractical. Then there is your attitude - you know, there's a very good reason why interviewers check whether you can actually work with other people in a group. 
You're a very good example of why. Don't expect people to take you seriously if you behave the way you do. As you put it before - you get what you deserve. You probably got even more attention here than you deserved. I guess you are another good engineer, quite skillful, unfortunately unable to work in a team, and definitely not with customers. I would say some people here recognized that in you and did their best to treat you seriously and actually hear you out - it's just that everyone has his limits. Looking through your posts here, you can find lots of words and some technical input, but not much actual value - at first it could be entertaining, even intriguing, but it quickly becomes irritating. Bill, you could be the best engineer in the world, but if you can't communicate it you'll be the only person who recognizes it. Or perhaps some people here (not only here) are right and for whatever reason you are just trolling. cyg Do you by any chance have any similar hint of recognition that cyg RAID-Z might benefit from
[zfs-discuss] 6604198 - single thread for compression
Hello zfs-discuss, http://sunsolve.sun.com/search/document.do?assetkey=1-1-6604198-1 Is there a patch for S10? I thought it had been fixed. -- Best regards, Robert mailto:[EMAIL PROTECTED] http://milek.blogspot.com
Re: [zfs-discuss] 6604198 - single thread for compression
On Dec 12, 2007, at 3:03 PM, Robert Milkowski wrote: Hello zfs-discuss, http://sunsolve.sun.com/search/document.do?assetkey=1-1-6604198-1 Is there a patch for S10? I thought it's been fixed. It was fixed via 6460622 "zio_nowait() doesn't live up to its name", and that fix is in s10u4. Then 6437054 "vdev_cache wises up: increase DB performance by 16%" accidentally re-introduced it; that was putback in snv_70. As you've noted, 6604198 "zfs only using single cpu for compression (part II)" fixed it again, and that's available in snv_79. Neither 6437054 nor 6604198 has been backported to an s10 update yet. Apologies for that, eric
Re: [zfs-discuss] Yager on ZFS
... Bill - I don't think there's a point in continuing that discussion. I think you've finally found something upon which we can agree. I still haven't figured out exactly where on the stupid/intellectually dishonest spectrum you fall (lazy is probably out: you have put some effort in to responding), but it is clear that you're hopeless. On the other hand, there's always the possibility that someone else learned something useful out of this. And my question about just how committed you were to your ignorance has been answered. It's difficult to imagine how someone so incompetent in the specific area that he's debating can be so self-assured - I suspect that just not listening has a lot to do with it - but also kind of interesting to see that in action. - bill This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Yager on ZFS
Look, it's obvious this guy talks about himself as if he is the person he is addressing. Please stop taking this personally and feeding the troll. can you guess? wrote: Bill - I don't think there's a point in continuing that discussion. I think you've finally found something upon which we can agree. I still haven't figured out exactly where on the stupid/intellectually dishonest spectrum you fall (lazy is probably out: you have put some effort in to responding), but it is clear that you're hopeless. On the other hand, there's always the possibility that someone else learned something useful out of this. And my question about just how committed you were to your ignorance has been answered. It's difficult to imagine how someone so incompetent in the specific area that he's debating can be so self-assured - I suspect that just not listening has a lot to do with it - but also kind of interesting to see that in action. - bill ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Nice chassis for ZFS server
On November 29, 2007 5:56:04 AM -0800 MP [EMAIL PROTECTED] wrote: Intel show a configuration of this chassis in the Hardware Technical Specification: http://download.intel.com/support/motherboards/server/ssr212mc2/sb/ssr212 mc2_tps_12.pdf without the RAID controller. I assume that then the 4xSAS ports on the Blackford chipset are then used, rather than the 4xSAS on the RAID card. As Blackford is supported in Opensolaris, then this configuration would be the one to choose? Makes no difference. The host running Solaris, OpenSolaris or whatever talks SAS to the enclosure. The chipset used by the enclosure doesn't make any difference to the host OS (bug workarounds excepted). -frank ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Finding external USB disks
What are the approaches to finding what external USB disks are currently connected? I'm starting on backup scripts, and I need to check which volumes are present before I figure out what to back up to them. I suppose I could just try all the ones that I know about and see which are there (the list is small enough this is actually feasible), but it's inelegant. (On Solaris Nevada, currently build 76 I think). The external USB backup disks in question have ZFS filesystems on them, which may make a difference in finding them perhaps? I've glanced at Tim Foster's autobackup and related scripts, and they're all about being triggered by the plug connection being made; which is not what I need. I don't actually want to start the big backup when I plug in (or power on) the drive in the evening, it's supposed to wait until late (to avoid competition with users). (His autosnapshot script may be just what I need for that part, though.) -- David Dyer-Bennet, [EMAIL PROTECTED]; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
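For the polling approach described above (check at backup time rather than react to hotplug events), one hedged sketch: keep a list of the pool names you know live on the USB disks and test each against `zpool list`. The pool names below are placeholders for whatever your backup pools are actually called.

```shell
#!/bin/sh
# Sketch: at backup time, see which of the known USB backup pools are
# currently imported. 'zpool list -H -o name' prints one imported pool
# name per line; the pool names here are placeholders.
KNOWN_POOLS="usbbackup1 usbbackup2"

pool_present() {
    # succeeds if the named pool appears in the imported-pool list
    zpool list -H -o name 2>/dev/null | grep -x "$1" >/dev/null 2>&1
}

for p in $KNOWN_POOLS; do
    if pool_present "$p"; then
        echo "$p is attached -- usable as tonight's backup target"
    fi
done
```

This is essentially the "try all the ones I know about" idea, just made mechanical; a disk that is plugged in but whose pool is not yet imported would additionally need a `zpool import` pass first.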
[zfs-discuss] How can I install the CIFS service to a Core Group Solaris 10 install?
I'm trying to build a simple Solaris 10 file server using ZFS + CIFS... That means I don't need X Windows or anything like that, etc. During the Solaris 10 installation I chose the Core Group for the installation so that it doesn't install all of the extra software associated with the other installation groups. How do I go about installing the CIFS service? I've read the page here: http://opensolaris.org/os/project/cifs-server/gettingstarted.html ...and when I run svcadm enable -r smb/server it gives me an error about the service not being found or something like that... (sorry I'm not at the server) How do I go about installing just the core CIFS service/software so I can share my ZFS file systems via CIFS? Thanks, John This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
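The usual failure mode here is that the Core group simply lacks the CIFS packages, so the smb/server manifest never gets installed. A hedged sketch of the recovery steps follows; the package names SUNWsmbs and SUNWsmbskr are the customary CIFS server packages on OpenSolaris-era media, but verify them against your own media, and note that the CIFS server from the page linked above shipped with OpenSolaris builds rather than stock Solaris 10.

```shell
#!/bin/sh
# Hedged sketch: add the CIFS packages on a minimal install, then enable
# the service. The package names and the media path are assumptions --
# check your install media before running anything like this.

install_cifs() {
    media=$1
    if ! command -v svcadm >/dev/null 2>&1; then
        echo "not an SMF-based system; skipping"
        return 1
    fi
    # add the packages only if they are not already present
    pkginfo SUNWsmbs SUNWsmbskr >/dev/null 2>&1 || \
        pkgadd -d "$media" SUNWsmbs SUNWsmbskr
    # -r also enables the services smb/server depends on
    svcadm enable -r smb/server
}

install_cifs /path/to/media/Product || true
```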
Re: [zfs-discuss] How can I install the CIFS service to a Core Group Solaris 10 install?
John Klimek wrote: I'm trying to build a simple Solaris 10 file server using ZFS + CIFS... That means I don't need X Windows or anything like that, etc. Answered elsewhere. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss