Re: [zfs-discuss] Performance issues with iSCSI under Linux
Ian,

It would help to have some config detail (e.g. what options are you using? zpool status output; property lists for specific filesystems and zvols; etc). Some basic Solaris stats can be very helpful too (e.g. peak-flow samples of vmstat 1, mpstat 1, iostat -xnz 1, etc). It would also be great to know how you are running your tests. I'd also like to know which version of NFS and which mount options you use. A network trace down to the NFS RPC or iSCSI operation level, with timings, would be great too.

I'm wondering whether your HBA has a write-through or write-back cache enabled? The latter might make things very fast, but could put data at risk if not sufficiently non-volatile.

Cheers, Phil

On 14 Oct 2010, at 22:02, Ian D wrote:
>> Our next test is to try with a different kind of HBA, we have a Dell H800 lying around.
>
> ok... we're making progress. After swapping the LSI HBA for a Dell H800 the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to make individual RAID0 volumes for each disk, which means we need to reboot the server every time we replace a failed drive. That's not good...
>
> What can we do with the LSI HBA? Would you call LSI's support? Is there anything we should try besides the obvious (using the latest firmware/driver)?
>
> To summarize the issue: when we copy files from/to the JBODs connected to that HBA using NFS/iSCSI, we get a slow transfer rate (<20 MB/s) and a 1-2 second pause between each file. When we do the same experiment locally, using the external drives as a local volume (no NFS/iSCSI involved), it goes upward of 350 MB/s with no delay between files.
>
> Ian
>
> Message was edited by: reward72
> --
> This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
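The checklist Phil asks for can be wrapped in a small script so the numbers get captured in one go. A rough sketch - the tool names are the Solaris ones from his list, and the `command -v` guards are an assumption so it degrades gracefully on a box where a tool is missing:

```shell
# Collect the basic stats requested above into one timestamped file.
# Tool list (zpool/vmstat/mpstat/iostat) is Solaris-flavoured; adjust to taste.
OUT=diag-$(date +%Y%m%d-%H%M%S).txt
{
  echo "== zpool status =="
  command -v zpool >/dev/null 2>&1 && zpool status || echo "(zpool not available)"
  for tool in "vmstat 1 2" "mpstat 1 2" "iostat -xnz 1 2"; do
    echo "== $tool =="
    # run each sampler briefly; note the failure instead of aborting
    command -v "${tool%% *}" >/dev/null 2>&1 && $tool || echo "(${tool%% *} failed or not available)"
  done
} > "$OUT"
echo "stats saved to $OUT"
```

Running it during a slow transfer and again during a fast local copy gives two directly comparable samples.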
Re: [zfs-discuss] adding new disks and setting up a raidz2
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Derek G Nokes
>
> r...@dnokes.homeip.net:~# zpool create marketData raidz2 c0t5000C5001A6B9C5Ed0 c0t5000C5001A81E100d0 c0t5000C500268C0576d0 c0t5000C500268C5414d0 c0t5000C500268CFA6Bd0 c0t5000C500268D0821d0
> cannot label 'c0t5000C500268CFA6Bd0': try using fdisk(1M) and then provide a specific slice
>
> Any idea what this means?

I think it means there is something pre-existing on that drive. Maybe ZFS related, maybe not. You should probably double-check everything to make sure there's no valuable data on that device...

And then either zero the drive the long way via dd, or use your raid controller to "initialize" the device, which will virtually zero it the short way. In some cases you have no choice, and you need to do it the long way:

time dd if=/dev/zero of=/dev/rdsk/c0t5000C500268CFA6Bd0 bs=1024k

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
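A full-disk dd like the one above can take hours on a 1 TB drive. Often it is enough to zero just the regions where old labels live - ZFS keeps four 256 KB labels, two at the front and two at the back of the device. A sketch of that shortcut, demonstrated against a scratch file instead of a real /dev/rdsk path (DEV and the 16 MB size are placeholders for the demo):

```shell
# Simulate a 16 MB "disk" so the technique can be shown safely; on a real
# system DEV would be the raw device, e.g. /dev/rdsk/c0t5000C500268CFA6Bd0.
DEV=scratch.img
dd if=/dev/urandom of="$DEV" bs=1M count=16 2>/dev/null

SIZE=$(wc -c < "$DEV")
MB=$((1024 * 1024))
# Zero the first and last megabyte -- enough to cover ZFS's four 256 KB
# labels (two front, two back) plus most partition tables.
dd if=/dev/zero of="$DEV" bs=1M count=1 conv=notrunc 2>/dev/null
dd if=/dev/zero of="$DEV" bs=1M count=1 seek=$((SIZE / MB - 1)) conv=notrunc 2>/dev/null
# Verify: count of non-zero bytes in the first megabyte should be 0.
head -c "$MB" "$DEV" | tr -d '\0' | wc -c
```

This is only safe when you are certain nothing else on the disk matters; when in doubt, the full-length dd remains the conservative choice.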
Re: [zfs-discuss] Performance issues with iSCSI under Linux [SEC=UNCLASSIFIED]
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Wilkinson, Alex
>
> can you paste them anyway ?

Note: If you have more than one adapter, I believe you can specify -aALL in the commands below, instead of -a0

I have 2 disks (slots 4 & 5) that are removable and rotate offsite for backups.

To remove disks safely:
  zpool export removable-pool
  export EnclosureID=`MegaCli -PDList -a0 | grep 'Enclosure Device ID' | uniq | sed 's/.* //'`
  for DriveNum in 4 5 ; do MegaCli -PDOffline PhysDrv[${EnclosureID}:${DriveNum}] -a0 ; done
Disks blink alternate orange & green. Safe to remove.

To insert disks safely:
  Insert disks.
  MegaCli -CfgForeign -Clear -a0
  MegaCli -CfgEachDskRaid0 -a0
  devfsadm -Cv
  zpool import -a

To clear foreign config off drives:
  MegaCli -CfgForeign -Clear -a0

To create a one-disk raid0 for each disk that's not currently part of another group:
  MegaCli -CfgEachDskRaid0 -a0

To configure all drives WriteThrough:
  MegaCli -LDSetProp WT Lall -aALL

To configure all drives WriteBack:
  MegaCli -LDSetProp WB Lall -aALL

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
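The backtick expression above is a small parse of MegaCli's -PDList output; it can be sanity-checked without the hardware by feeding it a captured sample (the sample below is a hypothetical stand-in for real output):

```shell
# Hypothetical fragment of 'MegaCli -PDList -a0' output; real output has
# many more fields, but only the 'Enclosure Device ID' lines matter here.
sample='Enclosure Device ID: 32
Slot Number: 4
Enclosure Device ID: 32
Slot Number: 5'

# Same grep|uniq|sed pipeline as in the message above.
EnclosureID=$(printf '%s\n' "$sample" | grep 'Enclosure Device ID' | uniq | sed 's/.* //')
echo "$EnclosureID"
```

Note that `uniq` only collapses the IDs because every matching line is identical (and adjacent after the grep); with drives spread across two enclosures you would get two IDs back and need to handle them separately.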
Re: [zfs-discuss] adding new disks and setting up a raidz2
Thank you both. I did try without specifying the 's0' portion before posting and got the following error:

r...@dnokes.homeip.net:~# zpool create marketData raidz2 c0t5000C5001A6B9C5Ed0 c0t5000C5001A81E100d0 c0t5000C500268C0576d0 c0t5000C500268C5414d0 c0t5000C500268CFA6Bd0 c0t5000C500268D0821d0
cannot label 'c0t5000C500268CFA6Bd0': try using fdisk(1M) and then provide a specific slice

Any idea what this means? Thanks again.
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux [SEC=UNCLASSIFIED]
On Thu, Oct 14, 2010 at 09:54:09PM -0400, Edward Ned Harvey wrote:
> If you happen to find that MegaCLI is the right tool for your hardware, let me know, and I'll paste a few commands here, which will simplify your life. When I first started using it, I found it terribly cumbersome. But now I've gotten used to it, and MegaCLI commands just roll off the tongue.

can you paste them anyway ?

-Alex

IMPORTANT: This email remains the property of the Department of Defence and is subject to the jurisdiction of section 70 of the Crimes Act 1914. If you have received this email in error, you are requested to contact the sender and delete the email.
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian D
>
> ok... we're making progress. After swapping the LSI HBA for a Dell H800 the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to make individual RAID0 volumes for each disk, which means we need to reboot the server every time we replace a failed drive. That's not good...

I believe those are rebranded LSI controllers. I know the PERC controllers are. I use MegaCLI on PERC systems for this purpose. You should be able to find a utility which allows you to do this sort of thing while the OS is running.

If you happen to find that MegaCLI is the right tool for your hardware, let me know, and I'll paste a few commands here, which will simplify your life. When I first started using it, I found it terribly cumbersome. But now I've gotten used to it, and MegaCLI commands just roll off the tongue.

> To summarize the issue: when we copy files from/to the JBODs connected to that HBA using NFS/iSCSI, we get a slow transfer rate (<20 MB/s) and a 1-2 second pause between each file. When we do the same experiment locally, using the external drives as a local volume (no NFS/iSCSI involved), it goes upward of 350 MB/s with no delay between files.

Baffling.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] adding new disks and setting up a raidz2
Derek,

> I am relatively new to OpenSolaris / ZFS (have been using it for maybe 6 months). I recently added 6 new drives to one of my servers and I would like to create a new RAIDZ2 pool called 'marketData'.
>
> I figured the command to do this would be something like:
>
> zpool create marketData raidz2 c0t5000C5001A6B9C5Ed0s0 c0t5000C5001A81E100d0s0 c0t5000C500268C0576d0s0 c0t5000C500268C5414d0s0 c0t5000C500268CFA6Bd0s0 c0t5000C500268D0821d0s0

I would assume that one or more of these devices is not formatted or does not have a valid disk label - a requirement when specifying the 's0' portion of the Solaris device name. If 100% of all blocks on each of these disks are to be included within the marketData storage pool, then ZFS is very capable of formatting and/or labeling the underlying devices for you, as follows:

zpool create marketData raidz2 c0t5000C5001A6B9C5Ed0 c0t5000C5001A81E100d0 c0t5000C500268C0576d0 c0t5000C500268C5414d0 c0t5000C500268CFA6Bd0 c0t5000C500268D0821d0

- Jim

> Unfortunately I get an error:
>
> [b]cannot open '/dev/dsk/c0t5000C500268CFA6Bd0s0': I/O error[/b]
>
> Can anyone give me some clues as to what is wrong?
>
> I have included the zpool status and format output from my system
>
> [b]ZPOOL STATUS[/b]
> r...@dnokes.homeip.net:~# zpool status
>   pool: rpool
>  state: ONLINE
> status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions.
>  scrub: none requested
> config:
>
>         NAME                         STATE     READ WRITE CKSUM
>         rpool                        ONLINE       0     0     0
>           mirror-0                   ONLINE       0     0     0
>             c0t5000C50019F5B5BAd0s0  ONLINE       0     0     0
>             c0t5000C50019E0EB23d0s0  ONLINE       0     0     0
>
> errors: No known data errors
>
> [b]FORMAT[/b]
> r...@dnokes.homeip.net:~# format
> Searching for disks...done
>
> AVAILABLE DISK SELECTIONS:
>        0. c0t5000C5001A6B9C5Ed0
>           /scsi_vhci/d...@g5000c5001a6b9c5e
>        1. c0t5000C5001A81E100d0
>           /scsi_vhci/d...@g5000c5001a81e100
>        2. c0t5000C50019E0EB23d0
>           /scsi_vhci/d...@g5000c50019e0eb23
>        3. c0t5000C50019F5B5BAd0
>           /scsi_vhci/d...@g5000c50019f5b5ba
>        4. c0t5000C500268C0576d0
>           /scsi_vhci/d...@g5000c500268c0576
>        5. c0t5000C500268C5414d0
>           /scsi_vhci/d...@g5000c500268c5414
>        6. c0t5000C500268CFA6Bd0
>           /scsi_vhci/d...@g5000c500268cfa6b
>        7. c0t5000C500268D0821d0
>           /scsi_vhci/d...@g5000c500268d0821
>
> Thanks in advance.
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] adding new disks and setting up a raidz2
On Oct 14, 2010, at 5:08 PM, Derek G Nokes wrote: > I am relatively new to OpenSolaris / ZFS (have been using it for maybe 6 > months). I recently added 6 new drives to one of my servers and I would like > to create a new RAIDZ2 pool called 'marketData'. > > I figured the command to do this would be something like: > > zpool create marketData raidz2 c0t5000C5001A6B9C5Ed0s0 > c0t5000C5001A81E100d0s0 c0t5000C500268C0576d0s0 c0t5000C500268C5414d0s0 > c0t5000C500268CFA6Bd0s0 c0t5000C500268D0821d0s0 > > Unfortunately I get an error: > > [b]cannot open '/dev/dsk/c0t5000C500268CFA6Bd0s0': I/O error[/b] This can happen if slice0 is size 0. KISS would say to do: zpool create marketData raidz2 \ c0t5000C5001A6B9C5Ed0 \ c0t5000C5001A81E100d0 \ c0t5000C500268C0576d0 \ c0t5000C500268C5414d0 \ c0t5000C500268CFA6Bd0 c0t5000C500268D0821d0 This method will create slice0 (s0) as the full disk, on your behalf. -- richard > > Can anyone give me some clues as to what is wrong? > > I have included the zpool status and format output from my system > > > [b]ZPOOL STATUS[/b] > r...@dnokes.homeip.net:~# zpool status > pool: rpool > state: ONLINE > status: The pool is formatted using an older on-disk format. The pool can > still be used, but some features are unavailable. > action: Upgrade the pool using 'zpool upgrade'. Once this is done, the > pool will no longer be accessible on older software versions. > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > rpoolONLINE 0 0 0 > mirror-0 ONLINE 0 0 0 > c0t5000C50019F5B5BAd0s0 ONLINE 0 0 0 > c0t5000C50019E0EB23d0s0 ONLINE 0 0 0 > > errors: No known data errors > > [b]FORMAT[/b] > r...@dnokes.homeip.net:~# format > Searching for disks...done > > > AVAILABLE DISK SELECTIONS: > 0. c0t5000C5001A6B9C5Ed0 > /scsi_vhci/d...@g5000c5001a6b9c5e > 1. c0t5000C5001A81E100d0 > /scsi_vhci/d...@g5000c5001a81e100 > 2. c0t5000C50019E0EB23d0 > /scsi_vhci/d...@g5000c50019e0eb23 > 3. c0t5000C50019F5B5BAd0 > /scsi_vhci/d...@g5000c50019f5b5ba > 4. 
c0t5000C500268C0576d0 > /scsi_vhci/d...@g5000c500268c0576 > 5. c0t5000C500268C5414d0 > /scsi_vhci/d...@g5000c500268c5414 > 6. c0t5000C500268CFA6Bd0 > /scsi_vhci/d...@g5000c500268cfa6b > 7. c0t5000C500268D0821d0 > /scsi_vhci/d...@g5000c500268d0821 > > Thanks in advance. > -- > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] adding new disks and setting up a raidz2
I am relatively new to OpenSolaris / ZFS (have been using it for maybe 6 months). I recently added 6 new drives to one of my servers and I would like to create a new RAIDZ2 pool called 'marketData'.

I figured the command to do this would be something like:

zpool create marketData raidz2 c0t5000C5001A6B9C5Ed0s0 c0t5000C5001A81E100d0s0 c0t5000C500268C0576d0s0 c0t5000C500268C5414d0s0 c0t5000C500268CFA6Bd0s0 c0t5000C500268D0821d0s0

Unfortunately I get an error:

[b]cannot open '/dev/dsk/c0t5000C500268CFA6Bd0s0': I/O error[/b]

Can anyone give me some clues as to what is wrong?

I have included the zpool status and format output from my system

[b]ZPOOL STATUS[/b]
r...@dnokes.homeip.net:~# zpool status
  pool: rpool
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME                         STATE     READ WRITE CKSUM
        rpool                        ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            c0t5000C50019F5B5BAd0s0  ONLINE       0     0     0
            c0t5000C50019E0EB23d0s0  ONLINE       0     0     0

errors: No known data errors

[b]FORMAT[/b]
r...@dnokes.homeip.net:~# format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c0t5000C5001A6B9C5Ed0
          /scsi_vhci/d...@g5000c5001a6b9c5e
       1. c0t5000C5001A81E100d0
          /scsi_vhci/d...@g5000c5001a81e100
       2. c0t5000C50019E0EB23d0
          /scsi_vhci/d...@g5000c50019e0eb23
       3. c0t5000C50019F5B5BAd0
          /scsi_vhci/d...@g5000c50019f5b5ba
       4. c0t5000C500268C0576d0
          /scsi_vhci/d...@g5000c500268c0576
       5. c0t5000C500268C5414d0
          /scsi_vhci/d...@g5000c500268c5414
       6. c0t5000C500268CFA6Bd0
          /scsi_vhci/d...@g5000c500268cfa6b
       7. c0t5000C500268D0821d0
          /scsi_vhci/d...@g5000c500268d0821

Thanks in advance.
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux
> Earlier you said you had eliminated the ZIL as an > issue, but one difference > between the Dell H800 and the LSI HBA is that the > H800 has an NV cache (if > you have the battery backup present). > > A very simple test would be when things are running > slow, try disabling > the ZIL temporarily, to see if that makes things go > fast. We'll try that, but keep in mind that we're having the issue even when we READ from the JBODs, not just during WRITES. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux
rewar...@hotmail.com said: > ok... we're making progress. After swapping the LSI HBA for a Dell H800 the > issue disappeared. Now, I'd rather not use those controllers because they > don't have a JBOD mode. We have no choice but to make individual RAID0 > volumes for each disks which means we need to reboot the server every time we > replace a failed drive. That's not good... Earlier you said you had eliminated the ZIL as an issue, but one difference between the Dell H800 and the LSI HBA is that the H800 has an NV cache (if you have the battery backup present). A very simple test would be when things are running slow, try disabling the ZIL temporarily, to see if that makes things go fast. Regards, Marion ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux
> Our next test is to try with a different kind of HBA, we have a Dell H800 lying around.

ok... we're making progress. After swapping the LSI HBA for a Dell H800 the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to make individual RAID0 volumes for each disk, which means we need to reboot the server every time we replace a failed drive. That's not good...

What can we do with the LSI HBA? Would you call LSI's support? Is there anything we should try besides the obvious (using the latest firmware/driver)?

To summarize the issue: when we copy files from/to the JBODs connected to that HBA using NFS/iSCSI, we get a slow transfer rate (<20 MB/s) and a 1-2 second pause between each file. When we do the same experiment locally, using the external drives as a local volume (no NFS/iSCSI involved), it goes upward of 350 MB/s with no delay between files.

Ian

Message was edited by: reward72
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Finding corrupted files
On 14-Oct-10, at 11:48 AM, Edward Ned Harvey wrote:

>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Toby Thain
>>
>>> I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it anymore.
>>
>> It is. And there's no reason not to point it out. The world has
>
> Well, neither one of the above statements is really fair. The truth is: raid5/6 are generally not that bad. Data integrity failures are not terribly common (maybe one bit per year out of 20 large disks or something like that.)

Such statistics assume that no part of the stack (drive, cable, network, controller, memory, etc) has any fault and is operating normally. This is, indeed, the base presumption of RAID (which also assumes a perfect error reporting chain).

> And in order to reach the conclusion "nobody would use it," the people using it would have to first *notice* the failure. Which they don't. That's kind of the point.

Indeed it is. And then we could talk about self healing (also missing from RAID).

--Toby

> Since I started using ZFS in production, about a year ago, on three servers totaling approx 1.5TB used, I have had precisely one checksum error, which ZFS corrected. I have every reason to believe, if that were on a raid5/6, the error would have gone undetected and nobody would have noticed.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux
I've had a few people sending emails directly suggesting it might have something to do with the ZIL/SLOG. I guess I should have said that the issue happen both ways, whether we copy TO or FROM the Nexenta box. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Performance issues with iSCSI under Linux
> Sounding more and more like a networking issue - are > the network cards set up in an aggregate? I had some > similar issues on GbE where there was a mismatch > between the aggregate settings on the switches and > the LACP settings on the server. Basically the > network was wasting a ton of time trying to > renegotiate the LACP settings and slowing everything > down. > > Ditto for the Linux networking - single port or > aggregated dual port? We're only using one port on both boxes (we never have been able to saturate them yet), but maybe they are somehow set wrong. We'll investigate. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Upgrade a degraded pool
On Thu, Oct 14, 2010 at 11:47 PM, Oskar wrote:
> I know that this is not necessarily the right forum, but the FreeBSD forum haven't been able to help me...
>
> I recently updated my FreeBSD 8.0 RC3 to 8.1 and after the update I can't import my zpool. My computer says that no such pool exists, even though it can be seen with the zpool status command. I assume that it's due to different zfs versions. That should be solved by a zpool upgrade, BUT the problem is that I also have a failed disk. What happens to my data if I upgrade a degraded pool? Furthermore a disk label was lost and zfs tried to replace the disk, with a disk which won't be available once I get the disk re-labeled. I have no clues about what to do... :s

providing these may (or may not) help in giving better assistance:
- output of atacontrol list and camcontrol devlist
- output of gpart show
- output of glabel status
- output of zpool status
- output of zpool list
- output of zpool import
- output of zfs list
- output of mount

-- O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Upgrade a degraded pool
I know that this is not necessarily the right forum, but the FreeBSD forum hasn't been able to help me...

I recently updated my FreeBSD 8.0 RC3 to 8.1, and after the update I can't import my zpool. My computer says that no such pool exists, even though it can be seen with the zpool status command. I assume that it's due to different zfs versions. That should be solved by a zpool upgrade, BUT the problem is that I also have a failed disk. What happens to my data if I upgrade a degraded pool? Furthermore, a disk label was lost and zfs tried to replace the disk - with a disk which won't be available once I get the disk re-labeled. I have no clue about what to do... :s

I only use the computer as a home file server, so I could migrate to OpenSolaris if that would help my case... Any help would be most appreciated!!!
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs diff cannot stat shares
I had to upgrade zfs:

zfs upgrade -a

then:

pfexec zfs set sharesmb=off data
pfexec zfs set sharesmb=on data

After this, zfs diff failed with the old snapshots, but it worked with newly created snapshots.

Thanks Tim,
Dirk
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Finding corrupted files
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Toby Thain
>
> > I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it anymore.
>
> It is. And there's no reason not to point it out. The world has

Well, neither one of the above statements is really fair. The truth is: raid5/6 are generally not that bad. Data integrity failures are not terribly common (maybe one bit per year out of 20 large disks, or something like that.) And in order to reach the conclusion "nobody would use it," the people using it would have to first *notice* the failure. Which they don't. That's kind of the point.

Since I started using ZFS in production, about a year ago, on three servers totaling approx 1.5TB used, I have had precisely one checksum error, which ZFS corrected. I have every reason to believe, if that were on a raid5/6, the error would have gone undetected and nobody would have noticed.

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
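The "one bit per year" figure can be sanity-checked with a back-of-envelope calculation. The numbers below are assumptions for illustration (a spec-sheet unrecoverable error rate of 1e-15 per bit, 1 TB disks each fully read about a dozen times a year), not measurements:

```shell
# Expected unrecoverable bit errors per year across an array,
# under the ASSUMED parameters noted above.
awk 'BEGIN {
  uber  = 1e-15            # assumed unrecoverable errors per bit read
  disks = 20               # disks in the array
  bytes = 1e12             # 1 TB per disk
  reads = 12               # full-disk reads per year (assumption)
  bits  = disks * bytes * 8 * reads
  printf "expected errors/year: %.2f\n", bits * uber
}'
```

Which lands in the same ballpark as the one-bit-per-year guess; either way, such an error is invisible to raid5/6 without end-to-end checksums.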
Re: [zfs-discuss] Finding corrupted files
On 14-Oct-10, at 3:27 AM, Stephan Budach wrote:

> I'd like to see those docs as well. As all HW raids are driven by software, of course - and software can be buggy.

It's not that the software 'can be buggy' - that's not the point here. The point being made is that conventional RAID just doesn't offer data *integrity* - it's not a design factor. The necessary mechanisms simply aren't there. Contrariwise, with ZFS, end to end integrity is *designed in*.

The 'papers' which demonstrate this difference are the design documents; anyone could start with Mr Bonwick's blog - with which I am sure most list readers are already familiar.

http://blogs.sun.com/bonwick/en_US/category/ZFS
e.g. http://blogs.sun.com/bonwick/en_US/entry/zfs_end_to_end_data

> I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it anymore.

It is. And there's no reason not to point it out. The world has changed a lot since RAID was 'state of the art'. It is important to understand its limitations (most RAID users apparently don't).

The saddest part is that your experience clearly shows these limitations. As expected, the hardware RAID didn't protect your data, since it's designed neither to detect nor repair such errors. If you had been running any other filesystem on your RAID you would never even have found out about it until you accessed a damaged part of it. Furthermore, backups would probably have been silently corrupt, too.

As many other replies have said: The correct solution is to let ZFS, and not conventional RAID, manage your redundancy. That's the bottom line of any discussion of "ZFS managed discs vs. HW raids". If still unclear, read Bonwick's blog posts, or the detailed reply to you from Edward Harvey (10/6).

--Toby

> So… just post the link and I will take a close look at the docs.
>
> Thanks, budy
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
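Toby's end-to-end integrity point can be made concrete with plain files: record a digest at write time, corrupt a byte "silently", and only the checksum notices. A toy stand-in for what ZFS does per block (sha256sum plays the role of the block checksum; file names are invented for the demo):

```shell
# Write a payload and record its digest, as an application (or ZFS) would.
printf 'important payload' > blob.dat
sha256sum blob.dat > blob.sha256

# Simulate bit rot: overwrite one byte in place. No I/O error is reported --
# exactly the failure mode a RAID controller never notices.
printf 'X' | dd of=blob.dat bs=1 seek=3 conv=notrunc 2>/dev/null

# Only the end-to-end check catches it (prints 'blob.dat: FAILED').
sha256sum -c blob.sha256 || echo "corruption detected"
```

The missing piece that checksums alone don't give you is repair; with redundancy under its own control, ZFS can also rewrite the bad block from a good copy (the "self healing" mentioned above).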
Re: [zfs-discuss] Optimal raidz3 configuration
> From: David Magda [mailto:dma...@ee.ryerson.ca] > > On Wed, October 13, 2010 21:26, Edward Ned Harvey wrote: > > > I highly endorse mirrors for nearly all purposes. > > Are you a member of BAARF? > > http://www.miracleas.com/BAARF/BAARF2.html Never heard of it. I don't quite get it ... They want people to stop talking about pros/cons of various types of raid? That's definitely not me. I think there are lots of pros/cons, and many of them have nuances, and vary by implementation... I think it's important to keep talking about it, and all us "experts" in the field can keep current on all this ... Take, for example, the number of people discussing things in this mailing list, who say they still use hardware raid. That alone demonstrates misinformation (in most cases) and warrants more discussion. ;-) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Optimal raidz3 configuration
On Wed, October 13, 2010 21:26, Edward Ned Harvey wrote: > I highly endorse mirrors for nearly all purposes. Are you a member of BAARF? http://www.miracleas.com/BAARF/BAARF2.html :) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs diff cannot stat shares
a diff to list the file differences between snapshots:
http://arc.opensolaris.org/caselog/PSARC/2010/105/mail

Dave

On 10/13/10 15:48, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of dirk schelfhout
>>
>> Wanted to test the zfs diff command and ran into this.
>
> What's zfs diff? I know it's been requested, but AFAIK, not implemented yet. Is that new feature being developed now or something?

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
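For context, the feature reports files created, removed, or modified between two snapshots, e.g. `zfs diff tank/fs@snap1 tank/fs@snap2`. A rough plain-directory analogy of the report it produces (directory copies stand in for snapshots; the names here are invented):

```shell
# Two "snapshot" states of a filesystem, captured by copying.
mkdir -p fs
echo v1 > fs/a.txt
cp -r fs snap1                     # state at first "snapshot"
echo v2 > fs/a.txt
echo new > fs/b.txt
cp -r fs snap2                     # state at second "snapshot"

# diff -rq plays the role of 'zfs diff' between the two states:
# reports a.txt as modified and b.txt as added.
diff -rq snap1 snap2 | sort
```

The real zfs diff works from block birth-time metadata rather than walking both trees, which is part of why it can stay fast on very large filesystems.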
[zfs-discuss] AMD/Supermicro machine - AS-2022G-URF
Sorry for the long post, but I know people trying to decide on hardware often want to see details about what others are using.

I have the following AS-2022G-URF machine running OpenGaryIndiana[1] that I am starting to use. I successfully transferred a deduped zpool with 1.x TB of files and 60 or so zfs filesystems using mbuffer from an old 134 system with 6 drives - it ran at about 50MB/s or slightly more for much of the process, and mbuffer worked great.

I am wondering what commands people would recommend running to retrieve/save config info, logs, history, etc. to document and save important files _before_ any problems arise. I am planning 'zfs history' and 'zpool history' but looking for ideas. Also, ideas for light testing/benchmarking would be great, but I don't think I can erase/destroy the zpools for testing.

One thing that went wrong so far that I know about: I seem to have munged one of the SSDs by running format on it - it asked for cyl/track/sect info and then barfed. I won't look at SSDs using format ever again. And the 2TB drives seem somewhat slow...

I am thinking about trying napp-it - anyone using it with OpenIndiana?

Thanks in advance.
%< --=-=-=-=-=-=-=-=-=-=-- >%

AS-2022G-URF
http://www.supermicro.com/Aplus/system/2U/2022/AS-2022G-URF.cfm
with:
* 1 x AMD 6172 12 core processor
* 16 GB Ram
* 2 x WD1002FAEX connected to motherboard amd-ahci for rpool
* 2 x some Kingston SSDs connected to motherboard for arc/zil maybe
* 1 LSI LSI00194 connected via 2 8087 cables to the internal drive bays with:
* 8 x WD1002FAEX-0-1D05 - 1 TB WDC Black drives

2 x Areca ARC-1222X (with battery) each connected to a Tekram R08 2U 8 Bay case
http://www.newegg.com/Product/Product.aspx?Item=N82E16816208017
(these are discontinued, I'm not sure why; the guy at Tekram said they were samples or a trial or something, but they were about $250 each)
* 8 x WDC-WD2001FASS-00W2B 2 TB WDC Black drives
the other:
* 8 x Samsung F3 1 TB - I'll post those later when I have it hooked up if anyone is interested

# uname -a
SunOS xxyyzz 5.11 oi_147 i86pc i386 i86pc Solaris

# psrinfo -pv
The physical processor has 12 virtual processors (0-11)
  x86 (AuthenticAMD 100F91 family 16 model 9 step 1 clock 2100 MHz)
    AMD Opteron(tm) Processor 6172 [ Socket: G34 ]

# zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH    ALTROOT
rpool   928G  13.5G   914G     1%  1.00x  DEGRADED  -
xxx1   14.5T  1.41T  13.1T     9%  4.54x  ONLINE    -
xx2    7.25T   109G  7.14T     1%  1.00x  ONLINE    -

# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
  scan: scrub repaired 0 in 0h2m with 0 errors on Thu Oct 7 17:46:54 2010
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror-0    DEGRADED     0     0     0
            c5t0d0s0  ONLINE       0     0     0
            c5t1d0s0  UNAVAIL      0     0     0  cannot open

errors: No known data errors

  pool: xxx1
 state: ONLINE
  scan: scrub repaired 0 in 30h34m with 0 errors on Mon Oct 11 06:44:59 2010
config:

        NAME        STATE     READ WRITE CKSUM
        xxx1        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t0d1  ONLINE       0     0     0
            c7t0d2  ONLINE       0     0     0
            c7t0d3  ONLINE       0     0     0
            c7t0d4  ONLINE       0     0     0
            c7t0d5  ONLINE       0     0     0
            c7t0d6  ONLINE       0     0     0
            c7t0d7  ONLINE       0     0     0
        logs
          c19d1     ONLINE       0     0     0

errors: No known data errors

  pool: xx2
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Thu Oct 7 17:32:11 2010
config:

        NAME                       STATE     READ WRITE CKSUM
        xx2                        ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c4t50014EE2AF54BB46d0  ONLINE       0     0     0
            c4t50014EE2AF54BE3Fd0  ONLINE       0     0     0
            c4t50014EE2AF544E2Ed0  ONLINE       0     0     0
            c4t50014EE204A9DC6Bd0  ONLINE       0     0     0
            c4t50014EE204A9DCD5d0  ONLINE       0     0     0
            c4t50014EE204A9E333d0  ONLINE       0     0     0
            c4t50014EE259FF2C16d0  ONLINE       0     0     0
            c4t50014EE259FF13F0d0  ONLINE       0     0     0

# prtdiag -l
System Configuration: Supermicro H8DGU
BIOS Configuration: American Megatrends Inc. 1.0a 07/01/2010
BMC Configuration: IPMI 1.5 (KCS: Keyboard Controller Style)
Proce
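On the "document everything before problems arise" question: one approach is a small script that dumps each command's output into a dated directory. A sketch - the command list is only a suggestion (zpool history/status, zfs properties, prtdiag), and the availability guard is an assumption so the same script runs on any box:

```shell
# Snapshot-the-config helper: one output file per command, in a dated dir.
DEST=sysdoc-$(date +%Y%m%d)
mkdir -p "$DEST"

save() {                      # usage: save <name> <command> [args...]
  name=$1; shift
  if command -v "$1" >/dev/null 2>&1; then
    "$@" > "$DEST/$name.txt" 2>&1 || true   # keep going even if one fails
  else
    echo "$1 not available" > "$DEST/$name.txt"
  fi
}

save uname          uname -a
save zpool-history  zpool history
save zpool-status   zpool status
save zfs-list       zfs list -t all
save zfs-props      zfs get all
save prtdiag        prtdiag -v

ls "$DEST"
```

Run from cron, this leaves a dated trail of pool layout and property changes to compare against after something breaks.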
Re: [zfs-discuss] Performance issues with iSCSI under Linux
On 13 oct. 2010, at 18:37, Marty Scholes wrote: > The only thing that still stands out is that network operations (iSCSI and > NFS) to external drives are slow, correct? > > Just for completeness, what happens if you scp a file to the three different > pools? If the results are the same as NFS and iSCSI, then I think the > network can be ruled out. > > I would be leaning toward thinking there is some mismatch between the network > protocols and the external controllers/cables/arrays. Sounding more and more like a networking issue - are the network cards set up in an aggregate? I had some similar issues on GbE where there was a mismatch between the aggregate settings on the switches and the LACP settings on the server. Basically the network was wasting a ton of time trying to renegotiate the LACP settings and slowing everything down. Ditto for the Linux networking - single port or aggregated dual port? Erik ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Running on Dell hardware?
We got an R710 + 3 MD1000s running zfs, with an Intel 10GE network card. There was a period of time when the R710 was freezing randomly, when we used an osol b12x release. I checked on Google and there were reports of freezes caused by a new mpt driver used in the b12x releases, which could be the cause. We changed to Nexenta based on b134, and the issue has been gone ever since - it has been running very stable. We plan to add 3 more MD1000s. All MD1000s are connected to a SAS 5e card. Not sure what the mpt driver status is in sol10u9.
-- This message posted from opensolaris.org
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Finding corrupted files
I'd like to see those docs as well. As all HW raids are driven by software, of course - and software can be buggy. I don't want to heat up the discussion about ZFS managed discs vs. HW raids, but if RAID5/6 would be that bad, no one would use it anymore. So… just post the link and I will take a close look at the docs. Thanks, budy -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss