Re: [zfs-discuss] Narrow escape with FAULTED disks

2010-08-23 Thread Mark Bennett
Well, I do have a plan. Thanks to the portability of ZFS boot disks, I'll make two new OS disks on another machine with the next Nexenta release, export the data pool and swap in the new ones. That way, I can at least manage a zfs scrub without killing the performance and get the Intel SSDs I

[zfs-discuss] Narrow escape with FAULTED disks

2010-08-16 Thread Mark Bennett
Nothing like a "heart in mouth moment" to shave years off your life. I rebooted a snv_132 box in perfect health, and it came back up with two FAULTED disks in the same vdisk group. Everything I found in an hour on Google basically said "your data is gone". All 45TB of it. A postmortem of fmadm sh

Re: [zfs-discuss] ZFS development moving behind closed doors

2010-08-14 Thread Mark Bennett
On 8/13/10 8:56 PM -0600 Eric D. Mudama wrote: > On Fri, Aug 13 at 19:06, Frank Cusack wrote: >> Interesting POV, and I agree. Most of the many "distributions" of >> OpenSolaris had very little value-add. Nexenta was the most interesting >> and why should Oracle enable them to build a business at t

Re: [zfs-discuss] Opensolaris is apparently dead

2010-08-14 Thread Mark Bennett
>That's a very good question actually. I would think that COMSTAR would >stay because its used by the Fishworks appliance... however, COMSTAR is >a competitive advantage for DIY storage solutions. Maybe they will rip >it out of S11 and make it an add-on or something. That would suck. >I guess the

Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-02-01 Thread Mark Bennett
The WD10EARS disks don't work well. I had too many issues with timeouts that disappeared when replacing them with ST32000542AS drives. My next challenge is to get the LSI 3081 to boot off the disk I want it to, and then to get multipath functional. Has anyone else had issues with the LSI IT mo

Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-02-01 Thread Mark Bennett
The results are in: My timeout issue is definitely the WD10EARS disks. Although differences in the error rate were seen with different LSI firmware revisions, the errors persisted. The more disks on the expander, the higher the number with iostat errors. This then causes zpool issues (disk failur

Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-02-01 Thread Mark Bennett
I did see that and confirmed the support has made it into the 130 release I'm testing with. However, the WD10EARS does not expose 4k sectors to the outside world, so it is not identified as supporting it. Correct alignment, to ensure best performance of the internal translation, seems to be the

Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-31 Thread Mark Bennett
Update: For the WD10EARS, the blocks appear to be aligned on the 4k boundary when zfs uses the whole disk (whole disk as EFI partition).
Part      Tag    Flag     First Sector        Size        Last Sector
  0       usr    wm                256    931.51Gb         1953508750
calc25
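The alignment claim above can be checked with a little arithmetic. This is only a minimal sketch, assuming the drive reports 512-byte logical sectors (as noted elsewhere in the thread, the WD10EARS does not expose its 4k sectors) and uses 4096-byte sectors internally. The function names are made up for illustration, and the sector-63 case is just a comparison against the traditional MBR partition start, not something taken from the post.

```python
# Assumption: 512-byte logical sectors presented to the host.
LOGICAL_SECTOR_BYTES = 512
# Assumption: 4 KiB sectors used internally by the drive.
PHYSICAL_SECTOR_BYTES = 4096


def byte_offset(first_sector, logical=LOGICAL_SECTOR_BYTES):
    """Byte offset on disk at which a slice starting at first_sector begins."""
    return first_sector * logical


def is_aligned(first_sector, logical=LOGICAL_SECTOR_BYTES,
               physical=PHYSICAL_SECTOR_BYTES):
    """True if the slice starts exactly on a physical-sector boundary."""
    return byte_offset(first_sector, logical) % physical == 0


if __name__ == "__main__":
    # 256 is the First Sector shown for slice 0 in the label above;
    # 63 is the traditional MBR start, included only for comparison.
    for label, sector in (("EFI whole-disk slice 0", 256),
                          ("traditional MBR start", 63)):
        state = "aligned" if is_aligned(sector) else "NOT aligned"
        print(f"{label}: sector {sector} -> byte {byte_offset(sector)} ({state})")
```

With a first sector of 256 the slice starts at byte 131072, which is a whole multiple of 4096, so the whole-disk EFI label keeps the slice start on a 4k boundary under these assumptions.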

Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-30 Thread Mark Bennett
I'm looking into the alignment implications for the WD10EARS disks. It may explain my issues. I seem to recall boot issues in some of the LSI release notes affecting other boot devices. I think it takes over boot responsibility. I've encountered this sort of issue over the years with many scsi ca

Re: [zfs-discuss] Is LSI SAS3081E-R suitable for a ZFS NAS ?

2010-01-28 Thread Mark Bennett
My experience was different again. I have the same timeout issues with both the LSI and Supermicro cards in IT mode. IR mode on the Supermicro card didn't solve the problem, but seems to have reduced it. The server has 1 x 16-bay chassis and 1 x 24-bay chassis (both use expanders). Test pool has 24 x

Re: [zfs-discuss] Strange random errors getting automatically repaired

2010-01-27 Thread Mark Bennett
Hi Giovanni, I have seen these while testing the mpt timeout issue, and on other systems during resilvering of failed disks and while running a scrub. Once so far on this test scrub, and several on yesterday's. I checked the iostat errors, and they weren't that high on that device, compared to

Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-01-26 Thread Mark Bennett
An update: Well, things didn't quite turn out as expected. I decided to follow the path right to the disks for clues. Digging into the adapter diags with LSIUTIL revealed an Adapter Link issue.
Adapter Phy 5:  Link Down
  Invalid DWord Count    5,969,575
  Running D

Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2010-01-25 Thread Mark Bennett
I can produce the timeout error on multiple, similar servers. These are storage servers, so no zones or GUI running.
Hardware:
Supermicro X7DWN with AOC-USASLP-L8i controller
E1 (single port) backplanes (16 & 24 bay) (LSILOGICSASX28 A.0 and LSILOGICSASX36 A.1)
up to 36 1TB WD SATA disks
This serv

Re: [zfs-discuss] abusing zfs boot disk for fun and DR

2010-01-09 Thread Mark Bennett
Ben, I have found that booting from cdrom and importing the pool on the new host, then booting from the hard disk, will prevent these issues. That will reconfigure the zfs to use the new disk device. Once it is running, zpool detach the missing mirror device and attach a new one. Mark. -- This message posted f

Re: [zfs-discuss] Understanding SAS/SATA Backplanes and Connectivity

2010-01-07 Thread Mark Bennett
Thanks Will, I thought it might be an i2c interface port to the PSU, but obviously much simpler. I'll probably use a small PICAXE micro, since I have a few here & have used them before. I used them to 'translate' the replacement fan's clock pulse to what the monitoring circuit needed in a few V2

Re: [zfs-discuss] Understanding SAS/SATA Backplanes and Connectivity

2010-01-06 Thread Mark Bennett
Will, sorry for picking up an old thread, but you mentioned a PSU monitor to supplement the CSE-PTJBOD-CB1. I have two of these and am interested in your design. Oddly, the LSI backplane chipset supports 2 x i2c busses that Supermicro didn't make use of for monitoring the PSUs. Mark.

Re: [zfs-discuss] Supermicro AOC-SAT2-MV8 -- cfgadm won't create attach point (dsk/xxxx)

2010-01-06 Thread Mark Bennett
Check if your card has the latest firmware. Mark. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Re: [zfs-discuss] how do i prevent changing device names? is this even a problem in ZFS

2010-01-06 Thread Mark Bennett
The earlier (2008) OpenSolaris drivers tended to crash the server if you pulled out an active drive. It may have improved in later releases. In the case of the Sun Storage Appliances, the SATA (and SAS) drivers used are different from those in OpenSolaris and are considerably better featured. My

Re: [zfs-discuss] how do i prevent changing device names? is this even a problem in ZFS

2010-01-04 Thread Mark Bennett
I'd recommend a SAS non-raid controller (with SAS backplane) over SATA. It has better hot plug support. I use the Supermicro SC836E1 and an AOC-USAS-L4i with a UIO M/b. Mark.

Re: [zfs-discuss] Supermicro AOC-USAS-L8i

2010-01-03 Thread Mark Bennett
I have used these cards in several UIO-capable Supermicro systems with OpenSolaris, with the Supermicro storage chassis and up to 30 SATA 1TB disks. With IT mode firmware (non-raid) they are excellent. They usually have the "hardware assisted" raid firmware by default. The card is designed for the

[zfs-discuss] zpool import without mounting

2010-01-03 Thread Mark Bennett
Hi, Is it possible to import a zpool and stop it mounting the zfs file systems, or override the mount paths? Mark.

[zfs-discuss] Identify cause when disk faulted

2009-08-25 Thread Mark Bennett
On an OpenSolaris 2009.06 system I have a zpool of 12 x WD10EACS disks plus 2 spares. One disk is reported as Faulted due to corrupted data. The drive tests ok, but won't let me reuse it. The drive passes the manufacturer's diagnostic tests, and doesn't show issues with HDAT2 diags or SMART. zeroing and