Re: [zfs-discuss] Re: Significant pauses during zfs writes
Robert Milkowski wrote:
> Hello Michael,
>
> Wednesday, August 23, 2006, 12:49:28 PM, you wrote:
>
> MSSM> Roch wrote:
> MSSM> I sent this output offline to Roch, here's the essential ones and
> MSSM> (first) his reply:
>
> So it looks like this:
>
>   6421427 netra x1 slagged by NFS over ZFS leading to long spins in the ATA driver code
>   http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6421427
>
> Are there any workarounds? Or maybe some code not yet integrated to try?

Not that I know of - Roch may be better informed than me, though.

cheers
Michael
--
Michael Schuster                 +49 89 46008-2974 / x62974
visit the online support center: http://www.sun.com/osc/
Recursion, n.: see 'Recursion'
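[Not a workaround for 6421427, just a quick way to watch the symptom while it happens. This is a minimal sketch; the pool name "tank" is a placeholder for whatever pool is backing the NFS share:

  # per-vdev bandwidth/IOPS once a second - a whole-pool stall shows up as
  # every column dropping to zero for several intervals
  zpool iostat -v tank 1

  # per-device service times from the OS side; one disk with huge asvc_t
  # while the others are idle points at the driver/device, not ZFS itself
  iostat -xnz 1

Running both in separate terminals during an NFS write burst makes it easy to tell a pool-wide pause from a single slow device.]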
Re[2]: [zfs-discuss] ZFS se6920
Hello Wee,

Saturday, August 26, 2006, 6:43:05 PM, you wrote:

WYT> Thanks to all who have responded. I spent 2 weekends working through
WYT> the best practices that Jerome recommended -- it's quite a mouthful.
WYT>
WYT> On 8/17/06, Roch [EMAIL PROTECTED] wrote:

    My general principles are: If you can, to improve your 'Availability'
    metrics, let ZFS handle one level of redundancy;

WYT> Cool. This is a good way to take advantage of the
WYT> error-detection/correction feature in ZFS. We will definitely take
WYT> this suggestion!

    For random read performance, prefer mirrors over raid-z. If you use
    raid-z, group together a smallish number of volumes. Set up volumes
    that correspond to a small number of drives (the smallest you can bear)
    with a volume interlace in the [1M-4M] range.

WYT> I have a hard time picturing this wrt the 6920 storage pool. The
WYT> internal disks in the 6920 present up to 2 VDs per array (6-7 disks
WYT> each?). The storage pool will be built from a bunch of these VDs and
WYT> may be further partitioned into several volumes, and each volume is
WYT> presented to a ZFS host. What should the storage profile look like?
WYT> I can probably do a stripe profile since I can leave the redundancy to
WYT> ZFS.

IMHO, if you have a VD, make just one partition and present it as a LUN to
ZFS. Do not present several partitions from the same disks to ZFS as
different LUNs. (See the sketch below for what the resulting pool layout
might look like.)

WYT> To complicate matters, we are likely going to attach all our 3510s to
WYT> the 6920 and use some of these for the ZFS volumes, so further
WYT> restrictions may apply. Are we better off doing a direct attach?

You can attach 3510 JBODs (I guess) directly - but currently there are
restrictions: only one host and no MPxIO. If that's acceptable, it looks
like you'll get better performance than going through a 3510 head unit.

ps. I did try with MPxIO and two hosts connected, with several JBODs - and
I did see FC loop logout/login, etc.

--
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com
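[To make the "one whole LUN per VD, let ZFS do the redundancy" advice concrete, here is a minimal sketch. The pool name and the c4tXd0 device names are placeholders for whatever LUNs the 6920 actually presents:

  # random-read-heavy: mirror pairs of whole-LUN devices
  zpool create mypool mirror c4t0d0 c4t1d0 mirror c4t2d0 c4t3d0

  # capacity-oriented alternative: one smallish raid-z group, each member
  # a whole LUN (never several slices cut from the same underlying disks)
  zpool create mypool raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0

  zpool status mypool     # confirm the vdev layout ZFS actually sees

Either way, ZFS keeps its own redundant copies/parity on separate LUNs, so its checksum-based self-healing has something to repair from.]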
Re: [zfs-discuss] Re: Significant pauses during zfs writes
A fix for this should be integrated shortly.

Thanks,
George

Michael Schuster - Sun Microsystems wrote:
> Robert Milkowski wrote:
>> Hello Michael,
>>
>> Wednesday, August 23, 2006, 12:49:28 PM, you wrote:
>>
>> MSSM> Roch wrote:
>> MSSM> I sent this output offline to Roch, here's the essential ones and
>> MSSM> (first) his reply:
>>
>> So it looks like this:
>>
>>   6421427 netra x1 slagged by NFS over ZFS leading to long spins in the ATA driver code
>>   http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6421427
>>
>> Are there any workarounds? Or maybe some code not yet integrated to try?
>
> Not that I know of - Roch may be better informed than me, though.
>
> cheers
> Michael
[zfs-discuss] Sol 10 x86_64 intermittent SATA device locks up server
Hello All,

I have an issue where I have two SATA cards with 5 drives each in one zfs pool. One of the devices has been intermittently failing, and the problem is that the entire box seems to lock up on occasion when this happens. I currently have the SATA cable to that device disconnected in the hope that the box will at least stay up for now. This is a new build that I am burning in, in the hope that it will serve as some NFS space for our Solaris boxen.

Below is the output from zpool status -vx:

bash-3.00# zpool status
  pool: tank
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          raidz     ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
          raidz     DEGRADED     0     0     0
            c2t1d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
            c2t5d0  UNAVAIL   4263     0        cannot open

errors: No known data errors

And below is some info from /var/adm/messages:

Aug 29 12:42:08 localhost marvell88sx: [ID 812917 kern.warning] WARNING: marvell88sx1: error on port 5:
Aug 29 12:42:08 localhost marvell88sx: [ID 702911 kern.notice]  SError interrupt
Aug 29 12:42:08 localhost marvell88sx: [ID 702911 kern.notice]  link data receive error - crc
Aug 29 12:42:08 localhost marvell88sx: [ID 702911 kern.notice]  link data receive error - state
Aug 29 12:42:08 localhost marvell88sx: [ID 812917 kern.warning] WARNING: marvell88sx1: error on port 5:
Aug 29 12:42:08 localhost marvell88sx: [ID 702911 kern.notice]  device error
Aug 29 12:42:08 localhost marvell88sx: [ID 702911 kern.notice]  SError interrupt
Aug 29 12:42:08 localhost marvell88sx: [ID 702911 kern.notice]  EDMA self disabled
Aug 29 12:43:08 localhost marvell88sx: [ID 812917 kern.warning] WARNING: marvell88sx1: error on port 5:
Aug 29 12:43:08 localhost marvell88sx: [ID 702911 kern.notice]  device disconnected
Aug 29 12:43:08 localhost marvell88sx: [ID 702911 kern.notice]  device connected
Aug 29 12:43:08 localhost marvell88sx: [ID 702911 kern.notice]  SError interrupt
Aug 29 12:43:10 localhost marvell88sx: [ID 812917 kern.warning] WARNING: marvell88sx1: error on port 5:
Aug 29 12:43:10 localhost marvell88sx: [ID 702911 kern.notice]  SError interrupt
Aug 29 12:43:10 localhost marvell88sx: [ID 702911 kern.notice]  link data receive error - crc
Aug 29 12:43:10 localhost marvell88sx: [ID 702911 kern.notice]  link data receive error - state
Aug 29 12:43:11 localhost marvell88sx: [ID 812917 kern.warning] WARNING: marvell88sx1: error on port 5:
Aug 29 12:43:11 localhost marvell88sx: [ID 702911 kern.notice]  device error
Aug 29 12:43:11 localhost marvell88sx: [ID 702911 kern.notice]  SError interrupt
Aug 29 12:43:11 localhost marvell88sx: [ID 702911 kern.notice]  EDMA self disabled
Aug 29 12:44:10 localhost marvell88sx: [ID 812917 kern.warning] WARNING: marvell88sx1: error on port 5:
Aug 29 12:44:10 localhost marvell88sx: [ID 702911 kern.notice]  device disconnected
Aug 29 12:44:10 localhost marvell88sx: [ID 702911 kern.notice]  device connected
Aug 29 12:44:10 localhost marvell88sx: [ID 702911 kern.notice]  SError interrupt

My question is: shouldn't it be possible for Solaris to stay up even with an intermittent drive error? I have a replacement drive and cable on order to see if that fixes the problem. Thanks!
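[A minimal sketch of how one might park the flaky disk and swap it once the replacement arrives, assuming the pool and device names from the zpool status output above; this is just standard zpool administration, not a fix for the lockups themselves:

  # stop ZFS from retrying the flaky disk while waiting for parts
  # (the raidz group is already running degraded, so this is safe)
  zpool offline tank c2t5d0

  # optionally, check what FMA has recorded against the device
  fmdump -eV | more

  # after the new drive and cable are installed in the same slot
  zpool replace tank c2t5d0
  zpool status tank          # watch the resilver run to completion

If the box keeps hanging even with the device offlined, that points back at the controller/driver layer rather than ZFS.]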