On Mon, 10 Mar 2008, Lida Horn wrote:
> Paul Raines wrote:
>> Well, I ran updatemanager and started applying about 64 updates. After
>> the progress meter got about half way it seemed to hang not moving for
>> hours. I finally gave up and did a reboot. But the machine would not
>> reboot. I went in the ILOM and tried 'stop /SYS' but after a few minutes
>> would get back an error on the console saying something like "shutdown
>> failed". So I finally just hard power cycled the box. Luckily, it came
>> back up seemingly okay and I was able to rerun updatemanager and get all
>> updates installed. However, after rebooting I now note the following
>> error messages on the console:
>>
>> Mar 9 03:22:16 raidsrv03 sata: NOTICE:
>> /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci11ab,[EMAIL PROTECTED]:
>> Mar 9 03:22:16 raidsrv03 port 6: device reset
>> Mar 9 03:22:16 raidsrv03 sata: NOTICE:
>> /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci11ab,[EMAIL PROTECTED]:
>> Mar 9 03:22:16 raidsrv03 port 6: link lost
>> Mar 9 03:22:16 raidsrv03 sata: NOTICE:
>> /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci11ab,[EMAIL PROTECTED]:
>> Mar 9 03:22:16 raidsrv03 port 6: link established
>> Mar 9 03:22:16 raidsrv03 scsi: WARNING:
>> /[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci11ab,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd46):
>> Mar 9 03:22:16 raidsrv03 Error for Command: write(10) Error Level: Retryable
>> Mar 9 03:22:16 raidsrv03 scsi: Requested Block: 68158362 Error Block: 68158362
>> Mar 9 03:22:16 raidsrv03 scsi: Vendor: ATA Serial Number:
>> Mar 9 03:22:16 raidsrv03 scsi: Sense Key: No Additional Sense
>> Mar 9 03:22:16 raidsrv03 scsi: ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
>>
>>
>> The above repeated a few times but now seems to have stopped. Running
>> 'hd -c' shows all disks as ok. But it seems like I do have a disk
>> problem. But since everything is redundant (zraid) why a failed disk
>> should lock up the machine like I saw I don't understand unless there
>> is a some bigger issue.
>>
>> Any advice?
>>
> It is unclear what you are talking about. Do you have any evidence to
> connect that retryable write errors with the previous hang or were they
> two independent events? The retried write error would appear to be
> normal behavior with a bad sector. If the sector is actually bad, there
> would be the initial write attempt followed by five retries. The last
> retry would have "Error Level: Fatal" as opposed to "Error Level:
> Retryable", otherwise one of the retries would have been successful and
> everything would move on.
>
> Regards,
> Lida

No, I cannot connect the two events. When the 'zfs create' hang happened,
and again when applying the updates hung, I could find no error messages
at all. The errors above only appeared after the reboot, so the
connection is circumstantial.
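
To check Lida's point about retries against the full log, something like the following rough sketch could group the logged error levels by requested block and flag any block whose sequence ever reached "Error Level: Fatal". It assumes the log has been unwrapped so that each "Error Level:" entry and the "Requested Block:" entry that follows it sit on single lines. The function names and the "retryable_only" label are made up for illustration and are not part of any Solaris tool.

```python
import re
from collections import defaultdict

# Patterns for the two fields of interest in the scsi warning messages.
LEVEL_RE = re.compile(r"Error Level:\s*(\w+)")
BLOCK_RE = re.compile(r"Requested Block:\s*(\d+)")


def summarize_errors(log_text):
    """Return {block number: [error levels in the order they were logged]}.

    Each "Error Level" entry is paired with the next "Requested Block"
    entry that follows it in the log.
    """
    levels_by_block = defaultdict(list)
    pending_level = None
    for line in log_text.splitlines():
        level = LEVEL_RE.search(line)
        if level:
            pending_level = level.group(1)
        block = BLOCK_RE.search(line)
        if block and pending_level is not None:
            levels_by_block[int(block.group(1))].append(pending_level)
            pending_level = None
    return dict(levels_by_block)


def classify(levels):
    """Label a block "fatal" if any logged attempt was Fatal.

    Otherwise label it "retryable_only", which, per the description
    above, suggests that one of the retries went through.
    """
    return "fatal" if "Fatal" in levels else "retryable_only"


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < /var/adm/messages
    for block, levels in sorted(summarize_errors(sys.stdin.read()).items()):
        print(block, len(levels), classify(levels))
```

A block that only ever shows Retryable entries fits the "retry eventually succeeded" case, while a Fatal entry would mean the write was given up on.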