panix panix wrote:
> Hello,
> in advance sorry for the cross posting, it is just that freebsd-geom didn't
> seem that populated.
> I run 7.1-PRERELEASE, it's a home server.
> This morning after a power failure, the rebuild of my root gm0 failed on
> disk ad4.
> The messages were:
>
> May 18 08:02:02 panix kernel: ad4: WARNING - WRITE_DMA UDMA ICRC error (retrying request) LBA=268091264
> May 18 08:02:08 panix kernel: drm0: <Intel i865G GMCH> on vgapci0
> May 18 08:02:08 panix kernel: info: [drm] AGP at 0xf0000000 128MB
> May 18 08:02:08 panix kernel: info: [drm] Initialized i915 1.5.0 20060119
> May 18 08:02:08 panix kernel: drm0: [ITHREAD]
> May 18 08:02:08 panix kernel: ad4: FAILURE - device detached
> May 18 08:02:08 panix kernel: subdisk4: detached
> May 18 08:02:08 panix kernel: ad4: detached
> May 18 08:02:08 panix kernel: GEOM_MIRROR: Device gm0: provider ad4 disconnected.
> May 18 08:02:08 panix kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad4 stopped.
>
> I read
> http://www.eztiger.org/2008/08/removing-and-re-adding-a-disk-in-gmirror/
> hoping that the rebuild failure was temporary
> and so I tried to just run
> # gmirror forget gm0
> # gmirror insert gm0 ad4
>
> But the system responded (if I remember correctly)
> Unknown provider ad4.
> The system no longer could see ad4 being online.

Yes, as you were informed by the "device detached" message - after that
point ad4 was removed from /dev.

> So I rebooted the system many times and had these results:
> -When having put offline ad4 (disconnected by hardware), the system booted ok.
> -When having both disks online the system responded consistently
> with:
> "GEOM_MIRROR: Cannot add disk ad6 to gm0 (error=22)."

Which means that gm0 was somehow created before - maybe from the "stale"
ad4 copy? If so, you are attempting to add a newer generation of data
(from ad6) to a gm0 instantiated from an older generation (from ad4).
This could explain the error code (22 = invalid argument). OTOH if you
only have ad6 in the system, this means you are trying to insert ad6 into
a mirror which is already instantiated by ad6 - which is trivially wrong.

> Which IMO is not very ok, since gm0 should add ad6 without problem,
> no matter if ad4 is online or not.

You cannot really expect the system to behave correctly with broken
hardware.

> -When having only ad4 online, then it simply cannot find gm0 at all. (kind of
> reasonable)

Relatively. Is ad4 recognized by the system? You didn't really clear the
metadata on ad4, so it should be recognized, but as a stale version
(hopefully). If it isn't recognized at all, then it's broken.

> So my only option is to have only ad6 online, with a current gmirror status:
> panix# gmirror status
>       Name    Status  Components
> mirror/gm0  COMPLETE  ad6

This is ok.

> Anyone has an idea of how should I proceed (besides buying a UPS unit!)
> Is it meaningful to go for a new disk to replace current ad4?

Yes. Then proceed with gmirror insert.

> Why is the presence of the supposed bad disk ad4, affecting gm0,
> when having already told gm0 to forget about ad4?

It's relatively common (it was more common in the days of PATA cables) to
have a bad drive interfering with the rest of the system.
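
For reference, the replacement sequence could look roughly like the sketch below. It assumes the new disk shows up as ad4 again and that the mirror is still named gm0; adjust both to what your system actually reports. The `run` helper and the `DRY_RUN` variable are made up for illustration and only print the commands by default, so nothing touches the mirror until you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: mirror name (gm0) and disk name (ad4) are assumptions
# taken from this thread; check them against your own system first.

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_mirror_disk() {
    mirror="$1"
    disk="$2"
    # Drop components the mirror still remembers but that are not connected.
    run gmirror forget "$mirror"
    # Add the new disk; gmirror then starts synchronizing it.
    run gmirror insert "$mirror" "$disk"
    # Show the mirror state so the rebuild can be followed.
    run gmirror status "$mirror"
}

replace_mirror_disk gm0 ad4
```

Run it once as-is to see what would be executed, then again with DRY_RUN=0 once the new disk is physically in place and detected.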