Re: Strange system lockups - kernel saying disk error

2011-06-06 Thread Dave
On 5 Jun 2011 at 16:55, Michael Powell wrote:

 per...@pluto.rain.com wrote:
 [snip]
  
  Power supplies do fail occasionally, and not always in obvious
  ways such as failing to turn on at all.  The output voltages may be
  a little too high or too low, or they may be correct but with
  excessive ripple or electrical noise; or the supply may be just fine
  until a disk draws a current spike to move the arm rapidly.
 
 I've seen a fair number of power supplies degrade somewhere around the
 5 year mark. Simple voltage checks with a VOM, within its accuracy, will
 usually still show the voltages as being correct. To see the ripple
 you'll need an oscilloscope. Excessive ripple can make a PC appear to
 have all kinds of intermittent hardware failures with little or no
 rhyme or reason. A degraded power supply will show large variations in
 ripple based on load. The largest load from hard drives is when they
 are first spinning up. Servers are commonly configured with the
 ability to spin up drives one at a time with a short delay in between.
 You won't usually find this on a desktop. 
 
 Generally, this situation develops more often on an old machine
 that had a 'barely enough' capacity power supply when new. Add 3 more
 hard drives, bigger video, etc. and it was still just inside the
 envelope until enough time went by and the power supply got old. Since
 the hard drives pull the most amps at power up, you will see the
 ripple on a 'scope look really ugly while this happens. The
 unseen danger here is that bits on the drive(s) can get scrambled
 until things settle down. You will know this is happening when stuff
 goes wrong and fsck is needed to get the file system clean, and after
 cleaning it works again, only to do the same thing at some future
 reboot.
 
 Easiest way to look at this without a 'scope is to simply substitute a
 known good PSU of sufficient rating from a machine with no troubles.
 If all the random nonsense suddenly stops, you'll know. This is
 easiest for folks these days as those without an analog electronics
 background are unlikely to have an oscilloscope lying around. 
 
  It might be worth checking the fan mounted on the CPU heatsink if
  there is one, and the fan in the power supply (which ventilates the
  case as well as the power supply itself).
 
 Aside from the fans themselves, dust buildup eventually plugs heat sinks,
 drastically reducing their ability to get rid of heat. When
 you get to this stage blowing them out with canned air can work
 wonders. My 2 servers at home sit on the floor and need this about
 once a year.
 
 -Mike
 

Hi..

I've recently replaced all the 3.3V decoupling caps on a 7 year old 
Compaq mobo that was showing all sorts of odd behaviour, which at first 
glance seemed related to the video card.  It wasn't expensive, but it was 
time consuming: even for me as a skilled electronics tech, with more years 
of soldering iron time than I care to admit, it took a good couple of 
hours!  These things aren't made to be easily repaired, but it can be 
done.  In fact, for some common mobos you can buy complete re-cap kits 
with all the right parts.  Same for all sorts of other consumer 
electronics (DVD players, games consoles, DTV and other set-top boxes, 
etc.).

As a result, that box now runs sweet as a nut.  Passing all diags with 
flying colours, even when hot.

Any caps that have a bulging top, on the mobo or in the PSU, need 
changing, ideally for the same value and voltage.  You can go higher 
(within reason) in value, but don't go too high in voltage rating, as 
they can deteriorate if they don't have enough volts across them, and 
start to fail early again.

Re the PSU thing.   Don't get fooled into the common lore that bigger is 
better.   You can have too big a PSU that will fail to regulate the 
auxiliary output lines correctly until you add extra load to its main 
output.  Many PC supplies (sadly not all) do have a note to that effect 
on the ratings label.

Most switch mode supplies work best loaded to between half and 
full power on their main output.  Much less than 1/4 of their capability, 
and the auxiliary outputs will start to wander about a bit, especially 
if the incoming line is a bit high in voltage.   Common symptoms are 
strange audible noises from CD drives, or hard drives that struggle to 
start up, but are OK once working.
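
As a rough illustration of the loading rule of thumb above, here is a minimal Python sketch.  The function name, the band labels and the example wattages are made up for illustration; only the quarter, half and full load break points come from the text, and real supplies will vary.

```python
def psu_load_band(rated_watts: float, drawn_watts: float) -> str:
    """Classify how heavily a switch mode supply's main output is loaded.

    The thresholds follow the rule of thumb in the post; the labels are
    hypothetical names chosen for this sketch.
    """
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    fraction = drawn_watts / rated_watts
    if fraction > 1.0:
        return "overloaded"
    if fraction >= 0.5:
        return "comfortable"   # half to full load: the suggested sweet spot
    if fraction >= 0.25:
        return "light"         # below the sweet spot, usually still fine
    return "underloaded"       # auxiliary outputs may start to wander


if __name__ == "__main__":
    # Hypothetical example: a 600 W supply in a box drawing about 120 W.
    print(psu_load_band(600, 120))
```

The point is only that an oversized supply can land in the "underloaded" band just as easily as an undersized one can land in "overloaded".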

Yes, also keeping things clean and cool is a good move too.

Hope that helps someone.

Cheers.

Dave B.

PS:  I don't suppose anyone knows a really good, simple, blow by blow, total 
newbie guide on how to reliably and correctly create and set up jails 
on FreeBSD 8.0?   All the man pages I've found so far are way over my 
head.  Good reference material admittedly, but no good as an 
instructional if you don't already know How To...   I don't understand 
ezjail either...  Something to do with the faded grey cell and too many 
years etc...




___

Re: Strange system lockups - kernel saying disk error

2011-06-06 Thread Kaya Saman

[...]

PS:  I don't suppose anyone knows a really good, simple, blow by blow, total
newbie guide on how to reliably and correctly create and set up jails
on FreeBSD 8.0?   All the man pages I've found so far are way over my
head.  Good reference material admittedly, but no good as an
instructional if you don't already know How To...   I don't understand
ezjail either...  Something to do with the faded grey cell and too many
years etc...




___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
   


http://wiki.optiplex-networks.com/xwiki/bin/view/FreeBSD/Jails

Still a work in progress and running from a VM in a laptop on an ADSL 
line but it does the job :-)



Regards,


Kaya


[direct] Re: Strange system lockups - kernel saying disk error

2011-06-05 Thread Dave
 recovery and self contained AV disks, and also 
Memtest86, I carry a copy of SpinRite around with me too.

I just wish I could come up with something as successful, and able to 
continue selling over and over...

As for changing mobo caps, it's not difficult, but it sure takes a lot of 
time and care.  Caps in PSUs go bad too (usually the low voltage ones); 
again, not difficult to change, but take care.  There's often considerable 
high voltage stored in some places that can bite you, and it hurts!

Lastly, large slow running fans last the longest, and are nice and quiet 
too.  Just regularly blow the dust bunnies out of the systems (two or 
three times a year?) and keep things like the CPU cooler and PSU clean, 
and your hardware will work for many years just fine.

Oh..  CPU coolers.  If your system has the ability to monitor the CPU 
temperature, get to know how that behaves depending on the software you 
use.  If it starts to slowly rise, but the room temperature is not 
correspondingly warmer, and cleaning the dust from the cooler doesn't 
seem to help, it may need the cooler removing, the old heat transfer 
compound cleaning off, and fresh compound applying when you refit 
the cooler.   This issue seems worse with the earlier single core P4s, 
which had a very small contact area to the cooler.
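
The "CPU getting hotter while the room is not" check can be sketched in a few lines of Python.  This is only an illustration: the function name, the sample format and the 5 degree margin are assumptions made for the example, not figures from any vendor, and the readings have to be taken under similar load to mean anything.

```python
def cooler_needs_attention(samples, margin_c=5.0):
    """Return True if the CPU-over-room temperature gap has grown.

    samples: list of (cpu_temp_c, room_temp_c) pairs, oldest first,
             recorded under a comparable workload.
    margin_c: hypothetical threshold for how much the gap may grow
              before the cooler or its thermal compound is suspect.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")
    first_gap = samples[0][0] - samples[0][1]
    last_gap = samples[-1][0] - samples[-1][1]
    return (last_gap - first_gap) > margin_c


if __name__ == "__main__":
    # Made-up readings: the CPU climbs 10 C while the room climbs only 2 C.
    history = [(45.0, 20.0), (47.0, 21.0), (55.0, 22.0)]
    print(cooler_needs_attention(history))
```

Comparing the gap rather than the raw CPU reading keeps a hot summer day from looking like a failing cooler.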

At least Intel chips just slow down as they get hotter (cycle skipping) 
so as not to burn out.   Some AMDs will destroy themselves if the cooler 
fails!  There is a YouTube video somewhere, showing a PC with an 
Intel CPU with no cooler getting slower and slower till it almost stops.

I hope you get things sorted out, one way or another.  Life is so much 
nicer if you don't have to keep messing with the blessed things!

I have a sick Land Rover to fix too.  Gearbox rear oil seal, also rear 
drive shaft UJ's.   At least I can use big hammers on that sometimes...   
(Therapy!)   Oh, the grass needs cutting, and I'm now also under 
instruction to change the bed, when the cat's finished sleeping on it!!!

Best Regards.  

Dave B.


On 4 Jun 2011 at 21:35, Kaya Saman wrote:

Subject: Re: Strange system lockups - kernel saying disk error

 
 [...] 
 
 
 
 Hmmm Hard drives do not like heat!   Check the PSU voltages with a
 meter, for accuracy and ripple.  Failing SMPS's can do all sorts
 of odd things.
 
 Capacitor problems.  Been there done that.  They can be changed
 for very low cost, other than your time.
 
 DaveB
 
 You might guess by now, I know far more about hardware than I do
 about software, but for the latter to run well, the former must be
 good.
 
 
 
 Many thanks Dave for all the suggestions!!!
 
 To be honest I think the drives are fine but the system is just so
 old including the IDE drives.
 
 I mean if I get a SATA/IDE USB adapter I should be able to backup the
 drives to the new DAS system I will have in place shortly since I am
 much more in favor of running Nexenta Core 3 OS with ZFS spanning the
 16x drives meaning a total of 36TB with 2 internal drives used for
 logging and caching.
 
 Then this system will be obsolete. However, I will keep your
 suggestion of using SpinRite in mind next time I encounter issues!
 
 BTW I respect your H/W knowledge, which is quite in-depth :-) thank you
 for your insight.
 
 just an observation demon.co.uk :-) used to be my old ISP til I went
 with Pipex which is now bust, then I moved out of the UK and now
 everything is roasting hot
 
 
 Best regards,
 
 
 Kaya
 
 
 




Re: Strange system lockups - kernel saying disk error

2011-06-05 Thread Kaya Saman

On 06/05/2011 03:48 AM, per...@pluto.rain.com wrote:

Kaya Saman kayasa...@gmail.com wrote:

   

Did you apply any updates shortly before it started to fail?
   

No updates! I did however, install unrar through ports.
 

Intuitively, that seems unlikely to have triggered the problem.
   


This doesn't sound like an issue to me either as it wouldn't touch the 
kernel or any modules.


   

I remember on other boards that went on me in the past with
capacitor issues, a bunch of orange stuff starts leaking out
of them when they blow up.
 

A leaking capacitor has surely gone bad, but the syndrome I'm
thinking of is more subtle.  The top of the can, which should
be flat, bulges upward a little bit.

Whether replacing bad capacitors qualifies as quick depends
on how comfortable you are using a soldering iron.  It does
generally require taking the board out of the case, which may
or may not be quick or easy depending on the case design.
   


I have a degree in Electronic Engineering :-) - though no soldering iron :-(

   

Also the chassis doesn't have any cooling fans either since it was
bought extremely cheaply by the family member but not sure that's
the culprit neither power problems as the system has run in high
outside ambient temps in the past with no A/C in the room and also
was working fine on the PSU installed with the 4 disks.
 

Fans that were never there can't have suddenly failed :)
   


Odd that isn't it :-P


Power supplies do fail occasionally, and not always in obvious
ways such as failing to turn on at all.  The output voltages may
be a little too high or too low, or they may be correct but with
excessive ripple or electrical noise; or the supply may be just
fine until a disk draws a current spike to move the arm rapidly.
   


This needs either a voltmeter or oscilloscope to check out the voltages, 
fluctuations, and ripple.


None of those at home :-(

Man, what am I doing with 2 racks and no tools to fix things???


It might be worth checking the fan mounted on the CPU heatsink if
there is one, and the fan in the power supply (which ventilates the
case as well as the power supply itself).
   


CPU fan works - at least it spins, fan in PSU not checked as I'd need to 
open it as it's a PS/2 design if not mistaken!



But all these tips would be useful for a system that was given more 
value than mine. If I had actually paid for the system and it had been 
quite advanced, it would definitely be worth taking everything into account.



Regards,


Kaya



Re: [direct] Re: Strange system lockups - kernel saying disk error

2011-06-05 Thread Kaya Saman
[...]
   


Thanks Dave for this very graphic and insightful story :-)

It was a pleasure to read and a nice display of how experience really 
does prevail over things!!!



I liked the radio chart on the site provided :-) - what exactly is it 
measuring? Background noise?



I think not having a UPS for over a year killed me, with the power 
cutting out almost every weekend for 10 - 20 minutes/night. Now I have 
UPS, 2x 1500VA APC systems... nice but need the network and temp 
monitoring cards. Need plenty of £££ for that! Plus the new server I am 
intending to build as the DAS box already cost $2000.



Regards,


Kaya


Re: Strange system lockups - kernel saying disk error

2011-06-05 Thread Michael Powell
per...@pluto.rain.com wrote:
[snip]
 
 Power supplies do fail occasionally, and not always in obvious
 ways such as failing to turn on at all.  The output voltages may
 be a little too high or too low, or they may be correct but with
 excessive ripple or electrical noise; or the supply may be just
 fine until a disk draws a current spike to move the arm rapidly.

I've seen a fair number of power supplies degrade somewhere around the 5 
year mark. Simple voltage checks with a VOM, within its accuracy, will usually 
still show the voltages as being correct. To see the ripple you'll need an 
oscilloscope. Excessive ripple can make a PC appear to have all kinds of 
intermittent hardware failures with little or no rhyme or reason. A degraded 
power supply will show large variations in ripple based on load. The largest 
load from hard drives is when they are first spinning up. Servers are 
commonly configured with the ability to spin up drives one at a time with a 
short delay in between. You won't usually find this on a desktop. 

Generally, this situation develops more often on an old machine that had 
a 'barely enough' capacity power supply when new. Add 3 more hard drives, 
bigger video, etc. and it was still just inside the envelope until enough 
time went by and the power supply got old. Since the hard drives pull the 
most amps at power up, you will see the ripple on a 'scope look 
really ugly while this happens. The unseen danger here is that bits on the 
drive(s) can get scrambled until things settle down. You will know this 
is happening when stuff goes wrong and fsck is needed to get the file system 
clean, and after cleaning it works again, only to do the same thing at 
some future reboot.
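
To put rough numbers on the spin-up point, here is a small Python sketch comparing the worst-case 12V draw when all drives start together with the staggered case.  The current figures in the example (2.0 A while spinning up, 0.5 A once running) are hypothetical placeholders; check the drive datasheets for real values.

```python
def peak_12v_amps(n_drives, spinup_amps, running_amps, staggered):
    """Estimate the peak 12V current drawn by the hard drives at power up.

    staggered=False: every drive spins up at the same moment.
    staggered=True:  one drive spins up at a time, so the worst case is the
                     last drive starting while the others are already running.
    """
    if n_drives < 1:
        raise ValueError("need at least one drive")
    if staggered:
        return spinup_amps + (n_drives - 1) * running_amps
    return n_drives * spinup_amps


if __name__ == "__main__":
    # Hypothetical figures for a 4 drive box.
    print(peak_12v_amps(4, 2.0, 0.5, staggered=False))
    print(peak_12v_amps(4, 2.0, 0.5, staggered=True))
```

With those assumed figures the simultaneous start asks for more than twice the current of the staggered one, which is the moment an aged supply's ripple is likely to be at its worst.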

Easiest way to look at this without a 'scope is to simply substitute a known 
good PSU of sufficient rating from a machine with no troubles. If all the 
random nonsense suddenly stops, you'll know. This is easiest for folks these 
days as those without an analog electronics background are unlikely to have 
an oscilloscope lying around. 

 It might be worth checking the fan mounted on the CPU heatsink if
 there is one, and the fan in the power supply (which ventilates the
 case as well as the power supply itself).

Aside from the fans themselves, dust buildup eventually plugs heat sinks, 
drastically reducing their ability to get rid of heat. When you get to this 
stage blowing them out with canned air can work wonders. My 2 servers at 
home sit on the floor and need this about once a year.

-Mike





Re: Strange system lockups - kernel saying disk error

2011-06-04 Thread Kaya Saman

Many thanks for the response!

On 06/04/2011 02:00 AM, per...@pluto.rain.com wrote:

Kaya Saman kayasa...@gmail.com wrote:

   

I have an ancient pre-HT PIV machine with 500MB RAM.
...
Everything was running fine until round about 2 days
ago when the system started locking up on me?

... is there any way to fix the kernel error quickly?
 

Did you apply any updates shortly before it started to fail?
   


No updates! I did however, install unrar through ports.


If not, this is likely to be a hardware problem.  I'd suggest
checking the power supply and the fans, running memtest86, and
taking a close look at the electrolytic filter capacitors on
the system board -- the last because it sounds as if this system
may be about the right age to have been built with some bad ones.
(If any of the capacitors are bulging, either those caps, or the
entire board, need to be replaced.)  Power and heat problems can
cause all sorts of strange symptoms.
   


I guess so. I did mention that the system was old, and I've been 
running it 24/7 online for the past year and a half since this box got 
passed down to me by a family member. It has a Gigabyte system board. Not 
sure about the capacitors; I'll check. I remember on other boards that went 
on me in the past with capacitor issues, a bunch of orange stuff started 
leaking out of them when they blew up.


Also, the chassis doesn't have any cooling fans, since it was 
bought extremely cheaply by the family member, but I'm not sure that's the 
culprit, nor power problems, as the system has run in high outside 
ambient temps in the past with no A/C in the room and was also working 
fine on the installed PSU with the 4 disks.


I guess it's hardware related somehow as something's blown up, either 
the PSU, system board or so..



As I explained in the beginning if there's no clear way to fix the 
problem easily then I'll wait a bit. - I have a 16 disk Promise DAS on 
the way and will build a server using a Chenbro industrial rack chassis 
and Supermicro AMD based 8-12 core system board. These systems will fit 
better in the 2 racks I have in my living room. This should be a bit 
more stable and also give me higher capacity too!



Regards,


Kaya



Re: Strange system lockups - kernel saying disk error

2011-06-04 Thread Dave
On 3 Jun 2011 at 15:09, Kaya Saman wrote:

 Hi,
 
 I have an ancient pre-HT PIV machine with 500MB RAM.
 
 The system has an extra PCI-SATA card installed so I can  make use of
 modern high capacity drives.
 
 Everything was running fine until round about 2 days ago when the
 system started locking up on me?
 
 
 Current drive configuration for the system is:
 
 40GB IDE drive as root (ad2) - UFS2
 500GB IDE drive for storage (ad3) - EXT3
 1TB SATA drive for storage (ad4) - UFS2
 750GB SATA drive for storage (ad8) - EXT3
 
 I had an issue with the 750GB drive which the file system seemed to
 have got corrupted so I powered down and backed the information up to
 a 2TB SATA drive using ddrescue and the Gentoo Linux based System
 Rescue CD. I put the 2TB drive in place of the 1TB ad4 drive
 physically.
 
 Once backed up I powered down again and re-installed the 1TB SATA
 drive into ad4 position on system and completely removed the 2TB
 backup.
 
 When booted back into FreeBSD upon boot I received this error:
 
 
   WARNING:  Kernel Errors Present
  ad4: FAILURE - WRITE_DMA48 status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=1 ...:  1 Time(s)
  g_vfs_done():ad4e[WRITE(offset=97691456, length=16384)]error = 5 ...:  1 Time(s)
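
For readers unfamiliar with these messages: the status=51 and error=4 values are hexadecimal bitmasks from the drive's ATA status and error registers, and the names after them (READY, DSC, ERROR, ABORTED) are the bits that are set.  The Python sketch below shows the decoding.  The flag tables are deliberately partial and the helper name is made up for this example; bits not listed are reported as unknown rather than guessed.

```python
# Partial tables: only bits whose meaning is well established are listed.
STATUS_FLAGS = [
    (0x80, "BUSY"),
    (0x40, "READY"),
    (0x10, "DSC"),      # drive seek complete
    (0x08, "DRQ"),      # data request
    (0x01, "ERROR"),
]
ERROR_FLAGS = [
    (0x80, "ICRC"),           # interface CRC error
    (0x40, "UNCORRECTABLE"),  # uncorrectable data error
    (0x10, "ID_NOT_FOUND"),   # requested sector not found
    (0x04, "ABORTED"),        # command aborted by the drive
]


def decode_register(hex_value, flags):
    """Turn a hex register value such as '51' into a list of flag names."""
    value = int(hex_value, 16)
    names = [name for bit, name in flags if value & bit]
    known_mask = sum(bit for bit, _ in flags)
    leftover = value & ~known_mask
    if leftover:
        names.append("unknown(0x%02x)" % leftover)
    return names


if __name__ == "__main__":
    print(decode_register("51", STATUS_FLAGS))
    print(decode_register("4", ERROR_FLAGS))
```

In other words the drive was ready and reported an error because it aborted the write command; that narrows things down to the drive, cable, controller or power rather than the file system.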
 
 
 The current status of the disks seemed to be ok though:
 
   1 Time(s): ad2: 38166MB <Seagate ST340014A 3.06> at ata1-master UDMA33
   1 Time(s): ad2: DMA limited to UDMA33, controller found non-ATA66 cable
   1 Time(s): ad3: 476940MB <Seagate ST3500630A 3.AAF> at ata1-slave UDMA33
   1 Time(s): ad3: DMA limited to UDMA33, controller found non-ATA66 cable
   1 Time(s): ad4: 953869MB <SAMSUNG HD103SJ 1AJ10001> at ata2-master SATA150
   1 Time(s): ad8: 715404MB <Seagate ST3750640AS 3.AAE> at ata4-master SATA150
   1 Time(s): agp0: <SiS 651 host to AGP bridge> on hostb0
   1 Time(s): ata0: <ATA channel 0> on atapci0
   1 Time(s): ata0: [ITHREAD]
   1 Time(s): ata1: <ATA channel 1> on atapci0
   1 Time(s): ata1: [ITHREAD]
   1 Time(s): ata2: <ATA channel 0> on atapci1
   1 Time(s): ata2: [ITHREAD]
   1 Time(s): ata3: <ATA channel 1> on atapci1
   1 Time(s): ata3: [ITHREAD]
   1 Time(s): ata4: <ATA channel 2> on atapci1
   1 Time(s): ata4: [ITHREAD]
   1 Time(s): ata5: <ATA channel 3> on atapci1
 
 
 In order to test if the error was due to disk failure I powered down
 and disconnected the ad4 and ad3 disks and powered back up.
 
 
 The system still seems to be locking on me and I can't understand why?
 
 
 Through Googling I discovered a post by Jeremy Chadwick about these
 kinds of errors:
 
 http://wiki.freebsd.org/JeremyChadwick/ATA_issues_and_troubleshooting
 
 however since the system board is pre-SATA it doesn't even have 
 S.M.A.R.T. so I'm totally lost on how to fix this. I mean the best
 remedy would be to get a new computer and migrate the stored
 information (something like this is on the way) but currently I don't
 have access to any of the disks at all and to make matters worse no
 NTP or DNS server as I was running these services on the same machine
 or TFTP boot server for my IP phones. - I do run multiboot UNIX on my
 notebook so Bind9 is naturally installed hence me writing this but I
 only activate in emergencies.
 
 I mean one way I thought of for fixing this would be to grab a USB -
 ATA/SATA adapter:
 
 http://www.startech.com/product/USB2SATAIDE-USB-20-to-IDE-or-SATA-Adapter-Cable
 
 and hook the drives up to both Linux and FreeBSD in my notebook and
 copy the information across to the new system when it arrives in a few
 months.
 
 
 Aside from that, is there any way to fix the kernel error quickly?
 
 
 Thanks,
 
 
 Kaya
 

Hmmm...  No backups then?

First, check the drive data cables.  Many do fail with age.  Some SATA 
types are made with aluminium, not copper, and are extremely fragile when 
they age.   If that doesn't shed some light...

Take a look at http://www.grc.com/spinrite.htm

It will often restore a failing drive to full use, if it's not mechanically 
damaged.   It can take time though, if any sector corruption is very bad.  
Days, weeks, even months have been seen in some cases, but if the software 
keeps going, it usually does the job.

It's not a Windows program; if anything it's a DOS program, but it comes 
with its own FreeDOS system to boot and run from, so you don't even need an 
OS on the machine to test!   It will work with IDE or SATA types, even 
over a USB adapter if needed (but then it can't access any SMART data the 
drive may have), though it'll run a lot slower as it won't be aware of the 
drive's detailed physical timing etc.

I've used it on Windows and Linux machines in anger, and on the FreeBSD box 
when I got it (an old Gateway E-1400) to make sure the drive was healthy.

It's the hard drive equivalent of Memtest86, and you know how good that 
is.

Even if it doesn't report any problems found, often it will cause the 
drive to maintain things itself, improving performance as a result.

Even if the recovered drive is still less than 100% happy, or some of 

Re: Strange system lockups - kernel saying disk error

2011-06-04 Thread Dave
On 4 Jun 2011 at 10:52, Kaya Saman wrote:

 Many thanks for the response!
 
 On 06/04/2011 02:00 AM, per...@pluto.rain.com wrote:
  Kaya Saman kayasa...@gmail.com wrote:
 
 
  I have an ancient pre-HT PIV machine with 500MB RAM.
  ...
  Everything was running fine until round about 2 days
  ago when the system started locking up on me?
 
  ... is there any way to fix the kernel error quickly?
   
  Did you apply any updates shortly before it started to fail?
 
 
 No updates! I did however, install unrar through ports.
 
  If not, this is likely to be a hardware problem.  I'd suggest
  checking the power supply and the fans, running memtest86, and
  taking a close look at the electrolytic filter capacitors on
  the system board -- the last because it sounds as if this system may
  be about the right age to have been built with some bad ones. (If
  any of the capacitors are bulging, either those caps, or the entire
  board, need to be replaced.)  Power and heat problems can cause all
  sorts of strange symptoms.
 
 
 I guess, I mean I did mention that the system was old and also I've
 been running in 24/7 online for the past year and half as this box got
 passed down to me by a family member. It has a Gigabyte system board.
 Not sure about the capacitors; I'll check. I remember on other boards
 that went on me in the past with capacitor issues, a bunch of orange
 stuff starts leaking out of them when they blow up.
 
 Also the chassis doesn't have any cooling fans either since it was
 bought extremely cheaply by the family member but not sure that's the
 culprit neither power problems as the system has run in high outside
 ambient temps in the past with no A/C in the room and also was working
 fine on the PSU installed with the 4 disks.
 
 I guess it's hardware related somehow as something's blown up, either
 the PSU, system board or so..
 
 
 As I explained in the beginning if there's no clear way to fix the
 problem easily then I'll wait a bit. - I have a 16 disk Promise DAS on
 the way and will build a server using a Chenbro industrial rack
 chassis and Supermicro AMD based 8-12 core system board. These systems
 will fit better in the 2 racks I have in my living room. This should
 be a bit more stable and also give me higher capacity too!
 
 
 Regards,
 
 
 Kaya
 
 
 

Hmmm  Hard drives do not like heat!   Check the PSU voltages with a 
meter, for accuracy and ripple.  Failing SMPS's can do all sorts of odd 
things.

Capacitor problems.  Been there done that.  They can be changed for very 
low cost, other than your time.

DaveB

You might guess by now, I know far more about hardware than I do about 
software, but for the latter to run well, the former must be good.



Re: Strange system lockups - kernel saying disk error

2011-06-04 Thread Kaya Saman

[...]


Hmmm  Hard drives do not like heat!   Check the PSU voltages with a
meter, for accuracy and ripple.  Failing SMPS's can do all sorts of odd
things.

Capacitor problems.  Been there done that.  They can be changed for very
low cost, other than your time.

DaveB

You might guess by now, I know far more about hardware than I do about
software, but for the latter to run well, the former must be good.

   


Many thanks Dave for all the suggestions!!!

To be honest I think the drives are fine but the system is just so 
old including the IDE drives.


I mean, if I get a SATA/IDE USB adapter I should be able to back up the 
drives to the new DAS system I will have in place shortly, since I am 
much more in favor of running Nexenta Core 3 with ZFS spanning the 
16 drives, for a total of 36TB, with 2 internal drives used for 
logging and caching.


Then this system will be obsolete. However, I will keep your suggestion 
of using SpinRite in mind next time I encounter issues!


BTW I respect your H/W knowledge, which is quite in-depth :-) Thank you for 
your insight.


Just an observation: demon.co.uk :-) used to be my old ISP until I went 
with Pipex, which is now bust; then I moved out of the UK and now 
everything is roasting hot.



Best regards,


Kaya


Re: Strange system lockups - kernel saying disk error

2011-06-04 Thread perryh
Kaya Saman kayasa...@gmail.com wrote:

  Did you apply any updates shortly before it started to fail?

 No updates! I did however, install unrar through ports.

Intuitively, that seems unlikely to have triggered the problem.

 I remember on other boards that went on me in the past with
 capacitor issues, a bunch of orange stuff starts leaking out
 of them when they blow up.

A leaking capacitor has surely gone bad, but the syndrome I'm
thinking of is more subtle.  The top of the can, which should
be flat, bulges upward a little bit.

Whether replacing bad capacitors qualifies as quick depends
on how comfortable you are using a soldering iron.  It does
generally require taking the board out of the case, which may
or may not be quick or easy depending on the case design.

 Also the chassis doesn't have any cooling fans, since it was bought
 extremely cheaply by the family member, but I'm not sure that's the
 culprit, nor power problems, as the system has run in high outside
 ambient temps in the past with no A/C in the room and was also
 working fine on the installed PSU with the 4 disks.

Fans that were never there can't have suddenly failed :)

Power supplies do fail occasionally, and not always in obvious
ways such as failing to turn on at all.  The output voltages may
be a little too high or too low, or they may be correct but with
excessive ripple or electrical noise; or the supply may be just
fine until a disk draws a current spike to move the arm rapidly.

It might be worth checking the fan mounted on the CPU heatsink if
there is one, and the fan in the power supply (which ventilates the
case as well as the power supply itself).


Re: Strange system lockups - kernel saying disk error

2011-06-03 Thread perryh
Kaya Saman kayasa...@gmail.com wrote:

 I have an ancient pre-HT PIV machine with 500MB RAM.
 ...
 Everything was running fine until round about 2 days
 ago when the system started locking up on me?

 ... is there any way to fix the kernel error quickly?

Did you apply any updates shortly before it started to fail?

If not, this is likely to be a hardware problem.  I'd suggest
checking the power supply and the fans, running memtest86, and
taking a close look at the electrolytic filter capacitors on
the system board -- the last because it sounds as if this system
may be about the right age to have been built with some bad ones.
(If any of the capacitors are bulging, either those caps, or the
entire board, need to be replaced.)  Power and heat problems can
cause all sorts of strange symptoms.


Re: disk error / reboot / 6.3

2008-12-28 Thread jerome
Hi Paul,

The patch worked (almost). 

At first, when a program accessed a disk that reported an uncorrectable error, 
the program just segfaulted.

Another instance led to a situation where I was only able to ping the server.
No ssh or console access was possible anymore.

-Pat
  _  

From: Paul B. Mahol [mailto:one...@gmail.com]
To: jerome [mailto:jer...@code-monkey.nl]
Cc: freebsd-questions@freebsd.org
Sent: Mon, 22 Dec 2008 13:15:12 +0100
Subject: Re: disk error / reboot / 6.3

On 12/22/08, jerome jer...@code-monkey.nl wrote:
   Hi Paul,
  
   The server resets while running, like pressing the reset button...
  
  Try this patch:
  
  --- src/sys/dev/ata/ata-queue.c 2008/10/27 09:26:24 1.74
  +++ src/sys/dev/ata/ata-queue.c 2008/11/27 03:37:46 1.75
  @@ -357,7 +357,7 @@ ata_completed(void *context, int dummy)
                     "\6MEDIA_CHANGED\5NID_NOT_FOUND"
                     "\4MEDIA_CHANGE_REQEST"
                     "\3ABORTED\2NO_MEDIA\1ILLEGAL_LENGTH");
  -           if ((request->flags & ATA_R_DMA) &&
  +           if ((request->flags & ATA_R_DMA) && request->dma &&
                  (request->dma->status & ATA_BMSTAT_ERROR))
                  printf(" dma=0x%02x", request->dma->status);
              if (!(request->flags & (ATA_R_ATAPI | ATA_R_CONTROL)))
  
  -- 
  Paul



Re: disk error / reboot / 6.3

2008-12-28 Thread Paul B. Mahol
On 12/28/08, jerome jer...@code-monkey.nl wrote:
 Hi Paul,

 The patch worked (almost).

 At first, when a program accessed a disk that reported an uncorrectable error,
 the program just segfaulted.

 Another instance led to a situation where I was only able to ping the
 server.
 No ssh or console access was possible anymore.

That is somewhat to be expected; the point of the patch is to fix the panic, not
trashing due to a faulty disk, driver, or something else ...

-- 
Paul


Re: disk error / reboot / 6.3

2008-12-22 Thread Paul B. Mahol
On 12/22/08, jerome jer...@code-monkey.nl wrote:
 Hi Paul,

 The server resets while running, like pressing the reset button...

Try this patch:

--- src/sys/dev/ata/ata-queue.c 2008/10/27 09:26:24 1.74
+++ src/sys/dev/ata/ata-queue.c 2008/11/27 03:37:46 1.75
@@ -357,7 +357,7 @@ ata_completed(void *context, int dummy)
                   "\6MEDIA_CHANGED\5NID_NOT_FOUND"
                   "\4MEDIA_CHANGE_REQEST"
                   "\3ABORTED\2NO_MEDIA\1ILLEGAL_LENGTH");
-           if ((request->flags & ATA_R_DMA) &&
+           if ((request->flags & ATA_R_DMA) && request->dma &&
                (request->dma->status & ATA_BMSTAT_ERROR))
                printf(" dma=0x%02x", request->dma->status);
            if (!(request->flags & (ATA_R_ATAPI | ATA_R_CONTROL)))

-- 
Paul


Re: disk error / reboot / 6.3

2008-12-22 Thread jerome
Hi Paul,

Ok, thanks.
Will let you know the outcome.

-Jerome
  _  

From: Paul B. Mahol [mailto:one...@gmail.com]
To: jerome [mailto:jer...@code-monkey.nl]
Cc: freebsd-questions@freebsd.org
Sent: Mon, 22 Dec 2008 13:15:12 +0100
Subject: Re: disk error / reboot / 6.3

On 12/22/08, jerome jer...@code-monkey.nl wrote:
   Hi Paul,
  
   The server resets while running, like pressing the reset button...
  
  Try this patch:
  
  --- src/sys/dev/ata/ata-queue.c 2008/10/27 09:26:24 1.74
  +++ src/sys/dev/ata/ata-queue.c 2008/11/27 03:37:46 1.75
  @@ -357,7 +357,7 @@ ata_completed(void *context, int dummy)
                     "\6MEDIA_CHANGED\5NID_NOT_FOUND"
                     "\4MEDIA_CHANGE_REQEST"
                     "\3ABORTED\2NO_MEDIA\1ILLEGAL_LENGTH");
  -           if ((request->flags & ATA_R_DMA) &&
  +           if ((request->flags & ATA_R_DMA) && request->dma &&
                  (request->dma->status & ATA_BMSTAT_ERROR))
                  printf(" dma=0x%02x", request->dma->status);
              if (!(request->flags & (ATA_R_ATAPI | ATA_R_CONTROL)))
  
  -- 
  Paul



disk error / reboot / 6.3

2008-12-21 Thread jerome
 
  
Hi,  
   
We are running 6.3 on a fileserver with a couple of data disks.  
   
Once the server encounters an error on a data disk (os disk is separate) the 
server will reset itself without warning.  
   
We can usually identify the problem disk with a smartctl, the disk will show 
'Offline uncorrectable errors'.  
   
The fact that the server reboots itself, is this normal? Can we prevent this 
from happening?  
The disks are attached to the on-board sata ports of the mainboard itself, so 
no (raid)controllers whatsoever.  
We also do not use software raid.  
   
Best regards  
   
Jerome


Re: disk error / reboot / 6.3

2008-12-21 Thread Paul B. Mahol
On 12/21/08, jerome jer...@code-monkey.nl wrote:
 Hi,

 We are running 6.3 on a fileserver with a couple of data disks.

 Once the server encounters an error on a data disk (os disk is separate) the
 server will reset itself without warning.

Did it just reset, or did it panic? There is a known panic on bad blocks in some
FreeBSD versions, but I don't think that regression hit 6.X.


-- 
Paul


Re: disk error / reboot / 6.3

2008-12-21 Thread jerome
Hi Paul,

The server resets while running, like pressing the reset button...

-Jerome
  _  

From: Paul B. Mahol [mailto:one...@gmail.com]
To: jerome [mailto:jer...@code-monkey.nl]
Cc: freebsd-questions@freebsd.org
Sent: Mon, 22 Dec 2008 00:35:04 +0100
Subject: Re: disk error / reboot / 6.3

On 12/21/08, jerome jer...@code-monkey.nl wrote:
   Hi,
  
   We are running 6.3 on a fileserver with a couple of data disks.
  
   Once the server encounters an error on a data disk (os disk is separate) the
   server will reset itself without warning.
  
  Did it just reset, or did it panic? There is a known panic on bad blocks in some
  FreeBSD versions, but I don't think that regression hit 6.X.
  
  
  -- 
  Paul



RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hello,

I'd like to ask your advice. We have RAID 1 / SATA turned on in BIOS.

A couple of days ago smartd let me know about a disk problem.

Jun 14 01:13:38 relay kernel: ad12: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=374468863
Jun 14 01:13:38 relay kernel: ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode
Jun 14 01:14:19 relay kernel: ad12: WARNING - WRITE_DMA taskqueue timeout - completing request directly
Jun 14 01:14:19 relay kernel: ad12: WARNING - WRITE_DMA48 freeing taskqueue zombie request
Jun 14 01:37:38 relay smartd[683]: Device: /dev/ad12, 1 Currently unreadable (pending) sectors
Jun 14 01:37:38 relay smartd[683]: Device: /dev/ad12, 1 Offline uncorrectable sectors


If I do smartctl -a /dev/ad12 I get

197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
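
For anyone who wants to watch these two counters from a script, here is a minimal sketch. It assumes the usual ten-column attribute table printed by smartctl -a, with each attribute on a single unwrapped line; the helper name raw_values is made up for this example and is not part of smartmontools.

```python
def raw_values(report, wanted=(197, 198)):
    """Return {attribute_name: raw_value} for the wanted SMART attribute IDs.

    Assumes the ten-column layout of the smartctl attribute table:
    ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
    """
    found = {}
    for line in report.splitlines():
        fields = line.split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header, blank line, or some other section
        if int(fields[0]) in wanted and fields[9].isdigit():
            found[fields[1]] = int(fields[9])
    return found


sample = """\
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
"""
print(raw_values(sample))
```

A non-zero raw value for either attribute is what smartd is reporting in the log lines above.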


My understanding is that RAID 1 no longer works because of this error. 
There is a bad sector on HD (Offline uncorrectable sectors) and the best 
we can do is replace the drive? Does it make sense to try to turn RAID 1 
on ignoring this error (however, this is done in BIOS so the machine 
would have to be taken down in order to do that)? It seems serious 
enough for me not to ignore it but then I know close to nothing about HDs.


Many thanks for your suggestions!


Zbigniew Szalbot


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Bill Moran
In response to Zbigniew Szalbot [EMAIL PROTECTED]:
 
 A couple of days ago smartd let me know about a disk problem.
 
 Jun 14 01:13:38 relay kernel: ad12: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=374468863
 Jun 14 01:13:38 relay kernel: ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode
 Jun 14 01:14:19 relay kernel: ad12: WARNING - WRITE_DMA taskqueue timeout - completing request directly
 Jun 14 01:14:19 relay kernel: ad12: WARNING - WRITE_DMA48 freeing taskqueue zombie request
 Jun 14 01:37:38 relay smartd[683]: Device: /dev/ad12, 1 Currently unreadable (pending) sectors
 Jun 14 01:37:38 relay smartd[683]: Device: /dev/ad12, 1 Offline uncorrectable sectors
 
 If I do smartctl -a /dev/ad12 I get
 
 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       1
 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       1
 
 My understanding is that RAID 1 no longer works because of this error. 
 There is a bad sector on HD (Offline uncorrectable sectors) and the best 
 we can do is replace the drive? Does it make sense to try to turn RAID 1 
 on ignoring this error (however, this is done in BIOS so the machine 
 would have to be taken down in order to do that)? It seems serious 
 enough for me not to ignore it but then I know close to nothing about HDs.

Replace the hard drive.  Every modern hard drive keeps extra space available
to remap bad sectors.  This happens magically behind the scenes without
you ever knowing about it.  Once you've hit uncorrectable errors, it means
your re-mappable sectors are used up, and that means the drive is on its
last legs.

-- 
Bill Moran
http://www.potentialtech.com


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Dear all,

Bill Moran:


My understanding is that RAID 1 no longer works because of this
error. There is a bad sector on HD (Offline uncorrectable sectors)
and the best we can do is replace the drive? Does it make sense to
try to turn RAID 1 on ignoring this error (however, this is done in
BIOS so the machine would have to be taken down in order to do
that)? It seems serious enough for me not to ignore it but then I
know close to nothing about HDs.


Replace the hard drive.  Every modern hard drive keeps extra space
available to remap bad sectors.  This happens magically behind the
scenes without you ever knowing about it.  Once you've hit
uncorrectable errors, it means your re-mappable sectors are used
up, and that means the drive is on its last legs.



Thank you Bill. One last question. RAID 1 is off now (degraded) and the 
hosting company is asking if I can try to bring it up (to check if it 
will work). They have given me this link 
http://www.freebsd.org/doc/en/books/handbook/raid.html. The problem is 
that as far as I understand we are not using gmirror but RAID 1 turned 
on in BIOS (although it is also software-based).


Thank you very much in advance!

Zbigniew Szalbot
www.lc-words.com



Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Manolis Kiagias

Zbigniew Szalbot wrote:

Dear all,

Bill Moran:


My understanding is that RAID 1 no longer works because of this
error. There is a bad sector on HD (Offline uncorrectable sectors)
and the best we can do is replace the drive? Does it make sense to
try to turn RAID 1 on ignoring this error (however, this is done in
BIOS so the machine would have to be taken down in order to do
that)? It seems serious enough for me not to ignore it but then I
know close to nothing about HDs.


Replace the hard drive.  Every modern hard drive keeps extra space
available to remap bad sectors.  This happens magically behind the
scenes without you ever knowing about it.  Once you've hit
uncorrectable errors, it means your re-mappable sectors are used
up, and that means the drive is on its last legs.



Thank you Bill. One last question. RAID 1 is off now (degraded) and 
the hosting company is asking if I can try to bring it up (to check if 
it will work). They have given me this link 
http://www.freebsd.org/doc/en/books/handbook/raid.html. The problem is 
that as far as I understand we are not using gmirror but RAID 1 turned 
on in BIOS (although it is also software-based).


Thank you very much in advance!

Zbigniew Szalbot
www.lc-words.com



Hey Zbigniew ;)

I understand you are using the ataraid (ar) driver. I always use 
gmirror, but it seems they pointed you to the right place in the handbook.

Look at section 18.4.3 - you would probably need to do something like:

# atacontrol list

From the list, get the ATA channel for /dev/ad12 which is the faulty 
one, e.g. ata2


Detach and re-attach (maybe this will reset the state of the drive)

atacontrol detach ata2
atacontrol attach ata2

atacontrol addspare ar0 ad12
atacontrol rebuild ar0

I've done more or less the same with gmirror when I had similar messages 
a few months back. It may work for a few hours/days but it will fail 
again. Have it replaced ASAP.
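
To keep the order of the steps straight, they can be written down as a small dry-run script. This is only a sketch: rebuild_plan is a made-up helper, it uses just the atacontrol subcommands named above, and it prints the commands rather than running them.

```python
def rebuild_plan(channel, array, disk):
    """Return the atacontrol steps for re-adding `disk` to `array`, in order."""
    return [
        ["atacontrol", "detach", channel],        # drop the channel ...
        ["atacontrol", "attach", channel],        # ... and probe it again
        ["atacontrol", "addspare", array, disk],  # offer the disk to the array
        ["atacontrol", "rebuild", array],         # start the mirror rebuild
    ]


# Dry run: print the commands so they can be reviewed before typing them in.
for command in rebuild_plan("ata2", "ar0", "ad12"):
    print(" ".join(command))
```

Printing first is deliberate: detaching the wrong channel on a degraded mirror would take the remaining good disk offline.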


Manolis



Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Wojciech Puchar


Replace the hard drive.  Every modern hard drive keeps extra space available
to remap bad sectors.  This happens magically behind the scenes without
you ever knowing about it.  Once you've hit uncorrectable errors, it means


No. Usually it means that there was an error when writing that sector, and 
later there is an error on read. The media may be good (it quite often is).


If you were right, I wouldn't have my disk still running one year after 
it had a whole block of uncorrectable errors.


I just rewrote those blocks and they are readable.

The drive HAS TO know about bad media to remap, and no HDDs today perform 
verification.



Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hello Manolis,

I understand you are using the ataraid (ar) driver. I always use 
gmirror, but it seems they pointed you to the right place in the handbook.

Look at section 18.4.3 - you would probably need to do something like:

# atacontrol list


ATA channel 6:
Master: ad12 ST3250310NS/SN04 Serial ATA v1.0
Slave:   no device present

ATA channel 0:
Master:  no device present
Slave:   no device present
ATA channel 1:
Master:  no device present
Slave:   no device present
ATA channel 2:
Master:  no device present
Slave:   no device present
ATA channel 3:
Master:  no device present
Slave:   no device present
ATA channel 4:
Master:  no device present
Slave:   no device present
ATA channel 5:
Master: ad10 ST3250310NS/SN04 Serial ATA v1.0
Slave:   no device present
ATA channel 6:
Master: ad12 ST3250310NS/SN04 Serial ATA v1.0
Slave:   no device present
ATA channel 7:
Master:  no device present
Slave:   no device present
ATA channel 8:
Master:  no device present
Slave:   no device present
ATA channel 9:
Master:  no device present
Slave:   no device present
ATA channel 10:
Master:  no device present
Slave:   no device present

So in this case it would be ata6? Sorry for asking confirmation for 
every step but it is just so new to me!
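
The lookup can also be done mechanically. The sketch below assumes the "ATA channel N:" / "Master:" / "Slave:" layout shown in the listing above; channel_for is a made-up helper name, and the sample is a shortened copy of that listing.

```python
import re


def channel_for(listing, device):
    """Return 'ataN' for the channel whose master or slave is `device`."""
    channel = None
    for line in listing.splitlines():
        header = re.match(r"\s*ATA channel (\d+):", line)
        if header:
            channel = "ata" + header.group(1)
            continue
        slot = re.match(r"\s*(?:Master|Slave):\s+(\S+)", line)
        if slot and slot.group(1) == device:
            return channel
    return None  # device not present in the listing


sample = """\
ATA channel 5:
    Master: ad10 ST3250310NS/SN04 Serial ATA v1.0
    Slave:   no device present
ATA channel 6:
    Master: ad12 ST3250310NS/SN04 Serial ATA v1.0
    Slave:   no device present
"""
print(channel_for(sample, "ad12"))  # ata6
```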


And thanks for the list of steps to perform!

Zbigniew Szalbot


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Erik Trulsson
On Mon, Jun 16, 2008 at 04:41:15PM +0200, Wojciech Puchar wrote:
 
  Replace the hard drive.  Every modern hard drive keeps extra space available
  to remap bad sectors.  This happens magically behind the scenes without
  you ever knowing about it.  Once you've hit uncorrectable errors, it means
 
 No. Usually it means that there was an error when writing that sector, and 
 later there is an error on read. The media may be good (it quite often is).
 
 If you were right, I wouldn't have my disk still running one year after 
 it had a whole block of uncorrectable errors.
 
 I just rewrote those blocks and they are readable.
 
 The drive HAS TO know about bad media to remap, and no HDDs today perform 
 verification.


Also, remapping can only happen if the error is encountered on a write
operation.  If there is an error on read the drive cannot remap, since
it does not know what data should be there.
(A good RAID implementation could however handle a read error by reading
the corresponding sector from the other disks(s) in the array and write it
back to the failing disk, probably causing it to remap the block.)

(Write errors are, however, usually a strong indication that the drive should
be replaced ASAP.)
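
The read-repair idea in the parenthesis above can be illustrated with a toy model. Everything here is hypothetical (the Disk class, the bad and remapped sets, mirror_read); it is not how ataraid or gmirror are implemented, only a sketch of the behaviour described: a failed read cannot be remapped, but a later write to the same sector can.

```python
class Disk:
    """Toy disk: sector number -> data, plus a set of unreadable sectors."""

    def __init__(self, sectors):
        self.sectors = dict(sectors)
        self.bad = set()        # sectors that fail on read
        self.remapped = set()   # sectors moved to a spare location

    def read(self, lba):
        if lba in self.bad:
            raise IOError("uncorrectable read error at LBA %d" % lba)
        return self.sectors[lba]

    def write(self, lba, data):
        # A write supplies the data, so the drive is free to remap now.
        if lba in self.bad:
            self.bad.discard(lba)
            self.remapped.add(lba)
        self.sectors[lba] = data


def mirror_read(primary, secondary, lba):
    """Read from primary; on error, fetch from secondary and write it back."""
    try:
        return primary.read(lba)
    except IOError:
        data = secondary.read(lba)
        primary.write(lba, data)  # gives the failing disk a chance to remap
        return data


failing = Disk({7: b"data"})
good = Disk({7: b"data"})
failing.bad.add(7)
print(mirror_read(failing, good, 7), sorted(failing.remapped))
```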



-- 
Insert your favourite quote here.
Erik Trulsson
[EMAIL PROTECTED]


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Manolis Kiagias

Zbigniew Szalbot wrote:

Hello Manolis,

I understand you are using the ataraid (ar) driver. I always use 
gmirror, but it seems they pointed you to the right place in the 
handbook.

Look at section 18.4.3 - you would probably need to do something like:

# atacontrol list


ATA channel 6:
Master: ad12 ST3250310NS/SN04 Serial ATA v1.0
Slave:   no device present

ATA channel 0:
Master:  no device present
Slave:   no device present
ATA channel 1:
Master:  no device present
Slave:   no device present
ATA channel 2:
Master:  no device present
Slave:   no device present
ATA channel 3:
Master:  no device present
Slave:   no device present
ATA channel 4:
Master:  no device present
Slave:   no device present
ATA channel 5:
Master: ad10 ST3250310NS/SN04 Serial ATA v1.0
Slave:   no device present
ATA channel 6:
Master: ad12 ST3250310NS/SN04 Serial ATA v1.0
Slave:   no device present
ATA channel 7:
Master:  no device present
Slave:   no device present
ATA channel 8:
Master:  no device present
Slave:   no device present
ATA channel 9:
Master:  no device present
Slave:   no device present
ATA channel 10:
Master:  no device present
Slave:   no device present

So in this case it would be ata6? Sorry for asking confirmation for 
every step but it is just so new to me!


And thanks for the list of steps to perform!

Zbigniew Szalbot



Yes, it is ata6
Give it a try, if the problem is serious enough, it will probably not 
even finish rebuild :(



Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Wojciech Puchar


(Write errors are, however, usually a strong indication that the drive should
be replaced ASAP.)


He got a read error... but your statement on its own is true, of course.


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hi Manolis,



Yes, it is ata6
Give it a try, if the problem is serious enough, it will probably not 
even finish rebuild :(


Detaching and attaching went well but when I issued
atacontrol addspare ar0 ad12
it said
atacontrol: ioctl(IOCATARAIDADDSPARE): Device busy

I am not sure if that means I should wait or rather that it is mission 
impossible?


Thanks!

Zbigniew Szalbot




Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Manolis Kiagias

Zbigniew Szalbot wrote:

Hi Manolis,



Yes, it is ata6
Give it a try, if the problem is serious enough, it will probably not 
even finish rebuild :(


Detaching and attaching went well but when I issued
atacontrol addspare ar0 ad12
it said
atacontrol: ioctl(IOCATARAIDADDSPARE): Device busy

I am not sure if that means I should wait or rather that it is mission 
impossible?


Thanks!

Zbigniew Szalbot


Try

atacontrol status ar0

Since you haven't actually removed/replaced ad12 you may simply have to 
continue with:


atacontrol rebuild ar0

but see what status says first.


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hello,

Manolis Kiagias:


Try

atacontrol status ar0


ar0: ATA RAID1 status: DEGRADED
 subdisks:
   0 ad10 ONLINE
   1  MISSING

Since you haven't actually removed/replaced ad12 you may simply have to 
continue with:


atacontrol rebuild ar0


I'll try it now. Thanks!

Zbigniew Szalbot




Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hello,

Manolis Kiagias:


Try

atacontrol status ar0

Since you haven't actually removed/replaced ad12 you may simply have to 
continue with:


atacontrol rebuild ar0


atacontrol rebuild ar0
atacontrol: ioctl(IOCATARAIDREBUILD): Input/output error

So it looks like it cannot be done?

Zbigniew Szalbot




Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Manolis Kiagias

Zbigniew Szalbot wrote:

Hello,

Manolis Kiagias:


Try

atacontrol status ar0


ar0: ATA RAID1 status: DEGRADED
 subdisks:
   0 ad10 ONLINE
   1  MISSING

Since you haven't actually removed/replaced ad12 you may simply have 
to continue with:


atacontrol rebuild ar0


I'll try it now. Thanks!

Zbigniew Szalbot


Ok, ad12 is missing, so it seems it was detached but not reattached.

try again:

atacontrol attach ata6

If this succeeds,

atacontrol addspare ar0 ad12
atacontrol rebuild ar0

If attach fails, then someone at the remote site may have to  physically 
detach / reattach the disk in question.



Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hello one last time,

Manolis Kiagias:


Ok, ad12 is missing, so it seems it was detached but not reattached.

try again:

atacontrol attach ata6


$ sudo atacontrol attach ata6
atacontrol: ioctl(IOCATAATTACH): File exists

Thank you all for a lot of suggestions!


Zbigniew Szalbot




Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Manolis Kiagias

Zbigniew Szalbot wrote:

Hello one last time,

Manolis Kiagias:


Ok, ad12 is missing, so it seems it was detached but not reattached.

try again:

atacontrol attach ata6


$ sudo atacontrol attach ata6
atacontrol: ioctl(IOCATAATTACH): File exists

Thank you all for a lot of suggestions!


Zbigniew Szalbot

As a last resort, you could also try:

atacontrol reinit ata6

and try reattaching again


Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Zbigniew Szalbot

Hello,



As a last resort, you could also try:

atacontrol reinit ata6

and try reattaching again


Thank you Manolis - you have been more than patient with me! 
Unfortunately, the result is still the same. OK. I am going to ask our 
hosting company to replace the drive. Again, many thanks for your help!


Zbigniew Szalbot




Re: RAID 1 / disk error / Offline uncorrectable sectors

2008-06-16 Thread Oliver Fromme
Bill Moran wrote:
  Zbigniew Szalbot wrote:
   [...]
   Jun 14 01:13:38 relay kernel: ad12: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=374468863
  [...]
  
  Replace the hard drive.  Every modern hard drive keeps extra space available
  to remap bad sectors.  This happens magically behind the scenes without
  you ever knowing about it.  Once you've hit uncorrectable errors, it means
  your re-mappable sectors are used up, and that means the drive is on its
  last legs.

That's not completely true.

When a disk drive encounters a bad sector during a read
operation, it will remember the bad sector address, but
it is unable to transparently remap the sector because it
doesn't know the correct contents of the sector.  So it
has to report the unrecoverable error to the OS, even if
there's still plenty of space for remapping sectors.

Upon the next write operation to a sector marked as bad,
the drive will finally remap it and write the data to a
spare location.

Therefore, getting uncorrectable errors does *not* mean
that the drive has used up its spare sectors.  You only
need to overwrite the bad sectors (e.g. with dd(1)) so the
drive gets a chance to remap them.
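
As a rough sketch of what such an overwrite could look like for the LBA reported in the log above: the helper below only builds a dd(1) command line and does not run it. It assumes 512-byte sectors and that the LBA is relative to the start of the whole disk device; the function name is made up. Note that the command destroys whatever was stored in that sector, so it only makes sense when the data is already lost or can be restored from elsewhere (the mirror, a backup).

```python
def dd_overwrite_args(device, lba, sector_size=512, count=1):
    """Build (but do not run) a dd command that zeroes `count` sectors at `lba`."""
    return [
        "dd",
        "if=/dev/zero",
        "of=" + device,
        "bs=%d" % sector_size,   # one block per sector (assumed 512 bytes)
        "seek=%d" % lba,         # skip this many output blocks before writing
        "count=%d" % count,
    ]


print(" ".join(dd_overwrite_args("/dev/ad12", 374468863)))
```

Building the command separately from running it leaves room to double-check the device name and the block number first, which matters when a typo overwrites the wrong sector.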

Of course, it might still be a good idea to replace the
drive anyway.  It depends on the cause of the bad sectors
(mechanical or electrical).

If you had a head crash (caused by mechanical impact or
a media manufacturing error or whatever), it is possible
that it caused debris within the drive which will cause
further bad blocks.  This can lead to a snowball effect
that can really exhaust all spare sectors quickly.

On the other hand, if the bad sectors were caused by
a voltage spike, a power failure or similar, chances are
that the drive is fine and you can continue to use it
after making sure that the bad sectors are remapped
(by overwriting them, see above).

Finally, there is also the possibility that the problem
is caused by a bug in the drive's firmware.  If that's
the case, I would be inclined to replace the drive with
a different brand.  However, I guess all drives have
bugs ...  the question is whether they affect you.
Another question is whether it's possible at all to
find out what caused the problem in the first place.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

What is this talk of 'release'?  We do not make software 'releases'.
Our software 'escapes', leaving a bloody trail of designers and quality
assurance people in its wake.


disk error

2008-02-16 Thread Peter Boosten

Hi all,

Just found these messages in my logfile. Is it something to worry about?
I've never seen them before upgrading to 6.3.

ra kernel: ad0: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=281550271
ra kernel: ad0: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=281550271
ra kernel: g_vfs_done():ad0s1f[READ(offset=138248126464, length=16384)]error = 5

ra kernel: handle_workitem_freeblocks: block count
ra kernel: handle_workitem_freeblks: got error 5 while accessing filesystem
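
To see whether the same block keeps failing, the ata messages can be pulled out of the log with a few lines of Python. This is a sketch under assumptions: each kernel message sits on one line (not wrapped as a mail client might do), and the helper name ata_events is made up.

```python
import re

EVENT = re.compile(r"(ad\d+): (TIMEOUT|FAILURE|WARNING) - (\S+)")
LBA = re.compile(r"LBA=(\d+)")


def ata_events(log):
    """Return (device, kind, command, lba) for each ata message in `log`."""
    events = []
    for line in log.splitlines():
        event = EVENT.search(line)
        if not event:
            continue  # not an ata TIMEOUT/FAILURE/WARNING line
        lba = LBA.search(line)
        events.append((event.group(1), event.group(2), event.group(3),
                       int(lba.group(1)) if lba else None))
    return events


sample = """\
ra kernel: ad0: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=281550271
ra kernel: ad0: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=281550271
"""
for event in ata_events(sample):
    print(event)
```

Repeated failures at one LBA point at a single bad sector; failures scattered across many LBAs are more worrying.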

Peter
--
http://www.boosten.org


Re: disk error

2008-02-16 Thread Brian A. Seklecki

On Sat, 2008-02-16 at 17:59 +0100, Peter Boosten wrote:
> Hi all,
>
> Just found these messages in my logfile. Is it something to worry about?
> I've never seen them before upgrading to 6.3.
>
> ra kernel: ad0: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=281550271
> ra kernel: ad0: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=281550271

Yea -- normally that means a bad sector(*), and where there's one,
there's bound to be more.  Failed drive eventually.

I would pull this server from rotation and run a full surface sector
scan on it (download an ISO of Hiren's Boot CD).

Or, if it's a geom mirror (RAID-1), test this component.

If it were SCSI, I would recommend camcontrol(8) to query the disk for a
list of grown defect sectors.

~BAS

*. If you've never seen it before and it developed.  Bad
cables/controllers/drives/interference can cause it too, but you would
have seen it from inception.

> ra kernel: g_vfs_done():ad0s1f[READ(offset=138248126464, length=16384)]error = 5
> ra kernel: handle_workitem_freeblocks: block count
> ra kernel: handle_workitem_freeblks: got error 5 while accessing filesystem
>
> Peter



Re: disk error

2008-02-16 Thread Brian A. Seklecki

On Sat, 16 Feb 2008, Peter Boosten wrote:


> Brian, thanks for your answer (and suggestion).
>
> Isn't a drive supposed to mark a bad sector as bad and ignore it (that is:
> not use it anymore)?
>
> --
> http://www.boosten.org

They ship with a certain number of unallocated sectors to reassign failed
ones to (I don't think ATA/IDE disks have a way to ask this, maybe SMART).

Once all of the silent allocations happen unbeknown to the user, then your
suffering starts.

Install smartmontools and check these values:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

~BAS


l8*
-lava (Brian A. Seklecki - Pittsburgh, PA, USA)
   http://www.spiritual-machines.org/

Guilty? Yeah. But he knows it. I mean, you're guilty.
You just don't know it. So who's really in jail?
~Maynard James Keenan



Re: disk error

2008-02-16 Thread Peter Boosten

Brian A. Seklecki wrote:

> On Sat, 2008-02-16 at 17:59 +0100, Peter Boosten wrote:
>> Hi all,
>>
>> Just found these messages in my logfile. Is it something to worry about?
>> I've never seen them before upgrading to 6.3.
>>
>> ra kernel: ad0: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=281550271
>> ra kernel: ad0: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=281550271
>
> Yea -- normally that means a bad sector(*), and where there's one,
> there's bound to be more.  Failed drive eventually.
>
> I would pull this server from rotation and run a full surface sector
> scan on it (download an ISO of Hiren's Boot CD)

Brian, thanks for your answer (and suggestion).

Isn't a drive supposed to mark a bad sector as bad and ignore it (that
is: not use it anymore)?


--
http://www.boosten.org


Re: disk error

2008-02-16 Thread Erik Trulsson
On Sat, Feb 16, 2008 at 07:30:37PM +0100, Peter Boosten wrote:
> Brian A. Seklecki wrote:
>> On Sat, 2008-02-16 at 17:59 +0100, Peter Boosten wrote:
>>> Hi all,
>>>
>>> Just found these messages in my logfile. Is it something to worry about?
>>> I've never seen them before upgrading to 6.3.
>>>
>>> ra kernel: ad0: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=281550271
>>> ra kernel: ad0: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=4<ABORTED> LBA=281550271
>>
>> Yea -- normally that means a bad sector(*), and where there's one,
>> there's bound to be more.  Failed drive eventually.
>>
>> I would pull this server from rotation and run a full surface sector
>> scan on it (download an ISO of Hiren's Boot CD)
>
> Brian, thanks for your answer (and suggestion).
>
> Isn't a drive supposed to mark a bad sector as bad and ignore it (that is:
> not use it anymore)?

The drive can only remap bad sectors when you write to them.  When you read
from a bad sector the drive does not know what data was supposed to be there
and thus can only return an error or return garbage data.  Returning an
error (which is what disks do) is a much better choice.





-- 
Insert your favourite quote here.
Erik Trulsson
[EMAIL PROTECTED]


uncorrectable disk error

2007-08-20 Thread Wojciech Puchar
ad4: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=465628608
g_vfs_done():ad4a[READ(offset=238401650688, length=638976)]error = 5

How can I find out which file (on UFS2) uses that block?



Block to i-node to file name (was Re: uncorrectable disk error)

2007-08-20 Thread cpghost
On Tue, Aug 21, 2007 at 01:04:38AM +0200, Wojciech Puchar wrote:
> ad4: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=465628608
> g_vfs_done():ad4a[READ(offset=238401650688, length=638976)]error = 5
>
> how can i find (UFS2) what file uses that block?

[I took the liberty to change the subject for better archival]

Unless you're an fs guru or very patient and careful, you'll probably
have a hard time. But don't give up yet!

Try the following procedure:

1. Determine the slice where the block is located (fdisk)

2. Determine the partition of the block (bsdlabel)

3. Calculate the partition-relative offset of the block
   (i.e. subtract the slice offset and subtract from the
   result the partition offset).

4. Fire up fsdb(8) with the -r option on that file system.

5. Use fsdb's findblk command with that fs-relative offset
   to determine the inode that is holding this block. From man fsdb:

 findblk disk block number ...
 Find the inode(s) owning the specified disk block(s) number(s).
 Note that these are not absolute disk blocks numbers, but offsets
 from the start of the partition.

   Keep in mind that the block could also be in the free list
   (unused); but you'd not get this error message if it was (?).

6. Verify that the resulting i-node number is the right one
   by jumping to that inode with the inode command of fsdb,
   and rechecking that this block is indeed held by this i-node
   with the blocks command of fsdb. (you may want to run fsdb
   in a script(1), to capture the potentially long list of blocks).

7. The inode number you get won't tell you the name of the file.
   To find this, scan all directories of that file system for this
   inode number (I'd write a small C proggy for that, but you could
   just as well use find(1)'s -inum switch).

   If your disk is dying, this can (whether with a C program or with
   find(1)) crash your system. If the number of directories is
   not very high, you could try to use fsdb(8) for that.

BEWARE: Always use fsdb(8) with the read-only flag -r! You could
irrevocably damage your file system otherwise if you don't know
exactly what you're doing.
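
Step 3 of the procedure is plain arithmetic. Here is a minimal sketch, assuming 512-byte sectors. The slice start and partition offset in the usage example (63 and 0) are made-up placeholders, not values from this thread; the real ones come from fdisk and bsdlabel. The second helper relies on the assumption that the offset= value printed by g_vfs_done() is a byte offset already relative to the named partition (ad4a above).

```python
SECTOR_SIZE = 512  # assumption: the disk uses 512-byte sectors


def partition_relative_sector(lba, slice_start, partition_offset):
    """Subtract the slice offset, then the partition offset (step 3)."""
    relative = lba - slice_start - partition_offset
    if relative < 0:
        raise ValueError("LBA lies before the start of the partition")
    return relative


def byte_offset_to_sector(offset_bytes):
    """Convert a partition-relative byte offset to a sector number."""
    return offset_bytes // SECTOR_SIZE


if __name__ == "__main__":
    # Hypothetical geometry: slice starts at sector 63, partition at offset 0.
    print(partition_relative_sector(465628608, 63, 0))
    # The offset= and length= values from the g_vfs_done() line in this thread.
    print(byte_offset_to_sector(238401650688))
    print(638976 // SECTOR_SIZE)
```

Note that the failed read covered 638976 bytes, i.e. 1248 sectors, so the bad sector may be anywhere in that range rather than at its first sector.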

Good luck!

Regards,
-cpghost.

P.S.: We really need a little LBA to i-node utility for UFS/UFS2,
that we could combine with find /fs -inum n...! If possible, a
utility that also takes care of GEOM-ified disks etc...
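
The find(1) -inum scan from step 7 can also be sketched in a few lines of Python. This is only an illustration of the idea, not the LBA-to-inode utility wished for above: it still needs the inode number from fsdb(8), and the function name is made up for this example. It stays on one file system, because inode numbers are only unique per file system.

```python
import os


def find_by_inode(root, inum):
    """Return all paths under root (same file system) with inode number inum."""
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in list(dirnames):
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                dirnames.remove(name)   # unreadable: do not descend
                continue
            if st.st_dev != root_dev:
                dirnames.remove(name)   # other file system: do not descend
                continue
            if st.st_ino == inum:
                matches.append(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue                # unreadable entry: skip it
            if st.st_dev == root_dev and st.st_ino == inum:
                matches.append(path)
    return matches
```

A file with several hard links shows up once per link, which matches what find /fs -inum n prints.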

-- 
Cordula's Web. http://www.cordula.ws/


Re: Disk Error - DUMP output.

2007-05-26 Thread Lowell Gilbert
Grant Peel [EMAIL PROTECTED] writes:

> Is there any way to figure out the files that are not being read using the
> DUMP error output below?
>
>   DUMP: read error from /dev/da0s1g: Input/output error: [block 42718592]: count=8192
>   DUMP: read error from /dev/da0s1g: Input/output error: [sector 42718594]: count=512
>   DUMP: read error from /dev/da0s1g: Input/output error: [block 42671366]: count=5120
>   DUMP: read error from /dev/da0s1g: Input/output error: [sector 42671371]: count=512

I had such a problem just last night.  I tracked it down by copying
directory trees within the filesystem to /dev/null until one failed.
Then I repeated the process one directory level down, narrowing down
the problem.  [It turned out to be my wife's incoming mail spool...]
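
The same idea can be automated instead of narrowing down by hand. A minimal sketch, assuming you simply want a list of files that fail to read; the function name and chunk size are illustrative choices, not anything from the thread.

```python
import os


def unreadable_files(root, chunk_size=1 << 20):
    """Read every regular file under root; return (path, reason) for failures."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, FIFOs, sockets and device nodes.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read and discard, like copying to /dev/null.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, exc.strerror))
    return failures
```

On a failing disk this reads every block of every file, so it carries the same risk as any full scan of a dying drive.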


Disk Error - DUMP output.

2007-05-25 Thread Grant Peel
Is there any way to figure out the files that are not being read using the DUMP
error output below?

  DUMP: read error from /dev/da0s1g: Input/output error: [block 42718592]: count=8192
  DUMP: read error from /dev/da0s1g: Input/output error: [sector 42718594]: count=512
  DUMP: read error from /dev/da0s1g: Input/output error: [block 42671366]: count=5120
  DUMP: read error from /dev/da0s1g: Input/output error: [sector 42671371]: count=512

-Grant


Finding an LBA after a disk error

2006-03-13 Thread Doug Hardie
After much revision I finally have a tool that does a pretty good job
of identifying the usage of an LBA.  It's not perfect, but it's
normally only used with a disk with a bad sector.  It no longer needs
the complete source distribution but can be built from the normal
libraries.  It has been tested on FreeBSD 5.3 and 6.0.  One of the
libraries it uses was introduced in 5.1, so it's not likely to work on
anything earlier.  It works on UFS1 and UFS2 formats and there is
even a man page now.  It could be made into a port, but I am out of
time right now.  A quick look at the documents for creating ports
shows that it will take quite a bit of time to figure out that part.
Contact me off-list if you would like to get it.



Fwd: Re: Disk error messages (ad0: HARD READ ERROR blk# xxxxxx)

2006-01-03 Thread Malcolm Kay
On Tue, 3 Jan 2006 12:25 pm, Gayn Winters wrote:
> > [mailto:[EMAIL PROTECTED] On Behalf Of Russell J. Wood
> > Sent: Monday, January 02, 2006 3:54 PM
> > To: freebsd-questions@freebsd.org
> > Subject: Re: Disk error messages (ad0: HARD READ ERROR blk# xx)
> >
> > On Mon, Jan 02, 2006 at 11:15:08PM +, [EMAIL PROTECTED] wrote:
> > > Hi there,
> > >
> > > On my screen, messages like the following keep coming up. I have to
> > > reboot multiple times to get it to boot up normally. Does this mean
> > > I have to replace the disk, which is relatively new (1-2 years)?
> > > Is there any simple way to fix it and avoid that time-consuming task?
> > >
> > > ad0: 39205MB Maxtor 6EX [79656/16/63] at ata0-master WDMA2
> > > ad0: HARD READ ERROR blk# 131199
> > > ad0: HARD READ ERROR blk# 131199 status=59 error=40
> > > ad0: DMA problem fallback to PIO mode
> > > ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> > > ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> > > ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> > > ad0: HARD READ ERROR blk# 131199 status=59 error=40
> > > ad0: HARD READ ERROR blk# 3473535 status=59 error=40
> > > ad0: HARD READ ERROR blk# 9240703 status=59 error=40
> > > ad0: HARD READ ERROR blk# 17367167 status=59 error=40
> > > ad0: HARD READ ERROR blk# 17760383 status=59 error=40
> >
> > I suspect that you have bad sectors on your hard disk drive
> > (and many of them). A good tool to use is Seagate's Seatools
> > (http://www.seagate.com/support/seatools/index.html). Just
> > burn the Seatools Desktop edition to CDROM and boot from it.
> >
> > - Russell
>
> After you've checked for loose cables, you might want to take
> the drive out and check it in another system (using the
> Seagate or other such tools).  If indeed the problem is with
> DMA, the drive might be ok but the MB is flaky.  Perhaps the
> PC or MB manufacturer has diagnostics with which you can zero
> in on the latter ugly possibility.  In any case, get yourself a
> backup asap (at least of the user data so that you can recover
> from a fresh installation.)  Unless you are getting other
> types of errors, it is probably still possible to copy the
> drive with dd using bs=512b, and this would be your quickest
> fix of a hard drive problem. Run fsck on your new disk after
> the copy.


dd is generally not a good choice for copying disks (although it 
does sort of work). The new disk will appear unclean when copied 
from a live fs and may in fact have an odd instance of a file 
which has not yet been physically updated. 

And it just takes too long since you copy empty space as well as 
real data.

Instead, slice and partition the new disk and run newfs on the new
partitions.

Then, for each fs on the disk, pipe dump (using the snapshot option)
through to restore.
I have (successfully) used this approach extensively for cloning 
systems.
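
The exact commands are not given above, so the following is only a sketch of what such a pipeline might look like, built as a string rather than executed. It assumes FreeBSD's dump(8) flags -0 (full dump), -L (snapshot of a live file system), -a (no tape-length estimate) and -f - (write to stdout), and restore(8) with -r -f - reading from stdin inside the freshly created, mounted target. The helper name and the mount points in the example are hypothetical.

```python
import shlex


def clone_pipeline(source_fs, target_mount):
    """Build (but do not run) a dump | restore pipeline for one file system."""
    dump_cmd = ["dump", "-0", "-L", "-a", "-f", "-", source_fs]
    restore_cmd = ["restore", "-r", "-f", "-"]
    return "{} | (cd {} && {})".format(
        shlex.join(dump_cmd),
        shlex.quote(target_mount),
        shlex.join(restore_cmd),
    )


if __name__ == "__main__":
    # Hypothetical source file system and target mount point.
    print(clone_pipeline("/usr", "/mnt/usr"))
```

Printing the command first and reviewing it before running it by hand is deliberate: a swapped source and target here would be destructive.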

Malcolm


Disk error messages (ad0: HARD READ ERROR blk# xxxxxx)

2006-01-02 Thread mtang
Hi there,

On my screen, messages like the following keep coming up. I have to
reboot multiple times to get it to boot up normally. Does this mean I have to
replace the disk, which is relatively new (1-2 years)? Is there any simple way
to fix it and avoid that time-consuming task?


ad0: 39205MB Maxtor 6EX [79656/16/63] at ata0-master WDMA2
ad0: HARD READ ERROR blk# 131199
ad0: HARD READ ERROR blk# 131199 status=59 error=40
ad0: DMA problem fallback to PIO mode
ad0: HARD READ ERROR blk# 11272319 status=59 error=40
ad0: HARD READ ERROR blk# 11272319 status=59 error=40
ad0: HARD READ ERROR blk# 11272319 status=59 error=40
ad0: HARD READ ERROR blk# 131199 status=59 error=40
ad0: HARD READ ERROR blk# 3473535 status=59 error=40
ad0: HARD READ ERROR blk# 9240703 status=59 error=40
ad0: HARD READ ERROR blk# 17367167 status=59 error=40
ad0: HARD READ ERROR blk# 17760383 status=59 error=40




Re: Disk error messages (ad0: HARD READ ERROR blk# xxxxxx)

2006-01-02 Thread Beech Rintoul
On Monday 02 January 2006 02:15 pm, [EMAIL PROTECTED] wrote:
> Hi there,
>
> On my screen, messages like the following keep coming up. I have to
> reboot multiple times to get it to boot up normally. Does this mean I have to
> replace the disk, which is relatively new (1-2 years)? Is there any simple way
> to fix it and avoid that time-consuming task?
>
> ad0: 39205MB Maxtor 6EX [79656/16/63] at ata0-master WDMA2
> ad0: HARD READ ERROR blk# 131199
> ad0: HARD READ ERROR blk# 131199 status=59 error=40
> ad0: DMA problem fallback to PIO mode
> ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> ad0: HARD READ ERROR blk# 131199 status=59 error=40
> ad0: HARD READ ERROR blk# 3473535 status=59 error=40
> ad0: HARD READ ERROR blk# 9240703 status=59 error=40
> ad0: HARD READ ERROR blk# 17367167 status=59 error=40
> ad0: HARD READ ERROR blk# 17760383 status=59 error=40

Check that your cables are tight. You might even try swapping your drive
cable. Other than that, it looks like your drive is failing. You do have
backups, don't you?

Beech
-- 

---
Beech Rintoul - System Administrator - [EMAIL PROTECTED]
/\   ASCII Ribbon Campaign  | NorthWind Communications
\ / - NO HTML/RTF in e-mail  | 201 East 9th Avenue Ste.310
 X  - NO Word docs in e-mail | Anchorage, AK 99501
/ \  - Please visit Alaska Paradise - http://akparadise.byethost33.com
---















Re: Disk error messages (ad0: HARD READ ERROR blk# xxxxxx)

2006-01-02 Thread Russell J. Wood
On Mon, Jan 02, 2006 at 11:15:08PM +, [EMAIL PROTECTED] wrote:
> Hi there,
>
> On my screen, messages like the following keep coming up. I have to
> reboot multiple times to get it to boot up normally. Does this mean I have to
> replace the disk, which is relatively new (1-2 years)? Is there any simple way
> to fix it and avoid that time-consuming task?
>
> ad0: 39205MB Maxtor 6EX [79656/16/63] at ata0-master WDMA2
> ad0: HARD READ ERROR blk# 131199
> ad0: HARD READ ERROR blk# 131199 status=59 error=40
> ad0: DMA problem fallback to PIO mode
> ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> ad0: HARD READ ERROR blk# 131199 status=59 error=40
> ad0: HARD READ ERROR blk# 3473535 status=59 error=40
> ad0: HARD READ ERROR blk# 9240703 status=59 error=40
> ad0: HARD READ ERROR blk# 17367167 status=59 error=40
> ad0: HARD READ ERROR blk# 17760383 status=59 error=40

I suspect that you have bad sectors on your hard disk drive (and many of
them). A good tool to use is Seagate's Seatools
(http://www.seagate.com/support/seatools/index.html). Just burn the
Seatools Desktop edition to CDROM and boot from it.

- Russell


RE: Disk error messages (ad0: HARD READ ERROR blk# xxxxxx)

2006-01-02 Thread Gayn Winters
> [mailto:[EMAIL PROTECTED] On Behalf Of Russell J. Wood
> Sent: Monday, January 02, 2006 3:54 PM
> To: freebsd-questions@freebsd.org
> Subject: Re: Disk error messages (ad0: HARD READ ERROR blk# xx)
>
> On Mon, Jan 02, 2006 at 11:15:08PM +, [EMAIL PROTECTED] wrote:
> > Hi there,
> >
> > On my screen, messages like the following keep coming up. I have to
> > reboot multiple times to get it to boot up normally. Does this mean
> > I have to replace the disk, which is relatively new (1-2 years)?
> > Is there any simple way to fix it and avoid that time-consuming task?
> >
> > ad0: 39205MB Maxtor 6EX [79656/16/63] at ata0-master WDMA2
> > ad0: HARD READ ERROR blk# 131199
> > ad0: HARD READ ERROR blk# 131199 status=59 error=40
> > ad0: DMA problem fallback to PIO mode
> > ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> > ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> > ad0: HARD READ ERROR blk# 11272319 status=59 error=40
> > ad0: HARD READ ERROR blk# 131199 status=59 error=40
> > ad0: HARD READ ERROR blk# 3473535 status=59 error=40
> > ad0: HARD READ ERROR blk# 9240703 status=59 error=40
> > ad0: HARD READ ERROR blk# 17367167 status=59 error=40
> > ad0: HARD READ ERROR blk# 17760383 status=59 error=40
>
> I suspect that you have bad sectors on your hard disk drive
> (and many of them). A good tool to use is Seagate's Seatools
> (http://www.seagate.com/support/seatools/index.html). Just burn the
> Seatools Desktop edition to CDROM and boot from it.
>
> - Russell

After you've checked for loose cables, you might want to take the drive
out and check it in another system (using the Seagate or other such
tools).  If indeed the problem is with DMA, the drive might be ok but
the MB is flaky.  Perhaps the PC or MB manufacturer has diagnostics with
which you can zero in on the latter ugly possibility.  In any case, get
yourself a backup asap (at least of the user data so that you can
recover from a fresh installation.)  Unless you are getting other types
of errors, it is probably still possible to copy the drive with dd using
bs=512b, and this would be your quickest fix of a hard drive problem.
Run fsck on your new disk after the copy.

Good luck,

-gayn

Bristol Systems Inc.
714/532-6776
www.bristolsystems.com 




Re: Non-system disk or disk error

2005-11-07 Thread Portie Owner
I don't know if this is poor netiquette or not, but I am
bumping my own question in case anyone missed it.
Basically, my 5.4 installation will not boot up from a
warm reboot but will boot with no problems from a
power-off situation.  Thanks.

--- Portie Owner [EMAIL PROTECTED] wrote:

> I am sure there is an easy solution to this but here
> is my problem, and it is driving me nuts.
>
> The error message on boot is "Non-system disk or disk
> error".  I only get this message if I do a warm reboot
> with no power off.  If I halt the system and power off
> and restart, it boots right up.
>
> Computer is a Compaq AP500 (P-II 450MHz, 700MB RAM,
> Adaptec SCSI card).  The system has two SCSI drives,
> C: which is at ID 1 and D: which is at ID 2.  The
> OS is FreeBSD 5.4, standard installation using the
> FreeBSD-only boot manager (I also tried the alternate
> FreeBSD boot choice).  No other OSs reside on the
> machine and I have tried to start with a clean DOS
> Fdisked machine before installing FreeBSD.  The PC
> does not have the Compaq BIOS partition installed but
> that does not seem to matter.  I have not been able to
> upgrade the ROM BIOS on this machine, but the Compaq
> Diagnostics and Setup programs seem to work and report
> the right information about the disks.  I even tried
> disabling floppy and CD media boot but that didn't
> help either.
>
> Thanks, Portie







Non-system disk or disk error

2005-11-04 Thread Portie Owner
I am sure there is an easy solution to this but here
is my problem, and it is driving me nuts.  

The error message on boot is "Non-system disk or disk
error".  I only get this message if I do a warm reboot
with no power off.  If I halt the system and power off
and restart, it boots right up.

Computer is a Compaq AP500 (P-II 450MHz, 700MB RAM,
Adaptec SCSI card).  The system has two SCSI drives,
C: which is at ID 1 and D: which is at ID 2.  The
OS is FreeBSD 5.4, standard installation using the
FreeBSD-only boot manager (I also tried the alternate
FreeBSD boot choice).  No other OSs reside on the
machine and I have tried to start with a clean DOS
Fdisked machine before installing FreeBSD.  The PC
does not have the Compaq BIOS partition installed but
that does not seem to matter.  I have not been able to
upgrade the ROM BIOS on this machine, but the Compaq
Diagnostics and Setup programs seem to work and report
the right information about the disks.  I even tried
disabling floppy and CD media boot but that didn't
help either.

Thanks, Portie







disk error?

2005-09-01 Thread kalin mintchev
hi all...

suddenly today out of nowhere this happens (log below) and now i get
vchkpw core dumps every few minutes or so. vchkpw is the authentication
module for vpopmail...  does this mean the disk where vpopmail lives - ad2 -
is already crapping out?!  thanks...

here is the log:

Aug 31 22:53:33 chavo /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
Aug 31 22:53:33 chavo /kernel: ata1: resetting devices .. done
Sep  1 00:36:22 chavo /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
Sep  1 00:36:22 chavo /kernel: ata1: resetting devices .. done
Sep  1 01:12:42 chavo /kernel: ad2: READ command timeout tag=0 serv=0 - resetting
Sep  1 01:12:42 chavo /kernel: ata1: resetting devices .. done
Sep  1 01:49:54 chavo /kernel: ad2: WRITE command timeout tag=0 serv=0 - resetting
Sep  1 01:49:54 chavo /kernel: ata1: resetting devices .. done
Sep  1 01:52:12 chavo /kernel: ad2: WRITE command timeout tag=0 serv=0 - resetting
Sep  1 01:52:12 chavo /kernel: ata1: resetting devices ..
Sep  1 01:52:12 chavo /kernel: ad2: removed from configuration
Sep  1 01:52:12 chavo /kernel: ad3: removed from configuration
Sep  1 01:52:12 chavo /kernel: done
Sep  1 01:53:02 chavo /kernel: handle_workitem_freeblocks: block count
Sep  1 01:54:04 chavo /kernel: handle_workitem_freeblocks: block count
Sep  1 01:55:37 chavo last message repeated 2 times





Re: Disk Error ... back up method

2005-03-08 Thread Lowell Gilbert
Yance Kowara [EMAIL PROTECTED] writes:

> Hi all,
>
> I am a FreeBSD newbie... would like to know more about backing up the whole
> FreeBSD system to a new hard disk.
>
> What is the most convenient method of backing up to a new hard disk?
> Any pointers appreciated.

The question isn't completely clear, but I think the FAQ entry for
"How do I move my system over to my huge new disk?" is probably what
you're looking for.
  http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/disks.html#NEW-HUGE-DISK

> I cut and pasted Aftab's reply to the "Disk Error" thread ...
>
> Thanks in advance.
>
> ASAP
> 1. fsck -y
> 2. tunefs (enable soft updates)
> 3. backup to new hard disk
> 4. remove this faulty hard disk
>
> Your hard disk is dying.


-- 
Lowell Gilbert, embedded/networking software engineer, Boston area
http://be-well.ilk.org/~lowell/


RE: Disk Error

2005-03-07 Thread Ted Mittelstaedt


> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Doug Hardie
> Sent: Sunday, March 06, 2005 10:24 PM
> To: Aftab Jahan Subedar
> Cc: FreeBSD Questions
> Subject: Re: Disk Error
>
> I doubt that it's dying.  There is only one bad sector.  The drive is in
> constant use.  It ran at 100% for almost 12 hours while copying the
> files and no errors were detected.  It's always the same sector with the
> error.

I've seen something like this once when a drive/BIOS combo lied about
the number of blocks the drive had available.  The BSD partition was
created larger than the actual available blocks, thus whenever the OS
sent data to blocks that didn't exist, you got this problem.

If this is set up OK then, as the other poster said, your days on this
drive are coming to an end.  IDE drives have a number of reserved blocks
that are used internally by the drive to map out bad sectors.  When a
drive starts going bad, the sectors start failing one by one and the drive
maps them out.  When it has used up all the reserved blocks, the drive
starts returning errors to the operating system.

If this drive supports S.M.A.R.T. and it's enabled and you're running 5.X
then smartmontools might give you some data about the actual real state of
the drive, rather than the lies that the drive normally tells the OS.

Ted



Re: Disk Error

2005-03-07 Thread Eric McCoy
Doug Hardie wrote:
> I doubt that it's dying.  There is only one bad sector.  The drive is in
> constant use.  It ran at 100% for almost 12 hours while copying the
> files and no errors were detected.  It's always the same sector with the
> error.

Just as a note, hard drives now come with a number of spare sectors 
which they map automatically to replace dead sectors.  This is done 
because all drives ship with a few bad sectors.  Usually when errors 
like this show up, it is because the drive is out of spares.  Since 
problems like these tend to accelerate, it is a good idea at least to 
consider replacing the disk before you start losing data more than a 
sector at a time.

You might consider getting smartmontools and seeing what the drive's 
diagnostics have to say.  Usually that will tell you if this is a fluke 
or a symptom of a failing drive.
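
As a rough illustration of what to look for, here is a minimal sketch that scans the attribute table printed by smartctl -A (from smartmontools) for non-zero raw values in the reallocation-related rows. It assumes the usual ten-column layout (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE). The choice of attribute IDs to watch and the sample values in the example are my assumptions, not output from any drive in this thread.

```python
# Attribute IDs assumed to be the interesting ones for bad-sector trouble:
# 5 Reallocated_Sector_Ct, 196 Reallocated_Event_Count,
# 197 Current_Pending_Sector, 198 Offline_Uncorrectable.
WATCHED_IDS = {5, 196, 197, 198}


def suspicious_attributes(smartctl_output):
    """Return {attribute_name: raw_value} for watched rows with raw value > 0."""
    hits = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name, raw = int(fields[0]), fields[1], fields[9]
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            hits[name] = int(raw)
    return hits


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
"""
    print(suspicious_attributes(sample))
```

A single pending sector that never grows fits the "fluke" case; counts that keep rising between runs fit the failing-drive case.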



Disk Error ... back up method

2005-03-07 Thread Yance Kowara
Hi all,
 
I am a FreeBSD newbie... would like to know more about backing up the whole 
FreeBSD system to a new hard disk.
 
What is the most convenient method of backing up to a new hard disk?
Any pointers appreciated.
 
I cut and pasted Aftab's reply to the "Disk Error" thread ...
 
Thanks in advance.
 
ASAP
1. fsck -y
2. tunefs (enable soft updates)
3. backup to new hard disk
4. remove this faulty hard disk

Your hard disk is dying.



Disk Error

2005-03-06 Thread Doug Hardie
I have been getting the following disk errors consistently for the last 
month.

ad2s1e: hard error reading fsbn 6934399 of 3467168-3467295 (ad2s1 bn 6934399; cn 431 tn 164 sn 52) status=59 error=40
spec_getpages:(#ad/0x20014) I/O read failure: (error=5) bp 0xc5678f94 vp 0xcb5f3a80
   size: 65536, resid: 65536, a_count: 65536, valid: 0x0
   nread: 0, reqpage: 0, pindex: 504, pcount: 16
vm_fault: pager read error, pid 35441 (expireover)
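
The cn/tn/sn values in the first log line are just the block number bn rewritten as cylinder, track and sector. A small sketch of that arithmetic follows. The geometry of 255 tracks per cylinder and 63 sectors per track is an assumption on my part; it is used here only because it reproduces the bn value printed in the log.

```python
TRACKS_PER_CYLINDER = 255  # assumed geometry, not stated in the log
SECTORS_PER_TRACK = 63     # assumed geometry, not stated in the log


def chs_to_block(cn, tn, sn):
    """Convert cylinder/track/sector (sector counted from 0) to a block number."""
    return (cn * TRACKS_PER_CYLINDER + tn) * SECTORS_PER_TRACK + sn


if __name__ == "__main__":
    # cn 431, tn 164, sn 52 from the log line above.
    print(chs_to_block(431, 164, 52))
```

With those assumptions, (431 * 255 + 164) * 63 + 52 = 6934399, which matches the bn in the message, so the slice-relative block number can be used directly when hunting for the affected file.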

How do you figure out which file has the problem?  expireover's logs
are all buffered so you don't get the last partial buffer.  I don't
know yet if I can mark that particular sector as bad, but if I can find
the file I can at least move it to someplace where it won't get deleted.
I chased through the core dump and checked the only directory it
indicated, but all of those files are good.  I have also tar'd the entire
news directory elsewhere and no errors were encountered.  The sector is
the same every day.



Re: Disk Error

2005-03-06 Thread Aftab Jahan Subedar
ASAP
1. fsck -y
2. tunefs (enable soft updates)
3. backup to new hard disk
4. remove this faulty hard disk

Your hard disk is dying.

Doug Hardie wrote:
> I have been getting the following disk errors consistently for the
> last month.
>
> ad2s1e: hard error reading fsbn 6934399 of 3467168-3467295 (ad2s1 bn 6934399; cn 431 tn 164 sn 52) status=59 error=40
> spec_getpages:(#ad/0x20014) I/O read failure: (error=5) bp 0xc5678f94 vp 0xcb5f3a80
>    size: 65536, resid: 65536, a_count: 65536, valid: 0x0
>    nread: 0, reqpage: 0, pindex: 504, pcount: 16
> vm_fault: pager read error, pid 35441 (expireover)
>
> How do you figure out which file has the problem?  expireover's logs
> are all buffered so you don't get the last partial buffer.  I don't
> know yet if I can mark that particular sector as bad, but if I can
> find the file I can at least move it to someplace where it won't get
> deleted.  I chased through the core dump and checked the only directory
> it indicated, but all of those files are good.  I have also tar'd the
> entire news directory elsewhere and no errors were encountered.  The
> sector is the same every day.


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Disk Error

2005-03-06 Thread Doug Hardie
I doubt that it's dying.  There is only one bad sector.  The drive is in
constant use.  It ran at 100% for almost 12 hours while copying the
files and no errors were detected.  It's always the same sector with the
error.
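The bn, cn, tn and sn fields in the kernel message agree with each other if one assumes a geometry of 255 heads and 63 sectors per track, with the sector number counted from 0. That geometry is inferred from the numbers and is not stated anywhere in the thread. A short Python check:

```python
HEADS = 255    # assumed tracks per cylinder, inferred from the log values
SECTORS = 63   # assumed sectors per track


def chs_to_block(cn, tn, sn, heads=HEADS, sectors=SECTORS):
    """Convert cylinder/track/sector (sector counted from 0) to a block number."""
    return (cn * heads + tn) * sectors + sn


def block_to_chs(bn, heads=HEADS, sectors=SECTORS):
    """Inverse of chs_to_block."""
    track, sn = divmod(bn, sectors)
    cn, tn = divmod(track, heads)
    return cn, tn, sn


# Values from the error message: ad2s1 bn 6934399; cn 431 tn 164 sn 52
print(chs_to_block(431, 164, 52))   # 6934399
print(block_to_chs(6934399))        # (431, 164, 52)
```

Since the same block number appears in every report, the arithmetic is consistent with a single fixed location on the disk.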

On Mar 7, 2005, at 09:54, Aftab Jahan Subedar wrote:
ASAP
1. fsck -y
2. tunefs (enable soft updates)
3. back up to a new hard disk
4. remove this faulty hard disk

Your hard disk is dying.


disk error or what ?

2005-01-06 Thread Omer Faruk Sen
Hi, 

I have a scsi controller: 

iir0: Intel Integratd RAID Controller mem 0xfc2f-0xfc2f3fff irq 20 at 
device 8.0 on pci4 

Yesterday I got these messages in my dmesg output:

iir0: SCSI-B, ID 0: last status 0x0107. I/O status: SELECTION_TIMEOUT
iir0: SCSI-B, ID 0: Check cables, termination, termpower, LVDS operation, 
etc.
iir0: Array Drive 0: Logical Drive 0 SCSI-B, ID 0, LUN 0 failed
iir0: Array Drive 0: FAIL state entered
iir0: SCSI-B, ID 0: Auto Hot Plug started for slot 0
iir0: SCSI-B, ID 0: MPI returned 0x0043
iir0: Bus B: The SCSI controller successfully recovered from a SCSI BUS 
issue.  The issue may still be present on the BUS.  Check cables, 
termination, termpower, LVDS operation, etc
iir0: SCSI-B, ID 0: MPI returned 0x0048 

I want to be sure that I have understood this correctly. It seems that Drive 0 has
failed, but I want to confirm that. How can I verify that
Drive 0 is the one with the problem?

---
Omer Faruk Sen
http://www.EnderUNIX.ORG
Software Development Team @ Turkey
http://www.Faruk.NET
For Public key: http://www.enderunix.org/ofsen/ofsen.asc
 

First Turkish FreeBSD book is out! Go check it.
Duydunuz mu! Turkiye'nin ilk FreeBSD kitabi cikti.
http://www.acikkod.com/freebsd.php 



4.9-Stable: Disk error booting from hard drive

2004-01-15 Thread Max Clark
Hi all,

I am completely stumped by this one. I have a new MSI 1u P1-1000 server: 
2.4GHz P4, 1GB Ram, with a 40GB IDE (38166MB ST340014A [77545/16/63]) 
hard drive on the primary master channel.

Here's the problem: I can install 4.9-Stable, but when I finish and
reboot the machine the BIOS reports a disk error, the machine reboots, and
it is trapped in this loop. (I've tried setting the box up with a Boot
Manager and with Standard, but neither works.)

To make things interesting, 5.2-Release works fine: I can install,
reboot, and everything is cool. I don't understand what is going on here. I
even tried re-downloading the 4.9 ISO to rule out a bad CD, with no luck.

This machine will be a high-use server, so I really want to run Stable.
What do I need to do?

Thanks in advance,
Max
This is the dmesg from 5.2:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.2-RELEASE #0: Sun Jan 11 04:21:45 GMT 2004
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
Preloaded elf kernel /boot/kernel/kernel at 0xc0a33000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc0a331f4.
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz (2391.15-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf29  Stepping = 9
Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
real memory  = 1073676288 (1023 MB)
avail memory = 1033510912 (985 MB)
ACPI APIC Table: IntelR AWRDACPI
ioapic0 Version 2.0 irqs 0-23 on motherboard
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: IntelR AWRDACPI on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 10 entries at 0xc00fdec0
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0
acpi_cpu0: CPU on acpi0
acpi_cpu1: CPU on acpi0
device_probe_and_attach: acpi_cpu1 attach returned 6
acpi_tz0: Thermal Zone on acpi0
acpi_button0: Power Button on acpi0
acpi_button1: Sleep Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
agp0: Intel 82845 host to AGP bridge mem 0xd000-0xdfff at 
device 0.0 on pci0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
uhci0: Intel 82801DB (ICH4) USB controller USB-A port 0xd800-0xd81f 
irq 16 at device 29.0 on pci0
usb0: Intel 82801DB (ICH4) USB controller USB-A on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: Intel 82801DB (ICH4) USB controller USB-B port 0xd000-0xd01f 
irq 19 at device 29.1 on pci0
usb1: Intel 82801DB (ICH4) USB controller USB-B on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: Intel 82801DB (ICH4) USB controller USB-C port 0xd400-0xd41f 
irq 18 at device 29.2 on pci0
usb2: Intel 82801DB (ICH4) USB controller USB-C on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
pci0: serial bus, USB at device 29.7 (no driver attached)
pcib2: ACPI PCI-PCI bridge at device 30.0 on pci0
pci2: ACPI PCI bus on pcib2
em0: Intel(R) PRO/1000 Network Connection, Version - 1.7.19 port 
0xc000-0xc03f mem 0xe200-0xe201 irq 21 at device 5.0 on pci2
em0:  Speed:N/A  Duplex:N/A
fxp0: Intel 82551 Pro/100 Ethernet port 0xc400-0xc43f mem 
0xe202-0xe203,0xe2041000-0xe2041fff irq 23 at device 6.0 on pci2
fxp0: Ethernet address 00:0c:76:4e:78:73
miibus0: MII bus on fxp0
inphy0: i82555 10/100 media interface on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci2: display, VGA at device 7.0 (no driver attached)
isab0: PCI-ISA bridge at device 31.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel ICH4 UDMA100 controller port 
0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 at device 31.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
pci0: serial bus, SMBus at device 31.3 (no driver attached)
fdc0: Enhanced floppy controller (i82077, NE72065 or clone) port 
0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
atkbdc0: Keyboard controller (i8042) port 0x64,0x60 irq 1 on acpi0
atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
acpi_cpu1: CPU on acpi0
device_probe_and_attach: acpi_cpu1 attach returned 6
orm0: Option ROM at iomem 0xc-0xc7fff on isa0
pmtimer0 on isa0
ppc0: parallel port not found.
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles

vinum crashed disk error

2003-09-23 Thread John Fox
Hello,

Three IDE drives multiplexed together to make one large partition
(for mounting as /usr/local).

We were messing with hardware in the box and when we rebooted
vinum spat out errors about defective objects and the boot
came to a halt.  We figured we had left something loose or
unplugged on the motherboard, so we shut it down and took
a look.  Sure enough, the plug in the mobo's secondary IDE
channel was loose, so we reseated it and powered the machine
up again.

We saw that the kernel found all the IDE drives and figured
the problem was over.  But vinum had the same problem.  It
said (loose quotation): /dev/ mounted read-only.  vinum config
not being rebuilt.  And then spit out the same errors it had
the first time.

Unfortunately, the vinum.org domain is having problems of some
sort, and the vinum help pages on lemis.com are redirected to
the vinum.org site, so I am deprived of a great trouble-shooting
resource.

System is 4.8-STABLE.

Below you will see the output of 'vinum start'.  

Any suggestions as to fixing this problem would be
greatly appreciated, as we do make production use
of this box.

Thank you in advance for any words you may be
able to offer.

-John

output of 'vinum start':

Warning: defective objects

V bigdisk    State: down      Plexes:   1      Size: 23 GB
P big_plex   C State: faulty  Subdisks: 3      Size: 23 GB
S drive0     State: down      PO:    0  B      Size: 8063 MB
S drive2     State: crashed   PO: 8063 MB      Size: 8063 MB
S drive3     State: crashed   PO:   15 GB      Size: 8063 MB


-- 
+---+
| John Fox [EMAIL PROTECTED] |System Administrator   | InfoStructure   |
+---+
|Gideon: I thought you said don't hold a grudge.|
| Galen: I don't. I have no surviving enemies...at all. |
| -- Crusade, _Racing the Night_ |
+---+


vinum crashed disk error -- addendum

2003-09-23 Thread John Fox
Maybe it'll help if I give some more comprehensive information:

vinum -> l
3 drives:
D ide0e State: up   Device /dev/ad0s1e  Avail: 0/8063 MB (0%)
D ide2e State: up   Device /dev/ad2s1e  Avail: 0/4031 MB (0%)
D ide3e State: up   Device /dev/ad3s1e  Avail: 0/8063 MB (0%)

1 volumes:
V bigdisk    State: down      Plexes:   1      Size: 23 GB

1 plexes:
P big_plex   C State: faulty  Subdisks: 3      Size: 23 GB

3 subdisks:
S drive0     State: down      PO:    0  B      Size: 8063 MB
S drive2     State: crashed   PO: 8063 MB      Size: 8063 MB
S drive3     State: crashed   PO:   15 GB      Size: 8063 MB


Thanks,

John
-- 
+---+
| John Fox [EMAIL PROTECTED] |System Administrator   | InfoStructure   |
+---+
|Gideon: I thought you said don't hold a grudge.|
| Galen: I don't. I have no surviving enemies...at all. |
| -- Crusade, _Racing the Night_ |
+---+


Re: vinum crashed disk error -- addendum

2003-09-23 Thread Greg 'groggy' Lehey
On Tuesday, 23 September 2003 at 11:29:44 -0700, John Fox wrote:
 Maybe it'll help if I give some more comprehensive information:

 vinum -> l
 3 drives:
 D ide0e State: up   Device /dev/ad0s1e  Avail: 0/8063 MB (0%)
 D ide2e State: up   Device /dev/ad2s1e  Avail: 0/4031 MB (0%)
 D ide3e State: up   Device /dev/ad3s1e  Avail: 0/8063 MB (0%)

 1 volumes:
 V bigdisk    State: down      Plexes:   1      Size: 23 GB

 1 plexes:
 P big_plex   C State: faulty  Subdisks: 3      Size: 23 GB

 3 subdisks:
 S drive0     State: down      PO:    0  B      Size: 8063 MB
 S drive2     State: crashed   PO: 8063 MB      Size: 8063 MB
 S drive3     State: crashed   PO:   15 GB      Size: 8063 MB

By itself, this is meaningless.  If you have a problem, look at the
man page or the web site for information on how to report it.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers




Re: vinum crashed disk error

2003-09-23 Thread Greg 'groggy' Lehey
On Tuesday, 23 September 2003 at 11:21:49 -0700, John Fox wrote:
 Hello,

 Three IDE drives multiplexed together to make one large partition
 (for mounting as /usr/local).

 We were messing with hardware in the box and when we rebooted
 vinum spat out errors about defective objects and the boot
 came to a halt.  We figured we had left something loose or
 unplugged on the motherboard, so we shut it down and took
 a look.  Sure enough, the plug in the mobo's secondary IDE
 channel was loose, so we reseated it and powered the machine
 up again.

 We saw that the kernel found all the IDE drives and figured
 the problem was over.  But vinum had the same problem.  It
 said (loose quotation): /dev/ mounted read-only.  vinum config
 not being rebuilt.  And then spit out the same errors it had
 the first time.

These messages have a purpose.  You shouldn't just ignore them.

 Unfortunately, the vinum.org domain is having problems of some sort,

What sort?  More error messages?  I have no problem accessing it (and
no, it's not here, it's at the other end of the world).

 and the vinum help pages on lemis.com are redirected to the
 vinum.org site,

They're on the same server.

 so I am deprived of a great trouble-shooting resource.

There are still the man pages.

 output of 'vinum start':
 
 Warning: defective objects

 V bigdisk    State: down      Plexes:   1      Size: 23 GB
 P big_plex   C State: faulty  Subdisks: 3      Size: 23 GB
 S drive0     State: down      PO:    0  B      Size: 8063 MB
 S drive2     State: crashed   PO: 8063 MB      Size: 8063 MB
 S drive3     State: crashed   PO:   15 GB      Size: 8063 MB
 

 Any suggestions as to fixing this problem would be greatly
 appreciated, as we do make production use of this box.

Do these objects have any relationship to each other?  The naming is
confusing to say the least.  In general, though, if your drives are up
again, and the volume only has one plex, you can use the 'vinum
setupstate' command to explicitly set the state to up.  You'll then
need to save the configuration with saveconfig after you've confirmed
that the data is OK.
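Before forcing states by hand, it can help to see at a glance which objects are not up. The following Python sketch parses listing lines of the form shown in this thread. The regular expression and the function name `objects_not_up` are assumptions made for illustration and are not part of vinum.

```python
import re

# Object type letter (D, V, P or S), object name, then the word after "State:".
LINE = re.compile(r"^\s*([DVPS])\s+(\S+).*?State:\s*(\S+)")


def objects_not_up(listing):
    """Return (type, name, state) for every listed object whose state is not 'up'."""
    found = []
    for line in listing.splitlines():
        match = LINE.match(line)
        if match and match.group(3) != "up":
            found.append(match.groups())
    return found


# Listing text copied from the messages in this thread.
listing = """\
D ide0e      State: up        Device /dev/ad0s1e
V bigdisk    State: down      Plexes:   1   Size: 23 GB
P big_plex   C State: faulty  Subdisks: 3   Size: 23 GB
S drive0     State: down      PO:    0  B   Size: 8063 MB
S drive2     State: crashed   PO: 8063 MB   Size: 8063 MB
S drive3     State: crashed   PO:   15 GB   Size: 8063 MB
"""
for kind, name, state in objects_not_up(listing):
    print(kind, name, state)
```

For the listing in this thread it reports the volume, the plex and all three subdisks, while the drive itself is up.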

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers




Re: hard disk error , run fsck manually

2003-07-01 Thread manee
hi sirs,

thanks to all who gave me good help and hints on my
problem.  but i am afraid that i really need to replace
the hard disk.  i tried David Wolfskill's last suggestion but it
failed.  i mean i could not get the disk to re-allocate
the bad blocks.

but that is the /home partition.  my other partitions such as
/usr, /var and / are still clean.

at this time i simply want to back up /var (actually
the mysql data), /etc and /usr/local/www/data.  is
dump alone enough for the backup?

once again thanks so much to all of you.


--- David Wolfskill [EMAIL PROTECTED] wrote:
 Date: Sun, 29 Jun 2003 00:17:27 -0700 (PDT)
 From: manee [EMAIL PROTECTED]
 Subject: Re: hard disk error , run fsck manually
 To: David Wolfskill [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 
 
 i got, after running fsck -p
 
 THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:
 
    /dev/ad0s2g (/home)
 
 so that i ran fsck
 
 OK.
 
 and the following messages come
 
 Phase 1
 ad0s2g: hard error reading fsbn 98971950 of
 21364912-21365023 (ad0s2 bn 98971950; cn 6160 tn 183
 sn 21) status=59 error=40
 
 CAN NOT READ: BLK 21364912
 continue? [yn]
 
 i had to hit y and a few messages similar to the above
 popped up and before Phase 2 started, i got
 
 FILE SYSTEM STILL DIRTY
 PLEASE RERUN fsck MANUALLY.
 
 Well, you had data on your disk that is no longer readable.
 
 If you are lucky, you may be able to get the disk to re-allocate some of
 the bad sectors.  If there were more than about 6 or 8 of these,
 though, I suspect that you will need to replace the disk soon enough
 that it is not worth your time.
 
 To try to get the disk to re-allocate block 21364912, I would do:
 
   dd bs=512 count=1 if=/dev/zero of=/dev/ad0s2g seek=21364912
 
 Note that this has a very high probability of ensuring that whatever
 data is now written to block 21364912 is different from what it had
 been; its only saving grace is that it is data that may possibly be
 readable.
 
 Once you have done this for each block that was reported as CAN NOT
 READ: BLK , then re-run fsck.  Because things are almost assuredly
 going to be inconsistent, you may wish to merely do
 
   fsck -y
 
 An alternate, and possibly faster approach would be to skip the fsck
 altogether, and just use newfs.  Of course, that will obliterate any
 data you once had on the file system, and you would then need to
 reconstruct the data -- from backups or other sources.
 
 But then, you may well need to do that anyway, especially for files
 affected by the bad blocks.
 
 at this point i had to edit /etc/fstab and put /home
 as read only in order to bring system up and running.
 
 Seems that you have a disk drive that is getting bad enough that its
 continued usefulness is in question.
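When several blocks are reported as unreadable, the dd invocation quoted above has to be repeated once per block. Here is a small Python sketch that only builds and prints the commands, so they can be reviewed before anything is written to the disk. The helper name `dd_zero_commands` is hypothetical, and the block list should come from the actual CAN NOT READ messages.

```python
def dd_zero_commands(device, blocks, block_size=512):
    """Build one dd command per unreadable block; nothing is executed here."""
    commands = []
    for block in sorted(set(blocks)):
        commands.append(
            f"dd bs={block_size} count=1 if=/dev/zero of={device} seek={block}"
        )
    return commands


# Device and block number taken from the fsck output quoted above.
for command in dd_zero_commands("/dev/ad0s2g", [21364912]):
    print(command)
```

Each command overwrites the named block with zeros, so whatever was stored there is lost, as the quoted advice already warns.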
 

with best regards,


=
มานี
http://www.thai-aec.org



Re: hard disk error , run fsck manually

2003-06-29 Thread manee
hi sirs,

thanks for your time indeed.

--- David Wolfskill [EMAIL PROTECTED] wrote:
 Reboot your system.
 
 During the 10-second spinning propeller count-down, press the space
 bar once.  You should see the prompt
 
 boot
 
 
 At that point, type
 
   boot -s
 
 and press Enter.  This will enable you to boot into single-user mode.
 
 The machine should show the usual device probes, but instead of mounting
 filesystems and starting daemons, you will get a prompt like:
 
 Enter full pathname of shell or RETURN for /bin/sh: 
 
 
 At that point, press Enter.  The prompt should read
 
 # 
 
 This means that you are in single-user mode; you are running as root.
 
 At this point, I would (first) try
 
   fsck -p && reboot
 
 That is, do the fsck in preen mode; if that works OK, just reboot.
 If that does not automatically reboot, you have problems that fsck -p
 cannot fix easily.  In that case, try

as expected, i needed to run fsck

 
   fsck
 
 and answer the questions as best you can.  If you
 are (finally!) able
 to get through that OK, try


i got, after running fsck -p

THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:

   /dev/ad0s2g (/home)

so i ran fsck and the following messages came up

Phase 1
ad0s2g: hard error reading fsbn 98971950 of
21364912-21365023 (ad0s2 bn 98971950; cn 6160 tn 183
sn 21) status=59 error=40

CAN NOT READ: BLK 21364912
continue? [yn]

i had to hit y and a few messages similar to the above
popped up and before Phase 2 started, i got

FILE SYSTEM STILL DIRTY
PLEASE RERUN fsck MANUALLY.

at this point i had to edit /etc/fstab and put /home
as read only in order to bring the system up and running.

   reboot
 
 and see how far you get.
 
 Peace,
 david
 -- 
 David H. Wolfskill[EMAIL PROTECTED]
 Based on what I have seen to date, the use of
 Microsoft products is not
 consistent with reliability.  I recommend FreeBSD
 for reliable systems.

once again please cc to me

with best regards,


=
มานี
http://www.thai-aec.org



Re:hard disk error , run fsck manually

2003-06-29 Thread manee

--- Alex Zivenko [EMAIL PROTECTED] wrote:
 You just need to run fsck. See the man page. I had this
 problem too, only yesterday. Everything was fixed with
 fsck: I just gave the root password, logged in in
 single user mode, and then ran fsck with some
 parameters.
 

thank you for your time.  but in my case, fsck can
not help.  i also tried

fsck -p -y

i still got

FILE SYSTEM STILL DIRTY
PLEASE RERUN fsck MANUALLY.

what i did was simply put that partition or file system
in read only mode and exit single user mode in order
to bring the system up and running.

anyway, thanks so much for your help.

with best regards,

=
มานี
http://www.thai-aec.org



hard disk error , run fsck manually

2003-06-28 Thread manee
hi sirs,

i face a hard disk error because the power
went down while the machine was in use.

once the power came back, my machine got stuck at

file system is still dirty please run fsck manually.

the partition that is dirty is /dev/ad0s2g, the /home
partition. i did run fsck /dev/ad0s2g several times
but still get the same message.

what i decided to do was to edit /etc/fstab and
put the read only option for /dev/ad0s2g and exit
single user mode.  i got a message saying that /home was not
dismounted, as you see in the attachment.

up to this point, only root can log in.  my
question is: is there any method to recover from an
fsck error during boot time?

please cc me since i am not a member of the list.

with best regards,


=
มานี
http://www.thai-aec.org

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.8-STABLE #2: Sun Jun  1 18:59:28 ICT 2003
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/Bank
Timecounter i8254  frequency 1193182 Hz
CPU: Intel(R) Celeron(TM) CPU1100MHz (1102.51-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x6b1  Stepping = 1
  
Features=0x383fbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
real memory  = 125763584 (122816K bytes)
config di sn0
config di lnc0
config di ie0
config di fe0
config di cs0
config q
avail memory = 118005760 (115240K bytes)
Preloaded elf kernel kernel at 0xc0447000.
Preloaded userconfig_script /boot/kernel.conf at 0xc044709c.
Preloaded elf module snd_via8233.ko at 0xc04470ec.
Preloaded elf module snd_pcm.ko at 0xc0447190.
Preloaded elf module snd_via82c686.ko at 0xc0447230.
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 6 entries at 0xc00fdd40
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Host to PCI bridge on motherboard
pci0: PCI bus on pcib0
agp0: VIA Generic host to PCI bridge mem 0xe000-0xe3ff at device 0.0 on pci0
pcib1: PCI to PCI bridge (vendor=1106 device=8601) at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: Trident model 8500 VGA-compatible display device at 0.0 irq 10
isab0: VIA 82C686 PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: VIA 82C686 ATA100 controller port 0xd000-0xd00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: VIA 83C572 USB controller port 0xd400-0xd41f irq 9 at device 7.2 on pci0
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
pci0: unknown card (vendor=0x1106, dev=0x3057) at 7.4
pcm0: VIA VT82C686A port 0xe400-0xe403,0xe000-0xe003,0xdc00-0xdcff irq 11 at device 
7.5 on pci0
pcm0: Avance Logic ALC200/200P ac97 codec
ed0: NE2000 PCI Ethernet (RealTek 8029) port 0xe800-0xe81f irq 11 at device 10.0 on 
pci0
ed0: address 00:00:21:2d:ad:f7, type NE2000 (16 bit) 
orm0: Option ROM at iomem 0xc-0xcbfff on isa0
fdc0: NEC 72065B or clone at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
atkbdc0: Keyboard controller (i8042) at port 0x60,0x64 on isa0
atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: Parallel port at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
plip0: PLIP network interface on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
ppi0: Parallel I/O on ppbus0
IP packet filtering initialized, divert enabled, rule-based forwarding enabled, 
default to deny, logging disabled
ad0: 38166MB ST340810A [77545/16/63] at ata0-master UDMA100
acd0: CDROM LTN526 at ata1-slave PIO4
Mounting root from ufs:/dev/ad0s2a
ad0s2g: hard error reading fsbn 98971886 of 21364912-21365023 (ad0s2 bn 98971886; cn 
6160 tn 182 sn 20) trying PIO mode
ad0: DMA problem fallback to PIO mode
ad0: DMA problem fallback to PIO mode
ad0: DMA problem fallback to PIO mode
ad0: DMA problem fallback to PIO mode
ad0s2g: hard error reading fsbn 98971950 of 21364912-21365023 (ad0s2 bn 98971950; cn 
6160 tn 183 sn 21) status=59 error=40
WARNING: /home was not properly dismounted
dmesg ended here

here is /etc/fstab, the edited one
# Device        Mountpoint      FStype  Options Dump    Pass#
/dev/ad0s2b     none            swap    sw      0       0

Re: hard disk error , run fsck manually

2003-06-28 Thread Kent Stewart
You need to do something like fsck -y from single user mode. The fs has to 
be unmounted to fix it.

Kent

-- 
Kent Stewart
Richland, WA

http://users.owt.com/kstewart/index.html



Re: slice extends beyond end of disk error on install

2003-03-09 Thread W. J. Williams
Physician heal thyself...

I recreated my install disks and the problem disappeared...

Will

--- W. J. Williams [EMAIL PROTECTED] wrote:
 I keep getting the following error when trying to install FreeBSD 4.7
 
 
 
 ad0: 9773MB FUJITSU MPF3102AT [19857/16/63] at ata0-master UDMA 33
 Mounting root from ufs:/dev/md0c
 md0s4: slice extends beyond end of disk: truncating from 5 to 8640 sectors
 .
 
 after this message the system just hangs.
 
 I have low-level formatted the disk twice now, but still the same error.
 
 Does anyone know what I am doing wrong?
 
 Will
 
 =
 Will Williams
 
 To Unsubscribe: send mail to [EMAIL PROTECTED]
 with unsubscribe freebsd-questions in the body of the message


=
Will Williams



slice extends beyond end of disk error on install

2003-03-08 Thread W. J. Williams
I keep getting the following error when trying to install FreeBSD 4.7



ad0: 9773MB FUJITSU MPF3102AT [19857/16/63] at ata0-master UDMA 33
Mounting root from ufs:/dev/md0c
md0s4: slice extends beyond end of disk: truncating from 5 to 8640 sectors
.

after this message the system just hangs.

I have low-level formatted the disk twice now, but still the same error.

Does anyone know what I am doing wrong?

Will

=
Will Williams



Disk error 0x10 (lba=0x48)

2003-02-11 Thread Peter Gonnissen
Hello,


I use FreeBSD 4.3 and I have run into problems with the
installation floppies.

The two disks I made work fine on my two other PCs.
On a third machine, I cannot boot from kern.flp; I receive the
following message:


   Disk error 0x10 (lba=0x48)
   Disk error 0x10 (lba=0x48)
   No /boot/loader

   FreeBSD/i386 BOOT
   Default: 0:fd(0,a)/kernel
   Boot:
	Disk error 0x10 (lba=0x48)
   No /kernel

   FreeBSD/i386 BOOT
   Default: 0:fd(0,a)/kernel



 I can boot DOS and Linux from this floppy drive.  Linux is actually installed on the hard disk and is working fine.

 It's a second-hand PC that I bought for a good price because the first IDE connector is defective, so the first hard disk is
 master on the second controller (with working Linux on it).

 It's standard hardware, nothing special: Intel PII, Realtek Ethernet, Maxtor 91021U2 and floppy disk, no CD.

 Can you help me with this problem? Thanks a lot.

Peter
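The two numbers in the message can be read as follows, assuming the boot blocks print the BIOS INT 13h status code and a logical block address on a standard 1.44 MB floppy (2 heads, 18 sectors per track). Both assumptions are for illustration and are not confirmed in the thread. The status table is a subset of the commonly documented INT 13h codes.

```python
# Subset of BIOS INT 13h status codes, as commonly documented.
INT13_STATUS = {
    0x01: "invalid function or parameter",
    0x02: "address mark not found",
    0x04: "sector not found / read error",
    0x10: "uncorrectable CRC or ECC error on read",
    0x20: "controller failure",
    0x40: "seek failed",
    0x80: "timeout (drive not ready)",
}


def floppy_lba_to_chs(lba, heads=2, sectors=18):
    """Map an LBA to (cylinder, head, sector) on a 1.44 MB floppy; sectors start at 1."""
    cylinder, rest = divmod(lba, heads * sectors)
    head, sector = divmod(rest, sectors)
    return cylinder, head, sector + 1


print(INT13_STATUS[0x10])        # uncorrectable CRC or ECC error on read
print(floppy_lba_to_chs(0x48))   # (2, 0, 1)
```

Read this way, the BIOS reports a CRC error at one fixed spot on the floppy (cylinder 2, head 0, sector 1). Since the same disks boot on two other machines, this could point at how this particular drive reads those disks, though that is only an inference.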






Looking up disk error codes

2002-12-10 Thread guest005
Hello all.
  
  Does anybody know how I can figure out what status=51 error=04 means?
  
dmesg reported that for an (obviously) bad disk (due to a power outage).
  
 
I couldn't find a look-up table or anything similar on the Net
that could give me a *precise* answer to my question.
  
  
  
Anybody feeling helpful out there?
  
  
Thanks in advance.
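A possible way to read these values, assuming they are the ATA status and error registers printed in hexadecimal (an assumption, since the thread does not say so): each set bit has a conventional name in the older ATA specifications. A minimal Python sketch, with the function name `decode` chosen for illustration:

```python
# ATA status register bits (older ATA naming).
STATUS_BITS = {
    0x80: "BSY (busy)",
    0x40: "DRDY (drive ready)",
    0x20: "DF (device fault)",
    0x10: "DSC (seek complete)",
    0x08: "DRQ (data request)",
    0x04: "CORR (corrected data)",
    0x02: "IDX (index)",
    0x01: "ERR (error)",
}

# ATA error register bits (older ATA naming).
ERROR_BITS = {
    0x80: "BBK/ICRC (bad block or interface CRC error)",
    0x40: "UNC (uncorrectable data error)",
    0x20: "MC (media changed)",
    0x10: "IDNF (sector ID not found)",
    0x08: "MCR (media change requested)",
    0x04: "ABRT (command aborted)",
    0x02: "TK0NF (track 0 not found)",
    0x01: "AMNF (address mark not found)",
}


def decode(value_hex, table):
    """Return the names of the bits set in a two-digit hex register value."""
    value = int(value_hex, 16)
    return [name for bit, name in table.items() if value & bit]


print(decode("51", STATUS_BITS))  # DRDY, DSC and ERR are set
print(decode("04", ERROR_BITS))   # ABRT is set
```

Under the same assumption, the status=59 error=40 pair seen in the other disk error reports decodes to DRDY, DSC, DRQ and ERR with UNC, an uncorrectable read error.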
