Re: The need for initialising disks before use?

2006-08-22 Thread Joe Koberg

Antony Mawer wrote:


Is it recommended/required to do something like:

dd if=/dev/zero of=/dev/ad0 bs=1m

before use to ensure the drive's sector remappings are all in place, 
before then doing a newfs?



It seems logical to read the whole device first with conv=noerror to 
be sure the drive has encountered and noted any correctable or 
uncorrectable errors present.


Only then write the entire drive, allowing it to remap any noted bad 
sectors. i.e.:


   # dd if=/dev/ad0 of=/dev/null bs=64k conv=noerror
   # dd if=/dev/zero of=/dev/ad0 bs=64k

The problem is that when dd hits the first bad sector, the whole 64k 
block containing the sector will be skipped. There could be more bad 
sectors there... or none... If you hit errors I would re-read the 
affected area with bs=512 to get down to sector granularity.
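
For what it's worth, a minimal sketch of that narrower re-read (the
starting sector below is a placeholder for wherever the first pass
reported trouble; one 64k block spans 128 sectors):

   # dd if=/dev/ad0 of=/dev/null bs=512 skip=1234567 count=128 conv=noerror

With bs=512 and conv=noerror, dd keeps reading past each bad sector, so
the kernel's error messages give you a sector-accurate picture of what
is actually bad.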


I seem to recall a utility posted to a FreeBSD mailing list some time 
ago that worked like dd(1), but would divide and conquer a block that 
returned a read error.  The intent was to get the job done fast with 
large blocks but still copy every sector possible off a failing drive by 
reducing to sector-sized blocks where necessary.  Unfortunately I can't 
find it now.
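
In the absence of that tool, here is a minimal divide-and-conquer sketch
in plain sh (an untested illustration only; SRC, DST and the sector range
are placeholders, and a real tool would use much larger blocks for the
initial passes rather than bs=512 throughout):

   #!/bin/sh
   # Copy a range of sectors, splitting any range that fails to read
   # until single unreadable sectors are isolated and logged.
   SRC=/dev/ad0
   DST=ad0.img

   copy_range() {
       local start count half
       start=$1; count=$2
       if dd if=$SRC of=$DST bs=512 skip=$start seek=$start \
             count=$count conv=notrunc 2>/dev/null; then
           return 0
       fi
       if [ "$count" -eq 1 ]; then
           echo "unreadable sector: $start" >> badsectors.txt
           return 1
       fi
       half=$((count / 2))
       copy_range "$start" "$half"
       copy_range $((start + half)) $((count - half))
   }

   # Example: attempt the first 2097152 sectors (1 GB) in one call.
   copy_range 0 2097152

recoverdisk in src/tools/tools (mentioned later in this thread) does
essentially this, so building that is probably less work than rolling
your own.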




Joe Koberg
joe at osoft dot us



Re: The need for initialising disks before use?

2006-08-21 Thread Frode Nordahl

On 17. aug. 2006, at 15.35, Antony Mawer wrote:


Hi list,

A quick question - is it recommended to initialise disks before  
using them to allow the disks to map out any bad spots early on?  
I've seen some uninitialised disks (ie. new disks, thrown into a  
machine, newfs'd) start to show read errors within a few months of  
deployment, which for one or two drives might seem okay, but on a  
number of machines is more than a coincidence...


Is it recommended/required to do something like:

dd if=/dev/zero of=/dev/ad0 bs=1m

before use to ensure the drive's sector remappings are all in  
place, before then doing a newfs?


FWIW, I've been seeing this on more 6.0 systems than I would have  
thought to be just chance...


I think the change is that more systems use cheaper SATA drives now.

On several occasions I have been unable to build a RAID (hardware or  
software based) on brand new disks due to one of the drives failing  
during initialization.


After zeroing all the drives with dd, everything works fine.

I'm not sure if vendors cut corners on initially formatting their  
drives to save some $$, or if SATA just lacks some features SCSI has,  
which causes trouble like this.


--
Frode Nordahl





Re: The need for initialising disks before use?

2006-08-18 Thread Kirk Strauser
On Thursday 17 August 2006 8:35 am, Antony Mawer wrote:

 A quick question - is it recommended to initialise disks before using
 them to allow the disks to map out any bad spots early on?

Note: once you actually start seeing bad sectors, the drive is almost 
dead.  A drive can remap a pretty large number internally, but once that 
pool is exhausted (and the number of errors is still growing 
exponentially), there's not a lot of life left.
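
For a rough view of how far along that curve a drive is, SMART can help,
e.g. via the sysutils/smartmontools port (a sketch, assuming the port is
installed and the drive is ad0):

   # smartctl -A /dev/ad0 | egrep 'Reallocated_Sector_Ct|Current_Pending_Sector'

Reallocated_Sector_Ct counts spare sectors already consumed by remapping,
and Current_Pending_Sector counts sectors waiting to be remapped on the
next write; both climbing steadily is the pattern described above.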
-- 
Kirk Strauser
The Day Companies


Re: The need for initialising disks before use?

2006-08-18 Thread Brooks Davis
On Fri, Aug 18, 2006 at 09:19:04AM -0500, Kirk Strauser wrote:
 On Thursday 17 August 2006 8:35 am, Antony Mawer wrote:
 
  A quick question - is it recommended to initialise disks before using
  them to allow the disks to map out any bad spots early on?
 
 Note: once you actually start seeing bad sectors, the drive is almost 
 dead.  A drive can remap a pretty large number internally, but once that 
 pool is exhausted (and the number of errors is still growing 
 exponentially), there's not a lot of life left.

There are some exceptions to this.  The drive cannot remap a sector
that fails to read.  You must perform a write to cause the remap to
occur.  If you get a hard write failure it's game over, but read failures
aren't necessarily a sign the disk is hopeless.  For example, the drive
I've had in my laptop for most of the last year developed a three-sector[0]
error within a week or so of arrival.  After dd'ing zeros over the
problem sectors I've had no problems.

-- Brooks

[0] The error occurred in one of the worst possible locations and fsck
could not complete until I zeroed those locations.  That really sucked.
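
A minimal example of that kind of targeted overwrite (the LBA here is a
placeholder for whatever the kernel reported, and a three-sector run is
assumed as above):

   # dd if=/dev/zero of=/dev/ad0 bs=512 seek=1234567 count=3

seek= positions the write at the reported sector, so only those sectors
are overwritten and the drive gets the write it needs to remap them.
Whatever lived there is gone, of course, so work out which file or
filesystem structure occupied it first if you can.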




Re: The need for initialising disks before use?

2006-08-18 Thread Antony Mawer

On 18/08/2006 4:29 AM, Brooks Davis wrote:

On Fri, Aug 18, 2006 at 09:19:04AM -0500, Kirk Strauser wrote:

On Thursday 17 August 2006 8:35 am, Antony Mawer wrote:


A quick question - is it recommended to initialise disks before using
them to allow the disks to map out any bad spots early on?
Note: once you actually start seeing bad sectors, the drive is almost 
dead.  A drive can remap a pretty large number internally, but once that 
pool is exhausted (and the number of errors is still growing 
exponentially), there's not a lot of life left.


There are some exceptions to this.  The drive cannot remap a sector
that fails to read.  You must perform a write to cause the remap to
occur.  If you get a hard write failure it's game over, but read failures
aren't necessarily a sign the disk is hopeless.  For example, the drive
I've had in my laptop for most of the last year developed a three-sector[0]
error within a week or so of arrival.  After dd'ing zeros over the
problem sectors I've had no problems.


This is what prompted it -- I've been seeing lots of drives that are 
showing up with huge numbers of read errors - for instance:



Aug 19 04:02:27 server kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=66293984
Aug 19 04:02:27 server kernel: g_vfs_done():ad0s1f[READ(offset=30796791808, length=16384)]error = 5
Aug 19 04:02:31 server kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=47702304
Aug 19 04:02:31 server kernel: g_vfs_done():ad0s1f[READ(offset=21277851648, length=16384)]error = 5
Aug 19 04:02:36 server kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=34943296
Aug 19 04:02:36 server kernel: g_vfs_done():ad0s1f[READ(offset=14745239552, length=16384)]error = 5
Aug 19 04:03:08 server kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=45514848
Aug 19 04:03:08 server kernel: g_vfs_done():ad0s1f[READ(offset=20157874176, length=16384)]error = 5


I have /var/log/messages flooded with incidents of these FAILURE - 
READ_DMA messages. I've seen it on more than one machine with 
relatively young drives.


I'm trying to determine if running a dd if=/dev/zero over the whole 
drive prior to use will help reduce the incidence of this, or if it is 
likely that these develop after the initial install, in which 
case this will make negligible difference...


Once I do start seeing these, is there an easy way to:

a) determine what file/directory entry might be affected?
b) dd if=/dev/zero over the affected sectors only, in order to
   trigger a sector remapping without nuking the whole drive
c) depending on where that sector is allocated, I presume I'm
   either going to end up with:
     i) zero'd bytes within a file (how can I tell which?!)
    ii) a destroyed inode
   iii) ???

Any thoughts/comments/etc appreciated...

How do other operating systems handle this - Windows, Linux, Solaris, 
MacOSX ...? I would have hoped this would be a condition the OS would 
make some attempt to trigger a sector remap... or are OSes typically 
ignorant of such things?


Regards
Antony



Re: The need for initialising disks before use?

2006-08-18 Thread Brooks Davis
On Fri, Aug 18, 2006 at 01:41:27PM -1000, Antony Mawer wrote:
 On 18/08/2006 4:29 AM, Brooks Davis wrote:
 On Fri, Aug 18, 2006 at 09:19:04AM -0500, Kirk Strauser wrote:
 On Thursday 17 August 2006 8:35 am, Antony Mawer wrote:
 
 A quick question - is it recommended to initialise disks before using
 them to allow the disks to map out any bad spots early on?
 Note: once you actually start seeing bad sectors, the drive is 
 almost dead.  A drive can remap a pretty large number internally, but 
 once that pool is exhausted (and the number of errors is still growing 
 exponentially), there's not a lot of life left.
 
 There are some exceptions to this.  The drive cannot remap a sector
 that fails to read.  You must perform a write to cause the remap to
 occur.  If you get a hard write failure it's game over, but read failures
 aren't necessarily a sign the disk is hopeless.  For example, the drive
 I've had in my laptop for most of the last year developed a three-sector[0]
 error within a week or so of arrival.  After dd'ing zeros over the
 problem sectors I've had no problems.
 
 This is what prompted it -- I've been seeing lots of drives that are 
 showing up with huge numbers of read errors - for instance:
 
 Aug 19 04:02:27 server kernel: ad0: FAILURE - READ_DMA 
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=66293984
 Aug 19 04:02:27 server kernel: 
 g_vfs_done():ad0s1f[READ(offset=30796791808, length=16384)]error = 5
 Aug 19 04:02:31 server kernel: ad0: FAILURE - READ_DMA 
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=47702304
 Aug 19 04:02:31 server kernel: 
 g_vfs_done():ad0s1f[READ(offset=21277851648, length=16384)]error = 5
 Aug 19 04:02:36 server kernel: ad0: FAILURE - READ_DMA 
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=34943296
 Aug 19 04:02:36 server kernel: 
 g_vfs_done():ad0s1f[READ(offset=14745239552, length=16384)]error = 5
 Aug 19 04:03:08 server kernel: ad0: FAILURE - READ_DMA 
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=45514848
 Aug 19 04:03:08 server kernel: 
 g_vfs_done():ad0s1f[READ(offset=20157874176, length=16384)]error = 5
 
 I have /var/log/messages flooded with incidents of these FAILURE - 
 READ_DMA messages. I've seen it on more than one machine with 
 relatively young drives.
 
 I'm trying to determine if running a dd if=/dev/zero over the whole 
 drive prior to use will help reduce the incidence of this, or if it is 
 likely that these develop after the initial install, in which 
 case this will make negligible difference...

I really don't know.  The only way I can think of to find out is to own
a large number of machines and perform an experiment.  We (the general
computing public) don't have the kind of models needed to really say
anything definitive.  Drives are too darn opaque.

 Once I do start seeing these, is there an easy way to:
 
 a) determine what file/directory entry might be affected?

Not easily, but this question has been asked and answered on the mailing
lists recently (I don't remember the answer, but I think there were some
ports that can help).

 b) dd if=/dev/zero over the affected sectors only, in order to
  trigger a sector remapping without nuking the whole drive

You can use src/tools/tools/recoverdisk to refresh all of the disk
except the parts that don't work, and then use dd and the console error
output to do the rest.
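
A sketch of how that might look (assuming recoverdisk has been built from
src/tools/tools/recoverdisk, that there is room somewhere for an image of
ad0, and using one of the LBAs reported earlier in the thread):

   # recoverdisk /dev/ad0 /backup/ad0.img
   # dd if=/dev/zero of=/dev/ad0 bs=512 seek=66293984 count=1

recoverdisk copies the readable parts first with large blocks and retries
the failing spots with progressively smaller ones; the dd line then
overwrites one reported sector so the drive can remap it on write.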

 c) depending on where that sector is allocated, I presume I'm
  either going to end up with:
 i) zero'd bytes within a file (how can I tell which?!)
ii) a destroyed inode
   iii) ???

Presumably it will be one of i, ii or a mangled superblock.  I don't
know how you'd tell which off the top of my head.  This is one of the
reasons I think Sun is on the right track with zfs's checksum everything
approach.  At least that way you actually know when something goes
wrong.

 Any thoughts/comments/etc appreciated...
 
 How do other operating systems handle this - Windows, Linux, Solaris, 
 MacOSX ...? I would have hoped this would be a condition the OS would 
 make some attempt to trigger a sector remap... or are OSes typically 
 ignorant of such things?

The OS is generally unaware of such events, except to the extent that it
sees a fatal read error occur, or reads the SMART data from the drive in
the case of write failures.

-- Brooks




Re: The need for initialising disks before use?

2006-08-18 Thread jonathan michaels
On Fri, Aug 18, 2006 at 09:52:02PM -0500, Brooks Davis wrote:
 On Fri, Aug 18, 2006 at 01:41:27PM -1000, Antony Mawer wrote:
  On 18/08/2006 4:29 AM, Brooks Davis wrote:
  On Fri, Aug 18, 2006 at 09:19:04AM -0500, Kirk Strauser wrote:
  On Thursday 17 August 2006 8:35 am, Antony Mawer wrote:
  
  A quick question - is it recommended to initialise disks before using
  them to allow the disks to map out any bad spots early on?
  Note: once you actually start seeing bad sectors, the drive is 
  almost dead.  A drive can remap a pretty large number internally, but 
  once that pool is exhausted (and the number of errors is still growing 
  exponentially), there's not a lot of life left.
  
  There are some exceptions to this.  The drive cannot remap a sector
  that fails to read.  You must perform a write to cause the remap to
  occur.  If you get a hard write failure it's game over, but read failures
  aren't necessarily a sign the disk is hopeless.  For example, the drive
  I've had in my laptop for most of the last year developed a three-sector[0]
  error within a week or so of arrival.  After dd'ing zeros over the
  problem sectors I've had no problems.
  

  This is what prompted it -- I've been seeing lots of drives that are 
  showing up with huge numbers of read errors - for instance:
  
  Aug 19 04:02:27 server kernel: ad0: FAILURE - READ_DMA 
  status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=66293984
  Aug 19 04:02:27 server kernel: 
  g_vfs_done():ad0s1f[READ(offset=30796791808, length=16384)]error = 5
  Aug 19 04:02:31 server kernel: ad0: FAILURE - READ_DMA 
  status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=47702304

I have recently managed to borrow an Acer Pentium III 550 MHz based
machine to test and use as an installation server for FreeBSD
v6.1-RELEASE.

I ran a minimal (basic) installation on the machine, which has a pair
of drives: an 850 MB Maxtor ATAPI/IDE and a 1 GB Fujitsu ATAPI/IDE
drive that has a block of some 400-550 megabytes that the BIOS/MS
Windows 2000 was not able to access, and which I built my FreeBSD
partitions/slices around. This is why I was originally interested in
this thread: so that I might find a way to refresh this disk's media and
possibly recover the whole media surface, or find out what is going on.

Originally the error messages concerned only the oddly
partitioned/sliced Fujitsu, but after a few days it spread. As best as
I can recall, the machine will lose console access (and network login
access via sshd as well, though this could be some intermittent aspect)
as soon as either of the disks is written to; in my case it seems to be
access to the swap slice, as this machine has a small memory footprint of
32 megabytes, until I can cannibalise another machine or replace it.

I cannot use FreeBSD 6.1-RELEASE on any of my own machines, as they all
have SCSI drives and hosts with bootable CD-ROM drives but with BIOSes
that use the old (High Sierra) bootable CD-ROM format. This machine,
while not recent, is still some 5 to 10 years newer than my own most
recent hardware.

stuff trimmed for brevity

  I have /var/log/messages flooded with incidents of these FAILURE - 
  READ_DMA messages. I've seen it on more than one machine with 
  relatively young drives.
  
  I'm trying to determine if running a dd if=/dev/zero over the whole 
  drive prior to use will help reduce the incidence of this, or if it is 
  likely that these develop after the initial install, in which 
  case this will make negligible difference...
 
 I really don't know.  The only way I can think of to find out is to own
 a large number of machines and perform an experiment.  We (the general
 computing public) don't have the kind of models needed to really say
 anything definitive.  Drives are too darn opaque.
 
  Once I do start seeing these, is there an easy way to:
  
  a) determine what file/directory entry might be affected?
 
 Not easily, but this question has been asked and answered on the mailing
 lists recently (I don't remember the answer, but I think there were some
 ports that can help).

Might I add that while the original question (the refreshing of the
operating disk's media) has, or may have, been answered (sorry, I didn't
follow this thread as assiduously as I should have, because it was only
of partial interest to me), this post has caught my interest because my
installation of FreeBSD on stable hardware has started to produce
similar error messages. I now think that the original question has
morphed (as these things usually do, somewhat sadly) into something,
dare I say it, quite different.

I've sent Mr Mawer a post off-list giving some details, and depending on
the answers it might be worthwhile posting a bug report of sorts?

most kind regards

jonathan

-- 

powered by ..
QNX, OS9 and freeBSD  --  http://caamora com au/operating system
 === appropriate solution in an 

Re: The need for initialising disks before use?

2006-08-17 Thread Brooks Davis
On Thu, Aug 17, 2006 at 03:35:14AM -1000, Antony Mawer wrote:
 Hi list,
 
 A quick question - is it recommended to initialise disks before using 
 them to allow the disks to map out any bad spots early on? I've seen 
 some uninitialised disks (ie. new disks, thrown into a machine, 
 newfs'd) start to show read errors within a few months of deployment, 
 which for one or two drives might seem okay, but on a number of machines 
 is more than a coincidence...
 
 Is it recommended/required to do something like:
 
 dd if=/dev/zero of=/dev/ad0 bs=1m
 
 before use to ensure the drive's sector remappings are all in place, 
 before then doing a newfs?
 
 FWIW, I've been seeing this on more 6.0 systems than I would have 
 thought to be just chance...

This probably isn't a bad idea in general.  It might even be something
we should add to sysinstall.

-- Brooks

