[Hampshire] Laptop Hardrive

2010-05-18 Thread e-mail dawn.gray1
 Hi guys,

My laptop hard drive is faulty. Does anyone have a spare/old one I can
have/buy?

Details are:

Fujitsu MHV2080BH, 80 GB, SATA

8mm thick

Capacity is unimportant as long as it is big enough to install Ubuntu on.

please contact me off line

cheers

Dawn
-- 
Please post to: Hampshire@mailman.lug.org.uk
Web Interface: https://mailman.lug.org.uk/mailman/listinfo/hampshire
LUG URL: http://www.hantslug.org.uk
--

Re: [Hampshire] Laptop Hardrive

2010-05-18 Thread Vic

> My laptop hardrive is faulty, does anyone have a spare/old one I can
> have/buy

Second-hand HDDs are usually a bad investment - they have a limited
lifespan, so if someone else has taken one out of service, it's probably
used up quite a bit of that life...

A brand-spankers SATA laptop drive can be had for about £30. I simply
wouldn't bother looking for a used drive.

Vic.

p.s. I've replaced two SATA HDDs this last weekend. Neither was over 2
years old. I have a suspicion they don't last like they used to...




Re: [Hampshire] Laptop Hardrive

2010-05-18 Thread trotter
At 20:12 18/05/2010, you wrote:

> > My laptop hardrive is faulty, does anyone have a spare/old one I can
> > have/buy
>
>Second-hand HDDs are usually a bad investment - they have a limited
>lifespan, so if someone else has taken one out of service, it's probably
>used up quite a bit of that life...
>
>A brand-spankers SATA laptop drive can be had for about £30. I simply
>wouldn't bother looking for a used drive.
>
>Vic.
>
>p.s. I've replaced two SATA HDDs this last weekend. Neither was over 2
>years old. I have a suspicion they don't last like they used to...


In the desktop arena with 3.5" drives, the experience of my last 2 drives
would bear out your thinking. The Western Digital Blue 640GB had 2 error
sectors reallocated as soon as I installed it. The 1.5TB Seagate has had 3
sectors go bad after a few weeks.

It's looking like higher-capacity drives are more error-prone at the moment,
on my very limited sample. All the previous Samsung drives (750GB, 500GB,
400GB) have no sector reallocations after up to 2 years.
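
For anyone wanting to keep an eye on the same counter, here is a minimal
Python sketch. It assumes smartmontools is installed and that the drive
reports the conventional attribute 5, Reallocated_Sector_Ct, in the usual
`smartctl -A` table layout. The sample text and the function name
`reallocated_sectors` are illustrative only and do not come from the drives
described above.

```python
from typing import Optional

# Illustrative excerpt in the usual `smartctl -A` table layout.
# These values are made up for the example, not read from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       2
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def reallocated_sectors(smartctl_output: str) -> Optional[int]:
    """Return the raw Reallocated_Sector_Ct value, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # An attribute row has ten columns; the raw value is the last one.
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None


print(reallocated_sectors(SAMPLE))
```

On a live system the text would come from something like
`smartctl -A /dev/sda`, run as root; logging the number periodically shows
whether the count is stable or growing.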


Martin N

Running MorphOS v2.4 on Mac Mini, Moderator of 
MiniDisc,amithlonopen,bwfc Yahoogroups





Re: [Hampshire] Laptop Hardrive

2010-05-18 Thread Tim Brocklehurst
On Tuesday 18 May 2010 23:20:49 trotter wrote:
> At 20:12 18/05/2010, you wrote:
> > > My laptop hardrive is faulty, does anyone have a spare/old one I can
> > > have/buy
> >
> >Second-hand HDDs are usually a bad investment - they have a limited
> >lifespan, so if someone else has taken one out of service, it's probably
> >used up quite a bit of that life...
> >
> >A brand-spankers SATA laptop drive can be had for about £30. I simply
> >wouldn't bother looking for a used drive.
> >
> >Vic.
> >
> >p.s. I've replaced two SATA HDDs this last weekend. Neither was over 2
> >years old. I have a suspicion they don't last like they used to...
> 
> In the desktop arena with 3.5" the experience of my last 2 drives would
> bare out your thinking. The western digital blue 640GB has 2 error sectors
> reallocated as soon as i installed it. The 1.5TB Seagate has had 3 sectors
> go bad after a few weeks.
> 
> Its looking like higher capacity drives are more error prone at the mo on
> my very limited sample. All the previous Samsung 750GB, 500, 400 have
> no sector reallocation after up to 2 years.
> 
> 
> Martin N
> 
> Running MorphOS v2.4 on Mac Mini, Moderator of
> MiniDisc,amithlonopen,bwfc Yahoogroups
> 

That's not entirely surprising. The more you try to pack onto a disk, the more
chance that a minor error will creep in. As for laptop hard drives, I'm amazed
at the amount of abuse they'll handle, but how about a solid-state drive?
They're not cheap, but they're faster and more robust. Ebuyer usually have
some. The laptop hard drive you have will be a standard 2.5" SATA model, so
it's easily replaced.

Tim B.

-- 
OpenPilot - Open-source Marine Chart Plotter
Lead Developer
http://openpilot.engineering.selfip.org



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James Courtier-Dutton
On 18 May 2010 23:20, trotter  wrote:
>
> In the desktop arena with 3.5" the experience of my last 2 drives would
> bare out your thinking. The western digital blue 640GB has 2 error sectors
> reallocated as soon as i installed it. The 1.5TB Seagate has had 3 sectors
> go bad after a few weeks.
>

I would like it if Linux would at least tell me which files got hit by
the reallocation.
So, I lost 3 sectors, so which files have 512 bytes missing?
I use sha256sum on all my picture files, so that I can detect which
one has gone bad, and then replace it from backup.
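
A minimal Python sketch of that checksum routine, for illustration. It
assumes the files live under a single directory tree and that the manifest
is kept somewhere other than the disk being checked. The function names
(`sha256_of`, `build_manifest`, `changed_files`) are made up for this
example; the command-line `sha256sum` / `sha256sum -c` pair does the same
job.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large pictures or videos need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """Return the relative paths that are missing or no longer match the manifest."""
    bad = []
    for relative, expected in manifest.items():
        path = root / relative
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(relative)
    return bad
```

A mismatch only says that the file's bytes changed; it does not say whether
the cause was the disk, the filesystem, or something else writing to the
file.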

Kind Regards

James



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

> I would like it if Linux would at least tell me which files got hit by
> the reallocation.

It probably can't tell.

Reallocation happens by way of the drive controller; the main OS is not
involved, nor even informed unless it specifically asks. I'm not aware of
any way to inquire about the reallocation map - if such a method exists,
it's almost certainly manufacturer-specific and undocumented.

> So, I lost 3 sectors, so which files have 512 bytes missing?

None of them. That's the purpose of reallocating sectors, not just letting
them fail.

> I use sha256sum on all my picture files, so that I can detect which
> one has gone bad, and then replace it from backup.

That's nice for you, but has little to do with HDD sector reallocation.

Vic.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James Courtier-Dutton
On 19 May 2010 11:14, Vic  wrote:
>
>> I would like it if Linux would at least tell me which files got hit by
>> the reallocation.
>
> It probably can't tell.
>
> Reallocation happens by way of the drive controller; the main OS is not
> involved, nor even informed unless it specifically asks. I'm not aware of
> any way to inquire about the reallocation map - if such a method exists,
> it's almost certainly manufacturer-specific and undocumented.

Reallocations appear in the Linux syslog, so one could match these up
with the filesystem and know which file was touched.

>
>> So, I lost 3 sectors, so which files have 512 bytes missing?
>
> None of them. That's the purpose of reallocating sectors, not just letting
> them fail.

That is false. A reallocation can happen when a sector can no longer be
read, although it is true that the drive tries to spot sectors that are
about to turn bad and reallocate them before they fail.
A reallocation can also happen on write: a read-after-write check is
done and, if the write did not succeed, the drive reallocates the sector
and writes to the new location for that same sector. In the "write"
case, data is not lost.

>
>> I use sha256sum on all my picture files, so that I can detect which
>> one has gone bad, and then replace it from backup.
>
> That's nice for you, but has little to do with HDD sector reallocation.

Some consumer HDs can silently lose data for a sector without
warning. Reallocation is one way this happens, so sha256sum can help
with this.

I was talking to someone at a kernel summit who was in charge of
a large array of test disks. They ran sha256sum over data that
was unlikely to change, then went back 6 months later to compare
it. They were stunned at the number of silent data corruptions that
had happened. Some were even single-bit flips, which disc sector CRC
checking is apparently supposed to catch, but did not.
I do not remember who said it, but it was someone trustworthy.

I do sha256sums on all my important files now, but fortunately I have
not observed a problem yet.

Kind Regards

James



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James
On Wed, 2010-05-19 at 11:14 +0100, Vic wrote:
> > So, I lost 3 sectors, so which files have 512 bytes missing?
> 
> None of them. That's the purpose of reallocating sectors, not just letting
> them fail.

Yes, if data is actually unrecoverable (as happened in my notebook's
hard disc: 3 bad sectors at the time of replacement, lost a video) the
drive will kick up a fuss-load of ATA errors, which will be reported all
over dmesg. (If you see any of them, these messages will indicate the
*logical* block(s) for which the read failed, but I don't know how to
map this data back to specific file-system entries.) In the case here
the drive probably read back the data after writing to check it, found
the media wanting, wrote it elsewhere, and blacklisted the sectors.

James

-- 
James   theholyet...@googlemail.com
PGP key ID: 03F94B5D
---



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

> reallocations appear in the Linux syslog

Really? I've only ever seen summary information from smartmontools there.
I've also been unable to find anything in Google to support the idea that
actual reallocation map data goes into syslog; perhaps you'd post some
examples so we can all learn about it.

>>> So, I lost 3 sectors, so which files have 512 bytes missing?
>>
>> None of them. That's the purpose of reallocating sectors, not just
>> letting
>> them fail.
>
> That is false

No it isn't.

> a reallocation can happen if a sector fails to be
> readable any more although it is true that the drive tries to spot
> sectors that are about to turn bad, and reallocate them before they
> fail.

That leaves you with all your files intact.

> A reallocation can also happen on write, where a read after write
> check is done, and if it failed to write correctly, it instead
> reallocates the sector and writes to the new location for that same
> sector. In the "write" case, data is not lost.

So if data is not lost - all your files are intact.

In both of your examples, all your files are intact - none of them are
missing any data. That's what I said, and what you claimed was false.

> On some consumer HDs, it can silently loose data for a sector without
> warning. Reallocation is one way this happens, so sha256sum can help
> with this.

That would be a faulty drive design. Whilst I'm not going to claim that
all HDD firmware is perfect, it's the least of your worries when it comes
to HDD reliability. Reallocation routines are well-tested and fairly
similar from one model of drive to the next.

> I was talking to someone at a kernel summit and they were in charge of
> a large array of test disks. They tried doing a sha256sum on data that
> was unlikely to change and then went back 6 months later to compare
> it. They were stunned at the amount of silent data corruptions that
> had happened. Some were even single bit flips, which apparently disc
> sector CRC checking is supposed to catch, but it did not.

That proves nothing about the behaviour of the drive - only that something
has changed data on it. I regularly get customer machines in where
supposedly invariant files have changed. That is rarely to do with disk
failure, and usually to do with a virus.

Now I'm not claiming that there are virus problems in Linux systems - just
that declaring unexpected file changes to be down to sector reallocation
faults is simply bogus; there is absolutely no evidence to support it.

> I do not remember the person who stated it, but it was made by someone
> who was trustworthy.

That renders it "bloke in a pub told me"-reliable.

Vic.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

> Yes, if data is actually unrecoverable (as happened in my notebook's
> hard disc: 3 bad sectors at the time of replacement, lost a video) the
> drive will kick up a fuss-load of ATA errors, which will be reported all
> over dmesg.

That's for broken sectors, not sector reallocation; if the data is
actually gone from the drive, there's nothing you can do about it.

The purpose of SMART is to notice impending failures before they get to
such a critical level, and move data away from the failing areas into
spare sectors. It's not perfect but, absent any significant external
events (like dropped disks), it's pretty good.

> (If you see any of them, these messages will indicate the
> *logical* block(s) for which the read failed, but I don't know how to
> map this data back to specific file-system entries.)

debugfs will do that, if you can be bothered. I usually rescue everything
to another drive first (failing drives often degrade while you're trying
to fix the machine), then run "rpm -Va" on it. That catches everything
managed by the package manager...
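
The debugfs route can be sketched roughly as follows. This is only an
illustration and rests on several assumptions: an ext2/ext3/ext4
filesystem, 512-byte logical sectors, a 4096-byte filesystem block size
(check with `tune2fs -l`), and a partition starting at sector 63, which is
a made-up value for the example. The device name `/dev/sdXN` is a
placeholder.

```python
SECTOR_SIZE = 512  # bytes per logical sector (assumed)


def fs_block_for_sector(disk_sector: int,
                        partition_start_sector: int,
                        fs_block_size: int = 4096) -> int:
    """Convert a whole-disk sector number into a filesystem block number.

    The kernel reports sectors relative to the start of the disk, while
    debugfs counts blocks from the start of the partition.
    """
    if disk_sector < partition_start_sector:
        raise ValueError("sector lies before the start of the partition")
    byte_offset = (disk_sector - partition_start_sector) * SECTOR_SIZE
    return byte_offset // fs_block_size


# Hypothetical inputs: a failing sector number from the kernel log and a
# partition that starts at sector 63.
block = fs_block_for_sector(256711459, partition_start_sector=63)

# debugfs requests: icheck maps a block to an inode, ncheck maps an inode
# to a path name.
print(f"debugfs -R 'icheck {block}' /dev/sdXN")
print("debugfs -R 'ncheck <inode reported by icheck>' /dev/sdXN")
```

If icheck reports no inode for the block, the sector most likely falls in
free space or filesystem metadata rather than in a file.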

> In the case here
> the drive probably read back the data after writing to check it, found
> the media wanting, wrote it elsewhere, and blacklisted the sectors.

Yep.

Vic.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James Courtier-Dutton
On 19 May 2010 11:46, Vic  wrote:
>
>> reallocations appear in the Linux syslog
>
> Really? I've only ever seen summary information from smartmontools there.
> I've also been unable to find anything in Google to support the idea that
> actual reallocation map data goes into syslog; perhaps you'd post some
> examples so we can all learn about it.
>
From the kernel sources:
constants.c:{0x0C01, "Write error - recovered with auto reallocation"},
constants.c:{0x0C02, "Write error - auto reallocation failed"},
constants.c:{0x1104, "Unrecovered read error - auto reallocate failed"},
constants.c:{0x1406, "Record not found - data auto-reallocated"},
constants.c:{0x1603, "Data sync error - data auto-reallocated"},
constants.c:{0x1706, "Recovered data without ECC - data auto-reallocated"},
constants.c:{0x1802, "Recovered data - data auto-reallocated"},

Any of the above cases will produce a syslog entry, the logical sector
number will also be included close by in the log.
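
For reference, each 16-bit key in that table is the SCSI additional sense
code (ASC) in the high byte and its qualifier (ASCQ) in the low byte. A
small Python sketch of the lookup, using only the entries quoted above; the
names `ADDITIONAL_SENSE` and `describe` are invented for the example and the
table is deliberately incomplete.

```python
# The seven entries quoted above: key = (ASC << 8) | ASCQ.
ADDITIONAL_SENSE = {
    0x0C01: "Write error - recovered with auto reallocation",
    0x0C02: "Write error - auto reallocation failed",
    0x1104: "Unrecovered read error - auto reallocate failed",
    0x1406: "Record not found - data auto-reallocated",
    0x1603: "Data sync error - data auto-reallocated",
    0x1706: "Recovered data without ECC - data auto-reallocated",
    0x1802: "Recovered data - data auto-reallocated",
}


def describe(asc: int, ascq: int) -> str:
    """Look up the text for an ASC/ASCQ pair in the excerpt above."""
    return ADDITIONAL_SENSE.get((asc << 8) | ascq, "not in this excerpt")


print(describe(0x11, 0x04))
```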


>> a reallocation can happen if a sector fails to be
>> readable any more although it is true that the drive tries to spot
>> sectors that are about to turn bad, and reallocate them before they
>> fail.
>
> That leaves you with all your files intact.

No it does not in all cases. In some cases the data is recovered, in
other cases it is not.

>
>
>> I do not remember the person who stated it, but it was made by someone
>> who was trustworthy.
>
> That renders it "bloke in a pub told me"-reliable.
>

Well maybe, but the other people also listening were Linus Torvalds
and Andrew Morton together with about 5 other Linux kernel developers.
I therefore trusted the statement.



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James Courtier-Dutton
On 19 May 2010 12:03, James Courtier-Dutton  wrote:
> On 19 May 2010 11:46, Vic  wrote:
>>
>>> reallocations appear in the Linux syslog
>>
>> Really? I've only ever seen summary information from smartmontools there.
>> I've also been unable to find anything in Google to support the idea that
>> actual reallocation map data goes into syslog; perhaps you'd post some
>> examples so we can all learn about it.
>>
> From the kernel sources:
> constants.c:    {0x0C01, "Write error - recovered with auto reallocation"},
> constants.c:    {0x0C02, "Write error - auto reallocation failed"},
> constants.c:    {0x1104, "Unrecovered read error - auto reallocate failed"},
> constants.c:    {0x1406, "Record not found - data auto-reallocated"},
> constants.c:    {0x1603, "Data sync error - data auto-reallocated"},
> constants.c:    {0x1706, "Recovered data without ECC - data auto-reallocated"},
> constants.c:    {0x1802, "Recovered data - data auto-reallocated"},
>
> Any of the above cases will produce a syslog entry, the logical sector
> number will also be included close by in the log.
>
>
Example of a syslog entry for one of the above:
Apr 27 11:26:42 quad kernel: [ 3821.830237] sd 2:0:0:0: [sde] Unhandled sense code
Apr 27 11:26:42 quad kernel: [ 3821.830239] sd 2:0:0:0: [sde] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Apr 27 11:26:42 quad kernel: [ 3821.830243] sd 2:0:0:0: [sde] Sense Key : Medium Error [current] [descriptor]
Apr 27 11:26:42 quad kernel: [ 3821.830247] Descriptor sense data with sense descriptors (in hex):
Apr 27 11:26:42 quad kernel: [ 3821.830249] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 27 11:26:42 quad kernel: [ 3821.830258] 0f 4d 1b 23
Apr 27 11:26:42 quad kernel: [ 3821.830261] sd 2:0:0:0: [sde] Add. Sense: Unrecovered read error - auto reallocate failed
Apr 27 11:26:42 quad kernel: [ 3821.830266] sd 2:0:0:0: [sde] CDB: Read(10): 28 00 0f 4d 1b 20 00 00 08 00
Apr 27 11:26:42 quad kernel: [ 3821.830274] end_request: I/O error, dev sde, sector 256711459

So, here it is trying to reallocate the sector due to a read fault,
but failed to do so.
In smartctl, the drive only had 2 reallocated sectors so the
reallocate should have succeeded as there were spare sectors left for
reallocation.
I suspect a firmware bug. The drive was replaced with a different model.
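
The hex bytes in that log can be decoded by hand. The Python sketch below
is an illustration only: it assumes the standard SCSI descriptor-format
sense layout (response code 0x72, with an Information descriptor carrying
the failing block address) and the standard Read(10) command layout. The
function names are made up for the example.

```python
# Bytes copied from the log excerpt above.
SENSE_HEX = "72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 0f 4d 1b 23"
CDB_HEX = "28 00 0f 4d 1b 20 00 00 08 00"


def decode_descriptor_sense(raw: bytes) -> dict:
    """Pull the sense key, ASC/ASCQ and failing LBA out of descriptor sense data."""
    if raw[0] & 0x7F not in (0x72, 0x73):
        raise ValueError("not descriptor-format sense data")
    result = {"sense_key": raw[1] & 0x0F, "asc": raw[2], "ascq": raw[3], "lba": None}
    end = min(len(raw), 8 + raw[7])  # byte 7 is the additional sense length
    pos = 8
    while pos + 2 <= end:
        dtype, dlen = raw[pos], raw[pos + 1]
        # Descriptor type 0x00 is "Information"; bit 7 of its third byte
        # marks the 8-byte value that follows as valid.
        if dtype == 0x00 and dlen == 0x0A and raw[pos + 2] & 0x80:
            result["lba"] = int.from_bytes(raw[pos + 4:pos + 12], "big")
        pos += 2 + dlen
    return result


def decode_read10(cdb: bytes) -> tuple:
    """Return (starting LBA, number of blocks) from a Read(10) command block."""
    if cdb[0] != 0x28:
        raise ValueError("not a Read(10) command")
    return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")


sense = decode_descriptor_sense(bytes.fromhex(SENSE_HEX))
start, count = decode_read10(bytes.fromhex(CDB_HEX))
print(sense)         # sense key 3 is Medium Error; ASC/ASCQ 0x11/0x04
print(start, count)  # the request covered 8 sectors starting at 256711456
```

The decoded address, 256711459, matches the sector in the final
end_request line and is the fourth sector of the 8-sector read.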


Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

>> perhaps you'd post some
>> examples so we can all learn about it.
>>
> From the kernel sources:

Not examples, then.

> constants.c:  {0x0C01, "Write error - recovered with auto reallocation"},
> constants.c:  {0x0C02, "Write error - auto reallocation failed"},
> constants.c:  {0x1104, "Unrecovered read error - auto reallocate failed"},
> constants.c:  {0x1406, "Record not found - data auto-reallocated"},
> constants.c:  {0x1603, "Data sync error - data auto-reallocated"},
> constants.c:  {0x1706, "Recovered data without ECC - data auto-reallocated"},
> constants.c:  {0x1802, "Recovered data - data auto-reallocated"},

Ah constants.c. That'll be drivers/scsi/constants.c, will it? SCSI is
different from ATA - which is what the discussion was about - and
different capabilities ensue.

Now I know that modern kernels tend to merge ATA and SCSI drives in terms
of how they are viewed (e.g. SATA drives showing up as /dev/sdX), but if
you look at the file, you'll see it's SCSI sense stuff, not ATA
reallocation.

What you've found above is again what is reported to the OS from the drive
firmware. It isn't the kernel doing the reallocation, it's the kernel
being able to understand the drive's reporting.

So I ask again - do you have any examples of what you claim, or are you
just going to throw random grep output at me and claim it supports your
argument?

>> That leaves you with all your files intact.
>
> No it does not in all cases. In some cases the data is recovered, in
> other cases it is not.

If the data is not recovered, you haven't got a reallocation - you've got
a disk failure. Disk failures do occur; they are less frequent than they
might be because of the drive's ability to swap out failing sectors before
they are completely gone, but immortal drives do not exist, even with
sector reallocation.

>> That renders it "bloke in a pub told me"-reliable.
>
> Well maybe, but the other people also listening were Linus Torvalds
> and Andrew Morton together with about 5 other Linux kernel developers.
> I therefore trusted the statement.

I once heard a bloke talking total cobblers about engines. Sir Stirling
Moss was also listening. Is a speaker made reliable by his audience?
Stirling Moss didn't think so, judging by the look on his face. But he
didn't say a word...

Vic.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Hugo Mills
On Wed, May 19, 2010 at 11:58:14AM +0100, Vic wrote:
> 
> > Yes, if data is actually unrecoverable (as happened in my notebook's
> > hard disc: 3 bad sectors at the time of replacement, lost a video) the
> > drive will kick up a fuss-load of ATA errors, which will be reported all
> > over dmesg.
> 
> That's for broken sectors, not sector reallocation; if the data is
> actually gone from the drive, there's nothing you can do about it.

   The main route to discovering you've got a failed drive or part of
drive is when you can't read the data that was originally put on
it. This will come to light either when a checksum is computed and
fails comparison, or when part of the hardware is operating outside
the parameters that are expected of it. When that happens, you have
already lost data.

> The purpose of SMART is to notice impending failures before they get to
> such a critical level,

   It's not very good at it, though. The famous Google paper on disk
failures quotes a model (under "related work", page 11) with only a
30% success rate based on SMART information. They also state (section
3.5.6, page 10) that 56% of failed drives show no failure indicators
at all in the four main SMART fields, and 36% of the failed drives
show no failure indicators in SMART _at all_.

   Detecting failures and fixing them before they're going to occur is
a nice fairy-tale, but in the real world, it's just not going to
happen unless you're very lucky.

> and move data away from the failing areas into
> spare sectors. It's not perfect but, absent any significant external
> events (like dropped disks), it's pretty good.

   SMART is simply a reporting process (plus a self-test feature) --
the drive still does sector reallocations even if SMART itself is
turned off.

   Now, regarding the sector reallocation process: My understanding is
that the drive will reallocate a sector if it has trouble reading
it. So, if the drive electronics generates an internal error and
causes a re-read, it _may_ attempt to move the sector, writing the
data that it read from the troublesome sector to a spare.

   Now, I doubt that it will do this on the first problematic read,
but after enough sequential retries where it's had a problem, it will
trigger this behaviour. I don't know how many internal retries are
needed. I would guess at 3-4. If there's damage to the sector
(physical or checksum), then the data that's read and rewritten may
not be the data that was originally put on the disk. In this instance,
it may not be 512 bytes of zeroes, but it's not guaranteed to be
identical.

   Hugo.

-- 
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- You've read the project plan.  Forget that. We're going to Do ---  
  Stuff and Have Fun doing it.   



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic


> Example of a syslog entry for one of the above:
> Apr 27 11:26:42 quad kernel: [ 3821.830237] sd 2:0:0:0: [sde]
> Unhandled sense code

"Unhandled sense code". That's a good start. Do you think this is a
reallocation?

> Apr 27 11:26:42 quad kernel: [ 3821.830239] sd 2:0:0:0: [sde] Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Apr 27 11:26:42 quad kernel: [ 3821.830243] sd 2:0:0:0: [sde] Sense
> Key : Medium Error [current] [descriptor]

"Medium Error". The disk has failed.

> Apr 27 11:26:42 quad kernel: [ 3821.830261] sd 2:0:0:0: [sde] Add.
> Sense: Unrecovered read error - auto reallocate failed

"Unrecovered read error". The data is already unreadable.

> Apr 27 11:26:42 quad kernel: [ 3821.830274] end_request: I/O error,
> dev sde, sector 256711459

And the failure is reported[1] to the kernel.

> So, here it is trying to reallocate the sector due to a read fault,
> but failed to do so.
> In smartctl, the drive only had 2 reallocated sectors so the
> reallocate should have succeeded as there were spare sectors left for
> reallocation.

OK, now think through things a little.

You're not getting assorted prefail warnings, you're getting a medium
error.  By the time this log excerpt was started, the data had already left
the drive. It was not available for copying.

So of course the reallocation would have failed. You can't copy something
if you don't have the thing to copy in the first place.

> I suspect a firmware bug.

I would say your logic is entirely flawed. This log does not show a
firmware bug.

> The drive was replaced with a different model.

That's up to you. It's your money.

Vic.

[1] "reported" might be the wrong word for the operation, but it shows the
overall effect :-)






Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Philip Stubbs
On 19 May 2010 12:21, Vic  wrote:
> If the data is not recovered, you haven't got a reallocation - you've got
> a disk failure. Disk failures do occur; they are less frequent than they
> might be because of the drive's ability to swap out failing sectors before
> they are completely gone, but immortal drives do not exist, even with
> sector reallocation.

Yes they do. If we consider 'dying' to be when a read fails, then by simply
not asking the drive to read the data, we will never get a failed
read. Therefore the drive will never die. Immortal. :-) I have about
five such 250 MB drives on my bench in my shed. As long as I don't try
to use them, they are still alive.

http://en.wikipedia.org/wiki/Schr%C3%B6dinger's_cat

-- 
Philip Stubbs



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James Courtier-Dutton
On 19 May 2010 12:22, Hugo Mills  wrote:
> On Wed, May 19, 2010 at 11:58:14AM +0100, Vic wrote:
>>
>> > Yes, if data is actually unrecoverable (as happened in my notebook's
>> > hard disc: 3 bad sectors at the time of replacement, lost a video) the
>> > drive will kick up a fuss-load of ATA errors, which will be reported all
>> > over dmesg.
>>
>> That's for broken sectors, not sector reallocation; if the data is
>> actually gone from the drive, there's nothing you can do about it.
>
>   The main route to discovering you've got a failed drive or part of
> drive is when you can't read the data that was originally put on
> it. This will come to light either when a checksum is computed and
> fails comparison, or when part of the hardware is operating outside
> the parameters that are expected of it. When that happens, you have
> already lost data.
>
>> The purpose of SMART is to notice impending failures before they get to
>> such a critical level,
>
>   It's not very good at it, though. The famous Google paper on disk
> failures quotes a model (under "related work", page 11) with only a
> 30% success rate based on SMART information. They also state (section
> 3.5.6, page 10) that 56% of failed drives show no failure indicators
> at all in the four main SMART fields, and 36% of the failed drives
> show no failure indicators in SMART _at all_.
>
>   Detecting failures and fixing them before they're going to occur is
> a nice fairy-tale, but in the real world, it's just not going to
> happen unless you're very lucky.
>
>> and move data away from the failing areas into
>> spare sectors. It's not perfect but, absent any significant external
>> events (like dropped disks), it's pretty good.
>
>   SMART is simply a reporting process (plus a self-test feature) --
> the drive still does sector reallocations even if SMART itself is
> turned off.
>
>   Now, regarding the sector reallocation process: My understanding is
> that the drive will reallocate a sector if it has trouble reading
> it. So, if the drive electronics generates an internal error and
> causes a re-read, it _may_ attempt to move the sector, writing the
> data that it read from the troublesome sector to a spare.
>
>   Now, I doubt that it will do this on the first problematic read,
> but after enough sequential retries where it's had a problem, it will
> trigger this behaviour. I don't know how many internal retries are
> needed. I would guess at 3-4. If there's damage to the sector
> (physical or checksum), then the data that's read and rewritten may
> not be the data that was originally put on the disk. In this instance,
> it may not be 512 bytes of zeroes, but it's not guaranteed to be
> identical.
>
>   Hugo.
>

My understanding of reallocation is the same as Hugo's. Maybe I am just
explaining it badly.
My understanding of reallocation is not the same as Vic's.

I would add to Hugo's in that I believe that an entry will appear in
the syslog when the reallocation happens.


Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

>The main route to discovering you've got a failed drive or part of
> drive is when you can't read the data that was originally put on
> it. This will come to light either when a checksum is computed and
> fails comparison, or when part of the hardware is operating outside
> the parameters that are expected of it. When that happens, you have
> already lost data.

I'm not so sure that's the "main route". It's the most likely route, given
the attention span of most computer users I see, but it's only hit my
supported customers when they have ignored advice to buy a new drive. I
suspect (but have insufficient data to prove) that a large portion of the
lack of effectiveness of auto-reallocation is down to the fact that modern
computer users are so often conditioned to ignore warnings...

>> The purpose of SMART is to notice impending failures before they get to
>> such a critical level,
>
>It's not very good at it, though.

I disagree.

> The famous Google paper on disk
> failures quotes a model (under "related work", page 11) with only a
> 30% success rate based on SMART information.

30% would do me. That's three in every ten failures that can be
intercepted prior to catastrophe, and the drive swapped out with minimal
interference to operation, and no data loss.

30% might not be as good as 100%, but it's a damn sight better than 0%.

> They also state (section
> 3.5.6, page 10) that 56% of failed drives show no failure indicators
> at all in the four main SMART fields, and 36% of the failed drives
> show no failure indicators in SMART _at all_.

Yes. These figures are higher than I would expect - and certainly don't
mesh with my personal experience. I suspect (but again, can't prove) that
the effectiveness of the technology depends on the duty cycle of the
drives; they have quite a bit of computation to do with very little
processing grunt...

>Detecting failures and fixing them before they're going to occur is
> a nice fairy-tale, but in the real world, it's just not going to
> happen unless you're very lucky.

I appear to be the luckiest man in the multiverse.

>SMART is simply a reporting process (plus a self-test feature) --
> the drive still does sector reallocations even if SMART itself is
> turned off.

Yes. It is the drive firmware that does the heavy lifting. It is the SMART
feature that warns the user, who then fails to do anything until the drive
has failed completely.
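
[Illustration] To make the "SMART warns the user" step concrete, here is a
minimal sketch of a check that could be run from cron. It assumes the
attribute table layout printed by "smartctl -A" from smartmontools (ten
whitespace-separated columns, raw value last). The SAMPLE text, the two
watched attribute names and the "any non-zero raw value" rule are
illustrative assumptions, not anything specified in this thread.

```python
# Sketch only: parses text in the layout of `smartctl -A` output.
# SAMPLE is made-up data so the example runs without a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       3
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

# Attributes chosen for illustration; other attributes may matter too.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def raw_counts(report):
    """Return {attribute name: raw value} for the watched attributes."""
    counts = {}
    for line in report.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID; the header row does not.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            counts[fields[1]] = int(fields[9])
    return counts


def warnings(report):
    """Build one warning line per watched attribute with a non-zero raw value."""
    return [
        f"{name} raw value is {count}: consider replacing the drive"
        for name, count in raw_counts(report).items()
        if count > 0
    ]


if __name__ == "__main__":
    for message in warnings(SAMPLE):
        print(message)
```

On a real machine the report text would come from running smartctl against
the drive rather than from SAMPLE. The warning is only useful if somebody
acts on it.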

> If there's damage to the sector
> (physical or checksum), then the data that's read and rewritten may
> not be the data that was originally put on the disk. In this instance,
> it may not be 512 bytes of zeroes, but it's not guaranteed to be
> identical.

Drives have ECC on the data surface; there is a mathematical probability
that a random data failure could lead to a successful ECC check, but I
don't think we need worry too much about that.

In the event of an ECC failure, the sector will not be reallocated - it is
already failed.

If the ECC check passes, the data is almost certainly correct, so a
reallocation will work correctly.
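
[Illustration] A toy sketch of the rule described above: copy a sector to a
spare only if its check code still passes. CRC32 from Python's zlib stands
in for the drive's ECC here, which is an assumption made for brevity. Real
drive ECC is a different, stronger code that can also correct errors, and
the function names below are hypothetical. With a 32-bit check, uniformly
random data would pass by chance about once in 2**32 attempts.

```python
import zlib

SECTOR_SIZE = 512  # bytes, matching the sector size mentioned in the thread


def store(data):
    """Return the sector data together with its check code."""
    return data, zlib.crc32(data)


def can_reallocate(data, check):
    """Allow the copy to a spare sector only when the check still passes."""
    return zlib.crc32(data) == check


def flip_bit(data, bit):
    """Return a copy of data with one bit inverted, to simulate damage."""
    damaged = bytearray(data)
    damaged[bit // 8] ^= 1 << (bit % 8)
    return bytes(damaged)


if __name__ == "__main__":
    sector, check = store(bytes(range(256)) * 2)  # 512 bytes of test data
    print("intact sector:", can_reallocate(sector, check))
    print("damaged sector:", can_reallocate(flip_bit(sector, 1000), check))
```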

Vic.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

> My understanding of reallocation is the same as Hugo's. Maybe I am just
> explaining it badly.

Hugo's argument appears to be that data reallocation is not as effective
as we would like. Yours appears to be very different - claiming a firmware
bug because you found an unrecoverable sector is just crazy.

> My understanding of reallocation is not the same as Vic's.

To which I would merely respond "how much time have you spent working on
drive firmware?"

I have a long history of embedded development work, and I do have
first-hand experience[1] of these systems.

Vic.

[1] It's a few years ago now, so I'm not claiming my knowledge is bang
up-to-date, but I do have some inkling whereof I speak.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread James Courtier-Dutton
On 19 May 2010 12:45, Vic wrote:
>
> In the event of an ECC failure, the sector will not be reallocated - it is
> already failed.
>

This is the crux of the difference between your and my point of view.
You say the sector will not be reallocated.
I say it will and I believe Hugo also suggests it will.

A simple google would confirm mine and Hugo's point of view.
E.g.
http://www.ariolic.com/activesmart/smart-attributes/reallocated-sectors-count.html



Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Vic

>> In the event of an ECC failure, the sector will not be reallocated - it is
>> already failed.
>
> This is the crux of the difference between your and my point of view.
> You say the sector will not be reallocated.
> I say it will

Why would it?

If the block has already failed, and the drive controller has already
identified it as failed, why would it attempt to copy essentially random
data into valuable spare sectors? Aside from being simply nonsensical to
conduct such nugatory work, it leads to silent corruption. Neither of
these outcomes is preferable to simply failing the reallocation call.

> and I believe Hugo also suggests it will.

You might like to let Hugo speak for himself; I saw him arguing that
reallocation is less effective than it might be. You appear to be the only
person arguing that drive controller firmware has a behaviour that
deliberately corrupts data.

> A simple google would confirm mine and Hugo's point of view.
> E.g.
> http://www.ariolic.com/activesmart/smart-attributes/reallocated-sectors-count.html

That link does not support your argument. I wonder why you posted it.

Vic.




Re: [Hampshire] Laptop Hardrive

2010-05-19 Thread Hugo Mills
On Wed, May 19, 2010 at 01:36:35PM +0100, James Courtier-Dutton wrote:
> On 19 May 2010 12:45, Vic wrote:
> >
> > In the event of an ECC failure, the sector will not be reallocated - it is
> > already failed.
> >
> 
> This is the crux of the difference between your and my point of view.
> You say the sector will not be reallocated.
> I say it will and I believe Hugo also suggests it will.

   A *repeated* ECC failure probably wouldn't cause a reallocation.
However, if the ECC failed (or some other hardware excursion occurred)
several times, and then a good copy was received on one of the
retries, the reallocation would happen successfully. I would not
expect the drive to report an error to the OS in this case.

   A low-level reallocation (with or without a data copy -- I don't
know) would probably happen if the drive fails multiple retries and
doesn't get a good copy. In this case, I *would* expect the drive to
bounce an error back to the OS, as there will be data loss. (You can't
avoid it at that point).

   In both cases, I'd expect to see the SMART error count increase,
but the error reporting and data loss outcomes would be different.
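
   [Illustration] The two cases above can be written down as a small
decision function. This is a sketch of the model as described in this
message, not of any real drive firmware: the Outcome fields, the
handle_read name and the limit of five tries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    reallocated: bool   # sector remapped to a spare
    error_to_os: bool   # read error reported back to the OS
    data_lost: bool     # original contents could not be recovered


def handle_read(attempts, max_tries=5):
    """Decide the outcome from a sequence of read results.

    `attempts` holds one boolean per try, True meaning the read passed
    its ECC check. Only the first `max_tries` results are considered.
    """
    tries = list(attempts)[:max_tries]
    if tries and tries[0]:
        # Clean first read: nothing to do.
        return Outcome(reallocated=False, error_to_os=False, data_lost=False)
    if any(tries):
        # A retry produced a good copy: remap quietly, no error reported.
        return Outcome(reallocated=True, error_to_os=False, data_lost=False)
    # No good copy at all: remap, but the OS sees an error and data is lost.
    return Outcome(reallocated=True, error_to_os=True, data_lost=True)


if __name__ == "__main__":
    print(handle_read([False, False, True]))
    print(handle_read([False] * 5))
```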

   Hugo.

-- 
=== Hugo Mills: h...@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Reintarnation:  Coming back from the dead as a hillbilly. ---

