"One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
For a number of weeks now I've been getting this notification (only when 
in Fedora 11 & running GNOME) that "one or more disks are failing".

I first saw this a few months back. Since that time, I've reformatted the 
Linux partitions and installed other distros (including Mandriva and 
Ubuntu - both with GNOME) and in the same space never saw such a 
notification.

I wiped the Linux partitions each time when I installed a different Linux 
OS.

But when I last went back to Fedora (then rawhide, now F11) it appeared 
again. This only happens when running Fedora 11 in GNOME. I also run 
Fedora 11 with KDE and Xfce and never receive such notices (or anything 
close to it).

Here's how the partitions are set up (if it matters):

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_ava-lv_root
                       30G   20G  9.5G  68% /
proc                     0     0     0   -  /proc
sysfs                    0     0     0   -  /sys
devpts                   0     0     0   -  /dev/pts
/dev/sda5             194M   23M  162M  13% /boot
tmpfs                 3.9G  520K  3.9G   1% /dev/shm
none                     0     0     0   -  /proc/sys/fs/binfmt_misc
sunrpc                   0     0     0   -  /var/lib/nfs/rpc_pipefs
gvfs-fuse-daemon         0     0     0   -  /home/scott/.gvfs
/dev/sda3             404G  298G  106G  74% /media/Files
/dev/sda2             489G  337G  153G  69% /media/Windows

In the meantime, I get no such notices in Windows either.

During all of this, I'm dual booting Linux and Windows 7.

I don't know much about tools to check this in Linux, but I did download 
a program a while back (I forget its name now) for Windows that
specifically queries S.M.A.R.T. and the result was everything was fine.

My drive is exhibiting no odd behavior. It behaves as it should. I've had 
drives fail many times in the past and this one is nowhere near leading 
me to believe that a failure is imminent.

Oh, before I forget, when I had installed Mandriva (after a previous 
Fedora install) in the same space I was given an option to check for bad 
sectors, so I did (and that came up empty - everything was OK).

So I'm back to Fedora 11 now and the problem returns. This is truly strange.

And lastly, here is a screenshot of what Fedora is telling me: http://
bit.ly/drive_is_failing

Your thoughts?

Thanks.

Scott

-- 
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines


Re: "One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
Scott Beamer spake thusly:

> For a number of weeks now I've been getting this notification (only when
> in Fedora 11 & running GNOME) that "one or more disks are failing".

Sorry about that. The URL for the screenshot didn't wrap properly.

This is it: 

http://bit.ly/drive_is_failing





Re: "One or more disks are failing" ?

2009-07-04 Thread Michael Schwendt
On Sat, 4 Jul 2009 07:57:46 +0000 (UTC), Scott wrote:

> For a number of weeks now I've been getting this notification (only when 
> in Fedora 11 & running GNOME) that "one or more disks are failing".

What do you get for "smartctl --all /dev/sda"? Perhaps a non-zero and
growing number of reallocated sectors?

> But when I last went back to Fedora (then rawhide, now F11) it appeared 
> again. This only happens when running Fedora 11 in GNOME.

I can confirm that Fedora 10 doesn't warn about the same disk.

> My drive is exhibiting no odd behavior. It behaves as it should. I've had 
> drives fail many times in the past and this one is nowhere near leading 
> me to believe that a failure is imminent.

Filesystems (see "man badblocks") and the hard-disk itself protect against
a first bunch of errors that can only be worked around by reallocating/ignoring
sectors. Until the hardware failures become fatal all of a sudden. Hence
an early warning can be helpful.



Re: "One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
On 07/04/2009 02:41 AM, Michael Schwendt wrote:
> On Sat, 4 Jul 2009 07:57:46 +0000 (UTC), Scott wrote:
> 
>> For a number of weeks now I've been getting this notification (only when 
>> in Fedora 11 & running GNOME) that "one or more disks are failing".
> 
> What do you get for "smartctl --all /dev/sda"? Perhaps a non-zero and
> growing number of reallocated sectors?

Uhhh

Actually, I got (106 lines of) all this:

$ sudo smartctl --all /dev/sda

smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG HD103UJ
Serial Number:    S13PJ1MQ606788
Firmware Version: 1AA01112
User Capacity:    1,000,204,886,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Sat Jul  4 02:46:25 2009 MST

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (11658) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 195) minutes.
Conveyance self-test routine
recommended polling time:        (  21) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   064   051    Pre-fail  Always       -       8
  3 Spin_Up_Time            0x0007   077   077   011    Pre-fail  Always       -       7820
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       536
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
  7 Seek_Error_Rate         0x000f   100   100   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       10567
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202
 10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       524
 13 Read_Soft_Error_Rate    0x000e   100   066   000    Old_age   Always       -       8
183 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
184 Unknown_Attribute       0x0033   100   100   099    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       2767
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   071   068   000    Old_age   Always       -       29 (Lifetime Min/Max 20/29)
194 Temperature_Celsius     0x0022   071   066   000    Old_age   Always       -       29 (Lifetime Min/Max 20/32)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       9633546
196 Reallocated_Event_Count 0x003

Re: "One or more disks are failing" ?

2009-07-04 Thread Jussi Lehtola
On Sat, 2009-07-04 at 02:57 -0700, Scott Beamer wrote:
> On 07/04/2009 02:41 AM, Michael Schwendt wrote:
> > On Sat, 4 Jul 2009 07:57:46 +0000 (UTC), Scott wrote:
> > 
> >> For a number of weeks now I've been getting this notification (only when 
> >> in Fedora 11 & running GNOME) that "one or more disks are failing".
> > 
> > What do you get for "smartctl --all /dev/sda"? Perhaps a non-zero and
> > growing number of reallocated sectors?
> 
> Uhhh
> 
> Actually, I got (106 lines of) all this:
> 
> $ sudo smartctl --all /dev/sda

clip

>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4

This means your disk is breaking down, and should be replaced soon.

> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
> 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1

And these IIRC mean that you've already lost some data. Replace the disk
ASAP.
-- 
Jussi Lehtola
Fedora Project Contributor
jussileht...@fedoraproject.org



Re: "One or more disks are failing" ?

2009-07-04 Thread Michael Schwendt
On Sat, 04 Jul 2009 13:42:16 +0300, Jussi wrote:

> >   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
> 
> This means your disk is breaking down, and should be replaced soon.

In other words, you're playing Russian roulette as you cannot know
how long the drive will continue to run even with a growing number of
reallocated sectors. If you decide to keep this drive running, watch
above value carefully, but even if it doesn't increase quickly, there
may be sudden death of this drive.
 
> > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2

This is worse than above. It ought to be zero. The internal hard-disk
controller has not reallocated these two sectors yet.

> > 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
> 
> And these IIRC mean that you've already lost some data. Replace the disk
> ASAP.

Here, afaik, the drive has failed to reallocate bad sectors.
"smartctl -t offline /dev/sda" may help.



Re: "One or more disks are failing" ?

2009-07-04 Thread Bruno Wolff III
On Sat, Jul 04, 2009 at 13:42:16 +0300,
  Jussi Lehtola  wrote:
> On Sat, 2009-07-04 at 02:57 -0700, Scott Beamer wrote:
> 
> >   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
> 
> This means your disk is breaking down, and should be replaced soon.

It means some problem sectors have been reallocated. This correlates with
disk failure and depending on how rich you are might signal that it's
time to replace the disk.

> > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
> > 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
> 
> And these IIRC mean that you've already lost some data. Replace the disk
> ASAP.

No, it means that there are problem sectors that can't be read. If you write
over these sectors, they may or may not get remapped. Disks generally won't
remap sectors that they can't read, to give you a chance to try to read them
some more. It's possible to have one-time problems with a sector, so they may
not even get remapped to a spare once you overwrite the data or get a good
read.



Re: "One or more disks are failing" ?

2009-07-04 Thread Bruno Wolff III
On Sat, Jul 04, 2009 at 11:41:39 +0200,
  Michael Schwendt  wrote:
> On Sat, 4 Jul 2009 07:57:46 +0000 (UTC), Scott wrote:
> 
> Filesystems (see "man badblocks") and the hard-disk itself protect against
> a first bunch of errors that can only be worked around by reallocating/ignoring
> sectors. Until the hardware failures become fatal all of a sudden. Hence
> an early warning can be helpful.

Marking bad blocks in the OS isn't really useful these days. When a modern
disk runs out of spare blocks to remap it is way past the time it should
have been replaced.



Re: "One or more disks are failing" ?

2009-07-04 Thread Tony Nelson
On 09-07-04 06:42:16, Jussi Lehtola wrote:
> On Sat, 2009-07-04 at 02:57 -0700, Scott Beamer wrote:
 ...
> >   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
> 
> This means your disk is breaking down, and should be replaced soon.

Nonsense.  He has a disk with about 2 billion sectors, and 4 sectors have 
been remapped.  Such a number is common.  Only if the number increases 
at an increasing rate is the disk failing.  That's why SMART has a 
threshold (10 here); note that the current value is 100.
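
To make the VALUE/THRESH comparison concrete, here is a minimal sketch. It works out the sector count from the capacity reported by smartctl (assuming 512-byte sectors), then scans attribute rows in the "smartctl -A" layout and reports any attribute whose normalized VALUE has dropped to or below its THRESH. The helper name below_threshold is made up for illustration, and the sample rows are copied from the output posted in this thread.

```shell
#!/bin/sh
# Sector count from the reported capacity, assuming 512-byte logical sectors.
capacity_bytes=1000204886016
total_sectors=$((capacity_bytes / 512))
echo "total sectors: $total_sectors"

# Hypothetical helper: print "ID NAME VALUE THRESH" for every attribute row
# whose normalized VALUE (column 4) is at or below THRESH (column 6).
below_threshold() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && ($4 + 0) <= ($6 + 0) { print $1, $2, $4, $6 }'
}

# Sample rows taken from the smartctl output posted in this thread.
sample='  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1'

failing=$(printf '%s\n' "$sample" | below_threshold)
echo "attributes at or below threshold: ${failing:-none}"
```

On a live system the input would presumably come from "sudo smartctl -A /dev/sda" piped into the helper. With the rows above nothing is reported, since a VALUE of 100 is well above each THRESH; the raw counts are a separate matter and are worth watching on their own.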


> > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
> > 198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
> 
> And these IIRC mean that you've already lost some data. Replace the
> disk ASAP.

He has an unrecoverable sector.  If that sector was in use, its data 
was lost.  If that sector was not in use, no data was lost.  If the 
number of uncorrectable sectors rises, that is a problem.

The OP should enable Automatic Offline Data Collection, which will scan 
the entire disk for bad sectors "every 4 hours", and remap them if they 
are still readable.  If sectors are unreadable when that is being done, 
then the disk needs replacement as data will surely be lost (sooner or 
later).


$ sudo smartctl -o on /dev/sda

-- 

TonyN.:'   
  '  



Re: "One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
Tony Nelson spake thusly:

[...]

> He has an unrecoverable sector.  If that sector was in use, its data 
> was lost.  If that sector was not in use, no data was lost.  If the 
> number of uncorrectable sectors rises, that is a problem.

I've been getting these warning messages popping up for 2-3 weeks now
(see original post for details), but I've otherwise not had any problems
with losing data or the drive's performance.

I'm not sure how to tell if this number is increasing.

> 
> The OP should enable Automatic Offline Data Collection, which will scan 
> the entire disk for bad sectors "every 4 hours", and remap them if they 
> are still readable.  If sectors are unreadable when that is being done, 
> then the disk needs replacement as data will surely be lost (sooner or 
> later).
> 
> 
> $ sudo smartctl -o on /dev/sda

Thanks. I just started that. Does it run indefinitely? Do I stop it at
some point?

I've had the worst luck with hard drives in the past 4 years. It made up
for the 10 before that with no trouble whatsoever.

I just couldn't believe this one might be a goner also.

I'm crossing my fingers. Meanwhile, since I started this thread I've
ordered two new drives (one or both of which may replace this one and/or
supplement it).

Thanks for your help.


-- 
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: "One or more disks are failing" ?

2009-07-04 Thread Bruno Wolff III
On Sat, Jul 04, 2009 at 08:30:52 -0700,
  Scott Beamer  wrote:
> 
> Thanks. I just started that. Does it run indefinitely? Do I stop it at
> some point?

If you have smartmontools installed you can use the smartd service to
monitor your drives. /etc/smartd.conf has the config setup.
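
As a rough sketch of what such a configuration could look like, the snippet below prints one smartd.conf directive for /dev/sda: "-a" monitors the default set of attributes and logs, "-o on" enables automatic offline data collection, "-S on" enables attribute autosave, "-s" schedules a short self-test daily at 02:00 and a long one on Saturdays at 03:00, and "-m root" mails warnings to root. The schedule and the mail recipient are arbitrary choices for illustration, and smartd_directive is a hypothetical helper, not part of smartmontools.

```shell
#!/bin/sh
# Hypothetical helper: print a smartd.conf directive line for one device.
smartd_directive() {
    printf '%s -a -o on -S on -s (S/../.././02|L/../../6/03) -m root\n' "$1"
}

line=$(smartd_directive /dev/sda)
echo "$line"
```

The printed line would go into /etc/smartd.conf, typically in place of the default DEVICESCAN entry, after which the smartd service needs a restart to pick it up.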



Re: "One or more disks are failing" ?

2009-07-04 Thread Antonio Olivares

> > Filesystems (see "man badblocks") and the hard-disk itself protect against
> > a first bunch of errors that can only be worked around by reallocating/ignoring
> > sectors. Until the hardware failures become fatal all of a sudden. Hence
> > an early warning can be helpful.
> 
> Well that makes sense, I'm questioning the accuracy of the warning I
> guess. :)
> 
> Thanks!
> 
> -- 
>             Scott
> http://angrykeyboarder.com
> I've never used an OS I didn't (dis)like.
> ©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved
> 
> -- 

Scott(AngryKeyboarder)

Some time ago I was Angry and posted a Bugzilla over here:

https://bugzilla.redhat.com/show_bug.cgi?id=498115

I saw the warning on two or three machines and I ran the tools that they asked
me to. I disabled the warning by going to startup sessions on one machine, and
on the other I moved to KDE. That way I don't see that error message, which BTW
is a bunch of BULL$HIT. If my drives were failing they would have died by now.
I did have one die, but I did not even get the warning; it just DIED :(

Add stuff/enhance the bugzilla.  

Regards,

Antonio 


  



Re: "One or more disks are failing" ?

2009-07-04 Thread Bruno Wolff III
On Sat, Jul 04, 2009 at 13:38:12 +0200,
  Michael Schwendt  wrote:
> 
> In other words, you're playing Russian roulette as you cannot know
> how long the drive will continue to run even with a growing number of
> reallocated sectors. If you decide to keep this drive running, watch
> above value carefully, but even if it doesn't increase quickly, there
> may be sudden death of this drive.

But there can be sudden death without any warning with a chance of the
same order of magnitude. Sure if you have lots of money relative to the
value of the data it's a good idea to change the drive. For hobbyists
it may make more sense to squeeze some more life out of the drive.

> > > 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
> 
> This is worse than above. It ought to be zero. The internal hard-disk
> controller has not reallocated these two sectors yet.

Drives typically won't reallocate bad sectors unless they can get a good read
or the operation is a write. This is to give you a chance to recover the data
if you want to try. And if you want to spend some effort, you can figure out
what files, if any, were using these blocks.
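
One way to start on that is to translate a bad sector's LBA, such as the LBA_of_first_error value 1818207693 from the self-test log in this thread, into a filesystem block number. The sketch below uses the usual arithmetic: subtract the partition's starting sector, multiply by the 512-byte sector size, and divide by the filesystem block size. The starting sector (1000000000) and the 4096-byte block size are made-up placeholders, since neither appears in this thread, and lba_to_fs_block is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: map a disk LBA to a filesystem block number.
#   $1 = LBA of the bad sector (512-byte sectors assumed)
#   $2 = starting sector of the partition (placeholder value below)
#   $3 = filesystem block size in bytes (placeholder value below)
lba_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

bad_lba=1818207693        # LBA_of_first_error from a self-test log
part_start=1000000000     # assumption: not given anywhere in this thread
fs_block_size=4096        # assumption: a common ext3 block size

fs_block=$(lba_to_fs_block "$bad_lba" "$part_start" "$fs_block_size")
echo "filesystem block: $fs_block"
```

On an ext2/ext3 filesystem, the resulting block number could then be passed to the icheck request of debugfs to get an inode, and the inode to ncheck to get a path. That step does not carry over as-is to NTFS partitions, and a filesystem on an LVM volume would need the volume's own offset taken into account as well.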



Re: "One or more disks are failing" ?

2009-07-04 Thread Bill Davidsen

Scott Beamer wrote:
> On 07/04/2009 02:41 AM, Michael Schwendt wrote:
>> On Sat, 4 Jul 2009 07:57:46 +0000 (UTC), Scott wrote:
>>
>>> For a number of weeks now I've been getting this notification (only when
>>> in Fedora 11 & running GNOME) that "one or more disks are failing".
>>
>> What do you get for "smartctl --all /dev/sda"? Perhaps a non-zero and
>> growing number of reallocated sectors?
>
> Uhhh

To avoid munging by mailer, this would have been a nice attach. Just a thought.

> Actually, I got (106 lines of) all this:
>
> $ sudo smartctl --all /dev/sda
>
> smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Device Model:     SAMSUNG HD103UJ
> Serial Number:    S13PJ1MQ606788
> Firmware Version: 1AA01112
> User Capacity:    1,000,204,886,016 bytes
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   8
> ATA Standard is:  ATA-8-ACS revision 3b
> Local Time is:    Sat Jul  4 02:46:25 2009 MST
>
> ==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
>
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   100   064   051    Pre-fail  Always       -       8
>   3 Spin_Up_Time            0x0007   077   077   011    Pre-fail  Always       -       7820
>   4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       536
>   5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4

That's reasonable, if it isn't growing. But any bad spots are reallocated by 
testing before you get the drive, so the original count has been zero on my 
Seagate and WD drives. Don't have anything else spinning to check over the long 
holiday.



>   7 Seek_Error_Rate         0x000f   100   100   051    Pre-fail  Always       -       0
>   8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       10567
>   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202
>  10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
>  11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       524

Looks like about twice a day, not uncommon. I don't totally trust POH (#9) as it 
is sometimes lower than my uptime. I suspect Seagate is reporting in days, yours 
looks right if the drive is eight months old.
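
The arithmetic behind those two estimates can be sketched from the raw values in the attribute table (6202 power-on hours, 524 power cycles). This assumes the counter really is in hours and treats powered-on time as the drive's age, i.e. a drive that has been running around the clock; 30.44 days is used as an average month.

```shell
#!/bin/sh
power_on_hours=6202     # raw value of attribute 9
power_cycles=524        # raw value of attribute 12

# Days and months of powered-on time (30.44 days = average month length).
days=$(LC_ALL=C awk -v h="$power_on_hours" 'BEGIN { printf "%.1f", h / 24 }')
months=$(LC_ALL=C awk -v h="$power_on_hours" 'BEGIN { printf "%.1f", h / 24 / 30.44 }')
# Power cycles per powered-on day.
cycles_per_day=$(LC_ALL=C awk -v h="$power_on_hours" -v c="$power_cycles" \
    'BEGIN { printf "%.1f", c / (h / 24) }')

echo "powered on for $days days (about $months months)"
echo "about $cycles_per_day power cycles per powered-on day"
```

A drive that is switched off part of the time will be older than its power-on hours suggest, so the month figure is a lower bound on its age rather than an exact one.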



>  13 Read_Soft_Error_Rate    0x000e   100   066   000    Old_age   Always       -       8
>
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       2767
> 188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
> 190 Airflow_Temperature_Cel 0x0022   071   068   000    Old_age   Always       -       29 (Lifetime Min/Max 20/29)
> 194 Temperature_Celsius     0x0022   071   066   000    Old_age   Always       -       29 (Lifetime Min/Max 20/32)
> 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       9633546

That is really a lot, typically my three year old drives are showing single 
digits, often zero.



> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 0
> Warning: ATA Specification requires self-test log structure revision number = 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed: read failure       90%      6201         1818207693
> # 2  Short offline       Aborted by host               20%      6201         -
> # 3  Conveyance offline  Aborted by host               90%      6201         -
> # 4  Extended offline    Aborted by host               90%      4469         -
> # 5  Extended offline    Aborted by host               90%      4469         -
>
> SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
> SMART Selective self-test log data structure revision number 0
> Warning: ATA Specification requires selective self-test log data structure revision number = 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on

Re: "One or more disks are failing" ?

2009-07-04 Thread Michael Schwendt
On Sat, 04 Jul 2009 15:01:30 -0400, Bill wrote:

> >   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202

> I don't totally trust POH (#9) as it is sometimes lower than my uptime.
> I suspect Seagate is reporting in days, yours looks right if the drive
> is eight months old.

Seagate reports in hours, too. [Or else some of my drives would have a POH
time of more than 70 years. ;-)]



Re: "One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
On 07/04/2009 10:25 AM, Bruno Wolff III wrote:
> 
> Drives typically won't reallocate bad sectors unless they can get a good read
> or the operation is a write. This is to give you a chance to recover the data
> if you want to try. And if you want to spend some effort, you can figure out
> what files, if any, were using these blocks.

Wouldn't checking for bad sectors (finding none) followed by formatting
the drive eliminate this problem?

That's what I had done and within a few days I started getting warnings
again.



-- 
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: "One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
Michael Schwendt spake thusly:
> On Sat, 04 Jul 2009 15:01:30 -0400, Bill wrote:
> 
>>>   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202
> 
>> I don't totally trust POH (#9) as it is sometimes lower than my uptime.
>> I suspect Seagate is reporting in days, yours looks right if the drive
>> is eight months old.
> 

Actually it's almost a year old. It came with this computer which I
bought in mid-July of last year.




-- 
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: "One or more disks are failing" ?

2009-07-04 Thread Scott Beamer
Bill Davidsen spake thusly:
> Scott Beamer wrote:
[..]
> 
> To avoid munging by mailer, this would have been a nice attach. Just a
> thought.

Attachments are "no-nos" on lists such as this. What I should have done
is use a Pastebin.  I'll remember that in the future.

Sorry for the mess.

[..]

> 
> That's reasonable, if it isn't growing. But any bad spots are
> reallocated by testing before you get the drive, so the original count
> has been zero on my Seagate and WD drives. Don't have anything else
> spinning to check over the long holiday.

So it's not at all unusual for a drive to arrive this way out of the box
(sorta like a new LCD with dead pixels)?

[...]
>
> I confess I would not use that drive for anything critical, too much ECC
> for my taste. Google wrote a paper on matching SMART to failures, and
> concluded that it wasn't helpful in general. Errors somewhat predicted
> bad performance, but many drive fail hard without warning.

Well I'd be plenty pissed off if it did die. I could live without most
of the data but I'd rather not.  Fortunately the pricey data was just
backed up a few days ago to a DVD+R (Amazon.com mp3 collection).

> 
> In looking at my own numbers I just scheduled a drive for redeploy, I
> have a favorable money to time ratio at the moment. ;-)

Pardon my ignorance, but I don't quite get what you are saying here.

Hopefully I can accomplish a backup and drive replacements before I lose
this one.  The hardware needed was just ordered today and is supposed to
be here next week.

I used to have a couple of external drives, but naturally they failed.

I had no hard drive troubles for 10 years. The last 4 have more than
made up for that.

I've been getting these warnings for weeks, so hopefully it will hang on
a bit longer.

Thanks.
-- 
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: "One or more disks are failing" ?

2009-07-04 Thread Bruno Wolff III
On Sat, Jul 04, 2009 at 16:07:18 -0700,
  Scott Beamer  wrote:
> On 07/04/2009 10:25 AM, Bruno Wolff III wrote:
> > 
> > Drives typically won't reallocate bad sectors unless they can get a good read
> > or the operation is a write. This is to give you a chance to recover the data
> > if you want to try. And if you want to spend some effort, you can figure out
> > what files, if any, were using these blocks.
> 
> Wouldn't checking for bad sectors (finding none) followed by formatting
> the drive eliminate this problem?

That depends on how you checked for bad sectors. Note that reformatting
doesn't write over the whole drive. If you really want to write the whole
drive as a test, you want to use dd to copy over /dev/zero or use the
badblocks program.
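
A harmless way to see what a full overwrite does is to try it on a scratch file first. The sketch below fills a 4 MiB temporary file with random bytes to stand in for old data, zero-fills it with dd, and then counts the bytes that are still non-zero. The file, its size, and the variable names are all illustrative.

```shell
#!/bin/sh
# Work on a scratch file, NOT a real disk.
img=$(mktemp)
# 4 MiB of random bytes stand in for whatever was on the drive before.
head -c 4194304 /dev/urandom > "$img"
# Overwrite the whole file with zeros; conv=notrunc keeps its size unchanged.
dd if=/dev/zero of="$img" bs=1048576 count=4 conv=notrunc 2>/dev/null
# Count bytes that are still non-zero; 0 means every byte was rewritten.
nonzero=$(tr -d '\000' < "$img" | wc -c)
size=$(wc -c < "$img")
echo "size: $size bytes, non-zero bytes left: $nonzero"
rm -f "$img"
```

On a real disk the of= target would be the whole device with no count=, which irreversibly destroys every partition on it, in this case including the Windows one. "badblocks -w" is similarly destructive, while "badblocks -n" runs a slower non-destructive read-write test; neither should be run on a mounted filesystem.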

> That's what I had done and within a few days I started getting warnings
> again.

It's a very bad sign to be continuously getting new bad sectors. If you get
a single burst of bad sectors at once it might be a problem local to part
of the disk and it might be worth continuing to use the disk depending
on your specific situation. (Though if your warranty allows you to RMA it,
that would probably be best.)



Re: "One or more disks are failing" ?

2009-07-04 Thread Tim
On Sat, 2009-07-04 at 16:23 -0700, Scott Beamer wrote:
> Attachments are "no-nos" on lists such as this.

Some are fine.  The list automatically stops disallowed ones.  They may
be manually moderated back in though, but after a delay.

-- 
[...@localhost ~]$ uname -r
2.6.27.25-78.2.56.fc9.i686

Don't send private replies to my address, the mailbox is ignored.  I
read messages from the public lists.





Re: "One or more disks are failing" ?

2009-07-05 Thread Alan Cox
On Sat, 04 Jul 2009 16:07:18 -0700
Scott Beamer  wrote:

> On 07/04/2009 10:25 AM, Bruno Wolff III wrote:
> > 
> > Drives typically won't reallocate bad sectors unless they can get a good read
> > or the operation is a write. This is to give you a chance to recover the data
> > if you want to try. And if you want to spend some effort, you can figure out
> > what files, if any, were using these blocks.
> 
> Wouldn't checking for bad sectors (finding none) followed by formatting
> the drive eliminate this problem?

Most drives will reallocate a bad sector providing you write over it.
fsck will do this for problematic metadata (block counts, bitmaps, inodes
etc) if it has to.

For data, hdparm --repair-sector offers a very low-level interface. As
there is no easy way of finding out which file owns the problematic block,
or how many there are and which files they are in, without accessing that
bit of data, a backup and restore is normally wise.

Whenever possible I use raid 1 (mirroring). Drives are fairly cheap,
sizes are so big that capacity isn't a problem. Reliability without raid
isn't good enough IMHO.



Re: "One or more disks are failing" ?

2009-07-05 Thread Tony Nelson
On 09-07-04 11:30:52, Scott Beamer wrote:
> Tony Nelson spake thusly:
 ...
> I've been getting these warning messages popping up for 2-3 weeks now
> (see original post for details), but I've otherwise not had any
> problems with losing data or the drive's performance.
> 
> I'm not sure how to tell if this number is increasing

You write down the number and look again later.
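
A minimal sketch of that bookkeeping in Python, assuming you save the
attribute table printed by "smartctl -A /dev/sda" to a text file each time
you look. The names raw_value and has_grown are made up for illustration,
and the sample row is the Hardware_ECC_Recovered line posted in this thread,
rejoined onto one line. It also assumes the raw value is a plain integer,
which is not true of every attribute on every drive.

```python
from typing import Optional


def raw_value(report: str, attr_id: int) -> Optional[int]:
    """Return RAW_VALUE for one attribute ID from `smartctl -A` text, or None."""
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID and have at least
        # 10 columns; the raw value is the tenth.
        if len(fields) >= 10 and fields[0] == str(attr_id):
            return int(fields[9])
    return None


def has_grown(earlier: int, later: int) -> bool:
    """True if the counter went up between two readings."""
    return later > earlier


# Row taken from the report posted in this thread (rejoined onto one line).
SAMPLE = ("195 Hardware_ECC_Recovered  0x001a   100   100   000"
          "    Old_age   Always       -       9633546")

print(raw_value(SAMPLE, 195))  # prints 9633546
```

Run it against the report you saved last week and the one you saved today,
and pass the two numbers for whichever attribute you are watching to
has_grown.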

> > The OP should enable Automatic Offline Data Collection, which will
> > scan the entire disk for bad sectors "every 4 hours", and remap
> > them if they are still readable.  If sectors are unreadable when
> > that is being done, then the disk needs replacement as data will
> > surely be lost (sooner or later).
> > 
> > $ sudo smartctl -o on /dev/sda
> 
> Thanks. I just started that. Does it run indefinitely? Do I stop it 
> at some point?

The disk will do "Offline Data Collection" forever or until you stop 
it.  It can have minor impact on drive performance.  Part of what it 
does is a surface scan "every 4 hours", which will attempt to recover 
any bad sectors, hopefully while they're still readable.

All my disks run with "Offline Data Collection" enabled.  One of them I 
found in a snowbank this January, but all its 300 GiB seem fine.  
Another started giving errors about 7 years ago, but it's been OK that 
whole time.

> I've had the worst luck with hard drives in the past 4 years. It made
> up for the 10 before that with no trouble whatsoever.

Hot drives are a common problem, but not here.  Most drives (and yours) 
show the drive temperature at attribute 194.  Your drive helpfully 
asserts that not only is your drive not running hot, but that it has 
never run hot.  Perhaps it is just luck.


> I just couldn't believe this one might be a goner also.
> 
> I'm crossing my fingers. Meanwhile, since I started this thread I've
> ordered two new drives (one or both of which may replace this one
> and/or supplement it).

I'd suggest setting up a RAID mirror, so that when one of the drives 
fails, you're ready to just keep working.


> Thanks for your help.

You're welcome, and good luck.

-- 

TonyN.:'   
  '  




Re: "One or more disks are failing" ?

2009-07-05 Thread Tony Nelson
On 09-07-04 19:07:18, Scott Beamer wrote:
> On 07/04/2009 10:25 AM, Bruno Wolff III wrote:
> > 
> > Drives typically won't reallocate bad sectors if they can't get a
> > good read or the operation is a write. This is to give you a chance
> > to recover the data if you want to try. And if you want to spend
> > some effort, you can figure out what files, if any, were using
> > these blocks.
> 
> Wouldn't checking for bad sectors (finding none) followed by
> formatting the drive eliminate this problem?
> 
> That's what I had done and within a few days I started getting
> warnings again.

Now /that/ indicates a failing drive -- if you used the drive 
manufacturer's formatting utility, which would completely reformat the 
drive.  Just writing a new filesystem does nothing of the kind.

-- 

TonyN.:'   
  '  



Re: "One or more disks are failing" ?

2009-07-05 Thread Bill Davidsen

Alan Cox wrote:
> On Sat, 04 Jul 2009 16:07:18 -0700
> Scott Beamer  wrote:
>
> > On 07/04/2009 10:25 AM, Bruno Wolff III wrote:
> > >
> > > Drives typically won't reallocate bad sectors if they can't get a good read
> > > or the operation is a write. This is to give you a chance to recover the data
> > > if you want to try. And if you want to spend some effort, you can figure out
> > > what files, if any, were using these blocks.
> >
> > Wouldn't checking for bad sectors (finding none) followed by formatting
> > the drive eliminate this problem?
>
> Most drives will reallocate a bad sector providing you write over it.
> fsck will do this for problematic metadata (block counts, bitmaps, inodes
> etc) if it has to.
>
> For data, hdparm --repair-sector offers a very low-level interface. As
> there is no easy way of finding out which file owns the problematic block,
> or how many there are and which files they are in, without accessing that
> bit of data, a backup and restore is normally wise.
>
> Whenever possible I use raid 1 (mirroring). Drives are fairly cheap,
> sizes are so big that capacity isn't a problem. Reliability without raid
> isn't good enough IMHO.

Why raid-1 rather than raid-10,f2? The performance seems significantly better, 
although write performance is still a place where a good hardware raid 
controller can beat software raid, by only sending one copy of the data over the 
system bus to the controller.


--
Bill Davidsen 
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Re: "One or more disks are failing" ?

2009-07-05 Thread Bill Davidsen

Scott Beamer wrote:
> Bill Davidsen spake thusly:
> > Scott Beamer wrote:
> > > [..]
> >
> > To avoid munging by mailer, this would have been a nice attach. Just a
> > thought.
>
> Attachments are "no-nos" on lists such as this. What I should have done
> is use a Pastebin.  I'll remember that in the future.

AFAIK attaching simple text to avoid munging is perfectly fine; the posting
software disallows some other types.


--
Bill Davidsen 
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Re: "One or more disks are failing" ?

2009-07-05 Thread Bill Davidsen

Michael Schwendt wrote:
> On Sat, 04 Jul 2009 15:01:30 -0400, Bill wrote:
>
> > >   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202
> >
> > I don't totally trust POH (#9) as it is sometimes lower than my uptime.
> > I suspect Seagate is reporting in days, yours looks right if the drive
> > is eight months old.
>
> Seagate reports in hours, too. [Or else some of my drives would have a POH
> time of more than 70 years. ;-)]

No, I wasn't kidding, I have three Seagates, model ST3750640AS, bought new at the
same time, and installed at the same time, and not only are the POH different
but the machine was up for 400+ days at one time. The POHs reported via smartctl
are 359, 559, and 2135. That's just *broken* POH reporting. :-(


--
Bill Davidsen 
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot



Re: "One or more disks are failing" ?

2009-07-05 Thread Scott Beamer

On 07/04/2009 10:12 PM, Tim wrote:
> On Sat, 2009-07-04 at 16:23 -0700, Scott Beamer wrote:
> > Attachments are "no-nos" on lists such as this.
>
> Some are fine.  The list automatically stops disallowed ones.  They may
> be manually moderated back in though, but after a delay.

Personally I'd be annoyed if I saw an email here with an attachment.
That's right up there with HTML




Re: "One or more disks are failing" ?

2009-07-06 Thread Patrick O'Callaghan
On Sun, 2009-07-05 at 23:43 -0700, Scott Beamer wrote:
> On 07/04/2009 10:12 PM, Tim wrote:
> > On Sat, 2009-07-04 at 16:23 -0700, Scott Beamer wrote:
> >> Attachments are "no-nos" on lists such as this.
> >
> > Some are fine.  The list automatically stops disallowed ones.  They may
> > be manually moderated back in though, but after a delay.
> >
> Personally I'd be annoyed if I saw an email here with an attachment. 
> That's right up there with HTML

Not really. HTML is unnecessary, annoying and can be a security problem.
Some attachments are *occasionally* justifiable, e.g. if someone wants
to post a large log file I'd rather they used an attachment than put it
inline.

poc



Re: "One or more disks are failing" ?

2009-07-06 Thread Bruno Wolff III
On Sun, Jul 05, 2009 at 23:43:35 -0700,
  Scott Beamer  wrote:
> On 07/04/2009 10:12 PM, Tim wrote:
>> On Sat, 2009-07-04 at 16:23 -0700, Scott Beamer wrote:
>>> Attachments are "no-nos" on lists such as this.
>>
>> Some are fine.  The list automatically stops disallowed ones.  They may
>> be manually moderated back in though, but after a delay.
>>
> Personally I'd be annoyed if I saw an email here with an attachment.  

Depending on your definition of attachment, pretty much any message to this
list is going to have a text/plain attachment.
My mail reader will inline text/plain attachments, so if someone put two in
one message, I would see both together as if it was just one attachment.
But I could conveniently save them separately if I wanted to do that.



Re: "One or more disks are failing" ?

2009-07-06 Thread Scott Beamer

On 07/06/2009 06:58 AM, Patrick O'Callaghan wrote:
> On Sun, 2009-07-05 at 23:43 -0700, Scott Beamer wrote:
> > On 07/04/2009 10:12 PM, Tim wrote:
> > > On Sat, 2009-07-04 at 16:23 -0700, Scott Beamer wrote:
> > > > Attachments are "no-nos" on lists such as this.
> > >
> > > Some are fine.  The list automatically stops disallowed ones.  They may
> > > be manually moderated back in though, but after a delay.
> >
> > Personally I'd be annoyed if I saw an email here with an attachment.
> > That's right up there with HTML
>
> Not really. HTML is unnecessary, annoying and can be a security problem.
> Some attachments are *occasionally* justifiable, e.g. if someone wants
> to post a large log file I'd rather they used an attachment than put it
> inline.

You misunderstood me. I agree with you.


--
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: "One or more disks are failing" ?

2009-07-07 Thread Robin Laing
I am also seeing this on drives that are only a few months old.  I was 
having system crashes so I wouldn't be surprised about the need to 
re-allocate blocks.


Now the question that I pose is, how do I get these blocks allocated/moved
in a way that is safe for data on the drives?  What is the best method to get
these blocks allocated?


Can badblocks be used?


--
Robin Laing



Re: "One or more disks are failing" ?

2009-07-07 Thread stan
On Tue, 07 Jul 2009 13:12:47 -0600
Robin Laing  wrote:

> I am also seeing this on drives that are only a few months old.  I
> was having system crashes so I wouldn't be surprised about the need
> to re-allocate blocks.
> 
> Now the question that I pose is, how do I get these blocks
> allocated/moved in a way that is safe for data on the drives?  What is
> the best method to get these blocks allocated?
> 
> Can badblocks be used?
> 
> 
I have a drive that seems to have some bad sectors, that have been bad
since new.  It has been flagged for years by various programs, and is
still running.  I don't trust it with valuable data, but I use it.

I used e2fsck with -c -c -k (the double -c tells it to do a
non-destructive read-write test, the -k says to preserve existing bad blocks
and add any new ones) on the drive. It seemed to work, at least I didn't
lose any data that I could tell, and the number of bad sectors has
remained stable since. This is very slow as it is reading, writing, and
reading every byte on the drive. Of course, the drive has to be
unmounted as well.

The man page for e2fsck has all the options and what they do.  Verbose
might be advisable as well.

Of course, maybe I'm just fooling myself and this is strictly placebo
effect.  :-)



Re: "One or more disks are failing" ?

2009-07-07 Thread Rick Stevens

Robin Laing wrote:
> I am also seeing this on drives that are only a few months old.  I was
> having system crashes so I wouldn't be surprised about the need to
> re-allocate blocks.
>
> Now the question that I pose is, how do I get these blocks allocated/moved
> in a way that is safe for data on the drives?  What is the best method to get
> these blocks allocated?
>
> Can badblocks be used?


It could, but it'd be far safer if you were to use, say 'e2fsck -c'
from a live CD rather than badblocks alone because of the block size
dependencies and such, and you want to keep the files whole if possible.
--
- Rick Stevens, Systems Engineer  ri...@nerd.com -
- AIM/Skype: therps2      ICQ: 22643734      Yahoo: origrps2 -
--
- "I was contemplating the immortal words of Socrates when he said,  -
- 'I drank what?'"   -- Val Kilmer in "Real Genius"  -
--



Re: "One or more disks are failing" ?

2009-07-07 Thread D. Hugh Redelmeier
| From: Bill Davidsen 

| Scott Beamer wrote:

| > SMART Attributes Data Structure revision number: 16
| > Vendor Specific SMART Attributes with Thresholds:
| > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
| > 195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       9633546
| 
| That is really a lot, typically my three year old drives are showing single
| digits, often zero.

"9633546" is the "RAW_VALUE".  That may not be the count of ECC errors
recovered.  Interpretation of RAW_VALUE depends on the drive
manufacturer.  I know that this seems counter-intuitive.

If you google, you will find a bunch of people worried about high
numbers here.  Some notice it going up faster than the number of reads
performed.

All you can go on is the VALUE, WORST, and THRESH numbers.  They say
the drive is fine, trust us.  I added "trust us" because the
manufacturer's firmware came up with that number in a non-transparent way.
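
As a rough sketch of reading those three columns, assuming the attribute row
has been rejoined onto one line as smartctl prints it: the Attribute tuple
and the function names are made up for illustration. The comparison follows
the rule in the smartctl man page that an attribute has failed when its
normalized VALUE is less than or equal to THRESH.

```python
from typing import NamedTuple


class Attribute(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized VALUE column
    worst: int   # lowest normalized value the drive has recorded
    thresh: int  # manufacturer's failure threshold


def parse_attribute(row: str) -> Attribute:
    """Pick ID, name, VALUE, WORST and THRESH out of one attribute row."""
    f = row.split()
    return Attribute(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]))


def has_failed(attr: Attribute) -> bool:
    """An attribute counts as failed when VALUE <= THRESH."""
    return attr.value <= attr.thresh


# The Hardware_ECC_Recovered row from this thread, on one line.
ROW = ("195 Hardware_ECC_Recovered  0x001a   100   100   000"
       "    Old_age   Always       -       9633546")

attr = parse_attribute(ROW)
print(attr.name, attr.value, attr.worst, attr.thresh, has_failed(attr))
```

Note that RAW_VALUE is deliberately ignored here; the check uses only the
normalized numbers.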

On some Seagate drives, people have figured out that certain raw
numbers are actually two numbers stuck together: some bits are for one
count and some bits are for another.
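
To illustrate what "two numbers stuck together" means, here is a small Python
sketch. The 32-bit split point is purely an assumption for the example; I
don't know which bits any particular drive uses for which count, and the
numbers packed below are invented rather than read from a drive.

```python
from typing import Tuple


def split_raw(raw: int, low_bits: int = 32) -> Tuple[int, int]:
    """Split one raw value into (high part, low part) at a given bit position.

    low_bits is a guess supplied by the caller, not something SMART defines.
    """
    mask = (1 << low_bits) - 1
    return raw >> low_bits, raw & mask


# Invented example: pack a count of 3 above a count of 1000, then unpack it.
packed = (3 << 32) | 1000
print(packed)             # prints 12884902888
print(split_raw(packed))  # prints (3, 1000)
```

Read as a single decimal number, the packed value looks alarmingly large even
though both counts inside it are small.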

Here's an insufficiently descriptive FAQ entry on this topic:
http://www.readynas.com/forum/faq.php#My_hard_disk%28s%29_in_ReadyNAS_is_reporting_high_SMART_Raw_Read_Error_Rate%2C_Seek_Error_Rate%2C_and_Hardware_ECC_Recovered._What_should_I_do%3F



Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-05 Thread Scott Beamer
So last night I booted into Windows. And I got my hands on a program (GUI 
based) that gives you all the SMART stats in slightly better plain 
English. There were no warnings of demise of any type (I'm back in Linux 
now and the program name escapes me, but it's not really important).


In any event, just to be sure, I installed Ubuntu today (zapping the 
Fedora partitions) and ran the test again.


Results in a pastebin here:

http://bit.ly/hard-drive-is-in-fact-not-dying

I wish I'd done that before posting here.  Not to mention the nice hard 
drive I just bought that I really don't need (but I drooled over it, so 
it will be here tomorrow).


If I read the above mentioned output correctly, some propeller-head(s) 
somewhere got me panicking needlessly and wasted a lot of my time.


But then, maybe I'm just an ignorant fool. :0)


--
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-06 Thread Frank Murphy
On 06/07/09 07:57, Scott Beamer wrote:
> So last night I booted into Windows. And I got my hands on a program (GUI
> based) that gives you all the SMART stats in slightly better plain
> English. There were no warnings of demise of any type (I'm back in Linux
> now and the program name escapes me, but it's not really important).
> 
> In any event, just to be sure, I installed Ubuntu today (zapping the
> Fedora partitions) and ran the test again.

And you trust Windose and Ubuntu.  Mwaahhhaa


Frank


Sorry had to do it. :D



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-06 Thread Scott Beamer

On 07/06/2009 12:05 AM, Frank Murphy wrote:
> On 06/07/09 07:57, Scott Beamer wrote:
> > So last night I booted into Windows. And I got my hands on a program (GUI
> > based) that gives you all the SMART stats in slightly better plain
> > English. There were no warnings of demise of any type (I'm back in Linux
> > now and the program name escapes me, but it's not really important).
> >
> > In any event, just to be sure, I installed Ubuntu today (zapping the
> > Fedora partitions) and ran the test again.
>
> And you trust Windose and Ubuntu.  Mwaahhhaa
>
> Frank
>
> Sorry had to do it. :D



LOL.

I trust Debian and Mandriva as well. Only Fedora had been raising the 
alarm over the past month. :)


And someone else earlier in this thread mentioned that he was convinced 
it was a bug and filed a (redhat) bug report some time ago.


And lastly, with regard to the "propeller-head" remark. I wasn't 
referring to anyone in this discussion. I was referring to the 
developer(s) who (may have) introduced the bug in the smartmontools that 
ended up in Fedora 11.


I just thought I'd clarify while I was thinking about it.

--
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-06 Thread Michael Schwendt
On Mon, 06 Jul 2009 00:25:02 -0700, Scott wrote:

> On 07/06/2009 12:05 AM, Frank Murphy wrote:
> > On 06/07/09 07:57, Scott Beamer wrote:
> >> So last night I booted into Windows. And I got my hands on a program (GUI
> >> based) that gives you all the SMART stats in slightly better plain
> >> English. There were no warnings of demise of any type (I'm back in Linux
> >> now and the program name escapes me, but it's not really important).
> >>
> >> In any event, just to be sure, I installed Ubuntu today (zapping the
> >> Fedora partitions) and ran the test again.
> >
> > And you trust Windose and Ubuntu.  Mwaahhhaa
> >
> >
> > Frank
> >
> >
> > Sorry had to do it. :D
> >
> 
> LOL.
> 
> I trust Debian and Mandriva as well.

Funnily, you posted a SMART report that shows the same values as before.
A self-test that ended with a read failure. One sector that the drive has
failed to reallocate/replace. Two sectors that have not been
reallocated/replaced yet. Four reallocated sectors is not much of a
threat, but you still need to observe that this value doesn't increase
steadily.

> http://bit.ly/hard-drive-is-in-fact-not-dying
Why did you highlight the wrong lines?

> Only Fedora had been raising the 
> alarm over the past month. :)

To understand why a component raised the alarm you need to examine what
values it looked at.
 
> And someone else earlier in this thread mentioned that he was convinced 
> it was a bug and filed a (redhat) bug report some time ago.

There are some reports
http://bugz.fedoraproject.org/libatasmart
http://bugz.fedoraproject.org/gnome-disk-utility



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-06 Thread Jonathan Underwood
2009/7/6 Scott Beamer :
> So last night I booted into Windows. And I got my hands on a program (GUI
> based) that gives you all the SMART stats in slightly better plain English.
> There were no warnings of demise of any type (I'm back in Linux now and the
> program name escapes me, but it's not really important).
>
> In any event, just to be sure, I installed Ubuntu today (zapping the Fedora
> partitions) and ran the test again.
>
> Results in a pastebin here:
>
> http://bit.ly/hard-drive-is-in-fact-not-dying
>
> I wish I'd done that before posting here.  Not to mention the nice hard
> drive I just bought that I really don't need (but I drooled over it, so it
> will be here tomorrow).
>

As far as I can see, the output from smartctl from Ubuntu is identical
to the output of smartctl from Fedora which was discussed in your
previous thread. That clearly shows some issues with your drive, so I
don't know what logical leap you've made in the meantime to lead to
"There were no warnings of demise of any type".

The reason you don't see the pop-up appearing in the Gnome under
Ubuntu is probably because they don't ship that applet. It also wasn't
shipped in F-10. I'd really trust what you see in the output from
smartctl - it's just reporting raw data from the drive.

Incidentally, why have you aborted all of the SMART tests that you've
started? If I were you, I'd run an extended offline test and allow it
to complete.

> If I read the above mentioned output correctly, some propeller-head(s)
> somewhere got me panicking needlessly and wasted a lot of my time.
>

I think you need to reassess this conclusion, and also consider not
referring to people who are trying to help you in their own free time
(be it mailing list people or the people who wrote smartmontools,
smartctl etc) as "propeller-heads" that waste your time - this is just
plain rude.

Jonathan.



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-06 Thread Scott Beamer

Michael Schwendt spake thusly:

> Funnily, you posted a SMART report that shows the same values as before.

It does? The other one had much more output (didn't it?).

> A self-test that ended with a read failure. One sector that the drive has
> failed to reallocate/replace. Two sectors that have not been
> reallocated/replaced yet. Four reallocated sectors is not much of a
> threat, but you still need to observe that this value doesn't increase
> steadily.
>
> > http://bit.ly/hard-drive-is-in-fact-not-dying
> Why did you highlight the wrong lines?

They're not "wrong". I highlighted them to show stuff I felt was important.

> > Only Fedora had been raising the
> > alarm over the past month. :)
>
> To understand why a component raised the alarm you need to examine what
> values it looked at.


You're right. But I don't understand enough about this to know. 
Meanwhile, I'm getting 3 new drives this week. And I'm taking the 
computer into a shop to have them installed.  Since it's there, I'll 
have them test this drive and if in fact it's failing (or they tell me 
it's clearly headed in that direction) then I'll have them yank it.


Thanks for your feedback.




--
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-06 Thread Scott Beamer

On 07/06/2009 03:55 AM, Jonathan Underwood wrote:
[...]

> I think you need to reassess this conclusion, and also consider not
> referring to people who are trying to help you in their own free time
> (be it mailing list people or the people who wrote smartmontools,
> smartctl etc) as "propeller-heads" that waste your time - this is just
> plain rude.


You're right and I apologize and stand corrected. But I also followed up 
on the "propeller head" comment last night to clarify, and as I stated, it 
was not directed at anyone who has been trying to help. It was directed 
at the developers who (so I thought) contributed to the "bug" I thought 
existed in smartmontools.


Meanwhile, I'll wipe the egg off my face.

Thanks for your reply.


--
Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



Re: Fedora was wrong (I think) (was Re: "One or more disks are failing" ?)

2009-07-07 Thread Michael Schwendt
On Mon, 06 Jul 2009 14:58:13 -0700, Scott wrote:

> Michael Schwendt spake thusly:
> 
> >
> > Funnily, you posted a SMART report that shows the same values as before.
> 
> It does? The other one had much more output (didn't it?).

It's the same version of smartctl even. If there's anything you'd
like to prove, please use "diff" (or a graphical tool like "diffuse" or
"meld") to visually compare the output of two reports.
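
If you'd rather script the comparison, here is a minimal Python sketch using
the standard difflib module. It assumes each report was saved to a text file
beforehand; the file names fedora.txt and ubuntu.txt are only labels made up
for the example, and the two short strings stand in for real reports.

```python
import difflib
from typing import List


def report_diff(old: str, new: str) -> List[str]:
    """Return unified-diff lines between two saved smartctl reports."""
    return list(difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="fedora.txt",   # hypothetical label for the first report
        tofile="ubuntu.txt",     # hypothetical label for the second report
        lineterm="",
    ))


# Placeholder text, not real smartctl output.
first = "line one\nline two"
second = "line one\nline 2"

for line in report_diff(first, second):
    print(line)
```

An empty result means the two reports are identical line for line.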
 
> >> http://bit.ly/hard-drive-is-in-fact-not-dying
> > Why did you highlight the wrong lines?
> 
> They're not "wrong". I highlighted them to show stuff I felt was important.

Then why did you highlight unimportant lines? ;-)
For example, and since it's the "--all" report, you highlighted the
two lines for the "smartctl --log error /dev/sda" report, but not the
lines for the "smartctl --log selftest /dev/sda" report section.
