Re: [Hampshire] [OT] MTBF
Hugo Mills wrote:

[snip]

You can see all sorts of interesting things here which can easily be used to warn on pending failure of a drive.

It's not actually a very good guide to failure. The figures I've seen quoted from NetApp are that SMART data will only give you warning of a pending drive failure in about 20% of cases, and that's if you know what you're looking for (which most systems don't, as they can't do the same level of analysis as NetApp can, to get the data).

Hugo.

With my cynical hat on: NetApp would not exactly be an unbiased source of information, seeing as they have a business to run based on selling a solution. If SMART was better I wouldn't expect them to tell you ;)

But fair enough comment, SMART is not a substitute for a proper storage solution. I was more just pointing out that there are already lots of metrics tracked and stored by even a dumb consumer hard drive, and assuming NetApp are correct, 20% is better than nothing.

--
Please post to: Hampshire@mailman.lug.org.uk
Web Interface: https://mailman.lug.org.uk/mailman/listinfo/hampshire
LUG URL: http://www.hantslug.org.uk
--
Re: [Hampshire] [OT] MTBF
2009/7/20 Stephen Rowles step...@rowles.org.uk:

James Courtier-Dutton wrote:

I think people don't seem to realize that HDs have very low resistance to shock while switched on, and this is the main cause of HD failures.

Most (all?) modern HDDs have a whole raft of sensors and store lifetime information about read errors, temperature range etc. This is SMART (you might see it on the BIOS screen). In Linux you can query this using smartctl:

~]# smartctl --all /dev/sda

For example the stats from my current drive here at work:

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   093   006    Pre-fail  Always       -       16203744
  3 Spin_Up_Time            0x0003   098   095   070    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       69
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       139114079
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       11202
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       99
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   057   045    Old_age   Always       -       39 (Lifetime Min/Max 21/43)
194 Temperature_Celsius     0x0022   039   043   000    Old_age   Always       -       39 (0 19 0 0)
195 Hardware_ECC_Recovered  0x001a   064   060   000    Old_age   Always       -       164354431
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x       100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

You can see all sorts of interesting things here which can easily be used to warn on pending failure of a drive.

Also most laptop drives now have accelerometers which will detect any dangerous shock conditions and park the drive heads to prevent further damage to the drive. I cannot find it now but I watched a video on the

None of the above SMART parameters give any indication from the accelerometers. So, one has no way of telling if shock was a contributing factor to the HD failure. It would be nice to see SMART stats saying, we got this much shock before we managed to park the heads.

Another thing, for the pre-fail SMART attributes like:

Raw_Read_Error_Rate 16203744
Seek_Error_Rate 139114079
Hardware_ECC_Recovered 164354431

What is an acceptable value and what indicates things starting to go wrong? My laptop HD has these values at zero!!! On my desktop, they keep increasing over time. So, what is an acceptable rate?

James
Re: [Hampshire] [OT] MTBF
James Courtier-Dutton wrote:

None of the above smart parameters give any indication from the accelerometers. So, one has no way of telling if shock was a contributing factor to the HD failure. It would be nice to see smart stats saying, we got this much shock before we managed to park the heads.

Another thing, for the pre-fail smarts like:

Raw_Read_Error_Rate 16203744
Seek_Error_Rate 139114079
Hardware_ECC_Recovered 164354431

What is an acceptable value and what indicates things starting to go wrong? My laptop HD has these values at zero!!! On my desktop, they keep increasing over time. So, what is an acceptable rate?

James

Unfortunately this is my work desktop machine, and I don't think it has an accelerometer in it. My personal laptop at home would appear to have this line:

191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 0

Which according to Google is something to do with a shock-sensitive sensor on the drive.

Unfortunately I'm not an expert in analysing the output to tell you what the numbers mean. I expect there is some software that will do a better analysis job but I don't know of any off-hand. I've only used it once before in anger, on a drive that was making odd noises and behaving strangely. One of the numbers was huge, which I looked up, and Google suggested it indicated a failing drive, so I backed up the data and replaced it :)
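On the "what is an acceptable value" question, a rough sketch in Python of one way to read the attribute table quoted earlier in this thread. It rests on the rule described in the smartctl documentation: the normalized VALUE column is compared against THRESH, and an attribute counts as failed when VALUE is less than or equal to THRESH. The RAW_VALUE column is vendor-specific, which is probably why the same attribute reads zero on one drive and climbs into the millions on another. The function and class names below are made up for illustration, and the sample rows are copied from the output posted above. This is not a substitute for smartd or a proper monitoring tool.

```python
from dataclasses import dataclass
from typing import List, Optional

# A few rows copied from the smartctl output posted in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   093   006    Pre-fail  Always       -       16203744
  3 Spin_Up_Time            0x0003   098   095   070    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   081   060   030    Pre-fail  Always       -       139114079
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
190 Airflow_Temperature_Cel 0x0022   061   057   045    Old_age   Always       -       39 (Lifetime Min/Max 21/43)
195 Hardware_ECC_Recovered  0x001a   064   060   000    Old_age   Always       -       164354431
"""


@dataclass
class SmartAttribute:
    attr_id: int
    name: str
    value: int      # normalized current value
    worst: int      # lowest normalized value seen so far
    thresh: int     # vendor threshold; 0 means "no threshold"
    attr_type: str  # "Pre-fail" or "Old_age"
    raw: str        # vendor-specific raw value, kept as text


def parse_attribute_line(line: str) -> Optional[SmartAttribute]:
    """Parse one row of the attribute table; return None for other lines."""
    parts = line.split(None, 9)  # at most 10 fields, RAW_VALUE keeps its spaces
    if len(parts) < 10 or not parts[0].isdigit():
        return None
    try:
        value, worst, thresh = int(parts[3]), int(parts[4]), int(parts[5])
    except ValueError:
        return None
    return SmartAttribute(int(parts[0]), parts[1], value, worst, thresh,
                          parts[6], parts[9].strip())


def parse_table(text: str) -> List[SmartAttribute]:
    """Parse every attribute row found in a block of smartctl output."""
    parsed = (parse_attribute_line(line) for line in text.splitlines())
    return [attr for attr in parsed if attr is not None]


def failing(attrs: List[SmartAttribute]) -> List[SmartAttribute]:
    """Attributes whose normalized value has reached the threshold."""
    return [a for a in attrs if a.thresh > 0 and a.value <= a.thresh]


def tightest_margin(attrs: List[SmartAttribute]) -> Optional[SmartAttribute]:
    """The thresholded attribute with the least headroom (VALUE - THRESH)."""
    candidates = [a for a in attrs if a.thresh > 0]
    return min(candidates, key=lambda a: a.value - a.thresh, default=None)


if __name__ == "__main__":
    attributes = parse_table(SAMPLE)
    for attr in failing(attributes):
        print("FAILING:", attr.name, attr.value, "<=", attr.thresh)
    closest = tightest_margin(attributes)
    if closest is not None:
        print("Least headroom:", closest.name, closest.value - closest.thresh)
```

On the sample rows nothing has reached its threshold, so by this rule the large raw numbers for Raw_Read_Error_Rate and Seek_Error_Rate are not alarming on their own. In practice you would feed the function the text captured from smartctl rather than a pasted sample.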
Re: [Hampshire] Interesting Hardware Reference Poster
Hi All

I am planning to print it out. I have access to an A0 printer. I can also get pictures up to B0 printed and laminated; cost £5. Unfortunately Media Workshop is shut until September. It is only open during school term time.

If anyone wants laminated prints of http://tinyurl.com/lf3dqb let me know and I will get them printed in September.

John Eayrs

On Sunday 19 July 2009 15:10:11 Victor Churchill wrote:

2009/7/19 Sean Gibbins s...@funkygibbins.me.uk:

Subject says it all really: http://tinyurl.com/lf3dqb

I thought it might come in handy for other folks like me who sometimes find themselves playing with older kit.

Heee-wack! That is pretty funky! I feel tempted to print it out just for the geek appeal. I expanded the image and was thinking where's the RAM?... then found that my window was only showing two thirds of the width.

Nice one... even though I very rarely take machines to bits and have never e.g. de/populated a motherboard it's a great resource to know about.
Re: [Hampshire] [OT] MTBF
Hi

It would be nice to see smart stats saying, we got this much shock before we managed to park the heads.

As far as I can see this is a non-statement. What shock is needed for the disk data reader to touch the spinning disk? If this shock occurs there is no time to do any disk parking.

MTBF is based on the assumption that repeated environmental stresses are below a certain value. If the environmental stress is above a certain value then a phenomenon known as cyclic stress fatigue will be of great influence on the failure rate. Please excuse me if I have not quite got the right terms; it is over 20 years since I was involved in quality and reliability.

There are also other effects that MTBF does not take account of. There are effects that come about due to ageing that can only be determined by running something for the actual length of time concerned.

Hard disks fail very quickly if they have to operate in a tank.

John Eayrs

On Monday 20 July 2009 14:27:09 Stephen Rowles wrote:

James Courtier-Dutton wrote:

None of the above smart parameters give any indication from the accelerometers. So, one has no way of telling if shock was a contributing factor to the HD failure. It would be nice to see smart stats saying, we got this much shock before we managed to park the heads.

Another thing, for the pre-fail smarts like:

Raw_Read_Error_Rate 16203744
Seek_Error_Rate 139114079
Hardware_ECC_Recovered 164354431

What is an acceptable value and what indicates things starting to go wrong? My laptop HD has these values at zero!!! On my desktop, they keep increasing over time. So, what is an acceptable rate?

James

Unfortunately this is my work desktop machine, I don't think it has an accelerometer in it... my personal laptop at home would appear to have this line:

191 G-Sense_Error_Rate 0x000a 100 100 000 Old_age Always - 0

Which according to google is something to do with shock-sensitive sensor on the drive.

Unfortunately I'm not an expert in analysing the output to tell you what the numbers mean, I expect there is some software that will do a better analysis job but I don't know of any off-hand. I've only used it once before in anger on a drive that was making odd noises and behaving strangely, one of the numbers was huge, which I looked up and google suggested it indicated a failing drive, so I backed up the data and replaced it :)
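To put a number on what an MTBF figure does and does not claim, here is a small Python sketch of the usual textbook conversion from MTBF to an annualized failure rate. It assumes a constant failure rate (the exponential model), which is exactly the "stresses stay below a certain value, no ageing" assumption described above, so it says nothing about shock damage or wear-out. The 1,000,000 hour MTBF is a hypothetical figure chosen for illustration, not taken from any datasheet; the 11202 hours is the Power_On_Hours raw value from the smartctl output posted earlier in the thread.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability of failing within one year of continuous running,
    assuming a constant failure rate of 1 / MTBF."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def survival_probability(hours: float, mtbf_hours: float) -> float:
    """Probability of surviving the given number of powered-on hours
    under the same constant-failure-rate assumption."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return math.exp(-hours / mtbf_hours)


if __name__ == "__main__":
    mtbf = 1_000_000  # hypothetical figure, for illustration only
    print(f"AFR: {annualized_failure_rate(mtbf):.3%}")
    print(f"Chance of reaching 11202 h: {survival_probability(11202, mtbf):.3%}")
```

Under these assumptions a million-hour MTBF works out to a bit under 1% of drives failing per year, not one drive lasting over a century. Once stresses exceed the design limits or the drive starts to wear out, the constant-rate model stops applying and the real failure rate will be higher.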
Re: [Hampshire] [OT] MTBF
On Sunday 19 Jul 2009, Philip Stubbs wrote:

snip

An example. Vacuum cleaners used to be rated only in watts. If you wanted a vacuum cleaner with lots of suck, you bought one that consumed the most watts. However, the way the watts are calculated was standardized. Run the vacuum in free air for one minute. Run the vacuum with its inlet blocked for one minute. Average watts consumed is then the rating.

Marketing then say to the engineers, we need more watts. Well the vacuum consumes less power when the inlet is blocked, so the engineers introduce leaks into the design, so that when the inlet is blocked, the pump is still shifting air and doing work, keeping the power consumption up. Never mind that the vacuum performance is compromised.

The end result is I no longer have much faith in the numbers on the box. The more colours, pictures and words on the box means more input from marketing, and the greater the pinch of salt needed. :-)

Having worked in marketing for a year once upon a time, I'd say that you are actually being generous...

That's not to say that you can't cleverly market something that is well built, but all too often clever marketing is used as a substitute for a well designed and built product...

--
Adam Trickett
Overton, HANTS, UK

Yes, I'm bitter and cynical. That does not make me wrong.
-- anon
[Hampshire] Acer Laptop A3380 Power Problem
Hi all

I'm having a power problem with the wife's laptop. I'll keep this as short as I can...

In chronological order

* Laptop was working fine
* Wife spills liquid in it and it appears to die. We leave it 12 hours
* Laptop comes back to life - no problems
* Then (months later) it refuses to run on the mains and obviously the battery runs out. I conclude whatever's wrong cannot be to do with orange squash spilled inside.
* Try another power supply - no dice.
* Notice the central pin on the laptop's power socket is now in the male end of the charger (ooer!). Apparently this is something of a design fault for these models. A local shop have soldered on a new socket but they said they couldn't make it boot. This is probably because I'd taken the HD out. Having reinstalled it - delighted to see the thing spring back to life.
* However - after 5 minutes, I got the 7% battery life remaining warning, i.e. it's still not working from AC mains despite the socket being fixed.

I'm loath to simply scrap this because I've proved it's working. It's just that it refuses to either charge the battery or work from AC mains.

Any ideas anyone?

FWIW I've tried plugging it in with the battery completely removed - no dice.

Cheers
Rob
Re: [Hampshire] Acer Laptop A3380 Power Problem
On Mon, 20 Jul 2009 21:22:40 +0100, xendis...@gmx.com said:

FWIW I've tried plugging it in with the battery completely removed - no dice.

Will it run on mains if you remove the battery??

I think the answer's in the question.
Re: [Hampshire] Acer Laptop A3380 Power Problem
On Monday 20 July 2009 21:44:05 Keith Edmunds wrote:

On Mon, 20 Jul 2009 21:22:40 +0100, xendis...@gmx.com said:

FWIW I've tried plugging it in with the battery completely removed - no dice.

Will it run on mains if you remove the battery??

I think the answer's in the question.

I must go to Specsavers :o

Tim
[Hampshire] LiveCD distro reputation good device detection
Hello,

I work in a computer workshop and often get PCs in with weird Windows install problems. I was wondering if someone could suggest a LiveCD distribution that has a good reputation for a wide range of device detection. Then I would have a CD that could prove that the hardware was working and that it was likely to be a driver or other software problem.

Any thoughts? Feel free to suggest something that works better on laptops than desktops and vice versa, since laptops are custom built in comparison to off-the-shelf desktop parts.

Thanks
Martin N

Owner of the bwfc yahoogroup and Co-Moderator of MiniDisc and amithlonopen yahoo groups.
Re: [Hampshire] LiveCD distro reputation good device detection
trotter wrote:

Hello,

I work in a computer workshop and often get PCs in with weird Windows install problems. I was wondering if someone could suggest a LiveCD distribution that has a good reputation for a wide range of device detection. Then I would have a CD that could prove that the hardware was working and that it was likely to be a driver or other software problem.

Any thoughts? Feel free to suggest something that works better on laptops than desktops and vice versa, since laptops are custom built in comparison to off-the-shelf desktop parts.

Thanks
Martin N

Owner of the bwfc yahoogroup and Co-Moderator of MiniDisc and amithlonopen yahoo groups.

PCLinuxOS would be my choice, though I'm sure plenty here will suggest Ubuntu :)

James
Re: [Hampshire] LiveCD distro reputation good device detection
On Monday 20 July 2009 23:05:47 James Ashburner wrote:

trotter wrote:

Hello,

I work in a computer workshop and often get PCs in with weird Windows install problems. I was wondering if someone could suggest a LiveCD distribution that has a good reputation for a wide range of device detection. Then I would have a CD that could prove that the hardware was working and that it was likely to be a driver or other software problem.

Any thoughts? Feel free to suggest something that works better on laptops than desktops and vice versa, since laptops are custom built in comparison to off-the-shelf desktop parts.

Thanks
Martin N

Owner of the bwfc yahoogroup and Co-Moderator of MiniDisc and amithlonopen yahoo groups.

PCLinuxOS would be my choice, though I'm sure plenty here will suggest Ubuntu :)

And then there is Knoppix. And Puppy.

PCLOS, Knoppix and Puppy are the ones I find myself actually using for that type of purpose. Puppy is good for comparatively low-resourced computers. But if I am only taking one with me, that one is Knoppix. :-)

Lisi
Re: [Hampshire] LiveCD distro reputation good device detection
At 23:16 20/07/2009, you wrote:

On Monday 20 July 2009 23:05:47 James Ashburner wrote:

trotter wrote:

Hello,

I work in a computer workshop and often get PCs in with weird Windows install problems. I was wondering if someone could suggest a LiveCD distribution that has a good reputation for a wide range of device detection. Then I would have a CD that could prove that the hardware was working and that it was likely to be a driver or other software problem.

Any thoughts? Feel free to suggest something that works better on laptops than desktops and vice versa, since laptops are custom built in comparison to off-the-shelf desktop parts.

Thanks
Martin N

Owner of the bwfc yahoogroup and Co-Moderator of MiniDisc and amithlonopen yahoo groups.

PCLinuxOS would be my choice, though I'm sure plenty here will suggest Ubuntu :)

And then there is Knoppix. And Puppy.

PCLOS, Knoppix and Puppy are the ones I find myself actually using for that type of purpose. Puppy is good for comparatively low-resourced computers. But if I am only taking one with me, that one is Knoppix. :-)

Okay, I could do with some info on whether hardware detection is perceived to be better on desktops or on laptops with any of those 3 distributions. Any version to avoid? I seem to remember someone on here or another list saying that a particular distro version was better at coping with errors on bootup than another.

thanks
Martin N

Owner of the bwfc yahoogroup and Co-Moderator of MiniDisc and amithlonopen yahoo groups.