Re: SATA - System Freezes

2008-06-24 Thread Nifty Hat Mitch
On Mon, Jun 23, 2008 at 9:52 PM, Nifty Hat Mitch <[EMAIL PROTECTED]> wrote:
>
> On Mon, 23 Jun 2008 08:13:54 -0700, Robin Laing <[EMAIL PROTECTED]> wrote:
>
> > Henry Ritzlmayr wrote:
> >> On Friday, 20.06.2008, 09:37 -0600, Robin Laing wrote:
> >>> Henry Ritzlmayr wrote:
>  On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:
> > Henry Ritzlmayr wrote:
> >> On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:
> >>> Hello Everyone,
> >>>
> >>> I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a
> >>> couple of times a day. When it does I see this on /var/log/messages:
> >
>
> Is smartd enabled? Is smartd configured correctly for this disk?
> Some smartd actions will take a disk off line for some tests.
> When those tests are running other commands to the disk may time out.
> Thus the "smartd" daemon could be triggering the dead time.
>
> IMO, smartd is a cool tool.  It does catch lots of disk failures in time
> to back up and replace the disk.  It also can do things that are unexpected.

Checking back in the archives, the SMART question was asked, but the
answer did not make clear whether SMART was disabled both in the BIOS and
on the drive.  If the BIOS has SMART enabled, the drive could still stall
for varying periods while a self-test is running.


--
Nifty Hat Mitch
T o m M i t c h e l l

-- 
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list


Re: SATA - System Freezes

2008-06-23 Thread Nifty Hat Mitch
On Mon, 23 Jun 2008 08:13:54 -0700, Robin Laing <[EMAIL PROTECTED]> wrote:

> Henry Ritzlmayr wrote:
>> On Friday, 20.06.2008, 09:37 -0600, Robin Laing wrote:
>>> Henry Ritzlmayr wrote:
 On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:
> Henry Ritzlmayr wrote:
>> On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:
>>> Hello Everyone,
>>>
>>> I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a
>>> couple of times a day. When it does I see this on /var/log/messages:
>

Is smartd enabled? Is smartd configured correctly for this disk?
Some smartd actions will take a disk off line for some tests.
When those tests are running other commands to the disk may time out.
Thus the "smartd" daemon could be triggering the dead time.

IMO, smartd is a cool tool.  It does catch lots of disk failures in time
to back up and replace the disk.  It also can do things that are unexpected.


-- 
Nifty Hat Mitch
T o m   M i t c h e l l



Re: SATA - System Freezes

2008-06-23 Thread Robin Laing

Henry Ritzlmayr wrote:

On Friday, 20.06.2008, 09:37 -0600, Robin Laing wrote:

Henry Ritzlmayr wrote:

On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:

Henry Ritzlmayr wrote:

On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:

Hello Everyone,

I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
couple of times a day. When it does I see this on /var/log/messages:


I have looked at how lmsensors works and you can make some changes to 
the configuration files to increase the accuracy of the reports.  I have 
not played with it much though.  It is just as easy to pull the cover 
and measure the voltages with a volt meter.


I will try lmsensors and see what it can do here. If this works out,
then my initial question stands: could such info be included in the
kernel output? Maybe not (by design), but a question can't hurt. We do
development within SAP here at our company, and for special cases, when
something fails, we don't only throw an error message to the user. We try
to retrieve as much information as possible from the underlying OS/DB
to help with later debugging, or at least to be more informative
in the message we display to the user.



My point is that lmsensors is only as accurate as the sensor configuration 
file.  In my case, lmsensors didn't report a problem, but when I used a 
digital voltmeter, my power supply read low, even after being removed 
from the system.  lmsensors was showing a 0.5 volt difference.
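
As a rough illustration of the calibration idea, here is a minimal Python sketch that derives a correction factor from one voltmeter reading and applies it to later sensor readings. The function names and the example voltages are hypothetical, and a single-point linear scale is an assumption; real lm-sensors calibration is done in its configuration file, not in Python.

```python
def scale_factor(sensor_volts: float, meter_volts: float) -> float:
    """Ratio that maps a sensor reading onto the voltmeter reading."""
    if sensor_volts <= 0:
        raise ValueError("sensor reading must be positive")
    return meter_volts / sensor_volts


def corrected(sensor_volts: float, factor: float) -> float:
    """Apply the single-point correction to a raw sensor reading."""
    return sensor_volts * factor


# Hypothetical numbers: the sensor claims 5.0 V while the meter shows 4.5 V,
# i.e. a 0.5 V disagreement of the kind described above.
factor = scale_factor(5.0, 4.5)
print(round(corrected(5.0, factor), 2))
```

A single reading only fixes the scale at one operating point, so it is worth repeating the measurement with the system warm and under load.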




I have two of the same brand of power supplies (only ones available in 
our area) with exactly the same fault.  A known fault that can be fixed 
with the addition of three variable resistors.


Don't get me wrong here, but this is not something I would like to see
in one of my systems here. I could imagine the first response from one
of our managers if anything fails (even if it is only a web page which
is not displayed properly) that this must be due to the newly installed
resistors ;-)

Henry 


I have seen an adapter that fits into a case that gives you a selectable 
display that has voltages and temperatures.  This may be useful if you 
can find it.


--
Robin Laing




Re: SATA - System Freezes

2008-06-23 Thread Henry Ritzlmayr
On Friday, 20.06.2008, 09:37 -0600, Robin Laing wrote:
> Henry Ritzlmayr wrote:
> > On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:
> >> Henry Ritzlmayr wrote:
> >>> On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:
>  Hello Everyone,
> 
>  I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a
>  couple of times a day. When it does I see this on /var/log/messages:
> 
> >> -- 
> >> Robin Laing
> > 
> > 
> > Question to the devs - could you think of any way that the kernel output
> > could be a bit more informative, or don't you get enough information from
> > the hardware for such an issue? I also checked SMART for unusual power
> > cycle counts but to no avail.
> > 
> > Henry
> > 
> 
> Henry, it would be nice, but if the system, including the BIOS, doesn't 
> know that there is a problem with the power supply, then how is the 
> hardware supposed to report it?  Maybe a sensor could be added to 
> the hard drive to detect this type of error.

A sensor on the hard drive would be a good start, I guess. But even better
would be a sensor on the power supply. I know that this is not something
which can be resolved on the Fedora list, since it involves hardware
support which, as Alan already stated, is obviously not there.

> I have looked at how lmsensors works and you can make some changes to 
> the configuration files to increase the accuracy of the reports.  I have 
> not played with it much though.  It is just as easy to pull the cover 
> and measure the voltages with a volt meter.

I will try lmsensors and see what it can do here. If this works out,
then my initial question stands: could such info be included in the
kernel output? Maybe not (by design), but a question can't hurt. We do
development within SAP here at our company, and for special cases, when
something fails, we don't only throw an error message to the user. We try
to retrieve as much information as possible from the underlying OS/DB
to help with later debugging, or at least to be more informative
in the message we display to the user.


> I have two of the same brand of power supplies (only ones available in 
> our area) with exactly the same fault.  A known fault that can be fixed 
> with the addition of three variable resistors.

Don't get me wrong here, but this is not something I would like to see
in one of my systems here. I could imagine the first response from one
of our managers if anything fails (even if it is only a web page which
is not displayed properly) that this must be due to the newly installed
resistors ;-)

Henry 






Re: SATA - System Freezes

2008-06-20 Thread Roger Heflin

Henry Ritzlmayr wrote:

On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:

Henry Ritzlmayr wrote:

On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:

Hello Everyone,

I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
couple of times a day. When it does I see this on /var/log/messages:


--- cut here -

kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
kernel: ata3.00: cmd ca/00:50:67:85:03/00:00:00:00:00/e0 tag 0 dma 40960 out
kernel:  res 40/00:00:76:6c:03/84:00:10:00:00/e0 Emask 0x4 (timeout)
kernel: ata3.00: status: { DRDY }
kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
kernel: ata3: device not ready (errno=-16), forcing hardreset
kernel: ata3: soft resetting link
kernel: ata3.00: configured for UDMA/33
kernel: ata3: EH complete
kernel: sd 2:0:0:0: [sdc] 321672960 512-byte hardware sectors (164697 MB)
kernel: sd 2:0:0:0: [sdc] Write Protect is off
kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA


--- cut here -

/dev/sdc is my main drive. The only thing I can think of...is that this drive 
is actually a PATA drive connected to the SATA controller on MoBo thru 
a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is 
faulty...or could this be a known issue with libata?  Anyone had same 
problem?


Thanks,
Jorge

Many months ago I had the exact same output. Lots of Google voodoo and
trial and error solved it. My issue was that too many (3) drives were
connected to one outlet of the power supply. After recabling, it all
went away. Others claimed that they got rid of the problem by refitting
the SATA cables.

Henry


Henry,

I was just about to suggest checking the power supply.  I had a power 
supply that wouldn't supply enough voltage on the 5V rail.  My system 
would freeze.  Turned out to be a known fault with that brand of 
power supplies.


Took two power supplies to find out that it was a known fault.  Argh. 
Warranties are useless on some products.  I also learned that the sensor 
voltages were not accurate in the BIOS in comparison to a digital 
voltmeter on the actual power cable.


--
Robin Laing


What I didn't like (and still don't) is the fact that there is no indication
that this could be even slightly related to the power supply. As stated above,
it was more trial and error that solved this issue. Hopefully this also
solves the issue for the OP.


Question to the devs - could you think of any way that the kernel output
could be a bit more informative, or don't you get enough information from
the hardware for such an issue? I also checked SMART for unusual power
cycle counts but to no avail.


Henry





The problem with power supplies is that often they don't fully fail. If the 
voltage goes low enough, things don't completely fail: only some operations 
fail and some do not, and often nothing notices that the PS was low for too 
long. Devices may fail only for the short period of the low voltage and 
be fine the next second, or if they fully fail the OS may still be able to reset 
the device and get it back up, but from the HW's point of view there was never a 
complete power failure. And none of the normal voltage monitoring devices sit 
there and sample the power voltages over time and verify they were always good 
for the entire time; they only check when someone looks. All that really 
matters is that for a tiny period of time the voltage was too low and 
screwed someone up enough to cause trouble.
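
To make the "sample over time" point concrete, here is a minimal Python sketch that keeps the lowest and highest reading seen on a rail and counts samples outside a tolerance band. The class name, the sample values, and the +/-5% band are assumptions for illustration; where the readings come from (a meter, a sensor chip) is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class RailStats:
    nominal: float
    tolerance: float = 0.05  # assumed +/-5% band; check your PSU's spec
    minimum: float = float("inf")
    maximum: float = float("-inf")
    violations: int = 0

    def add(self, volts: float) -> None:
        """Record one sample and count it if it falls outside the band."""
        self.minimum = min(self.minimum, volts)
        self.maximum = max(self.maximum, volts)
        low = self.nominal * (1 - self.tolerance)
        high = self.nominal * (1 + self.tolerance)
        if not (low <= volts <= high):
            self.violations += 1


# Hypothetical samples from a 5 V rail: one brief dip among good readings.
stats = RailStats(nominal=5.0)
for reading in [5.02, 5.01, 4.60, 5.00]:
    stats.add(reading)
print(stats.minimum, stats.violations)
```

A one-off check would most likely see one of the good readings; only the running minimum records that the dip happened. Even so, a software poller can miss dips shorter than its sampling interval.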


I have seen a 110V AC outage that resulted in a remote-controlled power switch 
switching off all of its relays, while the internal computer running those relays 
reported them all on (it did not reboot, had no idea the relays internal to 
it were switched off, and had no feedback on their position). Obviously in this 
case the relays were more sensitive to voltage issues than the computer running 
the relays. That is likely a design issue: you really want to make sure the computer 
goes off first, or make sure that the computer has actual feedback on the relay 
positions so it knows something went wrong.


I have seen a power supply that was undersized on a certain voltage result in 
the ethernet going offline (the kernel reported the ethernet was screwed up, but had 
no idea why and was unable to reset it and get it working again). It required a 
reboot to get ethernet back again, but other than the ethernet going offline 
nothing else looked wrong with the machines, no other failures 
could be found, and absolutely nothing indicated that there were any 
voltage issues.


Roger



Re: SATA - System Freezes

2008-06-20 Thread Robin Laing

Henry Ritzlmayr wrote:

On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:

Henry Ritzlmayr wrote:

On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:

Hello Everyone,

I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
couple of times a day. When it does I see this on /var/log/messages:



--
Robin Laing


What I didn't like (and still don't) is the fact that there is no indication
that this could be even slightly related to the power supply. As stated above,
it was more trial and error that solved this issue. Hopefully this also
solves the issue for the OP.


Question to the devs - could you think of any way that the kernel output
could be a bit more informative, or don't you get enough information from
the hardware for such an issue? I also checked SMART for unusual power
cycle counts but to no avail.


Henry



Henry, it would be nice, but if the system, including the BIOS, doesn't 
know that there is a problem with the power supply, then how is the 
hardware supposed to report it?  Maybe a sensor could be added to 
the hard drive to detect this type of error.


I didn't suspect a power supply problem the second time either as the 
power supply was fairly new (upgraded when I added more drives).  The 
sensors showed all voltages as being normal.  Even the BIOS said it was 
normal but under load and after the system warmed up, the power supply 
drifted to just under the lower recommended limit under no real load.


I have looked at how lmsensors works and you can make some changes to 
the configuration files to increase the accuracy of the reports.  I have 
not played with it much though.  It is just as easy to pull the cover 
and measure the voltages with a volt meter.


I have two of the same brand of power supplies (only ones available in 
our area) with exactly the same fault.  A known fault that can be fixed 
with the addition of three variable resistors.


--
Robin Laing



Re: SATA - System Freezes

2008-06-20 Thread Alan Cox
> Question to the devs - could you think of any way that the kernel output
> could be a bit more informative, or don't you get enough information from
> the hardware for such an issue? I also checked SMART for unusual power
> cycle counts but to no avail.

There isn't information on the causes - it just didn't work.



Re: SATA - System Freezes

2008-06-19 Thread Henry Ritzlmayr
On Thursday, 19.06.2008, 09:52 -0600, Robin Laing wrote:
> Henry Ritzlmayr wrote:
> > On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:
> >> Hello Everyone,
> >>
> >> I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
> >> couple of times a day. When it does I see this on /var/log/messages:
> >>
> >> --- cut here -
> >>
> >> kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> >> kernel: ata3.00: cmd ca/00:50:67:85:03/00:00:00:00:00/e0 tag 0 dma 40960 out
> >> kernel:  res 40/00:00:76:6c:03/84:00:10:00:00/e0 Emask 0x4 (timeout)
> >> kernel: ata3.00: status: { DRDY }
> >> kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
> >> kernel: ata3: device not ready (errno=-16), forcing hardreset
> >> kernel: ata3: soft resetting link
> >> kernel: ata3.00: configured for UDMA/33
> >> kernel: ata3: EH complete
> >> kernel: sd 2:0:0:0: [sdc] 321672960 512-byte hardware sectors (164697 MB)
> >> kernel: sd 2:0:0:0: [sdc] Write Protect is off
> >> kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> >>
> >> --- cut here -
> >>
> >> /dev/sdc is my main drive. The only thing I can think of...is that this drive
> >> is actually a PATA drive connected to the SATA controller on MoBo thru
> >> a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is
> >> faulty...or could this be a known issue with libata?  Anyone had same
> >> problem?
> >>
> >> Thanks,
> >> Jorge
> > 
> > Many months ago I had the exact same output. Lots of Google voodoo and
> > trial and error solved it. My issue was that too many (3) drives were
> > connected to one outlet of the power supply. After recabling, it all
> > went away. Others claimed that they got rid of the problem by refitting
> > the SATA cables.
> > 
> > Henry
> > 
> 
> Henry,
> 
> I was just about to suggest checking the power supply.  I had a power 
> supply that wouldn't supply enough voltage on the 5V rail.  My system 
> would freeze.  Turned out to be a known fault with that brand of 
> power supplies.
> 
> Took two power supplies to find out that it was a known fault.  Argh. 
> Warranties are useless on some products.  I also learned that the sensor 
> voltages were not accurate in the BIOS in comparison to a digital 
> voltmeter on the actual power cable.
> 
> -- 
> Robin Laing

What I didn't like (and still don't) is the fact that there is no indication
that this could be even slightly related to the power supply. As stated above,
it was more trial and error that solved this issue. Hopefully this also
solves the issue for the OP.

Question to the devs - could you think of any way that the kernel output
could be a bit more informative, or don't you get enough information from
the hardware for such an issue? I also checked SMART for unusual power
cycle counts but to no avail.

Henry







Re: SATA - System Freezes

2008-06-19 Thread Robin Laing

Henry Ritzlmayr wrote:

On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:

Hello Everyone,

I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
couple of times a day. When it does I see this on /var/log/messages:


--- cut here -

kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
kernel: ata3.00: cmd ca/00:50:67:85:03/00:00:00:00:00/e0 tag 0 dma 40960 out
kernel:  res 40/00:00:76:6c:03/84:00:10:00:00/e0 Emask 0x4 (timeout)
kernel: ata3.00: status: { DRDY }
kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
kernel: ata3: device not ready (errno=-16), forcing hardreset
kernel: ata3: soft resetting link
kernel: ata3.00: configured for UDMA/33
kernel: ata3: EH complete
kernel: sd 2:0:0:0: [sdc] 321672960 512-byte hardware sectors (164697 MB)
kernel: sd 2:0:0:0: [sdc] Write Protect is off
kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA


--- cut here -

/dev/sdc is my main drive. The only thing I can think of...is that this drive 
is actually a PATA drive connected to the SATA controller on MoBo thru 
a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is 
faulty...or could this be a known issue with libata?  Anyone had same 
problem?


Thanks,
Jorge


Many months ago I had the exact same output. Lots of Google voodoo and
trial and error solved it. My issue was that too many (3) drives were
connected to one outlet of the power supply. After recabling, it all
went away. Others claimed that they got rid of the problem by refitting
the SATA cables.

Henry



Henry,

I was just about to suggest checking the power supply.  I had a power 
supply that wouldn't supply enough voltage on the 5V rail.  My system 
would freeze.  Turned out to be a known fault with that brand of 
power supplies.


Took two power supplies to find out that it was a known fault.  Argh. 
Warranties are useless on some products.  I also learned that the sensor 
voltages were not accurate in the BIOS in comparison to a digital 
voltmeter on the actual power cable.


--
Robin Laing



Re: SATA - System Freezes

2008-06-18 Thread Jorge Fábregas
On Wednesday 18 June 2008 07:13:46 am Henry Ritzlmayr wrote:
> My issue was that on one outlet of the power supply there were
>  too many (3) drives connected. 

Thanks for the tip, Henry. Indeed I have 3 drives connected to the same 
power-supply outlet. I'm going to rewire and check the behaviour.

Thanks!
Jorge





Re: SATA - System Freezes

2008-06-18 Thread Henry Ritzlmayr
On Tuesday, 17.06.2008, 13:25 -0400, Jorge Fábregas wrote:
> Hello Everyone,
> 
> I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
> couple of times a day. When it does I see this on /var/log/messages:
> 
> --- cut here -
> 
> kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> kernel: ata3.00: cmd ca/00:50:67:85:03/00:00:00:00:00/e0 tag 0 dma 40960 out
> kernel:  res 40/00:00:76:6c:03/84:00:10:00:00/e0 Emask 0x4 (timeout)
> kernel: ata3.00: status: { DRDY }
> kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
> kernel: ata3: device not ready (errno=-16), forcing hardreset
> kernel: ata3: soft resetting link
> kernel: ata3.00: configured for UDMA/33
> kernel: ata3: EH complete
> kernel: sd 2:0:0:0: [sdc] 321672960 512-byte hardware sectors (164697 MB)
> kernel: sd 2:0:0:0: [sdc] Write Protect is off
> kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't 
> support DPO or FUA
> 
> --- cut here -
> 
> /dev/sdc is my main drive. The only thing I can think of...is that this drive 
> is actually a PATA drive connected to the SATA controller on MoBo thru 
> a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is 
> faulty...or could this be a known issue with libata?  Anyone had same 
> problem?
> 
> Thanks,
> Jorge

Many months ago I had the exact same output. Lots of Google voodoo and
trial and error solved it. My issue was that too many (3) drives were
connected to one outlet of the power supply. After recabling, it all
went away. Others claimed that they got rid of the problem by refitting
the SATA cables.

Henry











Re: SATA - System Freezes

2008-06-17 Thread Alan Cox
On Tue, 17 Jun 2008 18:14:08 -0400
Jorge Fábregas <[EMAIL PROTECTED]> wrote:

> On Tuesday 17 June 2008 04:40:25 pm Alan Cox wrote:
> > What does smart utils have to say about the drive last logged errors ?
> 
> Agh thanks Alan. I forgot about S.M.A.R.T..but shame on me:  smartd isn't 
> running on my machine (not enabled INIT-wise).  I just started it and enabled 
> it via chkconfig.
> 
> Anyway, I ran some tests using smartctl and it looks fine.  The "last logged 
> errors" you mention would be in some db on my filesystem right?  The memory 
> on the drive (for SMART stuff) won't store any historical information right?

The drive itself stores the last few errored/failed commands, but often
only until power cycled. It's very useful if you get funny behaviour and
want to see how the drive saw it.



Re: SATA - System Freezes

2008-06-17 Thread Jorge Fábregas
On Tuesday 17 June 2008 06:14:08 pm Jorge Fábregas wrote:
> The memory on the drive (for SMART stuff) won't store any historical
>  information right?

Oops...indeed it does.  

smartctl -l error /dev/sdc

--- cut here 
smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged

--- cut here -

I'm going to replace my convertor ...
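
For anyone wanting to script this check, here is a minimal Python sketch that looks for the "No Errors Logged" line in captured `smartctl -l error` output. The function name is hypothetical, and it assumes the report wording matches the smartctl 5.38 output quoted above; other versions may phrase it differently.

```python
def no_errors_logged(smartctl_output: str) -> bool:
    """Return True only if the report contains the 'No Errors Logged' line."""
    return any(
        line.strip() == "No Errors Logged"
        for line in smartctl_output.splitlines()
    )


# Sample copied from the report quoted above.
SAMPLE = """\
=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
"""
print(no_errors_logged(SAMPLE))
```

The text could be captured with `subprocess.run(["smartctl", "-l", "error", "/dev/sdc"], capture_output=True, text=True).stdout`, which normally needs root. A False result only means the line was not found, so the full report still needs reading.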

Thanks!
Jorge



Re: SATA - System Freezes

2008-06-17 Thread Jorge Fábregas
On Tuesday 17 June 2008 04:40:25 pm Alan Cox wrote:
> What does smart utils have to say about the drive last logged errors ?

Agh thanks Alan. I forgot about S.M.A.R.T..but shame on me:  smartd  isn't 
running on my machine (not enabled INIT-wise).  I just started it and enabled 
it via chkconfig.

Anyway, I ran some tests using smartctl and it looks fine.  The "last logged 
errors" you mention would be in some db on my filesystem right?  The memory 
on the drive (for SMART stuff) won't store any historical information right?

Thanks!
Jorge




Re: SATA - System Freezes

2008-06-17 Thread Alan Cox
> SMART Error Log Version: 1
> No Errors Logged

So the drive is happy (unless you've power cycled since it happened in
which case try and grab the next one).

A single odd timeout/reset/reissue is most likely just noise. SATA has
full CRC checksumming on the data and commands so is robust and can
accept them - they just take time to discover. PATA you shouldn't really
be seeing them on a sane machine but the data transfers (the fast bit)
are also CRC protected.



Re: SATA - System Freezes

2008-06-17 Thread Geoffrey Leach
On 06/17/2008 10:25:46 AM, Jorge Fábregas wrote:
> Hello Everyone,
> 
> I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a
> couple of times a day. When it does I see this on /var/log/messages:
> 
> --- cut here -
> 
> kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> kernel: ata3.00: cmd ca/00:50:67:85:03/00:00:00:00:00/e0 tag 0 dma 40960 out
> kernel:  res 40/00:00:76:6c:03/84:00:10:00:00/e0 Emask 0x4 (timeout)
> kernel: ata3.00: status: { DRDY }
> kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
> kernel: ata3: device not ready (errno=-16), forcing hardreset
> kernel: ata3: soft resetting link
> kernel: ata3.00: configured for UDMA/33
> kernel: ata3: EH complete
> kernel: sd 2:0:0:0: [sdc] 321672960 512-byte hardware sectors (164697 MB)
> kernel: sd 2:0:0:0: [sdc] Write Protect is off
> kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> 
> --- cut here -
> 
> /dev/sdc is my main drive. The only thing I can think of...is that this drive
> is actually a PATA drive connected to the SATA controller on MoBo thru
> a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is
> faulty...or could this be a known issue with libata?  Anyone had same
> problem?

I have _exactly_ the same problem.  Here's the output from smartctl --all.
I'd be delighted to do any (non-destructive!) testing that might be useful.

 smartctl --all

smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 7200.1 series
Device Model:     ST910021A
Serial Number:    3MH08Q1W
Firmware Version: 3.04
User Capacity:    100,030,242,816 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Jun 17 14:41:18 2008 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 426) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 111) minutes.
SCT capabilities:              (0x0001) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   052   040   006    Pre-fail  Always       -       130929108
  3 Spin_Up_Time            0x0003   094   094   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   020    Old_age   Always       -       1648
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   075   060   030    Pre-fail  Always       -       38477626
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2388
 10 Spin_Retry_Count        0x0013   100   100   034    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   020    Old_age   Always
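
As an aid to reading the attribute table, here is a minimal Python sketch that splits one attribute row into its columns and compares the normalized VALUE against THRESH. The function names are hypothetical, and it assumes each row sits on a single line, as smartctl prints it; the two sample rows reuse figures from the report above.

```python
def parse_attribute(row: str) -> dict:
    """Split one attribute row into the columns named in the table header."""
    fields = row.split()
    if len(fields) < 10:
        raise ValueError("incomplete attribute row")
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "raw": fields[9],
    }


def has_failed(attr: dict) -> bool:
    """An attribute counts as failed when VALUE is at or below THRESH."""
    return attr["value"] <= attr["thresh"]


ROWS = [
    "1 Raw_Read_Error_Rate 0x000f 052 040 006 Pre-fail Always - 130929108",
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0",
]
for row in ROWS:
    attr = parse_attribute(row)
    print(attr["name"], "FAILED" if has_failed(attr) else "ok")
```

Both sample rows are above their thresholds. A cut-off row, such as the last one in the report, raises ValueError rather than being guessed at.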

Re: SATA - System Freezes

2008-06-17 Thread Alan Cox
> kernel: ata3.00: status: { DRDY }
> kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
> kernel: ata3: device not ready (errno=-16), forcing hardreset

Stuck waiting for data then took a hard reset not a soft one to get it
back.
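
For readers decoding the "Status 0xd0" value themselves, here is a minimal Python sketch that expands a status byte into flag names. The bit names follow the classic ATA status register layout as I understand it (newer spec revisions rename some bits), and the function name is hypothetical.

```python
# Classic ATA status register bits, highest bit first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error
}


def decode_status(status: int) -> list:
    """Return the names of all flags set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if status & bit]


print(decode_status(0xD0))  # ['BSY', 'DRDY', 'DSC']
print(decode_status(0x40))  # ['DRDY']
```

On this reading, 0xd0 has BSY set, which fits a port that is still busy, while 0x40 (which appears to be the first byte of the `res` line) decodes to DRDY alone, matching the `status: { DRDY }` line.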

> /dev/sdc is my main drive. The only thing I can think of...is that this drive 
> is actually a PATA drive connected to the SATA controller on MoBo thru 
> a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is 
> faulty...or could this be a known issue with libata? 

Could be a convertor problem. What does smart utils have to say about the
drive last logged errors ?



SATA - System Freezes

2008-06-17 Thread Jorge Fábregas
Hello Everyone,

I'm running Fedora 8 and my system freezes (for about 20 to 40 seconds) a 
couple of times a day. When it does I see this on /var/log/messages:

--- cut here -

kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
kernel: ata3.00: cmd ca/00:50:67:85:03/00:00:00:00:00/e0 tag 0 dma 40960 out
kernel:  res 40/00:00:76:6c:03/84:00:10:00:00/e0 Emask 0x4 (timeout)
kernel: ata3.00: status: { DRDY }
kernel: ata3: port is slow to respond, please be patient (Status 0xd0)
kernel: ata3: device not ready (errno=-16), forcing hardreset
kernel: ata3: soft resetting link
kernel: ata3.00: configured for UDMA/33
kernel: ata3: EH complete
kernel: sd 2:0:0:0: [sdc] 321672960 512-byte hardware sectors (164697 MB)
kernel: sd 2:0:0:0: [sdc] Write Protect is off
kernel: sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't 
support DPO or FUA

--- cut here -

/dev/sdc is my main drive. The only thing I can think of...is that this drive 
is actually a PATA drive connected to the SATA controller on MoBo thru 
a "SATA-TO-IDE Adapter" that I connect on the drive. Perhaps the converter is 
faulty...or could this be a known issue with libata?  Anyone had same 
problem?

Thanks,
Jorge
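
To see how often the freezes recur, here is a minimal Python sketch that counts libata exception lines per port in saved log text. The regular expression is an assumption based only on the log excerpt above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "kernel: ata3.00: exception Emask 0x0 ... frozen".
EXCEPTION_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: exception Emask")


def count_exceptions(log_text: str) -> Counter:
    """Count libata exception lines per port (e.g. 'ata3')."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines taken from the excerpt above; only the first is an exception.
SAMPLE = """\
kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
kernel: ata3.00: status: { DRDY }
kernel: ata3: EH complete
"""
print(count_exceptions(SAMPLE))
```

Feeding it the contents of /var/log/messages gives a per-port tally that can be compared before and after a change such as recabling.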
