On Sat, 14 Jul 2007 17:08:27 -0700 (PDT)
Mr. James W. Laferriere [EMAIL PROTECTED] wrote:
Hello All, I was under the impression that a 'machine check' would be
caused by some hardware failure near the CPU, not a bad disk?
It indicates a hardware failure
Jul 14 23:00:26
are available here:
http://www.usb.org/developers/devclass_docs
Alan Stern
-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
use it, although it would take a long time to build
because it includes so many drivers. Whittling it down to just the
drivers you need would be tedious but not very difficult.
Alan Stern
with a hardware issue, not
a kernel issue, just because it is so consistent.
People have reported problems in which the hardware fails when it
encounters a certain pattern of bytes in the data stream. Maybe you're
seeing the same sort of thing.
Alan Stern
khubd running.
That in itself is a very bad sign. You need to look at the dmesg log.
Alan Stern
is liable to miss bits and pieces of
the kernel log when a lot of information comes along all at once. You're
much better off getting the stack trace data directly from dmesg. (And
when you do, you don't end up with 30 columns of wasted data added to the
beginning of each line.)
Alan Stern
with the same upper bits. More
problematic is losing indirect blocks, and being able to keep some kind
of [inode low bits/block index] would help put stuff back together.
Alan
(and a workaround) which is SATA capable 8)
Alan
write a sector on a device with
physical sector size larger than logical block size (as allowed by say
ATA7) then it's less clear what happens. I don't know if the drive
firmware implements multiple tails in this case.
On a read error it is worth trying the other parts of the I/O.
Alan
I think that this is mostly true, but we also need to balance this against
the need for higher levels to get a timely response. In a really large IO, a
naive retry of a very large write could lead to a non-responsive system for a
very long time...
And losing the I/O could result in a
One interesting counterexample is a write smaller than a full page, say 512
bytes out of 4k.
If we need to do a read-modify-write and it just so happens that 1 of the 7
sectors we need to read is flaky, will this look like a write failure?
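To make the arithmetic in the question concrete, here is a minimal sketch in
Python. It is not kernel code. The names sectors_to_read, partial_page_write
and SectorReadError are hypothetical, and the 512-byte sector and 4k page
sizes are simply the numbers from the example above. It shows that a 512-byte
write into a 4k page needs the other 7 sectors read first, and that a read
error on any of them surfaces from inside what the caller sees as a write.

```python
SECTOR_SIZE = 512   # bytes; the sector size used in the example above
PAGE_SIZE = 4096    # bytes; one 4k page


class SectorReadError(IOError):
    """Hypothetical error type: a sector could not be read."""


def sectors_to_read(write_offset, write_len,
                    page_size=PAGE_SIZE, sector_size=SECTOR_SIZE):
    """Return the sector indices of the page that the write does not fully cover.

    Sectors that are completely overwritten need no read; every other
    sector must be read so the whole page can be assembled.
    """
    if write_len <= 0 or write_offset < 0 or write_offset + write_len > page_size:
        raise ValueError("write must lie inside the page")
    end = write_offset + write_len
    return [
        i for i in range(page_size // sector_size)
        if not (write_offset <= i * sector_size and (i + 1) * sector_size <= end)
    ]


def partial_page_write(read_sector, write_offset, data):
    """Assemble the full page for a partial write (the read-modify part).

    read_sector is a caller-supplied function (an assumption of this sketch)
    that returns SECTOR_SIZE bytes or raises SectorReadError.
    """
    page = bytearray(PAGE_SIZE)
    for i in sectors_to_read(write_offset, len(data)):
        # A flaky sector fails here, before any data has been written.
        page[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE] = read_sector(i)
    page[write_offset:write_offset + len(data)] = data
    return bytes(page)
```

Whether the caller reports that failure as a read error or as a write error
is a policy decision this sketch leaves open, which is the point of the
question.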
The current core kernel code can't handle
On Fri, 15 Dec 2006 13:39:27 -0800
Andrew Morton [EMAIL PROTECTED] wrote:
On Fri, 15 Dec 2006 13:05:52 -0800
Andrew Morton [EMAIL PROTECTED] wrote:
Jeff, I shall send all the sata patches which I have at you one single time
and I shall then drop the lot. So please don't flub them.
Ar Iau, 2006-08-24 am 09:07 -0400, ysgrifennodd Adam Kropelin:
Jeff Garzik [EMAIL PROTECTED] wrote:
with sw RAID of course if the builder is careful to use multiple PCI
cards, etc. Sw RAID over your motherboard's onboard controllers leaves
you vulnerable.
Generally speaking the channels on
Ar Iau, 2006-08-24 am 07:31 -0700, ysgrifennodd Marc Perkel:
So - the bottom line answer to my question is that unless you are
running raid 5 and you have a high powered raid card with cache and
battery backup, there is no significant speed increase from using
hardware raid. For raid 0 there
On Mer, 2006-01-18 at 09:14 +0100, Sander wrote:
If the (harddisk internal) remap succeeded, the OS doesn't see the bad
sector at all I believe.
True for ATA; in the SCSI case you may be told about the remap having
occurred, but it's a by-the-way type of message, not an error proper.
If you (the
On Wednesday May 16, [EMAIL PROTECTED] wrote:
(more patches to come. They will go to Linus, Alan, and linux-raid only).
This is the next one, which actually addresses the NULL Checking
Bug.
Thanks. As Linus merges I'll switch over to match his tree. Less diff is good 8
1) Read and write errors should be retried at least once before kicking
the drive out of the array.
This doesn't seem unreasonable on the face of it.
Device level retries are the job of the device level driver
2) On more persistent read errors, the failed block (or whatever unit is
any data, but under normal default drive setup the sector will not be
reallocated. If testing the failing sector is too much effort, a
simple overwrite with the corrected data, at worst, improves the
chances of the drive firmware being able to reallocate the sector.
This works just fine
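The two proposals quoted above (retry a failed read at least once, then
overwrite the failing sector with corrected data) can be sketched as a small
in-memory model in Python. This is an illustration under stated assumptions,
not how md implements anything: Disk, xor_blocks and read_block are
hypothetical names, the array is assumed to be a single-parity XOR stripe, and
the model assumes a write to a bad block always lets the firmware remap it,
which the thread only describes as improving the chances. The reply above also
argues that device-level retries belong in the device driver, so the retry
loop here reflects the original proposal rather than an agreed design.

```python
from functools import reduce


class Disk:
    """Hypothetical in-memory disk; blocks listed in `bad` fail on read."""

    def __init__(self, blocks, bad=()):
        self.blocks = list(blocks)
        self.bad = set(bad)

    def read(self, n):
        if n in self.bad:
            raise IOError(f"unreadable block {n}")
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data
        # Model assumption: overwriting lets the firmware reallocate the sector.
        self.bad.discard(n)


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_block(disks, idx, n, retries=1):
    """Read block n from disks[idx]; on persistent failure, rebuild and rewrite.

    Assumes single-parity XOR across all members, so any one block equals
    the XOR of the same block on every other disk.
    """
    for _ in range(1 + retries):
        try:
            return disks[idx].read(n)
        except IOError:
            pass  # retry before treating the error as persistent
    peers = [d.read(n) for i, d in enumerate(disks) if i != idx]
    data = xor_blocks(peers)
    disks[idx].write(n, data)  # overwrite with the corrected data
    return data
```

In this model the drive stays in the array: the caller gets the reconstructed
data, and the rewritten block reads cleanly afterwards.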
Umm. Isn't RAID implemented as the md device? That implies that it is
responsible for some kind of error management. Bluntly, the file systems
don't declare a file system kaput until they've retried the critical
I/O operations. Why should RAID5 be any less tolerant?
File systems give up the