Re: fsck finding thousands of errors

2003-02-19 Thread Daniel B.
Levi Waldron wrote:
>
> On February 13, 2003 07:28 pm, Daniel Barclay wrote:
> ...
> > Are you using DMA?
>
> ...  I didn't do anything outside of a
> normal stock installation to turn DMA support on or off.

The Linux kernel and some IDE controllers don't work well together, and
the combination can cause severe filesystem corruption.

You probably want to turn DMA off (hdparm -d0 ...) until you can
confirm that this is not your problem.
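
For reference, a minimal sketch of checking and switching off DMA with hdparm. The device name /dev/hdb is an assumption taken from elsewhere in this thread, and the `run` helper is hypothetical: it only prints each command unless RUN=1 is set, since changing drive settings needs root and the correct device.

```shell
#!/bin/sh
# Assumption: the affected drive is /dev/hdb; override with DISK=/dev/...
DISK="${DISK:-/dev/hdb}"

# Hypothetical helper: print the command by default, execute it only if RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

run hdparm -d "$DISK"     # report the current using_dma setting
run hdparm -d0 "$DISK"    # switch DMA off for this drive
run hdparm -d "$DISK"     # confirm the setting now reads off
```

The setting is not kept across a reboot, so it would have to be reapplied at each boot until the controller has been ruled out.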


Daniel
-- 
Daniel Barclay
[EMAIL PROTECTED]


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] 
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]




Re: fsck finding thousands of errors

2003-02-19 Thread Leo Spalteholz
On February 19, 2003 02:39 pm, Daniel B. wrote:
> Levi Waldron wrote:
> > On February 13, 2003 07:28 pm, Daniel Barclay wrote:
> > ...
> > > Are you using DMA?
> >
> > ...  I didn't do anything outside of a
> > normal stock installation to turn DMA support on or off.
>
> The Linux kernel and some IDE controllers don't work well together, and
> the combination can cause severe filesystem corruption.
>
> You probably want to turn DMA off (hdparm -d0 ...) until you can
> confirm that this is not your problem.

Do you know if support for any IDE controllers was dropped in 2.4 vs 
2.2?  I can run kernel 2.2 fine on my server, but when I tried to 
update to 2.4 (standard Debian kernel image) it gave me a whole 
crapload of fsck errors on boot.  Booting kernel 2.2, I could, after 
much pain, fix most of them.  So it seems that something in 2.4 is 
corrupting my filesystem.  I can't remember the exact version I 
tried, and I don't really want to try again, but it was 2.4.16 or 17.  
How could I find out what IDE controller I have and whether it's 
supported?  I have some crappy no-name board in that server.

Thanks
Leo






Re: fsck finding thousands of errors

2003-02-13 Thread George Georgalis
On Tue, Feb 11, 2003 at 08:20:40PM -0800, Alvin Oga wrote:
>
> hi ya
>
> On Tue, 11 Feb 2003, George Georgalis wrote:
>
> > On Tue, Feb 11, 2003 at 08:43:21PM -0500, Levi Waldron wrote:
> > > Do you know what's wrong with this hard drive, or how to troubleshoot it?
> > > It's almost brand new, but is it a warranty item?
> >
> > who knows maybe it's software error, but more likely a loss of power
> > while running/writing, but much much more likely a disk failure.
> >
> > anyway, you can use the -y option in fsck to answer yes to all the
> > questions.
>
> the system will fsck your disk if it's not properly shut down
>
> using -y to fsck is a bad idea, unless you know why it's fsck'ing in the
> first place
>    - if it's bad memory... i don't want the disk touched
>
> assuming that it was shut down properly ...
> - if your bios time is whacky... so can fsck...
>
> - if you have bad memory... it will try to fix the drive according to
>   its bad memory content
>
> - if it's a brand new disk...  there probably wasn't anything
>   wrong with it if it installed and was rebooted a few times
>   before the last sleep of 2 months
>
> - somebody else played with your system and powered it off incorrectly
>
> the magic pixie dust did it
>
> your best bet is to re-install again...
>    if it happens again.. swap out either the memory or disk
>    or maybe even a bad cpu that has gone bonkers


Interesting points worth considering, but if you do decide to run
fsck, I don't see any reason not to use -y. I don't know when humans
wouldn't answer 'y' during a check anyway...

Regards,
// George


-- 
GEORGE GEORGALIS, System Admin/Architect    cell: 347-451-8229
Security Services, Web, Mail,               mailto:[EMAIL PROTECTED]
Multimedia, DB, DNS and Metrics.            http://www.galis.org/george






Re: fsck finding thousands of errors

2003-02-13 Thread Levi Waldron
On February 13, 2003 11:42 am, George Georgalis wrote:
> anyway, you can use the -y option in fsck to answer yes to all the
> questions.

I thank both of you for the tips.  We're still not sure what caused the 
catastrophic hard drive failure, although it may become clearer after 
figuring out whether the drive is now junk or not.  I had already given the -y 
order before reading Alvin's message.  I still don't understand what there 
is to lose by fscking the disk and answering y to all the questions.  Do you 
think there could be a problem with the BIOS or memory that is now 
scrambling a previously good hdd through the fsck process?  Anyway, there's 
nothing to lose here other than the annoyance of redoing a fresh installation, 
and he'd already yessed a couple thousand fsck fixes manually.  

Last I heard, it has been fixing inodes for almost 24 hours at a rate of 
about one per 2 seconds.  I wonder if it will boot again if/when that ever 
finishes.

Question, Alvin:

> - if it's bad memory... i don't want the disk touched
>
> assuming that it was shut down properly ...
> - if your bios time is whacky... so can fsck...
>
> - if you have bad memory... it will try to fix the drive according to
>   its bad memory content

I'm not sure what these mean.  Does

bad memory = bad RAM?
bios time whacky = internal clock wrong?






Re: fsck finding thousands of errors

2003-02-13 Thread Daniel Barclay
Levi Waldron wrote:
>
> ...  Do you
> think there could be a problem with the BIOS or memory that is now
> scrambling a previously good hdd through the fsck process?

Do you have IDE disks?

Are you using DMA?

If so, what kind of motherboard and/or IDE controller cards are you
using?

Daniel

-- 
Daniel Barclay
[EMAIL PROTECTED]






Re: fsck finding thousands of errors

2003-02-13 Thread Levi Waldron
On February 13, 2003 07:28 pm, Daniel Barclay wrote:

> Do you have IDE disks?

Yes.


> Are you using DMA?

It's a Western Digital 80G HD.  The WD website at 
http://www.wdc.com/products/Products.asp?DriveID=5Lang=1
says, among other things:

Interface: Ultra ATA/100
Mode 5 Ultra ATA         100.0 MB/s
Mode 4 Ultra ATA          66.6 MB/s
Mode 2 Ultra ATA          33.3 MB/s
Mode 4 PIO                16.6 MB/s
Mode 2 multi-word DMA     16.6 MB/s

Does this help answer your question?  I didn't do anything outside of a 
normal  stock installation to turn DMA support on or off.


> If so, what kind of motherboard and/or IDE controller cards are you
> using?

Motherboard: Soyo Dragon - AMD Socket-A Base Via KT266 ATX
CPU: AMD 1400 Thunderbird
RAM: 256MB 266-DDR
/dev/hdb1 is 20GB

fsck has now been running for 28 consecutive hours, and the numbering of the 
inodes suggests it has fixed 60,000 of them now.  Maybe I can get a world  
record!  How many inodes would one 20G partition have?  I wonder what order 
of magnitude of time it might take for fsck to finish?
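
A rough back-of-the-envelope estimate, as a sketch only. It assumes the worst case that every inode needs a fix, the rate of about one fix per 2 seconds mentioned earlier in the thread, and inode numbers of around 1,030,000 from the original report. The function name fsck_eta_hours is made up for illustration. The real inode count of the partition can be read from the "Inode count" line of tune2fs -l /dev/hdb1.

```shell
#!/bin/sh
# fsck_eta_hours INODES SECONDS_PER_INODE
# Prints the whole hours needed to process INODES at the given rate
# (integer arithmetic, rounded down).
fsck_eta_hours() {
    echo $(( $1 * $2 / 3600 ))
}

# Worst case: every one of ~1,030,000 inodes needs fixing at 2 s each.
fsck_eta_hours 1030000 2

# Remaining work after the 60,000 already fixed, at the same rate.
fsck_eta_hours $(( 1030000 - 60000 )) 2
```

Under those assumptions the worst case comes out at several hundred hours, so weeks rather than days. If only a fraction of the inodes are damaged, the run would be proportionally shorter.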

-- 
-Levi






Re: fsck finding thousands of errors

2003-02-13 Thread Pigeon
On Tue, Feb 11, 2003 at 10:12:33PM -0500, George Georgalis wrote:
> On Tue, Feb 11, 2003 at 08:43:21PM -0500, Levi Waldron wrote:
> > Do you know what's wrong with this hard drive, or how to troubleshoot it?
> > It's almost brand new, but is it a warranty item?
>
> who knows maybe it's software error, but more likely a loss of power
> while running/writing, but much much more likely a disk failure.

I reckon it's a disk failure. I had exactly the same thing happen, but
with a SCSI disk, so the SCSI driver spat out all sorts of helpful
error messages like "Unrecoverable read error" and other technical
euphemisms for "fucked". And just prior to this it had been giving me
messages about running out of room in the grown defects map.

"It Shouldn't Happen To A Brand New Disk"... doesn't mean it never will.

> anyway, you can use the -y option in fsck to answer yes to all the
> questions.

This is true, but in this situation you won't find anything useful on
the disk at the end of it.

You might not even get to the end of it. After fixing a few hundred
thousand errors automatically it may come to something it can't fix. I
tried three times, got this problem, and decided it wasn't worth
bothering with. Fortunately I had backups.

Pigeon






fsck finding thousands of errors

2003-02-11 Thread Levi Waldron

I helped someone install Debian on a new hard drive a couple months ago.  
They didn't use that hard drive for the last couple months, then tried to 
boot into Debian and got the following errors during the boot process:

--
mount:  mountpoint /proc is not a directory
...
(swap activates successfully)
...
/dev/hdb1 contains a filesystem with errors, check forced.
(fixes several inodes)
Inode  has magic flag set
unexpected inconsistency, run fsck manually, without -a or -p options.
--

So I got him to boot off an install CD as a rescue disk, since /dev/hdb1 is 
the root filesystem.  He ran e2fsck /dev/hdb1 from the emergency shell, and 
it started correcting THOUSANDS of inodes, with him hitting enter for each 
one.  The types of messages that e2fsck reports are:

---
-Inode has illegal blocks
-Illegal block in Inode
-Too many illegal blocks in Inodes
-Inode has compression flag set on filesystem without compression support
-Inode ___ is in use, but has dtime set
-Inode ___ has magic flag set
-Special (device/socket/fifo/symlink) file (inode ___) has append-only 
flag set
-___ has immutable or append-only flag set
-Inodes that were part of a corrupted orphan linked list found
-Inode ___ was part of an orphaned inode list
-Illegal block 6 in inode ___
---

The inode numbers are in the range of 1,030,000, and he corrected about 5,000 
of them one by one before I said to give up.

There is also a FAT32 partition on this hard drive which had files on it, 
and these can no longer be read by the Windows half of the machine.

The partition table seems to be undamaged.

He doesn't think he's done anything damaging to this hard drive, or even used 
it since the last time it was working.

Do you know what's wrong with this hard drive, or how to troubleshoot it?  
It's almost brand new, but is it a warranty item?






Re: fsck finding thousands of errors

2003-02-11 Thread George Georgalis
On Tue, Feb 11, 2003 at 08:43:21PM -0500, Levi Waldron wrote:
> Do you know what's wrong with this hard drive, or how to troubleshoot it?
> It's almost brand new, but is it a warranty item?

who knows maybe it's software error, but more likely a loss of power
while running/writing, but much much more likely a disk failure.

anyway, you can use the -y option in fsck to answer yes to all the
questions.

// George


-- 
GEORGE GEORGALIS, System Admin/Architect    cell: 347-451-8229
Security Services, Web, Mail,               mailto:[EMAIL PROTECTED]
Multimedia, DB, DNS and Metrics.            http://www.galis.org/george






Re: fsck finding thousands of errors

2003-02-11 Thread Alvin Oga

hi ya

On Tue, 11 Feb 2003, George Georgalis wrote:

> On Tue, Feb 11, 2003 at 08:43:21PM -0500, Levi Waldron wrote:
> > Do you know what's wrong with this hard drive, or how to troubleshoot it?
> > It's almost brand new, but is it a warranty item?
>
> who knows maybe it's software error, but more likely a loss of power
> while running/writing, but much much more likely a disk failure.
>
> anyway, you can use the -y option in fsck to answer yes to all the
> questions.

the system will fsck your disk if it's not properly shut down 

using -y to fsck is a bad idea, unless you know why it's fsck'ing in the
first place
- if it's bad memory... i don't want the disk touched

assuming that it was shut down properly ...
- if your bios time is whacky... so can fsck...

- if you have bad memory... it will try to fix the drive according to
  its bad memory content

- if it's a brand new disk...  there probably wasn't anything 
  wrong with it if it installed and was rebooted a few times
  before the last sleep of 2 months

- somebody else played with your system and powered it off incorrectly

the magic pixie dust did it

your best bet is to re-install again... 
if it happens again.. swap out either the memory or disk
or maybe even a bad cpu that has gone bonkers
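
One cautious way to act on the -y warning above, sketched under assumptions: run a read-only e2fsck pass first to see how bad things are, and only then decide on a repair pass. The partition /dev/hdb1 is taken from the original report and is assumed to be unmounted (for example when booted from a rescue CD). The `run` helper is hypothetical and only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Assumption: the damaged filesystem is /dev/hdb1 and is NOT mounted.
PART="${PART:-/dev/hdb1}"

# Hypothetical helper: print the command by default, execute it only if RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Read-only pass: -n opens the filesystem read-only and answers "no" to
# every question, so nothing on the disk is changed. -f forces a full check.
run e2fsck -f -n "$PART"

# Repair pass, only once the cause is understood: -y answers "yes" to
# every question.
run e2fsck -f -y "$PART"
```

If the read-only pass already lists thousands of problems, that may be a hint to look at the memory and the disk, as suggested above, before letting the -y pass write to the drive.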

c ya
alvin

