Re: 2.4 ate my filesystem on rw-mount, summary

2001-01-23 Thread Tobias Ringstrom

Ok, folks, it's time for a summary.  Since my last post, I've had time to
experiment a bit more, and I've also had some private communication with
Vojtech.

First, I would like to say that you do need quite a bit of bad luck (or
hardware) to have the same problems I did.  Linux 2.4, VIA and IDE work
very well for most users.  But I really recommend making a backup of all
your vital data before installing 2.4 and enabling DMA with IDE disks.
(And, yes, I did this.  Honest! :-) )

Problem log
===========

1. Installed RedHat 7
2. Built 2.4.0 with VIA driver and DMA by default (well, in 2.4.0, the VIA
   driver will always use DMA by default, whether you want it to or not.)
3. Rebooted -> 2.4.0
4. The computer froze on the remounting root read-write message.
5. Powercycle
6. Rebooted -> 2.2.16-22
7. Got a corrupt disk, missing files, moved files, incorrect file contents
8. Goto 1

So, why did this happen?

Problem one
===========

This one really makes me upset, because had it not been for this one, it
would have been so much easier to find the cause of the problem.  It is
also so easy to fix.

The problem is that RedHat disables all kernel messages during boot,
except for panics.  In my not so very humble opinion, kernel error
messages, and possibly also warning messages, should of course be shown.
This can easily be fixed by editing /etc/sysconfig/init.
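
To make the effect concrete, here is a minimal Python sketch of how the
kernel's console loglevel filters messages.  The level numbers are the
standard kernel ones (0 = emergency ... 7 = debug), and a message reaches
the console only if its level is numerically lower than the console
loglevel.  The function name is made up for this illustration.

```python
# Standard kernel message levels, most severe first (index = numeric level).
KERNEL_LEVELS = ["EMERG", "ALERT", "CRIT", "ERR", "WARNING", "NOTICE", "INFO", "DEBUG"]

def shown_on_console(console_loglevel):
    """Return the message levels that reach the console for a given loglevel."""
    return [name for level, name in enumerate(KERNEL_LEVELS) if level < console_loglevel]

print(shown_on_console(1))  # panics only
print(shown_on_console(5))  # everything up to and including warnings
```

With a console loglevel of 1 only emergency (panic) messages get through,
which matches what I saw.  To see errors and warnings as well, the
loglevel has to be at least 5.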

The error messages that were hidden by RH7 were a couple of CRC error
messages, and then an endless stream of "Busy" and "Drive not ready for
command" errors.  More on this later.

Problem two
===========

The computer in question has problems with UDMA(33), otherwise I would not
have gotten CRC errors, and everything would have been fine.  Why I do get
CRC errors, one can so far only speculate, especially since I am able to
use UDMA(66) with another drive, on the same controller, without much
trouble.

One theory is that the PCI bus clock may be too fast, and the drive cannot
catch up.  To check this, I plan to measure the PCI clock to see if this
is true.  Quick measurements with a not too great oscilloscope seem to
indicate a clock speed of around 33.3-33.4 MHz, so it may actually be out
of spec, but not by much.
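
The arithmetic behind "out of spec, but not by much" can be sketched in a
few lines of Python.  It assumes the usual 30 ns minimum clock period for
a nominal 33 MHz PCI bus; the constant and function names are only for
illustration.

```python
PCI_MIN_PERIOD_NS = 30.0  # assumed shortest clock period for 33 MHz PCI

def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz

for mhz in (33.3, 33.4):
    p = period_ns(mhz)
    verdict = "within spec" if p >= PCI_MIN_PERIOD_NS else "too fast"
    print(f"{mhz} MHz -> {p:.2f} ns ({verdict})")
```

At 33.4 MHz the period comes out about 0.06 ns short of 30 ns, which is
well within the error of a quick oscilloscope reading.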

Another theory is that the CRC errors are caused by bad cables,
connectors, or motherboard, but the fact that I can use UDMA(66) on the
same controller seems to contradict this.  But OTOH I have learnt not to
underestimate the amazing amount of trouble a bad cable can cause.

Possible work-arounds include an "idebus=40" kernel option, or using
hdparm to configure the drive and kernel for UDMA(22), i.e. udma1
(hdparm -X65).

Problem three
=============

The drive that gave me these problems is a SAMSUNG VG34323A, and the
problem with this drive is that it does not seem to recover from CRC
errors.  Once I get my first CRC error, the drive becomes permanently
busy, until I power cycle.

Problem four
============

<speculation>
I do not know exactly what Linux is doing when remounting a
partition read-write, but it does seem to update some very sensitive
sectors, and when the write fails, a lot of very vital data is destroyed.
It is perhaps questionable whether the destruction of a couple of files
would be much better than the destruction of /dev, but I think it is.
</speculation>

Lesson
======

Be very careful when enabling DMA on a Linux machine, especially on cheap
hardware.  It is not enough to test DMA on a read-only partition first,
since writing is a completely different story.

...and probably some more things that I either forgot, or are too painful
to remember...

/Tobias

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Vojtech Pavlik

On Sun, Jan 14, 2001 at 06:59:57PM +0100, Tobias Ringstrom wrote:
> 
> I should also add that the 3.11 driver seems to make things better, but
> not yet perfect.  My intuition tells me that I get CRC errors much sooner
> with 2.1e than with 3.11.
> 
> Has the timings changed from 2.1e to 3.11, and would it be easy to modify
> 3.11 to get extra safe/paranoid, but less high performance, timings?

If you use 'idebus=40' or 'idebus=50', the driver will add an extra
margin to the timings, trying to compensate for the 40 or 50 MHz PCI bus
it will be tricked into thinking it's working with.
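
A rough Python sketch of why overstating the bus speed adds margin.  It
assumes, purely for illustration and not as a description of the actual
driver code, that the driver rounds each desired cycle time up to a whole
number of PCI clocks using the clock it was told about, while the hardware
counts real 33 MHz clocks.  The 60 ns target and both function names are
hypothetical.

```python
def programmed_clocks(target_ns, idebus_mhz):
    """Clocks needed to cover target_ns when one clock lasts 1000/idebus_mhz ns."""
    return -(-target_ns * idebus_mhz // 1000)  # integer ceiling division

def real_cycle_ns(target_ns, idebus_mhz, real_mhz=33):
    """Cycle time the hardware actually produces on a bus running at real_mhz."""
    return programmed_clocks(target_ns, idebus_mhz) * 1000.0 / real_mhz

for idebus in (33, 40, 50):
    print(idebus, programmed_clocks(60, idebus), round(real_cycle_ns(60, idebus), 1))
```

Under these assumptions a 60 ns target is programmed as 2 clocks with
idebus=33 but as 3 clocks with idebus=40 or idebus=50, so the real cycle
stretches from about 61 ns to about 91 ns.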

This could add a data point, yes.

> Some extra data:
> * Drive B seems to work in machine 2 with udma2
> * Drive A seems to work in machine 2 with udma1, but not with udma2.

UDMA1 is 22.2 MB/sec, UDMA2 is 33.3. UDMA0 is 16.6.
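
These figures can be reproduced with a small Python sketch.  It assumes a
30 ns PCI clock, two bytes transferred per cycle on the 16-bit IDE bus,
and 4, 3 and 2 clocks per cycle for UDMA0, UDMA1 and UDMA2.  The clock
counts are an assumption chosen because they give the numbers above, not
something taken from the driver source.

```python
PCI_CLOCK_NS = 30      # one clock of a 33.3 MHz PCI bus
BYTES_PER_CYCLE = 2    # the IDE data bus is 16 bits wide

# Assumed clocks per UDMA cycle, picked to match the quoted figures.
CLOCKS_PER_CYCLE = {"udma0": 4, "udma1": 3, "udma2": 2}

def rate_mb_per_s(mode):
    """Transfer rate in MB/s (10^6 bytes per second) for a UDMA mode."""
    cycle_ns = CLOCKS_PER_CYCLE[mode] * PCI_CLOCK_NS
    return BYTES_PER_CYCLE * 1000.0 / cycle_ns  # bytes per ns times 1000

for mode in sorted(CLOCKS_PER_CYCLE):
    print(mode, round(rate_mb_per_s(mode), 1))
```

Going from UDMA2 to UDMA1 therefore means one extra clock per cycle, i.e.
a 90 ns cycle instead of a 60 ns one.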

Could you (if you didn't already) send me the lspci -vvxxx after the -X65
(UDMA1) command, together with the one before? That also could tell
something.

> I wouldn't say it's rock solid, and I would not trust my data to any of
> these combinations, but at least it does not break immediately (i.e. for
> less than 1 GB written).

Actually, the CRC messages are safe and only mean a data transfer is
retried. That is, only if it doesn't fail every time. They happen on
many boards and drives using UDMA even under normal correct operation :(

> The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
> to 3.11 helps, but it is still very bad.
> 
> I'd really like to be more precise, but there are too many combinations to
> try to try them all, and sometimes it fails right away, and sometimes
> after several hundred megabytes.

If 'fails after several hundred megabytes' only means a single CRC error
which is recovered from correctly, then that actually means 'working and
probably would work perfectly with a shorter cable'.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

I should also add that the 3.11 driver seems to make things better, but
not yet perfect.  My intuition tells me that I get CRC errors much sooner
with 2.1e than with 3.11.

Have the timings changed from 2.1e to 3.11, and would it be easy to modify
3.11 to get extra safe/paranoid, but less high performance, timings?

Some extra data:
* Drive B seems to work in machine 2 with udma2
* Drive A seems to work in machine 2 with udma1, but not with udma2.

I wouldn't say it's rock solid, and I would not trust my data to any of
these combinations, but at least it does not break immediately (i.e. for
less than 1 GB written).

The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
to 3.11 helps, but it is still very bad.

I'd really like to be more precise, but there are too many combinations to
try to try them all, and sometimes it fails right away, and sometimes
after several hundred megabytes.

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> > > So the drive *did* work on the vt82c686a in the A7V board? You tested it
> > > both on the Promise and on the 686a? But doesn't work on the 686a in
> > > your other board?
> >
> > Yes, on both the Promise and on the 686a.  But the device revisions are
> > different.  The machine that does NOT work:
> >
> > 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> >
> > The machine that works:
> >
> > 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
> > 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> >
> > The one that works is a 1 GHz Athlon, and the other is an 800 MHz
> > Pentium-III.

Of course it isn't.  The vt82c686 that does not work is a 450 MHz K-6, not
a PIII.

> > > > no matter what cable I use.  When I get this, the machine does not recover
> > > > most of the time, and I have to reset or power cycle.
> > >
> > > It should be able to recover in a couple (up to 10) minutes ...
> >
> > Who waits 10 minutes for a timeout?  Can it be lowered?
>
> It's not a 10 minute timeout, it's a shorter timeout retried many times.
> Not my code, though - this is generic PCI IDE code, and is a huge mess.

What I get is a number of Busy and Drive is not ready for command for
different sectors.

> > Expect another mail with the data you requested within a couple of hours.
>
> Thanks a lot.

Ok, it took a bit longer than that, mostly because my wife and I had
unexpected (but very welcome) guests at home.  It is Sunday, after all...

I have attached a tar file with "lspci -vvxxx" and "hdparm -i" for machine
1 and 2 to this mail, but first some comments.

I will be talking about three machines:

1) 450 MHz K-6 on an AOpen MX59 PRO II motherboard
2) 800 MHz PIII on an unknown cheap/crappy motherboard.
3) 1 GHz Athlon on an ASUS A7V motherboard.

and the following drives:

A) SAMSUNG VG34323A, sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 udma2
B) ST38421A, mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4

Machine 3 is the machine at home, and it does not have problems with any
disks I have tried so far, and seems very stable, both with ATA100 and
ATA66.

I verified that what is happening when RH7 tries to remount / read-write,
is that I get the infamous CRC errors.  It does not seem to recover from
this state.  At least I did not wait that long.

I do not think that the RH7 kernel 2.2.16-22 uses udma2 at any time, and
that may be why it works.

Disk B does NOT work with DMA enabled with machine 1 or 2.  It works
better than disk A, but it does still fail after some time.  The
combination 1B was the most stable, and only failed once.

When using disk B, the computer has managed to recover from the CRC error
condition every time, as opposed to disk A which never recovers.  (Busy)

Using hdparm -X65 (udma1) makes disk A work with 2.4 in machine 2.  What
is the difference between udma1 and udma2?
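
For reference, the number given to hdparm -X encodes the mode family plus
the mode number: 64 + n for UDMA, 32 + n for multiword DMA, 16 + n for
single-word DMA and 8 + n for PIO.  A small Python sketch of the decoding
(the function name is made up for illustration):

```python
# Base value for each transfer-mode family; the mode number is added to it.
XFER_BASES = [(64, "udma"), (32, "mdma"), (16, "sdma"), (8, "pio")]

def decode_xfer_mode(value):
    """Translate an hdparm -X value such as 65 into a name such as 'udma1'."""
    for base, family in XFER_BASES:
        if value >= base:
            return f"{family}{value - base}"
    raise ValueError(f"no DMA or PIO mode is encoded by {value}")

print(decode_xfer_mode(65))
print(decode_xfer_mode(66))
```

So -X65 selects udma1 and -X66 would select udma2, the mode marked with
an asterisk for this drive in the hdparm output.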

Now I'm almost completely lost.  Hope this helps.  Let me know if you want
me to try something else.

/Tobias




/dev/hde:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4 
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2 
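
As a sanity check on the identify data above, the geometry and the LBA
sector count agree with each other and with the 4.32 GB in the model
string.  A minimal Python check, using only numbers from the output (the
variable names are mine):

```python
cylinders, heads, sectors_per_track = 14896, 9, 63  # RawCHS
lba_sectors = 8446032                               # LBAsects
sector_size = 512                                   # SectSize

chs_sectors = cylinders * heads * sectors_per_track
capacity_bytes = lba_sectors * sector_size
track_bytes = sectors_per_track * sector_size       # should equal TrkSize

print(chs_sectors == lba_sectors)
print(capacity_bytes / 1e9)
print(track_bytes)
```

The CHS product equals LBAsects, so the negative CurSects value looks like
a display problem rather than a sign of a damaged drive.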


00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
Subsystem: Asustek Computer, Inc.: Unknown device 8033
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- 
Capabilities: [c0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 17 a4 6b b4 4f 81 10 10 80 00 08 10 10 10 10 10
60: 03 ff 00 b0 e6 e5 e5 00 44 7c 86 0f 08 3f 00 00
70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 01
80: 0f 40 00 00 80 00 00 00 02 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 02 c0 20 00 17 02 00 1f 00 00 00 00 6e 02 14 00
b0: 61 ec 80 e5 32 33 28 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Vojtech Pavlik

On Sun, Jan 14, 2001 at 09:45:09AM +0100, Tobias Ringstrom wrote:
> On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> > On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:
> >
> > > I have now tried the SAMSUNG VG34323A disk with two other controllers at
> > > home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
> > > motherboard), and there are no problems to be found with DMA enabled.
> > > Streaming 10 MB/s without glitches.
> >
> > So the drive *did* work on the vt82c686a in the A7V board? You tested it
> > both on the Promise and on the 686a? But doesn't work on the 686a in
> > your other board?
> 
> Yes, on both the Promise and on the 686a.  But the device revisions are
> different.  The machine that does NOT work:
> 
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> 
> The machine that works:
> 
> 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
> 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> 
> The one that works is a 1 GHz Athlon, and the other is an 800 MHz
> Pentium-III.
> 
> > > no matter what cable I use.  When I get this, the machine does not recover
> > > most of the time, and I have to reset or power cycle.
> >
> > It should be able to recover in a couple (up to 10) minutes ...
> 
> Who waits 10 minutes for a timeout?  Can it be lowered?

It's not a 10 minute timeout, it's a shorter timeout retried many times.
Not my code, though - this is generic PCI IDE code, and is a huge mess.

> Expect another mail with the data you requested within a couple of hours.

Thanks a lot.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:
>
> > I have now tried the SAMSUNG VG34323A disk with two other controllers at
> > home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
> > motherboard), and there are no problems to be found with DMA enabled.
> > Streaming 10 MB/s without glitches.
>
> So the drive *did* work on the vt82c686a in the A7V board? You tested it
> both on the Promise and on the 686a? But doesn't work on the 686a in
> your other board?

Yes, on both the Promise and on the 686a.  But the device revisions are
different.  The machine that does NOT work:

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

The machine that works:

00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

The one that works is a 1 GHz Athlon, and the other is an 800 MHz
Pentium-III.

> > no matter what cable I use.  When I get this, the machine does not recover
> > most of the time, and I have to reset or power cycle.
>
> It should be able to recover in a couple (up to 10) minutes ...

Who waits 10 minutes for a timeout?  Can it be lowered?

Expect another mail with the data you requested within a couple of hours.

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Vojtech Pavlik

On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:

> I have now tried the SAMSUNG VG34323A disk with two other controllers at
> home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
> motherboard), and there are no problems to be found with DMA enabled.
> Streaming 10 MB/s without glitches.

So the drive *did* work on the vt82c686a in the A7V board? You tested it
both on the Promise and on the 686a? But doesn't work on the 686a in
your other board?

> However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
> this machine [1] (at work, using the VIA IDE driver version 3.11)
> 
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> 
> or this machine [2] (at work, using the VIA IDE driver version 2.1e)
> 
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

What's the manufacturer/model of these boards? Just for the record ...
What's the PCI bus speed? Or memory speed?

> I get exactly the following errors on both machines
> 
> hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }
> 
> no matter what cable I use.  When I get this, the machine does not recover
> most of the time, and I have to reset or power cycle.

It should be able to recover in a couple (up to 10) minutes ...
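
For reference, the two register values in the quoted messages decode bit
by bit.  The Python sketch below is only an illustration: the status names
that do not appear in the log above are written from memory of what the
IDE driver prints, and the error table covers just the two bits needed
here.

```python
# ATA status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"), (0x40, "DriveReady"), (0x20, "DeviceFault"),
    (0x10, "SeekComplete"), (0x08, "DataRequest"), (0x04, "CorrectedError"),
    (0x02, "Index"), (0x01, "Error"),
]

# Only the two error register bits that occur in the log above.
ERROR_BITS = [(0x04, "DriveStatusError"), (0x80, "BadCRC")]

def decode(value, table):
    """Return the names of all bits from table that are set in value."""
    return [name for bit, name in table if value & bit]

print(decode(0x51, STATUS_BITS))
print(decode(0x84, ERROR_BITS))
```

0x51 is 0x40 + 0x10 + 0x01 and 0x84 is 0x80 + 0x04, which gives exactly
the names shown in braces in the two dma_intr lines.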

> This disc works
> flawlessly on two other IDE controllers, so I do not think that the disk
> is completely broken. It must be either these chipsets or the driver in
> combination with this disk.  Note that I _can_ use another UDMA66 disk
> _with_ DMA enabled on both machine [1] and [2] above without problems.
> Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
> tried 2.2.16-22 with DMA enabled on machine [2].
> 
> The problem I reported at first, hence the nasty subject, was a hang and a
> nasty fs corruption when RH7 tried to remount the root fs read-write.  I
> examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
> discovered, to my great disgust, that the stupid thing disables the dmesg
> output on the console very early in the script.  It is thus entirely
> possible that I do get the above mentioned errors when the computer seems
> to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
> if my assumption is correct.
> 
> SUMMARY:  I have a disk that with DMA enabled gives me CRC errors on two
> machines, but not on two others, independent of the cable.  Both troubling
> machines do not recover from these errors.  Linux 2.2.16-22 from RedHat
> works fine with DMA enabled on machine [1], [2] is unknown.
> 
> I hope this makes things a lot clearer.

Yes, indeed it's much clearer now. Now to fix the bug, or at least be
able to track it closer, I'll need 'lspci -vvxxx' of the IDE pci device
in the following cases:

1) SAMSUNG VG34323A on VT82C596b/cf with RH 2.2.16-22 and DMA (working)
2) SAMSUNG VG34323A on VT82C686a/ce with RH 2.2.16-22 and DMA (working)
3) SAMSUNG VG34323A on VT82C596b/cf with 2.4.0+via3.11 and DMA,
(doesn't work, so fs readonly)
4) SAMSUNG VG34323A on VT82C686a/ce with 2.4.0+via3.11 and DMA,
(doesn't work, so fs readonly)
5) The other drive on VT82C596b/cf with 2.4.0+via3.11 and DMA (working)
6) The other drive on VT82C686a/ce with 2.4.0+via3.11 and DMA (working)

With these data I should be able to find out what's different between
the working and not working setups ...



My current theory: In UDMA, when reading, the drive provides the clock.
The IDE controller thus can read everything OK. When writing, the
controller provides the clock and for some reason the Samsung can't keep
up with the setting the driver selects for it. The question is why and
why the driver selects the incorrect (or just too tight?) value.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> > > So the drive *did* work on the vt82c686a in the A7V board? You tested it
> > > both on the Promise and on the 686a? But doesn't work on the 686a in
> > > your other board?
> >
> > Yes, on both the Promise and on the 686a.  But the device revisions are
> > different.  The machine that does NOT work:
> >
> > 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> >
> > The machine that works:
> >
> > 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
> > 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> >
> > The one that works is a 1 GHz Athlon, and the other is an 800 MHz
> > Pentium-III.

Of course it isn't.  The vt82c686 that does not work is a 450 MHz K-6, not
a PIII.

> > > > no matter what cable I use.  When I get this, the machine does not recover
> > > > most of the time, and I have to reset or power cycle.
> > >
> > > It should be able to recover in a couple (up to 10) minutes ...
> >
> > Who waits 10 minutes for a timeout?  Can it be lowered?
>
> It's not a 10 minute timeout, it's a shorter timeout retried many times.
> Not my code, though - this is generic PCI IDE code, and is a huge mess.

What I get is a number of "Busy" and "Drive not ready for command" errors
for different sectors.

> > Expect another mail with the data you requested within a couple of hours.
>
> Thanks a lot.

Ok, it took a bit longer than that, mostly because my wife and I had
unexpected (but very welcome) guests at home.  It is Sunday, after all...

I have attached a tar file with "lspci -vvxxx" and "hdparm -i" for machine
1 and 2 to this mail, but first some comments.

I will be talking about three machines:

1) 450 MHz K-6 on an AOpen MX59 PRO II motherboard
2) 800 MHz PIII on an unknown cheap/crappy motherboard.
3) 1 GHz Athlon on an ASUS A7V motherboard.

and the following drives:

A) SAMSUNG VG34323A, sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 udma2
B) ST38421A, mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4

Machine 3 is the machine at home, and it does not have problems with any
disks I have tried so far, and seems very stable, both with ATA100 and
ATA66.

I verified that what is happening when RH7 tries to remount / read-write,
is that I get the infamous CRC errors.  It does not seem to recover from
this state.  At least I did not wait that long.

I do not think that the RH7 kernel 2.2.16-22 uses udma2 at any time, and
that may be why it works.

Disk B does NOT work with DMA enabled with machine 1 or 2.  It works
better than disk A, but it does still fail after some time.  The
combination 1B was the most stable, and only failed once.

When using disk B, the computer has managed to recover from the CRC error
condition every time, as opposed to disk A which never recovers.  (Busy)

Using hdparm -X65 (udma1) makes disk A work with 2.4 in machine 2.  What
is the difference between udma1 and udma2?
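
A note on the -X65 value used above: hdparm's -X argument is a base plus
the mode number (8 for PIO, 32 for multiword DMA, 64 for UDMA, as described
in the hdparm manual page; 16 for single-word DMA follows the same ATA
encoding).  A minimal Python sketch of that decoding; the helper name is
made up for illustration and is not part of hdparm:

```python
# Hypothetical helper: decode an hdparm -X value into a mode name.
# Each mode family occupies a block of eight values starting at its base.
BASES = [(64, "udma"), (32, "mdma"), (16, "sdma"), (8, "pio")]


def decode_hdparm_x(value: int) -> str:
    """Return e.g. 'udma1' for 65; 'unknown' for values outside the blocks."""
    for base, name in BASES:
        if base <= value < base + 8:
            return f"{name}{value - base}"
    return "unknown"


if __name__ == "__main__":
    for x in (65, 66, 34, 12):
        print(x, decode_hdparm_x(x))
```

So -X65 selects udma1 and -X66 would select udma2, the mode the drive
reports as its current one in the attached hdparm output.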

Now I'm almost completely lost.  Hope this helps.  Let me know if you want
me to try something else.

/Tobias




/dev/hde:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4 
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2 


00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
Subsystem: Asustek Computer, Inc.: Unknown device 8033
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort+ >SERR- <PERR+
Latency: 0
Region 0: Memory at e000 (32-bit, prefetchable) [size=128M]
Capabilities: [a0] AGP version 2.0
Status: RQ=31 SBA+ 64bit- FW+ Rate=x1,x2
Command: RQ=0 SBA- AGP- 64bit- FW- Rate=none
Capabilities: [c0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 17 a4 6b b4 4f 81 10 10 80 00 08 10 10 10 10 10
60: 03 ff 00 b0 e6 e5 e5 00 44 7c 86 0f 08 3f 00 00
70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 01
80: 0f 40 00 00 80 00 00 00 02 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 02 c0 20 00 17 02 00 1f 00 00 00 00 6e 

Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

I should also add that the 3.11 driver seems to make things better, but
not yet perfect.  My intuition tells me that I get CRC errors much sooner
with 2.1e than with 3.11.

Have the timings changed from 2.1e to 3.11, and would it be easy to modify
3.11 to get extra safe/paranoid, but less high performance, timings?

Some extra data:
* B seems to work in 2 with udma2
* A seems to work in 2 with udma1, but not with udma2.

I wouldn't say it's rock solid, and I would not trust my data to any of
these combinations, but at least it does not break immediately (i.e. with
less than 1 GB written).

The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
to 3.11 helps, but it is still very bad.

I'd really like to be more precise, but there are too many combinations to
try them all, and sometimes it fails right away, and sometimes
after several hundred megabytes.

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Vojtech Pavlik

On Sun, Jan 14, 2001 at 06:59:57PM +0100, Tobias Ringstrom wrote:
 
 I should also add that the 3.11 driver seems to make things better, but
 not yet perfect.  My intuition tells me that I get CRC errors much sooner
 with 2.1e than with 3.11.
 
 Have the timings changed from 2.1e to 3.11, and would it be easy to modify
 3.11 to get extra safe/paranoid, but less high performance, timings?

If you use 'idebus=40' or 'idebus=50', the driver will add an extra
margin to the timings, trying to compensate for the 40 or 50 MHz PCI bus
it is tricked into thinking it is working with.

This could add a data point, yes.
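
One way to picture the extra margin is the sketch below.  It is a
simplified model, not the driver's code: it assumes each timing is rounded
up to a whole number of clock periods of the bus speed the driver was told
about, while the hardware really runs at 33 MHz (30 ns per clock).  The
function name and the integer periods (30, 25 and 20 ns for 33, 40 and
50 MHz) are assumptions made for illustration.

```python
# Simplified model (assumption): timings are programmed as whole clock
# counts computed from the bus period the driver believes in, but the
# hardware counts real 30 ns clocks.
REAL_PERIOD_NS = 30  # a 33 MHz PCI clock


def actual_timing_ns(target_ns: int, assumed_period_ns: int) -> int:
    """Timing the hardware really produces for a requested target timing."""
    clocks = -(-target_ns // assumed_period_ns)  # integer ceiling division
    return clocks * REAL_PERIOD_NS


if __name__ == "__main__":
    for label, period in (("idebus=33", 30), ("idebus=40", 25), ("idebus=50", 20)):
        print(label, actual_timing_ns(60, period), "ns for a 60 ns target")
```

Under this model a 60 ns target stays at 60 ns with the true bus speed but
stretches to 90 ns with idebus=40 or idebus=50, which is the kind of slack
the option is meant to buy.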

 Some extra data:
 * B seems to work in 2 with udma2
 * A seems to work in 2 with udma1, but not with udma2.

UDMA1 is 22.2 MB/sec, UDMA2 is 33.3. UDMA0 is 16.6.
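
These rates can be reproduced from the cycle time.  The sketch below
assumes one 16-bit word (2 bytes) is moved per cycle, and that the VIA
chip spends 4, 3 and 2 PCI clocks of 30 ns on UDMA0, UDMA1 and UDMA2; the
clock counts are inferred from the figures quoted here and are not stated
anywhere in this thread.

```python
# Worked arithmetic: transfer rate from cycle time, 2 bytes per cycle.
PCI_CLOCK_NS = 30  # 33 MHz PCI clock

# Assumed clocks per cycle for each mode (inferred, see lead-in).
CLOCKS_PER_CYCLE = {"udma0": 4, "udma1": 3, "udma2": 2}


def rate_mb_per_s(cycle_ns: float) -> float:
    """Bytes per nanosecond scaled to MB/s (1 MB = 10**6 bytes)."""
    return 2.0 / cycle_ns * 1000.0


if __name__ == "__main__":
    for mode, clocks in CLOCKS_PER_CYCLE.items():
        cycle = clocks * PCI_CLOCK_NS
        print(f"{mode}: {cycle} ns cycle, {rate_mb_per_s(cycle):.1f} MB/s")
```

The same formula matches the /proc/ide/via listing elsewhere in the
thread, where a 60 ns cycle is shown next to roughly 33 MB/s and a 600 ns
cycle next to 3.3 MB/s.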

Could you (if you didn't already) send me the lspci -vvxxx after the -X65
(UDMA1) command, together with the one before? That could also tell us
something.

 I wouldn't say it's rock solid, and I would not trust my data to any of
 these combinations, but at least it does not break immediately (i.e. with
 less than 1 GB written).

Actually, the CRC messages are safe and only mean a data transfer is
retried. That is, only if it doesn't fail every time. They happen on
many boards and drives using UDMA even under normal correct operation :(

 The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
 to 3.11 helps, but it is still very bad.
 
 I'd really like to be more precise, but there are too many combinations to
 try them all, and sometimes it fails right away, and sometimes
 after several hundred megabytes.

If 'fails after several hundred megabytes' only means a single CRC error
which is recovered from correctly, then that actually means 'working and
probably would work perfectly with a shorter cable'.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-13 Thread Tobias Ringstrom

I have now tried the SAMSUNG VG34323A disk with two other controllers at
home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
motherboard), and there are no problems to be found with DMA enabled.
Streaming 10 MB/s without glitches.

However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
this machine [1] (at work, using the VIA IDE driver version 3.11)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

or this machine [2] (at work, using the VIA IDE driver version 2.1e)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

I get exactly the following errors on both machines

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }

no matter what cable I use.  When I get this, the machine does not recover
most of the time, and I have to reset or power cycle.  This disc works
flawlessly on two other IDE controllers, so I do not think that the disk
is completely broken. It must be either these chipsets or the driver in
combination with this disk.  Note that I _can_ use another UDMA66 disk
_with_ DMA enabled on both machine [1] and [2] above without problems.
Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
tried 2.2.16-22 with DMA enabled on machine [2].

The problem I reported at first, hence the nasty subject, was a hang and a
nasty fs corruption when RH7 tried to remount the root fs read-write.  I
examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
discovered, to my great disgust, that the stupid thing disables the dmesg
output on the console very early in the script.  It is thus entirely
possible that I do get the above mentioned errors when the computer seems
to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
if my assumption is correct.

SUMMARY:  I have a disk that, with DMA enabled, gives me CRC errors on two
machines, but not on two others, independent of the cable.  Neither of the
troubled machines recovers from these errors.  Linux 2.2.16-22 from RedHat
works fine with DMA enabled on machine [1]; [2] is unknown.

I hope this makes things a lot clearer.

/Tobias




Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Tobias Ringstrom

On Sat, 13 Jan 2001, Vojtech Pavlik wrote:

> On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:
> > > 2) What's in /proc/ide/via?
> >
> > It's not there since I disabled the VIA driver.
>
> Ok. Could you send me this file when you boot with fs r-o?

Ok, but this is with the wrong disc.  With the bad disc, drive0 looks
exactly like drive2, i.e. normal UDMA(33).  Sorry about that.

----------VIA BusMastering IDE Configuration----------------
Driver Version:                     2.1e
South Bridge:                       VIA vt82c686a rev 0x1b
Command register:                   0x7
Latency timer:                      32
PCI clock:                          33MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
FIFO Output Data 1/2 Clock Advance: off
BM IDE Status Register Read Retry:  on
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:           on                  on
End Sect. FIFO flush:          on                  on
Prefetch Buffer:               on                  on
Post Write Buffer:             on                  on
FIFO size:                      8                   8
Threshold Prim.:              1/2                 1/2
Bytes Per Sector:             512                 512
Both channels togth:          yes                 yes
-------------------drive0----drive1----drive2----drive3-----
BMDMA enabled:        yes       yes       yes       yes
Transfer Mode:       UDMA   DMA/PIO      UDMA   DMA/PIO
Address Setup:       30ns     120ns      30ns     120ns
Active Pulse:        90ns     330ns      90ns     330ns
Recovery Time:       30ns     270ns      30ns     270ns
Cycle Time:          30ns     600ns      60ns     600ns
Transfer Rate:   66.0MB/s   3.3MB/s  33.0MB/s   3.3MB/s

> > > 4) If you mount your filesystem read-only, does it read garbage?
> >
> > Now here's a strange part, or possibly a crucial clue.  When I booted a
> > 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
> > init=/bin/sh", I could access the filesystem just fine.  I could even
> > remount the root filesystem rw, and there were no problems.  But I did not
> > write anything to the disk, since I was convinced that the problem was
> > gone (this was the second try).  After this I rebooted with
> > ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
> > booted up the RH7 2.2.16 kernel, and fsck was run with no errors.
>
> So far no problem. Rebooting with c-a-d with fs r-o is OK.
>
> > Now I
> > thought all was well, rebooted from floppy again, but without the init=
> > part, and poof, it hung.
>
> Where? It could be a different reason than IDE setup ...

Don't think so.  It happens on the "Remounting root read-write".

> > More interesting may be that I had to turn the computer off and on again
> > to get BIOS to find the hard drive. Repeated long reset button presses
> > did not help.  It is possible that it hung during BIOS hd detection - I
> > wish I could remember.
>
> I fear this isn't much of a clue, sorry.

The clue is that the VIA driver messed up either the chipset or the drive
quite a lot, but maybe that is already obvious.

> > I suspect that I could have hung the drive with init=/bin/sh if I would
> > have done some reading and writing to the device, besides ls.
>
> Please try it. Best mke2fs your swap partition and try reading & writing
> to that. You can mkswap it back after you finish.

After more testing, I think I have isolated the problem to this disk, or
at least this disk with this controller.  With another (UDMA66) disk,
there are no problems.  Details at the end.

> > I think I can spend some more time today trying it out some more.
>
> Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
> driver and with 3.11 driver would help me find the problem.

Ok, I'll do that later.

> Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
> stuff in your init scripts.

I'm sure I don't.  This happens with a clean fresh RH7 installation.

> > I will
> > also try your 3.11 driver, which seems to be an enormous cleanup.
>
> the 2.1e driver is an enormous cleanup of the original driver from the
> 2.2 kernels. the 3.11 is an enormous cleanup of 2.1e, yes.

I have not had a chance to try the 3.11 driver yet.

Now for the new details.  When writing to the disk with DMA enabled, I get
the following errors, in two different machines.  Both are VIA IDE
machines.  It is NOT a cable error.  I have tried with several cables.
Possibly a connector or soldering problem.  I'll try the disk in more
machines and get back with more info.  I have to run now.

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }
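
For readers decoding the two lines above: the hex values are bit masks of
the ATA status and error registers, and the names in braces are the set
bits.  A minimal Python sketch, assuming the standard ATA bit layout and
using the names the kernel prints; the error table lists only the two bits
that appear in this log, and the helper name is made up for illustration.

```python
# ATA status register bits, highest bit first, with the kernel's names.
STATUS_BITS = [
    (0x80, "Busy"), (0x40, "DriveReady"), (0x20, "DeviceFault"),
    (0x10, "SeekComplete"), (0x08, "DataRequest"),
    (0x04, "CorrectedError"), (0x02, "Index"), (0x01, "Error"),
]

# Only the error bits seen in this log, in the order the kernel prints them.
# 0x04 is the "command aborted" bit; 0x80 is reported as BadCRC when it
# appears together with 0x04 on a UDMA transfer.
ERROR_BITS = [(0x04, "DriveStatusError"), (0x80, "BadCRC")]


def decode(value: int, table) -> list:
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    print("status=0x51", decode(0x51, STATUS_BITS))
    print("error=0x84", decode(0x84, ERROR_BITS))
```

So status=0x51 is 0x40 + 0x10 + 0x01, and error=0x84 is 0x80 + 0x04: the
drive aborted the command because the data it received failed the UDMA CRC
check.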

/Tobias





Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Vojtech Pavlik

On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:

> > Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
> > about this:
> >
> > 1) What's the ISA bridge revision?
> 
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> 00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
> 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
> 00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super AC97/Audio] (rev 21)
> 00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
> 01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

Ok, your IDE chip is a vt82c686a/ce.

> > 2) What's in /proc/ide/via?
> 
> It's not there since I disabled the VIA driver.

Ok. Could you send me this file when you boot with fs r-o?

> > 3) What says hdparm -i on your devices?
> 
> /dev/hda:
> 
>  Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
>  Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
>  RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
>  BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
>  CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
>  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
>  PIO modes: pio0 pio1 pio2 pio3 pio4
>  DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2

Looks good, too. An UDMA33 drive.

> > 4) If you mount your filesystem read-only, does it read garbage?
> 
> Now here's a strange part, or possibly a crucial clue.  When I booted a
> 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
> init=/bin/sh", I could access the filesystem just fine.  I could even
> remount the root filesystem rw, and there were no problems.  But I did not
> write anything to the disk, since I was convinced that the problem was
> gone (this was the second try).  After this I rebooted with
> ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
> booted up the RH7 2.2.16 kernel, and fsck was run with no errors.

So far no problem. Rebooting with c-a-d with fs r-o is OK.

> Now I
> thought all was well, rebooted from floppy again, but without the init=
> part, and poof, it hung.

Where? It could be a different reason than IDE setup ...

> More interesting may be that I had to turn the computer off and on again
> to get BIOS to find the hard drive. Repeated long reset button presses
> did not help.  It is possible that it hung during BIOS hd detection - I
> wish I could remember.

I fear this isn't much of a clue, sorry.

> I suspect that I could have hung the drive with init=/bin/sh if I would
> have done some reading and writing to the device, besides ls.

Please try it. Best mke2fs your swap partition and try reading & writing
to that. You can mkswap it back after you finish.
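
A read/write test of this kind could look like the Python sketch below: it
writes a known pattern to a file on the scratch filesystem, reads it back
and compares checksums.  The function name, sizes and path are hypothetical.
Note the caveat that reads may be served from the page cache rather than
the disk, so a meaningful run needs a file larger than RAM or an
unmount/remount between the write and the read.

```python
import hashlib
import os


def write_and_verify(path: str, total_mb: int = 4) -> bool:
    """Write total_mb MiB of deterministic data to path and verify it."""
    chunk = 1024 * 1024
    written = hashlib.sha256()
    with open(path, "wb") as f:
        for i in range(total_mb):
            # 32-byte digest repeated to fill exactly one 1 MiB chunk
            block = hashlib.sha256(str(i).encode()).digest() * (chunk // 32)
            f.write(block)
            written.update(block)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            read_back.update(data)
    return written.hexdigest() == read_back.hexdigest()


if __name__ == "__main__":
    # Hypothetical mount point of the scratch (former swap) partition.
    print(write_and_verify("/mnt/scratch/dma-test.bin", total_mb=64))
```

A False result, or CRC errors in the kernel log while it runs, would point
at the transfer path rather than at the filesystem contents.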

> I think I can spend some more time today trying it out some more.

Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
driver and with 3.11 driver would help me find the problem.

Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
stuff in your init scripts.

> I will
> also try your 3.11 driver, which seems to be an enormous cleanup.

the 2.1e driver is an enormous cleanup of the original driver from the
2.2 kernels. the 3.11 is an enormous cleanup of 2.1e, yes.

> Btw, do
> you have a home page for the VIA driver?  A CVS perhaps?  If not, please
> consider using sourceforge or something similar.

No, not yet, but working on that.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Tobias Ringstrom

On Fri, 12 Jan 2001, Vojtech Pavlik wrote:
> Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
> about this:
>
> 1) What's the ISA bridge revision?

00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super AC97/Audio] (rev 21)
00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

> 2) What's in /proc/ide/via?

It's not there since I disabled the VIA driver.

> 3) What says hdparm -i on your devices?

/dev/hda:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2

> 4) If you mount your filesystem read-only, does it read garbage?

Now here's a strange part, or possibly a crucial clue.  When I booted a
2.4.0 kernel (from floppy using the excellent syslinux) with "ro
init=/bin/sh", I could access the filesystem just fine.  I could even
remount the root filesystem rw, and there were no problems.  But I did not
write anything to the disk, since I was convinced that the problem was
gone (this was the second try).  After this I rebooted with
ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
booted up the RH7 2.2.16 kernel, and fsck was run with no errors.  Now I
thought all was well, rebooted from floppy again, but without the init=
part, and poof, it hung.

More interesting may be that I had to turn the computer off and on again
to get BIOS to find the hard drive.  Repeated long reset button presses
did not help.  It is possible that it hung during BIOS hd detection - I
wish I could remember.

I suspect that I could have hung the drive with init=/bin/sh if I would
have done some reading and writing to the device, besides ls.

I think I can spend some more time today trying it out some more.  I will
also try your 3.11 driver, which seems to be an enormous cleanup.  Btw, do
you have a home page for the VIA driver?  A CVS perhaps?  If not, please
consider using sourceforge or something similar.

/Tobias




Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Tobias Ringstrom

On Fri, 12 Jan 2001, Vojtech Pavlik wrote:
 Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
 about this:

 1) What's the ISA bridge revision?

00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super 
AC97/Audio] (rev 21)
00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

 2) What's in /proc/ide/via?

It's not there since I disabled the VIA driver.

 3) What says hdparm -i on your devices?

/dev/hda:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw15uSec Fixed DTR10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2

 4) If you mount your filesystem read-only, does it read garbage?

Now here's a strange part, or possibly a crusial clue.  When I booted a
2.4.0 kernel (from floppy using the excellent syslinux) with "ro
init=/bin/sh", I could access the filesystem just fine.  I could even
remount the root filesystem rw, and there were no problems.  But I did not
write anything to the disk, since I was convinced that the problem was
gone (this was the second try).  After this I rebooted with
ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
booted up the RH7 2.2.16 kernel, and fsck was run with no errors.  Now I
though all was well, rebooted from floppy again, but without the init=
part, and poof, it hang.

More interesting may be that I had to turn the computer off and on again
to get BIOS to find the hard drive.  Repeated long reset button presses
did not help.  It is possible that it hung during BIOS hd detection - I
wish I could remember.

I suspect that I could have hung the drive with init=/bin/sh if I would
have done some reading and writing to the device, besides ls.

I think I can spend some more time today trying it out some more.  I will
also try your 3.11 driver, which seems to be an enormous cleanup.  Btw, do
you have a home page for the VIA driver?  A CVS perhaps?  If not, please
consider using sourceforge or something similar.

/Tobias

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Vojtech Pavlik

On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:

  Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
  about this:
 
  1) What's the ISA bridge revision?
 
 00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
 00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
 00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
 00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super 
AC97/Audio] (rev 21)
 00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
 01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

Ok, your IDE chip is a vt82c686a/ce.

  2) What's in /proc/ide/via?
 
 It's not there since I disabled the VIA driver.

Ok. Could you send me this file when you boot with fs r-o?

  3) What says hdparm -i on your devices?
 
 /dev/hda:
 
  Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
  Config={ HardSect NotMFM HdSw15uSec Fixed DTR10Mbs }
  RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
  BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
  CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
  PIO modes: pio0 pio1 pio2 pio3 pio4
  DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2

Looks good, too. An UDMA33 drive.

  4) If you mount your filesystem read-only, does it read garbage?
 
 Now here's a strange part, or possibly a crusial clue.  When I booted a
 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
 init=/bin/sh", I could access the filesystem just fine.  I could even
 remount the root filesystem rw, and there were no problems.  But I did not
 write anything to the disk, since I was convinced that the problem was
 gone (this was the second try).  After this I rebooted with
 ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
 booted up the RH7 2.2.16 kernel, and fsck was run with no errors.

So far no problem. Rebooting with c-a-d with fs r-o is OK.

 Now I
 though all was well, rebooted from floppy again, but without the init=
 part, and poof, it hang.

Where? It could be a different reason than IDE setup ...

 More interesting may be that I had to turn the computer off and on again
 to get BIOS to find the hard drive. Repeated long reset button presses
 did not help.  It is possible that it hung during BIOS hd detection - I
 wish I could remember.

I fear this isn't much of a clue, sorry.

 I suspect that I could have hung the drive with init=/bin/sh if I would
 have done some reading and writing to the device, besides ls.

Please try it. Best mke2fs your swap partition and try reading  writing
to that. You can mkswap it back after you finish.

 I think I can spend some more time today trying it out some more.

Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
driver and with 3.11 driver would help me find the problem.

Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
stuff in your init scripts.

 I will
 also try your 3.11 driver, which seems to be an enormous cleanup.

the 2.1e driver is an enormous cleanup of the original driver from the
2.2 kernels. the 3.11 is an enormous cleanup of 2.1e, yes.

 Btw, do
 you have a home page for the VIA driver?  A CVS perhaps?  If not, please
 consider using sourceforge or something similar.

No, not yet, but working on that.

-- 
Vojtech Pavlik
SuSE Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Tobias Ringstrom

On Sat, 13 Jan 2001, Vojtech Pavlik wrote:

 On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:
   2) What's in /proc/ide/via?
 
  It's not there since I disabled the VIA driver.

 Ok. Could you send me this file when you boot with fs r-o?

Ok, but this is with the wrong disc.  Withe the bad disc, drive0 looks
exacly like drive2, i.e. normal UDMA(33).  Sorry about that.

--VIA BusMastering IDE Configuration
Driver Version: 2.1e
South Bridge:   VIA vt82c686a rev 0x1b
Command register:   0x7
Latency timer:  32
PCI clock:  33MHz
Master Read  Cycle IRDY:0ws
Master Write Cycle IRDY:0ws
FIFO Output Data 1/2 Clock Advance: off
BM IDE Status Register Read Retry:  on
Max DRDY Pulse Width:   No limit
---Primary IDE---Secondary IDE--
Read DMA FIFO flush:   on  on
End Sect. FIFO flush:  on  on
Prefetch Buffer:   on  on
Post Write Buffer: on  on
FIFO size:  8   8
Threshold Prim.:  1/2 1/2
Bytes Per Sector: 512 512
Both channels togth:  yes yes
---drive0drive1drive2drive3-
BMDMA enabled:yes   yes   yes   yes
Transfer Mode:   UDMA   DMA/PIO  UDMA   DMA/PIO
Address Setup:   30ns 120ns  30ns 120ns
Active Pulse:90ns 330ns  90ns 330ns
Recovery Time:   30ns 270ns  30ns 270ns
Cycle Time:  30ns 600ns  60ns 600ns
Transfer Rate:   66.0MB/s   3.3MB/s  33.0MB/s   3.3MB/s

> > > 4) If you mount your filesystem read-only, does it read garbage?
> >
> > Now here's a strange part, or possibly a crucial clue.  When I booted a
> > 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
> > init=/bin/sh", I could access the filesystem just fine.  I could even
> > remount the root filesystem rw, and there were no problems.  But I did not
> > write anything to the disk, since I was convinced that the problem was
> > gone (this was the second try).  After this I rebooted with
> > ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
> > booted up the RH7 2.2.16 kernel, and fsck was run with no errors.
>
> So far no problem. Rebooting with c-a-d with fs r-o is OK.
>
> > Now I thought all was well, rebooted from floppy again, but without the
> > init= part, and poof, it hung.
>
> Where? It could be a different reason than IDE setup ...

Don't think so.  It happens on the "Remounting root read-write".

> > More interesting may be that I had to turn the computer off and on again
> > to get BIOS to find the hard drive. Repeated long reset button presses
> > did not help.  It is possible that it hung during BIOS hd detection - I
> > wish I could remember.
>
> I fear this isn't much of a clue, sorry.

The clue is that the VIA driver messed up either the chipset or the drive
quite a lot, but maybe that is already obvious.

> > I suspect that I could have hung the drive with init=/bin/sh if I had
> > done some reading and writing to the device, besides ls.
>
> Please try it. Best mke2fs your swap partition and try reading and writing
> to that. You can mkswap it back after you finish.

After more testing, I think I have isolated the problem to this disk, or
at least this disk with this controller.  With another (UDMA66) disk,
there are no problems.  Details at the end.
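
The kind of write-then-read-back test suggested above could be sketched roughly as follows. This is not what was actually run in this thread; the function name and the scratch path in the usage comment are hypothetical. It should only be pointed at a scratch filesystem (such as the reformatted swap partition), and the read-back may be served from the page cache rather than from the disk, so a mismatch is meaningful but a match proves less.

```python
import hashlib
import os


def write_and_verify(path, total_bytes=64 * 1024 * 1024, chunk=1024 * 1024):
    """Write a deterministic pattern to path, read it back, compare SHA-256 digests."""
    written = hashlib.sha256()
    remaining = total_bytes
    counter = 0
    with open(path, "wb") as f:
        while remaining > 0:
            # Build a chunk that differs from block to block.
            seed = hashlib.sha256(counter.to_bytes(8, "big")).digest()
            data = (seed * (chunk // len(seed) + 1))[:min(chunk, remaining)]
            f.write(data)
            written.update(data)
            remaining -= len(data)
            counter += 1
        f.flush()
        os.fsync(f.fileno())  # push the data out to the device before reading back

    read_back = hashlib.sha256()
    with open(path, "rb") as f:
        for data in iter(lambda: f.read(chunk), b""):
            read_back.update(data)
    return written.hexdigest() == read_back.hexdigest()


# Hypothetical usage, assuming the scratch filesystem is mounted on /mnt/scratch:
#   ok = write_and_verify("/mnt/scratch/dma-test.bin")
```

On a machine that hangs on the first DMA write, the call will simply never return, which is itself a result worth noting together with any kernel messages on the console.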

> > I think I can spend some more time today trying it out some more.
>
> Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
> driver and with 3.11 driver would help me find the problem.

Ok, I'll do that later.

> Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
> stuff in your init scripts.

I'm sure I don't.  This happens with a clean fresh RH7 installation.

> > I will
> > also try your 3.11 driver, which seems to be an enormous cleanup.
>
> the 2.1e driver is an enormous cleanup of the original driver from the
> 2.2 kernels. the 3.11 is an enormous cleanup of 2.1e, yes.

I have not had a chance to try the 3.11 driver yet.

Now for the new details.  When writing to the disk with DMA enabled, I get
the following errors, on two different machines.  Both are VIA IDE
machines.  It is NOT a cable error.  I have tried with several cables.
Possibly a connector or soldering problem.  I'll try the disk in more
machines and get back with more info.  I have to run now.

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }
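
For readers unfamiliar with these two lines, the hex values are bit masks of the drive's status and error registers, and the names in braces are just the set bits spelled out. The sketch below decodes them; the bit names follow the wording the kernel uses in the log above where it appears there, the remaining names are descriptive labels chosen for illustration, and reading bit 0x80 of the error register as a CRC error assumes an Ultra DMA transfer.

```python
# (mask, name) pairs, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ERROR_BITS = [
    (0x80, "BadCRC"),             # assumption: UDMA transfer in progress
    (0x40, "UncorrectableError"),
    (0x20, "MediaChanged"),
    (0x10, "SectorIdNotFound"),
    (0x08, "MediaChangeRequested"),
    (0x04, "DriveStatusError"),   # command aborted
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


print("status=0x51", decode(0x51, STATUS_BITS))
print("error=0x84 ", decode(0x84, ERROR_BITS))
```

Decoding 0x51 and 0x84 this way gives the same flag names that the kernel printed, although not necessarily in the same order.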

/Tobias


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-13 Thread Tobias Ringstrom

I have now tried the SAMSUNG VG34323A disk with two other controllers at
home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
motherboard), and there are no problems to be found with DMA enabled.
Streaming 10 MB/s without glitches.

However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
this machine [1] (at work, using the VIA IDE driver version 3.11)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

or this machine [2] (at work, using the VIA IDE driver version 2.1e)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

I get exactly the following errors on both machines

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }

no matter what cable I use.  When I get this, the machine does not recover
most of the time, and I have to reset or power cycle.  This disk works
flawlessly on two other IDE controllers, so I do not think that the disk
is completely broken. It must be either these chipsets or the driver in
combination with this disk.  Note that I _can_ use another UDMA66 disk
_with_ DMA enabled on both machine [1] and [2] above without problems.
Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
tried 2.2.16-22 with DMA enabled on machine [2].

The problem I reported at first, hence the nasty subject, was a hang and a
nasty fs corruption when RH7 tried to remount the root fs read-write.  I
examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
discovered, to my great disgust, that the stupid thing disables the dmesg
output on the console very early in the script.  It is thus entirely
possible that I do get the above mentioned errors when the computer seems
to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
if my assumption is correct.
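
Whether a kernel message reaches the console depends on the console log level, which on Linux can be read from /proc/sys/kernel/printk (four integers, the first being the current console log level). The sketch below only parses that format and applies the rule that a message is shown when its priority number is lower than the console log level. The function names and the sample string are made up for illustration, and how the RH7 script actually sets the level is not shown here.

```python
FIELD_NAMES = (
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
)


def parse_printk(text):
    """Parse the contents of /proc/sys/kernel/printk into a dict of ints."""
    values = [int(field) for field in text.split()[:4]]
    return dict(zip(FIELD_NAMES, values))


def shown_on_console(message_level, console_loglevel):
    """A message is printed on the console if its level is below console_loglevel."""
    return message_level < console_loglevel


# Illustrative sample only: a console log level of 1 lets through
# nothing but level-0 (emergency) messages.
sample = parse_printk("1 4 1 7")
print(sample)
print("level 3 (error) shown:", shown_on_console(3, sample["console_loglevel"]))
print("level 0 (emerg) shown:", shown_on_console(0, sample["console_loglevel"]))
```

On a live system the input would come from reading /proc/sys/kernel/printk rather than from a hard-coded string.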

SUMMARY:  I have a disk that with DMA enabled gives me CRC errors on two
machines, but not on two others, regardless of the cable.  Neither of the
troublesome machines recovers from these errors.  Linux 2.2.16-22 from RedHat
works fine with DMA enabled on machine [1]; [2] is unknown.

I hope this makes things a lot clearer.

/Tobias




Re: 2.4 ate my filesystem on rw-mount

2001-01-12 Thread Vojtech Pavlik

On Fri, Jan 12, 2001 at 12:23:21PM -0500, Martin Laberge wrote:

> > > This is on a 450 MHz AMD-K6 with the following IDE controller:
> > > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> >
> > There are several people who have reported that the 2.4.0 VIA IDE driver
> > trashes hard disks like that. The 2.2 one also did this sometimes but only
> > with specific chipset versions and if you have dma autotune on (that's why
> > currently 2.2 refuses to do tuning on VP3)
> 
> I had exactly the same problem with my K6-350 and IDE VT82C586a
> on a 2.2.16 kernel. I just ran hdparm to enable DMA and poof,
> lost all data; a reinstall from scratch was necessary

Is this problem still present with 2.4.0? Well, you don't need to kill
your data to test this - make sure the kernel is mounting the
filesystems read-only in the test. DMA will probably be enabled
automatically for your drives.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount

2001-01-12 Thread Vojtech Pavlik

On Fri, Jan 12, 2001 at 10:15:45AM +0100, Tobias Ringstrom wrote:
> I've never seen anything like it before, which I'm happy for.  The system
> had been running a standard RedHat 7 kernel for days without any problems,
> but who wants to run a 2.2 kernel?  I compiled 2.4.0 for it, rebooted, and
> blam!  The RedHat init scripts got to the "remounting root read-write"
> point, and just froze solid.
> 
> Rebooting into RH7 failed, because inittab could not be found.  In fact
> the filesystem was completely messed up, with /dev empty, lots of device
> nodes in /etc, and files missing all over the place.  I had to reinstall
> RH7 from scratch.
> 
> I do not understand how this could happen during a remounting root rw.
> Is the filesystem really that unstable?
> 
> Am I right in suspecting DMA, which was enabled at the time?  Any other
> ideas?  Is it a known problem?
> 
> This is on a 450 MHz AMD-K6 with the following IDE controller:
> 
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> 
> [I know this is not a very good trouble report, but it will have to do for
> the time being.  I hope to do more testing at a later time.]
> 
> /Tobias
> 
> PS. This is _not_ the same system that I reported IDE busy errors for.

Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
about this:

1) What's the ISA bridge revision?
2) What's in /proc/ide/via?
3) What does hdparm -i say about your devices?
4) If you mount your filesystem read-only, does it read garbage?

Thanks.

-- 
Vojtech Pavlik
SuSE Labs



Re: 2.4 ate my filesystem on rw-mount

2001-01-12 Thread Martin Laberge

Alan Cox wrote:

> > This is on a 450 MHz AMD-K6 with the following IDE controller:
> > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
>
> There are several people who have reported that the 2.4.0 VIA IDE driver
> trashes hard disks like that. The 2.2 one also did this sometimes but only
> with specific chipset versions and if you have dma autotune on (that's why
> currently 2.2 refuses to do tuning on VP3)

I had exactly the same problem with my K6-350 and IDE VT82C586a
on a 2.2.16 kernel. I just ran hdparm to enable DMA and poof,
lost all data; a reinstall from scratch was necessary

Martin Laberge
[EMAIL PROTECTED]





Re: 2.4 ate my filesystem on rw-mount

2001-01-12 Thread Alan Cox

> This is on a 450 MHz AMD-K6 with the following IDE controller:
> 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

There are several people who have reported that the 2.4.0 VIA IDE driver
trashes hard disks like that. The 2.2 one also did this sometimes but only
with specific chipset versions and if you have dma autotune on (that's why
currently 2.2 refuses to do tuning on VP3)




2.4 ate my filesystem on rw-mount

2001-01-12 Thread Tobias Ringstrom

I've never seen anything like it before, which I'm happy for.  The system
had been running a standard RedHat 7 kernel for days without any problems,
but who wants to run a 2.2 kernel?  I compiled 2.4.0 for it, rebooted, and
blam!  The RedHat init scripts got to the "remounting root read-write"
point, and just froze solid.

Rebooting into RH7 failed, because inittab could not be found.  In fact
the filesystem was completely messed up, with /dev empty, lots of device
nodes in /etc, and files missing all over the place.  I had to reinstall
RH7 from scratch.

I do not understand how this could happen during a remounting root rw.
Is the filesystem really that unstable?

Am I right in suspecting DMA, which was enabled at the time?  Any other
ideas?  Is it a known problem?

This is on a 450 MHz AMD-K6 with the following IDE controller:

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

[I know this is not a very good trouble report, but it will have to do for
the time being.  I hope to do more testing at a later time.]

/Tobias

PS. This is _not_ the same system that I reported IDE busy errors for.



