WOL with 3c59x and 2.4.6-pre6 breaks WOL

2001-06-29 Thread Tobias Ringstrom

I just tried 2.4.6-pre6 this morning, and found out that when I enable
WOL (using enable_wol=1), my 3c905c-tx does not work at all any more.
It worked just fine with 2.4.5.  Without enable_wol=1, I have no problems.

It is my guess that this is very easy to reproduce, but if not, please ask
me for more details.  I'm attaching the dmesg output.  I'll be gone until
Monday.

/Tobias



Linux version 2.4.6-pre6 ([EMAIL PROTECTED]) (gcc version 2.96 2731 (Red Hat Linux 7.1 2.96-85)) #2 Thu Jun 28 17:45:38 CEST 2001
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009e800 (usable)
 BIOS-e820: 0009e800 - 000a (reserved)
 BIOS-e820: 000f - 0010 (reserved)
 BIOS-e820: 0010 - 0ffec000 (usable)
 BIOS-e820: 0ffec000 - 0ffef000 (ACPI data)
 BIOS-e820: 0ffef000 - 0000 (reserved)
 BIOS-e820: 0000 - 1000 (ACPI NVS)
 BIOS-e820:  - 0001 (reserved)
On node 0 totalpages: 65516
zone(0): 4096 pages.
zone(1): 61420 pages.
zone(2): 0 pages.
Kernel command line: BOOT_IMAGE=linux ro root=2103 BOOT_FILE=/boot/vmlinuz hdc=ide-scsi 1
ide_setup: hdc=ide-scsi
Initializing CPU#0
Detected 1009.014 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 2011.95 BogoMIPS
Memory: 255060k/262064k available (1329k kernel code, 6612k reserved, 454k data, 220k init, 0k highmem)
Dentry-cache hash table entries: 32768 (order: 6, 262144 bytes)
Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
Buffer-cache hash table entries: 16384 (order: 4, 65536 bytes)
Page-cache hash table entries: 65536 (order: 6, 262144 bytes)
CPU: Before vendor init, caps: 0183f9ff c1c7f9ff , vendor = 2
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After vendor init, caps: 0183f9ff c1c7f9ff  
CPU: After generic, caps: 0183f9ff c1c7f9ff  
CPU: Common caps: 0183f9ff c1c7f9ff  
CPU: AMD Athlon(tm) Processor stepping 02
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.40 (20010327) Richard Gooch ([EMAIL PROTECTED])
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xf1150, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
Unknown bridge resource 0: assuming transparent
PCI: Using IRQ router VIA [1106/0686] at 00:04.0
PCI: Found IRQ 9 for device 00:09.0
PCI: Sharing IRQ 9 with 00:04.2
PCI: Sharing IRQ 9 with 00:04.3
PCI: Sharing IRQ 9 with 00:0d.0
Found VT82C686A, not applying VIA latency patch.
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.14)
Starting kswapd v1.8
parport0: PC-style at 0x378 [PCSPP(,...)]
parport_pc: Via 686A parallel port: io=0x378
pty: 256 Unix98 ptys configured
Serial driver version 5.05a (2001-03-20) with MANY_PORTS SHARE_IRQ SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10d
ppdev: user-space parallel port driver
block: queued sectors max/low 169437kB/56479kB, 512 slots per queue
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 21
VP_IDE: chipset revision 16
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:04.1
ide0: BM-DMA at 0xd800-0xd807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xd808-0xd80f, BIOS settings: hdc:DMA, hdd:pio
PDC20265: IDE controller on PCI bus 00 dev 88
PCI: Found IRQ 10 for device 00:11.0
PDC20265: chipset revision 2
PDC20265: not 100% native mode: will probe irqs later
PDC20265: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
ide2: BM-DMA at 0x7800-0x7807, BIOS settings: hde:DMA, hdf:DMA
ide3: BM-DMA at 0x7808-0x780f, BIOS settings: hdg:pio, hdh:DMA
hda: HITACHI DVD-ROM GD-7500, ATAPI CD/DVD-ROM drive
hdc: PLEXTOR CD-R PX-W1210A, ATAPI CD/DVD-ROM drive
hde: IBM-DTLA-307060, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
ide2 at 0x9000-0x9007,0x8802 on irq 10
hde: 120103200 sectors (61493 MB) w/1916KiB Cache, CHS=119150/16/63, UDMA(100)
hda: ATAPI 40X DVD-ROM drive, 512kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.12
Partition check:
 hde: [PTBL] [7476/255/63] hde1 hde2 hde3
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
loop: loaded (max 8 devices)
PPP generic driver version 2.4.1
PPP Deflate 

Re: VM Requirement Document - v0.0

2001-06-28 Thread Tobias Ringstrom

On Thu, 28 Jun 2001, Alan Cox wrote:

> > > That isn't really down to labelling pages, what you are talking about is what
> > > you get for free when page aging works right (eg 2.0.39) but don't get in
> > > 2.2 - and don't yet (although it's coming) quite get right in 2.4.6pre.
> >
> > Correct, but all pages are not equal.
>
> That is the whole point of page aging done right. The use of a page dictates
> how it is aged before being discarded. So pages referenced once are aged
> rapidly, but once they get touched a couple of times then you know they aren't
> streaming I/O. There are other related techniques like punishing pages that
> are touched when streaming I/O is done to pages further down the same file -
> FreeBSD does this one for example

Are you saying that classification of pages will not be useful?

Only looking at the page access patterns can certainly reveal a lot, but
tuning how to punish different pages is useful.
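
To make the page aging idea quoted above concrete, here is a toy sketch in Python. It is not the algorithm of any kernel release; the constants and the names `Page`, `touch` and `scan` are assumptions made up for illustration. A page that is referenced once and then left alone loses its age quickly, while a page that keeps being touched builds up age and survives several idle scans.

```python
from dataclasses import dataclass
from typing import List

# Illustrative constants, not taken from any kernel release.
AGE_START = 1      # age given to a page on its first reference
AGE_ADVANCE = 3    # added each time a scan finds the page referenced
AGE_MAX = 20       # upper bound so that hot pages can still cool down


@dataclass
class Page:
    name: str
    age: int = AGE_START
    referenced: bool = False


def touch(page: Page) -> None:
    """Record an access; the next scan will reward the page."""
    page.referenced = True


def scan(pages: List[Page]) -> List[Page]:
    """One aging pass: reward referenced pages, halve the age of idle ones.

    Returns the pages whose age has reached zero (eviction candidates).
    """
    victims = []
    for page in pages:
        if page.referenced:
            page.age = min(page.age + AGE_ADVANCE, AGE_MAX)
            page.referenced = False
        else:
            page.age //= 2
            if page.age == 0:
                victims.append(page)
    return victims


if __name__ == "__main__":
    stream = Page("updatedb-buffer")   # read once, never touched again
    hot = Page("emacs-text")           # touched before every scan
    pages = [stream, hot]
    for _ in range(2):
        touch(hot)
        print([p.name for p in scan(pages)], hot.age)
```

Classifying pages, as asked about above, would amount to making the constants depend on the kind of page instead of using one set for all of them.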

> > The problem with updatedb is that it pushes all applications to the swap,
> > and when you get back in the morning, everything has to be paged back from
> > swap just because the (stupid) OS is prepared for yet another updatedb
> > run.
>
> Updatedb is a bit odd in that it mostly sucks in metadata and the buffer to
> page cache balancing is a bit suspect IMHO.

In 2.4.6-pre, the buffer cache is no longer used for metadata, right?

/Tobias

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: VM Requirement Document - v0.0

2001-06-28 Thread Tobias Ringstrom

On Thu, 28 Jun 2001, Alan Cox wrote:

> > This would be extremely useful. My laptop has 256mb of ram, but every day
> > it runs the updatedb for locate. This fills the memory with the file
> > cache. Interactivity is then terrible, and swap is unnecessarily used. On
> > the laptop all this hard drive thrashing is bad news for battery life
>
> That isn't really down to labelling pages, what you are talking about is what
> you get for free when page aging works right (eg 2.0.39) but don't get in
> 2.2 - and don't yet (although it's coming) quite get right in 2.4.6pre.

Correct, but all pages are not equal.

The problem with updatedb is that it pushes all applications to the swap,
and when you get back in the morning, everything has to be paged back from
swap just because the (stupid) OS is prepared for yet another updatedb
run.

Other bad activities include copying lots of files, tar/untar:ing and CD
writing.  They all cause unwanted paging, at least for the desktop user.

/Tobias




Re: VM Requirement Document - v0.0

2001-06-28 Thread Tobias Ringstrom

On 28 Jun 2001, Xavier Bestel wrote:

> On 28 Jun 2001 14:02:09 +0200, Tobias Ringstrom wrote:
>
> > This would be very useful, I think.  Would it be very hard to classify
> > pages like this (text/data/cache/...)?
>
> How would you classify a page of perl code ?

I do know how the Perl interpreter works, but I think it byte-compiles the
code and puts it in the data segment, which also would have a high paging
cost.

The perl source code would be paged in/out before running binaries such as
shells and the window system, but the same thing would happen to binaries
with short life-span, I suppose.  Perhaps cached executables and cached
data files can be classified differently as well.

What I meant to ask with the question above was if it would be hard to
implement the classification in the kernel.

/Tobias




Re: VM Requirement Document - v0.0

2001-06-28 Thread Tobias Ringstrom

On Thu, 28 Jun 2001, Helge Hafting wrote:
> Preventing swap-thrashing at all cost doesn't help if the
> machine loses to I/O thrashing instead.  Performance will be
> just as much down, although perhaps more satisfying because
> people aren't that surprised if explicit file operations
> take a long time.  They hate it when moving the mouse
> or something causes a disk access even if their
> apps run faster. :-(

Exactly.  I still want the ability to tune the system according to my
taste.  I've been thinking about this for some time, and I've specifically
tried to come up with nice tunables, completely ignoring if it is possible
now or not.

If individual pages could be classified as code (text segments), data,
file cache, and so on, I would specify costs to the paging of such pages
in or out.  This way I can make the system prefer to drop a file cache
page that has not been accessed for five minutes, over a program text page
that has not been accessed for one hour (or much more).
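
As a rough illustration of the cost idea, the Python sketch below picks an eviction victim by dividing idle time by a per-class cost. The weights in `PAGING_COST` and the names `Page`, `eviction_score` and `pick_victim` are hypothetical, chosen only so that the five-minute versus one-hour example above comes out as described; nothing like this exists in the kernel.

```python
from dataclasses import dataclass

# Hypothetical per-class cost weights: a higher cost means a page must stay
# idle proportionally longer before it is considered for eviction.
PAGING_COST = {"text": 20.0, "data": 10.0, "file_cache": 1.0}


@dataclass
class Page:
    name: str
    kind: str            # one of the PAGING_COST keys
    idle_seconds: float  # time since the page was last accessed


def eviction_score(page: Page) -> float:
    """Idle time discounted by class cost; the highest score goes first."""
    return page.idle_seconds / PAGING_COST[page.kind]


def pick_victim(pages):
    """Return the page that this policy would drop first."""
    return max(pages, key=eviction_score)


if __name__ == "__main__":
    pages = [
        Page("cached-file", "file_cache", 5 * 60),   # idle for five minutes
        Page("emacs-text", "text", 60 * 60),         # idle for one hour
    ]
    print(pick_victim(pages).name)
```

With these weights the file cache page scores 300 and the text page 180, so the cache page is dropped first. The text page only becomes the victim once it has been idle for more than 100 minutes, and that threshold is the kind of number a tunable would expose.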

This would be very useful, I think.  Would it be very hard to classify
pages like this (text/data/cache/...)?

Any reason why this is a bad idea?

/Tobias






Re: Snowhite and the Seven Dwarfs - The REAL story!

2001-06-15 Thread Tobias Ringstrom

On Fri, 15 Jun 2001, Hahaha wrote:

> Today, Snowhite was turning 18. The 7 Dwarfs always where very educated and
> polite with Snowhite. When they go out work at mornign, they promissed a
> *huge* surprise. Snowhite was anxious. Suddlently, the door open, and the Seven
> Dwarfs enter...

Ah... the joy of reading mail using non-MS software, on a non-MS OS...

Hahaha, indeed!

/Tobias





Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Tobias Ringstrom

On Fri, 8 Jun 2001, Mike Galbraith wrote:
> On Fri, 8 Jun 2001, Tobias Ringstrom wrote:
> > On Fri, 8 Jun 2001, Mike Galbraith wrote:
> > > I gave this a shot at my favorite vm beater test (make -j30 bzImage)
> > > while testing some other stuff today.
> >
> > Could you please explain what is good about this test?  I understand that
> > it will stress the VM, but will it do so in a realistic and relevant way?
>
> Can you explain what is bad about this test? ;)  It spins the same VM wheels

I think a load of ~30 is quite uncommon, and therefore it is unclear to me
that it would be a test that would be representative of most normal loads.

> as any other load does.  What's the difference if I have a bunch of httpd
> allocating or a bunch of cc1/as/ld?  This load has a modest cachable data
> set and is compute bound.. and above all gives very repeatable results.

Not a big difference.  The difference I was thinking about is the
difference between spawning lots of processes allocating, using and
freeing lots of memory, compared to a case where you have a few processes
touching a lot of already allocated pages in some pattern.  I was
wondering whether optimizing for your case would be good or bad for the
other case.  I know, I know, I should do more testing myself.  And I
should probably not ask you, since you really really like your test,
and you will probably just say yes... ;-)

At home, I'm running a couple of computers.  One of them is a slow
computer running Linux, serving mail, NFS, SMB, etc.  I'm usually logged
in on a couple of virtual consoles.  On this machine, I do not mind if all
shells, daemons and other idle processes are being swapped out in favor
of disk cache for the NFS and SMB serving.  In fact, that is a very good
thing, and I want it that way.

Another machine is my desktop machine.  When using this machine, I really
hate when my emacsen, browsers, xterms, etc are swapped out just to give
me some stupid disk cache for my xmms or compilations.  I do not care if a
kernel compile is a little slower as long as my applications are snappy.

How could Linux predict this?  It is a matter of taste, IMHO.

> I use it to watch reaction to surge.  I watch for the vm to build to a
> solid maximum throughput without thrashing.  That's the portion of VM
> that I'm interested in, so that's what I test.  Besides :) I simply don't
> have the hardware to try to simulate hairy chested server loads.  There
> are lots of folks with hairy chested boxes.. they should test that stuff.

Agreed.  More testing is needed.  Now if we would have those knobs and
wheels to turn, we could perhaps also tune our systems to behave as we
like them, and submit that as well.  Right now you need to be a kernel
hacker, and see through all the magic with shm, mmap, a bunch of caches,
page lists, etc.  I'd give a lot for a nice picture (or state diagram)
showing the lifetime of a page, but I have not found such a picture
anywhere.  Besides, the VM seems to change every new release anyway.

> I've been repeating ~this test since 2.0 times, and have noticed a 1:1
> relationship.  When I notice that my box is ~happy doing this load test,
> I also notice very few VM gripes hitting the list.

Ok, but as you say, we need more tests.

> > Isn't the interesting case when you have a number of processes using lots
> > of memory, but only a part of all that memory is being actively used, and
> > that memory fits in RAM.  In that case, the VM should make sure that the
> > not used memory is swapped out.  In RAM you should have the used memory,
> > but also disk cache if there is any RAM left.  Does the current VM handle
> > this case fine yet?  IMHO, this is the case most people care about.  It is
> > definitely the case I care about, at least. :-)
>
> The interesting case is _every_ case.  Try seeing my particular test as
> a simulation of a small classroom box with 30 students compiling their
> assignments and it'll suddenly become quite realistic.  You'll notice
> by the numbers I post that I was very careful to not overload the box in
> a ridiculous manner when selecting the total size of the job.. it's just
> a heavily loaded box.  This test does not overload my IO resources, so
> it tests the VM's ability to choose and move the right stuff at the right
> time to get the job done with a minimum of additional overhead.

I did not understand those numbers when I saw them the first time.  Now, I
must say that your test does not look as silly as it did before.

> The current VM handles things generally well imho, but has problems
> regulating itself under load.  My test load hits the VM right in its
> weakest point (not _that_ weak, but..) by starting at zero and building
> rapidly to max.. and keeping it _right there_.
>
> > I'm not saying that it's a complet

Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Tobias Ringstrom

On Fri, 8 Jun 2001, Mike Galbraith wrote:
> I gave this a shot at my favorite vm beater test (make -j30 bzImage)
> while testing some other stuff today.

Could you please explain what is good about this test?  I understand that
it will stress the VM, but will it do so in a realistic and relevant way?

Isn't the interesting case when you have a number of processes using lots
of memory, but only a part of all that memory is being actively used, and
that memory fits in RAM.  In that case, the VM should make sure that the
not used memory is swapped out.  In RAM you should have the used memory,
but also disk cache if there is any RAM left.  Does the current VM handle
this case fine yet?  IMHO, this is the case most people care about.  It is
definitely the case I care about, at least. :-)

I'm not saying that it's a completely uninteresting case when your active
memory is bigger than your RAM of course, but perhaps there should be other
algorithms handling that case, such as putting some of the swapping
processes to sleep for some time, especially if you have lots of processes
competing for the memory. I may be wrong, but it seems to me that your
testcase falls into this second category (also known as thrashing).
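
The two cases above can be told apart with a small Python sketch. Everything in it is an assumption made for illustration: the names `Proc` and `plan_memory`, the idea of measuring a working set in megabytes, and above all the choice to suspend the largest working set first, which the text does not specify.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proc:
    name: str
    active_mb: int   # memory touched recently (the working set)
    total_mb: int    # everything the process has allocated


def plan_memory(procs: List[Proc], ram_mb: int) -> Tuple[List[str], int]:
    """Return (names of processes to put to sleep, RAM left for disk cache).

    Case 1: all working sets fit in RAM. Nothing is suspended, the inactive
    part of each process can go to swap, and the rest of RAM is disk cache.
    Case 2: the working sets exceed RAM. Suspend the largest working sets
    until the remainder fits, instead of letting every process thrash.
    """
    running = sorted(procs, key=lambda p: p.active_mb)
    suspended = []
    while running and sum(p.active_mb for p in running) > ram_mb:
        suspended.append(running.pop().name)   # drop the largest working set
    cache_mb = ram_mb - sum(p.active_mb for p in running)
    return suspended, cache_mb


if __name__ == "__main__":
    desktop = [Proc("emacs", 50, 200), Proc("browser", 60, 300)]
    print(plan_memory(desktop, 256))    # first case: everything fits

    build = [Proc("cc1-a", 100, 120), Proc("cc1-b", 150, 160),
             Proc("cc1-c", 120, 130)]
    print(plan_memory(build, 256))      # second case: one process sleeps
```

A real implementation would also have to wake the sleeping processes again after some time, which this sketch leaves out.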

And at last, a humble request:  Every problem I've had with the VM has been
that it either swapped out too many processes and used too much cache, or
the other way around.  I'd really enjoy a way to tune this behaviour, if
possible.

/Tobias




Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Tobias Ringstrom

On Fri, 8 Jun 2001, Mike Galbraith wrote:
 I gave this a shot at my favorite vm beater test (make -j30 bzImage)
 while testing some other stuff today.

Could you please explain what is good about this test?  I understand that
it will stress the VM, but will it do so in a realistic and relevant way?

Isn't the interesting case when you have a number of processes using lots
of memory, but only a part of all that memory is beeing actively used, and
that memory fits in RAM.  In that case, the VM should make sure that the
not used memory is swapped out.  In RAM you should have the used memory,
but also disk cache if there is any RAM left.  Does the current VM handle
this case fine yet?  IMHO, this is the case most people care about.  It is
definately the case I care about, at least. :-)

I'm not saying that it's a completely uninteresting case when your active
memory is bigger than you RAM of course, but perhaps there should be other
algorithms handling that case, such as putting some of the swapping
processes to sleep for some time, especially if you have lots of processes
competing for the memory. I may be wrong, but it seems to me that your
testcase falls into this second category (also known as thrashing).

An at last, a humble request:  Every problem I've had with the VM has been
that it either swapped out too many processes and used too much cache, or
the other way around.  I'd really enjoy a way to tune this behaviour, if
possible.

/Tobias

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: VM Report was:Re: Break 2.4 VM in five easy steps

2001-06-08 Thread Tobias Ringstrom

On Fri, 8 Jun 2001, Mike Galbraith wrote:
 On Fri, 8 Jun 2001, Tobias Ringstrom wrote:
  On Fri, 8 Jun 2001, Mike Galbraith wrote:
   I gave this a shot at my favorite vm beater test (make -j30 bzImage)
   while testing some other stuff today.
 
  Could you please explain what is good about this test?  I understand that
  it will stress the VM, but will it do so in a realistic and relevant way?

 Can you explain what is bad about this test? ;)  It spins the same VM wheels

I think a load of ~30 is quit uncommon, and therefor it is unclear to me
that it would be a test that would be repesentative of most normal loads.

 as any other load does.  What's the difference if I have a bunch of httpd
 allocating or a bunch of cc1/as/ld?  This load has a modest cachable data
 set and is compute bound.. and above all gives very repeatable results.

Not a big difference.  The difference I was thinking abount is the
difference between spawning lots of processes allocating, using and
freeing lots of memory, compared to a case where you have a few processes
touching a lot of already allocated pages in some pattern.  I was
wondering whether optimizing for your case would be good or bad for the
other case.  I know, I know, I should do more testing myself.  And I
should probably not ask you, since you really really like your test,
and you will probably just say yes... ;-)

At home, I'm running a couple of computers.  One of them is a slow
computer running Linux, serving mail, NFS, SMB, etc.  I'm usually logged
in on a couple of virtual consoles.  On this machine, I do not mind if all
shells, daemons and other idle processes are being swapped out in favor
of disk cache for the NFS and SMB serving.  In fact, that is a very good
thing, and I want it that way.

Another machine is my desktop machine.  When using this machine, I really
hate when my emacsen, browsers, xterms, etc are swapped out just to give
me some stupid disk cache for my xmms or compilations.  I do not care if a
kernel compile is a little slower as long as my applications are snappy.

How could Linux predict this?  It is a matter of taste, IMHO.

> I use it to watch reaction to surge.  I watch for the vm to build to a
> solid maximum throughput without thrashing.  That's the portion of VM
> that I'm interested in, so that's what I test.  Besides :) I simply don't
> have the hardware to try to simulate hairy chested server loads.  There
> are lots of folks with hairy chested boxes.. they should test that stuff.

Agreed.  More testing is needed.  Now if we had those knobs and
wheels to turn, we could perhaps also tune our systems to behave as we
like, and submit those settings as well.  Right now you need to be a kernel
hacker, and see through all the magic with shm, mmap, a bunch of caches,
page lists, etc.  I'd give a lot for a nice picture (or state diagram)
showing the lifetime of a page, but I have not found such a picture
anywhere.  Besides, the VM seems to change every new release anyway.

> I've been repeating ~this test since 2.0 times, and have noticed a 1:1
> relationship.  When I notice that my box is ~happy doing this load test,
> I also notice very few VM gripes hitting the list.

Ok, but as you say, we need more tests.

> > Isn't the interesting case when you have a number of processes using lots
> > of memory, but only a part of all that memory is being actively used, and
> > that memory fits in RAM?  In that case, the VM should make sure that the
> > unused memory is swapped out.  In RAM you should have the used memory,
> > but also disk cache if there is any RAM left.  Does the current VM handle
> > this case fine yet?  IMHO, this is the case most people care about.  It is
> > definitely the case I care about, at least. :-)

> The interesting case is _every_ case.  Try seeing my particular test as
> a simulation of a small classroom box with 30 students compiling their
> assignments and it'll suddenly become quite realistic.  You'll notice
> by the numbers I post that I was very careful to not overload the box in
> a ridiculous manner when selecting the total size of the job.. it's just
> a heavily loaded box.  This test does not overload my IO resources, so
> it tests the VM's ability to choose and move the right stuff at the right
> time to get the job done with a minimum of additional overhead.

I did not understand those numbers when I saw them the first time.  Now, I
must say that your test does not look as silly as it did before.

> The current VM handles things generally well imho, but has problems
> regulating itself under load.  My test load hits the VM right in its
> weakest point (not _that_ weak, but..) by starting at zero and building
> rapidly to max.. and keeping it _right there_.

> > I'm not saying that it's a completely uninteresting case when your active
> > memory is bigger than your RAM of course, but perhaps there should be other
> > algorithms handling that case, such as putting some of the swapping
> > processes to sleep for some time, especially if you

Re: [PATCH] drivers/net/others

2001-05-24 Thread Tobias Ringstrom

Andrzej,

Thanks for your impressive clean-up patch.  I have a couple of comments
regarding your clean-up of the dmfe.c driver.

On Thu, 24 May 2001, Andrzej Krzysztofowicz wrote:

> @@ -395,7 +395,7 @@
>   u32 dev_rev, pci_pmr;
>
>   if (!printed_version++)
> - printk(version);
> + printk("%s", version);
>
>   DMFE_DBUG(0, "dmfe_init_one()", 0);
>

Could you please explain the purpose of this change?  To me it looks less
efficient in both performance and memory usage.
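
For context, one commonly cited reason for a change like this (an assumption
here, since the patch author's rationale is not shown in this thread) is that
a string passed directly as the format argument has every '%' in it
interpreted as a conversion.  A minimal user-space sketch, using snprintf as
a stand-in for printk and a hypothetical banner string in place of the
driver's real one:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical banner; the real dmfe version string is not shown in this thread. */
static const char version[] = "example_nic: v1.0 (100% hypothetical)\n";

/* Copy the banner through a "%s" format so that any '%' in it is treated as data. */
static int format_banner(char *buf, size_t len)
{
    return snprintf(buf, len, "%s", version);
}

/* Returns 1 if the formatted output is byte-for-byte identical to the banner. */
int banner_is_verbatim(void)
{
    char buf[128];
    int n = format_banner(buf, sizeof(buf));

    return n == (int)strlen(version) && strcmp(buf, version) == 0;
}
```

The cost is one extra argument and a short format string, which is the
overhead the question above refers to.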

> @@ -2024,8 +2027,10 @@
>  {
>   int rc;
>
> - printk(version);
> +#ifdef MODULE
> + printk("s", version);
>   printed_version = 1;
> +#endif /* MODULE */
>
>   DMFE_DBUG(0, "init_module() ", debug);
>

Whoups...  And why did you add the ifdef, btw?

/Tobias



Re: bug in redhat gcc 2.96

2001-05-09 Thread Tobias Ringstrom

On Wed, 9 May 2001, Alan Cox wrote:
> > Any suggestions for a way to cope with this?  We have a
> > customer whose system fails due to this.
>
> You can build 2.4 quite sanely with egcs-1.1.2 (aka kgcc)

Since there is no kgcc in RH71, will you be releasing an updated gcc
rpm, or is the best solution to download and compile egcs-1.1.2 from
source?

IMHO, it is best not to revert to an old egcs version, but instead
continue to find bugs in the upcoming 3.0 release.  I'm assuming that
your fixes for your gcc-2.96 are propagated to the pre-3.0 branch.

/Tobias



Re: page_launder() bug

2001-05-07 Thread Tobias Ringstrom

On Sun, 6 May 2001, David S. Miller wrote:
> It is the most straightforward way to make a '1' or '0'
> integer from the NULL state of a pointer.

But is it really specified in the C "standards" to be exactly zero or one,
and not zero and non-zero?

IMHO, the ?: construct is way more readable and reliable.
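
For reference, ISO C defines the result of the logical negation operator !
(and of the comparison operators) as an int that is exactly 0 or 1.
Assuming the idiom under discussion is the double negation !!ptr (the quoted
text does not show it), the sketch below compares it with two other
spellings.  The function names are made up for illustration:

```c
#include <stddef.h>

/* Double negation: the inner ! yields 0 or 1, the outer ! flips it. */
int flag_double_not(const void *p)
{
    return !!p;
}

/* Explicit comparison against NULL; != also yields exactly 0 or 1. */
int flag_compare(const void *p)
{
    return p != NULL;
}

/* The ?: spelling, which states the two result values outright. */
int flag_ternary(const void *p)
{
    return p ? 1 : 0;
}
```

All three return the same value for any pointer, so the choice between them
is a matter of readability rather than correctness.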

/Tobias



Using alloc_skb causes memory corruption in 2.4.4

2001-04-30 Thread Tobias Ringstrom

I get severe memory corruption when forwarding packets from eth1 to eth0,
where eth0 is a 3Com 905C-TX (zc, hw checksumming), and eth1 is a Davicom
9102.  In every case it is the last two bytes of a 4096-byte block that
have been cleared.

To make a long bug hunting story short, the eth1 driver (dmfe) uses
alloc_skb for skbuf allocation, and if I change it into dev_alloc_skb, the
problem disappears.

Did I find the real problem, or did I just hide it?
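
For context, a rough user-space model of how the two allocation calls differ.
It rests on the assumption that dev_alloc_skb in 2.4 allocates 16 bytes more
than requested and reserves them as headroom, so the data area starts 16
bytes into the buffer.  The struct and function names are hypothetical, and
this is a sketch of the layout only, not kernel code and not a diagnosis:

```c
#include <stddef.h>

#define MODEL_HEADROOM 16  /* assumption: headroom reserved by dev_alloc_skb */

/* Simplified picture of a receive buffer. */
struct model_skb_layout {
    size_t alloc_len;    /* bytes requested from the allocator */
    size_t data_offset;  /* offset at which packet data starts */
};

/* Model of alloc_skb(len, ...): data starts at the very beginning. */
struct model_skb_layout model_alloc_skb(size_t len)
{
    struct model_skb_layout l = { len, 0 };
    return l;
}

/* Model of dev_alloc_skb(len): larger allocation, data starts after headroom. */
struct model_skb_layout model_dev_alloc_skb(size_t len)
{
    struct model_skb_layout l = { len + MODEL_HEADROOM, MODEL_HEADROOM };
    return l;
}

/* Bytes available between the data start and the end of the allocation. */
size_t model_room(struct model_skb_layout l)
{
    return l.alloc_len - l.data_offset;
}
```

Under this model both calls leave the same room for data, but the address
handed to the card (skb->tail in the patch below) moves by the headroom, which
may shift where any overrun lands.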

/Tobias


diff -ru linux-2.4.4.orig/drivers/net/dmfe.c linux-2.4.4/drivers/net/dmfe.c
--- linux-2.4.4.orig/drivers/net/dmfe.c Sat Apr 28 11:41:49 2001
+++ linux-2.4.4/drivers/net/dmfe.c  Mon Apr 30 15:15:02 2001
@@ -1306,7 +1306,7 @@
rxptr = db->rx_insert_ptr;

while (db->rx_avail_cnt < RX_DESC_CNT) {
-   if ((skb = alloc_skb(RX_ALLOC_SIZE, GFP_ATOMIC)) == NULL)
+   if ((skb = dev_alloc_skb(RX_ALLOC_SIZE)) == NULL)
break;
rxptr->rx_skb_ptr = (u32) skb;
rxptr->rdes2 = virt_to_bus(skb->tail);



IDE reset and DMA disabled after CD read error

2001-04-27 Thread Tobias Ringstrom

When reading a bad CD, Linux 2.4.4-pre5 decided to turn off DMA when
trying to read a bad sector.  It also decided to reset the drive.  Is that
the expected behaviour?  I'm certainly not an ATAPI expert, but it does
seem a bit drastic to me.  The drive is in UDMA33 mode on a VIA vt82c686a,
with no (U)DMA problems detected (so far).

Isn't it possible to recognise a read error and treat it more gently?

/Tobias, fumbling in the dark


Here is the dmesg output:

Apr 27 23:32:31 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:31 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:31 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:31 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:32 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:32 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:33 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:33 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:33 igor kernel: hda: DMA disabled
Apr 27 23:32:33 igor kernel: hda: ATAPI reset complete
Apr 27 23:32:33 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:33 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:34 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:34 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:35 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:35 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:35 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:35 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:35 igor kernel: hda: ATAPI reset complete
Apr 27 23:32:36 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:36 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:36 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:36 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:37 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:37 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:37 igor kernel: hda: ATAPI reset complete
Apr 27 23:32:37 igor kernel: end_request: I/O error, dev 03:00 (hda), sector 651222
Apr 27 23:32:38 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:38 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:38 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:38 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:39 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:39 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:40 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:40 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:40 igor kernel: hda: ATAPI reset complete
Apr 27 23:32:40 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:40 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:41 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:41 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:42 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:42 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:42 igor kernel: hda: ATAPI reset complete
Apr 27 23:32:42 igor kernel: end_request: I/O error, dev 03:00 (hda), sector 651222
Apr 27 23:32:42 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:42 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:43 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:43 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:44 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:44 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:44 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:44 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:44 igor kernel: hda: ATAPI reset complete
Apr 27 23:32:45 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:45 igor kernel: hda: cdrom_decode_status: error=0x34
Apr 27 23:32:46 igor kernel: hda: cdrom_decode_status: status=0x51 { DriveReady 
SeekComplete Error }
Apr 27 23:32:46 igor 

Weird problem with 2.4.4-pre6

2001-04-25 Thread Tobias Ringstrom

Yesterday, I was running tcpdump, paging the output with less.  All of a
sudden, less started to dump core (SIGSEGV).  I could not even start less
by itself:

> less

without it getting a SIGSEGV, and in fact no user could run less without
getting a SIGSEGV, but it did work perfectly a few minutes earlier.  This
morning, I tried to run less again, and now it was working!  No core
dumps!

How can this happen?  Something overwriting the page/buffer cache?
Unfortunately, I don't know how to reproduce it.  I'm writing this because
it was so strange that I felt I had to share it.  There are no messages in
the (dmesg) log.

/Tobias, a little bit worried


Semi-random info:

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:07.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] (rev 01)
00:07.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:07.2 USB Controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:0b.0 VGA compatible controller: ATI Technologies Inc 210888GX [Mach64 GX] (rev 01)
00:0f.0 Ethernet controller: Davicom Semiconductor, Inc. Ethernet 100/10 MBit (rev 31)
00:11.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 74)

hda is running with DMA enabled in mdma2 mode.



Re: Compiling problem kernel 2.4.2

2001-03-26 Thread Tobias Ringstrom

On Mon, 26 Mar 2001, Theodoor Scholte wrote:

> There are no relevant messages in that file.

Strange, but I bet that you can compile again, right?  (Just remove the
broken compile.h that the dd command created)  Must have been an NFS
fluke, and without any more precise error messages, there is not much to
do, unless you can reproduce it.

/Tobias



Re: Compiling problem kernel 2.4.2

2001-03-26 Thread Tobias Ringstrom

On Mon, 26 Mar 2001, Theodoor Scholte wrote:

> Hello,
>
> I have a problem with compiling kernel-2.4.2. When I want to make a bzImage
> on a RedHat Linux 5.2 box,
> then I get this error-message:

> [...]

> cpp: /usr/src/linux/include/linux/compile.h: Input/output error

Disk full?  Bad disk?

/Tobias



Re: Where is the RAM?

2001-03-22 Thread Tobias Ringstrom

On Thu, 22 Mar 2001, Neal Gieselman wrote:

> I have a Redhat 6.1 WS that was installed with 64 MB RAM.  I added another
> 64 MB, booted, BIOS sees it, but top, free, etc still see only 64 MB.
> Any clues on what to do?

Add mem=128M (or mem=127M if that fails) to the boot line (append in
LILO), or upgrade the kernel to something recent.

/Tobias



Re: Hashing and directories

2001-03-02 Thread Tobias Ringstrom

On 2 Mar 2001, Oystein Viggen wrote:
> Pavel Machek wrote:
> > xargs is very ugly. I want to rm 12*. Just plain "rm 12*". *Not* "find
> These you work around using the smarter, \0 terminated, version:

Another example demonstrating why xargs is not always good (and why a
bigger command line is needed) is when you combine it with e.g. wc:

find . -type f -print0 | xargs -0 wc

You cannot trust the summary line from wc, since xargs may have decided to
run wc many times, and thus you have many summary lines.  If the kernel
would allow a larger command line, you could run

wc `find . -type f`

and get exactly what you want.  And if I'm not mistaken, Linux accepts a
much smaller command line than other "unices" such as Solaris.

...but it's not _that_ important...  obviously there has to be an upper
limit somewhere...
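
The limit in question is what POSIX calls ARG_MAX, the maximum combined size
of the arguments and environment passed to exec.  A minimal sketch for
checking it on a given system (arg_max_bytes is a name made up here; sysconf
and _SC_ARG_MAX are the standard POSIX interface):

```c
#include <unistd.h>

/*
 * Returns the ARG_MAX limit in bytes as reported by the running system,
 * or -1 if the system reports no definite limit.
 */
long arg_max_bytes(void)
{
    return sysconf(_SC_ARG_MAX);
}
```

Printing the result on two systems gives a direct comparison of how long a
command line such as the wc invocation above can be on each.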

/Tobias



Re: hang on mount, 2.4.2-pre4, VIA

2001-02-20 Thread Tobias Ringstrom

On Tue, 20 Feb 2001, Dan Christian wrote:
> Hello,
>   I just tried upgrading to 2.4.2-pre4 from 2.4.1 and get a hang when
> mounting the file systems.  I have the same problem with 2.4.1-ac18.

Have you tried to set LOGLEVEL in /etc/sysconfig/init to something higher
(8)? That way you may see what is happening, instead of just getting a
kernel freeze.

/Tobias



Re: making forward at vger.rutgers.org? [was Re: maestro3 patch,resent]

2001-02-12 Thread Tobias Ringstrom

On Sun, 11 Feb 2001, Pavel Machek wrote:

> Hi!
>
> > duh.  I sent this to rutgers originally..
>
> I'm doing same mistake over and over.
>
> Perhaps creating forward at vger.rutgers.edu would be good thing (tm)?

Then how would you ever learn?  ;-)

/Tobias



Re: WOL failure after shutdown

2001-02-12 Thread Tobias Ringstrom

On Sun, 11 Feb 2001, James Brents wrote:

> Sorry, I wrote that in a hurry. It's a 3Com PCI 3c905C Tornado. I can
> successfully use wakeonlan if I power off the machine immediately after
> turning it on. Using the shutdown command, which it will when I need it
> to power back up, it will not work.
> I'm using a wakeonlan cable to my motherboard as well, not using wake
> through PCI bus.
> Kernel is 2.4.1
> I apologize for not providing all the required specs in the original
> message.

Try this patch.  It is against the zero-copy version of the driver, but
I'm sure you can apply it, at least manually, to any 2.4 version.

Andrew, when can we expect to have WOL working in 2.4?

/Tobias


--- linux-2.4.1-zc1.orig/drivers/net/3c59x.c	Tue Jan 30 22:16:01 2001
+++ linux-2.4.1-zc1/drivers/net/3c59x.c	Wed Jan 31 08:46:00 2001
@@ -754,6 +754,7 @@
 static void set_rx_mode(struct net_device *dev);
 static int vortex_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
 static void vortex_tx_timeout(struct net_device *dev);
+static void acpi_wake(struct pci_dev *pdev);
 static void acpi_set_WOL(struct net_device *dev);

 /* This driver uses 'options' to pass the media type, full-duplex flag, etc. */
@@ -1426,6 +1427,8 @@
 	int i;
 	int retval;

+	acpi_wake(vp->pdev);
+
 	/* Use the now-standard shared IRQ implementation. */
 	if ((retval = request_irq(dev->irq, vp->full_bus_master_rx ?
 				&boomerang_interrupt : &vortex_interrupt, SA_SHIRQ, dev->name, dev))) {
@@ -2647,12 +2650,6 @@
 	struct vortex_private *vp = (struct vortex_private *)dev->priv;
 	long ioaddr = dev->base_addr;

-	/* AKPM: This kills the 905 */
-	if (vortex_debug > 1) {
-		printk(KERN_INFO PFX "Wake-on-LAN functions disabled\n");
-	}
-	return;
-
 	/* Power up on: 1==Downloaded Filter, 2==Magic Packets, 4==Link Status. */
 	EL3WINDOW(7);
 	outw(2, ioaddr + 0x0c);
@@ -2663,6 +2660,34 @@
 	pci_write_config_word(vp->pdev, 0xe0, 0x8103);
 }

+/* Change from D3 (sleep) to D0 (active).
+   Problem: The Cyclone forgets all PCI config info during the transition! */
+static void acpi_wake(struct pci_dev *pdev)
+{
+	u32 base0, base1, romaddr;
+	u16 pci_command, pwr_command;
+	u8  pci_latency, pci_cacheline, irq;
+
+	pci_read_config_word(pdev, 0xe0, &pwr_command);
+	if ((pwr_command & 3) == 0)
+		return;
+	pci_read_config_word( pdev, PCI_COMMAND, &pci_command);
+	pci_read_config_dword(pdev, PCI_BASE_ADDRESS_0, &base0);
+	pci_read_config_dword(pdev, PCI_BASE_ADDRESS_1, &base1);
+	pci_read_config_dword(pdev, PCI_ROM_ADDRESS, &romaddr);
+	pci_read_config_byte( pdev, PCI_LATENCY_TIMER, &pci_latency);
+	pci_read_config_byte( pdev, PCI_CACHE_LINE_SIZE, &pci_cacheline);
+	pci_read_config_byte( pdev, PCI_INTERRUPT_LINE, &irq);
+
+	pci_write_config_word( pdev, 0xe0, 0x0000);
+	pci_write_config_dword(pdev, PCI_BASE_ADDRESS_0, base0);
+	pci_write_config_dword(pdev, PCI_BASE_ADDRESS_1, base1);
+	pci_write_config_dword(pdev, PCI_ROM_ADDRESS, romaddr);
+	pci_write_config_byte( pdev, PCI_INTERRUPT_LINE, irq);
+	pci_write_config_byte( pdev, PCI_LATENCY_TIMER, pci_latency);
+	pci_write_config_byte( pdev, PCI_CACHE_LINE_SIZE, pci_cacheline);
+	pci_write_config_word( pdev, PCI_COMMAND, pci_command | 5);
+}

 static void __devexit vortex_remove_one (struct pci_dev *pdev)
 {
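
To make the magic numbers in the patch a little less magic: the word at
offset 0xe0 is treated as a PCI power-management control/status register,
where the low two bits select the power state (0 is D0, 3 is D3) and bit 8
enables PME, the wake-up signal.  A small sketch that decodes such a word.
pmcsr_decode is a made-up helper, and the bit layout is assumed from the
generic PCI power-management spec, not from 3Com documentation:

```shell
# pmcsr_decode HEXWORD: decode a PCI power-management control/status word.
pmcsr_decode() {
    val=$(( 0x$1 ))
    state=$(( val & 3 ))            # bits 1:0, the same test acpi_wake() makes
    pme_en=$(( (val >> 8) & 1 ))    # bit 8, PME (wake-up) enable
    echo "D$state pme_enable=$pme_en"
}

pmcsr_decode 8103   # the value acpi_set_WOL() writes
pmcsr_decode 0000   # state bits cleared, i.e. D0
```

The first call prints "D3 pme_enable=1", the second "D0 pme_enable=0".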



Re: VT82C686A corruption with 2.4.x

2001-01-31 Thread Tobias Ringstrom

On Wed, 31 Jan 2001, safemode wrote:

> I'm wondering... Perhaps it's a problem motherboard specific.  I'm
> using the KA7 and saw pretty bad problems (extreme fs corruption)
> and bad latency. Perhaps the K7V and the KT7's don't have this problem.
> I don't see any of the problems with dma enabled on 2.2.x

But are you using the same DMA mode in 2.2 as in 2.4?  You can check that
using hdparm -i, I believe.
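
Something along these lines should do, assuming hdparm -i flags the active
transfer mode with a leading asterisk.  active_mode is only an illustrative
helper, and /dev/hda is a placeholder:

```shell
# active_mode: read `hdparm -i` output on stdin, print the starred mode(s).
active_mode() {
    tr ' \t' '\n\n' | grep '^\*[a-z]' | tr -d '*'
}

# Run once under each kernel and compare (needs root):
#   hdparm -i /dev/hda | active_mode
```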

/Tobias



Re: Wavelan IEEE driver

2001-01-31 Thread Tobias Ringstrom

On Tue, 30 Jan 2001, Jurgen Botz wrote:
> and appears to work.  I did observe a problem with iwconfig dumping
> core, but it seems to do its job before it dies, so this may be non-
> critical.

Make sure you compile wireless-tools using the right headers.  You must
manually insert -I/path/to/running-linux-version/include in the Makefile.

This is due to bad (non-existent) ioctl backward and forward
compatibility, which is being worked on.  Basically, you cannot use the
tools compiled with one version of the wireless extension headers on a
kernel with another version of the wireless extensions.  The symptom is at
best a SEGV, but you may also get strange values.

/Tobias



Re: WOL and 3c59x (3c905c-tx)

2001-01-30 Thread Tobias Ringstrom

On Wed, 31 Jan 2001, Tobias Ringstrom wrote:
> Would it be enough to port the acpi_wake function to 2.4?  If so, I can do
> that myself.  In fact, I think I'll try that right away.  Who needs
> breakfast anyway? :-)

Ok, I tried it, and it works.  I can now start my computer using WOL
packets after an "init 0" in Linux.
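
For anyone who wants to repeat the test without a ready-made tool: a magic
packet is six 0xff bytes followed by the target MAC address repeated sixteen
times.  A sketch that builds the payload as a hex string.  magic_packet_hex
is a made-up name, and the MAC address in the usage line is a placeholder:

```shell
# magic_packet_hex MAC: print the Wake-on-LAN payload for MAC as hex digits.
magic_packet_hex() {
    # Strip separators and normalise to lower case.
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    payload=ffffffffffff
    i=0
    while [ "$i" -lt 16 ]; do
        payload=$payload$mac
        i=$((i + 1))
    done
    printf '%s\n' "$payload"
}

# Usage: magic_packet_hex 00:11:22:33:44:55
```

The result is 204 hex digits (102 bytes).  Turned into raw bytes, for
instance with xxd -r -p, it still has to be sent as a broadcast frame, which
is left to whatever tool is at hand.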

I do not know how it behaves in a suspend/wake-up situation, though.  Let me
know when you have a patch for 2.4, and I'll try it.

/Tobias



Re: WOL and 3c59x (3c905c-tx)

2001-01-30 Thread Tobias Ringstrom

On Wed, 31 Jan 2001, Andrew Morton wrote:
> The code was broken, so I disabled it.

Because of the loss of state bug with Cyclone, and the "missing" acpi_wake
workaround, right?

> I "fixed" WOL in the 2.2.19-pre candidate driver.  It's
> at http://www.uow.edu.au/~andrewm/linux/3c59x.c-2.2.19-pre6-1.gz
>
> I'd really appreciate it if you could test the WOL in
> that driver.  Then we can port it into 2.4 and try to
> fool Linus into thinking it's a bugfix :)

Of course it is a bug-fix!  I'm very bugged by the current behaviour!
Doesn't that count? :-)

Ugh.  Is the 2.2 driver more advanced than the 2.4 one?  Only temporary, I
hope... :-)

But alas, I cannot easily test this patch, since I need 2.4 for my ATA100
IDE controller, but please send me a patch for 2.4 as soon as you have
one, and I'll help you test it.

Would it be enough to port the acpi_wake function to 2.4?  If so, I can do
that myself.  In fact, I think I'll try that right away.  Who needs
breakfast anyway? :-)

/Tobias, the one smiley per sentence guy :-)



WOL and 3c59x (3c905c-tx)

2001-01-30 Thread Tobias Ringstrom

When shutting down my computer with Linux, I cannot wake it up using
wake-on-LAN, which I can do if I shut it down from WinME or the LILO
prompt using the power button.

I see some "interesting" code in 3c59x.c and acpi_set_WOL, and there is
the following little comment: "AKPM: This kills the 905".

So, what's up?  Does it break all 905s?  And will not changing the state
to D3, as a comment a few lines down says, shut the card down, which seems
to be a bad thing to do in a function called from vortex_probe1...  I know
this code is currently bypassed, but still, what is this?

/Tobias



Re: VT82C686A corruption with 2.4.x

2001-01-30 Thread Tobias Ringstrom

So you have not seen any corruption, but are willing to do testing.  Very
kind, but you could have chosen a better subject, I think.  There are a
lot more rumours than facts regarding the VIA drivers right now.

/Tobias


On Tue, 30 Jan 2001, Nicholas Knight wrote:

> I have a Soyo K7VIA motherboard which uses VT82C686A, with an 800mhz Athlon
> CPU in it.
> So far I've never run a 2.3* or 2.4* kernel on it, I've only done that on my
> P3 using a proprietary micron motherboard that uses an intel BX2 chipset.
> However, I recently trashed my linux installation (doing things totally
> unrelated to the kernel) and now would be more than happy to assist in
> trying to figure out what the heck is causing the filesystem corruption on
> VIA chipsets, but so far I've only found bits and pieces of information on
> it, and have been unable to locate a compilation of information available on
> the problem, so I'd know just where to start.
> If anyone could point me to a good place to start looking, besides the
> thousands of messages containing just bits and pieces of information, I
> could get to work on some testing.



Re: *massive* slowdowns on 2.4.1-pre1[1|2]

2001-01-30 Thread Tobias Ringstrom

On Mon, 29 Jan 2001, Mark Hahn wrote:
> > Kernel 2.4.1-pre11 and pre12 are both massively slower than 2.4.0 on the
> > same machine, compiled with the same options.  The machine is an Athlon
> > 900 on a KT133 chipset.  The slowdown is noticeable in all areas...
>
> this is known: Linus decreed that, since two people reported
> disk corruption on VIA, any machine with a VIA southbridge
> must boot in stupid 1992 mode (PIO).  (yes, it might be possible
> to boot with ide=autodma or something, but who would guess?)

The only patch concerning VIA IDE in 2.4.1 is a patch that honors the
user's choice in "make menuconfig" regarding using DMA by default.  Just
say yes to that option, and you should have DMA enabled at boot, as you
had in 2.4.0.

The old behaviour was a bug.
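
A quick way to see what a given tree was configured with, assuming the
symbol the VIA driver tests (CONFIG_IDEDMA_AUTO) is written to .config like
any other option.  dma_by_default is a made-up helper and the path in the
usage line is a placeholder:

```shell
# dma_by_default CONFIGFILE: report whether DMA-by-default was selected.
dma_by_default() {
    if grep -q '^CONFIG_IDEDMA_AUTO=y' "$1"; then
        echo "DMA by default: yes"
    else
        echo "DMA by default: no"
    fi
}

# Usage: dma_by_default /usr/src/linux/.config
```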

/Tobias



Re: [UPDATE] Zerocopy, last one today I promise :-)

2001-01-25 Thread Tobias Ringstrom

On Thu, 25 Jan 2001, David S. Miller wrote:
> This does show that not too many people are testing this all that
> thoroughly :-) Basically, any sys_sendfile() over TCP using a network
> card other than loopback/3c59x/sunhme/acenic would fail with -EFAULT
> or even worse a kernel crash depending upon architecture.

You may have said it before, but since you're the one who wants it tested,
I'm sure you're happy to repeat it, right? :-)

I understand from your comment that you want people to run it on all kinds
of hardware, both with and without hw checksumming, but how do you want us
to test it?  Is "my computer works as usual with this patch included" what
you are looking for, or do you want us to run specific tests or
benchmarks?

Should I get a speed increase in a normal TCP session, or do I have to use
sendfile to see any change?  Of course, stability is the most important
factor right now, but it would be nice to get a performance boost
out of my tired old P90.  Right now it peaks at about 40 Mb/s for TCP.

/Tobias, a soon to be zerocopy patch tester



[PATCH] No VIA IDE DMA unless configured

2001-01-23 Thread Tobias Ringstrom

Linus, please consider this patch for 2.4.1.  It makes sure the VIA IDE
driver does not enable DMA automatically, unless the user has requested it
using "make whateverconfig".

/Tobias

--- via82cxxx.c.orig	Tue Jan 23 22:26:25 2001
+++ via82cxxx.c	Tue Jan 23 22:27:05 2001
@@ -602,7 +602,9 @@
 #ifdef CONFIG_BLK_DEV_IDEDMA
 	if (hwif->dma_base) {
 		hwif->dmaproc = &via82cxxx_dmaproc;
+#ifdef CONFIG_IDEDMA_AUTO
 		hwif->autodma = 1;
+#endif /* CONFIG_IDEDMA_AUTO */
 	}
 #endif /* CONFIG_BLK_DEV_IDEDMA */
 }



Re: 2.4 ate my filesystem on rw-mount, summary

2001-01-23 Thread Tobias Ringstrom

Ok, folks, it's time for a summary.  Since my last post, I've had time to
experiment a bit more, and I've also had some private communication with
Vojtech.

First, I would like to say that you do need quite a bit of bad luck (or
hardware) to have the same problems I did.  Linux 2.4, VIA and IDE works
very well for most users.  But I really recommend making a backup of all
your vital data before installing 2.4 and enabling DMA with IDE disks.
(And, yes, I did this.  Honest! :-) )

Problem log
===========

1. Installed RedHat 7
2. Built 2.4.0 with VIA driver and DMA by default (well, in 2.4.0, the VIA
   driver will always use DMA by default, whether you want it to or not.)
3. Rebooted -> 2.4.0
4. The computer froze on the remounting root read-write message.
5. Power cycled
6. Rebooted -> 2.2.16-22
7. Got a corrupt disk, missing files, moved files, incorrect file contents
8. Goto 1

So, why did this happen?

Problem one
===========

This one really makes me upset, because had it not been for this one, it
would have been soo much easier to find the cause of the problem.  It is
also so easy to fix.

The problem is that RedHat disables all kernel messages during boot,
except for panics.  In my not so very humble opinion, kernel error
messages, and possibly also warning messages, should of course be shown.
It can easily be fixed by editing /etc/sysconfig/init.

The error messages that were hidden by RH7 were a couple of CRC error
messages, and then an endless stream of "Busy" and "Drive not ready for
command" errors.  More on this later.

Problem two
===========

The computer in question has problems with UDMA(33), otherwise I would not
have gotten CRC errors, and everything would have been fine.  Why I do get
CRC errors, one can so far only speculate, especially since I am able to
use UDMA(66) with another drive, on the same controller, without much
trouble.

One theory is that the PCI bus clock may be too fast, and the drive cannot
catch up.  To check this, I plan to measure the PCI clock to see if this
is true.  Quick measurements with a not too great oscilloscope seem to
indicate a clock speed of around 33.3-33.4 MHz, so it may actually be out
of spec, but not by much.

Another theory is that the CRC errors are caused by bad cables,
connectors, or motherboard, but the fact that I can use UDMA(66) on the
same controller seems to contradict this.  But OTOH I have learnt not to
underestimate the amazing amount of trouble a bad cable can cause.

Possible work-arounds include an "idebus=40" kernel option, or using
hdparm to configure the drive and kernel for UDMA(22).
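
For the hdparm route, a sketch under the assumption that hdparm's -X option
takes the usual transfer-mode number, which for UDMA mode n is 64 + n.
Which mode is actually safe for a given drive is something the example
cannot know, udma_xfer_code is a made-up helper, and /dev/hda is a
placeholder:

```shell
# udma_xfer_code N: the value hdparm -X expects for UDMA mode N.
udma_xfer_code() {
    echo $(( 64 + $1 ))
}

# Example (not run here, needs root): DMA on, UDMA mode 1
#   hdparm -d1 -X"$(udma_xfer_code 1)" /dev/hda
```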

Problem three
=============

The drive that gave me these problems is a SAMSUNG VG34323A, and the
problem with this drive is that it does not seem to recover from CRC
errors.  Once I get my first CRC error, the drive becomes permanently
busy, until I power cycle.

Problem four
============

<speculation>I do not know exactly what Linux is doing when remounting a
partition read-write, but it does seem to update some very sensitive
sectors, and when the write fails, a lot of very vital data is destroyed.
It is perhaps questionable whether the destruction of a couple of files
would be much better than the destruction of /dev, but I think it is.
</speculation>

Lesson
======

Be very careful when enabling DMA on a Linux machine, especially on cheap
hardware.  It is not enough to test DMA on a read-only partition first,
since writing is a completely different story.

...and probably some more things that I either forgot, or are too painful
to remember...

/Tobias

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.4 ate my filesystem on rw-mount, summary

2001-01-23 Thread Tobias Ringstrom

Ok, folks, it's time for a summary.  Since my last post, I've had time to
experiment a bit more, and I've also had some private communication with
Vojtech.

First, I would like to say that you do need quite a bit of bad luck (or
hardware) to have the same problems I did.  Linux 2.4, VIA and IDE works
very well for most users.  But I really recommend making a backup of all
your vital data before installing 2.4 and enabling DMA with IDE disks.
(And, yes, I did this.  Honest! :-) )

Problem log
===

1. Installed RedHat 7
2. Built 2.4.0 with VIA driver and DMA by default (well, in 2.4.0, the VIA
   driver will always use DMA by default, wheather you want to or not.)
3. Rebooted - 2.4.0
4. The computer froze on the remounting root read-write message.
5. Powercycle
6. Rebooted - 2.2.16-22
7. Got a corrupt disk, missing files, moved files, incorrect file contents
8. Goto 1

So, why did this happen?

Problem one
===

This one really makes me upset, because had it not been for this one, it
would have been soo much easier to find the cause of the problem.  It is
also so easy to fix.

The problem is that the RedHat disables all kernel messages during boot,
except for panics.  I my not so very humble opinion, kernel error
messages, and possibly also warning messages, should of course be shown.
It can easyly be fixed by editing /etc/sysconfig/init.

The error messages that was hidden by RH7, was a couple of CRC error
messages, and then an endless stream of "Busy" and "Drive not ready for
command" errors.  More on this later.

Problem two
===

The computer in question has problems with UDMA(33), otherwise I would not
have gotten CRC errors, and everything would have been fine.  Why I do get
CRC errors, one can so far only speculate, especially since I am able to
use UDMA(66) with another drive, on the same controller, without much
trouble.

One theory is that the PCI bus clock may be too fast, and the drive cannot
catch up.  To check this, I plan to measure the PCI clock to see if this
is true.  Quick measurements with a not too great oscilloscope seems to
indicate a clock speed of around 33.3-33.4 MHz, so it may actully be out
of spec, but not by much.

Another theory is that the CRC errors are caused by bad cables,
connectors, or motherboard, but the fact that I can use UDMA(66) on the
same controller seems to contradicts this.  But OTOH I have learnt not to
underestimate the amazing amount of trouble a bad cable can cause.

Possible work-arounds include a "idebus=40" kernel option, or using
hdparm to configure the drive and kernel for UDMA(22).

Problem three
=

The drive that gave me these problems is a SAMSUNG VG34323A, and the
problem with this drive is that it does not seem to recover from CRC
errors.  Once I get my first CRC error, the drive becomes permanently
busy, until I power cycle.

Problem four


speculationI do not know exactly what Linux is doing when remounting a
partition read-write, but it does seem to update some very sensitive
sectors, and when the write fails, a lot of very vital data is destroyed.
It is perhaps questionable whether the destruction of a couple of files
would be much better than the destruction of /dev, but I think it is.
/speculation

Lesson
==

Be very careful when enabling DMA on a Linux machine, especially on cheap
hardware.  It is not enough to test DMA on a read-only partition first,
since writing is a completely different story.

...and probably some more things that I either forgot, or are too painful
to remember...

/Tobias

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



[PATCH] No VIA IDE DMA unless configured

2001-01-23 Thread Tobias Ringstrom

Linus, please consider this patch for 2.4.1.  It makes sure the VIA IDE
driver does not enable DMA automatically, unless the user has requested it
using "make whateverconfig".

/Tobias

--- via82cxxx.c.orig	Tue Jan 23 22:26:25 2001
+++ via82cxxx.c	Tue Jan 23 22:27:05 2001
@@ -602,7 +602,9 @@
 #ifdef CONFIG_BLK_DEV_IDEDMA
 	if (hwif->dma_base) {
 		hwif->dmaproc = via82cxxx_dmaproc;
+#ifdef CONFIG_IDEDMA_AUTO
 		hwif->autodma = 1;
+#endif /* CONFIG_IDEDMA_AUTO */
 	}
 #endif /* CONFIG_BLK_DEV_IDEDMA */
 }





[OT] Re: rsync + ssh fail on raid; okay on 2.2.x

2001-01-18 Thread Tobias Ringstrom

On Thu, 18 Jan 2001, Nick Urbanik wrote:

> Dear folks,
>
> I use rsync to transfer my mail (including this list) from work to home
> over ppp ussing OpenSSH 2.3.0.  I have no problem transfering  hundreds
> of megabytes of my babies' photos from a non-raid partition (going to
> work), but I get:
>
> nsmail/Inbox
> Write failed: Cannot allocate memory
> unexpected EOF in read_timeout

This is not the right place to ask (or answer), but anyway:

Make sure you do not use protocol version 2 with openssh 2.3.0, since it
is ***very*** broken, and more often than not fails to receive (and
transmit?) all data.

Try
dd if=/dev/zero bs=1k count=100 | wc
and
ssh machine dd if=/dev/zero bs=1k count=100 | wc

They should give you the same result.  If not, you have the broken ssh.
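
The same comparison can be scripted.  This is only a sketch: it counts the
bytes each command writes to stdout and compares the totals, and the helper
names are invented for illustration.

```python
import subprocess


def count_bytes(argv):
    """Run a command and return the number of bytes it wrote to stdout."""
    result = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    return len(result.stdout)


def transfer_is_complete(local_argv, remote_argv):
    """Return True if both commands produced the same number of bytes."""
    return count_bytes(local_argv) == count_bytes(remote_argv)


def check_host(host):
    """Compare a local dd with the same dd run through ssh on `host`."""
    dd = ["dd", "if=/dev/zero", "bs=1k", "count=100"]
    return transfer_is_complete(dd, ["ssh", host] + dd)
```

Calling check_host("machine") would run the two commands from above; a False
result means the ssh stream came up short.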

/Tobias




What happened to your kernel changelogs?

2001-01-17 Thread Tobias Ringstrom

I liked them a lot, and I bet I'm not alone.  Are they gone for good, or
have you just ceased writing them for test kernels?

/Tobias




Re: MTRR type AMD Duron/intel ?

2001-01-15 Thread Tobias Ringstrom

On Mon, 15 Jan 2001, Linus Torvalds wrote:
> On Mon, 15 Jan 2001, Tobias Ringstrom wrote:
> >
> > Last time I checked this was issued for perfectly known and valid bridges
> > that advertice no IO resources.  Isn't it a bit silly to issue that
> > warning for that case, or am I missing something?
>
> Ehh - so what do they bridge, then?
>
> I'd say that a bridge that doesn't seem to bridge any IO or MEM region,
> yet has stuff behind it, THAT is the silly thing. Thus the "silly"
> warning.

I'm talking about bridges that bridge memory, but not io, which is quite
common.  (AGP bridges)

I do not have my PCI book right now, but there are two registers,
basically io_base and io_limit, and if io_limit == io_base-1, that means
that no io is bridged.
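
To make the register arithmetic concrete, here is a small sketch of how the
I/O window of a PCI-to-PCI bridge can be decoded.  It assumes the common
16-bit I/O decoding, where the upper four bits of the 8-bit I/O Base and
I/O Limit registers hold address bits 15:12 and the window is 4 KiB
granular; the function name is made up for this example.

```python
def decode_io_window(io_base_reg, io_limit_reg):
    """Decode the 8-bit I/O Base and I/O Limit registers of a bridge.

    Returns (first_address, last_address) of the forwarded I/O window,
    or None if the limit is below the base, i.e. no I/O is bridged.
    Assumption: 16-bit I/O decoding; the upper-16-bit registers are ignored.
    """
    base = (io_base_reg & 0xF0) << 8               # bits 15:12, low bits zero
    limit = ((io_limit_reg & 0xF0) << 8) | 0x0FFF  # bits 15:12, low bits ones
    if limit < base:
        return None
    return base, limit


if __name__ == "__main__":
    print(decode_io_window(0xD0, 0xD0))  # one 4 KiB window at 0xD000
    print(decode_io_window(0xF0, 0x00))  # limit below base: nothing bridged
```

A bridge that only forwards memory, such as the AGP bridges mentioned above,
would be programmed like the second call.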

I still think it's silly.  ;-)

/Tobias




Re: MTRR type AMD Duron/intel ?

2001-01-15 Thread Tobias Ringstrom

On Mon, 15 Jan 2001, David Balazic wrote:

> It also reports something like :
> PCI chipset unknown : assuming transparent

Are you sure it's not

Unknown bridge resource 0: assuming transparent

(which is just about every kernel log I have seen...)

Last time I checked, this was issued for perfectly known and valid bridges
that advertise no IO resources.  Isn't it a bit silly to issue that
warning for that case, or am I missing something?

/Tobias




Re: ide that *does* work well with 2.4.0?

2001-01-15 Thread Tobias Ringstrom

On Mon, 15 Jan 2001, dep wrote:
> i've got to get another udma ide drive today or tomorrow. i know that
> my w.d. is a little flaky, and i've seen reports that at least some
> ibm drives are kind of screwy with 2.4.0.

I have used IBM drives with Intel PIIX, Promise ATA100 and various VIA
chipsets on 2.4.  They have been extremely fast and reliable.  There were
some reports of trouble with IBM disks and a specific chipset, but it
may just as well be the chipset.  Hard to prove without an ATA analyzer
and a full spec...

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

I should also add that the 3.11 driver seems to make things better, but
not yet perfect.  My intuition tells me that I get CRC errors much sooner
with 2.1e than with 3.11.

Have the timings changed from 2.1e to 3.11, and would it be easy to modify
3.11 to get extra safe/paranoid, but lower-performance, timings?

Some extra data:
* B seems to work in 2 with udma2
* A seems to work in 2 with udma1, but not with udma2.

I wouldn't say it's rock solid, and I would not trust my data to any of
these combinations, but at least it does not break immediately (i.e. for
less than 1 GB written).

The worst combination is 2.4.0 with VIA 2.1e and A in 1.  Going from 2.1e
to 3.11 helps, but it is still very bad.

I'd really like to be more precise, but there are too many combinations to
try them all, and sometimes it fails right away, and sometimes only after
several hundred megabytes.

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> > > So the drive *did* work on the vt82c686a in the A7V board? You tested it
> > > both on the Promise and on the 686a? But doesn't work on the 686a in
> > > your other board?
> >
> > Yes, on both the Promise and on the 686a.  But the device revisions are
> > different.  The machine that does NOT work:
> >
> > 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
> > 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
> >
> > The machine that works:
> >
> > 00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
> > 00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)
> >
> > The one the works is a 1 GHz Athlon, and the other is an 800 MHz
> > Pentium-III.

Of course it isn't.  The vt82c686 that does not work is a 450 MHz K-6, not
a PIII.

> > > > no matter what cable I use.  When I get this, the machine does not recover
> > > > most of the time, and I have to reset or power cycle.
> > >
> > > It should be able to recover in a couple (up to 10) minutes ...
> >
> > Who waits 10 minutes for a timeout?  Can it be lowered?
>
> It's not a 10 minute timeout, it's a shorter timeout retried many times.
> Not my code, though - this is generic PCI IDE code, and is a huge mess.

What I get is a number of "Busy" and "Drive not ready for command" errors
for different sectors.

> > Expect another mail with the data you requested within a couple of hours.
>
> Thanks a lot.

Ok, it took a bit longer than that, mostly because my wife and I had
unexpected (but very welcome) guests at home.  It is Sunday, after all...

I have attached a tar file with "lspci -vvxxx" and "hdparm -i" for machine
1 and 2 to this mail, but first some comments.

I will be talking about three machines:

1) 450 MHz K-6 on an AOpen MX59 PRO II motherboard
2) 800 MHz PIII on an unknown cheap/crappy motherboard.
3) 1 GHz Athlon on an ASUS A7V motherboard.

and the following drives:

A) SAMSUNG VG34323A, sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 udma2
B) ST38421A, mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4

Machine 3 is the machine at home, and it does not have problems with any
disks I have tried so far, and seems very stable, both with ATA100 and
ATA66.

I verified that what happens when RH7 tries to remount / read-write is
that I get the infamous CRC errors.  It does not seem to recover from this
state, or at least I did not wait long enough for it to do so.

I do not think that the RH7 kernel 2.2.16-22 uses udma2 at any time, and
that may be why it works.

Disk B does NOT work with DMA enabled with machine 1 or 2.  It works
better than disk A, but it does still fail after some time.  The
combination 1B was the most stable, and only failed once.

When using disk B, the computer has managed to recover from the CRC error
condition every time, as opposed to disk A which never recovers.  (Busy)

Using hdparm -X65 (udma1) makes disk A work with 2.4 in machine 2.  What
is the difference between udma1 and udma2?
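
For reference, the number given to hdparm -X is the ATA transfer mode code:
a base value per mode family plus the mode number.  The sketch below uses
the bases 8 (PIO), 32 (multiword DMA) and 64 (UltraDMA) as I understand them
from the hdparm man page, and 16 for single-word DMA from the ATA transfer
mode encoding; the helper name is made up.

```python
# Base value of each transfer mode family in the -X argument.
XFER_BASE = {"pio": 8, "sdma": 16, "mdma": 32, "udma": 64}


def hdparm_x_value(mode):
    """Translate a mode name such as 'udma1' into the number passed to -X."""
    family = mode.rstrip("0123456789")
    number = mode[len(family):]
    if family not in XFER_BASE or not number:
        raise ValueError("unknown transfer mode: %r" % mode)
    return XFER_BASE[family] + int(number)


if __name__ == "__main__":
    for mode in ("udma1", "udma2", "mdma2"):
        print("%s -> hdparm -X%d" % (mode, hdparm_x_value(mode)))
```

As far as I know the nominal burst rates are about 25 MB/s for udma1 and
33 MB/s for udma2, so udma1 simply runs the same protocol with a longer
cycle time and more timing margin.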

Now I'm almost completely lost.  Hope this helps.  Let me know if you want
me to try something else.

/Tobias




/dev/hde:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4 
 DMA modes: sdma0 sdma1 sdma2 mdma0 mdma1 mdma2 udma0 udma1 *udma2 
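
The geometry and capacity figures in this listing can be cross-checked with
a little arithmetic.  This is just an illustrative sketch with made-up
helper names, using the RawCHS, SectSize and LBAsects values shown above.

```python
SECTOR_SIZE = 512  # SectSize from the listing above


def chs_sectors(cylinders, heads, sectors_per_track):
    """Total number of sectors addressed by a CHS geometry."""
    return cylinders * heads * sectors_per_track


def capacity_gb(sectors, sector_size=SECTOR_SIZE):
    """Capacity in decimal gigabytes (10**9 bytes)."""
    return sectors * sector_size / 1e9


if __name__ == "__main__":
    total = chs_sectors(14896, 9, 63)   # RawCHS=14896/9/63
    print(total, total == 8446032)      # compare with LBAsects
    print("%.2f GB" % capacity_gb(total))
```

The CHS product matches LBAsects, and the result agrees with the 4.32GB in
the model string.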


00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
	Subsystem: Asustek Computer, Inc.: Unknown device 8033
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
	Latency: 0
	Region 0: Memory at e0000000 (32-bit, prefetchable) [size=128M]
	Capabilities: [a0] AGP version 2.0
		Status: RQ=31 SBA+ 64bit- FW+ Rate=x1,x2
		Command: RQ=0 SBA- AGP- 64bit- FW- Rate=none
	Capabilities: [c0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 17 a4 6b b4 4f 81 10 10 80 00 08 10 10 10 10 10
60: 03 ff 00 b0 e6 e5 e5 00 44 7c 86 0f 08 3f 00 00
70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 01
80: 0f 40 00 00 80 00 00 00 02 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 02 c0 20 00 17 02 00 1f 00 00 00 00 6e 02 14 00
b0: 61 ec 80 e5 32 33 28 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 

Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-14 Thread Tobias Ringstrom

On Sun, 14 Jan 2001, Vojtech Pavlik wrote:
> On Sat, Jan 13, 2001 at 11:36:13PM +0100, Tobias Ringstrom wrote:
>
> > I have now tried the SAMSUNG VG34323A disk with two other controllers at
> > home (Promise ATA100 an VIA vt82c686a rev 0x22, both on an ASUS A7V
> > motherboard), and there are no problems to be found with DMA enabled.
> > Streaming 10 MB/s without glitches.
>
> So the drive *did* work on the vt82c686a in the A7V board? You tested it
> both on the Promise and on the 686a? But doesn't work on the 686a in
> your other board?

Yes, on both the Promise and on the 686a.  But the device revisions are
different.  The machine that does NOT work:

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

The machine that works:

00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 22)
00:04.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

The one that works is a 1 GHz Athlon, and the other is an 800 MHz
Pentium-III.

> > no matter what cable I use.  When I get this, the machine does not recover
> > most of the time, and I have to reset or power cycle.
>
> It should be able to recover in a couple (up to 10) minutes ...

Who waits 10 minutes for a timeout?  Can it be lowered?

Expect another mail with the data you requested within a couple of hours.

/Tobias




Re: IDE DMA problem in 2.4.0

2001-01-14 Thread Tobias Ringstrom

On Fri, 12 Jan 2001, Tobias Ringstrom wrote:

> On Thu, 11 Jan 2001, Adrian Bunk wrote:
> > On Thu, 11 Jan 2001, Tobias Ringstrom wrote:
> >
> > > When copying huge files from one disk to another (hda->hdc), I get the
> > > following error (after some hundred megabytes):
> > >
> > > hdc: timeout waiting for DMA
> > > ide_dmaproc: chipset supported ide_dma_timeout func only: 14
> > > hdc: irq timeout: status=0xd1 { Busy }
> > > hdc: DMA disabled
> > > ide1: reset: success
> > >...
> > > VP_IDE: VIA vt82c596b IDE UDMA66 controller on pci0:7.1
> > >...
> > > Did I miss anything?
> >
> > Could you try if the (experimental) version 3.11 of the VIA IDE driver
> > (announced by Vojtech Pavlik in [1]) fixes your problem? Simply copy the
> > two files you find there to drivers/ide after you unpacked the kernel
> > source.
>
> Works like a charm!  I copied the full 4 GB without glitches, and it has
> not eaten my filesystem yet, either.  I will continue to stress it, and
> report any errors I find.

Hrmph...  Grrr...  No, I got the same error again, it was just so much
harder to get it.  The error is still there, I'm afraid.

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-13 Thread Tobias Ringstrom

I have now tried the SAMSUNG VG34323A disk with two other controllers at
home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
motherboard), and there are no problems to be found with DMA enabled.
Streaming 10 MB/s without glitches.

However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
this machine [1] (at work, using the VIA IDE driver version 3.11)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

or this machine [2] (at work, using the VIA IDE driver version 2.1e)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

I get exactly the following errors on both machines

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }

no matter what cable I use.  When I get this, the machine does not recover
most of the time, and I have to reset or power cycle.  This disc works
flawlessly on two other IDE controllers, so I do not think that the disk
is completely broken. It must be either these chipsets or the driver in
combination with this disk.  Note that I _can_ use another UDMA66 disk
_with_ DMA enabled on both machine [1] and [2] above without problems.
Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
tried 2.2.16-22 with DMA enabled on machine [2].
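
For readers unfamiliar with these messages, the names in braces are the bits
of the ATA status and error registers.  The sketch below decodes them; the
names that appear in the log lines quoted in this thread are taken from
there, while the remaining names and the exact bit layout are filled in from
my understanding of the ATA registers and should be treated as assumptions.

```python
# Status register bits other than Busy (0x80), highest bit first.
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Error register bits other than 0x04 (abort) and 0x80 (CRC / bad block).
OTHER_ERROR_BITS = [
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode_status(status):
    """Return the flag names for an ATA status byte.

    While Busy is set the other bits are not meaningful, so only
    "Busy" is reported in that case.
    """
    if status & 0x80:
        return ["Busy"]
    return [name for bit, name in STATUS_BITS if status & bit]


def decode_error(error):
    """Return the flag names for an ATA error byte."""
    names = []
    if error & 0x04:
        names.append("DriveStatusError")
    if error & 0x80:
        # Bit 7 reads as a CRC error when the command was also aborted.
        names.append("BadCRC" if error & 0x04 else "BadSector")
    names.extend(name for bit, name in OTHER_ERROR_BITS if error & bit)
    return names


if __name__ == "__main__":
    print("status=0x51", decode_status(0x51))
    print("error=0x84", decode_error(0x84))
```

Decoding 0x51 and 0x84 this way gives the same flag names as the two
dma_intr lines above.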

The problem I reported at first, hence the nasty subject, was a hang and a
nasty fs corruption when RH7 tried to remount the root fs read-write.  I
examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
discovered, to my great disgust, that the stupid thing disables the dmesg
output on the console very early in the script.  It is thus entirely
possible that I do get the above mentioned errors when the computer seems
to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
if my assumption is correct.

SUMMARY:  I have a disk that with DMA enabled gives me CRC errors on two
machines, but not on two other controllers, independent of the cable.
Neither of the troubled machines recovers from these errors.  Linux
2.2.16-22 from RedHat works fine with DMA enabled on machine [1]; [2] is
unknown.

I hope this makes things a lot clearer.

/Tobias




Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Tobias Ringstrom

On Sat, 13 Jan 2001, Vojtech Pavlik wrote:

> On Sat, Jan 13, 2001 at 09:12:27AM +0100, Tobias Ringstrom wrote:
> > > 2) What's in /proc/ide/via?
> >
> > It's not there since I disabled the VIA driver.
>
> Ok. Could you send me this file when you boot with fs r-o?

Ok, but this is with the wrong disc.  With the bad disc, drive0 looks
exactly like drive2, i.e. normal UDMA(33).  Sorry about that.

----------VIA BusMastering IDE Configuration----------------
Driver Version:                     2.1e
South Bridge:                       VIA vt82c686a rev 0x1b
Command register:                   0x7
Latency timer:                      32
PCI clock:                          33MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
FIFO Output Data 1/2 Clock Advance: off
BM IDE Status Register Read Retry:  on
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:           on                  on
End Sect. FIFO flush:          on                  on
Prefetch Buffer:               on                  on
Post Write Buffer:             on                  on
FIFO size:                      8                   8
Threshold Prim.:              1/2                 1/2
Bytes Per Sector:             512                 512
Both channels togth:          yes                 yes
-------------------drive0----drive1----drive2----drive3-----
BMDMA enabled:        yes       yes       yes       yes
Transfer Mode:       UDMA   DMA/PIO      UDMA   DMA/PIO
Address Setup:       30ns     120ns      30ns     120ns
Active Pulse:        90ns     330ns      90ns     330ns
Recovery Time:       30ns     270ns      30ns     270ns
Cycle Time:          30ns     600ns      60ns     600ns
Transfer Rate:   66.0MB/s   3.3MB/s  33.0MB/s   3.3MB/s
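
The last two rows of this table are related by simple arithmetic: the ATA
data bus is 16 bits wide, so one cycle moves two bytes.  A minimal sketch
(the function name is made up, and the driver's own rounding is an
assumption on my part):

```python
BYTES_PER_CYCLE = 2  # 16-bit ATA data bus


def transfer_rate_mb_s(cycle_ns):
    """Burst transfer rate in MB/s for a given cycle time in nanoseconds."""
    return BYTES_PER_CYCLE * 1000.0 / cycle_ns


if __name__ == "__main__":
    for cycle in (30, 60, 600):  # the cycle times shown in the table
        print("%3d ns -> %.1f MB/s" % (cycle, transfer_rate_mb_s(cycle)))
```

This comes out slightly above the 66.0, 33.0 and 3.3 MB/s printed by the
driver, presumably because the driver works from a rounded 33 MHz PCI clock.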

> > > 4) If you mount your filesystem read-only, does it read garbage?
> >
> > Now here's a strange part, or possibly a crusial clue.  When I booted a
> > 2.4.0 kernel (from floppy using the excellent syslinux) with "ro
> > init=/bin/sh", I could access the filesystem just fine.  I could even
> > remount the root filesystem rw, and there were no problems.  But I did not
> > write anything to the disk, since I was convinced that the problem was
> > gone (this was the second try).  After this I rebooted with
> > ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
> > booted up the RH7 2.2.16 kernel, and fsck was run with no errors.
>
> So far no problem. Rebooting with c-a-d with fs r-o is OK.
>
> > Now I
> > thought all was well, rebooted from floppy again, but without the init=
> > part, and poof, it hung.
>
> Where? It could be a different reason than IDE setup ...

Don't think so.  It happens on the "Remounting root read-write".

> > More interesting may be that I had to turn the computer off and on again
> > to get BIOS to find the hard drive. Repeated long reset button presses
> > did not help.  It is possible that it hung during BIOS hd detection - I
> > wish I could remember.
>
> I fear this isn't much of a clue, sorry.

The clue is that the VIA driver messed up either the chipset or the drive
quite a lot, but maybe that is already obvious.

> > I suspect that I could have hung the drive with init=/bin/sh if I would
> > have done some reading and writing to the device, besides ls.
>
> Please try it. Best mke2fs your swap partition and try reading & writing
> to that. You can mkswap it back after you finish.

After more testing, I think I have isolated the problem to this disk, or
at least this disk with this controller.  With another (UDMA66) disk,
there are no problems.  Details at the end.

> > I think I can spend some more time today trying it out some more.
>
> Please do. 'lspci -vvxxx' data for the case without a driver, with 2.4.0
> driver and with 3.11 driver would help me find the problem.

Ok, I'll do that later.

> Make sure you *don't* have any hdparm -d1 or hdparm -X66 or similar
> stuff in your init scripts.

I'm sure I don't.  This happens with a clean fresh RH7 installation.

> > I will
> > also try your 3.11 driver, which seems to be an enormous cleanup.
>
> the 2.1e driver is an enormous cleanup of the original driver from the
> 2.2 kernels. the 3.11 is an enormous cleanup of 2.1e, yes.

I have not had a chance to try the 3.11 driver yet.

Now for the new details.  When writing to the disk with DMA enabled, I get
the following errors, in two different machines.  Both are VIA IDE
machines.  It is NOT a cable error.  I have tried with several cables.
Possibly a connector or soldering problem.  I'll try the disk in more
machines and get back with more info.  I have to run now.

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }
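
For readers unfamiliar with these log lines, the hex values are bit masks
of the ATA status and error registers.  A small sketch that decodes them,
assuming the standard ATA bit positions; the error table is deliberately
limited to the two bits that occur here, and the labels are copied from
the log output rather than from any particular driver source:

```python
# ATA status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Only the two error-register bits that occur in the log above.
ERROR_BITS = [
    (0x04, "DriveStatusError"),  # ATA "ABRT": command aborted
    (0x80, "BadCRC"),            # ATA "ICRC": CRC error on an Ultra DMA transfer
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    names = [name for mask, name in table if value & mask]
    known = sum(mask for mask, _ in table)
    if value & ~known:
        # Report bits that the table does not cover instead of hiding them.
        names.append(f"unknown(0x{value & ~known:02x})")
    return names


print("status=0x51", decode(0x51, STATUS_BITS))
print("error=0x84", decode(0x84, ERROR_BITS))
```

The BadCRC bit reports a checksum mismatch on data crossing the IDE
interface during an Ultra DMA transfer, which is why cables, connectors
and controller timing are the usual suspects rather than the platters.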

/Tobias


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: 2.4 ate my filesystem on rw-mount

2001-01-13 Thread Tobias Ringstrom

On Fri, 12 Jan 2001, Vojtech Pavlik wrote:
> Wow. Ok, I'm maintaining the 2.4.0 VIA driver, so I'd like to know more
> about this:
>
> 1) What's the ISA bridge revision?

00:00.0 Host bridge: VIA Technologies, Inc. VT8501 (rev 02)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8501
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)
00:07.2 USB Controller: VIA Technologies, Inc. VT82C586B USB (rev 0e)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 20)
00:07.5 Multimedia audio controller: VIA Technologies, Inc. VT82C686 [Apollo Super 
AC97/Audio] (rev 21)
00:0a.0 Ethernet controller: VIA Technologies, Inc. VT86C100A [Rhine 10/100] (rev 06)
01:00.0 VGA compatible controller: Trident Microsystems CyberBlade/i7 (rev 5b)

> 2) What's in /proc/ide/via?

It's not there since I disabled the VIA driver.

> 3) What says hdparm -i on your devices?

/dev/hda:

 Model=SAMSUNG VG34323A (4.32GB), FwRev=GQ200, SerialNo=dW1921060033c8
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
 RawCHS=14896/9/63, TrkSize=32256, SectSize=512, ECCbytes=21
 BuffType=DualPortCache, BuffSize=496kB, MaxMultSect=16, MultSect=off
 CurCHS=14896/9/63, CurSects=-531627904, LBA=yes, LBAsects=8446032
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: sdma0 sdma1 sdma2 *mdma0 mdma1 mdma2 udma0 udma1 *udma2
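
The negative CurSects value above looks like a display artifact rather
than a real sector count.  A small sketch, assuming the field is a 32-bit
value whose two 16-bit words were combined in the wrong order and then
printed as a signed number; this is an inference from the numbers alone,
not something confirmed in the thread, and the function name is made up:

```python
def swap_words_signed32(n):
    """Swap the 16-bit halves of a 32-bit value and read it as signed."""
    swapped = ((n & 0xFFFF) << 16) | ((n >> 16) & 0xFFFF)
    return swapped - (1 << 32) if swapped & 0x80000000 else swapped


cylinders, heads, sectors = 14896, 9, 63    # CurCHS from the output above
chs_total = cylinders * heads * sectors
print(chs_total)                        # 8446032, same as LBAsects
print(swap_words_signed32(chs_total))   # -531627904, same as CurSects
```

If that reading is right, the drive geometry itself is consistent and the
odd number is unrelated to the DMA problem.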

> 4) If you mount your filesystem read-only, does it read garbage?

Now here's a strange part, or possibly a crucial clue.  When I booted a
2.4.0 kernel (from floppy using the excellent syslinux) with "ro
init=/bin/sh", I could access the filesystem just fine.  I could even
remount the root filesystem rw, and there were no problems.  But I did not
write anything to the disk, since I was convinced that the problem was
gone (this was the second try).  After this I rebooted with
ctrl-alt-delete, forgetting how bad an idea that is with init=/bin/sh,
booted up the RH7 2.2.16 kernel, and fsck was run with no errors.  Now I
thought all was well, rebooted from floppy again, but without the init=
part, and poof, it hung.

More interesting may be that I had to turn the computer off and on again
to get BIOS to find the hard drive.  Repeated long reset button presses
did not help.  It is possible that it hung during BIOS hd detection - I
wish I could remember.

I suspect that I could have hung the drive with init=/bin/sh if I would
have done some reading and writing to the device, besides ls.

I think I can spend some more time today trying it out some more.  I will
also try your 3.11 driver, which seems to be an enormous cleanup.  Btw, do
you have a home page for the VIA driver?  A CVS perhaps?  If not, please
consider using sourceforge or something similar.

/Tobias




Re: 2.4 ate my filesystem on rw-mount, getting closer

2001-01-13 Thread Tobias Ringstrom

I have now tried the SAMSUNG VG34323A disk with two other controllers at
home (Promise ATA100 and VIA vt82c686a rev 0x22, both on an ASUS A7V
motherboard), and there are no problems to be found with DMA enabled.
Streaming 10 MB/s without glitches.

However, writing to the SAMSUNG VG34323A disk with DMA enabled on either
this machine [1] (at work, using the VIA IDE driver version 3.11)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C596 ISA [Apollo PRO] (rev 23)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 10)

or this machine [2] (at work, using the VIA IDE driver version 2.1e)

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super] (rev 1b)
00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

I get exactly the following errors on both machines

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x84 { DriveStatusError BadCRC }

no matter what cable I use.  When I get this, the machine does not recover
most of the time, and I have to reset or power cycle.  This disc works
flawlessly on two other IDE controllers, so I do not think that the disk
is completely broken. It must be either these chipsets or the driver in
combination with this disk.  Note that I _can_ use another UDMA66 disk
_with_ DMA enabled on both machine [1] and [2] above without problems.
Also, 2.2.16-22 seems to work with DMA enabled on machine [1].  I have not
tried 2.2.16-22 with DMA enabled on machine [2].

The problem I reported at first, hence the nasty subject, was a hang and a
nasty fs corruption when RH7 tried to remount the root fs read-write.  I
examined the RH7 init scripts, or more precisely /etc/rc.sysinit, and
discovered, to my great disgust, that the stupid thing disables the dmesg
output on the console very early in the script.  It is thus entirely
possible that I do get the above mentioned errors when the computer seems
to hang, and my fs gets corrupted.  I will fix the script tomorrow to see
if my assumption is correct.

SUMMARY:  I have a disk that with DMA enabled gives me CRC errors on two
machines, but not on two others, independent of the cable.  Neither of the
troublesome machines recovers from these errors.  Linux 2.2.16-22 from
RedHat works fine with DMA enabled on machine [1]; [2] is unknown.

I hope this makes things a lot clearer.

/Tobias




2.4 ate my filesystem on rw-mount

2001-01-12 Thread Tobias Ringstrom

I've never seen anything like it before, which I'm happy for.  The system
had been running a standard RedHat 7 kernel for days without any problems,
but who wants to run a 2.2 kernel?  I compiled 2.4.0 for it, rebooted, and
blam!  The RedHat init scripts got to the "remounting root read-write"
point, and just froze solid.

Rebooting into RH7 failed, because inittab could not be found.  In fact
the filesystem was completely messed up, with /dev empty, lots of device
nodes in /etc, and files missing all over the place.  I had to reinstall
RH7 from scratch.

I do not understand how this could happen during a remounting root rw.
Is the filesystem really that unstable?

Am I right in suspecting DMA, which was enabled at the time?  Any other
ideas?  Is it a known problem?

This is on a 450 MHz AMD-K6 with the following IDE controller:

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586 IDE [Apollo] (rev 06)

[I know this is not a very good trouble report, but it will have to do for
the time being.  I hope to do more testing at a later time.]

/Tobias

PS. This is _not_ the same system that I reported IDE busy errors for.




Re: IDE DMA problem in 2.4.0

2001-01-12 Thread Tobias Ringstrom

On Thu, 11 Jan 2001, Adrian Bunk wrote:
> On Thu, 11 Jan 2001, Tobias Ringstrom wrote:
>
> > When copying huge files from one disk to another (hda->hdc), I get the
> > following error (after some hundred megabytes):
> >
> > hdc: timeout waiting for DMA
> > ide_dmaproc: chipset supported ide_dma_timeout func only: 14
> > hdc: irq timeout: status=0xd1 { Busy }
> > hdc: DMA disabled
> > ide1: reset: success
> >...
> > VP_IDE: VIA vt82c596b IDE UDMA66 controller on pci0:7.1
> >...
> > Did I miss anything?
>
> Could you try if the (experimental) version 3.11 of the VIA IDE driver
> (announced by Vojtech Pavlik in [1]) fixes your problem? Simply copy the
> two files you find there to drivers/ide after you unpacked the kernel
> source.

Works like a charm!  I copied the full 4 GB without glitches, and it has
not eaten my filesystem yet, either.  I will continue to stress it, and
report any errors I find.

Vojtech, can we expect to see this driver in 2.4 anytime soon?

/Tobias




IDE DMA problem in 2.4.0

2001-01-11 Thread Tobias Ringstrom

When copying huge files from one disk to another (hda->hdc), I get the
following error (after some hundred megabytes):

hdc: timeout waiting for DMA
ide_dmaproc: chipset supported ide_dma_timeout func only: 14
hdc: irq timeout: status=0xd1 { Busy }
hdc: DMA disabled
ide1: reset: success

I got this using dd with a block size of 32kB, reading a large file on
hda, writing directly to hdc1.  I tried with another disk as hdc
(a Samsung), and I have tried two different cables.  Still no go.  Well,
it does work, of course, but much slower since DMA has been disabled.

I have been unable to reproduce this error using
dd bs=32k if=/dev/zero of=/dev/hdc1
or
dd bs=32k if=/dev/hdc1 of=/dev/null

Everything works fine in 2.2.16-22 from RedHat 7 (with DMA enabled using
hdparm).

Here is a relevant part of the startup log (I hope):

Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 39
VP_IDE: chipset revision 16
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt82c596b IDE UDMA66 controller on pci0:7.1
ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:DMA, hdd:pio
hda: SAMSUNG SV2044D, ATA DISK drive
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
hdc: ST38421A, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: 39862368 sectors (20410 MB) w/472KiB Cache, CHS=2481/255/63, UDMA(66)
hdc: 16498944 sectors (8447 MB) w/256KiB Cache, CHS=16368/16/63, UDMA(33)

Did I miss anything?

/Tobias




Is ECN useful yet?

2001-01-11 Thread Tobias Ringstrom

Does anyone know if ECN is supported by the Internet backbone routers yet,
i.e. will I gain anything by enabling ECN in my Linux boxes at this point?
(except pushing this excellent technology, of course).

/Tobias




Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-11 Thread Tobias Ringstrom

[regarding the buffer cache hash size and bad performance on machines
with little memory...  (<32MB)]

On Tue, 9 Jan 2001, Anton Blanchard wrote:
> > Where is the size defined, and is it easy to modify?
>
> Look in fs/buffer.c:buffer_init()

I experimented some, and increasing the buffer cache hash to the 2.2
levels helped a lot, especially for 16 MB memory.  The difference is huge,
64 kB in 2.2 vs 1 kB in 2.4 for a 32 MB memory machine.
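
To give a feel for why the table size matters, here is a rough sketch of
the average chain length in a chained hash table.  It assumes one 4-byte
pointer per bucket (32-bit x86) and a purely hypothetical load of 16 MB
worth of 1 kB buffers; neither figure comes from the kernel source, and
the function name is made up for illustration:

```python
def avg_chain_length(table_bytes, cached_buffers, pointer_bytes=4):
    """Average number of entries per bucket in a chained hash table."""
    buckets = table_bytes // pointer_bytes
    return cached_buffers / buckets


# Hypothetical load: 16 MB worth of 1 kB buffers held in the cache.
cached = (16 * 1024 * 1024) // 1024

for label, table_bytes in [("2.2 (64 kB)", 64 * 1024), ("2.4 (1 kB)", 1024)]:
    print(label, avg_chain_length(table_bytes, cached))
```

With these numbers the 64 kB table averages one buffer per bucket, while
the 1 kB table averages 64, so each lookup walks a much longer chain.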

> I haven't done any testing on slow hardware and the high end stuff is
> definitely performing better in 2.4, but I agree we shouldn't forget
> about the slower stuff.

Being able to tune the machine for both high and low end systems is
necessary, and if Linux can tune itself, that's of course the best.

> Narrowing down where the problem is would help. My guess is it is a TCP
> problem, can you check if it is performing worse in your case? (eg ftp
> something against 2.2 and 2.4)

Nope, TCP performance seems more or less unchanged.  I will keep
investigating, and get back when I have more info.

/Tobias





Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-04 Thread Tobias Ringstrom

On Fri, 5 Jan 2001, Anton Blanchard wrote:
> 
> > 1) Why does the hdbench numbers go down for 2.4 (only) when 32 MB is used?
> >    I fail to see how that matters, especially for the '-T' test.
> 
> When I did some tests long ago, hdparm was hitting the buffer cache hash
> table pretty hard in 2.4 compared to 2.2 because it is now smaller. However
> as davem pointed out, most things don't do such things so resizing the hash
> table just for this is a waste.

Where is the size defined, and is it easy to modify?

> Since the hash is based on RAM, it may end up being big enough on the 128M
> machine.

Maybe.  I have been experimenting some more, and I see that the less
memory I have, the more CPU kswapd takes (more than 10% in some
cases) when I am doing a continuous read from a block device.

I noticed that /proc/sys/vm/freepages is not writable any more.  Is there
any reason for this?

> > The reason for doing the benchmarks in the first place is that my 32MB P90
> > at home really does perform noticeably worse with samba using 2.4 kernels
> > than using 2.2 kernels, and that bugs me.  I have no hard numbers for that
> > machine (yet).  If they will show anything extra, I will post them here.  
> 
> What exactly are you seeing?

I first noticed the slowdown because the load meter LEDs on my ethernet
hub did not go as high with 2.4 as they did with 2.2.  A simple test,
transferring a large file using smbclient, did indeed show a decrease in
performance, both for a localhost and a remote file transfer.  This in
spite of the tcp transfer rate being (much) higher in 2.4 than in 2.2.

> > Btw, has anyone else noticed samba slowdowns when going from 2.2 to 2.4?
> 
> I am seeing good results with 2.4 + samba 2.2 using tdb spinlocks.

Hmm...  I'm still using samba 2.0.7.  I'll try 2.2 to see if it
helps.  What are tdb spinlocks?

Have you actually compared the same setup with 2.2 and 2.4 kernels and a
single client transferring a large file, preferably from a slow server
with little memory?  Most samba servers that people benchmark are fast
computers with lots of memory.  So far, every major kernel upgrade has
given me a performance boost, even for slow computers, and I would hate to
see that trend break for 2.4...

/Tobias




Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-04 Thread Tobias Ringstrom

On Fri, 5 Jan 2001, Anton Blanchard wrote:
 
  1) Why does the hdbench numbers go down for 2.4 (only) when 32 MB is used?
 I fail to see how that matters, especially for the '-T' test.
 
 When I did some tests long ago, hdparm was hitting the buffer cache hash
 table pretty hard in 2.4 compared to 2.2 because it is now smaller. However
 as davem pointed out, most things don't do such things so resizing the hash
 table just for this is a waste.

Where is the size defined, and is it easy to modify?

 Since the hash is based on RAM, it may end up being big enough on the 128M
 machine.

Maybe.  I have been experimenting some more, and I see that the less
memory i have, kswapd takes more and more CPU (more than 10% for some
cases) when I am doing a continuous read from a block device.

I noticed that /proc/sys/vm/freepages is not writable any more.  Is there
any reason for this?

> > The reason for doing the benchmarks in the first place is that my 32MB P90
> > at home really does perform noticeably worse with samba using 2.4 kernels
> > than using 2.2 kernels, and that bugs me.  I have no hard numbers for that
> > machine (yet).  If they show anything extra, I will post them here.
>
> What exactly are you seeing?

I first noticed the slowdown because the load meter LEDs on my ethernet
hub did not go as high with 2.4 as they did with 2.2.  A simple test,
transferring a large file using smbclient, did indeed show a decrease in
performance, both for a localhost and a remote file transfer.  This in
spite of the tcp transfer rate being (much) higher in 2.4 than in 2.2.

> > Btw, has anyone else noticed samba slowdowns when going from 2.2 to 2.4?
>
> I am seeing good results with 2.4 + samba 2.2 using tdb spinlocks.

Hmm...  I'm still using samba 2.0.7.  I'll try 2.2 to see if it
helps.  What are tdb spinlocks?

Have you actually compared the same setup with 2.2 and 2.4 kernels and a
single client transferring a large file, preferably from a slow server
with little memory?  Most samba servers that people benchmark are fast
computers with lots of memory.  So far, every major kernel upgrade has
given me a performance boost, even for slow computers, and I would hate to
see that trend break for 2.4...

/Tobias




Re: Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-03 Thread Tobias Ringstrom

On Wed, 3 Jan 2001, Daniel Phillips wrote:

> Tobias Ringstrom wrote:
> > 3) The 2.2 kernels outperform the 2.4 kernels for few clients (see
> >    especially the "dbench 1" numbers for the PII-128M).  Oops!
> 
> I noticed that too.  Furthermore I noticed that the results of the more
> heavily loaded tests on the whole 2.4.0 series tend to be highly
> variable (usually worse) if you started by moving the whole disk through
> cache, e.g., fsck on a damaged filesystem.

Yes, they do seem to vary a lot.
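
As a rough illustration of how much they vary, here is a minimal sketch (not
part of the original benchmark script) that summarizes the five "dbench 1"
runs for the 133 MHz Pentium with 80M, using the figures from the results
table in the original post.  It assumes the leading number on each dbench
row is the client count; the choice of mean, sample standard deviation and
coefficient of variation as summary figures is an arbitrary one.

```python
from statistics import mean, stdev

# "dbench 1" throughput in MB/s, five runs each, 133 MHz Pentium with 80M.
# Values copied from the results table; the row layout is an assumption.
runs = {
    "2.2.18": [9.98084, 9.19558, 10.1023, 9.78034, 10.7593],
    "2.4.0-prerelease": [4.23029, 3.89185, 10.27, 14.2546, 14.2719],
}


def summarize(samples):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    m = mean(samples)
    s = stdev(samples)
    return m, s, s / m


for kernel, samples in runs.items():
    m, s, cv = summarize(samples)
    print(f"{kernel:18s} mean {m:6.2f} MB/s  stdev {s:5.2f}  cv {cv:6.1%}")
```

The two means come out close to each other, while the spread for
2.4.0-prerelease is several times larger, mostly because of the two slow
initial runs.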

> It would be great if you could track the ongoing progress - you could go
> so far as to automatically download the latest patch and rerun the
> tests.  (We have a script like that here to keep our lxr/cvs tree
> current.)  And yes, it gets more important to consider some of the other
> usage patterns so we don't end up with self-fulfilling prophecies.

I was thinking about an automatic test, build, modify lilo, reboot cycle
for a while, but I don't think it's worth it.  Benchmarking is hard, and
making it automatic is probably even harder, not to mention trying to
interpret the numbers...  Probably "Samba feels slower" works quite well.
:-)

But then it is even unclear to me what the vm people are trying to
optimize for.  Probably a system that "feels good", which according to
myself above, may actually be a good criterion, although a bit imprecise.
Oh, well...

> For benchmarking it would be really nice to have a way of emptying
> cache, beyond just syncing.  I took a look at that last week and
> unfortunately it's not trivial.  The things that have to be touched are
> optimized for the steady-state running case and tend to take their
> marching orders from global variables and embedded heuristics that you
> don't want to mess with.  Maybe I'm just looking at this problem the
> wrong way because the shortest piece of code I can imagine for doing
> this would be 1-200 lines long and would replicate a lot of the
> functionality of page_launder and flush_dirty_pages, in other words it
> would be a pain to maintain.

How about allocating lots of memory and locking it in memory?  I have not
looked at the source, but it seems (using strace) that hdbench uses shm to
do just that.  I'll dig into the hdbench code and try to make a program
that empties the cache.
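
A minimal sketch of the simpler half of that idea, assuming that dirtying
a large anonymous mapping is enough to put pressure on the cache.  It does
not lock the memory and does not use shm, so with swap available the kernel
may swap these pages out instead of dropping cached data.  The function name
and the size used below are only illustrative.

```python
import mmap


def push_out_cache(nbytes):
    """Map nbytes of anonymous memory and dirty every page.

    Writing to each page forces the kernel to back it with real memory,
    which under memory pressure should squeeze cached file data out.
    Returns the number of pages touched.
    """
    page = mmap.PAGESIZE
    touched = 0
    with mmap.mmap(-1, nbytes) as buf:
        for offset in range(0, nbytes, page):
            buf[offset] = 1  # write one byte per page
            touched += 1
    return touched
```

Calling it with something close to the amount of RAM in the machine, for
example push_out_cache(64 * 1024 * 1024), would be the obvious experiment.
Whether that really empties the cache would have to be checked by comparing
the cache figures in /proc/meminfo before and after.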

/Tobias




Benchmarking 2.2 and 2.4 using hdparm and dbench 1.1

2001-01-03 Thread Tobias Ringstrom

I have been torturing a couple of boxes and came up with these benchmark
results.  I have also enclosed the script used to do the benchmark, and I
am well aware that this is a very specialized benchmark, testing only
limited parts of the kernel, and so on, BUT I am convinced that I'm seeing
some interesting things in the results anyway.  :-)

There are numbers for two different machines, and each test has been run
five times in succession, with a sync and a sleep in between.  The kernels
were optimized for Pentium Classic, and only contained IDE support (i.e.
no SCSI, networking, etc.).  The only processes running besides the
benchmark script were kernel daemons and gettys.  Both machines had
plenty of swap space available.
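
The enclosed script is not reproduced here, so the following is only a
sketch of the procedure just described: run a command a number of times
with a sync and a sleep between runs.  The function name, the ten second
pause and the way output is collected are assumptions, not taken from the
actual script.

```python
import os
import subprocess
import time


def run_repeated(cmd, runs=5, pause=10.0):
    """Run cmd `runs` times, with a sync and a sleep between runs.

    Returns the captured stdout of each run, in order.
    """
    outputs = []
    for i in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        outputs.append(result.stdout)
        if i < runs - 1:
            os.sync()          # flush dirty buffers to disk
            time.sleep(pause)  # give the system time to settle
    return outputs
```

Something like run_repeated(["hdparm", "-T", "/dev/hda"]) or
run_repeated(["dbench", "10"]) would then correspond to one row of the
tables below; the device name is a guess, and the MB/s figure still has to
be picked out of each output by hand.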

There are some numbers that I think are interesting:

1) Why do the hdbench numbers go down for 2.4 (only) when 32 MB is used?
   I fail to see how that matters, especially for the '-T' test.

2) The 2.4 kernels outperform the 2.2 kernels for many clients (see
   the "dbench 10" numbers).  This matters a lot.  Great!

3) The 2.2 kernels outperform the 2.4 kernels for few clients (see
   especially the "dbench 1" numbers for the PII-128M).  Oops!

The reason for doing the benchmarks in the first place is that my 32MB P90
at home really does perform noticeably worse with samba using 2.4 kernels
than using 2.2 kernels, and that bugs me.  I have no hard numbers for that
machine (yet).  If they show anything extra, I will post them here.
Btw, has anyone else noticed samba slowdowns when going from 2.2 to 2.4?

Anyway, any help explaining/fixing points 1 and 3 would be highly
appreciated!

/Tobias


[All numbers in MB/s, bigger is better.]
==
133 MHz Pentium, DMA, 80M

2.2.16-22:
hdparm -T   37.43   37.43   37.32   37.32   37.43
hdparm -t   6.08    6.14    6.15    6.10    6.08
dbench 1    10.4302 10.0796 10.2559 10.2258 13.5464
dbench 5    7.97753 8.13792 7.66108 7.8526  7.44329
dbench 10   5.78309 5.58762 5.76388 5.54761 5.94415

2.2.18:
hdparm -T   37.87   37.87   37.65   37.54   37.65
hdparm -t   6.04    6.04    6.10    6.11    6.07
dbench 1    9.98084 9.19558 10.1023 9.78034 10.7593
dbench 5    7.5335  7.85761 7.90051 8.19119 7.78873
dbench 10   5.98423 5.99556 5.84676 6.02366 5.87104

2.2.19pre3:
hdparm -T   37.98   37.43   37.21   37.43   37.87
hdparm -t   6.08    6.08    6.06    6.11    6.09
dbench 1    10.2117 11.2996 12.017  11.3003 12.1677
dbench 5    6.51203 6.80555 6.46566 6.66772 6.56693
dbench 10   5.55781 5.68997 5.43493 5.58688 5.32528

2.4.0-prerelease:
hdparm -T   38.21   38.21   38.32   38.21   38.21
hdparm -t   6.07    6.24    6.14    6.30    6.26
dbench 1    4.23029 3.89185 10.27   14.2546 14.2719
dbench 5    8.63648 9.21302 11.0506 9.64396 10.8724
dbench 10   7.12402 6.92772 8.02011 7.65119 8.12557

==
133 MHz Pentium, DMA, 32M

2.2.16-22:
hdparm -T   37.10   37.76   37.98   37.54   37.32
hdparm -t   6.01    6.04    6.07    5.93    6.06
dbench 1    10.0048 8.59813 8.49598 9.59796 9.04642
dbench 5    3.99638 4.59673 4.08813 3.90389 4.30821
dbench 10   2.6141  2.69585 2.63201 2.59543 2.61565

2.2.18:
hdparm -T   37.21   37.98   37.98   37.98   37.76
hdparm -t   5.94    5.98    6.03    6.05    6.03
dbench 1    9.66983 10.3501 9.77682 9.44597 10.0551
dbench 5    5.32253 5.31517 5.9542  5.83028 5.22383
dbench 10   2.86753 2.85903 2.83746 2.93299 2.84175

2.2.19pre3:
hdparm -T   36.99   38.21   37.98   37.87   37.10
hdparm -t   6.08    6.15    6.14    6.14    6.14
dbench 1    8.64248 7.73716 7.42123 7.6462  9.58718
dbench 5    4.57973 4.5689  4.32196 4.50044 4.68734
dbench 10   2.56748 2.55824 2.56042 2.54797 2.59391

2.4.0-prerelease:
hdparm -T   37.32   37.21   37.32   37.32   37.21
hdparm -t   5.87    5.70    5.34    5.09    5.07
dbench 1    4.41358 7.43094 8.76442 8.64346 9.45806
dbench 5    4.42467 4.22284 4.89167 4.48322 5.08206
dbench 10   4.19795 5.6161  4.09045 4.09799 4.55435




==
400 MHz Mobile Pentium II, UDMA(33), mem=128M

2.2.18:
hdparm -T   82.05   82.05   82.05   82.05   82.05
hdparm -t   9.85    9.82    10.09   10.08   9.83
dbench 1    45.8926 48.1175 47.7244 47.7418 47.457
dbench 5    18.4021 20.7948 22.3345 20.4869 14.2627
dbench 10   13.1219 10.0275 10.8189 13.7482 9.51982

2.2.19pre3:
hdparm -T   81.53   82.05   82.05   82.05   82.05
hdparm -t   9.88    9.89    9.86    9.88    9.85
dbench 1    38.752  43.6587 43.1486 41.004  42.4516
dbench 5    26.6624 27.4806 26.1421 28.9142 24.6634
dbench 10   14.9424 14.6862 14.2302 14.6954 14.8282

2.4.0-prerelease:
hdparm -T   82.05   83.12   83.12   83.12   83.12
hdparm -t   10.08   10.08   
