Re: why this 1ms delay in mdio_read? (cont'd from "are ioctl calls supposed to take this long?")

2001-07-06 Thread Donald Becker

On Fri, 6 Jul 2001, Chris Friesen wrote:

> Subject: why this 1ms delay in mdio_read?  (cont'd from "are ioctl calls
> supposed to take this long?")
> 
> The beginning of mdio_read() in tulip.c goes like this:
> 
> static int mdio_read(struct device *dev, int phy_id, int location)
...
>   mdelay(1); /* One ms delay... */

Ackkk!  What driver version?
And who put this bogus delay in the code?

Putting arbitrary delays in drivers is usually a sign that someone
didn't understand how to fix a bug and is just trying to wait it out.

> The chip I'm using is the DEC 21143, which means that we skip over the two
> conditional blocks, so the first thing that happens when we call this is to
> wait around doing nothing for a millisecond.  Is there some subtle
> reason why we would want to wait around for a millisecond before doing
> anything? 

Nope.  None at all.
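
For reference, a minimal sketch of what an MII management read has to put on the wire, assuming the standard IEEE 802.3 clause 22 frame layout (32 preamble ones, then start, opcode, PHY address, register address). The function name and constants are illustrative and not taken from tulip.c. Every bit is clocked out by the host, so nothing in the frame format itself calls for a fixed delay before the first bit.

```c
#include <stdint.h>

/* Clause 22 field values; the names are illustrative only. */
enum { MDIO_START = 0x1, MDIO_OP_READ = 0x2 };

/*
 * Build the 14 bits the host drives after the preamble for a read:
 * ST (2 bits), OP (2 bits), PHY address (5 bits), register address (5 bits).
 * The PHY then drives the turnaround and the 16 data bits.
 */
uint16_t mdio_read_header(unsigned phy_id, unsigned location)
{
    return (uint16_t)((MDIO_START << 12) | (MDIO_OP_READ << 10) |
                      ((phy_id & 0x1f) << 5) | (location & 0x1f));
}
```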

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: RTL8139 difficulties in 2.2, not in 2.4

2001-05-19 Thread Donald Becker

On Sat, 19 May 2001, Matthias Andree wrote:

> I'm having difficulties with an RTL8139 with Linux 2.2.19 (both drivers),
> but not with Linux 2.4.4's 8139too driver. The card is an Allied Telesyn
> AT-2500TX, the chip is reported as 8139C/rev. 0x10. The card shares its
> IRQ 9 with an nVidia Riva TNT 128 [NV04], rev. 4.
...
> eth1: Transmit timeout, status 0c 0005 media 18.
> eth1: Tx queue start entry 4  dirty entry 0.
> eth1: RTL8139 Interrupt line blocked, status 5.
> eth1: RTL8139 Interrupt line blocked, status 5.
> eth1: RTL8139 Interrupt line blocked, status 4.
> eth1: RTL8139 Interrupt line blocked, status 4.
> (continues every minute with status 4 if no traffic on interface)

The card is reporting that the interrupt line has been asserted (Tx
done), but the interrupt handler hasn't been called.

You can verify this by watching the interrupt count in /proc/interrupts.

Try booting the kernel with "noapic", which we recommend as the safe
default setting.
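
A rough sketch of that check, assuming the usual /proc/interrupts line layout (IRQ number, colon, one count per CPU, then the controller and device names). The helper name is made up for illustration. It sums the per-CPU counts on one line; sample it twice while the driver is reporting "Interrupt line blocked", and if the total for IRQ 9 has not moved, the handler is not being called.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Return the summed per-CPU count for `irq` from one line of
 * /proc/interrupts, or -1 if the line belongs to a different IRQ
 * (or is a header / non-numeric line such as "NMI:").
 */
long irq_count_from_line(const char *line, int irq)
{
    char *end;
    long n = strtol(line, &end, 10);

    if (end == line || *end != ':' || n != irq)
        return -1;

    const char *p = end + 1;
    long total = 0;

    for (;;) {
        long v = strtol(p, &end, 10);
        if (end == p)
            break;              /* no digits: reached the controller name */
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;              /* token like "8259A" is a name, not a count */
        total += v;
        p = end;
    }
    return total;
}
```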


Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: thank's for answering

2001-04-17 Thread Donald Becker

On Tue, 17 Apr 2001, battata chafik wrote:

> I have a 3c595TX card, and when I plug it into my hub it comes up at
> 10baseT.  I tried loading the new modules and nothing changed.  I have a
> 2.2.16 kernel and a 2.4.1 kernel, and it's the same in both cases.
> I have two other computers using 100baseT cards from Realtek, and
> they appear at 100baseT on the hub, and the rate of any file transfer
> is up to 10 mb/s between the two other computers.  So is there any
> upgrade to do for the BIOS of the NIC, or is this normal?  "I don't
> think so, but why not."

First problem: the EEPROM is set to forced full duplex.
This is almost certainly wrong for your hub.

The speed problem is likely because you have a dual-speed repeater.  The
595's speed autosensing must be done by the driver.  In order to not screw
up 10baseT repeaters with 100baseTx link beat, the driver first sets the
speed to 10baseT and checks for link beat.  If it finds 10baseT link
beat, it never tries 100baseTx.
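
The selection order described above can be sketched as follows. This is a simplified model, not the driver's code: the enum, the function name, and the two link-beat flags are assumptions made for illustration. On a dual-speed repeater both kinds of link beat are available, so the 10baseT check succeeds first and 100baseTx is never tried unless the speed is forced.

```c
#include <stdbool.h>

enum media { MEDIA_NONE, MEDIA_10BASET, MEDIA_100BASETX };

/*
 * Model of the autosense order: a forced setting wins outright;
 * otherwise 10baseT is probed first and accepted if link beat is seen.
 * forced == MEDIA_NONE means "autosense".
 */
enum media pick_media(bool beat_10, bool beat_100, enum media forced)
{
    if (forced != MEDIA_NONE)
        return forced;
    if (beat_10)
        return MEDIA_10BASET;   /* 100baseTx is never probed */
    if (beat_100)
        return MEDIA_100BASETX;
    return MEDIA_NONE;
}
```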

The solution is to set the speed to 100baseTx using a driver option.
Read
   http://www.scyld.com/network/vortex.html

The 3c595 is a very old card.
You will get better performance from any modern card.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




[PATCH 2.4.2] ne.c: Add to bad_clone_list

2001-04-09 Thread Donald Becker


>From: Jan-Benedict Glaw ([EMAIL PROTECTED])
>Date: Mon Apr 09 2001 - 12:07:16 EDT 

>This allows me to use some (old and broken) AT/LANTIC boards. 

Please re-test this patch.

Boards based on the DP83905 AT/LANTIC chip should never need to be added to
the bad clone list.  The bad clone list should now be almost read-only,
since it's very unlikely that anyone is making new ISA NE2000 boards and
not following the design rules.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: PROBLEM: Network hanging - Tulip driver with Netgear (Lite-On)

2001-03-02 Thread Donald Becker

On Fri, 2 Mar 2001, Manfred Spraul wrote:
> Jeff Garzik wrote:
> > Manfred Spraul wrote:
> > > Could you double check the code in tulip_core.c, around line 1450?
> > > IMHO it's bogus.
> > >
> > > 1) if the network card contains multiple mii's, then the advertised
> > > value of all mii's is changed to the advertised value of the first mii.
...
> > If you have a single controller with multiple MII phys...  how does one
> > select the phy of choice (for tulip, in the absence of SROM media
> > table...)?
> 
> I'd choose the first one with a link partner.

Well, yes, but what is "first"?

Are there any Tulip cards (besides the Comet-2 w/HPNA) that have multiple
MII transceivers?

The Comet2 is a special case, since only one transceiver is powered and
visible at a time.  Polling the other transceiver switches off the
first.
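
To make the "what is first?" question concrete, here is a hedged sketch in which the scan order is an explicit input. It assumes the standard MII basic status register layout (register 1, link status in bit 2) and uses link status as a stand-in for "has a link partner"; the function and parameter names are hypothetical. It does not model the Comet2 case, where polling one transceiver switches off the other.

```c
#include <stdint.h>

#define BMSR_LINK_STATUS 0x0004  /* MII register 1, bit 2 */

/*
 * scan_order[] lists PHY addresses in the order the caller considers
 * "first"; bmsr[addr] holds the status word already read from that PHY.
 * A value of 0xffff is treated as "no PHY at this address".
 * Returns the first address with link up, or -1 if there is none.
 */
int first_phy_with_link(const int *scan_order, int count,
                        const uint16_t bmsr[32])
{
    for (int i = 0; i < count; i++) {
        int addr = scan_order[i] & 0x1f;
        if (bmsr[addr] != 0xffff && (bmsr[addr] & BMSR_LINK_STATUS))
            return addr;
    }
    return -1;
}
```

With two linked transceivers, the answer depends entirely on the order passed in, which is the ambiguity raised above.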

> > And once phy A has been selected out of N available as the
> > active phy, should you care about the others at all?
> 
> Not until the link beat disappears.

Uhmm, but you don't always know when you have lost link beat.  In some
cases the driver does basic polling to check for duplex changes, but
the semantics are not as clean as you would expect.


Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] via-rhine.c: don't reference skb after passing it to netif_rx

2001-02-26 Thread Donald Becker

On Mon, 26 Feb 2001, Arnaldo Carvalho de Melo wrote:

> Em Mon, Feb 26, 2001 at 08:33:59PM -0300, Arnaldo Carvalho de Melo escreveu:
>   I've just read davem's post at netdev about the brokenness of
> referencing skbs after passing them to netif_rx, so please consider applying
> this patch. Ah, this was just added to the Janitor's TODO list at

> --- linux-2.4.2/drivers/net/via-rhine.c   Mon Dec 11 19:38:29 2000
> +++ linux-2.4.2.acme/drivers/net/via-rhine.c  Mon Feb 26 22:36:18 2001
> @@ -1147,9 +1147,9 @@
>                np->rx_buf_sz,
>                PCI_DMA_FROMDEVICE);
>   }
>   skb->protocol = eth_type_trans(skb, dev);
> + np->stats.rx_bytes += skb->len;
>   netif_rx(skb);
>   dev->last_rx = jiffies;
> - np->stats.rx_bytes += skb->len;
>   np->stats.rx_packets++;
>   }

Easier fix: 
-   np->stats.rx_bytes += skb->len;
+   np->stats.rx_bytes += pkt_len;

Grouping the writes to np->stats results in better cache usage.
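
The underlying rule is that the driver no longer owns the buffer once it has been handed up the stack, so anything it still needs must come from a local copy. A userspace sketch of that pattern follows; hand_off() is a stand-in for netif_rx() that frees the buffer immediately to make the hazard obvious, and all names here are hypothetical.

```c
#include <stdlib.h>

struct pkt   { size_t len; };
struct stats { unsigned long rx_bytes, rx_packets; };

/* Stand-in for netif_rx(): takes ownership of the buffer. */
void hand_off(struct pkt *p)
{
    free(p);
}

/* Receive one packet of pkt_len bytes and update the counters. */
void receive_one(struct stats *st, size_t pkt_len)
{
    struct pkt *p = malloc(sizeof *p);
    if (!p)
        return;
    p->len = pkt_len;

    hand_off(p);                /* p must not be touched after this point */

    /* Both counters updated together, from the local length. */
    st->rx_bytes += pkt_len;
    st->rx_packets++;
}
```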


Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] starfire reads irq before pci_enable_device.

2001-02-13 Thread Donald Becker

On 12 Feb 2001, Jes Sorensen wrote:

> >>>>> "Donald" == Donald Becker <[EMAIL PROTECTED]> writes:
> 
> Donald> On 9 Feb 2001, Jes Sorensen wrote:
> >> The ia64 kernel has gotten misaligned load support, but it's slow
> >> as a dog so we really want to copy the packet every time anyway
> >> when the header is not aligned. If people send out 802.3 headers or
> >> other crap on Ethernet then it's just too bad.
> 
> Donald> Note the word "required", meaning "must be done"
> Donald> vs. "recommended" meaning "should be done".
> 
> Donald> The initial issue was a comment in a starfire patch that
> Donald> claimed an IA64 bug had been fixed.  The copy breakpoint
> Donald> change might have improved performance by doing a copy-align,
> Donald> but it didn't fix a bug.
> 
> I agree it was a bug, and yes it has been fixed.

There was not a bug in the driver.  The bug was/is in the protocol handling
code.  The protocol handling code *must* be able to handle unaligned IP
headers.
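
One portable way for protocol code to tolerate a misaligned header is to assemble multi-byte fields from individual bytes rather than dereferencing a wider pointer. This is only a sketch of the general technique, not what the Linux IP stack does; the function name is made up.

```c
#include <stdint.h>

/*
 * Read a big-endian (network order) 32-bit field starting at any byte
 * address. Only byte loads are issued, so alignment never matters.
 */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```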

> Donald> That performance tradeoff was already anticipated: the
> Donald> 'rx_copybreak' value that was changed was a module parameter,
..
> In this case it just results in a performance degradation for 99% of
> the usage. What about making the change so it is optimized away unless
> IPX is enabled?

???
  - It's not just IPX hosts that send 802.3 headers.
  - While a good initial value might depend on the architecture, the
    best setting is processor implementation and environment dependent.
    Those details are not known at compile time.
  - The code path cost of a module option is only a compare and a
    conditional branch.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] starfire reads irq before pci_enable_device.

2001-02-09 Thread Donald Becker

On 9 Feb 2001, Jes Sorensen wrote:

> >>>>> "Jeff" == Jeff Garzik <[EMAIL PROTECTED]> writes:
> Jeff> Donald Becker wrote:
> >> On Tue, 16 Jan 2001, Jeff Garzik wrote:
> >> > * IA64 support (Jes)
> >> Oh, and this is completely bogus.  This isn't a fix, it's a hack that
> >> covers up the real problem.
> >> 
> >> The align-copy should *never* be required because the alignment
> >> differs between DIX and E-II encapsulated packets.  The machine
> >> shouldn't crash because someone sends you a different encapsulation
> >> type!
> 
> The ia64 kernel has gotten misaligned load support, but it's slow as
> a dog so we really want to copy the packet every time anyway when the
> header is not aligned. If people send out 802.3 headers or other crap
> on Ethernet then it's just too bad.

Note the word "required", meaning "must be done" vs. "recommended"
meaning "should be done".

The initial issue was a comment in a starfire patch that claimed an IA64
bug had been fixed.  The copy breakpoint change might have improved
performance by doing a copy-align, but it didn't fix a bug.

That performance tradeoff was already anticipated: the 'rx_copybreak'
value that was changed was a module parameter, not a constant.  That
allows a module-load-time tradeoff, based on the specific implementation,
of copying the received packet or accepting a few unaligned loads of the
usually small IP header.  See the comments in starfire.c, as well as
several other bus-master drivers.
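
The copy-or-pass decision described above can be sketched in userspace as below. The 2-byte pad is the common trick of offsetting a 14-byte Ethernet header so the IP header lands on a 4-byte boundary; the function name, the return convention, and the buffer handling are assumptions for illustration, not code from starfire.c.

```c
#include <stddef.h>
#include <string.h>

#define IP_ALIGN_PAD 2  /* 14-byte Ethernet header + 2 = 16 */

/*
 * If the packet is shorter than rx_copybreak, copy it into dst with a
 * 2-byte pad and return 1. Otherwise return 0, meaning the caller
 * should pass the original receive buffer up unchanged.
 * rx_copybreak == 0 therefore means "never copy".
 */
int rx_copy_if_small(unsigned char *dst, size_t dst_size,
                     const unsigned char *frame, size_t pkt_len,
                     size_t rx_copybreak)
{
    if (pkt_len >= rx_copybreak || pkt_len + IP_ALIGN_PAD > dst_size)
        return 0;
    memcpy(dst + IP_ALIGN_PAD, frame, pkt_len);
    return 1;
}
```

The per-packet cost of keeping the threshold tunable is the single comparison at the top.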
   

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] starfire reads irq before pci_enable_device.

2001-02-08 Thread Donald Becker

On Thu, 8 Feb 2001, Ion Badulescu wrote:

> On Thu, 8 Feb 2001, Donald Becker wrote:
> 
> > > > The align-copy should *never* be required because the alignment differs
> > > > between DIX and E-II encapsulated packets.  The machine shouldn't crash
> > > > because someone sends you a different encapsulation type!
> > 
> > This is true for a number of drivers -- triggering the copy-align code
> > might eliminate the misaligned traps on your local network, but it's not
> > a solution.
> 
> Ok, so what *is* a good solution then? I'm not arguing that unaligned 
> memory access traps should be avoided because they are deadly (they 
> shouldn't be), but because they are costly.
> 
> Or we can just tell people, "hey, don't use this 64-bit PCI card on a real 
> 64-bit system, it's broken by design"? I don't think that's a good 
> solution either.

This is not a 64 bit PCI issue.  It is an issue with the protocol
stack.  The IP protocol handling code must expect that the header words
will be misaligned in some circumstances.

It's amusing that a full receive copy is added without any concern, in
the same discussion where zero-copy transmit is treated as a holy grail!

> >The MII read code is no longer reliable.  I spent twenty minutes at
> >the show, but couldn't figure out the problem.  I haven't been able to
> >reproduce the problem locally with my 2.2 code and somewhat older
> >hardware.
> 
> Yes, I've noticed this too, the PHY doesn't seem to get detected in all 
> cases, and it's pretty random at that. Other times the same PHY gets 
> detected multiple times at different addresses.

This might be a transceiver preamble issue with the specific
transceivers on the recent cards.  Debugging this type of problem
sometimes requires a D-Oscope on the MII data pins.

Normally I would suspect a timing problem with a very fast machine, but
the Starfire hardware generates its own preamble and clock signals, not
the driver code.

> The good news is that the same code behaves the same on 2.4 and 2.2, so 
> I think it's not a core kernel issue. I'll try to track it down; 

It's likely easier to use the starfire-diag program from
  http://www.scyld.com/diag/index.html

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] starfire reads irq before pci_enable_device.

2001-02-08 Thread Donald Becker

On Thu, 8 Feb 2001, Jeff Garzik wrote:

> > Well, I decided to bite the bullet and port my zerocopy starfire
> > changes to the official tree, properly ifdef'ed. So here it goes,
>
> I would prefer that the zerocopy changes stay in DaveM's external patch
> until they are ready to be merged.  Zerocopy is still changing and being

Good idea -- I expect that there will be significant interface changes
before the zero-copy code stabilizes.

> > + * The ia64 doesn't allow for unaligned loads even of integers being
> > + * misaligned on a 2 byte boundary. Thus always force copying of
> > + * packets as the starfire doesn't allow for misaligned DMAs ;-(
...
> Note that I have not yet sent this patch onto Linus for a reason... 
> Here is Don Becker's comment on the subject:

> > Oh, and this is completely bogus.
> > This isn't a fix, it's a hack that covers up the real problem.
> > 
> > The align-copy should *never* be required because the alignment differs
> > between DIX and E-II encapsulated packets.  The machine shouldn't crash
> > because someone sends you a different encapsulation type!

This is true for a number of drivers -- triggering the copy-align code
might eliminate the misaligned traps on your local network, but it's not
a solution.

I saw the Adaptec people last week at LinuxWorld.  The 2.4.0 starfire
has a number of actual bugs that should be fixed RSN:
   The consistency check in the Rx code was broken.  Did anyone ever try
   the driver after the changes?  The test triggers with every received
   packet.  The easiest patch is to just get rid of the consistency checks
   inside "#ifndef final_version".

   The region resource was not released, requiring a reboot between each
   driver test.  Trivial fix.

   The MII read code is no longer reliable.  I spent twenty minutes at
   the show, but couldn't figure out the problem.  I haven't been able to
   reproduce the problem locally with my 2.2 code and somewhat older
   hardware.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] starfire reads irq before pci_enable_device.

2001-02-08 Thread Donald Becker

On Thu, 8 Feb 2001, Jeff Garzik wrote:

  Well, I decided to bite the bullet and port my zerocopy starfire
  changes to the official tree, properly ifdef'ed. So here it goes,

 I would prefer that the zerocopy changes stay in DaveM's external patch
 until they are ready to be merged.  Zerocopy is still changing and being

Good idea -- I expect that there will be significant interface changes
before the zero-copy code stabilizes.

  + * The ia64 doesn't allow for unaligned loads even of integers being
  + * misaligned on a 2 byte boundary. Thus always force copying of
  + * packets as the starfire doesn't allow for misaligned DMAs ;-(
...
 Note that I have not yet sent this patch onto Linus for a reason... 
 Here is Don Becker's comment on the subject:

  Oh, and this is completely bogus.
  This isn't a fix, it's a hack that covers up the real problem.
  
  The align-copy should *never* be required because the alignment differs
  between DIX and E-II encapsulated packets.  The machine shouldn't crash
  because someone sends you a different encapsulation type!

This is true for a number of drivers -- triggering the copy-align code
might eliminate the misaligned traps on your local network, but it's not
a solution.

I saw the Adaptec people last week at LinuxWorld.  The 2.4.0 starfire
has a number of actual bugs that should be fixed RSN:
   The consistency check in the Rx code was broken.  Did anyone ever try
   the driver after the changes?  The test triggers with every received
   packet.  The easiest patch is to just get rid of the consistency checks
   inside "#ifndef final_version".

   The region resource was not released, requiring a reboot between each
   driver test.  Trivial fix.

   The MII read code is no longer reliable.  I spent twenty minutes at
   the show, but couldn't figure out the problem.  I haven't been able
   to reproduce the problem locally with my 2.2 code and somewhat older
   hardware.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] starfire reads irq before pci_enable_device.

2001-02-08 Thread Donald Becker

On Thu, 8 Feb 2001, Ion Badulescu wrote:

> On Thu, 8 Feb 2001, Donald Becker wrote:
>
> > > The align-copy should *never* be required because the alignment differs
> > > between DIX and E-II encapsulated packets.  The machine shouldn't crash
> > > because someone sends you a different encapsulation type!
> >
> > This is true for a number of drivers -- triggering the copy-align code
> > might eliminate the misaligned traps on your local network, but it's not
> > a solution.
>
> Ok, so what *is* a good solution then? I'm not arguing that unaligned
> memory access traps should be avoided because they are deadly (they
> shouldn't be), but because they are costly.
>
> Or we can just tell people, "hey, don't use this 64-bit PCI card on a real
> 64-bit system, it's broken by design"? I don't think that's a good
> solution either.

This is not a 64 bit PCI issue.  It is an issue with the protocol
stack.  The IP protocol handling code must expect that the header words
will be misaligned in some circumstances.

It's amusing that a full receive copy is added without any concern, in
the same discussion where zero-copy transmit is treated as a holy grail!

> > The MII read code is no longer reliable.  I spent twenty minutes at
> > the show, but couldn't figure out the problem.  I haven't been able
> > to reproduce the problem locally with my 2.2 code and somewhat older
> > hardware.
>
> Yes, I've noticed this too, the PHY doesn't seem to get detected in all
> cases, and it's pretty random at that. Other times the same PHY gets
> detected multiple times at different addresses.

This might be a transceiver preamble issue with the specific
transceivers on the recent cards.  Debugging this type of problem
sometimes requires a D-Oscope on the MII data pins.

Normally I would suspect a timing problem with a very fast machine, but
the Starfire hardware generates its own preamble and clock signals, not
the driver code.

> The good news is that the same code behaves the same on 2.4 and 2.2, so
> I think it's not a core kernel issue. I'll try to track it down;

It's likely easier to use the starfire-diag program from
  http://www.scyld.com/diag/index.html

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] Re: Q: natsemi.c spinlocks

2001-01-21 Thread Donald Becker

On Sun, 21 Jan 2001, Jeff Garzik wrote:
> Andrew Morton wrote:
> > Manfred wrote:
> > > Hi Jeff, Tjeerd,
> > > I spotted the spin_lock in natsemi.c, and I think it's bogus.
> > >
> > > The "simultaneous interrupt entry" is a bug in some 2.0 and 2.1 kernel
> > > (even Alan didn't remember it exactly when I asked him), thus a sane
> > > driver can assume that an interrupt handler is never reentered.
...
> > I think you're right.  2.4's interrupt handling prevents
> > simultaneous entry of the same ISR.

The bug (simultaneous calls to the interrupt handler on SMP) existed in
most 2.0 versions and was fixed before 2.2.  A driver that needs to work
with multiple kernel versions must have the check.

> > However, natsemi.c's spinlock needs to be retained, and
> > extended into start_tx(), because this driver has
> > a race which has cropped up in a few others:
> > ...
> > if (np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1) {
> > /* WINDOW HERE */
> > np->tx_full = 1;
> > netif_stop_queue(dev);
> > }
> > If the ring is currently full and an interrupt comes in
> > at the indicated window and reaps ALL the packets in the
> > ring, the driver ends up in state `tx_full = 1' and transmit
> > disabled, but with no outstanding transmit interrupts.

The better solution, which I've been adding to the drivers, is to check
again for a just-cleared Tx queue after setting tx_full.
That trades an extra comparison on a rarely followed path for a spinlock
that is taken for every transmit and interrupt.

Remember: spinlocks are expensive!
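
The re-check described above can be sketched roughly as follows.  This is
a single-threaded simulation, not code from natsemi.c: the struct, the
helper names, the `window` callback that stands in for an interrupt
arriving in the race window, and the TX_QUEUE_LEN value are all
assumptions made for illustration.  A real driver would also have to
consider memory ordering between the transmit path and the handler.

```c
#include <stdbool.h>
#include <stddef.h>

#define TX_QUEUE_LEN 10   /* hypothetical ring size */

/* Hypothetical, stripped-down stand-in for the driver's private state. */
struct ring_state {
    unsigned int cur_tx;    /* next slot the transmit path will fill */
    unsigned int dirty_tx;  /* next slot the interrupt handler will reap */
    bool tx_full;
    bool queue_stopped;     /* stands in for netif_stop_queue()/wake */
};

static bool ring_is_full(const struct ring_state *np)
{
    return np->cur_tx - np->dirty_tx >= TX_QUEUE_LEN - 1;
}

/* Simulated Tx-done interrupt: reaps every packet, and wakes the queue
 * only if it sees tx_full already set. */
static void irq_reap_all(struct ring_state *np)
{
    np->dirty_tx = np->cur_tx;
    if (np->tx_full) {
        np->tx_full = false;
        np->queue_stopped = false;
    }
}

/* End of the transmit path.  `window` (may be NULL) is called where the
 * thread marks "WINDOW HERE".  Returns true if the queue is left stopped. */
static bool tx_check_full(struct ring_state *np,
                          void (*window)(struct ring_state *))
{
    if (!ring_is_full(np))
        return false;

    if (window != NULL)
        window(np);             /* interrupt lands in the race window */

    np->tx_full = true;
    np->queue_stopped = true;

    /* The extra comparison: was the ring emptied behind our back? */
    if (!ring_is_full(np)) {
        np->tx_full = false;
        np->queue_stopped = false;
    }
    return np->queue_stopped;
}
```

With the re-check in place, an interrupt that empties the ring inside the
window no longer leaves the queue stopped with nothing left to wake it,
and the common path pays for one comparison rather than a lock.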

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




[patch] tulip driver

2001-01-05 Thread Donald Becker


Peter De Schrijver ([EMAIL PROTECTED]) wrote:

> Attached you will find a patch to the tulip driver in Linux 2.4. This patch 
> will interpret a bit more of 21142 extended format type 3 info blocks in a 
> tulip SROM. This allows correct autonegotiation of the builtin 21143 based 
> ethernet adapter on a digital PWS500a(u).

This patch isn't quite correct.

The to-advertise value in the type 3 media info block must be written
to the transceiver after each transceiver reset.  The write is done in
select_media(), not when initially reading EEPROM.

One part of the problem is that to work correctly the driver must keep the
to-advertise values separately for the SYM/serial transceiver and each MII
transceiver.
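
A rough sketch of that bookkeeping, under stated assumptions: the struct
layout, MAX_MII, the fake transceiver register, and the function names
below are invented for illustration and are not taken from tulip.c.  The
point is only that the to-advertise value is stored per transceiver and
rewritten after every reset, rather than written once at EEPROM-parse
time.

```c
#include <stdint.h>

#define MAX_MII 4   /* hypothetical limit on MII transceivers per board */

/* Hypothetical per-board media state: one to-advertise value for the
 * SYM/serial transceiver and one for each MII transceiver. */
struct media_state {
    uint16_t sym_advertise;
    uint16_t mii_advertise[MAX_MII];
};

/* Hypothetical stand-in for a transceiver's advertisement register. */
struct fake_xcvr {
    uint16_t advertise_reg;
};

/* A reset wipes whatever was advertised before. */
static void fake_xcvr_reset(struct fake_xcvr *x)
{
    x->advertise_reg = 0;
}

/* Media (re)selection for MII transceiver `idx`: reset, then rewrite the
 * stored to-advertise value for that specific transceiver. */
static void reselect_mii_sketch(const struct media_state *ms, int idx,
                                struct fake_xcvr *x)
{
    fake_xcvr_reset(x);
    x->advertise_reg = ms->mii_advertise[idx];
}
```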

> Maybe a future version should do 
> a more thorough interpretation of the SROM. 

The tulip driver in 2.4 uses older media selection code.  See the
tulip.c driver at
   http://www.scyld.com/network/tulip.html
for a driver with the many updates needed to support recent chips and
boards.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: question about tulip patch to set CSR0 for pci 2.0 bus

2000-12-08 Thread Donald Becker

On Sat, 9 Dec 2000, Alan Cox wrote:

> > Just in case you didn't catch it: this is not a PCI v2.0 vs. v2.1 issue.
> > The older Tulips work great with PCI v2.0 and v2.1.  The bug is with longer
> > bursts and a specific i486 chipset/motherboard.
> 
> Which chipset. I can then add it to the PCI quirks and we can do it nicely
> in 2.4 so that drivers can test the pci quirk list

I had the problem with the Intel Saturn II chipset used on the Asus SP3G.
The same problem was reported with the Saturn I on the SP3.

The bug manifests as occasional bus-master transfer data corruption.

The work-around was to change the Tulip PCI control register to use 
  8 longword cache alignment, 8 longword burst.
when the Tulip driver was run on a 486.

The old non-module work-around was
if (x86 <= 4)
  printk(KERN_INFO "%s: This is a 386/486 PCI system, setting cache "
 "alignment to %x.\n", dev->name,
 0x01A0 | (x86 <= 4 ? 0x4800 : 0x8000));

I removed this code and replaced it with the ability to set the variable
"csr0" as a module option.  There is no way to activate the fix with a
built-in driver.
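
For illustration, the selection logic amounts to something like the
sketch below.  The two CSR0 values are the "normal" and "cautious"
settings quoted elsewhere in this thread; the function name, its
parameters, and the convention that a zero option means "not set" are
assumptions, not the driver's actual interface.

```c
#include <stdint.h>

#define CSR0_NORMAL   0x01A08000u  /* "normal" setting quoted in the thread */
#define CSR0_CAUTIOUS 0x01A04800u  /* "cautious" setting quoted in the thread */

/* Hypothetical helper: choose the value for the Tulip PCI control register.
 * cpu_family plays the role of the old `x86` variable (4 == 486);
 * csr0_option stands in for the "csr0" module option, 0 meaning unset. */
static uint32_t pick_csr0(int cpu_family, uint32_t csr0_option)
{
    if (csr0_option != 0)
        return csr0_option;          /* explicit override always wins */
    return cpu_family <= 4 ? CSR0_CAUTIOUS : CSR0_NORMAL;
}
```

The CPU test is only a proxy for the buggy chipset, which is why an
explicit override is worth keeping either way.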

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: question about tulip patch to set CSR0 for pci 2.0 bus

2000-12-08 Thread Donald Becker


Jeff Garzik wrote:
> Clayton Weaver wrote: 
>>> 
>> Shouldn't the setting of the CSR0 value for x86 switch between normal 
>> (0x01A08000) and cautious (0x01A04800) based on some notion of 
>> what generation of pci bus is installed rather than what cpu the kernel 
>> is compiled for? 

No, you misunderstand the reason for that code.  It's not based on the PCI
bus version.  It's to work-around a specific bug in the Intel chipset
used on 486 PCI motherboards such as the Asus SP3 and SP3G.

The best way to check for this buggy chipset was to check for a 486
processor.  There are very few 486 chips on non-buggy motherboards, and the
performance impact of shorter PCI bursts is minimal given the slow speed of
the 486.

>> That's one thing that bothered me about the method that the .90 driver 
>> used. It worked for me, of course, cool, but when I thought about putting 

I put the check in the old drivers because the SP3 was a common motherboard
"way back in the old days".  The check was removed because the kernel
changed and removed the variable that held the processor architecture.

>> If the pci bus level is 2.0, it makes sense to use the cautious CSR0 
>> setting, for the same reasons that the .90 tulip.c in 2.0.38 does, and if 
>> the pci level is 2.1, you aren't taking any chances with 0x01A08000 that 
>> the driver doesn't take now. The pci driver, initialized before any 
>> pci devices, appears to know whether you have a pci 2.0 or pci 2.1 bus, so 
>> why not use that information instead of cpu generation? 
>
>A good suggestion, too... Some other hardware behaves differently 
>based on PCI bus version, it would be nice for the driver to notice that 
>and enable (or disable) advanced features. To blindly assume is just a 
>PCI bus lockup waiting to happen... 

Just in case you didn't catch it: this is not a PCI v2.0 vs. v2.1 issue.
The older Tulips work great with PCI v2.0 and v2.1.  The bug is with longer
bursts and a specific i486 chipset/motherboard.


Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [RFC] Configuring synchronous interfaces in Linux

2000-12-02 Thread Donald Becker

On Sun, 3 Dec 2000, Chris Wedgwood wrote:

> On Sat, Dec 02, 2000 at 11:09:35AM -0500, Donald Becker wrote:
> 
> Hey, I'll make it easy.  Find an approach that fully handles only the Tulip
> and 3c59x drivers, and that is consistent.
> 
> Actually, I started work on adding this to the 3c59x code last
> night; I am now a little despondent though as it wasn't as simple as
> I first thought it might be.
> 
> I am now wondering whether it makes sense to break 3c59x into smaller
> pieces which handle fewer cards each; there seem to be many things
> the driver knows about which probably don't relate to my needs.

It's certainly possible to break the driver up, but it will be even more of
a problem to maintain.  Some of the complicated media selection code applies
to several generations.  Splitting the driver to have a copy for each
generation means a lot of duplicated code, which quickly leads to version
skew.

The story usually goes like this:

Someone wants to experiment with a driver.  It's always exciting to tweak
the code for the latest and greatest.  But the driver has all of this
complicated stuff for other, usually older, card/kernel versions.  So the
hacker tosses out the code, "simplifying" the driver.  They then release the
"new and improved driver".

They have no CVS tree to maintain, no old driver or hardware versions to
keep track of.  No one has been using the driver for years, and thus there
is no one screaming when their production machines stop working.  All of the
people with problems are just referred to the guy who did the original
driver, who is still expected to be there when things break.  They don't
realize they have just removed all of the excitement and motivation for the
guy who is doing all of the time consuming maintenance and testing work.

I don't mean to pick on 3Com, but the driver they released is a good
example.  It supported only a tiny set of card types that 3Com was currently
selling, and only with the current kernel.  It didn't support the previous
card types, the OEMed versions, or the older kernels.  The assumption was
that my driver would exist to support those hard cases, but that by handling
the easy 90%, 3Com would get most of the credit.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [RFC] Configuring synchronous interfaces in Linux

2000-12-02 Thread Donald Becker

On Fri, 1 Dec 2000, Francois Romieu wrote:

> Russell King <[EMAIL PROTECTED]> wrote:
> [...]
> > We already have a standard interface for this, but many drivers do not
> > support it.  Its called "ifconfig eth0 media xxx":

Uhmmm, it's not a standard if "many drivers do not support it".

It is very easy to hack up code to handle one or two drivers.
But you shouldn't claim the problem is fixed until the approach is tested
with all of the drivers.

Hey, I'll make it easy.  Find an approach that fully handles only the Tulip
and 3c59x drivers, and that is consistent.

I'll start you out: the possible 100baseTx configurations for the 3c59x
driver are SYM transceiver, MII transceiver, and "NWay" transceiver.  The
latter two may use autonegotiation, only speed autosensing, or a fixed
speed.  The SYM transceiver version can do static speed sensing.

[[ Note static speed sensing on the 3c595 is potentially evil.  The chip
must generate 100baseTx link beat while checking for 100baseTx link beat.
This commonly hoses a 10baseT repeater with constant collisions.  So does
"auto speed" mean "check for 100baseTx link beat, even though I sense
10baseT" or "do the safe thing and stick with 10baseT". ]]

> Ok. Hmmm... If I want to do something like 
> 'ifconfig scc0 media some_frequency up' as I hope to set scc0 as a DCE (or 
> 'ifconfig scc0 media auto up' for a DTE), I must teach ifconfig.c to 
> distinguish Ethernet and synchronous interfaces based on interface.type,
> right ?

Correct.  And just speed isn't good enough for Ethernet.  We have 1/10HPNA,
100base-Fx,Tx,T4.

We should not just give up.
My point is that the issue isn't a trivial one.
Media selection code is the most time consuming and error prone code in many
drivers.  I would have avoided doing that work if there had been an easy
answer.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [PATCH] lance.c - dev_kfree_skb() then reference skb->len

2000-11-28 Thread Donald Becker

On Tue, 28 Nov 2000, Eli Carter wrote:

> Patch is against 2.2.17, drivers/net/lance.c.
> I believe this to be "obviously correct," but please correct me if I'm
> wrong.
> This moves a reference to skb->len to before the possible
> dev_kfree_skb(skb) call.  Though it appears to work as is, I suspect it
> is incorrect.

This patch looks reasonable.

Perhaps it would be better to have the driver retain the skbuff until the
transmit succeeds, and only then add the length to the stats.  But this
specific bug is related to the ISA bounce buffer code.  Any ISA card is in
the "legacy" category, so it's better to make the minimal change needed to
correct the obvious potential problem.
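
The shape of that minimal change, as a user-space sketch: the struct and
function names below are stand-ins, not the lance.c code, and free() plays
the role of dev_kfree_skb().  The only point is the ordering, i.e. the
length is read while the buffer is still valid.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the skbuff and the driver statistics. */
struct fake_skb {
    unsigned int len;
};

struct fake_stats {
    unsigned long tx_bytes;
};

/* Account for a queued packet.  When the data was copied to a bounce
 * buffer the original buffer is released immediately, so its length
 * must be captured before the free, never after. */
static void account_and_release(struct fake_stats *stats,
                                struct fake_skb *skb,
                                int copied_to_bounce)
{
    unsigned int len = skb->len;   /* read before any possible free */

    if (copied_to_bounce)
        free(skb);                 /* skb must not be touched past here */

    stats->tx_bytes += len;
}
```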



Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: duplicate entries in rtl8129 driver

2000-11-17 Thread Donald Becker

On Fri, 17 Nov 2000, Adam J. Richter wrote:

>   Both linux-2.4.0-test12-pre6/drivers/net/rtl8129.c and
> Don Becker's version at ftp.sycld.com appear to have identical
> PCI device ID and vendor ID values for these two cards:
> 
>   SMC1211TX EZCard 10/100 (RealTek RTL8139)
>   Accton MPX5030 (RealTek RTL8139)
> 
>   So, I do not see how the latter entry in pci_tbl is ever
> matched.  I think the result would be that users of either card
> will be told that they have an SMC1211TX EZCard 10/100.  I suggest
> deleting the latter entry and combine its label into the previous
> one, so it will be described as:
> 
>   SMC1211TX EZCard 10/100 or Accton MPX5030 (RealTek RTL8139)

They are distinguished by the PCI subsystem ID, which was truncated from the
list.

Note that Accton is really SMC.  They purchased part of SMC several years
ago, including the brand name.  The chip part of the old SMC is now named
SMsC, and they still make the EPIC Ethernet chip.

I do have a long list of subsystem IDs, but using multiple names for a
one-chip board with no design options is just confusing.  (Vs. the 21143
chip, which has at least 70 different driver-visible board design
variations.)

Bottom line: Yes, it's redundant.  But there was a reason.
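
To show how two entries with the same vendor and device ID can still be
told apart, here is a small sketch of a table lookup that also compares
the subsystem ID.  Every ID and board name in it is a made-up placeholder,
and the struct is not the driver's actual pci_tbl layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; subvendor == 0 means "match any subsystem". */
struct board_id {
    const char *name;
    uint16_t vendor, device;
    uint16_t subvendor, subdevice;
};

/* Placeholder IDs only.  More specific entries come first, and the
 * wildcard entry for the bare chip comes last. */
static const struct board_id board_tbl[] = {
    { "Board A",      0x1234, 0x5678, 0xAAAA, 0x0001 },
    { "Board B",      0x1234, 0x5678, 0xBBBB, 0x0002 },
    { "Generic chip", 0x1234, 0x5678, 0,      0      },
};

/* Return the name of the first matching entry, or NULL if none match. */
static const char *match_board(uint16_t vendor, uint16_t device,
                               uint16_t subvendor, uint16_t subdevice)
{
    size_t n = sizeof board_tbl / sizeof board_tbl[0];

    for (size_t i = 0; i < n; i++) {
        const struct board_id *b = &board_tbl[i];

        if (b->vendor != vendor || b->device != device)
            continue;
        if (b->subvendor != 0 &&
            (b->subvendor != subvendor || b->subdevice != subdevice))
            continue;
        return b->name;
    }
    return NULL;
}
```

If the subsystem columns are dropped from such a table, the first two
entries become indistinguishable and only the first can ever match, which
is the situation described in the quoted report.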

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: [patch] NE2000

2000-11-01 Thread Donald Becker

On Wed, 1 Nov 2000, Paul Gortmaker wrote:

> Jeff Garzik wrote:
> > Paul Gortmaker wrote:
> > > There is no urgency in trying to squeeze a patch like this in the back
> > > door of a 2.4.0 release.  For example, there are people out there now
> > > who are using the ne.c driver to run both ISA and PCI cards in the same
> > > box without having to use 2 different drivers.  We can wait until 2.5.0
> > > to break their .config file.
> > 
> > IMNSHO this is a bug, though...
..
> If you want to roll it into the merge (and can get it past Linus) then
> please feel free to do so - I'll be glad to cross it off my list sooner
> as opposed to later.

If the ne* drivers are going to be updated, you might want to add in the
full-duplex support of the latest ne2k-pci.c driver at
ftp://www.scyld.com/pub/network/ne2k-pci.c

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: Topic for discussion: OS Design + GPL

2000-10-24 Thread Donald Becker

On Mon, 23 Oct 2000, Andre Hedrick wrote:

> Subject: Re: Topic for discussion: OS Design + GPL
> 
> ftp://ftp.etinc.com/pub/linux/linux22_hdlc.tgz
> 
> Could you explain to me why ET Inc is modifying GPL drivers and then
> republishing the binaries as modules only?
> 
> Not that it is my sub-system, but I am not sure that my friend Don knows
> of this issue.  If Don does not care then, good day.
> 
> ls hdlc/usr/hdlc/dev/modules/2.2.14
> .   eepro100.o      etbwmgr.o   tulip.o
> ..  eepro100orig.o  ethdlc.o    tuliporig.o

I very much care.

Neither the eepro100.c nor the tulip.c driver has been released under any
license but the GNU GPL.

Any distribution of those drivers must only be done under the terms of the
GPL.  That includes providing a copy of the GPL text and making a specific
offer of source code as required by the GPL.

If you have offered driver object files without offering the source code,
you have terminated your right of redistribution of the code per paragraph 4
of the GPL.  That means you have no right to distribute the drivers, even
if the violation is corrected.

In general I will reinstate redistribution rights if
  - the license violation is acknowledged
  - you provide a specific plan to prevent future violations
  - you notify all existing recipients of the license violation, including
 their right to receive the source code.
This reinstatement of rights is not automatic, especially with evidence of
continued, willful violations.


Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Second Generation Beowulf Clusters
Annapolis MD 21403  410-990-9993




Re: PATCH 2.4.0.9.2: export ethtool interface

2000-09-19 Thread Donald Becker

On Tue, 19 Sep 2000, Andrew Morton wrote:

> > This patch, against 2.4.0-test9-pre2, moves ethtool.h from the private
> > domain of the sparc ports into include/linux.  This publishes an
...
> This is good. It would be useful to have this in place ASAP so driver
> authors have something to look at and to work against.

You all know my opinion on this interface: it is bad.

> * I added SUPPORTED_MAC_FLOWCTRL/ADVERTISED_MAC_FLOWCTRL for advertising
> of 802.3x MAC-layer flow control.

There are two elements of this: Rx and Tx flow control.  We might support Rx
flow control, but not generate flow control packets.  Or perhaps the
inverse.
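
One way to picture the Rx/Tx split is as two independent capability bits instead of a single flow-control flag. The sketch below is only an illustration: the flag and function names are hypothetical and do not come from ethtool.h or any driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability bits; these names are not from ethtool.h. */
enum flowctrl_caps {
    FLOWCTRL_RX = 1u << 0, /* can act on received flow control packets */
    FLOWCTRL_TX = 1u << 1, /* can generate flow control packets */
};

/* True if the device pauses its transmitter when the link partner asks. */
bool honors_pause(unsigned int caps)
{
    assert((caps & ~(unsigned int)(FLOWCTRL_RX | FLOWCTRL_TX)) == 0);
    return (caps & FLOWCTRL_RX) != 0;
}

/* True if the device can ask the link partner to pause. */
bool sends_pause(unsigned int caps)
{
    assert((caps & ~(unsigned int)(FLOWCTRL_RX | FLOWCTRL_TX)) == 0);
    return (caps & FLOWCTRL_TX) != 0;
}
```

A device that supports Rx flow control but never generates flow control packets would report FLOWCTRL_RX alone, a case that a single combined flag cannot express.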

> * There don't seem to be enough port types and transceiver types. 
> 10base2? BNC? 100baseTx/100baseFX?

1Mbps HomePNA? 10Mbps HomePNA?  (Check the control words for the PNA spec.)
Various 802.11?
10Gb Ethernet?  Have you looked at gigabit autonegotiation?

If you are proposing a new interface (and obviously tossing the
existing MII-MDIO emulation that has existed for a few years) you should at
least support current hardware.


Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Beowulf-II Cluster Distribution
Annapolis MD 21403




Re: Preallocated skb's?

2000-09-15 Thread Donald Becker

On Fri, 15 Sep 2000, Bogdan Costescu wrote:

> On Fri, 15 Sep 2000, jamal wrote:
> > You use the period(5-10micros), while waiting
> > for full packet arrival, to make the route decision (lookup etc).
> > i.e this will allow for a better FF; it will not offload things.
> 
> Just that you span several layers by doing this, it's not driver specific
> anymore.

Many chips have some sort of early-Rx feature, but it's still a bad idea for
the many reasons I've pointed out before.

An additional reason not to use early-Rx is that chips such as the 3c905C are
most efficient at using the PCI bus when transferring a whole packet in a
single PCI burst (plus two smaller bursts initially reading and later
writing the descriptor).  Using an early-Rx interrupt scheme means using
multiple smaller bursts.

The early-Rx scheme worked well on the ISA bus, where transfers were slow
and not bursting.

Also note: it is possible to drop an Rx packet after the early Rx
interrupt.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Beowulf-II Cluster Distribution
Annapolis MD 21403




Re: Preallocated skb's?

2000-09-14 Thread Donald Becker

On Thu, 14 Sep 2000, jamal wrote:
> On Thu, 14 Sep 2000, Andrew Morton wrote:
> > But for 3c59x (which is not a very efficient driver (yet)), it takes 6
> > usecs to even get into the ISR, and around 4 uSecs to traverse it. 
> > Guess another 4 to leave the ISR, guess half as much again for whoever
> > got interrupted to undo the resulting cache pollution.
> > 
> > That's 20 usec per interrupt, of which 1 usec could be saved by skb
> > pooling.
> 
> With these numbers + how long it takes to queue the packets in
> netif_rx(); i would say you roughly should be able to tune your DMA
> ring appropriately. 
> 
> Roughly your DMA ring should be able to hold:
> 
> (PCI_Burst_bandwidth * ((20 * 10^-6) + pci_bus_latency)) bits.
> 
> Did i hear Donald say something? ;->

No, because I know I sound like a broken record.  

What we measured is that the cache impact of allocating and initializing our
(ever-larger) skbuffs is huge.  So we pay some CPU time getting a new
skbuff, and some more CPU time later reloading the cache with useful data.

The skbuff is added to the end of the driver Rx buffer list, so the memory
lines are out of the cache by the time we need them.

The Rx ring should be able to hold at least
   (interrupt-latency * 100/1000Mbps) bits
and 
   (interrupt-latency * 100/1000Mbps)/(64 bytes/packet * 8 bits/byte) packets
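
The two sizing formulas above can be worked through in a few lines of C. This is a minimal sketch with made-up function names; the 20 microsecond figure used in the note below is borrowed from the per-interrupt estimate quoted earlier in the thread purely as an example input, not as a measured interrupt latency.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_FRAME_BITS (64u * 8u) /* 64 bytes/packet * 8 bits/byte */

/* microseconds * Mbit/s = bits, so no scaling factor is needed. */
uint64_t rx_ring_min_bits(uint32_t latency_us, uint32_t link_mbps)
{
    assert(link_mbps > 0);
    return (uint64_t)latency_us * link_mbps;
}

/* Worst case is back-to-back minimum-size packets; round up. */
uint64_t rx_ring_min_packets(uint32_t latency_us, uint32_t link_mbps)
{
    uint64_t bits = rx_ring_min_bits(latency_us, link_mbps);
    return (bits + MIN_FRAME_BITS - 1) / MIN_FRAME_BITS;
}
```

With a 20 microsecond latency this gives 2000 bits (4 minimum-size packets) at 100Mbps and 20000 bits (40 packets) at 1000Mbps. Like the formula it follows, the sketch ignores the preamble and inter-frame gap, so the packet count errs on the high side.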


> > If you don't do Rx interrupt mitigation there's no point in even
> > thinking about skb pooling.
> 
> FF does not use mitigation and as Robert was pointing out this was adding
> a lot of value.

The PCI drivers make some effort to always allocate the same size skbuff, so
recycling skbuffs, or otherwise optimizing their allocation, is useful.

The only significant advantage of interrupt mitigation is cache locality
when allocating new skbuffs, and having an additional mechanism to drop
packets under overwhelming load.

The disadvantage of Rx interrupt mitigation is adding latency just where it
might matter the most.  Remember that the hot ticket for old-IPX performance
was taking an *extra* early interrupt for each Rx packet.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Beowulf-II Cluster Distribution
Annapolis MD 21403




Re: Fwd: ACPI & I4L irq conflict: bug reporting on kernel 2.4.0-test8-pre4

2000-09-06 Thread Donald Becker

On Wed, 6 Sep 2000, Guido Trentalancia wrote:
> On Wed, 06 Sep 2000, you wrote:
> > Guido Trentalancia wrote:

> > > >Motherboard: ASUS P2B-F with the latest bios bx2f113.awd (microcode
> > > > update) ISDN: Winbond based card (Hisax type=36)
> > > >The problem is that if I compile the kernel (2.4.0-test8
> > > > pre1,pre2,pre3,pre4) with both ACPI support and ISDN support there is a
> > > > conflict in irq 9. I think ACPI first gets irq 9 and then Hisax can't
> > > > get it. Consequently Hisax doesn't work if ACPI support is enabled.
> > > >With ACPI turned off, everything works fine as with previous kernel
> > > > test6 and test5 and 

> > > after further testing the problem seems to be in IRQ SHARING.
> > > in fact, with acpi disabled, once the hisax has got irq 9 (it is not
> > > possible for card type 36 to change the irq), i can load the ethernet
> > > modules 8390 and ne2k-pci for my ethernet PCI NE2000 card, but the
> > > ne2k-pci driver also set its irq=9, so everytime i try to do:
> > >
> > > ifconfig eth0 up
> > > SIOCSIFFLAGS: resource temporarily unavailable
> > >
> > > why don't add the irq parameter to the hisax winbond driver and to the
> > > ne2k-pci driver ?

> > there is no need or even sense to add such parameter, as irqs are
> > assigned by the bios or OS for PCI type cards. The driver is supplied
> > with the selected irq which is normally assigned during boot by the
> > systems bios.
> my bios irq settings are currently set to AUTO.
> i don't know why 3 resources (ne2k-pci, the winbond isdn module and the
> acpi) want the same irq 9 whereas there are a lot of free irqs...
> the maintainer of the ACPI driver said his driver is ok for irq sharing
> what about ne2k and winbond isdn ? i don't know how to contact the
> maintainers of these drivers

The ne2k-pci driver will share IRQs.
The 'ne' driver will work for PCI cards, but is intended for ISA cards.  It
will not share the IRQ.

Documentation for the ne2k-pci driver is at
   http://www.scyld.com/network/ne2k-pci.html

> 1) ACPI
> 2) ISDN (Winbond - HiSax)
> 3) RealTek PCI NE2000 ethernet

Is the ISDN card a PCI device?  If not, it cannot share IRQs.
If it is a PCI device, the driver is broken.

Donald Becker   [EMAIL PROTECTED]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210   Beowulf-II Cluster Distribution
Annapolis MD 21403



