Re: Question about David's blog entry for NetCONF 2006, Day 1
On Thursday 21 September 2006 17:15, Rick Jones wrote:
> I was reading David's blog entries on the netdev meeting in Japan, and have a question about this bit:
>
> > Currently, things like Xen have to put the card into promiscuous mode, accepting all packets, which is quite inefficient.
>
> Is the inefficient bit meant for accepting all packets, or more broadly that the promiscuous path is quite inefficient compared to the non-promiscuous path?
>
> I ask because I would have thought that if the system were connected to a switch (*), the number of packets received through a NIC in promiscuous mode would be nearly the same as when it was not in promiscuous mode - the delta being (perhaps) multicast frames.
>
> rick jones
>
> (*) Today, it seems 99 times out of 10 systems are connected to switches not hubs.

It depends on how good your switch is. Say you have a bank of 8 servers on an 8-port switch, each running 16 Xen instances with virtual NICs and different MAC addresses. If the switch does not have enough room in its MAC table (likely for an 8-port switch) to cache 136 entries (8 * (16 + 1) MAC addresses), it will flood any frame whose destination is not in the table to every port on the switch, effectively turning the switch into a hub for certain usage patterns.

Of course, this is an argument for getting a better switch, but virtual MAC addresses might cause some surprising resource utilization problems for network administrators who are used to counting physical ports.

- Brent

-
To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
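The table-size arithmetic in the reply above can be sketched in a few lines of C. The function names and the table capacities used in the note below are illustrative assumptions, not figures for any particular switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* MAC addresses the switch must learn: one per physical NIC
 * plus one per virtual NIC behind it. */
static size_t mac_entries_needed(size_t servers, size_t guests_per_server)
{
	return servers * (guests_per_server + 1);
}

/* True when the (assumed) table capacity cannot hold every address,
 * i.e. when some unicast frames will be flooded to all ports. */
static bool switch_may_flood(size_t table_capacity, size_t servers,
			     size_t guests_per_server)
{
	assert(table_capacity > 0);	/* a switch with no table is a hub */
	return mac_entries_needed(servers, guests_per_server) > table_capacity;
}
```

With 8 servers and 16 guests each, mac_entries_needed() gives the 136 entries mentioned above, so a switch with a hypothetical 128-entry table would flood while one with 1024 entries would not.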
Re: e1000: operation without eeprom?
On Sunday 27 August 2006 19:50, Lennert Buytenhek wrote:
> Hi,
>
> There are a couple of ARM boards out there with on-board e1000s but without any kind of eeprom. The boot loader and kernel board support code have all the info necessary to configure the e1000, but the e1000 driver bombs out because there isn't an eeprom connected -- how are we supposed to deal with this situation?

u-boot, which uses modified versions of the linux e1000 drivers, handles this special case with a bunch of platform-specific #ifdefs:

http://www.denx.de/cgi-bin/gitweb.cgi?p=u-boot.git;a=blob;h=927acbb26737a20e02962f67047e192545a870a1;hb=16850919ff8666f20d047cb83b4ee77581336515;f=drivers/e1000.c

I fear that running without an eeprom might not work well across the whole e1000 product line. We've had 82545s work OK without an eeprom, but the 82572 did not work so well. Some chips appear to work OK with pure software config, whereas others might need some special setup parameters that are best applied at chip power-up via the eeprom.

As a general solution (with fewer ifdefs than the u-boot solution), it might be nice to have a read_eeprom_virtual(..) method in the driver where one could supply a binary blob to the driver instead of having a real eeprom. All of the driver code that relies on the eeprom could then work like normal. I've been toying with this under u-boot for a custom ARM board without an eeprom too, though it does have the side effect of bloating the u-boot driver a bit with fake eeprom data that's really useless after boot (it's mostly 0x's though, so you could totally optimize it.)

- Brent
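A minimal sketch of the read_eeprom_virtual() idea, written as plain userspace C rather than driver code. The struct eeprom_image type, the function signature and the eeprom_mac() helper are all hypothetical; the only layout assumption is that the MAC address sits in the first three 16-bit words, low byte first, as on e1000 parts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the missing EEPROM. */
struct eeprom_image {
	const uint16_t *words;	/* blob supplied by board support code */
	size_t nwords;
};

/* Copy 'count' 16-bit words starting at word 'offset' out of the blob.
 * Returns 0 on success, -1 if the request runs past the end. */
static int read_eeprom_virtual(const struct eeprom_image *img, size_t offset,
			       size_t count, uint16_t *out)
{
	assert(img != NULL && out != NULL);
	if (offset > img->nwords || count > img->nwords - offset)
		return -1;
	for (size_t i = 0; i < count; i++)
		out[i] = img->words[offset + i];
	return 0;
}

/* Assumed layout: MAC address in words 0..2, low byte of each word first. */
static int eeprom_mac(const struct eeprom_image *img, uint8_t mac[6])
{
	uint16_t words[3];

	if (read_eeprom_virtual(img, 0, 3, words) != 0)
		return -1;
	for (size_t i = 0; i < 3; i++) {
		mac[2 * i]     = (uint8_t)(words[i] & 0xff);
		mac[2 * i + 1] = (uint8_t)(words[i] >> 8);
	}
	return 0;
}
```

The point of the design is that callers keep asking for "EEPROM words" and never learn whether a real part answered, which is what would let the rest of the driver stay free of board-specific #ifdefs.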
Re: [RFC 1/4] kevent: core files.
On Monday 31 July 2006 17:00, David Miller wrote:
> So we'd have cases like this, assume we start with a full event queue:
>
>     thread A                    thread B
>
>     dequeue event
>     aha, new connection
>     accept()
>                                 register new kevent
>                                 queue is now full again
>     add kevent on new
>     connection
>
> At this point thread A doesn't have very many options when the kevent add fails. You cannot force this thread to read more events, since he may not be in a state where he is easily able to do so.

There has to be some thread that is responsible for reading events. Perhaps a reasonable thing for a blocked thread that cannot process events to do is to yield to one that can?
Re: Alternate to Ixia's ANVL test harness for tcp compliance.
On Thursday 20 July 2006 16:31, Jeff Garzik wrote:
> Piet Delaney wrote:
> > I wonder if Microsoft is providing the big challenge to porting the same GUI to linux. The world really doesn't need yet another Java language. Gosling is a Genius, I studied his X11 News Server enough to know first hand. Microsoft lost in court with their violating the Java standards and C sharp seems to be just another strategy in their bizarre attempt at world domination (Like the SCO mess).
>
> Runtime dynamic bytecode languages -- Java, Perl, Python, Ruby, ... -- do seem to be all the rage. As DaveM noted, though, C# is fully supported under Linux.
>
> Or maybe they could go for Gtk+, which has successfully been used to maintain complex GUI apps on both Windows and Linux. GIMP is the most notable example, but use of Gtk+, GLib, and mingw has meant that you can build Linux-ish apps on Windows without nasty porting layers like Cygwin.
>
> Jeff

Base C# support is pretty good in Mono, but you still have to be quite careful when creating a cross-platform application with it. Microsoft's version implements a number of libraries that are still not as well implemented in Mono (if at all). The toolkit libraries (from Windows Forms to the latest stuff in Vista) are a bit of a moving target. Plus, the .NET platform still lets developers interact with COM objects and other Windows-only code. Just because the GUI is written in C# does not mean that it has no Windows-only dependencies, unless it was implemented with portability in mind.
[PATCH 1/3] mv643xx fixes - Disable interrupts on all ports during initialization
That's the last time I hand-edit a patch - this is really more of a workaround, since I've learned that the interrupts should have been disabled by the firmware anyway. But, it doesn't hurt.

Updated description:

This patch disables interrupts on all ports during initialization. The current driver assumes that the firmware has already disabled all interrupts on all ports. We have encountered some boards (XES XPedite 6200, for instance) that do not always disable interrupts on all ethernet ports on a soft reset. This patch prevents a kernel panic if a packet is received before the DMA ring buffers are set up for a port on which interrupts were left enabled by the firmware.

Signed-off-by: Brent Cook [EMAIL PROTECTED]

Index: current/drivers/net/mv643xx_eth.c
===================================================================
--- current/drivers/net/mv643xx_eth.c	(revision 101)
+++ current/drivers/net/mv643xx_eth.c	(working copy)
@@ -777,6 +777,12 @@
 	unsigned int size;
 	int err;
 
+	/* Mask all interrupts on ethernet port */
+	mv_write(MV643XX_ETH_INTERRUPT_MASK_REG(port_num),
+			ETH_INT_MASK_ALL);
+	/* wait for previous write to complete */
+	mv_read(MV643XX_ETH_INTERRUPT_MASK_REG(port_num));
+
 	err = request_irq(dev->irq, mv643xx_eth_int_handler,
 			SA_SHIRQ | SA_SAMPLE_RANDOM, dev->name, dev);
 	if (err) {
[PATCH 1/3] mv643xx fixes - Disable interrupts on all ports during initialization
This patch disables interrupts on all ports during initialization. The current driver assumes that the firmware has already disabled all interrupts on all ports. We have encountered some boards (XES XPedite 6200, for instance) that do not always disable interrupts on a soft reset. This patch prevents a kernel panic if a packet is received before the DMA ring buffers are set up for a port on which interrupts were left enabled by the firmware.

Signed-off-by: Brent Cook [EMAIL PROTECTED]

Index: linux-2.6-bps/drivers/net/mv643xx_eth.c
===================================================================
--- linux-2.6-bps/drivers/net/mv643xx_eth.c	(revision 100)
+++ linux-2.6-bps/drivers/net/mv643xx_eth.c	(revision 101)
@@ -777,6 +777,13 @@
 	unsigned int size;
 	int err;
 
+	/* Mask all interrupts on ethernet port */
+	mv_write(MV643XX_ETH_INTERRUPT_MASK_REG(port_num),
+			ETH_INT_MASK_ALL);
+	/* wait for previous write to complete */
+	mv_read(MV643XX_ETH_INTERRUPT_MASK_REG(port_num));
+
 	err = request_irq(dev->irq, mv643xx_eth_int_handler,
 			SA_SHIRQ | SA_SAMPLE_RANDOM, dev->name, dev);
 	if (err) {
[PATCH 3/3] mv643xx fixes - Support BCM5461 PHY initialization
A couple of PrPMC MV64x60 boards have Broadcom BCM5461 PHYs rather than Marvell. This patch adds the initialization bits required by these PHYs. Prior to the 2.6.16 kernel there was infrastructure for the ppc platform support code to add extra flags to the ethernet ports during initialization, but driver simplification removed this ability, so I just put the required TX_CLK_DELAY flag in the main driver instead. This does not appear to have any effect on the Marvell PHY-based board that I tested it on.

Signed-off-by: Brent Cook [EMAIL PROTECTED]

--- a/drivers/net/mv643xx_eth.c	2006-05-04 00:51:47.0 -0500
+++ b/drivers/net/mv643xx_eth.c	2006-06-27 09:20:58.0 -0500
@@ -1788,6 +1795,7 @@
 		MV643XX_ETH_DISABLE_AUTO_NEG_SPEED_GMII|
 		MV643XX_ETH_DISABLE_AUTO_NEG_FOR_DUPLX	|
 		MV643XX_ETH_DO_NOT_FORCE_LINK_FAIL	|
+		MV643XX_ETH_TX_CLK_DELAY		|
 		MV643XX_ETH_SERIAL_PORT_CONTROL_RESERVED;
 
 	mv_write(MV643XX_ETH_PORT_SERIAL_CONTROL_REG(port_num), pscr);
@@ -2275,6 +2283,7 @@
 static void ethernet_phy_reset(unsigned int eth_port_num)
 {
 	unsigned int phy_reg_data;
+	unsigned int id1, id2;
 
 	/* Reset the PHY */
 	eth_port_read_smi_reg(eth_port_num, 0, &phy_reg_data);
@@ -2286,6 +2295,14 @@
 		udelay(1);
 		eth_port_read_smi_reg(eth_port_num, 0, &phy_reg_data);
 	} while (phy_reg_data & 0x8000);
+
+	/* Check PHY type */
+	eth_port_read_smi_reg(eth_port_num, 0x02, &id1);
+	eth_port_read_smi_reg(eth_port_num, 0x03, &id2);
+	if ((id1 == 0x0020) && ((id2 & 0xfff0) == 0x60c0)) {
+		/* BCM5461 fixup: Disable GTXC delay */
+		eth_port_write_smi_reg(eth_port_num, 0x1c, 0x8c00);
+	}
 }
 
 static void mv643xx_eth_port_enable_tx(unsigned int port_num,
[PATCH 0/3] mv643xx fixes
This is a series of fixes for problems that we've found with the Marvell MV643xx Gigabit MAC.

 * Disable interrupts on all ports during initialization
 * Increment TX bytes statistics counter
 * Support BCM5461 PHY initialization
[PATCH 2/3] mv643xx fixes - Increment TX bytes statistics counter
This fixes a simple typo from the last set of driver simplification patches. Now the TX bytes stat is actually incremented rather than just containing the size of the last packet sent.

Signed-off-by: Brent Cook [EMAIL PROTECTED]

--- a/drivers/net/mv643xx_eth.c	2006-05-04 00:51:47.0 -0500
+++ b/drivers/net/mv643xx_eth.c	2006-06-27 09:20:58.0 -0500
@@ -1211,7 +1218,7 @@
 	spin_lock_irqsave(&mp->lock, flags);
 
 	eth_tx_submit_descs_for_skb(mp, skb);
-	stats->tx_bytes = skb->len;
+	stats->tx_bytes += skb->len;
 	stats->tx_packets++;
 	dev->trans_start = jiffies;
 
Re: reminder, 2.6.18 window...
On Thursday 25 May 2006 02:23, Bill Fink wrote:
> On Wed, 24 May 2006, Jeff Garzik wrote:
> > Brent Cook wrote:
> > > Note that this is just clearing the hardware statistics on the interface, and would not require any kind of atomic_increment addition for interfaces that support that. It would be kind-of awkward to implement this on drivers that increment stats in software though (lo, vlan, br, etc.) This also brings up the question of resetting the stats for 'netstat -s'
> >
> > If you don't atomically clear the statistics, then you are leaving open a window where the stats could easily be corrupted, if the network interface is under load.
> >
> > This 'clearing' operation has implications on the rest of the statistics usage.
> >
> > More complexity, and breaking of apps, when we could just use the existing, working system? I'll take the "do nothing, break nothing, everything still works" route any day.
>
> I'll admit to not knowing all the intricacies of the kernel coding involved, but I don't offhand see how zeroing the stats would be significantly more complex than updating the stats during normal usage. But I'll have to leave that argument to the experts.

What it boils down to is that currently, only a single CPU or thread ever touches the stats at a time, so it doesn't have to lock them or do anything special to ensure that they continue incrementing. If you want to make sure that the statistics actually reset when you want them to, you have to account for this case:

  CPU0 reads current value from memory (increment)
  CPU1 writes 0 to current value in memory (reset)
  CPU0 writes incremented value to memory (increment complete)

Check out do_add_counters() in net/ipv4/netfilter/ip_tables.c to see what's required to do this reliably in the kernel. The current patch is fine if your hardware implements the required atomicity itself. Otherwise, you need a locking infrastructure to extend it to all network devices if you want zeroing to always work.

What I'm seeing here in response to this is that it doesn't matter if zeroing just _mostly_ works, which is what you would get if you didn't lock. Eh, I'm OK with that too, but I think people are worried about the bugs that would get filed by admins when zeroing the stats on cheap NIC x only works 90% of the time, and less often under load. Or not at all (not implemented in the driver.) Then you're back to the userspace solution, or to actually implementing stat locking / atomic ops.

- Brent
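To make the race above concrete, here is a userspace sketch using C11 atomics. This is not kernel code and not how any driver in the thread does it: the struct tx_stats type and function names are made up for illustration, and C11 <stdatomic.h> is swapped in for whatever locking or atomic ops a real driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software-maintained counters. */
struct tx_stats {
	atomic_uint_fast64_t tx_bytes;
	atomic_uint_fast64_t tx_packets;
};

static void stats_init(struct tx_stats *s)
{
	assert(s != NULL);
	atomic_init(&s->tx_bytes, 0);
	atomic_init(&s->tx_packets, 0);
}

/* Each increment is a single read-modify-write, so a concurrent reset
 * cannot land between the read and the write-back. */
static void stats_account_tx(struct tx_stats *s, uint64_t len)
{
	atomic_fetch_add_explicit(&s->tx_bytes, len, memory_order_relaxed);
	atomic_fetch_add_explicit(&s->tx_packets, 1, memory_order_relaxed);
}

/* Zero the byte counter and return what it held. The exchange is also
 * one atomic step, so no increment is lost or resurrected. */
static uint64_t stats_reset_tx_bytes(struct tx_stats *s)
{
	return atomic_exchange_explicit(&s->tx_bytes, 0, memory_order_relaxed);
}
```

With a plain `tx_bytes += len`, CPU0 could write back its stale sum after CPU1 stored 0, and the reset would silently vanish. Note that even here the two counters are reset independently, so bytes and packets are not guaranteed to be mutually consistent at the instant of a reset.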
Re: reminder, 2.6.18 window...
On Thursday 25 May 2006 12:59, Phil Dibowitz wrote:
> On Thu, May 25, 2006 at 08:05:37AM -0500, Brent Cook wrote:
> > > I'll admit to not knowing all the intricacies of the kernel coding involved, but I don't offhand see how zeroing the stats would be significantly more complex than updating the stats during normal usage. But I'll have to leave that argument to the experts.
> >
> > What it boils down to is that currently, only a single CPU or thread ever touches the stats at a time, so it doesn't have to lock them or do anything special to ensure that they continue incrementing. If you want to make sure that the statistics actually reset when you want them to, you have to account for this case:
> >
> >   CPU0 reads current value from memory (increment)
> >   CPU1 writes 0 to current value in memory (reset)
> >   CPU0 writes incremented value to memory (increment complete)
>
> Perhaps I'm missing something here, but these counters are only incremented in hardware... i.e. atomically.

No, you're right - I'm just thinking that once one driver has this ability, users are going to want it for all network devices, and implementation on some devices (namely virtual ones - lo, tun, tap, br, vlan) is trickier than just setting a register. Some hardware devices too - mv643xx_eth.c just increments the network stats in software, for instance.

Lockless software reset is fine though, as long as people understand the consequences - it's absolutely fine given the way I would use reset in my environment, YMMV.
Re: reminder, 2.6.18 window...
On Wednesday 24 May 2006 14:14, Phil Dibowitz wrote:
> On Wed, May 24, 2006 at 03:05:54PM -0400, Jeff Garzik wrote:
> > Phil Dibowitz wrote:
> > Given any method of clearing statistics across your cluster, I'm certain you can come up with a similar method of obtaining the current statistic (the baseline).
>
> Right, I'm aware there are other ways of doing this - I've written scripts to record hundreds of numbers, and then subtract them from each other. But those scripts are workarounds for a feature _lacking_ in the kernel. A feature that, as I've mentioned, is supported on any piece of networking gear (and of course, let's not forget there's a specific option in the kernel config *just* for "behave like a router").
>
> If my patch was invasive and broke things, I would understand the hesitation, but this is a feature that allows people to *choose* to do this if they need to, and the code is pretty self-contained.

I'm with you - this is a useful feature! But there aren't many other things I've found that can be cleared from the kernel, other than by reloading a module or running dmesg -c. I think the objection here isn't to this particular patch, but to the can of worms that it opens up.

Note that this is just clearing the hardware statistics on the interface, and would not require any kind of atomic_increment addition for interfaces that support that. It would be kind-of awkward to implement this on drivers that increment stats in software though (lo, vlan, br, etc.) This also brings up the question of resetting the stats for 'netstat -s'.

What would be great is if ifconfig, netstat and their ilk just had a -z flag instead. This would write a file to the local user's home directory with a stats snapshot, and then every subsequent run would auto-calculate against the snapshot. You'd also need some way of resetting this when the stats actually _do_ reset (driver reload, reboot) to avoid negative numbers.

That way, you can get what you want without having to write a bunch of fragile, awkward scripts, and the kernel isn't throwing away information either.

- Brent
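The core of the proposed -z behaviour is a subtraction against a saved baseline, with a check for the counter having restarted. A minimal sketch follows; counter_since_snapshot() is a hypothetical helper, not part of ifconfig or netstat, and loading or saving the snapshot file is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value to display for one counter, given the kernel's current value and
 * the baseline saved by an earlier "-z" run.
 *
 * If the current value is below the baseline, the counter must have been
 * restarted (driver reload, reboot), so the stale baseline is discarded
 * instead of producing a negative delta. */
static uint64_t counter_since_snapshot(uint64_t current, uint64_t *baseline)
{
	assert(baseline != NULL);
	if (current < *baseline)
		*baseline = 0;
	return current - *baseline;
}
```

One limitation of this check, stated as a caveat rather than a solved problem: a counter that wraps around looks the same as one that was reset, and a reset followed by enough traffic to pass the old baseline goes undetected.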
Re: [PATCH] expose vlan structure and defines to user space
On Wednesday 05 April 2006 10:40, David Acker wrote:
> Ben Greear wrote:
> > The reason I suggested to not make the size #defines visible is that it really doesn't help much. If you are receiving in user space, a more useful number is:
> >
> >   sizeof(vlan_eth_header) + MTU of interface
> >
> > Depending on the underlying NIC, you may still receive larger packets, so never assume you absolutely know the max size of pkt to be read: Check the return values from recvfrom, etc.
>
> I see your point. I will give this a bit to see if there are any other comments and then post a new patch that only exports the vlan_ethhdr structure.
>
> -Ack

Additionally, vlan headers may be nested, so the payload offset can still be greater than sizeof(vlan_ethhdr).
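To illustrate the nesting point: a receiver can find the payload by walking tags until it reaches a non-VLAN EtherType, rather than assuming a fixed header size. This is a userspace sketch under stated assumptions; the macro and function names are invented for the example, and only the 0x8100 tag type is recognized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_ADDR_LEN	6
#define VLAN_TAG_LEN	4		/* 2-byte tag type + 2-byte TCI */
#define TAG_TYPE_8021Q	0x8100

/* Return the offset of the first payload byte in an Ethernet frame,
 * skipping any number of stacked 802.1Q tags, or 0 if the frame is too
 * short to contain a complete header. */
static size_t eth_payload_offset(const uint8_t *frame, size_t len)
{
	size_t off = 2 * MAC_ADDR_LEN;	/* skip destination + source MAC */

	assert(frame != NULL);
	while (off + 2 <= len) {
		uint16_t type = (uint16_t)((frame[off] << 8) | frame[off + 1]);

		if (type != TAG_TYPE_8021Q)
			return off + 2;	/* payload follows the real EtherType */
		off += VLAN_TAG_LEN;	/* another tag: keep walking */
	}
	return 0;
}
```

An untagged frame yields 14, a singly tagged frame 18 (the size of a single-tag VLAN Ethernet header), and a doubly tagged frame 22, which is the case a fixed sizeof() would get wrong.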
Collisions statistics seem odd with -17-rc1
I just upgraded my local tree to 2.6.17-rc1 and see odd things with interface statistics; namely, the collisions stat appears to be uninitialized. I'm just wondering if anyone else is seeing this; it may be a busybox bug too. This is on a 32-bit ppc system.

[EMAIL PROTECTED]:~$ ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:12 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:805981868 txqueuelen:0
          RX bytes:1626 (1.5 KiB)  TX bytes:1626 (1.5 KiB)

mgmt0     Link encap:Ethernet  HWaddr 00:50:C2:27:63:BB
          inet addr:192.168.1.14  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5007 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3267 errors:0 dropped:0 overruns:0 carrier:0
          collisions:805981868 txqueuelen:0
          RX bytes:4218550 (4.0 MiB)  TX bytes:498617 (486.9 KiB)
          Interrupt:33

test0     Link encap:Ethernet  HWaddr 00:50:C2:27:63:BA
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
          collisions:805981868 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:1180 (1.1 KiB)
          Interrupt:32
What's the motivation behind ipt_hook: happy cracking messages
Just a guess, but doesn't this put a drag on apps that are legitimately using raw sockets?