s390 allmodconfig
Not sure who to blame for all of this...

net/bluetooth/hidp/Kconfig:4:warning: 'select' used by config symbol 'BT_HIDP' refer to undefined symbol 'HID'
net/mac80211/Kconfig:17:warning: 'select' used by config symbol 'MAC80211_LEDS' refer to undefined symbol 'NEW_LEDS'
net/mac80211/Kconfig:18:warning: 'select' used by config symbol 'MAC80211_LEDS' refer to undefined symbol 'LEDS_TRIGGERS'
drivers/net/Kconfig:1435:warning: 'select' used by config symbol 'B44' refer to undefined symbol 'SSB'
drivers/net/wireless/bcm43xx/Kconfig:5:warning: 'select' used by config symbol 'BCM43XX' refer to undefined symbol 'HW_RANDOM'
drivers/net/wireless/mac80211/bcm43xx/Kconfig:13:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCI' refer to undefined symbol 'SSB_PCIHOST'
drivers/net/wireless/mac80211/bcm43xx/Kconfig:14:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCI' refer to undefined symbol 'SSB_DRIVER_PCICORE'
drivers/net/wireless/mac80211/bcm43xx/Kconfig:27:warning: 'select' used by config symbol 'BCM43XX_MAC80211_PCMCIA' refer to undefined symbol 'SSB_PCMCIAHOST'
drivers/net/wireless/mac80211/bcm43xx/Kconfig:5:warning: 'select' used by config symbol 'BCM43XX_MAC80211' refer to undefined symbol 'SSB'

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: s390 allmodconfig
net/mac80211/ieee80211_led.c: In function 'ieee80211_led_init':
net/mac80211/ieee80211_led.c:38: error: invalid application of 'sizeof' to incomplete type 'struct led_trigger'
net/mac80211/ieee80211_led.c:43: error: dereferencing pointer to incomplete type
net/mac80211/ieee80211_led.c:44: warning: implicit declaration of function 'led_trigger_register'
net/mac80211/ieee80211_led.c:49: error: invalid application of 'sizeof' to incomplete type 'struct led_trigger'
net/mac80211/ieee80211_led.c:54: error: dereferencing pointer to incomplete type
net/mac80211/ieee80211_led.c: In function 'ieee80211_led_exit':
net/mac80211/ieee80211_led.c:64: warning: implicit declaration of function 'led_trigger_unregister'

akpm2:/usr/src/25> grep LED .config
CONFIG_NF_CONNTRACK_ENABLED=m
CONFIG_MAC80211_LEDS=y

Probably related to the Kconfig problems.
Re: Extensible hashing and RCU
On Sat, Feb 17, 2007 at 04:13:02PM +0300, Evgeniy Polyakov ([EMAIL PROTECTED]) wrote:
> > >I noticed in an LCA talk mention that apprently extensible hashing
> > >with RCU access is an unsolved problem. Here's an idea for solving it.
> > >
> >
> > Yes, I have been playing around with the same idea for
> > doing dynamic resizing of the TCP hashtable.
> >
> > Did a prototype "toy" implementation, and I have a
> > "half-finished" patch which resizes the TCP hashtable
> > at runtime. Hmmm, your mail may be the impetus to get
> > me to finally finish this thing
>
> Why anyone do not want to use trie - for socket-like loads it has
> exactly constant search/insert/delete time and scales as hell.

OK, I ran an analysis of linked-list and trie traversals and found that (at least on x86) one optimized list traversal step is about 4 (!) times faster than one bit lookup in a trie traversal (or actually one lookup in any binary tree-like structure). That is because trie traversal needs more instructions per lookup, and at least one additional branch which cannot be predicted.

Tests with rdtsc show that one bit lookup in a trie (actually any lookup in binary tree structures) is about 3-4 times slower than one lookup in a linked list.

Since a hash table usually has up to 4 elements in each hash entry, a competing binary tree/trie structure must reach an entry in one lookup, which is essentially impossible with usual tree/trie implementations.

Things change dramatically when the linked list becomes too long, but that should not happen with proper resizing of the hash table. Wildcard implementations also introduce additional requirements, which cannot be easily solved in hash tables.

So I take back my words about using a tree/trie implementation instead of a hash table for socket lookup.

Interested readers can find more details on tests, asm outputs and conclusions at:
http://tservice.net.ru/~s0mbre/blog/2007/03/01#2007_03_01

--
Evgeniy Polyakov
Re: need some help on a backport of r8169
[EMAIL PROTECTED] wrote, on Thu 01 Mar 2007 at 10:57:11AM:
> Hello Ueimor,
> [...]
> > Once you have logged the ifconfig/ethtool dump, you can try the serie
> > or the patch at:
> >
> > http://www.fr.zoreil.com/people/francois/backport/r8169/20070228-00
> Hum... ok I might have enough time to check it, not sure though, I
> have a point with my boss this morning.

Indeed, I wasn't able to test it yesterday. I won't be able to today either, since the hardware is needed for other tests. But don't worry, I haven't forgotten you: I'll test it as soon as I can, probably next week.

> >
> > Btw:
> >
> > [...dmesg dump...]
> > > Enabling fast FPU save and restore... done.
> > > Enabling unmasked SIMD FPU exception support... done.
> > > Checking 'hlt' instruction... OK.
> > > ACPI: setting ELCR to 0200 (from 0c08)
> > > NET: Registered protocol family 16
> > > PCI: PCI BIOS revision 3.00 entry at 0xf0031, last bus=2
> > > PCI: Using MMCONFIG
> >
> > Please disable MMCONFIG.
> In the BIOS?
> >
> > If you have any PCI latency option in your bios, set it to 64.
> I'm not the BIOS-master, I'll suggest it.
> >
> > --
> > Ueimor
>
> Sigerg
Re: [RFC] Arp announce (for Xen)
On Thu, 1 Mar 2007, Stephen Hemminger wrote:
> What about implementing the unused arp_announce flag on the inetdevice?
> Something like the following. Totally untested...

Looks like it either was there (and got removed) or was planned but never implemented.

If something like this goes in, it wouldn't hurt to do similar with IPv6 (RFC2461 section 7.2.6). There are very popular hardware-based routers which refresh their NDP caches only every 24 hours or 20 minutes (depending on the software version). Sending unsolicited NAs would eliminate traffic blackholing.

> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index e10794d..cefc339 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -1089,6 +1089,16 @@ static int inetdev_event(struct notifier
>  			}
>  		}
>  		ip_mc_up(in_dev);
> +		/* fallthru */
> +
> +	case NETDEV_CHANGEADDR:
> +		/* Send gratuitous ARP in case of address change or new device */
> +		if (IN_DEV_ARP_ANNOUNCE(in_dev))
> +			arp_send(ARPOP_REQUEST, ETH_P_ARP,
> +				 in_dev->ifa_list->ifa_address, dev,
> +				 in_dev->ifa_list->ifa_address, NULL,
> +				 dev->dev_addr, NULL);
> +		break;
>  	case NETDEV_DOWN:
>  		ip_mc_down(in_dev);

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
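The gratuitous ARP that the quoted patch sends can be pictured as an ARP request whose sender and target protocol addresses are both the interface's own IPv4 address. What follows is a minimal user-space sketch of that 28-byte payload, not kernel code. The struct name `arp_eth_ipv4` and the helper `build_gratuitous_arp` are hypothetical, the field layout follows RFC 826, and leaving the target hardware address zeroed is an assumption about what passing NULL as the last `arp_send()` argument produces on the wire.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ARP payload for Ethernet + IPv4 (RFC 826 layout, 28 bytes).
 * Only byte arrays are used so the compiler inserts no padding. */
struct arp_eth_ipv4 {
    uint8_t htype[2];  /* hardware type, 1 = Ethernet (network byte order) */
    uint8_t ptype[2];  /* protocol type, 0x0800 = IPv4 */
    uint8_t hlen;      /* hardware address length, 6 */
    uint8_t plen;      /* protocol address length, 4 */
    uint8_t oper[2];   /* operation, 1 = request */
    uint8_t sha[6];    /* sender hardware address */
    uint8_t spa[4];    /* sender protocol address */
    uint8_t tha[6];    /* target hardware address */
    uint8_t tpa[4];    /* target protocol address */
};

/* Hypothetical helper: fill in a gratuitous ARP request announcing
 * that `ip` lives at `mac`. Sender IP and target IP are the same. */
static void build_gratuitous_arp(struct arp_eth_ipv4 *p,
                                 const uint8_t mac[6],
                                 const uint8_t ip[4])
{
    assert(p != NULL && mac != NULL && ip != NULL);

    memset(p, 0, sizeof(*p));
    p->htype[1] = 1;        /* Ethernet */
    p->ptype[0] = 0x08;     /* IPv4 = 0x0800 */
    p->hlen = 6;
    p->plen = 4;
    p->oper[1] = 1;         /* ARP request */
    memcpy(p->sha, mac, 6);
    memcpy(p->spa, ip, 4);
    /* tha stays all-zero (assumption, see lead-in) */
    memcpy(p->tpa, ip, 4);  /* target IP == sender IP */
}
```

A caller would still have to wrap these bytes in an Ethernet frame sent to the broadcast address with EtherType 0x0806; that part is left out here.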
Re: Extensible hashing and RCU
On Friday 02 March 2007 09:52, Evgeniy Polyakov wrote: > Ok, I've ran an analysis of linked lists and trie traversals and found > that (at least on x86) optimized one list traversal is about 4 (!) > times faster than one bit lookup in trie traversal (or actually one > lookup in binary tree-like structure) - that is because of the fact > that trie traversal needs to have more instructions per lookup, and at > least one additional branch which can not be predicted. > > Tests with rdtsc shows that one bit lookup in trie (actually it is any > lookup in binary tree structures) is about 3-4 times slower than one > lookup in linked list. > > Since hash table usually has upto 4 elements in each hash entry, > competing binary tree/trie stucture must get an entry in one lookup, > which is essentially impossible with usual tree/trie implementations. > > Things dramatically change when linked list became too long, but it > should not happend with proper resizing of the hash table, wildcards > implementation also introduce additional requirements, which can not be > easily solved in hash tables. > > So I get my words about tree/trie implementation instead of hash table > for socket lookup back. > > Interested reader can find more details on tests, asm outputs and > conclusions at: > http://tservice.net.ru/~s0mbre/blog/2007/03/01#2007_03_01 Thank you for this report. 
(Still avoiding cache misses studies, while they obviously are the limiting factor)

Anyway, if data is in cache and you want optimum performance from your cpu, you may try to use an algorithm without conditional branches:
(well, 4 in this case for the whole 32 bits tests)

gcc -O2 -S -march=i686 test1.c

struct node {
        struct node *left;
        struct node *right;
        int value;
};
struct node *head;
int v1;

#define PASS2(bit) \
        n2 = n1->left; \
        right = n1->right; \
        if (value & (1<<bit)) \
                n2 = right; \
        n1 = n2->left; \
        right = n2->right; \
        if (value & (2<<bit)) \
                n1 = right;

main()
{
        int j;
        unsigned int value = v1;
        struct node *n1 = head, *n2, *right;
        for (j=0; j<4; ++j) {
                PASS2(0)
                PASS2(2)
                PASS2(4)
                PASS2(6)
                value >>= 8;
        }
        printf("result=%p\n", n1);
}

        .file   "test1.c"
        .section        .rodata.str1.1,"aMS",@progbits,1
.LC0:
        .string "result=%p\n"
        .text
        .p2align 4,,15
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        xorl    %ebx, %ebx
        pushl   %ecx
        subl    $16, %esp
        movl    v1, %ecx
        movl    head, %edx
        .p2align 4,,7
.L2:
        movl    4(%edx), %eax
        testb   $1, %cl
        cmove   (%edx), %eax
        testb   $2, %cl
        movl    4(%eax), %edx
        cmove   (%eax), %edx
        testb   $4, %cl
        movl    4(%edx), %eax
        cmove   (%edx), %eax
        testb   $8, %cl
        movl    4(%eax), %edx
        cmove   (%eax), %edx
        testb   $16, %cl
        movl    4(%edx), %eax
        cmove   (%edx), %eax
        testb   $32, %cl
        movl    4(%eax), %edx
        cmove   (%eax), %edx
        testb   $64, %cl
        movl    4(%edx), %eax
        cmove   (%edx), %eax
        testb   %cl, %cl
        movl    4(%eax), %edx
        cmovns  (%eax), %edx
        addl    $1, %ebx
        cmpl    $4, %ebx
        je      .L19
        shrl    $8, %ecx
        jmp     .L2
        .p2align 4,,7
.L19:
        movl    %edx, 4(%esp)
        movl    $.LC0, (%esp)
        call    printf
        addl    $16, %esp
        popl    %ecx
        popl    %ebx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .comm   head,4,4
        .comm   v1,4,4
        .ident  "GCC: (GNU) 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)"
        .section        .note.GNU-stack,"",@progbits
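The idea in the message above, replacing the per-bit conditional branch with two unconditional loads followed by a select, can be restated in a more compact form. This is only an illustrative sketch: `descend` is a hypothetical helper, it assumes every node on the path has both children, and whether a compiler really emits a conditional move for the select depends on the compiler, its flags and the target.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *left;
    struct node *right;
    int value;
};

/* Walk `bits` levels down a binary trie, consuming key bits from the
 * least significant bit upwards. Both children are loaded before the
 * choice is made, so the only data-dependent step is a select that a
 * compiler may turn into cmov instead of a conditional jump. */
static struct node *descend(struct node *n, unsigned int key, int bits)
{
    assert(n != NULL && bits >= 0 && bits <= 32);

    for (int i = 0; i < bits; ++i) {
        struct node *l = n->left;   /* unconditional load */
        struct node *r = n->right;  /* unconditional load */
        n = (key & 1u) ? r : l;     /* select, no required branch */
        key >>= 1;
    }
    return n;
}
```

The loop counter still produces a branch, but it is a predictable one; the unpredictable, key-dependent branch is what the select is meant to avoid.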
Re: CLOCK_MONOTONIC datagram timestamps by the kernel
On Friday 02 March 2007 10:26, John wrote:
> Eric Dumazet wrote:
> > Anyway, if you want to play, you can apply this patch on top of
> > linux-2.6.21-rc2 (nanosecond resolution infrastructure needs 2.6.21)
> > I let you do the adjustments for rt kernel.
>
> Why does it require 2.6.21?

Well, this patch was done on top of the latest kernel for obvious practical reasons, but you can probably adapt it to the kernel of your choice.
Re: CLOCK_MONOTONIC datagram timestamps by the kernel
Eric Dumazet wrote:

> John wrote:
>
>> Consider an idle Linux 2.6.20-rt8 system, equipped with a single
>> PCI-E gigabit Ethernet NIC, running on a modern CPU (e.g. Core 2 Duo
>> E6700). All this system does is time stamp 1000 packets per second.
>>
>> Are you claiming that this platform *cannot* handle most packets
>> within less than 1 microsecond of their arrival?
>
> Yes I claim it.
>
> You expect too much of this platform, unless "most" means 10 % for you ;)

By "most" I meant more than 50%.

Has someone tried to measure interrupt latency in Linux? I'd like to plot the distribution of network IRQ to interrupt handler latencies.

> If you replace "1 us" by "50 us", then yes, it probably can do it, if
> "most" means 99%, (not 99.999 %)

I think we need cold, hard numbers at this point :-)

> Anyway, if you want to play, you can apply this patch on top of
> linux-2.6.21-rc2 (nanosecond resolution infrastructure needs 2.6.21)
> I let you do the adjustments for rt kernel.

Why does it require 2.6.21?

> This patch converts sk_buff timestamp to use new nanosecond infra
> (added in 2.6.21)

Is this mentioned somewhere in the 2.6.21-rc1 ChangeLog?

http://kernel.org/pub/linux/kernel/v2.6/testing/ChangeLog-2.6.21-rc1

Regards.
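Since the thread asks for hard numbers, one small piece that is easy to measure from user space is how finely two back-to-back CLOCK_MONOTONIC reads can be told apart. The sketch below uses only POSIX `clock_gettime()`; the helper names `ts_to_ns` and `min_monotonic_delta_ns` are hypothetical. It says nothing about IRQ-to-handler latency, which would need kernel-side instrumentation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a single nanosecond count. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + (int64_t)ts->tv_nsec;
}

/* Read CLOCK_MONOTONIC twice in a row, `rounds` times, and return the
 * smallest difference seen, in nanoseconds. This bounds the cost plus
 * granularity of one clock read as observed from user space. */
static int64_t min_monotonic_delta_ns(int rounds)
{
    assert(rounds > 0);

    int64_t best = INT64_MAX;
    for (int i = 0; i < rounds; ++i) {
        struct timespec a, b;
        clock_gettime(CLOCK_MONOTONIC, &a);
        clock_gettime(CLOCK_MONOTONIC, &b);
        int64_t d = ts_to_ns(&b) - ts_to_ns(&a);
        if (d < best)
            best = d;
    }
    return best;
}
```

Collecting all the deltas instead of only the minimum would give the kind of distribution plot asked for above, at least for the user-space side of the path.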
Re: s390 allmodconfig
On Fri, 2007-03-02 at 00:25 -0800, Andrew Morton wrote:
> net/mac80211/ieee80211_led.c: In function 'ieee80211_led_init':
> net/mac80211/ieee80211_led.c:38: error: invalid application of 'sizeof' to
> incomplete type 'struct led_trigger'
> net/mac80211/ieee80211_led.c:43: error: dereferencing pointer to incomplete
> type
> net/mac80211/ieee80211_led.c:44: warning: implicit declaration of function
> 'led_trigger_register'
> net/mac80211/ieee80211_led.c:49: error: invalid application of 'sizeof' to
> incomplete type 'struct led_trigger'
> net/mac80211/ieee80211_led.c:54: error: dereferencing pointer to incomplete
> type
> net/mac80211/ieee80211_led.c: In function 'ieee80211_led_exit':
> net/mac80211/ieee80211_led.c:64: warning: implicit declaration of function
> 'led_trigger_unregister'
>
> akpm2:/usr/src/25> grep LED .config
> CONFIG_NF_CONNTRACK_ENABLED=m
> CONFIG_MAC80211_LEDS=y
>
> Probably related to the Kconfig problems.

Almost certainly. Someone is building some LED trigger/driver without the LED core enabled which is what that Kconfig warning was about.

Nobody's ever mentioned this driver to me...

Richard
(LED Maintainer)
Re: Extensible hashing and RCU
On Fri, Mar 02, 2007 at 10:56:23AM +0100, Eric Dumazet ([EMAIL PROTECTED]) wrote: > On Friday 02 March 2007 09:52, Evgeniy Polyakov wrote: > > > Ok, I've ran an analysis of linked lists and trie traversals and found > > that (at least on x86) optimized one list traversal is about 4 (!) > > times faster than one bit lookup in trie traversal (or actually one > > lookup in binary tree-like structure) - that is because of the fact > > that trie traversal needs to have more instructions per lookup, and at > > least one additional branch which can not be predicted. > > > > Tests with rdtsc shows that one bit lookup in trie (actually it is any > > lookup in binary tree structures) is about 3-4 times slower than one > > lookup in linked list. > > > > Since hash table usually has upto 4 elements in each hash entry, > > competing binary tree/trie stucture must get an entry in one lookup, > > which is essentially impossible with usual tree/trie implementations. > > > > Things dramatically change when linked list became too long, but it > > should not happend with proper resizing of the hash table, wildcards > > implementation also introduce additional requirements, which can not be > > easily solved in hash tables. > > > > So I get my words about tree/trie implementation instead of hash table > > for socket lookup back. > > > > Interested reader can find more details on tests, asm outputs and > > conclusions at: > > http://tservice.net.ru/~s0mbre/blog/2007/03/01#2007_03_01 > > Thank you for this report. (Still avoiding cache misses studies, while they > obviously are the limiting factor) > > Anyqay, if data is in cache and you want optimum performance from your cpu, > you may try to use an algorithm without conditional branches : > (well 4 in this case for the whole 32 bits tests) Tests were always for no-cache-miss case. 
I also ran them in kernel mode (to eliminate tlb flushes per rescheduling and to take into account that kernel tlb covers 8mb while userspace only 4k), but results were essentially the same (modulo several percent). I only tested the trie; in my implementation its memory usage is smaller than that of a hash table for 2^20 entries.

> gcc -O2 -S -march=i686 test1.c
>
> struct node {
> struct node *left;
> struct node *right;
> int value;
> };
> struct node *head;
> int v1;
>
> #define PASS2(bit) \
> n2 = n1->left; \
> right = n1->right; \
> if (value & (1<<bit)) \
> n2 = right; \
> n1 = n2->left; \
> right = n2->right; \
> if (value & (2<<bit)) \
> n1 = right;
>
> main()
> {
> int j;
> unsigned int value = v1;
> struct node *n1 = head, *n2, *right;
> for (j=0; j<4; ++j) {
> PASS2(0)
> PASS2(2)
> PASS2(4)
> PASS2(6)
> value >>= 8;
> }
> printf("result=%p\n", n1);
> }

This one resulted in 10*4 and 2*4 branches per loop. So in total 32 branches (instead of 64 in the simpler code) and 160 instructions (instead of 128 in the simpler code). Given that, according to the tests, a branch takes two times longer to execute (it is quite a strange statement, but I must admit that I did not read the x86 processor manual at all, only ppc32), we do not get any gain for a 32bit value (32 lookups): 64*2+128 in the old case, 32*2+160 in the new one.

I also have an advanced trie implementation, which caches values in nodes if there are no child entries, and it _greatly_ decreases the number of lookups and memory usage for smaller sets, but in the long run, with a huge amount of entries in the trie, it does not matter since only the lowest layer caches values.

--
Evgeniy Polyakov
Re: s390 allmodconfig
On Fri, 2007-03-02 at 00:25 -0800, Andrew Morton wrote:

> Probably related to the Kconfig problems.

Yeah, it is. s390 is funny, it doesn't include drivers/Kconfig, I don't think any of us would have suspected that.

There doesn't seem to be a reason why it shouldn't have drivers/leds though. drivers/ssb I don't know about, does s390 have pci or pcmcia? And the bluetooth stuff is also plain weird, I suppose s390 really should include drivers/hid/Kconfig :)

Same with drivers/char that includes hw_random.

Is there any reason it isn't including drivers/Kconfig?

I can offer the patch below to fix the LED trigger problem; it's probably cleaner to depend on LEDS_TRIGGERS rather than selecting it and NEW_LEDS.

johannes

--- wireless-dev.orig/net/mac80211/Kconfig	2007-03-02 11:18:45.464333268 +0100
+++ wireless-dev/net/mac80211/Kconfig	2007-03-02 11:33:24.534333268 +0100
@@ -13,9 +13,7 @@ config MAC80211
 
 config MAC80211_LEDS
 	bool "Enable LED triggers"
-	depends on MAC80211
-	select NEW_LEDS
-	select LEDS_TRIGGERS
+	depends on MAC80211 && LEDS_TRIGGERS
 	---help---
 	  This option enables a few LED triggers for different
 	  packet receive/transmit events.
Re: s390 allmodconfig
On Fri, 02 Mar 2007 10:32:32 + Richard Purdie <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-03-02 at 00:25 -0800, Andrew Morton wrote:
> > net/mac80211/ieee80211_led.c: In function 'ieee80211_led_init':
> > net/mac80211/ieee80211_led.c:38: error: invalid application of 'sizeof' to
> > incomplete type 'struct led_trigger'
> > net/mac80211/ieee80211_led.c:43: error: dereferencing pointer to incomplete
> > type
> > net/mac80211/ieee80211_led.c:44: warning: implicit declaration of function
> > 'led_trigger_register'
> > net/mac80211/ieee80211_led.c:49: error: invalid application of 'sizeof' to
> > incomplete type 'struct led_trigger'
> > net/mac80211/ieee80211_led.c:54: error: dereferencing pointer to incomplete
> > type
> > net/mac80211/ieee80211_led.c: In function 'ieee80211_led_exit':
> > net/mac80211/ieee80211_led.c:64: warning: implicit declaration of function
> > 'led_trigger_unregister'
> >
> > akpm2:/usr/src/25> grep LED .config
> > CONFIG_NF_CONNTRACK_ENABLED=m
> > CONFIG_MAC80211_LEDS=y
> >
> > Probably related to the Kconfig problems.
>
> Almost certainly. Someone is building some LED trigger/driver without
> the LED core enabled which is what that Kconfig warning was about.
>
> Nobody's ever mentioned this driver to me...
>

It's a mountain of new wireless code in the just-released 2.6.21-rc2-mm1.
Re: s390 allmodconfig
On Fri, 02 Mar 2007 11:38:24 +0100 Johannes Berg <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-03-02 at 00:25 -0800, Andrew Morton wrote:
>
> > Probably related to the Kconfig problems.
>
> Yeah, it is. s390 is funny, it doesn't include drivers/Kconfig, I don't
> think anybody of us would have suspected that.
>
> There doesn't seem to be a reason why it shouldn't have drivers/leds
> though. drivers/ssb I don't know about, does s390 have pci or pcmcia?

No, s390 doesn't have PCI.

> And the bluetooth stuff is also plain weird, I suppose s390 really
> should include drivers/hid/Kconfig :)
>
> Same with drivers/char that includes hw_random.
>
> Is there any reason it isn't including drivers/Kconfig?
>

s390 is weird ;) There's no way it'll support any of the hardware which you're working on (until they release the s390 laptop). So all we really want to do here is to avoid breaking s390 allmodconfig.

> I can offer below patch to fix the LED trigger problem, it's probably
> cleaner to depend on LEDS_TRIGGERS rather than selecting it and
> NEW_LEDS.
>
> johannes
>
> --- wireless-dev.orig/net/mac80211/Kconfig 2007-03-02 11:18:45.464333268
> +0100
> +++ wireless-dev/net/mac80211/Kconfig 2007-03-02 11:33:24.534333268 +0100
> @@ -13,9 +13,7 @@ config MAC80211
>
> config MAC80211_LEDS
> bool "Enable LED triggers"
> - depends on MAC80211
> - select NEW_LEDS
> - select LEDS_TRIGGERS
> + depends on MAC80211 && LEDS_TRIGGERS
> ---help---
> This option enables a few LED triggers for different
> packet receive/transmit events.

OK, I'll try that, thanks.
Re: s390 allmodconfig
On Fri, 2007-03-02 at 03:06 -0800, Andrew Morton wrote:

> No, s390 doesn't have PCI.

Ok.

> s390 is weird ;) There's no way it'll support any of the hardware which
> you're
> working on (until they release the s390 laptop). So all we really want to
> do here is to avoid breaking s390 allmodconfig.

Alright. I think we'll probably have to make bcm43xx and b44 depend on SSB instead of selecting it like the LED trigger stuff below.

But I don't see why s390 can't include hw random, led trigger or even hid, those are all software features afaict.

> OK, I'll try that, thanks.

Not that it'll actually help get the compile through... bcm43xx will drop fail and bluetooth probably as well.

johannes
Re: s390 allmodconfig
On Fri, 02 Mar 2007 12:11:48 +0100 Johannes Berg <[EMAIL PROTECTED]> wrote:

> On Fri, 2007-03-02 at 03:06 -0800, Andrew Morton wrote:
>
> > No, s390 doesn't have PCI.
>
> Ok.
>
> > s390 is weird ;) There's no way it'll support any of the hardware which
> > you're
> > working on (until they release the s390 laptop). So all we really want to
> > do here is to avoid breaking s390 allmodconfig.
>
> Alright. I think we'll probably have to make bcm43xx and b44 depend on
> SSB instead of selecting it like the LED trigger stuff below.
>
> But I don't see why s390 can't include hw random, led trigger or even
> hid, those are all software features afaict.
>
>
> > OK, I'll try that, thanks.
>
> Not that it'll actually help get the compile through... bcm43xx will
> drop fail and bluetooth probably as well.
>

OK, thanks.

fwiw, http://userweb.kernel.org/~akpm/cross-compilers/ has an s390 cross-compiler binary.
[PATCH] [TCP]: FRTO undo response falls back to ratehalving one if ECEd
Undoing ssthresh is disabled in fastretrans_alert whenever FLAG_ECE is set by clearing prior_ssthresh. This clearing does not protect FRTO because FRTO operates before fastretrans_alert. Moving the clearing of prior_ssthresh earlier seems to be a suboptimal solution to the FRTO case because then FLAG_ECE will cause a second ssthresh reduction in try_to_open (the first occurred when FRTO was entered). So instead, FRTO falls back immediately to the rate halving response, which switches TCP to CA_CWR state preventing the latter reduction of ssthresh.

If the first ECE arrived before the ACK after which FRTO is able to decide RTO as spurious, prior_ssthresh is already cleared. Thus no undoing for ssthresh occurs. Besides, FLAG_ECE should be set also in the following ACKs resulting in rate halving response that sees TCP already in CA_CWR, which again prevents an extra ssthresh reduction on that round-trip.

If the first ECE arrived before RTO, ssthresh has already been adapted and prior_ssthresh remains cleared on entry because TCP is in CA_CWR (the same applies also to a case where FRTO is entered more than once and ECE comes in the middle).

I believe that after this patch, FRTO should be ECN-safe and even able to take advantage of synergy benefits.

Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---
 net/ipv4/tcp_input.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index dc221a3..bdd6172 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2592,9 +2592,12 @@ static void tcp_ratehalving_spur_to_resp
 	tp->high_seq = tp->frto_highmark; 	/* Smoother w/o this? - ij */
 }
 
-static void tcp_undo_spur_to_response(struct sock *sk)
+static void tcp_undo_spur_to_response(struct sock *sk, int flag)
 {
-	tcp_undo_cwr(sk, 1);
+	if (flag&FLAG_ECE)
+		tcp_ratehalving_spur_to_response(sk);
+	else
+		tcp_undo_cwr(sk, 1);
 }
 
 /* F-RTO spurious RTO detection algorithm (RFC4138)
@@ -2680,7 +2683,7 @@ static int tcp_process_frto(struct sock
 		return 1;
 	} else /* frto_counter == 2 */ {
 		switch (sysctl_tcp_frto_response) {
-		case 2: tcp_undo_spur_to_response(sk); break;
+		case 2: tcp_undo_spur_to_response(sk, flag); break;
 		case 1: tcp_conservative_spur_to_response(tp); break;
 		default: tcp_ratehalving_spur_to_response(sk); break;
 		}
--
1.4.2
[PATCH v2] [TCP]: FRTO undo response falls back to ratehalving one if ECEd
Undoing ssthresh is disabled in fastretrans_alert whenever FLAG_ECE is set by clearing prior_ssthresh. The clearing does not protect FRTO because FRTO operates before fastretrans_alert. Moving the clearing of prior_ssthresh earlier seems to be a suboptimal solution to the FRTO case because then FLAG_ECE will cause a second ssthresh reduction in try_to_open (the first occurred when FRTO was entered). So instead, FRTO falls back immediately to the rate halving response, which switches TCP to CA_CWR state preventing the latter reduction of ssthresh.

If the first ECE arrived before the ACK after which FRTO is able to decide RTO as spurious, prior_ssthresh is already cleared. Thus no undoing for ssthresh occurs. Besides, FLAG_ECE should be set also in the following ACKs resulting in rate halving response that sees TCP is already in CA_CWR, which again prevents an extra ssthresh reduction on that round-trip.

If the first ECE arrived before RTO, ssthresh has already been adapted and prior_ssthresh remains cleared on entry because TCP is in CA_CWR (the same applies also to a case where FRTO is entered more than once and ECE comes in the middle).

High_seq must not be touched after tcp_enter_cwr because CWR round-trip calculation depends on it.

I believe that after this patch, FRTO should be ECN-safe and even able to take advantage of synergy benefits.

Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
---

Of course I forgot to fix also the high_seq thing I had in mind last evening, so here is this again now with it too.

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index dc221a3..6b268dc 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2587,14 +2587,15 @@ static void tcp_conservative_spur_to_res
  */
 static void tcp_ratehalving_spur_to_response(struct sock *sk)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
 	tcp_enter_cwr(sk, 0);
-	tp->high_seq = tp->frto_highmark; 	/* Smoother w/o this? - ij */
 }
 
-static void tcp_undo_spur_to_response(struct sock *sk)
+static void tcp_undo_spur_to_response(struct sock *sk, int flag)
 {
-	tcp_undo_cwr(sk, 1);
+	if (flag&FLAG_ECE)
+		tcp_ratehalving_spur_to_response(sk);
+	else
+		tcp_undo_cwr(sk, 1);
 }
 
 /* F-RTO spurious RTO detection algorithm (RFC4138)
@@ -2680,7 +2681,7 @@ static int tcp_process_frto(struct sock
 		return 1;
 	} else /* frto_counter == 2 */ {
 		switch (sysctl_tcp_frto_response) {
-		case 2: tcp_undo_spur_to_response(sk); break;
+		case 2: tcp_undo_spur_to_response(sk, flag); break;
 		case 1: tcp_conservative_spur_to_response(tp); break;
 		default: tcp_ratehalving_spur_to_response(sk); break;
 		}
--
1.4.2
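The decision this patch adds can be restated as a small pure function, which may make the fallback easier to see than the diff does. This is a user-space model under stated assumptions, not the kernel code: `pick_spur_response`, `enum spur_response` and `spur_response_examples` are hypothetical names, and the `FLAG_ECE` value here is only an illustrative stand-in for the kernel's flag bit.

```c
#include <assert.h>

#define FLAG_ECE 0x40  /* illustrative stand-in for the kernel's ACK flag bit */

/* Which spurious-RTO response ends up being applied. */
enum spur_response {
    RESP_RATEHALVING,   /* models tcp_ratehalving_spur_to_response() */
    RESP_CONSERVATIVE,  /* models tcp_conservative_spur_to_response() */
    RESP_UNDO           /* models tcp_undo_cwr(sk, 1) */
};

/* Model of the switch on sysctl_tcp_frto_response after the patch:
 * response 2 (undo) falls back to rate halving when the ACK carried ECE. */
static enum spur_response pick_spur_response(int frto_response, int flag)
{
    switch (frto_response) {
    case 2:
        return (flag & FLAG_ECE) ? RESP_RATEHALVING : RESP_UNDO;
    case 1:
        return RESP_CONSERVATIVE;
    default:
        return RESP_RATEHALVING;
    }
}

/* A few cases spelled out as checks. */
static void spur_response_examples(void)
{
    assert(pick_spur_response(2, 0) == RESP_UNDO);
    assert(pick_spur_response(2, FLAG_ECE) == RESP_RATEHALVING);
    assert(pick_spur_response(1, FLAG_ECE) == RESP_CONSERVATIVE);
}
```

Only the undo response changes behaviour when ECE is seen; the other two settings take the same path as before the patch.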
Re: Network activity LED trigger
Hi All,

Some more thoughts. The IDE activity LED trigger is currently triggered when a function is called in the IDE writing/reading routines.

In a similar way, we could call the trigger function in net/core/dev.c in netif_receive_skb and netif_rx?

I was also thinking that some NICs already have LEDs, so it is not necessary for those models to "overload" the user with lights everywhere.

Regards,
Florian

On Thursday 1 March 2007, Florian Fainelli wrote:
> Hi All,
>
> I have been talking a bit with Richard, who is the LED API maintainer, and
> a LED trigger based on network activity would be something great.
>
> There are somethings that concern the network stack :
>
> - should we specify if the network driver is allowed to contribute to
> the LED activity, just like it is done for random generation, at compile
> time
>
> - I would like to trigger the LED based on one or several network
> interfaces, maybe specify via sysfs which interface triggers which LED,
> and also maybe differentiate the layer-2 activity from the layer-3
> activity for instance
>
> - A led driver could by default be bound to a network driver, or an
> interface name
>
> As it could be very intrusive in the network stack, you might want to
> specify a bit more how you imagine a network activity trigger.
>
> Thanks
Re: Network activity LED trigger
Where are these LEDs typically located? Are you talking about LEDs on a network card for example? can you light them up in different colors?

cheers,
jamal

On Fri, 2007-02-03 at 13:58 +0100, Florian Fainelli wrote:
> Hi All,
>
> Some more thoughts. The IDE activity LED trigger is currently triggered when
> a
> function is called in the IDE writing/reading routines.
>
> In a similar way, we could call the trigger function in net/core/dev.c in
> netif_receive_skb and netif_rx ?
>
> I was also thinking that some network NIC already have LEDs, so it is not
> necessary for those models to "overload" the user with lights everywhere.
>
> R
Re: Network activity LED trigger
Hi,

On Friday 2 March 2007, jamal wrote:
> Where are these LEDs typically located? Are you talking about LEDs on a
> network card for example? can you light them up in different colors?

Those LEDs are typically controlled by GPIO lines and visible on the front of the device. This is mostly targeted at embedded devices, for which you do not necessarily want to assign an LED to a given network interface.

>
> cheers,
> jamal
>
> On Fri, 2007-02-03 at 13:58 +0100, Florian Fainelli wrote:
> > Hi All,
> >
> > Some more thoughts. The IDE activity LED trigger is currently triggered
> > when a function is called in the IDE writing/reading routines.
> >
> > In a similar way, we could call the trigger function in net/core/dev.c in
> > netif_receive_skb and netif_rx ?
> >
> > I was also thinking that some network NIC already have LEDs, so it is not
> > necessary for those models to "overload" the user with lights everywhere.
> >
> > R

--
Regards,
Florian Fainelli
-
5, rue Charles Fourier
Chambre 1202
91011 Evry
http://www.alphacore.net
(+33) 01 60 76 64 21
(+33) 06 09 02 64 95
-
Association MiNET
http://www.minet.net
-
Institut National des Télécommunication
http://www.int-evry.fr/telecomint
-
[PATCH 2/2] tc35815 driver update (part 2)
More updates for tc35815 driver, including: * TX4939 support. * NETPOLL support. * NAPI support. (disabled by default) * Reduce memcpy on receiving. * PM support. * Many cleanups and bugfixes. Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]> --- drivers/net/tc35815.c | 827 +++--- include/linux/pci_ids.h |1 2 files changed, 632 insertions(+), 196 deletions(-) diff --git a/drivers/net/tc35815.c b/drivers/net/tc35815.c index 0cf1f87..ec888db 100644 --- a/drivers/net/tc35815.c +++ b/drivers/net/tc35815.c @@ -38,9 +38,33 @@ * Add workaround for 100MHalf HUB. * 1.22Minor fix. * 1.23Minor cleanup. + * 1.24Remove tc35815_setup since new stype option + * ("tc35815.speed=10", etc.) can be used for 2.6 kernel. + * 1.25TX4939 support. + * 1.26Minor cleanup. + * 1.27Move TX4939 PCFG.SPEEDn control code out from this driver. + * Cleanup init_dev_addr. (NETDEV_REGISTER event notifier + * can overwrite dev_addr) + * support ETHTOOL_GPERMADDR. + * 1.28Minor cleanup. + * 1.29support netpoll. + * 1.30Minor cleanup. + * 1.31NAPI support. (disabled by default) + * Use DMA_RxAlign_2 if possible. + * Do not use PackedBuffer. + * Cleanup. + * 1.32Fix free buffer management on non-PackedBuffer mode. + * 1.33Fix netpoll build. + * 1.34Fix netpoll locking. "BH rule" for NAPI is not enough with + * netpoll, hard_start_xmit might be called from irq context. + * PM support. 
*/ -#define DRV_VERSION"1.23" +#ifdef TC35815_NAPI +#define DRV_VERSION"1.34-NAPI" +#else +#define DRV_VERSION"1.34" +#endif static const char *version = "tc35815.c:v" DRV_VERSION "\n"; #define MODNAME"tc35815" @@ -71,23 +95,27 @@ static const char *version = "tc35815.c: #define GATHER_TXINT /* On-Demand Tx Interrupt */ #define WORKAROUND_LOSTCAR #define WORKAROUND_100HALF_PROMISC +/* #define TC35815_USE_PACKEDBUFFER */ typedef enum { TC35815CF = 0, TC35815_NWU, + TC35815_TX4939, } board_t; /* indexed by board_t, above */ -static struct { +static const struct { const char *name; } board_info[] __devinitdata = { { "TOSHIBA TC35815CF 10/100BaseTX" }, { "TOSHIBA TC35815 with Wake on LAN" }, + { "TOSHIBA TC35815/TX4939" }, }; -static struct pci_device_id tc35815_pci_tbl[] = { - {PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_TC35815CF, PCI_ANY_ID, PCI_ANY_ID, 0, 0, TC35815CF }, - {PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_TC35815_NWU, PCI_ANY_ID, PCI_ANY_ID, 0, 0, TC35815_NWU }, +static const struct pci_device_id tc35815_pci_tbl[] = { + {PCI_DEVICE(PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_TC35815CF), .driver_data = TC35815CF }, + {PCI_DEVICE(PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_TC35815_NWU), .driver_data = TC35815_NWU }, + {PCI_DEVICE(PCI_VENDOR_ID_TOSHIBA_2, PCI_DEVICE_ID_TOSHIBA_TC35815_TX4939), .driver_data = TC35815_TX4939 }, {0,} }; MODULE_DEVICE_TABLE (pci, tc35815_pci_tbl); @@ -140,6 +168,11 @@ struct tc35815_regs { * Bit assignments */ /* DMA_Ctl bit asign --- */ +#define DMA_RxAlign0x00c0 /* 1:Reception Alignment */ +#define DMA_RxAlign_1 0x0040 +#define DMA_RxAlign_2 0x0080 +#define DMA_RxAlign_3 0x00c0 +#define DMA_M66EnStat 0x0008 /* 1:66MHz Enable State*/ #define DMA_IntMask0x0004 /* 1:Interupt mask */ #define DMA_SWIntReq 0x0002 /* 1:Software Interrupt request*/ #define DMA_TxWakeUp 0x0001 /* 1:Transmit Wake Up */ @@ -351,6 +384,8 @@ struct BDesc { Int_SSysErrEn | Int_RMasAbtEn | Int_RTargAbtEn | \ Int_STargAbtEn | \ Int_BLExEn | 
Int_FDAExEn) /* maybe 0xb7f*/ +#define DMA_CTL_CMDDMA_BURST_SIZE +#define HAVE_DMA_RXALIGN(lp) likely((lp)->boardtype != TC35815CF) /* Tuning parameters */ #define DMA_BURST_SIZE 32 @@ -358,12 +393,28 @@ struct BDesc { #define TX_THRESHOLD_MAX 1536 /* used threshold with packet max byte for low pci transfer ability.*/ #define TX_THRESHOLD_KEEP_LIMIT 10 /* setting threshold max value when overrun error occured this count. */ +/* 16 + RX_BUF_NUM * 8 + RX_FD_NUM * 16 + TX_FD_NUM * 32 <= PAGE_SIZE*FD_PAGE_NUM */ +#ifdef TC35815_USE_PACKEDBUFFER #define FD_PAGE_NUM 2 -#define FD_PAGE_ORDER 1 -/* 16 + RX_BUF_PAGES * 8 + RX_FD_NUM * 16 + TX_FD_NUM * 32 <= PAGE_SIZE*2 */ -#define RX_BUF_PAGES 8 /* >= 2 */ +#define RX_BUF_NUM 8 /* >= 2 */ #define RX_FD_NUM 250 /* >= 32 */ #define TX_FD_NUM 128 +#define RX_BUF_SIZEPAGE_SIZE +#else /* TC35815_USE_PACKEDBUFFER */ +#define FD_PA
[PATCH] NET : convert network timestamps to ktime_t
We currently use a special structure (struct skb_timeval) and plain 'struct timeval' to store packet timestamps in sk_buffs and struct sock. This has some drawbacks : - Fixed resolution of micro second. - Waste of space on 64bit platforms where sizeof(struct timeval)=16 I suggest using ktime_t that is a nice abstraction of high resolution time services, currently capable of nanosecond resolution. As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits a 8 byte shrink of this structure on 64bit architectures. Some other structures also benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...) Once this ktime infrastructure adopted, we can more easily provide nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or SO_TIMESTAMPNS/SCM_TIMESTAMPNS) Note : this patch includes a bug correction in compat_sock_get_timestamp() where a "err = 0;" was missing (so this syscall returned -ENOENT instead of 0) Signed-off-by: Eric Dumazet <[EMAIL PROTECTED]> CC: Stephen Hemminger <[EMAIL PROTECTED]> CC: John find <[EMAIL PROTECTED]> include/linux/skbuff.h | 26 -- include/net/sock.h | 18 +++ net/bridge/netfilter/ebt_ulog.c |6 +++-- net/compat.c| 15 net/core/dev.c | 19 +++- net/core/sock.c | 16 +++-- net/econet/af_econet.c |2 - net/ipv4/ip_fragment.c |6 ++--- net/ipv4/netfilter/ip_queue.c |6 +++-- net/ipv4/netfilter/ipt_ULOG.c |8 -- net/ipv6/exthdrs.c |2 - net/ipv6/netfilter/ip6_queue.c |6 +++-- net/ipv6/netfilter/nf_conntrack_reasm.c |6 ++--- net/ipv6/reassembly.c |6 ++--- net/ipx/af_ipx.c|4 +-- net/netfilter/nfnetlink_log.c |8 +++--- net/netfilter/nfnetlink_queue.c |8 +++--- net/packet/af_packet.c |8 -- 18 files changed, 80 insertions(+), 90 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 4ff3940..24dcbb3 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -27,6 +27,7 @@ #include #include #include #include +#include #define HAVE_ALLOC_SKB /* For the drivers to 
know */ #define HAVE_ALIGNABLE_SKB /* Ditto 8)*/ @@ -156,11 +157,6 @@ struct skb_shared_info { #define SKB_DATAREF_SHIFT 16 #define SKB_DATAREF_MASK ((1 << SKB_DATAREF_SHIFT) - 1) -struct skb_timeval { - u32 off_sec; - u32 off_usec; -}; - enum { SKB_FCLONE_UNAVAILABLE, @@ -233,7 +229,7 @@ struct sk_buff { struct sk_buff *prev; struct sock *sk; - struct skb_timeval tstamp; + ktime_t tstamp; struct net_device *dev; struct net_device *input_dev; @@ -1360,26 +1356,14 @@ extern void skb_add_mtu(int mtu); */ static inline void skb_get_timestamp(const struct sk_buff *skb, struct timeval *stamp) { - stamp->tv_sec = skb->tstamp.off_sec; - stamp->tv_usec = skb->tstamp.off_usec; + *stamp = ktime_to_timeval(skb->tstamp); } -/** - * skb_set_timestamp - set timestamp of a skb - * @skb: skb to set stamp of - * @stamp: pointer to struct timeval to get stamp from - * - * Timestamps are stored in the skb as offsets to a base timestamp. - * This function converts a struct timeval to an offset and stores - * it in the skb. 
- */ -static inline void skb_set_timestamp(struct sk_buff *skb, const struct timeval *stamp) +static inline void __net_timestamp(struct sk_buff *skb) { - skb->tstamp.off_sec = stamp->tv_sec; - skb->tstamp.off_usec = stamp->tv_usec; + skb->tstamp = ktime_get_real(); } -extern void __net_timestamp(struct sk_buff *skb); extern __sum16 __skb_checksum_complete(struct sk_buff *skb); diff --git a/include/net/sock.h b/include/net/sock.h index 2c7d60c..19f6540 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -244,7 +244,7 @@ #define sk_prot __sk_common.skc_prot struct sk_filter*sk_filter; void*sk_protinfo; struct timer_list sk_timer; - struct timeval sk_stamp; + ktime_t sk_stamp; struct socket *sk_socket; void*sk_user_data; struct page *sk_sndmsg_page; @@ -1307,19 +1307,19 @@ static inline int sock_intr_errno(long t static __inline__ void sock_recv_timestamp(struct msghdr *msg, struct sock *sk, struct sk_buff *skb) { - struct timeval stamp; + ktime_t kt = skb->tstamp; - skb_get_timestamp(skb, &stamp); if (sock_flag(sk, SOCK_RCVTSTAMP)) {
Re: Network activity LED trigger
On Fri, 2007-02-03 at 15:16 +0100, Florian Fainelli wrote:
> Hi,
>
> On Friday 2 March 2007, jamal wrote:
> > Where are these LEDs typically located? Are you talking about LEDs on a
> > network card for example? can you light them up in different colors?
>
> Those LEDS are typically controlled by GPIO lines visible in front of the
> device. It is mostly targeted to embedded devices for which you do not
> necessarily want to assign a LED to a given network interface
>

Ah, ok - I've worked with a not-so-embedded board that had something that
was accessible via the ICH; I recall writing a user-space program to
handle it. So instead of calling this just LED, probably find a more
descriptive name for it; for example, GPIO-LED.

Those things are tricky to have in generic code though, no? I.e. each
chipset/board will have different address mappings on where to read/write
for a specific LED. So you need to deal with that problem without requiring
a kernel change every time an address changes. I actually found exactly
similar board (some manufacturer) but the firmware was slightly different.

Here's my view of what would be useful:
Have them accessible via the kernel, but also have an API from user
space. This way user space apps can control the LED, but if I wanted to
do it from the kernel I could as well. In my case I was actually
monitoring the health of a daemon; the LED would be off if the daemon was
not running, green if it was happy, yellow if semi-healthy and red if it
was in trouble.

Here are some operations/messages I can see that are useful, which you
probably already have in your API:

turn on LED at #x color somecolor
turn off LED at #y
query LED info at #x
dump all LEDs on board - think of this as a discovery
flicker LED at #z at frequency y color green
maybe even: "I am a wireless card with no LED, I claim LED #x"
which is matched by "tell me if anyone owns LED code"

In other words, if you just provide mechanisms, let people write the
policies. This way if I wanted to tie it to my eth0 I can.

Hope that helps.

cheers,
jamal
Re: [Bugme-new] [Bug 8107] New: dev->header_cache_update has a random value
Andrew Morton <[EMAIL PROTECTED]> writes:

>> However, in drivers/net/wan/hdlc_cisco.c, in function
>> static int cisco_ioctl(struct net_device *dev, struct ifreq *ifr),
>> where dev->hard_header is assigned a valid function, and
>> dev->hard_header_cache is assigned a known value (NULL),
>> dev->header_cache_update is not set to a known value:

Right, it seems I was never aware of the existence of
dev->header_cache_update. I wonder where the non-NULL value comes from.
Never mind.

> diff -puN drivers/net/wan/hdlc_cisco.c~cisco_ioctl-initialise-header_cache_update drivers/net/wan/hdlc_cisco.c
> --- a/drivers/net/wan/hdlc_cisco.c~cisco_ioctl-initialise-header_cache_update
> +++ a/drivers/net/wan/hdlc_cisco.c
> @@ -366,6 +366,7 @@ static int cisco_ioctl(struct net_device
>  		dev->hard_start_xmit = hdlc->xmit;
>  		dev->hard_header = cisco_hard_header;
>  		dev->hard_header_cache = NULL;
> +		dev->header_cache_update = NULL;
>  		dev->type = ARPHRD_CISCO;
>  		dev->flags = IFF_POINTOPOINT | IFF_NOARP;
>  		dev->addr_len = 0;
> _

ACK, I think it's the best place.

Is it OK to leave this (and hard_header_cache) set to a random value if
dev->hard_header = NULL (as with other protocols)?
--
Krzysztof Halasa
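The point of the one-line fix can be shown with a small userspace model. This is only a sketch: `struct model_dev` and every `model_` function are hypothetical stand-ins, not the kernel's `struct net_device` or its API, and it assumes that callers test the optional callback for NULL before invoking it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the three header callbacks discussed above. */
struct model_dev {
    int  (*hard_header)(struct model_dev *dev);
    int  (*hard_header_cache)(struct model_dev *dev);
    void (*header_cache_update)(struct model_dev *dev);
};

/* Pretend protocol header builder; returns a made-up header length. */
int model_cisco_hard_header(struct model_dev *dev)
{
    (void)dev;
    return 4;
}

/*
 * Attach the protocol. Every callback is set to a known value, so
 * whatever was left in the structure earlier cannot leak through.
 */
void model_cisco_attach(struct model_dev *dev)
{
    dev->hard_header = model_cisco_hard_header;
    dev->hard_header_cache = NULL;
    dev->header_cache_update = NULL;   /* the assignment the patch adds */
}

/*
 * A caller that only uses the optional callback when it is set.
 * Returns 1 if the callback ran, 0 if it was skipped.
 */
int model_update_cached_header(struct model_dev *dev)
{
    if (dev->header_cache_update == NULL)
        return 0;
    dev->header_cache_update(dev);
    return 1;
}
```

Without the added assignment, a stale non-NULL pointer would pass the NULL test in `model_update_cached_header()` and be called.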
Re: Network activity LED trigger
On Fri, 2007-03-02 at 10:16 -0500, jamal wrote:
> Heres my view of what would be useful:
> Have them accessible via the kernel, but also have an API from user
> space. This way user space apps can control the LED, but if i wanted to
> do it from the kernel i could as well. In my case i was actually
> monitoring the health of a daemon; it would show off if the daemon was
> not running, green if it was happy, yellow if semi-healthy and Red if it
> was in trouble.

We already have this API, see drivers/leds ;-)

> here are some operations/messages i can see that are useful which you
> probably already have in your API:
>
> turn on LED at #x color somecolor
> turn off LED at #y
> query LED info at #x
> dump all LEDs on board - think of this as a discovery
> flicker LED at #z at frequency y color green
> maybe even: "I am a wireless card with no LED, I claim LED #x"
> which is matched by "tell me if anyone owns LED code"
>
> In other words, if you just provide mechanims let people write the
> policies.
> This way if i wanted to tie it to my eth0 i can.

We have LEDs which show up in sysfs and can be controlled by userspace
from there. They can also choose to be controlled by kernel LED
'triggers'; for example, we have an IDE disk trigger which shows
activity on IDE disks. Florian would like to see a network trigger.

The LED trigger code is quite generic and designed to have little impact
on the subsystem it's added to, at least in terms of code. As always,
there will be some runtime overhead though. Ultimately it depends how
complex you make the trigger (e.g. how many options it has) and where
and how you hook it into the network subsystem. I know little about the
network subsystem so this is something others will have to advise on.

Cheers,

Richard
(LED Maintainer)
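For the userspace side described above, here is a minimal sketch of driving an LED through sysfs. It assumes the LED class layout of `<base>/<led>/brightness` and `<base>/<led>/trigger`, with `/sys/class/leds` as the usual base. The function names are hypothetical helpers, and the base directory is a parameter so the code can be tried against an ordinary directory.

```c
#include <stdio.h>

/*
 * Write `value` to the file <base>/<led>/<attr>.
 * Returns 0 on success, -1 on any failure.
 */
int led_write_attr(const char *base, const char *led,
                   const char *attr, const char *value)
{
    char path[512];
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "%s/%s/%s", base, led, attr);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path does not fit in the buffer */

    f = fopen(path, "w");
    if (f == NULL)
        return -1;              /* no such LED, or no permission */

    if (fputs(value, f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Switch an LED off (brightness 0) or fully on (brightness 255). */
int led_set(const char *base, const char *led, int on)
{
    return led_write_attr(base, led, "brightness", on ? "255" : "0");
}

/* Hand the LED to a kernel trigger, or pass "none" to detach it. */
int led_set_trigger(const char *base, const char *led, const char *trigger)
{
    return led_write_attr(base, led, "trigger", trigger);
}
```

On a real board one would call, for instance, `led_set("/sys/class/leds", "some-led-name", 1)`, where the LED name is whatever the board's driver registered; a daemon health monitor like the one jamal describes could be layered on top of these two calls.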
Re: [git patches] net driver fixes
On Thu, 1 Mar 2007, Kok, Auke wrote:
> Linus Torvalds wrote:
> >
> > Ok, here's an interesting one: my e1000 card no longer worked for a while.
> >
> > The green link-light blinks on/off once a second, and in time to that, my
> > dmesg fills up with an endless supply of
> >
> >   e1000: eth0: e1000_watchdog: NIC Link is Down
> >   e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
> >   e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
> >
> > and networking obviously doesn't actually work.
>
> Just out of curiosity, which e1000 chipset+motherboard are you running this
> on?

The kernel prints out:

  e1000: :00:19.0: e1000_probe: (PCI Express:2.5Gb/s:Width x1) 00:16:76:c7:eb:fe
  e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection

and lspci says:

  00:19.0 Ethernet controller: Intel Corporation 82566DM Gigabit Network Connection (rev 02)
    Subsystem: Intel Corporation Unknown device 0001
    Flags: bus master, fast devsel, latency 0, IRQ 506
    Memory at e040 (32-bit, non-prefetchable) [size=128K]
    Memory at e0424000 (32-bit, non-prefetchable) [size=4K]
    I/O ports at 20c0 [size=32]
    Capabilities:
  00: 86 80 4a 10 07 04 10 00 02 00 00 02 00 00 00 00
  10: 00 00 40 e0 00 40 42 e0 c1 20 00 00 00 00 00 00
  20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 01 00
  30: 00 00 00 00 c8 00 00 00 00 00 00 00 0b 01 00 00

It's an Intel system (Host bridge: Intel Corporation 82Q963/Q965) with
integrated graphics: PCI ID 8086:2990 (rev 02) for the host bridge.

DMI info isn't very interesting, but it's an all-Intel board:

  OEM-specific Type
    Strings:
      Intel_ASF
      Intel_ASF_001
  ..
  Base Board Information
    Manufacturer: Intel Corporation
    Product Name: DQ965GF
    Version: AAD41676-305
    Serial Number: BQGF635009R2
  ...
  BIOS Information
    Vendor: Intel Corp.
    Version: CO96510J.86A.4462.2006.0804.2059
    Release Date: 08/04/2006

so it's all-intel chipset, all-intel board, and all-intel BIOS ;)

> there have been problems reported with AMT2 on several chipsets (AMT2 is
> not supported under linux, unlike AMT1), and having it enabled in the BIOS
> produces this phenomenon.

Is there some way to at least disable AMT2 from the Linux driver (ie I
assume this is some issue of Intel not documenting it all - but maybe you
can add a "turn off that bit" to the affected chip).

If I'm not the only one to see it, it's obviously not just my personal
ethernet switch bug, but apparently the e1000 becoming confused by some
link detection event (and powering down the switch probably just gets it
out of its confusion).

		Linus
Re: [PATCH] NetLabel: Verify sensitivity level has a valid CIPSO mapping
On Wednesday, February 28 2007 3:01:31 pm Paul Moore wrote:
> The current CIPSO engine has a problem where it does not verify that the
> given sensitivity level has a valid CIPSO mapping when the "std" CIPSO DOI
> type is used. The end result is that bad packets are sent on the wire
> which should have never been sent in the first place. This patch corrects
> this problem by verifying the sensitivity level mapping similar to what is
> done with the category mapping. This patch also changes the returned error
> code in this case to -EPERM to better match what the category mapping
> verification code returns.
>
> Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
> ---
>  net/ipv4/cipso_ipv4.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

I probably should have been more clear in the original patch posting ... this
is a bugfix patch which I believe should go into 2.6.21 (as well as the
-stable tree, but I know they like to see it hit Linus' tree first).

--
paul moore
linux security @ hp
Re: [git patches] net driver fixes
Linus Torvalds wrote:
> On Thu, 1 Mar 2007, Kok, Auke wrote:
>
> and lspci says:
>
>   00:19.0 Ethernet controller: Intel Corporation 82566DM Gigabit Network Connection (rev 02)
>
> DMI info isn't very interesting, but it's an all-Intel board:
>
> so it's all-intel chipset, all-intel board, and all-intel BIOS ;)

It's like the devil plays with it. We just discussed adding a piece of text
about this issue to our README.

>> there have been problems reported with AMT2 on several chipsets (AMT2 is
>> not supported under linux, unlike AMT1), and having it enabled in the BIOS
>> produces this phenomenon.
>
> Is there some way to at least disable AMT2 from the Linux driver (ie I
> assume this is some issue of Intel not documenting it all - but maybe you
> can add a "turn off that bit" to the affected chip).

Our suggestion is (IOW will be in the README) to turn AMT2 off completely in
the BIOS, but I'll investigate whether your suggestion is possible. It may be
another workaround, but this one indeed hurts.

> If I'm not the only one to see it, it's obviously not just my personal
> ethernet switch bug, but apparently the e1000 becoming confused by some
> link detection event (and powering down the switch probably just gets it
> out of its confusion).

No, this fits the description of this issue perfectly. I'll get right on it,
and I owe you a patch for the `e1000: not ready for irq` problem too, which
seems to hold out after tests...

Cheers,

Auke
Re: Network activity LED trigger
On Fri, 2007-02-03 at 16:03 +, Richard Purdie wrote:
> On Fri, 2007-03-02 at 10:16 -0500, jamal wrote:
> We already have this API, see drivers/leds ;-)

Very cool ;->
I was not aware of the existence of this API. Actually I don't think it
was available around 2.6.10.

> We have LEDs which show up in sysfs and can be controlled by userspace
> from there. They can also choose to be controlled by kernel LED
> 'triggers', for example. we have an IDE disk trigger which shows up
> activity on IDE disks. Florian would like to see a network trigger.
>

This literally covers most of what I wanted; it may be too late to get
rid of that user space program, but it is something I see you already
support ;->

> The LED trigger code is quite generic and designed to have little impact
> on the subsystem its added to, at least in terms of code. As always,
> there will be some runtime overhead though. Ultimately it depends how
> complex you make the trigger (eg. how many options it has) and where

Well, give me pointers and I will send you a patch for a board I
currently use:
http://download.intel.com/design/telecom/techspec/9635.pdf
which has a GPIO LED.
I take it I would have to write a "driver" using your API?

> and how you hook it into the network subsystem.
> I know little about the
> network subsystem so this is something others will have to advise on.

Other people may have different opinions: I can't think of something
useful from a network perspective, mostly because you can't make it
generic enough, i.e. some boards will have LEDs for their NICs and some
won't, just as some boards have activity LEDs for their IDE disks.
IOW, I think general purpose LEDs will probably be very dependent on the
shipping product.

Other than that, great work!

cheers,
jamal
Re: [PATCH] NET : convert network timestamps to ktime_t
On Fri, 2 Mar 2007 15:38:41 +0100
Eric Dumazet <[EMAIL PROTECTED]> wrote:

> We currently use a special structure (struct skb_timeval) and plain 'struct
> timeval' to store packet timestamps in sk_buffs and struct sock.
>
> This has some drawbacks :
> - Fixed resolution of micro second.
> - Waste of space on 64bit platforms where sizeof(struct timeval)=16
>
> I suggest using ktime_t that is a nice abstraction of high resolution time
> services, currently capable of nanosecond resolution.
>
> As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits a 8 byte
> shrink of this structure on 64bit architectures. Some other structures also
> benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct
> frag_queue in ipv6/reassembly.c, ...)

This is even better. Also comparing ktime_t's is easier if some code
needs to do that.
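The size and comparison arguments can be illustrated in userspace. The sketch below models a timestamp as a plain signed 64-bit nanosecond count; that is a simplifying assumption, not the kernel's actual `ktime_t` definition, and all `model_` names are hypothetical. It handles non-negative timestamps only.

```c
#include <stdint.h>
#include <sys/time.h>

#define MODEL_NSEC_PER_SEC  1000000000LL
#define MODEL_NSEC_PER_USEC 1000LL

/* Userspace stand-in for a ktime-style timestamp: nanoseconds in 8 bytes. */
typedef int64_t model_ktime_t;

/* Widen a microsecond-resolution timeval to nanoseconds. */
model_ktime_t model_timeval_to_ktime(struct timeval tv)
{
    return (model_ktime_t)tv.tv_sec * MODEL_NSEC_PER_SEC +
           (model_ktime_t)tv.tv_usec * MODEL_NSEC_PER_USEC;
}

/* Narrow back to a timeval; sub-microsecond digits are truncated. */
struct timeval model_ktime_to_timeval(model_ktime_t kt)
{
    struct timeval tv;

    tv.tv_sec = kt / MODEL_NSEC_PER_SEC;
    tv.tv_usec = (kt % MODEL_NSEC_PER_SEC) / MODEL_NSEC_PER_USEC;
    return tv;
}

/* Ordering is a single integer comparison instead of a two-field one. */
int model_ktime_before(model_ktime_t a, model_ktime_t b)
{
    return a < b;
}
```

On a typical 64-bit platform `sizeof(struct timeval)` is 16 while the 64-bit count is 8, which is the saving the patch description refers to.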
SWS for rcvbuf < MTU
Hello,

this is a rare corner case met by one of HP partners on 2.4.20 on IA64.
Inspecting the sources of the latest 2.6.20.1 (net/ipv4/tcp_output.c) we can
see that the bug is still there. Here is a description of the bug and the
suggested fix.

The problem occurs when the remote host (not necessarily Linux - in our case
it was Solaris) does not implement SWS avoidance on sender side. If Linux
connection socket has rcvbuf mtu. But if we use small rcvbuf (set by
SO_RCVBUF), we can go into SWS mode.

Let us for simplicity look only at the case when we don't have WS enabled. If
we have free_space above full_space/2, we reach the following section:

        /* Don't do rounding if we are using window scaling, since the
         * scaled window will not line up with the MSS boundary anyway.
         */
        window = tp->rcv_wnd;
        if (tp->rx_opt.rcv_wscale) {
        } else {
                /* Get the largest window that is a nice multiple of mss.
                 * Window clamp already applied above.
                 * If our current window offering is within 1 mss of the
                 * free space we just keep it. This prevents the divide
                 * and multiply from happening most of the time.
                 * We also don't do any window rounding when the free space
                 * is too small.
                 */
(1)             if (window <= free_space - mss || window > free_space)
                        window = (free_space/mss)*mss;
        }

        return window;

What happens if we have a small tp->rcv_wnd and rcvbuf <= mss? In this case
condition (1) is almost always false and as a result we'll return unmodified
'window' set to tp->rcv_wnd. If tp->rcv_wnd is small, it can be reused over
and over again.

For the case rcvbuf <= mss __tcp_select_window() returns:

  0             if we have free_space < full_space/2    OK
  mss           if rcvbuf is empty                      OK
  tp->rcv_wnd   in other case                           Bad

If there is no SWS avoidance on sender side, we can see Linux advertising the
same small rcv_wnd over and over again.

The problem here is that we never advertise one-half the receiver's buffer
space as described e.g. in "TCP/IP Illustrated" by Stevens (v.1, Chapter 22.3):

  "The normal algorithm is for the receiver not to advertise a larger window
  than it is currently advertising (which can be 0) until the window can be
  increased by either one full-sized segment (i.e. the MSS being received)
  or by one-half the receiver's buffer space, whichever is smaller"

The fix.

We have not been able to reproduce the problem inside HP as it is unclear
what conditions are needed to bring system into SWS mode (this needs very
special event timing). HP customer was seeing it every 2-3 days while running
a custom application (Solaris<->Linux) that was running with low priority on
a busy host running other custom applications with SCHED_RR. After going into
SWS mode, his application stayed in it until restarted.

We provided to customer a fix for 2.4.20 only (used by customer in
production) by adding another test and returning rcvbuf/2 when needed:

--- net/ipv4/tcp_output.c.orig	Wed May  3 20:40:43 2006
+++ net/ipv4/tcp_output.c	Tue Jan 30 14:24:56 2007
@@ -641,6 +641,7 @@
  * Note, we don't "adjust" for TIMESTAMP or SACK option bytes.
  * Regular options like TIMESTAMP are taken into account.
  */
+static const char *SWS_id_string="@#SWS-fix-2";
 u32 __tcp_select_window(struct sock *sk)
 {
 	struct tcp_opt *tp = &sk->tp_pinfo.af_tcp;
@@ -682,6 +683,9 @@
 	window = tp->rcv_wnd;
 	if (window <= free_space - mss || window > free_space)
 		window = (free_space/mss)*mss;
+/* A fix for small rcvbuf [EMAIL PROTECTED] */
+	else if (mss == full_space && window < full_space/2)
+		window = full_space/2;
 
 	return window;
 }

Customer has confirmed that this resolves the problem and decreases CPU usage
by his custom application - even when there is no SWS.

This is a rare corner case and most users will never meet it. But as the fix
is trivial, I think it makes sense to include it in upstream sources.

Regards,
Alex

--
Alexandre Sidorenko             email: [EMAIL PROTECTED]
Global Solutions Engineering:   Unix Networking
Hewlett-Packard (Canada)
Re: [PATCH] spidernet: Fix problem sending IP fragments
On Fri 2.3.2007 00:34, Linas Vepstas wrote:
> On Thu, Mar 01, 2007 at 04:52:54PM -0600, Chris Engel wrote:
> > I tried to apply this patch to 2.6.21-rc2 and CHECKSUM_HW appears
> > to be changed to CHECKSUM_COMPLETE

Oops. I did not test this on the actual 2.6.21-rc2 before sending it. It
worked fine for me on 2.6.18. In the meantime I tested the patch below on
2.6.21.

> The use of CHECKSUM_HW was replaced by CHECKSUM_PARTIAL and
> CHECKSUM_COMPLETE on a case-by-case basis, in the patch series leading
> up to 2.6.19. In this case, I'm not sure which should have been
> used.

In fact CHECKSUM_COMPLETE seems to be used on the receiving side, while
CHECKSUM_PARTIAL is the one to be used while sending frames. Thus the latter
is the one to choose.

> Norbert, can you resubmit a patch that applies to a more recent
> kernel? p.s. your emailer replaced tabs by spaces ...

so here's the new one:

Fix problem sending IP fragments on spidernet.

Signed-off-by: Norbert Eicker <[EMAIL PROTECTED]>
---
diff --git a/drivers/net/spider_net.c b/drivers/net/spider_net.c
index 3b91af8..e3019d5 100644
--- a/drivers/net/spider_net.c
+++ b/drivers/net/spider_net.c
@@ -719,7 +719,7 @@ spider_net_prepare_tx_descr(struct spide
 			SPIDER_NET_DESCR_CARDOWNED | SPIDER_NET_DMAC_NOCS;
 	spin_unlock_irqrestore(&chain->lock, flags);
 
-	if (skb->protocol == htons(ETH_P_IP))
+	if (skb->protocol == htons(ETH_P_IP) && skb->ip_summed == CHECKSUM_PARTIAL)
 		switch (skb->nh.iph->protocol) {
 		case IPPROTO_TCP:
 			hwdescr->dmac_cmd_status |= SPIDER_NET_DMAC_TCP;
Re: s390 allmodconfig
On Fri, 2007-03-02 at 12:11 +0100, Johannes Berg wrote:
> On Fri, 2007-03-02 at 03:06 -0800, Andrew Morton wrote:
> > s390 is weird ;) There's no way it'll support any of the hardware which
> > you're working on (until they release the s390 laptop). So all we really
> > want to do here is to avoid breaking s390 allmodconfig.

Well, I would not say "weird" but different. None of the usual device
attachments is present on a s390. That includes memory mapped i/o (!).

> Alright. I think we'll probably have to make bcm43xx and b44 depend on
> SSB instead of selecting it like the LED trigger stuff below.
>
> But I don't see why s390 can't include hw random, led trigger or even
> hid, those are all software features afaict.

True. I'm still sitting on a couple of patches that make s390 use the
standard drivers/Kconfig. The downside of these patches is that I have
to add a lot of "depends on !S390" all over the place.

> > OK, I'll try that, thanks.
>
> Not that it'll actually help get the compile through... bcm43xx will
> drop fail and bluetooth probably as well.

No bcm43xx, no bluetooth on s390..

--
blue skies,
  Martin.

Martin Schwidefsky
Linux for zSeries Development & Services
IBM Deutschland Entwicklung GmbH

"Reality continues to ruin my life." - Calvin.
Re: [RFC] Arp announce (for Xen)
Pekka Savola wrote: On Thu, 1 Mar 2007, Stephen Hemminger wrote: What about implementing the unused arp_announce flag on the inetdevice? Something like the following. Totally untested... Looks like it either was there (and got removed) or was planned but never implemented. IN_DEV_ARP_ANNOUNCE is in 2.6.18, at least..used in arp_solicit in arp.c I really hope this didn't get removed because I find it very useful! But, you could certainly add another sysctl... Thanks, Ben -- Ben Greear <[EMAIL PROTECTED]> Candela Technologies Inc http://www.candelatech.com
Re: SWS for rcvbuf < MTU
Alex Sidorenko wrote:
[snip]
> --- net/ipv4/tcp_output.c.orig	Wed May  3 20:40:43 2006
> +++ net/ipv4/tcp_output.c	Tue Jan 30 14:24:56 2007
> @@ -641,6 +641,7 @@
>   * Note, we don't "adjust" for TIMESTAMP or SACK option bytes.
>   * Regular options like TIMESTAMP are taken into account.
>   */
> +static const char *SWS_id_string="@#SWS-fix-2";
>  u32 __tcp_select_window(struct sock *sk)
>  {
>  	struct tcp_opt *tp = &sk->tp_pinfo.af_tcp;
> @@ -682,6 +683,9 @@
>  	window = tp->rcv_wnd;
>  	if (window <= free_space - mss || window > free_space)
>  		window = (free_space/mss)*mss;
> +/* A fix for small rcvbuf [EMAIL PROTECTED] */
> +	else if (mss == full_space && window < full_space/2)
> +		window = full_space/2;
>
>  	return window;
>  }

Good analysis of the problem, but the patch does not look quite right.
In particular, you can't ever announce a zero window. :) I think this
attached patch does the correct SWS avoidance.

Thanks,
  -John

Do receiver-side SWS avoidance for rcvbuf < MSS.

Signed-off-by: John Heffner <[EMAIL PROTECTED]>

---
commit 38d33181c93a28cf7fb2f9f3377305a04636c054
tree 503f8a9de6e78694bae9fc2eb1c9dd5d26a0b5ed
parent 562aa1d4c6a874373f9a48ac184f662fbbb06a04
author John Heffner <[EMAIL PROTECTED]> Fri, 02 Mar 2007 13:47:44 -0500
committer John Heffner <[EMAIL PROTECTED]> Fri, 02 Mar 2007 13:47:44 -0500

 net/ipv4/tcp_output.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index dc15113..688b955 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1607,6 +1607,9 @@ u32 __tcp_select_window(struct sock *sk)
 		 */
 		if (window <= free_space - mss || window > free_space)
 			window = (free_space/mss)*mss;
+		else if (mss == full_space &&
+			 free_space > window + full_space/2)
+			window = free_space;
 	}
 
 	return window;
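To compare the quoted code with the two proposed fixes, here is a simplified userspace model of the final branch of the window selection for the case without window scaling. It is a sketch under stated assumptions: `model_select_window` and `enum sws_variant` are hypothetical names, mss is assumed to be clamped to full_space (which the `mss == full_space` test in both patches suggests), and memory pressure, rcv_ssthresh clamping and window scaling are all left out.

```c
/* Which version of the final branch to model. */
enum sws_variant {
    SWS_ORIGINAL,    /* the code as quoted from tcp_output.c */
    SWS_SIDORENKO,   /* 2.4.20 fix: fall back to full_space/2 */
    SWS_HEFFNER      /* later patch: open the window up to free_space */
};

/*
 * Return the window to advertise, given the currently advertised window
 * (rcv_wnd), the free and total receive buffer space, and the mss.
 */
int model_select_window(int rcv_wnd, int free_space, int full_space,
                        int mss, enum sws_variant variant)
{
    int window = rcv_wnd;

    if (mss > full_space)
        mss = full_space;       /* rcvbuf <= MSS: clamp mss to the buffer */
    if (mss <= 0)
        return 0;               /* guard for the model: avoid dividing by 0 */

    /* Less than half the buffer free and not even one mss: advertise zero. */
    if (free_space < full_space / 2 && free_space < mss)
        return 0;

    if (window <= free_space - mss || window > free_space) {
        /* Round down to a multiple of mss. */
        window = (free_space / mss) * mss;
    } else if (variant == SWS_SIDORENKO) {
        if (mss == full_space && window < full_space / 2)
            window = full_space / 2;
    } else if (variant == SWS_HEFFNER) {
        if (mss == full_space && free_space > window + full_space / 2)
            window = free_space;
    }
    return window;
}
```

With a 1000-byte buffer, 900 bytes free and a previously advertised window of 100, the original variant keeps returning 100, which is the stuck small window described in the report. The first fix returns 500 and the second returns 900.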
Netem tfifo implementation
Hi,
   I recently saw the qdisc "tfifo" in the netem module (net/sched/sch_netem.c) when I migrated some of my patches from 2.6.14 to 2.6.20. As I understand, tfifo helps in keeping the queue of packets sorted according to their "time_to_send". [tfifo was not present in 2.6.14 perhaps because arrival order of packets was always equal to the departure order]. However, tfifo uses a linear search in the packet queue to find where to enqueue the packet.
   Quite some time ago (2.6.14 era), I needed similar functionality from the netem module and I ended up coding a pointer based min-heap for the same. I was wondering if the community was interested in using the min-heap implementation to replace the linear search implementation. I have tested the min-heap quite a few times and it seems to work.
   The implementation is slightly non-trivial because it uses pointers to maintain the heap structure instead of using good old fixed size arrays. I did this mainly so that the limit of the netem qdisc could be changed on the fly. However, because every sk_buff now needs two pointers for its children nodes, I added an extra (sk_buff*)next2 to struct sk_buff (sorry!). However, this can probably be changed to a pointer inside netem_skb_cb. Also, because I needed this for personal work and 2.6.14 didn't contain tfifo, I basically removed the embedded qdisc and made netem a classless qdisc with my min heap as the native "queue" (sorry again! :) )
   My patch on sch_netem.c is included. If there is interest, I will be glad to make this into a proper tfifo patch along with any more of your suggestions.

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 79542af..66881ab 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -24,6 +24,9 @@
 #include
 #define VERSION "1.2"
 /* Network Emulation Queuing algorithm.
@@ -53,7 +56,11 @@
 */
 struct netem_sched_data {
-       struct Qdisc    *qdisc;
+       struct sk_buff *root;   /* The root of the heap of packets */
+       struct sk_buff *end;    /* The last element in the heap */
+       int heap_size;          /* The current size of the heap */
+       __u64 num_arrivals;     /* records the number of arrivals; to be used
+                                  for stable ordering in the heap */
        struct timer_list timer;
        u32 latency;
@@ -75,11 +82,15 @@ struct netem_sched_data {
                u32 size;
                s16 table[0];
        } *delay_dist;
 };
 /* Time stamp put into socket buffer control block */
 struct netem_skb_cb {
        psched_time_t time_to_send;
+       __u64 arrival_order;
 };
 /* init_crandom - initialize correlated random number generator
@@ -139,6 +150,210 @@ static long tabledist(unsigned long mu, long sigma,
        return x / NETEM_DIST_SCALE + (sigma / NETEM_DIST_SCALE) * t + mu;
 }
+int netem_time_less(struct sk_buff *skb1, struct sk_buff *skb2)
+{
+       int r;
+       struct netem_skb_cb *cb1 = (struct netem_skb_cb *)skb1->cb;
+       struct netem_skb_cb *cb2 = (struct netem_skb_cb *)skb2->cb;
+       r = PSCHED_TDIFF(cb1->time_to_send, cb2->time_to_send);
+       if(r == 0) return (cb1->arrival_order < cb2->arrival_order);
+       else return (r < 0);
+}
+
+int netem_insert_heap(struct sk_buff *skb, struct netem_sched_data *q)
+{
+       struct sk_buff *tmp;
+       struct netem_skb_cb *cb = (struct netem_skb_cb *)skb->cb;
+
+       if(q->heap_size >= q->limit)
+               return NET_XMIT_DROP;
+
+       skb->next = NULL;
+       skb->next2 = NULL;
+       //Use the arrival order in the heap to maintain stability. cb->arrival order
+       //is 64 bits... so it should take a few years before this wraps around.
+       cb->arrival_order = q->num_arrivals++;
+       //root is the root of the heap. end is the last element of the heap.
+       if(q->root == NULL){
+               q->root = skb;
+               skb->prev = NULL;
+               q->end = skb;
+               goto success;
+       }
+       tmp = q->end;
+       //Note that the pointer next is left and next2 is right.
+       while(tmp->prev != NULL && tmp == tmp->prev->next2) tmp = tmp->prev;
+       if(tmp->prev == NULL){
+               //Complete tree: make a new node at a new level. Also now, tmp == q->root
+               while(tmp->next != NULL) tmp = tmp->next;
+               tmp->next = skb;
+               skb->prev = tmp;
+       }else if(tmp->prev->next2 == NULL){
+               tmp->prev->next2 = skb;
+               skb->prev = tmp->prev;
+       }else{
+               tmp = tmp->prev->next2;
+               while(tmp->next != NULL) tmp = tmp->next;
+               tmp->next = skb;
+               skb->prev = tmp;
+       }
+
+       //Now skb is at the end of the heap though q->end is not adjusted as yet.
+       if(netem_time_less(skb, skb->prev))
+               q->end = skb->prev;
+       else
+               q->end =
Re: [Bugme-new] [Bug 8107] New: dev->header_cache_update has a random value
From: Krzysztof Halasa <[EMAIL PROTECTED]>
Date: Fri, 02 Mar 2007 16:29:06 +0100

> Andrew Morton <[EMAIL PROTECTED]> writes:
>
> >> However, in drivers/net/wan/hdlc_cisco.c, in function
> >> static int cisco_ioctl(struct net_device *dev, struct ifreq *ifr),
> >> where dev->hard_header is assigned a valid function, and
> >> dev->hard_header_cache is assigned a known value (NULL),
> >> dev->header_cache_update is not set to a known value:
>
> Right, it seems I was never aware of dev->header_cache_update existence.
> I wonder where does the non-NULL value come from? Nevermind.
>
> > diff -puN
> > drivers/net/wan/hdlc_cisco.c~cisco_ioctl-initialise-header_cache_update
> > drivers/net/wan/hdlc_cisco.c
> > ---
> > a/drivers/net/wan/hdlc_cisco.c~cisco_ioctl-initialise-header_cache_update
> > +++ a/drivers/net/wan/hdlc_cisco.c
> > @@ -366,6 +366,7 @@ static int cisco_ioctl(struct net_device
> >          dev->hard_start_xmit = hdlc->xmit;
> >          dev->hard_header = cisco_hard_header;
> >          dev->hard_header_cache = NULL;
> > +        dev->header_cache_update = NULL;
> >          dev->type = ARPHRD_CISCO;
> >          dev->flags = IFF_POINTOPOINT | IFF_NOARP;
> >          dev->addr_len = 0;
> > _
>
> ACK, I think it's the best place.

I disagree, you can't leave dangling references to functions which are potentially inside of unloaded modules, as this code does.

Rather, HDLC Cisco should implement a proper protocol destructor method to clean up these function pointers.
Re: [PATCH] NetLabel: Verify sensitivity level has a valid CIPSO mapping
From: Paul Moore <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 11:12:12 -0500

> On Wednesday, February 28 2007 3:01:31 pm Paul Moore wrote:
> > The current CIPSO engine has a problem where it does not verify that the
> > given sensitivity level has a valid CIPSO mapping when the "std" CIPSO DOI
> > type is used. The end result is that bad packets are sent on the wire
> > which should have never been sent in the first place. This patch corrects
> > this problem by verifying the sensitivity level mapping similar to what is
> > done with the category mapping. This patch also changes the returned error
> > code in this case to -EPERM to better match what the category mapping
> > verification code returns.
> >
> > Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
> > ---
> >  net/ipv4/cipso_ipv4.c |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
>
> I probably should have been more clear in the original patch posting ... this
> is a bugfix patch which I believe should go into 2.6.21 (as well as
> the -stable tree, but I know they like to see it hit Linus' tree first).

I realize this and plan to apply the patch, I'm just backlogged at the moment.
Re: SWS for rcvbuf < MTU
From: Alex Sidorenko <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 11:28:28 -0500

> Customer has confirmed that this resolves the problem and decreases
> CPU usage by his custom application - even when there is no SWS.

There is rarely ever a reason to set explicit socket receive buffer sizes, since the kernel dynamically sizes them based upon how the connection is used.

Why do they set it so low?

It is just as easy to fix their performance bug by simply removing SO_RCVBUF setting in the application.
Re: Netem tfifo implementation
Ritesh Kumar wrote:
> Hi,
>    I recently saw the qdisc "tfifo" in the netem module
> (net/sched/sch_netem.c) when I migrated some of my patches from 2.6.14
> to 2.6.20. As I understand, tfifo helps in keeping the queue of
> packets sorted according to their "time_to_send". [tfifo was not
> present in 2.6.14 perhaps because arrival order of packets was always
> equal to the departure order]. However, tfifo uses a linear search in
> the packet queue to find where to enqueue the packet.
>    Quite some time ago (2.6.14 era), I needed similar functionality
> from the netem module and I ended up coding a pointer based min-heap
> for the same. I was wondering if the community was interested in using
> the min-heap implementation to replace the linear search
> implementation. I have tested the min-heap quite a few times and it
> seems to work.
>    The implementation is slightly non-trivial because it uses
> pointers to maintain the heap structure instead of using good old
> fixed size arrays. I did this mainly so that the limit of the netem
> qdisc could be changed on the fly. However, because every sk_buff now
> needs two pointers for its children nodes, I added an extra
> (sk_buff*)next2 to struct sk_buff (sorry!). However, this can probably
> be changed to a pointer inside netem_skb_cb. Also, because I needed
> this for personal work and 2.6.14 didn't contain tfifo, I basically
> removed the embedded qdisc and made netem a classless qdisc with my
> min heap as the native "queue" (sorry again! :) )

The tfifo qdisc has a limit, why not just allocate a fixed-size heap based on that?
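For readers who want to see what the fixed-size alternative suggested above might look like, here is a minimal user-space sketch of an array-backed min-heap ordered by departure time, with arrival order as a tie-breaker. The `struct pkt`, `struct pkt_heap`, and function names are hypothetical stand-ins, not netem or sk_buff code, and timestamp wrap-around (which the kernel handles with PSCHED_TDIFF) is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queued packet (not struct sk_buff). */
struct pkt {
    uint64_t time_to_send;   /* scheduled departure time */
    uint64_t arrival_order;  /* tie-breaker so equal times stay in arrival order */
};

/* Heap over a caller-provided array of `limit` slots. */
struct pkt_heap {
    struct pkt **slot;
    size_t limit;
    size_t size;
};

static int pkt_less(const struct pkt *a, const struct pkt *b)
{
    if (a->time_to_send != b->time_to_send)
        return a->time_to_send < b->time_to_send;
    return a->arrival_order < b->arrival_order;
}

/* Insert in O(log n); returns 0 on success, -1 if the queue limit is reached. */
static int heap_push(struct pkt_heap *h, struct pkt *p)
{
    size_t i;

    if (h->size >= h->limit)
        return -1;
    i = h->size++;
    while (i > 0) {                      /* sift up */
        size_t parent = (i - 1) / 2;
        if (!pkt_less(p, h->slot[parent]))
            break;
        h->slot[i] = h->slot[parent];
        i = parent;
    }
    h->slot[i] = p;
    return 0;
}

/* Remove and return the packet with the earliest departure, or NULL if empty. */
static struct pkt *heap_pop(struct pkt_heap *h)
{
    struct pkt *top, *last;
    size_t i = 0;

    if (h->size == 0)
        return NULL;
    top = h->slot[0];
    last = h->slot[--h->size];
    for (;;) {                           /* sift the former last element down */
        size_t child = 2 * i + 1;
        if (child >= h->size)
            break;
        if (child + 1 < h->size && pkt_less(h->slot[child + 1], h->slot[child]))
            child++;
        if (!pkt_less(h->slot[child], last))
            break;
        h->slot[i] = h->slot[child];
        i = child;
    }
    if (h->size > 0)
        h->slot[i] = last;
    return top;
}
```

Changing the limit then means allocating a new slot array and copying the pointers across, which is the cost weighed later in the thread.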
Re: [RFC] Arp announce (for Xen)
On Fri, 02 Mar 2007 10:29:53 -0800 Ben Greear <[EMAIL PROTECTED]> wrote:

> Pekka Savola wrote:
> > On Thu, 1 Mar 2007, Stephen Hemminger wrote:
> >> What about implementing the unused arp_announce flag on the inetdevice?
> >> Something like the following. Totally untested...
> >>
> >> Looks like it either was there (and got removed) or was planned but
> >> never implemented.
> IN_DEV_ARP_ANNOUNCE is in 2.6.18, at least..used in arp_solicit in arp.c
>
> I really hope this didn't get removed because I find it very useful!
>
> But, you could certainly add another sysctl...
>
> Thanks,
> Ben
>

Yeah, something new like arp_notify? Or arp_gratuitous?

There are other drivers that do their own ARP; they need to be fixed.

--
Stephen Hemminger <[EMAIL PROTECTED]>
Re: SWS for rcvbuf < MTU
On March 2, 2007 02:25:42 pm David Miller wrote:
> From: Alex Sidorenko <[EMAIL PROTECTED]>
> Date: Fri, 2 Mar 2007 11:28:28 -0500
>
> > Customer has confirmed that this resolves the problem and decreases
> > CPU usage by his custom application - even when there is no SWS.
>
> There is rarely ever a reason to set explicit socket receive
> buffer sizes, since the kernel dynamically sizes them based
> upon how the connection is used.
>
> Why do they set it so low?
>
> It is just as easy to fix their performance bug by simply removing
> SO_RCVBUF setting in the application.

Hi David,

they told us that they use a small rcvbuf to throttle bandwidth for this application. I explained it would be better to use TC for this purpose. They agreed and will probably redesign their application in the future, but they cannot do it right now. For the same reason they have to use the old 2.4.20 for a while - in big companies the important production software cannot be changed quickly.

The fix I suggested is trivial and should have no impact on the case of rcvbuf > MTU, so I think it makes sense to include it in the upstream kernel.

Regards,
Alex

--
Alexandre Sidorenko email: [EMAIL PROTECTED]
Global Solutions Engineering: Unix Networking
Hewlett-Packard (Canada)
Re: Network access fails unless tcpdump is running?
On Thu, Mar 01, 2007 at 06:27:18PM -0500, Marc D Ronell wrote:
> That's correct. It's the wired interface, eth0, which is having the
> problem. I have turned the wireless interface, eth2, off with both
> ifconfig and ifdown, and still, the connection to the outside only
> works when tcpdump is running.
>

Good to know.

> > Can you post the output from `ethtool -i ethX` (where ethX is the wired
> > interface). I ask because that tells me what version of the b44/ipw3945
> > driver you are using.
>
> # ethtool -i eth0
> driver: b44
> version: 1.01
> firmware-version:
> bus-info: :03:00.0
>
> The system was working originally fine, but something changed.
> Perhaps through a Debian aptitude update.

Any chance you can boot back to the old kernel (the one where it was working) and run ethtool -i eth0 on that one to see what version of the driver was used there? It's hard to know what may have changed between the 2 versions of the driver since I don't know the starting point.

It's also hard to know if this is fixed already since you aren't running the latest upstream kernel. Downloading, building, and testing the latest from kernel.org would be a good way to know if this is already fixed.
Re: SWS for rcvbuf < MTU
On March 2, 2007 01:54:45 pm John Heffner wrote:
> Alex Sidorenko wrote:
> [snip]
>
> > --- net/ipv4/tcp_output.c.orig Wed May 3 20:40:43 2006
> > +++ net/ipv4/tcp_output.c Tue Jan 30 14:24:56 2007
> > @@ -641,6 +641,7 @@
> >   * Note, we don't "adjust" for TIMESTAMP or SACK option bytes.
> >   * Regular options like TIMESTAMP are taken into account.
> >   */
> > +static const char *SWS_id_string="@#SWS-fix-2";
> >  u32 __tcp_select_window(struct sock *sk)
> >  {
> >          struct tcp_opt *tp = &sk->tp_pinfo.af_tcp;
> > @@ -682,6 +683,9 @@
> >          window = tp->rcv_wnd;
> >          if (window <= free_space - mss || window > free_space)
> >                  window = (free_space/mss)*mss;
> > +/* A fix for small rcvbuf [EMAIL PROTECTED] */
> > +        else if (mss == full_space && window < full_space/2)
> > +                window = full_space/2;
> >
> >          return window;
> >  }
>
> Good analysis of the problem, but the patch does not look quite right.
> In particular, you can't ever announce a zero window. :)

Hi John,

in case when (free_space < full_space/2) we do not reach the modified code and we will return zero:

        if (free_space < full_space/2) {
                icsk->icsk_ack.quick = 0;

                if (tcp_memory_pressure)
                        tp->rcv_ssthresh = min(tp->rcv_ssthresh, 4U*tp->advmss);

                if (free_space < mss)
                        return 0;
        }

Here is how windows look with the fixed kernel (from customer's test):

20:59:45.320758 Node1.logical.40171 > 11.0.0.1.39909: win = 708
20:59:45.322758 Node1.logical.40171 > 11.0.0.1.39909: win = 288
20:59:45.714567 Node1.logical.40171 > 11.0.0.1.39909: win = 354
20:59:45.717110 Node1.logical.40171 > 11.0.0.1.39909: win = 0
20:59:45.719110 Node1.logical.40171 > 11.0.0.1.39909: win = 708
...

Regards,
Alex

> I think this attached patch does the correct SWS avoidance.
>
> Thanks,
>   -John

--
Alexandre Sidorenko email: [EMAIL PROTECTED]
Global Solutions Engineering: Unix Networking
Hewlett-Packard (Canada)
Re: SWS for rcvbuf < MTU
From: Alex Sidorenko <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 15:21:58 -0500

> they told us that they use a small rcvbuf to throttle bandwidth for this
> application. I explained it would be better to use TC for this purpose. They
> agreed and will probably redesign their application in the future, but they
> cannot do it right now. For the same reason they have to use the old 2.4.20
> for a while - in big companies the important production software cannot be
> changed quickly.
>
> The fix I suggested is trivial and should have no impact on the case of
> rcvbuf > MTU, so I think it makes sense to include it in the upstream kernel.

I have no objection to the fix, especially John's version.

I was just curious about the app, thanks for the info :)
Re: Extensible hashing and RCU
On 3/2/07, Eric Dumazet <[EMAIL PROTECTED]> wrote:
> Thank you for this report. (Still avoiding cache misses studies, while they
> obviously are the limiting factor)

1) The entire point of going to a tree-like structure would be to allow the leaves to age out of cache (or even forcibly evict them) when the structure bloats (generally under DDoS attack), on the theory that most of them are bogus and won't be referenced again. It's not about the speed of the data structure -- it's about managing its impact on the rest of the system.

2) The other entire point of going to a tree-like structure is that they're drastically simpler to RCU than hashes, and more generally they don't involve individual atomic operations (RCU reaping passes, resizing, etc.) that cause big latency hiccups and evict a bunch of other stuff from cache.

3) The third entire point of going to a tree-like structure is to have a richer set of efficient operations, since you can give them a second "priority"-type index and have "pluck-highest-priority-item", three-sided search, and bulk delete operations. These aren't that much harder to RCU than the basic modify-existing-node operation.

Now can we give these idiotic micro-benchmarks a rest until Robert's implementation is tuned and ready for stress-testing?

Cheers,
- Michael
Re: Fix bugs in "Whether sock accept queue is full" checking
From: weidong <[EMAIL PROTECTED]>
Date: Wed, 14 Feb 2007 11:30:57 -0500

> diff -ruN old/include/net/sock.h new/include/net/sock.h
> --- old/include/net/sock.h 2007-02-03 08:38:21.0 -0500
> +++ new/include/net/sock.h 2007-02-03 08:38:30.0 -0500
> @@ -426,7 +426,7 @@
>
>  static inline int sk_acceptq_is_full(struct sock *sk)
>  {
> -        return sk->sk_ack_backlog > sk->sk_max_ack_backlog;
> +        return sk->sk_ack_backlog >= sk->sk_max_ack_backlog;
>  }
>
>  /*

I've applied this patch, and also fixed a similar case I spotted in AF_UNIX after doing a quick audit. Thank you.

commit 626d548a8d145a032cff9237245f8ac9d9056ac1
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Fri Mar 2 12:49:23 2007 -0800

    [AF_UNIX]: Test against sk_max_ack_backlog properly.

    This brings things inline with the sk_acceptq_is_full() bug fix.
    The limit test should be x >= sk_max_ack_backlog.

    Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 6069716..51ca438 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -934,7 +934,7 @@ static long unix_wait_for_peer(struct sock *other, long timeo)

         sched = !sock_flag(other, SOCK_DEAD) &&
                 !(other->sk_shutdown & RCV_SHUTDOWN) &&
-                (skb_queue_len(&other->sk_receive_queue) >
+                (skb_queue_len(&other->sk_receive_queue) >=
                  other->sk_max_ack_backlog);

         unix_state_runlock(other);
@@ -1008,7 +1008,7 @@ restart:
                 if (other->sk_state != TCP_LISTEN)
                         goto out_unlock;

-                if (skb_queue_len(&other->sk_receive_queue) >
+                if (skb_queue_len(&other->sk_receive_queue) >=
                     other->sk_max_ack_backlog) {
                         err = -EAGAIN;
                         if (!timeo)
@@ -1381,7 +1381,7 @@ restart:
         }

         if (unix_peer(other) != sk &&
-            (skb_queue_len(&other->sk_receive_queue) >
+            (skb_queue_len(&other->sk_receive_queue) >=
              other->sk_max_ack_backlog)) {
                 if (!timeo) {
                         err = -EAGAIN;
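The off-by-one being fixed here can be shown with a small stand-alone sketch. The helper names below are made up for illustration and only model the comparison, not the kernel's socket structures: with the `>` test a queue limited to N entries admits N + 1, while `>=` stops at N.

```c
#include <assert.h>

/* Old test: the queue only counts as full once it already exceeds the limit. */
static int acceptq_is_full_old(unsigned backlog, unsigned max_backlog)
{
    return backlog > max_backlog;
}

/* Fixed test: the queue is full as soon as it reaches the limit. */
static int acceptq_is_full_new(unsigned backlog, unsigned max_backlog)
{
    return backlog >= max_backlog;
}

/* Count how many of `attempts` connections get queued under a given test.
 * This is an illustrative model; nothing is ever dequeued. */
static unsigned admitted(unsigned max_backlog, unsigned attempts,
                         int (*is_full)(unsigned, unsigned))
{
    unsigned queued = 0, i;

    for (i = 0; i < attempts; i++)
        if (!is_full(queued, max_backlog))
            queued++;
    return queued;
}
```

For a limit of 4 and 10 attempts, the old test queues 5 connections and the fixed test queues 4.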
Re: Network access fails unless tcpdump is running?
"Andy Gospodarek" <[EMAIL PROTECTED]> writes:

> Any chance you can boot back to the old kernel (the one where it was
> working) and run ethtool -i eth0 on that one to see what version of
> the driver was used there? It's hard to know what may have changed
> between the 2 versions of the driver since I don't know the starting
> point.
>
> It's also hard to know if this is fixed already since you aren't running
> the latest upstream kernel. Downloading, building, and testing the
> latest from kernel.org would be a good way to know if this is already
> fixed.
>

I had already downloaded, compiled, and tested linux-2.6.20.1. There was no change with the newer kernel. Network connections only worked when tcpdump was running. The same was true when booting an older kernel, 2.6.17.

I think the problem is not with the kernel, but with other system software. It could take a while to debug, so I am just rebuilding.

Thanks for your help.

marc
Re: Netem tfifo implementation
On 3/2/07, Patrick McHardy <[EMAIL PROTECTED]> wrote:
> Ritesh Kumar wrote:
> > Hi,
> >    I recently saw the qdisc "tfifo" in the netem module
> > (net/sched/sch_netem.c) when I migrated some of my patches from 2.6.14
> > to 2.6.20. As I understand, tfifo helps in keeping the queue of
> > packets sorted according to their "time_to_send". [tfifo was not
> > present in 2.6.14 perhaps because arrival order of packets was always
> > equal to the departure order]. However, tfifo uses a linear search in
> > the packet queue to find where to enqueue the packet.
> >    Quite some time ago (2.6.14 era), I needed similar functionality
> > from the netem module and I ended up coding a pointer based min-heap
> > for the same. I was wondering if the community was interested in using
> > the min-heap implementation to replace the linear search
> > implementation. I have tested the min-heap quite a few times and it
> > seems to work.
> >    The implementation is slightly non-trivial because it uses
> > pointers to maintain the heap structure instead of using good old
> > fixed size arrays. I did this mainly so that the limit of the netem
> > qdisc could be changed on the fly. However, because every sk_buff now
> > needs two pointers for its children nodes, I added an extra
> > (sk_buff*)next2 to struct sk_buff (sorry!). However, this can probably
> > be changed to a pointer inside netem_skb_cb. Also, because I needed
> > this for personal work and 2.6.14 didn't contain tfifo, I basically
> > removed the embedded qdisc and made netem a classless qdisc with my
> > min heap as the native "queue" (sorry again! :) )
>
> The tfifo qdisc has a limit, why not just allocate a fixed-size heap
> based on that?

The tfifo queue limit itself can be changed and that creates the problem. If we use a fixed heap (say implemented using a fixed size array) then we will have to copy over all pointers from the first array to a reallocated array whenever the queue limit is changed.
In retrospect, moving just a few tens of kilobytes of data doesn't seem like much of a problem... now I feel stupid for having put in so much effort :).

Ritesh
Re: [PATCH] NET : convert network timestamps to ktime_t
On Fri, 2 Mar 2007 15:38:41 +0100 Eric Dumazet <[EMAIL PROTECTED]> wrote:

> We currently use a special structure (struct skb_timeval) and plain 'struct
> timeval' to store packet timestamps in sk_buffs and struct sock.
>
> This has some drawbacks :
> - Fixed resolution of micro second.
> - Waste of space on 64bit platforms where sizeof(struct timeval)=16
>
> I suggest using ktime_t that is a nice abstraction of high resolution time
> services, currently capable of nanosecond resolution.
>
> As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits a 8
> byte
> shrink of this structure on 64bit architectures. Some other structures also
> benefit from this size reduction (struct ipq in ipv4/ip_fragment.c, struct
> frag_queue in ipv6/reassembly.c, ...)
>
>

You missed a couple of spots.

--- tcp-2.6.orig/net/sunrpc/svcsock.c 2007-03-02 12:50:45.0 -0800
+++ tcp-2.6/net/sunrpc/svcsock.c 2007-03-02 12:58:28.0 -0800
@@ -805,16 +805,9 @@
                 /* possibly an icmp error */
                 dprintk("svc: recvfrom returned error %d\n", -err);
         }
-        if (skb->tstamp.off_sec == 0) {
-                struct timeval tv;
-
-                tv.tv_sec = xtime.tv_sec;
-                tv.tv_usec = xtime.tv_nsec / NSEC_PER_USEC;
-                skb_set_timestamp(skb, &tv);
-                /* Don't enable netstamp, sunrpc doesn't
-                   need that much accuracy */
-        }
-        skb_get_timestamp(skb, &svsk->sk_sk->sk_stamp);
+        svsk->sk_sk->sk_stamp = (skb->tstamp.tv64 != 0) ? skb->tstamp
+                : ktime_get_real();
         set_bit(SK_DATA, &svsk->sk_flags); /* there may be more data... */

         /*
--- tcp-2.6.orig/kernel/time.c 2007-03-02 12:59:55.0 -0800
+++ tcp-2.6/kernel/time.c 2007-03-02 13:00:08.0 -0800
@@ -469,6 +469,8 @@
         return tv;
 }

+EXPORT_SYMBOL(ns_to_timeval);
+
 /*
  * Convert jiffies to milliseconds and back.

--
Stephen Hemminger <[EMAIL PROTECTED]>
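As a rough user-space illustration of the idea in this thread, the sketch below stores a timestamp as a single 64-bit nanosecond count and converts it to `struct timeval` only on demand. The function names are assumptions for this example and are not the kernel's ktime_t API; timestamps are assumed to be non-negative.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_USEC 1000LL

/* Convert a non-negative 64-bit nanosecond timestamp to a struct timeval.
 * Illustrative user-space helper, not the kernel's ns_to_timeval(). */
static struct timeval stamp_to_timeval(int64_t ns)
{
    struct timeval tv;

    assert(ns >= 0);
    tv.tv_sec = (time_t)(ns / NSEC_PER_SEC);
    tv.tv_usec = (suseconds_t)((ns % NSEC_PER_SEC) / NSEC_PER_USEC);
    return tv;
}

/* Mirror of the pattern in the svcsock hunk: a zero stamp means "not set",
 * so fall back to the current time supplied by the caller. */
static int64_t pick_stamp(int64_t pkt_stamp, int64_t now)
{
    return pkt_stamp != 0 ? pkt_stamp : now;
}
```

The 64-bit value carries nanosecond resolution in 8 bytes, where a `struct timeval` on a typical 64-bit platform takes 16 bytes and stops at microseconds.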
Re: Netem tfifo implementation
On Fri, 2 Mar 2007 15:56:54 -0500 "Ritesh Kumar" <[EMAIL PROTECTED]> wrote:

> On 3/2/07, Patrick McHardy <[EMAIL PROTECTED]> wrote:
> > Ritesh Kumar wrote:
> > > Hi,
> > >    I recently saw the qdisc "tfifo" in the netem module
> > > (net/sched/sch_netem.c) when I migrated some of my patches from 2.6.14
> > > to 2.6.20. As I understand, tfifo helps in keeping the queue of
> > > packets sorted according to their "time_to_send". [tfifo was not
> > > present in 2.6.14 perhaps because arrival order of packets was always
> > > equal to the departure order]. However, tfifo uses a linear search in
> > > the packet queue to find where to enqueue the packet.
> > >    Quite some time ago (2.6.14 era), I needed similar functionality
> > > from the netem module and I ended up coding a pointer based min-heap
> > > for the same. I was wondering if the community was interested in using
> > > the min-heap implementation to replace the linear search
> > > implementation. I have tested the min-heap quite a few times and it
> > > seems to work.
> > >    The implementation is slightly non-trivial because it uses
> > > pointers to maintain the heap structure instead of using good old
> > > fixed size arrays. I did this mainly so that the limit of the netem
> > > qdisc could be changed on the fly. However, because every sk_buff now
> > > needs two pointers for its children nodes, I added an extra
> > > (sk_buff*)next2 to struct sk_buff (sorry!). However, this can probably
> > > be changed to a pointer inside netem_skb_cb. Also, because I needed
> > > this for personal work and 2.6.14 didn't contain tfifo, I basically
> > > removed the embedded qdisc and made netem a classless qdisc with my
> > > min heap as the native "queue" (sorry again! :) )
> >
> > The tfifo qdisc has a limit, why not just allocate a fixed-size heap
> > based on that?
> >
>
> The tfifo queue limit itself can be changed and that creates the
> problem. If we use a fixed heap (say implemented using a fixed size
> array) then we will have to copy over all pointers from the first
> array to a reallocated array whenever the queue limit is changed.
> In retrospect, moving just a few tens of kilobytes of data doesn't seem
> like much of a problem... now I feel stupid for having put in so much
> effort :).
>

Tfifo is a special case because:
 * timestamps are stored in skb->cb, so it is only really usable inside netem, which adds the timestamps.
 * insertions are cheap because it walks backwards and netem usually has tnext > tlast.

You would only see a problem with huge jitter that causes massive reordering, and that is unrealistic.

You can always make a new qdisc and since netem is classless use yours.

--
Stephen Hemminger <[EMAIL PROTECTED]>
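The point about cheap insertions can be sketched as follows: a time-ordered list that searches for the insertion point from the tail does almost no work when packets mostly arrive in departure order. This is an illustrative user-space model with made-up names (`struct node`, `struct tq`), not the tfifo code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative queue entry; tfifo itself keeps the timestamp in skb->cb. */
struct node {
    uint64_t time_to_send;
    struct node *prev, *next;
};

/* Circular doubly linked list with a sentinel head. */
struct tq {
    struct node head;
};

static void tq_init(struct tq *q)
{
    q->head.prev = q->head.next = &q->head;
}

/*
 * Insert `n` keeping the list sorted by time_to_send, searching backwards
 * from the tail. Returns the number of entries walked past, which is 0
 * whenever the new packet departs no earlier than the current tail.
 */
static unsigned tq_insert(struct tq *q, struct node *n)
{
    struct node *pos = q->head.prev;   /* start at the tail */
    unsigned steps = 0;

    while (pos != &q->head && pos->time_to_send > n->time_to_send) {
        pos = pos->prev;
        steps++;
    }
    /* Link n right after pos; equal timestamps keep arrival order. */
    n->prev = pos;
    n->next = pos->next;
    pos->next->prev = n;
    pos->next = n;
    return steps;
}
```

Only a packet whose departure time falls well before the tail's pays for a longer walk, which is why heavy reordering is the one case where a heap would help.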
Re: SWS for rcvbuf < MTU
David Miller wrote:
> From: Alex Sidorenko <[EMAIL PROTECTED]>
> Date: Fri, 2 Mar 2007 15:21:58 -0500
>
>> they told us that they use a small rcvbuf to throttle bandwidth for this
>> application. I explained it would be better to use TC for this purpose. They
>> agreed and will probably redesign their application in the future, but they
>> cannot do it right now. For the same reason they have to use the old 2.4.20
>> for a while - in big companies the important production software cannot be
>> changed quickly.
>>
>> The fix I suggested is trivial and should have no impact on the case of
>> rcvbuf > MTU, so I think it makes sense to include it in the upstream kernel.
>
> I have no objection to the fix, especially John's version.
>
> I was just curious about the app, thanks for the info :)

Please don't apply the patch I sent. I've been thinking about this a bit harder, and it may not fix this particular problem. (Hard to say without knowing exactly what it is.) As the comment above __tcp_select_window() states, we do not do full receive-side SWS avoidance because of header prediction.

Alex, you're right I missed that special zero-window case. I'm still not quite sure I'm completely happy with this patch. I'd like to think about this a little bit harder...

Thanks,
  -John
Re: [PATCH][BUG][SECURITY] Re: Weird problem with PPPoE on tap interface
From: Florian Zumbiehl <[EMAIL PROTECTED]>
Date: Wed, 28 Feb 2007 13:38:44 +0100

> As no one seems to have an opinion on this: Here is a patch that does
> work for me and that should solve the problem as far as that is easily
> possible. It is based on the assumption that an interface's ifindex is
> basically an alias for a local MAC address, so incoming packets now are
> matched to sockets based on remote MAC, session id, and ifindex of the
> interface the packet came in on/the socket was bound to by connect().

I agree with your analysis and have applied your patch.

Another way to implement this would have been to store the pre-computed ifindex on the kernel side sockaddr.

Thanks.
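A minimal sketch of the matching rule described in the quoted patch summary, assuming a lookup key made of remote MAC, session id, and ifindex. The `struct pppoe_key` type and `pppoe_key_equal` helper are hypothetical and do not reflect the actual kernel data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lookup key for a PPPoE session socket. */
struct pppoe_key {
    uint8_t  remote[6];  /* peer MAC address */
    uint16_t sid;        /* PPPoE session id */
    int      ifindex;    /* interface the session is bound to */
};

/* Two keys match only if the peer MAC, the session id, AND the interface agree. */
static int pppoe_key_equal(const struct pppoe_key *a, const struct pppoe_key *b)
{
    return a->sid == b->sid &&
           a->ifindex == b->ifindex &&
           memcmp(a->remote, b->remote, sizeof(a->remote)) == 0;
}
```

Under this rule, a packet with the right MAC and session id that arrives on a different interface no longer matches the socket.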
Re: [PATCH] NetLabel: Verify sensitivity level has a valid CIPSO mapping
From: James Morris <[EMAIL PROTECTED]>
Date: Wed, 28 Feb 2007 15:45:07 -0500 (EST)

> On Wed, 28 Feb 2007, Paul Moore wrote:
>
> > The current CIPSO engine has a problem where it does not verify that the
> > given sensitivity level has a valid CIPSO mapping when the "std" CIPSO DOI
> > type is used. The end result is that bad packets are sent on the wire which
> > should have never been sent in the first place. This patch corrects this
> > problem by verifying the sensitivity level mapping similar to what is done
> > with the category mapping. This patch also changes the returned error code
> > in this case to -EPERM to better match what the category mapping
> > verification code returns.
> >
> > Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
>
> [removed redhat-lspp, which is subscriber only]
>
> Acked-by: James Morris <[EMAIL PROTECTED]>

Applied, thanks everyone.

If -stable inclusion is desired, please submit this patch there. You can add my signoff if you want:

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>
Re: PATCH: Second try at vlan mailing list patch.
From: Ben Greear <[EMAIL PROTECTED]>
Date: Wed, 28 Feb 2007 15:25:59 -0800

> Hopefully, by attaching it as a file it will not screw up the tabs & spaces.
>
> Signed-off-by: Ben Greear <[EMAIL PROTECTED]>

Nope still doesn't apply.

I can guess that you didn't try emailing the patch to yourself and
applying it? If so I'm basically still your guinea pig each time you
correct this problem. How nice that is :-/
Re: [PATCH] bridge: avoid ptype_all packet handling
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Wed, 28 Feb 2007 17:18:46 -0800

> I was measuring bridging/routing performance and noticed this.
>
> The current code runs the "all packet" type handlers before calling the
> bridge hook. If an application (like some DHCP clients) is using AF_PACKET,
> this means that each received packet gets run through the Berkeley Packet
> Filter code in sk_run_filter (slow).

I know we closed this out by saying that even though performance
sucks, we can't really apply this without breaking things.

What would be broken is if the DHCP client isn't specifying
a device ifindex when it binds the AF_PACKET socket. That
would be an easy way to fix this performance problem at the
application level.

The DHCP client should only care about a particular interface's
traffic, the one it wants to listen on.
Re: [PATCH 1/2] [TCP]: Add two new spurious RTO responses to FRTO
From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
Date: Thu, 1 Mar 2007 13:30:20 +0200 (EET)

> [PATCH] [TCP]: Complete icsk-to-local-variable change (in tcp_enter_cwr)
>
> A local variable for icsk was created but this change was
> missing. Spotted by Jarek Poplawski.
>
> Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>

Applied to tcp-2.6, thank you.
Re: [RFC PATCH] [TCP]: Move clearing of the prior_ssthresh due to ECE earlier
From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
Date: Thu, 1 Mar 2007 22:26:57 +0200 (EET)

> I think that doing it in the response is better than this approach,
> since it knows that the ssthresh has been halved already within that
> round-trip, so there is no need to do that again... I'll submit the
> patch tomorrow... With this prior_ssthresh clearing move alone, the
> ssthresh ends up being halved twice if I thought it right (first in
> tcp_enter_frto and then again in tcp_enter_cwr that is called from
> fastretrans_alert)... So please, drop this patch.

Ok.
Re: [PATCH v2] [TCP]: FRTO undo response falls back to ratehalving one if ECEd
From: "Ilpo_Järvinen" <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 14:34:36 +0200 (EET)

> Undoing ssthresh is disabled in fastretrans_alert whenever
> FLAG_ECE is set by clearing prior_ssthresh. The clearing does
> not protect FRTO because FRTO operates before fastretrans_alert.
> Moving the clearing of prior_ssthresh earlier seems to be a
> suboptimal solution to the FRTO case because then FLAG_ECE will
> cause a second ssthresh reduction in try_to_open (the first
> occurred when FRTO was entered). So instead, FRTO falls back
> immediately to the rate halving response, which switches TCP to
> CA_CWR state preventing the latter reduction of ssthresh.
>
> If the first ECE arrived before the ACK after which FRTO is able
> to decide RTO as spurious, prior_ssthresh is already cleared.
> Thus no undoing for ssthresh occurs. Besides, FLAG_ECE should be
> set also in the following ACKs resulting in rate halving response
> that sees TCP is already in CA_CWR, which again prevents an extra
> ssthresh reduction on that round-trip.
>
> If the first ECE arrived before RTO, ssthresh has already been
> adapted and prior_ssthresh remains cleared on entry because TCP
> is in CA_CWR (the same applies also to a case where FRTO is
> entered more than once and ECE comes in the middle).
>
> High_seq must not be touched after tcp_enter_cwr because CWR
> round-trip calculation depends on it.
>
> I believe that after this patch, FRTO should be ECN-safe and
> even able to take advantage of synergy benefits.
>
> Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>

Applied, but I had to apply this by hand, you did not generate this
diff against tcp-2.6

And I'm very angry about this specific case because I told you
EXPLICITLY that I reformatted the switch() statement when I applied
the earlier FRTO patches.

Not only are people expected to patch against tcp-2.6, BUT I TOLD YOU
specifically that I modified your patch in this specific area.

What else do I need to do in order for people to generate clean
patches? :-(

Tell me, I'll do it!!!
Re: SWS for rcvbuf < MTU
From: John Heffner <[EMAIL PROTECTED]>
Date: Fri, 02 Mar 2007 16:16:39 -0500

> Please don't apply the patch I sent. I've been thinking about this a
> bit harder, and it may not fix this particular problem. (Hard to say
> without knowing exactly what it is.) As the comment above
> __tcp_select_window() states, we do not do full receive-side SWS
> avoidance because of header prediction.
>
> Alex, you're right I missed that special zero-window case. I'm still
> not quite sure I'm completely happy with this patch. I'd like to think
> about this a little bit harder...

Ok
Re: [PATCH] vlan & net drivers: avoid a 4-order allocation
From: Dan Aloni <[EMAIL PROTECTED]>
Date: Thu, 1 Mar 2007 12:02:17 +0200

> This patch splits the vlan_group struct into a multi-allocated struct. On
> x86_64, the size of the original struct is a little more than 32KB, causing
> a 4-order allocation, which is prone to problems caused by buddy-system
> external fragmentation conditions.
>
> I couldn't just use vmalloc() because vfree() cannot be called in the
> softirq context of the RCU callback.
>
> Signed-off-by: Dan Aloni <[EMAIL PROTECTED]>

No objections, this really needs to be fixed, applied.

Thank you.
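The size argument in the quoted description can be sketched in user-space C. This is only an illustration and not the patch itself: the names `group_array`, `PARTS` and `PART_LEN`, and the 8 x 512 split, are assumptions made up for this example. With 8-byte pointers, a flat table with one slot per 12-bit VLAN ID takes 4096 * 8 = 32KB in one contiguous piece, while each part below is 4KB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define N_IDS    4096              /* one slot per possible 12-bit VLAN ID */
#define PARTS    8                 /* hypothetical split factor */
#define PART_LEN (N_IDS / PARTS)   /* 512 slots per separately allocated part */

/* Hypothetical two-level table: small allocations instead of one big one. */
struct group_array {
	void **parts[PARTS];
};

/* Release every part; safe to call on a partially initialised table. */
static void group_array_free(struct group_array *g)
{
	for (size_t i = 0; i < PARTS; i++) {
		free(g->parts[i]);
		g->parts[i] = NULL;
	}
}

/* Allocate each part on its own, zero-filled. Returns 0 on success. */
static int group_array_init(struct group_array *g)
{
	for (size_t i = 0; i < PARTS; i++)
		g->parts[i] = NULL;

	for (size_t i = 0; i < PARTS; i++) {
		g->parts[i] = calloc(PART_LEN, sizeof(void *));
		if (!g->parts[i]) {
			group_array_free(g);
			return -1;
		}
	}
	return 0;
}

/* Look up the entry for an ID: pick the part, then the slot inside it. */
static void *group_array_get(const struct group_array *g, unsigned int id)
{
	assert(id < N_IDS);
	return g->parts[id / PART_LEN][id % PART_LEN];
}

/* Store an entry for an ID. */
static void group_array_set(struct group_array *g, unsigned int id, void *dev)
{
	assert(id < N_IDS);
	g->parts[id / PART_LEN][id % PART_LEN] = dev;
}
```

The cost of the split is one extra pointer dereference per lookup; in exchange no single allocation is larger than one 4KB page on a 64-bit machine.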
[PATCH] udp: whitespace fixes
The udp code is full of bad indenting, extra whitespace and other style confusion. It makes no sense to declare functions that are used outside the current file (extern) as inline. Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> --- net/ipv4/udp.c | 402 - net/ipv6/udp.c | 175 +--- 2 files changed, 295 insertions(+), 282 deletions(-) --- tcp-2.6.orig/net/ipv4/udp.c 2007-03-02 12:08:06.0 -0800 +++ tcp-2.6/net/ipv4/udp.c 2007-03-02 12:37:38.0 -0800 @@ -120,8 +120,8 @@ struct hlist_node *node; sk_for_each(sk, node, &udptable[num & (UDP_HTABLE_SIZE - 1)]) - if (sk->sk_hash == num) - return 1; + if (sk->sk_hash == num) + return 1; return 0; } @@ -136,13 +136,13 @@ */ int __udp_lib_get_port(struct sock *sk, unsigned short snum, struct hlist_head udptable[], int *port_rover, - int (*saddr_comp)(const struct sock *sk1, -const struct sock *sk2 )) + int (*saddr_comp) (const struct sock * sk1, + const struct sock * sk2)) { struct hlist_node *node; struct hlist_head *head; struct sock *sk2; - interror = 1; + int error = 1; write_lock_bh(&udp_hash_lock); if (snum == 0) { @@ -160,8 +160,9 @@ if (hlist_empty(head)) { if (result > sysctl_local_port_range[1]) result = sysctl_local_port_range[0] + - ((result - sysctl_local_port_range[0]) & -(UDP_HTABLE_SIZE - 1)); + ((result - + sysctl_local_port_range[0]) & +(UDP_HTABLE_SIZE - 1)); goto gotit; } size = 0; @@ -175,12 +176,13 @@ ; } result = best; - for(i = 0; i < (1 << 16) / UDP_HTABLE_SIZE; i++, result += UDP_HTABLE_SIZE) { + for (i = 0; i < (1 << 16) / UDP_HTABLE_SIZE; +i++, result += UDP_HTABLE_SIZE) { if (result > sysctl_local_port_range[1]) result = sysctl_local_port_range[0] - + ((result - sysctl_local_port_range[0]) & - (UDP_HTABLE_SIZE - 1)); - if (! 
__udp_lib_lport_inuse(result, udptable)) + + ((result - sysctl_local_port_range[0]) & + (UDP_HTABLE_SIZE - 1)); + if (!__udp_lib_lport_inuse(result, udptable)) break; } if (i >= (1 << 16) / UDP_HTABLE_SIZE) @@ -191,13 +193,13 @@ head = &udptable[snum & (UDP_HTABLE_SIZE - 1)]; sk_for_each(sk2, node, head) - if (sk2->sk_hash == snum && - sk2 != sk&& - (!sk2->sk_reuse|| !sk->sk_reuse) && - (!sk2->sk_bound_dev_if || !sk->sk_bound_dev_if -|| sk2->sk_bound_dev_if == sk->sk_bound_dev_if) && - (*saddr_comp)(sk, sk2) ) - goto fail; + if (sk2->sk_hash == snum && + sk2 != sk && + (!sk2->sk_reuse || !sk->sk_reuse) && + (!sk2->sk_bound_dev_if || !sk->sk_bound_dev_if +|| sk2->sk_bound_dev_if == sk->sk_bound_dev_if) && + (*saddr_comp) (sk, sk2)) + goto fail; } inet_sk(sk)->num = snum; sk->sk_hash = snum; @@ -212,19 +214,19 @@ return error; } -__inline__ int udp_get_port(struct sock *sk, unsigned short snum, - int (*scmp)(const struct sock *, const struct sock *)) +int udp_get_port(struct sock *sk, unsigned short snum, +int (*scmp) (const struct sock *, const struct sock *)) { - return __udp_lib_get_port(sk, snum, udp_hash, &udp_port_rover, scmp); + return __udp_lib_get_port(sk, snum, udp_hash, &udp_port_rover, scmp); } -inline int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) +int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) { - struct inet_sock *inet1 = inet_sk(sk1), *inet2 = inet_sk(sk2); + const struct in
[RFC 2/2] bridge: per device promiscuous taps
Part of the next set of bridge patches includes this. It allows
packet capture by interface on a bridge: tcpdump -i eth0 will work
as expected.

@@ -128,34 +125,45 @@ static inline int is_link_local(const un
 int br_handle_frame(struct net_bridge_port *p, struct sk_buff **pskb)
 {
 	struct sk_buff *skb = *pskb;
+	struct sk_buff *skb2 = NULL;
 	const unsigned char *dest = eth_hdr(skb)->h_dest;
 	if (!is_valid_ether_addr(eth_hdr(skb)->h_source))
 		goto err;
 	if (unlikely(is_link_local(dest))) {
 		skb->pkt_type = PACKET_HOST;
 		return NF_HOOK(PF_BRIDGE, NF_BR_LOCAL_IN, skb, skb->dev,
 			       NULL, br_handle_local_finish) != 0;
 	}
+
+	if (unlikely(p->dev->promiscuity > 1))
+		skb2 = skb_clone(skb, GFP_ATOMIC);
-	if (p->state == BR_STATE_FORWARDING || p->state == BR_STATE_LEARNING) {
+	switch (p->state) {
+	case BR_STATE_FORWARDING:
 		if (br_should_route_hook) {
-			if (br_should_route_hook(pskb))
+			if (br_should_route_hook(pskb)) {
+				kfree_skb(skb2);
 				return 0;
+			}
 			skb = *pskb;
 			dest = eth_hdr(skb)->h_dest;
 		}
 		if (!compare_ether_addr(p->br->dev->dev_addr, dest))
 			skb->pkt_type = PACKET_HOST;
+		/* fall thru */
+	case BR_STATE_LEARNING:
 		NF_HOOK(PF_BRIDGE, NF_BR_PRE_ROUTING, skb, skb->dev, NULL,
 			br_handle_frame_finish);
-		return 1;
+		break;
+
+	default:
+		kfree_skb(skb);
 	}
-err:
-	kfree_skb(skb);
-	return 1;
+	if (likely(!skb2))
+		return 1;
+
+	*pskb = skb2;
+	return 0;
 }
[RFC 1/2] bridge: avoid ptype_all packet handling
On Fri, 02 Mar 2007 13:26:38 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: Stephen Hemminger <[EMAIL PROTECTED]>
> Date: Wed, 28 Feb 2007 17:18:46 -0800
>
> > I was measuring bridging/routing performance and noticed this.
> >
> > The current code runs the "all packet" type handlers before calling the
> > bridge hook. If an application (like some DHCP clients) is using AF_PACKET,
> > this means that each received packet gets run through the Berkeley Packet
> > Filter code in sk_run_filter (slow).
>
> I know we closed this out by saying that even though performance
> sucks, we can't really apply this without breaking things.

wrong.

> What would be broken is if the DHCP client isn't specifying
> a device ifindex when it binds the AF_PACKET socket. That
> would be an easy way to fix this performance problem at the
> application level.
>
> The DHCP client should only care about a particular interface's
> traffic, the one it wants to listen on.

My assumption is that when bridging, the normal stack path only has
to receive those packets that it would receive if it was not doing
bridging.

A better version of the patch is:
==
The current code runs the "all packet" type handlers before calling the
bridge hook. If an application (like some DHCP clients) is using AF_PACKET,
this means that each received packet gets run through the Berkeley Packet
Filter code in sk_run_filter. This is significant overhead.

By moving the bridging hook to run first, the packets flowing through
the bridge get filtered out there first. This results in a 14%
improvement in performance.

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
---
 net/core/dev.c | 24
 1 file changed, 12 insertions(+), 12 deletions(-)

--- netem.orig/net/core/dev.c
+++ netem/net/core/dev.c
@@ -1702,9 +1702,12 @@ struct net_bridge_fdb_entry *(*br_fdb_ge
 						unsigned char *addr);
 void (*br_fdb_put_hook)(struct net_bridge_fdb_entry *ent);
-static __inline__ int handle_bridge(struct sk_buff **pskb,
-				    struct packet_type **pt_prev, int *ret,
-				    struct net_device *orig_dev)
+/*
+ * If bridge module is loaded call bridging hook.
+ * when it returns 1, this is a non-local packet
+ */
+int (*br_handle_frame_hook)(struct net_bridge_port *p, struct sk_buff **pskb) __read_mostly;
+static int handle_bridge(struct sk_buff **pskb)
 {
 	struct net_bridge_port *port;
@@ -1712,15 +1715,10 @@ static __inline__ int handle_bridge(stru
 	    (port = rcu_dereference((*pskb)->dev->br_port)) == NULL)
 		return 0;
-	if (*pt_prev) {
-		*ret = deliver_skb(*pskb, *pt_prev, orig_dev);
-		*pt_prev = NULL;
-	}
-
 	return br_handle_frame_hook(port, pskb);
 }
 #else
-#define handle_bridge(skb, pt_prev, ret, orig_dev)	(0)
+#define handle_bridge(pskb)	0
 #endif
 #ifdef CONFIG_NET_CLS_ACT
@@ -1799,6 +1797,9 @@ int netif_receive_skb(struct sk_buff *sk
 	}
 #endif
+	if (handle_bridge(&skb))
+		goto out;
+
 	list_for_each_entry_rcu(ptype, &ptype_all, list) {
 		if (!ptype->dev || ptype->dev == skb->dev) {
 			if (pt_prev)
@@ -1826,9 +1827,6 @@ int netif_receive_skb(struct sk_buff *sk
 ncls:
 #endif
-	if (handle_bridge(&skb, &pt_prev, &ret, orig_dev))
-		goto out;
-
 	type = skb->protocol;
 	list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type)&15], list) {
 		if (ptype->type == type &&

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
Re: [PATCH] NET : convert network timestamps to ktime_t
Stephen Hemminger wrote:

> On Fri, 2 Mar 2007 15:38:41 +0100
> Eric Dumazet <[EMAIL PROTECTED]> wrote:
>
> > We currently use a special structure (struct skb_timeval) and plain
> > 'struct timeval' to store packet timestamps in sk_buffs and struct sock.
> >
> > This has some drawbacks :
> > - Fixed resolution of micro second.
> > - Waste of space on 64bit platforms where sizeof(struct timeval)=16
> >
> > I suggest using ktime_t that is a nice abstraction of high resolution
> > time services, currently capable of nanosecond resolution.
> >
> > As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
> > a 8 byte shrink of this structure on 64bit architectures. Some other
> > structures also benefit from this size reduction (struct ipq in
> > ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)
>
> You missed a couple of spots.

Arg yes...

> --- tcp-2.6.orig/net/sunrpc/svcsock.c 2007-03-02 12:50:45.0 -0800
> +++ tcp-2.6/net/sunrpc/svcsock.c 2007-03-02 12:58:28.0 -0800
> @@ -805,16 +805,9 @@
>  		/* possibly an icmp error */
>  		dprintk("svc: recvfrom returned error %d\n", -err);
>  	}
> -	if (skb->tstamp.off_sec == 0) {
> -		struct timeval tv;
> -		tv.tv_sec = xtime.tv_sec;
> -		tv.tv_usec = xtime.tv_nsec / NSEC_PER_USEC;
> -		skb_set_timestamp(skb, &tv);
> -		/* Don't enable netstamp, sunrpc doesn't
> -		   need that much accuracy */
> -	}
> -	skb_get_timestamp(skb, &svsk->sk_sk->sk_stamp);
> +	svsk->sk_sk->sk_stamp = (skb->tstamp.tv64 != 0) ? skb->tstamp
> +					: ktime_get_real();

Well, if we want to stay in the spirit of old code, we probably want
to use current_kernel_time() (+ timespec_to_ktime()), because it's
less expensive.

And also setting the skb tstamp, no ?
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 14:09:29 -0800

> On Fri, 02 Mar 2007 13:26:38 -0800 (PST)
> David Miller <[EMAIL PROTECTED]> wrote:
>
> > From: Stephen Hemminger <[EMAIL PROTECTED]>
> > Date: Wed, 28 Feb 2007 17:18:46 -0800
> >
> > > I was measuring bridging/routing performance and noticed this.
> > >
> > > The current code runs the "all packet" type handlers before calling the
> > > bridge hook. If an application (like some DHCP clients) is using
> > > AF_PACKET, this means that each received packet gets run through the
> > > Berkeley Packet Filter code in sk_run_filter (slow).
> >
> > I know we closed this out by saying that even though performance
> > sucks, we can't really apply this without breaking things.
>
> wrong.

I disagree, and your patch is still broken because as Jamal pointed
out (which you didn't address in any way) this breaks traffic
classification of bridged traffic as well.

If someone wants their network tap to hear all traffic, they do mean
all traffic, and this includes potentially seeing it multiple times
when things like bridging and virtual devices decap incoming frames.

We can't apply this.

Back to a workable solution, why doesn't DHCP specify a specific
device? It would fix this performance problem completely, at the
application level.
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
From: David Miller <[EMAIL PROTECTED]>
Date: Fri, 02 Mar 2007 14:48:18 -0800 (PST)

> Back to a workable solution, why doesn't DHCP specify a specific
> device? It would fix this performance problem completely, at the
> application level.

Since nobody seems to be able to be bothered to actually look
at what DHCP clients are doing, I actually did and it's no
surprise that broken stuff is happening here.

Here is how dhcp3-3.0.3 binds AF_PACKET sockets, in common/lpf.c:

	struct sockaddr sa;
 ...
	/* Bind to the interface name */
	memset (&sa, 0, sizeof sa);
	sa.sa_family = AF_PACKET;
	strncpy (sa.sa_data, (const char *)info -> ifp, sizeof sa.sa_data);
	if (bind (sock, &sa, sizeof sa)) {
		if (errno == ENOPROTOOPT || errno == EPROTONOSUPPORT ||
		    errno == ESOCKTNOSUPPORT || errno == EPFNOSUPPORT ||
		    errno == EAFNOSUPPORT || errno == EINVAL) {
			log_error ("socket: %m - make sure");
			log_error ("CONFIG_PACKET (Packet socket) %s",
				   "and CONFIG_FILTER");
			log_error ("(Socket Filtering) are enabled %s",
				   "in your kernel");
			log_fatal ("configuration!");
		}
		log_fatal ("Bind socket to interface: %m");
	}

So it puts a string into the sockaddr data, and there
is no mention of sockaddr_ll, which is what is supposed to be
provided as the socket address here, in the entire DHCP tree.

I'm tempted to say I must be missing something here, since I can't see
how this could possibly work at all. The string passed in should
be interpreted as the ifindex value, and thus trigger a -ENODEV
return from AF_PACKET's bind() implementation.

My suspicions are confirmed by the patch here:

http://kernel.org/pub/linux/kernel/people/chuyee/patches/dhcp-3.0/dhcp-3.0-linux_cooked_packet.patch

Really, this bogus bind() explains everything.
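For contrast with the name-string bind quoted above, here is a minimal sketch of a bind that fills in a struct sockaddr_ll with an explicit ifindex, as the message says is expected. It is not taken from any DHCP client: `fill_packet_addr` and `bind_packet_socket` are hypothetical helper names, and the sketch assumes a Linux AF_PACKET socket of type SOCK_RAW or SOCK_DGRAM that the caller has already opened (which normally needs CAP_NET_RAW).

```c
#include <assert.h>
#include <string.h>
#include <arpa/inet.h>        /* htons */
#include <sys/socket.h>       /* AF_PACKET, bind */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <linux/if_ether.h>   /* ETH_P_IP */
#include <net/if.h>           /* if_nametoindex */

/* Fill a link-layer address that names one interface by index. */
static void fill_packet_addr(struct sockaddr_ll *sll, int ifindex,
			     unsigned short proto)
{
	assert(sll != NULL);
	memset(sll, 0, sizeof(*sll));
	sll->sll_family = AF_PACKET;
	sll->sll_protocol = htons(proto);  /* network byte order */
	sll->sll_ifindex = ifindex;        /* restricts the socket to this device */
}

/*
 * Resolve the interface name to an index and bind the packet socket to it.
 * Returns -1 if the name is unknown or bind() fails.
 */
static int bind_packet_socket(int sock, const char *ifname,
			      unsigned short proto)
{
	struct sockaddr_ll sll;
	unsigned int ifindex = if_nametoindex(ifname);

	if (ifindex == 0)
		return -1;  /* no such interface */

	fill_packet_addr(&sll, (int)ifindex, proto);
	return bind(sock, (struct sockaddr *)&sll, sizeof(sll));
}
```

A caller would typically open the socket with socket(AF_PACKET, SOCK_RAW, htons(ETH_P_IP)) and then call bind_packet_socket(sock, "eth0", ETH_P_IP); the interface name here is only an example.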
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
On Fri, 02 Mar 2007 15:18:03 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> From: David Miller <[EMAIL PROTECTED]>
> Date: Fri, 02 Mar 2007 14:48:18 -0800 (PST)
>
> > Back to a workable solution, why doesn't DHCP specify a specific
> > device? It would fix this performance problem completely, at the
> > application level.
>
> Since nobody seems to be able to be bothered to actually look
> at what DHCP clients are doing, I actually did and it's no
> surprise that broken stuff is happening here.

I was in middle of checking that..

> Here is how dhcp3-3.0.3 binds AF_PACKET sockets, in common/lpf.c:
>
> 	struct sockaddr sa;
>  ...
> 	/* Bind to the interface name */
> 	memset (&sa, 0, sizeof sa);
> 	sa.sa_family = AF_PACKET;
> 	strncpy (sa.sa_data, (const char *)info -> ifp, sizeof sa.sa_data);
> 	if (bind (sock, &sa, sizeof sa)) {
> 		if (errno == ENOPROTOOPT || errno == EPROTONOSUPPORT ||
> 		    errno == ESOCKTNOSUPPORT || errno == EPFNOSUPPORT ||
> 		    errno == EAFNOSUPPORT || errno == EINVAL) {
> 			log_error ("socket: %m - make sure");
> 			log_error ("CONFIG_PACKET (Packet socket) %s",
> 				   "and CONFIG_FILTER");
> 			log_error ("(Socket Filtering) are enabled %s",
> 				   "in your kernel");
> 			log_fatal ("configuration!");
> 		}
> 		log_fatal ("Bind socket to interface: %m");
> 	}
>
> So it puts a string into the sockaddr data, and there
> is no mention of sockaddr_ll, which is what is supposed to be
> provided as the socket address here, in the entire DHCP tree.
>
> I'm tempted to say I must be missing something here, since I can't see
> how this could possibly work at all. The string passed in should
> be interpreted as the ifindex value, and thus trigger a -ENODEV
> return from AF_PACKET's bind() implementation.
>
> My suspicions are confirmed by the patch here:
>
> http://kernel.org/pub/linux/kernel/people/chuyee/patches/dhcp-3.0/dhcp-3.0-linux_cooked_packet.patch

Can you get FC fixed?

> Really, this bogus bind() explains everything.

Should we add a warning to kernel log, to make distro's fix it?

It might make sense to add a per-device ptype_dev list in network device?

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
Re: [Bugme-new] [Bug 8107] New: dev->header_cache_update has a random value
Switching HDLC devices from Ethernet-framing mode caused stale ethernet function assignments within net_device. Signed-off-by: Krzysztof Halasa <[EMAIL PROTECTED]> diff --git a/drivers/net/wan/hdlc.c b/drivers/net/wan/hdlc.c index db354e0..f6e6b63 100644 --- a/drivers/net/wan/hdlc.c +++ b/drivers/net/wan/hdlc.c @@ -38,7 +38,7 @@ #include -static const char* version = "HDLC support module revision 1.20"; +static const char* version = "HDLC support module revision 1.21"; #undef DEBUG_LINK @@ -222,19 +222,30 @@ int hdlc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) return -EINVAL; } +static void hdlc_setup_dev(struct net_device *dev) +{ +/* Re-init all variables changed by HDLC protocol drivers, + including ether_setup() called from hdlc_raw_eth.c. */ + dev->get_stats = hdlc_get_stats; + dev->flags = IFF_POINTOPOINT | IFF_NOARP; + dev->mtu = HDLC_MAX_MTU; + dev->type= ARPHRD_RAWHDLC; + dev->hard_header_len = 16; + dev->addr_len= 0; + dev->hard_header = NULL; + dev->rebuild_header = NULL; + dev->set_mac_address = NULL; + dev->hard_header_cache = NULL; + dev->header_cache_update = NULL; + dev->change_mtu = hdlc_change_mtu; + dev->hard_header_parse = NULL; +} + void hdlc_setup(struct net_device *dev) { hdlc_device *hdlc = dev_to_hdlc(dev); - dev->get_stats = hdlc_get_stats; - dev->change_mtu = hdlc_change_mtu; - dev->mtu = HDLC_MAX_MTU; - - dev->type = ARPHRD_RAWHDLC; - dev->hard_header_len = 16; - - dev->flags = IFF_POINTOPOINT | IFF_NOARP; - + hdlc_setup_dev(dev); hdlc->carrier = 1; hdlc->open = 0; spin_lock_init(&hdlc->state_lock); @@ -294,6 +305,7 @@ void detach_hdlc_protocol(struct net_device *dev) } kfree(hdlc->state); hdlc->state = NULL; + hdlc_setup_dev(dev); } diff --git a/drivers/net/wan/hdlc_cisco.c b/drivers/net/wan/hdlc_cisco.c index b0bc5dd..c9664fd 100644 --- a/drivers/net/wan/hdlc_cisco.c +++ b/drivers/net/wan/hdlc_cisco.c @@ -365,10 +365,7 @@ static int cisco_ioctl(struct net_device *dev, struct ifreq *ifr) memcpy(&state(hdlc)->settings, 
&new_settings, size); dev->hard_start_xmit = hdlc->xmit; dev->hard_header = cisco_hard_header; - dev->hard_header_cache = NULL; dev->type = ARPHRD_CISCO; - dev->flags = IFF_POINTOPOINT | IFF_NOARP; - dev->addr_len = 0; netif_dormant_on(dev); return 0; } diff --git a/drivers/net/wan/hdlc_fr.c b/drivers/net/wan/hdlc_fr.c index b45ab68..c6c3c75 100644 --- a/drivers/net/wan/hdlc_fr.c +++ b/drivers/net/wan/hdlc_fr.c @@ -1289,10 +1289,7 @@ static int fr_ioctl(struct net_device *dev, struct ifreq *ifr) memcpy(&state(hdlc)->settings, &new_settings, size); dev->hard_start_xmit = hdlc->xmit; - dev->hard_header = NULL; dev->type = ARPHRD_FRAD; - dev->flags = IFF_POINTOPOINT | IFF_NOARP; - dev->addr_len = 0; return 0; case IF_PROTO_FR_ADD_PVC: diff --git a/drivers/net/wan/hdlc_ppp.c b/drivers/net/wan/hdlc_ppp.c index e9f7170..4591437 100644 --- a/drivers/net/wan/hdlc_ppp.c +++ b/drivers/net/wan/hdlc_ppp.c @@ -127,9 +127,7 @@ static int ppp_ioctl(struct net_device *dev, struct ifreq *ifr) if (result) return result; dev->hard_start_xmit = hdlc->xmit; - dev->hard_header = NULL; dev->type = ARPHRD_PPP; - dev->addr_len = 0; netif_dormant_off(dev); return 0; } diff --git a/drivers/net/wan/hdlc_raw.c b/drivers/net/wan/hdlc_raw.c index fe3cae5..e23bc66 100644 --- a/drivers/net/wan/hdlc_raw.c +++ b/drivers/net/wan/hdlc_raw.c @@ -88,10 +88,7 @@ static int raw_ioctl(struct net_device *dev, struct ifreq *ifr) return result; memcpy(hdlc->state, &new_settings, size); dev->hard_start_xmit = hdlc->xmit; - dev->hard_header = NULL; dev->type = ARPHRD_RAWHDLC; - dev->flags = IFF_POINTOPOINT | IFF_NOARP; - dev->addr_len = 0; netif_dormant_off(dev); return 0; } diff --git a/drivers/net/wan/hdlc_x25.c b/drivers/net/wan/hdlc_x25.c index e4bb9f8..cd7b22f 100644 --- a/drivers/net/wan/hdlc_x25.c +++ b/drivers/net/wan/hdlc_x25.c @@ -215,9 +215,7 @@ static int x25_ioctl(struct net_device *dev, struct ifreq *ifr) x25_rx, 0)) != 0) return result; dev->hard_start_xmit = x25_xmit; - dev->hard_header = NULL; d
Re: [Bugme-new] [Bug 8107] New: dev->header_cache_update has a random value
David Miller <[EMAIL PROTECTED]> writes:

> I disagree, you can't leave dangling references to functions
> which are potentially inside of unloaded modules, as this code
> does.

All such pointers were thought to be initialized by all HDLC protocol
handlers before device activation, but they were actually used by
the hdlc* code, and this one doesn't seem to...

> Rather, HDLC Cisco should implement a proper protocol destructor
> method to clean up these function pointers.

No, it wouldn't work - hdlc_cisco doesn't use it at all, it's just
a victim. But now I think there may be other victims.

It seems the only way to become non-NULL is through ether_setup()
from hdlc_raw_eth.c (Ethernet framing over HDLC).

I think it's best to NULLify it and the like in hdlc.c unconditionally,
it's slow path and we don't need another useless EXPORT_SYMBOL(s).
It would fix all such problems forever.

Compile-tested only but it seems pretty obvious and of course I check
if the packets still flow after regular kernel upgrades (and I run
automatic tests checking all protos except X.25 from time to time
as well).

(the patch is in the next message).

Not sure if 2.6.21 material.
-- 
Krzysztof Halasa
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 15:34:14 -0800

> Can you get FC fixed?

I am not the DHCP package maintainer. :-) I'm up to my ears already dealing with people trying to slug broken patches into the kernel networking that paper over application bugs. ;)

> Should we add a warning to kernel log, to make distro's fix it?

Unfortunately it looks like a properly formed sockaddr_ll, the ifindex is in fact zero, so there is nothing we can do to warn about this case.

The sockaddr_ll sits after the first sockaddr string in the ifreq, and the rest remains initialized to zeros, thus the bind() succeeds.
Re: [Bugme-new] [Bug 8107] New: dev->header_cache_update has a random value
From: Krzysztof Halasa <[EMAIL PROTECTED]>
Date: Sat, 03 Mar 2007 00:38:05 +0100

> Switching HDLC devices from Ethernet-framing mode caused stale ethernet
> function assignments within net_device.
>
> Signed-off-by: Krzysztof Halasa <[EMAIL PROTECTED]>

This looks good to me, I think I'll apply it :-)

Thanks!
Re: [PATCH] udp: whitespace fixes
From: Stephen Hemminger <[EMAIL PROTECTED]> Date: Fri, 2 Mar 2007 14:04:49 -0800 > > The udp code is full of bad indenting, extra whitespace and other > style confusion. It makes no sense to declare functions that are used > outside the current file (extern) as inline. > > Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> > --- > net/ipv4/udp.c | 402 > - > net/ipv6/udp.c | 175 +--- > 2 files changed, 295 insertions(+), 282 deletions(-) > > --- tcp-2.6.orig/net/ipv4/udp.c 2007-03-02 12:08:06.0 -0800 > +++ tcp-2.6/net/ipv4/udp.c2007-03-02 12:37:38.0 -0800 > @@ -120,8 +120,8 @@ > struct hlist_node *node; > > sk_for_each(sk, node, &udptable[num & (UDP_HTABLE_SIZE - 1)]) > - if (sk->sk_hash == num) > - return 1; > + if (sk->sk_hash == num) > + return 1; This turns tabs into spaces, it cannot be correct. Yoshifuji fixed all the whitespace problems under net/ already for 2.6.21 - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
tree plans...
I plan to cut a net-2.6.22 tree after I finish pushing the current round of 2.6.21 networking bug fixes to Linus.

I'll load the tcp-2.6 tree changes into net-2.6.22, and then we'll do all non-bug-fix development in the net-2.6.22 tree.

It may take some time for me to push out today's bug fixes because, due to the VLAN group allocation fix, I need to do an exhaustive build test with allmodconfig and the like to make sure no driver builds were accidentally broken by that change.

Thanks!
Re: [PATCH] net: Convert xtime.tv_sec to get_seconds()
From: James Morris <[EMAIL PROTECTED]>
Date: Tue, 27 Feb 2007 16:24:49 -0500 (EST)

> Where appropriate, convert references to xtime.tv_sec to the
> get_seconds() helper function.
>
> Signed-off-by: James Morris <[EMAIL PROTECTED]>

This looks great James, I'll apply it to net-2.6.22 once I set that tree up.

Thanks again.
Re: [PATCH 4/4] pktgen: fix device name handling
From: Robert Olsson <[EMAIL PROTECTED]>
Date: Wed, 28 Feb 2007 18:07:09 +0100

> Yes it seems be handle dev name change. So configuration scripts should
> use ifindex now :)
>
> Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>

I will apply all 4 of these patches to net-2.6.22, thanks everyone.
Re: Netem tfifo implementation
On 3/2/07, Stephen Hemminger <[EMAIL PROTECTED]> wrote: On Fri, 2 Mar 2007 15:56:54 -0500 "Ritesh Kumar" <[EMAIL PROTECTED]> wrote: > On 3/2/07, Patrick McHardy <[EMAIL PROTECTED]> wrote: > > Ritesh Kumar wrote: > > > Hi, > > >I recently saw the qdisc "tfifo" in the netem module > > > (net/sched/sch_netem.c) when I migrated some of my patches from 2.6.14 > > > to 2.6.20. As I understand, tfifo helps in keeping the queue of > > > packets sorted according to their "time_to_send". [tfifo was not > > > present in 2.6.14 perhaps because arrival order of packets was always > > > equal to the departure order]. However, tfifo uses a linear search in > > > the packet queue to find where to enqueue the packet. > > >Quite some time ago (2.6.14 era), I needed a similar functionality > > > from the netem module and I ended up coding a pointer based min-heap > > > for the same. I was wondering if the community was interested in using > > > the min-heap implementation to replace the linear search > > > implementation. I have tested the min-heap quite a few times and it > > > seems to work. > > >The implementation is slightly non-trivial because it uses > > > pointers to maintain the heap structure instead if using good old > > > fixed size arrays. I did this mainly so that the limit of the netem > > > qdisc could be changed on the fly. However, because every sk_buff now > > > needs two pointers for its children nodes, I added an extra > > > (sk_buff*)next2 to struct sk_buff (sorry!). However, this can probably > > > be changed to a pointer inside netem_skb_cb. Also, because I needed > > > this for personal work and 2.6.14 didn't contain tfifo, I basically > > > removed the embedded qdisc and made netem a classless qdisc with my > > > min heap as the native "queue" (sorry again! :) ) > > > > The tfifo qdisc has a limit, why not just allocate a fixed-size heap > > based on that? > > > > > > The tfifo queue limit itself can be changed and that creates the > problem. 
If we use a fixed heap (say implemented using a fixed size > array) then we will have to copy over all pointers from the first > array to a reallocated array whenever the queue limit is changed. > In retrospect, moving just a few 10s of kilobytes of data doesn't seem > that much of a problem... now I feel stupid having put so much effort > :). > Tfifo is a special case because: * timestamps are stored in skb->cb so it is only really usable inside netem that adds timestamps. * insertions are cheap because it walks backwards and netem usually has tnext > tlast. Only if you have a huge jitter which causes massive reordering and that is unrealistic, would you see a problem. You are right. A huge jitter inside a given flow is unrealistic in real networks. It can also cause artificial reordering. However, in our lab we use netem (with my changes) to enable per-flow delays. The per-flow delays that we use vary a lot and hence we have to go through some optimizations. Thanks for all the feedback. Ritesh - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
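The point about cheap insertions can be illustrated with a small user-space sketch. It assumes a doubly linked list with a sentinel node instead of the kernel's sk_buff queue; struct pkt, struct tfifo and the steps counter are hypothetical names used only for illustration, not the sch_netem.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet node; in netem the timestamp lives in skb->cb. */
struct pkt {
	unsigned long time_to_send;
	struct pkt *prev, *next;
};

struct tfifo {
	struct pkt head;     /* sentinel: head.next is first, head.prev is last */
	unsigned int steps;  /* nodes stepped over by the last enqueue */
};

static void tfifo_init(struct tfifo *q)
{
	q->head.prev = q->head.next = &q->head;
	q->steps = 0;
}

/*
 * Keep the queue sorted by time_to_send. The search starts at the tail,
 * so a packet due no earlier than the current last packet is linked in
 * without stepping over anything.
 */
static void tfifo_enqueue(struct tfifo *q, struct pkt *p)
{
	struct pkt *pos = q->head.prev;

	assert(p != NULL);
	q->steps = 0;
	while (pos != &q->head && pos->time_to_send > p->time_to_send) {
		pos = pos->prev;
		q->steps++;
	}

	/* Link p in right after pos. */
	p->prev = pos;
	p->next = pos->next;
	pos->next->prev = p;
	pos->next = p;
}
```

When time_to_send is mostly increasing, steps stays at zero; only heavy reordering, such as widely varying per-flow delays, makes the backward walk long, which is the case where a heap would start to pay off.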
Re: [PATCH] udp: whitespace fixes
Resend with less garbage... The udp code is full of bad indenting, extra whitespace and other style confusion. It makes no sense to declare functions that are used outside the current file (extern) as inline. Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]> --- net/ipv4/udp.c | 312 +++-- net/ipv6/udp.c | 153 +++ 2 files changed, 236 insertions(+), 229 deletions(-) --- tcp-2.6.orig/net/ipv4/udp.c 2007-03-02 16:25:12.0 -0800 +++ tcp-2.6/net/ipv4/udp.c 2007-03-02 16:41:04.0 -0800 @@ -136,13 +136,13 @@ */ int __udp_lib_get_port(struct sock *sk, unsigned short snum, struct hlist_head udptable[], int *port_rover, - int (*saddr_comp)(const struct sock *sk1, -const struct sock *sk2 )) + int (*saddr_comp)(const struct sock * sk1, +const struct sock * sk2)) { struct hlist_node *node; struct hlist_head *head; struct sock *sk2; - interror = 1; + int error = 1; write_lock_bh(&udp_hash_lock); if (snum == 0) { @@ -160,7 +160,8 @@ if (hlist_empty(head)) { if (result > sysctl_local_port_range[1]) result = sysctl_local_port_range[0] + - ((result - sysctl_local_port_range[0]) & + ((result - + sysctl_local_port_range[0]) & (UDP_HTABLE_SIZE - 1)); goto gotit; } @@ -175,12 +176,13 @@ ; } result = best; - for(i = 0; i < (1 << 16) / UDP_HTABLE_SIZE; i++, result += UDP_HTABLE_SIZE) { + for (i = 0; i < (1 << 16) / UDP_HTABLE_SIZE; +i++, result += UDP_HTABLE_SIZE) { if (result > sysctl_local_port_range[1]) result = sysctl_local_port_range[0] + ((result - sysctl_local_port_range[0]) & (UDP_HTABLE_SIZE - 1)); - if (! 
__udp_lib_lport_inuse(result, udptable)) + if (!__udp_lib_lport_inuse(result, udptable)) break; } if (i >= (1 << 16) / UDP_HTABLE_SIZE) @@ -194,9 +196,8 @@ if (sk2->sk_hash == snum && sk2 != sk&& (!sk2->sk_reuse|| !sk->sk_reuse) && - (!sk2->sk_bound_dev_if || !sk->sk_bound_dev_if || sk2->sk_bound_dev_if == sk->sk_bound_dev_if) && - (*saddr_comp)(sk, sk2) ) + (*saddr_comp)(sk, sk2)) goto fail; } inet_sk(sk)->num = snum; @@ -212,19 +213,19 @@ return error; } -__inline__ int udp_get_port(struct sock *sk, unsigned short snum, - int (*scmp)(const struct sock *, const struct sock *)) +int udp_get_port(struct sock *sk, unsigned short snum, +int (*scmp)(const struct sock *, const struct sock *)) { - return __udp_lib_get_port(sk, snum, udp_hash, &udp_port_rover, scmp); + return __udp_lib_get_port(sk, snum, udp_hash, &udp_port_rover, scmp); } -inline int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) +int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) { - struct inet_sock *inet1 = inet_sk(sk1), *inet2 = inet_sk(sk2); + const struct inet_sock *inet1 = inet_sk(sk1), *inet2 = inet_sk(sk2); - return ( !ipv6_only_sock(sk2) && - (!inet1->rcv_saddr || !inet2->rcv_saddr || - inet1->rcv_saddr == inet2->rcv_saddr )); + return !ipv6_only_sock(sk2) && + (!inet1->rcv_saddr || !inet2->rcv_saddr || +inet1->rcv_saddr == inet2->rcv_saddr); } static inline int udp_v4_get_port(struct sock *sk, unsigned short snum) @@ -253,27 +254,27 @@ if (inet->rcv_saddr) { if (inet->rcv_saddr != daddr) continue; - score+=2; + score += 2; } if (inet->daddr) { if (inet->daddr != saddr) continue; - score+=2; +
Re: [PATCH] udp: whitespace fixes
From: Stephen Hemminger <[EMAIL PROTECTED]>
Date: Fri, 2 Mar 2007 16:47:19 -0800

> Resend with less garbage...
>
> The udp code is full of bad indenting, extra whitespace and other
> style confusion. It makes no sense to declare functions that are used
> outside the current file (extern) as inline.
>
> Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>

Looks good, I'll try to apply this when I cut the net-2.6.22 tree.

Thanks.
Re: [PATCH 1/3]: Updates, removal of unsupported features and minor bug fixes.
Linsys Contractor Mithlesh Thukral wrote: NetXen: Updates, removal of unsupported features and minor bug fixes. Signed-off-by: Mithlesh Thukral <[EMAIL PROTECTED]> --- netxen_nic.h |4 + netxen_nic_ethtool.c | 144 +- netxen_nic_main.c |4 - netxen_nic_phan_reg.h |3 + 4 files changed, 34 insertions(+), 121 deletions(-) applied patches 1-2 of 3
Re: [PATCH 3/3] NetXen: Make driver use multi PCI functions
Linsys Contractor Mithlesh Thukral wrote:
> NetXen: Make driver use multi PCI functions.
>
> Signed-off by: Mithlesh Thukral <[EMAIL PROTECTED]>
> ---
>  netxen_nic.h          | 126 +---
>  netxen_nic_ethtool.c  |  80 +++
>  netxen_nic_hdr.h      |   8
>  netxen_nic_hw.c       | 213 +++-
>  netxen_nic_hw.h       |  18 -
>  netxen_nic_init.c     | 115 +++---
>  netxen_nic_isr.c      |  80 +++
>  netxen_nic_main.c     | 523 +-
>  netxen_nic_niu.c      |  27 +-
>  netxen_nic_phan_reg.h | 125 ---
>  10 files changed, 631 insertions(+), 684 deletions(-)

All three patches in this patchset contained nothing but one-line summaries of the changes included in them, and are overall very poorly and vaguely described.

This patch is far too big, with far too little description and justification to go along with it. If you are not going to make the effort to write a paragraph or two describing such huge changes, then I'm not going to make the effort to review and apply it.

NAK.
Re: [NET] Add support for Seeq 8003 on Challenge S Mezz board.
Ralf Baechle wrote: From: Ladislav Michl <[EMAIL PROTECTED]> Thanks to Jö Fahlke for donating hardware. Signed-off-by: Ladislav Michl <[EMAIL PROTECTED]> Forward porting of Ladis' 2.4 patch. Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]> applied to #upstream (2.6.22)
Re: [PATCH 1/2] tc35815 driver update (part 1)
Atsushi Nemoto wrote: Current tc35815 driver is very obsolete and less maintained for a long time. Replace it with a new driver based on one from CELF patch archive. It was for 2.6.10 kernel so some adjustment and cleanup are added. (remove config.h, SA_ to IRQF_ conversion, etc.) Major advantages are: * Independent of JMR3927. (Actually independent of MIPS, but AFAIK the chip is used only on MIPS platforms) * TX4938 support. * 64-bit proof. * Asynchronous and on-demand auto negotiation. * High performance on non-coherent architecture. * ethtool support. * Many bugfixes and cleanups. And next patch add further improvements/bugfixes/cleanups. Signed-off-by: Atsushi Nemoto <[EMAIL PROTECTED]> --- This is a patch against current linux-mips.org git-tree. drivers/net/Kconfig |3 drivers/net/tc35815.c | 2070 +++--- include/linux/pci_ids.h |1 3 files changed, 1440 insertions(+), 634 deletions(-) Would you be kind enough to a) provide a URL to a .c file (or post it, if it's under 100K) so that we may more easily review this b) combine both patches into a single patch. might as well, since it's a rewrite. c) rediff your patch against linux-2.6.git + Ralf's killall removal patch, and resend. There were some minor conflicting changes that appeared, though these changes will certainly become irrelevant once your new driver is merged. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bridge: avoid ptype_all packet handling
David Miller <[EMAIL PROTECTED]> writes:

> And in fact that effectively makes the new socket option
> pointless, since it doesn't buy us anything since we have
> to support the old stuff fully anyways.

I don't think it's pointless because it would still allow newer DHCP clients to have less impact on other packets when they are active.

This can matter when you have a system with multiple interfaces where DHCP doesn't get an address on one. That's pretty common with many x86 server boards because they come with two NICs by default but most people only plug the cable into one. However the distro installers run DHCP on all.

When this happens all packets are always forced through ptype_all chains before being rejected by AF_PACKET's device bind, which adds some overhead to them.

-Andi
Re: [PATCH] qla3xxx: bugfix for line omitted in previous patch.
Ron Mercer wrote: From 01751a39d7327acc28dabf4f68930b7e20b279d1 Mon Sep 17 00:00:00 2001 From: Ron Mercer <[EMAIL PROTECTED]> Date: Wed, 28 Feb 2007 16:42:17 -0800 Subject: [PATCH] [PATCH] qla3xxx: bugfix for line omitted in previous patch. This missing line caused transmit errors on the Qlogic 4032 chip. Signed-off-by: Ron Mercer <[EMAIL PROTECTED]> applied
Re: Network activity LED trigger
Florian Fainelli <[EMAIL PROTECTED]> writes:

> Hi All,
>
> I have been talking a bit with Richard, who is the LED API maintainer, and a
> LED trigger based on network activity would be something great.

You should be aware that normally the kernel doesn't see all packets on an Ethernet unless promiscuous mode is enabled (which it is normally not). That is because the hardware filters out all packets not for this host. A software-controlled LED wouldn't be equivalent to the activity LEDs you normally have on network cards; it would only show local traffic.

That said, if you want to get events for any in/outgoing packets you can use the same hooks as PF_PACKET uses for sniffing: dev_add_pack with ETH_P_ALL. That will get you all incoming and outgoing packets that are local. And when someone runs tcpdump it will suddenly see all packets, which might be unexpected.

-Andi
Re: [PATCH 2/3] bonding: only receive ARPs for us
Jay Vosburgh wrote:
> The ARP validation code only needs ARPs for the bonding device.
>
> Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]>

I seem to have lost the context of this. Did this get discussed, and need further revision?

The three patches from 2/28/2007 look OK to me, and I just wanted to make sure before applying them.

Jeff
[PATCH] div64_64 consolidate (rev3)
Here is the current version of the common 64-bit divide code. Since it is used three times by networking code, can we put it in the net-2.6.22 tree?

Signed-off-by: Stephen Hemminger <[EMAIL PROTECTED]>
---
 include/asm-arm/div64.h      |    2 ++
 include/asm-generic/div64.h  |    7 +++
 include/asm-i386/div64.h     |    2 ++
 include/asm-m68k/div64.h     |    1 +
 include/asm-mips/div64.h     |    2 ++
 include/asm-um/div64.h       |    1 +
 include/asm-xtensa/div64.h   |    4
 lib/Makefile                 |    5 +++--
 lib/div64.c                  |   22 ++
 net/ipv4/tcp_cubic.c         |   23 ---
 net/ipv4/tcp_yeah.c          |   21 -
 net/ipv4/tcp_yeah.h          |    1 +
 net/netfilter/xt_connbytes.c |   16
 13 files changed, 45 insertions(+), 62 deletions(-)

--- tcp-2.6.orig/include/asm-arm/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-arm/div64.h	2007-03-02 17:22:38.0 -0800
@@ -223,4 +223,6 @@
 #endif
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+
 #endif
--- tcp-2.6.orig/include/asm-generic/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-generic/div64.h	2007-03-02 17:22:38.0 -0800
@@ -30,6 +30,11 @@
 	__rem;							\
  })
+static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor)
+{
+	return dividend / divisor;
+}
+
 #elif BITS_PER_LONG == 32
 extern uint32_t __div64_32(uint64_t *dividend, uint32_t divisor);
@@ -49,6 +54,8 @@
 	__rem;							\
  })
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
+
 #else /* BITS_PER_LONG == ?? */
 # error do_div() does not yet support the C64
--- tcp-2.6.orig/include/asm-i386/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-i386/div64.h	2007-03-02 17:22:38.0 -0800
@@ -45,4 +45,6 @@
 	return dum2;
 }
+
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
 #endif
--- tcp-2.6.orig/include/asm-m68k/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-m68k/div64.h	2007-03-02 17:22:38.0 -0800
@@ -23,4 +23,5 @@
 	__rem;							\
 })
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
 #endif /* _M68K_DIV64_H */
--- tcp-2.6.orig/include/asm-mips/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-mips/div64.h	2007-03-02 17:22:38.0 -0800
@@ -78,6 +78,8 @@
 	__quot = __quot << 32 | __low;				\
 	(n) = __quot;						\
 	__mod; })
+
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
 #endif /* (_MIPS_SZLONG == 32) */
 #if (_MIPS_SZLONG == 64)
--- tcp-2.6.orig/include/asm-um/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-um/div64.h	2007-03-02 17:22:38.0 -0800
@@ -3,4 +3,5 @@
 #include "asm/arch/div64.h"
+extern uint64_t div64_64(uint64_t dividend, uint64_t divisor);
 #endif
--- tcp-2.6.orig/include/asm-xtensa/div64.h	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/include/asm-xtensa/div64.h	2007-03-02 17:22:38.0 -0800
@@ -16,4 +16,8 @@
 	n /= (unsigned int) base;			\
 	__res; })
+static inline uint64_t div64_64(uint64_t dividend, uint64_t divisor)
+{
+	return dividend / divisor;
+}
 #endif
--- tcp-2.6.orig/lib/Makefile	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/lib/Makefile	2007-03-02 17:22:38.0 -0800
@@ -4,7 +4,7 @@
 lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 rbtree.o radix-tree.o dump_stack.o \
-	 idr.o div64.o int_sqrt.o bitmap.o extable.o prio_tree.o \
+	 idr.o int_sqrt.o bitmap.o extable.o prio_tree.o \
 	 sha1.o irq_regs.o reciprocal_div.o
 lib-$(CONFIG_MMU) += ioremap.o
@@ -12,7 +12,8 @@
 lib-y += kobject.o kref.o kobject_uevent.o klist.o
-obj-y += sort.o parser.o halfmd4.o debug_locks.o random32.o bust_spinlocks.o
+obj-y += div64.o sort.o parser.o halfmd4.o debug_locks.o random32.o \
+	 bust_spinlocks.o
 ifeq ($(CONFIG_DEBUG_KOBJECT),y)
 CFLAGS_kobject.o += -DDEBUG
--- tcp-2.6.orig/lib/div64.c	2007-03-02 17:21:27.0 -0800
+++ tcp-2.6/lib/div64.c	2007-03-02 17:22:38.0 -0800
@@ -58,4 +58,26 @@
 EXPORT_SYMBOL(__div64_32);
+/* 64bit divisor, dividend and result. dynamic precision */
+uint64_t div64_64(uint64_t dividend, uint64_t divisor)
+{
+	uint32_t d = divisor;
+
+	if (divisor > 0xffffffffULL) {
+		unsigned int shift = fls(divisor >> 32);
+
+		d = divisor >> shift;
+		dividend >>= shift;
+	}
+
+	/* avoid 64 bit division if possible */
+	if (dividend >> 32)
+		do_div(dividend, d);
+	else
+		dividend = (uint32_t) dividend / d;
+
+	return dividend;
+}
+EXPORT_SYMBOL(div64_64);
+
 #endif /* BITS_PER
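For readers who want to try the "dynamic precision" behaviour outside the kernel, here is a user-space sketch of the same arithmetic. It assumes a plain 64-bit division in place of do_div() and a hand-written fls_u32() in place of the kernel's fls(); both replacements, and the _sketch suffix, are illustrative and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based index of the most significant set bit; 0 when x == 0. */
static unsigned int fls_u32(uint32_t x)
{
	unsigned int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/*
 * Same scheme as div64_64() in the patch: if the divisor does not fit
 * in 32 bits, shift divisor and dividend right by the same amount until
 * it does, then perform a 64-by-32 division.
 */
static uint64_t div64_64_sketch(uint64_t dividend, uint64_t divisor)
{
	uint32_t d = (uint32_t)divisor;

	assert(divisor != 0);
	if (divisor > 0xffffffffULL) {
		unsigned int shift = fls_u32((uint32_t)(divisor >> 32));

		d = (uint32_t)(divisor >> shift);
		dividend >>= shift;
	}
	return dividend / d;
}
```

For divisors that fit in 32 bits the result is exact. For larger divisors the low bits of both operands are shifted away, so the quotient can be off: with a dividend of 2^33 + 1 and a divisor of 2^32 + 1 the sketch returns 2, while the exact quotient is 1.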
Re: [PATCH] [USBNET] DM9501: Add Corega FEther USB-TXC support.
YOSHIFUJI Hideaki / 吉藤英明 wrote: Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]> --- drivers/usb/net/dm9601.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/usb/net/dm9601.c b/drivers/usb/net/dm9601.c index 4a932e1..c0bc52b 100644 --- a/drivers/usb/net/dm9601.c +++ b/drivers/usb/net/dm9601.c @@ -571,6 +571,10 @@ static const struct driver_info dm9601_info = { static const struct usb_device_id products[] = { { +USB_DEVICE(0x07aa, 0x9601),/* Corega FEther USB-TXC */ +.driver_info = (unsigned long)&dm9601_info, +}, + { ACK the patch, though I wonder if this shouldn't instead go to Greg. Honestly, I would prefer that the USB net drivers were moved into drivers/net with the other net drivers, /then/ I would merge such patches. We don't add drivers for PCI-based hardware to drivers/pci/net, after all... - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bonding-devel] [PATCH 2/3] bonding: only receive ARPs for us
Jeff Garzik <[EMAIL PROTECTED]> wrote: >Jay Vosburgh wrote: >> The ARP validation code only needs ARPs for the bonding device. >> >> Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]> > >I seem to have lost the context of this. Did this get discussed, and >need further revision? The further discussion can be (loosely) paraphrased as: Andy Gospodarek <[EMAIL PROTECTED]>: "Hey, this no workee with IPv6." Me: "True, but bonding no workee with IPv6 at all." Andy: "Oh, ok. Ack." After which followed some preliminary yakkage about fixing up said non-workee IPv6 support. -J --- -Jay Vosburgh, IBM Linux Technology Center, [EMAIL PROTECTED] - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bonding-devel] [PATCH 2/3] bonding: only receive ARPs for us
Jay Vosburgh wrote: Jeff Garzik <[EMAIL PROTECTED]> wrote: Jay Vosburgh wrote: The ARP validation code only needs ARPs for the bonding device. Signed-off-by: Jay Vosburgh <[EMAIL PROTECTED]> I seem to have lost the context of this. Did this get discussed, and need further revision? The further discussion can be (loosely) paraphrased as: Andy Gospodarek <[EMAIL PROTECTED]>: "Hey, this no workee with IPv6." Me: "True, but bonding no workee with IPv6 at all." Andy: "Oh, ok. Ack." After which followed some preliminary yakkage about fixing up said non-workee IPv6 support. thanks :) I'll make sure the 3 patches go into #upstream-fixes - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] [USBNET] DM9501: Add Corega FEther USB-TXC support.
On Fri, Mar 02, 2007 at 08:33:55PM -0500, Jeff Garzik wrote: > YOSHIFUJI Hideaki / wrote: > >Signed-off-by: YOSHIFUJI Hideaki <[EMAIL PROTECTED]> > >--- > > drivers/usb/net/dm9601.c |4 > > 1 files changed, 4 insertions(+), 0 deletions(-) > > > >diff --git a/drivers/usb/net/dm9601.c b/drivers/usb/net/dm9601.c > >index 4a932e1..c0bc52b 100644 > >--- a/drivers/usb/net/dm9601.c > >+++ b/drivers/usb/net/dm9601.c > >@@ -571,6 +571,10 @@ static const struct driver_info dm9601_info = { > > > > static const struct usb_device_id products[] = { > > { > >+ USB_DEVICE(0x07aa, 0x9601),/* Corega FEther USB-TXC */ > >+ .driver_info = (unsigned long)&dm9601_info, > >+ }, > >+{ > > > ACK the patch, though I wonder if this shouldn't instead go to Greg. > > Honestly, I would prefer that the USB net drivers were moved into > drivers/net with the other net drivers, /then/ I would merge such > patches. We don't add drivers for PCI-based hardware to > drivers/pci/net, after all... I have no objection to that. Things have been moving out of the drivers/usb/ directory over time, and if you want to take these under your umbrella too, that's fine with me. David, any objections? thanks, greg k-h - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] bridge: avoid ptype_all packet handling
From: Andi Kleen <[EMAIL PROTECTED]>
Date: 03 Mar 2007 03:14:29 +0100

> That's pretty common with many x86 server boards because
> they come with two NICs by default but most people only
> plug the cable into one. However the distro installers
> run DHCP on all.

Nope, that's not what I've seen them do, instead they run dhcp on interfaces that report a link being present.
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
David Miller <[EMAIL PROTECTED]> wrote: > > I'm tempted to say I must be missing something here, since I can't see > how this could possible work at all. The string passed in should > be interpreted as the ifindex value, and thus trigger a -ENODEV > return from AF_PACKET's bind() implementation. This is using packet_bind_spkt which uses a name instead of ifindex. As you may recall, I've made a patch to convert it to use the new (actually it's not-so-new anymore) AF_PACKET interface. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Sat, 03 Mar 2007 16:38:45 +1100

> This is using packet_bind_spkt which uses a name instead of ifindex.

So it should be just fine, it should be binding to a specific device (by name instead of ifindex) and therefore it should only trigger the pt_all hook when the packet arrives on that specific device.

> As you may recall, I've made a patch to convert it to use the new
> (actually it's not-so-new anymore) AF_PACKET interface.

That's right.

So it's still a mystery why dhcp is causing bridge devices to trigger the network tap paths on Stephen's machine.
Re: ppp and routing table rules.
On Thu, 01 Mar 2007, Ben Greear wrote: > Ben Greear wrote: > > I am sending udp packets through ppp400, and I see them appear on ppp401 > as expected. > > The thing that is bothering me is that all I see on rddVR4 (172.1.2.1) > is arps for 172.1.2.2, but the 'tell' IP is that of the > originating ppp400 link, not the IP of rddVR4, as I expected: > > 21:47:16.119640 arp who-has 172.1.2.2 tell 11.1.1.3 > 21:47:17.119371 arp who-has 172.1.2.2 tell 11.1.1.3 > 21:47:18.119254 arp who-has 172.1.2.2 tell 11.1.1.3 > 21:47:19.273118 arp who-has 172.1.2.2 tell 11.1.1.3 > > Unless I'm missing something dumb, a similar setup with all ethernet-ish > network devices > works fine. > > I have also enabled arp filtering: > # Only answer ARPs if it is for the IP on our own interface. > echo 2 > /proc/sys/net/ipv4/conf/all/arp_ignore > and for every device used in these routing tables: > echo 1 > /proc/sys/net/ipv4/conf/[dev]/arp_filter > > Any idea what I need to do in order to make the source IP for the ARP > packet correct? Wouldn't that be controlled by arp_announce? arp_announce - INTEGER Define different restriction levels for announcing the local source IP address from IP packets in ARP requests sent on interface: 0 - (default) Use any local address, configured on any interface 1 - Try to avoid local addresses that are not in the target's subnet for this interface. This mode is useful when target hosts reachable via this interface require the source IP address in ARP requests to be part of their logical network configured on the receiving interface. When we generate the request we will check all our subnets that include the target IP and will preserve the source address if it is from such subnet. If there is no such subnet we select source address according to the rules for level 2. 2 - Always use the best local address for this target. In this mode we ignore the source address in the IP packet and try to select local address that we prefer for talks with the target host. 
Such local address is selected by looking for primary IP addresses on all our subnets on the outgoing interface that include the target IP address. If no suitable local address is found we select the first local address we have on the outgoing interface or on all other interfaces, with the hope we will receive reply for our request and even sometimes no matter the source IP address we announce. The max value from conf/{all,interface}/arp_announce is used. Increasing the restriction level gives more chance for receiving answer from the resolved target while decreasing the level announces more valid sender's information. -Bill - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4]: Kill fastpath_{skb,cnt}_hint.
From: Baruch Even <[EMAIL PROTECTED]>
Date: Thu, 1 Mar 2007 20:13:40 +0200

> If you take this approach it makes sense to also remove the sorting of
> SACKs, the traversal of the SACK blocks will not start from the
> beginning anyway which was the reason for this sorting in the first
> place.
>
> One drawback for this approach is that you now walk the entire sack
> block when you advance one packet. If you consider a 10,000 packet queue
> which had several losses at the beginning and a large sack block that
> advances from the middle to the end you'll walk a lot of packets for
> that one last stretch of a sack block.
>
> One way to handle that is to use the still existing sack fast path to
> detect this case and calculate what is the sequence number to search
> for. Since you know what was the end_seq that was handled last, you can
> search for it as the start_seq and go on from there. Does it make sense?

Thanks for the feedback and these great ideas.

BTW, I think I figured out a way to get rid of lost_{skb,cnt}_hint.
The fact of the matter in this case is that the setting of the tag
bits always propagates from front of the queue onward. We don't get
holes mid-way.

So what we can do is search the RB-tree for high_seq and walk
backwards. Once we hit something with TCPCB_TAGBITS set, we stop
processing as there are no earlier SKBs which we'd need to do anything
with.

Do you see any problems with that idea?

scoreboard_skb_hint is a little bit trickier, but it is a similar case
to the tcp_lost_skb_hint case. Except here the termination condition
is a relative timeout instead of a sequence number and packet count
test.

Perhaps for that we can remember some state from the
tcp_mark_head_lost() we do first. In fact, we can start the queue
walk from the latest packet which tcp_mark_head_lost() marked with a
tag bit.

Basically these two algorithms are saying:

1) Mark up to smallest of 'lost' or tp->high_seq.
2) Mark packets after those processed in #1 which have timed out.

Right?
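The backwards walk from high_seq can be illustrated with a small model. This is a sketch under stated assumptions, not kernel code: a sorted Python list with bisect stands in for the RB-tree, the Segment class and its single `tagged` flag are hypothetical stand-ins for an SKB with any of TCPCB_TAGBITS set, and both the 'lost' packet-count bound and sequence-number wraparound are ignored.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int              # starting sequence number of the segment
    tagged: bool = False  # stand-in for any TCPCB_TAGBITS bit being set


def mark_lost_backwards(queue, high_seq):
    """Tag untagged segments below high_seq, walking from high_seq backwards.

    queue must be sorted by seq. The walk stops at the first segment that is
    already tagged, relying on the invariant that tags propagate from the
    front of the queue with no holes. Returns the number of segments tagged.
    """
    # A real tree would find this position in O(log n); building the key
    # list here is only a convenience for the sketch.
    seqs = [s.seq for s in queue]
    i = bisect.bisect_left(seqs, high_seq) - 1  # last segment below high_seq

    marked = 0
    while i >= 0 and not queue[i].tagged:
        queue[i].tagged = True
        marked += 1
        i -= 1
    return marked


queue = [Segment(100, True), Segment(200, True),
         Segment(300), Segment(400), Segment(500)]
newly_marked = mark_lost_backwards(queue, high_seq=500)
```

In this example the walk starts at the segment with seq 400, tags 400 and 300, and stops at 200 because it is already tagged; the segment at high_seq itself is left alone.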
Re: [RFC 1/2] bridge: avoid ptype_all packet handling
On Fri, Mar 02, 2007 at 09:59:05PM -0800, David Miller wrote:
>
> So it's still a mystery why dhcp is causing bridge devices
> to trigger the network tap paths on Stephen's machine.

If this is the ISC DHCP daemon then perhaps it's because Stephen didn't
specify an interface for it to listen on? By default it'll enumerate
all broadcast interfaces and listen to each one of them.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] bridge: avoid ptype_all packet handling
David Miller wrote:
> From: Andi Kleen <[EMAIL PROTECTED]>
> Date: 03 Mar 2007 03:14:29 +0100
>
>> That's pretty common with many x86 server boards because they come
>> with two NICs by default but most people only plug the cable into one.
>> However the distro installers run DHCP on all.
>
> Nope, that's not what I've seen them do, instead they run dhcp on
> interfaces that report a link being present.

Actually, it may be even simpler... I start bridge with a script and
there was still a dhclient left over running on the original interface.

It was an interesting exercise, and I have new tools to help, but still
no magic bullet to get up to full line rate.