Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
* David Miller <[EMAIL PROTECTED]> wrote:

> > furthermore, the tweak allows the shifting of processing from a
> > prioritized process context into a highest-priority softirq context.
> > (it's not proven that there is any significant /net win/ of
> > performance: all that was proven is that if we shift TCP processing
> > from process context into softirq context then TCP throughput of
> > that otherwise penalized process context increases.)
>
> If we preempt with any packets in the backlog, we send no ACKs and the
> sender cannot send thus the pipe empties. That's the problem, this
> has nothing to do with scheduler priorities or stuff like that IMHO.
> The argument goes that if the reschedule is delayed long enough, the
> ACKs will exceed the round trip time and trigger retransmits which
> will absolutely kill performance.

yes, but i disagree a bit about the characterisation of the problem. The
question in my opinion is: how is TCP processing prioritized for this
particular socket, which is attached to the process context which was
preempted.

normally quite a bit of TCP processing happens in a softirq context (in
fact most of it happens there), and softirq contexts have no fairness
whatsoever - they preempt whatever processing is going on, regardless of
any priority preferences of the user!

what was observed here were the effects of completely throttling TCP
processing for a given socket. I think such throttling can in fact be
desirable: there is a /reason/ why the process context was preempted: in
that load scenario there was 10 times more processing requested from the
CPU than it can possibly service. It's a serious overload situation and
it's the scheduler's task to prioritize between workloads!

normally such kind of "throttling" of the TCP stack for this particular
socket does not happen. Note that there's no performance lost: we don't
do TCP processing because there are /9 other tasks for this CPU to run/,
and the scheduler has a tough choice.

Now i agree that there are more intelligent ways to throttle and less
intelligent ways to throttle, but the notion to allow a given workload
'steal' CPU time from other workloads by allowing it to push its
processing into a softirq is i think unfair. (and this issue is
partially addressed by my softirq threading patches in -rt :-)

        Ingo
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: pktgen
On 11/30/06, David Miller <[EMAIL PROTECTED]> wrote:
> From: Alexey Dobriyan <[EMAIL PROTECTED]>
> Date: Wed, 29 Nov 2006 23:04:37 +0300
>
> > Looks like worker thread strategically clears it if scheduled at wrong
> > moment.
> >
> > --- a/net/core/pktgen.c
> > +++ b/net/core/pktgen.c
> > @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
> >
> >          init_waitqueue_head(&t->queue);
> >
> > -        t->control &= ~(T_TERMINATE);
> >          t->control &= ~(T_RUN);
> >          t->control &= ~(T_STOP);
> >          t->control &= ~(T_REMDEVALL);
>
> Good catch Alexey.
>
> Did you rerun the load/unload test with this fix applied?
> If it fixes things, I'll merge it.

Well, yes, it fixes things, except that Ctrl+C getting you out of the
modprobe/rmmod loop will spit a backtrace again. And the other flags,
T_RUN and T_STOP: clearing them is not needed due to kzalloc() and
creates bugs, as demonstrated.

Give me some time.
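The race Alexey describes can be illustrated with a small user-space sketch. This is a toy model and not pktgen code: `toy_thread`, the `TOY_T_*` flags and both `toy_worker_init_*` helpers are hypothetical names, and the "late scheduling" of the worker is simulated by simply calling the init helper after the terminate request has been recorded.

```c
#include <assert.h>

/* Hypothetical stand-ins for the pktgen control flags. */
enum {
    TOY_T_TERMINATE = 1 << 0,
    TOY_T_RUN       = 1 << 1
};

struct toy_thread {
    unsigned int control;
};

/* Worker start-up before the patch: T_TERMINATE is cleared on entry. */
static void toy_worker_init_old(struct toy_thread *t)
{
    t->control &= ~TOY_T_TERMINATE;
    t->control &= ~TOY_T_RUN;
}

/* Worker start-up after the patch: T_TERMINATE is left alone. */
static void toy_worker_init_new(struct toy_thread *t)
{
    t->control &= ~TOY_T_RUN;
}

/*
 * Returns 1 if a terminate request issued before the worker first runs
 * is still visible once the worker has started, 0 if it was lost.
 */
static int toy_terminate_survives(void (*init)(struct toy_thread *))
{
    struct toy_thread t = { 0 };      /* models a zeroed allocation */

    t.control |= TOY_T_TERMINATE;     /* module unload asks for termination */
    init(&t);                         /* worker gets scheduled only now */
    assert((t.control & TOY_T_RUN) == 0);
    return (t.control & TOY_T_TERMINATE) != 0;
}
```

With the old start-up the request is silently dropped; with the new one it survives, because the zeroed allocation already guarantees a clean initial state.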
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Thu, 30 Nov 2006 07:47:58 +0100

> furthermore, the tweak allows the shifting of processing from a
> prioritized process context into a highest-priority softirq context.
> (it's not proven that there is any significant /net win/ of performance:
> all that was proven is that if we shift TCP processing from process
> context into softirq context then TCP throughput of that otherwise
> penalized process context increases.)

If we preempt with any packets in the backlog, we send no ACKs and the
sender cannot send thus the pipe empties. That's the problem, this
has nothing to do with scheduler priorities or stuff like that IMHO.
The argument goes that if the reschedule is delayed long enough, the
ACKs will exceed the round trip time and trigger retransmits which
will absolutely kill performance.

The only reason we block input packet processing while we hold this
lock is because we don't want the receive queue changing from
underneath us while we're copying data to userspace.

Furthermore once you preempt in this particular way, no input
packet processing occurs in that socket still, exacerbating the
situation.

Anyways, even if we somehow unlocked the socket and ran the backlog at
preemption points, by hand, since we've thus deferred the whole work
of processing whatever is in the backlog until the preemption point,
we've lost our quantum already, so it's perhaps not legal to do the
deferred processing as the preemption signalling point from a fairness
perspective.

It would be different if we really did the packet processing at the
original moment (where we had to queue to the socket backlog because
it was locked, in softirq) because then we'd return from the softirq
and hit the preemption point earlier or whatever.

Therefore, perhaps the best would be to see if there is a way we can
still allow input packet processing even while running the majority of
TCP's recvmsg(). It won't be easy :)
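The backlog behaviour described above can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: `toy_sock` and the `toy_*` helpers are hypothetical names, "processing" a packet is reduced to bumping a counter, and preemption is modelled as packets arriving while the socket stays owned.

```c
#include <assert.h>

/* Toy model of a socket that is either owned by a process or free. */
struct toy_sock {
    int owned;      /* 1 while a process context holds the socket lock */
    int backlog;    /* packets queued but not yet TCP-processed */
    int processed;  /* packets fully processed (an ACK could go out) */
};

/* Softirq side: a packet arrives for this socket. */
static void toy_rx(struct toy_sock *sk)
{
    if (sk->owned)
        sk->backlog++;      /* deferred: no ACK is generated yet */
    else
        sk->processed++;    /* handled immediately */
}

static void toy_lock(struct toy_sock *sk)
{
    assert(!sk->owned);
    sk->owned = 1;
}

/* Releasing the socket is the point where the backlog gets run. */
static void toy_release(struct toy_sock *sk)
{
    assert(sk->owned);
    sk->processed += sk->backlog;
    sk->backlog = 0;
    sk->owned = 0;
}

/*
 * The owner is preempted with the socket locked while n packets arrive.
 * Returns how many packets sat unprocessed just before the release.
 */
static int toy_preempted_window(struct toy_sock *sk, int n)
{
    int stuck;

    toy_lock(sk);
    for (int i = 0; i < n; i++)
        toy_rx(sk);
    stuck = sk->backlog;
    toy_release(sk);
    return stuck;
}
```

In this model every packet that arrives during the preempted window stays unacknowledged until the owner runs again, which is the stall the message refers to.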
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
* David Miller <[EMAIL PROTECTED]> wrote:

> This is why my suggestion is to preempt_disable() as soon as we grab
> the socket lock, [...]

independently of the issue at hand, in general the explicit use of
preempt_disable() in non-infrastructure code is quite a heavy tool. Its
effects are heavy and global: it disables /all/ preemption (even on
PREEMPT_RT). Furthermore, when preempt_disable() is used for per-CPU
data structures then [unlike, for example, with a spin-lock] the
connection between the 'data' and the 'lock' is not explicit - causing
all kinds of grief when trying to convert such code to a different
preemption model. (such as PREEMPT_RT :-)

So my plan is to remove all "open-coded" use of preempt_disable() [and
raw use of local_irq_save/restore] from the kernel and replace it with
some facility that connects data and lock. (Note that this will not
result in any actual changes on the instruction level because internally
every such facility still maps to preempt_disable() on non-PREEMPT_RT
kernels, so on non-PREEMPT_RT kernels such code will still be the same
as before.)

        Ingo
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
* David Miller <[EMAIL PROTECTED]> wrote:

> > yeah, i like this one. If the problem is "too long locked section",
> > then the most natural solution is to "break up the lock", not to
> > "boost the priority of the lock-holding task" (which is what the
> > proposed patch does).
>
> Ingo you're mis-read the problem :-)

yeah, the problem isn't too long locked section but "too much time spent
holding a lock" and hence opening up ourselves to possible negative
side-effects of the scheduler's fairness algorithm when it forces a
preemption of that process context with that lock held (and forcing all
subsequent packets to be backlogged).

but please read my last mail - i think i'm slowly starting to wake up
;-) I don't think there is any real problem: a tweak to the scheduler
that in essence gives TCP-using tasks a preference changes the balance
of workloads. Such an explicit tweak is possible already.

furthermore, the tweak allows the shifting of processing from a
prioritized process context into a highest-priority softirq context.
(it's not proven that there is any significant /net win/ of performance:
all that was proven is that if we shift TCP processing from process
context into softirq context then TCP throughput of that otherwise
penalized process context increases.)

        Ingo
Re: Bug 7596 - Potential performance bottleneck for Linxu TCP
* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > Attached is the detailed description of the problem and one possible
> > solution.
>
> Thanks. The attachment will be too large for the mailing-list servers
> so I uploaded a copy to
> http://userweb.kernel.org/~akpm/Linux-TCP-Bottleneck-Analysis-Report.pdf
>
> From a quick peek it appears that you're getting around 10%
> improvement in TCP throughput, best case.

Wenji, have you tried to renice the receiving task (to say nice -20) and
see how much TCP throughput you get in "background load of 10.0"?
(similarly, you could also renice the background load tasks to nice +19
and/or set their scheduling policy to SCHED_BATCH)

as far as i can see, the numbers in the paper and the patch prove the
following two points:

 - a task doing TCP receive with 10 other tasks running on the CPU will
   see lower TCP throughput than if it had the CPU for itself alone.

 - a patch that tweaks the scheduler to give the receiving task more
   timeslices (i.e. raises its nice level in essence) results in ...
   more timeslices, which results in higher receive numbers ...

so the most important thing to check would be, before any scheduler and
TCP code change is considered: if you give the task higher priority
/explicitly/, via nice -20, do the numbers improve? Similarly, if all
the other "background load" tasks are reniced to nice +19 (or their
policy is set to SCHED_BATCH), do you get a similar improvement?

        Ingo
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Thu, 30 Nov 2006 07:17:58 +0100

>
> * David Miller <[EMAIL PROTECTED]> wrote:
>
> > We can make explicit preemption checks in the main loop of
> > tcp_recvmsg(), and release the socket and run the backlog if
> > need_resched() is TRUE.
> >
> > This is the simplest and most elegant solution to this problem.
>
> yeah, i like this one. If the problem is "too long locked section", then
> the most natural solution is to "break up the lock", not to "boost the
> priority of the lock-holding task" (which is what the proposed patch
> does).

Ingo you're mis-read the problem :-)

The issue is that we actually don't hold any locks that prevent
preemption, so we can take preemption points which the TCP code
wasn't designed with in-mind.

Normally, we control the sleep point very carefully in the TCP
sendmsg/recvmsg code, such that when we sleep we drop the socket
lock and process the backlog packets that accumulated while the
socket was locked.

With pre-emption we can't control that properly.

The problem is that we really do need to run the backlog any time
we give up the cpu in the sendmsg/recvmsg path, or things get real
erratic. ACKs don't go out as early as we'd like them to, etc.

It isn't easy to do generically, perhaps, because we can only
drop the socket lock at certain points and we need to do that to
run the backlog.

This is why my suggestion is to preempt_disable() as soon as we
grab the socket lock, and explicitly test need_resched() at places
where it is absolutely safe, like this:

        if (need_resched()) {
                /* Run packet backlog... */
                release_sock(sk);
                schedule();
                lock_sock(sk);
        }

The socket lock is just a by-hand binary semaphore, so it doesn't
block pre-emption. We have to be able to sleep while holding it.
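The effect of the explicit need_resched() check can be sketched with a toy receive loop in user-space C. This is an illustration under stated assumptions, not tcp_recvmsg(): `toy_rcv`, `toy_run_backlog` and `toy_recv_loop` are hypothetical names, one packet is assumed to arrive per copy iteration, and a reschedule request is assumed to be pending every `resched_every` iterations.

```c
#include <assert.h>

/* Hypothetical toy state; these are not kernel structures. */
struct toy_rcv {
    int backlog;      /* packets deferred while the socket is locked */
    int max_backlog;  /* high-water mark of the backlog */
    int flushes;      /* number of times the backlog was run */
};

/* Models what release_sock() does to the backlog. */
static void toy_run_backlog(struct toy_rcv *r)
{
    r->backlog = 0;
    r->flushes++;
}

/*
 * Toy receive loop. With `check` set, the loop mirrors the
 * release_sock(); schedule(); lock_sock(); sequence by running the
 * backlog whenever a reschedule is pending. Without it, the backlog
 * only drains at the final release.
 */
static struct toy_rcv toy_recv_loop(int steps, int resched_every, int check)
{
    struct toy_rcv r = { 0, 0, 0 };

    assert(steps >= 0 && resched_every >= 0);
    for (int i = 1; i <= steps; i++) {
        r.backlog++;                    /* packet arrives while locked */
        if (r.backlog > r.max_backlog)
            r.max_backlog = r.backlog;
        if (check && resched_every && i % resched_every == 0)
            toy_run_backlog(&r);        /* release, schedule, relock */
    }
    toy_run_backlog(&r);                /* final release_sock() */
    return r;
}
```

In this model the check bounds how many packets can pile up between two points where ACKs may go out; the unchecked loop lets the backlog grow for the whole duration of the call.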
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
* Wenji Wu <[EMAIL PROTECTED]> wrote:

> > That yield() will need to be removed - yield()'s behaviour is truly
> > awful if the system is otherwise busy. What is it there for?
>
> Please read the uploaded paper, which has detailed description.

do you have any URL for that?

        Ingo
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
* David Miller <[EMAIL PROTECTED]> wrote:

> We can make explicit preemption checks in the main loop of
> tcp_recvmsg(), and release the socket and run the backlog if
> need_resched() is TRUE.
>
> This is the simplest and most elegant solution to this problem.

yeah, i like this one. If the problem is "too long locked section", then
the most natural solution is to "break up the lock", not to "boost the
priority of the lock-holding task" (which is what the proposed patch
does).

[ Also note that "sprinkle the code with preempt_disable()" kind of
  solutions, besides hurting interactivity, are also a pain to resolve
  in something like PREEMPT_RT. (unlike say a spinlock,
  preempt_disable() is quite opaque in what data structure it protects,
  etc., making it hard to convert it to a preemptible primitive) ]

> The one suggested in your patch and paper are way overkill, there is
> no reason to solve a TCP specific problem inside of the generic
> scheduler.

agreed.

What we could also add is a /reverse/ mechanism to the scheduler: a task
could query whether it has just a small amount of time left in its
timeslice, and could in that case voluntarily drop its current lock and
yield, and thus give up its current timeslice and wait for a new, full
timeslice, instead of being forcibly preempted due to lack of timeslices
with a possibly critical lock still held.

But the suggested solution here, to "prolong the running of this task
just a little bit longer" only starts a perpetual arms race between
users of such a facility and other kernel subsystems. (besides not being
adequate anyway, there can always be /so/ long lock-hold times that the
scheduler would have no other option but to preempt the task)

        Ingo
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
On Wed, 2006-11-29 at 17:08 -0800, Andrew Morton wrote:

> +                if (p->backlog_flag == 0) {
> +                        if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> +                                enqueue_task(p, rq->expired);
> +                                if (p->static_prio < rq->best_expired_prio)
> +                                        rq->best_expired_prio = p->static_prio;
> +                        } else
> +                                enqueue_task(p, rq->active);
> +                } else {
> +                        if (expired_starving(rq)) {
> +                                enqueue_task(p,rq->expired);
> +                                if (p->static_prio < rq->best_expired_prio)
> +                                        rq->best_expired_prio = p->static_prio;
> +                        } else {
> +                                if (!TASK_INTERACTIVE(p))
> +                                        p->extrarun_flag = 1;
> +                                enqueue_task(p,rq->active);
> +                        }
> +                }

(oh my, doing that to the scheduler upsets my tummy, but that aside...)

I don't see how that can really solve anything. "Interactive" tasks
starting to use cpu heftily can still preempt and keep the special cased
cpu hog off the cpu for ages. It also only takes one task in the
expired array to trigger the forced array switch with a fully loaded
cpu, and once any task hits the expired array, a stream of wakeups can
prevent the switch from completing for as long as you can keep wakeups
happening.

        -Mike
Re: [Madwifi-devel] ar5k and Atheros AR5005G
Hi.

Michael Buesch wrote:

IIRC Pavel already explained that getting rid of the HAL per se should
be no problem - it could easily be dissolved into the driver, if that is
one of the requirements to be fulfilled before the driver (MadWifi or
DadWifi) is considered for mainline inclusion. As soon as there is
source available to dissolve, at least.

Ok, so who actually does the work? The MadWifi team?

It won't happen today or tomorrow, but I'm confident that it will
happen. Any contribution to that effort is highly welcome - the more
people help, the faster the goal will be reached.

From what I understood the "... once the hal issue is resolved" part of
David's mail referred to exactly that question.

Ok, I don't know what "The HAL Issue" (tm) is.

You referred to the archives where that exact "issue"(s) (binary-only,
non-free, no sources, unwanted level of abstraction) has/have been
discussed at length, but you claim you didn't have a clue what David was
talking about? Come on.

Bye, Mike
Re: Linux 2.6.19
On Wed, Nov 29, 2006 at 06:15:37PM -0800, David Miller wrote:
> In fact it does, the NDISC code is using MAX_HEADER incorrectly. It
> needs to explicitly allocate space for the struct ipv6hdr in 'len'.
> Luckily the TCP ipv6 code was doing it right.
>
> What a horrible bug, this patch should fix it. Let me know
> if it doesn't, thanks:

Yes, that fixes it up for me, thanks.

Phil
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: 2.6.19-rc6-mm2: uli526x only works after reload
On Thu, 30 Nov 2006 02:04:15 +0100
"Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> > > git-netdev-all.patch
> > > git-netdev-all-fixup.patch
> > > libphy-dont-do-that.patch
> >
> > Are you able to eliminate libphy-dont-do-that.patch?
> >
> > > Is a broken-out version of git-netdev-all.patch available from somewhere?
> >
> > Nope, and my few fumbling attempts to generate the sort of patch series
> > which you want didn't work out too well. One has to downgrade to
> > git-bisect :(
> >
> > What does "doesn't work" mean, btw?
>
> Well, it turns out not to be 100% reproducible. I can only reproduce it after
> a soft reboot (eg. shutdown -r now).
>
> Then, while configuring network interfaces the system says the interface name
> is ethxx0, but it should be eth1 (eth0 is an RTL-8139, which is not used).
> Now if I run ifconfig, it says:
>
> eth0: error fetching interface information: Device not found
>
> and that's all (normally, ifconfig would show the information for lo and eth1,
> without eth0). Moreover, 'ifconfig eth1' says:
>
> eth1: error fetching interface information: Device not found
>
> Next, I run 'rmmod uli526x' and 'modprobe uli526x' and then 'ifconfig' is
> still saying the above (about eth0), but 'ifconfig eth1' seems to work as
> it should. However, the interface often fails to transfer anything after
> that.

Lovely. Sounds like some startup race, perhaps against userspace.

Is CONFIG_PCI_MULTITHREAD_PROBE set? (err, we meant to disable that for
2.6.19 but forgot).
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
From: Wenji Wu <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 19:56:58 -0600

> >We could also pepper tcp_recvmsg() with some very carefully placed
> >preemption disable/enable calls to deal with this even with
> >CONFIG_PREEMPT enabled.
>
> I also think about this approach. But since the "problem" happens in
> the 2.6 Desktop and Low-latency Desktop (not server), system
> responsiveness is a key feature, simply placing preemption
> disabled/enable call might not work. If you want to place
> preemption disable/enable calls within tcp_recvmsg, you have to put
> them in the very beginning and end of the call. Disabling preemption
> would degrade system responsiveness.

We can make explicit preemption checks in the main loop of
tcp_recvmsg(), and release the socket and run the backlog if
need_resched() is TRUE.

This is the simplest and most elegant solution to this problem.

The one suggested in your patch and paper are way overkill, there is
no reason to solve a TCP specific problem inside of the generic
scheduler.
Re: Linux 2.6.19
From: Phil Oester <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 17:49:04 -0800

> Getting an oops on boot here, caused by commit
> e81c73596704793e73e6dbb478f41686f15a4b34 titled
> "[NET]: Fix MAX_HEADER setting".
>
> Reverting that patch fixes things up for me. Dave?

I suspect that it might be because I removed the IPV6 ifdef from the
list, but I can't imagine why that would matter other than due to a
bug in the IPV6 stack.

Indeed. Looking at ndisc_send_rs() I wonder if it miscalculates 'len'
or similar and the old MAX_HEADER setting was merely papering around
this bug.

In fact it does, the NDISC code is using MAX_HEADER incorrectly. It
needs to explicitly allocate space for the struct ipv6hdr in 'len'.
Luckily the TCP ipv6 code was doing it right.

What a horrible bug, this patch should fix it. Let me know
if it doesn't, thanks:

commit c28728decc37fe52c8cdf48b3e0c0cf9b0c2fefb
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Wed Nov 29 18:14:47 2006 -0800

    [IPV6] NDISC: Calculate packet length correctly for allocation.

    MAX_HEADER does not include the ipv6 header length in it,
    so we need to add it in explicitly.

    Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 73eb8c3..c42d4c2 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -441,7 +441,8 @@ static void ndisc_send_na(struct net_dev
         struct sk_buff *skb;
         int err;
 
-        len = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
+        len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr) +
+                sizeof(struct in6_addr);
 
         /* for anycast or proxy, solicited_addr != src_addr */
         ifp = ipv6_get_ifaddr(solicited_addr, dev, 1);
@@ -556,7 +557,8 @@ void ndisc_send_ns(struct net_device *de
         if (err < 0)
                 return;
 
-        len = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
+        len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr) +
+                sizeof(struct in6_addr);
         send_llinfo = dev->addr_len && !ipv6_addr_any(saddr);
         if (send_llinfo)
                 len += ndisc_opt_addr_space(dev);
@@ -632,7 +634,7 @@ void ndisc_send_rs(struct net_device *de
         if (err < 0)
                 return;
 
-        len = sizeof(struct icmp6hdr);
+        len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr);
         if (dev->addr_len)
                 len += ndisc_opt_addr_space(dev);
 
@@ -1381,7 +1383,8 @@ void ndisc_send_redirect(struct sk_buff
                          struct in6_addr *target)
 {
         struct sock *sk = ndisc_socket->sk;
-        int len = sizeof(struct icmp6hdr) + 2 * sizeof(struct in6_addr);
+        int len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr) +
+                2 * sizeof(struct in6_addr);
         struct sk_buff *buff;
         struct icmp6hdr *icmph;
         struct in6_addr saddr_buf;
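The size of the shortfall fixed by this patch can be worked out with plain arithmetic. The sketch below is a user-space illustration, not kernel code: the `TOY_*` constants and `toy_*` helpers are hypothetical names, and the sizes are the fixed IPv6 header (40 bytes), the 8-byte ICMPv6 header and a 16-byte IPv6 address, with link-layer address options left out.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed protocol sizes, used here in place of the kernel structs. */
enum {
    TOY_IPV6_HDR_LEN  = 40,  /* fixed IPv6 header */
    TOY_ICMP6_HDR_LEN = 8,   /* ICMPv6 header incl. 4 message-specific bytes */
    TOY_IN6_ADDR_LEN  = 16   /* one IPv6 address */
};

/* Neighbour advertisement length as computed before the patch. */
static size_t toy_na_len_old(void)
{
    return TOY_ICMP6_HDR_LEN + TOY_IN6_ADDR_LEN;
}

/* Neighbour advertisement length with the IPv6 header accounted for. */
static size_t toy_na_len_fixed(void)
{
    return TOY_IPV6_HDR_LEN + TOY_ICMP6_HDR_LEN + TOY_IN6_ADDR_LEN;
}

/* Bytes that the old calculation failed to reserve. */
static size_t toy_na_shortfall(void)
{
    assert(toy_na_len_fixed() >= toy_na_len_old());
    return toy_na_len_fixed() - toy_na_len_old();
}
```

Under these assumptions the old calculation reserves 24 bytes where 64 are needed, which is consistent with the observation that a larger MAX_HEADER had been papering over the missing 40 bytes.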
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
> That yield() will need to be removed - yield()'s behaviour is truly
> awful if the system is otherwise busy. What is it there for?

Please read the uploaded paper, which has detailed description.

thanks,
wenji

----- Original Message -----
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wednesday, November 29, 2006 7:08 pm
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

> On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
> David Miller <[EMAIL PROTECTED]> wrote:
>
> >
> > Please, it is very difficult to review your work the way you have
> > submitted this patch as a set of 4 patches. These patches have not
> > been split up "logically", but rather they have been split up "per
> > file" with the same exact changelog message in each patch posting.
> > This is very clumsy, and impossible to review, and wastes a lot of
> > mailing list bandwidth.
> >
> > We have an excellent file, called Documentation/SubmittingPatches, in
> > the kernel source tree, which explains exactly how to do this
> > correctly.
> >
> > By splitting your patch into 4 patches, one for each file touched,
> > it is impossible to review your patch as a logical whole.
> >
> > Please also provide your patch inline so people can just hit reply
> > in their mail reader client to quote your patch and comment on it.
> > This is impossible with the attachments you've used.
> >
>
> Here you go - joined up, cleaned up, ported to mainline and test-compiled.
>
> That yield() will need to be removed - yield()'s behaviour is truly
> awful if the system is otherwise busy. What is it there for?
>
>
>
> From: Wenji Wu <[EMAIL PROTECTED]>
>
> For Linux TCP, when the network application make system call to move
> data from socket's receive buffer to user space by calling tcp_recvmsg().
> The socket will be locked. During this period, all the incoming packet
> for the TCP socket will go to the backlog queue without being TCP
> processed
>
> Since Linux 2.6 can be interrupted mid-task, if the network application
> expires, and moved to the expired array with the socket locked, all the
> packets within the backlog queue will not be TCP processed till the
> network application resume its execution. If the system is heavily
> loaded, TCP can easily RTO in the Sender Side.
>
>
>
>  include/linux/sched.h |    2 ++
>  kernel/fork.c         |    3 +++
>  kernel/sched.c        |   24 ++++++++++++++++++------
>  net/ipv4/tcp.c        |    9 +++++++++
>  4 files changed, 32 insertions(+), 6 deletions(-)
>
> diff -puN net/ipv4/tcp.c~tcp-speedup net/ipv4/tcp.c
> --- a/net/ipv4/tcp.c~tcp-speedup
> +++ a/net/ipv4/tcp.c
> @@ -1109,6 +1109,8 @@ int tcp_recvmsg(struct kiocb *iocb, stru
>          struct task_struct *user_recv = NULL;
>          int copied_early = 0;
>
> +        current->backlog_flag = 1;
> +
>          lock_sock(sk);
>
>          TCP_CHECK_TIMER(sk);
> @@ -1468,6 +1470,13 @@ skip_copy:
>
>          TCP_CHECK_TIMER(sk);
>          release_sock(sk);
> +
> +        current->backlog_flag = 0;
> +        if (current->extrarun_flag == 1){
> +                current->extrarun_flag = 0;
> +                yield();
> +        }
> +
>          return copied;
>
>  out:
> diff -puN include/linux/sched.h~tcp-speedup include/linux/sched.h
> --- a/include/linux/sched.h~tcp-speedup
> +++ a/include/linux/sched.h
> @@ -1023,6 +1023,8 @@ struct task_struct {
>  #ifdef CONFIG_TASK_DELAY_ACCT
>          struct task_delay_info *delays;
>  #endif
> +        int backlog_flag;   /* packets wait in tcp backlog queue flag */
> +        int extrarun_flag;  /* extra run flag for TCP performance */
>  };
>
>  static inline pid_t process_group(struct task_struct *tsk)
> diff -puN kernel/sched.c~tcp-speedup kernel/sched.c
> --- a/kernel/sched.c~tcp-speedup
> +++ a/kernel/sched.c
> @@ -3099,12 +3099,24 @@ void scheduler_tick(void)
>
>                  if (!rq->expired_timestamp)
>                          rq->expired_timestamp = jiffies;
> -                if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> -                        enqueue_task(p, rq->expired);
> -                        if (p->static_prio < rq->best_expired_prio)
> -                                rq->best_expired_prio = p->static_prio;
> -                } else
> -                        enqueue_task(p, rq->active);
> +                if (p->backlog_flag == 0) {
> +                        if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> +                                enqueue_task(p, rq->expired);
> +                                if (p->static_prio < rq->best_expired_prio)
> +                                        rq->best_expired_prio = p->static_prio;
> +                        } else
> +                                enqueue_task(p, rq->active);
> +                } else {
> +                        if (expired_starving(rq)) {
> +                                enqueue_task(p,rq->expired);
> +                                if (p->static_prio < rq->best_expired_prio)
> +                                        rq->best_expired_prio = p->static_
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
Yes, when CONFIG_PREEMPT is disabled, the "problem" won't happen. That
is why I put "for 2.6 desktop, low-latency desktop" in the uploaded
paper. This "problem" happens in the 2.6 Desktop and Low-latency
Desktop.

>We could also pepper tcp_recvmsg() with some very carefully placed preemption
>disable/enable calls to deal with this even with CONFIG_PREEMPT enabled.

I also think about this approach. But since the "problem" happens in the
2.6 Desktop and Low-latency Desktop (not server), system responsiveness
is a key feature, simply placing preemption disabled/enable call might
not work. If you want to place preemption disable/enable calls within
tcp_recvmsg, you have to put them in the very beginning and end of the
call. Disabling preemption would degrade system responsiveness.

wenji

----- Original Message -----
From: David Miller <[EMAIL PROTECTED]>
Date: Wednesday, November 29, 2006 7:13 pm
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 29 Nov 2006 17:08:35 -0800
>
> > On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
> > David Miller <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > Please, it is very difficult to review your work the way you have
> > > submitted this patch as a set of 4 patches. These patches have not
> > > been split up "logically", but rather they have been split up "per
> > > file" with the same exact changelog message in each patch posting.
> > > This is very clumsy, and impossible to review, and wastes a lot of
> > > mailing list bandwidth.
> > >
> > > We have an excellent file, called Documentation/SubmittingPatches, in
> > > the kernel source tree, which explains exactly how to do this
> > > correctly.
> > >
> > > By splitting your patch into 4 patches, one for each file touched,
> > > it is impossible to review your patch as a logical whole.
> > >
> > > Please also provide your patch inline so people can just hit reply
> > > in their mail reader client to quote your patch and comment on it.
> > > This is impossible with the attachments you've used.
> > >
> >
> > Here you go - joined up, cleaned up, ported to mainline and test-compiled.
> >
> > That yield() will need to be removed - yield()'s behaviour is truly awful
> > if the system is otherwise busy. What is it there for?
>
> What about simply turning off CONFIG_PREEMPT to fix this "problem"?
>
> We always properly run the backlog (by doing a release_sock()) before
> going to sleep otherwise except for the specific case of taking a page
> fault during the copy to userspace. It is only CONFIG_PREEMPT that
> can cause this situation to occur in other circumstances as far as I
> can see.
>
> We could also pepper tcp_recvmsg() with some very carefully placed
> preemption disable/enable calls to deal with this even with
> CONFIG_PREEMPT enabled.
>
Re: [PATCH][IPSEC][3/7] inter address family ipsec tunnel
Hello,

I found a bug in my previous patch for af_key. The patch breaks
transport mode. This is a fixed version.

Signed-off-by: Miika Komu <[EMAIL PROTECTED]>
Signed-off-by: Diego Beltrami <[EMAIL PROTECTED]>
Signed-off-by: Kazunori Miyazawa <[EMAIL PROTECTED]>

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 4e18309..0e1dbfb 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1767,11 +1767,11 @@ #endif
 		/* addresses present only in tunnel mode */
 		if (t->mode == XFRM_MODE_TUNNEL) {
-			switch (xp->family) {
+			struct sockaddr *sa;
+			sa = (struct sockaddr *)(rq+1);
+			switch(sa->sa_family) {
 			case AF_INET:
-				sin = (void*)(rq+1);
-				if (sin->sin_family != AF_INET)
-					return -EINVAL;
+				sin = (struct sockaddr_in*)sa;
 				t->saddr.a4 = sin->sin_addr.s_addr;
 				sin++;
 				if (sin->sin_family != AF_INET)
@@ -1780,9 +1780,7 @@ #endif
 				break;
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 			case AF_INET6:
-				sin6 = (void *)(rq+1);
-				if (sin6->sin6_family != AF_INET6)
-					return -EINVAL;
+				sin6 = (struct sockaddr_in6*)sa;
 				memcpy(t->saddr.a6, &sin6->sin6_addr, sizeof(struct in6_addr));
 				sin6++;
 				if (sin6->sin6_family != AF_INET6)
@@ -1793,7 +1791,10 @@ #endif
 			default:
 				return -EINVAL;
 			}
-		}
+			t->encap_family = sa->sa_family;
+		} else
+			t->encap_family = xp->family;
+
 		/* No way to set this via kame pfkey */
 		t->aalgos = t->ealgos = t->calgos = ~0;
 		xp->xfrm_nr++;
@@ -1830,18 +1831,25 @@ static inline int pfkey_xfrm_policy2sec_
 static int pfkey_xfrm_policy2msg_size(struct xfrm_policy *xp)
 {
+	struct xfrm_tmpl *t;
 	int sockaddr_size = pfkey_sockaddr_size(xp->family);
-	int socklen = (xp->family == AF_INET ?
-		       sizeof(struct sockaddr_in) :
-		       sizeof(struct sockaddr_in6));
+	int socklen = 0;
+	int i;
+
+	for (i=0; i<xp->xfrm_nr; i++) {
+		t = xp->xfrm_vec + i;
+		socklen += (t->encap_family == AF_INET ?
+			    sizeof(struct sockaddr_in) :
+			    sizeof(struct sockaddr_in6));
+	}
 	return sizeof(struct sadb_msg) +
 		(sizeof(struct sadb_lifetime) * 3) +
 		(sizeof(struct sadb_address) * 2) +
 		(sockaddr_size * 2) +
 		sizeof(struct sadb_x_policy) +
-		(xp->xfrm_nr * (sizeof(struct sadb_x_ipsecrequest) +
-				(socklen * 2))) +
+		(xp->xfrm_nr * sizeof(struct sadb_x_ipsecrequest)) +
+		(socklen * 2) +
 		pfkey_xfrm_policy2sec_ctx_size(xp);
 }
@@ -1999,7 +2007,9 @@ #endif
 		req_size = sizeof(struct sadb_x_ipsecrequest);
 		if (t->mode == XFRM_MODE_TUNNEL)
-			req_size += 2*socklen;
+			req_size += ((t->encap_family == AF_INET ?
+				     sizeof(struct sockaddr_in) :
+				     sizeof(struct sockaddr_in6)) * 2);
 		else
 			size -= 2*socklen;
 		rq = (void*)skb_put(skb, req_size);
@@ -2015,7 +2025,7 @@ #endif
 		rq->sadb_x_ipsecrequest_level = IPSEC_LEVEL_USE;
 		rq->sadb_x_ipsecrequest_reqid = t->reqid;
 		if (t->mode == XFRM_MODE_TUNNEL) {
-			switch (xp->family) {
+			switch (t->encap_family) {
 			case AF_INET:
 				sin = (void*)(rq+1);
 				sin->sin_family = AF_INET;
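The size change in pfkey_xfrm_policy2msg_size() can be illustrated outside the kernel. The following is only a user-space sketch, assuming a simplified template structure: toy_tmpl, toy_addr_len() and toy_total_socklen() are hypothetical names, not kernel API. It shows why the tunnel address space has to be summed per template once a policy may mix IPv4 and IPv6 tunnel endpoints, instead of multiplying a single per-policy socklen by the template count.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for the one field of struct xfrm_tmpl used here. */
struct toy_tmpl {
	int encap_family;	/* AF_INET or AF_INET6 */
};

/* Bytes needed for one tunnel endpoint address of the given family. */
static size_t toy_addr_len(int family)
{
	return family == AF_INET ? sizeof(struct sockaddr_in)
				 : sizeof(struct sockaddr_in6);
}

/*
 * Space for both tunnel endpoints (source and destination), summed
 * template by template so that mixed address families are sized
 * correctly.
 */
static size_t toy_total_socklen(const struct toy_tmpl *vec, int nr)
{
	size_t total = 0;
	int i;

	assert(nr >= 0);
	for (i = 0; i < nr; i++)
		total += 2 * toy_addr_len(vec[i].encap_family);
	return total;
}
```

The patch accumulates socklen first and multiplies by two in the return expression; the sketch multiplies inside the loop, which gives the same total.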
Re: [PATCH] [IPVS] transparent proxying
Hi Wensong,

Thanks for your comments.

> I see that this patch probably makes IPVS code a bit complicated and
> packet traversing less efficiently.

In my opinion, there is no need to worry about a side effect on packet
throughput.

First, marked packets rarely appear in the NF_IP_FORWARD chain; people
who mark packets for network administration usually do so on the
NF_IP_LOCAL_IN or NF_IP_OUTPUT chain.

Second, the new hook function is called after the ipvs SNAT hook
function, and it passes through the packets already handled by the
latter by simply checking the ipvs_property flag, so it does not
disturb the SNAT job.

Third, the new hook function is just a thin wrapper around ip_vs_in().
All packets that go through NF_IP_LOCAL_IN are already checked by
ip_vs_in(), whether or not they are related to a virtual server, so why
should we mind that a comparatively small number of packets going
through NF_IP_FORWARD are checked too?

> If I remember correctly, policy-based routing can work with IPVS in
> kernel 2.2 and 2.4 for transparent cache cluster for a long time. It
> should work in kernel 2.6 too.

Indeed, policy routing can help too, but the patch provides a native
way to deploy a transparent proxy, and this way does not disturb the
existing network configuration, such as policy routing settings,
iptables rules, etc.
Re: pktgen
From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 23:04:37 +0300

> Looks like worker thread strategically clears it if scheduled at wrong
> moment.
>
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
>
>  	init_waitqueue_head(&t->queue);
>
> -	t->control &= ~(T_TERMINATE);
>  	t->control &= ~(T_RUN);
>  	t->control &= ~(T_STOP);
>  	t->control &= ~(T_REMDEVALL);

Good catch Alexey. Did you rerun the load/unload test with this fix
applied? If it fixes things, I'll merge it.

Thanks!
Re: [NET_SCHED 07/06]: Fix endless loops (part 5): netem/tbf/hfsc ->requeue failures
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Mon, 20 Nov 2006 16:01:03 +0100 > I forgot to fix one (AFAICT purely theoretical) case .. Also applied to net-2.6.20, thanks a lot Patrick.
Re: [NET_SCHED 05/06]: Fix endless loops (part 3): HFSC
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:46 +0100 (MET)

> [NET_SCHED]: Fix endless loops (part 3): HFSC
>
> Convert HFSC to use qdisc_tree_decrease_qlen() and add a callback
> for deactivating a class when its child queue becomes empty.
>
> All queue purging goes through hfsc_purge_queue(), which is used in
> three cases: grafting, class creation (when a leaf class is turned
> into an intermediate class by attaching a new class) and class
> deletion. In all cases qdisc_tree_decrease_qlen() is needed.
>
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20, thanks.
Re: [NET_SCHED 03/06]: Fix endless loops caused by inaccurate qlen counters (part 1)
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Mon, 20 Nov 2006 14:08:41 +0100 (MET) > [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters (part 1) > > There are multiple problems related to qlen adjustment that can lead > to an upper qdisc getting out of sync with the real number of packets > queued, leading to endless dequeueing attempts by the upper layer code. > > All qdiscs must maintain an accurate q.qlen counter. There are basically > two groups of operations affecting the qlen: operations that propagate > down the tree (enqueue, dequeue, requeue, drop, reset) beginning at the > root qdisc and operations only affecting a subtree or single qdisc > (change, graft, delete class). Since qlen changes during operations from > the second group don't propagate to ancestor qdiscs, their qlen values > become desynchronized. > > This patch adds a function to propagate qlen changes up the qdisc tree, > optionally calling a callback function to perform qdisc-internal > maintenance when the child qdisc becomes empty. The follow-up patches > will convert all qdiscs to use this function where necessary. > > Noticed by Timo Steinbach <[EMAIL PROTECTED]>. > > Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> Applied to net-2.6.20, thanks a lot. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
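The upward propagation described in the changelog above can be sketched in a few lines of user-space C. This is a toy model under stated assumptions, not the kernel implementation: struct toy_qdisc, toy_tree_decrease_qlen() and the child_empty hook are hypothetical names, and a plain parent pointer stands in for the parent classid lookup that the real code needs. The caller is assumed to have already reduced the qlen of the qdisc it operated on; the helper then fixes up every ancestor.

```c
#include <assert.h>
#include <stddef.h>

/* Toy queueing discipline: only the fields needed for the illustration. */
struct toy_qdisc {
	struct toy_qdisc *parent;	/* NULL for the root qdisc */
	unsigned int qlen;		/* packets queued in this subtree */
	/* Optional hook, run when a direct child has become empty. */
	void (*child_empty)(struct toy_qdisc *self, struct toy_qdisc *child);
};

/* Example hook for demonstration: counts how often a child ran empty. */
static unsigned int toy_empty_events;

static void toy_note_empty(struct toy_qdisc *self, struct toy_qdisc *child)
{
	(void)self;
	(void)child;
	toy_empty_events++;
}

/*
 * 'n' packets were removed from 'sch' by an operation that does not
 * start at the root (change, graft, class deletion). The caller has
 * already updated sch->qlen; walk up the tree and subtract n from
 * every ancestor so their counters stay in sync.
 */
static void toy_tree_decrease_qlen(struct toy_qdisc *sch, unsigned int n)
{
	struct toy_qdisc *child = sch;
	struct toy_qdisc *p;

	if (n == 0)
		return;

	for (p = sch->parent; p != NULL; child = p, p = p->parent) {
		/* Let the parent deactivate a class whose queue is empty. */
		if (child->qlen == 0 && p->child_empty != NULL)
			p->child_empty(p, child);
		assert(p->qlen >= n);
		p->qlen -= n;
	}
}
```

Without such a walk, a parent would keep reporting packets that no longer exist below it, which is the desynchronization that leads to the endless dequeue attempts mentioned in the changelog.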
Re: [NET_SCHED 06/06]: Fix endless loops (part 4): HTB
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:48 +0100 (MET)

> [NET_SCHED]: Fix endless loops (part 4): HTB
>
> Convert HTB to use qdisc_tree_decrease_qlen() and add a callback
> for deactivating a class when its child queue becomes empty.
>
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20
Re: [NET_SCHED 04/06]: Fix endless loops (part 2): "simple" qdiscs
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Mon, 20 Nov 2006 14:08:44 +0100 (MET) > [NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs > > Convert the "simple" qdiscs to use qdisc_tree_decrease_qlen() where > necessary: > > - all graft operations > - destruction of old child qdiscs in prio, red and tbf change operation > - purging of queue in sfq change operation > > Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> Applied to net-2.6.20
Re: [NET_SCHED 01/06]: sch_htb: perform qlen adjustment immediately in ->delete
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Mon, 20 Nov 2006 14:08:37 +0100 (MET) > [NET_SCHED]: sch_htb: perform qlen adjustment immediately in ->delete > > qlen adjustment should happen immediately in ->delete and not in the > class destroy function because the reference count will not hit zero in > ->delete (sch_api holds a reference) but in ->put. Since the qdisc > lock is released between deletion of the class and final destruction > this creates an externally visible error in the qlen counter. > > Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> Applied to net-2.6.20
Re: [NET_SCHED 02/06]: Set parent classid in default qdiscs
From: Patrick McHardy <[EMAIL PROTECTED]> Date: Mon, 20 Nov 2006 14:08:38 +0100 (MET) > [NET_SCHED]: Set parent classid in default qdiscs > > Set parent classids in default qdiscs to allow walking up the tree > from outside the qdiscs. This is needed by the next patch. > > Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]> Applied to net-2.6.20, thanks Patrick.
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
From: Andrew Morton <[EMAIL PROTECTED]> Date: Wed, 29 Nov 2006 17:08:35 -0800 > On Wed, 29 Nov 2006 16:53:11 -0800 (PST) > David Miller <[EMAIL PROTECTED]> wrote: > > > > > Please, it is very difficult to review your work the way you have > > submitted this patch as a set of 4 patches. These patches have not > > been split up "logically", but rather they have been split up "per > > file" with the same exact changelog message in each patch posting. > > This is very clumsy, and impossible to review, and wastes a lot of > > mailing list bandwith. > > > > We have an excellent file, called Documentation/SubmittingPatches, in > > the kernel source tree, which explains exactly how to do this > > correctly. > > > > By splitting your patch into 4 patches, one for each file touched, > > it is impossible to review your patch as a logical whole. > > > > Please also provide your patch inline so people can just hit reply > > in their mail reader client to quote your patch and comment on it. > > This is impossible with the attachments you've used. > > > > Here you go - joined up, cleaned up, ported to mainline and test-compiled. > > That yield() will need to be removed - yield()'s behaviour is truly awful > if the system is otherwise busy. What is it there for? What about simply turning off CONFIG_PREEMPT to fix this "problem"? We always properly run the backlog (by doing a release_sock()) before going to sleep otherwise except for the specific case of taking a page fault during the copy to userspace. It is only CONFIG_PREEMPT that can cause this situation to occur in other circumstances as far as I can see. We could also pepper tcp_recvmsg() with some very carefully placed preemption disable/enable calls to deal with this even with CONFIG_PREEMPT enabled. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
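The backlog behaviour discussed above can be modelled in user space. The sketch below is a single-threaded toy under stated assumptions, not kernel code: struct toy_sock and the toy_* functions are hypothetical names that only mimic the roles of lock_sock(), the softirq receive path and release_sock(). It shows that segments arriving while the socket is owned by a process are only processed (and so only ACKed) when the owner releases the socket.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BACKLOG_MAX 64

/* Toy socket: 'owned' models a process holding the socket lock. */
struct toy_sock {
	bool owned;
	int backlog[TOY_BACKLOG_MAX];	/* sequence numbers waiting */
	int backlog_len;
	int acked;			/* segments processed so far */
};

/* Stand-in for TCP processing of one segment (would trigger an ACK). */
static void toy_process(struct toy_sock *sk, int seq)
{
	(void)seq;
	sk->acked++;
}

/* Process context takes ownership, as tcp_recvmsg() does on entry. */
static void toy_lock_sock(struct toy_sock *sk)
{
	assert(!sk->owned);
	sk->owned = true;
}

/*
 * Receive path: process immediately if nobody owns the socket,
 * otherwise defer to the backlog. Returns false if the toy backlog
 * is full and the segment is dropped.
 */
static bool toy_rcv(struct toy_sock *sk, int seq)
{
	if (!sk->owned) {
		toy_process(sk, seq);
		return true;
	}
	if (sk->backlog_len == TOY_BACKLOG_MAX)
		return false;
	sk->backlog[sk->backlog_len++] = seq;
	return true;
}

/* Releasing the socket runs the backlog before giving up ownership. */
static void toy_release_sock(struct toy_sock *sk)
{
	int i;

	assert(sk->owned);
	for (i = 0; i < sk->backlog_len; i++)
		toy_process(sk, sk->backlog[i]);
	sk->backlog_len = 0;
	sk->owned = false;
}
```

In this model, a process that is preempted between toy_lock_sock() and toy_release_sock() leaves every later segment sitting in the backlog, which is the situation the thread is arguing about.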
Re: 2.6.19-rc6-mm2: uli526x only works after reload
On Thursday, 30 November 2006 00:26, Andrew Morton wrote: > On Thu, 30 Nov 2006 00:08:21 +0100 > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote: > > > On Wednesday, 29 November 2006 22:31, Rafael J. Wysocki wrote: > > > On Wednesday, 29 November 2006 22:30, Andrew Morton wrote: > > > > On Wed, 29 Nov 2006 21:08:00 +0100 > > > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote: > > > > > > > > > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote: > > > > > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote: > > > > > > > > > > > > > > Temporarily at > > > > > > > > > > > > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/ > > > > > > > > > > > > > > Will appear eventually at > > > > > > > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/ > > > > > > > > > > > > A minor issue: on one of my (x86-64) test boxes the uli526x driver > > > > > > doesn't > > > > > > work when it's first loaded. I have to rmmod and modprobe it to > > > > > > make it work. > > > > > > > > That isn't a minor issue. > > > > > > > > > > It worked just fine on -mm1, so something must have happened to it > > > > > > recently. > > > > > > > > > > Sorry, I was wrong. The driver doesn't work at all, even after > > > > > reload. > > > > > > > > > > > > > tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2. But you're > > > > not using that (corrent?) > > > > > > > > git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using > > > > that either. > > > > > > > > git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using > > > > that. > > > > > > > > Beats me, sorry. Perhaps it's due to changes in networking core. It's > > > > presumably a showstopper for statically-linked-uli526x users. If you > > > > could > > > > bisect it, please? I'd start with git-netdev-all, then tulip-*. > > > > > > OK, but it'll take some time. > > > > OK, done. 
> > > > It's one of these (the first one alone doesn't compile): > > > > git-netdev-all.patch > > git-netdev-all-fixup.patch > > libphy-dont-do-that.patch > > Are you able to eliminate libphy-dont-do-that.patch? > > > Is a broken-out version of git-netdev-all.patch available from somewhere? > > Nope, and my few fumbling attempts to generate the sort of patch series > which you want didn't work out too well. One has to downgrade to > git-bisect :( > > What does "doesn't work" mean, btw? Well, it turns out not to be 100% reproducible. I can only reproduce it after a soft reboot (eg. shutdown -r now). Then, while configuring network interfaces the system says the interface name is ethxx0, but it should be eth1 (eth0 is an RTL-8139, which is not used). Now if I run ifconfig, it says: eth0: error fetching interface information: Device not found and that's all (normally, ifconfig would show the information for lo and eth1, without eth0). Moreover, 'ifconfig eth1' says: eth1: error fetching interface information: Device not found Next, I run 'rmmod uli526x' and 'modprobe uli526x' and then 'ifconfig' is still saying the above (about eth0), but 'ifconfig eth1' seems to work as it should. However, the interface often fails to transfer anything after that. Greetings, Rafael -- You never change things by fighting the existing reality. R. Buckminster Fuller - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

>
> Please, it is very difficult to review your work the way you have
> submitted this patch as a set of 4 patches. These patches have not
> been split up "logically", but rather they have been split up "per
> file" with the same exact changelog message in each patch posting.
> This is very clumsy, and impossible to review, and wastes a lot of
> mailing list bandwidth.
>
> We have an excellent file, called Documentation/SubmittingPatches, in
> the kernel source tree, which explains exactly how to do this
> correctly.
>
> By splitting your patch into 4 patches, one for each file touched,
> it is impossible to review your patch as a logical whole.
>
> Please also provide your patch inline so people can just hit reply
> in their mail reader client to quote your patch and comment on it.
> This is impossible with the attachments you've used.
>

Here you go - joined up, cleaned up, ported to mainline and test-compiled.

That yield() will need to be removed - yield()'s behaviour is truly awful
if the system is otherwise busy. What is it there for?

From: Wenji Wu <[EMAIL PROTECTED]>

For Linux TCP, when the network application makes a system call to move
data from the socket's receive buffer to user space by calling
tcp_recvmsg(), the socket is locked. During this period, all incoming
packets for the TCP socket go to the backlog queue without being TCP
processed.

Since Linux 2.6 can be interrupted mid-task, if the network application's
timeslice expires and it is moved to the expired array with the socket
locked, the packets in the backlog queue will not be TCP processed until
the network application resumes its execution. If the system is heavily
loaded, TCP can easily RTO on the sender side.

 include/linux/sched.h |2 ++
 kernel/fork.c |3 +++
 kernel/sched.c| 24 ++--
 net/ipv4/tcp.c|9 +
 4 files changed, 32 insertions(+), 6 deletions(-)

diff -puN net/ipv4/tcp.c~tcp-speedup net/ipv4/tcp.c
--- a/net/ipv4/tcp.c~tcp-speedup
+++ a/net/ipv4/tcp.c
@@ -1109,6 +1109,8 @@ int tcp_recvmsg(struct kiocb *iocb, stru
 	struct task_struct *user_recv = NULL;
 	int copied_early = 0;
+	current->backlog_flag = 1;
+
 	lock_sock(sk);
 	TCP_CHECK_TIMER(sk);
@@ -1468,6 +1470,13 @@ skip_copy:
 	TCP_CHECK_TIMER(sk);
 	release_sock(sk);
+
+	current->backlog_flag = 0;
+	if (current->extrarun_flag == 1){
+		current->extrarun_flag = 0;
+		yield();
+	}
+
 	return copied;
 out:
diff -puN include/linux/sched.h~tcp-speedup include/linux/sched.h
--- a/include/linux/sched.h~tcp-speedup
+++ a/include/linux/sched.h
@@ -1023,6 +1023,8 @@ struct task_struct {
 #ifdef CONFIG_TASK_DELAY_ACCT
 	struct task_delay_info *delays;
 #endif
+	int backlog_flag;	/* packets wait in tcp backlog queue flag */
+	int extrarun_flag;	/* extra run flag for TCP performance */
 };
 static inline pid_t process_group(struct task_struct *tsk)
diff -puN kernel/sched.c~tcp-speedup kernel/sched.c
--- a/kernel/sched.c~tcp-speedup
+++ a/kernel/sched.c
@@ -3099,12 +3099,24 @@ void scheduler_tick(void)
 		if (!rq->expired_timestamp)
 			rq->expired_timestamp = jiffies;
-		if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
-			enqueue_task(p, rq->expired);
-			if (p->static_prio < rq->best_expired_prio)
-				rq->best_expired_prio = p->static_prio;
-		} else
-			enqueue_task(p, rq->active);
+		if (p->backlog_flag == 0) {
+			if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
+				enqueue_task(p, rq->expired);
+				if (p->static_prio < rq->best_expired_prio)
+					rq->best_expired_prio = p->static_prio;
+			} else
+				enqueue_task(p, rq->active);
+		} else {
+			if (expired_starving(rq)) {
+				enqueue_task(p,rq->expired);
+				if (p->static_prio < rq->best_expired_prio)
+					rq->best_expired_prio = p->static_prio;
+			} else {
+				if (!TASK_INTERACTIVE(p))
+					p->extrarun_flag = 1;
+				enqueue_task(p,rq->active);
+			}
+		}
 	} else {
 		/*
 		 * Prevent a too long timeslice allowing a task to monopolize
diff -puN kernel/fork.c~tcp-speedup kernel/fork.c
--- a/kernel/fork.c~tcp-speedup
+++ a/kernel/fork.c
@@ -1032,6 +1032,9 @@ static struct task_struct *copy_process(
 	clear_tsk_thread_flag(p,
Re: Bug 7596 - Potential performance bottleneck for Linxu TCP
The delays dealt with in your paper might actually help a highly
loaded server with lots of sockets and threads trying to communicate.

The packet processing delays caused by the scheduling delay pace the
TCP sender by controlling the rate at which ACKs go back to that
sender. Those ACKs will go out paced to the rate at which the
sleeping TCP receiver gets back onto the cpu, and this will cause the
TCP sender to naturally adjust to the overall processing rate of the
receiver system, on a per-connection basis.

Perhaps try a system with hundreds of processes and potentially
hundreds of thousands of TCP sockets, with thousands of unique sender
sites, and see what happens.

This is a topic similar to TSO, where we are trying to balance the
gains from batching work against the losses from gaps in the
communication stream.
Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP
Please, it is very difficult to review your work the way you have
submitted this patch as a set of 4 patches. These patches have not
been split up "logically", but rather they have been split up "per
file" with the same exact changelog message in each patch posting.
This is very clumsy, and impossible to review, and wastes a lot of
mailing list bandwidth.

We have an excellent file, called Documentation/SubmittingPatches, in
the kernel source tree, which explains exactly how to do this
correctly.

By splitting your patch into 4 patches, one for each file touched,
it is impossible to review your patch as a logical whole.

Please also provide your patch inline so people can just hit reply
in their mail reader client to quote your patch and comment on it.
This is impossible with the attachments you've used.

Thanks.
Re: [PATCH 1/1] additional ipsec audit patch
On Wed, 29 Nov 2006, James Morris wrote: > On Wed, 29 Nov 2006, Joy Latten wrote: > > > This patch disables auditing in ipsec when CONFIG_AUDITSYSCALL is > > disabled in the kernel. > > > > This patch also includes a bug fix for xfrm_state.c as a result of > > original ipsec audit patch. > > > > Let me know if it looks ok. > > > Also, the last patch contains no Signed-off-by: line, please resend. And, what is the testing status of these patches? -- James Morris <[EMAIL PROTECTED]>
Re: [PATCH 1/1] additional ipsec audit patch
On Wed, 29 Nov 2006, Joy Latten wrote: > This patch disables auditing in ipsec when CONFIG_AUDITSYSCALL is > disabled in the kernel. > > This patch also includes a bug fix for xfrm_state.c as a result of > original ipsec audit patch. > > Let me know if it looks ok. Also, the last patch contains no Signed-off-by: line, please resend. -- James Morris <[EMAIL PROTECTED]>
Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic
Stephen Hemminger <[EMAIL PROTECTED]> :
[...]
> Move the poll_enable to after hw_start() or put it inside hw_start.

"after" probably

The order would be the opposite of the one used in rtl8139_poll (which
does __netif_rx_complete then irq_unlock) and it's past 1 AM. It starts
to be a bit foggy.

-- 
Ueimor
Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic
On Thu, 30 Nov 2006 00:32:19 +0100 Francois Romieu <[EMAIL PROTECTED]> wrote: > Stephen Hemminger <[EMAIL PROTECTED]> : > > Francois Romieu <[EMAIL PROTECTED]> wrote: > > > Stephen Hemminger <[EMAIL PROTECTED]> : > > > [...] > > > > @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi > > > > rtl8139_tx_clear (tp); > > > > spin_unlock_irq(&tp->lock); > > > > > > > > + netif_poll_enable(); > > > ^ -> dev > > > > + > > > > /* ...and finally, reset everything */ > > > > - if (netif_running(dev)) { > > > > - rtl8139_hw_start (dev); > > > > - netif_wake_queue (dev); > > > > - } > > > > - spin_unlock_bh(&tp->rx_lock); > > > > + rtl8139_hw_start (dev); > > > > + netif_wake_queue (dev); > > > > } > > > > > > rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx() > > > if an in-flight interruption enables it a bit too fast. I'd rather go > > > with: > > > > but rt8139_rx is not possible here because we have blocked the poll > > routine from starting. Basically it uses the NAPI rx scheduler bit > > to replace the rx_lock. > > 1 - the irq handler is waiting for tp->lock > 2 - rtl8139_tx_timeout_task releases the lock > 3 - rtl8139_tx_timeout_task issues netif_poll_enable > 4 - the irq handler schedules ->poll(), returns Move the poll_enable to after hw_start() or put it inside hw_start. - Stephen Hemminger <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/1] additional ipsec audit patch
This patch disables auditing in ipsec when CONFIG_AUDITSYSCALL is
disabled in the kernel.

This patch also includes a fix in xfrm_state.c for a bug introduced by
the original ipsec audit patch.

Let me know if it looks ok.

My mail gateway has been acting crazy, so I apologize for any
duplicates of the ipsec audit patches being sent.

regards,
Joy

diff -urpN linux-2.6.18-patch/include/net/xfrm.h linux-2.6.18-patch.2/include/net/xfrm.h
--- linux-2.6.18-patch/include/net/xfrm.h 2006-11-27 12:29:11.0 -0600
+++ linux-2.6.18-patch.2/include/net/xfrm.h 2006-11-28 13:26:49.0 -0600
@@ -395,8 +395,13 @@ struct xfrm_audit
 	uid_t loginuid;
 	u32 secid;
 };
-void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
+
+#ifdef CONFIG_AUDITSYSCALL
+extern void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
 		    struct xfrm_policy *xp, struct xfrm_state *x);
+#else
+#define xfrm_audit_log(a,s,t,r,p,x) do { ; } while (0)
+#endif /* CONFIG_AUDITSYSCALL */
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
 {
diff -urpN linux-2.6.18-patch/net/xfrm/xfrm_policy.c linux-2.6.18-patch.2/net/xfrm/xfrm_policy.c
--- linux-2.6.18-patch/net/xfrm/xfrm_policy.c 2006-11-27 12:29:33.0 -0600
+++ linux-2.6.18-patch.2/net/xfrm/xfrm_policy.c 2006-11-28 14:51:09.0 -0600
@@ -1955,6 +1955,7 @@ int xfrm_bundle_ok(struct xfrm_policy *p
 EXPORT_SYMBOL(xfrm_bundle_ok);
+#ifdef CONFIG_AUDITSYSCALL
 /* Audit addition and deletion of SAs and ipsec policy */
 void xfrm_audit_log(uid_t auid, u32 sid, int type, int result,
@@ -2063,6 +2064,7 @@ void xfrm_audit_log(uid_t auid, u32 sid,
 }
 EXPORT_SYMBOL(xfrm_audit_log);
+#endif /* CONFIG_AUDITSYSCALL */
 int xfrm_policy_register_afinfo(struct xfrm_policy_afinfo *afinfo)
 {
diff -urpN linux-2.6.18-patch/net/xfrm/xfrm_state.c linux-2.6.18-patch.2/net/xfrm/xfrm_state.c
--- linux-2.6.18-patch/net/xfrm/xfrm_state.c2006-11-27 12:29:33.0 -0600
+++ linux-2.6.18-patch.2/net/xfrm/xfrm_state.c 2006-11-28 12:58:56.0 -0600
@@ -407,7 +407,6 @@ restart:
 				xfrm_state_hold(x);
 				spin_unlock_bh(&xfrm_state_lock);
-				xfrm_state_delete(x);
 				err = xfrm_state_delete(x);
 				xfrm_audit_log(audit_info->loginuid,
 					       audit_info->secid,
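The compile-time stub used for xfrm_audit_log() is a common pattern, and a reduced user-space sketch may make the idea easier to see. Everything named toy_* and the TOY_CONFIG_AUDIT switch are hypothetical, chosen only for this illustration. When the option is off, the call site compiles to nothing, so callers need no #ifdef of their own.

```c
#include <assert.h>

/* Build with -DTOY_CONFIG_AUDIT to compile the real function in. */
#ifdef TOY_CONFIG_AUDIT
static int toy_audit_calls;

/* Real implementation: here it only counts how often it was called. */
static void toy_audit_log(int type, int result)
{
	(void)type;
	(void)result;
	toy_audit_calls++;
}
#else
/*
 * Stub: expands to an empty statement. The arguments are never
 * evaluated, so call sites must not rely on side effects in them.
 */
#define toy_audit_log(type, result) do { } while (0)
#endif

/*
 * Caller in the style of the fixed xfrm_state.c hunk: perform the
 * operation exactly once, keep its result, then log that result.
 */
static int toy_delete_state(int id)
{
	int err = (id >= 0) ? 0 : -1;

	toy_audit_log(1, err ? 0 : 1);
	return err;
}
```

The do { } while (0) form keeps the stub usable as a single statement, for example as the body of an if without braces.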
Re: [Changelog] - Potential performance bottleneck for Linxu TCP
Wenji Wu wrote:
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move
data from the socket's receive buffer to user space by calling
tcp_recvmsg(), the socket is locked. During this period, all incoming
packets for the TCP socket go to the backlog queue without being TCP
processed. Since Linux 2.6 can be interrupted mid-task, if the network
application's timeslice expires and it is moved to the expired array
with the socket locked, the packets in the backlog queue will not be TCP
processed until the network application resumes its execution. If the
system is heavily loaded, TCP can easily RTO on the sender side.

So how much difference did this patch actually make, and to what
benchmark?

The patch is for Linux kernel 2.6.14 Desktop and Low-latency Desktop

The patch doesn't seem to be attached? Also, would be better to make
it for the latest kernel version (2.6.19) ... 2.6.14 is rather old ;-)

M
Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft
On Wed, 29 Nov 2006, Paul Moore wrote: > James Morris wrote: > > All applied to: > > git://git.infradead.org/~jmorris/selinux-net-2.6.20 > > Thanks. > > Did you mean your kernel.org git tree? There's a copy at infradead (which may have still been cloning if you checked it immediately). -- James Morris <[EMAIL PROTECTED]>
Re: Bug 7596 - Potential performance bottleneck for Linxu TCP
On Wed, 29 Nov 2006 17:22:10 -0600
Wenji Wu <[EMAIL PROTECTED]> wrote:

> From: Wenji Wu <[EMAIL PROTECTED]>
>
> Greetings,
>
> For Linux TCP, when the network application makes a system call to move
> data from the socket's receive buffer to user space by calling
> tcp_recvmsg(), the socket is locked. During this period, all incoming
> packets for the TCP socket go to the backlog queue without being TCP
> processed. Since Linux 2.6 can be interrupted mid-task, if the network
> application's timeslice expires and it is moved to the expired array
> with the socket locked, the packets in the backlog queue will not be
> TCP processed until the network application resumes its execution. If
> the system is heavily loaded, TCP can easily RTO on the sender side.
>
> Attached is the detailed description of the problem and one possible
> solution.

Thanks. The attachment will be too large for the mailing-list servers so I
uploaded a copy to

  http://userweb.kernel.org/~akpm/Linux-TCP-Bottleneck-Analysis-Report.pdf

From a quick peek it appears that you're getting around 10% improvement in
TCP throughput, best case.
Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic
Stephen Hemminger <[EMAIL PROTECTED]> :
> Francois Romieu <[EMAIL PROTECTED]> wrote:
> > Stephen Hemminger <[EMAIL PROTECTED]> :
> > [...]
> > > @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
> > >  	rtl8139_tx_clear (tp);
> > >  	spin_unlock_irq(&tp->lock);
> > >
> > > +	netif_poll_enable();
> >                         ^ -> dev
> > > +
> > >  	/* ...and finally, reset everything */
> > > -	if (netif_running(dev)) {
> > > -		rtl8139_hw_start (dev);
> > > -		netif_wake_queue (dev);
> > > -	}
> > > -	spin_unlock_bh(&tp->rx_lock);
> > > +	rtl8139_hw_start (dev);
> > > +	netif_wake_queue (dev);
> > >  }
> >
> > rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx()
> > if an in-flight interruption enables it a bit too fast. I'd rather go
> > with:
>
> but rt8139_rx is not possible here because we have blocked the poll
> routine from starting. Basically it uses the NAPI rx scheduler bit
> to replace the rx_lock.

1 - the irq handler is waiting for tp->lock
2 - rtl8139_tx_timeout_task releases the lock
3 - rtl8139_tx_timeout_task issues netif_poll_enable
4 - the irq handler schedules ->poll(), returns
5 - rtl8139_hw_start() races with ->poll(), aka rtl8139_rx(), for cur_rx

-- 
Ueimor
Re: 2.6.19-rc6-mm2: uli526x only works after reload
On Thu, 30 Nov 2006 00:08:21 +0100 "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote: > On Wednesday, 29 November 2006 22:31, Rafael J. Wysocki wrote: > > On Wednesday, 29 November 2006 22:30, Andrew Morton wrote: > > > On Wed, 29 Nov 2006 21:08:00 +0100 > > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote: > > > > > > > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote: > > > > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote: > > > > > > > > > > > > Temporarily at > > > > > > > > > > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/ > > > > > > > > > > > > Will appear eventually at > > > > > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/ > > > > > > > > > > A minor issue: on one of my (x86-64) test boxes the uli526x driver > > > > > doesn't > > > > > work when it's first loaded. I have to rmmod and modprobe it to make > > > > > it work. > > > > > > That isn't a minor issue. > > > > > > > > It worked just fine on -mm1, so something must have happened to it > > > > > recently. > > > > > > > > Sorry, I was wrong. The driver doesn't work at all, even after reload. > > > > > > > > > > tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2. But you're > > > not using that (corrent?) > > > > > > git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using > > > that either. > > > > > > git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using > > > that. > > > > > > Beats me, sorry. Perhaps it's due to changes in networking core. It's > > > presumably a showstopper for statically-linked-uli526x users. If you > > > could > > > bisect it, please? I'd start with git-netdev-all, then tulip-*. > > > > OK, but it'll take some time. > > OK, done. > > It's one of these (the first one alone doesn't compile): > > git-netdev-all.patch > git-netdev-all-fixup.patch > libphy-dont-do-that.patch Are you able to eliminate libphy-dont-do-that.patch? 
> Is a broken-out version of git-netdev-all.patch available from somewhere? Nope, and my few fumbling attempts to generate the sort of patch series which you want didn't work out too well. One has to downgrade to git-bisect :( What does "doesn't work" mean, btw? - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[patch 3/4] - Potential performance bottleneck for Linux TCP
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data from the socket's receive buffer to user space by calling tcp_recvmsg(), the socket is locked. During this period, all incoming packets for the TCP socket go to the backlog queue without being TCP processed. Since Linux 2.6 can be interrupted mid-task, if the network application's time slice expires and it is moved to the expired array with the socket locked, none of the packets in the backlog queue will be TCP processed until the network application resumes execution. If the system is heavily loaded, TCP can easily RTO on the sender side.

Attached is patch 3/4.

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541

sched.c.patch
Description: Binary data
[patch 2/4] - Potential performance bottleneck for Linux TCP
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data from the socket's receive buffer to user space by calling tcp_recvmsg(), the socket is locked. During this period, all incoming packets for the TCP socket go to the backlog queue without being TCP processed. Since Linux 2.6 can be interrupted mid-task, if the network application's time slice expires and it is moved to the expired array with the socket locked, none of the packets in the backlog queue will be TCP processed until the network application resumes execution. If the system is heavily loaded, TCP can easily RTO on the sender side.

Attached is patch 2/4.

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541

sched.h.patch
Description: Binary data
[patch 1/4] - Potential performance bottleneck for Linux TCP
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data from the socket's receive buffer to user space by calling tcp_recvmsg(), the socket is locked. During this period, all incoming packets for the TCP socket go to the backlog queue without being TCP processed. Since Linux 2.6 can be interrupted mid-task, if the network application's time slice expires and it is moved to the expired array with the socket locked, none of the packets in the backlog queue will be TCP processed until the network application resumes execution. If the system is heavily loaded, TCP can easily RTO on the sender side.

Attached is patch 1/4.

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541

tcp.c.patch
Description: Binary data
[Changelog] - Potential performance bottleneck for Linux TCP
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data from the socket's receive buffer to user space by calling tcp_recvmsg(), the socket is locked. During this period, all incoming packets for the TCP socket go to the backlog queue without being TCP processed. Since Linux 2.6 can be interrupted mid-task, if the network application's time slice expires and it is moved to the expired array with the socket locked, none of the packets in the backlog queue will be TCP processed until the network application resumes execution. If the system is heavily loaded, TCP can easily RTO on the sender side.

Attached is the Changelog for the patch.

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541

From: Wenji Wu <[EMAIL PROTECTED]>

- Subject

Potential performance bottleneck for Linux TCP (2.6 Desktop, Low-latency Desktop)

- Why the kernel needed patching

For Linux TCP, when the network application makes a system call to move data from the socket's receive buffer to user space by calling tcp_recvmsg(), the socket is locked. During this period, all incoming packets for the TCP socket go to the backlog queue without being TCP processed. Since Linux 2.6 can be interrupted mid-task, if the network application's time slice expires and it is moved to the expired array with the socket locked, none of the packets in the backlog queue will be TCP processed until the network application resumes execution. If the system is heavily loaded, TCP can easily RTO on the sender side.

- The overall design approach in the patch

The underlying idea here is that when there are packets waiting on the prequeue or backlog queue, the data receiving process should not be allowed to release the CPU for long.

- Implementation details

We have modified the Linux process scheduling policy and tcp_recvmsg().
To summarize, the solution works as follows: an expired data receiving process with packets waiting on the backlog queue or prequeue is moved to the active array, instead of the expired array as usual. More often than not, the expired data receiving process will continue to run. Even if it doesn't, the wait time before it resumes execution will be greatly reduced. However, this gives the process extra runs compared to other processes in the runqueue. For the sake of fairness, the process is labeled with the extra_run_flag.

Also consider the facts that: (1) the resumed process will continue its execution within tcp_recvmsg(); (2) tcp_recvmsg() does not return to user space until the prequeue and backlog queue are drained. For the sake of fairness, we modified tcp_recvmsg() as follows: after the prequeue and backlog queue are drained and before tcp_recvmsg() returns to user space, any process labeled with the extra_run_flag will call yield() to explicitly yield the CPU to other processes in the runqueue. yield() works by removing the process from the active array (where it currently is, because it is running) and inserting it into the expired array.

Also, to prevent processes in the expired array from starving, a special rule has been provided for Linux process scheduling (the same rule used for interactive processes): an expired process is moved to the expired array without respect to its status if processes in the expired array are starved.

Changed files:
/kernel/sched.c
/kernel/fork.c
/include/linux/sched.h
/net/ipv4/tcp.c

- Testing results

The proposed solution trades off a small amount of fairness to resolve the TCP performance bottleneck. The proposed solution won't cause serious fairness issues.

The patch is for Linux kernel 2.6.14 Desktop and Low-latency Desktop.
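The policy described in this changelog can be sketched as a small user-space model. This is only an illustration, not the code from the attached patches (those were sent as binary attachments and are not reproduced here). Apart from extra_run_flag, every name below (struct task, pending_packets, on_timeslice_expired(), after_queues_drained()) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two priority arrays of the 2.6 scheduler. */
enum rq_array { ARRAY_ACTIVE, ARRAY_EXPIRED };

struct task {
    enum rq_array array;   /* array the task is currently queued on */
    int pending_packets;   /* assumed counter: packets on prequeue + backlog */
    bool extra_run_flag;   /* set when the task was granted an extra run */
};

/*
 * Called when the task's time slice runs out. expired_starving models
 * the rule that protects tasks already waiting in the expired array.
 */
static void on_timeslice_expired(struct task *t, bool expired_starving)
{
    assert(t->pending_packets >= 0);

    if (t->pending_packets > 0 && !expired_starving) {
        /* Keep the receiver runnable so the queues are drained soon. */
        t->array = ARRAY_ACTIVE;
        t->extra_run_flag = true;
    } else {
        /* Usual behaviour: wait for the next array switch. */
        t->array = ARRAY_EXPIRED;
    }
}

/*
 * Called at the end of the modelled tcp_recvmsg(), once the prequeue
 * and backlog queue are empty. A task that received an extra run pays
 * it back by yielding, which places it on the expired array.
 */
static void after_queues_drained(struct task *t)
{
    t->pending_packets = 0;
    if (t->extra_run_flag) {
        t->extra_run_flag = false;
        t->array = ARRAY_EXPIRED;   /* effect of yield() as described above */
    }
}
```

In this model a receiver with queued packets stays on the active array exactly once per grant and then gives the CPU back, which is the fairness trade-off the changelog argues for.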
[patch 4/4] - Potential performance bottleneck for Linux TCP
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data from the socket's receive buffer to user space by calling tcp_recvmsg(), the socket is locked. During this period, all incoming packets for the TCP socket go to the backlog queue without being TCP processed. Since Linux 2.6 can be interrupted mid-task, if the network application's time slice expires and it is moved to the expired array with the socket locked, none of the packets in the backlog queue will be TCP processed until the network application resumes execution. If the system is heavily loaded, TCP can easily RTO on the sender side.

Attached is patch 4/4.

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541

fork.c.patch
Description: Binary data
Re: 2.6.19-rc6-mm2: uli526x only works after reload
On Wednesday, 29 November 2006 22:31, Rafael J. Wysocki wrote:
> On Wednesday, 29 November 2006 22:30, Andrew Morton wrote:
> > On Wed, 29 Nov 2006 21:08:00 +0100
> > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> >
> > > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > > >
> > > > > Temporarily at
> > > > >
> > > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > > >
> > > > > Will appear eventually at
> > > > >
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > > >
> > > > A minor issue: on one of my (x86-64) test boxes the uli526x driver
> > > > doesn't work when it's first loaded. I have to rmmod and modprobe
> > > > it to make it work.
> >
> > That isn't a minor issue.
> >
> > > > It worked just fine on -mm1, so something must have happened to it
> > > > recently.
> > >
> > > Sorry, I was wrong. The driver doesn't work at all, even after reload.
> >
> > tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2. But you're
> > not using that (correct?)
> >
> > git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
> > that either.
> >
> > git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using that.
> >
> > Beats me, sorry. Perhaps it's due to changes in networking core. It's
> > presumably a showstopper for statically-linked-uli526x users. If you could
> > bisect it, please? I'd start with git-netdev-all, then tulip-*.
>
> OK, but it'll take some time.

OK, done.

It's one of these (the first one alone doesn't compile):

git-netdev-all.patch
git-netdev-all-fixup.patch
libphy-dont-do-that.patch

Is a broken-out version of git-netdev-all.patch available from somewhere?

Greetings,
Rafael
Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic
On Wed, 29 Nov 2006 23:44:00 +0100 Francois Romieu <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger <[EMAIL PROTECTED]> :
> [...]
> > @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
> >  	rtl8139_tx_clear (tp);
> >  	spin_unlock_irq(&tp->lock);
> >
> > +	netif_poll_enable();
>                         ^ -> dev
> > +
> >  	/* ...and finally, reset everything */
> > -	if (netif_running(dev)) {
> > -		rtl8139_hw_start (dev);
> > -		netif_wake_queue (dev);
> > -	}
> > -	spin_unlock_bh(&tp->rx_lock);
> > +	rtl8139_hw_start (dev);
> > +	netif_wake_queue (dev);
> >  }
>
> rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx()
> if an in-flight interrupt enables it a bit too fast. I'd rather go
> with:

But rtl8139_rx() cannot run here because we have blocked the poll routine from starting. Basically it uses the NAPI rx scheduler bit to replace the rx_lock.

It is totally untested.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic
Stephen Hemminger <[EMAIL PROTECTED]> :
[...]
> @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
>  	rtl8139_tx_clear (tp);
>  	spin_unlock_irq(&tp->lock);
>
> +	netif_poll_enable();
                        ^ -> dev
> +
>  	/* ...and finally, reset everything */
> -	if (netif_running(dev)) {
> -		rtl8139_hw_start (dev);
> -		netif_wake_queue (dev);
> -	}
> -	spin_unlock_bh(&tp->rx_lock);
> +	rtl8139_hw_start (dev);
> +	netif_wake_queue (dev);
>  }

rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx() if an in-flight interrupt enables it a bit too fast. I'd rather go with:

[...]
	rtl8139_tx_clear (tp);
	rtl8139_hw_start (dev);
	netif_wake_queue (dev);
	netif_poll_enable(dev);
	spin_unlock_irq(&tp->lock);
}

Otherwise the patch is cool.

-- 
Ueimor
Re: [PATCH 1/1] add auditing to ipsec
On Monday 27 November 2006 14:11, Joy Latten wrote:
> Please let me know if this is acceptable.

From an audit perspective, it looks good.

-Steve
Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft
James Morris wrote:
> All applied to:
> git://git.infradead.org/~jmorris/selinux-net-2.6.20

Thanks. Did you mean your kernel.org git tree?

-- 
paul moore
linux security @ hp
Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft
All applied to:
git://git.infradead.org/~jmorris/selinux-net-2.6.20

Thanks,

- James
-- 
James Morris <[EMAIL PROTECTED]>
Re: [Devel] Re: Network virtualization/isolation
Brian Haley wrote:
> Eric W. Biederman wrote:
> > I think for cases across network socket namespaces it should
> > be a matter for the rules, to decide if the connection should
> > happen and what error code to return if the connection does not
> > happen.
> >
> > There is a potential in this to have an ambiguous case where two
> > applications can be listening for connections on the same socket
> > on the same port and both will allow the connection. If that
> > is the case I believe the proper definition is the first socket
> > that we find that will accept the connection gets the connection.

No. If you try to connect, the destination IP address is assigned to a network namespace. This network namespace is used to resolve the listening socket ambiguity.

> Wouldn't you want to catch this at bind() and/or configuration time and
> fail? Having overlapping namespaces/rules seems undesirable since, as
> Herbert said, it can get you "unexpected behaviour".

Overlapping is not a problem: you can have several sockets bound to the same INADDR_ANY/port without ambiguity because the network namespace pointer is added as a new key for socket lookup (src addr, src port, dst addr, dst port, net ns pointer). The bind should not be forced to a specific address because you would not be able to connect via 127.0.0.1.

> > I think with the appropriate set of rules it provides what is needed
> > for application migration. I.e. 127.0.0.1 can be filtered so that
> > you can only connect to sockets in your current container.
> >
> > It does get a little odd because it does allow for the possibility
> > that you can have multiple connected sockets with same source ip,
> > source port, destination ip, destination port, if the rules are
> > set up appropriately. I don't see that peculiarity being visible on
> > the outside network so it shouldn't be a problem.
>
> So if they're using the same protocol (eg TCP), how is it decided which
> one gets an incoming packet? Maybe I'm missing something as I don't
> understand your inside/outside network reference - is that to the
> loopback address comment in the previous paragraph?

The sockets for l3 isolation are isolated like the l2 ones (this is common code). The difference is where the network namespace is found and used. At layer 2, the namespace is found at the network device level. At layer 3, it is found from the IP destination. So when you arrive at the socket level, you have the network namespace of the packet's destination and you search for sockets related to that specific namespace.

-- Daniel
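The lookup Daniel describes can be sketched in a few lines of user-space C. This is an assumption-laden toy, not kernel code: struct net_ns, struct listener, ns_from_daddr() and lookup_listener() are hypothetical names, and a linear scan stands in for the real hash tables. It shows the layer 3 variant, where the namespace is derived from the destination IP and then used as an extra lookup key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ADDR 0u                 /* models INADDR_ANY */

struct net_ns { int id; };          /* stand-in for a network namespace */

/* Layer 3 isolation: each local address belongs to one namespace. */
struct addr_owner {
    uint32_t addr;
    const struct net_ns *ns;
};

struct listener {
    uint32_t laddr;                 /* bound address, ANY_ADDR for wildcard */
    uint16_t lport;                 /* bound port */
    const struct net_ns *ns;        /* the extra lookup key */
    const char *name;               /* label for the example only */
};

/* Resolve the namespace from the packet's destination address. */
static const struct net_ns *
ns_from_daddr(const struct addr_owner *map, size_t n, uint32_t daddr)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (map[i].addr == daddr)
            return map[i].ns;
    return NULL;
}

/* Find a listener; sockets of other namespaces are never considered. */
static const struct listener *
lookup_listener(const struct listener *tbl, size_t n,
                uint32_t daddr, uint16_t dport, const struct net_ns *ns)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct listener *l = &tbl[i];

        if (l->ns != ns || l->lport != dport)
            continue;
        if (l->laddr != ANY_ADDR && l->laddr != daddr)
            continue;
        return l;
    }
    return NULL;
}
```

With two listeners bound to ANY_ADDR on port 80 in different namespaces, a packet for an address owned by the second namespace can only reach the second listener, so the wildcard binds do not collide.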
[patch 10/23] bcm43xx: Drain TX status before starting IRQs
-stable review patch. If anyone has any objections, please let us know.

--
From: Michael Buesch <[EMAIL PROTECTED]>

Drain the Microcode TX-status-FIFO before we enable IRQs. This is required, because the FIFO may still have entries left from a previous run. Those would immediately fire after enabling IRQs and would lead to an oops in the DMA TXstatus handling code.

Cc: "John W. Linville" <[EMAIL PROTECTED]>
Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>
Signed-off-by: Larry Finger <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/net/wireless/bcm43xx/bcm43xx_main.c | 18 ++
 1 file changed, 18 insertions(+)

--- linux-2.6.18.4.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6.18.4/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -1463,6 +1463,23 @@ static void handle_irq_transmit_status(s
 	}
 }
 
+static void drain_txstatus_queue(struct bcm43xx_private *bcm)
+{
+	u32 dummy;
+
+	if (bcm->current_core->rev < 5)
+		return;
+	/* Read all entries from the microcode TXstatus FIFO
+	 * and throw them away.
+	 */
+	while (1) {
+		dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_0);
+		if (!dummy)
+			break;
+		dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_1);
+	}
+}
+
 static void bcm43xx_generate_noise_sample(struct bcm43xx_private *bcm)
 {
 	bcm43xx_shm_write16(bcm, BCM43xx_SHM_SHARED, 0x408, 0x7F7F);
@@ -3517,6 +3534,7 @@ int bcm43xx_select_wireless_core(struct
 	bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC);
 	bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 *)(bcm->net_dev->dev_addr));
 	bcm43xx_security_init(bcm);
+	drain_txstatus_queue(bcm);
 	ieee80211softmac_start(bcm->net_dev);
 
 	/* Let's go! Be careful after enabling the IRQs.

--
Re: sky2 hang still exists in 2.6.19-rc6 --Bug#396185?
Stephen Hemminger wrote:
> That motherboard has dual lan, are you using both of them? I don't have
> that chip version, so hard to tell if it is using dual port with a single
> chip or not. There is a hack for the dual port PCI-X version already in
> the driver, that turns off receive checksums if both ports are in use.
> Please try turning off receive checksums with ethtool and see if that
> helps.

I'm only using one port. The second port is disabled in the BIOS. The problem still occurs with receive checksums off.
Re: 2.6.19-rc6-mm2: uli526x only works after reload
On Wednesday, 29 November 2006 22:30, Andrew Morton wrote:
> On Wed, 29 Nov 2006 21:08:00 +0100
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
>
> > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > >
> > > > Temporarily at
> > > >
> > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > >
> > > > Will appear eventually at
> > > >
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > >
> > > A minor issue: on one of my (x86-64) test boxes the uli526x driver doesn't
> > > work when it's first loaded. I have to rmmod and modprobe it to make it
> > > work.
>
> That isn't a minor issue.
>
> > > It worked just fine on -mm1, so something must have happened to it
> > > recently.
> >
> > Sorry, I was wrong. The driver doesn't work at all, even after reload.
>
> tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2. But you're
> not using that (correct?)
>
> git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
> that either.
>
> git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using that.
>
> Beats me, sorry. Perhaps it's due to changes in networking core. It's
> presumably a showstopper for statically-linked-uli526x users. If you could
> bisect it, please? I'd start with git-netdev-all, then tulip-*.

OK, but it'll take some time.
Re: 2.6.19-rc6-mm2: uli526x only works after reload
On Wed, 29 Nov 2006 21:08:00 +0100 "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > >
> > > Temporarily at
> > >
> > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > >
> > > Will appear eventually at
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> >
> > A minor issue: on one of my (x86-64) test boxes the uli526x driver doesn't
> > work when it's first loaded. I have to rmmod and modprobe it to make it
> > work.

That isn't a minor issue.

> > It worked just fine on -mm1, so something must have happened to it recently.
>
> Sorry, I was wrong. The driver doesn't work at all, even after reload.

tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2. But you're not using that (correct?)

git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using that either.

git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using that.

Beats me, sorry. Perhaps it's due to changes in networking core. It's presumably a showstopper for statically-linked-uli526x users. If you could bisect it, please? I'd start with git-netdev-all, then tulip-*.
[PATCH] r8169: Fix iteration variable sign
This changes the type of variable "i" in rtl8169_init_one() from "unsigned int" to "int". "i" is checked for < 0 later, which can never happen for "unsigned". This results in broken error handling.

Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>
Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 5002673..c8fa9b1 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1491,8 +1491,8 @@ rtl8169_init_one(struct pci_dev *pdev, c
 	struct rtl8169_private *tp;
 	struct net_device *dev;
 	void __iomem *ioaddr;
-	unsigned int i, pm_cap;
-	int rc;
+	unsigned int pm_cap;
+	int i, rc;
 
 	if (netif_msg_drv(&debug)) {
 		printk(KERN_INFO "%s Gigabit Ethernet driver %s loaded\n",
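The class of bug fixed by this patch can be shown with a generic example. This is not the driver code: find_chip(), probe_buggy() and probe_fixed() are hypothetical, and -19 merely stands in for a negative error code. The point is only that a negative value stored in an unsigned variable can never satisfy a "< 0" check (many compilers warn about the comparison).

```c
#include <assert.h>

/* Hypothetical helper: returns idx when it is a valid table index,
 * otherwise a negative error value. */
static int find_chip(int idx, int table_len)
{
    assert(table_len >= 0);
    return (idx >= 0 && idx < table_len) ? idx : -19;
}

/* Buggy pattern: the result lands in an unsigned variable, so the
 * negative error value wraps to a large positive number. */
static int probe_buggy(int idx, int table_len)
{
    unsigned int i = find_chip(idx, table_len);

    if (i < 0)          /* always false for an unsigned type */
        return -1;
    return 0;           /* reports success even though the lookup failed */
}

/* Fixed pattern: a signed variable keeps the error visible. */
static int probe_fixed(int idx, int table_len)
{
    int i = find_chip(idx, table_len);

    if (i < 0)
        return -1;
    return 0;
}
```

Calling both functions with an out-of-range index shows the difference: the buggy variant returns 0, the fixed one returns -1.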
[PATCH] sundance: use NULL for pointer
From: Randy Dunlap <[EMAIL PROTECTED]>

Use NULL instead of 0 for pointers (cures sparse warnings).

drivers/net/sundance.c:1106:16: warning: Using plain integer as NULL pointer
drivers/net/sundance.c:1652:16: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/net/sundance.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19-rc6-mm2.orig/drivers/net/sundance.c
+++ linux-2.6.19-rc6-mm2/drivers/net/sundance.c
@@ -1103,7 +1103,7 @@ reset_tx (struct net_device *dev)
 	np->cur_tx = np->dirty_tx = 0;
 	np->cur_task = 0;
-	np->last_tx = 0;
+	np->last_tx = NULL;
 	iowrite8(127, ioaddr + TxDMAPollPeriod);
 	iowrite16 (StatsEnable | RxEnable | TxEnable, ioaddr + MACCtrl1);
@@ -1649,7 +1649,7 @@ static int netdev_close(struct net_devic
 	np->cur_tx = 0;
 	np->dirty_tx = 0;
 	np->cur_task = 0;
-	np->last_tx = 0;
+	np->last_tx = NULL;
 	netif_stop_queue(dev);

---
Re: [PATCH 3/3] NetLabel: add the ranged tag to the CIPSOv4 protocol
On Wed, 29 Nov 2006, [EMAIL PROTECTED] wrote:

> +{
> +	/* The constant '16' is not random, it is the maximum number of
> +	 * high/low category range pairs as permitted by the CIPSO draft based
> +	 * on a maximum IPv4 header length of 60 bytes - the BUG_ON() assertion
> +	 * does a sanity check to make sure we don't overflow the array. */
> +	int iter = -1;
> +	u16 array[16];

Perhaps in a future update, make this value a macro definition and document it in the header.

-- 
James Morris <[EMAIL PROTECTED]>
Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic
On Wed, 29 Nov 2006 14:20:31 +0530 "Basheer, Mansoor Ahamed" <[EMAIL PROTECTED]> wrote:

> Francois Romieu [mailto:[EMAIL PROTECTED] wrote:
> > > Afaics your change may disable the Rx irq right after the poll routine
> > > enabled it again. It will not always work either.
> >
> > The (slow) timeout watchdog could grab the poll handler and hack the
> > irq mask depending on whether poll was scheduled or not.
>
> Could you please confirm whether the attached patch would work?
> I tested it and it works for me.
>
>
> Signed-off-by: Mansoor Ahamed <[EMAIL PROTECTED]>
>
> --- old/8139too.c 2006-11-14 10:44:27.0 +0530
> +++ new/8139too.c 2006-11-14 10:44:18.0 +0530
> @@ -1438,8 +1438,18 @@
>  	if ((!(tmp & CmdRxEnb)) || (!(tmp & CmdTxEnb)))
>  		RTL_W8 (ChipCmd, CmdRxEnb | CmdTxEnb);
>
> -	/* Enable all known interrupts by setting the interrupt mask. */
> -	RTL_W16 (IntrMask, rtl8139_intr_mask);
> +	local_irq_disable();
> +	/* Don't enable RX if RX was already scheduled */
> +	if(test_bit(__LINK_STATE_START, &dev->state) &&
> +	   test_bit(__LINK_STATE_RX_SCHED, &dev->state) ) {
> +		/* Enable all interrupts except RX by setting the interrupt mask. */
> +		RTL_W16 (IntrMask, rtl8139_norx_intr_mask);
> +	}
> +	else {
> +		/* Enable all known interrupts by setting the interrupt mask. */
> +		RTL_W16 (IntrMask, rtl8139_intr_mask);
> +	}
> +	local_irq_enable();
>  }

Sorry, that's not the right way. Testing for bits is not SMP safe and is usually a bad idea. The rx_lock model is not the best way. Try something like this:

--- a/drivers/net/8139too.c.orig	2006-11-29 12:22:32.0 -0800
+++ b/drivers/net/8139too.c	2006-11-29 12:22:06.0 -0800
@@ -589,7 +589,6 @@ struct rtl8139_private {
 	unsigned int default_port : 4;	/* Last dev->if_port value. */
 	unsigned int have_thread : 1;
 	spinlock_t lock;
-	spinlock_t rx_lock;
 	chip_t chipset;
 	u32 rx_config;
 	struct rtl_extra_stats xstats;
@@ -1009,7 +1008,6 @@ static int __devinit rtl8139_init_one (s
 	tp->msg_enable =
 		(debug < 0 ? RTL8139_DEF_MSG_ENABLE : ((1 << debug) - 1));
 	spin_lock_init (&tp->lock);
-	spin_lock_init (&tp->rx_lock);
 	INIT_WORK(&tp->thread, rtl8139_thread, dev);
 	tp->mii.dev = dev;
 	tp->mii.mdio_read = mdio_read;
@@ -1654,6 +1652,9 @@ static void rtl8139_tx_timeout_task (voi
 	int i;
 	u8 tmp8;
 
+	if (!netif_running(dev))
+		return;
+
 	printk (KERN_DEBUG "%s: Transmit timeout, status %2.2x %4.4x %4.4x "
 		"media %2.2x.\n", dev->name, RTL_R8 (ChipCmd),
 		RTL_R16(IntrStatus), RTL_R16(IntrMask), RTL_R8(MediaStatus));
@@ -1673,7 +1674,9 @@ static void rtl8139_tx_timeout_task (voi
 	if (tmp8 & CmdTxEnb)
 		RTL_W8 (ChipCmd, CmdRxEnb);
 
-	spin_lock_bh(&tp->rx_lock);
+	/* prevent NAPI poll from running */
+	netif_poll_disable();
+
 	/* Disable interrupts by clearing the interrupt mask. */
 	RTL_W16 (IntrMask, 0x0000);
 
@@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
 	rtl8139_tx_clear (tp);
 	spin_unlock_irq(&tp->lock);
 
+	netif_poll_enable();
+
 	/* ...and finally, reset everything */
-	if (netif_running(dev)) {
-		rtl8139_hw_start (dev);
-		netif_wake_queue (dev);
-	}
-	spin_unlock_bh(&tp->rx_lock);
+	rtl8139_hw_start (dev);
+	netif_wake_queue (dev);
 }
 
 static void rtl8139_tx_timeout (struct net_device *dev)
@@ -2116,7 +2118,6 @@ static int rtl8139_poll(struct net_devic
 	int orig_budget = min(*budget, dev->quota);
 	int done = 1;
 
-	spin_lock(&tp->rx_lock);
 	if (likely(RTL_R16(IntrStatus) & RxAckBits)) {
 		int work_done;
 
@@ -2138,7 +2139,6 @@ static int rtl8139_poll(struct net_devic
 		__netif_rx_complete(dev);
 		local_irq_enable();
 	}
-	spin_unlock(&tp->rx_lock);
 
 	return !done;
 }
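The difference between testing a bit and owning it can be sketched with C11 atomics. This is a user-space toy under stated assumptions, not driver code: struct toy_dev and the functions below are hypothetical, and an atomic_flag stands in for the "poll scheduled" state. A plain test followed by a separate action leaves a window in which another CPU can change the bit; an atomic test-and-set closes that window, so whoever holds the bit can exclude the poll path without a second lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
    atomic_flag rx_sched;   /* models the "poll scheduled" bit */
    int hw_resets;          /* how often the timeout path reset the chip */
    int polls;              /* how often the poll path ran */
};

/* Atomically claim the bit; true means the caller now owns it. */
static bool poll_try_claim(struct toy_dev *d)
{
    assert(d != NULL);
    return !atomic_flag_test_and_set(&d->rx_sched);
}

static void poll_release(struct toy_dev *d)
{
    assert(d != NULL);
    atomic_flag_clear(&d->rx_sched);
}

/* Poll path: does its work only while it owns the bit. */
static bool toy_poll(struct toy_dev *d)
{
    if (!poll_try_claim(d))
        return false;       /* somebody else owns the bit, skip */
    d->polls++;
    poll_release(d);
    return true;
}

/* Timeout path: owns the bit for the whole reset, so toy_poll()
 * cannot run in between. A real implementation would wait for the
 * bit instead of giving up. */
static bool toy_tx_timeout(struct toy_dev *d)
{
    if (!poll_try_claim(d))
        return false;
    d->hw_resets++;
    poll_release(d);
    return true;
}
```

While one path holds the flag, the other returns false instead of touching shared state, which is the property a separate rx_lock would otherwise have to provide.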
Re: Network virtualization/isolation
Eric W. Biederman wrote:
> I think for cases across network socket namespaces it should
> be a matter for the rules, to decide if the connection should
> happen and what error code to return if the connection does not
> happen.
>
> There is a potential in this to have an ambiguous case where two
> applications can be listening for connections on the same socket
> on the same port and both will allow the connection. If that
> is the case I believe the proper definition is the first socket
> that we find that will accept the connection gets the connection.

Wouldn't you want to catch this at bind() and/or configuration time and fail? Having overlapping namespaces/rules seems undesirable since, as Herbert said, it can get you "unexpected behaviour".

> I think with the appropriate set of rules it provides what is needed
> for application migration. I.e. 127.0.0.1 can be filtered so that
> you can only connect to sockets in your current container.
>
> It does get a little odd because it does allow for the possibility
> that you can have multiple connected sockets with same source ip,
> source port, destination ip, destination port, if the rules are
> set up appropriately. I don't see that peculiarity being visible on
> the outside network so it shouldn't be a problem.

So if they're using the same protocol (eg TCP), how is it decided which one gets an incoming packet? Maybe I'm missing something as I don't understand your inside/outside network reference - is that to the loopback address comment in the previous paragraph?

Thanks,

-Brian
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wed, 29 Nov 2006 08:03:28 -0800 David Kimdon <[EMAIL PROTECTED]> wrote: > On Wed, Nov 29, 2006 at 04:38:56PM +0100, Michael Buesch wrote: > > On Wednesday 29 November 2006 16:24, David Kimdon wrote: > > > On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote: > > > > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote: > > > Why do you say that? > > > > > > There is absolutely no reason why dadwifi can't be merged into the > > > mainline once the hal issue is resolved. > > > > Last time we talked about that stuff, it was decided that > > we don't want a HAL... See archives. > > To be clear, that is all part of the hal issue that needs to be > resolved. Removing the hal abstraction is not difficult for an > interested party once source for the hal is available. The next step > in such an effort would be to add an open hal to dadwifi, IMO. > Isn't it obvious. Planning from goal through intermediate steps gives: 0 - today (raw materials) * softmac stack: d80211 * open hal: ar5k * glue layer: dadwifi 1- put pieces together * d80211 + dadwifi + ar5k 2 - release working code to d80211 tree 3 - hard link dad2ifi to ar5k (one module) 4 - collapse indirect calls and refactor 5 - lather rinse repeat in public d80211 tree ... 8 - resulting in atheros driver kernel module 9 - code ready in d80211 10 - mainline integration of working driver for Atheros using common softmac stack > > P.S. Actually, it isn't clear to me that removing the hal entirely is > a good idea. Abstractions exist for practical reasons. The hal > allows dadwifi to support a variety of Atheros chips without needing > to worry about the specific details of each chip. Abstractions that deal with hardware are good. See phylib. Abstractions that try to deal with operating system independence are gross. 
-- Stephen Hemminger <[EMAIL PROTECTED]>
Re: [PATCH] lockdep: fix sk->sk_callback_lock locking
From: Herbert Xu <[EMAIL PROTECTED]> Date: Wed, 29 Nov 2006 23:07:09 +1100 > On Wed, Nov 29, 2006 at 12:42:24PM +0100, Peter Zijlstra wrote: > > > > However I'm not quite sure yet how to teach lockdep about this. The > > proposed patch will shut it up though. > > As a rule I think we should never make semantic changes to shut up > lockdep. Especially ones which are costly, as this proposed change is in that it disables software interrupts in a place where that is completely unnecessary. Let's not even consider this patch :)
Re: pktgen
On Tue, Nov 28, 2006 at 03:33:25PM -0800, David Miller wrote: > From: Alexey Dobriyan <[EMAIL PROTECTED]> > Date: Wed, 22 Nov 2006 00:22:51 +0300 > > > [CCing netdev, bug in pktgen] > > > > [build modular pktgen] > > while true; do modprobe pktgen && rmmod pktgen; done > > > > BUG: warning at fs/proc/generic.c:732/remove_proc_entry() > > [] remove_proc_entry+0x161/0x1ca > > [] pg_cleanup+0xd5/0xdc [pktgen] > > [] autoremove_wake_function+0x0/0x35 > > [] sys_delete_module+0x162/0x189 > > [] remove_vma+0x31/0x36 > > [] do_munmap+0x193/0x1ac > > [] sysenter_past_esp+0x56/0x79 > > [] fn_hash_delete+0x4f/0x1c7 > > > > On Tue, Nov 21, 2006 at 09:36:46PM +0100, Pavol Gono wrote: > > > I am going to add two more: > > > for i in 1 2 3 4 5 ; do modprobe pktgen ; rmmod pktgen ; done > > > > Looks like it creates /proc/net/pktgen/kpktgen_%i but forgets to remove > > them. > > It's pretty careful to delete all of the entries under > /proc/net/pktgen/. > > When the module is brought down, it walks the list of threads > and brings them down by setting T_TERMINATE in t->control. Looks like the worker thread strategically clears it if scheduled at the wrong moment. --- a/net/core/pktgen.c +++ b/net/core/pktgen.c @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct init_waitqueue_head(&t->queue); - t->control &= ~(T_TERMINATE); t->control &= ~(T_RUN); t->control &= ~(T_STOP); t->control &= ~(T_REMDEVALL); > This makes the thread break out of its loop and run: > > pktgen_stop(t); > pktgen_rem_all_ifs(t); > pktgen_rem_thread(t); Kernel seems to survive, but when I hit Ctrl+C after half a minute the backtrace is back as the very last dmesg lines.
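For readers following the race described above, here is a small userspace sketch of the ordering problem. It is not pktgen code: the flag values, struct fake_thread and the helper names are made up for illustration, and only the T_* names come from the thread. It models the case where the rmmod path sets T_TERMINATE before the worker has run its start-up code.

```c
/* Flag values are illustrative; only the names come from the thread. */
#define T_TERMINATE (1u << 0)
#define T_STOP      (1u << 1)
#define T_RUN       (1u << 2)
#define T_REMDEVALL (1u << 3)

/* Hypothetical stand-in for the per-thread state. */
struct fake_thread {
	unsigned int control;
};

/* Worker start-up before the patch: every control bit is cleared. */
void worker_init_old(struct fake_thread *t)
{
	t->control &= ~(T_TERMINATE);
	t->control &= ~(T_RUN);
	t->control &= ~(T_STOP);
	t->control &= ~(T_REMDEVALL);
}

/* Worker start-up with the patch applied: T_TERMINATE is left alone. */
void worker_init_patched(struct fake_thread *t)
{
	t->control &= ~(T_RUN);
	t->control &= ~(T_STOP);
	t->control &= ~(T_REMDEVALL);
}

/*
 * Model the unlucky ordering: the terminate request arrives first,
 * the worker's start-up code runs afterwards.
 * Returns 1 if the worker would still see the request.
 */
int terminate_survives(void (*init)(struct fake_thread *))
{
	struct fake_thread t = { 0 };

	t.control |= T_TERMINATE;	/* module unload path runs first */
	init(&t);			/* worker gets scheduled only now */
	return (t.control & T_TERMINATE) != 0;
}
```

With the old start-up code the request is lost in this ordering, with the patched one it survives. As the message says, the backtrace still shows up in testing, so this sketch only covers the one window the patch addresses.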
[PATCH] r8169: Fix iteration variable sign
This changes the type of variable "i" in rtl8169_init_one() from "unsigned int" to "int". "i" is checked for <0 later, which can never happen for "unsigned". This results in broken error handling.

Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/net/r8169.c
===================================================================
--- linux-2.6.orig/drivers/net/r8169.c 2006-11-04 19:03:28.0 +0100
+++ linux-2.6/drivers/net/r8169.c 2006-11-29 20:41:59.0 +0100
@@ -1473,8 +1473,8 @@ rtl8169_init_one(struct pci_dev *pdev, c
 	struct rtl8169_private *tp;
 	struct net_device *dev;
 	void __iomem *ioaddr;
-	unsigned int i, pm_cap;
-	int rc;
+	unsigned int pm_cap;
+	int i, rc;
 
 	if (netif_msg_drv(&debug)) {
 		printk(KERN_INFO "%s Gigabit Ethernet driver %s loaded\n",

--
Greetings Michael.
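The effect described in the changelog can be reproduced outside the driver. The following is a minimal sketch, not r8169 code: find_entry() is a hypothetical helper standing in for any call that reports failure with a negative value.

```c
/* Hypothetical helper: returns -1 on failure, an index otherwise. */
static int find_entry(int want_failure)
{
	return want_failure ? -1 : 3;
}

/*
 * Same shape as the bug: the result lands in an unsigned variable,
 * so -1 wraps to a huge positive value and "i < 0" is never true.
 */
int failure_seen_unsigned(int want_failure)
{
	unsigned int i = find_entry(want_failure);

	return i < 0;	/* always 0; some compilers warn with extra flags */
}

/* Same check with a signed variable, as in the patched declaration. */
int failure_seen_signed(int want_failure)
{
	int i = find_entry(want_failure);

	return i < 0;
}
```

In the unsigned variant the error branch is dead code, which is the "broken error handling" the changelog refers to.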
Re: Intel 82559 NIC corrupted EEPROM
On 11/29/06, John <[EMAIL PROTECTED]> wrote: > Let's go ahead and print the output from e100_load_eeprom > debug patch attached.

Loading (then unloading) e100.ko fails the first few times (i.e. the driver claims one of the EEPROMs is corrupted). Thereafter, sometimes it fails, other times it works. Sounds like a race, no?

yes, or something like that. I think you may have a piece of eeprom hardware that is either "slow" or slightly out of spec. I wonder if the hrt kernel makes udelay(4) much more like 4us than the regular kernels. can you try adding mdelay(100); in e100_eeprom_load before the for loop, and then change the multiple udelay(4) to mdelay(1) in e100_eeprom_read?

On an unrelated note, insmod_100.txt is truncated at the beginning, and insmod_110.txt is truncated in the middle (!!) cf. line 14. What would cause klogd to behave like that?

usually it's because whatever is printing is printing too fast or too much at a time.
Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft
--- [EMAIL PROTECTED] wrote: > This patchset consists of three patches that add > support for the remaining two > tag types from the CIPSO draft specification, the > enumerated and range tags. > The most significant part about adding these two > tags is that NetLabel now has > the ability to represent more than 240 categories > (limitation imposed by the > current restricted bitmap tag). > > In addition, the first patch in the set converts > NetLabel's contiguous char > string category bitmap stored in network friendly > bit/byte order into a sparse > bitmap stored in host friendly bit/byte order. > While this change was not > required to support the new CIPSO tags, it should > make life much easier as the > old category bitmap would have proven problematic as > the number of usable > categories increases with the new tag types. It > also has a side effect of > making the LSM specific code much less ugly. Fabulous. Thank you. Casey Schaufler [EMAIL PROTECTED]
[PATCH 3/3] NetLabel: add the ranged tag to the CIPSOv4 protocol
From: Paul Moore <[EMAIL PROTECTED]> Add support for the ranged tag (tag type #5) to the CIPSOv4 protocol. The ranged tag allows for seven, or eight if zero is the lowest category, category ranges to be specified in a CIPSO option. Each range is specified by two unsigned 16 bit fields, each with a maximum value of 65534. The two values specify the start and end of the category range; if the start of the category range is zero then it is omitted. See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details. Signed-off-by: Paul Moore <[EMAIL PROTECTED]> --- net/ipv4/cipso_ipv4.c | 268 ++ 1 files changed, 268 insertions(+) Index: net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c === --- net-2.6.20_netlabel-cats.orig/net/ipv4/cipso_ipv4.c +++ net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c @@ -455,6 +455,10 @@ int cipso_v4_doi_add(struct cipso_v4_doi switch (doi_def->tags[iter]) { case CIPSO_V4_TAG_RBITMAP: break; + case CIPSO_V4_TAG_RANGE: + if (doi_def->type != CIPSO_V4_MAP_PASS) + return -EINVAL; + break; case CIPSO_V4_TAG_INVALID: if (iter == 0) return -EINVAL; @@ -1045,6 +1049,148 @@ static int cipso_v4_map_cat_enum_ntoh(co return 0; } +/** + * cipso_v4_map_cat_rng_valid - Checks to see if the categories are valid + * @doi_def: the DOI definition + * @rngcat: category list + * @rngcat_len: length of the category list in bytes + * + * Description: + * Checks the given categories against the given DOI definition and returns a + * negative value if any of the categories do not have a valid mapping and a + * zero value if all of the categories are valid. 
+ * + */ +static int cipso_v4_map_cat_rng_valid(const struct cipso_v4_doi *doi_def, + const unsigned char *rngcat, + u32 rngcat_len) +{ + u16 cat_high; + u16 cat_low; + u32 cat_prev = CIPSO_V4_MAX_REM_CATS + 1; + u32 iter; + + if (doi_def->type != CIPSO_V4_MAP_PASS || rngcat_len & 0x01) + return -EFAULT; + + for (iter = 0; iter < rngcat_len; iter += 4) { + cat_high = ntohs(*((__be16 *)&rngcat[iter])); + if ((iter + 4) <= rngcat_len) + cat_low = ntohs(*((__be16 *)&rngcat[iter + 2])); + else + cat_low = 0; + + if (cat_high > cat_prev) + return -EFAULT; + + cat_prev = cat_low; + } + + return 0; +} + +/** + * cipso_v4_map_cat_rng_hton - Perform a category mapping from host to network + * @doi_def: the DOI definition + * @secattr: the security attributes + * @net_cat: the zero'd out category list in network/CIPSO format + * @net_cat_len: the length of the CIPSO category list in bytes + * + * Description: + * Perform a label mapping to translate a local MLS category bitmap to the + * correct CIPSO category list using the given DOI definition. Returns the + * size in bytes of the network category bitmap on success, negative values + * otherwise. + * + */ +static int cipso_v4_map_cat_rng_hton(const struct cipso_v4_doi *doi_def, +const struct netlbl_lsm_secattr *secattr, +unsigned char *net_cat, +u32 net_cat_len) +{ + /* The constant '16' is not random, it is the maximum number of +* high/low category range pairs as permitted by the CIPSO draft based +* on a maximum IPv4 header length of 60 bytes - the BUG_ON() assertion +* does a sanity check to make sure we don't overflow the array. */ + int iter = -1; + u16 array[16]; + u32 array_cnt = 0; + u32 cat_size = 0; + + BUG_ON(net_cat_len > 30); + + for (;;) { + iter = netlbl_secattr_catmap_walk(secattr->mls_cat, iter + 1); + if (iter < 0) + break; + cat_size += (iter == 0 ? 
0 : sizeof(u16)); + if (cat_size > net_cat_len) + return -ENOSPC; + array[array_cnt++] = iter; + + iter = netlbl_secattr_catmap_walk_rng(secattr->mls_cat, iter); + if (iter < 0) + return -EFAULT; + cat_size += sizeof(u16); + if (cat_size > net_cat_len) + return -ENOSPC; + array[array_cnt++] = iter; + } + + for (iter = 0; array_cnt > 0;) { + *((__be16 *)&net_cat[iter]) = htons(array[--array_cnt]); + iter += 2; + array_cnt--; + if (array[array_cnt] != 0) { +
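As a reading aid for the ranged tag layout, here is a userspace sketch of the same ordering check that cipso_v4_map_cat_rng_valid() performs in the patch above. It is an illustration only: rng_list_valid() and read_be16() are names made up here, MAX_REM_CATS is assumed to be 65534 to match the limit quoted in the changelog, and the DOI type test from the kernel function is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to correspond to CIPSO_V4_MAX_REM_CATS (65534 per the changelog). */
#define MAX_REM_CATS 65534u

/* Read one big-endian (network order) 16 bit field. */
static uint16_t read_be16(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Check a ranged category list: a sequence of (high, low) pairs stored
 * highest range first, where the final "low" may be omitted if it is zero.
 * Returns 0 if the list is acceptable, -1 otherwise.
 */
int rng_list_valid(const unsigned char *buf, size_t len)
{
	uint32_t prev = MAX_REM_CATS + 1;	/* above any legal category */
	size_t i;

	if (len & 0x01)
		return -1;	/* fields are 16 bits wide, so length must be even */

	for (i = 0; i < len; i += 4) {
		uint16_t high = read_be16(&buf[i]);
		/* a trailing pair without "low" means the range ends at zero */
		uint16_t low = (i + 4 <= len) ? read_be16(&buf[i + 2]) : 0;

		if (high > prev)
			return -1;	/* ranges must not climb back up */
		prev = low;
	}
	return 0;
}
```

Like the function in the patch, this only checks that each range starts at or below the end of the previous one. It does not check that high is greater than or equal to low within a pair.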
[PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft
This patchset consists of three patches that add support for the remaining two tag types from the CIPSO draft specification, the enumerated and range tags. The most significant part about adding these two tags is that NetLabel now has the ability to represent more than 240 categories (limitation imposed by the current restricted bitmap tag). In addition, the first patch in the set converts NetLabel's contiguous char string category bitmap stored in network friendly bit/byte order into a sparse bitmap stored in host friendly bit/byte order. While this change was not required to support the new CIPSO tags, it should make life much easier as the old category bitmap would have proven problematic as the number of usable categories increases with the new tag types. It also has a side effect of making the LSM specific code much less ugly. During testing I have not seen any regressions with this patchset; please consider this for net-2.6.20. Thanks. -- paul moore linux security @ hp
[PATCH 2/3] NetLabel: add the enumerated tag to the CIPSOv4 protocol
From: Paul Moore <[EMAIL PROTECTED]> Add support for the enumerated tag (tag type #2) to the CIPSOv4 protocol. The enumerated tag allows for 15 categories to be specified in a CIPSO option, where each category is an unsigned 16 bit field with a maximum value of 65534. See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details. Signed-off-by: Paul Moore <[EMAIL PROTECTED]> --- net/ipv4/cipso_ipv4.c | 233 ++ 1 files changed, 233 insertions(+) Index: net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c === --- net-2.6.20_netlabel-cats.orig/net/ipv4/cipso_ipv4.c +++ net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c @@ -459,6 +459,10 @@ int cipso_v4_doi_add(struct cipso_v4_doi if (iter == 0) return -EINVAL; break; + case CIPSO_V4_TAG_ENUM: + if (doi_def->type != CIPSO_V4_MAP_PASS) + return -EINVAL; + break; default: return -EINVAL; } @@ -940,6 +944,107 @@ static int cipso_v4_map_cat_rbm_ntoh(con return -EINVAL; } +/** + * cipso_v4_map_cat_enum_valid - Checks to see if the categories are valid + * @doi_def: the DOI definition + * @enumcat: category list + * @enumcat_len: length of the category list in bytes + * + * Description: + * Checks the given categories against the given DOI definition and returns a + * negative value if any of the categories do not have a valid mapping and a + * zero value if all of the categories are valid. 
+ * + */ +static int cipso_v4_map_cat_enum_valid(const struct cipso_v4_doi *doi_def, + const unsigned char *enumcat, + u32 enumcat_len) +{ + u16 cat; + int cat_prev = -1; + u32 iter; + + if (doi_def->type != CIPSO_V4_MAP_PASS || enumcat_len & 0x01) + return -EFAULT; + + for (iter = 0; iter < enumcat_len; iter += 2) { + cat = ntohs(*((__be16 *)&enumcat[iter])); + if (cat <= cat_prev) + return -EFAULT; + cat_prev = cat; + } + + return 0; +} + +/** + * cipso_v4_map_cat_enum_hton - Perform a category mapping from host to network + * @doi_def: the DOI definition + * @secattr: the security attributes + * @net_cat: the zero'd out category list in network/CIPSO format + * @net_cat_len: the length of the CIPSO category list in bytes + * + * Description: + * Perform a label mapping to translate a local MLS category bitmap to the + * correct CIPSO category list using the given DOI definition. Returns the + * size in bytes of the network category bitmap on success, negative values + * otherwise. + * + */ +static int cipso_v4_map_cat_enum_hton(const struct cipso_v4_doi *doi_def, + const struct netlbl_lsm_secattr *secattr, + unsigned char *net_cat, + u32 net_cat_len) +{ + int cat = -1; + u32 cat_iter = 0; + + for (;;) { + cat = netlbl_secattr_catmap_walk(secattr->mls_cat, cat + 1); + if (cat < 0) + break; + if ((cat_iter + 2) > net_cat_len) + return -ENOSPC; + + *((__be16 *)&net_cat[cat_iter]) = htons(cat); + cat_iter += 2; + } + + return cat_iter; +} + +/** + * cipso_v4_map_cat_enum_ntoh - Perform a category mapping from network to host + * @doi_def: the DOI definition + * @net_cat: the category list in network/CIPSO format + * @net_cat_len: the length of the CIPSO bitmap in bytes + * @secattr: the security attributes + * + * Description: + * Perform a label mapping to translate a CIPSO category list to the correct + * local MLS category bitmap using the given DOI definition. Returns zero on + * success, negative values on failure. 
+ * + */ +static int cipso_v4_map_cat_enum_ntoh(const struct cipso_v4_doi *doi_def, + const unsigned char *net_cat, + u32 net_cat_len, + struct netlbl_lsm_secattr *secattr) +{ + int ret_val; + u32 iter; + + for (iter = 0; iter < net_cat_len; iter += 2) { + ret_val = netlbl_secattr_catmap_setbit(secattr->mls_cat, + ntohs(*((__be16 *)&net_cat[iter])), + GFP_ATOMIC); + if (ret_val != 0) + return ret_val; + } + + return 0; +} + /* * Protocol Handling Functions */ @@ -1068,6 +1173,99 @@ static int cipso_v4_parsetag_rbm(const s } /** + * cipso_v4_gentag_enum - Generate a CIPSO enumerated tag (type #2)
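The enumerated tag's wire format is simple enough to try out in userspace. The sketch below is not the kernel code: enum_encode() and enum_valid() are hypothetical names, and a plain sorted array stands in for the catmap walk done by netlbl_secattr_catmap_walk() in the patch. It shows the two properties the patch relies on: each category is a big-endian 16 bit field, and the list must be strictly increasing.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write each category as a big-endian 16 bit field.
 * Returns the number of bytes written, or -1 if "out" is too small
 * (the kernel function returns -ENOSPC in that case).
 */
int enum_encode(const uint16_t *cats, size_t n,
		unsigned char *out, size_t out_len)
{
	size_t i, pos = 0;

	for (i = 0; i < n; i++) {
		if (pos + 2 > out_len)
			return -1;
		out[pos] = (unsigned char)(cats[i] >> 8);
		out[pos + 1] = (unsigned char)(cats[i] & 0xff);
		pos += 2;
	}
	return (int)pos;
}

/*
 * Mirror of the ordering test in cipso_v4_map_cat_enum_valid():
 * even length, every category strictly greater than the previous one.
 * Returns 0 if valid, -1 otherwise.
 */
int enum_valid(const unsigned char *buf, size_t len)
{
	int prev = -1;
	size_t i;

	if (len & 0x01)
		return -1;
	for (i = 0; i < len; i += 2) {
		int cat = (buf[i] << 8) | buf[i + 1];

		if (cat <= prev)
			return -1;
		prev = cat;
	}
	return 0;
}
```

Because the walk over the bitmap yields set bits in ascending order, a list produced by the encoder passes the validity check by construction. Duplicates or out-of-order entries can only come from the network side.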
[PATCH 1/3] NetLabel: convert to an extensible/sparse category bitmap
From: Paul Moore <[EMAIL PROTECTED]> The original NetLabel category bitmap was a straight char bitmap which worked fine for the initial release as it only supported 240 bits due to limitations in the CIPSO restricted bitmap tag (tag type 0x01). This patch converts that straight char bitmap into an extensibile/sparse bitmap in order to lay the foundation for other CIPSO tag types and protocols. This patch also has a nice side effect in that all of the security attributes passed by NetLabel into the LSM are now in a format which is in the host's native byte/bit ordering which makes the LSM specific code much simpler; look at the changes in security/selinux/ss/ebitmap.c as an example. Signed-off-by: Paul Moore <[EMAIL PROTECTED]> --- include/net/netlabel.h | 102 net/ipv4/cipso_ipv4.c | 170 ++ net/netlabel/netlabel_kapi.c | 201 + security/selinux/ss/ebitmap.c | 196 +-- security/selinux/ss/ebitmap.h | 26 - security/selinux/ss/mls.c | 156 ++- security/selinux/ss/mls.h | 46 ++--- security/selinux/ss/services.c | 23 +--- 8 files changed, 568 insertions(+), 352 deletions(-) Index: net-2.6.20_netlabel-cats/include/net/netlabel.h === --- net-2.6.20_netlabel-cats.orig/include/net/netlabel.h +++ net-2.6.20_netlabel-cats/include/net/netlabel.h @@ -111,6 +111,22 @@ struct netlbl_lsm_cache { void (*free) (const void *data); void *data; }; +/* The catmap bitmap field MUST be a power of two in length and large + * enough to hold at least 240 bits. Special care (i.e. check the code!) + * should be used when changing these values as the LSM implementation + * probably has functions which rely on the sizes of these types to speed + * processing. 
*/ +#define NETLBL_CATMAP_MAPTYPE u64 +#define NETLBL_CATMAP_MAPCNT4 +#define NETLBL_CATMAP_MAPSIZE (sizeof(NETLBL_CATMAP_MAPTYPE) * 8) +#define NETLBL_CATMAP_SIZE (NETLBL_CATMAP_MAPSIZE * \ +NETLBL_CATMAP_MAPCNT) +#define NETLBL_CATMAP_BIT (NETLBL_CATMAP_MAPTYPE)0x01 +struct netlbl_lsm_secattr_catmap { + u32 startbit; + NETLBL_CATMAP_MAPTYPE bitmap[NETLBL_CATMAP_MAPCNT]; + struct netlbl_lsm_secattr_catmap *next; +}; #define NETLBL_SECATTR_NONE 0x #define NETLBL_SECATTR_DOMAIN 0x0001 #define NETLBL_SECATTR_CACHE0x0002 @@ -122,8 +138,7 @@ struct netlbl_lsm_secattr { char *domain; u32 mls_lvl; - unsigned char *mls_cat; - size_t mls_cat_len; + struct netlbl_lsm_secattr_catmap *mls_cat; struct netlbl_lsm_cache *cache; }; @@ -171,6 +186,41 @@ static inline void netlbl_secattr_cache_ } /** + * netlbl_secattr_catmap_alloc - Allocate a LSM secattr catmap + * @flags: memory allocation flags + * + * Description: + * Allocate memory for a LSM secattr catmap, returns a pointer on success, NULL + * on failure. + * + */ +static inline struct netlbl_lsm_secattr_catmap *netlbl_secattr_catmap_alloc( + gfp_t flags) +{ + return kzalloc(sizeof(struct netlbl_lsm_secattr_catmap), flags); +} + +/** + * netlbl_secattr_catmap_free - Free a LSM secattr catmap + * @catmap: the category bitmap + * + * Description: + * Free a LSM secattr catmap. 
+ * + */ +static inline void netlbl_secattr_catmap_free( + struct netlbl_lsm_secattr_catmap *catmap) +{ + struct netlbl_lsm_secattr_catmap *iter; + + do { + iter = catmap; + catmap = catmap->next; + kfree(iter); + } while (catmap); +} + +/** * netlbl_secattr_init - Initialize a netlbl_lsm_secattr struct * @secattr: the struct to initialize * @@ -200,7 +250,8 @@ static inline void netlbl_secattr_destro if (secattr->cache) netlbl_secattr_cache_free(secattr->cache); kfree(secattr->domain); - kfree(secattr->mls_cat); + if (secattr->mls_cat) + netlbl_secattr_catmap_free(secattr->mls_cat); } /** @@ -231,6 +282,51 @@ static inline void netlbl_secattr_free(s kfree(secattr); } +#ifdef CONFIG_NETLABEL +int netlbl_secattr_catmap_walk(struct netlbl_lsm_secattr_catmap *catmap, + u32 offset); +int netlbl_secattr_catmap_walk_rng(struct netlbl_lsm_secattr_catmap *catmap, + u32 offset); +int netlbl_secattr_catmap_setbit(struct netlbl_lsm_secattr_catmap *catmap, +u32 bit, +gfp_t flags); +int netlbl_secattr_catmap_setrng(struct netlbl_lsm_secattr_catmap *catmap, +u32 start, +
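To make the data structure easier to picture, here is a userspace sketch of a sparse bitmap with the same node layout as struct netlbl_lsm_secattr_catmap (a start bit, four 64 bit words, a next pointer). The bodies of netlbl_secattr_catmap_setbit() and netlbl_secattr_catmap_walk() are not part of the text above, so the functions below are my own guess at the behaviour, with different names and signatures, and should not be read as the kernel implementation.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAPCNT   4			/* words per node, as in the patch */
#define MAPSIZE  64			/* bits per 64 bit word */
#define NODEBITS (MAPSIZE * MAPCNT)	/* 256 bits covered by one node */

struct catmap {
	uint32_t startbit;		/* first bit covered by this node */
	uint64_t bitmap[MAPCNT];
	struct catmap *next;
};

/*
 * Set one bit, allocating a node on demand and keeping the list sorted
 * by startbit. Returns 0 on success, -1 if allocation fails.
 */
int catmap_setbit(struct catmap **head, uint32_t bit)
{
	uint32_t start = bit - (bit % NODEBITS);
	struct catmap **pp = head;
	struct catmap *node;

	while (*pp && (*pp)->startbit < start)
		pp = &(*pp)->next;
	if (!*pp || (*pp)->startbit != start) {
		node = calloc(1, sizeof(*node));
		if (!node)
			return -1;
		node->startbit = start;
		node->next = *pp;
		*pp = node;
	}
	node = *pp;
	bit -= start;
	node->bitmap[bit / MAPSIZE] |= (uint64_t)1 << (bit % MAPSIZE);
	return 0;
}

/* Return the first set bit at or after "offset", or -1 if there is none. */
long catmap_walk(const struct catmap *map, uint32_t offset)
{
	for (; map; map = map->next) {
		uint32_t b;

		if (map->startbit + NODEBITS <= offset)
			continue;	/* node lies entirely below offset */
		b = offset > map->startbit ? offset - map->startbit : 0;
		for (; b < NODEBITS; b++)
			if (map->bitmap[b / MAPSIZE] & ((uint64_t)1 << (b % MAPSIZE)))
				return (long)map->startbit + b;
	}
	return -1;
}

/* Free every node in the list; safe to call with NULL. */
void catmap_free(struct catmap *map)
{
	while (map) {
		struct catmap *next = map->next;

		free(map);
		map = next;
	}
}
```

The point of the layout is that a label using categories 5 and 60000 needs two 256 bit nodes instead of one contiguous 60001 bit array, and the bits are in host order so an LSM can copy words directly.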
[PATCH 1/1] add auditing to ipsec
This patch adds auditing to ipsec. An audit message occurs when an ipsec SA or ipsec policy is created/deleted. Patch was built against linux kernel 2.6.19-rc6. Please let me know if this is acceptable. Regards, Joy Signed-off-by: Joy Latten <[EMAIL PROTECTED]> --- diff -urpN linux-2.6.18.orig/include/linux/audit.h linux-2.6.18-patch/include/linux/audit.h --- linux-2.6.18.orig/include/linux/audit.h 2006-11-27 11:21:16.0 -0600 +++ linux-2.6.18-patch/include/linux/audit.h2006-11-27 12:28:43.0 -0600 @@ -101,6 +101,10 @@ #define AUDIT_MAC_CIPSOV4_DEL 1408/* NetLabel: del CIPSOv4 DOI entry */ #define AUDIT_MAC_MAP_ADD 1409/* NetLabel: add LSM domain mapping */ #define AUDIT_MAC_MAP_DEL 1410/* NetLabel: del LSM domain mapping */ +#define AUDIT_MAC_IPSEC_ADDSA 1411/* Add a XFRM state */ +#define AUDIT_MAC_IPSEC_DELSA 1412/* Delete a XFRM state */ +#define AUDIT_MAC_IPSEC_ADDSPD 1413/* Add a XFRM policy */ +#define AUDIT_MAC_IPSEC_DELSPD 1414/* Delete a XFRM policy */ #define AUDIT_FIRST_KERN_ANOM_MSG 1700 #define AUDIT_LAST_KERN_ANOM_MSG1799 @@ -377,6 +381,7 @@ extern void auditsc_get_stamp(struct aud struct timespec *t, unsigned int *serial); extern int audit_set_loginuid(struct task_struct *task, uid_t loginuid); extern uid_t audit_get_loginuid(struct audit_context *ctx); +extern void audit_log_task_context(struct audit_buffer *ab); extern int __audit_ipc_obj(struct kern_ipc_perm *ipcp); extern int __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode); extern int audit_bprm(struct linux_binprm *bprm); @@ -449,6 +454,7 @@ extern int audit_n_rules; #define audit_inode_update(i) do { ; } while (0) #define auditsc_get_stamp(c,t,s) do { BUG(); } while (0) #define audit_get_loginuid(c) ({ -1; }) +#define audit_log_task_context(b) do { ; } while (0) #define audit_ipc_obj(i) ({ 0; }) #define audit_ipc_set_perm(q,u,g,m) ({ 0; }) #define audit_bprm(p) ({ 0; }) diff -urpN linux-2.6.18.orig/include/net/xfrm.h linux-2.6.18-patch/include/net/xfrm.h --- 
linux-2.6.18.orig/include/net/xfrm.h2006-11-27 11:21:43.0 -0600 +++ linux-2.6.18-patch/include/net/xfrm.h 2006-11-27 12:29:11.0 -0600 @@ -389,6 +389,15 @@ extern int xfrm_unregister_km(struct xfr extern unsigned int xfrm_policy_count[XFRM_POLICY_MAX*2]; +/* Audit Information */ +struct xfrm_audit +{ + uid_t loginuid; + u32 secid; +}; +void xfrm_audit_log(uid_t auid, u32 secid, int type, int result, + struct xfrm_policy *xp, struct xfrm_state *x); + static inline void xfrm_pol_hold(struct xfrm_policy *policy) { if (likely(policy != NULL)) @@ -934,7 +943,7 @@ static inline int xfrm_state_sort(struct #endif extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq); extern int xfrm_state_delete(struct xfrm_state *x); -extern void xfrm_state_flush(u8 proto); +extern void xfrm_state_flush(u8 proto, struct xfrm_audit *audit_info); extern int xfrm_replay_check(struct xfrm_state *x, __be32 seq); extern void xfrm_replay_advance(struct xfrm_state *x, __be32 seq); extern void xfrm_replay_notify(struct xfrm_state *x, int event); @@ -987,13 +996,13 @@ struct xfrm_policy *xfrm_policy_bysel_ct struct xfrm_selector *sel, struct xfrm_sec_ctx *ctx, int delete); struct xfrm_policy *xfrm_policy_byid(u8, int dir, u32 id, int delete); -void xfrm_policy_flush(u8 type); +void xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info); u32 xfrm_get_acqseq(void); void xfrm_alloc_spi(struct xfrm_state *x, __be32 minspi, __be32 maxspi); struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create, unsigned short family); -extern void xfrm_policy_flush(u8 type); +extern void xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); extern int xfrm_bundle_ok(struct xfrm_policy *pol, struct xfrm_dst *xdst, struct flowi *fl, int family, int strict); diff -urpN linux-2.6.18.orig/kernel/auditsc.c linux-2.6.18-patch/kernel/auditsc.c --- 
linux-2.6.18.orig/kernel/auditsc.c 2006-11-27 11:19:36.0 -0600 +++ linux-2.6.18-patch/kernel/auditsc.c 2006-11-27 12:26:39.0 -0600 @@ -730,7 +730,7 @@ static inline void audit_free_context(st printk(KERN_ERR "audit: freed %d contexts\n", count); } -static void audit_log_task_context(struct audit_buffer *ab) +void audit_log_task_context(struct audit_buffer *ab) { char *ctx = NULL; ssize_t len = 0; @@ -759,6 +759,8 @@ error_path: return; } +EXPORT_SYMBOL(audit_log_task_context); + sta
[SAA9730] Fix build error
Confusingly NET_PCI is also set for non-PCI EISA configurations where building this driver will result in a build error due to a reference to pci_release_regions. While at it, remove the EXPERIMENTAL - in all its ugliness and despite the sincerest attempts of the buggy hardware the driver is known to work. Also limit the driver to the Atlas board - the only known system to ever use the SAA9730 before Philips ended the short life of the SAA9730.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 4f22c8e..c80eb79 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1761,8 +1761,8 @@ config VIA_RHINE_NAPI
 	  information.
 
 config LAN_SAA9730
-	bool "Philips SAA9730 Ethernet support (EXPERIMENTAL)"
-	depends on NET_PCI && EXPERIMENTAL && MIPS
+	bool "Philips SAA9730 Ethernet support"
+	depends on NET_PCI && PCI && MIPS_ATLAS
 	help
 	  The SAA9730 is a combined multimedia and peripheral controller used
 	  in thin clients, Internet access terminals, and diskless
[PATCH 3/4] NetXen: 64-bit memory fixes and driver cleanup
NetXen: 1G/10G Ethernet Driver updates - These fixes take care of driver on machines with >4G memory - Driver cleanup Signed-off-by: Amit S. Kale <[EMAIL PROTECTED]> netxen_nic.h | 41 ++ netxen_nic_ethtool.c | 19 ++-- netxen_nic_hw.c | 10 +- netxen_nic_hw.h |4 netxen_nic_init.c | 51 +++- netxen_nic_isr.c |3 netxen_nic_main.c | 204 +++--- netxen_nic_phan_reg.h | 10 +- 8 files changed, 293 insertions(+), 49 deletions(-) diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h index 1bee560..84259f9 100644 --- a/drivers/net/netxen/netxen_nic.h +++ b/drivers/net/netxen/netxen_nic.h @@ -6,12 +6,12 @@ * modify it under the terms of the GNU General Public License * as published by the Free Software Foundation; either version 2 * of the License, or (at your option) any later version. - * + * * This program is distributed in the hope that it will be useful, but * WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. 
- * + * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 59 Temple Place - Suite 330, Boston, @@ -89,8 +89,8 @@ * normalize a 64MB crb address to 32MB PCI window * To use NETXEN_CRB_NORMALIZE, window _must_ be set to 1 */ -#define NETXEN_CRB_NORMAL(reg)\ - (reg) - NETXEN_CRB_PCIX_HOST2 + NETXEN_CRB_PCIX_HOST +#define NETXEN_CRB_NORMAL(reg) \ + ((reg) - NETXEN_CRB_PCIX_HOST2 + NETXEN_CRB_PCIX_HOST) #define NETXEN_CRB_NORMALIZE(adapter, reg) \ pci_base_offset(adapter, NETXEN_CRB_NORMAL(reg)) @@ -164,7 +164,7 @@ enum { #define MAX_CMD_DESCRIPTORS1024 #define MAX_RCV_DESCRIPTORS32768 -#define MAX_JUMBO_RCV_DESCRIPTORS 1024 +#define MAX_JUMBO_RCV_DESCRIPTORS 4096 #define MAX_RCVSTATUS_DESCRIPTORS MAX_RCV_DESCRIPTORS #define MAX_JUMBO_RCV_DESC MAX_JUMBO_RCV_DESCRIPTORS #define MAX_RCV_DESC MAX_RCV_DESCRIPTORS @@ -592,6 +592,16 @@ struct netxen_skb_frag { u32 length; }; +/* Bounce buffer index */ +struct bounce_index { + /* Index of a buffer */ + unsigned buffer_index; + /* Offset inside the buffer */ + unsigned buffer_offset; +}; + +#define IS_BOUNCE 0xcafebb + /*Following defines are for the state of the buffers*/ #defineNETXEN_BUFFER_FREE 0 #defineNETXEN_BUFFER_BUSY 1 @@ -611,6 +621,8 @@ struct netxen_cmd_buffer { unsigned long time_stamp; u32 state; u32 no_of_descriptors; + u32 tx_bounce_buff; + struct bounce_index bnext; }; /* In rx_buffer, we do not need multiple fragments as is a single buffer */ @@ -619,6 +631,9 @@ struct netxen_rx_buffer { u64 dma; u16 ref_handle; u16 state; + u32 rx_bounce_buff; + struct bounce_index bnext; + char *bounce_ptr; }; /* Board types */ @@ -703,6 +718,7 @@ struct netxen_recv_context { }; #define NETXEN_NIC_MSI_ENABLED 0x02 +#define NETXEN_DMA_MASK0xfffe struct netxen_drvops; @@ -937,9 +953,7 @@ static inline void netxen_nic_disable_in /* * ISR_INT_MASK: Can be read from window 0 or 1. 
*/ - writel(0x7ff, - (void __iomem - *)(PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK))); + writel(0x7ff, PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK)); } @@ -959,14 +973,12 @@ static inline void netxen_nic_enable_int break; } - writel(mask, - (void __iomem - *)(PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK))); + writel(mask, PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK)); if (!(adapter->flags & NETXEN_NIC_MSI_ENABLED)) { mask = 0xbff; - writel(mask, (void __iomem *) - (PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_TARGET_MASK))); + writel(mask, PCI_OFFSET_SECOND_RANGE(adapter, +ISR_INT_TARGET_MASK)); } } @@ -1040,6 +1052,9 @@ static inline void get_brd_name_by_type( int netxen_is_flash_supported(struct netxen_adapter *adapter); int netxen_get_flash_mac_addr(struct netxen_adapter *adapter, u64 mac[]); +int netxen_get_next_bounce_buffer(struct bounce_index *head, + struct bounce_index *tail, + struct bounce_index *biret, unsigned len); extern void netxen_change_ringparam(struct netxen_adapter *adapter); extern int netxen_rom_fast_read(struct netxen_adapter *adapter, int addr, diff --git a/drivers/net/netxen/netxen_nic_ethtool.c
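One of the cleanups in this patch wraps the body of NETXEN_CRB_NORMAL in parentheses. The sketch below shows why that matters for macros in general. The offsets and macro names are made up for illustration and are not the NetXen register values, and the text above does not say whether any caller in the driver was actually affected.

```c
/* Made-up offsets, standing in for the PCIX host window bases. */
#define HOST2 0x1000
#define HOST  0x0100

/* Old shape: expansion is not wrapped, so it can mix with neighbours. */
#define NORMAL_OLD(reg) (reg) - HOST2 + HOST

/* New shape: the whole expression is parenthesized. */
#define NORMAL_NEW(reg) ((reg) - HOST2 + HOST)

/* Expands to 2 * (reg) - HOST2 + HOST: only reg gets doubled. */
int scaled_old(int reg)
{
	return 2 * NORMAL_OLD(reg);
}

/* Expands to 2 * ((reg) - HOST2 + HOST), which is what a caller expects. */
int scaled_new(int reg)
{
	return 2 * NORMAL_NEW(reg);
}
```

When the macro is only ever used as a complete function argument, as in NETXEN_CRB_NORMALIZE above, both forms give the same result, so the change reads as a defensive fix.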
[PATCH 1/4] NetXen: Fixed /sys mapping between device and driver
Signed-off-by: Amit S. Kale <[EMAIL PROTECTED]> netxen_nic_main.c |3 ++- 1 files changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c index 145bf47..a055208 100644 --- a/drivers/net/netxen/netxen_nic_main.c +++ b/drivers/net/netxen/netxen_nic_main.c @@ -273,6 +273,7 @@ netxen_nic_probe(struct pci_dev *pdev, c } SET_MODULE_OWNER(netdev); + SET_NETDEV_DEV(netdev, &pdev->dev); port = netdev_priv(netdev); port->netdev = netdev; @@ -1043,7 +1044,7 @@ static int netxen_nic_poll(struct net_de netxen_nic_enable_int(adapter); } - return (done ? 0 : 1); + return !done; } #ifdef CONFIG_NET_POLL_CONTROLLER
[PATCH 0/4] NetXen: 1G/10G Ethernet Driver updates
I will be sending NetXen: 1G/10G Ethernet Driver updates in subsequent emails. Thanks, --Amit
Re: [PATCH] d80211: Reset assoc and auth retry counters
On Wed, Nov 29, 2006 at 03:33:07PM +0100, Jiri Benc wrote: > On Wed, 29 Nov 2006 15:27:06 +0100, Ivo Van Doorn wrote: > > Shouldn't this last one be: > > ieee80211_set_disassoc(dev, ifsta, 0) > > > > This one is called from the IOCTL request to disassociate, > > so the interface should still be authenticated (with a valid > > auth retry counter). > > Yes, of course. Thanks for being watchful :-) I'll massage this and apply it on top of wireless-dev, since I already applied Ivo's patch. John -- John W. Linville [EMAIL PROTECTED]
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wednesday 29 November 2006 16:58, Michael Renzmann wrote: > Hi. > > > On Wednesday 29 November 2006 16:24, David Kimdon wrote: > >> There is absolutely no reason why dadwifi can't be merged into the > >> mainline once the hal issue is resolved. > > Last time we talked about that stuff, it was decided that > > we don't want a HAL... See archives. > > IIRC Pavel already explained that getting rid of the HAL per se should be > no problem - it could easily be dissolved into the driver, if that is one > of the requirements to be fulfilled before the driver (MadWifi or DadWifi) > is considered for mainline inclusion. As soon as there is source available > to dissolve, at least. Ok, so who actually does the work? There has been a lot of talk about what could and what should be done. But who does it? > From what I understood the "... once the hal issue is resolved" part of > David's mail referred to exactly that question. Ok, I don't know what "The HAL Issue" (tm) is. Sounds like a hollywood movie theme to me. ;) -- Greetings Michael.
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wed, Nov 29, 2006 at 04:38:56PM +0100, Michael Buesch wrote:
> On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> > On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > Why do you say that?
> >
> > There is absolutely no reason why dadwifi can't be merged into the
> > mainline once the hal issue is resolved.
>
> Last time we talked about that stuff, it was decided that
> we don't want a HAL... See archives.

To be clear, that is all part of the hal issue that needs to be
resolved. Removing the hal abstraction is not difficult for an
interested party once source for the hal is available. The next step
in such an effort would be to add an open hal to dadwifi, IMO.

-David

P.S. Actually, it isn't clear to me that removing the hal entirely is
a good idea. Abstractions exist for practical reasons. The hal allows
dadwifi to support a variety of Atheros chips without needing to worry
about the specific details of each chip.
Re: [Madwifi-devel] ar5k and Atheros AR5005G
Hi.

> On Wednesday 29 November 2006 16:24, David Kimdon wrote:
>> There is absolutely no reason why dadwifi can't be merged into the
>> mainline once the hal issue is resolved.
> Last time we talked about that stuff, it was decided that
> we don't want a HAL... See archives.

IIRC Pavel already explained that getting rid of the HAL per se should be
no problem - it could easily be dissolved into the driver, if that is one
of the requirements to be fulfilled before the driver (MadWifi or DadWifi)
is considered for mainline inclusion. As soon as there is source available
to dissolve, at least.

From what I understood the "... once the hal issue is resolved" part of
David's mail referred to exactly that question.

Bye, Mike
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wed, 2006-11-29 at 15:55 +0200, Nick Kossifidis wrote:
> I 've already ported ar5k to linux and it works with madwifi versions
> before the bsd-head merge, you can see more infos here ->
> http://madwifi.org/wiki/OpenHAL
>
> If i can help in any way feel free to mail ;-)

Thanks, I'm trying it out to see whether it works on my hardware.

I compiled and loaded everything OK, ath0 appears, but it doesn't seem
to be working. I'm using these commands:

  ifconfig ath0 up
  iwlist ath0 scan

Should that produce scan results, or do I need to use some weird tools
to do that? (this is my first interaction with the madwifi-old driver)

Currently it pauses for a while and then doesn't present any results.

Thanks!
-- 
Daniel Drake
Brontes Technologies, A 3M Company
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Good luck then ;-)
> > >
> > > If anyone wants to help on making ar5k work with newer madwifi
> > > versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
> > > We can make it better.
> > >
> > > Nick
> > > P.S. Why not work on dadwifi ?
> >
> > Because it won't be merged mainline either.
>
> Why do you say that?
>
> There is absolutely no reason why dadwifi can't be merged into the
> mainline once the hal issue is resolved.

Ok, I deleted my repository. Atheros stuff is really too frustrating
to work on and I don't have the time anyway.
If you believe dadwifi can be merged, please _do_ so.

-- 
Greetings Michael.
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Good luck then ;-)
> > >
> > > If anyone wants to help on making ar5k work with newer madwifi
> > > versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
> > > We can make it better.
> > >
> > > Nick
> > > P.S. Why not work on dadwifi ?
> >
> > Because it won't be merged mainline either.
>
> Why do you say that?
>
> There is absolutely no reason why dadwifi can't be merged into the
> mainline once the hal issue is resolved.

Last time we talked about that stuff, it was decided that
we don't want a HAL... See archives.

-- 
Greetings Michael.
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wed, Nov 29, 2006 at 10:21:09AM -0500, Dan Williams wrote:
> On Wed, 2006-11-29 at 16:12 +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Good luck then ;-)
> > >
> > > If anyone wants to help on making ar5k work with newer madwifi
> > > versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
> > > We can make it better.
> > >
> > > Nick
> > > P.S. Why not work on dadwifi ?
> >
> > Because it won't be merged mainline either.
>
> I thought dadwifi was supposed to replace net80211 with d80211 (but not
> replace the binary HAL).

yes

> Aren't the two things complementary,

yes

> or did
> you just decide that starting from scratch would produce a less crufty,
> better understood, better-d80211 integrated driver?

well, dadwifi will be (is) well integrated with d80211. As far as cruft
goes, I'd rather call it historical artifacts :-) We are doing our
best to minimize cruft while standing on the shoulders of madwifi.

-David
Re: [PATCH] [IPVS] transparent proxying
Hi Horms,

I see that this patch probably makes the IPVS code a bit more
complicated and packet traversal less efficient.

If I remember correctly, policy-based routing has worked with IPVS in
kernels 2.2 and 2.4 for transparent cache clusters for a long time. It
should work in kernel 2.6 too. For example, we can use iptables/ipchains
to mark all web traffic with fwmark 1, then use policy-based routing to
route all web traffic through NF_IP_LOCAL_IN, so that ip_vs_in can
capture the packets and load balance them to the cache servers.

    ip rule add prio 100 fwmark 1 table 100
    ip route add local 0/0 dev lo table 100

    ipvsadm -A -f 1 -s wlc
    ipvsadm -a -f 1 -w 100 -r cache1
    ipvsadm -a -f 1 -w 100 -r cache2
    ipvsadm -a -f 1 -w 100 -r cache2
    ...

Cheers,

Wensong

Horms wrote:
This seems to be a pretty clean solution to a real problem.

Ultimately I would like to see IPVS move into the forward chain.
This seems to be a nice way to explore that, without breaking
any existing setups.
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > Good luck then ;-)
> >
> > If anyone wants to help on making ar5k work with newer madwifi
> > versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
> > We can make it better.
> >
> > Nick
> > P.S. Why not work on dadwifi ?
>
> Because it won't be merged mainline either.

Why do you say that?

There is absolutely no reason why dadwifi can't be merged into the
mainline once the hal issue is resolved.

-David
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wed, 2006-11-29 at 16:12 +0100, Michael Buesch wrote:
> On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > Good luck then ;-)
> >
> > If anyone wants to help on making ar5k work with newer madwifi
> > versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
> > We can make it better.
> >
> > Nick
> > P.S. Why not work on dadwifi ?
>
> Because it won't be merged mainline either.

I thought dadwifi was supposed to replace net80211 with d80211 (but not
replace the binary HAL). Aren't the two things complementary, or did
you just decide that starting from scratch would produce a less crufty,
better understood, better-d80211 integrated driver?

Dan
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
Krzysztof Halasa <[EMAIL PROTECTED]> writes:

> I wound't care less btw.

s/wound/couldn/, eh those foreign languages...
-- 
Krzysztof Halasa
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> Good luck then ;-)
>
> If anyone wants to help on making ar5k work with newer madwifi
> versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
> We can make it better.
>
> Nick
> P.S. Why not work on dadwifi ?

Because it won't be merged mainline either.

-- 
Greetings Michael.
Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function
Jarek Poplawski <[EMAIL PROTECTED]> writes:

> And if we talk about names:
>
> + Spotted by Krzysztof Halasa.

I wound't care less btw.
-- 
Krzysztof Halasa
Re: [PATCH] [IPVS] transparent proxying
On Wed, Nov 29, 2006 at 03:15:23PM +0100, Thomas Graf wrote:
> * Horms <[EMAIL PROTECTED]> 2006-11-29 15:21
> > This seems to be a pretty clean solution to a real problem.
> >
> > Ultimately I would like to see IPVS move into the forward chain.
> > This seems to be a nice way to explore that, without breaking
> > any existing setups.
> >
> > --
> > Horms
> > H: http://www.vergenet.net/~horms/
> > W: http://www.valinux.co.jp/en/
> >
> > [IPVS] transparent proxying
> >
> > Patch from Jinhua Luo <[EMAIL PROTECTED]> to allow a web cluster using
> > transparent proxying. It works by simply grabbing packets that have the
> > fwmark set and have not already been processed by ipvs (ip_vs_out) and
> > throwing them into ip_vs_in.
> >
> > See:
> > http://archive.linuxvirtualserver.org/html/lvs-users/2006-11/msg00261.html
> >
> > Normally LVS packets are processed by ip_vs_in from the INPUT chain,
> > and packets that are processed in this way never show up on the FORWARD
> > chain, so they won't hit this rule.
> >
> > This patch seems like a good precursor to moving LVS permanently to
> > the FORWARD chain, as I'm struggling to think how it could break things.
> >
> > The changes to the original patch are:
> >
> > * Reformatted to use tabs for indentation (instead of 4 spaces)
> > * Reformatted to be < 80 columns wide
> > * Added some comments
> > * Rewrote description (this text)
> >
> > Signed-off-by: Simon Horman <[EMAIL PROTECTED]>
> > Signed-off-by: Jinhua Luo <[EMAIL PROTECTED]>
> >
> > Index: linux-2.6/net/ipv4/ipvs/ip_vs_core.c
> > ===
> > --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_core.c 2006-11-28 15:30:00.0 +0900
> > +++ linux-2.6/net/ipv4/ipvs/ip_vs_core.c 2006-11-29 10:27:49.0 +0900
> > @@ -23,7 +23,9 @@
> >   * Changes:
> >   *     Paul `Rusty' Russell    properly handle non-linear skbs
> >   *     Harald Welte            don't use nfcache
> > - *
> > + *     Jinhua Luo              redirect packets with fwmark on
> > + *                             NF_IP_FORWARD chain to ip_vs_in(),
> > + *                             mainly for transparent cache cluster
> >   */
> >
> >  #include
> > @@ -1070,6 +1072,26 @@
> >      return ip_vs_in_icmp(pskb, &r, hooknum);
> >  }
> >
> > +/*
> > + * This is hooked into the NF_IP_FORWARD. It catches
> > + * packets that have not already been handled by ipvs (out)
> > + * and have a fwmark set. This is to allow transparent proxying
> > + * of fwmark virtual services.
> > + *
> > + * It will not process packets that are handled by ipvs (in)
> > + * as they never traverse the NF_IP_FORWARD.
> > + */
> > +static unsigned int
> > +ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
> > +                          const struct net_device *in,
> > +                          const struct net_device *out,
> > +                          int (*okfn)(struct sk_buff *))
> > +{
> > +    if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
> > +        return NF_ACCEPT;
>
> This patch seems to be based on an old tree, I've renamed nfmark
> to mark in net-2.6.20. The term fwmark and nfmark shouldn't be
> used anymore.

Sorry, I based this patch on Linus's tree.
I'll port it to net-2.6.20.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
Re: [Madwifi-devel] ar5k and Atheros AR5005G
Good luck then ;-)

If anyone wants to help on making ar5k work with newer madwifi
versions and fix bugs etc (that 'll also help bsd ppl) plzz mail me.
We can make it better.

Nick
P.S. Why not work on dadwifi ?

2006/11/29, Michael Buesch <[EMAIL PROTECTED]>:
On Wednesday 29 November 2006 14:55, Nick Kossifidis wrote:
> I 've already ported ar5k to linux and it works with madwifi versions

No, you misunderstood me.
Madwifi is not a native driver and will never be accepted
into mainline. My attempt is to write a native d80211 driver
based on the ar5k sources.
Currently I don't have too much time, so it hasn't progressed very far,
but from next week on I have vacation from work, so I think
I can work on this again.

--
Greetings Michael.
Re: [PATCH] d80211: Reset assoc and auth retry counters
On Wed, 29 Nov 2006 15:27:06 +0100, Ivo Van Doorn wrote:
> Shouldn't this last one be:
> ieee80211_set_disassoc(dev, ifsta, 0)
>
> This one is called from the IOCTL request to disassociate,
> so the interface should still be authenticated (with a valid
> auth retry counter).

Yes, of course. Thanks for being watchful :-)

 Jiri

-- 
Jiri Benc
SUSE Labs
Re: [PATCH] d80211: Reset assoc and auth retry counters
On 11/29/06, Jiri Benc <[EMAIL PROTECTED]> wrote:
> On Tue, 28 Nov 2006 20:56:05 +0100, Ivo van Doorn wrote:
> > After a successful authentication and association the matching retry counter
> > must be reset to 0.
> > Failure to do so will result in failure to authenticate after the interface
> > has been deauthenticated. This does not always happen after the first
> > deauthentication, but after the interface has been deauthenticated
> > several times it will refuse to authenticate.
>
> Thanks for spotting this, but your fix makes statistics about
> authentication/association exported via sysfs useless. The counters should
> be reset before a new authentication/association attempt (as is done in
> ieee80211_sta_new_auth).

Sounds good to me, I was unsure where those counters should be reset anyway. :)

> I think this is a more correct fix:
>
> @@ -2858,7 +2866,7 @@ int ieee80211_sta_deauthenticate(struct
>          return -EINVAL;
>      ieee80211_send_deauth(dev, ifsta, reason);
> -    ieee80211_set_associated(dev, ifsta, 0);
> +    ieee80211_set_disassoc(dev, ifsta, 1);
>      return 0;
>  }
> @@ -2878,6 +2886,6 @@ int ieee80211_sta_disassociate(struct ne
>          return -1;
>      ieee80211_send_disassoc(dev, ifsta, reason);
> -    ieee80211_set_associated(dev, ifsta, 0);
> +    ieee80211_set_disassoc(dev, ifsta, 1);
>      return 0;
>  }

Shouldn't this last one be:
ieee80211_set_disassoc(dev, ifsta, 0)

This one is called from the IOCTL request to disassociate,
so the interface should still be authenticated (with a valid
auth retry counter).

Ivo
Re: [PATCH] [IPVS] transparent proxying
* Horms <[EMAIL PROTECTED]> 2006-11-29 15:21
> This seems to be a pretty clean solution to a real problem.
>
> Ultimately I would like to see IPVS move into the forward chain.
> This seems to be a nice way to explore that, without breaking
> any existing setups.
>
> --
> Horms
> H: http://www.vergenet.net/~horms/
> W: http://www.valinux.co.jp/en/
>
> [IPVS] transparent proxying
>
> Patch from Jinhua Luo <[EMAIL PROTECTED]> to allow a web cluster using
> transparent proxying. It works by simply grabbing packets that have the
> fwmark set and have not already been processed by ipvs (ip_vs_out) and
> throwing them into ip_vs_in.
>
> See:
> http://archive.linuxvirtualserver.org/html/lvs-users/2006-11/msg00261.html
>
> Normally LVS packets are processed by ip_vs_in from the INPUT chain,
> and packets that are processed in this way never show up on the FORWARD
> chain, so they won't hit this rule.
>
> This patch seems like a good precursor to moving LVS permanently to
> the FORWARD chain, as I'm struggling to think how it could break things.
>
> The changes to the original patch are:
>
> * Reformatted to use tabs for indentation (instead of 4 spaces)
> * Reformatted to be < 80 columns wide
> * Added some comments
> * Rewrote description (this text)
>
> Signed-off-by: Simon Horman <[EMAIL PROTECTED]>
> Signed-off-by: Jinhua Luo <[EMAIL PROTECTED]>
>
> Index: linux-2.6/net/ipv4/ipvs/ip_vs_core.c
> ===
> --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_core.c 2006-11-28 15:30:00.0 +0900
> +++ linux-2.6/net/ipv4/ipvs/ip_vs_core.c 2006-11-29 10:27:49.0 +0900
> @@ -23,7 +23,9 @@
>   * Changes:
>   *     Paul `Rusty' Russell    properly handle non-linear skbs
>   *     Harald Welte            don't use nfcache
> - *
> + *     Jinhua Luo              redirect packets with fwmark on
> + *                             NF_IP_FORWARD chain to ip_vs_in(),
> + *                             mainly for transparent cache cluster
>   */
>
>  #include
> @@ -1070,6 +1072,26 @@
>      return ip_vs_in_icmp(pskb, &r, hooknum);
>  }
>
> +/*
> + * This is hooked into the NF_IP_FORWARD. It catches
> + * packets that have not already been handled by ipvs (out)
> + * and have a fwmark set. This is to allow transparent proxying
> + * of fwmark virtual services.
> + *
> + * It will not process packets that are handled by ipvs (in)
> + * as they never traverse the NF_IP_FORWARD.
> + */
> +static unsigned int
> +ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
> +                          const struct net_device *in,
> +                          const struct net_device *out,
> +                          int (*okfn)(struct sk_buff *))
> +{
> +    if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
> +        return NF_ACCEPT;

This patch seems to be based on an old tree, I've renamed nfmark
to mark in net-2.6.20. The term fwmark and nfmark shouldn't be
used anymore.
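The early-accept test in the quoted hook can be looked at on its own. Below is a minimal userspace sketch, not the kernel code: `demo_pkt`, `demo_verdict` and `demo_forward_with_mark` are hypothetical names, the field is called `mark` following the rename mentioned above, and the second branch (handing the packet to ip_vs_in) is taken from the patch description because the quoted diff stops after the first `return`.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the kernel's sk_buff and NF_* verdicts are not used. */
enum demo_verdict { DEMO_ACCEPT, DEMO_TO_IP_VS_IN };

struct demo_pkt {
	bool ipvs_property;	/* packet was already handled by ipvs */
	unsigned int mark;	/* firewall mark, 0 when unset */
};

/*
 * Mirrors the condition in the quoted hook: packets that ipvs has already
 * seen, or that carry no mark, pass through untouched. Everything else is
 * a candidate for ip_vs_in (assumed here from the patch description).
 */
static enum demo_verdict demo_forward_with_mark(const struct demo_pkt *pkt)
{
	if (pkt->ipvs_property || !pkt->mark)
		return DEMO_ACCEPT;
	return DEMO_TO_IP_VS_IN;
}
```

Only a marked packet that ipvs has not yet processed leaves the normal FORWARD path, which is why existing INPUT-chain setups should not be affected.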
Re: [PATCH] d80211: Reset assoc and auth retry counters
On Tue, 28 Nov 2006 20:56:05 +0100, Ivo van Doorn wrote:
> After a successful authentication and association the matching retry counter
> must be reset to 0.
> Failure to do so will result in failure to authenticate after the interface
> has been deauthenticated. This does not always happen after the first
> deauthentication, but after the interface has been deauthenticated
> several times it will refuse to authenticate.

Thanks for spotting this, but your fix makes statistics about
authentication/association exported via sysfs useless. The counters should
be reset before a new authentication/association attempt (as is done in
ieee80211_sta_new_auth).

I think this is a more correct fix:

Signed-off-by: Jiri Benc <[EMAIL PROTECTED]>

---
 net/d80211/ieee80211_sta.c | 18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

--- dscape.orig/net/d80211/ieee80211_sta.c
+++ dscape/net/d80211/ieee80211_sta.c
@@ -382,6 +382,14 @@ static void ieee80211_set_associated(str
 	ifsta->last_probe = jiffies;
 }
+static void ieee80211_set_disassoc(struct net_device *dev,
+				   struct ieee80211_if_sta *ifsta, int deauth)
+{
+	if (deauth)
+		ifsta->auth_tries = 0;
+	ifsta->assoc_tries = 0;
+	ieee80211_set_associated(dev, ifsta, 0);
+}
 static void ieee80211_sta_tx(struct net_device *dev, struct sk_buff *skb,
 			     int encrypt, int probe_resp)
@@ -1023,7 +1031,7 @@ static void ieee80211_rx_mgmt_deauth(str
 			  IEEE80211_RETRY_AUTH_INTERVAL);
 	}
-	ieee80211_set_associated(dev, ifsta, 0);
+	ieee80211_set_disassoc(dev, ifsta, 1);
 	ifsta->authenticated = 0;
 }
@@ -1066,7 +1074,7 @@ static void ieee80211_rx_mgmt_disassoc(s
 			  IEEE80211_RETRY_AUTH_INTERVAL);
 	}
-	ieee80211_set_associated(dev, ifsta, 0);
+	ieee80211_set_disassoc(dev, ifsta, 0);
 }
@@ -1882,7 +1890,7 @@ void ieee80211_sta_work(void *ptr)
 			       "mixed-cell disabled - disassociate\n", dev->name);
 			ieee80211_send_disassoc(dev, ifsta, WLAN_REASON_UNSPECIFIED);
-			ieee80211_set_associated(dev, ifsta, 0);
+			ieee80211_set_disassoc(dev, ifsta, 0);
 		}
 	}
@@ -2858,7 +2866,7 @@ int ieee80211_sta_deauthenticate(struct
 		return -EINVAL;
 	ieee80211_send_deauth(dev, ifsta, reason);
-	ieee80211_set_associated(dev, ifsta, 0);
+	ieee80211_set_disassoc(dev, ifsta, 1);
 	return 0;
 }
@@ -2878,6 +2886,6 @@ int ieee80211_sta_disassociate(struct ne
 		return -1;
 	ieee80211_send_disassoc(dev, ifsta, reason);
-	ieee80211_set_associated(dev, ifsta, 0);
+	ieee80211_set_disassoc(dev, ifsta, 1);
 	return 0;
 }

-- 
Jiri Benc
SUSE Labs
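The effect of the `deauth` argument is easy to miss in the diff, so here is a small userspace model of the new helper. It is only a sketch: `demo_sta` and `demo_set_disassoc` are hypothetical names, the struct keeps just the three fields the helper touches, and the call into ieee80211_set_associated() is reduced to clearing a flag.

```c
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for struct ieee80211_if_sta. */
struct demo_sta {
	int auth_tries;		/* authentication retry counter */
	int assoc_tries;	/* association retry counter */
	bool associated;
};

/*
 * Models the helper from the patch: the association state and its retry
 * counter are always cleared, while the authentication retry counter is
 * cleared only when the station is also being deauthenticated.
 */
static void demo_set_disassoc(struct demo_sta *sta, bool deauth)
{
	if (deauth)
		sta->auth_tries = 0;
	sta->assoc_tries = 0;
	sta->associated = false;
}
```

Calling it with `deauth` false models a plain disassociation, where the authentication retry counter is left as it was; passing true models a deauthentication and clears both counters.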
Re: [Madwifi-devel] ar5k and Atheros AR5005G
On Wednesday 29 November 2006 14:55, Nick Kossifidis wrote:
> I 've already ported ar5k to linux and it works with madwifi versions

No, you misunderstood me.
Madwifi is not a native driver and will never be accepted
into mainline. My attempt is to write a native d80211 driver
based on the ar5k sources.
Currently I don't have too much time, so it hasn't progressed very far,
but from next week on I have vacation from work, so I think
I can work on this again.

-- 
Greetings Michael.
[DECNet] fib: Fix out of bound access of fib_props[]
Fixes a typo which caused fib_props[] to have the wrong size
and makes sure the value used to index the array which is
provided by userspace via netlink is checked to avoid out of
bound access.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6/net/decnet/dn_fib.c
===
--- net-2.6.orig/net/decnet/dn_fib.c	2006-11-29 13:35:51.0 +0100
+++ net-2.6/net/decnet/dn_fib.c	2006-11-29 13:36:17.0 +0100
@@ -63,7 +63,7 @@
 {
 	int error;
 	u8 scope;
-} dn_fib_props[RTA_MAX+1] = {
+} dn_fib_props[RTN_MAX+1] = {
 	[RTN_UNSPEC] = { .error = 0, .scope = RT_SCOPE_NOWHERE },
 	[RTN_UNICAST] = { .error = 0, .scope = RT_SCOPE_UNIVERSE },
 	[RTN_LOCAL] = { .error = 0, .scope = RT_SCOPE_HOST },
@@ -276,6 +276,9 @@
 	struct dn_fib_info *ofi;
 	int nhs = 1;
+	if (r->rtm_type > RTN_MAX)
+		goto err_inval;
+
 	if (dn_fib_props[r->rtm_type].scope > r->rtm_scope)
 		goto err_inval;
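The two halves of this fix (size the table by the enum that indexes it, and range-check the index before use) can be shown in a small userspace sketch. The names below (`demo_rtn`, `demo_props`, `demo_lookup`) are hypothetical stand-ins and not the kernel's RTN_* or dn_fib_props definitions; only the sizing and the check mirror the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the route types that index the table. */
enum demo_rtn {
	DEMO_RTN_UNSPEC,
	DEMO_RTN_UNICAST,
	DEMO_RTN_LOCAL,
	DEMO_RTN_COUNT
};
#define DEMO_RTN_MAX (DEMO_RTN_COUNT - 1)

struct demo_prop {
	int error;
	unsigned char scope;
};

/* Sized by the largest valid index plus one, like the corrected table. */
static const struct demo_prop demo_props[DEMO_RTN_MAX + 1] = {
	[DEMO_RTN_UNSPEC]  = { .error = 0, .scope = 255 },
	[DEMO_RTN_UNICAST] = { .error = 0, .scope = 0 },
	[DEMO_RTN_LOCAL]   = { .error = 0, .scope = 254 },
};

/*
 * The type comes from an untrusted caller (userspace via netlink in the
 * patch), so it is checked against the table bound before indexing.
 * Returns NULL for an out-of-range type.
 */
static const struct demo_prop *demo_lookup(unsigned int type)
{
	if (type > (unsigned int)DEMO_RTN_MAX)
		return NULL;
	return &demo_props[type];
}
```

Sizing the array with a constant from a different enum, as the original RTA_MAX typo did, leaves the bound unrelated to the values actually used as indices, so the explicit check is what makes the lookup safe.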