Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> > furthermore, the tweak allows the shifting of processing from a 
> > prioritized process context into a highest-priority softirq context. 
> > (it's not proven that there is any significant /net win/ of 
> > performance: all that was proven is that if we shift TCP processing 
> > from process context into softirq context then TCP throughput of 
> > that otherwise penalized process context increases.)
> 
> If we preempt with any packets in the backlog, we send no ACKs and the 
> sender cannot send thus the pipe empties.  That's the problem, this 
> has nothing to do with scheduler priorities or stuff like that IMHO. 
> The argument goes that if the reschedule is delayed long enough, the 
> ACKs will exceed the round trip time and trigger retransmits which 
> will absolutely kill performance.

yes, but i disagree a bit about the characterisation of the problem. The 
question in my opinion is: how is TCP processing prioritized for this 
particular socket, which is attached to the process context which was 
preempted.

normally, quite a bit of TCP processing happens in a softirq 
context (in fact most of it happens there), and softirq contexts have no 
fairness whatsoever - they preempt whatever processing is going on, 
regardless of any priority preferences of the user!

what was observed here were the effects of completely throttling TCP 
processing for a given socket. I think such throttling can in fact be 
desirable: there is a /reason/ why the process context was preempted: in 
that load scenario there was 10 times more processing requested from the 
CPU than it can possibly service. It's a serious overload situation and 
it's the scheduler's task to prioritize between workloads!

normally this kind of "throttling" of the TCP stack for this particular 
socket does not happen. Note that there's no performance lost: we don't 
do TCP processing because there are /9 other tasks for this CPU to run/, 
and the scheduler has a tough choice.

Now i agree that there are more intelligent ways to throttle and less 
intelligent ways to throttle, but the notion to allow a given workload 
'steal' CPU time from other workloads by allowing it to push its 
processing into a softirq is i think unfair. (and this issue is 
partially addressed by my softirq threading patches in -rt :-)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-29 Thread Alexey Dobriyan

On 11/30/06, David Miller <[EMAIL PROTECTED]> wrote:
> From: Alexey Dobriyan <[EMAIL PROTECTED]>
> Date: Wed, 29 Nov 2006 23:04:37 +0300
>
> > Looks like worker thread strategically clears it if scheduled at wrong
> > moment.
> >
> > --- a/net/core/pktgen.c
> > +++ b/net/core/pktgen.c
> > @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
> >
> >  	init_waitqueue_head(&t->queue);
> >
> > -	t->control &= ~(T_TERMINATE);
> >  	t->control &= ~(T_RUN);
> >  	t->control &= ~(T_STOP);
> >  	t->control &= ~(T_REMDEVALL);
>
> Good catch Alexey.  Did you rerun the load/unload test with
> this fix applied?  If it fixes things, I'll merge it.


Well, yes, it fixes things, except that using Ctrl+C to get out of the
modprobe/rmmod loop will spit out a backtrace again. The same goes for
the other flags, T_RUN and T_STOP: clearing them is not needed because
of kzalloc(), and it creates bugs as demonstrated.
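
The race described above can be sketched in user space. This is only an illustration under stated assumptions: struct fake_thread, the helper names and the flag values are hypothetical stand-ins (the real definitions live in net/core/pktgen.c), and calloc() plays the role of kzalloc(). The "fixed" start-up drops all of the clears, which goes one step further than the one-line patch quoted above.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative flag values only; see net/core/pktgen.c for the real ones. */
#define T_TERMINATE (1u << 0)
#define T_STOP      (1u << 1)
#define T_RUN       (1u << 2)

/* Hypothetical stand-in for struct pktgen_thread. */
struct fake_thread {
	unsigned int control;
};

/* calloc() stands in for kzalloc(): the object starts out zeroed. */
static struct fake_thread *fake_thread_alloc(void)
{
	return calloc(1, sizeof(struct fake_thread));
}

/* What the controlling side does when it wants the worker to exit. */
static void request_terminate(struct fake_thread *t)
{
	t->control |= T_TERMINATE;
}

/* Old worker start-up: clears flags that may already have been set. */
static void worker_init_old(struct fake_thread *t)
{
	t->control &= ~T_TERMINATE;
	t->control &= ~T_RUN;
	t->control &= ~T_STOP;
}

/* Start-up without the clears: relies on the zeroed allocation. */
static void worker_init_fixed(struct fake_thread *t)
{
	(void)t;
}

/* Returns nonzero if a termination request survives worker start-up. */
static int terminate_survives(void (*init)(struct fake_thread *))
{
	struct fake_thread *t = fake_thread_alloc();
	int seen;

	assert(t != NULL);
	assert(t->control == 0);  /* nothing to clear after a zeroing alloc */
	request_terminate(t);     /* the request arrives first ... */
	init(t);                  /* ... and the worker is scheduled late */
	seen = (t->control & T_TERMINATE) != 0;
	free(t);
	return seen;
}
```

With worker_init_old the request is silently lost; with worker_init_fixed it is still visible when the worker starts looking at its flags.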

Give me some time.


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Thu, 30 Nov 2006 07:47:58 +0100

> furthermore, the tweak allows the shifting of processing from a 
> prioritized process context into a highest-priority softirq context. 
> (it's not proven that there is any significant /net win/ of performance: 
> all that was proven is that if we shift TCP processing from process 
> context into softirq context then TCP throughput of that otherwise 
> penalized process context increases.)

If we preempt with any packets in the backlog, we send no ACKs and the
sender cannot send thus the pipe empties.  That's the problem, this
has nothing to do with scheduler priorities or stuff like that IMHO.
The argument goes that if the reschedule is delayed long enough, the
ACKs will exceed the round trip time and trigger retransmits which
will absolutely kill performance.

The only reason we block input packet processing while we hold this
lock is because we don't want the receive queue changing from
underneath us while we're copying data to userspace.

Furthermore once you preempt in this particular way, no input
packet processing occurs in that socket still, exacerbating the
situation.

Anyways, even if we somehow unlocked the socket and ran the backlog by
hand at preemption points, we'd have deferred the whole work of
processing whatever is in the backlog until the preemption point.  By
then we've lost our quantum already, so from a fairness perspective
it's perhaps not legal to do the deferred processing at the preemption
signalling point.

It would be different if we really did the packet processing at the
original moment (where we had to queue to the socket backlog because
it was locked, in softirq) because then we'd return from the softirq
and hit the preemption point earlier or whatever.

Therefore, perhaps the best would be to see if there is a way we can
still allow input packet processing even while running the majority of
TCP's recvmsg().  It won't be easy :)


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> This is why my suggestion is to preempt_disable() as soon as we grab 
> the socket lock, [...]

independently of the issue at hand, in general the explicit use of 
preempt_disable() in non-infrastructure code is quite a heavy tool. Its 
effects are heavy and global: it disables /all/ preemption (even on 
PREEMPT_RT). Furthermore, when preempt_disable() is used for per-CPU 
data structures then [unlike, for example, with a spin-lock] the connection 
between the 'data' and the 'lock' is not explicit - causing all kinds of 
grief when trying to convert such code to a different preemption model. 
(such as PREEMPT_RT :-)

So my plan is to remove all "open-coded" use of preempt_disable() [and 
raw use of local_irq_save/restore] from the kernel and replace it with 
some facility that connects data and lock. (Note that this will not 
result in any actual changes on the instruction level because internally 
every such facility still maps to preempt_disable() on non-PREEMPT_RT 
kernels, so on non-PREEMPT_RT kernels such code will still be the same 
as before.)

Ingo


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> > yeah, i like this one. If the problem is "too long locked section", 
> > then the most natural solution is to "break up the lock", not to 
> > "boost the priority of the lock-holding task" (which is what the 
> > proposed patch does).
> 
> Ingo you've misread the problem :-)

yeah, the problem isn't "too long locked section" but "too much time spent 
holding a lock" and hence opening up ourselves to possible negative 
side-effects of the scheduler's fairness algorithm when it forces a 
preemption of that process context with that lock held (and forcing all 
subsequent packets to be backlogged).

but please read my last mail - i think i'm slowly starting to wake up 
;-) I dont think there is any real problem: a tweak to the scheduler 
that in essence gives TCP-using tasks a preference changes the balance 
of workloads. Such an explicit tweak is possible already.

furthermore, the tweak allows the shifting of processing from a 
prioritized process context into a highest-priority softirq context. 
(it's not proven that there is any significant /net win/ of performance: 
all that was proven is that if we shift TCP processing from process 
context into softirq context then TCP throughput of that otherwise 
penalized process context increases.)

Ingo


Re: Bug 7596 - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > Attached is the detailed description of the problem and one possible 
> > solution.
> 
> Thanks.  The attachment will be too large for the mailing-list servers 
> so I uploaded a copy to 
> http://userweb.kernel.org/~akpm/Linux-TCP-Bottleneck-Analysis-Report.pdf
> 
> From a quick peek it appears that you're getting around 10% 
> improvement in TCP throughput, best case.

Wenji, have you tried to renice the receiving task (to say nice -20) and 
seen how much TCP throughput you get in "background load of 10.0"? 
(similarly, you could also renice the background load tasks to nice +19 
and/or set their scheduling policy to SCHED_BATCH)

as far as i can see, the numbers in the paper and the patch prove the 
following two points:

 - a task doing TCP receive with 10 other tasks running on the CPU will
   see lower TCP throughput than if it had the CPU for itself alone.

 - a patch that tweaks the scheduler to give the receiving task more
   timeslices (i.e. raises its priority in essence) results in ...
   more timeslices, which results in higher receive numbers ...

so the most important thing to check would be, before any scheduler and 
TCP code change is considered: if you give the task higher priority 
/explicitly/, via nice -20, do the numbers improve? Similarly, if all 
the other "background load" tasks are reniced to nice +19 (or their 
policy is set to SCHED_BATCH), do you get a similar improvement?
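
The explicit-priority experiment suggested above can also be done from inside a test program. The following is a minimal sketch, assuming Linux with glibc; the helper names are made up for illustration, while setpriority(), getpriority() and sched_setscheduler() are the standard calls.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Fallback in case the headers were pulled in before _GNU_SOURCE was set;
 * SCHED_BATCH is policy number 3 on Linux. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif

/* Renice the calling process; returns 0 on success, -1 with errno set. */
static int renice_self(int nice_value)
{
	return setpriority(PRIO_PROCESS, 0, nice_value);
}

/* Read back the nice value. getpriority() can legitimately return -1,
 * so errno has to be checked to tell an error from a real value. */
static int current_nice(int *out)
{
	int v;

	errno = 0;
	v = getpriority(PRIO_PROCESS, 0);
	if (v == -1 && errno != 0)
		return -1;
	*out = v;
	return 0;
}

/* Put the calling process under SCHED_BATCH (for the background load). */
static int make_self_batch(void)
{
	struct sched_param sp = { .sched_priority = 0 };

	return sched_setscheduler(0, SCHED_BATCH, &sp);
}
```

A background-load task would call renice_self(19) and possibly make_self_batch(); the receiver would call renice_self(-20), which normally needs root or CAP_SYS_NICE. From the shell, nice(1) and renice(1) do the same job.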

Ingo


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread David Miller
From: Ingo Molnar <[EMAIL PROTECTED]>
Date: Thu, 30 Nov 2006 07:17:58 +0100

> 
> * David Miller <[EMAIL PROTECTED]> wrote:
> 
> > We can make explicit preemption checks in the main loop of 
> > tcp_recvmsg(), and release the socket and run the backlog if 
> > need_resched() is TRUE.
> > 
> > This is the simplest and most elegant solution to this problem.
> 
> yeah, i like this one. If the problem is "too long locked section", then
> the most natural solution is to "break up the lock", not to "boost the 
> priority of the lock-holding task" (which is what the proposed patch 
> does).

Ingo you've misread the problem :-)

The issue is that we actually don't hold any locks that prevent
preemption, so we can take preemption points which the TCP code
wasn't designed with in-mind.

Normally, we control the sleep point very carefully in the TCP
sendmsg/recvmsg code, such that when we sleep we drop the socket
lock and process the backlog packets that accumulated while the
socket was locked.

With pre-emption we can't control that properly.

The problem is that we really do need to run the backlog any time
we give up the cpu in the sendmsg/recvmsg path, or things get real
erratic.  ACKs don't go out as early as we'd like them to, etc.

It isn't easy to do generically, perhaps, because we can only
drop the socket lock at certain points and we need to do that to
run the backlog.

This is why my suggestion is to preempt_disable() as soon as we
grab the socket lock, and explicitly test need_resched() at places
where it is absolutely safe, like this:

if (need_resched()) {
/* Run packet backlog... */
release_sock(sk);
schedule();
lock_sock(sk);
}

The socket lock is just a by-hand binary semaphore, so it doesn't
block pre-emption.  We have to be able to sleep while holding it.
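
To make the backlog behaviour concrete, here is a toy user-space model of it. It is a sketch under loud assumptions: struct toy_sock and every toy_* function are hypothetical simplifications, not the kernel's data structures, and "processing" a segment is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

#define BACKLOG_MAX 64

/* Hypothetical, heavily simplified stand-in for struct sock. */
struct toy_sock {
	int owned;                 /* 1 while process context owns the socket */
	int backlog[BACKLOG_MAX];  /* segments queued while it is owned */
	size_t backlog_len;
	size_t processed;          /* segments fully processed (ACK-able) */
};

static void toy_lock_sock(struct toy_sock *sk)
{
	sk->owned = 1;
}

/* Modelled on release_sock(): run the backlog, then drop ownership. */
static void toy_release_sock(struct toy_sock *sk)
{
	sk->processed += sk->backlog_len;
	sk->backlog_len = 0;
	sk->owned = 0;
}

/* "softirq" receive path: process immediately, or queue the segment if
 * process context owns the socket. Returns -1 if the backlog is full. */
static int toy_rcv(struct toy_sock *sk, int seg)
{
	if (!sk->owned) {
		sk->processed++;
		return 0;
	}
	if (sk->backlog_len == BACKLOG_MAX)
		return -1;
	sk->backlog[sk->backlog_len++] = seg;
	return 0;
}

/* A safe point in the toy receive loop, following the snippet above. */
static void toy_resched_point(struct toy_sock *sk, int need_resched)
{
	if (need_resched) {
		toy_release_sock(sk);  /* runs the packet backlog */
		/* schedule() would go here in the kernel */
		toy_lock_sock(sk);
	}
}
```

While the socket is owned, every arriving segment sits in the backlog and nothing is ACKed. Only the release at the safe point drains it, which is why being preempted anywhere else stalls the sender.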


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Ingo Molnar

* Wenji Wu <[EMAIL PROTECTED]> wrote:

> > That yield() will need to be removed - yield()'s behaviour is truly 
> > awful if the system is otherwise busy.  What is it there for?
> 
> Please read the uploaded paper, which has detailed description.

do you have any URL for that?

Ingo


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Ingo Molnar

* David Miller <[EMAIL PROTECTED]> wrote:

> We can make explicit preemption checks in the main loop of 
> tcp_recvmsg(), and release the socket and run the backlog if 
> need_resched() is TRUE.
> 
> This is the simplest and most elegant solution to this problem.

yeah, i like this one. If the problem is "too long locked section", then
the most natural solution is to "break up the lock", not to "boost the 
priority of the lock-holding task" (which is what the proposed patch 
does).

[ Also note that "sprinkle the code with preempt_disable()" kind of
  solutions, besides hurting interactivity, are also a pain to resolve 
  in something like PREEMPT_RT. (unlike say a spinlock, 
  preempt_disable() is quite opaque in what data structure it protects, 
  etc., making it hard to convert it to a preemptible primitive) ]

> The one suggested in your patch and paper are way overkill, there is 
> no reason to solve a TCP specific problem inside of the generic 
> scheduler.

agreed.

What we could also add is a /reverse/ mechanism to the scheduler: a task 
could query whether it has just a small amount of time left in its 
timeslice, and could in that case voluntarily drop its current lock and 
yield, and thus give up its current timeslice and wait for a new, full 
timeslice, instead of being forcibly preempted due to lack of timeslices 
with a possibly critical lock still held.
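
The /reverse/ mechanism can be played through in a small simulation. Everything below is hypothetical: no such scheduler query is claimed to exist, and the struct, the tick accounting and the function names are invented purely to illustrate the idea of yielding before a critical section rather than being preempted inside it.

```c
#include <assert.h>

#define FULL_SLICE 100  /* ticks per timeslice, an arbitrary toy value */

/* Hypothetical toy task; not a model of struct task_struct. */
struct toy_task {
	int slice_left;        /* ticks remaining in the current timeslice */
	int lock_held;
	int preempted_locked;  /* times the slice ran out with the lock held */
	int voluntary_yields;
};

/* The hypothetical query: "is my timeslice nearly over?" */
static int toy_slice_nearly_over(const struct toy_task *t, int threshold)
{
	return t->slice_left < threshold;
}

/* Burn CPU ticks. When the slice expires the task is "preempted" and
 * later handed a fresh full slice. */
static void toy_run(struct toy_task *t, int ticks)
{
	while (ticks-- > 0) {
		if (--t->slice_left == 0) {
			if (t->lock_held)
				t->preempted_locked++;
			t->slice_left = FULL_SLICE;
		}
	}
}

/* Run 'chunks' critical sections of 'cost' ticks each. In cooperative
 * mode the task checks its slice before taking the lock and yields if
 * the section would not fit. */
static void toy_workload(struct toy_task *t, int chunks, int cost,
			 int cooperative)
{
	int i;

	for (i = 0; i < chunks; i++) {
		if (cooperative && toy_slice_nearly_over(t, cost + 1)) {
			t->voluntary_yields++;
			t->slice_left = FULL_SLICE;  /* wait for a full slice */
		}
		t->lock_held = 1;
		toy_run(t, cost);
		t->lock_held = 0;
	}
}
```

In this toy model the cooperative task is never caught with the lock held, at the price of giving away the tail of some timeslices. That only works while a single critical section fits into one timeslice.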

But the suggested solution here, to "prolong the running of this task 
just a little bit longer" only starts a perpetual arms race between 
users of such a facility and other kernel subsystems. (besides not being 
adequate anyway, there can always be /so/ long lock-hold times that the 
scheduler would have no other option but to preempt the task)

Ingo


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Mike Galbraith
On Wed, 2006-11-29 at 17:08 -0800, Andrew Morton wrote:
> + if (p->backlog_flag == 0) {
> + if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> + enqueue_task(p, rq->expired);
> + if (p->static_prio < rq->best_expired_prio)
> + rq->best_expired_prio = p->static_prio;
> + } else
> + enqueue_task(p, rq->active);
> + } else {
> + if (expired_starving(rq)) {
> + enqueue_task(p,rq->expired);
> + if (p->static_prio < rq->best_expired_prio)
> + rq->best_expired_prio = p->static_prio;
> + } else {
> + if (!TASK_INTERACTIVE(p))
> + p->extrarun_flag = 1;
> + enqueue_task(p,rq->active);
> + }
> + }

(oh my, doing that to the scheduler upsets my tummy, but that aside...)

I don't see how that can really solve anything.  "Interactive" tasks
starting to use cpu heftily can still preempt and keep the special cased
cpu hog off the cpu for ages.  It also only takes one task in the
expired array to trigger the forced array switch with a fully loaded
cpu, and once any task hits the expired array, a stream of wakeups can
prevent the switch from completing for as long as you can keep wakeups
happening.

-Mike



Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Renzmann

Hi.

Michael Buesch wrote:

> > IIRC Pavel already explained that getting rid of the HAL per se should be
> > no problem - it could easily be dissolved into the driver, if that is one
> > of the requirements to be fulfilled before the driver (MadWifi or DadWifi)
> > is considered for mainline inclusion. As soon as there is source available
> > to dissolve, at least.
>
> Ok, so who actually does the work?


The MadWifi team? It won't happen today or tomorrow, but I'm confident 
that it will happen. Any contribution to that effort is highly welcome - 
the more people help, the faster the goal will be reached.



> > From what I understood the "... once the hal issue is resolved" part of
> > David's mail referred to exactly that question.
>
> Ok, I don't know what "The HAL Issue" (tm) is.


You referred to the archives where that exact "issue"(s) (binary-only, 
non-free, no sources, unwanted level of abstraction) has/have been 
discussed at length, but you claim you didn't have a clue what David was 
talking about? Come on.


Bye, Mike


Re: Linux 2.6.19

2006-11-29 Thread Phil Oester
On Wed, Nov 29, 2006 at 06:15:37PM -0800, David Miller wrote:
> In fact it does, the NDISC code is using MAX_HEADER incorrectly.  It
> needs to explicitly allocate space for the struct ipv6hdr in 'len'.
> Luckily the TCP ipv6 code was doing it right.
> 
> What a horrible bug, this patch should fix it.  Let me know
> if it doesn't, thanks:

Yes, that fixes it up for me, thanks.

Phil
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.19-rc6-mm2: uli526x only works after reload

2006-11-29 Thread Andrew Morton
On Thu, 30 Nov 2006 02:04:15 +0100
"Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> > > 
> > > git-netdev-all.patch
> > > git-netdev-all-fixup.patch
> > > libphy-dont-do-that.patch
> > 
> > Are you able to eliminate libphy-dont-do-that.patch?
> > 
> > > Is a broken-out version of git-netdev-all.patch available from somewhere?
> > 
> > Nope, and my few fumbling attempts to generate the sort of patch series
> > which you want didn't work out too well.  One has to downgrade to
> > git-bisect :(
> > 
> > What does "doesn't work" mean, btw?
> 
> Well, it turns out not to be 100% reproducible.  I can only reproduce it after
> a soft reboot (eg. shutdown -r now).
> 
> Then, while configuring network interfaces the system says the interface name
> is ethxx0, but it should be eth1 (eth0 is an RTL-8139, which is not used).
> Now if I run ifconfig, it says:
> 
> eth0: error fetching interface information: Device not found
> 
> and that's all (normally, ifconfig would show the information for lo and eth1,
> without eth0).  Moreover, 'ifconfig eth1' says:
> 
> eth1: error fetching interface information: Device not found
> 
> Next, I run 'rmmod uli526x' and 'modprobe uli526x' and then 'ifconfig' is
> still saying the above (about eth0), but 'ifconfig eth1' seems to work as
> it should.  However, the interface often fails to transfer anything after
> that.

Lovely.  Sounds like some startup race, perhaps against userspace.

Is CONFIG_PCI_MULTITHREAD_PROBE set?  (err, we meant to disable that for
2.6.19 but forgot).



Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread David Miller
From: Wenji Wu <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 19:56:58 -0600

> >We could also pepper tcp_recvmsg() with some very carefully placed
> >preemption disable/enable calls to deal with this even with
> >CONFIG_PREEMPT enabled.
>
> I also thought about this approach. But since the "problem" happens in
> the 2.6 Desktop and Low-latency Desktop configurations (not Server),
> where system responsiveness is a key feature, simply placing
> preemption disable/enable calls might not work.  If you want to place
> preemption disable/enable calls within tcp_recvmsg(), you have to put
> them at the very beginning and end of the call. Disabling preemption
> would degrade system responsiveness.

We can make explicit preemption checks in the main loop of
tcp_recvmsg(), and release the socket and run the backlog if
need_resched() is TRUE.

This is the simplest and most elegant solution to this problem.

The one suggested in your patch and paper are way overkill, there is
no reason to solve a TCP specific problem inside of the generic
scheduler.


Re: Linux 2.6.19

2006-11-29 Thread David Miller
From: Phil Oester <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 17:49:04 -0800

> Getting an oops on boot here, caused by commit
> e81c73596704793e73e6dbb478f41686f15a4b34 titled
> "[NET]: Fix MAX_HEADER setting".
> 
> Reverting that patch fixes things up for me.  Dave?

I suspect that it might be because I removed the IPV6
ifdef from the list,  but I can't imagine why that would
matter other than due to a bug in the IPV6 stack

Indeed.

Looking at ndisc_send_rs() I wonder if it miscalculates
'len' or similar and the old MAX_HEADER setting was
merely papering around this bug

In fact it does, the NDISC code is using MAX_HEADER incorrectly.  It
needs to explicitly allocate space for the struct ipv6hdr in 'len'.
Luckily the TCP ipv6 code was doing it right.
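
To put numbers on the missing space, here is a small user-space sketch. It assumes a Linux/glibc system and uses the user-space struct names (struct ip6_hdr and struct icmp6_hdr from <netinet/ip6.h> and <netinet/icmp6.h>) in place of the kernel's struct ipv6hdr and struct icmp6hdr. The helper names are made up, and link-layer address options are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/ip6.h>
#include <netinet/icmp6.h>

/* NS/NA length without the IPv6 header: ICMPv6 header plus the
 * target address (options omitted). */
static size_t ndisc_len_without_ip6(void)
{
	return sizeof(struct icmp6_hdr) + sizeof(struct in6_addr);
}

/* Same message with the fixed 40-byte IPv6 header accounted for. */
static size_t ndisc_len_with_ip6(void)
{
	return sizeof(struct ip6_hdr) + ndisc_len_without_ip6();
}
```

With the usual sizes (40-byte IPv6 header, 8-byte ICMPv6 header, 16-byte address) this is 24 bytes versus 64 bytes, so an allocation sized from the smaller figure comes up 40 bytes short unless something else happens to leave that much headroom.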

What a horrible bug, this patch should fix it.  Let me know
if it doesn't, thanks:

commit c28728decc37fe52c8cdf48b3e0c0cf9b0c2fefb
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Wed Nov 29 18:14:47 2006 -0800

[IPV6] NDISC: Calculate packet length correctly for allocation.

MAX_HEADER does not include the ipv6 header length in it,
so we need to add it in explicitly.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 73eb8c3..c42d4c2 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -441,7 +441,8 @@ static void ndisc_send_na(struct net_dev
 	struct sk_buff *skb;
 	int err;
 
-	len = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
+	len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr) +
+		sizeof(struct in6_addr);
 
 	/* for anycast or proxy, solicited_addr != src_addr */
 	ifp = ipv6_get_ifaddr(solicited_addr, dev, 1);
@@ -556,7 +557,8 @@ void ndisc_send_ns(struct net_device *de
 	if (err < 0)
 		return;
 
-	len = sizeof(struct icmp6hdr) + sizeof(struct in6_addr);
+	len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr) +
+		sizeof(struct in6_addr);
 	send_llinfo = dev->addr_len && !ipv6_addr_any(saddr);
 	if (send_llinfo)
 		len += ndisc_opt_addr_space(dev);
@@ -632,7 +634,7 @@ void ndisc_send_rs(struct net_device *de
 	if (err < 0)
 		return;
 
-	len = sizeof(struct icmp6hdr);
+	len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr);
 	if (dev->addr_len)
 		len += ndisc_opt_addr_space(dev);
 
@@ -1381,7 +1383,8 @@ void ndisc_send_redirect(struct sk_buff 
 			 struct in6_addr *target)
 {
 	struct sock *sk = ndisc_socket->sk;
-	int len = sizeof(struct icmp6hdr) + 2 * sizeof(struct in6_addr);
+	int len = sizeof(struct ipv6hdr) + sizeof(struct icmp6hdr) +
+		2 * sizeof(struct in6_addr);
 	struct sk_buff *buff;
 	struct icmp6hdr *icmph;
 	struct in6_addr saddr_buf;


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu
> That yield() will need to be removed - yield()'s behaviour is truly 
> awful if the system is otherwise busy.  What is it there for?

Please read the uploaded paper, which has detailed description.

thanks,

wenji

- Original Message -
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wednesday, November 29, 2006 7:08 pm
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

> On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
> David Miller <[EMAIL PROTECTED]> wrote:
> 
> > 
> > Please, it is very difficult to review your work the way you have
> > submitted this patch as a set of 4 patches.  These patches have not
> > been split up "logically", but rather they have been split up "per
> > file" with the same exact changelog message in each patch posting.
> > This is very clumsy, and impossible to review, and wastes a lot of
> > mailing list bandwidth.
> > 
> > We have an excellent file, called Documentation/SubmittingPatches, in
> > the kernel source tree, which explains exactly how to do this
> > correctly.
> > 
> > By splitting your patch into 4 patches, one for each file touched,
> > it is impossible to review your patch as a logical whole.
> > 
> > Please also provide your patch inline so people can just hit reply
> > in their mail reader client to quote your patch and comment on it.
> > This is impossible with the attachments you've used.
> > 
> 
> Here you go - joined up, cleaned up, ported to mainline and
> test-compiled.
> 
> That yield() will need to be removed - yield()'s behaviour is truly
> awful if the system is otherwise busy.  What is it there for?
> 
> 
> 
> From: Wenji Wu <[EMAIL PROTECTED]>
> 
> For Linux TCP, when the network application makes a system call to move
> data from the socket's receive buffer to user space by calling
> tcp_recvmsg(), the socket will be locked.  During this period, all the
> incoming packets for the TCP socket will go to the backlog queue without
> being TCP processed.
> 
> Since Linux 2.6 can be interrupted mid-task, if the network application's
> timeslice expires and it is moved to the expired array with the socket
> locked, all the packets within the backlog queue will not be TCP
> processed till the network application resumes its execution.  If the
> system is heavily loaded, TCP can easily RTO on the sender side.
> 
> 
> 
> include/linux/sched.h |    2 ++
> kernel/fork.c         |    3 +++
> kernel/sched.c        |   24 ++--
> net/ipv4/tcp.c        |    9 +
> 4 files changed, 32 insertions(+), 6 deletions(-)
> 
> diff -puN net/ipv4/tcp.c~tcp-speedup net/ipv4/tcp.c
> --- a/net/ipv4/tcp.c~tcp-speedup
> +++ a/net/ipv4/tcp.c
> @@ -1109,6 +1109,8 @@ int tcp_recvmsg(struct kiocb *iocb, stru
>   struct task_struct *user_recv = NULL;
>   int copied_early = 0;
> 
> + current->backlog_flag = 1;
> +
>   lock_sock(sk);
> 
>   TCP_CHECK_TIMER(sk);
> @@ -1468,6 +1470,13 @@ skip_copy:
> 
>   TCP_CHECK_TIMER(sk);
>   release_sock(sk);
> +
> + current->backlog_flag = 0;
> + if (current->extrarun_flag == 1){
> + current->extrarun_flag = 0;
> + yield();
> + }
> +
>   return copied;
> 
> out:
> diff -puN include/linux/sched.h~tcp-speedup include/linux/sched.h
> --- a/include/linux/sched.h~tcp-speedup
> +++ a/include/linux/sched.h
> @@ -1023,6 +1023,8 @@ struct task_struct {
> #ifdef CONFIG_TASK_DELAY_ACCT
>   struct task_delay_info *delays;
> #endif
> + int backlog_flag;   /* packets wait in tcp backlog queue flag */
> + int extrarun_flag;  /* extra run flag for TCP performance */
> };
> 
> static inline pid_t process_group(struct task_struct *tsk)
> diff -puN kernel/sched.c~tcp-speedup kernel/sched.c
> --- a/kernel/sched.c~tcp-speedup
> +++ a/kernel/sched.c
> @@ -3099,12 +3099,24 @@ void scheduler_tick(void)
> 
>   if (!rq->expired_timestamp)
>   rq->expired_timestamp = jiffies;
> - if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> - enqueue_task(p, rq->expired);
> - if (p->static_prio < rq->best_expired_prio)
> - rq->best_expired_prio = p->static_prio;
> - } else
> - enqueue_task(p, rq->active);
> + if (p->backlog_flag == 0) {
> + if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
> + enqueue_task(p, rq->expired);
> + if (p->static_prio < rq->best_expired_prio)
> + rq->best_expired_prio = p->static_prio;
> + } else
> + enqueue_task(p, rq->active);
> + } else {
> + if (expired_starving(rq)) {
> + enqueue_task(p,rq->expired);
> + if (p->static_prio < rq->best_expired_prio)
> + rq->best_expired_prio = p-
> >static_

Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu
Yes, when CONFIG_PREEMPT is disabled, the "problem" won't happen. That is why I 
put "for 2.6 desktop, low-latency desktop" in the uploaded paper. This 
"problem" happens in the 2.6 Desktop and Low-latency Desktop.

>We could also pepper tcp_recvmsg() with some very carefully placed preemption 
>disable/enable calls to deal with this even with CONFIG_PREEMPT enabled.

I also thought about this approach. But since the "problem" happens in the 2.6 
Desktop and Low-latency Desktop configurations (not Server), where system 
responsiveness is a key feature, simply placing preemption disable/enable calls 
might not work.  If you want to place preemption disable/enable calls within 
tcp_recvmsg(), you have to put them at the very beginning and end of the call. 
Disabling preemption would degrade system responsiveness.

wenji



- Original Message -
From: David Miller <[EMAIL PROTECTED]>
Date: Wednesday, November 29, 2006 7:13 pm
Subject: Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

> From: Andrew Morton <[EMAIL PROTECTED]>
> Date: Wed, 29 Nov 2006 17:08:35 -0800
> 
> > On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
> > David Miller <[EMAIL PROTECTED]> wrote:
> > 
> > > 
> > > Please, it is very difficult to review your work the way you have
> > > submitted this patch as a set of 4 patches.  These patches have not
> > > been split up "logically", but rather they have been split up "per
> > > file" with the same exact changelog message in each patch posting.
> > > This is very clumsy, and impossible to review, and wastes a lot of
> > > mailing list bandwidth.
> > > 
> > > We have an excellent file, called Documentation/SubmittingPatches, in
> > > the kernel source tree, which explains exactly how to do this
> > > correctly.
> > > 
> > > By splitting your patch into 4 patches, one for each file touched,
> > > it is impossible to review your patch as a logical whole.
> > > 
> > > Please also provide your patch inline so people can just hit reply
> > > in their mail reader client to quote your patch and comment on it.
> > > This is impossible with the attachments you've used.
> > > 
> > 
> > Here you go - joined up, cleaned up, ported to mainline and
> > test-compiled.
> > 
> > That yield() will need to be removed - yield()'s behaviour is truly
> > awful if the system is otherwise busy.  What is it there for?
> 
> What about simply turning off CONFIG_PREEMPT to fix this "problem"?
> 
> We always properly run the backlog (by doing a release_sock()) before
> going to sleep otherwise except for the specific case of taking a page
> fault during the copy to userspace.  It is only CONFIG_PREEMPT that
> can cause this situation to occur in other circumstances as far as I
> can see.
> 
> We could also pepper tcp_recvmsg() with some very carefully placed
> preemption disable/enable calls to deal with this even with
> CONFIG_PREEMPT enabled.
> 

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH][IPSEC][3/7] inter address family ipsec tunnel

2006-11-29 Thread Kazunori MIYAZAWA
Hello,

I found a bug in my previous patch for af_key.
The patch breaks transport mode.
This is a fixed version.

Signed-off-by: Miika Komu <[EMAIL PROTECTED]>
Signed-off-by: Diego Beltrami <[EMAIL PROTECTED]>
Signed-off-by: Kazunori Miyazawa <[EMAIL PROTECTED]>

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 4e18309..0e1dbfb 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1767,11 +1767,11 @@ #endif
 
/* addresses present only in tunnel mode */
if (t->mode == XFRM_MODE_TUNNEL) {
-   switch (xp->family) {
+   struct sockaddr *sa;
+   sa = (struct sockaddr *)(rq+1);
+   switch(sa->sa_family) {
case AF_INET:
-   sin = (void*)(rq+1);
-   if (sin->sin_family != AF_INET)
-   return -EINVAL;
+   sin = (struct sockaddr_in*)sa;
t->saddr.a4 = sin->sin_addr.s_addr;
sin++;
if (sin->sin_family != AF_INET)
@@ -1780,9 +1780,7 @@ #endif
break;
 #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
case AF_INET6:
-   sin6 = (void *)(rq+1);
-   if (sin6->sin6_family != AF_INET6)
-   return -EINVAL;
+   sin6 = (struct sockaddr_in6*)sa;
memcpy(t->saddr.a6, &sin6->sin6_addr, sizeof(struct in6_addr));
sin6++;
if (sin6->sin6_family != AF_INET6)
@@ -1793,7 +1791,10 @@ #endif
default:
return -EINVAL;
}
-   }
+   t->encap_family = sa->sa_family;
+   } else
+   t->encap_family = xp->family;
+
/* No way to set this via kame pfkey */
t->aalgos = t->ealgos = t->calgos = ~0;
xp->xfrm_nr++;
@@ -1830,18 +1831,25 @@ static inline int pfkey_xfrm_policy2sec_
 
 static int pfkey_xfrm_policy2msg_size(struct xfrm_policy *xp)
 {
+   struct xfrm_tmpl *t;
int sockaddr_size = pfkey_sockaddr_size(xp->family);
-   int socklen = (xp->family == AF_INET ?
-  sizeof(struct sockaddr_in) :
-  sizeof(struct sockaddr_in6));
+   int socklen = 0;
+   int i;
+
+   for (i=0; i<xp->xfrm_nr; i++) {
+   t = xp->xfrm_vec + i;
+   socklen += (t->encap_family == AF_INET ?
+   sizeof(struct sockaddr_in) :
+   sizeof(struct sockaddr_in6));
+   }
 
return sizeof(struct sadb_msg) +
(sizeof(struct sadb_lifetime) * 3) +
(sizeof(struct sadb_address) * 2) + 
(sockaddr_size * 2) +
sizeof(struct sadb_x_policy) +
-   (xp->xfrm_nr * (sizeof(struct sadb_x_ipsecrequest) +
-   (socklen * 2))) +
+   (xp->xfrm_nr * sizeof(struct sadb_x_ipsecrequest)) +
+   (socklen * 2) +
pfkey_xfrm_policy2sec_ctx_size(xp);
 }
 
@@ -1999,7 +2007,9 @@ #endif
 
req_size = sizeof(struct sadb_x_ipsecrequest);
if (t->mode == XFRM_MODE_TUNNEL)
-   req_size += 2*socklen;
+   req_size += ((t->encap_family == AF_INET ?
+sizeof(struct sockaddr_in) :
+sizeof(struct sockaddr_in6)) * 2);
else
size -= 2*socklen;
rq = (void*)skb_put(skb, req_size);
@@ -2015,7 +2025,7 @@ #endif
rq->sadb_x_ipsecrequest_level = IPSEC_LEVEL_USE;
rq->sadb_x_ipsecrequest_reqid = t->reqid;
if (t->mode == XFRM_MODE_TUNNEL) {
-   switch (xp->family) {
+   switch (t->encap_family) {
case AF_INET:
sin = (void*)(rq+1);
sin->sin_family = AF_INET;
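
For anyone following the size accounting in pfkey_xfrm_policy2msg_size(), here is a minimal user-space sketch of the per-template idea. The struct tmpl type and the helper names are made up for illustration and are not the kernel's xfrm_tmpl or af_key code; the only thing taken from the diff is that each template contributes two addresses sized by its own encap family rather than by the policy family.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for one policy template. It carries only the
 * field this sketch needs and is not the kernel's struct xfrm_tmpl. */
struct tmpl {
	int encap_family;	/* AF_INET or AF_INET6 */
};

/* Size of one endpoint address for a template's outer address family. */
static size_t tmpl_addr_len(const struct tmpl *t)
{
	return t->encap_family == AF_INET ? sizeof(struct sockaddr_in)
					  : sizeof(struct sockaddr_in6);
}

/* Space needed for all endpoint addresses: every template carries a
 * source and a destination address of its own family. */
static size_t tmpl_addrs_total(const struct tmpl *vec, size_t n)
{
	size_t socklen = 0;
	size_t i;

	assert(vec != NULL || n == 0);
	for (i = 0; i < n; i++)
		socklen += tmpl_addr_len(&vec[i]);
	return socklen * 2;
}
```

With one IPv4 and one IPv6 template, the total is twice the sum of the two sockaddr sizes, which the old "xfrm_nr * 2 * socklen" form could not express for mixed families.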


Re: [PATCH] [IPVS] transparent proxying

2006-11-29 Thread home_king

Hi Wensong, thanks for your review.

> I see that this patch probably makes IPVS code a bit complicated and
> packet traversing less efficiently.

In my opinion, there is no need to worry about a side effect on packet
throughput. First, marked packets rarely appear in the NF_IP_FORWARD
chain; people who mark packets for network administration usually do so
on the NF_IP_LOCAL_IN or NF_IP_OUTPUT chain. Second, the new hook fn is
called after the ipvs SNAT hook fn and passes packets already handled by
the latter by simply checking the ipvs_property flag, so it will not
disturb the SNAT job. Third, the new hook fn is just a thin wrapper
around ip_vs_in(). All packets that go through NF_IP_LOCAL_IN are already
checked by ip_vs_in(), whether they relate to a virtual server or not, so
why should we mind that a comparatively small number of packets going
through NF_IP_FORWARD are checked too?

> If I remember correctly, policy-based routing can work with IPVS in
> kernel 2.2 and 2.4 for transparent cache cluster for a long time. It
> should work in kernel 2.6 too.

Indeed, policy routing can help too, but the patch provides a native way
to deploy a transparent proxy, and it does not disturb the existing
network configuration, such as policy routing settings, iptables rules,
etc.



Re: pktgen

2006-11-29 Thread David Miller
From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 23:04:37 +0300

> Looks like worker thread strategically clears it if scheduled at wrong
> moment.
> 
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
>  
>   init_waitqueue_head(&t->queue);
>  
> - t->control &= ~(T_TERMINATE);
>   t->control &= ~(T_RUN);
>   t->control &= ~(T_STOP);
>   t->control &= ~(T_REMDEVALL);

Good catch Alexey.  Did you rerun the load/unload test with
this fix applied?  If it fixes things, I'll merge it.

Thanks!
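
For readers who want to see why clearing T_TERMINATE at worker start-up can lose a request, here is a small deterministic replay in user-space C. The flag values and function names are illustrative assumptions (the real T_* constants live in net/core/pktgen.c and may differ), and the "race" is modelled as a fixed ordering rather than real threads.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; not copied from pktgen.c. */
#define T_TERMINATE	(1u << 0)
#define T_STOP		(1u << 1)
#define T_RUN		(1u << 2)
#define T_REMDEVALL	(1u << 3)

/* Worker start-up as in the old code: every control bit is wiped,
 * including a terminate request that arrived before the worker ran. */
static unsigned int worker_init_old(unsigned int control)
{
	control &= ~T_TERMINATE;
	control &= ~T_RUN;
	control &= ~T_STOP;
	control &= ~T_REMDEVALL;
	return control;
}

/* Worker start-up with the one-line fix: T_TERMINATE is left alone. */
static unsigned int worker_init_fixed(unsigned int control)
{
	control &= ~T_RUN;
	control &= ~T_STOP;
	control &= ~T_REMDEVALL;
	return control;
}

/* Replays the bad ordering: the unload path requests termination
 * first, then the worker gets scheduled and runs its start-up code.
 * Returns 1 if the terminate request is still visible afterwards. */
static int terminate_survives(unsigned int (*init)(unsigned int))
{
	unsigned int control = 0;

	assert(init != NULL);
	control |= T_TERMINATE;		/* module unload asks worker to exit */
	control = init(control);	/* worker scheduled "at the wrong moment" */
	return (control & T_TERMINATE) != 0;
}
```

terminate_survives(worker_init_old) returns 0, so the worker would never see the request, while the fixed variant returns 1.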


Re: [NET_SCHED 07/06]: Fix endless loops (part 5): netem/tbf/hfsc ->requeue failures

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 16:01:03 +0100

> I forgot to fix one (AFAICT purely theoretical) case ..

Also applied to net-2.6.20, thanks a lot Patrick.


Re: [NET_SCHED 05/06]: Fix endless loops (part 3): HFSC

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:46 +0100 (MET)

> [NET_SCHED]: Fix endless loops (part 3): HFSC
> 
> Convert HFSC to use qdisc_tree_decrease_len() and add a callback
> for deactivating a class when its child queue becomes empty.
> 
> All queue purging goes through hfsc_purge_queue(), which is used in
> three cases: grafting, class creation (when a leaf class is turned
> into an intermediate class by attaching a new class) and class
> deletion. In all cases qdisc_tree_decrease_len() is needed.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20, thanks.


Re: [NET_SCHED 03/06]: Fix endless loops caused by inaccurate qlen counters (part 1)

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:41 +0100 (MET)

> [NET_SCHED]: Fix endless loops caused by inaccurate qlen counters (part 1)
> 
> There are multiple problems related to qlen adjustment that can lead
> to an upper qdisc getting out of sync with the real number of packets
> queued, leading to endless dequeueing attempts by the upper layer code.
> 
> All qdiscs must maintain an accurate q.qlen counter. There are basically
> two groups of operations affecting the qlen: operations that propagate
> down the tree (enqueue, dequeue, requeue, drop, reset) beginning at the
> root qdisc and operations only affecting a subtree or single qdisc
> (change, graft, delete class). Since qlen changes during operations from
> the second group don't propagate to ancestor qdiscs, their qlen values
> become desynchronized.
> 
> This patch adds a function to propagate qlen changes up the qdisc tree,
> optionally calling a callback function to perform qdisc-internal
> maintenance when the child qdisc becomes empty. The follow-up patches
> will convert all qdiscs to use this function where necessary.
> 
> Noticed by Timo Steinbach <[EMAIL PROTECTED]>.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20, thanks a lot.
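
As a rough illustration of the propagation described in the changelog, the sketch below walks parent pointers and subtracts the removed packets from every ancestor. It is a simplified user-space model under my own assumptions: struct node, child_empty and the direct parent pointer are hypothetical, whereas the kernel walks up via parent classids and class ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified qdisc node: a parent pointer, a packet
 * count, and an optional hook run when one of its children empties. */
struct node {
	struct node *parent;
	unsigned int qlen;
	void (*child_empty)(struct node *self);	/* may be NULL */
	int notified;				/* bookkeeping for the demo hook */
};

/* Demo hook: just counts how often the node was told a child emptied. */
static void count_empty(struct node *self)
{
	self->notified++;
}

/* The caller has already removed n packets from q and updated q->qlen.
 * Propagate that change to every ancestor so their counters stay in
 * sync, notifying a parent when the child below it is now empty. */
static void tree_decrease_qlen(struct node *q, unsigned int n)
{
	struct node *child = q;
	struct node *p;

	if (n == 0)
		return;
	for (p = q->parent; p != NULL; p = p->parent) {
		assert(p->qlen >= n);	/* an ancestor counts at least its subtree */
		p->qlen -= n;
		if (child->qlen == 0 && p->child_empty != NULL)
			p->child_empty(p);
		child = p;
	}
}
```

Purging a leaf that held 3 packets under a chain with counters 3 and 5 leaves the ancestors at 0 and 2, which is the invariant the upper layer relies on when it decides whether to keep dequeueing.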


Re: [NET_SCHED 06/06]: Fix endless loops (part 4): HTB

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:48 +0100 (MET)

> [NET_SCHED]: Fix endless loops (part 4): HTB
> 
> Convert HTB to use qdisc_tree_decrease_len() and add a callback
> for deactivating a class when its child queue becomes empty.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20


Re: [NET_SCHED 04/06]: Fix endless loops (part 2): "simple" qdiscs

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:44 +0100 (MET)

> [NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs
> 
> Convert the "simple" qdiscs to use qdisc_tree_decrease_qlen() where
> necessary:
> 
> - all graft operations
> - destruction of old child qdiscs in prio, red and tbf change operation
> - purging of queue in sfq change operation
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20


Re: [NET_SCHED 01/06]: sch_htb: perform qlen adjustment immediately in ->delete

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:37 +0100 (MET)

> [NET_SCHED]: sch_htb: perform qlen adjustment immediately in ->delete
> 
> qlen adjustment should happen immediately in ->delete and not in the
> class destroy function because the reference count will not hit zero in
> ->delete (sch_api holds a reference) but in ->put. Since the qdisc
> lock is released between deletion of the class and final destruction
> this creates an externally visible error in the qlen counter.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20


Re: [NET_SCHED 02/06]: Set parent classid in default qdiscs

2006-11-29 Thread David Miller
From: Patrick McHardy <[EMAIL PROTECTED]>
Date: Mon, 20 Nov 2006 14:08:38 +0100 (MET)

> [NET_SCHED]: Set parent classid in default qdiscs
> 
> Set parent classids in default qdiscs to allow walking up the tree
> from outside the qdiscs. This is needed by the next patch.
> 
> Signed-off-by: Patrick McHardy <[EMAIL PROTECTED]>

Applied to net-2.6.20, thanks Patrick.


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread David Miller
From: Andrew Morton <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 17:08:35 -0800

> On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
> David Miller <[EMAIL PROTECTED]> wrote:
> 
> > 
> > Please, it is very difficult to review your work the way you have
> > submitted this patch as a set of 4 patches.  These patches have not
> > been split up "logically", but rather they have been split up "per
> > file" with the same exact changelog message in each patch posting.
> > This is very clumsy, and impossible to review, and wastes a lot of
> > mailing list bandwidth.
> > 
> > We have an excellent file, called Documentation/SubmittingPatches, in
> > the kernel source tree, which explains exactly how to do this
> > correctly.
> > 
> > By splitting your patch into 4 patches, one for each file touched,
> > it is impossible to review your patch as a logical whole.
> > 
> > Please also provide your patch inline so people can just hit reply
> > in their mail reader client to quote your patch and comment on it.
> > This is impossible with the attachments you've used.
> > 
> 
> Here you go - joined up, cleaned up, ported to mainline and test-compiled.
> 
> That yield() will need to be removed - yield()'s behaviour is truly awful
> if the system is otherwise busy.  What is it there for?

What about simply turning off CONFIG_PREEMPT to fix this "problem"?

We always properly run the backlog (by doing a release_sock()) before
going to sleep otherwise except for the specific case of taking a page
fault during the copy to userspace.  It is only CONFIG_PREEMPT that
can cause this situation to occur in other circumstances as far as I
can see.

We could also pepper tcp_recvmsg() with some very carefully placed
preemption disable/enable calls to deal with this even with
CONFIG_PREEMPT enabled.
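
To make the backlog behaviour concrete, here is a toy user-space model of the owned-socket case. Everything in it (struct toy_sock and the toy_* helpers) is hypothetical and far simpler than the kernel's lock_sock()/release_sock() paths; in particular it ignores the prequeue. It only shows that packets arriving while the socket is owned sit unprocessed until the owner releases it.

```c
#include <assert.h>

/* Toy socket state; not the kernel's struct sock. */
struct toy_sock {
	int owned;	/* 1 while a process context holds the socket */
	int backlog;	/* packets queued but not yet TCP-processed */
	int processed;	/* packets fully processed, so ACKs can go out */
};

/* Process context takes ownership, as tcp_recvmsg() does on entry. */
static void toy_lock_sock(struct toy_sock *sk)
{
	assert(!sk->owned);
	sk->owned = 1;
}

/* Softirq-side receive: process at once unless the socket is owned,
 * in which case the packet is parked on the backlog. */
static void toy_rcv(struct toy_sock *sk)
{
	if (sk->owned)
		sk->backlog++;
	else
		sk->processed++;
}

/* Releasing ownership drains the backlog before giving up the socket. */
static void toy_release_sock(struct toy_sock *sk)
{
	assert(sk->owned);
	sk->processed += sk->backlog;
	sk->backlog = 0;
	sk->owned = 0;
}
```

If the owner is preempted between toy_lock_sock() and toy_release_sock(), the backlog count simply keeps growing and nothing is acknowledged, which is the window being discussed in this thread.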


Re: 2.6.19-rc6-mm2: uli526x only works after reload

2006-11-29 Thread Rafael J. Wysocki
On Thursday, 30 November 2006 00:26, Andrew Morton wrote:
> On Thu, 30 Nov 2006 00:08:21 +0100
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> 
> > On Wednesday, 29 November 2006 22:31, Rafael J. Wysocki wrote:
> > > On Wednesday, 29 November 2006 22:30, Andrew Morton wrote:
> > > > On Wed, 29 Nov 2006 21:08:00 +0100
> > > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > > > > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > > > > > 
> > > > > > > Temporarily at
> > > > > > > 
> > > > > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > > > > > 
> > > > > > > Will appear eventually at
> > > > > > > 
> > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > > > > > 
> > > > > > A minor issue: on one of my (x86-64) test boxes the uli526x driver
> > > > > > doesn't work when it's first loaded.  I have to rmmod and modprobe
> > > > > > it to make it work.
> > > > 
> > > > That isn't a minor issue.
> > > > 
> > > > > > It worked just fine on -mm1, so something must have happened to it 
> > > > > > recently.
> > > > > 
> > > > > Sorry, I was wrong.  The driver doesn't work at all, even after 
> > > > > reload.
> > > > > 
> > > > 
> > > > tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2.  But you're
> > > > not using that (correct?)
> > > > 
> > > > git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
> > > > that either.
> > > > 
> > > > git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using 
> > > > that.
> > > > 
> > > > Beats me, sorry.  Perhaps it's due to changes in networking core.  It's
> > > > presumably a showstopper for statically-linked-uli526x users.  If you
> > > > could bisect it, please?  I'd start with git-netdev-all, then tulip-*.
> > > 
> > > OK, but it'll take some time.
> > 
> > OK, done.
> > 
> > It's one of these (the first one alone doesn't compile):
> > 
> > git-netdev-all.patch
> > git-netdev-all-fixup.patch
> > libphy-dont-do-that.patch
> 
> Are you able to eliminate libphy-dont-do-that.patch?
> 
> > Is a broken-out version of git-netdev-all.patch available from somewhere?
> 
> Nope, and my few fumbling attempts to generate the sort of patch series
> which you want didn't work out too well.  One has to downgrade to
> git-bisect :(
> 
> What does "doesn't work" mean, btw?

Well, it turns out not to be 100% reproducible.  I can only reproduce it after
a soft reboot (eg. shutdown -r now).

Then, while configuring network interfaces the system says the interface name
is ethxx0, but it should be eth1 (eth0 is an RTL-8139, which is not used).  Now
if I run ifconfig, it says:

eth0: error fetching interface information: Device not found

and that's all (normally, ifconfig would show the information for lo and eth1,
without eth0).  Moreover, 'ifconfig eth1' says:

eth1: error fetching interface information: Device not found

Next, I run 'rmmod uli526x' and 'modprobe uli526x' and then 'ifconfig' is
still saying the above (about eth0), but 'ifconfig eth1' seems to work as
it should.  However, the interface often fails to transfer anything after
that.

Greetings,
Rafael


-- 
You never change things by fighting the existing reality.
R. Buckminster Fuller


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Andrew Morton
On Wed, 29 Nov 2006 16:53:11 -0800 (PST)
David Miller <[EMAIL PROTECTED]> wrote:

> 
> Please, it is very difficult to review your work the way you have
> submitted this patch as a set of 4 patches.  These patches have not
> been split up "logically", but rather they have been split up "per
> file" with the same exact changelog message in each patch posting.
> This is very clumsy, and impossible to review, and wastes a lot of
> mailing list bandwidth.
> 
> We have an excellent file, called Documentation/SubmittingPatches, in
> the kernel source tree, which explains exactly how to do this
> correctly.
> 
> By splitting your patch into 4 patches, one for each file touched,
> it is impossible to review your patch as a logical whole.
> 
> Please also provide your patch inline so people can just hit reply
> in their mail reader client to quote your patch and comment on it.
> This is impossible with the attachments you've used.
> 

Here you go - joined up, cleaned up, ported to mainline and test-compiled.

That yield() will need to be removed - yield()'s behaviour is truly awful
if the system is otherwise busy.  What is it there for?



From: Wenji Wu <[EMAIL PROTECTED]>

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked.  During this period, all incoming packets for the TCP
socket go to the backlog queue without being TCP processed.

Since Linux 2.6 can be interrupted mid-task, if the network application's
timeslice expires and it is moved to the expired array with the socket
locked, none of the packets in the backlog queue will be TCP processed until
the network application resumes execution.  If the system is heavily loaded,
TCP can easily RTO on the sender side.



 include/linux/sched.h |2 ++
 kernel/fork.c |3 +++
 kernel/sched.c|   24 ++--
 net/ipv4/tcp.c|9 +
 4 files changed, 32 insertions(+), 6 deletions(-)

diff -puN net/ipv4/tcp.c~tcp-speedup net/ipv4/tcp.c
--- a/net/ipv4/tcp.c~tcp-speedup
+++ a/net/ipv4/tcp.c
@@ -1109,6 +1109,8 @@ int tcp_recvmsg(struct kiocb *iocb, stru
struct task_struct *user_recv = NULL;
int copied_early = 0;
 
+   current->backlog_flag = 1;
+
lock_sock(sk);
 
TCP_CHECK_TIMER(sk);
@@ -1468,6 +1470,13 @@ skip_copy:
 
TCP_CHECK_TIMER(sk);
release_sock(sk);
+
+   current->backlog_flag = 0;
+   if (current->extrarun_flag == 1){
+   current->extrarun_flag = 0;
+   yield();
+   }
+
return copied;
 
 out:
diff -puN include/linux/sched.h~tcp-speedup include/linux/sched.h
--- a/include/linux/sched.h~tcp-speedup
+++ a/include/linux/sched.h
@@ -1023,6 +1023,8 @@ struct task_struct {
 #ifdef CONFIG_TASK_DELAY_ACCT
struct task_delay_info *delays;
 #endif
+   int backlog_flag;   /* packets wait in tcp backlog queue flag */
+   int extrarun_flag;  /* extra run flag for TCP performance */
 };
 
 static inline pid_t process_group(struct task_struct *tsk)
diff -puN kernel/sched.c~tcp-speedup kernel/sched.c
--- a/kernel/sched.c~tcp-speedup
+++ a/kernel/sched.c
@@ -3099,12 +3099,24 @@ void scheduler_tick(void)
 
if (!rq->expired_timestamp)
rq->expired_timestamp = jiffies;
-   if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
-   enqueue_task(p, rq->expired);
-   if (p->static_prio < rq->best_expired_prio)
-   rq->best_expired_prio = p->static_prio;
-   } else
-   enqueue_task(p, rq->active);
+   if (p->backlog_flag == 0) {
+   if (!TASK_INTERACTIVE(p) || expired_starving(rq)) {
+   enqueue_task(p, rq->expired);
+   if (p->static_prio < rq->best_expired_prio)
+   rq->best_expired_prio = p->static_prio;
+   } else
+   enqueue_task(p, rq->active);
+   } else {
+   if (expired_starving(rq)) {
+   enqueue_task(p,rq->expired);
+   if (p->static_prio < rq->best_expired_prio)
+   rq->best_expired_prio = p->static_prio;
+   } else {
+   if (!TASK_INTERACTIVE(p))
+   p->extrarun_flag = 1;
+   enqueue_task(p,rq->active);
+   }
+   }
} else {
/*
 * Prevent a too long timeslice allowing a task to monopolize
diff -puN kernel/fork.c~tcp-speedup kernel/fork.c
--- a/kernel/fork.c~tcp-speedup
+++ a/kernel/fork.c
@@ -1032,6 +1032,9 @@ static struct task_struct *copy_process(
clear_tsk_thread_flag(p,

Re: Bug 7596 - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread David Miller

The delays dealt with in your paper might actually help a highly
loaded server with lots of sockets and threads trying to communicate.

The packet processing delays caused by the scheduling delay pace the
TCP sender by controlling the rate at which ACKs go back to that
sender.  Those ACKs will go out paced to the rate at which the
sleeping TCP receiver gets back onto the cpu, and this will cause the
TCP sender to naturally adjust to the overall processing rate of the
receiver system, on a per-connection basis.

Perhaps try a system with hundreds of processes and potentially
hundreds of thousands of TCP sockets, with thousands of unique sender
sites, and see what happens.

This is a topic similar to TSO, where we are trying to balance the
gains from batching work against the losses from gaps in the
communication stream.


Re: [patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread David Miller

Please, it is very difficult to review your work the way you have
submitted this patch as a set of 4 patches.  These patches have not
been split up "logically", but rather they have been split up "per
file" with the same exact changelog message in each patch posting.
This is very clumsy, and impossible to review, and wastes a lot of
mailing list bandwidth.

We have an excellent file, called Documentation/SubmittingPatches, in
the kernel source tree, which explains exactly how to do this
correctly.

By splitting your patch into 4 patches, one for each file touched,
it is impossible to review your patch as a logical whole.

Please also provide your patch inline so people can just hit reply
in their mail reader client to quote your patch and comment on it.
This is impossible with the attachments you've used.

Thanks.


Re: [PATCH 1/1] additional ipsec audit patch

2006-11-29 Thread James Morris
On Wed, 29 Nov 2006, James Morris wrote:

> On Wed, 29 Nov 2006, Joy Latten wrote:
> 
> > This patch disables auditing in ipsec when CONFIG_AUDITSYSCALL is
> > disabled in the kernel. 
> > 
> > This patch also includes a bug fix for xfrm_state.c as a result of
> > original ipsec audit patch.
> > 
> > Let me know if it looks ok.
> 
> 
> Also, the last patch contains no Signed-off-by: line, please resend.

And, what is the testing status of these patches?


-- 
James Morris
<[EMAIL PROTECTED]>


Re: [PATCH 1/1] additional ipsec audit patch

2006-11-29 Thread James Morris
On Wed, 29 Nov 2006, Joy Latten wrote:

> This patch disables auditing in ipsec when CONFIG_AUDITSYSCALL is
> disabled in the kernel. 
> 
> This patch also includes a bug fix for xfrm_state.c as a result of
> original ipsec audit patch.
> 
> Let me know if it looks ok.


Also, the last patch contains no Signed-off-by: line, please resend.


-- 
James Morris
<[EMAIL PROTECTED]>


Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic

2006-11-29 Thread Francois Romieu
Stephen Hemminger <[EMAIL PROTECTED]> :
[...]
> Move the poll_enable to after hw_start() or put it inside hw_start.

"after" probably The order would be the opposite of the one used
in rtl8139_poll (which does __netif_rx_complete then irq_unlock)
and it's past 1 AM. It starts to be a bit foggy.

-- 
Ueimor


Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic

2006-11-29 Thread Stephen Hemminger
On Thu, 30 Nov 2006 00:32:19 +0100
Francois Romieu <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger <[EMAIL PROTECTED]> :
> > Francois Romieu <[EMAIL PROTECTED]> wrote:
> > > Stephen Hemminger <[EMAIL PROTECTED]> :
> > > [...]
> > > > @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
> > > > rtl8139_tx_clear (tp);
> > > > spin_unlock_irq(&tp->lock);
> > > >  
> > > > +   netif_poll_enable();
> > >   ^ -> dev
> > > > +
> > > > /* ...and finally, reset everything */
> > > > -   if (netif_running(dev)) {
> > > > -   rtl8139_hw_start (dev);
> > > > -   netif_wake_queue (dev);
> > > > -   }
> > > > -   spin_unlock_bh(&tp->rx_lock);
> > > > +   rtl8139_hw_start (dev);
> > > > +   netif_wake_queue (dev);
> > > >  }
> > > 
> > > rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx()
> > > if an in-flight interruption enables it a bit too fast. I'd rather go
> > > with:
> > 
> > but rtl8139_rx is not possible here because we have blocked the poll
> > routine from starting.  Basically it uses the NAPI rx scheduler bit
> > to replace the rx_lock.
> 
> 1 - the irq handler is waiting for tp->lock
> 2 - rtl8139_tx_timeout_task releases the lock
> 3 - rtl8139_tx_timeout_task issues netif_poll_enable
> 4 - the irq handler schedules ->poll(), returns

Move the poll_enable to after hw_start() or put it inside hw_start.
- 
Stephen Hemminger <[EMAIL PROTECTED]>


[PATCH 1/1] additional ipsec audit patch

2006-11-29 Thread Joy Latten
This patch disables auditing in ipsec when CONFIG_AUDITSYSCALL is
disabled in the kernel. 

This patch also includes a bug fix for xfrm_state.c as a result of
original ipsec audit patch.

Let me know if it looks ok.

My mail gateway has been acting crazy so I apologize for any
replicas being sent for ipsec audit patches.
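
For reference, the compile-out approach described above can be sketched in plain C as follows. AUDIT_ENABLED stands in for CONFIG_AUDITSYSCALL, and audit_log() and delete_state() are hypothetical names for illustration, not the xfrm functions; the sketch only shows how call sites stay unchanged while the stub expands to nothing.

```c
#include <assert.h>

/* Counts real audit calls so the effect of the stub is observable. */
static int audit_calls;

#ifdef AUDIT_ENABLED
/* Real implementation, built only when auditing is configured in. */
static void audit_log(int type, int result)
{
	(void)type;
	(void)result;
	audit_calls++;
}
#else
/* Stub: expands to an empty statement, so the arguments are never
 * evaluated and no code is generated at the call site. */
#define audit_log(type, result) do { } while (0)
#endif

/* A caller written once; it compiles the same way in both configs. */
static int delete_state(int type)
{
	int err = 0;	/* pretend the delete itself succeeded */

	assert(type >= 0);	/* type codes are non-negative in this sketch */
	audit_log(type, err ? 0 : 1);
	return err;
}
```

One thing to keep in mind with the macro form is that arguments with side effects silently disappear when auditing is off; an empty static inline function is a common alternative that keeps type checking.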

regards,
Joy

diff -urpN linux-2.6.18-patch/include/net/xfrm.h linux-2.6.18-patch.2/include/net/xfrm.h
--- linux-2.6.18-patch/include/net/xfrm.h   2006-11-27 12:29:11.0 -0600
+++ linux-2.6.18-patch.2/include/net/xfrm.h 2006-11-28 13:26:49.0 -0600
@@ -395,8 +395,13 @@ struct xfrm_audit
uid_t   loginuid;
u32 secid;
 };
-void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
+
+#ifdef CONFIG_AUDITSYSCALL
+extern void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
struct xfrm_policy *xp, struct xfrm_state *x);
+#else
+#define xfrm_audit_log(a,s,t,r,p,x) do { ; } while (0)
+#endif /* CONFIG_AUDITSYSCALL */
 
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
 {
diff -urpN linux-2.6.18-patch/net/xfrm/xfrm_policy.c linux-2.6.18-patch.2/net/xfrm/xfrm_policy.c
--- linux-2.6.18-patch/net/xfrm/xfrm_policy.c   2006-11-27 12:29:33.0 -0600
+++ linux-2.6.18-patch.2/net/xfrm/xfrm_policy.c 2006-11-28 14:51:09.0 -0600
@@ -1955,6 +1955,7 @@ int xfrm_bundle_ok(struct xfrm_policy *p
 
 EXPORT_SYMBOL(xfrm_bundle_ok);
 
+#ifdef CONFIG_AUDITSYSCALL
 /* Audit addition and deletion of SAs and ipsec policy */
 
 void xfrm_audit_log(uid_t auid, u32 sid, int type, int result,
@@ -2063,6 +2064,7 @@ void xfrm_audit_log(uid_t auid, u32 sid,
 }
 
 EXPORT_SYMBOL(xfrm_audit_log);
+#endif /* CONFIG_AUDITSYSCALL */
 
 int xfrm_policy_register_afinfo(struct xfrm_policy_afinfo *afinfo)
 {
diff -urpN linux-2.6.18-patch/net/xfrm/xfrm_state.c linux-2.6.18-patch.2/net/xfrm/xfrm_state.c
--- linux-2.6.18-patch/net/xfrm/xfrm_state.c    2006-11-27 12:29:33.0 -0600
+++ linux-2.6.18-patch.2/net/xfrm/xfrm_state.c  2006-11-28 12:58:56.0 -0600
@@ -407,7 +407,6 @@ restart:
xfrm_state_hold(x);
spin_unlock_bh(&xfrm_state_lock);
 
-   xfrm_state_delete(x);
err = xfrm_state_delete(x);
xfrm_audit_log(audit_info->loginuid,
   audit_info->secid,


Re: [Changelog] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Martin Bligh

Wenji Wu wrote:

> From: Wenji Wu <[EMAIL PROTECTED]>
> 
> Greetings,
> 
> For Linux TCP, when the network application makes a system call to move
> data from the socket's receive buffer to user space by calling
> tcp_recvmsg(), the socket is locked. During this period, all incoming
> packets for the TCP socket go to the backlog queue without being TCP
> processed. Since Linux 2.6 can be interrupted mid-task, if the network
> application's timeslice expires and it is moved to the expired array with
> the socket locked, none of the packets in the backlog queue will be TCP
> processed until the network application resumes execution. If the system
> is heavily loaded, TCP can easily RTO on the sender side.



So how much difference did this patch actually make, and to what
benchmark?


> The patch is for Linux kernel 2.6.14 Desktop and Low-latency Desktop


The patch doesn't seem to be attached? Also, it would be better to make
it for the latest kernel version (2.6.19) ... 2.6.14 is rather old ;-)

M


Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft

2006-11-29 Thread James Morris
On Wed, 29 Nov 2006, Paul Moore wrote:

> James Morris wrote:
> > All applied to:
> > git://git.infradead.org/~jmorris/selinux-net-2.6.20
> 
> Thanks.
> 
> Did you mean your kernel.org git tree?

There's a copy at infradead (which may have still been cloning if you 
checked it immediately).



-- 
James Morris
<[EMAIL PROTECTED]>


Re: Bug 7596 - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Andrew Morton
On Wed, 29 Nov 2006 17:22:10 -0600
Wenji Wu <[EMAIL PROTECTED]> wrote:

> From: Wenji Wu <[EMAIL PROTECTED]>
> 
> Greetings,
> 
> For Linux TCP, when the network application makes a system call to move data
> from the socket's receive buffer to user space by calling tcp_recvmsg(), the
> socket is locked. During this period, all incoming packets for the TCP socket
> go to the backlog queue without being TCP processed. Since a Linux 2.6 process
> can be interrupted mid-task, if the network application's timeslice expires
> and it is moved to the expired array with the socket locked, none of the
> packets in the backlog queue will be TCP processed until the network
> application resumes execution. If the system is heavily loaded, TCP can easily
> hit a retransmission timeout (RTO) on the sender side.
> 
> Attached is the detailed description of the problem and one possible
> solution.

Thanks.  The attachment will be too large for the mailing-list servers so I
uploaded a copy to
http://userweb.kernel.org/~akpm/Linux-TCP-Bottleneck-Analysis-Report.pdf

From a quick peek it appears that you're getting around 10% improvement in
TCP throughput, best case.



Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic

2006-11-29 Thread Francois Romieu
Stephen Hemminger <[EMAIL PROTECTED]> :
> Francois Romieu <[EMAIL PROTECTED]> wrote:
> > Stephen Hemminger <[EMAIL PROTECTED]> :
> > [...]
> > > @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
> > >   rtl8139_tx_clear (tp);
> > >   spin_unlock_irq(&tp->lock);
> > >  
> > > + netif_poll_enable();
> >   ^ -> dev
> > > +
> > >   /* ...and finally, reset everything */
> > > - if (netif_running(dev)) {
> > > - rtl8139_hw_start (dev);
> > > - netif_wake_queue (dev);
> > > - }
> > > - spin_unlock_bh(&tp->rx_lock);
> > > + rtl8139_hw_start (dev);
> > > + netif_wake_queue (dev);
> > >  }
> > 
> > rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx()
> > if an in-flight interruption enables it a bit too fast. I'd rather go
> > with:
> 
> but rtl8139_rx is not possible here because we have blocked the poll
> routine from starting.  Basically it uses the NAPI rx scheduler bit
> to replace the rx_lock.

1 - the irq handler is waiting for tp->lock
2 - rtl8139_tx_timeout_task releases the lock
3 - rtl8139_tx_timeout_task issues netif_poll_enable
4 - the irq handler schedules ->poll(), returns
5 - rtl8139_hw_start() races with ->poll(), aka rtl8139_rx(), for cur_rx
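
The five steps above can be replayed in a small sequential userspace model. This is only a sketch under stated assumptions: every name in it (toy_nic, toy_irq, toy_poll and so on) is made up for illustration and none of it is code from 8139too.c. It treats "poll enabled" as the only gate on scheduling ->poll(), and it counts a race whenever the model's poll routine runs while the model's hw_start is still rewriting cur_rx.

```c
#include <assert.h>
#include <stdbool.h>

/* Sequential toy model of the five steps; not code from 8139too.c. */
struct toy_nic {
	bool poll_enabled;    /* false while the model's poll routine is blocked */
	bool poll_scheduled;  /* set by the irq handler, consumed by poll */
	bool hw_restarting;   /* true while the model's hw_start rewrites cur_rx */
	int  races;           /* times poll ran while cur_rx was being rewritten */
};

/* The irq handler can only schedule poll once poll has been enabled. */
static void toy_irq(struct toy_nic *n)
{
	if (n->poll_enabled)
		n->poll_scheduled = true;
}

/* The poll routine touches cur_rx, so running during a restart is a race. */
static void toy_poll(struct toy_nic *n)
{
	if (!n->poll_scheduled)
		return;
	if (n->hw_restarting)
		n->races++;
	n->poll_scheduled = false;
}

/* Order in the posted patch: unlock, poll enable, then hw_start. */
static int toy_posted_order(void)
{
	struct toy_nic n = { false, false, false, 0 };

	n.poll_enabled = true;   /* step 3 */
	toy_irq(&n);             /* step 4: the waiting irq handler schedules poll */
	n.hw_restarting = true;  /* step 5: hw_start begins ... */
	toy_poll(&n);            /* ... and poll runs in the middle of it */
	n.hw_restarting = false;
	return n.races;
}

/* Suggested order: hw_start first, poll enable and unlock last. */
static int toy_suggested_order(void)
{
	struct toy_nic n = { false, false, false, 0 };

	n.hw_restarting = true;
	toy_irq(&n);             /* poll still disabled: nothing gets scheduled */
	toy_poll(&n);
	n.hw_restarting = false;
	n.poll_enabled = true;
	toy_irq(&n);
	toy_poll(&n);            /* runs only after hw_start has finished */
	assert(!n.poll_scheduled);
	return n.races;
}
```

In this model toy_posted_order() reports one race and toy_suggested_order() reports none, which is the point of moving the poll enable after rtl8139_hw_start().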

-- 
Ueimor


Re: 2.6.19-rc6-mm2: uli526x only works after reload

2006-11-29 Thread Andrew Morton
On Thu, 30 Nov 2006 00:08:21 +0100
"Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> On Wednesday, 29 November 2006 22:31, Rafael J. Wysocki wrote:
> > On Wednesday, 29 November 2006 22:30, Andrew Morton wrote:
> > > On Wed, 29 Nov 2006 21:08:00 +0100
> > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > > > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > > > > 
> > > > > > Temporarily at
> > > > > > 
> > > > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > > > > 
> > > > > > Will appear eventually at
> > > > > > 
> > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > > > > 
> > > > > A minor issue: on one of my (x86-64) test boxes the uli526x driver 
> > > > > doesn't
> > > > > work when it's first loaded.  I have to rmmod and modprobe it to make 
> > > > > it work.
> > > 
> > > That isn't a minor issue.
> > > 
> > > > > It worked just fine on -mm1, so something must have happened to it 
> > > > > recently.
> > > > 
> > > > Sorry, I was wrong.  The driver doesn't work at all, even after reload.
> > > > 
> > > 
> > > tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2.  But you're
> > > > not using that (correct?)
> > > 
> > > git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
> > > that either.
> > > 
> > > git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using 
> > > that.
> > > 
> > > Beats me, sorry.  Perhaps it's due to changes in networking core.  It's
> > > presumably a showstopper for statically-linked-uli526x users.  If you 
> > > could
> > > bisect it, please?  I'd start with git-netdev-all, then tulip-*.
> > 
> > OK, but it'll take some time.
> 
> OK, done.
> 
> It's one of these (the first one alone doesn't compile):
> 
> git-netdev-all.patch
> git-netdev-all-fixup.patch
> libphy-dont-do-that.patch

Are you able to eliminate libphy-dont-do-that.patch?

> Is a broken-out version of git-netdev-all.patch available from somewhere?

Nope, and my few fumbling attempts to generate the sort of patch series
which you want didn't work out too well.  One has to downgrade to
git-bisect :(

What does "doesn't work" mean, btw?



[patch 3/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu

From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked. During this period, all incoming packets for the TCP socket
go to the backlog queue without being TCP processed. Since a Linux 2.6 process
can be interrupted mid-task, if the network application's timeslice expires
and it is moved to the expired array with the socket locked, none of the
packets in the backlog queue will be TCP processed until the network
application resumes execution. If the system is heavily loaded, TCP can easily
hit a retransmission timeout (RTO) on the sender side.

Attached is the patch 3/4

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541


sched.c.patch
Description: Binary data


[patch 2/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu

From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked. During this period, all incoming packets for the TCP socket
go to the backlog queue without being TCP processed. Since a Linux 2.6 process
can be interrupted mid-task, if the network application's timeslice expires
and it is moved to the expired array with the socket locked, none of the
packets in the backlog queue will be TCP processed until the network
application resumes execution. If the system is heavily loaded, TCP can easily
hit a retransmission timeout (RTO) on the sender side.

Attached is the patch 2/4

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541


sched.h.patch
Description: Binary data


[patch 1/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu

From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked. During this period, all incoming packets for the TCP socket
go to the backlog queue without being TCP processed. Since a Linux 2.6 process
can be interrupted mid-task, if the network application's timeslice expires
and it is moved to the expired array with the socket locked, none of the
packets in the backlog queue will be TCP processed until the network
application resumes execution. If the system is heavily loaded, TCP can easily
hit a retransmission timeout (RTO) on the sender side.

Attached is the patch 1/4

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541


tcp.c.patch
Description: Binary data


[Changelog] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu

From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked. During this period, all incoming packets for the TCP socket
go to the backlog queue without being TCP processed. Since a Linux 2.6 process
can be interrupted mid-task, if the network application's timeslice expires
and it is moved to the expired array with the socket locked, none of the
packets in the backlog queue will be TCP processed until the network
application resumes execution. If the system is heavily loaded, TCP can easily
hit a retransmission timeout (RTO) on the sender side.

Attached is the Changelog for the patch

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541

From: Wenji Wu <[EMAIL PROTECTED]>

- Subject

Potential performance bottleneck for Linux TCP (2.6 Desktop, Low-latency 
Desktop)


- Why the kernel needed patching

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked. During this period, all incoming packets for the TCP socket
go to the backlog queue without being TCP processed. Since a Linux 2.6 process
can be interrupted mid-task, if the network application's timeslice expires
and it is moved to the expired array with the socket locked, none of the
packets in the backlog queue will be TCP processed until the network
application resumes execution. If the system is heavily loaded, TCP can easily
hit a retransmission timeout (RTO) on the sender side.
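
The locking behaviour described in this section can be illustrated with a small userspace model. It is a sketch only, not the kernel's code: struct toy_sock and the toy_* functions are hypothetical names, the backlog is a fixed array rather than a packet list, and "TCP processing" is reduced to a counter. The shape it shows is the one the text describes: while the reader owns the socket, arriving packets are parked on the backlog, and they are only processed when the owner releases the socket.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BACKLOG_MAX 64

/* Toy model of a socket; this is not the kernel's struct sock. */
struct toy_sock {
	bool owned_by_user;             /* set while the reader holds the socket lock */
	int  backlog[TOY_BACKLOG_MAX];  /* packets queued but not yet TCP processed */
	int  backlog_len;
	int  processed;                 /* packets that went through TCP processing */
	int  dropped;                   /* packets lost because the toy backlog was full */
};

/* Stand-in for TCP processing; in the real stack this is where ACKs originate. */
static void toy_tcp_process(struct toy_sock *sk, int pkt)
{
	(void)pkt;
	sk->processed++;
}

/* Receive side: defer to the backlog while the socket is locked. */
static void toy_rcv(struct toy_sock *sk, int pkt)
{
	if (!sk->owned_by_user) {
		toy_tcp_process(sk, pkt);
		return;
	}
	if (sk->backlog_len < TOY_BACKLOG_MAX)
		sk->backlog[sk->backlog_len++] = pkt;
	else
		sk->dropped++;
}

/* Reader side: taken on entry to the model's recvmsg. */
static void toy_lock_sock(struct toy_sock *sk)
{
	sk->owned_by_user = true;
}

/* Reader side: the backlog is drained only when the owner gets to run again. */
static void toy_release_sock(struct toy_sock *sk)
{
	assert(sk->owned_by_user);
	for (int i = 0; i < sk->backlog_len; i++)
		toy_tcp_process(sk, sk->backlog[i]);
	sk->backlog_len = 0;
	sk->owned_by_user = false;
}
```

If the owner is descheduled between toy_lock_sock() and toy_release_sock(), nothing in the model processes the queued packets, which corresponds to the stall the changelog attributes to the expired array.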

- The overall design approach in the patch

The underlying idea here is that when there are packets waiting on the prequeue
or backlog queue, the data receiving process should not be allowed to release
the CPU for long.

- Implementation details

We have modified the Linux process scheduling policy and tcp_recvmsg().

To summarize, the solution works as follows:

An expired data receiving process with packets waiting on the backlog queue or
prequeue is moved to the active array instead of the expired array as usual.
More often than not, the expired data receiving process will continue to run.
Even if it doesn't, the wait time before it resumes execution will be greatly
reduced. However, this gives the process extra runs compared to other
processes in the runqueue.

For the sake of fairness, the process is labeled with the extra_run_flag.

We also take into account that:

(1) the resumed process will continue its execution within tcp_recvmsg();
(2) tcp_recvmsg() does not return to user space until the prequeue and backlog
    queue are drained.

We therefore modified tcp_recvmsg() as follows: after the prequeue and backlog
queue are drained and before tcp_recvmsg() returns to user space, any process
labeled with the extra_run_flag calls yield() to explicitly yield the CPU to
other processes in the runqueue. yield() works by removing the process from
the active array (where it currently is, because it is running) and inserting
it into the expired array.

Also, to prevent processes in the expired array from starving, a special rule
has been provided for Linux process scheduling (the same rule used for
interactive processes): an expired process is moved to the expired array
without respect to its status if processes in the expired array are starved.
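
The policy summarized above can be sketched as a tiny state machine. This is a userspace illustration, not the attached patch: the toy_* names are hypothetical, only extra_run_flag is taken from the changelog, and clearing the flag after the yield is an assumption, since the changelog does not say when it is reset.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_array { TOY_ACTIVE, TOY_EXPIRED };

/* Toy task; only the extra_run_flag name comes from the changelog. */
struct toy_task {
	bool pending_rx;       /* packets waiting on the prequeue or backlog queue */
	bool extra_run_flag;   /* marks a task that was granted an extra run */
	enum toy_array array;  /* which runqueue array the task sits in */
};

/*
 * Timeslice expiry. expired_starving stands for the existing starvation
 * rule: when it is true the task goes to the expired array regardless.
 */
static void toy_timeslice_expired(struct toy_task *t, bool expired_starving)
{
	assert(t);
	if (t->pending_rx && !expired_starving) {
		t->array = TOY_ACTIVE;     /* keep it runnable so the queues drain */
		t->extra_run_flag = true;  /* remember the debt for later */
	} else {
		t->array = TOY_EXPIRED;
	}
}

/* End of the model's tcp_recvmsg(): the queues are drained, pay back the run. */
static void toy_recvmsg_done(struct toy_task *t)
{
	assert(t);
	t->pending_rx = false;
	if (t->extra_run_flag) {
		t->extra_run_flag = false;  /* assumption: the flag is cleared here */
		t->array = TOY_EXPIRED;     /* effect of yield() as described above */
	}
}
```

A task with pending packets stays in the active array on expiry and is marked; once its receive call finishes it lands in the expired array, so the extra run is paid back rather than kept.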

Changed files:

/kernel/sched.c
/kernel/fork.c
/include/linux/sched.h
/net/ipv4/tcp.c

- Testing results

The proposed solution trades off a small amount of fairness to resolve the TCP
performance bottleneck. It won't cause serious fairness issues.

The patch is for Linux kernel 2.6.14 Desktop and Low-latency Desktop



[patch 4/4] - Potential performance bottleneck for Linxu TCP

2006-11-29 Thread Wenji Wu
From: Wenji Wu <[EMAIL PROTECTED]>

Greetings,

For Linux TCP, when the network application makes a system call to move data
from the socket's receive buffer to user space by calling tcp_recvmsg(), the
socket is locked. During this period, all incoming packets for the TCP socket
go to the backlog queue without being TCP processed. Since a Linux 2.6 process
can be interrupted mid-task, if the network application's timeslice expires
and it is moved to the expired array with the socket locked, none of the
packets in the backlog queue will be TCP processed until the network
application resumes execution. If the system is heavily loaded, TCP can easily
hit a retransmission timeout (RTO) on the sender side.

Attached is the patch 4/4

best regards,

wenji

Wenji Wu
Network Researcher
Fermilab, MS-368
P.O. Box 500
Batavia, IL, 60510
(Email): [EMAIL PROTECTED]
(O): 001-630-840-4541


fork.c.patch
Description: Binary data


Re: 2.6.19-rc6-mm2: uli526x only works after reload

2006-11-29 Thread Rafael J. Wysocki
On Wednesday, 29 November 2006 22:31, Rafael J. Wysocki wrote:
> On Wednesday, 29 November 2006 22:30, Andrew Morton wrote:
> > On Wed, 29 Nov 2006 21:08:00 +0100
> > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > 
> > > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > > > 
> > > > > Temporarily at
> > > > > 
> > > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > > > 
> > > > > Will appear eventually at
> > > > > 
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > > > 
> > > > A minor issue: on one of my (x86-64) test boxes the uli526x driver 
> > > > doesn't
> > > > work when it's first loaded.  I have to rmmod and modprobe it to make 
> > > > it work.
> > 
> > That isn't a minor issue.
> > 
> > > > It worked just fine on -mm1, so something must have happened to it 
> > > > recently.
> > > 
> > > Sorry, I was wrong.  The driver doesn't work at all, even after reload.
> > > 
> > 
> > tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2.  But you're
> > not using that (correct?)
> > 
> > git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
> > that either.
> > 
> > git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using that.
> > 
> > Beats me, sorry.  Perhaps it's due to changes in networking core.  It's
> > presumably a showstopper for statically-linked-uli526x users.  If you could
> > bisect it, please?  I'd start with git-netdev-all, then tulip-*.
> 
> OK, but it'll take some time.

OK, done.

It's one of these (the first one alone doesn't compile):

git-netdev-all.patch
git-netdev-all-fixup.patch
libphy-dont-do-that.patch

Is a broken-out version of git-netdev-all.patch available from somewhere?

Greetings,
Rafael


Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic

2006-11-29 Thread Stephen Hemminger
On Wed, 29 Nov 2006 23:44:00 +0100
Francois Romieu <[EMAIL PROTECTED]> wrote:

> Stephen Hemminger <[EMAIL PROTECTED]> :
> [...]
> > @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
> > rtl8139_tx_clear (tp);
> > spin_unlock_irq(&tp->lock);
> >  
> > +   netif_poll_enable();
>   ^ -> dev
> > +
> > /* ...and finally, reset everything */
> > -   if (netif_running(dev)) {
> > -   rtl8139_hw_start (dev);
> > -   netif_wake_queue (dev);
> > -   }
> > -   spin_unlock_bh(&tp->rx_lock);
> > +   rtl8139_hw_start (dev);
> > +   netif_wake_queue (dev);
> >  }
> 
> rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx()
> if an in-flight interruption enables it a bit too fast. I'd rather go
> with:

but rtl8139_rx is not possible here because we have blocked the poll
routine from starting.  Basically it uses the NAPI rx scheduler bit
to replace the rx_lock.
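
The idea of using a single "rx scheduled" bit in place of a lock can be modeled in userspace with a C11 atomic_flag. This is a sketch under stated assumptions, not the kernel's implementation: the toy_* names are hypothetical, and the busy-wait in toy_poll_disable() stands in for whatever sleeping or waiting the real helper does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy device with one "rx scheduled" bit; not the kernel's struct net_device. */
struct toy_dev {
	atomic_flag rx_sched;
};

/* irq side: true only for the caller that flips the bit from clear to set. */
static bool toy_rx_schedule_prep(struct toy_dev *d)
{
	return !atomic_flag_test_and_set(&d->rx_sched);
}

/* poll side: finished, so let the next interrupt schedule poll again. */
static void toy_rx_complete(struct toy_dev *d)
{
	atomic_flag_clear(&d->rx_sched);
}

/* Disable: wait until we own the bit, then keep it so poll cannot start. */
static void toy_poll_disable(struct toy_dev *d)
{
	while (atomic_flag_test_and_set(&d->rx_sched))
		;  /* a poll is in flight; a real implementation would sleep here */
}

/* Enable: give the bit back so the irq handler can schedule poll again. */
static void toy_poll_enable(struct toy_dev *d)
{
	atomic_flag_clear(&d->rx_sched);
}
```

Because test-and-set is a single atomic step, at most one party holds the bit at any time, which is what lets it play the role the rx_lock used to play. Between toy_poll_disable() and toy_poll_enable(), toy_rx_schedule_prep() always returns false.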

It is totally untested.

-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic

2006-11-29 Thread Francois Romieu
Stephen Hemminger <[EMAIL PROTECTED]> :
[...]
> @@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
>   rtl8139_tx_clear (tp);
>   spin_unlock_irq(&tp->lock);
>  
> + netif_poll_enable();
  ^ -> dev
> +
>   /* ...and finally, reset everything */
> - if (netif_running(dev)) {
> - rtl8139_hw_start (dev);
> - netif_wake_queue (dev);
> - }
> - spin_unlock_bh(&tp->rx_lock);
> + rtl8139_hw_start (dev);
> + netif_wake_queue (dev);
>  }

rtl8139_hw_start() may mess with cur_rx, whence a race with rtl8139_rx()
if an in-flight interruption enables it a bit too fast. I'd rather go
with:
[...]

rtl8139_tx_clear (tp);

rtl8139_hw_start (dev);
netif_wake_queue (dev);
   
netif_poll_enable(dev);
spin_unlock_irq(&tp->lock);
}

Otherwise the patch is cool.

-- 
Ueimor


Re: [PATCH 1/1] add auditing to ipsec

2006-11-29 Thread Steve Grubb
On Monday 27 November 2006 14:11, Joy Latten wrote:
> Please let me know if this is acceptable.

From an audit perspective, it looks good.

-Steve


Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft

2006-11-29 Thread Paul Moore
James Morris wrote:
> All applied to:
>   git://git.infradead.org/~jmorris/selinux-net-2.6.20

Thanks.

Did you mean your kernel.org git tree?

-- 
paul moore
linux security @ hp


Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft

2006-11-29 Thread James Morris
All applied to:
git://git.infradead.org/~jmorris/selinux-net-2.6.20



Thanks,



- James
-- 
James Morris
<[EMAIL PROTECTED]>


Re: [Devel] Re: Network virtualization/isolation

2006-11-29 Thread Daniel Lezcano

Brian Haley wrote:

Eric W. Biederman wrote:

I think for cases across network socket namespaces it should
be a matter for the rules, to decide if the connection should
happen and what error code to return if the connection does not
happen.

There is a potential in this to have an ambiguous case where two
applications can be listening for connections on the same socket
on the same port and both will allow the connection.  If that
is the case I believe the proper definition is the first socket
that we find that will accept the connection gets the connection.
No. If you try to connect, the destination IP address is assigned to a 
network namespace. This network namespace is used to resolve the listening 
socket ambiguity.


Wouldn't you want to catch this at bind() and/or configuration time and
fail?  Having overlapping namespaces/rules seems undesirable, since as
Herbert said, can get you "unexpected behaviour".


Overlapping is not a problem: you can have several sockets bound to the 
same INADDR_ANY/port without ambiguity, because the network namespace 
pointer is added as a new key for the socket lookup (src addr, src port, 
dst addr, dst port, net ns pointer). The bind should not be forced to a 
specific address, because then you would not be able to connect via 127.0.0.1.
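
The extended lookup key can be sketched as follows. This is an illustration only: the toy_* types and functions are hypothetical, the table is a flat array instead of a hash, and matching is exact on every field, whereas a real listening-socket lookup also has to handle wildcards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a network namespace. */
struct toy_net_ns {
	int id;
};

/* The usual four-tuple plus the namespace pointer as a fifth key. */
struct toy_sock_key {
	uint32_t src_addr, dst_addr;
	uint16_t src_port, dst_port;
	const struct toy_net_ns *ns;
};

/* Two keys match only if the tuple and the namespace pointer both match. */
static bool toy_key_match(const struct toy_sock_key *a,
			  const struct toy_sock_key *b)
{
	return a->src_addr == b->src_addr && a->src_port == b->src_port &&
	       a->dst_addr == b->dst_addr && a->dst_port == b->dst_port &&
	       a->ns == b->ns;
}

/* Linear search; returns the index of the matching socket or -1. */
static int toy_lookup(const struct toy_sock_key *table, int n,
		      const struct toy_sock_key *pkt)
{
	assert(n >= 0);
	for (int i = 0; i < n; i++)
		if (toy_key_match(&table[i], pkt))
			return i;
	return -1;
}
```

Two sockets with an identical address and port tuple stay distinct as long as their namespace pointers differ, so a packet tagged with a namespace resolves to exactly one of them.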





I think with the appropriate set of rules it provides what is needed
for application migration.  I.e. 127.0.0.1 can be filtered so that
you can only connect to sockets in your current container.

It does get a little odd because it does allow for the possibility
that you can have multiple connected sockets with same source ip,
source port, destination ip, destination port.  If the rules are
setup appropriately.  I don't see that peculiarity being visible on
the outside network so it shouldn't be a problem.


So if they're using the same protocol (eg TCP), how is it decided which
one gets an incoming packet?  Maybe I'm missing something as I don't
understand your inside/outside network reference - is that to the
loopback address comment in the previous paragraph?


The sockets for l3 isolation are isolated in the same way as for l2 (this is 
common code). The difference is where the network namespace is found and used.
At layer 2, the namespace is found at the network device level. At layer 3, 
it is found from the destination IP address. So by the time you reach the 
socket level, you know which namespace the packet is destined for, and you 
search only for sockets related to that namespace.



  -- Daniel


[patch 10/23] bcm43xx: Drain TX status before starting IRQs

2006-11-29 Thread Chris Wright
-stable review patch.  If anyone has any objections, please let us know.
--

From: Michael Buesch <[EMAIL PROTECTED]>

Drain the Microcode TX-status-FIFO before we enable IRQs.
This is required, because the FIFO may still have entries left
from a previous run. Those would immediately fire after enabling
IRQs and would lead to an oops in the DMA TXstatus handling code.

Cc: "John W. Linville" <[EMAIL PROTECTED]>
Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>
Signed-off-by: Larry Finger <[EMAIL PROTECTED]>
Signed-off-by: Chris Wright <[EMAIL PROTECTED]>
---
 drivers/net/wireless/bcm43xx/bcm43xx_main.c |   18 ++
 1 file changed, 18 insertions(+)

--- linux-2.6.18.4.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6.18.4/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -1463,6 +1463,23 @@ static void handle_irq_transmit_status(s
}
 }
 
+static void drain_txstatus_queue(struct bcm43xx_private *bcm)
+{
+	u32 dummy;
+
+	if (bcm->current_core->rev < 5)
+		return;
+	/* Read all entries from the microcode TXstatus FIFO
+	 * and throw them away.
+	 */
+	while (1) {
+		dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_0);
+		if (!dummy)
+			break;
+		dummy = bcm43xx_read32(bcm, BCM43xx_MMIO_XMITSTAT_1);
+	}
+}
+
 static void bcm43xx_generate_noise_sample(struct bcm43xx_private *bcm)
 {
bcm43xx_shm_write16(bcm, BCM43xx_SHM_SHARED, 0x408, 0x7F7F);
@@ -3517,6 +3534,7 @@ int bcm43xx_select_wireless_core(struct 
bcm43xx_macfilter_clear(bcm, BCM43xx_MACFILTER_ASSOC);
bcm43xx_macfilter_set(bcm, BCM43xx_MACFILTER_SELF, (u8 
*)(bcm->net_dev->dev_addr));
bcm43xx_security_init(bcm);
+   drain_txstatus_queue(bcm);
ieee80211softmac_start(bcm->net_dev);
 
/* Let's go! Be careful after enabling the IRQs.

--


Re: sky2 hang still exists in 2.6.19-rc6 --Bug#396185?

2006-11-29 Thread Berck E. Nash

Stephen Hemminger wrote:
> That motherboard has dual lan, are you using both of them?
> I don't have that chip version, so hard to tell if it is using dual port
> with a single chip or not.  There is a hack for the dual port PCI-X version
> already in the driver, that turns off receive checksums if both ports are
> in use. Please try turning off receive checksums with ethtool and see if
> that helps.

I'm only using one port.  The second port is disabled in the BIOS.  The 
problem still occurs with receive checksums off.



Re: 2.6.19-rc6-mm2: uli526x only works after reload

2006-11-29 Thread Rafael J. Wysocki
On Wednesday, 29 November 2006 22:30, Andrew Morton wrote:
> On Wed, 29 Nov 2006 21:08:00 +0100
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> 
> > On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > > 
> > > > Temporarily at
> > > > 
> > > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > > 
> > > > Will appear eventually at
> > > > 
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > > 
> > > A minor issue: on one of my (x86-64) test boxes the uli526x driver doesn't
> > > work when it's first loaded.  I have to rmmod and modprobe it to make it 
> > > work.
> 
> That isn't a minor issue.
> 
> > > It worked just fine on -mm1, so something must have happened to it 
> > > recently.
> > 
> > Sorry, I was wrong.  The driver doesn't work at all, even after reload.
> > 
> 
> tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2.  But you're
> not using that (correct?)
> 
> git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
> that either.
> 
> git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using that.
> 
> Beats me, sorry.  Perhaps it's due to changes in networking core.  It's
> presumably a showstopper for statically-linked-uli526x users.  If you could
> bisect it, please?  I'd start with git-netdev-all, then tulip-*.

OK, but it'll take some time.


Re: 2.6.19-rc6-mm2: uli526x only works after reload

2006-11-29 Thread Andrew Morton
On Wed, 29 Nov 2006 21:08:00 +0100
"Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> On Wednesday, 29 November 2006 20:54, Rafael J. Wysocki wrote:
> > On Tuesday, 28 November 2006 11:02, Andrew Morton wrote:
> > > 
> > > Temporarily at
> > > 
> > > http://userweb.kernel.org/~akpm/2.6.19-rc6-mm2/
> > > 
> > > Will appear eventually at
> > > 
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc6/2.6.19-rc6-mm2/
> > 
> > A minor issue: on one of my (x86-64) test boxes the uli526x driver doesn't
> > work when it's first loaded.  I have to rmmod and modprobe it to make it 
> > work.

That isn't a minor issue.

> > It worked just fine on -mm1, so something must have happened to it recently.
> 
> Sorry, I was wrong.  The driver doesn't work at all, even after reload.
> 

tulip-dmfe-carrier-detection-fix.patch was added in rc6-mm2.  But you're
not using that (correct?)

git-netdev-all changes drivers/net/tulip/de2104x.c, but you're not using
that either.

git-powerpc(!) alters drivers/net/tulip/de4x5.c, but you're not using that.

Beats me, sorry.  Perhaps it's due to changes in networking core.  It's
presumably a showstopper for statically-linked-uli526x users.  If you could
bisect it, please?  I'd start with git-netdev-all, then tulip-*.



[PATCH] r8169: Fix iteration variable sign

2006-11-29 Thread Francois Romieu
This changes the type of variable "i" in rtl8169_init_one()
from "unsigned int" to "int". "i" is checked for < 0 later,
which can never happen for "unsigned". This results in broken
error handling.

Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>
Signed-off-by: Francois Romieu <[EMAIL PROTECTED]>

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 5002673..c8fa9b1 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -1491,8 +1491,8 @@ rtl8169_init_one(struct pci_dev *pdev, c
struct rtl8169_private *tp;
struct net_device *dev;
void __iomem *ioaddr;
-   unsigned int i, pm_cap;
-   int rc;
+   unsigned int pm_cap;
+   int i, rc;
 
if (netif_msg_drv(&debug)) {
printk(KERN_INFO "%s Gigabit Ethernet driver %s loaded\n",
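
The signedness problem fixed by this patch can be reproduced in isolation. The sketch below is generic userspace code, not an excerpt from r8169.c, and the toy_* names are made up. It shows a backwards search that relies on the index going below zero to signal "not found", and what an unsigned counter does instead at that point.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Search a table from the end; returns the matching index, or -1 if there
 * is none. With "unsigned int i" the loop condition "i >= 0" would always
 * hold and a later "i < 0" check could never fire.
 */
static int toy_find_last(const int *table, size_t len, int wanted)
{
	int i;

	assert(len <= INT_MAX);
	for (i = (int)len - 1; i >= 0; i--)
		if (table[i] == wanted)
			break;
	return i;  /* -1 when nothing matched */
}

/* What an unsigned counter holds after stepping "below" zero. */
static unsigned int toy_unsigned_below_zero(void)
{
	unsigned int i = 0;

	i--;       /* well defined for unsigned types: wraps around */
	return i;  /* UINT_MAX, which never compares as less than zero */
}
```

With a signed index the failed search is visible as -1, so the error path runs. The unsigned counter wraps to UINT_MAX instead, which is why the "< 0" test in the original code was dead.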


[PATCH] sundance: use NULL for pointer

2006-11-29 Thread Randy Dunlap
From: Randy Dunlap <[EMAIL PROTECTED]>

Use NULL instead of 0 for pointers (cures sparse warnings).

drivers/net/sundance.c:1106:16: warning: Using plain integer as NULL pointer
drivers/net/sundance.c:1652:16: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>
---
 drivers/net/sundance.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19-rc6-mm2.orig/drivers/net/sundance.c
+++ linux-2.6.19-rc6-mm2/drivers/net/sundance.c
@@ -1103,7 +1103,7 @@ reset_tx (struct net_device *dev)
np->cur_tx = np->dirty_tx = 0;
np->cur_task = 0;
 
-   np->last_tx = 0;
+   np->last_tx = NULL;
iowrite8(127, ioaddr + TxDMAPollPeriod);
 
iowrite16 (StatsEnable | RxEnable | TxEnable, ioaddr + MACCtrl1);
@@ -1649,7 +1649,7 @@ static int netdev_close(struct net_devic
np->cur_tx = 0;
np->dirty_tx = 0;
np->cur_task = 0;
-   np->last_tx = 0;
+   np->last_tx = NULL;
 
netif_stop_queue(dev);
 


---


Re: [PATCH 3/3] NetLabel: add the ranged tag to the CIPSOv4 protocol

2006-11-29 Thread James Morris
On Wed, 29 Nov 2006, [EMAIL PROTECTED] wrote:

> +{
> + /* The constant '16' is not random, it is the maximum number of
> +  * high/low category range pairs as permitted by the CIPSO draft based
> +  * on a maximum IPv4 header length of 60 bytes - the BUG_ON() assertion
> +  * does a sanity check to make sure we don't overflow the array. */
> + int iter = -1;
> + u16 array[16];

Perhaps in a future update, make this value a macro definition and 
document it in the header.
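
One possible shape for that follow-up, as a userspace sketch: the macro name TOY_CIPSO_RNG_ARRAY_LEN and the helper functions are hypothetical and not taken from the NetLabel sources, and the value 16 is simply the constant from the quoted comment. The helper replaces the BUG_ON() style assertion with an explicit bounds check so the example can run outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical macro name; the value 16 is the constant from the quoted code. */
#define TOY_CIPSO_RNG_ARRAY_LEN 16

struct toy_rng_buf {
	uint16_t array[TOY_CIPSO_RNG_ARRAY_LEN];
	int iter;  /* index of the last used slot, -1 when empty */
};

static void toy_rng_init(struct toy_rng_buf *b)
{
	assert(b);
	b->iter = -1;
}

/* Append one category value; refuses instead of overflowing the array. */
static bool toy_rng_push(struct toy_rng_buf *b, uint16_t cat)
{
	if (b->iter + 1 >= TOY_CIPSO_RNG_ARRAY_LEN)
		return false;
	b->array[++b->iter] = cat;
	return true;
}
```

Keeping the array size and the overflow check tied to one named constant means the two cannot drift apart if the limit is ever revisited.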



-- 
James Morris
<[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Realtek 8139 driver (8139too.c) TX Timeout doesn't allow interrupt handler to disable receive interrupts at high bi-directional traffic

2006-11-29 Thread Stephen Hemminger
On Wed, 29 Nov 2006 14:20:31 +0530
"Basheer, Mansoor Ahamed" <[EMAIL PROTECTED]> wrote:

> Francois Romieu [mailto:[EMAIL PROTECTED] wrote:
> 
> > Afaics your change may disable the Rx irq right after the poll routine
> 
> > enabled it again. It will not always work either.
> > 
> > The (slow) timeout watchdog could grab the poll handler and hack the 
> > irq mask depending on whether poll was scheduled or not.
> 
> Could you please confirm whether the attached patch would work?
> I tested it and it works for me.
> 
> 
> Signed-off-by: Mansoor Ahamed <[EMAIL PROTECTED]>
> 
> --- old/8139too.c 2006-11-14 10:44:27.0 +0530
> +++ new/8139too.c 2006-11-14 10:44:18.0 +0530
> @@ -1438,8 +1438,18 @@
>   if ((!(tmp & CmdRxEnb)) || (!(tmp & CmdTxEnb)))
>   RTL_W8 (ChipCmd, CmdRxEnb | CmdTxEnb);
>  
> - /* Enable all known interrupts by setting the interrupt mask. */
> - RTL_W16 (IntrMask, rtl8139_intr_mask);
> + local_irq_disable();
> + /* Don't enable RX if RX was already scheduled */
> + if(test_bit(__LINK_STATE_START, &dev->state) &&
> + test_bit(__LINK_STATE_RX_SCHED, &dev->state) ) {
> + /* Enable all interrupts except RX by setting the interrupt mask. */
> + RTL_W16 (IntrMask, rtl8139_norx_intr_mask);
> + }
> + else {
> + /* Enable all known interrupts by setting the interrupt mask. */
> + RTL_W16 (IntrMask, rtl8139_intr_mask);
> + }
> + local_irq_enable();
>  }

Sorry, that's not the right way. Testing for bits is not
SMP safe and is usually a bad idea. The rx_lock model is not the
best way. Try something like this:

--- a/drivers/net/8139too.c.orig 2006-11-29 12:22:32.0 -0800
+++ b/drivers/net/8139too.c 2006-11-29 12:22:06.0 -0800
@@ -589,7 +589,6 @@ struct rtl8139_private {
unsigned int default_port : 4;  /* Last dev->if_port value. */
unsigned int have_thread : 1;
spinlock_t lock;
-   spinlock_t rx_lock;
chip_t chipset;
u32 rx_config;
struct rtl_extra_stats xstats;
@@ -1009,7 +1008,6 @@ static int __devinit rtl8139_init_one (s
tp->msg_enable =
(debug < 0 ? RTL8139_DEF_MSG_ENABLE : ((1 << debug) - 1));
spin_lock_init (&tp->lock);
-   spin_lock_init (&tp->rx_lock);
INIT_WORK(&tp->thread, rtl8139_thread, dev);
tp->mii.dev = dev;
tp->mii.mdio_read = mdio_read;
@@ -1654,6 +1652,9 @@ static void rtl8139_tx_timeout_task (voi
int i;
u8 tmp8;
 
+   if (!netif_running(dev))
+   return;
+
printk (KERN_DEBUG "%s: Transmit timeout, status %2.2x %4.4x %4.4x "
"media %2.2x.\n", dev->name, RTL_R8 (ChipCmd),
RTL_R16(IntrStatus), RTL_R16(IntrMask), RTL_R8(MediaStatus));
@@ -1673,7 +1674,9 @@ static void rtl8139_tx_timeout_task (voi
if (tmp8 & CmdTxEnb)
RTL_W8 (ChipCmd, CmdRxEnb);
 
-   spin_lock_bh(&tp->rx_lock);
+   /* prevent NAPI poll from running */
+   netif_poll_disable(dev);
+
/* Disable interrupts by clearing the interrupt mask. */
RTL_W16 (IntrMask, 0x0000);
 
@@ -1682,12 +1685,11 @@ static void rtl8139_tx_timeout_task (voi
rtl8139_tx_clear (tp);
spin_unlock_irq(&tp->lock);
 
+   netif_poll_enable(dev);
+
/* ...and finally, reset everything */
-   if (netif_running(dev)) {
-   rtl8139_hw_start (dev);
-   netif_wake_queue (dev);
-   }
-   spin_unlock_bh(&tp->rx_lock);
+   rtl8139_hw_start (dev);
+   netif_wake_queue (dev);
 }
 
 static void rtl8139_tx_timeout (struct net_device *dev)
@@ -2116,7 +2118,6 @@ static int rtl8139_poll(struct net_devic
int orig_budget = min(*budget, dev->quota);
int done = 1;
 
-   spin_lock(&tp->rx_lock);
if (likely(RTL_R16(IntrStatus) & RxAckBits)) {
int work_done;
 
@@ -2138,7 +2139,6 @@ static int rtl8139_poll(struct net_devic
__netif_rx_complete(dev);
local_irq_enable();
}
-   spin_unlock(&tp->rx_lock);
 
return !done;
 }
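
To make the "testing for bits is not SMP safe" point concrete, here is a small userspace sketch using C11 atomics. It is not driver code: fake_dev and the fake_* helpers are hypothetical names, and the idea that poll disable is an atomic test-and-set on the "RX scheduled" bit is my understanding of the 2.6-era helpers rather than something shown in this thread. The point is that ownership of the poll state is taken in a single atomic step, instead of reading a bit and acting on the result later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the "RX scheduled" bit in dev->state. */
struct fake_dev {
	atomic_flag rx_sched;
};

static void fake_dev_init(struct fake_dev *d)
{
	assert(d != NULL);
	atomic_flag_clear(&d->rx_sched);
}

/* Interrupt path: returns true only for the caller that set the flag. */
static bool fake_poll_try_schedule(struct fake_dev *d)
{
	return !atomic_flag_test_and_set(&d->rx_sched);
}

/* Timeout path: wait until we own the flag, so no poll can start.
 * This sketch spins; a kernel implementation would sleep instead. */
static void fake_poll_disable(struct fake_dev *d)
{
	while (atomic_flag_test_and_set(&d->rx_sched))
		;
}

/* Release the flag so polling can be scheduled again. */
static void fake_poll_enable(struct fake_dev *d)
{
	atomic_flag_clear(&d->rx_sched);
}
```

A separate test followed by a register write, as in the quoted patch, leaves a window in which another CPU can change the bit between the two steps; the test-and-set form has no such window.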


Re: Network virtualization/isolation

2006-11-29 Thread Brian Haley

Eric W. Biederman wrote:

> I think for cases across network socket namespaces it should
> be a matter for the rules, to decide if the connection should
> happen and what error code to return if the connection does not
> happen.
>
> There is a potential in this to have an ambiguous case where two
> applications can be listening for connections on the same socket
> on the same port and both will allow the connection.  If that
> is the case I believe the proper definition is the first socket
> that we find that will accept the connection gets the connection.


Wouldn't you want to catch this at bind() and/or configuration time and 
fail?  Having overlapping namespaces/rules seems undesirable, since as 
Herbert said, can get you "unexpected behaviour".
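
The two policies being contrasted here (first match at connect time versus rejecting the overlap at bind time) can be sketched in userspace C. This is a toy model, not anything from the proposed patches: fake_listener, the accept_mask rule representation, and both function names are hypothetical, and namespaces are simply numbered 0 to 31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule model: bit N set means "accept peers from namespace N". */
struct fake_listener {
	uint16_t port;
	uint32_t accept_mask;
};

/* Connect-time policy: the first listener whose rules accept the peer wins.
 * Returns the table index, or -1 if nobody accepts the connection. */
static int lookup_first_match(const struct fake_listener *tbl, size_t cnt,
			      uint16_t port, unsigned int src_ns)
{
	size_t i;

	assert(src_ns < 32);
	for (i = 0; i < cnt; i++)
		if (tbl[i].port == port &&
		    (tbl[i].accept_mask & ((uint32_t)1 << src_ns)))
			return (int)i;
	return -1;
}

/* Bind-time policy: refuse a listener whose rules overlap an existing one
 * on the same port, so the ambiguous case can never arise. */
static bool bind_would_overlap(const struct fake_listener *tbl, size_t cnt,
			       const struct fake_listener *cand)
{
	size_t i;

	for (i = 0; i < cnt; i++)
		if (tbl[i].port == cand->port &&
		    (tbl[i].accept_mask & cand->accept_mask))
			return true;
	return false;
}
```

With first-match lookup the winner depends on table order, which is the "unexpected behaviour" concern; the bind-time check trades that for an earlier, explicit failure.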



> I think with the appropriate set of rules it provides what is needed
> for application migration.  I.e. 127.0.0.1 can be filtered so that
> you can only connect to sockets in your current container.
>
> It does get a little odd because it does allow for the possibility
> that you can have multiple connected sockets with same source ip,
> source port, destination ip, destination port.  If the rules are
> setup appropriately.  I don't see that peculiarity being visible on
> the outside network so it shouldn't be a problem.


So if they're using the same protocol (eg TCP), how is it decided which 
one gets an incoming packet?  Maybe I'm missing something as I don't 
understand your inside/outside network reference - is that to the 
loopback address comment in the previous paragraph?


Thanks,

-Brian


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Stephen Hemminger
On Wed, 29 Nov 2006 08:03:28 -0800
David Kimdon <[EMAIL PROTECTED]> wrote:

> On Wed, Nov 29, 2006 at 04:38:56PM +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> > > On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > > > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Why do you say that?
> > > 
> > > There is absolutely no reason why dadwifi can't be merged into the
> > > mainline once the hal issue is resolved. 
> > 
> > Last time we talked about that stuff, it was decided that
> > we don't want a HAL... See archives.
> 
> To be clear, that is all part of the hal issue that needs to be
> resolved.  Removing the hal abstraction is not difficult for an
> interested party once source for the hal is available.  The next step
> in such an effort would be to add an open hal to dadwifi, IMO.
> 

Isn't it obvious? Planning from goal through intermediate steps gives:

0 - today (raw materials)
* softmac stack: d80211
* open hal: ar5k
* glue layer: dadwifi

1 - put pieces together
* d80211 + dadwifi + ar5k

2 - release working code to d80211 tree

3 - hard link dadwifi to ar5k (one module)

4 - collapse indirect calls and refactor

5 - lather rinse repeat in public d80211 tree

...

8 - resulting in atheros driver kernel module

9 - code ready in d80211


10 - mainline integration of working driver for Atheros
 using common softmac stack

> 
> P.S. Actually, it isn't clear to me that removing the hal entirely is
> a good idea.  Abstractions exist for practical reasons.  The hal
> allows dadwifi to support a variety of Atheros chips without needing
> to worry about the specific details of each chip.

Abstractions that deal with hardware are good. See phylib.
Abstractions that try to deal with operating system independence are 
gross.



-- 
Stephen Hemminger <[EMAIL PROTECTED]>


Re: [PATCH] lockdep: fix sk->sk_callback_lock locking

2006-11-29 Thread David Miller
From: Herbert Xu <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 23:07:09 +1100

> On Wed, Nov 29, 2006 at 12:42:24PM +0100, Peter Zijlstra wrote:
> > 
> > However I'm not quite sure yet how to teach lockdep about this. The
> > proposed patch will shut it up though.
> 
> As a rule I think we should never make semantic changes to shut up
> lockdep.

Especially ones which are costly, as this proposed change is in
that it disables software interrupts in a place where that
is completely unnecessary.

Let's not even consider this patch :)


Re: pktgen

2006-11-29 Thread Alexey Dobriyan
On Tue, Nov 28, 2006 at 03:33:25PM -0800, David Miller wrote:
> From: Alexey Dobriyan <[EMAIL PROTECTED]>
> Date: Wed, 22 Nov 2006 00:22:51 +0300
> 
> > [CCing netdev, bug in pktgen]
> > 
> > [build modular pktgen]
> > while true; do modprobe pktgen && rmmod pktgen; done
> > 
> > BUG: warning at fs/proc/generic.c:732/remove_proc_entry()
> >  [] remove_proc_entry+0x161/0x1ca
> >  [] pg_cleanup+0xd5/0xdc [pktgen]
> >  [] autoremove_wake_function+0x0/0x35
> >  [] sys_delete_module+0x162/0x189
> >  [] remove_vma+0x31/0x36
> >  [] do_munmap+0x193/0x1ac
> >  [] sysenter_past_esp+0x56/0x79
> >  [] fn_hash_delete+0x4f/0x1c7
> > 
> > On Tue, Nov 21, 2006 at 09:36:46PM +0100, Pavol Gono wrote:
> > > I am going to add two more:
> > > for i in 1 2 3 4 5 ; do modprobe pktgen ; rmmod pktgen ; done
> > 
> > Looks like it creates /proc/net/pktgen/kpktgen_%i but forgets to remove
> > them.
>
> It's pretty careful to delete all of the entries under
> /proc/net/pktgen/.
>
> When the module is brought down, it walks the list of threads
> and brings them down by setting T_TERMINATE in t->control.

Looks like worker thread strategically clears it if scheduled at wrong
moment.

--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
 
init_waitqueue_head(&t->queue);
 
-   t->control &= ~(T_TERMINATE);
t->control &= ~(T_RUN);
t->control &= ~(T_STOP);
t->control &= ~(T_REMDEVALL);

> This makes the thread break out of its loop and run:
>
>   pktgen_stop(t);
>   pktgen_rem_all_ifs(t);
>   pktgen_rem_thread(t);

Kernel seems to survive, but when I hit Ctrl+C after half
a minute backtrace is back being the very last dmesg lines.
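
The lost-request scenario behind the one-line patch above can be modelled in a few lines of userspace C. This is a deterministic sketch of one interleaving, not the pktgen code: the flag values are hypothetical (only the flag names appear in the thread), and fake_thread and the helper names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values; only the names come from the thread. */
#define T_TERMINATE  (1u << 0)
#define T_STOP       (1u << 1)
#define T_RUN        (1u << 2)
#define T_REMDEVALL  (1u << 3)

struct fake_thread {
	unsigned int control;
};

/* Module unload path: ask the worker to exit. */
static void request_terminate(struct fake_thread *t)
{
	assert(t != NULL);
	t->control |= T_TERMINATE;
}

/* Worker start-up before the patch: clears every flag, including a
 * terminate request that arrived before the worker first ran. */
static void worker_init_old(struct fake_thread *t)
{
	t->control &= ~(T_TERMINATE | T_RUN | T_STOP | T_REMDEVALL);
}

/* Worker start-up after the patch: T_TERMINATE is left untouched. */
static void worker_init_new(struct fake_thread *t)
{
	t->control &= ~(T_RUN | T_STOP | T_REMDEVALL);
}

/* Loop condition checked by the worker on each iteration. */
static bool worker_should_exit(const struct fake_thread *t)
{
	return (t->control & T_TERMINATE) != 0;
}
```

If rmmod sets T_TERMINATE before the freshly created worker gets its first time slice, the old start-up sequence erases the request and the worker never cleans up its /proc entries.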



[PATCH] r8169: Fix iteration variable sign

2006-11-29 Thread Michael Buesch
This changes the type of variable "i" in
rtl8169_init_one() from "unsigned int" to "int".
"i" is checked for <0 later, which can never happen
for "unsigned". This results in broken error handling.

Signed-off-by: Michael Buesch <[EMAIL PROTECTED]>

Index: linux-2.6/drivers/net/r8169.c
===
--- linux-2.6.orig/drivers/net/r8169.c  2006-11-04 19:03:28.0 +0100
+++ linux-2.6/drivers/net/r8169.c   2006-11-29 20:41:59.0 +0100
@@ -1473,8 +1473,8 @@ rtl8169_init_one(struct pci_dev *pdev, c
struct rtl8169_private *tp;
struct net_device *dev;
void __iomem *ioaddr;
-   unsigned int i, pm_cap;
-   int rc;
+   unsigned int pm_cap;
+   int i, rc;
 
if (netif_msg_drv(&debug)) {
printk(KERN_INFO "%s Gigabit Ethernet driver %s loaded\n",

-- 
Greetings Michael.
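
A standalone illustration of the bug class this patch fixes: a negative return value stored in an unsigned variable can never satisfy a "< 0" test. fake_lookup and both checker functions are hypothetical and only stand in for whatever call in rtl8169_init_one() reports failure with a negative value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: non-negative index on success, negative on failure. */
static int fake_lookup(bool ok)
{
	return ok ? 3 : -1;
}

/* Before the fix: the result lands in an unsigned variable. */
static bool failure_seen_unsigned(bool ok)
{
	unsigned int i = fake_lookup(ok);  /* -1 converts to UINT_MAX */

	assert(!(i < 0));  /* can never fire: unsigned values are never negative */
	return i < 0;      /* always false, so the error path is dead code */
}

/* After the fix: a signed variable keeps the sign. */
static bool failure_seen_signed(bool ok)
{
	int i = fake_lookup(ok);

	return i < 0;
}
```

Compilers can flag the dead comparison (GCC does so with -Wtype-limits, which -Wextra enables), which is a cheap way to find similar cases.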


Re: Intel 82559 NIC corrupted EEPROM

2006-11-29 Thread Jesse Brandeburg

On 11/29/06, John <[EMAIL PROTECTED]> wrote:

> > Let's go ahead and print the output from e100_load_eeprom
> > debug patch attached.
>
> Loading (then unloading) e100.ko fails the first few times (i.e. the
> driver claims one of the EEPROMs is corrupted). Thereafter, sometimes it
> fails, other times it works. Sounds like a race, no?


yes, or something like that.  I think you may have a piece of eeprom
hardware that is either "slow" or slightly out of spec.  I wonder if
the hrt kernel makes udelay(4) much more like 4us than the regular
kernels.

can you try adding mdelay(100); in e100_eeprom_load before the for loop,
and then change the multiple udelay(4) to mdelay(1) in e100_eeprom_read


> On an unrelated note, insmod_100.txt is truncated at the beginning, and
> insmod_110.txt is truncated in the middle (!!) cf. line 14. What would
> cause klogd to behave like that?


usually it's because whatever is printing is printing too fast or too
much at a time.


Re: [PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft

2006-11-29 Thread Casey Schaufler

--- [EMAIL PROTECTED] wrote:

> This patchset consists of three patches that add
> support for the remaining two
> tag types from the CIPSO draft specification, the
> enumerated and range tags.
> The most significant part about adding these two
> tags is that NetLabel now has
> the ability to represent more than 240 categories
> (limitation imposed by the
> current restricted bitmap tag).
> 
> In addition, the first patch in the set converts
> NetLabel's contiguous char
> string category bitmap stored in network friendly
> bit/byte order into a sparse
> bitmap stored in host friendly bit/byte order. 
> While this change was not
> required to support the new CIPSO tags, it should
> make life much easier as the
> old category bitmap would have proven problematic as
> the number of usable
> categories increases with the new tag types.  It
> also has a side effect of
> making the LSM specific code much less ugly.

Fabulous. Thank you.


Casey Schaufler
[EMAIL PROTECTED]


[PATCH 3/3] NetLabel: add the ranged tag to the CIPSOv4 protocol

2006-11-29 Thread paul . moore
From: Paul Moore <[EMAIL PROTECTED]>

Add support for the ranged tag (tag type #5) to the CIPSOv4 protocol.

The ranged tag allows for seven, or eight if zero is the lowest category,
category ranges to be specified in a CIPSO option.  Each range is specified by
two unsigned 16 bit fields, each with a maximum value of 65534.  The two values
specify the start and end of the category range; if the start of the category
range is zero then it is omitted.

See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details.

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
---
 net/ipv4/cipso_ipv4.c |  268 ++
 1 files changed, 268 insertions(+)

Index: net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c
===
--- net-2.6.20_netlabel-cats.orig/net/ipv4/cipso_ipv4.c
+++ net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c
@@ -455,6 +455,10 @@ int cipso_v4_doi_add(struct cipso_v4_doi
switch (doi_def->tags[iter]) {
case CIPSO_V4_TAG_RBITMAP:
break;
+   case CIPSO_V4_TAG_RANGE:
+   if (doi_def->type != CIPSO_V4_MAP_PASS)
+   return -EINVAL;
+   break;
case CIPSO_V4_TAG_INVALID:
if (iter == 0)
return -EINVAL;
@@ -1045,6 +1049,148 @@ static int cipso_v4_map_cat_enum_ntoh(co
return 0;
 }
 
+/**
+ * cipso_v4_map_cat_rng_valid - Checks to see if the categories are valid
+ * @doi_def: the DOI definition
+ * @rngcat: category list
+ * @rngcat_len: length of the category list in bytes
+ *
+ * Description:
+ * Checks the given categories against the given DOI definition and returns a
+ * negative value if any of the categories do not have a valid mapping and a
+ * zero value if all of the categories are valid.
+ *
+ */
+static int cipso_v4_map_cat_rng_valid(const struct cipso_v4_doi *doi_def,
+ const unsigned char *rngcat,
+ u32 rngcat_len)
+{
+   u16 cat_high;
+   u16 cat_low;
+   u32 cat_prev = CIPSO_V4_MAX_REM_CATS + 1;
+   u32 iter;
+
+   if (doi_def->type != CIPSO_V4_MAP_PASS || rngcat_len & 0x01)
+   return -EFAULT;
+
+   for (iter = 0; iter < rngcat_len; iter += 4) {
+   cat_high = ntohs(*((__be16 *)&rngcat[iter]));
+   if ((iter + 4) <= rngcat_len)
+   cat_low = ntohs(*((__be16 *)&rngcat[iter + 2]));
+   else
+   cat_low = 0;
+
+   if (cat_high > cat_prev)
+   return -EFAULT;
+
+   cat_prev = cat_low;
+   }
+
+   return 0;
+}
+
+/**
+ * cipso_v4_map_cat_rng_hton - Perform a category mapping from host to network
+ * @doi_def: the DOI definition
+ * @secattr: the security attributes
+ * @net_cat: the zero'd out category list in network/CIPSO format
+ * @net_cat_len: the length of the CIPSO category list in bytes
+ *
+ * Description:
+ * Perform a label mapping to translate a local MLS category bitmap to the
+ * correct CIPSO category list using the given DOI definition.   Returns the
+ * size in bytes of the network category bitmap on success, negative values
+ * otherwise.
+ *
+ */
+static int cipso_v4_map_cat_rng_hton(const struct cipso_v4_doi *doi_def,
+const struct netlbl_lsm_secattr *secattr,
+unsigned char *net_cat,
+u32 net_cat_len)
+{
+   /* The constant '16' is not random, it is the maximum number of
+* high/low category range pairs as permitted by the CIPSO draft based
+* on a maximum IPv4 header length of 60 bytes - the BUG_ON() assertion
+* does a sanity check to make sure we don't overflow the array. */
+   int iter = -1;
+   u16 array[16];
+   u32 array_cnt = 0;
+   u32 cat_size = 0;
+
+   BUG_ON(net_cat_len > 30);
+
+   for (;;) {
+   iter = netlbl_secattr_catmap_walk(secattr->mls_cat, iter + 1);
+   if (iter < 0)
+   break;
+   cat_size += (iter == 0 ? 0 : sizeof(u16));
+   if (cat_size > net_cat_len)
+   return -ENOSPC;
+   array[array_cnt++] = iter;
+
+   iter = netlbl_secattr_catmap_walk_rng(secattr->mls_cat, iter);
+   if (iter < 0)
+   return -EFAULT;
+   cat_size += sizeof(u16);
+   if (cat_size > net_cat_len)
+   return -ENOSPC;
+   array[array_cnt++] = iter;
+   }
+
+   for (iter = 0; array_cnt > 0;) {
+   *((__be16 *)&net_cat[iter]) = htons(array[--array_cnt]);
+   iter += 2;
+   array_cnt--;
+   if (array[array_cnt] != 0) {
+ 
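
The archived patch is cut off above. As a rough userspace sketch of the wire layout that the commit message and the visible part of the hton loop describe, the function below writes ranges highest first, each as a big-endian high bound followed by its low bound, and drops the final low bound when it is zero. cat_range, put_be16 and rng_encode are hypothetical names, and the input is assumed to be sorted ascending and non-overlapping; none of this is taken from the missing remainder of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side representation of one category range. */
struct cat_range {
	uint16_t low;
	uint16_t high;
};

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_be16(unsigned char *p, uint16_t v)
{
	p[0] = (unsigned char)(v >> 8);
	p[1] = (unsigned char)(v & 0xff);
}

/* Encode ranges (sorted ascending) into out[], highest range first.
 * Returns the number of bytes written, or -1 if out[] is too small. */
static int rng_encode(const struct cat_range *rng, size_t cnt,
		      unsigned char *out, size_t out_len)
{
	size_t pos = 0;

	assert(rng != NULL && out != NULL);
	while (cnt > 0) {
		const struct cat_range *r = &rng[--cnt];

		if (pos + 2 > out_len)
			return -1;
		put_be16(&out[pos], r->high);
		pos += 2;

		/* The lowest range may omit a low bound of zero. */
		if (cnt == 0 && r->low == 0)
			break;

		if (pos + 2 > out_len)
			return -1;
		put_be16(&out[pos], r->low);
		pos += 2;
	}
	return (int)pos;
}
```

Encoding the ranges 0-3 and 10-20 this way yields the six bytes 00 14 00 0a 00 03, which is why the host-side array in the patch needs one more 16-bit slot than fits on the wire.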

[PATCH 0/3] NetLabel: add the remaining CIPSO tag types from the IETF draft

2006-11-29 Thread paul . moore
This patchset consists of three patches that add support for the remaining two
tag types from the CIPSO draft specification, the enumerated and range tags.
The most significant part about adding these two tags is that NetLabel now has
the ability to represent more than 240 categories (limitation imposed by the
current restricted bitmap tag).

In addition, the first patch in the set converts NetLabel's contiguous char
string category bitmap stored in network friendly bit/byte order into a sparse
bitmap stored in host friendly bit/byte order.  While this change was not
required to support the new CIPSO tags, it should make life much easier as the
old category bitmap would have proven problematic as the number of usable
categories increases with the new tag types.  It also has a side effect of
making the LSM specific code much less ugly.

During testing I have not seen any regressions with this patchset; please
consider this for net-2.6.20.  Thanks.

--
paul moore
linux security @ hp


[PATCH 2/3] NetLabel: add the enumerated tag to the CIPSOv4 protocol

2006-11-29 Thread paul . moore
From: Paul Moore <[EMAIL PROTECTED]>

Add support for the enumerated tag (tag type #2) to the CIPSOv4 protocol.

The enumerated tag allows for 15 categories to be specified in a CIPSO option,
where each category is an unsigned 16 bit field with a maximum value of 65534.

See Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt for more details.

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
---
 net/ipv4/cipso_ipv4.c |  233 ++
 1 files changed, 233 insertions(+)

Index: net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c
===
--- net-2.6.20_netlabel-cats.orig/net/ipv4/cipso_ipv4.c
+++ net-2.6.20_netlabel-cats/net/ipv4/cipso_ipv4.c
@@ -459,6 +459,10 @@ int cipso_v4_doi_add(struct cipso_v4_doi
if (iter == 0)
return -EINVAL;
break;
+   case CIPSO_V4_TAG_ENUM:
+   if (doi_def->type != CIPSO_V4_MAP_PASS)
+   return -EINVAL;
+   break;
default:
return -EINVAL;
}
@@ -940,6 +944,107 @@ static int cipso_v4_map_cat_rbm_ntoh(con
return -EINVAL;
 }
 
+/**
+ * cipso_v4_map_cat_enum_valid - Checks to see if the categories are valid
+ * @doi_def: the DOI definition
+ * @enumcat: category list
+ * @enumcat_len: length of the category list in bytes
+ *
+ * Description:
+ * Checks the given categories against the given DOI definition and returns a
+ * negative value if any of the categories do not have a valid mapping and a
+ * zero value if all of the categories are valid.
+ *
+ */
+static int cipso_v4_map_cat_enum_valid(const struct cipso_v4_doi *doi_def,
+  const unsigned char *enumcat,
+  u32 enumcat_len)
+{
+   u16 cat;
+   int cat_prev = -1;
+   u32 iter;
+
+   if (doi_def->type != CIPSO_V4_MAP_PASS || enumcat_len & 0x01)
+   return -EFAULT;
+
+   for (iter = 0; iter < enumcat_len; iter += 2) {
+   cat = ntohs(*((__be16 *)&enumcat[iter]));
+   if (cat <= cat_prev)
+   return -EFAULT;
+   cat_prev = cat;
+   }
+
+   return 0;
+}
+
+/**
+ * cipso_v4_map_cat_enum_hton - Perform a category mapping from host to network
+ * @doi_def: the DOI definition
+ * @secattr: the security attributes
+ * @net_cat: the zero'd out category list in network/CIPSO format
+ * @net_cat_len: the length of the CIPSO category list in bytes
+ *
+ * Description:
+ * Perform a label mapping to translate a local MLS category bitmap to the
+ * correct CIPSO category list using the given DOI definition.   Returns the
+ * size in bytes of the network category bitmap on success, negative values
+ * otherwise.
+ *
+ */
+static int cipso_v4_map_cat_enum_hton(const struct cipso_v4_doi *doi_def,
+ const struct netlbl_lsm_secattr *secattr,
+ unsigned char *net_cat,
+ u32 net_cat_len)
+{
+   int cat = -1;
+   u32 cat_iter = 0;
+
+   for (;;) {
+   cat = netlbl_secattr_catmap_walk(secattr->mls_cat, cat + 1);
+   if (cat < 0)
+   break;
+   if ((cat_iter + 2) > net_cat_len)
+   return -ENOSPC;
+
+   *((__be16 *)&net_cat[cat_iter]) = htons(cat);
+   cat_iter += 2;
+   }
+
+   return cat_iter;
+}
+
+/**
+ * cipso_v4_map_cat_enum_ntoh - Perform a category mapping from network to host
+ * @doi_def: the DOI definition
+ * @net_cat: the category list in network/CIPSO format
+ * @net_cat_len: the length of the CIPSO bitmap in bytes
+ * @secattr: the security attributes
+ *
+ * Description:
+ * Perform a label mapping to translate a CIPSO category list to the correct
+ * local MLS category bitmap using the given DOI definition.  Returns zero on
+ * success, negative values on failure.
+ *
+ */
+static int cipso_v4_map_cat_enum_ntoh(const struct cipso_v4_doi *doi_def,
+ const unsigned char *net_cat,
+ u32 net_cat_len,
+ struct netlbl_lsm_secattr *secattr)
+{
+   int ret_val;
+   u32 iter;
+
+   for (iter = 0; iter < net_cat_len; iter += 2) {
+   ret_val = netlbl_secattr_catmap_setbit(secattr->mls_cat,
+   ntohs(*((__be16 *)&net_cat[iter])),
+   GFP_ATOMIC);
+   if (ret_val != 0)
+   return ret_val;
+   }
+
+   return 0;
+}
+
 /*
  * Protocol Handling Functions
  */
@@ -1068,6 +1173,99 @@ static int cipso_v4_parsetag_rbm(const s
 }
 
 /**
+ * cipso_v4_gentag_enum - Generate a CIPSO enumerated tag (type #2)

[PATCH 1/3] NetLabel: convert to an extensibile/sparse category bitmap

2006-11-29 Thread paul . moore
From: Paul Moore <[EMAIL PROTECTED]>

The original NetLabel category bitmap was a straight char bitmap which worked
fine for the initial release as it only supported 240 bits due to limitations
in the CIPSO restricted bitmap tag (tag type 0x01).  This patch converts that
straight char bitmap into an extensible/sparse bitmap in order to lay the
foundation for other CIPSO tag types and protocols.

This patch also has a nice side effect in that all of the security attributes
passed by NetLabel into the LSM are now in a format which is in the host's
native byte/bit ordering which makes the LSM specific code much simpler; look
at the changes in security/selinux/ss/ebitmap.c as an example.

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
---
 include/net/netlabel.h |  102 
 net/ipv4/cipso_ipv4.c  |  170 ++
 net/netlabel/netlabel_kapi.c   |  201 +
 security/selinux/ss/ebitmap.c  |  196 +--
 security/selinux/ss/ebitmap.h  |   26 -
 security/selinux/ss/mls.c  |  156 ++-
 security/selinux/ss/mls.h  |   46 ++---
 security/selinux/ss/services.c |   23 +---
 8 files changed, 568 insertions(+), 352 deletions(-)

Index: net-2.6.20_netlabel-cats/include/net/netlabel.h
===
--- net-2.6.20_netlabel-cats.orig/include/net/netlabel.h
+++ net-2.6.20_netlabel-cats/include/net/netlabel.h
@@ -111,6 +111,22 @@ struct netlbl_lsm_cache {
void (*free) (const void *data);
void *data;
 };
+/* The catmap bitmap field MUST be a power of two in length and large
+ * enough to hold at least 240 bits.  Special care (i.e. check the code!)
+ * should be used when changing these values as the LSM implementation
+ * probably has functions which rely on the sizes of these types to speed
+ * processing. */
+#define NETLBL_CATMAP_MAPTYPE   u64
+#define NETLBL_CATMAP_MAPCNT4
+#define NETLBL_CATMAP_MAPSIZE   (sizeof(NETLBL_CATMAP_MAPTYPE) * 8)
+#define NETLBL_CATMAP_SIZE  (NETLBL_CATMAP_MAPSIZE * \
+NETLBL_CATMAP_MAPCNT)
+#define NETLBL_CATMAP_BIT   (NETLBL_CATMAP_MAPTYPE)0x01
+struct netlbl_lsm_secattr_catmap {
+   u32 startbit;
+   NETLBL_CATMAP_MAPTYPE bitmap[NETLBL_CATMAP_MAPCNT];
+   struct netlbl_lsm_secattr_catmap *next;
+};
 #define NETLBL_SECATTR_NONE 0x
 #define NETLBL_SECATTR_DOMAIN   0x0001
 #define NETLBL_SECATTR_CACHE0x0002
@@ -122,8 +138,7 @@ struct netlbl_lsm_secattr {
char *domain;
 
u32 mls_lvl;
-   unsigned char *mls_cat;
-   size_t mls_cat_len;
+   struct netlbl_lsm_secattr_catmap *mls_cat;
 
struct netlbl_lsm_cache *cache;
 };
@@ -171,6 +186,41 @@ static inline void netlbl_secattr_cache_
 }
 
 /**
+ * netlbl_secattr_catmap_alloc - Allocate a LSM secattr catmap
+ * @flags: memory allocation flags
+ *
+ * Description:
+ * Allocate memory for a LSM secattr catmap, returns a pointer on success, NULL
+ * on failure.
+ *
+ */
+static inline struct netlbl_lsm_secattr_catmap *netlbl_secattr_catmap_alloc(
+  gfp_t flags)
+{
+   return kzalloc(sizeof(struct netlbl_lsm_secattr_catmap), flags);
+}
+
+/**
+ * netlbl_secattr_catmap_free - Free a LSM secattr catmap
+ * @catmap: the category bitmap
+ *
+ * Description:
+ * Free a LSM secattr catmap.
+ *
+ */
+static inline void netlbl_secattr_catmap_free(
+ struct netlbl_lsm_secattr_catmap *catmap)
+{
+   struct netlbl_lsm_secattr_catmap *iter;
+
+   do {
+   iter = catmap;
+   catmap = catmap->next;
+   kfree(iter);
+   } while (catmap);
+}
+
+/**
  * netlbl_secattr_init - Initialize a netlbl_lsm_secattr struct
  * @secattr: the struct to initialize
  *
@@ -200,7 +250,8 @@ static inline void netlbl_secattr_destro
if (secattr->cache)
netlbl_secattr_cache_free(secattr->cache);
kfree(secattr->domain);
-   kfree(secattr->mls_cat);
+   if (secattr->mls_cat)
+   netlbl_secattr_catmap_free(secattr->mls_cat);
 }
 
 /**
@@ -231,6 +282,51 @@ static inline void netlbl_secattr_free(s
kfree(secattr);
 }
 
+#ifdef CONFIG_NETLABEL
+int netlbl_secattr_catmap_walk(struct netlbl_lsm_secattr_catmap *catmap,
+  u32 offset);
+int netlbl_secattr_catmap_walk_rng(struct netlbl_lsm_secattr_catmap *catmap,
+  u32 offset);
+int netlbl_secattr_catmap_setbit(struct netlbl_lsm_secattr_catmap *catmap,
+u32 bit,
+gfp_t flags);
+int netlbl_secattr_catmap_setrng(struct netlbl_lsm_secattr_catmap *catmap,
+u32 start,
+  
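
The archived patch is cut off above, before the implementations of the walk and setbit helpers. To show how a sparse bitmap of this shape can work, here is a userspace sketch built on the same node layout (a start bit, four 64-bit words, a next pointer). It is an assumption-laden illustration, not the NetLabel code: the catmap_* names are hypothetical, the set function takes a pointer to the list head and allocates with calloc(), and the walk returns -1 where the kernel would return an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CATMAP_MAPCNT   4
#define CATMAP_MAPSIZE  64u                               /* bits per word */
#define CATMAP_SIZE     (CATMAP_MAPSIZE * CATMAP_MAPCNT)  /* bits per node */

struct catmap {
	uint32_t startbit;               /* first bit covered by this node */
	uint64_t bitmap[CATMAP_MAPCNT];
	struct catmap *next;             /* nodes kept sorted by startbit */
};

/* Set one bit, allocating a node on demand. Returns 0, or -1 on ENOMEM. */
static int catmap_setbit(struct catmap **head, uint32_t bit)
{
	uint32_t start = bit - (bit % CATMAP_SIZE);
	struct catmap **link = head;
	struct catmap *node;

	assert(head != NULL);
	while (*link && (*link)->startbit < start)
		link = &(*link)->next;
	node = *link;
	if (!node || node->startbit != start) {
		node = calloc(1, sizeof(*node));
		if (!node)
			return -1;
		node->startbit = start;
		node->next = *link;
		*link = node;
	}
	bit -= start;
	node->bitmap[bit / CATMAP_MAPSIZE] |=
		(uint64_t)1 << (bit % CATMAP_MAPSIZE);
	return 0;
}

/* Return the first set bit at or after offset, or -1 if there is none. */
static long catmap_walk(const struct catmap *node, uint32_t offset)
{
	for (; node; node = node->next) {
		uint32_t bit;

		if (node->startbit + CATMAP_SIZE <= offset)
			continue;  /* node lies entirely below offset */
		bit = offset > node->startbit ? offset - node->startbit : 0;
		for (; bit < CATMAP_SIZE; bit++)
			if (node->bitmap[bit / CATMAP_MAPSIZE] &
			    ((uint64_t)1 << (bit % CATMAP_MAPSIZE)))
				return (long)node->startbit + bit;
	}
	return -1;
}

/* Free every node in the list; a NULL list is allowed. */
static void catmap_free(struct catmap *node)
{
	while (node) {
		struct catmap *next = node->next;

		free(node);
		node = next;
	}
}
```

Only ranges that actually contain set bits cost memory, which is what lets the category space grow past the 240 bits of the restricted bitmap tag without a large contiguous allocation.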

[PATCH 1/1] add auditing to ipsec

2006-11-29 Thread Joy Latten
This patch adds auditing to ipsec. 
An audit message occurs when an ipsec SA
or ipsec policy is created/deleted.

Patch was built against linux kernel 2.6.19-rc6.
Please let me know if this is acceptable. 

Regards,
Joy

Signed-off-by: Joy Latten <[EMAIL PROTECTED]>

---
diff -urpN linux-2.6.18.orig/include/linux/audit.h linux-2.6.18-patch/include/linux/audit.h
--- linux-2.6.18.orig/include/linux/audit.h 2006-11-27 11:21:16.0 -0600
+++ linux-2.6.18-patch/include/linux/audit.h    2006-11-27 12:28:43.0 -0600
@@ -101,6 +101,10 @@
 #define AUDIT_MAC_CIPSOV4_DEL  1408/* NetLabel: del CIPSOv4 DOI entry */
 #define AUDIT_MAC_MAP_ADD  1409/* NetLabel: add LSM domain mapping */
 #define AUDIT_MAC_MAP_DEL  1410/* NetLabel: del LSM domain mapping */
+#define AUDIT_MAC_IPSEC_ADDSA  1411/* Add a XFRM state */
+#define AUDIT_MAC_IPSEC_DELSA  1412/* Delete a XFRM state */
+#define AUDIT_MAC_IPSEC_ADDSPD 1413/* Add a XFRM policy */
+#define AUDIT_MAC_IPSEC_DELSPD 1414/* Delete a XFRM policy */
 
 #define AUDIT_FIRST_KERN_ANOM_MSG   1700
 #define AUDIT_LAST_KERN_ANOM_MSG1799
@@ -377,6 +381,7 @@ extern void auditsc_get_stamp(struct aud
  struct timespec *t, unsigned int *serial);
 extern int  audit_set_loginuid(struct task_struct *task, uid_t loginuid);
 extern uid_t audit_get_loginuid(struct audit_context *ctx);
+extern void audit_log_task_context(struct audit_buffer *ab);
 extern int __audit_ipc_obj(struct kern_ipc_perm *ipcp);
 extern int __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t gid, mode_t mode);
 extern int audit_bprm(struct linux_binprm *bprm);
@@ -449,6 +454,7 @@ extern int audit_n_rules;
 #define audit_inode_update(i) do { ; } while (0)
 #define auditsc_get_stamp(c,t,s) do { BUG(); } while (0)
 #define audit_get_loginuid(c) ({ -1; })
+#define audit_log_task_context(b) do { ; } while (0)
 #define audit_ipc_obj(i) ({ 0; })
 #define audit_ipc_set_perm(q,u,g,m) ({ 0; })
 #define audit_bprm(p) ({ 0; })
diff -urpN linux-2.6.18.orig/include/net/xfrm.h linux-2.6.18-patch/include/net/xfrm.h
--- linux-2.6.18.orig/include/net/xfrm.h    2006-11-27 11:21:43.0 -0600
+++ linux-2.6.18-patch/include/net/xfrm.h   2006-11-27 12:29:11.0 -0600
@@ -389,6 +389,15 @@ extern int xfrm_unregister_km(struct xfr
 
 extern unsigned int xfrm_policy_count[XFRM_POLICY_MAX*2];
 
+/* Audit Information */
+struct xfrm_audit
+{
+   uid_t   loginuid;
+   u32 secid;
+};
+void xfrm_audit_log(uid_t auid, u32 secid, int type, int result,
+   struct xfrm_policy *xp, struct xfrm_state *x);
+
 static inline void xfrm_pol_hold(struct xfrm_policy *policy)
 {
if (likely(policy != NULL))
@@ -934,7 +943,7 @@ static inline int xfrm_state_sort(struct
 #endif
 extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq);
 extern int xfrm_state_delete(struct xfrm_state *x);
-extern void xfrm_state_flush(u8 proto);
+extern void xfrm_state_flush(u8 proto, struct xfrm_audit *audit_info);
 extern int xfrm_replay_check(struct xfrm_state *x, __be32 seq);
 extern void xfrm_replay_advance(struct xfrm_state *x, __be32 seq);
 extern void xfrm_replay_notify(struct xfrm_state *x, int event);
@@ -987,13 +996,13 @@ struct xfrm_policy *xfrm_policy_bysel_ct
  struct xfrm_selector *sel,
  struct xfrm_sec_ctx *ctx, int delete);
 struct xfrm_policy *xfrm_policy_byid(u8, int dir, u32 id, int delete);
-void xfrm_policy_flush(u8 type);
+void xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info);
 u32 xfrm_get_acqseq(void);
 void xfrm_alloc_spi(struct xfrm_state *x, __be32 minspi, __be32 maxspi);
 struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, 
  xfrm_address_t *daddr, xfrm_address_t *saddr, 
  int create, unsigned short family);
-extern void xfrm_policy_flush(u8 type);
+extern void xfrm_policy_flush(u8 type, struct xfrm_audit *audit_info);
 extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol);
 extern int xfrm_bundle_ok(struct xfrm_policy *pol, struct xfrm_dst *xdst,
  struct flowi *fl, int family, int strict);
diff -urpN linux-2.6.18.orig/kernel/auditsc.c linux-2.6.18-patch/kernel/auditsc.c
--- linux-2.6.18.orig/kernel/auditsc.c  2006-11-27 11:19:36.0 -0600
+++ linux-2.6.18-patch/kernel/auditsc.c 2006-11-27 12:26:39.0 -0600
@@ -730,7 +730,7 @@ static inline void audit_free_context(st
printk(KERN_ERR "audit: freed %d contexts\n", count);
 }
 
-static void audit_log_task_context(struct audit_buffer *ab)
+void audit_log_task_context(struct audit_buffer *ab)
 {
char *ctx = NULL;
ssize_t len = 0;
@@ -759,6 +759,8 @@ error_path:
return;
 }
 
+EXPORT_SYMBOL(audit_log_task_context);
+
 sta

[SAA9730] Fix build error

2006-11-29 Thread Ralf Baechle
Confusingly NET_PCI is also set for non-PCI EISA configurations where
building this driver will result in a build error due to a reference to
pci_release_regions.

While at it, remove the EXPERIMENTAL - in all its ugliness and despite
the sincerest attempts of the buggy hardware the driver is known to work.
Also limit the driver to the Atlas board - the only known system to ever
use the SAA9730 before Philips ended the short life of the SAA9730.

Signed-off-by: Ralf Baechle <[EMAIL PROTECTED]>

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 4f22c8e..c80eb79 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1761,8 +1761,8 @@ config VIA_RHINE_NAPI
  information.
 
 config LAN_SAA9730
-   bool "Philips SAA9730 Ethernet support (EXPERIMENTAL)"
-   depends on NET_PCI && EXPERIMENTAL && MIPS
+   bool "Philips SAA9730 Ethernet support"
+   depends on NET_PCI && PCI && MIPS_ATLAS
help
  The SAA9730 is a combined multimedia and peripheral controller used
  in thin clients, Internet access terminals, and diskless
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] NetXen: 64-bit memory fixes and driver cleanup

2006-11-29 Thread Amit S. Kale
NetXen: 1G/10G Ethernet Driver updates
- Fixes for running the driver on machines with >4G memory
- Driver cleanup

Signed-off-by: Amit S. Kale <[EMAIL PROTECTED]>

 netxen_nic.h  |   41 ++
 netxen_nic_ethtool.c  |   19 ++--
 netxen_nic_hw.c   |   10 +-
 netxen_nic_hw.h   |4 
 netxen_nic_init.c |   51 +++-
 netxen_nic_isr.c  |3 
 netxen_nic_main.c |  204 +++---
 netxen_nic_phan_reg.h |   10 +-
 8 files changed, 293 insertions(+), 49 deletions(-)


diff --git a/drivers/net/netxen/netxen_nic.h b/drivers/net/netxen/netxen_nic.h
index 1bee560..84259f9 100644
--- a/drivers/net/netxen/netxen_nic.h
+++ b/drivers/net/netxen/netxen_nic.h
@@ -6,12 +6,12 @@
  * modify it under the terms of the GNU General Public License
  * as published by the Free Software Foundation; either version 2
  * of the License, or (at your option) any later version.
- *
+ *
  * This program is distributed in the hope that it will be useful, but
  * WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
- *   
+ *
  * You should have received a copy of the GNU General Public License
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place - Suite 330, Boston,
@@ -89,8 +89,8 @@
  * normalize a 64MB crb address to 32MB PCI window 
  * To use NETXEN_CRB_NORMALIZE, window _must_ be set to 1
  */
-#define NETXEN_CRB_NORMAL(reg)\
-   (reg) - NETXEN_CRB_PCIX_HOST2 + NETXEN_CRB_PCIX_HOST
+#define NETXEN_CRB_NORMAL(reg) \
+   ((reg) - NETXEN_CRB_PCIX_HOST2 + NETXEN_CRB_PCIX_HOST)
 
 #define NETXEN_CRB_NORMALIZE(adapter, reg) \
pci_base_offset(adapter, NETXEN_CRB_NORMAL(reg))
@@ -164,7 +164,7 @@ enum {
 
 #define MAX_CMD_DESCRIPTORS 1024
 #define MAX_RCV_DESCRIPTORS 32768
-#define MAX_JUMBO_RCV_DESCRIPTORS  1024
+#define MAX_JUMBO_RCV_DESCRIPTORS  4096
 #define MAX_RCVSTATUS_DESCRIPTORS  MAX_RCV_DESCRIPTORS
 #define MAX_JUMBO_RCV_DESC MAX_JUMBO_RCV_DESCRIPTORS
 #define MAX_RCV_DESC   MAX_RCV_DESCRIPTORS
@@ -592,6 +592,16 @@ struct netxen_skb_frag {
u32 length;
 };
 
+/* Bounce buffer index */
+struct bounce_index {
+   /* Index of a buffer */
+   unsigned buffer_index;
+   /* Offset inside the buffer */
+   unsigned buffer_offset;
+};
+
+#define IS_BOUNCE 0xcafebb
+
 /*Following defines are for the state of the buffers*/
 #define NETXEN_BUFFER_FREE  0
 #define NETXEN_BUFFER_BUSY  1
@@ -611,6 +621,8 @@ struct netxen_cmd_buffer {
unsigned long time_stamp;
u32 state;
u32 no_of_descriptors;
+   u32 tx_bounce_buff;
+   struct bounce_index bnext;
 };
 
 /* In rx_buffer, we do not need multiple fragments as is a single buffer */
@@ -619,6 +631,9 @@ struct netxen_rx_buffer {
u64 dma;
u16 ref_handle;
u16 state;
+   u32 rx_bounce_buff;
+   struct bounce_index bnext;
+   char *bounce_ptr;
 };
 
 /* Board types */
@@ -703,6 +718,7 @@ struct netxen_recv_context {
 };
 
 #define NETXEN_NIC_MSI_ENABLED 0x02
+#define NETXEN_DMA_MASK 0xfffe
 
 struct netxen_drvops;
 
@@ -937,9 +953,7 @@ static inline void netxen_nic_disable_in
/*
 * ISR_INT_MASK: Can be read from window 0 or 1.
 */
-   writel(0x7ff,
-  (void __iomem
-   *)(PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK)));
+   writel(0x7ff, PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK));
 
 }
 
@@ -959,14 +973,12 @@ static inline void netxen_nic_enable_int
break;
}
 
-   writel(mask,
-  (void __iomem
-   *)(PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK)));
+   writel(mask, PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_MASK));
 
if (!(adapter->flags & NETXEN_NIC_MSI_ENABLED)) {
mask = 0xbff;
-   writel(mask, (void __iomem *)
-  (PCI_OFFSET_SECOND_RANGE(adapter, ISR_INT_TARGET_MASK)));
+   writel(mask, PCI_OFFSET_SECOND_RANGE(adapter,
+ISR_INT_TARGET_MASK));
}
 }
 
@@ -1040,6 +1052,9 @@ static inline void get_brd_name_by_type(
 
 int netxen_is_flash_supported(struct netxen_adapter *adapter);
 int netxen_get_flash_mac_addr(struct netxen_adapter *adapter, u64 mac[]);
+int netxen_get_next_bounce_buffer(struct bounce_index *head,
+ struct bounce_index *tail,
+ struct bounce_index *biret, unsigned len);
 
 extern void netxen_change_ringparam(struct netxen_adapter *adapter);
 extern int netxen_rom_fast_read(struct netxen_adapter *adapter, int addr,
diff --git a/drivers/net/netxen/netxen_nic_ethtool.c
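
One hunk in netxen_nic.h above wraps the expansion of NETXEN_CRB_NORMAL in parentheses. The sketch below is only an illustration of why that matters, not driver code: the DEMO_* names and the base values are made up for the example, and the real NETXEN_CRB_PCIX_HOST/HOST2 values are not shown in this message.

```c
#include <assert.h>

/* Hypothetical window bases, chosen only to make the arithmetic visible. */
#define DEMO_HOST2 0x100
#define DEMO_HOST  0x010

/* Old shape: the expansion is not wrapped in parentheses. */
#define DEMO_NORMAL_OLD(reg) (reg) - DEMO_HOST2 + DEMO_HOST
/* New shape, as in the patch: the whole expansion is parenthesized. */
#define DEMO_NORMAL_NEW(reg) ((reg) - DEMO_HOST2 + DEMO_HOST)

/* Multiplying the result shows how operator precedence leaks in. */
static int scaled_old(int reg) { return 2 * DEMO_NORMAL_OLD(reg); }
static int scaled_new(int reg) { return 2 * DEMO_NORMAL_NEW(reg); }

/* The old macro scales only (reg); the new one scales the whole offset. */
static void demo_check(void)
{
	assert(scaled_new(0x120) == 0x60);
	assert(scaled_old(0x120) == 0x150);
}
```

Whether any existing caller actually combined the macro with a higher-precedence operator is not visible from the hunk; the parenthesized form is simply safe in either case.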

[PATCH 1/4] NetXen: Fixed /sys mapping between device and driver

2006-11-29 Thread Amit S. Kale
Signed-off-by: Amit S. Kale <[EMAIL PROTECTED]>

 netxen_nic_main.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)


diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
index 145bf47..a055208 100644
--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
@@ -273,6 +273,7 @@ netxen_nic_probe(struct pci_dev *pdev, c
}
 
SET_MODULE_OWNER(netdev);
+   SET_NETDEV_DEV(netdev, &pdev->dev);
 
port = netdev_priv(netdev);
port->netdev = netdev;
@@ -1043,7 +1044,7 @@ static int netxen_nic_poll(struct net_de
netxen_nic_enable_int(adapter);
}
 
-   return (done ? 0 : 1);
+   return !done;
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER


[PATCH 0/4] NetXen: 1G/10G Ethernet Driver updates

2006-11-29 Thread Amit S. Kale

I will be sending NetXen: 1G/10G Ethernet Driver updates in subsequent emails.

Thanks,
--Amit


Re: [PATCH] d80211: Reset assoc and auth retry counters

2006-11-29 Thread John W. Linville
On Wed, Nov 29, 2006 at 03:33:07PM +0100, Jiri Benc wrote:
> On Wed, 29 Nov 2006 15:27:06 +0100, Ivo Van Doorn wrote:
> > Shouldn't this last one be:
> > ieee80211_set_disassoc(dev, ifsta, 0)
> > 
> > This one is called from the IOCTL request to disassociate,
> > so the interface should still be authenticated (with a valid
> > auth retry counter).
> 
> Yes, of course. Thanks for being watchful :-)

I'll massage this and apply it on top of wireless-dev, since I already
applied Ivo's patch.

John
-- 
John W. Linville
[EMAIL PROTECTED]


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Buesch
On Wednesday 29 November 2006 16:58, Michael Renzmann wrote:
> Hi.
> 
> > On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> >> There is absolutely no reason why dadwifi can't be merged into the
> >> mainline once the hal issue is resolved.
> > Last time we talked about that stuff, it was decided that
> > we don't want a HAL... See archives.
> 
> IIRC Pavel already explained that getting rid of the HAL per se should be
> no problem - it could easily be dissolved into the driver, if that is one
> of the requirements to be fulfilled before the driver (MadWifi or DadWifi)
> is considered for mainline inclusion. As soon as there is source available
> to dissolve, at least.

Ok, so who actually does the work?
There has been a lot of talk about what could and what should be done.
But who does it?

> From what I understood the "... once the hal issue is resolved" part of
> David's mail referred to exactly that question.

Ok, I don't know what "The HAL Issue" (tm) is.
Sounds like a hollywood movie theme to me. ;)

-- 
Greetings Michael.


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread David Kimdon
On Wed, Nov 29, 2006 at 04:38:56PM +0100, Michael Buesch wrote:
> On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> > On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > Why do you say that?
> > 
> > There is absolutely no reason why dadwifi can't be merged into the
> > mainline once the hal issue is resolved. 
> 
> Last time we talked about that stuff, it was decided that
> we don't want a HAL... See archives.

To be clear, that is all part of the hal issue that needs to be
resolved.  Removing the hal abstraction is not difficult for an
interested party once source for the hal is available.  The next step
in such an effort would be to add an open hal to dadwifi, IMO.

-David

P.S. Actually, it isn't clear to me that removing the hal entirely is
a good idea.  Abstractions exist for practical reasons.  The hal
allows dadwifi to support a variety of Atheros chips without needing
to worry about the specific details of each chip.


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Renzmann
Hi.

> On Wednesday 29 November 2006 16:24, David Kimdon wrote:
>> There is absolutely no reason why dadwifi can't be merged into the
>> mainline once the hal issue is resolved.
> Last time we talked about that stuff, it was decided that
> we don't want a HAL... See archives.

IIRC Pavel already explained that getting rid of the HAL per se should be
no problem - it could easily be dissolved into the driver, if that is one
of the requirements to be fulfilled before the driver (MadWifi or DadWifi)
is considered for mainline inclusion. As soon as there is source available
to dissolve, at least.

From what I understood the "... once the hal issue is resolved" part of
David's mail referred to exactly that question.

Bye, Mike





Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Daniel Drake
On Wed, 2006-11-29 at 15:55 +0200, Nick Kossifidis wrote:
> I've already ported ar5k to Linux and it works with madwifi versions
> before the bsd-head merge; you can find more info here:
> http://madwifi.org/wiki/OpenHAL
> 
> If I can help in any way, feel free to mail ;-)

Thanks, I'm trying it out to see whether it works on my hardware. I
compiled and loaded everything OK, ath0 appears, but it doesn't seem to
be working.

I'm using these commands:

ifconfig ath0 up
iwlist ath0 scan

Should that produce scan results, or do I need to use some weird tools
to do that? (this is my first interaction with the madwifi-old driver)

Currently it pauses for a while and then doesn't present any results.

Thanks!

-- 
Daniel Drake
Brontes Technologies, A 3M Company



Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Buesch
On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Good luck then ;-)
> > > 
> > > If anyone wants to help make ar5k work with newer madwifi
> > > versions and fix bugs etc. (that'll also help BSD people), please mail me.
> > > We can make it better.
> > > 
> > > Nick
> > > P.S. Why not work on dadwifi?
> > 
> > Because it won't be merged mainline either.
> 
> Why do you say that?
> 
> There is absolutely no reason why dadwifi can't be merged into the
> mainline once the hal issue is resolved. 

Ok, I deleted my repository.
Atheros stuff is really too frustrating to work on and
I don't have the time anyway.
If you believe dadwifi can be merged, please _do_ so.

-- 
Greetings Michael.


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Buesch
On Wednesday 29 November 2006 16:24, David Kimdon wrote:
> On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Good luck then ;-)
> > > 
> > > If anyone wants to help make ar5k work with newer madwifi
> > > versions and fix bugs etc. (that'll also help BSD people), please mail me.
> > > We can make it better.
> > > 
> > > Nick
> > > P.S. Why not work on dadwifi?
> > 
> > Because it won't be merged mainline either.
> 
> Why do you say that?
> 
> There is absolutely no reason why dadwifi can't be merged into the
> mainline once the hal issue is resolved. 

Last time we talked about that stuff, it was decided that
we don't want a HAL... See archives.

-- 
Greetings Michael.


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread David Kimdon
On Wed, Nov 29, 2006 at 10:21:09AM -0500, Dan Williams wrote:
> On Wed, 2006-11-29 at 16:12 +0100, Michael Buesch wrote:
> > On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > > Good luck then ;-)
> > > 
> > > If anyone wants to help make ar5k work with newer madwifi
> > > versions and fix bugs etc. (that'll also help BSD people), please mail me.
> > > We can make it better.
> > > 
> > > Nick
> > > P.S. Why not work on dadwifi?
> > 
> > Because it won't be merged mainline either.
> 
> I thought dadwifi was supposed to replace net80211 with d80211 (but not
> replace the binary HAL). 

yes

>  Aren't the two things complementary, 

yes

> or did
> you just decide that starting from scratch would produce a less crufty,
> better understood, better-d80211 integrated driver?

well, dadwifi will be (is) well integrated with d80211.  As far as
cruft goes, I'd rather call it historical artifacts :-)  We are doing
our best to minimize cruft while standing on the shoulders of madwifi.

-David



Re: [PATCH] [IPVS] transparent proxying

2006-11-29 Thread Wensong Zhang


Hi Horms,

I think this patch makes the IPVS code somewhat more complicated and
packet traversal less efficient.


If I remember correctly, policy-based routing has worked with IPVS in
kernels 2.2 and 2.4 for transparent cache clusters for a long time. It
should work in kernel 2.6 too.


For example, we can use iptables/ipchains to mark all web traffic with 
fwmark 1, then use policy-based routing to route all web traffic through 
NF_IP_LOCAL_IN, so that ip_vs_in can capture the packets and load 
balance packets to cache servers.

ip rule add prio 100 fwmark 1 table 100
ip route add local 0/0 dev lo table 100

ipvsadm -A -f 1 -s wlc
ipvsadm -a -f 1 -w 100 -r cache1
ipvsadm -a -f 1 -w 100 -r cache2
ipvsadm -a -f 1 -w 100 -r cache3

...

Cheers,

Wensong

Horms wrote:

> This seems to be a pretty clean solution to a real problem.
>
> Ultimately I would like to see IPVS move into the forward chain.
> This seems to be a nice way to explore that, without breaking
> any existing setups.




Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread David Kimdon
On Wed, Nov 29, 2006 at 04:12:33PM +0100, Michael Buesch wrote:
> On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > Good luck then ;-)
> > 
> > If anyone wants to help make ar5k work with newer madwifi
> > versions and fix bugs etc. (that'll also help BSD people), please mail me.
> > We can make it better.
> > 
> > Nick
> > P.S. Why not work on dadwifi?
> 
> Because it won't be merged mainline either.

Why do you say that?

There is absolutely no reason why dadwifi can't be merged into the
mainline once the hal issue is resolved. 

-David


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Dan Williams
On Wed, 2006-11-29 at 16:12 +0100, Michael Buesch wrote:
> On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> > Good luck then ;-)
> > 
> > If anyone wants to help make ar5k work with newer madwifi
> > versions and fix bugs etc. (that'll also help BSD people), please mail me.
> > We can make it better.
> > 
> > Nick
> > P.S. Why not work on dadwifi?
> 
> Because it won't be merged mainline either.

I thought dadwifi was supposed to replace net80211 with d80211 (but not
replace the binary HAL).  Aren't the two things complementary, or did
you just decide that starting from scratch would produce a less crufty,
better understood, better-d80211 integrated driver?

Dan



Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-29 Thread Krzysztof Halasa
Krzysztof Halasa <[EMAIL PROTECTED]> writes:

> I wound't care less btw.

s/wound/couldn/, eh those foreign languages...
-- 
Krzysztof Halasa


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Buesch
On Wednesday 29 November 2006 15:34, Nick Kossifidis wrote:
> Good luck then ;-)
> 
> If anyone wants to help make ar5k work with newer madwifi
> versions and fix bugs etc. (that'll also help BSD people), please mail me.
> We can make it better.
> 
> Nick
> P.S. Why not work on dadwifi?

Because it won't be merged mainline either.

-- 
Greetings Michael.


Re: Broken commit: [NETFILTER]: ipt_REJECT: remove largely duplicate route_reverse function

2006-11-29 Thread Krzysztof Halasa
Jarek Poplawski <[EMAIL PROTECTED]> writes:

> And if we talk about names:
>
> + Spotted by Krzysztof Halasa.

I wound't care less btw.
-- 
Krzysztof Halasa


Re: [PATCH] [IPVS] transparent proxying

2006-11-29 Thread Horms
On Wed, Nov 29, 2006 at 03:15:23PM +0100, Thomas Graf wrote:
> * Horms <[EMAIL PROTECTED]> 2006-11-29 15:21
> > This seems to be a pretty clean solution to a real problem.
> > 
> > Ultimately I would like to see IPVS move into the forward chain.
> > This seems to be a nice way to explore that, without breaking
> > any existing setups.
> > 
> > -- 
> > Horms
> >   H: http://www.vergenet.net/~horms/
> >   W: http://www.valinux.co.jp/en/
> > 
> > [IPVS] transparent proxying
> > 
> > Patch from Jinhua Luo <[EMAIL PROTECTED]> to allow a web cluster to use
> > transparent proxying. It works by simply grabbing packets that have the
> > fwmark set and have not already been processed by ipvs (ip_vs_out) and
> > throwing them into ip_vs_in.
> > 
> > See: 
> > http://archive.linuxvirtualserver.org/html/lvs-users/2006-11/msg00261.html
> > 
> > Normally LVS packets are processed by ip_vs_in on the INPUT chain,
> > and packets that are processed in this way never show up on the FORWARD
> > chain, so they won't hit this rule.
> > 
> > This patch seems like a good precursor to moving LVS permanently to
> > the FORWARD chain, as I'm struggling to think how it could break things.
> > 
> > The changes to the original patch are:
> > 
> > * Reformatted to use tabs for indentation (instead of 4 spaces)
> > * Reformatted to be < 80 columns wide
> > * Added some comments
> > * Rewrote description (this text)
> > 
> > Signed-off-by: Simon Horman <[EMAIL PROTECTED]>
> > Signed-off-by: Jinhua Luo <[EMAIL PROTECTED]>
> > 
> > Index: linux-2.6/net/ipv4/ipvs/ip_vs_core.c
> > ===
> > --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_core.c   2006-11-28 15:30:00.0 +0900
> > +++ linux-2.6/net/ipv4/ipvs/ip_vs_core.c        2006-11-29 10:27:49.0 +0900
> > @@ -23,7 +23,9 @@
> >   * Changes:
> >   * Paul `Rusty' Russell    properly handle non-linear skbs
> >   * Harald Welte            don't use nfcache
> > - *
> > + * Jinhua Luo  redirect packets with fwmark on
> > + * NF_IP_FORWARD chain to ip_vs_in(),
> > + * mainly for transparent cache cluster
> >   */
> >  
> >  #include 
> > @@ -1070,6 +1072,26 @@
> > return ip_vs_in_icmp(pskb, &r, hooknum);
> >  }
> >  
> > +/*
> > + * This is hooked into the NF_IP_FORWARD. It catches
> > + * packets that have not already been handled by ipvs (out)
> > + * and have a fwmark set. This is to allow transparent proxying
> > + * of fwmark virtual services.
> > + *
> > + * It will not process packets that are handled by ipvs (in)
> > + * as they never traverse the NF_IP_FORWARD.
> > + */
> > +static unsigned int
> > +ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
> > + const struct net_device *in,
> > + const struct net_device *out,
> > + int (*okfn)(struct sk_buff *))
> > +{
> > +   if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
> > +   return NF_ACCEPT;
> 
> This patch seems to be based on an old tree; I've renamed nfmark
> to mark in net-2.6.20. The terms fwmark and nfmark shouldn't be
> used anymore.

Sorry, I based this patch on Linus's tree. I'll port it to net-2.6.20.

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/



Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Nick Kossifidis

Good luck then ;-)

If anyone wants to help make ar5k work with newer madwifi
versions and fix bugs etc. (that'll also help BSD people), please mail me.
We can make it better.

Nick
P.S. Why not work on dadwifi?

2006/11/29, Michael Buesch <[EMAIL PROTECTED]>:

> On Wednesday 29 November 2006 14:55, Nick Kossifidis wrote:
> > I've already ported ar5k to Linux and it works with madwifi versions
>
> No, you misunderstood me.
> Madwifi is not a native driver and will never be accepted into
> mainline. My aim is to write a native d80211 driver based
> on the ar5k sources. I don't have much time at the moment, so
> it hasn't progressed very far, but I'm on vacation from work
> starting next week, so I think I can work on it again.
>
> --
> Greetings Michael.




Re: [PATCH] d80211: Reset assoc and auth retry counters

2006-11-29 Thread Jiri Benc
On Wed, 29 Nov 2006 15:27:06 +0100, Ivo Van Doorn wrote:
> Shouldn't this last one be:
> ieee80211_set_disassoc(dev, ifsta, 0)
> 
> This one is called from the IOCTL request to disassociate,
> so the interface should still be authenticated (with a valid
> auth retry counter).

Yes, of course. Thanks for being watchful :-)

 Jiri

-- 
Jiri Benc
SUSE Labs


Re: [PATCH] d80211: Reset assoc and auth retry counters

2006-11-29 Thread Ivo Van Doorn

On 11/29/06, Jiri Benc <[EMAIL PROTECTED]> wrote:

> On Tue, 28 Nov 2006 20:56:05 +0100, Ivo van Doorn wrote:
> > After a successful authentication and association the matching retry
> > counter must be reset to 0.
> > Failure to do so will result in failure to authenticate after the
> > interface has been deauthenticated. This does not always happen after
> > the first deauthentication, but after the interface has been
> > deauthenticated several times it will refuse to authenticate.
>
> Thanks for spotting this, but your fix makes statistics about
> authentication/association exported via sysfs useless. The counters
> should be reset before a new authentication/association attempt (as is
> done in ieee80211_sta_new_auth).

Sounds good to me, I was unsure where those counters should be reset anyway. :)


> I think this is a more correct fix:
> @@ -2858,7 +2866,7 @@ int ieee80211_sta_deauthenticate(struct
> return -EINVAL;
>
> ieee80211_send_deauth(dev, ifsta, reason);
> -   ieee80211_set_associated(dev, ifsta, 0);
> +   ieee80211_set_disassoc(dev, ifsta, 1);
> return 0;
>  }
>
> @@ -2878,6 +2886,6 @@ int ieee80211_sta_disassociate(struct ne
> return -1;
>
> ieee80211_send_disassoc(dev, ifsta, reason);
> -   ieee80211_set_associated(dev, ifsta, 0);
> +   ieee80211_set_disassoc(dev, ifsta, 1);
> return 0;
>  }



Shouldn't this last one be:
ieee80211_set_disassoc(dev, ifsta, 0)

This one is called from the IOCTL request to disassociate,
so the interface should still be authenticated (with a valid
auth retry counter).

Ivo


Re: [PATCH] [IPVS] transparent proxying

2006-11-29 Thread Thomas Graf
* Horms <[EMAIL PROTECTED]> 2006-11-29 15:21
> This seems to be a pretty clean solution to a real problem.
> 
> Ultimately I would like to see IPVS move into the forward chain.
> This seems to be a nice way to explore that, without breaking
> any existing setups.
> 
> -- 
> Horms
>   H: http://www.vergenet.net/~horms/
>   W: http://www.valinux.co.jp/en/
> 
> [IPVS] transparent proxying
> 
> Patch from Jinhua Luo <[EMAIL PROTECTED]> to allow a web cluster to use
> transparent proxying. It works by simply grabbing packets that have the
> fwmark set and have not already been processed by ipvs (ip_vs_out) and
> throwing them into ip_vs_in.
> 
> See: 
> http://archive.linuxvirtualserver.org/html/lvs-users/2006-11/msg00261.html
> 
> Normally LVS packets are processed by ip_vs_in on the INPUT chain,
> and packets that are processed in this way never show up on the FORWARD
> chain, so they won't hit this rule.
> 
> This patch seems like a good precursor to moving LVS permanently to
> the FORWARD chain, as I'm struggling to think how it could break things.
> 
> The changes to the original patch are:
> 
> * Reformatted to use tabs for indentation (instead of 4 spaces)
> * Reformatted to be < 80 columns wide
> * Added some comments
> * Rewrote description (this text)
> 
> Signed-off-by: Simon Horman <[EMAIL PROTECTED]>
> Signed-off-by: Jinhua Luo <[EMAIL PROTECTED]>
> 
> Index: linux-2.6/net/ipv4/ipvs/ip_vs_core.c
> ===
> --- linux-2.6.orig/net/ipv4/ipvs/ip_vs_core.c 2006-11-28 15:30:00.0 +0900
> +++ linux-2.6/net/ipv4/ipvs/ip_vs_core.c      2006-11-29 10:27:49.0 +0900
> @@ -23,7 +23,9 @@
>   * Changes:
>   *   Paul `Rusty' Russell    properly handle non-linear skbs
>   *   Harald Welte            don't use nfcache
> - *
> + *   Jinhua Luo  redirect packets with fwmark on
> + *   NF_IP_FORWARD chain to ip_vs_in(),
> + *   mainly for transparent cache cluster
>   */
>  
>  #include 
> @@ -1070,6 +1072,26 @@
>   return ip_vs_in_icmp(pskb, &r, hooknum);
>  }
>  
> +/*
> + *   This is hooked into the NF_IP_FORWARD. It catches
> + *   packets that have not already been handled by ipvs (out)
> + *   and have a fwmark set. This is to allow transparent proxying
> + *   of fwmark virtual services.
> + *
> + *   It will not process packets that are handled by ipvs (in)
> + *   as they never traverse the NF_IP_FORWARD.
> + */
> +static unsigned int
> +ip_vs_forward_with_fwmark(unsigned int hooknum, struct sk_buff **pskb,
> +   const struct net_device *in,
> +   const struct net_device *out,
> +   int (*okfn)(struct sk_buff *))
> +{
> + if ((*pskb)->ipvs_property || ! (*pskb)->nfmark)
> + return NF_ACCEPT;

This patch seems to be based on an old tree; I've renamed nfmark
to mark in net-2.6.20. The terms fwmark and nfmark shouldn't be
used anymore.
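
For readers following the thread, here is a rough userspace sketch of the decision the quoted hook makes, written with the field called mark as in net-2.6.20. It is an assumption-laden illustration, not the patch itself: struct demo_pkt and the DEMO_* verdicts are invented stand-ins for struct sk_buff and the netfilter return values, and the hand-off to ip_vs_in is taken from the patch description rather than from code shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct sk_buff: only the two fields the check reads. */
struct demo_pkt {
	bool ipvs_property;	/* packet was already handled by IPVS */
	uint32_t mark;		/* called nfmark in trees before net-2.6.20 */
};

enum demo_verdict { DEMO_PASS, DEMO_HAND_TO_IPVS };

/* Mirrors the condition in the quoted hook: skip packets IPVS has
 * already seen and packets that carry no mark; everything else would
 * be handed on to ip_vs_in according to the patch description. */
static enum demo_verdict forward_hook_decision(const struct demo_pkt *p)
{
	assert(p != NULL);
	if (p->ipvs_property || !p->mark)
		return DEMO_PASS;
	return DEMO_HAND_TO_IPVS;
}
```

The rename only changes the field name read in the condition; the decision logic is the same either way.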


Re: [PATCH] d80211: Reset assoc and auth retry counters

2006-11-29 Thread Jiri Benc
On Tue, 28 Nov 2006 20:56:05 +0100, Ivo van Doorn wrote:
> After a successful authentication and association the matching retry counter
> must be reset to 0.
> Failure to do so will result in failure to authenticate after the interface
> has been deauthenticated. This does not always happen after the first
> deauthentication, but after the interface has been deauthenticated
> several times it will refuse to authenticate.

Thanks for spotting this, but your fix makes statistics about
authentication/association exported via sysfs useless. The counters
should be reset before a new authentication/association attempt (as is
done in ieee80211_sta_new_auth).

I think this is a more correct fix:

Signed-off-by: Jiri Benc <[EMAIL PROTECTED]>

---
 net/d80211/ieee80211_sta.c |   18 +-
 1 files changed, 13 insertions(+), 5 deletions(-)

--- dscape.orig/net/d80211/ieee80211_sta.c
+++ dscape/net/d80211/ieee80211_sta.c
@@ -382,6 +382,14 @@ static void ieee80211_set_associated(str
ifsta->last_probe = jiffies;
 }
 
+static void ieee80211_set_disassoc(struct net_device *dev,
+  struct ieee80211_if_sta *ifsta, int deauth)
+{
+   if (deauth)
+   ifsta->auth_tries = 0;
+   ifsta->assoc_tries = 0;
+   ieee80211_set_associated(dev, ifsta, 0);
+}
 
 static void ieee80211_sta_tx(struct net_device *dev, struct sk_buff *skb,
 int encrypt, int probe_resp)
@@ -1023,7 +1031,7 @@ static void ieee80211_rx_mgmt_deauth(str
  IEEE80211_RETRY_AUTH_INTERVAL);
}
 
-   ieee80211_set_associated(dev, ifsta, 0);
+   ieee80211_set_disassoc(dev, ifsta, 1);
ifsta->authenticated = 0;
 }
 
@@ -1066,7 +1074,7 @@ static void ieee80211_rx_mgmt_disassoc(s
  IEEE80211_RETRY_AUTH_INTERVAL);
}
 
-   ieee80211_set_associated(dev, ifsta, 0);
+   ieee80211_set_disassoc(dev, ifsta, 0);
 }
 
 
@@ -1882,7 +1890,7 @@ void ieee80211_sta_work(void *ptr)
   "mixed-cell disabled - disassociate\n", dev->name);
 
ieee80211_send_disassoc(dev, ifsta, WLAN_REASON_UNSPECIFIED);
-   ieee80211_set_associated(dev, ifsta, 0);
+   ieee80211_set_disassoc(dev, ifsta, 0);
}
 }
 
@@ -2858,7 +2866,7 @@ int ieee80211_sta_deauthenticate(struct 
return -EINVAL;
 
ieee80211_send_deauth(dev, ifsta, reason);
-   ieee80211_set_associated(dev, ifsta, 0);
+   ieee80211_set_disassoc(dev, ifsta, 1);
return 0;
 }
 
@@ -2878,6 +2886,6 @@ int ieee80211_sta_disassociate(struct ne
return -1;
 
ieee80211_send_disassoc(dev, ifsta, reason);
-   ieee80211_set_associated(dev, ifsta, 0);
+   ieee80211_set_disassoc(dev, ifsta, 1);
return 0;
 }


-- 
Jiri Benc
SUSE Labs
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
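
The reset logic in the patch above can be tried outside the kernel with a small stand-alone sketch. The struct sta_state type and the sta_set_disassoc() name are hypothetical stand-ins for struct ieee80211_if_sta and ieee80211_set_disassoc(); only the two counters discussed in the thread are modelled, and nothing here is the actual d80211 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the station state. */
struct sta_state {
	int auth_tries;   /* authentication retry counter */
	int assoc_tries;  /* association retry counter */
	int associated;   /* non-zero while associated */
};

/*
 * Same shape as the helper added by the patch: the association
 * counter is always cleared, the authentication counter only when
 * the link was torn down by a deauthentication.
 */
static void sta_set_disassoc(struct sta_state *s, int deauth)
{
	assert(s != NULL);

	if (deauth)
		s->auth_tries = 0;
	s->assoc_tries = 0;
	s->associated = 0;
}
```

With deauth == 0 the authentication counter is left alone, so it keeps whatever value the last authentication attempt left behind; with deauth == 1 both counters start again from zero.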


Re: [Madwifi-devel] ar5k and Atheros AR5005G

2006-11-29 Thread Michael Buesch
On Wednesday 29 November 2006 14:55, Nick Kossifidis wrote:
> I've already ported ar5k to linux and it works with madwifi versions

No, you misunderstood me.
Madwifi is not a native driver and will never be accepted into
mainline. My aim is to write a native d80211 driver based
on the ar5k sources. Currently I don't have much time, so
it hasn't progressed very far, but from next week on I have vacation
from work, so I think I can work on this again.

-- 
Greetings Michael.


[DECNet] fib: Fix out of bound access of fib_props[]

2006-11-29 Thread Thomas Graf
Fixes a typo which caused fib_props[] to have the wrong size
and makes sure the value used to index the array, which is
provided by userspace via netlink, is checked to avoid
out-of-bounds access.

Signed-off-by: Thomas Graf <[EMAIL PROTECTED]>

Index: net-2.6/net/decnet/dn_fib.c
===
--- net-2.6.orig/net/decnet/dn_fib.c2006-11-29 13:35:51.0 +0100
+++ net-2.6/net/decnet/dn_fib.c 2006-11-29 13:36:17.0 +0100
@@ -63,7 +63,7 @@
 {
int error;
u8 scope;
-} dn_fib_props[RTA_MAX+1] = {
+} dn_fib_props[RTN_MAX+1] = {
[RTN_UNSPEC] =  { .error = 0,   .scope = RT_SCOPE_NOWHERE },
[RTN_UNICAST] = { .error = 0,   .scope = RT_SCOPE_UNIVERSE },
[RTN_LOCAL] =   { .error = 0,   .scope = RT_SCOPE_HOST },
@@ -276,6 +276,9 @@
struct dn_fib_info *ofi;
int nhs = 1;
 
+   if (r->rtm_type > RTN_MAX)
+       goto err_inval;
+
if (dn_fib_props[r->rtm_type].scope > r->rtm_scope)
goto err_inval;
 
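
Both halves of the fix (sizing the table by the maximum index, and validating the index before use) can be illustrated with a minimal user-space sketch. All names here (TYPE_*, props_table, lookup_scope) are hypothetical and only mimic the pattern; they are not the DECnet code, and the scope values are placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical route types; TYPE_MAX plays the role of RTN_MAX. */
enum { TYPE_UNSPEC, TYPE_UNICAST, TYPE_LOCAL, TYPE_COUNT };
#define TYPE_MAX (TYPE_COUNT - 1)

struct props {
	int error;
	unsigned char scope;
};

/* The table is sized by the same constant that bounds the index. */
static const struct props props_table[TYPE_MAX + 1] = {
	[TYPE_UNSPEC]  = { .error = 0, .scope = 255 },
	[TYPE_UNICAST] = { .error = 0, .scope = 0 },
	[TYPE_LOCAL]   = { .error = 0, .scope = 254 },
};

/*
 * Look up the scope for an untrusted type value.
 * Returns 0 and stores the scope on success, -1 if the
 * index would fall outside the table.
 */
static int lookup_scope(unsigned int type, unsigned char *scope)
{
	assert(scope != NULL);

	if (type > TYPE_MAX)
		return -1;

	*scope = props_table[type].scope;
	return 0;
}
```

The check has to come before the array access; reading props_table[type] first and validating afterwards would already be undefined behaviour for an out-of-range value.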

