Re: [pktgen script v2 0/2] Add a pktgen sample script of NUMA awareness

2017-09-22 Thread Jesper Dangaard Brouer
On Mon, 18 Sep 2017 11:06:21 +0200
Jesper Dangaard Brouer  wrote:

> On Sun, 17 Sep 2017 20:36:36 +0800 Robert Hoo  
> wrote:
> 
> > Change log
> > v2:
> >  Rebased to 
> > https://github.com/netoptimizer/network-testing/tree/master/pktgen  
> 
> Hi Robert,
> 
> Thank you for submitting this against my git tree[1]. I skimmed the
> patches and they looked okay.  I'll give them a test run, before I
> accept them into my tree.
> 
> Later I'll synchronize my pktgen scripts/git-tree with the kernel via
> regular patches against DaveM's net-next tree[2] (and I'll try to
> remember to give you author credit).
> 
> [1] https://github.com/netoptimizer/network-testing/tree/master/pktgen
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/samples/pktgen

FYI, I've applied and pushed these patches to my tree.
 https://github.com/netoptimizer/network-testing/commits?author=robert-hoo
 https://github.com/netoptimizer/network-testing/commit/1b9b4b797a4f112
 https://github.com/netoptimizer/network-testing/commit/65efc2352f63dde
 https://github.com/netoptimizer/network-testing/commit/54eb5178aaf4031

I fixed the description a bit, and only made one simple change:
 https://github.com/netoptimizer/network-testing/commit/9ff58568b3f8c91

Thanks for working on improving the pktgen scripts :-)
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [pktgen script v2 0/2] Add a pktgen sample script of NUMA awareness

2017-09-18 Thread Jesper Dangaard Brouer
On Sun, 17 Sep 2017 20:36:36 +0800 Robert Hoo  wrote:

> Change log
> v2:
>  Rebased to https://github.com/netoptimizer/network-testing/tree/master/pktgen

Hi Robert,

Thank you for submitting this against my git tree[1]. I skimmed the
patches and they looked okay.  I'll give them a test run, before I
accept them into my tree.

Later I'll synchronize my pktgen scripts/git-tree with the kernel via
regular patches against DaveM's net-next tree[2] (and I'll try to
remember to give you author credit).

[1] https://github.com/netoptimizer/network-testing/tree/master/pktgen
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/samples/pktgen
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: pktgen question

2007-10-08 Thread David Miller
From: Steve Wise <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 17:46:26 -0500

> We're talking about just for pktgen...eh?

My bad, I'm happy to review a patch that uses the
skb->destructor in pktgen to achieve this.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-10-08 Thread Steve Wise



David Miller wrote:

From: Ben Greear <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 14:57:13 -0700


This skb recycling can certainly work and has been done several
times before.  It never gets into the kernel though.


Because it doesn't work.

A socket can hang onto a receive packet essentially forever.

You cannot therefore rely upon the network stack to give you the
packet back in some reasonable finite amount of time.  This is simply
the nature of the beast.

Which means that you either:

1) Starve and stop receiving packets when the recycling ring
   runs out because all of those packets are stuck in socket
   buffers.  This is easily DoS'able by users on your system

2) End up allocating new packets anyways, but then what's the
   point of the recycling ring?  It's just a hack that works
   when everything goes as planned, and in real life that is
   rarely the case.

It also makes the driver locking more complicated.  Packet
input occurs in NAPI drivers via netif_receive_skb() which
would be capable of recycling packets back to the same
driver in a recycling scheme.  But the recycling can occur
from other contexts too.  The netif_receive_skb() caller
already has atomic access to the receive queue, but those
other callback cases do not.

That locking issue is just the tip of the iceberg.  Once you
start discussing solutions, all sorts of new issues begin to
pop up.

SKB recycling, just say no.



We're talking about just for pktgen...eh?



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-10-08 Thread David Miller
From: Ben Greear <[EMAIL PROTECTED]>
Date: Mon, 08 Oct 2007 14:57:13 -0700

> This skb recycling can certainly work and has been done several
> times before.  It never gets into the kernel though.

Because it doesn't work.

A socket can hang onto a receive packet essentially forever.

You cannot therefore rely upon the network stack to give you the
packet back in some reasonable finite amount of time.  This is simply
the nature of the beast.

Which means that you either:

1) Starve and stop receiving packets when the recycling ring
   runs out because all of those packets are stuck in socket
   buffers.  This is easily DoS'able by users on your system

2) End up allocating new packets anyways, but then what's the
   point of the recycling ring?  It's just a hack that works
   when everything goes as planned, and in real life that is
   rarely the case.

It also makes the driver locking more complicated.  Packet
input occurs in NAPI drivers via netif_receive_skb() which
would be capable of recycling packets back to the same
driver in a recycling scheme.  But the recycling can occur
from other contexts too.  The netif_receive_skb() caller
already has atomic access to the receive queue, but those
other callback cases do not.

That locking issue is just the tip of the iceberg.  Once you
start discussing solutions, all sorts of new issues begin to
pop up.

SKB recycling, just say no.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-10-08 Thread Ben Greear

Steve Wise wrote:



Ben Greear wrote:

Rick Jones wrote:
Perf-wise, you could clone the skbs up front, then deliver them to 
the nic in a tight loop.  This would mitigate the added overhead 
introduced by calling skb_clone() in the loop doing transmits...


That only works if you are sending a small number of skbs.  You 
can't pre-clone several minutes worth of 10Gbe traffic

with any normal amount of RAM.


Does pktgen really need to allocate anything more than some smallish 
fraction more than the depth of the driver's transmit queue?


If you want to send sustained high rates of traffic, for more than
just a trivial amount of time, then you either have to play the current
trick with the skb_get(), or you have to allocate a real packet each time
(maybe with skb_clone() or similar, but it's still more overhead than 
the skb_get

which only bumps a reference count.)

I see no other way, but if you can think of one, please let me know.



You can keep freed skb's that were cloned on a free list, then reuse 
them once freed.  You can detect when the driver frees them by adding a 
destroy function to the skb.  So what will happen is the set of cloned 
skbs needed will eventually settled down to a constent amount and the 
amount will be based on the latency involved in transmitting a single 
skb.  And it should be bounded by the max txq depth.  Yes?  (or am I all 
wet :)


So you would pay the overhead of cloning only until you hit this steady 
state.


Whatchathink?


This skb recycling can certainly work and has been done several
times before.  It never gets into the kernel though.

Also, still a lot more overhead than incrementing the in-use counter
since you will have to re-initialize the pkt (you don't know what
all the underlying devices might have done to it.)

The number needed would be bound by all of the queues involved.  There
could be several queues if you are running pktgen on virtual devices
(with potential queues between virtual devs and hardware devs).

Still, I at least would be interested in seeing a patch if you put
one together.  You should get buy-in from Robert and/or DaveM if you
want it to have a chance to get into the kernel.

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-10-08 Thread Steve Wise



Ben Greear wrote:

Rick Jones wrote:
Perf-wise, you could clone the skbs up front, then deliver them to 
the nic in a tight loop.  This would mitigate the added overhead 
introduced by calling skb_clone() in the loop doing transmits...


That only works if you are sending a small number of skbs.  You can't 
pre-clone several minutes worth of 10Gbe traffic

with any normal amount of RAM.


Does pktgen really need to allocate anything more than some smallish 
fraction more than the depth of the driver's transmit queue?


If you want to send sustained high rates of traffic, for more than
just a trivial amount of time, then you either have to play the current
trick with the skb_get(), or you have to allocate a real packet each time
(maybe with skb_clone() or similar, but it's still more overhead than 
the skb_get

which only bumps a reference count.)

I see no other way, but if you can think of one, please let me know.



You can keep freed skb's that were cloned on a free list, then reuse 
them once freed.  You can detect when the driver frees them by adding a 
destroy function to the skb.  So what will happen is the set of cloned 
skbs needed will eventually settled down to a constent amount and the 
amount will be based on the latency involved in transmitting a single 
skb.  And it should be bounded by the max txq depth.  Yes?  (or am I all 
wet :)


So you would pay the overhead of cloning only until you hit this steady 
state.


Whatchathink?



Thanks,
Ben


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Ben Greear

Rick Jones wrote:
Perf-wise, you could clone the skbs up front, then deliver them to 
the nic in a tight loop.  This would mitigate the added overhead 
introduced by calling skb_clone() in the loop doing transmits...


That only works if you are sending a small number of skbs.  You can't 
pre-clone several minutes worth of 10Gbe traffic

with any normal amount of RAM.


Does pktgen really need to allocate anything more than some smallish 
fraction more than the depth of the driver's transmit queue?


If you want to send sustained high rates of traffic, for more than
just a trivial amount of time, then you either have to play the current
trick with the skb_get(), or you have to allocate a real packet each time
(maybe with skb_clone() or similar, but it's still more overhead than the 
skb_get
which only bumps a reference count.)

I see no other way, but if you can think of one, please let me know.

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Rick Jones
Perf-wise, you could clone the skbs up front, then deliver them to the 
nic in a tight loop.  This would mitigate the added overhead 
introduced by calling skb_clone() in the loop doing transmits...


That only works if you are sending a small number of skbs.  You can't 
pre-clone several minutes worth of 10Gbe traffic

with any normal amount of RAM.


Does pktgen really need to allocate anything more than some smallish fraction 
more than the depth of the driver's transmit queue?


Of course, even that won't fit in the L1 cache of a processor, so if one is 
really just trying to max-out the NIC it might still have too much overhead, but 
then does pktgen+driver et al actually fit in an L1 cache?


rick jones
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread David Miller
From: Steve Wise <[EMAIL PROTECTED]>
Date: Mon, 24 Sep 2007 08:54:13 -0500

> I think pktgen should be cloning the skbs using skb_clone().  Then it 
> will work for all devices, eh?

The problem is that skb_clone() is (relatively) expensive and
pktgen is trying to just grab a reference to the SKB in
the absolutely cheapest way possible.

In my personal opinion, I think what the drivers that don't
work with pktgen are doing is wrong and they should find
another way to pass state around other than to depend upon
being able to use the ->cb[] area at-will.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Robert Olsson

Hi,

Steve Wise writes:
 > I think pktgen should be cloning the skbs using skb_clone().  Then it 
 > will work for all devices, eh?

 pktgen assumes for "fastpath" sending exclusive ownership of
 the skb. And does a skb_get to avoid final skb destruction so 
 the same skb can be sent over and over. The idea is to avoid 
 memory allocation and keep things in cache to give very high
 packet rates with identical packets.
   I
 But if you need to alter the packet then the skb_get trick can't 
 be done. And you have to turn off "fastpath" with clone_skb 

 > Perf-wise, you could clone the skbs up front, then deliver them to the 
 > nic in a tight loop.  This would mitigate the added overhead introduced 
 > by calling skb_clone() in the loop doing transmits...

 Sure it's can be done. It could replay sequences etc but it will not
 beat the skb_get trick in sending identical packets. It has been
 proposed before but I've avoided such efforts to keep things relatively 
 small and simple. Really pktgen should be reworked to have s small 
 skim in kernel and move the rest of the stuff to userland.

 Cheers.
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Ben Greear

Steve Wise wrote:


Ben Greear wrote:

Steve Wise wrote:
I think pktgen should be cloning the skbs using skb_clone().  Then 
it will work for all devices, eh?
That might work, but it would decrease performance slightly (or, 
increase CPU load at least).


Perf-wise, you could clone the skbs up front, then deliver them to the 
nic in a tight loop.  This would mitigate the added overhead 
introduced by calling skb_clone() in the loop doing transmits...
That only works if you are sending a small number of skbs.  You can't 
pre-clone several minutes worth of 10Gbe traffic

with any normal amount of RAM.



Maybe a new option:  multi_clone



If the current code is busted, I think it should be fixed.

Well, it works fine when used correctly :)

Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Steve Wise


Ben Greear wrote:

Steve Wise wrote:
I think pktgen should be cloning the skbs using skb_clone().  Then it 
will work for all devices, eh?
That might work, but it would decrease performance slightly (or, 
increase CPU load at least).


Perf-wise, you could clone the skbs up front, then deliver them to the 
nic in a tight loop.  This would mitigate the added overhead introduced 
by calling skb_clone() in the loop doing transmits...




Maybe a new option:  multi_clone



If the current code is busted, I think it should be fixed.

My 2 cents.

Steve.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Ben Greear

Steve Wise wrote:
I think pktgen should be cloning the skbs using skb_clone().  Then it 
will work for all devices, eh?
That might work, but it would decrease performance slightly (or, 
increase CPU load at least).


Maybe a new option:  multi_clone

Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-24 Thread Steve Wise
I think pktgen should be cloning the skbs using skb_clone().  Then it 
will work for all devices, eh?



Ben Greear wrote:

jamal wrote:

On Sun, 2007-23-09 at 12:55 -0500, Steve Wise wrote:

 
Its a hack that breaks cxgb3 because cxgb3 uses the skb->cb area for 
each skb passed down.  So cxgb3 is at fault then?  IE a driver cannot 
use the skb->cb field if the users count is > 1?  Or maybe a driver 
can _never_ use the cb field?



any layer can use the cb structure whichever way they wish. There are
violations, e.g:
the vlan code also uses the cb field to pass vlan details for hardware
acceleration. How does pktgen affect it though, clone() will just copy
the cb field and pktgen doesnt touch it.
In retrospect, pktgen may have to use clone - ccing Robert Olsson.
  
Pktgen abuses the ref count, as far as I can tell.  It works with most 
ethernet drivers,
but that multi-pkt will also fail with things like vlans because the 
skb->dev is changed

as it is transmitted (to the lower-level device).

I'd say just don't use the multi-pkt with pktgen on devices that can't 
handle it.


Thanks,
Ben



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-23 Thread Ben Greear

jamal wrote:

On Sun, 2007-23-09 at 12:55 -0500, Steve Wise wrote:

  
Its a hack that breaks cxgb3 because cxgb3 uses the skb->cb area for 
each skb passed down.  So cxgb3 is at fault then?  IE a driver cannot 
use the skb->cb field if the users count is > 1?  Or maybe a driver can 
_never_ use the cb field?



any layer can use the cb structure whichever way they wish. There are
violations, e.g:
the vlan code also uses the cb field to pass vlan details for hardware
acceleration. How does pktgen affect it though, clone() will just copy
the cb field and pktgen doesnt touch it.
In retrospect, pktgen may have to use clone - ccing Robert Olsson.
  
Pktgen abuses the ref count, as far as I can tell.  It works with most 
ethernet drivers,
but that multi-pkt will also fail with things like vlans because the 
skb->dev is changed

as it is transmitted (to the lower-level device).

I'd say just don't use the multi-pkt with pktgen on devices that can't 
handle it.


Thanks,
Ben


--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-23 Thread jamal
On Sun, 2007-23-09 at 12:55 -0500, Steve Wise wrote:

> Its a hack that breaks cxgb3 because cxgb3 uses the skb->cb area for 
> each skb passed down.  So cxgb3 is at fault then?  IE a driver cannot 
> use the skb->cb field if the users count is > 1?  Or maybe a driver can 
> _never_ use the cb field?

any layer can use the cb structure whichever way they wish. There are
violations, e.g:
the vlan code also uses the cb field to pass vlan details for hardware
acceleration. How does pktgen affect it though, clone() will just copy
the cb field and pktgen doesnt touch it.
In retrospect, pktgen may have to use clone - ccing Robert Olsson.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-23 Thread Evgeniy Polyakov
Hi Steve.

On Sun, Sep 23, 2007 at 11:12:12AM -0500, Steve Wise ([EMAIL PROTECTED]) wrote:
> The pktgen module provides a way to "clone" the skb its using for 
> transmission, and allows passing N clones of the originally created skb 
> to the driver under test.However, it doesn't really use skb_clone(), 
> but rather it just bumps the skb->users count for each "clone" and 
> passes the same skb ptr to the driver.
> 
> Q: Is that a valid use of skb->users or should pktgen really be cloning 
> the skbuff?

It's a hack, but since skb is owned by pktgen only (no copies in some
outside queues or some other access) it is allowed just to bump reference
counter (i.e. 'share' skb in usual notation).

-- 
Evgeniy Polyakov
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen question

2007-09-23 Thread Steve Wise



Evgeniy Polyakov wrote:

Hi Steve.

On Sun, Sep 23, 2007 at 11:12:12AM -0500, Steve Wise ([EMAIL PROTECTED]) wrote:
The pktgen module provides a way to "clone" the skb its using for 
transmission, and allows passing N clones of the originally created skb 
to the driver under test.However, it doesn't really use skb_clone(), 
but rather it just bumps the skb->users count for each "clone" and 
passes the same skb ptr to the driver.


Q: Is that a valid use of skb->users or should pktgen really be cloning 
the skbuff?


It's a hack, but since skb is owned by pktgen only (no copies in some
outside queues or some other access) it is allowed just to bump reference
counter (i.e. 'share' skb in usual notation).



Its a hack that breaks cxgb3 because cxgb3 uses the skb->cb area for 
each skb passed down.  So cxgb3 is at fault then?  IE a driver cannot 
use the skb->cb field if the users count is > 1?  Or maybe a driver can 
_never_ use the cb field?


Steve.




-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-06 Thread jamal
On Wed, 2007-05-09 at 22:06 -0700, Mandeep Singh Baines wrote:
> jamal ([EMAIL PROTECTED]) wrote:

> > If you read the paper: There are no issues with high throughput - NAPI
> > kicks in.
> > The challenge to be overcome is at low traffic, if you have a real fast
> > processor your cpu-cycles-used/bits-processed ratio is high
> 
> I'm not sure cpu-cycles-used/bits-processed is the correct metric to use.
> An alternative would be to look at cpu-cycles-used/unit-time (i.e.
> CPU utilization or load) for a given bit-rate or packet-rate. 

I wasnt explicit but we are saying the same thing. The paper is more
specific: if you make the packet count used constant - this translates
to a unit of time given low traffic rate. If you fix the time,
cpu-cycles are easier to extrapolate from.

> This would make an interesting graph.

If you are to plot a cpu-util vs packet rate, on most modern hardware you will 
see a spike before you hit the MLFRR and then utilization will go down and 
slowly 
start going up.

> At low packet-rate, CPU utilization is low so doing extra work per packet
> is probably OK. 

Thats the arguement weve used in the past ;-> Unfortunately, with the
relative cost of IO going up you cant keep making that arguement (at
least thats the position presented in the paper). 
Your mileage may vary. If you are running a machine dishing out bulk
transfers probably not a big deal. If you are doing database
transactions, you loose in benchmarks etc.

> Utilizing 2% of CPU vesus 1% is negligible. But at higher
> rate, when there is more CPU utilization, using 40% of CPU versus 60% is
> significant. I think the absolute different in CPU utilization is more
> important than the relative difference. iow, 2% versus 1%, even though 
> a 2x difference in cpu-cycles/packet, is negligible compared to 40% 
> versus 60%.

Refer to above.

> Using a timer might also behave better in a tick-less (CONFIG_NO_HZ) 
> configuration.

good point.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-05 Thread Mandeep Singh Baines
jamal ([EMAIL PROTECTED]) wrote:
> On Wed, 2007-05-09 at 14:55 +0100, James Chapman wrote:
> 
> > Thanks Jamal. Yes, I'd already read your paper. I think my idea is 
> > different to the ideas described in your paper 
> 
> I am hoping you can pick from the lessons of what has been tried and
> failed and the justification for critiqueing something as "Failed". 
> If you have - i feel i accomplished something useful writting the paper.
> 
> > and I'm in the process of 
> > writing it up as an RFC to post to netdev. 
> 
> Please cc me if you want my feedback - I am backlogged by about 3000
> messages on netdev.
> 
> > Briefly though, the driver's 
> > ->poll() holds off doing netif_rx_complete() for 1-2 jiffies and keeps 
> > itself in poll_on mode for that time, consuming no quota. 
> > The net_rx 
> > softirq processing loop is modified to detect the case when the only 
> > devices in its poll list are doing no work (consuming no quota). The 
> > driver's ->poll() samples jiffies while it is considering when to do the 
> > netif_rx_complete() like your Parked state - no timers are used.
> 
> Ok, so the difference seems to be you actually poll instead for those
> jiffies instead starting a timer in the parked state - is that right?
> 
> > If I missed that this approach has already been tried before and 
> > rejected, please let me know. I see better throughput and latency in my 
> > packet forwarding and LAN setups using it.
> 
> If you read the paper: There are no issues with high throughput - NAPI
> kicks in.
> The challenge to be overcome is at low traffic, if you have a real fast
> processor your cpu-cycles-used/bits-processed ratio is high

I'm not sure cpu-cycles-used/bits-processed is the correct metric to use.
An alternative would be to look at cpu-cycles-used/unit-time (i.e.
CPU utilization or load) for a given bit-rate or packet-rate. This would
make an interesting graph.

At low packet-rate, CPU utilization is low so doing extra work per packet
is probably OK. Utilizing 2% of CPU vesus 1% is negligible. But at higher
rate, when there is more CPU utilization, using 40% of CPU versus 60% is
significant. I think the absolute different in CPU utilization is more
important than the relative difference. iow, 2% versus 1%, even though 
a 2x difference in cpu-cycles/packet, is negligible compared to 40% 
versus 60%.

> If you are polling (softirqs have high prio and depending on the cpu,
> there could be a few gazillion within those 1-2 jiffies), then isnt the
> end result still a high cpu-cycles used?
> Thats what the timer tries to avoid (do nothing until some deffered
> point).

Using a timer might also behave better in a tick-less (CONFIG_NO_HZ) 
configuration.

> If you waste cpu cycles and poll, I can see that (to emphasize: For
> FastCPU-LowTraffic scenario), you will end up _not_ having latency
> issues i pointed out, but you surely have digressed from the original
> goal which is to address the cpu abuse at low traffic (because you abuse
> more cpu).
> 
> One thing that may be valuable is to show that the timers and polls are
> not much different in terms of cpu abuse (It's theoretically not true,
> but reality may not match).
> The other thing is now hrestimers are on, can you try using a piece of
> hardware that can get those kicked and see if you see any useful results
> on latency.
> 
> cheers,
> jamal
> 
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-05 Thread jamal
On Wed, 2007-05-09 at 14:55 +0100, James Chapman wrote:

> Thanks Jamal. Yes, I'd already read your paper. I think my idea is 
> different to the ideas described in your paper 

I am hoping you can pick from the lessons of what has been tried and
failed and the justification for critiqueing something as "Failed". 
If you have - i feel i accomplished something useful writting the paper.

> and I'm in the process of 
> writing it up as an RFC to post to netdev. 

Please cc me if you want my feedback - I am backlogged by about 3000
messages on netdev.

> Briefly though, the driver's 
> ->poll() holds off doing netif_rx_complete() for 1-2 jiffies and keeps 
> itself in poll_on mode for that time, consuming no quota. 
> The net_rx 
> softirq processing loop is modified to detect the case when the only 
> devices in its poll list are doing no work (consuming no quota). The 
> driver's ->poll() samples jiffies while it is considering when to do the 
> netif_rx_complete() like your Parked state - no timers are used.

Ok, so the difference seems to be you actually poll instead for those
jiffies instead starting a timer in the parked state - is that right?

> If I missed that this approach has already been tried before and 
> rejected, please let me know. I see better throughput and latency in my 
> packet forwarding and LAN setups using it.

If you read the paper: There are no issues with high throughput - NAPI
kicks in.
The challenge to be overcome is at low traffic, if you have a real fast
processor your cpu-cycles-used/bits-processed ratio is high
If you are polling (softirqs have high prio and depending on the cpu,
there could be a few gazillion within those 1-2 jiffies), then isnt the
end result still a high cpu-cycles used?
Thats what the timer tries to avoid (do nothing until some deffered
point).
If you waste cpu cycles and poll, I can see that (to emphasize: For
FastCPU-LowTraffic scenario), you will end up _not_ having latency
issues i pointed out, but you surely have digressed from the original
goal which is to address the cpu abuse at low traffic (because you abuse
more cpu).

One thing that may be valuable is to show that the timers and polls are
not much different in terms of cpu abuse (It's theoretically not true,
but reality may not match).
The other thing is now hrestimers are on, can you try using a piece of
hardware that can get those kicked and see if you see any useful results
on latency.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-05 Thread James Chapman

jamal wrote:

On Wed, 2007-05-09 at 13:03 +0100, James Chapman wrote:

I have a patch that solves the high interrupt rate problem by keeping 
the driver in polled mode longer. It's written for the latest NAPI 
version that DaveM posted recently. I'll try to get some time to write 
it up and post it for comment.


Theres interesting challenges to be tackled with resolving that. 
Just so you dont reinvent the wheel or to help invent a better one;

please read:
http://kernel.org/pub/linux/kernel/people/hadi/docs/UKUUG2005.pdf

if you have, what are the differences in your approach - and when
do you find it useful?


Thanks Jamal. Yes, I'd already read your paper. I think my idea is 
different to the ideas described in your paper and I'm in the process of 
writing it up as an RFC to post to netdev. Briefly though, the driver's 
->poll() holds off doing netif_rx_complete() for 1-2 jiffies and keeps 
itself in poll_on mode for that time, consuming no quota. The net_rx 
softirq processing loop is modified to detect the case when the only 
devices in its poll list are doing no work (consuming no quota). The 
driver's ->poll() samples jiffies while it is considering when to do the 
netif_rx_complete() like your Parked state - no timers are used.


If I missed that this approach has already been tried before and 
rejected, please let me know. I see better throughput and latency in my 
packet forwarding and LAN setups using it.


--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-05 Thread jamal
On Wed, 2007-05-09 at 13:03 +0100, James Chapman wrote:

> I have a patch that solves the high interrupt rate problem by keeping 
> the driver in polled mode longer. It's written for the latest NAPI 
> version that DaveM posted recently. I'll try to get some time to write 
> it up and post it for comment.

Theres interesting challenges to be tackled with resolving that. 
Just so you dont reinvent the wheel or to help invent a better one;
please read:
http://kernel.org/pub/linux/kernel/people/hadi/docs/UKUUG2005.pdf

if you have, what are the differences in your approach - and when
do you find it useful?

cheers,
jamal

PS:- Mandeep, i was going to suggest thresholds but you beat me to it
with your latest patch.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-05 Thread James Chapman

Mandeep Singh Baines wrote:

Daniele Venzano ([EMAIL PROTECTED]) wrote:
The patch looks good and I think it can be pushed higher (-mm ?) for some wider 
testing. I don't have the hardware available to do some tests myself, 
unfortunately, but it would be similar to yours anyway.


I'd like to know how this works for people with less memory and slower CPU, but any 
kind of test run will be appreciated.


Hi Daniele,

I tested the driver under different setting of CPU clock. Under a lower clock,
I get less interrupts (what I was hoping for) and slightly less performance.
Under higher clock, I get better performance and almost 1 interrupt per packet.


I have a patch that solves the high interrupt rate problem by keeping 
the driver in polled mode longer. It's written for the latest NAPI 
version that DaveM posted recently. I'll try to get some time to write 
it up and post it for comment.


--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-05 Thread Mandeep Singh Baines
Daniele Venzano ([EMAIL PROTECTED]) wrote:
> The patch looks good and I think it can be pushed higher (-mm ?) for some 
> wider 
> testing. I don't have the hardware available to do some tests myself, 
> unfortunately, but it would be similar to yours anyway.
> 
> I'd like to know how this works for people with less memory and slower CPU, 
> but any 
> kind of test run will be appreciated.

Hi Daniele,

I tested the driver under different setting of CPU clock. Under a lower clock,
I get less interrupts (what I was hoping for) and slightly less performance.
Under higher clock, I get better performance and almost 1 interrupt per packet.

I also made a tiny change to the driver which resulted in slightly better
performance and less interrupts. I made a one-line change to finish_xmit,
which now wakes up the transmit queue if there are at least 16 available
slots in the tx ring. Previously, the driver would wake up the transmit queue
if there was just a single slot available. This should result in better
hysteresis.

Attached are my test results and a new patch.

Signed-off-by: Mandeep Singh Baines <[EMAIL PROTECTED]>

--
diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
index 7c6e480..40c1aee 100644
--- a/drivers/net/sis900.c
+++ b/drivers/net/sis900.c
@@ -185,7 +185,6 @@ struct sis900_private {
dma_addr_t tx_ring_dma;
dma_addr_t rx_ring_dma;
 
-   unsigned int tx_full; /* The Tx queue is full. */
u8 host_bridge_rev;
u8 chipset_rev;
 };
@@ -202,8 +201,10 @@ MODULE_PARM_DESC(max_interrupt_work, "SiS 900/7016 maximum 
events handled per in
 MODULE_PARM_DESC(sis900_debug, "SiS 900/7016 bitmapped debugging message 
level");
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-static void sis900_poll(struct net_device *dev);
+static void sis900_poll_controller(struct net_device *dev);
 #endif
+
+static int sis900_poll(struct net_device *dev, int *budget);
 static int sis900_open(struct net_device *net_dev);
 static int sis900_mii_probe (struct net_device * net_dev);
 static void sis900_init_rxfilter (struct net_device * net_dev);
@@ -216,8 +217,8 @@ static void sis900_tx_timeout(struct net_device *net_dev);
 static void sis900_init_tx_ring(struct net_device *net_dev);
 static void sis900_init_rx_ring(struct net_device *net_dev);
 static int sis900_start_xmit(struct sk_buff *skb, struct net_device *net_dev);
-static int sis900_rx(struct net_device *net_dev);
-static void sis900_finish_xmit (struct net_device *net_dev);
+static int sis900_rx(struct net_device *net_dev, int limit);
+static int sis900_finish_xmit (struct net_device *net_dev);
 static irqreturn_t sis900_interrupt(int irq, void *dev_instance);
 static int sis900_close(struct net_device *net_dev);
 static int mii_ioctl(struct net_device *net_dev, struct ifreq *rq, int cmd);
@@ -474,9 +475,11 @@ static int __devinit sis900_probe(struct pci_dev *pci_dev,
net_dev->tx_timeout = sis900_tx_timeout;
net_dev->watchdog_timeo = TX_TIMEOUT;
net_dev->ethtool_ops = &sis900_ethtool_ops;
+   net_dev->poll = &sis900_poll;
+   net_dev->weight = 64;
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-net_dev->poll_controller = &sis900_poll;
+net_dev->poll_controller = &sis900_poll_controller;
 #endif
 
if (sis900_debug > 0)
@@ -979,13 +982,44 @@ static u16 sis900_reset_phy(struct net_device *net_dev, 
int phy_addr)
return status;
 }
 
+static int sis900_poll(struct net_device *dev, int *budget)
+{
+   struct sis900_private *sis_priv = dev->priv;
+   long ioaddr = dev->base_addr;
+   int limit = min_t(int, dev->quota, *budget);
+   int rx_work_done;
+   int tx_work_done;
+
+   /* run TX completion thread */
+   spin_lock(sis_priv->lock);
+   tx_work_done = sis900_finish_xmit(dev);
+   spin_unlock(sis_priv->lock);
+
+   /* run RX thread */
+   rx_work_done = sis900_rx(dev, limit);
+   *budget -= rx_work_done;
+   dev->quota -= rx_work_done;
+
+   /* re-enable interrupts if no work done */
+   if (rx_work_done + tx_work_done == 0) {
+   netif_rx_complete(dev);
+   /* Enable all known interrupts. */
+   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxOK), ioaddr + imr);
+   /* Handle rotting packet */
+   sis900_rx(dev, NUM_RX_DESC);
+   return 0;
+   }
+
+   return 1;
+}
+
 #ifdef CONFIG_NET_POLL_CONTROLLER
 /*
  * Polling 'interrupt' - used by things like netconsole to send skbs
  * without having to re-enable interrupts. It's not called while
  * the interrupt routine is executing.
 */
-static void sis900_poll(struct net_device *dev)
+static void sis900_poll_controller(struct net_device *dev)
 {
disable_irq(dev->irq);
sis900_interrupt(dev->irq, dev);
@@ -1032,7 +1066,7 @@ sis900_open(struct net_device *net_dev)
sis900_set_mode(ioaddr, HW_SPEED_10_MBPS, FDX_CAPABLE_HALF_SELECTED);
 
/* Enable all known interrupts by setting the inter

Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-04 Thread Mandeep Baines
Cool. I'll try to see if I can clock my pc lower and run the
experiments again. I'll measure cpu utilization also this time around.
That should be useful for extrapolating.

Regards,
Mandeep

On 9/4/07, Daniele Venzano <[EMAIL PROTECTED]> wrote:
> - Message d'origine -
> De: Mandeep Singh Baines <[EMAIL PROTECTED]>
> Date: Mon, 3 Sep 2007 20:20:36 -0700
> Sujet: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition
>
> >Hi Daniele,
> >
> >Attached is a patch for converting the sis900 driver to NAPI. Please take a
> >look at let me know what you think. I'm not 100% sure if I'm handling the
> >rotting packet issue correctly or that I have the locking right in 
> >tx_timeout.
> >These might be areas to look at closely.
> >
> >I didn't see much saving in interrupts on my machine (too fast, I guess). But
> >I still think its beneficial: pushing work out of the interrupt handler into
> >a bottom half is a good thing and we no longer need to disable interrupts
> >in start_xmit.
> >
> >I did see a significant boost to tx performance by optimizing start_xmit: 
> >more
> >than double pps in pktgen.
> >
> >I'm also attaching some test results for various iterations of development.
>
> The patch looks good and I think it can be pushed higher (-mm ?) for some 
> wider
> testing. I don't have the hardware available to do some tests myself,
> unfortunately, but it would be similar to yours anyway.
>
> I'd like to know how this works for people with less memory and slower CPU, 
> but any
> kind of test run will be appreciated.
>
> Thanks, bye.
>
>
> --
> Daniele Venzano
> [EMAIL PROTECTED]
>
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-04 Thread Mandeep Baines
On 9/4/07, jamal <[EMAIL PROTECTED]> wrote:
> On Mon, 2007-03-09 at 20:20 -0700, Mandeep Singh Baines wrote:
>
> > I didn't see much saving in interrupts on my machine (too fast, I guess).
>
> You could try the idea suggested by Dave earlier and just turn interupts
> for every nth packet. That should cut down the numbers.
>

I wanted to do that but I couldn't figure out how to free the last skb
if it happened to be a transmit that didn't generate a tx completion
interrupt.

Let's say I interrupt every 4th packet.I've already sent packets 0 to
4. Now I send packets 5 and 6. I can't figure out how to ensure that
the skb's for these packets get freed in a deterministic amount of
time if there is no interrupt? They will get freed when packet 8 gets
transmitted but there is no upper bound on when that will happen.

Maybe I could create a tx skb cleanup timer that I kickoff in
hard_start_xmit(). Every call to hard_start_xmit would reset the
timer. I could try this for a future patch. Not sure if the extra code
would cost me more than I get back in savings.

> > I did see a significant boost to tx performance by optimizing start_xmit: 
> > more
> > than double pps in pktgen.
>
> 148Kpps on a slow piece of hardware aint bad - Good Stuff. I wonder how
> much CPU is being abused.
>
> If you wanna go one extra mile (separate future patch): get rid of that
> tx lock and use netif_tx_lock on the interupt path. Look at some sane
> driver like tg3 for reference.
>

Cool. I'll check out the tg3.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-04 Thread Daniele Venzano
- Message d'origine -
De: Mandeep Singh Baines <[EMAIL PROTECTED]>
Date: Mon, 3 Sep 2007 20:20:36 -0700
Sujet: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

>Hi Daniele,
>
>Attached is a patch for converting the sis900 driver to NAPI. Please take a
>look at let me know what you think. I'm not 100% sure if I'm handling the
>rotting packet issue correctly or that I have the locking right in tx_timeout.
>These might be areas to look at closely.
>
>I didn't see much saving in interrupts on my machine (too fast, I guess). But
>I still think its beneficial: pushing work out of the interrupt handler into
>a bottom half is a good thing and we no longer need to disable interrupts 
>in start_xmit. 
>
>I did see a significant boost to tx performance by optimizing start_xmit: more
>than double pps in pktgen.
>
>I'm also attaching some test results for various iterations of development.

The patch looks good and I think it can be pushed higher (-mm ?) for some wider 
testing. I don't have the hardware available to do some tests myself, 
unfortunately, but it would be similar to yours anyway.

I'd like to know how this works for people with less memory and slower CPU, but 
any 
kind of test run will be appreciated.

Thanks, bye.


--
Daniele Venzano
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-04 Thread jamal
On Mon, 2007-03-09 at 20:20 -0700, Mandeep Singh Baines wrote:

> I didn't see much saving in interrupts on my machine (too fast, I guess). 

You could try the idea suggested by Dave earlier and just turn interupts
for every nth packet. That should cut down the numbers.

> I did see a significant boost to tx performance by optimizing start_xmit: more
> than double pps in pktgen.

148Kpps on a slow piece of hardware aint bad - Good Stuff. I wonder how
much CPU is being abused.

If you wanna go one extra mile (separate future patch): get rid of that
tx lock and use netif_tx_lock on the interupt path. Look at some sane
driver like tg3 for reference.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] [sis900] convert to NAPI, WAS Re: pktgen terminating condition

2007-09-03 Thread Mandeep Singh Baines
Hi Daniele,

Attached is a patch for converting the sis900 driver to NAPI. Please take a
look at let me know what you think. I'm not 100% sure if I'm handling the 
rotting packet issue correctly or that I have the locking right in tx_timeout.
These might be areas to look at closely.

I didn't see much saving in interrupts on my machine (too fast, I guess). But
I still think its beneficial: pushing work out of the interrupt handler into
a bottom half is a good thing and we no longer need to disable interrupts 
in start_xmit. 

I did see a significant boost to tx performance by optimizing start_xmit: more
than double pps in pktgen.

I'm also attaching some test results for various iterations of development.

Regards,
Mandeep

Daniele Venzano ([EMAIL PROTECTED]) wrote:
> - Message d'origine -
> De: jamal <[EMAIL PROTECTED]>
> Date: Fri, 31 Aug 2007 09:50:03 -0400
> Sujet: Re: Re: pktgen terminating condition
> 
> >I dont know if you followed the discussion - by defering the freeing of
> >skbs, you will be slowing down socket apps sending from the local
> >machine. It may be ok if the socket buffers were huge, but that comes at
> >the cost of system memory (which may not be a big deal)
> >Do you by any chance recall why you used the idle interupt instead of
> >txok to kick the prunning of tx descriptors?
> 
> That should be asked to the original author of the driver, that I wasn't able 
> to contact when I took over the maintainership several years ago.
> I think that since this chip is usually used on cheap/low performance 
> (relatively speaking) devices it was felt that generating an interrupt for 
> each transmitted packet was going to affect the performance of the system too 
> much. At the start, if I remember correctly, this was a chip thought for 
> consumer use, where transmissions won't happen continuously for long periods 
> of time. Then the sis900 started to get used everywhere, from embedded 
> systems to (cheap) server motherboards and the usage scenario changed.
> 
> 
> --
> Daniele Venzano
> [EMAIL PROTECTED]
Tested using pktgen.

cpuinfo:
processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 6
model   : 8
model name  : AMD Geode NX
stepping: 1
cpu MHz : 1397.641
cache size  : 256 KB
fdiv_bug: no
hlt_bug : no
f00f_bug: no
coma_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 1
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow ts fid vid
bogomips: 2796.00
clflush size: 32

meminfo:
MemTotal:   907092 kB


Results in pps. Sending 40 60-byte packets.
Other than Iteration 5 and 6, there is 1 interrupt per packet.

Iteration 0 (replace TxIDLE with TxOK):
6, 62052, 62335

Iteration 1 (optimize sis900_start_xmit):
148815, 148815, 148829

Iteration 2 (convert Rx to NAPI):
148669, 148635, 148677

Iteration 3 (remove tx_full dev state variable):
148677, 148528, 148532

Iteration 4 (optimize finish_xmit):
148677, 148676, 148689

Iteration 5 (move tx completion code into NAPI poll):
147751, 147658, 147809
Interrupts:
359270, 360747, 358010

Iteration 6 (cleanup + rotting packet handling + rx optimization):
148652, 148680, 148681
Interrupts:
399927, 399841, 399835

diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
index 7c6e480..862e15a 100644
--- a/drivers/net/sis900.c
+++ b/drivers/net/sis900.c
@@ -185,7 +185,6 @@ struct sis900_private {
dma_addr_t tx_ring_dma;
dma_addr_t rx_ring_dma;
 
-   unsigned int tx_full; /* The Tx queue is full. */
u8 host_bridge_rev;
u8 chipset_rev;
 };
@@ -202,8 +201,10 @@ MODULE_PARM_DESC(max_interrupt_work, "SiS 900/7016 maximum 
events handled per in
 MODULE_PARM_DESC(sis900_debug, "SiS 900/7016 bitmapped debugging message 
level");
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-static void sis900_poll(struct net_device *dev);
+static void sis900_poll_controller(struct net_device *dev);
 #endif
+
+static int sis900_poll(struct net_device *dev, int *budget);
 static int sis900_open(struct net_device *net_dev);
 static int sis900_mii_probe (struct net_device * net_dev);
 static void sis900_init_rxfilter (struct net_device * net_dev);
@@ -216,8 +217,8 @@ static void sis900_tx_timeout(struct net_device *net_dev);
 static void sis900_init_tx_ring(struct net_device *net_dev);
 static void sis900_init_rx_ring(struct net_device *net_dev);
 static int sis900_start_xmit(struct sk_buff *skb, struct net_device *net_dev);
-static int sis900_rx(struct net_device *net_dev);
-static void sis900_finish_xmit (struct net_device *net_dev);
+static int sis900_rx(struct net_device *net_dev, int limit);
+static int sis900_finish_xmit (struct net_device *net_dev);
 static 

Re: pktgen terminating condition

2007-09-01 Thread Daniele Venzano
- Message d'origine -
De: jamal <[EMAIL PROTECTED]>
Date: Fri, 31 Aug 2007 09:50:03 -0400
Sujet: Re: Re: pktgen terminating condition

>I dont know if you followed the discussion - by defering the freeing of
>skbs, you will be slowing down socket apps sending from the local
>machine. It may be ok if the socket buffers were huge, but that comes at
>the cost of system memory (which may not be a big deal)
>Do you by any chance recall why you used the idle interupt instead of
>txok to kick the prunning of tx descriptors?

That should be asked to the original author of the driver, that I wasn't able 
to contact when I took over the maintainership several years ago.
I think that since this chip is usually used on cheap/low performance 
(relatively speaking) devices it was felt that generating an interrupt for each 
transmitted packet was going to affect the performance of the system too much. 
At the start, if I remember correctly, this was a chip thought for consumer 
use, where transmissions won't happen continuously for long periods of time. 
Then the sis900 started to get used everywhere, from embedded systems to 
(cheap) server motherboards and the usage scenario changed.


--
Daniele Venzano
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re: pktgen terminating condition

2007-08-31 Thread jamal
On Fri, 2007-31-08 at 14:17 +0200, Daniele Venzano wrote:

> I don't regard the TxOK solution as something usable for mainline, but it has 
> its 
> use for the users of pktgen.

I dont know if you followed the discussion - by defering the freeing of
skbs, you will be slowing down socket apps sending from the local
machine. It may be ok if the socket buffers were huge, but that comes at
the cost of system memory (which may not be a big deal)
Do you by any chance recall why you used the idle interupt instead of
txok to kick the prunning of tx descriptors?

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-31 Thread jamal
On Thu, 2007-30-08 at 22:19 -0700, David Miller wrote:

> You could implement this quite simply using skb->destructor.

Thats what i was thinking ..

> It will add some atomics, so on weaker pktgen source systems
> it might decrease the generators rate.

Indeed. So maybe a config option instead; it has value afaics in the
batching case only. Need to experiment.

cheers,
jamal


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Re: pktgen terminating condition

2007-08-31 Thread Daniele Venzano
- Message d'origine -
De: "Mandeep Baines" <[EMAIL PROTECTED]>
Date: Wed, 29 Aug 2007 09:59:42 -0700
Sujet: Re: pktgen terminating condition

>> Looks good to me given the desire. I would bounce it by whoever the
>> maintainer is - they may have some insights on the lazy tx prune habit.
>>
>
>+ [EMAIL PROTECTED]
>+ [EMAIL PROTECTED]
>
>I'll work on a NAPI patch.
Here I'm (the maintainer). Sorry for the delay replying.

I don't regard the TxOK solution as something usable for mainline, but it has 
its use for the users of pktgen.

NAPI is the way to go.

Thanks, bye.


--
Daniele Venzano
[EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-30 Thread David Miller
From: jamal <[EMAIL PROTECTED]>
Date: Thu, 30 Aug 2007 08:08:40 -0400

> I was confusing it with another issue where pktgen could send a lot of
> packets without waiting for them to be freed; there are some drivers
> (10G) which may hold onto 8K skbs. A gazillion ooms start spewing ;-> My
> thinking in resolving that was to do something like waht sockets do and
> charge pktgen so it doesnt have too many outstanding packets in flight.

You could implement this quite simply using skb->destructor.

It will add some atomics, so on weaker pktgen source systems
it might decrease the generators rate.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-30 Thread jamal
On Wed, 2007-29-08 at 18:32 +0200, Robert Olsson wrote:

>  Yes it's synchronization issue... the test is over and we have sent 
>  all pkts to the device but pktgen cannot free the skb for it still 
>  has refcounts. 

Ok, right.

I was confusing it with another issue where pktgen could send a lot of
packets without waiting for them to be freed; there are some drivers
(10G) which may hold onto 8K skbs. A gazillion ooms start spewing ;-> My
thinking in resolving that was to do something like waht sockets do and
charge pktgen so it doesnt have too many outstanding packets in flight.

>  IMO must drivers have provisions to handle situation like. 

Mandeep was saying he found less than a handful that didnt conform.

> I'll 
>  guess we can discuss last-resort timer if it should be us, ms 
>  or possibly seconds but shouldn't need ping to make this happen. 
>  If so we probably have a ICMP packet sitting there waiting instead.  

I think as long as it doesnt affect throughput calculation (it just adds
to idle time) its fine.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-30 Thread jamal
On Wed, 2007-29-08 at 09:59 -0700, Mandeep Baines wrote:

> I'll work on a NAPI patch.

It's a GoodThing - go for it. 

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-29 Thread Mandeep Baines
On 8/29/07, jamal <[EMAIL PROTECTED]> wrote:
> On Tue, 2007-28-08 at 21:43 -0700, Mandeep Singh Baines wrote:
>
> > I interpret this to mean that the interrupt gets generates after a packet
> > is transferred to the TFIFO on the NIC and the next packet in the ring is
> > NULL.
>
> iow, when tx transits to idle ..
>
> > This interrupt gets generates less often then TxOK which gets generated
> > after every completed packet transmit. So I'm thinking that maybe the
> > driver was written to use this interrupt instead of TxOK for this reason.
> > Really just my speculation.
>
> It does make sense if you are trying to cut down on tx interupts. It's a
> little extreme - and if you dont have much memory it may not be the best
> scheme. OTOH, you have moved to the other extreme imo ;-> You are
> generating an interupt per tx packet. This may not matter on modern
> hardware because that NIC seems to be Fast Ethernet.
> You may wanna heed Dave's advice and just always set the idle on a
> per-descriptor basis as well as txcomplete on every Xth packet (4 was
> suggested).

My hardware at home is not so modern:( I'm running a AMD Geode NX1750.
I like to punish myself:) Actually, it does the job and is very power
efficient.

Agreed, interrupting every packet is inefficient. A better solution is
probably NAPI. I'll work on a new patch.

> Looking at sis900_start_xmit() at the spot where OWN is set
> on the descriptor, theres possibly another bit in tx_ring[entry].cmdsts
> you may could set that will ask for tx complete; which makes me wonder:
> is that "outl(TxENA | inl(ioaddr + cr), ioaddr + cr)" really needed on a
> per-packet basis? Should it not be sufficient to turn it on once at
> open()?
>

You have to set this bit to restart the Tx State Machine. You could
optimize this if you use the TxIdle interrupt to tell you when you
really need to set this bit.

> > Anyway, if I rewrite the driver to use TxOK instead of TxIdle, pktgen
> > works fine.
>
> I think its a good thing pktgen caught this; i am unsure however if it
> is doing the right thing. Hoping Robert would respond.
> One thing pktgen could do is restrict the amount of outstanding buffers
> by using accounting the way sockets do - this at least wont stop
> progress as it did in your case.
>
> > I'm attaching the patch.
>
> Looks good to me given the desire. I would bounce it by whoever the
> maintainer is - they may have some insights on the lazy tx prune habit.
>

+ [EMAIL PROTECTED]
+ [EMAIL PROTECTED]

I'll work on a NAPI patch.

Regards,
Mandeep
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-29 Thread Robert Olsson

jamal writes:
 > On Tue, 2007-28-08 at 21:43 -0700, Mandeep Singh Baines wrote:

 > I think its a good thing pktgen caught this; i am unsure however if it
 > is doing the right thing. Hoping Robert would respond.
 > One thing pktgen could do is restrict the amount of outstanding buffers
 > by using accounting the way sockets do - this at least wont stop
 > progress as it did in your case.

 Hello,

 Yes it's synchronization issue... the test is over and we have sent 
 all pkts to the device but pktgen cannot free the skb for it still 
 has refcounts. 

 IMO must drivers have provisions to handle situation like. I'll 
 guess we can discuss last-resort timer if it should be us, ms 
 or possibly seconds but shouldn't need ping to make this happen. 
 If so we probably have a ICMP packet sitting there waiting instead.  

 Cheers
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-29 Thread Grant Grundler
On 8/29/07, Mandeep Baines <[EMAIL PROTECTED]> wrote:
...
> Unfortunately, the TxIdle interrupt would not free the last odd packet.
> However, if we want to quiet interrupts couldn't the driver just be
> converted NAPI. If this seems reasonable I could work on a patch.

It seems reasonable and that's what I was thinking when you
submitted the patch to use TXOk.

I think NAPI could be used with either TxOK or TxIdle interrupts.
For TxIdle to work, the driver would have to recognize one more
packet is still in flight and poll for completion of that packet - then
re-enable interrupts and switch back to interrupt mode of operation.
I just don't know if netperf TCP_RR perf numbers will suffer
(assuming TCP_RR is more important than CPU utilization/efficiency
on heavy TX loads like pktgen.)

grant
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-29 Thread jamal
On Tue, 2007-28-08 at 21:43 -0700, Mandeep Singh Baines wrote:

> I interpret this to mean that the interrupt gets generates after a packet
> is transferred to the TFIFO on the NIC and the next packet in the ring is
> NULL. 

iow, when tx transits to idle ..

> This interrupt gets generates less often then TxOK which gets generated
> after every completed packet transmit. So I'm thinking that maybe the
> driver was written to use this interrupt instead of TxOK for this reason.
> Really just my speculation.

It does make sense if you are trying to cut down on tx interupts. It's a
little extreme - and if you dont have much memory it may not be the best
scheme. OTOH, you have moved to the other extreme imo ;-> You are
generating an interupt per tx packet. This may not matter on modern
hardware because that NIC seems to be Fast Ethernet.
You may wanna heed Dave's advice and just always set the idle on a
per-descriptor basis as well as txcomplete on every Xth packet (4 was
suggested). 
Looking at sis900_start_xmit() at the spot where OWN is set
on the descriptor, theres possibly another bit in tx_ring[entry].cmdsts
you may could set that will ask for tx complete; which makes me wonder:
is that "outl(TxENA | inl(ioaddr + cr), ioaddr + cr)" really needed on a
per-packet basis? Should it not be sufficient to turn it on once at
open()?

> Anyway, if I rewrite the driver to use TxOK instead of TxIdle, pktgen
> works fine. 

I think its a good thing pktgen caught this; i am unsure however if it
is doing the right thing. Hoping Robert would respond.
One thing pktgen could do is restrict the amount of outstanding buffers
by using accounting the way sockets do - this at least wont stop
progress as it did in your case.

> I'm attaching the patch.

Looks good to me given the desire. I would bounce it by whoever the
maintainer is - they may have some insights on the lazy tx prune habit.

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-28 Thread David Miller
From: Mandeep Singh Baines <[EMAIL PROTECTED]>
Date: Tue, 28 Aug 2007 21:43:52 -0700

> Here is what the datasheet has to say about TxIdle:
> 
> "This event is signaled when the transmit state machine enters the idle state 
> from a non-idle state. This will happen whenever the state machine encounters
> an "end-of-list" condition (NULL link field or a descriptor with OWN clear)."
> 
> I interpret this to mean that the interrupt gets generates after a packet
> is transferred to the TFIFO on the NIC and the next packet in the ring is
> NULL. 
> 
> This interrupt gets generates less often then TxOK which gets generated
> after every completed packet transmit. So I'm thinking that maybe the
> driver was written to use this interrupt instead of TxOK for this reason.
> Really just my speculation.

I see, so essentially it doesn't interrupt until the entire TX
ring is empty and has been sent onto the wire.

Yes, this would be exactly sub-optimal for pktgen or in fact any
application :-)

It seems that the INTbit in the TX descriptor status of the
SIS190 can be an interrupt trigger.  In that case, a reasonable
and quite common scheme would be to set that bit every 1/4 of
the TX ring.  And also enable the TX idle interrupt.

So if the TX ring size is 8 entries and you received a set of
TX sends you'd set the interrupt status bits like this:

TX descr interrupt bit

packet 0:   none
packet 1:   INTbit
packet 2:   none
packet 3:   INTbit
packet 4:   none
packet 5:   INTbit
packet 6:   none
packet 7:   INTbit

And you'd get 4 TX descriptor based interrupts, one for each INTbit
and probably free up 2 TX packets each time (or more if there is some
overlap).

The idle interrupt bit take care of the case where you have an
odd number of packets (say removing packet 7 in the trace above)
to make sure those sub-1/4 group of TX frames get freed up and
processed in a deterministic amount of time.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-28 Thread Mandeep Singh Baines
On Tue, 2007-28-08 at 20:28 -0400, jamal wrote:
> On Tue, 2007-28-08 at 17:06 -0700, David Miller wrote:
>
> > Devices need to free packets in a deterministic amount of time, no
> > matter what.
>
> If i understood the driver pointed to - "finite amount of time" spec is
> still met. Seems some interupt is generated (TX_IDLE) when the tx path
> gets idle and this will kick the tx cleanup. Mandeep, correct me if i am
> wrong. If my statement is correct, i wouldnt call this a driver bug;->
>
> Robert, I think i asked you this same question on that piece of code
> once  - and iirc you said its because you had to synchronise and make
> sure the packet made it to the wire. In this scenario, the packet has
> made it to the wire - just hasnt been freed. Note that when teaching
> pktgen to do batching, i removed that logic and things work fine.

Hi Jamal,

Here is what the datasheet has to say about TxIdle:

"This event is signaled when the transmit state machine enters the idle state 
from a non-idle state. This will happen whenever the state machine encounters
an "end-of-list" condition (NULL link field or a descriptor with OWN clear)."

I interpret this to mean that the interrupt gets generates after a packet
is transferred to the TFIFO on the NIC and the next packet in the ring is
NULL. 

This interrupt gets generates less often then TxOK which gets generated
after every completed packet transmit. So I'm thinking that maybe the
driver was written to use this interrupt instead of TxOK for this reason.
Really just my speculation.

Anyway, if I rewrite the driver to use TxOK instead of TxIdle, pktgen
works fine. I'm attaching the patch.

Signed-off-by: Mandeep Singh Baines <[EMAIL PROTECTED]>
---

diff --git a/drivers/net/sis900.c b/drivers/net/sis900.c
index 7c6e480..69e5a57 100644
--- a/drivers/net/sis900.c
+++ b/drivers/net/sis900.c
@@ -1032,7 +1032,7 @@ sis900_open(struct net_device *net_dev)
sis900_set_mode(ioaddr, HW_SPEED_10_MBPS, FDX_CAPABLE_HALF_SELECTED);
 
/* Enable all known interrupts by setting the interrupt mask. */
-   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxIDLE), ioaddr + imr);
+   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxOK), ioaddr + imr);
outl(RxENA | inl(ioaddr + cr), ioaddr + cr);
outl(IE, ioaddr + ier);
 
@@ -1557,7 +1557,7 @@ static void sis900_tx_timeout(struct net_device *net_dev)
outl(sis_priv->tx_ring_dma, ioaddr + txdp);
 
/* Enable all known interrupts by setting the interrupt mask. */
-   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxIDLE), ioaddr + imr);
+   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxOK), ioaddr + imr);
return;
 }
 
@@ -1655,7 +1655,7 @@ static irqreturn_t sis900_interrupt(int irq, void 
*dev_instance)
do {
status = inl(ioaddr + isr);
 
-   if ((status & (HIBERR|TxURN|TxERR|TxIDLE|RxORN|RxERR|RxOK)) == 
0)
+   if ((status & (HIBERR|TxURN|TxERR|TxOK|RxORN|RxERR|RxOK)) == 0)
/* nothing intresting happened */
break;
handled = 1;
@@ -1665,7 +1665,7 @@ static irqreturn_t sis900_interrupt(int irq, void 
*dev_instance)
/* Rx interrupt */
sis900_rx(net_dev);
 
-   if (status & (TxURN | TxERR | TxIDLE))
+   if (status & (TxURN | TxERR | TxOK))
/* Tx interrupt */
sis900_finish_xmit(net_dev);
 
@@ -2462,7 +2462,7 @@ static int sis900_resume(struct pci_dev *pci_dev)
sis900_set_mode(ioaddr, HW_SPEED_10_MBPS, FDX_CAPABLE_HALF_SELECTED);
 
/* Enable all known interrupts by setting the interrupt mask. */
-   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxIDLE), ioaddr + imr);
+   outl((RxSOVR|RxORN|RxERR|RxOK|TxURN|TxERR|TxOK), ioaddr + imr);
outl(RxENA | inl(ioaddr + cr), ioaddr + cr);
outl(IE, ioaddr + ier);
 


Re: pktgen terminating condition

2007-08-28 Thread jamal
On Tue, 2007-28-08 at 17:06 -0700, David Miller wrote:

> Devices need to free packets in a deterministic amount of time, no
> matter what.

If i understood the driver pointed to - "finite amount of time" spec is
still met. Seems some interupt is generated (TX_IDLE) when the tx path
gets idle and this will kick the tx cleanup. Mandeep, correct me if i am
wrong. If my statement is correct, i wouldnt call this a driver bug;->

Robert, I think i asked you this same question on that piece of code
once  - and iirc you said its because you had to synchronise and make
sure the packet made it to the wire. In this scenario, the packet has
made it to the wire - just hasnt been freed. Note that when teaching
pktgen to do batching, i removed that logic and things work fine.

cheers,
jamal



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-28 Thread Rick Jones


I don't even understand why this needs to be discussed to be honest
with you :-)


Just my (possibly frustrating :) habit of asking questions which help
further my understanding of the code :)

rick jones
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-28 Thread David Miller
From: Rick Jones <[EMAIL PROTECTED]>
Date: Tue, 28 Aug 2007 16:48:24 -0700

> Not to defend any one driver, but could it be that the behaviour of TCP
> is such that we've not noticed until now?
> 
> Does TCP particularly care for an SKB carrying an ACK?  For ones carrying
> data there would be an ACK arriving before TCP could "really" free them
> right?  And that ACK (rx event) could very well come after some other TX
> completion, or perhaps trigger cleaning up TXs in the driver at that time?
> 
> And a UDP app is as likely as not to be request/response in nature, with the
> response behaving at the driver level rather like a TCP ACK in terms of the
> above.

The issue is not ACKs or anything of that nature, but socket buffer
accounting.

We can't release the socket until all transmit packets have been
liberated and their size subtracted from the send buffer quota.
Because these packets have their skb->destructor and skb->sk
set to reference that socket and the protocol code that manages
it's send buffer allocations.

In the worst possible case, using UDP as an example, let's say one big
packets fits in the send buffer and this is the one last packet that
gets into the devices TX ring.

It never liberates the packet, and if that is the only application
generating traffic in the system we deadlock.

Devices need to free packets in a deterministic amount of time, no
matter what.

I don't even understand why this needs to be discussed to be honest
with you :-)
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-28 Thread Rick Jones

David Miller wrote:

From: Mandeep Singh Baines <[EMAIL PROTECTED]>
Date: Tue, 28 Aug 2007 15:47:18 -0700



It seems that some drivers do not immediately free skbs on transmit
complete.  They will hold the skb until there are more packets to
free. One example is the sis900 driver which uses TX_IDLE (tx
state-machine idle) instead of TX_OK (tx completed) as a way of
coalescing interrupts. I've also seen another driver which does
something similar. Not sure how prevalent this technique is.

So is this a bug in the drivers or a bug in pktgen?



This is a bug since it can stall TCP sockets, and even non-TCP sockets
will not be freeable until these SKBs are released by the device.


Not to defend any one driver, but could it be that the behaviour of TCP
is such that we've not noticed until now?

Does TCP particularly care for an SKB carrying an ACK?  For ones carrying
data there would be an ACK arriving before TCP could "really" free them
right?  And that ACK (rx event) could very well come after some other TX
completion, or perhaps trigger cleaning up TXs in the driver at that time?

And a UDP app is as likely as not to be request/response in nature, with the
response behaving at the driver level rather like a TCP ACK in terms of the
above.

Both of those are things I don't think we typically see happen in pktgen right?


All transmit packets must be released in a finite amount of time by
the driver without requiring more traffic to induce this release.


Yes, but for what definition of finite?

rick jones
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen terminating condition

2007-08-28 Thread David Miller
From: Mandeep Singh Baines <[EMAIL PROTECTED]>
Date: Tue, 28 Aug 2007 15:47:18 -0700

> It seems that some drivers do not immediately free skbs on transmit
> complete.  They will hold the skb until there are more packets to
> free. One example is the sis900 driver which uses TX_IDLE (tx
> state-machine idle) instead of TX_OK (tx completed) as a way of
> coalescing interrupts. I've also seen another driver which does
> something similar. Not sure how prevalent this technique is.
>
> So is this a bug in the drivers or a bug in pktgen?

This is a bug since it can stall TCP sockets, and even non-TCP sockets
will not be freeable until these SKBs are released by the device.

All transmit packets must be released in a finite amount of time by
the driver without requiring more traffic to induce this release.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen Multiqueue support [PATCH]

2007-08-28 Thread David Miller
From: Robert Olsson <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 16:55:19 +0200

> Below some pktgen support to send into different TX queues.
> This can of course be feed into input queues on other machines
> 
> Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>

Thanks for implementing this Robert, patch applied.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen multiqueue oops

2007-08-28 Thread David Miller
From: Robert Olsson <[EMAIL PROTECTED]>
Date: Mon, 27 Aug 2007 11:16:19 +0200

> Initially pkt_dev can be NULL this causes netif_subqueue_stopped to 
> oops. The patch below should cure it. But maybe the pktgen TX logic 
> should be reworked to better support the new multiqueue support. 
> 
> Signed-off-by: Robert Olsson <[EMAIL PROTECTED]>

Patch applied, thanks Robert.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2007-01-01 Thread David Miller
From: Robert Olsson <[EMAIL PROTECTED]>
Date: Fri, 1 Dec 2006 09:14:01 +0100

> 
> David Miller writes:
> 
>  > Agreed.
>  > 
>  > Robert, please fix this by using a completion so that we can
>  > wait for the threads to start up, something like this:
> 
> Included. It passes my test but Alexey and others test.

Ok, I'm going to put Robert's version of the fix into 2.6.19
and previous -stable branches.

But for 2.6.20 I'd like to do the following, based upon
Christoph's suggestion to convert to kthread.

Can folks please give this a spin?  I've tested that the
module loads and unloads properly, the threads startup
and shutdown, but that's it.

Thanks!

commit b3010a665cc33596fbf4be3fc6c3c5c80aeefb65
Author: David S. Miller <[EMAIL PROTECTED]>
Date:   Mon Jan 1 20:51:53 2007 -0800

[PKTGEN]: Convert to kthread API.

Based upon a suggestion from Christoph Hellwig.

This fixes various races in module load/unload handling
too.

Signed-off-by: David S. Miller <[EMAIL PROTECTED]>

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 1897a3a..04d4b93 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -148,6 +148,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -360,8 +361,7 @@ struct pktgen_thread {
spinlock_t if_lock;
struct list_head if_list;   /* All device here */
struct list_head th_list;
-   int removed;
-   char name[32];
+   struct task_struct *tsk;
char result[512];
u32 max_before_softirq; /* We'll call do_softirq to prevent starvation. 
*/
 
@@ -1689,7 +1689,7 @@ static int pktgen_thread_show(struct seq_file *seq, void 
*v)
BUG_ON(!t);
 
seq_printf(seq, "Name: %s  max_before_softirq: %d\n",
-  t->name, t->max_before_softirq);
+  t->tsk->comm, t->max_before_softirq);
 
seq_printf(seq, "Running: ");
 
@@ -3112,7 +3112,7 @@ static void pktgen_rem_thread(struct pktgen_thread *t)
 {
/* Remove from the thread list */
 
-   remove_proc_entry(t->name, pg_proc_dir);
+   remove_proc_entry(t->tsk->comm, pg_proc_dir);
 
mutex_lock(&pktgen_thread_lock);
 
@@ -3260,58 +3260,40 @@ out:;
  * Main loop of the thread goes here
  */
 
-static void pktgen_thread_worker(struct pktgen_thread *t)
+static int pktgen_thread_worker(void *arg)
 {
DEFINE_WAIT(wait);
+   struct pktgen_thread *t = arg;
struct pktgen_dev *pkt_dev = NULL;
int cpu = t->cpu;
-   sigset_t tmpsig;
u32 max_before_softirq;
u32 tx_since_softirq = 0;
 
-   daemonize("pktgen/%d", cpu);
-
-   /* Block all signals except SIGKILL, SIGSTOP and SIGTERM */
-
-   spin_lock_irq(¤t->sighand->siglock);
-   tmpsig = current->blocked;
-   siginitsetinv(¤t->blocked,
- sigmask(SIGKILL) | sigmask(SIGSTOP) | sigmask(SIGTERM));
-
-   recalc_sigpending();
-   spin_unlock_irq(¤t->sighand->siglock);
-
-   /* Migrate to the right CPU */
-   set_cpus_allowed(current, cpumask_of_cpu(cpu));
-   if (smp_processor_id() != cpu)
-   BUG();
+   BUG_ON(smp_processor_id() != cpu);
 
init_waitqueue_head(&t->queue);
 
-   t->control &= ~(T_TERMINATE);
-   t->control &= ~(T_RUN);
-   t->control &= ~(T_STOP);
-   t->control &= ~(T_REMDEVALL);
-   t->control &= ~(T_REMDEV);
-
t->pid = current->pid;
 
PG_DEBUG(printk("pktgen: starting pktgen/%d:  pid=%d\n", cpu, 
current->pid));
 
max_before_softirq = t->max_before_softirq;
 
-   __set_current_state(TASK_INTERRUPTIBLE);
-   mb();
+   set_current_state(TASK_INTERRUPTIBLE);
 
-   while (1) {
-
-   __set_current_state(TASK_RUNNING);
+   while (!kthread_should_stop()) {
+   pkt_dev = next_to_run(t);
 
-   /*
-* Get next dev to xmit -- if any.
-*/
+   if (!pkt_dev &&
+   (t->control & (T_STOP | T_RUN | T_REMDEVALL | T_REMDEV))
+   == 0) {
+   prepare_to_wait(&(t->queue), &wait,
+   TASK_INTERRUPTIBLE);
+   schedule_timeout(HZ / 10);
+   finish_wait(&(t->queue), &wait);
+   }
 
-   pkt_dev = next_to_run(t);
+   __set_current_state(TASK_RUNNING);
 
if (pkt_dev) {
 
@@ -3329,21 +3311,8 @@ static void pktgen_thread_worker(struct pktgen_thread *t)
do_softirq();
tx_since_softirq = 0;
}
-   } else {
-   prepare_to_wait(&(t->queue), &wait, TASK_INTERRUPTIBLE);
-   schedule_timeout(HZ / 10);
-   finish_wait(&(t->queue), &wait);
}
 
-   /*
-* Back from sleep, either due to the timeout or signal.

Re: pktgen

2006-12-01 Thread David Miller
From: "Alexey Dobriyan" <[EMAIL PROTECTED]>
Date: Fri, 1 Dec 2006 12:51:53 +0300

> On 12/1/06, Robert Olsson <[EMAIL PROTECTED]> wrote:
> > David Miller writes:
> >  > Agreed.
> >  >
> >  > Robert, please fix this by using a completion so that we can
> >  > wait for the threads to start up, something like this:
> >
> > Included. It passes my test but Alexey and others test.
> 
> Confused now. Is my "t->control &= ~(T_TERMINATE);" fix deprecated by
> completions?

The completions solve the bug completely.

But later on we should integate a change that eliminates
these spurious t->control bit clears at the start of the
pktgen thread execution, it just isn't needed to fix the
bug so we can make it later.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-12-01 Thread David Miller
From: Christoph Hellwig <[EMAIL PROTECTED]>
Date: Fri, 1 Dec 2006 08:22:25 +

> On Thu, Nov 30, 2006 at 08:14:23PM -0800, David Miller wrote:
> > Agreed.
> > 
> > Robert, please fix this by using a completion so that we can
> > wait for the threads to start up, something like this:
> 
> No, that's wrong aswell :)  Please use the kthread_ API that takes
> care of all this.  kernel_thread is going away mid-term, so you'd
> have to do this work anyway.

I was going to suggest this Christophe, but I wanted a change small
and easy enough to verify in order to merge the fix into the -stable
branch.  Converting the thing over to kthread would make the change
more risky in such a context.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-12-01 Thread Robert Olsson

Alexey Dobriyan writes:

 > 
 > Confused now. Is my "t->control &= ~(T_TERMINATE);" fix deprecated by
 > completions?
 
 It's not needed with completion patch as this does the job a bit more
 mainstream. The T_TERMINATE seems to work well I've tested on machine
 with CPU:s. Only once I noticed that only 2 of 4 threads got started 
 but it could be something else...

 And the race is hardly seen with any real use of pktgen.. .Let's hear
 what others... and completions was out-of-date.

 Cheers.
--ro
 
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-12-01 Thread Alexey Dobriyan

On 12/1/06, Robert Olsson <[EMAIL PROTECTED]> wrote:

David Miller writes:
 > Agreed.
 >
 > Robert, please fix this by using a completion so that we can
 > wait for the threads to start up, something like this:

Included. It passes my test but Alexey and others test.


Confused now. Is my "t->control &= ~(T_TERMINATE);" fix deprecated by
completions?


--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -147,6 +147,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -160,7 +161,7 @@
 #include/* do_div */
 #include 

-#define VERSION  "pktgen v2.68: Packet Generator for packet performance
testing.\n"
+#define VERSION  "pktgen v2.69: Packet Generator for packet performance
testing.\n"

 /* #define PG_DEBUG(a) a */
 #define PG_DEBUG(a)
@@ -206,6 +207,11 @@ static struct proc_dir_entry *pg_proc_di
 #define VLAN_TAG_SIZE(x) ((x)->vlan_id == 0x ? 0 : 4)
 #define SVLAN_TAG_SIZE(x) ((x)->svlan_id == 0x ? 0 : 4)

+struct pktgen_thread_info {
+   struct pktgen_thread *t;
+   struct completion *c;
+};
+
 struct flow_state {
__u32 cur_daddr;
int count;
@@ -3264,10 +3270,11 @@ out:;
  * Main loop of the thread goes here
  */

-static void pktgen_thread_worker(struct pktgen_thread *t)
+static void pktgen_thread_worker(struct pktgen_thread_info *info)
 {
DEFINE_WAIT(wait);
struct pktgen_dev *pkt_dev = NULL;
+   struct pktgen_thread *t = info->t;
int cpu = t->cpu;
sigset_t tmpsig;
u32 max_before_softirq;
@@ -3307,6 +3314,8 @@ static void pktgen_thread_worker(struct
__set_current_state(TASK_INTERRUPTIBLE);
mb();

+complete(info->c);
+
while (1) {

__set_current_state(TASK_RUNNING);
@@ -3518,6 +3527,8 @@ static struct pktgen_thread *__init pktg
 static int __init pktgen_create_thread(const char *name, int cpu)
 {
int err;
+   struct pktgen_thread_info info;
+struct completion started;
struct pktgen_thread *t = NULL;
struct proc_dir_entry *pe;

@@ -3558,7 +3569,11 @@ static int __init pktgen_create_thread(c

t->removed = 0;

-   err = kernel_thread((void *)pktgen_thread_worker, (void *)t,
+   init_completion(&started);
+info.t = t;
+info.c = &started;
+
+   err = kernel_thread((void *)pktgen_thread_worker, (void *)&info,
  CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
if (err < 0) {
printk("pktgen: kernel_thread() failed for cpu %d\n", t->cpu);
@@ -3568,6 +3583,7 @@ static int __init pktgen_create_thread(c
return err;
}

+   wait_for_completion(&started);
return 0;
 }

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-12-01 Thread Christoph Hellwig
On Thu, Nov 30, 2006 at 08:14:23PM -0800, David Miller wrote:
> Agreed.
> 
> Robert, please fix this by using a completion so that we can
> wait for the threads to start up, something like this:

No, that's wrong aswell :)  Please use the kthread_ API that takes
care of all this.  kernel_thread is going away mid-term, so you'd
have to do this work anyway.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-12-01 Thread Robert Olsson

David Miller writes:

 > Agreed.
 > 
 > Robert, please fix this by using a completion so that we can
 > wait for the threads to start up, something like this:

Included. It passes my test but Alexey and others test.
Cheers.
--ro

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 733d86d..a630a73 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -147,6 +147,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -160,7 +161,7 @@
 #include  /* do_div */
 #include 
 
-#define VERSION  "pktgen v2.68: Packet Generator for packet performance 
testing.\n"
+#define VERSION  "pktgen v2.69: Packet Generator for packet performance 
testing.\n"
 
 /* #define PG_DEBUG(a) a */
 #define PG_DEBUG(a)
@@ -206,6 +207,11 @@ static struct proc_dir_entry *pg_proc_di
 #define VLAN_TAG_SIZE(x) ((x)->vlan_id == 0x ? 0 : 4)
 #define SVLAN_TAG_SIZE(x) ((x)->svlan_id == 0x ? 0 : 4)
 
+struct pktgen_thread_info {
+   struct pktgen_thread *t;
+   struct completion *c;
+};
+
 struct flow_state {
__u32 cur_daddr;
int count;
@@ -3264,10 +3270,11 @@ out:;
  * Main loop of the thread goes here
  */
 
-static void pktgen_thread_worker(struct pktgen_thread *t)
+static void pktgen_thread_worker(struct pktgen_thread_info *info)
 {
DEFINE_WAIT(wait);
struct pktgen_dev *pkt_dev = NULL;
+   struct pktgen_thread *t = info->t;
int cpu = t->cpu;
sigset_t tmpsig;
u32 max_before_softirq;
@@ -3307,6 +3314,8 @@ static void pktgen_thread_worker(struct 
__set_current_state(TASK_INTERRUPTIBLE);
mb();
 
+complete(info->c);
+
while (1) {
 
__set_current_state(TASK_RUNNING);
@@ -3518,6 +3527,8 @@ static struct pktgen_thread *__init pktg
 static int __init pktgen_create_thread(const char *name, int cpu)
 {
int err;
+   struct pktgen_thread_info info;
+struct completion started;
struct pktgen_thread *t = NULL;
struct proc_dir_entry *pe;
 
@@ -3558,7 +3569,11 @@ static int __init pktgen_create_thread(c
 
t->removed = 0;
 
-   err = kernel_thread((void *)pktgen_thread_worker, (void *)t,
+   init_completion(&started);
+info.t = t;
+info.c = &started;
+
+   err = kernel_thread((void *)pktgen_thread_worker, (void *)&info,
  CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
if (err < 0) {
printk("pktgen: kernel_thread() failed for cpu %d\n", t->cpu);
@@ -3568,6 +3583,7 @@ static int __init pktgen_create_thread(c
return err;
}
 
+   wait_for_completion(&started);
return 0;
 }
 
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-30 Thread David Miller
From: Ben Greear <[EMAIL PROTECTED]>
Date: Thu, 30 Nov 2006 09:33:43 -0800

> Robert Olsson wrote:
> > @@ -3673,6 +3673,8 @@ static void __exit pg_cleanup(void)
> > struct list_head *q, *n;
> > wait_queue_head_t queue;
> > init_waitqueue_head(&queue);
> > +   
> > +   schedule_timeout_interruptible(msecs_to_jiffies(125));
> >  
> > /* Stop all interfaces & threads */
> >  
>
> That strikes me as a hack..surely there is a better method than just adding
> a sleep??

Agreed.

Robert, please fix this by using a completion so that we can
wait for the threads to start up, something like this:

...
#include 
...

struct pktgen_thread_info {
   struct pktgen_thread *t;
   struct completion *c;
};

static void pktgen_thread_worker(struct pktgen_thread_info *info)
{
struct pktgen_thread *t = info->t;
 ...
__set_current_state(TASK_INTERRUPTIBLE);
mb();

complete(info->c);
 ...
}

static int __init pktgen_create_thread(const char *name, int cpu)
{
 ...
struct pktgen_thread_info info;
struct complation started;
 ...
init_completion(&started);

info.t = t;
info.c = &started;

err = kernel_thread((void *)pktgen_thread_worker, (void *)t,
  CLONE_FS | CLONE_FILES | CLONE_SIGHAND);
if (err < 0) {
printk("pktgen: kernel_thread() failed for cpu %d\n", t->cpu);
remove_proc_entry(t->name, pg_proc_dir);
list_del(&t->th_list);
kfree(t);
return err;
}

wait_for_completion(&started);
return 0;
}

We can horse around with fixing the initial t->control bit flipping
in pktgen_thread_worker() in a future changeset if we want.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-30 Thread Ben Greear

Robert Olsson wrote:

Hello!

Seems you found a race when rmmod is done before it's fully started

Try:

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 733d86d..ac0b4b1 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -160,7 +160,7 @@
 #include/* do_div */
 #include 
 
-#define VERSION  "pktgen v2.68: Packet Generator for packet performance testing.\n"

+#define VERSION  "pktgen v2.69: Packet Generator for packet performance 
testing.\n"
 
 /* #define PG_DEBUG(a) a */

 #define PG_DEBUG(a)
@@ -3673,6 +3673,8 @@ static void __exit pg_cleanup(void)
struct list_head *q, *n;
wait_queue_head_t queue;
init_waitqueue_head(&queue);
+   
+   schedule_timeout_interruptible(msecs_to_jiffies(125));
 
 	/* Stop all interfaces & threads */
 
  

That strikes me as a hack..surely there is a better method than just adding
a sleep??

Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-30 Thread Robert Olsson


Hello!

Seems you found a race when rmmod is done before it's fully started

Try:

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 733d86d..ac0b4b1 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -160,7 +160,7 @@
 #include  /* do_div */
 #include 
 
-#define VERSION  "pktgen v2.68: Packet Generator for packet performance 
testing.\n"
+#define VERSION  "pktgen v2.69: Packet Generator for packet performance 
testing.\n"
 
 /* #define PG_DEBUG(a) a */
 #define PG_DEBUG(a)
@@ -3673,6 +3673,8 @@ static void __exit pg_cleanup(void)
struct list_head *q, *n;
wait_queue_head_t queue;
init_waitqueue_head(&queue);
+   
+   schedule_timeout_interruptible(msecs_to_jiffies(125));
 
/* Stop all interfaces & threads */
 


for i in 1 2 3 4 5 ; do modprobe pktgen ; rmmod pktgen ; done

In dmesg
pktgen v2.69: Packet Generator for packet performance testing.
pktgen v2.69: Packet Generator for packet performance testing.
pktgen v2.69: Packet Generator for packet performance testing.
pktgen v2.69: Packet Generator for packet performance testing.
pktgen v2.69: Packet Generator for packet performance testing.

Cheers.
--ro



Alexey Dobriyan writes:
 > On 11/30/06, David Miller <[EMAIL PROTECTED]> wrote:
 > > From: Alexey Dobriyan <[EMAIL PROTECTED]>
 > > Date: Wed, 29 Nov 2006 23:04:37 +0300
 > >
 > > > Looks like worker thread strategically clears it if scheduled at wrong
 > > > moment.
 > > >
 > > > --- a/net/core/pktgen.c
 > > > +++ b/net/core/pktgen.c
 > > > @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
 > > >
 > > >  init_waitqueue_head(&t->queue);
 > > >
 > > > -t->control &= ~(T_TERMINATE);
 > > >  t->control &= ~(T_RUN);
 > > >  t->control &= ~(T_STOP);
 > > >  t->control &= ~(T_REMDEVALL);
 > >
 > > Good catch Alexey.  Did you rerun the load/unload test with
 > > this fix applied?  If it fixes things, I'll merge it.
 > 
 > Well, yes, it fixes things, except Ctrl+C getting you out of
 > modprobe/rmmod loop will spit
 > backtrace again. And other flags: T_RUN, T_STOP. Clearance is not
 > needed due to kZalloc and
 > create bugs as demostrated.
 > 
 > Give me some time.
 > -
 > To unsubscribe from this list: send the line "unsubscribe netdev" in
 > the body of a message to [EMAIL PROTECTED]
 > More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-29 Thread Alexey Dobriyan

On 11/30/06, David Miller <[EMAIL PROTECTED]> wrote:

From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 23:04:37 +0300

> Looks like worker thread strategically clears it if scheduled at wrong
> moment.
>
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
>
>init_waitqueue_head(&t->queue);
>
> -  t->control &= ~(T_TERMINATE);
>t->control &= ~(T_RUN);
>t->control &= ~(T_STOP);
>t->control &= ~(T_REMDEVALL);

Good catch Alexey.  Did you rerun the load/unload test with
this fix applied?  If it fixes things, I'll merge it.


Well, yes, it fixes things, except Ctrl+C getting you out of
modprobe/rmmod loop will spit
backtrace again. And other flags: T_RUN, T_STOP. Clearance is not
needed due to kZalloc and
create bugs as demostrated.

Give me some time.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-29 Thread David Miller
From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Wed, 29 Nov 2006 23:04:37 +0300

> Looks like worker thread strategically clears it if scheduled at wrong
> moment.
> 
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
>  
>   init_waitqueue_head(&t->queue);
>  
> - t->control &= ~(T_TERMINATE);
>   t->control &= ~(T_RUN);
>   t->control &= ~(T_STOP);
>   t->control &= ~(T_REMDEVALL);

Good catch Alexey.  Did you rerun the load/unload test with
this fix applied?  If it fixes things, I'll merge it.

Thanks!
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-29 Thread Alexey Dobriyan
On Tue, Nov 28, 2006 at 03:33:25PM -0800, David Miller wrote:
> From: Alexey Dobriyan <[EMAIL PROTECTED]>
> Date: Wed, 22 Nov 2006 00:22:51 +0300
> 
> > [CCing netdev, bug in pktgen]
> > 
> > [build modular pktgen]
> > while true; do modprobe pktgen && rmmod pktgen; done
> > 
> > BUG: warning at fs/proc/generic.c:732/remove_proc_entry()
> >  [] remove_proc_entry+0x161/0x1ca
> >  [] pg_cleanup+0xd5/0xdc [pktgen]
> >  [] autoremove_wake_function+0x0/0x35
> >  [] sys_delete_module+0x162/0x189
> >  [] remove_vma+0x31/0x36
> >  [] do_munmap+0x193/0x1ac
> >  [] sysenter_past_esp+0x56/0x79
> >  [] fn_hash_delete+0x4f/0x1c7
> > 
> > On Tue, Nov 21, 2006 at 09:36:46PM +0100, Pavol Gono wrote:
> > > I am going to add two more:
> > > for i in 1 2 3 4 5 ; do modprobe pktgen ; rmmod pktgen ; done
> > 
> > Looks like it creates /proc/net/pktgen/kpktgen_%i but forgets to remove
> > them.
>
> It's pretty careful to delete all of the entries under
> /proc/net/pktgen/.
>
> When the module is brought down, it walks the list of threads
> and brings them down by setting T_TERMINATE in t->control.

Looks like worker thread strategically clears it if scheduled at wrong
moment.

--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3292,7 +3292,6 @@ static void pktgen_thread_worker(struct
 
init_waitqueue_head(&t->queue);
 
-   t->control &= ~(T_TERMINATE);
t->control &= ~(T_RUN);
t->control &= ~(T_STOP);
t->control &= ~(T_REMDEVALL);

> This makes the thread break out of it's loop and run:
>
>   pktgen_stop(t);
>   pktgen_rem_all_ifs(t);
>   pktgen_rem_thread(t);

Kernel seeems to survive, but when I hit Ctrl+C after half
a minute backtrace is back being the very last dmesg lines.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen

2006-11-28 Thread David Miller
From: Alexey Dobriyan <[EMAIL PROTECTED]>
Date: Wed, 22 Nov 2006 00:22:51 +0300

> [CCing netdev, bug in pktgen]
> 
>   [build modular pktgen]
>   while true; do modprobe pktgen && rmmod pktgen; done
> 
>   BUG: warning at fs/proc/generic.c:732/remove_proc_entry()
>[] remove_proc_entry+0x161/0x1ca
>[] pg_cleanup+0xd5/0xdc [pktgen]
>  [] autoremove_wake_function+0x0/0x35
>[] sys_delete_module+0x162/0x189
>[] remove_vma+0x31/0x36
>[] do_munmap+0x193/0x1ac
>[] sysenter_past_esp+0x56/0x79
>[] fn_hash_delete+0x4f/0x1c7
> 
> On Tue, Nov 21, 2006 at 09:36:46PM +0100, Pavol Gono wrote:
> > I am going to add two more:
> > for i in 1 2 3 4 5 ; do modprobe pktgen ; rmmod pktgen ; done
> 
> Looks like it creates /proc/net/pktgen/kpktgen_%i but forgets to remove
> them.

It's pretty careful to delete all of the entries under
/proc/net/pktgen/.

When the module is brought down, it walks the list of threads
and brings them down by setting T_TERMINATE in t->control.
This makes the thread break out of it's loop and run:

pktgen_stop(t);
pktgen_rem_all_ifs(t);
pktgen_rem_thread(t);

pktgen_rem_all_ifs() will delete all device entries in
/proc/net/pktgen/

Next, pktgen_rem_thread() will kill off the entry for
/proc/net/pktgen/kpkgen_%i

Finally, the top-level module remove code will delete the
control file right before it tries to kill off the directory
via:

/* Clean up proc file system */
remove_proc_entry(PGCTRL, pg_proc_dir);
proc_net_remove(PG_PROC_DIR);

So I don't see any bugs that could cause this.

One possibility is that the thread's t->name is being
corrupted somehow, that would make the remove_proc_entry()
fail and cause the behavior you see.  But I can't see anything
obvious that might do that either.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-06 Thread Robert Olsson

Ben Greear writes:
 > > Changes:
 > > * use a nano-second timer based on the scheduler timer (TSC) for relative 
 > > times, instead of get_time_of_day.

Seems I missed to set tsc as clocksource. It makes a difference. Performance is 
normal and I'm less confused.

e1000 82546GB @ 1.6 GHz Opteron. Kernel 2.6.19-rc1_Bifrost_-g18e199c6-dirty 

echo acpi_pm> 
/sys/devices/system/clocksource/clocksource0/current_clocksource   

psize pps
-
60  556333
124  526942
252  452981
508  234996
1020  119748
1496  82248

echo tsc> 
/sys/devices/system/clocksource/clocksource0/current_clocksource 

psize pps
-
60  819914
124  747286
252  452975
508  234993
1020  119749
1496  82247

Cheers. 
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-06 Thread Robert Olsson

jamal writes:

 > If you are listening then start with:
 > 
 > 1) Do a simple test with just udp traffic as above, doing simple
 > accounting. This helps you to get a feel on how things work.
 > 2) modify the matching rules to match your magic cookie
 > 3) write a simple action invoked by your matching rules and use tc to
 > add it to the policy.
 > 4) integrate in your app now that you know what you are doing.

 Yes. Sounds like simple and general solution. No call-backs, no #ifdef's
 no extra modules. Just a little recipe in pktgen.txt

 Cheers.
--ro
 
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-04 Thread Ben Greear

jamal wrote:

On Sat, 2006-04-11 at 09:29 -0800, Ben Greear wrote:
  



You can ofcourse add many of these based on other header info.

It seems to me your magic header is inside the UDP packet, correct?
In that case you will have to play with matches since you can specify
arbitraty offsets and values inside the packet.
  
Yes, I want to be able to randomize source/dest MAC, IP, and IP Port, so 
I need to match only on the
magic cookie.  I'll soon add support to deal with TCP packets as well, 
but that is only going to require

calculating a different offset to look for the magic value.

Of course you can do this from your application instead of using tc.
I think that will be the best place to control this and sync with
pktgen sending.
  
You mean the equiv of calling system(tc filter .) from the app, or 
do you mean something else

entirely?
the rest can be in a method 
called after you match in your script?





If the packet/byte counters that the drop action provides are not good
enough, you can write yourself a little module that will do the
accounting the way you want it. For example this will be necessary to
the out-of-sequence cheques below. Timestamps are already being updated
Look at net/sched/act_simple.c and its associated user space code in
iproute2.

Now, Ben - are you listening really this time or am i wasting my time
for the nth time giving you all these details? ;->
  
You always give details that are pertinent to your setup and not to 
mine, and you
always have some 'small' step that is 'write a kernel module and hack on 
tc' if you

want it to do something other than the generic thing.

If you are listening then start with:

1) Do a simple test with just udp traffic as above, doing simple
accounting. This helps you to get a feel on how things work.
2) modify the matching rules to match your magic cookie
  
These first two do not look too difficult.  I certainly need more 
control/stats than
what the generic counters would be, however, so I assume I need to move 
to step 3.

3) write a simple action invoked by your matching rules and use tc to
add it to the policy.
  
I have exactly no idea of what you mean by this.  How do I get this tc 
framework to
send the packet with 32-bit value FOO at offset X to some method called 
pktgen_rx_pkt(skb)
in the pktgen module?  That is all I want in a hook:  The ability to 
match on a pkt and send it
to my kernel module and to let my kernel module and have it traverse the 
stack no farther

unless my kernel module decides to re-insert it for some reason.

If this takes significant work, then please don't waste more time trying 
to talk me
into this, as my small patch to dev.c already does exactly what I need 
and the code

is very easy to understand.

If this is easy, how about you write the module and make the tc 
changes.  I'll work on any
changes needed in pktgen and polish up a nice patch for Robert et 
al  Then you can get
the latency distributions, OOO, Dup, Dropped and other counters that my 
pktgen logic

provides for your own testing purposes...

Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-04 Thread jamal
On Sat, 2006-04-11 at 09:29 -0800, Ben Greear wrote:
> jamal wrote:
   
> Please do send a script.  I match based on this method below.   Probably 
> you only
> need the part that checks for the MAGIC, 

What i do in my case is send to the SUT to UDP port 9. The SUT loops
back the packet to me after processing and i just have a simple check in
the traffic generator of:

a) Packet = IPV4, IP address matches what i want, port =9, 
b) If this matches i  account and drop it. This means all other traffic
(like ssh etc goes through fine).

I dont really need to check for length because _i know_ udp port 9 to
the SUT is test traffic.

For such a setup, the script would be as simple as (assuming SUT hooked
to eth0):

--
#interested in incoming packets on eth0
tc qdisc add dev eth0 ingress
#interested in packets coming from SUT(10.0.0.21) on UDP port 9
tc filter add dev eth0 parent : protocol ip prio 6 u32 \
match ip src 10.0.0.21/32 \
match udp port 9 0x \
flowid 1:16 \
action drop
---

When you do a listing of this filter you will see accounting info.
You list by saying:
---
tc -s filter ls dev eth0 parent :
---

You can ofcourse add many of these based on other header info.

It seems to me your magic header is inside the UDP packet, correct?
In that case you will have to play with matches since you can specify
arbitraty offsets and values inside the packet.

Of course you can do this from your application instead of using tc.
I think that will be the best place to control this and sync with
pktgen sending.

> the rest can be in a method 
> called after you match in your script?
> 

If the packet/byte counters that the drop action provides are not good
enough, you can write yourself a little module that will do the
accounting the way you want it. For example this will be necessary to
the out-of-sequence cheques below. Timestamps are already being updated
Look at net/sched/act_simple.c and its associated user space code in
iproute2.

Now, Ben - are you listening really this time or am i wasting my time
for the nth time giving you all these details? ;->

If you are listening then start with:

1) Do a simple test with just udp traffic as above, doing simple
accounting. This helps you to get a feel on how things work.
2) modify the matching rules to match your magic cookie
3) write a simple action invoked by your matching rules and use tc to
add it to the policy.
4) integrate in your app now that you know what you are doing.

If you follow these steps or a variant-of i will spend time and help.

cheers,
jamal


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-04 Thread Ben Greear

jamal wrote:

On Wed, 2006-01-11 at 11:11 -0800, Ben Greear wrote:

  
I'd be thrilled to have the receive logic go into pktgen, even if 
it was #if 0 with a comment
showing how to patch dev.c to get it working.  It would make my 
out-of-tree patch smaller

and should help others who are doing research and driver development...




I use pktgen extensively these days for ipsec testing. I also record
stats for pkts i receive using a tc action drop. Very simple and works
great.  
Ben, you keep insisting on doing this hook (every 3 months) because you

dont want to invest time to do it with netlink (I thought you sold this
as part of your product - the investment of your time is really not that
high). If you tell me what the packet trigger is, I will send you a
script ;->
  
Please do send a script.  I match based on this method below.   Probably 
you only
need the part that checks for the MAGIC, the rest can be in a method 
called after

you match in your script?

/* Returns < 0 if the skb is not a pktgen buffer. */
int pktgen_receive(struct sk_buff* skb) {
   /* See if we have a pktgen packet */
   /* TODO:  Add support for detecting IPv6, TCP packets too.  This 
will only

* catch UDP at the moment. --Ben
*/
   /*printk("pktgen-rcv, skb->len: %d\n", skb->len);*/
  
   if ((skb->len >= (20 + 8 + sizeof(struct pktgen_hdr))) &&

   (skb->protocol == __constant_htons(ETH_P_IP))) {
   struct pktgen_hdr* pgh;
  
   /* It's IP, and long enough, lets check the magic number.
* TODO:  This is a hack not always guaranteed to catch 
the right

* packets.
*/
  
   /*printk("Length & protocol passed, skb->data: %p, raw: %p\n",

 skb->data, skb->h.raw);*/
 
   pgh = (struct pktgen_hdr*)(skb->data + 20 + 8);
  
   /*

   tmp = (char*)(skb->data);
   for (i = 0; i<90; i++) {
   printk("%02hx ", tmp[i]);
   if (((i + 1) % 15) == 0) {
   printk("\n");
   }
   }
   printk("\n");
   */
  
   if (pgh->pgh_magic == __constant_ntohl(PKTGEN_MAGIC)) {

   struct net_device* dev = skb->dev;
   struct pktgen_dev* pkt_dev;
   __u32 seq = ntohl(pgh->seq_num);

   // TODO:  Need lock..maybe
   pkt_dev = dev->pkt_dev;
  
   if (!pkt_dev) {

   return -1;
   }

   pkt_dev->pkts_rcvd++;
   pkt_dev->bytes_rcvd += ((skb->tail - 
skb->mac.raw) + 4); /* +4 for the checksum */
  
   /* Check for out-of-sequence packets */

   if (pkt_dev->last_seq_rcvd == seq) {
   pkt_dev->dup_rcvd++;
   pkt_dev->dup_since_incr++;
   }
   else {
   __s64 rx;
   __s64 tx;
   struct timeval txtv;
   if (skb->tstamp.off_sec || skb->tstamp.off_usec) {
   skb_get_timestamp(skb, &txtv);
   }
   else {
   do_gettimeofday(&txtv);
   skb_set_timestamp(skb, &txtv);
   }
   rx = tv_to_us(&txtv);

   txtv.tv_usec = ntohl(pgh->tv_usec);
   txtv.tv_sec = ntohl(pgh->tv_sec);
   tx = tv_to_us(&txtv);
   record_latency(pkt_dev, rx - tx);
  
   if ((pkt_dev->last_seq_rcvd + 1) == seq) {

   if ((pkt_dev->peer_clone_skb > 1) &&
   (pkt_dev->peer_clone_skb > 
(pkt_dev->dup_since_incr + 1))) {
  
   pkt_dev->seq_gap_rcvd += 
(pkt_dev->peer_clone_skb -
  
pkt_dev->dup_since_incr - 1);

   }
   /* Great, in order...all is well */
   }
   else if (pkt_dev->last_seq_rcvd < seq) {
   /* sequence gap, means we 
dropped a pkt most likely */

   if (pkt_dev->peer_clone_skb > 1) {
   /* We dropped more than 
one sequence number's worth,
* and if we're using 
clone_skb, then this is quite
* a few.  This n

Re: pktgen patch available for perusal.

2006-11-04 Thread jamal
On Wed, 2006-01-11 at 11:11 -0800, Ben Greear wrote:

> I'd be thrilled to have the receive logic go into pktgen, even if 
> it was #if 0 with a comment
> showing how to patch dev.c to get it working.  It would make my 
> out-of-tree patch smaller
> and should help others who are doing research and driver development...
> 

I use pktgen extensively these days for ipsec testing. I also record
stats for pkts i receive using a tc action drop. Very simple and works
great.  
Ben, you keep insisting on doing this hook (every 3 months) because you
dont want to invest time to do it with netlink (I thought you sold this
as part of your product - the investment of your time is really not that
high). If you tell me what the packet trigger is, I will send you a
script ;->

Robert, I have started writing some generic netlink messaging to replace
the /proc that i will send your way. I need to get consensus on some
tunnel-mode IPSEC patches first then i will send the about three
patchsets your way (to replace the old ones i sent earlier).

cheers,
jamal

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-03 Thread Ben Greear

Robert Olsson wrote:

Ben Greear writes:

 > It requires a hook in dev.c, or at least that is the only way I can 
 > think to implement it.


Well the hook be placed along the packet path even in drivers. In tulip I didn't 
even take packet of the ring in some experiments. 


And there plenty of existing hooks already i.e PRE_ROUTE?


For my particular application the hook needs to be right
after the bridge hook.  My dev.c hook looks like this:

#if defined(CONFIG_NET_PKTGEN) || defined(CONFIG_NET_PKTGEN_MODULE)
#include "pktgen.h"

#warning "Compiling dev.c for pktgen.";

int (*handle_pktgen_hook)(struct sk_buff *skb) = NULL;
EXPORT_SYMBOL(handle_pktgen_hook);

static __inline__ int handle_pktgen_rcv(struct sk_buff* skb) {
if (handle_pktgen_hook) {
return handle_pktgen_hook(skb);
}
return -1;
}
#endif

.

#if defined(CONFIG_NET_PKTGEN) || defined(CONFIG_NET_PKTGEN_MODULE)
if ((skb->dev->pkt_dev) &&
(handle_pktgen_rcv(skb) >= 0)) {
/* Pktgen may consume the packet, no need to send
 * to further protocols.
 */
goto out;
}
#endif


If the rx-hook logic is already in pktgen module, and it's symbol
exported, perhaps a second hook module could be written that
would just be the bridge between a driver or a PRE-ROUTE hook
and pktgen?  The hook module would probably not be much larger
than the code above...


Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-03 Thread Robert Olsson

Ben Greear writes:

 > It requires a hook in dev.c, or at least that is the only way I can 
 > think to implement it.

Well the hook be placed along the packet path even in drivers. In tulip I 
didn't 
even take packet of the ring in some experiments. 

And there plenty of existing hooks already i.e PRE_ROUTE?

 > I don't think that hook is going to be allowed into the kernel, so the 
 > best I was hoping for  was to have the code in pktgen with #if 0 and let 
 > users patch their kernel to add the hook...

Right so what I was trying to say was rather having #if 0 and dead code in 
pktgen it could be kept separate. 

Cheers.
--ro

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-03 Thread Ben Greear

Robert Olsson wrote:

Ben Greear writes:

 > I'd be thrilled to have the receive logic go into pktgen, even if it was #if 
0 with a comment
 > showing how to patch dev.c to get it working.  It would make my out-of-tree 
patch smaller
 > and should help others who are doing research and driver development...


 Just an idea... RX part is pretty separate from TX. How about if we keep the 
 this code in a small seprate optional module? It can developed into some general 
 RX probe.
  
It requires a hook in dev.c, or at least that is the only way I can 
think to implement it.
I don't think that hook is going to be allowed into the kernel, so the 
best I was hoping for
was to have the code in pktgen with #if 0 and let users patch their 
kernel to add the

hook...

Ben


 Cheers.
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
  



--
Ben Greear <[EMAIL PROTECTED]> 
Candela Technologies Inc  http://www.candelatech.com



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-03 Thread Robert Olsson

Ben Greear writes:

 > I'd be thrilled to have the receive logic go into pktgen, even if it was #if 
 > 0 with a comment
 > showing how to patch dev.c to get it working.  It would make my out-of-tree 
 > patch smaller
 > and should help others who are doing research and driver development...


 Just an idea... RX part is pretty separate from TX. How about if we keep the 
 this code in a small seprate optional module? It can developed into some 
general 
 RX probe.

 Cheers.
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen patch available for perusal.

2006-11-01 Thread Ben Greear

Robert Olsson wrote:

Ben Greear writes:
 > I've completed the first pass of my changes to pktgen in 2.6.18.
 > Many of these features are probably DOA based on previous conversations,
 > but perhaps this will help someone

 Thanks. Well sometimes there is a need to capture and drop pkts and various 
 points, so it handy to have code fragments ready and just add the hook when 
 it's needed in driver or rx softirq etc.


I'd be thrilled to have the receive logic go into pktgen, even if it was #if 0 
with a comment
showing how to patch dev.c to get it working.  It would make my out-of-tree 
patch smaller
and should help others who are doing research and driver development...

If you'd accept a patch like this, please let me know.


 > Changes:
 > * use a nano-second timer based on the scheduler timer (TSC) for relative 
times, instead of get_time_of_day.

 Any chance you extract this one so we can compare with get_time_of_day. This 
pops up
 in the profiles and TX numbers are not what used to be.


I'll work on that.

Thanks,
Ben



 Cheers.
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html



--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PKTGEN: Adds thread limit parameter.

2006-03-13 Thread akepner

On Mon, 13 Mar 2006, Luiz Fernando Capitulino wrote:



BTW Robert, my TODO list for pktgen so far is:


* A new command called 'rem_device' to remove one device at a time
(currently, we can only remove all devices in one shoot with
'rem_devices_all')



This should be very simple to implement - there's now a
"T_REMDEV" flag in addition to "T_REMDEVALL", though no
"rem_device" method has been added. ("T_REMDEV" is currently
used only when a NETDEV_UNREGISTER notification is
received for the device.)

--
Arthur

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PKTGEN: Adds thread limit parameter.

2006-03-13 Thread Luiz Fernando Capitulino
On Mon, 13 Mar 2006 14:29:17 +0100
Robert Olsson <[EMAIL PROTECTED]> wrote:

| 
| Luiz Fernando Capitulino writes:
| 
|  >  Well, I wouldn't say it's _really_ needed. But it really avoids having
|  > too many thread entries in the pktgen's /proc directory, and as a good
|  > result, you will not have pending threads which will never run as well.
|  > 
|  >  Also note that the patch is trivial, if you look at it in detail,
|  > you'll see that the biggest change we have is the 'if' part. The rest I
|  > would call cosmetic because the behaivor is the same.
|  > 
|  > | If so -- Wouldn't a concept of a bitmask to control also which CPU's
|  > | that runs the threads be more general?
|  > 
|  >  Sounds like a bit complex, and would be my turn to ask if it's
|  > really needed. :)
| 
|  A bit set for each CPU there will a pktgen process running.

 Looks interesting. Could you give a few more details? Or an example
somewhere in the kernel (if any) for what you have in mind.

|  >  * Some minor fixes and cleanups, like functions returns being not
|  >  checked.
|  > 
|  >  * A new command called 'rem_device' to remove one device at a time
|  > (currently, we can only remove all devices in one shoot with
|  > 'rem_devices_all')
|  > 
|  >  * Ports pktgen to use the kernel thread API
| 
|  The current model was chosen simplicity and to bound a device 
|  to a specific CPU so it never cahanges. A change will need to 
|  carefully tested.

 Sure thing.

|  >  * cleanup the debug function usage
|  
|  I would like to remove the do_softirq stuff from pktgen... 

 Added to the TODO list. :)

-- 
Luiz Fernando N. Capitulino
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PKTGEN: Adds thread limit parameter.

2006-03-13 Thread Robert Olsson

Luiz Fernando Capitulino writes:

 >  Well, I wouldn't say it's _really_ needed. But it really avoids having
 > too many thread entries in the pktgen's /proc directory, and as a good
 > result, you will not have pending threads which will never run as well.
 > 
 >  Also note that the patch is trivial, if you look at it in detail,
 > you'll see that the biggest change we have is the 'if' part. The rest I
 > would call cosmetic because the behaivor is the same.
 > 
 > | If so -- Wouldn't a concept of a bitmask to control also which CPU's
 > | that runs the threads be more general?
 > 
 >  Sounds like a bit complex, and would be my turn to ask if it's
 > really needed. :)

 A bit set for each CPU there will a pktgen process running.
  
 >  * Some minor fixes and cleanups, like functions returns being not
 >  checked.
 > 
 >  * A new command called 'rem_device' to remove one device at a time
 > (currently, we can only remove all devices in one shoot with
 > 'rem_devices_all')
 > 
 >  * Ports pktgen to use the kernel thread API

 The current model was chosen simplicity and to bound a device 
 to a specific CPU so it never cahanges. A change will need to 
 carefully tested.
 
 >  * cleanup the debug function usage
 
 I would like to remove the do_softirq stuff from pktgen... 

 Cheers.
--ro
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PKTGEN: Adds thread limit parameter.

2006-03-13 Thread Luiz Fernando Capitulino

 Hi Robert,

On Mon, 13 Mar 2006 10:44:49 +0100
Robert Olsson <[EMAIL PROTECTED]> wrote:

| 
| Hello!
| 
| Really needed? 

 Well, I wouldn't say it's _really_ needed. But it really avoids having
too many thread entries in the pktgen's /proc directory, and as a good
result, you will not have pending threads which will never run as well.

 Also note that the patch is trivial, if you look at it in detail,
you'll see that the biggest change we have is the 'if' part. The rest I
would call cosmetic because the behaivor is the same.

| If so -- Wouldn't a concept of a bitmask to control also which CPU's
| that runs the threads be more general?

 Sounds like a bit complex, and would be my turn to ask if it's
really needed. :)

 BTW Robert, my TODO list for pktgen so far is:

 * Some minor fixes and cleanups, like functions returns being not
 checked.

 * A new command called 'rem_device' to remove one device at a time
(currently, we can only remove all devices in one shoot with
'rem_devices_all')

 * Ports pktgen to use the kernel thread API

 * cleanup the debug function usage

 Would be good to hear from you if they really makes sense.

 Thank you for the feedback,

| Luiz Fernando Capitulino writes:
|  > 
|  > Currently, pktgen will create one thread for each online CPU in the
|  > system. That behaivor may be annoying if you're using pktgen in a
|  > large SMP system.
|  > 
|  > This patch adds a new module parameter called 'pg_max_threads', which
|  > can be set to the maximum number of threads pktgen should create.
|  > 
|  > For example, if you're playing with pktgen in SMP system with 8
|  > CPUs, but want only two pktgen's threads, you can do:
|  > 
|  >modprobe pktgen pg_max_threads=2
|  > 
|  > Signed-off-by: Luiz Capitulino <[EMAIL PROTECTED]>
|  > 
|  > ---
|  > 
|  >  net/core/pktgen.c |   23 +--
|  >  1 files changed, 17 insertions(+), 6 deletions(-)
|  > 
|  > e210ad47d0db52496fdaabdd50bfe6ee0bcc53cd
|  > diff --git a/net/core/pktgen.c b/net/core/pktgen.c
|  > index 37b25a6..994aef1 100644
|  > --- a/net/core/pktgen.c
|  > +++ b/net/core/pktgen.c
|  > @@ -154,7 +154,7 @@
|  >  #include /* do_div */
|  >  #include 
|  >  
|  > -#define VERSION  "pktgen v2.66: Packet Generator for packet performance 
testing.\n"
|  > +#define VERSION  "pktgen v2.67: Packet Generator for packet performance 
testing.\n"
|  >  
|  >  /* #define PG_DEBUG(a) a */
|  >  #define PG_DEBUG(a)
|  > @@ -488,6 +488,7 @@ static unsigned int fmt_ip6(char *s, con
|  >  static int pg_count_d = 1000; /* 1000 pkts by default */
|  >  static int pg_delay_d;
|  >  static int pg_clone_skb_d;
|  > +static int pg_max_threads;
|  >  static int debug;
|  >  
|  >  static DEFINE_MUTEX(pktgen_thread_lock);
|  > @@ -3184,7 +3185,7 @@ static int pktgen_remove_device(struct p
|  >  
|  >  static int __init pg_init(void)
|  >  {
|  > -  int cpu;
|  > +  int i, threads;
|  >struct proc_dir_entry *pe;
|  >  
|  >printk(version);
|  > @@ -3208,15 +3209,24 @@ static int __init pg_init(void)
|  >/* Register us to receive netdevice events */
|  >register_netdevice_notifier(&pktgen_notifier_block);
|  >  
|  > -  for_each_online_cpu(cpu) {
|  > +  threads = num_online_cpus();
|  > +
|  > +  /*
|  > +   * Check if we should have less threads than the number
|  > +   * of online CPUs
|  > +   */
|  > +  if ((pg_max_threads > 0) && (pg_max_threads < threads))
|  > +  threads = pg_max_threads;
|  > +
|  > +  for (i = 0; i < threads; i++) {
|  >int err;
|  >char buf[30];
|  >  
|  > -  sprintf(buf, "kpktgend_%i", cpu);
|  > -  err = pktgen_create_thread(buf, cpu);
|  > +  sprintf(buf, "kpktgend_%i", i);
|  > +  err = pktgen_create_thread(buf, i);
|  >if (err)
|  >printk("pktgen: WARNING: Cannot create thread for cpu 
%d (%d)\n",
|  > -  cpu, err);
|  > +  i, err);
|  >}
|  >  
|  >if (list_empty(&pktgen_threads)) {
|  > @@ -3263,4 +3273,5 @@ MODULE_LICENSE("GPL");
|  >  module_param(pg_count_d, int, 0);
|  >  module_param(pg_delay_d, int, 0);
|  >  module_param(pg_clone_skb_d, int, 0);
|  > +module_param(pg_max_threads, int, 0);
|  >  module_param(debug, int, 0);
|  > -- 
|  > 1.2.4.gbe76
|  > 
|  > 
|  > 
|  > -- 
|  > Luiz Fernando N. Capitulino


-- 
Luiz Fernando N. Capitulino
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PKTGEN 3/3]: Updates version.

2006-03-03 Thread David S. Miller
From: Luiz Fernando Capitulino <[EMAIL PROTECTED]>
Date: Wed, 1 Mar 2006 23:05:54 -0300

> 
>  Due to the thread's lock changes, we're at a new version now.
> 
> Signed-off-by: Luiz Capitulino <[EMAIL PROTECTED]>

Applied, thanks a lot.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PKTGEN 2/3]: Removes thread_{un,}lock() macros.

2006-03-03 Thread David S. Miller
From: Luiz Fernando Capitulino <[EMAIL PROTECTED]>
Date: Wed, 1 Mar 2006 23:05:48 -0300

> 
>  As suggested by Arnaldo, this patch replaces the 
> thread_lock()/thread_unlock()
> by directly calls to mutex_lock()/mutex_unlock().
> 
>  This change makes the code a bit more readable, and the direct calls are used
> everywhere in the kernel.
> 
> Signed-off-by: Luiz Capitulino <[EMAIL PROTECTED]>

Applied.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PKTGEN 1/3]: Convert thread lock to mutexes.

2006-03-03 Thread David S. Miller
From: Luiz Fernando Capitulino <[EMAIL PROTECTED]>
Date: Wed, 1 Mar 2006 23:05:43 -0300

> 
>  pktgen's thread semaphores are strict mutexes, convert them to the mutex
> implementation.
> 
> Signed-off-by: Luiz Capitulino <[EMAIL PROTECTED]>

Applied.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen + napi == kaboom

2006-02-23 Thread Jesse Brandeburg
On 2/22/06, Simon Kirby <[EMAIL PROTECTED]> wrote:
> Of course, now it doesn't send as fast.  Hrmph. :)  On this older Xeon
> 2.4 Ghz w/533 FSB and e1000 & tg3 @ PCI-X 133 Mhz 64 bit, SMP kernel,
> single pktgen thread, I'm only seeing:
>
> clone_skb=0, 802.1Q tagging, 60 byte:
> e1000: 558526pps 268Mb/sec (268092480bps) errors: 0
> tg3: 621260pps 298Mb/sec (298204800bps) errors: 0
>
> clone_skb=0, no 802.1Q, 60 byte:
> e1000: 664558pps 318Mb/sec (318987840bps) errors: 0
> tg3:   772650pps 370Mb/sec (370872000bps) errors: 0
>
> clone_skb=16384, no 802.1Q, 60 byte:
> e1000: 684206pps 328Mb/sec (328418880bps) errors: 0
> tg3: 1069830pps 513Mb/sec (513518400bps) errors: 0
>
> I tried on an Opteron 140 box and it was faster for both cards, but not
> by much.  oprofile showed a lot of do_getttimeofday, so I hacked a bunch
> of calls out of pktgen -- I noticed the CPU time shifted around but the
> throughput was still the same as before, as if it's card or bus limited.
>
> Why is it so difficult to actually get 1 Gbps of small packets?  I also
> tried changing ring buffer sizes, txqueuelen, interrupt coalescing
> settings, etc... all I was able to do was make it slower or very
> slightly faster.

Its difficult because you have *loads* of transactions going over the
bus.  Linux's single transmit packet at a time methodology also
exacerbates this, as we're unable to coalesce transmit tail (TDT)
writes to the bus and are probably interrupting the previous DMA *a
lot* (see thread below)

You'll probably be able to get a little better throughput by switching
to a UP kernel, but from my experience you're getting pretty close to
the max for pci-x e1000 adapters.  There are some previous messages
about this like:
http://oss.sgi.com/projects/netdev/archive/2004-12/msg00017.html

beware the hardware bug when enabling TXDMAC!

Jesse
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen + napi == kaboom

2006-02-22 Thread Simon Kirby
On Wed, Feb 22, 2006 at 10:56:31AM -0800, Ben Greear wrote:

> For VLANs, make sure that the 'multi-skb' is always zero.  This is because
> the VLAN code modifies the skb by re-assigning the skb->dev to the 
> underlying
> device.
> 
> I'm not sure if this is the problem you hit, but it could be...

Yup, that was it.

Of course, now it doesn't send as fast.  Hrmph. :)  On this older Xeon
2.4 Ghz w/533 FSB and e1000 & tg3 @ PCI-X 133 Mhz 64 bit, SMP kernel,
single pktgen thread, I'm only seeing:

clone_skb=0, 802.1Q tagging, 60 byte:
e1000: 558526pps 268Mb/sec (268092480bps) errors: 0
tg3: 621260pps 298Mb/sec (298204800bps) errors: 0

clone_skb=0, no 802.1Q, 60 byte:
e1000: 664558pps 318Mb/sec (318987840bps) errors: 0
tg3:   772650pps 370Mb/sec (370872000bps) errors: 0

clone_skb=16384, no 802.1Q, 60 byte:
e1000: 684206pps 328Mb/sec (328418880bps) errors: 0
tg3: 1069830pps 513Mb/sec (513518400bps) errors: 0

I tried on an Opteron 140 box and it was faster for both cards, but not
by much.  oprofile showed a lot of do_getttimeofday, so I hacked a bunch
of calls out of pktgen -- I noticed the CPU time shifted around but the
throughput was still the same as before, as if it's card or bus limited. 

Why is it so difficult to actually get 1 Gbps of small packets?  I also
tried changing ring buffer sizes, txqueuelen, interrupt coalescing
settings, etc... all I was able to do was make it slower or very
slightly faster.

Simon-
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen + napi == kaboom

2006-02-22 Thread Ben Greear

Robert Olsson wrote:

Simon Kirby writes:

 > Just tried to do some benchmarks for outgoing packet rates with the
 > e1000 and tg3.  When I tried some vlan tagging with pktgen, it blew
 > up immediately:
 
 Hello!


 No pktgen has no support vlan as-is .Guess there should be some config
 option to select vlan and enable it and fill vlan header in the packet. 
 If you're motivated give it a try.


VLAN devices will add the header for you...pktgen should work with
no modification.

It will work *best* if you use my patch to give stack feedback so
that pktgen + vlan doesn't drop so many pkts on transmit.

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -1,4 +1,4 @@
-/*
+/* -*- linux-c -*-
  * INET802.1Q VLAN
  * Ethernet-type device handling.
  *
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -438,6 +438,11 @@ int vlan_dev_hard_start_xmit(struct sk_b
struct net_device_stats *stats = vlan_dev_get_stats(dev);
struct vlan_ethhdr *veth = (struct vlan_ethhdr *)(skb->data);

+   /* Please note, dev_queue_xmit consumes the pkt regardless of the
+* return value.  So, will copy the skb first and free if successful.
+*/
+   struct sk_buff* skb2 = skb_get(skb);
+
/* Handle non-VLAN frames if they are sent to us, for example by DHCP.
 *
 * NOTE: THIS ASSUMES DIX ETHERNET, SPECIFICALLY NOT SUPPORTING
@@ -467,6 +472,10 @@ int vlan_dev_hard_start_xmit(struct sk_b
skb = __vlan_put_tag(skb, veth_TCI);
if (!skb) {
stats->tx_dropped++;
+   /* Free the extra copy, assuming this is a 
non-recoverable
+* issue and we don't want calling code to retry.
+*/
+   kfree_skb(skb2);
return 0;
}

@@ -484,13 +493,24 @@ int vlan_dev_hard_start_xmit(struct sk_b
   veth->h_vlan_proto, veth->h_vlan_TCI, 
veth->h_vlan_encapsulated_proto);
 #endif

-   stats->tx_packets++; /* for statics only */
-   stats->tx_bytes += skb->len;
-
skb->dev = VLAN_DEV_INFO(dev)->real_dev;
-   dev_queue_xmit(skb);

-   return 0;
+   {
+   int rv = dev_queue_xmit(skb);
+   if (rv == 0) {
+   /* Was success, need to free the skb reference since
+* we bumped up the user count above.  If there was an
+* error instead, then the skb2 will not be freed, and 
so
+* the calling code will be able to re-send it.
+*/
+
+   stats->tx_packets++; /* for statics only */
+   stats->tx_bytes += skb2->len;
+
+   kfree_skb(skb2);
+   }
+   return rv;
+   }
 }

 int vlan_dev_hwaccel_hard_start_xmit(struct sk_buff *skb, struct net_device 
*dev)


Thanks,
Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pktgen + napi == kaboom

2006-02-22 Thread Ben Greear

Simon Kirby wrote:

2.6.15.4:

Just tried to do some benchmarks for outgoing packet rates with the
e1000 and tg3.  When I tried some vlan tagging with pktgen, it blew
up immediately:


For VLANs, make sure that the 'multi-skb' is always zero.  This is because
the VLAN code modifies the skb by re-assigning the skb->dev to the underlying
device.

I'm not sure if this is the problem you hit, but it could be...

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html