Re: reminder, 2.6.18 window...

2006-05-26 Thread Rick Jones
> > Can you ask internally on how openview would handle this? It carries
> > the major chunk of management tools market so it may provide good
> > insight.
>
> I've asked the question in an internal, informal communications channel.
> No guarantees it will reach any OpenView types, but if it does I'll try
> to provide the gist of the replies.


While the question of the patch itself appears to be laid to rest for 
the time being, since I took an "action item" I figured it would be good 
to complete it.


Here was one response:


If the stats are being gathered via SNMP, most management systems do
one of two things:

- treat it as a discontinuity: in this case, the handling is similar
  to that for a device reboot; that is, delta calculation starts anew.

- treat it as a wrap-around (especially for 32-bit counters): the smarter
  ones have logic to detect whether this is a "feasible" wrap-around (e.g.,
  old measurement was "near" overflow, etc.) and appropriately adjust
  the delta.

In your case, it looks like you want to treat this as a discontinuity.
The interface table in IF-MIB has an attribute called ifLastChange; if
you reset the counter, you may want to set it to the sysUpTime value.
This way, a "proper" implementation could determine that a
discontinuity has occurred.
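
The two strategies in this response can be sketched in userspace roughly as follows. This is only an illustration, not how any particular management system works: the function name `counter_delta`, the `prev_uptime`/`curr_uptime` arguments (stand-ins for sysUpTime samples taken with each poll), and the `near_wrap_fraction` threshold are all assumptions made for the example.

```python
COUNTER32_MODULUS = 2**32  # a 32-bit SNMP counter wraps at this value


def counter_delta(prev, curr, prev_uptime, curr_uptime,
                  modulus=COUNTER32_MODULUS, near_wrap_fraction=0.5):
    """Return the delta between two counter samples, or None when the
    interval should be discarded as a discontinuity.

    prev_uptime and curr_uptime are hypothetical stand-ins for sysUpTime
    samples taken together with the counter values.
    """
    # Uptime going backwards suggests the agent or device restarted,
    # so delta calculation has to start anew.
    if curr_uptime < prev_uptime:
        return None
    # Normal case: the counter only moved forward.
    if curr >= prev:
        return curr - prev
    # The counter went backwards. Treat it as a "feasible" wrap only if
    # the old value was near overflow; otherwise assume a reset.
    if prev >= modulus * near_wrap_fraction:
        return modulus - prev + curr
    return None
```

The threshold is deliberately crude; a real tool would probably also take the polling interval and link speed into account when deciding whether a wrap is plausible.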


And then a more detailed response with an associated, and very long URL:



http://openview.hp.com/ecare/getsupportdoc?docid=OV-EN012963&urlN=http%3A%2F%2Fsupport.openview.hp.com%2Fselfsolve%2Fdo%2Fadvanced-search&fromOV=false&urlB=http%3A%2F%2Fsupport.openview.hp.com%2Fselfsolve%2Fdo%2Fadvanced-search%3Faction%3Dresults&f=ss&hl=true

QUOTE:

This problem can be caused by the SNMP MIB counter wrap.  NNM 6.01 or
later has logic to detect collected MIB counter wrap.

If NNM detects that a MIB counter is wrapped, then it is considered as
one of the following two cases:

(1) The counter reached its maximum value and wrapped.
(2) The counter reset occurred due to the SNMP agent restart.

In the case of (1), snmpCollect takes the counter wrap into account and
adjusts the value taking the maximum value of the counter into
consideration.  However, in the case of (2), NNM cancels the measurement
of this period because it considers that the previous value that was
used for calculation is no longer valid.

If the value of the counter increases too fast, NNM may consider that
the detected counter wrap is due to the agent restart even though the
counter just wrapped.  In this case, the data of the measurement period
gets dropped.

There are two approaches available to avoid this situation:

- Use counters that have a larger maximum size.  For example, use
  IfHCInOctets/IfHCOutOctets(64bit - counter64) in IF_MIB instead of
  IfInOctets/IfOutOctets(32bit - counter).
- Shorten the period of measurement, so that the amount of the counter
  increase is potentially short enough to let NNM detect the counter wrap
  correctly.

Note:
It may be necessary to upgrade the operating system of some network
devices to gain access to 64-bit counters. Note also that counter64 is
an SNMPv2 data type, so the agent must support SNMPv2. If trying to access
the values using snmpwalk, use the parameter -v2c to ensure that the
variables can be accessed.



rick jones
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: reminder, 2.6.18 window...

2006-05-26 Thread Andi Kleen

> The current patch is fine if your hardware implements the required atomicity 
> itself. 

Nearly all do optionally, but it would make incrementing the statistics an order
of magnitude more expensive.

Atomic operations don't come cheap on modern systems. And you would need
to change the fast path increments to atomic for this.

I suspect it could be done without atomics with some tricks (e.g. use a double
set of counters and switch on clear and use RCU), but it would make the whole
thing quite complex and still have more overhead in the fast path than
the current code.

-Andi



Re: reminder, 2.6.18 window...

2006-05-26 Thread Andi Kleen
On Wednesday 24 May 2006 22:08, Jeff Garzik wrote:
> Brent Cook wrote:
> > Note that this is just clearing the hardware statistics on the interface,
> > and would not require any kind of atomic_increment addition for interfaces
> > that support that. It would be kind-of awkward to implement this on drivers
> > that increment stats in hardware though (lo, vlan, br, etc.) This also
> > brings up the question of resetting the stats for 'netstat -s'
> 
> If you don't atomically clear the statistics, then you are leaving open 
> a window where the stats could easily be corrupted, if the network 
> interface is under load.

It could be handled by RCU with some moderately complex code
(clear and clear again after an RCU quiescent period).

But the real problem is that the user will always miss events during
the clear operation. That is why it is inherently racy.

An atomic user-visible get-and-clear wouldn't have this problem, but it would
probably be nasty to implement locklessly without risking livelock on a busy
system. And I also doubt it would have a nice user interface in the file system.

And really is it that hard to do a before-after diff?  I don't think so.
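
A before-after diff of this kind might look like the sketch below. It assumes the counters are read from Linux's /proc/net/dev; the helper names `parse_net_dev` and `diff_stats` are made up for the example and are not taken from any existing tool.

```python
# Column order of the counters on each /proc/net/dev interface line.
FIELDS = (
    "rx_bytes", "rx_packets", "rx_errs", "rx_drop",
    "rx_fifo", "rx_frame", "rx_compressed", "rx_multicast",
    "tx_bytes", "tx_packets", "tx_errs", "tx_drop",
    "tx_fifo", "tx_colls", "tx_carrier", "tx_compressed",
)


def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {iface: {field: value}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(v) for v in values.split()]
        stats[name.strip()] = dict(zip(FIELDS, numbers))
    return stats


def diff_stats(before, after):
    """Return after - before for every interface present in both snapshots."""
    return {
        iface: {f: after[iface][f] - before[iface][f] for f in before[iface]}
        for iface in before
        if iface in after
    }
```

In use, one would read /proc/net/dev once before the test and once after, pass both texts through `parse_net_dev`, and print the result of `diff_stats`. The sketch ignores counter wrap between the two snapshots.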


> 
> See...  this opens doors to tons of complexity.

Agreed.

-Andi


Re: reminder, 2.6.18 window...

2006-05-26 Thread Andi Kleen
On Wednesday 24 May 2006 10:01, Phil Dibowitz wrote:
> David Miller wrote:
> > Some time in the next few weeks, it is likely that the 2.6.18
> > merge window will open up shortly after a 2.6.17 release.
> > 
> > So if you have major 2.6.18 submissions planned for the networking,
> > you need to start thinking about getting it to me now.
> > 
> > There is a 2.6.18 tree up at:
> > 
> > master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.18.git
> > 
> > All it has right now is the I/O AT stuff at the moment, and I plan to
> > put Stephen Hemminger's LLC multicast/datagram changes in there as
> > well.
> 
> David,
> 
> I posted a patch for adding support for network device statistic
> resetting via ethtool. I saw no objections to it...

I think it's a bad idea because it's inherently racy.

(I think I objected originally too) 

Similar patches were rejected a couple of times over the years already.

> I'd like to get this into 2.6.18. It's self-contained, so it has little
> chance of breaking other things and adds a useful feature that I've seen
> a lot of requests for.

It's broken by design.
-Andi

 


Re: reminder, 2.6.18 window...

2006-05-25 Thread Phil Dibowitz
Brian Haley wrote:
> jamal wrote:
>> On Wed, 2006-24-05 at 12:14 -0700, Phil Dibowitz wrote:
>>
>>> Right, I'm aware there are other ways of doing this - I've written
>>> scripts to record hundreds of numbers, and then subtract them from
>>> each other. But those scripts are workarounds
> 
> I don't have any problem with Phil's changes.

OK - well, I think the discussion is over now...

But for those that are interested, I've posted the patches (and I
cleaned up the ethtool patch), as well as a README and FAQ here:

http://phildev.net/linux/patches/

A handful of people have expressed interest to me off-list, so we'll see
how that interest pans out. But in the meantime, you can find the
latest-and-greatest version there.

-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"Be who you are and say what you feel, because those who mind don't
matter and those who matter don't mind."
 - Dr. Seuss






Re: reminder, 2.6.18 window...

2006-05-25 Thread David Miller
From: Phil Dibowitz <[EMAIL PROTECTED]>
Date: Thu, 25 May 2006 14:04:12 -0700

> why would we specifically not support a _feature_ of the hardware?

Sparc64 chips support a hash table like hw assist feature for TLB
reloading, I didn't use it for 8+ years and went with a virtual page
table approach instead.

I mean, your statement is totally meaningless.  Just because the
hardware can do something, doesn't mean we have any reason to use that
functionality.


Re: reminder, 2.6.18 window...

2006-05-25 Thread Phil Dibowitz
On Thu, May 25, 2006 at 01:29:28PM -0700, David Miller wrote:
> From: Phil Dibowitz <[EMAIL PROTECTED]>
> Date: Thu, 25 May 2006 12:22:39 -0700
> 
> > So at least, for the _current_ implementations, this should have no
> > performance impacts.
> 
> Regardless, I think this is something that userspace _can_
> take care of reasonably and therefore has no business in the
> kernel.
> 
> We do not duplicate functionality that is already possible
> _unless_ it is incredibly difficult for userland to do so.
> And in this case I do not think it is difficult for userland
> to interpret the counters in the way you want it to.
> 
> So, major NACK on this stuff.

Well, if that's how you feel, I'm probably not going to change your mind,
though I'll emphasize I think any solution in userspace is a workaround -
*particularly* since most cards have a command specifically for "clear
stats"... why would we specifically not support a _feature_ of the hardware?

-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
 - Benjamin Franklin, 1759





Re: reminder, 2.6.18 window...

2006-05-25 Thread David Miller
From: Phil Dibowitz <[EMAIL PROTECTED]>
Date: Thu, 25 May 2006 12:22:39 -0700

> So at least, for the _current_ implementations, this should have no
> performance impacts.

Regardless, I think this is something that userspace _can_
take care of reasonably and therefore has no business in the
kernel.

We do not duplicate functionality that is already possible
_unless_ it is incredibly difficult for userland to do so.
And in this case I do not think it is difficult for userland
to interpret the counters in the way you want it to.

So, major NACK on this stuff.


Re: reminder, 2.6.18 window...

2006-05-25 Thread Phil Dibowitz
On Thu, May 25, 2006 at 01:41:41PM -0500, Brent Cook wrote:
> On Thursday 25 May 2006 12:59, Phil Dibowitz wrote:
> > On Thu, May 25, 2006 at 08:05:37AM -0500, Brent Cook wrote:
> > > > I'll admit to not knowing all the intricacies of the kernel coding
> > > > involved, but I don't offhand see how zeroing the stats would be
> > > > significantly more complex than updating the stats during normal usage.
> > > > But I'll have to leave that argument to the experts.
> > >
> > > > What it boils down to is that currently, only a single CPU or thread ever
> > > > touches the stats at a time, so it doesn't have to lock them or do
> > > > anything special to ensure that they continue incrementing. If you want to
> > > make sure that the statistics actually reset when you want them to, you
> > > have to account for this case:
> > >
> > >   CPU0 reads current value from memory (increment)
> > >   CPU1 writes 0 to current value in memory (reset)
> > >   CPU0 writes incremented value to memory (increment complete)
> >
> > Perhaps I'm missing something here, but these counters are only incremented
> > in hardware... i.e. atomically.
> >
> 
> No, you're right - I'm just thinking that once one driver has this ability, 
> users are going to want it for all network devices, and implementation on 
> some devices (namely virtual ones - lo, tun, tap, br, vlan) is trickier than 
> just setting a register. Some hardware devices too - mv643xx_eth.c just 
> increments the network stats in software, for instance. Lockless software 
> reset is fine though as long as people understand the consequences - it's 
> absolutely fine, given the way I would use reset in my environment, YMMV.

OK, good, I'm glad I was understanding things right.

Yes, if the framework gets accepted, I'll work on implementing it in more
drivers. Only in ones I can get access to the hardware for, obviously, but I
wasn't going to do work to implement it in a ton of drivers if the framework
wasn't going to be accepted.

For virtual stuff, of course locking would be needed, and while this is
"performance degradation", it's only so when you choose to clear the stats.

So at least, for the _current_ implementations, this should have no
performance impacts.






Re: reminder, 2.6.18 window...

2006-05-25 Thread Brent Cook
On Thursday 25 May 2006 12:59, Phil Dibowitz wrote:
> On Thu, May 25, 2006 at 08:05:37AM -0500, Brent Cook wrote:
> > > I'll admit to not knowing all the intricacies of the kernel coding
> > > involved, but I don't offhand see how zeroing the stats would be
> > > significantly more complex than updating the stats during normal usage.
> > > But I'll have to leave that argument to the experts.
> >
> > What it boils down to is that currently, only a single CPU or thread ever
> > touches the stats at a time, so it doesn't have to lock them or do
> > anything special to ensure that they continue incrementing. If you want to
> > make sure that the statistics actually reset when you want them to, you
> > have to account for this case:
> >
> >   CPU0 reads current value from memory (increment)
> >   CPU1 writes 0 to current value in memory (reset)
> >   CPU0 writes incremented value to memory (increment complete)
>
> Perhaps I'm missing something here, but these counters are only incremented
> in hardware... i.e. atomically.
>

No, you're right - I'm just thinking that once one driver has this ability, 
users are going to want it for all network devices, and implementation on 
some devices (namely virtual ones - lo, tun, tap, br, vlan) is trickier than 
just setting a register. Some hardware devices too - mv643xx_eth.c just 
increments the network stats in software, for instance. Lockless software 
reset is fine though as long as people understand the consequences - it's 
absolutely fine, given the way I would use reset in my environment, YMMV.


Re: reminder, 2.6.18 window...

2006-05-25 Thread Phil Dibowitz
On Thu, May 25, 2006 at 08:05:37AM -0500, Brent Cook wrote:
> > I'll admit to not knowing all the intricacies of the kernel coding
> > involved, but I don't offhand see how zeroing the stats would be
> > significantly more complex than updating the stats during normal usage. 
> > But I'll have to leave that argument to the experts.
> >
> 
> What it boils down to is that currently, only a single CPU or thread ever
> touches the stats at a time, so it doesn't have to lock them or do anything
> special to ensure that they continue incrementing. If you want to make sure
> that the statistics actually reset when you want them to, you have to account 
> for this case:
> 
>   CPU0 reads current value from memory (increment)
>   CPU1 writes 0 to current value in memory (reset)
>   CPU0 writes incremented value to memory (increment complete)

Perhaps I'm missing something here, but these counters are only incremented
in hardware... i.e. atomically.

And the reset I do is via a command register, which should also be atomic.

Now in a driver that was keeping this all in a local struct, I could
understand that need for locking, but in the skge case, and in fact in many
drivers I've looked at, the numbers are all kept in the hardware, incremented
by the hardware, as it gets packets.

So clearing them via a command register shouldn't need locking as far as I can
tell.

But please, correct me if I'm wrong.






Re: reminder, 2.6.18 window...

2006-05-25 Thread Rick Jones
> Can you ask internally on how openview would handle this? It carries the
> major chunk of management tools market so it may provide good insight.


I've asked the question in an internal, informal communications channel. 
 No guarantees it will reach any OpenView types, but if it does I'll 
try to provide the gist of the replies.


rick jones


Re: reminder, 2.6.18 window...

2006-05-25 Thread Bill Fink
On Thu, 25 May 2006, Brent Cook wrote:

> On Thursday 25 May 2006 02:23, Bill Fink wrote:
> > On Wed, 24 May 2006, Jeff Garzik wrote:
> > > Brent Cook wrote:
> > > > Note that this is just clearing the hardware statistics on the
> > > > interface, and would not require any kind of atomic_increment addition
> > > > for interfaces that support that. It would be kind-of awkward to
> > > > implement this on drivers that increment stats in hardware though (lo,
> > > > vlan, br, etc.) This also brings up the question of resetting the stats
> > > > for 'netstat -s'
> > >
> > > If you don't atomically clear the statistics, then you are leaving open
> > > a window where the stats could easily be corrupted, if the network
> > > interface is under load.
> > >
> > > This 'clearing' operation has implications on the rest of the statistics
> > > usage.
> > >
> > > More complexity, and breaking of apps, when we could just use the
> > > existing, working system?  I'll take the "do nothing, break nothing,
> > > everything still works" route any day.
> >
> > I'll admit to not knowing all the intricacies of the kernel coding
> > involved, but I don't offhand see how zeroing the stats would be
> > significantly more complex than updating the stats during normal usage. 
> > But I'll have to leave that argument to the experts.
> 
> What it boils down to is that currently, only a single CPU or thread ever
> touches the stats at a time, so it doesn't have to lock them or do anything
> special to ensure that they continue incrementing. If you want to make sure
> that the statistics actually reset when you want them to, you have to account 
> for this case:
> 
>   CPU0 reads current value from memory (increment)
>   CPU1 writes 0 to current value in memory (reset)
>   CPU0 writes incremented value to memory (increment complete)
> 
> Check out do_add_counters() in net/ipv4/netfilter/ip_tables.c
> to see what's required to do this reliably in the kernel.

Thanks for the info.  I have a possibly naive question.  Would it
increase the reliability of clearing the stats using "lazy" zeroing
(no locking), if the zeroing app (ethtool) bound itself to the same
CPU that was handling interrupts for the device (assuming no sharing
of interrupts across CPUs)?

> The current patch is fine if your hardware implements the required atomicity 
> itself. Otherwise, you need a locking infrastructure to extend it to all 
> network devices if you want zeroing to always work. What I'm seeing here in 
> response to this is that it doesn't matter if zeroing just _mostly_ works, 
> which is what you would get if you didn't lock. Eh, I'm OK with that too, but 
> I think people are worried about the bugs that would get filed by admins when 
> just zeroing the stats on cheap NIC x only works 90% of the time, less under 
> load. Or not at all (not implemented in driver.) Then you're back to the 
> userspace solution or actually implement stat locking / atomic ops.

I would be fine with the "lazy" clearing of the stats (with a note
describing the limitations in the ethtool man page).  Being somewhat
anal, I would always check that the stats had in fact been zeroed
successfully before proceeding.  BTW I am in 100% agreement not to
do anything that would affect performance of the fast path, as I
understand proper locking would necessitate.

I will also look into the beforeafter utility that has been suggested,
to see how easy it is to use and how much extra work would be required
over just a direct visual examination of the interface statistics.

-Bill


Re: reminder, 2.6.18 window...

2006-05-25 Thread jamal
On Wed, 2006-24-05 at 13:25 -0700, Rick Jones wrote:

> The lanadmin (think ethtool) command of HP-UX has had a way to clear the 
> statistics reported by lanadmin.  I do not know however, if that affects 
> the stats in the actual SNMP interface MIBs as I've never had occasion 
> to look.
> 
> I still suggest beforeafter to people because clearing the lanadmin 
> stats requires root privs and because it may not be a kosher thing to do 
> on a customer system.
> 
> HP-UX does not provide a way to clear the stats reported by netstat.

Which also are snmp stats. 

So far the reasoning of why this is needed has been around debugging
which is in itself reasonable as long as it doesn't break things: it
does.

For debugging: I can name about 3 ways you can achieve the counting
without touching the driver/hardware level stats. Someone needs to put
in the effort and create a tool to do it.

[ Actually now that I spent 2 minutes looking at the patch I can
understand jgarzik's concerns].

On Wed, 2006-24-05 at 16:44 -0400, Brian Haley wrote:

> So how is this different than if an SNMP station probes my system,
> then I reboot, then they probe again.  Things will seem to have gone
> backwards, but they deal with that just fine.
>

Rick has already answered this question, I hope. There are triggers that
management tools use to identify reboots.

> DEC/Compaq/HP has allowed this on Tru64 UNIX since 1999 because we had 
> customers that wanted it, no one ever complained about complications with
> SNMP. 

I am sure if we did things because your customer wanted it, we would approach 
anarchy at best.

>  We did save the last time the stats were zero'd in the struct for 
> posterity, but that was never get-able via SNMP:
> 

Because it is not part of the MIB i would suspect. 

> 
> --> netstat -I tu0 -s
> 
> tu0 Ethernet counters at Wed May 24 16:30:05 2006
> 
>     609415 seconds since last zeroed
> 3943458720 bytes received
>  113576310 bytes sent
> ...
> 
> Maybe saving a "ztime" would make people happier?

It may be insufficient. 
Management tools use the resetting as a trigger to indicate rollover of the
counters. I don't know if you can go and fix everyone out there to have this
new ztime in mind.

Can you ask internally on how openview would handle this? It carries the
major chunk of management tools market so it may provide good insight.

On Wed, 2006-24-05 at 14:04 -0700, Rick Jones wrote: 
> Phil Dibowitz wrote:
> > Well, I can show you support on my home switch (cabletron) - the network
> > guys will be a little unhappy if I clear stats on our production network
> > (cisco) without warning them:
> 
> Isn't that last bit an example of why it might not be good to play-out 
> that rope?-)
> 

I don't mind him hanging himself, he just wants to hang us all ;->

cheers,
jamal



Re: reminder, 2.6.18 window...

2006-05-25 Thread Dave Dillow
On Thu, 2006-05-25 at 03:23 -0400, Bill Fink wrote:
> likely problem areas.  The human mind (at least speaking for myself) is
> not nearly as adept when having to deal with deltas.  Yes, you can record
> the initial state of all the devices, run the stress test, record the new
> state of all the devices, and then spend a large amount of time devising
> a script to calculate all the deltas for all the scores of variables on
> all the involved devices, and then finally try and figure out what is
> wrong.  But it would be so much better, easier, and more efficient, if
> the kernel simply provided such a feature that almost all other networking
> devices provide.

ftp://ftp.cup.hp.com/dist/networking/tools/beforeafter.tar.gz
as Rick mentioned earlier, and then you won't need to write a
complicated script.
-- 
Dave Dillow <[EMAIL PROTECTED]>



Re: reminder, 2.6.18 window...

2006-05-25 Thread Brent Cook
On Thursday 25 May 2006 02:23, Bill Fink wrote:
> On Wed, 24 May 2006, Jeff Garzik wrote:
> > Brent Cook wrote:
> > > Note that this is just clearing the hardware statistics on the
> > > interface, and would not require any kind of atomic_increment addition
> > > for interfaces that support that. It would be kind-of awkward to
> > > implement this on drivers that increment stats in hardware though (lo,
> > > vlan, br, etc.) This also brings up the question of resetting the stats
> > > for 'netstat -s'
> >
> > If you don't atomically clear the statistics, then you are leaving open
> > a window where the stats could easily be corrupted, if the network
> > interface is under load.
> >
> > This 'clearing' operation has implications on the rest of the statistics
> > usage.
> >
> > More complexity, and breaking of apps, when we could just use the
> > existing, working system?  I'll take the "do nothing, break nothing,
> > everything still works" route any day.
>
> I'll admit to not knowing all the intricacies of the kernel coding
> involved, but I don't offhand see how zeroing the stats would be
> significantly more complex than updating the stats during normal usage. 
> But I'll have to leave that argument to the experts.
>

What it boils down to is that currently, only a single CPU or thread ever
touches the stats at a time, so it doesn't have to lock them or do anything
special to ensure that they continue incrementing. If you want to make sure
that the statistics actually reset when you want them to, you have to account 
for this case:

  CPU0 reads current value from memory (increment)
  CPU1 writes 0 to current value in memory (reset)
  CPU0 writes incremented value to memory (increment complete)

Check out do_add_counters() in net/ipv4/netfilter/ip_tables.c
to see what's required to do this reliably in the kernel.
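
The three-step interleaving above can be replayed deterministically to see why the reset gets lost. This is a toy single-threaded model, not kernel code: the `memory` dict and the `cpu0_register` variable are stand-ins, invented for the example, for a shared counter and a CPU register.

```python
def lost_reset_interleaving(initial):
    """Replay the three-step interleaving with plain, unlocked accesses."""
    memory = {"rx_packets": initial}

    # CPU0 reads current value from memory (start of increment)
    cpu0_register = memory["rx_packets"]

    # CPU1 writes 0 to current value in memory (reset)
    memory["rx_packets"] = 0

    # CPU0 writes incremented value to memory (increment complete)
    memory["rx_packets"] = cpu0_register + 1

    return memory["rx_packets"]
```

Starting from 41, the function returns 42 rather than 0 or 1: the stale value held by CPU0 overwrites the zero written by CPU1, so the reset silently disappears.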

The current patch is fine if your hardware implements the required atomicity 
itself. Otherwise, you need a locking infrastructure to extend it to all 
network devices if you want zeroing to always work. What I'm seeing here in 
response to this is that it doesn't matter if zeroing just _mostly_ works, 
which is what you would get if you didn't lock. Eh, I'm OK with that too, but 
I think people are worried about the bugs that would get filed by admins when 
just zeroing the stats on cheap NIC x only works 90% of the time, less under 
load. Or not at all (not implemented in driver.) Then you're back to the 
userspace solution or actually implement stat locking / atomic ops.

 - Brent


Re: reminder, 2.6.18 window...

2006-05-25 Thread Francois Romieu
Phil Dibowitz <[EMAIL PROTECTED]> :
[...]
> Right. I think the point here is that it does _NOT_ inherently break
> things. If you don't like the behavior, don't run "ethtool -z eth0",
> it's that simple.

It would be better to explain why several sysadmins want this feature
and why it can't be done in an hardware-independent way at the
application level.

-- 
Ueimor


Re: reminder, 2.6.18 window...

2006-05-25 Thread Pekka Savola

On Wed, 24 May 2006, Phil Dibowitz wrote:
> On Wed, May 24, 2006 at 02:23:05PM -0400, Jeff Garzik wrote:
> > I disagree that we should bother about clearing statistics.  It always
> > adds more complication than necessary.  Few (if any) other statistics in
> > Linux permit easy clearing, often because adding operations other than
> > 'increment' or 'read' requires adding expensive spinlocks or atomic
> > operations.
>
> Every networking device in the world supports clearing interface statistics.
> Why should linux not be able to do the most basic operation on any
> cisco/juniper/enterasys/whatever managed switch or router?


AFAIK, note that clearing interface statistics on such routers doesn't 
clear SNMP statistics, just the statistics available through CLI.  So 
you really have two levels of statistics, and I suspect clearing 
"slave" statistics shouldn't be too difficult to implement.
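
One way to picture the two levels is a clearable view layered on top of counters that are themselves never reset. The sketch below is only an illustration of that idea: the class name `ClearableView` and the `read_raw` callable are hypothetical, and it says nothing about how any router or driver actually implements this.

```python
class ClearableView:
    """Clearable view layered over counters that are never reset."""

    def __init__(self, read_raw):
        # read_raw is a hypothetical callable returning {name: value}
        # for the underlying, monotonically increasing counters.
        self._read_raw = read_raw
        self._baseline = {}

    def clear(self):
        """Remember the current raw values instead of zeroing them."""
        self._baseline = dict(self._read_raw())

    def read(self):
        """Return raw counters minus the values seen at the last clear."""
        raw = self._read_raw()
        return {k: v - self._baseline.get(k, 0) for k, v in raw.items()}
```

Because `clear()` only records a baseline, the raw counters that SNMP would report stay monotonic, and nothing changes on the increment path. Counter wrap between a clear and a later read is not handled here.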


--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


Re: reminder, 2.6.18 window...

2006-05-25 Thread Bill Fink
On Wed, 24 May 2006, Phil Dibowitz wrote:

> Right. I think the point here is that it does _NOT_ inherently break
> things. If you don't like the behavior, don't run "ethtool -z eth0",
> it's that simple.
> 
> A co-worker suggested today, that maybe it'd appease people if the final
> ethtool patch made it a capital option that you can only run by itself.
> I.e. if you can't call it with anything else, it's more difficult to
> call by accident. I'd be willing to do this.

I think that's a good idea.  Since it is changing (zeroing) the stats,
it probably should be a capital option.

-Bill


Re: reminder, 2.6.18 window...

2006-05-25 Thread Bill Fink
On Wed, 24 May 2006, Jeff Garzik wrote:

> Brent Cook wrote:
> > Note that this is just clearing the hardware statistics on the interface,
> > and would not require any kind of atomic_increment addition for interfaces
> > that support that. It would be kind-of awkward to implement this on drivers
> > that increment stats in hardware though (lo, vlan, br, etc.) This also
> > brings up the question of resetting the stats for 'netstat -s'
> 
> If you don't atomically clear the statistics, then you are leaving open 
> a window where the stats could easily be corrupted, if the network 
> interface is under load.
> 
> This 'clearing' operation has implications on the rest of the statistics 
> usage.
> 
> More complexity, and breaking of apps, when we could just use the 
> existing, working system?  I'll take the "do nothing, break nothing, 
> everything still works" route any day.

I'll admit to not knowing all the intricacies of the kernel coding involved,
but I don't offhand see how zeroing the stats would be significantly more
complex than updating the stats during normal usage.  But I'll have to
leave that argument to the experts.

To me the main argument is that such a stat zeroing feature would be
extremely useful.  When trying to track down nasty networking problems
that traverse a multitude of devices, it is often highly desirable to
zero the interface statistics on all the interfaces in the path (which
is available on all networking switches and routers I have worked with),
run some kind of stress test across the path, and then examine the packet
and error counters on all the involved interfaces.  This makes it easy to
pinpoint where packets are getting lost or errors are being introduced,
especially when there are scores of stats per device and you may not even
know a priori exactly what you are looking for.  Using such a scheme, the
human mind can quickly discern patterns in the data and focus in on any
likely problem areas.  The human mind (at least speaking for myself) is
not nearly as adept when having to deal with deltas.  Yes, you can record
the initial state of all the devices, run the stress test, record the new
state of all the devices, and then spend a large amount of time devising
a script to calculate all the deltas for all the scores of variables on
all the involved devices, and then finally try and figure out what is
wrong.  But it would be so much better, easier, and more efficient, if
the kernel simply provided such a feature that almost all other networking
devices provide.
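
For reference, the record-and-subtract bookkeeping described above can be kept fairly small. The following is only a minimal Python sketch, assuming the counters have already been collected as name/value pairs per interface; the function name `nonzero_deltas` and the sample numbers are made up for illustration and do not come from any existing tool.

```python
def nonzero_deltas(before, after):
    """Return {interface: {counter: delta}} for counters that changed.

    `before` and `after` map interface name -> {counter name: value}.
    Interfaces or counters missing from `before` are skipped, since there
    is no baseline to subtract from.
    """
    changed = {}
    for ifname, counters in after.items():
        base = before.get(ifname)
        if base is None:
            continue
        diff = {name: value - base[name]
                for name, value in counters.items()
                if name in base and value != base[name]}
        if diff:
            changed[ifname] = diff
    return changed


# Hypothetical snapshots taken before and after a stress test.
before = {"eth0": {"rx_packets": 1000, "rx_errors": 3, "tx_packets": 900},
          "eth1": {"rx_packets": 50, "rx_errors": 0, "tx_packets": 40}}
after = {"eth0": {"rx_packets": 6000, "rx_errors": 17, "tx_packets": 5900},
         "eth1": {"rx_packets": 50, "rx_errors": 0, "tx_packets": 40}}

for ifname, diff in nonzero_deltas(before, after).items():
    print(ifname, diff)
```

Printing only the counters that moved gives roughly the "everything starts at zero" view, at the cost of having to keep the first snapshot around.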

I also think the SNMP/mgt apps argument is specious.  A) SNMP isn't even
an issue with all networks.  B) As has been pointed out by others, there
is no requirement to have to use such a new stats zeroing feature.  It
would simply be a tool in the network engineer's toolbelt, just like
possibly taking an interface down and back up to see if it corrects a
problem.  The network engineer has to balance the potential benefit/harm
of any action he chooses to take, but let him have that choice.  And C)
I don't think any decent SNMP/mgt app will be particularly bothered by
zeroing interface stats.  I believe they are fairly decent about dealing
with such events (I don't recall our MRTG graphs getting any giant spikes
when I've zeroed interface stats on our GigE/10-GigE switches).  I think
the main harm in such a case would be the loss of a sampling interval.

-Bill


Re: reminder, 2.6.18 window...

2006-05-25 Thread Ben Greear

Phil Dibowitz wrote:

> As for the clearing, in this case, the clearing is done by a command to
> the hardware - and I believe the hardware does that atomically. However,
> I could certainly add a spinlock around it if someone sees a need.

No, because then you'd also have to add the spin-lock in the hot path
to keep rx/tx threads from accessing counters at the same time.  There is no
way a patch that hurts performance like this will be accepted, but I'm
still hopeful that a patch with zero or very near zero performance impact
will be accepted.

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: reminder, 2.6.18 window...

2006-05-24 Thread Phil Dibowitz
Brian Haley wrote:
>
> I don't have any problem with Phil's changes.

Thanks Brian, and Andy, and Ben for your support and ideas.

> So how is this different than if an SNMP station probes my system,
> then I reboot, then they probe again.  Things will seem to have gone
> backwards, but they deal with that just fine.

Right. Reboots, rmmod/modprobe's, etc. can all cause this. Most
management interfaces seem to handle such things.

> DEC/Compaq/HP has allowed this on Tru64 UNIX since 1999 because we had
> customers that wanted it, no one ever complained about complications
> with SNMP.  We did save the last time the stats were zero'd in the
> struct for posterity, but that was never get-able via SNMP:
...
> Maybe saving a "ztime" would make people happier?

I could certainly do this. It would of course change the structure
making the patch slightly more invasive, but if people want this, I can
do it.

Andy wrote:
> On Wed, 2006-05-24 at 14:23 -0400, Jeff Garzik wrote:
>
>> I disagree that we should bother about clearing statistics.  It
>> always adds more complication than necessary.  Few (if any) other
>> statistics in Linux permit easy clearing,
>
> iptables -Z

Good call - I forgot about that.

Ben Greear wrote:
>> Are SNMP traps generated by going into single-user mode?  Rather like
>> what I was saying to Brian earlier.   I suspect though that an rmmod
>> doesn't generate an SNMP trap - unless perhaps that to do the rmmod
>> one has to first ifdown the interface and that might?
> 
> If the interface comes back, it will (may?) have a different device id
> (if-index), even if the name is the same.

Right - but most GUI management interfaces I've seen key off of
interface name *not* ifindex. Certainly Cacti, which I use, does.

> Regardless, not everyone uses SNMP, so clearing stats can still be
> useful.  Even if it is not implemented perfectly (ie, no locking, so
> it's possible that a clear will not totally clear some stats), it will
> still be right most of the time, and that will help the casual user who
> is trying to diagnose network errors with only console access to the
> system... (ie, ifconfig -a).

Right. I think the point here is that it does _NOT_ inherently break
things. If you don't like the behavior, don't run "ethtool -z eth0",
it's that simple.

A co-worker suggested today, that maybe it'd appease people if the final
ethtool patch made it a capital option that you can only run by itself.
I.e. if you can't call it with anything else, it's more difficult to
call by accident. I'd be willing to do this.

As for the clearing, in this case, the clearing is done by a command to
the hardware - and I believe the hardware does that atomically. However,
I could certainly add a spinlock around it if someone sees a need.

-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"Be who you are and say what you feel, because those who mind don't
matter and those who matter don't mind."
 - Dr. Seuss






Re: reminder, 2.6.18 window...

2006-05-24 Thread Ben Greear

Rick Jones wrote:
> Phil Dibowitz wrote:
>> Does having the ability to boot into single user mode break networking?
>> No, it *allows* you to break networking. Does the _support_ of rmmod
>> break the kernel? No, but it *allows* you to.
>
> Are SNMP traps generated by going into single-user mode?  Rather like
> what I was saying to Brian earlier.   I suspect though that an rmmod
> doesn't generate an SNMP trap - unless perhaps that to do the rmmod one
> has to first ifdown the interface and that might?

If the interface comes back, it will (may?) have a different device id
(if-index), even if the name is the same.

Regardless, not everyone uses SNMP, so clearing stats can still be useful.
Even if it is not implemented perfectly (ie, no locking, so it's possible
that a clear will not totally clear some stats), it will still be right most
of the time, and that will help the casual user who is trying to diagnose
network errors with only console access to the system... (ie, ifconfig -a).

Ben

--
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com



Re: reminder, 2.6.18 window...

2006-05-24 Thread Rick Jones

Phil Dibowitz wrote:
> Well, I can show you support on my home switch (cabletron) - the network guys
> will be a little unhappy if I clear stats on our production network (cisco)
> without warning them:

Isn't that last bit an example of why it might not be good to play-out
that rope?-)

> Does having the ability to boot into single user mode break networking? No, it
> *allows* you to break networking. Does the _support_ of rmmod break the
> kernel? No, but it *allows* you to.

Are SNMP traps generated by going into single-user mode?  Rather like
what I was saying to Brian earlier.   I suspect though that an rmmod
doesn't generate an SNMP trap - unless perhaps that to do the rmmod one
has to first ifdown the interface and that might?


rick jones


Re: reminder, 2.6.18 window...

2006-05-24 Thread Rick Jones
> So how is this different than if an SNMP station probes my system, then
> I reboot, then they probe again.  Things will seem to have gone
> backwards, but they deal with that just fine.


In that case hasn't the system's uptime and/or last boot time in the MIB 
changed and so indicates to the management station that it has been 
rebooted?  It is also possible that the shutdown procedures or the 
rebooting procedures would have generated an SNMP trap to the management 
station(s) informing them of the reboot.


rick jones


Re: reminder, 2.6.18 window...

2006-05-24 Thread Andy
On Wed, 2006-05-24 at 14:23 -0400, Jeff Garzik wrote:

> I disagree that we should bother about clearing statistics.  It always 
> adds more complication than necessary.  Few (if any) other statistics in 
> Linux permit easy clearing, 

iptables -Z

> often because adding operations other than 
> 'increment' or 'read' requires adding expensive spinlocks or atomic 
> operations.

Why is 'set to N' so different from 'increment'?

- Andy




Re: reminder, 2.6.18 window...

2006-05-24 Thread Phil Dibowitz
On Wed, May 24, 2006 at 04:10:32PM -0400, jamal wrote:
> Can you provide some link to a vendor that allows resetting ethernet
> stats? I am almost certain, if they do they will have something or other
> which indicates that such a reset happened. It is also easier for cisco
> to have a non-standard feature "as of ios 15.16" which could support such
> behavior because they bundle everything including network management
> tools. We don't have that luxury. The BSDs may actually get away with it
> because they bundle user space apps as well. Perhaps some
> random Linux vendor as well..

Well, I can show you support on my home switch (cabletron) - the network guys
will be a little unhappy if I clear stats on our production network (cisco)
without warning them:

*
Cabletron Systems ELS100-24TXG

SWITCH STATISTICS Access Control: READ/WRITE

 ID  TRANSMITTED    RECEIVED   FORWARDED  FILTERED  DROPPED  ERRORED
--------------------------------------------------------------------
  1      1322815     1614038     1614038         0        0    51562
  2      2067032    18142545    18142349         0      196        0
  3     17487573     1011964     1011964         0        0        0
  4        18247       20528       20528         0        0        0
  5        38775           0           0         0        0        0
  6        38775           0           0         0        0        0
  7        38775           0           0         0        0        0
  8        38775           0           0         0        0        0
  9        38820           0           0         0        0        0
 10        38820           0           0         0        0        0

  n. Next Page   p. Previous Page   f. First Page   l. Last Page

  s. Switch Summary   d. Port Statistics t. Trunking Statistics
  r. Refresh   c. Clear   x. Previous Menu

Enter Selection:c

*
Cabletron Systems ELS100-24TXG

SWITCH STATISTICS Access Control: READ/WRITE

 ID  TRANSMITTED    RECEIVED   FORWARDED  FILTERED  DROPPED  ERRORED
--------------------------------------------------------------------
  1            0           0           0         0        0        0
  2            0           0           0         0        0        0
  3            0           0           0         0        0        0
  4            0           0           0         0        0        0
  5            0           0           0         0        0        0
  6            0           0           0         0        0        0
  7            0           0           0         0        0        0
  8            0           0           0         0        0        0
  9            0           0           0         0        0        0
 10            0           0           0         0        0        0

  n. Next Page   p. Previous Page   f. First Page   l. Last Page

  s. Switch Summary   d. Port Statistics t. Trunking Statistics
  r. Refresh   c. Clear   x. Previous Menu

Enter Selection:

> > If my patch was invasive and broke things, 
> 
> It _does break_ things for all known management apps. 
> This is not to say it is not useful for testing, development or
> debugging (which is what you seem to be using it for) but it does mean
> it is broken. 

Does having the ability to boot into single user mode break networking? No, it
*allows* you to break networking. Does the _support_ of rmmod break the
kernel? No, but it *allows* you to.

Same thing here. The patch breaks nothing, it provides a tool that if used
without proper understanding, could break things. Just like almost any other
feature in the kernel.

-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
 - Benjamin Franklin, 1759





Re: reminder, 2.6.18 window...

2006-05-24 Thread Brian Haley

jamal wrote:
> On Wed, 2006-24-05 at 12:14 -0700, Phil Dibowitz wrote:
>
>> Right, I'm aware there are other ways of doing this - I've written scripts to
>> record hundreds of numbers, and then subtract them from each other. But
>> those scripts are work arounds

I don't have any problem with Phil's changes.

> It is not a work around, _it is design intent_. It is what network
> management tools have been expecting since the days of the caveman.
> These stats are supposed to be monotonically increasing; if that
> behavior is contradicted, a rollover of the counters is assumed.

So how is this different than if an SNMP station probes my system, then
I reboot, then they probe again.  Things will seem to have gone
backwards, but they deal with that just fine.

>> for a feature _lacking_ in the kernel. A
>> feature that, as I've mentioned, is supported on any piece of networking gear
>> (and of course, let's not forget there's a specific option in the kernel config
>> *just* for "behave like a router").
>
> Can you provide some link to a vendor that allows resetting ethernet
> stats? I am almost certain, if they do they will have something or other
> which indicates that such a reset happened.


DEC/Compaq/HP has allowed this on Tru64 UNIX since 1999 because we had
customers that wanted it, no one ever complained about complications with
SNMP.  We did save the last time the stats were zero'd in the struct for
posterity, but that was never get-able via SNMP:


--> netstat -I tu0 -s

tu0 Ethernet counters at Wed May 24 16:30:05 2006

  609415 seconds since last zeroed
  3943458720 bytes received
   113576310 bytes sent
...

Maybe saving a "ztime" would make people happier?
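
To make the "ztime" idea concrete: if a last-zeroed timestamp were readable alongside the counters, a poller could tell a deliberate clear apart from a rollover. This is only a rough Python sketch under that assumption; the `Sample` type, the `zeroed_at` field and the numbers are hypothetical and do not describe Tru64, the proposed patch, or any SNMP MIB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of a counter plus the time the stats were last zeroed."""
    value: int
    zeroed_at: float  # hypothetical "ztime", seconds since the epoch


def delta_or_none(prev, cur):
    """Return the counter delta, or None if the stats were zeroed in between."""
    if cur.zeroed_at != prev.zeroed_at:
        return None  # discontinuity: start a fresh baseline from `cur`
    return cur.value - prev.value


first = Sample(value=113576310, zeroed_at=1000.0)
second = Sample(value=113580000, zeroed_at=1000.0)
third = Sample(value=42, zeroed_at=2000.0)  # stats cleared before this poll

print(delta_or_none(first, second))
print(delta_or_none(second, third))
```

The cost is the one sampling interval that gets discarded around each clear.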


> It is also easier for cisco
> to have a non-standard feature "as of ios 15.16" which could support such
> behavior because they bundle everything including network management
> tools.

I never received any free management tools with my Cisco routers :) ,
they charge big bucks for that stuff!

>> If my patch was invasive and broke things,
>
> It _does break_ things for all known management apps.

Can anyone show a management app this breaks?

-Brian


Re: reminder, 2.6.18 window...

2006-05-24 Thread Rick Jones

> Can you provide some link to a vendor that allows resetting ethernet
> stats? I am almost certain, if they do they will have something or other
> which indicates that such a reset happened. It is also easier for cisco
> to have a non-standard feature "as of ios 15.16" which could support such
> behavior because they bundle everything including network management
> tools. We don't have that luxury. The BSDs may actually get away with it
> because they bundle user space apps as well. Perhaps some
> random Linux vendor as well..


The lanadmin (think ethtool) command of HP-UX has had a way to clear the 
statistics reported by lanadmin.  I do not know however, if that affects 
the stats in the actual SNMP interface MIBs as I've never had occasion 
to look.


I still suggest beforeafter to people because clearing the lanadmin 
stats requires root privs and because it may not be a kosher thing to do 
on a customer system.


HP-UX does not provide a way to clear the stats reported by netstat.

fwiw,

rick jones


Re: reminder, 2.6.18 window...

2006-05-24 Thread jamal
On Wed, 2006-24-05 at 12:14 -0700, Phil Dibowitz wrote:

> 
> Right, I'm aware there are other ways of doing this - I've written scripts to
> record hundreds of numbers, and then subtract them from each other. But
> those scripts are work arounds 

It is not a work around, _it is design intent_. It is what network
management tools have been expecting since the days of the caveman.
These stats are supposed to be monotonically increasing; if that
behavior is contradicted, a rollover of the counters is assumed.
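
The rollover assumption described above can be illustrated with a small Python sketch. It is not taken from any particular management tool: the function names and the `near` threshold are assumptions made up for the example, and real pollers may use different heuristics.

```python
def delta_assuming_wrap(old, new, bits=32):
    """Delta between two polls when any decrease is read as a rollover."""
    if new >= old:
        return new - old
    return new + (1 << bits) - old


def classify_decrease(old, new, bits=32, near=0.9):
    """Guess why a counter went backwards.

    `near` is an arbitrary illustrative threshold: a decrease only counts
    as a plausible wrap if the old value was close to the maximum.
    """
    if new >= old:
        return "normal"
    return "wrap" if old >= near * (1 << bits) else "reset"


# A 32-bit counter that was manually zeroed between two polls.
old, new = 1_000_000, 500
print(delta_assuming_wrap(old, new))   # huge bogus delta if read as a wrap
print(classify_decrease(old, new))     # "reset": old was nowhere near the max
print(classify_decrease(4_294_967_000, 500))  # "wrap": old was near 2**32
```

A tool that only implements the first function would report a spike of almost 2**32 after a manual clear; one that also checks how close the old value was to the maximum could treat the same event as a discontinuity.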

> for a feature _lacking_ in the kernel. A
> feature that, as I've mentioned, is supported on any piece of networking gear
> (and of course, let's not forget there's a specific option in the kernel config
> *just* for "behave like a router").

Can you provide some link to a vendor that allows resetting ethernet
stats? I am almost certain, if they do they will have something or other
which indicates that such a reset happened. It is also easier for cisco
to have a non-standard feature "as of ios 15.16" which could support such
behavior because they bundle everything including network management
tools. We don't have that luxury. The BSDs may actually get away with it
because they bundle user space apps as well. Perhaps some
random Linux vendor as well..

> If my patch was invasive and broke things, 

It _does break_ things for all known management apps. 
This is not to say it is not useful for testing, development or
debugging (which is what you seem to be using it for) but it does mean
it is broken. 

cheers,
jamal






Re: reminder, 2.6.18 window...

2006-05-24 Thread Jeff Garzik

Brent Cook wrote:
> Note that this is just clearing the hardware statistics on the interface, and
> would not require any kind of atomic_increment addition for interfaces that
> support that. It would be kind-of awkward to implement this on drivers that
> increment stats in hardware though (lo, vlan, br, etc.) This also brings up
> the question of resetting the stats for 'netstat -s'


If you don't atomically clear the statistics, then you are leaving open 
a window where the stats could easily be corrupted, if the network 
interface is under load.
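
The window in question can be shown with a deterministic replay. The Python sketch below is a model only, assuming a counter kept in software and bumped with a plain, non-atomic load-then-store; it says nothing about drivers whose counters are cleared by a command to the hardware. The `run` function and its schedule format are invented for the illustration.

```python
def run(schedule):
    """Replay a fixed interleaving of steps against one shared counter.

    Each step is ("rx", "load"), ("rx", "store") or ("clear", "store").
    The rx path models a plain, non-atomic `counter++` as load then store.
    """
    counter = 41
    rx_tmp = None
    for who, op in schedule:
        if who == "rx" and op == "load":
            rx_tmp = counter          # read the current value
        elif who == "rx" and op == "store":
            counter = rx_tmp + 1      # write back value + 1
        elif who == "clear" and op == "store":
            counter = 0               # the zeroing write
    return counter


# Clear lands between the two halves of the increment: the zero is overwritten.
lost_clear = run([("rx", "load"), ("clear", "store"), ("rx", "store")])
# Clear runs after the increment completes: the counter really is zero.
clean_clear = run([("rx", "load"), ("rx", "store"), ("clear", "store")])
print(lost_clear, clean_clear)
```

In the first schedule the stale value read before the clear is written back, so the counter ends up as if the clear never happened.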


This 'clearing' operation has implications on the rest of the statistics 
usage.


More complexity, and breaking of apps, when we could just use the 
existing, working system?  I'll take the "do nothing, break nothing, 
everything still works" route any day.



> What would be great is if ifconfig, netstat and their ilk just had a -z flag
> instead. This would write a file to the local user's home directory with a
> stats snapshot, and then every subsequent run would auto-calculate against
> the snapshot. You'd also need some way of resetting this when the stats
> actually _do_ reset (driver reload, reboot.) to avoid negative numbers.
> That way, you can get what you want without having to write a bunch of
> fragile, awkward scripts, and the kernel isn't throwing away information
> either.


See...  this opens doors to tons of complexity.

Jeff




Re: reminder, 2.6.18 window...

2006-05-24 Thread Brent Cook
On Wednesday 24 May 2006 14:14, Phil Dibowitz wrote:
> On Wed, May 24, 2006 at 03:05:54PM -0400, Jeff Garzik wrote:
> > Phil Dibowitz wrote:
> > Given any method of clearing statistics across your cluster, I'm certain
> > you can come up with a similar method of obtaining the current statistic
> > (the baseline).
>
> Right, I'm aware there are other ways of doing this - I've written scripts
> to record hundreds of numbers, and then subtract them from each other.
> But those scripts are work arounds for a feature _lacking_ in the kernel. A
> feature that, as I've mentioned, is supported on any piece of networking
> gear (and of course, let's not forget there's a specific option in the
> kernel config *just* for "behave like a router").
>
> If my patch was invasive and broke things, I would understand the
> hesitation, but this is a feature that allows people to *choose* to do this
> if they need to and the code is pretty self-contained.

I'm with you - this is a useful feature! But there aren't many other things 
I've found that can be cleared from the kernel other than by reloading a 
module, and dmesg -c. I think the object here isn't this particular patch, 
but the can-of-worms that it opens up.

Note that this is just clearing the hardware statistics on the interface, and 
would not require any kind of atomic_increment addition for interfaces that 
support that. It would be kind-of awkward to implement this on drivers that  
increment stats in hardware though (lo, vlan, br, etc.) This also brings up 
the question of resetting the stats for 'netstat -s'

What would be great is if ifconfig, netstat and their ilk just had a -z flag 
instead. This would write a file to the local user's home directory with a 
stats snapshot, and then every subsequent run would auto-calculate against 
the snapshot. You'd also need some way of resetting this when the stats 
actually _do_ reset (driver reload, reboot.) to avoid negative numbers.
That way, you can get what you want without having to write a bunch of 
fragile, awkward scripts, and the kernel isn't throwing away information 
either.
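
A rough Python sketch of that userspace snapshot idea follows. It is not how ifconfig or netstat work today; the file location, function names and counter names are placeholders chosen for the example, and the reset check is a simple heuristic.

```python
import json
import os
import tempfile


def load_baseline(path):
    """Return the saved snapshot, or an empty dict if none exists yet."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_baseline(path, counters):
    """Record the current counters as the new zero point (the "-z" action)."""
    with open(path, "w") as fh:
        json.dump(counters, fh)


def relative_to_baseline(counters, baseline):
    """Subtract the baseline, unless the real counters went backwards.

    A current value below its baseline means the underlying stats were
    reset (driver reload, reboot), so the stale baseline is ignored and
    the raw values are reported instead of negative numbers.
    """
    if any(counters[k] < baseline.get(k, 0) for k in counters):
        return dict(counters)
    return {k: v - baseline.get(k, 0) for k, v in counters.items()}


path = os.path.join(tempfile.mkdtemp(), "eth0.baseline.json")  # hypothetical location
save_baseline(path, {"rx_packets": 5000, "rx_errors": 12})

print(relative_to_baseline({"rx_packets": 5600, "rx_errors": 12}, load_baseline(path)))
print(relative_to_baseline({"rx_packets": 30, "rx_errors": 0}, load_baseline(path)))
```

The backwards-counter check misses a reset that is followed by enough traffic to pass the old baseline before the next run, so a sturdier version would also need some record of when the driver was last loaded or the machine last booted.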

 - Brent


Re: reminder, 2.6.18 window...

2006-05-24 Thread Phil Dibowitz
On Wed, May 24, 2006 at 03:05:54PM -0400, Jeff Garzik wrote:
> Phil Dibowitz wrote:
> >On Wed, May 24, 2006 at 02:23:05PM -0400, Jeff Garzik wrote:
> >>I disagree that we should bother about clearing statistics.  It always 
> >>adds more complication than necessary.  Few (if any) other statistics in 
> >>Linux permit easy clearing, often because adding operations other than 
> >>'increment' or 'read' requires adding expensive spinlocks or atomic 
> >>operations.
> >
> >Every networking device in the world supports clearing interface 
> >statistics.
> >Why should linux not be able to do the most basic operation on any
> >cisco/juniper/enterasys/whatever managed switch or router?
> >
> >It's a common operation on a network interface, I don't see why this is a
> >concern.
> >
> >When I'm debugging a networking issue on a cluster of hundreds and hundreds
> >of machines at work, I want to be able to reset them all quickly, and get a
> >rough idea of if they're all climbing, if they're all climbing at the same
> >rate, etc. And being able to do "for i in `cat hostlist`; do ssh $i
> >ethtool -z eth0; done" is really, really, REALLY, useful.
> 
> Obtaining the difference between two numbers is not that difficult.
> 
> Given any method of clearing statistics across your cluster, I'm certain 
> you can come up with a similar method of obtaining the current statistic 
> (the baseline).

Right, I'm aware there are other ways of doing this - I've written scripts to
record hundreds of numbers, and then subtract them from each other. But
those scripts are work arounds for a feature _lacking_ in the kernel. A
feature that, as I've mentioned, is supported on any piece of networking gear
(and of course, let's not forget there's a specific option in the kernel config
*just* for "behave like a router").

If my patch was invasive and broke things, I would understand the hesitation,
but this is a feature that allows people to *choose* to do this if they need
to and the code is pretty self-contained.

-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
 - Benjamin Franklin, 1759





Re: reminder, 2.6.18 window...

2006-05-24 Thread Jeff Garzik

Phil Dibowitz wrote:
> On Wed, May 24, 2006 at 02:23:05PM -0400, Jeff Garzik wrote:
>> I disagree that we should bother about clearing statistics.  It always
>> adds more complication than necessary.  Few (if any) other statistics in
>> Linux permit easy clearing, often because adding operations other than
>> 'increment' or 'read' requires adding expensive spinlocks or atomic
>> operations.
>
> Every networking device in the world supports clearing interface statistics.
> Why should linux not be able to do the most basic operation on any
> cisco/juniper/enterasys/whatever managed switch or router?
>
> It's a common operation on a network interface, I don't see why this is a
> concern.
>
> When I'm debugging a networking issue on a cluster of hundreds and hundreds
> of machines at work, I want to be able to reset them all quickly, and get a
> rough idea of if they're all climbing, if they're all climbing at the same
> rate, etc. And being able to do "for i in `cat hostlist`; do ssh $i ethtool -z
> eth0; done" is really, really, REALLY, useful.


Obtaining the difference between two numbers is not that difficult.

Given any method of clearing statistics across your cluster, I'm certain 
you can come up with a similar method of obtaining the current statistic 
(the baseline).


Jeff





Re: reminder, 2.6.18 window...

2006-05-24 Thread Phil Dibowitz
On Wed, May 24, 2006 at 02:23:05PM -0400, Jeff Garzik wrote:
> I disagree that we should bother about clearing statistics.  It always 
> adds more complication than necessary.  Few (if any) other statistics in 
> Linux permit easy clearing, often because adding operations other than 
> 'increment' or 'read' requires adding expensive spinlocks or atomic 
> operations.

Every networking device in the world supports clearing interface statistics.
Why should linux not be able to do the most basic operation on any
cisco/juniper/enterasys/whatever managed switch or router?

It's a common operation on a network interface, I don't see why this is a
concern.

When I'm debugging a networking issue on a cluster of hundreds and hundreds
of machines at work, I want to be able to reset them all quickly, and get a
rough idea of if they're all climbing, if they're all climbing at the same
rate, etc. And being able to do "for i in `cat hostlist`; do ssh $i ethtool -z
eth0; done" is really, really, REALLY, useful.

As for SNMP statistics, again, this is just like clearing stats on any other
platform - it's a manual thing... you're *choosing* to reset the stats, and
accept the consequences. It's not like the patch introduces some nightly
resetting - it's just an _option_ to users.

It's about providing an option that requires no extra complications in the
code (at least in this case), and that has been requested on every
sysadmin-related list I'm on, and in most cases is a recurring topic.

-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
 - Benjamin Franklin, 1759





Re: reminder, 2.6.18 window...

2006-05-24 Thread Rick Jones
Those folks wanting link-level stats over an interval (I'm assuming that
is why someone would want to zero stats?) should feel free to embrace
and extend beforeafter:


ftp://ftp.cup.hp.com/dist/networking/tools/

rick jones
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: reminder, 2.6.18 window...

2006-05-24 Thread Jeff Garzik

Phil Dibowitz wrote:
> David Miller wrote:
>> Some time in the next few weeks, it is likely that the 2.6.18
>> merge window will open up shortly after a 2.6.17 release.
>>
>> So if you have major 2.6.18 submissions planned for the networking,
>> you need to start thinking about getting it to me now.
>>
>> There is a 2.6.18 tree up at:
>>
>> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.18.git
>>
>> All it has right now is the I/O AT stuff at the moment, and I plan to
>> put Stephen Hemminger's LLC multicast/datagram changes in there as
>> well.
>
> David,
>
> I posted a patch for adding support for network device statistic
> resetting via ethtool. I saw no objections to it... it implements the
> framework as well as skge support, so it touches both your and Jeff's area.
>
> For your reference, here's the two times I've posted it this month - I'm
> happy to send it along again.
>
> 2006-05-18
> RESEND: [PATCH] Interface Stat Clearing Framework, skge support, ethtool
> http://marc.theaimsgroup.com/?l=linux-netdev&m=114794065502155&w=2
>
> 2006-04-30
> [PATCH] Interface Stat Clearing Framework, skge support, ethtool
> http://marc.theaimsgroup.com/?l=linux-netdev&m=114636704207480&w=2


I disagree that we should bother about clearing statistics.  It always 
adds more complication than necessary.  Few (if any) other statistics in 
Linux permit easy clearing, often because adding operations other than 
'increment' or 'read' requires adding expensive spinlocks or atomic 
operations.


Jeff





Re: reminder, 2.6.18 window...

2006-05-24 Thread jamal
On Wed, 2006-24-05 at 01:01 -0700, Phil Dibowitz wrote:
[..]
> 
> For your reference, here's the two times I've posted it this month - I'm
> happy to send it along again.


The problem with resetting stats is it is _most definitely_ going to
break management apps like SNMP.

This is not to say this is not useful for testing and at times debugging
- but backward compat and not pissing off the standards which count on
this long standing feature is more important. So if you can
make it work without breaking apps, it will be more palatable I am sure
to both Dave and Jeff. I would go as far as suggest that you should
allow for arbitrary setting (of which 0 is a special case).

cheers,
jamal
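
To illustrate why a reset can confuse a poller, here is a minimal sketch of the delta calculation a management application might perform on a 32-bit counter. It assumes the poller treats any decrease as a wrap-around; `counter_delta` and the sample values are hypothetical and not taken from any particular SNMP tool.

```python
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous, current, modulus=COUNTER32_MODULUS):
    """Delta between two samples, assuming any decrease is a wrap-around."""
    if current >= previous:
        return current - previous
    return current + modulus - previous


# Ordinary polling interval: 1000 packets were counted.
normal = counter_delta(5_000, 6_000)

# Genuine wrap: the counter was near its maximum and rolled over.
wrapped = counter_delta(2 ** 32 - 100, 400)

# Reset: the counter was cleared at 5000 and then counted 200 packets,
# but the poller reports a delta of nearly 2**32 instead of 200.
after_reset = counter_delta(5_000, 200)

print(normal, wrapped, after_reset)
```

A poller that instead checks whether the old sample was plausibly "near" overflow could classify the third case as a discontinuity and restart its delta calculation.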



Re: reminder, 2.6.18 window...

2006-05-24 Thread Phil Dibowitz
David Miller wrote:
> Some time in the next few weeks, it is likely that the 2.6.18
> merge window will open up shortly after a 2.6.17 release.
> 
> So if you have major 2.6.18 submissions planned for the networking,
> you need to start thinking about getting it to me now.
> 
> There is a 2.6.18 tree up at:
> 
> master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.18.git
> 
> All it has right now is the I/O AT stuff at the moment, and I plan to
> put Stephen Hemminger's LLC multicast/datagram changes in there as
> well.

David,

I posted a patch for adding support for network device statistic
resetting via ethtool. I saw no objections to it... it implements the
framework as well as skge support, so it touches both your and Jeff's area.

For your reference, here's the two times I've posted it this month - I'm
happy to send it along again.

2006-05-18
RESEND: [PATCH] Interface Stat Clearing Framework, skge support, ethtool
http://marc.theaimsgroup.com/?l=linux-netdev&m=114794065502155&w=2

2006-04-30
[PATCH] Interface Stat Clearing Framework, skge support, ethtool
http://marc.theaimsgroup.com/?l=linux-netdev&m=114636704207480&w=2

I'd like to get this into 2.6.18. It's self-contained, so it has little
chance of breaking other things and adds a useful feature that I've seen
a lot of requests for.

Thanks.
-- 
Phil Dibowitz [EMAIL PROTECTED]
Freeware and Technical Pages  Insanity Palace of Metallica
http://www.phildev.net/   http://www.ipom.com/

"Be who you are and say what you feel, because those who mind don't
matter and those who matter don't mind."
 - Dr. Seuss



