Re: Best practice to get counters from a huge amount of routers.

Richard Mayers Mon, 09 May 2016 06:17:26 -0700

> Hi,
Hi again, and thanks for the answers. Much appreciated! I have left my
comments in-line.
>
> Like James, I wrote tools that handle several 10K variables on a 5-minute
> interval.
>
> I am not at all trying to discourage you in your project. There are some
> concerns you should keep in mind:
>
> Please always remember that the primary role of the network / network 
> equipment you try to manage is to transport data.
> I mean payload data, not management data. Have you estimated/calculated the
> ratio of the bandwidth you will consume "just" for management?
> (You did not mention how many counters per router you intend to collect data
> about.)
> Measurement should not bias what is measured (or only to a minimal extent).
Well, that should not be a problem, since I have everything simulated
on a server (using mininet and quagga routers). I have a dedicated
out-of-band interface on every device just for SNMP and NetFlow.
Currently, for SNMP, I am reading the data on the same machine and
storing it in a file (all the devices share the same file system), so
it's kind of "cheating", but as a first step it's what I was told to
do. Even with that solution I am having CPU problems due to all these
processes polling the counters and writing them to a file every second.
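
For reference, the management-bandwidth ratio asked about above can be
estimated with simple arithmetic. The sketch below uses the 200 routers and
20 interfaces mentioned in this thread; the byte sizes and the choice of two
counters per interface are assumptions for illustration only, not measured
values.

```python
# Rough estimate of SNMP polling traffic. All byte sizes are assumptions.
ROUTERS = 200            # from the thread
INTERFACES = 20          # upper bound per router, from the thread
COUNTERS_PER_IF = 2      # assumption: one in-octets and one out-octets counter
BYTES_PER_VARBIND = 30   # assumption: encoded OID + value
PACKET_OVERHEAD = 100    # assumption: SNMP/UDP/IP/Ethernet headers per packet


def polling_bps(routers=ROUTERS, interfaces=INTERFACES,
                counters_per_if=COUNTERS_PER_IF, interval_s=1.0):
    """Approximate polling traffic in bits per second.

    Assumes one request and one response packet per router per poll,
    with all counter values carried in the response.
    """
    varbinds = interfaces * counters_per_if
    bytes_per_poll = 2 * PACKET_OVERHEAD + varbinds * BYTES_PER_VARBIND
    return routers * bytes_per_poll * 8 / interval_s


def management_ratio(link_bps, **kwargs):
    """Fraction of a link of the given speed consumed by polling."""
    return polling_bps(**kwargs) / link_bps


if __name__ == "__main__":
    print(f"{polling_bps() / 1e6:.2f} Mbit/s")
    print(f"{management_ratio(1e9):.4%} of a 1 Gbit/s link")
```

Under these assumptions the total is a little over 2 Mbit/s at a one-second
interval, which suggests the CPU cost of polling is the more likely
bottleneck than the bandwidth.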


> Switches and routers are not designed to assist the management station to
> this extent.
> SNMP is not the best method (high CPU load, low priority on the equipment) =>
> How about NetFlow? (nfsen is a wonderful free tool, if you accept a 5 min.
> period)
I am already using NetFlow, but for another purpose; SNMP was only
for the counters, to know the load on every link in "real time" (if I
manage to accomplish that).
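
As a minimal sketch of how link load is usually derived from octet counters:
take two readings, compute the difference modulo the counter width (so a
single wrap is handled), and divide by the time between them. The function
names are hypothetical; the 32- and 64-bit widths correspond to the IF-MIB
ifInOctets and ifHCInOctets counter types.

```python
def counter_delta(prev, curr, bits=64):
    """Difference between two counter readings, tolerating one wrap-around."""
    return (curr - prev) % (1 << bits)


def link_load_bps(prev_octets, curr_octets, interval_s, bits=64):
    """Average load in bits per second between two octet-counter readings."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return counter_delta(prev_octets, curr_octets, bits) * 8 / interval_s


def utilization(load_bps, if_speed_bps):
    """Load as a fraction of the interface speed."""
    return load_bps / if_speed_bps


if __name__ == "__main__":
    # A 32-bit counter that wrapped between the two readings.
    print(counter_delta(4294967290, 10, bits=32))
    # 125,000,000 octets in one second on a 1 Gbit/s link.
    print(utilization(link_load_bps(0, 125_000_000, 1.0), 1e9))
```

At one-second polling on fast links, the 64-bit counters are the safer
choice, since a 32-bit octet counter can wrap within a few seconds at
multi-gigabit rates.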
>
> There is a big difference between what can be done and what makes sense to
> do.
>
> In any case, Get-bulk (if there are several counters per router) seems more
> appropriate than Traps.
I can have up to ...20 interfaces per router.
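
With up to 20 interfaces, a single Get-bulk request per router could fetch
every interface counter in one round trip. The sketch below only builds the
argument list for the Net-SNMP `snmpbulkget` command; the host address,
community string and choice of OIDs are placeholders, and whether all rows
fit in one response depends on the agent.

```python
def bulk_get_argv(host, community="public", max_repetitions=20,
                  oids=("IF-MIB::ifHCInOctets", "IF-MIB::ifHCOutOctets"),
                  timeout_s=1, retries=0):
    """Build an snmpbulkget command line (SNMPv2c).

    -Cn0 : no non-repeating variables
    -CrN : ask for up to N rows per requested OID (max-repetitions)
    -t/-r: per-request timeout in seconds and number of retries
    """
    return [
        "snmpbulkget", "-v2c", "-c", community,
        "-Cn0", f"-Cr{max_repetitions}",
        "-t", str(timeout_s), "-r", str(retries),
        host, *oids,
    ]


if __name__ == "__main__":
    # 10.0.0.1 is a placeholder address.
    print(" ".join(bulk_get_argv("10.0.0.1")))
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True,
text=True)`. Spawning one process per router every second is itself costly,
so a long-running poller using an SNMP library may be preferable.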
>
> There are some questions you should consider and find an answer to:
>
> How are the time-outs and retries of your SNMP requests configured (you
> intend to address some "real" equipment that may respond with latency)?
>         A two-second time-out and 3 retries don't really make sense when
> polling every second.
So far I have not considered that. I assume some kind of ideal
scenario where everything works and everything is "under my
control", like my simulation or a data center.
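
To make the time-out point concrete: assuming a fixed time-out per attempt
and no back-off (an assumption, check your SNMP library), a request can block
for time-out x (retries + 1) before it is declared failed. A small sketch
with hypothetical helper names:

```python
def worst_case_wait_s(timeout_s, retries):
    """Longest time one request can block: the first attempt plus each retry."""
    return timeout_s * (retries + 1)


def fits_in_interval(timeout_s, retries, poll_interval_s):
    """True if a failing request still gives up before the next poll is due."""
    return worst_case_wait_s(timeout_s, retries) <= poll_interval_s


if __name__ == "__main__":
    # The values from the quoted question: 2 s time-out, 3 retries, 1 s polling.
    print(worst_case_wait_s(2.0, 3), fits_in_interval(2.0, 3, 1.0))
    # A tighter setting that stays within a one-second interval.
    print(worst_case_wait_s(0.3, 1), fits_in_interval(0.3, 1, 1.0))
```

With a 2 s time-out and 3 retries, one dead device can stall a sequential
poller for 8 s, i.e. eight missed one-second cycles.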

> Impact of a "non-responding" device (a faulty or defective one), or of
> several faulty devices (let's say 10)? What happens to the 190 others?
> What is the latency of the network that interconnects your 200 routers?
The latency should be very small, 1) because I am using the
out-of-band network to monitor, and 2) because my idea is to load-balance in
datacenter networks.
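
One way to keep a few non-responding devices from delaying the others is to
poll all routers concurrently and stop waiting at a deadline. The sketch
below uses Python's standard thread pool; `poll_fn` is a hypothetical
callable standing in for the real SNMP request, and `cancel_futures` needs
Python 3.9 or later.

```python
import concurrent.futures as cf


def poll_all(devices, poll_fn, deadline_s, max_workers=50):
    """Poll devices in parallel; return (results, failed).

    Devices that raise or do not answer before deadline_s are reported as
    failed. Threads cannot be killed, so poll_fn should still enforce its
    own socket time-out.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max_workers)
    futures = {pool.submit(poll_fn, d): d for d in devices}
    done, not_done = cf.wait(futures, timeout=deadline_s)

    results, failed = {}, []
    for fut in done:
        device = futures[fut]
        try:
            results[device] = fut.result()
        except Exception:
            failed.append(device)
    failed.extend(futures[fut] for fut in not_done)

    # Do not block on stragglers; drop anything that has not started yet.
    pool.shutdown(wait=False, cancel_futures=True)
    return results, sorted(failed)
```

The pool should have more workers than the number of devices expected to
hang at once; otherwise the slow requests occupy every worker and the
healthy devices wait behind them.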
> How do you "time-stamp" the collected data ?
I don't; I assume that all the data from the routers comes from the same
sampling period.
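
A minimal sketch of why a time-stamp per sample matters: if a reply arrives
late, dividing by the nominal one-second interval overstates the rate. Here
the time-stamp is taken on the management station with a monotonic clock,
which is an assumption; the agent's sysUpTime fetched in the same request
would be an alternative. Counter wrap is ignored for brevity.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    t: float      # seconds, monotonic clock at reception
    octets: int   # octet counter value


def take_sample(octets, clock=time.monotonic):
    """Attach a reception time-stamp to a counter reading."""
    return Sample(clock(), octets)


def rate_bps(prev, curr):
    """Rate between two samples using the actual elapsed time."""
    elapsed = curr.t - prev.t
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr.octets - prev.octets) * 8 / elapsed


if __name__ == "__main__":
    a = Sample(0.0, 0)
    b = Sample(1.25, 1_250_000)   # reply arrived 0.25 s late
    print(rate_bps(a, b))         # uses the real 1.25 s
    print((b.octets - a.octets) * 8 / 1.0)   # nominal 1 s: 25% too high
```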

I don't really know what to do. I need to know the load per link as
fast as possible so that I can improve the load-balancing decisions.

Thanks a lot,
Richard

_______________________________________________
Net-snmp-users mailing list
Net-snmp-users@lists.sourceforge.net
Please see the following page to unsubscribe or change other options:
https://lists.sourceforge.net/lists/listinfo/net-snmp-users
