Re: netflow in the core used for surveillance

2021-08-30 Thread Avi Freedman
Hi, all.

Re: last week's thread on the Vice article -

I can only speak for Kentik, and *we* don't resell or give 3rd party access
to NetFlow data from our hundreds of customers.  And never have.

But there is definitely interest out there.  We do get approached about it
periodically and always say no.  Mostly by commercial vendors and not (at
least directly) by governmental bodies.

Of course, our *customers* could in theory share their data via API key, or
by using our outbound streaming firehose.  But I've never talked to a
customer who wanted to share their flow data with a 3rd party.  Usually by
far the opposite.

The closest thing to this that our customers do ask about is aggregate
community views, which people could contribute to in order to help themselves
and the community.  While we don't do this now, if and when we do, it: 1) won't
be with raw data; 2) will be opt-in only; 3) will be designed with
customers and have open methodology; and 4) will likely start with synthetic
test, BGP, device metrics, and other non-flow data.

Thanks,

Avi


Re: Service Provider NetFlow Collectors

2018-12-31 Thread Avi Freedman
We do have a minimum for commercial service that's more like $1500/mo but we 
are coming out with a free tier in Q1 with lower retention (among other deltas, 
but including fully slice and dice flow analytics +BGP that it sounded like 
Erik might be looking for).

Feel free to ping me if anyone would like to help us test the free tier in 
January.

Thanks,

Avi Freedman
CEO, Kentik

> Doesn't Kentik cost like $2000 a month minimum?
> 
> 
> On Mon, Dec 31, 2018 at 11:57 AM Matthew Crocker 
> wrote:
> 
> >  +1 Kentik as well,  DDoS, RTBH, Netflow.  Cloud based so I don't have to
> > worry about it.
> >
> > On 12/31/18, 11:37 AM, "NANOG on behalf of Bryan Holloway" <
> > nanog-boun...@nanog.org on behalf of br...@shout.net> wrote:
> >
> > +1 Kentik ...
> >
> > We've been using their DDoS/RTBH mitigation with good success.
> >
> >
> > On 12/31/18 3:52 AM, Eric Lindsjö wrote:
> > > Hi,
> > >
> > > We use kentik and we're very happy. Works great, tons of new features
> > > coming along all the time. Going to start looking into ddos detection
> > > and mitigation soon.
> > >
> > > Would recommend.
> > >
> > > Kind regards,
> > > Eric Lindsjö
> > >
> > >
> > > On 12/31/2018 04:29 AM, Erik Sundberg wrote:
> > >>
> > >> Hi Nanog….
> > >>
> > >> We are looking at replacing our Netflow collector. I am wondering what
> > >> other service providers are using to collect netflow data off their
> > >> Core and Edge Routers. Pros/Cons… What to watch out for; any info would
> > >> help.
> > >>
> > >> We are mainly looking to analyze the netflow data. Bonus if it does
> > >> ddos detection and mitigation.
> > >>
> > >> We are looking at
> > >>
> > >> ManageEngine Netflow Analyzer
> > >>
> > >> PRTG
> > >>
> > >> Plixer – Scrutinizer
> > >>
> > >> PeakFlow
> > >>
> > >> Kentik
> > >>
> > >> Solarwinds NTA
> > >>
> > >> Thanks in advance…
> > >>
> > >> Erik
> > >>
> > >>
> > >>
> > >
> >
> >
> >


LLDP via SNMP

2016-05-26 Thread Avi Freedman

Have had the question come up a few times, so I wanted to poll the community to 
see...

For those who are monitoring LLDP, how have you found the SNMP MIB support 
for it on Juniper, Cisco, Brocade, Arista, and others?

Wondering if you've needed to resort to CLI scraping or APIs to get the data?

Thanks,

Avi Freedman
CEO, Kentik



Re: vFlow :: IPFIX, sFlow and Netflow collector

2017-05-16 Thread Avi Freedman
 lack of precision might not be an issue (they
pretty much all use probabilistic data structures like HLLs to do count and 
topN).  

And MemSQL can operate in that mode as well though I don't think that was how 
Mehrdad was showing it working with vFlow.

But again you can't ever go 'back in time' for an ad hoc query with
them so it's probably more interesting as an augment and offloader for most 
uses where you'd normally think of storing many billions or a few trillion 
flows.

Happy flow-ing...

Avi Freedman
CEO, Kentik



Re: vFlow :: IPFIX, sFlow and Netflow collector

2017-05-16 Thread Avi Freedman

> "NANOG"  wrote on 05/16/2017 03:34:39 PM:

> Nice analysis of the current state of the art.

Thanks; of DIY for store-all approaches, at least :)  

Commercial options is a different thread and I'm conflicted so shouldn't 
try to summarize those...

> > And then, the biggest flow store I know of (1 or 2 carriers may want to argue
> > but I haven't seen theirs) is at DISA for DoD - > a decade of un-sampled flow
> > coming from SiLK.  All stored in hourly un-indexed files, essentially nothing
> > but CLI to access,
> 
> FlowViewer provides a web GUI for invoking SiLK analysis tools. Provides 
> textual and graphical analysis with the ability to track filtered subsets 
> over time. Screenshots, etc.:
> 
> https://sourceforge.net/projects/flowviewer/

Sorry, forgot about flowviewer - I've never seen it in use and asked at a bunch
of Flocons - but it looks updated more recently than I had thought.

On a related topic, I'd love to see NANOGers and general netops and perf-minded
people go to Flocon (put on by CERT, and heavily but not exclusively SiLK- and
security-focused).

Cross-pollination of interests, tools, and techniques will help us all...

> 
> Joe

Thanks,

Avi 



Re: DDOS, IDS, RTBH, and Rate limiting

2014-11-20 Thread Avi Freedman

> Netflow is stateful stuff, and just to run it on wirespeed, on hardware, 
> you need to utilise significant part of TCAM,

Cisco ASRs and Juniper MXs with inline jflow can do hundreds of K flows/second
without affecting packet forwarding.

> i am not talking that on some hardware it is just impossible to run it.
> So everything about netflow are built on assumption that hosting or ISP 
> can run it. And based on some observations, majority of small/middle 
> hosting providers are using minimal,just BGP capable L3 switch as core, 
> and cheapest but reliable L2/L3 on aggregation, and both are capable in 
> best case to run sampled sFlow.

Actually, sFlow from many vendors is pretty good (per your points about flow 
burstiness and delays), and is good enough for dDoS detection.  Not for 
security forensics, or billing at 99.99% accuracy, but good enough for
traffic visibility, peering analytics, and (d)DoS detection.



> So for a small hosting(up to 10G), i believe, FastNetMon is best 
> solution. Faster, and no significant investments to equipment. Bigger 
> hosting providers might reuse their existing servers, segment the 
> network, and implement inexpensive monitoring on aggregation switches 
> without any additional cost again.

It can be useful to have a 10G network monitoring box of course...

And with the right setup you can run FastNetMon or other tools in
addition to generating flow that can be of use for other purposes
as well...

> Ah, and there is one more huge problem with netflow vs FastNetMon - 
> netflow just by design cannot be adapted to run pattern matching, while 
> it is trivial to patch FastNetMon for that, turning it to mini-IDS for 
> free.

It's true, having a network tap can be useful for doing PCAP-y stuff.

But taps can be difficult or at least time consuming for people to
put in at scale.  Even, we've seen, for folks with 10G networks.
Often because they can get 90% of what they need for 4 different
business purposes from just flow :)

> Best regards,
> Denys

Avi Freedman| Your flow has something to show you; can you see it?|
CEO, CloudHelix | (avi at cloudhelix dot com) | my name one word on skype |



Re: DDOS, IDS, RTBH, and Rate limiting

2014-11-22 Thread Avi Freedman

> > On the contrary - SPAN nee port mirroring cuts into the
> > frames-per-second budget of linecards, as the traffic is in essence
> > being duplicated.  It is not 'free', and it has a profound impact on
> > the switch's data-plane traffic forwarding capacity.
> > 
> > Unlike NetFlow.
>
> In hosting case mirroring usually done for uplink port, but i have to 
> agree, it might be a problem.

Have you seen any issues with SPANning?  We usually advise something like
a $1k Net Optics tap or, to be cheaper, there are actually $50 fiber cables
with 30/70 taps embedded (so two such, one for RX tap and one for TX tap).

Of course, that only grabs a single 10gig whereas with SPAN you can 
potentially do more - but the issue we've seen across vendors is that
if you try to send more traffic into a SPAN port than its size, bad
things can happen.  Head of line blocking, random congestion, and other
strange failures.
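
To make the "more traffic into a SPAN port than its size" point concrete, here is a minimal Python sketch of the arithmetic. The function name and the example peak rates are hypothetical, and it assumes both RX and TX of each source port are mirrored into a single destination port:

```python
def span_load(sources, span_port_gbps):
    """Return (offered_gbps, ratio) for mirroring `sources` into one SPAN port.

    `sources` is a list of (rx_gbps, tx_gbps) peak rates, one tuple per
    mirrored source port. Mirroring both directions copies RX and TX, so a
    busy full-duplex 10gig link can offer up to 20gig to the SPAN port.
    """
    offered = sum(rx + tx for rx, tx in sources)
    return offered, offered / span_port_gbps


# Illustrative numbers only: two source ports mirrored into a 10gig SPAN port.
offered, ratio = span_load([(6.0, 3.5), (2.0, 1.0)], span_port_gbps=10.0)
if ratio > 1.0:
    print(f"SPAN port oversubscribed: {offered} Gbps offered ({ratio:.2f}x)")
```

A ratio above 1.0 means the mirror destination cannot carry the copies, which is the situation where the failures described above tend to show up.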

And you trade off potential catastrophic downtime from SPAN-related
network destabilization against guaranteed downtime to bring links down
to tap them.

> "Major" expenses - tuning server according author recommendations, and 
> writing shell script that will send to 4948 command to blackhole IP. For 
> qualified sysadmin it is 2 hours of work, and $500 max as a "labor" 
> cost. Thats it. What can be cheaper than $2000 in this case? I guess i 
> wont get answer.

I think the issue is not with your providing the info about fastnetmon,
its genesis, and what you see as the great use cases for it - more around
the statements on flow as an unusable source of data for various purposes.

Things seem to have died down around that though, which is good :)

> ---
> Best regards,
> Denys

Avi Freedman| Your flow has something to show you; can you see it?|
CEO, CloudHelix | (avi at cloudhelix dot com) | my name one word on skype |



Re: DDOS, IDS, RTBH, and Rate limiting

2014-11-22 Thread Avi Freedman

> > Cisco ASRs and MXs with inline jflow can do hundreds of K flows/second
> > without affecting packet forwarding.
>
> Yes, i agree,those are good for netflow, but when they already exist in 
> network.
>
> Does it worth to buy ASR, if L3 switch already doing the job 
> (BGP/ACL/rate-limit/routing)?

Not suggesting that anyone should change out their gear, though per my other
message, I've seen SPAN make things go wonky on almost every vendor that
ISPs use for switching.

> Well, if it is available, except hardware limitations, there is second 
> obstacle, software licensing cost. On latest JunOS, for example on EX2200, 
> you need to purchase license (EFL), and if am not wrong it is $3000 for 
> 48port units.
>
> So if only sFlow feature is on stake, it worth to think, to purchase license,
> or to purchase server. Prices for JFlow license on MX, just for 5/10G is way 
> above cost of very decent server.

I believe that smaller MXs can run it for free.  Larger providers we've 
worked with often have magic cookies they can call in to get it enabled,
but I understand you're talking about the smaller-provider (or at least ~ 
10gig per POP across multiple POPs) case.

We see a lot of Brocade for switching in hosting providers, which makes 
sFlow easy, of course.

> > And with the right setup you can run FastNetMon or other tools in
> > addition to generating flow that can be of use for other purposes
> > as well...
>
> Technically there is ipt_NETFLOW, that can generate netflow on same box, 
> for statistical/telemetry purposes. But i am not sure it is possible to 
> run them together.

At fractional 10gig you can just open pcap on a 10gig interface on a Linux
box getting a tap, of course.

What we did was use myricom cards and the myri_snf drivers and take from
the single-consumer ring buffers into large in-RAM ring buffers, and 
make those ring buffers available via LD_PRELOAD or cli tools to allow
flow, snort, p0f, tcpdump, etc to all be run at the same time at 10gig.

The key for that is not going through the kernel IP stack, though.

> > But taps can be difficult or at least time consuming for people to
> > put in at scale.  Even, we've seen, for folks with 10G networks.
> > Often because they can get 90% of what they need for 4 different
> > business purposes from just flow :)
>
> About scaling, i guess it depends on proper deployment strategy and 
> sysadmins/developers capabilities. For example to deploy new ruleset 
> for my pcap-based "homemade" analyser to 150 probes across the country - 
> is just one click.

Sounds cool.  You should write up that use case.  Hopefully you've secured
the metadata/command push channel well enough :)

> Best regards,
> Denys

Avi Freedman| Your flow has something to show you; can you see it?|
CEO, CloudHelix | (avi at cloudhelix dot com) | my name one word on skype |



Re: Cloudflare, and the 120Gbps DDOS "that almost broke the Internet"

2013-03-27 Thread Avi Freedman

An important question...

I recall a peering panel at an ISPCON in 1996 when the current 
Peering Badguys, BBN, were represented by John, who listened
to a ton of bitching for an hour about the unfairness of it all and
said (paraphrasing)...

"I understand you all have your opinions and desires but I just want
 to point out one thing.  It is now 1996, 2 years after the widespread
 adoption of the web, and in every city in the US there are at least
 two ISPs happily providing unlimited {dialup} access for under $20/mo.
 What do you think we'd have if it were run or regulated by the government?"

Luckily, many bureaucrats and politicians in our government do
understand that.  And so far The Community has been able to put
pressure on international bodies, and other governments don't 
have the clout.  Hopefully that remains the case for some time.

Avi

> In general, governments have avoided regulating various aspects of
> the Internet, in part because of lack of understanding and in part
> because the community keeps telling them that trying to regulate
> won't work because of its decentralized nature.  As the Internet
> becomes increasingly important to each country's economy and its
> citizens, the status quo is not likely to continue. 
> 
> The real question is, when governments do decide to try and help
> "improve the Internet", who will they be listening to, and will
> the operator community have spoken with a clear enough voice in
> these matters on what actually would make for an improvement?
> 
> FYI,
> /John




Re: [Paper] B4: Experience with a Globally-Deployed Software Defined

2013-08-17 Thread Avi Freedman

No, people never use *flow controllers* for anything.

People have been doing SDN since before Google was around.  

OK, so it was horrible expect scripts but it worked.

Avi

> Unpossible.  I heard that no one really uses sdn for anything.
> 
> :)
> 
> T




Re: [Paper] B4: Experience with a Globally-Deployed Software Defined

2013-08-17 Thread Avi Freedman

> On Sat, Aug 17, 2013 at 2:32 PM, Avi Freedman  wrote:
> 
> > No, people never use *flow controllers* for anything.
>
> > People have been doing SDN since before Google was around.
> > OK, so it was horrible expect scripts but it worked.
>
> Not really.

Note I am talking about flow controllers in my first point.

(And I was trying to be funny to match Todd's tone, though
 I guess it's dangerous to try to copy the master)

Re: flow controllers -

The idea of centralized decision makers doing something (typically
per flow) has been proposed, in my experience, by those with little
operational experience or those with extraordinarily constrained
topologies, types of traffic, and usually external filtering to
constrain the types of traffic one could face.

Because...  There have been no proposals that I have seen (or
that those who are at the Major Vendors who follow it more 
closely tell me about when I ask a few times/year) to actually
deal with the every-packet-is-a-flow problem we saw first with
7206VXRs and that remain a real possibility for Internet-
connected networks.  Distributing flow controllers and making
them hierarchical doesn't seem to help in the architectures
that I've seen proposed.

So it seems to be of use only for very tiny networks or for
very constrained and filtered or non Internet-connected topologies.

I'd be interested to be shown otherwise.

> Automatic reconfiguration of routers is not what a software-defined network 
> is.
>
> It is  one of the things (but not all of the things)  that SDN provides.
> 
> A software defined network is one where the forwarding behavior can be
> completely defined
> in software running outside of the devices that perform the forwarding.

That said, I wince every time someone starts talking about (not suggesting
you are here but many do) putting the routing engineer or designer in a box
that sits at the bottom of or beside the network.

Those who have experience and/or run larger infrastructure usually say
words like "of course we have to worry about feedback loops" but many don't.

I think innovation is great but I don't think there are that many shops
that are better off writing their own control plane (centralized, distributed,
whatever) right now.

It's worth remembering that Google is a software company.  They are far ahead 
in software defined everything. 

> You can write expect scripts all day; but you cannot turn your basic switch
> into a Load balancer  or stateful firewall with one.
> or decide in real time exactly which destination Ethernet ports a packet
>  coming in a certain port is going to touch,  without having structured
> VLANs and  static MAC tables on the switches ahead of time.
> 
> Changing device configurations with expect scripts is a limited tiny subset
> of what SDN is.

True, but the number of production environments that are going to be more
stable or scalable by having people build their own control logic is pretty
small in my experience.  

And being able to debug and reach out to a community of operators with
a common set of experience of what to do, not to do, and how to debug
is extraordinarily valuable for production networks.

When I look at most of the non-Google big guys, SDN means pushing the
vendors for better control plane instrumentation and ability to program
(but more on the instrumentation side, as that's where the gaps have been), and
potentially to get some cross-provider way of doing the above.

+ having merchant silicon one can get/use for cheap, typically for more
constrained topologies, doing pretty dumb switching and/or routing stuff.

> -JH

Where I see the delta a lot given the customer conversations I have is
in the magic provisioning of cloud network infrastructures.

New school SDN is that everything is a tunnel, magic software maps things,
commercial providers doing this uniformly have to aggressively rate-limit
their clients, and performance for content delivery is limited 
because the hypervisors must be bridging and can't do PCI passthrough or
SR-IOV.

Old school SDN (not really that old school) is API-based provisioning
of network devices with vendor support (let's say Juniper) to do 
filtering, VLANs, and shaping and tunnels where needed.  

It'll definitely be interesting to see where things go over the next
few years.

I know tens of companies who have run away from cloud providers with
new(er) school SDN-ish infrastructures for the simplicity of just
having some high performance dedicated machines/hypervisors with
dead simple switching infrastructure.

Anyway, innovation is great but I just see few companies with the
understanding to go build their own control plane software to 
connect to the Internet with.  And those vendors who do build it
will get borged by one of the routing/switching vendors and things
will become product features, differentiated by providers, most
likely.  (Though I hope not)

Avi




Re: oss netflow collector/trending/analysis

2014-05-02 Thread Avi Freedman

There's also SiLK from CMU.  It's powerful but has a learning curve.

I also see pmacct being used both by some end networks and by 
some vendors as part of systems.

Avi

> Hey There,
> 
> I was just wondering, for people who are doing netflow analysis with
> open source tools and who are doing at least 10k or more flows per
> second, what are you using?
> 
> I know of three tool sets:
> 
> - The classic osu flow-tools and the modern continuation/fork.
> - ntop
> - nfdump/nfsen
> 
> Is there anything else I've missed? A few folks here really seem to like
> nfsen/nfdump.
> 
> Thanks,
> 
> Matt



Re: NetFlow - path from Routers to Collector

2015-09-01 Thread Avi Freedman

Looking at probably 100 networks' flow paths over the last year,
I'd say 1 or 2 have OOB for flow.

Maybe another 10-20 have interest in taking simpler time series
data of top talkers over their OOB networks, but not the flow
itself.

Agree w Roland that it can cause problems with telemetry if
there are big network misconfigs.  But for folks seeing DDoS,
we implement rate-limiting of the flows/sec via local proxies
to avoid overwhelming network capacity with the flow data...
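
As an illustration of capping flows/sec in a local proxy, here is a minimal token-bucket sketch in Python. It is an assumption about how such a limiter could look, not a description of any vendor's proxy; the class name and parameters are hypothetical:

```python
class FlowRateLimiter:
    """Token bucket: forward at most `rate` flow records/sec, bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)      # tokens (flow records) added per second
        self.burst = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)   # start with a full bucket
        self.last = None             # timestamp of the previous call
        self.dropped = 0             # records refused so far

    def allow(self, now):
        """Return True if a flow record arriving at time `now` may be forwarded."""
        if self.last is not None:
            # Refill in proportion to elapsed time, capped at the burst size.
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.dropped += 1
        return False


# Illustrative use: 2 records/sec with a burst of 2, timestamps in seconds.
limiter = FlowRateLimiter(rate=2, burst=2)
decisions = [limiter.allow(t) for t in (0.0, 0.0, 0.0, 1.0)]
print(decisions, "dropped:", limiter.dropped)
```

In a real proxy `now` would come from something like `time.monotonic()`, and the dropped count would be exported so the collector knows the data was thinned during an attack.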

Avi

>   I think the key here is that Roland isn't often constrained by
> these financial considerations.
> 
>   I would respectfully disagree with Roland here and agree with
> Job, Niels, etc.
> 
>   A few networks have robust out of band networks, but most
> I've seen have an interesting mixture of things and inband is usually
> the best method.
> 
>   Those that do have "separate" networks may actually be CoC
> services from another department in the same company riding the same
> P/PE devices (sometimes routers).
> 
>   I've seen oob networks on DSL, datacenter wifi or cable swaps
> through the fence to an adjacent rack.
> 
>   An oob network need not be high bandwidth enough to do netflow
> sampling, this is well regarded as a waste of money by many as the costs
> for these can often be orders of magnitude more compared to a pure-IP
> or internet service.
> 
>   I'll say this ranks up there with people who think
> MPLS VPN == Encryption.  It's not unless you think a few byte
> label is going to confuse people.
> 
>   - Jared



Re: NetFlow - path from Routers to Collector

2015-09-01 Thread Avi Freedman
(Said Roland:)

> Again, to clarify - I count VLANs/VRFs as being sufficiently out-of-band 
> to handle flow telemetry on a reasonable basis without mixing it in with 
> customer traffic.
> 
> That changes the ratio.



> I agree with you, Avi, and others that a dedicated OOB network *just for 
> flow telemetry* doesn't make economic sense in most (any?) scenarios.
> 
> What I'm saying is that it oughtn't to be mixed in with customer 
> data-plane traffic.  Ideally, all management-plane traffic would 
> traverse a separate physical infrastructure.  Since we don't live in an 
> ideal world, virtual separation is generally Good Enough.

We see well under 20% doing logical separation but definitely folks
doing it...  For the definition of OOB as "separate routers and 
switches", we don't see anyone really sending flow over that kind
of OOB network.

> ---
> Roland Dobbins 

Avi Freedman
CEO, Kentik
avi at kentik dot com



Re: NetFlow - path from Routers to Collector

2015-09-01 Thread Avi Freedman
(Jared wrote):



> Most people I've seen have little data or insight into their 
> networks, or don't have the level that they would desire as 
> tools are expensive or impossible to justify due to capital costs.  
> Tossing in a recurring opex cost of DC XC fee  + transport + XC fee + 
> redundant aggregation often doesn't have the ROI you infer here. 
> I've put together some models in this area.  It seems to me the 
> DC/real estate companies involved could make a lot (more) money by 
> offering an OOB service that is 10Mb/s flat-rate for the same as an XC 
> fee and compete with their customers.

Equinix does have a very aggressively priced 10Mb/s flat-rate OOB (single 
IP only but that's not that hard to work around) for essentially XC
pricing.  It's been stable but not something you'd rely on for 100%
packet delivery to some other point on the Internet (so more for
reaching a per-pop OOB than for making a coherent OOB network with
a bunch of monitoring running 24x7).

Still, it's a good value for what it is.



> - Jared

Avi Freedman
CEO, Kentik
avi at kentik dot com



Re: NetFlow - path from Routers to Collector

2015-09-01 Thread Avi Freedman

Agreed, we are as well :)

VLAN, VRF, whatever.

+ optimal tweaks include local flow proxy that can also rate 
limit / re-sample, and send topk talkers over 'true' OOB.
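
A minimal sketch of the top-k talkers idea, in Python: aggregate bytes per source locally and ship only the k largest totals over the low-bandwidth OOB path. The `(src_ip, byte_count)` tuple layout is an assumption made for illustration:

```python
from collections import Counter


def top_talkers(flows, k):
    """Sum bytes per source address and return the k largest (src, bytes) pairs."""
    totals = Counter()
    for src_ip, byte_count in flows:
        totals[src_ip] += byte_count
    # most_common() sorts by total, largest first.
    return totals.most_common(k)


# Illustrative flow records for one reporting interval.
flows = [
    ("10.0.0.1", 500),
    ("10.0.0.2", 1500),
    ("10.0.0.1", 700),
    ("10.0.0.3", 100),
]
print(top_talkers(flows, 2))
```

The summary is a handful of rows per interval regardless of flow volume, which is what makes it practical to send over a thin OOB link.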

Avi Freedman
CEO, Kentik
avi at kentik dot com

> On 2 Sep 2015, at 7:27, Avi Freedman wrote:
> 
> > We see well under 20% doing logical separation but definitely folks 
> > doing it...
> 
> ~20% matches our subjective observations, as well.
> 
> We're doing our best to increase that number.
> 
> ---
> Roland Dobbins 



Re: EyeBall View

2015-10-26 Thread Avi Freedman

> All,
> 
> I had an idea to create a product where we would have a host on every EyeBall 
> network. Customers could then connect to these hosts and check connectivity 
> back to their network. For instance you may want to see what the speed is 
> like from CableVision in central NJ to your network in South Florida or the 
> latency etc. I go large scale I wanted to know how much demand there was for 
> such a service.
> 
> Regards,
> 
> Dovid

Another approach to take is to enable monitoring of your infrastructure,
and then do active tests on top to web servers and other end points.

Passive instrumentation gives you the even bigger advantage of giving
you insight into issues actually affecting your users' traffic.

Just did a talk about this at NANOG 65:

https://www.nanog.org/sites/default/files/monday_general_freedman_flow.pdf

If you set up a tap or SPAN and grab a box with Intel (or many other kinds
of) NICs, you can use PF_RING and nprobe to monitor at 100gig+ speeds.

For nprobe in particular as an "agent", some of the extended/augmented
data you can get via NetFlow includes:

http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf

[NFv9 57595][IPFIX 35632.123] %CLIENT_NW_DELAY_MS      Network latency client <-> nprobe (msec)
[NFv9 57596][IPFIX 35632.124] %SERVER_NW_DELAY_MS      Network latency nprobe <-> server (residual msec)
[NFv9 57597][IPFIX 35632.125] %APPL_LATENCY_MS         Application latency (msec)
[NFv9 57581][IPFIX 35632.109] %RETRANSMITTED_IN_PKTS   Number of retransmitted TCP flow packets (src->dst)
[NFv9 57582][IPFIX 35632.110] %RETRANSMITTED_OUT_PKTS  Number of retransmitted TCP flow packets (dst->src)
[NFv9 57583][IPFIX 35632.111] %OOORDER_IN_PKTS         Number of out of order TCP flow packets (dst->src)
[NFv9 57584][IPFIX 35632.112] %OOORDER_OUT_PKTS        Number of out of order TCP flow packets (dst->src)
[NFv9 57585][IPFIX 35632.113] %UNTUNNELED_PROTOCOL     Untunneled IP protocol byte

The NANOG PPT shows an example of some of the slicing and dicing
you can then do (focused around retransmitted TCP packets, which
is what most of our customers are interested in focusing on as a
simple proxy metric for 'network performance').  Not soliciting 
flames on what the magic metrics should be - store them all and
use the ones that best correlate for you :)
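
A rough Python sketch of that kind of slicing: retransmit percentage grouped by some dimension. The dictionary keys (`in_pkts`, `retransmitted_in_pkts`, `dst_asn`) are hypothetical names for decoded flow fields, with the retransmit count standing in for %RETRANSMITTED_IN_PKTS:

```python
from collections import defaultdict


def retransmit_pct_by(records, key):
    """Return {group: retransmitted packets as a % of packets} grouped by `key`."""
    pkts = defaultdict(int)
    retrans = defaultdict(int)
    for rec in records:
        group = rec[key]
        pkts[group] += rec["in_pkts"]
        retrans[group] += rec["retransmitted_in_pkts"]
    # Skip groups with no packets to avoid dividing by zero.
    return {g: 100.0 * retrans[g] / pkts[g] for g in pkts if pkts[g] > 0}


# Illustrative records; the ASNs are from the documentation range.
records = [
    {"dst_asn": 64500, "in_pkts": 1000, "retransmitted_in_pkts": 10},
    {"dst_asn": 64500, "in_pkts": 1000, "retransmitted_in_pkts": 30},
    {"dst_asn": 64501, "in_pkts": 500, "retransmitted_in_pkts": 0},
]
print(retransmit_pct_by(records, "dst_asn"))
```

Swapping `key` for a POP, interface, or prefix field gives the other cuts without changing the function.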

Luca/ntop are actively working on nprobe, so I'm sure you could
get him to add throughput and other metrics as well.

Also -

The same approach should work with Cisco AVC on ASRs, though it's
something we're just starting to test and may only work with 
specific sets of filters (vs blanket apply to 40gig of traffic
through an ASR).

Definitely curious if anyone in the NANOG community has tried AVC?
Or any other switch/router-layer performance instrumentation?

We've been interested in putting an agent on some of the Linux white
box switches, but the Broadcom chips in the current gens don't 
allow 'flow sampling' - getting all headers or none for a flow,
for a % of flows matching a profile.  And that's needed to do 
retransmit/OOO/latency tracking (vs just seeing samples of packets
across flows).

Again, pointers to switches that have that capability and can run
*nix apps would be appreciated :)

Avi Freedman
CEO, Kentik
avi at kentik dot com



Re: sFlow vs netFlow/IPFIX

2016-02-28 Thread Avi Freedman

Re: limits -

For Cisco/Juniper it's in the low hundreds of thousands of flows/sec
per chipset/linecard for 1:1 NetFlow/IPFIX, I think.

Then of course, as has been mentioned, you'll need to be able to send
it and receive it to something - and store+query.

Avi Freedman
CEO, Kentik

> On 28 February 2016 at 23:40, Nick Hilliard  wrote:



> Around here they are currently voting on a law that will require unsampled
> 1:1 netflow on all data in an ISP network with more than 100 users. Then
> store that data for 1 year, so the police and other parties can request a
> copy (with a warrant but you are never allowed to tell anyone that they
> came for the data and the judges will never say no).
> 
> My routers can apparently actually do 1:1 netflow and the documentation
> does not state any limits on that. So maybe I am lucky?
> 
> To the original question: in this country sFlow only is apparently about to
> become illegal.
> 
> Regards,
> 
> Baldur


Re: sFlow vs netFlow/IPFIX

2016-02-28 Thread Avi Freedman

> This maybe outside the scope of this list but I was wondering if anybody had 
> advice or lessons learned on the whole sFlow vs netFlow debate. We are 
> looking at using it for billing and influencing our sdn flows. It seems like 
> everything I have found is biased (articles by companies who have commercial 
> offerings for the "better" protocol)
> 
> Todd Crane

Most vendors that take "flow" take both so there tends not to be THAT much 
religion unless you talk to someone who hates being flooded with 1:1 flow, or 
debugging broken (usually NetFlow) implementations.

In our experience, they both basically work for ops use cases nowadays, for 
major vendors of routers, and most switches.

sFlow gives faster feedback and is more accurate (adding things up, * sample 
rates, lands closer to SNMP counter data) than most NetFlow/IPFIX implementations.  
How much varies from slightly to extremely (if you're using Catalysts for 
NetFlow/IPFIX).

My thesis overall re: why sFlow 'just works' a bit better is that it's just so 
much easier to implement sFlow because there's no need to track flows (hash 
table or whatever data structure you need).  Just grab samples of headers, 
parse, fill structs, and send.  
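
The "adding things up, * sample rates" step on the collector side can be sketched in a few lines of Python. This assumes each sample carries the original frame length and that one sampling rate applies to the whole batch; the function name is hypothetical:

```python
def estimate_totals(sampled_frame_lengths, sampling_rate):
    """Scale 1-in-N packet samples up to estimated (packets, bytes) totals.

    Each sample stands in for roughly `sampling_rate` packets on the wire,
    so both the packet count and the byte sum are multiplied by that rate.
    """
    est_packets = len(sampled_frame_lengths) * sampling_rate
    est_bytes = sum(sampled_frame_lengths) * sampling_rate
    return est_packets, est_bytes


# Illustrative: four samples taken at 1-in-1000.
packets, octets = estimate_totals([1500, 64, 1500, 400], sampling_rate=1000)
print(packets, octets)
```

No per-flow state is needed on either side, which is the simplicity argument made above; the estimate tightens as more samples accumulate.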

That said, things are hugely less sucky than 10 or even 5 years ago in the 
NetFlow world, and for the right vendor and box and software it's possible to 
get NetFlow/IPFIX essentially as accurate.

And as has been noted, at least in theory some boxes that do tens to hundreds 
of gigabits (or low terabits) of traffic support 1:1, which you could in theory 
do with sFlow as a transport, but I haven't seen a switch or router that does 
that.  Re: 1:1 flow - the boxes supporting that are generally not the biggest 
purchase-able from Cisco or Juniper, but are used as the big-boy backbone and 
border routers by a good number of multi-terabit networks, and even some 
multi-tens-of-terabit networks.

Good luck in your flow journeys.

Avi Freedman
CEO, Kentik



Network nerd poker night 11/8 in Seattle

2017-11-07 Thread Avi Freedman

If there are any network+poker nerds in the Seattle area tomorrow, we have 5 
seats left at a network nerd poker night I'm hosting tomorrow night.

Attendees are from cloud, content provider, hosting, infra services, travel, 
and SaaS analytics industries.

We'll have food, drinks, a training session, and will be running ~3 
single-table No Limit Texas Hold'em tournaments.  

If there's time/interest afterwards I may also initiate anyone interested into 
the wonders of Pot Limit Omaha.

Prizes will be Bose headsets, to avoid corporate gift issues with playing for 
or awarding $.

It's at the W Hotel in Bellevue, at 6pm tomorrow night.

The focus is poker, socializing, and free-form network tech, business, and 
policy nerd discussions.  Travel and gadget geeking allowed as well.  Kentik is 
sponsoring the space, tables, and professional dealers, and we'll have a < 5 
minute sponsor presentation.

RSVP / info @ 
https://www.greenvelope.com/viewer/?ActivityCode=.public:ab155c3532ca4bd5ad563ff222b6a338393435313037#details

If it overflows we'll cut off RSVPs at the URL and/or let people know by email.

We're also going to organize to do another in Seattle in Feb and larger ones in 
NY and the Bay area in Q1, so if you have interest or ideas for format or quick 
content topics we could cover, please let me know.  One thing we're considering 
is adding a table for heads-up battles - participants to decide if they want to 
add peering as part of the stakes.

Thanks,

Avi



Re: Preferring peers over customers [was: Do Not Complicate Routing

2011-09-04 Thread Avi Freedman

Forgive my potential lack of understanding; perhaps BGP behavior has
changed or the way people use it has but my understanding is -

Since BGP is used in almost all circumstances in a mode where only
the best path to a prefix can be re-advertised, only one of the
peer or customer path can be used by a 3rd network, and if the peer 
path is used for a prefix for a customer, then a transit provider can't 
easily provide transit for that prefix back to the customer without 
serious routing shenanigans.
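
A toy Python sketch of that single-best-path constraint.  The decision process 
is cut down to LOCAL_PREF and AS_PATH length, the export rule is the 
conventional "peer routes go only to customers" policy assumed for 
illustration, and the ASNs are documentation values, not real networks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    learned_from: str   # "customer" or "peer" (hypothetical labels)
    local_pref: int
    as_path: tuple

def best_path(paths):
    """Simplified decision: highest LOCAL_PREF wins, then shortest AS_PATH."""
    return max(paths, key=lambda p: (p.local_pref, -len(p.as_path)))

def export_targets(best):
    """Assumed conventional export policy, not any specific network's config."""
    if best.learned_from == "customer":
        return {"customers", "peers", "upstreams"}
    # Peer-learned routes are not re-advertised to other peers or upstreams.
    return {"customers"}

customer = Path("customer", 100, (64500,))
via_peer = Path("peer", 200, (64501, 64500))  # provider prefers the peer here

chosen = best_path([customer, via_peer])
print(chosen.learned_from, sorted(export_targets(chosen)))
```

Only the chosen path is re-advertised, so once the peer path wins, the prefix 
is no longer announced onward as a customer route.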

So isn't it in practice the case that if a provider prefers a peer to 
connect to a customer instead of the direct customer link, that:

1) The provider will lose the ability to bill for traffic delivered
   to that customer, and

2) The customer will lose redundancy of inbound path, and

3) The customer will almost certainly notice and have the chance to complain

I would expect that most cases of a provider (for a given prefix, which
is almost always a caveat here) preferring a peer to get to a customer 
would be something the customer had some input into via communities,
or by calling and bitching if the provider doesn't have a rich 
communities set or the ability to set them.

One thing one hears every so often (in cycles) is the pressure for
emerging Tier1/2 aspirants to not peer with customers of larger potential
peers who are also providers, to preserve revenue models of said larger 
peers, but that's a different situation.

And -

If applied to customers of customers, I'd think it'd revert to the cases
above.  Network X has customer Y and buys from provider C.  If C prefers
a peer to get to Y (this is all for a given prefix) and it wasn't due 
to policy expressed by X or Y via communities or request of provider C 
by X, then eventually someone's going to figure out that the backup path 
that presumably X and Y think is being paid for, isn't.  Then the people 
that pay money will bitch and action should be taken.

Consistent announcements by a global network to its peers for the prefixes
of a given customer is another level of wonkiness that customers can
definitely influence by doing strange per-prefix communities settings,
but that again is probably another topic as it'd be presumably driven by 
the customer's actions, not the provider's traffic-engineering goals.

Or am I confused here on one, more, or all points?  Certainly possible.

One thing I think everyone can agree on - academic models of the ways 
that people combine routers, money, fiber, contracts, and policy almost
never catch up to the creativity, politics, policy, bugs, and stupidities 
that combine to make the Internet as wonderful as it is.

Avi

> On Sep 5, 2011, at 4:03, Randy Bush  wrote:
> 
> >> Because routing to peers as a policy instead of customer as a matter
> >> of policy, outside of corner cases make logical sence.
> >
> > welcome to the internet, it does not always make logical sense at first
> > glance.
> >
> > the myth in academia that customers are always preferred over peers
> > comes from about '96 when vaf complained to asp and me (and we moved it
> > to nanog for general discussion) that we were not announcing an
> > identical prefix list to him at east and west.  the reason turned out to
> > be that, on one of the routers, a peer path was shorter in some cases,
> > so we had chosen it.  we were perfectly happy with that but vaf was not,
> > and he ran the larger network so won the discussion.
> 
> The "myth" comes from engineers at large networks saying it is so.
> 
> We could also have a small miscommunication here.  For example, if a
> customer were multi-homed to a peer, and the customer and peer were on the
> same router, and the customer had prepended a single time (making the AS
> path equal), by your original statement you would have sent traffic to the
> peer.  Most people would find that silly.  (And please do not point out
> customers and peers do not connect to the same router, this is a simple
> example for illustrative purposes.)
>
> However, the statement you make above says that you preferred the peer
> because "the path was shorter".  You do not specify if that is IGP distance,
> AS path length, or some other metric, but it implies if the path were equal,
> you would prefer the customer - especially since the customer was preferred
> on the other coast.  So there may be assumptions on one side or the other
> that are not clear which are causing confusion.
>
> Either way, this seems operationally relevant.
>
> I would like the large networks of the world to state whether they prefer
> their customer routes over peer routes, and how.  For instance, does
> $NETWORK prefer customers only when the AS path is the same, or all the time
> no matter what?
>
> Let's leave out corner cases - e.g. If a customer asks you, via communities
> or otherwise, to do something different.  This is a poll of default, vanilla
> configurations.
> 
> Please send them to me, or the list, wi

Re: community real-time BGP hijack notification service (fwd)

2008-09-12 Thread Avi Freedman

Hi, Arnaud.  The design is to only watch the origin ASN, not the other
ASNs in the path.  Support for doing something with the transit portion
of the AS_PATH will be added, probably a very simple "alert if X is
in there" or "alert if Y is not in there".

As others have said it's imperfect so ideas are welcome but the goal
here is to try to keep it useful but simple.
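
A minimal Python sketch of an origin-only rule of this kind, plus a guess at
what a simple transit check could look like.  This is not the service's actual
code; the function names are hypothetical, and the sample values are the ones
from the alert quoted in this thread.

```python
import ipaddress

def origin_alert(watched, allowed_origins, prefix, as_path):
    """True if prefix matches or is more specific than watched and the
    origin (last ASN in the path) is not in the allowed set."""
    watched_net = ipaddress.ip_network(watched)
    seen_net = ipaddress.ip_network(prefix)
    if seen_net.version != watched_net.version:
        return False
    if not seen_net.subnet_of(watched_net):
        return False
    return as_path[-1] not in allowed_origins

def transit_alert(as_path, forbidden=(), required=()):
    """Hypothetical transit rule: alert if X is in there, or Y is not."""
    transit = as_path[:-1]
    return (any(asn in transit for asn in forbidden)
            or any(asn not in transit for asn in required))

path = [6450, 3737, 701, 702, 43751]
print(origin_alert("91.198.99.0/24", {702, 6661, 8220}, "91.198.99.0/24", path))
```

With those values the rule fires, because the last ASN in the path, 43751, is
not one of the listed origins.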

Thanks,

Avi

> Date: Fri, 12 Sep 2008 14:18:58 +0200
> From: Arnaud de Prelle <[EMAIL PROTECTED]>
> To: Gadi Evron <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: community real-time BGP hijack notification service
> 
> Hello Gadi,
> 
> Gadi Evron wrote:
> > Hi, WatchMy.Net is a new community service to alert you when your prefix
> > has been hijacked, in real-time.
> 
> Very good initiative. You can count on me as one of your users.
> 
> Note that apparently it doesn't seem to be working as expected yet.
> Indeed I already received two false alerts:
> 
> 1.
> Subject:
> watchmy.net BGP Alert - seeing {91.198.99.0/24, 6450 3737 701 702 43751}
> 
> Body:
> Hello, we are seeing 91.198.99.0/24 being advertised with aspath 6450
> 3737 701 702 43751.
> 
> We are alerting you because of the rule you set that is watching for
> prefixes that match or are more specific than 91.198.99.0/24, and are
> originated with any origin AS other than one of 702,6661,8220




Re: community real-time BGP hijack notification service

2008-09-12 Thread Avi Freedman

> Nathan wrote:

> It is trivially easy for an attacker to falsify the origin AS. If 'they' are 
> not doing it already, then I'm quite surprised.
> This isn't really a good thing to alarm on, in my opinion. Or, maybe it is, 
> but 
> there should be big bold text explaining that it's not reliable as it's 
> trivially easy to falsify.

Yep, true.

However, there's the case that someone's just typo'd you, which has happened
to, near, around, and by me more frequently than an actual jackification.
There was the time I fumble-fingered some net99 space and Karl Denninger
started tracking me down to threaten lawsuits (before, I might add, asking
me to log into the offending device and change the config).

Anyway, the other case is where there shouldn't be a more specific, and you
still win.

Otherwise, yes, origin AS can be forged but the transit part is even messier,
I think.

> My best idea is looking at the AS_PATH for changes, and alerting whenever 
> that 
> happens. You'd obviously get a different path whenever there is churn in the 
> network though. I'm sure there's a way to do this, and I suspect having BGP 
> feeds from many many places is the most reliable way for it to happen, I just 
> haven't figured out why yet.

As you point out, the Internet is a really noisy and messy place.  Just doing
the "different than usual" is something I resisted here because there's so much
hidden partial transit that doesn't normally expose.  

More BGP feeds might just amplify that behavior, though the idea is to get more
feeds in.

> This seems like a service that Renesys etc. could/should (or maybe do?) 
> offer, 
> they seem well placed with all their BGP feeds..

Not sure who else offers it; it seemed reasonable to do and see if it's useful.
Gadi told me there was no free real-time alerting out there but I didn't really
look into it.

Certainly if anyone wants to see the dynamics, who has advertised what now and
in the deep dark past, etc., Renesys would be the place to go as far as I know.

> Nathan Ward

Avi




Re: community real-time BGP hijack notification service

2008-09-12 Thread Avi Freedman
> Nathan wrote:

> My best quick hack solution so far is to fire off a traceroute and make sure
> that the traceroute gets ICMP TTL expire messages from IP addresses that are 
> in
> prefixes originated from all the ASes in the ASPATH.
> Still forgeable, but a bit more difficult.. still far from perfect though.

An interesting idea although I think the false positive rate would be very
high with all of the filtering (and mismatch between traceroute and BGP
topologies) that exists out there.
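
For anyone who does want to try it, a rough Python sketch of the comparison
step, assuming you already have the traceroute hop addresses and a
prefix-to-origin table from a BGP feed.  The function name, the table layout,
and the documentation prefixes and ASNs are all made up for illustration.

```python
import ipaddress

def ases_missing_from_traceroute(as_path, hop_ips, origin_table):
    """Return the ASes in as_path for which no traceroute hop falls inside
    a prefix that AS originates (longest-prefix match per hop)."""
    seen = set()
    for hop in hop_ips:
        if hop is None:          # filtered hop, no TTL-expired reply
            continue
        addr = ipaddress.ip_address(hop)
        best = None
        for net, asn in origin_table.items():
            if addr.version == net.version and addr in net:
                if best is None or net.prefixlen > best[0].prefixlen:
                    best = (net, asn)
        if best is not None:
            seen.add(best[1])
    return [asn for asn in as_path if asn not in seen]

origin_table = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
    ipaddress.ip_network("203.0.113.0/24"): 64498,
}
hops = ["192.0.2.1", None, "203.0.113.9"]   # middle hop did not answer
print(ases_missing_from_traceroute([64496, 64497, 64498], hops, origin_table))
```

Note that the one silent hop is enough to flag an AS as missing, which is the
false-positive problem in miniature.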

It'd be interesting for someone to try and see how well it works though.
(Any researchers hanging out on NANOG want to try a weekend project...)

> Nathan Ward

Avi





Re: community real-time BGP hijack notification service

2008-09-12 Thread Avi Freedman

Hi Erik -

There's a great button about Usenet -

"Reading Usenet is like drinking from a firehose;
 Posting to Usenet is like shouting from a mountaintop;
 Archiving Usenet is like saving used toilet tissue."

BGP may be somewhat more important, useful, and the results consumable
in the short-term, but for long-term archiving I think it devolves to
being more interesting to researchers and other ubernerds who can use
the libraries and the very valuable data store and service that RIPE
provides (which is appreciated)!

I was thinking more for the medium term "what's normal" that goes
back beyond whatever's in the routing table this second, probably
for a few weeks to months max in most cases.

And I think for actual diagnosis what's needed is a great tool to
ask network and business questions of historic BGP data.  That's the
context in which I mention Renesys tools+data.

So I'd say to help the networkers of the world, it's probably more
about tools than history.

Thanks,

Avi

> RIS provides data in a searchable MySQL database for three months.
> 
> All we've ever collected is kept in a raw data format. This archive 
> starts in 1999, and we maintain a library to read the data.
> This data is free to use for any purpose and we will not remove any of 
> our raw data as it gets older.
> 
> We are also carefully looking into whether we should reduce or increase 
> the amount of data in our MySQL database - as that's easy to search for 
> our users.
> 
> However, any increase obviously comes with increased resource usage - so 
> this is something that requires careful thinking and planning.
> Another option is to store aggregated info on older data, instead of 
> keeping every update that ever occured.
> 
> But, this is just an idea that crosses our minds from time to time - I'm 
> not making promises on what we will implement :)
> 
> Of course, any ideas on how much more history would help you, are very 
> welcome.
> 
> cheers,
> -- 
> Erik Romijn RIPE NCC software engineer
> 




Re: community real-time BGP hijack notification service

2008-09-12 Thread Avi Freedman

Hmm, I'm trying to figure out the application here.

You have single prefixes originated or originate-able by more than
5 or 6 ASs?

I see - is it that you have, say a /16 with 13 potential ASs that might
be seen as originating more specifics inside that /16?

Hadn't considered that; we were envisioning that those specifics would 
be set up as separate alerts.

It's easy enough to extend the # of ASNs that can be listed, however.
That'll be done this weekend.

Thanks,

Avi

> Looks interesting, but it only takes a fairly short list of ASNs for a
> prefix. For our big CIDR blocks, we have WAY too many ASNs to enter them
> all, so it's pretty useless for me. I need to be able to enter at very
> least a dozen ASes and I suspect may folks have a LOT more then that.
> 
> For now, I'll enter some shorter pieces from the block, but I'm most
> concerned with the pieces that are not currently assigned, so are
> available for hijack. I have added the larger, unassigned blocks. I'll
> start adding assigned bits and pieces as well as unassigned pieces, but
> being able to put all valid origin ASes in the list for the full blocks
> would be a lot nicer.

> R. Kevin Oberman, Network Engineer