wait, so the whole of the thread is about stopping participants in the
attack, and you're suggesting that removing/changing end-system
switch/routing gear and doing something more complex than:

  deny udp any 123 any
  deny udp any 123 any 123
  permit ip any any

is a good plan?

I'd direct you at:
  <https://www.nanog.org/resources/tutorials>

and particularly at:
  "Tutorial: ISP Security - Real World Techniques II"
  <https://www.nanog.org/meetings/nanog23/presentations/greene.pdf>

On Mon, Feb 3, 2014 at 5:16 PM, Peter Phaal <peter.ph...@gmail.com> wrote:
> On Mon, Feb 3, 2014 at 12:38 PM, Christopher Morrow
> <morrowc.li...@gmail.com> wrote:
>> On Mon, Feb 3, 2014 at 2:42 PM, Peter Phaal <peter.ph...@gmail.com> wrote:
>>> On Mon, Feb 3, 2014 at 10:16 AM, Christopher Morrow
>>> <morrowc.li...@gmail.com> wrote:
>>>> On Mon, Feb 3, 2014 at 12:42 PM, Peter Phaal <peter.ph...@gmail.com> wrote:
>>
>>>> There's certainly the case that you could drop acls/something on
>>>> equipment to selectively block the traffic that matters... I suspect
>>>> in some cases the choice was: "50% of the edge box customers on this
>>>> location are a problem, block it across the board here instead of X00
>>>> times" (see concern about tcam/etc problems)
>>>
>>> I agree that managing limited TCAM space is critical to the
>>> scaleability of any mitigation solution. However, tying up TCAM space
>>> on every edge device with filters to prevent each new threat is likely
>>
>> yup, there's a tradeoff, today it's being made one way, tomorrow
>> perhaps a different way. My point was that today the percentage of sdn
>> capable devices is small enough that you still need a decimal point to
>> measure it. (I bet, based on total devices deployed) The percentage of
>> oss backend work done to do what you want is likely smaller...
>>
>> the folk in NZ-land (Citylink, reannz ... others - find josh baily /
>> cardigan) are making some strides, but only in the exchange areas so
>> far. fun stuff... but not the deployed gear as an L2/L3 device in
>> TWC/Comcast/Verizon.
>
> I agree that today most networks aren't SDN ready, but there are
> inexpensive switches on the market that can perform these functions
> and for providers that have them in their network, this is an option
> today. In some environments, it could also make sense to drop in a
> layer switches to monitor and control traffic entering / exiting the
> network.

it's probably not a good plan to forklift your edge, for dos targets
where all you really need is a 3 line acl.

>
>>> The current 10G upgrade cycle provides an opportunity to deploy
>>
>> by 'current 10g upgrade cycle' you mean the one that happened 2-5 yrs
>> ago? or somethign newer? did you mean 100G?
>
> I was referring to the current upgrade cycle in data centers, with
> servers connected with 10G rather than 1G adapters. The high volumes
> are driving down the cost of 10/40/100G switches.

again, lots of cost and churn for 3 lines of acl... I'm not sold.

>>> With integrated hybrid OpenFlow, there is very little activity on the
>>> OpenFlow control plane. The normal BGP, ECMP, LAG, etc. control planes
>>> handles forwarding of packets. OpenFlow is only used to selectively
>>> override specific FIB entries.
>>
>> that didn't really answer the question :) if I have 10k customers
>> behind the edge box and some of them NOW start being abused, then more
>> later and that mix changes... if it changes a bunch because the
>> attacker is really attackers. how fast do I change before I can't do
>> normal ops anymore?
>
> Good point - the proposed solution is most effective for protecting
> customers that are targeted by DDoS attacks. While trying to prevent

Oh, so the 3 line acl is not an option? or (for a lot of customers a
fine answer) null route? Some things have changed in the world of dos
mitigation, but a bunch of the basics still apply.

I do know that in the unfortunate event that your network is the
transit or terminus of a dos attack at high volume you want to do the
least configuration that'll satisfy the 2 parties involved (you and
your customer)... doing a bunch of hardware replacement and/or sdn
things when you can get the job done with some acls or routing changes
is really going to be risky.

> attackers entering the network is good citizenship, the value and
> effectiveness of the mitigation service increases as you get closer to
> the target of the attack. In this case there typically aren't very
> many targets and so a single rule filtering on destination IP address
> and protocol would typically be effective (and less disruptive to the
> victim that null routing).
>
>>
>>> Typical networks probably only see a few DDoS attacks an hour at the
>>> most, so pushing a few rules an hour to mitigate them should have
>>> little impact on the switch control plane.
>>
>> based on what math did you get 'few per hour?' As an endpoint (focal
>> point) or as a contributor? The problem that started this discussion
>> was being a contributor...which I bet happens a lot more often than
>> /few an hour/.
>
> I am sorry, I should have been clearer, the SDN solution I was
> describing is aimed at protecting the target's links, rather than
> mitigating the botnet and amplification layers.

and i'd say that today sdn is out of reach for most deployments, and
that the simplest answer is already available.

> The number of attacks was from the perspective of DDoS targets and
> their service providers. If you are considering each participant in
> the attack the number goes up considerably.

I bet roland has some good round-numbers on number of dos attacks per
day... I bet it's higher than a few per hour globally, for the ones
that get noticed.

>>> A good working definition of a large flow is 10% of a link's
>>> bandwidth.
>>> If you only trigger actions for large flows then in the
>>> worst case you would only require 10 rules per port to change how
>>> these flows are treated.
>>
>> 10% of a 1g link is 100mbps, For contributors to ntp attacks, many of
>> the contributors are sending ONLY 300x the input, so less than
>> 100mbps. On a 10g link it's 1G... even more hidden.
>>
>> This math and detection aren't HARD, but tuning it can be a bit challenging.
>
> Agreed - the technique is less effective for addressing the
> contributors to the attack. RPF and other edge controls should be

note that the focus of the original thread was on the contributors. I
think the target part of the problem has been solved since before the
slides in the pdf link at the top...

> applied, but until everyone participates and eliminates attacks at
> source, there is still a value in filtering close to the target of the
> attack.
>
>>
>>>>> http://blog.sflow.com/2014/01/physical-switch-hybrid-openflow-example.html
>>>>>
>>>>> The example can be modified to target NTP mon_getlist requests and
>>>>> responses using the following sFlow-RT flow definition:
>>>>>
>>>>> {'ipdestination,udpsourceport',value:'ntppvtbytes',filter:'ntppvtreq=20,42'}
>>>>>
>>>>> or to target DNS ANY requests:
>>>>>
>>>>> {keys:'ipdestination,udpsourceport',value:'frames',filter:'dnsqr=true&dnsqtype=255'}
>>>>>
>>>>
>>>> this also assume almost 1:1 sampling... which might not be feasible
>>>> either...otherwise you'll be seeing fairly lossy results, right?
>>>
>>> Actually, to detect large flows (defined as 10% of link bandwidth)
>>> within a second, you would only require the following sampling rates:
>>
>> your example requires seeing the 1st packet in a cycle, and seeing
>> into the first packet. that's going to required either acceptance of
>> loss (and gathering the loss in another rule/fashion) or 1:1 sampling
>> to be assured of getting ALL of the DNS packets and seeing what was
>> queried.
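
The 10%-of-link arithmetic quoted above can be checked with a few lines
of Python. This is only a sketch of the numbers in the thread: the
function names are made up for illustration, and the 50 Mbps contributor
rate is an assumed example figure, not a measurement.

```python
# Illustrative arithmetic only; all names here are hypothetical.
MBPS = 1_000_000
GBPS = 1_000_000_000


def large_flow_threshold_bps(link_bps: int, percent: int = 10) -> int:
    """Rate at which a flow counts as 'large' (10% of the link in the thread)."""
    return link_bps * percent // 100


def max_large_flows(percent: int = 10) -> int:
    """Most flows that can each sit at or above `percent` of one link at once."""
    return 100 // percent


def is_flagged(flow_bps: int, link_bps: int, percent: int = 10) -> bool:
    """True if a flow of `flow_bps` reaches the large-flow threshold."""
    return flow_bps >= large_flow_threshold_bps(link_bps, percent)


if __name__ == "__main__":
    for link in (1 * GBPS, 10 * GBPS):
        threshold = large_flow_threshold_bps(link)
        print(f"link {link // MBPS} Mbps: threshold {threshold // MBPS} Mbps, "
              f"at most {max_large_flows()} large flows")
    # Assumed example: one contributor sending 50 Mbps on a 1G link.
    print("50 Mbps on 1G flagged:", is_flagged(50 * MBPS, 1 * GBPS))
```

The bound of 10 rules per port follows from the threshold itself, and the
same threshold is why a contributor sending under 100 Mbps on a 1G link
is never flagged.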
>
> The flow analysis is stateless - based on a random sample of 1 in N
> packets, you can decode the packet headers and determine the amount of
> traffic associated with specific DNS queries. If you are looking at

you're getting pretty complicated for the target side:

  ip access-list 150 permit ip any any log

(note this is basically taken verbatim from the slides)

view logs, see the overwhelming majority are to hostX port Y proto Z...
filter, done. you can do that in about 5 mins time, quicker if you care
to rush a bit.

> the traffic close to the target, there may be hundreds of thousands of
> DNS responses per second and so you very quickly determine the target
> IP address and can apply a filter to remove DNS traffic to that
> target.
>
>> provided your device does sflow and can export to more than one
>> destination, sure.
>
> This brings up an interesting point use case for an OpenFlow capable
> switch - replicating sFlow, NetFlow, IPFIX, Syslog, SNMP traps etc.
> Many top of rack switches can also forward the traffic through a
> GRE/VxLAN tunnel as well.

yes, more complexity seems like a great plan... in the words of someone
else: "I encourage my competitors to do this"

I think roland's other point that not very many people actually even
use sflow is not to be taken lightly here either.

-chris

> http://blog.sflow.com/2013/11/udp-packet-replication-using-open.html

Domain Name: SFLOW.COM
<snip>
Registry Registrant ID:
Registrant Name: PHAAL, PETER
Registrant Organization: InMon Corp.
<snip>
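
The "view logs, see the overwhelming majority are to hostX port Y proto
Z" step above amounts to a tally. A minimal sketch in Python, assuming
the ACL log lines have already been parsed into (destination, port,
protocol) tuples; the parsing is left out because log formats vary by
platform, and the function name and sample addresses (documentation
range 192.0.2.0/24) are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# (destination host, destination port, protocol), e.g. ("192.0.2.10", 123, "udp")
Entry = Tuple[str, int, str]


def top_target(entries: Iterable[Entry]) -> Optional[Tuple[Entry, float]]:
    """Return the most-logged tuple and its share of all logged entries.

    Returns None when there are no entries to count.
    """
    counts = Counter(entries)
    total = sum(counts.values())
    if total == 0:
        return None
    entry, hits = counts.most_common(1)[0]
    return entry, hits / total


if __name__ == "__main__":
    # Made-up sample data standing in for parsed log lines.
    sample = [("192.0.2.10", 123, "udp")] * 8 + [
        ("192.0.2.20", 80, "tcp"),
        ("192.0.2.30", 53, "udp"),
    ]
    print(top_target(sample))
```

If one tuple holds most of the log, that tuple is what the filter should
match.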