Re: [j-nsp] QFX DDOS Violations

Cristian Cardoso via juniper-nsp Wed, 30 Nov 2022 05:43:02 -0800

Hi Johan

I experienced a similar issue in my EVPN-VXLAN environment on QFX5120-48Y
switches. The DDOS alert fired whenever a large number of VM migrations
ran simultaneously in my environment; sometimes there were 20 VMs
migrating at once and the DDOS alarm triggered.


To solve this, I set the following value in the configuration:

qfx5120> show configuration system ddos-protection protocols
vxlan {
    aggregate {
        bandwidth 10000;
        burst 12000;
    }
}



On Wed, 30 Nov 2022 at 07:16, john doe via juniper-nsp <
juniper-nsp@puck.nether.net> wrote:

> Hi!
>
> The leaf switches are QFX5k and they seem to be lacking some of the commands
> you mentioned. We don't have any problem with BGP sessions going down; the
> impact is only on the payload inside VXLAN.
>
> Protocol Group: VXLAN
>
>   Packet type: aggregate (Aggregate for vxlan control packets)
>     Aggregate policer configuration:
>       Bandwidth:        500 pps
>       Burst:            200 packets
>       Recover time:     300 seconds
>       Enabled:          Yes
>     Flow detection configuration:
>       Flow detection system is off
>       Detection mode: Automatic  Detect time:  0 seconds
>       Log flows:      Yes        Recover time: 0 seconds
>       Timeout flows:  No         Timeout time: 0 seconds
>       Flow aggregation level configuration:
>         Aggregation level   Detection mode  Control mode  Flow rate
>         Subscriber          Automatic       Drop          0  pps
>         Logical interface   Automatic       Drop          0  pps
>         Physical interface  Automatic       Drop          500 pps
>     System-wide information:
>       Aggregate bandwidth is no longer being violated
>         No. of FPCs that have received excess traffic: 1
>         Last violation started at: 2022-11-30 09:08:02 CET
>         Last violation ended at:   2022-11-30 09:09:32 CET
>         Duration of last violation: 00:01:40 Number of violations: 1508
>       Received:  3548252144          Arrival rate:     201 pps
>       Dropped:   49294329            Max arrival rate: 160189 pps
>     Routing Engine information:
>       Bandwidth: 500 pps, Burst: 200 packets, enabled
>       Aggregate policer is never violated
>       Received:  0                   Arrival rate:     0 pps
>       Dropped:   0                   Max arrival rate: 0 pps
>         Dropped by individual policers: 0
>     FPC slot 0 information:
>       Bandwidth: 100% (500 pps), Burst: 100% (200 packets), enabled
>       Hostbound queue 255
>       Aggregate policer is no longer being violated
>         Last violation started at: 2022-11-30 09:08:02 CET
>         Last violation ended at:   2022-11-30 09:09:32 CET
>         Duration of last violation: 00:01:40 Number of violations: 1508
>       Received:  3548252144          Arrival rate:     201 pps
>       Dropped:   49294329            Max arrival rate: 160189 pps
>         Dropped by individual policers: 0
>         Dropped by aggregate policer:   50294227
>         Dropped by flow suppression:    0
>       Flow counts:
>         Aggregation level     Current       Total detected   State
>         Subscriber            0             0                Active
>
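The counters in the output above can be turned into a drop fraction and a peak-to-policer ratio with a few lines of Python. This is only a sketch: the sample lines are copied from the "FPC slot 0 information" section above, the regular expression assumes that exact "Received/Dropped ... pps" layout, and the function names (parse_counters, drop_fraction, peak_ratio) are made up for illustration.

```python
import re

# Sample lines copied from the quoted "FPC slot 0 information" section.
SAMPLE = """\
>       Received:  3548252144          Arrival rate:     201 pps
>       Dropped:   49294329            Max arrival rate: 160189 pps
"""

# Assumes the two-column "counter ... rate" layout shown in the sample.
COUNTER_RE = re.compile(
    r"(Received|Dropped):\s+(\d+)\s+(Arrival rate|Max arrival rate):\s+(\d+) pps"
)


def parse_counters(text: str) -> dict:
    """Return the first Received/Dropped counters and rates found in text."""
    counters = {}
    for match in COUNTER_RE.finditer(text):
        name, count, rate_name, rate = match.groups()
        # setdefault keeps the first occurrence if a section repeats.
        counters.setdefault(name, int(count))
        counters.setdefault(rate_name, int(rate))
    return counters


def drop_fraction(counters: dict) -> float:
    """Fraction of received packets that were dropped."""
    return counters["Dropped"] / counters["Received"]


def peak_ratio(counters: dict, configured_pps: int) -> float:
    """How far the peak arrival rate exceeded the configured policer."""
    return counters["Max arrival rate"] / configured_pps


if __name__ == "__main__":
    c = parse_counters(SAMPLE)
    print(f"dropped: {drop_fraction(c):.2%} of received")
    print(f"peak: {peak_ratio(c, 500):.0f}x the 500 pps policer")
```

With the figures above, only a small share of the received packets was dropped, while the peak arrival rate was several hundred times the configured 500 pps.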
> vty)# show ddos scfd proto-states vxlan
> (sub|ifl|ifd)-cfg: op-mode:fc-mode:bwidth(pps)
> op-mode: a=automatic, o=always-on, x=disabled
> fc-mode: d=drop-all, k=keep-all, p=police
> d-t: detect time, r-t: recover time, t-t: timeout time
> aggr-t: last aggregated/deaggreagated time
> idx prot       group        proto mode detect agg flags state   sub-cfg   ifl-cfg   ifd-cfg  d-t  r-t  t-t   aggr-t
> --- ----    --------     -------- ---- ------ --- ----- ----- --------- --------- ---------  ---  ---  ---   ------
>  23 6400       vxlan    aggregate auto     no   1     2     0 a:d:    0 a:d:    0 a:d:  500    0    0    0        0
>
>
> Johan
>
> On Wed, Nov 30, 2022 at 8:53 AM Saku Ytti <s...@ytti.fi> wrote:
>
> > Hey,
> >
> > Before any potential trashing, I'd like to say that as far as I am
> > aware Juniper (MX) is the only platform on the market which isn't
> > trivial to DoS off the network, despite any protection users may have
> > tried to configure.
> >
> > > How do you identify the source problem of DDOS violations that junos logs
> > > for QFX? For example, which interface is causing the problem?
> >
> > I assume you are talking about QFX10k with the Paradise (PE) chipset. I'm
> > not very familiar with it, but I know something about it as sold in
> > PTX10k guise, though there are significant differences. Answers are from
> > the PTX10k perspective. If you are talking about QFX5k, many of the
> > answers won't apply, but the ukern-side answers should help
> > troubleshoot it further. Certainly with QFX5k the situation is worse
> > than it would be on QFX10k.
> >
> > > DDOS_PROTOCOL_VIOLATION_SET: Warning: Host-bound traffic for
> > > protocol/exception  VXLAN:aggregate exceeded its allowed bandwidth at
> > > fpc 0 for 30 times, started at...
> > >
> > > The configured rate for VXLAN is 500pps, ddos protection is seeing rates
> > > over 150 000pps
> >
> > Do you mean you've configured
> > 'set system ddos-protection protocols vxlan aggregate bandwidth 500'?
> > What exactly are you seeing? What does 'show ddos-protection protocols
> > vxlan' say? Also 'start shell pfe network fpcX' + 'show ddos scfd
> > proto-states vxlan'.
> >
> > Paradise (unlike Triton and Trio) does not support PPS policing at
> > all. So when you configure a PPS policer, what actually gets
> > programmed is a bps policer of 500pps*1500B. I've tried to argue this
> > is a poor default, 64B being the superior choice.
> > In Paradise, 500pps would admit 500*(1500/64), or about 12kpps, per
> > Paradise if those VXLAN packets were small. These would then be
> > policed by the LC CPU ukern down to 500 pps for all the Paradise chips
> > living under that LC CPU, before being sent to the RE over bme0.
> > After DDoS but before Paradise admits a packet to the LC CPU, it goes
> > through VoQ, where most packets are classified as VoQ#2, which is
> > 10Mbps wide with no burstability (classification, width and
> > burstability are being changed in later images). So extremely trivial
> > rates will cause congestion on VoQ#2, and a lot of protocols will
> > be competing for 10Mbps access to the LC CPU, like BGP, ISIS, OSPF, LDP,
> > ND, ARP.
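
The arithmetic in the paragraph above can be checked with a short Python sketch. It assumes, as described in the thread, that a configured pps value is programmed as a byte-rate policer sized for 1500-byte packets; the constant and function names are illustrative only and do not correspond to anything in Junos.

```python
# Assumption taken from the thread: a pps policer on Paradise is programmed
# as a bit-rate policer sized for 1500-byte packets.
ASSUMED_PACKET_BYTES = 1500


def programmed_bps(configured_pps: int,
                   assumed_bytes: int = ASSUMED_PACKET_BYTES) -> int:
    """Bit rate that would be programmed for a configured pps value."""
    return configured_pps * assumed_bytes * 8


def admitted_pps(configured_pps: int, actual_bytes: int,
                 assumed_bytes: int = ASSUMED_PACKET_BYTES) -> float:
    """Packets per second that fit in that byte rate at a given packet size."""
    return configured_pps * assumed_bytes / actual_bytes


if __name__ == "__main__":
    print(programmed_bps(500))    # 6000000 (bits per second)
    print(admitted_pps(500, 64))  # 11718.75, i.e. roughly 12 kpps
```

Under that assumption, a 500 pps setting corresponds to 6 Mbps, so 64-byte packets would be admitted at roughly 23 times the configured packet rate.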
> >
> > > This is a spine/leaf setup. One theory is that the vxlan traffic that most
> > > of our QFX boxes are activating ddos protection for is actually vxlan
> > > services running inside the vxlans, for example we have kubernetes clusters
> > > using vxlan. Is that a sane theory?
> >
> > Not enough information to speculate.
> > In many cases ddos classification is wrong. You can review it in the PFE:
> > 'show filter' => HOSTBOND_IPv4_FILTER, then 'show filter index X
> > program'. You can also capture punted packets on the interface where RE
> > meets FPC (I think bme0 here). On the bme0 interface, TNP headers sit
> > on top of the punted packets, and in the TNP headers you will see what
> > ddos classification was used. You can turn the number into a name by
> > looking at 'show ddos scfd proto-states'.
> >
> >
> > I naively wish I could set my ddos-protocol classification and VoQ
> > classification manually in 'lo0 filter', because the infrastructure
> > allows for great protection, but particularly when choosing which
> > packets share a VoQ there is no obvious single best solution; it depends
> > on the environment. For example, I could put RSVP, ISIS and LDP on a
> > single VoQ, as they never compete with customers, BGP in another as they
> > will compete with customers and operators for me, and so forth. But of
> > course this wish is naive, as the solution the vendor offers is already
> > too complex for customers to use, and giving more rope would just make
> > the mean config worse.
> >
> > --
> >   ++ytti
> >
> _______________________________________________
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
>
