Re: [j-nsp] rib-groups && VPN reflection

2019-04-23 Thread Adam Chappell
I think you've got it clear, Adam. Take routes that originally land in
inet.0, create a rib-group to leak them into a secondary routing
instance, and then expect normal L3VPN behaviour for those
routing-instance routes. (The idea is to drop the routes into an overlay
topology, which includes a multi-homed network element with an interface
and BGP relationship to inet.0, in order to influence it and "override"
based on the original route properties.)
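
For illustration, the overlay instance is of this shape - purely a
sketch, with the instance name, interface, RD and RT all invented:

routing-instances {
    DDOS-VRF {                          # hypothetical instance name
        instance-type vrf;
        interface ge-0/0/1.0;           # leg towards the conditioning element
        route-distinguisher 65000:100;
        vrf-target target:65000:100;
        vrf-table-label;
    }
}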

Yes, of course, loops are possible, but the design itself is perfectly
feasible, and policed by all the usual routing-protocol mechanisms, if
and only if the route-table advertisement mode is the non-reflector mode
(meaning that routes are exported directly from the VRFs and not from
the bgp.l3vpn.0 holding table).

It's this change in functionality and behaviour based on other features
that disappoints me most here. KB32423, I've since found, describes the
situation and advises turning off the reflection. Off-box reflection for
VPN is probably the best idea anyway, but it's a shame that the feature
documentation isn't clearer that this road has serious pitfalls.

I do understand your point on using RTs to determine the remote PE table
destination. But there's no easy way to take undecorated inet.0 routes
and annotate them with RTs, is there?  Even if there were, I have to
assume that the original route in inet.0 can/will be displaced. That was
actually the very nice thing about rib-groups: the two copies of the
route do not share any route-selection fate.
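
For the record, my reading of the suggestion quoted below is an export
policy along these lines - just a sketch, with the policy name and
community value invented, and the remote PE's vrf-import then matching
the same target:

policy-options {
    community TARGET-DDOS members target:65000:100;
    policy-statement TAG-WITH-RT {
        term leaked {
            from protocol bgp;             # whatever identifies the candidate routes
            then {
                community add TARGET-DDOS; # decorate with the RT the remote VRF imports
                accept;
            }
        }
    }
}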

Appreciate the feedback from all. Haven't quite given up yet.

-- Adam.

On Sun, 21 Apr 2019 at 11:22,  wrote:

>
> I'm not sure I understand your objective, so just to confirm.
> Is your objective to leak a route from routing table A to routing table B
> while being able to advertise the leaked route from table B to other PEs
> (where the route is expected to land in table B)?
> If that's the case then this is not allowed, as it can form routing loops.
> Instead one is expected to set the RTs on export from table A so that other
> PEs can import these into the desired table.
> This is where the use of inet.0 is troublesome - so in that case one is
> expected to do the route leaking on all remote ends.
> But you can see the pattern here: advertising is done from the originating
> table, and then the "leaking" is supposed to be done on/by the remote end.
>
> adam



[j-nsp] rib-groups && VPN reflection

2019-04-18 Thread Adam Chappell
Hello all.

I figure this topic is fundamental and probably frequently
asked/answered, although it's a new problem space for me. I thought I'd
consult the font of knowledge here for any advice.

Environment: MX, JUNOS 15.1F6
Headline requirement: Leak EBGP routes from global inet.0 into a VPN (in
order to implement off-ramp/on-ramp for DDoS protection traffic
conditioning).

Experience:
The challenge is quite simple on the surface. Use a rib-group directive
on the EBGP peer to group inet.0 and VRF.inet.0 together as the import
RIB (Adj-RIB-In) for the peer. Indeed this works as you would expect,
and received routes appear in both inet.0 and VRF.inet.0.
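
For reference, the configuration is essentially this (a sketch; the
rib-group, instance and neighbor names are invented):

routing-options {
    rib-groups {
        INET-TO-VRF {
            import-rib [ inet.0 DDOS-VRF.inet.0 ];  # primary table first, then the leak target
        }
    }
}
protocols {
    bgp {
        group TRANSIT {
            family inet {
                unicast {
                    rib-group INET-TO-VRF;          # applied to the EBGP peer's received routes
                }
            }
            neighbor 192.0.2.1;
        }
    }
}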

But the problem is that if rpd is also configured with any of:
- IBGP route reflection for the inet-vpn family,
- EBGP for inet-vpn, or
- advertise-from-main-vpn-tables,

then any leaked routes, while present in the VRF, do not get advertised
internally to other PEs' VPN routing tables.
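
By way of illustration, this is the sort of fragment that flips rpd into
the other mode (a sketch with invented addresses; EBGP carrying
inet-vpn, or advertise-from-main-vpn-tables at [edit protocols bgp],
have the same effect as far as I can tell):

protocols {
    bgp {
        group IBGP-RR-CLIENTS {         # iBGP reflection for inet-vpn
            type internal;
            local-address 10.0.0.1;
            cluster 10.0.0.1;
            family inet-vpn {
                unicast;
            }
            neighbor 10.0.0.2;
        }
    }
}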

The cause seems to be that these features change the mechanics of
advertising VPN routes internally. They bring in a requirement for rpd
to retain VPN routes in their "native" inet-vpn form, rather than simply
consulting the originating routing instances and synthesising routes on
demand, so that the interaction with reflection clients or EBGP peers
can be handled.

So when these features are enabled, rpd opportunistically switches to a
mode where it goes to the trouble of cloning the instance-based vanilla
routes as inet-vpn routes within bgp.l3vpn.0 or equivalent.

Indeed battle-scarred Juniper engineers are probably familiar with this
document that offers counsel for how to maintain uptime in the face of this
optimisation gear-shift:
https://www.juniper.net/documentation/en_US/junos/topics/example/bgp-vpn-session-flap-prevention.html

I can understand and appreciate this, even if I might not like it.

But the abstraction seems to be incomplete. The method of copying routes
to bgp.l3vpn.0 is similar, if not identical, under the hood to the
initial rib-group operation I am performing at the route source to leak
the original inet.0 route, and that route, as seen in the VRF.inet.0
table, becomes a Secondary route.

As such, it apparently isn't a candidate for further cloning/copying
into bgp.l3vpn.0, and as a consequence the "leaked" route doesn't
actually make it into the VPN tables of other PEs.
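
The quickest way I've found to see this is to look at the leaked route
in the VRF (prefix and table names below are invented):

adamc@router> show route table DDOS-VRF.inet.0 203.0.113.0/24 extensive

If I read the extensive output correctly, the leaked copy carries a
"Primary Routing Table inet.0" annotation, i.e. it is a secondary route,
and it is exactly these secondary routes that rpd skips when populating
bgp.l3vpn.0.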

The document suggests a workaround of maintaining the original route in
inet.0, but sadly for my use case the whole premise of the leak
operation is ultimately to produce a global-table inet.0 redirect
elsewhere, so depending on inet.0 route selection is a bit fragile.

My question to others is: is this a well-known man-trap that I am
naively unaware of?  Is it simply the case that the best practice of
moving reflection off production VRF-hosting PEs is actually mandatory
here, or are others surprised by this apparent feature clash?  Can I
reasonably expect it to be addressed further down the software road?  Or
is there another, perhaps better, way of achieving my objective?

Any thoughts appreciated.

-- Adam.


[j-nsp] BGP apparent I/O throttling on MX960 (JUNOS 14.1R6)

2016-10-24 Thread Adam Chappell
Hello all.

Does anyone have any experience with situations where "show bgp neighbor
X.X.X.X" on the JUNOS CLI produces a small appendix to the usual output
stating: "Received and buffered octets: 20"?  The 20 in this case seems
to vary between invocations, but is usually under 100. Example
pseudo-sanitised output at the end of the mail for anyone interested.

It seems to suggest that rpd completed a short read or is otherwise
still waiting for a complete message from the remote peer. In this case
the remote peer was aware of the situation, because they monitor their
XR speaker's BGP OutQ instrumentation to watch for slow readers. Their
observation was a zero-size advertised TCP receive window preventing
their BGP speaker from sending queued messages.

Problem disappeared after several days with no obvious action taken by
either party. OutQ on peer's side returned to zero, and I could see no
further messages of partial reception or buffering on our side.

Beyond the instrumentation, I could find no obvious evidence of degraded
function. Peer's large OutQ was a cause for concern (their export policy to
us is one of full table), but it went unrealised. No other BGP peer on the
box exhibited similar symptoms. NSR in use (wondered if replication between
REs could slow down effective TCP receive rate). Link-layer to peer was
reliable and with low latency.

Probably in the X-Files, I realise, but I thought a stab-in-the-dark here
might be worthwhile.

-- Adam.

Peer: REMOTE_IP+179 AS REMOTE_AS Local: LOCAL+51007 AS MY_AS
  Description: Peer
  Type: External    State: Established    Flags:
  Last State: EstabSync Last Event: RecvKeepAlive
  Last Error: Cease
  Export: [ OUT ] Import: [ IN ]
  Options: 
  Options: 
  Authentication key is configured
  Holdtime: 90 Preference: 170
  Number of flaps: 1
  Last flap event: Stop
  Error: 'Cease' Sent: 1 Recv: 0
  Peer ID: PEER-ROUTER-ID   Local ID: MY-ROUTER-ID Active Holdtime: 90
  Keepalive Interval: 30 Group index: 9    Peer index: 0
  BFD: disabled, down
  Local Interface: ae12.0
  NLRI for restart configured on peer: inet-unicast
  NLRI advertised by peer: inet-unicast inet-multicast
  NLRI for this session: inet-unicast
  Peer supports Refresh capability (2)
  Stale routes from peer are kept for: 300
  Peer does not support Restarter functionality
  NLRI that restart is negotiated for: inet-unicast
  NLRI of received end-of-rib markers: inet-unicast
  NLRI of all end-of-rib markers sent: inet-unicast
  Peer supports 4 byte AS extension (peer-as REMOTE_AS)
  Peer does not support Addpath
  Table inet.0 Bit: 10007
RIB State: BGP restart is complete
Send state: in sync
Active prefixes:  203815
Received prefixes:595715
Accepted prefixes:505370
Suppressed due to damping:0
Advertised prefixes:  5755
  Last traffic (seconds): Received 27   Sent 2    Checked 57
  Input messages:  Total 13670093 Updates 13660814 Refreshes 41 Octets
1307202704
  Output messages: Total 299601 Updates 174126 Refreshes 0 Octets 15932309
  Output Queue[0]: 0
  Received and buffered octets: 29

adamc@router> show system connections extensive | find REMOTE
tcp4   0 38  LOCAL.51007   REMOTE.179
  ESTABLISHED
   sndsbcc: 38 sndsbmbcnt:256  sndsbmbmax: 131072
sndsblowat:   2048 sndsbhiwat:  16384
   rcvsbcc:  0 rcvsbmbcnt:  0  rcvsbmbmax: 131072
rcvsblowat:  1 rcvsbhiwat:  16384
   proc id:   1873  proc name:rpd
   iss: 3882968990  sndup: 3901286434
snduna: 3901286434 sndnxt: 3901286472  sndwnd:  32195
sndmax: 3901286472sndcwnd:  12397 sndssthresh:  0
   irs:  811065039  rcvup: 2118576537
rcvnxt: 2118576537 rcvadv: 1059075027  rcvwnd:  16384
   rtt: 1977181859   srtt: 556992    rttv: 388071
rxtcur:  64000   rxtshift:  0   rtseq: 3901286434
rttmin:  0  mss:   1404
 flags: REQ_SCALE RCVD_SCALE REQ_TSTMP [0x13e0]


[j-nsp] JUNOS precision-timers for BGP

2016-04-25 Thread Adam Chappell
Does anyone have positive or negative experience with this feature in 14.1
please?

Currently in a situation troubleshooting the consequences of high CPU
usage with a number of aggravating factors. Most sensitive to the
scarcity of CPU resources, however, are a number of BGP sessions with
aggressive timers.

Quite often a commit operation seems to make rpd block for long enough
(or indeed it's already starved out by other processes) to neglect
keepalives for these unforgiving BGP sessions, and we end up losing them.

Juniper have recommended that we consider "precision-timers", a global
BGP knob which, if I understand it well, offloads the crucial BGP
session-management functionality to a different rpd thread in order to
leave the main thread able to handle config requests etc. - not too
dissimilar to the session-management separation in openbgpd etc.
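
For the record, the knob itself is tiny, assuming I have the hierarchy
right:

protocols {
    bgp {
        precision-timers;   # per the above, keeps keepalive handling off rpd's main thread
    }
}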

The Juniper documentation says this feature is recommended for low hold
timers, and from what we can ascertain rpd is able to transition to
off-thread session management without a down/up, which is pretty neat.

I'm aware of PR1044141 which apparently causes pain when used in
conjunction with traceoptions, but I'm keen to understand if others have
operational experience.

We're also making inroads into lowering CPU demands through the use of
distributed PPM etc., but the regular pattern I tend to see there is
that this doesn't fly once the PPM'd protocol has a security knob added,
e.g. adding authentication to BFD, VRRP etc.

-- Adam.


Re: [j-nsp] Multi Core on JUNOS?

2015-10-09 Thread Adam Chappell
On 8 October 2015 at 17:46, Saku Ytti  wrote:

>
> Hard step#3 is to make rpd use multiple cores. JNPR seems to choose to
> DIY inside rpd and just launch threads. I personally would like to see
> rpd distributed to multiple OS processes, and capitalise more on
> FreeBSD's memory management and scheduling. But I'm not sure how to
> handle the IPC efficiently without large cost in data duplications
> across processes.
>
>
I can imagine that making rpd MT is probably hard to the point of almost
not being worth the benefit (with current REs), unless one can
adequately break down the problem into divisible chunks of work that are
insensitive to execution order.  BGP peer route analysis, comparison
against import policy and the current RIB might fall into that category,
but not without a lot of locking and potential for races.

I think there's more bang for buck in the 64-bit address-space change,
what with Internet FIB table size, and I'm quite interested in the
developments to make rpd 64-bit clean, which I'm sure are also not
insignificant. I notice Mr Tinka already expressed a conservative view
on jumping into a 64-bit rpd and I can totally understand and appreciate
that view. Juniper haven't made this a default on the 14.1R5 cut of code
that we're currently testing, so I imagine they're still looking for
bleeding-edge feedback to whittle out the potential nasties.

I'm quite intrigued by the tidbit in the Juniper docs, though, that
suggests that switching from a 32-bit to a 64-bit rpd is not
service-affecting, which means that the wait-and-see approach is viable?
Or am I totally misunderstanding this?

https://www.juniper.net/documentation/en_US/junos14.1/topics/reference/configuration-statement/routing-edit-system-processes.html
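
If I read that page correctly, the switch itself is just the following
(hedged - I haven't tried it in anger yet):

system {
    processes {
        routing {
            force-64-bit;   # run the 64-bit rpd
        }
    }
}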

The page doesn't say that one needs NSR or any sort of "help" from the
backup RE, which might be the first assumption. So I wonder how they
manage to get one process to exit and the other one to start up with
different runtimes, differently sized pointers etc., and somehow share
the same in-core RIB and protocol state?  If this really does work,
there's probably someone sitting somewhere in Sunnyvale immensely smug
and understated right now, and if so I'm sure he/she'd eat the MT
problem for breakfast!

-- Adam.