Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Pavel Lunin
Hi experts,

I had a pleasant time reading the whole thread. Thanks, folks!

Honestly, I also (a bit like Saku) feel that Alexandre's case is more about
throwing the *unneeded* complexity away than about BGP vs. LDP.

The whole point of Kompella-style signaling for L2VPN and VPLS is
auto-discovery in the multi-point VPN service case.

But yes, there is a whole bunch of reasons why multi-point L2 VPN sucks,
and when bridged it sucks 10x more. So if you can throw it away, just throw
it away and you won't need to discuss how to signal it and auto-discover
remote sites.

And yes, as the pseudo-wire data plane is way simpler than VPLS, depending on
your access network design you can [try to] extend it end-to-end, all the
way to the access switch, and [maybe, if you are lucky] dramatically
simplify your NOC's life.

However, a p2p pseudo-wire service is a rather rare thing these days. There
are [quite a lot of] poor folks who were never asked whether bridged
L2 VPN (aka VPLS) is needed in the network they operate. They don't have much
choice.

BGP signaling is the coolest part of the VPLS hell (some minimal magic is
required though). In general I agree with the idea that iBGP stability is
all about making the underlying stuff simple and clean (IGP, BFD, Loss of
Light, whatever). Who said "policies"? For VPLS BGP signaling? Please don't.

And yes, switching frames between fancy full-featured PEs is just half of
the game. The auto-discovery beauty breaks when the frames say bye-bye to
the MPLS backbone and meet the ugly access layer. Now you need to switch them
down to the end-point, and this often ends up in good old^W VLAN
provisioning. But it's not about BGP, it's about VPLS. Or rather about
those brave folks who build their services relying on all these
Ethernet-on-steroids things.
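
To make the contrast concrete, here is a minimal Junos-style sketch of the two
signalling models. It is a sketch only: interface names, VLAN/VC IDs, the
neighbour address and the RD/RT values are invented, and exact hierarchies vary
by platform and release.

# LDP-signalled p2p pseudowire (l2circuit) - no BGP involved
set interfaces ge-0/0/1 vlan-tagging
set interfaces ge-0/0/1 encapsulation flexible-ethernet-services
set interfaces ge-0/0/1 unit 100 encapsulation vlan-ccc
set interfaces ge-0/0/1 unit 100 vlan-id 100
set protocols l2circuit neighbor 192.0.2.2 interface ge-0/0/1.100 virtual-circuit-id 100

# BGP-signalled (Kompella) VPLS - remote sites auto-discovered via the l2vpn family
# (attachment-circuit encapsulation omitted for brevity)
set protocols bgp group IBGP family l2vpn signaling
set routing-instances CUST-A instance-type vpls
set routing-instances CUST-A interface ge-0/0/2.200
set routing-instances CUST-A route-distinguisher 65000:100
set routing-instances CUST-A vrf-target target:65000:100
set routing-instances CUST-A protocols vpls site-range 8
set routing-instances CUST-A protocols vpls site CE-A site-identifier 1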

--
Kind regards,
Pavel


On Sun, Jul 8, 2018 at 10:57 PM,  wrote:

> > From: James Bensley [mailto:jwbens...@gmail.com]
> > Sent: Friday, July 06, 2018 2:04 PM
> >
> >
> >
> > On 5 July 2018 09:56:40 BST, adamv0...@netconsultings.com wrote:
> > >> Of James Bensley
> > >> Sent: Thursday, July 05, 2018 9:15 AM
> > >>
> > >> - 100% rFLA coverage: TI-LA covers the "black spots" we currently
> > >have.
> > >>
> > >Yeah that's an interesting use case you mentioned, that I haven't
> > >considered, that is no TE need but FRR need.
> > >But I guess if it was business critical to get those blind spots
> > >FRR-protected then you would have done something about it already
> > >right?
> >
> > Hi Adam,
> >
> > Yeah correct, no mission critical services are affected by this for us,
> so the
> > business obviously hasn't allocated resource to do anything about it. If
> it was
> > a major issue, it should be as simple as adding an extra back haul link
> to a
> > node or shifting existing ones around (to reshape the P space and Q
> space to
> > "please" the FRR algorithm).
> >
> > >So I guess it's more like it would be nice to have,  now is it enough
> > >to expose the business to additional risk?
> > >Like for instance yes you'd test the feature to death to make sure it
> > >works under any circumstances (it's the very heart of the network after
> > >all if that breaks everything breaks), but the problem I see is then
> > >going to a next release couple of years later -since SR is a new thing
> > >it would have a ton of new stuff added to it by then resulting in
> > >higher potential for regression bugs with comparison to LDP or RSVP
> > >which have been around since
> > >ever and every new release to these two is basically just bug fixes.
> >
> > Good point, I think it's worth breaking that down into two separate
> > points/concerns:
> >
> > Initial deployment bugs:
> > We've done stuff like pay for a CPoC with Cisco, then deployed, then had
> it
> > all blow up, then paid Cisco AS to assess the situation only to be told
> it's not a
> > good design :D So we just assume a default/safe view now that no amount
> > of testing will protect us. We ensure we have backout plans if something
> > immediately blows up, and heightened reporting for issues that take 72
> > hours to show up, and change freezes to cover issues that take a week to
> > show up etc. etc. So I think as far as an initial SR deployment goes,
> all we can
> > do is our best with regards to being cautious, just as we would with any
> > major core changes. So I don't see the initial deployment as any more
> risky
> > than other core projects we've undertaken like changing vendors, entire
> > chassis replacements, code upgrades between major versions etc.
> >
> > Regression bugs:
> > My opinion is that in the case of something like SR which is being
> deployed
> > based on early drafts, regression bugs is potentially a bigger issue
> than an
> > initial deployment. I hadn't considered this. Again though I think it's
> > something we can reasonably prepare for. Depending on the potential
> > impact to the business you could go as far as standing up a new chassis
> next
> > to an existing one, bu

Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Alexandre Guimaraes
Adam,

Important observation: I prefer to keep my PWs working even when a lot of segments of 
the network are affected by a fiber cut and so on...

Since I migrated my BGP VPLS services to l2circuits, my problems today are almost 
zero.

No matter what happens, the business order for everyone is to keep everything 
running 24/7/365 with zero downtime, no matter what (planned maintenance 
doesn't count, since it is planned).

VPLS services, as I said before, caused two outages in one year due to L2 loops 
caused by the operations team; after hours with no progress in finding the loop origin, 
I was called (escalated) to solve the problem.

That is what I mean by my experience: uptime, availability, quality of 
service and so on.

I was a Cisco CCxx for many years, with eyes blind to anything but one vendor, even when 
that vendor caused downtime “with brand”! My Cisco env goes down!!? Oh yes, it’s 
a Cisco!!! Am I ok with that? Not anymore! I want peace, happy customers, and to sell 
more.

With the time that I have today, I can study new tech, run some lab tests, and 
ask for this or that from different vendors.

Today, I can sleep well without the fear that someone will loop something, or that 
some equipment will crash due to CPU/memory problems.

And yes, I am a Network Warrior! But now a tech warrior. Like Call of Duty: 
Infinite Warfare! 

:)

Regards,
Alexandre

On 8 Jul 2018, at 17:58, "adamv0...@netconsultings.com" 
 wrote:

>> From: James Bensley [mailto:jwbens...@gmail.com]
>> Sent: Friday, July 06, 2018 2:04 PM
>> 
>> 
>> 
>> On 5 July 2018 09:56:40 BST, adamv0...@netconsultings.com wrote:
 Of James Bensley
 Sent: Thursday, July 05, 2018 9:15 AM
 
 - 100% rFLA coverage: TI-LA covers the "black spots" we currently
>>> have.
 
>>> Yeah that's an interesting use case you mentioned, that I haven't
>>> considered, that is no TE need but FRR need.
>>> But I guess if it was business critical to get those blind spots
>>> FRR-protected then you would have done something about it already
>>> right?
>> 
>> Hi Adam,
>> 
>> Yeah correct, no mission critical services are affected by this for us, so 
>> the
>> business obviously hasn't allocated resource to do anything about it. If it 
>> was
>> a major issue, it should be as simple as adding an extra back haul link to a
>> node or shifting existing ones around (to reshape the P space and Q space to
>> "please" the FRR algorithm).
>> 
>>> So I guess it's more like it would be nice to have,  now is it enough
>>> to expose the business to additional risk?
>>> Like for instance yes you'd test the feature to death to make sure it
>>> works under any circumstances (it's the very heart of the network after
>>> all if that breaks everything breaks), but the problem I see is then
>>> going to a next release couple of years later -since SR is a new thing
>>> it would have a ton of new stuff added to it by then resulting in
>>> higher potential for regression bugs with comparison to LDP or RSVP
>>> which have been around since
>>> ever and every new release to these two is basically just bug fixes.
>> 
>> Good point, I think it's worth breaking that down into two separate
>> points/concerns:
>> 
>> Initial deployment bugs:
>> We've done stuff like pay for a CPoC with Cisco, then deployed, then had it
>> all blow up, then paid Cisco AS to assess the situation only to be told it's 
>> not a
>> good design :D So we just assume a default/safe view now that no amount
>> of testing will protect us. We ensure we have backout plans if something
>> immediately blows up, and heightened reporting for issues that take 72
>> hours to show up, and change freezes to cover issues that take a week to
>> show up etc. etc. So I think as far as an initial SR deployment goes, all we 
>> can
>> do is our best with regards to being cautious, just as we would with any
>> major core changes. So I don't see the initial deployment as any more risky
>> than other core projects we've undertaken like changing vendors, entire
>> chassis replacements, code upgrades between major versions etc.
>> 
>> Regression bugs:
>> My opinion is that in the case of something like SR which is being deployed
>> based on early drafts, regression bugs is potentially a bigger issue than an
>> initial deployment. I hadn't considered this. Again though I think it's
>> something we can reasonably prepare for. Depending on the potential
>> impact to the business you could go as far as standing up a new chassis next
>> to an existing one, but on the newer code version, run them in parallel,
>> migrating services over slowly, keep the old one up for a while before you
>> take it down. You could just do something as simple and physically replace
>> the routing engine, keep the old one on site for a bit so you can quickly 
>> swap
>> back. Or just drain the links in the IGP, downgrade the code, and then un-
>> drain the links, if you've got some single homed services on there. If you
>> have OOB access and plan all the rollback config in advanc

Re: [j-nsp] [c-nsp] Leaked Video or Not (Linux and Cisco for internal Sales folks)

2018-07-08 Thread adamv0025
> From: Marcus Leske [mailto:marcusles...@gmail.com]
> Sent: Saturday, July 07, 2018 3:58 PM
> 
> open APIs tops that funny abuse list IMHO :
> https://github.com/OAI/OpenAPI-Specification/issues/568
> 
> can we change the topic of the thread to an informative one, instead of a
> leaked video or not, to why exactly network engineers are often
> confused by the abusive marketing all over the place of what is open and
> what is not and other computing terms.
> 
> I guess this is happening in networking more often than other domains
> because networking people didn't get a chance in their career to learn about
> the world of computing, their heads were somewhere else, learning about
> complex networking protocols and not the common computing interfaces,
> the open source world, existing  frameworks and paradigms, this video helps
> a bit on how did this happen:
> https://vimeo.com/262190505
> 
> has anyone here seen list of topics that network engineers usually miss on
> their journey ?  i know they never get exposed to software development
> and engineering in general, databases, web technologies, operating system
> fundamentals.
> 
Well I guess if you stick around in networking for a long time you kind of get 
exposed to some of these to a certain level on the day job, some of it was 
covered in school at various levels of detail, and to some of these concepts we 
(networkers) get a specific, very narrow-field exposure I'd say. Like in your 
example of databases -well, various protocol tables are good examples of 
decentralized distributed databases, and some network OS-es are good examples 
of distributed operating systems. So I guess it then just boils down to the 
willingness of an individual to understand these concepts on an ever more 
fundamental level with every next interaction with them. Maybe it draws one 
more towards the software development side or perhaps more towards a somewhat 
holistic understanding of the networking discipline through graph theory and 
complex adaptive systems.


adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::


___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread adamv0025
> From: James Bensley [mailto:jwbens...@gmail.com]
> Sent: Friday, July 06, 2018 2:04 PM
> 
> 
> 
> On 5 July 2018 09:56:40 BST, adamv0...@netconsultings.com wrote:
> >> Of James Bensley
> >> Sent: Thursday, July 05, 2018 9:15 AM
> >>
> >> - 100% rFLA coverage: TI-LA covers the "black spots" we currently
> >have.
> >>
> >Yeah that's an interesting use case you mentioned, that I haven't
> >considered, that is no TE need but FRR need.
> >But I guess if it was business critical to get those blind spots
> >FRR-protected then you would have done something about it already
> >right?
> 
> Hi Adam,
> 
> Yeah correct, no mission critical services are affected by this for us, so the
> business obviously hasn't allocated resource to do anything about it. If it 
> was
> a major issue, it should be as simple as adding an extra back haul link to a
> node or shifting existing ones around (to reshape the P space and Q space to
> "please" the FRR algorithm).
> 
> >So I guess it's more like it would be nice to have,  now is it enough
> >to expose the business to additional risk?
> >Like for instance yes you'd test the feature to death to make sure it
> >works under any circumstances (it's the very heart of the network after
> >all if that breaks everything breaks), but the problem I see is then
> >going to a next release couple of years later -since SR is a new thing
> >it would have a ton of new stuff added to it by then resulting in
> >higher potential for regression bugs with comparison to LDP or RSVP
> >which have been around since
> >ever and every new release to these two is basically just bug fixes.
> 
> Good point, I think it's worth breaking that down into two separate
> points/concerns:
> 
> Initial deployment bugs:
> We've done stuff like pay for a CPoC with Cisco, then deployed, then had it
> all blow up, then paid Cisco AS to assess the situation only to be told it's 
> not a
> good design :D So we just assume a default/safe view now that no amount
> of testing will protect us. We ensure we have backout plans if something
> immediately blows up, and heightened reporting for issues that take 72
> hours to show up, and change freezes to cover issues that take a week to
> show up etc. etc. So I think as far as an initial SR deployment goes, all we 
> can
> do is our best with regards to being cautious, just as we would with any
> major core changes. So I don't see the initial deployment as any more risky
> than other core projects we've undertaken like changing vendors, entire
> chassis replacements, code upgrades between major versions etc.
> 
> Regression bugs:
> My opinion is that in the case of something like SR which is being deployed
> based on early drafts, regression bugs is potentially a bigger issue than an
> initial deployment. I hadn't considered this. Again though I think it's
> something we can reasonably prepare for. Depending on the potential
> impact to the business you could go as far as standing up a new chassis next
> to an existing one, but on the newer code version, run them in parallel,
> migrating services over slowly, keep the old one up for a while before you
> take it down. You could just do something as simple and physically replace
> the routing engine, keep the old one on site for a bit so you can quickly swap
> back. Or just drain the links in the IGP, downgrade the code, and then un-
> drain the links, if you've got some single homed services on there. If you
> have OOB access and plan all the rollback config in advance, we can
> operationally support the risks, no differently to any other major core
> change.
> 
> Probably the hardest part is assessing what the risk actually is? How to know
> what level of additional support, monitoring, people, you will need. If you
> under resource a rollback of a major failure, and fuck the rollback too, you
> might need some new pants :)
> 
Well yes, I suppose one could actually look at it as at any other major project, 
like an upgrade to a new SW release, or a migration from LDP to RSVP-TE, or adding a 
second plane -or all 3 together. 
And apart from the tedious and rigorous testing (god, there's got to be a better 
way of doing SW validation testing) you made me think about scoping the 
fallback and contingency options in case things don't work out.
These huge projects are always carried out in a number of stages, each broken down 
into several individual steps; all this is to ease the deployment but also to 
scope the fallout in case things go south.  
Like in migrations from LDP to RSVP you go intra-POP first, then inter-POP 
between a pair of POPs, and so on, using small incremental steps, and all this 
time the fallback option is the good old LDP, maybe even well after the project 
is done, until the operational confidence is high enough or till the next code 
upgrade. And I think a similar approach can be used to de-risk an SR rollout. 


adam   

netconsultings.com
::carrier-class solutions for the telecommunications industry::



Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread adamv0025
> Of Mark Tinka
> Sent: Sunday, July 08, 2018 9:24 AM
> 
> 
> 
> On 7/Jul/18 23:24, Saku Ytti wrote:
> 
> > All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each
> > of them you'd like to configure to trigger from events without delay
> > when possible, instead of relying on timers. Indeed you can have BGP
> > next-hop invalidated the moment IGP informs it, allowing rapid
> > convergence.
> 
> For several years now, we've been happy with BFD offering this capability,
> primarily to IS-IS.
> 
> In my experience, as long as IS-IS (or your favorite IGP) is stable, upper
level
> protocols will be just as happy (of course, notwithstanding environmental
> factors such as a slow CPU, exhausted RAM, DoS attacks, link congestion,
> e.t.c.).
> 
Hold on gents, 
You are still talking about multi-hop TCP sessions, right? Sessions that
carry information that is ephemeral to the underlying transport network -why
would you want those sessions to ever go down as a result of anything going on
in the underlying transport network? That's a leaky abstraction, not good
in my opinion.
You just reroute the multi-hop control-plane TCP session around the failed
link and move on; a failed/flapping link should remain solely a data-plane
problem, right? 
So in this particular case the VC label remains the same even though the
transport labels change in reaction to the failed link.
The PW should go down only in case any of the entities it's bound to goes
down, be it an interface or a bridge-domain at either end (or a whole PE for
that matter), and not because there's a problem somewhere in the core. 

adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread adamv0025
> Of Mark Tinka
> Sent: Sunday, July 08, 2018 9:20 AM
> 
Hi Mark,
Two points:

> 
> 
> On 7/Jul/18 23:10, Saku Ytti wrote:
> 
> > Alexandre's point, to which I agree, is that when you run them over
> > LSP, you get all the convergency benefits of TE.
> 
> Unless you've got LFA (or BFD, for the poor man), in which case there is
no
> real incremental benefit.
> 
> We run BFD + LFA for IS-IS. We've never seen the need for RSVP-TE for FRR
> requirements.
> 
> 
> >  But I can understand
> > why someone specifically would not want to run iBGP on LSP,
> > particularly if they already do not run all traffic in LSPs, so it is
> > indeed option for operator. Main point was, it's not an argument for
> > using LDP signalled pseudowires.
> 
> We run all of our IPv4 and l2vpn pw's in (LDP-generated) LSP's. Not sure
if
> that counts...
> 
> I'm not sure whether there is a better reason for BGP- or LDP-signaled
pw's. I
> think folk just use what makes sense to them. I'm with Alexandre where I
> feel, at least in our case, BGP-based signaling for simple p2p or p2mp
pw's
> would be too fat.
> 
> 
> > If there is some transport problems, as there were in Alexandre's
> > case, then you may have lossy transport, which normally does not mean
> > rerouting, so you drop 3 hellos and get LDP down and pseudowire down,
> > in iBGP case not only would you be running iBGP on both of the
> > physical links, but you'd also need to get 6 hellos down, which is
> > roughly 6 orders of magnitude less likely.
> >
> > The whole point being using argument 'we had transport problem causing
> > BGP to flap' cannot be used as a rational reason to justify LDP
> > pseudowires.
> 
> So LDP will never be more stable than your IGP. Even with over-configuration
> of LDP, it's still pretty difficult to mess it up so badly that it's
> unstable all on its own.
> 
> If my IGP loses connectivity, I don't want a false sense of session uptime
> either with LDP or BGP. I'd prefer they tear-down immediately, as that is
> easier to troubleshoot. What would be awkward is BGP or LDP being up, but
> no traffic being passed, as they wait for their Keepalive Hello's to time
out.
> 
The only way you can be 100% sure about the service availability is by
inserting test traffic onto the PW; that's why in Carrier Ethernet a good
practice is to use CFM, so you can not only take the L2ckt down if it is corrupted
but also pinpoint the culprit precisely, which in p2p L2 services
(with no mac learning) is otherwise quite problematic. 
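
For illustration, a minimal Junos-style CFM sketch (maintenance-domain and
-association names, level, MEP ID, interval and interface are all made up, and
the exact knobs vary by platform and release):

set protocols oam ethernet connectivity-fault-management maintenance-domain CUST-MD level 4
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST-MD maintenance-association CUST-MA continuity-check interval 1s
set protocols oam ethernet connectivity-fault-management maintenance-domain CUST-MD maintenance-association CUST-MA mep 100 interface ge-0/0/1.100 direction up
# An action-profile (not shown) can then bring the AC down on CCM loss.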

> 
> >
> > I would provision both p2mp and p2p through minimal difference, as to
> > reduce complexity in provisioning. I would make p2p special case of
> > p2mp, so that when there are exactly two attachment circuits, there
> > will be no mac learning.
> > However if you do not do p2mp, you may have some stronger arguments
> > for LDP pseudowires, more so, if you have some pure LDP edges, with no
> > BGP.
> 
> Agreed that EVPN and VPLS better automate the provisioning of p2mp pw's.
> However, this is something you can easily script for LDP as well; and once
it's
> up, it's up.
> 
> And with LDP building p2mp pw's, you are just managing LDP session state.
> Unlike BGP, you are not needing to also manage routing tables, e.t.c.
> 
We have to distinguish here whether you're using BGP just for the VC
endpoint reachability and VC label propagation (VPLS) or also to carry
end-host reachability information (EVPN); only in the latter do you need to
worry about the routing tables. In the former the BGP function is exactly the
same as the function of a targeted LDP session -well, in the VC label
propagation bit anyway (not the auto-discovery bit, of course). 


adam

netconsultings.com
::carrier-class solutions for the telecommunications industry::

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread adamv0025
> Of Alexandre Guimaraes
> Sent: Saturday, July 07, 2018 1:01 PM
> 
Hi Alexandre,
With the level of detail you provided, I'm afraid it seems like some of 
your troubles are rooted in somewhat suboptimal design choices.

> My Usage Cent
> 
> My core Network, P and PE, are 100% Juniper
> 
> We started using VPLS, based on BGP sessions; at that time we were working at
> a maximum of 2 or 3 new provisions per day.
> We won a big project contract, and we reached 90/100 per month.
> VPLS became an issue on all fronts...
> 
> Planning/ low ports - price of 10G ports using MX and rack space usage
> 
This is a good business case for an aggregation network built out of, say, those EX 
switches you mentioned, to aggregate low-speed customer links into bundles of 
10/40GE links towards the PEs. 
This then allows you to use the potential of a PE slot fully, as dictated by the 
fabric, making better use of the chassis.
The carrier Ethernet features on the PE that allow you to realize such L2 service 
aggregation are flexible VLAN tag manipulation (push/pop/translate 1 or 2 tags) 
and per-interface VLAN ranges. 
Although the EX switches don't support a per-interface VLAN range, I still 
think that ~4000 customers (or service VLANs) per aggregation switch is enough. 
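
As a rough illustration of that tag manipulation on the PE side (a sketch only;
interface, unit and VLAN numbers are invented):

set interfaces xe-0/0/0 flexible-vlan-tagging
set interfaces xe-0/0/0 encapsulation flexible-ethernet-services
# Customer arrives double-tagged from the aggregation switch (S-VLAN 600, C-VLAN 100);
# pop the outer tag towards the service, push it back on the way out.
set interfaces xe-0/0/0 unit 600 encapsulation vlan-ccc
set interfaces xe-0/0/0 unit 600 vlan-tags outer 600 inner 100
set interfaces xe-0/0/0 unit 600 input-vlan-map pop
set interfaces xe-0/0/0 unit 600 output-vlan-map push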
 
> Provisioning... vlan remap, memory usage of the routers and 2000/2500
> circuits/customers per MX
> 
Templates and automation in provisioning will really make the difference if you 
go past a certain scale or customer onboarding rate.

> Tshoot, a headache to find the signaling problem when, for example: fiber
> degraded, all BGP sessions start flapping and the things become crazy and
> the impact increase each minute.
> 
I think BGP sessions are no different from LSP sessions in this regard, maybe 
just routed differently (not PE-to-PE but PE-to-RR).
Maybe running BFD on your core links for rapid problem detection, plus 
interface hold-down or dampening to stabilize the network, could have helped 
with this.  
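
Something along these lines, for example (a sketch; the interface name and the
timer value are arbitrary):

# Delay reporting "up" on a flapping or degraded core link, so the IGP and the
# signalling sessions riding on it are not dragged along with every transition.
set interfaces xe-0/0/0 hold-time up 5000 down 0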

> Operating, the vpls routing table becomes a pain in the ass when you use
> multipoint connections and, with Lucifer reason, those multipoint become
> unreachable and the vpls table and all routing tables become huge to analyze.
> 
On the huge routing table sizes: 
I think the problem of huge tables is something we all have to bear when in the 
business of L2/L3 VPN services. 
But in Ethernet services only p2mp and mp2mp services require standard 
l2-switch-like mac learning and thus exhibit this scaling problem; there's 
no need for mac learning for p2p services.
So I guess you could have just disabled mac learning on the instances that were 
intended to support p2p services.
Also it's good practice to limit, contractually, how many resources each 
VPN customer can use -in L2 services that is for instance the MACs per interface or per 
bridge-domain, etc... 
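
For instance, roughly like this (a sketch; the instance name and numbers are
made up, and the exact knob names differ a bit across platforms and releases):

# Cap MAC learning per VPLS instance and per attachment interface.
set routing-instances CUST-A protocols vpls mac-table-size 2048
set routing-instances CUST-A protocols vpls interface-mac-limit 64
# A strictly p2p service delivered as an l2circuit avoids MAC learning altogether.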

> Regarding L2circuits using LDP.
> 
But hey I'm glad it worked out for you with the LDP signalled PWs, and yes I do 
agree the config is simpler for LDP. 

adam 

___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] SNMP NMS support of Junos VLAN MIBs

2018-07-08 Thread Chuck Anderson
Yes.  Juniper added a configuration knob to cause PortList to work
according to the standard on Junos ELS for at least EX4300, EX3400
etc:

set switch-options mib dot1q-mib port-list bit-map

in Junos versions at least as old as 14.1X53-D45 and 15.1X53-D57.3.
It also appears to commit in Junos 17.3R2 for MX, but I haven't tested
the functionality.

On Sun, Jul 08, 2018 at 11:51:32AM -0500, Colton Conor wrote:
> Chuck,
> 
> Did this Junos issue ever get resolved?
> 
> On Wed, Dec 9, 2015 at 10:31 AM, Chuck Anderson  wrote:
> 
> > Has anyone tried to use or implement polling of the Q-BRIDGE-MIB on
> > any Juniper products, using either commercial or open source NMS
> > software or custom in-house software?  What has been your experience
> > of the Juniper support of those SNMP products to correctly report
> > Port/VLAN memberships and VLAN/MAC FDB information?
> >
> > Juniper EX-series (at least EX2200,3200,4200) 12.x and earlier has a
> > working Q-BRIDGE-MIB (dot1qVlanStaticEgressPorts) and JUNIPER-VLAN-MIB
> > (jnxExVlan).  Because Q-BRIDGE-MIB refers only to internal VLAN
> > indexes, you need to use both MIBs to get Port/VLAN mappings including
> > the 802.1Q VLAN tag ID (jnxExVlanTag).  This means custom software, or
> > an NMS vendor willing to implement the Juniper Enterprise MIBs.
> >
> > All other Juniper Junos platforms only have Q-BRIDGE-MIB, but it is
> > broken (doesn't follow RFC 4363 standard PortList definition, instead
> > storing port indexes as ASCII-encoded, comma separated values),
> > apparently for a very long time.  So again, you need custom software
> > or an NMS vendor willing to implement the broken Juniper version of
> > Q-BRIDGE-MIB (along with detecting which implementation is needed on
> > any particular device).  This hasn't been a problem for us and in fact
> > went unnoticed, because we never cared to poll VLAN information from
> > our MX routers, only EX switches.
> >
> > But now EX-series (and QFX-series) 13.x and newer with ELS have
> > dropped the Enterprise JUNIPER-VLAN-MIB (a good thing to not require
> > Enterprise MIBs to get the VLAN tag ID) and have adopted the broken
> > Q-BRIDGE-MIB that all the other Junos platforms have been using (a
> > very bad thing).  I'm pushing to have Juniper fix this, but their
> > concern is that it may break SNMP software that has been assuming the
> > broken Q-BRIDGE-MIB implementation for all these years.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] SNMP NMS support of Junos VLAN MIBs

2018-07-08 Thread Colton Conor
Chuck,

Did this Junos issue ever get resolved?

On Wed, Dec 9, 2015 at 10:31 AM, Chuck Anderson  wrote:

> Has anyone tried to use or implement polling of the Q-BRIDGE-MIB on
> any Juniper products, using either commercial or open source NMS
> software or custom in-house software?  What has been your experience
> of the Juniper support of those SNMP products to correctly report
> Port/VLAN memberships and VLAN/MAC FDB information?
>
> Juniper EX-series (at least EX2200,3200,4200) 12.x and earlier has a
> working Q-BRIDGE-MIB (dot1qVlanStaticEgressPorts) and JUNIPER-VLAN-MIB
> (jnxExVlan).  Because Q-BRIDGE-MIB refers only to internal VLAN
> indexes, you need to use both MIBs to get Port/VLAN mappings including
> the 802.1Q VLAN tag ID (jnxExVlanTag).  This means custom software, or
> an NMS vendor willing to implement the Juniper Enterprise MIBs.
>
> All other Juniper Junos platforms only have Q-BRIDGE-MIB, but it is
> broken (doesn't follow RFC 4363 standard PortList definition, instead
> storing port indexes as ASCII-encoded, comma separated values),
> apparently for a very long time.  So again, you need custom software
> or an NMS vendor willing to implement the broken Juniper version of
> Q-BRIDGE-MIB (along with detecting which implementation is needed on
> any particular device).  This hasn't been a problem for us and in fact
> went unnoticed, because we never cared to poll VLAN information from
> our MX routers, only EX switches.
>
> But now EX-series (and QFX-series) 13.x and newer with ELS have
> dropped the Enterprise JUNIPER-VLAN-MIB (a good thing to not require
> Enterprise MIBs to get the VLAN tag ID) and have adopted the broken
> Q-BRIDGE-MIB that all the other Junos platforms have been using (a
> very bad thing).  I'm pushing to have Juniper fix this, but their
> concern is that it may break SNMP software that has been assuming the
> broken Q-BRIDGE-MIB implementation for all these years.
> ___
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
>
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] RES: Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Mark Tinka



On 8/Jul/18 00:05, Alexandre Guimaraes wrote:

> I am not here to make my words the rule, I am just sharing my -Real World 
> Deployment and Operation- knowledge and experience.

Thanks for sharing, Alexandre. This honest description of your
experiences is what I really like.

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Mark Tinka


On 7/Jul/18 23:54, Aaron Gould wrote:

> Thanks Mark, I haven't been aware of any buffer deficiency in my
> 4550's.  If something adverse is occurring, I'm not aware.

The EX4550 has only 4MB of shared buffer memory. The EX4600 has only 12MB.

You need the "set class-of-service shared-buffer percent 100" command to
ensure some ports don't get starved of buffer space (which will manifest
as dropped frames, e.t.c.).

The Arista 7280R series switches have 4GB of buffer space on the
low-end, all the way to 8GB, 12GB, 16GB, 24GB and 32GB as you scale up.


>
> Thanks for the warning about large VC... I don't really intend on
> going past the (2) stacked.  After we outgrow it, I'll move on.

We are dropping the VC idea going forward. Simpler to just have enough
bandwidth between a switch and the router that you can predict.

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Mark Tinka



On 7/Jul/18 23:24, Saku Ytti wrote:

> All these protocols have hello timers, LDP, ISIS, RSVP, BGP. And each
> of them you'd like to configure to trigger from events without delay
> when possible, instead of relying on timers. Indeed you can have BGP
> next-hop invalidated the moment IGP informs it, allowing rapid
> convergence.

For several years now, we've been happy with BFD offering this
capability, primarily to IS-IS.

In my experience, as long as IS-IS (or your favorite IGP) is stable,
upper level protocols will be just as happy (of course, notwithstanding
environmental factors such as a slow CPU, exhausted RAM, DoS attacks,
link congestion, e.t.c.).

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Mark Tinka


On 7/Jul/18 23:18, Alexandre Guimaraes wrote:

> When we use l2circuits, you remove some layer of routing protocol 
> troubleshooting. In just few command you know what’s going on.
>
> In a flap, BGP session will be dropped after timers reached.
>
> RSVP/ISIS/LDP will be affect immediately.  Also ISIS is the fundamental key 
> of everything over here
>
> With BGP, you have to check everything twice, including filters everywhere if 
> someone change this or change that.

Sage...

Protein...

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Segment Routing Real World Deployment (was: VPC mc-lag)

2018-07-08 Thread Mark Tinka



On 7/Jul/18 23:10, Saku Ytti wrote:

> Alexandre's point, to which I agree, is that when you run them over
> LSP, you get all the convergency benefits of TE.

Unless you've got LFA (or BFD, for the poor man), in which case there is
no real incremental benefit.

We run BFD + LFA for IS-IS. We've never seen the need for RSVP-TE for
FRR requirements.
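
For reference, a minimal Junos-style sketch of that combination (the interface
name and BFD timers are illustrative only):

# LFA backup next-hop computation plus BFD-driven fast failure detection for IS-IS.
set protocols isis interface ge-0/0/0.0 point-to-point
set protocols isis interface ge-0/0/0.0 link-protection
set protocols isis interface ge-0/0/0.0 family inet bfd-liveness-detection minimum-interval 150
set protocols isis interface ge-0/0/0.0 family inet bfd-liveness-detection multiplier 3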


>  But I can understand
> why someone specifically would not want to run iBGP on LSP,
> particularly if they already do not run all traffic in LSPs, so it is
> indeed option for operator. Main point was, it's not an argument for
> using LDP signalled pseudowires.

We run all of our IPv4 and l2vpn pw's in (LDP-generated) LSP's. Not sure
if that counts...

I'm not sure whether there is a better reason for BGP- or LDP-signaled
pw's. I think folk just use what makes sense to them. I'm with Alexandre
where I feel, at least in our case, BGP-based signaling for simple p2p
or p2mp pw's would be too fat.


> If there is some transport problems, as there were in Alexandre's
> case, then you may have lossy transport, which normally does not mean
> rerouting, so you drop 3 hellos and get LDP down and pseudowire down,
> in iBGP case not only would you be running iBGP on both of the
> physical links, but you'd also need to get 6 hellos down, which is
> roughly 6 orders of magnitude less likely.
>
> The whole point being using argument 'we had transport problem causing
> BGP to flap' cannot be used as a rational reason to justify LDP
> pseudowires.

So LDP will never be more stable than your IGP. Even with
over-configuration of LDP, it's still pretty difficult to mess
it up so badly that it's unstable all on its own.

If my IGP loses connectivity, I don't want a false sense of session
uptime either with LDP or BGP. I'd prefer they tear-down immediately, as
that is easier to troubleshoot. What would be awkward is BGP or LDP
being up, but no traffic being passed, as they wait for their Keepalive
Hello's to time out.


>
> I would provision both p2mp and p2p through minimal difference, as to
> reduce complexity in provisioning. I would make p2p special case of
> p2mp, so that when there are exactly two attachment circuits, there
> will be no mac learning.
> However if you do not do p2mp, you may have some stronger arguments
> for LDP pseudowires, more so, if you have some pure LDP edges, with no
> BGP.

Agreed that EVPN and VPLS better automate the provisioning of p2mp pw's.
However, this is something you can easily script for LDP as well; and
once it's up, it's up.

And with LDP building p2mp pw's, you are just managing LDP session
state. Unlike BGP, you are not needing to also manage routing tables, e.t.c.

Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp