Re: RPKI TAs

2020-08-03 Thread Alex Band
I concur.

Four out of five RIR Trust Anchor Locators were recently updated to allow 
fetching the Trust Anchor via an HTTPS URI, further removing the dependence on 
rsync. Sadly, most TALs are not clearly published anywhere and I had to get 
them through GitHub issues and emails to be able to include them in the latest 
Routinator release.

These are what we believe to be the correct, up-to-date RPKI TALs:

https://github.com/NLnetLabs/routinator/tree/master/tals

You can find more discussion about this topic here:

https://github.com/NICMx/FORT-validator/issues/34
https://github.com/RIPE-NCC/rpki-validator-3/pull/215

RPA grief aside, ARIN seems to be the only RIR that publishes the latest 
version of their TAL clearly and correctly:

https://www.arin.net/resources/manage/rpki/tal/

-Alex


> On 2 Aug 2020, at 20:52, Randy Bush  wrote:
> 
> so i was trying to ensure i had a current set of TALs and was directed to
> 
>
> https://www.ripe.net/manage-ips-and-asns/resource-management/certification/ripe-ncc-rpki-trust-anchor-structure
> 
> the supposed TAL at the bottom of the page is pretty creative.  anyone
> know what to do there?
> 
> i kinda hacked with emacs and get
> 
>rsync://rpki.ripe.net/ta/ripe-ncc-ta.cerpublic.key.info
> 
>
> MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA0URYSGqUz2myBsOzeW1jQ6NsxNvlLMyhWknvnl8NiBCs/T/S2XuNKQNZ+wBZxIgPPV2pFBFeQAvoH/WK83HwA26V2siwm/MY2nKZ+Olw+wlpzlZ1p3Ipj2eNcKrmit8BwBC8xImzuCGaV0jkRB0GZ0hoH6Ml03umLprRsn6v0xOP0+l6Qc1ZHMFVFb385IQ7FQQTcVIxrdeMsoyJq9eMkE6DoclHhF/NlSllXubASQ9KUWqJ0+Ot3QCXr4LXECMfkpkVR2TZT+v5v658bHVs6ZxRD1b6Uk1uQKAyHUbn/tXvP8lrjAibGzVsXDT2L0x4Edx+QdixPgOji3gBMyL2VwIDAQAB
> 
> but kinda expected an rrdp uri too
> 
> and, to add insult to injury, the APNIC web page with their TAL
> 
>https://www.apnic.net/community/security/resource-certification/
> 
> requires javascript!
> 
> not to mention the ARIN stupidity
> 
> as if we needed another exercise in bureaucrats making operations
> painful.  most operations of any size have internal departments
> perfectly capable of doing that.
> 
> randy
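
For anyone else reconstructing a TAL by hand: the layout in RFC 8630 is an optional block of "#" comment lines, one or more URIs (rsync or HTTPS), a blank line, and then the base64-encoded subjectPublicKeyInfo. Below is a minimal sketch of a parser for that layout in Python. The function name, the example.net URIs and the placeholder key are all made up for illustration; this is not how any particular validator does it.

```python
import base64


def parse_tal(text):
    """Return (uris, key_bytes) from TAL text laid out as in RFC 8630."""
    lines = [line.strip() for line in text.splitlines()]
    i = 0
    # Optional comment section: lines starting with '#'.
    while i < len(lines) and lines[i].startswith("#"):
        i += 1
    # URI section: one or more URIs, ended by a blank line.
    uris = []
    while i < len(lines) and lines[i]:
        uris.append(lines[i])
        i += 1
    if not uris:
        raise ValueError("no URI section found")
    for uri in uris:
        if not uri.startswith(("rsync://", "https://")):
            raise ValueError(f"unexpected URI scheme: {uri}")
    # Everything after the blank line is the base64 subjectPublicKeyInfo,
    # possibly wrapped over several lines.
    encoded = "".join(lines[i:])
    if not encoded:
        raise ValueError("no public key section found")
    return uris, base64.b64decode(encoded, validate=True)


# Hypothetical sample: example.net URIs and a placeholder instead of a real key.
PLACEHOLDER_KEY = b"not a real subjectPublicKeyInfo"
SAMPLE_TAL = (
    "# illustrative only\n"
    "https://rpki.example.net/ta/example-ta.cer\n"
    "rsync://rpki.example.net/ta/example-ta.cer\n"
    "\n"
    + base64.b64encode(PLACEHOLDER_KEY).decode("ascii")
    + "\n"
)

if __name__ == "__main__":
    uris, key = parse_tal(SAMPLE_TAL)
    print(uris, len(key))
```

A file that mixes the URI and the key on one line, as the hacked-up text above does, would not parse cleanly into the two sections, which is roughly the point of the complaint.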



Re: RPKI TAs

2020-08-03 Thread Matthias Waehlisch


On Mon, 3 Aug 2020, Alex Band wrote:

> These are what we believe to be the correct, up-to-date RPKI TALs:
> 
> https://github.com/NLnetLabs/routinator/tree/master/tals
> 


  why is it so hard that all RIRs make their TAL files available under 
the same URL path but different hosts, e.g., https://ripe.net/rpki/tal, 
https://arin.net/rpki/tal ?



  obviously, a single TAL would be better but this needs even more 
rhetoric ...


cheers
  matthias

-- 
Matthias Waehlisch
.  Freie Universitaet Berlin, Computer Science
.. http://www.cs.fu-berlin.de/~waehl


RE: BGP route hijack by AS10990

2020-08-03 Thread adamv0025
> Darrell Budic
> Sent: Sunday, August 2, 2020 6:23 PM
> 
> On Jul 30, 2020, at 5:37 PM, Baldur Norddahl 
> wrote:
> >
> > Telia implements RPKI filtering so the question is did it work? Were any
> > affected prefixes RPKI signed? Would any prefixes have avoided being
> > hijacked if RPKI signing had been in place?
> >
> > Regards
> >
> > Baldur - who had to turn off RPKI filtering at the request of JTAC to stop
> > our mx204s from crashing :-(
> >
> 
> Oh uh, I’m getting close to getting RPKI going on my mx204s, or was until you
> posted that. What’s the story there, and perhaps which junos version?

Same here, would be interested in affected Junos versions or any details you 
can share please,

adam




Re: Has virtualization become obsolete in 5G?

2020-08-03 Thread Mark Tinka


On 3/Aug/20 08:40, Etienne-Victor Depasquale wrote:

> Is the following extract from this Heavy Reading white paper useful?
>
> " For transport network slicing, 
> operators strongly prefer soft slicing with virtual private networks
> (VPNs), 
> regardless of the VPN flavor.
> Ranking at the top of the list was Layer 3 VPNs (selected by 66% of
> respondents), 
> but Layer 2 VPNs, Ethernet VPNs (EVPNs), and segment routing 
> also ranked highly at 47%, 46%, and 46%, respectively. 
> The point is underscored by the low preferences among all of the hard
> slicing technologies— 
> those that physically partition resources among slices. 
> Hard slicing options formed the bottom tier among preferences."

Well, it's what I've been saying - we have tried & tested systems and
solutions that are already native to IP/MPLS networks. Why try to
reinvent network virtualization when there are plenty of existing
solutions in the wild for next to cheap? VLAN's. l2vpn's. l3vpn's. EVPN.
DWDM. And all the rest?

The whole fuss, for example, about the GRX vs. IPX all came down to
2Mbps private or public IP-based GTP tunnels vs. 100Mbps l3vpn's.

Mobile operators know how to make everyday protocols seem overly
complicated.

If we go by their nomenclature, the simple operators on this list have
been slicing infrastructure for yonks :-).

Mark.


Re: BGP route hijack by AS10990

2020-08-03 Thread Alex Band


> On 3 Aug 2020, at 11:04, adamv0...@netconsultings.com wrote:
> 
>> Darrell Budic
>> Sent: Sunday, August 2, 2020 6:23 PM
>> 
>> On Jul 30, 2020, at 5:37 PM, Baldur Norddahl 
>> wrote:
>>> 
>>> Telia implements RPKI filtering so the question is did it work? Were any
>>> affected prefixes RPKI signed? Would any prefixes have avoided being
>>> hijacked if RPKI signing had been in place?
>>> 
>>> Regards
>>> 
>>> Baldur - who had to turn off RPKI filtering at the request of JTAC to stop
>>> our mx204s from crashing :-(
>>> 
>> 
>> Oh uh, I’m getting close to getting RPKI going on my mx204s, or was until you
>> posted that. What’s the story there, and perhaps which junos version?
> 
> Same here, would be interested in affected Junos versions or any details you 
> can share please,

According to the information I received from the community[1], you should read 
PR1461602 and PR1309944 before deploying.

-Alex

[1] https://rpki.readthedocs.io/en/latest/rpki/router-support.html

Re: BGP route hijack by AS10990

2020-08-03 Thread Tom Beecher
>
> We can all do better. We should all do better.
>

Agreed.

However, every time we go on this Righteous Indignation of Should Do
crusade, it would serve us well to stop and remember that in every one of
our jobs, at many points in our careers, we have been faced with a
situation where something we SHOULD do ends up being deferred for something
we MUST do. It is a universal truth that there will never be enough time
and resources to complete both, especially not in our current business
environment where the only thing that matters is the numbers for the next
quarter. Sometimes as engineers we have to make choices,  sometimes
choices are imposed on us by pointy hairs.

Telia made a mistake. They owned it and will endeavor to do better. What
more can be asked?

On Fri, Jul 31, 2020 at 5:51 PM Mark Tinka  wrote:

>
>
> On 31/Jul/20 23:38, Sabri Berisha wrote:
>
> > Kudos to Telia for admitting their mistakes, and fixing their processes.
>
> Considering Telia's scope and "experience", that is one thing. But for
> the general good of the Internet, the number of intended or
> unintentional route hijacks in recent years, and all the noise that
> rises on this and other lists each time we have such incidents (this
> won't be the last), Telia should not have waited to be called out in
> order to get this fixed.
>
> Do we know if they are fixing this on just this customer of theirs, or
> all their customers? I know this has been their filtering policy with us
> (SEACOM) since 2014, as I pointed out earlier today. There has not been
> a shortage of similar incidents between now and then, where the
> community has consistently called for more deliberate and effective
> route filtering across inter-AS arrangements.
>
> There is massive responsibility for the community to act correctly for
> the Internet to succeed. Especially so during these Coronavirus times
> where the world depends on us to keep whatever shred of an economy is
> left up and running. Doubly so if you are a major concern (like Telia)
> for the core of the Internet.
>
> It's great that they are fixing this - but this was TOTALLY avoidable.
> That we won't see this again - even from the same actors - isn't
> something I have high confidence in guaranteeing, based on current
> experience.
>
> We can all do better. We should all do better.
>
> Mark.
>
>


Re: BGP route hijack by AS10990

2020-08-03 Thread Mark Tinka



On 3/Aug/20 14:36, Alex Band wrote:
> According to the information I received from the community[1], you should 
> read PR1461602 and PR1309944 before deploying.

The good news is the code that fixes both of those issues is shipping.

Mark.


Yahoo! admin

2020-08-03 Thread Brian
If there's a Yahoo! admin on list that can contact me offlist I'd
appreciate it. You have a TLS issue on IPv6 only that your front line
customer care people insist is an email sending related issue.

Please tell me the only way to speak to someone there is to pay a
monthly fee when your own coders are breaking your own services.

Thanks.


Re: Has virtualization become obsolete in 5G?

2020-08-03 Thread David Monosov
Containerization and k8s aren't so much a shift away from virtualization
(horizontally), but a shift up from virtualization (vertically). It is a broader
theme than 5G - initially gaining traction with SaaS companies, and recently
appearing in NFV scenarios.

Under the hood, k8s relies on an operating system which in turn typically runs
inside a VM on a physical compute resource. Virtualization, thus, isn't obsolete
- but its implementation specifics lose importance.

The operator describes her desired configuration state once in the form of k8s
objects, and is ready to deploy a service to any k8s platform instance. This can
be an A-list k8s-as-a-service provider such as Amazon EKS, Google GKE, or Azure
AKS. It can also be an in-house VMWare Tanzu or Mirantis Cloud Platform
deployment that runs on the operator's own bare metal in their own data center.
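
To make the "describe the desired state once" idea concrete, here is a minimal sketch in Python of the declarative pattern: compare the desired objects with what actually exists and work out the actions needed to converge. This is a generic illustration only, not the k8s API; the function name, object names and specs are hypothetical.

```python
def reconcile(desired, actual):
    """Return the actions needed to make `actual` match `desired`.

    Both arguments map an object name to its spec (any comparable value).
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical state: what the operator declared vs. what the platform runs.
desired = {"web": {"replicas": 3}, "dns": {"replicas": 2}}
actual = {"web": {"replicas": 1}, "old-job": {"replicas": 1}}

if __name__ == "__main__":
    for action in reconcile(desired, actual):
        print(action)
```

The same `desired` description can be handed to any platform instance that runs such a loop, which is why the implementation specifics underneath matter less to the operator.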

This additional abstraction, however, is only magical when someone else gets
paid to deal with the detail. For an operator's in-house IT team, introducing
k8s can be a net increase in complexity. Now, not only do they have to deal with
all traditional IT challenges up to and including virtualization (life-cycle of
hardware, physical network, storage, virtualization, operating system,
licensing, backups, ...) - but also must map the k8s platform instance to these
underlying elements and ensure the correct functioning of the k8s platform 
itself.

Solutions are emerging (e.g. Amazon AWS Outposts, which allow an operator to
bring a micro Amazon region in-house), but we'll likely continue to see NFV
vendors supporting both VM-targeted and k8s-targeted deployment scenarios for
some time.

--
Sincerely,

David Monosov

On 01/08/2020 11:23, Etienne-Victor Depasquale wrote:
> Hi folks,
> 
> Over the past few weeks, I've attended webinars and watched videos organized
> by Intel.
> These activities have centred on 5G and examined applications (like "visual
> cloud" and "gaming"), 
> as well as segment-oriented aspects (like edge networks, 5G RAN and 5G Core).
> 
> I am stunned (no hyperbole) by the emphasis on Kubernetes in particular,
> and cloud-native computing in general. 
> Equally stunning (for me), public telecommunications networks have been 
> portrayed 
> as having a history that moved from integrated software and hardware, 
> to virtualization and now to cloud-native computing. 
> See, for example, Alex Quach (@10:30).
> I reason that Intel's implication is that virtualization is becoming obsolete.
> 
> Would anyone care to let me know his thoughts on this prediction?
> 
> 
> Cheers all,
> 
> Etienne
> 
> -- 
> Ing. Etienne-Victor Depasquale
> Assistant Lecturer
> Department of Communications & Computer Engineering
> Faculty of Information & Communication Technology
> University of Malta
> Web. https://www.um.edu.mt/profile/etiennedepasqualeI 


Re: BGP route hijack by AS10990

2020-08-03 Thread Rafael Possamai
To your point with regards to multiple failures combined causing an outage, 
here's some basic reading on the Swiss cheese model: 
https://en.wikipedia.org/wiki/Swiss_cheese_model 

From over here it looks like the legacy filter was a latent failure, and the 
BGP automation from the downstream peer of Telia was an active failure 
(combined caused the outage). Now from the downstream peer's point of view, 
perhaps the cause of their BGP automation failure was latent also, but we 
wouldn't know without more details.

Pretty interesting topic.

Re: RPKI TAs

2020-08-03 Thread John Kristoff
On Sun, 2 Aug 2020 18:52:11 +
Randy Bush  wrote:

> not to mention the ARIN stupidity

Notwithstanding the RPA, downloading ARIN's TAL is straightforward:

As documented here:

  

One can wget, curl, or whatever this:

  

John


Re: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-03 Thread Tom Beecher
>
> Why are you not on your soap box about BIRD, FRrouting, OpenBGPd, Cisco,
> Juniper, etc... about how they can possibly allow every day screw ups to
> happen, but the same options like the NO_EXPORT community are available for
> the engineer to use? One solution would be to implement "BGP Group/Session
> Profiles" (ISP/RTBH/DDOS Filtering/Route Optimizers/etc) or a "BGP Session
> Wizard" (ask the operator questions about their intentions), then
> automatically generate import and export policies based on known accepted
> practices.
>

You seem to be implying that nobody has ever given feedback to a vendor
about their BGP implementation. That's incredibly far from the truth.
Default parameters on many NOS have been changed over the years to be safer
for 2AM compliance, or for operators with lesser experience.

It's correct that someone with any of those BGP implementations can make
configuration errors. The difference is those configuration errors are LESS
LIKELY to cause widespread disruption in the DFZ. I think back to many
years ago at the start of my career, and the first time I configured BGP on
a router with 2 upstreams. In an amazing rookie move, I created config
which I did not apply, reannouncing everything from 3356 to 1239 via
myself, and vice versa. While embarrassing, only a very small amount of
traffic ( <10Mbps ) and prefixes were impacted, since BGP worked as
designed, and the longer AS PATH I created was less desirable for almost
everyone.

IF you are going to create more specific announcements, be it with a BGP
"optimizer" , or with other BGP implementations, the SAFEST method to
prevent unintended consequences would be to add guardrails, like NO_EXPORT.
It's just a best practice. When you can make those good best practices a
default behavior? Even better! There is no downside to Noction making
NO_EXPORT the default behavior, only upside to the stability of the
internet at large.
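
As a rough illustration of the guardrail, here is a minimal sketch in Python. NO_EXPORT is the well-known community 65535:65281 (0xFFFFFF01) from RFC 1997; everything else (the Route class, the function names, the documentation prefixes) is hypothetical and does not reflect how any optimizer or router is implemented.

```python
from dataclasses import dataclass, field

NO_EXPORT = 0xFFFFFF01  # well-known community 65535:65281 (RFC 1997)


@dataclass
class Route:
    prefix: str
    communities: set = field(default_factory=set)


def tag_generated_route(prefix, no_export=True):
    """Create a locally generated more-specific, tagged NO_EXPORT by default."""
    route = Route(prefix)
    if no_export:
        route.communities.add(NO_EXPORT)
    return route


def exportable_to_ebgp(routes):
    """Keep only routes that may be advertised to an external peer."""
    return [r for r in routes if NO_EXPORT not in r.communities]


if __name__ == "__main__":
    table = [
        Route("198.51.100.0/24"),                # the real aggregate
        tag_generated_route("198.51.100.0/25"),  # generated more-specific
    ]
    print([r.prefix for r in exportable_to_ebgp(table)])
```

With the default left on, the generated /25 stays inside the local AS even if an export filter elsewhere is missing; the operator has to opt out deliberately rather than remember to opt in.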

On Sat, Aug 1, 2020 at 4:31 PM Ryan Hamel  wrote:

> Job,
>
> I disagree on the fact that it is not fair to the BGP implementation
> ecosystem, to enforce a single piece of software to activate the no-export
> community by default, due to ignorance from the engineer(s) implementing
> the solution. It should be common sense that certain routes should not be
> advertised beyond the local AS, just like RFC1918 routes, and more. Also,
> wasn't it you that said Cisco routers had a bug in ignoring NO_EXPORT?
> Would you go on a rant with Cisco, even if Noction add that enabled
> checkbox by default?
>
> Why are you not on your soap box about BIRD, FRrouting, OpenBGPd, Cisco,
> Juniper, etc... about how they can possibly allow every day screw ups to
> happen, but the same options like the NO_EXPORT community are available for
> the engineer to use? One solution would be to implement "BGP Group/Session
> Profiles" (ISP/RTBH/DDOS Filtering/Route Optimizers/etc) or a "BGP Session
> Wizard" (ask the operator questions about their intentions), then
> automatically generate import and export policies based on known accepted
> practices.
>
> Another solution could be having the BGP daemon disclose the make, model
> family, and exact model of hardware it is running on, to BGP peers, and add
> more knobs into policy creation to match said values, and take action
> appropriately. That would be useful in getting around vendor specific
> issues, as well as belt & suspenders protection.
>
> Ryan
> On Aug 1 2020, at 9:58 am, Job Snijders  wrote:
>
> On Sat, Aug 01, 2020 at 06:50:55AM -0700, Ca By wrote:
> > I am not normally supporting a heavy hand in regulation, but i think it is
> > fair to say Noction and similar BGP optimizers are unsafe at any speed and
> > the FTC or similar should ban them in the USA. They harm consumers and are
> > a risk to national security / critical infrastructure
> >
> > Noction and similar could have set basic defaults (no-export, only create
> > /25 bogus routes to limit scope), but they have been clear that their greed
> > to suck up traffic does not benefit from these defaults and they wont do
> > it.
>
> Following a large scale BGP incident in March 2015, noction made it
> possible to optionally set the well-known NO_EXPORT community on route
> advertisements originated by IRP instances.
>
> "In order to further reduce the likelihood of these problems
> occurring in the future, we will be adding a feature within Noction
> IRP to give an option to tag all the more specific prefixes that it
> generates with the BGP NO_EXPORT community. This will not be enabled
> by default [snip]"
> https://www.noction.com/blog/route-optimizers
> Mar 27, 2015
>
> Due to NO_EXPORT not being set in the default configuration, there are
> probably if not certainly many unsuspecting network engineers who end up
> deploying this software - without ever even considering - to change that
> one setting in the configuration.
>
> Fast forward a few years and a few incidents, on the topic of default
> settings, following 

Re: BGP route hijack by AS10990

2020-08-03 Thread Job Snijders
On Mon, Aug 03, 2020 at 02:36:25PM +0200, Alex Band wrote:
> According to the information I received from the community[1], you
> should read PR1461602 and PR1309944 before deploying.
> 
> [1] https://rpki.readthedocs.io/en/latest/rpki/router-support.html

My take on PR1461602 is that it can be ignored, as it appears to only
manifest itself in a mostly cosmetic way: initial RTR session
establishment takes multiple minutes, but once RTR sessions are up
things work smoothly.

Under no circumstances should you enable RPKI ROV functionality on boxes
that suffer from PR1309944. That one is a real showstopper.

Kind regards,

Job


Re: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-03 Thread Job Snijders
Dear Ryan,

I have come to believe this is a Noction IRP specific issue.

On Sat, Aug 01, 2020 at 01:29:59PM -0700, Ryan Hamel wrote:
> I disagree on the fact that it is not fair to the BGP implementation
> ecosystem, to enforce a single piece of software to activate the
> no-export community by default

I am not exaggerating when I say that *ONLY* the name of this software
is mentioned when incidents like this happen. Other route manipulation
tools either use different (safer) technologies and/or mark routes with
NO_EXPORT.

Every few weeks I am in phone calls with new people who happened to
originate hijacks which existed for traffic engineering purposes and
without fail it is always the same software from the same company that
originated the rogue routes.

It seems more efficient if the software were to ship with improved
default settings than me explaining the problem ad nauseam to every new
engineer after they unsuspectingly stepped into this trap.

Not extremely dangerous by default, is it really too much to ask?

> Also, wasn't it you that said Cisco routers had a bug in ignoring
> NO_EXPORT? Would you go on a rant with Cisco, even if Noction add that
> enabled checkbox by default?

Cisco and Noction are separate companies, regardless of what Noction
does, the Cisco implementations are expected to conform to their own
documentation and the BGP-4 specifications.

1/ Without setting NO_EXPORT by default, route manipulation software
   is very dangerous.

2/ Even if NO_EXPORT is set, software defects happen from time to time
   and the existence of fake more-specific routes in a specific routing
   domain can have dire consequences (as has been demonstrated time
   after time).

Not setting NO_EXPORT as a default is setting your customers up for
failure. If your car's seatbelt accidentally breaks, it wouldn't
logically follow to also remove the airbags.

> Why are you not on your soap box about BIRD, FRrouting, OpenBGPd,
> Cisco, Juniper, etc... about how they can possibly allow every day
> screw ups to happen

It is interesting you mention these names, as all of them in recent
years went through a process to revisit some unsafe default behavior
and address it. These companies have far larger userbases, so if they
can do it, anyone can do it!

For the longest time many BGP implementations - BY DEFAULT - would
propagate any and all routes from EBGP peers to all other IBGP and EBGP
peers. The community identified this to be a root cause for many
incidents, and eventually came up with a change to the BGP-4
specification which codifies that the default should be safe instead of
dangerous. https://tools.ietf.org/html/rfc8212

- BIRD introduced support for RFC 8212 in BIRD 2 and higher
- FRRouting changed the defaults in 7.4 and higher
- Cisco IOS XR had RFC 8212 right from the start
- OpenBGPD changed its default behavior in version 6.4
- Juniper is still working on this, in the meantime a SLAX script can be
  used to emulate RFC 8212 behavior: 
https://github.com/packetsource/rfc8212-junos
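
The RFC 8212 default can be sketched in a few lines of Python. The rule from the RFC is that, on an EBGP session, routes are neither accepted nor advertised unless an explicit policy has been configured. The Session class, the policy callables and the function names below are hypothetical, purely for illustration; no vendor implements it this way.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Policy = Callable[[str], bool]  # takes a prefix, returns True to permit


@dataclass
class Session:
    is_ebgp: bool
    import_policy: Optional[Policy] = None
    export_policy: Optional[Policy] = None


def import_allowed(session, prefix):
    """Is a received route eligible for the decision process?"""
    if session.import_policy is None:
        # RFC 8212: no explicit import policy on EBGP means nothing is eligible.
        return not session.is_ebgp
    return session.import_policy(prefix)


def export_allowed(session, prefix):
    """May a route be advertised on this session?"""
    if session.export_policy is None:
        # RFC 8212: no explicit export policy on EBGP means nothing is sent.
        return not session.is_ebgp
    return session.export_policy(prefix)


if __name__ == "__main__":
    bare_ebgp = Session(is_ebgp=True)
    print(import_allowed(bare_ebgp, "192.0.2.0/24"))  # unconfigured: denied
```

The unsafe legacy behavior corresponds to returning True whenever no policy is set; the change is small, but it turns a forgotten filter into a session that carries nothing instead of one that carries everything.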

It is well understood how default settings strongly shape the success or
failure of deployments. This is no different.

Kind regards,

Job


Re: RPKI TAs

2020-08-03 Thread Job Snijders
On Mon, Aug 03, 2020 at 08:17:55AM -0500, John Kristoff wrote:
> On Sun, 2 Aug 2020 18:52:11 +
> Randy Bush  wrote:
> 
> > not to mention the ARIN stupidity
> 
> Notwithstanding the RPA, downloading ARIN's TAL is straightforward:
> 
> As documented here:
> 
>   
> 
> One can wget, curl, or whatever this:
> 
>   

I dunno, 'straightforward' to me would mean the ARIN TA is installed by
default when you install a RPKI Cache Validator implementation, all
without requiring lawyers well-versed in both your native language AND
in the American legal system.

I can do DNSSEC, RPKI ROV, Signify, Web PKIs like TLS - all without
kludges. Here is a video (10 min) where I show how you can bootstrap a
system from 0 to 100 without relying party agreements:
https://www.youtube.com/watch?v=oBwAQep7Q7o

The highlight of the video is when I access ARIN's website over HTTPS,
after having resolved their webserver's IP address with a DNSSEC
validating recursor... to discover I need to get a lawyer to download a
.tal file which exists to protect *ARIN* members. Shouldn't ARIN members
demand that the process is as frictionless as possible? (both the new
and old RPA are the opposite of frictionless).

ARIN members (the RPKI users) depend on network operators both inside
and outside the ARIN region to honor their ROAs. The internet is global.
The ARIN ROA's will not be honored if the ARIN .tal file is missing. The
ARIN .tal file is missing because it cannot be included in open source
software without making things very awkward.

It is an insane situation. ARIN resource holders using ARIN's RPKI TA
are measurably *less* protected than their RIPE, APNIC, LACNIC and
AFRINIC counterparts.

Get this:

When you transfer your IP space away from ARIN, to *ANY* other RIR,
you'll derive *MORE* benefits from your RPKI ROA signing efforts. You
don't even need to renumber out of your space to improve your routing
security posture!

I believe ARIN's policy to institute a significant legal barrier to RPKI
infrastructure negatively impacts ARIN's own members.

Imagine having to sign a contract with DigiCert to obtain the public key
to be able to visit https://paypal.com. Ha-ha-ha-ha... folly. It would
be bad for business.

Kind regards,

Job


Re: BGP route hijack by AS10990

2020-08-03 Thread Mark Tinka



On 3/Aug/20 14:57, Tom Beecher wrote:

> Agreed. 
>
> However, every time we go on this Righteous Indignation of Should Do
> crusade, it would serve us well to stop and remember that in every one
> of our jobs, at many points in our careers, we have been faced with a
> situation where something we SHOULD do ends up being deferred for
> something we MUST do. It is a universal truth that there will never
> be enough time and resources to complete both, especially not in our
> current business environment where the only thing that matters is the
> numbers for the next quarter. Sometimes as engineers we have to make
> choices,  sometimes choices are imposed on us by pointy hairs. 
>
> Telia made a mistake. They owned it and will endeavor to do better.
> What more can be asked?

I think we've now gone past Telia's mistake and are considering what we
can all do as BGP actors to prevent this particular issue from making a
reprise.

Agreed, we all have bits we need to prioritize our time on. But the BGP
requires concerted effort of all actors on the Internet. How an operator
in Omsk works with BGP has a potentially direct impact on another
operator in Ketchikan. So whether I choose to spend more time on
attending conferences vs. upgrading my core network, neither of those
has an impact on the BGP. But if I'm going to not take BGP filtering as
seriously as I should, the engineer, their employer and customer,
sitting all the way in Yangon, could feel that.

The devices we use, nowadays, are only as useful as their connectedness.
No connectivity, and they're just bricks. Particularly in these
Coronavirus times, the Internet is what is keeping economies alive, and
folk employed. So rather than go back to the old days of, "We are busy,
it is what it is", let's figure out how to make it better. We don't have
to fix all of the Internet's governance issues this century - let's just
start with making this "BGP optimizer danger" fix + "all operators
should filter more deliberately" a reality.

Mark.



Re: BGP route hijack by AS10990

2020-08-03 Thread Mark Tinka


On 1/Aug/20 02:44, Rafael Possamai wrote:

> To your point with regards to multiple failures combined causing an
> outage, here's some basic reading on the Swiss cheese model:
> https://en.wikipedia.org/wiki/Swiss_cheese_model

You just reminded me of the defense's strategy in the court case against
HealthSouth's CEO Richard Scrushy, when they used a picture of a rat
carrying Swiss cheese (full of holes) in their closing arguments to the
jurors, to discredit the prosecution :-).

Mark.


Re: BGP route hijack by AS10990

2020-08-03 Thread Baldur Norddahl
On Mon, Aug 3, 2020 at 3:54 PM Job Snijders  wrote:

> On Mon, Aug 03, 2020 at 02:36:25PM +0200, Alex Band wrote:
> > According to the information I received from the community[1], you
> > should read PR1461602 and PR1309944 before deploying.
> >
> > [1] https://rpki.readthedocs.io/en/latest/rpki/router-support.html
>
> My take on PR1461602 is that it can be ignored, as it appears to only
> manifest itself in a mostly cosmetic way: initial RTR session
> establishment takes multiple minutes, but once RTR sessions are up
> things work smoothly.
>
> Under no circumstances should you enable RPKI ROV functionality on boxes
> that suffer from PR1309944. That one is a real showstopper.
>
>
We suffered a series of crashes that led to JTAC recommending disabling
RPKI. We had a core dump which matches PR1332626 which is confidential, so
I have no idea what it is about. Apparently what happened was the server
running the RPKI validation server rebooted and the service was not
configured to automatically restart. Also we did not have it redundant nor
did we monitor the service. So we had no working RPKI validation server and
that apparently caused the MX204 to become unstable in various ways. It
might run for a day but it would do all sorts of things like packet loss,
delays and generally be "strange". The first crash caused BGP, ssh and
subscriber management to be down, but LDP, OSPF, SNMP to be up. It became a
black hole we could not login to.  The worst possible kind of crash for a
router. We had to go onsite and pull the power.

The router appears to run fine after disabling RPKI. I suppose starting the
validation service may also fix the issue. But I am not going to go there
until I know what is in that PR and also I feel the RPKI function needs to
be failsafe before we can use it. I know we are at fault for not deploying
the validation service in a redundant setup and for failing at monitoring
the service. But we did so because we thought it not to be too important,
because a failed validation service should simply lead to no validation,
not a crashed router.

This is on JUNOS 20.1R1.11.

Regards,

Baldur
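
To illustrate the expectation that a failed validation service should simply lead to no validation, here is a minimal sketch in Python of route origin validation along the lines of RFC 6811. It is simplified (AS 0 and AS_SET handling are ignored), and the function names, prefixes and ASNs are hypothetical documentation values, not anything taken from Junos or a real validator.

```python
import ipaddress


def validate(prefix, origin_asn, vrps):
    """Classify a route as 'valid', 'invalid' or 'not-found'.

    `vrps` is a list of (prefix, max_length, asn) tuples as a validator
    would deliver them over RTR.
    """
    net = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_len, asn in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        if net.version != vrp_net.version or not net.subnet_of(vrp_net):
            continue
        covered = True
        if asn == origin_asn and net.prefixlen <= max_len:
            return "valid"
    return "invalid" if covered else "not-found"


def accept(prefix, origin_asn, vrps):
    """Typical policy: drop only routes that are positively invalid."""
    return validate(prefix, origin_asn, vrps) != "invalid"


if __name__ == "__main__":
    vrps = [("192.0.2.0/24", 24, 64496)]
    print(validate("192.0.2.0/24", 64496, vrps))
    # Validator unreachable: the VRP set is empty, nothing can be invalid.
    print(accept("192.0.2.0/24", 64511, []))
```

With an empty VRP set every route comes out as not-found and is accepted, so losing the validator degrades to plain unvalidated BGP rather than to an outage.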


Re: BGP route hijack by AS10990

2020-08-03 Thread Mark Tinka


On 3/Aug/20 17:09, Baldur Norddahl wrote:

>
> We suffered a series of crashes that led to JTAC recommending
> disabling RPKI. We had a core dump which matches PR1332626 which is
> confidential, so I have no idea what it is about. Apparently what
> happened was the server running the RPKI validation server rebooted
> and the service was not configured to automatically restart. Also we
> did not have it redundant nor did we monitor the service. So we had no
> working RPKI validation server and that apparently caused the MX204 to
> become unstable in various ways. It might run for a day but it would
> do all sorts of things like packet loss, delays and generally be
> "strange". The first crash caused BGP, ssh and subscriber management
> to be down, but LDP, OSPF, SNMP to be up. It became a black hole we
> could not login to.  The worst possible kind of crash for a router. We
> had to go onsite and pull the power.
>
> The router appears to run fine after disabling RPKI. I suppose
> starting the validation service may also fix the issue. But I am not
> going to go there until I know what is in that PR and also I feel the
> RPKI function needs to be failsafe before we can use it. I know we are
> at fault for not deploying the validation service in a redundant setup
> and for failing at monitoring the service. But we did so because we
> thought it not to be too important, because a failed validation
> service should simply lead to no validation, not a crashed router.
>
> This is on JUNOS 20.1R1.11.

That's a really nasty bug.

Loss of an RTR session shouldn't kill the box, even if you are running
only one validator. If you can share details about why this happens when
you get them, that would be most helpful.

I'd be curious to know whether this is dependent on a specific
validator, or all of them.

Are there bits in Junos 20 that you can't get in fixed versions of 19?

Mark.


Re: RPKI TAs

2020-08-03 Thread Randy Bush
>   why is it so hard that all RIRs make their TAL files available under 
> the same URL path but different hosts, e.g., https://ripe.net/rpki/tal, 
> https://arin.net/rpki/tal ?

no, you are supposed to get TRUST material from alex's secret stash.
sigh.

it should be a dnssec lookup of ripe.net, tls secured lookup, find a TAL
as defined in the RFCs, and fetch it via tls.

randy


Re: RPKI TAs

2020-08-03 Thread Owen DeLong



> On Aug 3, 2020, at 07:54 , Job Snijders  wrote:
> 
> On Mon, Aug 03, 2020 at 08:17:55AM -0500, John Kristoff wrote:
>> On Sun, 2 Aug 2020 18:52:11 +
>> Randy Bush  wrote:
>> 
>>> not to mention the ARIN stupidity
>> 
>> Notwithstanding the RPA, downloading ARIN's TAL is straightforward:
>> 
>> As documented here:
>> 
>>  
>> 
>> One can wget, curl, or whatever this:
>> 
>>  
> 
> I dunno, 'straightforward' to me would mean the ARIN TA is installed by
> default when you install a RPKI Cache Validator implementation, all
> without requiring lawyers well-versed in both your native language AND
> in the American legal system.

I was able to download it just now without any authentication, lawyers,
contracts, or anything else… What more is it you are asking for?

> I can do DNSSEC, RPKI ROV, Signify, Web PKIs like TLS - all without
> kludges. Here is a video (10 min) where I show how you can bootstrap a
> system from 0 to 100 without relying party agreements:
> https://www.youtube.com/watch?v=oBwAQep7Q7o

I just obtained the ARIN TAL without ever signing an RPA. What am I missing?

All I did was follow the URL John provided.

Owen



Re: RPKI TAs

2020-08-03 Thread Matt Corallo
While I certainly agree with you, I have a certainly-naive question - what is
the difference between ARIN's and RIPE's T&C:

Aug  3 19:07:15 rpki-validator rpki-client[16164]: The RIPE NCC Certification Repository is subject to Terms and Conditions
Aug  3 19:07:15 rpki-validator rpki-client[16164]: See http://www.ripe.net/lir-services/ncc/legal/certification/repository-tc

As far as I understand, to use RIPE's RPKI repo I have to similarly agree with
RIPE's legal contract as well, though they are somewhat less aggressive about
making sure I check a box before using it.

Matt

On 8/3/20 10:54 AM, Job Snijders wrote:
> On Mon, Aug 03, 2020 at 08:17:55AM -0500, John Kristoff wrote:
>> On Sun, 2 Aug 2020 18:52:11 +
>> Randy Bush  wrote:
>>
>>> not to mention the ARIN stupidity
>>
>> Notwithstanding the RPA, downloading ARIN's TAL is straightforward:
>>
>> As documented here:
>>
>>   
>>
>> One can wget, curl, or whatever this:
>>
>>   
> 
> I dunno, 'straightforward' to me would mean the ARIN TA is installed by
> default when you install a RPKI Cache Validator implementation, all
> without requiring lawyers well-versed in both your native language AND
> in the American legal system.
> 
> I can do DNSSEC, RPKI ROV, Signify, Web PKIs like TLS - all without
> kludges. Here is a video (10 min) where I show how you can bootstrap a
> system from 0 to 100 without relying party agreements:
> https://www.youtube.com/watch?v=oBwAQep7Q7o
> 
> The highlight of the video is when I access ARIN's website over HTTPS,
> after having resolved their webserver's IP address with a DNSSEC
> validating recursor... to discover I need to get a lawyer to download a
> .tal file which exists to protect *ARIN* members. Shouldn't ARIN members
> demand that the process is as frictionless as possible? (both the new
> and old RPA are the opposite of frictionless).
> 
> ARIN members (the RPKI users) depend on network operators both inside
> and outside the ARIN region to honor their ROAs. The internet is global.
> The ARIN ROAs will not be honored if the ARIN .tal file is missing. The
> ARIN .tal file is missing because it cannot be included in open source
> software without making things very awkward.
> 
> It is an insane situation. ARIN resource holders using ARIN's RPKI TA
> are measurably *less* protected than their RIPE, APNIC, LACNIC and
> AFRINIC counterparts.
> 
> Get this:
> 
> When you transfer your IP space away from ARIN, to *ANY* other RIR,
> you'll derive *MORE* benefits from your RPKI ROA signing efforts. You
> don't even need to renumber out of your space to improve your routing
> security posture!
> 
> I believe ARIN's policy to institute a significant legal barrier to RPKI
> infrastructure negatively impacts ARIN's own members.
> 
> Imagine having to sign a contract with DigiCert to obtain the public key
> to be able to visit https://paypal.com. Ha-ha-ha-ha... folly. It would
> be bad for business.
> 
> Kind regards,
> 
> Job
> 


BGP full feed for testing purposes

2020-08-03 Thread Blažej Krajňák

Hello,

I'm wondering if there is any public service I can get a full BGP feed
from for testing purposes.


I admin multi-homed AS50242 with two default routes for now (fail-over).
I'm going to prepare a new routing setup with extended validation, so a real
full BGP feed would be useful. Yes, I can ask my upstream provider for
it, but I don't want to change settings in the production setup.



Thanks

Regards,
Blažej Krajňák


Re: BGP full feed for testing purposes

2020-08-03 Thread Brendan Carlson
Set up a Vultr instance and you can get a full feed from them for testing.
I've done this for a route collector and it worked well.

On Mon, Aug 3, 2020, 13:16 Blažej Krajňák  wrote:

> Hello,
>
> I'm wondering if there is any public service I can get a full BGP feed
> from for testing purposes.
>
> I admin multi-homed AS50242 with two default routes for now (fail-over).
> I'm going to prepare a new routing setup with extended validation, so a real
> full BGP feed would be useful. Yes, I can ask my upstream provider for
> it, but I don't want to change settings in the production setup.
>
>
> Thanks
>
> Regards,
> Blažej Krajňák
>


Re: BGP full feed for testing purposes

2020-08-03 Thread Josh Luthman
Greg Sowell helps you out here:

http://gregsowell.com/?page_id=5771

Josh Luthman
Office: 937-552-2340
Direct: 937-552-2343
1100 Wayne St
Suite 1337
Troy, OH 45373


On Mon, Aug 3, 2020 at 4:19 PM Brendan Carlson 
wrote:

> Set up a Vultr instance and you can get a full feed from them for testing.
> I've done this for a route collector and it worked well.
>
> On Mon, Aug 3, 2020, 13:16 Blažej Krajňák  wrote:
>
>> Hello,
>>
>> I'm wondering if there is any public service I can get a full BGP feed
>> from for testing purposes.
>>
>> I admin multi-homed AS50242 with two default routes for now (fail-over).
>> I'm going to prepare a new routing setup with extended validation, so a real
>> full BGP feed would be useful. Yes, I can ask my upstream provider for
>> it, but I don't want to change settings in the production setup.
>>
>>
>> Thanks
>>
>> Regards,
>> Blažej Krajňák
>>
>


Re: RPKI TAs

2020-08-03 Thread Randy Bush
> I dunno, 'straightforward' to me would mean the ARIN TA is installed by
> default when you install a RPKI Cache Validator implementation

uh, i want a trustable download of trust anchors.  and it ain't from
vendors.

but yes, arin's legal dos is typical arin.  but, if i ignore the bumph,
i can connect to their web site dnssec, tls, ... and get a viable TAL
which meets RFC specs.  that seems to me more than one can say for some
other RIRs.

randy


Suggestions for DIA link in Alamo, CA area

2020-08-03 Thread Nathanael Cariaga
Guys, I'm looking for a 300-500 Mbps DIA circuit (with /28 IPs) to be
installed in Alamo, CA.  Any suggestions?