Hi Andrey,

Can you share which Junos version you had these issues on?



> On 23 Jul 2024, at 18:15, Andrey Kostin via juniper-nsp 
> <juniper-nsp@puck.nether.net> wrote:
> 
> Tried to enable rib-sharding on several routers in the last few weeks and
> got a bunch of problems.
> First, a PE router with rib-sharding was losing connectivity to indirect
> routes after every MPLS LSP autobandwidth adjustment. Let's say PE-A has a
> static route for X.X.X.X/29 pointing to IP Y.Y.Y.1, reachable via a
> connected interface with IP Y.Y.Y.0/31. PE-A advertises X.X.X.X/29 with
> next-hop Y.Y.Y.1, and Y.Y.Y.0/31 with next-hop Z.Z.Z.Z/32 from the lo0
> address used as the iBGP session source. PE-B resolves Z.Z.Z.Z/32 via an
> RSVP LSP with label L0, and X.X.X.X/29 is resolved via Y.Y.Y.1 via
> Z.Z.Z.Z/32 to the same label L0. When a regular autobandwidth adjustment
> happens, PE-B calculates and signals the new path with label L1 using
> make-before-break, and then switches traffic to the new path by updating
> the label from L0 to L1 for the prefixes that are using it. It turns out
> that the label is updated for Z.Z.Z.Z/32 and Y.Y.Y.0/31, but not for
> X.X.X.X/29. After the hold-down timer expires, PE-B signals deletion of the
> path with label L0 but still uses L0 for X.X.X.X/29, and traffic is
> blackholed because the downstream router has already deleted the label.
> Disabling rib-sharding on PE-B solved this issue right away.
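
Inline, to check that I am reading this first failure correctly: below is a toy Python model of the resolution chain on PE-B. It is only a sketch and says nothing about how rpd is implemented. The names `Route`, `build_table`, `chain`, `relabel` and `blackholed` are hypothetical, and the `max_depth` parameter is just an assumed stand-in for whatever stops the label update from reaching the second level of indirection.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Route:
    via: Optional[str]  # prefix this route resolves through; None = resolved by the LSP itself
    label: str          # forwarding label currently cached for this route


def build_table() -> Dict[str, Route]:
    """PE-B's view before the autobandwidth adjustment: everything uses L0."""
    return {
        "Z.Z.Z.Z/32": Route(via=None, label="L0"),
        "Y.Y.Y.0/31": Route(via="Z.Z.Z.Z/32", label="L0"),
        "X.X.X.X/29": Route(via="Y.Y.Y.0/31", label="L0"),  # next-hop Y.Y.Y.1
    }


def chain(table: Dict[str, Route], prefix: str) -> List[str]:
    """Prefixes walked from `prefix` down to the LSP-resolved root."""
    path = [prefix]
    while table[path[-1]].via is not None:
        path.append(table[path[-1]].via)
    return path


def relabel(table: Dict[str, Route], root: str, new_label: str,
            max_depth: Optional[int] = None) -> None:
    """Switch every route that resolves through `root` to `new_label`.

    max_depth=None models the expected behaviour (all dependents updated).
    A finite max_depth is an ASSUMED way to mimic the reported symptom.
    """
    for prefix, route in table.items():
        path = chain(table, prefix)
        depth = len(path) - 1  # 0 for the root itself
        if path[-1] == root and (max_depth is None or depth <= max_depth):
            route.label = new_label


def blackholed(table: Dict[str, Route], live_labels: Set[str]) -> List[str]:
    """Prefixes still pointing at a label the downstream router has deleted."""
    return sorted(p for p, r in table.items() if r.label not in live_labels)


if __name__ == "__main__":
    ok = build_table()
    relabel(ok, "Z.Z.Z.Z/32", "L1")
    print(blackholed(ok, {"L1"}))

    bad = build_table()
    relabel(bad, "Z.Z.Z.Z/32", "L1", max_depth=1)
    print(blackholed(bad, {"L1"}))
```

With the full update nothing is left on L0 after the old path is torn down. With the update limited to one level of indirection, only X.X.X.X/29 stays on L0, which matches the blackholing you describe.
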
> Next, a memory leak happened on a non-RR router, with memory usage growing
> from 17% to 95% in three weeks. After disabling rib-sharding, memory usage
> is at 14% so far.
> And finally, two regional route-reflectors without rib-sharding, peered with
> central RRs that had sharding enabled, went to 100% CPU utilization right
> after the BGP sessions were established. This caused very slow route updates
> with intermittent connectivity, even for routes that hadn't changed. Changes
> were reverted on one of these routers, and the other one was running at 100%
> RE CPU until rib-sharding was disabled on one of the central RRs. After
> disabling rib-sharding on one central RR, CPU on the peered regional RR
> dropped to 30-40% but was still higher than usual. Only when rib-sharding
> was disabled on the second central RR did CPU utilization return to the
> normal 20-25%.
> 
> YMMV, but I don't think we're going to try this feature again in the 
> foreseeable future.
> 
> Kind regards,
> Andrey
> 
> Luca Salvatore wrote on 2024-06-26 15:18:
>> For what it's worth, we're happily running rib-sharding on many MX10K
>> devices on 22.2R3-S2.
>> NSR is fine and we haven't had any issues
>>> On Sun, Jun 2, 2024 at 10:26 PM Gustavo Santos via juniper-nsp
>>> <juniper-nsp@puck.nether.net> wrote:
>>> I tried it again on JUNOS 21.4R3-S3.4, hit some bugs that crashed the rpd
>>> daemon, and gave up.
>>> We will try it again later this year. If update threading / rib-sharding
>>> works as expected, it will be better than having non-stop routing running.
>>> Last time we had an issue caused by a BGP routing update, it took about 50
>>> minutes to advertise all needed routes to one of the transit providers,
>>> because of the time it takes to send a full routing table feed to remote
>>> peers.
>>> On Fri, 10 May 2024 at 16:45, Andrey Kostin via juniper-nsp
>>> <juniper-nsp@puck.nether.net> wrote:
>>>> Hi juniper-nsp,
>>>> Just hit exactly the same issue as described in the message found in the
>>>> list archives:
>>>> Gustavo Santos
>>>> Mon Jan 4 15:13:18 EST 2021
>>>> Hi,
>>>> We got another MX10003 and we are updating it before it goes into
>>>> production.
>>>> Reading the 19.4R3 release notes, we noticed two features,
>>>> update-threading and rib-sharding, and I really liked what they
>>>> "promise": faster BGP updates.
>>>> But there is a catch. We can't use these new features with non-stop
>>>> routing enabled.
>>>> The question is: are these features worth the loss of non-stop routing?
>>>> Regards
>>>> "
>>>> bgp {
>>>>     ##
>>>>     ## Warning: Can't be configured together with routing-options nonstop-routing
>>>>     ##
>>>>     rib-sharding;
>>>>     ##
>>>>     ## Warning: Update threading can't be configured together with routing-options nonstop-routing
>>>>     ##
>>>>     update-threading;
>>>> }
>>>> "
>>>> That message doesn't seem to have received any response.
>>>> However, I found an explanation at the bottom of this page:
>>>> https://www.juniper.net/documentation/us/en/software/junos/cli-reference/topics/ref/statement/rib-sharding-edit-protocols-bgp.html
>>>> Support for NSR with sharding introduced in Junos OS Release 22.2.
>>>> BGP sharding supports IPv4, IPv6, L3VPN and BGP-LU from Junos OS Release
>>>> 20.4R1.
>>>> Still need to test and confirm on this platform, but on another router
>>>> it already works.
>>>> --
>>>> Kind regards,
>>>> Andrey
> 
> _______________________________________________
> juniper-nsp mailing list juniper-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp