show route all filter ...
Hi,

We use a special command (see below) to view information about prefixes carrying particular BGP communities. When such a command is run, we see the following in the logs:

Aug 27 11:50:17 ekt-rsm1 bird: Filter NULL did not return accept nor reject. Make up your mind
Aug 27 11:50:17 ekt-rsm1 last message repeated 2690 times

Could you explain why? Is there any way to avoid such messages while staying at the current logging level?

Aug 27 11:50:17 ekt-rsm1 bird: CLI connect
Aug 27 11:50:17 ekt-rsm1 bird: CLI: show route all filter {if (0,48642) ~ bgp_community then accept;} table master
Aug 27 11:50:17 ekt-rsm1 bird: Filter NULL did not return accept nor reject. Make up your mind
Aug 27 11:50:17 ekt-rsm1 last message repeated 2690 times
Aug 27 11:50:17 ekt-rsm1 bird: CLI connection closed

--
Best regards,
Mikhail A. Grishin
m.gris...@msk-ix.ru
Re: BFD implementation in 1.4.0
Hi,

I wanted to share some results. With version 1.4.2 we see solid uptime for BFD sessions. This output was collected on 15 August:

  bird> show bfd sessions
  bfd1:
  IP address                Interface  State      Since                Interval  Timeout
  193.232.245.134           bce1       Down       2014-06-09 11:17:23    1.000    0.000
  193.232.244.207           bce1       Up         2014-06-09 11:17:24    1.000    5.000
  193.232.245.54            bce1       Up         2014-06-09 11:17:24    1.000    5.000
  193.232.244.80            bce1       Up         2014-06-13 15:24:28    1.000    5.000
  193.232.245.198           bce1       Up         2014-06-23 10:37:03    1.000    5.000
  193.232.244.88            bce1       Up         2014-07-30 14:33:23    1.000    5.000
  193.232.245.184           bce1       Up         2014-06-09 11:17:24    1.000    5.000
  193.232.245.133           bce1       Up         2014-06-17 16:02:02    1.000    5.000
  bird>

These are sessions with real customers: real routers that pass traffic in a production environment and are controlled by different companies. The sessions were established with our test box (no prefixes announced via BGP, no reconfiguration changes).

Then we tried to migrate these BFD sessions to our production route servers and ran into issues related to our network infrastructure. We have two separate IP networks in the same VLAN. Each customer has two peering IPs: one from the first IP subnet and one from the second. One IP is assigned as primary and the other as secondary on the same interface on the customer side.

Problem: our customers' routers can communicate via BFD only with the Route Server located in the same IP subnet as the primary IP address on their interface. With the Route Server in the other IP subnet they cannot communicate via BFD, because the source IP address of the BFD packets is wrong: it equals the primary IP, not the secondary. This issue is seen on Cisco and Juniper. Some platforms allow redefining the IP address used for BFD, but as far as we can see, nobody could run BFD in both IP subnets at the same time.

Ondrej Zajicek wrote, 02.04.2014 22:32:
On Wed, Apr 02, 2014 at 05:19:06PM +0400, Mikhail A. Grishin wrote:
Ondrej Zajicek wrote, 01.04.2014 20:06:
On Wed, Mar 26, 2014 at 05:09:25PM +0400, Mikhail A. Grishin wrote:

1) How can we view, via birdc, the state of a BFD-enabled peer in terms of BFD state (up/down)?

  bird> show bfd sessions
  bfd1:
  IP address                Interface  State      Since        Interval  Timeout
  10.0.0.23                 eth0       Up         2014-03-31     0.200    0.600

Please add show bfd to the context menu:

Thanks, I missed that.

2) When BFD with some BGP peer is in the Up state, how can the BFD-related parameters for that peer be viewed via birdc? Examples of similar output from Cisco/Juniper are attached.

Currently not available, but 'show bfd sessions' shows almost all relevant info anyway.

OK, thanks. Any plans to improve?

Probably.

4) (Minor) 'bird> show protocols all bfd1' shows some Routes counters. Does that make sense?

Well, no. Note that it also does not make sense for the 'device' protocol, but nobody ever complained about that. ;-)

The kernel and direct protocols' output doesn't show Routes counters :)

Yes, if they are up:

  bird> show protocols all
  name     proto    table    state  since       info
  device1  Device   master   up     13:08:02
    Preference:     240
    Input filter:   ACCEPT
    Output filter:  REJECT
    Routes:         0 imported, 0 exported, 0 preferred
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:              0          0          0          0          0
      Import withdraws:            0          0        ---          0          0
      Export updates:              0          0          0        ---          0
      Export withdraws:            0        ---        ---        ---          0

  direct1  Direct   master   up     13:08:02
    Preference:     240
    Input filter:   ACCEPT
    Output filter:  REJECT
    Routes:         5 imported, 0 exported, 5 preferred
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:              5          0          0          0          5
      Import withdraws:            0          0        ---          0          0
      Export updates:              0          0          0        ---          0
      Export withdraws:            0        ---        ---        ---          0

  kernel1  Kernel   master   up     13:08:02
    Preference:     10
    Input filter:   ACCEPT
    Output filter:  (unnamed)
    Routes:         0 imported, 8 exported, 0 preferred
    Route change stats:     received   rejected   filtered    ignored   accepted
      Import updates:              0          0          0          0          0
      Import withdraws:            0          0        ---          0          0
      Export updates:             20         10          0        ---         10
      Export withdraws:            2
Re: BFD implementation in 1.4.0
Ondrej Zajicek wrote, 01.04.2014 20:06:
On Wed, Mar 26, 2014 at 05:09:25PM +0400, Mikhail A. Grishin wrote:

1) How can we view, via birdc, the state of a BFD-enabled peer in terms of BFD state (up/down)?

  bird> show bfd sessions
  bfd1:
  IP address                Interface  State      Since        Interval  Timeout
  10.0.0.23                 eth0       Up         2014-03-31     0.200    0.600

Please add show bfd to the context menu:

  bird> show ?
  show interfaces                       Show network interfaces
  show memory                           Show memory usage
  show ospf ...                         Show information about OSPF protocol
  show protocols [protocol | pattern]   Show routing protocols
  show roa ...                          Show ROA table
  show route ...                        Show routing table
  show static [name]                    Show details of static protocol
  show status                           Show router status
  show symbols ...                      Show all known symbolic names

2) When BFD with some BGP peer is in the Up state, how can the BFD-related parameters for that peer be viewed via birdc? Examples of similar output from Cisco/Juniper are attached.

Currently not available, but 'show bfd sessions' shows almost all relevant info anyway.

OK, thanks. Any plans to improve?

3) We enabled BFD for a BGP peer while the BGP protocol was Established. We see BFD Down packets with tcpdump, and BGP remains Established. We can't find any info in the logs about the BFD state for that BGP peer; this is probably not normal.

That is expected. Only a BFD transition from Up to Down is supposed to shut down the BGP or OSPF session, while general unavailability of BFD (or a permanent AdminDown state) on the neighbor is not an obstacle for BGP or OSPF. See RFC 5882 for details.

The other side configured BFD several days later. We don't see any information in the logs about the BFD state change for that peer (from Down to Up). This is probably also not normal.

You must enable 'debug { events }' in BFD for logging Up/Down events, similarly to BGP.

Already did this.

4) (Minor) 'bird> show protocols all bfd1' shows some Routes counters. Does that make sense?

Well, no. Note that it also does not make sense for the 'device' protocol, but nobody ever complained about that. ;-)

The kernel and direct protocols' output doesn't show Routes counters :)

5) Is there any way to configure different BFD timers for different BGP peers reachable via the same interface?

Not currently. I hesitated between putting BFD parameters into a BFD interface block in the BFD protocol (as done) and into a BFD request block in client protocols:

  protocol bgp { bfd { ... }; }

I chose the first approach, but perhaps it is a good idea to support both approaches.

Maybe yes, both approaches are good. Our tests do not yet show that this is a requirement for us, but we get questions from customers about different sets of timers.

P.S. One of our customers pointed to: http://tools.ietf.org/html/draft-ietf-bfd-intervals-00
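
The rule quoted above (only an Up to Down transition tears the client session down) can be sketched in a few lines. This is only an illustration of the described behaviour, not BIRD code; the function name and the plain-string state names are assumptions made for the example.

```python
# Illustrative sketch only: models the rule described in the thread
# (see RFC 5882), not BIRD's actual implementation.
# State names are plain strings chosen for this example.

def should_reset_client_session(prev_state: str, new_state: str) -> bool:
    """Return True only when BFD goes from Up to Down.

    A session that never came up (Down -> Down) or a neighbor sitting in
    AdminDown must not affect an already Established BGP/OSPF session.
    """
    return prev_state == "Up" and new_state == "Down"


if __name__ == "__main__":
    for prev, new in [("Down", "Down"), ("Down", "Up"), ("Up", "Down")]:
        print(prev, "->", new, ":", should_reset_client_session(prev, new))
```

This matches the observation in point 3: the peer was sending BFD Down packets the whole time, so there was never an Up to Down transition and BGP stayed Established.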
Re: BFD implementation in 1.4.0
Hello again,

Today our BIRD daemon used for BFD tests crashed and dumped core. I'll send the config, log and core files to the developers in a separate e-mail. There were around 5-7 BFD sessions active. We started the daemon again with 'debug protocols all;'.

Also, one more question in addition to the previous mail:

5) Is there any way to configure different BFD timers for different BGP peers reachable via the same interface?

Mikhail A. Grishin wrote, 26.03.2014 17:09:

Hello,

Ondrej Zajicek wrote, 26.03.2014 4:21:
On Thu, Mar 20, 2014 at 02:14:47PM +0400, Aleksey Berezin wrote:

Recently I tried to test BFD implementation in 1.4.0 BIRD release.

I am glad you tried the new BFD implementation, your post is perhaps the first public response to it.

Ondrej,

We are also starting to test the BFD implementation and have some questions/suggestions.

1) How can we view, via birdc, the state of a BFD-enabled peer in terms of BFD state (up/down)?

2) When BFD with some BGP peer is in the Up state, how can the BFD-related parameters for that peer be viewed via birdc? Examples of similar output from Cisco/Juniper are attached.

3) We enabled BFD for a BGP peer while the BGP protocol was Established. We see BFD Down packets with tcpdump, and BGP remains Established. We can't find any info in the logs about the BFD state for that BGP peer; this is probably not normal.

The other side configured BFD several days later. We don't see any information in the logs about the BFD state change for that peer (from Down to Up). This is probably also not normal.

4) (Minor) 'bird> show protocols all bfd1' shows some Routes counters. Does that make sense?
Re: BFD implementation in 1.4.0
Hello,

Ondrej Zajicek wrote, 26.03.2014 4:21:
On Thu, Mar 20, 2014 at 02:14:47PM +0400, Aleksey Berezin wrote:

Recently I tried to test BFD implementation in 1.4.0 BIRD release.

I am glad you tried the new BFD implementation, your post is perhaps the first public response to it.

Ondrej,

We are also starting to test the BFD implementation and have some questions/suggestions.

1) How can we view, via birdc, the state of a BFD-enabled peer in terms of BFD state (up/down)?

2) When BFD with some BGP peer is in the Up state, how can the BFD-related parameters for that peer be viewed via birdc? Examples of similar output from Cisco/Juniper are attached.

3) We enabled BFD for a BGP peer while the BGP protocol was Established. We see BFD Down packets with tcpdump, and BGP remains Established. We can't find any info in the logs about the BFD state for that BGP peer; this is probably not normal.

The other side configured BFD several days later. We don't see any information in the logs about the BFD state change for that peer (from Down to Up). This is probably also not normal.

4) (Minor) 'bird> show protocols all bfd1' shows some Routes counters. Does that make sense?

--
Best regards,
Mikhail A. Grishin
m...@msk-ix.ru

Juniper:
---

  show bfd session address 10.78.76.2 extensive
                                                    Detect   Transmit
  Address                  State     Interface      Time     Interval  Multiplier
  10.78.76.2               Up                       1.800     0.300        3
   Client Static, TX interval 0.100, RX interval 0.300
   Session up time 2w6d 21:45, previous down time 00:01:09
   Local diagnostic CtlExpire, remote diagnostic CtlExpire
   Remote state Up, version 1
   Min async interval 0.100, min slow interval 1.000
   Adaptive async TX interval 0.100, RX interval 0.600
   Local min TX interval 0.100, minimum RX interval 0.300, multiplier 3
   Remote min TX interval 0.100, min RX interval 0.300, multiplier 3
   Local discriminator 2, remote discriminator 12
   Echo mode disabled/inactive
   Multi-hop route table 7, local-address 109.71.176.2

Cisco:
-

  #sh bfd neighbors details

  IPv4 Sessions
  NeighAddr            LD/RD            RH/RS   State   Int
  193.232.244.103      1/3468765899     Up      Up      Vl356
  Session state is UP and not using echo function.
  Session Host: Software
  OurAddr: 193.232.245.198
  Handle: 1
  Local Diag: 0, Demand mode: 0, Poll bit: 0
  MinTxInt: 75, MinRxInt: 75, Multiplier: 5
  Received MinRxInt: 2, Received Multiplier: 5
  Holddown (hits): 3518(0), Hello (hits): 750(279939)
  Rx Count: 298379, Rx Interval (ms) min/max/avg: 36/680/619 last: 232 ms ago
  Tx Count: 279941, Tx Interval (ms) min/max/avg: 1/768/660 last: 712 ms ago
  Elapsed time watermarks: 0 0 (last: 0)
  Registered protocols: BGP CEF
  Uptime: 2d03h
  Last packet: Version: 1             - Diagnostic: 0
               State bit: Up          - Demand bit: 0
               Poll bit: 0            - Final bit: 0
               C bit: 0
               Multiplier: 5          - Length: 24
               My Discr.: 3468765899  - Your Discr.: 1
               Min tx interval: 5     - Min rx interval: 2
               Min Echo interval: 0   -
Re: Blackholing: security considerations
Hi Alexander,

Currently we:

1) allow prefixes that exist in the IRR as a route object (exact match, e.g. 109.68.40.0/21);
2) also allow 109.68.40.0/21+ if
   2.1) the prefix carries our blackhole community
   2.2) AND the prefix length is in [25 .. 32].

As for the possibility that 109.68.40.0/21 becomes reachable via another peer, that we get a new route, and so on: over the years we have not faced any problems with this. If there is a more elegant solution with BIRD, we would also be interested to know.

Alexander Shikov wrote, 07.03.2014 0:25:

Hi All,

As IXPs usually do, we also perform route filtering with prefix lists. In the prefix lists we include only those prefixes which have corresponding route objects in RADB/RIPE. By default we don't accept longer prefixes, i.e. in a prefix list we include, for example, 10.0.0.0/21 but not 10.0.0.0/21+.

For blackholing purposes there is sometimes a need to accept more-specific prefixes, mostly /32, from BGP peers. The easiest way is just to accept /32 in the filter. But the main problem is that any peer can announce a /32 route to any network, even to an unreachable one. Thus there is a need to additionally check /32 routes.

At first glance, we may include longer prefixes in the prefix list and then check the incoming /32 prefix against it. The result will look like:

  bird> show route protocol ITCONS
  109.68.40.20/32    via 193.25.181.253 on vlan777 [ITCONS 2014-03-06 22:02:42 from 193.25.180.17] * (100) [AS25372i]
  109.68.40.0/21     via 193.25.180.17 on vlan777 [ITCONS 2014-03-06 21:45:24] * (100) [AS25372i]

i.e. filtering against [ 109.68.40.0/21+ ].

Now let's assume that 109.68.40.0/21 is reachable via another peer, that we got a new route which is better due to AS path length, and that the new peer does not want to blackhole 109.68.40.20. Then 109.68.40.0/21 via 193.25.180.17 will become inactive, but 109.68.40.20/32 via 193.25.181.253 from 193.25.180.17 will stay best, and the new peer will lose traffic to 109.68.40.20.

Thus, it would be reasonable to compare a received /32 against the routing table and accept it only if there is an active less-specific route from the same peer. Personally, I was not able to find a solution for BIRD. Now I'm wondering how other IXPs perform such filtering.

Any ideas or thoughts are kindly appreciated! Thanks in advance!
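
The acceptance rule from the reply above (exact IRR match, or a /25 to /32 more-specific that carries the blackhole community) can be sketched as follows. This is a model of the policy only, not a BIRD filter; the community value `(64512, 666)` and the function name are hypothetical, chosen purely for illustration. It does not implement the routing-table check Alexander asks about.

```python
import ipaddress

# Hypothetical blackhole community, for illustration only; the thread
# does not say which value is actually used.
BLACKHOLE = (64512, 666)


def accept_route(prefix, communities, irr_prefixes, blackhole=BLACKHOLE):
    """Model of the policy: exact IRR match, or blackholed /25../32 more-specific."""
    net = ipaddress.ip_network(prefix)
    for irr in irr_prefixes:
        irr_net = ipaddress.ip_network(irr)
        if net.version != irr_net.version:
            continue  # never compare IPv4 with IPv6
        if net == irr_net:
            return True  # rule 1: exact match with a route object
        if (net.subnet_of(irr_net)
                and 25 <= net.prefixlen <= 32
                and blackhole in communities):
            return True  # rule 2: more-specific carrying the blackhole community
    return False


if __name__ == "__main__":
    irr = ["109.68.40.0/21"]
    print(accept_route("109.68.40.0/21", set(), irr))
    print(accept_route("109.68.40.20/32", {BLACKHOLE}, irr))
    print(accept_route("109.68.40.20/32", set(), irr))
```

Note that a /24 inside the /21 is rejected even with the community set, because its length falls outside [25 .. 32].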
Re: Multiple OSPF adjacencies on same interface...
Kveri wrote, 16.12.2013 13:49:

Hello, you cannot use LACP between 3 devices. That is only possible if two of those devices (the Force10 routers/switches) form one logical device (Cisco VSS, MEC, virtual PortChannel, HP IRF). I don't know if Force10 has something like that.

It has:
http://hasanmansur.com/2012/11/07/force10-s4810-vlt-quick-configuration-sample/
http://en.wikipedia.org/wiki/Virtual_Link_Trunking

If you do this, however, those 2 routers will appear as one logical device (one OSPF neighbor) to the server, and then you don't have a problem.

This is the preferred solution, because it moves the problem from OSPF to much faster technologies.

On the other hand, you can do VRRP between the routers and run OSPF on the hypervisor with both of the routers. In this case just beware of asymmetric routing (which may or may not be a problem, depending on the setup).

Regarding your setup, I assume you're using the same IP on both of the routers. This won't work, because from the routers' perspective the links are UP and they're advertising the same /31 to the rest of the network, which will cause half of the packets/flows to be lost.

So, you can either use some virtualization switching technology (if Force10 provides that), or you can use VRRP with 2 OSPF neighborships (but in that case you need a /29 subnet), or you can write some sort of script on the server and use master-slave bonding mode. In the last case be sure to always shut down the inactive interface (only one of them may be physically enabled at a time); that way only one of the Force10 routers would advertise the subnet...

Martin

2013-12-16 10:24 sender wrote:

Yes - the reason is that this router is a VM with two passthrough NICs. The hypervisor is connected to both Force10 routers/switches with LACP, so the VM needs to run linux bonding mode 2 to provide a bond0 interface to the VM. Neighbourship then needs to be established to both routers on this bonded interface.

I tried to create neighbourship directly on the interfaces, but this does not work, I assume because the switches loadbalance traffic on the LACP portchannel. I could create a neighbourship with a VRRP interface, but as I understand it this will not work due to different router-ids in case of failover. So basically as I see it, this is the only way to make this work - unless you have another idea?

Thanks
Regards
Kristoffer

On 13/12/2013, at 17.37.29, Raphael Mazelier r...@futomaki.net wrote:

I'm trying to use a bonded interface on linux to connect to two routers, one router on each physical link, each with a /31 subnet. Only one of the routers (Force10 S4810) forms adjacency with the linux host (whichever comes first), the other gets stuck in EXSTART until I shut/no shut the link, then Bird creates adjacency with both routers.

What are you trying to do with this design? It's rather strange.

--
Raphael Mazelier
Re: as_path question
Matthew Walster wrote, 05.07.2011 16:07:
2011/7/5 Mikhail A. Grishin m...@ripn.net:

What is the purpose of '{' and '}' in the BGP.as_path output?

It indicates an AS Set - some aggregation happened, the longer routes of which were in the two ASNs in the brackets.

Why do we see only '[i]' in 'show route'? (We expected to see the first AS in as_path.)

IMO, it should show AS48467 as they were the aggregator.

Moreover, this prefix (94.228.160.0/20) was filtered and not accepted because of this BIRD construct:

  # Apply as_path filters on the last AS (originated route)
  allas = [ 15905, 34211, 41206, 44116, 44893, 47773, 48467, 50875, 51031, 51186, 51443, 52163 ];
  if ! (bgp_path.last ~ allas) then reject;

So bgp_path.last does not match 48467 in this case. Is this normal?

Matthew Walster
Re: BIRD: route selection question
Alexander Ilin wrote, 30.03.2011 18:15:
On Mon, Mar 28, 2011 at 06:43:41PM +0200, Ondrej Zajicek wrote:
On Mon, Mar 28, 2011 at 03:25:51PM +0400, Mikhail A. Grishin wrote:

Hi, Ondrej

Thank you for the explanation. One more question: is it possible to add a 'bgp always-compare-med' option to BIRD?

That would not be a problem. A question is under which circumstances such comparison has some sense. Probably only if user overwrites all MEDs on input to the AS.

Hi Ondrej,

We are using it on our Cisco Route Servers and some of our customers use it. I think it would be a good idea to have this option available, as some of our members want to have this feature enabled at our IXP.

And one more thing: in one of the previous answers you mentioned the time of the route. In the route selection process inside BIRD, at what step is this criterion evaluated? Before 'router_id' or after?

Best regards,
---
Alexander Y. Ilin
CTO MSK-IX
Phone: +7(495)737-0685, +7(499)192-9179
BIRD: route selection question
Hi,

Why, in the example below, is the route via 194.226.100.51 the best?

According to http://bird.network.cz/?get_docf=bird-6.html#ss6.1

  Route selection rules
  ...
  Prefer the lowest value of the Multiple Exit Discriminator.

And for both peers we have this line commented out:

---
#   default bgp_med 0;          # MED value we use for comparison when none is defined
---

and the default value is 0. Presumably, for the route via 194.226.100.45 the MED should be counted as zero, and this route should win?

  BIRD 1.2.5 ready.
  bird> show route 194.105.192.0/19 all
  194.105.192.0/19   via 194.226.100.51 on em0 [R20632x1 2011-03-22 21:37:36] * (100) [AS6820i]
          Type: BGP unicast univ
          BGP.origin: IGP
          BGP.as_path: 20632 6820
          BGP.next_hop: 194.226.100.51
          BGP.med: 160
          BGP.local_pref: 100
                     via 194.226.100.45 on em0 [R3277x1 2011-03-16 12:47:37] (100) [AS6820i]
          Type: BGP unicast univ
          BGP.origin: IGP
          BGP.as_path: 3277 6820
          BGP.next_hop: 194.226.100.45
          BGP.local_pref: 100
Re: BIRD: route selection question
Hi, Ondrej

Thank you for the explanation. One more question: is it possible to add a 'bgp always-compare-med' option to BIRD?

Ondrej Zajicek wrote, 28.03.2011 15:31:
On Mon, Mar 28, 2011 at 01:46:50PM +0400, Mikhail A. Grishin wrote:

Hi,

Why, in the example below, is the route via 194.226.100.51 the best?

According to http://bird.network.cz/?get_docf=bird-6.html#ss6.1

  Route selection rules
  ...
  Prefer the lowest value of the Multiple Exit Discriminator.

As specified by BGP standard, MED is used to compare routes only if they came from the same neighboring AS [*]. These came from a different ones (20632 and 3277), so MED is not used and they are probably compared by router ID, time of route or similar low-priority criteria.

[*] Perhaps it is not explicitly mentioned in documentation, but it is a standard BGP behavior.
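
The explanation above can be illustrated with a small sketch of the MED step alone. It is a simplified model, not BIRD's selection code: routes are plain dicts, the neighboring AS is taken to be the first ASN in the path, and a missing MED falls back to `default_med` (0 here, mirroring the commented-out `default bgp_med 0` line). All names are assumptions made for the example.

```python
# Simplified model of the MED comparison step only; not BIRD's code.
# A route is a dict with "as_path" (list of ASNs) and an optional "med".

def med_is_comparable(route_a, route_b):
    """MED is compared only if both routes came from the same neighboring AS."""
    path_a, path_b = route_a["as_path"], route_b["as_path"]
    return bool(path_a) and bool(path_b) and path_a[0] == path_b[0]


def prefer_by_med(route_a, route_b, default_med=0):
    """Return the route preferred by MED, or None if MED does not decide."""
    if not med_is_comparable(route_a, route_b):
        return None  # different neighboring ASes: skip the MED step
    med_a = route_a.get("med", default_med)
    med_b = route_b.get("med", default_med)
    if med_a == med_b:
        return None
    return route_a if med_a < med_b else route_b


if __name__ == "__main__":
    # The two routes from the question: neighbors AS20632 and AS3277.
    via_51 = {"as_path": [20632, 6820], "med": 160}
    via_45 = {"as_path": [3277, 6820]}
    print(prefer_by_med(via_51, via_45))
```

For the two routes in the question the function returns None, i.e. MED does not decide and the choice falls through to later criteria. An always-compare-med style option would amount to dropping the `med_is_comparable` check.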
Re: Check on routes announced by peer
On Sun, 30 Jan 2011 03:29:45 +, Nick n...@somerandomnick.ano.mailgate.vanet.org wrote:
On Sat, Jan 29, 2011 at 04:02:32AM +0100, Arnold Nipper wrote:

Ciao Simone,

on 28.01.2011 18:18 Simone Morandini wrote:

a (hopefully) quick question: one of our peers says it is announcing a set of networks to the route server, but these routes do not actually appear to be there... If I issue 'sh route protocol PEER' the list is empty, and likewise if I issue 'sh route where bgp_path.first=peer-as'. Is there a way to check whether those networks actually arrive at the route server?

very much depends on how your config looks like. If you don't have any incoming filters you should be able to see any announcement. Worst case is that you will have to sniff on the interface to see what's going on.

or set debug for their protocol and check the log

In a scenario where filters are applied on pipes, not on the BGP protocols, all received routes can be viewed via the CLI:

  show route protocol PEER table TABLE_FOR_THAT_PEER
Re: Any IX willing to share their config?
Alexander Shikoff wrote, 25.12.2010 15:50:
On Sat, Dec 25, 2010 at 11:57:04AM +0100, Ondrej Zajicek wrote:
On Sat, Dec 25, 2010 at 05:03:46AM +0200, Alexander Shikoff wrote:

One possible way to do that is not to try to handle full 32bit ASNs, but perhaps just ~ 24bit ASNs, and use communities (65000..65255,*) for:

  (65000+X,Y) - Do not announce to peer X*65536+Y

and similarly communities (65256..65511,*) for:

  (65256+X,Y) - Announce to peer X*65536+Y only.

You're right. If I remember correctly, IANA currently allocates 1024 numbers for each RIR, so your variant covers them entirely for some years to come.

Some additional thoughts:

- this way breaks RFC1997 a little
- the current draft "Internet Exchange Route Server" (http://tools.ietf.org/html/draft-jasinska-ix-bgp-route-server-01) does not propose in detail how to implement handling of 32bit ASNs via communities. The authors of this draft invited comments on the document (on the Euro-IX community mailing list this summer). You may send some suggestions.
- there is RFC5668 (4-Octet AS Specific BGP Extended Community, http://tools.ietf.org/search/rfc5668), but it defines only 2 octets for the Local Administrator field. So BGP extended community support will not allow an easy implementation of 32bit ASN handling either.

I've googled around this problem and have not found any other ideas/discussions yet. So your way seems to be the easiest and most effective at the present moment.

Another, even simpler, way is to assign each connected client with a 32bit ASN some pseudo-ASN from the private range. This pseudo-ASN would be used with standard communities (0:X, MyASN:X).

MSK-IX uses this way. We do not expect a very large number of directly connected members with ASNs above 65535 in the next few years. Most new members still have ASN16 numbers. Some had ASN32 and then migrated to ASN16 (due to various difficulties: DDoS protection, direct peerings etc.). So we can wait for a new RFC with Extended Communities or for some other solution.

RFC1997 community 'no-export' is also supported. Other communities including RFC1997 well-known ones are not supported and stripped.

That seems a bit strange to me. Not sure what the other IXPs do but I think that communities are supposed to be propagated and the RS should alter only communities destined for it.

RFC1997 allows modification of the community attribute according to a local policy. But the Internet Exchange Route Server draft _recommends_ transparent propagation. This recommendation, however, requires consideration of possible security or routing issues (asymmetry etc.). Precisely because of security/routing issues, almost all of our members delete all communities received from the IXP, or those that are not listed in the IXP routing policy. If other IXP engineers are reading this mailing list, it would be great to hear their opinions.

As for well-known communities: for example, MSK-IX propagates 'no-export' transparently to peers. I think this approach does not meet RFC1997. MSK-IX does not support 'no-advertise' (0:MyASN is used instead). We're using 'no-export' only in the way described by RFC1997.

Our customers wanted to be able to announce some routes with 'no-export' transparently to other MSK-IX participants. That was before BIRD became our main platform and before we offered full-featured communities to our customers. At present, you can propagate 'no-export' with the special community: http://www.msk-ix.ru/eng/routeserver.html#bgpcommunity

Btw, as I remember, some other UNIX BGP daemons are also transparent with respect to 'no-export'.

Any transparent Route Server at any IXP by its nature doesn't meet RFC4271 (a transparent RS doesn't update the as-path attribute). All current inconsistencies, including the RFC1997 violations, are better addressed in the RFC about Route Servers.

---
Communities sent to peers
--
MyASN:X - Route is received from 16-bit ASN X
6550X:Y - Route is received from 32-bit ASN 65535*X+Y

What purpose have these communities? That can be easily read from AS_PATH.

If a certain peer builds filters based not on AS_PATH but on communities, then these can help it.
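
The community scheme proposed near the top of this message, (65000+X,Y) for "do not announce to peer X*65536+Y" and (65256+X,Y) for "announce to peer X*65536+Y only", is easy to check with a small encoder/decoder. This is only a sketch of the arithmetic; the function names and the action labels are made up for the example and are not part of any route server implementation.

```python
# Sketch of the arithmetic behind the proposed scheme; names are illustrative.
DENY_BASE = 65000    # (65000+X, Y): do not announce to peer X*65536+Y
ALLOW_BASE = 65256   # (65256+X, Y): announce to peer X*65536+Y only


def encode(peer_asn, announce_only=False):
    """Map a peer ASN (up to 24 bits) to a standard (16-bit, 16-bit) community."""
    x, y = divmod(peer_asn, 65536)
    if not 0 <= x <= 255:
        raise ValueError("ASN does not fit into 24 bits")
    base = ALLOW_BASE if announce_only else DENY_BASE
    return (base + x, y)


def decode(community):
    """Return (action, peer_asn) or None if the community is outside both ranges."""
    high, low = community
    if DENY_BASE <= high <= DENY_BASE + 255:
        return ("do-not-announce", (high - DENY_BASE) * 65536 + low)
    if ALLOW_BASE <= high <= ALLOW_BASE + 255:
        return ("announce-only", (high - ALLOW_BASE) * 65536 + low)
    return None


if __name__ == "__main__":
    c = encode(196608)          # a 32-bit ASN: X=3, Y=0
    print(c, decode(c))
```

The high halves 65000..65511 cover X in 0..255 for both actions, which is where the "~ 24bit ASNs" limit in the proposal comes from.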
Re: [Euro-ix-rs-vwg] New release 1.2.0
Ondrej Zajicek wrote:
On Fri, Jan 29, 2010 at 06:17:43PM +0300, Mikhail A. Grishin wrote:

However, the startup process of the daemon is still unstable. It can crash within a period of 5-40 seconds. After that time (if it has not crashed) all seems to be fine. Do you need the latest core file and the latest binary?

I cannot get useful info from the core you sent me. The best thing would be to:

1) use an unstripped binary of bird - I don't know whether stripping is a part of 'make install' or you just stripped it explicitly, but it should be sufficient to not use 'make install' and just copy the bird binary after 'make' to the final destination. You can see a size difference between the unstripped and stripped binary.

2) enable all logging in bird.conf using the 'debug protocols all;' global option. This is probably too much logging for production usage, but it would be useful to analyze the crash.

Then, after some crash, send me the (unstripped) binary, the core and the bird log.

I think that maybe this problem isn't related to the patches.

I would also expect that. Especially the community patch was too simple to break anything.

We'll try to reproduce that behavior and collect debug output when enabling BIRD on another of our route servers. I hope that will be this week.

--
Mikhail A. Grishin
E-mail: m...@ripn.net
Phone: +7 (495) 737-0685
MSK-IX Russian Institute for Public Networks
Phone: +7 (499) 192-9179
Network Operations Center
Re: [Euro-ix-rs-vwg] New release 1.2.0
Ondrej Zajicek wrote:
On Thu, Jan 28, 2010 at 12:54:03PM +0300, Mikhail A. Grishin wrote:

First of all, thank you for the patch and for the fast response! After applying both patches (the date patch and well-known communities) on the production server, we got some strange errors:

Jan 28 12:02:04 msk-rsm2 bird: R34485x1: Error: Finite state machine error
Jan 28 12:02:17 msk-rsm2 bird: R13174x1: Error: Finite state machine error
Jan 28 12:02:28 msk-rsm2 bird: R3218x1: Error: Finite state machine error
Jan 28 12:02:30 msk-rsm2 bird: R41842x1: Error: Finite state machine error
Jan 28 12:03:09 msk-rsm2 bird: R34485x1: Error: Finite state machine error
Jan 28 12:03:26 msk-rsm2 bird: R41842x1: Error: Finite state machine error
Jan 28 12:03:29 msk-rsm2 bird: R13174x1: Error: Finite state machine error
Jan 28 12:03:37 msk-rsm2 bird: R3218x1: Error: Finite state machine error

What does it mean?

This error messages mean that BIRD received BGP messages (packets) that were unexpected with regard to the current state of the BGP session. According to the debug log you sent, it seems that the neighbor sent UPDATE message immediately after it sent OPEN message, but it should send KEEPALIVE message first.

I have no idea how such problem might be caused by these patches. Is this problem related to just a small number of neighbors and other neighbors work well?

Hi,

We found that the 'Finite state machine error' problem is not related to your patches. It occurs randomly on our production server at daemon startup :((

The problem occurs on a small number of peers (2, 3 or 4 out of ~280). Some of the affected peers are the same at the next startup, some are not. On a test server with a small number of active peers (and the same config) we don't see this issue. What can be done? Right now we see the problem on the pure 1.2.0 release...

About the UPDATE message sent immediately after OPEN: we asked one of our customers (who hit that problem) to collect debug output on his side. See the attachments (3 files).
One more thing: after applying the well-known communities patch, BIRD dumped core twice (5-30 seconds after startup) :( Do you need the last core file? Without the well-known communities patch we don't see this (yet).

--
Mikhail A. Grishin
E-mail: m...@ripn.net
Phone: +7 (495) 737-0685
MSK-IX Russian Institute for Public Networks
Phone: +7 (499) 192-9179
Network Operations Center

c7606s-m9-1#show version
Cisco IOS Software, c7600rsp72043_rp Software (c7600rsp72043_rp-ADVIPSERVICESK9-M), Version 12.2(33)SRB4, RELEASE SOFTWARE (fc3)
Technical Support: http://www.cisco.com/techsupport
Copyright (c) 1986-2008 by Cisco Systems, Inc.
Compiled Wed 23-Jul-08 19:23 by prod_rel_team

ROM: System Bootstrap, Version 12.2(33r)SRB4, RELEASE SOFTWARE (fc1)

c7606s-m9-1 uptime is 27 weeks, 2 days, 1 hour, 45 minutes
Uptime for this control processor is 27 weeks, 2 days, 2 hours, 19 minutes
Time since c7606s-m9-1 switched to active is 27 weeks, 2 days, 2 hours, 21 minutes
System returned to ROM by s/w reset (SP by bus error at PC 0x8273DCC, address 0x0)
System restarted at 11:48:30 MSD Tue Jul 21 2009
System image file is bootdisk:c7600rsp72043-advipservicesk9-mz.122-33.SRB4.bin
Last reload type: Normal Reload

Jan 28 13:02:43.008: BGP: ses global 193.232.246.100 (0) act read request no-op
.Jan 28 13:02:43.008: BGP: ses global 193.232.246.100 (0) act Adding topology IPv4 Unicast:base
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active went from Active to OpenSent
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active sending OPEN, version 4, my as: 41842, holdtime 180 ID 4DF660A0seconds
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active send message type 1, length (incl. header) 50
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active rcv message type 1, length (excl. header) 26
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active rcv OPEN, version 4, holdtime 180 seconds
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active rcv OPEN w/ OPTION parameter len: 16
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active rcvd OPEN w/ optional parameter type 2 (Capability) len 14
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active OPEN has CAPABILITY code: 1, length 4
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active OPEN has MP_EXT CAP for afi/safi: 1/1
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active OPEN has CAPABILITY code: 2, length 0
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active OPEN has ROUTE-REFRESH capability(new) for all address-families
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active OPEN has CAPABILITY code: 65, length 4
.Jan 28 13:02:43.008: BGP: 193.232.246.100 active unrecognized capability code: 65 - ingored
.Jan 28 13:02:43.008: BGP: nbr global 193.232.246.100 neighbor does not have IPv4 MDT topology activated
.Jan 28 13:02:43.008: BGP: nbr global 193.232.246.100 BGP nbr does not have BGP_AF_IPv4MDT topology activated
.Jan 28 13:02:43.008: BGP: ses global 193.232.246.100 (0) act IPv4 Unicast:base mdt prepare old peer: BGP_MDT_STYLE_NONE