Hi Ondrej,

On Thu, 18 Jun 2015, Ondrej Zajicek wrote:

> Hello
>
> We noticed an incompatibility between Quagga and BIRD in the
> OSPF loading phase.
>
> In short:
>
> 1) BIRD sends an LSREQ packet with ~60 LSRs.
>
> 2) Quagga answers with one LSUPD packet with ~20 LSAs.
>
> 3) BIRD waits for the remaining LSUPD packets (to cover all requested
> LSAs) before sending a new LSREQ.
>
> 4) Usually no more LSUPD packets are sent by Quagga, so BIRD times out
> (RxmtInterval) before sending the next LSREQ packet. This leads to a
> significantly longer loading phase.

That doesn't sound good. Do you still have the pcap?

> We noticed this problem with Quagga 0.99.22 and BIRD 1.5.0.
>
> My interpretation of RFC 2328 10.9 is that when an LSREQ is received, it
> should be answered immediately by as many LSUPD packets as it takes to
> cover all LSRs from the LSREQ. And when an LSREQ is sent, it should be
> completely answered before the next one is sent. But I would say that
> this aspect of RFC 2328 is especially vague, and I can see alternative
> interpretations. What is the opinion of Quagga developers w.r.t. this
> issue?
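
To put rough numbers on the delay described above, here is a toy model in Python. It is only a sketch under stated assumptions: the function names are made up, the batch sizes (60 LSRs per request, 20 LSAs per answer) are the approximate figures from the report, the requester is assumed to re-request whatever is still outstanding after each timeout, and 5 seconds is the sample RxmtInterval that RFC 2328 gives for a LAN, not necessarily what was configured here.

```python
RXMT_INTERVAL = 5  # seconds; RFC 2328 sample value for a LAN (assumed here)


def loading_timeouts(total_lsas, req_size=60, upd_size=20):
    """Count RxmtInterval timeouts for a requester that waits for a full answer.

    Toy model: each LSREQ asks for up to req_size LSAs, the peer answers
    with a single LSUPD carrying at most upd_size of them, and the
    requester only proceeds without waiting if the whole request was covered.
    """
    pending = total_lsas
    timeouts = 0
    while pending > 0:
        requested = min(pending, req_size)
        answered = min(requested, upd_size)
        pending -= answered
        if answered < requested:
            # Part of the request is still unanswered, so the requester
            # sits idle until RxmtInterval expires.
            timeouts += 1
    return timeouts


def extra_delay_seconds(total_lsas):
    """Idle time added to the loading phase under the toy model."""
    return loading_timeouts(total_lsas) * RXMT_INTERVAL


print(loading_timeouts(60), extra_delay_seconds(60))
print(loading_timeouts(600), extra_delay_seconds(600))
```

Under this model a 60-LSA database costs two timeouts and a 600-LSA database costs 29, which is why the effect grows with LSDB size.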

The code concerned in Quagga has not changed in a long, long time, I think. We did implement Ogier's optimisation to the DD Exchange process, but that should be compatible with normal OSPF, and in any case it shouldn't affect the LSA Req/Update process that follows.

Quagga builds a list of all the LSAs to send when parsing the LSA Req. Then, in a separate green-thread that is scheduled without delay after that, it should turn that list into a series of LSA Update packets, which are scheduled to be written out in another green-thread.

Quagga shouldn't delay sending those.
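
As a rough illustration of the second step (turning the list into LSA Update packets), the following Python sketch packs LSAs greedily into MTU-sized packets. This is not Quagga's code: the function name, the 1500-byte MTU and the 72-byte LSA size are assumptions for the example, and authentication data is ignored.

```python
IP_HEADER = 20        # IPv4 header without options
OSPF_HEADER = 24      # OSPFv2 common packet header
LSU_COUNT_FIELD = 4   # "# LSAs" field at the start of an LS Update body


def pack_ls_updates(lsa_sizes, mtu=1500):
    """Greedily group LSAs (given by size in bytes) into LS Update packets.

    Returns a list of packets, each a list of LSA sizes. An LSA larger
    than the per-packet budget ends up alone in its own packet.
    """
    budget = mtu - IP_HEADER - OSPF_HEADER - LSU_COUNT_FIELD
    packets, current, used = [], [], 0
    for size in lsa_sizes:
        if current and used + size > budget:
            # The current packet is full: close it and start a new one.
            packets.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        packets.append(current)
    return packets


# Hypothetical request: 60 LSAs of 72 bytes each.
packets = pack_ls_updates([72] * 60)
print([len(p) for p in packets])
```

With these made-up sizes the 60 LSAs come out as three packets of 20 LSAs each, all of which would be queued for writing in one go.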

Possibilities:

* One of the LSAs requested by BIRD was not in Quagga's LSDB, so Quagga
  raised a BadLSReq event, and should have returned to the ExStart
  neighbour-state. You should see DD negotiation packets from Quagga then.

* One of the LSAs was too big to fit in one MTU-sized packet, so Quagga
  tried to send it in a fragmented LSA Update packet, but that didn't work
  and it got dropped somewhere under Quagga. Though, you should still have
  seen the rest of the LSA Updates then - unless all the remaining ones
  were very large LSAs.

  Was it a network where routers had /lots/ of links?

  Unlikely...

* Similarly, all the remaining LSAs were too large to fit in an IP+OSPF
  packet. We drop the packet and warn the admin. Unlikely...

* Some kind of unfairness issue somewhere meant Quagga never got around to
  either the "build the LSA Update response packets" green-thread, or else
  the "write out the packet queue for that neighbour" green-thread (we do
  prioritise some packets in the write-out-queue, e.g. Hellos - so
  there's potential unfairness there that could affect things under high
  loads).

Or something else....

I don't see anything obvious unfortunately.
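
For the two size-related possibilities above, a back-of-the-envelope check in Python shows how many links a router-LSA would need before it stops fitting in one packet. It is a sketch only: the function names and the 1500-byte MTU are assumptions, the router-LSA size uses the RFC 2328 layout without extra TOS metrics, and authentication data is ignored.

```python
IP_HEADER = 20         # IPv4 header without options
OSPF_HEADER = 24       # OSPFv2 common packet header
LSU_COUNT_FIELD = 4    # "# LSAs" field at the start of an LS Update body
MAX_IP_PACKET = 65535  # largest possible IPv4 datagram


def router_lsa_size(n_links):
    """Size in bytes of a router-LSA with n_links links and no TOS metrics."""
    # 20-byte LSA header + 4 bytes (flags, # links) + 12 bytes per link
    return 20 + 4 + 12 * n_links


def classify_lsa(lsa_size, mtu=1500):
    """Say how a single LSA of lsa_size bytes could be carried in an LS Update."""
    total = IP_HEADER + OSPF_HEADER + LSU_COUNT_FIELD + lsa_size
    if total <= mtu:
        return "fits"
    if total <= MAX_IP_PACKET:
        return "needs-ip-fragmentation"
    return "too-large"


for links in (10, 119, 120):
    size = router_lsa_size(links)
    print(links, size, classify_lsa(size))
```

On these assumptions a router-LSA still fits in a single 1500-byte packet with 119 links and needs IP fragmentation from 120 links upwards, so only routers with a great many links would hit either case.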

Do you have more details on the network used? Do you remember if it was reproducible?

regards,
--
Paul Jakma      [email protected]  @pjakma Key ID: 64A2FF6A
Fortune:
We have only two things to worry about:  That things will never get
back to normal, and that they already have.

_______________________________________________
Quagga-dev mailing list
[email protected]
https://lists.quagga.net/mailman/listinfo/quagga-dev
