On 11/04/2024 14:24, Ilya Maximets wrote:
On 4/11/24 10:59, Chris Riches wrote:
From what we know so far, the DB was full of stale connection-tracking
information such as the following:

[...]

Once the host was recovered by putting in the timeout increase,
ovsdb-server successfully started and GCed the database down from 2.4
*GB* to 29 *KB*. Had this happened before the host restart, we would
have never seen this problem. But since it seems possible to end up
booting with such a large DB, we figured a timeout increase was a
sensible measure to take.
Uff.  Sounds like ovn-controller went off the rails.

Normally, ovsdb-server compacts the database once every 10-20 minutes,
if the database has doubled in size since the previous check.  If all
the transactions are that small, it would mean ovn-controller made
about 10K transactions per second in the 10-20 minutes before the
restart.  That's huge.
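
As a rough sanity check on that figure, here is a back-of-the-envelope calculation in Python. It assumes the whole 2.4 GB (taken as decimal gigabytes) accumulated within a single 10-20 minute window at a steady 10K transactions per second. The function name is made up for illustration and nothing here is measured data.

```python
def implied_txn_bytes(db_bytes, window_minutes, txn_per_sec):
    """Average on-disk bytes per transaction implied by the growth rate."""
    window_seconds = window_minutes * 60
    bytes_per_sec = db_bytes / window_seconds
    return bytes_per_sec / txn_per_sec


DB_BYTES = 2.4e9        # size before compaction, from the report above
TXN_PER_SEC = 10_000    # the rate estimated above

for minutes in (10, 20):
    size = implied_txn_bytes(DB_BYTES, minutes, TXN_PER_SEC)
    print(f"{minutes} min window: ~{size:.0f} bytes per transaction")
```

Under those assumptions the average transaction would be roughly 200-400 bytes on disk.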

I wonder if this can be addressed with a better compaction strategy.
Something like forcing compaction if "the database is more than 10 MB
and increased 10x" regardless of the time.
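
To make the idea concrete, here is a minimal sketch of such a trigger in Python. It is not ovsdb-server's actual code: the function and constant names are hypothetical, the regular rule is modelled only as described in this thread, and `base_size` stands for whatever size the growth is measured against.

```python
MIN_INTERVAL_S = 10 * 60          # regular rule: wait at least 10 minutes
REGULAR_GROWTH = 2                # regular rule: database doubled in size
FORCE_MIN_BYTES = 10 * 1000**2    # proposed rule: more than 10 MB ...
FORCE_GROWTH = 10                 # ... and grown 10x, regardless of time


def should_compact(size, base_size, seconds_since_last):
    """Return True if either the regular or the forced rule fires."""
    regular = (seconds_since_last >= MIN_INTERVAL_S
               and size >= REGULAR_GROWTH * base_size)
    forced = size > FORCE_MIN_BYTES and size >= FORCE_GROWTH * base_size
    return regular or forced


# A 29 KB database that balloons to 2.4 GB within 30 seconds:
print(should_compact(2_400_000_000, 29_000, 30))
# A small database that doubles quickly is still left to the timer:
print(should_compact(60_000, 29_000, 30))
```

The size floor keeps tiny databases from being compacted over and over, while the 10x factor catches runaway growth between the timed checks.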

I'm not sure exactly what the test was doing when this was observed, so I don't know whether that transaction volume is within the realm of possibility or whether we're looking at a failure to perform compaction on time. It would be nice to have a stronger safety net for DB size, as we were only a few hundred MB away from hitting filesystem space issues as well.

Normally, ovsdb-server compacts the database once every 10-20 minutes, if the database has doubled in size since the previous check.

I presume you mean if it doubled in size since the previous *compaction*? If we only compact when it doubles since the last *check*, then it would be easy for it to slightly-less-than-double every 10-20 minutes and never trigger the compaction while still growing exponentially.
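
To illustrate the concern, here is a small Python simulation of the two possible baselines. It is a toy model, not ovsdb-server's logic: the function name, the `baseline` argument and the 1.9x growth per check are all made up for the example.

```python
def checks_until_compaction(growth, baseline, max_checks=50):
    """Count periodic checks until 'size has doubled' fires, or None.

    baseline="check":      compare against the size at the previous check
    baseline="compaction": compare against the size at the last compaction
    """
    size = 1.0
    last_check = size
    last_compaction = size
    for n in range(1, max_checks + 1):
        size *= growth                      # growth between two checks
        ref = last_check if baseline == "check" else last_compaction
        if size >= 2 * ref:
            return n                        # compaction would trigger here
        last_check = size                   # no compaction, keep growing
    return None


# Database grows 1.9x between checks, i.e. slightly less than doubling.
print(checks_until_compaction(1.9, "compaction"))
print(checks_until_compaction(1.9, "check"))
```

With the last compaction as the baseline, the rule fires on the second check. With the previous check as the baseline, it never fires within the simulated 50 checks even though the database keeps growing exponentially.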

I'm happy to discuss compaction approaches (though my expertise is very much in host service management and not OVS itself), but do you think there's merit in having this extended timeout as a backstop too?
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
