Hi Mesos developers,
I have been working on a technical proposal to improve the availability and
failover strategy of the Mesos replicated log library. The primary use case
we'd like to improve is the Aurora scheduler's leader failover and the expensive
replicated log compaction that gets run at a new leader sta
Thanks for the design doc. Those graphs look awesome. We probably should
get those into this doc:
https://github.com/apache/mesos/blob/master/docs/replicated-log-internals.md
Regarding the doc, I'd like to see some correctness argument about why the
catchup process won't demote the current leader.
I think I'm missing why we would spend energy controlling the leveldb
compaction.
I thought leveldb was used for convenience, and its log-structured merge
algorithm was just a consequence.
Why not spend energy writing an append-only file as opposed to optimizing
compaction of an algorithm we d
I don't have enough context on why leveldb was chosen as a backend for
append-only logs, but at this moment it is deeply integrated into the Mesos
library. Tools have been built around it, the choice of leveldb has
influenced the log interface (beginning and ending positions), people are
doing snapsho