> Date: Tue, 26 Feb 2019 17:21:50 -0700
> From: Rich Megginson <rmegg...@redhat.com>
> Message-ID: <d40bde83-1e88-b34f-9b5d-d2b320468...@redhat.com>
> On 2/26/19 4:26 PM, William Brown wrote:
>>> I think the recursive/nested transaction on the database level are not the
>>> problem, we do this correctly already, either all or no change becomes
>>> persistent.
>>> What we do not manage is modifications we do in parallel on the in memory
>>> structure like the entry cache, changes to the EC are not managed by any
>>> txn and I do not see how any of the database txn models would help, they do
>>> not know about ec and can abort changes.
>>> We would need to incorporate the EC into a generic txn model, or have a way
>>> to flag ec entries as garbage for if a txn is aborted
>> The issue is we allow parallel writes, which breaks the consistency
>> guarantees of the EC anyway. LMDB won’t allow parallel writes (it’s single
>> write - concurrent parallel readers), and most other modern kv stores take
>> this approach too, so we should be remodelling our transactions to match
>> this IMO. It will make the process of how we reason about the EC much much
>> simpler I think.
> Some sort of in-memory data structure with fast lookup and transactional
> semantics (modify operations are stored as mvcc/cow so each read of the
> database with a given txn handle sees its own view of the ec, a txn commit
> updates the parent txn ec view, or the global ec view if no parent, from the
> copy, a txn abort deletes the txn's copy of the ec) is needed. A quick
> google search turns up several hits. I'm not sure if the B+Tree proposed at
> http://www.port389.org/docs/389ds/design/cache_redesign.html has
> transactional semantics, or if such code could be added to its
> implementation.
>
> With LMDB, if we could make the on-disk entry representation the same as the
> in-memory entry representation, then we could use LMDB as the entry cache too
> - the database would be the entry cache as well.

Exactly. This was the original design goal for back-mdb and LMDB in OpenLDAP.
http://www.openldap.org/lists/openldap-devel/200905/msg00036.html

Note that the back-mdb in OpenLDAP 2.4 is a compromise from this original
design; we still have a slight deserialization pass when reading entries from
the DB. But it's much simpler and faster than what we used to do with
back-bdb/hdb.

Ultimately - if your local persistence layer is so slow that it needs an
in-memory cache, that local persistence layer is broken. This conclusion is
inescapable, after many years of working with BerkeleyDB.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
_______________________________________________
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
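
For readers who want the quoted proposal in concrete form: the transactional
entry cache described in the quoted message (each txn handle sees its own
view, commit folds the copy into the parent or global view, abort discards
the copy) could be sketched roughly as below. This is only an illustration
in Python under a single-writer assumption. The class and method names
(CacheView, begin, get, put, delete, commit, abort) are hypothetical and are
not taken from 389-ds, OpenLDAP, or LMDB, and the sketch does not give
concurrent readers of the global view a stable snapshot.

```python
class CacheView:
    """One view of the entry cache: the global view or a txn's private overlay.

    Hypothetical sketch; not 389-ds, OpenLDAP, or LMDB code.
    """

    _DELETED = object()  # tombstone: entry removed in this view

    def __init__(self, parent=None):
        self.parent = parent   # enclosing txn view, or None for the global view
        self.changes = {}      # dn -> entry (or tombstone) written in this view
        self.open = True

    def begin(self):
        """Start a nested txn whose reads fall through to this view."""
        self._require_open()
        return CacheView(parent=self)

    def get(self, dn):
        """Return the entry as seen by this txn, or None if absent or deleted."""
        self._require_open()
        view = self
        while view is not None:            # own changes first, then the parents
            if dn in view.changes:
                entry = view.changes[dn]
                return None if entry is CacheView._DELETED else entry
            view = view.parent
        return None

    def put(self, dn, entry):
        # Copy-on-write by convention: callers store a new entry object
        # instead of mutating one obtained from get().
        self._require_open()
        self.changes[dn] = entry

    def delete(self, dn):
        self._require_open()
        self.changes[dn] = CacheView._DELETED

    def commit(self):
        """Fold this txn's changes into the parent txn or the global view."""
        self._require_open()
        if self.parent is None:
            raise RuntimeError("the global view is not a txn")
        for dn, entry in self.changes.items():
            if entry is CacheView._DELETED and self.parent.parent is None:
                self.parent.changes.pop(dn, None)  # global view keeps no tombstones
            else:
                self.parent.changes[dn] = entry
        self._close()

    def abort(self):
        """Throw away this txn's private copy; the parent is untouched."""
        self._require_open()
        if self.parent is None:
            raise RuntimeError("the global view is not a txn")
        self._close()

    def _close(self):
        self.changes = {}
        self.open = False

    def _require_open(self):
        if not self.open:
            raise RuntimeError("txn already committed or aborted")


cache = CacheView()                          # global entry cache view
outer = cache.begin()
outer.put("uid=a,dc=example", {"cn": "A"})   # visible only inside outer
inner = outer.begin()
inner.delete("uid=a,dc=example")             # visible only inside inner
inner.abort()                                # outer still sees the entry
outer.commit()                               # the global view now sees it too
```

Because an aborted txn only ever wrote to its own overlay, nothing in the
parent or global view has to be flagged as garbage afterwards, which is the
problem raised at the top of the quoted thread.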