> Date: Tue, 26 Feb 2019 17:21:50 -0700
> From: Rich Megginson <rmegg...@redhat.com>
> Message-ID: <d40bde83-1e88-b34f-9b5d-d2b320468...@redhat.com>

> On 2/26/19 4:26 PM, William Brown wrote:
>>> I think the recursive/nested transactions at the database level are not the 
>>> problem; we handle this correctly already - either all changes become 
>>> persistent or none do.
>>> What we do not manage are the modifications we make in parallel on in-memory 
>>> structures like the entry cache. Changes to the EC are not managed by any 
>>> txn, and I do not see how any of the database txn models would help; they do 
>>> not know about the EC and cannot abort its changes.
>>> We would need to incorporate the EC into a generic txn model, or have a way 
>>> to flag EC entries as garbage if a txn is aborted.
>> The issue is that we allow parallel writes, which breaks the consistency 
>> guarantees of the EC anyway. LMDB won't allow parallel writes (it's a single 
>> writer with concurrent parallel readers), and most other modern kv stores 
>> take this approach too, so IMO we should be remodelling our transactions to 
>> match this. I think it will make reasoning about the EC much simpler.

> Some sort of in-memory data structure with fast lookup and transactional 
> semantics is needed: modify operations are stored as mvcc/cow copies, so each 
> read of the database with a given txn handle sees its own view of the ec; a 
> txn commit applies the copy to the parent txn's ec view, or to the global ec 
> view if there is no parent; a txn abort simply deletes the txn's copy of the 
> ec.  A quick google search turns up several hits.  I'm not sure whether the 
> B+Tree proposed at 
> http://www.port389.org/docs/389ds/design/cache_redesign.html has 
> transactional semantics, or whether such code could be added to its 
> implementation.
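
A minimal sketch of that cow/mvcc lifecycle, just to make it concrete. The names
here (ec_txn, ec_lookup, ec_modify, ec_commit, ec_abort) are hypothetical, not
existing 389-ds APIs, and a real implementation would need a proper hash table,
locking, and reclamation of shadowed entries:

/* Illustrative cow/mvcc entry-cache view.  Shadowed entries are simply
 * leaked for brevity. */
#include <stdio.h>
#include <stdlib.h>

typedef struct ec_entry {
    int id;                     /* entry id */
    char dn[64];                /* simplified entry payload */
    struct ec_entry *next;
} ec_entry;

typedef struct ec_txn {
    ec_entry *writes;           /* this txn's private copies (cow) */
    struct ec_txn *parent;      /* NULL: commit publishes to the global view */
} ec_txn;

static ec_entry *global_view;   /* the shared ec view */

/* Reads see the txn's own copies first, then its parents, then the global
 * view - i.e. each txn handle sees its own consistent view of the ec. */
static ec_entry *ec_lookup(ec_txn *txn, int id)
{
    for (ec_txn *t = txn; t != NULL; t = t->parent)
        for (ec_entry *e = t->writes; e != NULL; e = e->next)
            if (e->id == id) return e;
    for (ec_entry *e = global_view; e != NULL; e = e->next)
        if (e->id == id) return e;
    return NULL;
}

/* Modifications are copy-on-write into the txn's private list. */
static void ec_modify(ec_txn *txn, int id, const char *dn)
{
    ec_entry *e = calloc(1, sizeof(*e));
    e->id = id;
    snprintf(e->dn, sizeof(e->dn), "%s", dn);
    e->next = txn->writes;
    txn->writes = e;
}

/* Commit: splice the private copies onto the parent txn's view, or onto the
 * global view if there is no parent.  Newer copies shadow older entries. */
static void ec_commit(ec_txn *txn)
{
    ec_entry **dst = txn->parent ? &txn->parent->writes : &global_view;
    ec_entry *tail = txn->writes;
    if (tail == NULL) return;
    while (tail->next != NULL) tail = tail->next;
    tail->next = *dst;
    *dst = txn->writes;
    txn->writes = NULL;
}

/* Abort: throw the private copies away; no other txn ever saw them. */
static void ec_abort(ec_txn *txn)
{
    for (ec_entry *e = txn->writes; e != NULL; ) {
        ec_entry *next = e->next;
        free(e);
        e = next;
    }
    txn->writes = NULL;
}

int main(void)
{
    ec_txn t1 = { NULL, NULL }, t2 = { NULL, NULL };
    ec_modify(&t1, 1, "uid=alice,dc=example,dc=com");
    printf("t1 sees:         %s\n", ec_lookup(&t1, 1)->dn);
    printf("t2 sees:         %s\n", ec_lookup(&t2, 1) ? "entry" : "(nothing)");
    ec_commit(&t1);
    printf("t2 after commit: %s\n", ec_lookup(&t2, 1)->dn);
    ec_abort(&t2);
    return 0;
}

Commit only ever publishes a txn's private copies upward, and abort just frees
them, so an aborted txn never pollutes the shared view.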
> 
> With LMDB, if we could make the on-disk entry representation the same as the 
> in-memory entry representation, then we could use LMDB as the entry cache 
> too - the database itself would be the entry cache.
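
For what it's worth, the LMDB C API already gives you the read side of that for
free: mdb_get() hands back a pointer directly into the memory map, so a read txn
sees a consistent MVCC snapshot with no separate copy to cache. A stripped-down
sketch (error handling omitted; the path and key are invented for illustration):

/* "The database is the cache": reads return pointers into the mmap. */
#include <lmdb.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    MDB_env *env;
    MDB_txn *txn;
    MDB_dbi dbi;
    MDB_val key, data;

    mdb_env_create(&env);
    mdb_env_open(env, "./testdb", 0, 0664);      /* directory must exist */

    /* Single writer: only one write txn at a time, enforced by LMDB. */
    mdb_txn_begin(env, NULL, 0, &txn);
    mdb_dbi_open(txn, NULL, 0, &dbi);
    key.mv_data = "uid=alice";    key.mv_size = strlen("uid=alice");
    data.mv_data = "entry bytes"; data.mv_size = strlen("entry bytes");
    mdb_put(txn, dbi, &key, &data, 0);
    mdb_txn_commit(txn);

    /* Readers get a consistent snapshot and never block the writer.
     * data.mv_data points into the memory map - no deserialization, no copy. */
    mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
    if (mdb_get(txn, dbi, &key, &data) == 0)
        printf("%.*s\n", (int)data.mv_size, (char *)data.mv_data);
    mdb_txn_abort(txn);                          /* release the read snapshot */

    mdb_env_close(env);
    return 0;
}

Writes are serialized through LMDB's single write txn, which is exactly the
single-writer / concurrent-readers model William describes above.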

Exactly. This was the original design goal for back-mdb and LMDB in OpenLDAP.
http://www.openldap.org/lists/openldap-devel/200905/msg00036.html

Note that the back-mdb in OpenLDAP 2.4 is a compromise from this original 
design; we still have a slight deserialization pass when reading entries from 
the DB. But it's much simpler and faster than what we used to do with 
back-bdb/hdb.

Ultimately - if your local persistence layer is so slow that it needs an 
in-memory cache, that local persistence layer is broken. This conclusion is 
inescapable, after many years of working with BerkeleyDB.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
_______________________________________________
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
