[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davide Giannella updated OAK-4581: ---------------------------------- Fix Version/s: 1.12 > Persistent local journal for more reliable event generation > ----------------------------------------------------------- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core > Reporter: Chetan Mehrotra > Priority: Major > Labels: observation > Fix For: 1.10.0, 1.12 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)