[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Dulceanu updated OAK-4581: - Fix Version/s: (was: 1.38.0) 1.40.0 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.40.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Dulceanu updated OAK-4581: - Fix Version/s: (was: 1.36.0) > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.38.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Dulceanu updated OAK-4581: - Fix Version/s: 1.38.0 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.36.0, 1.38.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: (was: 1.14.0) > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.16.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: 1.16.0 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.14.0, 1.16.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: (was: 1.12.0) > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.14.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: 1.14.0 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.12.0, 1.14.0 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: (was: 1.10.0) > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.12 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-4581: -- Fix Version/s: 1.12 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Priority: Major > Labels: observation > Fix For: 1.10.0, 1.12 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Egli updated OAK-4581: - Fix Version/s: (was: 1.8) 1.10 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Assignee: Stefan Egli > Labels: observation > Fix For: 1.10 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Egli updated OAK-4581: - Fix Version/s: 1.8 > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Assignee: Stefan Egli > Labels: observation > Fix For: 1.8 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Egli updated OAK-4581: - Fix Version/s: (was: 1.6) removed target version 1.6 due to resourcing constraints > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Assignee: Stefan Egli > Labels: observation > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Egli updated OAK-4581: - Attachment: OAK-4581.v0.patch Attached [^OAK-4581.v0.patch] (which bases on [OAK-4655.v1.patch|https://issues.apache.org/jira/secure/attachment/12826406/OAK-4655.v1.patch] that introduces support for multiple SegmentNodeStores). This is an ongoing effort, but wanted to share progress early. Here's the status: * introduces a {{NodeStateSerializer}} that NSs should somehow implement in order to map {{NodeState->String}} and vice-verca (for storage of the event). * introduces a {{EventQueueFactory}} that is used by the BackgroundObserver instead of hardcoding creation of a {{newArrayBlockingQueue}} (it still does the latter for sling compatibility cases, but that we should get rid of) ** one version of this is above {{newArrayBlockingQueue}} - ie an in-memory queue that can still be used for a few cases ** the new version though is {{PersistedEventQueueFactory}} that creates {{PersistedBlockingQueue}} where the storing magic will happen. *** this last one is early stages - currently just show-cases using a secondary/thirdary.. store for persistence. (But storing is quite unoptimized there atm). * so again, the main idea is that the {{BackgroundObserver}} remains largely unchanged - it still works on a {{queue}} and wouldn't notice what's behind the queue. The logic for storing/retrieving via persistence is hidden behind a {{BlockingQueue}} implementation, that's the main point here I think. The consequences that this introduces will be: * we can only store values in the CommitInfo that are serializable - others would have to be skipped/get lost * creators of {{new Jcr()/new Oak()}} would pass the corresponding {{EventQueueFactory}} - thus the mapping would be done in the 'jcr factory service'. The EventQueueFactory will then propagate down the Repository/Session chain to the BackgroundObserver. I'll continue working on the {{PersistedBlockingQueue}} including testing but would appreciate some early feedback about this approach. > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra >Assignee: Stefan Egli > Labels: observation > Fix For: 1.6 > > Attachments: OAK-4581.v0.patch > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-4581: -- Labels: observation (was: ) > Persistent local journal for more reliable event generation > --- > > Key: OAK-4581 > URL: https://issues.apache.org/jira/browse/OAK-4581 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core >Reporter: Chetan Mehrotra > Labels: observation > Fix For: 1.6 > > > As discussed in OAK-2683 "hitting the observation queue limit" has multiple > drawbacks. Quite a bit of work is done to make diff generation faster. > However there are still chances of event queue getting filled up. > This issue is meant to implement a persistent event journal. Idea here being > # NodeStore would push the diff into a persistent store via a synchronous > observer > # Observors which are meant to handle such events in async way (by virtue of > being wrapped in BackgroundObserver) would instead pull the events from this > persisted journal > h3. A - What is persisted > h4. 1 - Serialized Root States and CommitInfo > In this approach we just persist the root states in serialized form. > * DocumentNodeStore - This means storing the root revision vector > * SegmentNodeStore - {color:red}Q1 - What does serialized form of > SegmentNodeStore root state looks like{color} - Possible the RecordId of > "root" state > Note that with OAK-4528 DocumentNodeStore can rely on persisted remote > journal to determine the affected paths. Which reduces the need for > persisting complete diff locally. > Event generation logic would then "deserialize" the persisted root states and > then generate the diff as currently done via NodeState comparison > h4. 2 - Serialized commit diff and CommitInfo > In this approach we can save the diff in JSOP form. The diff only contains > information about affected path. Similar to what is current being stored in > DocumentNodeStore journal > h4. CommitInfo > The commit info would also need to be serialized. So it needs to be ensure > whatever is stored there can be serialized or re calculated > h3. B - How it is persisted > h4. 1 - Use a secondary segment NodeStore > OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. > [~mreutegg] suggested that for persisted local journal we can also utilize a > SegmentNodeStore instance. Care needs to be taken for compaction. Either via > generation approach or relying on online compaction > h4. 2- Make use of write ahead log implementations > [~ianeboston] suggested that we can make use of some write ahead log > implementation like [1], [2] or [3] > h3. C - How changes get pulled > Some points to consider for event generation logic > # Would need a way to keep pointers to journal entry on per listener basis. > This would allow each Listener to "pull" content changes and generate diff as > per its speed and keeping in memory overhead low > # The journal should survive restarts > [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html > [2] > https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal > [3] > https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-4581) Persistent local journal for more reliable event generation
[ https://issues.apache.org/jira/browse/OAK-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra updated OAK-4581: - Description: As discussed in OAK-2683 "hitting the observation queue limit" has multiple drawbacks. Quite a bit of work is done to make diff generation faster. However there are still chances of event queue getting filled up. This issue is meant to implement a persistent event journal. Idea here being # NodeStore would push the diff into a persistent store via a synchronous observer # Observors which are meant to handle such events in async way (by virtue of being wrapped in BackgroundObserver) would instead pull the events from this persisted journal h3. A - What is persisted h4. 1 - Serialized Root States and CommitInfo In this approach we just persist the root states in serialized form. * DocumentNodeStore - This means storing the root revision vector * SegmentNodeStore - {color:red}Q1 - What does serialized form of SegmentNodeStore root state looks like{color} - Possible the RecordId of "root" state Note that with OAK-4528 DocumentNodeStore can rely on persisted remote journal to determine the affected paths. Which reduces the need for persisting complete diff locally. Event generation logic would then "deserialize" the persisted root states and then generate the diff as currently done via NodeState comparison h4. 2 - Serialized commit diff and CommitInfo In this approach we can save the diff in JSOP form. The diff only contains information about affected path. Similar to what is current being stored in DocumentNodeStore journal h4. CommitInfo The commit info would also need to be serialized. So it needs to be ensure whatever is stored there can be serialized or re calculated h3. B - How it is persisted h4. 1 - Use a secondary segment NodeStore OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. [~mreutegg] suggested that for persisted local journal we can also utilize a SegmentNodeStore instance. Care needs to be taken for compaction. Either via generation approach or relying on online compaction h4. 2- Make use of write ahead log implementations [~ianeboston] suggested that we can make use of some write ahead log implementation like [1], [2] or [3] h3. C - How changes get pulled Some points to consider for event generation logic # Would need a way to keep pointers to journal entry on per listener basis. This would allow each Listener to "pull" content changes and generate diff as per its speed and keeping in memory overhead low # The journal should survive restarts [1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html [2] https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal [3] https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog was: As discussed in OAK-2683 "hitting the observation queue limit" has multiple drawbacks. Quite a bit of work is done to make diff generation faster. However there are still chances of event queue getting filled up. This issue is meant to implement a persistent event journal. Idea here being # NodeStore would push the diff into a persistent store via a synchronous observer # Observors which are meant to handle such events in async way (by virtue of being wrapped in BackgroundObserver) would instead pull the events from this persisted journal h3. A - What is persisted h4. 1 - Serialized Root States and CommitInfo In this approach we just persist the root states in serialized form. * DocumentNodeStore - This means storing the root revision vector * SegmentNodeStore - {color:red}Q1 - What does serialized form of SegmentNodeStore root state looks like{color} - Possible the RecordId of "root" state Note that with OAK-4528 DocumentNodeStore can rely on persisted remote journal to determine the affected paths. Which reduces the need for persisting complete diff locally h4. 2 - Serialized commit diff and CommitInfo In this approach we can save the diff in JSOP form. The diff only contains information about affected path. Similar to what is current being stored in DocumentNodeStore journal h4. CommitInfo The commit info would also need to be serialized. So it needs to be ensure whatever is stored there can be serialized or re calculated h3. B - How it is persisted h4. 1 - Use a secondary segment NodeStore OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. [~mreutegg] suggested that for persisted local journal we can also utilize a SegmentNodeStore instance. Care needs to be taken for compaction. Either via generation approach or relying on online compaction h4. 2- Make use of write ahead log implementations [~ianeboston] suggested that we can make use of some write ahead log implementation like [1], [2] or [3] h3. C - Ho