[jira] [Comment Edited] (OAK-3976) journal should support large(r) entries
[ https://issues.apache.org/jira/browse/OAK-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15746144#comment-15746144 ] Vikas Saurabh edited comment on OAK-3976 at 12/13/16 8:16 PM: -- Fixed (applied Marcel's patch) in trunk at [r1774094|https://svn.apache.org/r1774094]. was (Author: catholicon): Fixed in trunk at [r1774094|https://svn.apache.org/r1774094]. > journal should support large(r) entries > --- > > Key: OAK-3976 > URL: https://issues.apache.org/jira/browse/OAK-3976 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Affects Versions: 1.3.14 >Reporter: Stefan Egli >Assignee: Vikas Saurabh > Fix For: 1.6, 1.5.16 > > Attachments: OAK-3976-1.patch, OAK-3976.patch > > > Journal entries are created in the background write. Normally this happens > every second. If for some reason there is a large delay between two > background writes, the number of pending changes can also accumulate. Which > can result in (arbitrary) large single journal entries (ie with large {{_c}} > property). > This can cause multiple problems down the road: > * journal gc at this point loads 450 entries - and if some are large this can > result in a very large memory consumption during gc (which can cause severe > stability problems for the VM, if not OOM etc). This should be fixed with > OAK-3001 (where we only get the id, thus do not care how big {{_c}} is) > * before OAK-3001 is done (which is currently scheduled after 1.4) what we > can do is reduce the delete batch size (OAK-3975) > * background reads however also read the journal entries and even if > OAK-3001/OAK-3975 are implemented the background read can still cause large > memory consumption. So we need to improve this one way or another. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3976) journal should support large(r) entries
[ https://issues.apache.org/jira/browse/OAK-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15743140#comment-15743140 ] Vikas Saurabh edited comment on OAK-3976 at 12/12/16 9:23 PM: -- Attaching patch - [^OAK-3976.patch]. Notable difference from earlier version: * {{JournalEntry}} also tracks if some branch commits have been added - this is required to avoid the case when a commit pushes the entry and before any other commit background write tries to push the journal (thus using the same revision for the entry and logging error that it couldn't push the entry) ** This fails a few tests which assume that DocumentNodeStore init pushes an empty journal entry ({{JournalTest#lastRevRecoveryJournalTest, JournalTest#lastRevRecoveryJournalTestWithConcurrency, JournalGCTest#getTailRevision}}). While we could avoid let these tests pass by remembering last journal-revision pushes and skip same revision to be pushed instead of tracking if we have any changes to be pushed or not. But, I prefer this way (not push if not required) - fixed test diff at \[0]. [~mreutegg] thoughts? * Path counter is incremented if a new {{TreeNode}} is pushed - probably we should rename it to something like node-count * Lock around journal push removed * A bit of refactor of where "numChanges" crosses threshold (now in {{done()}} method) \[0]: {noformat} diff --git a/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalGCTest.java b/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalGCTest.java index 1c05460..aef2c037 100644 --- a/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalGCTest.java +++ b/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalGCTest.java @@ -105,7 +105,7 @@ public class JournalGCTest { // must collect all journal entries. the first created when // DocumentNodeStore was initialized and the second created // by the background update -assertEquals(2, jgc.gc(1, TimeUnit.HOURS)); +assertEquals(1, jgc.gc(1, TimeUnit.HOURS)); // current time, but without the increment done by getTime() now = c.getTime() - 1; diff --git a/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalTest.java b/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalTest.java index feb9ee1..2420065 100644 --- a/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalTest.java +++ b/oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/document/JournalTest.java @@ -337,11 +337,8 @@ public class JournalTest extends AbstractJournalTest { DocumentNodeStore ds2 = mk2.getNodeStore(); final int c2Id = ds2.getClusterId(); -// should have 1 each with just the root changed -assertJournalEntries(ds1, "{}"); -assertJournalEntries(ds2, "{}"); -assertEquals(1, countJournalEntries(ds1, 10)); -assertEquals(1, countJournalEntries(ds2, 10)); +assertEquals(0, countJournalEntries(ds1, 10)); +assertEquals(0, countJournalEntries(ds2, 10)); //1. Create base structure /x/y NodeBuilder b1 = ds1.getRoot().builder(); @@ -383,9 +380,9 @@ public class JournalTest extends AbstractJournalTest { // besides the former root change, now 1 also has final String change1 = "{\"x\":{\"y\":{}}}"; -assertJournalEntries(ds1, "{}", change1); +assertJournalEntries(ds1, change1); final String change2 = "{\"x\":{}}"; -assertJournalEntries(ds2, "{}", change2); +assertJournalEntries(ds2, change2); String change2b = "{\"x\":{\"y\":{\"z\":{"; @@ -400,14 +397,14 @@ public class JournalTest extends AbstractJournalTest { assertEquals(head2, getDocument(ds1, "/").getLastRev().get(c2Id)); // now 1 is unchanged, but 2 was recovered now, so has one more: -assertJournalEntries(ds1, "{}", change1); // unchanged -assertJournalEntries(ds2, "{}", change2, change2b); +assertJournalEntries(ds1, change1); // unchanged +assertJournalEntries(ds2, change2, change2b); // just some no-ops: recovery.recover(c2Id); recovery.recover(Iterators.emptyIterator(), c2Id); -assertJournalEntries(ds1, "{}", change1); // unchanged -assertJournalEntries(ds2, "{}", change2, change2b); +assertJournalEntries(ds1, change1); // unchanged +assertJournalEntries(ds2, change2, change2b); } else { @@ -439,8 +436,8 @@ public class JournalTest extends AbstractJournalTest { ready.await(5, TimeUnit.SECONDS); start.countDown(); assertTrue(end.await(20, TimeUnit.SECONDS)); -assertJournalEntries(ds1, "{}", change1); // unchanged -
[jira] [Comment Edited] (OAK-3976) journal should support large(r) entries
[ https://issues.apache.org/jira/browse/OAK-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15740002#comment-15740002 ] Vikas Saurabh edited comment on OAK-3976 at 12/11/16 4:42 PM: -- Also, I was a bit afraid of any sort of deadlocks arising here. So, I tried \[0] 100 committer threads adding a random node (sleeping 10ms in between) and a thread running background ops every 1s (journal push threshold set to 10). Letting this party run for 10s didn't dead-lock... so, there's a bit of relief :). \[0]: {code} @Test public void journalPushMustntDeadlock() throws Exception { int oldJournalPushThreshold = DocumentNodeStore.journalPushThreshold; DocumentNodeStore.journalPushThreshold = 10; try { final DocumentNodeStore ns = builderProvider.newBuilder().setAsyncDelay(0).getNodeStore(); final AtomicBoolean stopTest = new AtomicBoolean(); List threads = new ArrayList<>(); threads.add(new Thread(new Runnable() { @Override public void run() { while (!stopTest.get()) { ns.runBackgroundOperations(); try { Thread.sleep(1000); //slow background thread } catch (InterruptedException e) { // ignore and continue; } } } })); for (int i = 0; i < 100; i++) { threads.add(new Thread(new Runnable() { @Override public void run() { while (!stopTest.get()) { NodeBuilder builder = ns.getRoot().builder(); builder.child("foo" + UUID.randomUUID()); try { merge(ns, builder); } catch (CommitFailedException e) { e.printStackTrace(); //ignore errors and continue } try { Thread.sleep(10); } catch (InterruptedException e) { // ignore and continue; } } } })); } for (Thread t : threads) { t.start(); } Thread.sleep(1);//let them party for 10 seconds stopTest.set(true); for (Thread t : threads) { t.join(); } int i = 1; } finally { DocumentNodeStore.journalPushThreshold = oldJournalPushThreshold; } } {code} was (Author: catholicon): Also, I was a bit afraid of any sort of deadlocks arising here. So, I tried \[0] 100 committer threads adding a random node (sleeping 10ms in between) and a thread running background ops every 1s (journal push threshold set to 10). Letting this party run for 10s didn't dead-lock... so, there's a bit of relief :). \[0]: @Test public void journalPushMustntDeadlock() throws Exception { int oldJournalPushThreshold = DocumentNodeStore.journalPushThreshold; DocumentNodeStore.journalPushThreshold = 10; try { final DocumentNodeStore ns = builderProvider.newBuilder().setAsyncDelay(0).getNodeStore(); final AtomicBoolean stopTest = new AtomicBoolean(); List threads = new ArrayList<>(); threads.add(new Thread(new Runnable() { @Override public void run() { while (!stopTest.get()) { ns.runBackgroundOperations(); try { Thread.sleep(1000); //slow background thread } catch (InterruptedException e) { // ignore and continue; } } } })); for (int i = 0; i < 100; i++) { threads.add(new Thread(new Runnable() { @Override public void run() { while (!stopTest.get()) { NodeBuilder builder = ns.getRoot().builder(); builder.child("foo" + UUID.randomUUID()); try { merge(ns, builder); } catch (CommitFailedException e) { e.printStackTrace(); //ignore errors and continue } try { Thread.sleep(10);
[jira] [Comment Edited] (OAK-3976) journal should support large(r) entries
[ https://issues.apache.org/jira/browse/OAK-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732385#comment-15732385 ] Vikas Saurabh edited comment on OAK-3976 at 12/8/16 2:38 PM: - Good point. But, as you said, we'd probably be ok with minor accuracy losses. I avoided {{+=paths.length}} (in one step higher level modified method) for that reason though. was (Author: catholicon): Good point. But, as you said, we'd probably be ok with minor accuracy losses. I avoided {{+=paths.length}} for that reason though. > journal should support large(r) entries > --- > > Key: OAK-3976 > URL: https://issues.apache.org/jira/browse/OAK-3976 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Affects Versions: 1.3.14 >Reporter: Stefan Egli >Assignee: Vikas Saurabh > Fix For: 1.6, 1.5.16 > > > Journal entries are created in the background write. Normally this happens > every second. If for some reason there is a large delay between two > background writes, the number of pending changes can also accumulate. Which > can result in (arbitrary) large single journal entries (ie with large {{_c}} > property). > This can cause multiple problems down the road: > * journal gc at this point loads 450 entries - and if some are large this can > result in a very large memory consumption during gc (which can cause severe > stability problems for the VM, if not OOM etc). This should be fixed with > OAK-3001 (where we only get the id, thus do not care how big {{_c}} is) > * before OAK-3001 is done (which is currently scheduled after 1.4) what we > can do is reduce the delete batch size (OAK-3975) > * background reads however also read the journal entries and even if > OAK-3001/OAK-3975 are implemented the background read can still cause large > memory consumption. So we need to improve this one way or another. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3976) journal should support large(r) entries
[ https://issues.apache.org/jira/browse/OAK-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732050#comment-15732050 ] Vikas Saurabh edited comment on OAK-3976 at 12/8/16 1:55 PM: - [~mreutegg], here's roughly what I am thinking of doing: {code} diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java index 04ecfd7..c168ff0 100644 --- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java +++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java @@ -185,6 +185,13 @@ public final class DocumentNodeStore private boolean disableJournalDiff = diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java index 04ecfd7..c168ff0 100644 --- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java +++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java @@ -185,6 +185,13 @@ public final class DocumentNodeStore private boolean disableJournalDiff = Boolean.getBoolean(SYS_PROP_DISABLE_JOURNAL); +public static final String SYS_PROP_JOURNAL_PUSH_THRESHOLD = "oak.journalPushThreshold"; +/** + * Threshold for number of paths in journal entry to require a force push during commit + * (instead of at background write) + */ +static int journalPushThreshold = 10; //non-final to allow for tests. + /** * The document store (might be used by multiple node stores). */ @@ -790,6 +797,7 @@ public final class DocumentNodeStore changes.addChangeSet(getChangeSet(info)); // update head revision newHead[0] = before.update(c.getRevision()); +pushJournalEntry(c.getRevision(), false); setRoot(newHead[0]); commitQueue.headRevisionChanged(); dispatcher.contentChanged(getRoot(), info); @@ -2142,19 +2150,32 @@ public final class DocumentNodeStore return unsavedLastRevisions.persist(this, new UnsavedModifications.Snapshot() { @Override public void acquiring(Revision mostRecent) { -if (store.create(JOURNAL, singletonList(changes.asUpdateOp(mostRecent { -// success: start with a new document -changes = newJournalEntry(); -} else { -// fail: log and keep the changes -LOG.error("Failed to write to journal, accumulating changes for future write (~" + changes.getMemory() -+ " bytes)."); -} +pushJournalEntry(mostRecent, true); } }, backgroundOperationLock.writeLock()); } //-< internal >- +private ReadWriteLock journalPushLock = new ReentrantReadWriteLock(); +void pushJournalEntry(Revision r, boolean force) { +if (!force && changes.getNumChangedPaths() < journalPushThreshold) { +return; +} +journalPushLock.writeLock().lock(); +try { +if (store.create(JOURNAL, singletonList(changes.asUpdateOp(r { +// success: start with a new document +changes = newJournalEntry(); +} else { +// fail: log and keep the changes +LOG.error("Failed to write to journal, accumulating changes for future write (~" + changes.getMemory() ++ " bytes)."); +} +} finally { +journalPushLock.writeLock().unlock(); +} + +} /** * Returns the binary size of a property value represented as a JSON or diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java index e3aab49..cadef3c 100644 --- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java +++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java @@ -86,6 +86,8 @@ public final class JournalEntry extends Document { private volatile TreeNode changes = null; +private volatile int numChangedPaths = 0; + private boolean concurrent; JournalEntry(DocumentStore store) { @@ -318,6 +320,7 @@ public final class JournalEntry extends Document { for (String name : PathUtils.elements(path)) { node = node.getOrCreate(name); } +
[jira] [Comment Edited] (OAK-3976) journal should support large(r) entries
[ https://issues.apache.org/jira/browse/OAK-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732050#comment-15732050 ] Vikas Saurabh edited comment on OAK-3976 at 12/8/16 12:56 PM: -- [~mreutegg], here's roughly what I am thinking of doing: {code} diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java index 04ecfd7..c168ff0 100644 --- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java +++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java @@ -185,6 +185,13 @@ public final class DocumentNodeStore private boolean disableJournalDiff = diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java index 04ecfd7..c168ff0 100644 --- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java +++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java @@ -185,6 +185,13 @@ public final class DocumentNodeStore private boolean disableJournalDiff = Boolean.getBoolean(SYS_PROP_DISABLE_JOURNAL); +public static final String SYS_PROP_JOURNAL_PUSH_THRESHOLD = "oak.journalPushThreshold"; +/** + * Threshold for number of paths in journal entry to require a force push during commit + * (instead of at background write) + */ +static int journalPushThreshold = 10; //non-final to allow for tests. + /** * The document store (might be used by multiple node stores). */ @@ -790,6 +797,7 @@ public final class DocumentNodeStore changes.addChangeSet(getChangeSet(info)); // update head revision newHead[0] = before.update(c.getRevision()); +pushJournalEntry(c.getRevision(), false); setRoot(newHead[0]); commitQueue.headRevisionChanged(); dispatcher.contentChanged(getRoot(), info); @@ -2142,19 +2150,32 @@ public final class DocumentNodeStore return unsavedLastRevisions.persist(this, new UnsavedModifications.Snapshot() { @Override public void acquiring(Revision mostRecent) { -if (store.create(JOURNAL, singletonList(changes.asUpdateOp(mostRecent { -// success: start with a new document -changes = newJournalEntry(); -} else { -// fail: log and keep the changes -LOG.error("Failed to write to journal, accumulating changes for future write (~" + changes.getMemory() -+ " bytes)."); -} +pushJournalEntry(mostRecent, true); } }, backgroundOperationLock.writeLock()); } //-< internal >- +private ReadWriteLock journalPushLock = new ReentrantReadWriteLock(); +void pushJournalEntry(Revision r, boolean force) { +if (!force && changes.getNumChangedPaths() < journalPushThreshold) { +return; +} +journalPushLock.writeLock().lock(); +try { +if (store.create(JOURNAL, singletonList(changes.asUpdateOp(r { +// success: start with a new document +changes = newJournalEntry(); +} else { +// fail: log and keep the changes +LOG.error("Failed to write to journal, accumulating changes for future write (~" + changes.getMemory() ++ " bytes)."); +} +} finally { +journalPushLock.writeLock().unlock(); +} + +} /** * Returns the binary size of a property value represented as a JSON or diff --git a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java index e3aab49..cadef3c 100644 --- a/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java +++ b/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/JournalEntry.java @@ -86,6 +86,8 @@ public final class JournalEntry extends Document { private volatile TreeNode changes = null; +private volatile int numChangedPaths = 0; + private boolean concurrent; JournalEntry(DocumentStore store) { @@ -318,6 +320,7 @@ public final class JournalEntry extends Document { for (String name : PathUtils.elements(path)) { node = node.getOrCreate(name); } +