Re: [PR] OAK-10341 Tree store [jackrabbit-oak]

2024-07-12 Thread via GitHub


thomasmueller commented on code in PR #1577:
URL: https://github.com/apache/jackrabbit-oak/pull/1577#discussion_r1675841951


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/tree/TreeStore.java:
##
@@ -166,4 +171,28 @@ public Store getStore() {
 return store;
 }
 
+@Override
+public String getStorePath() {
+return directory.getAbsolutePath();
+}
+
+@Override
+public long getEntryCount() {
+return entryCount;
+}
+
+public void setEntryCount(long entryCount) {
+this.entryCount = entryCount;
+}
+
+@Override
+public String getIndexStoreType() {
+return STORE_TYPE;
+}
+
+@Override
+public boolean isIncremental() {
+return true;

Review Comment:
   To update the tree store incrementally, we can use log-structured merges, 
using Session.checkpoint() / Session.flush(). In theory, this is much more 
efficient than the IncrementalFlatFileStore, if done correctly. Specially if 
there are few changes. This is already implemented in the tree store, but needs 
to be "wired" correctly. 
   
   But I think I will not yet do that just yet, but instead concentrate on 
getting the basics working correctly (right now, e.g. the cache size is in 
number of entries instead of memory used... this can lead to OOME, or then in 
too low memory usage and so bad performance, if we have very unevenly sizes 
pages -- and we probably have this). Also, I need to port some unit tests.
   
   So I'll set it to "false" currently.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10341 Tree store [jackrabbit-oak]

2024-07-12 Thread via GitHub


thomasmueller commented on code in PR #1577:
URL: https://github.com/apache/jackrabbit-oak/pull/1577#discussion_r1675841951


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/tree/TreeStore.java:
##
@@ -166,4 +171,28 @@ public Store getStore() {
 return store;
 }
 
+@Override
+public String getStorePath() {
+return directory.getAbsolutePath();
+}
+
+@Override
+public long getEntryCount() {
+return entryCount;
+}
+
+public void setEntryCount(long entryCount) {
+this.entryCount = entryCount;
+}
+
+@Override
+public String getIndexStoreType() {
+return STORE_TYPE;
+}
+
+@Override
+public boolean isIncremental() {
+return true;

Review Comment:
   To update the tree store incrementally, we can use log-structured merges, 
using Session.checkpoint() / Session.flush(). In theory, this is much more 
efficient than the IncrementalFlatFileStore, if done correctly. Specially if 
there are few changes. This is already implemented in the tree store, but needs 
to be "wired" correctly. 
   
   But I think I will not yet do that just yet, but instead concentrate on 
getting the basics working correctly (right now, e.g. the cache size is in 
number of entries instead of memory used... this can lead to OOME if we have 
very unevenly sizes pages). Also, I need to port some unit tests.
   
   So I'll set it to "false" currently.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-12 Thread via GitHub


ionutzpi commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2225449508

   @reschke I vote for your proposal(fixing segment store)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-12 Thread via GitHub


reschke commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2225379200

   bq. For RDB it could be different: it could mean that if you want to use 
compression for RDB you'd have to accept that broken surrogates are not 
supported (unless we find a fast way to support it in compression).
   
   The elephant in the room is segment persistence, which IMHO allows *any* 
Java string. I don't believe discussing the differences of document store 
backends is helpful here; we should come to an agreement what the expections 
for Oak in *general* are. Having document store and segment store behave 
differently IMHO is a problem we should solve (by fixing segment store, would 
be my preference).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-12 Thread via GitHub


ionutzpi commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-2225354781

   @reschke @reschke I created methods to check the behavior with broken 
surrogate before and after introducing compression. After compression the test 
failed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add license header to AsyncCheckpointService [jackrabbit-oak]

2024-07-12 Thread via GitHub


nit0906 merged PR #1579:
URL: https://github.com/apache/jackrabbit-oak/pull/1579


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10905 | Add license header to AsyncCheckpointService [jackrabbit-oak]

2024-07-12 Thread via GitHub


nit0906 opened a new pull request, #1579:
URL: https://github.com/apache/jackrabbit-oak/pull/1579

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10341 Tree store [jackrabbit-oak]

2024-07-11 Thread via GitHub


nit0906 commented on code in PR #1577:
URL: https://github.com/apache/jackrabbit-oak/pull/1577#discussion_r1675055875


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/tree/TreeStore.java:
##
@@ -166,4 +171,28 @@ public Store getStore() {
 return store;
 }
 
+@Override
+public String getStorePath() {
+return directory.getAbsolutePath();
+}
+
+@Override
+public long getEntryCount() {
+return entryCount;
+}
+
+public void setEntryCount(long entryCount) {
+this.entryCount = entryCount;
+}
+
+@Override
+public String getIndexStoreType() {
+return STORE_TYPE;
+}
+
+@Override
+public boolean isIncremental() {
+return true;

Review Comment:
   This should be false. 
   
   The TreeStore should be the base implementation - that builds the a tree 
representation of the whole repository, and then we can have something like an 
IncrementalTreeStore, which builds the diff b/w 2 checkpoints to be appended to 
the base TreeStore.
   
   Or maybe simply use the diff by IncrementalFlatFileStore and merge that in 
the base TreeStore.
   
   Refer - 
   
   IndexStore -> FlatFileStore -> IncrementalFlatFileStore 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-11 Thread via GitHub


nit0906 merged PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10945: Remove usage of Guava Function interface [jackrabbit-oak]

2024-07-11 Thread via GitHub


reschke opened a new pull request, #1578:
URL: https://github.com/apache/jackrabbit-oak/pull/1578

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10341 Tree store [jackrabbit-oak]

2024-07-11 Thread via GitHub


nit0906 commented on code in PR #1577:
URL: https://github.com/apache/jackrabbit-oak/pull/1577#discussion_r1674170093


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/tree/TreeStore.java:
##
@@ -0,0 +1,169 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.index.indexer.document.tree;
+
+import java.io.Closeable;
+import java.io.File;
+import java.io.IOException;
+import java.util.Iterator;
+import java.util.Map.Entry;
+
+import org.apache.jackrabbit.oak.commons.PathUtils;
+import org.apache.jackrabbit.oak.index.indexer.document.NodeStateEntry;
+import 
org.apache.jackrabbit.oak.index.indexer.document.NodeStateEntry.NodeStateEntryBuilder;
+import 
org.apache.jackrabbit.oak.index.indexer.document.flatfile.NodeStateEntryReader;
+import org.apache.jackrabbit.oak.index.indexer.document.tree.store.Session;
+import org.apache.jackrabbit.oak.index.indexer.document.tree.store.Store;
+import 
org.apache.jackrabbit.oak.index.indexer.document.tree.store.StoreBuilder;
+import org.apache.jackrabbit.oak.index.indexer.document.tree.store.utils.Cache;
+import org.apache.jackrabbit.oak.plugins.memory.EmptyNodeState;
+import org.apache.jackrabbit.oak.spi.state.NodeState;
+
+public class TreeStore implements Iterable, Closeable {

Review Comment:
   I see that this is draft right now - but we should probably implement the 
IndexStore interface to make it easier to integrate. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10341 Tree store [jackrabbit-oak]

2024-07-11 Thread via GitHub


thomasmueller opened a new pull request, #1577:
URL: https://github.com/apache/jackrabbit-oak/pull/1577

   Work in progress


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-11 Thread via GitHub


stefan-egli commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-991078

   With the default set to disable compression, in theory this PR shouldn't do 
no harm and could be merged, as changing the default is a conscious decision 
that could be done based on the requirement of the system and which 
DocumentStore is used.
   
   Having said that, I think the current state of `testInterestingStrings` 
isn't ideal, as it lists a broken surrogate in a test string, but then doesn't 
test it. So I'd say either it uses that broken surrogate in a test, or then 
shouldn't list it at all. Except, I think now that this problem with broken 
surrogate was identified, we need to address it, so we probably should have it 
in the test indeed.
   
   What we might do is have the test use different DocumentStore fixtures and 
test on memory, mongo and rdb *with* the broken surrogate property and verify 
that the behavior with or without compression is the same.
   
   That could mean that for mongo the fact that compression doesn't support 
broken surrogates turns out as a non-issue as mongo already doesn't support it. 
It would just be good to have a test that verifies that.
   
   For RDB it could be different: it could mean that if you want to use 
compression for RDB you'd have to accept that broken surrogates are not 
supported (unless we find a fast way to support it in compression).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10705: oak-standalone: update dependencies [jackrabbit-oak]

2024-07-11 Thread via GitHub


mbaedke merged PR #1411:
URL: https://github.com/apache/jackrabbit-oak/pull/1411


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10944: oak-auth-ldap: update commons-pool2 dependency to 2.12.0 [jackrabbit-oak]

2024-07-11 Thread via GitHub


reschke merged PR #1576:
URL: https://github.com/apache/jackrabbit-oak/pull/1576


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10944: oak-auth-ldap: update commons-pool2 dependency to 2.12.0 [jackrabbit-oak]

2024-07-11 Thread via GitHub


reschke opened a new pull request, #1576:
URL: https://github.com/apache/jackrabbit-oak/pull/1576

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-11 Thread via GitHub


nfsantos commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673766546


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final long maxConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints, long 
maxConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+// maxConcurrentCheckpoints should at least be 1 more than 
minConcurrentCheckpoints
+if (maxConcurrentCheckpoints <= minConcurrentCheckpoints) {
+log.warn("[{}] Max concurrent checkpoints is less than or equal to 
min concurrent checkpoints. " +
+"Setting max concurrent checkpoints to min concurrent 
checkpoints + 1.", this.name);
+this.maxConcurrentCheckpoints = minConcurrentCheckpoints + 1;
+} else {
+this.maxConcurrentCheckpoints = maxConcurrentCheckpoints;
+}
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+protected long getMaxConcurrentCheckpoints() {
+return maxConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Map> filteredCheckpointMap = 
IndexUtils.getFilteredCheckpoints(store,
+entry -> 
name.equals(entry.getValue().get(CHECKPOINT_NAME_KEY)));
+// If the number of checkpoints created by this task are equal to or 
greater than the maxConcurrentCheckpoints, skip
+// creation of a new checkpoint.
+// This could happen in case the deletion of older checkpoints failed 
in multiple previous runs.
+if (filteredCheckpointMap.size() >= maxConcurrentCheckpoints) {
+log.info("[{}] Skipping checkpoint creation as the number of 
concurrent checkpoints is already at max limit", name);

Review Comment:
   

Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-11 Thread via GitHub


fabriziofortino commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673750045


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final long maxConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints, long 
maxConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+// maxConcurrentCheckpoints should at least be 1 more than 
minConcurrentCheckpoints
+if (maxConcurrentCheckpoints <= minConcurrentCheckpoints) {
+log.warn("[{}] Max concurrent checkpoints is less than or equal to 
min concurrent checkpoints. " +
+"Setting max concurrent checkpoints to min concurrent 
checkpoints + 1.", this.name);
+this.maxConcurrentCheckpoints = minConcurrentCheckpoints + 1;
+} else {
+this.maxConcurrentCheckpoints = maxConcurrentCheckpoints;
+}
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+protected long getMaxConcurrentCheckpoints() {
+return maxConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Map> filteredCheckpointMap = 
IndexUtils.getFilteredCheckpoints(store,
+entry -> 
name.equals(entry.getValue().get(CHECKPOINT_NAME_KEY)));
+// If the number of checkpoints created by this task are equal to or 
greater than the maxConcurrentCheckpoints, skip
+// creation of a new checkpoint.
+// This could happen in case the deletion of older checkpoints failed 
in multiple previous runs.
+if (filteredCheckpointMap.size() >= maxConcurrentCheckpoints) {
+log.info("[{}] Skipping checkpoint creation as the number of 
concurrent checkpoints is already at max limit", name);

Review 

Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-11 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673733028


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, Map.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a list of checkpoints filtered on the basis of 
CHECKPOINT_NAME_KEY (name). This is done using the
+// getFilteredCheckpoints in the IndexUtils, which gets all the 
checkpoints in the node store and then filters the list based on
+// the provided predicate using the checkpoint info map associated 
with the checkpoints.
+// The filtered checkpoints list is then sorted based on the 
CHECKPOINT_CREATED_TIMESTAMP_KEY (created-epoch).
+// Both the initial filtering and sorting is done based on the 
information from the associated checkpoint info map of a given checkpoint.
+Set 

Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-11 Thread via GitHub


nfsantos commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673694331


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, Map.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a list of checkpoints filtered on the basis of 
CHECKPOINT_NAME_KEY (name). This is done using the
+// getFilteredCheckpoints in the IndexUtils, which gets all the 
checkpoints in the node store and then filters the list based on
+// the provided predicate using the checkpoint info map associated 
with the checkpoints.
+// The filtered checkpoints list is then sorted based on the 
CHECKPOINT_CREATED_TIMESTAMP_KEY (created-epoch).
+// Both the initial filtering and sorting is done based on the 
information from the associated checkpoint info map of a given checkpoint.
+Set 

Re: [PR] OAK-10803 -- compress/uncompress property [jackrabbit-oak]

2024-07-11 Thread via GitHub


reschke commented on PR #1526:
URL: https://github.com/apache/jackrabbit-oak/pull/1526#issuecomment-299402

   > Why is this a MongoDB specific aspect? For MongoDB we know it is not 
supported (see [this skip 
here](https://github.com/apache/jackrabbit-oak/blob/0e327a27eb988a0f7656535275f3735ef8474e5d/oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/BasicDocumentStoreTest.java#L760))
 - but for RDB it is supported. So shouldn't this PR keep the support for RDB?
   
   I believe we should come up with consistent behaviour for all types of 
persistences. My preference would be to disallow Strings with broken UTF-8 
serializations, but that would be a change for segment persistence, and that's 
why I faced resistence when I proposed that back then.
   
   For *this* PR, it will have to handle what the current system allows.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-11 Thread via GitHub


fabriziofortino commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673581897


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, Map.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a list of checkpoints filtered on the basis of 
CHECKPOINT_NAME_KEY (name). This is done using the
+// getFilteredCheckpoints in the IndexUtils, which gets all the 
checkpoints in the node store and then filters the list based on
+// the provided predicate using the checkpoint info map associated 
with the checkpoints.
+// The filtered checkpoints list is then sorted based on the 
CHECKPOINT_CREATED_TIMESTAMP_KEY (created-epoch).
+// Both the initial filtering and sorting is done based on the 
information from the associated checkpoint info map of a given checkpoint.
+Set 

Re: [PR] OAK-10913 SQL-2 grammar: remove documentation for distinct [jackrabbit-oak]

2024-07-11 Thread via GitHub


thomasmueller merged PR #1552:
URL: https://github.com/apache/jackrabbit-oak/pull/1552


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-10 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673391713


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, Map.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a list of checkpoints filtered on the basis of 
CHECKPOINT_NAME_KEY (name). This is done using the
+// getFilteredCheckpoints in the IndexUtils, which gets all the 
checkpoints in the node store and then filters the list based on
+// the provided predicate using the checkpoint info map associated 
with the checkpoints.
+// The filtered checkpoints list is then sorted based on the 
CHECKPOINT_CREATED_TIMESTAMP_KEY (created-epoch).
+// Both the initial filtering and sorting is done based on the 
information from the associated checkpoint info map of a given checkpoint.
+Set 

Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-10 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673391713


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, Map.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a list of checkpoints filtered on the basis of 
CHECKPOINT_NAME_KEY (name). This is done using the
+// getFilteredCheckpoints in the IndexUtils, which gets all the 
checkpoints in the node store and then filters the list based on
+// the provided predicate using the checkpoint info map associated 
with the checkpoints.
+// The filtered checkpoints list is then sorted based on the 
CHECKPOINT_CREATED_TIMESTAMP_KEY (created-epoch).
+// Both the initial filtering and sorting is done based on the 
information from the associated checkpoint info map of a given checkpoint.
+Set 

Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-10 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1673391192


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, Map.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a list of checkpoints filtered on the basis of 
CHECKPOINT_NAME_KEY (name). This is done using the
+// getFilteredCheckpoints in the IndexUtils, which gets all the 
checkpoints in the node store and then filters the list based on
+// the provided predicate using the checkpoint info map associated 
with the checkpoints.
+// The filtered checkpoints list is then sorted based on the 
CHECKPOINT_CREATED_TIMESTAMP_KEY (created-epoch).
+// Both the initial filtering and sorting is done based on the 
information from the associated checkpoint info map of a given checkpoint.
+Set 

Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-10 Thread via GitHub


fabriziofortino commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1672128721


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointService.java:
##
@@ -0,0 +1,95 @@
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+
+import org.apache.jackrabbit.oak.osgi.OsgiWhiteboard;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.oak.spi.whiteboard.CompositeRegistration;
+import org.apache.jackrabbit.oak.spi.whiteboard.Registration;
+import org.apache.jackrabbit.oak.spi.whiteboard.Whiteboard;
+import org.osgi.framework.BundleContext;
+import org.osgi.service.component.annotations.*;
+import org.osgi.service.metatype.annotations.AttributeDefinition;
+import org.osgi.service.metatype.annotations.Designate;
+import org.osgi.service.metatype.annotations.ObjectClassDefinition;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+
+import static 
org.apache.jackrabbit.oak.spi.whiteboard.WhiteboardUtils.ScheduleExecutionInstanceTypes.RUN_ON_LEADER;
+import static 
org.apache.jackrabbit.oak.spi.whiteboard.WhiteboardUtils.scheduleWithFixedDelay;
+
+@Component(
+configurationPolicy = ConfigurationPolicy.REQUIRE,
+service = {})

Review Comment:
   `service = {}` is redundant
   ```suggestion
   @Component(configurationPolicy = ConfigurationPolicy.REQUIRE)
   ```



##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointService.java:
##
@@ -0,0 +1,95 @@
+package org.apache.jackrabbit.oak.plugins.index;
+
+

Review Comment:
   nit: multiple blank lines
   ```suggestion
   
   
   ```



##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Map;
+import java.util.Set;
+import java.util.TimeZone;
+
+/**
+ * This class is responsible for creating and deleting checkpoints 
asynchronously.
+ * The number of minimum concurrent checkpoints to keep in the system, along 
with the default lifetime of a checkpoint
+ * can be configured.
+ * When executed, this class should create one checkpoint in a single run with 
a configurable name.
+ * Following the creation of the checkpoint, it should try to delete 
checkpoints with the given name,
+ * in case the total number of such checkpoints is greater than the configured 
minimum concurrent checkpoints.
+ * By default, this task is registered using AsyncCheckpointService
+ */
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of this Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private final long checkpointLifetimeInSeconds;
+private final long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long 

Re: [PR] OAK-10941: oak-run: avoid use of Guava's ClassToInstanceMap [jackrabbit-oak]

2024-07-09 Thread via GitHub


reschke merged PR #1575:
URL: https://github.com/apache/jackrabbit-oak/pull/1575


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10941: oak-run: avoid use of Guava's ClassToInstanceMap [jackrabbit-oak]

2024-07-09 Thread via GitHub


reschke opened a new pull request, #1575:
URL: https://github.com/apache/jackrabbit-oak/pull/1575

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10940: oak-jcr: improve error message when mongo test fixture fails [jackrabbit-oak]

2024-07-09 Thread via GitHub


reschke merged PR #1574:
URL: https://github.com/apache/jackrabbit-oak/pull/1574


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10940: oak-jcr: improve error message when mongo test fixture fails [jackrabbit-oak]

2024-07-09 Thread via GitHub


reschke opened a new pull request, #1574:
URL: https://github.com/apache/jackrabbit-oak/pull/1574

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10848: commons: remove use of slf4j.event.Level in SystemPropertySupplier API [jackrabbit-oak]

2024-07-09 Thread via GitHub


reschke merged PR #1573:
URL: https://github.com/apache/jackrabbit-oak/pull/1573


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10848: commons: remove use of slf4j.event.Level in SystemPropertySupplier API [jackrabbit-oak]

2024-07-09 Thread via GitHub


reschke opened a new pull request, #1573:
URL: https://github.com/apache/jackrabbit-oak/pull/1573

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#issuecomment-2216515827

   > I'm missing a more detailed description of what the checkpoint service 
should do in the PR description. Like what is the configuration, when does it 
take checkpoints, how many it keeps, when checkpoints are deleted, what is the 
maximum time that a checkpoint can be kept. Otherwise, it's hard to evaluate 
the PR.
   
   @nfsantos  - I have tried to add more details in comments in the code and in 
the description. The main configurations of the service are defined in the 
AsyncCheckpointService class with descriptions. But I have added the same in 
the PR description as well. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669679890


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private long checkpointLifetimeInSeconds;
+private long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance();

Review Comment:
   I have added the a specific time zone as UTC , but I am not entirely sure if 
it's really needed (unless there could be a scenario where the author leader's 
pod could possibly get created in different timezones ? in which case, the 
default time zone would change).
   
   Either way, I specified it explicitly now. 
   
   Currently not moving to the new Time api , since using oak's ISO8601 for 
formatting, and would need to change that class as well to support the new api.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669660919


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/IndexUtils.java:
##
@@ -281,4 +287,35 @@ public static String getCaller(@Nullable String[] 
ignoredJavaPackages) {
 // if every element is ignored, we assume it's an internal request
 return "(internal)";
 }
+
+/**
+ * Returns a Map consisting of checkpoints and checkpointInfoMap filtered 
based on a given predicate
+ * which can utilise checkpoint info map for filtering
+ * @param store
+ * @param predicate predicate used for filtering of checkpoints
+ * @return Map> filteredCheckpoint Map
+ */
+public static Map> 
getFilteredCheckpoints(NodeStore store, Predicate>> predicate) {
+return StreamSupport.stream(store.checkpoints().spliterator(), false)
+.collect(Collectors.toMap(str -> str,
+store::checkpointInfo))
+.entrySet()
+.stream()
+.filter(predicate)
+.collect(Collectors.toMap(Map.Entry::getKey, 
Map.Entry::getValue));
+}
+
+/**
+ * Returns a map with checkpoint name as key and checkpoint metadata map 
as value, sorted on the basis of particular key in the metadata map.

Review Comment:
   >sorted on the basis of particular key in the metadata map.
   
   every checkpoint has an associated metadata map and this sort can be based 
on any of the keys of the metadata map. 
   
   Added an example to try and make it more clear.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669649609


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of Async task

Review Comment:
   ok



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669649109


##
oak-core/src/test/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointServiceTest.java:
##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.api.jmx.IndexStatsMBean;
+import org.apache.jackrabbit.oak.plugins.memory.MemoryNodeStore;
+import org.apache.jackrabbit.oak.plugins.observation.ChangeCollectorProvider;
+import org.apache.jackrabbit.oak.spi.commit.ValidatorProvider;
+import org.apache.jackrabbit.oak.spi.state.Clusterable;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.oak.stats.StatisticsProvider;
+import org.apache.sling.testing.mock.osgi.MockOsgi;
+import org.apache.sling.testing.mock.osgi.junit.OsgiContext;
+import org.jetbrains.annotations.NotNull;
+import org.junit.Rule;
+import org.junit.Test;
+
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+import static org.junit.Assert.*;
+import static org.junit.Assert.assertTrue;
+
+public class AsyncCheckpointServiceTest {
+
+@Rule
+public final OsgiContext context = new OsgiContext();
+
+private final MemoryNodeStore nodeStore = new 
AsyncCheckpointServiceTest.FakeClusterableMemoryNodeStore();
+private final AsyncCheckpointService service = new 
AsyncCheckpointService();
+
+@Test
+public void asyncReg() {

Review Comment:
   added



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669638440


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {

Review Comment:
   done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669632846


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private long checkpointLifetimeInSeconds;
+private long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance();
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, ImmutableMap.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a sorted checkpoint set on created-epoch field from the list of 
filtered checkpoint based on the creating process.

Review Comment:
   Tried to explain it better.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669625631


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private long checkpointLifetimeInSeconds;
+private long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance();
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, ImmutableMap.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a sorted checkpoint set on created-epoch field from the list of 
filtered checkpoint based on the creating process.
+Set sortedCheckpointSet = 
IndexUtils.getSortedCheckpointMap(IndexUtils.getFilteredCheckpoints(store,
+entry -> 
name.equals(entry.getValue().get(CHECKPOINT_NAME_KEY))), 
CHECKPOINT_CREATED_TIMESTAMP_KEY).keySet();
+int counter = sortedCheckpointSet.size();
+// Iterate over the sortedCheckpointSet and delete any checkpoints 
more than concurrentCheckpoints

Review Comment:
   Added a more verbose comment to make it clear.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-08 Thread via GitHub


nit0906 commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1669610107


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of Async task
+ */
+public static final String PROP_ASYNC_NAME = "oak.checkpoint.async";
+
+private final String name;
+private long checkpointLifetimeInSeconds;
+private long minConcurrentCheckpoints;
+private final NodeStore store;
+public static final String CHECKPOINT_CREATOR_KEY = "creator";
+public static final String CHECKPOINT_CREATED_KEY = "created";
+public static final String CHECKPOINT_CREATED_TIMESTAMP_KEY= 
"created-epoch";
+public static final String CHECKPOINT_THREAD_KEY = "thread";
+public static final String CHECKPOINT_NAME_KEY = "name";
+private static final Logger log = LoggerFactory
+.getLogger(AsyncCheckpointCreator.class);
+
+public AsyncCheckpointCreator(@NotNull NodeStore store, @NotNull String 
name, long checkpointLifetimeInSeconds, long minConcurrentCheckpoints) {
+this.store = store;
+this.name = name;
+this.checkpointLifetimeInSeconds = checkpointLifetimeInSeconds;
+this.minConcurrentCheckpoints = minConcurrentCheckpoints;
+}
+
+public String getName() {
+return name;
+}
+
+protected long getCheckpointLifetimeInSeconds() {
+return checkpointLifetimeInSeconds;
+}
+
+protected long getMinConcurrentCheckpoints() {
+return minConcurrentCheckpoints;
+}
+
+@Override
+public void run() {
+Calendar cal = Calendar.getInstance();
+String creationTimeStamp = String.valueOf(cal.getTimeInMillis());
+String creationTimeISOFormat = ISO8601.format(cal);
+String checkpoint = store.checkpoint(checkpointLifetimeInSeconds * 
1000, ImmutableMap.of(
+CHECKPOINT_CREATOR_KEY, 
AsyncCheckpointCreator.class.getSimpleName(),
+CHECKPOINT_CREATED_KEY, creationTimeISOFormat,
+CHECKPOINT_CREATED_TIMESTAMP_KEY, creationTimeStamp,
+CHECKPOINT_THREAD_KEY, Thread.currentThread().getName(),
+CHECKPOINT_NAME_KEY, name));
+log.info("[{}] Created checkpoint {} with creation time {}", name, 
checkpoint, creationTimeISOFormat);
+
+// Get a sorted checkpoint set on created-epoch field from the list of 
filtered checkpoint based on the creating process.
+Set sortedCheckpointSet = 
IndexUtils.getSortedCheckpointMap(IndexUtils.getFilteredCheckpoints(store,
+entry -> 
name.equals(entry.getValue().get(CHECKPOINT_NAME_KEY))), 
CHECKPOINT_CREATED_TIMESTAMP_KEY).keySet();
+int counter = sortedCheckpointSet.size();
+// Iterate over the sortedCheckpointSet and delete any checkpoints 
more than concurrentCheckpoints
+for (String cp : sortedCheckpointSet) {
+// Delete the checkpoint as long as the checkpoint count is 
greater than concurrentCheckpoints
+if (counter > minConcurrentCheckpoints) {
+if(store.release(cp) ) {
+log.info("[{}] Deleted checkpoint {}", name, cp);
+} else {
+log.warn("[{}] Unable to delete checkpoint {}. Removal 
will be attempted again in next run.", name, cp);
+}
+} else {
+break;
+}
+counter --;

Review Comment:
   Yes - this was done on purpose - we need to keep a *minimum* number of 
checkpoints in the system. 
   
   We are iterating on a sorted checkpoint set - > oldest 

[PR] OAK-10921 : combine two test variants using a common method [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli opened a new pull request, #1572:
URL: https://github.com/apache/jackrabbit-oak/pull/1572

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10921 : fixed race condition where fullGC database variables gets… [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli commented on code in PR #1562:
URL: https://github.com/apache/jackrabbit-oak/pull/1562#discussion_r1665651871


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/VersionGarbageCollectorIT.java:
##
@@ -1478,6 +1477,77 @@ private Iterator candidates(long 
fromModified, long toModified, in
 
 // OAK-10199 END
 
+// OAK-10921
+@Test
+public void resetFullGCFromOakRunWhileRunning() throws Exception {
+
+// if we reset fullGC from any external source while GC is running,
+// it should not update the fullGC variables.
+
+//1. Create nodes with properties
+NodeBuilder b1 = store1.getRoot().builder();
+
+// Add property to node & save
+for (int i = 0; i < 5; i++) {
+b1.child("z"+i).setProperty("prop", "foo", STRING);
+}
+store1.merge(b1, EmptyHook.INSTANCE, CommitInfo.EMPTY);
+store1.runBackgroundOperations();
+
+// enable the full gc flag
+writeField(gc, "fullGCEnabled", true, true);
+long maxAge = 1; //hours
+long delta = MINUTES.toMillis(10);
+//1. Go past GC age and check no GC done as nothing deleted
+clock.waitUntil(getCurrentTimestamp() + maxAge);
+VersionGCStats stats = gc(gc, maxAge, HOURS);
+assertStatsCountsZero(stats);
+
+//Remove property
+NodeBuilder b2 = store1.getRoot().builder();
+for (int i = 0; i < 5; i++) {
+b2.getChildNode("z"+i).removeProperty("prop");
+}
+store1.merge(b2, EmptyHook.INSTANCE, CommitInfo.EMPTY);
+store1.runBackgroundOperations();
+
+final AtomicReference gcRef = 
Atomics.newReference();
+final VersionGCSupport gcSupport = new 
VersionGCSupport(store1.getDocumentStore()) {
+
+@Override
+public Iterable getModifiedDocs(long fromModified, 
long toModified, int limit, @NotNull String fromId,
+  final @NotNull 
Set includePaths, final @NotNull Set excludePaths) {
+// reset fullGC variables externally while GC is running
+store1.getDocumentStore().remove(SETTINGS, 
SETTINGS_COLLECTION_ID);

Review Comment:
   I see the new `resetFullGCFromOakRunWhileRunning` is virtually identical to 
`resetGCFromOakRunWhileRunning` except for `reset()` vs `resetFullGC()` - which 
was the idea of course - but I would suggest to consider refactoring these two 
to avoid (test) code duplication - drafted this (without testing) in 
https://github.com/apache/jackrabbit-oak/pull/1572 to illustrate



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli merged PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli commented on PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570#issuecomment-2208700186

   agree this is incomplete - it is merely a first fix for the most noisy 
logger. we likely will have to exclude more - but that shouldn't prevent this 
one from going through.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli commented on code in PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570#discussion_r1665537149


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGCRecommendations.java:
##
@@ -170,7 +170,7 @@ public class VersionGCRecommendations {
 } else {
 oldestModifiedDryRunDocTimeStamp.set(fullGCDryRunTimestamp);
 }
-} else {
+} else if (fullGCEnabled) {

Review Comment:
   I was assuming dry run mode wouldn't be enabled if fullgc is disabled - but 
probably better to protect from that indeed. now done in 
https://github.com/apache/jackrabbit-oak/pull/1570/commits/0777573384e8d6504ab06eda073d0805a4e8a53a



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-04 Thread via GitHub


rishabhdaim commented on code in PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570#discussion_r1665485930


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGCRecommendations.java:
##
@@ -170,7 +170,7 @@ public class VersionGCRecommendations {
 } else {
 oldestModifiedDryRunDocTimeStamp.set(fullGCDryRunTimestamp);
 }
-} else {
+} else if (fullGCEnabled) {

Review Comment:
   We should move this check before the `isFullGCDryRun` check.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-04 Thread via GitHub


rishabhdaim commented on PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570#issuecomment-2208614324

   IMO, this is an incomplete fix. We need to add `fullGCEnabled` check before 
`isFullGCDryRun` at 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGCRecommendations.java#L152
 & 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGCRecommendations.java#L152,
 just like we did here: 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGCRecommendations.java#L304


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10921 : fixed race condition where fullGC database variables gets… [jackrabbit-oak]

2024-07-04 Thread via GitHub


rishabhdaim commented on code in PR #1562:
URL: https://github.com/apache/jackrabbit-oak/pull/1562#discussion_r1665468708


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/VersionGarbageCollectorIT.java:
##
@@ -191,7 +190,7 @@ public GCCounts(FullGCMode mode, int deletedDocGCCount, int 
deletedPropsCount,
 public DocumentStoreFixture fixture;
 
 @Parameter(1)
-public FullGCMode fullGcMode;
+FullGCMode fullGcMode;

Review Comment:
   reduced the scope for `FullGCMode` to avoid leaking this object.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli commented on PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570#issuecomment-2208485300

   tests look "good"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10843 : exclude a flaky test combination [jackrabbit-oak]

2024-07-04 Thread via GitHub


stefan-egli merged PR #1568:
URL: https://github.com/apache/jackrabbit-oak/pull/1568


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10934: segment-tar: update commons-pool2 dependendy to 2.12.0 [jackrabbit-oak]

2024-07-03 Thread via GitHub


reschke merged PR #1571:
URL: https://github.com/apache/jackrabbit-oak/pull/1571


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10934: segment-tar: update commons-pool2 dependendy to 2.12.0 [jackrabbit-oak]

2024-07-03 Thread via GitHub


reschke opened a new pull request, #1571:
URL: https://github.com/apache/jackrabbit-oak/pull/1571

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-03 Thread via GitHub


stefan-egli commented on PR #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570#issuecomment-2206737281

   PS: waiting for test to confirm no regression introduced by this - but 
ideally this would be included asap


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10933 : skip timestamp calculation if fullgc is not enabled [jackrabbit-oak]

2024-07-03 Thread via GitHub


stefan-egli opened a new pull request, #1570:
URL: https://github.com/apache/jackrabbit-oak/pull/1570

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10932: lucene: update commons-exec test dependency to 1.4.0 [jackrabbit-oak]

2024-07-03 Thread via GitHub


reschke merged PR #1569:
URL: https://github.com/apache/jackrabbit-oak/pull/1569


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10932: lucene: update commons-exec test dependency to 1.4.0 [jackrabbit-oak]

2024-07-03 Thread via GitHub


reschke opened a new pull request, #1569:
URL: https://github.com/apache/jackrabbit-oak/pull/1569

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10843 : exclude a flaky test combination [jackrabbit-oak]

2024-07-03 Thread via GitHub


stefan-egli opened a new pull request, #1568:
URL: https://github.com/apache/jackrabbit-oak/pull/1568

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10931: Update commons-io dependency to 2.16.1 [jackrabbit-oak]

2024-07-03 Thread via GitHub


reschke merged PR #1567:
URL: https://github.com/apache/jackrabbit-oak/pull/1567


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10931: Update commons-io dependency to 2.16.1 [jackrabbit-oak]

2024-07-03 Thread via GitHub


reschke opened a new pull request, #1567:
URL: https://github.com/apache/jackrabbit-oak/pull/1567

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10928: fix flaky ElasticIndexPlannerCommonTest.indexedButZeroWeightProps [jackrabbit-oak]

2024-07-02 Thread via GitHub


fabriziofortino merged PR #1565:
URL: https://github.com/apache/jackrabbit-oak/pull/1565


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] add class FlatFileStoreSplitter [jackrabbit-oak]

2024-07-02 Thread via GitHub


github-actions[bot] closed pull request #567: add class FlatFileStoreSplitter
URL: https://github.com/apache/jackrabbit-oak/pull/567


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10929: Update commons-codec dependency to 1.17.0 [jackrabbit-oak]

2024-07-02 Thread via GitHub


reschke merged PR #1566:
URL: https://github.com/apache/jackrabbit-oak/pull/1566


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10929: Update commons-codec dependency to 1.17.0 [jackrabbit-oak]

2024-07-02 Thread via GitHub


reschke opened a new pull request, #1566:
URL: https://github.com/apache/jackrabbit-oak/pull/1566

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] JCRVLT-747: support binaryless replication for authorizables and their descendants [jackrabbit-filevault]

2024-07-02 Thread via GitHub


reschke merged PR #333:
URL: https://github.com/apache/jackrabbit-filevault/pull/333


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-02 Thread via GitHub


steffenvan commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1657127391


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/IndexUtils.java:
##
@@ -281,4 +287,35 @@ public static String getCaller(@Nullable String[] 
ignoredJavaPackages) {
 // if every element is ignored, we assume it's an internal request
 return "(internal)";
 }
+
+/**
+ * Returns a Map consisting of checkpoints and checkpointInfoMap filtered 
based on a given predicate
+ * which can utilise checkpoint info map for filtering
+ * @param store
+ * @param predicate predicate used for filtering of checkpoints
+ * @return Map> filteredCheckpoint Map
+ */
+public static Map> 
getFilteredCheckpoints(NodeStore store, Predicate>> predicate) {
+return StreamSupport.stream(store.checkpoints().spliterator(), false)
+.collect(Collectors.toMap(str -> str,
+store::checkpointInfo))
+.entrySet()
+.stream()
+.filter(predicate)
+.collect(Collectors.toMap(Map.Entry::getKey, 
Map.Entry::getValue));
+}
+
+/**
+ * Returns a map with checkpoint name as key and checkpoint metadata map 
as value, sorted on the basis of particular key in the metadata map.

Review Comment:
   could you rewrite what this could sort based on? Imo: `sorted on the basis 
of particular key in the metadata map` isn't super clear. And an example would 
be very helpful.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10921 : fixed race condition where fullGC database variables gets… [jackrabbit-oak]

2024-07-02 Thread via GitHub


rishabhdaim opened a new pull request, #1562:
URL: https://github.com/apache/jackrabbit-oak/pull/1562

   … overridden if they are reset by external components


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10905 | Add a configurable async checkpoint creator service [jackrabbit-oak]

2024-07-02 Thread via GitHub


steffenvan commented on code in PR #1560:
URL: https://github.com/apache/jackrabbit-oak/pull/1560#discussion_r1657083883


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncCheckpointCreator.java:
##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.jackrabbit.oak.plugins.index;
+
+import org.apache.jackrabbit.guava.common.collect.ImmutableMap;
+import org.apache.jackrabbit.oak.spi.state.NodeStore;
+import org.apache.jackrabbit.util.ISO8601;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.Calendar;
+import java.util.Set;
+
+public class AsyncCheckpointCreator implements Runnable {
+
+/**
+ * Name of service property which determines the name of Async task

Review Comment:
   Perhaps we could reformulate this to be:
   `Name of service property which determines the name of *this* Async task`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-07-02 Thread via GitHub


stefan-egli commented on code in PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#discussion_r1654745087


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/FullGCStatsCollector.java:
##
@@ -31,6 +32,34 @@ public interface FullGCStatsCollector {
  */
 void documentRead();
 
+/**
+ * Total No. of properties detected as garbage during a given GC phase
+ * @param mode GC phase
+ * @param numProps no. of garbage properties found in current cycle
+ */
+void candidateProperties(GCPhase mode, long numProps);
+
+/**
+ * Total No. of documents detected as garbage during a given GC phase
+ * @param mode GC phase
+ * @param numCommits no. of garbage documents found in current cycle
+ */
+void candidateDocuments(GCPhase mode, long numCommits);

Review Comment:
   What is the use case of this counter? I see it is currently only used in 
`collectUnmergedBranchCommits` hence seems a bit asymmetric. There already is a 
counter for how many documents are read (`documentRead`), is there an advantage 
of having `candidateDocuments` too?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1651122982


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/util/UtilsTest.java:
##
@@ -236,6 +237,30 @@ public void fullGCDisabledForRDB() {
 assertFalse("Full GC is disabled for RDB Document Store", 
fullGCEnabled);
 }
 
+@Test
+public void fullGCModeDefaultValue() {
+int fullGCModeDefaultValue = 
getFullGCMode(newDocumentNodeStoreBuilder());
+final int FULL_GC_MODE_NONE = 0;

Review Comment:
   Done.



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/util/UtilsTest.java:
##
@@ -236,6 +237,30 @@ public void fullGCDisabledForRDB() {
 assertFalse("Full GC is disabled for RDB Document Store", 
fullGCEnabled);
 }
 
+@Test
+public void fullGCModeDefaultValue() {
+int fullGCModeDefaultValue = 
getFullGCMode(newDocumentNodeStoreBuilder());
+final int FULL_GC_MODE_NONE = 0;
+assertEquals("Full GC mode has NONE value by default", 
fullGCModeDefaultValue, FULL_GC_MODE_NONE);
+}
+
+@Test
+public void fullGCModeSetViaConfiguration() {
+DocumentNodeStoreBuilder builder = newDocumentNodeStoreBuilder();
+final int FULL_GC_MODE_GAP_ORPHANS = 2;

Review Comment:
   Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1651122588


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/util/Utils.java:
##
@@ -1009,6 +1009,16 @@ public static boolean 
isEmbeddedVerificationEnabled(final DocumentNodeStoreBuild
 return builder.isEmbeddedVerificationEnabled() || 
(docStoreEmbeddedVerificationFeature != null && 
docStoreEmbeddedVerificationFeature.isEnabled());
 }
 
+/**
+ * Returns the full GC mode value for the DocumentNodeStore.
+ *
+ * @param builder instance for DocumentNodeStoreBuilder
+ * @return true if full GC is enabled else false
+ */
+public static int getFullGCMode(final DocumentNodeStoreBuilder builder) 
{

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r165738


##
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##
@@ -47,13 +47,13 @@
 import org.apache.jackrabbit.oak.plugins.document.NodeDocument;
 import org.apache.jackrabbit.oak.plugins.document.RevisionContextWrapper;
 import org.apache.jackrabbit.oak.plugins.document.SweepHelper;
+import org.apache.jackrabbit.oak.plugins.document.VersionGCOptions;
 import org.apache.jackrabbit.oak.plugins.document.VersionGCSupport;
+import org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector;
 import 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector.VersionGCInfo;
 import 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector.VersionGCStats;
 import org.apache.jackrabbit.oak.plugins.document.util.MongoConnection;
 import org.apache.jackrabbit.oak.run.commons.Command;
-import org.apache.jackrabbit.oak.plugins.document.VersionGCOptions;
-import org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector;

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10917 - Make persistent cache related arguments optional for Azur… [jackrabbit-oak]

2024-06-24 Thread via GitHub


dulceanu opened a new pull request, #1555:
URL: https://github.com/apache/jackrabbit-oak/pull/1555

   …e compaction


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1651098372


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java:
##
@@ -2603,7 +2605,7 @@ private void cleanOrphanedBranches() {
 }
 
 private void cleanRootCollisions() {
-String id = Utils.getIdFromPath(ROOT);
+String id = getIdFromPath(ROOT);

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1651096514


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java:
##
@@ -692,7 +694,7 @@ public int getMemory() {
 diffCache = builder.getDiffCache(this.clusterId);
 
 // check if root node exists
-NodeDocument rootDoc = store.find(NODES, Utils.getIdFromPath(ROOT));
+NodeDocument rootDoc = store.find(NODES, getIdFromPath(ROOT));

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1651080682


##
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##
@@ -69,10 +69,11 @@
 import static org.apache.jackrabbit.oak.plugins.document.Collection.NODES;
 import static 
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreHelper.createVersionGC;
 import static 
org.apache.jackrabbit.oak.plugins.document.FormatVersion.versionOf;
+import static 
org.apache.jackrabbit.oak.plugins.document.util.Utils.getFullGCMode;
 import static 
org.apache.jackrabbit.oak.plugins.document.util.Utils.getIdFromPath;
 import static 
org.apache.jackrabbit.oak.plugins.document.util.Utils.getRootDocument;
-import static 
org.apache.jackrabbit.oak.plugins.document.util.Utils.isFullGCEnabled;

Review Comment:
   same as above.



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGarbageCollector.java:
##
@@ -211,6 +211,29 @@ static FullGCMode getFullGcMode() {
 return fullGcMode;
 }
 
+/**
+ * Set the full GC mode to be used according to the provided configuration 
value.
+ * The configuration value will be ignored and fullGCMode will be reset to 
NONE
+ * if it is set to any other values than the supported ones.
+ * @param fullGcModeConfig
+ */
+static void setFullGcMode(int fullGcModeConfig) {

Review Comment:
   I would suggest renaming variable name to `fullGcMode`



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/util/UtilsTest.java:
##
@@ -236,6 +237,30 @@ public void fullGCDisabledForRDB() {
 assertFalse("Full GC is disabled for RDB Document Store", 
fullGCEnabled);
 }
 
+@Test
+public void fullGCModeDefaultValue() {
+int fullGCModeDefaultValue = 
getFullGCMode(newDocumentNodeStoreBuilder());
+final int FULL_GC_MODE_NONE = 0;

Review Comment:
   please use camelCase here.



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/util/Utils.java:
##
@@ -1009,6 +1009,16 @@ public static boolean 
isEmbeddedVerificationEnabled(final DocumentNodeStoreBuild
 return builder.isEmbeddedVerificationEnabled() || 
(docStoreEmbeddedVerificationFeature != null && 
docStoreEmbeddedVerificationFeature.isEnabled());
 }
 
+/**
+ * Returns the full GC mode value for the DocumentNodeStore.
+ *
+ * @param builder instance for DocumentNodeStoreBuilder
+ * @return true if full GC is enabled else false
+ */
+public static int getFullGCMode(final DocumentNodeStoreBuilder builder) 
{

Review Comment:
   This utility method doesn't make sense, it only returns `fullGcMode` from 
the `builder` object. I would remove this.



##
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##
@@ -47,13 +47,13 @@
 import org.apache.jackrabbit.oak.plugins.document.NodeDocument;
 import org.apache.jackrabbit.oak.plugins.document.RevisionContextWrapper;
 import org.apache.jackrabbit.oak.plugins.document.SweepHelper;
+import org.apache.jackrabbit.oak.plugins.document.VersionGCOptions;
 import org.apache.jackrabbit.oak.plugins.document.VersionGCSupport;
+import org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector;
 import 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector.VersionGCInfo;
 import 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector.VersionGCStats;
 import org.apache.jackrabbit.oak.plugins.document.util.MongoConnection;
 import org.apache.jackrabbit.oak.run.commons.Command;
-import org.apache.jackrabbit.oak.plugins.document.VersionGCOptions;
-import org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector;

Review Comment:
   please revert these import changes.



##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/util/UtilsTest.java:
##
@@ -236,6 +237,30 @@ public void fullGCDisabledForRDB() {
 assertFalse("Full GC is disabled for RDB Document Store", 
fullGCEnabled);
 }
 
+@Test
+public void fullGCModeDefaultValue() {
+int fullGCModeDefaultValue = 
getFullGCMode(newDocumentNodeStoreBuilder());
+final int FULL_GC_MODE_NONE = 0;
+assertEquals("Full GC mode has NONE value by default", 
fullGCModeDefaultValue, FULL_GC_MODE_NONE);
+}
+
+@Test
+public void fullGCModeSetViaConfiguration() {
+DocumentNodeStoreBuilder builder = newDocumentNodeStoreBuilder();
+final int FULL_GC_MODE_GAP_ORPHANS = 2;

Review Comment:
   same as above.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:

[PR] OAK-10843 : exclude a flaky test combination [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli opened a new pull request, #1554:
URL: https://github.com/apache/jackrabbit-oak/pull/1554

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim merged PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add OSGI configuration variable for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1650969435


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java:
##
@@ -692,7 +694,7 @@ public int getMemory() {
 diffCache = builder.getDiffCache(this.clusterId);
 
 // check if root node exists
-NodeDocument rootDoc = store.find(NODES, Utils.getIdFromPath(ROOT));
+NodeDocument rootDoc = store.find(NODES, getIdFromPath(ROOT));

Review Comment:
   but it's in a random unrelated code - which is against our goal to not do 
unrelated changes



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim commented on code in PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551#discussion_r1650892053


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreServiceConfigurationTest.java:
##
@@ -83,7 +84,9 @@ public void defaultValues() throws Exception {
 assertEquals("MONGO", config.documentStoreType());
 assertEquals(DocumentNodeStoreService.DEFAULT_BUNDLING_DISABLED, 
config.bundlingDisabled());
 assertEquals(DocumentMK.Builder.DEFAULT_UPDATE_LIMIT, 
config.updateLimit());
-assertEquals(Arrays.asList("/"), 
Arrays.asList(config.persistentCacheIncludes()));
+assertEquals(of("/"), Arrays.asList(config.persistentCacheIncludes()));

Review Comment:
   Yes, it can be avoided, but IntelliJ insisted on making this change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add feature toggle for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1650889773


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/VersionGCTest.java:
##
@@ -531,6 +531,57 @@ public void testResetFullGCDryRunMode() throws Exception {
 
 // OAK-10370 END
 
+// OAK-10896
+
+@Test
+public void testVersionGCLoadGCModeConfigurationNotApplicable() throws 
Exception {
+int fullGcModeNotAllowedValue = 5;
+int fullGcModeGapOrphans = 2;
+
+// set fullGcMode to allowed value that is different than NONE
+VersionGarbageCollector.setFullGcMode(fullGcModeGapOrphans);
+
+// reinitialize VersionGarbageCollector with not allowed value
+VersionGarbageCollector gc = new VersionGarbageCollector(
+ns, new VersionGCSupport(store), true, false, false,
+fullGcModeNotAllowedValue);
+
+assertEquals("Starting VersionGarbageCollector with not applicable / 
not allowed value" +
+"will set fullGcMode to default NONE", gc.getFullGcMode(), 
VersionGarbageCollector.FullGCMode.NONE);

Review Comment:
   Fixed the comment, unit tests and static change, thank you for the 
suggestions



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add feature toggle for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1650887782


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreService.java:
##
@@ -238,6 +239,8 @@ static DocumentStoreType fromString(String type) {
 private Feature cancelInvalidationFeature;
 private Feature docStoreFullGCFeature;
 private Feature docStoreEmbeddedVerificationFeature;
+private Feature docStoreFullGCModeGapOrphansFeature;
+private Feature docStoreFullGCModeEmptyPropertiesFeature;

Review Comment:
   Fixed, thank you for the comment.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10896: - Add feature toggle for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


shodaaan commented on code in PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#discussion_r1650887172


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java:
##
@@ -692,7 +694,7 @@ public int getMemory() {
 diffCache = builder.getDiffCache(this.clusterId);
 
 // check if root node exists
-NodeDocument rootDoc = store.find(NODES, Utils.getIdFromPath(ROOT));
+NodeDocument rootDoc = store.find(NODES, getIdFromPath(ROOT));

Review Comment:
   This is consistent with how we are using static method calls. The consistent 
approach would be IMHO to actually do this everywhere, but I think a 
refactoring ticket is the best place for that.



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStore.java:
##
@@ -716,7 +718,7 @@ public int getMemory() {
 setRoot(head);
 // make sure _lastRev is written back to store
 backgroundWrite();
-rootDoc = store.find(NODES, Utils.getIdFromPath(ROOT));
+rootDoc = store.find(NODES, getIdFromPath(ROOT));

Review Comment:
   This is consistent with how we are using static method calls. The consistent 
approach would be IMHO to actually do this everywhere, but I think a 
refactoring ticket is the best place for that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-06-24 Thread via GitHub


Joscorbe commented on PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#issuecomment-2186350396

   I will squash this PR into a single commit once approved.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10914 : disable mongo-server-side exclude paths until query perfo… [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli merged PR #1553:
URL: https://github.com/apache/jackrabbit-oak/pull/1553


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-06-24 Thread via GitHub


Joscorbe commented on code in PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#discussion_r1650867475


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/VersionGarbageCollector.java:
##
@@ -240,7 +240,21 @@ static FullGCMode getFullGcMode() {
 AUDIT_LOG.info(" VersionGarbageCollector created with fullGcMode 
= {}", fullGcMode);
 }
 
-public void setStatisticsProvider(StatisticsProvider provider) {
+/**
+ * Please note that at the moment the includes do not
+ * take long paths into account. That is, if a long path was
+ * supposed to be included via an include, it is not.
+ * Reason for this is that long paths would require
+ * the mongo query to include a '_path' condition - which disallows
+ * mongo from using the '_modified_id' index. IOW long paths
+ * would result in full scans - which results in bad performance.
+ */
+void setFullGCPaths(@NotNull Set includes, @NotNull Set 
excludes) {
+this.fullGCIncludePaths = requireNonNull(includes);
+this.fullGCExcludePaths = requireNonNull(excludes);
+}
+
+void setStatisticsProvider(StatisticsProvider provider) {

Review Comment:
   Oh, the public was lost here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10914 : disable mongo-server-side exclude paths until query perfo… [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on PR #1553:
URL: https://github.com/apache/jackrabbit-oak/pull/1553#issuecomment-2186346121

   > I think we might need to update the test for this as well.
   
   Client-side incl/exclude always also applies (perhaps something to improve 
at some point), so if mongo doesn't do server-side exclude, the client-side 
still applies, so tests still run fine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on code in PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551#discussion_r1650864494


##
oak-store-document/src/test/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreServiceConfigurationTest.java:
##
@@ -83,7 +84,9 @@ public void defaultValues() throws Exception {
 assertEquals("MONGO", config.documentStoreType());
 assertEquals(DocumentNodeStoreService.DEFAULT_BUNDLING_DISABLED, 
config.bundlingDisabled());
 assertEquals(DocumentMK.Builder.DEFAULT_UPDATE_LIMIT, 
config.updateLimit());
-assertEquals(Arrays.asList("/"), 
Arrays.asList(config.persistentCacheIncludes()));
+assertEquals(of("/"), Arrays.asList(config.persistentCacheIncludes()));

Review Comment:
   nitpick : looks unrelated/unnecessary?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10914 : disable mongo-server-side exclude paths until query perfo… [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim commented on PR #1553:
URL: https://github.com/apache/jackrabbit-oak/pull/1553#issuecomment-2186335794

   I think we might need to update the test for this as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] OAK-10914 : disable mongo-server-side exclude paths until query perfo… [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli opened a new pull request, #1553:
URL: https://github.com/apache/jackrabbit-oak/pull/1553

   …rmance is fixed


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-06-24 Thread via GitHub


Joscorbe commented on PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#issuecomment-2186329449

   > actually, there are compilation errors:
   > 
   > ```
   > [INFO] -
   > Error:  COMPILATION ERROR : 
   > [INFO] -
   > Error:  
/home/runner/work/jackrabbit-oak/jackrabbit-oak/oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:[367,11]
 setStatisticsProvider(org.apache.jackrabbit.oak.stats.StatisticsProvider) is 
not public in 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector; cannot be 
accessed from outside package
   > Error:  
/home/runner/work/jackrabbit-oak/jackrabbit-oak/oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:[538,11]
 setStatisticsProvider(org.apache.jackrabbit.oak.stats.StatisticsProvider) is 
not public in 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector; cannot be 
accessed from outside package
   > [INFO] 2 errors 
   > ```
   
   I will double-check this, sounds weird, I built it and ran all the 
integration tests locally... 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-06-24 Thread via GitHub


Joscorbe commented on code in PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#discussion_r1650854809


##
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##
@@ -535,8 +544,8 @@ private void collectDocument(RevisionsOptions options, 
Closer closer, String pat
 }
 gc.collectGarbageOnDocument(documentNodeStore, workingDocument, 
options.isVerbose());
 
-//TODO: Probably we should output some details of fullGCStats. Could 
be done after OAK-10378
-//gc.getFullGCStats();
+System.out.println("Full GC Stats:");
+System.out.println(gc.getFullGCStatsReport());

Review Comment:
   Not on purpose, I will add them here too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim commented on code in PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551#discussion_r1650849974


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +
+"Any path which is added to both include & exclude paths, 
" +
+"would be removed from included paths." +

Review Comment:
   fixed in 
https://github.com/apache/jackrabbit-oak/pull/1551/commits/55f310e2d65cfe3208ff7688ce93787eccb0cfba



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +

Review Comment:
   fixed in 
https://github.com/apache/jackrabbit-oak/pull/1551/commits/55f310e2d65cfe3208ff7688ce93787eccb0cfba



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +

Review Comment:
   fixed in 
https://github.com/apache/jackrabbit-oak/pull/1551/commits/55f310e2d65cfe3208ff7688ce93787eccb0cfba



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim commented on code in PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551#discussion_r1650849122


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBuilder.java:
##
@@ -170,6 +174,8 @@ public class DocumentNodeStoreBuilder> {
 private boolean clusterInvisible;
 private boolean throttlingEnabled;
 private boolean fullGCEnabled;
+private Set fullGCIncludePaths = of();
+private Set fullGCExcludePaths = of();

Review Comment:
   fixed in 
https://github.com/apache/jackrabbit-oak/pull/1551/commits/55f310e2d65cfe3208ff7688ce93787eccb0cfba



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +
+"Any path which is added to both include & exclude paths, 
" +
+"would be removed from included paths." +
+"Note that this value can be overridden with a system 
property " +
+"'oak.documentstore.fullGCIncludes' where paths " +
+"are separated with '::'. Example: 
-Doak.documentstore.fullGCIncludes=/content::/var")
+String[] fullGCIncludes() default {};
+
+@AttributeDefinition(
+name = "Full GC Exclude Paths",
+description = "Paths which should be excluded from full Garbage 
collection." +
+"Empty value means no paths are excluded." +
+"Any path added to excluded list would be removed from 
include paths (if present)." +
+"Note that this value can be overridden with a system 
property " +
+"'oak.documentstore.fullGCExcludes' where paths " +
+"are separated with '::'. Example: 
-Doak.documentstore.fullGCExcludes=/content::/var")
+String[] fullGCExcludes() default {};

Review Comment:
   fixed in 
https://github.com/apache/jackrabbit-oak/pull/1551/commits/55f310e2d65cfe3208ff7688ce93787eccb0cfba



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +
+"Any path which is added to both include & exclude paths, 
" +
+"would be removed from included paths." +
+"Note that this value can be overridden with a system 
property " +
+"'oak.documentstore.fullGCIncludes' where paths " +
+"are separated with '::'. Example: 
-Doak.documentstore.fullGCIncludes=/content::/var")

Review Comment:
   fixed in 
https://github.com/apache/jackrabbit-oak/pull/1551/commits/55f310e2d65cfe3208ff7688ce93787eccb0cfba



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


rishabhdaim commented on code in PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551#discussion_r1650825750


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +
+"Any path which is added to both include & exclude paths, 
" +
+"would be removed from included paths." +
+"Note that this value can be overridden with a system 
property " +
+"'oak.documentstore.fullGCIncludes' where paths " +
+"are separated with '::'. Example: 
-Doak.documentstore.fullGCIncludes=/content::/var")

Review Comment:
   This magic happens here: 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreServiceConfiguration.java#L123



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#issuecomment-2186287106

   actually, there are compilation errors:
   ```
   [INFO] -
   Error:  COMPILATION ERROR : 
   [INFO] -
   Error:  
/home/runner/work/jackrabbit-oak/jackrabbit-oak/oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:[367,11]
 setStatisticsProvider(org.apache.jackrabbit.oak.stats.StatisticsProvider) is 
not public in 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector; cannot be 
accessed from outside package
   Error:  
/home/runner/work/jackrabbit-oak/jackrabbit-oak/oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:[538,11]
 setStatisticsProvider(org.apache.jackrabbit.oak.stats.StatisticsProvider) is 
not public in 
org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector; cannot be 
accessed from outside package
   [INFO] 2 errors 
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10748: Improve statistics to collect which type of garbage is sent/deleted [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on code in PR #1543:
URL: https://github.com/apache/jackrabbit-oak/pull/1543#discussion_r1650820289


##
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##
@@ -535,8 +544,8 @@ private void collectDocument(RevisionsOptions options, 
Closer closer, String pat
 }
 gc.collectGarbageOnDocument(documentNodeStore, workingDocument, 
options.isVerbose());
 
-//TODO: Probably we should output some details of fullGCStats. Could 
be done after OAK-10378
-//gc.getFullGCStats();
+System.out.println("Full GC Stats:");
+System.out.println(gc.getFullGCStatsReport());

Review Comment:
   just curious : above the report is indented with 4 spaces but not here - is 
this on purpose?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10912 : make fullGc include/exclude paths configurable via OSGI config [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on code in PR #1551:
URL: https://github.com/apache/jackrabbit-oak/pull/1551#discussion_r1650776449


##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +

Review Comment:
   I find this potentially confusing.
   
   Currently when no include path is specified, it results in "include 
everything" - so specifying an "include everything" explicitly wouldn't be 
necessary. But if we wanted to support this, then I think it shouldn't be an 
empty string, but rather eg `"/"` as that is more explicit (so if someone would 
specify an empty string, we could rather log a warn and skip that entry).
   
   wdyt?



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +

Review Comment:
   formatting is broken as some spaces are missing after dot
   



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +
+"Any path which is added to both include & exclude paths, 
" +
+"would be removed from included paths." +
+"Note that this value can be overridden with a system 
property " +
+"'oak.documentstore.fullGCIncludes' where paths " +
+"are separated with '::'. Example: 
-Doak.documentstore.fullGCIncludes=/content::/var")

Review Comment:
   I see we sometimes state it can be overwritten via `-Doak.mongo` and 
sometimes `-Doak.documentstore` ... does it have this magic indeed or is one or 
the other wrong?



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/Configuration.java:
##
@@ -260,6 +260,27 @@
 "are separated with '::'. Example: 
-Doak.documentstore.persistentCacheIncludes=/content::/var")
 String[] persistentCacheIncludes() default {"/"};
 
+@AttributeDefinition(
+name = "Full GC Include Paths",
+description = "Paths which should be included in full garbage 
collection." +
+"Empty value means all paths are included. " +
+"Any path which is added to both include & exclude paths, 
" +
+"would be removed from included paths." +

Review Comment:
   Not sure we need to point out someone configuring duplicate paths 
specifically.
   
   What about making this comment more generic into something like "Include and 
exclude paths can overlap. Exclude paths will take precedence" ?



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreService.java:
##
@@ -499,6 +499,8 @@ private void configureBuilder(DocumentNodeStoreBuilder 
builder) {
 
setDocStoreEmbeddedVerificationFeature(docStoreEmbeddedVerificationFeature).
 setThrottlingEnabled(config.throttlingEnabled()).
 setFullGCEnabled(config.fullGCEnabled()).
+setFullGCIncludePaths(config.fullGCIncludes()).

Review Comment:
   Configuring the identical path in both include and exclude seems weird - but 
the code does behaves correctly, namely exclude has precedence, so it's "as if 
that path wasn't there". But I don't think we need to add an extra warn for 
this ..



##
oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBuilder.java:
##
@@ -170,6 +174,8 @@ public class DocumentNodeStoreBuilder> {
 private boolean clusterInvisible;
 private boolean throttlingEnabled;
 private boolean fullGCEnabled;
+private Set fullGCIncludePaths = of();
+private Set fullGCExcludePaths = of();

Review Comment:
   btw, just a side-comment, I 

Re: [PR] OAK-10896: - Add feature toggle for removal of deleted properties and orphaned nodes [jackrabbit-oak]

2024-06-24 Thread via GitHub


stefan-egli commented on PR #1539:
URL: https://github.com/apache/jackrabbit-oak/pull/1539#issuecomment-2186194768

   PS: I think we should adjust jira/pr titles to talk about osgi config 
instead of feature toggle (to avoid future confusion)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: oak-dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



  1   2   3   4   5   6   7   8   9   10   >