[PR] JCRVLT-745: stashing: move '/' to last in folder list, rename temp node to be easier to recognize [jackrabbit-filevault]

2024-05-08 Thread via GitHub


reschke opened a new pull request, #337:
URL: https://github.com/apache/jackrabbit-filevault/pull/337

   (no comment)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Rename DetailedGC to FullGC [jackrabbit-oak]

2024-05-08 Thread via GitHub


reschke commented on code in PR #1440:
URL: https://github.com/apache/jackrabbit-oak/pull/1440#discussion_r1594223850


##
oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:
##
@@ -46,19 +31,32 @@
 import org.apache.jackrabbit.oak.plugins.document.NodeDocument;
 import org.apache.jackrabbit.oak.plugins.document.RevisionContextWrapper;
 import org.apache.jackrabbit.oak.plugins.document.SweepHelper;
+import org.apache.jackrabbit.oak.plugins.document.VersionGCOptions;

Review Comment:
   Please do not move imports around (unless the change is exactly for thar 
purpose)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Rename DetailedGC to FullGC [jackrabbit-oak]

2024-05-08 Thread via GitHub


reschke commented on PR #1440:
URL: https://github.com/apache/jackrabbit-oak/pull/1440#issuecomment-2100830714

   Can you please open a JIRA ticket to track this change?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (JCRVLT-745) Stashing: naming and folder location

2024-05-08 Thread Julian Reschke (Jira)


[ 
https://issues.apache.org/jira/browse/JCRVLT-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844641#comment-17844641
 ] 

Julian Reschke commented on JCRVLT-745:
---

[~joscorbe] - in theory this shouldn't be a problem if the nodes indeed were 
temporary. What we found is that the stashing operation was failing due to a 
combination of issues (Oak document store handling of ordered children, and 
cleanup of intermediate commits). When it fails, it indeed leaves things where 
they are, and I'm not sure it would be good if FV tried to handle that.

That said, I'm planning to make a PR that changes the order of locations where 
the temp node creation is attempted (so "/" is last), and also improves naming 
of these nodes (for more information should they stick).

> Stashing: naming and folder location
> 
>
> Key: JCRVLT-745
> URL: https://issues.apache.org/jira/browse/JCRVLT-745
> Project: Jackrabbit FileVault
>  Issue Type: Improvement
>  Components: vlt
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
>
> We have seen cases where node stashing failed due to RuntimeExceptions (OOM 
> or MongoDB issues in Oak). In these cases, the tmp folder is not cleaned up. 
> If the operation is retried many times, there'll be many of these.
> Suggestion:
> 1. Do not use the root as default location,
> 2. Introduce an intermediary node that makes it clear that this is temp space 
> created by FileVault.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (JCRVLT-745) Stashing: naming and folder location

2024-05-08 Thread Jira


[ 
https://issues.apache.org/jira/browse/JCRVLT-745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844639#comment-17844639
 ] 

Jose Andrés Cordero commented on JCRVLT-745:


Hi [~tripod], would be possible to change the default location to not create 
the nodes directly under the root document, but somewhere else? This causes 
some performance issues and, as an example, when running the oak-run console to 
access a repository, it first reads all the direct descendants from the root 
node, which will be affected by big amounts of these temporary nodes.

> Stashing: naming and folder location
> 
>
> Key: JCRVLT-745
> URL: https://issues.apache.org/jira/browse/JCRVLT-745
> Project: Jackrabbit FileVault
>  Issue Type: Improvement
>  Components: vlt
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
>
> We have seen cases where node stashing failed due to RuntimeExceptions (OOM 
> or MongoDB issues in Oak). In these cases, the tmp folder is not cleaned up. 
> If the operation is retried many times, there'll be many of these.
> Suggestion:
> 1. Do not use the root as default location,
> 2. Introduce an intermediary node that makes it clear that this is temp space 
> created by FileVault.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] OAK-10786: oak-lucene: use copy of lucene-core 4.7.2 source code [jackrabbit-oak]

2024-05-08 Thread via GitHub


reschke commented on PR #1439:
URL: https://github.com/apache/jackrabbit-oak/pull/1439#issuecomment-2100398336

   > Which JARs have the mentioned package?
   
   AFAIR, lucene-core and lucene-msic have overlaps (but there may be more).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10786: oak-lucene: use copy of lucene-core 4.7.2 source code [jackrabbit-oak]

2024-05-08 Thread via GitHub


steffenvan commented on PR #1439:
URL: https://github.com/apache/jackrabbit-oak/pull/1439#issuecomment-2100358275

   Which JARs have the mentioned package? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10784: o.a.j.o.plugins.migration.version.VersionableEditor should… [jackrabbit-oak]

2024-05-08 Thread via GitHub


mbaedke merged PR #1438:
URL: https://github.com/apache/jackrabbit-oak/pull/1438


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10784: o.a.j.o.plugins.migration.version.VersionableEditor should… [jackrabbit-oak]

2024-05-08 Thread via GitHub


mbaedke commented on code in PR #1438:
URL: https://github.com/apache/jackrabbit-oak/pull/1438#discussion_r1593709973


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/migration/version/VersionableEditor.java:
##
@@ -79,7 +80,7 @@ public class VersionableEditor extends DefaultEditor {
 
 private VersionableEditor(Provider provider, NodeBuilder rootBuilder) {
 this.rootBuilder = rootBuilder;
-this.versionStorage = getVersionStorage(rootBuilder);
+this.versionStorage = createVersionStorage(rootBuilder);

Review Comment:
   Yes, but the migration works completely on the nodestore API.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10778 - Support downloading from Mongo in parallel. [jackrabbit-oak]

2024-05-08 Thread via GitHub


nfsantos commented on code in PR #1435:
URL: https://github.com/apache/jackrabbit-oak/pull/1435#discussion_r1593725364


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoServerSelector.java:
##
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined;
+
+import com.mongodb.ServerAddress;
+import com.mongodb.connection.ClusterDescription;
+import com.mongodb.connection.ClusterType;
+import com.mongodb.connection.ServerDescription;
+import com.mongodb.event.ClusterClosedEvent;
+import com.mongodb.event.ClusterDescriptionChangedEvent;
+import com.mongodb.event.ClusterListener;
+import com.mongodb.event.ClusterOpeningEvent;
+import com.mongodb.selector.ServerSelector;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Selects a Mongo server that is available for a new connection. This policy 
is used by PipelinedMongoDownloadTask,
+ * the goal being to spread the two connections between the secondaries and 
avoid downloading from the primary.
+ * The policy tries to ensure the following:
+ *
+ * 
+ * - Establish new connections only to secondaries.
+ * - Do not establish more than one connection to a secondary.
+ * - If there is a connection to a secondary and that secondary is promoted to 
primary, the thread should disconnect
+ * from the primary and reconnect to a secondary.
+ * 
+ * This class uses the thread id of the caller to distribute the connections, 
that is, it assumes that there are
+ * two threads downloading from Mongo and each thread should be sent to a 
different secondary. If a thread calls several
+ * times the selection logic, it will receive always the same server. The 
thread id is used to identify the calling
+ * thread.
+ */
+public class PipelinedMongoServerSelector implements ServerSelector, 
ClusterListener {
+private static final Logger LOG = 
LoggerFactory.getLogger(PipelinedMongoServerSelector.class);
+
+// Thread ID -> ServerAddress
+private final HashMap serverAddressHashMap = new 
HashMap<>();
+// IDs of threads connected to primary
+private final HashSet connectedToPrimaryThreads = new HashSet<>();
+private final String threadNamePrefix;
+
+private ClusterDescription lastSeenClusterDescription;
+
+/**
+ * @param threadNamePrefix Threads that start with this prefix will be 
assigned only to secondary server. Other
+ * threads will be assigned all servers available.
+ */
+public PipelinedMongoServerSelector(String threadNamePrefix) {
+this.threadNamePrefix = threadNamePrefix;
+}
+
+@Override
+synchronized public List select(ClusterDescription 
clusterDescription) {
+return select(clusterDescription.getType(), 
clusterDescription.getServerDescriptions());
+}
+
+// Exposed for testing
+List select(ClusterType clusterType, 
List serverDescriptions) {
+if (clusterType != ClusterType.REPLICA_SET) {
+LOG.info("Cluster is not a replica set, returning all servers");
+return serverDescriptions;
+}
+String threadName = Thread.currentThread().getName();
+if (!threadName.startsWith(threadNamePrefix)) {
+LOG.info("Thread name does not start with {}, returning all 
servers", threadNamePrefix);
+return serverDescriptions;
+}
+
+Long threadId = Thread.currentThread().getId();
+LOG.debug("Selecting server from cluster: {}", 
serverDescriptions.stream()
+.map(sd -> sd.getAddress() + ", " + sd.getType() + ", " + 
sd.getState())
+.collect(Collectors.joining("\n", "\n", "")));
+var secondaries = serverDescriptions.stream()
+.filter(ServerDescription::isSecondary)
+.collect(Collectors.toSet());
+if (secondaries.isEmpty()) {
+LOG.info("No secondaries found, not selecting the primary. 
Returning empty list.");
+return 

Re: [PR] OAK-10778 - Support downloading from Mongo in parallel. [jackrabbit-oak]

2024-05-08 Thread via GitHub


nfsantos commented on code in PR #1435:
URL: https://github.com/apache/jackrabbit-oak/pull/1435#discussion_r1593718807


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/MongoRegexPathFilterFactory.java:
##
@@ -170,9 +170,9 @@ static List 
mergeIndexAndCustomExcludePaths(List indexExcludedPa
 return indexExcludedPaths;
 }
 
-var excludedUnion = new HashSet<>(indexExcludedPaths);

Review Comment:
   No specific reason, I'm still unsure about when to use var vs explicit type. 
In many cases keeping the type annotation makes the code more clear and does 
not significantly increase verbosity, for instance:
   ```
   var i = 1;
   
   int i = 1;
   ```
   The second option makes it clear that it is an int, while the first is 
ambiguous.
   
   In other cases, even if the type annotation is longer, I still somewhat 
prefer that it appears at least one time. For instance, here I feel it's better 
to use var:
   ```
   var bi = new DownloadPosition(batch[i].getModified(), batch[i].getId());
   ```
   Because the type is explicit on the right hand side.
   
   But here:
   ```
   FindIterable mongoIterable = dbCollection
   .find(findQuery)
   .sort(sortOrder);
   ```
   I think it's nicer to have the type annotation because the right hand side 
does not contain the type.
   
   But it's just my gut feeling of what seems more clear and easy to read, I'm 
not following any set of best-practices. It would be interesting to have a 
discussion about the use of var.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10778 - Support downloading from Mongo in parallel. [jackrabbit-oak]

2024-05-08 Thread via GitHub


nfsantos commented on code in PR #1435:
URL: https://github.com/apache/jackrabbit-oak/pull/1435#discussion_r1593718807


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/MongoRegexPathFilterFactory.java:
##
@@ -170,9 +170,9 @@ static List 
mergeIndexAndCustomExcludePaths(List indexExcludedPa
 return indexExcludedPaths;
 }
 
-var excludedUnion = new HashSet<>(indexExcludedPaths);

Review Comment:
   No specific reason, I'm still unsure about when to use var vs explicit type. 
In many cases keeping the type annotation makes the code more clear and does 
not significantly increase verbosity, for instance:
   ```
   var i = 1;
   
   int i = 1;
   ```
   The second option makes it clear that it is an int, while the first is 
ambiguous.
   
   In other cases, even if the type annotation is longer, I still somewhat 
prefer that it appears at least one time. For instance, here I feel it's better 
to use var:
   ```
   var bi = new DownloadPosition(batch[i].getModified(), batch[i].getId());
   ```
   Because the type is explicit on the right hand side.
   
   But here:
   ```
   FindIterable mongoIterable = dbCollection
   .find(findQuery)
   .sort(sortOrder);
   ```
   I think it's nicer to have the type annotation because the right hand side 
does not have contain the type.
   
   But it's just my gut feeling of what seems more clear and easy to read, I'm 
not following any set of best-practices. It would be interesting to have a 
discussion about the use of var.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (JCRVLT-751) ExportOptions.rootPath not properly converted to platform name format

2024-05-08 Thread Konrad Windszus (Jira)


 [ 
https://issues.apache.org/jira/browse/JCRVLT-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Windszus updated JCRVLT-751:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Fixed in 
https://github.com/apache/jackrabbit-filevault/commit/33b23e126e7e7260582dd43d4b9f4c93958da674.

> ExportOptions.rootPath not properly converted to platform name format
> -
>
> Key: JCRVLT-751
> URL: https://issues.apache.org/jira/browse/JCRVLT-751
> Project: Jackrabbit FileVault
>  Issue Type: Bug
>  Components: vlt
>Affects Versions: 3.7.2
>Reporter: Timothée Maret
>Assignee: Konrad Windszus
>Priority: Major
>  Labels: vault
> Fix For: 3.7.4
>
>
> AEM has a user synchronisation capability in the publish tier. The 
> synchronisation mechanism relies on FileVault to export and import content.
> Users stored in the repository under a path that starts with a _ and that 
> contain another _ can be exported but fail to be re-imported. For instance, 
> the user stored under the path /home/users/test/_6k_test-user-a won't be 
> imported.
> Debugging this issue, it seems that FileVault treats the _6k_ pattern as a 
> namespace and thus skip the resource upon import because the paths don't 
> match.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] OAK-10784: o.a.j.o.plugins.migration.version.VersionableEditor should… [jackrabbit-oak]

2024-05-08 Thread via GitHub


mbaedke commented on code in PR #1438:
URL: https://github.com/apache/jackrabbit-oak/pull/1438#discussion_r1593709973


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/migration/version/VersionableEditor.java:
##
@@ -79,7 +80,7 @@ public class VersionableEditor extends DefaultEditor {
 
 private VersionableEditor(Provider provider, NodeBuilder rootBuilder) {
 this.rootBuilder = rootBuilder;
-this.versionStorage = getVersionStorage(rootBuilder);
+this.versionStorage = createVersionStorage(rootBuilder);

Review Comment:
   Depends on the RepositoryInitializers used. 
org.apache.jackrabbit.oak.InitialContent does create it, 
com.adobe.granite.repository.impl.GraniteContent does not. After all these 
years using it, I'd guess that's fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] JCRVLT-751 Bugfix/fix filename escaping in user export [jackrabbit-filevault]

2024-05-08 Thread via GitHub


kwin merged PR #336:
URL: https://github.com/apache/jackrabbit-filevault/pull/336


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [VOTE] Release Apache Jackrabbit 2.20.16

2024-05-08 Thread Manfred Baedke
[X] +1 Release this package as Apache Jackrabbit 2.20.16

...

[INFO] Apache Maven 3.6.3
[INFO] OS name: "linux", version: "6.5.0-1020-oem", arch: "amd64", family:
"unix"
[INFO] Java version: 11.0.22, vendor: Ubuntu, runtime:
/usr/lib/jvm/java-11-openjdk-amd64
[INFO] MAVEN_OPTS:
[INFO]

[INFO] ALL CHECKS OK

[INFO]


Am Di., 7. Mai 2024 um 06:01 Uhr schrieb Julian Reschke <
julian.resc...@gmx.de>:

> A candidate for the Jackrabbit 2.20.16 release is available at:
>
>  https://dist.apache.org/repos/dist/dev/jackrabbit/2.20.16/
>
> The release candidate is a zip archive of the sources in:
>
>  https://github.com/apache/jackrabbit/tree/jackrabbit-2.20.16/
>
> The SHA1 checksum of the archive is
> 68b2311cad6f69023320b64c350783c6e5b6e7fe.
>
> A staged Maven repository is available for review at:
>
>  https://repository.apache.org/
>
> The command for running automated checks against this release candidate is:
>
>  # run in SVN checkout of
> https://dist.apache.org/repos/dist/dev/jackrabbit/
>  $ sh check-release.sh jackrabbit 2.20.16
> 68b2311cad6f69023320b64c350783c6e5b6e7fe
>
> Please vote on releasing this package as Apache Jackrabbit 2.20.16.
> The vote is open for the next 72 hours and passes if a majority of at
> least three +1 Jackrabbit PMC votes are cast.
>
>  [ ] +1 Release this package as Apache Jackrabbit 2.20.16
>  [ ] -1 Do not release this package because...
>
> Best regards, Julian
>


Re: [PR] OAK-10778 - Support downloading from Mongo in parallel. [jackrabbit-oak]

2024-05-08 Thread via GitHub


fabriziofortino commented on code in PR #1435:
URL: https://github.com/apache/jackrabbit-oak/pull/1435#discussion_r1593561301


##
oak-run-commons/src/test/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/MongoTestBackend.java:
##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined;
+

Review Comment:
   nitpick: extra line
   ```suggestion
   ```



##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/MongoParallelDownloadCoordinator.java:
##
@@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined;
+
+import org.apache.jackrabbit.oak.plugins.document.NodeDocument;
+import org.jetbrains.annotations.NotNull;
+
+/**
+ * Coordinates the two parallel download streams used to download from Mongo 
when parallelDump is enabled. One stream
+ * downloads in ascending order the other in descending order. This class 
keeps track of the top limit of the ascending
+ * stream and of the bottom limit of the descending stream, and determines if 
the streams have crossed. This indidcates

Review Comment:
   typo
   ```suggestion
* stream and of the bottom limit of the descending stream, and determines 
if the streams have crossed. This indicates
   ```



##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/MongoRegexPathFilterFactory.java:
##
@@ -170,9 +170,9 @@ static List 
mergeIndexAndCustomExcludePaths(List indexExcludedPa
 return indexExcludedPaths;
 }
 
-var excludedUnion = new HashSet<>(indexExcludedPaths);

Review Comment:
   out of curiosity: I have seen you have replaced vars in multiple places. Any 
specific reason?



##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedMongoServerSelector.java:
##
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined;
+
+import com.mongodb.ServerAddress;
+import com.mongodb.connection.ClusterDescription;
+import com.mongodb.connection.ClusterType;
+import com.mongodb.connection.ServerDescription;
+import com.mongodb.event.ClusterClosedEvent;
+import com.mongodb.event.ClusterDescriptionChangedEvent;
+import com.mongodb.event.ClusterListener;
+import com.mongodb.event.ClusterOpeningEvent;
+import com.mongodb.selector.ServerSelector;
+import org.slf4j.Logger;

Re: [PR] OAK-10771 - Add disk cache size stats and issue warning if evicted segment has zero length [jackrabbit-oak]

2024-05-08 Thread via GitHub


smiroslav commented on code in PR #1427:
URL: https://github.com/apache/jackrabbit-oak/pull/1427#discussion_r1593534438


##
oak-segment-remote/src/main/java/org/apache/jackrabbit/oak/segment/remote/persistentcache/PersistentDiskCache.java:
##
@@ -204,7 +204,17 @@ private void cleanUpInternal() {
 }
 if (cacheSize.get() > maxCacheSizeBytes * 0.66) {
 File segment = segmentCacheEntry.getPath().toFile();
-cacheSize.addAndGet(-segment.length());
+long length = segment.length();
+if (length == 0) {
+if (logger.isDebugEnabled()) {

Review Comment:
   Why both debug and warn are needed. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10771 - Add disk cache size stats and issue warning if evicted segment has zero length [jackrabbit-oak]

2024-05-08 Thread via GitHub


smiroslav commented on code in PR #1427:
URL: https://github.com/apache/jackrabbit-oak/pull/1427#discussion_r1593533371


##
oak-segment-remote/src/main/java/org/apache/jackrabbit/oak/segment/remote/persistentcache/PersistentDiskCache.java:
##
@@ -204,7 +204,17 @@ private void cleanUpInternal() {
 }
 if (cacheSize.get() > maxCacheSizeBytes * 0.66) {
 File segment = segmentCacheEntry.getPath().toFile();
-cacheSize.addAndGet(-segment.length());
+long length = segment.length();
+if (length == 0) {
+if (logger.isDebugEnabled()) {
+logger.debug("Avoiding removal of zero-sized 
file {}", segmentCacheEntry.getPath());
+} else {
+logger.warn("Avoiding removal of zero-sized 
file.");
+}
+return;
+}
+long cacheSizeAfter = cacheSize.addAndGet(-length);
+diskCacheIOMonitor.updateCacheSize(cacheSizeAfter, 
directory.length());

Review Comment:
   https://docs.oracle.com/javase/8/docs/api/java/io/File.html#length--
   
   API docs says that File.length returns unspecified value for directories. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Rename DetailedGC to FullGC [jackrabbit-oak]

2024-05-08 Thread via GitHub


daniancu opened a new pull request, #1440:
URL: https://github.com/apache/jackrabbit-oak/pull/1440

   Switching to FullGC instead of DetailedGC everywhere, method names, 
constants, arguments etc


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] OAK-10784: o.a.j.o.plugins.migration.version.VersionableEditor should… [jackrabbit-oak]

2024-05-08 Thread via GitHub


reschke commented on code in PR #1438:
URL: https://github.com/apache/jackrabbit-oak/pull/1438#discussion_r1593484456


##
oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/migration/version/VersionableEditor.java:
##
@@ -79,7 +80,7 @@ public class VersionableEditor extends DefaultEditor {
 
 private VersionableEditor(Provider provider, NodeBuilder rootBuilder) {
 this.rootBuilder = rootBuilder;
-this.versionStorage = getVersionStorage(rootBuilder);
+this.versionStorage = createVersionStorage(rootBuilder);

Review Comment:
   Hm, on secomd thought... I'd like to understand under what circumstances 
there can be an Oak repository without version storage. Isn't it always created 
by default?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org