[GitHub] [jackrabbit-oak] Ewocker commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


Ewocker commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r867283602


##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -270,22 +272,22 @@ public InputStream get() {
 // not being present. This is equivalent to disabling
 // selected media types in configuration, so we can simply
 // ignore these errors.
-log.debug("Failed to extract text from a binary property: {}."
+String format = "Failed to extract text from a binary property: 
{}."
 + " This often happens when some media types are 
disabled by configuration."
-+ " The stack trace is included to flag some 
'unintended' failures",
-path, e);
++ " The stack trace is included to flag some 
'unintended' failures";
+log.warn(format, linkageErrorFound ? path : new Object[]{path, e});
 parserErrorCount.incrementAndGet();
 return ERROR_TEXT;
 } catch (Throwable t) {
 // Capture and report any other full text extraction problems.
 // The special STOP exception is used for normal termination.
 if (!handler.isWriteLimitReached(t)) {
 parserErrorCount.incrementAndGet();
-parserError.debug("Failed to extract text from a binary 
property: "
-+ path
+String format = "Failed to extract text from a binary 
property: {}"
 + " This is a fairly common case, and nothing to"
 + " worry about. The stack trace is included to"
-+ " help improve the text extraction feature.", t);
++ " help improve the text extraction feature.";
+parserError.warn(format, throwableErrorFound ? path : new 
Object[]{path, t});

Review Comment:
   Do we still want to update it to info?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] Ewocker commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


Ewocker commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r867283424


##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -270,22 +272,22 @@ public InputStream get() {
 // not being present. This is equivalent to disabling
 // selected media types in configuration, so we can simply
 // ignore these errors.
-log.debug("Failed to extract text from a binary property: {}."
+String format = "Failed to extract text from a binary property: 
{}."
 + " This often happens when some media types are 
disabled by configuration."
-+ " The stack trace is included to flag some 
'unintended' failures",
-path, e);
++ " The stack trace is included to flag some 
'unintended' failures";
+log.warn(format, linkageErrorFound ? path : new Object[]{path, e});
 parserErrorCount.incrementAndGet();
 return ERROR_TEXT;
 } catch (Throwable t) {
 // Capture and report any other full text extraction problems.
 // The special STOP exception is used for normal termination.
 if (!handler.isWriteLimitReached(t)) {
 parserErrorCount.incrementAndGet();
-parserError.debug("Failed to extract text from a binary 
property: "
-+ path
+String format = "Failed to extract text from a binary 
property: {}"
 + " This is a fairly common case, and nothing to"
 + " worry about. The stack trace is included to"
-+ " help improve the text extraction feature.", t);
++ " help improve the text extraction feature.";
+parserError.warn(format, throwableErrorFound ? path : new 
Object[]{path, t});

Review Comment:
   Correct, I believe the error is for any sort of failure here... so not 
necessary just jar missing.
   And no, we do not have any test case for this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] Ewocker commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


Ewocker commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r867235841


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/run/cli/CommonOptions.java:
##
@@ -36,13 +36,15 @@ public class CommonOptions implements OptionsBean {
 private final OptionSpec nonOption;
 private final OptionSpec metrics;
 private final OptionSpec segment;
+private final OptionSpec continueMissingDep;
 private OptionSet options;
 
 public CommonOptions(OptionParser parser){
 help = parser.acceptsAll(asList("h", "?", "help"), "Show 
help").forHelp();
 readWriteOption = parser.accepts("read-write", "Connect to repository 
in read-write mode");
 metrics = parser.accepts("metrics", "Enables metrics based statistics 
collection");
 segment = parser.accepts("segment", "Use older oak-segment support");
+continueMissingDep = parser.accepts("continue-missing-tika-dep", 
"Continue to run when there are missing dependency");

Review Comment:
   changed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] Ewocker commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


Ewocker commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r867233951


##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -270,22 +272,22 @@ public InputStream get() {
 // not being present. This is equivalent to disabling
 // selected media types in configuration, so we can simply
 // ignore these errors.
-log.debug("Failed to extract text from a binary property: {}."
+String format = "Failed to extract text from a binary property: 
{}."
 + " This often happens when some media types are 
disabled by configuration."
-+ " The stack trace is included to flag some 
'unintended' failures",
-path, e);
++ " The stack trace is included to flag some 
'unintended' failures";
+log.warn(format, linkageErrorFound ? path : new Object[]{path, e});
 parserErrorCount.incrementAndGet();
 return ERROR_TEXT;
 } catch (Throwable t) {
 // Capture and report any other full text extraction problems.
 // The special STOP exception is used for normal termination.
 if (!handler.isWriteLimitReached(t)) {
 parserErrorCount.incrementAndGet();
-parserError.debug("Failed to extract text from a binary 
property: "
-+ path
+String format = "Failed to extract text from a binary 
property: {}"
 + " This is a fairly common case, and nothing to"
 + " worry about. The stack trace is included to"
-+ " help improve the text extraction feature.", t);
++ " help improve the text extraction feature.";
+parserError.warn(format, throwableErrorFound ? path : new 
Object[]{path, t});
 return ERROR_TEXT;

Review Comment:
   done



##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -270,22 +272,22 @@ public InputStream get() {
 // not being present. This is equivalent to disabling
 // selected media types in configuration, so we can simply
 // ignore these errors.
-log.debug("Failed to extract text from a binary property: {}."
+String format = "Failed to extract text from a binary property: 
{}."
 + " This often happens when some media types are 
disabled by configuration."
-+ " The stack trace is included to flag some 
'unintended' failures",
-path, e);
++ " The stack trace is included to flag some 
'unintended' failures";
+log.warn(format, linkageErrorFound ? path : new Object[]{path, e});
 parserErrorCount.incrementAndGet();

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] Ewocker commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


Ewocker commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r867233642


##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -79,6 +79,8 @@ class TextExtractor implements Closeable {
 private boolean initialized;
 private BinaryStats stats;
 private boolean closed;
+private boolean linkageErrorFound = false;

Review Comment:
   done



##
oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/binary/FulltextBinaryTextExtractor.java:
##
@@ -74,6 +74,8 @@ public class FulltextBinaryTextExtractor {
   private final boolean reindex;
   private Parser parser;
   private TikaConfigHolder tikaConfig;
+  private boolean linkageErrorFound = false;

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] Ewocker commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


Ewocker commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r867232958


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/run/cli/CommonOptions.java:
##
@@ -55,6 +57,8 @@ public boolean isHelpRequested(){
 return options.has(help);
 }
 
+public boolean isContinueMissingDep() { return 
options.has(continueMissingDep); }

Review Comment:
   done



##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/run/cli/Options.java:
##
@@ -80,6 +80,15 @@ public OptionSet parseAndConfigure(OptionParser parser, 
String[] args, boolean c
 if (checkNonOptions) {
 checkNonOptions();
 }
+CommonOptions commonOpts = getCommonOpts();
+if (!commonOpts.isHelpRequested() && 
!commonOpts.isContinueMissingDep()) {

Review Comment:
   moved



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] klcodanr commented on pull request #535: OAK-9740 - Offset / Limit Option

2022-05-06 Thread GitBox


klcodanr commented on PR #535:
URL: https://github.com/apache/jackrabbit-oak/pull/535#issuecomment-1119822357

   @thomasmueller  I believe I've addressed your feedback, can you re:review? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] klcodanr commented on a diff in pull request #535: OAK-9740 - Offset / Limit Option

2022-05-06 Thread GitBox


klcodanr commented on code in PR #535:
URL: https://github.com/apache/jackrabbit-oak/pull/535#discussion_r867017876


##
oak-core/src/main/java/org/apache/jackrabbit/oak/query/QueryImpl.java:
##
@@ -1102,13 +1113,16 @@ private SelectorExecutionPlan 
getBestSelectorExecutionPlan(
 IndexPlan indexPlan = null;
 if (index instanceof AdvancedQueryIndex) {
 AdvancedQueryIndex advIndex = (AdvancedQueryIndex) index;
-long maxEntryCount = limit;
-if (offset > 0) {
-if (offset + limit < 0) {
+
+long localLimit = limit.orElse(Long.MAX_VALUE);
+long localOffset = offset.orElse(0L);
+long maxEntryCount = localLimit;
+if (localOffset > 0) {
+if (localOffset + localLimit < 0) {

Review Comment:
   It's a good point though, updated



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] FrancoisZhang commented on a diff in pull request #562: OAK-9760 Oak run index purge command active index check is in correct

2022-05-06 Thread GitBox


FrancoisZhang commented on code in PR #562:
URL: https://github.com/apache/jackrabbit-oak/pull/562#discussion_r867011034


##
oak-run/src/main/java/org/apache/jackrabbit/oak/indexversion/IndexVersionOperation.java:
##
@@ -86,42 +92,65 @@ public static List 
generateIndexVersionOperationList(Node
 }
 
 if (!reverseSortedIndexNameList.isEmpty()) {
-IndexName activeIndexNameObject = 
reverseSortedIndexNameList.remove(0);
-NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
-boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
-int activeProductVersion = 
activeIndexNameObject.getProductVersion();
-indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
+IndexName activeIndexNameObject = 
getActiveIndex(reverseSortedIndexNameList, parentPath, rootNode);
+if (activeIndexNameObject == null) {
+LOG.warn("Cannot find any active index from the list: {}", 
reverseSortedIndexNameList);
+} else {
+NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
+boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
+int activeProductVersion = 
activeIndexNameObject.getProductVersion();
+indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
 
 // for rest indexes except active index
-for (IndexName indexNameObject : reverseSortedIndexNameList) {
-String indexName = indexNameObject.getNodeName();
-NodeState indexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(indexName));
-IndexVersionOperation indexVersionOperation = new 
IndexVersionOperation(indexNameObject);
-// if active index not long enough, NOOP for all indexes
-if (isActiveIndexLongEnough) {
-if (indexNameObject.getProductVersion() == 
activeProductVersion && indexNameObject.getCustomerVersion() == 0) {
-
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
-} else {
-// the check hidden oak mount logic only works when 
passing through the proper composite store
-if (isHiddenOakMountExists(indexNode)) {
-LOG.info("Found hidden oak mount node for: '{}', 
disable it but no index definition deletion", indexName);
+for (IndexName indexNameObject : reverseSortedIndexNameList) {
+String indexName = indexNameObject.getNodeName();
+NodeState indexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(indexName));
+IndexVersionOperation indexVersionOperation = new 
IndexVersionOperation(indexNameObject);
+// if active index not long enough, NOOP for all indexes
+if (isActiveIndexLongEnough) {
+if (indexNameObject.getProductVersion() == 
activeProductVersion && indexNameObject.getCustomerVersion() == 0) {
 
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
-} else {
-
indexVersionOperation.setOperation(Operation.DELETE);
+} else if (indexNameObject.getProductVersion() <= 
activeProductVersion ) {
+// the check hidden oak mount logic only works 
when passing through the proper composite store
+if (isHiddenOakMountExists(indexNode)) {
+LOG.info("Found hidden oak mount node for: 
'{}', disable it but no index definition deletion", indexName);
+
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
+} else {
+
indexVersionOperation.setOperation(Operation.DELETE);
+}
 }
+// if the index product version is larger than active 
index, leave it as is
+// for instance: if there is active damAssetLucene-7, 
the inactive damAssetLucene-8 will be leave there as is

Review Comment:
   @thomasmueller just push a commit to reflect this, can you check it again? 
thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure 

[GitHub] [jackrabbit-oak] FrancoisZhang commented on a diff in pull request #562: OAK-9760 Oak run index purge command active index check is in correct

2022-05-06 Thread GitBox


FrancoisZhang commented on code in PR #562:
URL: https://github.com/apache/jackrabbit-oak/pull/562#discussion_r867005282


##
oak-run/src/main/java/org/apache/jackrabbit/oak/indexversion/IndexVersionOperation.java:
##
@@ -86,42 +92,65 @@ public static List 
generateIndexVersionOperationList(Node
 }
 
 if (!reverseSortedIndexNameList.isEmpty()) {
-IndexName activeIndexNameObject = 
reverseSortedIndexNameList.remove(0);
-NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
-boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
-int activeProductVersion = 
activeIndexNameObject.getProductVersion();
-indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
+IndexName activeIndexNameObject = 
getActiveIndex(reverseSortedIndexNameList, parentPath, rootNode);
+if (activeIndexNameObject == null) {
+LOG.warn("Cannot find any active index from the list: {}", 
reverseSortedIndexNameList);
+} else {
+NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
+boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
+int activeProductVersion = 
activeIndexNameObject.getProductVersion();
+indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
 
 // for rest indexes except active index
-for (IndexName indexNameObject : reverseSortedIndexNameList) {
-String indexName = indexNameObject.getNodeName();
-NodeState indexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(indexName));
-IndexVersionOperation indexVersionOperation = new 
IndexVersionOperation(indexNameObject);
-// if active index not long enough, NOOP for all indexes
-if (isActiveIndexLongEnough) {
-if (indexNameObject.getProductVersion() == 
activeProductVersion && indexNameObject.getCustomerVersion() == 0) {
-
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
-} else {
-// the check hidden oak mount logic only works when 
passing through the proper composite store
-if (isHiddenOakMountExists(indexNode)) {
-LOG.info("Found hidden oak mount node for: '{}', 
disable it but no index definition deletion", indexName);
+for (IndexName indexNameObject : reverseSortedIndexNameList) {
+String indexName = indexNameObject.getNodeName();
+NodeState indexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(indexName));
+IndexVersionOperation indexVersionOperation = new 
IndexVersionOperation(indexNameObject);
+// if active index not long enough, NOOP for all indexes
+if (isActiveIndexLongEnough) {
+if (indexNameObject.getProductVersion() == 
activeProductVersion && indexNameObject.getCustomerVersion() == 0) {
 
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
-} else {
-
indexVersionOperation.setOperation(Operation.DELETE);
+} else if (indexNameObject.getProductVersion() <= 
activeProductVersion ) {
+// the check hidden oak mount logic only works 
when passing through the proper composite store
+if (isHiddenOakMountExists(indexNode)) {
+LOG.info("Found hidden oak mount node for: 
'{}', disable it but no index definition deletion", indexName);
+
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
+} else {
+
indexVersionOperation.setOperation(Operation.DELETE);
+}
 }
+// if the index product version is larger than active 
index, leave it as is
+// for instance: if there is active damAssetLucene-7, 
the inactive damAssetLucene-8 will be leave there as is

Review Comment:
   @thomasmueller makes sense, let me refine this a bit to have a log here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:

[GitHub] [jackrabbit-oak] FrancoisZhang commented on a diff in pull request #562: OAK-9760 Oak run index purge command active index check is in correct

2022-05-06 Thread GitBox


FrancoisZhang commented on code in PR #562:
URL: https://github.com/apache/jackrabbit-oak/pull/562#discussion_r867005027


##
oak-run/src/main/java/org/apache/jackrabbit/oak/indexversion/IndexVersionOperation.java:
##
@@ -86,42 +92,65 @@ public static List 
generateIndexVersionOperationList(Node
 }
 
 if (!reverseSortedIndexNameList.isEmpty()) {
-IndexName activeIndexNameObject = 
reverseSortedIndexNameList.remove(0);
-NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
-boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
-int activeProductVersion = 
activeIndexNameObject.getProductVersion();
-indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
+IndexName activeIndexNameObject = 
getActiveIndex(reverseSortedIndexNameList, parentPath, rootNode);
+if (activeIndexNameObject == null) {
+LOG.warn("Cannot find any active index from the list: {}", 
reverseSortedIndexNameList);

Review Comment:
   @thomasmueller good points, yeah makes sense just to leave a warning message 
as is for now. we can check further later based on the log analysis.
   For now, it will be too risky to simply remove all indexes when no active 
index.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] mreutegg commented on pull request #560: OAK-9757 : increased node name limit for mongo 4.2 version

2022-05-06 Thread GitBox


mreutegg commented on PR #560:
URL: https://github.com/apache/jackrabbit-oak/pull/560#issuecomment-1119649190

   Also note that checks fails because two new files for classes `MongoVersion` 
and `MongoVersionTest` do not have a license header.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] mreutegg commented on pull request #560: OAK-9757 : increased node name limit for mongo 4.2 version

2022-05-06 Thread GitBox


mreutegg commented on PR #560:
URL: https://github.com/apache/jackrabbit-oak/pull/560#issuecomment-1119646255

   I suggest we take a slightly different approach. To me it feels wrong that 
`Utils.isLongPath()` is currently responsible for detecting when a node name is 
too long. I think this is the reason why so many other code changes are 
necessary. E.g. the method is then used in `Utils.getIdFromPath()`, which in 
turn is used in many places. But in many cases the check for the node name 
length is not actually needed. I my view it would be better if the node name 
length check is moved out of `Utils.isLongPath()` to a place were it is really 
needed. That is, when a new node is added through the `DocumentNodeStore` or 
created with an `UpdateOp` on the `DocumentStore`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] thomasmueller commented on a diff in pull request #562: OAK-9760 Oak run index purge command active index check is in correct

2022-05-06 Thread GitBox


thomasmueller commented on code in PR #562:
URL: https://github.com/apache/jackrabbit-oak/pull/562#discussion_r866548521


##
oak-run/src/main/java/org/apache/jackrabbit/oak/indexversion/IndexVersionOperation.java:
##
@@ -86,42 +92,65 @@ public static List 
generateIndexVersionOperationList(Node
 }
 
 if (!reverseSortedIndexNameList.isEmpty()) {
-IndexName activeIndexNameObject = 
reverseSortedIndexNameList.remove(0);
-NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
-boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
-int activeProductVersion = 
activeIndexNameObject.getProductVersion();
-indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
+IndexName activeIndexNameObject = 
getActiveIndex(reverseSortedIndexNameList, parentPath, rootNode);
+if (activeIndexNameObject == null) {
+LOG.warn("Cannot find any active index from the list: {}", 
reverseSortedIndexNameList);
+} else {
+NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
+boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
+int activeProductVersion = 
activeIndexNameObject.getProductVersion();
+indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
 
 // for rest indexes except active index

Review Comment:
   I would make this comment a bit clearer:
   
   ```
   // the reverseSortedIndexNameList will now contain the remaining indexes;
   // the active index was removed from that list
   ```



##
oak-run/src/main/java/org/apache/jackrabbit/oak/indexversion/IndexVersionOperation.java:
##
@@ -86,42 +92,65 @@ public static List 
generateIndexVersionOperationList(Node
 }
 
 if (!reverseSortedIndexNameList.isEmpty()) {
-IndexName activeIndexNameObject = 
reverseSortedIndexNameList.remove(0);
-NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
-boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
-int activeProductVersion = 
activeIndexNameObject.getProductVersion();
-indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
+IndexName activeIndexNameObject = 
getActiveIndex(reverseSortedIndexNameList, parentPath, rootNode);
+if (activeIndexNameObject == null) {
+LOG.warn("Cannot find any active index from the list: {}", 
reverseSortedIndexNameList);
+} else {
+NodeState activeIndexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(activeIndexNameObject.getNodeName()));
+boolean isActiveIndexLongEnough = 
isIndexPurgeReady(activeIndexNameObject, activeIndexNode, purgeThresholdMillis);
+int activeProductVersion = 
activeIndexNameObject.getProductVersion();
+indexVersionOperationList.add(new 
IndexVersionOperation(activeIndexNameObject));
 
 // for rest indexes except active index
-for (IndexName indexNameObject : reverseSortedIndexNameList) {
-String indexName = indexNameObject.getNodeName();
-NodeState indexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(indexName));
-IndexVersionOperation indexVersionOperation = new 
IndexVersionOperation(indexNameObject);
-// if active index not long enough, NOOP for all indexes
-if (isActiveIndexLongEnough) {
-if (indexNameObject.getProductVersion() == 
activeProductVersion && indexNameObject.getCustomerVersion() == 0) {
-
indexVersionOperation.setOperation(Operation.DELETE_HIDDEN_AND_DISABLE);
-} else {
-// the check hidden oak mount logic only works when 
passing through the proper composite store
-if (isHiddenOakMountExists(indexNode)) {
-LOG.info("Found hidden oak mount node for: '{}', 
disable it but no index definition deletion", indexName);
+for (IndexName indexNameObject : reverseSortedIndexNameList) {
+String indexName = indexNameObject.getNodeName();
+NodeState indexNode = 
indexDefParentNode.getChildNode(PathUtils.getName(indexName));
+IndexVersionOperation indexVersionOperation = new 
IndexVersionOperation(indexNameObject);
+// if active index not long enough, NOOP 

[GitHub] [jackrabbit-oak] nit0906 commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


nit0906 commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r866537233


##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/run/cli/CommonOptions.java:
##
@@ -36,13 +36,15 @@ public class CommonOptions implements OptionsBean {
 private final OptionSpec nonOption;
 private final OptionSpec metrics;
 private final OptionSpec segment;
+private final OptionSpec continueMissingDep;
 private OptionSet options;
 
 public CommonOptions(OptionParser parser){
 help = parser.acceptsAll(asList("h", "?", "help"), "Show 
help").forHelp();
 readWriteOption = parser.accepts("read-write", "Connect to repository 
in read-write mode");
 metrics = parser.accepts("metrics", "Enables metrics based statistics 
collection");
 segment = parser.accepts("segment", "Use older oak-segment support");
+continueMissingDep = parser.accepts("continue-missing-tika-dep", 
"Continue to run when there are missing dependency");

Review Comment:
   maybe `ignore-missing-tika-dep` would be a better option ? 
   
   `ignoreMissingTikaDep = parser.accepts("ignore-missing-tika-dep", "Ignore 
when there are missing tika dependencies and continue to run");`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] thomasmueller commented on a diff in pull request #561: OAK-9758 error out if tika dependencies are missing and improve loggi…

2022-05-06 Thread GitBox


thomasmueller commented on code in PR #561:
URL: https://github.com/apache/jackrabbit-oak/pull/561#discussion_r866526453


##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -270,22 +272,22 @@ public InputStream get() {
 // not being present. This is equivalent to disabling
 // selected media types in configuration, so we can simply
 // ignore these errors.
-log.debug("Failed to extract text from a binary property: {}."
+String format = "Failed to extract text from a binary property: 
{}."
 + " This often happens when some media types are 
disabled by configuration."
-+ " The stack trace is included to flag some 
'unintended' failures",
-path, e);
++ " The stack trace is included to flag some 
'unintended' failures";
+log.warn(format, linkageErrorFound ? path : new Object[]{path, e});
 parserErrorCount.incrementAndGet();
 return ERROR_TEXT;
 } catch (Throwable t) {
 // Capture and report any other full text extraction problems.
 // The special STOP exception is used for normal termination.
 if (!handler.isWriteLimitReached(t)) {
 parserErrorCount.incrementAndGet();
-parserError.debug("Failed to extract text from a binary 
property: "
-+ path
+String format = "Failed to extract text from a binary 
property: {}"
 + " This is a fairly common case, and nothing to"
 + " worry about. The stack trace is included to"
-+ " help improve the text extraction feature.", t);
++ " help improve the text extraction feature.";
+parserError.warn(format, throwableErrorFound ? path : new 
Object[]{path, t});

Review Comment:
   According to the message, this doesn't mean the jar is missing, but that 
extraction failed (e.g. due to corruption in the file itself). If true (do we 
have a test case for this?) then we could log with "info" level here, instead 
of warning.



##
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/run/cli/CommonOptions.java:
##
@@ -55,6 +57,8 @@ public boolean isHelpRequested(){
 return options.has(help);
 }
 
+public boolean isContinueMissingDep() { return 
options.has(continueMissingDep); }

Review Comment:
   Please use newlines between { and }



##
oak-run/src/main/java/org/apache/jackrabbit/oak/plugins/tika/TextExtractor.java:
##
@@ -270,22 +272,22 @@ public InputStream get() {
 // not being present. This is equivalent to disabling
 // selected media types in configuration, so we can simply
 // ignore these errors.
-log.debug("Failed to extract text from a binary property: {}."
+String format = "Failed to extract text from a binary property: 
{}."
 + " This often happens when some media types are 
disabled by configuration."
-+ " The stack trace is included to flag some 
'unintended' failures",
-path, e);
++ " The stack trace is included to flag some 
'unintended' failures";
+log.warn(format, linkageErrorFound ? path : new Object[]{path, e});
 parserErrorCount.incrementAndGet();

Review Comment:
   Forgot to set linkageErrorFound = true



##
oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/binary/FulltextBinaryTextExtractor.java:
##
@@ -193,13 +196,14 @@ public Void call() throws Exception {
   // Capture and report any other full text extraction problems.
   // The special STOP exception is used for normal termination.
   if (!handler.isWriteLimitReached(t)) {
-log.debug(
-"[{}] Failed to extract text from a binary property: {}."
+String format = "[{}] Failed to extract text from a binary property: 
{}."
 + " This is a fairly common case, and nothing to"
 + " worry about. The stack trace is included to"
-+ " help improve the text extraction feature.",
-getIndexName(), path, t);
++ " help improve the text extraction feature.";

Review Comment:
   Same as above



##
oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/binary/FulltextBinaryTextExtractor.java:
##
@@ -74,6 +74,8 @@ public class FulltextBinaryTextExtractor {
   private final boolean reindex;
   private Parser parser;
   private TikaConfigHolder tikaConfig;
+  private boolean linkageErrorFound = false;

Review Comment:
   No need to set to false.



##

[GitHub] [jackrabbit-oak] rishabhdaim commented on pull request #560: OAK-9757 : increased node name limit for mongo 4.2 version

2022-05-06 Thread GitBox


rishabhdaim commented on PR #560:
URL: https://github.com/apache/jackrabbit-oak/pull/560#issuecomment-1119294326

   @reschke That's a good idea to move util function to DocumentStore and that 
would help us avoiding in calling metadata multiple times. 
   Let me make changes and raise the PR again, but I doubt that would restrict 
the code changes to less file since most of the changes are in tests cases and 
the changes are to pass the parameters to utils class.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org