thomasmueller commented on code in PR #2193:
URL: https://github.com/apache/jackrabbit-oak/pull/2193#discussion_r2021240395
##########
oak-search-elastic/src/main/java/org/apache/jackrabbit/oak/plugins/index/elastic/index/ElasticCustomAnalyzer.java:
##########
@@ -201,6 +257,21 @@ private static <FD> LinkedHashMap<String, FD> loadFilters(NodeState state,
 Map<String, Object> args = convertNodeState(child, transformers, content);
+ if (name.equals("word_delimiter")) {
+     // https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html
+     // We recommend using the word_delimiter_graph instead of the word_delimiter filter.
+     // The word_delimiter filter can produce invalid token graphs.
+     LOG.info("Replacing the word delimiter filter with the word delimiter graph");
+     name = "word_delimiter_graph";
+ }
+ if (name.equals("hyphenation_compound_word")) {
+     name = "hyphenation_decompounder";
+     String hypenator = args.getOrDefault("hyphenator", "").toString();
+     LOG.info("Using the hyphenation_decompounder: " + hypenator);
+     args.put("hyphenation_patterns_path", "analysis/hyphenation_patterns.xml");
Review Comment:
I wanted to use a fixed file name for the hyphenation patterns, so that
they can be configured by providing a file at that path. The file would have
to be installed manually, and we need to document that.
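
To illustrate the idea, here is a minimal, self-contained sketch of the
rewrite from the diff above. It is not the actual code in
ElasticCustomAnalyzer: the class name `FilterRewriteSketch`, the method
`rewrite`, and the constant `HYPHENATION_PATTERNS_PATH` are hypothetical names
used only for this example. As far as I know, Elasticsearch resolves a
relative `hyphenation_patterns_path` against its config directory, which is
where the manually installed file would have to go.

```java
import java.util.HashMap;
import java.util.Map;

public class FilterRewriteSketch {

    // Fixed path taken from the diff; the file itself must be installed manually.
    static final String HYPHENATION_PATTERNS_PATH = "analysis/hyphenation_patterns.xml";

    /**
     * Hypothetical helper: maps a configured filter name to the name sent to
     * Elasticsearch, and adds arguments required by the replacement filter.
     */
    static String rewrite(String name, Map<String, Object> args) {
        if (name.equals("word_delimiter")) {
            // The graph variant avoids the invalid token graphs of word_delimiter.
            return "word_delimiter_graph";
        }
        if (name.equals("hyphenation_compound_word")) {
            // Always point at the same fixed location, regardless of other args.
            args.put("hyphenation_patterns_path", HYPHENATION_PATTERNS_PATH);
            return "hyphenation_decompounder";
        }
        // All other filter names pass through unchanged.
        return name;
    }

    public static void main(String[] argv) {
        Map<String, Object> args = new HashMap<>();
        String name = rewrite("hyphenation_compound_word", args);
        System.out.println(name + " " + args);
    }
}
```

Because the path is fixed, the only step left to an operator is placing a
patterns file at that location on each Elasticsearch node.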