[ https://issues.apache.org/jira/browse/LUCENE-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826115#comment-15826115 ]
Dawid Weiss commented on LUCENE-7639:
-------------------------------------

bq. [...] and it requires much bigger index

Why is this the case? The inverted FST should be smaller than the suffix array, and intersecting both should be a viable option to get {{*abc*}} wildcard matches?

> Use Suffix Arrays for fast search with leading asterisks
> --------------------------------------------------------
>
>                 Key: LUCENE-7639
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7639
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Yakov Sirotkin
>         Attachments: suffix-array.patch
>
>
> If a query term starts with an asterisk, the FST checks every word in the
> dictionary, so request processing slows down. This problem can be solved
> with a suffix array. Luckily, the suffix array can be built from the
> existing index after Lucene starts. Unfortunately, suffix arrays require a
> lot of RAM, so they are used only when a special flag is set:
> -Dsolr.suffixArray.enable=true
> Suffix array initialization can be sped up by using several threads; the
> number of threads is controlled with
> -Dsolr.suffixArray.initialization_treads_count=5
> This system property can be omitted; the default value is 5.
> The attached patch is the suggested implementation of suffix array support.
> It works for all terms that start with an asterisk followed by at least 3
> consecutive non-wildcard characters. The patch does not change search
> results; it affects only performance.
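
For readers following the thread, here is a minimal Java sketch of the suffix array idea described in the issue: sort every suffix of every term, then binary-search for the suffixes that start with the infix of a {{*abc*}} query. This is not the code in suffix-array.patch and it uses no Lucene APIs. The class name SuffixArraySketch and the method termsContaining are hypothetical, chosen only for illustration, and the sketch ignores the memory concerns raised above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Illustrative only: a term-level suffix array over a small in-memory dictionary. */
final class SuffixArraySketch {
    private final String[] terms;
    // Each entry is {termIndex, offset}, identifying the suffix terms[termIndex].substring(offset).
    private final int[][] suffixes;

    SuffixArraySketch(String[] terms) {
        this.terms = terms.clone();
        List<int[]> all = new ArrayList<>();
        for (int t = 0; t < terms.length; t++) {
            for (int off = 0; off < terms[t].length(); off++) {
                all.add(new int[] {t, off});
            }
        }
        // Sort suffixes lexicographically (String.compareTo order).
        all.sort((a, b) -> suffix(a).compareTo(suffix(b)));
        this.suffixes = all.toArray(new int[0][]);
    }

    private String suffix(int[] entry) {
        return terms[entry[0]].substring(entry[1]);
    }

    /** Returns the terms that contain the given infix, i.e. the matches for *infix*. */
    SortedSet<String> termsContaining(String infix) {
        // Binary search for the first suffix that is >= infix.
        int lo = 0;
        int hi = suffixes.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (suffix(suffixes[mid]).compareTo(infix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Suffixes that start with the infix form one contiguous run from lo.
        SortedSet<String> result = new TreeSet<>();
        for (int i = lo; i < suffixes.length && suffix(suffixes[i]).startsWith(infix); i++) {
            result.add(terms[suffixes[i][0]]);
        }
        return result;
    }

    public static void main(String[] args) {
        SuffixArraySketch sa =
            new SuffixArraySketch(new String[] {"lucene", "solr", "uncle", "cent"});
        System.out.println(sa.termsContaining("ce"));
    }
}
```

The lookup costs a binary search plus the size of the matching run, rather than a scan of the whole dictionary. The price is one entry per character of every term, which is the RAM cost the issue description mentions.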