[
https://issues.apache.org/jira/browse/LUCENE-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826166#comment-15826166
]
Dawid Weiss commented on LUCENE-7639:
-------------------------------------
Sure, I didn't mean to somehow diminish suffix arrays -- they're super nice!
I'd still be curious whether the thing can be done on a finite state automaton
alone.
> Use Suffix Arrays for fast search with leading asterisks
> --------------------------------------------------------
>
> Key: LUCENE-7639
> URL: https://issues.apache.org/jira/browse/LUCENE-7639
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Yakov Sirotkin
> Attachments: suffix-array.patch
>
>
> If a query term starts with an asterisk, the FST checks every word in the
> dictionary, so request processing slows down. This problem can be solved with
> a suffix array. Luckily, the suffix array can be built from the existing
> index after Lucene starts. Unfortunately, suffix arrays require a lot of RAM,
> so they are used only when a special flag is set:
> -Dsolr.suffixArray.enable=true
> Suffix array initialization can be sped up by using several threads; the
> number of threads is controlled with
> -Dsolr.suffixArray.initialization_treads_count=5
> This system property can be omitted; the default value is 5.
> The attached patch is the suggested implementation of suffix array support.
> It works for all terms that start with an asterisk followed by at least 3
> consecutive non-wildcard characters. The patch does not change search
> results; it affects only performance.
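
For readers unfamiliar with the technique, here is a minimal sketch of how a
suffix array over a term dictionary can answer an infix lookup without
scanning every term. It is not taken from suffix-array.patch; the class name
SuffixArraySketch and the method termsContaining are hypothetical, and the
sketch materializes substrings during comparison, which a real implementation
would avoid for memory reasons.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not the code from the attached patch.
public class SuffixArraySketch {
    private final String[] terms;
    // Each entry packs (term index, suffix start offset) into one long.
    private final long[] suffixes;

    public SuffixArraySketch(String[] terms) {
        this.terms = terms.clone();
        List<Long> entries = new ArrayList<>();
        for (int t = 0; t < terms.length; t++) {
            for (int off = 0; off < terms[t].length(); off++) {
                entries.add(((long) t << 32) | off);
            }
        }
        // Sort all suffixes of all terms lexicographically.
        entries.sort((a, b) -> suffix(a).compareTo(suffix(b)));
        suffixes = new long[entries.size()];
        for (int i = 0; i < suffixes.length; i++) {
            suffixes[i] = entries.get(i);
        }
    }

    // Decode an entry back into the suffix text it refers to.
    private String suffix(long entry) {
        return terms[(int) (entry >>> 32)].substring((int) entry);
    }

    // Returns every term that contains the given infix, in sorted order.
    public TreeSet<String> termsContaining(String infix) {
        // Binary search for the first suffix that is >= infix.
        int lo = 0, hi = suffixes.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (suffix(suffixes[mid]).compareTo(infix) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Suffixes that start with the infix form one contiguous run.
        TreeSet<String> result = new TreeSet<>();
        for (int i = lo; i < suffixes.length
                && suffix(suffixes[i]).startsWith(infix); i++) {
            result.add(terms[(int) (suffixes[i] >>> 32)]);
        }
        return result;
    }
}
```

Under these assumptions, a query such as *ppl* maps directly to
termsContaining("ppl"), and a query such as *ple would take the same
candidate set and keep only the terms that end with "ple". The lookup costs a
binary search plus the size of the matching run, rather than a pass over the
whole dictionary.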