Aye, and then you can use edit distance on single words (fuzzy query) to cope with fast typers.

On May 22, 2014 8:22 PM, "Robert Muir" <robert.m...@elasticsearch.com> wrote:
> On Wed, May 21, 2014 at 6:01 PM, Erik Rose <grinche...@gmail.com> wrote:
> > I'm trying to move Mozilla's source code search engine (dxr.mozilla.org)
> > from a custom-written SQLite trigram index to ES. In the current production
> > incarnation, we support fast regex (and, by extension, wildcard) searches by
> > extracting trigrams from the search pattern and paring down the documents to
> > those containing said trigrams.
>
> This is definitely a great approach for a database, but it won't work
> exactly the same way for an inverted index because the data structure
> is totally different.
>
> In the inverted index, queries like wildcards are slow: they must
> iterate and match all terms in the document collection, then intersect
> those postings with the rest of your query. So because it's inverted,
> it works backwards from what you expect, and that's why adding
> additional intersections like 'AND' doesn't speed anything up: they
> haven't happened yet.
>
> N-grams can speed up partial matching in general, but the methods to
> accomplish this are different: usually the best way to go about it is
> to try to think about analyzing the data in such a way that the
> queries to accomplish what you need are as basic as possible.
>
> The first question is if you really need partial matching at all: I
> don't have much knowledge about your use case, but just going from
> your example, I would look at wildcards like "*Children*Next*" and ask
> if instead I'd want to ensure my analyzer split on case changes, and
> try to see if I could get what I need with a sloppy phrase query.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAMUKNZUS40rsAjmzrL_YK6yjgjZRumeQKFVPhVu9bUcW4nN_KA%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.
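
To make the suggestion in the quoted message concrete (split on case changes at analysis time, then use a sloppy phrase query instead of a wildcard like "*Children*Next*"), here is a rough pure-Python sketch. It does not call Elasticsearch; in ES the equivalent would presumably be an analyzer that splits on case changes plus a phrase query with some slop. The function names `split_on_case` and `sloppy_phrase_match` are made up for illustration, and "slop" here is simplified to "total number of extra tokens allowed between the query terms, in order", which is looser than Lucene's actual definition.

```python
import re


def split_on_case(text):
    """Split identifiers on case changes and return lowercased tokens.

    Hypothetical stand-in for an analyzer that splits on case changes.
    """
    # Acronym runs (e.g. "HTML" in "HTMLParser"), then ordinary words, then digits.
    pattern = r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+"
    return [tok.lower() for tok in re.findall(pattern, text)]


def sloppy_phrase_match(doc_tokens, query_tokens, slop):
    """Return True if the query tokens occur in order in doc_tokens with at
    most `slop` extra tokens between them in total.

    Simplified model of a sloppy phrase query (assumption: in-order only).
    """
    if not query_tokens:
        return True
    first = query_tokens[0]
    for start, tok in enumerate(doc_tokens):
        if tok != first:
            continue
        pos, gaps, found_all = start, 0, True
        for term in query_tokens[1:]:
            try:
                # The earliest next occurrence keeps the match as tight as possible.
                nxt = doc_tokens.index(term, pos + 1)
            except ValueError:
                found_all = False
                break
            gaps += nxt - pos - 1
            pos = nxt
        if found_all and gaps <= slop:
            return True
    return False


tokens = split_on_case("getChildrenOfNextNode")
print(tokens)
print(sloppy_phrase_match(tokens, ["children", "next"], slop=1))
```

With the identifier tokenized this way, the query becomes a lookup of two ordinary terms plus a position check, so nothing has to enumerate every term in the index the way a leading-wildcard query does.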