On Thu, May 22, 2014 at 4:31 PM, Erik Rose grinche...@gmail.com wrote:
Alright, try this on for size. :-)
Since the built-in regex-ish filters want to be all clever and
index-based, why not use the JS script plugin, which is happy to run as a
post-processing phase?
curl -s -XGET
Martijn took a swing at it just now. He eliminated any scoring-based
slowdown, like so (constant_score_filter)…
curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d '{
    query: {
        filtered: {
            query: {
                match_all: {}
On Wed, May 21, 2014 at 6:01 PM, Erik Rose grinche...@gmail.com wrote:
I'm trying to move Mozilla's source code search engine (dxr.mozilla.org)
from a custom-written SQLite trigram index to ES. In the current production
incarnation, we support fast regex (and, by extension, wildcard) searches
Leading wildcards are really expensive. Maybe you can try creating a copy
of your content field that reverses the tokens using reverse token filter
[1]. By doing this you turn those expensive leading wildcards into
trailing wildcards which should give you better performance. I think your
query
Leading wildcards are really expensive. Maybe you can try creating a copy
of your content field that reverses the tokens using reverse token filter
[1].
Good advice, typically, but notice I have wildcards on either side.
Reversing just makes the trailing wildcard expensive. :-)
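
To make the point above concrete, here is a rough Python sketch (not ES code; the term list and helper names are made up for illustration). It mimics a reversed copy of a field and shows that reversing helps a pattern with only a leading wildcard, but a pattern with wildcards on both sides still starts with one after reversal.

```python
from fnmatch import fnmatchcase

# Hypothetical sample terms; in ES these would be the indexed tokens.
TERMS = ["getElementById", "setElement", "elementCount"]

# Stand-in for a second field whose tokens were run through a reverse filter.
REVERSED_TERMS = [term[::-1] for term in TERMS]


def reverse_pattern(pattern: str) -> str:
    """Reverse a pattern that uses only '*' wildcards (no bracket classes)."""
    return pattern[::-1]


def has_leading_wildcard(pattern: str) -> bool:
    """A leading '*' is the expensive case: no prefix to seek to in the term list."""
    return pattern.startswith("*")


def match_via_reversed_field(pattern: str) -> list[str]:
    """Match the reversed pattern against the reversed terms, then un-reverse."""
    rev = reverse_pattern(pattern)
    return [t[::-1] for t in REVERSED_TERMS if fnmatchcase(t, rev)]


# Leading wildcard only: reversal turns it into a cheap trailing wildcard.
suffix_only = reverse_pattern("*ById")

# Wildcards on both sides: the reversed pattern still leads with '*'.
both_sides = reverse_pattern("*Element*")
```

Here `suffix_only` becomes `dIyB*`, which has a fixed prefix to seek on, while `both_sides` becomes `*tnemelE*`, which is no better off than the original.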
Aye, and then you can use edit distance on single words (fuzzy query) to
cope with fast typers
On May 22, 2014 8:22 PM, Robert Muir robert.m...@elasticsearch.com
wrote:
On Wed, May 21, 2014 at 6:01 PM, Erik Rose grinche...@gmail.com wrote:
I'm trying to move Mozilla's source code search engine
This is definitely a great approach for a database, but it won't work
exactly the same way for an inverted index because the data structure
is totally different.
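
For anyone following along, the trigram approach under discussion can be sketched roughly like this in plain Python. This is only an illustration under assumptions: the in-memory dict stands in for the SQLite trigram index, the function names are invented, and the caller supplies a literal substring that every match must contain instead of it being derived from the regex automatically.

```python
import re
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return every 3-character substring of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(lines: list[str]) -> dict[str, set[int]]:
    """Map each trigram to the set of line numbers that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for line_no, line in enumerate(lines):
        for gram in trigrams(line):
            index[gram].add(line_no)
    return index


def candidates(index: dict[str, set[int]], total: int, literal: str) -> set[int]:
    """Pare down to lines containing every trigram of the required literal."""
    grams = trigrams(literal)
    if not grams:
        # Literal shorter than 3 characters: the index cannot narrow anything.
        return set(range(total))
    return set.intersection(*(index.get(g, set()) for g in grams))


def search(index, lines: list[str], literal: str, regex: str) -> list[int]:
    """Run the real regex only over the pared-down candidate lines."""
    pattern = re.compile(regex)
    hits = candidates(index, len(lines), literal)
    return [n for n in sorted(hits) if pattern.search(lines[n])]


LINES = ["int main(void) {", "return maintain(x);", "// nothing here"]
INDEX = build_index(LINES)
```

With these sample lines, `candidates(INDEX, len(LINES), "main")` narrows the set to the first two lines, and `search(INDEX, LINES, "main", r"\bmain\(")` then confirms only the first one with the actual regex.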
Ah, I was afraid of that. I hoped, due to the field being unanalyzed (and
the documentation's noted restriction that wildcard
I'm trying to move Mozilla's source code search engine (dxr.mozilla.org)
from a custom-written SQLite trigram index to ES. In the current production
incarnation, we support fast regex (and, by extension, wildcard) searches
by extracting trigrams from the search pattern and paring down the