Leading wildcards are really expensive: a pattern that starts with * gives
the term dictionary no prefix to seek to, so every term in the field has to
be checked.  Maybe you can try creating a copy of your "content" field that
reverses the tokens using the reverse token filter [1].  By doing this you
turn those expensive leading wildcards into trailing wildcards, which should
give you better performance.  I think your query would look something like
this:

{
  "query": {
    "constant_score": {
      "query": {
        "bool": {
          "should": [
            {"wildcard": {"content": "Children*Next*"}},
            {"wildcard": {"content_rev": "txeN*nerdlihC*"}}
          ]
        }
      }
    }
  }
}
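
In case it helps, here is a rough sketch of how the "content_rev" field
could be defined, written as a Python dict so it can be dumped to JSON and
sent when creating the index.  The analyzer name "reversed_line" and the
keyword tokenizer are assumptions on my part (I have not seen your mapping);
the index and type names are taken from your curl example.

```python
import json

# Sketch only: "reversed_line" is a hypothetical analyzer name, and the
# keyword tokenizer assumes each line should be indexed as a single token.
# "reverse" is the built-in token filter described in [1].
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "reversed_line": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["reverse"],
                }
            }
        }
    },
    "mappings": {
        "line": {
            "properties": {
                # "string" is the field type in the Elasticsearch releases
                # current for this thread; adjust for your version.
                "content_rev": {"type": "string", "analyzer": "reversed_line"}
            }
        }
    },
}

print(json.dumps(index_body, indent=2))
```

You would still need to put the same text into "content_rev" as into
"content" when indexing each document.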

Note that you will need to reverse your query string yourself before
sending it, as the wildcard query is not analyzed.
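
A minimal sketch of doing that reversal on the client side, assuming the
pattern uses only plain characters plus * and ? (no backslash escapes).
The function name is made up for illustration:

```python
def reverse_wildcard(pattern: str) -> str:
    """Reverse a wildcard pattern character by character.

    '*' and '?' mean the same thing read in either direction, so a plain
    string reversal is enough. Backslash escapes are not handled here.
    """
    if "\\" in pattern:
        raise ValueError("escaped characters are not handled in this sketch")
    return pattern[::-1]


# Build the clause for the reversed field from a leading-wildcard pattern.
clause = {"wildcard": {"content_rev": reverse_wildcard("*Children*Next")}}
print(clause)  # {'wildcard': {'content_rev': 'txeN*nerdlihC*'}}
```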

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-reverse-tokenfilter.html#analysis-reverse-tokenfilter

Thanks,
Matt Weber


On Thu, May 22, 2014 at 11:09 AM, Erik Rose <grinche...@gmail.com> wrote:

> Martijn took a swing at it just now. He eliminated any scoring-based
> slowdown, like so (constant_score_filter)…
>
>     curl -s -XGET 'http://127.0.0.1:9200/dxr_test/line/_search?pretty' -d '{
>         "query": {
>             "filtered": {
>                 "query": {
>                     "match_all": {}
>                 },
>                 "filter": {
>                     "and": [
>                         {
>                             "query": {
>                                 "match_phrase": {
>                                     "content_trg": "Children"
>                                 }
>                             }
>                         },
>                         {
>                             "query": {
>                                 "match_phrase": {
>                                     "content_trg": "Next"
>                                 }
>                             }
>                         },
>                         {
>                             "query": {
>                                 "wildcard": {
>                                     "content": {
>                                         "wildcard": "*Children*Next*",
>                                         "rewrite": "constant_score_filter"
>                                     }
>                                 }
>                             }
>                         }
>                     ]
>                 }
>             }
>         }
>     }'
>
> …but it didn't make any difference. Somehow, the `and` pipeline isn't
> behaving as we expect. Since ES can't provide any more detailed timing
> output, I guess the next step is to go look at the source code for the `and`
> filter and the wildcard query and see what's what.
>
> I think we'd both be fascinated to know what's going on, if anyone has
> anything to add.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/3114f40c-0b15-4dd4-8a6b-fc8c13d43f23%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>




