I opened https://issues.apache.org/jira/browse/SOLR-2519 for this.

Mike

http://blog.mikemccandless.com

On Sun, May 15, 2011 at 8:02 AM, Michael McCandless <luc...@mikemccandless.com> wrote:
> On Fri, May 6, 2011 at 8:49 AM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>
>> Shouldn't we have field types in the eg schema for the different
>> languages? Ie, text_zh, text_th, text_en, text_ja, text_nl, etc.
>
> In fact, until we break out dedicated language field types, shouldn't
> we default autophrase to off in Solr?
>
> I think this is what ElasticSearch does (just inherits Lucene's
> default for this) -- Shay, or any ElasticSearch users out there... can
> you confirm?
>
> Leaving autophrase on is catastrophic for non-whitespace languages
> (CJK and others), and at best iffy for whitespace languages (ie,
> unexpected that the QueryParser would make a PhraseQuery when user
> hadn't asked for one, not clear it really helps relevance for
> whitespace languages, definitely hurts performance), so leaving it is
> doing far more damage than good, as far as I can tell.
>
> Any objections to turning off autophrase by default in Solr, until we
> have per-language field types?
>
> Mike
>
> http://blog.mikemccandless.com