[ https://issues.apache.org/jira/browse/SOLR-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless reassigned SOLR-2519:
----------------------------------------

    Assignee: Michael McCandless

> Improve the defaults for the "text" field type in default schema.xml
> --------------------------------------------------------------------
>
>                 Key: SOLR-2519
>                 URL: https://issues.apache.org/jira/browse/SOLR-2519
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>
> Spinoff from: http://lucene.markmail.org/thread/ww6mhfi3rfpngmc5
> The text fieldType in schema.xml is unusable for non-whitespace
> languages, because it has the dangerous auto-phrase feature (of
> Lucene's QP -- see LUCENE-2458) enabled.
> Lucene leaves this off by default, as does ElasticSearch
> (http://http://www.elasticsearch.org/).
> Furthermore, the "text" fieldType uses WhitespaceTokenizer when
> StandardTokenizer is a better cross-language default.
> Until we have language specific field types, I think we should fix
> the "text" fieldType to work well for all languages, by:
>   * Switching from WhitespaceTokenizer to StandardTokenizer
>   * Turning off auto-phrase
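
For illustration, the two changes proposed in the issue might look roughly like the schema.xml fragment below. This is a sketch, not the actual default schema: it assumes a Solr version whose TextField accepts the autoGeneratePhraseQueries attribute, and the positionIncrementGap value and the lowercase filter are placeholders chosen for the example rather than anything stated in the issue.

```xml
<!-- Illustrative sketch only; not the shipped default schema.xml. -->
<!-- Assumes TextField supports the autoGeneratePhraseQueries attribute. -->
<fieldType name="text" class="solr.TextField"
           positionIncrementGap="100"
           autoGeneratePhraseQueries="false">
  <analyzer>
    <!-- StandardTokenizer in place of solr.WhitespaceTokenizerFactory -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Placeholder filter for the example; a real chain would likely have more -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With auto-phrase turned off, a query term that the analyzer splits into several tokens should no longer be turned into an implicit phrase query, which is the behavior the issue describes as a problem for non-whitespace languages.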