[ https://issues.apache.org/jira/browse/LUCENE-6103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Itamar Syn-Hershko updated LUCENE-6103:
---------------------------------------
    Summary: StandardTokenizer doesn't tokenize word:word  (was: StandardTokenizer doesn't tokenizer word:word)

> StandardTokenizer doesn't tokenize word:word
> --------------------------------------------
>
>                 Key: LUCENE-6103
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6103
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>    Affects Versions: 4.9
>            Reporter: Itamar Syn-Hershko
>
> StandardTokenizer (and, as a result, most default analyzers) does not split
> word:word and preserves it as a single token. This is easy to see using
> Elasticsearch's analyze API:
> localhost:9200/_analyze?tokenizer=standard&text=word%20word:word
> If this is the intended behavior, then why? I can't really see the logic
> behind it.
> If not, I'll be happy to join in the effort of fixing this.
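
The analyze call quoted in the report can be reproduced with a short script. The following is a minimal sketch, not part of the original report: the function names `build_analyze_url` and `fetch_tokens` are hypothetical, the `http://` scheme is assumed, and the response is assumed to be JSON with a top-level `tokens` list whose entries have a `token` field. It also assumes an Elasticsearch node of roughly the same era as the report; later releases may expect the parameters in a request body instead of the query string.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def build_analyze_url(text, tokenizer="standard", host="localhost:9200"):
    """Return an _analyze URL shaped like the one quoted in the report."""
    # quote() turns the space into %20; ':' is kept literal to match the report.
    encoded = quote(text, safe=":")
    return f"http://{host}/_analyze?tokenizer={tokenizer}&text={encoded}"


def fetch_tokens(url):
    """Call the analyze API and return the token strings (needs a running node).

    Assumption: the response body is JSON of the form
    {"tokens": [{"token": "..."}, ...]}.
    """
    with urlopen(url) as response:
        payload = json.load(response)
    return [entry["token"] for entry in payload["tokens"]]


# Building the URL needs no server, so it is safe to run anywhere.
print(build_analyze_url("word word:word"))
# prints http://localhost:9200/_analyze?tokenizer=standard&text=word%20word:word
```

With a node listening on localhost:9200, `fetch_tokens(build_analyze_url("word word:word"))` would return the token list; according to the report, `word:word` comes back as one token instead of two.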