I think Ahmet is right. Using WhitespaceTokenizer will split the document into
tokens, and then you can use a custom filter to delete whichever punctuation
you want to remove. Implementing a custom filter is not very difficult.
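
To make the idea concrete, here is a minimal plain-Java sketch of the two
stages: split on whitespace first, then strip a chosen set of punctuation
characters from each token. It deliberately does not use the Lucene API. In
Lucene the second stage would be written as a TokenFilter subclass, and the
class name, method names, and the REMOVE set below are only assumptions for
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class PunctuationFilterSketch {
    // Hypothetical set of punctuation characters to strip; adjust as needed.
    static final Set<Character> REMOVE = Set.of(',', '.', '!', '?', ';', ':');

    // Stage 1: split on whitespace only, so punctuation stays attached to tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stage 2: the "custom filter". Drop the unwanted characters from each
    // token and discard tokens that become empty.
    static List<String> stripPunctuation(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            StringBuilder sb = new StringBuilder();
            for (char c : token.toCharArray()) {
                if (!REMOVE.contains(c)) {
                    sb.append(c);
                }
            }
            if (sb.length() > 0) {
                out.add(sb.toString());
            }
        }
        return out;
    }

    // Chain the two stages, tokenizer first and filter second.
    static List<String> analyze(String text) {
        return stripPunctuation(tokenize(text));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Hello, world! C++ and e-mail: keep these."));
    }
}
```

Because '+' and '-' are not in the REMOVE set, tokens such as C++ and e-mail
pass through unchanged, which is the kind of control over punctuation being
asked for.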
380382...@qq.com
From: Yonghui Zhao
Sent: 2017-03-08 12:22
To: Ahmet Arslan
Cc: jav
FYI, the http://lucene.apache.org/solr/mirrors-solr-latest-redir.html
link redirects to http://www.apache.org/dyn/closer.lua/lucene/solr/6.4.1
and not http://www.apache.org/dyn/closer.lua/lucene/solr/6.4.2
On 8 March 2017 at 01:00, Ishan Chattopadhyaya
wrote:
> 7 March 2017, Apache Solr 6.4.2
Hi Ahmet,
Thanks for your reply, but I didn't quite get your idea.
I want to get an analyzer like standard analyzer but with punctuation
customized.
I think one way is to build a custom analyzer with a custom tokenizer
modeled on StandardTokenizer.
In my tokenizer I will re-write StandardTokenizerImp
7 March 2017, Apache Solr 6.4.2 available
Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive R
7 March 2017, Apache Lucene™ 6.4.2 available
The Lucene PMC is pleased to announce the release of Apache Lucene 6.4.2
Apache Lucene is a high-performance, full-featured text search engine library
written entirely in Java. It is a technology suitable for nearly any
application that requires full-t