Re: Re: any analyzer will keep punctuation?

2017-03-07 Thread 380382...@qq.com
I think Ahmet is right. A whitespace tokenizer will split the document into tokens, and a custom filter can then delete whichever punctuation you want to remove. Implementing a custom filter is not very difficult. 380382...@qq.com From: Yonghui Zhao Sent: 2017-03-08 12:22 To: Ahmet Arslan Cc: jav
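The two-step idea in this reply (split on whitespace first, then strip selected punctuation in a separate filter step) can be sketched roughly as below. This is a plain-Java illustration only, not Lucene's API: the class `PunctuationFilterSketch` and its method names are hypothetical, and in Lucene the second step would normally be written as a `TokenFilter` subclass placed after a `WhitespaceTokenizer`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; none of these names come from Lucene.
class PunctuationFilterSketch {

    // Step 1: split on runs of whitespace only, so punctuation stays attached to tokens.
    static List<String> whitespaceTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Step 2: remove only the characters listed in `unwanted`;
    // tokens that end up empty are dropped.
    static List<String> removePunctuation(List<String> tokens, String unwanted) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            StringBuilder kept = new StringBuilder();
            for (int i = 0; i < token.length(); i++) {
                char c = token.charAt(i);
                if (unwanted.indexOf(c) < 0) {
                    kept.append(c);
                }
            }
            if (kept.length() > 0) {
                out.add(kept.toString());
            }
        }
        return out;
    }

    // Convenience wrapper chaining both steps.
    static List<String> analyze(String text, String unwanted) {
        return removePunctuation(whitespaceTokenize(text), unwanted);
    }
}
```

Calling `PunctuationFilterSketch.analyze("Hello, world! C++ rocks.", ",.!")` keeps the `+` signs in `C++` while removing the comma, period, and exclamation mark, which is the kind of per-character control the thread is asking about. The sketch does no lowercasing or other normalization; those would be further filter steps.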

Re: [ANNOUNCE] Apache Solr 6.4.2 released

2017-03-07 Thread Sahil Agarwal
FYI, the http://lucene.apache.org/solr/mirrors-solr-latest-redir.html link redirects to http://www.apache.org/dyn/closer.lua/lucene/solr/6.4.1 and not http://www.apache.org/dyn/closer.lua/lucene/solr/6.4.2 On 8 March 2017 at 01:00, Ishan Chattopadhyaya wrote: > 7 March 2017, Apache Solr 6.4.2

Re: any analyzer will keep punctuation?

2017-03-07 Thread Yonghui Zhao
Hi Ahmet, Thanks for your reply, but I didn't quite get your idea. I want an analyzer like the standard analyzer but with customized punctuation handling. I think one way is to build a custom analyzer with a custom tokenizer similar to StandardTokenizer. In my tokenizer I will re-write StandardTokenizerImp

[ANNOUNCE] Apache Solr 6.4.2 released

2017-03-07 Thread Ishan Chattopadhyaya
7 March 2017, Apache Solr 6.4.2 available. Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search and analytics, rich document parsing, geospatial search, extensive R

[ANNOUNCE] Apache Lucene 6.4.2 released

2017-03-07 Thread Ishan Chattopadhyaya
7 March 2017, Apache Lucene™ 6.4.2 available. The Lucene PMC is pleased to announce the release of Apache Lucene 6.4.2. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-t