Re: Lucene segment selection strategy

2015-07-17 Thread Michael McCandless
Curious ... Lucene should try to fall back to the older segments_N files (even if segments.gen points to the new, broken ones). We've removed segments.gen as of 5.x and I think it's unlikely we'll do another 4.10.x release at this point, but maybe still open the issue in case others hit it? Mike M

Lucene segment selection strategy

2015-07-17 Thread Geoff Cooney
Hi, We recently had an issue with an index where two sequential aborted but unsuccessfully rolled back commits resulted in empty segments_n files, segments_i13p and segments_i13q in this case. This resulted in an exception whenever we tried to open the index until we manually removed the bad segm
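For anyone who hits the same symptom, here is a minimal sketch of how the empty commit files could be located before deciding what to remove by hand. It uses only the JDK, not any Lucene API, and assumes that the suffix after "segments_" is the commit generation written in base 36 (which matches names like segments_i13p and segments_i13q). The class and method names are hypothetical, and the sketch only reports files; it does not delete anything.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists zero-length segments_N files.
public class EmptySegmentsFinder {

    private static final String PREFIX = "segments_";

    /**
     * Parses the generation from a name such as "segments_i13p".
     * Assumption: the suffix is a base-36 number. Returns -1 for other names.
     */
    static long generationOf(String fileName) {
        if (!fileName.startsWith(PREFIX) || fileName.length() == PREFIX.length()) {
            return -1L;
        }
        try {
            return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    /** Returns the zero-length segments_N files in indexDir, oldest generation first. */
    static List<Path> findEmptySegmentsFiles(Path indexDir) throws IOException {
        List<Path> empty = new ArrayList<>();
        // The glob skips segments.gen, which has no underscore after "segments".
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir, PREFIX + "*")) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                if (generationOf(name) >= 0 && Files.size(p) == 0L) {
                    empty.add(p);
                }
            }
        }
        empty.sort(Comparator.comparingLong(p -> generationOf(p.getFileName().toString())));
        return empty;
    }

    public static void main(String[] args) throws IOException {
        // The index directory is passed as the first argument; defaults to the current directory.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findEmptySegmentsFiles(indexDir)) {
            System.out.println("empty commit file: " + p);
        }
    }
}
```

Run it against a copy of the index directory first. Whether removing the reported files is safe depends on an older, intact segments_N file still being present.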

Re: Upgrading from lucene 2.9.1 to 5.X

2015-07-17 Thread Erick Erickson
Yes, very significant ones. Also memory usage is vastly improved. On Fri, Jul 17, 2015 at 11:31 AM, Shuangyang Yang wrote: > Erick, > > Thank you very much. Is there any performance comparison between these two > versions? Is there any data on the performance? > > Thank you very much > Best Regar

Re: Upgrading from lucene 2.9.1 to 5.X

2015-07-17 Thread Shuangyang Yang
Erick, Thank you very much. Is there any performance comparison between these two versions? Is there any data on the performance? Thank you very much Best Regards --- Shuangyang Yang linkedin.com/in/everyoung On 7/17/15, 10:06 AM, "Erick Erickson" wrote: >Please look at the CHANGES.txt fi

Re: Upgrading from lucene 2.9.1 to 5.X

2015-07-17 Thread Erick Erickson
Please look at the CHANGES.txt file for Solr and/or Lucene, there's a "New Features" section for every release. Best, Erick On Fri, Jul 17, 2015 at 7:55 AM, Shuangyang Yang wrote: > Erick, Amish, > > Thank you very much for your reply. They are really helpful. Can you tell > me what are the comp

Re: Upgrading from lucene 2.9.1 to 5.X

2015-07-17 Thread Shuangyang Yang
Erick, Amish, Thank you very much for your reply. They are really helpful. Can you tell me what are the compelling features from 2.9.1 to 5.X? I'm sure there are many. Thank you very much Best Regards --- Shuangyang Yang linkedin.com/in/everyoung On 7/16/15, 8:41 PM, "Erick Erickson" wrote

Re: StandardTokenizer#setMaxTokenLength

2015-07-17 Thread Steve Rowe
Hi Piotr, Thanks for reporting! See https://issues.apache.org/jira/browse/LUCENE-6682 Steve www.lucidworks.com > On Jul 16, 2015, at 4:47 AM, Piotr Idzikowski > wrote: > > Hello. > I am developing own analyzer based on StandardAnalyzer. > I realized that tokenizer.setMaxTokenLength is called

Analyzer for supporting hyphenated words

2015-07-17 Thread Diego Socaceti
Hi all, I'm new to Lucene and tried to write my own analyzer to support hyphenated words like wi-fi, jean-pierre, etc. For our customer it is important to find the word - wi-fi by wi, fi, wifi, wi-fi - jean-pierre by jean, pierre, jean-pierre, jean-* The analyzer: public class SupportHyphenate
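
The matching requirement above can be illustrated independently of the (truncated) analyzer. The following is a rough sketch in plain Java, not a Lucene TokenFilter and not the poster's code: it shows which index terms a hyphenated token would need to produce so that wi, fi, wifi and wi-fi all match. The class name HyphenTermExpander and the choice to lowercase are assumptions made for illustration. A prefix query such as jean-* would be served by keeping the original hyphenated term.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; a real analyzer would emit these as tokens.
public class HyphenTermExpander {

    /**
     * Expands a token into the terms to index: the original form,
     * each hyphen-separated part, and the parts joined without hyphens.
     * Tokens without a hyphen are returned unchanged (lowercased).
     */
    static Set<String> expand(String token) {
        String lower = token.toLowerCase(Locale.ROOT);
        Set<String> terms = new LinkedHashSet<>();
        terms.add(lower); // original form, needed for exact and prefix matches
        if (lower.indexOf('-') < 0) {
            return terms;
        }
        StringBuilder joined = new StringBuilder();
        for (String part : lower.split("-")) {
            if (!part.isEmpty()) {
                terms.add(part);     // e.g. "wi", "fi"
                joined.append(part);
            }
        }
        if (joined.length() > 0) {
            terms.add(joined.toString()); // e.g. "wifi"
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(expand("wi-fi"));
        System.out.println(expand("jean-pierre"));
    }
}
```

Note that this also produces the joined form for names (jeanpierre), which the requirement does not ask for; whether that extra term is acceptable is a design choice.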