AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-24 Thread Uwe Goetzke
Hi Ivan, No, we do not use StandardAnalyzer or StandardTokenizer. Most data is processed by
    fTextTokenStream = result = new org.apache.lucene.analysis.WhitespaceTokenizer(reader);
    result = new ISOLatin2AccentFilter(result); // ISOLatin1AccentFilter, modified so that ö -> oe
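For readers unfamiliar with this kind of analysis chain, here is a minimal standalone sketch of the idea in the message above: split the text on whitespace, then rewrite accented characters. It does not use Lucene and is not the ISOLatin2AccentFilter mentioned in the message. The class name, the method names, and every mapping other than ö -> oe are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free illustration of "whitespace tokenizer + accent filter".
public class UmlautFoldingSketch {

    // Rewrites lowercase German umlauts to two-letter forms.
    // Only the o-umlaut -> "oe" mapping is stated in the thread; the others are assumed.
    public static String fold(String token) {
        StringBuilder sb = new StringBuilder(token.length() + 4);
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            switch (c) {
                case '\u00f6': sb.append("oe"); break; // o-umlaut (from the thread)
                case '\u00e4': sb.append("ae"); break; // a-umlaut (assumed)
                case '\u00fc': sb.append("ue"); break; // u-umlaut (assumed)
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Splits on runs of whitespace and folds each token.
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(fold(raw));
            }
        }
        return tokens;
    }
}
```

Folding at index time and at query time with the same function is what lets a query typed as "oe" match text written with an umlaut.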

AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread Uwe Goetzke
Jake, With the bigram-based index we gave up the struggle to find a well-working language-based index. We had implemented Soundex (or other "sound-alike" schemes) and hyphenation, but failed to deliver search results we could explain to users ("why is this ranked higher" and so on...). One reason may b
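As a rough illustration of what a bigram-based index works with, the sketch below splits a term into overlapping two-character grams and counts how many distinct grams two terms share. The thread does not say whether the bigrams are character-level or word-level, so character bigrams are an assumption here, as are the class and method names. This is not the NGramStemFilter discussed in the thread.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of character bigrams; not code from the thread.
public class BigramSketch {

    // Returns the overlapping two-character grams of a term, in order.
    // A one-character term is returned as a single gram; an empty term yields no grams.
    public static List<String> bigrams(String term) {
        List<String> grams = new ArrayList<>();
        if (term.length() < 2) {
            if (!term.isEmpty()) {
                grams.add(term);
            }
            return grams;
        }
        for (int i = 0; i + 2 <= term.length(); i++) {
            grams.add(term.substring(i, i + 2));
        }
        return grams;
    }

    // Counts the distinct bigrams that two terms have in common.
    public static int sharedBigrams(String a, String b) {
        Set<String> common = new HashSet<>(bigrams(a));
        common.retainAll(new HashSet<>(bigrams(b)));
        return common.size();
    }
}
```

Because similar spellings share most of their grams, a match can be explained to a user in terms of visible overlap rather than a phonetic code.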

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-24 Thread Michael McCandless
Ivan, can you describe more about your application? The overall time for indexing has gotten much faster in 2.3, but this assumes that things like retrieving a document from its original source, filtering it, etc., are minimal. If you have an application where most of the time is spent outsi

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-24 Thread Ivan Vasilev
Yes Michael, it seems our app takes time to retrieve docs from their sources. I have to run a profiler to see where the bottleneck is in our case. Thanks to you and Uwe for the answers! Ivan Michael McCandless wrote: Ivan can you describe more about your application? The overall time
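Before reaching for a full profiler, a coarse split of wall-clock time between fetching a document and indexing it can show whether the bottleneck lies outside Lucene, which is the situation described in this exchange. The sketch below is one way to measure that split. The class and method names are hypothetical, and the indexer is a plain callback standing in for whatever call adds the document to the index.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper that separates time spent fetching from time spent indexing.
public class StageTimerSketch {
    private long fetchNanos;
    private long indexNanos;

    // Each Supplier fetches one document; the Consumer stands in for the indexing call.
    public <T> void run(Iterable<Supplier<T>> sources, Consumer<T> indexer) {
        for (Supplier<T> source : sources) {
            long start = System.nanoTime();
            T doc = source.get();          // retrieval from the original source
            long fetched = System.nanoTime();
            indexer.accept(doc);           // indexing step
            long indexed = System.nanoTime();
            fetchNanos += fetched - start;
            indexNanos += indexed - fetched;
        }
    }

    // Fraction of measured time spent fetching, or 0.0 if nothing was measured.
    public double fetchShare() {
        long total = fetchNanos + indexNanos;
        return total == 0 ? 0.0 : (double) fetchNanos / total;
    }
}
```

If the fetch share is close to 1.0, a faster indexer will have little effect on total run time, and the retrieval path is the place to look first.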

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread Jay
Hi Uwe, I am curious what NGramStemFilter is? Is it a combination of porter stemming and word ngram identification? Thanks! Jay Uwe Goetzke wrote: Hi Ivan, No, we do not use StandardAnalyser or StandardTokenizer. Most data is processed by fTextTokenStream = result = new org.apache.lucene

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread Otis Gospodnetic
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jay <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, March 25, 2008 1:32:24 PM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Hi Uwe, I am curious

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread Jay
- Original Message From: Jay <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, March 25, 2008 1:32:24 PM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Hi Uwe, I am curious what NGramStemFilter is? Is it a combination of porter stemming an

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread Otis Gospodnetic
a-user@lucene.apache.org Sent: Tuesday, March 25, 2008 6:15:54 PM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Sorry, I could not find the filter in the 2.3 API class list (core + contrib + test). I am not aware of a Lucene config file either. Could you please tell me where it

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread yu
/index.html Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jay <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, March 25, 2008 6:15:54 PM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Sorry, I could

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread Otis Gospodnetic
To: java-user@lucene.apache.org Sent: Wednesday, March 26, 2008 12:04:33 AM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Hi Otis, I checked that contrib before and could not find NgramStemFilter. Am I missing other contrib? Thanks for the link! Jay Otis Gospodne

Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-25 Thread yu
-- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jay <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, March 25, 2008 1:32:24 PM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Hi Uwe, I am curious what NGramStem

AW: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-26 Thread Uwe Goetzke
dependent searching. Regards Uwe -Original Message- From: yu [mailto:[EMAIL PROTECTED] Sent: Wednesday, 26 March 2008 05:26 To: java-user@lucene.apache.org Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Sorry for my ignorance, I am looking for Ng

Re: AW: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1

2008-03-26 Thread Jay
ache.org Sent: Wednesday, March 26, 2008 12:04:33 AM Subject: Re: AW: feedback: Indexing speed improvement lucene 2.2->2.3.1 Hi Otis, I checked that contrib before and could not find NgramStemFilter. Am I missing other contrib? Thanks for the link! Jay Otis Gospodnetic wrote: Hi Jay, Sorry