RE: Boolean Query search performance

2008-03-10 Thread Beard, Brian
AHA! That is consistent with what is happening now, and explains the discrepancy. The original post had parens around each term because I was adding them as separate boolean queries, but now that I am using just the clause, the parens are around the entire clause with the boost.

Good way of Indexing TextFiles

2008-03-10 Thread Sebastin
Hi all, I am going to create a Lucene index store of 300 GB per month. I read the Lucene index performance tips in the wiki. Can anyone suggest what steps need to be followed when dealing with big indexes? My index store gets updated every second. I used to search 15 days records

RE: Swapping between indexes

2008-03-10 Thread Toke Eskildsen
On Fri, 2008-03-07 at 15:38 +0100, [EMAIL PROTECTED] wrote: With a commit after every add: (286 sec / 10,000 docs) 28.6 ms. With a commit after every 100 add: (12 sec / 10,000 docs) 1.2 ms. Only one commit: (8 sec / 10,000 docs) 0.8 ms. Of course. If you need so little time to create a
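
The per-document figures quoted above follow directly from the totals. As a minimal sketch of that arithmetic (the function names and the dictionary labels are illustrative and not from the thread; nothing here calls Lucene):

```python
import math

DOCS = 10_000  # number of documents in each quoted run

# Total wall-clock seconds per commit strategy, as quoted in the message.
TOTAL_SECONDS = {
    "commit after every add": 286,
    "commit after every 100 adds": 12,
    "only one commit": 8,
}


def ms_per_doc(total_seconds, docs=DOCS):
    """Average cost per document in milliseconds."""
    return total_seconds * 1000 / docs


def commits_needed(docs, batch_size):
    """Number of commits when committing once per batch_size adds."""
    return math.ceil(docs / batch_size)


if __name__ == "__main__":
    for label, seconds in TOTAL_SECONDS.items():
        print(f"{label}: {ms_per_doc(seconds):.1f} ms per document")
    print("commits at batch size 100:", commits_needed(DOCS, 100))
```

Read this way, the gap between the first and last runs is about 27.8 ms per document, which in this measurement is the cost attributable to committing on every add.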

Best way to do Query inflation?

2008-03-10 Thread Itamar Syn-Hershko
Hi all, I'm looking for the best way to inflate a query, so a query like: synchronous AND colour -- will become something like this: (synchronous OR asynchronous OR bsynchornous OR synchronos OR asynchronos OR bsynchornos) AND (colour OR acolour OR bcolour OR color OR acolor OR bcolor). I'm
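
One way to picture the expansion described above is as plain string rewriting: each term is replaced by an OR group built from its spelling variants combined with a set of prefixes. The sketch below is only an illustration under that assumption. The function names, the variant table, and the prefix list are hypothetical, and it builds a query string rather than using any Lucene query class.

```python
OPERATORS = {"AND", "OR", "NOT"}


def inflate_term(term, spellings, prefixes):
    """Expand one term into an OR group of prefix + spelling combinations."""
    forms = []
    for spelling in spellings.get(term, [term]):
        for prefix in prefixes:
            candidate = prefix + spelling
            if candidate not in forms:  # keep order, drop duplicates
                forms.append(candidate)
    if len(forms) == 1:
        return forms[0]
    return "(" + " OR ".join(forms) + ")"


def inflate_query(query, spellings, prefixes=("",)):
    """Inflate every non-operator token of a whitespace-separated query."""
    return " ".join(
        token if token in OPERATORS else inflate_term(token, spellings, prefixes)
        for token in query.split()
    )


if __name__ == "__main__":
    # Hypothetical variant table; a real one would come from a lexicon.
    spellings = {
        "synchronous": ["synchronous", "synchronos"],
        "colour": ["colour", "color"],
    }
    print(inflate_query("synchronous AND colour", spellings, prefixes=("", "a", "b")))
```

The sketch only handles flat, whitespace-separated queries; parenthesised or quoted input would need a real query parser.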

Re: Best way to do Query inflation?

2008-03-10 Thread Mathieu Lecarme
https://admin.garambrogne.net/projets/revuedepresse/browser/trunk/src/java/lexicon/src/java/org/apache/lucene/lexicon/QueryUtils.java M. Itamar Syn-Hershko wrote: Hi all, I'm looking for the best way to inflate a query, so a query like: synchronous AND colour -- will become something like

phrase search with custom TokenFilter

2008-03-10 Thread Embry, Clay
Hi, I have written a TokenFilter which breaks up words with internal dot characters and adds the whole word plus the pieces as tokens in the stream. I am using that TokenFilter with the StandardAnalyzer to index my documents. Then I do searches using the StandardAnalyzer. Everything is working
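
The token expansion described above can be sketched independently of Lucene. The following is a rough illustration only: the function name is hypothetical, the order in which the whole word and its pieces are emitted is an assumption, and it deliberately ignores token positions, which the message does not describe.

```python
def expand_dotted(tokens):
    """Emit each token; for tokens with internal dots, also emit the pieces.

    Assumption: the whole word comes first, followed by its pieces in order.
    """
    out = []
    for token in tokens:
        out.append(token)
        pieces = [piece for piece in token.split(".") if piece]
        # Only tokens that really split into several parts get extra tokens.
        if len(pieces) > 1:
            out.extend(pieces)
    return out


if __name__ == "__main__":
    print(expand_dotted(["visit", "www.example.com", "today"]))
```

In a real filter, the positions assigned to the extra tokens matter for phrase queries, so that is the first thing to check when phrase search behaves unexpectedly.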

Biggest index

2008-03-10 Thread spring
Hi, I have some questions about the index size on a single machine: What is the biggest index you use in production? Do you use MultiReader/Searcher? What hardware do you need to serve it? What kind of application is it? Thank you.

Re: Looking for an example of Using Position Increment Gap

2008-03-10 Thread Chris Hostetter
: the analysis section. (Basically writing a custom analyzer that introduces a : position increment gap between phrases) I am however curious if an example of : a usage like that exists somewhere that I could use as a basis for the : analyzer that I'm going to have to write to handle this case.
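
As a rough illustration of why a position increment gap between phrases matters, the sketch below assigns token positions to several values of one field and checks exact phrase matches against them. It is a simplified model with hypothetical function names, not the Lucene analyzer API, and the exact offsets Lucene assigns may differ.

```python
def assign_positions(values, gap):
    """Assign positions to tokens, leaving `gap` unused slots between values."""
    positions = []
    next_pos = 0
    for i, value in enumerate(values):
        if i > 0:
            next_pos += gap  # the gap separates one phrase from the next
        for term in value.split():
            positions.append((term, next_pos))
            next_pos += 1
    return positions


def phrase_matches(positions, phrase):
    """True if the phrase terms occur at consecutive positions."""
    terms = phrase.split()
    if not terms:
        return False
    index = {}
    for term, pos in positions:
        index.setdefault(term, set()).add(pos)
    return any(
        all(start + k in index.get(term, ()) for k, term in enumerate(terms))
        for start in index.get(terms[0], ())
    )


if __name__ == "__main__":
    values = ["quick brown", "fox jumps"]
    print(phrase_matches(assign_positions(values, gap=0), "brown fox"))
    print(phrase_matches(assign_positions(values, gap=100), "brown fox"))
```

With no gap, a phrase query can match across the boundary between two values. With a gap, only phrases that lie inside a single value match.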

Re: Swapping between indexes

2008-03-10 Thread Eric Th
What's the CPU and memory usage with "a commit after every 100 add" vs. "only one commit"? 2008/3/10, Toke Eskildsen [EMAIL PROTECTED]: On Fri, 2008-03-07 at 15:38 +0100, [EMAIL PROTECTED] wrote: With a commit after every add: (286 sec / 10,000 docs) 28.6 ms. With a commit

Document ID shuffling under 2.3.x (on merge?)

2008-03-10 Thread Daniel Noll
Hi all. We're using the document ID to associate extra information stored outside Lucene. Some of this information is being stored at load-time and some afterwards; later on it turns out the information stored at load-time is returning the wrong results when converting the database contents