Re: FileNotFoundException in ConcurrentMergeScheduler

2008-06-12 Thread Michael McCandless
Hi Grant, My stress test is unable to reproduce this exception, either. I'm adding Wikipedia docs to an index, using a high merge factor, then opening a new writer with low merge factor (5) and calling optimize. This forces concurrent merges to run during the optimize. One more

Match best one from list

2008-06-12 Thread JustJoc
I'm new to Lucene (don't they all just say that), and finding it a little daunting. I am trying to find a way to replicate functionality we currently have with our database searching so that we can apply it to documents too. Most of it is just simple matching, but there is one particular part I am

lucene wildcard query with stop character

2008-06-12 Thread Cam Bazz
Hello, Imagine I have documents with the keys A, AB, ABC, ABD and ABCD. Now imagine a query with the keyword analyzer and a wildcard, AB*, which will bring me ABC, ABD and ABCD, but I just want to get ABC and ABD. So can I make a query like AB* that does not match anything beyond the single character after AB? Best

Re: lucene wildcard query with stop character

2008-06-12 Thread Matthew Hall
I assume you want all of your queries to function in this way? If so, you could just translate the * character into a ? at search time, which should give you the functionality you are asking for. Unless I'm missing something. Matt Cam Bazz wrote: Hello, Imagine I have the following
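The difference between the two wildcards can be sketched without Lucene at all. The snippet below is only an illustration of the documented wildcard semantics (* matches zero or more characters, ? matches exactly one); it emulates them with java.util.regex rather than calling WildcardQuery, and the class and method names (WildcardSketch, toRegex, matching) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: emulates wildcard matching
// against whole keys, as a keyword-analyzed field would be matched.
final class WildcardSketch {

    // Translate a wildcard pattern into a regex:
    // '*' -> any run of characters (including none), '?' -> exactly one character.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Return the keys that match the wildcard pattern in full.
    static List<String> matching(String wildcard, List<String> keys) {
        Pattern p = toRegex(wildcard);
        List<String> out = new ArrayList<>();
        for (String key : keys) {
            if (p.matcher(key).matches()) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("A", "AB", "ABC", "ABD", "ABCD");
        System.out.println("AB* -> " + matching("AB*", keys));
        System.out.println("AB? -> " + matching("AB?", keys));
    }
}
```

With the keys from the question, AB? keeps ABC and ABD and drops ABCD. This only covers the case where exactly one character follows the prefix; tags of varying length need a different approach.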

Re: Does lucene support distributed indexing?

2008-06-12 Thread Adrian Tarau
A year ago I started a different implementation of ParallelMultiSearcher using a ThreadPoolExecutor, where everything is parallelized. Unfortunately, I had to interrupt this and work on something else, but this month I'll start working on it again. Right now there are some dependencies so it cannot

Giving Boost to a certain category of pages

2008-06-12 Thread sumittyagi
Hi, I am maintaining a website's search engine, which uses Lucene. My job is to give a boost to a particular set of pages, such as pages about the company's products, pages describing the company, pages about the technology used, etc. How can I start on that? I mean, I just joined this job and

Re: Giving Boost to a certain category of pages

2008-06-12 Thread sumittyagi
Which one do you think is faster, boosting at search time or boosting at index time? Thanks for the reply. Erick Erickson wrote: From the Hossman: '...Index time field boosts are a way to express things like this document's title is worth twice as much as the title of most documents.

Re: FileNotFoundException in ConcurrentMergeScheduler

2008-06-12 Thread Grant Ingersoll
On Jun 12, 2008, at 6:39 AM, Michael McCandless wrote: Hi Grant, My stress test is unable to reproduce this exception, either. I'm adding Wikipedia docs to an index, using a high merge factor, then opening a new writer with low merge factor (5) and calling optimize. This forces

Re: lucene wildcard query with stop character

2008-06-12 Thread Cam Bazz
Well, the ? would work if every token had the same length. However, instead of ABC I want tags whose length changes dynamically from 1 to unlimited. I guess I could just pad every token to a normalized length such as ...000A, but I am hoping there is a better method. If we could tell Lucene

Re: lucene wildcard query with stop character

2008-06-12 Thread Matthew Hall
Hrm.. can we see a more specific example of the type of data you are trying to query against here? Matt Cam Bazz wrote: Well, the ? would work if every token had the same length. However, instead of ABC I want tags whose length changes dynamically from 1 to unlimited. I guess I could just pad

Book: Building Search Applications: Lucene, LingPipe and Gate

2008-06-12 Thread Bob Carpenter
Manu Konchady's book on building search applications is out: Konchady, Manu. 2008. Building Search Applications: Lucene, LingPipe, and Gate. Mustru Publishing. It's available from Amazon: http://www.amazon.com/Building-Search-Applications-Lucene-Lingpipe/dp/0615204252/ The book's a gentle

Re: HitCollector and sorting

2008-06-12 Thread Chris Hostetter
: So, how can I get the same results using the HitCollector? Also it would be : really nice, if you could point me to some examples of using it... Take a look at TopFieldDocCollector. It's a HitCollector provided out of the box that does sorting. If you look at the trunk, the (recently