Re: Caching in lucene

2007-09-17 Thread Shailendra Mudgal
OK, let me explain. My understanding of warming up the reader is that it memorizes the index terms, so subsequent queries are answered using these terms. Is this correct? On 9/18/07, Karl Wettin <[EMAIL PROTECTED]> wrote: On 18 Sep 2007 at 08:33, Shailendra Mudgal wrote:

Re: Caching in lucene

2007-09-17 Thread Karl Wettin
On 18 Sep 2007 at 08:33, Shailendra Mudgal wrote: Whether it caches frequently used terms? I don't understand your question. -- karl On 9/18/07, Karl Wettin <[EMAIL PROTECTED]> wrote: On 18 Sep 2007 at 07:12, Shailendra Mudgal wrote: My goal is to understand the caching strategy.

Re: Caching in lucene

2007-09-17 Thread Shailendra Mudgal
Whether it caches frequently used terms? On 9/18/07, Karl Wettin <[EMAIL PROTECTED]> wrote: On 18 Sep 2007 at 07:12, Shailendra Mudgal wrote: My goal is to understand the caching strategy. How well does this work for repeated queries? Is there any room available to improve

Re: Caching in lucene

2007-09-17 Thread Karl Wettin
On 18 Sep 2007 at 07:12, Shailendra Mudgal wrote: My goal is to understand the caching strategy. How well does this work for repeated queries? Is there any room available to improve this? It is usually a waste of resources to cache results in a busy system with Gaussian-distributed q

Re: Caching in lucene

2007-09-17 Thread Shailendra Mudgal
Hi Yonik, Thanks for your response. I would be grateful if you could explain this in more detail, as I am not sure whether I have understood it correctly. A pointer to some resource would also be very helpful. My goal is to understand the caching strategy. How w

Re: Span queries and complex scoring

2007-09-17 Thread Chris Hostetter
: Lucene has historically taken the exact opposite approach... open up the API : as needed. Unless there is very good reason for it, classes and data should : be kept private. Just to clarify the reasoning behind this: once something is made public, the API has to be supported in perpetuity --

Re: Span queries and complex scoring

2007-09-17 Thread Grant Ingersoll
Cedric, On Sep 17, 2007, at 11:54 AM, Erik Hatcher wrote: On Sep 17, 2007, at 6:51 AM, melix wrote: I've faced the very same problem with NearSpansOrdered and so on. I think unless there is a very good reason for it, classes should be made public, this would at least make the "delegate" d

Re: Caching in lucene

2007-09-17 Thread Yonik Seeley
On 9/17/07, Shailendra Mudgal <[EMAIL PROTECTED]> wrote: One thing that I understand about IndexReader is that results come back faster for subsequent queries, because the IndexReader needs to be warmed up first. Based on this, I am trying to find answers to the following questions: - is there any

Re: Span queries and complex scoring

2007-09-17 Thread Erik Hatcher
On Sep 17, 2007, at 6:51 AM, melix wrote: I've faced the very same problem with NearSpansOrdered and so on. I think unless there is a very good reason for it, classes should be made public, this would at least make the "delegate" design pattern available. Lucene has historically taken the

Re: a query for a special AND?

2007-09-17 Thread Paul Elschot
On Monday 17 September 2007 11:40, Mohammad Norouzi wrote: Hi, I have a problem getting correct results from Lucene. Consider an index containing documents with fields "field1" and "field2" etc. Now I want to have documents in which their field1 are equal one by one and their fie

Re: How to tokenize with comma in standard tokenizer

2007-09-17 Thread Mark Miller
Take the comma out of: | <#P: ("_"|"-"|"/"|"."|",") > in the .jj file (around line 92). Keep in mind that this will affect your ability to find tokens that were previously indexed with the comma there (obviously). I believe the javacc target in the build file will rebuild...you need to get javac
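As a rough illustration of the outcome being asked for in this thread, the sketch below splits the sample string on commas and lowercases each piece. It is plain Java and does not use or modify StandardTokenizer; the class name `CommaSplitSketch`, the method `tokenize`, and the lowercasing step are assumptions made for the example, not part of Lucene or of the reply above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-alone sketch: NOT Lucene's StandardTokenizer.
// It only shows the effect of treating "," as a token separator.
public class CommaSplitSketch {

    // Splits on commas, trims whitespace, lowercases (an assumption),
    // and drops empty pieces.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split(",")) {
            String token = part.trim().toLowerCase(Locale.ROOT);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [thering6, proposal6, guyandgirl6]
        System.out.println(tokenize("TheRing6,Proposal6,GuyandGirl6"));
    }
}
```

Changing the grammar as described in the reply affects every field analyzed with that tokenizer, so existing indexes would likely need to be rebuilt for old and new tokens to match.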

How to tokenize with comma in standard tokenizer

2007-09-17 Thread Bhavin Pandya
Hi, The standard tokenizer works pretty well for me, but I found one problem with my usage. I want to tokenize "TheRing6,Proposal6,GuyandGirl6" as three separate tokens, while the standard analyzer treats it as one word because the token contains a digit. Expected three tokens: 1. ther

Re: Java Heap Space -Out Of Memory Error

2007-09-17 Thread testn
As I mentioned, IndexReader is the one that holds the memory. You should explicitly close the underlying IndexReader to make sure that the reader releases the memory. Sebastin wrote: Hi testn, Every IndexFolder is 1.5 GB in size, even though when I used to open and
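The advice to close the reader explicitly can be sketched with a try/finally pattern. The example below uses a hypothetical stand-in class (`StandInReader`) instead of Lucene's IndexReader so that it runs without any dependencies; only the close-in-finally shape is the point, and every name in it is an assumption made for illustration.

```java
import java.io.Closeable;

// Hypothetical sketch: StandInReader stands in for a reader that holds
// memory until it is closed. It is NOT Lucene's IndexReader.
public class CloseReaderSketch {

    static class StandInReader implements Closeable {
        private boolean open = true;

        // Pretend search; fails if the reader was already closed.
        int search(String query) {
            if (!open) {
                throw new IllegalStateException("reader is closed");
            }
            return query.length();
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false; // a real reader would release its resources here
        }
    }

    // Opens a reader, uses it, and always closes it, even if search throws.
    static StandInReader useAndClose(String query) {
        StandInReader reader = new StandInReader();
        try {
            reader.search(query);
        } finally {
            reader.close();
        }
        return reader;
    }

    public static void main(String[] args) {
        StandInReader reader = useAndClose("lucene");
        System.out.println("open after use: " + reader.isOpen());
    }
}
```

With a real reader the same shape applies: whoever opens the reader is responsible for closing it once no searcher is using it any more.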

Caching in lucene

2007-09-17 Thread Shailendra Mudgal
Hi All, One thing that I understand about IndexReader is that results come back faster for subsequent queries, because the IndexReader needs to be warmed up first. Based on this, I am trying to find answers to the following questions: - is there any caching done in Lucene for search? - if yes, is it fo

Re: Span queries and complex scoring

2007-09-17 Thread melix
Thanks Paul. I'm doing something very similar, but I'd like to point out that it is very hard to "extend" Lucene without breaking compatibility. I mean I have written classes that extend SpanQueries, but, for example, and for reasons I don't understand, classes like "BooleanWeight" are package protect

Sort on ParallelMultiSearcher with remote searchables

2007-09-17 Thread Amadeous
I want to sort the results of a query according to a specific field (date). I have a ParallelMultiSearcher with some underlying remote searchers. The indexes in the remote searchables are large, and enabling sort will make searching slow. As I understood from previous threads, a solution is: searching without so

a query for a special AND?

2007-09-17 Thread Mohammad Norouzi
Hi, I have a problem getting correct results from Lucene. Consider an index containing documents with fields "field1" and "field2" etc. Now I want to have documents in which their field1 are equal one by one and their field2 with two different values. To clarify, consider I have this query: