Warmup queries

2009-07-02 Thread Ganesh
Whenever I reopen my index, I run some warm-up queries. I have a few fields that are used as filters and a few others that are used in a BooleanQuery. Currently I am executing warm-up queries only for the fields that are part of the Query, not for those that are part of the Filter. My question is whether warm-up queries should incl

Re: Scaling out/up or a mix

2009-07-02 Thread Otis Gospodnetic
Disclaimer: I only skimmed the thread. RAM. If you can get the OS to buffer hot pages of your index you'll be good. The more the better, the faster the queries. More cores/CPUs means more concurrency, and if things are fast because the data is cached, it means you need fewer CPUs/cores. O

Re: Highlighter fails using JapaneseAnalyzer

2009-07-02 Thread k.sayama
Hi, the Tokenizer is not a standard Lucene class, but to acquire startOffset and endOffset correctly I edited the Tokenizer. It is operating correctly now. I want to verify more patterns. Thanks. - Original Message - From: "Mark Harwood" To: Sent: Thursday, July 02, 2009 6:25 AM Subject:

Re: Highlighter fails using JapaneseAnalyzer

2009-07-02 Thread Matthew Hall
Out of curiosity, when you try your other test string "aaa _bbb ccc", what do the token byte offsets show? Matt Mark Harwood wrote: On 1 Jul 2009, at 17:39, k.sayama wrote: I could verify the Token byte offsets. The system outputs aaa:0:3 bbb:0:3 ccc:4:7. That explains the highlighter behaviou
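The reported dump can be checked mechanically: bbb:0:3 starts before the previous token (aaa:0:3) has ended. The following is a minimal Python sketch of such a check, not Lucene code; `parse_token_dump` and `overlapping_tokens` are hypothetical helper names, and the overlap rule is only a heuristic, since some analyzers emit tokens that legitimately share offsets (for example synonyms).

```python
def parse_token_dump(dump):
    """Parse whitespace-separated 'term:start:end' entries into tuples."""
    tokens = []
    for entry in dump.split():
        # Split from the right so a term containing ':' is still handled.
        term, start, end = entry.rsplit(":", 2)
        tokens.append((term, int(start), int(end)))
    return tokens


def overlapping_tokens(tokens):
    """Return the terms whose start offset lies before the end of an earlier token."""
    flagged = []
    prev_end = 0
    for term, start, end in tokens:
        if start < prev_end:
            flagged.append(term)
        prev_end = max(prev_end, end)
    return flagged


# The offsets quoted in the message above.
reported = parse_token_dump("aaa:0:3 bbb:0:3 ccc:4:7")
print(overlapping_tokens(reported))  # prints ['bbb']
```

A highlighter that trusts these offsets would mark the characters 0 to 3 for a hit on "bbb", which is consistent with the misplaced highlighting discussed in the thread.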

Re: Term Frequency vector consumes memory

2009-07-02 Thread Grant Ingersoll
On Jul 1, 2009, at 1:39 AM, Ganesh wrote: Thanks for your reply. My requirement is to fetch the list of top-frequency terms indexed in a day. I used the logic described in the article (see the link below) http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-i
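To make the requirement concrete, here is a minimal Python sketch of "top terms for one day's documents". It is independent of Lucene and is not the approach from the linked article; the whitespace tokenization, the `top_terms` helper, and the sample documents are all assumptions made for illustration.

```python
from collections import Counter


def top_terms(documents, n):
    """Count total term occurrences across the documents and return the n most frequent."""
    counts = Counter()
    for text in documents:
        # Assumption: lowercase + whitespace split stands in for a real analyzer.
        counts.update(text.lower().split())
    return counts.most_common(n)


# Hypothetical documents indexed on a single day.
docs_for_day = ["lucene index search", "index writer lock", "search index"]
print(top_terms(docs_for_day, 2))  # prints [('index', 3), ('search', 2)]
```

Holding one counter for the whole day is cheap for small batches; the memory concern raised in the thread subject comes from keeping per-document term vectors, which this sketch does not model.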

Re: A simple Vector Space Model and TFIDF usage

2009-07-02 Thread Kamal Najib
Hello Amir, As far as I understand, you have two sets of documents, let us say set1 and set2. If you want to get the similarity between the documents of the two sets, you have to index the docs of one set and search with each doc of the other set as a query; then you can get the similarity of the two documents. So:
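As an illustration of the idea only (not the steps the original message goes on to list, and not Lucene's scoring formula), the following Python sketch builds plain TF-IDF vectors and compares each document of one set against the documents of the other with cosine similarity. The helper names, the IDF variant (log(N/df) + 1), and the sample texts are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * (log(N/df) + 1)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency: count each term once per doc
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * (math.log(n_docs / df[t]) + 1.0) for t in tf})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample data: set1 plays the role of the index, set2 the queries.
set1 = ["lucene index writer", "japanese analyzer tokens"]
set2 = ["index writer lock", "analyzer offsets"]

vectors = tfidf_vectors(set1 + set2)
index_vecs, query_vecs = vectors[:len(set1)], vectors[len(set1):]
for query_text, query_vec in zip(set2, query_vecs):
    scores = [cosine(query_vec, doc_vec) for doc_vec in index_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    print(query_text, "->", set1[best])
```

Computing the IDF over both sets together is a design choice made here for simplicity; a search engine would normally take document frequencies from the indexed set only.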

Re: IndexWriter

2009-07-02 Thread Simon Willnauer
I don't know about your setup, but you should do it before Spring creates your IndexWriter. You could use a wrapper for that IndexWriter to unlock ahead of creating the delegate, or rather have some startup listener which checks whether it is locked and, in turn, a shutdown listener which closes the wri

Re: IndexWriter

2009-07-02 Thread Amin Mohammed-Coleman
OK. My index writers are configured using Spring. So basically I need a Spring application listener that checks on startup whether the directory is locked and, if it is, unlocks it. On application shutdown I have a listener that unlocks the directory if it is locked. Not sure if that made sense.

Re: IndexWriter

2009-07-02 Thread Simon Willnauer
Ganesh is right: you should check once, when your webapp is starting up, if you keep the writer open as long as your app is up and running. I just mentioned it to make you aware of it and prevent some surprises if the app crashes. Simon On Thu, Jul 2, 2009 at 9:03 AM, Ganesh wrote: > No. You should not do

Re: IndexWriter

2009-07-02 Thread Ganesh
No. You should not do this for every document you add or update. The first time, when you open your writer, if the directory is locked, it will throw LockObtainFailedException. In this case, unlock it and open the writer again. Regards Ganesh - Original Message - From: "Amin Mohammed-Col
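The control flow described in this message (try to open once, clear a stale lock on failure, open again) can be sketched as follows. This is a Python simulation, not Lucene code: `LockObtainFailed`, `FakeDirectory`, `FakeWriter`, and `open_writer_once` are hypothetical stand-ins for the real exception, directory, and writer.

```python
class LockObtainFailed(Exception):
    """Stand-in for Lucene's LockObtainFailedException."""


class FakeDirectory:
    """Stand-in for an index directory that may carry a stale write lock."""

    def __init__(self, locked=False):
        self.locked = locked


class FakeWriter:
    """Stand-in for a writer that takes the directory lock when opened."""

    def __init__(self, directory):
        if directory.locked:
            raise LockObtainFailed("directory is locked")
        directory.locked = True
        self.directory = directory

    def close(self):
        # Closing the writer releases the lock.
        self.directory.locked = False


def open_writer_once(directory):
    """Open the writer at startup; on a lock failure, unlock and retry exactly once."""
    try:
        return FakeWriter(directory)
    except LockObtainFailed:
        # Assumption: the lock is left over from a crash and no other
        # process is writing to the index, so clearing it is safe.
        directory.locked = False
        return FakeWriter(directory)


# Simulate a restart after a crash that left the lock behind.
stale = FakeDirectory(locked=True)
writer = open_writer_once(stale)
writer.close()
```

The check runs once when the writer is opened, not per document; forcing the unlock is only appropriate when a single application owns the index.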