Can someone please tell me how to cache search results in Lucene? I know the classes, but I don't know how to go about using them.
thanks,
Askar

On 7/24/07, Askar Zaidi <[EMAIL PROTECTED]> wrote:
>
> Thanks for the reply.
>
> I am timing the entire search process with a stop watch, a bit ghetto
> style. My getXXX methods are:
>
> Document doc = hits.doc(i);
> String str = doc.get("item");
>
> So you can see that I am retrieving the entire document in a search query.
> Ideally, I'd like to retrieve just the Field object that I want to run the
> search on. I know this will give me a boost, as one of my Fields is really
> huge.
>
> My query is selecting the entire user data-set in the database. I'd like
> to do some SQL-based search in the query too, so that I pick only those
> items where the phrase matches.
>
> Index contains about 650MB of data. Index file size is 14478869 bytes.
>
> thanks,
> AZ
>
>
> On 7/24/07, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> >
> > Where are you getting your numbers from? That is, where are your
> > timers? Are you timing the rs.next() loop, or the individual calls
> > to Lucene? What do the getXXXXX methods look like? How big are your
> > queries? How big is your index?
> >
> > Essentially, we need more info to really help you. From what I can
> > tell, you are generating 3 different Lucene queries for each record
> > in the database. Frankly, I'm surprised your slowdown is only linear.
> >
> > On Jul 24, 2007, at 4:31 PM, Askar Zaidi wrote:
> >
> > > I have 512MB RAM allocated to JVM Heap. If I double my system RAM
> > > from 768MB to say 2GB or so, and give JVM 1.5GB Heap space, will I
> > > get quicker results?
> > >
> > > Can I expect results which take 1 minute to be returned in 30
> > > seconds with more RAM? Should I also get a more powerful CPU? A real
> > > server class machine?
> > >
> > > I have also done some of the optimizations that are mentioned on
> > > the Lucene website.
> > >
> > > thanks,
> > > AZ
> > >
> > > On 7/24/07, Askar Zaidi < [EMAIL PROTECTED]> wrote:
> > >>
> > >> Hey Guys,
> > >>
> > >> I just finished up using Lucene in my application. I have data in a
> > >> database, so while indexing I extract this data from the database
> > >> and pump it into the index. Specifically, I have the following data
> > >> in the index:
> > >>
> > >> <itemID> <tags> <title> <summary> <contents>
> > >>
> > >> where itemID is just a number (primary key in the DB)
> > >> tags: text
> > >> title: text
> > >> summary: text
> > >> contents: Huge text (text extracted from files: pdfs, docs etc).
> > >>
> > >> Now while running a search query I realized that the response time
> > >> increases in a linear fashion as the number of <itemID> increases
> > >> in the DB.
> > >>
> > >> If I have 50 items, it's 8 seconds
> > >> 100 items, it's 17 seconds.
> > >> 300+ items, it's 60 seconds and maybe more.
> > >>
> > >> In a perfect world, I'd like to search on 300+ items within 10-15
> > >> seconds.
> > >> Can anyone give me tips to fine tune Lucene?
> > >>
> > >> Here's a code snippet:
> > >>
> > >> sql query = "SELECT itemID from items where creator = 'askar'";
> > >>
> > >> --execute query--
> > >>
> > >> while (rs.next()) {
> > >>
> > >> score = doTagSearch(askar,text,itemID);
> > >> scoreTitle = doTitleSearch(askar,text,itemID);
> > >> scoreSummary = doSummarySearch(askar,text,itemID);
> > >>
> > >> ----
> > >>
> > >> }
> > >>
> > >> So this code asks Lucene to search for the "text" in the itemID
> > >> passed. itemID is already indexed. The while loop will run 300 times
> > >> if there are 300 items... that gets slow... what can I do here?
> > >>
> > >> thanks for the replies,
> > >>
> > >> AZ
> > >>
> >
> > --------------------------
> > Grant Ingersoll
> > Center for Natural Language Processing
> > http://www.cnlp.org/tech/lucene.asp
> >
> > Read the Lucene Java FAQ at http://wiki.apache.org/lucene-java/LuceneFAQ
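
On the caching question at the top of this thread: one possible approach, independent of any Lucene class, is to keep the results of recent searches in a small in-memory LRU map keyed by the query text, so a repeated query does not reach the index again. The sketch below is only an illustration under assumptions. `QueryResultCache`, its `get` method, and the idea of storing a list of matching item IDs are all hypothetical names and choices, not part of Lucene or of the code posted above; the `searcher` function stands in for whatever actually runs the Lucene search.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache: query text -> item IDs that the search returned. */
final class QueryResultCache {
    private final Map<String, List<Integer>> cache;

    QueryResultCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<String, List<Integer>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<Integer>> eldest) {
                // Drop the least recently used entry once the cache is over capacity.
                return size() > maxEntries;
            }
        };
    }

    /**
     * Returns the cached result for the query, or runs the supplied search
     * (assumed to wrap the real Lucene call) and remembers its result.
     */
    synchronized List<Integer> get(String query, Function<String, List<Integer>> searcher) {
        List<Integer> hit = cache.get(query);
        if (hit == null) {
            // Store a read-only copy so callers cannot modify the cached list.
            hit = Collections.unmodifiableList(new ArrayList<>(searcher.apply(query)));
            cache.put(query, hit);
        }
        return hit;
    }

    /** Number of queries currently cached. */
    synchronized int size() {
        return cache.size();
    }
}
```

A cache like this only helps when the same query text is searched repeatedly, and its entries go stale whenever the index is updated, so it would need to be cleared after re-indexing.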