Re: Can changes on an index be visible to an open IndexSearcher without reopening it?

2007-12-10 Thread Michael McCandless
This is an excellent question. You are right, what we really need is efficient reopening of an IndexSearcher. Creating & warming a new IndexSearcher can be expensive due [at least] to populating the FieldCache. This has been discussed before, eg here: http://www.gossamer-threads.com/lists/

help required ... ~ operator

2007-12-10 Thread Shakti_Sareen
Hi all, I am using StandardAnalyzer() to index the data. I am getting false hits in ~ operator query. Actual data is: "signals by magnets of different strength" and when I am parsing a query: "signals strength"~2 , I am getting a hit which is a false result. I am using QueryParser. Please

Re: help required ... ~ operator

2007-12-10 Thread Erik Hatcher
On Dec 10, 2007, at 4:48 AM, Shakti_Sareen wrote: I am using StandardAnalyzer() to index the data. I am getting false hits in ~ operator query. Actual data is: "signals by magnets of different strength" and when I am parsing a query: "signals strength"~2 , I am getting a hit which is a
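One possible explanation for the hit, sketched under assumptions because the reply above is cut off: if the analyzer drops the stop words "by" and "of" and does not leave position gaps for them, "signals" and "strength" end up only two tokens apart, which a slop of 2 allows. The class below is a plain-Java stand-in, not Lucene code; `SlopSketch`, `dropStopWords`, `requiredSlop`, and the two-word stop list are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SlopSketch {
    // Assumption: only the two stop words that occur in the example text.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("by", "of"));

    /** Lowercases, splits on whitespace, and drops stop words without leaving position gaps. */
    static List<String> dropStopWords(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    /** Number of intervening tokens between two in-order terms, i.e. the smallest slop that matches. */
    static int requiredSlop(List<String> tokens, String first, String second) {
        int a = tokens.indexOf(first);
        int b = tokens.indexOf(second);
        if (a < 0 || b <= a) {
            throw new IllegalArgumentException("terms must both occur, in order");
        }
        return b - a - 1;
    }

    public static void main(String[] args) {
        String text = "signals by magnets of different strength";
        List<String> analyzed = dropStopWords(text);
        List<String> raw = Arrays.asList(text.split("\\s+"));
        System.out.println("slop needed after stop-word removal: "
                + requiredSlop(analyzed, "signals", "strength"));
        System.out.println("slop needed if every word kept its position: "
                + requiredSlop(raw, "signals", "strength"));
    }
}
```

Under these assumptions the query "signals strength"~2 matches the analyzed text, while it would not match if the removed words still counted toward the distance.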

Problem with termdocs.freq and other

2007-12-10 Thread chris.b
Here goes, I'm developing an application using lucene which will evaluate the representativeness of a list of keywords within a collection of documents. I'm doing this by indexing the documents and then, loading the list of keywords and using the IndexReader Class and DefaultSimilarity, retrieving

Re: Problem with termdocs.freq and other

2007-12-10 Thread Doron Cohen
> while (termDocs.next()) { > termDocs.next(); > } For one, this loop calls next() twice in each iteration, so every second is skipped... ? "chris.b" <[EMAIL PROTECTED]> wrote on 10/12/2007 12:58:15: > > Here goes, > I'm developing an application using lucene which
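The effect of the double advance can be reproduced without Lucene. The sketch below uses a hypothetical `Cursor` class as a stand-in for a TermDocs-style enumerator (a `next()` that advances and reports whether an entry exists, plus a `doc()` accessor); where the original loop actually read its values is not shown in the thread, so the read position in `visitSkipping` is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TermDocsLoopSketch {
    /** Hypothetical stand-in for a TermDocs-style cursor over document numbers. */
    static class Cursor {
        private final int[] docs;
        private int pos = -1;

        Cursor(int... docs) {
            this.docs = docs;
        }

        /** Advances to the next entry; returns false once the entries are exhausted. */
        boolean next() {
            return ++pos < docs.length;
        }

        int doc() {
            return docs[pos];
        }
    }

    /** Mirrors the quoted loop: the extra next() in the body skips every second entry. */
    static List<Integer> visitSkipping(Cursor cursor) {
        List<Integer> seen = new ArrayList<>();
        while (cursor.next()) {
            seen.add(cursor.doc());
            cursor.next(); // second advance per iteration: the following entry is never read
        }
        return seen;
    }

    /** Advances exactly once per iteration, in the loop condition only. */
    static List<Integer> visitAll(Cursor cursor) {
        List<Integer> seen = new ArrayList<>();
        while (cursor.next()) {
            seen.add(cursor.doc());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("double next(): " + visitSkipping(new Cursor(10, 11, 12, 13, 14)));
        System.out.println("single next(): " + visitAll(new Cursor(10, 11, 12, 13, 14)));
    }
}
```

Keeping the only call to `next()` in the loop condition is the usual shape for this kind of cursor, since the condition both advances and tests for exhaustion.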

Re: Problem with termdocs.freq and other

2007-12-10 Thread chris.b
Okay, now i feel real stupid :p Seen as that solved all my problems (i think), thank you very much, Chris Doron Cohen wrote: > >> while (termDocs.next()) { >> termDocs.next(); >> } > > For one, this loop calls next() twice in each iteration, > so every second is

Re: Problem with termdocs.freq and other

2007-12-10 Thread Doron Cohen
> Seen as that solved all my problems (i think), Glad it helped! (btw it's always like this with debugging - others see stuff in my code that I don't)

content depending Analyzing

2007-12-10 Thread Helmut Jarausch
Hi, I'm new to Lucene. I've seen similar questions to mine but didn't get an answer to my question: I'd like to index books from our library. Among other fields there are LANG, which contains a code specifying the language the book is written in, and TOC, the table of contents. When indexing I

RE: does the MultiSearcher class calculate IDF properly?

2007-12-10 Thread Seneviratne_Yasoja
Thank you for the response. I logged a bug https://issues.apache.org/jira/browse/LUCENE-1087 -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Friday, December 07, 2007 10:30 PM To: java-user@lucene.apache.org Subject: Re: does the MultiSearcher class calculate IDF

Re: content depending Analyzing

2007-12-10 Thread Daniel Naber
On Montag, 10. Dezember 2007, Helmut Jarausch wrote: > an Analyzer > implements a 'TokenStream(String fieldName, Reader reader)" > But for me that's too late. When tokenizing the TOC > field I would need access to the LANG field to decide > how to tokenize. IndexWriter contains an addDocument()

Out of memory?

2007-12-10 Thread Bob Daha
Hello, I'm building a ticketing system for my company and am using Lucene for some of the more complicated queries. I'd say my application differs from the typical lucene application in that my documents are (re)-indexed more frequently, the query load is actually relatively light, and most of

Re: Out of memory?

2007-12-10 Thread Chris Lu
Looks like you are using FieldCacheImpl to count search results for each category (also called facet search, to use a fancy name). Well, it's a cache, and the terms are loaded into memory.

Re: Out of memory?

2007-12-10 Thread Grant Ingersoll
Those terms could be coming from a lot of diff. places. What size JVM heap, etc. are you giving your application? My guess is your index is too big to fit into RAM, you probably need to use a FSDirectory or give it more RAM, but it is hard to say for sure w/o knowing more about the applica

Re: Out of memory?

2007-12-10 Thread Bob Daha
Interesting... I didn't explicitly turn that on... I'm creating a query using query parser, then executing a search using that query and a sort. Would this algorithmically somehow invoke the FieldCacheImpl? If so, is there any way I can turn it off? Basically the caching wouldn't be valuable anyway giv

Re: Out of memory?

2007-12-10 Thread Bob Daha
Thanks for the reply. The backup of the index on disc is around 80 MB. I'm giving my JVM 4 gigs of heap space... I probably should have mentioned in my first thread that this is not the first crash; it's done it a few times now and usually has almost a week of uptime before it crashes even wi

Re: Out of memory?

2007-12-10 Thread Chris Lu
I was wrong about the facet search. You are using sorting, which also uses FieldCacheImpl. There doesn't seem to be a way to turn it off, though.

Re: Out of memory?

2007-12-10 Thread Chris Lu
It's hard to debug from your description alone. But I think in general, you should close the reader and searcher after you update the index. BTW: If you are using a database, my software DBSight does this kind of pruning of old data and keeps the index up-to-date. No memory leaks etc. It's not related

Re: Applying SpellChecker to a phrase

2007-12-10 Thread Chris Hostetter
Isn't MultiPhraseQuery what is desired here? You can add Term[]s per position, and at least one term in each array must match. : > I was thinking of parsing the phrase query string into a : > sequence of terms, : > then constructing a phrase query object using add(Term term, : > int position) : >