RE: Relevance Feedback Lucene+Algorithms

2006-02-14 Thread Koji Sekiguchi
Please check Grant Ingersoll's presentation at ApacheCon 2005. He put out great demo programs for relevance feedback using Lucene. Thank you, Koji

Relevance Feedback Lucene+Algorithms

2006-02-14 Thread varun sood
Hi, Can anyone share their experience of how to implement Relevance Feedback in Lucene? Can someone suggest some algorithms and papers that could help me build an effective Relevance Feedback system? Thanks in advance. Dexter.
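
No code appears in this thread preview; as a rough starting point only, a Rocchio-flavoured sketch against the Lucene 1.9-era term-vector API might look like the following. The class name, frequency threshold and boost are illustrative assumptions, not anything suggested in the thread, and the field must have been indexed with term vectors.

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermFreqVector;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Expands the original query with frequent terms taken from a document the
// user judged relevant (a crude stand-in for proper Rocchio term weighting).
public class FeedbackQueryBuilder {
  public static Query expand(Query original, IndexReader reader,
                             int relevantDocId, String field) throws Exception {
    BooleanQuery expanded = new BooleanQuery();
    expanded.add(original, BooleanClause.Occur.SHOULD);
    TermFreqVector tfv = reader.getTermFreqVector(relevantDocId, field);
    if (tfv != null) {
      String[] terms = tfv.getTerms();
      int[] freqs = tfv.getTermFrequencies();
      for (int i = 0; i < terms.length; i++) {
        if (freqs[i] > 2) {                   // simple cutoff instead of real Rocchio weights
          TermQuery tq = new TermQuery(new Term(field, terms[i]));
          tq.setBoost(0.25f);                 // down-weight the feedback terms
          expanded.add(tq, BooleanClause.Occur.SHOULD);
        }
      }
    }
    return expanded;
  }
}

The classic references for this family of techniques are Rocchio relevance feedback and its pseudo-relevance-feedback variants.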

Re: Help with mass delete from large index

2006-02-14 Thread Greg Gershman
I tried the same operation with the nightly 1.9 build and it worked fine: no NPEs during the deletes, and after optimization, search worked fine. I did a little debugging; a call to getField returned null, so I think more than just the Term value was missing. As the error only

Re: Size + memory restrictions

2006-02-14 Thread Greg Gershman
You may consider incrementally adding documents to your index; I'm not sure why there would be problems adding to an existing index, but you can always add additional documents. You can optimize later to get everything back into a single segment. Querying is a different story; if you are using th
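
A short sketch of the incremental add plus later optimize that Greg describes, assuming the Lucene 1.9 IndexWriter API; the path and field name are made up:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class IncrementalAdd {
  public static void main(String[] args) throws Exception {
    // create=false: open the existing index for appending
    IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
    Document doc = new Document();
    doc.add(new Field("contents", "new document text", Field.Store.NO, Field.Index.TOKENIZED));
    writer.addDocument(doc);
    writer.optimize();   // merge everything back into a single segment
    writer.close();
  }
}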

Re: Size + memory restrictions

2006-02-14 Thread Daniel Naber
On Tuesday, 14 February 2006 19:38, Eugene Tuan wrote: > Yes. We have the same problem. It is mainly because TermInfosReader.java takes memory space to keep *.tii. In Lucene 1.9 you can change that using IndexWriter.setTermIndexInterval(). Regards Daniel -- http://www.danielnaber.de --
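
A minimal sketch of the setting Daniel mentions, assuming a Lucene 1.9 IndexWriter; the path and the interval value are illustrative, not recommendations from the thread:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class TermIndexIntervalExample {
  public static void main(String[] args) throws Exception {
    IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
    // Default is 128; a larger interval shrinks the in-memory term index (*.tii)
    // loaded by TermInfosReader, at the cost of slower term lookups.
    writer.setTermIndexInterval(256);
    // ... add documents ...
    writer.close();
  }
}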

Re: QueryParser behaviour ..

2006-02-14 Thread Chris Hostetter
: > The key to understanding why that resulted in a phrase query instead of three term queries is that QueryParser doesn't treat comma as a special character, so it saw the string word1,word2,word3 and gave it to your analyzer. Since your analyzer gave back several tokens, QueryParser bui
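
A small standalone sketch (not from the thread) that reproduces the behaviour Chris describes, using Lucene 1.9's QueryParser and StandardAnalyzer; the field name is arbitrary:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class CommaQueryDemo {
  public static void main(String[] args) throws Exception {
    QueryParser parser = new QueryParser("field", new StandardAnalyzer());
    Query q1 = parser.parse("word1,word2,word3");
    Query q2 = parser.parse("word1 word2 word3");
    System.out.println(q1); // field:"word1 word2 word3"  (a phrase query)
    System.out.println(q2); // field:word1 field:word2 field:word3
  }
}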

Re: Size + memory restrictions

2006-02-14 Thread Leon Chaddock
Eugene or anyone else, do you know of any solutions? At the moment we have 4 GB assigned to the JVM, but we can only query 4 of our 4 GB segments; if we try to query against more, we get memory problems. Thanks Leon

RE: Size + memory restrictions

2006-02-14 Thread Eugene Tuan
Yes. We have the same problem. It is mainly because TermInfosReader.java takes memory space to keep *.tii. Eugene

Size + memory restrictions

2006-02-14 Thread Leon Chaddock
Hi, we are having tremendous problems building a large Lucene index and querying it. The programmers are telling me that when the index reaches 3.5 GB or 5 million docs, the index file can no longer grow any larger. To rectify this they have built index files in multiple directories. Now
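
The preview does not show how the multiple directories are searched; a minimal sketch, assuming Lucene 1.9's MultiSearcher and invented paths, would combine them into one logical index:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;

public class MultiIndexSearch {
  public static void main(String[] args) throws Exception {
    Searchable[] shards = {
      new IndexSearcher("/data/index1"),
      new IndexSearcher("/data/index2"),
      new IndexSearcher("/data/index3")
    };
    MultiSearcher searcher = new MultiSearcher(shards);
    Query q = new QueryParser("contents", new StandardAnalyzer()).parse("lucene");
    Hits hits = searcher.search(q);
    System.out.println(hits.length() + " matching documents");
    searcher.close();
  }
}

Note that each underlying IndexSearcher still loads its own term index (*.tii) into memory, so this alone does not reduce the footprint discussed elsewhere in the thread.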

Re: Word files & Build vs. Buy?

2006-02-14 Thread Nick Burch
On Thu, 9 Feb 2006, Christiaan Fluit wrote: > Yes, that's exactly what I'm doing. Having this in POI would benefit me a lot though, as I hardly understand the POI basics, to be honest (my fault, not POI's). OK, that's now in POI (you'll need a scratchpad build from late yesterday or today, see h
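
As a rough illustration only: assuming the HWPF WordExtractor class shipped in POI scratchpad builds, extracting the plain text of a Word file for indexing might look like this (the path is made up):

import java.io.FileInputStream;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class WordTextExtraction {
  public static void main(String[] args) throws Exception {
    FileInputStream in = new FileInputStream("/path/to/document.doc");
    WordExtractor extractor = new WordExtractor(in);
    String text = extractor.getText();   // plain text, ready to feed into a Lucene field
    System.out.println(text);
    in.close();
  }
}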

Re: QueryParser behaviour ..

2006-02-14 Thread sergiu gordea
Chris Hostetter wrote: : I built a wrong query string "word1,word2,word3" instead of "word1 word2 word3", therefore I got a wrong query: field:"word1 word2 word3" instead of field:word1 field:word2 field:word3. : Is this an expected behaviour? : I used Standard analyzer, probably the

Re: When do files in 'deleteable' get deleted?

2006-02-14 Thread Volodymyr Bychkoviak
Lucene tries to delete the 'deletable' files every time the index is modified. The reason the files can't be deleted is that they are in use (they are open). (This applies mainly to Windows; Linux allows deleting files that are in use...) If your 'deletable' files don't get deleted
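
A minimal sketch of the point above, assuming Lucene 1.9 and made-up paths: close any reader or searcher that still holds the old segment files open, and the pending deletes can succeed on the next index modification:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.IndexSearcher;

public class ReleaseOldFiles {
  public static void main(String[] args) throws Exception {
    IndexSearcher searcher = new IndexSearcher("/path/to/index");
    // ... run queries ...
    searcher.close();  // release file handles to the old segments (important on Windows)

    // The next modification lets Lucene retry the deletes listed in 'deletable'.
    IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
    writer.optimize();
    writer.close();
  }
}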

re: unwanted processing of keyword searches

2006-02-14 Thread Joerg Erdmenger
OK, I just reread the overview of the query language (http://lucene.apache.org/java/docs/queryparsersyntax.html), and I guess it states exactly where I misunderstood things. Sorry for the noise. Jörg

unwanted processing of keyword searches

2006-02-14 Thread Joerg Erdmenger
Hi, I have a little problem, and I'm not sure whether it is with Lucene or with me not understanding things correctly. I have built a search tool. I use a custom analyzer that is very simple and just chains some of the standard filters, like this: TokenStream result = new StandardTokenizer(reader); result = new StandardF
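
For completeness, a self-contained analyzer of the kind described might look like the following, assuming the Lucene 1.9-era API; the filters chained after StandardFilter are an assumption, since the message is truncated:

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class SimpleChainAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    TokenStream result = new StandardTokenizer(reader); // split on standard token boundaries
    result = new StandardFilter(result);                // strip apostrophes and acronym dots
    result = new LowerCaseFilter(result);               // normalize case
    return result;
  }
}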