FieldCacheImpl caveat

2008-04-07 Thread Oliver Flege
An undocumented feature of FieldCacheImpl led to an OutOfMemoryError in our application, and knowing about it might be of general interest: We are using Lucene 2.3.1 and our index (3m documents) is updated 3 times per day. On every update, we create a new instance of a Searcher class, which

New binary distribution of Oracle-Lucene integration

2008-04-07 Thread Marcelo Ochoa
Hi all: I just released a new version of the Oracle-Lucene integration, implemented as a Domain Index. The binary distribution has a very straightforward installation and testing step; downloads are at the SF.net web site:

Indexing and Searching from within a single Document

2008-04-07 Thread syedfa
Dear Fellow Java/Lucene developers: I am writing an application where a user is able to search for keywords from within a single book. When the user conducts a search, he/she should receive a set of results that show the sentence/phrase within the book where the keyword is found.

RE: Indexing and Searching from within a single Document

2008-04-07 Thread jing_gao
Hi, I have a similar question. I have not heard back from anyone yet. Dear Lucene experts, I'm currently evaluating options for our search tool. The need is: I have millions of entries in a database, and each entry is in such a format (more or less): ID Name Description start (number)

Dataset to test lucene

2008-04-07 Thread sumittyagi
Hi, I need a dataset of HTML pages to test my Lucene programs. Where can I download one?

RE: Indexing and Searching from within a single Document

2008-04-07 Thread jing_gao
Hi, Some indexing tools give configurable options: you can use separators in a single document (such as //, %%%), and the indexing engine would treat each block as a separate document. Does Lucene have this type of functionality? Thanks! Jing -Original Message- From: syedfa

Re: Indexing and Searching from within a single Document

2008-04-07 Thread Karl Wettin
You want to break down your book into multiple documents, perhaps one per paragraph or so? I hope this helps. karl syedfa wrote: Dear Fellow Java/Lucene developers: I am writing an application where a user is able to search for keywords from within a single book. When the user conducts
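
One way to read Karl's suggestion is to split the book text on blank lines and index each paragraph as its own document. The sketch below covers only the splitting step in plain Java. `ParagraphSplitter` and `split` are hypothetical names, not Lucene API, and treating a blank line as the paragraph boundary is an assumption about how the book text is formatted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: cuts a book into paragraph-sized units.
class ParagraphSplitter {

    /**
     * Splits the text wherever two line breaks are separated only by whitespace
     * (that is, at blank lines). Each returned paragraph would then be added
     * to the index as a separate document.
     */
    static List<String> split(String bookText) {
        List<String> paragraphs = new ArrayList<>();
        for (String chunk : bookText.split("\\R\\s*\\R")) {
            String trimmed = chunk.trim();
            if (!trimmed.isEmpty()) {
                paragraphs.add(trimmed);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String book = "First paragraph.\n\nSecond paragraph,\nstill second.\n\n\nThird.";
        List<String> parts = split(book);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println(i + ": " + parts.get(i));
        }
    }
}
```

Storing the paragraph number (or a page reference) alongside each paragraph would let a search hit point back to its place in the book.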

Re: Dataset to test lucene

2008-04-07 Thread Karl Wettin
sumittyagi wrote: hi, i need a dataset having html pages to test my lucene programs... from where can i download it.. Is there a specific data set you are looking for, or would any do? How about you download Wikipedia? karl

Appending to index

2008-04-07 Thread Nitasha Walia (niwalia)
Hi, I am a new user of Java Lucene. By default, a new index is created every time, which requires me to delete the existing index folder. I want to append to the existing index. Can someone please guide me on how to do that? Thanks, Nitasha Walia Software Engineer,

Re: Appending to index

2008-04-07 Thread Michael Wechner
Nitasha Walia (niwalia) wrote: Hi, I am a new user of Java Lucene. By default, a new index is created every time, which requires me to delete the existing index folder. I want to append to the existing index. Can someone please guide me on how to do that?

Why Lucene has to rewrite queries prior to actual searching?

2008-04-07 Thread Itamar Syn-Hershko
Hi all, Can one of the experts here explain why Lucene has to get a rewritten query for the Searcher, so that Phrase or Wildcard queries have to rewrite themselves into a primitive query, which is then passed to Lucene to look for? I'm probably not too familiar with the internals of

Re: Why Lucene has to rewrite queries prior to actual searching?

2008-04-07 Thread Paul Elschot
Itamar, Query rewrite replaces wildcards with terms available from the index. Usually that involves replacing a wildcard with a BooleanQuery that is an effective OR over the available terms while using a flat coordination factor, i.e. it does not matter how many of the available terms actually
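
As a toy illustration of the expansion Paul describes (not Lucene's actual implementation), a prefix such as `luc*` can be replaced by the terms the index actually contains, OR-ed together. `PrefixRewriteSketch`, `expandPrefix`, `asBooleanOr`, and the sample terms are all hypothetical; a sorted set stands in for the index's term dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model of query rewriting; none of these names are Lucene API.
class PrefixRewriteSketch {

    /** Collects every term in the (sorted) term dictionary that starts with the prefix. */
    static List<String> expandPrefix(SortedSet<String> indexTerms, String prefix) {
        List<String> clauses = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, so start at the
        // first term >= prefix and stop at the first term that no longer matches.
        for (String term : indexTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Renders the expanded terms as a flat OR, one clause per term. */
    static String asBooleanOr(List<String> clauses) {
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        SortedSet<String> terms =
                new TreeSet<>(Arrays.asList("lucene", "lucid", "luck", "lunar", "search"));
        System.out.println(asBooleanOr(expandPrefix(terms, "luc")));
    }
}
```

Because every matching term becomes one clause, a broad prefix over a large index produces a correspondingly long OR list.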

Re: Why Lucene has to rewrite queries prior to actual searching?

2008-04-07 Thread John Wang
Another use is for custom Query objects to reboost or expand the user query from information gathered from the IndexReader at search time. -John On Mon, Apr 7, 2008 at 2:56 PM, Paul Elschot wrote: Itamar, Query rewrite replaces wildcards with terms available from the index.

RE: Why Lucene has to rewrite queries prior to actual searching?

2008-04-07 Thread Itamar Syn-Hershko
Paul and John, Thanks for your quick reply. The problem with query rewriting is the aforementioned MaxClauseException. Instead of inflating the query and passing a deterministic list of terms to the actual search routine, Lucene could have accessed the vectors in the index using some sort of

RE: Appending to index

2008-04-07 Thread Nitasha Walia (niwalia)
Hi, I am sorry, I don't quite understand what you meant by.. IndexWriter.updateDocument(...) HTH Let me re-phrase my question: I need to append to an existing index. Presently, the code is structured to check for the existing file, and exit if the file exists: if(INDEX_FILE.exists()) {

Re: Why Lucene has to rewrite queries prior to actual searching?

2008-04-07 Thread Paul Elschot
Itamar, Have a look here: http://lucene.apache.org/java/2_3_1/scoring.html Regards, Paul Elschot On Tuesday 08 April 2008 00:34:48, Itamar Syn-Hershko wrote: Paul and John, Thanks for your quick reply. The problem with query rewriting is the aforementioned MaxClauseException. Instead of

RE: Appending to index

2008-04-07 Thread Anshum Gupta
Hi, You could try opening the IndexWriter with the 3rd argument as false. The 3rd argument specifies whether a 'new' index has to be created or not. Something like IndexWriter write = new IndexWriter(INDEX_FILE, new StandardAnalyzer(), false); //Notice the 3rd argument here I guess this should