Re: UNIX command-line indexing script?

2004-03-15 Thread Erik Hatcher
Have a look at the Ant index task in the Lucene sandbox. You're on your own, currently, to build this and understand it, but I use it frequently. In fact, the sample index from our book is generated with this: index index=${build.dir}/index

Re: Reader Text input as field for HTML data text leading to null retrieval

2004-03-15 Thread Otis Gospodnetic
Re-directing this message to lucene-user list. That is the correct behaviour. Use http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/document/Field.html#Text(java.lang.String,%20java.lang.String) if you want to be able to retrieve the original value of the indexed text. Otis ---

Re: UNIX command-line indexing script?

2004-03-15 Thread Otis Gospodnetic
To add to this. The upcoming Lucene in Action book has ready to use code that will handle and index files in most popular file formats. Otis --- Erik Hatcher [EMAIL PROTECTED] wrote: Have a look at the Ant index task in the Lucene sandbox. You're on your own, currently, to build this and

Re: UNIX command-line indexing script?

2004-03-15 Thread Charlie Smith
So, how upcoming is this book going to be? [EMAIL PROTECTED] 3/15/2004 3:39:39 AM To add to this. The upcoming Lucene in Action book has ready to use code that will handle and index files in most popular file formats. Otis --- Erik Hatcher [EMAIL PROTECTED] wrote: Have a look at the Ant

java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe
I am using Lucene 1.3 final and am having an error that I can't seem to shake. Basically, I am updating a Document in the index incrementally by calling an IndexReader to remove the document. This works. Then, I close the IndexReader with the following code: reader.unlock(reader.directory());

Re: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Otis Gospodnetic
There is no need for that .unlock call; just call .close(). Otis --- Gabe [EMAIL PROTECTED] wrote: I am using Lucene 1.3 final and am having an error that I can't seem to shake. Basically, I am updating a Document in the index incrementally by calling an IndexReader to remove the document. This

Re: UNIX command-line indexing script?

2004-03-15 Thread Otis Gospodnetic
Erik and I are putting finishing touches on it, so by Summer (this one ;)). Otis --- Charlie Smith [EMAIL PROTECTED] wrote: So, how upcoming is this book going to be? [EMAIL PROTECTED] 3/15/2004 3:39:39 AM To add to this. The upcoming Lucene in Action book has ready to use code that will

Re: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe
Otis, I only put the unlock call in because I had the error in the first place. Removing it, the IOException still occurs, when trying to instantiate the IndexWriter. Thanks, Gabe --- Otis Gospodnetic [EMAIL PROTECTED] wrote: There is no need for that .unlock call, just .close() Otis

RE: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Nguyen, Tri (NIH/NLM/LHC)
Did you close your writer if an Exception occurred? I had a similar problem, but it was fixed when I closed the writer in the finally block. Below is my original code (which generates java.io.IOException: Lock obtain timed out when an Exception is thrown) public static void index(File indexDir,
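The close-in-finally fix described above can be sketched as follows. This is a minimal illustration, not the poster's code (which is truncated in the archive): DemoWriter is a hypothetical stand-in that only mimics a writer holding an exclusive lock, so the example runs without Lucene on the classpath. With a real IndexWriter the same try/finally shape would presumably apply.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a writer that holds an exclusive write lock.
// This is NOT Lucene's IndexWriter; it only mimics "one open writer at a time".
class DemoWriter {
    private static final AtomicBoolean LOCKED = new AtomicBoolean(false);

    DemoWriter() throws IOException {
        // Fail the way a second writer would if the lock were still held.
        if (!LOCKED.compareAndSet(false, true)) {
            throw new IOException("Lock obtain timed out");
        }
    }

    void add(String doc) throws IOException {
        if (doc == null) {
            throw new IOException("cannot index a null document");
        }
    }

    void close() {
        LOCKED.set(false); // release the lock
    }

    static boolean isLocked() {
        return LOCKED.get();
    }
}

class SafeIndexing {
    // Adds every document; the writer is closed even when add() throws.
    static void index(String[] docs) throws IOException {
        DemoWriter writer = new DemoWriter();
        try {
            for (String doc : docs) {
                writer.add(doc);
            }
        } finally {
            writer.close();
        }
    }

    public static void main(String[] args) throws IOException {
        try {
            index(new String[] {"a", null}); // second document triggers a failure
        } catch (IOException e) {
            System.out.println("indexing failed: " + e.getMessage());
        }
        index(new String[] {"b"}); // works, because the lock was released above
        System.out.println("lock held after second run: " + DemoWriter.isLocked());
    }
}
```

Without the finally block, the failed first run would leave the lock held and the second call would fail in the constructor, which matches the symptom reported in this thread.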

RE: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe
I notice in your catch clause you always set the writer's create flag to true (i.e. new IndexWriter(INDEX_DIR, analyzer, true)). If I am not mistaken reading the docs, this overwrites the entire index, no? That is why I was setting that variable to false when doing an incremental update. When I reindex

RE: java.io.IOException: Lock obtain timed out

2004-03-15 Thread Gabe
I figured it out: an errant open IndexWriter. --- Nguyen, Tri (NIH/NLM/LHC) [EMAIL PROTECTED] wrote: Did you close your writer if an Exception occurred? I had a similar problem, but it was fixed when I closed the writer in the finally block. Below is my original code (which generates

Can Lucene index both Big5 and GB2312 encoded characters?

2004-03-15 Thread Tuan Jean Tee
I have both Big5 and GB2312 encoded HTML files in two separate directories. When I build the index, is Lucene able to distinguish the character sets, or does Lucene only work with a single encoding? Thank you.
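One way to approach this question, sketched here under stated assumptions: Lucene fields take a Java String or Reader, so the byte encoding is resolved when each file is read, before the text reaches the index. The example below uses only the JDK and decodes the same text from Big5 and GB2312 bytes by naming the charset explicitly. The class name CharsetDecoding and the helper readAll are hypothetical, and both charsets are assumed to be available in the running JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class CharsetDecoding {
    // Reads the whole stream, decoding bytes with the explicitly named charset.
    static String readAll(InputStream in, String charsetName) throws IOException {
        Reader reader = new InputStreamReader(in, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        try {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } finally {
            reader.close();
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two characters that exist in both Big5 and GB2312.
        String text = "\u4e2d\u6587";

        // Simulate the contents of one file from each directory.
        byte[] big5Bytes = text.getBytes(Charset.forName("Big5"));
        byte[] gbBytes = text.getBytes(Charset.forName("GB2312"));

        // Decode each with the charset that matches its directory.
        String fromBig5 = readAll(new ByteArrayInputStream(big5Bytes), "Big5");
        String fromGb = readAll(new ByteArrayInputStream(gbBytes), "GB2312");

        System.out.println("Big5 round trip matches:   " + fromBig5.equals(text));
        System.out.println("GB2312 round trip matches: " + fromGb.equals(text));
        System.out.println("decoded strings are equal: " + fromBig5.equals(fromGb));
    }
}
```

Since the two encodings live in separate directories here, the indexing code could pick the charset name per directory and hand the decoded text to the index; both sets of documents would then be searchable with the same Unicode query terms.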