RE: Lucene refresh index function (incremental indexing).

Tun Lin Tue, 25 Nov 2003 20:42:19 -0800

When I integrate with PDFBox, I cannot update, delete or change the filename
anymore. If I did any of the above, I will get a message: Lock obtain timed out.

Anyone can help? 

-----Original Message-----
From: Pleasant, Tracy [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, November 25, 2003 11:42 PM
To: Lucene Users List; [EMAIL PROTECTED]
Subject: RE: Lucene refresh index function (incremental indexing).

I was able to get PDFBox to work with my JSP webpages. 

I think you will have to in a way write your own code to do the PDF files (while
still calling the Lucene functions)

         doc = LucenePDFDocument.getDocument(file);

-----Original Message-----
From: Tun Lin [mailto:[EMAIL PROTECTED]
Sent: Monday, November 24, 2003 11:07 PM
To: 'Lucene Users List'
Subject: RE: Lucene refresh index function (incremental indexing).

Does it support indexing the contents of pdf files? I have found one project
called PDFBox that can be integrated with Lucene to search inside of the pdf
files. Currently, Lucene can only search for the pdf filename. I tried with
PDFBox and I got the following message when I typed the command: java
org.apache.lucene.demo.IndexHTML -create -index c:\\index .. 

log4j:WARN No appenders could be found for logger (org.pdfbox.pdfparser.PDFParse
r).
log4j:WARN Please initialize the log4j system properly.

Can anyone advise?

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 25, 2003 5:01 AM
To: Lucene Users List
Subject: Re: Lucene refresh index function (incremental indexing).

Tun Lin wrote:
> These are the steps I took:
> 
> 1) I compile all the files in a particular directory using the
command: 
> java org.apache.lucene.demo.IndexHTML -create -index c:\\index .. 
> , putting all the indexed files in c:\\index.
> 2) Everytime, I added an additional file in that directory. I need to 
> reindex/recompile that directory to generate the indexes again. As the

> directory gets larger, the indexing takes a longer time.
> 
> My question is how do I generate the indexes automatically everytime a

> new document is added in that directory without me recompiling
everytime
manually?

To update, try removing the '-create' from the command line.  The demo code
supports incremental updates.  It will re-scan the directory and figure out
which files have changed, what new files have appeared and which previously
existing files have been removed.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

RE: Lucene refresh index function (incremental indexing).

Reply via email to