Re: Indexing Remote Documents

2005-10-27 Thread Erik Hatcher
Please post to java-user for such questions in the future. The short answer with Lucene is, if you can get text, you can index it. Lucene doesn't crawl URLs. Maybe you want Nutch instead for this feature? Or perhaps WebDAV access? Lots of ways, none directly related to Lucene though.
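A minimal sketch of the "get text first" step, using only the JDK (an HTTP client library such as commons-httpclient would serve equally well). It is an illustration under stated assumptions, not a Lucene API: the class `RemoteText`, its methods, and the field names `url` and `contents` are all hypothetical, and the remote content is assumed to be UTF-8 plain text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: nothing in this class is part of Lucene.
class RemoteText {

    // Reads a stream to the end and decodes it as UTF-8 (an assumption;
    // a real fetcher would honor the charset reported by the server).
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Pairs the text with its source URL. The field names are illustrative;
    // the application decides how these map onto Lucene fields.
    static Map<String, String> toFields(String url, String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", url);
        fields.put("contents", text);
        return fields;
    }

    // Fetches a single URL. No crawling or link following happens here.
    static Map<String, String> fetch(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return toFields(url, readAll(in));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: RemoteText <url>");
            return;
        }
        Map<String, String> fields = fetch(args[0]);
        System.out.println(fields.get("url") + ": "
                + fields.get("contents").length() + " chars");
    }
}
```

The returned pairs would then be copied into an org.apache.lucene.document.Document and passed to an IndexWriter. The Field constructors differ between Lucene versions, so that step is left out here.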

Re: Indexing Remote Documents

2005-10-27 Thread Chris Hostetter
: probably you'll need http client module (commons-httpclient or something)

More specifically: when dealing with Lucene, the concept of a "document" is very specific: it is an instance of org.apache.lucene.document.Document. How you construct one of these Document objects in your application is

Re: Indexing Remote Documents

2005-10-27 Thread DalHo Park
probably you'll need http client module (commons-httpclient or something)

2005/10/27, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Can Lucene index remote documents? For example, if there are some documents
> at http://server:/documents, can I index the documents directory tree?
> Any help wou