Please post to java-user for such questions in the future.
The short answer with Lucene is, if you can get text, you can index
it. Lucene doesn't crawl URLs. Maybe you want Nutch instead for
this feature? Or perhaps WebDAV access? Lots of ways, none
directly related to Lucene though.
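
To make the "if you can get text, you can index it" point concrete, here is a minimal sketch of the fetching half, which is the part Lucene does not do for you. It uses only the JDK's java.net classes rather than commons-httpclient. The class name RemoteTextFetcher and the method fetchText are hypothetical names made up for this illustration, and decoding as UTF-8 is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Lucene.
final class RemoteTextFetcher {

    // Reads everything the URL serves and decodes it as UTF-8.
    // UTF-8 is an assumption; a real fetcher would honor the charset
    // declared by the server.
    static String fetchText(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RemoteTextFetcher <url>");
            return;
        }
        String text = fetchText(URI.create(args[0]).toURL());
        // From here the text is an ordinary String that the application
        // can put into a field of a Lucene Document.
        System.out.println(text.length() + " characters fetched");
    }
}
```

This fetches a single URL. Walking a whole directory tree (following links or listing a WebDAV collection) is left to the application or to a crawler such as Nutch.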
: you'll probably need an HTTP client module (commons-httpclient or something)
More specifically: when dealing with Lucene, the concept of a "document"
is very specific: it is an instance of
org.apache.lucene.document.Document. How you construct one of these
Document objects in your application is
You'll probably need an HTTP client module (commons-httpclient or something)
2005/10/27, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
> Can Lucene index remote documents? For example, if there are some documents
> at http://server:/documents, can I index the documents directory tree?
> Any help wou