Hello, I've read your proposal (and all the emails related to it). One thing I'd advise is to distinguish between the crawler and the loader components. The crawler is responsible for gathering documents from several sources. The loader (or indexer) is responsible for loading the gathered documents into the index (in batch mode, I think).
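To make the split concrete, here is a minimal sketch in Java. It is only an illustration under assumptions of mine: the names `SpoolSketch`, `Crawler`, `IndexSink`, `FixedCrawler` and `Loader` are hypothetical, the hand-off goes through a spool directory of plain files, and `IndexSink` is a stand-in for the real index rather than any Lucene API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SpoolSketch {

    /** A crawler only gathers documents; it knows nothing about the index. */
    interface Crawler {
        void crawl(Path spoolDir) throws IOException;
    }

    /** Stand-in for the real index (hypothetical, not a Lucene API). */
    interface IndexSink {
        void add(String name, String content);
    }

    /** Hypothetical crawler that "gathers" a fixed set of in-memory documents. */
    static class FixedCrawler implements Crawler {
        private final String[][] docs; // each entry is {name, content}

        FixedCrawler(String[][] docs) {
            this.docs = docs;
        }

        public void crawl(Path spoolDir) throws IOException {
            for (String[] d : docs) {
                // Write under a temporary name, then rename, so the loader
                // only ever picks up completely written files.
                Path tmp = spoolDir.resolve(d[0] + ".tmp");
                Files.write(tmp, d[1].getBytes(StandardCharsets.UTF_8));
                Files.move(tmp, spoolDir.resolve(d[0] + ".doc"));
            }
        }
    }

    /** The loader reads whatever is in the spool directory, in one batch. */
    static class Loader {
        static int loadBatch(Path spoolDir, IndexSink sink) throws IOException {
            // Collect the file names first, then process and delete them.
            List<Path> batch = new ArrayList<>();
            try (DirectoryStream<Path> files = Files.newDirectoryStream(spoolDir, "*.doc")) {
                for (Path f : files) {
                    batch.add(f);
                }
            }
            for (Path f : batch) {
                String content = new String(Files.readAllBytes(f), StandardCharsets.UTF_8);
                sink.add(f.getFileName().toString(), content);
                Files.delete(f); // the document has been handed to the index
            }
            return batch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path spool = Files.createTempDirectory("spool");
        Crawler crawler = new FixedCrawler(new String[][] {
            {"a", "first document"},
            {"b", "second document"}
        });
        crawler.crawl(spool);

        List<String> indexed = new ArrayList<>();
        int n = Loader.loadBatch(spool, (name, content) -> indexed.add(name));
        System.out.println("loaded " + n + " documents");
    }
}
```

With this shape, an ftp, http, jdbc or filesys crawler would each implement `Crawler` and contain no indexing logic at all. Only `Loader` touches the index.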
I think it's redundant to hardcode the indexing logic into every crawler component (ftp, http, jdbc, filesys crawlers). An interesting question is how the components should communicate. (Don't you think using Avalon would be a good way?) We are running a country-wide search engine (not based on Lucene; it's a commercial application), and the crawler (http) runs on one machine while the loader (and the query server) runs on another. Because the crawler and the loader use files as their communication interface, we can also add documents to (or delete them from) the index manually.

peter

> -----Original Message-----
> From: Andrew C. Oliver [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, February 07, 2002 1:35 PM
> To: Lucene Developers List
> Subject: Proposal for Lucene
>
> Hi All,
>
> This is just a few thoughts about Lucene. Please send me your feedback,
> critiques and thought.
>
> If you folks would take a look:
>
> http://www.trilug.org/~acoliver/luceneplan.html
>
> if you'd like to submit patches:
>
> http://www.trilug.org/~acoliver/luceneplan.xml
>
> Once I've gotten feedback from the developer community I'll send this to
> the user community as well.
>
> Thanks,
>
> Andy
> --
> www.superlinksoftware.com
> www.sourceforge.net/projects/poi - port of Excel format to java
> http://developer.java.sun.com/developer/bugParade/bugs/4487555.html
> - fix java generics!
>
> The avalanche has already started. It is too late for the pebbles to
> vote.
> -Ambassador Kosh
