Grant Ingersoll wrote:
> Mind you, our docs are an order of magnitude better than this other project
I agree, Lucene is a very well documented project compared to many. In general, and in conjunction with LIA, it's a pretty easy project to get into.
> 3. There is a whole lot of knowledge stored in the email archives, how can we leverage it?
This is indeed a key point. HitCollector and its surrounding classes are poorly documented, yet many replies to questions on the lists recommend using a HitCollector.
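To make the HitCollector point concrete, here is a minimal sketch of the callback pattern those replies tend to recommend. It assumes the Lucene 2.x shape of HitCollector, an abstract class with a single collect(int doc, float score) method that is called for each non-zero scoring document. So that the sketch runs without Lucene on the classpath, the HitCollector class below is a stand-in, and ThresholdCollector and fakeSearch are hypothetical names invented for illustration; with the real library you would pass the collector to Searcher.search(Query, HitCollector) instead.

```java
import java.util.ArrayList;
import java.util.List;

class CollectorSketch {
    // Stand-in mirroring the assumed shape of Lucene 2.x's HitCollector,
    // declared here only so the sketch compiles without Lucene.
    static abstract class HitCollector {
        public abstract void collect(int doc, float score);
    }

    // Hypothetical collector: keeps the ids of all documents whose score
    // is at least minScore, in the order they are reported.
    static class ThresholdCollector extends HitCollector {
        private final float minScore;
        private final List<Integer> docs = new ArrayList<Integer>();

        ThresholdCollector(float minScore) {
            this.minScore = minScore;
        }

        @Override
        public void collect(int doc, float score) {
            if (score >= minScore) {
                docs.add(doc);
            }
        }

        List<Integer> docs() {
            return docs;
        }
    }

    // Hypothetical stand-in for a searcher: reports every document with a
    // non-zero score to the collector, using the array index as the doc id.
    static void fakeSearch(float[] scores, HitCollector collector) {
        for (int doc = 0; doc < scores.length; doc++) {
            if (scores[doc] > 0f) {
                collector.collect(doc, scores[doc]);
            }
        }
    }

    public static void main(String[] args) {
        ThresholdCollector collector = new ThresholdCollector(0.5f);
        fakeSearch(new float[] {0.9f, 0f, 0.3f, 0.7f}, collector);
        System.out.println(collector.docs());
    }
}
```

The design point is that the collector sees every matching document rather than a ranked top-N, so whatever it does per hit should be cheap.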
The search package is generally well described, apart from what are described as 'low level API' or 'expert' methods and classes. I found I needed to get to that level to get the best out of Lucene in a framework that sits on top of it.
Performance is another topic that would really benefit from a 'best practice' guide. The dev and user posts concerning performance always get many responses. It would be a challenge to produce, but a set of recommendations relating user data to reader/writer usage would be great, e.g. which maxBufferedDocs, maxMergeDocs and mergeFactor values to use in a number of different usage scenarios. That said, there's no substitute for evaluating the settings with your own data.
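As a rough illustration of why these settings interact, the sketch below models how maxBufferedDocs and mergeFactor affect the number of segments left on disk. It is a deliberately simplified model, not Lucene's actual merge code: it assumes a segment is flushed every maxBufferedDocs documents and that mergeFactor segments of the same size are always merged into one. It ignores maxMergeDocs, deletions and any documents still buffered in memory. MergeSketch and segmentCount are hypothetical names.

```java
class MergeSketch {
    // Simplified model of logarithmic merging (an assumption, not Lucene code).
    // Each full buffer of maxBufferedDocs documents becomes one segment, and
    // every group of mergeFactor equal-sized segments collapses into one larger
    // segment. The remaining segment count is then the sum of the digits of the
    // flush count written in base mergeFactor.
    static int segmentCount(int numDocs, int maxBufferedDocs, int mergeFactor) {
        if (numDocs < 0 || maxBufferedDocs < 1 || mergeFactor < 2) {
            throw new IllegalArgumentException("invalid settings");
        }
        int flushed = numDocs / maxBufferedDocs; // full flushes only
        int segments = 0;
        while (flushed > 0) {
            segments += flushed % mergeFactor; // segments left at this level
            flushed /= mergeFactor;            // merged up to the next level
        }
        return segments;
    }

    public static void main(String[] args) {
        // Compare a low and a high mergeFactor for the same document count.
        System.out.println(segmentCount(990, 10, 2));
        System.out.println(segmentCount(990, 10, 10));
    }
}
```

Under this model a larger mergeFactor means fewer merges while indexing but more segments for a reader to search, which is the trade-off any recommendation would have to weigh against real data.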
A definitive statement about 'optimize' would also help: when (not) to use it and how it relates to performance. I know there's lots about it already, but it's dotted all over the place.
Maybe this sort of information would be better in LIA2...

Antony