Erik Hatcher wrote:

On Feb 7, 2005, at 1:21 AM, David Spencer wrote:

Erik Hatcher wrote:

XML-Indexing-Demo - I propose this be moved to an "examples" area if we keep it at all.
parsers - Is anyone using the PDF parser here?
taglib - my bad in committing this in the first place - it's not well implemented and of marginal use. I propose to remove it entirely.
miscellaneous - I propose that this be moved to contrib/util.
similarity & spellchecker - I propose these be combined into contrib/util.
Thoughts on these?


Another way of looking at it is to group query expansion code together i.e. similarity + spellchecker + wordnet go together. I think calling things "util" or "misc" demeans them - but disclaimer, these 3 things are coincidentally all mine.


No offense or demeaning intended.

None taken! Sorry, I should have made that clear.
I agree w/ trying to make sense of the packaging as that gives Lucene more value.



I wasn't that happy with an umbrella "util" area myself, but I am also trying to ensure we have a clean and sensible contrib area. Keep in mind that the idea is to package each contrib project as its own separate package within the Lucene distribution. So highlighter, with the Lucene 2.0 release, would be packaged as highlighter-2.0.jar. The WordNet package is unique in that it is not something you add on to an application using Lucene, but rather a tool that is used to generate an index for use with your

This may not be quite precise - the WordNet pkg does 2 things, [1] builds a synonym index and [2] expands queries. [2] is done in SynExpand.java.


Thus I thought it would make sense to think of a "query expansion" module and group this + the similarity stuff...

application. I'm not sure how these distinctions factor into how we package things.

The contrib area should be useful add-ons to Lucene's core, and isn't really appropriate for examples/demos, it seems to me.
The tricky pieces are miscellaneous, similarity, and spellchecker. These are tiny by themselves, and putting them in a util area and packaging them all together seems ok to me at one level, but does it make more sense to keep these completely separate?


OK, to be more concrete, I'll suggest the 3 above go to "search" or "query-expansion".


"search" is too generic, it seems, since all of Lucene could fit under that categorization. Maybe it makes the most sense to leave them as-is for the time being - though keeping it open for discussion is good to see what others think.

    Erik


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]


