Hi, I would like to contribute a class, based on the MoreLikeThis class in contrib/queries, that generates a query from the tags associated with a document. The class assumes that each document is tagged with a set of tags, which are stored in the index in a separate Field. For a given tag, it determines the top document terms associated with that tag using the information gain metric.
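To make the scoring idea concrete, here is a minimal sketch of how information gain between a term and a tag could be computed from document counts. It is not the attached class: the name TagTermInfoGain, the method names, and the count parameters are all hypothetical, and the sketch assumes the usual binary formulation IG = H(tag) - H(tag | term), where both "has tag" and "contains term" are yes/no properties of a document. How the counts are read from the index is left out.

```java
// Hypothetical illustration only; not the class attached to this message.
class TagTermInfoGain {

    // One term of the entropy sum: -p * log2(p), with 0 * log(0) taken as 0.
    static double plogp(double p) {
        return p <= 0.0 ? 0.0 : -p * Math.log(p) / Math.log(2.0);
    }

    // Entropy (in bits) of a yes/no variable seen `positives` times out of `total`.
    static double entropy(double positives, double total) {
        if (total <= 0.0) {
            return 0.0; // an empty partition contributes no uncertainty
        }
        double p = positives / total;
        return plogp(p) + plogp(1.0 - p);
    }

    // Information gain of a term with respect to a tag.
    //   nDocs : total number of documents
    //   nTag  : documents carrying the tag
    //   nTerm : documents containing the term
    //   nBoth : documents carrying the tag AND containing the term
    static double informationGain(int nDocs, int nTag, int nTerm, int nBoth) {
        int nNoTerm = nDocs - nTerm;

        // Uncertainty about the tag before looking at the term.
        double hTag = entropy(nTag, nDocs);

        // Remaining uncertainty, weighted over "term present" and "term absent".
        double hGivenTerm = entropy(nBoth, nTerm);
        double hGivenNoTerm = entropy(nTag - nBoth, nNoTerm);
        double hConditional = ((double) nTerm / nDocs) * hGivenTerm
                            + ((double) nNoTerm / nDocs) * hGivenNoTerm;

        return hTag - hConditional;
    }
}
```

Under these assumptions, a term that occurs in exactly the tagged documents gets the highest score for that tag, and a term that is spread evenly across tagged and untagged documents scores zero. Ranking the terms by this score and keeping the top few would give the "top document terms" for a tag.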
When generating a MoreLikeThis query for a document, the tags associated with that document are used to determine the terms in the query. This is useful for finding documents similar to one that has few relevant terms of its own but has been tagged. I have attached the class and a test class and would appreciate any feedback.

Thank you,
Thomas