Dmitry Serebrennikov [[EMAIL PROTECTED]] has implemented a substantial extension to Lucene which should help folks doing this sort of research. It provides an explicit vector representation for documents. This way you can, e.g., retrieve a number of documents, efficiently sum their vectors, then derive a new query from the sum. This code was posted to the list a long while back, but is now out of date. As soon as the 1.2 release is final, and Dmitry has time, he intends to merge it into Lucene.
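To make the sum-then-requery idea concrete, here is a minimal sketch in plain Java. It does not use Dmitry's extension or any Lucene API; document vectors are assumed to be simple term-to-weight maps, and the class and method names (VectorFeedbackSketch, sumVectors, topTerms) are hypothetical. Taking the heaviest terms of the sum as the new query is just one simple choice for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of Lucene or of Dmitry's code.
class VectorFeedbackSketch {

    // Adds up sparse term-weight vectors, one per retrieved document.
    static Map<String, Double> sumVectors(List<Map<String, Double>> docVectors) {
        Map<String, Double> sum = new HashMap<>();
        for (Map<String, Double> vec : docVectors) {
            for (Map.Entry<String, Double> e : vec.entrySet()) {
                sum.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return sum;
    }

    // Keeps the k heaviest terms of the summed vector as the new query terms.
    // Ties are broken alphabetically so the result is deterministic.
    static List<String> topTerms(Map<String, Double> sum, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(sum.entrySet());
        entries.sort((a, b) -> {
            int byWeight = Double.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            terms.add(entries.get(i).getKey());
        }
        return terms;
    }

    public static void main(String[] args) {
        // Two made-up document vectors (assumed weights, e.g. term frequencies).
        Map<String, Double> doc1 = new HashMap<>();
        doc1.put("lucene", 2.0);
        doc1.put("vector", 1.0);
        Map<String, Double> doc2 = new HashMap<>();
        doc2.put("vector", 3.0);
        doc2.put("query", 1.0);

        List<Map<String, Double>> retrieved = new ArrayList<>();
        retrieved.add(doc1);
        retrieved.add(doc2);

        // The top terms of the sum would then be fed back in as a new query.
        System.out.println(topTerms(sumVectors(retrieved), 2));
    }
}
```

With real term vectors available from the index, the maps above would be filled from the stored vectors rather than built by hand.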
Doug