Hi all,

I have a problem that may be trivial, but I don't know how to solve it with Lucene.

I created one Lucene index for a large data set of around 3 million documents across various domains, and another index for a corpus of 30 documents in a specific domain. For every document in the small corpus, I want to find a similar document in the large one. Does anyone know whether the Lucene searcher can do that?

Secondly, how can I view the terms of a document and their frequencies from the generated index?

Thanks,
Shaimaa