I'm using Lucene to index a set of documents in which the same field name occurs many times within a single document.
A simple example of our typical document structure might look like this:

  section_title: the first section
  text: this is section 1 text
  section_title: the second section
  text: this is section 2 text

If I search section_title for the text "first", this document is a hit. However, if I then call hits.doc(0).get("section_title"), the section title that matched is *not* the value returned - the last occurrence of section_title in the document is returned instead.

Unless I'm mistaken, this is a bug. Ideally, the section title that made the document a hit would be returned. Less ideally, but still better than the current behaviour, the first section title would be returned, as this is typically what a user sees first when they choose to view the entire document after finding it in the search results.

Was the current behaviour intentional/anticipated? If no one has time to fix this but can point me at the best place to start, I'm willing to attempt a fix and contribute a patch.

At the moment I use the result of doc().get("section_title") as part of my search results, so the current behaviour leads to slightly strange-looking results pages...

Regards,
Lee Mallabone.
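
P.S. To make the desired behaviour concrete, here is a rough sketch in plain Java. It is not Lucene code: the class SectionTitlePicker and its pick method are hypothetical names invented for illustration, and it assumes the application already has all of a document's section titles in document order (how those values are obtained from the index is not shown). The word matching is a crude stand-in for whatever analyzer is actually used at index time.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not part of Lucene): chooses which section title to display. */
final class SectionTitlePicker {
    private SectionTitlePicker() {}

    /**
     * Returns the first title that contains queryTerm as a whole word
     * (case-insensitive). If no title matches, falls back to the first
     * title in the document; returns null if there are no titles at all.
     */
    static String pick(List<String> titlesInDocumentOrder, String queryTerm) {
        String needle = queryTerm.toLowerCase(Locale.ROOT);
        for (String title : titlesInDocumentOrder) {
            // Crude tokenisation on non-word characters; an assumption,
            // a real analyzer may split and normalise text differently.
            for (String token : title.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (token.equals(needle)) {
                    return title;
                }
            }
        }
        return titlesInDocumentOrder.isEmpty() ? null : titlesInDocumentOrder.get(0);
    }
}
```

With the example document above, a search for "first" would select "the first section", a search for "second" would select "the second section", and a term that matches no title would fall back to "the first section".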