There is more to consider here. Lucene now supports "payloads", additional metadata on terms that can be leveraged with custom queries. I've not yet tinkered with them myself, but my understanding is that they would be useful for (and were in fact designed in part for) representing structured documents. It would behoove us to investigate how payloads might be leveraged for your needs here, such that a single field could represent an entire document, with payloads representing the hierarchical structure. This would require creating specialized Analyzer and Query subclasses to take advantage of payloads. The Lucene community itself is just now starting to exploit this new feature, so there isn't a lot out there on it yet, but I think it holds great promise for these purposes.
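
To make the idea concrete, here is a rough sketch in plain Java. It deliberately does not use the Lucene payload API. `PayloadSketch`, `Token`, `tokenize`, and `matches` are hypothetical names made up for illustration, and storing the element path as UTF-8 bytes is just one assumed encoding. It only shows the shape of the approach: every term in a single field carries bytes describing where it sits in the document hierarchy, and a query checks both the term and that position.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PayloadSketch {
    /** A term plus opaque payload bytes (hypothetical stand-in for a term with a payload). */
    static final class Token {
        final String term;
        final byte[] payload;
        Token(String term, byte[] payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    /** Assumed encoding: the element path, e.g. "/book/chapter/para", as UTF-8 bytes. */
    static byte[] encodePath(String path) {
        return path.getBytes(StandardCharsets.UTF_8);
    }

    static String decodePath(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    /** Split text on whitespace and tag each term with the path of its enclosing element. */
    static List<Token> tokenize(String path, String text) {
        List<Token> out = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new Token(word.toLowerCase(Locale.ROOT), encodePath(path)));
            }
        }
        return out;
    }

    /** True if the term occurs at, or anywhere below, the given element path. */
    static boolean matches(List<Token> field, String term, String pathPrefix) {
        for (Token t : field) {
            String path = decodePath(t.payload);
            boolean inScope = path.equals(pathPrefix) || path.startsWith(pathPrefix + "/");
            if (inScope && t.term.equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // One "field" holds the whole document; payloads carry the structure.
        List<Token> field = new ArrayList<>();
        field.addAll(tokenize("/book/title", "Lucene in Action"));
        field.addAll(tokenize("/book/chapter/para", "Payloads attach metadata to terms"));

        System.out.println(matches(field, "lucene", "/book/title"));   // true
        System.out.println(matches(field, "lucene", "/book/chapter")); // false
        System.out.println(matches(field, "payloads", "/book"));       // true
    }
}
```

In a real index the tagging step would presumably live in the specialized Analyzer and the scope check in the specialized Query, but how that maps onto the actual Lucene classes is exactly what needs investigating.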

    Erik


Hello Erik,

Could you elaborate on how payloads could be used to represent a structured doc?

Thanks, Brian
