Hello, I would like to develop an OpenNLP application that indexes place and people names, dates, numbers and monetary amounts, among other things, contained in thousands of PDFs. People and place names would be looked up in gazetteers (i.e., dictionaries), and dates, numbers and amounts would be normalized so that they can be compared (e.g., find all PDFs whose contents contain dates > 20010101 and < 20100101).
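To illustrate what I mean by normalization, here is a rough sketch in plain Java. It does not use any OpenNLP API; the class `DateNormalizer`, its methods and the three accepted date formats are purely hypothetical, and a real application would need many more formats and locales. The idea is simply to turn each recognized date string into a yyyyMMdd integer so that range comparisons become trivial.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: not part of OpenNLP or Solr.
public class DateNormalizer {

    // Assumed input formats, tried in order; extend as needed.
    private static final List<DateTimeFormatter> FORMATS = List.of(
            DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ENGLISH),
            DateTimeFormatter.ofPattern("d MMMM yyyy", Locale.ENGLISH),
            DateTimeFormatter.ofPattern("MM/dd/yyyy", Locale.ENGLISH));

    // Returns the date as a yyyyMMdd integer, or empty if no format matches.
    static Optional<Integer> normalize(String raw) {
        for (DateTimeFormatter format : FORMATS) {
            try {
                LocalDate d = LocalDate.parse(raw.trim(), format);
                return Optional.of(
                        d.getYear() * 10000 + d.getMonthValue() * 100 + d.getDayOfMonth());
            } catch (DateTimeParseException e) {
                // Not this format; try the next one.
            }
        }
        return Optional.empty();
    }

    // True if the date parses and lies strictly between lo and hi (both yyyyMMdd).
    static boolean inRange(String raw, int lo, int hi) {
        return normalize(raw).map(v -> v > lo && v < hi).orElse(false);
    }
}
```

With something like this, `DateNormalizer.inRange("5 March 2004", 20010101, 20100101)` would answer the example query above for a single extracted date, and the normalized integer could be stored as a sortable field in the index.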
Furthermore, the generated index would have to be Solr-compatible and tweakable: for example, one should be able to specify the criteria used to sort search results (order documents by date, document name, people names, etc.). How difficult would it be to develop such an application in OpenNLP? Many thanks. Philippe
