It depends on your use case. For example, you can use the name finder
to detect entities in those articles. The detected names could then be
used to build a graph that shows which names are frequently mentioned
together.
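
A rough sketch of the graph idea, assuming the name finder has already
produced one list of names per article. The function name
cooccurrence_counts and the sample names are made up for illustration,
and the sketch is in plain Python rather than the OpenNLP API:

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(names_per_article):
    """Count how often each unordered pair of names shares an article."""
    counts = Counter()
    for names in names_per_article:
        # Deduplicate within one article and sort, so that every pair
        # is stored in a single canonical order.
        unique = sorted(set(names))
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical name finder output: one list of detected names per article.
articles = [
    ["A. Dupont", "B. Martin", "C. Leroy"],
    ["B. Martin", "C. Leroy"],
    ["A. Dupont", "C. Leroy", "C. Leroy"],
]

graph = cooccurrence_counts(articles)

# Each pair is an edge of the graph and the count is its weight.
for (first, second), weight in graph.most_common():
    print(f"{first} -- {second}: {weight}")
```

Pairs with a high count are the names that tend to show up in the same
articles. Counting a name only once per article keeps long articles
from dominating the weights.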

Topic modeling might help you search for articles by topic.

Jörn

On Thu, 2015-02-19 at 21:29 +0100, Philippe de Rochambeau wrote:
> Hello,
> 
> In the past few months, I have indexed tens of thousands of PDFs containing 
> newspaper articles from 1887 until 1940 using SOLR for my company.
> 
> Every day, my colleagues in the Archive Department spend hours searching 
> through the archives using SOLR, looking for potentially-interesting articles 
> from a social and historical point of view.
> 
> Can OpenNLP be used to automate their work and/or to analyze patterns in the 
> data?
> 
> Many thanks.
> 
> Philippe
