Philippe,

I have run GATE on Hadoop with Behemoth, but I take the annotations it
produces and push them to Solr so they can be searched later.
I am not familiar with Mimir. It should be possible to feed documents
fetched with Nutch to GATE for text analysis.
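
As a rough illustration of the "push annotations to Solr" step, here is a
minimal Python sketch. It is not what Behemoth does internally; the
annotation layout (type plus character offsets), the function name and the
"_ss" field suffix (a multi-valued string dynamic field in Solr's default
schema) are all assumptions made for the example.

```python
import json
from collections import defaultdict


def annotations_to_solr_doc(doc_id, text, annotations):
    """Group entity annotations by type into multi-valued fields.

    Hypothetical helper: each annotation is assumed to be a dict with
    a "type" and "start"/"end" character offsets into the text.
    """
    fields = defaultdict(list)
    for ann in annotations:
        value = text[ann["start"]:ann["end"]]
        # Assumed naming scheme: Person -> person_ss, Location -> location_ss
        field = ann["type"].lower() + "_ss"
        if value not in fields[field]:  # skip duplicate mentions
            fields[field].append(value)
    doc = {"id": doc_id, "text": text}
    doc.update(fields)
    return doc


text = "John Paul visited Paris in 2005."
annotations = [
    {"type": "Person", "start": 0, "end": 9},
    {"type": "Location", "start": 18, "end": 23},
    {"type": "Date", "start": 27, "end": 31},
]
doc = annotations_to_solr_doc("doc-1", text, annotations)
# This JSON could then be posted to a Solr update handler.
print(json.dumps(doc, indent=2))
```

With documents shaped like this, an entity search becomes an ordinary
fielded query, for example person_ss:"John Paul" or location_ss:Paris.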

If you want to chat about using GATE with Nutch, feel free to contact me
off-list.

Alex



On 1 January 2014 13:12, Philippe de Rochambeau <[email protected]> wrote:

> Hello,
>
> can you use Nutch to crawl PDFs and extract persons, locations, dates, times
> and money amounts as entities, as opposed to plain text strings?
>
> In GATE mimir-cloud (http://gate.ac.uk/mimir/), you can search for
> {People}, {Location}, {Date}, and {Money} entities (if you have previously
> used the appropriate Processing Resources to index your data sources, in
> GATE Developer 7.1.) For instance, you can run search queries such as:
>
> « JOHN PAUL » IN {People}
> Paris IN {Location},
> {Date normalized>20010101 normalized<20100101}
> {Money > 2000}
> ...
>
> Can you do such things in Nutch?
>
> Many thanks.
>
> Philippe
