RegexNameFinder when entity spans multiple tokens

Jake Dodd Wed, 19 Nov 2014 14:12:12 -0800

Hi all,

I’m trying to implement a RegexNameFinder for money entities (to supplement 
results from the default OpenNLP statistical model).


The money entities will span multiple tokens (for example, “$120 billion” is 
tokenized as ‘$’, ‘120’, ‘billion’). I’ve verified that my regex pattern 
matches the untokenized phrase “$120 billion”, but when I use the same pattern 
in RegexNameFinder on the tokenized input, the name finder returns no results.
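
One possible explanation can be sketched without OpenNLP at all. The code below 
is a minimal illustration, assuming a token-based regex finder joins the tokens 
with single spaces and keeps only matches that start and end on token 
boundaries. That assumption is not verified here against the RegexNameFinder 
source, and the class TokenRegexSketch and the method findTokenSpans are 
hypothetical names, not part of OpenNLP. Under that assumption the matched text 
is “$ 120 billion” rather than “$120 billion”, so a pattern written for the raw 
phrase finds nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenRegexSketch {

    // Hypothetical helper (not an OpenNLP API): returns {startToken, endTokenExclusive}
    // pairs for every regex match that lines up with token boundaries.
    public static List<int[]> findTokenSpans(String[] tokens, Pattern pattern) {
        // Assumption: tokens are joined with a single space before matching.
        StringBuilder joined = new StringBuilder();
        int[] starts = new int[tokens.length];
        int[] ends = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                joined.append(' ');
            }
            starts[i] = joined.length();
            joined.append(tokens[i]);
            ends[i] = joined.length();
        }

        List<int[]> spans = new ArrayList<>();
        Matcher m = pattern.matcher(joined);
        while (m.find()) {
            int startTok = -1;
            int endTok = -1;
            // Map character offsets of the match back to token indices.
            for (int i = 0; i < tokens.length; i++) {
                if (starts[i] == m.start()) {
                    startTok = i;
                }
                if (ends[i] == m.end()) {
                    endTok = i + 1;
                }
            }
            // Keep only matches that begin and end exactly on token boundaries.
            if (startTok >= 0 && endTok > startTok) {
                spans.add(new int[] {startTok, endTok});
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] tokens = {"It", "cost", "$", "120", "billion", "."};

        // Written for the raw text: expects a digit immediately after '$'.
        Pattern rawText = Pattern.compile("\\$\\d+ (?:million|billion)");
        // Written for the space-joined tokens: allows the space after '$'.
        Pattern tokenAware = Pattern.compile("\\$ \\d+ (?:million|billion)");

        System.out.println(findTokenSpans(tokens, rawText).size());    // prints 0
        System.out.println(findTokenSpans(tokens, tokenAware).size()); // prints 1
    }
}
```

If the real finder behaves this way, the second pattern would yield one span 
covering tokens 2 through 4 (‘$’, ‘120’, ‘billion’), while the first yields 
none, which matches the symptom described above.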

Do RegexNameFinders match named entities that span multiple tokens? Or are they 
designed to find single-token named entities?

Cheers

Jake
