Issue can be found here: https://issues.apache.org/jira/browse/OPENNLP-702



On Tuesday, June 10, 2014 3:38 AM, Jörn Kottmann <[email protected]> wrote:
Hello,

That looks like a bug. Please open a Jira issue.

Thanks,
Jörn




On 06/09/2014 01:08 AM, Richard Head Jr. wrote:
> Here's my dictionary:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <dictionary case_sensitive="false">
>    <entry>
>      <token>vitamin</token>
>      <token>b12</token>
>    </entry>
>    <entry>
>      <token>vitamin</token>
>      <token>b</token>
>    </entry>
>    <entry>
>      <token>john</token>
>      <token>doe</token>
>    </entry>
>    <entry>
>      <token>john</token>
>      <token>d</token>
>    </entry>
> </dictionary>
>
> When run on this sentence using a DictionaryNameFinder: My name is john doe,
> aka john d. I like vitamin b12.
>
> The following tokens are found: john doe, john d, vitamin b
>
> As you can see, when the 2nd token ends in a number, the longest match is 
> discarded.
> Bug, or am I missing something?
>
> Thanks
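
For reference, here is a minimal sketch of the longest-match behavior the report expects. It is plain Java and is not OpenNLP's DictionaryNameFinder code. The class name LongestMatchSketch, the find method, and the tokenization of the sentence are all assumptions made for illustration, since the original message does not say which tokenizer was used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of OpenNLP.
class LongestMatchSketch {

    // Scans the tokens left to right and, at each position, reports the
    // longest dictionary entry that starts there (case-insensitive).
    static List<String> find(Set<List<String>> entries, String[] tokens) {
        int maxLen = 0;
        for (List<String> entry : entries) {
            maxLen = Math.max(maxLen, entry.size());
        }

        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int matched = 0;
            // Try the longest candidate window first, then shorter ones.
            for (int len = Math.min(maxLen, tokens.length - i); len > 0; len--) {
                List<String> window = new ArrayList<>();
                for (int j = i; j < i + len; j++) {
                    window.add(tokens[j].toLowerCase(Locale.ROOT));
                }
                if (entries.contains(window)) {
                    found.add(String.join(" ", window));
                    matched = len;
                    break;
                }
            }
            // Skip past the match, or advance by one token if nothing matched.
            i += Math.max(matched, 1);
        }
        return found;
    }

    public static void main(String[] args) {
        // Same four entries as the dictionary quoted above.
        Set<List<String>> entries = new HashSet<>();
        entries.add(Arrays.asList("vitamin", "b12"));
        entries.add(Arrays.asList("vitamin", "b"));
        entries.add(Arrays.asList("john", "doe"));
        entries.add(Arrays.asList("john", "d"));

        // Assumed tokenization: punctuation split off, "b12" kept whole.
        String[] tokens = {"My", "name", "is", "john", "doe", ",", "aka",
                           "john", "d", ".", "I", "like", "vitamin", "b12", "."};

        System.out.println(find(entries, tokens));
    }
}
```

Under that assumed tokenization the sketch prints [john doe, john d, vitamin b12], which is the result the report expects instead of "vitamin b".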
