Hi,

I noticed that the length is hardcoded to 4 in the PrefixFeatureGenerator
and the SuffixFeatureGenerator. I made this value configurable in the XML
for each feature generator. I also added a check on the length to keep
duplicate prefixes or suffixes from being returned. (If the token is "yes"
and the length is 4, two "yes" features would otherwise be returned.) If a
value is not provided in the XML, it uses the default value of 4.
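
To illustrate the idea, here is a minimal sketch of the length check. It is
not the code from the branch: the class name PrefixSuffixSketch, the method
names, and the DEFAULT_LENGTH constant are made up for this example, and the
XML parsing is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the actual OpenNLP classes.
class PrefixSuffixSketch {

  // Assumed fallback when no length is configured (4, as in the current code).
  static final int DEFAULT_LENGTH = 4;

  // Returns prefixes of 1..maxLength characters. Capping the loop at the
  // token length keeps a short token from yielding the same prefix twice.
  static List<String> getPrefixes(String token, int maxLength) {
    int limit = Math.min(maxLength, token.length());
    List<String> prefixes = new ArrayList<>();
    for (int i = 1; i <= limit; i++) {
      prefixes.add(token.substring(0, i));
    }
    return prefixes;
  }

  // Same idea for suffixes, taken from the end of the token.
  static List<String> getSuffixes(String token, int maxLength) {
    int limit = Math.min(maxLength, token.length());
    List<String> suffixes = new ArrayList<>();
    for (int i = 1; i <= limit; i++) {
      suffixes.add(token.substring(token.length() - i));
    }
    return suffixes;
  }
}
```

With this sketch, getPrefixes("yes", DEFAULT_LENGTH) gives "y", "ye", "yes"
once each, rather than "yes" twice.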

You can preview the changes here:
https://github.com/apache/opennlp/compare/master...jzonthemtn:prefixsuffix?expand=1

If this is a change that's desired by the group I can make a JIRA and a
pull request.

Thanks,
Jeff
