Hi, I have a few questions about abbreviation handling in the sentence detector. I'd like to understand how it works and improve it if possible.
1) How does the sentence detector use the abbreviation dictionary? All train methods in SentenceDetectorME take an abbreviation dictionary as an argument, but they only save it to the model. The dictionary is not used to create the context generator, but it should be, shouldn't it?
2) The command line trainer does not allow passing an abbreviation dictionary. Maybe it should accept the name of a file that contains the dictionary.
3) Maybe we should include tools to extract the abbreviation dictionary from the training corpus. Optionally, this could also run during training.
What do you think?
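
To make question 3 more concrete, here is a minimal sketch of what such an extraction tool might do. It is plain Java and does not use any OpenNLP API. The class name AbbreviationExtractor and the method extract are hypothetical, and the heuristic rests on one assumption: the training corpus has one sentence per line, so any token that ends with a period but is not the last token of its line is a likely abbreviation.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of OpenNLP.
final class AbbreviationExtractor {

    // Assumes each element of 'sentences' is exactly one sentence.
    // Collects tokens that start with a letter, end with '.', and are
    // not the last token of their sentence.
    static Set<String> extract(List<String> sentences) {
        Set<String> abbreviations = new TreeSet<>();
        for (String sentence : sentences) {
            String[] tokens = sentence.trim().split("\\s+");
            // Skip the last token: its period is the sentence terminator.
            for (int i = 0; i < tokens.length - 1; i++) {
                String token = tokens[i];
                if (token.length() > 1
                        && token.endsWith(".")
                        && Character.isLetter(token.charAt(0))) {
                    abbreviations.add(token);
                }
            }
        }
        return abbreviations;
    }

    public static void main(String[] args) {
        List<String> corpus = List.of(
            "Dr. Smith arrived at 5 p.m. yesterday.",
            "He works for Acme Inc. in Boston.");
        System.out.println(extract(corpus));
    }
}
```

This would miss abbreviations that only ever occur at the end of a sentence, and it depends on whitespace tokenization, so it is only a starting point. The resulting set could then be written to a file and handed to the trainer, as proposed in question 2.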
