There is a paper at this year's ACL conference on a statistical parser with some interesting properties [1]. I tracked down the software [2], and it is Apache-licensed (unlike most other high-quality parsers, such as the Berkeley and Stanford parsers). It is written in Scala, which runs on the JVM, so in theory it should be compatible with OpenNLP. Most importantly, it is about as accurate as those state-of-the-art parsers on English (about 33% error reduction relative to the Ratnaparkhi parser that OpenNLP currently uses), and it may be superior for cross-language performance.
I am going to try it on some of our clinical data to get a feel for its speed and accuracy on clinical text. Is there any interest in a wrapper for this parser in OpenNLP?

[1] Paper link: http://69.195.124.161/~aclwebor/anthology///P/P14/P14-1022.pdf
[2] Software: https://github.com/dlwh/epic

--
Tim Miller
Instructor
Boston Children's Hospital and Harvard Medical School
[email protected]
617-919-1223
