On 11/10/11 6:59 PM, Aliaksandr Autayeu wrote:
Sounds interesting. I would be cautious about reinventing the wheel, though: the standard Java way is quite good. But maybe I don't understand the code or your proposal well enough yet. James, before jumping into it, could you put together a small before-and-after code sample to better illustrate your idea? A snippet of the code before, a snippet after, and a snippet of "client" code before and after. What do you think?
We don't really have encoding issues in OpenNLP, because the whole API relies on strings, and strings are always UTF-16 in Java. The only place where we need to deal with encoding is our command line interface, where we read in training data and provide tools to transform data, evaluation tools, and demo tools that read plain text from the console.
Jörn
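
To illustrate the point about encoding only mattering at the I/O boundary, here is a minimal sketch using only the standard Java library. The class name EncodingBoundaryDemo and the helper readLines are hypothetical and are not part of the OpenNLP API; the idea is simply that bytes are decoded with an explicit charset once, when the data is read, and everything after that works on ordinary Java strings.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class EncodingBoundaryDemo {

    // Hypothetical helper (not an OpenNLP method): decodes the byte stream
    // with the given charset and returns its lines as Java strings.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Simulated input: one line of text stored as ISO-8859-1 bytes.
        byte[] latin1Bytes = "J\u00f6rn\n".getBytes(StandardCharsets.ISO_8859_1);

        // The charset is named once, here, where the bytes enter the program.
        List<String> lines = readLines(new ByteArrayInputStream(latin1Bytes),
                StandardCharsets.ISO_8859_1);

        // From this point on the text is a plain String, independent of the
        // encoding the input happened to use.
        System.out.println(lines.get(0).length());
    }
}
```

A command line tool would presumably take the charset from a user-supplied option and pass it to a reader like this, while code that receives the resulting strings would not need to know about the encoding at all.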
