Good point. Would it make sense to have a reasonable default that writes to a tmp directory, e.g. /tmp/opennlp.<component>.<encoding>.<timestamp>.txt, which could be overridden with an option?
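
To make the idea concrete, here is a rough sketch of how such a default could be resolved. Everything in it is hypothetical: the class name DefaultOutput, the methods defaultPath/resolve/open, and the way the override is passed in are my own placeholders, not existing OpenNLP code. It also assumes java.nio.file (Java 7) is available and uses the JVM's java.io.tmpdir rather than a hard-coded /tmp.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of OpenNLP: picks an output file for a tool.
final class DefaultOutput {

    // Builds <tmpdir>/opennlp.<component>.<encoding>.<timestamp>.txt
    static Path defaultPath(String component, Charset encoding, long timestamp) {
        String tmpDir = System.getProperty("java.io.tmpdir");
        String name = "opennlp." + component + "." + encoding.name()
                + "." + timestamp + ".txt";
        return Paths.get(tmpDir, name);
    }

    // Uses the user-supplied path if one was given, otherwise the default.
    static Path resolve(String userPath, String component, Charset encoding) {
        if (userPath != null && !userPath.isEmpty()) {
            return Paths.get(userPath);
        }
        return defaultPath(component, encoding, System.currentTimeMillis());
    }

    // Opens the file with an explicit encoding instead of the system default.
    static Writer open(Path path, Charset encoding) throws IOException {
        return Files.newBufferedWriter(path, encoding);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the output encoding equals the input file's encoding.
        Charset enc = StandardCharsets.UTF_8;
        Path out = resolve(args.length > 0 ? args[0] : null, "converter", enc);
        try (Writer w = open(out, enc)) {
            w.write("Jörn\n");
        }
        // Tell the user where the output went, since it is no longer on stdout.
        System.err.println("Wrote output to " + out);
    }
}
```

The point of passing the Charset explicitly is that the writer never falls back to the platform default, which is the problem Jörn describes for the converters and evaluators.
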
On Wed, Jul 13, 2011 at 11:05 AM, Jörn Kottmann <[email protected]> wrote:

> Hello,
>
> our cmd line tools print the processed text to stdout or stderr.
>
> We have three kinds of tools doing that right now:
> - Taggers, print out the processed text
> - Converters
> - Evaluators
>
> I believe this is problematic when working with texts which are not
> compatible with the system default encoding.
>
> It might not be a big deal for the Taggers, because these are meant to be
> a demonstration and quick test tool.
>
> But it is actually a problem for the evaluators and especially for the
> converters,
> because it could mean for some users that they just don't work depending
> on the processed language and their encoding.
>
> Should we plan to change the converters and evaluators to always print
> into a file, which has the same encoding as the input file?
>
> Jörn
>
>

-- 
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://www.jasonbaldridge.com
http://twitter.com/jasonbaldridge
