On 7/12/11 3:11 PM, [email protected] wrote:
Added: 
incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/cmdline/BasicEvaluationParameters.java
URL:http://svn.apache.org/viewvc/incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/cmdline/BasicEvaluationParameters.java?rev=1145578&view=auto
==============================================================================
--- 
incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/cmdline/BasicEvaluationParameters.java
 (added)
+++ 
incubator/opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/cmdline/BasicEvaluationParameters.java
 Tue Jul
...
+
+  @ParameterDescription(valueName = "charsetName", description = "specifies the encoding which should be used for reading and writing text")
+  @OptionalParameter(defaultValue="UTF-8")
+  Charset getEncoding();

We should decide how to handle this and then do it consistently.
The trainers declare the encoding as a mandatory parameter, while the
evaluators now declare it as optional and take UTF-8 as the default.

In my opinion we should either force the user to specify it, so that
they have to think about the encoding, or use the platform default
encoding, because that is what a user would expect by convention: most
command line tools operate with the platform default encoding.
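
To make the options concrete, here is a minimal sketch of the three
behaviours side by side. The class EncodingDefaults, the Policy enum and
the resolve method are hypothetical names used only for illustration,
they are not part of opennlp-tools. Only Charset.forName,
Charset.defaultCharset and StandardCharsets.UTF_8 are real JDK API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDefaults {

    /** Hypothetical policies matching the options discussed in this mail. */
    public enum Policy { MANDATORY, PLATFORM_DEFAULT, UTF8_DEFAULT }

    /**
     * Resolves the charset to use. charsetName is the value of the
     * encoding parameter, or null if the user did not pass one.
     */
    public static Charset resolve(String charsetName, Policy policy) {
        if (charsetName != null) {
            // An explicit value always wins, whatever the policy is.
            return Charset.forName(charsetName);
        }
        switch (policy) {
            case PLATFORM_DEFAULT:
                // Depends on the JVM and OS settings of the machine.
                return Charset.defaultCharset();
            case UTF8_DEFAULT:
                // Same result on every machine.
                return StandardCharsets.UTF_8;
            default:
                // MANDATORY: refuse to guess.
                throw new IllegalArgumentException(
                        "the encoding parameter is required");
        }
    }

    public static void main(String[] args) {
        System.out.println("platform default: "
                + resolve(null, Policy.PLATFORM_DEFAULT));
        System.out.println("UTF-8 default:    "
                + resolve(null, Policy.UTF8_DEFAULT));
    }
}
```

With PLATFORM_DEFAULT the same command can read a file differently on
two machines, with UTF8_DEFAULT it behaves the same everywhere but may
not match what other tools on that machine do, and with MANDATORY the
user always has to decide.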

Or is there a good reason to use UTF-8 as a default?

I know this is a difficult decision to get right. As far as I know, we
have been criticized for the current way of doing it because people
don't want to pass the encoding parameter every time.

Jörn
