On 02/04/2014 06:10 PM, [email protected] wrote:
+++ opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/cmdline/tokenizer/CommandLineTokenizer.java Tue Feb  4 17:10:11 2014

<SNIP>

    void process() {
-
-    ObjectStream<String> untokenizedLineStream =
-        new PlainTextByLineStream(new InputStreamReader(System.in));
-
-    ObjectStream<String> tokenizedLineStream = new WhitespaceTokenStream(
-        new TokenizerStream(tokenizer, untokenizedLineStream));
-
-    PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
-    perfMon.start();
-
+    ObjectStream<String> untokenizedLineStream = null;
+
+    ObjectStream<String> tokenizedLineStream = null;
+    PerformanceMonitor perfMon = null;
      try {
+      untokenizedLineStream =
+              new PlainTextByLineStream(new MockInputStreamFactory(System.in), "UTF-8");

The encoding should not be changed. Input read from System.in should be decoded with the platform default encoding, not UTF-8.
As far as I know, hard-coding UTF-8 here will not work on Windows.
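
A minimal sketch of the idea, using only the JDK and not the OpenNLP stream classes: the charset is passed in explicitly, and the caller chooses Charset.defaultCharset() for System.in. The class name DefaultEncodingStdin and the helper readLines are hypothetical names made up for this illustration; they are not part of the commit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of OpenNLP.
public class DefaultEncodingStdin {

    // Reads all lines from the stream, decoding its bytes with the given charset.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Use the JVM's default charset (historically derived from the OS
        // locale) rather than a hard-coded "UTF-8" when reading from System.in.
        Charset platform = Charset.defaultCharset();
        for (String line : readLines(System.in, platform)) {
            System.out.println(line);
        }
    }
}
```

If the bytes arriving on System.in are in a single-byte encoding such as ISO-8859-1, decoding them as UTF-8 turns non-ASCII characters into replacement characters, which is the kind of breakage the comment above is concerned about.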

Jörn
