On 02/04/2014 06:10 PM, [email protected] wrote:
+++
opennlp/trunk/opennlp-tools/src/main/java/opennlp/tools/cmdline/tokenizer/CommandLineTokenizer.java
Tue Feb 4 17:10:11 2014
<SNIP>
void process() {
-
- ObjectStream<String> untokenizedLineStream =
- new PlainTextByLineStream(new InputStreamReader(System.in));
-
- ObjectStream<String> tokenizedLineStream = new WhitespaceTokenStream(
- new TokenizerStream(tokenizer, untokenizedLineStream));
-
- PerformanceMonitor perfMon = new PerformanceMonitor(System.err, "sent");
- perfMon.start();
-
+ ObjectStream<String> untokenizedLineStream = null;
+
+ ObjectStream<String> tokenizedLineStream = null;
+ PerformanceMonitor perfMon = null;
try {
+ untokenizedLineStream =
+ new PlainTextByLineStream(new MockInputStreamFactory(System.in),
"UTF-8");
The encoding should not be changed here. Input read from System.in should
be decoded with the platform default encoding, not a hard-coded UTF-8.
As far as I know, hard-coding UTF-8 will not work on Windows.
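
To illustrate, here is a minimal sketch that uses only plain JDK classes and
not the OpenNLP stream API. The class name DefaultEncodingStdin and the helper
readLines are made up for this example. It decodes standard input with
whatever Charset.defaultCharset() returns on the running JVM instead of a
fixed "UTF-8":

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class DefaultEncodingStdin {

    // Hypothetical helper (not part of OpenNLP): reads every line from `in`,
    // decoding the bytes with the charset supplied by the caller.
    public static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset));
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Use the JVM's default charset for System.in rather than hard-coding UTF-8.
        for (String line : readLines(System.in, Charset.defaultCharset())) {
            System.out.println(line);
        }
    }
}
```

For the patch itself this would presumably mean passing the default charset
(for example Charset.defaultCharset().name()) where "UTF-8" is passed now, if
the constructor used there accepts a charset name as it appears to.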
Jörn