e have not tested our
> > cross-platform updates with the older Mac CR and I suspect it will not
> > work. So I suggest using either CRLF or LF, which we have used extensively
> > across Windows and Posix systems.
> >
> > Tom
> >
> >
> > On 12/5/2015 6:13
will not
work. So I suggest using either CRLF or LF, which we have used extensively
across Windows and Posix systems.
Tom
On 12/5/2015 6:13 AM, moses-support-requ...@mit.edu wrote:
> Date: Fri, 4 Dec 2015 23:13:10 +
> From: Ulrich Germann
> Subject: Re: [Moses-support] decoder
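
To illustrate Tom's suggestion, here is a minimal Python sketch that converts CRLF and bare CR line endings to LF before text is handed to the pipeline. The function name `normalize_newlines` is made up for this example and is not part of Moses.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first, so the CR of a CRLF pair is not turned
    # into a second, spurious line break by the next replace.
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    # Mixed Windows (CRLF), old Mac (CR) and POSIX (LF) line endings.
    sample = "first sentence .\r\nsecond sentence .\rthird sentence .\n"
    for line in normalize_newlines(sample).splitlines():
        print(line)
```

After this step every line ends with LF, one of the two markers reported above as extensively used.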
Hi Vincent,
as far as Moses is concerned, the end of a sentence is marked by whatever
the end-of-line marker is on the respective OS (Win: CRLF, Linux: LF, Mac:
CR, apparently). A period is treated as a plain old token. The purpose of
the sentence splitter that Kenneth mentioned is to tell Moses
Indeed, you should split sentences into separate lines. Here's the script:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/split-sentences.perl
Note that the script assumes you have placed tags in the text to
force sentence boundaries. It will not assume that existing
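
For illustration only, the idea of putting one sentence per line can be sketched in Python as below. This is not the logic of split-sentences.perl: the boundary rule (sentence-final punctuation, then whitespace, then an uppercase letter) and the name `split_sentences` are assumptions made for this example, and the rule will wrongly split after abbreviations such as "Mr.".

```python
import re
from typing import List

# Assumed, simplified boundary rule: '.', '!' or '?' followed by
# whitespace and an uppercase ASCII letter starts a new sentence.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(line: str) -> List[str]:
    """Split one line of text into a list of sentences."""
    return [part.strip() for part in _BOUNDARY.split(line) if part.strip()]


if __name__ == "__main__":
    text = "This is the first sentence. This is the second one."
    # Print one sentence per line, the unit Moses translates at a time.
    for sentence in split_sentences(text):
        print(sentence)
```

Run on untokenized text, before the tokenizer, so that each output line reaches the decoder as a separate translation unit.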
Well, that is not exactly my question. I know Moses translates one "line" at a
time, meaning a string ending with a line feed.
My question is rather: if the string contains a PERIOD (tokenized as
such) that separates the line into 2 "sentences", how does Moses behave?
Given my observation I have the feeling
I think you're asking if Moses translates one sentence at a time. The answer is
yes.
- John Burger
MITRE
> On Dec 4, 2015, at 04:43, Vincent Nguyen wrote:
>
> Actually I don't know if this is a decoder question or such.
>
> Here is my issue
>
> Let's say I have a text string with 2 sentenc
Actually I don't know if this is a decoder question or such.
Here is my issue
Let's say I have a text string with 2 sentences, with a period ending
the first sentence, but no CR+LF, just a space before the second sentence.
When I pass the full string to the pipeline:
tokenizer + truecaser + moses