The Moses tokenizer expects its input on standard input (stdin).

Hieu Hoang
http://statmt.org/hieu
On Sun, 12 Apr 2020 at 10:27, Justin Cunningham <just1br...@outlook.com> wrote:

> Hi,
>
> I’m currently working on a Neural Machine Translator but I am quite new to
> it all. I am trying to tokenise my files in Linux using the following shell
> script (https://github.com/JustCunn/IrishNMT/blob/master/GaeilgePrepare.sh)
> and these files:
>
> http://opus.nlpl.eu/download.php?f=EUbookshop/v2/moses/en-ga.txt.zip
> <http://opus.nlpl.eu/download.php?f=EUbookshop/v2/moses/de-fr.txt.zip>
> http://opus.nlpl.eu/download.php?f=QED/v2.0a/moses/en-ga.txt.zip
>
> But it just won’t work. Sometimes it will skip it, others it will just be
> stuck on the ‘Tokenizer... number of threads...’. For context, they are all
> plain text files. Am I not formatting the text correctly?
>
> I’d appreciate if someone could help me with this as it would be a huge
> help in my understanding of it all.
>
> Thanks,
> Justin
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support