Hi,

I’m currently working on a neural machine translator, but I am quite new to it
all. I am trying to tokenise my files on Linux using the following shell script
(https://github.com/JustCunn/IrishNMT/blob/master/GaeilgePrepare.sh) and these
files:

http://opus.nlpl.eu/download.php?f=EUbookshop/v2/moses/en-ga.txt.zip
http://opus.nlpl.eu/download.php?f=QED/v2.0a/moses/en-ga.txt.zip

But it just won’t work. Sometimes it skips tokenisation altogether, and other
times it just gets stuck on the ‘Tokenizer... number of threads...’ message.
For context, the inputs are all plain text files. Am I not formatting the text
correctly?

I’d appreciate it if someone could help me with this, as it would go a long way
towards my understanding of it all.

Thanks,
Justin
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
