Ah. I've rolled back Ken's change because I need it to work with Python 2.7.

I've set the PYTHONIOENCODING env variable in train-model.perl just before
the call to merge-alignment.py. That should work around Ken's problem for now.
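
For anyone who wants to do the same thing from a Python wrapper rather than
from Perl, here is a rough sketch of the idea. It is not the code in
train-model.perl; the helper names run_with_utf8_io and child_stdout_encoding
are made up for illustration, and the value "utf-8" is my assumption.

```python
import os
import subprocess
import sys


def run_with_utf8_io(cmd):
    """Run cmd with PYTHONIOENCODING set in a copy of the environment.

    Hypothetical helper: the value "utf-8" is an assumption, not taken
    from train-model.perl.
    """
    env = dict(os.environ)             # copy, so the parent env is untouched
    env["PYTHONIOENCODING"] = "utf-8"  # only affects Python child processes
    return subprocess.check_output(cmd, env=env)


def child_stdout_encoding():
    """Ask a child Python interpreter which encoding its stdout uses."""
    out = run_with_utf8_io(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"]
    )
    return out.decode("ascii").strip()
```

The variable only changes how the child interpreter encodes
stdin/stdout/stderr; files the script opens itself are not affected.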

https://github.com/moses-smt/mosesdecoder/commit/acd3ac964a7df646e15e3c4210853e7b70bebcbf
The better fix, though, is to add Rico's code to all the Python scripts.
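
Roughly, I'd expect each script to end up looking something like the sketch
below. It only illustrates the io.open suggestion; copy_utf8 is a made-up
helper name, not a function in any Moses script.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

import io


def copy_utf8(src_path, dst_path):
    """Copy a text file line by line, reading and writing UTF-8.

    Hypothetical helper for illustration only.
    """
    # io.open accepts an encoding argument on Python 2.6+ and on Python 3,
    # where it is the same function as the built-in open().
    with io.open(src_path, "r", encoding="utf-8") as src:
        with io.open(dst_path, "w", encoding="utf-8") as dst:
            for line in src:
                # line is a unicode string on both Python 2 and Python 3
                dst.write(line)
```

With this, file I/O no longer depends on the locale or on PYTHONIOENCODING;
stdin/stdout/stderr still need the wrapping from Rico's snippet on Python 2.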


On 14 November 2014 13:20, Rico Sennrich <rico.sennr...@gmx.ch> wrote:

> Hieu Hoang <Hieu.Hoang@...> writes:
>
> > Ken - should we add encoding on open to all Python scripts, rather than
> > setting the PYTHONIOENCODING env variable? That's basically what happens
> > with the Perl scripts.
> >
> > What Python/Linux version are you using? I don't see it on my version
> > (Python 2.7.3, Ubuntu 12.04)
>
> Hi all,
>
> It's kinda tricky to have consistent encoding between Python 2.X and Python
> 3. The patch to merge_alignment.py will fail under 2.X. I suggest using
> io.open instead, which works with all versions from 2.6 up. And if any
> string processing is done, I suggest using 'from __future__ import
> unicode_literals' to ensure that all string literals are interpreted as
> unicode, and making sure that all input/output is UTF-8 (including
> stdin/stdout/stderr). I usually do this with the following code block:
>
> import sys
> import codecs
> # Python 2 only: wrap the standard streams so they read and write UTF-8
> if sys.version_info < (3,0,0):
>   sys.stdin = codecs.getreader('UTF-8')(sys.stdin)
>   sys.stdout = codecs.getwriter('UTF-8')(sys.stdout)
>   sys.stderr = codecs.getwriter('UTF-8')(sys.stderr)
>
> best,
> Rico
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
>



-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
