On Thu, Jul 17, 2014 at 1:48 AM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> it is dangerous to assume that the file formats agree with
> the locale.

Of course. You never assume anything about encodings. What you do is expect a particular encoding, and either throw an error if the data doesn't match, or figure out some other encoding to use. With anything that you broadly control (e.g. if your program is configured by a file in /etc that nothing else uses), you just decode with whatever you document your program as using, and any failure is *not your problem*.

It's that simple. You don't replace /etc/passwd with a JPEG-encoded photograph of your family tree and expect all your family to be able to log in; no more should you expect a file to be parsed correctly if it's meant to be UTF-8 and you save it in ISO-8859-4. The two cases are equally ridiculous.

The only thing that might be an issue is that you can't just use open(fn) to read your files; you have to explicitly state the encoding. Forgetting that would be an understandable mistake, especially for someone who develops on a single platform and forgets that the default encoding differs between platforms. As long as you always explicitly say encoding="utf-8", and document that you do so, any problems are someone else's.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
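
A minimal sketch of the point above, assuming Python 3 and a config file that the program documents as UTF-8. The names read_config and demo, the file names, and the file contents are all hypothetical and only there for illustration. It writes the same text once as UTF-8 and once as ISO-8859-4, then reads both with an explicit encoding="utf-8":

```python
import os
import tempfile


def read_config(path):
    """Return the text of a file that is documented to be UTF-8."""
    # Explicit encoding: the result does not depend on the platform or locale default.
    with open(path, encoding="utf-8") as f:
        return f.read()


def demo():
    """Write one correct and one wrongly encoded file, then try to read both."""
    text = "greeting = Hyvää päivää\n"  # hypothetical file contents
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name, encoding in [("good.conf", "utf-8"), ("bad.conf", "iso-8859-4")]:
            path = os.path.join(tmp, name)
            with open(path, "w", encoding=encoding) as f:
                f.write(text)
            try:
                results[name] = read_config(path)
            except UnicodeDecodeError as exc:
                # The file breaks the documented contract; report it, don't guess.
                results[name] = "error: {}".format(exc.reason)
    return results


if __name__ == "__main__":
    for name, outcome in demo().items():
        print(name, "->", outcome.strip())
```

The wrongly encoded file fails loudly with UnicodeDecodeError instead of being silently misread, which is the "throw an error if it's wrong" half of the argument. Falling back to some other encoding would be a separate design decision that this sketch does not attempt.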