Paulo da Silva wrote:

> I am using a python3 script to produce a bash script from lots of
> filenames got using os.walk.
>
> I have a template string for each bash command in which I replace a
> special string with the filename and then write the command to the bash
> script file.
>
> Something like this:
>
> shf=open(bashfilename,'w')
> filenames=getfilenames() # uses os.walk
> for fn in filenames:
>     ...
>     cmd=templ.replace("<fn>",fn)
>     shf.write(cmd)
>
> For certain filenames I got a UnicodeEncodeError exception at
> shf.write(cmd)!
> I use utf-8 and have # -*- coding: utf-8 -*- in the source .py.
>
> How can I fix this?
>
> Thanks for any help/comments.

You make it harder to debug your problem by not giving the complete
traceback. If the error message contains 'surrogates not allowed', as in
the demo below

>>> with open("tmp.txt", "w") as f:
...     f.write("\udcef")
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcef' in position 0: surrogates not allowed

you have filenames on your hard disk that are not valid UTF-8. On POSIX
systems Python decodes such names with the surrogateescape error handler,
so each undecodable byte turns into a lone surrogate like '\udcef', and
the strict UTF-8 encoder used by the text file refuses to write it.

A possible fix would be to use bytes instead of str. For that you need to
open `bashfilename` in binary mode ("wb") and pass bytes to the os.walk()
call; the template and the "<fn>" placeholder then have to be bytes as
well. Or you just go and fix the offending names.
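
A minimal sketch of the bytes approach, assuming a POSIX system. The
names TEMPLATE and write_script() and the "echo" command are made up for
illustration; only the "<fn>" placeholder, os.walk() and the "wb" mode
come from the posts above. Because os.walk() is given a bytes path, it
yields bytes names, and nothing is ever decoded or encoded:

```python
import os

# Hypothetical template; b"<fn>" stands in for the "<fn>" placeholder
# from the original post. Everything is bytes, not str.
TEMPLATE = b"echo <fn>\n"


def write_script(top: bytes, script_path: str) -> int:
    """Write one command per file found under `top`.

    `top` must be bytes so that os.walk() returns bytes names.
    Returns the number of commands written.
    """
    count = 0
    # Binary mode: the raw filename bytes are written unchanged.
    with open(script_path, "wb") as shf:
        for dirpath, _dirnames, filenames in os.walk(top):
            for name in filenames:
                path = os.path.join(dirpath, name)  # bytes + bytes
                shf.write(TEMPLATE.replace(b"<fn>", path))
                count += 1
    return count
```

Called as, say, write_script(b".", "script.sh"), or with
os.fsencode(some_str_path) if the start directory is already a str.
If you would rather keep str everywhere, opening the output file with
open(bashfilename, "w", encoding="utf-8", errors="surrogateescape")
should also write the original bytes back, but I have not tried that
against your data.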