>UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc5' in position 61: surrogates not allowed
Does this indicate that I am reading the filenames in a different encoding than the one they are actually stored in?

What if I try to use bytes for the path specifications, and have Python encode them as 'utf-8'?

    fullpaths.add(os.path.join(root, fullpath).encode('utf-8'))

As Michael said, encoding is the process in which you take Unicode characters and convert them to a byte stream using some charset (utf8 here). Will this work?
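For what it is worth, here is a small sketch of why a plain .encode('utf-8') is likely to raise the same error again. It assumes the filenames came from os.walk() on a POSIX system, where Python decodes undecodable filename bytes with the 'surrogateescape' error handler (PEP 383); the lone surrogate '\udcc5' is then a stand-in for the raw byte 0xC5. The names raw_name and to_bytes are made up for illustration and are not from the original script.

```python
# Hypothetical filename holding the raw byte 0xC5, which is not valid UTF-8 here.
raw_name = b"caf\xc5.txt"

# With surrogateescape, the undecodable byte becomes the lone surrogate U+DCC5.
name = raw_name.decode("utf-8", errors="surrogateescape")


def to_bytes(path: str) -> bytes:
    """Encode a path back to bytes, restoring any escaped raw bytes."""
    return path.encode("utf-8", errors="surrogateescape")


# A strict encode rejects the surrogate, as in the traceback quoted above.
try:
    name.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Encoding with the same error handler gives back the original bytes.
round_tripped = to_bytes(name)
```

So the str is fine for opening the file, but it cannot be strictly encoded or printed as UTF-8. If bytes are wanted for the set, encoding with errors='surrogateescape' (or calling os.fsencode() on POSIX) is one way to get back exactly the bytes that are on disk.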