Nick Coghlan writes: > One idea I had along those lines is a surrogatereplace error handler ( > http://bugs.python.org/issue22016) that emitted an ASCII question mark for > each smuggled byte, rather than propagating the encoding problem.
Please, don't. "Smuggled bytes" are not independent events. They tend to be correlated *within* file names, and this handler would generate names whose human semantics get lost (and there *are* human semantics, otherwise the name would be str(some_counter)). They tend to be correlated across file names, and this handler will generate multiple files with the same munged name (and again, the differentiating human semantics get lost). If you don't know the semantics of the intended file names, you can't generate good replacement names. This has to be an application-level function, and often requires user intervention to get good names. If you want to provide helper functions that applications can use to clean names explicitly, that might be OK. _______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com