In a message of Thu, 03 Dec 2015 19:17:51 +0000, Adam Funk writes:
>On 2015-12-03, Laura Creighton wrote:
>
>> In a message of Thu, 03 Dec 2015 15:12:15 +0000, Adam Funk writes:
>>>I'm having trouble with some input files that are almost all proper
>>>UTF-8 but with a couple of troublesome characters mixed in, which I'd
>>>like to ignore instead of throwing ValueError. I've found the
>>>openhook for the encoding
>>>
>>>for line in fileinput.input(options.files,
>>>                            openhook=fileinput.hook_encoded("utf-8")):
>>>    do_stuff(line)
>>>
>>>which the documentation describes as "a hook which opens each file
>>>with codecs.open(), using the given encoding to read the file", but
>>>I'd like codecs.open() to also have the errors='ignore' or
>>>errors='replace' effect. Is it possible to do this?
>>>
>>>Thanks.
>>
>> This should be both easy to add, and useful, and I happen to know that
>> fileinput is being hacked on by Serhiy Storchaka right now, who agrees
>> that this would be easy. So, with his approval, I stuck this into the
>> tracker. http://bugs.python.org/issue25788
>>
>> Future Pythons may not have the problem.
>
>Good to know, thanks.

Well, we have moved right along to 'You write the patch, Laura' so I can
pretty much guarantee that future Pythons won't have the problem. :)

Laura
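
For Pythons that predate that change, one possible workaround is a custom
openhook. What follows is only a sketch, assuming Python 3:
hook_encoded_with_errors and read_lines are hypothetical names, not part of
fileinput. It relies on the documented contract that an openhook is called
with (filename, mode) and returns an open file object.

```python
import fileinput


def hook_encoded_with_errors(encoding, errors="replace"):
    """Hypothetical helper: like fileinput.hook_encoded, plus an error policy."""
    def openhook(filename, mode):
        # fileinput calls the hook with (filename, mode). The built-in open()
        # accepts errors= directly, so undecodable bytes need not raise.
        return open(filename, mode, encoding=encoding, errors=errors)
    return openhook


def read_lines(files, encoding="utf-8", errors="replace"):
    """Hypothetical helper: collect every decoded line from the given files."""
    hook = hook_encoded_with_errors(encoding, errors)
    with fileinput.input(files, openhook=hook) as stream:
        return list(stream)
```

With errors="replace" each undecodable byte is turned into U+FFFD, and with
errors="ignore" it is dropped silently. As far as I know, the tracker issue
quoted above led to fileinput.hook_encoded() gaining an optional errors
argument in Python 3.6, so on those versions the helper should not be needed.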