On Thu, 16 Jan 2014 14:47:00 +1100, Ben Finney wrote:

> Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> writes:
>
>> enc = guess_encoding_from_bom("filename")
>> if enc == something:
>>     # Can't guess, fall back on an alternative strategy
>>     ...
>> else:
>>     f = open("filename", encoding=enc)
>>
>>
>> If I forget to check the returned result, I should get an explicit
>> failure as soon as I try to use it, rather than silently returning the
>> wrong results.
>
> Yes, agreed.
>
>> What should I return as the default default? I have four possibilities:
>>
>> (1) 'undefined', which is a standard encoding guaranteed to
>> raise an exception when used;
>
> +0.5. This describes the outcome of the guess.
>
>> (2) 'unknown', which best describes the result, and currently
>> there is no encoding with that name;
>
> +0. This *better* describes the outcome, but I don't think adding a new
> name is needed nor very helpful.

And there is a chance -- albeit a small chance -- that someday the std
lib will gain an encoding called "unknown".

>> (4) Don't return anything, but raise an exception. (But
>> which exception?)
>
> +1. I'd like a custom exception class, sub-classed from ValueError.

Why ValueError? It's not really an "invalid value" error, it's more of a
"my heuristic isn't good enough" failure. (Maybe the file starts with
another sort of BOM which I don't know about.)

If I go with an exception, I'd choose RuntimeError, or a custom error
that inherits directly from Exception.

Thanks to everyone for the feedback.

-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list
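
For illustration only, option (4) with a custom error inheriting directly
from Exception might look roughly like the sketch below. The names
EncodingGuessError and read_text, the BOM table, and the choice of codec
names returned are all assumptions made for this example, not something
settled in the thread.

```python
import codecs


class EncodingGuessError(Exception):
    """Hypothetical error: the BOM heuristic could not identify the file."""


# Longer BOMs are checked first, because the UTF-32-LE BOM starts with
# the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(filename):
    """Return a codec name based on the file's BOM, or raise."""
    with open(filename, "rb") as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    # No sentinel value is returned, so a caller who forgets to handle
    # the failure gets an explicit exception instead of wrong results.
    raise EncodingGuessError("no recognised BOM in %r" % (filename,))


def read_text(filename, fallback="utf-8"):
    """Assumed helper: read a file, using 'fallback' if the guess fails."""
    try:
        enc = guess_encoding_from_bom(filename)
    except EncodingGuessError:
        enc = fallback  # alternative strategy when the BOM is missing
    with open(filename, encoding=enc) as f:
        return f.read()
```

With this shape, code that forgets the try/except fails at the call to
guess_encoding_from_bom itself, rather than later at open().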