Re: BOM and principle of least surprise

Paul Hoffman Wed, 05 May 2004 10:16:54 -0700

At 7:47 PM +0300 5/5/04, Jarkko Hietaniemi wrote:

>> My hope for fewer options is for reading input. That is, I'd like the
>> default encoding for all inputs and outputs to be UTF8, unless it has [...]
>
> We tried this with perl 5.8.0 and the feedback was overwhelmingly
> negative...  if people do "print chr 0xff" they do expect one byte,
> not two.
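
For anyone who wants to check the one-byte versus two-byte difference mentioned above, here is a minimal sketch. The helper name bytes_written is made up for illustration; it prints to an in-memory filehandle so the byte count can be measured with length. This is only a way of observing the behaviour, not a claim about how 5.8.0 was implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: print $string through a handle opened with the
# given I/O layer and report how many bytes ended up in the buffer.
sub bytes_written {
    my ($layer, $string) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer)
        or die "cannot open in-memory handle: $!";
    print {$fh} $string;
    close($fh) or die "close failed: $!";
    return length $buffer;
}

my $raw  = bytes_written(':raw', chr 0xff);              # byte-oriented output
my $utf8 = bytes_written(':encoding(UTF-8)', chr 0xff);  # UTF-8 encoded output

print "raw: $raw byte(s), UTF-8 layer: $utf8 byte(s)\n";
# prints "raw: 1 byte(s), UTF-8 layer: 2 byte(s)"
```

With a UTF-8 layer the character U+00FF is written as the two bytes 0xC3 0xBF, which is the surprise described in the quoted reply.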

Of course, but that's not the only way to have a single encoding for input and output. For example, "use utf8" (or some newly-named equivalent) could have effects on all file input and output in the scope. A different method would be to have all input and output be binary, but to have standardized operator overload systems (this is more cumbersome than the first suggestion).
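
As a rough illustration of a scoped default, the existing "open" pragma already sets default I/O layers for open() calls within a lexical scope. The sketch below uses it as a stand-in for the first suggestion above; it is not the proposed "use utf8" behaviour (today "use utf8" only declares the encoding of the source code). The helper name scoped_write_size and the scratch file name are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: write one character inside a scope that has a
# default UTF-8 layer, then report how many bytes reached the file.
sub scoped_write_size {
    # Scratch file name, an assumption for this sketch.
    my $path = sprintf("use_open_demo.%d.tmp", $$);
    {
        # Every open() in this block without an explicit layer gets UTF-8.
        use open IO => ':encoding(UTF-8)';
        open(my $out, '>', $path) or die "cannot write $path: $!";
        print {$out} chr 0xff;    # no per-handle encoding was requested
        close($out) or die "close failed: $!";
    }
    # Outside the block the pragma no longer applies; -s gives the size in bytes.
    my $size = -s $path;
    unlink $path;
    return $size;
}

print "file size: ", scoped_write_size(), " byte(s)\n";
# prints "file size: 2 byte(s)"
```

The point of the sketch is that the encoding is chosen once for the scope, so individual open() and print calls do not need to mention it.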

What I don't want (and what we mostly have now) is a language where the programmer has to remember to ask "what encoding am I using" for every input or output command. If I have a text processing program, it is likely that all input and output will be in my chosen encoding; those that aren't need to be read/written using different tools (such as subroutines that have the new encoding specified for their scope).
