Re: A million reasons why Encoding was a mistake

Nikolai Weibull Tue, 22 May 2012 07:29:03 -0700

On Tue, May 22, 2012 at 9:37 AM, Brian Candler <[email protected]> wrote:
> Austin Ziegler wrote in post #1061436:
>> This is *not* a Ruby problem, this is a *data* problem.


> (because ruby 1.9 sorts by byte
> ordering, which happens to work for UTF8 but not all other encodings of
> Unicode).

Only if you want to sort by code point.
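
To make that concrete, here is a minimal sketch (the sample characters
and variable names are my own picks, not anything from the thread).
It assumes String#<=> compares raw bytes, which is what the quoted
text describes.  Under that assumption, sorting UTF-8 strings gives
code point order, and sorting the same strings as UTF-16BE does not
once a character outside the BMP is involved:

```ruby
# Sample characters, chosen only for illustration:
# U+FF5E lies inside the BMP, U+1F600 lies outside it.
bmp    = "\u{FF5E}"
astral = "\u{1F600}"

# UTF-8: EF BD 9E vs. F0 9F 98 80, so byte order matches code point order.
utf8_sorted = [astral, bmp].sort

# UTF-16BE: U+1F600 becomes the surrogate pair D8 3D DE 00, which sorts
# before FF 5E byte-wise even though its code point is larger.
utf16_sorted = [astral, bmp].map { |s| s.encode("UTF-16BE") }.sort

# Explicit code point ordering, independent of the encoding used.
by_codepoint = [astral, bmp].sort_by { |s| s.codepoints.to_a }

# Code point order is not dictionary order: "Z" < "a" < "z" < "é".
words = ["\u00E9", "z", "Z", "a"].sort
```

So "works" here only means "matches code point order"; the last line
shows it is still not what a human would call alphabetical.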

> Some people see ruby 1.9's highly complex encoding implementation as a
> triumph of engineering; I see it as design smell.

Those must be some easily impressed people.  It’s not a space rocket
(which, by the way, is designed and built by aerospace engineers, not
“rocket scientists”) or (every programmer’s favorite, it seems) a
bridge.

>> Matz and others have worked very hard to make sure that Ruby 1.9 works
>> well if you follow certain rules regarding your inputs and outputs.
>
> ... which one has to absorb by osmosis. Certainly the core API docs
> don't give these rules; in fact they give precious little about the
> encoding semantics of String. And you can't get much more of a core part
> of the language than String.

Completely agree.  This is a complex matter and should be treated as
such by the documentation.  Glossing over it in the documentation only
strengthens the belief that you don’t have to know or care about
encodings.

> Of course, because every String is now two-dimensional (x = sequence of
> bytes, y = Encoding) there is a much higher requirement to document
> every method which acts on a string or returns a string, because
> there is a much larger variety of inputs and outputs to consider.

Well, you should design your APIs to ignore these dimensions.  Oh, and
always return UTF-8.
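
One way to read "always return UTF-8" is to transcode at the API
boundary.  The following is only a sketch: to_utf8 is a hypothetical
helper name, not part of Ruby's core API, and it assumes the incoming
string carries a correct encoding tag.

```ruby
# Hypothetical helper (not a core method): transcode whatever comes in
# to UTF-8 so callers never have to look at the encoding dimension.
def to_utf8(str)
  str.encode(Encoding::UTF_8)
end

# "café" as ISO-8859-1 bytes, built from raw bytes for illustration.
latin1 = [0x63, 0x61, 0x66, 0xE9].pack("C*").force_encoding(Encoding::ISO_8859_1)

result = to_utf8(latin1)
result.encoding   # Encoding::UTF_8
result.bytes.to_a # [99, 97, 102, 195, 169]
```

Note that this still depends on the input being tagged correctly; a
binary (ASCII-8BIT) string containing high bytes will raise
Encoding::UndefinedConversionError rather than be guessed at.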
