On 08-05-2010 02:34, Michael Koziarski wrote:
On Sat, May 8, 2010 at 12:03 PM, Rodrigo Rosenfeld Rosas
<[email protected]>  wrote:
Is there any approach currently used for getting the Ruby 1.8/Rails 2.3.5
behavior in Ruby 1.9?

This is important for virtually any non-English application... Are there any
plans for integrating some library to achieve the same results that Rails
currently supports?
My understanding is that Ruby 1.9 is meant to support all these
operations internally. Our mb_chars functionality was only ever
intended as a stop-gap until Ruby itself could do native multi-byte
aware string operations. So what you're seeing are bugs in Ruby which
should be fixed there; we probably shouldn't be maintaining a second
multi-byte aware library.



Please take a look at the documentation for String#upcase:

http://ruby-doc.org/ruby-1.9/classes/String.html#M000593

"Returns a copy of str with all lowercase letters replaced with their uppercase counterparts. The operation is locale insensitive---*only characters 'a' to 'z' are affected*. Note: case replacement is effective only in ASCII region."

It doesn't seem Ruby 1.9 will change this behavior, so Rails should keep using its proxy approach until Ruby supports this itself.

My guess is that mb_chars should be set on Rails initialization with something like:

def mb_chars
  self
end

String.send :include, StringMultiBytePatch unless 'ação'.upcase == 'AÇÃO'

Of course this is not the real code, just a suggestion of an approach... The StringMultiBytePatch module would override mb_chars to use the ActiveSupport::Multibyte::Chars proxy, as noted by Norman Clarke.
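
To make the idea a bit more concrete, here is a minimal sketch that runs without Rails. It is only an illustration under assumptions: SketchCharsProxy is a hypothetical stand-in for ActiveSupport::Multibyte::Chars that knows just a handful of accented Latin letters, and StringNativeMultiByte is a made-up name for the pass-through variant. Both variants live in modules and one is chosen at load time, because a method defined directly in String would take precedence over one coming from an included module.

```ruby
# encoding: utf-8

# Hypothetical stand-in for ActiveSupport::Multibyte::Chars so that the
# sketch runs without Rails. It only covers a few accented Latin letters.
class SketchCharsProxy
  LOWER = 'àáâãäçéêíóôõöúü'
  UPPER = 'ÀÁÂÃÄÇÉÊÍÓÔÕÖÚÜ'

  def initialize(string)
    @string = string
  end

  # Let Ruby handle whatever it can, then map the remaining accented letters.
  def upcase
    self.class.new(@string.upcase.tr(LOWER, UPPER))
  end

  def downcase
    self.class.new(@string.downcase.tr(UPPER, LOWER))
  end

  def to_s
    @string
  end
end

# Used when String#upcase is NOT multi-byte aware: wrap the string in the proxy.
module StringMultiBytePatch
  def mb_chars
    SketchCharsProxy.new(self)
  end
end

# Used when Ruby already handles multi-byte case mapping: no proxy needed.
# (Hypothetical module name, not part of Rails.)
module StringNativeMultiByte
  def mb_chars
    self
  end
end

# Feature detection at load time: probe the running interpreter once.
native_case_mapping = ('ação'.upcase == 'AÇÃO')
String.send :include,
            (native_case_mapping ? StringNativeMultiByte : StringMultiBytePatch)

puts 'ação'.mb_chars.upcase.to_s  # AÇÃO
```

Callers keep writing string.mb_chars.upcase either way and never need to know which variant was picked. Note that this only works as written if String does not already define mb_chars itself; with ActiveSupport loaded, the existing method would have to be redefined rather than shadowed by an include.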

Please, see also this thread from 2008:
http://old.nabble.com/String-upcase-downcase-with-UTF-8-strings-in-Ruby-1.9-td18372062.html

---
|In Ruby 1.9 I get the following behaviour:
|
|>> "aoueäöüé".upcase
|=> "AOUEäöüé"
|>> "AOUEÄÖÜÉ".downcase
|=> "aoueÄÖÜÉ"
|
|I can't, however, find a bug in the bug tracking system.
|Doesn't this qualify as a bug?

The document for String#upcase says:

call-seq:
str.upcase => new_str

Returns a copy of str with all lowercase letters replaced with their
uppercase counterparts. The operation is locale insensitive---only
characters ``a'' to ``z'' are affected.
Note: case replacement is effective only in ASCII region.

"hEllO".upcase #=> "HELLO"

See "Note:". Tim Bray has persuaded me to do so, since case
conversion outside of ASCII region is highly dependent on country,
language, culture and script.

matz.
---

So Matz doesn't seem to consider this a bug, and he probably won't change this behavior in Ruby 1.9...

So, don't you think we should continue supporting mb_chars as before?

Best regards,

Rodrigo.

