Czarek,

Thanks so much for this detailed email. I think it brings the issues people
have been having into sharp relief. For some background, Ruby 1.9 has
historically had a number of issues that were troublesome to Rails. In at
least one case (Ruby 1.9 changed constant lookup in a very confusing and
backward incompatible way), we were able to lobby ruby-core to backtrack on
their decision.

A very useful thing for us to determine here would be what we could do in
ruby-core to simplify this issue. One possible solution we've discussed has
been to enable (and use) a default_source_encoding, which would (at the very
least) pick up the default language from the environment.

I have some more comments inline.

Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325


On Sun, Apr 18, 2010 at 2:10 AM, Czarek <cezary.bagin...@gmail.com> wrote:

> On Sat, Apr 10, 2010 at 09:15:54PM +0200, Mislav Marohnić wrote:
> > >
> > >  - since no one seriously considers ruby 1.9 ready for production,
> > >    nobody is going to spend time merging patches for 1.9 encoding
> > >    support, so sending patches is a waste of time
> >
> >
> > All the "points" you listed basically just repeat what you stated here in
> > your last observation.
> >
> > Sending (quality) patches is never a waste of time. Patience is a virtue.
>
> I completely agree with both statements.
>
> And I also think that expecting people to prepare wonderful patches for
> rails 2.3.5 (released almost 5 months ago AFAIR) without some
> encouragement or directions would be a little too much. The reasons:
>

We are still maintaining Rails 2.3.5, and will continue to do so for the
near future. Patches that add features to 2.3.x will probably be met with
serious scrutiny after 3.0 is final, but patches that fix bugs on any
supported version of Ruby (including 1.9.2, once it's released) will
continue to be considered.


>  1. Debugging which source file or line of code (part of rails or
>  not) emits an ASCII-8BIT string is very time consuming (since the
>  point of failure is very far from the cause). Without this, it is
>  difficult to determine if it already has a LH ticket or not.
>

Yes. This blows. Again, I think this comes down to a poor choice for default
source encoding (ASCII-8BIT). In my opinion, ruby-core should make the
default source encoding UTF-8. If this causes backward compatibility issues,
they should be handled in the Ruby code that introduces the issues, and
allowing the user to change the default source encoding would probably be
helpful as well.
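
To make the distinction concrete, here is a minimal sketch (mine, for
illustration only) of how a per-file magic comment and the encoding tag on
a String relate. The sample string is arbitrary; the only assumption is a
Ruby that supports encodings (1.9 or later).

```ruby
# encoding: utf-8
# (The magic comment only counts on the first line of a file, or on the
# second line after a shebang.)

# A \u escape always yields a UTF-8 string, whatever the source encoding.
utf8 = "za\u017C\u00F3\u0142\u0107"

# Same bytes, retagged as binary. Nothing is converted; only the tag
# changes, and with it what Ruby considers a "character".
binary = utf8.dup.force_encoding("ASCII-8BIT")

puts "source encoding: #{__ENCODING__}"
puts "utf8:   #{utf8.encoding}, #{utf8.length} chars"
puts "binary: #{binary.encoding}, #{binary.length} chars"
```

The two strings hold identical bytes but report different lengths, which
is why a wrong tag is so hard to spot by looking at the data alone.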


>  2. There are already many 1.9 tickets present in 2.3.5 with no
>  applicable 'solutions'- to list just some I have been bitten by
>  already, or stumbled upon when searching for existing
>  patches/duplicates:
>
>
> https://rails.lighthouseapp.com/projects/8994/tickets/1988-make-utf8-partial-rendering-from-within-a-content_for-work-in-ruby19
>
> https://rails.lighthouseapp.com/projects/8994/tickets/2188-i18n-fails-with-multibyte-strings-in-ruby-19-similar-to-2038
>
> https://rails.lighthouseapp.com/projects/8994/tickets/2476-ascii-8bit-encoding-of-query-results-in-rails-232-and-ruby-191
>
> https://rails.lighthouseapp.com/projects/8994/tickets/3331-patch-block-invalid-chars-to-come-in-rails-app
>
> https://rails.lighthouseapp.com/projects/8994/tickets/3392-rackinput-requires-ascii-8bit-encoded-stringio
>    https://rails.lighthouseapp.com/projects/8994/tickets/3941
>
> https://rails.lighthouseapp.com/projects/8994/tickets/4336-ruby19-submitted-string-form-parameters-with-non-ascii-characters-cause-encoding-errors


As Jeremy said, this entire process is far too error-prone. We need to work
with ruby-core, before they release 1.9.2, to create a solution that doesn't
introduce this sort of problem. In my opinion (and you can quote me on
this), 1.9.x is DOA until this problem is addressed in a way that does not
lead to the sorts of tickets you showed above.


>  3. 1.8.7 is recommended for Rails. That is ok. But although the
>  2.3.5 release notes mention 1.9, they don't state anything about
>  potential UTF-8 problems with Ruby 1.9 (except for people's
>  comments), nor do they suggest what to do with such problems (e.g.
>  'wait until X', 'we are waiting for patches', 'send test cases',
>  'use 1.8.7', 'try -KU option', 'you are on your own unless you only
>  use en_us'). And there is also no mention of how to report issues
>  effectively or which commit to use to avoid reporting something
>  already on LH.
>

I agree. I'd also point out that in the past year, attempting to maintain
compatibility with 1.9.x has been extremely frustrating for Rails. In
addition to feature problems (encodings, constant lookup), we've been met
with repeated segfaults in both 1.9.1 and 1.9.2-*. Tracking down segfaults
is tricky, and while rails-core needs to attempt to keep up with 1.9.2-head,
you as a user should not be using a version of Ruby that is known to
segfault in pure-ruby code. To be clear, you may have never encountered any
segfaults, but we encounter them often when running the Rails test suite.
Note that Rails itself is pure Ruby, and the problems we have had are
invariably reproducible without any C extensions.


>  4. When using a combination of software (cucumber, webrat, rspec) it
>  may be *very* time consuming to even determine which gem is the
>  cause of the problem and which ones just send the problem further
>  down the call stack.
>

Indeed. This is why the whack-a-mole solution is unacceptable. At this
point, we've clearly demonstrated that the basic strategy of making String
literals in Ruby source files 8-bit-ASCII, and providing no mechanism to
change that (except file-by-file magic comments), is too unwieldy.
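
For anyone who hasn't hit this yet, a small sketch of the failure mode,
assuming nothing beyond core Ruby 1.9+ (the strings are made up). Mixing
a binary-tagged string with UTF-8 works as long as one side is pure ASCII,
and only blows up once both sides carry non-ASCII bytes:

```ruby
utf8   = "caf\u00E9"                                   # UTF-8, non-ASCII
binary = "\u0142".dup.force_encoding("ASCII-8BIT")     # binary, non-ASCII

# ASCII-only data hides the problem: this concatenation succeeds, and the
# result quietly takes on the binary tag.
ok = "abc" + binary
puts "ascii + binary -> #{ok.encoding}"

# Two non-ASCII strings with different tags cannot be combined.
error = nil
begin
  utf8 + binary
rescue Encoding::CompatibilityError => e
  error = e
end
puts "utf8 + binary  -> #{error.class}: #{error.message}"
```

The exception is raised at the point of concatenation, which can be many
frames away from wherever the binary string was created.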


>  5. It is unreasonable to expect people to not try Rails with Ruby
>  1.9 (even if by accident) and the worst thing is that it *seems* to
>  work, until UTF8 characters are used somewhere (template, db, etc).
>  No warning is given if Ruby 1.9 is used. So the natural thing to
>  assume when something breaks is that one's setup is wrong. Which is
>  true - it's using Ruby 1.9 in the first place.
>

I agree. That said, I would personally *not* run a production Rails
application on Ruby 1.9.x until 1.9.2 is released and all known issues
(especially the segfaults I mentioned above) are resolved. One thing that
would make me feel more comfortable would be if ruby-core ran the Rails test
suite against 1.9.2-head. I know they're not obligated to do so, but it
would make the process significantly more robust. Rails core (and
specifically Carl and I) would happily invest whatever time needed to help
the Ruby core team get (and stay) up and running with the Rails suite.


>  6. Although I don't want absolute morons to use Rails, having no
>  'fail-safe' or warning will just scare good developers from Rails
>  just wanting to try out the framework, even if the issues are not
>  Rails bugs. There is no 'recommended' set of patches to apply and
>  test before reporting bugs with Ruby 1.9.
>

Agreed. And to be clear, I don't see any reason that someone who's using PHP
today shouldn't be able to use Rails tomorrow.


>  7. Most of the solutions you find for encoding problems with ROR and
>  Ruby 1.9 do not suggest the following: stick with 1.8, because
>  1.9 with Rails is a can of worms in this regard.
>

That is the recommended solution.


> I was wondering if this isn't really something more suitable for
> ruby-core: it would be nice to know where the string causing the error
> was created and why a given encoding was selected. This could at least
> provide bug reports with better details regarding the root cause.
>

Tracking the origin of every String might be expensive. Perhaps a debug mode
that did this would be helpful. That said, as I mentioned above, I don't
believe that ASCII-8BIT is a good default for source files.
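
As a rough illustration of what an opt-in version could look like in plain
Ruby (this is not an existing Ruby or Rails feature; tag_origin and
origin_of are hypothetical names), a library could tag the strings it
hands out with the place they came from:

```ruby
# Hypothetical helpers: remember where a string was tagged so that a later
# encoding error can point back to its source. Frozen strings cannot be
# tagged this way.
def tag_origin(string)
  string.instance_variable_set(:@origin, caller(1).first)
  string
end

def origin_of(string)
  string.instance_variable_get(:@origin)
end

body = tag_origin("payload".dup.force_encoding("ASCII-8BIT"))
puts "#{body.encoding} string tagged at #{origin_of(body)}"
```

This only covers strings that are tagged explicitly, so it is far from the
interpreter-level tracking you describe, but it is cheap enough to leave
on in a test run.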


> I am really not the brightest developer out there and I apologize for
> not being able to propose something more useful than just stating
> obvious problems.
>

Your ability to clearly articulate the problems puts you head and shoulders
above most developers. Thank you very much for your efforts in clearly
outlining the issues.


> My question is: how can I help in a meaningful way that isn't a
> complete waste of my time and that isn't a duplication of other
> people's efforts?
>
> Since patches are never a waste of time, I propose the following:
>
> My first patch would be a warning about using Ruby 1.9 with Rails. To
> save people grief when they install Ruby 1.9 as their default.
>

That seems good. Would it be a warning in the initial Rails boot check (the
one that blocks running Rails with 1.8.6 and below)? That seems like the
right place to me. We should perhaps have a more expansive explanation of
the issues with 1.9 and encodings (possibly a guide) that we could link to.
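
Something along these lines, as a sketch only. This is not the actual
Rails boot code; ruby_version_check and the message wording are
placeholders I made up to show the shape of the check:

```ruby
# Hypothetical check: block Rubies older than 1.8.7, warn on 1.9.x,
# and stay quiet otherwise.
def ruby_version_check(version = RUBY_VERSION)
  parts = version.split(".").first(3).map { |part| part.to_i }
  if (parts <=> [1, 8, 7]) < 0
    :block
  elsif parts.first(2) == [1, 9]
    :warn
  else
    :ok
  end
end

case ruby_version_check
when :block
  abort "This version of Ruby is too old to run Rails."
when :warn
  warn "Ruby 1.9 has known encoding issues with Rails; see the guide."
end
```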


> My second patch is to rescue an exception in concat (output_safety),
> work around it with force_encoding if it is sane and issue a warning.
> Just to try to help solve other issues that just *seem* related.
>

I'd want to see a log warning, in red, not just a Ruby warning that could be
hidden. I'd like to discuss applying this solution to master as well. Would
you mind hitting me up on GTalk (wyc...@gmail.com)?
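
To make sure we mean the same thing, here is how I read the proposal, as a
sketch. safe_concat is a made-up name and this is not the real
output_safety code; "sane" is assumed to mean "the bytes are valid in the
buffer's encoding":

```ruby
require "logger"
require "stringio"

# Hypothetical helper: append fragment to buffer, retagging the fragment
# if the only problem is its encoding tag.
def safe_concat(buffer, fragment, logger)
  buffer << fragment
rescue Encoding::CompatibilityError
  retagged = fragment.dup.force_encoding(buffer.encoding)
  raise unless retagged.valid_encoding?   # not sane: re-raise the original
  # ANSI escape codes make the message red in a terminal.
  logger.warn("\e[31mretagged #{fragment.encoding} string as " \
              "#{buffer.encoding}\e[0m")
  buffer << retagged
end

log    = StringIO.new
logger = Logger.new(log)

buffer   = "za\u017C".dup
fragment = "\u00F3\u0142".dup.force_encoding("ASCII-8BIT")
safe_concat(buffer, fragment, logger)

puts buffer.encoding
puts log.string
```

Bytes that are not valid in the buffer's encoding still raise, so the
workaround never produces a corrupt string silently.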


> Then I would put my efforts into discussing the issue on ruby-core
> if it would be possible to add location info (and reason for selected
> encoding: env, locale, magic, param, etc) for string creation on a
> test version of Rails - this may save many tens of thousands of man
> hours that would be wasted on debugging and help in the adoption of
> not only Ruby 1.9, but in good practices regarding supporting non-US
> languages in other gems.
>

I agree entirely. I will be happy to help lobby for these (or related)
changes. Do you think it makes sense to change the default source encoding?


> Then I would build a special version of Ruby that warns whenever a
> string is not created as UTF-8 and isn't explicitly created as ASCII,
> fork Rails and start adding test cases.
>

I'd love to help you with whichever of these efforts you think my assistance
would be valuable in. Again, please ping me.


> Would this really be the best approach?
>

It sounds like you're on the right track :)


>
>
> Thanks in advance.
>

Again, thanks for your efforts here. It's too easy to get angry, post a
rant, and just leave entirely (or privately seethe). Your post here is a
model example of how I would personally like people to express their
concerns about serious problems that seem to remain unaddressed (or
underaddressed).


>
> --
> Cezary Baginski
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
>
> iEYEARECAAYFAkvKol4ACgkQgEYXSknSpI/pwwCeJvvSmCk1D+/wFqQ8Bs+RdcUx
> 0QUAoKrwtaZGixkGuFD8P+g3QnzbMka6
> =50Ik
> -----END PGP SIGNATURE-----
>
>

-- 
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Core" group.
To post to this group, send email to rubyonrails-c...@googlegroups.com.
To unsubscribe from this group, send email to 
rubyonrails-core+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-core?hl=en.
