[Zope-dev] Unicode treatment in 2.6b1

2002-09-26 Thread Florent Guillaume

Andreas Kostyrka  [EMAIL PROTECTED] wrote:
> So how are these Unicode changes supposed to work? Are non-ascii
> characters forbidden now? And how do I get UTF-8 text into Zope?

If all your code outputs is plain Python strings, ZPublisher passes them
as-is to the client.

If ZPublisher has to output a Unicode string, it has to decide how to
translate that into a byte string for the client. It does so by encoding
the Unicode string into the charset defined in any 'Content-Type:
text/xxx; charset=thecharset' header you set using RESPONSE.setHeader
(defaulting to latin-1).
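
As a rough illustration of that decision (not the actual ZPublisher code),
here is a minimal sketch. It is written for Python 3, so bytes stands in
for a plain string and str for a Unicode string. The names
charset_from_content_type and encode_body are hypothetical and exist only
for this example.

```python
import re

DEFAULT_CHARSET = "latin-1"  # the default named in the message above


def charset_from_content_type(content_type):
    """Return the charset parameter of a Content-Type value, or the default."""
    match = re.search(r'charset="?([-\w.]+)', content_type or "", re.IGNORECASE)
    return match.group(1) if match else DEFAULT_CHARSET


def encode_body(body, content_type):
    """Hypothetical stand-in for the choice made when publishing a result."""
    if isinstance(body, bytes):
        # Plain (byte) strings are passed through untouched.
        return body
    # Unicode text is encoded with the charset from the header.
    return body.encode(charset_from_content_type(content_type))
```

So a Unicode result published with 'text/html; charset=utf-8' goes out as
UTF-8 bytes, while the same result with a bare 'text/html' goes out as
latin-1.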

But how does ZPublisher get a Unicode string in the first place? Well it
gets it from the rendering of whatever method was called when publishing
the object.

For DTML, various blocks are joined together (function render_blocks()),
and if one of them happens to be Unicode then the join_unicode method
converts all non-Unicode strings into Unicode using
unicode(s, 'latin-1'). So this assumes that plain strings are
encoded in latin-1. Note, WE MAY WANT TO PARAMETRIZE THIS. Basically
there could be an additional attribute on the DTML object saying what
its native encoding is.
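
The joining rule described above can be sketched roughly as follows. This
is a Python 3 model (bytes for plain strings, str for Unicode), not the
real DocumentTemplate code, and the native_encoding parameter is the
hypothetical parametrization suggested above, not an existing option.

```python
def join_unicode(blocks, native_encoding="latin-1"):
    """Join rendered blocks, promoting byte strings if any block is text.

    native_encoding is an assumed, illustrative parameter: the encoding
    that plain byte strings are taken to be written in.
    """
    if any(isinstance(block, str) for block in blocks):
        # At least one Unicode block: decode every plain string first.
        return "".join(
            block if isinstance(block, str) else block.decode(native_encoding)
            for block in blocks
        )
    # Only plain strings: join them as bytes and leave them alone.
    return b"".join(blocks)
```

With the latin-1 assumption, UTF-8 encoded plain strings get decoded
incorrectly as soon as one Unicode block shows up, which is why an explicit
native encoding would help.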

For PageTemplates, the various blocks produced by the template and
python are sent to a StringIO-like object, which is responsible for
converting them into a coherent thing when its getvalue() method is
called. At the moment it doesn't deal very well with mixed Unicode and
non-Unicode strings so the reported failures don't surprise me. WE NEED
TO FIX THIS BEFORE THE NEXT BETA, probably also by providing an explicit
native encoding. I believe that's what AltPT does.

Localizer 0.9, for instance, had to patch the StringIO-like object to
make it deal with joining non-Unicode and Unicode strings. Now that I
better understand the problem, I'll help fix this ASAP in core Zope.

Florent


-- 
Florent Guillaume, Nuxeo (Paris, France)
+33 1 40 33 79 87  http://nuxeo.com  mailto:[EMAIL PROTECTED]

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )



Re: [Zope-dev] Unicode treatment in 2.6b1

2002-09-26 Thread Florent Guillaume

> For PageTemplates, the various blocks produced by the template and
> python are sent to a StringIO-like object, which is responsible for
> converting them into a coherent thing when its getvalue() method is
> called. At the moment it doesn't deal very well with mixed Unicode and
> non-Unicode strings so the reported failures don't surprise me.

BTW an example of a failing PageTemplate is:

<html>
  <span tal:replace="python:u'hello'" /> café
</html>

Because, deep inside StringIO, it tries to do something like:
''.join([u'hello', ' café'])
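
In Python 2 that join implicitly decodes the plain string with the default
ASCII codec, which raises UnicodeDecodeError on the byte for 'é'. The
sketch below reproduces the effect in Python 3 by making the implicit step
explicit. py2_style_join is a hypothetical helper, and the template text is
assumed to be stored as latin-1 bytes.

```python
def py2_style_join(blocks):
    """Mimic Python 2's implicit coercion: byte strings are decoded as ASCII."""
    return "".join(
        block if isinstance(block, str) else block.decode("ascii")
        for block in blocks
    )


# ' café' as it would sit in the template source, assuming latin-1 bytes.
template_text = b" caf\xe9"

try:
    py2_style_join(["hello", template_text])
    failed = False
except UnicodeDecodeError:
    # The non-ASCII byte cannot be decoded implicitly, so rendering fails.
    failed = True
```

Pure ASCII template text joins fine, so the failure only appears once the
template contains a non-ASCII character next to a Unicode value.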
