Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Dieter Maurer wrote:
> Tino Wildenhain wrote at 2006-1-13 16:45 +0100:
>> ...
>> Maybe just have a new uZPT with Unicode and leave the "old" ZPT alone?
>> Maybe with limited ability to "add" old ZPTs from the ZMI or such.
>>
>> This would solve the backward-compatibility problems and would be a
>> smoother transition w/o the need for upgrade hacks and "strict" hacks
>> (after all, we aren't perl/php ;))
>
> I fear it is not that easy:
>
> Unless we set Python's "defaultencoding" to the site encoding
> (and we have such a thing), Python cannot mix Unicode and non-Unicode.
>
> Thus, your "old" ZPTs would need to use only other old ZPTs and
> "old" Python scripts and "old" methods (returning encoded texts)
> while "strict" ZPTs would need to use only new (strict) ZPTs, scripts
> and methods. Quite unfeasible...

Right, and setting Python's default encoding is out of the question.

Regards,

Martijn
___
Zope-Dev maillist - Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Dieter Maurer wrote:
> Tino Wildenhain wrote at 2006-1-13 16:45 +0100:
>> ...
>> Maybe just have a new uZPT with Unicode and leave the "old" ZPT alone?
>> Maybe with limited ability to "add" old ZPTs from the ZMI or such.
>>
>> This would solve the backward-compatibility problems and would be a
>> smoother transition w/o the need for upgrade hacks and "strict" hacks
>> (after all, we aren't perl/php ;))
>
> I fear it is not that easy:
>
> Unless we set Python's "defaultencoding" to the site encoding
> (and we have such a thing), Python cannot mix Unicode and non-Unicode.
>
> Thus, your "old" ZPTs would need to use only other old ZPTs and
> "old" Python scripts and "old" methods (returning encoded texts)
> while "strict" ZPTs would need to use only new (strict) ZPTs, scripts
> and methods. Quite unfeasible...

Don't think so. The uZPTs would be aware that they live in an
"unfriendly" environment. So encoded templates can be handled: when
they are used in a uZPT they would be promoted to unicode w/o touching
the default Python encoding.
(If not unicode ... get encoding from template or site-default -> decode)

Mixing all that capability into a single "switchable" ZPT implementation
strikes me as a lot harder to get right, usable and performant.

But after all it's just an idea. We can discuss it ;)

++Tino
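The promotion rule in parentheses above ("if not unicode ... get encoding from template or site-default -> decode") could be sketched roughly as follows. This is only an illustration in Python 3 syntax, where `str` plays the role of Python 2's `unicode` and `bytes` the role of an encoded string. The function name `promote_to_unicode` and its parameters are hypothetical and are not part of any Zope API.

```python
def promote_to_unicode(value, template_encoding=None, site_default="utf-8"):
    """Return value as Unicode text without touching any global default.

    Hypothetical helper: template_encoding and site_default are
    illustrative names, not existing Zope settings.
    """
    if isinstance(value, str):
        # Already Unicode, nothing to do.
        return value
    # Prefer the encoding recorded on the template, else the site default.
    encoding = template_encoding or site_default
    return value.decode(encoding)


# An "old" template stored as Latin-1 bytes, used from a Unicode template.
old_template = "Grüße".encode("latin-1")
promoted = promote_to_unicode(old_template, template_encoding="latin-1")
```

The point of the sketch is that the decision is local to the call, so the process-wide Python default encoding stays untouched.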
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Tino Wildenhain wrote at 2006-1-13 16:45 +0100:
> ...
> Maybe just have a new uZPT with Unicode and leave the "old" ZPT alone?
> Maybe with limited ability to "add" old ZPTs from the ZMI or such.
>
> This would solve the backward-compatibility problems and would be a
> smoother transition w/o the need for upgrade hacks and "strict" hacks
> (after all, we aren't perl/php ;))

I fear it is not that easy:

Unless we set Python's "defaultencoding" to the site encoding
(and we have such a thing), Python cannot mix Unicode and non-Unicode.

Thus, your "old" ZPTs would need to use only other old ZPTs and
"old" Python scripts and "old" methods (returning encoded texts)
while "strict" ZPTs would need to use only new (strict) ZPTs, scripts
and methods. Quite unfeasible...

-- 
Dieter
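To make the mixing problem concrete, here is a small sketch. It is written in Python 3 syntax, so the explicit `.decode("ascii")` stands in for the implicit conversion that Python 2 applied when an encoded string was combined with a unicode string under the default ASCII encoding. The function names are made up for illustration.

```python
def mix_with_ascii_default(text, raw):
    """Combine Unicode text with encoded bytes as an ASCII default would."""
    return text + raw.decode("ascii")


def mix_with_known_encoding(text, raw, encoding):
    """Combine Unicode text with encoded bytes using an explicit encoding."""
    return text + raw.decode(encoding)


template_text = "Hello: "              # a template stored as Unicode
encoded = "Grüße".encode("utf-8")      # an "old" method returning encoded text

try:
    mix_with_ascii_default(template_text, encoded)
    ascii_failed = False
except UnicodeDecodeError:
    # Non-ASCII bytes cannot be decoded with the ASCII default.
    ascii_failed = True

mixed = mix_with_known_encoding(template_text, encoded, "utf-8")
```

Pure ASCII content would pass through either function unchanged, which is why such problems often only show up once non-ASCII data reaches the template.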
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Martijn Faassen wrote at 2006-1-13 13:25 +0100:
> ...
> What about input? If I have an input form, browsers tend to submit in
> the encoding that the form is in, for instance UTF-8. This means I get
> UTF-8 strings into my request.

Of course, (textual) input should be converted to Unicode as well.

Of course, the encoding used by the browser must be known for this.
As the browser usually uses the encoding of the page containing the
form, fixing a site encoding might solve this problem.

-- 
Dieter
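As a rough sketch of converting form input with a fixed site encoding: the standard library's `urllib.parse.parse_qs` accepts an `encoding` argument, so percent-encoded form data can be decoded straight to Unicode. `SITE_ENCODING` and `decode_form` are assumed names for illustration only; this is not how ZPublisher handles requests.

```python
from urllib.parse import parse_qs

# Assumed site-wide setting, chosen here only for illustration.
SITE_ENCODING = "iso-8859-15"


def decode_form(query_string, encoding=SITE_ENCODING):
    """Parse URL-encoded form data and return Unicode values.

    errors="strict" makes a wrong encoding guess fail loudly
    instead of silently inserting replacement characters.
    """
    return parse_qs(query_string, encoding=encoding, errors="strict")


# %A4 is the euro sign in ISO-8859-15.
form = decode_form("price=%A4100")
```

The sketch only works if the browser really submitted in the site encoding, which is exactly the assumption made in the message above.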
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Andreas Jung wrote:
...
>> Now, if I have code that takes something from that request and displays
>> it in a unicode page template, you'd have a problem, as you'd be mixing
>> UTF-8 with unicode there. Again this might result in a lot of broken
>> code.
>>
>
> I share your worries (meanwhile :-)). Enforcing unicode is too strict. I
> am thinking of relaxing the wrapper code so it can handle both unicode
> and non-unicode (for backward compatibility)...possibly using some
> 'strict' flag that enforces the use of unicode...I just don't know yet
> how to add this in a sane way.

Maybe just have a new uZPT with Unicode and leave the "old" ZPT alone?
Maybe with limited ability to "add" old ZPTs from the ZMI or such.

This would solve the backward-compatibility problems and would be a
smoother transition w/o the need for upgrade hacks and "strict" hacks
(after all, we aren't perl/php ;))

Maybe with a "make all my ZPT uZPT" or the like for the really desperate.

++Tino
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
--On 13 January 2006 13:25:19 +0100 Martijn Faassen <[EMAIL PROTECTED]> wrote:

> Andreas Jung wrote:
>> Thoughts?
>
> All these changes seem to be the right thing. This will make unicode
> life in Zope a lot easier.
>
> I worry about backward compatibility though. Some code (such as
> PlacelessTranslationService) is doing wild things like monkeypatching
> the ZPT engine so that incoming unicode is encoded into UTF-8 during
> page template execution. I.e. the principle is quite different from
> that of Zope 2 itself, where the publisher takes care of translating
> things into an encoded string upon output. Since Silva doesn't use PTS
> anymore I don't worry about this, but Plone developers might.
>
> Changing the default encoding of Zope to UTF-8 might break a lot of
> assumptions in people's code.
>
> What about input? If I have an input form, browsers tend to submit in
> the encoding that the form is in, for instance UTF-8. This means I get
> UTF-8 strings into my request.
>
> Now, if I have code that takes something from that request and displays
> it in a unicode page template, you'd have a problem, as you'd be mixing
> UTF-8 with unicode there. Again this might result in a lot of broken
> code.

I share your worries (meanwhile :-)). Enforcing unicode is too strict. I
am thinking of relaxing the wrapper code so it can handle both unicode
and non-unicode (for backward compatibility)...possibly using some
'strict' flag that enforces the use of unicode...I just don't know yet
how to add this in a sane way.

Andreas
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Andreas Jung wrote:
> --On 5 January 2006 12:47:20 +0100 Andreas Jung <[EMAIL PROTECTED]> wrote:
>
>> As you might know I worked on the integration of the Zope 3 ZPT
>> implementation for Zope 2.10. Before committing the changes to the
>> trunk I would like to discuss my approach for Zope 2.10
>
> I forgot to mention a major point: compatibility.
>
> When a ZPT is internally stored as a unicode string then content
> returned by methods called through the ZPT will be implicitly converted
> to unicode. This will definitely raise UnicodeDecodeErrors. So how to
> deal with this issue?

Ah, I wrote my reply before reading this.

> - allowing only unicode textual content when calling macros, PyScript
>   etc.
>
> - converting non-unicode to unicode inside the TAL code using some
>   encoding. The encoding could be specified as a property of the called
>   method (function properties) or object.

In effect Python already does this, it just decodes to unicode using a
strict ASCII encoding. Making this configurable per page template might
be good, though I'm worried about supporting implicit behavior that
leads to bad coding patterns. I'd prefer code to be Python unicode
clean. Allowing, say, UTF-8 strings into a page template and then
implicitly converting them to unicode invites people to persist in not
understanding how to write good unicode code.

> We really _need_ to discuss this issue early to minimize side effects
> and to be able to provide the best compatibility possible.

Agreed!

Regards,

Martijn
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Andreas Jung wrote:
> Thoughts?

All these changes seem to be the right thing. This will make unicode
life in Zope a lot easier.

I worry about backward compatibility though. Some code (such as
PlacelessTranslationService) is doing wild things like monkeypatching
the ZPT engine so that incoming unicode is encoded into UTF-8 during
page template execution. I.e. the principle is quite different from
that of Zope 2 itself, where the publisher takes care of translating
things into an encoded string upon output. Since Silva doesn't use PTS
anymore I don't worry about this, but Plone developers might.

Changing the default encoding of Zope to UTF-8 might break a lot of
assumptions in people's code.

What about input? If I have an input form, browsers tend to submit in
the encoding that the form is in, for instance UTF-8. This means I get
UTF-8 strings into my request.

Now, if I have code that takes something from that request and displays
it in a unicode page template, you'd have a problem, as you'd be mixing
UTF-8 with unicode there. Again this might result in a lot of broken
code.

Regards,

Martijn
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Andreas Jung wrote at 2006-1-6 10:13 +0100:
> ...
> A site encoding as default makes sense, however there are situations
> when the encoding of a string is different. E.g. when you deal with
> XMLHTTPRequest the browser expects a utf8 fragment which might conflict
> with an iso-8859-15 site encoding. So there should be some mechanism to
> specify the encoding. For PyScript one could introduce a property to
> specify the encoding, and for methods of products I can imagine using
> function properties..and this is possibly something that must be solved
> on the TAL level.

I do not like this idea:

  If I need to modify individual objects (such as Python Scripts),
  rather than fiddling with properties, I can simply modify
  the script to return Unicode.

I do not think that we need additional special features
besides support for some kind of site encoding.

>> As pointed out in private email ("charset negotiation will not work"),
>> the response charset has a significant impact on the charset
>> used in form data sent back to the server.
>> This may pose severe problems when the response charset is
>> not the same as the site encoding (for textual form data).
>
> This issue is connected to the ongoing/planned project to use the Z3
> publisher in Zope 2.10. There will be a sprint at PyCon next month
> afaik...

I do not see that the problem is related to any specific publisher.

Instead, it is a conceptual problem:

  When form data comes back, the server *MUST* know the encoding
  used by the client.

  For "POST", the encoding is hopefully specified in the
  "Content-Type" request header, but for "GET" there is
  almost surely no information available: the server
  must assume that the client used the same charset as the
  page it replied to. However, the server does not have this information
  if it uses charset negotiation.

-- 
Dieter
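The POST/GET distinction described above can be sketched as a small lookup. This is an illustration only: `request_charset` and `page_charset` are hypothetical names, and the header parsing relies on the standard library's `email.message.Message.get_content_charset`, not on any publisher code.

```python
from email.message import Message


def request_charset(method, content_type, page_charset):
    """Guess the charset of incoming form data.

    page_charset is the charset of the page that carried the form;
    the server has to remember or fix it, e.g. as a site encoding.
    """
    if method.upper() == "POST" and content_type:
        # Let the stdlib parse the header parameters.
        msg = Message()
        msg["Content-Type"] = content_type
        declared = msg.get_content_charset()  # lowercased, or None
        if declared:
            return declared
    # GET, or a POST without a charset parameter: fall back to the
    # charset of the page the client replied to.
    return page_charset


post_charset = request_charset(
    "POST", "application/x-www-form-urlencoded; charset=UTF-8", "iso-8859-15")
get_charset = request_charset("GET", None, "iso-8859-15")
```

With charset negotiation the value passed as `page_charset` is no longer a constant, which is the conceptual problem raised in the message.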
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
--On 6 January 2006 19:55:56 +0900 TAHARA Yusei <[EMAIL PROTECTED]> wrote:

> I've tried the branch and I found a typo in zope.conf.
> I think default-zpublisher-encoding is the correct directive name.

This is already fixed.

> By the way, I have a request related to these changes.
> If the root folder had a `management_page_charset` property by default,
> this would be very convenient for me, because I could use Japanese in
> the ZMI without any extra setup. Is this possible?

This whole management_page_charset is some kind of hack. I also had some
trouble getting the ZMI for ZPT to use UTF-8 (which is another hack).
The current ZMI screen calls zpt/read, which returns a unicode string
(to be presented within the edit textarea). I don't know if any further
conversions have to be performed when using Japanese in the ZMI...please
try it yourself (ptEdit.pt).

-aj
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Hello.

> - the ZMI screens for ZPTs use a _fixed_ UTF-8 encoding (is this also
>   fine for asian users?)

This is fine for me, as an Asian user.

> - The ZPublisher will convert a 'unicode' ZPT using
>   default_zpublisher_encoding (zope.conf) to a byte stream. The current
>   encoding is iso-8859-15. I think it would make sense to change it to
>   utf8. Otherwise we must explicitly set the content-type with
>   charset=utf8 set. Thoughts?

I've tried the branch and I found a typo in zope.conf.
I think default-zpublisher-encoding is the correct directive name.

By the way, I have a request related to these changes.
If the root folder had a `management_page_charset` property by default,
this would be very convenient for me, because I could use Japanese in
the ZMI without any extra setup. Is this possible?

-- 
TAHARA Yusei
[EMAIL PROTECTED]
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
--On 5 January 2006 21:12:08 +0100 Dieter Maurer <[EMAIL PROTECTED]> wrote:

>> - allowing only unicode textual content when calling macros, PyScript
>>   etc.
>
> definitely not.

I know that this is a hard requirement, but it is an implicit
requirement in Zope 3.

>> - converting non-unicode to unicode inside the TAL code using some
>>   encoding. The encoding could be specified as a property of the called
>>   method (function properties) or object.
>
> Using a site encoding would probably be the best way
> (as done by e.g. Plone and Archetypes).
>
> Most sites use in fact a fixed encoding most of the time.
> At the places where a different encoding is used, it can
> be employed explicitly to convert to unicode.

A site encoding as default makes sense, however there are situations
when the encoding of a string is different. E.g. when you deal with
XMLHTTPRequest the browser expects a utf8 fragment which might conflict
with an iso-8859-15 site encoding. So there should be some mechanism to
specify the encoding. For PyScript one could introduce a property to
specify the encoding, and for methods of products I can imagine using
function properties..and this is possibly something that must be solved
on the TAL level.

> As pointed out in private email ("charset negotiation will not work"),
> the response charset has a significant impact on the charset
> used in form data sent back to the server.
> This may pose severe problems when the response charset is
> not the same as the site encoding (for textual form data).

This issue is connected to the ongoing/planned project to use the Z3
publisher in Zope 2.10. There will be a sprint at PyCon next month
afaik...

-aj
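The conflict between a utf8 fragment and an iso-8859-15 site encoding can be shown with a single character. This is only a sketch in Python 3 syntax; the variable names are illustrative.

```python
text = "€"

# What an XMLHTTPRequest-style fragment would carry.
utf8_fragment = text.encode("utf-8")

# The assumed site-wide default from the example in the message.
site_encoding = "iso-8859-15"

# Decoding with the site encoding does not fail, it silently
# produces the wrong characters.
wrong = utf8_fragment.decode(site_encoding)

# Decoding with the encoding the data was actually written in.
right = utf8_fragment.decode("utf-8")
```

Because the wrong decode raises no error, a per-object way of stating the encoding (or returning Unicode directly) is needed to catch such cases.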
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
Andreas Jung wrote at 2006-1-5 13:39 +0100:
> ...
> I forgot to mention a major point: compatibility.
>
> When a ZPT is internally stored as a unicode string then content
> returned by methods called through the ZPT will be implicitly converted
> to unicode. This will definitely raise UnicodeDecodeErrors. So how to
> deal with this issue?
>
> - allowing only unicode textual content when calling macros, PyScript
>   etc.

definitely not.

> - converting non-unicode to unicode inside the TAL code using some
>   encoding. The encoding could be specified as a property of the called
>   method (function properties) or object.

Using a site encoding would probably be the best way
(as done by e.g. Plone and Archetypes).

Most sites use in fact a fixed encoding most of the time.
At the places where a different encoding is used, it can
be employed explicitly to convert to unicode.

As pointed out in private email ("charset negotiation will not work"),
the response charset has a significant impact on the charset
used in form data sent back to the server.
This may pose severe problems when the response charset is
not the same as the site encoding (for textual form data).

-- 
Dieter
Re: [Zope-dev] [Zope 2.10] ZPT going Unicode
--On 5 January 2006 12:47:20 +0100 Andreas Jung <[EMAIL PROTECTED]> wrote:

> As you might know I worked on the integration of the Zope 3 ZPT
> implementation for Zope 2.10. Before committing the changes to the
> trunk I would like to discuss my approach for Zope 2.10

I forgot to mention a major point: compatibility.

When a ZPT is internally stored as a unicode string then content
returned by methods called through the ZPT will be implicitly converted
to unicode. This will definitely raise UnicodeDecodeErrors. So how to
deal with this issue?

 - allowing only unicode textual content when calling macros, PyScript
   etc.

 - converting non-unicode to unicode inside the TAL code using some
   encoding. The encoding could be specified as a property of the called
   method (function properties) or object.

We really _need_ to discuss this issue early to minimize side effects
and to be able to provide the best compatibility possible.

-aj
[Zope-dev] [Zope 2.10] ZPT going Unicode
As you might know I worked on the integration of the Zope 3 ZPT
implementation for Zope 2.10. Before committing the changes to the trunk
I would like to discuss my approach for Zope 2.10

 - the ZopePageTemplate and PageTemplateFile classes will be replaced
   with wrapper classes around the corresponding Zope 3 classes (no
   changes to the API, the existing unit tests pass with the wrappers)

 - the wrapper implementations store the content as Python unicode
   strings. The encoding of the content must be specified when
   creating/uploading new instances (either through the ZMI or through
   the 'encoding' parameters of the constructors). If the content is
   XML-ish then the encoding is taken from the XML preamble (defaults
   to utf8)

 - the ZMI screens for ZPTs use a _fixed_ UTF-8 encoding (is this also
   fine for asian users?)

 - deprecation of _all_ other methods, modules etc. from
   Products/PageTemplates (removal in 2.12 except for the wrapper
   classes)

 - deprecation of the TAL module (removal in 2.12)

Open:

 - The ZPublisher will convert a 'unicode' ZPT using
   default_zpublisher_encoding (zope.conf) to a byte stream. The current
   encoding is iso-8859-15. I think it would make sense to change it to
   utf8. Otherwise we must explicitly set the content-type with
   charset=utf8 set. Thoughts?

ToDo:

 - in-place conversion of persistent ZPT instances through setstate()
   or so...

Andreas
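The storage and output steps listed above (decode on upload using a given encoding or the XML preamble, encode on publishing) could look roughly like the sketch below. It is not the wrapper implementation from the branch: `decode_template_source` and `publish` are hypothetical names, the regular expression is a simplification of XML declaration parsing, and Python 3's `str`/`bytes` stand in for Python 2's `unicode`/`str`.

```python
import re

# Simplified match for <?xml ... encoding="..."?> at the start of the source.
_XML_DECL = re.compile(rb'^<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def decode_template_source(raw, encoding=None):
    """Turn uploaded template bytes into Unicode text."""
    match = _XML_DECL.match(raw)
    if match:
        # XML-ish content: the preamble wins.
        encoding = match.group(1).decode("ascii")
    elif raw.startswith(b"<?xml"):
        # XML-ish content without a declared encoding defaults to utf8.
        encoding = "utf-8"
    elif encoding is None:
        raise ValueError("an encoding is required for non-XML content")
    return raw.decode(encoding)


def publish(text, publisher_encoding="iso-8859-15"):
    """Encode rendered Unicode output to a byte stream for the response."""
    return text.encode(publisher_encoding)


source = '<?xml version="1.0" encoding="iso-8859-15"?><p>€</p>'.encode("iso-8859-15")
stored = decode_template_source(source)
body = publish("€")
```

Switching the `publisher_encoding` default to utf8, as proposed under "Open", would change the bytes produced by `publish` but not the stored Unicode content.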