Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Andreas Jung wrote:
> --On 22 July 2006 16:17:09 +0200 Tino Wildenhain <[EMAIL PROTECTED]> wrote:
>>> huh?..even on the file system a pt file is encoded using some encoding.
>>> For an XML pagetemplate file the encoding is clearly defined through
>>> the BOM (if available) and/or the XML preamble. So the most reliable
>>> solution would be to use XML PTs only.
>>
>> Yes but you have to explicitly store that information "somehow" in the
>> file - zope objects can use other methods to transfer encoding
>> information while they create the internal representation.
>> meta-tags for charset are quite ugly but you basically have no other
>> choice with filesystem stuff.
>> The problem here is when the various encoding notifications collide
>> (XML header vs. XHTML meta-tag vs. BOM) so better have as few as
>> possible - even better none when
>
> I am only talking of XML. And the encoding is clearly and unambiguously
> defined through the BOM (if available) and the XML preamble. So any

Pardon, that's nonsense. BOM means byte order mark and not encoding mark
(it would read EM instead, wouldn't it? ;) It's only used with some 16 bit
encodings to tell the byte order of the two bytes (obviously). And XML only
via its XML preamble, which is just another place to put encoding
information in band. (In fact we should have the publishing engine fix this
preamble as well as the infamous meta-tag (if available) to reflect the
encoding currently in use.)

> application reading an XML file is able to detect the encoding and produce
> a unicode string from the file. According to a discussion with Dieter

Yes, and in the case of the filesystem pagetemplates and friends, the
template is that application which reads and should produce the unicode
string.

Regards
Tino
___
Zope-Dev maillist - Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding!
** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
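The BOM-versus-preamble question in the message above can be made concrete with a small sketch. It is not the Zope code and not what any of the posters proposed; it is written for Python 3 (where `bytes` and `str` play the roles of the thread's encoded string and `unicode`), and the names `sniff_xml_encoding` and `xml_bytes_to_text` are hypothetical. It checks for a BOM first (Python's `codecs` module defines BOM constants for UTF-8, UTF-16 and UTF-32), then falls back to the `encoding` attribute of the XML declaration, then to an assumed default of UTF-8. The declaration check only works when the declaration itself is readable as ASCII.

```python
import codecs
import re

# Longest BOMs first, so a UTF-32 BOM is not mistaken for a UTF-16 one.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches the encoding attribute of an ASCII-readable XML declaration.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def sniff_xml_encoding(data, default="utf-8"):
    """Return (encoding, bom_length) for a byte string holding XML source."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    match = _XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower(), 0
    # Assumption: no BOM and no declared encoding means the default applies.
    return default, 0


def xml_bytes_to_text(data, default="utf-8"):
    """Decode XML source bytes to text, dropping the BOM if there is one."""
    encoding, skip = sniff_xml_encoding(data, default)
    return data[skip:].decode(encoding)
```

A collision between BOM and declaration (the case Tino worries about) is resolved here simply by letting the BOM win; a stricter version could raise an error instead.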
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
--On 22 July 2006 16:17:09 +0200 Tino Wildenhain <[EMAIL PROTECTED]> wrote:
>> huh?..even on the file system a pt file is encoded using some encoding.
>> For an XML pagetemplate file the encoding is clearly defined through
>> the BOM (if available) and/or the XML preamble. So the most reliable
>> solution would be to use XML PTs only.
>
> Yes but you have to explicitly store that information "somehow" in the
> file - zope objects can use other methods to transfer encoding
> information while they create the internal representation.
> meta-tags for charset are quite ugly but you basically have no other
> choice with filesystem stuff.
> The problem here is when the various encoding notifications collide
> (XML header vs. XHTML meta-tag vs. BOM) so better have as few as
> possible - even better none when

I am only talking of XML. And the encoding is clearly and unambiguously
defined through the BOM (if available) and the XML preamble. So any
application reading an XML file is able to detect the encoding and produce
a unicode string from the file. According to a discussion with Dieter, the
Python XML parsers don't deal with the BOM and leave it up to the
application to interpret the BOM correctly.

-aj
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Andreas Jung wrote:
> --On 22 July 2006 15:34:01 +0200 Tino Wildenhain <[EMAIL PROTECTED]> wrote:
>> Well, pagetemplate files are another thing. They have to deal with
>> the lack of charset information of a filesystem file and what they
>> do once they load the data is even another thing. Even filesystem
>> pagetemplates should work with unicode internally, making it easy to
>> recode them for output and combine with other potentially unicode stuff.
>
> huh?..even on the file system a pt file is encoded using some encoding.
> For an XML pagetemplate file the encoding is clearly defined through
> the BOM (if available) and/or the XML preamble. So the most reliable
> solution would be to use XML PTs only.

Yes but you have to explicitly store that information "somehow" in the
file - zope objects can use other methods to transfer encoding information
while they create the internal representation.
meta-tags for charset are quite ugly but you basically have no other
choice with filesystem stuff.
The problem here is when the various encoding notifications collide (XML
header vs. XHTML meta-tag vs. BOM) so better have as few as possible - even
better none when we deal with HTTP servers, which can nicely handle all
this out of band and on demand. WebDAV or PUT can send charset data, the
ZMI would use default-zpublisher-encoding etc.

If you store the internal object in unicode you can safely combine
different sources of encoded data instead of having a weird mesh of
decoding and encoding going on. So I would not care how to find out about
the intended encoding - once the object is instantiated.

Regards
Tino
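Tino's point about storing unicode internally amounts to "decode once at the boundary, encode once on output". A minimal sketch of that idea follows, in Python 3 rather than the Python 2 of the thread; the helper names `to_text` and `render` are hypothetical and not part of Zope.

```python
def to_text(value, encoding):
    """Decode bytes using the encoding reported for their source.

    Values that are already text are passed through unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def render(fragments, output_encoding="utf-8"):
    """Join (value, source_encoding) pairs as text, then encode once."""
    text = "".join(to_text(value, enc) for value, enc in fragments)
    return text.encode(output_encoding)


# Three sources with different encodings combine without any re-coding mesh.
parts = [
    ("Gr\u00fc\u00dfe ".encode("iso-8859-1"), "iso-8859-1"),
    ("w\u00f6rld".encode("utf-8"), "utf-8"),
    ("!", None),  # already text, so no encoding is needed
]
page = render(parts, output_encoding="utf-8")
```

Because everything is text between the two boundaries, changing the output encoding touches only the final `encode` call.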
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
--On 22 July 2006 15:34:01 +0200 Tino Wildenhain <[EMAIL PROTECTED]> wrote:
> Well, pagetemplate files are another thing. They have to deal with
> the lack of charset information of a filesystem file and what they
> do once they load the data is even another thing. Even filesystem
> pagetemplates should work with unicode internally, making it easy to
> recode them for output and combine with other potentially unicode stuff.

huh?..even on the file system a pt file is encoded using some encoding.
For an XML pagetemplate file the encoding is clearly defined through
the BOM (if available) and/or the XML preamble. So the most reliable
solution would be to use XML PTs only.

-aj
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Stefan H. Holek wrote:
> On 21 Jul 2006, at 16:53, Chris Withers wrote:
>
>> I wonder how Zope 3's filesystem-based ZPT's deal with this?
>
> zope.pagetemplate.pagetemplatefile.PageTemplateFile reads a header, if
> present, or defaults to UTF-8.

Well, pagetemplate files are another thing. They have to deal with
the lack of charset information of a filesystem file and what they
do once they load the data is even another thing. Even filesystem
pagetemplates should work with unicode internally, making it easy to
recode them for output and combine with other potentially unicode stuff.

Tino
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
zope.pagetemplate.pagetemplatefile.PageTemplateFile reads a header, if
present, or defaults to UTF-8.

Stefan

On 21 Jul 2006, at 16:53, Chris Withers wrote:
> I wonder how Zope 3's filesystem-based ZPT's deal with this?

--
Anything that, in happening, causes something else to happen,
causes something else to happen.  --Douglas Adams
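The "read a header or default to UTF-8" behaviour can be illustrated as follows. This is a hedged sketch only: it is not the `zope.pagetemplate` implementation, the thread does not say which header is read, and the function name `read_template_source` is hypothetical. The sketch assumes the header in question is an HTML `meta` tag carrying a `charset` and is written for Python 3.

```python
import re

# Assumption: the "header" is a <meta ... charset=...> tag near the top.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE
)


def read_template_source(data, default="utf-8"):
    """Decode template bytes using a declared charset, else the default."""
    match = _META_CHARSET.search(data[:1024])  # only look near the top
    encoding = match.group(1).decode("ascii") if match else default
    return data.decode(encoding)
```

A template without any declaration is decoded as UTF-8, matching the default Stefan describes.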
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Jens Vagelpohl wrote:
> CMFCore/FSPageTemplate does not do anything special, it defers to the
> PageTemplate implementation.

Yay! *sigh*

Uggg... we need something like python's
header-at-top-of-file-to-specify-encoding thing, unless we force ZPT
source to be XML, in which case we can "do the right thing" in the XML
style.

I wonder how Zope 3's filesystem-based ZPT's deal with this?

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Andreas Jung wrote:
>     security.declareProtected(change_page_templates, 'PUT')
>     def PUT(self, REQUEST, RESPONSE):
>         """ Handle HTTP PUT requests """
>         self.dav__init(REQUEST, RESPONSE)
>         self.dav__simpleifhandler(REQUEST, RESPONSE, refresh=1)
>         ## XXX this should be unicode or we must pass an encoding
>         self.pt_edit(REQUEST.get('BODY', ''))
>         RESPONSE.setStatus(204)
>         return RESPONSE
>
> As you can see from the comment..there is some work to do.

Yay ;-)

> AFAIK the encoding is not available from a WebDAV request?!

Really? :-(

> On the other hand there is code available that tries to obtain
> the encoding from the XML preamble (sniffEncoding)...

where?

> and the very other hand there is still a problem with this method since
> the encoding can be determined by the BOM (if available)...this is
> currently not handled through the code...I think I'll have a closer look
> at the code once again this week.

What's a BOM when it's at home?

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Andreas Jung wrote:
> --On 17 July 2006 17:11:54 +0100 Chris Withers <[EMAIL PROTECTED]> wrote:
>
>> Andreas Jung wrote:
>>>
>>> Zope 2.10 comes with the ZPT implementation of Zope 3 which works nicely
>>> with unicode strings. However, 2.10 won't enforce the use of unicode
>>> strings, for backward compatibility. However (at least) the
>>> ZopePageTemplate class constructor has a flag 'strict' to enforce the
>>> use of unicode.
>>
>> Okay, but what actually gets stored in the ZPT when editing it via ZMI or
>> WebDAV?
>
> Here's the code:
>
>     security.declareProtected(change_page_templates, 'PUT')
>     def PUT(self, REQUEST, RESPONSE):
>         """ Handle HTTP PUT requests """
>         self.dav__init(REQUEST, RESPONSE)
>         self.dav__simpleifhandler(REQUEST, RESPONSE, refresh=1)
>         ## XXX this should be unicode or we must pass an encoding
>         self.pt_edit(REQUEST.get('BODY', ''))
>         RESPONSE.setStatus(204)
>         return RESPONSE
>
> As you can see from the comment..there is some work to do. AFAIK
> the encoding is not available from a WebDAV request?!

But it is - in fact my local copies have the hack where I used
management_page_charset here. Now I'm seeing we don't even need
that - default_zpublisher_encoding is much better here.
Kate (as WebDAV editor client) plays very well with that.

> On the other hand there is code available that tries to obtain
> the encoding from the XML preamble (sniffEncoding)...and the very other
> hand

Yes, this double encoding marking is a mess. We need to be able
to provide a fixup (just like for the infamous meta-tag).

> there is still a problem with this method since the encoding can be
> determined by the BOM (if available)...this is currently not handled
> through the code...I think I'll have a closer look at the code once again
> this week.

BOM is only for filesystem "unicode" of some 16-bit variants. Nothing
you really want to send over the wire (although you can).
But after all it's just another encoding, so it would be a matter of
setting the encoding correctly.

Regards
Tino
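The `## XXX` comment in the quoted `PUT` method asks for either unicode or an explicit encoding. One way to get there, sketched under assumptions: take the `charset` parameter from the request's `Content-Type` header when the client sends one, and otherwise fall back to a configured default (standing in for `default_zpublisher_encoding`). The function names `request_charset` and `decode_put_body` are hypothetical, the code is Python 3 and uses only the standard library, and it is not the fix that went into Zope.

```python
from email.message import Message


def request_charset(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value, or the default."""
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


def decode_put_body(body, content_type, default="utf-8"):
    """Turn the raw bytes of a PUT body into text before storing them."""
    return body.decode(request_charset(content_type, default))
```

In the quoted method, the result of `decode_put_body` is what would be handed to `pt_edit` instead of the raw `BODY` bytes.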
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
On 17 Jul 2006, at 12:24, Andreas Jung wrote:
>> What about loading FS ZPT's and things like CMF's FSZPT?
>
> Ask Tres or Jens :-)

CMFCore/FSPageTemplate does not do anything special, it defers to the
PageTemplate implementation.

jens
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
--On 17 July 2006 17:11:54 +0100 Chris Withers <[EMAIL PROTECTED]> wrote:
> Andreas Jung wrote:
>> Zope 2.10 comes with the ZPT implementation of Zope 3 which works nicely
>> with unicode strings. However, 2.10 won't enforce the use of unicode
>> strings, for backward compatibility. However (at least) the
>> ZopePageTemplate class constructor has a flag 'strict' to enforce the
>> use of unicode.
>
> Okay, but what actually gets stored in the ZPT when editing it via ZMI or
> WebDAV?

Here's the code:

    security.declareProtected(change_page_templates, 'PUT')
    def PUT(self, REQUEST, RESPONSE):
        """ Handle HTTP PUT requests """
        self.dav__init(REQUEST, RESPONSE)
        self.dav__simpleifhandler(REQUEST, RESPONSE, refresh=1)
        ## XXX this should be unicode or we must pass an encoding
        self.pt_edit(REQUEST.get('BODY', ''))
        RESPONSE.setStatus(204)
        return RESPONSE

As you can see from the comment..there is some work to do. AFAIK
the encoding is not available from a WebDAV request?!
On the other hand there is code available that tries to obtain
the encoding from the XML preamble (sniffEncoding)...and the very other
hand there is still a problem with this method since the encoding can be
determined by the BOM (if available)...this is currently not handled
through the code...I think I'll have a closer look at the code once again
this week.

> What about loading FS ZPT's and things like CMF's FSZPT?

Ask Tres or Jens :-)

-aj
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Andreas Jung wrote:
> Zope 2.10 comes with the ZPT implementation of Zope 3 which works nicely
> with unicode strings. However, 2.10 won't enforce the use of unicode
> strings, for backward compatibility. However (at least) the
> ZopePageTemplate class constructor has a flag 'strict' to enforce the
> use of unicode.

Okay, but what actually gets stored in the ZPT when editing it via ZMI or
WebDAV?

What about loading FS ZPT's and things like CMF's FSZPT?

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Tino Wildenhain wrote:
> This would be my next question too regarding the management_page_charset
> cleanup I'm currently playing with. My vote would be to store unicode
> where possible - so you don't screw up everything when you change
> default_zpublisher_encoding in zope.conf.

Yeah, unicode is good...

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
--On 17 July 2006 16:55:42 +0100 Chris Withers <[EMAIL PROTECTED]> wrote:
> Andreas Jung wrote:
>>> I've had problems when it's an encoded string, but that seems to be what
>>> is stored when you save a ZPT via the ZMI or WebDAV...
>>
>> ZPT in pre-Zope 2.10 knows nothing about unicode...it can be anything :-)
>
> And what about 2.10?

Zope 2.10 comes with the ZPT implementation of Zope 3 which works nicely
with unicode strings. However, 2.10 won't enforce the use of unicode
strings, for backward compatibility. However (at least) the ZopePageTemplate
class constructor has a flag 'strict' to enforce the use of unicode.

-aj

> FWIW, this seems to be problematic due to Zope 3's i18n stuff returning
> unicodes. Prior to that, everything was a happy utf-8 encoded string. What
> does Zope 2.10 do with all of this?
>
> cheers,
>
> Chris
>
> --
> Simplistix - Content Management, Zope & Python Consulting
>            - http://www.simplistix.co.uk

--
ZOPYX Ltd. & Co. KG - Charlottenstr. 37/1 - 72070 Tübingen - Germany
Web: www.zopyx.com - Email: [EMAIL PROTECTED] - Phone +49 - 7071 - 793376
E-Publishing, Python, Zope & Plone development, Consulting
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Chris Withers wrote:
> Andreas Jung wrote:
>>> I've had problems when it's an encoded string, but that seems to be what
>>> is stored when you save a ZPT via the ZMI or WebDAV...
>>
>> ZPT in pre-Zope 2.10 knows nothing about unicode...it can be anything :-)
>
> And what about 2.10?
>
> FWIW, this seems to be problematic due to Zope 3's i18n stuff returning
> unicodes. Prior to that, everything was a happy utf-8 encoded string. What
> does Zope 2.10 do with all of this?

This would be my next question too regarding the management_page_charset
cleanup I'm currently playing with. My vote would be to store unicode
where possible - so you don't screw up everything when you change
default_zpublisher_encoding in zope.conf.

Regards
Tino
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
Andreas Jung wrote:
>> I've had problems when it's an encoded string, but that seems to be what
>> is stored when you save a ZPT via the ZMI or WebDAV...
>
> ZPT in pre-Zope 2.10 knows nothing about unicode...it can be anything :-)

And what about 2.10?

FWIW, this seems to be problematic due to Zope 3's i18n stuff returning
unicodes. Prior to that, everything was a happy utf-8 encoded string. What
does Zope 2.10 do with all of this?

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
Re: [Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
--On 17 July 2006 16:32:08 +0100 Chris Withers <[EMAIL PROTECTED]> wrote:
> The subject line says it all really ;-)
>
> I've had problems when it's an encoded string, but that seems to be what
> is stored when you save a ZPT via the ZMI or WebDAV...

ZPT in pre-Zope 2.10 knows nothing about unicode...it can be anything :-)

-aj
[Zope-dev] Should PageTemplate._text be a unicode or an encoded string in Zope 2.9.3?
The subject line says it all really ;-)

I've had problems when it's an encoded string, but that seems to be what
is stored when you save a ZPT via the ZMI or WebDAV...

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk