[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-08 Thread Yves Bastide

yuppie wrote:


As you already mentioned setting default-zpublisher-encoding to 'utf-8' 
doesn't really work. Just found that DT_Util.join_unicode has 'latin-1' 
hardcoded, so properties with other encodings are not supported by 
manage_propertiesForm.


Given that, I don't think we have to support any 
default_zpublisher_encoding other than 'latin-1'.


As AJ answered me 
(http://article.gmane.org/gmane.comp.web.zope.devel/11655), Unicode 
properties should use the u- types (ustring, utext). So the way to 
proceed could be:


* document that string/text properties are supported only for values in 
the intersection of iso-8859-1 and default_encoding


* ensure that the unicode types work (e.g., 
TarballExportContext.writeDataFile doesn't accept unicode text)


* change GenericSetup users (CMF, CPS) to use u* when needed (e.g. 
title, description). Yep, sure :-)
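
To make the first point concrete, here is a minimal sketch of what "supported by both iso-8859-1 and default_encoding" could mean in practice: a value is safe only if both codecs produce the same bytes. It is written for Python 3, where `str` is text and `bytes` plays the role of Zope 2's 8-bit `str`. The helper name `is_portable` is hypothetical and not part of Zope or GenericSetup.

```python
def is_portable(text, default_encoding):
    """Return True if `text` encodes to the same bytes in iso-8859-1
    and in `default_encoding` (hypothetical helper, for illustration)."""
    try:
        return text.encode('iso-8859-1') == text.encode(default_encoding)
    except UnicodeEncodeError:
        # The text is not representable in at least one of the two codecs.
        return False


print(is_portable('Portal', 'utf-8'))           # pure ASCII
print(is_portable('Port\xe0l', 'iso-8859-15'))  # a-grave is 0xe0 in both
print(is_portable('Port\xe0l', 'utf-8'))        # 0xe0 versus 0xc3 0xa0
print(is_portable('\u20ac', 'iso-8859-15'))     # euro sign is not in iso-8859-1
```

Anything that fails such a check would presumably be a candidate for the u* property types instead.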




Cheers, Yuppie


yves

___
Zope-CMF maillist  -  Zope-CMF@lists.zope.org
http://mail.zope.org/mailman/listinfo/zope-cmf

See http://collector.zope.org/CMF for bug reports and feature requests


[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-08 Thread Yves Bastide

yuppie wrote:


As you already mentioned setting default-zpublisher-encoding to 'utf-8' 
doesn't really work. Just found that DT_Util.join_unicode has 'latin-1' 
hardcoded, so properties with other encodings are not supported by 
manage_propertiesForm.


I'm just about to send a mail to zope.devel about this :-)



Given that, I don't think we have to support any 
default_zpublisher_encoding other than 'latin-1'.


Cheers, Yuppie


yves



[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-08 Thread Yves Bastide

yuppie wrote:

Yves Bastide wrote:


converter is field2string, default_encoding is 
ZPublisher.HTTPRequest.default_encoding = 'iso-8859-15' (not to be 
confused with ZPublisher.Converters.default_encoding = 'iso-8859-15').


These default_encoding's are set by 
Zope2.Startup.datatypes.default_zpublisher_encoding (i.e. zope.conf's 
default-zpublisher-encoding directive).


So GenericSetup has to use ZPublisher.HTTPRequest.default_encoding as 
well. Right?


I think so. Or ZPublisher.Converters.default_encoding whose name may be 
more explicit (default_zpublisher_encoding sets 
{Converters,HTTPRequest,HTTPResponse}.default_encoding)
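
As a rough illustration of one setting fanning out to three module-level variables, here is a sketch. The three namespaces are stand-ins for the real ZPublisher modules, and the function body is an assumption about the general shape of such a handler, not a copy of the Zope source.

```python
from types import SimpleNamespace

# Stand-ins for the three ZPublisher modules (assumption: each real module
# exposes a module-level `default_encoding`, as described in the thread).
Converters = SimpleNamespace(default_encoding='iso-8859-15')
HTTPRequest = SimpleNamespace(default_encoding='iso-8859-15')
HTTPResponse = SimpleNamespace(default_encoding='iso-8859-15')


def default_zpublisher_encoding(value):
    """Sketch of a config datatype handler: push one value to all three."""
    for module in (Converters, HTTPRequest, HTTPResponse):
        module.default_encoding = value
    return value


default_zpublisher_encoding('utf-8')
print(Converters.default_encoding,
      HTTPRequest.default_encoding,
      HTTPResponse.default_encoding)
```

If that is how it works, reading any one of the three should give the same answer, so the choice between them is mostly about which name reads best.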




If CMF is messing around with other encodings (like using the site's 
default_charset for the portal title) it has to override that.


If using utf-8 was the wrong approach your test_utils patch has to be 
modified as well.


It first needs to fail in the current setup ...



Cheers, Yuppie


Thanks,

yves



[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-08 Thread Yves Bastide

Replying to myself...

Yves Bastide wrote:


I know of at least one point, ZPublisher.Converters (field2string). 
However by the time a supposedly unicode string (say title:UTF-8:string) 
comes here, it's already iso8859. Will look deeper ...


Well, I should have looked further up the call stack.
ZPublisher.HTTPRequest.processInputs, lines 527 ff. (Zope trunk):

527:     item = unicode(item,character_encoding)
528:     if hasattr(converter,'convert_unicode'):
529:         item = converter.convert_unicode(item)
530:     else:
531:         item = converter(item.encode(default_encoding))

converter is field2string, default_encoding is 
ZPublisher.HTTPRequest.default_encoding = 'iso-8859-15' (not to be 
confused with ZPublisher.Converters.default_encoding = 'iso-8859-15').


These default_encoding's are set by 
Zope2.Startup.datatypes.default_zpublisher_encoding (i.e. zope.conf's 
default-zpublisher-encoding directive).
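
The branch quoted above can be replayed outside Zope. This is a Python 3 sketch in which `bytes` stands for Zope 2's 8-bit `str`. `convert_field` and `UnicodeAwareConverter` are hypothetical names, and the `field2string` here is only a stand-in that returns its argument unchanged.

```python
def convert_field(raw, character_encoding, converter, default_encoding):
    """Mimic the quoted branch: decode the raw form value, then hand the
    converter either text or bytes re-encoded in default_encoding."""
    item = raw.decode(character_encoding)
    if hasattr(converter, 'convert_unicode'):
        return converter.convert_unicode(item)
    return converter(item.encode(default_encoding))


def field2string(value):
    # Stand-in for a byte-oriented converter; it has no convert_unicode.
    return value


class UnicodeAwareConverter:
    # Hypothetical converter that opts in to receiving text.
    def __call__(self, value):
        return value

    def convert_unicode(self, value):
        return value


utf8_input = 'Port\xe0l'.encode('utf-8')
stored = convert_field(utf8_input, 'utf-8', field2string, 'iso-8859-15')
print(stored)  # b'Port\xe0l'
kept = convert_field(utf8_input, 'utf-8', UnicodeAwareConverter(), 'iso-8859-15')
print(ascii(kept))
```

So a value submitted as title:UTF-8:string reaches a plain converter already re-encoded in default_encoding, which matches what I saw.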


Of course, just setting default-zpublisher-encoding to utf-8 results in 
a garbled ZMI ...



yves



[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-08 Thread Yves Bastide

yuppie wrote:

Hi!


[...]
With this applied, Portàl (u'Port\xe0l'), which becomes 
'Port\xc3\xa0l', is displayed as PortÃ l ... Zope does input/output of 
properties in utf-8, but stores them in iso8859.  Sigh.


I was afraid this would be complex :(
That's why I only use ASCII in configuration data.

Can you find out why it stores them in iso8859? Is this hardcoded or 
configurable somewhere?


I know of at least one point, ZPublisher.Converters (field2string). 
However by the time a supposedly unicode string (say title:UTF-8:string) 
comes here, it's already iso8859. Will look deeper ...


[About getEncoding()]


Don't know if third party products use it. I guess if CPS doesn't nobody 
does.


It does, though I suspect incorrectly. Florent?



AFAICS it could be deprecated at least for export contexts.

Well, I think I can wriggle out of most of my problems using 
translation. And I'll try and write UTF-8 unit tests if nobody beats 
me to it.


That would be great.


Hmm, by adding to an existing test suite, or creating a new one?


In general the unit tests have a module / class structure similar to the 
tested units. E.g. tests for utils.PropertyManagerHelpers should be 
added to test_utils.PropertyManagerHelpersTests. But sometimes there are 
reasons to add a new test suite, e.g. if you need a different setup.


I did modify test_utils's properties suite (see attached patch), but it 
passes with GenericSetup current version :-)
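
For reference, the kind of round-trip test being discussed could look roughly like the sketch below. It uses only the standard library under Python 3. `export_property`, `import_property` and `PropertyRoundTripTests` are hypothetical names chosen for illustration, not GenericSetup APIs.

```python
import unittest
from xml.dom.minidom import Document, parseString


def export_property(name, value):
    """Serialize one text property as a UTF-8 encoded XML body (bytes)."""
    doc = Document()
    node = doc.createElement('property')
    node.setAttribute('name', name)
    node.appendChild(doc.createTextNode(value))
    doc.appendChild(node)
    return doc.toprettyxml(' ', encoding='utf-8')


def import_property(body):
    """Parse a body produced by export_property back into (name, value)."""
    node = parseString(body).documentElement
    text = ''.join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)
    # Strip any whitespace a pretty-printer may add around the text.
    return node.getAttribute('name'), text.strip()


class PropertyRoundTripTests(unittest.TestCase):

    def test_non_ascii_title(self):
        body = export_property('title', 'Port\xe0l')
        self.assertIsInstance(body, bytes)
        self.assertIn('Port\xe0l'.encode('utf-8'), body)
        self.assertEqual(import_property(body), ('title', 'Port\xe0l'))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(PropertyRoundTripTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A test of this shape checks both directions at once, which is probably what is needed to see a failure in the current setup.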





Cheers,

Yuppie


yves
Index: GenericSetup/tests/test_utils.py
===
--- GenericSetup/tests/test_utils.py	(revision 68520)
+++ GenericSetup/tests/test_utils.py	(working copy)
@@ -24,7 +24,7 @@
 from Products.GenericSetup.testing import DummySetupEnviron
 
 
-_EMPTY_PROPERTY_EXPORT = """\
+_EMPTY_PROPERTY_EXPORT = u"""\
 
 
  False
@@ -34,6 +34,7 @@
  
  0
  
+ 
  
  
  0.0
  False
 
-"""
+""".encode('utf-8')
 
-_NORMAL_PROPERTY_EXPORT = """\
+_NORMAL_PROPERTY_EXPORT = u"""\
 
 
  True
@@ -60,6 +61,7 @@
  
  1
  Foo String
+ \u0080
  Foo
   Text
  
@@ -78,9 +80,9 @@
  3.1415
  True
 
-"""
+""".encode('utf-8')
 
-_FIXED_PROPERTY_EXPORT = """\
+_FIXED_PROPERTY_EXPORT = u"""\
 
 
  True
@@ -93,6 +95,7 @@
  
  1
  Foo String
+ \u0080
  Foo
   Text
  
@@ -109,7 +112,7 @@
  3.1415
  True
 
-"""
+""".encode('utf-8')
 
 _SPECIAL_IMPORT = """\
 
@@ -240,6 +243,7 @@
 obj.manage_addProperty('foo_lines', '', 'lines')
 obj.manage_addProperty('foo_long', '0', 'long')
 obj.manage_addProperty('foo_string', '', 'string')
+obj.manage_addProperty('foo_unicode_string', '', 'string')
 obj.manage_addProperty('foo_text', '', 'text')
 obj.manage_addProperty('foo_tokens', '', 'tokens')
 obj.manage_addProperty('foo_selection', 'foobarbaz', 'selection')
@@ -264,6 +268,7 @@
 obj._updateProperty('foo_lines', 'Foo\nLines')
 obj._updateProperty('foo_long', '1')
 obj._updateProperty('foo_string', 'Foo String')
+obj._updateProperty('foo_unicode_string', u'\u0080'.encode('utf-8'))
 obj._updateProperty('foo_text', 'Foo\nText')
 obj._updateProperty( 'foo_tokens', ('Foo', 'Tokens') )
 obj._updateProperty('foo_selection', 'Foo')
@@ -303,6 +308,7 @@
 self.assertEqual(getattr(obj, 'foo_lines', None), None)
 self.assertEqual(getattr(obj, 'foo_long', None), None)
 self.assertEqual(getattr(obj, 'foo_string', None), None)
+self.assertEqual(getattr(obj, 'foo_unicode_string', None), None)
 self.assertEqual(getattr(obj, 'foo_text', None), None)
 self.assertEqual(getattr(obj, 'foo_tokens', None), None)
 self.assertEqual(getattr(obj, 'foo_selection', None), None)


[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-07 Thread Yves Bastide

yuppie wrote:

Hi!


Yves Bastide wrote:

yuppie wrote:


3.) GenericSetup is not tested with non-ASCII UTF-8 site settings. 
AFAIK import works, but not export. I consider this a bug.

[...]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf4 in position 20: ordinal not in range(128)


This traceback just confirms that export does not work. Is import also 
broken?


Differently: it may or may not raise ... And Zope treats properties as 
iso8859-15 anyway.


Fresh install of Zope trunk (after a long struggle; make instance now 
works but make install is broken?) and CMF trunk, with


~/src/CMF$ svn diff
Index: CMFDefault/profiles/default/properties.xml
===
--- CMFDefault/profiles/default/properties.xml  (revision 68514)
+++ CMFDefault/profiles/default/properties.xml  (working copy)
@@ -1,6 +1,6 @@
 
 
- Portal
+ Portàl
  
  [EMAIL PROTECTED]

Fails when CMFDefault.factory.addConfiguredSite calls createSnapshot.


Here's a minimal patch for GenericSetup not to raise on the previous 
case (Demonstration product. Not for sale.)


[EMAIL PROTECTED]:~/src/CMF$ svn diff GenericSetup/
Index: GenericSetup/context.py
===
--- GenericSetup/context.py (revision 68514)
+++ GenericSetup/context.py (working copy)
@@ -475,7 +475,7 @@
 if isinstance( body, unicode ):
 encoding = self.getEncoding()
 if encoding is None:
-body = body.encode()
+body = body.encode('UTF-8')
 else:
 body = body.encode( encoding )

Index: GenericSetup/utils.py
===
--- GenericSetup/utils.py   (revision 68514)
+++ GenericSetup/utils.py   (working copy)
@@ -625,6 +625,8 @@
 else:
 if prop_map.get('type') == 'boolean':
 prop = str(bool(prop))
+elif isinstance(prop, str):
+prop = prop.decode('UTF-8')
 elif not isinstance(prop, basestring):
 prop = str(prop)
 child = self._doc.createTextNode(prop)
[EMAIL PROTECTED]:~/src/CMF$

With this applied, Portàl (u'Port\xe0l'), which becomes 'Port\xc3\xa0l', 
is displayed as PortÃ l ... Zope does input/output of properties in utf-8, 
but stores them in iso8859.  Sigh.
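
The garbling can be reproduced in a few lines. This is a Python 3 sketch of the byte-level effect only; it does not involve Zope at all.

```python
title = 'Port\xe0l'                  # the text "Portàl"
utf8_bytes = title.encode('utf-8')
print(utf8_bytes)                    # b'Port\xc3\xa0l'

# Reading those UTF-8 bytes as if they were iso-8859-15 turns the
# two-byte sequence into two separate characters:
# U+00C3 (A with tilde) followed by U+00A0 (no-break space).
garbled = utf8_bytes.decode('iso-8859-15')
print(ascii(garbled))                # 'Port\xc3\xa0l'
print(len(title), len(garbled))
```

The value gains one character, and the second of the two is invisible in most displays, which is why the result looks like it contains a stray space.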




Thanks for setting me right. What's the usefulness of getEncoding()? 
As you say, exported files don't need to be other than utf-8 encoded.


I guess it just exists for historical reasons.


Might it be removed, or default'ed to utf-8? Do people already rely on it?



Well, I think I can wriggle out of most of my problems using 
translation. And I'll try and write UTF-8 unit tests if nobody beats 
me to it.


That would be great.


Hmm, by adding to an existing test suite, or creating a new one?




Cheers,

Yuppie


Thanks,

yves



[Zope-CMF] Re: zLOG -> logging

2006-06-07 Thread Yves Bastide

Florent Guillaume wrote:

Yves Bastide wrote:
And CPS 3.3 and 3.4 have been using Zope 2.9 since its inception, and 
is the only recommended "stable" platform for them.


Though they also flood the console with zLOG deprecation warnings ;-)
(I incrementally patched the worst offenders on my local copy, but 
never sent the result mainly because of the 
BLATHER/TRACE/DEBUG-to-debug impedance mismatch)


Well you're better off patching zLOG then, to make it not send the 
warning :)


Too easy :-)
(And I just saw ZODB.loglevels.{TRACE,BLATHER}. /me go hiding)
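
For what it's worth, the mismatch could probably be handled along these lines. This is a sketch with the standard logging module only. The numeric values for TRACE and BLATHER, and the zLOG severity scale in the docstring, are quoted from memory and should be treated as assumptions to check against ZODB.loglevels and zLOG.

```python
import logging

# Assumed values: ZODB.loglevels is believed to define these two custom
# levels around the standard library's DEBUG (10) and INFO (20).
TRACE = 5
BLATHER = 15
logging.addLevelName(TRACE, 'TRACE')
logging.addLevelName(BLATHER, 'BLATHER')


def zlog_to_logging(severity):
    """Map a zLOG-style severity (assumed scale: -300 TRACE ... 300 PANIC)
    to a logging level, keeping TRACE and BLATHER distinct from DEBUG."""
    if severity >= 300:
        return logging.CRITICAL
    if severity >= 200:
        return logging.ERROR
    if severity >= 100:
        return logging.WARNING
    if severity >= 0:
        return logging.INFO
    if severity >= -100:
        return BLATHER
    if severity >= -200:
        return logging.DEBUG
    return TRACE


for severity in (-300, -200, -100, 0):
    print(severity, logging.getLevelName(zlog_to_logging(severity)))
```

With distinct levels registered, a patched call site would not have to collapse all three onto logger.debug().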



Florent



yves



[Zope-CMF] Re: zLOG -> logging

2006-06-07 Thread Yves Bastide

Florent Guillaume wrote:


And CPS 3.3 and 3.4 have been using Zope 2.9 since its inception, and is 
the only recommended "stable" platform for them.


Though they also flood the console with zLOG deprecation warnings ;-)
(I incrementally patched the worst offenders on my local copy, but never 
sent the result mainly because of the BLATHER/TRACE/DEBUG-to-debug 
impedance mismatch)




Florent



yves



[Zope-CMF] Re: [RFC] [Patch] GenericSetup and encodings

2006-06-07 Thread Yves Bastide

yuppie wrote:

Hi Yves!


Yves Bastide wrote:

GenericSetup has problems handling non-ASCII data.


1.) GenericSetup explicitly doesn't support non-UTF-8 XML in profiles. 
UTF-8 is the default encoding for XML and I can't see a need to support 
other XML encodings.


As output, right? Agreed.



2.) GenericSetup explicitly doesn't support non-UTF-8 site settings. If 
someone provides a good patch this feature can be added.


But with the problems you mention later ('default_charset', 
'management_page_charset', and so on), how would you envision it?




3.) GenericSetup is not tested with non-ASCII UTF-8 site settings. AFAIK 
import works, but not export. I consider this a bug.


Neither: CMF trunk, change portal_types/Document's title to 'Dôcument', 
export:


Traceback (innermost last):
  Module ZPublisher.Publish, line 115, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 41, in call_object
  Module Products.GenericSetup.tool, line 471, in manage_exportAllSteps
  Module Products.GenericSetup.tool, line 272, in runAllExportSteps
  Module Products.GenericSetup.tool, line 736, in _doRunExportSteps
  Module Products.CMFCore.exportimport.typeinfo, line 198, in exportTypesTool
  Module Products.GenericSetup.utils, line 728, in exportObjects
  Module Products.GenericSetup.utils, line 722, in exportObjects
  Module Products.GenericSetup.utils, line 501, in _exportBody
  Module xml.dom.minidom, line 62, in toprettyxml
  Module StringIO, line 271, in getvalue
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf4 in position 20: ordinal not in range(128)
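
The error itself is easy to reproduce. Under Python 2, mixing byte strings and unicode strings triggers an implicit ASCII decode, and that is presumably what happens when StringIO.getvalue joins its buffer in the traceback above. The sketch below spells the decode out explicitly under Python 3; the byte position differs because only the title is decoded here.

```python
# 'Dôcument' stored as 8-bit bytes in a Latin charset: ô is the byte 0xf4.
stored = 'D\xf4cument'.encode('iso-8859-15')

error = None
try:
    # Python 2 performed this decode implicitly when joining byte strings
    # with unicode strings; here it is written out.
    stored.decode('ascii')
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Decoding with the charset the bytes were actually written in works.
print(stored.decode('iso-8859-15'))
```

So the export side would need to decode 8-bit property values with a known charset before they reach the DOM.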





It treats strings sometimes as ASCII, sometimes as UTF-8, yet it has 
access to two variables: its own ISetupContext.getEncoding() (whose 
use I didn't fully grok) and CMF's 
ISetupContext.getSite().getProperty('default_charset').


Sorry, but your assumptions are wrong:

- The default setup tool creates export contexts without specifying the 
encoding, so ISetupContext.getEncoding() always returns None. And even 
if it were set, it would represent the encoding of the exported files, not 
the site encoding.


- getSite().getProperty('default_charset') is CMF specific and should 
not be used in GenericSetup.


- The adapters adapt ISetupEnviron, not ISetupContext. getEncoding() and 
getSite() are not always available.


Thanks for setting me right. What's the usefulness of getEncoding()? As 
you say, exported files don't need to be other than utf-8 encoded.




First of all we need unit tests that make sure UTF-8 works and I think 
this should be the default used by GenericSetup. Code that needs to know 
how to find the site encoding can't be generic.


Yep.



There is an additional problem: If tools use the default property edit 
page from OFS the properties might have a different encoding than 
'default_charset' of the site. Since the default 
'management_page_charset' is UTF-8 we have less trouble if we allow only 
UTF-8.


D'oh! /manage is 8859-15, /manage_menu is -1 and manage_propertiesForm 
UTF-8. No wonder Firefox sometimes gets confused :-)


Well, I think I can wriggle out of most of my problems using 
translation. And I'll try and write UTF-8 unit tests if nobody beats me 
to it.


Thanks!




Cheers,

Yuppie


yves



[Zope-CMF] [RFC] [Patch] GenericSetup and encodings

2006-06-06 Thread Yves Bastide

Hi,

GenericSetup has problems handling non-ASCII data. It treats strings 
sometimes as ASCII, sometimes as UTF-8, yet it has access to two 
variables: its own ISetupContext.getEncoding() (whose use I didn't fully 
grok) and CMF's ISetupContext.getSite().getProperty('default_charset').


Attached is a patch that uses both of them and more or less works in my setup. 
Can knowledgeable people comment on it before I file a collector issue? 
(I'm using GS alongside CPS, which also needs some patching; yet 
basic things, such as exporting and importing an iso8859-15 Title in a CMF 
site whose charset defaults to iso8859-15, should work.)


Thanks!

Yves
Index: GenericSetup/utils.py
===
--- GenericSetup/utils.py	(revision 68510)
+++ GenericSetup/utils.py	(working copy)
@@ -498,7 +498,8 @@
 """Export the object as a file body.
 """
 self._doc.appendChild(self._exportNode())
-return self._doc.toprettyxml(' ')
+encoding = self.environ.getEncoding() or 'UTF-8'
+return self._doc.toprettyxml(' ', encoding=encoding)
 
 def _importBody(self, body):
 """Import the object from the file body.
@@ -617,6 +618,7 @@
 node.setAttribute('name', prop_id)
 
 prop = self.context.getProperty(prop_id)
+encoding = self.environ.getSite().getProperty('default_charset', '') or 'UTF-8'
 if isinstance(prop, (tuple, list)):
 for value in prop:
 child = self._doc.createElement('element')
@@ -625,8 +627,10 @@
 else:
 if prop_map.get('type') == 'boolean':
 prop = str(bool(prop))
+elif isinstance(prop, str):
+prop = prop.decode(encoding)
 elif not isinstance(prop, basestring):
-prop = str(prop)
+prop = unicode(prop)
 child = self._doc.createTextNode(prop)
 node.appendChild(child)
 
@@ -685,9 +689,10 @@
 raise BadRequest('%s cannot be changed' % prop_id)
 
 elements = []
+encoding = self.environ.getEncoding()
 for sub in child.childNodes:
 if sub.nodeName == 'element':
-elements.append(sub.getAttribute('value').encode('utf-8'))
+elements.append(sub.getAttribute('value').encode(encoding))
 
 if elements or prop_map.get('type') == 'multiple selection':
 prop_value = tuple(elements) or ()
@@ -696,7 +701,7 @@
 else:
 # if we pass a *string* to _updateProperty, all other values
 # are converted to the right type
-prop_value = self._getNodeText(child).encode('utf-8')
+prop_value = self._getNodeText(child).encode(encoding)
 
 if not self._convertToBoolean(child.getAttribute('purge')
   or 'True'):
___
Zope-CMF maillist  -  Zope-CMF@lists.zope.org
http://mail.zope.org/mailman/listinfo/zope-cmf

See http://collector.zope.org/CMF for bug reports and feature requests