Hi!

I would like to resurrect this thread with my own recent experience.
My problem was to distribute a set of ZODB objects across several Zope
installations. Each installation required slight modifications to the
original objects (such as usernames and passwords).
So I decided to export the original data as an XML dump and write a
small application that modifies the dump as necessary before import.
But to my distress I could import neither a modified dump nor even
the original dump. The import failed with a UnicodeDecodeError exception.

After some investigation I realized what the problem was (at least in my
case). The XML parser returns all text as unicode strings. But some of
that text is base64-encoded, and it is decoded into non-unicode
strings containing binary data. Of course, Python's implicit ASCII
decoding cannot handle this data. The code in ppml.py sometimes
concatenates the raw and unicode strings, and this raises
UnicodeDecodeError.
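
The failure can be reproduced in isolation. The following is only a
minimal sketch, written for Python 3, where the ASCII decoding that
Python 2 applies implicitly when a byte string meets a unicode string has
to be spelled out by hand. The helper name concat_like_python2 and the
sample payloads are made up for illustration; they are not taken from
ppml.py.

```python
import base64


def concat_like_python2(text, raw):
    # Python 2 evaluates `unicode + str` by first decoding the byte
    # string with the default ASCII codec. Emulate that step explicitly.
    return text + raw.decode("ascii")


# A text fragment, as the XML parser would deliver it.
text = "("

# Two base64 payloads: one that decodes to plain ASCII, one that decodes
# to binary data (hypothetical sample bytes).
ascii_payload = base64.b64decode(base64.b64encode(b"hello"))
binary_payload = base64.b64decode(base64.b64encode(b"\x80\x02}q\x00."))

# Pure ASCII bytes survive the implicit decoding.
ok = concat_like_python2(text, ascii_payload)

# Binary bytes do not: 0x80 is outside the ASCII range.
try:
    concat_like_python2(text, binary_payload)
    failed = False
except UnicodeDecodeError:
    failed = True
```

This would also explain why the error only shows up for some objects:
as long as every decoded payload happens to be ASCII, the concatenation
goes through unnoticed.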

I worked around this by converting the unicode strings into non-unicode
ones. Please see the attached patch. Zope version: 2.9.4.
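
The idea behind the workaround, as a rough Python 3 sketch (the patch
itself is Python 2 and uses str() for the conversion). The function name
join_fragments and the sample fragments are hypothetical, and the sketch
assumes that every text fragment is pure ASCII.

```python
def join_fragments(fragments):
    """Join a mix of text and byte fragments into one byte string."""
    as_bytes = []
    for frag in fragments:
        if isinstance(frag, str):
            # Assumption: text fragments contain only ASCII characters,
            # so encoding them loses nothing.
            frag = frag.encode("ascii")
        as_bytes.append(frag)
    return b"".join(as_bytes)


# Text fragments from the parser mixed with an already decoded binary
# fragment (made-up sample data).
joined = b"(" + join_fragments(["U\x05", b"\x80\x02", "q\x01"]) + b"t"
```

Like str() on a unicode object in Python 2, the encode("ascii") call
fails if a text fragment contains non-ASCII characters, so this approach
only holds as long as that assumption does.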

--
Best regards,
Alexei

On 25. March 2006 21:40:48 +0100 Yoshinori Okuji <yo at nexedi.com> wrote:

 > On Saturday 25 March 2006 15:56, Andreas Jung wrote:
 >> Zope 2.7 throws a BadPickleGet, 12 exception, Zope 2.8 throws
 >> BadPickleGet, 13 and Zope 2.9 raises the described UnicodeDecodeError.
 >> I don't expect that the import functionality works for even more complex
 >> objects. So I consider the whole functionality as totally broken. The
 >> generated XML might be useful to perform any processing outside Zope but
 >> using it for re-importing it into another Zope systems definitely does
 >> _not_  work. So if the functionality should remain in Zope then it should
 >> be fixed for Zope 2.10 lately.
 >
 > Here is a quick patch for this problem (against 2.9.1). There were two
 > different problems:
 >
 > - the id attributes were not generated, because the conditional was
 > reversed.
 >
 > - unlike xmllib, expat always returns Unicode data, so simply
 > concatenating binary values generates Unicode objects with non-ascii
 > characters.
 >


--- /tmp/ppml.py	2006-10-27 00:36:18.000000000 +0600
+++ /usr/lib/zope2.9/lib/python/Shared/DC/xml/ppml.py	2006-10-27 00:00:17.000000000 +0600
@@ -573,7 +573,9 @@
 def save_tuple(self, tag, data):
     T=data[2:]
     if not T: return ')'
-    return save_put(self, '('+string.join(T,'')+'t', data[1])
+    try: ret='('+string.join(T,'')+'t'
+    except UnicodeDecodeError: ret='('+string.join(map(str,T),'')+'t'
+    return save_put(self, ret, data[1])
 
 def save_list(self, tag, data):
     L=data[2:]
@@ -590,7 +592,9 @@
     D=data[2:]
     if self.binary:
         v=save_put(self, '}', data[1])
-        if D: v=v+'('+string.join(D,'')+'u'
+        if D:
+            try: v=v+'('+string.join(D,'')+'u'
+            except UnicodeDecodeError: v=v+'('+string.join(map(str,D),'')+'u'
     else:
         v=save_put(self, '(d', data[1])
         if D: v=v+string.join(D,'s')+'s'
@@ -623,7 +627,8 @@
     stop=string.rfind(x,'t')  # This seems
     if stop>=0: x=x[:stop]    # wrong!
     v=save_put(self, v+x+'o', data[1])
-    v=v+data[4]+'b' # state
+    try: v=v+data[4]+'b' # state
+    except UnicodeDecodeError: v=str(v)+str(data[4])+'b' # state
     return v
 
 def save_global(self, tag, data):

