I'm working on a CGI script that pulls XML data from a public database (Medline) and caches this data using shelve to minimize load on the database. In general the script works quite well, but it crashes every time I try to pickle one particular XML document. Below is a script that illustrates the problem, followed by the stack trace it generates (thanks to Kent Johnson, who helped me refine the script). I'd appreciate any advice on solving this problem. Someone on Python-Tutor suggested that the XML document has a circular reference, but I'm not sure exactly what that means, or why the document would have a reference to itself.

import urllib
from pickle import Pickler
from cStringIO import StringIO
from xml.dom import minidom

baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'
params = {
    'db': 'pubmed',
    'retmode': 'xml',
    'rettype': 'medline',
}

badkey = '16842422'
params['id'] = badkey
url = baseurl + urllib.urlencode(params)

doc = minidom.parseString(urllib.urlopen(url).read())
print 'Successfully retrieved and parsed XML document with ID %s' % badkey

f = StringIO()
p = Pickler(f, 0)
p.dump(doc)
#Will fail on the above line
print 'Successfully shelved XML document with ID %s' % badkey

Here is the top of the stack trace:

  File "BadShelve.py", line 35, in <module>
    p.dump(doc)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 224, in dump
    self.save(obj)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 725, in save_inst
    save(stuff)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 663, in _batch_setitems
    save(v)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py", line 725, in save_inst
    save(stuff)
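
My best guess at what the "circular reference" suggestion refers to is that minidom nodes point at their parents while the parents point back at their children. The sketch below shows this on a tiny made-up stand-in document (not the real Medline record), and then tries one workaround I am considering: caching the serialized XML text instead of the DOM object and reparsing it on load. The names xml_text, dumps_xml and loads_xml are hypothetical, and I have not confirmed that these back-references are the actual cause of the failure above.

```python
import pickle
from xml.dom import minidom

# Made-up stand-in for the real document; the element names are only illustrative.
xml_text = ('<PubmedArticleSet><PubmedArticle>'
            '<PMID>16842422</PMID>'
            '</PubmedArticle></PubmedArticleSet>')
doc = minidom.parseString(xml_text)

root = doc.documentElement
child = root.firstChild

# The parent holds its child, and the child holds its parent:
# following these links leads back to where you started.
parent_points_to_child = root.childNodes[0] is child
child_points_to_parent = child.parentNode is root


def dumps_xml(document):
    """Pickle the serialized XML text rather than the DOM tree itself."""
    return pickle.dumps(document.toxml(), 0)


def loads_xml(data):
    """Rebuild a DOM tree from pickled XML text."""
    return minidom.parseString(pickle.loads(data))


# Round trip: DOM -> XML text -> pickle -> XML text -> DOM
restored = loads_xml(dumps_xml(doc))
pmid = restored.getElementsByTagName('PMID')[0].firstChild.data
```

The trade-off is that every cache hit pays for a reparse, but the pickler only ever sees a flat string, so the node-to-node links never enter the pickle.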