Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Aditya Lal
On 10/31/07, Orest Kozyar [EMAIL PROTECTED] wrote:

  Please post the entire traceback (omitting duplicate lines).

 Sorry, I should have included the traceback.  I've revised the sample
 script
 so that it generates the traceback when run.  The sample script is at the
 very bottom of this email.

 ### SCRIPT OUTPUT ###
 [kozyar]:~$ python example.py

 Successfully retrieved and parsed XML document with ID 16842423
 Successfully shelved XML document with ID 16842423
 Successfully retrieved and parsed XML document with ID 16842422

 Traceback (most recent call last):
   File example.py, line 30, in module
 data[badkey] = doc
   File /usr/lib64/python2.5/shelve.py, line 123, in __setitem__
 p.dump(value)

 RuntimeError: maximum recursion depth exceeded

 Exception exceptions.RuntimeError: 'maximum recursion depth exceeded' in
 bound method DbfilenameShelf.__del__ of {'16842423':
 xml.dom.minidom.Document instance at 0x96f290} ignored


 ### START SCRIPT ###
 import urllib, shelve
 from xml.dom import minidom

 baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'

 params = {
 'db':   'pubmed',
 'retmode':  'xml',
 'rettype':  'medline'
 }

 badkey = '16842422'
 goodkey = '16842423' #or just about any other ID

 data = shelve.open('data.tmp', writeback=True)

 params['id'] = goodkey
 url = baseurl + urllib.urlencode(params)
 doc = minidom.parseString(urllib.urlopen(url).read())
 print 'Successfully retrieved and parsed XML document with ID %s' %
 goodkey
 data[goodkey] = doc
 print 'Successfully shelved XML document with ID %s' % goodkey

 params['id'] = badkey
 url = baseurl + urllib.urlencode(params)
 doc = minidom.parseString(urllib.urlopen(url).read())
 print 'Successfully retrieved and parsed XML document with ID %s' % badkey
 data[badkey] = doc
 #Will fail on the above line
 print 'Successfully shelved XML document with ID %s' % badkey


The program worked fine for me (python 2.5.1 - stackless on Mac OSX). Can
you provide details about your platform (python version, OS, etc.) ?
-- 
Aditya
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Eric Brunson
Orest Kozyar wrote:
 Please post the entire traceback (omitting duplicate lines). 
 

 Sorry, I should have included the traceback.  I've revised the sample script
 so that it generates the traceback when run.  The sample script is at the
 very bottom of this email.

 ### SCRIPT OUTPUT ###
 [kozyar]:~$ python example.py 

 Successfully retrieved and parsed XML document with ID 16842423
 Successfully shelved XML document with ID 16842423
 Successfully retrieved and parsed XML document with ID 16842422

 Traceback (most recent call last):
   File example.py, line 30, in module
 data[badkey] = doc
   File /usr/lib64/python2.5/shelve.py, line 123, in __setitem__
 p.dump(value)
 RuntimeError: maximum recursion depth exceeded

 Exception exceptions.RuntimeError: 'maximum recursion depth exceeded' in
 bound method DbfilenameShelf.__del__ of {'16842423':
 xml.dom.minidom.Document instance at 0x96f290} ignored


 ### START SCRIPT ###
 import urllib, shelve
 from xml.dom import minidom

 baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'

 params = {
 'db':   'pubmed',
 'retmode':  'xml',
 'rettype':  'medline'
 }

 badkey = '16842422'
 goodkey = '16842423' #or just about any other ID

 data = shelve.open('data.tmp', writeback=True)

 params['id'] = goodkey
 url = baseurl + urllib.urlencode(params)
 doc = minidom.parseString(urllib.urlopen(url).read())
 print 'Successfully retrieved and parsed XML document with ID %s' % goodkey
 data[goodkey] = doc
 print 'Successfully shelved XML document with ID %s' % goodkey

 params['id'] = badkey
 url = baseurl + urllib.urlencode(params)
 doc = minidom.parseString(urllib.urlopen(url).read())
 print 'Successfully retrieved and parsed XML document with ID %s' % badkey
   


The problem seems to be with pickling the doc object.

try adding:

import pickle
print pickle.dumps(doc)

right here in your code.  Are you sure the XML is well formed?

If nothing else, you could try shelving the XML rather than the doc.

 data[badkey] = doc
 #Will fail on the above line
 print 'Successfully shelved XML document with ID %s' % badkey

 ___
 Tutor maillist  -  Tutor@python.org
 http://mail.python.org/mailman/listinfo/tutor
   

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Eric Brunson
Orest Kozyar wrote:
 Please post the entire traceback (omitting duplicate lines). 
 

 Sorry, I should have included the traceback.  I've revised the sample script
 so that it generates the traceback when run.  The sample script is at the
 very bottom of this email.
   

It appears you have a cyclic reference in your doc object.  Try adding 
doc.unlink() before you add it to your shelf.

 ### SCRIPT OUTPUT ###
 [kozyar]:~$ python example.py 

 Successfully retrieved and parsed XML document with ID 16842423
 Successfully shelved XML document with ID 16842423
 Successfully retrieved and parsed XML document with ID 16842422

 Traceback (most recent call last):
   File example.py, line 30, in module
 data[badkey] = doc
   File /usr/lib64/python2.5/shelve.py, line 123, in __setitem__
 p.dump(value)
 RuntimeError: maximum recursion depth exceeded

 Exception exceptions.RuntimeError: 'maximum recursion depth exceeded' in
 bound method DbfilenameShelf.__del__ of {'16842423':
 xml.dom.minidom.Document instance at 0x96f290} ignored


 ### START SCRIPT ###
 import urllib, shelve
 from xml.dom import minidom

 baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'

 params = {
 'db':   'pubmed',
 'retmode':  'xml',
 'rettype':  'medline'
 }

 badkey = '16842422'
 goodkey = '16842423' #or just about any other ID

 data = shelve.open('data.tmp', writeback=True)

 params['id'] = goodkey
 url = baseurl + urllib.urlencode(params)
 doc = minidom.parseString(urllib.urlopen(url).read())
 print 'Successfully retrieved and parsed XML document with ID %s' % goodkey
 data[goodkey] = doc
 print 'Successfully shelved XML document with ID %s' % goodkey

 params['id'] = badkey
 url = baseurl + urllib.urlencode(params)
 doc = minidom.parseString(urllib.urlopen(url).read())
 print 'Successfully retrieved and parsed XML document with ID %s' % badkey
 data[badkey] = doc
 #Will fail on the above line
 print 'Successfully shelved XML document with ID %s' % badkey

 ___
 Tutor maillist  -  Tutor@python.org
 http://mail.python.org/mailman/listinfo/tutor
   

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Kent Johnson
Orest Kozyar wrote:
 Please post the entire traceback (omitting duplicate lines). 
 
 Sorry, I should have included the traceback.  I've revised the sample script
 so that it generates the traceback when run.  The sample script is at the
 very bottom of this email.

I've poked at this a little. The problem is in the pickling of the 
minidom object. It fails for me using
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin

If you use pickle instead of cPickle the stack trace at least shows the 
recursive calls.

Here is a slightly shorter program that demonstrates the problem a 
little better:

import urllib
from pickle import Pickler
from cStringIO import StringIO
from xml.dom import minidom

baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'

params = {
 'db':   'pubmed',
 'retmode':  'xml',
 'rettype':  'medline',
 }

badkey = '16842422'

params['id'] = badkey
url = baseurl + urllib.urlencode(params)
doc = minidom.parseString(urllib.urlopen(url).read())
print 'Successfully retrieved and parsed XML document with ID %s' % badkey

f = StringIO()
p = Pickler(f, 0)
p.dump(doc)

#Will fail on the above line
print 'Successfully shelved XML document with ID %s' % badkey


Here is the top of the stack trace:
   File BadShelve.py, line 35, in module
 p.dump(doc)
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 224, in dump
 self.save(obj)
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 286, in save
 f(self, obj) # Call unbound method with explicit self
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 725, in save_inst
 save(stuff)
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 286, in save
 f(self, obj) # Call unbound method with explicit self
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 649, in save_dict
 self._batch_setitems(obj.iteritems())
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 663, in _batch_setitems
 save(v)
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 286, in save
 f(self, obj) # Call unbound method with explicit self
   File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/pickle.py, 
line 725, in save_inst
 save(stuff)


You might want to take this to comp.lang.python or perhaps xml-sig.

Kent
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Orest Kozyar
 It appears you have a cyclic reference in your doc object.  
 Try adding doc.unlink() before you add it to your shelf.

That fixed the problem.  I did not realize that XML documents could have
cyclic references.  Thanks for all your help!

Orest

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Kent Johnson
Eric Brunson wrote:
 Orest Kozyar wrote:
 Please post the entire traceback (omitting duplicate lines). 
 
 Sorry, I should have included the traceback.  I've revised the sample script
 so that it generates the traceback when run.  The sample script is at the
 very bottom of this email.
   
 
 It appears you have a cyclic reference in your doc object.  Try adding 
 doc.unlink() before you add it to your shelf.

I thought pickle was supposed to deal with cyclic references? The docs say,

The pickle module keeps track of the objects it has already serialized, 
so that later references to the same object won't be serialized again. 
marshal doesn't do this.

This has implications both for recursive objects and object sharing. 
Recursive objects are objects that contain references to themselves. 
These are not handled by marshal, and in fact, attempting to marshal 
recursive objects will crash your Python interpreter. Object sharing 
happens when there are multiple references to the same object in 
different places in the object hierarchy being serialized. pickle stores 
such objects only once, and ensures that all other references point to 
the master copy. Shared objects remain shared, which can be very 
important for mutable objects.

Kent

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-31 Thread Orest Kozyar

 It appears you have a cyclic reference in your doc object.  
 Try adding doc.unlink() before you add it to your shelf.

Actually, I just realized that doc.unlink() seems to delete the entire XML
content, which kind of defeats the purpose of caching it.  I'll check on
xml-sig and the comp.lang.python group to see if there are any suggestions.

Thanks again for your help!
Orest

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


[Tutor] perplexing error with shelve

2007-10-30 Thread Orest Kozyar
I have a program which queries an online database (Medline) for XML data.
It caches all data using shelve to avoid hitting the database too many
times.  For some reason, I keep getting a RuntimeError: maximum recursion
depth exceeded when attempting to add a certain record.  This program has
successfully handled over 40,000 records from Medline, but for whatever
reason, this particular record (PMID: 16842422) produces this error.  If I
set maximum recursion depth to a larger value such as 5,000, I get a
segfault.  

Below is a script that should reproduce the problem I am having.  Any
guidance in solving this problem would greatly be appreciated.  

Thank you,
Orest

### START SCRIPT ###
import urllib, shelve
from xml.dom import minidom

baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'

params = {
'db':   'pubmed',
'retmode':  'xml',
'rettype':  'medline'
}

badkey = '16842422'
goodkey = '16842423' #or just about any other ID

data = shelve.open('data.tmp', writeback=True)

try:
params['id'] = goodkey
url = baseurl + urllib.urlencode(params)
doc = minidom.parseString(urllib.urlopen(url).read())
print 'Successfully retrieved and parsed XML document with ID %s' %
goodkey
data[goodkey] = doc
print 'Successfully shelved XML document with ID %s' % goodkey
except RuntimeError:
print 'Should not see this error message!'

try:
params['id'] = badkey
url = baseurl + urllib.urlencode(params)
doc = minidom.parseString(urllib.urlopen(url).read())
print 'Successfully retrieved and parsed XML document with ID %s' %
badkey
data[badkey] = doc
#Should not get this message!
print 'Successfully shelved XML document with ID %s' % badkey
except RuntimeError, e:
print 'Error shelving XML document with ID %s' % badkey
print e
### END SCRIPT ###

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-30 Thread Bob Gailer
Orest Kozyar wrote:
 I have a program which queries an online database (Medline) for XML data.
 It caches all data using shelve to avoid hitting the database too many
 times.  For some reason, I keep getting a RuntimeError: maximum recursion
 depth exceeded when attempting to add a certain record.  
Please post the entire traceback (omitting duplicate lines). Else we 
can't (or won't try to) help.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve

2007-10-30 Thread bob gailer
Orest Kozyar wrote:
 I have a program which queries an online database (Medline) for XML data.
 It caches all data using shelve to avoid hitting the database too many
 times.  For some reason, I keep getting a RuntimeError: maximum recursion
 depth exceeded when attempting to add a certain record.  
Please post the entire traceback. Else we can't (or won't try to) help.
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] perplexing error with shelve REVISED

2007-10-30 Thread Orest Kozyar
 Please post the entire traceback (omitting duplicate lines). 

Sorry, I should have included the traceback.  I've revised the sample script
so that it generates the traceback when run.  The sample script is at the
very bottom of this email.

### SCRIPT OUTPUT ###
[kozyar]:~$ python example.py 

Successfully retrieved and parsed XML document with ID 16842423
Successfully shelved XML document with ID 16842423
Successfully retrieved and parsed XML document with ID 16842422

Traceback (most recent call last):
  File example.py, line 30, in module
data[badkey] = doc
  File /usr/lib64/python2.5/shelve.py, line 123, in __setitem__
p.dump(value)
RuntimeError: maximum recursion depth exceeded

Exception exceptions.RuntimeError: 'maximum recursion depth exceeded' in
bound method DbfilenameShelf.__del__ of {'16842423':
xml.dom.minidom.Document instance at 0x96f290} ignored


### START SCRIPT ###
import urllib, shelve
from xml.dom import minidom

baseurl = 'http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?'

params = {
'db':   'pubmed',
'retmode':  'xml',
'rettype':  'medline'
}

badkey = '16842422'
goodkey = '16842423' #or just about any other ID

data = shelve.open('data.tmp', writeback=True)

params['id'] = goodkey
url = baseurl + urllib.urlencode(params)
doc = minidom.parseString(urllib.urlopen(url).read())
print 'Successfully retrieved and parsed XML document with ID %s' % goodkey
data[goodkey] = doc
print 'Successfully shelved XML document with ID %s' % goodkey

params['id'] = badkey
url = baseurl + urllib.urlencode(params)
doc = minidom.parseString(urllib.urlopen(url).read())
print 'Successfully retrieved and parsed XML document with ID %s' % badkey
data[badkey] = doc
#Will fail on the above line
print 'Successfully shelved XML document with ID %s' % badkey

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor