urllib problem (maybe bugs?)
Hi,

I'm trying to fill in the form on the page http://www.cbs.dtu.dk/services/TMHMM/ using urllib. There are two peculiarities.

First, I am deliberately filling in incorrect key/value pairs in the parameters, because that's the only way I can get it to work. For "version" I am supposed to leave it unchecked, giving it the value of an empty string, and for "outform" I am supposed to assign the value "-short". Instead, I left out "outform" altogether and filled in "-short" for "version". I discovered this by accident.

Second, with that workaround it works fine for small SEQ values, but when I try to send a large amount of data (1.4MB), it fails with an AttributeError exception.

I strongly suspect both problems are the result of bugs in the urllib module. Any suggestions?

This is my code:

    import urllib

    fd = open('secretory0_1.txt', 'r')
    txt = fd.read()
    fd.close()

    params = urllib.urlencode({'SEQ': txt,
                               'configfile': '/usr/opt/www/pub/CBS/services/TMHMM-2.0/TMHMM2.cf',
                               'version': '-short'})
    f = urllib.urlopen("http://www.cbs.dtu.dk/cgi-bin/nph-webface", params)
    data = f.read()

    start = data.find('follow This link')
    secondurl = data[start:end]
    f = urllib.urlopen(secondurl)
    print f.read()

The value pairs I am supposed to fill in are:

    SEQ        => some sequence here
    configfile => '/usr/opt/www/pub/CBS/services/TMHMM-2.0/TMHMM2.cf'
    version    => ''
    outform    => '-short'

The exception I get when sending secretory0_1.txt is:

    C:\Documents and Settings\thw\桌面>python testhttp.py
    Traceback (most recent call last):
      File "testhttp.py", line 11, in ?
        f = urllib.urlopen("http://www.cbs.dtu.dk/cgi-bin/nph-webface", params)
      File "C:\Python24\lib\urllib.py", line 79, in urlopen
        return opener.open(url, data)
      File "C:\Python24\lib\urllib.py", line 182, in open
        return getattr(self, name)(url, data)
      File "C:\Python24\lib\urllib.py", line 307, in open_http
        return self.http_error(url, fp, errcode, errmsg, headers, data)
      File "C:\Python24\lib\urllib.py", line 322, in http_error
        return self.http_error_default(url, fp, errcode, errmsg, headers)
      File "C:\Python24\lib\urllib.py", line 550, in http_error_default
        return addinfourl(fp, headers, "http:" + url)
      File "C:\Python24\lib\urllib.py", line 836, in __init__
        addbase.__init__(self, fp)
      File "C:\Python24\lib\urllib.py", line 786, in __init__
        self.read = self.fp.read
    AttributeError: 'NoneType' object has no attribute 'read'

Timothy

--
http://mail.python.org/mailman/listinfo/python-list
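A small offline sketch may help when comparing what the script sends with what a browser sends. It is written for Python 3 (`urllib.parse`), not the Python 2.4 `urllib` used in the post, and it only encodes the fields without contacting the server. The sequence value is a hypothetical placeholder. One thing worth noting, without claiming it explains the server's behaviour: if "version" is a checkbox, as "unchecked" suggests, a browser normally omits an unchecked checkbox from the request entirely instead of sending an empty string.

```python
from urllib.parse import urlencode, parse_qs

CONFIG = "/usr/opt/www/pub/CBS/services/TMHMM-2.0/TMHMM2.cf"
SEQ = ">demo\nMKTAYIAKQR"  # hypothetical placeholder, not real input data

# The pairs as listed in the post, with "version" sent as an empty string.
documented = {"SEQ": SEQ, "configfile": CONFIG, "version": "", "outform": "-short"}

# What a browser would typically send if "version" is an unchecked checkbox:
# the field is left out altogether.
browser_like = {k: v for k, v in documented.items() if k != "version"}

body_documented = urlencode(documented)
body_browser_like = urlencode(browser_like)

# Decode again to inspect exactly which fields each body carries.
decoded_documented = parse_qs(body_documented, keep_blank_values=True)
decoded_browser_like = parse_qs(body_browser_like, keep_blank_values=True)

print(decoded_documented["version"])        # ['']
print("version" in decoded_browser_like)    # False
print(decoded_browser_like["outform"])      # ['-short']
```

Printing the two encoded bodies side by side makes it easier to tell whether a difference in behaviour comes from the field set or from the amount of data.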
Re: urllib problem (maybe bugs?)
On Wed, 30 Mar 2005 18:25:56 +0200, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Timothy Wu wrote:
>
> > After I've done that it works fine for small SEQ values. Then, when I
> > try to send large amount of data (1.4MB), it fails miserably with
> > AttributeError exception.
>
> the page states that you should send no more than 4000 proteins. how
> many proteins do you have in your 1.4 megabyte file?

It is exactly 4000 proteins. I did it by hand with 4000 in a browser and it works just fine. I am not 100% sure, but sending 2000 proteins seems to break when I run it with urllib.

> > I highly suspect the two problems I have are the result of some bugs
> > in the urllib module. Any suggestions?
>
> if the urllib module couldn't handle forms, don't you think anyone else
> would have noticed that by now?

I would like to think so. It has worked for me before with much smaller inputs on other sites, too. However, I can't explain why I had to intentionally fill in incorrect values for the fields for it to work (even for a test sequence merely 10 characters long), while the browser handles it just fine.

> > File "C:\Python24\lib\urllib.py", line 786, in __init__
> >     self.read = self.fp.read
> > AttributeError: 'NoneType' object has no attribute 'read'

The documentation says, "If the connection cannot be made, or if the server returns an error code, the IOError exception is raised." So an AttributeError is not the expected error.

> my guess is that the server shuts the connection down when you're send
> too much data to it. have you contacted the server administrators? (see
> the bottom of that page).

I'll try to check on this. Thanks. However, the problem of having to fill in incorrect field values is still unexplained.
Generator question
Hi,

Using a generator recursively is not doing what I expect:

    def test_gen(x):
        yield x
        x = x - 1
        if x != 0:
            test_gen(x)

    for item in test_gen(3):
        print item

This prints only the number 3, and not 2 and 1 as I would expect. What is wrong?

Timothy
Re: Generator question
On 11/26/06, Robert Kern <[EMAIL PROTECTED]> wrote:
> The only thing that the last line does is *create* a new generator object. You
> need to actually iterate over it and yield its values. E.g.
>
> In [2]: def test_gen(x):
>    ...:     yield x
>    ...:     x -= 1
>    ...:     if x != 0:
>    ...:         for y in test_gen(x):
>    ...:             yield y
>    ...:
>    ...:
>
> In [3]: list(test_gen(3))
> Out[3]: [3, 2, 1]

Ha-HA, that makes perfect sense I guess. Though in my opinion the definition makes the code a bit harder to read. Thanks for the explanation.

Timothy
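On the readability point: Python 3.3 and later (well after this thread) added the `yield from` expression, which delegates to a sub-generator without the explicit inner loop. A minimal sketch, assuming a Python 3.3+ interpreter; it is equivalent to the loop version quoted in this message for this simple case:

```python
def test_gen(x):
    """Yield x, x-1, ..., 1 by delegating to a recursive call."""
    yield x
    x -= 1
    if x != 0:
        # Re-yield every value produced by the nested generator.
        yield from test_gen(x)


print(list(test_gen(3)))  # [3, 2, 1]
```

Calling `test_gen(x)` on its own line still only creates a generator object; `yield from` is what actually iterates over it.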
Gdmodule
Hi,

Is Gdmodule used much in the Python community, or are there alternative packages more suitable for the purpose? The documentation for Gdmodule (http://newcenturycomputers.net/projects/gd-ref.html) seems to require prior experience with the GD library in another language. Or at least, it's too difficult for me to grasp.

Timothy
xml sax
Hi,

I am using xml.sax.handler.ContentHandler to parse some simple XML.

I want to be able to parse the content of this tag embedded in the XML:

    <Id>174</Id>

Is the proper way of doing so to find the "Id" tag in startElement(), set a flag when seeing one, and in characters(), when that flag is set, save the content?

What if multiple tags of the same name are nested at different levels and I want to differentiate them? I would be setting a flag for each level. I can imagine things getting pretty messy with flags all around.

Timothy
Re: xml sax
Oh right, why didn't I think of that. =) Many thanks.

Timothy

On Thu, Mar 20, 2008 at 1:45 AM, Robert Bossy <[EMAIL PROTECTED]> wrote:

> Timothy Wu wrote:
> > Hi,
> >
> > I am using xml.sax.handler.ContentHandler to parse some simple xml.
> >
> > I want to be able to parse the content of this tag embedded in
> > the XML.
> > <Id>174</Id>
> >
> > Is the proper way of doing so involving finding the "Id" tag
> > from startElement(), setting flag when seeing one, and in characters(),
> > when seeing that flag set, save the content?
> >
> > What if multiple tags of the same name are nested at different levels
> > and I want to differentiate them? I would be setting a flag for each level.
> > I can imagine things get pretty messy when flags are all around.
>
> Hi,
>
> You could have a list of all opened elements from the root to the
> innermost. To keep such a list, you append the name of the element to
> this stack at the end of startElement() and pop it off at the end of
> endElement().
>
> In this way you have access to the path of the current parser position.
> In order to differentiate between character data in Id and in Id/Id, you
> just have to iterate at the last elements of the list.
>
> Cheers,
> RB
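The stack idea suggested in the quoted reply could be sketched roughly as follows. This is only an illustration in Python 3: the class name `PathHandler`, the `Doc` root element, and the sample document are hypothetical, and only the "Id" tag with content 174 comes from the thread.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PathHandler(ContentHandler):
    """Collect character data keyed by the path of currently open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of open elements, root first
        self.text = {}    # "Doc/Id" -> accumulated text

    def startElement(self, name, attrs):
        self.stack.append(name)

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Skip whitespace-only chunks (a simplification for this sketch).
        if content.strip():
            path = "/".join(self.stack)
            self.text[path] = self.text.get(path, "") + content


# Hypothetical sample with the same tag name nested at two levels.
SAMPLE = b"<Doc><Id>174<Id>9</Id></Id></Doc>"

handler = PathHandler()
xml.sax.parseString(SAMPLE, handler)
print(handler.text)  # {'Doc/Id': '174', 'Doc/Id/Id': '9'}
```

Because the key is the whole path, no per-level flags are needed: "Doc/Id" and "Doc/Id/Id" are simply different keys.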
xml.sax problem
Hi,

I have created a very, very simple parser for an XML file:

    class FindGoXML2(ContentHandler):
        def characters(self, content):
            print content

I have made it simple because I want to debug. This prints out any content enclosed by tags (right?). The XML is publicly available here:
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&id=9622&retmode=xml

Here are a few lines embedded in this XML:

    GO
    3824
    catalytic activity
    evidence: IEA

Notice the third line before the last. I expect my content printout to print "evidence: IEA". However, this is what I get:

    -
    catalytic activity    ==> this is the printout of the line before


    e
    vidence: IEA
    -

I don't understand why a few blank lines were printed after "catalytic activity", but that doesn't matter. What matters is that the string "evidence: IEA" is split into two printouts: first it prints only "e", then "vidence: IEA". I parsed 825 such XML files without a problem; this occurs on my 826th.

Any explanations?

Timothy
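A likely explanation, offered as a hedged note and not as a confirmed diagnosis of this particular file: SAX parsers are allowed to deliver one run of character data in several chunks, so characters() may be called more than once for the same text (for example when the text straddles an internal buffer boundary). The usual approach is to buffer the chunks and join them in endElement(). A minimal Python 3 sketch; the class name `BufferingHandler` and the `note` element are hypothetical, and the input is fed in two pieces only to make a split likely:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class BufferingHandler(ContentHandler):
    """Join character chunks so each element's text is seen as one string."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self.texts = []

    def startElement(self, name, attrs):
        # Simplification: text preceding a child element is discarded.
        self._chunks = []

    def characters(self, content):
        # May be called several times for one run of text.
        self._chunks.append(content)

    def endElement(self, name):
        text = "".join(self._chunks).strip()
        if text:
            self.texts.append(text)
        self._chunks = []


handler = BufferingHandler()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
# Feed the document in two pieces, cutting the text after the first letter.
parser.feed(b"<note>e")
parser.feed(b"vidence: IEA</note>")
parser.close()

print(handler.texts)  # ['evidence: IEA']
```

However the parser chooses to split the text, the joined result is the same, which is why printing directly from characters() is unreliable for inspecting element content.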
Serialization, save type information in file and restore them
Hi,

I created a class that's able to manipulate tabulated data. I want to be able to dump the bulk of the data and other attributes as tab-delimited text.

I have trouble saving and restoring type information in the file. For example, some attributes are int, others may be float, etc. So I want to store the data types as well as the data values themselves in a file. I don't think I want to use pickle, because I want the file to be readable in vi as a tab-delimited file and importable into Excel as well.

What's the best way to achieve this? I was able to write a string like "attribute = int(value)" into a file. But how do I get the value back? I want the "int(value)" string to be loaded into the program and be executable, so I can actually create the instance variable in the class.

Any help appreciated, thanks.

Timothy
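One possible approach, sketched here under assumptions and not taken from any reply in the thread: store the type name in its own tab-separated column and map it back to a constructor through a small lookup table, which avoids executing text read from the file. The function names `dump_attrs` and `load_attrs`, the `CASTS` table, and the three-column layout are all hypothetical; the sketch uses Python 3 and the standard `csv` module.

```python
import csv
import io

# Only the types listed here can be restored; anything else raises KeyError.
CASTS = {"int": int, "float": float, "str": str}


def dump_attrs(attrs, stream):
    """Write one 'name<TAB>type<TAB>value' row per attribute."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    for name, value in attrs.items():
        writer.writerow([name, type(value).__name__, value])


def load_attrs(stream):
    """Rebuild the attribute dict, converting each value by its type name."""
    reader = csv.reader(stream, delimiter="\t")
    return {name: CASTS[type_name](text) for name, type_name, text in reader}


# Round trip through an in-memory file; a real file opened in text mode
# would work the same way.
buf = io.StringIO()
dump_attrs({"rows": 3, "scale": 0.5, "label": "demo"}, buf)
buf.seek(0)
restored = load_attrs(buf)
print(restored)  # {'rows': 3, 'scale': 0.5, 'label': 'demo'}
```

The resulting file stays plain tab-delimited text, so it opens in vi or a spreadsheet, and the restored values can be assigned to instance variables with `setattr(obj, name, value)`.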