Well, thanks ... it worked well ... but robotparser is built on urllib; isn't there a module like robotparser for urllib2?
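As far as I know there is no robotparser counterpart built on urllib2 in the 2.x standard library, but you can sidestep robotparser's built-in fetching: download robots.txt with urllib2 (where setting a User-Agent header is easy) and hand the lines to robotparser.parse(). A rough, untested sketch; 'MyBot/1.0' and the Wikipedia URL are only placeholders:

    import urllib2
    import robotparser

    # Fetch robots.txt ourselves so we control the User-Agent header
    req = urllib2.Request('http://en.wikipedia.org/robots.txt',
                          headers={'User-Agent': 'MyBot/1.0'})
    lines = urllib2.urlopen(req).read().splitlines()

    # Hand the raw lines to robotparser instead of letting it fetch the URL itself
    rp = robotparser.RobotFileParser()
    rp.parse(lines)
    print rp.can_fetch('MyBot/1.0', 'http://en.wikipedia.org/wiki/Python')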
On Fri, Jan 23, 2009 at 3:55 PM, Andre Engels <andreeng...@gmail.com> wrote:
> On Fri, Jan 23, 2009 at 10:37 AM, amit sethi <amit.pureene...@gmail.com> wrote:
> > so is there a way around that problem ??
>
> Ok, I have done some checking around, and it seems that the Wikipedia
> server is giving a return code of 403 (forbidden), but still giving
> the page - which I think is weird behaviour. I will check with the
> developers of Wikimedia why this is done, but for now you can resolve
> this by editing robotparser.py in the following way:
>
> In the __init__ of the class URLopener, add the following at the end:
>
> self.addheaders = [header for header in self.addheaders if header[0]
> != "User-Agent"] + [('User-Agent', '<whatever>')]
>
> (probably
>
> self.addheaders = [('User-Agent', '<whatever>')]
>
> does the same, but my version is more secure)
>
> --
> André Engels, andreeng...@gmail.com

--
A-M-I-T S|S
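For context, with Andre's suggestion applied, the __init__ of the URLopener class in robotparser.py (Python 2) would look roughly like this; 'MyBot/1.0' stands in for '<whatever>':

    import urllib

    class URLopener(urllib.FancyURLopener):
        def __init__(self, *args):
            urllib.FancyURLopener.__init__(self, *args)
            self.errcode = 200
            # Replace the default "Python-urllib/x.y" User-Agent, which some
            # servers (Wikipedia among them) answer with 403 Forbidden
            self.addheaders = [header for header in self.addheaders
                               if header[0] != "User-Agent"] + \
                              [('User-Agent', 'MyBot/1.0')]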