Well, thanks ... it worked well ... but robotparser is built on urllib; isn't there a module like robotparser for urllib2?
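As far as I know there is no robotparser counterpart built on urllib2 in the 2.x standard library, but you can sidestep robotparser's built-in fetching: download robots.txt with urllib2 (where setting a User-Agent header is easy) and hand the lines to robotparser.parse(). A rough, untested sketch; 'MyBot/1.0' and the Wikipedia URL are only placeholders:

    import urllib2
    import robotparser

    # Fetch robots.txt ourselves so we control the User-Agent header
    req = urllib2.Request('http://en.wikipedia.org/robots.txt',
                          headers={'User-Agent': 'MyBot/1.0'})
    lines = urllib2.urlopen(req).read().splitlines()

    # Hand the raw lines to robotparser instead of letting it fetch the URL itself
    rp = robotparser.RobotFileParser()
    rp.parse(lines)
    print rp.can_fetch('MyBot/1.0', 'http://en.wikipedia.org/wiki/Python')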
On Fri, Jan 23, 2009 at 3:55 PM, Andre Engels <andreeng...@gmail.com> wrote:
> On Fri, Jan 23, 2009 at 10:37 AM, amit sethi <amit.pureene...@gmail.com> wrote:
> > so is there a way around that problem ??
>
> Ok, I have done some checking around, and it seems that the Wikipedia
> server is giving a return code of 403 (forbidden), but still giving
> the page - which I think is weird behaviour. I will check with the
> developers of Wikimedia why this is done, but for now you can resolve
> this by editing robotparser.py in the following way:
>
> In the __init__ of the class URLopener, add the following at the end:
>
> self.addheaders = [header for header in self.addheaders if header[0]
> != "User-Agent"] + [('User-Agent', '<whatever>')]
>
> (probably
>
> self.addheaders = [('User-Agent', '<whatever>')]
>
> does the same, but my version is more secure)
>
> --
> André Engels, andreeng...@gmail.com

--
A-M-I-T S|S
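For context, with Andre's suggestion applied, the __init__ of the URLopener class in robotparser.py (Python 2) would look roughly like this; 'MyBot/1.0' stands in for '<whatever>':

    import urllib

    class URLopener(urllib.FancyURLopener):
        def __init__(self, *args):
            urllib.FancyURLopener.__init__(self, *args)
            self.errcode = 200
            # Replace the default "Python-urllib/x.y" User-Agent, which some
            # servers (Wikipedia among them) answer with 403 Forbidden
            self.addheaders = [header for header in self.addheaders
                               if header[0] != "User-Agent"] + \
                              [('User-Agent', 'MyBot/1.0')]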