Skip Montanaro <[EMAIL PROTECTED]> wrote

> It doesn't look any easier to do this using urllib2.  Seems like a
> semi-obvious oversight for both modules.  That suggests few people have
> ever desired this capability.


my $.02:

I have trouble believing that few people have desired this, for two reasons:

(1)  Some web sites shut out user agents they do not recognize, to preserve 
bandwidth or for other reasons; the right User Agent ID can be required to get 
the data one wants.

(2)  It seems a worthwhile courtesy to identify oneself when spidering or data 
scraping, and the User Agent ID seems like the obvious way to do that. I'd guess 
(and like to think) that Python users are generally a little more concerned 
with such courtesies than the users of some other languages.

e.g.  Your website might get a hit from:  "Mozilla/5.0 (Songzilla MP3 Blog, 
http://songzilla.blogspot.com) Gecko/20041107 Firefox/1.0"

And you'll get to decide whether to shut them out or not, but at least it won't 
seem like the black hats are attacking.
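
For what it's worth, here is a minimal sketch of sending such a User Agent ID. It assumes Python 3's urllib.request (the successor to urllib2), not the modules as they stood when this thread was written. The helper name build_request and the example.com URL are made up for illustration.

```python
import urllib.request

# The User-Agent string from the example above.
USER_AGENT = ("Mozilla/5.0 (Songzilla MP3 Blog, "
              "http://songzilla.blogspot.com) Gecko/20041107 Firefox/1.0")


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that identifies itself with the given User-Agent.

    build_request is a hypothetical helper, not part of urllib.
    """
    # A User-Agent set on the Request takes precedence over the
    # opener's default "Python-urllib/x.y" header.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    # Placeholder URL; nothing is fetched here.
    req = build_request("http://example.com/")
    # Request stores header names via str.capitalize(), hence "User-agent".
    print(req.get_header("User-agent"))
    # To actually fetch the page: urllib.request.urlopen(req).read()
```

Building the Request separately from opening it means the header can be checked without touching the network.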




Eric Pederson
http://www.songzilla.blogspot.com
:::::::::::::::::::::::::::::::::::
domainNot="@something.com"
domainIs=domainNot.replace("s","z")
ePrefix="".join([chr(ord(x)+1) for x in "do"])
mailMeAt=ePrefix+domainIs
:::::::::::::::::::::::::::::::::::

--
http://mail.python.org/mailman/listinfo/python-list
