[issue15851] Lib/robotparser.py doesn't accept setting a user agent string, instead it uses the default.

2012-09-10 Thread Eduardo A. Bustamante López
Eduardo A. Bustamante López added the comment:

Hi Senthil,

> I fail to see the bug in here. Robotparser module is for reading and
> parsing the robot.txt file, the module responsible for fetching it
> could urllib.

You're right, but robotparser's read() does a call to urllib
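One way to read the quoted point is to keep the fetching and the parsing apart: urllib fetches robots.txt with whatever User-Agent is needed, and the parser only sees the text. The following is a minimal sketch of that split, assuming the Python 3 module names (urllib.request, urllib.robotparser) rather than the Lib/robotparser.py the report is about. The helper names fetch_robots_text and parser_from_text and the user agent "MyBot/1.0" are made up for illustration.

```python
import urllib.request
import urllib.robotparser


def fetch_robots_text(url, user_agent, timeout=10):
    """Fetch a robots.txt file, sending an explicit User-Agent header.

    Hypothetical helper: the header is set on this one request only,
    so no global urllib state is modified.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parser_from_text(text):
    """Build a RobotFileParser from robots.txt content already in memory."""
    rp = urllib.robotparser.RobotFileParser()
    # parse() takes an iterable of lines, so read() is never called.
    rp.parse(text.splitlines())
    return rp


if __name__ == "__main__":
    # Network access is needed for this part; the URL is the one from the report.
    text = fetch_robots_text("http://en.wikipedia.org/robots.txt", "MyBot/1.0")
    rp = parser_from_text(text)
    print(rp.can_fetch("MyBot/1.0", "http://en.wikipedia.org/wiki/Main_Page"))
```

This sidesteps read() entirely, at the cost of reimplementing the small amount of HTTP status handling that read() does.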

2012-09-09 Thread Eduardo A. Bustamante López
Eduardo A. Bustamante López added the comment: I forgot to mention that I ran an nc process in parallel, to see what data is being sent: ``nc -l -p ``.

2012-09-09 Thread Eduardo A. Bustamante López
Eduardo A. Bustamante López added the comment:

I'm not sure what's the best approach here.

1. Avoid changes in the Lib, and document a work-around, which involves installing an opener with the specific User-agent. The draw-back is that it modifies the behaviour of urlopen() gl
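The work-around described in option 1 could look roughly like this. It is a sketch under the assumption of Python 3, where RobotFileParser.read() goes through urllib.request.urlopen() and therefore picks up an installed opener; the Lib/robotparser.py named in the issue title may fetch through a different opener. The function names install_user_agent and read_robots and the string "MyBot/1.0" are hypothetical.

```python
import urllib.request
import urllib.robotparser


def install_user_agent(user_agent):
    """Install a process-wide opener whose default User-Agent is `user_agent`.

    Hypothetical helper. Every later urllib.request.urlopen() call in the
    process is affected, not only robots.txt fetches.
    """
    opener = urllib.request.build_opener()
    # Replace the default ("User-agent", "Python-urllib/x.y") header.
    opener.addheaders = [("User-Agent", user_agent)]
    urllib.request.install_opener(opener)
    return opener


def read_robots(url):
    """Fetch and parse robots.txt via read(), using the installed opener."""
    rp = urllib.robotparser.RobotFileParser(url)
    rp.read()
    return rp


if __name__ == "__main__":
    install_user_agent("MyBot/1.0")
    # Network access is needed here; the URL is the one from the report.
    rp = read_robots("http://en.wikipedia.org/robots.txt")
    print(rp.can_fetch("MyBot/1.0", "http://en.wikipedia.org/wiki/Main_Page"))
```

Because install_opener() changes module-level state, this fits scripts better than libraries, which matches the draw-back noted in the comment.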

2012-09-02 Thread Eduardo A. Bustamante López
Eduardo A. Bustamante López added the comment:

I guess a workaround is to do:

    robotparser.URLopener.version = 'MyVersion'

2012-09-02 Thread Eduardo A. Bustamante López
Changes by Eduardo A. Bustamante López: Added file: http://bugs.python.org/file27101/myrobotparser.py

Python tracker <http://bugs.python.org/issue15851>

2012-09-02 Thread Eduardo A. Bustamante López
New submission from Eduardo A. Bustamante López: I found that http://en.wikipedia.org/robots.txt returns 403 if the provided user agent is in a specific blacklist. And since robotparser doesn't provide a mechanism to change the default user agent used by the opener, it becomes unusabl